Loading…
Loading…
42 weighted questions across integration, pricing, security, vendor risk, reliability, support, performance, and governance. Live scoring with deal-breaker detection. Email the result.
Run the scorecard against each vendor in your shortlist. For each question, answer Yes / Partial / No based on what the vendor demonstrably offers (verified in their docs, contract, or pilot — not what their salesperson promised). The live score updates as you answer. Compare scores across vendors to make a defensible decision.
Questions weighted 5 represent capabilities so critical that 'no' should disqualify the vendor regardless of strong scores elsewhere. Examples: 'Data isolation (our prompts aren't used to train other models)', 'Data export rights in contract', 'Pricing predictable at our scale'. A single weight-5 'no' triggers a walk-away verdict.
Simple yes/no checklists treat every requirement as equally important — they don't. A vendor missing a 'public changelog' (weight 2) is fine; a vendor missing 'data export rights' (weight 5) is dangerous. Weighted scoring forces explicit trade-off thinking and produces a defensible final number.
10-20 minutes per vendor, faster after the first one. Most teams can't answer every question from public information — that's the point. The questions where you check 'unsure' become the questions you ask in the next sales call.
Use the 'Email me this scorecard' button below to send the full breakdown to yourself or your team. Each vendor evaluation can be re-run later if details change. We don't store data server-side — your answers live in your browser.
Based on Forrester and Gartner AI agent procurement frameworks (2024-25) plus 40+ post-mortems of real AI agent migrations: what went wrong, what should have been asked before signing. The categories — integration depth, pricing transparency, security/compliance, vendor risk, reliability, support, performance, governance — map to the top recurring failure modes.
Progress: 0 / 41
Live score
0%
0.0 of 152 points (0/41 answered)
Answer more questions to get a meaningful score.
Native integration with our CRM (Salesforce / HubSpot / Pipedrive)?×5
Native integration with our help desk (Zendesk / Intercom / Freshdesk)?×4
Native Slack / Microsoft Teams integration?×3
Public, documented REST API with webhooks?×4
Custom action / tool support (the agent can call our internal APIs)?×4
SSO / SCIM identity provisioning?×3
Required for enterprise IT
Pricing model published on website (not gated behind 'Contact sales')?×3
Pricing predictable at our scale (no usage surprises)?×5
Free tier or trial that exercises real production workflows?×2
Contract terms allow exit without termination penalty?×4
Renewal price terms locked or capped (e.g. CPI-only increases)?×4
Industry average is 15-25% hikes at renewal
SOC 2 Type II report available under NDA?×5
HIPAA BAA / GDPR / industry-specific compliance as needed?×5
Data isolation: our prompts and data are not used for training other models?×5
Audit logs / SIEM export for all agent actions?×4
Per-action approval gates (HITL) configurable for high-risk operations?×4
PII redaction / DLP filters on inputs and outputs?×4
EU data residency option if needed?×3
Vendor has 3+ years operating history or strong backers?×3
Data export rights in contract (workflows, prompts, conversation history)?×5
Critical for avoiding lock-in
Model portability: can we BYO LLM (OpenAI, Anthropic, etc.)?×3
On-premise / VPC deployment option (if regulated industry)?×3
References from 3+ customers at our scale willing to take a call?×4
Published SLA with credits for downtime?×4
Observability dashboards (latency, error rate, deflection rate) included?×4
Built-in eval framework (test suites, regression detection)?×4
Versioned prompts / rollback when an update degrades quality?×3
Multi-region failover / no single point of failure?×3
Dedicated implementation manager during first 90 days?×4
Response time SLA: critical issues under 4 hours?×4
Customer-facing changelog / public roadmap?×2
Documentation depth: API reference, recipes, common-case examples?×3
User community / Slack / Discord with vendor staff present?×2
Latency under 2s for typical responses?×4
Handles our peak load (peak QPS) without throttling?×5
Accuracy on a sample of our real data is acceptable (>85%)?×5
Multilingual coverage for our customer base?×3
Voice / SMS / email channels as needed?×3
Role-based access controls + least-privilege defaults?×3
Cost caps / budget alerts to prevent runaway spend?×4
Built-in red-team / safety evaluation framework?×3
We'll send a detailed PDF including industry benchmarks for teams your size, vendor comparisons, and a 30-day implementation checklist.
Email me this scorecard42 questions across 8 categories: integration depth, pricing transparency, security & compliance, vendor risk, reliability, support, performance, and governance.
Weighting (1-5): 5 = deal-breaker if missing; 4 = strong negative if missing; 3 = significant; 2 = mild; 1 = nice-to-have. Three answer options per question — Yes (full credit), Partial (50%), No (zero).
Verdict thresholds: 85%+ = move forward; 70-85% = negotiate gaps; 50-70% = concerns outweigh fit; <50% = poor fit. Any single deal- breaker (weight-5 "no") flips the verdict to walk-away regardless of total score.
Source weights informed by 2024-25 SaaS procurement analyst frameworks (Forrester, Gartner) and the post-mortem of 40+ AI-agent migrations we tracked in 2025.