Cursor

AI-first IDE forked from VS Code with inline completions, chat, and an agent mode that executes multi-file edits.

Provisional · Early evidence
Early verdict — controlled benchmark pending

The score on this page is a provisional research-based estimate. No controlled benchmark suite has been completed for Cursor yet, so this verdict cannot be cited as final proof and Cursor is not eligible for "Verdict Certified" status. When a verified run lands, it will appear in the Evidence Timeline below and the status badge above will switch to "Verified".

Verdict

Best-in-class IDE experience with a strong agent mode. This verdict reflects public reputation only and is a placeholder until a controlled benchmark has been run.

Best for
  • Developers who live in their editor
  • Fast inline completions
  • Codebase navigation and Q&A

Not ideal for
  • Headless / CI runs (uses an editor surface)
  • Teams with strict IDE standardization on JetBrains

Failure modes we'd watch

  • Agent mode can apply edits faster than a human can review
  • Context selection is opaque on large monorepos
  • Costs scale with model choice, so it is easy to overspend

Evidence Timeline

No controlled benchmark runs published yet for Cursor. The score above is a provisional estimate pending the first run. New runs land on the runs page.

Needs verification

The following fields are flagged for verification before we publish a non-provisional verdict:

  • pricingSummary
  • scoreBreakdown