OpenHands
Open-source coding agent (formerly OpenDevin) from All Hands AI. Runs locally or self-hosted; model-agnostic.
The score on this page is a provisional research-based estimate. No controlled benchmark suite has been completed for OpenHands yet, so this verdict cannot be cited as final proof and OpenHands is not eligible for "Verdict Certified" status. When a verified run lands, it will appear in the Evidence Timeline below and the status badge above will switch to "Verified".
Verdict
Open-source autonomous coder useful for research and customization. Reliability depends heavily on model and harness configuration. Placeholder verdict.
- ✓ Researchers benchmarking agent loops
- ✓ Teams that need self-hosting
- ✓ Builders who want to extend the agent harness
- ✕ Non-engineers
- ✕ Teams that want a polished managed UX
Failure modes we'd watch
- ⚠ Loop quality varies dramatically by model
- ⚠ Sandbox setup can fail in ways that look like agent failure
- ⚠ Long autonomous runs can cost more than a human dev hour
Evidence Timeline
The following fields are flagged for verification before we publish a non-provisional verdict:
- pricingSummary
- scoreBreakdown