Published on November 4, 2023

On Coding Benchmarks: Thoughts on SWE-Bench & Why Evals are Hard

Tags: finetuning, general, evals

Our thoughts on SWE-Bench after reading a recent paper that found issues with SWE-Bench resulting in SWE-Agent + GPT-4's performance dropping 3x.