Published on November 4, 2023
On Coding Benchmarks: Thoughts on SWE-Bench & Why Evals are Hard
finetuning, general, evals
Our thoughts on SWE-Bench after reading a recent paper that found issues with the benchmark, which caused the performance of SWE-Agent + GPT-4 to drop 3x.

Published on October 13, 2023
Improving Code Completion LLMs with Repo-Specific Finetuning
finetuning, general
The inner workings of our repo-specific finetuning pipeline that yields 50% more accurate code completions.

Published on October 12, 2023
Why Finetune Code LLMs for your Codebase?
finetuning, general
Our thesis on why finetuning with internal engineering data sources is crucial for complex, enterprise-grade codebases with unique patterns & frameworks.