All Posts

Published on
June 10, 2025
Codebase-Specific RL: Fine-tuning LLMs for generating unit tests that boost coverage
RL general finetuning
How codebase-specific RL fine-tuning enables smaller, specialized AI models to significantly outperform general-purpose models at generating impactful unit tests.
Published on
November 4, 2024
On Coding Benchmarks: Thoughts on SWE-Bench & Why Evals are Hard
finetuning general evals
Our thoughts on SWE-Bench after reading a recent paper that found issues with the benchmark, causing SWE-Agent + GPT-4's performance to drop 3x.
Published on
October 13, 2024
Improving Code Completion LLMs with Repo-Specific Finetuning
finetuning general
The inner workings of our repo-specific finetuning pipeline that yields 50% more accurate code completions.
Published on
October 12, 2024
Why Finetune Code LLMs for your Codebase?
finetuning general
Our thesis on why finetuning with internal engineering data sources is crucial for complex, enterprise-grade codebases with unique patterns & frameworks.

Codebase-Specific RL: Fine-tuning LLMs for generating unit tests that boost coverage