Existing frontier models like GPT-4 struggle with large code edits: they can be lazy, inaccurate, and slow. Editing hundreds of lines of code often requires multiple model calls and can get stuck in loops, and even small edits are prone to bugs. This significantly hinders programmer workflow.
Cursor addresses these challenges by training a specialized model for a critical aspect of code editing – "fast apply." This model excels at rapidly and accurately applying edits to existing code.
Cursor uses a two-stage approach to code edits: planning and applying. Planning happens in a chat interface with a powerful frontier model, which decides what the edit should be. Applying, handled by the fast-apply model, turns that suggestion into the updated code and needs to feel instantaneous.
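To make the division of labor concrete, here is a minimal sketch of that plan-then-apply flow. `frontier_chat` and `fast_apply` are hypothetical stand-ins for the two models, and the prompt wording is an illustrative assumption, not Cursor's actual API or prompt.

```python
# Minimal sketch of the plan-then-apply flow. `frontier_chat` and `fast_apply`
# are assumed callables standing in for the planning and apply models.

def edit_file(frontier_chat, fast_apply, file_contents: str, user_request: str) -> str:
    # Stage 1 (planning): a powerful chat model decides what the edit should be,
    # typically as a rough code block plus an explanation.
    plan = frontier_chat(
        f"User request: {user_request}\n\n"
        f"Current file:\n{file_contents}\n\n"
        "Describe the code change to make, including a rough code block."
    )
    # Stage 2 (applying): the specialized fast-apply model turns the rough plan
    # into the fully rewritten file, and must do so near-instantly.
    return fast_apply(file_contents, plan)
```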
Cursor's approach rewrites the entire file rather than emitting diffs, because LLMs struggle with diff-formatted edits. Diffs force the model to express a change in fewer output tokens, leaving it fewer forward passes of "thinking" before committing to each change, which hurts accuracy; models also have difficulty outputting and tracking line numbers correctly within diff formats.
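One way to picture the full-file-rewrite setup is a prompt that hands the apply model the current file and the planner's suggested edit and asks for the complete updated file. The template and tag names below are assumptions made for the sketch, not Cursor's actual prompt.

```python
# Illustrative prompt template for a full-file rewrite: no diffs, no line numbers.

APPLY_PROMPT = """Rewrite the file below to incorporate the suggested edit.

<original_file>
{original_file}
</original_file>

<suggested_edit>
{suggested_edit}
</suggested_edit>

Output the complete updated file. Do not output a diff or line numbers."""

def build_apply_prompt(original_file: str, suggested_edit: str) -> str:
    """Fill in the template with the current file and the planner's edit."""
    return APPLY_PROMPT.format(original_file=original_file, suggested_edit=suggested_edit)
```

In the two-stage sketch above, `fast_apply` would send a prompt like this to the fine-tuned model and stream back the rewritten file.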
The fast-apply model was evaluated against GPT-4, GPT-4 Turbo, and other models, using Claude-3 Opus as a grader. Results show significant improvements in both accuracy and speed, particularly on large code edits.
To further enhance speed, Cursor uses a custom variant of speculative decoding called "speculative edits." During code editing there is a strong prior on the draft tokens, since most of the rewritten file matches the original, so long spans can be verified in parallel rather than generated one token at a time, making decoding significantly faster than traditional methods.
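The details of Cursor's algorithm are not public, but the core idea can be sketched: treat the unmodified original file as the draft, verify a chunk of draft tokens with a single forward pass, and fall back to ordinary decoding only where the model diverges (the edit sites). The `model.forward` interface, greedy acceptance, and the re-anchoring step below are simplifying assumptions, not the production implementation.

```python
# Simplified sketch of speculative edits, assuming a model object whose
# forward(tokens) returns per-position next-token logits (shape [len, vocab]).

def speculative_edit(model, prompt_tokens, original_file_tokens,
                     eos_id, chunk=16, max_new=4096):
    output = []          # tokens of the rewritten file generated so far
    draft_pos = 0        # position in the original file we are speculating from

    while len(output) < max_new:
        # Speculate that the next `chunk` output tokens copy the original file.
        draft = original_file_tokens[draft_pos:draft_pos + chunk]

        # One forward pass scores the prefix and every draft token at once.
        seq = prompt_tokens + output + draft
        logits = model.forward(seq)
        base = len(prompt_tokens) + len(output)

        # Accept draft tokens as long as they match the model's greedy choice.
        accepted = 0
        for i, tok in enumerate(draft):
            if int(logits[base + i - 1].argmax()) != tok:
                break
            accepted += 1
        output += draft[:accepted]
        draft_pos += accepted

        if draft and accepted == len(draft):
            continue  # whole chunk accepted; keep copying from the original file

        # Divergence (an edit site) or end of the draft: decode one token normally,
        # then try to re-anchor the draft on the remainder of the original file.
        next_tok = int(logits[base + accepted - 1].argmax())
        output.append(next_tok)
        if next_tok == eos_id:
            break
        try:
            draft_pos = original_file_tokens.index(next_tok, draft_pos) + 1
        except ValueError:
            pass  # no match yet; keep speculating from the same spot

    return output
```

Unlike standard speculative decoding, no separate draft model is needed: the prior that the rewritten file mostly equals the original file supplies the draft for free, which is where the speedup comes from.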
The fast-apply model was trained on a combination of real and synthetic data. Synthetic data was generated with GPT-4 and other language models and curated to improve the model's ability to apply edits to code files accurately and quickly. Fine-tuning was done on Llama 3 and DeepSeek base models, producing the Llama-3-70b-ft model.
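As a rough illustration of how such synthetic data could be assembled (the prompts, the `call_gpt4` helper, and the JSONL schema below are assumptions, not Cursor's actual pipeline):

```python
# Hedged sketch of synthetic data generation for fast apply: a frontier model
# proposes an edit to a real file and then produces the fully rewritten file;
# the resulting triple becomes one supervised training example. `call_gpt4` is
# an assumed helper that sends a prompt to GPT-4 and returns the text response.
import json

def make_training_example(call_gpt4, source_file: str) -> dict:
    suggested_edit = call_gpt4(
        "Suggest a realistic, self-contained edit to this file, "
        f"as a rough code block:\n\n{source_file}"
    )
    rewritten_file = call_gpt4(
        "Apply the following edit and return the complete updated file.\n\n"
        f"File:\n{source_file}\n\nEdit:\n{suggested_edit}"
    )
    return {
        "original_file": source_file,      # model input
        "suggested_edit": suggested_edit,  # model input
        "rewritten_file": rewritten_file,  # supervised target for fine-tuning
    }

def write_dataset(examples: list[dict], path: str) -> None:
    """Store curated examples as JSONL for fine-tuning (e.g. Llama 3 / DeepSeek)."""
    with open(path, "w", encoding="utf-8") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")
```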
Future work will focus on improving the model's capabilities, including expanding the context window to handle even larger files and distilling the model into a smaller, faster version. Reinforcement learning will also be explored to further refine the accuracy of the model.