Google researchers have developed DIDACT (Dynamic Integrated Developer ACTivity), a methodology for training machine learning models to assist software developers. DIDACT leverages the entire software development process as training data, including code edits, code reviews, interactions with development tools, and more.
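DIDACT defines tasks around individual developer activities, such as repairing broken builds, predicting or addressing code review comments, renaming variables, and editing files. These tasks share a common formalism: each takes a state (a code file) and an intent (annotations specific to the activity, such as code review comments or compiler errors) and produces an action (the operation taken to address the task).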
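To make the formalism concrete, here is a minimal sketch that assumes each task maps a state and an intent to an action. The names `State`, `Intent`, `Action`, `apply_action`, and `to_training_example` are hypothetical illustrations, not DIDACT's actual interface, and the single-replacement action is a deliberate simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """The code being worked on (hypothetical: one file)."""
    path: str
    content: str


@dataclass(frozen=True)
class Intent:
    """Task-specific annotation, e.g. a review comment or a compiler error."""
    kind: str
    annotation: str


@dataclass(frozen=True)
class Action:
    """Simplified action: replace the first occurrence of `old` with `new`."""
    old: str
    new: str


def apply_action(state: State, action: Action) -> State:
    """Return the new state produced by applying the action to the file."""
    if action.old not in state.content:
        raise ValueError(f"{action.old!r} not found in {state.path}")
    return State(state.path, state.content.replace(action.old, action.new, 1))


def to_training_example(state: State, intent: Intent, action: Action) -> dict:
    """Serialize one (state, intent) -> action triple as a text-to-text pair."""
    return {
        "input": f"[{intent.kind}] {intent.annotation}\n{state.content}",
        "target": f"replace {action.old!r} with {action.new!r}",
    }
```

Because every activity is reduced to the same triple, a build fix and a review-comment fix differ only in the `kind` and `annotation` carried by the intent, which is what would let one model be trained on all of them.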
Google has deployed three DIDACT tools internally, each integrated into a different stage of the development workflow: Comment Resolution, Build Repair, and Tip Prediction.
These tools have received enthusiastic feedback from thousands of professional developers at Google, indicating their usefulness in improving developer productivity.
DIDACT exhibits surprising capabilities, enabled by its multimodal nature and the use of developer activity history. One such capability is history-augmented code completion, where the model can complete code snippets based on the developer's recent edits, anticipating their next steps.
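One way to picture history-augmented completion is a prompt that places the developer's most recent edits, rendered as diffs, ahead of the current file. The sketch below rests on that assumption: `edit_to_diff`, `build_prompt`, and the prompt layout are hypothetical and do not reflect DIDACT's real input format.

```python
import difflib


def edit_to_diff(before: str, after: str, path: str) -> str:
    """Render one edit as a unified diff using the standard library."""
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile=path,
        tofile=path,
        lineterm="",
    )
    return "\n".join(lines)


def build_prompt(history, current, max_edits=3):
    """Build a completion prompt from recent edits plus the current file.

    `history` is a list of (path, before, after) tuples, oldest first.
    Only the last `max_edits` edits are kept to bound prompt length.
    """
    recent = history[-max_edits:] if max_edits > 0 else []
    parts = [edit_to_diff(before, after, path) for path, before, after in recent]
    parts.append("# current file\n" + current)
    return "\n\n".join(parts)
```

For example, if the latest edit added a parameter to a function signature, the diff in the prompt gives a model the context to suggest updating the docstring or call sites next, which is something it could not infer from the current file alone.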
DIDACT demonstrates the potential for developing general-purpose AI assistants that can aid developers across the entire software development process. By leveraging machine learning models trained on real-world developer activities, DIDACT paves the way for AI systems that can collaborate with human developers, enhancing productivity and code quality.
The DIDACT approach complements recent advances in large language models.
The DIDACT project is a multi-year collaboration among Google Research, Google Core Systems and Experiences, and DeepMind, involving numerous researchers, engineers, and leaders across Alphabet.