Today, OpenAI released OpenAI o1, a new series of AI models equipped with advanced reasoning capabilities to solve hard problems. Like you, we are excited to put the new o1 model through its paces and have tested integrating o1-preview with GitHub Copilot. While we are exploring many use cases with this new model, such as debugging large-scale systems, refactoring legacy code, and writing test suites, our initial testing showed promising results in code analysis and optimization. This is because of o1-preview’s ability to think through challenges before responding, which enables Copilot to break down complex tasks into structured steps.
In this post, we’ll describe two scenarios that showcase the new model’s capabilities within Copilot and how it could fit into your day-to-day work. Keep reading for an inside look at what happens when a new model launches, what we test, and how we approach AI-powered software development at GitHub.
Optimize complex algorithms with advanced reasoning
In our first test, we wanted to understand how o1-preview could help write or refine complex algorithms, a task that requires deep logical reasoning to find more efficient or innovative solutions. Developers need to understand the constraints, optimize edge cases, and iteratively improve the algorithm without losing track of the overall objective. This is exactly where o1-preview excels. With this in mind, we developed a new code optimization workflow that benefits from the model’s reasoning capabilities.
In this demo, a new built-in Optimize chat command provides rich editor context out of the box, like imports, tests, and performance profiles. We tested how well o1-preview could analyze and iterate on code to come up with a more thorough and efficient optimization in one shot.
The video shows optimizing the performance of a byte pair encoder used in Copilot Chat’s tokenizer library (yes, this means we use AI to optimize a key AI development building block).
This was a real problem the VS Code team faced, as Copilot needs to repeatedly tokenize large amounts of data while it assembles prompts.
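For readers unfamiliar with the term, a byte pair encoder repeatedly merges the highest-priority adjacent pair of symbols until no known pair remains. The sketch below is a toy illustration in Python, not the code from Copilot Chat’s tokenizer library: the name `bpe_encode`, the tiny `MERGE_RANKS` table, and the use of a cache for repeated inputs are all assumptions made for the example.

```python
from functools import lru_cache
from typing import Dict, List, Tuple

# Hypothetical toy merge table: a lower rank means the pair is merged earlier.
# A real tokenizer loads many thousands of learned merges from a vocabulary file.
MERGE_RANKS: Dict[Tuple[bytes, bytes], int] = {
    (b"l", b"o"): 0,
    (b"lo", b"w"): 1,
    (b"e", b"r"): 2,
}


@lru_cache(maxsize=4096)  # assumption: the same words are tokenized repeatedly
def bpe_encode(word: bytes) -> Tuple[bytes, ...]:
    """Split `word` into single bytes, then greedily apply the best-ranked merge."""
    parts: List[bytes] = [bytes([b]) for b in word]
    while len(parts) > 1:
        best_rank = None
        best_index = -1
        # Scan every adjacent pair and remember the one with the lowest rank.
        for i in range(len(parts) - 1):
            rank = MERGE_RANKS.get((parts[i], parts[i + 1]))
            if rank is not None and (best_rank is None or rank < best_rank):
                best_rank = rank
                best_index = i
        if best_index < 0:
            break  # no mergeable pair is left
        # Replace the two symbols with their concatenation.
        parts[best_index:best_index + 2] = [parts[best_index] + parts[best_index + 1]]
    return tuple(parts)
```

The inner scan runs once per merge, so this naive version is quadratic in the length of the input. That kind of cost is what makes an encoder a natural optimization target when it is called over and over during prompt assembly.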
The results highlight how o1-preview’s reasoning capability allows a deeper understanding of the code’s constraints and edge cases, which helps produce a more efficient and higher quality result. Meanwhile, GPT-4o sticks to obvious optimizations and would need a developer’s help to steer Copilot towards more complex approaches.
Beyond handling complex code tasks, o1-preview’s math abilities shine as it effortlessly calculates the benchmark results from the raw terminal output, then summarizes them succinctly.
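To make that last step concrete, here is a rough sketch of the arithmetic involved in turning raw benchmark lines into a summary. The output format, the labels, and the timings in `RAW_OUTPUT` are invented for illustration and are not the numbers from the demo; `summarize` and `speedup` are hypothetical helper names.

```python
import re
from statistics import mean
from typing import Dict, List

# Made-up terminal output; real benchmark tools print different formats.
RAW_OUTPUT = """\
baseline run 1: 120.0 ms
baseline run 2: 130.0 ms
optimized run 1: 48.0 ms
optimized run 2: 52.0 ms
"""

# Matches lines such as "baseline run 1: 120.0 ms".
LINE = re.compile(r"^(?P<label>\w+) run \d+: (?P<ms>[\d.]+) ms$")


def summarize(raw: str) -> Dict[str, float]:
    """Return the mean time in milliseconds for each label found in `raw`."""
    samples: Dict[str, List[float]] = {}
    for line in raw.splitlines():
        match = LINE.match(line.strip())
        if match:
            samples.setdefault(match["label"], []).append(float(match["ms"]))
    return {label: mean(values) for label, values in samples.items()}


def speedup(before_ms: float, after_ms: float) -> float:
    """How many times faster the optimized run is than the baseline."""
    return before_ms / after_ms
```

With the sample above, `summarize(RAW_OUTPUT)` averages each group of runs, and `speedup` divides the baseline mean by the optimized mean to give a single "times faster" figure.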
Optimize application code to fix a performance bug
In this next demo on GitHub, o1-preview was able to identify and develop a solution for a performance bug within minutes. The same bug took one of our software engineers a few hours before they came up with the same solution. At the time, we wanted to add a folder tree to the file view in GitHub.com, but the number of elements was causing our focus management code to stall and crash the browser. The video compares GPT-4o and o1-preview side by side as each tries to resolve the issue.
With 1,000 elements managed by this code, it was hard to isolate the problem. Eventually we implemented a change that improved the runtime of this function from over 1,000ms to about 16ms. If we had had Copilot with o1-preview at the time, we could have identified and fixed the problem much faster.
Through this experimentation, we found a subtle but powerful difference: o1-preview’s responses are deliberate and purposeful, which makes it easy for the developer to pinpoint problems and quickly implement solutions. With GPT-4o, a similar prompt might result in a blob of code instead of a solution with recommendations broken down line by line.
Bringing the power of o1-preview to developers building on GitHub
Not only are we excited to experiment with integrating o1-preview into GitHub Copilot, we can’t wait to see what you’ll be able to build with it too. That’s why we’re bringing the o1 series to GitHub Models. You’ll find o1-preview and o1-mini, a smaller, faster, and 80% cheaper model, in our marketplace later today, but because the models are still in preview, you’ll need to sign up for Azure AI for early access.
Stay tuned
As part of Microsoft’s collaboration with OpenAI, GitHub is able to constantly explore how we can leverage the latest AI breakthroughs to drive developer productivity, and, most importantly, increase developer happiness. Although these demos showcase o1-preview’s enhanced capabilities for two specific optimization problems, we’re still early in our experimentation and are excited to see what else it can do.
We’re currently exploring more use cases across Copilot—in IDEs, Copilot Workspace, and on GitHub—to leverage o1-preview’s strong reasoning capabilities to accelerate developer workflows even further. The advancements we’re showcasing today barely scratch the surface of what developers will be able to build with o1-preview in GitHub Copilot. And with the expected evolution of both the o1 and GPT series, this is just the beginning.
Interested in trying out the latest Copilot and AI innovations?
- Sign up for GitHub Copilot Workspace
- Sign up to try GitHub Models
- Sign up to try Copilot fine-tuned models