When people imagine developers leveraging public code, they often picture someone taking code that’s already been written and building on it to create something new. It’s faster to build from something than to build from nothing, right? While this is transactionally true, there’s more to it than that. Central to open source culture is not only leveraging public code but also being able to explore the repository in which it lives, learn about its origins, see who else has worked on it, share knowledge across the community, and understand possible licensing structures.
With the advent and scale of new AI technology solutions like GitHub Copilot, the way developers work is changing, and with those changes it’s important to incorporate the habits and practices developers value in new ways of working. In this case, identifying myriad sources where code may appear required the creation of a new solution—one that enables developers to prioritize these values while fostering learning and knowledge sharing at scale.
GitHub engineering teams got to work to address this, and today we’re announcing the general availability of code referencing in GitHub Copilot Chat and GitHub Copilot code completions. Developers can now choose whether to block suggestions containing matching code or allow those suggestions with information about the matches. This feature is currently available in VS Code and will be more widely available soon.
How code referencing works
With billions of files to index and a latency budget of only 10-20ms, finding specific matches is a demanding engineering problem. When a match is found and public code matching is allowed, a notification appears in the editor showing: (1) the matching code, (2) the file where that code appears, and (3) licensing info (if any) detected in the relevant repository. This information is shown for every public code match detected in a model response.
We’re also excited to announce that GitHub has partnered with Microsoft Azure to make the code referencing API available on Azure AI Content Safety. Azure AI Content Safety users can leverage this feature via the protected material detection for code filter. We believe in transparency as a core value of the open source community and want to ensure that this capability is available to everyone, no matter which tool you use.
Whether you’re using GitHub Copilot or other generative AI tools leveraging the code referencing API, you can depend on transparency so that you can make informed development decisions for the project at hand.
Why code referencing matters
The power of code referencing for individual developers
For individual developers using GitHub Copilot, code referencing adds a layer of transparency and keeps you in the driver’s seat. We’ve always had a filter that users can apply to prevent Copilot from producing suggestions matching public code. Now, with code referencing, you have the additional option to allow all suggestions. If Copilot produces a suggestion of 150 characters or more that matches public code, you’ll be notified about the matches found, the repositories the code was found in, and any licenses detected. This information helps you make more informed decisions so that you can build with Copilot with confidence.
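To make the two options concrete, here is a minimal sketch of the block-or-allow decision described above. It is an illustration only, not GitHub's implementation or API: the names `PublicMatch`, `Outcome`, and `apply_policy` are hypothetical, the matches are assumed to have been found already by some index lookup that is not modeled here, and applying the 150-character threshold to the length of the matched text is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

# Threshold taken from the article; how it is measured is an assumption.
MIN_MATCH_CHARS = 150


@dataclass
class PublicMatch:
    """Hypothetical record for one public code match."""
    matched_text: str          # the matching code
    file_url: str              # the file where that code appears
    license: Optional[str]     # license detected in the repository, if any


@dataclass
class Outcome:
    """Hypothetical result handed back to the editor."""
    suggestion: Optional[str]  # None means the suggestion was suppressed
    references: List[PublicMatch]


def apply_policy(suggestion: str, matches: List[PublicMatch], policy: str) -> Outcome:
    """Apply a 'block' or 'allow' setting to a suggestion and its matches."""
    if policy not in ("block", "allow"):
        raise ValueError(f"unknown policy: {policy!r}")

    # Only matches at or above the threshold count as public code matches.
    qualifying = [m for m in matches if len(m.matched_text) >= MIN_MATCH_CHARS]

    if policy == "block" and qualifying:
        # Block: the suggestion is never shown, so there is nothing to reference.
        return Outcome(suggestion=None, references=[])

    # Allow (or no qualifying match): show the suggestion with its references.
    return Outcome(suggestion=suggestion, references=qualifying)
```

With `policy="allow"`, a suggestion that has a qualifying match comes back unchanged together with the list of references an editor could display; with `policy="block"`, the same suggestion is suppressed.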
The power of code referencing for businesses
Copilot helps organizations innovate faster than ever. To help businesses innovate responsibly, the option to block suggestions matching public code has always been available to admins, and using that filter ensures customers are protected by GitHub’s indemnification commitment.
For dev teams wanting to benefit from the learning code referencing enables, GitHub’s indemnity will now extend to their use of code referencing where GitHub Copilot Business or GitHub Copilot Enterprise customers comply with cited licenses. This means that Copilot Business and Copilot Enterprise customers can expand their teams’ Copilot context, use, and effectiveness while keeping existing contractual protections.
We’ve collaborated to make code referencing a reality, not just for GitHub, but for all AI dev tools, and have been driven by the values that the open source community has long cultivated and upheld—that surfacing and sharing information can unlock innovation in new and groundbreaking ways. As we continue to grow and scale our capabilities with AI, GitHub is excited to empower developers with greater transparency, knowledge sharing, and tools for innovation.
Learn more about code referencing.