GitHub Availability Report: January 2025


In January, we experienced three incidents that resulted in degraded performance across GitHub services.

January 09 01:26 UTC (lasting 31 minutes)

On January 9, 2025, between 01:26 UTC and 01:56 UTC, GitHub experienced widespread disruption to many services, with users receiving 500 responses when trying to access various functionality. This was due to a deployment that introduced a query that saturated a primary database server. The error rate averaged 6% and peaked at 6.85% of update requests.

We mitigated the incident by identifying the source of the problematic query and rolling back the deployment. Our internal tooling and dashboards surfaced the data that helped us quickly identify the problematic query. It took 14 minutes from the time we engaged to find the errant query.

However, we are investing in tooling to detect problematic queries prior to deployment, both to prevent issues like this one and to reduce our time to detection and mitigation in the future.
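As an illustration of what detecting a problematic query prior to deployment can look like, the sketch below inspects a query plan and flags full table scans. It is not GitHub's tooling: it uses Python's built-in `sqlite3` module as a stand-in database, and the `find_full_scans` helper, the `issues` table, and the index name are all hypothetical.

```python
import sqlite3


def find_full_scans(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> list:
    """Return the plan steps of `sql` that scan an entire table (hypothetical helper)."""
    rows = conn.execute(f"EXPLAIN QUERY PLAN {sql}", params).fetchall()
    # Each row is (id, parent, notused, detail). In SQLite, steps that read a
    # whole table start with "SCAN"; index lookups start with "SEARCH".
    return [row[3] for row in rows if row[3].startswith("SCAN")]


# Assumed schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (id INTEGER PRIMARY KEY, repo_id INTEGER, title TEXT)")
conn.execute("CREATE INDEX idx_issues_repo ON issues (repo_id)")

# Filtering on the indexed column uses the index; filtering on `title` does not.
indexed = find_full_scans(conn, "SELECT id FROM issues WHERE repo_id = ?", (1,))
unindexed = find_full_scans(conn, "SELECT id FROM issues WHERE title = ?", ("bug",))

print("indexed query scans:", indexed)
print("unindexed query scans:", unindexed)
```

A check like this could run in continuous integration against new or changed queries. A production version would need to target the real database engine and account for table sizes, since a full scan of a small table is harmless.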

January 13 23:35 UTC (lasting 49 minutes)

On January 13, 2025, between 23:35 UTC and 00:24 UTC the following day, all Git operations were unavailable. A configuration change related to traffic routing and testing caused our internal load balancer to drop requests between services that Git relies upon.

We mitigated the incident by rolling back the configuration change.

We are improving our monitoring and deployment practices to reduce our time to detection and to enable automated mitigation for issues like this in the future.

January 30 14:22 UTC (lasting 26 minutes)

On January 30, 2025, between 14:22 UTC and 14:48 UTC, web requests to github.com experienced failures (at peak the error rate was 44%), with the average successful request taking over three seconds to complete.

This outage was caused by a hardware failure in the caching layer that supports rate limiting. In addition, the impact was prolonged by the lack of automated failover for the caching layer. A manual failover of the primary to trusted hardware was performed following recovery to ensure that the issue would not recur under similar circumstances.

As a result of this incident, we will be moving to a high availability cache configuration and adding resilience to cache failures at this layer to ensure requests can be handled should similar circumstances occur in the future.
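The report does not say how this resilience will be implemented. One common approach is a rate limiter that "fails open", allowing requests through when its cache is unreachable. The sketch below illustrates that idea under stated assumptions: `CacheUnavailable`, `InMemoryCounterCache`, and `FailOpenRateLimiter` are hypothetical names, and the in-memory cache stands in for a remote caching layer.

```python
import time


class CacheUnavailable(Exception):
    """Raised by the (hypothetical) cache client when the backend is down."""


class InMemoryCounterCache:
    """Stand-in for a remote cache; set `healthy = False` to simulate a failure."""

    def __init__(self):
        self.healthy = True
        self._counts = {}

    def incr(self, key):
        if not self.healthy:
            raise CacheUnavailable(key)
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]


class FailOpenRateLimiter:
    """Fixed-window limiter that allows requests when the cache cannot be reached."""

    def __init__(self, cache, limit, window_seconds=60, clock=time.time):
        self.cache = cache
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock

    def allow(self, client_id):
        # Requests in the same time window share one counter key.
        window = int(self.clock() // self.window_seconds)
        try:
            count = self.cache.incr(f"{client_id}:{window}")
        except CacheUnavailable:
            # Fail open: serve the request rather than reject it.
            return True
        return count <= self.limit


cache = InMemoryCounterCache()
limiter = FailOpenRateLimiter(cache, limit=2, clock=lambda: 0.0)

decisions = [limiter.allow("client-a") for _ in range(3)]
cache.healthy = False  # simulate the caching layer failing
during_outage = limiter.allow("client-a")

print(decisions, during_outage)
```

Failing open trades rate-limit enforcement for availability during a cache outage. Whether that trade is acceptable depends on what the limiter protects, so this is a design choice rather than a default.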


Please follow our status page for real-time updates on status changes and post-incident recaps. To learn more about what we’re working on, check out the GitHub Engineering Blog.
