New Amazon CloudWatch Database Insights: Comprehensive database observability from fleets to instances

Observing your Amazon Aurora databases is now a whole lot easier. Instead of spending time setting up telemetry, building dashboards, and configuring alarms, you just open Amazon CloudWatch Database Insights and take a look. With no further setup, you can monitor the health of all of your Amazon Aurora MySQL and PostgreSQL instances in the selected Region:

Each of the sections contains a wealth of detail and I’ll get to that in a moment (this may be the ultimate “but wait, there’s more” post). From this view, I can open the filter control on the left and filter the set of instances in a couple of different ways. For example, I can filter for all of the instances running Amazon Aurora MySQL, and see that I have 66 such instances, with 3 raising alarms:

I can save the filter as a Fleet (note that Fleets are defined by specific properties and tags of the database instances and as such are inherently dynamic):
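To illustrate what "inherently dynamic" means here, the following is a minimal sketch and not the Database Insights implementation. It assumes a fleet is just a stored filter over engine and tags; the `DbInstance` and `FleetFilter` classes, the `env` tag, and every instance name other than demo-mysql-reader0 are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DbInstance:
    # Hypothetical, simplified view of a database instance.
    identifier: str
    engine: str
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass
class FleetFilter:
    # A fleet stores criteria, not a list of members.
    # engine=None means "any engine"; every required tag must match exactly.
    engine: Optional[str] = None
    required_tags: Dict[str, str] = field(default_factory=dict)

    def matches(self, instance: DbInstance) -> bool:
        if self.engine is not None and instance.engine != self.engine:
            return False
        return all(
            instance.tags.get(key) == value
            for key, value in self.required_tags.items()
        )


def fleet_members(fleet: FleetFilter, instances: List[DbInstance]) -> List[str]:
    # Membership is recomputed from current properties on every call.
    return [i.identifier for i in instances if fleet.matches(i)]


instances = [
    DbInstance("demo-mysql-reader0", "aurora-mysql", {"env": "demo"}),
    DbInstance("demo-pg-writer0", "aurora-postgresql", {"env": "demo"}),
]
mysql_fleet = FleetFilter(engine="aurora-mysql")
print(fleet_members(mysql_fleet, instances))

# A newly created instance that matches the filter joins the fleet
# without anyone editing the fleet definition.
instances.append(DbInstance("demo-mysql-reader1", "aurora-mysql", {"env": "demo"}))
print(fleet_members(mysql_fleet, instances))
```

Because the fleet holds only the criteria, adding or retagging an instance changes the membership the next time the filter is evaluated.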

And then I can see the overall health of the fleet with a click. The entire page updates to reflect the fleet; I focus on the summary:

Behind the scenes, Database Insights looks for CloudWatch alarms that include a DBInstanceIdentifier dimension, and uses these alarms to establish a correlation between database instances and alarms. This, along with other built-in heuristics and correlation steps, allows Database Insights to deliver helpful, well-organized information that will help you to better understand the overall health of your fleet and to dive deep in order to find bottlenecks and other issues.
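The alarm-to-instance correlation described above can be approximated in a few lines. This is only a sketch of the idea and not how Database Insights itself is built. It assumes each alarm is a dictionary with `AlarmName`, `StateValue`, and a top-level `Dimensions` list of `Name`/`Value` pairs, similar to the metric alarm records that CloudWatch returns. The alarm names and the second instance are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def alarms_by_instance(alarms: List[dict]) -> Dict[str, List[str]]:
    """Group alarm names by the value of their DBInstanceIdentifier dimension."""
    grouped = defaultdict(list)
    for alarm in alarms:
        for dimension in alarm.get("Dimensions", []):
            if dimension.get("Name") == "DBInstanceIdentifier":
                grouped[dimension["Value"]].append(alarm["AlarmName"])
    return dict(grouped)


def instances_in_alarm(alarms: List[dict]) -> List[str]:
    """Return the instances that have at least one alarm in the ALARM state."""
    firing = [a for a in alarms if a.get("StateValue") == "ALARM"]
    return sorted(alarms_by_instance(firing))


# Hypothetical sample data; only demo-mysql-reader0 appears in the post.
sample_alarms = [
    {
        "AlarmName": "reader0-high-cpu",
        "StateValue": "ALARM",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": "demo-mysql-reader0"}],
    },
    {
        "AlarmName": "reader0-low-memory",
        "StateValue": "OK",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": "demo-mysql-reader0"}],
    },
    {
        "AlarmName": "writer0-high-cpu",
        "StateValue": "OK",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": "demo-mysql-writer0"}],
    },
    {
        # No DBInstanceIdentifier dimension, so it is not tied to any instance.
        "AlarmName": "queue-depth",
        "StateValue": "ALARM",
        "Dimensions": [{"Name": "QueueName", "Value": "orders"}],
    },
]

print(alarms_by_instance(sample_alarms))
print(instances_in_alarm(sample_alarms))
```

Alarms without a DBInstanceIdentifier dimension are simply ignored by this sketch, which is why adding that dimension to your own alarms matters if you want them to show up against an instance.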

Clicking on an instance (represented by a hexagon) reveals details; I click on the instance name (demo-mysql-reader0) to learn more:

In the per-instance view I can also see a myriad of details:

Each of the tabs at the bottom provides additional insights into what’s happening inside the database instance. For example, selecting DB Load Analysis / Top SQL / SQL Metrics shows me which SQL statements are imposing the heaviest load, along with 29 additional metrics (not shown):

From past experience, I know that finding and understanding slow queries is a tedious yet important task. With Database Insights I can see patterns that are common to the slow queries, as well as the actual queries:

With help from AWS X-Ray, Application Signals, and the AWS Distro for OpenTelemetry SDK, I can see the services and operations that originate the queries to the database instance:

The red X indicates that this operation is failing the associated Service Level Objective (SLO), an application performance monitoring aspect of Application Signals. An SLO defines the reliability of a service against customer expectations, and can be set up by selecting the service and clicking Create SLO. There are a couple of steps and some very helpful options, but at its core an SLO is measured as a percentage of successful requests over an extended period of time:
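The core SLO arithmetic is easy to sketch. The functions below are illustrative only and are not part of Application Signals. The request counts and the 99.9% goal are invented numbers, and treating a period with no traffic as meeting the goal is an assumption made for this example.

```python
def slo_attainment(successful_requests: int, total_requests: int) -> float:
    """Percentage of requests in the period that succeeded."""
    if total_requests == 0:
        # Assumption for this sketch: no traffic counts as fully attained.
        return 100.0
    return 100.0 * successful_requests / total_requests


def meets_slo(successful_requests: int, total_requests: int, goal_percent: float) -> bool:
    """True when attainment over the period is at or above the goal."""
    return slo_attainment(successful_requests, total_requests) >= goal_percent


# Hypothetical period: 1,000 requests, 995 of them successful, goal of 99.9%.
attainment = slo_attainment(995, 1000)
print(f"attainment: {attainment}%")
print("meets goal:", meets_slo(995, 1000, goal_percent=99.9))
```

With these numbers the attainment is 99.5%, which falls short of a 99.9% goal; that is the kind of result the red X flags.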

If the database instance is configured to send logs to CloudWatch Logs, I can see and search the logs, filtered by the selected time period, and within a particular log group:

There’s still a lot more to explore at the fleet level. For example, I can see the ten calling services which drive the highest DB load (again, this is powered by AWS X-Ray, Application Signals, and the AWS Distro for OpenTelemetry SDK):

And I can see the top 10 instances with respect to any of eight different metrics:

I could go on all day, but I will leave the rest for you to explore. As I never tire of saying, this feature is available now and you can start using it today.

Things to Know
Here are a couple of things to know about Database Insights:

Supported Databases – You can use Database Insights with Amazon Aurora MySQL and Amazon Aurora PostgreSQL database instances.

Pricing – There is a per-hour, per-database instance charge based on the average number of vCPUs used (for provisioned instances) or Aurora Capacity Units (for Serverless v2 databases) monitored, with separate charges for ingestion and storage of database logs. See the CloudWatch Pricing page for more information.

Regions – This feature is available in all commercial AWS Regions.
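To get a feel for how the pricing note above turns into a number, here is a rough sketch. It assumes the charge scales linearly with the average vCPUs (or ACUs) and the hours monitored. The rate is a placeholder and not an actual AWS price, so take the real figure from the CloudWatch Pricing page. Log ingestion and storage are charged separately and are not included.

```python
def estimate_instance_charge(
    avg_units: float,
    rate_per_unit_hour: float,
    hours: float,
) -> float:
    """Estimate the charge for one instance.

    avg_units: average vCPUs (provisioned) or ACUs (Serverless v2) over the period.
    rate_per_unit_hour: price per unit per hour, taken from the pricing page.
    hours: number of hours the instance was monitored.
    """
    if avg_units < 0 or rate_per_unit_hour < 0 or hours < 0:
        raise ValueError("inputs must be non-negative")
    return avg_units * rate_per_unit_hour * hours


# Placeholder rate, NOT a real price: substitute the published rate.
PLACEHOLDER_RATE = 0.01

# Hypothetical instance averaging 4 vCPUs, monitored for 100 hours.
charge = estimate_instance_charge(avg_units=4, rate_per_unit_hour=PLACEHOLDER_RATE, hours=100)
print(f"estimated charge: {charge:.2f}")
```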

Jeff;

