JAWS PANKRATION 2024

Detect operational anomalies in Serverless applications with Amazon DevOps Guru

Lv200

8/24/2024 07:20 (UTC)

Session Info

In this talk, we'll use a standard serverless application built on API Gateway, Lambda, DynamoDB, and SQS.

We'll give an overview of how Amazon DevOps Guru recognizes operational issues and anomalies such as increased latency and elevated error rates (for example, timeouts and throttling).

We will also explore DevOps Guru "Proactive Insights", which recognize configuration anti-patterns such as a missing failure destination on Kinesis Data Streams, a missing DLQ on SQS, or over-provisioning of AWS services like DynamoDB tables.
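
To make the missing-DLQ anti-pattern concrete, here is a minimal Python sketch of such a check. It is not how DevOps Guru implements Proactive Insights; it only inspects an attribute dictionary of the shape returned by the SQS `GetQueueAttributes` API, where a configured dead-letter queue appears as a `RedrivePolicy` JSON string. The function name `has_dead_letter_queue` and the sample ARN are hypothetical.

```python
import json


def has_dead_letter_queue(attributes: dict) -> bool:
    """Return True if the SQS queue attributes declare a dead-letter queue.

    `attributes` is assumed to be the "Attributes" mapping of an SQS
    GetQueueAttributes response. A DLQ shows up there as a RedrivePolicy
    JSON string containing a deadLetterTargetArn.
    """
    policy = attributes.get("RedrivePolicy")
    if not policy:
        return False
    try:
        parsed = json.loads(policy)
    except json.JSONDecodeError:
        # Malformed policy: treat it as "no usable DLQ configured".
        return False
    return isinstance(parsed, dict) and bool(parsed.get("deadLetterTargetArn"))


# Hypothetical sample data, for illustration only.
with_dlq = {
    "RedrivePolicy": json.dumps(
        {
            "deadLetterTargetArn": "arn:aws:sqs:eu-central-1:123456789012:orders-dlq",
            "maxReceiveCount": "5",
        }
    )
}
without_dlq = {"VisibilityTimeout": "30"}

print(has_dead_letter_queue(with_dlq))
print(has_dead_letter_queue(without_dlq))
```

In a real account, the attribute dictionary would come from an SQS client call rather than from hand-written sample data.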

Amazon DevOps Guru analyzes data like application metrics, logs, events, and traces to establish baseline operational behavior and then uses ML to detect anomalies.

The service uses pre-trained ML models that are able to identify spikes in application requests, so it knows when to alert and when not to.
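
As a rough illustration of the "establish a baseline, then flag deviations" idea, the sketch below uses a trailing-window z-score. This is a deliberately simple statistical stand-in, not the pre-trained ML models DevOps Guru actually uses; the function name `find_anomalies`, the window size, the threshold, and the sample latencies are all assumptions made for the example.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return indices of values that deviate from the trailing baseline.

    For each point, the previous `window` values form the baseline. A point
    is flagged when it lies more than `threshold` standard deviations away
    from the baseline mean.
    """
    anomalies = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: any change counts as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical Lambda latencies in milliseconds, with one obvious spike.
latencies_ms = [120, 118, 125, 122, 119, 121, 950, 123]
print(find_anomalies(latencies_ms))
```

A fixed threshold like this one cannot tell an expected traffic peak from a genuine incident, which is the gap that models aware of periodic behavior are meant to close.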

Vadym Kazulkin

- AWS Community Builders -



Session Category
Computing
etc


AWS Services
API Gateway
Lambda
DynamoDB
SQS



Session Summary (by Amazon Bedrock)
The speaker introduces Amazon DevOps Guru, a fully managed AI platform powered by machine learning for operational problem-solving. DevOps Guru aims to reduce human intervention in IT processes and can detect issues like increased latency, error rates, and resource constraints.

Key features of DevOps Guru include:
1. Automatic detection of operational incidents
2. Noise reduction
3. Integration with AWS services
4. Specific detectors for various metrics (error rates, availability, latency)
5. Understanding of periodic behaviors (e.g., Black Friday, Christmas)

The speaker demonstrates a simple application using API Gateway, Lambda functions, and DynamoDB to showcase DevOps Guru's capabilities. The tool provides a dashboard with insights, aggregated metrics, and incident analysis.

Benefits of using DevOps Guru:
1. Correctly recognizes most operational issues
2. Takes about 7 minutes to create incidents
3. Provides recommendations for fixes
4. Offers infrastructure change comparisons when AWS Config is activated

Areas for improvement:
1. More precise recommendations
2. Better differentiation between error types
3. Enhanced support for observability services like X-Ray and third-party tools

Overall, the speaker finds DevOps Guru helpful in allowing teams to focus more on business logic while leveraging AWS's extensive experience in operating services across regions.

©JAWS-UG (AWS User Group - Japan). All rights reserved.