JAWS PANKRATION 2024

Serverless website analytics with Lambda@Edge

Lv300

8/24/2024 10:00 (UTC)

Session Info

In this talk, we dive into using AWS Lambda@Edge to gather statistics for a static website, collecting page views, user locations, and device types on the server side, without client-side scripts.

This solution uses AWS services such as S3, CloudFront, Step Functions, Glue, Athena, and Managed Grafana to process and analyze the data, offering scalability, cost-effectiveness, and minimal impact on website performance.

We'll delve into the architectural setup, data flow, and performance improvements.
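As a rough illustration of the server-side collection described above, the following is a minimal sketch of a Lambda@Edge handler attached to a CloudFront origin response trigger. It is not the speaker's implementation. The helper names (`_header`, `extract_page_view`) and the output field names are hypothetical, and the sketch assumes the distribution forwards the `CloudFront-Viewer-Country` and `CloudFront-Is-*-Viewer` headers to the origin.

```python
from datetime import datetime, timezone


def _header(headers, name, default="unknown"):
    # CloudFront passes headers as {lowercase-name: [{"key": ..., "value": ...}]}.
    values = headers.get(name, [])
    return values[0]["value"] if values else default


def extract_page_view(event, now=None):
    """Build a page-view record from a CloudFront event (hypothetical field names)."""
    request = event["Records"][0]["cf"]["request"]
    headers = request.get("headers", {})

    # Assumption: the device-detection headers are forwarded to the origin.
    device = "unknown"
    for kind in ("mobile", "tablet", "desktop"):
        if _header(headers, f"cloudfront-is-{kind}-viewer", "false") == "true":
            device = kind
            break

    now = now or datetime.now(timezone.utc)
    return {
        "page": request["uri"],
        "country": _header(headers, "cloudfront-viewer-country"),
        "device": device,
        "viewed_at": now.isoformat(),
    }


def handler(event, context):
    record = extract_page_view(event)
    # Forwarding `record` downstream (for example to an event bus) is left out here.
    # The response is returned unchanged so the page itself is not affected.
    return event["Records"][0]["cf"]["response"]
```

The timestamp is added inside the function because the viewer request itself carries no time field; the country and device values are only as reliable as the headers CloudFront forwards.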

Jimmy Dahlqvist

- AWS Community Builders -

- AWS User Community Leaders -

- AWS Ambassadors(APN) -

- AWS Gold Jacket Members(APN) -



Session Category
Analysis
Application integration


AWS Services
AWS Lambda
CloudFront
EventBridge
S3
Athena
Glue

Session Materials


Session Summary (by Amazon Bedrock)
The speaker discusses creating a serverless website analytics solution using AWS services, particularly for static websites. The need arose from Google Analytics' deprecation of Universal Analytics and dissatisfaction with alternative solutions.

Key requirements included:
1. No client-side tracking
2. Understanding page access, time, and location
3. Serverless implementation
4. Decoupled ingestion, storage, and analytics

The initial architecture involved CloudFront, Lambda@Edge, EventBridge, Kinesis Firehose, S3, Athena, and Glue. However, challenges emerged:
1. Deployment failures due to incompatibility between CloudFront functions and Lambda@Edge
2. Missing time information
3. Page sluggishness from synchronous EventBridge invocations

The refined solution addressed these issues:
1. Moved Lambda function to Origin Response
2. Implemented Step Functions for better data processing
3. Used Kinesis Firehose with a transformer to add newlines
4. Improved data parsing and categorization

The final architecture includes:
1. CloudFront function
2. Step Function for data processing
3. Kinesis Firehose for data ingestion
4. S3 for storage
5. Glue for data crawling
6. Athena for querying
7. Amazon Managed Grafana for visualization

The solution costs approximately $2 per month, primarily for S3 and Glue. It provides insights into blog post access, reader locations, and popular posts. The speaker shares a sample dashboard showing daily unique readers and a global access map. The presentation concludes by mentioning that the solution is available on serverlesshandbook.com for others to implement and adapt.
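The summary mentions a Kinesis Firehose transformer that adds newlines. A minimal sketch of such a transformation Lambda is shown below, assuming each incoming record is a UTF-8 JSON document. The function name `add_newline_handler` is hypothetical, and this is not necessarily how the speaker implemented it. Without a trailing newline, records delivered to S3 are concatenated on one line, which makes them harder to query as line-delimited JSON.

```python
import base64


def add_newline_handler(event, context):
    """Append a newline to every Firehose record so S3 objects are line-delimited."""
    output = []
    for record in event["records"]:
        # Firehose delivers each record's payload base64-encoded.
        payload = base64.b64decode(record["data"]).decode("utf-8")
        if not payload.endswith("\n"):
            payload += "\n"
        output.append(
            {
                "recordId": record["recordId"],  # must match the incoming ID
                "result": "Ok",
                "data": base64.b64encode(payload.encode("utf-8")).decode("utf-8"),
            }
        )
    return {"records": output}
```

Each returned record keeps its original `recordId` and reports `Ok`, which is what Firehose expects from a transformation function that successfully processed the record.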

©JAWS-UG (AWS User Group - Japan). All rights reserved.