- NordHero

This blog post explains how to build and optimize a serverless data pipeline on AWS, from data ingestion to business insights. In a cloud environment, data is often scattered across many sources, so collecting it in a centralized data lake is a common best practice for business intelligence. The solution uses AWS Glue Jobs to extract, transform, and load data into Amazon S3, AWS Glue Crawlers for schema management, Amazon Athena for SQL queries, and Amazon QuickSight with its SPICE engine for fast data visualization. The post demonstrates optimization techniques for AWS Glue ETL processes and explains how to make sure QuickSight dashboards display the most current data. To automate the whole pipeline, CloudWatch Events trigger an AWS Step Functions workflow when the Glue Crawlers complete, and the workflow then refreshes the QuickSight datasets in the correct order. The concepts are illustrated with real-world Helsinki public transport data.
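To make the automation step more concrete, here is a minimal sketch of its two halves: an event pattern that matches a successful Glue Crawler run, and a loop that refreshes SPICE datasets one after another. It is not the post's actual implementation. The post orchestrates the refresh with AWS Step Functions, while this sketch swaps in a plain Python polling loop to show the ordering logic. The function names, the crawler name, and the dataset IDs are hypothetical, and the `quicksight` argument is assumed to be a boto3 QuickSight client (or anything with the same `create_ingestion` and `describe_ingestion` methods).

```python
import time
import uuid


def crawler_succeeded_pattern(crawler_names):
    """Build an event pattern that matches successful runs of the given Glue crawlers.

    CloudWatch Events (now Amazon EventBridge) delivers these as
    "Glue Crawler State Change" events from the "aws.glue" source.
    """
    return {
        "source": ["aws.glue"],
        "detail-type": ["Glue Crawler State Change"],
        "detail": {
            "state": ["Succeeded"],
            "crawlerName": list(crawler_names),
        },
    }


def refresh_datasets_in_order(quicksight, account_id, dataset_ids, poll_seconds=15):
    """Refresh SPICE datasets sequentially, waiting for each one to finish.

    dataset_ids must already be listed in dependency order, for example
    base datasets first and datasets built on top of them afterwards.
    """
    for dataset_id in dataset_ids:
        # Each ingestion needs a caller-chosen unique ID.
        ingestion_id = str(uuid.uuid4())
        quicksight.create_ingestion(
            AwsAccountId=account_id,
            DataSetId=dataset_id,
            IngestionId=ingestion_id,
        )
        # Poll until this refresh completes before starting the next one.
        while True:
            response = quicksight.describe_ingestion(
                AwsAccountId=account_id,
                DataSetId=dataset_id,
                IngestionId=ingestion_id,
            )
            status = response["Ingestion"]["IngestionStatus"]
            if status == "COMPLETED":
                break
            if status in ("FAILED", "CANCELLED"):
                raise RuntimeError(
                    f"Refresh of dataset {dataset_id} ended with status {status}"
                )
            time.sleep(poll_seconds)
```

With real credentials, the second function could be called as `refresh_datasets_in_order(boto3.client("quicksight"), account_id, ["base-dataset", "joined-dataset"])`, where both dataset IDs are placeholders. In a Step Functions workflow the same polling would typically be expressed with Wait and Choice states rather than `time.sleep`, which avoids paying for an idle Lambda function while SPICE ingests data.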
