Week 7 Worklog

Week 7 Objectives:

  • Master AWS Analytics services: Kinesis, Glue, Athena, QuickSight
  • Understand DynamoDB fundamentals and design patterns
  • Learn data processing with AWS Glue, DataBrew, and EMR
  • Build end-to-end data pipelines and analytics solutions
  • Practice data visualization with QuickSight dashboards

Tasks to be carried out this week:

Day 1
Task:
- Study AWS Analytics services overview
  + Kinesis for streaming data
  + Glue for ETL
  + Athena for querying
  + QuickSight for visualization
- Learn DynamoDB basics
Start Date: 2025/10/20
Completion Date: 2025/10/20
Reference Material: https://docs.aws.amazon.com/kinesis/

Day 2
Task:
- Lab 35: Data streaming pipeline
  + Create S3 bucket and Kinesis Firehose
  + Generate sample data
  + Create Glue Crawler
  + Query with Athena
  + Visualize with QuickSight
Start Date: 2025/10/21
Completion Date: 2025/10/21
Reference Material: https://000035.awsstudygroup.com/

Day 3
Task:
- Lab 39: DynamoDB hands-on
  + Explore DynamoDB console
  + Create tables and items
  + Configure backups
  + Advanced design patterns
  + Build serverless applications
Start Date: 2025/10/22
Completion Date: 2025/10/22
Reference Material: https://000039.awsstudygroup.com/

Day 4
Task:
- Lab 40: DynamoDB cost optimization
  + Prepare and build database
  + Analyze data and costs
  + Configure tagging for cost allocation
  + Monitor usage patterns
Start Date: 2025/10/23
Completion Date: 2025/10/23
Reference Material: https://000040.awsstudygroup.com/

Day 5
Task:
- Lab 60: AWS CLI and SDK
  + Use CloudShell
  + Practice AWS Console operations
  + Work with AWS SDK
- Lab 70: AWS Glue DataBrew
  + Create Cloud9 instance
  + Upload dataset to S3
  + Profile and transform data
Start Date: 2025/10/24
Completion Date: 2025/10/24
Reference Material: https://000060.awsstudygroup.com/
                    https://000070.awsstudygroup.com/

Day 6
Task:
- Lab 72: Complete data pipeline
  + Ingest and store data
  + Catalog with Glue
  + Transform with Glue, DataBrew, EMR
  + Analyze with Athena and Kinesis Analytics
  + Serve with Lambda
  + Warehouse with Redshift
Start Date: 2025/10/25
Completion Date: 2025/10/25
Reference Material: https://000072.awsstudygroup.com/

Day 7
Task:
- Lab 73: QuickSight dashboards
  + Build basic dashboard
  + Add improvements
  + Create interactive visualizations
- Weekly review and cleanup
Start Date: 2025/10/26
Completion Date: 2025/10/26
Reference Material: https://000073.awsstudygroup.com/

Week 7 Achievements:

  • AWS Analytics Services:

    • Understood Kinesis Data Streams, Firehose, and Analytics
    • Learned AWS Glue for serverless ETL
    • Mastered Athena for SQL queries on S3
    • Explored QuickSight for business intelligence
  • Data Streaming Pipeline (Lab 35):

    • Created Kinesis Firehose delivery stream
    • Generated and ingested sample streaming data
    • Configured Glue Crawler to catalog data
    • Queried data with Athena SQL
    • Built QuickSight visualizations and dashboards
    • Implemented data transformation in Firehose
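
The Firehose data transformation step in Lab 35 is normally done by a Lambda function that receives a batch of base64-encoded records and returns each one with its recordId, a result status, and the re-encoded data. The handler below is a minimal sketch of that contract, not the lab's actual code: the added "source" field and its value "lab35" are hypothetical, and it assumes each incoming record is a JSON object.

```python
import base64
import json


def lambda_handler(event, context):
    """Sketch of a Firehose transformation Lambda: parse, enrich, re-encode."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            # Hypothetical enrichment: tag each event with a pipeline name.
            payload["source"] = "lab35"
            # A trailing newline keeps one JSON object per line in S3,
            # which is convenient for Glue and Athena.
            data = (json.dumps(payload) + "\n").encode("utf-8")
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(data).decode("utf-8"),
            })
        except (ValueError, TypeError):
            # Records that are not valid JSON objects are returned unchanged
            # and flagged so Firehose can route them to its error output.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Every recordId received must appear in the response; a record that should be discarded on purpose can be returned with the result "Dropped" instead.
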
  • DynamoDB Mastery (Labs 39, 40):

    • Created DynamoDB tables with partition and sort keys
    • Implemented GSI and LSI for flexible queries
    • Configured on-demand and provisioned capacity modes
    • Set up point-in-time recovery and backups
    • Learned DynamoDB design patterns (single-table design)
    • Built serverless applications with DynamoDB
    • Implemented DynamoDB Streams for event-driven architecture
    • Optimized costs with tagging and capacity planning
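
To illustrate the single-table design pattern mentioned above, here is a small in-memory sketch of how different entity types can share one table through composite keys. The USER#/ORDER# prefixes, the attribute names PK and SK, and the sample entities are assumptions chosen for illustration, not the schema used in the lab; the query helper only mimics a DynamoDB Query with a begins_with condition on the sort key.

```python
def user_key(user_id):
    """Key for a user's profile item."""
    return {"PK": f"USER#{user_id}", "SK": "PROFILE"}


def order_key(user_id, order_date, order_id):
    """Key for an order, stored in the same item collection as its user."""
    return {"PK": f"USER#{user_id}", "SK": f"ORDER#{order_date}#{order_id}"}


def query(items, pk, sk_prefix=""):
    """Mimic Query: PK equals pk AND begins_with(SK, sk_prefix), sorted by SK."""
    matches = [i for i in items if i["PK"] == pk and i["SK"].startswith(sk_prefix)]
    return sorted(matches, key=lambda i: i["SK"])


# A tiny "table" holding two entity types side by side.
table = [
    {**user_key("42"), "name": "Alice"},
    {**order_key("42", "2025-10-22", "A2"), "total": 30},
    {**order_key("42", "2025-10-21", "A1"), "total": 15},
    {**user_key("7"), "name": "Bob"},
]

# One key condition fetches all of a user's orders, oldest first.
orders = query(table, "USER#42", "ORDER#")
```

Because the order date is part of the sort key, a single query returns a user's orders in date order without a scan or a join.
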
  • Data Processing (Labs 60, 70, 72):

    • Used AWS CloudShell for CLI operations
    • Worked with AWS SDK for programmatic access
    • Created Cloud9 development environment
    • Uploaded and managed datasets in S3
    • Profiled data with AWS Glue DataBrew
    • Cleaned and transformed data with DataBrew recipes
    • Processed data with AWS Glue interactive sessions
    • Used Glue Studio GUI for visual ETL
    • Ran big data processing with EMR
    • Analyzed streaming data with Kinesis Data Analytics
    • Served data via Lambda functions
    • Loaded data into Redshift for warehousing
  • Data Visualization (Lab 73):

    • Connected QuickSight to multiple data sources
    • Built interactive dashboards with filters and parameters
    • Created various chart types (bar, line, pie, heat maps)
    • Implemented drill-down capabilities
    • Shared dashboards with stakeholders
    • Scheduled dashboard refreshes

Challenges Encountered:

  • Kinesis Firehose Buffer: Data did not appear in S3 immediately → Learned that Firehose holds records until the configured buffer size or buffer interval is reached
  • Glue Crawler Scheduling: Crawler not detecting new data → Configured proper schedule and S3 event triggers
  • DynamoDB Hot Partition: Throttling on high-traffic partition key → Redesigned partition key for better distribution
  • Athena Query Performance: Slow queries on large datasets → Implemented partitioning and columnar formats (Parquet)
  • QuickSight Permissions: QuickSight could not access S3 data → Granted QuickSight the required IAM permissions
  • DataBrew Recipe: Transformation not working as expected → Tested recipe on sample data before full run
  • EMR Cluster Costs: High costs for idle cluster → Used transient clusters and spot instances
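
For the DynamoDB hot-partition challenge above, one common way to spread traffic is write sharding: appending a calculated suffix to a busy partition key so that writes land on several partitions. The worklog does not say which redesign was used, so the sketch below is only one possible approach; the function names, the "#" separator, and the default of 10 shards are assumptions.

```python
import hashlib


def sharded_partition_key(base_key, item_id, shard_count=10):
    """Derive a stable shard suffix from item_id and append it to base_key."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{base_key}#{shard}"


def all_shard_keys(base_key, shard_count=10):
    """List every shard key; a reader queries each one and merges the results."""
    return [f"{base_key}#{n}" for n in range(shard_count)]


# The same item always maps to the same shard, so it can be read back directly.
key = sharded_partition_key("2025-10-22", "order-1001")
```

The trade-off is on the read side: fetching everything for the base key now takes one query per shard, so the shard count should stay as small as the write load allows.
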
