Posts tagged with
Multi-Table Inserts in Snowflake: One Scan, Many Targets
Snowflake’s Multi-Table Insert feature solves this cleanly: you write one SELECT, and Snowflake routes the rows into many target tables.
Zero-Downtime AWS EMR Deployments
Zero-Downtime EMR Deployments: Lessons Learned from Production
Real-time streaming using Kafka, Schema Registry and Spark Glue ETL for Avro records
This article explores a robust, scalable solution for real-time data processing using Amazon Managed Streaming for Apache Kafka (MSK), Confluent Schema Registry, and Apache Spark Streaming within AWS Glue ETL. My main focus is on making Avro records, Avro being a popular data serialization format in the Apache Kafka ecosystem, work across AWS and Confluent components. There are plenty of articles on real-time streaming of JSON records with MSK and Glue Schema Registry (GSR), but I could not find any that use Confluent Schema Registry with AWS services, so I wrote this article to fill that gap.