Building a Scalable Synthetic Data Pipeline with Amazon S3 Tables, Apache Iceberg, and Faker

David McAmis
Feb 28

Updated: Apr 28

Keen to try out Amazon S3 Tables? This hands-on tutorial shows how to generate millions of realistic synthetic financial transactions with Python's Faker library and AWS Glue, then store them in S3 Tables using Apache Iceberg, so you can test and develop without exposing sensitive production data. It also covers partition optimization and reproducible data generation at scale.
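To give a feel for the reproducible-generation idea, here is a minimal sketch, not the code from the full tutorial. It assumes a hypothetical transaction schema (`transaction_id`, `merchant`, `amount`, `currency`, `event_time`) and a hypothetical daily partition column, `transaction_date`. It uses Faker for merchant names when the library is installed and falls back to a small hard-coded list otherwise. Everything random is driven by a single seed, so the same seed yields the same rows.

```python
import random
import uuid
from datetime import datetime, timedelta, timezone

try:
    from faker import Faker  # third-party: pip install Faker
except ImportError:  # keep the sketch runnable without Faker installed
    Faker = None

# Illustrative values only; none of these come from the original tutorial.
MERCHANT_FALLBACK = ["Acme Corp", "Globex", "Initech", "Umbrella Ltd"]
CURRENCIES = ["USD", "EUR", "GBP", "AUD"]
START = datetime(2024, 1, 1, tzinfo=timezone.utc)


def generate_transactions(n, seed=42):
    """Return n synthetic transaction dicts; the same seed gives the same rows."""
    rng = random.Random(seed)
    fake = None
    if Faker is not None:
        fake = Faker()
        fake.seed_instance(seed)  # seed Faker so merchant names repeat too

    rows = []
    for _ in range(n):
        # Pick a timestamp within 365 days of START.
        ts = START + timedelta(seconds=rng.randrange(365 * 24 * 3600))
        merchant = fake.company() if fake else rng.choice(MERCHANT_FALLBACK)
        rows.append({
            # Build the UUID from the seeded RNG so IDs are reproducible.
            "transaction_id": str(uuid.UUID(int=rng.getrandbits(128), version=4)),
            "merchant": merchant,
            "amount": round(rng.uniform(1.0, 5000.0), 2),
            "currency": rng.choice(CURRENCIES),
            "event_time": ts.isoformat(),
            # Hypothetical partition column: one value per calendar day.
            "transaction_date": ts.date().isoformat(),
        })
    return rows


if __name__ == "__main__":
    for row in generate_transactions(3, seed=7):
        print(row)
```

Deriving a coarse column such as `transaction_date` from the event timestamp is one common way to give a table a low-cardinality partition key. Writing the rows to S3 Tables through AWS Glue is deliberately left out of this sketch; the linked article covers that part.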

From the AWS Builder Center blog:

https://builder.aws.com/content/3A5pA0YR3Ee4qM1Wuv59dzBheDM/building-a-scalable-synthetic-data-pipelines-with-amazon-s3-tables-apache-iceberg-and-faker
