Building a Scalable Synthetic Data Pipeline with Amazon S3 Tables, Apache Iceberg, and Faker
- David McAmis

- Feb 28
Updated: 4 days ago
Are you keen to try out Amazon S3 Tables? Learn how to generate millions of highly realistic synthetic financial transactions using Python's Faker library and AWS Glue, then store them in S3 Tables with Apache Iceberg for testing and development without exposing sensitive production data. This hands-on tutorial demonstrates partition optimization and reproducible data generation at scale.
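To give a flavor of what "reproducible data generation" and "partition optimization" mean in practice, here is a minimal sketch. It is not the code from the tutorial: it uses Python's standard-library `random` module as a stand-in for Faker so that it runs with no dependencies, and the field names, value ranges, and day-level partition column (`txn_date`) are all assumptions made for illustration. The idea it shows is that a fixed seed yields the same rows on every run, and that a date column derived from the timestamp gives a natural key for partitioning a table.

```python
import random
import uuid
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical value pools; a Faker-based version would draw these from providers.
CURRENCIES = ["USD", "EUR", "GBP", "AUD"]
CATEGORIES = ["grocery", "fuel", "travel", "dining", "online"]
START = datetime(2024, 1, 1)  # assumed start of a 30-day window


def generate_transactions(n, seed=42):
    """Return n synthetic transactions; the same seed always yields the same rows."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    rows = []
    for _ in range(n):
        ts = START + timedelta(seconds=rng.randrange(30 * 24 * 3600))
        rows.append({
            # Build the UUID from seeded bits so the IDs are reproducible too.
            "transaction_id": str(uuid.UUID(int=rng.getrandbits(128), version=4)),
            "event_time": ts.isoformat(),
            "txn_date": ts.date().isoformat(),  # assumed partition column
            "amount": round(rng.uniform(1.0, 5000.0), 2),
            "currency": rng.choice(CURRENCIES),
            "category": rng.choice(CATEGORIES),
        })
    return rows


def rows_per_partition(rows):
    """Count rows per day to check how evenly the data spreads across partitions."""
    return Counter(row["txn_date"] for row in rows)


if __name__ == "__main__":
    sample = generate_transactions(1000, seed=7)
    print(sample[0])
    print(len(rows_per_partition(sample)), "daily partitions")
```

Calling `generate_transactions` twice with the same seed returns identical lists, which is what makes test runs repeatable. In a Glue job, each worker would typically get its own seed so that workers do not emit duplicate rows.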
From the AWS Builder Center blog:


