

Blog: Inside AWS Glue: Understanding the Spark Engine and Using the Spark UI for Troubleshooting
Learn how AWS Glue uses the Spark engine to create scalable, performant data pipelines, and how to monitor background processes using the Spark UI. https://builder.aws.com/content/38Rp4bJuE5lSsX89iee3gjDeZkO/inside-aws-glue-understanding-the-spark-engine-and-using-the-spark-ui-for-troubleshooting

David McAmis
Jan 19 · 1 min read


Blog: A Beginner’s Guide to Orchestrating AWS Glue Jobs with Amazon Managed Workflows for Apache Airflow (MWAA)
If you’re building data pipelines on AWS, you’ve probably used AWS Glue to run ETL (Extract, Transform, Load) jobs. Glue automates data movement and transformation so you can focus on insights rather than infrastructure. But what if you need to schedule those jobs, run them in sequence, or trigger them based on the completion of other tasks? That’s where Amazon Managed Workflows for Apache Airflow (MWAA) comes in. This blog will provide everything you need to set up your firs

David McAmis
Jan 19 · 1 min read


Blog: Connecting to Salesforce Data Using AWS Glue
Integrating Salesforce data into your AWS analytics ecosystem is an essential step in building a comprehensive view of your customers, sales, and operations. With the growing number of options available in AWS for ingesting and transforming external data, it’s important to understand which approach best suits your needs—especially when comparing traditional Glue ETL jobs with newer Zero-ETL features. https://builder.aws.com/content/2zqZrESSbXDWzl2ftsAmL89UhuJ/connecting-to-sa

David McAmis
Nov 7, 2025 · 1 min read


Blog: Troubleshooting AWS Glue Jobs
AWS Glue is a powerful serverless ETL service designed to simplify data integration tasks at scale. However, like any data engineering tool, Glue jobs can—and will—fail due to a variety of issues: configuration errors, data mismatches, IAM permission problems, or underlying infrastructure limits. This post is a practical guide to troubleshooting AWS Glue jobs. https://builder.aws.com/content/2y4nDmkmTBfknTTWR5wwtfrbECQ/troubleshooting-aws-glue-jobs

David McAmis
Nov 7, 2025 · 1 min read


Blog: Ingest Excel Files into a Data Lake Using AWS Glue
As organizations modernize their data infrastructure, ingesting legacy Excel files into cloud-based data lakes is becoming increasingly...

David McAmis
Jun 5, 2025 · 1 min read


AWS Machine Learning - The Art of the Possible (Twitch Series)
Hi everyone, I recently got to host an episode on "AWS Machine Learning - The Art of the Possible" on Twitch. The team covered a super...

David McAmis
May 4, 2021 · 1 min read


Snowflake + AWS Resources
Snowflake on AWS, the data warehouse built for the cloud, is an industry-leading platform for both advanced data analytics and machine...

David McAmis
Jul 31, 2020 · 2 min read


Generating Leads and Opportunities with Webinars: Getting Started
In this new article series on Medium, we will be looking at new ways of generating leads and opportunities for channel partners and how to a

David McAmis
Apr 22, 2020 · 1 min read


Getting Started with AWS Glue
AWS Glue is a fully managed, clusterless ETL service that allows you to quickly extract and prep data for a wide variety of use cases. In...

David McAmis
Nov 7, 2018 · 1 min read


Building a Big Data and Analytics Practice: From Zero to Hero in 5 Steps
So you can get the most value from your data, AWS provides the most comprehensive, secure, scalable, and cost-effective portfolio of...

David McAmis
Nov 5, 2018 · 1 min read


Modernize your Data Warehouse with Amazon Redshift + Redshift Spectrum
In this session you will learn how to migrate and modernise your legacy data warehouse, moving from an on-premises server or application,...

David McAmis
Nov 5, 2018 · 1 min read


Creating a Data Driven Culture with Amazon QuickSight
Data drives good business decisions and a data-driven culture can help organisations increase profitability and reduce costs. Amazon...

David McAmis
Nov 5, 2018 · 1 min read


Visualising Your Data Insights with Amazon QuickSight
Visualisation is an essential part of analysing your data to gain insights and communicate them across your organisation. In this session...

David McAmis
Nov 5, 2018 · 1 min read


AWS Learning Series: Harnessing the Power of Data
Learn how to architect and build a data lake solution, and how to integrate key AWS services, including Amazon S3, AWS Glue, Amazon...

David McAmis
Nov 5, 2018 · 1 min read


AWS Data Lake Resources
Here is a list of resources I have put together for building Data Lakes on AWS. If you have one to add, leave a comment. AWS Data Lake...

David McAmis
Oct 31, 2018 · 1 min read


Encrypt your Redshift Cluster with 1-click
In case you missed it, AWS released 1-click encryption to allow you to easily encrypt an existing Redshift Cluster. For the full...

David McAmis
Oct 31, 2018 · 1 min read