Category: Glue

AWS Glue Part 3: Automate Data Onboarding for Your AWS Data Lake

Choosing the right approach to populate a data lake is usually one of the first decisions architecture teams make after deciding which technology to build their data lake with. A recent trend that seems to be taking over is using Spark, since it’s fast and powerful and comes with a lot of flexibility when used …

Saeed Barghi | AWS, Business Intelligence, Cloud, Glue, Terraform | May 1, 2018 / September 5, 2018 | 3 Minutes

AWS Glue Part 2: ETL your data and query the result in Athena

In part one of my posts on AWS Glue, we saw how Crawlers can be used to traverse data in S3 and catalogue it for AWS Athena. Glue is a serverless service that can be used to create, schedule, and run ETL jobs. In this post we'll create an ETL job using Glue, execute …

Saeed Barghi | AWS, Business Intelligence, Cloud, Glue | 2 Comments | April 25, 2018 | 3 Minutes

AWS Glue Part 1: Discover and Catalogue Data Stored in s3

Learn how to add a Crawler in AWS Glue for data that is stored in S3.

Saeed Barghi | AWS, Business Intelligence, Cloud, Glue | 3 Comments | April 23, 2018 / April 24, 2018 | 2 Minutes
