How to trigger a Spark job from AWS Lambda
- Event driven pipelines
- Lambda function to trigger Spark jobs
- Setup and run
- Monitoring and logging
- Teardown
- Conclusion
- Further reading
- Referen...