Should Data Pipelines in Python be Function based or Object-Oriented (OOP)?
- 1. Introduction
- 2. Data transformations as functions lead to maintainable code
- 3. Objects help track things (aka state...
Similar Articles (10 found)
78.4% similar
Data Pipeline Design Patterns - #2. Coding patterns in Python
- Introduction
- Sample project
- Code design patterns
- Python helpers
- Misc
- Conclus...
72.4% similar
Data Engineering Best Practices - #1. Data flow & Code
- 1. Introduction
- 2. Sample project
- 3. Best practices
- 3.1. Use standard patterns that pro...
71.8% similar
Python Essentials for Data Engineers
- Introduction
- Data is stored on disk and processed in memory
- Practicing Python
- Python basics
- Python is u...
66.6% similar
How to quickly deliver data to business users? #1. Adv Data types & Schema evolution
- 1. Introduction
- 2. Use Schema evolution & advanced data types...
65.7% similar
Data Pipeline Design Patterns - #1. Data flow patterns
- 1. Introduction
- 2. Source & Sink
- 3. Data pipeline patterns
- 4. Conclusion
- 5. Further r...
65.6% similar
Data Engineering Projects
1. Introduction
Whether you are new to data engineering or have been in the data field for a few years, one of the most chal...
65.6% similar
How to test PySpark code with pytest
- 1. Introduction
- 2. Ensure the code's logic is working as expected with tests
- 3. Conclusion
- 4. Further Rea...
64.6% similar
How to add tests to your data pipelines
Introduction
Testing data pipelines is different from testing other applications, like a website backend. If ...
63.8% similar
End-to-end data engineering project - batch edition
- Objective
- Setup
- Components
- Choosing tools & frameworks
- Future work & improvements
- Conc...
63.5% similar
Why use Apache Airflow (or any orchestrator)?
- 1. Introduction
- 2. Features crucial to building and maintaining data pipelines
- 3. Conclusion
- 4. ...