I found an interesting free book that is still a work in progress: The Data Engineering Cookbook.
I will be contributing through the Patreon site of the author (Andreas Kretz.com) (link to his Patreon), because I see data engineering as a topic that is not fully covered.
The book has a wide scope and is being written on an ongoing basis (for free, as I understand it, but supported through a Patreon model).
The book is split into five parts:
- introduction
- basic data engineering skills
- a real world data engineering example
- over 30 case studies, with links, from companies like Netflix, Twitter and Spotify
- collection of interview questions
Topics covered include:
- Data Engineer vs Data Scientists
- Basic Data Engineering Skills
- Git
- Agile development
- Learn how a Computer Works
- Computer Networking
- Security and Privacy
- Linux
- The Cloud
- Security Zone Design
- Big Data
- My Big Data Platform Blueprint
- Lambda Architecture
- Data Warehouse vs Data Lake
- Docker
- REST APIs
- Databases
- Data Processing and Analytics – Frameworks
- Apache Kafka
- Machine Learning
- Data Visualization
- Data Engineering Course: Building A Data Platform
- Case Studies: Airbnb, Spotify, Uber, Twitter and a range of others
In my teaching at Oxford University (Artificial intelligence: cloud and edge implementations), I have taken an engineering-led approach to data science. Many courses miss that depth, and it's not easy to teach because you need to cover three job roles: Data Engineering, Data Science and DevOps. It's easy to miss many small topics in this vast scope.
Hence, I hope this book will be a useful reference.
The book link is: The Data Engineering Cookbook