Thousands of articles and tutorials have been written about data science and machine learning. Hundreds of books, courses and conferences are available. You could spend months just figuring out how to get started, or even what data science is about.
In this short contribution, I share what I believe to be the most valuable resources: a small list of starting points. It should be most useful to data practitioners who have very little free time.
[Figure: Map-Reduce Explained]
These resources cover data sets, algorithms, case studies, tutorials, cheat sheets, and material for learning the most popular data science languages: R and Python. Some non-standard techniques used in machine-to-machine communications and automated data science are not included here, even though they are technically simpler and more robust, because their use is not widespread. The one exception is turning unstructured into structured data. We will cover the omitted techniques, as well as Hadoop-based techniques (distributed algorithms, or Map-Reduce), in a future article.
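For readers who have not met Map-Reduce before, the idea can be hinted at with a minimal single-machine sketch in Python. This is not Hadoop code and is not taken from any of the resources listed here. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the word-count task are assumptions chosen purely for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values seen for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on a list of strings."""
    # In a real distributed setting each document could be mapped on a different node.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    docs = ["big data is big", "data science"]
    print(word_count(docs))
```

Because the map step handles each document independently and the reduce step handles each key independently, both steps can in principle be spread across many machines, which is what Hadoop-style frameworks automate.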
1. Technical Material
- Curated Lists of Data Science, Machine Learning, Deep Learning and …
- Data Science Cheat Sheet
- Cheat Sheet: Data Visualization with R
- R, Python, Machine Learning, Dataviz: Most Popular Resources
- Great Github list of public data sets
- The Guide to Learning Python for Data Science
- Learning R in Seven Simple Steps
- 9 Python Analytics Libraries
- A Tour of Machine Learning Algorithms
- 4 easy steps to becoming a data scientist
- Turning Unstructured into Structured Data
2. General Content
- Deep Learning: Definition, Resources, Comparison with Machine Learning
- 24 Uses of Statistical Modeling
- Lifecycle of Data Science Projects
- Data Science Compared to 16 Analytic Disciplines
- 10 types of data scientists
- 11 Important Model Evaluation Techniques Everyone Should Know
- 50 Questions to Test True Data Science Knowledge
- 21 data science systems used by Amazon to operate its business
- 22 tips for better data science
- 10 Great Data Science Articles by Bernard Marr
3. Additional Reading
- State-of-the-Art Machine Learning Automation with HDT
- Fast Combinatorial Feature Selection with New Definition of Predict…
- Implementation of 17 classification algorithms in R
- Tests of Hypotheses Revisited
- 10 types of regressions. Which one to use?
- 17 short tutorials all data scientists should read (and practice)
- Tutorial: How to detect spurious correlations, and how to find the …
- 15 Most Controversial Data Science Articles
- Biased vs Unbiased: Debunking Statistical Myths
- 16 analytic disciplines compared to data science
Enjoy the reading!