Summary: There are several approaches to reducing the cost of training data for AI, one of which is to get it for free. Here are some excellent sources. Recently we w...
Python and R are the two most commonly used languages for data science today. They are both fully open source products and completely free to use and modify as required ...
In the twentieth century, oil was the most valuable resource – but not anymore. In today’s digital age, data is the new oil. It will play a similar, perhaps bigger rol...
The Forrester report “Predictions 2018: A year of reckoning” predicted that 80% of firms affected by the GDPR will not be able to comply with the regulation by the ti...
Articulate is an open source project that lets you take control of your conversational interfaces without worrying about where and how your data is stored. Also, ...
After my last blog on the use of relational databases PostgreSQL and MonetDB to help compensate for R’s RAM limitations, I received an email from a reader who ask...
Full title: Applied Stochastic Processes, Chaos Modeling, and Probabilistic Properties of Numeration Systems. An alternative title is Organized Chaos. Published June 2, 2...
In the “Ecology of Metrics,” I wrote about “alignment” being a type of metric; alignment can measure the extent to which an organization’s supply or capacity is...
When the first release of Spark became available in 2014, Hadoop had already enjoyed several years of growth in the commercial space, from 2009 onwards. Although Hadoop s...
The age of Artificial Intelligence (AI) is almost upon us. Rapid developments in machine learning have allowed us to build better, smarter machines that are capable of ma...