I have used synthetic data sets many times for simulation purposes, most recently in my articles Six degrees of Separations between any two Datasets and How to Lie with p...
Everyone’s talking data. Data is the key to unlocking insight, the secret sauce that will help you get predictive, the fuel for business intelligence. The transformativ...
Probably the worst error is thinking there is a correlation when that correlation is purely artificial. Take a data set with 100,000 variables, say with 10 observations. ...
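To make the scenario in this teaser concrete, here is a minimal sketch, not the article's own code. It assumes every variable is independent Gaussian noise, so any correlation it finds is artificial by construction. The function names `pearson` and `max_spurious_correlation`, the seed, and the choice of a Gaussian generator are all illustrative assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_spurious_correlation(n_obs=10, n_vars=100_000, seed=0):
    """Largest |correlation| between one noise target and n_vars noise variables.

    Assumption: all variables are independent standard normal draws,
    so the true correlation between any pair is zero.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    best = 0.0
    for _ in range(n_vars):
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        best = max(best, abs(pearson(target, candidate)))
    return best
```

Calling `max_spurious_correlation()` with the defaults scans 100,000 noise variables of 10 observations each and returns the strongest correlation found with the target. With so few observations and so many candidates, that value ends up close to 1 even though no real relationship exists.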
Please join me in Las Vegas at Hitachi Vantara’s NEXT 2019 in October where I’ll be talking about data lake “second surgeries” – viva, baby. Data science has b...
The housing market has undergone quite a change in the past decade, as more stringent lending criteria for housing have been enforced. A key objective of financial in...
In addition to being the sexiest job of the twenty-first century, Data Science is the new electricity, to quote Andrew Ng. A lot of professionals from various discipline...
After posting my most recent blog using census data to illustrate handling “large” dataframes in R exploiting fst and feather file formats, I realized I c...
Many statistics, such as correlations or R-squared, depend on the sample size, making it difficult to compare values computed on two data sets of different sizes. Here, w...
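The dependence on sample size can be illustrated with a small simulation. This is a sketch under stated assumptions, not the method the article goes on to describe: both variables are independent Gaussian noise, and the helper names `pearson` and `mean_r_squared` are hypothetical. For independent normal variables the expected R-squared is 1/(n - 1), so the same "no relationship" situation yields very different values at different sample sizes.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_r_squared(n_obs, n_trials=2000, seed=0):
    """Average R-squared between two independent noise variables of size n_obs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        x = [rng.gauss(0, 1) for _ in range(n_obs)]
        y = [rng.gauss(0, 1) for _ in range(n_obs)]
        total += pearson(x, y) ** 2
    return total / n_trials


if __name__ == "__main__":
    # Compare the simulated average with the theoretical value 1 / (n - 1).
    for n in (5, 10, 50, 200):
        print(n, round(mean_r_squared(n), 4), round(1 / (n - 1), 4))
```

The simulated averages shrink as the sample size grows, which is why an R-squared computed on a small data set cannot be compared directly with one computed on a large data set.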
Artificial intelligence (AI) has seemingly been discussed everywhere over the last few years, and now it’s made its way into the commercial insurance industry. Organiza...
Originally posted by John Bowden. Once every two years, the world’s most renowned scientists meet. These meetings are held to tackle biological puzzles t...