Personal updates and DSC
After more than 10 years being involved with Data Science Central, initially as the founder, and most recently being acquired by TechTarget, I have decided… Read More »Personal updates and DSC
This rubric covers the use of statistical tools on large datasets to create models and derive inferences, as well as the field of statistics as a whole. It differs from machine learning primarily in that the latter relies on functional gradient analysis or neural networks (kernels) to derive models.
P-values and critical values are so similar that they are often confused. They both do the same thing: enable you to support or reject the… Read More »P Value vs Critical Value
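As a minimal sketch of why the two approaches lead to the same decision, the snippet below runs a two-sided z-test both ways. The function name `z_test_decisions`, the choice of a z-test, and the default `alpha=0.05` are assumptions made for illustration; they are not taken from the linked article.

```python
from statistics import NormalDist


def z_test_decisions(z_stat, alpha=0.05):
    """Two-sided z-test decided two ways (illustrative helper, not from the article).

    Returns (p_value, critical_value, reject_by_p, reject_by_critical).
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

    # p-value route: probability of a statistic at least this extreme under H0.
    p_value = 2 * (1 - std_normal.cdf(abs(z_stat)))
    reject_by_p = p_value < alpha

    # Critical-value route: compare the statistic with the alpha-level cutoff.
    critical_value = std_normal.inv_cdf(1 - alpha / 2)
    reject_by_critical = abs(z_stat) > critical_value

    return p_value, critical_value, reject_by_p, reject_by_critical
```

For any test statistic, `reject_by_p` and `reject_by_critical` agree: the p-value is compared with alpha on the probability scale, while the critical value makes the same comparison on the scale of the statistic.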
I can no longer find where this chart, featuring relations between distributions, was first published. I remember seeing it on the Cloudera blog. Another shorter one… Read More »Statistical Distributions in One Picture
Earlier this week, I was speaking at an event on AI for Real Estate where I showed an example from a BBC clip which said… Read More »Why we need more Bayesian trained data scientists than frequentist post COVID 19 ..
My views on how to effectively align daily decisions to business objectives. At first glance, the odds of winning at rock-paper-scissors are one in three… Read More »Transforming Day-to-Day Decisions in the Enterprise
If you scour the internet for “ANOVA vs Regression”, you might be confused by the results. Are they the same? Or aren’t they? The answer… Read More »ANOVA vs Regression in One Picture
It is surprising to see the level of innumeracy in the population, even among college-educated professionals. People still have blind faith in so-called experts and… Read More »Three fallacies about Covid-19
How oversampling yielded great results for classifying cases of sexual harassment. The problem: overcoming an imbalanced data set. When it comes to data science, sexual… Read More »Overcoming an Imbalanced Dataset using Oversampling.
In my previous article, we analyzed the COVID-19 data of Turkey and selected the cubic model for predicting the spread of disease. In this article,… Read More »Model Selection: Adjusted Coefficient of Determination-Variance Tradeoff
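For readers unfamiliar with the adjusted coefficient of determination, here is a minimal sketch of the standard formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1). The function name `adjusted_r2` and the numbers in the usage note are hypothetical and are not results from the linked article.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2: penalizes R^2 for the number of predictors used.

    r2           -- ordinary coefficient of determination
    n_obs        -- number of observations (n)
    n_predictors -- number of predictors, excluding the intercept (p)
    """
    dof = n_obs - n_predictors - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n_obs - 1) / dof
```

With made-up values, `adjusted_r2(0.9, 30, 3)` (a cubic model has three polynomial terms) is slightly below 0.9, and adding more terms at the same R² lowers it further, which is the tradeoff the title refers to.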
(This article is now a chapter of my github proto-book Bayesuvius) Simpson’s paradox is a recurring nightmare for all statisticians overseeing a clinical trial for… Read More »Simpson’s Paradox, the Bane of Clinical Trials