Gianluca Malato
1 min read · Apr 28, 2019


Hi, Chris. The point here is the finite-size error you get when selecting a uniform sample from a population, and the statistical bias of the sample that results from it. If the sample and the population have the same column-wise distributions, all the p-values are very close to 1. I don’t think that having more features increases the failure probability by itself. It really depends on how the features are distributed.

However, I’ll explain a better technique in an article I’m writing.
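To make the column-wise comparison concrete, here is a minimal sketch in plain Python. It assumes the p-values come from a two-sample Kolmogorov-Smirnov test applied to each column separately, which this reply does not state explicitly. The helper names (`ks_statistic`, `ks_pvalue`, `column_pvalues`), the toy columns and the sample size are all hypothetical, and the p-value uses the standard asymptotic approximation rather than an exact computation.

```python
import math
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step past every value equal to x in both lists (handles ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n, m):
    """Asymptotic p-value from the Kolmogorov distribution (approximation)."""
    ne = n * m / (n + m)  # effective sample size
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.2:
        # The series converges slowly here, and the true value is ~1 anyway.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


def column_pvalues(population, sample):
    """Both arguments map a column name to a list of floats."""
    return {
        name: ks_pvalue(ks_statistic(sample[name], population[name]),
                        len(sample[name]), len(population[name]))
        for name in population
    }


# Hypothetical population: three columns with different distributions.
rng = random.Random(0)
n_rows = 20000
population = {
    "gaussian": [rng.gauss(0.0, 1.0) for _ in range(n_rows)],
    "exponential": [rng.expovariate(1.0) for _ in range(n_rows)],
    "uniform": [rng.random() for _ in range(n_rows)],
}

# Uniform sample of rows, so every column is sampled on the same rows.
rows = rng.sample(range(n_rows), 1000)
sample = {name: [col[r] for r in rows] for name, col in population.items()}
pvalues = column_pvalues(population, sample)

# A deliberately biased sample for contrast: the 1000 smallest values per column.
biased = {name: sorted(col)[:1000] for name, col in population.items()}
biased_pvalues = column_pvalues(population, biased)

for name in population:
    print(name, round(pvalues[name], 3), biased_pvalues[name])
```

Each column is tested independently of the others, so what decides whether a sample passes is how each feature is distributed, as the reply argues. One caveat of this sketch: the sample is a subset of the population, so the two lists are not independent and the p-values should be read as a rough diagnostic.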



Written by Gianluca Malato

Theoretical Physicist, Data Scientist and fiction author. I teach Data Science, statistics and SQL on YourDataTeacher.com. E-mail: gianluca@gianlucamalato.it
