Causation vs. correlation. Big data still follows the basics of statistical analysis: sampling error and sampling bias still apply.
“nobody wants ‘data’. What they want are the answers.”
Big Data’s four articles of faith debunked:
- “Uncanny accuracy is easy to overrate if we simply ignore false positives.
- The claim that causation has been ‘knocked off its pedestal’ is fine if we are making predictions in a stable environment but not if the world is changing (as with Flu Trends).
- The promise that ‘N = All’, and therefore that sampling bias does not matter, is simply not true in most cases that count.
- ‘with enough data, the numbers speak for themselves’ – that seems hopelessly naive in data sets where spurious patterns vastly outnumber genuine discoveries.”
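To make the last point concrete, here is a minimal sketch (my own illustration, not from the quoted source) of how spurious patterns pile up when many unrelated variables are tested against one target. Everything in it is an assumption chosen for illustration: the function names `pearson` and `count_spurious`, the sample size of 30, the 1,000 noise features, and the 0.361 cutoff, which is roughly the two-sided 5% critical value of the correlation coefficient at that sample size.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def count_spurious(n_samples=30, n_features=1000, threshold=0.361, seed=0):
    """Count pure-noise features whose |r| with a noise target meets the threshold.

    Every feature is independent of the target by construction, so each
    "hit" is a spurious pattern, not a discovery.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    hits = 0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        if abs(pearson(feature, target)) >= threshold:
            hits += 1
    return hits


if __name__ == "__main__":
    # Around 5% of the noise features are expected to clear the cutoff by chance.
    print(count_spurious())
```

With no real relationship anywhere in the data, a few dozen of the 1,000 features should still look "significant", and because the count of chance hits scales with the number of features tested, adding more data columns produces more false leads, not fewer.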
“‘Big data’ has arrived, but big insights have not.”