Big Data, Education, and the Wisdom of the Crowd

Partly as an experiment, and partly due to nausea, I’m spending a few days deliberately avoiding broadcast “news” about the POTUS election. For starters, the post-game analysis is highly unlikely to inform a rematch of the contestants later in the season: there is no rematch.

Secondly, I want to speculate for myself about how the results will mean this or that, before letting official experts spoil my fun.

The nausea, continuing but waning, is not about the results; it is about the hangover from social toxins, namely the prior fear of the possible results, which had for months inspired “really smart” people in families and friendships everywhere to dig down deep to the lowest level of self-debasement and intellectual rigor they could muster in “arguing” their points. This globally limbic phenomenon stripped away much of the sense of security I had presumed from years of having both paid to learn and gotten paid to teach. As it turns out, education certainly creates an audience, even a credentialed one; but the “audience” may be entirely beside the point.

Instead, if education fails to create a community, then there are circumstantial vulnerabilities: situations in which, despite all the smarts, all bets are off. The recent epidemic of Frenemies may now start to subside (I hope) as people feel less of a need for mental combat and more of a need to restore connections. But the election has warned me to get up to speed on how education does and does not work.


One sensational irony of the just-completed POTUS election is how much elite intelligence was used to identify communities with lower “levels” of education, communities that didn’t care what the elite intelligence was predicting!

I am reminded of an idea I first heard very long ago that science claims bumblebees should not be able to fly, but the bees don’t know that and so they fly anyway.

The orgy of Big Data that was poured like fuel into the Big Media vehicles sometimes came with disclaimers, but “popular” consumption of it did not want the disclaimers to count. The contrast, of course, would be unpopular consumption, a behavior that came mostly with the extra oomph of not even caring about whether the disclaimer was there, since the stuff was just not going to be much used or trusted anyway.

If the disclaimer was inconsequential, and we saw that it was, then what is the more substantive and obvious problem with the Big Data?

That would be the problem of the data failing to be Thick Data, regardless of how Big it was.

Thin Data floated like fog over everything, but instead of condensing into a gracious watering of thirsty grounds, it got evaporated quickly and easily by an enormous burst of electoral heat.


For the record, some data processors did work on thickness. Having one surveyor needle a particular idea from four or five different directions was the right impulse, but typically the results were “inconclusive” precisely because they came up with contradictory answers from the same respondent. When the contradictions were not sellable by media highlighters, they expired on the shelf or were edited for marketability.

While Thin Data was being vigorously bought and sold among the “intelligentsia”, crowd-sourcing of opinions routinely clobbered the transformative ambitions of the data in pre-existing communities. Community information is inherently thick; most information in a community is immediately vetted upon arrival for its ability to support or confirm the community. If it gets past the gate, it is socially protected.

To an important extent, this highlights the persistence of a commonfolk opposition: from the community perspective, external parties firmly grasping high volumes of information may be intellectuals; but internal parties strongly holding high volumes of information are “knowledgeables”…


The emergent headline of the week is that Big Data saw a community without understanding it and consequently mis-explained it, not to the community but to itself.

The community in question, a statistical aggregate of people without college or white-collar certifications, was presumed to be significantly uneducated, but as the election showed, it knew what the intellectuals did not know about the actual value of the data.

YET… super-high up on the wish-list of the winning electorate is a solution to the problem of granting higher education to its current generation of teenagers. Arguably, the only reason this electorate would care about Big Data is if it is thick enough to be for them instead of just about them.

It’s going to take much better effort to establish meaningful pathways and uses of data whereby it is adopted by communities as knowledge. It seems more obvious now that the key is for communities to experience their own use of the externally-sourced data, with the accumulation of the shared experience becoming education.

•    •    •

Originally published at Medium on November 9, 2016.