Nothing is more frustrating to me than YouTube, which decides my front page based on my likes. It seems I can’t have multiple interests — variables — and thus, I must watch certain kinds of videos. In its infinite wisdom, Twitter believes that only the people whose content I like or share are the ones whose content I want to consume. And don’t get me started on online dating services — they could learn a thing or two from Sima Taparia.
And that is because the post-social world of today is starting to coalesce around variables that are less humanistic and more biased towards corporate goals. “We live in a world that demands categorization,” I recently read in a newsletter, Tiny Revolutions. “We have to do some self-definition so the world knows what to do with us, and so that we can bond with others who share our interests, values, and concerns.”
While the writer, Sara Campbell, might have been talking about an individual’s desire not to be categorized, her words accurately describe our post-social society’s reality, dilemma, and futility in a handful of lines.
Categorization is part of the human condition. Our brain uses categories to help us make sense of a lot of facts we experience. It is how we learn. As humans, we need categories to contextualize our world, and that includes each other. What is more important is the intent behind the categories.
Categories, as such, have bias by intent. The bias allows us to ignore variables we don’t want to deal with and place boundaries around a category. It’s important because by ignoring them, we have to use fewer cognitive resources. The bias itself is not good or bad. It is the intent that leads you in different directions. That intent determines what variables we focus on and the ones we ‘choose’ to ignore.
And a lot of that intent is determined by the human condition. For example, if you have grown up in a more traditional society, the category that defines you is your lineage for most of your life. The “intent” of that categorization is to find your place in the social hierarchy. Lineage isn’t a primary variable for Americans, but college and money are. That is why in more modern societies, such as America, the college you attend defines your place in society and the workplace.
Ever wondered why most conversations start with a question: what do you do? That question is not only reflective of our fading art of conversation, but it is also a way for us to define the variables and get quick context on the person. By doing so, we quickly assign a value-metric to the person who is the recipient of our attention.
At best, in the pre-Internet world, categorization would rear its head in a social context, often giving us cues on how to engage with someone. An attractive single woman gets a different kind of attention from another woman versus a single man. Given the nature of modern consumerist society, it wasn’t a surprise that the emergence of databases allowed marketers to categorize us into “buckets” of those who may or may not buy some products. After all, the early usage of computers had been catalyzed by the demands from governmental agencies and corporations that wanted to use data to create categories.
However, in our post-social society, these categories have metastasized and become even more granular. Just take Facebook as an example. School, location, gender, relationships, and many more variables have started to create a profile of us that can be bundled no differently than the dastardly collateralized debt obligation (CDO).
And Facebook isn’t alone in using so many variables. From online dating services to online marketing to banking, most of them feel both antiseptic and plastic. These data variables are what make up an algorithm whose sole job is categorization. At present, the algorithms are relatively simplistic. They lack the rationality and nuance that come from social science.
The bigger question is, what if all these data variables picked by companies for their own needs don’t define you or your interests? I suspect all of us will be trapped in a data prison, forced to live lives in which an invisible black-box algorithm decides what is good for us.
August 24, 2021, San Francisco
4 thoughts on “The Perils of Data Categorization”
I enjoyed this post – thank you.
The more I try to comment (gathering my thoughts before I type), the deeper I go down the rabbit hole (or the tangle of tunnels that is what’s left of my old brain).
Categorization is definitely a faster way to think; however, the faster I go, the more chance I have of making an error. And the faster I go, the less ability it seems I have to think more deeply. In business, speed is life – what’s the old saying, “Run fast and break things”?
I’d guess speed is not the point with YouTube but volume. They want us to watch more content and pull more ad revenue.
I also remember taking surveys with multiple choice answers and sometimes none of them fit my response. Or worse yet, having to ‘adjust’ my response to a personnel performance appraisal as to not trigger a ding on my Supervisor or Manager (because you know there’s always potential retribution). Whatever that category was, then it was falsely weighted. Maybe I’ve digressed into metrics here…
I have to wonder how many people think about being categorized. Do they care? How do they think it affects them (or not)? When I click the thumbs up rating on an Amazon delivery, what happens behind the scenes? When I don’t follow a medication recommendation from my physician (and he gets dinged in his metrics), what category do I get placed in? Just because I disagree with a risk factor doesn’t mean I don’t value the physician, I just don’t want to support the system that rates his performance in a way I don’t agree with.
… i think you’ve over-indexed on “intent” … because it really is impossible to know, with absolute certainty, what our intent might actually be. realistically, it’s a confluence of a number of different known and unknown factors / biases.
categorization is still very useful, as you say, because it provides context. but what you’ve gotten wrong is where you put the blame; it’s poor technology models / implementations that have “metastasized” categories and made them less useful… better, more empathetic tooling solves this by removing things like algorithmically-controlled feeds — this has been my focus for the last 4+ years as i’ve also grown tired of being man-handled by algorithms… i just want a feed that is not controlled by corporate interests (i had to build it myself).
categories are still useful but not all tooling stays useful.
Love this data reflection. I cringe upon seeing Google implementing more autocomplete options in Google Docs, comments, emails, etc. I fear what it does to the brain to not have to fill in one’s own gaps along with removing the ability to pause before figuring out what you want to say. I wonder if it’ll lower our tolerance for starting from scratch and rob us of building more creative muscles. Plus, I’m ashamed to admit that trust in my ability to spell words correctly has greatly diminished in the last 4-5 years as autocorrect takes over!
“Categorization is part of the human condition. Our brain uses categories to help us make sense of a lot of facts we experience. It is how we learn. As humans, we need categories to contextualize our world, and that includes each other. What is more important is the intent behind the categories.”
Om – this resonates with my focus on narrative design for organizations and people over the last six years. Narrative lives at a higher meta-level vs. storytelling. Stories are directional – from me to you. Narratives are bi-directional and open dialogues that allow for “context” and “clarity” to arise between parties.
The other counter-intuitive element to narrative is that it is a “pull strategy” versus the push of a story. To tie that back to your comments above, people will navigate and adopt the narrative that aligns with their belief systems. We are in an era of “slippery slopeism’s” with technology and social media amplifications ready at the touch to distort, twist and scale.
Hence the perils you speak of.