prokopetz

Okay, so you know how search engine results on most popular topics have become useless because the top results are cluttered with page after page of machine-generated gibberish designed to trick people into clicking through so it can harvest their ad views?

And you know how the data sets that are used to train these gibberish-generating AIs are themselves typically machine-generated, via web scrapers using keyword recognition to sort text lifted from wiki articles and blog posts into topical subsets?

Well, today I discovered – quite by accident – that the training-data-gathering robots apparently cannot tell the difference between wiki articles about pop-psych personality typologies (e.g., Myers-Briggs type indicators) and wiki articles about Homestuck classpects.

The upshot is that when a bot that's been trained on the resulting data sets is instructed to write fake mental health resource articles, sometimes it will start telling you about Homestuck.

prokopetz

You think I am joking.

I am not.

cliptech