Recently, Appreciation Bot – my Twitter bot that responds to museum artefacts with pseudo-intellectual commentary – tweeted something a bit off-colour. Not something intentionally offensive, perhaps, but certainly something that would raise eyebrows were a human to tweet it. I haven't included the tweet directly, but you can view it here. Even coming from a bot, it elicited some responses from people, and I wanted to write a bit about the bot, why this happened, and what it made me think of. Before I go any further, let me just say: my bots shouldn't offend people, and when they do it's my fault. But this event did throw up some interesting things for me to think about.
There are two common ways that Twitter bots source free-text material for their tweets. One is to use vocabulary lists of words or phrases that are pre-approved by the bot's author. Appreciation Bot uses this for the little flourishes at the end of its tweets.
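As a rough illustration (these are placeholder phrases I've made up, not the bot's actual vocabulary), such a list might look like this:

```python
import random

# Placeholder phrases for illustration, not Appreciation Bot's real list.
FLOURISHES = [
    "This really speaks to me.",
    "I could look at this all day.",
    "What a find.",
    "Simply sublime.",
]

def random_flourish():
    """Pick a pre-approved flourish to tack onto the end of a tweet."""
    return random.choice(FLOURISHES)
```

Because every phrase was written or vetted by the author, the worst this source can do is repeat itself.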
Another option is to use sanitised content sources that you can depend on to be acceptable, true, or generally good. Two Headlines, for instance, merges two data sources (major news organisations) that the bot's author trusts to be inoffensive, professional and accurate.
Appreciation Bot was intended to be an attempt at using a computational creativity tool as part of a Twitter bot, to build a bot that was a little bit more computationally creative than the average bot (not because that makes the bot 'better' – just a different avenue to explore). To this end, it uses Metaphor Magnet – a tool made by Tony Veale at University College Dublin – to find creative connections between objects. Metaphor Magnet is the result of an analysis of Google n-grams; not the Books corpus available online, but a vast 27-DVD corpus of text files providing statistics on the words and phrases people use online. It uses these phrases, and the frequency with which Google found them on the web, to extract metaphorical and other linguistic relationships. "As X as Y" is a common pattern, for instance, that Tony's software can readily extract and repurpose.
Metaphor Magnet can tell you, for instance, that chocolate can be compared to a charming actress. It can offer evidence from things people have said to support this, too: chocolate is alluring, chocolate is celebrated, and so on. These are all backed up by phrases that appear at least 40 times across the Google corpus, which means that much of this information is genuinely valuable and reflects things people actually believe. Of course, some comparisons are better and more reliable than others. 'As black as coffee' reflects the fact that the speaker believes the colour of coffee is black. 'As lazy as a computer scientist' reflects a prejudiced belief about a particular kind of human being, which is less reliably true (although potentially useful information if we know in advance that it is biased).
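To make the idea concrete, here's a minimal sketch of how a pattern like this might be mined from frequency-annotated n-grams. Everything here is my own illustration (the data, the threshold handling, the function names); Metaphor Magnet's actual pipeline is far more sophisticated:

```python
import re
from collections import defaultdict

# Hypothetical (phrase, web frequency) pairs; illustrative, not real corpus data.
NGRAMS = [
    ("as black as coffee", 1200),
    ("as black as night", 5400),
    ("as lazy as a computer scientist", 55),
    ("as red as rubies", 12),  # below the evidence threshold, discarded
]

AS_X_AS_Y = re.compile(r"^as (\w+) as (?:a |an |the )?(.+)$")
MIN_COUNT = 40  # the evidence threshold mentioned above

def extract_properties(ngrams):
    """Map each noun Y to the set of properties X it is compared with."""
    properties = defaultdict(set)
    for phrase, count in ngrams:
        match = AS_X_AS_Y.match(phrase)
        if match and count >= MIN_COUNT:
            prop, noun = match.groups()
            properties[noun].add(prop)
    return dict(properties)

print(extract_properties(NGRAMS))
# {'coffee': {'black'}, 'night': {'black'}, 'computer scientist': {'lazy'}}
```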
Appreciation Bot shouldn't be tweeting things that offend people, and it's my fault that it tweeted what it did the other day. There is a conflict here, however. I'd like my bots to use intelligent, rich data sources like Metaphor Magnet as a complement to the static, dependable sources used by many other Twitter bots; Twitter is an exciting platform for Computational Creativity research. At the same time, using more volatile and unreliable data means that our bots need to be more aware of potential offence, and able to react to situations where they might be hurting people.
In the short term, I'm going to make a large list of 'danger words' – a subjective list of things I would hope my bots won't mention without due consideration. Appreciation Bot simply won't tweet anything that contains these words; it'll regenerate and try another tweet instead. In the longer term, though, I don't want bots to just shy away from these topics – one day, I'd like bots to be intelligent enough to deal with difficult topics in the same way that humans do. Ultimately, I think the real problem with the Appreciation Bot tweet that started all this isn't the subject matter it chose, but the lack of intelligence in the comment it tacked on afterwards. The weasel phrase "This really makes me feel optimistic" is inappropriate given the gravity and negative associations of the preceding phrase.
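Here's a minimal sketch of that filter, assuming a hypothetical generate_tweet() function and an attempt cap I've invented for illustration:

```python
# DANGER_WORDS, generate_tweet() and MAX_ATTEMPTS are all hypothetical;
# this sketches the plan above, it is not Appreciation Bot's actual code.
DANGER_WORDS = {"massacre", "famine", "plague"}  # hand-curated, necessarily subjective

MAX_ATTEMPTS = 10

def safe_tweet(generate_tweet):
    """Keep regenerating until a candidate tweet avoids every danger word."""
    for _ in range(MAX_ATTEMPTS):
        candidate = generate_tweet()
        # Substring matching is deliberately aggressive: it catches
        # inflected forms ("massacres") at the cost of false positives.
        if not any(word in candidate.lower() for word in DANGER_WORDS):
            return candidate
    return None  # give up rather than risk posting something hurtful
```

Even this tiny sketch involves a real design decision: matching substrings rather than whole tokens catches inflected forms, but also flags innocent words that happen to contain a danger word (the classic Scunthorpe problem).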
If Appreciation Bot were cleverer, rather than shying away from these topics and remaining less intelligent, maybe it could have navigated this difficult terrain and avoided offending people?
I'm interested to hear people's thoughts on the matter. In January I'm going to attend a code camp where students will be making computationally creative Twitter bots, and I'll be giving a talk on the state of Twitter bots today. It's good for me to have these discussions and understand the world a little better as a result!
Though the tweet may be perceived as offensive, it could also just pass as bad sarcasm. Which makes me think: how could you tell them apart? Do you think you could make a sarcastic bot out of the "filtered tweets"? Perhaps if you could find a way of measuring humor you could make a stand-up comedian bot :p Anyway, it was just a thought. Keep up the good work.
Sarcasm and irony detection is an active research area in humour studies and computational linguistics, I think. It's really tough for loads of reasons, but I have seen some papers claiming to make progress on some types of sarcasm.
I guess the other issue here might be about the perception of Twitter bots. Because we know they're bots, we assume they're talking in a particular voice. Most bots are quite strait-laced (although there are a few, like https://twitter.com/wikisext, where I suppose I internalise a more joking voice), and so we're less likely to interpret a statement as humorous right off the bat.
Thanks for commenting! It’s interesting stuff.