Man vs AI vs Man: The Register Problem


Thanks to a solid 55 years or so of film and TV telling us that the AIs we create will try to kill us, and then watching ChatGPT and operator-attacking-murder-AIs emerge in roughly the same six-month time frame, I feel absolutely confident in saying that humans are terrible listeners.

But I think, also, in a weird way, this might be the reason that the machines never defeat us.

Hear me out…

It’s still gonna be a minute before the AIs take over humanity

Whether it’s some image recognition API or an insurance chatbot, an AI is going to need a lot of training. And much of that training has to be done by humans.

In fact, a whole subsection of the gig economy is emerging that’s based entirely around labeling and training AIs.

We’re going to enter a phase pretty soon where AI training and labeling is going to become a task as mundane as data entry.

There’s probably going to be a solid 10-15 years before the robot apocalypse where we bore ourselves to death training them for the task.

And during that time, we’re not going to stop using our AIs.

Humans don’t talk to all humans the same way

Sociolinguists call this “register”. Register is how we describe the way our language changes based on setting. Our language doesn’t just change in pronunciation, but in grammar and vocabulary as well.

One of the ways that linguists evaluate register is through a “formality scale”. In English we recognize five levels on that scale right now:

Frozen: Sometimes it’s called a static register. This is language that’s been printed and hasn’t changed since it was printed. Creeds, Bible quotes, and oaths are like this. They get weirder the further we get from when they were printed.
Formal: This is usually unidirectional. It’s a speaker speaking or presenting to an audience using some specialized vocabulary and exact definitions.
Consultative: This is bidirectional. There’s some amount of knowledge that both participants have, but there’s no assumption that both participants have the same depth of that information. It can involve specialized knowledge and jargon. Interruptions are allowed. This happens between teachers and students, lawyers and clients, doctors and patients.
Casual: Also bidirectional. No background information gets spelled out, because the participants (sometimes many, since it’s a social setting) are assumed to share it already. Interruptions are common. This is where you can drop words or whole phrases because they can be elided (that’s ellipsis, e.g. “More people are reading this than I thought ~~would read it~~” or “Who trained this AI? Monkeys ~~trained this AI~~”). This is how we talk to friends and peers.
Intimate: This is also bidirectional, with nothing spelled out between participants. But it’s usually just two participants. At this point tone matters more than wording or grammar; whole conversations can be had with one word. This is how we talk to family members and close friends.

Keep in mind this is just for English. Different languages may have more points on the scale and may even introduce grammatical shifts based on the formality (looking at you, Japanese).

So ask yourself this question:

Which register are we going to use when we’re talking with AIs?

AIs will get trained with our AI registers

I don’t really know what level of formality we’re going to use when we’re chatting with AIs.

My guess right now is that it’s more consultative because we’re at risk of encountering poor results and loss of information if we get too casual.

There’s no way in hell I’m going to count on Alexa to know what I mean when I casually say, “dude,” after she tries to upsell me on face masks when I ask about the weather report (thanks, Canadian wildfire smoke).

I predict that we will develop an “AI Register” in our respective languages that may fit somewhere between the consultative and casual formalities:

It will be somewhat bidirectional, but without interruptions (because they may break the AI)
There will be knowledge assumed between both parties
There may be an implicit mistrust of the specialized knowledge
The participant receiving the knowledge (the asker) is the one in control of the conversation
- Which means an absence of honorifics (sir, ma’am)
- And dropping most if not all etiquette (thank you, please, you’re welcome)

What that means is that AI trainers are going to train AIs against their own AI Register.

AIs will talk to us the way we talk to them

Sometimes what we find behind the mask is ourselves. Poorly rendered. — “You’re just a crappy photoshop of me,” Fred exclaimed to his captor

The folks that’re gonna be out there labelin’ n trainin’ are going to be doing so with pre-existing knowledge about the AIs. Because they aren’t living in a world where chatbots and image recognizers and murder drones don’t already exist.

Those folks I’m talkin’ ’bout ain’t model developers and computational linguists here.

We’re talking about the massive horde of people that will be employed to help these AIs figure out which images contain boats, whether “floats” has to do with numbers or boats, and what kind of sowing you’re doing with those oats.

Those people are going to be doing that training with a register that won’t quite match any of the existing registers. And if AIs continue to learn from those people to figure out how to talk to us, then AIs may never truly learn how humans talk.

They’ll only learn how we talk to AI.

The humans and the machines might form a sociolect or even a dialect

So the AIs are gonna talk to us the same way the trainers train them.

And then we’re gonna talk to them how they talk to us.

And then they’re gonna talk to us how we talk to them.

Sometimes what we think is someone else is actually us — Future us are gonna be so pissed

A sociolect is what happens when a particular social group develops a distinct form of communication. In addition to an “AI Register”, we might end up with an AI Sociolect where we:

Speak slower with fewer pauses
Enunciate more
Avoid homophones
And use more single-word imperatives (tell, speak, say, activate)

Human-AI sociolects could develop in many languages, with each of us creating our own peculiar ways of talking to each other (us and the “AIs” who will actually just be proxies for other humans).

And it may also mean that AIs just never really quite sound human, because humans may never talk to them like humans. And that might end up being a Good Thing©.

Frank M Taylor
