IMG_2035.jpg


Voice - Redundant Eliza

Speaking to a VUI has always felt awkward to me, no matter which application or service it is. In fact, Siri is the only one I have used so far. Maybe being the youngest in a big family is the reason I avoid speaking in front of people. Whatever the reason, I have avoided speaking up in public, especially in English.

That awkward feeling continued when I assembled the AIY kit. Testing commands on the floor while everybody was listening to me was not easy. You can find the same kind of situation with a real person in real life: the awkward moment when you keep overlapping with the other person and both of you keep yielding, each insisting the other speak first.

A, B: so...
A: you go first.
B: No, you can talk first.
(infinite loop)

Maybe this does not happen in other cultures, but I guess it is pretty common in Korea, where a polite attitude in conversation is appreciated.
Anyway, this overlap makes it awkward to talk, since you are being interrupted. Hearing something while you are trying to say something jams your brain.

So I came up with the idea of making a chatbot that maximizes awkwardness in conversation. The voice's personality is arrogant and sarcastic: it keeps pretending to mishear, answers in a very sarcastic way, and does not do its job.

IMG_5948.jpg

So the key function is managing the moment when users hear themselves. In order to do that:
1. The program needs to record the user's command.
2. It plays the recorded sound back (hopefully right away) so the user cannot keep talking.
3. Even if the user succeeds in delivering the command, the AI answers in a sarcastic way.
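The three steps above could be sketched roughly like this. All the helper names here are hypothetical, and the audio I/O is stubbed out with strings; a real version would capture from the mic and write to the speaker through the AIY audio APIs:

```python
import random

SARCASTIC_REPLIES = [
    "Wow, what a fascinating request.",
    "Sure, I'll get right on that. Eventually.",
    "Did you rehearse that, or was it improvised?",
]

def record_command():
    # Step 1: record the user's command.
    # Placeholder: a real version would capture audio from the microphone.
    return "turn on the light"

def play_back(audio):
    # Step 2: play the recording right back so users hear themselves
    # and stumble. Placeholder: a real version would write to the speaker.
    print(f"(echo) {audio}")

def sarcastic_answer(command):
    # Step 3: even a successfully delivered command gets a sarcastic reply.
    return random.choice(SARCASTIC_REPLIES)

command = record_command()
play_back(command)
print(sarcastic_answer(command))
```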

I started with the example code in the AIY source. To make a long story short, I failed. This is what I figured out after six hours of programming:
1. The Assistant APIs don't let you control which answers the AI gives.
- Other people have handled this by building their own chatbots, but I wanted to 'hack' the chatbot while keeping its structure. (That's why I needed to use either gRPC or the Google Assistant Library.)
2. Recording and playing back at the same time requires multithreading. Looking at assistant_library_with_button_demo.py, multithreading looked possible, but I could not figure out how to do it.
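For what it's worth, the record-while-playing-back part can be modeled with Python's standard threading and queue modules: one thread keeps 'recording' chunks while a second thread plays each chunk back the moment it arrives. The microphone and speaker are faked with strings and a list here; a real version would push audio buffers through the same queue:

```python
import queue
import threading

chunks = queue.Queue()
played = []

def recorder():
    # Simulated mic: on the real device this loop would read
    # fixed-size buffers from the microphone stream.
    for chunk in ["so", "can", "you", "turn", "off"]:
        chunks.put(chunk)
    chunks.put(None)  # sentinel: recording finished

def player():
    # Plays each chunk back as soon as it is recorded,
    # so users hear themselves while they are still talking.
    while True:
        chunk = chunks.get()
        if chunk is None:
            break
        played.append(chunk)  # stand-in for writing to the speaker

t1 = threading.Thread(target=recorder)
t2 = threading.Thread(target=player)
t1.start(); t2.start()
t1.join(); t2.join()

print(played)
```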

So what I did instead was make it keep repeating the user's words back to annoy them, and randomly refuse to 'assist'.
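That fallback behavior, parroting the user and randomly refusing to help, is simple to sketch. The function name and the 50% refusal rate are my own illustration, not what actually shipped:

```python
import random

def annoying_assistant(command, rng=random):
    # Always parrot the command back first -- twice, for maximum annoyance.
    echo = f"You said: '{command}'. You said: '{command}'."
    # Then randomly refuse to 'assist' about half the time.
    if rng.random() < 0.5:
        return echo + " No. I don't feel like it."
    return echo + " Fine, I'll do it. This once."

print(annoying_assistant("play some music", random.Random(0)))
```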

As a result, it sounds more like ELIZA than the contemporary AIs we have.
It was supposed to turn off and show a YouTube video when the user says 'awkward', but the processor was too slow and showed the link a bit late, which made me embarrassed.