Amy and Andrew Ingram — Your AI assistants from X.AI: An Interview with Diane Kim, Interaction Designer

Part of the Bot Master Builders Series

Arun Rao
Chatbots Magazine

--

The seasoned team at X.AI has spent years doing one thing really well: building one of the best AI experiences. Their assistant agents, Amy and Andrew Ingram, help you schedule meetings so you don’t get stuck in email ping-pong. The original agents were designed by the X.AI founders and the first interaction designer, Anna Kelsey. Diane Kim took over after graduating from Penn, where she studied cognitive science with a focus in computation. Conversation design was her first job right out of college, and she is the youngest designer I’ve profiled.

How it works: Amy and Andrew are “persons” who exist via their email addresses. When you want to set up a meeting with someone, you copy (CC) Amy or Andrew to do the scheduling. If you want an in-person meeting, Amy or Andrew will suggest places and times to meet. If you’ve used them before, they will learn your favorite meeting places (Philz Coffee or the local bar). Once a meeting is set, it is added to whatever calendar you use and the invitation is sent to the other party. Compared to its two major competitors, X.AI gets great reviews!

X.AI’s co-founder Dennis Mortensen has said there will be an increasing number of strong players doing single AI-based services very well. This is vertical AI, and it may take years to do mundane but really important tasks with skill, like scheduling meetings, helping people with health care, or organizing their finances. By 2016, X.AI had raised more than $32mm at a reported $100mm valuation. It’s been reported that the actual humans who work at X.AI have ‘human’ in their email addresses, in contrast with the AI agents amy@x.ai and andrew@x.ai.

The paid service starts at $17 per month. Amy and Andrew are great at scheduling, and they let you read their transcripts to oversee them. The UI on the website is very clean, and setup is easy. Users report feeling like they have explicit control over what the agents do, with comments like: “They strike a great balance between automagical and transparent.” The on-boarding is also fantastic: they advise that you need to schedule 12 meetings before you really grok how to use the service. This is fair, and I think it’s great that they set expectations. One downside is that you may have to wait up to 10 minutes for Amy to respond; the delay allows for quality control and some checks along the way. Finally, the training docs on how to use Amy and Andrew are really helpful: the cheat sheet, the Getting Started doc, and the Playbook.

What was the original design vision for your bot? Does it have one or two clear functions, or many?
Anna Kelsey was the original AI interaction designer. She made the early decisions on who Amy and Andrew would be. The main decisions were personality and end goals: what they do and do not do.

Anna and the X.AI executives made a clear decision that the agents would only schedule meetings, nothing more and nothing less. Amy and Andrew won’t book flights or hotels; while that is a natural next step, the product has to do one thing super well first. 3.5 years later, we are still working on this same problem.

As soon as a user tries to talk about something else, Amy brings them back to the goal. This is different from Siri. Amy will not start open-ended conversations (“What do you want me to do?”). We are trying to lead people down the right path, with the end goal of setting up a meeting. Both agents try to “remove email ping-pong from scheduling meetings.”

We’ve thought about the core traits for their personality: things you’d want in a scheduler. Amy must be professional, friendly, very clear, and positive but not too cheery, since she is representing people in their meetings.

Team: how did your team come together, and what are the roles?
I do everything on the copy side: the flows and the conversation. I think through all the scenarios around scheduling meetings. Our engineers handle the logic; I do the writing and the code that connects my scripts to the logic on the back end.

How do you measure success; what are your metrics?
We are always thinking about this; there are no standardized tools or ways to track chatbot success, so everything we track is very specific to us. Our goal is to schedule a meeting as quickly and efficiently as possible. Did it get scheduled? How quickly? How many emails were sent back and forth? On the qualitative side, we track how users feel: we programmatically tag gratitude statements (“thanks,” “well done,” etc.) and corrections (“this is the wrong time zone,” “you suck”). We also send surveys to customers; Amy or Andrew can ask, “How did I do scheduling this meeting?” The whole team gets the results, we discuss them, and we implement changes quickly.
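The programmatic tagging she describes can be sketched with simple keyword matching. This is a minimal illustration, not X.AI’s actual system; the phrase lists and tag names are assumptions:

```python
# Illustrative sketch of tagging user emails as gratitude or correction.
# Phrase lists and tag names are hypothetical, not X.AI's real taxonomy.
GRATITUDE = ("thanks", "thank you", "well done", "great job")
CORRECTION = ("wrong time zone", "wrong time", "you suck", "that doesn't work")

def tag_feedback(email_text: str) -> list:
    """Return the qualitative tags that apply to one email."""
    text = email_text.lower()
    tags = []
    if any(phrase in text for phrase in GRATITUDE):
        tags.append("gratitude")
    if any(phrase in text for phrase in CORRECTION):
        tags.append("correction")
    return tags
```

A production system would likely use a trained classifier rather than keywords, but the output is the same kind of per-email tag the team aggregates and discusses.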

What are successful interactions? What are failed interactions?
Success is when a meeting gets scheduled quickly with few emails back and forth. Every email has a purpose and expected response (e.g., to get location details like a street address). We see how closely aligned the responses are to what the agent prompted.

Failed interactions happen when the agent cannot understand what someone is talking about, or when the agent predicts the wrong thing. It’s also a failure if lots of emails go back and forth among many participants.

There can be things related to a meeting that Andrew hasn’t been trained on, for example, unusual ways of talking about one’s availability. If a human says something like the following, it’s hard to parse:

“I can only come in person if it’s not raining that day.” Andrew doesn’t have the logic to deal with that.

We have AI trainers who confirm that Amy’s prediction on an email is accurate, which can lead to the 10-minute latency in her responses. If the confidence level is high, her response is sent automatically. For emails with a lower confidence level, the trainers look at the stripped text and make a decision.
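The routing she describes is a standard human-in-the-loop pattern: auto-send above a confidence threshold, queue for review below it. A minimal sketch, where the threshold value and names are assumptions rather than X.AI’s real parameters:

```python
# Sketch of confidence-based routing for an agent's predicted reply.
# The 0.9 threshold and route names are illustrative assumptions.
CONFIDENCE_THRESHOLD = 0.9

def route_prediction(confidence: float) -> str:
    """Decide whether a predicted reply goes out automatically
    or is queued for an AI trainer to review."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return "auto_send"      # Amy replies with no human in the loop
    return "human_review"       # a trainer checks the stripped email text
```

The trade-off is exactly the one mentioned in the interview: the review queue protects quality but introduces latency on low-confidence emails.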

Editorial and scripting: what have you learned from your flows so far? Is all your content evergreen, or do you also do limited-time or seasonal content?

It’s challenging because if a chatbot is open-ended, people can come back with anything! Other chatbots offer only limited options, but X.AI works over email, so there is a much larger set of possibilities: more branches and scenarios. We have to account for all of those, and we have to understand and elegantly recover from any strange conversation.

We have internal tools to keep track of all the branches and subsets, and our own system to create scenarios and write the text for them, with contextual variables.
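The “scenarios with contextual variables” approach can be pictured as templated reply text filled in per meeting. X.AI’s internal tooling isn’t public, so this is a hypothetical sketch using plain `str.format` placeholders; the scenario name and variables are invented for illustration:

```python
# Hypothetical sketch of scenario text with contextual variables.
# Scenario names, template wording, and variables are illustrative only.
TEMPLATES = {
    "propose_time": (
        "Hi {guest_name}, {host_name} is available {slot}. "
        "Does that work for you?"
    ),
}

def render(scenario: str, **context: str) -> str:
    """Fill a scenario's template with the current meeting's context."""
    return TEMPLATES[scenario].format(**context)
```

Usage: `render("propose_time", guest_name="Sam", host_name="Arun", slot="Tuesday at 2pm")` yields a complete, context-specific email body, so the writer maintains one template per branch instead of one email per meeting.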

What are the most common things users ask outside of the main function — do they ask for jokes or other advice?

Testing and trolling: people are always fascinated by how much an AI can understand. Users pretend the bot is an actual human, say ridiculous things (sometimes negative), or ask random questions like “What color is the sky?” They want to test and try out the technology. Our chatbot’s domain is only meeting scheduling, but people want to stretch it.

User acquisition strategy — how do people hear about your bot and start using it?
We are lucky and unique: anyone using the bot sets up meetings with others, who are thereby exposed to the bot. So we are blessed with some virality there. We’ve also gotten a lot of signups from the agents’ email signatures.

Still, many people who interact with Amy or Andrew don’t realize at all that it’s a chatbot. Early on, our users were startup founders and people setting up one-on-one meetings. We now have a new business edition for teams.

Re-triggering and re-use strategy — as many bot developers know, this is tough. How do you get initial users to engage again?

If someone signs up and doesn’t use Amy, we will send some emails based on metrics, and Christian on the Customer Success team will follow up. We don’t want Andrew or Amy to be spammy or marketing-heavy; we preserve their personas for scheduling only.

Monetization? Many bots are a great free service — how have you tested monetizing it?
We had a long closed beta period for the first 2.5 years; that was free. We only started charging last September/October [2016], when we released the first paid version.

We had to spend time getting a large enough data set to feel comfortable, nail down the core scheduling features, and feel confident that Amy and Andrew could schedule well. From there, the agents could collect even more data.

Every single person at X.AI uses Amy and Andrew on a daily basis. We are constantly dog-fooding and QA testing the project.

What can you tell us about your tech stack? Do you do NLP in-house, what external services do you like?

Much of our NLP is done in-house. Our data science is done in Python, with Scala for the back end and JavaScript for the front end.

Thoughts on the different platforms for chatbots or AI agents? FB Messenger vs Amazon and Google or Kik? Voice platforms?

We are purely email. This is an intentional choice, as meeting scheduling is done mostly over email. We are thinking about Slack, texting, and FB Messenger, but email is still the #1 channel for professional meetings; social meetings may happen on other platforms.

What other bots have you looked to for inspiration — what other chatbots made you say “WOW”? Are there other use cases you’ve thought were simply brilliant?
Amy is not a chatbot; I see her more as an assistant. I haven’t found any chatbots on FB Messenger that have blown me away. Many FB Messenger chatbots seem like a different way of using an app, with too many buttons. For personality design, there is a weather app called Sunshine. It’s a social weather app that learns your weather preferences and how you felt, and it gives you customized suggestions, including advice on health and well-being. It feels like I’ve gotten to know Sunshine.

One thing I’ve thought about recently is an AI assistant that goes through my mailbox to find the best deals and unsubscribe from the spammy emails. This would be similar to Unroll.me, a bot that went through my inbox to unsubscribe from mailing lists.

Other smart people in the bot world you’ve met, whether on the tech stack, UX, scripting, or even financing sides?

I liked meeting one of the founders of Wade & Wendy, an AI personal career guide that helps you through the recruiting process. Ari Z. was working on personality design for Emoji Salad, an emoji Pictionary game over text; he came up with some insights on personality design. Also, my former colleague Anna Kelsey is working in DC on digital campaign work, I think.

Any lessons to share with other bot-builders on useful tools?

I wish I had more tools. From the research I’ve done, some exist for voice assistants or FB Messenger bots, but there are none for an email interface.

Amy and Andrew Ingram: where did the names come from? They’re pretty cryptic.

Their initials are AI, and “Ingram” comes from “n-gram”; it’s a nod to NLP, a nerdy play on words. [Note: An n-gram is a contiguous sequence of n items from a given sequence of text or speech. The items can be phonemes, syllables, letters, words, or base pairs, depending on the application.]
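The n-gram idea in the note is easy to see in code. A short illustrative sketch (the sentence used is just an example):

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token sequence as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Word-level bigrams of an example sentence:
words = "amy please schedule a meeting".split()
bigrams = ngrams(words, 2)
# bigrams[0] is ('amy', 'please'), bigrams[1] is ('please', 'schedule'), ...
```

Here the items are words, but the same function works on any sequence, e.g. `ngrams("ingram", 3)` gives the character trigrams of the name itself.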

Read the prior articles in this series:

Rose the Loebner Chatbot Winner (Bruce Wilcox)

Poncho the WeatherCat bot (Greg Leuch)

Howdy and Botkit (Eric Soelzer)

Statsbot for Business Metrics (Artyom Keydunov)

Earplay: What Chatbots can Learn from Interactive Voice Games (Jon Myers)

--