Everyone who knows Tony Stark knows about Jarvis, his artificial intelligence (AI) assistant. The AI not only helps Tony with his daily chores but also helps his alter ego, Iron Man, fight bad guys and save the universe. And if you are a true fan, you know how much Tony depends on Jarvis.
While Tony Stark, Iron Man and Jarvis are fictional characters, Mark Zuckerberg has built a real-life AI bot! What's more, he too has named his bot ‘Jarvis’.
According to Mark, it was his personal challenge to build a simple AI bot to run his home. Inspired by Iron Man’s Jarvis, he started to make an AI assistant for himself.
In a note published on his Facebook page, Mark explains the journey of building his AI bot:
My goal was to learn about the state of artificial intelligence — where we’re further along than people realise and where we’re still a long way off.
So far this year, I’ve built a simple AI that I can talk to on my phone and computer, that can control my home, including lights, temperature, appliances, music and security, that learns my tastes and patterns, that can learn new words and concepts, and that can even entertain Max. It uses several artificial intelligence techniques, including natural language processing, speech recognition, face recognition, and reinforcement learning, written in Python, PHP and Objective-C.
The first step was to connect all the home appliances to Jarvis. This was difficult because not all appliances are connected to the Internet, and even those that are run software written in different languages and follow different protocols.
Before I could build any AI, I first needed to write code to connect these systems, which all speak different languages and protocols. We use a Crestron system with our lights, thermostat and doors, a Sonos system with Spotify for music, a Samsung TV, a Nest cam for Max, and of course my work is connected to Facebook’s systems. I had to reverse engineer APIs for some of these to even get to the point where I could issue a command from my computer to turn the lights on or get a song to play.
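To picture what connecting systems that "speak different languages and protocols" can look like, here is a minimal Python sketch of an adapter layer. It is not Mark's code: the class names, the message formats and the `Home` registry are all hypothetical, chosen only to show how one common interface can hide vendor-specific details.

```python
from abc import ABC, abstractmethod


class Device(ABC):
    """Common interface the assistant talks to, whatever the vendor protocol."""

    @abstractmethod
    def execute(self, action: str, **params) -> str:
        """Translate a generic action into a vendor-specific message."""


class LightsAdapter(Device):
    # Hypothetical stand-in for a lighting controller; the wire format is made up.
    def execute(self, action, **params):
        room = params.get("room", "all")
        return f"LIGHTS {room.upper()} {action.upper()}"


class MusicAdapter(Device):
    # Hypothetical stand-in for a music system that expects a different format.
    def execute(self, action, **params):
        return f"music://{action}?track={params.get('track', '')}"


class Home:
    """Registry that routes a command to the adapter registered under a name."""

    def __init__(self):
        self._devices = {}

    def register(self, name, device):
        self._devices[name] = device

    def command(self, name, action, **params):
        if name not in self._devices:
            raise KeyError(f"unknown device: {name}")
        return self._devices[name].execute(action, **params)


home = Home()
home.register("lights", LightsAdapter())
home.register("music", MusicAdapter())

lights_message = home.command("lights", "on", room="kitchen")
music_message = home.command("music", "play", track="Imagine")
print(lights_message)
print(music_message)
```

The point of the design is that the AI layer only ever calls `home.command(...)`; supporting a new appliance means writing one more adapter rather than changing the assistant itself.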
Next, Mark needed a way to interact with the AI. He built a system that executes commands and can tell his speech apart from his wife's. It learns gradually, using context to handle open-ended requests, and Mark notes that his family makes such requests more often than specific ones.
The more context an AI has, the better it can handle open-ended requests. At this point, I mostly just ask Jarvis to “play me some music” and by looking at my past listening patterns, it mostly nails something I’d want to hear. If it gets the mood wrong, I can just tell it, for example, “that’s not light, play something light”, and it can both learn the classification for that song and adjust immediately. It also knows whether I’m talking to it or Priscilla is, so it can make recommendations based on what we each listen to. In general, I’ve found we use these more open-ended requests more frequently than more specific asks. No commercial products I know of do this today, and this seems like a big opportunity.
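The feedback loop Mark describes (pick from past listening, accept a correction, relabel the song and try again) can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the `MusicPicker` class, the track titles and the mood labels are invented, and plain play counts stand in for whatever learning technique the real Jarvis uses.

```python
from collections import defaultdict


class MusicPicker:
    def __init__(self, library):
        # library maps a song title to its mood labels (illustrative data).
        self.moods = {song: set(tags) for song, tags in library.items()}
        # listener -> song -> number of plays, kept separately per person.
        self.play_counts = defaultdict(lambda: defaultdict(int))
        self.last_played = None

    def record_play(self, listener, song):
        self.play_counts[listener][song] += 1

    def pick(self, listener, mood=None):
        """Return the listener's most-played song, optionally filtered by mood."""
        candidates = [s for s, tags in self.moods.items()
                      if mood is None or mood in tags]
        if not candidates:
            return None
        counts = self.play_counts[listener]
        # Sorting first makes ties resolve alphabetically, so results are stable.
        self.last_played = max(sorted(candidates), key=lambda s: counts[s])
        return self.last_played

    def correct(self, listener, wrong_mood):
        """Handle "that's not <mood>, play something <mood>"."""
        if self.last_played is not None:
            # Learn the classification: the last song does not have this mood.
            self.moods[self.last_played].discard(wrong_mood)
        # Adjust immediately by picking again with the updated labels.
        return self.pick(listener, mood=wrong_mood)


picker = MusicPicker({
    "Track A": {"light"},   # mislabelled on purpose
    "Track B": {"light"},
    "Track C": {"upbeat"},
})
for _ in range(3):
    picker.record_play("mark", "Track A")
picker.record_play("mark", "Track B")
picker.record_play("priscilla", "Track C")

first = picker.pick("mark", mood="light")
second = picker.correct("mark", "light")
for_priscilla = picker.pick("priscilla")
print(first, second, for_priscilla)
```

Here the first pick for Mark is the mislabelled "Track A"; after the correction the label is dropped and "Track B" is played instead, while Priscilla's request is answered from her own history.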
The next challenge for Mark was to build a vision and face recognition system so that his AI could identify and tell people apart. He installed multiple cameras at his door so that when someone arrives, the AI can identify them and respond accordingly.
I installed a few cameras at my door that can capture images from all angles. AI systems today cannot identify people from the back of their heads, so having a few angles ensures we see the person’s face. I built a simple server that continuously watches the cameras and runs a two-step process: first, it runs face detection to see if any person has come into view, and second, if it finds a face, then it runs face recognition to identify who the person is. Once it identifies the person, it checks a list to confirm I’m expecting that person, and if I am then it will let them in and tell me they’re here.
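The two-step process in that description, detection first and recognition only when a face is found, followed by a guest-list check, might be sketched as below. The detector and recogniser are passed in as plain functions because the note does not say which models are used; the stubs, the frame format and the "notify" outcome for unexpected or unknown visitors are assumptions made for this example.

```python
def process_frame(frame, detect_faces, recognize, expected_guests):
    """Run detection, then recognition, then the guest-list check for one frame.

    detect_faces(frame) returns the faces found in the frame (possibly none).
    recognize(face) returns a name, or None when the face is not known.
    """
    decisions = []
    for face in detect_faces(frame):      # step 1: is anyone in view?
        name = recognize(face)            # step 2: who is it?
        if name is None:
            decisions.append(("unknown", "notify"))
        elif name in expected_guests:
            decisions.append((name, "admit"))   # expected: let them in
        else:
            decisions.append((name, "notify"))  # assumption: only alert the owner
    return decisions


# Stub models for illustration: a frame simply lists the face IDs it contains.
known_faces = {"face-1": "Alice", "face-2": "Bob"}
detect = lambda frame: frame["faces"]
recognize = known_faces.get

expected = {"Alice"}
busy_door = process_frame({"faces": ["face-1", "face-2", "face-9"]},
                          detect, recognize, expected)
empty_door = process_frame({"faces": []}, detect, recognize, expected)
print(busy_door)
print(empty_door)
```

Running detection before recognition means the recognition step is skipped entirely for frames with nobody in view, which matters for a server that watches cameras continuously.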
What Mark needed next was a way to communicate with his bot, not just from his computer but from anywhere. So he built a Messenger bot to talk to ‘Jarvis’, which lets him reach his AI from anywhere on the planet through his phone.
I can text anything to my Jarvis bot, and it will instantly be relayed to my Jarvis server and processed. I can also send audio clips and the server can translate them into text and then execute those commands. In the middle of the day, if someone arrives at my home, Jarvis can text me an image and tell me who’s there, or it can text me when I need to go do something.
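The relay Mark describes, in which text goes straight to the server while audio is first turned into text and then executed, can be outlined as follows. This is a hedged sketch, not the actual Jarvis server: `CommandServer`, the `Message` type, the fake transcriber and the clip name are all hypothetical, and real speech recognition is replaced by a lookup table.

```python
from dataclasses import dataclass


@dataclass
class Message:
    kind: str      # "text" or "audio"
    payload: str   # the text itself, or a reference to an audio clip


class CommandServer:
    def __init__(self, transcribe):
        self._transcribe = transcribe   # function: audio reference -> text
        self._handlers = {}

    def on(self, phrase, handler):
        """Register a handler for an exact phrase (matched case-insensitively)."""
        self._handlers[phrase.lower()] = handler

    def receive(self, message):
        # Audio is converted to text first; after that both kinds share one path.
        if message.kind == "audio":
            text = self._transcribe(message.payload)
        elif message.kind == "text":
            text = message.payload
        else:
            raise ValueError(f"unsupported message kind: {message.kind}")
        handler = self._handlers.get(text.strip().lower())
        if handler is None:
            return f"Sorry, I don't know how to '{text}'."
        return handler()


def fake_transcribe(clip):
    # Stand-in for speech recognition: maps a clip name to its spoken words.
    return {"clip-001": "Turn on the lights"}.get(clip, "")


server = CommandServer(fake_transcribe)
server.on("turn on the lights", lambda: "Lights are on.")

text_reply = server.receive(Message("text", "turn on the lights"))
audio_reply = server.receive(Message("audio", "clip-001"))
unknown_reply = server.receive(Message("text", "make coffee"))
print(text_reply, audio_reply, unknown_reply)
```

Exact phrase matching keeps the example short; a real assistant would need natural language processing to map many phrasings onto the same command.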
The final step was to enable Jarvis to recognise its master's voice. So Mark built a dedicated Jarvis app that could listen to him and, more importantly, identify him. And for Jarvis's own voice, he chose none other than God himself! That's right, Morgan Freeman has lent his voice to Jarvis.
On a psychologic level, once you can speak to a system, you attribute more emotional depth to it than a computer you might interact with using text or a graphic interface. One interesting observation is that ever since I built voice into Jarvis, I’ve also wanted to build in more humor. Part of this is that now it can interact with Max and I want those interactions to be entertaining for her, but part of it is that it now feels like it’s present with us. I’ve taught it fun little games like Priscilla or I can ask it who we should tickle and it will randomly tell our family to all go tickle one of us, Max or Beast. I’ve also had fun adding classic lines like “I’m sorry, Priscilla. I’m afraid I can’t do that.”
Mark described the experiment as a great challenge. According to him, AI will be able to do powerful things sooner than most people expect, but we are still figuring out what real intelligence actually is.
In a way, AI is both closer and farther off than we imagine. AI is closer to being able to do more powerful things than most people expect — driving cars, curing diseases, discovering planets, understanding media. Those will each have a great impact on the world, but we’re still figuring out what real intelligence is.
We have all been fascinated by Iron Man's Jarvis in the movies, and Mark's project is a positive step towards the future of AI. Let's hope that we soon get an AI butler who will keep our superhero suit hidden while helping us with other daily chores.
Mark's original post and his video of Jarvis are both available on his Facebook page.