Date: 2023-10-27

Spot, a robot dog tour guide created by Boston Dynamics, can now ‘speak,’ as the Hyundai-owned US robotics design firm used OpenAI's ChatGPT to ‘train’ the robot to answer questions and generate responses about the company's facilities.

Boston Dynamics' ‘Spot’ robot (Image courtesy: Boston Dynamics)

How does Spot ‘speak’?

As per The Verge, to make Spot ‘talk,’ Boston Dynamics used ChatGPT along with some open-source large language models (LLMs) to carefully train the robot's responses. The engineers then outfitted it with a speaker, and added text-to-speech capabilities.

The team gave Spot a ‘very brief’ script. By combining the script with the imagery from the cameras on its gripper and body, the robot gathers more information about what is in front of it and generates a response accordingly.

As per Boston Dynamics, Spot essentially captures images and answers questions about them, doing so using Visual Question Answering models.
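
For readers who want a rough picture of how such a pipeline fits together, here is a minimal Python sketch of the flow described above: a short script, a description of what the camera sees, a language model reply, and text-to-speech. It is not Boston Dynamics' code. Every name in it (`TourStop`, `build_prompt`, `answer_question`, and the three callables passed in) is hypothetical, and the model and speech steps are plain function arguments so that any real service could be swapped in.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TourStop:
    """One stop on the tour plus its short scripted note (hypothetical structure)."""
    name: str
    script: str


def build_prompt(stop: TourStop, scene_description: str, question: str) -> str:
    """Combine the brief script, the camera description, and the visitor's question."""
    return (
        f"You are a robot tour guide at: {stop.name}.\n"
        f"Script: {stop.script}\n"
        f"Camera view: {scene_description}\n"
        f"Visitor asks: {question}\n"
        "Answer briefly."
    )


def answer_question(
    stop: TourStop,
    image: bytes,
    question: str,
    describe_image: Callable[[bytes, str], str],  # assumed stand-in for a VQA model
    generate_text: Callable[[str], str],          # assumed stand-in for an LLM call
    speak: Callable[[str], None],                 # assumed stand-in for text-to-speech
) -> str:
    """Run one question through the describe, prompt, reply, speak steps."""
    scene = describe_image(image, "What is in front of the robot?")
    prompt = build_prompt(stop, scene, question)
    reply = generate_text(prompt)
    speak(reply)
    return reply
```

In use, a caller would pass functions that wrap whichever visual question answering model, language model, and speech engine are available; for a dry run, `speak` can simply be `print`.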

A demonstration…

In reality, Spot only ‘mimics’ the act of speaking. As seen in a demonstration uploaded by Boston Dynamics on YouTube, the ‘tour dog’ opens its ‘mouth’ to answer questions, but the responses are generated by the text-to-speech system and played through the speaker.

“We are excited to continue exploring the intersection of artificial intelligence and robotics. The LLMs can help provide cultural context, general commonsense knowledge, and flexibility that could be useful for many robotics tasks; for example, being able to assign a task to a robot just by talking to it would help reduce the learning curve for using these systems,” said Matt Klingensmith, principal software engineer, Boston Dynamics.

