Skip to main content

LLM-controlled robot via tool calls

Time: 11:25 AM to 12:15 PM
This is the moment everything comes together. Your robot now understands speech, thinks with a large language model, speaks back to you, and executes physical actions based on the conversation.

Run the AI Voice Assistant Car

sudo python3 21.voice_active_car_gpt.py

How to interact

Say the wake word “Hey buddy” to activate the robot. It will greet you, listen to your request, send it to GPT, speak the response, and execute physical actions.

Available actions

The robot can perform these physical actions through tool calls:
ActionWhat it does
Shake headCamera pans side to side
NodCamera tilts up and down
CelebrateA happy movement sequence
ForwardDrive forward
BackwardDrive backward
Wave handsCamera does a wave pattern
Act cuteA playful movement
DepressedA slow, sad movement

Have a conversation

Try these interactions:
  • Ask the robot to move forward
  • Ask it a trivia question
  • Give it a multi-step task
  • See how it responds to unexpected requests

Inspect the code

Read through the script with the facilitator and find:
  • Where is the system prompt defined?
  • Where are the tool definitions listed?
  • Where does the model’s response get parsed into physical actions?

Make it yours

1

Change the wake word

Replace “Hey buddy” with a custom wake word of your choosing.
2

Update the persona

Modify the system prompt so the robot has a personality you design.

Day 3 wrap-up

Recap: The full loop is now complete. Voice in, speech-to-text, LLM processing, text-to-speech, voice out, and physical actions via tool calls. The robot now has language intelligence, but it is still blind. It can hear and speak but cannot truly see. Preview for tomorrow: The robot’s camera feed goes directly into a vision language model. It will finally be able to see and understand its surroundings.