Skip to main content

Connect to OpenAI and build a voice chatbot

Time: 10:35 AM to 11:25 AM
In this section, you will connect your robot to a real large language model and have a spoken conversation with it. The full loop: you speak, the robot transcribes, the LLM responds, and the robot speaks the answer.

Set up the API key

The facilitator will provide an OpenAI API key. Store it on your robot:
# ~/picar-x/example/secret.py
OPENAI_KEY = 'sk-...'

Test the API connection

Run the LLM connection test to confirm everything works:
sudo python3 test_openai.py
This sends a text prompt to GPT-4o-mini and prints the response in your terminal.

Set up voice input

Test the speech-to-text system using Vosk:
sudo python3 test_stt_vosk.py
Speak a sentence and confirm it transcribes correctly on screen.
The Vosk language model downloads on first run (about 40 MB). This may take a few minutes if it has not been downloaded yet.

Run the voice chatbot

Run the local voice chatbot example, substituting Ollama for the OpenAI API connection:
sudo python3 voice_chatbot.py
The full interaction loop:
  1. You speak a question
  2. Vosk transcribes your speech to text
  3. The text is sent to GPT
  4. GPT generates a response
  5. Piper converts the response to speech
  6. The robot speaks the answer aloud

Customize the personality

Modify the system prompt to give the robot a specific role. For example:
You are a tour guide who only talks about the solar system.
Try different personalities and see how the robot’s responses change.
The system prompt is the single most important control you have over the model’s behavior. Experiment with it freely.