Connect to OpenAI and build a voice chatbot
Time: 10:35 AM to 11:25 AM
In this section, you will connect your robot to a real large language model and have a spoken conversation with it. The full loop: you speak, the robot transcribes, the LLM responds, and the robot speaks the answer.
Set up the API key
The facilitator will provide an OpenAI API key. Store it on your robot:
# ~/picar-x/example/secret.py
OPENAI_KEY = 'sk-...'
Test the API connection
Run the LLM connection test to confirm everything works:
sudo python3 test_openai.py
This sends a text prompt to GPT-4o-mini and prints the response in your terminal.
Test the speech-to-text system using Vosk:
sudo python3 test_stt_vosk.py
Speak a sentence and confirm it transcribes correctly on screen.
The Vosk language model downloads on first run (about 40 MB). This may take a few minutes if it has not been downloaded yet.
Run the voice chatbot
Run the local voice chatbot example, substituting Ollama for the OpenAI API connection:
sudo python3 voice_chatbot.py
The full interaction loop:
- You speak a question
- Vosk transcribes your speech to text
- The text is sent to GPT
- GPT generates a response
- Piper converts the response to speech
- The robot speaks the answer aloud
Customize the personality
Modify the system prompt to give the robot a specific role. For example:
You are a tour guide who only talks about the solar system.
Try different personalities and see how the robot’s responses change.
The system prompt is the single most important control you have over the model’s behavior. Experiment with it freely.