Data privacy

Data privacy: what you shared and how to protect yourself

Time: 10:10 AM to 10:40 AM

Every API call you made this week sent data to a third-party server. Your voice, your questions, and images from the camera all left your Raspberry Pi and traveled to OpenAI’s data centers. This section covers what that means and how to protect yourself.

What you shared this week

Here is a concrete breakdown of every piece of data that left your robot:

Day	What was sent	Where it went
Day 3	Text prompts and responses	OpenAI servers
Day 3	Transcribed voice commands	OpenAI servers
Day 4	Camera images (base64 encoded)	OpenAI servers
Day 4	Images + text questions combined	OpenAI servers

By default, when you use the OpenAI API:

Your prompts, completions, and uploaded images may be logged
The account owner can opt out of training data usage, but this is not the default for all plan types
Even “private” API usage still sends data over the internet to OpenAI’s servers
You do not control how long data is retained

What was NOT shared

Not everything left the Pi. Here is what stayed local:

Component	Where it ran	Data stayed local?
Vosk speech-to-text	On the Pi	Yes — audio never left the robot
espeak text-to-speech	On the Pi	Yes — audio generated locally
Motor control	On the Pi	Yes — commands never left the robot
Camera capture	On the Pi	Images only left when sent to the API

Vosk is a privacy win — your raw voice audio never leaves the robot. Only the transcribed text gets sent to OpenAI.

Practical protections

Check data usage settings

API providers have settings to control whether your data is used for model training. Check your provider’s dashboard and disable training on your data where possible.

Never send personal information

Do not include names, addresses, medical data, passwords, or any personally identifying information in prompts. Treat every API call as a postcard — assume someone can read it.

Use local models for sensitive data

For private or sensitive use cases, run models locally so nothing ever leaves your machine. Tools like Ollama let you run LLMs on your own hardware.

Minimize image data

Use detail: "low" for vision API calls (as we did). This sends less image data. Consider whether you need to send an image at all — sometimes a text description is enough.

Review API terms of service

Read the actual terms. Understand what the provider can do with your data, how long they retain it, and where it is processed.

Local vs cloud: the tradeoff

	Cloud API (OpenAI)	Local model (Ollama)
Privacy	Data leaves your device	Data never leaves your device
Quality	Best available models	Smaller, less capable models
Speed	Fast (powerful servers)	Slow on Pi Zero (limited hardware)
Cost	Pay per token	Free after download
Internet	Required	Not required
Control	Provider controls the model	You control everything

For this camp, cloud APIs make sense — the Pi Zero 2W does not have enough power to run useful LLMs locally. But for real-world applications with sensitive data, local models are often required.

Enterprise and government context

Organizations handling sensitive data face strict requirements:

HIPAA (healthcare) — patient data cannot be sent to unauthorized third parties
FERPA (education) — student records have strict sharing rules
ITAR/EAR (defense) — export-controlled technical data cannot leave US jurisdiction
GDPR (EU) — personal data of EU citizens has strict processing and retention rules
FedRAMP (US government) — cloud services must be certified before government use

In these contexts, sending data to a commercial API like OpenAI’s may be prohibited. Organizations use private deployments, on-premise models, or government-certified cloud environments instead.

Metadata is data too

Even if your prompt content is encrypted, the pattern of API usage can reveal information:

When you make calls (timestamps reveal your schedule)
How often (frequency reveals intensity of work)
How much data (volume reveals project scope)
From where (IP addresses reveal location)

Metadata patterns can be as revealing as the data itself. Intelligence agencies have long known this — it applies to AI systems too.

Discussion questions

You sent camera images to OpenAI this week. What if those images contained someone’s face without their consent? Who is responsible?
Should AI companies be required to delete your data after a fixed period?
The Vosk model ran locally — no data left your Pi. What did you give up in exchange for that privacy? (hint: accuracy)

After this section, take a 10-minute break (10:40 AM to 10:50 AM).

Welcome

Class Recordings

Day 1: Setup and Calibration

Day 2: Code & Computer Vision

Day 3: GenAI and Cloud LLMs

Day 4: Vision AI

Day 5: AI Ethics & Final Project

Data privacy: what you shared and how to protect yourself

What you shared this week

What was NOT shared

Practical protections

Local vs cloud: the tradeoff

Enterprise and government context

Metadata is data too

Discussion questions

​Data privacy: what you shared and how to protect yourself

​What you shared this week

​What was NOT shared

​Practical protections

​Local vs cloud: the tradeoff

​Enterprise and government context

​Metadata is data too

​Discussion questions

Data privacy: what you shared and how to protect yourself

What you shared this week

What was NOT shared

Practical protections

Local vs cloud: the tradeoff

Enterprise and government context

Metadata is data too

Discussion questions