pdfpc (PDF Presenter Console) is a cross-platform presentation tool for speakers. It uses a multi-monitor setup to give you a private, real-time dashboard of slide previews, speaker notes, and a timer, while your audience sees only the current slide.
pdfpc-live-translator is a companion tool for the pdfpc presentation software. It captures the presenter's voice in real time, performs speech‑to‑text recognition, displays the spoken text as a subtitle on the screen, and optionally translates the caption live into another language. The translated text appears as an overlay on the audience‑facing pdfpc window, making presentations accessible to multilingual audiences without requiring manual slide notes.
The default translation direction is English to Simplified Chinese, but the source and target languages can be reconfigured.
The tool uses on-device speech recognition and machine translation models, so no internet connection is required during the talk.
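The flow described above (recognize speech, optionally translate, display a caption) can be sketched as a small generator. This is only an illustration and not the tool's actual code: `caption_stream`, `utterances`, and `translate` are hypothetical names, and the recognizer and translator are replaced by placeholders.

```python
from typing import Callable, Iterable, Iterator, Optional


def caption_stream(
    utterances: Iterable[str],
    translate: Optional[Callable[[str], str]] = None,
    both_langs: bool = False,
) -> Iterator[str]:
    """Yield one caption per recognized utterance.

    `utterances` stands in for the speech-to-text output and `translate`
    for the offline translator; both are hypothetical placeholders.
    """
    for text in utterances:
        text = text.strip()
        if not text:
            continue  # skip silence and empty recognition results
        if translate is None:
            yield text  # subtitles only, no translation
        elif both_langs:
            yield f"{text}\n{translate(text)}"  # source line, then target line
        else:
            yield translate(text)


if __name__ == "__main__":
    fake_translate = str.upper  # placeholder for a real translation model
    for caption in caption_stream(["hello everyone", "", "welcome"], fake_translate):
        print(caption)
```

Passing `translate=None` mirrors the subtitle-only mode, and `both_langs=True` mirrors showing the source and target text together.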
Start the translator with default settings:
$> python3 start_talk_translate.py my-slides.pdf
The following components must be installed for live translation to work.
First, install Python and its virtual environment module.
$> sudo apt install python3-full
$> sudo apt install python3-venv
Create a Python virtual environment in which to install the live translation libraries.
$> mkdir -p ~/bin/python3_environments
$> cd ~/bin/python3_environments
$> python3 -m venv live_translator
Install the libraries in the virtual environment:
$> ~/bin/python3_environments/live_translator/bin/pip install tk
$> ~/bin/python3_environments/live_translator/bin/pip install pydub
$> ~/bin/python3_environments/live_translator/bin/pip install screeninfo
$> ~/bin/python3_environments/live_translator/bin/pip install whisper
$> ~/bin/python3_environments/live_translator/bin/pip install faster-whisper
VOSK is used to convert the sound captured from the microphone into text, entirely offline.
$> ~/bin/python3_environments/live_translator/bin/pip install vosk
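As a rough illustration of how VOSK is typically driven from Python, the sketch below feeds raw audio chunks to a `KaldiRecognizer` and extracts the text from its JSON results. It is not taken from the tool itself: the helper names `text_from_vosk` and `recognize`, the 16000 Hz default, and the assumption of 16-bit mono PCM input are illustrative choices.

```python
import json


def text_from_vosk(result_json: str) -> str:
    """Return the recognized text from a VOSK result string.

    Final results carry the text under "text", partial ones under "partial".
    """
    data = json.loads(result_json)
    return (data.get("text") or data.get("partial") or "").strip()


def recognize(chunks, model_path: str, rate: int = 16000, partial: bool = False):
    """Yield recognized text for an iterable of raw 16-bit mono PCM chunks.

    `model_path` must point to a downloaded VOSK model directory.
    """
    # Imported lazily so the helper above works without VOSK installed.
    from vosk import KaldiRecognizer, Model

    recognizer = KaldiRecognizer(Model(model_path), rate)
    for chunk in chunks:
        if recognizer.AcceptWaveform(chunk):
            # VOSK detected the end of an utterance.
            text = text_from_vosk(recognizer.Result())
        elif partial:
            # Text recognized so far, before the utterance is complete.
            text = text_from_vosk(recognizer.PartialResult())
        else:
            continue
        if text:
            yield text
```

With `partial=True` the generator also yields intermediate text, which is one plausible way to obtain captions before the end of a sentence is detected.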
PyAudio is used for reading the microphone.
$> sudo apt install portaudio19-dev
$> ~/bin/python3_environments/live_translator/bin/pip install pyaudio
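To check that PyAudio can see your microphone, a short script along these lines may help. It is a sketch, not part of the tool: `input_devices` and `list_pyaudio_inputs` are hypothetical helper names, and the filter simply keeps devices that report at least one input channel.

```python
def input_devices(device_infos):
    """Return (index, name) pairs for devices that can record audio."""
    return [
        (int(info["index"]), info["name"])
        for info in device_infos
        if info.get("maxInputChannels", 0) > 0
    ]


def list_pyaudio_inputs():
    """Query PyAudio for all devices and keep only the input-capable ones."""
    import pyaudio  # imported lazily: requires portaudio and `pip install pyaudio`

    pa = pyaudio.PyAudio()
    try:
        infos = [pa.get_device_info_by_index(i) for i in range(pa.get_device_count())]
    finally:
        pa.terminate()  # always release the PortAudio resources
    return input_devices(infos)


if __name__ == "__main__":
    for index, name in list_pyaudio_inputs():
        print(f"{index}: {name}")
```

The printed index is the kind of numeric device identifier that PyAudio expects as `input_device_index` when opening a stream.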
ARGOS is used to translate the text offline from a source language to a target language.
$> ~/bin/python3_environments/live_translator/bin/pip install argostranslate
Install the translation package for the language pair you need. The command below installs the English to Chinese package; change the package name to use another pair.
$> ~/bin/python3_environments/live_translator/bin/argospm install translate-en_zh
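Once a language package is installed, it can be tried out from Python. The sketch below is an assumption-laden illustration rather than the tool's code: `argos_package_name` and `translate_offline` are hypothetical helpers, and the `translate-<from>_<to>` naming pattern is inferred from the command above.

```python
def argos_package_name(from_code: str, to_code: str) -> str:
    """Build an argospm package name for a language pair.

    The pattern is inferred from `translate-en_zh`; check the Argos
    package index for the pairs that are actually available.
    """
    return f"translate-{from_code}_{to_code}"


def translate_offline(text: str, from_code: str = "en", to_code: str = "zh") -> str:
    """Translate text with an already installed Argos Translate package."""
    # Imported lazily: requires `pip install argostranslate`.
    import argostranslate.translate

    return argostranslate.translate.translate(text, from_code, to_code)


if __name__ == "__main__":
    print(argos_package_name("en", "zh"))
    print(translate_offline("Welcome to my talk."))
```

Run it with the interpreter of the virtual environment so that the installed `argostranslate` module and language package are found.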
start_talk_translate [options] <pdf-file>
start_talk_translate accepts the following options from the command line:
| Options | Description |
|---|---|
| `-h`, `--help` | show this help message and exit |
| `--input INPUT` | numeric index of the input sound device. See `--inputs` for the complete list |
| `--inputs` | show the list of all the available sound input devices |
| `--langmodel LANGMODEL` | path to the VOSK language model |
| `--notranslate` | disable live translation, same as running pdfpc directly |
| `--partial` | enable partial voice recognition by showing the translation of a text before the end of the sentence is detected |
| `--pythonenv PYTHONENV` | path to the Python virtual environment |
| `--quiet` | disable live output on the console |
| `--screens` | show the list of all the available screens |
| `--screen SCREEN` | numeric index of the screen to be used. See `--screens` for the list |
| `--single`, `-S` | force the use of only one screen |
| `--swap`, `-s` | swap the presentation/presenter screens |
| `--delay DELAY` | delay in seconds between the pdfpc launch and the overlay launch (default: 2s) |
| `--page PAGE`, `-P PAGE` | start the talk at the given page number (default: 1) |
| `--inputbuffersize INPUTBUFFERSIZE` | change the size in bytes of the audio input buffer |
| `--soundrate SOUNDRATE` | sound rate in Hz |
| `--bothlangs` | show the source and target messages |
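For readers who want to wrap or script the launcher, a subset of these options could be modelled with `argparse` roughly as follows. This is a hypothetical reconstruction from the table, not the tool's actual parser: the function name `build_parser`, the argument types, and the destination name `pdf_file` are assumptions, and only the two defaults stated in the table are used.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring part of the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="start_talk_translate")
    parser.add_argument("pdf_file", help="PDF file of the slides")
    parser.add_argument("--input", type=int, help="index of the input sound device")
    parser.add_argument("--inputs", action="store_true", help="list sound input devices")
    parser.add_argument("--notranslate", action="store_true", help="disable live translation")
    parser.add_argument("--partial", action="store_true", help="show partial recognition results")
    parser.add_argument("--bothlangs", action="store_true", help="show source and target messages")
    parser.add_argument("--single", "-S", action="store_true", help="use only one screen")
    parser.add_argument("--swap", "-s", action="store_true", help="swap the two screens")
    # Defaults below are the ones stated in the options table.
    parser.add_argument("--delay", type=float, default=2.0, help="seconds before the overlay starts")
    parser.add_argument("--page", "-P", type=int, default=1, help="first page to show")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--partial", "-P", "3", "my-slides.pdf"])
    print(args.partial, args.page, args.delay, args.pdf_file)
```

The real script may validate or combine options differently; `--help` remains the authoritative reference.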