SELMA open-source

Upload a media item (video or audio) and have it transcribed, translated, or given a voice-over. SELMA open-source demonstrates how various NLP tasks can be performed for free on a single, integrated platform for 30+ languages. It includes open-source models from OpenAI and Facebook, and from other EU projects (including Gourmet). No data is stored!

Tasks done in the example (see picture below):

  • The audio of the uploaded video is transcribed (by Whisper / OpenAI; the language is detected automatically).
  • The transcript is manually corrected (a person's name).
  • The transcript is translated into Turkish using Whisper.
  • A voice-over is added (TTS, Text To Speech, by MMS from Hugging Face).
  • The Turkish translation can be downloaded as an SRT file, and the Turkish voice-over as a WAV file.
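To illustrate the SRT download step, here is a minimal sketch of how timed text segments could be turned into SRT subtitle text. It is not taken from the SELMA code base: the function names `format_timestamp` and `segments_to_srt` are hypothetical, and the input is assumed to be a list of dictionaries with `start` and `end` times in seconds plus a `text` field.

```python
def format_timestamp(seconds: float) -> str:
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments (assumed keys: 'start', 'end', 'text')."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each cue: running number, time range, subtitle text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


# Hypothetical example data, standing in for a translated transcript.
example_segments = [
    {"start": 0.0, "end": 2.5, "text": " Merhaba dünya."},
    {"start": 2.5, "end": 5.0, "text": "Bu bir örnektir."},
]
srt_text = segments_to_srt(example_segments)
```

The resulting `srt_text` string can be written to a file with a `.srt` extension using UTF-8 encoding, so that non-ASCII characters such as those in Turkish are preserved.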

Screenshot of the SELMA open-source NLP platform

The image shows the SELMA open-source platform with a transcribed English video on the left side and a Turkish translation on the right side. At the bottom, a thumbnail of the uploaded video, including the speech sequence, can be seen.

Watch this video for a quick introduction to what you can do with the tool:

SELMA contribution

  • Development of an open-source platform for media production and NLP tasks (UI)
  • Development of a scalable backend
  • Continuous integration of various NLP open-source technologies

