Upload a media item (video or audio) and have it transcribed, translated, or given a voice-over. SELMA open-source demonstrates how various NLP tasks can be done for free on a single, integrated platform for 30+ languages. It includes open-source models from OpenAI and Facebook, as well as from other EU projects (including Gourmet). No data is stored!
Tasks done in the example (see picture below): the audio of the uploaded video is transcribed (by Whisper / OpenAI; the language is detected automatically); the transcript is corrected manually (a person's name); the text is translated into Turkish using Whisper; a voice-over is added (TTS, text-to-speech, by MMS from Hugging Face); the Turkish translation can be downloaded as an SRT file and the Turkish voice-over as a WAV file.
Image shows the SELMA open-source platform with a transcribed English video on the left side and a Turkish translation on the right side. At the bottom, a thumbnail of an uploaded video including the speech sequence can be seen.
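For readers unfamiliar with the SRT download mentioned in the example, here is a minimal sketch of how timed text segments can be written in SRT subtitle format. It is not SELMA's implementation: the function names and the segment layout (dictionaries with `start`, `end` and `text` keys, times in seconds) are assumptions made for illustration only.

```python
def format_timestamp(seconds: float) -> str:
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments (hypothetical layout: start, end, text)."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each SRT cue: running number, time range, then the subtitle text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Illustrative segments only, not output from the platform.
    example = [
        {"start": 0.0, "end": 2.5, "text": " Merhaba dünya. "},
        {"start": 2.5, "end": 5.0, "text": "İkinci satır."},
    ]
    print(segments_to_srt(example))
```

Each cue consists of a running number, a time range and the text, so a translated transcript with timings maps onto the format directly.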
Watch this video for a quick look at what you can do with the tool:
- Development of an open-source platform for media production and NLP tasks (UI)
- Development of a scalable backend
- Continuous integration of various NLP open-source technologies
- SELMA open-source website: https://selma.ailab.lv/#