Publications

    2024

    TARIC-SLU: A Tunisian Benchmark Dataset for Spoken Language Understanding (paper will be available in the LREC proceedings in May) 

    Salima Mdhaffar, Fethi Bougares, Renato De Mori, Salah Zaiem, Mirco Ravanelli, Yannick Estève

    LREC-COLING 2024


    2023

    Towards End-to-end Speech-to-text Summarization (PDF)

    Raul Monteiro, Diogo Pernes

    Publication in a Conference Proceeding, TSD2023


    RIGA at SemEval-2023 Task 2: NER Enhanced with GPT-3 (PDF)

    Eduards Mukans, Guntis Barzdins

    Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)


    LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech

    Titouan Parcollet, Ha Nguyen, Solène Evain, Marcely Zanon Boito, Adrien Pupier, Salima Mdhaffar, Hang Le, Sina Alisamir, Natalia Tomashenko, Marco Dinarelli, Shucong Zhang, Alexandre Allauzen, Maximin Coavoux, Yannick Estève, Mickael Rouvier, Jérôme Goulian, Benjamin Lecouteux, François Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier

    Computer Speech & Language


    Enhancing expressivity transfer in textless speech-to-speech translation (PDF)

    Jarod Duret, Benjamin O’Brien, Yannick Estève, Titouan Parcollet

    2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)


    ON-TRAC consortium systems for the IWSLT 2023 dialectal and low-resource speech translation tasks (PDF)

    Antoine Laurent, Souhir Gahbiche, Ha Nguyen, Haroun Elleuch, Fethi Bougares, Antoine Thiol, Hugo Riguidel, Salima Mdhaffar, Gaëlle Laperrière, Lucas Maison, Sameer Khurana, Yannick Estève

    International Conference on Spoken Language Translation (IWSLT) 2023 


    Specialized Semantic Enrichment of Speech Representations

    Gaëlle Laperrière, Ha Nguyen, Sahar Ghannay, Bassam Jabaian, Yannick Estève

    IEEE ICASSP 2023 workshop on Self-supervision in Audio, Speech and Beyond 


    Federated Learning for ASR based on Wav2vec 2.0 (PDF)

    Tuan Nguyen, Salima Mdhaffar, Natalia Tomashenko, Jean-François Bonastre, Yannick Estève

    2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)


    Learning multilingual expressive speech representation for prosody prediction without parallel data

    Jarod Duret, Titouan Parcollet, Yannick Estève

    12th ISCA Speech Synthesis Workshop (SSW2023)


    Supervising the Centroid Baseline for Extractive Multi-Document Summarization

    Simão Gonçalves, Gonçalo Correia, Diogo Pernes, and Afonso Mendes

    In Proceedings of the 4th New Frontiers in Summarization Workshop, pages 87–96, Singapore, Association for Computational Linguistics


    2022

    Metamodel Specialisation based Tool Extension (PDF)

    Paulis Barzdins, Audris Kalnins, Edgars Celms, Janis Barzdins, Arturs Sprogis, Mikus Grasmanis, Sergejs Rikacovs, Guntis Barzdins

    Baltic J. Modern Computing, Vol.10, No. 1, 17-35, https://doi.org/10.22364/bjmc.2022.10.1.02


    Impact Analysis of the Use of Speech and Language Models Pretrained by Self-Supervision for Spoken Language Understanding

    Salima Mdhaffar, Valentin Pelloin, Antoine Caubrière, Gaëlle Laperrière, Sahar Ghannay, Bassam Jabaian, Nathalie Camelin, Yannick Estève

    Publication in a Conference Proceeding, LREC 2022


    The Spoken Language Understanding Media Benchmark Dataset in the Era of Deep Learning: data updates, training and evaluation tools

    Gaëlle Laperrière, Valentin Pelloin, Antoine Caubrière, Salima Mdhaffar, Sahar Ghannay, Bassam Jabaian, Nathalie Camelin, Yannick Estève

    Publication in a Conference Proceeding, LREC 2022


    Speech Resources in the Tamasheq Language (PDF)

    Marcely Zanon Boito, Fethi Bougares, Florentin Barbier, Souhir Gahbiche, Loïc Barrault, Mickael Rouvier, Yannick Estève

    Publication in a Conference Proceeding, LREC 2022


    Le benchmark MEDIA revisité : données, outils et évaluation dans un contexte d’apprentissage profond

    Gaëlle Laperrière, Valentin Pelloin, Antoine Caubrière, Salima Mdhaffar, Nathalie Camelin, Sahar Ghannay, Bassam Jabaian, Yannick Estève

    Publication in a Conference Proceeding, JEP 2022


    LeBenchmark, un référentiel d’évaluation pour le français oral (PDF)

    Hang Le, Sina Alisamir, Marco Dinarelli, Fabien Ringeval, Solène Evain, Ha Nguyen, Marcely Zanon Boito, Salima Mdhaffar, Ziyi Tong, Natalia Tomashenko, Titouan Parcollet, Alexandre Allauzen, Yannick Estève, Benjamin Lecouteux, François Portet, Solange Rossato, Didier Schwab and Laurent Besacier

    Publication in a Conference Proceeding, JEP 2022


    Modèles neuronaux pré-appris par auto-supervision sur des enregistrements de parole en français

    Solène Evain, Ha Nguyen, Hang Le, Marcely Zanon Boito, Salima Mdhaffar, Sina Alisamir, Ziyi Tong, Natalia Tomashenko, Marco Dinarelli, Titouan Parcollet, Alexandre Allauzen, Yannick Estève, Benjamin Lecouteux, François Portet, Solange Rossato, Fabien Ringeval, Didier Schwab and Laurent Besacier

    Publication in a Conference Proceeding, JEP 2022


    ON-TRAC Consortium Systems for the IWSLT 2022: Dialect and Low-resource Speech Translation Tasks

    Marcely Zanon Boito, John Ortega, Hugo Riguidel, Antoine Laurent, Loïc Barrault, Fethi Bougares, Firas Chaabani, Ha Nguyen, Florentin Barbier, Souhir Gahbiche, Yannick Estève

    Publication in a Conference Proceeding, IWSLT 2022


    Findings of the IWSLT 2022 Evaluation Campaign

    Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe

    Publication in a Conference Proceeding, IWSLT 2022


    Simplifying Multilingual News Clustering Through Projection From a Shared Space

    João Santos, Afonso Mendes and Sebastião Miranda

    Publication in a Conference Proceeding, ECIR 2022


    A Study of Gender Impact in Self-supervised Models for Speech-to-Text Systems

    Marcely Zanon Boito, Laurent Besacier, Natalia Tomashenko, Yannick Estève

    Publication in a Conference Proceeding, INTERSPEECH 2022


    ∞-former: Infinite Memory Transformer

    Pedro Henrique Martins, Zita Marinho, André Martins

    Publication in a Conference Proceeding, ACL 2022


    End-to-end model for named entity recognition from speech without paired training data

    Salima Mdhaffar, Jarod Duret, Titouan Parcollet, Yannick Estève

    Publication in a Conference Proceeding, INTERSPEECH 2022


    RIGA at SemEval-2022 Task 1: Scaling Recurrent Neural Networks for CODWOE Dictionary Modeling

    Eduards Mukans, Gus Strazds, Guntis Barzdins

    Publication in a Conference Proceeding, SemEval-2022


    RUTA: MED – Dual Workflow Medical Speech Transcription Pipeline and Editor

    Arturs Znotins, Roberts Dargis, Normunds Gruzitis, Guntis Barzdins, Didzis Gosko

    Publication in a Conference Proceeding, NLDB 2022


    On the Use of Semantically Aligned Speech Representations for Spoken Language Understanding (PDF)

    Gaëlle Laperrière, Valentin Pelloin, Mickaël Rouvier, Themos Stafylakis, Yannick Estève

    Publication in a Workshop, SLT 2022


    Improving abstractive summarization with energy-based re-ranking

    Diogo Pernes, Afonso Mendes, André F.T. Martins

    Publication in a Workshop, GEM workshop at EMNLP 2022


    2021

    Task Agnostic and Task Specific Self-Supervised Learning from Speech with LeBenchmark (PDF)

    Solène Evain, Ha Nguyen, Hang Le, Marcely Zanon Boito, Salima Mdhaffar, Sina Alisamir, Ziyi Tong, Natalia Tomashenko, Marco Dinarelli, Titouan Parcollet, Alexandre Allauzen, Yannick Estève, Benjamin Lecouteux, François Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier

    Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks


    LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech (PDF)

    Solène Evain, Ha Nguyen, Hang Le, Marcely Zanon Boito, Salima Mdhaffar, Sina Alisamir, Ziyi Tong, Natalia Tomashenko, Marco Dinarelli, Titouan Parcollet, Alexandre Allauzen, Yannick Estève, Benjamin Lecouteux, François Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier

    Proc. Interspeech 2021, 1439-1443, doi: 10.21437/Interspeech.2021-556


    Where are we in semantic concept extraction for Spoken Language Understanding? (PDF)

    Sahar Ghannay, Antoine Caubrière, Salima Mdhaffar, Gaëlle Laperrière, Bassam Jabaian, Yannick Estève

    Speech and Computer: 23rd International Conference, SPECOM 2021, St. Petersburg, Russia, September 27–30, 2021, Proceedings


    Priberam Labs at the 3rd Shared Task on SlavNER (PDF)

    Pedro Ferreira, Rúben Cardoso, Afonso Mendes

    Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing