Manuel Faysse

On a latent space odyssey.

PhD Candidate

Lead Research Scientist

Paris, France

Hey! I am Manu, a 2nd year PhD student working on applied NLP and ML Privacy research, but curious about (way too) many other things!

After pretraining at EPFL with a master’s in Robotics and Data Science, and an awesome research stint with the Computational Privacy Group at Imperial College London, I worked as a Research Scientist at Illuin Technology on various NLP use cases, notably deep multimodal models for Document ML and neural information retrieval.

I am now in my academic finetuning phase as a PhD student at CentraleSupélec (Université Paris-Saclay), supervised by the distilled knowledge of Pierre Colombo. My research focuses on industrial applications of large language models, with papers on the automatic evaluation of instruction-tuned models, bilingual Large Language Model pretraining (CroissantLLM), multimodal information retrieval (ColPali), model memorization, and confidence estimation techniques for neural information retrieval.

My work has been published in top international venues (ICML, EMNLP, TMLR), has been featured in the press (MIT Tech Review, Nature Magazine, Usine Digitale, Usine Nouvelle, etc.), has led to many invited talks (Meta, IBM, Naver, LlamaIndex, etc.) and has been listed as a top AI innovation of 2024 (State of AI, Tech Radar).

My PhD is funded through the French CIFRE program in collaboration with Illuin Technology, where I currently hold a Lead Research Scientist position and spend a minor share of my time advising and supporting various R&D efforts in the LLM and Vision LLM space.

Don’t hesitate to contact me to chat, or to inquire about potential collaborations or invited talks!

news

Oct 15, 2024 ColPali has been featured in the 2024 edition of the renowned State of AI and is listed in the Tech Radar as a top AI innovation to assess.
Sep 30, 2024 The interview I gave to Jakub Zavrel of Zeta Alpha on the topic of Visual Document Retrieval has been released on YouTube and Spotify.
Sep 23, 2024 Our work “Towards Trustworthy Reranking: A Simple yet Effective Abstention Mechanism” has been accepted at TMLR!
Aug 19, 2024 Gave an invited talk at Unbabel on ColPali and Retrieval in Vision Space.
Jul 26, 2024 Invited to the LlamaIndex webinar to talk about ColPali and Document Retrieval in Vision Space.

selected publications

2024

  1. ColPali: Efficient Document Retrieval with Vision Language Models
    Manuel Faysse, Hugues Sibille, Tony Wu, and 4 more authors
    2024
  2. CroissantLLM: A Truly Bilingual French-English Language Model
    Manuel Faysse, Patrick Fernandes, Nuno M. Guerreiro, and 13 more authors
    2024

2023

  1. Revisiting Instruction Fine-tuned Model Evaluation to Guide Industrial Applications
    Manuel Faysse, Gautier Viaud, Céline Hudelot, and 1 more author
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Dec 2023