Pierre-Carl Langlais

Pclanglais

AI & ML interests

Open data & open LLMs

Recent Activity

updated a dataset 2 minutes ago
PleIAs/common_corpus
published a model about 17 hours ago
LLMDH/350m_ocr
updated a model 1 day ago
LLMDH/350m_treasoning_complete

Organizations

AgentPublic, BigScience Data, Kheops SAS, Blog-explorers, OpenLLM France, ZeroGPU Explorers, INAGUA, PleIAs, :probabl., Social Post Explorers, LLM - Digital Humanities

Pclanglais's activity

published an article 6 months ago
Article

The case for specialized pre-training: ultra-fast foundation models for dedicated tasks

29
published an article 7 months ago
Article

Announcing Finance Commons and the Bad Data Toolbox: Pioneering Open Data and Advanced Document Processing

20
published an article 10 months ago
Article

Post-OCR-Correction: 1 billion words dataset of automated OCR correction by LLM

15
published an article 10 months ago
Article

Releasing Youtube-Commons: a massive open corpus for conversational and multimodal data

22
published an article 11 months ago
Article

Releasing Common Corpus: the largest public domain dataset for training LLMs

18