Tristan Thrush
Tristan
AI & ML interests
NLP, Datasets, Multimodality
Recent Activity
Upvoted an article 15 days ago
Optimizing Pretraining Data Mixes with LLM-Estimated Utility
Updated a model 24 days ago
Tristan/dclm-perplexity-correlations-410m-3
Updated a model 24 days ago
Tristan/dclm-perplexity-correlations-160m-3
Tristan's activity
Convert dataset to Parquet
1
#10 opened 4 months ago by Tristan
Trouble getting access to dataset
3
#9 opened 4 months ago by iliang1234
Update license_agreement.txt
#7 opened 10 months ago by Tristan
Update README.md
#2 opened 12 months ago by Tristan
Streaming dataset generation
3
#6 opened about 1 year ago by davidmezzetti
Notifications from Datasets Server
2
#5 opened over 1 year ago by parquet-converter
readme: add language tag
1
#6 opened over 1 year ago by stefan-it
Add code highlighting to the README
1
#4 opened over 1 year ago by bryant1410
Add LM and MLM tasks
1
#1 opened about 2 years ago by lhoestq
Add TF weights
2
#1 opened about 2 years ago by joaogante
Update tokenizer_config.json
1
#2 opened about 2 years ago by joaogante
Add TF weights
2
#1 opened about 2 years ago by joaogante
Add TF weights
2
#1 opened about 2 years ago by joaogante
Add TF weights
2
#2 opened about 2 years ago by joaogante
Update tokenizer_config.json
1
#3 opened about 2 years ago by joaogante
Create README.md
4
#1 opened about 2 years ago by puffy310
Make filters shareable
4
#10 opened over 2 years ago by BramVanroy
Resource used to produce this version of dataset?
1
#1 opened about 2 years ago by spate141
Update `hf_hub_url` call
1
#2 opened about 2 years ago by xiaohk
Replace hf_hub_url calls with relative path
1
#3 opened about 2 years ago by mariosasko