Hynek Kydlicek PRO

hynky

AI & ML interests

Data-processing

Recent Activity

authored a paper about 3 hours ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

updated a dataset about 5 hours ago

math-extraction-comp/deepseek-ai__DeepSeek-R1-Distill-Qwen-32B_private

published a dataset about 5 hours ago

math-extraction-comp/deepseek-ai__DeepSeek-R1-Distill-Qwen-32B_private

hynky's activity

liked a dataset 2 months ago

data-is-better-together/fineweb-c

Viewer • Updated 2 days ago • 58.1k • 1.2k • 39

liked a dataset 2 months ago

HuggingFaceFW/fineweb-2

Viewer • Updated 29 days ago • 12.5B • 69.7k • 405

liked a Space 2 months ago

Number Tokenization Blog

Explore how tokenization affects arithmetic in LLMs

liked a dataset 2 months ago

CohereForAI/Global-MMLU

Viewer • Updated Dec 12, 2024 • 602k • 8.74k • 104

liked a Space 2 months ago

Discussion Forum

liked a dataset 3 months ago

ClusterlabAi/InstAr-500k

Viewer • Updated Jul 30, 2024 • 481k • 185 • 10

liked a Space 4 months ago

Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks

Evaluate multilingual models using FineTasks

liked a dataset 4 months ago

LLM360/TxT360

Preview • Updated Nov 8, 2024 • 298k • 220

liked 2 Spaces 4 months ago

Hub LFS Analysis

An analysis of LFS files on the Hub.

TxT360: Trillion Extracted Text

Explore a large, deduplicated dataset for LLM training

liked a dataset 5 months ago

Cleanlab/bad_data_gsm8k_svamp.csv

Viewer • Updated Apr 25, 2024 • 34 • 39 • 3

liked a Space 5 months ago

Datasets Metrics Explorer

liked 3 datasets 6 months ago

ThaiSyntheticQA/ThaiQA-v1

Viewer • Updated Jul 24, 2024 • 12.7k • 40 • 4

coastalcph/fairlex

Updated Jul 27, 2023 • 149 • 7

meta-llama/Llama-3.1-405B-Instruct-evals

Viewer • Updated Oct 2, 2024 • 158k • 180 • 20

liked 3 datasets 7 months ago

jon-tow/okapi_mmlu

Updated Oct 24, 2023 • 91 • 5

pakphum/winograd_th

Viewer • Updated Nov 16, 2024 • 285 • 40 • 4

scb10x/thai_exam

Viewer • Updated Jul 8, 2024 • 590 • 155 • 11

liked a Space 8 months ago

Open-LLM performances are plateauing, let’s make the leaderboard steep again

Update leaderboard for fair model evaluation

liked a dataset 8 months ago

m-a-p/Matrix

Viewer • Updated Jun 3, 2024 • 2.5M • 3.34k • 158