What's in v0.1?
- A few structured scam examples (text-based)
- Covers DeFi, crypto, phishing, and social engineering
- Initial labelling format for scam classification
⚠️ This is not a full dataset yet (only samples are currently available). The goal right now is to establish the structure and gather feedback.
Current Schema & Labelling Approach
- "instruction" → Task prompt (e.g., "Evaluate this message for scams")
- "input" → Source & message details (e.g., Telegram post, Tweet)
- "output" → Scam classification & risk indicators
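To make the schema concrete, a single record might look like the following. The message text and labels here are invented for illustration and are not taken from the dataset:

```python
import json

# Hypothetical v0.1-style record: field names follow the schema above,
# but all values are made up for this example.
sample = {
    "instruction": "Evaluate this message for scams",
    "input": (
        "Telegram post: 'New token $MOON launching in 1 hour! "
        "Send 0.1 ETH to this address and receive 10x back!'"
    ),
    "output": (
        "Classification: crypto scam (pump & dump / advance-fee). "
        "Risk indicators: guaranteed returns, artificial urgency, "
        "request to send funds to an unknown address."
    ),
}

print(json.dumps(sample, indent=2))
```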
Current v0.1 Sample Categories
- Crypto Scams → Meme token pump & dumps, fake DeFi projects
- Phishing → Suspicious finance/social media messages
- Social Engineering → Manipulative messages exploiting trust
Next Steps
- Expanding the dataset with more phishing & malware examples
- Refining schema & annotation quality
- Open to feedback, contributions, and suggestions
If this is something you might find useful, bookmark/follow/like the dataset repo <3
Thoughts, feedback, and ideas are always welcome! Drop a comment, or DMs are open.
Mechanistic Interpretability (MI) is the discipline of opening the black box of large language models (and other neural networks) to understand the underlying circuits, features, and mechanisms that give rise to specific behaviours.
Instead of treating a model as a monolithic function, we can:
1. Trace how input tokens propagate through attention heads & MLP layers
2. Identify localized "circuit motifs"
3. Develop methods to systematically break down or "edit" these circuits to confirm we understand the causal structure.
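Step 3 can be sketched with the simplest possible causal intervention: zero-ablating an attention block and measuring how much the model's output changes. The sketch below uses a generic PyTorch `TransformerEncoder` as a stand-in for a real language model, and ablates a whole attention layer rather than an individual head; it is meant only to illustrate the hook-based intervention pattern:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny 2-layer transformer standing in for a real LM (no pretrained weights).
layer = nn.TransformerEncoderLayer(d_model=16, nhead=2, dropout=0.0,
                                   batch_first=True)
model = nn.TransformerEncoder(layer, num_layers=2)

x = torch.randn(1, 5, 16)  # (batch, tokens, d_model)

with torch.no_grad():
    baseline = model(x)

# Causal intervention: replace layer 0's self-attention output with zeros
# via a forward hook, then rerun the model and compare outputs.
def zero_ablate(module, inputs, output):
    attn_out, attn_weights = output  # nn.MultiheadAttention returns a pair
    return (torch.zeros_like(attn_out), attn_weights)

handle = model.layers[0].self_attn.register_forward_hook(zero_ablate)
with torch.no_grad():
    ablated = model(x)
handle.remove()

# A large effect suggests this component matters causally for this input.
effect = (baseline - ablated).norm().item()
print(f"Effect of ablating layer-0 attention: {effect:.4f}")
```

Real MI work refines this pattern, e.g., patching in activations from a different input ("activation patching") instead of zeros, and intervening on individual heads or MLP neurons rather than whole blocks.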
Mechanistic Interpretability aims to yield human-understandable explanations of how advanced models represent and manipulate concepts, which hopefully leads to systems that are easier to audit, debug, and trust.