Kshitij Thakkar PRO
kshitijthakkar
AI & ML interests
Building the evaluation and observability layer for AI.
Creator of TraceVerse—turning real-world LLM interactions into datasets, benchmarks, and cost-efficient model insights.
Recent Activity
updated a bucket 1 day ago: gemma-challenge/gemma-chiku-inu
updated a collection 3 days ago: The Mind of Tashi
updated a collection 3 days ago: The Mind of Tashi
Organizations
Articles 6
Article
2
Scaling Mixture of Experts: Architecture Search for Billion-Parameter Language Models
DeepSeek V4 Replicas
Small-scale faithful replicas of the DeepSeek-V4 architecture for ablation and weight-transfer research.
kshitijthakkar/deepseek-v4-mini-300M-init
Text Generation • 0.3B • Updated • 9
kshitijthakkar/deepseek-v4-mini-1B-init
Text Generation • 1B • Updated • 6
kshitijthakkar/deepseek-v4-mini-3B-init
Text Generation • 3B • Updated • 4
kshitijthakkar/deepseek-v4-mini-6B-init
Text Generation • 8B • Updated • 6 • 2
mcp-server-bench
A collection of benchmarking results comparing Gradio and FastMCP.
Spaces 13
pinned
Running
Agents
GuardianTails
🐾
Pet Health Intelligence Platform
Sleeping
Tracegenix Mini Demo
🔍
Test AI tool calls using mock utilities via chat
Sleeping
Loggenix MoE 0.4B-A0.2B Demo
🧠
Test and evaluate the Loggenix MoE language model
Runtime error
Agents
1
E-Commerce Product Content Generator
🛒
Generate product photos and marketing copy for e‑commerce
Sleeping
Agents
1
Multimodal Content Pipeline
🖼
Generate an image and hear its spoken description
Sleeping
Agents
1
AI Content Creation Pipeline
🎨
Generate complete social media posts from a text prompt
Models 142
kshitijthakkar/deepseek-v4-mini-300M-recovered
Text Generation • 0.3B • Updated • 39 • 1
kshitijthakkar/deepseek-v4-mini-300M-recovered-h100
Text Generation • 0.3B • Updated • 20
kshitijthakkar/deepseek-v4-mini-300M-recovered-wip
Text Generation • 0.3B • Updated • 19
kshitijthakkar/tracegenix-mini-sft-clean-3ep
Text Generation • 1B • Updated • 143
kshitijthakkar/deepseek-v4-mini-300M-from-flash-sft-test-lora
Updated • 1
kshitijthakkar/loggenix-moe-300M-base-pt-sft-test
Text Generation • 0.3B • Updated • 6
kshitijthakkar/deepseek-v4-mini-300M-from-flash
Text Generation • 0.3B • Updated • 123 • 5
kshitijthakkar/deepseek-v4-mini-1B-from-flash
Text Generation • 1B • Updated • 122 • 2
kshitijthakkar/deepseek-v4-mini-6B-init
Text Generation • 8B • Updated • 6 • 2
kshitijthakkar/deepseek-v4-mini-3B-init
Text Generation • 3B • Updated • 4
Datasets 418
kshitijthakkar/smoltrace-leaderboard
Viewer • Updated • 108 • 1.92k
kshitijthakkar/smoltrace-metrics-20260424_122614
Viewer • Updated • 1 • 32
kshitijthakkar/smoltrace-traces-20260424_122614
Viewer • Updated • 2 • 25
kshitijthakkar/smoltrace-results-20260424_122614
Viewer • Updated • 2 • 27
kshitijthakkar/smoltrace-metrics-20260424_112312
Viewer • Updated • 1 • 35
kshitijthakkar/smoltrace-traces-20260424_112312
Viewer • Updated • 2 • 31
kshitijthakkar/smoltrace-results-20260424_112312
Viewer • Updated • 2 • 24
kshitijthakkar/smoltrace-metrics-20260424_111528
Viewer • Updated • 1 • 33
kshitijthakkar/smoltrace-traces-20260424_111528
Viewer • Updated • 2 • 31
kshitijthakkar/smoltrace-results-20260424_111528
Viewer • Updated • 2 • 32