Webhooks Explorers (BETA)

community

https://huggingface.co/docs/hub/webhooks

Activity Feed

AI & ML interests

Webhooks are now publicly available on Hugging Face!

Recent Activity

raeidsaqur authored a paper 24 days ago

Universal Time Series Generation with Neural Controlled Differential Equations

raeidsaqur authored a paper 24 days ago

Fast-Vollib: A Fast Implied Volatility Library for Python with PyTorch, JAX, and CUDA Fused-Kernel Backends

satpalsr

submitted a paper to Daily Papers about 1 month ago

MobileEgo Anywhere: Open Infrastructure for long horizon egocentric data on commodity hardware

Paper • 2605.05945 • Published May 7 • 10

satpalsr

authored a paper about 2 months ago

MobileEgo Anywhere: Open Infrastructure for long horizon egocentric data on commodity hardware

Paper • 2605.05945 • Published May 7 • 10

satpalsr

posted an update about 2 months ago

Post

193

We're open-sourcing our infra along with a dataset of 10M+ frames!

We're releasing Stera, an open-source infra that turns an off-the-shelf device in your pocket into a high-fidelity multimodal data pipeline. It's built around four layers. Capture → Process → Evaluate → Export.

Stera Capture removes the need for bespoke/gated hardware and runs on an off-the-shelf iPhone. It fuses synchronized RGB, IMU, LiDAR-guided depth, and 6-DoF pose from ARKit out of the box and exports them to a raw MCAP file.

Dataset: fpvlabs/stera-10m
Launch Details: https://x.com/fpv_labs/status/2055262652033908832

satpalsr

posted an update 3 months ago

Post

189

OpenAI is hiring for SLAM Engineers!
And open-source shouldn't lag behind.

It's a pretty hard and necessary problem that has to be solved to bring generalisable robots into the real world.

We are pushing out first deep down & will be open-sourcing stuff in the next releases. Hope everyone is ready! Cheers to HF & more hugs.

Find us at https://x.com/fpv_labs/status/2042585804162371713

chansung

authored a paper 4 months ago

TAROT: Test-driven and Capability-adaptive Curriculum Reinforcement Fine-tuning for Code Generation with Large Language Models

Paper • 2602.15449 • Published Feb 17 • 7

chansung

submitted a paper to Daily Papers 4 months ago

TAROT: Test-driven and Capability-adaptive Curriculum Reinforcement Fine-tuning for Code Generation with Large Language Models

Paper • 2602.15449 • Published Feb 17 • 7

vumichien

authored 2 papers 9 months ago

MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources

Paper • 2509.25531 • Published Sep 29, 2025 • 11

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published Oct 9, 2025 • 40

osanseviero

authored a paper 9 months ago

EmbeddingGemma: Powerful and Lightweight Text Representations

Paper • 2509.20354 • Published Sep 24, 2025 • 50

davanstrien

posted an update 10 months ago

Post

2958

I fine-tuned a smol VLM to generate specialized art history metadata!

https://huggingface.co/davanstrien/iconclass-vlm: Qwen2.5-VL-3B trained using SFT to generate ICONCLASS codes (think Dewey Decimal for art!)

Trained with TRL + HF Jobs - single UV script, no GPU needed!

Space to explore predictions on a test set: davanstrien/iconclass-predictions

Blog soon!

1 reply

chansung

posted an update 12 months ago

Post

4898

YAML engineering is becoming more important than ever, from infra provisioning to model training (recipes).

Here, I built a simple editor first for @dstackai, and I will share the live endpoint this week. Let me know what you think about this approach.

Based on this approach, if people think this is useful, I am going to do the same thing for the LLM training recipes for popular frameworks such as Hugging Face open-r1, Axolotl, and so on. Let me hear.

davanstrien

posted an update about 1 year ago

Post

3755

Inspired by Hugging Face's official MCP server, I've developed a complementary tool that exposes my semantic search API to enhance discovery across the HF platform.

Key capabilities:

- AI-powered semantic search for models and datasets
- Parameter count analysis via safetensors metadata
- Trending content discovery
- Find similar models/datasets functionality
- 11 tools total for enhanced ecosystem navigation

The semantic search goes beyond simple keyword matching, understanding context and relationships between different models and datasets.

Example query: "Find around 10 reasoning Hugging Face datasets published in 2025 focusing on topics other than maths and science. Show a link and a short summary for each dataset." (results in video!)

https://github.com/davanstrien/hub-semantic-search-mcp
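
The "parameter count analysis via safetensors metadata" capability listed above can be illustrated with a minimal sketch. This is not the tool's actual implementation, only one way such a count could be derived, relying on the documented safetensors layout: an 8-byte little-endian header length followed by a JSON header that records each tensor's dtype and shape. The helper names `build_safetensors_header` and `count_parameters` are hypothetical, and the demo header is made up for illustration.

```python
import json
import struct
from math import prod


def build_safetensors_header(tensors: dict) -> bytes:
    """Build only the header portion of a safetensors file (no tensor data).

    Hypothetical helper used to create demo input without any dependencies.
    """
    header = json.dumps(tensors).encode("utf-8")
    # The file starts with the header length as an unsigned 64-bit little-endian integer.
    return struct.pack("<Q", len(header)) + header


def count_parameters(blob: bytes) -> int:
    """Sum the element counts of all tensors described in a safetensors header."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8 : 8 + header_len])
    return sum(
        prod(entry["shape"])  # an empty shape (scalar) counts as 1 element
        for name, entry in header.items()
        if name != "__metadata__"  # optional free-form metadata, not a tensor
    )


# Made-up header describing a 4x3 weight matrix and a bias vector of length 4.
demo = build_safetensors_header({
    "__metadata__": {"format": "pt"},
    "linear.weight": {"dtype": "F32", "shape": [4, 3], "data_offsets": [0, 48]},
    "linear.bias": {"dtype": "F32", "shape": [4], "data_offsets": [48, 64]},
})
print(count_parameters(demo))  # 12 + 4 = 16
```

Because only the header is needed, a count like this can be computed without downloading the tensor data itself.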

1 reply

davanstrien

posted an update about 1 year ago

Post

2420

Came across a very nice submission from @marcodsn for the reasoning datasets competition (https://huggingface.co/blog/bespokelabs/reasoning-datasets-competition).

The dataset distils reasoning chains from arXiv research papers in biology and economics. Some nice features of the dataset:

- Extracts both the logical structure AND researcher intuition from academic papers
- Adopts the persona of researchers "before experiments" to capture exploratory thinking
- Provides multi-short and single-long reasoning formats with token budgets
- Shows 7.2% improvement on MMLU-Pro Economics when fine-tuning a 3B model

It's created using the Curator framework with plans to scale across more scientific domains and incorporate multi-modal reasoning with charts and mathematics.

I personally am very excited about datasets like this, which involve creativity in their creation and don't just rely on $$$ to produce a big dataset with little novelty.

Dataset can be found here: marcodsn/academic-chains (give it a like!)

davanstrien

posted an update about 1 year ago

Post

1797

I've created a v1 dataset (davanstrien/reasoning-required) and model (davanstrien/ModernBERT-based-Reasoning-Required) to help curate "wild text" data for generating reasoning examples beyond the usual code/math/science domains.

- I developed a "Reasoning Required" dataset with a 0-4 scoring system for reasoning complexity
- I used educational content from HuggingFaceFW/fineweb-edu, adding annotations for domains, reasoning types, and example questions

My approach enables a more efficient workflow: filter text with small models first, then use LLMs only on high-value content.

This significantly reduces computation costs while expanding reasoning dataset domain coverage.
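
The filter-first workflow described in this post can be sketched roughly as follows. This is only an illustration of the general pattern, not the author's pipeline: `two_stage_pipeline`, `toy_score`, and `toy_llm` are hypothetical names, the cue-word scorer is a stand-in for a small classifier producing a 0-4 score, and the threshold of 3 is an assumption.

```python
from typing import Callable, Iterable


def two_stage_pipeline(
    texts: Iterable[str],
    cheap_score: Callable[[str], int],
    expensive_step: Callable[[str], str],
    min_score: int = 3,  # assumed cut-off on the 0-4 scale
) -> list[str]:
    """Run expensive_step only on texts whose cheap score is at least min_score."""
    return [expensive_step(t) for t in texts if cheap_score(t) >= min_score]


def toy_score(text: str) -> int:
    """Stand-in for a small classifier: count reasoning cue words, capped at 4."""
    cues = ("because", "therefore", "however", "implies")
    return min(4, sum(text.lower().count(cue) for cue in cues))


llm_calls: list[str] = []


def toy_llm(text: str) -> str:
    """Stand-in for an LLM call; records each input so the saved calls are visible."""
    llm_calls.append(text)
    return f"Generated reasoning example for: {text[:40]}"


texts = [
    "The sky is blue.",
    "Prices rose because supply fell; therefore demand shifted, "
    "however slowly, which implies scarcity.",
]
results = two_stage_pipeline(texts, toy_score, toy_llm)
print(len(results), len(llm_calls))  # 1 1
```

Only the second text reaches the expensive step, which is where the cost saving in this kind of workflow comes from.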

awacke1

posted an update over 1 year ago

Post

2994

AI Vision & SFT Titans 🌟 Turns PDFs into text, snaps pics, and births AI art.

https://huggingface.co/spaces/awacke1/TorchTransformers-Diffusion-CV-SFT

1. OCR a grocery list or train a titan while sipping coffee? ☕
2. Camera Snap 📷: Capture life’s chaos—your cat’s face or that weird receipt. Proof you’re a spy!
3. OCR 🔍: PDFs beg for mercy as GPT-4o extracts text.
4. Image Gen 🎨: Prompt “neon superhero me”
5. PDF 📄: Double-page OCR Single-page sniping

Build Titans 🌱: Train tiny AI models. 💪Characters🧑‍🎨: Craft quirky heroes.
🎥

osanseviero

authored a paper over 1 year ago

Gemma 3 Technical Report

Paper • 2503.19786 • Published Mar 25, 2025 • 58

chansung

posted an update over 1 year ago

Post

3963

A simple guide to the recipe for GRPO on Open-R1, which is built on top of TRL.

I think the FastAPI wrapper of vLLM with WeightSyncWorker is a pretty cool feature. Also, we have many predefined reward functions out of the box!

5 replies

chansung

posted an update over 1 year ago

Post

2684

Mistral AI Small 3.1 24B is not only free for commercial use but also the best model for single-GPU deployment.

I packed up all the information you need to know in a single picture. Hope this helps! :)

1 reply

bpHigh

authored a paper over 1 year ago

MMTEB: Massive Multilingual Text Embedding Benchmark

Paper • 2502.13595 • Published Feb 19, 2025 • 49

chansung

posted an update over 1 year ago

Post

1607

Gemma 3 Release in a nutshell
(it seems function calling is not supported, although the announcement said it was)

Team members 146

webhooks-explorers's activity