Voice Reach Quick Note

Community Article Published June 14, 2026

Voice Reach is a private Build Small Hackathon staging artifact for the Voice Contact Widget V0. The goal is to keep a lightweight, trace-backed signal on which small models are currently usable for an iframe voice contact flow.

Goal

  • Run a lightweight eval across the configured ASR and text models.
  • Check Hugging Face model cards so each model is called with the right runtime shape where possible.
  • Publish privacy-safe agent traces for hackathon review.
  • Keep the claim small: seed/smoke signals, not production quality and not final submission readiness.

Pinned Artifacts

  • Space (private): build-small-hackathon/voice-reach
  • Trace dataset (private): build-small-hackathon/voice-reach-agent-traces

Model Decision Table

| Role | Model display name | Artifact ID | Engine | Format | Quantization | Deployment finding | Quick-eval finding |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ASR | Nemotron 3.5 ASR 0.6B | onnx-community/nemotron-3.5-asr-streaming-0.6b-onnx-int4 | onnxruntime / onnxruntime-genai | ONNX | int4 | Deployed through HF Space to Modal; fallback_used=false proven on hosted rows. | Best current ASR. Works with explicit language hint; auto routing is risky. 7 pass / 5 fail ASR-role checks. |
| ASR | cohere-transcribe-03-2026 ONNX | onnx-community/cohere-transcribe-03-2026-ONNX | Intended: Transformers.js/WebGPU; current app: unsupported in Python Modal path | ONNX | q4 / unknown in app | Selector-visible, but current Modal endpoint returns an explicit adapter blocker instead of fake output. | No hosted model-proof rows. Not a Hindi/Hinglish proof path yet. |
| ASR | cohere-transcribe-03-2026 official | CohereLabs/cohere-transcribe-03-2026 | Transformers | safetensors | none / fp32 | Separate comparison endpoint code corrected to official HF call shape, but not redeployed/live-smoked after fix. | Pending. Prior deployed smokes used old call path and produced invalid token noise; cannot count as proof yet. |
| Text | MiniCPM5-1B | openbmb/MiniCPM5-1B-GGUF | llama.cpp | GGUF | q4_k_m | Deployed through Modal; text.fallback_used=false hosted proof exists. | Mixed. 2 pass / 4 fail. Usable as English smoke default, not best Hinglish default. |
| Text | tiny-aya fire | CohereLabs/tiny-aya-fire-GGUF | llama.cpp | GGUF | q4_k_m | Deployed through Modal; text.fallback_used=false hosted proof exists. | Best current Hinglish text model. 3 pass / 0 fail, but latency is high. |
| Text | Nemotron 3 Nano 4B | nvidia/NVIDIA-Nemotron-3-Nano-4B-GGUF | llama.cpp | GGUF | q4_k_m | Deployed through Modal; text.fallback_used=false hosted proof exists. | Weak on current seed set. 0 pass / 3 fail; runnable but not a good default. |
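
To compare the quick-eval findings side by side, the pass/fail counts from the table can be reduced to a check total and a pass rate. The sketch below is illustrative only: the counts are copied from the table, while the helper names (`pass_rate`, `summarize`) are made up for this note. With 3 to 12 checks per model, the rates are seed/smoke signals and nothing more.

```python
# Pass/fail counts copied from the Model Decision Table.
# The two Cohere ASR entries are omitted because they have no hosted proof rows.
COUNTS = {
    "Nemotron 3.5 ASR 0.6B": (7, 5),
    "MiniCPM5-1B": (2, 4),
    "tiny-aya fire": (3, 0),
    "Nemotron 3 Nano 4B": (0, 3),
}


def pass_rate(passed: int, failed: int) -> float:
    """Fraction of checks passed; 0.0 when there were no checks at all."""
    total = passed + failed
    return passed / total if total else 0.0


def summarize(counts: dict) -> dict:
    """Map each model name to (total checks, pass rate rounded to 2 places)."""
    return {
        name: (passed + failed, round(pass_rate(passed, failed), 2))
        for name, (passed, failed) in counts.items()
    }


if __name__ == "__main__":
    for name, (total, rate) in summarize(COUNTS).items():
        print(f"{name}: {total} checks, pass rate {rate:.2f}")
```

Reporting the check total next to the rate keeps the small sample size visible, which matches the intent to keep the claim small.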

What Was Done

  • Corrected the official Cohere ASR comparison adapter to the HF model-card call shape.
  • Generated quick-signals.json from existing hosted non-fallback rows.
  • Generated a privacy-safe agent trace bundle with 36 hosted trace rows.
  • Uploaded the private Voice Reach Space to build-small-hackathon/voice-reach.
  • Uploaded the private trace dataset to build-small-hackathon/voice-reach-agent-traces.
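
The quick-signals step above (keep only hosted non-fallback rows, then tally them) could look roughly like the following. This is a sketch under assumptions, not the repo's actual script: the row keys `model`, `fallback_used`, and `passed` are a hypothetical schema, and the example rows are invented purely to exercise the function.

```python
from collections import defaultdict
from typing import Iterable


def build_quick_signals(rows: Iterable[dict]) -> dict:
    """Tally pass/fail per model, skipping rows that were served by a fallback."""
    tallies = defaultdict(lambda: {"pass": 0, "fail": 0})
    for row in rows:
        # Hypothetical schema: a missing flag is treated as "fallback used",
        # so only rows that explicitly prove fallback_used=False are counted.
        if row.get("fallback_used", True):
            continue
        key = "pass" if row["passed"] else "fail"
        tallies[row["model"]][key] += 1
    return dict(tallies)


# Invented rows for illustration only; these are not real eval results.
EXAMPLE_ROWS = [
    {"model": "tiny-aya fire", "fallback_used": False, "passed": True},
    {"model": "tiny-aya fire", "fallback_used": True, "passed": True},
    {"model": "MiniCPM5-1B", "fallback_used": False, "passed": False},
]

if __name__ == "__main__":
    print(build_quick_signals(EXAMPLE_ROWS))
```

Treating an absent flag as a fallback is a deliberately conservative choice: a row only counts as model proof when it says so explicitly.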

Evidence Snapshot

The pinned trace dataset is the machine-readable evidence for this note. It contains the privacy-safe hosted trace rows and the schema used to derive the quick decision signal.

Summary from the current local quick-signal snapshot:

  • Hosted model rows: 12.
  • Hosted agent trace rows: 36.
  • Best current ASR signal: Nemotron ASR with explicit language hint.
  • Best current Hindi/Hinglish text signal: tiny-aya fire.
  • English smoke/default fallback signal: MiniCPM5.
  • Not counted as proof yet: Cohere official ASR after adapter correction, because it was not redeployed and live-smoked after the fix.
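
A reader with repo access could sanity-check the trace row count with a few lines of Python. The sketch assumes the trace file is JSON Lines (one JSON object per line) and that v0/traces_public/data/eval_traces.jsonl is the file holding the 36 hosted trace rows; neither assumption is confirmed in this note, and `count_jsonl_rows` is a made-up helper name.

```python
import json
from typing import Iterable


def count_jsonl_rows(lines: Iterable[str]) -> int:
    """Count non-blank lines, verifying that each one parses as JSON."""
    count = 0
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines, e.g. a trailing newline
        json.loads(line)  # raises json.JSONDecodeError on a malformed row
        count += 1
    return count


# Usage (assumed path and expected count, see the lead-in):
# with open("v0/traces_public/data/eval_traces.jsonl", encoding="utf-8") as fh:
#     assert count_jsonl_rows(fh) == 36
```

The function takes any iterable of lines, so it works on an open file handle or on an in-memory list.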

Source Provenance

These source paths are in the local/Git repo, not in the Hugging Face Space package:

  • v0/eval/results/quick-signals.json
  • v0/eval/README.md
  • v0/traces_public/data/eval_traces.jsonl
  • v0/traces_public/README.md
  • v0/evidence/README.md
  • v0/modal/modal_voice_contact_cohere_asr.py

Suggested Path

Proceed with Nemotron ASR plus explicit language hint, and use tiny-aya fire as the Hindi/Hinglish text default. Keep MiniCPM5 for English smoke/default fallback. Defer Cohere official ASR unless a comparison smoke is worth a paid Modal redeploy and live run.
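
As a rough illustration of that path, the routing could be expressed as a small lookup keyed on the language hint. The artifact IDs come from the Model Decision Table; the hint codes (`hi`, `hi-en`), the function name `pick_models`, and the choice to reject a missing hint outright are assumptions made for this sketch, not the widget's actual behavior.

```python
ASR_MODEL = "onnx-community/nemotron-3.5-asr-streaming-0.6b-onnx-int4"
TEXT_HINGLISH = "CohereLabs/tiny-aya-fire-GGUF"
TEXT_DEFAULT = "openbmb/MiniCPM5-1B-GGUF"

# Hypothetical hint codes for Hindi and Hinglish; the real app may use others.
HINDI_HINTS = {"hi", "hi-en"}


def pick_models(language_hint: str) -> dict:
    """Choose ASR and text models for an explicit language hint."""
    hint = (language_hint or "").strip().lower()
    if not hint:
        # Auto routing was flagged as risky, so this sketch refuses to guess.
        raise ValueError("an explicit language hint is required")
    text_model = TEXT_HINGLISH if hint in HINDI_HINTS else TEXT_DEFAULT
    return {"asr": ASR_MODEL, "asr_language": hint, "text": text_model}


if __name__ == "__main__":
    print(pick_models("hi-en"))
    print(pick_models("en"))
```

Any hint outside the Hindi/Hinglish set falls through to MiniCPM5, mirroring its role as the English smoke/default fallback.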
