Model Releases

  • Granite-4.1-30B — IBM’s 30B conversational model, Apache 2.0, endpoints-compatible, deployable on Azure. Safetensors format available. If you need an on-prem friendly mid-sized model without licensing headaches, here’s one. 📄
  • Gemma-4-31B-it-assistant — Google’s 31B Gemma 4 variant tuned for assistant-style any-to-any tasks. Apache 2.0, endpoints-compatible. Solid if you want something in the 30B range that doesn’t require a PhD in licensing terms. 🤖
  • Hy-MT1.5-1.8B-1.25bit — Tencent’s translation model quantized to 1.25-bit precision. Three associated arXiv papers. If your deployment constraints are “run on a potato,” this is for you. The papers are genuinely interesting if you care about quantization theory beyond the usual 4-bit hand-waving.
  • Sulphur-2-base — Text-to-video model in diffusers and GGUF formats, with conversational support. Endpoints-compatible. Text-to-video is still mostly “cool demo” territory, but having it in GGUF is a thoughtful touch.

Open Source Releases

  • TabPFN — Foundation model for tabular data with fast inference and strong results on small-to-medium datasets, no heavy tuning needed. If you’re tired of “just throw XGBoost at it,” this is worth a look.
  • Claude Code v2.1.129 — Adds --plugin-url flag for fetching plugin zips at runtime, CLAUDE_CODE_FORCE_SYNC_OUTPUT=1 env var for busted terminal auto-detection (looking at you, Emacs), and package manager auto-update support. Quality-of-life stuff that actually matters when you’re trying to get work done. 🔥
  • DGLZ 0.1.10.post4 — EEG emotion recognition library using CNN-Transformer encoders with hierarchical classifiers. If you’re building biosignal pipelines, this is niche but well-structured.

Research Worth Reading

AI Dev Tools

  • InsForge — Postgres-based backend platform for coding agents: auth, storage, compute, hosting, AI gateway in one. If you’re building an agent platform and don’t want to glue together six different services, this might save you a week.
  • Claude Code v2.1.131 — Fixes VS Code extension activation failure on Windows (hardcoded SDK path causing a createRequire polyfill bug) and Mantle auth failures from a missing x-api-key header. Unsexy bugs, but the kind that eat an afternoon if you hit them.

Today’s Synthesis

If you’re deploying Sulphur-2-base for text-to-video generation, pair it with eOptShrinkQ’s two-stage KV cache compression to cut memory usage on long prompts without tanking quality. The arXiv paper shows near-lossless results under the spiked random matrix model, which maps cleanly to video generation’s sequential attention. What makes the compression near-lossless is the two-stage pipeline itself: optimal singular value shrinkage followed by quantization, not the usual 4-bit hand-waving you see in other papers. Then use Claude Code v2.1.129’s new --plugin-url flag to auto-fetch the required quantization plugins at runtime, and set CLAUDE_CODE_FORCE_SYNC_OUTPUT=1 to fix the Emacs terminal issue that’s eaten more hours than anyone admits. This gives you a deployable text-to-video agent that actually fits on hardware you have, instead of needing a dedicated GPU cluster for every 30-second clip.

eOptShrinkQ, Sulphur-2-base, Claude Code v2.1.129.
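
For a feel of what "shrink the singular values, then quantize" means in practice, here is a rough NumPy sketch. It is not the eOptShrinkQ algorithm: the shrinker is plain hard thresholding at an estimated noise edge (a stand-in for optimal shrinkage), the quantizer is a simple per-matrix uniform one, and every name and parameter (shrink_singular_values, quantize_uniform, margin, noise_sigma, the synthetic "spiked" matrix) is made up for illustration.

```python
import numpy as np


def shrink_singular_values(x, noise_sigma, margin=1.2):
    """Stage 1 (simplified): zero out singular values inside the noise bulk.

    Hard thresholding is an assumption of this sketch, standing in for an
    optimal shrinker. `margin` is a hypothetical safety factor.
    """
    n, d = x.shape
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    # For an n x d matrix of i.i.d. noise with std noise_sigma, the largest
    # singular value concentrates near noise_sigma * (sqrt(n) + sqrt(d)).
    edge = margin * noise_sigma * (np.sqrt(n) + np.sqrt(d))
    s_kept = np.where(s > edge, s, 0.0)
    return (u * s_kept) @ vt, int(np.count_nonzero(s_kept))


def quantize_uniform(x, bits=4):
    """Stage 2 (simplified): uniform quantization to 2**bits levels."""
    lo, hi = float(x.min()), float(x.max())
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.round((x - lo) / scale).astype(np.uint8)  # assumes bits <= 8
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Map integer codes back to approximate float values."""
    return codes.astype(np.float64) * scale + lo


def compress(x, noise_sigma, bits=4):
    """Run both stages and return the codes plus what is needed to decode."""
    denoised, rank = shrink_singular_values(x, noise_sigma)
    codes, scale, lo = quantize_uniform(denoised, bits)
    return codes, scale, lo, rank


# Synthetic "spiked" matrix: low-rank signal plus small i.i.d. noise.
rng = np.random.default_rng(0)
n, d, true_rank, sigma = 256, 64, 4, 0.1
signal = rng.normal(size=(n, true_rank)) @ rng.normal(size=(true_rank, d))
noisy = signal + sigma * rng.normal(size=(n, d))

codes, scale, lo, rank = compress(noisy, sigma, bits=4)
restored = dequantize(codes, scale, lo)
rel_err = np.linalg.norm(restored - signal) / np.linalg.norm(signal)
print(f"kept rank: {rank}, relative error vs. clean signal: {rel_err:.3f}")
```

The point of doing the shrinkage first is that the quantizer spends its few bits on the low-rank signal rather than on noise. How close this gets to lossless depends on the bit width and the shrinker, so treat the numbers from this toy as illustrative only.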