Open Source Releases

  • slopguard-cli 0.2.0 — Stops LLM-hallucinated package names from becoming malicious installs. Slop is a real problem when your Copilot has a gambling problem with pip install. 🛠️
  • hermes-katana 3.0.0 — Taint tracking, secret guards, policy engines, red-team benchmarks — the works. If you’re deploying agents in production, this is the kind of paranoia you want. 🔥
  • hexdag 0.8.0.dev4 — OS abstractions for agents: pipelines, ports, drivers, stdlib. Lets you stop writing ad-hoc shell scripts to babysit your agent infrastructure. Finally. 🛠️
  • observeco 0.1.0 — Pulse monitoring, circuit breakers, token compression, dashboard. Because agents will crash at 3am and you’ll wish you had metrics. 📊
  • llm-redacted 0.2.1 — Zero-dependency sanitizer that strips secrets and PII from LLM outputs. Small, focused, does one job — which is more than most security tools manage. 🛡️
  • llm-join 0.2.4 — Fuzzy joins on DataFrames using LLM scoring and embeddings. For when your entity resolution can’t survive on exact string matches alone. 📄
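
For a feel of what "zero-dependency sanitizer" can mean in practice, here is a rough sketch using only Python's standard library. This is not llm-redacted's actual API: the `redact` function, the `PATTERNS` table, and the three regexes are assumptions made up for illustration, and a real tool would cover far more secret and PII formats.

```python
import re

# Hypothetical patterns for illustration only; real sanitizers cover many more.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "AWS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "BEARER": re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]+=*"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    print(redact("Contact bob@example.com, header: Bearer abc.def-123"))
```

Regex-only redaction will miss anything it has no pattern for, so treat a sketch like this as a floor, not a guarantee.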

AI Dev Tools

  • mukul975/Anthropic-Cybersecurity-Skills — 754 cybersecurity skills mapped to MITRE ATT&CK, NIST CSF, ATLAS, D3FEND, AI RMF. Plug it into Claude Code, Copilot, or Cursor — your AI can finally pretend to know what it’s defending against. 🤖

Today’s Synthesis

If you’re running LLM agents in any real environment, treat them like untrusted subprocesses, because that is what they are. Start with slopguard-cli to block hallucinated package names before they reach your CI pipeline; that’s the “don’t pip install what the model invents” layer. Layer observeco on top so you actually see when an agent chokes at 3am: token usage spikes, circuit breakers trip, and you get a dashboard instead of a Slack message you missed. Finally, run every output through llm-redacted before it touches a downstream system; strip secrets and PII at the edge, not after the fact. Together these three give you a skeleton: keep agents from installing something stupid, watch them while they run, and sanitize what they say. It’s not fancy, but it’s the boring operational hygiene that keeps you out of an incident report.
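
The first two layers can be sketched in plain Python to show the shape of the idea. Nothing here is the API of slopguard-cli or observeco: `ALLOWED_PACKAGES`, `vet_packages`, and `CircuitBreaker` are hypothetical names, and the allowlist approach is an assumption standing in for whatever checks those tools actually perform.

```python
import time

# Hypothetical allowlist; in practice this might come from a lockfile or registry check.
ALLOWED_PACKAGES = {"requests", "numpy", "pandas"}


def vet_packages(requested):
    """Split requested package names into (allowed, blocked) lists."""
    allowed, blocked = [], []
    for name in requested:
        normalized = name.strip().lower().replace("_", "-")
        if normalized in ALLOWED_PACKAGES:
            allowed.append(name)
        else:
            blocked.append(name)
    return allowed, blocked


class CircuitBreaker:
    """Suspend calls after max_failures consecutive errors, until reset_after seconds pass."""

    def __init__(self, max_failures=3, reset_after=60.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # monotonic timestamp when the circuit opened

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: agent calls suspended")
            # Cool-down elapsed: close the circuit and allow a trial call.
            self.opened_at = None
            self.failures = 0
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # any success resets the failure count
        return result
```

In a pipeline you would call `vet_packages` on whatever install list the agent proposes and fail the build if anything lands in `blocked`, then route agent invocations through `CircuitBreaker.call` so a misbehaving agent stops getting retried.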