ildar 20 hours ago [-]
The NIST focus on "agent registration/tracking" is the right instinct but the wrong abstraction. Registration is a compliance checkbox — it tells you an agent exists, not what it's doing.
What we actually need is runtime behavioral monitoring: what files is the agent accessing? What network calls is it making? What credentials can it reach? That's where the real threat surface lives.
We've been building exactly this with ClawMoat (open source, MIT) — host-level security that monitors agent behavior in real-time. Permission tiers, forbidden zones, credential isolation, network egress monitoring. Think AppArmor for AI agents.
The gap in NIST's framing: they're treating agents like software to be certified, but agents are more like employees to be supervised. You don't just background-check an employee once — you give them appropriate access levels and monitor for anomalies.
Anyone planning to submit comments to NIST, the deadline is March 9. Would love to see the community push for runtime monitoring requirements, not just pre-deployment certification.
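To make the runtime-monitoring idea above concrete, here is a minimal sketch of a policy check that a host-level monitor might run before an agent touches a file or opens a connection. This is not ClawMoat's API. `RuntimePolicy`, the forbidden paths, and the allowed host are all hypothetical names chosen for illustration.

```python
import posixpath
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class RuntimePolicy:
    """Hypothetical policy: forbidden path prefixes plus an egress allowlist."""
    forbidden_zones: tuple = ("/home/user/.ssh", "/etc")
    allowed_hosts: frozenset = frozenset({"api.example.com"})
    audit_log: list = field(default_factory=list)

    def check_file(self, path: str) -> bool:
        # Collapse ".." segments lexically. A real monitor would also have
        # to resolve symlinks, which this sketch does not attempt.
        p = PurePosixPath(posixpath.normpath(path))
        allowed = not any(p.is_relative_to(zone) for zone in self.forbidden_zones)
        self.audit_log.append(("file", path, allowed))
        return allowed

    def check_egress(self, host: str) -> bool:
        # Deny by default: only hosts on the allowlist pass.
        allowed = host in self.allowed_hosts
        self.audit_log.append(("net", host, allowed))
        return allowed


policy = RuntimePolicy()
print(policy.check_file("/tmp/work/notes.txt"))
print(policy.check_file("/tmp/../etc/passwd"))
print(policy.check_egress("evil.example.net"))
```

Every decision is appended to `audit_log`, whether allowed or denied. That record of what the agent actually did is what registration alone does not provide.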
totetsu 4 days ago [-]
With this renaming of AISI to CAISI[1], and the resignation of its founding director[2] Elizabeth Kelly, it seems that the position has shifted to: don't let any concerns about social harms stop tech companies from doing whatever they want, and also let's make a show of how bad China is. I think any public comment outside of the narrow definition of AI risk as risk to national security might fall on deaf ears.
Point 5 is the one nobody's actually doing yet. It's pretty apparent that everyone agrees we need to measure blast radius but where's the tooling?
I've been running AI models against real vulnerable targets, giving them a Kali box and an objective, letting them go autonomous. Every model I tested popped almost every OWASP top 10 challenge we had. The interesting part is the cost of getting there. One model solved a JWT forgery in 16 seconds and 5K tokens. Another took 170 seconds and 210K tokens. Same result, completely different blast pattern.
If we're serious about measuring agent risk, we need to stop theorizing about what they can do and start actually benchmarking it.
Note: On the other hand, we had a lab that a junior pentester would have caught in 10 minutes, and the best models couldn't figure it out.
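As a rough illustration of what benchmarking cost per solve could look like, the sketch below compares the two JWT forgery runs quoted above (16 seconds and 5K tokens versus 170 seconds and 210K tokens). The `Run` record and the model labels are placeholders, since the comment does not name the models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    model: str      # placeholder label, not a real model name
    solved: bool
    seconds: float
    tokens: int


def compare(a: Run, b: Run) -> dict:
    """Return how many times more time and tokens run b used than run a."""
    return {
        "time_ratio": b.seconds / a.seconds,
        "token_ratio": b.tokens / a.tokens,
    }


# Figures quoted in the comment for the JWT forgery challenge.
fast = Run("model-A", True, 16, 5_000)
slow = Run("model-B", True, 170, 210_000)

ratios = compare(fast, slow)
print(f"{ratios['time_ratio']:.1f}x time, {ratios['token_ratio']:.0f}x tokens")
# prints "10.6x time, 42x tokens"
```

Both runs count as a solve, so a pass/fail benchmark would score them the same. The ratios show how far apart they are in cost.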
ascarola 4 days ago [-]
NIST is requesting public input on security practices for AI agent systems -
autonomous AI that can take actions affecting real-world systems (trading bots,
automated operations, multi-agent coordination).
Key focus areas:
- Novel threats: prompt injection, behavioral hijacking, cascade failures
- How existing security frameworks (STRIDE, attack trees) need to adapt
- Technical controls and assessment methodologies
- Agent registration/tracking (analogous to drone registration)
This is specifically about agentic AI security, not general ML security - one of
the first formal government RFIs on autonomous agents.
Comments from practitioners deploying these systems would be valuable.
NIST asking for agent security comments right as the agent stack is splitting into layers with completely different threat models.
A model layer vulnerability looks nothing like a tool-use layer vulnerability, but most frameworks still treat "the AI system" as one blob. And there is probably nobody who owns the audit trail when an agent chain spans six vendors.
Best security is a proper liability process for damages caused by publicly accessible LLMs followed by users.
jksmith 4 days ago [-]
1. Attack surface for agents is tantamount to a virus.
2. Any way for an agent to touch something is a potential compromise vector.
3. The mitigation is controlling the blast radius.
4. Sandboxing capability will have to be baked into architecture.
5. Mitigation includes measuring cost of blast radius.
6. All agent orchestration will likely require an andon cord.
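Point 6 can be sketched as a shared stop signal that the orchestrator checks before every step. This is a minimal sketch under assumed names (`AndonCord`, `run_steps`), not a reference to any particular orchestration framework.

```python
import threading


class AndonCord:
    """Hypothetical stop signal that any component or human can pull."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._stopped = threading.Event()
        self.reason = None

    def pull(self, reason: str) -> None:
        # The first pull wins; later pulls do not overwrite the reason.
        with self._lock:
            if not self._stopped.is_set():
                self.reason = reason
                self._stopped.set()

    @property
    def pulled(self) -> bool:
        return self._stopped.is_set()


def run_steps(steps, cord: AndonCord) -> list:
    """Run (name, callable) pairs in order, checking the cord before each."""
    completed = []
    for name, step in steps:
        if cord.pulled:
            break
        step()
        completed.append(name)
    return completed


cord = AndonCord()
steps = [
    ("plan", lambda: None),
    ("act", lambda: cord.pull("anomaly detected")),
    ("report", lambda: None),
]
print(run_steps(steps, cord), cord.reason)
```

The check happens between steps, so a step that is already running finishes. Interrupting work in flight would need cooperation from the step itself.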
beej71 4 days ago [-]
War Operations Plan Response.
snowhale 4 days ago [-]
[dead]
niyikiza 4 days ago [-]
Good distinction, but I wonder if it's worth going further: context integrity may be fundamentally unsolvable. Agents consume untrusted input by design. Trying to guarantee the model won't be tricked seems like the wrong layer to bet on.
What seems more promising is accepting that the model will be tricked and constraining what it can do when that happens. Authorization at the tool boundary, scoped to the task and delegation chain rather than the agent's identity. If a child agent gets compromised, it still can't exceed the authority that was delegated to it. Contain the blast radius instead of trying to prevent the confusion.
(Disclaimer: working on this problem at tenuo.ai)
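The delegation idea above can be sketched as grants that only ever shrink when passed down. This is an illustrative sketch, not how tenuo.ai or any other product implements it. `Grant` and the tool and resource names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """Hypothetical grant: the (tool, resource) pairs a holder may use."""
    holder: str
    scope: frozenset

    def delegate(self, child: str, requested: set) -> "Grant":
        # The child gets only what it asked for AND what the parent holds,
        # so authority can narrow down the chain but never widen.
        return Grant(child, self.scope & frozenset(requested))

    def authorize(self, tool: str, resource: str) -> bool:
        # Checked at the tool boundary, independent of what the model "wants".
        return (tool, resource) in self.scope


root = Grant("orchestrator", frozenset({("read", "repo"), ("write", "repo")}))
child = root.delegate("summarizer", {("read", "repo"), ("delete", "repo")})

print(child.authorize("read", "repo"))
print(child.authorize("delete", "repo"))
```

Even if the child agent is fully tricked by untrusted input, a `delete` call fails at the boundary because that authority was never in the parent's scope to hand down.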
wangzhongwang 3 days ago [-]
[dead]
umairnadeem123 4 days ago [-]
[dead]
tucnak 4 days ago [-]
What you're talking about exists, and it's called Relationship-based Access Control, or ReBAC. There are a few implementations, the Zanzibar paper, etc. The issue is not the capability system, it's governance. The operator needs to write policies, of course! They don't want to read policies, write policies, or audit other people's policies.
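For readers unfamiliar with ReBAC, here is a heavily simplified sketch of a Zanzibar-style check over relationship tuples. It is not the API of SpiceDB or any real implementation, and the object and subject names are made up for illustration.

```python
# Relationship tuples: (object, relation, subject). A subject is either a
# plain id or a userset written "object#relation".
TUPLES = {
    ("doc:plan", "viewer", "group:eng#member"),
    ("group:eng", "member", "agent:builder"),
    ("doc:plan", "editor", "user:alice"),
}


def check(obj: str, relation: str, subject: str, tuples=TUPLES, _seen=None) -> bool:
    """Return True if subject has relation on obj, following usersets."""
    seen = _seen if _seen is not None else set()
    if (obj, relation) in seen:        # guard against cyclic usersets
        return False
    seen.add((obj, relation))
    for t_obj, t_rel, t_subj in tuples:
        if (t_obj, t_rel) != (obj, relation):
            continue
        if t_subj == subject:
            return True
        if "#" in t_subj:
            inner_obj, inner_rel = t_subj.split("#", 1)
            if check(inner_obj, inner_rel, subject, tuples, seen):
                return True
    return False


print(check("doc:plan", "viewer", "agent:builder"))
print(check("doc:plan", "editor", "agent:builder"))
```

The check itself is small. The governance burden described above sits in who writes and audits the tuples, which this sketch simply hard-codes.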
mrkmarron 4 days ago [-]
What is your take on the usability of these systems? In practice they seem to be rather un-ergonomic, and usage devolves into "require everything".
As agentic systems seem to mainly interoperate with REST-style systems, I suspect that using URIs for resource use descriptions would be more natural.
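One way to picture URI-based resource descriptions is a grant list of HTTP method plus URI pattern, checked before each call. This is only a sketch of the suggestion. The grant format and the example.com URLs are assumptions, not an existing standard.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical grants: HTTP method plus a URI pattern ("*" is a wildcard).
GRANTS = [
    ("GET", "https://api.example.com/projects/42/*"),
    ("POST", "https://api.example.com/projects/42/comments"),
]


def allowed(method: str, uri: str, grants=GRANTS) -> bool:
    """Return True if some grant covers this method and URI."""
    parts = urlsplit(uri)
    # Match on scheme, host and path only; query and fragment are dropped.
    # Note: ".." segments are not normalized here; a real check should
    # canonicalize the path before matching.
    target = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return any(
        method.upper() == g_method and fnmatchcase(target, g_pattern)
        for g_method, g_pattern in grants
    )


print(allowed("GET", "https://api.example.com/projects/42/issues?state=open"))
print(allowed("DELETE", "https://api.example.com/projects/42/issues"))
```

Grants written this way use the same vocabulary as the requests the agent already makes, which may be easier to review than an abstract permission schema.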
tucnak 3 days ago [-]
You're right on ergonomics.
CodeAct is one way to abstract away some things, and bring others to the forefront. Especially when it comes to anything requiring a sidecar for mTLS, or something agents must be aware of, like error handling for whenever some call fails deep inside the stack. Troubleshooting access issues is key, during tool development and when using said tool in production, too. For many, many things, CodeAct is simply superior to naive calling conventions that you see around MCP clients, think OpenAPI.
jzelinskie 4 days ago [-]
Sorry to piggyback, but if this is of interest to you, feel free to reach out to me over email (contact info in my profile). I'm one of the founders of the most popular ReBAC solution, SpiceDB, which secures quite a few AI products including big players like OpenAI. I'm always interested in hearing about more use cases or where folks are struggling the most.
tucnak 3 days ago [-]
Hi Jimmy, happy to talk about my experience. I reached out to you over email.
[1] https://www.commerce.gov/news/press-releases/2025/06/stateme...
[2] https://www.reuters.com/technology/us-ai-safety-institute-di...
Deadline: March 9, 2026, 11:59 PM ET
Submit: https://www.regulations.gov/commenton/NIST-2025-0035-0001
Priority questions (if limited time): 1(a), 1(d), 2(a), 2(e), 3(a), 3(b), 4(a), 4(b), 4(d)
Full 43-question RFI at link above.
Wrote about this layering in an article last month: https://philippdubach.com/posts/dont-go-monolithic-the-agent...
A more recent release:
Announcing the "AI Agent Standards Initiative" for Interoperable and Secure Innovation
https://www.nist.gov/news-events/news/2026/02/announcing-ai-...