Show HN: SVAHNAR – Serverless infrastructure to run AI agents in isolated VMs

(svahnar.com)

2 points | by Chethan_Polanki 5 hours ago

3 comments

Sasisundar09 5 hours ago
Reliability and observability seem harder than execution itself once agents are in production. How are you detecting silent tool failures today?
- Chethan_Polanki 4 hours ago
  Reliability and observability are hard. Silent failures usually lead to infinite loops. We currently handle this in two ways:
  1. We cap each tool at 3 consecutive calls and pass the error, with additional context, back to the LLM, so the user can see why a tool or an MCP connection is failing.
  2. In the Chat Interface, you can see what the agent did: all tool calls and their results are displayed to the user.
  If the agent goes off track (hallucinates), the user can stop it and then update the tool, add context to the tool (custom instructions on the tool itself), or update the agent's system prompt ("Agent Function" in SVAHNAR).
  Please let me know if you'd like me to elaborate more on a specific part.
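
For readers who want something concrete, the two mechanisms in the reply above (a cap of 3 consecutive calls per tool, and errors passed back to the LLM with context while every call is recorded for display) could be sketched roughly as follows. This is not SVAHNAR's code: `ToolCallGuard`, `ToolResult`, `flaky_search`, and the message wording are all hypothetical names made up for illustration, and the sketch assumes "in a row" means the same tool requested back to back.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional

MAX_CONSECUTIVE_CALLS = 3  # the cap mentioned in the reply; everything else is assumed


@dataclass
class ToolResult:
    ok: bool
    content: str  # text handed back to the LLM and shown in the transcript


@dataclass
class ToolCallGuard:
    """Hypothetical wrapper: limits back-to-back calls and surfaces errors."""
    max_consecutive: int = MAX_CONSECUTIVE_CALLS
    last_tool: Optional[str] = None
    streak: int = 0
    transcript: list = field(default_factory=list)  # every call and its result

    def call(self, name: str, fn: Callable[..., Any], **kwargs: Any) -> ToolResult:
        # Count how many times in a row the same tool has been requested.
        self.streak = self.streak + 1 if name == self.last_tool else 1
        self.last_tool = name

        if self.streak > self.max_consecutive:
            # Do not run the tool again; tell the LLM why instead.
            result = ToolResult(
                False,
                f"Tool '{name}' was blocked: already called "
                f"{self.max_consecutive} times in a row.",
            )
        else:
            try:
                result = ToolResult(True, str(fn(**kwargs)))
            except Exception as exc:
                # Turn a silent failure into an explicit message with context.
                result = ToolResult(
                    False,
                    f"Tool '{name}' failed with {type(exc).__name__}: {exc} "
                    f"(arguments: {kwargs})",
                )

        self.transcript.append((name, kwargs, result))
        return result


def flaky_search(query: str) -> str:
    # Stand-in for a tool whose backing connection is down.
    raise ConnectionError("MCP server unreachable")
```

Calling `ToolCallGuard().call("search", flaky_search, query="x")` four times in a row would return three "failed with ConnectionError" results and then a "blocked" result, without running the tool a fourth time. The `transcript` list is the kind of record a chat UI could render so the user sees each call and its outcome.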
Chethan_Polanki 3 hours ago
[dead]