This is from my link:
The 70% failure rate you found matches my experience. The root cause is that skills are prose: the LLM reads them and guesses what to do. There is no build step, and nothing verifies that the tools mentioned actually exist, that parameter types line up, or that MCP servers are reachable.
I have been working on a different approach: treat SKILL.md as source code for an agent, not as a prompt for another agent. agenthatch compiles a SKILL.md through a 3-phase pipeline into a standalone AI agent. Schema validation runs at build time. If a tool signature does not match the runtime, the compilation fails. The agent never spawns with a broken spec.
The problem you described — skills that "claim to do things the bundled code cannot actually do" — is exactly what build-time validation catches.
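
To make the idea of a build-time check concrete, here is a minimal sketch. It is not agenthatch's actual code or API: `ToolSignature`, `SkillBuildError`, and `validate_skill` are hypothetical names, and it assumes each tool signature can be reduced to a mapping from parameter name to type name.

```python
from dataclasses import dataclass


@dataclass
class ToolSignature:
    """Hypothetical record: a tool name plus the type name of each parameter."""
    name: str
    params: dict[str, str]


class SkillBuildError(Exception):
    """Raised when a skill spec does not match what the runtime provides."""


def validate_skill(declared: list[ToolSignature],
                   runtime: dict[str, ToolSignature]) -> None:
    """Fail the build if any declared tool is missing or mistyped.

    `declared` is what the skill says it uses; `runtime` is what exists.
    All mismatches are collected so one build reports every problem.
    """
    errors = []
    for tool in declared:
        actual = runtime.get(tool.name)
        if actual is None:
            errors.append(f"{tool.name}: tool not found in runtime")
            continue
        for param, type_name in tool.params.items():
            if param not in actual.params:
                errors.append(f"{tool.name}: unknown parameter '{param}'")
            elif actual.params[param] != type_name:
                errors.append(
                    f"{tool.name}.{param}: declared {type_name}, "
                    f"runtime expects {actual.params[param]}"
                )
    if errors:
        raise SkillBuildError("; ".join(errors))


# Example runtime with a single tool (illustrative values only).
runtime = {"search": ToolSignature("search", {"query": "str", "limit": "int"})}

# A matching declaration passes silently.
validate_skill([ToolSignature("search", {"query": "str"})], runtime)

# A wrong parameter type and a missing tool both stop the build.
try:
    validate_skill(
        [ToolSignature("search", {"limit": "str"}), ToolSignature("fetch", {})],
        runtime,
    )
except SkillBuildError as exc:
    print(f"build failed: {exc}")
```

The point of the sketch is the failure mode: a mismatch raises before any agent exists, instead of surfacing as a wrong guess at run time.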

