Learnings:
- AI agents still work best for simple, well-constrained tasks.
- To create a successful agent, you need to provide it with good tools. The LLM can then figure out the correct sequence of tool calls itself, which feels like a promising direction.
- Tool use is still quite slow and often very expensive. I spent around $50 on a single day of experimenting with Claude. Imagine what testing would cost for a production-scale system. Making the unit economics work is difficult, but this will improve as LLM costs continue to drop.
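
To make the "good tools, model picks the sequence" idea concrete, here is a minimal sketch of an agent loop. Everything in it is an assumption for illustration: `fake_model` is a hard-coded stand-in for a real LLM call, and `get_weather` and `to_upper` are hypothetical tools. The `max_steps` cap is there because every extra step in a real system is another paid, slow model call.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tools; a real agent would wrap APIs, databases, or shell commands.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

def to_upper(text: str) -> str:
    return text.upper()

# Registry the loop uses to dispatch a tool name to its implementation.
TOOLS: Dict[str, Callable[[str], str]] = {
    "get_weather": get_weather,
    "to_upper": to_upper,
}

def fake_model(history: List[str]) -> Tuple[str, str]:
    """Stand-in for an LLM call (assumption): returns (action, argument).

    A real model would read the history and choose the next tool itself;
    here the choice is hard-coded so the example runs without an API key.
    """
    if len(history) == 1:
        return ("get_weather", "Berlin")
    if len(history) == 2:
        return ("to_upper", history[-1])
    return ("final", history[-1])

def run_agent(task: str, max_steps: int = 5) -> str:
    """Ask the model for the next action until it answers or the cap is hit."""
    history = [task]
    for _ in range(max_steps):
        action, arg = fake_model(history)
        if action == "final":
            return arg
        # Run the chosen tool and feed its result back into the history.
        history.append(TOOLS[action](arg))
    raise RuntimeError("step limit reached without a final answer")

if __name__ == "__main__":
    print(run_agent("What is the weather in Berlin, in capitals?"))
```

In this sketch the loop itself knows nothing about the order of tool calls; it only dispatches whatever the model asks for, which is why the quality of the tools matters so much.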