Hi HN,
I’ve been experimenting with a different approach to computer-using AI agents.
Most current AI agents control computers using:
• cloud APIs with stored credentials
• browser automation
• screenshot + vision + mouse control
I tried something else.
Instead of embedding the AI inside the computer, I use the official mobile LLM apps (ChatGPT / Claude) as the intelligence source, and I built an external execution gateway that translates model intent into deterministic OS actions.
The model never gets system privileges, and the computer never exposes credentials to the model.
Architecture:
phone LLM app → data link → action gateway → predefined action skills → desktop OS
The gateway only executes whitelisted primitives:
• keyboard sequences
• window operations
• command calls
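To make the whitelist idea concrete, here is a minimal sketch of how a gateway could validate model intent before anything touches the OS. This is not the prototype's actual code: the JSON intent format, the names `ALLOWED_PRIMITIVES`, `APPROVED_COMMANDS`, `parse_intent`, and `RejectedIntent`, and the example command IDs are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass

# Hypothetical whitelist: primitive name -> exact set of argument names it accepts.
ALLOWED_PRIMITIVES = {
    "keyboard_sequence": {"keys"},
    "window_operation": {"window", "operation"},
    "command_call": {"command_id"},
}

# Hypothetical table of pre-approved commands. The model can only pick an ID;
# it never supplies a raw command string.
APPROVED_COMMANDS = {
    "open_editor": ["code", "."],
    "list_files": ["ls", "-la"],
}


class RejectedIntent(Exception):
    """Raised when model output does not map to a whitelisted action."""


@dataclass(frozen=True)
class Action:
    primitive: str
    args: dict


def parse_intent(raw: str) -> Action:
    """Turn raw model output into a validated Action, or reject it."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise RejectedIntent(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise RejectedIntent("intent must be a JSON object")

    primitive = data.get("primitive")
    args = data.get("args")
    if not isinstance(primitive, str) or primitive not in ALLOWED_PRIMITIVES:
        raise RejectedIntent(f"primitive not whitelisted: {primitive!r}")
    if not isinstance(args, dict) or set(args) != ALLOWED_PRIMITIVES[primitive]:
        raise RejectedIntent(f"unexpected arguments for {primitive}")
    if primitive == "command_call":
        command_id = args["command_id"]
        if not isinstance(command_id, str) or command_id not in APPROVED_COMMANDS:
            raise RejectedIntent(f"command not approved: {command_id!r}")
    return Action(primitive, args)
```

In this sketch, anything that fails validation is dropped before execution, so free-form text from the model (for example a raw shell string) never reaches the desktop.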
The key idea is separating cognition and execution.
The model outputs decisions, not motor control.
The gateway performs verified actions.
This turns computer control from a continuous UI manipulation problem into a discrete decision problem, which makes it more predictable and auditable.
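One way the auditability claim could be realized is to record every discrete decision in an append-only log. The sketch below uses a SHA-256 hash chain so that later tampering is detectable. The hash chain is my own illustrative choice, not something the prototype is stated to do, and `append_entry`, `verify_log`, and the entry fields are hypothetical names.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    # Canonical JSON (sorted keys) so the same body always hashes the same way.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: list, decision: dict, outcome: str) -> dict:
    """Append one decision and its outcome, linked to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    body = {"seq": len(log), "decision": decision, "outcome": outcome, "prev": prev_hash}
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_log(log: list) -> bool:
    """Return True only if every entry is intact and correctly chained."""
    prev_hash = GENESIS_HASH
    for index, entry in enumerate(log):
        body = {key: entry[key] for key in ("seq", "decision", "outcome", "prev")}
        if entry["seq"] != index or entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each decision is a small structured record rather than a stream of mouse movements, a reviewer can replay or inspect the log entry by entry.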
This is an early prototype. I’d really appreciate feedback, especially from people working on agent safety or permission models.
Show HN: Using a mobile LLM app to safely operate a desktop computer