What this is: a tour of the new tools that can see your screen and act on your computer, by voice or on their own, from the open-source Clicky project to ready-to-use apps.
Why now: faster voice models, background screen-control technology, and better on-device models have made this suddenly practical.
Quick picks: OpenClicky to tinker, Dottie for privacy and local models, VoiceOS for voice productivity, ChatGPT Desktop or Clicky for a polished feel.
For years, AI lived in a chat box. Now a new category is escaping the box: AI that can see your screen and act on your computer, either by voice or on its own. Some of these tools sit next to your cursor like a helpful buddy; others work quietly in the background, clicking and typing for you. This is moving fast, so here is a plain-English map of what exists today, what each tool does, and which one fits you.
What does it mean for AI to control your computer?
It means the AI can do two things a chatbot cannot. First, it can see what is on your screen, usually by taking a screenshot or reading the contents of your open apps. Second, it can act: move the cursor, click buttons, type text, or trigger app actions, the same way a person would. When you combine those with voice, you get something that feels less like a tool and more like a coworker you can talk to. The underlying capability is often called “computer use”, and it is what powers an AI agent that can finish a task rather than just describe it.
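To make the see-then-act idea concrete, here is a minimal sketch of the loop such an agent runs. Everything in it is an illustrative assumption rather than any real product's code: the `Action` type, the `run_agent` function, and the three toy stand-ins (`fake_see`, `fake_decide`, `fake_act`) that replace a real screenshot, a real model, and real mouse and keyboard control.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str          # "click", "type", or "done" (hypothetical action names)
    target: str = ""   # what to click, or the text to type


def run_agent(goal: str,
              see: Callable[[], str],
              decide: Callable[[str, str], Action],
              act: Callable[[Action], None],
              max_steps: int = 10) -> List[Action]:
    """Repeat see -> decide -> act until the model says it is done."""
    history: List[Action] = []
    for _ in range(max_steps):         # hard cap so a confused agent cannot loop forever
        screen = see()                 # 1. look at the screen
        action = decide(goal, screen)  # 2. ask the model for the next step
        if action.kind == "done":
            break
        act(action)                    # 3. carry the step out
        history.append(action)
    return history


# Toy stand-ins so the loop runs without touching a real screen.
screen_state = {"text": "Inbox | [Compose]"}


def fake_see() -> str:
    return screen_state["text"]


def fake_decide(goal: str, screen: str) -> Action:
    if "[Compose]" in screen:
        return Action("click", "Compose")
    if "New message" in screen and "hello" not in screen:
        return Action("type", "hello")
    return Action("done")


def fake_act(action: Action) -> None:
    if action.kind == "click":
        screen_state["text"] = "New message: "
    elif action.kind == "type":
        screen_state["text"] += action.target


steps = run_agent("write hello in a new email", fake_see, fake_decide, fake_act)
print([(s.kind, s.target) for s in steps])
# prints [('click', 'Compose'), ('type', 'hello')]
```

In a real tool, `see` would capture a screenshot or read app contents, `decide` would call a model, and `act` would drive the cursor and keyboard. The loop itself stays about this simple.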
Why is this exploding right now?
Three changes landed at once:
- Fast voice models. New realtime models handle low-latency voice and can call tools while they talk, so speaking to your computer finally feels natural. Our comparison of AI voice modes shows how far this has come.
- Background screen control. Open infrastructure like Cua lets an agent drive your apps in the background without hijacking your cursor or stealing focus.
- Better local models. Apple Silicon Macs can now run capable AI on-device (tools like Ollama make this approachable), so some of these tools work entirely offline, with your screen data never leaving your machine.
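For a feel of what "on-device" means in practice, here is a small sketch of asking a model served by Ollama a question from Python. It assumes Ollama is installed and running at its default local address, and that you have already pulled a model; the model name `llama3.2` and the helper names `build_request` and `ask_local_model` are only examples. The request goes to `localhost`, so the prompt is handled on your own machine.

```python
import json
import urllib.request

# Ollama's default local address; nothing here points at a cloud service.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3.2") -> urllib.request.Request:
    """Package a prompt for the local Ollama server (model name is an example)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt: str, model: str = "llama3.2") -> str:
    """Send the prompt and return the model's reply text.

    Only works while Ollama is running locally with the model pulled.
    """
    with urllib.request.urlopen(build_request(prompt, model), timeout=60) as reply:
        return json.loads(reply.read())["response"]
```

With Ollama running, `ask_local_model("Summarize this window title: Q3 budget.xlsx")` would return the model's answer as a string.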
Both Anthropic and OpenAI have shipped “computer use” capabilities, so expect this category to grow quickly, in both polished products and open-source experiments.
What is Clicky, and why does everyone mention it?
Clicky, built by the developer Farza, is the project that made this idea click for a lot of people. It is an AI buddy that lives next to your cursor on a Mac: you hold a key, talk, and a little glowing triangle flies across the screen to point at whatever you asked about while the assistant talks you through it. The original is open-source (it pairs Claude for the thinking with AssemblyAI for hearing and ElevenLabs for speaking), and the maintained, polished version now lives at heyclicky.com. Because it is so approachable, Clicky became the reference point everyone compares new tools to.
What are the open-source options?
If you want to tinker, self-host, or just see how this works under the hood, the open-source side is lively:
| Project | What it is | Cost | Status |
|---|---|---|---|
| Original Clicky (farzaa/clicky) | Farza’s open-source cursor-buddy prototype (Claude + AssemblyAI + ElevenLabs) | Free, open-source | The foundation; uses an older stack. Great for learning. The maintained version now lives at heyclicky.com |
| OpenClicky (jasonkneen/openclicky) | An enhanced fork: a native menu-bar app with background computer-use, local agents, image galleries, and MCP tool support | Free, open-source | The most active and polished open-source take. Works with OpenAI keys |
| Other community forks | Smaller forks experimenting with teaching and tutor modes or extra tooling | Free, open-source | Smaller and experimental |
| Cross-platform rebuilds | Community projects re-creating the idea for Windows and Linux | Free, open-source | Early and scattered |
| MacEcho (realtime-ai/mac-echo) | A fully local, on-device voice assistant for Apple Silicon | Free, open-source | A different focus: privacy-first and fully offline, less of a “cursor buddy” |
If you want one place to start experimenting, OpenClicky is the most developed open-source version of the idea.
What are the ready-to-use tools?
If you would rather install something that just works, these are the closest options as of mid-2026. We have kept this neutral; the right pick depends on what you value.
| Tool | Platform | What it does well | Cost | Best for |
|---|---|---|---|---|
| Dottie | Mac | Local and offline models (MLX, Ollama), deep Mac control, a “Hey Dottie” wake word, agent mode, and granular permissions | Free and open-source (cloud options too) | Privacy-conscious users who want local models |
| VoiceOS | Mac + Windows | Voice dictation plus an Agent Mode that acts across apps (Gmail, Slack, Calendar), with a screen-aware “Ask” mode | Paid | Voice-first productivity |
| ChatGPT Desktop | Mac + Windows | Native voice and screen understanding, backed by strong models | Free, with a paid tier | People already using ChatGPT |
| Raycast AI | Mac | Very fast keyboard-driven AI commands and a big library of extensions | Free, with a paid tier | Keyboard power users |
| Alter | Mac | Reads context from your open apps; offers automation, meeting notes, voice, and 50+ models with local options | Free, with a paid tier | Automation-heavy users |
Which one should you use?
A quick cheat sheet:
| What you want | Best option |
|---|---|
| Open-source and hackable | OpenClicky (or the original Clicky) |
| Privacy and local models | Dottie |
| Voice productivity across your apps | VoiceOS |
| A polished, just-works experience | Official Clicky (heyclicky.com) or ChatGPT Desktop |
| Keyboard-first speed | Raycast AI |
| Fully offline | MacEcho |
Is it safe to let AI control your computer?
This is the part to take seriously. A tool that can click, type, and send on your behalf can also make mistakes on your behalf, so treat it like a capable but new assistant. It is the same caution behind zero trust for AI agents:
- Grant the least access it needs. The better tools offer granular permissions that are off by default (Dottie, for example, ships with permissions you switch on one at a time). Turn on only what a task requires.
- Supervise it at first. Watch what it does on low-stakes tasks before you trust it with email, payments, or anything you cannot undo.
- Prefer local or offline options for sensitive screens. Tools that run on-device, the idea behind local-first software, keep your screen contents on your machine instead of sending them to the cloud.
- Keep a human on the send button. Let it draft and prepare, but make the final, irreversible action yours until you are confident.
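The first and last rules above can be sketched in a few lines. This is an illustration of the idea, not how Dottie or any other tool implements it: the `PermissionGate` class, the action names, and the `IRREVERSIBLE` set are all made up for the example. Permissions start empty, and anything that cannot be undone still needs a human "yes" even after it has been granted.

```python
from dataclasses import dataclass, field
from typing import Callable, Set

# Hypothetical examples of actions that cannot be undone.
IRREVERSIBLE = {"send_email", "make_payment", "delete_file"}


@dataclass
class PermissionGate:
    granted: Set[str] = field(default_factory=set)  # everything is off by default

    def grant(self, action: str) -> None:
        """Switch on one permission at a time."""
        self.granted.add(action)

    def check(self, action: str, confirm: Callable[[str], bool]) -> bool:
        """Allow an action only if it was granted and, when irreversible, confirmed."""
        if action not in self.granted:
            return False            # least access: never granted, never allowed
        if action in IRREVERSIBLE:
            return confirm(action)  # human stays on the send button
        return True


gate = PermissionGate()
gate.grant("draft_email")
gate.grant("send_email")


def human_says_no(action: str) -> bool:
    return False


def human_says_yes(action: str) -> bool:
    return True


print(gate.check("draft_email", human_says_no))    # True: granted and reversible
print(gate.check("send_email", human_says_no))     # False: the human declined
print(gate.check("send_email", human_says_yes))    # True: granted and confirmed
print(gate.check("make_payment", human_says_yes))  # False: never granted
```

In a real app, `confirm` would be a dialog asking you to approve the action before it runs.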
None of this is a reason to avoid these tools. It is the same care you would take handing a new hire the keys, applied to software. The point of AI here is to take the busywork, while you keep the judgment.
How do you get started?
Pick by what you want to learn. To understand the idea hands-on, install OpenClicky or the original Clicky and ask it to point at something on your screen. To keep everything private, try Dottie with a local model. To speed up daily work, try a voice-productivity tool like VoiceOS (or a dictation app such as Wispr Flow) and dictate a few messages. If you want the wider toolkit, our AI Tools Directory and best AI assistants roundup are good next stops, and our AI automation hub covers letting AI run tasks for you.
Common questions
What is “computer use” in AI?
It is the ability of an AI to see your screen and act on it, by moving the cursor, clicking, typing, or triggering app actions, rather than only chatting. Both Anthropic and OpenAI have shipped versions of it.
Is Clicky free?
The original Clicky is free and open-source on GitHub. The maintained, polished version lives at heyclicky.com. OpenClicky, the most active fork, is also free and open-source.
Which tool is best for privacy?
Dottie and fully local projects like MacEcho are the strongest picks, because they can run AI models on your own device so your screen data does not leave your machine.
Do these work on Windows?
Some do. VoiceOS and ChatGPT Desktop are cross-platform, and community projects are rebuilding the open-source ideas for Windows and Linux, though those are still early. Many of the cursor-buddy tools are Mac-first.
Can the AI do something bad on my computer?
It can act on your behalf, so it can also make mistakes. Grant minimal permissions, supervise it on low-stakes tasks first, and keep the final irreversible action in your own hands.
Sources
- Clicky (farzaa/clicky) on GitHub and heyclicky.com
- OpenClicky (jasonkneen/openclicky) on GitHub
- Cua: open-source computer-use infrastructure
- MacEcho (realtime-ai/mac-echo) on GitHub
- Dottie, VoiceOS, and Alter
- OpenAI: Computer-Using Agent
Last reviewed: June 2026. This is a fast-moving category; features and availability change often, so check each project before you rely on it.