What it is: the image-and-vision corner of AI automation, where AI generates pictures and reads what is inside the ones you already have.
Who it is for: anyone who makes images in bulk or has a pile of photos, receipts, or scans they need read, sorted, or described.
Where to start: pick the build below that matches your image chore, and follow it end to end. Make is the friendliest tool to build in.
Good to know: this is our multi-model set. Claude does not make or read images on Make, so these builds use ChatGPT, Google Vision, and Make’s own AI, with Claude writing the brief in one of them.
Images are two jobs in one: making them, and reading them. Generating art or headers in bulk is slow when you do it one at a time, and the pictures you already own (receipts, scans, product photos) are full of information locked inside pixels. This page is the image-and-vision set of our AI automation hub, a group of build guides that hand both jobs to a workflow.
A note up front, because it matters here: Claude is our usual flagship, but it does not generate or read images through Make. So this is the set where the other models earn their place: ChatGPT for generating and seeing, Google Cloud Vision for reading text, and Make’s own AI for receipts. In the last build, Claude still does what it is best at, writing the brief, and hands the picture-making to ChatGPT. Best tool for each step.
What is image and vision automation, in plain English?
It is a short chain that either makes an image or reads one. Something starts it (a prompt in a sheet, a photo dropped in a folder, a published post), an automation tool carries it through a step or two, and you get back a saved image or the information pulled out of one. The AI model does the creative or seeing step. The tool does everything around it.
Across these builds the AI step does one of a few jobs:
- Generates an image from a written prompt.
- Reads the text out of a photo or scan (OCR).
- Extracts fields from a receipt or document image.
- Describes an image in words, for alt text or tagging.
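For readers who like to see the shape of the chain, here is a minimal Python sketch of the four jobs listed above. It is only an illustration: Make builds this on a visual canvas with no code, and the names `Item`, `run_chain`, and the four handler functions are hypothetical stand-ins that return placeholder strings instead of calling a real model.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Item:
    job: str      # one of: "generate", "ocr", "extract", "describe"
    payload: str  # a written prompt, or a path to an image file


# Hypothetical stand-ins for the AI step. A real build would call a model here.
def generate_image(prompt: str) -> str:
    return f"image for: {prompt}"


def read_text(path: str) -> str:
    return f"text found in {path}"


def extract_fields(path: str) -> str:
    return f"fields from {path}"


def describe_image(path: str) -> str:
    return f"description of {path}"


# One entry per job in the list above.
AI_STEPS: Dict[str, Callable[[str], str]] = {
    "generate": generate_image,
    "ocr": read_text,
    "extract": extract_fields,
    "describe": describe_image,
}


def run_chain(item: Item) -> str:
    """Trigger -> AI step -> result handed back to the tool for saving."""
    step = AI_STEPS.get(item.job)
    if step is None:
        raise ValueError(f"unknown job: {item.job}")
    return step(item.payload)


if __name__ == "__main__":
    print(run_chain(Item("describe", "photos/shoe.jpg")))
```

The point of the lookup table is that the chain around the AI step stays the same; only the step in the middle changes from build to build.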
What can you automate first?
Each guide takes one real image chore from an empty canvas to a working automation, with a screenshot of the finished build and a free importable template. Pick the one that matches what you are buried under:
| Build | What it does | The model |
|---|---|---|
| Generate images from a sheet | A list of prompts becomes a folder of saved images | ChatGPT |
| Scan receipts into a sheet | A receipt photo becomes an expense row, no typing | Make AI |
| Write image alt text | AI describes a folder of images for accessibility | ChatGPT Vision |
| OCR images to text | Pull the words out of photos and scans into a sheet | Google Vision |
| Blog header images | Claude briefs it, ChatGPT draws it, for every post | Claude + ChatGPT |
Every guide comes with a free importable template. Subscribe to the daily newsletter and grab them all on the thank-you page, next to our Special Reports. Import one, connect your own accounts, and you are running in minutes.
Why use Make instead of the AI app by hand?
Because the apps are built for one image at a time. ChatGPT makes a single picture when you ask; you download and file it yourself. That is fine for one image, painful for a hundred, and no help at all with a folder of receipts you never open.
Make turns these into pipelines. It watches the sheet or folder, sends each item to the right model, and files every result, no clicking. The model does the creative or seeing step; Make does the watching, the repeating, and the filing. That division is what turns a one-off chat task into a workflow that runs itself.
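The watching, repeating, and filing can be pictured as a small loop. The sketch below is an assumption-laden illustration, not how Make works internally: `call_model` is a hypothetical placeholder for the AI step, and the `done` set stands in for whatever Make uses to remember which items it has already handled. The model names come from the table above.

```python
from typing import Dict, List, Set

# Which model handles which build, taken from the table above.
MODEL_FOR_BUILD: Dict[str, str] = {
    "generate": "ChatGPT",
    "receipt": "Make AI",
    "alt_text": "ChatGPT Vision",
    "ocr": "Google Vision",
}


def call_model(model: str, item: str) -> str:
    # Hypothetical stand-in: a real build would send the item to the model.
    return f"[{model}] result for {item}"


def run_pipeline(build: str, items: List[str], done: Set[str], filed: List[str]) -> int:
    """Process every item not seen before, file each result, return the count of new items."""
    model = MODEL_FOR_BUILD[build]
    new_items = 0
    for item in items:
        if item in done:  # the "watching": skip anything already handled
            continue
        filed.append(call_model(model, item))  # the "filing"
        done.add(item)
        new_items += 1
    return new_items


if __name__ == "__main__":
    done: Set[str] = set()
    filed: List[str] = []
    run_pipeline("ocr", ["a.png", "b.png"], done, filed)
    run_pipeline("ocr", ["a.png", "b.png", "c.png"], done, filed)  # only c.png is new
    print(filed)
```

Running the loop a second time over the same folder only picks up the new file, which is the behaviour that lets a workflow run unattended without duplicating results.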
Is it safe to send images to AI?
Mostly, with care. Generating images from your own prompts is low risk. Reading images means sending them to a model, so keep private or sensitive material (personal photos, ID documents, anything confidential) out of the watched folders, and check each service’s data terms. For accessibility and search work on public images, this is a clear win. For anything personal, think first about where the image goes.
How much does it cost to start?
More variable than the text builds, because images cost more. Make’s free plan covers 1,000 operations a month. Generated images run a few cents each, vision reads are cheap per image, Google Cloud Vision has a free monthly allowance, and Make’s receipt extractor runs on your plan. The text-only builds elsewhere in the cluster are cheaper; here, watch your image volume and you stay in the free tiers for normal use.
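A rough back-of-the-envelope check can help before you turn a build on. In the sketch below, only the 1,000 free operations a month comes from the text above; the operations per image and the price per image are placeholder assumptions you should replace with the figures from your own scenario and your provider's current pricing.

```python
FREE_OPERATIONS_PER_MONTH = 1_000  # Make free plan figure quoted above


def estimate_month(images: int, ops_per_image: int, cents_per_image: int) -> dict:
    """Estimate monthly Make operations and model spend for one image build.

    ops_per_image and cents_per_image are assumptions supplied by the caller,
    not official prices.
    """
    operations = images * ops_per_image
    return {
        "operations": operations,
        "within_free_operations": operations <= FREE_OPERATIONS_PER_MONTH,
        "model_cost_dollars": images * cents_per_image / 100,
    }


if __name__ == "__main__":
    # Hypothetical example: 200 images, 3 operations each, 4 cents per image.
    print(estimate_month(images=200, ops_per_image=3, cents_per_image=4))
```

With those example inputs the build uses 600 operations, which fits inside the free allowance, and the model spend comes to 8 dollars. Change the inputs and the picture shifts quickly, which is why image volume is the number to watch.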
Do you need to know how to code?
No. Every guide comes down to connecting boxes on a visual canvas and writing a plain-English prompt for the AI step. The one build with extra setup is OCR, which needs a Google Cloud key, and its guide walks through that once. Our Make AI scenarios roundup and the AI Tools Directory are good next stops.
Want it set up with you, live?
Book a 1-on-1 Live Claude AI Crash Course and we build your first AI workflow together, screen to screen.
Want better prompts for images?
The AI Prompt Library includes image-prompt and vision recipes you can paste straight in.
Common questions
Why does this set not use Claude for everything?
Because Claude does not generate or read images through Make. This is the set where ChatGPT, Google Vision, and Make’s AI do the image work, with Claude writing the brief in the blog-image build.
Which build should I start with?
The one that matches your chore. Need lots of images? Start with generate-from-a-sheet. Buried in receipts or scans? Start with the receipt or OCR build.
Is it expensive?
More than the text builds, since images cost cents each, but normal volumes stay inside the free tiers. Watch your image count and you will be fine.
Is it safe with private images?
Keep sensitive or personal images out of watched folders, since reading them means sending them to a model. Public images for accessibility or search are a clear win.
Do I need an API key?
For the ChatGPT and OCR builds, yes. The receipt build uses Make’s built-in AI with no extra key, which makes it the easiest to start.
Sources and official documentation
- Make Help Center
- OpenAI image generation (docs)
- Google Cloud Vision OCR (docs)
- Image alt text guidance (W3C WAI)
- Generative AI — Grokipedia
Last reviewed: May 2026. These tools update their interfaces often; check the official docs above for current details.
