Image to Text OCR: Make

What this does: drop a photo or screenshot of text into a Drive folder, and Google Cloud Vision pulls the words out into a spreadsheet.

Time to set up: about 25 minutes once, including a little Google Cloud setup. After that, images become searchable text on their own.

What you need: a Make account (free tier is fine), a Google Cloud account with the Vision API on, and Drive.

Skip if: your images are already digital text, or you only have one or two to handle.


A genealogy researcher has a hard drive full of scanned census pages and handwritten letters. The stories are in there, but you cannot search a photograph. Finding one ancestor means squinting through hundreds of images by eye, which is how good leads stay buried for years.

This build makes the images searchable. Drop a scan in a Drive folder, and Google Cloud Vision reads the text out of it, logging it to a sheet you can search and copy. A pile of unreadable photos becomes a body of text you can actually work with.

We wire it in Make, part of our AI image and vision set. This one uses Google Cloud Vision, the strongest of the OCR options, which also reads handwriting. It needs a little more setup than the others, and the steps below walk through it.


What does this workflow actually do?

In one line: an image of text becomes searchable text. Make watches a Drive folder. When you add a photo or scan, Google Cloud Vision runs text detection and returns the words, and Make logs them to a sheet next to the file name. Now you can search what used to be a flat image.

A few real cases, none of them the usual ones:

The genealogy researcher above, finally searching a drive full of scans.
A chef digitizing a box of handwritten family recipes.
A paralegal pulling text out of scanned exhibits for a brief.
A teacher turning whiteboard photos into typed lesson notes.

Scanning the pages is done; the images exist. Reading and retyping every one is the wall. OCR is the tool built precisely to knock that wall down.

Why use Make instead of retyping the text yourself?

Because retyping a folder of scans is exactly the kind of work nobody finishes. The text is trapped in pixels, and freeing it by hand is hours per batch.

Make watches the folder and runs each image through Vision the moment it lands, logging the text for you. Vision does the one hard step, turning pixels into characters; Make does the watching and filing. Together they turn an unsearchable pile into a searchable sheet without you typing a word.

What do you need before you start?

A Make account. The free 1,000 operations a month covers a lot of scans.
A Google Cloud account with the Vision API enabled, and a key. This is the extra setup step; Google’s docs walk through it.
A Google Drive folder for the images and a Sheet for the text.
Patience for one-time setup. After Vision is connected, the build runs itself.

Two Make terms. A scenario is the whole folder-to-text automation. A module is one box in it. This build is three boxes, one scenario.

How does the workflow work, step by step?

Three modules, left to right:

Module	App	What it does
1. Trigger	Google Drive	Fires when you add an image to the folder
2. Brain	Google Cloud Vision	Runs OCR and returns the detected text
3. Output	Google Sheets	Logs the text next to the file name

The finished scenario in Make: a Drive trigger, a Google Cloud Vision OCR step, and a Google Sheets row, wired left to right.
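As a rough mental model, the three modules behave like three small functions chained together. The sketch below is not how Make is implemented. The names `NewFile`, `run_scenario`, and `fake_ocr` are hypothetical, and the OCR step is a placeholder where the real build calls Google Cloud Vision.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class NewFile:
    """Hypothetical stand-in for what the Drive trigger hands to the next module."""
    name: str
    content: bytes


def run_scenario(
    new_files: List[NewFile],
    ocr: Callable[[bytes], str],
    sheet: List[List[str]],
) -> int:
    """Trigger -> OCR -> log: append one [file name, text] row per new image."""
    for item in new_files:
        text = ocr(item.content)          # module 2: read the text
        sheet.append([item.name, text])   # module 3: log it next to the file name
    return len(new_files)


def fake_ocr(image_bytes: bytes) -> str:
    # Placeholder only: a real build would send the bytes to Google Cloud Vision.
    return image_bytes.decode("utf-8", errors="replace")


sheet: List[List[str]] = []
processed = run_scenario(
    [NewFile("census_1901.jpg", b"John Smith, farmer")], fake_ocr, sheet
)
```

After this runs, `sheet` holds one row pairing the file name with its text, which is the same shape the finished scenario writes to Google Sheets.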

Step 1: Watch a scans folder

Create a scenario and add the Google Drive module Watch Files in a Folder. Connect your account and pick the folder of scans. This trigger fires on each new image.
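Conceptually, a folder trigger has to remember which files it has already seen and pass along only the new ones. The sketch below shows that idea under stated assumptions. The file fields `id`, `name`, and `mimeType` follow the Drive API's file resource, while `pick_new_images` and the in-memory `seen_ids` set are hypothetical and say nothing about how Make tracks state internally.

```python
from typing import Dict, List, Set

IMAGE_PREFIX = "image/"


def pick_new_images(listing: List[Dict[str, str]], seen_ids: Set[str]) -> List[Dict[str, str]]:
    """Return image files not seen before, and record every file as seen."""
    fresh = []
    for f in listing:
        if f["id"] in seen_ids:
            continue
        seen_ids.add(f["id"])  # remember it so the next check skips it
        if f.get("mimeType", "").startswith(IMAGE_PREFIX):
            fresh.append(f)
    return fresh


seen: Set[str] = set()
folder = [
    {"id": "a1", "name": "letter_1923.jpg", "mimeType": "image/jpeg"},
    {"id": "b2", "name": "notes.pdf", "mimeType": "application/pdf"},
]
first_check = pick_new_images(folder, seen)
second_check = pick_new_images(folder, seen)
```

The first check returns only the JPEG, and the second returns nothing because both files are already recorded. Filtering on the MIME type is a design choice here so that non-image files never reach the OCR step.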

Step 2: Run OCR with Google Cloud Vision

Add the Google Cloud Vision module Run Text Detection (OCR) within an Image and map the file from Step 1. Connect it with your Google Cloud key. For handwritten images, pick document text detection: in the Vision API this is the feature that handles handwriting, and it reads cursive far better than plain text detection.

Tip: in the module, choose document text detection for dense pages and for handwritten letters and notes. Plain text detection is best for signs, labels, and screenshots. Make may label these options slightly differently from the API names.

Matching the detection mode to your images is what gets clean text out instead of garble.
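For readers curious what the module sends behind the scenes, here is a minimal sketch of a request body for the Vision REST endpoint `https://vision.googleapis.com/v1/images:annotate`. The feature types `TEXT_DETECTION` and `DOCUMENT_TEXT_DETECTION` are the API's own names. The `kind` labels (`sparse`, `dense`, `handwriting`) and the helper name are hypothetical, chosen only to mirror the tip above. In the API, handwriting goes through document text detection rather than a separate feature type.

```python
import base64
from typing import Dict

# Hypothetical labels mapped to real Vision API feature types.
FEATURE_FOR = {
    "sparse": "TEXT_DETECTION",               # signs, labels, screenshots
    "dense": "DOCUMENT_TEXT_DETECTION",       # full printed pages
    "handwriting": "DOCUMENT_TEXT_DETECTION", # letters and notes
}


def build_annotate_request(image_bytes: bytes, kind: str) -> Dict:
    """Build the JSON body for images:annotate with one image and one feature."""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": FEATURE_FOR[kind]}],
            }
        ]
    }


body = build_annotate_request(b"raw image bytes", "handwriting")
```

The function only builds the payload and makes no network call, so it is safe to run without a key. Sending it would still need your API key or other Google Cloud credentials.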

Step 3: Log the text

Add the Google Sheets module Add a Row and map the file name plus the detected text into columns. Now the folder has a searchable text companion for every scan.
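The mapping in this step amounts to pulling one text field out of Vision's response and pairing it with the file name. A sketch of that, assuming the REST response shape (`responses[0].fullTextAnnotation.text`, with `textAnnotations[0].description` as a fallback). The helper names are hypothetical, and the 50,000-character cap reflects the Google Sheets per-cell limit as I understand it, so treat it as an assumption.

```python
from typing import Dict, List

MAX_CELL_CHARS = 50_000  # assumed Google Sheets per-cell character limit


def extract_text(vision_response: Dict) -> str:
    """Return the detected text from an images:annotate response, or '' if none."""
    first = (vision_response.get("responses") or [{}])[0]
    if "error" in first:
        raise ValueError(first["error"].get("message", "Vision returned an error"))
    full = first.get("fullTextAnnotation", {}).get("text")
    if full:
        return full
    annotations = first.get("textAnnotations") or []
    return annotations[0]["description"] if annotations else ""


def build_row(file_name: str, text: str) -> List[str]:
    """One sheet row: file name in column A, trimmed text in column B."""
    return [file_name, text.strip()[:MAX_CELL_CHARS]]


sample = {"responses": [{"fullTextAnnotation": {"text": "Flour 2 cups\nSugar 1 cup\n"}}]}
row = build_row("recipe_card.jpg", extract_text(sample))
```

An image with no readable text yields an empty string instead of an error, so a blank scan still gets a row and is easy to spot in the sheet.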

How do you run it and check the result?

Click Run once on a test scan. When I tested this, printed text came back clean immediately, while handwriting needed document text detection plus a tidy scan to read well, which is the real limit of OCR on cursive. Pick the right mode, then turn the scenario on so your scans become searchable as you add them.

After that, a drive full of images turns into a body of text you can search, copy, and quote. The stories stop hiding in pictures.
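If you export the sheet or read it from a script, searching it is a plain substring match over the text column. A small sketch, assuming each row is `[file name, text]` as logged above. `search_rows` is a hypothetical helper, not part of Make or Sheets.

```python
from typing import List


def search_rows(rows: List[List[str]], query: str) -> List[str]:
    """Return the file names whose logged text contains the query, ignoring case."""
    needle = query.casefold()
    return [name for name, text in rows if needle in text.casefold()]


rows = [
    ["census_1901.jpg", "John Smith, farmer, age 42"],
    ["letter_1923.jpg", "Dear Margaret, the harvest was poor"],
]
hits = search_rows(rows, "smith")
```

Inside Google Sheets itself, the built-in find does the same job without any code.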

What does this cost to run?

Piece	Free tier	If you outgrow it
Make	1,000 operations/month free	Core plan from about $9/month
Google Cloud Vision	Generous free monthly tier	A few dollars per thousand images after that
Google Drive	Free	Free
Google Sheets	Free	Free

Google Cloud Vision has a free monthly allowance that covers most personal projects, then charges a few dollars per thousand images. More on Make’s side in our Make guide.
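To turn "a few dollars per thousand images" into a number for your own volume, the arithmetic is short. Every figure in this sketch is an assumption to replace with current values from the official pricing pages: 1,000 free Vision units a month, 1.50 USD per 1,000 images after that, and roughly three Make operations per scan (one per module, ignoring trigger checks that find nothing).

```python
def estimate_vision_cost(images: int, free_units: int = 1000, usd_per_1000: float = 1.50) -> float:
    """Estimated monthly Vision bill in USD, assuming one billable unit per image."""
    billable = max(0, images - free_units)
    return round(billable / 1000 * usd_per_1000, 2)


def scans_within_make_budget(operations: int = 1000, ops_per_scan: int = 3) -> int:
    """How many scans fit in a Make operations budget under the assumed rate."""
    return operations // ops_per_scan


small_batch = estimate_vision_cost(800)    # under the assumed free allowance
large_batch = estimate_vision_cost(5000)   # 4,000 billable images
make_capacity = scans_within_make_budget()
```

Under these assumptions, 800 images cost nothing on the Vision side, 5,000 images cost about 6 USD, and the Make free tier covers roughly 333 scans a month.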

What can go wrong, and how do you avoid it?

Handwriting comes out garbled. Use document text detection, the mode that handles handwriting, and even then messy cursive is hard. Tidy scans help most.
Layout is lost. OCR returns the words, not the page design. For tables, expect to tidy the result.
Google Cloud setup trips you up. It is the one fiddly part; follow Google’s enable-the-API steps once and it is done.
Wrong language. Set the language hint in the module if your text is not English.
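In the Vision REST API, the language hint lives in `imageContext.languageHints` on each request, as a list of language codes such as `de` or `fr`. The sketch below adds hints to a single request object. `with_language_hints` is a hypothetical helper, and how the Make module exposes this field is not shown here. Vision can usually detect the language without a hint, so hints are best kept for cases where the output comes back wrong.

```python
from typing import Dict, List


def with_language_hints(request: Dict, hints: List[str]) -> Dict:
    """Return a copy of one annotate request with imageContext.languageHints set."""
    updated = dict(request)
    context = dict(updated.get("imageContext", {}))
    context["languageHints"] = list(hints)
    updated["imageContext"] = context
    return updated


base_request = {
    "image": {"content": ""},  # base64 image data would go here
    "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
}
german_request = with_language_hints(base_request, ["de"])
```

The helper returns a copy instead of editing the original request, so the same base request can be reused for scans in different languages.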

The same read-then-log pattern runs many jobs. See Make AI scenarios.

How do you build this in Zapier or n8n instead?

Same three jobs, different names.

Job	Make	Zapier	n8n
Catch an image	Watch Files in a Folder	New File in Folder trigger	Google Drive Trigger node
Read the text	Cloud Vision OCR	Google Cloud Vision action	HTTP Request node calling the Vision API
Log the text	Add a Row	Create Spreadsheet Row	Google Sheets node

Make and Zapier are the easiest to start with. Zapier vs Make vs n8n compares all three.


Frequently asked questions

Why Google Cloud Vision and not an AI chatbot?

Cloud Vision is purpose-built for OCR, reads handwriting well, and is cheap at scale. A vision chatbot can also read text, but Vision is the specialist here.

Does it handle handwriting?

Yes, through document text detection, though messy cursive is still hard for any tool. Clear writing and clean scans read best.

Is the Google Cloud setup hard?

It is the one fiddly step: create a project, enable the Vision API, and make a key. Google’s docs walk through it, and you only do it once.

Can I clean up the text afterward?

Yes. Add a Claude step before the Sheets module to fix OCR errors or summarize the text. This build keeps it to raw extraction.
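Some of the most common OCR debris can be tidied with plain string handling before any AI step. The sketch below is a rule-based stand-in, not the Claude step described above. It rejoins words split by a hyphen at a line break, collapses repeated spaces, and drops empty lines. `tidy_ocr_text` is a hypothetical helper.

```python
import re


def tidy_ocr_text(text: str) -> str:
    """Light cleanup of raw OCR output using simple rules only."""
    # Rejoin words split across lines: "ances-\ntor" becomes "ancestor".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Trim each line and drop the empty ones.
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


cleaned = tidy_ocr_text("The ances-\ntor  lived\n\n in  Ohio ")
```

One known tradeoff: a genuine compound word that happens to break at a line end, such as "well-" followed by "known", also loses its hyphen. Rules like these cannot fix misread characters, which is where a language model step earns its place.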

What about non-English text?

Set the language hint in the Vision module, and it handles many languages well.

References and sources

Last reviewed: May 2026. Make, Google Cloud, and Google update their interfaces; check the official pages for exact button names.
