The Real State of AI Video in 2026

Special Report · Vol. 2

Where it works, where it doesn’t, and what creators are actually using. Seedance, Sora’s exit, Runway’s $5.3B bet, and what Hollywood is saying.

By James · Beginners in AI · May 17, 2026 · 8 figures · 32 verified claims

The AI video race has a new shape. Runway is now worth $5.3 billion. OpenAI’s Sora consumer app shut down on April 26, 2026, and the Sora API ends September 24, 2026. Seedance 2.0 leads the live benchmark. Hollywood is split, not unified. Below is what actually shipped, what the leaders cost, and what creators are running — with every number checked against a primary source on May 17, 2026.

— Get the Full PDF Report

The Real State of AI Video in 2026

21 pages · 8 figures · pricing tables · sources page. Free PDF download.


Six numbers that explain everything

$5.3B

Runway valuation · Feb 10, 2026 Series E

4–15s

Seedance 2.0 native clip range · longer is stitched

Sep 24

Sora API ends 2026 · consumer app shut Apr 26

5

Characters held consistent by Nano Banana 2

24 fps

Genie 3 real-time 720p interactive worlds

~$50

Full pipeline per month · mid-tier 2026

Update (May 19, 2026): Google has announced Gemini Omni, an any-to-any multimodal video model that integrates conversational editing, character consistency, and synchronized audio in a single model. Full breakdown: Gemini Omni: Google’s Video Leap. The full ranking and capability matrix in this report will be revised to reflect Omni in the next Special Report update.

One claim we could not confirm and dropped: the widely-repeated “Netflix has committed to 50 AI-made films by end of 2026.” No Netflix earnings call, press release, or major outlet supports it. What Netflix has actually done is in Section 7. Two Christopher Nolan quotes (“artistic genocide” and “the moment we let machines dream for us…”) were also dropped — we could not find a primary source for either.

Three things changed in 90 days

The first quarter of 2026 reset the field. Six product moves from December 2025 through April 2026 explain almost everything you see in your feed today.

Runway shipped Gen-4.5 on December 1, 2025. Ten days later, on December 11, 2025, Runway announced GWM-1, its first General World Model, plus a native audio upgrade. GWM-1 is a frame-by-frame, real-time, interactive system.
Kling 3.0 and Kling 3.0 Omni launched on February 5, 2026. Omni is the variant with native audio in five languages: Chinese, English, Japanese, Korean, and Spanish.
Seedance 2.0 launched on February 12, 2026. A viral clip of a fake Tom Cruise vs. Brad Pitt fistfight, made with a pre-release version, had gone up on X the day before. On February 13, Disney’s lawyers sent a cease and desist. Details in Section 4.
Nano Banana 2 launched on February 26, 2026. Technically named Gemini 3.1 Flash Image. It is an image model, not a video model. It matters here because most serious AI video work starts from a still.
OpenAI announced the Sora discontinuation on March 24, 2026. The consumer app and website shut down on April 26, 2026. The Sora API ends on September 24, 2026. OpenAI’s help center lists no successor product.
Veo 3.1 was released on October 15, 2025, not in 2026. The 2026 part is real but smaller: on January 13, 2026, Google added 4K upscaling, vertical 9:16, and scene extension that chains up to 20 segments for clips of roughly 141 seconds.

Figure 1. Six releases that rewrote the field. Each date confirmed against the company’s own announcement or tier-one press.

Who is actually winning the video race?

The answer depends on what you mean by winning. We pulled live data from the Artificial Analysis Video Arena, which uses blind pairwise voting and reports an Elo score. As of May 2026, the top of the leaderboard:

#1 — Dreamina Seedance 2.0 (ByteDance) at Elo 1,222
#2 — HappyHorse-1.0 (Alibaba research) at 1,214 — no public access yet
#3 — Kling 3.0 Omni 1080p Pro (Kuaishou) at 1,106
#5 — Veo 3.1 (Google) at 1,102
#8 — Sora 2 (OpenAI, December model) at 1,088 — consumer app already shut
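To make the Elo gaps above concrete, here is a minimal sketch that converts a rating difference into an expected head-to-head win rate. It assumes the classic Elo logistic curve with a 400-point scale; Artificial Analysis may fit its ratings differently, so treat the percentages as a rough reading, not the arena's own numbers. The function name `elo_win_probability` is ours, chosen for illustration.

```python
def elo_win_probability(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of pairwise votes A wins against B under the classic Elo model.

    Assumption: the standard 400-point logistic scale. The arena's exact
    rating method is not stated in this report.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Ratings quoted in the leaderboard above (May 2026 snapshot).
ratings = {
    "Seedance 2.0": 1222,
    "HappyHorse-1.0": 1214,
    "Kling 3.0 Omni": 1106,
    "Veo 3.1": 1102,
    "Sora 2": 1088,
}

leader = "Seedance 2.0"
for name, rating in ratings.items():
    if name == leader:
        continue
    p = elo_win_probability(ratings[leader], rating)
    print(f"{leader} vs {name}: {p:.0%} expected win rate")
```

Under that assumption, the 134-point gap between Seedance 2.0 and Sora 2 works out to roughly a 68 percent expected win rate in blind votes, while the 8-point gap to HappyHorse-1.0 is close to a coin flip.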

S-tier is Seedance 2.0, Kling 3.0 / 3.0 Omni, and Veo 3.1. Each is best at something different. Seedance leads on multimodal control and phoneme-level lip sync. Kling 3.0 is the only major model with native 4K at up to 60 fps. Veo 3.1 is strongest on cinematic look and dialogue audio with sub-120 millisecond lip sync.

A-tier is Runway Gen-4.5 — production workflow leader — plus Luma Ray3, PixVerse V6, Wan 2.6, and Pika. Wan 2.6 supports first- and last-frame control. Pika owns creative effects (Pikaffects, Pikaswaps).

One claim we have to flag: several reviewer roundups still cite Runway Gen-4.5 at an Elo of 1,247. That figure is stale. Gen-4.5 does not appear in the current top 20. The 1,247 number was accurate in December 2025, before Seedance, Kling 3.0, Wan 2.6, and HappyHorse were added to the arena.

Figure 2. Tier list by live Elo benchmark, not vibes. We list Sora 2 for completeness, not as a tool to start with.

Image is where most workflows start

If you watch any popular AI video tutorial in 2026, you will see the same pattern. The creator opens an image tool first. They generate a still. They lock the character. Only then do they animate.

Why? A direct text-to-video prompt asks the model to invent the character, the room, the lighting, and the motion at once. The result tends to morph and melt. Starting with a still gives the video model a spatial anchor, so it can spend nearly all its compute on motion physics. The three image tools that matter for video work in 2026:

Midjourney V8.1 — launched as alpha March 17, 2026, became default April 30. Four to five times faster than V7, supports native HD output, and renders legible text inside images. $10 / $30 / $60 / $120 per month. No free tier.
Nano Banana 2 (Gemini 3.1 Flash Image) — launched February 26, 2026. The headline claim, verified directly from Google’s blog: “Maintain character resemblance of up to five characters and the fidelity of up to 14 objects in a single workflow.” Outputs 512px up to 4K. Default image model in the Gemini app, Google Search AI Mode, Google Lens, and Flow video editor.
FLUX.2 — current version from Black Forest Labs. Open-weight (klein, dev) and proprietary (pro, max) variants. Black Forest raised a $300 million Series B on December 1, 2025. FLUX.2 [pro] uses a 32 billion parameter rectified flow transformer with a Mistral 3 vision-language model and supports multi-reference composition.

The pipeline most creators actually use

Script in any LLM. Voiceover in ElevenLabs. Character stills in Nano Banana 2, locking the same character across every shot. Animate each still in Kling 3.0 or Runway. Edit, caption, and mix in CapCut. Total time per video is roughly 8 to 12 hours.

The reason this pipeline beats single-prompt video generation is identity persistence. The same protagonist in eight different shots, without their face changing. That used to be the hardest problem in AI filmmaking. It is now solved at the image stage, then carried forward as a reference image into the video model.

Figure 4. Image first. Always image first. The stack taught in current Skillshare and Udemy creator courses. ElevenLabs ($22) + Nano Banana (free via Gemini) + Kling Standard ($7) or Pro ($26) + CapCut Pro ($20) = roughly $50-70/month for the full pipeline. The top tier with Veo 3.1 + Project Genie is Google AI Ultra at $250/month.
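The monthly total in the caption above is simple addition, sketched here so you can swap in your own tiers. Prices are the rounded figures quoted in the caption; the dictionary names and the `monthly_total` helper are hypothetical, for illustration only.

```python
# Monthly prices in USD, rounded as quoted in the Figure 4 caption.
BASE_STACK = {
    "ElevenLabs": 22,
    "Nano Banana (free via Gemini)": 0,
    "CapCut Pro": 20,
}
KLING_TIERS = {"Standard": 7, "Pro": 26}


def monthly_total(kling_tier: str) -> int:
    """Sum the fixed tools plus the chosen Kling tier."""
    return sum(BASE_STACK.values()) + KLING_TIERS[kling_tier]


for tier in KLING_TIERS:
    print(f"Full pipeline with Kling {tier}: ${monthly_total(tier)}/month")
```

With Kling Standard the stack comes to $49 per month and with Kling Pro to $68, which is where the roughly $50-70 range comes from.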

The Seedance phenomenon

Seedance 2.0 is the most disruptive product of 2026 so far. It is also the most legally complicated.

The model launched on February 12, 2026 from ByteDance’s Seed team. The day before, the Irish filmmaker Ruairí Robinson posted an AI clip of Tom Cruise and Brad Pitt in a fistfight, made with a two-line prompt in a pre-release version. The post reached 1.8 million views in the first week. Deadpool & Wolverine co-writer Rhett Reese posted in response: “I hate to say it. It’s likely over for us.”

On February 13, 2026, Disney’s outside counsel sent a cease and desist to ByteDance global general counsel John Rogovin. The letter said ByteDance had pre-packaged Seedance “with a pirated library of Disney’s copyrighted characters from Star Wars, Marvel, and other Disney franchises, as if Disney’s coveted intellectual property were free public domain clip art.” Paramount Skydance, Warner Bros., Netflix, and Sony Pictures followed within a week.

On March 16, 2026, Senators Marsha Blackburn (R-TN) and Peter Welch (D-VT) sent a letter to ByteDance CEO Liang Rubo demanding Seedance be shut down. The letter said: “Seedance 2.0 is the most glaring example of copyright infringement from a ByteDance product to date, and you must immediately shut down Seedance and implement meaningful safeguards to prevent further infringing outputs.” ByteDance did not shut Seedance down. It added IP-character filtering and real-face interception.

What Seedance 2.0 actually does

A 4.5 billion parameter dual-branch diffusion transformer. Four input types in one call: text, image, audio, and video reference. Native audio generation including dialogue, sound effects, and ambient sound. Phoneme-level lip sync in eight languages. Per the official arXiv paper, native output is 480p and 720p, single-clip range is 4 to 15 seconds. The often-repeated “90 seconds” figure refers to stitched output, not native single-pass generation.

On Dreamina (international), the free tier gives 225 daily shared tokens. Paid plans start around $15 per month and run up to $84. On Jimeng (China), entry is 69 RMB per month, roughly $9.60. API access via Atlas Cloud is around $0.10 per second at standard quality.
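Because native clips run 4 to 15 seconds, any longer video has to be planned as a series of clips and stitched. The sketch below splits a target length into evenly sized native clips and estimates API cost at the roughly $0.10 per second quoted above. It assumes whole-second clip lengths, linear per-second billing, and no charge for stitching; none of those are confirmed by ByteDance or Atlas Cloud, and the function names are hypothetical.

```python
import math

MIN_CLIP_S = 4         # native single-clip minimum, per the text
MAX_CLIP_S = 15        # native single-clip maximum, per the text
USD_PER_SECOND = 0.10  # approximate standard-quality API rate (assumed linear)


def plan_clips(total_seconds: int) -> list[int]:
    """Split a target duration into the fewest native clips, evenly sized."""
    if total_seconds < MIN_CLIP_S:
        raise ValueError(f"target must be at least {MIN_CLIP_S} seconds")
    count = math.ceil(total_seconds / MAX_CLIP_S)
    base, extra = divmod(total_seconds, count)
    # 'extra' clips carry one additional second so the lengths sum exactly.
    return [base + 1] * extra + [base] * (count - extra)


def estimate_cost(total_seconds: int) -> float:
    """Estimated API cost in USD under the linear-billing assumption."""
    return round(total_seconds * USD_PER_SECOND, 2)


for target in (12, 31, 90):
    clips = plan_clips(target)
    print(f"{target}s -> {len(clips)} clip(s) {clips}, about ${estimate_cost(target):.2f}")
```

Even splitting avoids leaving a final fragment shorter than the 4-second minimum. A 90-second stitched video needs six 15-second generations, at about $9 before retries, and in practice most shots take more than one attempt.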

Runway: from indie tool to Google challenger

On February 10, 2026, Runway announced a $315 million Series E at a $5.3 billion valuation. That is about 1.6 times its $3.3 billion valuation from April 2025. Total raised since 2018 is $860 million.

The popular framing of this round was that AMD Ventures and Nvidia led it. That is not what Runway’s official announcement says. The lead investor was General Atlantic. NVIDIA, AMD Ventures, Adobe Ventures, AllianceBernstein, Fidelity, Mirae Asset, Emphatic Capital, Felicis, and Premji Invest participated. Important distinction when citing this in any roundup.

Figure 3. $2M to $5.3B in eight years. Seven rounds. The Series E on Feb 10, 2026 was led by General Atlantic, not AMD or Nvidia as some early reports claimed.

On May 15, 2026, TechCrunch published “Runway started by helping filmmakers — now it wants to beat Google at AI” by Rebecca Bellan. In that piece, Runway co-CEO Anastasis Germanidis said: “To build a world model, we first needed to build a really great video model. We believe that teaching models to predict pixels directly is the best way to achieve general-purpose simulation.”

CEO Cristóbal Valenzuela framed the pivot: “World models are the most transformative technology of our time. Our mission is to accelerate their development and ensure they have a positive impact on the world.” The TechCrunch piece cites 155 employees, $860 million raised, and $40 million in annual recurring revenue added in Q2 2026. The ARR figure is a founder-supplied number mid-quarter, not audited.

What is a world model?

Most AI video today is a one-way performance. You write a prompt, the model renders a clip, you watch it. You cannot walk through it. A world model is different.

A world model is an AI system that builds an internal simulation of an environment, then renders it frame by frame in response to your inputs in real time. You can walk forward, look up, look down, and the simulation stays consistent. Google DeepMind’s definition: “World models are AI systems that can use their understanding of the world to simulate aspects of it, enabling agents to predict both how an environment will evolve and how their actions will affect it.”

Genie, step by step

Genie 1 arrived in February 2024. Generated 2D platformer-style environments at 256 by 256 pixel resolution. Genie 2 arrived on December 4, 2024. Generated 3D environments, but only held them consistent for about 10 to 20 seconds. Genie 3 was announced on August 5, 2025. It holds consistency for several minutes at 24 frames per second and 720p, per DeepMind: “you can navigate in real time at 24 frames per second, retaining consistency for a few minutes at a resolution of 720p.”

Project Genie is the consumer product. It launched on January 29, 2026 for Google AI Ultra subscribers in the United States, ages 18 and up. Sessions cap at 60 seconds. The Google AI Ultra plan that gates it costs around $250 per month.

Runway shipped GWM-1 on December 11, 2025. Three flavors. GWM-Worlds generates explorable environments at 24 fps and 720p. GWM-Avatars (now called Runway Characters) makes conversational characters with lip sync, gaze, and gesture. GWM-Robotics generates synthetic training data for robot policies.

Other players: World Labs, founded by Fei-Fei Li, released Marble in November 2025. Decart, Odyssey, Tencent HY World 1.5, and Microsoft Muse all have research demos. Waymo quietly adopted a fine-tuned Genie 3 variant in February 2026 for robotaxi edge-case simulation.

Figure 7. Three things people call “AI video.” Only one is interactive. The serious money in 2026 (Runway at $5.3B, World Labs reportedly at $5B) is no longer flowing into text-to-video. It’s flowing into world models. The interactivity difference is why.

A caveat on VR and AR

The original brief asked us to confirm a connection between world models and VR or AR headsets. We did not find one in any official Google, Runway, or DeepMind statement. The applications listed are robotics training, agent training, gaming, interactive entertainment, and scientific simulation. World models could end up powering VR experiences, but no current product ships to a headset.

What Hollywood is actually saying

This section is a neutral tally. Quotes are presented as spoken. We do not editorialize. We also flag two quotes from the original brief that we could not verify.

Peter Jackson · Cannes 2026

Received the honorary Palme d’Or at Cannes on May 12, 2026. At his masterclass the following day: “I don’t dislike it at all. I mean, to me, it’s just a special effect. It’s no different from other special effects.” And: “AI used in the right way, it’s just a tool like any other tool.” Both quotes verified against Variety and The Hollywood Reporter.

Steven Spielberg · SXSW 2026

March 13, 2026: “I am not for AI if it replaces a creative individual.” Also said he had never used AI in any of his films “yet.”

James Cameron

Joined Stability AI’s board of directors on September 24, 2024. There is no public reporting of him leaving. Around the launch of Avatar: Fire and Ash, Cameron said the film would include a title card noting “no generative A.I. was used in the making of this movie,” and added: “I’m not negative about generative AI. I just wanted to point out we don’t use it on the Avatar films.”

Guillermo del Toro · NPR Fresh Air

October 23, 2025: “AI, particularly generative AI, I am not interested, nor will I ever be interested… The other day, somebody wrote me an email, said, ‘What is your stance on AI?’ And my answer was very short. I said, ‘I’d rather die.’”

Darren Aronofsky

Founded Primordial Soup in May 2025 and partnered with Google DeepMind on Veo. The studio’s first short, Ancestra, directed by Eliza McNitt, premiered at Tribeca on June 13, 2025. His mantra: “soup, not slop.”

Two attributions we could not verify

The original brief listed two Christopher Nolan quotes: “artistic genocide” and “the moment we let machines dream for us is the moment we lose our own imagination.” After a careful search across Wired, The Washington Post, Variety, Deadline, ScreenRant, and Nolan’s own Directors Guild appearances, we could not find a primary source for either quote. Nolan has spoken about AI — in Wired (July 2023) he called AI’s biggest danger “the abdication of responsibility” — but the two quotes in the brief appear to be paraphrases or fabrications.

Figure 5. Six directors. Six positions. The Nolan position is dashed to mark unverified. Every other position is sourced.

The headlines we had to correct

The “Everyone in Hollywood” headline. The original brief cited a Hollywood Reporter Cannes piece titled “Everyone in Hollywood Is Using AI, but They Are Scared to Admit It” as a May 2026 finding. The article exists but it ran on May 15, 2024, not 2026. It was a Cannes 2024 dispatch by Winston Cho and Scott Roxborough. The point still holds — quiet adoption is widespread — but it is a 2024 article, not a fresh 2026 finding.

The Netflix “50 AI films” claim. We could not verify any Netflix plan or commitment to release 50 AI-made films by the end of 2026. What Netflix has actually done is narrower. It used generative AI for a building-collapse VFX scene in the Argentine series El Eternauta (confirmed by co-CEO Ted Sarandos on the Q2 2025 earnings call). It launched INKubator, an internal AI animation studio for shorts, in March 2026. It acquired Ben Affleck’s AI firm InterPositive (confirmed on the Q1 2026 earnings call, April 17, 2026). Sarandos’s verified quote: “AI represents an incredible opportunity to help creators make films and series better, not just cheaper.”

The directors who use AI quietly. Surgical, not generative, is the pattern. The Brutalist used Respeecher to refine Hungarian dialogue pronunciation. Late Night with the Devil used three AI-generated still title cards. Secret Invasion used AI for the opening titles. AI as a layer in the pipeline, not as a replacement for the director or writer.

What it means for you

This is a tools section. Pick the stack that matches the job you actually do.

If you make faceless YouTube videos: Script in any LLM. Voice in ElevenLabs Creator at $22 per month for 100,000 credits and commercial use. Character stills in Nano Banana 2 (free in the Gemini app, paid via API). Motion in Kling 3.0 Standard at $6.99 per month or Pro at $25.99. Edit and caption in CapCut, $9.99 to $19.99 per month for Pro.
If you make social short-form video: Seedance 2.0 via Dreamina, free with daily token limits or $15 per month for basic. Edit in CapCut. Both products are owned by ByteDance and integrate cleanly.
If you market a small business: Higgsfield’s Marketing Studio is built for you. Paste a product URL. The system extracts the product page, picks an AI avatar from 40-plus options, and runs Seedance 2.0 for video output. Higgsfield reports roughly 4.5 million video generations per day across its platform (a marketing figure, not independently audited).
If you are an indie filmmaker: Runway Standard at $15 per month gets you Gen-4.5, Aleph, Act-Two, plus access to Veo and Kling 3.0 Pro through Runway’s app layer. Upscale in Topaz Video AI, now subscription-only at $299 per year for Personal or $699 for Pro since October 2025. Music in Suno Pro at $10 per month. Voice in ElevenLabs.
If you are a VFX pro: ComfyUI is the open-source node-based pipeline that runs FLUX.2 [dev], Wan, and ControlNet locally. Nano Banana 2 handles previsualization stills. Runway’s GWM-1 generates environments for set extension and digital backlots. Topaz handles finishing.

Figure 6. Free, mid, pro. Six tools. All prices verified against the company’s own pricing page as of May 17, 2026. A competent creator can run the entire stack for under $100 a month. The $200+ tier exists for power users who need Veo 3.1 at full quality or Project Genie world-model access. Price becomes the limiting factor only on long-form productions, which stitch together hundreds of generations.

Myths to bust

Eight myths. Eight receipts. Every reality below ties to a primary source verified in May 2026.

Myth: “AI video is solved.” Reality: native single-clip output for nearly every major model is still 8 to 15 seconds. Longer outputs come from chaining and extension. Veo 3.1’s longest chained sequence is about 141 seconds.
Myth: “Sora was the best.” Reality: Sora 2 Pro never reached the top of the Artificial Analysis leaderboard in 2026. Seedance 2.0 leads at 1,222 Elo. Sora 2 sits at 1,088. The consumer app shut down April 26, 2026.
Myth: “AI is replacing filmmakers.” Reality: most documented professional uses are surgical, not generative. The Brutalist used Respeecher to refine dialogue. Late Night with the Devil used three still title cards. AI as a layer, not a replacement.
Myth: “It is all stolen content.” Reality: true for some, no longer true for others. ElevenLabs Music, launched August 5, 2025, trains on licensed data via deals with Merlin (~30,000 indie labels) and Kobalt. Artists must opt in. Suno and Udio still face active major-label lawsuits over unlicensed training data.
Myth: “You can prompt a feature film.” Reality: no. The most credible 2026 AI-assisted feature, Critterz, has a budget around $30 million and uses a proprietary AI production system called Woven. It is described as “human-led, AI-assisted,” not prompt-and-go.
Myth: “Hollywood rejects it.” Reality: Hollywood is split, not unified. Pro-AI: Cameron, Aronofsky, the Russos. Cautiously open: Jackson, Nolan, Gerwig. Anti-AI: del Toro, Spielberg, Gunn, Rian Johnson, Seth Rogen.
Myth: “Real-time AI video is not here.” Reality: with caveats, it is. Google DeepMind’s Genie 3 generates interactive worlds in real time at 24 frames per second and 720p. Runway’s GWM-1 runs at the same frame rate. Standard text-to-video still takes 30 seconds to several minutes per clip.
Myth: “It costs nothing.” Reality: free tiers are real and useful. Serious work runs $10 to $30 per month per tool, or roughly $50 to $70 for a full pipeline. The top tier (Google AI Ultra for full Veo 3.1 plus Project Genie) is $249.99 per month.

Figure 8. Eight myths. Eight receipts. Each reality cell ties to a primary source verified in May 2026.



Sources & methodology

Every statistic in this report was checked against a primary source on May 17, 2026. Claims that did not survive verification are flagged in the body text, not buried — including two Christopher Nolan quotes from the original brief, the Netflix “50 AI films” claim, and the May 2026 dating of the Hollywood Reporter “Everyone in Hollywood” piece (which actually ran in May 2024).

Primary company sources: Google blog, Google DeepMind blog, Runway research blog, Runway newsroom, ByteDance Seed blog, Kuaishou IR, OpenAI Help Center, Black Forest Labs, Stability AI
Tier-one press: TechCrunch on Runway’s $5.3B Series E (Feb 10, 2026) · TechCrunch — Runway vs. Google (May 15, 2026) · Y.M.Cinema on Spielberg at SXSW (Mar 14, 2026) · Hollywood Reporter on Peter Jackson at Cannes · CNBC, Variety, Bloomberg, Axios, Deadline, Fast Company, NPR Fresh Air, Billboard, Music Business Worldwide
Benchmarks: Artificial Analysis Video Arena, VBench and VBench-2.0 (arXiv:2503.21755)
Government and legal: Senator Welch press release on the Blackburn-Welch letter (welch.senate.gov, Mar 17, 2026), Senator Blackburn press release, Bartz v. Anthropic court filings, Disney/Universal v. Midjourney (CourtListener docket 2:25-cv-05275), Getty v. Stability AI UK High Court ruling (Nov 4, 2025)
Academic and technical: Seedance 2.0 technical paper (arXiv:2604.14148), Google DeepMind Genie 2 and Genie 3 papers and blog posts, Runway GWM-1 research post (Dec 11, 2025)

This report is a snapshot. The field is moving fast enough that some figures will shift within months. Pricing, leaderboard rankings, and product availability change without notice. Treat this as a starting position for your own research, not a final answer.

Last reviewed: May 17, 2026.
