From Sora to Veo: The AI Video Model Timeline 2020-2026
The full AI video model timeline from early research to Veo 3, Kling 3, and Sora 2's retirement in April 2026. What each shift meant for creators and what's next.
The AI video model timeline from 2020 to 2026 covers roughly six years, but most of the meaningful work happened in the last three. This is the compressed version: what shipped, when it shipped, what broke before it, and what each milestone actually changed for people generating video for a living.
TL;DR
- Runway Gen-1 (early 2023) was the first model creators could actually use in production. Everything before it was research.
- Sora's February 2024 preview changed the expectation floor. It took until 2025 before any model shipped with comparable output.
- By mid-2026, three models (Veo 3, Kling 3.0, Seedance 2.0) can produce client-ready video. The era of "one model wins everything" is already over.
- Sora 2 was retired by OpenAI on April 26, 2026. The model that caused the biggest shift in expectations was the first major model to be shut down.
The years before it worked (2020-2022)
Generative video research started well before creators paid attention. Early approaches like VGAN and MoCoGAN produced short, low-resolution clips that were technically impressive and practically useless. You could generate a few frames of something that looked vaguely like motion, but the output degraded fast: faces morphed and objects dissolved.
The field was not standing still during this period. Latent diffusion models, which were quietly laying the groundwork for image generation, were also being explored for video. The key insight researchers were building toward was that video is just structured time, and if you could model the temporal dimension the same way diffusion models handled spatial information, coherent motion would follow.
It didn't follow easily. Temporal consistency turned out to be a hard problem. Training on video data required orders of magnitude more compute than training on images. The results in 2021 and 2022 were incremental. Text-conditional video generation existed, but the text barely controlled anything recognizable.
Runway Gen-1 and the first usable moment (early 2023)
Runway's Gen-1 model, released in early 2023, was the first system creators could run a production brief against and get something usable back. It had clear limits: resolution was capped, clips were short, and prompt adherence was inconsistent. But it could take a reference image and apply a style to a video clip. That narrow use case, style transfer applied to existing footage, found a real audience.
Editors used it on music videos. Motion designers used it to add texture to animation. The output looked artificial in ways that were obvious under scrutiny, but social media was tolerant of that in early 2023 in a way it would not be twelve months later.
The significance was not the technical ceiling. It was that the workflow existed. You could open a browser, type a description, upload a reference, and get back a clip you didn't have to shoot. That was new.
Pika 1.0 and the consumer breakthrough (late 2023)
Pika launched version 1.0 in late 2023 with a Discord-first approach that put text-to-video generation in front of hundreds of thousands of users who had no professional post-production background. The model was not stronger than Runway's. The interface was more approachable.
Pika 1.0 mattered for two reasons. First, it demonstrated that text-to-video had a consumer appetite, not just a professional one. Second, it made the limitations visible to a much wider audience. Every person who tried it ran into the same walls: hands deformed, text broke, anything longer than three seconds fell apart. The "AI video is almost there" narrative started circulating broadly in late 2023 because of Pika. It was accurate at three seconds and a lie at ten.
What creators learned from Pika wasn't how to make great video. It was how to scope an AI video brief: shoot for moments, not sequences.
Sora preview and the reset (February 2024)
OpenAI's Sora preview in February 2024 was the moment the AI video timeline broke cleanly into before and after. The demonstrations showed coherent motion sustained over 60 seconds, physical simulation that understood object persistence, and cinematic camera movement that previous models had not approached.
The preview was only that, a preview. No one outside OpenAI was generating with the model in February 2024. But the benchmark shifted. Every model that shipped after February 2024 got evaluated against what Sora had shown was theoretically possible. Runway, Pika, Stability AI and others all shipped updates in the following months that were partly a response to the new expectation floor Sora set.
The irony is that Sora was not widely accessible until much later, and by the time it was, other models had closed significant portions of the gap. The preview did more to accelerate the field than the actual product release did.
Veo 1 and enterprise credibility (mid-2024)
Google DeepMind's Veo 1 shipped in mid-2024 in limited access through VideoFX and later through the Vertex AI API. It was the first time a hyperscaler's video generation model was directly competitive with the research-preview demonstrations that had been circulating. The resolution, temporal coherence, and lighting quality were legitimately better than what was available in consumer tools.
Veo 1 did not change most creators' workflows immediately because access was restricted. It changed expectations again. If Google could ship something at that quality in limited access, the question became how quickly the technology would reach general availability. The answer turned out to be faster than most analysts predicted.
Kling launch and the cost pressure (mid-2024)
Kuaishou's Kling launched in mid-2024, initially in China before wider international access. It was the first model to challenge Western research labs on quality while pricing significantly below them. Kling could generate two-minute clips at 1080p, which was longer than anything else available at the time.
The response to Kling from Western labs was price movement. Within two quarters of Kling's international launch, the per-clip cost from competing platforms dropped across the board. The model forced an economic reality that pure research competition had not: if a Chinese lab could offer comparable output for a fraction of the price, the business models that assumed premium AI video would stay expensive were wrong.
For creators, Kling meant that regular iteration on AI video became affordable. Before Kling, running twenty variants of a prompt to find the right one was expensive enough to discourage it. After, it wasn't.
Sora 2 release and the full product (early 2025)
OpenAI released Sora 2 in early 2025 with the quality that the February 2024 preview had implied. It came with full public access, multiple resolution options, direct integration into ChatGPT Plus, and a storyboard interface that let you sequence clips with scene descriptions. The reception was mixed.
The model was excellent. The problem was timing. By early 2025, Veo (now at version 1.5) and Kling 2.0 had both shipped improvements, and the gap between them and Sora 2 was smaller than anyone outside OpenAI expected after a year of hype. Users who had anticipated Sora 2 being a generation ahead of the market found a model that was good, competitive, and not dominant.
Sora 2's commercial integrations with advertising agencies and production companies were real. Several major brands ran campaigns using Sora 2 in 2025. The storyboard workflow in particular had no clean equivalent in other models. But the "Sora wins everything" outcome that the February 2024 preview had suggested never materialized.
Veo 3, Kling 3, Seedance 2.0 and the three-way split (2025-2026)
The period from late 2025 through early 2026 produced three models that are genuinely different from one another rather than incrementally better versions of the same thing.
Veo 3 (Google DeepMind) topped the field on cinematic quality. Native 4K at 60fps, lighting that held up under comparison to real camera footage, and temporal consistency that earlier versions had achieved only sporadically. For brand films and any shot where the brief includes words like "cinematic" or "premium," Veo 3 became the clear pick.
Kling 3.0 (Kuaishou) won on value and practical throughput. Three-minute max clip length, native 4K at 30fps, and a price per clip that was roughly a third of Veo 3.1. The model's weakness in fluid organic motion (wing beats, water, cloth) was consistent enough to plan around. Teams doing high-volume ad iteration defaulted to Kling 3.0 and reserved Veo for hero shots.
Seedance 2.0 (ByteDance) was the most distinct of the three. Multi-reference conditioning meant you could feed it a reference image of a product and a reference image of a location and get a composed scene that integrated both. For product advertising and ecommerce video, Seedance 2.0 solved a workflow problem the other two models didn't address well.
We tested all three on the same prompt on the 8frame canvas in May 2026. The bee-through-grass prompt we used for the best AI video generator 2026 comparison shows the three-way split clearly: Veo 3 wins on lighting, Kling 3.0 wins on speed and value, Seedance wins on physical accuracy. None of them wins on all three simultaneously.
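The three-way split can be written down as a simple routing rule. The sketch below is illustrative only: the Brief fields, the pick_model function, and the order of the checks are hypothetical, chosen to mirror the strengths described above (multi-reference work to Seedance 2.0, cinematic hero shots to Veo 3, everything else to Kling 3.0). It does not call any real model API.

```python
from dataclasses import dataclass


@dataclass
class Brief:
    """Hypothetical description of a video brief; field names are assumptions."""
    cinematic: bool = False      # brief asks for a "cinematic" or "premium" look
    reference_images: int = 0    # number of product/location references supplied
    high_volume: bool = False    # many variants needed, cost per clip matters


def pick_model(brief: Brief) -> str:
    """Return a model name for a brief, following the split described in the text."""
    # Multi-reference composition is the case the text assigns to Seedance 2.0.
    if brief.reference_images >= 2:
        return "Seedance 2.0"
    # Hero shots where quality outweighs cost go to Veo 3.
    if brief.cinematic and not brief.high_volume:
        return "Veo 3"
    # High-volume iteration and the general default go to Kling 3.0.
    return "Kling 3.0"


if __name__ == "__main__":
    print(pick_model(Brief(cinematic=True)))
    print(pick_model(Brief(reference_images=2)))
    print(pick_model(Brief(high_volume=True)))
```

A real team would likely tune the order of these checks to its own briefs. The point is only that no single branch covers every case.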
Sora 2 shutdown (April 26, 2026)
On April 26, 2026, OpenAI retired Sora 2. The announcement gave users a short migration window and recommended the ChatGPT video generation pipeline as the forward path. The shutdown was the first time a major AI video model had been retired at scale, and it forced teams who had built production workflows around Sora 2 to migrate fast.
The most common landing spots were Veo 3 for cinematic work and Kling 3.0 for everything else. The storyboard interface that Sora 2 had pioneered had no direct equivalent in either, which created a real workflow gap for teams that had relied on it for sequencing. Some migrated to manual sequencing on multi-model canvases like 8frame. Others rebuilt the storyboard logic in their own tooling.
For a look at the current model landscape after Sora 2's retirement, the Veo 3 vs Sora 2 vs Kling 3 comparison covers the full breakdown. If you're specifically looking for where to take a Sora 2 workflow, Sora 2 alternatives covers the migration options.
What the cost curve actually looked like
The price per 5-second AI video clip dropped by roughly 80% between early 2024 and mid-2026. In early 2024, generating ten variants of a single prompt would cost most individual creators $20 to $40 in platform credits, which was enough to think twice about it. By June 2026, the same ten variants cost $3 to $8 depending on model.
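As a quick check on the "roughly 80%" figure, the arithmetic can be run directly on the ranges quoted above. This is a minimal sketch: the only inputs are the two ten-variant price ranges from the text, and the helper names (midpoint, percent_drop, per_clip) are made up for illustration.

```python
# Cost of ten variants of one prompt, in USD, as quoted in the text.
EARLY_2024 = (20.0, 40.0)
MID_2026 = (3.0, 8.0)
VARIANTS = 10


def midpoint(price_range):
    """Middle of a (low, high) price range."""
    low, high = price_range
    return (low + high) / 2


def percent_drop(before, after):
    """Percentage reduction from `before` to `after`."""
    return (before - after) / before * 100


def per_clip(price_range, variants=VARIANTS):
    """Convert a ten-variant price range into a per-clip range."""
    low, high = price_range
    return (low / variants, high / variants)


low_drop = percent_drop(EARLY_2024[0], MID_2026[0])    # $20 -> $3
high_drop = percent_drop(EARLY_2024[1], MID_2026[1])   # $40 -> $8
mid_drop = percent_drop(midpoint(EARLY_2024), midpoint(MID_2026))

if __name__ == "__main__":
    print(f"low end drop:  {low_drop:.1f}%")
    print(f"high end drop: {high_drop:.1f}%")
    print(f"midpoint drop: {mid_drop:.1f}%")
    print("per clip, early 2024:", per_clip(EARLY_2024))
    print("per clip, mid-2026:  ", per_clip(MID_2026))
```

Comparing like with like (low end to low end, high end to high end), the drop lands between 80% and 85%, which is consistent with the headline figure. Per clip, that is a move from about $2 to $4 down to about $0.30 to $0.80.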
The cost reduction came from three overlapping factors: model efficiency improvements, competition from lower-cost Chinese labs, and infrastructure amortization as providers scaled. The quality ceiling rose over the same period. Falling prices alongside rising quality is unusual in hardware markets and almost unprecedented in software.
The practical effect for creators was a shift from "generate one and commit" to "generate ten and pick." Teams that adapted their workflows to this economic reality got materially better output because they could afford to treat generation as iteration rather than production.
What late 2026 looks like from here
Three signals are worth watching.
Native audio integration is the most significant near-term capability gap. Veo 3 has audio output. Sora 2 had early audio work before it was retired. The other major models don't ship audio natively, and adding audio in post adds friction. The next version of Kling or Seedance with coherent audio output will close a workflow gap that matters for social and ad production.
Longer clips with maintained coherence are the second frontier. Three-minute Kling clips are technically possible, but coherence drops meaningfully after the first minute. If any model achieves five minutes of consistent character, lighting, and motion in 2026, the use cases that open up (short film production, explainer video, full ad spots without editing) are substantially larger than the current market.
Pricing may have reached a floor. The cost reductions from 2024 to 2026 were steep. The models that can offer further reductions without quality degradation are limited. Expect pricing to stabilize and competition to shift toward capabilities, workflow features, and enterprise integrations rather than per-clip cost.
FAQ
What was the first AI video model creators could actually use?
Runway Gen-1 in early 2023 was the first widely accessible model with a workflow that matched a real production task: style transfer on existing footage. Everything before it was research output or technical demonstration without a practical use case most creators could apply.
Why was the Sora February 2024 preview so significant if the model wasn't accessible?
The preview reset the expectation floor for the entire field. Every model that shipped after February 2024 got measured against what Sora had demonstrated was achievable. That benchmark pressure accelerated development at Runway, Pika, Google, and Kuaishou more than a competitive product release would have, because it forced teams to respond to a public quality reference rather than just each other's shipping history.
What should Sora 2 users do after the April 2026 shutdown?
The main migration paths are Veo 3 for cinematic and brand work, and Kling 3.0 for high-volume iteration and ad production. The storyboard sequencing workflow Sora 2 offered has no direct equivalent, but the 8frame canvas lets you sequence multi-model outputs manually. The Sora 2 alternatives guide covers the full migration options with model-specific comparisons.
You can run any of the current generation models, Veo 3, Kling 3.0, Seedance 2.0, and the rest, on a single canvas and compare outputs directly. The 8frame workflows library includes templates for the most common use cases: cinematic brand film, product demo, UGC iteration, and character-consistent sequences. Run one prompt against all of them and you'll see the current state of the timeline in about ten minutes.