Image-to-Video Workflow Chain: From Still to Motion

Why image-to-video beats text-to-video for production, and how to run the 4-step chain (Nano Banana, Seedance, Kling, export) to get 6 variants for $5.40.

The image-to-video workflow chain is the most reliable path from a brief to a production-ready clip. You start with a controlled still, then add motion to that still. That order matters. If you skip the still and go straight to text-to-video, the video model has to invent everything: composition, product color, lighting, depth. It mostly gets it wrong. When you start from an approved image, the video model only has to add motion. It's a much narrower problem, and the outputs are consistently better.

Why image-to-video beats text-to-video for production

Text-to-video models are impressive in demos. In production, they create problems.

When you prompt a video model with text alone, the model has to make decisions about every visual element: the exact shade of the product, where the label sits, how the light falls, what's in the background. It makes reasonable guesses. Reasonable guesses are not your brand.

Image-to-video eliminates those decisions. You supply the approved still with correct product color, correct label placement, correct environment. The video model's only job is to put that still into motion. The output inherits your visual choices instead of overwriting them.

This is why the product hero shot pipeline, workflow 1 in the 10 AI video workflows every brand should have, runs Nano Banana Pro into Seedance rather than going text-to-Seedance direct. The intermediate still is not overhead. It's the control layer that makes everything downstream predictable.

For creators specifically, the economics compound. A master still you generate once can produce a 9:16 social clip, a 16:9 horizontal ad, a slow push-in, a product rotation, and a reveal sequence, all from the same reference frame. Each variant adds motion direction. None of them reinvent the product.

The 4-step image-to-video workflow chain

Step 1. Master still via Nano Banana Pro

Nano Banana Pro generates the reference image. Parameters that matter here:

Nano Banana Pro averages 90 seconds per 4K still on 8frame. That is slow for a single image, but it's the one step that defines everything else.

What you have after step 1: One approved master still, 4K, with your actual product color, label, and composition locked in.

Step 2. Motion via Seedance 2.0

Seedance 2.0 takes the Nano Banana still as its primary reference and animates it. Multi-reference conditioning keeps the product visually stable through the motion pass. This is the step that separates clean product video from the smeared, color-drifted output you get when you try to skip the still.

Prompts that work for this step:

Keep Seedance prompts under 80 words. Over-specified prompts produce motion that follows instruction order instead of physical logic, and the output stiffens. One action, one framing note, one aspect ratio. That's the formula. The Seedance 2.0 prompts for product demos guide covers the full formula with 8 tested examples.
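The prompt rules above are easy to check before a job is submitted. Here is a minimal sketch in Python, assuming prompts are plain strings. The function name `check_motion_prompt` and the sample prompt are illustrative only; they are not part of Seedance or 8frame.

```python
import re

MAX_WORDS = 80  # limit taken from the guideline above
ASPECT_RATIO = re.compile(r"\b\d{1,2}:\d{1,2}\b")  # matches 16:9, 9:16, 1:1


def check_motion_prompt(prompt: str) -> list[str]:
    """Return a list of problems; an empty list means the prompt passes."""
    problems = []
    word_count = len(prompt.split())
    if word_count >= MAX_WORDS:
        problems.append(f"{word_count} words; keep it under {MAX_WORDS}")
    ratios = set(ASPECT_RATIO.findall(prompt))
    if len(ratios) != 1:
        problems.append(f"expected exactly one aspect ratio, found {len(ratios)}")
    return problems


# Hypothetical prompt: one action, one framing note, one aspect ratio.
sample = "Slow push-in on the bottle, centered framing, 16:9."
print(check_motion_prompt(sample))
```

A check like this only catches length and aspect-ratio slips. Whether the prompt describes a single action still needs a human read.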

What you have after step 2: A 5-8 second hero clip that inherits all visual decisions from the master still.

Step 3. Polish and variant generation via Kling 3.0

Once the Seedance hero clip clears review, Kling 3.0 handles two jobs: polish passes and variant generation.

For polish, Kling runs the same reference still with a tighter or wider motion direction to produce a secondary cut. Kling generates in ~60 seconds versus Seedance's 2 minutes, and costs roughly half as much per clip. It's the right model for volume once the reference frame is established.

For variants, Kling produces different motion directions from the same master still. From one 4K product still you can run:

All 4 variants run in parallel on 8frame. With Kling averaging 60 seconds per clip, 4 variants take about 60-75 seconds wall-clock time.
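The wall-clock claim follows from how parallel jobs add up: the batch takes as long as its slowest job, not the sum of all jobs. Here is a rough estimator, assuming jobs start in list order on a fixed number of parallel slots. The slot model and the sample durations are assumptions for illustration, not 8frame's actual scheduler.

```python
import heapq


def estimate_wall_clock(durations_sec: list[float], max_parallel: int) -> float:
    """Estimate wall-clock time when jobs start in list order on `max_parallel` slots."""
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    # Each heap entry is the time at which a slot becomes free.
    slots = [0.0] * min(max_parallel, max(len(durations_sec), 1))
    heapq.heapify(slots)
    for duration in durations_sec:
        free_at = heapq.heappop(slots)  # earliest available slot
        heapq.heappush(slots, free_at + duration)
    return max(slots)


# Hypothetical per-variant durations in the 60-75 second range.
variants = [60, 65, 70, 75]
print(estimate_wall_clock(variants, max_parallel=4))  # all four at once
print(estimate_wall_clock(variants, max_parallel=1))  # one after another
```

With four slots the estimate equals the slowest job. With one slot it equals the sum, which is the gap parallel execution closes.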

What you have after step 3: Multiple motion variants, all anchored to the same approved still.

Step 4. Export

The export node handles file naming, codec, and delivery format. Standard configuration for a creator workflow:

The export step is where you split the asset pack into platform-ready files. A well-named export folder is the difference between a clean handoff and 30 minutes of file archaeology.
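One way to keep the export folder readable is to generate filenames from the same fields every time. The naming scheme below (project, motion, aspect ratio, version) is an assumption for illustration, not the export node's built-in format.

```python
import re


def slugify(text: str) -> str:
    """Lowercase the text and collapse anything non-alphanumeric into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def export_filename(project: str, motion: str, aspect_ratio: str,
                    version: int, ext: str = "mp4") -> str:
    """Build a predictable filename such as skincare-bottle_push-in_9x16_v01.mp4."""
    ratio = aspect_ratio.replace(":", "x")  # colons are not safe in filenames on every OS
    return f"{slugify(project)}_{slugify(motion)}_{ratio}_v{version:02d}.{ext}"


print(export_filename("Skincare Bottle", "push-in", "9:16", 1))
```

Putting the aspect ratio in the name lets you sort a folder by platform without opening a single file.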

Walkthrough: 4K product photo into 6 variants for $5.40

This is a run we did on 8frame with a skincare bottle brief. Starting point: one product photo from a client, white background, already approved by the brand.

What we built:

  1. Nano Banana Pro still (skipped, used the client photo as the reference input directly)
  2. Seedance 2.0 hero clip: slow push-in, 16:9, 1080p. Result: 5.2 seconds, label held, no color shift. Generation: 2 min 11 sec.
  3. Kling 3.0 variants, all from the same reference still:
    • 9:16 vertical push-in (Reels/TikTok). 62 sec.
    • 16:9 slow rotation. 64 sec.
    • 16:9 overhead product orbit. 67 sec.
    • 9:16 close-up with subtle zoom-out (Stories format). 61 sec.
    • 16:9 wider shot, camera drift left. 59 sec.

Total credits spent:

| Step | Model | Cost |
| Hero clip | Seedance 2.0 | $1.20 |
| Variant 1 (9:16 push) | Kling 3.0 | $0.38 |
| Variant 2 (16:9 rotation) | Kling 3.0 | $0.38 |
| Variant 3 (16:9 orbit) | Kling 3.0 | $0.38 |
| Variant 4 (9:16 close-up) | Kling 3.0 | $0.38 |
| Variant 5 (16:9 drift) | Kling 3.0 | $0.38 |
| Total | | $5.40 |

6 clips, 2 aspect ratios, 5 motion directions. Wall-clock time was about 7 minutes. The Seedance hero ran first while the 5 Kling variants queued, then all 5 Kling jobs ran in parallel.
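A quick tally of the listed figures is a useful sanity check. The sketch below sums only the itemized rows and the per-clip generation times from this run. The itemized rows come to $3.10, which suggests the $5.40 total includes charges not broken out in the table (an assumption; the table doesn't say). Generation time alone is about 3 min 18 sec with the variants in parallel and 7 min 24 sec back to back. The reported 7 minutes sits closer to the back-to-back figure, so queueing and upload time likely account for the gap. That is an inference, not something recorded in the run above.

```python
from decimal import Decimal

# Itemized costs copied from the table above.
line_items = {
    "Hero clip (Seedance 2.0)": Decimal("1.20"),
    "Variant 1 (9:16 push)": Decimal("0.38"),
    "Variant 2 (16:9 rotation)": Decimal("0.38"),
    "Variant 3 (16:9 orbit)": Decimal("0.38"),
    "Variant 4 (9:16 close-up)": Decimal("0.38"),
    "Variant 5 (16:9 drift)": Decimal("0.38"),
}
itemized_total = sum(line_items.values(), Decimal("0"))

# Generation times from the walkthrough, in seconds.
hero_sec = 2 * 60 + 11
variant_secs = [62, 64, 67, 61, 59]

parallel_sec = hero_sec + max(variant_secs)  # hero first, then variants in parallel
serial_sec = hero_sec + sum(variant_secs)    # every job back to back


def as_min_sec(total_sec: int) -> str:
    minutes, seconds = divmod(total_sec, 60)
    return f"{minutes} min {seconds:02d} sec"


print(f"Itemized total: ${itemized_total}")
print(f"Parallel generation time: {as_min_sec(parallel_sec)}")
print(f"Serial generation time: {as_min_sec(serial_sec)}")
```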

Pitfalls

Fidelity loss at the still-to-motion handoff

The most common failure. The video model introduces color shift, label smear, or geometric distortion when converting the still into motion. It happens most when the reference image has high-contrast text or detailed pattern work, like a label with fine print or a fabric with a tight weave.

Fix: attach the reference still at the highest resolution available, and use Seedance rather than Kling for the hero clip. Seedance's multi-reference conditioning handles fine detail better. Kling is for volume; Seedance is for fidelity. Use both, but in the right positions.

Motion direction drift

The model starts the animation in one direction and gradually changes course mid-clip. A push-in becomes a slight pan. A rotation gains an unexpected tilt. This is more common on longer prompts and on clips over 6 seconds.

Fix: shorten the clip duration to 5-6 seconds and keep the motion prompt to one clear direction. If you need a direction change mid-clip (push-in that holds, then pulls back), produce two separate clips and cut them in post. One motion direction per clip.
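The one-direction-per-clip rule can be applied mechanically when you plan a shot. Here is a small sketch, assuming a shot is described as a list of (motion, duration) phases. `ClipSpec` and `plan_clips` are hypothetical names, and the 6-second cap mirrors the guidance above.

```python
from dataclasses import dataclass

MAX_CLIP_SEC = 6.0  # upper bound suggested in the fix above


@dataclass(frozen=True)
class ClipSpec:
    motion: str          # exactly one motion direction per clip
    duration_sec: float


def plan_clips(phases: list[tuple[str, float]]) -> list[ClipSpec]:
    """Turn each motion phase into its own clip, capping duration at MAX_CLIP_SEC."""
    return [ClipSpec(motion, min(duration, MAX_CLIP_SEC)) for motion, duration in phases]


# A push-in followed by a pull-back becomes two clips to cut together in post.
for clip in plan_clips([("push-in", 5.0), ("pull-back", 8.0)]):
    print(clip)
```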

Scale shifts

The product appears to change size during the clip. Most visible on rotation and orbit motions. The cause is the model treating the still as a texture mapped onto an inferred 3D object, and the inferred geometry doesn't match the actual product proportions exactly.

Fix: use a reference still with neutral depth cues. High-contrast shadows, dramatic perspective angles, and fisheye-style product shots give the model more geometry to guess at and more surface area to get wrong. Studio flat-light stills with minimal shadow give it less to work with and fewer places to drift.

FAQ

Can I use a client's existing product photo instead of generating with Nano Banana Pro?

Yes. Step 1 of the chain is about establishing a clean reference still. If you already have an approved product photo, use it. Nano Banana Pro is for cases where you're generating the still from scratch. A real product photo at 4K or higher will produce better results than a generated one in most cases.

Which motion directions work best for Seedance 2.0 on product stills?

Push-in, pull-back, and slow rotation hold product fidelity most reliably. Orbit (camera moving around the subject) works but is more prone to scale shifts at the edges. Full 360-degree rotation in a single clip rarely works cleanly. 90-180 degrees is the practical range before geometry artifacts appear.

How do I keep the 6 variants visually consistent when Kling generates each one independently?

The consistency comes from the reference still, not from model memory. Every Kling variant in this chain uses the same Nano Banana or client still as its reference input. The visual anchor is shared. The motion direction is the only variable. Kling doesn't need to see the other variants to stay consistent because the still is doing that work.
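In data terms, every variant job carries the same reference and differs only in framing and motion. Here is a minimal sketch using the five walkthrough variants. `VariantJob`, `build_variants`, and the `master_still.png` path are placeholders, not 8frame or Kling API names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantJob:
    reference_still: str  # shared visual anchor
    aspect_ratio: str
    motion: str           # the only thing that changes between variants


def build_variants(reference_still: str,
                   variants: list[tuple[str, str]]) -> list[VariantJob]:
    """Create one job per (aspect_ratio, motion) pair, all pointing at the same still."""
    return [VariantJob(reference_still, ratio, motion) for ratio, motion in variants]


WALKTHROUGH_VARIANTS = [
    ("9:16", "vertical push-in"),
    ("16:9", "slow rotation"),
    ("16:9", "overhead product orbit"),
    ("9:16", "close-up with subtle zoom-out"),
    ("16:9", "wider shot, camera drift left"),
]

jobs = build_variants("master_still.png", WALKTHROUGH_VARIANTS)
# Every job shares one reference, so there is exactly one distinct still.
print(len({job.reference_still for job in jobs}))
```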


Run this workflow with your own product photos using the image-to-video chain template at 8frame workflows.
