Image-to-Video Workflow Chain: From Still to Motion
Why image-to-video beats text-to-video for production, and how to run the 4-step chain (Nano Banana, Seedance, Kling, export) to produce 6 variants for $5.40.
The image-to-video workflow chain is the most reliable path from a brief to a production-ready clip. You start with a controlled still, then add motion to that still. That order matters. Skipping the still and going straight to text-to-video means the video model has to invent everything: composition, product color, lighting, depth. It mostly gets it wrong. Starting from an approved image, the video model only has to add motion. It's a much narrower problem, and the outputs are consistently better.
TL;DR
- Start from a still, not from text. Image-to-video produces fewer fidelity failures than text-to-video for product and brand content.
- The 4-step chain: Nano Banana Pro (master still) → Seedance 2.0 (primary motion) → Kling 3.0 (polish/variants) → export node.
- Turning one 4K product photo into 6 video variants in a single workflow run cost $5.40 total.
- Three pitfalls to plan around: fidelity loss at the still-to-motion handoff, motion direction drift, and scale shifts mid-clip.
Why image-to-video beats text-to-video for production
Text-to-video models are impressive in demos. In production, they create problems.
When you prompt a video model with text alone, the model has to make decisions about every visual element: the exact shade of the product, where the label sits, how the light falls, what's in the background. It makes reasonable guesses. Reasonable guesses are not your brand.
Image-to-video eliminates those decisions. You supply the approved still with correct product color, correct label placement, correct environment. The video model's only job is to put that still into motion. The output inherits your visual choices instead of overwriting them.
This is why the product hero shot pipeline, workflow 1 in the 10 AI video workflows every brand should have, runs Nano Banana Pro into Seedance rather than going text-to-Seedance direct. The intermediate still is not overhead. It's the control layer that makes everything downstream predictable.
For creators specifically, the economics compound. A master still you generate once can produce a 9:16 social clip, a 16:9 horizontal ad, a slow push-in, a product rotation, and a reveal sequence, all from the same reference frame. Each variant adds motion direction. None of them reinvent the product.
The 4-step image-to-video workflow chain
Step 1. Master still via Nano Banana Pro
Nano Banana Pro generates the reference image. Parameters that matter here:
- Background: white or gradient studio for product isolation, or a specific environment if the scene context is part of the brief.
- Lighting: 3-point product lighting for packshots; match the campaign lighting reference for lifestyle.
- Resolution: 4K output. Downstream video models use this as a reference, so resolution at the still stage protects sharpness through the chain.
- Aspect ratio: Match your intended video output (16:9 for horizontal, 9:16 for vertical). Seedance composes motion logic to the frame you give it.
Nano Banana Pro averages 90 seconds per 4K still on 8frame. It is not the fastest step in the chain, but it's the one step that defines everything else.
What you have after step 1: One approved master still, 4K, with your actual product color, label, and composition locked in.
Step 2. Motion via Seedance 2.0
Seedance 2.0 takes the Nano Banana still as its primary reference and animates it. Multi-reference conditioning keeps the product visually stable through the motion pass. This is the step that separates clean product video from the smeared, color-drifted output you get when you try to skip the still.
Prompts that work for this step:
- Slow camera push-in toward the product center. Subtle ambient light. 16:9.
  (Produced a clean 5-second push with no label smear. Generation: 2 min 08 sec.)
- Gentle product rotation, 90 degrees clockwise. Studio white background. 16:9.
  (Rotation held product geometry. Label stayed readable throughout. Generation: 2 min 15 sec.)
- Camera orbits left around product at mid-height. Shallow depth of field on foreground. 16:9.
  (Orbit introduced a slight parallax on the background. Product color held. Generation: 2 min 22 sec.)
Keep Seedance prompts under 80 words. Over-specified prompts produce motion that follows instruction order instead of physical logic, and the output stiffens. One action, one framing note, one aspect ratio. That's the formula. The Seedance 2.0 prompts for product demos guide covers the full formula with 8 tested examples.
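The formula above (one action, one framing note, one aspect ratio, under 80 words) can be sketched as a small prompt builder. This is a minimal illustration, not part of 8frame or Seedance: `MotionPrompt` and `MAX_PROMPT_WORDS` are hypothetical names, and the builder only assembles and checks the text you would paste into the motion step.

```python
from dataclasses import dataclass

# Word limit taken from the guideline above; the constant name is hypothetical.
MAX_PROMPT_WORDS = 80


@dataclass(frozen=True)
class MotionPrompt:
    """One action, one framing note, one aspect ratio."""

    action: str
    framing: str
    aspect_ratio: str

    def render(self) -> str:
        # Render each part as its own short sentence, skipping empty parts.
        parts = [self.action, self.framing, self.aspect_ratio]
        text = " ".join(p.strip().rstrip(".") + "." for p in parts if p.strip())

        # Enforce "under 80 words" before the prompt is ever submitted.
        word_count = len(text.split())
        if word_count >= MAX_PROMPT_WORDS:
            raise ValueError(
                f"prompt has {word_count} words; keep it under {MAX_PROMPT_WORDS}"
            )
        return text


prompt = MotionPrompt(
    action="Slow camera push-in toward the product center",
    framing="Subtle ambient light",
    aspect_ratio="16:9",
)
print(prompt.render())
```

Keeping the three fields separate makes it harder to slip a second action into the same prompt, which is the failure the word limit is guarding against.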
What you have after step 2: A 5-8 second hero clip that inherits all visual decisions from the master still.
Step 3. Polish and variant generation via Kling 3.0
Once the Seedance hero clip clears review, Kling 3.0 handles two jobs: polish passes and variant generation.
For polish, Kling runs the same reference still with a tighter or wider motion direction to produce a secondary cut. Kling generates in about 60 seconds versus roughly 2 minutes for Seedance, and costs about a third as much per clip ($0.38 versus $1.20 in the walkthrough below). It's the right model for volume once the reference frame is established.
For variants, Kling produces different motion directions from the same master still. From one 4K product still you can run:
- 9:16 vertical version for Reels and TikTok
- 16:9 horizontal version for YouTube and paid display
- Close-up crop with pull-back for a second cut
- Slower motion for a premium/luxury feel variant
All 4 variants run in parallel on 8frame. With Kling averaging 60 seconds per clip, 4 variants take about 60-75 seconds wall-clock time.
What you have after step 3: Multiple motion variants, all anchored to the same approved still.
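The parallel timing in step 3 follows from a simple rule: when every variant gets its own slot, wall-clock time is set by the slowest job, not the sum. The sketch below models that with a hypothetical helper, `parallel_wall_clock`. It assumes jobs are dispatched in order to whichever slot frees up first, and it ignores queueing and upload overhead, so real runs will come in somewhat above the estimate.

```python
import heapq
from typing import Optional, Sequence


def parallel_wall_clock(
    durations_sec: Sequence[float], max_parallel: Optional[int] = None
) -> float:
    """Estimate wall-clock seconds for jobs run on a fixed number of parallel slots.

    max_parallel=None means every job gets its own slot (fully parallel).
    """
    if not durations_sec:
        return 0.0
    slots = len(durations_sec) if max_parallel is None else max_parallel
    if slots < 1:
        raise ValueError("max_parallel must be at least 1")

    # Each heap entry is the time at which one slot becomes free.
    free_at = [0.0] * min(slots, len(durations_sec))
    heapq.heapify(free_at)
    for duration in durations_sec:
        start = heapq.heappop(free_at)  # earliest available slot
        heapq.heappush(free_at, start + duration)
    return max(free_at)


four_variants = [60, 60, 60, 60]  # Kling average from the text above
print(parallel_wall_clock(four_variants))                  # fully parallel
print(parallel_wall_clock(four_variants, max_parallel=2))  # two slots only
```

With four 60-second jobs the fully parallel estimate is 60 seconds, which lines up with the 60-75 seconds quoted above once some overhead is added. If a plan caps concurrency, the same function shows how quickly the total grows.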
Step 4. Export
The export node handles file naming, codec, and delivery format. Standard configuration for a creator workflow:
- Video: H.264, 1080p (or 4K if the account is on a plan that outputs 4K)
- Stills: PNG from the Nano Banana master
- Naming: [project]-[aspect-ratio]-[motion-direction]-[date] (e.g., skincare-launch-16x9-push-in-20260603.mp4)
The export step is where you split the asset pack into platform-ready files. A well-named export folder is the difference between a clean handoff and 30 minutes of file archaeology.
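If you rename files yourself rather than relying on the export node, the naming pattern above is easy to script. This is a minimal sketch: `export_filename` is a hypothetical helper, and the choice to lowercase names and join words with hyphens is an assumption inferred from the example filename.

```python
from datetime import date


def export_filename(
    project: str,
    aspect_ratio: str,
    motion_direction: str,
    on: date,
    ext: str = "mp4",
) -> str:
    """Build [project]-[aspect-ratio]-[motion-direction]-[date].ext."""

    def slug(text: str) -> str:
        # Lowercase and join whitespace-separated words with hyphens.
        return "-".join(text.lower().split())

    # "16:9" becomes "16x9" because colons are unsafe in filenames on some systems.
    ratio = aspect_ratio.replace(":", "x")
    return f"{slug(project)}-{ratio}-{slug(motion_direction)}-{on:%Y%m%d}.{ext}"


print(export_filename("Skincare Launch", "16:9", "push-in", date(2026, 6, 3)))
```

Passing the date in explicitly, instead of reading today's date inside the function, keeps re-exports of the same batch under the same names.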
Walkthrough: 4K product photo into 6 variants for $5.40
This is a run we did on 8frame with a skincare bottle brief. Starting point: one product photo from a client, white background, already approved by the brand.
What we built:
- Nano Banana Pro still (skipped, used the client photo as the reference input directly)
- Seedance 2.0 hero clip: slow push-in, 16:9, 1080p. Result: 5.2 seconds, label held, no color shift. Generation: 2 min 11 sec.
- Kling 3.0 variants, all from the same reference still:
  - 9:16 vertical push-in (Reels/TikTok). 62 sec.
  - 16:9 slow rotation. 64 sec.
  - 16:9 overhead product orbit. 67 sec.
  - 9:16 close-up with subtle zoom-out (Stories format). 61 sec.
  - 16:9 wider shot, camera drift left. 59 sec.
Total credits spent:
| Step | Model | Cost |
|---|---|---|
| Hero clip | Seedance 2.0 | $1.20 |
| Variant 1 (9:16 push) | Kling 3.0 | $0.38 |
| Variant 2 (16:9 rotation) | Kling 3.0 | $0.38 |
| Variant 3 (16:9 orbit) | Kling 3.0 | $0.38 |
| Variant 4 (9:16 close-up) | Kling 3.0 | $0.38 |
| Variant 5 (16:9 drift) | Kling 3.0 | $0.38 |
| Total | | $5.40 |
6 clips, 2 aspect ratios (16:9 and 9:16), 5 motion directions (push-in, rotation, orbit, zoom-out, drift). Wall-clock time was about 7 minutes. The Seedance hero ran first while the 5 Kling variants queued, then all 5 Kling jobs ran in parallel.
Pitfalls
Fidelity loss at the still-to-motion handoff
The most common failure. The video model introduces color shift, label smear, or geometric distortion when converting the still into motion. It happens most when the reference image has high-contrast text or detailed pattern work, like a label with fine print or a fabric with a tight weave.
Fix: attach the reference still at the highest resolution available, and use Seedance rather than Kling for the hero clip. Seedance's multi-reference conditioning handles fine detail better. Kling is for volume; Seedance is for fidelity. Use both, but in the right positions.
Motion direction drift
The model starts the animation in one direction and gradually changes course mid-clip. A push-in becomes a slight pan. A rotation gains an unexpected tilt. This is more common on longer prompts and on clips over 6 seconds.
Fix: shorten the clip duration to 5-6 seconds and keep the motion prompt to one clear direction. If you need a direction change mid-clip (push-in that holds, then pulls back), produce two separate clips and cut them in post. One motion direction per clip.
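The two rules in this fix (one motion direction per clip, 6 seconds at most) can be checked before a job is submitted. The sketch below is a hypothetical pre-flight check, not an 8frame feature: `drift_risks` and `MAX_CLIP_SECONDS` are illustrative names, and it assumes you list the motion directions yourself rather than parsing them out of a prompt.

```python
from typing import List

# Upper bound from the fix above; the constant name is hypothetical.
MAX_CLIP_SECONDS = 6.0


def drift_risks(motions: List[str], seconds: float) -> List[str]:
    """Return a list of drift warnings for a planned clip (empty means OK)."""
    risks = []
    if len(motions) > 1:
        risks.append(
            f"{len(motions)} motion directions: split into one clip per direction"
        )
    if seconds > MAX_CLIP_SECONDS:
        risks.append(
            f"clip is {seconds:g}s: shorten to {MAX_CLIP_SECONDS:g}s or less"
        )
    return risks


# A push-in followed by a pull-back in one 8-second clip trips both rules.
for warning in drift_risks(["push-in", "pull-back"], 8.0):
    print(warning)

# The same move planned as two short clips passes cleanly.
print(drift_risks(["push-in"], 5.0), drift_risks(["pull-back"], 5.0))
```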
Scale shifts
The product appears to change size during the clip. Most visible on rotation and orbit motions. The cause is the model treating the still as a texture mapped onto an inferred 3D object, and the inferred geometry doesn't match the actual product proportions exactly.
Fix: use a reference still with neutral depth cues. High-contrast shadows, dramatic perspective angles, and fisheye-style product shots give the model more geometry to guess at and more surface area to get wrong. Studio flat-light stills with minimal shadow give it less to work with and fewer places to drift.
FAQ
Can I use a client's existing product photo instead of generating with Nano Banana Pro?
Yes. Step 1 of the chain is about establishing a clean reference still. If you already have an approved product photo, use it. Nano Banana Pro is for cases where you're generating the still from scratch. A real product photo at 4K or higher will produce better results than a generated one in most cases.
Which motion directions work best for Seedance 2.0 on product stills?
Push-in, pull-back, and slow rotation hold product fidelity most reliably. Orbit (camera moving around the subject) works but is more prone to scale shifts at the edges. Full 360-degree rotation in a single clip rarely works cleanly; 90-180 degrees is the practical range before geometry artifacts appear.
How do I keep the 6 variants visually consistent when Kling generates each one independently?
The consistency comes from the reference still, not from model memory. Every Kling variant in this chain uses the same Nano Banana or client still as its reference input. The visual anchor is shared. The motion direction is the only variable. Kling doesn't need to see the other variants to stay consistent because the still is doing that work.
Run this workflow with your own product photos using the image-to-video chain template at 8frame workflows.