Video on the PDP: Why Zalando and ASOS Started Rewarding It

Marketplaces now prioritize video in PDP ranking. Here's the packshots-to-model-to-video pipeline that delivers up to 60% more time-on-page without a shoot.

Somewhere in the past eighteen months, video stopped being a campaign asset and became a catalogue requirement. Zalando and ASOS both shifted their ranking algorithms to favor PDPs that include video. No press release announced the shift. Brands noticed it in their traffic data. Sellers who added video to existing listings saw time-on-page climb. Those who did not saw their products slide in category grids. The format changed, and the production question followed: how do you add video to thousands of SKUs without doubling your content budget?

What time-on-page actually measures

Time-on-page is an engagement signal, not a conversion metric. Confuse the two and the business case falls apart later.

Twelve seconds of watch time guarantees nothing about the sale. What it shows is narrower: the content held attention past the first glance. That is not nothing. On a fashion PDP, where Contentsquare's 2025 Digital Experience Benchmark shows dwell time has dropped 22% since 2022, holding attention past the first three seconds is harder than it sounds, and a video that earns twelve seconds has already cleared a bar most static galleries never reach.

Zalando's own data, cited in GoPackshot's ECD Munich 2026 masterclass, puts time-on-page lift from video at up to 60%. That figure covers product video embedded in PDP galleries, not campaign content living on brand channels. The mechanism is straightforward. A still photo answers one question per frame. A short video answers several in sequence: how the fabric moves, how the silhouette shifts when the model walks, whether the collar holds its shape or collapses.

For outerwear, knitwear and anything with structure, the upgrade is anything but marginal. It decides whether a customer scrolls past or spends forty-five seconds on the product before adding to cart.

The 60% figure is worth scrutinizing. It comes from Zalando internal data, and it covers a broad product category mix. For basics and accessories, the lift is probably smaller. For technical garments where fit and movement matter, it may run higher. Honest answer: we do not have clean per-category breakdowns in the public record. Use it as a directional signal, not a guaranteed outcome.

What the signal does confirm is that marketplace algorithms have started treating video presence as a positive ranking input. ASOS already prioritizes video in PDP display order. Zalando's partner documentation increasingly references video as a recommended content type. The format went from optional to competitive advantage inside roughly two years.

Why catalogue-scale video was out of reach

Traditional video production for e-commerce is expensive by design. Campaign day rates run between EUR 2,500 and EUR 10,000 (ProShot Media, Lars Miller Media 2026). Add casting, location permits, travel, and post-production, plus a separate export for every marketplace format. A brand with 500 SKUs per season had to either pick a handful of hero products for video or accept costs that put the format out of reach at catalogue scale.

So most brands did neither cleanly. They shot video for hero SKUs, hero campaigns, hero seasons. The rest of the catalogue stayed static. And the catalogue is where most of the revenue actually lives, in the mid-tier SKUs that do not get campaign budgets but still need to convert.

Some brands tried the workaround of animated packshots: slow zooms, parallax effects, subtle motion added in post. Technically a video. Practically a still image wearing a costume. Marketplaces do not treat a looping animation and a genuine motion clip the same way, and the ranking benefit for animated packshots stays lower than for real motion video.

The bottleneck was always production, not intent. Two years ago, most heads of e-commerce already understood that video converts. They held back anyway, and the reason was organizational, not technical: the format did not fit their content pipeline, which was built for photography batches rather than motion capture.

AI-generated video from stills changes that math. A packshot that already exists becomes the input. The AI generates motion: the model walks, the fabric moves, the silhouette shows from two angles. The same sample that went through the photography pipeline produces a video asset inside the same workflow, with no second shoot and no location to rebook.

At thirty percent lower production cost than a traditional model video shoot (GoPackshot internal benchmark, ECD Munich 2026), the math shifts enough to justify video across a full seasonal catalogue, not just the top twenty products.

Packshots to model to video: the three-stage pipeline

GoPackshot's production pipeline runs in sequence: packshots first, model photos second, video third. Three stages, one set of inputs.

It starts with the packshot. Clean product on white, shot to marketplace spec. This is the mandatory baseline, and everything downstream depends on getting it right: the packshot has to capture the garment accurately, its color, its proportion, the construction detail a customer will judge it on before they ever click add to cart. Bad packshots produce bad model images and worse video. Motion makes it worse, because a color that looks slightly off in a still looks obviously wrong the moment the fabric moves through frame.

Next comes the on-model imagery. Using GoPackshot's Packshot2Model workflow, a packshot feeds an AI pipeline that generates a complete set of model photographs: front, back, detail, marketplace-compliant framing. The face is AI-generated from a pre-approved pool of ethnically and age-diverse options validated by the GoPackshot team. The body is real. The garment is real. The AI handles face and pose only, which is why the product accuracy stays at 90% or above.

Video closes the sequence, generated from a single model still. One frame becomes motion. The output is typically one to four cuts showing the garment in movement, formatted for PDP embed and marketplace delivery.

The sequencing matters more than each individual step. A brand that tries to skip to stage three and feeds a bad packshot directly to video AI gets unusable output and spends more time on corrections than it saved in production. Treat stage one as mere prep for stage two and the whole pipeline buckles. Stage one is what makes everything downstream work.

From sample arrival to delivered video assets, the timeline runs forty-eight hours on standard throughput. That is the same turnaround as photography-only batches. Video stops being a separate production event and becomes a format option within the existing content order.

What the marketplace algorithms actually reward

Zalando's algorithm is less of a black box than people describe. The platform says plainly that content quality affects ranking. OTTO's marketplace documentation states directly that better content wins placement against competing sellers of the same product.

Video fits into this logic in a specific way. Marketplace category pages compete for the same customer attention that Instagram and TikTok are also competing for. A grid of static images loses the eye to a row with a looping video thumbnail almost every time. Platforms that want customers converting on-platform have every incentive to surface the more engaging content higher.

About You applies a different kind of pressure. Non-compliant images trigger a 1.0 percentage point commission rate penalty (per About You partner documentation). The platform penalizes quality failures in stills. As video becomes a recommended content type across platforms, brands can expect similar compliance pressure to extend to motion content.

For brands selling across Zalando, OTTO, About You, and Amazon at once, the specifications are already fragmented. Each platform has different image dimensions, background requirements, and framing rules. Video adds another compliance layer: format, length, aspect ratio, audio requirements. The argument for a production partner that handles cross-marketplace formatting, rather than a single-platform video tool, becomes more practical as the specification matrix grows.

One thing the algorithms do not reward: animated packshots that add a slow zoom and call it video. The Zalando ranking uplift tied to video content appears in the data for genuine motion content, not for stills with added zoom. Brands that took the shortcut and added parallax animation to packshots found the ranking benefit smaller than expected. The effort to distinguish the formats matters because the investment is different, and the return is different.

The five production rules that keep video trustworthy

AI-generated video does not replace real content wholesale. GoPackshot runs five non-negotiable rules across all AI-assisted production, and video sits inside the same framework.

At least 20% of PDP content must be real photography. Always. The 20% floor works as a calibration mechanism. No regulator demands it; the trust math does. Without real reference points in the gallery, customers lose the ability to trust the AI-assisted content. Pull the real images and the AI images start reading as synthetic within about six weeks of launch.

Detail shots stay real. Zips, seams, texture, stitching. These are the images customers zoom on before committing to a purchase. AI video handles movement and silhouette well. It does not handle a close-up of a Gore-Tex membrane or a cashmere weave at the same fidelity. Those shots stay in the camera.

Packshots determine output quality. This cannot be overstated. A packshot photographed at ISO 125 and f/16, captured through Capture One, with grey-card calibration three times daily, produces an input the AI can work from. A packshot from a phone against a bedsheet produces noise in the model image and artifacts in the video. The investment in packshot quality is an investment in every downstream format.

Garbage in, garbage out. Stated simply because it is still the most violated rule in AI content workflows. The promise of AI acceleration tempts production teams to rush the input stage. The correction time in post is always longer than the time saved in setup.

One team owns the full sequence. From sample arrival through packshot, model image, video, marketplace delivery, and QA. Handoffs between teams introduce drift: different color interpretations, different crop standards, different AI prompts. Single-source accountability reduces inconsistency across a seasonal catalogue.

These rules do not slow down the pipeline. They are why the pipeline produces anything worth publishing.
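The 20% floor from the first rule is simple enough to check mechanically before a gallery ships. The sketch below is illustrative only: it assumes the share is counted per asset (the article does not say how the 20% is measured), and the `source` field and function names are hypothetical, not part of any GoPackshot tooling.

```python
def real_photo_share_pct(gallery):
    """Return the percentage of gallery assets tagged as real photography.

    `gallery` is a list of dicts with a hypothetical "source" key
    set to either "real" or "ai".
    """
    if not gallery:
        return 0.0
    real = sum(1 for asset in gallery if asset["source"] == "real")
    return 100.0 * real / len(gallery)


def meets_real_floor(gallery, floor_pct=20):
    """Check the 20% rule with integer arithmetic to avoid rounding surprises."""
    if not gallery:
        return False
    real = sum(1 for asset in gallery if asset["source"] == "real")
    return real * 100 >= floor_pct * len(gallery)


# Illustrative gallery: real packshot and detail shot, AI-assisted model images and video.
gallery = [
    {"name": "packshot", "source": "real"},
    {"name": "detail_zip", "source": "real"},
    {"name": "front_model", "source": "ai"},
    {"name": "back_model", "source": "ai"},
    {"name": "video_cut", "source": "ai"},
]
print(meets_real_floor(gallery))
```

A check like this only counts assets. It says nothing about whether the real images are the right ones, such as the detail shots the second rule keeps in the camera.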

What changes when video becomes a catalogue default

Run the packshots-to-model-to-video pipeline through one full seasonal catalogue and content planning changes shape. Video stops being a line item in campaign budgets and starts appearing in the per-SKU cost calculation.

At GoPackshot's AI video pricing, the incremental cost per SKU for adding a video cut runs approximately 30% below a traditional model video shoot. For a brand processing 2,000 SKUs per season, that arithmetic changes how many products get video: from twenty to two thousand.
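To put the 30% figure into a seasonal budget, a few lines of arithmetic are enough. This is a sketch under stated assumptions: the article gives no absolute per-SKU price, so the EUR 100 baseline below is a placeholder to be replaced with a real quote, and the function name is hypothetical.

```python
def seasonal_video_cost(sku_count, traditional_cost_cents, discount_pct=30):
    """Compare total video cost for a catalogue at traditional vs. discounted pricing.

    Amounts are in integer cents to keep the arithmetic exact.
    `discount_pct` defaults to the 30% figure quoted in the article.
    """
    ai_cost_cents = traditional_cost_cents * (100 - discount_pct) // 100
    traditional_total = sku_count * traditional_cost_cents
    ai_total = sku_count * ai_cost_cents
    return {
        "traditional_total_cents": traditional_total,
        "ai_total_cents": ai_total,
        "saving_cents": traditional_total - ai_total,
    }


# PLACEHOLDER baseline: EUR 100.00 per SKU is an assumption, not a quoted price.
result = seasonal_video_cost(sku_count=2000, traditional_cost_cents=10_000)
for label, cents in result.items():
    print(f"{label}: EUR {cents / 100:,.2f}")
```

The saving scales linearly with catalogue size, so the same calculation can be run per category to decide where video goes first.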

The operational change this creates is less obvious than the cost change. When video is accessible at catalogue scale, the brief for a seasonal content order changes. A head of e-commerce can specify: packshot, front model, back model, detail, video cut for everything in outerwear and knitwear, packshot only for basics. The format decision happens at brief stage, not campaign approval stage.
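A brief like that is essentially a small lookup table from category to output formats. The sketch below shows one possible shape, assuming hypothetical format names and a packshot-only default for categories the brief does not mention. None of these identifiers come from a real ordering system.

```python
# Hypothetical format identifiers mirroring the example brief in the text.
FULL_SET = ("packshot", "front_model", "back_model", "detail", "video_cut")

SEASONAL_BRIEF = {
    "outerwear": FULL_SET,
    "knitwear": FULL_SET,
    "basics": ("packshot",),
}


def formats_for(category, brief=SEASONAL_BRIEF, default=("packshot",)):
    """Return the formats ordered for a category.

    Falling back to packshot-only for unlisted categories is an assumption.
    """
    return brief.get(category, default)


def expand_order(skus, brief=SEASONAL_BRIEF):
    """Map each (sku_id, category) pair to the formats the brief specifies."""
    return {sku_id: formats_for(category, brief) for sku_id, category in skus}


order = expand_order([
    ("SKU-001", "outerwear"),
    ("SKU-002", "basics"),
    ("SKU-003", "accessories"),  # not in the brief, so it gets the default
])
for sku_id, formats in order.items():
    print(sku_id, ", ".join(formats))
```

Keeping the brief as data makes the format decision visible at order time, which is where the text says it now belongs.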

Zalando's shift from zero AI content to 90% AI-assisted content in under a year (Zalando Corporate FY2025) shows how fast a platform's content mix can change once production constraints are removed. The brands that adjusted their pipelines first captured the ranking benefit earliest. The brands that waited for the format to stabilize missed a window.

Video at catalogue scale also changes what the marketing team can do with content that already exists. A library of product videos is a paid social asset, an email asset, a retargeting asset. Brands with video across their full catalogue have creative inputs for performance channels that brands with twenty hero videos do not. The production investment compounds across uses.

None of this requires a new production partner, a new technology stack, or a new internal team. It requires integrating video as a format option into the content pipeline that already handles photography. The pipeline already exists. Video is one more output format inside it.

The brands asking whether video belongs in their content strategy are about eighteen months behind the brands already optimizing for it. Zalando and ASOS moved their ranking signals. The question now is operational: can you produce video across a full catalogue without a production system that collapses under the volume?

The packshots-to-model-to-video pipeline exists specifically because that question needed a practical answer. Forty-eight hour turnaround. Marketplace-formatted output. Video as a format option within a photography order, not a separate campaign event.

If your seasonal catalogue already has the packshots, the inputs are sitting there unused. Pull one category, run it through the pipeline, and watch time-on-page on those PDPs against the static ones. That comparison answers the budget question faster than any benchmark in this article.
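The comparison itself is a one-line formula: relative lift equals video time-on-page minus static time-on-page, divided by static time-on-page. A minimal sketch follows, assuming per-session time-on-page in seconds has already been exported from an analytics tool. The sample numbers are made up for illustration, and using the median rather than the mean is a choice made here to dampen outlier sessions.

```python
from statistics import median


def time_on_page_lift(video_seconds, static_seconds):
    """Relative lift in median time-on-page of video PDPs over static PDPs.

    Returns a fraction, e.g. 0.25 for a 25% lift.
    """
    if not video_seconds or not static_seconds:
        raise ValueError("both groups need at least one session")
    static_median = median(static_seconds)
    if static_median <= 0:
        raise ValueError("static median must be positive")
    return (median(video_seconds) - static_median) / static_median


# Made-up session durations in seconds, for illustration only.
static_sessions = [18, 22, 25, 30, 20]
video_sessions = [28, 35, 40, 32, 45]

lift = time_on_page_lift(video_sessions, static_sessions)
print(f"Time-on-page lift: {lift:.0%}")
```

Remember that this measures engagement, not conversion. The result is only meaningful if the two groups are comparable in product type, price band, and traffic source.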
