State of generative AI technology for product photography: creating lifestyle perfume shots with AI
Generative AI creates new visuals from scratch: imagined objects, places, and scenes. But product photography plays by different rules. It’s not about inventing; it’s about showing the product as it is.
That raises some interesting questions:
- Can a lifestyle photoshoot be replaced entirely by generative AI?
- Which AI background generator is best to achieve an authentic lifestyle shot?
- Can these images be trusted to represent real products accurately?
This article takes a close look at how generative AI is being used to create lifestyle perfume bottle shots and what that means for the future of product photography. We’ll compare 4 different AI background generator tools/models using a single prompt per product, with zero additional edits, as if the photos were generated by average users who aren’t experts and expect to achieve the acceptable results the tools promise. This approach lets us test how AI technology performs in a realistic scenario.
Can generative AI be a game-changer for lifestyle product photography?
Today’s business is all about finding quick, cost-efficient, and effective ways of producing content. Until recently, lifestyle photography required meticulous planning, budgeting, finding a studio location, proper photo equipment, and an expert photographer. Now, generative AI promises a potentially simpler and more efficient way: all you need is a packshot, a generative image-to-image AI tool, and a good prompt. The promised result is a perfect lifestyle image in no time and at a fraction of the cost. But is that really the case?
Time for a test: 4 different perfume bottles, 4 challenges for AI
To thoroughly test how generative AI models handle virtual photoshoots, we selected perfumes as a representative example. Perfume bottles are transparent and reflective and carry distinct branding, so they challenge AI algorithms in several ways: lighting the product properly, blending it with the environment, and keeping the branding and captions authentic.
We opted for four different fragrances, each representing a different style and challenge for the algorithms: metallic reflections, transparency, intricate ornamentation, and non-standard shapes.
Although perfumes are used as a primary example, the results of this research can be applied broadly to other types of products.
Time for a test: perfumes
- Just Cavalli (Roberto Cavalli) — an elegant bottle with a metallic finish and a distinctive logo that reflects its surroundings in the light.
Why we chose this: Good to test how different models blend reflective products with the environment. Additionally, the bottle features a futuristic design, making it ideal for a CGI scene with a sci-fi aesthetic. We immediately wanted to create something that resembles a 3D rendering.

- Qaed Al Fursan (Lattafa) — a square bottle in an oriental style with intricate gold and black graphics and Arabic inscriptions.
Why we chose this: We wanted to test how well non-Latin texts and patterns are replicated by the AI tool.

- Spicebomb Extreme (Viktor&Rolf) – a designer grenade-shaped bottle with a matte black finish and a copper-colored metallic band.
Why we chose this: Generic, simple product that shouldn’t create issues for a generative AI tool.

- Devotion (Dolce & Gabbana) – a classic transparent bottle with a decorative gold heart-shaped plaque in a vintage style.
Why we chose this: For its transparency and its intricate, distinctive branded ornament.

Time for a test: AI tools
Generative image-to-image AI technologies create a new image based on the input image and the prompt. By design, a genAI model “wants” to change the input image and specifically the product within it. Older technologies struggled to maintain product fidelity in the newly generated scene, and the original product was usually distorted. When fidelity was preserved, the product often appeared artificially blended with the environment. The most advanced tools can balance this by preserving product authenticity in the new image while seamlessly integrating it into the new environment through realistic reflections, shadows, adapted lighting, and transparency.
There are hundreds of virtual photoshoot tools out there. Most of them rely on the same base technologies/AI models. We decided to pick the most popular AI models and tools that promise high-fidelity results.
- Midjourney - an advanced AI image generator known for creating extremely realistic, stylized, and artistic backgrounds. Its biggest advantage is a deep visual style, which attracts creators, graphic designers, and marketers.
- ChatGPT model 5 - an image generator integrated with ChatGPT based on the gpt-image-1 model. It creates images based on text descriptions or with image input. It’s easy to use, and to some degree, output image fidelity can be controlled.
- Flux.1 Kontext Pro - a model for generating scenes and editing images that promises high input image fidelity, which in this context means high product fidelity. There are two options: Flux.1 Kontext Pro and Flux.1 Kontext Max. We went for the “Pro” variant, which is less expensive and supposedly less accurate, but which generated better results in our test.
- Flair AI - background generator and photo editor for product photos. Claims to create “photorealistic product images that are indistinguishable from professional photography. Accurately renders textures, reflections, and lighting to create stunning product visualizations.”
Time for a test: input packshots
All packshots were taken in high resolution, in PNG format with a transparent background, maintaining semi-transparency in bottles. We used our automated photo studio ALPHASHOT PRO G2 with Orbitvu Station software.
High-quality input images are crucial for maintaining precision when generating AI backgrounds. This quality allows for an accurate assessment of how algorithms handle details, edges, and integration with the generated scene.
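A quick sanity check before handing a packshot to a generator is to confirm that the PNG really stores an alpha channel. The following is a minimal sketch using only the Python standard library. The names `read_png_header` and `make_test_header` are hypothetical helpers made up for this illustration and are not part of Orbitvu Station or any tool named in this article. The check only reads the colour type from the PNG header, so it will not detect palette images that carry transparency in a separate tRNS chunk.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# PNG colour types that store a full alpha channel:
# 4 = greyscale + alpha, 6 = RGB + alpha (RGBA).
ALPHA_COLOR_TYPES = {4, 6}


def read_png_header(data: bytes) -> dict:
    """Parse size, bit depth and colour type from the first 33 bytes of a PNG."""
    if len(data) < 33 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a valid PNG header")
    # IHDR payload starts at byte 16: width, height, bit depth, colour type.
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return {
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "has_alpha_channel": color_type in ALPHA_COLOR_TYPES,
    }


def make_test_header(width: int, height: int, color_type: int) -> bytes:
    """Build a synthetic 33-byte PNG header for demonstration (CRC left as zeros)."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, color_type, 0, 0, 0)
    return PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + ihdr + b"\x00" * 4


# For a real file (the path is an assumption), read the first 33 bytes:
#     with open("packshot.png", "rb") as f:
#         info = read_png_header(f.read(33))
info = read_png_header(make_test_header(4000, 3000, 6))
print(info["has_alpha_channel"])
```

A header with colour type 2 (plain RGB) would report no alpha channel, which usually means the transparent background was flattened somewhere along the way.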

Comparing AI tools: which ones are best at generating backgrounds for perfumes?
So, we have 4 products and 4 popular AI background generators. For every perfume, we prepared a separate prompt describing a lifestyle scene, generated 2-4 photos, and picked the best one. To measure the quality of AI models, we took into account key lifestyle photography features, assigning points for each:
- Product fidelity (max. 10 pts.): The ideal generated image should accurately maintain the product's shape, colors, and distinct features, like transparency and reflection. Maintaining product branding, captions, and ornaments is crucial. A score of 10 points means that no additional post-production would be required to achieve a result comparable to traditional methods, which is crucial in lifestyle product photography.
- Environment blending (max. 8 pts.): The product should blend in naturally with the generated environment/background. Reflections, colors, lighting, and shadows should all match the generated surroundings. This is important for the perceived quality of lifestyle photography, but not as important as product fidelity. An 8-point score indicates results comparable to a traditional photoshoot.
- Scene aesthetics (max. 7 pts.): This includes composition, the creativity of the scenery, and the natural appearance of the scene. It’s our subjective measure.
- Prompt adherence (max. 5 pts.): The scene should be generated as described, and the product's position should be maintained. While important for a stylist's workflow, this is less critical than product fidelity. Max. 5 points for 100% prompt following.
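The four criteria add up to a maximum of 30 points. The sketch below shows one way to turn per-criterion points into an overall percentage. It assumes the overall score is simply the sum of the four criteria divided by the 30-point maximum, which is consistent with the percentages reported in this article but is not stated explicitly. The example breakdown is hypothetical and is not a result from this test.

```python
# Maximum points per criterion, taken from the rubric above.
MAX_POINTS = {
    "product_fidelity": 10,
    "environment_blending": 8,
    "scene_aesthetics": 7,
    "prompt_adherence": 5,
}


def overall_score(points: dict) -> int:
    """Return the overall score as a whole-number percentage of the maximum.

    Assumption: overall score = sum of awarded points / sum of maximum points.
    """
    if set(points) != set(MAX_POINTS):
        raise ValueError("points must cover exactly the four rubric criteria")
    for name, value in points.items():
        if not 0 <= value <= MAX_POINTS[name]:
            raise ValueError(f"{name} must be between 0 and {MAX_POINTS[name]}")
    return round(100 * sum(points.values()) / sum(MAX_POINTS.values()))


# Hypothetical breakdown, for illustration only (21 of 30 points).
example = {
    "product_fidelity": 7,
    "environment_blending": 6,
    "scene_aesthetics": 5,
    "prompt_adherence": 3,
}
print(overall_score(example))
```

Because product fidelity carries a third of the total, a model that distorts the product cannot reach a high overall score on scenery alone.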
Comparison of Devotion (Dolce & Gabbana)
The prompt:
“A luxurious Mediterranean terrace overlooking the sea, with a panoramic view of a sunlit coastline and deep blue water. Elegant stone surface in the foreground, surrounded by blooming citrus flowers, green glossy leaves with morning dew, and subtle elements like vanilla pods and candied fruit pieces. Bright, clear sky, a few yachts sailing in the distance. Sophisticated, warm summer atmosphere — perfect backdrop for a high-end fragrance product. Keep the original angle, position, and perspective of the perfume bottle from the uploaded image exactly as it is. Create in resolution 16:9, maintain original identity, and input fidelity to high.”
Midjourney

Our take: The bottle’s shape and proportions, the logotype, and the ornament are only slightly distorted. Overall, product features are well preserved. The product doesn’t blend in perfectly with the background: the reflections in the cap look studio-like rather than coming from the environment, and although transparency is handled to some degree, the real bottle is less transparent than the generated one (the input image carried the real transparency). Also, the shadow is a little too big for a small transparent bottle. The position of the bottle is maintained as requested in the prompt. The scenery, however, is clearly artificial, and the part of the prompt about the perfume ingredients hasn’t been fully followed. Overall score: 63%

Flux.1 Kontext Pro

Our take: The product's proportions in the image differ from the real product, appearing wider and bulkier. While the fluid color is slightly altered, this may be an adaptation to the scene lighting. The product blends nicely into the new scene, featuring a pleasing reflection from the light in the bottom left corner. Transparency is well highlighted and aligns with the actual product. Although the reflection in the cap is modified and doesn't match the environment, it still surpasses other models. The shot's perspective was changed from the straight-on packshot. We made several more attempts with modified prompts, but the model “insists” on the angled, diagonal shot of the fragrance. Overall, the scene looks natural and pleasing. Overall score: 70%

ChatGPT model 5

Our take: The fragrance proportions and shape in the image differ significantly from the real product: the cap is longer and thinner, and the bottle is bulkier. The branding and ornament are well maintained. The fluid color is altered too much, even considering the scene lighting. The product blends well into the new scene, with natural shadow and semi-transparency in the bottle. The reflection in the cap is modified and matches neither the environment nor the lighting, which comes from the left, not the right. The position is not maintained: this model also tries to “improve” it. Apart from that, the AI model followed all the prompt instructions. When it comes to aesthetics, the scene looks quite artificial, especially the flowers and oversaturated colors. Overall score: 57%

Flair AI

Our take: The cap’s proportions and shape differ significantly from the real product: the cap is longer and thinner in the generated image. The branding and ornament are distorted: the ornament and logotype are “reinvented” by the model. The fluid color is altered too much and is oversaturated. The product blends well into the new scene, with natural shadow and semi-transparency in the bottle, which distorts elements behind the bottle. The reflection in the cap is modified and matches neither the environment nor the lighting, which comes from the left, not from both sides. The position isn’t maintained: this model also changes the product position, although it was instructed to keep the one from the input image. Apart from that, the AI model followed the prompt instructions. As for aesthetics, the scene looks quite artificial, especially the flowers and oversaturated colors, similar to ChatGPT. Overall score: 50%


Comparison of Spicebomb Extreme
The prompt:
“A high-end dramatic studio background with large autumn leaves bursting from the center, water splashes surrounding the base, cinematic lighting with a gradient grey-to-white backdrop, hyperrealistic detail, luxury advertising style. Do not modify the original perfume bottle; leave it exactly as it is. Create in resolution 16:9, maintain original identity and input fidelity to high.”
Midjourney

Our take: Although the image looks very appealing at first glance, there are many issues. The bottle proportions differ significantly from the real product: the generated perfume is slimmer, when in reality it’s bulkier. The branding is distorted. Moreover, the model added the SKORTEO M5 caption, which doesn’t exist on the real product. The real bottle has no transparency, but Midjourney added it to the lower part of the bottle. The product blends acceptably with the new scene, but nothing more. Product position is well maintained. The AI model followed the prompt instructions well (apart from altering the product). Overall, the scene looks appealing, and the model was very creative in generating it. Overall score: 53%

Flux.1 Kontext Pro

Our take: Not as appealing as Midjourney, and without the “wow effect”. The bottle proportions differ only slightly from the real product. The branding is a bit distorted and blurred. The bottle opacity is preserved. The product blends quite well with the new scene, but it was made darker and lost many details. The reflective surfaces don’t catch reflections from the environment. The position is well maintained. The prompt instructions were well adhered to. Overall, even if the bottle is too dark, the scene doesn’t look that bad and is, in our opinion, better than ChatGPT or Flair AI. Overall score: 53%

ChatGPT model 5

Our take: It’s even less appealing than the Flux model. The bottle proportions differ slightly from the real product: it’s made slimmer by ChatGPT. The branding is distorted: a different font, and the letter “O” instead of the “&” inside an “O”. The product blends with the new scene; however, there are no reflections from the environment. The lighting looks good, and the product details are highlighted. The position is well maintained, and the prompt was followed, except for the branding. The scene looks very artificial and obviously AI-generated. Overall score: 60%

Flair AI

Our take: The bottle proportions differ from the real product: it’s made bulkier by Flair AI. A collar is missing at the spray part. The branding is altered: a plain “&” instead of the “&” inside an “O”. The product blends well with the new scene but lacks authenticity: there are no reflections from the environment. The lighting looks good and natural. The position is well maintained, and the prompt was generally followed. The scene looks unnatural, as if made in a studio with the floor and background clearly visible. Overall score: 60%


Comparison of Just Cavalli
The prompt:
“Create a cinematic, futuristic background environment with a high-tech, metallic aesthetic. The rendering scene should feature smooth, reflective steel surfaces, glowing blue ambient lights, and layered geometric architecture with concentric rings, panels, and structural depth — evoking a luxurious sci-fi atmosphere. The lighting should be dramatic, with cool-toned reflections that enhance the sleekness of the setting. Avoid clutter — the environment should feel premium, clean, and engineered with symmetry. The color palette should primarily feature shades of metallic silver, chrome, and deep blue. The background must seamlessly accommodate and highlight a central luxury product, without interfering with its position or scale. Create in resolution 16:9, maintain original identity and input fidelity to high.”
Midjourney

Our take: Once again, Midjourney got very creative with the surroundings. The problem is that it was also creative with the product, which isn’t desirable. The shape and fragrance color were altered, while the branding appears blurred and distorted. Bonus points go to Midjourney for recognizing that the top part of the bottle is mirror-reflective. However, it didn’t do well in blending the product with the surroundings. The product disappears in the new scenery, so the overall aesthetic is poor in our opinion. Overall score: 43%

Flux.1 Kontext Pro

Our take: The product position was slightly modified: the fragrance is rotated for a more direct front shot, and the original camera position, slightly from below, was not maintained. The branding was also altered and doesn’t look as sharp as in the packshot. The color of the liquid was modified. As for blending, it’s poor; you can see some reflections from the scene in the bottle, but it feels very artificial and unnatural. The product isn’t highlighted and disappears in the scene. All in all, the image is unattractive and artificial. Overall score: 37%

ChatGPT model 5

Our take: Again, ChatGPT slightly modified the logotype — using a different font in Just Cavalli and even changing it to Just Cavali (with a single ‘L’). The bottle was also reinvented, with slightly altered proportions. The fragrance liquid color is different. Image blending with the environment is quite good, with nice reflections and lighting. In our opinion, the whole scene looks attractive. However, the product appears a bit too large in the final image, and its angle was slightly adjusted. Overall score: 67%

Flair AI

Our take: The bottle itself, much like in the case of ChatGPT, has been reinvented. The branding is altered, and the bottle shape and details are changed, as is the color of the fragrance. The product’s position also deviates slightly from the source packshot. Image blending is quite good and looks natural, with nice reflections and lighting. Overall, it’s quite a good lifestyle image, but it isn’t authentic. Overall score: 73%


Comparison of Qaed Al Fursan
The prompt:
“Create a realistic, luxurious background for a product photo. The perfume bottle must stay fixed in place on a rustic wooden fence of a horse stable. In the distance, add blurred silhouettes of horses behind the fence, within a warm golden-hour setting. Include visual themes inspired by these notes: saffron, pineapple, jasmine, fir, oud, cedarwood, amber. Use earthy textures and warm tones. Only generate the background – do not change or move the product in the foreground. Create in resolution 16:9, maintain original identity and input fidelity to high.”
Midjourney

Our take: Again, if you don’t go into details, the image isn’t bad. Looking closely, though, the branding is mostly changed, and Midjourney added transparency to the bottle, which is opaque. Position isn’t kept: diagonal instead of frontal as in the input image. The product isn’t well separated from the background, which, although blurred, is very saturated, making the whole composition hard to look at, and the product gets “lost” in all that. Overall score: 47%

Flux.1 Kontext Pro

Our take: Very well-maintained product features, including branding and ornaments. As usual with Flux, the product is slightly blurred. Great work on color coordination: everything blends in smoothly, with good reflections and product details. The horse on the right is well done, but something went wrong with the one on the left, which stands in the middle of the fence. :) As for composition, the bottle looks artificially placed on an oval bench; physics probably wouldn’t hold it there. However, the image is aesthetically very appealing. Overall score: 80%

ChatGPT model 5

Our take: Very well-maintained product features, including branding and ornaments. Blending with the environment is average: the light comes from behind, yet the reflections appear on the front. The composition with the flowers and a pineapple is slightly artificial, and the horse silhouettes look strange. The position was maintained, and the prompt was well adhered to. Overall score: 77%

Flair AI

Our take: Good composition and high product fidelity, except for slight modifications to the gold color of the bottle ornament and the proportions of the cap. The product is well blended in, with very good re-lighting. The product position was changed, and part of the prompt was ignored. Generally quite a good, natural-looking image. Overall score: 73%


Summing up the tests
Taking everything into consideration, let’s see how they scored in terms of proportion, color, and authenticity:

Which AI tool is the best?
When it comes to lifestyle images, generative AI can already be an alternative to traditional photo shoots. Tools like Midjourney, ChatGPT, Flux, or Flair AI can place a perfume bottle into sophisticated, emotional scenes, from minimalist interiors to sunlit beaches, with convincing realism.
For us, Midjourney stands out in terms of creativity—it did a great job generating backgrounds, but it also alters the product the most, which most of the time isn’t acceptable for product photography. This can be fixed in a photo editing program, but requires additional skills. On the other hand, Flux Kontext Pro most faithfully reproduces the product, but the backgrounds it generates aren’t always impressive.
All tools sometimes ignore parts of the prompt. Why? We aren’t sure, but it’s probably related to the training datasets and the stochastic nature of how those tools work. There are certainly ways to achieve more desirable results, such as refining the prompt or using JSON prompting.
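JSON prompting means splitting a free-text prompt into named fields so that the scene description and the "do not touch the product" constraints are clearly separated. Here is a minimal sketch that restructures the Spicebomb Extreme prompt from this article. The field names (`task`, `scene`, `constraints`, `output`) and the helper `build_json_prompt` are assumptions made for illustration. None of the tested tools is confirmed to require or specially interpret this schema, and the resulting string would simply be pasted in as the prompt text.

```python
import json


def build_json_prompt(scene: dict, constraints: list, aspect_ratio: str) -> str:
    """Serialize a structured prompt to a JSON string (field names are hypothetical)."""
    prompt = {
        "task": "generate_background",
        "scene": scene,
        "constraints": constraints,
        "output": {"aspect_ratio": aspect_ratio},
    }
    return json.dumps(prompt, indent=2, ensure_ascii=False)


# Content taken from the Spicebomb Extreme prompt used in this test.
spicebomb_prompt = build_json_prompt(
    scene={
        "style": "high-end dramatic studio, luxury advertising, hyperrealistic detail",
        "elements": [
            "large autumn leaves bursting from the center",
            "water splashes surrounding the base",
        ],
        "lighting": "cinematic",
        "backdrop": "gradient grey-to-white",
    },
    constraints=[
        "Do not modify the original perfume bottle; leave it exactly as it is.",
        "Maintain original identity and input fidelity to high.",
    ],
    aspect_ratio="16:9",
)
print(spicebomb_prompt)
```

Whether a structured prompt actually improves adherence would need to be tested per tool; it was not part of this comparison.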
A key finding from this research is the inconsistency of generative AI. While results for products like Qaed Al Fursan and Devotion (Dolce & Gabbana) were remarkably good, others were unacceptable, suggesting that the outcome is highly dependent on the specific product. We also needed several tries before achieving results that were good enough for this research.
Which tool is the best for you? It all depends on how much authenticity you require. If you need stunning scenery more than authenticity, even Midjourney, which alters products, may be acceptable. If you care about product branding, shape, and details, Flux.1 Kontext seems to be the right choice.
Summing up, each AI tool/model has its strengths and weaknesses, especially when it comes to generating content from a single prompt without extra revisions.
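To compare the tools across all four products, the overall scores reported in the sections above can be tabulated per tool. The sketch below computes the mean and the spread (best minus worst) for each tool. The scores are the ones reported in this article, while the helper name `summarize` and the choice of mean and spread as summary statistics are assumptions made for this example. Averaging rounded percentages over only four products is a rough summary, not a ranking.

```python
from statistics import mean

# Overall scores (%) as reported in this article, per tool and product.
SCORES = {
    "Midjourney": {
        "Devotion": 63, "Spicebomb Extreme": 53,
        "Just Cavalli": 43, "Qaed Al Fursan": 47,
    },
    "Flux.1 Kontext Pro": {
        "Devotion": 70, "Spicebomb Extreme": 53,
        "Just Cavalli": 37, "Qaed Al Fursan": 80,
    },
    "ChatGPT": {
        "Devotion": 57, "Spicebomb Extreme": 60,
        "Just Cavalli": 67, "Qaed Al Fursan": 77,
    },
    "Flair AI": {
        "Devotion": 50, "Spicebomb Extreme": 60,
        "Just Cavalli": 73, "Qaed Al Fursan": 73,
    },
}


def summarize(scores: dict) -> dict:
    """Return the mean score and the best-minus-worst spread for each tool."""
    summary = {}
    for tool, per_product in scores.items():
        values = list(per_product.values())
        summary[tool] = {
            "mean": mean(values),
            "spread": max(values) - min(values),
        }
    return summary


for tool, stats in summarize(SCORES).items():
    print(f"{tool}: mean {stats['mean']:.1f}%, spread {stats['spread']} pts")
```

The spread is a simple way to see the inconsistency described above: a tool can produce the best result for one product and the worst for another.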
FAQ
Q: What does AI change for product photography?
A: For photographers and content managers, AI in product photography means more control over time, budget, and creativity. Instead of planning complex shoots, they can focus on capturing one perfect packshot, then use AI tools/models to create multiple variations tailored for campaigns, social media, or seasonal updates.
Generative AI isn’t replacing photography; it’s reshaping how it’s used. The core image stays authentic, while AI expands its possibilities.
Q: Will AI replace photographers?
A: We don’t think so. If you want to achieve authentic visual content, AI needs a good packshot. And for a good packshot, you need a photographer. As a result, photographers become co-creators of creative and fast-paced productions. Their experience, combined with innovative technologies like AI, translates into the quality of the final result. Creative, high-end visual content will still require professional photographers and a more traditional way of working.
Q: Will AI ever generate a ready-to-publish product photo for a product detail page (PDP)?
A: Yes, but not without a solid starting point. A well-prepared packshot is essential. Without it, AI struggles to reproduce a product’s exact shape, color, and details. Even with a good packshot, small errors can happen: a slightly distorted logo, uneven glass reflections, or misplaced text. Fortunately, these are quick fixes. A few minutes in Photoshop or another editing tool, and the image is ready to go live.
-----------------------------------------------
This research article was done by the Orbitvu team:
Packshots - Julia Banduch
Prompts, generative images & descriptions - Marek Herceliński
Copywriting - Elżbieta Binkowska
Guidance & support - Tomasz Bochenek