When AI Outpaint Fails: Cluttered Desks, Half Faces, Cut-off Text

Honest documentation of where AI outpaint produces hallucinated artefacts, plus the workarounds, including when to crop the source first and outpaint the crop instead of the original.

By AI Image Extender Team

Why we wrote this

Most outpaint tools sell you a fantasy: drop in any image, pick a ratio, get a clean result. We built our own three-step agent (vision reads the photo, prompt writes the extension brief, outpaint paints the new pixels) precisely because the one-step tools we tried kept failing in the same predictable ways. After running thousands of generations across our own product photos, real estate shots, and screenshots, we have a working list of cases where outpaint should not be your first move.

This is that list. Not the marketing version. The version we wish someone had given us before we wasted a weekend.

1. Cluttered desks and tabletops

Photos of busy surfaces (a desk with six objects, a kitchen counter mid-prep, a workshop bench) are where outpaint hallucinates the worst. The model sees a partial mug at the edge and decides the right move is to invent another mug, or a cable, or a plant pot, to fill the empty canvas. You end up with objects you never owned in a photo you shot yourself.

Why it fails: the model has strong priors about what belongs on a desk. When it cannot see a clean wall or an obvious floor at the edge, it fills with more clutter because more clutter is statistically what desks contain. The vision step in our agent flags this with “busy foreground, low edge context,” but the outpaint model still tries.

Workaround: write a prompt hint that explicitly forbids new objects. Something like “extend the wooden surface and the wall behind, no additional items, no cables, no cups.” Our prompt step does this automatically when it detects clutter density above a threshold, but you can override it.
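As a rough illustration of that workaround, here is a minimal sketch of how a prompt step might append the "no new objects" clause once clutter passes a threshold. Everything named here is an assumption for the sake of the example: SceneReport, clutter_density, the 0.6 threshold, and build_prompt_hint are hypothetical and are not our agent's actual interface.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical threshold; the real value is not stated in this article.
CLUTTER_THRESHOLD = 0.6


@dataclass
class SceneReport:
    """Hypothetical summary produced by a vision step."""
    surface: str             # e.g. "wooden surface"
    backdrop: str            # e.g. "wall behind"
    clutter_density: float   # 0.0 (empty) to 1.0 (objects edge to edge)
    seen_objects: List[str]  # object types already in the frame


def build_prompt_hint(report: SceneReport,
                      threshold: float = CLUTTER_THRESHOLD) -> str:
    """Build an extension hint, forbidding new objects when the scene is cluttered."""
    hint = f"extend the {report.surface} and the {report.backdrop}"
    if report.clutter_density > threshold:
        # Forbid new items in general, then name the object types already
        # present, since those are the ones the model is most likely to clone.
        hint += ", no additional items"
        for name in report.seen_objects:
            hint += f", no {name}"
    return hint
```

With a density of 0.8 and seen objects "cables" and "cups", this returns "extend the wooden surface and the wall behind, no additional items, no cables, no cups". Below the threshold it returns only the extension clause, which is where a manual override would go.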

When to crop first: if the clutter goes all the way to the edge of the frame and there is no clean breathing room, crop the source down to the subject before outpainting. Outpaint a tighter crop with clean edges and you get back the empty surface the model needed. This is counter-intuitive (smaller input, better extension), but it works.

2. Cut-off faces

A portrait where the chin or the forehead is cropped is the second-worst case. The model has half a face and it tries to reconstruct the rest. The result is almost always uncanny: a chin that does not match the jawline, a forehead that warps the hairline, an eye line that drifts.

Why it fails: faces are high-frequency and asymmetric, and the human visual system is brutal about detecting flaws. The model has not seen this specific person. It is averaging across faces it has seen and stitching the average onto your subject. The seam is where the brain catches the lie.

Workaround: do not extend through faces. Pick an aspect ratio that keeps the existing face fully inside the new canvas and only extends background. If you are going from 1:1 to 16:9, that means extending sideways, not up or down through the head.
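The arithmetic behind that choice can be sketched in a few lines. This is an illustrative helper, not part of our product: the function names, the (left, top, right, bottom) box format, and the idea that a face detector supplies the box are all assumptions.

```python
import math
from fractions import Fraction


def extension_padding(width, height, target_w, target_h):
    """Return (pad_x, pad_y): total pixels to add horizontally and vertically
    so the canvas reaches target_w:target_h without cropping the source."""
    target = Fraction(target_w, target_h)
    current = Fraction(width, height)
    if target >= current:
        # Target is wider than the source: grow sideways only.
        return math.ceil(height * target) - width, 0
    # Target is taller than the source: grow up/down only.
    return 0, math.ceil(width / target) - height


def extension_cuts_face(face_box, width, height, pad_x, pad_y, margin=0):
    """True if the canvas grows past an edge that the face box already touches,
    meaning the model would have to reconstruct part of the face."""
    left, top, right, bottom = face_box
    touches_side = left <= margin or right >= width - margin
    touches_top_or_bottom = top <= margin or bottom >= height - margin
    return (pad_x > 0 and touches_side) or (pad_y > 0 and touches_top_or_bottom)
```

For a 1024 by 1024 source, going to 16:9 gives a padding of (797, 0) and going to 9:16 gives (0, 797). A face box of (300, 0, 700, 500), with the forehead cut by the top edge, is safe for the 16:9 extension and flagged for the 9:16 one.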

When to crop first: if the face is already cropped in your source, crop tighter to make it an obvious head-and-shoulders composition, then outpaint the background only. Tell the prompt step “do not modify the subject, extend only the wall and the air around them.” A bounded extension on a finished crop beats a generative reconstruction every time. For portrait-heavy work like real estate listings or staff photos, see our notes on real-estate use cases for what we recommend instead.

3. Cut-off text labels

Packaging shots, book covers, signage: anything with text near the edge will fool the outpaint model into inventing characters. We have seen “ORGAN” become “ORGANIC” and we have seen it become “ORGANUM.” The model picks whichever letter sequence looks most plausible to its visual prior. It is wrong about half the time, and when it is wrong the error is invisible to anyone who does not already know the real word.

Why it fails: outpaint operates on pixels, not on a typographic understanding of letterforms. It sees serifs and stroke widths and fills in shapes that match the style. It does not know what word you meant.

Workaround: there is no good prompt hint for this. “Do not invent text” helps a little. “Extend only the background, keep all text exactly as shown” helps more. Neither is reliable.

When to crop first: this is the case where cropping is mandatory, not optional. If the text is cut off, either crop it out entirely (extend a frame that contains no text) or crop in tighter so the full word is visible and the extension happens around it. We tell our ecommerce users to compose the original product shot with text fully inside a safe zone before any extension. If the original was shot wrong, redo the shot. Do not let the model guess at letters.
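A safe-zone check like the one described above is easy to approximate. The sketch below assumes some OCR or text-detection step has already produced bounding boxes as (left, top, right, bottom) in pixels. The 5 percent margin and the function name are illustrative choices, not values we prescribe.

```python
def text_outside_safe_zone(text_boxes, width, height, margin_ratio=0.05):
    """Return the text boxes that touch or cross the safe-zone border.

    The safe zone is the image inset by margin_ratio on every side.
    Any box returned here is at risk of being 'completed' by the model.
    """
    margin_x = width * margin_ratio
    margin_y = height * margin_ratio
    risky = []
    for box in text_boxes:
        left, top, right, bottom = box
        if (left < margin_x or top < margin_y
                or right > width - margin_x or bottom > height - margin_y):
            risky.append(box)
    return risky
```

If the returned list is non-empty, the options are the ones already listed: crop the text out, crop so the full word sits inside the zone, or reshoot.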

4. Mirrors and reflective surfaces

Mirrors, polished tables, car paint, glass: anything that reflects will cause the outpaint to either duplicate part of the existing scene or warp the reflection in a way that breaks geometry. We have seen a desk lamp reflected in a window get extended into a second lamp floating outside.

Why it fails: the model treats the reflection as content. It does not understand that the reflection should match the rest of the scene by physical law. So when it extends the wall behind the mirror, the mirror itself often picks up a different reflection from the new pixels.

Workaround: write the prompt hint to flag the surface explicitly. “Mirror on the left wall, reflection should remain consistent with original, do not introduce new reflected objects.” This works maybe sixty percent of the time. The remaining forty percent you regenerate.

When to crop first: if the reflective surface is at the edge of the frame, crop it out. A mirror that is half in your photo is going to produce a worse result than no mirror at all. Frame so the mirror is either fully inside the original (extend around it) or fully outside (crop it before extending).

5. Patterned wallpaper, carpets, and tiles

Repeating patterns at the seam are where outpaint fails in a way that looks fine until you zoom in. The pattern continues, but the period is off by a few pixels, or the rotation drifts, or the colour shifts at the seam line. A trained eye spots it immediately. An untrained eye spots it the second time they look.

Why it fails: the model knows what wallpaper looks like in general. It does not always lock onto the exact period of your specific wallpaper. So it extends with a similar pattern that does not align.

Workaround: in the prompt step, describe the pattern as concretely as possible. “Vertical stripes, two centimetres wide, navy and white, repeating exactly.” This biases the model toward the correct period. Pair this with a wider extension area so the model has more room to settle into the pattern before the seam, rather than fighting it.
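If you want a number to put in that description, or a way to check a result for drift, you can measure the period of the pattern in pixels. The following is a minimal sketch using plain autocorrelation on a single row of grayscale values. It is one common way to estimate a period, not something our agent is documented to do, and estimate_period is a hypothetical name.

```python
def estimate_period(row):
    """Estimate the repeat period (in pixels) of a 1-D strip of intensities.

    Returns None when no repeating structure is found.
    """
    n = len(row)
    if n < 4:
        return None
    mean = sum(row) / n
    centered = [value - mean for value in row]

    def autocorr(lag):
        pairs = n - lag
        return sum(centered[i] * centered[i + lag] for i in range(pairs)) / pairs

    scores = [autocorr(lag) for lag in range(n // 2 + 1)]

    # Skip the initial lobe around lag 0, where any signal matches itself.
    lag = 1
    while lag < len(scores) and scores[lag] > 0:
        lag += 1
    if lag >= len(scores):
        return None

    # The strongest remaining peak is the best period candidate.
    best = max(range(lag, len(scores)), key=lambda k: scores[k])
    return best if scores[best] > 0 else None
```

Run it on a row sampled from the original wall and on a row sampled from the extended area. If the two periods differ, the seam will drift even when the pattern looks right at a glance.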

When to crop first: if the pattern is small and busy and the extension is large, crop the source so the patterned region is a minority of the result. Better to have a wall section the model can match against a known floor or ceiling line than to have it freelance across a pure pattern field. For aspect ratios where pattern matching matters most, 16:9 landscape extensions tend to be safer than vertical because there is more horizontal context for the seam to land on.

The rule

Outpaint is a tool for extending context, not for inventing it. When the original photo gives the model enough context to guess correctly, it works beautifully. When the original photo is missing the very thing you want extended, the model guesses, and the guess is rarely what you wanted.

So our rule, the one we tell ourselves before every generation:

Do not use outpaint for cluttered surfaces with edge-to-edge objects. Do not use it to reconstruct cut-off faces. Do not use it to complete cut-off text. Be cautious with mirrors and reflective surfaces. Describe patterns explicitly or accept that the seam will drift.
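For teams that want this rule as a pre-flight check rather than a memory exercise, it can be written down as a small lookup. This is only a sketch: the Risk names and the preflight function are hypothetical, and detecting each risk is left to whatever vision step you use.

```python
from enum import Enum


class Risk(Enum):
    EDGE_TO_EDGE_CLUTTER = "cluttered surface with edge-to-edge objects"
    CUT_OFF_FACE = "cut-off face"
    CUT_OFF_TEXT = "cut-off text"
    REFLECTIVE_SURFACE = "mirror or reflective surface"
    REPEATING_PATTERN = "repeating pattern at the seam"


# Advice paraphrased from the five sections of this article.
ADVICE = {
    Risk.EDGE_TO_EDGE_CLUTTER: "crop to the subject first",
    Risk.CUT_OFF_FACE: "crop to head-and-shoulders, extend background only",
    Risk.CUT_OFF_TEXT: "crop the text out or crop so the full word is visible",
    Risk.REFLECTIVE_SURFACE: "keep the surface fully inside or fully outside the frame",
    Risk.REPEATING_PATTERN: "describe the pattern explicitly or expect seam drift",
}


def preflight(risks):
    """Return one warning line per detected risk, in the article's order."""
    return [f"{risk.value}: {ADVICE[risk]}" for risk in Risk if risk in risks]
```

An empty list means none of the five cases apply and outpaint is a reasonable first move.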

If your source is hitting one of these, crop first. A tighter crop with clean edges gives the model the breathing room it needs. We built the tool we wish we had ourselves, and a real part of building it was learning when to put it down and reach for the crop tool instead.