How to use JuggernautXL for inpainting

This is the basic inference code I have for inpainting:

import random

import PIL.Image
import torch
from diffusers import StableDiffusionInpaintPipeline

device = "cuda"

model_path = "runwayml/stable-diffusion-inpainting"

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
).to(device)

def image_grid(imgs, rows, cols):
    # Paste equally sized images into a rows x cols grid.
    assert len(imgs) == rows * cols

    w, h = imgs[0].size
    grid = PIL.Image.new("RGB", size=(cols * w, rows * h))

    for i, img in enumerate(imgs):
        grid.paste(img, box=(i % cols * w, i // cols * h))
    return grid


def present_img(url):
    return PIL.Image.open(url)

mask_url = "masks.png"
img_url = "original.jpg"

image = present_img(img_url).resize((512, 512))
mask_image = present_img(mask_url).resize((512, 512))

prompt = "car on a desert highway. Detailed. High resolution. Photorealistic. Soft light."

guidance_scale = 7.5
num_samples = 3
generator = torch.Generator(device="cuda").manual_seed(random.randint(0, 1000))  # change the seed to get different results

images = pipe(
    prompt=prompt,
    image=image,
    mask_image=mask_image,
    guidance_scale=guidance_scale,
    generator=generator,
    num_images_per_prompt=num_samples,
).images
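
To view the samples, I tile them with the image_grid helper defined above (a small usage sketch; the one-row layout and output filename are just my choices):

grid = image_grid(images, rows=1, cols=num_samples)
grid.save("inpainting_samples.png")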

This works, but the results are not great at all. What I want to do is swap the model for JuggernautXL v9. The only change I made was the model path, so the load became:
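
model_path = "RunDiffusion/Juggernaut-XL-v9"

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
).to(device)

This fails with: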

OSError: Error no file named diffusion_pytorch_model.bin found in directory /root/.cache/huggingface/hub/models--RunDiffusion--Juggernaut-XL-v9/snapshots/795a223a588ef39ef84ae41a7a819ab477a7623a/unet.

Looking it up, a user in a GitHub issue suggested downloading the files manually from Hugging Face and putting them in the cache folder, which I did, roughly like this (a sketch using huggingface_hub; the exact filename is my assumption about the repo's unet folder):
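
from huggingface_hub import hf_hub_download

# Pull the UNet weights into the local HF cache; the filename is my
# guess at the repo layout (hypothetical), based on the missing-file error.
hf_hub_download(
    repo_id="RunDiffusion/Juggernaut-XL-v9",
    filename="unet/diffusion_pytorch_model.fp16.safetensors",
)

Now I am getting this error: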

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-a988ae50b0ad> in <cell line: 5>()
      3 model_path = "RunDiffusion/Juggernaut-XL-v9"
      4 
----> 5 pipe = StableDiffusionInpaintPipeline.from_pretrained(model_path,torch_dtype=torch.float16,).to(device)

4 frames
/usr/local/lib/python3.10/dist-packages/diffusers/models/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
    658                     missing_keys = set(model.state_dict().keys()) - set(state_dict.keys())
    659                     if len(missing_keys) > 0:
--> 660                         raise ValueError(
    661                             f"Cannot load {cls} from {pretrained_model_name_or_path} because the following keys are"
    662                             f" missing: \n {', '.join(missing_keys)}. \n Please make sure to pass"

ValueError: Cannot load <class 'diffusers.models.unets.unet_2d_condition.UNet2DConditionModel'> from /root/.cache/huggingface/hub/models--RunDiffusion--Juggernaut-XL-v9/snapshots/795a223a588ef39ef84ae41a7a819ab477a7623a/unet because the following keys are missing: 
 down_blocks.2.attentions.0.transformer_blocks.9.ff.net.0.proj.weight, up_blocks.0.attentions.1.transformer_blocks.1.norm1.weight, down_blocks.2.attentions.0.transformer_blocks.0.norm3.bias, down_blocks.2.attentions.0.transformer_blocks.5.ff.net.2.bias, up_blocks.0.attentions.1.transformer_blocks.2.attn2.to_out.0.bias, down_blocks.0.resnets.1.conv1.bias, up_blocks.1.attentions.2.transformer_blocks.1.norm3.weight, up_blocks.1.attentions.2.transformer_blocks.1.norm2.weight, mid_block.attentions.0.transformer_blocks.1.attn1.to_out.0.weight, down_blocks.2.resnets.1.time_emb_proj.weight, down_blocks.1.attentions.0.transformer_blocks.1.ff.net.2.bias, mid_block.attentions.0.transformer_blocks.4.norm3.bias, up_blocks.0.attentions.0.transformer_blocks.9.norm1.weight, mid_block.attentions.0.transformer_blocks.1.norm1.weight, up_blocks.0.attentions.0.transformer_blocks.2.norm1.bias, up_blocks.0.attentions.2.transformer_blocks.0.attn1.to_q.weight, up_blocks.1.attentions.0.transformer_blocks.1.norm2.bias, mid_block.attentions.0.transformer_blocks.1.attn2.to_q.weight, up_blocks.2.resnets.2.conv_shortcut.bias, up_blocks.0.attentions.2.transformer_blocks.6.attn1.to_v.weight, up_blocks.2.resnets.2.conv1.weight, down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_q.weight, up_blocks.0.attentions.0.transformer_blocks.2.attn1.to_out.0.bias, up_blocks.0.attentions.2.transformer_blocks.3.attn2.to_out.0.bias, up_blocks.0.attentions.2.transformer_blocks.6.attn1.to_out.0.weight, up_blocks.1.r...

Where am I going wrong?
