Comments (3)
Yes, you're right that there is no explicit alignment in ControlNet. It simply encodes the geometric control input into features and adds them to SD's intermediate features.
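To make the "add them to the SD intermediate feature" part concrete, here is a minimal sketch of additive fusion. It is not ControlNet's actual code: the function name, the `scale` parameter, the tensor shapes, and the use of NumPy instead of PyTorch are all assumptions for illustration.

```python
import numpy as np

def add_control_residual(sd_feature, control_feature, scale=1.0):
    """Add a control feature map to an SD feature map of the same shape (hypothetical helper)."""
    if sd_feature.shape != control_feature.shape:
        raise ValueError("control feature must match the SD feature shape")
    # Plain element-wise addition: no alignment step happens here.
    return sd_feature + scale * control_feature

rng = np.random.default_rng(0)
# Hypothetical intermediate feature: (batch, channels, height, width).
sd_feature = rng.standard_normal((1, 320, 8, 8))

# Hypothetical control feature that is non-zero only in the upper-right region.
control_feature = np.zeros_like(sd_feature)
control_feature[:, :, :4, 4:] = 1.0

fused = add_control_residual(sd_feature, control_feature)
print(fused.shape)
```

Positions where the control feature is zero pass through unchanged, which is why the control only influences the result where the encoded condition is active.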
from paint-with-words-sd.
Hi @lwchen6309,
By the way, I have another question.
Your test code in runner_inpait.py uses this prompt:
"input_prompt": "A digital painting of a half-frozen lake near mountains under a full moon and aurora. A boat is in the middle of the lake. Highly detailed.",
I printed the values of the color cross_attention_weight_64 corresponding to token="aurora":
[0.0000, 0.0000, 0.0000, 0.0000, 0.5000, 0.5000, 0.5000, 0.5000],
[0.0000, 0.0000, 0.0000, 0.0000, 0.5000, 0.5000, 0.5000, 0.5000],
[0.0000, 0.0000, 0.0000, 0.0000, 0.5000, 0.5000, 0.5000, 0.5000],
[0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000, 0.5000],
[0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000]
So we guess that the aurora's cross-attention will be concentrated near the upper right, the same as for token="full moon".
But why do we also need to feed another mask image, one pointing out the moon's real position, into the latent space like this:
latent_model_input = torch.cat([latent_model_input, mask, masked_image_latents], dim=1)
Does this duplicate the color cross_attention_weight above?
PTAL!
Thank you!
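For readers following along, the effect of such a weight map can be sketched as a per-pixel bias added to one token's attention scores before the softmax. This is only an illustration under stated assumptions, not the repository's implementation: the prompt length, the token index, the uniform raw scores, and the absence of any scaling factor are all hypothetical, and NumPy stands in for PyTorch.

```python
import numpy as np

# The 8x8 weight map printed above for token="aurora".
weight = np.zeros((8, 8))
weight[:3, 4:] = 0.5   # rows 0-2, right half
weight[3, :] = 0.5     # row 3, full width

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

num_tokens = 4   # hypothetical prompt length
aurora_idx = 2   # hypothetical position of "aurora" in the prompt

# Raw attention scores, shape (pixels, tokens); uniform here so only the bias matters.
scores = np.zeros((64, num_tokens))

# Add the flattened weight map to the "aurora" column only.
bias = np.zeros_like(scores)
bias[:, aurora_idx] = weight.reshape(-1)

probs = softmax(scores + bias, axis=-1)
aurora_attention = probs[:, aurora_idx].reshape(8, 8)
print(aurora_attention.round(3))
```

Pixels with a non-zero weight attend more strongly to the token than pixels with zero weight, which matches the guess that the aurora ends up near the upper right.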
Hi, I think image_mask just specifies the region to inpaint. Object segmentation is still controlled by the cross-attention weights.
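The torch.cat line quoted in the question can be read as stacking three inputs along the channel axis. Below is a minimal shape-only sketch, assuming the usual Stable Diffusion inpainting layout of 4 latent channels, 1 mask channel, and 4 masked-image latent channels; the spatial size and mask region are made up, and NumPy is used in place of PyTorch.

```python
import numpy as np

batch, height, width = 1, 64, 64   # hypothetical latent resolution

# Noisy latents being denoised: 4 channels.
latent_model_input = np.zeros((batch, 4, height, width))

# Inpainting mask: 1 channel, 1.0 where the image should be repainted.
mask = np.zeros((batch, 1, height, width))
mask[:, :, :32, 32:] = 1.0         # hypothetical region: upper right

# Latents of the original image with the masked region blanked out: 4 channels.
masked_image_latents = np.zeros((batch, 4, height, width))

# NumPy equivalent of torch.cat([...], dim=1): concatenate along the channel axis.
unet_input = np.concatenate(
    [latent_model_input, mask, masked_image_latents], axis=1
)
print(unet_input.shape)
```

The mask channel only tells the model which pixels may change. It carries no information about which prompt token belongs where, so it does not overlap with the per-token cross-attention weights.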