
autostudio's People

Contributors

donahowe, lne27


autostudio's Issues

The project code contains too many hard-coded values. I kept modifying things myself but it still errors out and produces no result images; much like Issue #9, it only outputs a pile of intermediate images.

After reporting Issue #3, I made the following changes:

  1. Removed the hard-coded line 25 of the DETECT_SAM/Grounding-DINO\groundingdino\util\get_tokenlizer.py script:
    return BertModel.from_pretrained("/data2/chengjunhao/THEATERGEN/pretrained_models/dino_bert")
    and changed it to (a patched sketch follows this list):
    return BertModel.from_pretrained(text_encoder_type)
  2. Downloaded the groundingdino_swint_ogc.pth model from HF:
    model page: https://hf-mirror.com/ShilongLiu/GroundingDINO/tree/main
    into the DETECT_SAM/Grounding-DINO directory, since the script cannot download it automatically.
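For reference, a minimal sketch of the patched function, assuming the branch structure visible in the traceback excerpts further down (BertModel/RobertaModel come from transformers); only the hard-coded path is replaced:

    import os
    from transformers import BertModel, RobertaModel

    def get_pretrained_language_model(text_encoder_type):
        # Load BERT by Hub name or from a local directory instead of the
        # author's hard-coded /data2/chengjunhao/... path.
        if text_encoder_type == "bert-base-uncased" or os.path.isdir(text_encoder_type):
            return BertModel.from_pretrained(text_encoder_type)
        if text_encoder_type == "roberta-base":
            return RobertaModel.from_pretrained(text_encoder_type)
        raise ValueError(f"Unknown text_encoder_type: {text_encoder_type}")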

After running the verification script again, the terminal window output was as follows:
D:\AITest\AutoStudio\model\pipeline_stable_diffusion.py:41: FutureWarning: Importing DiffusionPipeline or ImagePipelineOutput from diffusers.pipeline_utils is deprecated. Please import from diffusers.pipelines.pipeline_utils instead.
from diffusers.pipeline_utils import DiffusionPipeline
Using box scale: (512, 512)
D:\AITest\AutoStudio\DETECT_SAM/Grounding-DINO\groundingdino\models\GroundingDINO\ms_deform_attn.py:31: UserWarning: Failed to load custom C++ ops. Running on CPU mode Only!
warnings.warn("Failed to load custom C++ ops. Running on CPU mode Only!")
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight']

  • This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
    Welcome to the AutoStudio
    Loading pipeline components...: 100%|██████████████████████████████████████████████████| 5/5 [00:00<00:00, 5.99it/s]
    You have disabled the safety checker for <class 'model.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at huggingface/diffusers#254 .
    Succesfully load models
    dialogue 1 turn 1
    Saved boxes visualizations to output/dialogue 1/turn 1/boxes.png ind: None
    ERROR!
    dialogue 1 turn 2
    Saved boxes visualizations to output/dialogue 1/turn 2/boxes.png ind: None
    ERROR!
    dialogue 1 turn 3
    Saved boxes visualizations to output/dialogue 1/turn 3/boxes.png ind: None
    ERROR!
    dialogue 1 turn 4
    Saved boxes visualizations to output/dialogue 1/turn 4/boxes.png ind: None
    ERROR!
    dialogue 1 turn 5
    Saved boxes visualizations to output/dialogue 1/turn 5/boxes.png ind: None
    ERROR!
    dialogue 1 turn 6
    skip
    dialogue 1 turn 7
    skip
    dialogue 1 turn 8
    skip
    dialogue 1 turn 9
    skip
    dialogue 1 turn 10
    skip
    dialogue 1 turn 11
    skip
    dialogue 1 turn 12
    skip
    dialogue 1 turn 13
    skip
    dialogue 1 turn 14
    skip
    dialogue 1 turn 15
    skip
    single dialogue time: 1.032346248626709
    dialogue 2 turn 1
    Saved boxes visualizations to output/dialogue 2/turn 1/boxes.png ind: None
    ERROR!
    dialogue 2 turn 2
    Saved boxes visualizations to output/dialogue 2/turn 2/boxes.png ind: None
    ERROR!
    dialogue 2 turn 3
    Saved boxes visualizations to output/dialogue 2/turn 3/boxes.png ind: None
    ERROR!
    dialogue 2 turn 4
    skip
    dialogue 2 turn 5
    skip
    dialogue 2 turn 6
    skip
    dialogue 2 turn 7
    skip
    dialogue 2 turn 8
    skip
    dialogue 2 turn 9
    skip
    dialogue 2 turn 10
    skip
    dialogue 2 turn 11
    skip
    dialogue 2 turn 12
    skip
    dialogue 2 turn 13
    skip
    dialogue 2 turn 14
    skip
    dialogue 2 turn 15
    skip
    single dialogue time: 0.5198547840118408
    dialogue 3 turn 1
    Saved boxes visualizations to output/dialogue 3/turn 1/boxes.png ind: None
    ERROR!
    dialogue 3 turn 2
    Saved boxes visualizations to output/dialogue 3/turn 2/boxes.png ind: None
    ERROR!
    dialogue 3 turn 3
    Saved boxes visualizations to output/dialogue 3/turn 3/boxes.png ind: None
    ERROR!
    dialogue 3 turn 4
    skip
    dialogue 3 turn 5
    skip
    dialogue 3 turn 6
    skip
    dialogue 3 turn 7
    skip
    dialogue 3 turn 8
    skip
    dialogue 3 turn 9
    skip
    dialogue 3 turn 10
    skip
    dialogue 3 turn 11
    skip
    dialogue 3 turn 12
    skip
    dialogue 3 turn 13
    skip
    dialogue 3 turn 14
    skip
    dialogue 3 turn 15
    skip
    single dialogue time: 0.428234338760376
    dialogue 4 turn 1
    Saved boxes visualizations to output/dialogue 4/turn 1/boxes.png ind: None
    ERROR!
    dialogue 4 turn 2
    Saved boxes visualizations to output/dialogue 4/turn 2/boxes.png ind: None
    ERROR!
    dialogue 4 turn 3
    Saved boxes visualizations to output/dialogue 4/turn 3/boxes.png ind: None
    ERROR!
    dialogue 4 turn 4
    Saved boxes visualizations to output/dialogue 4/turn 4/boxes.png ind: None
    ERROR!
    dialogue 4 turn 5
    Saved boxes visualizations to output/dialogue 4/turn 5/boxes.png ind: None
    ERROR!
    dialogue 4 turn 6
    Saved boxes visualizations to output/dialogue 4/turn 6/boxes.png ind: None
    ERROR!
    dialogue 4 turn 7
    Saved boxes visualizations to output/dialogue 4/turn 7/boxes.png ind: None
    ERROR!
    dialogue 4 turn 8
    Saved boxes visualizations to output/dialogue 4/turn 8/boxes.png ind: None
    ERROR!
    dialogue 4 turn 9
    Saved boxes visualizations to output/dialogue 4/turn 9/boxes.png ind: None
    ERROR!
    dialogue 4 turn 10
    D:\AITest\AutoStudio\model\utils.py:164: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (matplotlib.pyplot.figure) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam figure.max_open_warning). Consider using matplotlib.pyplot.close().
    plt.figure()
    Saved boxes visualizations to output/dialogue 4/turn 10/boxes.png ind: None
    ERROR!
    dialogue 4 turn 11
    Saved boxes visualizations to output/dialogue 4/turn 11/boxes.png ind: None
    ERROR!
    dialogue 4 turn 12
    Saved boxes visualizations to output/dialogue 4/turn 12/boxes.png ind: None
    ERROR!
    dialogue 4 turn 13
    skip
    dialogue 4 turn 14
    skip
    dialogue 4 turn 15
    skip
    single dialogue time: 1.5412306785583496
    dialogue 5 turn 1
    Saved boxes visualizations to output/dialogue 5/turn 1/boxes.png ind: None
    ERROR!
    dialogue 5 turn 2
    Saved boxes visualizations to output/dialogue 5/turn 2/boxes.png ind: None
    ERROR!
    dialogue 5 turn 3
    Saved boxes visualizations to output/dialogue 5/turn 3/boxes.png ind: None
    ERROR!
    dialogue 5 turn 4
    Saved boxes visualizations to output/dialogue 5/turn 4/boxes.png ind: None
    ERROR!
    dialogue 5 turn 5
    Saved boxes visualizations to output/dialogue 5/turn 5/boxes.png ind: None
    ERROR!
    dialogue 5 turn 6
    Saved boxes visualizations to output/dialogue 5/turn 6/boxes.png ind: None
    ERROR!
    dialogue 5 turn 7
    Saved boxes visualizations to output/dialogue 5/turn 7/boxes.png ind: None
    ERROR!
    dialogue 5 turn 8
    Saved boxes visualizations to output/dialogue 5/turn 8/boxes.png ind: None
    ERROR!
    dialogue 5 turn 9
    Saved boxes visualizations to output/dialogue 5/turn 9/boxes.png ind: None
    ERROR!
    dialogue 5 turn 10
    skip
    dialogue 5 turn 11
    skip
    dialogue 5 turn 12
    skip
    dialogue 5 turn 13
    skip
    dialogue 5 turn 14
    skip
    dialogue 5 turn 15
    skip
    single dialogue time: 1.449096441268921
    Press any key to continue . . .

Everything is an ERROR, but I can't tell where it went wrong...

Could the project maintainers please help analyze this? With so many people trying, has anyone actually gotten it to run?

RuntimeError: shape '[0, 3, 1, 2]' is invalid for input of size 1572864

Image generation fails with the following error!!

│ │
│ /xxx/workpace/AIGC/AutoStudio/DETECT_SAM/efficient_sam.py:31 in inference_with_box │
│ │
│ 28 │ bbox_labels = torch.reshape(torch.tensor([2, 3]), [1, 1, 2]) │
│ 29 │ img_tensor = ToTensor()(image) │
│ 30 │ │
│ ❱ 31 │ predicted_logits, predicted_iou = model( │
│ 32 │ │ img_tensor[None, ...].to(device), │
│ 33 │ │ bbox.to(device), │
│ 34 │ │ bbox_labels.to(device), │
│ │
│ /xxx/anaconda/envs/th23/lib/python3.10/site-packages/torch/nn/modules/module.py:1501 │
│ in _call_impl │
│ │
│ 1498 │ │ if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks │
│ 1499 │ │ │ │ or _global_backward_pre_hooks or _global_backward_hooks │
│ 1500 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1501 │ │ │ return forward_call(*args, **kwargs) │
│ 1502 │ │ # Do not call functions when jit is used │
│ 1503 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1504 │ │ backward_pre_hooks = [] │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: shape '[0, 3, 1, 2]' is invalid for input of size 1572864
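The author's reply below attributes this to the environment file. One plausible reading (an assumption, not confirmed): '[0, 3, 1, 2]' looks like a permute order being applied as a reshape, which can happen when a serialized .jit module runs under a different torch version than it was traced with. A quick check before digging deeper:

    # Hypothetical sanity check: confirm the running torch/CUDA versions
    # match the project's requirements.txt before debugging further.
    import torch

    print("torch:", torch.__version__)
    print("cuda :", torch.version.cuda)
    print("gpu  :", torch.cuda.is_available())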

Author: Version Update / Bug Fix

Hi,

Thank you for your interest in AutoStudio. I have fixed the bugs mentioned in the issue and released the SDXL version. The main issue was related to the environment file. Please reconfigure according to the new requirements.txt. Additionally, I have made some minor bug fixes in run.py, autostudio.py, and diffusionpipeline/diffusionpipelinexl.py.

AutoStudioAttnProcessor2_0 reshape exception

cond = cond.reshape(cond.shape[0], height, width, cond.shape[2])

RuntimeError: shape '[3, 64, 64, 640]' is invalid for input of size 23592960

The sequence length 12288 cannot be factored as 64*64, yet the 1.0 version works fine.
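Some arithmetic on the reported numbers: 23592960 elements / (batch 3 x 640 channels) = 12288 tokens per sample, i.e. 3x the expected 64*64 = 4096, so this attention layer is seeing a sequence length that does not match the hard-coded height/width. A hedged sketch of a guard that surfaces the mismatch (names are illustrative, not the project's code):

    import torch

    def reshape_cond(cond: torch.Tensor, height: int, width: int) -> torch.Tensor:
        # cond: (batch, seq_len, channels). Fail loudly when seq_len does not
        # factor as height*width instead of letting reshape raise cryptically.
        batch, seq_len, channels = cond.shape
        if seq_len != height * width:
            raise ValueError(
                f"seq_len={seq_len} != height*width={height * width}; this "
                f"attention layer runs at a different resolution than assumed"
            )
        return cond.reshape(batch, height, width, channels)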

run.py does not generate images

Hello, running run.py only generates the Database, as in the screenshot below:
(screenshot omitted)

./output/dialogue 1/turn 1/boxes.png
(screenshot omitted)

sd_path: runwayml/stable-diffusion-v1-5
vae_path: runwayml/stable-diffusion-v1-5/vae

How do I get it to generate the final images?
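One thing worth checking (my assumption, not an author-confirmed fix): "runwayml/stable-diffusion-v1-5/vae" is not a valid Hub repo id, since repo ids allow at most one slash; the subfolder is normally passed separately:

    from diffusers import AutoencoderKL

    # Load the VAE from a subfolder of the SD 1.5 repo rather than
    # appending "/vae" to the repo id.
    vae = AutoencoderKL.from_pretrained(
        "runwayml/stable-diffusion-v1-5", subfolder="vae"
    )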

Model configuration

As a newcomer to Stable Diffusion, I'm a bit confused about model configuration and would like to ask how to set up local pretrained models.

  1. Downloaded dreamlike-anime-1.0 as the StableDiffusion model, as recommended
  2. Downloaded IP-Adapter
  3. Downloaded efficient_sam_s_gpu.jit
  4. Downloaded groundingdino_swint_ogc.pth with wget
  5. bert-base-uncased was downloaded automatically from Huggingface at runtime

In the run.py script:

  1. sd_path is set to the dreamlike-anime-1.0 folder path
  2. vae_path also uses the dreamlike-anime-1.0/vae folder path
  3. unet also uses the dreamlike-anime-1.0/unet folder path
  4. ip_ckpt and image_encoder_path use the 1.5 Plus version

RTX 4090/24GB CUDA 11.8

I'm not sure whether this configuration is problematic, because the output quality is very poor and completely fails to reproduce the results shown in the repo and the paper.
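For comparison, a hypothetical local-path layout matching the items above; all paths and variable names are illustrative, so check them against your run.py:

    # run.py settings (hypothetical local layout)
    sd_path = "./models/dreamlike-anime-1.0"           # full pipeline folder
    vae_path = "./models/dreamlike-anime-1.0/vae"      # its vae/ subfolder
    unet_path = "./models/dreamlike-anime-1.0/unet"    # its unet/ subfolder
    ip_ckpt = "./models/IP-Adapter/ip-adapter-plus_sd15.bin"
    image_encoder_path = "./models/IP-Adapter/models/image_encoder"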

name '_C' is not defined

python 3.10

from groundingdino import _C
The import fails:
warnings.warn("Failed to load custom C++ ops. Running on CPU mode Only!")

Exception raised: NameError
name '_C' is not defined
File "/home/admin123/repository/lmm/AutoStudio/DETECT_SAM/Grounding-DINO/groundingdino/models/GroundingDINO/ms_deform_attn.py", line 53, in forward
output = _C.ms_deform_attn_forward(
File "/home/admin123/repository/lmm/AutoStudio/DETECT_SAM/Grounding-DINO/groundingdino/models/GroundingDINO/ms_deform_attn.py", line 338, in forward
output = MultiScaleDeformableAttnFunction.apply(
File "/home/admin123/repository/lmm/AutoStudio/DETECT_SAM/Grounding-DINO/groundingdino/models/GroundingDINO/transformer.py", line 793, in forward
src2 = self.self_attn(
File "/home/admin123/repository/lmm/AutoStudio/DETECT_SAM/Grounding-DINO/groundingdino/models/GroundingDINO/transformer.py", line 584, in forward
output = checkpoint.checkpoint(
File "/home/admin123/repository/lmm/AutoStudio/DETECT_SAM/Grounding-DINO/groundingdino/models/GroundingDINO/transformer.py", line 266, in forward
memory, memory_text = self.encoder(
File "/home/admin123/repository/lmm/AutoStudio/DETECT_SAM/Grounding-DINO/groundingdino/models/GroundingDINO/groundingdino.py", line 334, in forward
hs, reference, hs_enc, ref_enc, init_box_proposal = self.transformer(
File "/home/admin123/repository/lmm/AutoStudio/DETECT_SAM/Grounding-DINO/groundingdino/util/inference.py", line 79, in predict
outputs = model(image[None], captions=[caption])
File "/home/admin123/repository/lmm/AutoStudio/DETECT_SAM/Grounding-DINO/groundingdino/util/inference.py", line 218, in predict_with_classes
boxes, logits, phrases = predict(
File "/home/admin123/repository/lmm/AutoStudio/DETECT_SAM/detectSam.py", line 100, in process_image
detections = detect_model.predict_with_classes(
File "/home/admin123/repository/lmm/AutoStudio/model/autostudio.py", line 341, in generate
seg_img, detection = process_image(detect_model=dino_model, same_model=same_model, input_image=i[1][0], categories=character_prompt_full, device=self.device)
File "/home/admin123/repository/lmm/AutoStudio/run.py", line 231, in
output = autostudio.generate(
NameError: name '_C' is not defined

Found:
_C.cpython-38-x86_64-linux-gnu.so
_C.cpython-39-x86_64-linux-gnu.so

but no _C.cpython-310-x86_64-linux-gnu.so.
How can this be solved? Is Python 3.10 not supported? Could you provide the 3.10 binary?
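Compiled extension modules are CPython-version-specific, so the cp38/cp39 .so files cannot load under Python 3.10; the extension has to be rebuilt for your interpreter. A sketch of the check, with the usual rebuild step in a comment (the build command assumes the vendored Grounding-DINO keeps the upstream setup.py, which I have not verified):

    # Check whether a compiled _C matching this interpreter exists.
    import importlib.util
    import sys

    print("interpreter:", sys.version_info[:2])
    spec = importlib.util.find_spec("groundingdino._C")
    print("compiled _C:", spec.origin if spec else "not found for this Python")

    # If it is missing, rebuild in the vendored package, e.g.:
    #   cd DETECT_SAM/Grounding-DINO && CUDA_HOME=/usr/local/cuda pip install -e .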

Core function missing

On line 199 of run.py, "output = theatergen2.generate(", the object "theatergen2" never gets imported, and it is nowhere to be found in this repo.

Quality ..

Great job, first of all. I downloaded all the checkpoints the code refers to and ran it, but I got pictures like these...

(result images omitted: img0_25step, t2_img0_25step, t3_img0_25step, t4_img0_25step, t5_img0_25step)

sd version is 1.5
ip-adapter version is ip-adapter-plus_sd15.bin

Can you give me some advice about the unsatisfactory results above?

entire pipeline

Great work, are there any plans to release the entire model?

Links needed for the three files run.py requires

When I run run.py, the following three Hugging Face models are missing:

/data2/chengjunhao/THEATERGEN/pretrained_models/vae_ft_mse
/data2/chengjunhao/THEATERGEN/pretrained_models/diffusion_1.5_comic
/diffusion_1.5/unet
Could you provide the correct Hugging Face links for these models or their weights? Additionally, a Docker container for this setup would be greatly appreciated.

Thanks!

numpy.core.multiarray failed to import

Getting this problem on an i9-9900k, RTX 2080 + RTX 3060 Ti system. Any suggestions to fix?

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.1 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):  File "/home/parsa/HDD/machine learning/AutoStudio/run.py", line 19, in <module>
    import torch
  File "/home/parsa/HDD/machine learning/AutoStudio/lib/python3.10/site-packages/torch/__init__.py", line 1382, in <module>
    from .functional import *  # noqa: F403
  File "/home/parsa/HDD/machine learning/AutoStudio/lib/python3.10/site-packages/torch/functional.py", line 7, in <module>
    import torch.nn.functional as F
  File "/home/parsa/HDD/machine learning/AutoStudio/lib/python3.10/site-packages/torch/nn/__init__.py", line 1, in <module>
    from .modules import *  # noqa: F403
  File "/home/parsa/HDD/machine learning/AutoStudio/lib/python3.10/site-packages/torch/nn/modules/__init__.py", line 35, in <module>
    from .transformer import TransformerEncoder, TransformerDecoder, \
  File "/home/parsa/HDD/machine learning/AutoStudio/lib/python3.10/site-packages/torch/nn/modules/transformer.py", line 20, in <module>
    device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
/home/parsa/HDD/machine learning/AutoStudio/lib/python3.10/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
  device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
/home/parsa/HDD/machine learning/AutoStudio/model/pipeline_stable_diffusion.py:41: FutureWarning: Importing `DiffusionPipeline` or `ImagePipelineOutput` from diffusers.pipeline_utils is deprecated. Please import from diffusers.pipelines.pipeline_utils instead.
  from diffusers.pipeline_utils import DiffusionPipeline
/home/parsa/HDD/machine learning/AutoStudio/lib/python3.10/site-packages/gradio_client/documentation.py:106: UserWarning: Could not get documentation group for <class 'gradio.mix.Parallel'>: No known documentation group for module 'gradio.mix'
  warnings.warn(f"Could not get documentation group for {cls}: {exc}")
/home/parsa/HDD/machine learning/AutoStudio/lib/python3.10/site-packages/gradio_client/documentation.py:106: UserWarning: Could not get documentation group for <class 'gradio.mix.Series'>: No known documentation group for module 'gradio.mix'
  warnings.warn(f"Could not get documentation group for {cls}: {exc}")
Using box scale: (512, 512)

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.1 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):  File "/home/parsa/HDD/machine learning/AutoStudio/run.py", line 31, in <module>
    from model.autostudio import AUTOSTUDIO, AUTOSTUDIOPlus, AUTOSTUDIOXL, AUTOSTUDIOXLPlus
  File "/home/parsa/HDD/machine learning/AutoStudio/model/autostudio.py", line 16, in <module>
    from detectSam import process_image
  File "/home/parsa/HDD/machine learning/AutoStudio/DETECT_SAM/detectSam.py", line 14, in <module>
    import cv2
  File "/home/parsa/HDD/machine learning/AutoStudio/lib/python3.10/site-packages/cv2/__init__.py", line 181, in <module>
    bootstrap()
  File "/home/parsa/HDD/machine learning/AutoStudio/lib/python3.10/site-packages/cv2/__init__.py", line 153, in bootstrap
    native_module = importlib.import_module("cv2")
  File "/home/parsa/.pyenv/versions/3.10.6/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
AttributeError: _ARRAY_API not found
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/parsa/HDD/machine learning/AutoStudio/run.py:31 in <module>                                │
│                                                                                                  │
│    28                                                                                            │
│    29 from model.unet_2d_condition import UNet2DConditionModel                                   │
│    30 from model.utils import show_boxes, show_image, get_global_prompt                          │
│ ❱  31 from model.autostudio import AUTOSTUDIO, AUTOSTUDIOPlus, AUTOSTUDIOXL, AUTOSTUDIOXLPlus    │
│    32                                                                                            │
│    33 from detectSam import EFFICIENT_SAM_MODEL, GROUNDING_DINO_MODEL                            │
│    34                                                                                            │
│                                                                                                  │
│ /home/parsa/HDD/machine learning/AutoStudio/model/autostudio.py:16 in <module>                   │
│                                                                                                  │
│     13                                                                                           │
│     14 from PIL import Image                                                                     │
│     15 from typing import List                                                                   │
│ ❱   16 from detectSam import process_image                                                       │
│     17 from diffusers.pipelines.controlnet import MultiControlNetModel                           │
│     18 from safetensors import safe_open                                                         │
│     19 from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection                │
│                                                                                                  │
│ /home/parsa/HDD/machine learning/AutoStudio/DETECT_SAM/detectSam.py:14 in <module>               │
│                                                                                                  │
│    11 sys.path.append(f"{dpath}/YOLO-World/")                                                    │
│    12 sys.path.append(f"{dpath}/Grounding-DINO/")                                                │
│    13                                                                                            │
│ ❱  14 import cv2                                                                                 │
│    15 import time                                                                                │
│    16 import contextlib                                                                          │
│    17 import numpy as np                                                                         │
│                                                                                                  │
│ /home/parsa/HDD/machine learning/AutoStudio/lib/python3.10/site-packages/cv2/__init__.py:181 in  │
│ <module>                                                                                         │
│                                                                                                  │
│   178 │   if DEBUG: print('OpenCV loader: DONE')                                                 │
│   179                                                                                            │
│   180                                                                                            │
│ ❱ 181 bootstrap()                                                                                │
│   182                                                                                            │
│                                                                                                  │
│ /home/parsa/HDD/machine learning/AutoStudio/lib/python3.10/site-packages/cv2/__init__.py:153 in  │
│ bootstrap                                                                                        │
│                                                                                                  │
│   150 │                                                                                          │
│   151 │   py_module = sys.modules.pop("cv2")                                                     │
│   152 │                                                                                          │
│ ❱ 153 │   native_module = importlib.import_module("cv2")                                         │
│   154 │                                                                                          │
│   155 │   sys.modules["cv2"] = py_module                                                         │
│   156 │   setattr(py_module, "_native", native_module)                                           │
│                                                                                                  │
│ /home/parsa/.pyenv/versions/3.10.6/lib/python3.10/importlib/__init__.py:126 in import_module     │
│                                                                                                  │
│   123 │   │   │   if character != '.':                                                           │
│   124 │   │   │   │   break                                                                      │
│   125 │   │   │   level += 1                                                                     │
│ ❱ 126 │   return _bootstrap._gcd_import(name[level:], package, level)                            │
│   127                                                                                            │
│   128                                                                                            │
│   129 _RELOADING = {}                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ImportError: numpy.core.multiarray failed to import
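As the warning text itself says, the usual fix is pinning NumPy below 2 until the affected wheels (torch and cv2 here) are rebuilt against NumPy 2; a minimal check, assuming nothing else in the environment requires NumPy 2:

    # If this prints a 2.x version, downgrade with:  pip install "numpy<2"
    import numpy

    print(numpy.__version__)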

The results fall far short of the samples; why are the images we generate so unsatisfactory?

I ran the source code you released yesterday; the environment and results are described below:
ENV: Windows 11 x64, Python 3.10, Torch 2.1.0, CUDA 11.8, VS 2019
SD Model: dreamlike-anime-1.0, as you recommended
The demo.json config file was left untouched. The 5 dialogues did not all complete; it errored out at dialogue 5 turn 4, with the error message below.

Partial result images:
(slides 1-4 omitted)

Could you help analyze the cause: is it an environment problem, the CUDA version, CUDA precision, the model, the parameters, or output stability?

DETECT_SAM/efficient_sam.py error

DETECT_SAM/efficient_sam.py fails with a TorchScript-related error.

Error log:

Traceback (most recent call last):
  File "AutoStudio/run.py", line 231, in <module>
    output = autostudio.generate(
  File "AutoStudio/model/autostudio.py", line 891, in generate
    seg_img, detection = process_image(detect_model=dino_model, same_model=same_model, input_image=i[1][0], categories=character_prompt_full, device='cuda:1') #if CUDA out of spa
  File "AutoStudio/DETECT_SAM/detectSam.py", line 108, in process_image
    detections.mask = inference_with_boxes(
  File "AutoStudio/DETECT_SAM/efficient_sam.py", line 62, in inference_with_boxes
    mask = inference_with_box(image, box, model, device)
  File "AutoStudio/DETECT_SAM/efficient_sam.py", line 31, in inference_with_box
    predicted_logits, predicted_iou = model(

  File "/data/condaenvs/autostudio/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)

RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: shape '[0, 3, 1, 2]' is invalid for input of size 1572864
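Since the TorchScript interpreter gives almost no context, a hypothetical pre-flight check before the failing call in inference_with_box can narrow things down; the expected shapes are inferred from the code excerpt in the earlier issue (bbox_labels reshaped to [1, 1, 2]) and EfficientSAM's two-corner box prompt, so treat them as assumptions:

    import torch

    def check_sam_inputs(img_tensor: torch.Tensor,
                         bbox: torch.Tensor,
                         bbox_labels: torch.Tensor) -> None:
        # ToTensor() output: (3, H, W) floats in [0, 1]
        assert img_tensor.ndim == 3 and img_tensor.shape[0] == 3, img_tensor.shape
        # One box given as two (x, y) corner points
        assert bbox.shape == (1, 1, 2, 2), bbox.shape
        assert bbox_labels.shape == (1, 1, 2), bbox_labels.shape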


Question about evaluation metric aCCS in paper

Hi, I would like to ask what the average character-character similarity (aCCS) metric mentioned in your paper is; I cannot find any information about it online. I'm looking forward to your reply, thanks!

SDXL code

good job!
looking forward to the SDXL code!

Pipeline failed

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /root/code/AutoStudio/run.py:199 in │
│ │
│ 196 │ │ │ ind_offset = repeat_ind * LARGE_CONSTANT2 + args.seed_offset │
│ 197 │ │ │ vis_location = [dialogue, turn] │
│ 198 │ │ │ │
│ ❱ 199 │ │ │ output = autostudio.generate( │
│ 200 │ │ │ │ │ │ │ │ │ │ │ GROUNDING_DINO_MODEL, │
│ 201 │ │ │ │ │ │ │ │ │ │ │ EFFICIENT_SAM_MODEL, │
│ 202 │ │ │ │ │ │ │ │ │ │ │ character_database, │
│ │
│ /root/code/AutoStudio/model/autostudio.py:367 in generate │
│ │
│ 364 │ │ │ │ │ │ continue │
│ 365 │ │ │ │
│ 366 │ │ │ # prepare latent_guidance │
│ ❱ 367 │ │ │ latent_guidance_mask, latent_guidance_image = prepare_mid_image(guide_masks, │
│ 368 │ │ │ latent_guidance_mask = latent_guidance_mask.resize((int(width/8), int(height │
│ 369 │ │ │ latent_guidance_mask = np.array(latent_guidance_mask) │
│ 370 │ │ │ latent_guidance_mask = torch.from_numpy(latent_guidance_mask).to(self.device │
│ │
│ /root/code/AutoStudio/model/utils.py:21 in prepare_mid_image │
│ │
│ 18 print(f"Using box scale: {box_scale}") │
│ 19 │
│ 20 def prepare_mid_image(mask_tensor_list_512, single_obj_img_list, bboxes, height, width, │
│ ❱ 21 │ mask_tensor_512 = mask_tensor_list_512[0] │
│ 22 │ #m,n = mask_tensor_512.size() │
│ 23 │ m,n = width, height │
│ 24 │ new_mask_tensor = np.zeros((n, m)).astype(np.uint8) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
IndexError: list index out of range

The above error occurs every time and is not fixed by changing the inputs.
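The IndexError means mask_tensor_list_512 arrived empty, i.e. detection produced no guide masks upstream. A hedged sketch of a guard at the top of prepare_mid_image; the fallback return values are illustrative and must match what the real callers expect (the caller in autostudio.py calls .resize() on the mask, so PIL images are assumed):

    import numpy as np
    from PIL import Image

    def prepare_mid_image(mask_tensor_list_512, single_obj_img_list, bboxes,
                          height, width, *rest):
        # Guard: fall back to blank guidance instead of indexing an empty list.
        if not mask_tensor_list_512:
            blank = Image.fromarray(np.zeros((height, width), dtype=np.uint8))
            return blank, blank.convert("RGB")
        mask_tensor_512 = mask_tensor_list_512[0]
        ...  # original body continues here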

Found some missing pretrained models and hard-coded-path errors

Good project. I would also like to know how it differs from StoryDiffusion.

env: Windows 11 x64, Python 3.10.11 + torch 2.1.0 + cu11.8; prepared all stable-diffusion-v1-5 / sd-vae-ft-mse / IP-Adapter checkpoints.

Error messages in the terminal window:

1.------
D:\AITest\AutoStudio\model\pipeline_stable_diffusion.py:41: FutureWarning: Importing DiffusionPipeline or ImagePipelineOutput from diffusers.pipeline_utils is deprecated. Please import from diffusers.pipelines.pipeline_utils instead.
from diffusers.pipeline_utils import DiffusionPipeline
Using box scale: (512, 512)
D:\AITest\AutoStudio\DETECT_SAM/Grounding-DINO\groundingdino\models\GroundingDINO\ms_deform_attn.py:31: UserWarning: Failed to load custom C++ ops. Running on CPU mode Only!
warnings.warn("Failed to load custom C++ ops. Running on CPU mode Only!")
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ D:\AITest\AutoStudio\run_me.py:31 in │
│ │
│ 28 │
│ 29 from model.unet_2d_condition import UNet2DConditionModel │
│ 30 from model.utils import show_boxes, show_image, get_global_prompt │
│ ❱ 31 from model.autostudio import AUTOSTUDIO, AUTOSTUDIOPlus, AUTOSTUDIOXL, AUTOSTUDIOXLPlus │
│ 32 │
│ 33 from detectSam import EFFICIENT_SAM_MODEL, GROUNDING_DINO_MODEL │
│ 34 │
│ │
│ D:\AITest\AutoStudio\model\autostudio.py:16 in │
│ │
│ 13 │
│ 14 from PIL import Image │
│ 15 from typing import List │
│ ❱ 16 from detectSam import process_image │
│ 17 from diffusers.pipelines.controlnet import MultiControlNetModel │
│ 18 from safetensors import safe_open │
│ 19 from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection │
│ │
│ D:\AITest\AutoStudio/DETECT_SAM\detectSam.py:36 in │
│ │
│ 33 │ RESULTS = "results" │
│ 34 │ │
│ 35 │ DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu") │
│ ❱ 36 │ EFFICIENT_SAM_MODEL = load(device=DEVICE) │
│ 37 │ GROUNDING_DINO_MODEL = Model(f"{dpath}/Grounding-DINO/groundingdino/config/Grounding │
│ 38 │ │
│ 39 │ BOUNDING_BOX_ANNOTATOR = sv.BoundingBoxAnnotator() │
│ │
│ D:\AITest\AutoStudio/DETECT_SAM\efficient_sam.py:14 in load │
│ │
│ 11 │
│ 12 def load(device: torch.device) -> torch.jit.ScriptModule: │
│ 13 │ if device.type == "cuda": │
│ ❱ 14 │ │ model = torch.jit.load(GPU_EFFICIENT_SAM_CHECKPOINT) │
│ 15 │ else: │
│ 16 │ │ model = torch.jit.load(CPU_EFFICIENT_SAM_CHECKPOINT) │
│ 17 │ model.eval() │
│ │
│ D:\AITest\AutoStudio\python310\lib\site-packages\torch\jit\_serialization.py:152 in load │
│ │
│ 149 │ │
│ 150 │ if isinstance(f, str): │
│ 151 │ │ if not os.path.exists(f): # type: ignore[type-var] │
│ ❱ 152 │ │ │ raise ValueError(f"The provided filename {f} does not exist") # type: ignor │
│ 153 │ │ if os.path.isdir(f): │
│ 154 │ │ │ raise ValueError(f"The provided filename {f} is a directory") # type: ignor │
│ 155 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: The provided filename D:\AITest\AutoStudio\DETECT_SAM/pretrain/efficient_sam_s_gpu.jit does not exist
Press any key to continue . . .

I found and downloaded efficient_sam_s_gpu.jit from "https://huggingface.co/merve/EfficientSAM", put it in the directory above, and continued.
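For anyone else hitting this, a sketch of fetching that checkpoint programmatically; the repo id and filename come from this report, so verify them before relying on it:

    from huggingface_hub import hf_hub_download

    # Download the TorchScript checkpoint into the folder the loader expects.
    path = hf_hub_download(
        repo_id="merve/EfficientSAM",
        filename="efficient_sam_s_gpu.jit",
        local_dir="DETECT_SAM/pretrain",
    )
    print(path)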

2.------
D:\AITest\AutoStudio\model\pipeline_stable_diffusion.py:41: FutureWarning: Importing DiffusionPipeline or ImagePipelineOutput from diffusers.pipeline_utils is deprecated. Please import from diffusers.pipelines.pipeline_utils instead.
from diffusers.pipeline_utils import DiffusionPipeline
Using box scale: (512, 512)
D:\AITest\AutoStudio\DETECT_SAM/Grounding-DINO\groundingdino\models\GroundingDINO\ms_deform_attn.py:31: UserWarning: Failed to load custom C++ ops. Running on CPU mode Only!
warnings.warn("Failed to load custom C++ ops. Running on CPU mode Only!")
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ D:\AITest\AutoStudio\python310\lib\site-packages\transformers\configuration_utils.py:629 in │
│ _get_config_dict │
│ │
│ 626 │ │ │ │
│ 627 │ │ │ try: │
│ 628 │ │ │ │ # Load from local folder or from cache or download from model Hub and ca │
│ ❱ 629 │ │ │ │ resolved_config_file = cached_file( │
│ 630 │ │ │ │ │ pretrained_model_name_or_path, │
│ 631 │ │ │ │ │ configuration_file, │
│ 632 │ │ │ │ │ cache_dir=cache_dir, │
│ │
│ D:\AITest\AutoStudio\python310\lib\site-packages\transformers\utils\hub.py:417 in cached_file │
│ │
│ 414 │ user_agent = http_user_agent(user_agent) │
│ 415 │ try: │
│ 416 │ │ # Load from URL or cache if already cached │
│ ❱ 417 │ │ resolved_file = hf_hub_download( │
│ 418 │ │ │ path_or_repo_id, │
│ 419 │ │ │ filename, │
│ 420 │ │ │ subfolder=None if len(subfolder) == 0 else subfolder, │
│ │
│ D:\AITest\AutoStudio\python310\lib\site-packages\huggingface_hub\utils\_validators.py:110 in │
│ inner_fn │
│ │
│ 107 │ │ │ kwargs.items(), # Kwargs values │
│ 108 │ │ ): │
│ 109 │ │ │ if arg_name in ["repo_id", "from_id", "to_id"]: │
│ ❱ 110 │ │ │ │ validate_repo_id(arg_value) │
│ 111 │ │ │ │
│ 112 │ │ │ elif arg_name == "token" and arg_value is not None: │
│ 113 │ │ │ │ has_token = True │
│ │
│ D:\AITest\AutoStudio\python310\lib\site-packages\huggingface_hub\utils\_validators.py:158 in │
│ validate_repo_id │
│ │
│ 155 │ │ raise HFValidationError(f"Repo id must be a string, not {type(repo_id)}: '{repo │
│ 156 │ │
│ 157 │ if repo_id.count("/") > 1: │
│ ❱ 158 │ │ raise HFValidationError( │
│ 159 │ │ │ "Repo id must be in the form 'repo_name' or 'namespace/repo_name':" │
│ 160 │ │ │ f" '{repo_id}'. Use repo_type argument if needed." │
│ 161 │ │ ) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name':
'/data2/chengjunhao/THEATERGEN/pretrained_models/dino_bert'. Use repo_type argument if needed.

During handling of the above exception, another exception occurred:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ D:\AITest\AutoStudio\run_me.py:31 in │
│ │
│ 28 │
│ 29 from model.unet_2d_condition import UNet2DConditionModel │
│ 30 from model.utils import show_boxes, show_image, get_global_prompt │
│ ❱ 31 from model.autostudio import AUTOSTUDIO, AUTOSTUDIOPlus, AUTOSTUDIOXL, AUTOSTUDIOXLPlus │
│ 32 │
│ 33 from detectSam import EFFICIENT_SAM_MODEL, GROUNDING_DINO_MODEL │
│ 34 │
│ │
│ D:\AITest\AutoStudio\model\autostudio.py:16 in │
│ │
│ 13 │
│ 14 from PIL import Image │
│ 15 from typing import List │
│ ❱ 16 from detectSam import process_image │
│ 17 from diffusers.pipelines.controlnet import MultiControlNetModel │
│ 18 from safetensors import safe_open │
│ 19 from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection │
│ │
│ D:\AITest\AutoStudio/DETECT_SAM\detectSam.py:37 in │
│ │
│ 34 │ │
│ 35 │ DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu") │
│ 36 │ EFFICIENT_SAM_MODEL = load(device=DEVICE) │
│ ❱ 37 │ GROUNDING_DINO_MODEL = Model(f"{dpath}/Grounding-DINO/groundingdino/config/Grounding │
│ 38 │ │
│ 39 │ BOUNDING_BOX_ANNOTATOR = sv.BoundingBoxAnnotator() │
│ 40 │ MASK_ANNOTATOR = sv.MaskAnnotator() │
│ │
│ D:\AITest\AutoStudio\DETECT_SAM/Grounding-DINO\groundingdino\util\inference.py:142 in __init__ │
│ │
│ 139 │ │ model_checkpoint_path: str, │
│ 140 │ │ device: str = "cuda" │
│ 141 │ ): │
│ ❱ 142 │ │ self.model = load_model( │
│ 143 │ │ │ model_config_path=model_config_path, │
│ 144 │ │ │ model_checkpoint_path=model_checkpoint_path, │
│ 145 │ │ │ device=device │
│ │
│ D:\AITest\AutoStudio\DETECT_SAM/Grounding-DINO\groundingdino\util\inference.py:40 in load_model │
│ │
│ 37 def load_model(model_config_path: str, model_checkpoint_path: str, device: str = "cuda") │
│ 38 │ args = SLConfig.fromfile(model_config_path) │
│ 39 │ args.device = device │
│ ❱ 40 │ model = build_model(args) │
│ 41 │ checkpoint = torch.load(model_checkpoint_path, map_location="cpu") │
│ 42 │ model.load_state_dict(clean_state_dict(checkpoint["model"]), strict=False) │
│ 43 │ model.eval() │
│ │
│ D:\AITest\AutoStudio\DETECT_SAM/Grounding-DINO\groundingdino\models\__init__.py:17 in │
│ build_model │
│ │
│ 14 │ │
│ 15 │ assert args.modelname in MODULE_BUILD_FUNCS.module_dict │
│ 16 │ build_func = MODULE_BUILD_FUNCS.get(args.modelname) │
│ ❱ 17 │ model = build_func(args) │
│ 18 │ return model │
│ 19 │
│ │
│ D:\AITest\AutoStudio\DETECT_SAM/Grounding-DINO\groundingdino\models\GroundingDINO\groundingdino. │
│ py:395 in build_groundingdino │
│ │
│ 392 │ dec_pred_bbox_embed_share = args.dec_pred_bbox_embed_share │
│ 393 │ sub_sentence_present = args.sub_sentence_present │
│ 394 │ │
│ ❱ 395 │ model = GroundingDINO( │
│ 396 │ │ backbone, │
│ 397 │ │ transformer, │
│ 398 │ │ num_queries=args.num_queries, │
│ │
│ D:\AITest\AutoStudio\DETECT_SAM/Grounding-DINO\groundingdino\models\GroundingDINO\groundingdino. │
│ py:115 in __init__ │
│ │
│ 112 │ │ │
│ 113 │ │ # bert │
│ 114 │ │ self.tokenizer = get_tokenlizer.get_tokenlizer(text_encoder_type) │
│ ❱ 115 │ │ self.bert = get_tokenlizer.get_pretrained_language_model(text_encoder_type) │
│ 116 │ │ self.bert.pooler.dense.weight.requires_grad_(False) │
│ 117 │ │ self.bert.pooler.dense.bias.requires_grad_(False) │
│ 118 │ │ self.bert = BertModelWarper(bert_model=self.bert) │
│ │
│ D:\AITest\AutoStudio\DETECT_SAM/Grounding-DINO\groundingdino\util\get_tokenlizer.py:25 in │
│ get_pretrained_language_model │
│ │
│ 22 │
│ 23 def get_pretrained_language_model(text_encoder_type): │
│ 24 │ if text_encoder_type == "bert-base-uncased" or (os.path.isdir(text_encoder_type) and │
│ ❱ 25 │ │ return BertModel.from_pretrained("/data2/chengjunhao/THEATERGEN/pretrained_model │
│ 26 │ if text_encoder_type == "roberta-base": │
│ 27 │ │ return RobertaModel.from_pretrained(text_encoder_type) │
│ 28 │
│ │
│ D:\AITest\AutoStudio\python310\lib\site-packages\transformers\modeling_utils.py:2251 in │
│ from_pretrained │
│ │
│ 2248 │ │ # Load config if we don't provide a configuration │
│ 2249 │ │ if not isinstance(config, PretrainedConfig): │
│ 2250 │ │ │ config_path = config if config is not None else pretrained_model_name_or_pat │
│ ❱ 2251 │ │ │ config, model_kwargs = cls.config_class.from_pretrained( │
│ 2252 │ │ │ │ config_path, │
│ 2253 │ │ │ │ cache_dir=cache_dir, │
│ 2254 │ │ │ │ return_unused_kwargs=True, │
│ │
│ D:\AITest\AutoStudio\python310\lib\site-packages\transformers\configuration_utils.py:547 in │
│ from_pretrained │
│ │
│ 544 │ │ assert config.output_attentions == True │
│ 545 │ │ assert unused_kwargs == {"foo": False} │
│ 546 │ │ ```""" │
│ ❱ 547 │ │ config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwarg │
│ 548 │ │ if "model_type" in config_dict and hasattr(cls, "model_type") and config_dict["m │
│ 549 │ │ │ logger.warning( │
│ 550 │ │ │ │ f"You are using a model of type {config_dict['model_type']} to instantia │
│ │
│ D:\AITest\AutoStudio\python310\lib\site-packages\transformers\configuration_utils.py:574 in │
│ get_config_dict │
│ │
│ 571 │ │ """ │
│ 572 │ │ original_kwargs = copy.deepcopy(kwargs) │
│ 573 │ │ # Get config dict associated with the base config file │
│ ❱ 574 │ │ config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwar │
│ 575 │ │ if "_commit_hash" in config_dict: │
│ 576 │ │ │ original_kwargs["_commit_hash"] = config_dict["_commit_hash"] │
│ 577 │
│ │
│ D:\AITest\AutoStudio\python310\lib\site-packages\transformers\configuration_utils.py:650 in │
│ _get_config_dict │
│ │
│ 647 │ │ │ │ raise │
│ 648 │ │ │ except Exception: │
│ 649 │ │ │ │ # For any other exception, we throw a generic error. │
│ ❱ 650 │ │ │ │ raise EnvironmentError( │
│ 651 │ │ │ │ │ f"Can't load the configuration of '{pretrained_model_name_or_path}'. │
│ 652 │ │ │ │ │ " from 'https://huggingface.co/models', make sure you don't have a l │
│ 653 │ │ │ │ │ f" name. Otherwise, make sure '{pretrained_model_name_or_path}' is t │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
OSError: Can't load the configuration of '/data2/chengjunhao/THEATERGEN/pretrained_models/dino_bert'. If you were
trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name.
Otherwise, make sure '/data2/chengjunhao/THEATERGEN/pretrained_models/dino_bert' is the correct path to a directory
containing a config.json file
Press any key to continue . . .

I checked "https://github.com/IDEA-Research/GroundingDINO" and "https://github.com/donahowe/Theatergen", but could not find your pretrained "dino_bert" model.

Please help me check and analyze the error message, thanks!

Compared with StoryDiffusion, where do the technical differences lie?

StoryDiffusion already uses an XL model and its output looks good; after deploying and verifying it locally, the output also seems fairly stable. Example output below:

(example image omitted: examples01-image-09)

Could you explain where this project differs from the StoryDiffusion project, so we can learn from it? Thanks!

Author, what causes this error and how can it be fixed? ImportError: cannot import name '_C' from 'groundingdino'

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /shared/jiayu_ti/git_project/backing_up/AutoStudio/run.py:32 in │
│ │
│ 29 │
│ 30 from model.unet_2d_condition import UNet2DConditionModel │
│ 31 from model.utils import show_boxes, show_image, get_global_prompt │
│ ❱ 32 from model.autostudio import AUTOSTUDIO, AUTOSTUDIOPlus, AUTOSTUDIOXL, AUTOSTUDIOXLPlus │
│ 33 │
│ 34 from detectSam import EFFICIENT_SAM_MODEL, GROUNDING_DINO_MODEL │
│ 35 │
│ │
│ /shared/jiayu_ti/git_project/backing_up/AutoStudio/model/autostudio.py:16 in │
│ │
│ 13 │
│ 14 from PIL import Image │
│ 15 from typing import List │
│ ❱ 16 from detectSam import process_image │
│ 17 from diffusers.pipelines.controlnet import MultiControlNetModel │
│ 18 from safetensors import safe_open │
│ 19 from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection │
│ │
│ /shared/jiayu_ti/git_project/backing_up/AutoStudio/DETECT_SAM/detectSam.py:22 in │
│ │
│ 19 import supervision as sv │
│ 20 import torch │
│ 21 │
│ ❱ 22 from groundingdino.util.inference import Model, load_image │
│ 23 │
│ 24 #from mmengine.runner import Runner │
│ 25 #from mmengine.config import Config │
│ │
│ /shared/jiayu_ti/git_project/backing_up/AutoStudio/groundingdino/util/inference.py:20 in │
│ │
│ │
│ 17 sys.path.append(f"{ppath}/") │
│ 18 │
│ 19 import groundingdino.datasets.transforms as T │
│ ❱ 20 from groundingdino.models import build_model │
│ 21 from groundingdino.util.misc import clean_state_dict │
│ 22 from groundingdino.util.slconfig import SLConfig │
│ 23 from groundingdino.util.utils import get_phrases_from_posmap │
│ │
│ /shared/jiayu_ti/git_project/backing_up/AutoStudio/groundingdino/models/__init__.py:8 in │
│ │
│ │
│ 5 # Licensed under the Apache License, Version 2.0 [see LICENSE for details] │
│ 6 # ------------------------------------------------------------------------ │
│ 7 # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved │
│ ❱ 8 from .GroundingDINO import build_groundingdino │
│ 9 │
│ 10 │
│ 11 def build_model(args): │
│ │
│ /shared/jiayu_ti/git_project/backing_up/AutoStudio/groundingdino/models/GroundingDINO/__init__ │
│ .py:15 in │
│ │
│ 12 # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved. │
│ 13 # ------------------------------------------------------------------------ │
│ 14 │
│ ❱ 15 from .groundingdino import build_groundingdino │
│ 16 │
│ │
│ /shared/jiayu_ti/git_project/backing_up/AutoStudio/groundingdino/models/GroundingDINO/groundingd │
│ ino.py:54 in │
│ │
│ 51 │ generate_masks_with_special_tokens, │
│ 52 │ generate_masks_with_special_tokens_and_transfer_map, │
│ 53 ) │
│ ❱ 54 from .transformer import build_transformer │
│ 55 from .utils import MLP, ContrastiveEmbed, sigmoid_focal_loss │
│ 56 │
│ 57 │
│ │
│ /shared/jiayu_ti/git_project/backing_up/AutoStudio/groundingdino/models/GroundingDINO/transforme │
│ r.py:36 in │
│ │
│ 33 from groundingdino.util.misc import inverse_sigmoid │
│ 34 │
│ 35 from .fuse_modules import BiAttentionBlock │
│ ❱ 36 from .ms_deform_attn import MultiScaleDeformableAttention as MSDeformAttn │
│ 37 from .transformer_vanilla import TransformerEncoderLayer │
│ 38 from .utils import ( │
│ 39 │ MLP, │
│ │
│ /shared/jiayu_ti/git_project/backing_up/AutoStudio/groundingdino/models/GroundingDINO/ms_deform_ │
│ attn.py:28 in │
│ │
│ 25 from torch.autograd.function import once_differentiable │
│ 26 from torch.nn.init import constant_, xavier_uniform_ │
│ 27 │
│ ❱ 28 from groundingdino import _C │
│ 29 try: │
│ 30 │ from groundingdino import _C │
│ 31 except: │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ImportError: cannot import name '_C' from 'groundingdino' (/shared/jiayu_ti/git_project/backing_up/AutoStudio/groundingdino/__init__.py)
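Note that the excerpt shows an unguarded "from groundingdino import _C" on line 28, before the try/except that was presumably meant to catch it; moving the import inside the guard at least lets the CPU-fallback warning path run. A minimal sketch (the fallback itself is an assumption; without the compiled ops, deformable attention still cannot run on GPU):

    import warnings

    try:
        from groundingdino import _C  # compiled CUDA/C++ deformable-attention ops
    except ImportError:
        _C = None
        warnings.warn("Failed to load custom C++ ops. Running on CPU mode Only!")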
