
Image2Paragraph's Issues

TypeError: issubclass() arg 1 must be a class

Hello. Thanks for the great project.

I get the error "TypeError: issubclass() arg 1 must be a class"
when I run python main.py --image_src [image_path] --out_image_name [out_file_name].

I don't know how to solve it. Please give me some advice.

I used these commands to set up the environment:

  • conda create -n i2p python=3.8
  • pip install Pillow==9.5
  • pip install requests
  • pip install -r requirements.txt

Full error output below:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/matsuzaki.takumi/workspace/nissan/Image2Paragraph/main.py:2 in │
│ │
│ 1 import argparse │
│ ❱ 2 from models.image_text_transformation import ImageTextTransformation │
│ 3 from utils.util import display_images_and_text │
│ 4 │
│ 5 if __name__ == '__main__': │
│ │
│ /home/matsuzaki.takumi/workspace/nissan/Image2Paragraph/models/image_text_transformation.py:5 in │
│ │
│ │
│ 2 from models.grit_model import DenseCaptioning │
│ 3 from models.gpt_model import ImageToText │
│ 4 from models.controlnet_model import TextToImage │
│ ❱ 5 from models.region_semantic import RegionSemantic │
│ 6 from utils.util import read_image_width_height, display_images_and_text, resize_long_edg │
│ 7 import argparse │
│ 8 from PIL import Image │
│ │
│ /home/matsuzaki.takumi/workspace/nissan/Image2Paragraph/models/region_semantic.py:2 in │
│ │
│ 1 from models.segment_models.semgent_anything_model import SegmentAnything │
│ ❱ 2 from models.segment_models.semantic_segment_anything_model import SemanticSegment │
│ 3 from models.segment_models.edit_anything_model import EditAnything │
│ 4 │
│ 5 │
│ │
│ /home/matsuzaki.takumi/workspace/nissan/Image2Paragraph/models/segment_models/semantic_segment_a │
│ nything_model.py:16 in │
│ │
│ 13 from utils.util import resize_long_edge, resize_long_edge_cv2 │
│ 14 # from mmdet.core.visualization.image import imshow_det_bboxes # comment this line if yo │
│ 15 │
│ ❱ 16 nlp = spacy.load('en_core_web_sm') │
│ 17 │
│ 18 class SemanticSegment(): │
│ 19 │ def __init__(self, device): │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/site-packages/spacy/__init__.py:50 in load │
│ │
│ 47 │ │ keyed by section values in dot notation. │
│ 48 │ RETURNS (Language): The loaded nlp object. │
│ 49 │ """ │
│ ❱ 50 │ return util.load_model( │
│ 51 │ │ name, vocab=vocab, disable=disable, exclude=exclude, config=config │
│ 52 │ ) │
│ 53 │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/site-packages/spacy/util.py:324 in │
│ load_model │
│ │
│ 321 │ │ if name.startswith("blank:"): # shortcut for blank model │
│ 322 │ │ │ return get_lang_class(name.replace("blank:", ""))() │
│ 323 │ │ if is_package(name): # installed as package │
│ ❱ 324 │ │ │ return load_model_from_package(name, **kwargs) │
│ 325 │ │ if Path(name).exists(): # path to model data directory │
│ 326 │ │ │ return load_model_from_path(Path(name), **kwargs) │
│ 327 │ elif hasattr(name, "exists"): # Path or Path-like to model data │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/site-packages/spacy/util.py:357 in │
│ load_model_from_package │
│ │
│ 354 │ RETURNS (Language): The loaded nlp object. │
│ 355 │ """ │
│ 356 │ cls = importlib.import_module(name) │
│ ❱ 357 │ return cls.load(vocab=vocab, disable=disable, exclude=exclude, config=config) │
│ 358 │
│ 359 │
│ 360 def load_model_from_path( │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/site-packages/en_core_web_sm/__init__.py:10 │
│ in load │
│ │
│ 7 │
│ 8 │
│ 9 def load(**overrides): │
│ ❱ 10 │ return load_model_from_init_py(__file__, **overrides) │
│ 11 │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/site-packages/spacy/util.py:517 in │
│ load_model_from_init_py │
│ │
│ 514 │ data_path = model_path / data_dir │
│ 515 │ if not model_path.exists(): │
│ 516 │ │ raise IOError(Errors.E052.format(path=data_path)) │
│ ❱ 517 │ return load_model_from_path( │
│ 518 │ │ data_path, │
│ 519 │ │ vocab=vocab, │
│ 520 │ │ meta=meta, │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/site-packages/spacy/util.py:392 in │
│ load_model_from_path │
│ │
│ 389 │ config_path = model_path / "config.cfg" │
│ 390 │ overrides = dict_to_dot(config) │
│ 391 │ config = load_config(config_path, overrides=overrides) │
│ ❱ 392 │ nlp = load_model_from_config(config, vocab=vocab, disable=disable, exclude=exclude) │
│ 393 │ return nlp.from_disk(model_path, exclude=exclude, overrides=overrides) │
│ 394 │
│ 395 │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/site-packages/spacy/util.py:429 in │
│ load_model_from_config │
│ │
│ 426 │ # This will automatically handle all codes registered via the languages │
│ 427 │ # registry, including custom subclasses provided via entry points │
│ 428 │ lang_cls = get_lang_class(nlp_config["lang"]) │
│ ❱ 429 │ nlp = lang_cls.from_config( │
│ 430 │ │ config, │
│ 431 │ │ vocab=vocab, │
│ 432 │ │ disable=disable, │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/site-packages/spacy/language.py:1672 in │
│ from_config │
│ │
│ 1669 │ │ │ │ │ factory = pipe_cfg.pop("factory") │
│ 1670 │ │ │ │ │ # The pipe name (key in the config) here is the unique name │
│ 1671 │ │ │ │ │ # of the component, not necessarily the factory │
│ ❱ 1672 │ │ │ │ │ nlp.add_pipe( │
│ 1673 │ │ │ │ │ │ factory, │
│ 1674 │ │ │ │ │ │ name=pipe_name, │
│ 1675 │ │ │ │ │ │ config=pipe_cfg, │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/site-packages/spacy/language.py:774 in │
│ add_pipe │
│ │
│ 771 │ │ │ │ │ lang=util.get_object_name(self), │
│ 772 │ │ │ │ │ lang_code=self.lang, │
│ 773 │ │ │ │ ) │
│ ❱ 774 │ │ │ pipe_component = self.create_pipe( │
│ 775 │ │ │ │ factory_name, │
│ 776 │ │ │ │ name=name, │
│ 777 │ │ │ │ config=config, │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/site-packages/spacy/language.py:660 in │
│ create_pipe │
│ │
│ 657 │ │ cfg = {factory_name: config} │
│ 658 │ │ # We're calling the internal _fill here to avoid constructing the │
│ 659 │ │ # registered functions twice │
│ ❱ 660 │ │ resolved = registry.resolve(cfg, validate=validate) │
│ 661 │ │ filled = registry.fill({"cfg": cfg[factory_name]}, validate=validate)["cfg"] │
│ 662 │ │ filled = Config(filled) │
│ 663 │ │ filled["factory"] = factory_name │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/site-packages/thinc/config.py:746 in │
│ resolve │
│ │
│ 743 │ │ overrides: Dict[str, Any] = {}, │
│ 744 │ │ validate: bool = True, │
│ 745 │ ) -> Dict[str, Any]: │
│ ❱ 746 │ │ resolved, _ = cls._make( │
│ 747 │ │ │ config, schema=schema, overrides=overrides, validate=validate, resolve=True │
│ 748 │ │ ) │
│ 749 │ │ return resolved │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/site-packages/thinc/config.py:795 in _make │
│ │
│ 792 │ │ orig_config = config │
│ 793 │ │ if not is_interpolated: │
│ 794 │ │ │ config = Config(orig_config).interpolate() │
│ ❱ 795 │ │ filled, _, resolved = cls._fill( │
│ 796 │ │ │ config, schema, validate=validate, overrides=overrides, resolve=resolve │
│ 797 │ │ ) │
│ 798 │ │ filled = Config(filled, section_order=section_order) │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/site-packages/thinc/config.py:850 in _fill │
│ │
│ 847 │ │ │ │ │ field = schema.fields[key] │
│ 848 │ │ │ │ │ schema.fields[key] = copy_model_field(field, Any) │
│ 849 │ │ │ │ promise_schema = cls.make_promise_schema(value, resolve=resolve) │
│ ❱ 850 │ │ │ │ filled[key], validation[v_key], final[key] = cls._fill( │
│ 851 │ │ │ │ │ value, │
│ 852 │ │ │ │ │ promise_schema, │
│ 853 │ │ │ │ │ validate=validate, │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/site-packages/thinc/config.py:849 in _fill │
│ │
│ 846 │ │ │ │ │ # validation if it doesn't receive the function return value │
│ 847 │ │ │ │ │ field = schema.fields[key] │
│ 848 │ │ │ │ │ schema.fields[key] = copy_model_field(field, Any) │
│ ❱ 849 │ │ │ │ promise_schema = cls.make_promise_schema(value, resolve=resolve) │
│ 850 │ │ │ │ filled[key], validation[v_key], final[key] = cls._fill( │
│ 851 │ │ │ │ │ value, │
│ 852 │ │ │ │ │ promise_schema, │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/site-packages/thinc/config.py:1057 in │
│ make_promise_schema │
│ │
│ 1054 │ │ │ │ name = RESERVED_FIELDS.get(param.name, param.name) │
│ 1055 │ │ │ │ sig_args[name] = (annotation, default) │
│ 1056 │ │ sig_args["config"] = _PromiseSchemaConfig │
│ ❱ 1057 │ │ return create_model("ArgModel", **sig_args) │
│ 1058 │
│ 1059 │
│ 1060 __all__ = ["Config", "registry", "ConfigValidationError"] │
│ │
│ in pydantic.main.create_model:990 │
│ │
│ in pydantic.main.ModelMetaclass.__new__:299 │
│ │
│ in pydantic.fields.ModelField.infer:411 │
│ │
│ in pydantic.fields.ModelField.__init__:342 │
│ │
│ in pydantic.fields.ModelField.prepare:451 │
│ │
│ in pydantic.fields.ModelField._type_analysis:550 │
│ │
│ /home/matsuzaki.takumi/.conda/envs/i2p/lib/python3.8/typing.py:774 in __subclasscheck__ │
│ │
│ 771 │ def __subclasscheck__(self, cls): │
│ 772 │ │ if self._special: │
│ 773 │ │ │ if not isinstance(cls, _GenericAlias): │
│ ❱ 774 │ │ │ │ return issubclass(cls, self.__origin__) │
│ 775 │ │ │ if cls._special: │
│ 776 │ │ │ │ return issubclass(cls.__origin__, self.__origin__) │
│ 777 │ │ raise TypeError("Subscripted generics cannot be used with" │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: issubclass() arg 1 must be a class
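
For anyone else hitting this: a failure inside pydantic.fields.ModelField._type_analysis like the above is a known symptom of a version mismatch between typing-extensions and pydantic 1.x (typing_extensions 4.6+ broke older pydantic 1.x releases). If that is what is happening here, pinning one of the two may help:

  • pip install "typing-extensions<4.6.0"
  • or: pip install "pydantic>=1.10.8,<2"

This is a guess from the traceback, not a confirmed fix for this repo's exact dependency set.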

Unable to load weights from checkpoint file

Hi, this is nice work. I followed the install.md to build the virtual env with spacy==3.0.0. But when I run the example with python main.py --image_src "examples/3.jpg" --out_image_name "output/3_result.jpg", there is an OSError as follows:
------This is time-consuming, please wait...------
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /miniconda3/envs/i2p/lib/python3.8/site-packages/diffusers/models/modeling_utils.py │
│ :109 in load_state_dict │
│ │
│ 106 │ │ if os.path.basename(checkpoint_file) == _add_variant(WEIGHTS_NAME, variant): │
│ 107 │ │ │ return torch.load(checkpoint_file, map_location="cpu") │
│ 108 │ │ else: │
│ ❱ 109 │ │ │ return safetensors.torch.load_file(checkpoint_file, device="cpu") │
│ 110 │ except Exception as e: │
│ 111 │ │ try: │
│ 112 │ │ │ with open(checkpoint_file) as f: │
│ │
│ /miniconda3/envs/i2p/lib/python3.8/site-packages/safetensors/torch.py:261 in │
│ load_file │
│ │
│ 258 │ result = {} │
│ 259 │ with safe_open(filename, framework="pt", device=device) as f: │
│ 260 │ │ for k in f.keys(): │
│ ❱ 261 │ │ │ result[k] = f.get_tensor(k) │
│ 262 │ return result │
│ 263 │
│ 264 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: module 'torch' has no attribute 'frombuffer'

During handling of the above exception, another exception occurred:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /miniconda3/envs/i2p/lib/python3.8/site-packages/diffusers/models/modeling_utils.py │
│ :113 in load_state_dict │
│ │
│ 110 │ except Exception as e: │
│ 111 │ │ try: │
│ 112 │ │ │ with open(checkpoint_file) as f: │
│ ❱ 113 │ │ │ │ if f.read().startswith("version"): │
│ 114 │ │ │ │ │ raise OSError( │
│ 115 │ │ │ │ │ │ "You seem to have cloned a repository without having git-lfs ins │
│ 116 │ │ │ │ │ │ "git-lfs and run git lfs install followed by git lfs pull in │
│ │
│/miniconda3/envs/i2p/lib/python3.8/codecs.py:322 in decode │
│ │
│ 319 │ def decode(self, input, final=False): │
│ 320 │ │ # decode input (taking the buffer into account) │
│ 321 │ │ data = self.buffer + input │
│ ❱ 322 │ │ (result, consumed) = self._buffer_decode(data, self.errors, final) │
│ 323 │ │ # keep undecoded input until the next call │
│ 324 │ │ self.buffer = data[consumed:] │
│ 325 │ │ return result │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbc in position 0: invalid start byte

During handling of the above exception, another exception occurred:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /LOG/realman/LLM/Image2Paragraph/main.py:23 in │
│ │
│ 20 │ │
│ 21 │ args = parser.parse_args() │
│ 22 │ │
│ ❱ 23 │ processor = ImageTextTransformation(args) │
│ 24 │ generated_text = processor.image_to_text(args.image_src) │
│ 25 │ generated_image = processor.text_to_image(generated_text) │
│ 26 │ ## then text to image │
│ │
│ /LOG/realman/LLM/Image2Paragraph/models/image_text_transformation.py:24 in __init__ │
│ │
│ 21 │ def __init__(self, args): │
│ 22 │ │ # Load your big model here │
│ 23 │ │ self.args = args │
│ ❱ 24 │ │ self.init_models() │
│ 25 │ │ self.ref_image = None │
│ 26 │ │
│ 27 │ def init_models(self): │
│ │
│ /LOG/realman/LLM/Image2Paragraph/models/image_text_transformation.py:38 in init_models │
│ │
│ 35 │ │ self.image_caption_model = ImageCaptioning(device=self.args.image_caption_device │
│ 36 │ │ self.dense_caption_model = DenseCaptioning(device=self.args.dense_caption_device │
│ 37 │ │ self.gpt_model = ImageToText(openai_key) │
│ ❱ 38 │ │ self.controlnet_model = TextToImage(device=self.args.contolnet_device) │
│ 39 │ │ self.region_semantic_model = RegionSemantic(device=self.args.semantic_segment_de │
│ 40 │ │ print('\033[1;32m' + "Model initialization finished!".center(50, '-') + '\033[0m │
│ 41 │
│ │
│ /LOG/realman/LLM/Image2Paragraph/models/controlnet_model.py:15 in __init__ │
│ │
│ 12 class TextToImage: │
│ 13 │ def __init__(self, device): │
│ 14 │ │ self.device = device │
│ ❱ 15 │ │ self.model = self.initialize_model() │
│ 16 │ │
│ 17 │ def initialize_model(self): │
│ 18 │ │ if self.device == 'cpu': │
│ │
│ /LOG/realman/LLM/Image2Paragraph/models/controlnet_model.py:22 in initialize_model │
│ │
│ 19 │ │ │ self.data_type = torch.float32 │
│ 20 │ │ else: │
│ 21 │ │ │ self.data_type = torch.float16 │
│ ❱ 22 │ │ controlnet = ControlNetModel.from_pretrained( │
│ 23 │ │ │ "fusing/stable-diffusion-v1-5-controlnet-canny", │
│ 24 │ │ │ torch_dtype=self.data_type, │
│ 25 │ │ │ map_location=self.device, # Add this line │
│ │
│ /miniconda3/envs/i2p/lib/python3.8/site-packages/diffusers/models/modeling_utils.py │
│ :602 in from_pretrained │
│ │
│ 599 │ │ │ │ # if device_map is None, load the state dict and move the params from me │
│ 600 │ │ │ │ if device_map is None: │
│ 601 │ │ │ │ │ param_device = "cpu" │
│ ❱ 602 │ │ │ │ │ state_dict = load_state_dict(model_file, variant=variant) │
│ 603 │ │ │ │ │ model._convert_deprecated_attention_blocks(state_dict) │
│ 604 │ │ │ │ │ # move the params from meta device to cpu │
│ 605 │ │ │ │ │ missing_keys = set(model.state_dict().keys()) - set(state_dict.keys( │
│ │
│ /miniconda3/envs/i2p/lib/python3.8/site-packages/diffusers/models/modeling_utils.py │
│ :125 in load_state_dict │
│ │
│ 122 │ │ │ │ │ │ "model. Make sure you have saved the model properly." │
│ 123 │ │ │ │ │ ) from e │
│ 124 │ │ except (UnicodeDecodeError, ValueError): │
│ ❱ 125 │ │ │ raise OSError( │
│ 126 │ │ │ │ f"Unable to load weights from checkpoint file for '{checkpoint_file}' " │
│ 127 │ │ │ │ f"at '{checkpoint_file}'. " │
│ 128 │ │ │ │ "If you tried to load a PyTorch model from a TF 2.0 checkpoint, please s │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
OSError: Unable to load weights from checkpoint file for
'/.cache/huggingface/hub/models--fusing--stable-diffusion-v1-5-controlnet-canny/snapshots/7f2f69197050967007f6bbd23ab5e52f0384162a/diffusion_pytorch_model.safetensors' at
'/.cache/huggingface/hub/models--fusing--stable-diffusion-v1-5-controlnet-canny/snapshots/7f2f69197050967007f6bbd23ab5e52f0384162a/diffusion_pytorch_model.safetensors'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

To debug, I built a new virtual env following install.sh, deleted the cached model files, and re-downloaded them by running main.py again. But the error still happens.
How can I deal with this bug?
My torch version is as follows:
torch 1.9.0+cu111
torchaudio 0.9.0
torchvision 0.10.0+cu111
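
One clue in the chained tracebacks: the first exception is AttributeError: module 'torch' has no attribute 'frombuffer', and torch.frombuffer was only added in PyTorch 1.10, so safetensors cannot load checkpoints under torch 1.9.0. Upgrading torch is the clean fix; alternatively, newer diffusers versions accept use_safetensors=False to prefer the .bin weights instead. A minimal sketch of that workaround (untested against this repo's pinned diffusers version):

    import torch
    from diffusers import ControlNetModel

    # Hypothetical workaround: skip the .safetensors file entirely so
    # safetensors (which needs torch.frombuffer) is never invoked.
    controlnet = ControlNetModel.from_pretrained(
        "fusing/stable-diffusion-v1-5-controlnet-canny",
        torch_dtype=torch.float16,
        use_safetensors=False,
    )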

Suggestion - Integrate MobileSAM into the pipeline for lightweight and faster inference

Reference: https://github.com/ChaoningZhang/MobileSAM

Our project performs on par with the original SAM and keeps exactly the same pipeline as the original SAM except for a change to the image encoder; therefore, it is easy to integrate into any project.

MobileSAM is around 60 times smaller and around 50 times faster than the original SAM, and it is around 7 times smaller and around 5 times faster than the concurrent FastSAM. The comparison of the whole pipeline is summarized as follows:

[comparison images]

Best Wishes,

Qiao

Data Generation

May I ask if it would be convenient for you to make the generated results public? For example, the images and the corresponding descriptive data?

Comment instructions outdated

The comment instructions in the README.md seem to be outdated. The line numbers listed don't seem to be meaningful, if I am understanding this correctly.

Install does not work with Python 3.8.16

Hi,
Thanks for the great work.
Trying to replicate your work here, I created a new env and ran pip install -r requirements.txt, but it gives me an error:

ERROR: Could not find a version that satisfies the requirement torch==1.9.0+cu111 (from versions: 1.7.1, 1.8.0, 1.8.1, 1.9.0, 1.9.1, 1.10.0, 1.10.1, 1.10.2, 1.11.0, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 2.0.0)
ERROR: No matching distribution found for torch==1.9.0+cu111

I also tried installing torch using conda, but it does not seem to work at runtime.

Any suggestions are appreciated.
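
The +cu111 builds are not hosted on PyPI, which is why pip cannot find them; they come from PyTorch's own wheel index. The command PyTorch documented for this version combination (assuming a CUDA 11.1 machine) is:

  • pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html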

Out of Memory Issue in Semantic Segmentation

Why do I constantly encounter out-of-memory errors when running semantic segmentation, even though I have two GPUs with 15 GB each? Is it possible to distribute the model workload across the GPUs in parallel?
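
Not a fix for the memory footprint itself, but the tracebacks elsewhere on this page show that the models are initialized with per-component device arguments (self.args.image_caption_device, self.args.dense_caption_device, self.args.semantic_segment_device, and self.args.contolnet_device, keeping the repo's spelling). Assuming the argparse flags match those attribute names, something like this should split the models across your two GPUs:

  • python main.py --image_src "examples/3.jpg" --out_image_name "output/3_result.jpg" --image_caption_device cuda:0 --dense_caption_device cuda:0 --semantic_segment_device cuda:1 --contolnet_device cuda:1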

Region Semantic Models Do Not Work Well

First of all, thanks for the great work.

The image caption and dense caption modules all work fine here; however, the region caption module does not seem to work well. I tested both the edit_anything and ssa models.

For the edit_anything model, it returns obviously wrong object descriptions. The following is the test image I input:
[test image]
And the Region Segment module returns:

a dog is walking on the floor in a room: [0, 50, 383, 165]; a person riding a skateboard down a street: [234, 49, 149, 166]; a piece of paper with a black background: [0, 0, 64, 110]; a white light switch with a black light: [312, 0, 53, 80]; the moon is seen over the city skyline: [116, 0, 56, 38]; 

There are clearly no dogs or skateboards in the picture.

For the ssa model, when I add the --region_classify_model ssa option and change the region_semantic method to use ssa, the method errors out with:

│ /share/data/ripl/fjd/Image2Paragraph/models/segment_models/semantic_segment_anything_model.py:14 │
│ 7 in semantic_class_w_mask                                                                       │
│                                                                                                  │
│   144 │   │   │                                                                                  │
│   145 │   │   │   valid_mask_large_crop = mmcv.imcrop(valid_mask.numpy(), np.array([bbox[0], b   │
│   146 │   │   │   scale_large)                                                                   │
│ ❱ 147 │   │   │   top_1_patch_large = torch.bincount(class_ids_patch_large[torch.tensor(valid_   │
│   148 │   │   │   top_1_mask_category = mask_categories[top_1_patch_large.item()]                │
│   149 │   │   │                                                                                  │
│   150 │   │   │   ann['class_name'] = str(top_1_mask_category)                                   │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
IndexError: The shape of the mask [3, 23] at index 0 does not match the shape of the indexed tensor [23, 3] at index 0

I wonder if you have a good way to use the region segment methods.
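
The shapes in that IndexError ([3, 23] versus [23, 3]) look like a simple transposition between the boolean mask and the tensor being indexed. A minimal, self-contained sketch of the suspected mismatch and a possible workaround (the variable names are stand-ins, not the repo's code):

    import torch

    class_ids = torch.randint(0, 10, (23, 3))  # stands in for class_ids_patch_large
    valid_mask = torch.rand(3, 23) > 0.5       # stands in for the cropped mask

    # Transpose the mask so its shape matches the tensor it indexes.
    if valid_mask.shape != class_ids.shape:
        valid_mask = valid_mask.T

    selected = class_ids[valid_mask]           # boolean indexing now succeeds
    top_1 = torch.bincount(selected).argmax()  # most frequent class id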

pip install markupsafe not compatible with project source code

I just ran

pip install -r requirements.txt

in a conda environment with Python 3.8.10.

I got the error:

ImportError: cannot import name 'soft_unicode' from 'markupsafe'

MarkupSafe 2.1.0 doesn't have soft_unicode, so a temporary solution might be adding

MarkupSafe<=2.0.1

to requirements.txt
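
For context, this is expected: soft_unicode was removed in MarkupSafe 2.1.0, and the older Jinja2 that the requirements typically pull in still imports it. Pinning before installing the requirements works as described above:

  • pip install "MarkupSafe<=2.0.1"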

Dense Caption always returns empty

Hi,

Thanks for sharing this work. Very interesting and potentially very impactful.

I encounter this issue while running python main.py --image_src "/Code/Image2Paragraph/examples/3.jpg" --out_image_name "output/3_result.jpg"

"Dense Cpation" always returns "/", and the program processes without error. I was able to get the generated text at the end along with the style-transferred image, but the caption is a bit off potentially due to the missing dense caption.


Retrieval Result on COCO

Hi, thanks for your interesting work. Could you explain more clearly why the Image2Paragraph method achieves better retrieval results on COCO?

run question

Running python main.py --image_src "examples/3.jpg" --out_image_name "output/3_result.jpg" does not work.
[error screenshot]

pydantic

in pydantic.fields.ModelField._type_analysis:550
TypeError: issubclass() arg 1 must be a class

Has anyone run into this issue and can advise how to resolve it?
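
If this is the same failure as the "TypeError: issubclass() arg 1 must be a class" issue at the top of this page (the traceback frame pydantic.fields.ModelField._type_analysis:550 matches), the typing-extensions/pydantic pin suggested there may resolve it as well.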

DenseCaptioning contains hardcoded paths to local env

    def __init__(self) -> None:
        self.grit_working_directory = "../GRiT/"
        self.grit_env_python = '/home/aiops/wangjp/anaconda3/envs/grit/bin/python'
        self.grit_script = 'image_dense_captions.py'
        self.model_weights = 'models/grit_b_densecap_objectdet.pth'
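
A minimal sketch of how these could be made configurable instead of hardcoded, keeping the current values as fallbacks (the environment-variable names are hypothetical, not something the repo defines):

    import os

    class DenseCaptioning:
        def __init__(self) -> None:
            # Hypothetical env vars; the defaults preserve the current behavior.
            self.grit_working_directory = os.environ.get("GRIT_DIR", "../GRiT/")
            self.grit_env_python = os.environ.get(
                "GRIT_PYTHON", "/home/aiops/wangjp/anaconda3/envs/grit/bin/python"
            )
            self.grit_script = "image_dense_captions.py"
            self.model_weights = os.environ.get(
                "GRIT_WEIGHTS", "models/grit_b_densecap_objectdet.pth"
            )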

About the visualization

Thanks for this great work and the open-sourced repo. I haven't touched segmentation tasks before, and I am wondering how to visualize the dense segmentation image like the one you show in the repo.
[example image: 3_semantic_segment_anything]

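Not the author, but overlays like that are usually produced by alpha-blending a random color per mask onto the image and labeling each mask at its centroid. A minimal sketch under that assumption (none of these names come from the repo):

    import numpy as np
    import matplotlib.pyplot as plt

    def show_semantic_masks(image, masks, class_names):
        """image: HxWx3 uint8 array; masks: list of HxW bool arrays; class_names: list of str."""
        overlay = image.astype(np.float32).copy()
        rng = np.random.default_rng(0)
        for mask, name in zip(masks, class_names):
            color = rng.uniform(0, 255, size=3)
            overlay[mask] = 0.5 * overlay[mask] + 0.5 * color  # alpha-blend the mask color
            ys, xs = np.nonzero(mask)
            plt.text(xs.mean(), ys.mean(), name, color="white", fontsize=8)
        plt.imshow(overlay.astype(np.uint8))
        plt.axis("off")
        plt.show()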

Possible Langchain integration?

Hey! Your model is just awesome; I'm wondering if you have any plans for LangChain integration? Currently they support blip-image-captioning for image captioning, but your variant honestly looks much more useful! Great work!

The GRIT integration doesn't work

Hi, I think there may be an issue with the GRiT integration: there is no .gitmodules file, so the repo doesn't know about GRiT at all, and running the code as-is returns the error:

ModuleNotFoundError: No module named 'models.grit.image_dense_captions'
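
Until a proper submodule is added, one workaround, assuming the import path models.grit.image_dense_captions means the code expects GRiT checked out under models/grit, is to clone it there yourself:

  • git clone https://github.com/JialianW/GRiT.git models/grit

(GRiT also has its own dependencies, e.g. detectron2, which need to be installed per its README.)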
