
spatially-conditioned-graphs's Issues

Inference code

@fredzzhang @nikanor97 Hi, thanks for sharing the codebase. Great work!

  1. Is there inference code that we can run on custom data / custom videos and use to visualize the results?
  2. When I tested the model with test.py on some scenes from the validation data, such as a single person running on a beach with no other object present, there were no detections/activities in the output. Is there any way to get results like people walking, fighting, or waving without depending on an object being present in the scene?

Thanks in advance

Add group batch sampler

When input images are batched, zero-padding is applied to fill in the gaps. This results in unnecessarily large inputs when the images have very different aspect ratios. Following the torchvision references, images with similar aspect ratios will be batched together as much as possible; a sketch of the idea follows.
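
Below is a minimal sketch of such a sampler, assuming the batch_sampler interface of torch.utils.data.DataLoader; the class name, bin edges and flushing behaviour are illustrative, not the repository's actual implementation.

import bisect
from collections import Counter, defaultdict

class AspectRatioGroupedBatchSampler:
    """Yield batches whose images fall into the same aspect-ratio bin."""
    def __init__(self, sampler, aspect_ratios, batch_size, bins=(1.0,)):
        self.sampler = sampler          # e.g. a RandomSampler over the dataset
        self.batch_size = batch_size
        bins = sorted(bins)
        # Assign each image to a bin based on its aspect ratio (width / height)
        self.group_ids = [bisect.bisect_right(bins, r) for r in aspect_ratios]

    def __iter__(self):
        buffers = defaultdict(list)
        for idx in self.sampler:
            g = self.group_ids[idx]
            buffers[g].append(idx)
            if len(buffers[g]) == self.batch_size:
                yield buffers[g]
                buffers[g] = []
        # Flush incomplete batches so that no sample is dropped
        for buf in buffers.values():
            if buf:
                yield buf

    def __len__(self):
        counts = Counter(self.group_ids[i] for i in self.sampler)
        return sum(-(-c // self.batch_size) for c in counts.values())

The sampler can then be passed to a DataLoader via the batch_sampler argument, so zero-padding only ever bridges images of similar shapes.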

Average out the denominator when computing focal loss

The focal loss used in the model is normalised by the number of positive logits, which tends to have unstable statistics. The denominator should therefore be averaged across all sub-batches (ranks) to better utilise the large effective batch size; see the sketch below.
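
A minimal sketch of the averaging, assuming torch.distributed has been initialised; the function name and signature are illustrative rather than the repository's API.

import torch
import torch.distributed as dist

def normalise_focal_loss(loss_sum: torch.Tensor, n_positive: int) -> torch.Tensor:
    # Denominator: number of positive logits in the local sub-batch
    n_p = torch.as_tensor(float(n_positive), device=loss_sum.device)
    if dist.is_available() and dist.is_initialized():
        # Sum the denominator over all ranks, then average it
        dist.all_reduce(n_p)
        n_p = n_p / dist.get_world_size()
    # Guard against division by zero when a rank sees no positives
    return loss_sum / n_p.clamp(min=1.0)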

Training GPU+CPU Utilization Stops

I'm trying the training procedure as laid out in the README, and ran CUDA_VISIBLE_DEVICES=0 python main.py &>log &.

It seems to run fine until near the end of the first epoch, where GPU and CPU utilization completely stops. The utilization never recovers, so the first epoch never actually finishes.

Here is the output from my log:

Namespace(batch_size=4, cache_dir='./checkpoints', data_root='hicodet', human_thresh=0.2,
learning_rate=0.001, lr_decay=0.1, milestones=[10], model_path='', momentum=0.9, 
num_epochs=15, num_iter=2, num_workers=4, object_thresh=0.2, print_interval=2000,
random_seed=1, train_detection_dir='hicodet/detections/train2015', 
val_detection_dir='hicodet/detections/test2015', weight_decay=0.0001)
Epoch [1/15], Iter. [2000/9409], Loss: 1.3726, Time[Data/Iter.]: [3.80s/1123.74s]
Epoch [1/15], Iter. [4000/9409], Loss: 1.1580, Time[Data/Iter.]: [3.53s/1105.00s]
Epoch [1/15], Iter. [6000/9409], Loss: 1.0998, Time[Data/Iter.]: [3.52s/1102.66s]
Epoch [1/15], Iter. [8000/9409], Loss: 1.0792, Time[Data/Iter.]: [3.72s/1140.21s]

My system specs as well:
OS: Pop!_OS 20.04 LTS x86_64
CPU: AMD Ryzen 7 2700X (16) @ 3.700GHz
GPU: NVIDIA GeForce RTX 2070 SUPER
Memory: 16017MiB
CUDA: 10.2

Remove single-GPU training script

As training on a single GPU can be achieved by setting the --world-size option to 1, the single-GPU training script is no longer maintained and will thus be removed. See the example below.
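
For instance, assuming main.py is the distributed training entry point and omitting the other arguments:

CUDA_VISIBLE_DEVICES=0 python main.py --world-size 1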

How to train to get the model with mAP of 28.54

Sorry to bother you, but would you mind giving me more tips on how to achieve the mAP of 28.54? I downloaded fasterrcnn_resnet50_fpn_hicodet_e13.pt but do not know how to use it. Should I train the network with the detections produced by this .pt file? With python preprocessing.py --partition test2015 ckpt_path='path to .pt'? (Did you forget an h after .pt?)

Thanks.

About label generation

Hello, I am a bit confused about the calculation of the classification loss in your code. If you have time, could you give me an answer?
https://github.com/fredzzhang/spatio-attentive-graphs/blob/main/interaction_head.py#L182
In your postprocess function, you pass in the one-hot form of the prediction result of each pair and the one-hot form of the label. Then you use a series of operations to convert the one-hot form of the label into something I don't quite understand.

Multi-GPU fine-tuning of Faster R-CNN

When I use your detector fine-tuning script in a multi-GPU setting, the following error occurs. Do you have any good solutions to this problem?

RuntimeError: expected device cuda:1 and dtype Float but got device cuda:0 and dtype Float                                                                 

A bug in main.py

Hello, I have a problem that I hope you can help me solve. When I run main.py, it fails as follows:
[screenshot of the error message]

I have followed your guide step by step, but it still does not run correctly. I would be very grateful if you could give me a solution when you have time. Thank you very much!

KeyError when indexing dict using Tensor

In interaction_head.py, the mapping from object classes to action/verb classes uses a torch.LongTensor as the index, which works for a list but not for a dict (lines 428-431):

# Map object class index to target class index
# Object class index to target class index is a one-to-many mapping
target_cls_idx = [self.object_class_to_target_class[obj]
    for obj in object_class[y]]
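
A possible fix, sketched under the assumption that object_class[y] is a 1-D LongTensor: convert each element to a Python int before using it as a dict key.

# Convert each tensor element to an int so it can key the dict
target_cls_idx = [self.object_class_to_target_class[obj.item()]
    for obj in object_class[y]]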

How to use the 'CacheTemplate' class when testing for V-COCO?

Hi,

Recently, I tried to test a model on the V-COCO dataset. I found that there is a class 'CacheTemplate' provided in README.md; however, I don't know how to use it correctly.

I tried adding the class to the V-COCO eval file directly, but it doesn't work: the script kept loading the detection file for over half an hour and nothing happened. Could you please give more detail on how to use this class? Thanks a lot!

Here is my evaluation code.

from vsrl_eval import VCOCOeval
from collections import defaultdict

class CacheTemplate(defaultdict):
    """A template for VCOCO cached results """
    def __init__(self, **kwargs):
        super().__init__()
        for k, v in kwargs.items():
            self[k] = v
    def __missing__(self, k):
        seg = k.split('_')
        # Assign zero score to missing actions
        if seg[-1] == 'agent':
            return 0.
        # Assign zero score and a tiny box to missing <action,role> pairs
        else:
            return [0., 0., .1, .1, 0.]


if __name__ == "__main__":
    vsrl_annot_file = "data/vcoco/vcoco_test.json"
    coco_file = "data/instances_vcoco_all_2014.json"
    split_file = "data/splits/vcoco_test.ids"
    det_file = "../SCG/vcoco_cache/vcoco_results.pkl"
    vcocoeval = VCOCOeval(vsrl_annot_file, coco_file, split_file)
    vcocoeval._do_eval(det_file, ovr_thresh=0.5)

About mAP in your paper

The highest mAP linked in this repo points to DRG, but your issue says that your own highest mAP is 28.54.
[screenshots comparing the reported numbers]

About fine-tuned detector

Hi, I wonder whether, after fine-tuning the object detector on HICO-DET, you retrain your HOI classification model or just replace the object detector at test time.

"Too many open files" when running test.py

The problem originates in the multiprocessing used when computing average precisions. Refer to fredzzhang/pocket#6 for details.

Specifying the number of processes fixed the problem partially: during training, the computation of classification mAP no longer triggers the error, but during testing the computation of detection mAP still does. Two common mitigations are sketched below.
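
Both of the following are common workarounds for this error in PyTorch, offered as assumptions rather than the repository's confirmed fix:

import torch.multiprocessing as mp

# 1. Cap the number of worker processes used for the AP computation,
#    so the number of simultaneously open file descriptors stays bounded.
pool = mp.Pool(processes=4)

# 2. Share tensors through the file system instead of file descriptors.
mp.set_sharing_strategy('file_system')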

TypeError: pic should be PIL Image or ndarray. Got <class 'torch.Tensor'>

The full error is as follows:

Traceback (most recent call last):
  File "test.py", line 91, in <module>
    main(args)
  File "test.py", line 68, in main
    test_ap = test(net, dataloader)
  File "/home/fred/spatio-attentive-graphs/utils.py", line 128, in test
    for batch in tqdm(test_loader):
  File "/home/fred/miniconda3/envs/pocket/lib/python3.7/site-packages/tqdm/std.py", line 1167, in __iter__
    for obj in iterable:
  File "/home/fred/miniconda3/envs/pocket/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 582, in __next__
    return self._process_next_batch(batch)
  File "/home/fred/miniconda3/envs/pocket/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
TypeError: Traceback (most recent call last):
  File "/home/fred/miniconda3/envs/pocket/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/fred/miniconda3/envs/pocket/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/fred/spatio-attentive-graphs/utils.py", line 113, in __getitem__
    image = pocket.ops.to_tensor(image, 'pil')
  File "/home/fred/pkgs/pocket/pocket/ops/transforms.py", line 33, in to_tensor
    return torchvision.transforms.functional.to_tensor(x).to(
  File "/home/fred/miniconda3/envs/pocket/lib/python3.7/site-packages/torchvision/transforms/functional.py", line 50, in to_tensor
    raise TypeError('pic should be PIL Image or ndarray. Got {}'.format(type(pic)))
TypeError: pic should be PIL Image or ndarray. Got <class 'torch.Tensor'>
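
The traceback suggests the dataset already returns a torch.Tensor, which torchvision's to_tensor rejects. A defensive sketch of a workaround, assuming the transform can simply pass tensors through (this is not the repository's confirmed fix):

import torch
import torchvision.transforms.functional as F

def to_tensor_safe(pic):
    # Pass tensors through unchanged; convert PIL images/ndarrays as usual
    if isinstance(pic, torch.Tensor):
        return pic
    return F.to_tensor(pic)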

About mAP in your paper

Hi, I found that in your paper there is a model with an mAP of 27.18 on HICO-DET. I wonder how this model was trained, as I did not find a corresponding one in this repo.

How can I use the pretrained HICO model for OKVQA action detections

Hi,
I want to retrieve action detections on the OKVQA dataset using only the pretrained HICO model. Can you please guide me on how to do that?

Also, do I need object detections on OKVQA beforehand in order to use the HICO pretrained model on it?

I read #63 too, but I didn't understand how to apply it to my problem.

Exception happened during the evaluation of VCOCO

Hi, I trained your method on the V-COCO dataset and generated vcoco_results.pkl. However, when I ran the evaluation utilities provided by Gupta, I got the exception below:
[screenshot of the exception]

Could you please kindly help me handle this problem? Thanks a lot. I have also attached my evaluation code:

from vsrl_eval import VCOCOeval
from collections import defaultdict

class CacheTemplate(defaultdict):
    """A template for VCOCO cached results """
    def __init__(self, **kwargs):
        super().__init__()
        for k, v in kwargs.items():
            self[k] = v
    def __missing__(self, k):
        seg = k.split('_')
        # Assign zero score to missing actions
        if seg[-1] == 'agent':
            return 0.
        # Assign zero score and a tiny box to missing <action,role> pairs
        else:
            return [0., 0., .1, .1, 0.]



if __name__ == "__main__":
    vsrl_annot_file = "data/vcoco/vcoco_val.json"
    coco_file = "data/instances_vcoco_all_2014.json"
    split_file = "data/splits/vcoco_val.ids"
    det_file = "spatio-attentive-graphs/vcoco_cache/vcoco_results.pkl"
    vcocoeval = VCOCOeval(vsrl_annot_file, coco_file, split_file)
    vcocoeval._do_eval(det_file, ovr_thresh=0.5)

JSON files

Hello, this is my first work on HOI. Can you explain the meaning of instances_train2015.json, instances_val2015.json, coco80tohico80.json and coco91tohico80.json? What do they represent?

Running HICO Evaluation

Hi @fredzzhang, thank you for your great work!
I am trying to reproduce the reported mAPs of the pretrained COCO version, but I'm confused about how to get the HICO evaluation running.

I ran the network through test2015 successfully and generated both the JSON and .mat files. I tried running the .mat files with eval_run.m from HO-RCNN but got much lower mAPs:

[screenshots of the evaluation output]

Is there something I am missing? I would appreciate any help.

Fix relative paths for the demo code

The demo script demo.py under diagnosis had incorrect paths for relative imports after the relocation. To import the relevant modules, the parent directory should be added to the search path, as sketched below.
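
A minimal sketch of the usual fix, assuming the modules to be imported live one directory above the script:

import os
import sys

# Add the parent directory to the module search path before the relative imports
sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)), '..'))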
