The spatially-conditioned-graphs's discuss from fredzzhang

Inference code

@fredzzhang @nikanor97 hi thanks for sharing the code base great work

do we inference code which we can run on the custom data / custom videos and visualize the results
currently when i tested the model using test.py for some scenes present in the validation data like an only person running on a beach without any other object present there is no detections/activity in the output, is there any way i can get results like people walking , fighting, waving without depending on the object present in the scene

Thanks in advance

When input images are batched, zero-padding is applied to fill in the gaps. This results in large input when images have distinct aspect ratios. Following torchvision references, images with similar aspect ratios will be batched together as much as possible.

Average out the denominator when computing focal loss

The focal loss used in the model is normalised by the number of positive logits, which tends to have unstable statistics. Therefore, it should be averaged across all sub-batches (ranks) to better utilise the large batch size.

Training GPU+CPU Utilization Stops

I'm trying the training procedure as laid out by the README file, and ran CUDA_VISIBLE_DEVICES=0 python main.py &>log &.

It seems to run fine up until the near end of the first epoch, where the GPU and CPU utilization completely stops. This drop in utilization never recovers and makes it so that the first epoch never actually finishes.

Here is the output from my log:

Namespace(batch_size=4, cache_dir='./checkpoints', data_root='hicodet', human_thresh=0.2,
learning_rate=0.001, lr_decay=0.1, milestones=[10], model_path='', momentum=0.9, 
num_epochs=15, num_iter=2, num_workers=4, object_thresh=0.2, print_interval=2000,
random_seed=1, train_detection_dir='hicodet/detections/train2015', 
val_detection_dir='hicodet/detections/test2015', weight_decay=0.0001)
Epoch [1/15], Iter. [2000/9409], Loss: 1.3726, Time[Data/Iter.]: [3.80s/1123.74s]
Epoch [1/15], Iter. [4000/9409], Loss: 1.1580, Time[Data/Iter.]: [3.53s/1105.00s]
Epoch [1/15], Iter. [6000/9409], Loss: 1.0998, Time[Data/Iter.]: [3.52s/1102.66s]
Epoch [1/15], Iter. [8000/9409], Loss: 1.0792, Time[Data/Iter.]: [3.72s/1140.21s]

My system specs as well:
OS: Pop!_OS 20.04 LTS x86_64
CPU: AMD Ryzen 7 2700X (16) @ 3.700G
GPU: NVIDIA GeForce RTX 2070 SUPER
Memory: 16017MiB
CUDA: 10.2

Remove single-GPU training script

As training on a single GPU can be achieved by setting the --world-size option to 1, the single-GPU training script is no longer maintained, and thus to be removed.

Update the instructions on using data utilities

Commit 7e73330 embedded HICO-DET repo as a submodule in the current repository. As a result, the dataset repo can be cloned and updated within this repo. Instructions on this part should be updated

How to train to get the model with mAP of 28.54

Sorry to bother you but would you mind giving me more tips about how to achieve the mAP of 28.54? I download the fasterrcnn_resnet50_fpn_hicodet_e13.pt but do not know how to use it. Shall I train the network with the detections produced with this .pt file? With python preprocessing.py --partition test2015 ckpt_path='path to .pt'? (Did u forget an h after .pt?)

Thx.

About label generate

Hello, I am a bit confused about the calculation of the classification loss in your code. If you have time, can you give me an answer?
https://github.com/fredzzhang/spatio-attentive-graphs/blob/main/interaction_head.py#L182
In your postprocess function, you pass in the one-hot form of the prediction result of each pair and the one-hot form of the label. Next, you use a series of magic operations to convert the one-hot form of the label into one I don't quite understand things.

Clean up recent changes and update documentation

Load state dict for optimiser and scheduler when continuing from a checkpoint

In main.py and main_dist.py, the argument --model-path allows the training process to start from a given model checkpoint. However, state dict for both the optimiser and the learning rate scheduler were not loaded.

Update README to include instructions on V-COCO dataset

multi-GPU finetune fatser rcnn

When I use your detector finetune script in a multi-GPU situation, the following problems will occur. Do you have any good solutions to this problem?

RuntimeError: expected device cuda:1 and dtype Float but got device cuda:0 and dtype Float

The bug of main.py

Dear Professor , excuse me. I have a problem that I want to get a solution from you. when I run main.py, it has a problem as follows.

And I have followed your guide step by step, but it still can not run rightly, I am very grateful to you if you give me a solution when you have time. Thank you very much!

Add the maximum number of human and object instances in the argument list

Clean up the visualisation code

KeyError when indexing dict using Tensor

In interaction_head.py, the mapping from object classes to action/verb classes use torch.LongTensor as the index, which works for a list, but not for a dict

428        # Map object class index to target class index
429        # Object class index to target class index is a one-to-many mapping
430        target_cls_idx = [self.object_class_to_target_class[obj]
431            for obj in object_class[y]]

Clean up code used for diagnosis

Learning curve visualisation
Class-wise human-object pair visualisation
Image-wise human-object pair visualisation

Add warnings when the given model path is incorrect

A follow-up on #16

How to use the 'CacheTemplate' class when testing for V-COCO?

Hi,

Recently, I tried to test a model on V-COCO dataset. I found that there is a class 'CacheTemplate' provided in README.md, however, I don't know how to use that correctly.

I tried to add that class to the V-COCO eval file directly, but it doesn't work - the file keep loading the detection file over half an hour and nothing happened. Could you please give more detail for using that class? Thanks a lot！

Here is my evaluation code.

from vsrl_eval import VCOCOeval
from collections import defaultdict

class CacheTemplate(defaultdict):
    """A template for VCOCO cached results """
    def __init__(self, **kwargs):
        super().__init__()
        for k, v in kwargs.items():
            self[k] = v
    def __missing__(self, k):
        seg = k.split('_')
        # Assign zero score to missing actions
        if seg[-1] == 'agent':
            return 0.
        # Assign zero score and a tiny box to missing <action,role> pairs
        else:
            return [0., 0., .1, .1, 0.]


if __name__ == "__main__":
    vsrl_annot_file = "data/vcoco/vcoco_test.json"
    coco_file = "data/instances_vcoco_all_2014.json"
    split_file = "data/splits/vcoco_test.ids"
    det_file = "../SCG/vcoco_cache/vcoco_results.pkl"
    vcocoeval = VCOCOeval(vsrl_annot_file, coco_file, split_file)
    vcocoeval._do_eval(det_file, ovr_thresh=0.5)

Make the dataset submodule track the latest head

According to here, since git 1.8.2, it is possible to force a submodule to be synced with a branch.

About mAP in your paper

The highest precision in this repo is pointing to DRG, but your issue says that your highest precision is 28.54

About fine-tuned detector

Hi, I wonder after fine-tuning the object detector on HICO-DET, whether you retrain your hoi classification model or just replace the object detector at the test time.

Allow setting detection directory via argument

Currently in test.py, cache.py and a few other scripts, the path to the directory of object detections is hard-coded. It should be added to the argument list for flexibility

how to get the finetuned detections on VCL DRG and Yours

hello, I can't download the finetuned detections, could you give me a link ? thanks, looking forward to your reply

"Too many open files" when running test.py

The problem originates in the multiprocessing when computing average precisions. Refer to fredzzhang/pocket#6 for details.

Specifying the number of processes has fixed the problem partially. Now during training, the computation of classification mAP did not result in the same error, but during test the computation of detection mAP did.

Add instructions for diagnostic tools

Inconsistent evaluation result

The evaluation results returned from test.py are inconsistent with the Matlab evaluation code.

TypeError: pic should be PIL Image or ndarray. Got <class 'torch.Tensor'>

Full error is as follows

Traceback (most recent call last):
  File "test.py", line 91, in <module>
    main(args)
  File "test.py", line 68, in main
    test_ap = test(net, dataloader)
  File "/home/fred/spatio-attentive-graphs/utils.py", line 128, in test
    for batch in tqdm(test_loader):
  File "/home/fred/miniconda3/envs/pocket/lib/python3.7/site-packages/tqdm/std.py", line 1167, in __iter__
    for obj in iterable:
  File "/home/fred/miniconda3/envs/pocket/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 582, in __next__
    return self._process_next_batch(batch)
  File "/home/fred/miniconda3/envs/pocket/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
TypeError: Traceback (most recent call last):
  File "/home/fred/miniconda3/envs/pocket/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/fred/miniconda3/envs/pocket/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/fred/spatio-attentive-graphs/utils.py", line 113, in __getitem__
    image = pocket.ops.to_tensor(image, 'pil')
  File "/home/fred/pkgs/pocket/pocket/ops/transforms.py", line 33, in to_tensor
    return torchvision.transforms.functional.to_tensor(x).to(
  File "/home/fred/miniconda3/envs/pocket/lib/python3.7/site-packages/torchvision/transforms/functional.py", line 50, in to_tensor
    raise TypeError('pic should be PIL Image or ndarray. Got {}'.format(type(pic)))
TypeError: pic should be PIL Image or ndarray. Got <class 'torch.Tensor'>

Update instructions with new training schedule and performance

Add instructions on usage

Installation instructions
Data utilities
Testing
Training

About mAP in your paper

Hi, I found that in your paper, there is a mAP 27.18 model on HICO-DET. I wonder how this model is trained as I did not find corresponding one in this repo.

excuse me, can you provide the best model of vcoco? Thanks!

Update the code for compatibility with PyTorch 1.7.1 and torchvision 0.8.1

Add random horizontal flip as a form of data augmentation

Switch optimiser to AdamW

Implement relevant code to train and test on V-COCO

NameError: name 'HICODet' is not defined

A recent commit 066bee8 removed the import of HICODet class in test.py as a follow-up on the update of dataset interface, which has resulted in an error

How can I inference a single image and visualize the result?

Many thank : )

How can i use the pretrained HICO model for OKVQA action detections

Hi,
I want to retrieve the action detections on OKVQA dataset. I only want to detect actions on OKVQA dataset using pretrained HICO model. Can you please guide me how do I do that?

Also, do I need object detections on OKVQA beforehand? in order to use HICO pretrained model on it?

I read #63 too. but I didn't understand how to implement it to my problem

Exception happened during the evaluation of VCOCO

Hi, I trained your method on VCOCO dataset and generated the vcoco_results.pkl. However, when I ran the utilities provided by Gupta, I got the exception below:

Could you please kindly help me handling this problem? Thanks a lot. Besides, I also attached the code for evaluation of mine:

from vsrl_eval import VCOCOeval
from collections import defaultdict

class CacheTemplate(defaultdict):
    """A template for VCOCO cached results """
    def __init__(self, **kwargs):
        super().__init__()
        for k, v in kwargs.items():
            self[k] = v
    def __missing__(self, k):
        seg = k.split('_')
        # Assign zero score to missing actions
        if seg[-1] == 'agent':
            return 0.
        # Assign zero score and a tiny box to missing <action,role> pairs
        else:
            return [0., 0., .1, .1, 0.]



if __name__ == "__main__":
    vsrl_annot_file = "data/vcoco/vcoco_val.json"
    coco_file = "data/instances_vcoco_all_2014.json"
    split_file = "data/splits/vcoco_val.ids"
    det_file = "spatio-attentive-graphs/vcoco_cache/vcoco_results.pkl"
    vcocoeval = VCOCOeval(vsrl_annot_file, coco_file, split_file)
    vcocoeval._do_eval(det_file, ovr_thresh=0.5)

Number of images hardcoded in the caching script

In cache.py, the number of images for the collected results has been hardcoded for the test set. This should be made adaptive based on the number of images for a selected partition

json file

hello, this is my first work on hoi, and can you explain the meaning of instances_train2015.json, instances_val2015.json, coco80tohico80.json, coco91tohico80.json, what they are represented?

Add scripts to download from google drive

Running HICO Evaluation

Hi @fredzzhang, thank you for your great work!
I am trying to reproduce the reported mAPs of the pretrained COCO version, but I'm confused on how to get the HICO evaluation running.

I ran the network through test2015 successfully and generated both JSON and .mat files. I tried running the mat files with eval_run.m from HO-RCNN but got very lower mAPs:

Is there something I am missing? I would appreciate any help.

How to use code to infer in my own data set? My own data set is not labeled, just want to see the actual application effect of HOI algorithm

thank you for your work!

How to use code to infer in my own data set? My own data set is not labeled, just want to see the actual application effect of HOI algorithm.

thank you very much.

fredzzhang / spatially-conditioned-graphs Goto Github PK

spatially-conditioned-graphs's Issues

Recommend Projects

Recommend Topics

Recommend Org