
anchor_computation_tool's Introduction

Anchor computation tool

This repo primarily aims to help those who need to compute anchors for a custom dataset in object detection. Two types of tools have been implemented at this stage. They are:

  1. An anchor visualization tool (anchor_inspector) to help you check whether your anchors are suitable for your current dataset. If they are, then you do not need to modify your anchors.
  2. If they are not, then give my implementation (k_mean_anchor_size) a try. It computes anchors for two-stage detectors and returns anchor_scale + anchor_ratios. Those parameters are critical for object detection (for example in mmdetection; see the config sketch below). For some single-stage detectors, you only need the k-means results; simply take them from my implementation.

The result has been tested in the mmdetection framework with the Faster R-CNN FPN algorithm, and obtained an AP improvement of 2.2 points on a typical aerial image detection dataset, which is a decent improvement from my perspective.
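For reference, here is a rough sketch of where such values plug into an mmdetection-style Faster R-CNN FPN config. The keys follow mmdetection 2.x conventions as I recall them, and the scales/ratios shown are placeholders rather than outputs of this tool, so treat it as an illustration only.

# Hypothetical excerpt of an mmdetection 2.x Faster R-CNN FPN config (not part of this repo).
# Replace the placeholder scales/ratios with the values returned by k_mean_anchor_size.
model = dict(
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],                # <- anchor_scale goes here
            ratios=[0.5, 1.0, 2.0],    # <- anchor_ratios go here
            strides=[4, 8, 16, 32, 64])))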

Update log

[05-12-2020] As observed by jinfagang, passing boolean variables ("anchors" and "annotations") from the terminal may not work. The author originally intended to load those parameters from the yaml file only. An update will be provided to allow passing boolean variables from the terminal.

Usage

For anchor_inspector, you need to provide a configuration file (.yml) together with the path to the dataset. If you are not able to access a GUI (for example, because you host the code on a server), that is fine; just enable the --no-gui option. Otherwise, you will be able to visualize your current input and annotation boxes on the image. A green bbox indicates a match while a red one does not, so be alarmed if you see lots of red bboxes.
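For intuition, the green/red coloring boils down to an IoU test between each ground-truth box and the generated anchors. The snippet below is a minimal sketch of that idea, not the exact code of anchor_inspector; the 0.5 threshold and the toy boxes are illustrative assumptions.

import numpy as np

def iou_matrix(gt_boxes, anchors):
    # gt_boxes: (N, 4) and anchors: (M, 4), both as (x1, y1, x2, y2)
    x1 = np.maximum(gt_boxes[:, None, 0], anchors[None, :, 0])
    y1 = np.maximum(gt_boxes[:, None, 1], anchors[None, :, 1])
    x2 = np.minimum(gt_boxes[:, None, 2], anchors[None, :, 2])
    y2 = np.minimum(gt_boxes[:, None, 3], anchors[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_gt = (gt_boxes[:, 2] - gt_boxes[:, 0]) * (gt_boxes[:, 3] - gt_boxes[:, 1])
    area_an = (anchors[:, 2] - anchors[:, 0]) * (anchors[:, 3] - anchors[:, 1])
    return inter / (area_gt[:, None] + area_an[None, :] - inter)

# Toy example: one ground-truth box and two anchors.
gt_boxes = np.array([[10., 10., 50., 30.]])
anchors = np.array([[8., 8., 52., 32.], [100., 100., 140., 120.]])
matched = iou_matrix(gt_boxes, anchors).max(axis=1) >= 0.5  # assumed match threshold
print(matched)  # [ True] -> this ground-truth box would be drawn green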

Some examples:

(example screenshots of anchor/annotation visualization)

Usage: anchor_inspector.py [-h] [-project_name PROJECT_NAME] [-dataset_path DATASET_PATH] [-n NUM_WORKERS] [--no-resize] [--anchors] [--annotations] [--random-transform] [--image-min-side IMAGE_MIN_SIDE] [--image-max-side IMAGE_MAX_SIDE] [--config CONFIG] [--no-gui] [--output-dir OUTPUT_DIR] [--flatten-output]

To use anchor_inspector, it is highly recommended to set up your data in the following structure:

Root

----------Datasets

--------------Train data

--------------Valid data

--------------Test data

----------Projects

--------------project.yaml

Simple code to run:

python anchor_inspector.py -dataset_path ./datasets/project_folder -project_name ./projects/project_folder --output-dir debug_out

The "project_folder" in "dataset_path" refers to the data folder where you save images and annotations, while the one in "project_name" refers to the folder containing the yaml configuration file.
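If you do not have a project yaml yet, the snippet below shows roughly how one could look and how it can be loaded. The field names follow the zylo117 EfficientDet-style convention and are illustrative assumptions, so check the repo's own example and parsing code for the authoritative keys.

import yaml  # pip install pyyaml

# Hypothetical project yaml contents; field names are assumed, not taken from this repo.
example_project = """
project_name: my_project
train_set: train
val_set: val
obj_list: ['car', 'truck']
anchors_scales: '[2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)]'
anchors_ratios: '[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]'
"""
params = yaml.safe_load(example_project)
print(params["anchors_ratios"])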

How to get custom anchors

If you need to modify your anchors, then try the k_mean_anchor_size notebook! Run all cells to get your anchor_scale and anchor_ratios; they will be printed to the console at the end.
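For those curious about what happens under the hood, here is a minimal sketch of IoU-based k-means over (width, height) pairs, in the spirit of the K-Means-Anchors code this repo builds on. It is a simplified illustration with assumed names, not the notebook itself (for instance, empty clusters are not handled).

import numpy as np

def iou_wh(boxes, clusters):
    # IoU between (w, h) pairs, assuming boxes and clusters share a top-left corner.
    inter = np.minimum(boxes[:, None, 0], clusters[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], clusters[None, :, 1])
    area_b = boxes[:, 0] * boxes[:, 1]
    area_c = clusters[:, 0] * clusters[:, 1]
    return inter / (area_b[:, None] + area_c[None, :] - inter)

def kmeans_anchors(boxes, k, seed=0):
    # boxes: (N, 2) array of ground-truth (w, h); the distance metric is 1 - IoU.
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    clusters = boxes[rng.choice(len(boxes), k, replace=False)]
    assignment = None
    while True:
        nearest = iou_wh(boxes, clusters).argmax(axis=1)
        if assignment is not None and np.array_equal(nearest, assignment):
            break
        for i in range(k):
            clusters[i] = np.median(boxes[nearest == i], axis=0)
        assignment = nearest
    return clusters

# Toy usage: normalize each cluster so w + h = 2 to read off candidate (w, h) ratios.
wh = np.array([[100, 50], [90, 60], [30, 30], [20, 40], [25, 45]])
clusters = kmeans_anchors(wh, k=2)
print(2 * clusters / clusters.sum(axis=1, keepdims=True))

For single-stage detectors you can read candidate ratios off the normalized cluster sizes directly; for two-stage detectors the notebook additionally derives anchor_scale from the cluster sizes.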

Good luck!

Acknowledgement

I refer to and modify code from these two repos:

https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch

https://github.com/zhouyuangan/K-Means-Anchors

Thanks for their high-quality code, and please take the time to check out their repos if you are interested.

Development plan

I understand that it is necessary to provide more tutorials/examples on how anchors work, and perhaps more visualizations. To improve this repo:

  1. I am currently conducting experiments on some open-source detection frameworks with an aerial dataset, as it contains more small objects. I will share some testing results once they are available.

  2. I am also writing up what I know, as best as I can, about basic concepts and some tips you may need to develop your custom anchors. Stay tuned.

  3. I am not a computer vision PhD and will inevitably run into errors/bugs. If you find some, please share your findings with everyone in this repo. You are welcome to open a PR.

  4. If you fail to observe any improvement after you plug in and play with this repo, please open an issue and describe how things went. I will do my best to help.

TODO list

The repo author is also working on some side projects for the next two weeks, so there may be delays on these to-dos. Apologies for any inconvenience this may cause. You are welcome to open a PR.

  1. Build a tutorial (shape dataset) and provide a working example.
  2. Improve parts of the code to make it easier to use.
  3. Revise the coordinate normalization in the k-means code.
  4. Support passing boolean variables from the terminal.

anchor_computation_tool's People

Contributors

Cli98


anchor_computation_tool's Issues

Problems of CLUSTERS.

Hello!

Is CLUSTERS=3 used in Yet-Another-EfficientDet-Pytorch?

Now I want to use your code to get the anchors for my own dataset, but I found that you added category_id='l' when calculating the anchors. It seems that the purpose is to compute only the anchors for large-scale targets.

So, what should I do to get the right anchors? Should I set 'category_id' to a string other than 's', 'm', or 'l'?

Input ratios for anchor inspector may have format (h,w)?

I came across this repo while browsing issues in zylo117's EffDet. Great tool btw. Thanks for hosting.

Zylo's effdet implementation uses the (w, h) format for his anchor box implementation. When I pass a set of ratios for boxes which are wider and shorter (that's how the gt boxes are too), according to zylo's format, they'd be something like:
anchors_ratios: '[(0.9, 1.1), (1.2, 0.8), (2.2, 0.5)]'

When I pass it to the anchor inspector along with the training data directory, I see a lot of red boxes and some green.
However, if I pass the inverted ratios, lots of green start popping up:
e.g. [(1.1, 0.9), (0.8, 1.2), (0.5, 2.2)]

So, is it possible that the format required for the anchor inspector is (h,w)?

anchor calculation

I tried to re-calculate anchor_scale and anchor_ratio on my own dataset using the method in your repo. The resulting accuracy is 58.11% for CLUSTER=3; what do you think about this result? Is it satisfactory?

Also, what does the parameter CLUSTER mean? I notice it controls the length of the output lists anchor_scale and anchor_ratio. I use EfficientDet from "Yet-Another-EfficientDet-Pytorch" and the length of anchor_ratio is 3 in that repo, so should I set CLUSTER=3 in this case?
Thanks!

anchor_base_scale and anchor_stride setting

Hello, thanks for your work on this repo.
I wonder how to set params like anchor_base_scale and anchor_stride for EfficientDet?
Could you tell me how to obtain these two?

Dataset

Hi there, could you please provide a link to the dataset used in the notebook example so I can inspect the xml files? I have some errors I would like to debug. Thank you.

Wouldn't it be better to scale the boxes?

Hi @Cli98, I think that in the current implementation the bbox size has too much importance.

Imagine this situation:
10 bboxes 10x5
5 bboxes 2x1
5 bboxes 1x2

If K=2 the two clusters would be:
10 bboxes 10x5

and
5 bboxes 2x1
5 bboxes 1x2

and so the suggested ratios would be (1.4, 0.7) and (1., 1.) instead of (1.4, 0.7) and (0.7, 1.4) (or (1., 0.5), (1., 2.)).

I tried this experiment with COCO annotations; I'm trying to calculate good anchors for EfficientDet0 and my dataset:

import json
import numpy as np
# kmeans: the IoU-based k-means used in this repo / K-Means-Anchors

# load COCO train 2017 annotations
with open("instances_train2017.json") as f:
    annotations = json.load(f)
image_size = 512  # efficientdet0 input_size
min_size = 32 * 32
images_scale = {ann["id"]: image_size / max(ann["width"], ann["height"]) for ann in annotations["images"]}
scaled_bboxes = np.array([np.array(ann["bbox"][-2:]) * images_scale[ann["image_id"]]
                          for ann in annotations["annotations"] if np.prod(ann["bbox"][-2:]) > min_size])
out = kmeans(scaled_bboxes, k=3)
print(2 * out / out.sum(axis=1, keepdims=True))
# results:
# array([[0.91217741, 1.08782259],
#        [0.914852  , 1.085148  ],
#        [1.00598999, 0.99401001]])

As you can see, the ratios are very different from the suggested ones and very similar to each other.

Instead, if I normalize the bboxes so that (w, h) sums to 2 (like in the EfficientDet libraries):

min_size = 32 * 32
bboxes = np.array([annotation["bbox"][-2:] for annotation in annotations["annotations"] if np.prod(annotation["bbox"][-2:]) > min_size])
nbboxes = 2 * bboxes / bboxes.sum(axis=1, keepdims=True)
out = kmeans(nbboxes, k=3)
print(2 * out / out.sum(axis=1, keepdims=True))
# results:
# array([[0.60458624, 1.39541376],
#        [0.96894304, 1.03105696],
#        [1.33333333, 0.66666667]])

Here, as you can see, we obtain more or less the ratios suggested, for example, in the EfficientDet implementation.

what the ****!

my dataset is (according to Yet-Another-EfficientDet-Pytorch):

dataset
    object
        annotations
        train
        val

And I think all the settings are right.

when running the program:
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Loading image with numpy array
Image id: 0
Loading image with numpy array
Image id: 1
Loading image with numpy array
Image id: 2
Loading image with numpy array
Image id: 3

No task results. What's the problem?

The size of the objects to detect relative to the images

Hi, bro. This haunts me a lot. The question is as follows:
I am using images from my UAV. The image size is 3840*2160, and the objects I want to detect are about 94*325 or 236*938, so are the objects small, medium or large? (According to COCO, they are definitely large because they are bigger than 96*96, but how am I going to process the 3840*2160 images when putting them into the YOLO series or SSD?)
I would appreciate it if I could get your reply soon. Thanks.
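For reference, COCO's size buckets are defined by box area in pixels: small is below 32*32, medium is between 32*32 and 96*96, and large is above 96*96. The sketch below applies those standard thresholds to the boxes mentioned above; it does not cover how a particular detector resizes its input.

def coco_size_bucket(w, h):
    # Standard COCO object-size buckets, based on box area in pixels.
    area = w * h
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"

print(coco_size_bucket(94, 325))   # large
print(coco_size_bucket(236, 938))  # large

Note that detectors usually resize the input, so when a 3840*2160 image is scaled down to the network input size, box areas shrink by the square of the scale factor, and a box that is "large" by COCO's definition can end up behaving like a medium or small object.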

need your help

I was wondering if you could give me a template yaml file and a template xml annotation file.
The folder structure below is within anchor_computation_tool.

+---projects
|       mask.yml
+---datasets
|   +---mask
|   |       instances_train.json
|   |       instances_val.json
|   +---train
|       +---images
|               train1.png
|               train10.png
+---val
    +---images
            val5.png
            val6.png

The instances_train.json files are COCO annotations that contain the bounding boxes and annotations.
The mask.yaml has the project configured according to ZYLO117's recommendation.

I need help on how to format the xml annotation file from the json file.
Thanks,
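For anyone with the same need, here is a minimal, hypothetical sketch of turning COCO-style json boxes into Pascal VOC-style xml files. The field layout follows the common VOC convention and is not taken from this repo, so adapt it to whatever fields your pipeline actually reads.

import json
import os
import xml.etree.ElementTree as ET
from collections import defaultdict

def coco_to_voc(coco_json_path, out_dir):
    # Write one VOC-style xml file per image listed in the COCO json.
    os.makedirs(out_dir, exist_ok=True)
    with open(coco_json_path) as f:
        coco = json.load(f)
    categories = {c["id"]: c["name"] for c in coco["categories"]}
    anns_per_image = defaultdict(list)
    for ann in coco["annotations"]:
        anns_per_image[ann["image_id"]].append(ann)
    for img in coco["images"]:
        root = ET.Element("annotation")
        ET.SubElement(root, "filename").text = img["file_name"]
        size = ET.SubElement(root, "size")
        ET.SubElement(size, "width").text = str(img["width"])
        ET.SubElement(size, "height").text = str(img["height"])
        for ann in anns_per_image[img["id"]]:
            x, y, w, h = ann["bbox"]  # COCO boxes are (x, y, width, height)
            obj = ET.SubElement(root, "object")
            ET.SubElement(obj, "name").text = categories[ann["category_id"]]
            bndbox = ET.SubElement(obj, "bndbox")
            ET.SubElement(bndbox, "xmin").text = str(int(x))
            ET.SubElement(bndbox, "ymin").text = str(int(y))
            ET.SubElement(bndbox, "xmax").text = str(int(x + w))
            ET.SubElement(bndbox, "ymax").text = str(int(y + h))
        out_name = img["file_name"].rsplit(".", 1)[0] + ".xml"
        ET.ElementTree(root).write(os.path.join(out_dir, out_name))

# Example usage with the structure above (paths are illustrative).
coco_to_voc("datasets/mask/instances_train.json", "datasets/mask/train/annotations_xml")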
