tum-phoenix / drive_ml

This is the repository for our Machine Learning applications

Jupyter Notebook 93.89% Python 5.74% MATLAB 0.37%

drive jupyter-notebook machine-learning

drive_ml's People

Forkers

rajesh-maheswaran1996 shumailaahmed olimcy xabirizar9

drive_ml's Issues

Design a network for localisation

The output should have shape (n,4), with each value in the range 0 to 64 (the maximum value of our pixels).

Try the classification NN with a localisation layer, using a linear output of shape (n,4) instead of (n,44)
Literature research on the topic of localisation (optional); please brief the rest of the team.
Train with the picture-in-picture dataset
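
To illustrate what an (n,4) target in the range 0 to 64 could look like, here is a minimal sketch. It assumes boxes are stored as (x1, y1, x2, y2) in original-image pixels and that images are resized to 64x64; the function name `scale_box` and the box layout are assumptions, not part of the existing code.

```python
def scale_box(box, width, height, target=64):
    """Map an (x1, y1, x2, y2) box from a width x height image to target x target."""
    x1, y1, x2, y2 = box
    scaled = (
        x1 * target / width,
        y1 * target / height,
        x2 * target / width,
        y2 * target / height,
    )
    # Clamp to the valid output range [0, target].
    return tuple(min(max(v, 0.0), float(target)) for v in scaled)


# Hypothetical example: one box in a 640x480 camera image.
targets = [scale_box((160, 120, 480, 360), width=640, height=480)]
```

Stacking n such tuples gives the (n,4) regression target for the localisation layer.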

TensorBoard

Try using TensorBoard to visualize training results.

generate ground truth verification data

Ground truth data can be easily generated using the Matlab Automated Driving System Toolbox.

ToDo List

evaluate Matlab Automated Driving System Toolbox
find optimal workflow
write down workflow
crop image to some region? -> no
script converting the matlab file to .csv
store everything in a separate folder in drive_ml->sign-recognition->utilities

Wiki: Update the wiki page for open source datasets

Goal:

Update the wiki page for the datasets so that it is clear how each one can be used (e.g. add a table with information such as "annotations available?", ...)

Assignments:

get an overview of the used datasets
document for each dataset what it includes, and how it can be used

Notes:

TUM-Wiki: https://wiki.tum.de/display/phoenix/Machine+Learning
Dataset: https://bdd-data.berkeley.edu/
Dataset: TuSimple
Dataset: CULane

LaneDetection: Improve script for estimating line-polynomials from segmented image (different scenarios)

Objective:
Improve the estimation of line-polynomials for different scenarios (straight/curve/intersection/...)
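
As a point of reference for this ticket, the following is a minimal, dependency-free sketch of a least-squares polynomial fit on the pixels of a segmented image. It is not the current script: the helper names, the binary-mask input format, and the choice to model x as a function of y are assumptions.

```python
def mask_to_points(mask):
    """Collect (x, y) coordinates of all non-zero pixels in a 2D mask."""
    return [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]


def fit_polynomial(points, degree=2):
    """Least-squares fit x = c0 + c1*y + ... + cd*y^d via the normal equations."""
    n = degree + 1
    # Build A^T A and A^T b for the Vandermonde system.
    ata = [[sum(y ** (i + j) for _, y in points) for j in range(n)] for i in range(n)]
    atb = [sum(x * y ** i for x, y in points) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        atb[col], atb[pivot] = atb[pivot], atb[col]
        for row in range(col + 1, n):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= factor * ata[col][k]
            atb[row] -= factor * atb[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(ata[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (atb[i] - s) / ata[i][i]
    return coeffs
```

Fitting x over y keeps near-vertical lane lines well defined. The fit needs at least degree + 1 distinct y values, so short segments (for example at intersections) would have to be handled separately.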

How to start with the ticket:

Get current script from github (https://github.com/tum-phoenix/drive_ml)
Update the script for estimating the polynomials from the segmented image.
Commit this script back to github.

Constraints:

The network has to run at a minimum of 25 FPS on our on-board hardware (a Jetson TX2)
This also includes communication with the ROS core running on the main Intel NUC board

Benchmarks:

No frame-rate measurement exists yet
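
Since no frame-rate measurement exists yet, a simple timing harness could provide a first number. The sketch below is an assumption about how such a benchmark might look: `infer` stands for any callable that processes one frame (ideally including the ROS round trip), and `measure_fps` is a hypothetical helper name.

```python
import time


def measure_fps(infer, frames, warmup=5):
    """Return the average number of frames per second achieved by `infer`."""
    for frame in frames[:warmup]:
        infer(frame)  # warm-up runs are not timed
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


def meets_constraint(fps, required=25.0):
    """Check the measured rate against the required frame rate."""
    return fps >= required
```

To reflect the constraint above, the timed callable should cover the whole path from receiving an image to publishing the result, not only the forward pass.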

Files:
https://github.com/tum-phoenix/drive_ml

generate synthetic test data from the Gazebo simulator

From the Gazebo simulation, generate:

class labels
bounding boxes
images

https://github.com/tum-phoenix/drive_sim_road_generation

For naming and storing conventions, please follow the GTSRB structure.

TrafficSigns: [object detection] Implement SlimYOLOv3 (Yolo with Pruning) (open source dataset)

Goal:

Implement the SlimYOLOv3 algorithm and apply it to different datasets to obtain results

Background:

When this ticket is completed, the SlimYOLOv3 can be applied to the CaroloCup-Dataset

Assignments:

research: what is SlimYOLOv3?
implement the algorithm
check whether the accuracy matches the results reported in other papers on existing datasets
test the algorithm with the different methods
documentation in the TUM-Wiki: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

TrafficSigns: [classification] Add a ConfusionMatrix to classification_eval script

Goal:

Update the classification evaluation script with a confusion matrix so that more insight into the results is gained.

Assignments:

obtain the code from Git
add a confusion matrix plot (as a function, using def)
test, and commit to Git
document how to use the notebook in the TUM-Wiki: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

See file "classification_eval.ipynb"
https://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html
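
The linked scikit-learn example builds on `sklearn.metrics.confusion_matrix`. For orientation, here is a dependency-free sketch of what the matrix contains; the function names are hypothetical and the integer class labels are an assumption about the notebook's data.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    recalls = []
    for cls, row in enumerate(matrix):
        total = sum(row)
        recalls.append(row[cls] / total if total else 0.0)
    return recalls


cm = confusion_matrix([0, 0, 1, 1, 2], [0, 1, 1, 1, 0], num_classes=3)
```

Off-diagonal entries show which sign classes get confused with each other, which a single accuracy value hides.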

Deep Reinforcement/Imitation Learning

Objective: Train a network to control the car

Deep reinforcement learning is a promising approach for autonomous robots and has been used in autonomous driving. Note that this approach would be trained end to end.

Network architecture (Agent)

TBD
Imitation Learning: Alternative approach, learn the driving strategy end-to-end from recorded data

Simulation Environment:

Gazebo simulation: the realism could be higher, but it integrates all of our ROS inputs directly, so the results are directly reusable in the real world. LIDAR and cameras as well as the side switch are simulated.
Other publicly available simulators with custom environments: AirSim or CARLA

Constraints:

The network has to run at a minimum of 25 FPS on our on-board hardware (a Jetson TX2)
This also includes communication with the ROS core running on the main Intel NUC board

Tasks:

Setup Gazebo simulation
Research Deep Reinforcement Learning.
Try out different approaches in the Simulation
Benchmark on the car to measure performance and detect problems

Resources:

https://github.com/simoninithomas/Deep_reinforcement_learning_Course

LaneDetection: spatialCNN (SCNN) - Implement/Test lane detection (open source dataset)

Goal:

Implement and train a Spatial CNN (SCNN) algorithm and test it with the open source datasets CULane and TuSimple

Assignments:

Understand how the ML algorithm works
Implement the ML algorithm
Train and test it with an open source dataset
Documentation in the TUM-Wiki: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

https://paperswithcode.com/task/lane-detection/codeless
See approach 2 of https://towardsdatascience.com/tutorial-build-a-lane-detector-679fd8953132
optional dataset: https://xingangpan.github.io/projects/CULane.html
optional dataset: TuSimple/tusimple-benchmark#3
basic information about lane detection: https://wiki.tum.de/display/phoenix/Lane+Detection

TrafficSigns: [object detection] Identify road markings (open source dataset)

Goal:

Create an object detector for road markings based on the YOLOv3 or SlimYOLOv3 algorithm

Assignments:

define which road markings are required for the CaroloCup
check whether a dataset already exists that fulfills the requirements (see notes)
select a dataset, and check the quality of it (balance of labels by histogram)
train the neural network
document in the TUM-Wiki where the additional dataset is stored, what it contains, and the histogram: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

external dataset: http://www.ananth.in/RoadMarkingDetection.html

LaneDetection: Apply a new ML approach to go from segmented image to line polynomial

Objective:
Currently, a simple regression is applied to go from the segmented image to a line polynomial. This ticket is about devising a different ML approach and implementing it.

How to start with the ticket:

Get current script from github (https://github.com/tum-phoenix/drive_ml)
Create the new script for estimating the polynomials from the segmented image.
Commit this script to github.

Constraints:

The network has to run at a minimum of 25 FPS on our on-board hardware (a Jetson TX2)
This also includes communication with the ROS core running on the main Intel NUC board

Benchmarks:

No frame-rate measurement exists yet

Files:
https://github.com/tum-phoenix/drive_ml

Evaluate MS Azure

Evaluate tools from MS for ML development.

Workbench
Azure

We have a subscription from MS available.

LaneDetection: LaneNet - Implement/Test lane detection (open source dataset)

Goal:

Implement and train a LaneNet algorithm and test it with the open source datasets CULane and TuSimple

Assignments:

Understand how the ML algorithm works
Implement the ML algorithm
Train and test it with an open source dataset
Documentation in the TUM-Wiki: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

https://paperswithcode.com/task/lane-detection/codeless
https://github.com/MaybeShewill-CV/lanenet-lane-detection
optional dataset: https://xingangpan.github.io/projects/CULane.html
optional dataset: TuSimple/tusimple-benchmark#3
basic information about lane detection: https://wiki.tum.de/display/phoenix/Lane+Detection

TrafficSigns: [object detection] Improve the dataset for the object detection in the CaroloCup

Goal:

Create a second dataset for object detection in the CaroloCup, stored in a second folder, with additional data for the same labels as the original dataset. A separate dataset should be created so that we still have the original one for comparison purposes.

Background:

When this is completed, the object detection algorithm can train on both datasets to improve the accuracy

Assignments:

get an overview of which data exist for which label
create a second dataset: check how the data should be labeled; a program such as LabelImg might then be used
document in the TUM-Wiki which data exist in the dataset: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

a dataset for object detection not only requires the label, but also the bounding box
/drive_ml/blob/master/tf_object_detection
Quality and quantity are both important
Make sure that the dataset also contains images photographed from further away
Training requires a dataset that is i.i.d. (independent and identically distributed)
information about the current YOLOv3 implementation and the current datasets: https://wiki.tum.de/display/phoenix/Sign+Detection

Zero/background images are simply resized to 64x64, which we won't do in real life

TrafficSigns: Create a MultiTask-Neural Network (proof of concept)

Goal:

Create a neural network structure that is able to perform multiple tasks (see also paper), such as classification, object detection, and lane detection.

Assignments:

read papers
create a proof of concept with, for example, traffic signs and road signs
document how to use the notebook in the TUM-Wiki: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

paper: https://arxiv.org/abs/1710.06288
dataset for road markings: http://www.ananth.in/RoadMarkingDetection.html
dataset for traffic signs: http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset

paper_2017_Lee_LaneandRoadMarkingDetectionandRecognition.pdf

training on unbalanced set

All classes should have approximately the same number of images to be balanced.

remove pictures?
add new pictures?

Histogram available:
https://github.com/tum-phoenix/drive_ml/blob/master/sign_recognition/utilities/data_augmentation/data_augmentation.ipynb
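
To decide between removing and adding pictures, it helps to quantify the imbalance first. The sketch below is not the linked notebook; it assumes the labels are available as a flat list of class IDs, and the function names are hypothetical.

```python
from collections import Counter


def class_histogram(labels):
    """Count how many images each class has."""
    return Counter(labels)


def oversampling_plan(hist):
    """Extra images each class needs to match the largest class."""
    target = max(hist.values())
    return {cls: target - count for cls, count in hist.items()}


def undersampling_plan(hist):
    """Images each class would lose to match the smallest class."""
    target = min(hist.values())
    return {cls: count - target for cls, count in hist.items()}


labels = [14] * 5 + [13] * 2 + [1] * 3  # made-up class IDs for illustration
hist = class_histogram(labels)
```

Comparing both plans shows the cost of each option: how many pictures would have to be added (or augmented) versus how many would be thrown away.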

Lane Position Estimation

Goal: estimate the position of the car in the lane based on the labeled images from the Bosch workshop.

Convention for representation: 0 is the middle line, -1 is the middle of the left lane, and 1 is the middle of the right lane (values go up to 2)
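
One possible reading of this convention, sketched below, is a lateral offset normalised by half a lane width, so that the outer edge of the right lane maps to 2. The metric input, the symmetric lower bound of -2, and the function name are assumptions, not part of the ticket.

```python
def lane_offset_label(lateral_offset_m, lane_width_m):
    """Map a lateral offset from the middle line (metres, right is positive)
    to the convention: 0 = middle line, +/-1 = middle of right/left lane."""
    value = lateral_offset_m / (lane_width_m / 2.0)
    # Assumed valid range: -2 (outer edge, left lane) to 2 (outer edge, right lane).
    return max(-2.0, min(2.0, value))
```

With a lane width of 0.4 m, a car 0.2 m to the right of the middle line would get the label 1.0 under this reading.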

Can be put into a sub-folder directly at drive_ml (as a jupyter notebook for example)

Where to put the data:
SSD mount is under /datasets_2 on the workstation
Make a sub-folder called lane_offset for the data recorded

Tasks:

import the data (on the workstation SSD)
train a CNN-based regression approach to estimate the offset from the current middle line

LaneDetection: Create a general evaluation script

Goal:

Create a general evaluation script that is able to evaluate the performance of a line detection algorithm.

Assignments:

Ground Truth
Documentation in the TUM-Wiki: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

Pruning Network Architectures

Prune networks trained on our data.

Colab notebooks for this are available from Google.

Text TBD

LaneDetection: Hough Transform - Implement/Test lane detection

Goal:

Implement and test the "Hough Transform" algorithm on the offline datasets CULane and TuSimple. Note: this is a classical approach and no machine learning.
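
The core of the classical approach is a voting scheme in (rho, theta) space. The following dependency-free sketch shows the idea on a list of edge pixels; it is a simplification under stated assumptions (integer rho bins, one-degree theta steps, edge detection already done) and not a replacement for an optimised library implementation.

```python
import math


def hough_lines(edge_points, theta_steps=180, top_k=1):
    """Vote in (rho, theta) space and return the top_k peaks
    as (rho, theta_degrees, votes) tuples."""
    votes = {}
    for x, y in edge_points:
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            votes[(rho, t)] = votes.get((rho, t), 0) + 1
    best = sorted(votes.items(), key=lambda kv: -kv[1])[:top_k]
    return [(rho, 180.0 * t / theta_steps, count) for (rho, t), count in best]


# Hypothetical edge pixels forming a vertical line at x = 5.
points = [(5, y) for y in range(10)]
peak = hough_lines(points)[0]
```

Each edge pixel votes for every line that could pass through it; bins that collect many votes correspond to lines in the image. Neighbouring bins often tie on short segments, so a real implementation would add non-maximum suppression.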

Assignments:

Understand the algorithm
Understand the dataset
Evaluate the performance of the algorithm on the datasets
documentation in the TUM-Wiki: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

See approach 1 of https://towardsdatascience.com/tutorial-build-a-lane-detector-679fd8953132
Datasets:

LaneDetection: Find open source datasets for the CaroloCup

Goal:

Understand which datasets are currently used for lane detection, and define which dataset can be used for the CaroloCup.

Assignments:

understand the basics of lane detection
understand which labels are required for line detection in the CaroloCup (dashed line, straight line, etc.)
look for open source datasets and select at least 2 good datasets
store dataset on TUM-PC
documentation in the TUM-Wiki which datasets look promising, explain how to use them, and what the dataset contains: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

Quality and quantity are both important
Training requires a dataset that is i.i.d. (Independent and identically distributed).
optional dataset: https://xingangpan.github.io/projects/CULane.html
optional dataset: TuSimple/tusimple-benchmark#3
basic information about lane detection: https://wiki.tum.de/display/phoenix/Lane+Detection

Pipeline for TensorFlow

Develop a pipeline for learning

data preprocessing
parameter handling
training / validation / test routine
evaluation / visualization tool (e.g. TensorBoard)

Create a Docker image where everything works.

Model design

Read the Keras documentation and try to create a better model.
https://keras.io/layers/convolutional/

Segmentation of the Driving Environment

Objective: Segmentation of the Camera Image

Segmentation is a useful tool for perceiving the environment and extracts a very rich representation. In combination with depth perception, this can lead to a very accurate snapshot of the track.

Classes:

Line
Street
- Drivable
- Non-Drivable
  - Barred Area
  - Parking Spot
  - ...
- Road Marking
  - Turning Signal
  - Speed Limit
  - ...
Obstacle
- Moving
- Static
Pedestrian
Traffic sign
Other (surrounding track etc.)

Network architecture

FCN-based
- Pros: simple, fast, many implementations exist, pretrained weights available, easy to train
- Cons: not state-of-the-art anymore
DeepLab-based
- Pros: very good performance (still state-of-the-art), implementations in TF exist, pretrained weights available
- Cons: harder to train, relatively performance-hungry (but fast version with MobileNet V2 backend exists)
SCNN-based
- Pros: extremely fast, even on slow hardware
- Cons: performance might not be ideal

Data sources:

Use real-world datasets like Berkeley Deep Drive, KITTI or Cityscapes (however, these might not be directly transferable to our domain -> Domain Adaptation)
Use our environment generator (originally for Gazebo). A possibility is to render the track in 3D software like Blender to create a more realistic image. Segmentation ground truth can be generated by an additional render pass.

Constraints:

The network has to run at a minimum of 10 FPS on our on-board hardware (a Jetson TX2)
This also includes communication with the ROS core running on the main Intel NUC board

Benchmarks

IoU and other scores on common semantic segmentation datasets
Empirical evaluation in the real world
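
For reference, IoU (intersection over union) for one class is the number of pixels where prediction and ground truth agree on that class, divided by the number of pixels where either of them has it. A minimal sketch, assuming label maps are given as nested lists of integer class IDs (function names are hypothetical):

```python
import math


def class_iou(pred, target, cls):
    """IoU of one class between two label maps of equal size."""
    inter = union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p_hit, t_hit = p == cls, t == cls
            inter += int(p_hit and t_hit)
            union += int(p_hit or t_hit)
    return inter / union if union else float("nan")


def mean_iou(pred, target, classes):
    """Average IoU over all classes that occur in prediction or ground truth."""
    scores = [class_iou(pred, target, c) for c in classes]
    scores = [s for s in scores if not math.isnan(s)]
    return sum(scores) / len(scores) if scores else float("nan")
```

Classes absent from both maps are skipped in the mean, so rare classes such as "Pedestrian" do not distort the score on frames where they do not appear.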

Tasks:

Check out state-of-the-art segmentation algorithms for mobile robots.
Define useful classes for the segmentation and compare them with different datasets (Cityscapes, BDD, KITTI).
Train a model with a public dataset or our data and test performance on the Jetson
Fine tune the model on our data

TrafficSigns: [object detection] Add one custom class to YOLOv3 (proof of concept)

Goal:

Extend the YOLOv3 algorithm so that it can detect an additional class (with a 2D bounding box)

Assignments:

understand the current YOLOv3 algorithm (see repo)
understand which classes are currently detected
determine the performance with the current classes
add additional data for one additional class
test and evaluate the algorithm
documentation in the TUM-Wiki: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

File: /drive_ml/blob/master/tf_object_detection/Train Yolo3.ipynb
information about the current YOLOv3 implementation and the current datasets: https://wiki.tum.de/display/phoenix

Lane Detection using Deep Learning

Objective: Detect the Road Lanes

State-of-the-art Convolutional Neural Networks can be used to detect road lanes.

Different approaches and techniques can be implemented to train a NN that can detect lines. Several aspects need to be tried and decided on:

Representation:

Output of coefficients corresponding to lane boundaries
Output of drivable area in image
Segmentation of lane boundaries (can be combined with scene segmentation for other tasks)

Network architecture

CNN outputting a set of coefficients using normal fully connected layers at the head (easiest)
CNN that upsamples a feature map to generate the drivable road area (medium)
Segmentation network (bottleneck at feature map, upsampling afterwards again) (hardest)

Data sources:

Generation of coefficients by our current line detection algorithm (can use existing rosbags for generation)
Use real-world datasets like Berkeley Deep Drive which contain drivable area and lane information (however, these might not directly be transferable to our domain)
Use our environment generator (originally for Gazebo). A possibility is to render the track in 3D software like Blender to create a more realistic image. This requires investigation into the influence of the Domain Adaptation problem in this domain.

Constraints:

The network has to run at a minimum of 25 FPS on our on-board hardware (a Jetson TX2)
This also includes communication with the ROS core running on the main Intel NUC board

Benchmarks:

Comparison with our current Line Detection approach
Driving in simulation (requires investigation into domain adaptation)

Jetson: Create Wiki tutorial on how to flash the Jetson

Goal:

Create a wiki tutorial on how to flash the Jetson

Assignments:

understand how the Jetson should be flashed
test the procedure
document results in the TUM-Wiki: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

for more information, please contact Mykyta

evaluate Tensorflow Object Detection API

Google recently announced its new Object Detection API. It could be very interesting for us.

The tools include:

Faster-RCNN with NASNet-A image featurization
Single Shot Multibox Detector (SSD) with MobileNet,
SSD with Inception V2,
Region-Based Fully Convolutional Networks (R-FCN) with Resnet 101,
Faster RCNN with Resnet 101,
Faster RCNN with Inception Resnet v2

More info:
https://github.com/tensorflow/models/tree/master/research/object_detection

LaneDetection: Research how lane detection with the car should be evaluated

Goal:

Determine how lane detection algorithms can be tested and evaluated with the Phoenix car on the racetrack (ground truth).

Assignments:

Brainstorm how the performance of a line detection algorithm can be evaluated (ground truth)
Documentation in the TUM-Wiki: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

basic information about lane detection: https://wiki.tum.de/display/phoenix/Lane+Detection
we need to either reuse the evaluation metric from existing lane detection datasets (TuLane has one) or implement it ourselves

TrafficSigns: [classification] Update the training script so that a plot of the training accuracy per epoch is stored (Reserved for Ussamma)

Goal:

At the end of the training phase, store a plot of the accuracy and loss recorded for each epoch.

Assignments:

get the training script from Git
create the accuracy/loss plot and save it automatically
test the modified script and commit to Git
document in the TUM-Wiki how this notebook should be used: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

File "classification_train.ipynb"

Error difference

The accuracy during fitting and during evaluation is not the same.
Find out why they differ and why the network is mostly 100% sure on the given data.

image brightness per class zero mean and unit variance

images should be normalized in terms of:

same brightness mean per class
same brightness variance per class
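
A minimal sketch of this normalisation, assuming each image is available as a flat list of grayscale brightness values grouped by class ID (the data layout and the function name are assumptions):

```python
import math


def normalize_per_class(images_by_class):
    """Shift and scale brightness so every class has zero mean and unit variance.

    images_by_class: {class_id: [image, ...]}, each image a flat list of values.
    """
    result = {}
    for cls, images in images_by_class.items():
        pixels = [v for img in images for v in img]
        mean = sum(pixels) / len(pixels)
        var = sum((v - mean) ** 2 for v in pixels) / len(pixels)
        std = math.sqrt(var) or 1.0  # constant images: avoid division by zero
        result[cls] = [[(v - mean) / std for v in img] for img in images]
    return result
```

The statistics are computed over all pixels of a class together, so individual images may still differ in brightness; only the class as a whole ends up with mean 0 and variance 1.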

ObjectDetection: Implement first version of YoloV4

Objective:
Implement YoloV4

How to start with the ticket:

Read the paper (https://arxiv.org/abs/2004.10934)
Implement the code: https://github.com/AlexeyAB/darknet

When is the ticket finished?:
The ticket is finished when a small proof of concept shows how YoloV4 can be implemented and how it can be trained for custom classes (any dataset).

Further Information:
In a later ticket, this code will be used for our custom pictures. Side note: labeled images (object + box) can also be retrieved from here: /home/Images (LRZ cloud.tum-phoenix.de)

Files:
https://github.com/tum-phoenix/drive_ml

Data Augmentation - Missing Signs

For the Carolo Cup, some signs are not in the database. To solve this problem, we need other pictures.

try to find an approach to generate pictures of these signs
Maybe you can cut out the bounding box of the sign and replace it with another sign picture from Google or another source.
Deform the sign and insert it with a shadow transformation (constant, linear, polynomial), which changes the RGB values.
Mix and use the mean of a set of backgrounds (e.g. two).
Evaluate with real camera data
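
One way to read the shadow transformation step is as a brightness factor that varies over the image rows according to a polynomial. The sketch below follows that reading; the row-wise direction, the clamping to [0, 1], and the function name are assumptions made for illustration.

```python
def apply_shadow(image, coeffs):
    """Darken an RGB image row by row with factor f(r) = sum(c_k * r**k).

    image: list of rows, each a list of (R, G, B) tuples.
    coeffs: [c0] is constant, [c0, c1] linear, longer lists polynomial;
            r runs from 0 (top row) to 1 (bottom row).
    """
    height = len(image)
    shaded = []
    for i, row in enumerate(image):
        r = i / (height - 1) if height > 1 else 0.0
        factor = sum(c * r ** k for k, c in enumerate(coeffs))
        factor = max(0.0, min(1.0, factor))  # keep values in the valid range
        shaded.append([tuple(int(round(ch * factor)) for ch in px) for px in row])
    return shaded
```

For example, `apply_shadow(img, [0.5])` halves the brightness everywhere, while `apply_shadow(img, [1.0, -0.5])` leaves the top row untouched and darkens towards the bottom.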

TrafficSigns: [object detection] Improve the current YOLOv3 script

Goal:

Improve the accuracy of the current YOLOv3 by first getting the baseline performance with open source datasets and the CaroloCup dataset, then finding out which problems exist, and finally adding improvements and testing the results.

Assignments:

analyze the current script
determine out-of-the-box performance for the open source dataset and the carolocup dataset
understand where the model can be improved
retrain/retest
documentation in the TUM-Wiki: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

File: /drive_ml/blob/master/tf_object_detection/Train Yolo3.ipynb
information about the current YOLOv3 implementation and the current datasets: https://wiki.tum.de/display/phoenix

Data Analytics - Class and Boundingbox statistics

Currently, we have included most public and private datasets. The notebook with the labels and conversion examples is available here. Dataset evaluation is found here.

Available public datasets:

GTSDB (German Traffic Sign Detection Benchmark)
LISA (Laboratory for Intelligent and Safe Automobiles)
BTSD (Belgian Traffic Sign Dataset)
STS (Swedish Traffic Sign Dataset)
DITS (Dataset of Italian Traffic Signs)
RTSD (Russian Traffic Sign Dataset)

Todo:

add conversion dict for Mapillary Traffic Sign dataset
add evaluation for Mapillary traffic sign dataset
add conversion dict for Berkeley Deep Drive dataset
add evaluation for Berkeley Deep Drive dataset

Data - Database

Create a database from the pictures:
Relative path
Size
Classes
Bounding box (x1 y1 x2 y2)
Dataset
Sign ID - per scene...
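
As a starting point, the fields listed above could map to a single SQLite table. This is only a sketch: the table and column names are hypothetical, and "size" is interpreted here as image width and height, which is an assumption.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS annotations (
    id            INTEGER PRIMARY KEY,
    relative_path TEXT NOT NULL,
    width         INTEGER,
    height        INTEGER,
    class_id      INTEGER NOT NULL,
    x1 INTEGER, y1 INTEGER, x2 INTEGER, y2 INTEGER,
    dataset       TEXT,
    sign_id       INTEGER
)
"""


def create_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_annotation(conn, row):
    """Insert one bounding box; row follows the column order below."""
    conn.execute(
        "INSERT INTO annotations (relative_path, width, height, class_id,"
        " x1, y1, x2, y2, dataset, sign_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
        row,
    )


def count_per_class(conn):
    """Return {class_id: number of boxes}."""
    rows = conn.execute("SELECT class_id, COUNT(*) FROM annotations GROUP BY class_id")
    return dict(rows.fetchall())
```

One row per bounding box keeps images with several signs simple to store, and the per-class count gives the class statistics directly.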

document internal data convention

ClassID

For better documentation, there should be a table in a ReadMe.md (Markdown):
| Name | GTSRB Number | Picture |

{
    '3': 'Geschwindigkeitsbegrenzung 60', 
    '11': 'Nächste Kreuzung Vorfahrt', 
    '22': 'Bodenwellen', 
    '33': 'Rechts Abbiegen', 
    '4': 'Geschwindigkeitsbegrenzung 70', 
    '27': 'Vorsicht Fußgänger', 
    '40': 'Kreisverkehr', 
    '39': 'Blauer Pfeil untenlinks', 
    '41': 'Überholverbot aufgehoben', 
    '43': 'Kein Schild', 
    '1': 'Geschwindigkeitsbegrenzung 30', 
    '19': 'Scharfe Kurve links', 
    '24': 'Straße wird enger links', 
    '28': 'Vorsicht Kinder', 
    '12': 'Vorfahtsstraße', 
    '38': 'Blauer Pfeil untenrechts', 
    '7': 'Geschwindigkeitsbegrenzung 100', 
    '0': 'Geschwindigkeitsbegrenzung 20', 
    '2': 'Geschwindigkeitsbegrenzung 50', 
    '30': 'Vorsicht Schnee', 
    '14': 'STOP', 
    '5': 'Geschwindigkeitsbegrenzung 80', 
    '13': 'Vorfahrt gewähren', 
    '8': 'Geschwindigkeitsbegrenzung 120', 
    '36': 'Geradeaus oder rechts abbiegen', 
    '17': 'Einbahnstraße falsche Seite', 
    '9': 'Überholverbot PKW', 
    '6': 'Geschwindigkeitsbegrenzung Ende 80', 
    '21': 'Kurvige Straße', 
    '26': 'Vorsicht Ampel', 
    '42': 'LKW-Überholverbot aufgehoben', 
    '29': 'Vorsicht Fahrrad', 
    '31': 'Vorsicht Wild', 
    '10': 'Überholverbot LKW', 
    '37': 'Geradeaus oder links abbiegen', 
    '35': 'Geradeaus', 
    '15': 'Durchfahrt verboten', 
    '32': 'Alle Regeln frei', 
    '23': 'Rutschgefahr', 
    '16': 'Durchfahrt verboten LKWs', 
    '25': 'Vorsicht Bauarbeiten', 
    '20': 'Scharfe Kurve rechts', 
    '34': 'Links Abbiegen', 
    '18': '!-Zeichen'
}
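
The table could be generated from the dictionary rather than written by hand. A minimal sketch, assuming the dictionary above is available as a Python dict; the function name and the picture path scheme (`pictures/<id>.png`) are hypothetical.

```python
def readme_table(class_names, picture_dir="pictures"):
    """Build a Markdown table with one row per class, sorted by class number."""
    lines = ["| Name | GTSRB Number | Picture |", "| --- | --- | --- |"]
    for class_id in sorted(class_names, key=int):
        name = class_names[class_id]
        picture = f"![{name}]({picture_dir}/{class_id}.png)"
        lines.append(f"| {name} | {class_id} | {picture} |")
    return "\n".join(lines)


# Two entries from the dictionary above, for illustration.
table = readme_table({"14": "STOP", "1": "Geschwindigkeitsbegrenzung 30"})
```

Sorting with `key=int` matters because the keys are strings: a plain sort would place '10' before '2'.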

File names and formats

folder and filename structure
csv name
csv content
zero class csv content

ImageSegmentation: Improve style transfer to create more realistic images

Objective:
A neural network has been implemented to create segmented images. The neural network learns from rendered images (created in Gazebo). This task is about retraining the neural network with improved images after applying a better style transfer.

How to start with the ticket:

Talk to Frithjof

Constraints:

The network has to run at a minimum of 25 FPS on our on-board hardware (a Jetson TX2)

Benchmarks:

No frame-rate measurement exists yet

Files:

LaneDetection: ENet-SAD - Implement/Test lane detection (open source dataset)

Goal:

Implement and train an ENet-SAD (Self Attention Distillation) algorithm and test it with the open source datasets CULane and TuSimple

Assignments:

Understand how the ML algorithm works
Implement the ML algorithm
Train and test it with an open source dataset
Documentation in the TUM-Wiki: https://wiki.tum.de/display/phoenix/Machine+Learning

Notes:

https://paperswithcode.com/task/lane-detection/codeless
paper: https://arxiv.org/abs/1908.00821
optional dataset: https://xingangpan.github.io/projects/CULane.html
optional dataset: TuSimple/tusimple-benchmark#3
basic information about lane detection: https://wiki.tum.de/display/phoenix/Lane+Detection

generate synthetic data projecting signs in recorded pictures

We could generate synthetic training data with GTSRB or other sign images. The images can be projected into our "environment" to create the synthetically labeled data. Based on the projection parameters, we can obtain the bounding boxes easily.
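
To show how the bounding box follows from the projection parameters, here is a minimal sketch that assumes the projection is given as a 3x3 homography in row-major nested lists. The function names are hypothetical and this is not the code in the repository.

```python
def project_point(matrix, x, y):
    """Apply a 3x3 homography to a point and return image coordinates."""
    (a, b, c), (d, e, f), (g, h, i) = matrix
    w = g * x + h * y + i
    return ((a * x + b * y + c) / w, (d * x + e * y + f) / w)


def projected_bbox(matrix, sign_width, sign_height):
    """Axis-aligned box (x1, y1, x2, y2) around the projected sign corners."""
    corners = [(0, 0), (sign_width, 0), (sign_width, sign_height), (0, sign_height)]
    xs, ys = zip(*(project_point(matrix, x, y) for x, y in corners))
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical projection: scale the sign by 0.5 and move it to (100, 50).
H = [[0.5, 0.0, 100.0], [0.0, 0.5, 50.0], [0.0, 0.0, 1.0]]
bbox = projected_bbox(H, sign_width=64, sign_height=64)
```

Because only the four corners are projected, the label is available without inspecting any pixels of the generated image.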

Code lives here: https://github.com/tum-phoenix/drive_ml/tree/master/sign_recognition/utilities/data_augmentation

1st approach

use GTSRB images and project them into some random camera images we recorded

problems:

images from GTSRB are very small (no pictures with big signs possible)
the background from GTSRB and the background from our environment can be very different
not all of our signs are in GTSRB

2nd approach

use signs with transparent background
project signs into random pictures of recordings (zero class images)
evaluate different processing steps to make the projected signs look more realistic
use of random parameters (using some kind of normal distribution?)
evaluate order of processing steps
automate the process (e.g. with Python)
validation?

code

we already have some code