Workzone Boundary Detection
License: MIT License
For points/cones that are sparse, use curvature to associate them into occupied zones.
We could use nearest neighbors or k-means; we also need to come up with a curvature-dependent heuristic.
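As a first cut at the nearest-neighbor idea, sparse cone positions could be chained into zones with a simple greedy single-link pass. This is just a sketch: the 3.0 m gap threshold is a made-up parameter, and the cone coordinates are invented for illustration.

```python
import math

def cluster_points(points, max_gap=3.0):
    """Greedy single-link clustering: two points belong to the same
    occupied zone if a chain of points connects them, each hop shorter
    than max_gap (meters). max_gap is a placeholder threshold."""
    clusters = []
    for p in points:
        # Find every existing cluster with a member within max_gap of p.
        near = [c for c in clusters
                if any(math.dist(p, q) <= max_gap for q in c)]
        if not near:
            clusters.append([p])
        else:
            # Merge all nearby clusters and add p to the combined zone.
            merged = [q for c in near for q in c] + [p]
            clusters = [c for c in clusters if c not in near]
            clusters.append(merged)
    return clusters

# Two well-separated groups of cones plus one isolated cone.
cones = [(0, 0), (1, 0.5), (2, 1), (20, 20), (21, 20.5), (50, 0)]
zones = cluster_points(cones)
```

A curvature-dependent heuristic could later replace the fixed threshold, e.g. widening max_gap along low-curvature runs of cones.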
The workzone repository will serve as the wrapper node: it contains scripts for setting up a workspace with all the repositories and dependencies needed for the pipeline, plus any packages and launch files to run the pipeline.
Create a .repos file for the project and commit it to this repo. Basically, we need a ROS node that takes the OpenCV code we have written and wraps it up so that it can ingest 2D costmap images with bounding boxes (or whatever format the data comes in as) and spit out a 2D occupancy grid or image (or whatever we decide on) with the workzone segmented.
Some rough ideas/requirements/notes:
I will take maybe 30 minutes to an hour and put together some quick pictures or slides demonstrating some of the high-level ideas I have had so far.
Leave a comment below if there are other ideas you want to add on, or even just pitch to Raj.
I will link the document below when created.
Basically, we just need a way to play back the images from the nuScenes dataset so that they can be ingested in realtime by the processing pipeline.
There are some different ways we can approach this / some different approaches that will work, depending on how streamlined/robust we want the implementation to be.
For the sake of simplicity, for the time being, we have just been using a visualization of the ground truth annotations from the nuScenes dataset.
However, in the future, we ideally want live predictions from the model itself. Therefore, see if we can get live predictions from either the patched model repository package or the TensorRT ROS wrapper version.
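One minimal way to mimic realtime playback, sketched in plain Python with stand-in frame objects instead of actual nuScenes images (the fps value and the injectable sleep function are assumptions for this sketch, not anything from the devkit):

```python
import time

def play_back(frames, fps=2.0, sleep=time.sleep):
    """Yield frames at a fixed rate to mimic a live camera feed.
    `frames` stands in for decoded nuScenes images."""
    period = 1.0 / fps
    start = time.monotonic()
    for i, frame in enumerate(frames):
        # Sleep until this frame's scheduled timestamp, if it is
        # still in the future; otherwise emit immediately.
        due = start + i * period
        delay = due - time.monotonic()
        if delay > 0:
            sleep(delay)
        yield frame

# Stand-in strings for image arrays loaded from the dataset.
played = list(play_back(["img0", "img1", "img2"], fps=50.0))
```

A more robust version would publish each frame on a ROS topic and pace itself off the dataset's own timestamps rather than a fixed fps.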
Outline of the Initial Project Presentation
The initial project presentation can follow a roughly similar format as the project scope document (as listed in the previous announcement). However, it is an in-class presentation in front of a live audience. Hence, it needs to be much more visual.
Use lots of figures - and make them colorful and appealing.
Use animation where appropriate. Do not overuse animation, but it can present concepts coherently in an otherwise visually cluttered slidescape.
Make your use cases very clear. And present them from the simplest to the most complex.
List your anticipated demo sequences for the intermediate and final demos, as discussed during the finalization of your project scope.
Time duration: ~20 minutes for each group (~5-6 minutes for each project member + 5 minutes for Q&A from the entire class + transition time between projects).
We have a good set of projects in store.
IMPORTANT: You MUST send your project presentations to me ([email protected]) before 3pm on the day of your presentation. All presentations will be made from my laptop in class to avoid technical glitches with the presentation equipment. PowerPoint, PDF, and Google Slides can be used.
Looking forward to your presentations and the scope documents next week.
Some different options:
Made a task specifically for this part, as this may prove to be a tiny bit more involved than the rest of the contents of the presentation. Although, for a first draft, maybe not!
Made a draw.io document here: https://drive.google.com/file/d/1mE79ahct1FuurYCvGkRm1PcesUFvAQa6/view?usp=sharing
At the very least, we can just depict the flow of sensor data (images, maybe a lidar point cloud) into the pipeline, maybe previewing what the output looks like at each stage. Honestly, for our first presentation, the rough drawing I had in the demo slides wouldn't be horrible.
Find one construction dataset that we can start with. Traffic cones, construction lights, barriers, construction vehicles (like excavators, work lights, generators, bulldozers, pavers, etc.--anything like that). Anything you might find in a construction zone. We can definitely combine multiple datasets if you can't find a single one containing all relevant construction objects.
Update: Maybe to be more specific, I think there are two different kinds of datasets that we might want to look for:
Also, in looking for these datasets, it's good to check out all different kinds. We can use ones that look kinda scrappy/thrown together from roboflow--but we may have better luck with some of the better known, somewhat "vetted" datasets that are cited/used in other people's research.
Project Scope Document
The recommended outline for the project scope document is as follows:
Cover Page with Project Title and Group Members
Project Summary
Project Motivation
Project Goals
Use Cases
Methodology
What are your inputs, outputs and intermediate processing steps?
How will you tackle this problem? What is your proposed solution?
System Design: include block diagrams, subsystems and components
Demonstration Sequences (from the simplest to the most complex)
Final Demonstration
Intermediate Demonstration (a subset of the Final Demo)
Development Milestones
What will be accomplished by the Intermediate Demo and the Final Demo?
Make this schedule fine-grained to guide your own project development process.
Work Partitioning
How will you partition the work among team-members?
Conclusions
References
Include a few relevant pieces of work available in the literature; ideally, these references are cited in the rest of the document.
Submission: Please submit your project scope document by email to [email protected] and [email protected]
Basically, once we have a local costmap generation tool/algorithm/model selected, take the steps necessary to train this model to be capable of detecting objects commonly observed in construction workzones. We will deal with edge cases or objects that are out of that category later.
Basically, in order to work with the costmap received from the first stage, we may need some kind of parsing code that takes the costmap and converts it into whatever internal representation is desired. I.e., maybe you instantiate a new costmap class and include code internally to create a graph of the construction objects as an adjacency list.
Of course, the implementation details are up to you--this is just a high level task to track that step in the process if applicable. Feel free to @me and change this up if the plan changes or you have some other ideas :)
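As one illustration of that costmap-class idea (the grid values, the occupied value of 100, and the link_dist of 3 cells are all hypothetical, since we haven't settled the format), the parsing code could extract occupied cells and build an adjacency list like this:

```python
import math

class Costmap:
    """Minimal internal representation: extract occupied cells from a
    2D grid and link objects whose cells lie within link_dist cells
    of each other. Values and thresholds here are assumptions."""

    def __init__(self, grid, occupied=100, link_dist=3.0):
        # Each occupied cell is treated as one construction object.
        self.objects = [(r, c)
                        for r, row in enumerate(grid)
                        for c, v in enumerate(row)
                        if v == occupied]
        # Adjacency list keyed by object index.
        self.adj = {i: [] for i in range(len(self.objects))}
        for i, a in enumerate(self.objects):
            for j, b in enumerate(self.objects):
                if i < j and math.dist(a, b) <= link_dist:
                    self.adj[i].append(j)
                    self.adj[j].append(i)

grid = [
    [0,   0,   0, 0],
    [100, 0,   0, 0],
    [0,   100, 0, 100],
]
cm = Costmap(grid)
```

A real version would probably cluster blobs of cells into objects first instead of treating every occupied cell as its own node.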
I'll add more to this later, but the short of it is: there are many ways to take sensor inputs, detect objects, figure out where they are in 3D space, and plot those on a 2D occupancy grid (which lots of people call a "costmap"; specifically, a "local" costmap just means an occupancy grid around our vehicle with all of the objects positioned relative to it). I.e., this is a very open-ended task in AV, and there is definitely no single, de-facto approach.
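To make the "plot those on a 2D occupancy grid" step concrete, here is a minimal sketch that rasterizes vehicle-relative detections into a grid. The frame convention (x forward, y left), the grid size, and the 0.5 m resolution are all assumptions, not decided parameters:

```python
def to_local_costmap(detections, size=20, resolution=0.5):
    """Rasterize detections given in vehicle-relative meters (x forward,
    y left) into a size x size occupancy grid centered on the vehicle."""
    grid = [[0] * size for _ in range(size)]
    half = size // 2
    for x, y in detections:
        # Convert meters to cell indices; the vehicle sits at the
        # center cell, with forward pointing toward row 0.
        row = half - int(round(x / resolution))
        col = half - int(round(y / resolution))
        if 0 <= row < size and 0 <= col < size:
            grid[row][col] = 100  # occupied, ROS-style 0-100 scale
    return grid

# A cone 2 m ahead and a barrier 1 m ahead, 1.5 m to the left.
local = to_local_costmap([(2.0, 0.0), (1.0, 1.5)])
```

Whatever off-the-shelf approach we pick (BEVFusion or otherwise) would replace this hand-placement step with real detections.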
However, in thinking about the scope of our project, creating this local costmap around the car isn't really the task that we should be stressing over. Rather, our project is more focused on "given a local costmap of all the objects detected around our car--how do we detect a construction zone and draw a boundary around it?" That is, we shouldn't spend all our energy figuring out how to construct the costmap, but instead focus our energy on that second part: identifying and drawing a boundary around construction zones.
Having said that, I'm thinking it would be good for us to go out and do a small "literature review" on some of the "out of the box" approaches to obtaining a local costmap around our vehicle (which will be situated in Carla, if I'm remembering correctly what he told us in class). This could mean going out and looking for research papers, open source projects, YouTube tutorials, etc.
This task is to track the implementation of whatever approach we want to experiment with first. I.e., if you want to try a clustering algorithm or maybe BFS--or maybe some completely different idea.
I encourage you to create tasks that parallel this one if you are attempting multiple approaches at once--we can just link this to those issues. This task is mainly for gantt chart purposes.
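As a concrete starting point for the BFS idea (assuming a grid where occupied cells are marked 100, which is just a guess at the eventual format), connected occupied cells can be grouped with a flood fill:

```python
from collections import deque

def group_cells(grid, occupied=100):
    """BFS flood fill: return lists of 8-connected occupied cells,
    one list per construction-object grouping."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    groups = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != occupied or (r, c) in seen:
                continue
            # Breadth-first search from this unvisited occupied cell.
            queue, group = deque([(r, c)]), []
            seen.add((r, c))
            while queue:
                cr, cc = queue.popleft()
                group.append((cr, cc))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and grid[nr][nc] == occupied
                                and (nr, nc) not in seen):
                            seen.add((nr, nc))
                            queue.append((nr, nc))
            groups.append(group)
    return groups

demo = [
    [100, 100, 0, 0],
    [0,   0,   0, 100],
    [0,   0,   0, 100],
]
groups = group_cells(demo)
```

Plain 8-connectivity only groups touching cells; for sparse cones we would likely need a dilated grid or the distance-based clustering variant instead.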
This task is essential for work on the two stages of this pipeline to be done in parallel.
Essentially, the first stage of the system is supposed to produce a 2D costmap with detected construction objects scattered around it. Then, the second stage is supposed to take that 2D array (basically a 2D image) and somehow identify groups of construction objects and draw a line around each grouping to define the boundary (which denotes the "non-drivable" area).
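For the "draw a line around each grouping" step, one standard option (not necessarily the one we'll end up with) is a convex hull. This sketch uses Andrew's monotone chain algorithm on hypothetical cone coordinates:

```python
def hull_boundary(points):
    """Andrew's monotone chain convex hull: one way to draw a boundary
    around a grouping of construction objects. Returns the boundary
    vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive if o->a->b turns counter-clockwise.
        return (a[0]-o[0]) * (b[1]-o[1]) - (a[1]-o[1]) * (b[0]-o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Concatenate, dropping the duplicated endpoints.
    return lower[:-1] + upper[:-1]

# A cone cluster with one interior point the boundary should skip.
boundary = hull_boundary([(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)])
```

A convex hull can over-claim drivable area for L-shaped workzones, so a concave hull (alpha shape) might be worth trying later.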
In order to not hold back those working on the second stage, we NEED a "fake" costmap to be made (perhaps by hand in something like photoshop, mspaint, or that weird photo editor available on Ubuntu @CMUBOB97 you know what I'm talking about) so that work can be started on the "grouping" stage algorithm (or whatever approach we end up going with). This 2D grid/costmap will essentially be the interface between the two stages for now, so the sooner we create a mock output for that second stage to work on, the better.
Down below, whenever you get some time to think about it, can you think of any requirements we need to have on these mock costmaps? Also, maybe it'd be good to look at some sample outputs from algorithms like BEVFusion so that our mock costmap is somewhat representative of what we will really see. Hell, even a screenshot of one of their sample outputs could work pretty well.
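A mock costmap doesn't even have to be drawn by hand; a few lines of Python can scatter clusters of occupied cells onto a free grid. Every parameter below (grid size, cluster count, spread, the 0/100 cell values) is a placeholder assumption until we pin down real requirements:

```python
import random

def mock_costmap(size=40, clusters=2, cones_per_cluster=5,
                 spread=3, seed=0):
    """Generate a fake costmap: a size x size grid with a few tight
    clusters of 'construction objects' (100) on free space (0)."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    for _ in range(clusters):
        # Pick a cluster center away from the edges, then scatter
        # cones around it within +/- spread cells.
        cr = rng.randrange(spread, size - spread)
        cc = rng.randrange(spread, size - spread)
        for _ in range(cones_per_cluster):
            r = min(size - 1, max(0, cr + rng.randint(-spread, spread)))
            c = min(size - 1, max(0, cc + rng.randint(-spread, spread)))
            grid[r][c] = 100
    return grid

grid = mock_costmap()
```

Fixing the seed makes the mock deterministic, so the second-stage code can be tested against a stable interface while the first stage is still in progress.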