Projects:Augment marker tracking with visual tracking

From Collective Computational Unit

Latest revision as of 09:08, 21 May 2019

Overview

While the tracking output of the Vicon system is in general very reliable, the trajectories still contain mistakes and missing data. The idea is therefore to use visual tracking as an additional source of information, for example to cope with markers that are missing due to occlusion. Since the Vicon system makes it easy to generate a large amount of ground truth data for object detection, this is also a way to train visual object detectors for future, more "in-the-wild" scenarios.


Contact

  • Mate Nagy, mnagy@orn.mpg.de
  • Hemal Naik, hnaik@orn.mpg.de
  • Bastian Goldluecke, bastian.goldluecke@uni-konstanz.de


Aims

The project essentially has two main parts. The first is to establish a pipeline for generating training data for visual object detection using the Vicon system. The second is to use the trained detectors to augment the tracking, e.g. by filling in gaps or helping to establish identity.
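
One very simple way to picture the second part is to associate projected marker tracks with visual detections in a single frame. The sketch below is only an illustration under stated assumptions, not an approach chosen by the project: the function `associate`, the track names, and the pixel threshold `max_dist` are all hypothetical, and greedy nearest-neighbour matching is just a naive baseline.

```python
import math
from typing import Dict, List, Tuple

Point2D = Tuple[float, float]


def associate(projected: Dict[str, Point2D],
              detections: List[Point2D],
              max_dist: float) -> Dict[str, int]:
    """Greedily pair each projected track with its closest unused detection.

    projected  -- hypothetical mapping from track name to its 2D image position
    detections -- 2D centres of the visual detections in the same frame
    max_dist   -- pairs further apart than this (in pixels) are never matched
    Returns a mapping from track name to the index of the matched detection.
    """
    # All candidate pairs, closest first.
    candidates = sorted(
        (math.dist(position, detection), name, index)
        for name, position in projected.items()
        for index, detection in enumerate(detections)
    )
    assigned: Dict[str, int] = {}
    used = set()
    for distance, name, index in candidates:
        if distance > max_dist:
            break  # every remaining pair is even further apart
        if name in assigned or index in used:
            continue
        assigned[name] = index
        used.add(index)
    return assigned
```

A track that ends up without a match, or a detection that no track claims, is a natural candidate for closer inspection (a possible gap or identity problem). A real solution would also have to use temporal information, which this per-frame sketch ignores.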

As in other projects, both a high-quality offline solution and an online/quasi-real-time solution working from the data stream and video stream would be desirable (the video stream is not directly available for processing, but a real-time view is generated in the software, so this could be grabbed).


Estimated level of difficulty

Generating training data for visual object detection should be a fairly straightforward, standard problem, and is a good way to get to know the project and its data structures within a Bachelor/Master project.

A more involved problem is to integrate the visual detections into the overall tracking pipeline, since this requires finding a suitable new algorithmic framework. It is closely related to the project "Improve tracking of individual markers and marker patterns". Real-time is of course always harder.


Provided data

The project uses data from the Vicon system (see the wiki page "Vicon:Data format documentation") to establish (partially labeled) 3D tracks, as well as input from RGB video cameras. Code for reading the data and calibration, and for mapping 3D points to 2D images, is available (TODO: put on CCU server once git server is up).
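
As a rough illustration of the 3D-to-2D mapping mentioned above, the following sketch projects a world point into an image with a plain pinhole model. It is not the project's existing code: the function name `project_point`, the parameter layout (rotation matrix, translation, focal lengths `fx`/`fy`, principal point `cx`/`cy`) and the absence of lens distortion are all assumptions made for the example.

```python
from typing import Optional, Sequence, Tuple

Point2D = Tuple[float, float]


def project_point(point_world: Sequence[float],
                  rotation: Sequence[Sequence[float]],
                  translation: Sequence[float],
                  fx: float, fy: float,
                  cx: float, cy: float) -> Optional[Point2D]:
    """Project a 3D world point to pixel coordinates (pinhole, no distortion).

    rotation is a 3x3 matrix and translation a 3-vector, so that
    X_cam = rotation * X_world + translation (assumed convention).
    Returns None if the point lies behind or on the camera plane.
    """
    # Transform the point from world to camera coordinates.
    x_cam, y_cam, z_cam = (
        sum(rotation[row][col] * point_world[col] for col in range(3))
        + translation[row]
        for row in range(3)
    )
    if z_cam <= 0:
        return None  # not visible to this camera
    # Perspective division followed by the intrinsic parameters.
    return (fx * x_cam / z_cam + cx, fy * y_cam / z_cam + cy)
```

With an identity rotation, zero translation, `fx = fy = 100` and principal point `(320, 240)`, the point `(1, 2, 4)` lands at pixel `(345.0, 290.0)`. Real cameras usually also need a distortion model, which is left out here.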


Suggested/tested approaches

  • To generate training data and build a first version of the detector:
    • find valid segments of 3D trajectories
    • use existing code to project 3D tracks into 2D images
    • find a suitable bounding box and use the image crop as a training image
    • build a database of these crops and retrain the object detector, see Tutorials.
  • Talk to people working on tracking for ideas on how to integrate visual and marker detections.
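
The training-data steps above could be sketched roughly as follows. This is a minimal illustration, not the project's code: the helper names, the per-frame validity flags, the minimum segment length, the pixel margin, and the image represented as a nested list of rows are all assumptions made to keep the example self-contained.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point2D = Tuple[float, float]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def find_valid_segments(valid: Sequence[bool],
                        min_length: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, in which the track
    is valid for at least min_length consecutive frames."""
    segments = []
    start = None
    for frame, ok in enumerate(valid):
        if ok and start is None:
            start = frame  # a new run of valid frames begins
        elif not ok and start is not None:
            if frame - start >= min_length:
                segments.append((start, frame))
            start = None
    # Close a run that extends to the last frame.
    if start is not None and len(valid) - start >= min_length:
        segments.append((start, len(valid)))
    return segments


def bounding_box(points: Sequence[Point2D], margin: int,
                 width: int, height: int) -> Optional[Box]:
    """Axis-aligned box around projected markers, padded by margin pixels
    and clipped to the image. Returns None if nothing is inside the image."""
    if not points:
        return None
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    left = max(0, math.floor(min(xs)) - margin)
    top = max(0, math.floor(min(ys)) - margin)
    right = min(width, math.ceil(max(xs)) + margin + 1)
    bottom = min(height, math.ceil(max(ys)) + margin + 1)
    if right <= left or bottom <= top:
        return None
    return (left, top, right, bottom)


def crop(image: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    """Cut the box out of an image given as a list of pixel rows."""
    left, top, right, bottom = box
    return [list(row[left:right]) for row in image[top:bottom]]
```

In practice the margin would have to be chosen so that the whole animal, and not only its markers, ends up inside the crop, and the crops would be written to disk together with their labels before retraining the detector.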