Hi, my name is Carolina Matos and I'm finishing my master's degree in Mechanical Engineering at Aveiro University. This blog will be a kind of weekly diary for my master's thesis project: "Large base-line stereo rig for pedestrian and target detection". The project is supervised by Professor Vitor Santos and co-supervised by Professor António Neves, with the collaboration of PhD student Jorge Almeida. As you can see in the left sidebar, the blog is divided into several topics. First comes a brief presentation of the project as I understand it: STEREOVISIONLAB. The next topic presents the WEEKLY OBJECTIVES; these can be proposed either by one of the "conductors" of my project or simply by me. Then, in PROJECT REPORTS, the results, problems or doubts related to the weekly objectives will be explained. Because this project is about 3D perception with cameras, there will be some really interesting results to show, along with other pictures related to the project, in PHOTOGRAPHIC REPORTS. The CALENDAR OF EVENTS section will list the main mini-tasks and other major events related to my master's thesis project. Finally, QUESTIONS & SUGGESTIONS is an open topic where anyone can ask or suggest anything related to my Stereo Vision Lab. Enjoy and thank you for your visit!


Problem Statement

Perception in 3D is now an irreversible trend, and suitable sensors are being developed and are becoming common. Besides multimedia and entertainment environments, 3D sensor applications also cover advanced perception for more complete representations of the world, including safety systems and surveillance. 3D perception can be achieved by two fundamental kinds of technique: active and passive. Active sensors include laser range finders (LIDAR) and structured light devices (such as Microsoft Kinect and similar); passive sensors are mainly stereo cameras. The main difference between the two categories is that active sensors project special light (visible or invisible, such as infrared) onto the environment, either to measure distances directly or to create geometric patterns of light that deform over the shapes in the scene so they can be captured by ordinary monocular cameras, whereas passive sensors rely solely on the light sources already present in the scene. Active sensors may interfere with humans or with other sensors, while passive sensors are at the other extreme, that is, fully innocuous! Stereo cameras are becoming popular because their price has been dropping and their quality has been increasing. Stereo relies on disparity: two cameras capture the same scene from different points of view, and appropriate software calculates the disparity and, from it, the range. A property that characterizes stereo systems is the relative position of the two cameras, mainly their separation, called the base-line. Common stereo cameras have short base-lines (5-15 cm) for the sake of portability, but that limits the maximum distance at which range can be discriminated to a few meters at most. Applications in Advanced Driver Assistance Systems (ADAS) require longer distances, for example to detect vehicles or other targets in 3D.
Large base-lines (1 meter or more) not only provide range discrimination at longer distances, but also yield better 3D models of pedestrians, making it possible to capture their posture, which increases the chances of detecting their body language and, hopefully, their intention to cross the road or perform similar actions. Large base-line camera rigs have to be custom-made for each application, and ADAS is a context of very high relevance that justifies a dedicated effort to develop such a rig and install it on cars for important studies on safety and driver assistance, including autonomous driving.
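To make the link between base-line and range discrimination concrete, here is a minimal sketch in Python. It assumes an ideal, rectified pair of pinhole cameras, for which range is Z = f * B / d (f is the focal length in pixels, B the base-line in meters, d the disparity in pixels), so a small disparity error maps to a range error of roughly Z^2 / (f * B) times that error. The function names, the 1000-pixel focal length and the 1-pixel disparity error are illustrative assumptions, not values from the project.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Range (m) of a point seen by an ideal rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point at finite range")
    return focal_px * baseline_m / disparity_px


def depth_uncertainty(focal_px, baseline_m, depth_m, disparity_error_px=1.0):
    """First-order range error (m) caused by a disparity error: Z^2 / (f * B) * delta_d."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


def main():
    focal_px = 1000.0  # assumed focal length in pixels (hypothetical value)
    for baseline_m in (0.10, 1.0):  # a short base-line versus a large one
        for depth_m in (5.0, 20.0, 50.0):
            error_m = depth_uncertainty(focal_px, baseline_m, depth_m)
            print(f"B = {baseline_m:4.2f} m, Z = {depth_m:4.1f} m -> "
                  f"range error of about {error_m:6.2f} m per pixel of disparity error")


if __name__ == "__main__":
    main()
```

With these assumed numbers, a 1-pixel disparity error at 20 m corresponds to roughly 4 m of range error for a 10 cm base-line and about 0.4 m for a 1 m base-line, which is the effect described above. A real rig would also have to account for calibration and matching errors.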

Main Objectives

The main objectives of this proposal are:

  1. Development of a structure to mechanically support and assemble, with fine-tuning possibilities, two high-speed Gigabit Ethernet cameras with a dedicated dual-input board. The system is to be mounted on a car, such as ATLASCAR.
  2. Development of software in the ROS environment to obtain disparity and point clouds of scenes at high frame rates, in the ADAS context and similar applications.
  3. Development of a software application implementing one or more algorithms from the literature to extract features from pedestrians or other relevant targets.

Main Tasks

  1. State-of-the-art review of existing techniques for obtaining stereo images and identification of the available systems.
  2. Conceptual study of the global system including the hardware and software units.
  3. Design and implementation of a camera mounting and adjustment system to be installed on a vehicle.
  4. Installation of an image acquisition unit and low-level software.
  5. Camera calibration system.
  6. Implementation of software to obtain the disparity image.
  7. Development of a demo application to show the performance of the system applied to Advanced Driver Assistance Systems (ADAS).
  8. Writing of all the documentation and the thesis.

(from my thesis proposal, submitted to the Mechanical Engineering Department of Aveiro University)