Scene Understanding: Extracting surface normals from depth maps

For our project, we want to identify geometric classes in the scene in addition to object labels. We can use the depth maps from the NYU dataset to train a classifier for geometric classes based on surface normals, which we can extract from the depth map. We will closely follow the methods used by Hoiem et al. in the paper "Geometric Context from a Single Image".

The following panels show a sample image of a room from the NYU dataset, together with its depth map and object labels.

(From left to right).

1) RGB image

2) Depth map overlaid with the vector field of its gradients, which indicates the 3D orientation of the surface

3) Gradient magnitude divided by the squared depth value, after histogram equalization

4) Object labels from dataset

5) Superpixel segmentation

Close-up of the depth map with gradients. We hope to extract the surface normals of the different surfaces in the scene from the 2D gradient of the depth map. The direction of the gradient should indicate the X and Y components of the surface normal, while the magnitude should give us some indication of the Z component of the normal.
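As a rough illustration of that idea, here is a minimal sketch in Python with NumPy. It treats the depth map as a height field z(x, y) and takes (-dz/dx, -dz/dy, 1) as the unnormalized normal, so the gradient direction sets the X and Y components and a larger gradient magnitude shrinks the Z component. The function name normals_from_depth_gradient is hypothetical, and the sketch ignores the camera intrinsics and perspective projection, so it is an approximation rather than the method we will necessarily end up using.

```python
import numpy as np

def normals_from_depth_gradient(depth):
    """Approximate per-pixel unit normals from a 2D depth array.

    Assumption: the depth map is treated as a height field z(x, y);
    camera intrinsics and perspective effects are ignored.
    """
    depth = np.asarray(depth, dtype=float)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    dz_dy, dz_dx = np.gradient(depth)
    # Unnormalized normal of a height field: (-dz/dx, -dz/dy, 1).
    normals = np.dstack((-dz_dx, -dz_dy, np.ones_like(depth)))
    # Scale every normal to unit length.
    normals /= np.linalg.norm(normals, axis=2, keepdims=True)
    return normals

# Example: a ramp whose depth grows by 1 per pixel along x.
ramp = np.tile(np.arange(5, dtype=float), (4, 1))
ramp_normals = normals_from_depth_gradient(ramp)
```

With this convention the Z component equals 1 / sqrt(1 + |gradient|^2), so a flat, fronto-parallel surface gives a normal of (0, 0, 1).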

(Left to right)

1) Depth map image

2) Raw gradient magnitude

3) Gradient magnitude after applying histogram equalization. For surfaces that recede into the scene, the gradient magnitude increases with depth. This is a problem, because the surface normal of a flat surface should be the same everywhere on it (otherwise the surface appears curved).

4) After dividing the gradient magnitude by the squared depth value, the gradient magnitude remains more consistent across flat surfaces receding into the scene
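The steps in panels 2 to 4 can be sketched roughly as follows. Assuming a pinhole camera and metric depth, the inverse depth of a planar surface is linear in pixel coordinates, so the depth gradient on a plane grows in proportion to the depth squared, which is why dividing by the squared depth flattens it. The function names, the eps guard, and the NumPy-only histogram equalization are assumptions made for this sketch, not the exact code behind the figures.

```python
import numpy as np

def gradient_magnitude(depth):
    """Magnitude of the 2D depth gradient (central differences)."""
    dz_dy, dz_dx = np.gradient(np.asarray(depth, dtype=float))
    return np.hypot(dz_dx, dz_dy)

def depth_normalized_magnitude(depth, eps=1e-6):
    """Gradient magnitude divided by the squared depth value.

    eps is a hypothetical guard against division by zero at invalid pixels.
    """
    depth = np.asarray(depth, dtype=float)
    return gradient_magnitude(depth) / np.maximum(depth, eps) ** 2

def equalize_histogram(values, bins=256):
    """Map values to [0, 1] through their empirical cumulative distribution."""
    values = np.asarray(values, dtype=float)
    hist, edges = np.histogram(values.ravel(), bins=bins)
    cdf = np.cumsum(hist).astype(float)
    cdf /= cdf[-1]
    centers = 0.5 * (edges[:-1] + edges[1:])
    return np.interp(values.ravel(), centers, cdf).reshape(values.shape)

# Synthetic receding plane: inverse depth is linear in the pixel column.
columns = np.arange(50, dtype=float)
plane_depth = np.tile(1.0 / (0.2 + 0.01 * columns), (10, 1))

raw = gradient_magnitude(plane_depth)                 # varies strongly with depth
normalized = depth_normalized_magnitude(plane_depth)  # nearly constant on the plane
display = equalize_histogram(raw)                     # only for visualization
```

On this synthetic plane the raw magnitude changes by roughly a factor of ten across the image, while the normalized magnitude stays nearly constant away from the image border.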

Here is a more problematic image.

For this image, dividing by the depth squared did not make the gradient magnitude consistent across the flat surface of the hallway.

Another problem with using the depth map gradient to determine the surface normal is that object edges cause spikes in the gradient magnitude that do not necessarily represent surfaces (for example, the object on the wall near the bottom of the image).
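One possible way to deal with such spikes, which we have not evaluated, is to flag pixels next to a large relative depth jump and exclude them when estimating normals. The sketch below is only an assumption about how that could look: depth_edge_mask and the rel_jump threshold of 5% are hypothetical, and a steeply receding surface could also trigger the mask.

```python
import numpy as np

def depth_edge_mask(depth, rel_jump=0.05):
    """Flag pixels adjacent to a large relative depth jump.

    rel_jump is a hypothetical threshold: a jump counts as an edge when the
    depth difference between neighbours exceeds this fraction of the nearer depth.
    """
    depth = np.asarray(depth, dtype=float)
    mask = np.zeros(depth.shape, dtype=bool)

    # Jumps between horizontally adjacent pixels.
    nearer_x = np.minimum(depth[:, 1:], depth[:, :-1])
    jump_x = np.abs(np.diff(depth, axis=1)) > rel_jump * nearer_x
    mask[:, 1:] |= jump_x
    mask[:, :-1] |= jump_x

    # Jumps between vertically adjacent pixels.
    nearer_y = np.minimum(depth[1:, :], depth[:-1, :])
    jump_y = np.abs(np.diff(depth, axis=0)) > rel_jump * nearer_y
    mask[1:, :] |= jump_y
    mask[:-1, :] |= jump_y

    return mask

# Example: a near surface (depth 1) next to a far surface (depth 2).
step_depth = np.hstack((np.full((4, 3), 1.0), np.full((4, 3), 2.0)))
step_edges = depth_edge_mask(step_depth)
```

Pixels where the mask is True could then be ignored when averaging gradients over a superpixel.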

We will most likely use some form of AdaBoost to develop our classifier.


Monday, May 14, 2012
