For our project, we want to identify geometric classes in the scene in addition to object labels. We can use the depth maps from the NYU dataset to train a classifier for the geometric classes of objects based on their surface normals, which we can extract from the depth map. We will closely follow the methods used by Hoiem et al. in the paper "Geometric Context from a Single Image".
The following figures are from a sample image of a room in the NYU dataset, which comes with a depth map and object labels.
1) RGB image
2) Depth map with the vector field of gradients, which gives us the 3D orientation of the surface
3) Magnitude of the gradient divided by the squared depth value, after histogram equalization
4) Object labels from dataset
5) Superpixel segmentation
Close-up of the depth map with gradients. We are hoping to extract the surface normals of the different surfaces in the scene from the 2D gradient of the depth map. The direction of the gradient should indicate the X and Y components of the surface normal, while the magnitude should give us some indication of the Z component of the normal.
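To make the relationship between the gradient and the normal concrete, here is a minimal sketch of one common way to turn a depth gradient into a unit normal. It is an illustration under stated assumptions, not a final method: it treats the depth map as a height field z = depth[y, x] with unit pixel spacing and ignores the camera intrinsics. The function name `normals_from_depth` is hypothetical.

```python
import numpy as np

def normals_from_depth(depth):
    """Estimate per-pixel unit surface normals from a 2D depth map.

    Assumption: the surface is the height field z = depth[y, x] with a
    pixel spacing of 1 in x and y (camera intrinsics are ignored).
    """
    depth = np.asarray(depth, dtype=float)
    # np.gradient returns the derivative along axis 0 (rows, y) first,
    # then along axis 1 (columns, x).
    dz_dy, dz_dx = np.gradient(depth)
    # An unnormalized normal of the height field z = f(x, y) is
    # (-df/dx, -df/dy, 1).
    normals = np.dstack((-dz_dx, -dz_dy, np.ones_like(depth)))
    # Scale every vector to unit length.
    normals /= np.linalg.norm(normals, axis=2, keepdims=True)
    return normals

# A flat plane tilted along x: depth grows by 0.5 per pixel column.
ys, xs = np.mgrid[0:4, 0:5]
plane = 2.0 + 0.5 * xs
n = normals_from_depth(plane)
print(n[2, 2])  # the same normal is expected at every pixel of the plane
```

In this formulation the X and Y components are the negated gradient components, and the Z component is 1 / sqrt(1 + |gradient|^2), so a larger gradient magnitude means a smaller Z component.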
(Left to right)
1) Depth map image
2) Raw gradient magnitude
3) Gradient magnitude after applying histogram equalization. For surfaces that recede into the scene, the gradient magnitude increases with depth. This is a problem, because for a flat surface the surface normal should remain consistent throughout (otherwise it looks like the surface is curved).
4) After dividing the gradient magnitude by the squared depth value, the gradient magnitude remains more consistent across flat surfaces receding into the scene.
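The two processing steps in the list above can be sketched as follows. One way to see why dividing by the squared depth helps: under a pinhole camera model, the inverse depth of a planar surface is linear in the pixel coordinates, so the depth gradient of a plane grows in proportion to the depth squared. The sketch assumes exactly that model and uses a synthetic receding wall. The helper names `depth_normalized_gradient` and `equalize_histogram` are hypothetical, and the equalization is a plain NumPy approximation rather than any particular toolbox routine.

```python
import numpy as np

def depth_normalized_gradient(depth):
    """Return (raw gradient magnitude, magnitude divided by depth squared)."""
    depth = np.asarray(depth, dtype=float)
    dz_dy, dz_dx = np.gradient(depth)
    magnitude = np.hypot(dz_dx, dz_dy)
    return magnitude, magnitude / depth**2

def equalize_histogram(values, bins=256):
    """Map values to [0, 1] so that their histogram becomes roughly flat."""
    values = np.asarray(values, dtype=float)
    hist, edges = np.histogram(values.ravel(), bins=bins)
    # Cumulative fraction of pixels at or below each bin's right edge.
    cdf = np.cumsum(hist) / values.size
    # Look up each value's cumulative fraction (interpolating between bins).
    return np.interp(values, edges[1:], cdf)

# Synthetic flat wall receding into the scene: inverse depth is linear
# in the column index, as it would be for a plane under a pinhole camera.
cols = np.arange(100)
inv_depth = 0.2 + 0.001 * cols
wall = np.tile(1.0 / inv_depth, (20, 1))

raw, normalized = depth_normalized_gradient(wall)
equalized = equalize_histogram(raw)

print("raw magnitude range:       ", raw.min(), raw.max())
print("normalized magnitude range:", normalized.min(), normalized.max())
```

On this synthetic wall the raw magnitude varies by more than a factor of two between the near and far ends, while the normalized magnitude stays close to the slope of the inverse depth (0.001 here). Real depth maps are noisier, so the result will be less clean.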
Here is a more problematic image:
For this image, dividing by the depth squared did not make the gradient magnitude consistent across the flat surface of the hallway.
Another problem with using the depth map gradient to determine the surface normal is that object edges cause spikes in the gradient magnitude, and these spikes don't necessarily represent surfaces (like the object on the wall near the bottom of the image).
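The edge problem is easy to reproduce on a toy scene. The sketch below builds a flat wall with a flat box in front of it, so every true surface faces the camera and the gradient should be zero everywhere. It then flags pixels next to a depth jump. The mask is only one possible way to mark such pixels, not something we have settled on, and the function name `depth_edge_mask` and the threshold `rel_jump` are hypothetical.

```python
import numpy as np

def depth_edge_mask(depth, rel_jump=0.05):
    """Flag pixels on either side of a depth jump.

    A jump is a difference between 4-neighbours that exceeds rel_jump
    times the smaller of the two depths (an assumed, tunable threshold).
    """
    depth = np.asarray(depth, dtype=float)
    mask = np.zeros(depth.shape, dtype=bool)
    # Jumps between horizontally adjacent pixels.
    nearer_x = np.minimum(depth[:, 1:], depth[:, :-1])
    jump_x = np.abs(np.diff(depth, axis=1)) > rel_jump * nearer_x
    mask[:, 1:] |= jump_x
    mask[:, :-1] |= jump_x
    # Jumps between vertically adjacent pixels.
    nearer_y = np.minimum(depth[1:, :], depth[:-1, :])
    jump_y = np.abs(np.diff(depth, axis=0)) > rel_jump * nearer_y
    mask[1:, :] |= jump_y
    mask[:-1, :] |= jump_y
    return mask

# A wall at depth 3.0 with a box at depth 2.0 in front of it.
scene = np.full((8, 8), 3.0)
scene[2:6, 2:6] = 2.0

dz_dy, dz_dx = np.gradient(scene)
magnitude = np.hypot(dz_dx, dz_dy)
edges = depth_edge_mask(scene)

# Every nonzero gradient in this scene comes from the box outline,
# not from a tilted surface.
print(np.array_equal(edges, magnitude > 0))
```

Pixels flagged this way could be excluded before estimating normals, at the cost of losing information near object boundaries.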
We are most likely going to use some form of AdaBoost to develop our classifier.
Hey, hi,
I am working on adaptive projection with a depth camera. It seems you have done great work in the area of depth cameras. I would like to know the method used for raw gradient magnitude detection.