Addressing Ambiguity in Object Instance Detection

Edward Hsiao
doctoral dissertation, tech. report CMU-RI-TR-13-16, Robotics Institute, Carnegie Mellon University, June, 2013

  • Adobe portable document format (pdf) (54MB)
Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

In this thesis, we study ambiguity in detecting object instances in scenes with severe clutter and occlusions. Our work focuses on three key areas: (1) objects that have ambiguous features, (2) objects where discriminative point-based features cannot be reliably extracted, and (3) occlusions.
Current approaches for object instance detection rely heavily on matching discriminative point-based features such as SIFT. While one-to-one correspondences between an image and an object can often be generated, these correspondences cannot be obtained when objects have ambiguous features due to similar and repeated patterns. We present the Discriminative Hierarchical Matching (DHM) method, which uses vector quantization to preserve feature ambiguity at the matching stage until hypothesis testing. We demonstrate that combining our quantization framework with Simulated Affine features can significantly improve the performance of 3D point-based recognition systems.
While discriminative point-based features work well for many objects, they cannot be stably extracted on smooth objects that have large uniform regions. To represent these feature-poor objects, we first present Gradient Networks, a framework for robust shape matching without extracting edges. Our approach incorporates connectivity directly on low-level gradients and significantly outperforms approaches that use only local information or coarse gradient statistics. Next, we present the Boundary and Region Template (BaRT) framework, which combines an explicit boundary representation with the interior appearance of the object. We show that the lack of texture in the object interior is actually informative and that an explicit representation of the boundary performs better than a coarse representation.
While many approaches work well when objects are entirely visible, their performance decreases rapidly with occlusions. We introduce two methods for increasing the robustness of object detection in these challenging scenarios. First, we present a framework for capturing the occlusion structure under arbitrary object viewpoint by modeling the Occlusion Conditional Likelihood that a point on the object is visible given the visibility labelings of all other points. Second, we propose a method to predict the occluding region and score a probabilistic matching pattern by searching for a set of valid occluders. We demonstrate a significant increase in detection performance under severe occlusions.


Text Reference
Edward Hsiao, "Addressing Ambiguity in Object Instance Detection," doctoral dissertation, tech. report CMU-RI-TR-13-16, Robotics Institute, Carnegie Mellon University, June, 2013

BibTeX Reference
   author = "Edward Hsiao",
   title = "Addressing Ambiguity in Object Instance Detection",
   booktitle = "",
   school = "Robotics Institute, Carnegie Mellon University",
   month = "June",
   year = "2013",
   number = "CMU-RI-TR-13-16",
   address = "Pittsburgh, PA",