Discriminative Techniques For The Recognition Of Complex-Shaped Objects

Owen Carmichael
doctoral dissertation, tech. report CMU-RI-TR-03-34, Robotics Institute, Carnegie Mellon University, September, 2003

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

This thesis presents new techniques that enable the automatic recognition of everyday objects like chairs and ladders in images of highly cluttered scenes. Given an image, we extract information about the shape and texture properties present in small patches of the image and use that information to identify parts of the objects we are interested in. We then assemble those parts into overall hypotheses about what objects are present in the image, and where they are. Solving this problem in a general setting is one of the central challenges in computer vision, as doing so would have an immediate impact on a far-reaching set of applications in medicine, surveillance, manufacturing, robotics, and other areas.

The central theme of this work is that formulating object recognition as a discrimination problem can ease the burden of system design. In particular, we show that thinking of recognition in terms of discriminating between objects and clutter, rather than separately modeling the appearances of objects and clutter, can simplify the processes of extracting information from the image and identifying which parts of the image correspond with parts of objects.

The bulk of this thesis is concerned with recognizing "wiry" objects in highly cluttered images; an example problem is finding ladders in images of a messy warehouse space. Wiry objects are distinguished by a prevalence of very thin, elongated, stick-like components; examples include tables, chairs, bicycles, and desk lamps. They are difficult to recognize because they tend to lack distinctive color or texture characteristics and their appearance is not easy to describe succinctly in terms of rectangular patches of image pixels. Here, we present a set of algorithms that extends current capabilities to find wiry objects in highly cluttered images across changes in the clutter and object pose. Specifically, we present discrimination-centered techniques for extracting shape features from portions of images, classifying those features as belonging to an object of interest or not, and aggregating found object parts together into overall instances of objects. Moreover, we present a suite of experiments on real, wiry objects (a chair, a cart, a ladder, and a stool) that substantiates the utility of these methods and explores their behavior.
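As a rough illustration of the classify-then-aggregate structure described above, the following minimal sketch scores each patch with a linear rule and then groups accepted patches into coarse grid cells. Everything in it is an assumption made for illustration: the function names, the two-element feature vectors, the weights, and the grid-voting step are hypothetical and are not taken from the thesis, whose actual features, classifiers, and aggregation method are not specified in this abstract.

```python
from collections import Counter

def classify_patches(patches, weights, bias):
    """Keep the (x, y) position of every patch whose linear score is positive.

    Hypothetical stand-in for a learned object-vs-clutter classifier.
    """
    kept = []
    for x, y, feats in patches:
        score = sum(w * f for w, f in zip(weights, feats)) + bias
        if score > 0:
            kept.append((x, y))
    return kept

def aggregate(points, cell=50, min_votes=3):
    """Vote accepted patches into grid cells; well-supported cells become hypotheses.

    Returns a sorted list of (cell_x, cell_y, votes) tuples.
    """
    votes = Counter((x // cell, y // cell) for x, y in points)
    return sorted(
        (cx * cell, cy * cell, n)
        for (cx, cy), n in votes.items()
        if n >= min_votes
    )

# Toy input: (x, y, feature vector) per patch. All values are invented.
patches = [
    (10, 12, (0.9, 0.1)),
    (20, 30, (0.8, 0.2)),
    (40, 45, (0.7, 0.1)),
    (200, 210, (0.2, 0.9)),
    (220, 230, (0.9, 0.2)),
    (300, 20, (0.1, 0.8)),
]
kept = classify_patches(patches, weights=(1.0, -1.0), bias=-0.3)
hypotheses = aggregate(kept)
print(kept)
print(hypotheses)
```

In this toy run, one isolated patch is accepted by the classifier but discarded at the aggregation stage because its grid cell lacks supporting votes, which is the kind of clutter rejection an aggregation step is meant to provide.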

The second part of the thesis presents a technique for extracting texture features from images in such a way that features from objects of interest are both well-clustered with each other and well-separated from the features from clutter. We present an optimization framework for automatically combining existing texture features into features that discriminate well, thus simplifying the process of tuning the parameters of the feature extraction process. This approach is substantiated in recognition experiments on real objects in real, cluttered images.
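To make the idea of combining existing features into a more discriminative one concrete, here is a minimal sketch using Fisher's linear discriminant, a standard textbook technique chosen here as a stand-in. It is an assumption for illustration only: the abstract does not describe the thesis's optimization framework, and the two-dimensional feature values below are invented.

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def scatter(vectors, mu):
    """2x2 scatter matrix of 2-D vectors around the mean mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for v in vectors:
        d = [v[0] - mu[0], v[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(obj, clutter):
    """Weights w = Sw^-1 (mu_obj - mu_clutter) for 2-D features.

    Sw is the sum of the two within-class scatter matrices.
    """
    mu_o, mu_c = mean(obj), mean(clutter)
    so, sc = scatter(obj, mu_o), scatter(clutter, mu_c)
    sw = [[so[i][j] + sc[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [mu_o[0] - mu_c[0], mu_o[1] - mu_c[1]]
    # Apply the closed-form inverse of a 2x2 matrix to diff.
    return [
        (sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
        (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det,
    ]

def project(w, vectors):
    """Combined feature: dot product of each vector with the weights w."""
    return [w[0] * v[0] + w[1] * v[1] for v in vectors]

# Invented responses of two existing texture features on object and clutter patches.
obj = [(2.0, 1.0), (3.0, 1.5), (2.5, 0.5), (3.5, 1.0)]
clutter = [(0.5, 1.0), (1.0, 1.5), (0.0, 0.5), (1.5, 1.0)]

w = fisher_direction(obj, clutter)
obj_proj = project(w, obj)
clutter_proj = project(w, clutter)
print(w)
print(min(obj_proj), max(clutter_proj))
```

The combined feature is a weighted sum of the two inputs, with weights chosen so that the class means are far apart relative to the within-class spread. On this toy data every object projection lies above every clutter projection.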

Associated Center(s) / Consortia: Vision and Autonomous Systems Center
Number of pages: 182

Text Reference
Owen Carmichael, "Discriminative Techniques For The Recognition Of Complex-Shaped Objects," doctoral dissertation, tech. report CMU-RI-TR-03-34, Robotics Institute, Carnegie Mellon University, September, 2003

BibTeX Reference
@phdthesis{carmichael2003discriminative,
   author = "Owen Carmichael",
   title = "Discriminative Techniques For The Recognition Of Complex-Shaped Objects",
   school = "Robotics Institute, Carnegie Mellon University",
   month = "September",
   year = "2003",
   number = "CMU-RI-TR-03-34",
   address = "Pittsburgh, PA"
}