Description: |
Convolutional neural networks have been applied to semantic image segmentation and have surpassed previous methods. Yet even state-of-the-art systems fail on many portions of modern segmentation datasets. We observe that these failures are not random but, in most cases, systematic and partially predictable. In particular, the confusion pattern of a segmentation model is mostly stable. We propose compact descriptors of classifier behavior and of visual scene type. These descriptors can be used in a Bayesian framework to reason about the reliability of the predictions returned by a semantic segmentation model and, provided that images can be characterized at the scene level, to correct mistakes in those predictions. Using a competitive semantic segmentation model and several challenging datasets, we demonstrate that the upper bound of this approach represents a large improvement in accuracy. The future work we describe has the potential to yield flexible and broad-ranging improvements to deep scene understanding and similar classification problems. |