Testing DNN Image Classifiers for Confusion & Bias Errors

Yuchi Tian; Ziyuan Zhong; Vicente Ordonez; Gail Kaiser; Baishakhi Ray

doi:10.1145/3377811.3380400

publication

International Conference on Software Engineering. ICSE 2020. October 2020.

Lab News Desk

News Release Summary

This section is intentionally written in a reporter-style news release voice for general readers.

Researchers from Columbia University and the University of Virginia have developed a testing tool called DeepInspect that automatically hunts for systematic errors in the deep neural networks used to classify images, the kind of software behind everything from Google Photos to medical diagnostic systems. The team was motivated by a class of failures that go beyond one-off mistakes: cases where a model consistently muddles an entire category of images with another, or treats two groups of people unequally. They call these problems "confusion" and "bias" errors.

Rather than examining individual images the way most existing testing methods do, DeepInspect tracks which neurons inside a network fire when the model processes each class of images, then builds a statistical profile of those activation patterns per class. If two classes activate suspiciously similar sets of neurons, the tool flags them as likely confused; if the model keeps different distances between, say, "man" and "surfboard" versus "woman" and "surfboard," it flags that asymmetry as a potential bias.

Tested across eight neural network models and six well-known datasets, including ImageNet, COCO, and CIFAR, the tool found hundreds of real classification mistakes, detecting confusion errors with precision as high as 100 percent and bias errors with precision up to 84 percent when focusing on its top-ranked findings. Notably, it uncovered these class-level flaws even in models specifically designed to be robust against adversarial attacks, suggesting the two types of problems are largely independent.

The work matters because class-level bugs, unlike isolated mispredictions, represent structural weaknesses that affect entire groups of users or objects, the sort of flaw that led to Google's infamous 2015 incident tagging photos of Black people as gorillas, and existing testing frameworks largely miss them.
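For readers who want to see the idea in miniature, the per-class activation profiling described in this summary can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not DeepInspect's actual implementation: the function names, the 0.5 activation threshold, the Euclidean distance, the normalized-difference bias score, and the made-up activation values are all hypothetical choices made for this example.

```python
from itertools import combinations
from math import sqrt

def activation_profiles(activations, labels, threshold=0.5):
    """For each class, the fraction of its inputs on which each neuron fires.

    A neuron "fires" when its output exceeds `threshold` (an assumed cutoff).
    """
    counts, totals = {}, {}
    for acts, label in zip(activations, labels):
        row = counts.setdefault(label, [0] * len(acts))
        for i, a in enumerate(acts):
            if a > threshold:
                row[i] += 1
        totals[label] = totals.get(label, 0) + 1
    return {c: [n / totals[c] for n in row] for c, row in counts.items()}

def distance(p, q):
    """Euclidean distance between two class profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def rank_confusion(profiles):
    """Class pairs sorted by profile distance; the closest pairs come first."""
    pairs = combinations(sorted(profiles), 2)
    return sorted((distance(profiles[a], profiles[b]), a, b) for a, b in pairs)

def bias_score(profiles, a, b, c):
    """How unevenly class c sits relative to a and b (0 means perfectly even)."""
    da = distance(profiles[c], profiles[a])
    db = distance(profiles[c], profiles[b])
    return abs(da - db) / (da + db) if da + db else 0.0

# Made-up outputs of four neurons for six images (two per class).
activations = [
    [0.9, 0.8, 0.1, 0.0], [0.8, 0.9, 0.2, 0.1],  # cat
    [0.9, 0.7, 0.6, 0.0], [0.7, 0.9, 0.1, 0.1],  # dog
    [0.1, 0.0, 0.9, 0.8], [0.0, 0.2, 0.8, 0.9],  # truck
]
labels = ["cat", "cat", "dog", "dog", "truck", "truck"]

profiles = activation_profiles(activations, labels)
ranked = rank_confusion(profiles)
for dist, a, b in ranked:
    print(f"{a} vs {b}: distance {dist:.3f}")
print("bias of truck between cat and dog:",
      round(bias_score(profiles, "cat", "dog", "truck"), 3))
```

In this toy data, "cat" and "dog" fire nearly the same neurons, so they land at the top of the confusion ranking, while "truck" sits at a similar distance from both and gets a bias score close to zero. A real tool would read activations from a trained network over a full dataset rather than from a hand-written list.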

abstract

Image classifiers are an important component of today's software, from consumer and business applications to safety-critical domains. The advent of Deep Neural Networks (DNNs) is the key catalyst behind such widespread success. However, wide adoption comes with serious concerns about the robustness of software systems dependent on DNNs for image classification, as several severe erroneous behaviors have been reported under sensitive and critical circumstances. We argue that developers need to rigorously test their software's image classifiers and delay deployment until acceptable. We present an approach to testing image classifier robustness based on class property violations. We found that many of the reported erroneous cases in popular DNN image classifiers occur because the trained models confuse one class with another or show biases towards some classes over others. These bugs usually violate some class properties of one or more of those classes. Most DNN testing techniques focus on per-image violations, and so fail to detect class-level confusions or biases. We developed a testing technique to automatically detect class-based confusion and bias errors in DNN-driven image classification software. We evaluated our implementation, DeepInspect, on several popular image classifiers with precision up to 100% (avg. 72.6%) for confusion errors, and up to 84.3% (avg. 66.8%) for bias errors. DeepInspect found hundreds of classification mistakes in widely-used models, many exposing errors indicating confusion or bias.

details

DOI: 10.1145/3377811.3380400

citation