Keywords: performance evaluation, rail defect detection, statistical methods
Testing binary classifiers usually requires a test set with labeled positive and negative examples. In many real-world applications, however, some positive objects are manually labeled while negative objects are never labeled explicitly. For instance, when detecting defects in a large collection of objects, the most obvious defects are normally found with ease, while normal-looking objects may simply be ignored. In such situations, datasets consist of only positive and unlabeled samples. Here we propose a measure to estimate the performance of a classifier on test sets that lack labeled negative examples. Experiments show how several criteria affect the accuracy of our estimate, including the assumption of “random sampling of the labeled positives”. We then apply the measure to the classification of real-world defect detection data for which no validation sets are available.
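The abstract does not specify the proposed measure, but the general idea of scoring a classifier from only positive and unlabeled test data can be illustrated with a well-known proxy criterion from the positive-unlabeled (PU) learning literature, recall² / Pr(ŷ = 1), which under the "random sampling of the labeled positives" assumption is proportional to precision × recall and therefore can rank classifiers without any labeled negatives. The sketch below is illustrative only; `pu_score` and its arguments are hypothetical names, not the paper's actual measure.

```python
import numpy as np

def pu_score(y_pred, labeled_pos):
    """Proxy performance criterion recall**2 / Pr(y_hat = 1),
    computable from positive-unlabeled test data.

    y_pred      : binary predictions over the whole test set
    labeled_pos : boolean mask marking the labeled-positive examples

    Under the assumption that labeled positives are a random sample
    of all positives, recall measured on the labeled positives
    estimates recall on all positives, so this quantity is
    proportional to precision * recall and can be used to compare
    classifiers without labeled negatives.
    """
    y_pred = np.asarray(y_pred, dtype=bool)
    labeled_pos = np.asarray(labeled_pos, dtype=bool)
    recall = y_pred[labeled_pos].mean()   # P(y_hat = 1 | labeled positive)
    pred_pos_rate = y_pred.mean()         # P(y_hat = 1) over the whole test set
    if pred_pos_rate == 0:
        return 0.0
    return recall ** 2 / pred_pos_rate
```

For example, a classifier that recovers both labeled positives while flagging half of the test set gets recall = 1.0 and Pr(ŷ = 1) = 0.5, so the score is 2.0; a classifier with the same recall that flags fewer examples scores higher, rewarding precision as intended.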