Evaluating Classification Performance with only Positive and Unlabeled Samples

January 1, 2016

Journal Paper

Siamak Hajizadeh

ISSN 0302-9743
DOI 10.1007/978-3-662-44415-3_24




Keywords: performance evaluation, rail defect detection, statistical methods

Link or Download
Not available


Testing binary classifiers usually requires a test set with labeled positive and negative examples. In many real-world applications, however, only some positive objects are manually labeled, and negative objects are not labeled explicitly. For instance, when detecting defects in a large collection of objects, the most obvious defects are usually found with ease, while normal-looking objects may simply be ignored. The resulting datasets consist of only positive and unlabeled samples. Here we propose a measure that estimates the performance of a classifier on test sets lacking labeled negative examples. Experiments show how several criteria affect the accuracy of our estimate, including the assumption of “random sampling of the labeled positives”. We apply the measure to the classification of real-world defect detection data for which no validation sets are available.
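
The abstract does not state the proposed measure, so the following is only an illustrative sketch of why the “random sampling of the labeled positives” assumption matters, not the paper's method. If the labeled positives are a random sample of all positives, the fraction of labeled positives that a classifier flags is an estimate of its recall, even though no negatives are labeled. The function names, the simulated data, and all parameter values (`prior`, `label_rate`, `tpr`, `fpr`) are hypothetical and chosen for illustration.

```python
import random


def estimate_recall(predictions, labeled_positive):
    """Fraction of labeled positives that the classifier predicts positive.

    Assumption (not taken from the paper): labeled positives are a random
    sample of all true positives.
    """
    hits = [bool(p) for p, s in zip(predictions, labeled_positive) if s]
    if not hits:
        raise ValueError("no labeled positives in the test set")
    return sum(hits) / len(hits)


def simulate(n=20000, prior=0.1, label_rate=0.3, tpr=0.8, fpr=0.05, seed=0):
    """Build a synthetic positive-unlabeled test set and compare the true
    recall (using hidden labels) with the estimate from labeled positives."""
    rng = random.Random(seed)
    truth, preds, labeled = [], [], []
    for _ in range(n):
        y = rng.random() < prior                    # hidden true class
        pred = rng.random() < (tpr if y else fpr)   # simulated classifier output
        s = y and rng.random() < label_rate         # only some positives get labeled
        truth.append(y)
        preds.append(pred)
        labeled.append(s)
    true_recall = sum(p for p, y in zip(preds, truth) if y) / sum(truth)
    return true_recall, estimate_recall(preds, labeled)


if __name__ == "__main__":
    true_recall, estimated = simulate()
    print(f"true recall: {true_recall:.3f}, estimated from labeled positives: {estimated:.3f}")
```

If labeling instead favors the most obvious defects, as the abstract describes, the labeled positives are no longer a random sample and an estimate of this kind can be biased, which is presumably why the effect of that assumption is studied experimentally.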