Paper details
Number 4 - December 2018
Volume 28 - 2018
The feature selection problem in computer-assisted cytology
Marek Kowal, Marcin Skobel, Norbert Nowicki
Abstract
Modern cancer diagnostics is based heavily on cytological examinations. Unfortunately, visual inspection of cytological
preparations under the microscope is a tedious and time-consuming process. Moreover, intra- and inter-observer variations
in cytological diagnosis are substantial. Cytological diagnostics can be facilitated and objectified by using automatic image
analysis and machine learning methods. Computerized systems usually preprocess cytological images, segment and detect
nuclei, extract and select features, and finally classify the sample. Although many computerized methods and systems
have already been proposed for cytology, they are still not used routinely because their accuracy needs to improve.
This contribution focuses on computerized breast cancer classification. The task at
hand is to classify cellular samples coming from fine-needle biopsy as either benign or malignant. For this purpose, we
compare 5 methods of nuclei segmentation and detection, 4 methods of feature selection and 4 methods of classification.
Nuclei detection and segmentation methods are compared with respect to recall and the F1 score based on the Jaccard
index. Feature selection and classification methods are compared with respect to classification accuracy. Nevertheless, the
main contribution of our study is to determine which features of nuclei reliably indicate the type of cancer. We also check
whether the quality of nuclei segmentation/detection significantly affects the accuracy of cancer classification. Evaluation
on the test set shows that the average accuracy of cancer classification is around 76%. Spearman’s correlation and the chi-square
test allow us to identify significantly better features than the forward feature selection method.
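
As an illustration of filter-style feature selection with Spearman’s correlation, the following minimal Python sketch ranks features by the absolute rank correlation between each feature and the class label. It is not the authors' implementation: the feature names (`area`, `circularity`, `noise`), the toy values, the 0 = benign / 1 = malignant label encoding and the `top_k` cut-off are all assumptions made for illustration, and the chi-square variant is not shown.

```python
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rank_features(columns: Dict[str, Sequence[float]],
                  labels: Sequence[int],
                  top_k: int) -> List[str]:
    """Return the top_k feature names by |Spearman correlation| with the labels."""
    scores = {name: abs(spearman(col, labels)) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical toy data: three nuclei features for six samples.
toy_features = {
    "area": [10, 12, 11, 20, 22, 21],
    "circularity": [0.9, 0.8, 0.85, 0.5, 0.6, 0.7],
    "noise": [5, 1, 4, 2, 6, 3],
}
toy_labels = [0, 0, 0, 1, 1, 1]  # assumed encoding: 0 = benign, 1 = malignant

selected = rank_features(toy_features, toy_labels, top_k=2)
print(selected)
```

A filter of this kind scores each feature independently of any classifier, whereas forward selection adds features one at a time according to the accuracy of a chosen classifier.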
Keywords
nuclei segmentation, feature selection, classification, breast cancer, convolutional neural network