Computer-aided detection (CADe) has become a standard tool in diagnostic radiology. Because CADe systems are highly sensitive, they can help radiologists detect abnormalities in medical images. However, CADe systems also produce large numbers of false positives that distract radiologists and reduce their confidence in the CADe technology. Colon cancer is the second leading cause of cancer deaths for men and women combined in the United States, but it could be prevented by early detection and removal of its precursor lesions. Computed tomographic colonography (CTC) has been endorsed as a viable option for colorectal screening under the guidelines of the American Cancer Society, the U.S. Multi-Society Task Force, and the American College of Radiology. However, the interpretation of CTC images requires skill and is time-consuming. The first-reader CADe paradigm, in which a radiologist interprets a CTC study by reviewing an image gallery of potential abnormalities detected by CADe at high sensitivity, has recently been shown to provide an accurate workflow for the detection of colorectal lesions. Although most false-positive (FP) CADe detections are easy to dismiss even for inexperienced human readers, reviewing a large number of CADe detections is tedious, time-consuming, and expensive for a radiologist. If, however, we could use crowdsourcing to distribute the images of high-sensitivity CADe detections to knowledge workers (KWs) who have been trained to dismiss FP CADe detections, we could implement an advanced high-performance CTC interpretation system that yields both high detection sensitivity and high specificity. Crowdsourcing with big data can also be used to improve the performance of machine learning, whose limited classification accuracy is largely the reason for the low specificity of current CADe systems.
Big data, in the form of large and representative online training databases, can be incorporated to improve the classification performance of machine learning by identifying relevant new training cases from disagreements between crowdsourcing-based interpretations and computer-estimated lesion likelihoods. The goal of this project is to develop a Crowdsourcing-Aided MachinE Learning (CAMEL) scheme that will integrate machine learning with crowdsourcing for the detection of colorectal lesions in CTC. Although we will use colon CADe as an example system, the proposed concept applies to other CADe applications as well. We hypothesize that the CAMEL scheme will achieve a classification accuracy that is higher than that of machine learning alone and equivalent to that of unaided expert radiologists, and that it can improve radiologists' performance in the detection of clinically significant lesions in CTC. To achieve the goal and to test the study hypothesis, we will pursue the following specific aims: (1) Develop a decision support (DES) system that allows human participation; (2) Develop a CAMEL scheme for polyp detection with crowdsourcing; (3) Evaluate the clinical benefit of CAMEL. Successful development of CAMEL will demonstrate the clinical benefit of an engaging crowdsourcing platform for accurate colon cancer screening.
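The idea of finding new training cases from disagreements between crowd interpretations and computer-estimated lesion likelihoods can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not part of the proposed CAMEL design: the `Detection` record, the simple vote-fraction crowd score, the 0.5 decision threshold, and the `min_gap` margin are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """One CADe detection with its machine score and crowd votes (hypothetical record)."""
    detection_id: str
    cade_likelihood: float   # computer-estimated lesion likelihood in [0, 1]
    crowd_votes: List[bool]  # True = a knowledge worker judged it a true lesion


def crowd_score(votes: List[bool]) -> float:
    """Fraction of knowledge workers who judged the detection a true lesion."""
    return sum(votes) / len(votes)


def select_disagreements(detections: List[Detection],
                         threshold: float = 0.5,
                         min_gap: float = 0.4) -> List[str]:
    """Return IDs of detections where crowd and CADe fall on opposite sides
    of the decision threshold and differ by at least min_gap.

    Both threshold and min_gap are illustrative assumptions.
    """
    selected = []
    for d in detections:
        if not d.crowd_votes:
            continue  # no crowd interpretation available for this detection
        score = crowd_score(d.crowd_votes)
        opposite = (score >= threshold) != (d.cade_likelihood >= threshold)
        if opposite and abs(score - d.cade_likelihood) >= min_gap:
            selected.append(d.detection_id)
    return selected


if __name__ == "__main__":
    example = [
        Detection("A", 0.90, [False, False, False, True]),  # CADe high, crowd low
        Detection("B", 0.80, [True, True, True]),           # both agree: lesion
        Detection("C", 0.10, [True, True, False]),          # CADe low, crowd high
    ]
    print(select_disagreements(example))
```

In this sketch, the selected detections would be candidates for review and possible addition to the training database; how such cases are verified and weighted is left open here.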
Successful development of Crowdsourcing-Aided MachinE Learning (CAMEL) will demonstrate the clinical benefit of a crowdsourcing platform for accurate colon cancer screening. Because computer-aided detection (CADe) systems excel at detecting lesions (high sensitivity), whereas humans excel at removing false-positive CADe detections (high specificity), a crowdsourcing-assisted machine learning scheme should outperform both individual human readers and CADe systems alone. In the long term, broad adoption and use of the CAMEL scheme will facilitate early and accurate diagnoses, and thus will reduce mortality from colon cancer, which is the second leading cause of cancer deaths for men and women combined in the United States.