We have created a system that automatically records the intraprocedural (inside-the-patient) images of every colonoscopy in a de-identified fashion. At present, this “big data” database contains approximately 100 TB of de-identified endoscopy data. Interval colorectal cancers (CRCs) are CRCs that develop despite periodic colonoscopy; they arise from de novo tumor growth, a missed lesion, or incomplete lesion removal. Using a combination of location, date, time, and image information, we were able to locate, within our de-identified big data, the video file of a prior colonoscopy belonging to a patient with a recently diagnosed large interval lesion. Analysis of the video file showed that a large lesion had been incompletely removed at that earlier procedure. Analysis of big endoscopy datasets has the potential to resolve the cause of most, if not all, interval lesions and CRCs, and can provide specific, focused education to endoscopists about their individual limitations.
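As an illustration only, the metadata part of the search described above (location, date and time) can be sketched as a filter over an index of recordings. This is a minimal sketch, not a description of the actual system: the `Recording` fields, the room names, the file names, the dates and the 30-minute tolerance are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Recording:
    """One de-identified video file. All field names are hypothetical."""
    path: str
    room: str             # endoscopy room (location) where the video was captured
    started_at: datetime  # recording start time


def candidate_recordings(index, room, procedure_start,
                         tolerance=timedelta(minutes=30)):
    """Return recordings from `room` that started within `tolerance`
    of the known procedure start time, closest match first."""
    hits = [
        r for r in index
        if r.room == room and abs(r.started_at - procedure_start) <= tolerance
    ]
    return sorted(hits, key=lambda r: abs(r.started_at - procedure_start))


# Made-up index entries, for illustration only.
index = [
    Recording("video_0001.mp4", "room_a", datetime(2020, 1, 6, 9, 5)),
    Recording("video_0002.mp4", "room_b", datetime(2020, 1, 6, 9, 10)),
    Recording("video_0003.mp4", "room_a", datetime(2020, 1, 6, 14, 0)),
    Recording("video_0004.mp4", "room_a", datetime(2020, 1, 6, 9, 40)),
]

# Procedure known (from the clinical record) to have started in room_a at 09:15.
matches = candidate_recordings(index, "room_a", datetime(2020, 1, 6, 9, 15))
for r in matches:
    print(r.path, r.started_at.isoformat())
```

A filter of this kind only narrows the search to a few candidate files. Because the videos carry no patient identifiers, the final match would still have to be confirmed from the image information itself.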