Oral squamous cell carcinoma (OSCC) is associated with substantial mortality and morbidity. To identify potential biomarkers for the early detection of invasive OSCC, we used Affymetrix U133 2.0 Plus arrays to compare gene expression in incident primary OSCC, oral dysplasia, and clinically normal oral tissue from surgical patients without head and neck cancer or preneoplastic oral lesions (controls). We identified 131 differentially expressed probe sets using a training set of 119 OSCC patients and 35 controls. Forward and stepwise logistic regression analyses identified 10 successive combinations of genes whose expression differentiated OSCC from controls. The best model included LAMC2, encoding laminin-γ2 chain, and COL4A1, encoding collagen, type IV α1 chain. Subsequent modeling without these two markers showed that COL1A1, encoding collagen, type I α1 chain, and PADI1, encoding peptidyl arginine deiminase, type 1, could also distinguish OSCC from controls. We validated these two models using an internal independent testing set of 48 invasive OSCC cases and 10 controls and an external testing set of 42 head and neck squamous cell carcinoma cases and 14 controls (GEO GSE6791); sensitivity and specificity were above 95%. Both models also distinguished dysplasia (n = 17) from control (n = 35) tissue. Differential expression of these four genes was confirmed by quantitative reverse transcription-PCR. If confirmed in larger studies, the proposed models may hold promise for monitoring local recurrence at surgical margins and the development of second primary oral cancer in patients with OSCC.
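As an illustration of how a two-gene logistic model of this kind can be scored for sensitivity and specificity, the following minimal Python sketch may help. The intercept, coefficients, expression values, and 0.5 probability cutoff are all hypothetical placeholders chosen for the example; they are not the fitted parameters or data from this study, and the function names are our own.

```python
import math

def predict_probability(intercept, coefs, expression):
    """Logistic model: P(case) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(b * x for b, x in zip(coefs, expression))
    return 1.0 / (1.0 + math.exp(-z))

def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels (1 = case, 0 = control)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # cases called cases
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # cases missed
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)  # controls called controls
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # controls called cases
    return tp / (tp + fn), tn / (tn + fp)

# HYPOTHETICAL model parameters (not taken from the study).
INTERCEPT = -10.0
COEFS = (0.6, 0.5)   # weights for two marker genes, e.g. gene A and gene B
CUTOFF = 0.5         # assumed probability threshold for calling a sample a case

# TOY data: (expression of gene A, expression of gene B), true label.
samples = [
    ((12.0, 11.0), 1),
    ((11.0, 10.5), 1),
    ((9.0, 9.5), 1),
    ((8.0, 8.5), 1),
    ((7.0, 8.0), 0),
    ((6.5, 7.5), 0),
    ((8.5, 9.0), 0),
    ((6.0, 7.0), 0),
]

labels = [label for _, label in samples]
predictions = [
    1 if predict_probability(INTERCEPT, COEFS, expr) >= CUTOFF else 0
    for expr, _ in samples
]

sensitivity, specificity = sensitivity_specificity(labels, predictions)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

In this toy example one of the four cases falls below the cutoff, so sensitivity is 0.75 while specificity is 1.00. In practice the coefficients would be estimated from a training set and the metrics computed on independent testing sets, as described above.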