BACKGROUND: Tone languages such as Thai and Mandarin Chinese use differences in fundamental frequency (F0, pitch) to distinguish lexical meaning. Previous behavioral studies have shown that native speakers of a non-tone language have difficulty discriminating among tone contrasts and are sensitive to different F0 dimensions than speakers of a tone language. The aim of the present ERP study was to investigate the effect of language background and training on the non-attentive processing of lexical tones. EEG was recorded from 12 adult native speakers of Mandarin Chinese, 12 native speakers of American English, and 11 Thai speakers while they were watching a movie and were presented with multiple tokens of low-falling, mid-level and high-rising Thai lexical tones. High-rising or low-falling tokens were presented as deviants among mid-level standard tokens, and vice versa. EEG data and data from a behavioral discrimination task were collected before and after a two-day perceptual categorization training task. RESULTS: Behavioral discrimination improved after training in both the Chinese and the English groups. Low-falling tone deviants versus standards elicited a mismatch negativity (MMN) in all language groups. Before, but not after training, the English speakers showed a larger MMN compared to the Chinese, even though English speakers performed worst in the behavioral tasks. The MMN was followed by a late negativity, which became smaller with improved discrimination. The High-rising deviants versus standards elicited a late negativity, which was left-lateralized only in the English and Chinese groups. CONCLUSION: Results showed that native speakers of English, Chinese and Thai recruited largely similar mechanisms when non-attentively processing Thai lexical tones. However, native Thai speakers differed from the Chinese and English speakers with respect to the processing of late F0 contour differences (high-rising versus mid-level tones). In addition, native speakers of a non-tone language (English) were initially more sensitive to F0 onset differences (low-falling versus mid-level contrast), which was suppressed as a result of training. This result converges with results from previous behavioral studies and supports the view that attentive as well as non-attentive processing of F0 contrasts is affected by language background, but is malleable even in adult learners.