Data-driven modeling usually suffers from data sparsity, especially in large-scale modeling of urban phenomena that relies on single-source urban infrastructure data under fine-grained spatio-temporal contexts. To address this challenge, we motivate, design, and implement UrbanCPS, a cyber-physical system with heterogeneous model integration, based on extremely large multi-source infrastructures in the Chinese city of Shenzhen, involving 42 thousand vehicles, 10 million residents, and 16 million smartcards. Using temporal, spatial, and contextual information, we formulate an optimization problem of how to optimally integrate models built on highly diverse datasets, subject to three practical issues: heterogeneity of models, input data sparsity, and unknown ground truth. We further propose a real-world application called Speedometer, which infers real-time traffic speeds in urban areas. The evaluation results show that, compared to a state-of-the-art system, Speedometer increases the inference accuracy by 21% on average.