Operating, managing, and securing networks requires a thorough understanding of the demands placed on the network by the endpoints it interconnects, the characteristics of the traffic those endpoints generate, and the distribution of that traffic over the resources of the network infrastructure. A major differentiator in the types of resources required by traffic is the class of endpoint application that generates it. Service providers determine the application mix present in traffic via measurements, e.g., flow measurements furnished by routers. Previous work has shown that a fairly accurate determination of application type can be made from such data. However, protocol-level information, such as TCP/UDP ports and other parts of the transport header, and in some cases parts of the network header, may be inaccessible due to the use of encryption or tunneling protocols by endpoints or gateways. Furthermore, the utility of ports as signifiers of application type is limited by abuse and non-standard usage, among other reasons. These factors reduce classification accuracy. In this paper, we propose a novel technique for inferring the distribution of application classes present in the aggregated traffic flows between endpoints that exploits both the measured statistics of the traffic flows and the spatial distribution of those flows across the network. Our method employs a two-step supervised model: a bootstrapping step provides an initial (inaccurate) inference of the traffic application classes, and a graph-based calibration step adjusts the initial inference using the collective spatial traffic distribution. In evaluations using real traffic flow measurements from a large ISP, we show that our method can accurately classify application types within aggregate traffic between endpoints, even without knowledge of ports and other traffic features.
While the bootstrap estimate classifies the aggregates with 80% accuracy, incorporating spatial distributions through calibration raises the accuracy to 92%, i.e., reducing the number of errors by more than half.
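The two-step scheme described above can be illustrated with a minimal sketch. This is a hypothetical toy, not the paper's actual model: the bootstrap step is stood in for by a nearest-centroid classifier over flow statistics, and the calibration step by a simple label-propagation-style smoothing of class probabilities over an assumed endpoint adjacency graph (`bootstrap_probs`, `calibrate`, and all parameters are illustrative names introduced here).

```python
import numpy as np

def bootstrap_probs(features, centroids):
    """Toy bootstrap step: softmax over negative distances to per-class
    centroids of flow statistics, yielding initial class probabilities."""
    # Pairwise distances: (n_aggregates, n_classes)
    d = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    logits = -d
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def calibrate(probs, adjacency, alpha=0.5, iters=10):
    """Toy calibration step: repeatedly blend each aggregate's estimate
    with the average estimate of its neighbors in the traffic graph."""
    deg = adjacency.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0  # avoid division by zero for isolated nodes
    p = probs.copy()
    for _ in range(iters):
        neighbor_avg = adjacency @ p / deg
        p = (1 - alpha) * probs + alpha * neighbor_avg
    return p

# Tiny synthetic example: 4 aggregates, 2 application classes,
# one scalar flow feature, and a graph linking aggregates 0-1 and 2-3.
feats = np.array([[0.0], [1.0], [9.0], [10.0]])
cents = np.array([[0.0], [10.0]])
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)

p0 = bootstrap_probs(feats, cents)
p = calibrate(p0, A)
print(p.argmax(axis=1))  # per-aggregate class after calibration
```

Note that each row of the calibrated output remains a probability distribution, since every update blends convex combinations of distributions; in this toy, the graph smoothing can only reinforce or correct the bootstrap estimate where connected aggregates disagree.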