The problem of estimating an unknown function from a finite number of noisy data points is of fundamental importance in many applications. This problem has been studied in statistics, applied mathematics, engineering, artificial intelligence, and, more recently, in the fields of artificial neural networks, fuzzy systems, and genetic optimization. Despite the many papers describing individual methods, very little is known about the comparative predictive (generalization) performance of the various methods. We discuss the subjective and objective factors that make meaningful comparisons difficult. We also describe a pragmatic framework for comparing methods, and present a detailed comparison study comprising several thousand individual experiments. Our approach to comparisons is biased toward general (nonexpert) users who do not have detailed knowledge of the methods used. Our study uses six representative methods, described using a common taxonomy. Comparisons performed on artificial data sets provide some insight into the applicability of the various methods. No single method proved to be the best, since a method's performance depends significantly on the type of target function being estimated and on the properties of the training data (i.e., the number of samples, the amount of noise, etc.). Hence, our conclusions contradict many known comparison studies (performed by experts), which usually show the superior performance of a single method (the one promoted by those experts). We also observed differences in the methods' robustness, i.e., the variation in predictive performance caused by small changes in the training data. In particular, statistical methods using greedy (and fast) optimization procedures tend to be less robust than neural-network methods using iterative (slow) optimization for parameter (weight) estimation.
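The notion of robustness used above can be made concrete with a small numerical sketch. The code below is not the study's experimental protocol and does not use any of its six methods. It assumes a hypothetical target function (`target`), a polynomial least-squares fit as a stand-in estimator, and arbitrary sample sizes and noise levels. It repeatedly redraws the noisy training set, refits, and reports the mean and the spread of the prediction error on a fixed noise-free test grid; the spread is one simple way to quantify the variation in predictive performance caused by changes in the training data.

```python
import numpy as np


def target(x):
    # Hypothetical target function, chosen only for illustration.
    return np.sin(2.0 * np.pi * x)


def robustness_study(degree, n_samples=30, noise_std=0.2, n_repeats=50, seed=0):
    """Refit a polynomial on many redrawn training sets.

    Returns (mean, standard deviation) of the test RMSE across repeats.
    The polynomial fit is a stand-in estimator, not one of the study's methods.
    """
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0.0, 1.0, 200)   # fixed test grid
    y_test = target(x_test)               # noise-free target values
    errors = []
    for _ in range(n_repeats):
        # Draw a fresh noisy training set from the same distribution.
        x_train = rng.uniform(0.0, 1.0, n_samples)
        y_train = target(x_train) + rng.normal(0.0, noise_std, n_samples)
        # Least-squares polynomial fit and prediction on the test grid.
        coeffs = np.polyfit(x_train, y_train, degree)
        y_pred = np.polyval(coeffs, x_test)
        errors.append(np.sqrt(np.mean((y_pred - y_test) ** 2)))
    errors = np.asarray(errors)
    return float(errors.mean()), float(errors.std())


if __name__ == "__main__":
    for degree in (1, 5):
        mean_rmse, std_rmse = robustness_study(degree)
        print(f"degree={degree}: mean RMSE={mean_rmse:.3f}, std={std_rmse:.3f}")
```

Varying `n_samples`, `noise_std`, and `target` in such a loop mirrors, in miniature, the dependence on training-data properties and target-function type described above; a smaller standard deviation indicates a more robust estimator under these assumed conditions.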