## Abstract

This paper addresses the selection of a loss function for regression problems with finite data. It is well known (under the standard regression formulation) that for a known noise density there exists an optimal loss function in the asymptotic setting (large number of samples); for example, squared loss is optimal for Gaussian noise. However, in real-life applications the noise density is unknown and the number of training samples is finite. For such practical situations, we suggest using Vapnik's ε-insensitive loss function. We use a practical method for setting the value of ε as a function of the known number of samples and the (known or estimated) noise variance [1,2]. We consider commonly used noise densities (Gaussian, uniform and Laplacian). Empirical comparisons on several representative linear regression problems indicate that Vapnik's ε-insensitive loss yields more robust performance and improved prediction accuracy than squared loss and least-modulus loss, especially for noisy high-dimensional data sets.
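For readers unfamiliar with the three losses compared above, the following is a minimal sketch of their standard textbook definitions as functions of the residual (observed value minus prediction). The function names and the value `eps = 0.5` are illustrative assumptions, not taken from the paper; in particular, the paper's rule for choosing ε from the sample size and noise variance is not reproduced here.

```python
def squared_loss(residual):
    """Squared loss: penalizes large residuals quadratically."""
    return residual ** 2


def least_modulus_loss(residual):
    """Least-modulus (absolute) loss: penalizes residuals linearly."""
    return abs(residual)


def eps_insensitive_loss(residual, eps):
    """Vapnik's epsilon-insensitive loss: zero inside the band
    |residual| <= eps, linear outside it."""
    return max(0.0, abs(residual) - eps)


if __name__ == "__main__":
    eps = 0.5  # hypothetical value, chosen only for illustration
    for r in [-1.5, -0.2, 0.0, 0.3, 2.0]:
        print(r, squared_loss(r), least_modulus_loss(r),
              eps_insensitive_loss(r, eps))
```

With `eps = 0`, the ε-insensitive loss reduces to the least-modulus loss, so ε can be read as the width of the band of residuals that are treated as noise and not penalized.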

Original language | English (US) |
---|---|
Pages (from-to) | 395-400 |
Number of pages | 6 |
Journal | IEEE International Conference on Neural Networks - Conference Proceedings |
Volume | 1 |
State | Published - 2004 |
Event | 2004 IEEE International Joint Conference on Neural Networks - Proceedings - Budapest, Hungary. Duration: Jul 25 2004 → Jul 29 2004 |