InsideDarkWeb.com

Loss function for regression with uncertain labels

I have a regression task, for which I’m training a model with MSE loss. So for label $y$ and estimate $\hat{y}$ the loss is
$$\ell(y,\hat{y})=(y-\hat{y})^2$$
However, there is uncertainty in the “true” labels, and it varies across labels. Each true label is drawn from a distribution for which I can obtain a reasonable estimate of any statistic, e.g. the standard deviation.

I’d like the loss to reflect the variation in the true label $y$. I thought about simply normalizing by the standard deviation of each label

$$\ell\left(y,\hat{y}\right)=\left(\frac{y-\hat{y}}{\sigma\left(y\right)}\right)^{2}$$

Or, since sometimes $\sigma(y)=0$, maybe

$$\ell\left(y,\hat{y}\right)=\left(\frac{y-\hat{y}}{1+\sigma\left(y\right)}\right)^{2}$$

But this seems too ad-hoc. Is there a standard theory or approach people use in this sort of situation?

Cross Validated Asked on November 14, 2021

One Answer

The usual approach in statistics is to assume that the errors $\epsilon_i = y_i - E[y_i|x]$ are homoscedastic with variance $\sigma^2$. This assumption, together with independence of the errors, leads to least squares as the loss function for estimating $E[y_i|x]$.
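To make that link explicit, here is a short sketch under one extra assumption that the answer does not state, namely that the errors are Gaussian. For $n$ independent observations with predictions $\hat y_i$, the negative log-likelihood is

$$-\log L = \frac{n}{2}\log\left(2\pi\sigma^2\right) + \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i-\hat y_i\right)^2$$

Because $\sigma^2$ is the same for every observation, the first term and the factor $1/(2\sigma^2)$ do not depend on the predictions, so minimizing over the $\hat y_i$ is the same as minimizing $\sum_i (y_i-\hat y_i)^2$.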

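If your measurements of $y$ are themselves noisy, and that noise is independent of the model error, the variance of the errors becomes $\sigma^2 + \sigma(y_i)^2$. This results in the loss function $\sum_i w_i(y_i-\hat y_i)^2$, where the weights are the inverse of the total error variance, $w_i= (\sigma^2 + \sigma(y_i)^2)^{-1}$.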
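A minimal NumPy sketch of this weighted loss, assuming inverse-variance weights $w_i = 1/(\sigma^2 + \sigma(y_i)^2)$, which is the usual weighted least squares choice. The function name `weighted_mse` and the parameter names `label_sd` (for $\sigma(y_i)$) and `resid_var` (for $\sigma^2$) are hypothetical and chosen only for illustration.

```python
import numpy as np

def weighted_mse(y, y_hat, label_sd, resid_var):
    """Sum of squared errors weighted by inverse total variance.

    label_sd  : per-label standard deviation, sigma(y_i) (assumed known)
    resid_var : residual variance sigma^2 of the model (assumed given here)
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    label_sd = np.asarray(label_sd, dtype=float)

    # Total error variance per observation: sigma^2 + sigma(y_i)^2.
    total_var = resid_var + label_sd ** 2
    # Inverse-variance weights; finite even when label_sd == 0,
    # as long as resid_var > 0.
    weights = 1.0 / total_var
    return np.sum(weights * (y - y_hat) ** 2)
```

With `resid_var > 0`, labels with $\sigma(y_i)=0$ simply get the largest weight, so no ad-hoc offset in the denominator is needed.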

The problem is that $\sigma^2$, a.k.a. the residual variance, is not known and has to be estimated, but it can't be estimated after the rest of the model, because the model needs it to define the loss function properly. One solution is Iteratively Reweighted Least Squares. It is a fairly intuitive algorithm; one simple explanation is available in section 2.3 of this document.
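The alternation between fitting the model and re-estimating $\sigma^2$ can be sketched as follows. This is only an illustration under several assumptions that are not in the answer: the model is linear, the weights are the inverse of the total variance $\sigma^2 + \sigma(y_i)^2$, and $\sigma^2$ is updated with a simple moment estimate (mean squared residual minus the known label variance, clipped at a small positive floor). The function `fit_irls` and its parameters are hypothetical names, and the referenced document may describe a different update.

```python
import numpy as np

def fit_irls(X, y, label_sd, n_iter=20, tol=1e-8):
    """Alternate weighted least squares with re-estimation of sigma^2.

    X        : (n, p) design matrix (include a column of ones for an intercept)
    label_sd : per-label standard deviation, sigma(y_i) (assumed known)
    Returns (beta, resid_var).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    s2 = np.asarray(label_sd, dtype=float) ** 2

    # Start from ordinary least squares (all weights equal).
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid_var = 0.0
    for _ in range(n_iter):
        resid = y - X @ beta
        # Moment estimate: E[resid_i^2] is roughly sigma^2 + sigma(y_i)^2.
        new_var = max(np.mean(resid ** 2 - s2), 1e-12)
        # Inverse-variance weights for the next fit.
        w = 1.0 / (new_var + s2)
        # Weighted least squares by scaling rows with sqrt(w).
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        converged = abs(new_var - resid_var) < tol
        resid_var = new_var
        if converged:
            break
    return beta, resid_var
```

For a non-linear model such as a neural network, the same idea would apply with the weighted least squares step replaced by a few epochs of training on the weighted loss.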

Answered by carlo on November 14, 2021
