Result: Average CSMC Regression Regret
For all average constrained CSMC distributions $D$, and all cost-sensitive classifiers $h_g$ derived from a regressor $g$, \[ r_{av} (h_g) \leq \sqrt{2 |K| \epsilon_{L^2} (g)}, \] where $\epsilon_{L^2} (g)$ is the $L^2$ regret on the underlying regression problem.
Proof: See Average Analysis below.
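To get a feel for the scaling of this bound, here is a worked plug-in with illustrative numbers (chosen purely for concreteness, not taken from any experiment): with $|K| = 10$ classes and a regression regret of $\epsilon_{L^2} (g) = 0.005$, \[ r_{av} (h_g) \leq \sqrt{2 \cdot 10 \cdot 0.005} = \sqrt{0.1} \approx 0.316, \] so, as usual with square-root reductions, halving the regression regret only improves the guarantee on the classification regret by a factor of $\sqrt{2}$.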
Result: Minimax CSMC Regression Regret
For all minimax constrained CSMC distributions $D$, and all cost-sensitive classifiers $h_g$ derived from a regressor $g$, \[ r_{mm} (h_g) \leq \sqrt{2 |K| \epsilon_{L^2} (g)}, \] where $\epsilon_{L^2} (g)$ is the $L^2$ regret on the underlying regression problem.
Proof: See Minimax Analysis below.
Average Analysis
In this case, there is a distribution $D = D_x \times D_{\omega|x} \times D_{c|\omega,x}$, where $c: K \to \mathbf{R}$ takes values in the extended reals $\mathbf{R} = \mathbb{R} \cup \{ \infty \}$, and the components of $c$ which are $\infty$-valued for a particular instance are revealed as part of the problem instance via $\omega \in \mathcal{P} (K)$ (i.e., $\omega$ is a subset of $K$). The regret of a particular classifier $h: X \times \mathcal{P} (K) \to K$ is given by \[ r_{av} (h) = E_{(x, \omega) \sim D_x \times D_{\omega|x}} \left[ E_{c \sim D_{c|\omega,x}} \left[ c (h (x, \omega)) \right] - \min_{k \in K}\; E_{c \sim D_{c|\omega,x}} \left[ c (k) \right] \right]. \]

An argmax regression strategy for solving cost-sensitive multiclass classification is a function $g: X \times \mathcal{P} (K) \times K \to \mathbf{R}$ which defines an associated cost-sensitive multiclass classifier $h_g: X \times \mathcal{P} (K) \to K$ according to \[ h_g (x, \omega) = \underset{k \in K}{\operatorname{arg\,min\;}} g (x, \omega, k). \] I would like to bound $r_{av} (h_g)$ in terms of the regret of $g$ on the underlying regression problem, \[ \epsilon_{L} (g) = q_{L} (g) - \min_g\; q_{L} (g), \] where $q_L$ is the error on the regression problem \[ q_{L} (g) = E_{(x, \omega, c) \sim D} \left[ \frac{1}{|K|} \sum_{k \in K} L \bigl (g (x, \omega, k), c (k) \bigr) \right], \] and $L$ is a loss function for the regression problem (defined on the extended reals). I'll focus on $L^2$ loss for the regressor, extended to $\mathbf{R}$ via $L^2 (\infty, \infty) = 0$, $L^2 (\infty, \cdot) = \infty$, and $L^2 (\cdot, \infty) = \infty$.

Consider a single instance $(x, \omega)$ with associated conditional per-instance cost-vector distribution $D_{c|\omega,x}$, and suppose our regressor's cost estimates differ from a minimum error regressor's estimates $c^* (x, \omega, k)$ by $\delta (x, \omega, k)$, \[ g (x, \omega, k) = c^* (x, \omega, k) + \delta (x, \omega, k). \] For $k \in \omega$, $\delta (x, \omega, k) = 0$, since both $c^* (x, \omega, k)$ and our regressor $g (x, \omega, k)$ will be $\infty$. The associated classifier $h_g$ is \[ h_g (x, \omega) = \underset{k \in K}{\operatorname{arg\,min\,}} \bigl( c^* (x, \omega, k) + \delta (x, \omega, k) \bigr). \]

Imagine an adversary which attempts to create a certain amount of CSMC regret on this instance while minimizing the amount of regression regret on this instance. This adversary is faced with the following family of problems indexed by $k^{**} \in K \setminus \omega$: \[ \begin{aligned} &\min_{\delta}\; \sum_{k \in K} \delta (x, \omega, k)^2 \\ \mathrm{s.t.} \; \forall k \neq k^{**}, \; & c^* (x, \omega, k^{**}) + \delta (x, \omega, k^{**}) \leq c^* (x, \omega, k) + \delta (x, \omega, k). \end{aligned} \] This is the same as the unconstrained CSMC reduction to regression, but with $k^{**}$ restricted to the set $K \setminus \omega$. When $|K \setminus \omega| \leq 1$, the CSMC regret is zero; otherwise the adversary's strategy is unchanged: perturb the $k^*$ and $k^{**}$ estimates towards each other by half the cost gap each, and leave the others alone. Thus leveraging the previous analysis yields \[ r_{av} (h_g) \leq \sqrt{2 |K| \epsilon_{L^2} (g)}. \]

It should also be noted that the regression regret will be at most the regression regret in the unconstrained case, since the additional information contained in $\omega$ allows the regressor to have a perfect estimate for some values of $k$.
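For concreteness, here is a minimal sketch of the argmax regression strategy as a constrained classifier, assuming a regressor callable `g(x, omega, k)` that returns estimated costs; the function names and the toy regressor below are hypothetical, not from this post:

```python
import math

def h_g(g, x, omega, K):
    """Cost-sensitive classifier derived from the regressor g.

    Classes in omega have infinite cost, so the arg min over K is
    effectively an arg min over the feasible set (K minus omega).
    """
    feasible = [k for k in K if k not in omega]
    if not feasible:
        raise ValueError("omega covers all of K; no feasible class")
    return min(feasible, key=lambda k: g(x, omega, k))

# toy regressor: exploits omega to report infinite cost for infeasible classes
def g(x, omega, k):
    estimated_cost = {0: 0.3, 1: 0.1, 2: 0.6, 3: 0.2}  # would come from learning
    return math.inf if k in omega else estimated_cost[k]

print(h_g(g, x=None, omega={1}, K=range(4)))  # -> 3 (class 1 is infeasible)
```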
Minimax Analysis
In this case, there is a distribution $D = D_x \times D_{\tilde c|x}$, where $\tilde c: K \to \mathbb{R}$ takes values in the regular reals $\mathbb{R}$. Then an adversary comes in and manufactures a cost vector $c$ in the extended reals $\mathbf{R}$ by setting some of the components to $\infty$; these choices are revealed via $\omega$ prior to a decision being elicited. In this case the regret of a particular classifier is given by \[ r_{mm} (h) = E_{x \sim D_x} \left[ E_{\tilde c \sim D_{\tilde c|x}} \left[ \max_{\omega \in \mathcal{P} (K)} \left\{ c (h (x, \omega)) - \min_{k \in K}\; E_{\tilde c \sim D_{\tilde c|x}} \left[ c (k) \right] \right\} \right] \right]. \]

Consider a single instance $x$ with associated conditional per-instance cost-vector distribution $D_{\tilde c|x}$; in addition the adversary can pick $\omega$ to construct the complete problem instance $(x, \omega)$. As above, the regressor has cost estimates which differ from a minimum error regressor's estimates $c^* (x, \omega, k)$ by $\delta (x, \omega, k)$, and when $k \in \omega$ the estimates are perfect, i.e., $\delta (x, \omega, k) = 0$.

Imagine an adversary which attempts to create a certain amount of CSMC regret on this instance while minimizing the amount of regression regret on this instance. This adversary is faced with the following family of problems indexed by $\omega$ and $k^{**} \in K \setminus \omega$: \[ \begin{aligned} &\min_{\delta}\; \sum_{k \in K} \delta (x, \omega, k)^2 \\ \mathrm{s.t.} \; \forall k \neq k^{**}, \; & c^* (x, \omega, k^{**}) + \delta (x, \omega, k^{**}) \leq c^* (x, \omega, k) + \delta (x, \omega, k). \end{aligned} \] Again, when $|K \setminus \omega| \leq 1$ the CSMC regret is zero; otherwise the adversary's strategy is the same for each $\omega$, and leveraging the previous analysis yields \[ r_{mm} (h_g) \leq \sqrt{2 |K| \epsilon_{L^2} (g)}. \]
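The "previous analysis" being leveraged is the unconstrained argument: to flip the decision from $k^*$ to some $k^{**}$ with cost gap $r$, the cheapest perturbation moves both estimates by $r/2$, incurring squared error $r^2/2$; with the $1/|K|$ averaging in $q_{L^2}$ this is a per-instance regression regret of $r^2 / (2 |K|)$, i.e., $r = \sqrt{2 |K| \epsilon_{L^2}}$. Below is a small numeric check of that step; the instance, the class costs, and the helper name are illustrative only, not from this post:

```python
import math

def adversarial_perturbation(c_star, k_star, k_target):
    """Minimum squared-error perturbation making the arg min classifier
    pick k_target instead of the true best feasible class k_star."""
    r = c_star[k_target] - c_star[k_star]   # CSMC regret induced on this instance
    delta = {k: 0.0 for k in c_star}
    delta[k_star] = r / 2.0                 # push the best class's estimate up
    delta[k_target] = -r / 2.0              # pull the target class's estimate down
    return delta, r

# toy feasible costs (classes in omega are omitted: their estimates are exact)
c_star = {0: 0.1, 1: 0.4, 2: 0.7, 3: 0.9}
K_size = 6                                  # |K|, counting the classes in omega too
delta, r = adversarial_perturbation(c_star, k_star=0, k_target=2)

# per-instance regression regret under the 1/|K|-averaged squared loss
eps = sum(d * d for d in delta.values()) / K_size
print(r, math.sqrt(2 * K_size * eps))       # both are 0.6 (up to floating point): tight
```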