# Randomized weighted majority algorithm

The Randomized weighted majority algorithm is an algorithm that belongs to the machine learning theory.[1] It minimize the mistake-bound of the weighted majority algorithm. For example, each day we get a prediction from some number of experts about whether the stock market is up or down.Then we use them to make our prediction. Our goal is to do nearly as well as best of the experts in hindsight.

## Motivation

In machine learning, the weighted majority algorithm(WMA) is a meta-learning algorithm which "predicts from expert advice".

The algorithm:

initialize all experts to weight 1.
for each round:
poll all the experts and predict based on a weighted majority vote of their predictions.
cut in half the weights of all experts that make a mistake.


Suppose there are n experts and the best experts made m mistakes. the weighted majority algorithm makes at most $\ 2.4(log_2n+ m$) mistakes.

## Randomized weighted majority(RWMN)

$\ 2.4(log_2n + m$) not so good if the best expert makes a mistake 20% of the time. can we do better? yes. we'd like to improve dependence on $\ m$. Instead of predicting based on majority vote, we use the weighted as probabilities - Randomized weighted majority. If $\ w_i$ is the weight of expert $\ i$, $\ W=$$\ \sum_iw_i$, then we will follow expert $\ i$ with probability $\ \frac{w_i}{W}$. Our goal is to bound the worst-case expected number of mistakes, assuming that the adversary(the world) has to select one of the answers as correct before we make our coin toss. Why is this better in the worst case? Idea: worst case for determinisric algorithm(weighted majority algorithm) was when weights split 50/50.But, now it is not so bad since we also have 50/50 chance of getting it right. Also, to trade-off between dependence on $\ m$ and $\ log_2n$, we will generalize to multiply by $\ \beta<1$, instead of necessarily $\frac{1}{2}$.

## Analysis

At the $\ t$-th round, define $\ F_t$ to be the fraction of weight on the wrong answers. so, $\ F_t$ is the probability we make a mistake on the $\ t$-th round. Let $\ M$ denote the total number of mistakes we made so far. Furthermore, we define $E[M]=\ \sum_tF_t$, using the fact that expectation is additive. On the $\ t$-th round, $W$ becomes $\ W(1-(1-\beta)F_t)$. Reason: on $\ F_t$ fraction, we are multiplying by $\ \beta$. So, $\ W_{final}=n*(1-(1-\beta)F_1)*(1-(1-\beta)F_2)...$
Let's say that $\ m$ is the number of mistakes of the best expert so far. We can use the inequality $\ W\geq \beta^m$. Now we solve. First, take the natural log of both sides. We get: $\ mln\beta \leq ln(n) + \sum_tln(1-(1-\beta)F_t)$, Simplify:
$\ ln(1-x)= -x -\frac {x^2}{2} - \frac {x^3}{3}-...$, So,
$\ ln(1-(1-\beta)F_t)< -(1-\beta)F_t$.
$\ mln\beta \leq ln(n) - (1-\beta)* \sum_tF_t$
Now, use $\ E[M] =\ \sum_tF_t$, and the result is:
$\ E[M] \leq \frac {mln(1/\beta)+ln(n)}{1-\beta}$
Let's see if we made any progress:

If $\ \beta=\frac{1}{2}$, we get, $\ 1.39m+2ln(n).$,
if $\ \beta=\frac{3}{4}$, we get, $\ 1.15m+4ln(n)$.
so we can see we made progress. Roughly, of the form $\ (1+\epsilon)*m+\epsilon^{-1}*ln(n)$.

## Uses of Randomized weighted Majority(RWMN)

Can use to combine multiple algorithms to do nearly as well as best in hindsight.

can apply Randomized weighted majority algorithm in situations where experts are making choices that cannot be combined (or can't be combined easily).For instance, repeated game-playing or online shortest path problem.In the online shortest path problem, each expert is telling you a different way to drive to work. You pick one using Randomized weighted majority algorithm. Later you find out how well you would have done, and penalize appropriately. To do this right, we want to generalize from just "losS" of 0 to 1 to losses in [0,1]. Goal of having expected loss be not too much worse than loss of best expert.We generalize by penalize $\beta^{loss}$, meaning having two examples of loss $\ \frac {1}{2}$ gives same weight as one example of loss 1 and one example of loss 0 (Analysis still oes through).

## Extensions

- "Bandit" problem
- Efficient algorithm for some cases with many experts.
- Sleeping experts/"specialists" setting.