This article uses both KL(p, q) and D_KL(p || m) for the Kullback–Leibler divergence. Are these two notations for the same quantity? If so, the article should state the equivalence explicitly.
The log-likelihood of the training data for a multinomial model is, up to sign, the same as the cross-entropy of the data. (The Elements of Statistical Learning, page 32)
L(θ) = Σ_k I(G = k) log Pr(G = k | X = x), where the sum runs over all classes k
I guess "I(G=k)" is p and Pr(G=k | X=x) is q here.
Could somebody in the know please add this? Thanks!
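For anyone who wants to check the correspondence numerically, here is a minimal sketch for a single observation. It assumes the indicator I(G=k) plays the role of p (a one-hot distribution) and Pr(G=k | X=x) plays the role of q, as guessed above. The function names and the example probabilities are made up for illustration and do not come from the book.

```python
import math

def log_likelihood(true_class, predicted_probs):
    """Single-observation log-likelihood: sum_k I(G=k) * log Pr(G=k | X=x)."""
    return sum(
        (1.0 if k == true_class else 0.0) * math.log(q_k)
        for k, q_k in enumerate(predicted_probs)
    )

def cross_entropy(p, q):
    """H(p, q) = -sum_k p_k * log q_k, skipping terms where p_k is zero."""
    return -sum(p_k * math.log(q_k) for p_k, q_k in zip(p, q) if p_k > 0)

# Hypothetical model output for three classes; the observed class is index 1.
q = [0.7, 0.2, 0.1]
true_class = 1

# One-hot distribution p built from the indicator I(G=k).
p = [1.0 if k == true_class else 0.0 for k in range(len(q))]

ll = log_likelihood(true_class, q)
ce = cross_entropy(p, q)

# Under these assumptions the two quantities differ only in sign.
print(ll, ce)
```

With a one-hot p, only the term for the observed class survives, so the log-likelihood is the negative of the cross-entropy. Summing over all training observations gives the same relation for the whole data set.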
WikiProject class rating
This article was automatically assessed because at least one WikiProject had rated the article as stub, and the rating on other projects was brought up to Stub class. BetacommandBot 09:46, 10 November 2007 (UTC)
To improve clarity, this stub should be merged into the "Kullback–Leibler divergence" article. "Cross entropy" and "relative entropy" refer to the same quantity in the literature, at least up to a sign convention. DRB (talk) 00:39, 15 January 2010 (UTC)
- Hmmm... fair idea, but not a cut-and-dried case. Maybe put a merge tag on it, at least. Kevin Baastalk 14:07, 15 January 2010 (UTC)
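As a rough aid for the merge discussion, the sketch below compares the two quantities numerically under the standard textbook definitions, H(p, q) = -Σ p log q and D_KL(p || q) = Σ p log(p/q). The distributions p and q are arbitrary examples chosen for illustration, and the function names are not taken from either article.

```python
import math

def entropy(p):
    """H(p) = -sum_k p_k * log p_k, skipping zero-probability terms."""
    return -sum(p_k * math.log(p_k) for p_k in p if p_k > 0)

def cross_entropy(p, q):
    """H(p, q) = -sum_k p_k * log q_k, skipping terms where p_k is zero."""
    return -sum(p_k * math.log(q_k) for p_k, q_k in zip(p, q) if p_k > 0)

def kl_divergence(p, q):
    """D_KL(p || q) = sum_k p_k * log(p_k / q_k), skipping terms where p_k is zero."""
    return sum(p_k * math.log(p_k / q_k) for p_k, q_k in zip(p, q) if p_k > 0)

# Arbitrary example distributions over three outcomes.
p = [0.5, 0.25, 0.25]
q = [0.4, 0.4, 0.2]

h_p = entropy(p)
h_pq = cross_entropy(p, q)
d_kl = kl_divergence(p, q)

# Under these definitions, H(p, q) = H(p) + D_KL(p || q).
print(h_pq, h_p + d_kl)
```

Under these definitions the cross-entropy and the KL divergence differ by the entropy H(p) of the first distribution, so they coincide only when H(p) is zero.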