Talk:Minimum description length
WikiProject Statistics  (Rated Startclass, Lowimportance)  


WikiProject Computing / Software  (Rated Startclass, Lowimportance)  


Comments[edit]
The following comment pertains to the previous version of this page, of which only traces remain in the current version. Any new comments are welcome! I'll check this discussion page now and then for the next couple of weeks (september 2004).
see the disccussion Minimum message length which may be relevant to stress the difference between MML and MDL (perhaps the easiest is in the historical sense / pratical issues)  Meduz 12:15, 4 August 2006 (UTC)
This page lacks mathematics. What _exactly_ is the definition of MDL? This should be front and centre, with various fun sounding things interpretations presented afterwards. — Preceding unsigned comment added by 193.60.86.75 (talk) 15:55, 28 June 2017 (UTC)
I'm sure this means something to somebody, but it needs more context to help out those of us that aren't in whatever field this is (the theory of computation, maybe?).
For example:
 what kind of "model" are we talking about?
 what does "in the light of data" mean?
 Who are Wallace and Boulton? Shannon?
 Lots of stuff needs to be wikified (i.e. turned into Wikipedia links)
 Can the second paragraph be clarified? What's the takehome message that it's trying to say?
 What's the "MDL community"?
One more question:
Is it really true that the same priors tend to be favored in "objective Bayesian" analysis as in MDLtheory?
Yes, this is true: The problem with Bayesian probability is that, since the prior is subjective, any probabilities you compute no longer necessarily correspond to frequencies in a repeated experiment, as in classical frequentist probability theory. Objective Bayesians try to get the best of both worlds by using priors that, while not in themselves corresponding to frequencies of any kind, converge as quickly as possible to posterior distributions under which any predictions closely correspond to actual frequencies that you observe in real life experiments. This means that priors should use as little assumptions about the data as possible. Also the prior should be invariant under the functional form of the distribution function, for example: you could parameterize the normal distribution using and , but that shouldn't change which distributions are more likely according to the prior. For these reasons, Jeffreys' prior is often advocated by Objective Bayesians. It turns out that the Bayesian probability using Jeffreys' prior behaves asymptotically the same as the NML probability that is favoured by MDL people (read: Rissanen).
All this is quite involved and I don't think people will be helped if stuff like this is included on the main page. Also, my apologies for the lack of references, I will try to provide a few on request...
I have slightly changed the introductory part of this article: it suggested, incorrectly, that MDL is basically the same thing as inference based on Kolmogorov complexity. While it is inspired on Solomonoff's theory, one of the basic properties of MDL is that it is computable, which it would not be if general computer programs were used to model the data.
fix link to function to be function (mathematics)[edit]
This text in the article: "Central to MDL theory is the 11 correspondence between code length functions and probability distributions. (The lemma involved is the KraftMcMillan inequality.)" linked to the function disambiguation page. I altered the link to point to function (mathematics). I'm fairly certain this is correct, but since the article does reference Turing machines, I'm hoping the text does not refer to subroutines by the word "function." Correct me if I am wrong. Volfy 02:07, 23 November 2005 (UTC)
functions[edit]
No, I really meant ordinary mathematical functions, so the cross reference is correct.
latest modifications[edit]
I made a number of new changes to the page. Most changes are minor. The most important changes are: I included a bit about MDL's builtin protection from overfitting and I cleared up the relation to Bayesian inference and MML. I removed the crossreference to 'reduction (mathematics)' because the phrase 'reduces to' was not intended to be taken in a technicalmathematical way. I also removed the suggestion to merge the topic with MML: the page already contains a very clear reference and the two subjects really are not the same thing. Although I do feel that the exact differences should be explained in more detail. I'll get one of my colleagues to help with this shortly.
another minor change[edit]
I added a little more on the differences between MML and MDL. I also reverted the first sentence to its previous state. Occam's razor is a general principle in the philosophy of science which has nothing to with Bayesian inference so I object to the term 'Bayesian Occam's razor', which suggests that MDL implements a specific kind of Occam's razor which has something to do with Bayes' rule. This is not the case. MDL is proposed directly as an implementation of Occam's razor and developed independently of Bayesian statistics. It is true that there is a very strong connection between MDL and Bayes; both implement Occam's razor for instance, but that doesn't mean that MDL should be defined in terms of Bayesian statistics. I welcome real improvements to the text, which is obviously far from perfect, but this doesn't really help. Finally, as a courtesy to me, if you change the first line, would you please consider motivating your changes on the discussion page?
Coin example, never selecting a biased coin model?[edit]
I think this section needs a citation, clarification, or confirmation by an expert (preferably by citation). It looks like original research as it stands, and I tend to think the conclusion is incorrect.
The simplest counterexample is that of a lossless graphic compression algorithm. These map pixel data into a computablyfeasible model space, and using techniques such as runlength encoding often effectively exploit the fact that not all images appear with equal likelihood, and that color bias and spacial consistency often allow a reduction. This amounts to a model selection. Even a random image of 1000 pixels which are selected according to a severe bias (95% black and 5% white) will be compressible to less than 1000 bits, because at those odds runlength encoding works well, as does an "all are black but these" model. With those languages, should you encounter a picture that is 50/50 random, you may need give up a constant number of bits to say "no better compression possible", but making these practical tradeoffs is one of the advantages of MDL versus Kolomorov computation: as the article says, in MDL the actual constants matter.
12.232.236.2 (talk) 16:36, 5 July 2011 (UTC)
External links modified[edit]
Hello fellow Wikipedians,
I have just added archive links to one external link on Minimum description length. Please take a moment to review my edit. If necessary, add {{cbignore}}
after the link to keep me from modifying it. Alternatively, you can add {{nobotsdeny=InternetArchiveBot}}
to keep me off the page altogether. I made the following changes:
 Added archive http://web.archive.org/web/20100218134039/http://www.mdlresearch.org:80/ to http://www.mdlresearch.org/
When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}
).
checked=
to true
 If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
 If you found an error with any archives or the URLs themselves, you can fix them with this tool.
Cheers.—^{cyberbot II}_{Talk to my owner:Online} 00:18, 19 March 2016 (UTC)
External link defunct[edit]
http://www.mdlresearch.org (edited by the bot) appears to have been taken over by some chinese entity. http://www.mdlresearch.net/ still exists (and refers to the .org domain).
External links modified (February 2018)[edit]
Hello fellow Wikipedians,
I have just modified one external link on Minimum description length. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
 Added archive https://web.archive.org/web/20100616234207/http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=11155 to http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=11155
When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.
As of February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to declutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{sourcecheck}}
(last update: 15 July 2018).
 If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
 If you found an error with any archives or the URLs themselves, you can fix them with this tool.
Cheers.—InternetArchiveBot (Report bug) 02:14, 1 February 2018 (UTC)
 StartClass Statistics articles
 Lowimportance Statistics articles
 WikiProject Statistics articles
 StartClass Computing articles
 Lowimportance Computing articles
 StartClass software articles
 Unknownimportance software articles
 All Software articles
 StartClass software articles of Lowimportance
 All Computing articles