Talk:Machine learning

Wrong date

Wikipedia and many other websites give 1952 as the date when Arthur Samuel wrote the first program, but this article says 1959. This should be corrected.

Nomination of Portal:Machine learning for deletion

A discussion is taking place as to whether Portal:Machine learning is suitable for inclusion in Wikipedia according to Wikipedia's policies and guidelines or whether it should be deleted.

The page will be discussed at Wikipedia:Miscellany for deletion/Portal:Machine learning until a consensus is reached, and anyone is welcome to contribute to the discussion. The nomination will explain the policies and guidelines which are of concern. The discussion focuses on high-quality evidence and our policies and guidelines.

Users may edit the page during the discussion, including to improve the page to address concerns raised in the discussion. However, do not remove the deletion notice from the top of the page. North America1000 10:36, 12 July 2019 (UTC)

Decision tree image

decision tree

I just posted this image to the article.

I liked it because

  1. It has a free license
  2. We do not have a competing image
  3. It lists various machine learning techniques
  4. It is useful for students
  5. The illustration has the backing of an academic paper explaining it

Blue Rasberry (talk) 18:43, 24 September 2019 (UTC)

I don't like it, and I don't think we should put it into this article.
  1. It promotes one tool, sklearn.
  2. It is quite specific to the narrow selection of algorithms available in this particular tool; ML is much broader.
  3. It has been outdated for years, even for sklearn, and is missing much of sklearn's own functionality. — Preceding unsigned comment added by (talk) 22:22, 24 September 2019 (UTC)

Relation to statistics

The first paragraph of this section is very good, IMHO, but the last two are problematic. The second seems random and a little unfinished. The third raises an important point, but saying that statistical learning arose because "[s]ome statisticians have adopted methods from machine learning" is arguably confusing the chicken with the egg. It should also be mentioned here that statistical machine learning is a relatively well-established term (cf. e.g. this book) which has a meaning somewhere in between machine learning and statistical learning. Thomas Tvileren (talk) 13:07, 1 November 2018 (UTC)

The new first line "Machine learning and statistics are closely related fields in terms of methods, but distinct in their principal goal: statistics draws population inferences from a sample, while machine learning finds generalizable predictive patterns" is wrong and is currently under discussion on social media. — Preceding unsigned comment added by (talk) 06:55, 9 December 2019 (UTC)

@ Don't top-post. New messages go at the bottom (see Help:Talk pages). Also, social media is not authoritative. Nature published this very opinion, in an article co-authored by statistics professor Naomi Altman. We go by authoritative sources such as Nature, not social media opinions. HelpUsStopSpam (talk) 22:21, 9 December 2019 (UTC)
@HelpUsStopSpam: Your linked article starts with “Statistics draws population inferences from a sample, and machine learning finds generalizable predictive patterns.” which is wrong. You can't draw population inference from a single sample. The argument raised by Thomas Tvileren is also valid, although I'm not sure it is possible to draw a line between statistical learning and machine learning. Most of machine learning is statistical learning; it is only modelled slightly differently. Jeblad (talk) 03:54, 28 December 2019 (UTC)
The linked article has several problematic claims, for example “…ML concentrates on prediction…” and “Classical statistical modeling was designed for data with a few dozen input variables and sample sizes that would be considered small to moderate today.” Both of these claims seem to be problematic. Jeblad (talk) 11:48, 28 December 2019 (UTC)
That is your opinion. Apparently, the reviewers at Nature had a different opinion. This statement, whether you like it or not, satisfies Wikipedia:Verifiability. So why don't you write an opposing article in Nature, so we can add it here? Right now, yours is an unsourced personal opinion, and we would rather add sourced material. I also disagree with you: a sample has a defined meaning in statistics (see Sample_(statistics)) that you do not appear to be aware of (not to be confused with a single sample point). So I am not sure you know what they are talking about when they write "classical statistical modeling", either (this is not the same as a "model" in deep learning, which is just some matrices where you usually have no idea what they actually do...; of course it is not completely independent, but also not quite the same). You may need to think outside your "ML bubble" to understand that source. The sentence that you complain about was not there when Thomas Tvileren posted; this was an old thread. Nevertheless, the Nature source that you don't like also writes: "the boundary between statistical and ML approaches becomes hazier." and "The boundary between statistical inference and ML is subject to debate—some methods fall squarely into one or the other domain, but many are used in both."
But in the end, it boils down to WP:PROVEIT: if you think that sentence is "wrong", then provide reliable sources. The above source is from "Nature"; where are your sources? HelpUsStopSpam (talk) 21:41, 14 January 2020 (UTC)

Artificial intelligence

The claim “It is seen as a subset of artificial intelligence.” is wrong. Rephrasing it as “Methods from machine learning are used in some types of artificial intelligence.” would be correct. In particular, it is by definition not part of wet AI unless biological material is defined as a “machine”. Artificial intelligence is about creating thinking machines, not just algorithmic descriptions of learning strategies. (It is probably “narrow AI” creeping into the article, or robotic process automation (RPA) aka “robotics” aka business process automation, which is mostly just sales pitch and has very little to do with AI.)

It seems like all kinds of systems with some small machine learning component are claimed to be AI today, and this creeps into books and articles. Machine learning is pretty far from weak AI and very far from strong AI. It is more like a necessary tool to build a house; it is not the house. Jeblad (talk) 03:18, 28 December 2019 (UTC)

WP:PROVEIT, too. That appears to be your opinion (and you appear to have misread the text), but I can easily find many sources that say "ML is a part/subset of AI". It's not saying it's "strong AI" or "all of AI". HelpUsStopSpam (talk) 23:17, 14 January 2020 (UTC)
I've been in the field for 30 years and the claim is common, but machine learning is not AI, the same way a hammer is not the house. Jeblad (talk) 12:13, 22 January 2020 (UTC)
It does seem increasingly common for sources to deny ML is a part of AI. But the majority still seem to hold the opposite view. IMO, for now the lede should still unequivocally say ML is part of AI, but in the body we can reflect the alternative perspective, as long as we don't overweight it. I'll have a go at making this change & a few other improvements. FeydHuxtable (talk) 09:40, 7 April 2020 (UTC)

overlooked randomness

could someone possibly add some thoughts on how randomness is needed for ml?

i wish i could do it, but i lack the expertise and the time to bring this up in Wikipedia style, as is evident from this very post and the chain of links in it, if you care enough to dig.

cheers! 😁😘 16:11, 27 February 2020 (UTC) — Preceding unsigned comment added by Cregox (talk • contribs)