
Wikipedia talk:Wikidata/2017 State of affairs: Difference between revisions

From Wikipedia, the free encyclopedia
Visual clutter: that would seem to be fixable
Some comments: really boiling down to us vs them


This isn't a one-off problem: the above two errors were introduced by two long-term, trusted Wikidata editors, and when I look at [https://www.wikidata.org/wiki/Q3757 Java] (the island), I see the same kind of wrong claims (Java is located in the administrative units West Java, Central Java, East Java, ...) added by yet another very active editor. If such errors are not caught on major topics, then how are you ever going to make Wikidata good enough to be used as the source for infoboxes, lists, or whole articles (placeholders or real ones)? [[User:Fram|Fram]] ([[User talk:Fram|talk]]) 08:19, 24 January 2017 (UTC)
: Wikidata could certainly use more human eyeballs to find and fix errors like these, as could every wikipedia. What do you suppose the error rate is in the average enwiki article? As I mentioned earlier, the French university articles (including long lists) were substantially wrong in enwiki for 2 years. I've edited enwiki articles on major physics topics that had basic misunderstandings that had stood in place for 5 years or more. [[WP:WIP]] - applies to wikidata just as well. But I suppose the more fundamental question is whether wikidata is seen as part of "us" or just another "them" which seems to be what most of this page is about... [[User:ArthurPSmith|ArthurPSmith]] ([[User talk:ArthurPSmith|talk]]) 17:34, 24 January 2017 (UTC)


== Is the page communal or a collection of opinions? ==

Revision as of 17:35, 24 January 2017

Fake news

I've just reverted an edit which made the bold - and, ironically, unsubstantiated - assertion that "Wikidata edits violate WP:V and WP:BLP.". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:28, 12 January 2017 (UTC)[reply]

And I've now reverted an attempt to re-word that as "The lack of reliable sourcing means that imported Wikidata text violates WP:V and WP:BLP.", which is still bunkum. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:34, 12 January 2017 (UTC)[reply]
Actually, it's one of the bigger problems, and I'm really sorry that you don't understand that. I note on reading this page, and many other Wikidata-related discussions, that individuals who are strong proponents of using Wikidata information in articles do not seem to give any weight whatsoever to the editorial policies of the recipient projects. Maybe the editorial community of a small wiki doesn't care and is more interested in getting *any* data regardless of whether or not it's even true (I've seen lots of Wikidata entries with bunkum in them). This is not one of those projects. We cannot afford to be one of those projects. We're the one that makes international headlines for incorrect information. Risker (talk) 15:32, 14 January 2017 (UTC)[reply]
"...you don't understand that" Bullshit. "individuals who are strong proponents of using Wikidata information in articles do not seem to give any weight whatsoever to the editorial policies of the recipient projects" Also bullshit. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:28, 14 January 2017 (UTC)[reply]
You have made it crystal clear that you do not in any way perceive this to be a problem, both here and elsewhere on this page, and I don't understand your rather extreme denial that there are BLP problems. I would identify you as one of the strong proponents of Wikidata on this project, and you seem to be denying there are any BLP and WP:V problems, while it is repeatedly demonstrated that there are. I think you would likely characterize yourself as being a strong proponent, and you're not giving it any weight. Risker (talk) 01:39, 15 January 2017 (UTC)[reply]
"your rather extreme denial that there are BLP problems... you seem to be denying there are any BLP and WP:V problems". Really, Risker, we expect this kind of rubbish from the usual trolls on Wikipedia, but you used to know better. Shame on you. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:30, 15 January 2017 (UTC)[reply]
Andy, having personally reverted Wikidata information that was indeed not only unverified but completely erroneous, at least on one occasion involving a living person for whom the error was particularly problematic, I will stand my ground on this one. I expect better from you too, Andy. These are core policies of English Wikipedia. If you're not on board with them, I don't know what to say. Risker (talk) 03:24, 16 January 2017 (UTC)[reply]
I too have personally reverted Wikidata information that was indeed not only unverified but completely erroneous. Such anecdotes are utterly irrelevant to the point at hand. "core policies of English Wikipedia. If you're not on board with them..." More bullshit. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:46, 16 January 2017 (UTC)[reply]

Request to fix my own post

I've tried three times to reduce my own post to bullet format without a signature, [1][2][3] but Pigsonthewing has reverted me. Could someone do that for me, please? It's my own post, so I should be allowed to write it as I want to; it's in the "disadvantages" section, where people were asked to list perceived disadvantages.

I would like it to say: "The lack of reliable sourcing means that imported Wikidata text violates WP:V and WP:BLP." SarahSV (talk) 23:45, 12 January 2017 (UTC)[reply]

As I thought I'd made very clear above, I would object most strongly to such a falsehood being included here. See also the recent comment by User:RexxS. As for "your own post", the page referred to is a communal page; see WP:OWN. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:58, 12 January 2017 (UTC)[reply]

Request to fix my own post (2)

I've tried three times to reduce my own post to bullet format without a signature, [4][5][6] but Pigsonthewing has reverted me. Could someone do that for me, please? It's my own post, so I should be allowed to write it as I want to; it's in the "disadvantages" section, where people were asked to list perceived disadvantages.

I would like it to say: "The lack of reliable sourcing means that imported Wikidata text violates WP:V and WP:BLP." SarahSV (talk) 23:45, 12 January 2017 (UTC)[reply]

As I thought I'd made very clear above, I would object most strongly to such a falsehood being included here. See also the recent comment by User:RexxS. As for "your own post", the page referred to is a communal page; see WP:OWN. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:58, 12 January 2017 (UTC)[reply]
I'm unclear as to why my comment was copied from the above section, into this one. Most irregular. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:06, 13 January 2017 (UTC)[reply]

And there goes the attempt to list the positions of both sides as they perceive them. Pigsonthewing, there are and will be "falsehoods" included in both the benefits and disadvantages sections. If we only include things everyone can agree upon, the list will be very short and not useful at all. The (obviously badly failed) intention was to get to know the opinions, the perceived reality, which could then (here, or in a separate section) be discussed (politely, not by insulting everyone who dares to have a negative opinion of Wikidata). List what people see as the benefits and disadvantages of Wikipedia, not what the "real" benefits and disadvantages are, since no agreement on such a "real" list will ever be found. The "uses" and "discussions" sections should be factual (but there as well, some Wikidata promoters insist on adding "but it will improve" and "look what Wikidata can do" comments), but the disputed section not, as it can't be factual and useful at the same time. Fram (talk) 08:25, 13 January 2017 (UTC)[reply]

Yes, I think this dispute has occurred because people don't know whether to treat the page as a project page (in which WP:CON applies) or a talk page (in which WP:TPG applies). I think it would be simpler if it were a talk page, but is it feasible to convert to such now? — Martin (MSGJ · talk) 08:51, 13 January 2017 (UTC)[reply]
Perhaps you could avoid straw men like "insulting everyone who dares to have a negative opinion of Wikidata"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:09, 13 January 2017 (UTC)[reply]
There is a difference between slight hyperbole and straw men, you know? The number of insulting replies to anything somewhat negative about wikidata on this very page is staggering. Fram (talk) 09:30, 13 January 2017 (UTC)[reply]
Yes, I know the difference, which is why I chose my phrasing so carefully. Please give examples of these perceived insults. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:43, 13 January 2017 (UTC)[reply]
Then you chose carefully but wrong. As for insults, things like "It appears to have become merely a vehicle for some editors to regurgitate half-baked slogans and ridiculous inventions as if they were Gospel." perhaps? Fram (talk) 09:57, 13 January 2017 (UTC)[reply]
No. "Half baked editors" would be an insult; "half baked slogans" criticises - clearly - the slogans. Otherwise, your "cult-like behaviour" canard would be an insult, too. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:02, 13 January 2017 (UTC)[reply]
You don't seem to know the difference between "insults" and "personal attacks" (just like you don't know the difference between hyperbole and strawmen, and between reasoned arguments and I-can't-hear-you dismissals). The quoted sentence criticizes edits, but in an insulting way. "Criticism" and "insult", just like "source" and "wikimedia project", aren't mutually exclusive. Fram (talk) 10:19, 13 January 2017 (UTC)[reply]
It seems reasonable and logical to deduce from your argument that you intended your "cult-like behaviour" comment to be insulting. Should you wish to deny this, please explain the inconsistency of your argument. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:38, 13 January 2017 (UTC)[reply]
Obviously. It was reciprocal for "as if they were Gospel". An eye for an eye and all that nonsense. Fram (talk) 12:43, 13 January 2017 (UTC)[reply]

Can we please all agree to let people add their own idea of the benefits to the benefits sections, and of the disadvantages to the disadvantages section, without anyone else interfering with it? If you prefer, we can ask that everyone signs their entries to make it clear that it is the position of that person and not an "official" enwiki position. If you want to discuss these entries on the main page, start a new section at the bottom; alternatively, dissect any entries you want here to your (polite) heart's delight. Fram (talk) 08:25, 13 January 2017 (UTC)[reply]

While you're removing or altering posts with edit summaries like "Removed pro-wikidata POV" and "please keep your POV out of this"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:12, 13 January 2017 (UTC)[reply]
Yes, these were in the "uses" section, not in the benefits or disadvantages section. Whether a particular use is "The ideal for Wikidata adoption."[7] is POV and doesn't belong there. Fram (talk) 09:30, 13 January 2017 (UTC)[reply]

Pigsonthewing, please don't add incorrect "unsigned" templates. The comment above was mine, not SlimVirgin's. Fram (talk) 09:25, 13 January 2017 (UTC)[reply]

Redux

I see that this false claim has been reinserted. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:46, 18 January 2017 (UTC)[reply]

You mean the SlimVirgin line? Let's see who last readded that line... [8] Hmm, a certain Pigsonthewing apparently. Ever heard of him? Fram (talk) 08:13, 19 January 2017 (UTC)[reply]
Apologies, I'd overlooked that it was the signed version. The whole page is a mess. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:11, 19 January 2017 (UTC)[reply]

We now have - despite my best efforts to make it actually say something factual and useful - an unsigned claim that " Wikidata edits/transclusions (where data is included from Wikidata without a reliable source being provided on ENWP) violate WP:V " As previously explained, this is still utter bollocks. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:47, 23 January 2017 (UTC)[reply]

WP:V requires that all information on a wikipedia article is verifiable by use of a reliable source. When data is transcluded from wikidata into an article, and there has been no reliable source provided, it is unverifiable. 'The data is sourced at wikidata' does not satisfy the burden required by WP:V. Especially since Wikidata (a) has lax sourcing requirements itself, and (b) contains data from other wikis with equally different standards. While everything does not require an inline citation, it does require that a reliable source be used in the first place. Wikidata transclusions do not satisfy that except where the data is already sourced in the article or where someone manually adds the sourcing after. The first is sometimes true, the latter is rarely true. Only in death does duty end (talk) 17:55, 23 January 2017 (UTC)[reply]
Where did I ever claim "'The data is sourced at wikidata' satisfies the burden required by WP:V"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:36, 23 January 2017 (UTC)[reply]
Strange, the policy which directly supports this has been quoted and explained to you. And of course, all claims on the page are "unsigned", not just this one. That's standard procedure. That you don't agree with a claim doesn't mean it should be removed, certainly not when it is a perceived claim in the first place, and one that isn't even subjective but actually based in policy secondly... Fram (talk) 17:57, 23 January 2017 (UTC)[reply]
It is not that I "don't agree" with the claim, it is that it is over-reaching, and demonstrably false, and unsupported by both policy and agreed practice, as I have shown elsewhere on this page. That both you and Only in Death ignore that does not negate it. The "perceived" heading is a cop-out. We no more need to cater for bogus perception here than we do on pseudo-science articles. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:36, 23 January 2017 (UTC)[reply]
We don't ignore it, we don't agree with you. My reading of policy supports the disadvantage as written, your reading of policy somehow doesn't. That has little to do with pseudo-science. "Do not use articles from Wikipedia (whether this English Wikipedia or Wikipedias in other languages) as sources. Also, do not use websites that mirror Wikipedia content or publications that rely on material from Wikipedia as sources." seems pretty clear to me. The only "cop-out" some people use is to claim that Wikidata is not a source somehow, and thus this policy rule is not violated. Which, to use your words, is bogus. Fram (talk) 07:48, 24 January 2017 (UTC)[reply]
Once again, you grossly misrepresent what I have said. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:03, 24 January 2017 (UTC)[reply]
Which part of my reply did this? "You are wrong" is hardly helpful without indicating what exactly is wrong. Fram (talk) 11:12, 24 January 2017 (UTC)[reply]
That would be quite a feat since, despite asking multiple times, you have yet to put forth a coherent argument other than 'it's bullshit, fud' etc. From your rather uninformative opposition I have come to the conclusion you don't think content added on ENWP articles that has been transcluded from Wikidata counts as 'sourcing the data from wikidata'. You see wikidata as a middle-man where all the information is actually sourced from somewhere else. Whereas Fram and I (and to a lesser extent Risker/SV above - correct me if I am wrong) see wikidata as the source from which the material originates - as that is where it comes from when it is used - regardless if it is ultimately referenced/originated on wikidata from elsewhere. This wouldn't be a problem if 100% of wikidata was reliably sourced according to our policies, but barely 25% of Wikidata has a source that could be considered at all (let alone hit our reliability markers), and functionally, even when there are sources available at wikidata, these are not used in ENWP at all because the material displayed/incorporated on ENWP is done absent references. Only in death does duty end (talk) 11:15, 24 January 2017 (UTC)[reply]
"you have yet to put forth a coherent argument other than 'it's bullshit, fud' etc." Now that is simply a lie. Desist. Having refuted the claim, it is not for me to do so again, every time you or Fram choose to ignore that. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:21, 24 January 2017 (UTC)[reply]
The closest you came to actually explaining anything is now in Archive 2 where you basically said that not *all* wikidata edits (interwiki links etc) violate the above policies. Since no one was actually arguing that in the first place it was rather a weak argument and still didn't address the issue of content in articles. I have now trawled through both archives (and on this page) trying to find an edit by you where you clearly state why it is wrong. Other than 'it's false, lies, etc'. If you have done so, please correct me and provide a diff or I shall work on the assumption you have not. Only in death does duty end (talk) 12:49, 24 January 2017 (UTC)[reply]

Data used across many articles

@Waggers:: can you give some examples of "Data that are used across many articles, such as population data or current representatives / ruling parties"[9]? In most cases, such information is used by three or four articles at most; and we already use enwiki templates to reduce the maintenance of such situations anyway. Fram (talk) 11:07, 13 January 2017 (UTC)[reply]

@Fram: Template:Infobox UK place is transcluded by 22,933 articles and Template:Infobox UK constituency by 1,932, and obviously that's just the UK on the English Wikipedia. While an individual data item might only be shared across a handful of articles, updating them all after a new dataset release or general election is almost insurmountable, evidenced by the fact I'm still seeing pages in my watchlist being updated with 2011 census data (previously showing 2001 census data) as recently as last week. You're right that enwiki templates potentially offer an alternative solution, but they're not really designed for handling data and each Wikipedia would have to manage their own version of such templates. In contrast, Wikidata is designed for handling that kind of data and makes it available to all Wikimedia projects. WaggersTALK 10:25, 16 January 2017 (UTC)[reply]
Thanks. Template:Infobox German location (and a few others) automatically takes its population from Template:Population Germany, which is fed by the German Wikipedia. This seems to work without problems. The advantage is that changes to the template:population Germany are tracked onwiki; if the data was located at wikidata, subtle vandalism could only be easily spotted there. At the moment, it seems to me that vandalism spotting at Wikidata is worse than at enwiki or dewiki (not that it is perfect here, far from it). Fram (talk) 10:35, 16 January 2017 (UTC)[reply]
Everybody can switch on showing Wikidata edits on the watchlist in the English Wikipedia (or in any other project, for that matter). Except for some bugs which are currently being dealt with, this seems to work and helps to find vandalism in the articles on my watchlist. The vandalism-fighting per se on Wikidata is obviously much weaker than on the English Wikipedia, since the number of items is bigger, and the active community is way smaller than here.--Ymblanter (talk) 11:46, 16 January 2017 (UTC)[reply]
I had it turned on, and have turned it back off for a number of reasons. I find the edit summaries to be very unclear. I get e.g. a number of edits to Wikipedia talk:WikiProject Cycling which turn out to be completely unrelated edits like [10][11][12]. If I had seen the actual page those edits were to, then I would have noticed that the new labels were nearly identical and that furthermore these pages were of no interest to me, and thus I wouldn't have needed to check them. I do notice that someone added the native name for Peter Paul Rubens, referenced to ... the Russian Wikipedia, even though this value is different to the name used by the Dutch Wikipedia, who probably know better. I could now go and change this, or just shake my head and remember why I don't think using Wikidata is a good idea. I'll turn Wikidata changes on my watchlist back off now, as this is a waste of time for me. Fram (talk) 12:36, 16 January 2017 (UTC)[reply]

Please let us be considerate ...

When you look at both Wikipedia and Wikidata you will find problems in all of them. That is normal. The point is that both Wikidata and any Wikipedia serve the same purpose and it is to share in the sum of all knowledge. Yes, there are violations to be found in any project of the policies of the other project. The point is not that they exist, the point is that we have a way to deal with such problems. This is what talk and consideration of our POV is there for.

When Wikipedia is to use data from Wikidata, it makes sense for us not to vilify each other. The point is that in order to achieve our goal it is best for us to work together. Magnus did some research and it showed that Wikidata is improving a lot on quality by having more and more sources, he blogged about it (recommended reading). I have been scapegoated often for using category data from Wikipedia because it proved to be wrong. The fact of the matter is that our projects are works in progress. We should not use absolutes at each others because the only net result is that we do not achieve our mutual goal and fail to collaborate. Thanks, GerardM (talk) 09:37, 15 January 2017 (UTC)[reply]

It would be useful when you propose some "recommended reading" that you would link to it. Anyway, "When Wikipedia is to use data from Wikidata, it makes sense for us not to vilify each other." is again meaningless. First, replace "when" with "if". Second, it only makes sense to criticize wikidata just because we may (or some people do) use data from it or want to push it here. That's not vilifying (thanks for the very neutral description there). We shouldn't blindly import data just because we share the same goal (somewhat), seeing how our policies and approach differ. Wikidata can be useful for enwiki, but we should tread carefully. Fram (talk) 10:39, 15 January 2017 (UTC)[reply]
"First, replace "when" with "if"." No. You are ignoring the hundreds of thousands of articles on Wikipedia that - as I have recently explained elsewhere on this page - already import data (and here I exclude interwiki links) from Wikidata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:26, 15 January 2017 (UTC)[reply]
That would include those that worked perfectly all right before, but which for some obscure reason have been changed to get the same value (if it hasn't been vandalized meanwhile) from Wikidata? Apart from Authority Control, I see very little that actually gets taken from Wikidata which wasn't already present on enwiki. The added value of Authority Control (ugh, dreadful name) is something that can be debated as well (most of these links would be removed from External links or Further reading since they don't add anything at all), but at least it's something new. Fram (talk) 08:03, 16 January 2017 (UTC)[reply]
The first thing Wikidata brought Wikipedia is a replacement for the interwiki links. The result is a much improved process of maintenance and the work on statements makes disambiguation for these interwiki links also easier.. It proved to be a major improvement. Thanks, GerardM (talk) 11:55, 16 January 2017 (UTC)[reply]
Yes, and...? I haven't really noticed much difference in disambiguations since interwikis were moved to Wikidata. But Wikidata is perfect for interwikis, which are hardly data but the connection between wikipedia versions. I would have no problem with Wikidata if that had remained its scope. Fram (talk) 12:46, 16 January 2017 (UTC)[reply]

Proposal

At this time it seems people are up in arms because of their entrenched positions. So let us seek a middle ground first. Let us have Wikipedia record under water the links to Wikidata on every Wiki link and red link.

Objective

When any link internal to the WMF knows it has a connection to Wikidata, we can more easily make the links and use advanced tools like queries to check for inconsistencies and to increase the quality in all our projects. The tools necessary will be left to user initiative as is the norm.

Impact on current use

The behaviour and the way Wikipedia works remains the same for those who do not opt in.

Only for those who opt in a new red link will prompt to add to an existing Wikidata item or to add a new item. It will be possible to add statements and references. When a new link is created, the associated item will be added and only the possibility of adding statements is added. For existing links there will be a page that shows existing links and associated statements.

Arguably, this page may be there for everyone to see, and it should have an easy option to opt in or out for the full functionality.

"Let us have Wikipedia record under water the links to Wikidata on every Wiki link and red link." Why would we record the link to Wikidata on every bluelink as well? We already have an article, why clutter the pages with wikidata indications? As for the redlinks, why would we link to an unreliable wiki and not to any other site? We can add a tool comparable to what is used at AfD, with links to Google, Google News, ... and add wikidata there as well. Why would we see Wikidata as some super-privileged partner that needs linking on every link we have? Just because it is also a WMF project? No thanks. Feel free to change Wikidata to have a "what links here" from all wikis, with an indication if it is a redlink or a bluelink, no one stops Wikidata from doing this. This would make more sense than putting this on every Wiki, adding "this exists on Wikidata!" to every bluelink and some redlinks on every page.
"a new red link will prompt to add to an existing Wikidata item or to add a new item." Again, why not do the reverse? Send editors from wikidata to enwiki (and the other language wikis), don't siphon editors away from enwiki to wikidata. It makes no sense to tell editors on what is normally the most visible page about a subject to edit a much less visible page instead (Google returns wikipedia pages first, and wikidata much, much lower in the rankings; and the number of pageviews is similarly different). All Wikidata does with this is duplicate effort. Which seems to be what Wikidata has largely become; a duplication of effort for little benefit. Still, anyone is free to edit Wikidata, just don't try to use enwiki as a massive recruiting ground in your "a middle ground proposal". Fram (talk) 10:39, 15 January 2017 (UTC)[reply]
You do not get "opt in". You do not need to see any link to Wikidata in Wikipedia. When you are just reading an article you will not be bothered with any of the information that is hidden. Even with red links everything will be invisible to you when you read or edit. When you opt in, there is the option to link to Wikidata. How that works is for the UI people to decide, but the point is that it is there only for the people who care about quality and understand how Wikidata can help. Thanks, GerardM (talk) 11:54, 15 January 2017 (UTC)[reply]
By the way, this is not about recruiting at Wikipedia. Far from it. I often use Wikipedia data and I often do not edit Wikipedia because of all the hassles I get. When the work I do is in line with what I do at Wikidata, it is more easy, more obvious. When you consider duplicate effort, it works both ways. By making it easy and obvious how things are changed and why we both gain in quality. In the example of IMDB there is one big difference; Wikidata is linked to any and all Wikipedias and consequently when best practice is there for all of us we all win in quality. Again, without adding anything obvious for those who do not want this. Thanks, GerardM (talk) 12:19, 15 January 2017 (UTC)[reply]
You are mixing the confusing and the insulting here. "it is there only for the people who care about quality": I think most people who commented here and elsewhere and are (very) critical of Wikidata and its role on enwiki, do so just because they "care about quality". For you, on the other hand, the only indication of quality you have discussed so far is the superior quantity of Wikidata, and then only if you count articles vs. items and somehow pretend that this count is in any way meaningful.
In the past I blogged about errors in Wikipedia where I found an error rate of 20% in the links. What I want is tools to fix them. This is what my proposal is about. Just have us work on links and red links. It will improve the quality in both our projects and if this is not your thing, do not be bothered having us fix things. My proposal is to grow together, if you do not want infoboxes for now, I do not care. It is probably not the time yet. Thanks, GerardM (talk) 07:08, 16 January 2017 (UTC)[reply]
I think I can safely ignore you now. You took one article, and found an error in 2 of the 19 links, and then claim as title "#Wikipedia - a 20% error rate" and in the body "With such statistics it is obvious to make the argument that replacing links with links through Wikidata will enhance quality in the English Wikipedia.". For starters, 2 out of 19 is not 20% at all, it is just a bit more than 10%. Then, using one article is not "statistics", it is "anecdote". Finally, you then, just like you do here all the time, make the totally unwarranted claim that "replacing the links with links through Wikidata" would enhance quality miraculously, without any indication that this is actually true. I notice that you didn't edit the Wikipedia article, but at the time of your blog post added some items to Wikidata (like [13]), making this a self-fulfilling prophecy and an even less reliable blog (should I write a blog about items missing or wrong at Wikidata but better at enwiki, if I first make sure that these "facts" are true by editing enwiki in this regard?). Basically, you are making your own truth and then presenting it to the world as if it were fact, with a rather extreme calculation error which just happens to make your point twice as strong. And your change to Wikidata has not improved enwiki at all, leaving the errors in an article. Fram (talk) 08:22, 16 January 2017 (UTC)[reply]
I am not talking about replacement. What I am after is that at first the links include both the item and the article. There is nothing miraculous in what I propose. It is how we can use the tooling that is possible through Wikidata.
When I write my blog, it is because I care about what we do. Our aim is to share in the sum of all knowledge and I know we can do a better job. The problem is that the arguments are there, and never mind whether it is 10% or 20%, it is by that number that we improve that one article. If that is not relevant, what is? Thanks, GerardM (talk) 12:00, 16 January 2017 (UTC)[reply]
"I am not talking about replacement"? My apologies then, it turns out that that blog post is even more useless than I thought, as a sentence like "it is obvious to make the argument that replacing links with links through Wikidata will enhance quality in the English Wikipedia." for some reason gave the very strong impression that you were actually talking about replacement. Perhaps first make up your mind what it is you actually want. I can't really understand what it is you are trying to say with the rest of your reply, all I notice is that you noticed a few problems in one Wikipedia article, then went to add some items to Wikidata, only to be able to then tell us (and this is for once not a quote but a paraphrase) "look how poorly enwiki is doing and how much more Wikidata has on the same, aren't we superior?". No, you aren't, not by a long stretch. Perhaps try to explain instead how the addition of the wikidatalink (before you actually added the wikidata item to prove your point) would have prevented or corrected any of the problems in that one article you used to miscalculate statistics. Fram (talk) 12:46, 16 January 2017 (UTC)[reply]
"understand how Wikidata can help" That's what this discussion is trying to do. All you seem to offer is "Wikidata has an item, and if it hasn't you can create it". The question remains: how does this help enwiki? "this is not about recruiting at Wikipedia", but in the previous post you said "a new red link will prompt to add to an existing Wikidata item or to add a new item." Prompting people to edit Wikidata = recruiting at Wikipedia, no? You "often do not edit Wikipedia because of all the hassles I get. When the work I do is in line with what I do at Wikidata, it is more easy, more obvious." Fine, my experience is the opposite, but that's personal preference I presume. "When you consider duplicate effort, it works both ways." But this isn't true of course. You have to edit enwiki anyway, as there is a lot that is needed and wanted here but which can't be put in Wikidata (and Wikidata isn't intended to be "read" as an article anyway). E.g., as far as I can see, Wikidata only wants entries for the children of someone if these are also notable (otherwise they shouldn't get an item). But in a Wikipedia article, you often mention all the children someone had, whether they were notable or not. Enwiki simply has much more detail. So editing enwiki without editing Wikidata makes perfect sense; but the other way around is a waste of time, adding something there when you notice it missing here is a bit pointless. "By making it easy and obvious how things are changed and why we both gain in quality." After four years of Wikidata, how much quality gain has enwiki had from it? There is authority control, for what it's worth; and very little apart from that. Why would we believe that somehow Wikidata will make enwiki qualitatively so much better in the future when it hasn't delivered on that promise until now? Please present some actual examples, not just some vague hopes.
"In the example of IMDB there is one big difference; Wikidata is linked to any and all Wikipedias and consequently when best practice is there for all of us we all win in quality." How? Best practice at enwiki and other wikis is not the same as best practice at Wikidata. Wikidata accepts findagrave, quora items, ... so it doesn't look to me as if we share "best practices" at all, or that Wikidata sets an example of quality we should import or follow. That Wikidata is linked to all wikipedias is not a measure of quality, it is just a result of its first purpose, being a container for interwikilinks. But that container has spawned a monster which seeks a purpose to give any value to its size and effort (and cost? No idea how much Wikidata has cost so far).
We are working for the readers, not for ourselves; and the readers are at enwiki, not at Wikidata. If Wikidata can truly improve enwiki, then feel free to give us actual examples of this. Just indicating that "Wikidata has more items and is linked to everything, so it is better and produces quality" is to me at least far from convincing, and doesn't match what I see when I go to Wikidata. Fram (talk) 21:07, 15 January 2017 (UTC)[reply]
"The behaviour and the way Wikipedia works remains the same for those who do not opt in." - this is 100%, absolutely, demonstrably not true. To take even a small example- any article that has an infobox that is now pulling data from wikidata has the (strong) potential to pull in unwanted data, not to mention incorrect data. The article writer didn't have to "opt in"- infoboxes are "opt out" to begin with, and you also have to be editing the articles in question after the infobox got changed under you. Just as an example- I rewrote and got to GA Hugo Award back in 2011. I did not put in a country parameter in the infobox- and why would I? The award is a multinational, purportedly worldwide award, that happens to have its trademark-holding body based in the US, so listing a "country" would be misleading at best. Not every field needs to be filled. Oh, but look! Apparently someone, at some point, added country=US into wikidata (source: Wikipedia), so... now it shows up in the infobox. Not my watchlist, of course, or the page history- wikidata edits get special privileges in that regard. The only way to get rid of it is to make a new edit to opt out of that one wikidata item, or to delete the incorrect wikidata item- though, of course, wikidata tries to shame you into not doing so and gives no indication at all that trying to delete it twice will have it work the second time.
I'm all for turning wikipedia's structured infobox data into a searchable database. But if wikipedia pages are going to be based on that database (in part), then edits to that database need to be tied in completely to wikipedia, not just halfheartedly imported. This is looking like just one more example of the wikimedia project's knack for taking really good ideas and failing them with poor technical execution. --PresN 03:16, 16 January 2017 (UTC)[reply]
When you restrict yourself to my proposal; ie having items associated with wiki links and red links only, you will not see any results if you do not opt in. Thanks, GerardM (talk) 07:02, 16 January 2017 (UTC)[reply]
But you still haven't explained why we would (opt-in only) add a pointer to Wikidata (which certainly for bluelinks seems completely superfluous anyway, who cares on enwiki whether we have an item in Wikidata when we have one here), and not e.g. a pointer to Google Books, Britannica, IMDb, whatever, since each of them may be more useful and/or more reliable than Wikidata. On Wikidata, it took more than an hour yesterday before anyone noticed that their page on Superman (not really an obscure page on enwiki) had been "moved" (renamed) to "UGLY". Why would we want to point to (never mind import data from) a site which is still so easily vandalized and clearly much less well patrolled? Fram (talk) 07:52, 16 January 2017 (UTC)[reply]
When you suggest that I do not care about quality for both our projects, you are a fool. When you suggest that Wikidata is no different than IMDB among others, you are foolish. As I showed before, with my proposal you do not need to experience how we will make a difference, the only question is will you let us do a better job for both our projects. Thanks, GerardM (talk) 12:03, 16 January 2017 (UTC)[reply]
This seems to be a reply to another post? I didn't say that you didn't care about quality here (although your actions show that when you identify problems on enwiki, your only care seems to be to improve Wikidata to prove the point that it is better somehow), I said that you seem to consider quantity an absolute indicator of quality, which is nonsense. I also didn't say that Wikidata isn't different than IMDb, just like I didn't say that IMDb, Britannica and Google Books aren't the same (which would have been a foolish claim indeed). What I said was that there are a lot of sources we could link to, and you single out Wikidata for no clear reason, even though it is an unreliable site where most of the information comes from Wikipedia to start with. "As I showed before, with my proposal you do not need to experience how we will make a difference, the only question is will you let us do a better job for both our projects." You haven't shown anything so far, actually, you have just claimed that your proposal will somehow make things better for enwiki (it may make things better for wikidata, by indicating which bluelinks don't have a wikidata item already, but that is not what this discussion is about). "will you let us do a better job for both our projects." If you can convince me that you will do a better job for enwiki, yes, why not? So far nothing you said has given me the idea that you actually will do a better job for enwiki though, you haven't given any examples of actual improvements which would happen through your proposal. Just give a few concrete examples of how enwiki would benefit from your proposal. Fram (talk) 12:56, 16 January 2017 (UTC)[reply]

Authority control

Template:Authority control is added to more than 500,000 pages so far. At first glance, this seems like a good thing. I do see some problems with it though:

  • The name: it suggests that either some authorities control the page it is listed on (false), or that the page has been controlled by checking it with these authorities (also false). What it does is list some authorities which could be used to perhaps control some aspects of the page, if you are so inclined. Perhaps it should be moved to the talk page of articles instead of being on all these articles?
  • The contents: many of the links in the template add no value at all to the article, and would not be accepted for that page in the "external links" or "further reading" sections. Then why do we add them automatically through a template anyway? Some examples:
    • Marc Sleen, which already has 14 references and an external link, has 9 AC links. WorldCat is interesting, VIAF doesn't add much that WorldCat doesn't, LOC is useless here, ISNI gives "Your data limit has been reached." (first time I visit this!), Deutsche Nationalbibliothek is useless for a Flemish author on enwiki, IDRef can be removed as well, BnF is somewhat interesting, and Libraries Australia is again pointless.
    • Cromwell Dixon: Worldcat again interesting, VIAF not, LCC not really interesting either
    • Jan Van Eyck gets loads of AC links, including things like the wiki Musicbrainz, which gives us... a copy of the Wikipedia article[14]. Useless stuff again includes Libraries of Australia[15], some Swedish site[16], some Polish site[17], a Japanese site[18], ... many of which point to Wikipedia if you want more information. Added value? Zero. Added clutter? Too much.

If Authority control is supposed to be useful, it should be curated and much more restrictive. The blind posting of every identifier someone adds to Wikidata just because it is about the same person, but not because it adds anything at all for readers, is something that would not be allowed in normal editing, but is somehow acceptable because it is Wikidata-driven. Worldcat seems to be the only one that is consistently useful, all others seem to be dependent on the subject (e.g. Deutsche Nationalbibliothek should only be added to people with some link to German or the German language) or perhaps even never useful at all. As it stands, the disadvantages seem to outweigh the advantages, which are rather small to begin with. Fram (talk) 08:59, 16 January 2017 (UTC)[reply]

I wonder, it is easy enough to make a selection of the external identifiers that are included. My question to you is, do you understand why it is a good thing to link to these external sources. Do you appreciate that what is enough for you may be too little for someone else? How do you find what it is that is useful. When the data is available it could be a personal option to increase or restrict what is shown. One caveat, the caching involved. Thanks, GerardM (talk) 12:07, 16 January 2017 (UTC)[reply]
As far as I can see, I can't make a per-page selection of which identifiers to include (e.g. leave DNB for German-related topics but remove from others). I think I have well explained why it is a good thing to link to some of these external sources, on a case-by-case basis, and why some others are not a good idea. Feel free to indicate which of the ones I didn't find useful in these examples would be useful for someone else on enwiki, and why. "How do you find what it is that is useful." Well, that's what we did for 15 years (and still do) in our external links and further reading sections. We decide this on a page-by-page basis, with talk page discussion if needed (though such a discussion is rarely needed); we don't dump the same indiscriminate list of reliable but in many specific cases useless links to all 500,000 pages with the same template. At least enwiki still gets to decide which entries get on the indiscriminate list and which don't, so we don't have to put up with really crap links like Quora. But even so, getting an ID which says "yes, the subject exists, for more information check Wikipedia" will hardly be useful for anyone. Fram (talk) 13:04, 16 January 2017 (UTC)[reply]
"same indiscriminate list of... links to all 500,000 pages with the same template" This is not what the template does. Yet more bullshit. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:53, 16 January 2017 (UTC)[reply]
Then please give some examples of articles which don't do this. Don't just tell people "you're wrong", show it. As far as I know, the template has a list of what, 20+ possible links, and every one of them gets added to the article if a value is given in Wikidata, no matter how useful or useless that link may be for our readers (and no, this isn't just "personal preference", a link which gives nothing but a Wikipedia page is not a useful or even acceptable link). Fram (talk) 14:19, 16 January 2017 (UTC)[reply]
So, as you now explain, not the "same indiscriminate list of... links to all 500,000 pages with the same template". Your "nothing but a Wikipedia page" is yet again FUD. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:34, 17 January 2017 (UTC)[reply]
You keep bleating 'FUD' but don't actually provide any rebuttal of substance. If there is any FUD coming here it's from your end. Only in death does duty end (talk) 10:42, 17 January 2017 (UTC)[reply]
Indeed. I gave the example of Jan van Eyck and his MusicBrainz link above. This is "nothing but a Wikipedia page". None of the tabs give any additional information at all. Oh wait, yes, it has a link to discogs.com[19] (in itself not a reliable site either): but the information on this page is not about Jan Van Eyck, the 15th century painter. The person who is meant here is Jacob Jan Van Eyck instead. So you get not just a Wikipedia page, but a Wikipedia page and, if you look carefully, a link to an incorrect Discogs page. Why should I welcome this link in any way? Fram (talk) 15:43, 17 January 2017 (UTC)[reply]
WP:EL. Only in death does duty end (talk) 13:23, 16 January 2017 (UTC)[reply]
":it suggests that either some authorities control the page" More FUD. Read the help page to which that phrase is linked in every instance of the template (and note that the phrase, our use of it, the template, that link, and the linked help page, all pre-date Wikidata). "the blind posting of every identifier someone adds to Wikidata" this does not happen. More bullshit. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:51, 16 January 2017 (UTC)[reply]
At the template, we control (for now) which of the identifiers get used, and which don't. For all articles that have the template, indiscriminately. We don't only use Musicbrainz for music-related articles, DNB for Germany-related articles, and so on. If I add a musicbrainz identifier to Wikidata for Margaret of Valois, then that will automatically be shown here, without any change to the page history, and without any benefit for the readers. Ridiculous? Not really, such an ID exists[20]. Fram (talk) 14:16, 16 January 2017 (UTC)[reply]
"for now" More FUD. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:26, 16 January 2017 (UTC)[reply]
Thanks for your constructive criticism and helpful examples. Fram (talk) 19:08, 16 January 2017 (UTC)[reply]
See authority control. It is the standard term for efforts to assign unique reference identifiers or standardised indexing forms for things that may have many names. Jheald (talk) 13:01, 17 January 2017 (UTC)[reply]
Thanks. This explains why the name was chosen. I still think it is a bad name for the user-facing side of it (for readers, it is interesting that we have links to other, reliable institutions with information about the subject: it is for most readers not interesting that abbreviation X uses ID Y for the same subject, and that this ID-giving is called "authority control"). Fram (talk) 15:43, 17 January 2017 (UTC)[reply]
  • So I spent 15 minutes trying to use the Authority Control template at Jan Van Eyck to exclude some links (which are inappropriate). If it's possible, I don't see how without removing the template entirely, as there is no alternative (unlike infobox person). I can manually set them, but then it flags up big red errors. This is obviously a problem (not so much for a dead artist) on articles where the ACs may be incorrect or link to a different person, or just be completely useless links that shouldn't be there per WP:EL. Only in death does duty end (talk) 17:30, 23 January 2017 (UTC)[reply]

Quality improvements through Wikidata lists

The intention of some people is to have e.g. lists which get maintained at Wikidata, not here. Somehow this would be ensuring better quality by some magic process. Let's take an example of such a possible list, the Judith Wright Prize.

In reality, the ACT Judith Wright Prize was awarded between 2005 and 2011[21]. While we lack the 2005 and 2011 winners, our list of winners is correct otherwise.

According to Wikidata, the list of winners is a lot longer though, it has 22 entries.

  • Peter Boyle, correct
  • Adrian Caesar[22] (oops, he was only shortlisted)
  • Alan Gould[23] (oops, only shortlisted as well)
  • Barry Hill, correct
  • David Brooks (author)[24] (oops again, shortlisted)
  • Diane Fahey, correct
  • Emma Jones (poet)[25] (oops, "commended", not won)
  • Felicity Plunkett: commended, not won
  • J. S. Harry: highly commended, not won
  • Jan Owen: highly commended, not won
  • Jaya Savige: shortlisted, not won
  • Marcella Polain: shortlisted, not won
  • Martin Harrison: shortlisted, not won
  • Petra White: commended (twice), not won
  • Philip Hammial: shortlisted, not won
  • Sarah Holland-Batt: correct
  • Jordie Albiston: highly commended, not won
  • Susan Hampton: correct
  • Elizabeth Campbell: shortlisted, not won
  • Brendan Ryan: shortlisted, not won
  • Ella O'Keefe[26]: ouch, she was never involved with the ACT Judith Wright Prize in any way, she won the Overland Judith Wright Poetry Prize which is a completely different award (not even the successor of the ACT one, as both existed separately between 2007 and 2011 or thereabouts)
  • Melody Paloma[27] same problem as O'Keefe

So, GerardM, perhaps you can now write a scientific, statistically totally sound blog post about how Wikidata has a (22 entries, 5 correct) 77% error rate for these BLP data (with most of the errors going back 2 years, but some added only today by yours truly?)? And that Enwiki, in its one "item" about the Judith Wright Prize, had a 100% correct rate instead? Just imagine that we had replaced our local list with a Listeriabot shared list, in our quest to improve quality by using the bigger, more linked database? Perhaps the saddest thing is that our list already existed before you made all these errors in Wikidata. Fram (talk) 14:08, 16 January 2017 (UTC)[reply]
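For anyone who wants to check the percentages quoted in this thread, here is a minimal sketch of the arithmetic. It assumes only the counts given above (22 Wikidata entries of which 5 are correct, and the blog's sample of 19 links of which 2 were wrong); the function names are made up for illustration.

```python
from fractions import Fraction


def error_rate(total: int, correct: int) -> Fraction:
    """Share of entries that are wrong, kept as an exact fraction."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need total > 0 and 0 <= correct <= total")
    return Fraction(total - correct, total)


def as_percent(rate: Fraction) -> str:
    """Format a fraction as a percentage with one decimal place."""
    return f"{float(rate) * 100:.1f}%"


# Counts taken from the list above: 22 entries, 5 marked correct.
wikidata_list = error_rate(total=22, correct=5)

# Counts taken from the blog post discussed earlier: 19 links, 2 wrong.
blog_sample = error_rate(total=19, correct=17)

print("Wikidata winners list:", as_percent(wikidata_list))
print("Blog sample of one enwiki article:", as_percent(blog_sample))
```

Both figures still rest on a single sample each, so they describe these two lists and nothing more general.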

@Fram: Can you do the same exercise by filtering the statements having at least one reference which is not Wikipedia ? Snipre (talk) 13:20, 17 January 2017 (UTC)[reply]
Well, the prize has no reference[28], and of the 22 winner entries the other 20 have none either; only O'Keefe and Paloma have a real reference, but like I said these two are wrong anyway, as it is about a different prize. Fram (talk) 15:33, 17 January 2017 (UTC)[reply]
Very unfortunately these sort of problems have been going on for years and years and nothing effective seems to be done. It seems to be left to people noticing errors to correct them and not up to those creating them to improve their faulty mass importing. See, rather notably, wikidata:Wikidata:Project chat/Archive/2014/08#Vandalism? and wikidata:User talk:GerardM/Archive 1#!!!!! What !!!! (where crowds of Israelis, not just Israeli politicians, were completely wrongly marked as "religion Islam"), and, ongoing, wikidata:User talk:GerardM. Thincat (talk) 16:22, 17 January 2017 (UTC)[reply]
Damnit, I will have to remove 'alter Jews to religion Islam and vice versa' off my 'how to vandalise Wikidata and have it disseminate erroneous information' list. Won't be original now... Only in death does duty end (talk) 16:32, 17 January 2017 (UTC)[reply]
I noted that people are complaining about his edits at the Wikidata project chat (section "linguists", where an error rate of 37% for some of his edits gets greeted with a resigned "Usual for GerardM"), and at his own talk page (section "it's frustrating"), where four persons are complaining about his general approach to editing. He "defends" himself with a link to a recent blog post he made[29], where he is lambasting Wikipedia for adding a wrong award to Clare Hollingworth, only to have to add in a second comment below the post that actually, enwiki was right all along anyway. But he doesn't care about making errors, since they will be solved someday anyway. Nice attitude, explains his 77% error rate I highlighted above, or his merge of [30] and [31] soon afterwards (one is a hill in a district, the other is the district...) He probably meant to merge [32] instead (two other articles about the same district), but now he has added incorrect interwikilinks (so even these get corrupted by Wikidata or GerardM, and no one notices it because they don't turn up on our watchlist). Other discussions at his talk page, like the "Anna-Kristin Ljunggren" section, indicate that he doesn't understand (or care) what he is doing, damaging wikipedia (in this case Norwegian) in the process. Correction of errors that get mentioned doesn't seem to be something that interests him. But we shouldn't criticize him because he has over 2 million edits... Frightening! On enwiki, we could restrict or block him, but we can't control who edits Wikidata, and so e.g. known BLP violators could still easily edit enwiki through Wikidata lists, infoboxes, ... if these would become accepted. Fram (talk) 08:31, 18 January 2017 (UTC)[reply]

Some comments

First, I do make mistakes. Obviously. The big thing is that what I am after is not for Wikipedia to import any data from Wikidata; I am interested in getting links associated with wiki links and red links. The first objective is to bring Wikidata tooling to Wikipedia so that from Wikipedia we can more easily find inconsistencies and associate the links with statements at Wikidata. For all kinds of reasons there will be inconsistencies. Some of them will be editorial and some are just wrong. Once it is easier to do these things more and more effort will go into reconciling the differences between both Wikipedia and Wikidata. Again, this does not change the experience of Wikipedia at all.

Second, if and when a Wikipedia decides to use Wikidata directly is up to them. I do not care really.

Third, it is assumed that everything has to be perfect. It is not and Wikidata is much better than it was before but for it to grow there has to first be something to grow and nurture. At the start I have imported a lot from Wikipedia categories and some checks and balances were in there. The problem is that the categories are not consistent and this introduces the problems that were introduced. I am no better at finding after the fact what went wrong because there is no record and much of the tooling has changed a lot. I made errors but I think they are within the range of what can be expected of either human editing or bot editing. The crux is that I have been bold and it resulted in a lot of content, content that allows Wikidata to grow and prosper.

Fourth, it is nice to see that I am vilified. Talk about BLP. I do not care but I do care that it is used to deflect from what is proposed. Then again do your worst. It is not an argument that makes you stand strong quite the contrary, it gives me the impression that you do not know what you are talking about, that you only have an axe to grind. Thanks, GerardM (talk) 07:22, 19 January 2017 (UTC)[reply]

BLP doesn't mean that editors can't be criticized for their editing errors, sheesh. But thanks for confirming that what you are interested in with your proposal is making Wikidata better (or at least larger), not so much making enwiki better. What the purpose would be of making Wikidata better in this way remains unclear, but that "more and more effort will go into reconciling the differences between both Wikipedia and Wikidata." is a problem, not a solution.
Again, you prove that you do not understand how this works. Fine. Thanks, GerardM (talk) 10:14, 19 January 2017 (UTC)[reply]
Then make a better effort at explaining it. To use your logic and statistics, how would replacing enwiki lists with an 11% error rate with wikidata lists with a 77% error rate actually improve enwiki? Or, to go back to your proposal, how would the system even know, if you had a redlinked "Overland Judith Wright Poetry Prize for New and Emerging Poets" in a Wikipedia article (which happens in List of Australian literary awards), that it was supposed to go to this unsourced, incorrectly named item you just created? Before the system can automagically link the items, a lot of work will need to be done, since just saying "they have the exact same name" will often not help either. You have created at Wikidata multiple indistinguishable items "Helena", which no system could ever reliably choose between if you have a redlinked Helena at enwiki. You would need a lot of AI to decide, based on what page the redlink is found on, whether any item at Wikidata is the right match (if it exists at all, which is usually not the case anyway). There are two Edgar Millers at Wikidata, but neither is about Edgar Miller (psychologist). Even if you had AI, take Klaas Boot. A redlink at Dutch Sportsman of the year, winning the award in 1956 for Gymnastics. Wikidata has two persons named Klaas Boot, Klaas Boot sr. and Klaas Boot jr. Assuming the link system would have found the match between the names, it would then at best link to Sr., who is at Wikidata described as a "Dutch gymnast", and not to Klaas Jr., who is described as "Dutch television presenter" and makes no mention at all of gymnastics or sport or awards. Too bad that the correct link would have been to Jr., not Sr. Basically, what you propose is a heavy piece of programming, which will be hard pressed to give good performance and good results at the same time, but which will be unlikely to result in more or better articles.
In most cases, one will need to look for reliable sources (through Google and the like) anyway, since usually these are lacking in Wikidata. And to get to the articles in other languages usually will be much faster through Google as well, or by using the existing interface (for the Klaas Boot example, a redlink on Dutch Sportsman of the year? Go to the Dutch version of the article), and there you are certain to find the right Klaas Boot (https://nl.wikipedia.org/wiki/Klaas_Boot_jr.).
In the end, your proposal is practically unfeasible and would even when implemented not significantly improve enwiki even in the long run, nor would it help our readers in most cases. The cost would be way too high compared to the potential benefits. Fram (talk) 11:20, 19 January 2017 (UTC)[reply]
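The matching problem described in this comment can be sketched in a few lines. This is only an illustration under stated assumptions: the records are hypothetical in-memory stand-ins (made-up identifiers, not real Wikidata Q-numbers), both Klaas Boot items are assumed to share one label, and the scoring rule is a deliberately naive word overlap, not anything GerardM proposed or any existing tool implements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str       # made-up identifier, not a real Wikidata Q-number
    label: str
    description: str


# Hypothetical records modelled on the examples in the comment above.
ITEMS = [
    Item("hypothetical-sr", "Klaas Boot", "Dutch gymnast"),
    Item("hypothetical-jr", "Klaas Boot", "Dutch television presenter"),
    Item("hypothetical-h1", "Helena", ""),
    Item("hypothetical-h2", "Helena", ""),
]


def match_redlink(title, context_words, items):
    """Return (item, reason); item is None when no defensible choice exists."""
    candidates = [i for i in items if i.label.lower() == title.lower()]
    if not candidates:
        return None, "no item with this label"
    if len(candidates) == 1:
        return candidates[0], "unique label"

    def score(item):
        # Naive heuristic: count description words that also occur in the
        # context of the page holding the redlink.
        return len(set(item.description.lower().split()) & context_words)

    ranked = sorted(candidates, key=score, reverse=True)
    if score(ranked[0]) == score(ranked[1]):
        return None, "ambiguous"
    return ranked[0], "best description overlap"


# Redlink on an award page about a gymnastics winner (assumed context words).
item, reason = match_redlink("Klaas Boot", {"gymnast", "sportsman"}, ITEMS)
print(item.item_id, "-", reason)

# Identical labels with nothing to tell them apart.
print(match_redlink("Helena", {"saint"}, ITEMS))
```

With these assumed records the heuristic settles on the Sr. item because its description mentions "gymnast", which by the account above is the wrong person, and it gives up entirely on "Helena". A better matcher would need sourced statements on the items, which is the part said to be missing.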
As for your problematic edits listed above (and those discussed by others at Wikidata), these are not edits from the start of Wikidata, these are edits from late 2014 to just the last few days. You are quick to write a blog where you incorrectly claim that wikipedia statistically has a 20% error rate, based on one sample where it had an 11% error rate; but when it is pointed out that a similar sample based on your recent edits shows a 77% error rate, you start about BLP, vilifying, invalid "everything has to be perfect" requirements, and so on. But when everything is explained to you at length here, and you then still succeed in adding the website of the wrong Overland Judith Wright Poetry Prize to the article of the Judith Wright Prize[33] then I don't think any improvements may be expected. Oh, FYI, the correct website would be this.
"I made errors but I think they are within the range of what can be expected of either human editing or bot editing. The crux is that I have been bold and it resulted in a lot of content, content that allows Wikidata to grow and prosper." No, you have been bold again and again, resulting in way too much invalid content, often related to BLPs, which makes Wikidata (and its reputation) worse and which seriously reduces the appetite to include Wikidata data here (and which in most cases you left to others to correct even when the problems were pointed out). You are one of the most prolific Wikidata editors, and can continue largely unchecked. Someone with your track record would have been long restricted or blocked at enwiki. If Wikidata doesn't handle these kind of edits and editors any better, then it just can't be trusted enough to be used as a datasource, and your proposal (and comments) here is just a waste of time. Fram (talk) 08:08, 19 January 2017 (UTC)[reply]
You do not know my track record. It is much longer and probably complex than you expect. Never mind, you want an argument why cooperation would benefit English Wikipedia. I have one for you. When you react, do not talk about me, that is not relevant, talk about the point that I make. Thanks, GerardM (talk) 09:40, 21 January 2017 (UTC)[reply]
Read it, don't see how your lofty conclusions follow from the proposal you make. When you have a redlink, you can a) write an article about it, or b) try to match it to Wikidata based on whatever, then (usually) write a Wikidata item because none exist, or find good sources to verify and expand the Wikidata item, and then write a Wikipedia article based on this. For some reason, it looks as if A is a lot more logical and productive than B. Perhaps for small wikipedias this may be different (although even then most of the information will still need to be researched anyway, so why bother with Wikidata in the first place?), but for enwiki (and most other large wikis), going to Wikidata as the first port of call makes little or no sense. And what the gender gap has to do with all this? Yes, after you have added links to Wikidata to all redlinks on enwiki, you can probably calculate how many of these are about women and how many about men (and how many about neither). Of course, in that time you could simply have written many articles about women, if that is your main interest. Or about people from the Non-English speaking, non-Western world, because as far as I can tell we have a much larger globalization gap than a gender gap. Fram (talk) 14:45, 21 January 2017 (UTC)[reply]
I would think the "globalization gap" is precisely where wikidata can help. I've been working on organization-related data and see this all the time. For instance there are hundreds of universities in Indonesia that have an entry on the 'id' wikipedia but no other wikipedia. Wikidata entries based on those idwiki pages now at least give you basic information on name, website, maybe location, type of institution, etc. and a link to the 'id' article from which you could create at least a brief English translation. Similarly for many institutions in Brazil, or even some European countries. Surely it is better for enwiki to have at least a stub of information on a legitimate organization based on what can be gleaned from wikidata than to have no hint it even exists? ArthurPSmith (talk) 15:13, 21 January 2017 (UTC)[reply]
No. Without reliable sources for such subjects, we are better off without an article than with poor stubs based on an unreliable source. And to find such universities, it is easier to check the enwiki lists, and to follow the interwiki link to e.g. the Indonesian wikipedia to find more complete lists. Finding it on Wikidata is not really user-friendly. E.g. the first redlinked one I find has an article on the Indonesian Wikipedia, [34] but not on Wikidata. I found it through Google. So why would we link all redlinks to Wikidata and not to e.g. Google, which has much more information than Wikidata and more often points to reliable sources as well? Fram (talk) 17:52, 21 January 2017 (UTC)[reply]
@Fram: Why don't we link to Google, Britannica, or other external sources when they have better articles on a given topic than we currently do? Because they're external links, and we prefer internal links to Wikipedia articles, even if they're stubs/worse than external sites, with the hope that someone will then come across that article and help improve it so it is better than the external site. The same applies to links to Wikidata - they are internal links within the Wikimedia projects, and they can be improved by pointing visitors towards them and asking them to improve them. In the long run, hopefully we'll have article placeholders that will even present that information inside the English Wikipedia - but in the meantime, it's better to point towards the wikidata entry instead. Thanks. Mike Peel (talk) 23:14, 21 January 2017 (UTC)[reply]
I agree with Fram here. It is better to wait until reliable sources are available and/or a reasonably complete and informative article can be written, rather than having a small stub with sources of doubtful veracity and usefulness. Sometimes a redlink can prompt the writing of a proper article, in ways that a stub doesn't. It is difficult to be sure, and I wish proper studies had been done on this (maybe someone has studied this?). Carcharoth (talk) 00:56, 22 January 2017 (UTC)[reply]
@Fram: d:Q12486663 is the wikidata entry for id:Institut Teknologi Sumatera - and it has sat there as an entry with no attached data other than the idwiki link since 2013. If even one other wiki had cared to link to that wikidata item in some way we would likely have considerably more information about that institution in wikidata, and available to every other wikipedia simultaneously. Obviously all the wikipedias and wikidata are works in progress, I think the big question here is how do we most encourage that progress so that all the world benefits? Your approach of essentially "isolationism" strikes me as simply fundamentally the wrong way to go. It leads to duplication of effort, unresolved discrepancies and errors, and a far greater maintenance challenge in the long run. ArthurPSmith (talk) 18:06, 22 January 2017 (UTC)[reply]
How would having the differently named redlink on enwiki linked to that Wikidata item have made any difference? "we would likely have considerably more information about that institution in wikidata" how? This seems like more wishful thinking. At the moment, it looks to me as if Wikidata would not reduce maintenance or the error rate (just as often, it would spread errors to other wikiversions). It may be useful for smaller wikis (considering that something like the Volapuk wikiversion is happy being filled with botcreated articles to inflate their article count, I guess they would be very happy with Wikidata-generated articles and data as well), but for enwiki, it makes little sense. It seems more logical to use the big wikiversions to populate wikidata, and then use wikidata to populate the smaller ones once Wikidata has reached a sufficient quality level. But to create articles on e.g. Indonesian universities, the best way is to actually write them, based on reliable sources, and with perhaps at most the indonesian Wikipedia as a source of inspiration. Using Wikidata somewhere in this process (and let's be clear, this was the entry before I highlighted it here, a link to the Indonesian article and nothing else, since May 2013) would have been a waste of time. Now you have expanded the wikidata entry instead of creating an enwiki entry. One can wonder what would have benefited the most people and had the most impact in the long term. To me that would have been an enwiki article, not a Wikidata entry. Fram (talk) 08:51, 23 January 2017 (UTC)[reply]
I wrote my comment before I edited the wikidata entry; my editing there made some basic information from idwiki and the institution website available in over 300 languages. Doing the wikidata work took me about 10 minutes. I'm not sure why it's my responsibility to write an enwiki article for this - why haven't you written one? The reality is, wikidata editing is far easier than writing good wikipedia articles. Let me share another example from my experience in the area of organizations - the French university system. Public universities in France were significantly reorganized in 2007, then again in 2013. Enwiki articles on the French university system as of early 2015 were completely out of date, and described the pre-2013 French system as if it was current. French wikipedia was, of course, correct. The various lists on English wikipedia needed to be completely reorganized. Who was going to do that? It was as if enwiki and frwiki were describing two completely different realities. If those lists and list-like templates had derived from a common (wikidata) source across languages, there would have been far less confusion and a much easier maintenance problem. ArthurPSmith (talk) 13:40, 23 January 2017 (UTC)[reply]
It's not your responsibility, everyone is free to write where and what they like. But you have now made "some basic information from idwiki and the institution website available in over 300 languages" on a website which hardly anyone will find (unless they have first gone to the idwiki page), and in a format most people won't be interested in using to read about a subject (it's after all a database). "Wikidata editing is far easier than writing good Wikipedia articles". True, writing a useful real article is harder than adding a few loose tidbits to Wikidata. I don't believe that's really an argument pro Wikidata though. Yes, from time to time having our data Wikidata-based might be beneficial. Most of the times the opposite would have been true (see the Judith Wright Prize example above), and with Wikidata used the way some people propose that wrong information could have been shown in 300 languages, not just in one database hardly anyone reads anyway. By the way, Wikidata calls it "Sumatra Institute of Technology", Google calls it "Sumatran Institute of Technology", and enwiki and other sources call it "Sumatera Institute of Technology". "Sumatera" is what reliable sources in English also seem to use[35]. So have you now promoted a wrong English name to 300 languages? And is it correct that your item doesn't even have an "original name" property? Of course, if you leave out essential information, then creating (or expanding) an item in 10 minutes won't be too hard. Fram (talk) 14:20, 23 January 2017 (UTC)[reply]
@Fram: wikidata really does repay spending some time with it. To answer your question on translation, d:Q3492 pretty clearly demonstrates that en:Sumatra is the English translation of the Indonesian word id:Sumatera. If there is an official English name for the institution of course that would be the best label to use in English, but barring that one has to come up with some sort of reasonable translation - or go with the name in native language if that's at least readable in English (I added the Indonesian name as an alternate label for English). Labels and descriptions are one piece of wikidata (along with sitelinks) that do not have a source mechanism but in a sense they are self-sourcing, they help (along with "instance of" and "official website" relationships) to define what the wikidata item is about. Not every piece of information needs a citation! ArthurPSmith (talk) 00:46, 24 January 2017 (UTC)[reply]

Looking at Sumatra on Wikidata, I notice straightaway the claim that this large island is or was (this is not indicated) the capital of Pagaruyung Kingdom. This even has a "reference" in Wikidata, namely... the Wikidata entry on Pagaruyung Kingdom. A site which wants to act as source or basis for enwiki articles and data but which uses "reference" in such an extremely loose manner is not welcome. Furthermore, the island Sumatra obviously wasn't the capital of the Kingdom, which was only a part of Sumatra in the first place. The capital of the Kingdom was Pagaruyung, now a small village... The Wikidata Sumatra entry next states that it is located in the administrative division of Pangkalan Kuras. Strange, this should be a huge administrative division to encompass the whole of Sumatra. But in reality, this is one of the sub-districts of Riau, where Riau is one of the ten provinces of Sumatra. If even the few items on a huge topic (an island with a population of 50 million) have at least two such glaring errors (in 17 items, most of them not really about the island but about Commons or a stupid link to the Wikivoyage banner, for crying out loud; and should "elevation" really give the highest point? Not clear at all), then I doubt that "wikidata really does repay spending some time with it." except for the chuckle factor perhaps. When 2 out of 11 real items on such an important and easy to check subject are blatantly wrong (and have been since 2015), then Wikidata really isn't ready at all to be used as a source of data for enwiki, never mind as a source to carry this disinformation to 300 languages at the same time.

This isn't a one-off problem: the above two errors were introduced by two long-term, trusted Wikidata editors, and when I look at Java (the island), I see the same kind of wrong claims (Java is located in the administrative units West Java, Central Java, East Java, ...) added by yet another very active editor. If such errors are not caught on major topics, then how are you ever going to make Wikidata good enough to be used as the source for infoboxes, lists, or whole articles (placeholders or real ones)? Fram (talk) 08:19, 24 January 2017 (UTC)[reply]

Wikidata could certainly use more human eyeballs to find and fix errors like these, as could every wikipedia. What do you suppose the error rate is in the average enwiki article? As I mentioned earlier, the French university articles (including long lists) were substantially wrong in enwiki for 2 years. I've edited enwiki articles on major physics topics that had basic misunderstandings that had stood in place for 5 years or more. WP:WIP - applies to wikidata just as well. But I suppose the more fundamental question is whether wikidata is seen as part of "us" or just another "them" which seems to be what most of this page is about... ArthurPSmith (talk) 17:34, 24 January 2017 (UTC)[reply]

Is the page communal or a collection of opinions?

I was going to edit the page, but then I noticed that is full of signatures. If the page is communal, shouldn't signatures be removed from it?--Micru (talk) 08:06, 19 January 2017 (UTC)[reply]

Some people have signed, some didn't. Feel free to add your own opinions to it with or without signature: it's intended to collect opinions, it's not that important who actually added what. Fram (talk) 08:10, 19 January 2017 (UTC)[reply]
I find the mix confusing, as it discourages improvement of the page and the different statements. Especially "Uses of Wikidata on enwiki" should be factual and collaboratively written, but as it is signed, it prevents discussion. Personally, I would prefer to remove all signatures, and allow edits provided that they keep the spirit of each corresponding section.--Micru (talk) 09:08, 19 January 2017 (UTC)[reply]
I would prefer the same as well, and intended it to be a signature-less page, but some editors have been edit-warring to retain the signature of someone else on some statements, to the point that the page got fully protected and the edit-warring editor only escaping a block because the 3RR report was stale when someone finally got around to act on it. Fram (talk) 10:06, 19 January 2017 (UTC)[reply]
@Pigsonthewing and SlimVirgin: As participants in the previous conflict, are you ok in removing all signatures from the page and keeping the discussion in the talk page about contested points? I see the signatures as an obstacle to develop the page, but before proceeding I would like to know your opinion too.--Micru (talk) 10:43, 19 January 2017 (UTC)[reply]
Currently, they act as a barrier to collaborative editing. That said, I'm not sure that removing them will fix the underlying problems. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:32, 19 January 2017 (UTC)[reply]
As has just been amply demonstrated. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:45, 23 January 2017 (UTC)[reply]

A poor discussion

This page is a terrible contribution to the discussion of the underlying issues.

I feel I could contribute both on "data integrity on Wikidata" and "reliability of English Wikipedia content", and the relationship between those two issues. But I have no intention of doing so here. I would actually be glad to have Wikipedia:Wikidata/2017 State of affairs deleted. The personal attacks on this page could be deleted under talk page guidelines.

The scope of the subpage title has shown itself, pretty much, to be too broad. We can surely do better than this. Charles Matthews (talk) 12:19, 19 January 2017 (UTC)[reply]

Feel free to start any page about this you like. You are also free to nominate this page for MfD, but I don't see a good policy reason for it, nor a good non-policy reason either. The page, poor as it is, will be used in an upcoming RfC to give at least some background to what happens already with Wikidata on enwiki, and on how people perceive this. The value of this talk page is not really impressive though, but is indicative of the fact that this issue divides (parts of the) editing community and needs more facts and less ill-informed contributions all around. Personal attacks can also be removed of course, just make sure that they are real personal attacks and not general criticism of edits or something similar. Fram (talk) 13:13, 19 January 2017 (UTC)[reply]

Well, I suggest a "refactoring" of the page would be beneficial. You do realise that some of the discussion above is "pot-kettle"?

I have been working on Wikidata since 2014, on projects which have nothing to do with infoboxes. I know certain benefits have arisen, for the English Wikipedia (and not only that one). I would say your comment above about limiting Wikidata to the indexing function (interwiki) is not really accurate. I think the verifiability issue on Wikidata is important, but not as simple as is sometimes implied.

I'm concerned that you say the page "will be used". As you say, in fact, it reflects mainly that there are divisive arguments used here, about Wikidata. It is currently fashionable to force the issue in divisive areas of politics: and we don't see the argument from factual evidence getting much credence there.

WP:VOTE, full title Wikipedia:Polling is not a substitute for discussion, makes the point in its nutshell: "Some decisions on Wikipedia are not made by popular vote, but rather through discussions to achieve consensus. Polling is only meant to facilitate discussion, and should be used with care." I am arguing for care, given that this Talk page shows not the slightest sign of emerging consensus. Charles Matthews (talk) 13:40, 19 January 2017 (UTC)[reply]

"I would say your comment above about limiting Wikidata to the indexing function (interwiki) is not really accurate."? What comment do you mean?
As for "this page will be used": I plan an RfC (probably first a stage to decide, if possible, on a number of questions for the RfC, and then the actual RfC) to get some consensus about the use of Wikidata on enwiki (not a simple yes or no, but more of a "this, this and this is allowed, but that is not allowed"). At the moment, all I note are discussions about this without any resolution, with one project embracing it and another rejecting it, with people implementing e.g. Wikidata lists on their own, with others making phabricator tickets to add wikidata links to enwiki, and so on. It would be great if you could start a page that would create a policy or guideline for this which has broad consensus and which is achieved by simple, congenial discussion. I doubt this can be achieved though, and think that an RfC with a number of neutral questions can see if there is a consensus or not. Such consensus in any case is not decided by simple votecount, but by the closing admin(s), so the end result may still be a "no consensus". But then at least we have tried.
So, feel free to start another page and gather all factual evidence that you can. I have little faith that it will fare better than this one, but I would gladly be proven wrong. Fram (talk) 14:03, 19 January 2017 (UTC)[reply]
Fram, one function related to Wikidata that I have seen used to great effect is called "mix 'n match". It addresses some of the objections you raised above about getting links correct. See meta:Mix'n'match and the actual interface here. I hope someone else can explain better than me the benefits of that. Carcharoth (talk) 14:14, 19 January 2017 (UTC)[reply]

Let's take the usability discussion, for example. How could that be made "factual"? It is more suitable for a separate page, I would say.

I was actually thinking yesterday about the "Richard Burton's wives" issue you raised, because I was adding education histories to Wikidata items about people, and it is annoying if the order doesn't follow chronology. For Burton, you are actually "supposed" to add a "start time" qualifier in the "spouse" statements; and I would say that you should reference it. That is, the marriages should come at least with dates, if we are going to attach significance to any ordering at all. It certainly is troublesome for the human reader if the marriages are out of order. You can only fix it by deleting and retyping. But actually, with the dates, the system could fix that. This doesn't yet happen in the maintenance of Wikidata, but clearly it could.
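As a rough illustration of how dated statements could be put in order automatically, here is a minimal sketch. It assumes a simplified, hypothetical representation in which each statement is a dict with a "value" and an optional "start_time" qualifier stored as a year; this is not Wikidata's actual data format, and the names `start_year` and `sort_by_start_time` are invented for the example.

```python
from typing import Optional


def start_year(statement: dict) -> Optional[int]:
    """Return the hypothetical 'start_time' qualifier (a year), or None if absent."""
    return statement.get("qualifiers", {}).get("start_time")


def sort_by_start_time(statements: list[dict]) -> list[dict]:
    """Order statements chronologically by their start-time qualifier.

    Statements without a date cannot be placed reliably, so they are kept
    in their original relative order and appended after the dated ones.
    """
    dated = [s for s in statements if start_year(s) is not None]
    undated = [s for s in statements if start_year(s) is None]
    # sorted() is stable, so equal years keep their entry order
    return sorted(dated, key=start_year) + undated
```

The undated statements are deliberately left at the end rather than guessed at, which matches the point that the ordering only carries meaning once the dates are present.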

This sort of thing strikes a newcomer as a usability issue. Wikidata statements are added to the bottom, sort of - there is a division into substantive statements, and identifiers; and there is a setting in Preferences to display the substantive statements in a standard order.

So those are some facts, and explanations. Do you agree that such matters deserve a page of their own? Charles Matthews (talk) 14:26, 19 January 2017 (UTC)[reply]

A page of their own, or a section on this page (not the talk page, the actual page). This page is intended as a central repository, if some aspects need more detailed explanations then subpages may be the way to go. But while we need such pages or sections as background, they will never decide whether we (the broader community) trust Wikidata enough to let them insert X into Wikipedia (where some X will be widely accepted, and some others will be generally refused, to take an extreme example, I don't think many people will let Reasonator generate articles, or will let a bot delete articles here when for some reason the Wikidata entry gets deleted). Fram (talk) 15:07, 19 January 2017 (UTC)[reply]
I don't think anybody calls for Reasonator creating articles on Wikipedia or call for bots deleting Wikipedia articles when Wikidata entries get deleted. That's a strawman.
We seem to disagree about the usage of Listeria, data import in templates, links from Wikipedia to Wikidata and the article placeholder tool.
There is also an open discussion about what tools should be built to improve the integration. That goes for direct editing of Wikidata via Wikipedia, Watchlist integration, new versions of listeria and also further projects like Librarybase. ChristianKl (talk) 09:09, 23 January 2017 (UTC)[reply]
[36] This Article Placeholder discussion certainly saw some prominent Wikidataians calling for Reasonator-created pages on Wikipedia. "I agree with GerardM. We should be getting rid of substubs and replacing them with reasonator pages not creating more substubs." Fram (talk) 09:57, 23 January 2017 (UTC)[reply]

Sorry – "they will never decide whether we (the broader community) trust Wikidata enough ..." – did you get elected to office here? I feel this kind of rhetoric is misplaced. Maybe half a year of Brexit does that to people.

I would say, with the best will in the world, that what is needed is a "Wikidata FAQ for Wikipedians". And I think the effort to get one by choosing this particular route in project space has probably failed. Which is why the page should be deleted, or edited very seriously.

But anyway, isn't a FAQ what we are talking about? Charles Matthews (talk) 15:22, 19 January 2017 (UTC)[reply]

Rhetoric? A FAQ, a page about Wikidata usability, this page, ... are (at best) interesting and helpful, but "they" (these pages and sections, not the people editing them) can not decide what the policies about Wikidata will be. That will happen through discussion, probably an RfC (as the issue seems to me to be too acrimonious to be decided by simple discussion), where the broader community (you, me, ... i.e. "we") can have their say. In my view, the ultimate point for what will be accepted and what won't be accepted is whether we (again, you, me, all others participating) trust Wikidata or its implementations enough. Do we trust Wikidata to provide lay-out, or do we want to keep that local? Do we trust Wikidata for sourcing? Do we trust Wikidata for BLP-sensitive things? You may describe the decision-making process here, and what will be the driving force, in your own words, and you don't have to agree that trust will be a deciding factor, but there is no reason to air your disagreement with "did you get elected to office" and references to Brexit. My "rhetoric" was directly related to the issue at hand, your rhetoric though was rather out of the blue and indeed misplaced. A bit sad to see you coming here to complain about a poor discussion which should be deleted because of the personal attacks, and then reply in such a manner to what is a rather normal and relevant statement, even if you don't agree with it. Fram (talk) 15:34, 19 January 2017 (UTC)[reply]

How about the ultimate point being the Wikipedia mission statement? Say as on q:Jimmy Wales. Or the WMF statement of purpose: "The mission of the Wikimedia Foundation is to empower and engage people around the world to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally." As far as I'm concerned, the latter has quite a lot to do with Wikidata.

How about addressing my point on the need to build consensus, per the important WP:VOTE essay, rather than laying down sweeping preconditions?

I apologise, as a UK citizen, for everything about Brexit. I did think you might include under "general criticism of edits or something similar" a criticism of tone.

I do think, if you will try not to take this as a personal attack, that the non-specific use of "we", as if representing a constituency, was not helpful. It anyway gives me a clue as to why this debate is not going well. Normally editors here speak for themselves. Charles Matthews (talk) 16:24, 19 January 2017 (UTC)[reply]

Quoting Jimbo Wales is unlikely to convince me ;-) First, I think I have caused the friction or misunderstanding here (at least in part) by using "they" instead of "these" in the "they vs. we" statement above. I meant that information pages don't decide consensus, but the community does. "We" was not meant as a group separate from some undefined they, but as "all of us". That's why, what you call a non-specific use of "we", had the parenthesis "(the broader community)", as an attempt to explain what I meant. I clearly failed there... Anyway, there are different ways to achieve the statement of purpose, and I am not convinced that the use of Wikidata is in most cases the best way to achieve this. Criticism of Wikidata and not wanting to use Wikidata for some things doesn't mean that one doesn't support the WMF statement of purpose. And an RfC is a medium to gauge consensus, even though that essay may not believe it to be the best or even a good one. Like I said, you are free to try a different approach to get consensus about Wikidata use. Fram (talk) 17:43, 19 January 2017 (UTC)[reply]

OK then, it is better if the opposite of "us" is not "not one of us". Let's try for some consensus starting from GIGO. That's an important principle. So is the principle that one has to live in the real world. I find it interesting to draw a parallel with Commons. The metadata on Commons is, from some points of view, in a bad way: it can be 10 years old. It is essentially never footnoted. There are mistakes made here with images misidentified, so being placed incorrectly in articles. More worrying, if quite rare, is that apparently good metadata from a GLAM is wrong: I know this happens. It is unlikely that in all cases independent, third-party referencing could be provided: we tend to accept the GLAM metadata as authoritative.

So, why do we accept the use of Commons images here routinely, instead of insisting that everything be uploaded here, and the metadata scrutinised? I would suggest this is not a matter of "trusting" Commons: because the mass uploading probably makes that (a) meaningless and (b) implausible. There could be plenty of photoshopped nonsense: copyvio is more of a concern. There are two things acting in the other direction: Commons is very helpful in developing articles; and images, we feel, tend to provide internal evidence that we use in place of good metadata. "In the real world", don't we mainly just accept Commons on that basis?

To sum up: some of the arguments deployed in the case of "trusting" Wikidata, to adopt your term though I don't like it, can be seen in action here. One kind of argument would say "Wikidata is not very useful in developing articles or lists here". I don't agree, but we can have a rational discussion about that—and the WMF point I was introducing is that enWP and deWP are the hardest Wikipedias on which to make this argument, because they already have so many non-stub articles, the case being clearer on smaller wikis. The other part is clearly more interesting. It is rational to say that adding an image of a dog here and giving the caption "cat" is not really troubling: it is child-like vandalism and a small child could spot it. I won't try to complete the discussion here: I did say in my original posting I wasn't going to do that. But I think GIGO can be used in a nitpicky way, as well as sensibly. Charles Matthews (talk) 18:48, 19 January 2017 (UTC)[reply]

The other problem with insisting that everything be uploaded here, and the metadata scrutinised? is that it won't happen; hosting files here does not increase the scrutiny or quality in any way. The issue with Wikidata is that there is a widespread feeling that in information hosted on Wikidata there is less scrutiny and quality than information hosted here. Jo-Jo Eumerus (talk, contributions) 19:30, 19 January 2017 (UTC)[reply]

Yes, that was the "widespread impression" I said I wasn't going to address here.

Why I said that there was a pot-kettle argument in progress is that I don't think Wikipedians have the right approach, if they think denigration of Wikidata helps anything at all. For reasons User:Carcharoth alluded to above, amongst other things, Wikidata can be helpful with Wikipedia's quality of information, and has been criticised for its importation of data from Wikipedia. The part that might be taken seriously of all that is any issue of circular referencing.

Putting Wikipedia "factoids" into Wikidata where they are scrutinised ... well, honestly, if people can't see some good might come of it, I don't want to spend much time on further exposition. Suffice it to say that machine-readable format could help with detecting anomalies. Charles Matthews (talk) 19:46, 19 January 2017 (UTC)[reply]

What is the goal of this RfC?

What is the goal of this RfC? Is it a real discussion where we list problems and find solutions, or a list of complaints from persons who don't want to change their habits? We have already had two big RfCs in the German and the French WPs and both communities were able to use WD with some rules. Why is WP:en so special that it cannot follow a similar process?

Some main problems with WD and possible solutions:

  1. Problem: WD has a large amount of unsourced data or data sourced from other WPs. Solution: Just use data from WD which have a source that is not a WP. You can even be more selective and require that only references like books, databases, newspapers or scientific articles be used if you don't trust data from websites. Data from WD have different qualities but you can filter the data to take only what you want. So bad quality data is not an obstacle because you can analyze the data quality before the data display and you can discard it.
  2. Problem: Data can be different between the ones written in the wikitext and the ones imported from WD. Solution: Display the source of the data in order to allow the readers to understand why the data are different (different sources so different values).
  3. Problem: Changes in WD items are not integrated in the watchlist of WP contributors allowing change in WP articles without any notice. Discussion: This is the same situation for pictures from Commons or for templates in WP:en and this was never a problem until now. If this is really a problem then the same conclusion should be applied to Wikidata, Commons and WP:en templates. Solutions: Reduce use of WD data in some particular environments like templates, infoboxes, tables and graphics. Then use the option of the watchlist which allows to follow the items linked to articles listed in the watchlist.
  4. Problem: WD is an external environment with a dedicated interface. WP contributors don't want to use WD or find the WD interface difficult. Solution: Use only templates which give priority to local data. Templates have to be able to display data present in the wikitext of the WP article first and then only if no value is found then data from WD can be used. WP contributors add data in the wikitext using the old system, then this data will take priority to WD data.
  5. Problem: New templates using WD data are ugly. Solution: Not a WD problem, this is an internal process of WP. You have to define a process to discuss and authorize the use of new templates, but this should already exist. If not don't blame WD.

As summary I just want to add that WD is not forcing you to display some data. WP:en can choose through the data selection what is displayed (you can even specify that only data from predefined books or databases can be displayed). The above solutions will solve most of the current problems and will allow WP:en contributors to use WD. WD has to improve its protection systems against vandalism and the data quality but this is ongoing. WP articles don't need to reach a FA-Class some days after their creation so if you allow contributors to start with stub articles why do you have a problem with the teething problems of WD ? Snipre (talk) 23:53, 21 January 2017 (UTC)[reply]
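The selection rules proposed in the list above (local wikitext first, then only Wikidata values backed by a non-Wikipedia source, optionally restricted to certain kinds of source, with the source shown alongside the value) could be sketched roughly as follows. This is only an illustration under stated assumptions: the dict layout, the "kind" and "title" fields, and the function name `pick_value` are hypothetical and do not reflect Wikidata's real data model or any existing enwiki template or module.

```python
from typing import Optional


def pick_value(
    local_value: Optional[str],
    wikidata_statements: list[dict],
    allowed_kinds: Optional[set[str]] = None,
) -> tuple[Optional[str], Optional[str]]:
    """Return (value, source label) to display, or (None, None) to show nothing.

    Hypothetical statement layout:
        {"value": ..., "references": [{"kind": "book", "title": "..."}]}
    """
    # Point 4 in the list: data written locally in the wikitext takes priority.
    if local_value:
        return local_value, "local wikitext"

    for statement in wikidata_statements:
        for ref in statement.get("references", []):
            kind = ref.get("kind")
            # Point 1: skip values whose only support is another Wikipedia.
            if kind == "wikipedia":
                continue
            # Optional stricter filter, e.g. only books or databases.
            if allowed_kinds is not None and kind not in allowed_kinds:
                continue
            # Point 2: return the source so it can be shown to the reader.
            return statement["value"], ref.get("title")

    # Unsourced or Wikipedia-only data is discarded rather than displayed.
    return None, None
```

Calling `pick_value(None, statements, allowed_kinds={"book", "database"})` would then display only values that carry a book or database reference, and nothing at all when no such value exists.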

This isn't an RfC. The intention of this discussion is to inform a big RfC that will determine consensus on how enwp uses and doesn't use WD - in other words, the "rules". Nikkimaria (talk) 00:18, 22 January 2017 (UTC)[reply]

Blocklist

The page currently says "We can block links from being added to enwiki (through the blacklist); what if these links are inserted from Wikidata (in infoboxes or so)?" This is incorrect. We have a global blocklist that's also blocking edits on Wikidata. ChristianKl (talk) 09:42, 23 January 2017 (UTC)[reply]

And on every Wikimedia project. Meaning that we need a damn good reason to have it added to the global list. Jo-Jo Eumerus (talk, contributions) 09:53, 23 January 2017 (UTC)[reply]
Yeah, it is not easy (or quick) to get something on the global list (which is why individual wikis have their own). Only in death does duty end (talk) 10:36, 23 January 2017 (UTC)[reply]
Indeed. We now have a discussion on whether the Daily Mail should be added to the enwiki blacklist; to get this added to the global blacklist would be almost impossible, I think. The local one is [37], the global one is [38]. Fram (talk) 10:57, 23 January 2017 (UTC)[reply]

Example of problematic Wikidata infobox

Carltheo Zeitschel has recently had its infobox converted to show some Wikidata fields[39]. One of the fields that are now visible is his occupation: diplomat.

While technically correct (though unsourced at Wikidata), the article gives quite a different image: "a German Nazi physician, and diplomat who organized the deportation of Jews in the German Embassy in France as Judenreferent." If you look at Wikidata, you also get the innocent "diplomat" as description in English and Dutch[40]. This whitewashing may be wanted on Wikidata, but I don't think enwiki should describe nazi criminals only as "diplomats" in their infoboxes. In the German Wikipedia, his first category is "Kategorie:Täter des Holocaust" for a reason. I removed the infobox. Fram (talk) 16:06, 23 January 2017 (UTC)[reply]

I just made 4 (failed) attempts (using Preview) to incorporate the infobox without the diplomat profession, and came to the conclusion the easiest way was to convert it to a non-wikidata infobox. Of course I then came to the conclusion having an infobox that contains his name, birth and death details, (which are already in the very first sentence of the lead) was completely pointless. I am going back now to have another go. Only in death does duty end (talk) 16:20, 23 January 2017 (UTC)[reply]
Behold the wonder of the (un)infobox! So you need to manually suppress a field from wikidata to use the template which works but leads to further problems down the line. I am familiar with how templates work in general so I knew right where to look to work out what I needed to do. Not all editors will know that. Secondly now I have suppressed it, any changes at wikidata will not come through, regardless if they are sourced correctly or not, until someone removes the suppression. This has basically added an extra layer of complexity to what should be a simple editing process - click edit page, insert/remove content - click save. Only in death does duty end (talk) 16:28, 23 January 2017 (UTC)[reply]
Unfortunately Fram replaced the "Wikidata" infobox, which said he died in 1945, with an old-style infobox, which said, as did the lede, that he died on "21 April 1945". The latter date is uncited in the en.Wikipedia article. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:00, 23 January 2017 (UTC)[reply]
And the Wikidata date was also uncited, so how was my change "unfortunate"? At least here we can add a "citation needed", in a Wikidata only infobox this is impossible. Fram (talk) 17:04, 23 January 2017 (UTC)[reply]
Your logical fallacy is tu quoque (and the {{Cn}} is needed in the body, not the infobox). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:11, 23 January 2017 (UTC)[reply]
It doesn't have an inline citation but it is actually sourced in the prose. His death was listed in the release of formerly classified documents. It's a result of the patchy translation from DEWP - I think, but my German is incredibly bad, it is sourced explicitly in the German article. Only in death does duty end (talk) 17:25, 23 January 2017 (UTC)[reply]
Well, if you think "hIs[sic] fate was unclear until the files [plural] output from the Foreign Office in 2014 in the literature" is a source... Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:42, 23 January 2017 (UTC)[reply]
It certainly is a better one than "1945" from Wikidata without any source at all... Fram (talk) 17:45, 23 January 2017 (UTC)[reply]
More FUD. "1945" is acceptably cited in the article, to " Bernhard Brunner, Der Frankreichkomplex: Die nationalsozialistischen Verbrechen in Frankreich und die Justiz in der Bundesrepublik Deutschland, Frankfurt 2008, S. 43." Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:50, 23 January 2017 (UTC)[reply]
Please explain what in my statement included "fear, uncertainty or doubt"? There was (probably is) at Wikidata no source for "1945", so why would we include it from there? Fram (talk) 18:01, 23 January 2017 (UTC)[reply]
And of course it isn't a tu quoque fallacy (it is a fallacy when it "intends to discredit the validity of the opponent's logical argument", as our article on it states; since you had no valid logical argument for your revert, my reply isn't an example of the fallacy. QED). I criticized the Wikidata infobox for another reason, you then saw fit to revert to it for a completely bogus reason (you replaced an unsourced date here with an unsourced date from an unreliable other site, which is hardly an improvement), and "cn" may equally be applied to infoboxes. Anyway, my infobox correctly summarized the article. If the article is wrong, then the infobox is wrong as well. But replacing it with the Wikidata infobox didn't help (just like it doesn't help in most cases). Fram (talk) 17:34, 23 January 2017 (UTC)[reply]
It is possible to edit Wikidata. I think a {{sofixit}} attitude towards Wikidata would help us a lot more than trying to find out exactly how terrible it is and why we should not use it. On the other hand, Wikidata should lower the bar for Wikipedians to contribute so it gets the critical editor mass it needs to work well. —Kusma (t·c) 17:31, 23 January 2017 (UTC)[reply]
This is going to be a problem, as SOFIXIT applies to ENWP, not another project. As soon as you start requiring people to go off-wiki in order to apply on-wiki changes, you have drastically increased the burden on editing. Apart from having to learn an entire other project's rules and policies, it's just adding unneeded steps to the editing process. The idea of editor retention is making it easier and simpler to edit, not increasing complexity. Only in death does duty end (talk) 17:35, 23 January 2017 (UTC)[reply]
I'm sympathetic to the goals of Wikidata, but this does seem a key point to me -- perhaps the key point. The number of users willing to go to Wikidata to fix an issue is surely several orders of magnitude less than the number who will make edits to articles including Wikidata. Teaching an editor like me how to fix something by going to Wikidata isn't the answer. Andy, if it should turn out that we cannot get adequate engagement from the bulk of en-wp's editors to keep up with the edits needed on Wikidata to keep it sourced and accurate, what would plan B look like? Mike Christie (talk - contribs - library) 18:32, 23 January 2017 (UTC)[reply]
We certainly should have a plan how to improve editor engagement with Wikidata. But I don't think the problem is insurmountable. Editing infobox data on Wikidata could possibly become as easy (or even easier) as editing infoboxes here and not require understanding of a template's intricate syntax. Thanks to SUL, I hardly notice when I am editing on other projects (well, I notice because I have my own custom .css files here).
About other projects: They are not alien things (and they should not feel "off-wiki"), they are part of our family. We share data with Commons quite a lot, and many people contribute to several projects. If we decide to ignore Wikidata and keep all data local, it may die (just like we killed Wikinews, as we have never been strict about WP:NOTNEWS and so have usually provided better news coverage than Wikinews). —Kusma (t·c) 20:23, 23 January 2017 (UTC)[reply]
Agreed, in principle. My question is about what we will do if it turns out that most editors won't engage with Wikidata, despite whatever efforts we may put in to encourage them. Mike Christie (talk - contribs - library) 20:27, 23 January 2017 (UTC)[reply]
Plan B I guess would be to use technical means to only filter out Wikidata info which is either properly sourced or does not need a source, and to develop anti-vandalism protection.--Ymblanter (talk) 21:04, 23 January 2017 (UTC)[reply]
@Fram: I've just spotted this. You seem to have some facts wrong:
  • The Wikidata infobox was added by this edit by @Brock-brac back in December - not myself. The edit I made was to keep it working when I made a breaking change to the infobox template (changing it from opt-out to opt-in). If you'd pinged me when you criticised my edit, I could have pointed this out sooner.
  • Individual rows in the infobox can now be suppressed by adding e.g. "suppressfields=occupation" (as @Only in death discovered, thanks for making that edit!)
  • This infobox template now includes opting out of unreferenced material from Wikidata (use "onlysourced=yes") if you want - I turned that on in BLP articles using this infobox, but not others.
  • In this case, the information added from Wikidata was correct (as the article says, he was a diplomat), but lacking context. The easiest way to fix that is to simply edit Wikidata to add more context - it's not hard!
  • Please don't build straw-man arguments like "whitewashing may be wanted on Wikidata" that are not true! (Or, provide a citation to support that.)
  • I've reverted the article back to the Wikidata version now (with the suppressed occupation field), since it displays the exact same information now as the manual version does (thanks @Pppery: for changing the date of death on Wikidata, which I think was the only change made here so far).
Thanks. Mike Peel (talk) 10:30, 24 January 2017 (UTC)[reply]
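
For editors unfamiliar with the options listed above, the combined effect of opt-in fields, "suppressfields" and "onlysourced=yes" can be sketched as a simple filter. This is a rough Python model only, not the actual template or module code; the `Claim` class, the `select_rows` function and its parameter names are hypothetical stand-ins for the template parameters.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class Claim:
    """A single Wikidata statement value with its (possibly empty) references."""
    value: str
    references: List[str] = field(default_factory=list)


def select_rows(
    claims: Dict[str, List[Claim]],
    fetch_fields: Iterable[str],
    suppress_fields: Iterable[str] = (),
    only_sourced: bool = False,
) -> Dict[str, List[str]]:
    """Return the infobox rows that would be displayed.

    fetch_fields    -- opt-in list: only these fields are pulled from Wikidata
    suppress_fields -- fields hidden locally even if Wikidata has a value
    only_sourced    -- if True, drop statements that carry no reference
    """
    suppressed = set(suppress_fields)
    rows: Dict[str, List[str]] = {}
    for name in fetch_fields:
        if name in suppressed:
            continue
        kept = [c.value for c in claims.get(name, []) if c.references or not only_sourced]
        if kept:
            rows[name] = kept
    return rows
```

Note that in this model a suppressed field stays hidden whatever happens on Wikidata, which is the trade-off Only in death describes above: later corrections at Wikidata do not come through until someone removes the suppression.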
And I've reverted your pointy revert. Why would you change an infobox here that works all right, to the Wikidata-filled version of it, if it a) gives the same result, but b) takes it from an unreliable site with unsourced data, with c) worse layout of the infobox (added unnecessary clutter) and d) more difficulty for most people to edit the info or add additional fields (it is not really clear how I can add "context" to the item "diplomat" in Wikidata, I can add qualifiers but these are not the same at all). Fram (talk) 10:58, 24 January 2017 (UTC)[reply]
It's not me making pointy edits - it's you. You changed the infobox from a Wikidata-filled version that was working OK (after the occupation field was suppressed) to one that shows identical content. And then you raise a fuss about it here, which is definitely POINTy. In answer to your specific points: a) because it's a test case of the use of the infobox, and we can add additional information to it through Wikidata. b) Are you referring to Wikidata or Wikipedia here? c) edit links are now "unnecessary clutter"?!, d) citation needed - please see d:Wikidata:Introduction if you need an introduction to how Wikidata works. Thanks. Mike Peel (talk) 11:25, 24 January 2017 (UTC)[reply]
No. I removed the infobox, and then I raised a fuss about it here. [41] is the removal, on 16.06, the same minute I started this section. Only in Death succeeded in getting the infobox to hide the one field on 16.24[42], after which I replaced it with a standard infobox with more information than the Wikidata infobox had at that moment[43]. So you have both the order of events wrong, and the actual result of my edits (which was not "changed the infobox[...] to one that shows identical content").
As for other points: how many tests did you plan? In any case, it is at TfD now. For b), I meant Wikidata, duh. c) Yes, edit links to Wikidata are unnecessary visual clutter in read mode. And d): this link to the introduction contains no information about my question, but a lot of highly optimistic information which seems to describe more what you want Wikidata to be than what it actually is. Fram (talk) 13:08, 24 January 2017 (UTC)[reply]

Prototype for editing Wikidata from Wikipedia in the works

As far as I read in the Wikidata weekly summary 2017-01-21, the development team is working on a clickable prototype interface for editing Wikidata from Wikipedia, based on research conducted in 2015. That would remove the barrier of going to another wikiproject that other editors find annoying. The ticket for the task is here T132790.--Micru (talk) 08:45, 24 January 2017 (UTC)[reply]

That's going to be problematic for a whole host of other reasons. The main one is that if the WMF makes an easy interface to edit wikidata from the article directly, instead of just ensuring article integrity at ENWP, you now have to contend with every other project that accesses the same data. It's basically an escalation of the current drawbacks without solving the underlying problem. Only in death does duty end (talk) 09:15, 24 January 2017 (UTC)[reply]
What do you mean by "ensuring article integrity"? And btw it is not the WMF, it is WMDE that is developing this.--Micru (talk) 09:18, 24 January 2017 (UTC)[reply]
One problem, though probably minor compared to data integrity: enwiki prefers, where possible, to have English-language sources for a claim. Other wikis probably have similar wishes. How many references are you going to end up with per statement at Wikidata? And if you then want to show on enwiki the references taken from Wikidata, how are you going to ensure that you show the English ones only (if those are added), and only other language sources if no English one is provided? As far as I know, Wikidata has no "language of the source" property.
As for the prototype, I wonder how that will ever work. I discussed on this page how I tried to add a claim to Wikidata, using a book as a reference. It turned out that I had to create a new item for that book, before I could even use it as a reference. The current mockup of the prototype doesn't seem to take such things into consideration. Fram (talk) 09:34, 24 January 2017 (UTC).[reply]
I guess it would be feasible to give preference to English sources.
Regarding the language of a reference there is a property for that d:Property:P407, here it can be seen in use: Notices of the American Mathematical Society (Q24158).
As for how to create an item for a reference from Wikipedia, I think it has not been taken into consideration for the first prototype. @Lea Lacroix (WMDE): Can you please clarify?--Micru (talk) 10:08, 24 January 2017 (UTC)[reply]
The property for the language of a source only works if the source is an item (e.g. a book), not if it is an URL (which is the vast majority at Wikidata). Fram (talk) 10:15, 24 January 2017 (UTC)[reply]
You can also use that property together with the reference url, for instance for the "occupation->computer scientist" of Tim Berners-Lee (Q80), I have added the language of the website. You can add more statements, like date retrieved, title, etc.--Micru (talk) 10:23, 24 January 2017 (UTC)[reply]
Thanks. These things really aren't obvious at all (and seem to be very sparsely used, this is the first time I saw a URL with the language added on Wikidata). Fram (talk) 13:08, 24 January 2017 (UTC)[reply]
I agree that the system for adding references in Wikidata is not very clear, I think it should use the same system as in Visual Editor, but I guess for now there are other development priorities (Commons).--Micru (talk) 13:16, 24 January 2017 (UTC)[reply]
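
The English-first preference discussed in this thread could be sketched like this, assuming each reference carries a language code (for instance taken from P407 where it has been filled in). This is an illustrative Python sketch only; the dictionary layout and the `pick_references` name are assumptions, and references without a recorded language are treated as non-English here.

```python
from typing import Dict, List, Optional

Reference = Dict[str, Optional[str]]  # e.g. {"url": "...", "language": "en"}


def pick_references(references: List[Reference], preferred: str = "en") -> List[Reference]:
    """Show only references in the preferred language when any exist;
    otherwise fall back to all references, whatever their language."""
    in_preferred = [ref for ref in references if ref.get("language") == preferred]
    return in_preferred if in_preferred else list(references)
```

The sketch only works as well as the language data behind it: as noted above, the language is rarely recorded for URL references, so in practice most statements would fall through to the "show everything" branch.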

Visual clutter

Thanks to the use of this infobox instead of the standard one, readers of Charmian Clift get 6 pencil icons (tooltip: edit this on Wikidata), one comment in brackets "edit on wikidata", and two multicoloured icons which turn out to mean "Article is available on Wikidata, but not on Wikipedia". Indeed, we don't have a separate article on "short story writer" or "essayist": both of course don't need an article, everything is said in short story and List of short-story authors. We do have the redirects Short story writer and essayist, but thanks to the wonders of Wikidata (which doesn't connect to redirects!) and this template, these bluelinks are not shown in the infobox, and it pretends that we don't have info on the subject.

Similarly, at Abbé de Coulmier, we apparently don't have enwiki articles on psychotherapist and Catholic priest; at George Auriol, we don't have articles for type designer, painter, printer, ... Why we would divert readers to Wikidata for these, and why we would want to give them the impression that we don't have information on, say, Catholic priests, is not really clear.

That the Wikidata version of the infobox results in a lot of added visual clutter seems clear, though. Fram (talk) 15:33, 24 January 2017 (UTC)[reply]

That is really awful visual clutter. Really bad. The sort of thing that people would go in and edit. Except here, it is difficult to learn how to do that. An example of what happens when a mature, well-documented and well-developed editing and reading environment (Wikipedia) clashes with a far-less developed (and sometimes really poorly designed) editing and display environment (Wikidata as displayed in templates). Carcharoth (talk) 16:09, 24 January 2017 (UTC)[reply]
This has been a known issue since the very beginning of Wikidata, the so-called 'Bonnie and Clyde problem' (T54564). Apparently there were some technical hurdles back then. I also don't like that there are so many pencils, it should be possible to edit WD from VE, that would remove the need of having them there.--Micru (talk) 16:14, 24 January 2017 (UTC)[reply]
Try setting "noicon=yes" in one of the current uses - that will remove the edit icons. That could be the default setting for this infobox if preferred. Thanks. Mike Peel (talk) 16:19, 24 January 2017 (UTC)[reply]
This is a result of the implementation of the module here on en.WP, not anything that is the fault of Wikidata. I would suggest that it is possible to get the existence/redirect status of the pages on en.WP and change the output in the module. --Izno (talk) 16:22, 24 January 2017 (UTC)[reply]
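
A rough sketch of the kind of check suggested above, written in Python for illustration rather than in the module's own language: if Wikidata has no enwiki sitelink for a value, look for a local page (article or redirect) with a matching title before giving up on the link. The `link_target` function and the `local_titles` set are hypothetical; a real module would have to query page existence on en.WP instead.

```python
from typing import Optional, Set


def link_target(label: str, sitelink: Optional[str], local_titles: Set[str]) -> str:
    """Return wikitext for one infobox value.

    label        -- the English label from Wikidata, e.g. "essayist"
    sitelink     -- the enwiki title connected on Wikidata, or None
    local_titles -- titles that exist locally, including redirects (assumed input)
    """
    if not label:
        return label
    if sitelink:
        return f"[[{sitelink}|{label}]]"
    # Page titles start with a capital letter, so "essayist" matches "Essayist".
    candidate = label[0].upper() + label[1:]
    if candidate in local_titles:
        return f"[[{candidate}|{label}]]"
    # Nothing suitable locally: show plain text instead of pointing to Wikidata.
    return label
```

With this approach, a redirect such as Essayist would produce an ordinary bluelink instead of an "available on Wikidata, but not on Wikipedia" icon.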

Atomising of article writing

From my experience with editing Wikidata and en-Wikipedia, there is an element of feeling that rather than pulling together sources to build an article, information is being broken down into data elements for the purpose of putting it back together again later (i.e. machine writing).

Anyone who has written an article on Wikipedia (by which I mean a properly sourced and reasonably standard-length article) knows that this involves accessing different sources of information and distilling it into words and presenting this information. The information present can be broken down into discrete elements and carefully marked up and filed away as data (this is best seen in the construction of infoboxes and the metadata associated with references), but there is a balancing act between doing this and maintaining the flow of writing an article and understanding a topic.

What I am trying to say is that the process of processing and handling the underlying data can at some point go too far. There is, for want of a better word, an intuitive process that involves a human accessing a set of source(s) and transferring the information into an article (with references). Wikidata sometimes seems to try and make that intuitive process more robotic in nature, which doesn't really work. I am going to repeat what I said elsewhere:

"There needs to be a separation between database maintenance and article writing, and an interface between the two that is intuitive rather than opaque." (Wikipedia talk:Wikidata/Archive 4#Is this not a form of machine writing?)

Wikidata is great for organising data. When people then try and use this data for generating articles (rather than lists or discrete items of information), that starts to rub up against humans trying to put words together as language. Wikidata should be a resource, to be used by people writing articles, but it shouldn't be used to write articles in rote or automated fashion. Carcharoth (talk) 11:43, 24 January 2017 (UTC)[reply]

That atomisation of data is already being done when adding categories to the article, with the difference that the category selection cannot be sourced and the wikidata statement can be sourced. In fact many statements in Wikidata were added because the article was included in a given category.
I don't know why you mention automatic writing of articles, afaik nobody suggested writing automatic articles, although some displaying of data from Wikidata is already being done (as an opt-in) by some small Wikipedias by using the Article placeholder extension (example: nn:Special:AboutTopic/Q845189 - Spotted turtle).--Micru (talk) 13:09, 24 January 2017 (UTC)[reply]
Article placeholders = automatic writing of articles. Listeriabot = automatic writing of articles. And I have seen (and linked here, I think) calls to use Reasonator to create articles as well. Fram (talk) 13:28, 24 January 2017 (UTC)[reply]
I don't understand how you can call displaying data "writing an article" when it is clear that it doesn't follow the same format as a written article. Reasonator does generate some short texts, but from that to saying that it "writes articles" is a stretch. Btw, one question, why is it requested that Wikidata generated lists should be individually sourced, when in general enWP lists are sourced globally?--Micru (talk) 13:50, 24 January 2017 (UTC)[reply]
You and I would perhaps know that the different format indicates that it is not an article. Our readers wouldn't. They go to a page in the mainpage and see information on it. For them, this is a Wikipedia article. As for Wikidata lists, how would you technically collect different items (different pages) based on a common property, but source them globally anyway? For me personally, one source to rule them all would be as good as one source per line, as long as we could be reasonably sure that every item came from that source. But just having e.g. a source "CIA World Factbook" at the item "capitals", and then assuming that all capitals are sourced from there on the individual pages, would be problematic (see e.g. the Sumatra example above). Fram (talk) 14:02, 24 January 2017 (UTC)[reply]
Regarding readers not knowing that a page is ArticlePlaceholder generated, there is T124191, which I hope will solve it.
As for lists, in theory it should be possible to assign the same source to several statements at once. Perhaps what is needed is a better interface for "mass-sourcing" items in a list.--Micru (talk) 14:28, 24 January 2017 (UTC)[reply]