User talk:Hestiaea: Difference between revisions



::I think you know the reason. You can type it yourself here if you like. I am tired of being played. I do not like being played. [[User:History2007|History2007]] ([[User talk:History2007|talk]]) 19:14, 14 December 2012 (UTC)
::: Yes I'm a banned editor (since August 2009, and after many contributions dating from July 2003). I would have told you earlier if I had your email earlier. [[User:Hestiaea|Hestiaea]] ([[User talk:Hestiaea#top|talk]]) 19:20, 14 December 2012 (UTC)

Revision as of 19:20, 14 December 2012

Hello, Hestiaea, and Welcome to Wikipedia!

Please remember to sign your name on talk pages by clicking the signature button or by typing four tildes (~~~~); this will automatically produce your username and the date. Also, please do your best to always fill in the edit summary field. Below are some useful links to facilitate your involvement. Happy editing! SwisterTwister talk 22:55, 17 September 2012 (UTC)

Getting started
Finding your way around
Editing articles
Getting help
How you can help

GibraltarPedia

Thank you very much for your information. How strange, right? --Juanmatorres75 23:14, 17 September 2012 (UTC) (Preceding unsigned comment added by Juanmatorres75 (talk | contribs))

Note that links using mailman/private only work for signed-in list members. Links visible by anyone use pipermail. Example:

Thanks, and best wishes --JN466 06:33, 21 September 2012 (UTC)

Your research

Hi. I see you are writing about the philosophy and sociology of Wikipedia. Don't hesitate to ask me anything you wish. Regards, Kudpung กุดผึ้ง (talk) 03:36, 23 September 2012 (UTC)[reply]

In response to your feedback

You do not have to be :) As long as you do not commit serious vandalism, you will not be blocked. A block is just something that happens after 3 chances. So do not worry, you can just edit away, and if you make a mistake, then an administrator or another editor will just make you aware of it :)

Simeondahl (talk) 22:35, 17 October 2012 (UTC)

 

FTN

I've moved your FTN comment to the talk page. You can find it here: WT:FTN. IRWolfie- (talk) 16:35, 4 November 2012 (UTC)

Oops thank you. Hestiaea (talk) 16:49, 4 November 2012 (UTC)

Book

Feel free to ask me whatever you want. I've been on Wiki for a while and although I'm not a publisher/writer, I've worked with several authors on their books - so I guess "editing" is a hobby for me. Ckruschke (talk) 18:30, 5 November 2012 (UTC)

E-mail would probably be better as I usually only get on Wiki once a day. Post on my page and I'll delete once I write it down. ~Ckruschke

Wiki

We can continue on my talk page about your book question. History2007 (talk) 21:15, 5 November 2012 (UTC)

By the way, I noticed just above that you asked about FTN. See Talk:Bell's_theorem/Archive_4 and Talk:Bell's_theorem/Archive_5 as examples of fringe science and WP:COI. In this case users with COI were adding fringe views, but were blocked because a number of reasonable, uninvolved editors (including myself) came out against them when the alarm bells went off - pun intended. But that was only because Bell's theorem is a major item and enough people cared. On obscure scientific pages, no one would have noticed because no one would be watching those pages. I am not aware of any statistics of how many far-flung scientific pages are not even being watched. There is a flag that says how many people have the page on their watchlist, but some of those users may have already retired from Wikipedia, so that number is not accurate. History2007 (talk) 02:21, 6 November 2012 (UTC)[reply]
As a separate issue, see how much of my time was wasted dealing with those 3 WP:COI fringe users until they went away. It can be very time consuming to deal with these determined COI editors, and that type of Wiki-defense eats time like Pac-man. I don't even get involved in those any more. History2007 (talk) 02:25, 6 November 2012 (UTC)[reply]

Some examples

I think many people are unaware of "how neglected" various topics are, so I will gradually add some examples for you here so you can refer to them, as you go back to refer to the issues:

  • Talk:Message passing: Over 18 months ago, I tagged that as having big problems. And it is a key concept in computer communication and is rightly marked as a High Importance article. Its content is a disaster. Has anyone even worked on it? No.
  • Search engine technology: In this day and age of search engines, one would have thought that Wikipedia would have had a good article on that. As of this writing, that article is just hopeless and has no references. And it gets worse. In March 2011 I wrote to the WMF exec director and provided a challenge to do something about this and mentioned that article and quality of other articles in general. Was anything done? No. But the response I received did say a lot. The response was: silence.

There are many, many more important and neglected articles like those. This is not just an occasional problem. History2007 (talk) 20:07, 10 November 2012 (UTC)[reply]

Thank you. I was meaning to take this issue up with Sue (who I met a few years ago), and I left this [1] on her page. I am not an expert on message passing - is there any way you could explain clearly to a layperson the issues with some of these articles? The book will be aimed at an educated, middlebrow audience, but I am eschewing any technical jargon and specialist concepts. Hestiaea (talk) 09:45, 11 November 2012 (UTC)[reply]
Yes, yet there may be very little WMF can directly do about it if the Wiki-crowds do not care. But one does not need to be an expert on message passing to see that there are multiple tags on the article and no one is even addressing them. And many of the articles on digital signal processing have tags on them. But search engine technology will win the neglect Olympics here. It is a key technology and no one is even working on it: the page has had about 30 edits in 2012 and they were mostly minor trivial items. And it has zero references, etc. That is obvious. Now let us compare that with Johnny Depp: Well written, well sourced, factual and accurate. The same applies to Charlie Sheen and Britney Spears which has almost 300 references. Britney Spears products is also well written, and factual. But Sheen, Depp and Spears do not amount to scholarship. Wikipedia obviously works for common knowledge, but not for scholarship, yet is positioned as an "encyclopedia". That is why I have felt so uncomfortable about the endless parading of the 50 articles looked at in 2005 by Nature. That means very little once a simple analysis of the situation is performed. History2007 (talk) 10:25, 11 November 2012 (UTC)[reply]
I found this interesting. Hestiaea (talk) 10:28, 11 November 2012 (UTC)
On the subject of what the WMF can do, a properly designed reliability study would be the obvious choice. Although I think I already mentioned this. Hestiaea (talk) 10:31, 11 November 2012 (UTC)
Regarding the success stories, I could add my own. Anytime, and I mean any time I want to know about a town, city, river or lake, the only source I rely on is Wikipedia. I have total confidence in the geographic information it provides. Would I trust an article on biology? Never, unless I know the editor who wrote it. History2007 (talk) 10:38, 11 November 2012 (UTC)[reply]

I thought this was entertaining: "Search engines that are expressly designed for searching web pages, documents, and images were developed to facilitate searching through a large, nebulous blob of unstructured this and that. They are engineered to follow a multi-stage process: crawling the infinite stockpile of pages and documents to skim the figurative foam from their contents, indexing the foam/buzzwords in a sort of semi-structured form (database or something), and at last, resolving user entries/queries to return mostly relevant results and links to those skimmed documents or pages from the inventory." I don't know anything about SE technology, but the writing could probably do with a bit of attention. Hestiaea (talk) 10:33, 11 November 2012 (UTC)

Yes, but the article is not about what search engines are, it is about the "technology" used in them.... That part is in search of editors to work on it. And look at where the content came from: mostly IPs with no history of serious work on the subject. History2007 (talk) 10:38, 11 November 2012 (UTC)[reply]
The article history suggests that the bulk of the article, apart from one section, was contributed by this guy in 2006. Hestiaea (talk) 10:46, 11 November 2012 (UTC)[reply]
Yes, outdated articles are all over the place. Look at this case. I really started singing along when I saw that one.... I eventually fixed it, but how many more can I fix? I have now effectively stopped work on those. But there is correct content here and there, and that says other things. My very last serious article was Message passing in computer clusters. So why did I work on that and not message passing itself? Because working on basic computing articles in Wikipedia may be building sandcastles. You will have to watch too many of them and it is just too painful to see your work degrade over time by additions from IPs who know nothing. If there is little protection for your work, why bother? So I will never touch the basic articles like computer architecture or message passing. The only reason I wrote "message passing in computer clusters" was to complete the series of articles I had started on supercomputing after the John Lennon incident. There is a statement somewhere in Wikipedia editing guidelines that says: "if you don't want your work to be brutally edited by others, do not contribute it", or something like that. I think the response to that from many experts (myself included now) is: "you have a deal, we will not". History2007 (talk) 11:38, 11 November 2012 (UTC)[reply]

Thanks for mentioning this. It's fascinating. And there really is a Britney Spears project? It's like a parody of the popular view of Wikipedia, but of course it's real. Hestiaea (talk) 10:58, 11 November 2012 (UTC)

Yes, there is a Britney project with many active editors.... and that reflects the unusual nature of the entire project. History2007 (talk) 11:38, 11 November 2012 (UTC)
On the subject of WMF action, I thought about that some more. What can they do? They can spend money on a formal reliability study, of course. But then what? Suppose it says reliability is low. What can they do? Wikipedia policy is decided by users at large. Again, let us look at the self perpetuating RFCs on Pending Changes. Assume the WMF decided that Pending Changes are needed. Do they have any power to impose them? If the Wiki-crowds do not want Pending Changes, I do not see a policy that allows WMF to do anything against the will of the crowd. And the will of the crowd now points in 12 different directions. That phenomenon is technically known as social gridlock, and usually maintains the status quo for a very long time. I think all that WMF can do is try to somehow change the will of the crowd. From what I have seen in the past 12 months in the Pending Changes Palio (and it really was as chaotic as that one the last time I naively participated in it on Wikipedia) I am not sure how that can happen now. Social gridlock is how societies and communities stagnate and from what I saw on the Pending Changes Palio (which has had no winner after all the excitement) real gridlock has already set in the Wikipedia social experiment, and I am not sure what can be done about it. History2007 (talk) 15:12, 11 November 2012 (UTC)[reply]
Is social gridlock a real term, by the way? Hestiaea (talk) 15:17, 11 November 2012 (UTC)
Yes, it has happened a few times in history, e.g. please see: The Business Community of Seventeenth-Century England by Richard Grassby 2002 ISBN 0521890861 page 413, which states: when "paralysed by social gridlock, which required a major war or revolution to disentangle". I guess Wikipedia needs an article on that, but I am not up to writing it... History2007 (talk) 15:31, 11 November 2012 (UTC)

Summary of the situation

As a result of the discussion with you, I thought about the situation some more and in the end it is quite simple really:

  1. Any open society/community is shaped by a set of common beliefs and ideals among its participants.
  2. A young community prospers if the founding group manages to craft policies to organize it, yet not restrain it. That happened 2003-2006 on Wikipedia as policies took shape.
  3. In an open community, policies only work if the large majority of people show some respect for them; e.g. if everyone in New York City decided to scratch other people's cars as they walk the streets, there is little the police could do about it.
  4. As the profile of the people within the community changes, the initial policies may no longer help it prosper.
  5. If the policy making process is not dynamic enough, the community will stagnate with old policies that no longer fit the times, social gridlock sets in, and prosperity fades away.

The key points here are:

  • The initial few thousand people who formed Wikipedia were almost entirely optimistic, well intentioned idealists. The policies they designed (e.g. WP:AGF) work really well for that type of community.
  • As the user profile diversified, "consensus based decision making" made changes to policy very hard indeed.
  • As Wikipedia prospered, the ratio of idealists to opportunists changed dramatically - Gibraltar is just one minor manifestation of that. The opportunists realized that publicity could be obtained at extremely cost effective rates.
  • The situation is exacerbated by the fact that as the number of articles has grown, the kingdom is now so vast that the few thousand well intentioned idealists can no longer police all the articles (30,000 new articles get added every month anyway), and in a large number of cases the opportunists can do whatever they like.

So the persistent opportunists are wearing out many idealists (look on WP:FTN for a few days and see how persistent and widespread they are), and while a great deal of information exists in Wikipedia, it is getting so mixed with WP:COI and fringe views (as well as unintentional errors) that it is getting almost impossible to unscramble the egg.

But Wikipedia was an interesting social experiment in any case, and the best parts of its content may yet be (and indeed are being) selectively siphoned by other organizations that may clean some of it up and use it for educational purposes. What was of great value was the Wikipedia "node structure" and that was the first thing Google took - it was free. And if you try [2] you will see the beginning of the trend on the right hand side of the screen. But there will be others, and social networking sites will take some more, etc. As the man said The Times They Are a-Changin', of course. History2007 (talk) 09:07, 12 November 2012 (UTC)[reply]

Thanks - this is pretty similar to views expressed by other interviewees who have left Wikipedia, but good to hear from someone who is still (nominally) working here. Hestiaea (talk) 20:08, 12 November 2012 (UTC)
It will be interesting to see how the Wikipedia:Disambiguation links get used by commercial entities. They are pretty clean and very useful, in fact worth good money and I am sure they will be siphoned off before anything else. Eventually there will be a "Google gobble" of a lot of content, as they build their own. Microsoft is a rich man on life support now, so I am not sure if they can do anything right these days, but there will be other companies who will siphon the content into a Wolfram alpha type system. So clean content will emerge after that. History2007 (talk) 20:46, 12 November 2012 (UTC)[reply]

Other websites

I was surprised to find out about this website today, after my 5 years here. I had never heard of it, and it does not seem to be mentioned in Wikipedia. I thought you should know about it, if not already familiar with it. History2007 (talk) 00:55, 13 November 2012 (UTC)

Questions

Hestiaea, I will attempt to answer your questions if you contact me off line via my contact tool. Tom Butler (talk) 19:38, 10 November 2012 (UTC)

Official statistics

Here is the closest thing to official statistics about article quality that you will find: Wikipedia:Version_1.0_Editorial_Team#Statistics. On the quality, it chooses the highest rating from a wikiproject for an article, so its way of working it out is probably a bit optimistic. IRWolfie- (talk) 21:27, 10 November 2012 (UTC)

Yes, that is the semi-official method. But some explanation is in order:
  • The assignment of an "assessment" is in most cases unassessed in that the assessment itself may be a joke. You could go and assess an article as B or C as a joke, or the assessment may have been by a 12 year old, or a person who has had 12 beers too many. And there are people who edit Wikipedia after 12 beers - there is no test for that, and no policy against it.
  • The FA and A marks may have some meaning. I don't think FL is way off either, so overall 6 or 7 thousand articles may have been carefully checked for quality, i.e. less than 2% of the total!
  • The GA mark is questionable in my view, but is better than nothing. The person doing the review for a GA may be a 12 year old.... No policy against that.
That means that about 7,000 articles may have a meaningful assessment, another 15,000 a vague and partial assessment and the rest are of unknown quality in that anyone can change the assessment at any time at will. And the assessment may be 4 years old, the article may have changed a lot and been gently vandalized in the meantime. The rest are anyone's guess.
But the official optimistic numbers are still scary: over 3 million articles are "start or stub" i.e. of low quality. So the overall situation is very bleak indeed, as 30,000 new articles are getting added every month, again with no quality checks to speak of. So the situation is this:
  • About 25,000 articles (7,000 seriously, and 15,000 just GA) have been looked at by other community members and "assessed" in a formal review beyond a number pulled out of the air by a single user.
  • Over 30,000 new articles are coming in every month.
So over 98% of the content is of unknown quality by any serious measure. Russian Roulette of knowledge. History2007 (talk) 09:26, 11 November 2012 (UTC)[reply]
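The proportions quoted above can be checked with a short calculation. This is only a rough sketch: the counts are the round figures given in the comment, and the total of 4 million articles is an assumption taken from the estimate used further down this page, not an official statistic.

```python
# Round figures quoted in the comment above. TOTAL_ARTICLES is an assumption
# (the "4 million articles" figure used later in this discussion).
TOTAL_ARTICLES = 4_000_000
CAREFULLY_CHECKED = 7_000   # FA, A and FL articles
GA_REVIEWED = 15_000        # articles with only a GA review
NEW_PER_MONTH = 30_000      # new articles added each month


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


reviewed = CAREFULLY_CHECKED + GA_REVIEWED
reviewed_pct = share(reviewed, TOTAL_ARTICLES)
unknown_pct = 100.0 - reviewed_pct

# How many months of new articles equal everything formally reviewed so far.
months_to_match = reviewed / NEW_PER_MONTH

print(f"Formally reviewed: {reviewed:,} articles ({reviewed_pct:.2f}%)")
print(f"Unknown quality:   {unknown_pct:.2f}%")
print(f"New articles match the reviewed total in {months_to_match:.2f} months")
```

Under these assumptions the reviewed share comes out well under 2%, which agrees with the "over 98%" bound stated above, and less than one month of new articles outnumbers everything that has been formally reviewed.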

How long will it take to assess Wikipedia?

This has become interesting now. I asked myself, how long will it really take to check Wikipedia for accuracy? Here are some straightforward approximations:

  • How long does it take to check an article? That obviously depends on article size. A short article may take 20 minutes, a really long article several days by the time the references are verified to avoid the jagged phenomenon. A mid sized article would take several hours to verify. My feeling is that 4 hours is the average time it would take me to verify an average article.
  • Given 4 million articles, that is 16 million hours. Assuming a 40 hour week, that is 400,000 weeks, i.e. about 8,000 years for one full time person.
  • Assuming that the top 1,000 Wikipedians who can be trusted work on it at 20 hours a week, it would take 16 years. If they work on it full time, it will take 8 years.
  • In terms of real cost, assuming $30/hour, the cost is about $480 million, i.e. about half a billion dollars to check Wikipedia.
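The approximations above can be reproduced in a few lines. This is a minimal sketch using only the figures given in the list; the 50 working weeks per year is an assumption introduced here because it reproduces the rounded results (8,000 years, 16 years, 8 years), and the function name is purely illustrative.

```python
# Figures taken from the estimate above.
ARTICLES = 4_000_000
HOURS_PER_ARTICLE = 4
HOURLY_RATE_USD = 30
TRUSTED_REVIEWERS = 1_000

# Assumption (not stated in the estimate): 50 working weeks per year.
WEEKS_PER_YEAR = 50


def years_needed(total_hours, reviewers, hours_per_week):
    """Years to work through total_hours with the given workforce."""
    return total_hours / (reviewers * hours_per_week * WEEKS_PER_YEAR)


total_hours = ARTICLES * HOURS_PER_ARTICLE

one_person_years = years_needed(total_hours, 1, 40)
part_time_years = years_needed(total_hours, TRUSTED_REVIEWERS, 20)
full_time_years = years_needed(total_hours, TRUSTED_REVIEWERS, 40)
total_cost_usd = total_hours * HOURLY_RATE_USD

print(f"Total effort:            {total_hours:,} hours")
print(f"One full-time person:    {one_person_years:,.0f} years")
print(f"1,000 people, 20 h/week: {part_time_years:.0f} years")
print(f"1,000 people, 40 h/week: {full_time_years:.0f} years")
print(f"Cost at $30/hour:        ${total_cost_usd:,}")
```

Changing HOURS_PER_ARTICLE shows how sensitive the result is to the per-article guess: halving it to 2 hours halves every figure, but the cost is still about $240 million.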

Now I understand why it will be there for long, but no one will ever know how reliable it is because by the time it is checked half way through, the contents will have changed.

Now I also understand why traditional encyclopedias have been smaller than this, because they had to be checked by experts, and the barriers outlined above limited their size. History2007 (talk) 09:04, 17 November 2012 (UTC)[reply]

That's a good point, and it is one I have covered in the book. The 'official' story of Wikipedia is that there was a 'bottleneck' in the complex process used to approve Nupedia articles. The bottleneck was solved by the introduction of the magical wiki process, and the rest is history. The true story is a bit more complex. There are really two bottlenecks in the development of any encyclopedia, namely in achieving breadth and depth. 'Broad' articles are those like Ardmore, Alabama where anyone with a little training can write a good article. You need lots and lots of these so you need a way of rapidly developing them. In a traditional encyclopedia you hire graduates or get interns on fairly modest pay, and you often copy articles from older encyclopedias. The wiki managed to solve this bottleneck brilliantly. Getting 'depth' is a lot more difficult as there is no other way than hiring experts and either paying them, or rewarding them with status reflected by the status of the reference work as a whole. The Stanford Encyclopedia of Philosophy manages this brilliantly. The wiki did not resolve this bottleneck very well, and the rate of reliable article creation in 'deep' subjects has been no better (in my view) than with Nupedia. I like your monetary calculation, by the way. Hestiaea (talk) 10:59, 17 November 2012 (UTC)
I agree with the first part of the analysis. The issues I do not agree with are that experts need "status" and the Stanford issue. The problem with experts is not status (I could not really care less about it myself) but two other issues:
  • A key problem is building sandcastles. Any person who has had 12 beers can come and change what the expert wrote, or worse a wannabe researcher can add WP:COI to make themselves look good. The "lack of protection" is a key (pronounced lethal) problem in attracting experts.
  • The reverse problem, as reflected in the Stanford case is that just one expert can go off on a personal tangent. I have seen that on Wikipedia with some experts who have a dusty Nobel prize somewhere, then act in a less than rational manner here. Even in less extreme cases, I have seen experts who I personally know and wrote parts of the Stanford come over and try to push the envelope on Wikipedia with new theories, buried into the text they write.
But overall I agree with your analysis. The Wiki model can deliver a large number of shallow articles of semi-reasonable quality, but depth will forever avoid the model, as is. History2007 (talk) 19:13, 19 November 2012 (UTC)[reply]

I should mention some other points: First, take a look at where the money for Wikipedia comes from. Most donations are about $10 or so I think. The data is somewhere on WMF pages. Next, on my talk page, I noted that "a reader has no indication whether any information obtained from Wikipedia at any point in time is correct or incorrect". But in fact there is interesting analysis that can be done there in terms of probability of accuracy. At the moment even the probabilities are totally unknown.

I am assuming that your background is in the humanities, so I would suggest having a few face to face conversations with some statisticians before you say anything in the book about sampling. It is a tricky issue. A sample of 30, 50 or 100 articles out of 4 million can only be described as hopeless.

But an interesting question that can turn into a master's degree thesis is: "what are the probabilities that information in a crowd sourced system is correct". Given some basic assumptions that will probably give some surprising results. I have no interest in working on it, but once someone publishes a serious mathematical analysis, it will be interesting to read. History2007 (talk) 02:34, 20 November 2012 (UTC)[reply]


  • On status, this is something we would have to agree to disagree about. I've interviewed a number of academics on the subject, and I took a hard look at the Nupedia mailing lists to understand why the Nupedians didn't like Wikipedia when it was introduced. You may be right, but all the evidence points to academics having a concern for both status and reputation. (Reputation and status are different, but connected, of course).
  • On sampling: even those of us in the humanities understand the importance of avoiding 'selection bias'. I think it's pretty simple: select the articles that would be the 'core subjects' of any standard reference work. One could object that Wikipedia is not a 'standard reference work', but fine, we will take that objection exactly as it is. If our study finds that Wikipedia is poor in areas that are handled well by standard reference works, then that is what it finds. Selection bias was a problem in the Oxford study, by the way. They selected Aquinas and Anselm for the study without explaining why they had been selected, given that there are at least five other major medieval philosophers they could have selected. I devised a metric based on the number of pages allocated to each scholastic, then averaged these out. The results were pretty consistent with my own expert, albeit subjective, judgment (Aquinas, Scotus, Ockham, Anselm, possibly Augustine if medieval includes him). One can do a similar sampling for subjects, as well as biographies. E.g. the ten greatest books in the history of philosophy. I think my own judgment would be pretty consistent with a sampling based on standard reference works. I think that would be consistent through time, also. I.e. if I looked at the 10 philosophers with the most space in Britannica 1911, that would map pretty well, though not exactly, with the list based on Britannica 2011. It would not map very well to Wikipedia, by the way. Ayn Rand received massive coverage on Wikipedia, but hardly any in standard reference works. Hestiaea (talk) 15:25, 20 November 2012 (UTC)

Final thought

This has been an interesting discussion, but I should end it now, for in the process my own views have changed a lot. It happened partly as a result of noting that most donations are around $10. Let me explain it via an analogy. The cool reception we both faced at WMF is reminiscent of two members of Guide Michelin walking up to the CEO of McDonald's, expressing utter disappointment in the quality of the food he serves, and asking what he intends to do about it. What we did was walk into a few outlets, sample the food and walk out outraged saying: "What! There is no quail on the menu! And no Champagne! How can you call this a restaurant?" His response to our complaints that McDonald's is committing crimes against gastronomy will be simple: "Over 100 billion served". That is all he needs to say.

So, now I think just as McDonald's is an entity that caters to the masses and is supported by the masses, Wikipedia is intended for the masses and is supported by the masses who pay $10 every November. The Michelin critics should not even attempt to review it. But when a member of the public is really hungry and needs to grab a quick byte, they do not wait to see if the closest 2 star restaurant can give them a reservation. They go to McDonald's. Will they complain about crimes against gastronomy? No. And the Britney project is the equivalent of Ronald McDonald. It gives the kids something to play with.

Will Wikipedia ever achieve academic quality? Only when a few McDonald's outlets get 2 Michelin stars. I am not betting on it, for it is not part of their culture, or the aim of the audience they serve. In the long term, Wikipedia will just ignore academic arrogance and shun the experts just as Ronald McDonald will shed no tears for not having a star.

Anyway, I wish you well, and good luck with the book. History2007 (talk) 15:12, 20 November 2012 (UTC)[reply]

Interesting - but we ought to distinguish 'supply' of knowledge from 'demand' for knowledge. My analysis of page views suggests that actually 'the masses' are more hungry for quality steak than you think. E.g. look at the page views on Aristotle. Not quite up there with Britney Spears, but respectable enough. But when you look at the supply, you find that many editors on Wikipedia are working on video games, old TV episodes of Babylon 5 and so on. I.e. the supply is for McDonald's, but the demand is for a higher quality, sadly. In my own subject area, there is a need for articles that would not be so difficult as the Stanford Encyclopedia of Philosophy, and aimed squarely at a middlebrow readership, but without compromising on 'quality'. It's sad that Wikipedia cannot supply this. I interviewed a few philosophers for the book and they all said the problem was the 'sandcastle' effect, as you neatly term it. Hestiaea (talk) 15:31, 20 November 2012 (UTC)
Not to prolong this, but you should also know of the "addictive effect". For some people not doing Wikipedia can be painful. I mentioned that I was becoming less active on Wikipedia to another editor who was also thinking of sandcastles, and he said: what will you do instead? I said that last summer I went out on the beach a lot more and watched people build castles there - I did not participate. But it was more fun than Wikipedia. Anyway, as I said, let it be 200 billion served. History2007 (talk) 15:51, 20 November 2012 (UTC)[reply]
This has been very fruitful for me. Thanks for your thoughts. Hestiaea (talk) 15:52, 20 November 2012 (UTC)
You are welcome. History2007 (talk) 15:55, 20 November 2012 (UTC)
In the book, could I refer to you by a normal name? 'History2007' is a bit awkward. I don't mind if it's a made-up name, e.g. 'Chris' or 'Dave'. Let me know. Hestiaea (talk) 20:20, 20 November 2012 (UTC)

That was a Dave Letterman joke a week or two ago:

  • "Mayor Bloomberg said New York is going back to normal after the storms. Normal? Has New York ever been normal?"

So along the lines of that joke, call me David. History2007 (talk) 20:47, 20 November 2012 (UTC)

One final, final, final thought, then I will stop watching here: Please do take a look at Stratified sampling, Cluster sampling, Multistage sampling, etc. It is not that simple to sample the 4 million pages here, across multiple subjects given that technology articles change much faster than medieval items, corporate mergers in business articles, etc. The results obtained from philosophy articles may have zero relevance to technology or business subjects. Now the "ultimate irony": as a Wikipedia user can you trust what any of those three Wikipedia pages on sampling say? Funny, is it not? Or should the page ask: "Is that for here, or to go?". Anyway, so long and best wishes. History2007 (talk) 22:14, 20 November 2012 (UTC)
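To make the sampling point concrete, here is a small sketch of stratified sampling with proportional allocation: the population is split by subject area and each area contributes to the sample in proportion to its size, so no single subject dominates the result. Everything in it is hypothetical: the subject areas, their sizes, the article names and the function names are invented for illustration and do not describe how Wikipedia is actually organised.

```python
import random


def proportional_allocation(stratum_sizes, sample_size):
    """Split sample_size across strata in proportion to their sizes.

    Fractional shares are rounded down, then the leftover slots go to the
    strata with the largest fractional remainders, so the parts always sum
    to sample_size.
    """
    total = sum(stratum_sizes.values())
    exact = {name: sample_size * size / total
             for name, size in stratum_sizes.items()}
    alloc = {name: int(share) for name, share in exact.items()}
    leftover = sample_size - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - alloc[n], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


def stratified_sample(strata, sample_size, seed=0):
    """Draw a simple random sample inside each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(items) for name, items in strata.items()}
    alloc = proportional_allocation(sizes, sample_size)
    return {name: rng.sample(strata[name], alloc[name]) for name in strata}


# Hypothetical strata: invented subject areas with invented article names.
strata = {
    "technology": [f"tech-{i}" for i in range(500)],
    "business": [f"biz-{i}" for i in range(300)],
    "medieval philosophy": [f"phil-{i}" for i in range(200)],
}

sample = stratified_sample(strata, sample_size=100)
for name, articles in sample.items():
    print(f"{name}: {len(articles)} articles sampled")
```

A real study would also have to decide how to define the strata in the first place and whether fast-changing subjects need to be re-sampled over time, which is the harder part of the problem raised above.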

Book

If you want a clear picture of the situation on Prem Rawat you might consider talking to both sides. There is seldom one side to Wikipedia disputes. (olive (talk) 19:01, 16 November 2012 (UTC))

Absolutely - happy to talk. I am away for a bit but let's continue next week. Hestiaea (talk) 10:51, 17 November 2012 (UTC)

Re: ACE 2012 question

Before I answered your first question I wanted to clarify what you meant by "criminal". Are we talking "illegal in the lower 48 states"-type activities, or just activities in violation of policies and guidelines on Wikipedia? That changes the thrust of my answer rather significantly :) Thanks, Der Wohltemperierte Fuchs (talk) 23:58, 20 November 2012 (UTC)

I actually did mean 'criminal' in the 'could get prosecuted' sense. However the thrust of my question covered either. I am interested in the case where something should have been made public to the community but wasn't, and later on you are faced with the choice of making it public - and embarrassing yourselves in the process because you didn't make it public before - or keeping it covered up. It's mainly a question about openness and transparency. Are you open and transparent even when it means 'fessing up to previous lack of openness and transparency? And perhaps embarrassing your colleagues? Or do you keep the skeletons locked up in the cupboard, perhaps adding another skeleton or two in the process? Thanks for taking the trouble to ask me to clarify the question. Hestiaea (talk) 08:58, 21 November 2012 (UTC)[reply]

Rephrasing of question

I've given an initial response to your questions, which you may have seen. Above, in response to the query from David Fuchs, you say the question is about openness and transparency, and ask "Are you open and transparent even when it means 'fessing up to previous lack of openness and transparency? And perhaps embarrassing your colleagues? Or do you keep the skeletons locked up in the cupboard, perhaps adding another skeleton or two in the process?" While I'm wary of potentially reopening specific issues, I would be happy to answer a general question like that, which applies to several situations I can recall, and to which I could give a general answer.

On a different topic entirely, I've been perusing your talk page. This is an excellent point. That, and several other comments, remind me of some of the more cogent arguments I've seen on a number of Wikipedia criticism sites. If I have time, I may return and comment on some of that, as the issues there are fundamental ones and should be discussed more widely. Carcharoth (talk) 00:24, 22 November 2012 (UTC)[reply]

I agree the issues are fundamental ones, but interested whether you agree for the same reasons. My take on it is that, because of the extraordinary achievement of the wiki in developing a broad article coverage, there was an 'illusion of success', i.e. an illusion that specialist skills were not needed to build a comprehensive reference work. Of course, broad does not equal deep, but that is the illusion. Interested in your take, though. Hestiaea (talk) 19:50, 22 November 2012 (UTC)[reply]
In response to your other question here, what I am getting at is the situation frequently heard in smoke-filled rooms: "If we admit it now, we will be asked why we didn't admit it earlier". I.e. the Committee should have made some matter public some time ago, but didn't, and is now faced with having to embarrass itself by making it public. So do you embarrass yourselves, or do you find a further excuse for keeping it from the public view? It's really about striking the right balance between openness and transparency, and avoiding the embarrassment and loss of face that openness and transparency inevitably bring. Hestiaea (talk) 20:01, 22 November 2012 (UTC)[reply]
I'd be happy to answer a question along those lines. Do you want me to answer here or on the election questions page? Carcharoth (talk) 21:26, 22 November 2012 (UTC)[reply]
Whatever you think is more appropriate. I imagine more people will be reading the election questions page than this little backwater! Have a good weekend! Hestiaea (talk) 19:41, 23 November 2012 (UTC)[reply]

A few more issues

Something semi-related occurred to me the other day and I thought I would mention it to you. Then a few other issues came up.

Why do so many students have to pay $10 each?

The fact is that $500 million is not a lot of money these days. I thought of that because I saw a news story that that type of cut was discussed just for the pork/perks of the members of the EU parliament, etc. And look at what the candidates spent in US elections the last 6 months, what the bailout of fake bookkeeping and mismanaged games in Greece/etc. is costing etc.

Will the members of the EU parliament take a small pay cut to fund an encyclopedia? Are you kidding? Can 1% of the cost of TV ads in the next US election go to fund an encyclopedia? No way.

So the reason college students from Brazil to Denmark have to pay $10 each to support an encyclopedia is the total failure of the higher ups to do it right. A small contribution by US/EU/Japan/Etc. could have provided $1 billion to build an encyclopedia that gets it right. A billion given to the top 30 from this list could have funded grad students and professors, and done it right - instead of serving fast food style knowledge. Charlie Sheen's page would have been excluded and could have been done by the kids, but the serious items could have been done right. One million correct articles, instead of 4 million half-baked ones. $1 billion for 1 million articles is $1000 per article, and most professors would do an article for less than that because they already have lecture notes on it. And many of the smaller articles would just cost $200 by grad students anyway. The numbers are straightforward.
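
A quick check of the arithmetic above, as a sketch. The $1 billion budget, the 1 million articles and the $1000 and $200 per-article figures come from the comment; the 50/50 split between full-price and smaller articles is purely an assumption for illustration.

```python
BUDGET = 1_000_000_000   # dollars, from the comment above
ARTICLES = 1_000_000     # target number of articles, from the comment above

average_per_article = BUDGET / ARTICLES  # dollars available per article

def total_cost(n_articles, share_small, cost_full=1000, cost_small=200):
    """Total cost if a share of the articles are smaller ones done by grad students."""
    n_small = int(n_articles * share_small)
    n_full = n_articles - n_small
    return n_full * cost_full + n_small * cost_small

# Assumed split: half the articles are "smaller" ones at $200 each.
cost = total_cost(ARTICLES, share_small=0.5)
print(average_per_article, cost, cost <= BUDGET)
```

Under that assumed split the total comes in well under the budget, which is consistent with the claim that the numbers are straightforward.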

Wikipedia exists, because academia has failed to ask the governments for the funds to do it right.

Cross platform issues

The failure of scholars mentioned above is even more apparent in the case of Wikisource. The very least the scholarly world (and those who fund them) could have done is orchestrate a central repository for public domain ancient documents. But as is, students have to pay $10 each to bring it about. And as I have said before elsewhere, as is it would not be hard to slip into Wikisource that Isaac Newton's favorite cheese was Cheddar. There are too many pages there and nowhere near enough protection.

And cross-wiki issues are paramount, e.g.

  • At any point in time Wikipedia can easily disagree with itself - and in many cases does. In an obvious case, it usually disagrees with itself about the busiest airports in the world. One page can list Narita as 3rd, another as 4th, etc. and these change all the time. Wolfram Alpha does not have that problem because it uses more modern technology.
  • Cross language pollution is another issue. Errors that have been corrected in English Wiki may persist in other languages for a long time. Those use older versions of the English text in many cases, and pages are mostly ignored.

And of course there is no formal mechanism for checking/coordinating content between Wikis. They have different governance.

Information pollution

If knowledge helps the world (the underlying assumption here) then does incorrect knowledge harm it? It would be interesting to build a model to get some type of estimate of the level of Information pollution provided by web sites in general, and Wikipedia as one case. All those pages with errors can not be helping - and in most cases readers do not even know they are getting wrong information. You should talk to someone in that field in a university to write a few contributed pages for you on that. These types of models have been built for web access, several times.

The idea is that while dramatically wrong information such as "Larry King has two heads" can be noticed, a large number of low level (say mildly toxic) items can go unseen and as they add up, they go all over the place - jagged was an obvious example of that, and it is not clear how many more there are. Unlike physical toxins from a rusty water fountain which are finite, these errors get reproduced in other sources and spread.

The idea of the model here is to mimic the analyses done for the spread of general toxins, and often involves summation via an integral over a range of probabilities, assuming a toxicity index for each page, viewed as "a source". It will take some analysis, but it is not hard and those types of models are built all the time for pollutants dumped in rivers by factories, etc. But while those toxins do not get reproduced, info-toxins do.
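
As a rough sketch of the kind of model described above (not an established method): treat each page as a source with a toxicity index, sum views times toxicity as a discrete stand-in for the integral, and then add a simple geometric term for reproduction in other sources. All numbers, field names and the copy_rate parameter are assumptions made up for the example.

```python
def expected_exposure(pages):
    """Expected number of erroneous readings: sum of views x toxicity index."""
    return sum(page["views"] * page["toxicity"] for page in pages)

def with_reproduction(base_exposure, copy_rate, generations):
    """Add exposure from copies: each generation reproduces a fixed
    fraction (copy_rate) of the previous generation's exposure."""
    return base_exposure * sum(copy_rate ** g for g in range(generations + 1))

# Hypothetical pages; "toxicity" is an assumed probability that a reader
# picks up an error from the page.
pages = [
    {"title": "introductory topic", "views": 10_000, "toxicity": 0.02},
    {"title": "advanced topic", "views": 500, "toxicity": 0.01},
]

base = expected_exposure(pages)
spread = with_reproduction(base, copy_rate=0.5, generations=2)
print(base, spread)
```

The reproduction term is what separates info-toxins from physical ones in this sketch: with copy_rate set to 0 the model reduces to the finite, non-reproducing case.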

Usability

By the way, the page Information quality is pure junk - don't rely on it.

It is suddenly becoming quite clear here that as "user Hestiaea" you can not use Wikipedia to do the basic research for the book you are writing about it. You can not rely on the statistics articles for your sampling questions and can certainly not rely on the Information quality articles for your reliability questions. If you use those pages, your book will be murdered by its reviewers. Way to go... History2007 (talk) 05:11, 27 November 2012 (UTC)[reply]

Thanks for pointing that one out - also Data quality is not much better. Hestiaea (talk) 20:14, 28 November 2012 (UTC)[reply]
Any time. But thanks should go the other way too, for this discussion has started to save me some time in fact. And the more I think of it, the more pointless many of the discussions in crowd sourced websites seem to be. Take this admin resignation letter for instance - it is on the B-noticeboard, those who monitor admins. The long and short of it is that as large crowds arrive, it becomes less of "crowd source development" and more of "crowd source mayhem". And reliability does not benefit. So I am stepping even further back now, the more I think about it. Thanks. History2007 (talk) 15:48, 30 November 2012 (UTC)[reply]

Roulette

Did someone play Russian roulette last week? It looks like someone did. I wonder if he will donate to Wiki next Nov. History2007 (talk) 14:58, 4 December 2012 (UTC)[reply]

Thank you! That goes into the book notes. Hestiaea (talk) 18:58, 4 December 2012 (UTC)[reply]
The point is that this is happening again and again. And it only comes to attention when some gov official gets a red face, e.g. the this, the UAE team, etc. Who knows how many more there are, and given the deceleration of page monitoring 3.3 as of Jan 2012 that trend will exacerbate. In January/Feb 2013 the 2012 numbers will show and may yet confirm that trend. The Wiki doors are wide open, and ....
And that is in the context of huge amounts of editor time wasted on troublesome users. Per WP:Canvas you can not comment or be involved in this discussion now that I have mentioned it here, but can just observe it from afar to see how hard it is to get rid of a user. And this is just the start of that process, my guess is 20% done, 80% more effort by all involved to expel him. Pac-man at work, while the community is asleep. History2007 (talk) 20:01, 4 December 2012 (UTC)[reply]
The 'right to assembly' i.e. for people to notify each other and get together to discuss a common interest or campaign is a fundamental and often hard-won freedom in most human societies - it always seemed odd to me that WP frowns on it. I understand it originated when people formed secret mailing lists and covert campaigns. I suppose the problem is that in real life the assembly is usually visible, but then that shows there are all sorts of problems with 'crowdsourcing' that its inventors never foresaw. Hestiaea (talk) 20:06, 4 December 2012 (UTC)[reply]
That is not how I see it. This is like some person who runs red lights again and again and the only way to take their driver license away is to have a referendum! This is how social gridlock chokes societies. And this case has not even got to WP:AN yet. When it gets there people will come out of the woodwork to support him, say give him 2nd and 3rd chances, etc. There is only one term for it: social gridlock. History2007 (talk) 20:47, 4 December 2012 (UTC)[reply]
Who are the people coming out of the woodwork? Hestiaea (talk) 20:49, 4 December 2012 (UTC)[reply]
I do not know. Just watch it unfold when it gets to AN. Once when I was arguing for more protection a 22 year old from Brazil showed up and spent huge effort arguing that the "founding principles" were open and must stay that way. Look here for how they are defending fringe as we speak! But again, please do not comment there. History2007 (talk) 20:53, 4 December 2012 (UTC)[reply]
Ah I had to look to test my theory that a non-expert can sniff out an impostor by various heuristics. Tests include whether the editor uses faulty logic, poor grammar, whether there are signs of unclear thinking, whether there are editors who can obviously think clearly pitted against him and a few other signs. Strongly positive on this one. The problem is to encode that logic into rules and principles. In my world, we interview people and get them to submit written work or tests, and this eliminates close to 100% of the impostors. Very old fashioned and pre-internet and the magic wiki, but always works, IMHO. Good luck with that one. Hestiaea (talk) 20:57, 4 December 2012 (UTC)[reply]
Your comment that "the number of opportunists who saw this website as a means of self-promotion grew" is quite right. By the way I have completed most of my work on fringe editing and have moved on to paid editing. That is a beautiful area for research. Hestiaea (talk) 20:59, 4 December 2012 (UTC)[reply]
I have spent min time on that, look at Christie's edits and talk page of how much time they spent on it. I am a bystander there. But the way fringe editing eats time is part of the reason paid editing goes unnoticed. History2007 (talk) 21:03, 4 December 2012 (UTC)[reply]
By the way, I think the Independent just invented a new term there: "Wikipedia moment". I bet it will enter pop-culture sooner or later. Will it go into Wiktionary? History2007 (talk) 10:21, 5 December 2012 (UTC)[reply]
Also here now. History2007 (talk) 11:22, 12 December 2012 (UTC)[reply]

Sources of errors

Just before you go to print, here is another reason crowd sourced encyclopedias can not be relied upon. What I have seen is that there are many people who try to do disambiguation. E.g. user A types Seneca, without specifying which one. Six months later, user B who knows very little about the topic takes part in a "disambig campaign" with the best of intentions and tries to add links to many articles within a 3 day period. In a number of cases people with the best of intentions select the wrong Younger/Elder Seneca for the link, and the error persists almost for ever, unless someone with knowledge happens to notice it, but there are just too few of those users. There is no way around that in crowd sourced development. History2007 (talk) 11:22, 12 December 2012 (UTC)[reply]

The click ahead

I saw this the other day and thought of it as The Click Ahead. As everyone knows now, Bill's book confirmed what the man had said long ago: the book missed the road by a mile. But forecasting does help thinking and planning, so I thought that as a reader of a book on Wiki and crowd sourcing I would have liked to see what Wikipedians think the click ahead will bring about. You have talked with DGG, and he is one of the most involved and knowledgeable editors in terms of library science, etc. But as a reader I would have liked to see what 5 or 6 Wikipedians think. I will give you my guess below.

Have you talked with user:Johnbod? He is very dedicated, eternally optimistic and a respectable gentleman type. So his views of the future would be interesting, given that he is a content provider, while DGG is a guide and administrator. John is in London, according to his IP address. And you could look here, pick a few others and hear what they see as the future. That would be good material at the end of the book. But you would be wise not to make predictions yourself.

Anyway, what I think will happen, and in fact hope will happen is a "federated approach" - again one can not rely on the Wiki-article on it, so just look at the diagram for the idea.

In this scenario Wikipedia will become the lowest level of the knowledge food chain above the internet, and there will be two layers above it.

As a practical example what I hope to see is:

  • IEEE decides to build the IEEE encyclopedia. That will fit well into their IEEE Xplore framework anyway and can sell papers and books that way as well. Selective advertising will be allowed, so all publishers can advertise, computer vendors etc. But movie advertising etc. is out.
  • They use the free Wikipedia software - no need for major development. All users are identified by their real names.
  • All IEEE Fellows automatically get Wikipedia:Bureaucrats-like status. These are generally highly decorated, and often retired professors. So governance will be of high standards.
  • Only users with PhDs who are IEEE "senior members" can become administrators if approved by seven fellows.
  • Only IEEE members with PhDs can edit articles. And each article needs the approval of 3 users before it can appear.
  • The Wikipedia node structure (of great value) can act as the starting point, but content will come from the class notes of the members, or Wikipedia after correction - whichever is easier.
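
The role and approval rules in the list above could be encoded roughly as follows. This is only a sketch: the Member fields and function names are hypothetical, and it assumes that the 3 approvers of an article must themselves be eligible editors and that approvers and endorsers must be distinct people, neither of which the list spells out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Member:
    name: str            # real name, as all users are identified
    has_phd: bool = False
    senior: bool = False  # IEEE "senior member"
    fellow: bool = False  # IEEE Fellow (bureaucrat-like status)

def can_edit(member):
    """Only members with PhDs can edit articles."""
    return member.has_phd

def can_publish(approvers, required=3):
    """An article appears only after approval by 3 distinct eligible editors
    (eligibility of approvers is an assumption of this sketch)."""
    eligible = {m.name for m in approvers if can_edit(m)}
    return len(eligible) >= required

def can_become_admin(candidate, endorsers, required=7):
    """A senior member with a PhD becomes an administrator if seven fellows approve."""
    fellows = {m.name for m in endorsers if m.fellow}
    return candidate.has_phd and candidate.senior and len(fellows) >= required

# Hypothetical members for illustration.
editors = [Member(f"Editor {i}", has_phd=True) for i in range(3)]
fellows = [Member(f"Fellow {i}", has_phd=True, senior=True, fellow=True) for i in range(7)]
candidate = Member("Candidate", has_phd=True, senior=True)

print(can_publish(editors), can_become_admin(candidate, fellows))
```

Keeping the thresholds as parameters (required=3, required=7) would let another society, e.g. ACM, reuse the same rules with different numbers.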

Then I think ACM can do their own - which will eventually merge with the IEEE probably. ACM also has fellows, etc. Then AMS, chemical society, etc. may do their own. The Stanford Enc. of philosophy may be upgraded to that, etc. I am certain many computing professors will do this out of the love of the subject.

Then what we get are:

  • About 20 to 30 "trusted encyclopedias" whose contents are reliable.
  • Somewhat uniform software based on WMF code.
  • Somewhat uniform terminology and node structure based on Wikipedia

That is still crowd sourcing, but to an upper class, well informed crowd.

But the key is what comes next: An upper layer "link manager" which knows where to send the user. Google, Wolfram or someone will then do this, or a yet new company. So the user gets "trusted links" that go to one of these, and they can interlink afterwards.

But the basic links can go to Wikipedia: So if the page on Einstein says he was born in Ulm, that page comes from Wikipedia. There is no need for IEEE, ACM or AMS to write a page on Ulm.

I see that as the click ahead. History2007 (talk) 14:54, 13 December 2012 (UTC)[reply]

Thanks - I'm afraid I am not much good at predicting the future either. I gave up when I saw what had happened about Citizendium. (Some people say it wasn't a failure; I think that, at least for the moment, it is a failure). I come back to the point that most academics are driven by status and the need for 'official' rewards. One told me that he liked contributing to Wikipedia (and his work was good, IMO). But his dept head said that editing Wikipedia wouldn't earn him any points. Not like publication in a top 5 journal. It might be different if Wikipedia had a better reputation, but it doesn't. Even then, the essential nature of the wiki is that the author doesn't get attribution, because there is no single author (usually).

That said, there are many examples of academics doing unattributed work, and a better system might work. But I have given up futurology.

Yes, I know Johnbod, indeed I have met him and he lives not far from me (as do a number of high profile Wikipedians). If you ask him, he will probably tell you who I am (but preferably not here on the wiki). Hestiaea (talk) 20:08, 13 December 2012 (UTC)[reply]

In the IEEE model, there will be attribution. The three editors who write the page will be named and will "own the page" - WP:OWN will go out of the window. And that is how users can trust the article, because it says it was written by professor X from university Y. But to keep that professor in check, two others need to be there. A single professor can go off and say all kinds of things. There are a few anon-academics on Wikipedia, e.g. user:CBM. The entire mathematical logic field is due to him, and very high quality. I do not know who he is, and do not need to.
I wonder what John thinks the future holds. But I am not going to ask him. I will wait for your book to come out. I guess you have one buyer already. History2007 (talk) 23:27, 13 December 2012 (UTC)[reply]
I am making a few experiments with a wiki based on that design. I have always liked the wiki software (though many disparage it). It works for referencing, for changing, for changing back and so on. But no one thought about the economics of building a reference work. There does need to be a form of reward, if only the 'negative' reward of not having to argue with idiots. A few people of my acquaintance have been driven off that way.
Yes I know the work of CBM. I wrote some of the mathematical logic articles but that is not an area of real competence, and I would defer to his expertise always. That said, very little of my work has been changed or corrected in 10 years. Hestiaea (talk) 14:00, 14 December 2012 (UTC)[reply]
Interesting. A while ago on the talk page of Reliability of Wikipedia I said that it is a "Wiki paradox" that Wikipedia's articles on introductory topics are often less accurate than Wikipedia's articles on advanced topics - an unusual form of Russian Roulette. But given the higher frequency of views for introductory topics, the resulting information pollution is higher from those. But I have said it before, so I will say it again that if the WMF programmers had been working for me, I would have fired them all by now. The design, implementation and deployment of the User Feedback Tool was an example. Their semi-failure on Pending Changes was another example. That type of incompetence would have never survived in the commercial software development world. History2007 (talk) 14:35, 14 December 2012 (UTC)[reply]

Change of mind

I am sorry, I no longer wish to be quoted in your book or be associated with it. Please accept my apologies and delete my comments from your talk page and do not use them in your book. Thanks. History2007 (talk) 19:03, 14 December 2012 (UTC)[reply]

I think some sort of explanation would be reasonable, actually. Why this sudden about-turn? Hestiaea (talk) 19:13, 14 December 2012 (UTC)[reply]
I think you know the reason. You can type it yourself here if you like. I am tired of being played. I do not like being played. History2007 (talk) 19:14, 14 December 2012 (UTC)[reply]
Yes I'm a banned editor (since August 2009, and after many contributions dating from July 2003). I would have told you earlier if I had your email earlier. Hestiaea (talk) 19:20, 14 December 2012 (UTC)[reply]