User:Sj/klog

From Wikipedia, the free encyclopedia
Templated klogging undergoing testing

Welcome to my wikilog, or klog. This can eventually be moved to User:Sj/essays (and merged with the current notes on knowledge-gathering there, each given its own page).

I decided to tackle two great failures -- not publishing frequently, and not sharing my research and summaries of Wikipedia topics with others -- with this daily log. I plan to use this as the original published source for my writing about all things Wikipedia, with echoes on other 'logs and in other WP publications (notably the Wikimedia Quarto).

This began with templates, reverted to simple sections (below), and later was replaced with improved templates that help track updates and commentary.

Wiki spirit evolution

19:43, 16 November 2024 (UTC) (decadence, fog of time)

Sitting at EMNLP, presenting a method to support large-scale review of billions of generated articles, to a politely interested but fairly unmotivated audience who see themselves as third-party researchers.

One talk includes distinguished scholars of collaborative production, and of knowledge-production in particular. As they motivated their research into how to use the strengths of AI tools to detect misinformation generated by other AI tools, I started listing to myself the misinformation and confused ideas that these researchers held about WP and its history.

  • Miss: WP eventually made something comparable to traditional encyclopedias, and has now replaced them.
  • Reality: WP created something vastly better in a staggering number of ways: size, scope (both breadth and depth, on average), breadth of participation, flexibility, multilinguality, update speed, cost of maintenance, transparency, auditability, citation density, image quality, knowledge structure, format variety, inclusivity, and attention to systemic bias.
  • Miss: WP has maintained its neutrality and robustness to misinformation campaigns, often better than large social media platforms such as Twitter and Facebook, despite having many fewer employees working on the site.
  • Reality: WP has done this with better auditability, transparency, and scalability; and has almost no people working on this aspect of the projects. So it's more like 1000x to 10,000x fewer employees working on that. It is also continuously identifying, discussing, and publishing campaigns and the responses to them, and allows anyone to join in addressing these issues in their own community, unlike other major platforms.
  • Miss: Wikipedia is, or should develop into, a trusted reference; let's hope it gets there.
  • Reality: Wikipedia's strength is being a frontier space for incomplete and imperfect knowledge, one that can be part of an ecosystem with many non-wiki projects designed to be uniformly trustworthy.

Many of these myths are ones that it seems increasingly hard for WMF staff to counter, or even to recognize as fallacies, since their image of their own jobs, and of why formal structures in the movement exist, is somewhat at odds with the robust and generation-defining success of Wikipedia as a collaborative + grassroots cultural phenomenon.

Wikipedia evolution

04:01, 23 September 2005 (UTC) (Features, software, change)

There are a few dozen major interface and other changes that need to take place for Wikipedia to keep pace with its growing popularity and the scores of new uses to which its software and community are put every month. I'm working on feature descriptions for as many of these changes as I can think of. They all need pruning and refining by brighter minds and keyboards. Don't tread lightly just because these are my user pages; have at it :-) Nothing warms the heart more than seeing a personal page attacked or improved by someone else who considers its subject worthwhile.


Subject regulars

04:15, 23 September 2005 (UTC) (community, content)

Jimbo (among others) likes to say that the community is really a core of a few hundred very active editors who hold everything together, despite the tens of thousands of users who have 'edited more than 10 times'. As justification, he notes that an immense proportion of edits are made by the 1% most active users. I like to counter that by saying those users are the ones who figure out where and how to implement bots, what kinds of edits can be carried out by the thousand, &c. On the other hand, you can quickly verify that over a quarter of all new articles are created by IPs... and another 20% by red-userpage users.

For those of you thinking "well, new users often create pages which are then deleted or merged": the creation rate is very close to the aggregate increase in article count. Roughly 2500 articles are created per day, 50 times the deletion rate.

That said, there is a huge role played by subject experts, addicts, and other 'subject regulars' in the maintenance of quality and the development of ever-improving standards in Wikipedia. Many of these are not particularly active Wikipedians, statistically; though they make the bulk of edits to a handful of pages that matter to them. I just ran across a lovely example...

There will always be people in the "general public" (i.e. outside the core group of us who have been working on the hurricane stuff all season) who get reactionary and make mountains out of molehills. -- The Great Zo (from Talk:Hurricane_Rita#Timing_on_articles)

The Great Zo has indeed been working on hurricane stuff since July; with a few hundred edits on those and other subjects. You might never encounter Andy outside of that work, not even in other areas of meteorology, but even without editing them he helps set the tone, style, and informal policy for high-profile pages such as Hurricane Katrina and Hurricane Rita.

Coming up in Boston : Meetup, Conferences

05:01, 23 September 2005 (UTC) (community, Boston, museums, Jimbo)

There will be a fantastic Wikipedia meetup and ice-cream social Monday evening at Toscanini's Ice Cream near Central Square (between Central and Kendall, along Mass Ave). Someone from the education division at the Boston Museum of Science will be there; along with a passel of happy Wikipedians and, if we are lucky, Jimbo (who will be in town).

There are also a couple of cool conferences coming up this week. First, MIT's Emerging Technology Conference next Wednesday and Thursday; Jimbo will be speaking on a panel at the end of day 2. I have a press pass to the conference, and will be careful to use it for good... to interview some of my heroes, including Stewart Brand.

Have a favorite speaker you want to know more about? A particular conference session I should be sure to transcribe? An angle you'd like me to cover? Let me know! Please give me a little advance notice so I can prepare properly.

Second, a local KM Cluster conference, held out in Waltham. I didn't realize how wrapped up I would get in E-Tech, and was hoping to make it out for some of the wiki conversations there. However, it's the non-wiki morning presentations that look more interesting. I think that a lack of wheels and hours in the day will conspire to keep me at MIT.

David Weinberger will be giving a keynote there that I wish I could go listen to.

Quarto standard

06:03, 23 September 2005 (UTC)

Quartos have been behind schedule since the first one last Fall; yet they have held to a standard format, comprehension, multilingual emphasis and time-period. Currently people are writing about events spanning both the second and third quarters of this year; rather than separating those periods for the benefit of both the last delayed Quarto (#4, for the second quarter of 2005) and the upcoming Quarto (#5, for the third quarter of the year). We should perhaps change the numbering schema so that the time-separations are clearer. And, of course, I need to start publishing early and often.

At any rate, WQ4 will include existing content about that 3-month period (the retrospective, older interview material as available); and WQ5 will cover this third quarter, July-September, including Wikimania (the centerpiece) and the rapid post-August chapter, foundation, and fundraiser developments.

One further standard that needs to be raised is that for announcing and distributing Quarto content. We have an excellent partial-pdf of WQ3; and a mailing-list of eager subscribers to new releases. Rough attempts at a multilingual announcement need reinstatement; as do mail announcements, mailed full-text copies, and mailed pdfs, however hasty.

Adminship and trust

+sj + 08:01, 25 September 2005 (UTC)

Adminship should remain no big deal, even as the community scales. We seem to be doing a fair job of swelling the ranks at present, with just over 1 successful nom a day. There are a few outstanding people who might be nudged a bit in that direction -- Bogdangiusca, Ec, Ellywa, Llywrch, Zigger -- and many more new enthusiasts.

This is doubly true for new projects, which may perhaps need fixed-term adminships until the community is large enough to evaluate one another properly.

Trust metrics

As for trust, we need at least one trust metric -- and preferably a whole spectrum of them -- which third party efforts, reader/editor plugins, and other patches can refer to while deciding how to display the database's bounty of information.

As a simple example, I should be able to say "Print out some information about Russian revolution" and have my "wiki-extraction-to-pdf"* extension retrieve a reliable version of the article and pipe it to my local printer. That article had minor vandalism on it for 2 days out of the last week, until it was noticed; not unusual for popular vandalism targets. When you're reading a printed page, minor vandalism and errors are far more annoying.

What "reliable" means here should be a settable preference; the default should be something better than "most recent version" -- perhaps "most recent version edited by someone in Group A" or "last version flagged 'reliable' by a member of Group B".

Trusted groups, like Groups A and B in this example, should draw on a spectrum of trust metrics available to all users of the project data. Such metadata should be freely available, like the text itself.
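
To make the "settable preference" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Revision` record, the group contents, and the function names are made up for illustration and are not MediaWiki's data model or API. It assumes a page history is available oldest-first and that each revision records its editor and who, if anyone, flagged it as reliable.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Sequence


@dataclass(frozen=True)
class Revision:
    """Hypothetical revision record, not MediaWiki's actual schema."""
    rev_id: int
    editor: str
    flagged_reliable_by: frozenset = frozenset()


# A policy is any yes/no test on a single revision.
Policy = Callable[[Revision], bool]


def edited_by_group(group: Iterable[str]) -> Policy:
    """'Most recent version edited by someone in Group A'."""
    members = frozenset(group)
    return lambda rev: rev.editor in members


def flagged_by_group(group: Iterable[str]) -> Policy:
    """'Last version flagged reliable by a member of Group B'."""
    members = frozenset(group)
    return lambda rev: bool(rev.flagged_reliable_by & members)


def pick_reliable(history: Sequence[Revision], policy: Policy,
                  fallback_to_latest: bool = True) -> Optional[Revision]:
    """Return the newest revision satisfying the policy.

    `history` is assumed to be ordered oldest first. If nothing matches,
    fall back to the latest revision (or None when fallback is disabled).
    """
    for rev in reversed(history):
        if policy(rev):
            return rev
    if fallback_to_latest and history:
        return history[-1]
    return None


# Made-up history: the newest edit comes from an anonymous IP.
history = [
    Revision(1, "alice"),
    Revision(2, "bob", frozenset({"carol"})),
    Revision(3, "203.0.113.7"),
]
GROUP_A = {"alice"}  # hypothetical trusted editors
GROUP_B = {"carol"}  # hypothetical trusted reviewers

by_editor = pick_reliable(history, edited_by_group(GROUP_A))
by_flag = pick_reliable(history, flagged_by_group(GROUP_B))
```

A print extension could take the policy as a user preference and hand whichever revision `pick_reliable` returns to the printer. Because a policy is just a function, third parties could publish their own without any change to the selection logic.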

Meatball reflections

16:10, 25 September 2005 (UTC)

One of the earliest wikilogs was MeatballWiki, at one point the source of a predominant part of English-language wiki philosophy. It remains an active site, although much of its community has been drawn into the massive reference-focused free-content wikis that have developed since then. I love the outsider's view of a growing Wikipedia that its community provides; WikiPediaIsNotTypical is a great example, both spanning the entire history of the project and touching on the kinds of highlights that are forgotten within the Wikipedia community...


On classification systems

18:09, 26 September 2005 (UTC)

Borges numbers, and his ancient Chinese system of categorizing animals...

Early template trials

In other news, templating is making progress. A four-template system currently seems to work alright (day/links/cal/'today' -- the last template used via preview, as MediaWiki doesn't support "eval"-style syntax for system variables).

One major question is: what is the right way to support revisiting an old subject? Clearly blogs do this the wrong way; new posts for new additions, separated in time and space, are inferior to versioning. Blog posts are really mid-level metadata about what the author is thinking, and many of these meta-tidbits could point to the same elaboration[s]. Just as wikis separate user-pages, discussions, and content from one another, wikilogs should separate point-like ideas and reflections from monographs, case-studies, debates and treatises.

The current templates give too much weight to date and category tags, and too little to more important (if harder to measure) metadata about context, application, and usefulness of content.

More template success

20:02, 12 October 2005 (UTC)

More text is ending up in the templates; though a straight addition like this one, with no template worries, is far easier for quick writing. The best solution probably remains feeding in new text this way; then reproducing it in granular, templated chunks; then creating custom views of those chunks for various audiences and [re]purposes.

Quick local notes

20:07, 12 October 2005 (UTC)

From local Wikimania interest, a collection of ideas, suggestions, and comments.