Monday, February 19, 2007

Anyone may edit

Written in response to this project.

Somebody asked me today what happens to a dictionary when anybody can edit it. As anybody who has ever edited a wiki knows, the openness is a mixed blessing.

It is a great thing, because many hands make light work. Dictionaries need to be every bit as large as the languages they catalog, so the process of gathering and maintaining the data is a huge one. As we start to add translations between languages, rather than simply defining a term, that task becomes orders of magnitude bigger. To capture all words in all languages is something that will take nothing less than a wiki and a worldwide community. It is a monumental task, but in a wiki, we can conceive of creating a resource on such a scale.

It is a great thing because so many regions and cultures can be represented. An American may understand most of the English spoken in South Africa or New Zealand, but both of those regions have slang all their own. Spanish is spoken far differently in Chile than in Spain. All those variants can have their place.

It is a great thing because a wiki can evolve with a language. New terms come into use all the time, and a freely editable electronic resource is not limited in its capacity to store data or to accommodate a large, diverse set of editors.

The big trouble is this: if just anybody can edit, how on earth do we know it is right? I'd like to explore a few approaches here. Of course, any of these approaches could be considered a barrier to entry, but these things are always trade-offs.
  1. Appoint trusted users to do the housekeeping. These are the sysops, administrators, bureaucrats, the librarians, or the janitors, depending on your point of view. These somebodies keep watch and undo the damage that some of the just-anybodies can do. If somebody writes an article containing typical vandalism, such as "asdfasdf" or "Dave is a dork!", an administrator can delete or undo it. Much vandalism is so predictable that even a bot can detect and remove it. Unfortunately, a select group of administrators, however well trusted or well-read, cannot be everywhere at once, and they cannot know everything. Things get missed, even with a checklist system such as patrolled edits. Misinformation, intentional or otherwise, is not so easy to spot as out-and-out nonsense.
  2. Hold people accountable. Articles have histories, so you can see who did what. Even pseudonymous users develop reputations. Anonymous users tend to attract the most scrutiny. An active, healthy wiki often develops into a meritocracy, with leaders having sway (though not necessarily authority) based on reputation, seniority, and trust in the community. This effect generally works to improve content, but even a well-known, trusted user may make mistakes. If he or she is trusted well enough, there is a risk that an error or oversight may go unnoticed.
  3. Allow anybody and everybody to scrutinize and correct or flag the content. The process is not foolproof, especially in larger projects, but wikis have a remarkable capacity for self-cleaning. Of course, this approach can result in a sort of groupthink effect: if enough people believe it, then it must be so.
  4. Demand credentials. Don't just let in any old riffraff. Wikipedia has clearly shown the power of amateurs and volunteers to create great content, but it is certainly possible to limit the users in a project, or part of the project, to a certain group. This approach is most appropriate to a wiki serving a closed community, such as a professional or academic group, especially one dedicated to a particularly narrow or specialized topic.
  5. Make the messes behind the scenes, and publish only the good stuff, with some review process. The German Wikipedia published a paper book containing selected articles. Online, there have been proposals for a "Stable Versions" system, where a mature article would be reviewed and locked, and any additional changes would go through a separate editing or discussion page.
  6. Demand references. There is a movement within Wikipedia to reference the articles and the claims made in them. In the context of a dictionary, references may be other dictionaries. Is the word recognized by the RAE or the OED (whom we trust to have done the requisite homework)? References may also be other works about words. Or, they may be citations. Citations are quotations including the word in question. They show context and provide evidence that the word is or was in use. Of course, we must still question the validity of the evidence. Are the 400 Google hits because somebody prolific uses that nonsense word as a handle? Is a word more valid if it was used by a blogger or two, or by Thornton Wilder? Is an etymology known with reasonable certainty or is it apocryphal? Depending on the size and resources of the wiki, efforts to verify and reference articles may be systematic, or they may be requested when a given entry or fact is questioned.
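
The worry in item 6 about 400 hits coming from one prolific user suggests counting independent sources rather than raw hits. Here is a rough sketch of that idea in Python. Everything in it is an assumption made for illustration: the (word, author) pairs, the function names, and the threshold of three sources are all hypothetical, and no real wiki is claimed to work this way.

```python
from collections import defaultdict


def independent_sources(citations):
    """Count the distinct authors who used each word.

    `citations` is an iterable of (word, author) pairs. Both fields are
    hypothetical stand-ins for whatever a real wiki would record.
    """
    authors_by_word = defaultdict(set)
    for word, author in citations:
        # Compare case-insensitively so "Wiki" and "wiki" are one entry.
        authors_by_word[word.lower()].add(author.lower())
    return {word: len(authors) for word, authors in authors_by_word.items()}


def is_attested(citations, word, min_sources=3):
    """True if `word` was used by at least `min_sources` distinct authors.

    The default of 3 is an arbitrary, assumed threshold.
    """
    return independent_sources(citations).get(word.lower(), 0) >= min_sources


# 400 hits from one prolific user still count as a single source.
hits = [("zorkle", "zorkle_fan")] * 400 + [
    ("wiki", "alice"),
    ("wiki", "bob"),
    ("Wiki", "carol"),
]
print(is_attested(hits, "zorkle"))  # False
print(is_attested(hits, "wiki"))    # True
```

A count like this says nothing about the quality of each source, so the blogger-versus-Thornton-Wilder question still needs a human editor.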
A wiki is simply a website where anybody can post. With a bit of care and attention, its content can be as valid and accurate as any other reference, and certainly more complete and up-to-date.
