Wikipedia:Overcategorization: Difference between revisions
Revision as of 16:26, 21 December 2006
This page documents English Wikipedia practice on overcategorization (shortcuts: WP:OC, WP:OCAT). Editors should generally follow it, though exceptions may apply. Substantive edits to this page should reflect consensus. When in doubt, discuss first on the talk page.
This page's designation as a policy or guideline is disputed or under discussion. Please see the relevant talk page discussion for further information.
Categorization is a useful tool for finding and correlating articles. However, sometimes we tend to overcategorize; the more categories an article is in, the less meaningful all of them become. Hence, based on existing guidelines and WP:CFD precedent, this page lists types of categories that should be avoided. If created, categories of these types are very likely to be deleted.
General cases
- Non-defining or trivial characteristics
- Examples: People who own cats, Motorcycle riders, Cities with a McDonald's restaurant
- We should categorize by what is actually important in a person's life, such as their career, origin and major accomplishments. In contrast, someone's tastes in food, their favorite holiday destination, or the number of tattoos they have are trivial. Such information may be interesting to put in the article, but is not useful for categorization. If you could easily leave something out of a biography, it is not a defining characteristic. This applies equally to articles about subjects other than people, such as the cities example above.
- Subjective benchmark
- Examples: Tall people, Notable architecture, Famous songs
- Adjectives which imply a subjective benchmark should not be used in naming/defining a category. Examples include any reference to size (large, small, tall, short, etc) or distance (near, far, etc), or such subjective words as: famous, notable, evil, honest, great, beautiful, ugly, young, old, etc.
- Arbitrary inclusion limit
- Examples: People over six feet tall, Villages with more than 10,000 inhabitants, Disasters with more than 5,000 casualties
- There is no particular reason for choosing "six", "10,000", or "5,000" as cutoff points in these three cases. A village with 9,800 people is not meaningfully different from one with 10,100 people. A better way of representing this kind of information is to put it in an article such as "List of villages in (locality) by size". Note that our software currently allows a table to be made sortable by any column. The obvious exception is categorizing by year, since making a category for each year is not arbitrary.
- Arbitrary geographical grouping
- Examples: Roman Catholic Bishops from Ohio, Quarterbacks from Louisiana, Male models from Dallas, Texas
- Avoid subcategorizing items by geographical boundary if that boundary does not have any relevant bearing on the items' other characteristics. For example, quarterbacks' careers are not defined by the specific state that they once lived in (unless they played for a team within that state). However, geographical boundaries are useful for dividing items into regions that are directly related to the items' characteristics (for example, Roman Catholic Bishops of the Diocese of Columbus, Ohio or New Orleans Saints quarterbacks).
- Arbitrary intersection
- Example: Bakers who won an Emmy award
- Avoid intersections of two unrelated traits, even if some person can be found who has both. For instance, Emmy awards are not awarded for baking skill.
- Overly narrow categories
- Example: Italian composers born in 1850, Fictional Black African-American DC animated Superheroes with the power to manipulate electricity
- If an article is in "category A" and "category B", it does not follow that a "category A and B" has to be created for this article. Such intersections tend to be very narrow, and clutter up the page's category list. Even worse, an article in categories A, B and C might be put in four categories "A and B", "B and C", "A and C" as well as "A, B and C", which clearly isn't helpful.
- In general, intersection categories should only be created when both parent categories are very large and similar intersections can be made for related categories (e.g. "Spanish composers born in 1870").
- No potential for growth
- Example: Moons of Earth, Albums by some band that has disbanded and released only two albums
- Just because most bands (etc) have a category doesn't mean all of them need one. Avoid categories that will never have more than two or three members.
- Subcategories with large overlaps
- Example: 1962 New York Yankees team roster, Members of United States 92nd Congress
- Categories like these are likely to add several near-identical categories to many articles, for instance when most of the 1962 team was also on the 1963 team. It is better to handle these with a single category, and create lists that detail the multiple instances.
- Opinion on an issue
- Example: People who like ice cream, Politicians who favor legalizing drugs
- As above, holding an opinion is not a defining characteristic, and should not be a criterion for categorization, even if a reliable source can be found for the opinion.
- Race, religion and sexual preference
- Example: Christian ice skaters, African-American chemists, Homosexual physicists
- As above, people should only be categorized by race, religion or sexual orientation if this has significant bearing on their career. For instance, in sports, Christian ice skaters are not treated differently from Jewish or Muslim ice skaters. Similarly, in chemistry, a person's actions are more important than their ethnicity (and there are no separate Nobel Prizes for different ethnicities). While "LGBT literature" is a specific genre and a useful categorization, "LGBT quantum physics" is not.
- Award winners and nominees
- Example: Michigan Red Herring Award nominees
- In general, the winners of all but the most internationally well-known awards should be put in a list rather than a category. It may nevertheless be useful to note the awards in the article. If an award doesn't have an article, it certainly doesn't need a category (and not every award that has an article needs a category). (The sole exception to this is the Academy Awards, which is currently under debate.) Nominees should not be categorized.
- Inclusion in a published list
- Example: Top 40 songs
- Magazines and books regularly publish lists of the "top 10" (or some other number) in any particular field. Such lists tend to be subjective and somewhat arbitrary, and as such don't make for meaningful categorization. Additionally, since there are many such lists, creating categories for all of them would add needless clutter to all relevant pages. Some particularly well-known and unique lists such as the Forbes 400 may constitute exceptions, although creating categories for them risks violating the publisher's copyright.
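To make the arithmetic behind the "Overly narrow categories" case concrete, here is a minimal Python sketch that lists every intersection category that could in principle be created from a set of base categories. The function name `intersection_categories` and the category labels are hypothetical and chosen only for illustration; nothing here reflects how the wiki software itself works.

```python
from itertools import combinations


def intersection_categories(base_categories):
    """Return every possible intersection of two or more base categories."""
    names = sorted(base_categories)
    result = []
    # Intersections need at least two parents, up to all of them at once.
    for size in range(2, len(names) + 1):
        for combo in combinations(names, size):
            result.append(" and ".join(combo))
    return result


if __name__ == "__main__":
    # Three base categories give the four intersections named in the text.
    print(intersection_categories(["A", "B", "C"]))
    # The count for n base categories is 2**n - n - 1.
    for n in range(2, 7):
        print(n, "base categories ->", 2**n - n - 1, "possible intersections")
```

With three base categories the sketch yields the same four intersections described above; with five it yields 26, which illustrates how quickly an article's category list could become cluttered if every intersection were created.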
Specific cases
- Actors by film, or films by actor
- Example: Starship Troopers actors, or Films starring John Travolta
- Since most films have several dozen otherwise-unrelated actors, and most actors play in several dozen otherwise-unrelated films, these categories would add unnecessary clutter to all pages on actors and films. This information is better represented by including a list of actors in the film article, and vice versa.
- In contrast, categorizing films by director is generally useful, because most films have only a single director, and directors tend to have their own styles.
- Fictional characters by team or association
- Examples: Excalibur members, Simpsons villains
- Most short-running stories don't have enough characters to make subcategorizing by team meaningful. Most long-running stories tend to have characters leaving teams, joining teams and switching teams a lot (which means a list is better for explaining the matter), or have "sides" that aren't black-and-white but have large gray areas (which means that the categorization is not objectively defined).
- Guest stars by show
- Example: People who appeared on the Muppet Show
- Several shows are based upon having a new set of guest stars for every episode. As such these stars really don't have all that much in common, and don't make for a meaningful categorization. Once again, a list is a more comprehensive way of showing this.