This page documents an English Wikipedia editing guideline. It is a generally accepted standard that editors should attempt to follow, though it is best treated with common sense, and occasional exceptions may apply. Any substantive edit to this page should reflect consensus. When in doubt, discuss first on the talk page.
This page in a nutshell: Do not create categories for every single verifiable fact in articles. This only makes the category system more crowded and less useful.
Categorization is a useful tool for grouping articles for ease of navigation and for correlating similar information. However, not every verifiable fact (or the intersection of two or more such facts) in an article requires an associated category. For lengthy articles, this could result in hundreds of categories, most of which aren't particularly relevant. This may also make it more difficult to find any particular category for a specific article. Such overcategorization is also known as "category clutter".
To address these concerns, this page lists types of categories that should generally be avoided. Based on existing guidelines and precedent at Wikipedia:Categories for discussion, such categories, if created, are likely to be deleted.
- 1 Non-defining characteristics
- 2 Small with no potential for growth
- 3 Narrow intersection
- 4 Mostly overlapping categories
- 5 Arbitrary inclusion criterion
- 6 Miscellaneous categories
- 7 Eponymous categories for people
- 8 People associated with
- 9 Unrelated subjects with shared names
- 10 Intersection by location
- 11 Trivial characteristics or intersection
- 12 Subjective inclusion criterion
- 13 Non-notable intersections by ethnicity, religion, or sexual orientation
- 14 Opinion about a question or issue
- 15 Potential candidates and nominees
- 16 Award recipients
- 17 Published list
- 18 Venues by event
- 19 Performers by performance
- 20 Notes
- 21 See also
Non-defining characteristics
- See also: Wikipedia:Categorization of people § Categorize by defining characteristics and Wikipedia:Defining
One of the central goals of the categorization system is to categorize articles by their defining characteristics:
A central concept used in categorising articles is that of the defining characteristics of a subject of the article. A defining characteristic is one that reliable sources commonly and consistently define the subject as having—such as nationality or notable profession (in the case of people), type of location or region (in the case of places), etc.
Categorization by non-defining characteristics should be avoided. It is sometimes difficult to know whether or not a particular characteristic is "defining" for any given topic, and there is no one definition that can apply to all situations. However, the following suggestions or rules-of-thumb may be helpful:
- a defining characteristic is one that reliable, secondary sources commonly and consistently define, in prose, the subject as having. For example: "Subject is an adjective noun ..." or "Subject, an adjective noun, ...". If such examples are common, each of adjective and noun may be deemed to be "defining" for subject.
- if the characteristic would not be appropriate to mention in the lead portion of an article, it is probably not defining;
- if the characteristic falls within any of the forms of overcategorization mentioned on this page, it is probably not defining.
Often, users can become confused between the standards of notability, verifiability, and "definingness". Notability is the test that is used to determine if a topic should have its own article. This test, combined with the test of verifiability, is used to determine if particular information should be included in an article about a topic. Definingness is the test that is used to determine if a category should be created for a particular attribute of a topic. In general, it is much easier to verifiably demonstrate that a particular characteristic is notable than to prove that it is a defining characteristic of the topic. In cases where a particular attribute about a topic is verifiable and notable but not defining, or where doubt exists, creation of a list article is often the preferred alternative.
In disputed cases, the categories for discussion process may be used to determine whether a particular characteristic is defining or not.
Small with no potential for growth
- Note: Possible revisions to this guideline are being discussed at Wikipedia talk:Overcategorization#WP:SMALLCAT. Your input is welcome. RevelationDirect (talk) 19:18, 23 January 2016 (UTC)
Avoid categories that, by their very definition, will never have more than a few members, unless such categories are part of a large overall accepted sub-categorization scheme, such as subdividing songs in Category:Songs by artist or flags in Category:Flags by country.
Note also that this criterion does not preclude all small categories; a category which does have realistic potential for growth, such as a category for holders of a notable political office, may be kept even if only a small number of its articles actually exist at the present time.
Narrow intersection
If an article is in "category A" and "category B", it does not follow that a "category A and B" has to be created for this article. Such intersections tend to be very narrow, and clutter up the page's category list. Even worse, an article in categories A, B and C might be put in four such categories "A and B", "B and C", "A and C" as well as "A, B and C", which clearly isn't helpful.
In general, intersection categories should only be created when both parent categories are very large and similar intersections can be made for related categories.
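The growth described above for categories A, B and C can be checked with a short sketch. The function name `intersection_categories` and the "X and Y" naming format are assumptions made for illustration only; they are not part of any Wikipedia tool or naming convention.

```python
from itertools import combinations


def intersection_categories(categories):
    """Return every possible intersection of two or more of the given categories.

    Hypothetical helper for illustration; names are joined with " and ".
    """
    result = []
    # Consider intersections of 2 categories, then 3, and so on up to all of them.
    for size in range(2, len(categories) + 1):
        for combo in combinations(categories, size):
            result.append(" and ".join(combo))
    return result


# Three parent categories already allow four intersection categories.
print(intersection_categories(["A", "B", "C"]))
# ['A and B', 'A and C', 'B and C', 'A and B and C']

# Ten parent categories allow 2**10 - 10 - 1 = 1013 intersections.
print(len(intersection_categories(list("ABCDEFGHIJ"))))
# 1013
```

For n parent categories the number of possible intersections is 2^n - n - 1, so the potential clutter grows far faster than the number of parent categories.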
Mostly overlapping categories
If two or more categories have a large overlap (e.g. because many athletes participate in multiple all-star games, and religious leadership does not radically change from year to year), it is generally better to merge the subjects to a single category, and create lists to detail the multiple instances.
Arbitrary inclusion criterion
- Examples: School districts at the top 7% on Pennsylvania standardized tests, Locations with incomes over $30,000, Category:100th episodes
There is no particular reason for choosing "7%", "$30,000", or the 100th episode as cutoff points in these cases. Likewise, a district with 3,800 students is not meaningfully different from one with 4,100 students. A better way of representing this kind of information is to put it in an article such as "List of school districts in (region) by size". Note that Wikipedia allows a table to be made sortable by any column.
Categorization by year, decade, century, or other well-defined time period (such as historical era), as a means of subdividing a large category, is an exception to this. When you create a categorization by time period, you should state the inclusion criteria clearly at the top of the category (e.g. "This category is for politicians who were active in the 19th century" is not the same as "This category is for politicians who were born in the 19th century").
Miscellaneous categories
- Examples: People of the Moravian Church miscellaneous, Brass bands of other countries, Uncategorised songs
Do not categorize articles into "miscellaneous", "other", "not otherwise specified" or "remainder" categories. It is not necessary to completely empty every parent category into its subcategories. If there are some articles that don't fit appropriately into any of the standard subcategories, leave the articles in the parent category. The articles categorized together as "other" or "miscellaneous" generally will have little in common and therefore should not be categorized together in a dedicated "miscellaneous" category.
Eponymous categories for people
- See also: Wikipedia:Eponymous categorization, Wikiproject:BLP categorization
- Examples: Tim Halperin, Jena Irene, Clement Meadmore
Eponymous categories named after people should not be created unless enough directly related articles or subcategories exist. Individual works by a person should not be included directly in an eponymous category but should instead be in a (sub)category such as Category:Novels by Agatha Christie. As with all categories, a choice has to be made whether it is a "people" category (containing only biographical articles) or not (containing no biography beyond the main article), so that people categories are kept separate. In practice, most notable people lack enough directly related articles or subcategories to populate an eponymous category effectively, but Category:Barack Obama, Category:John Maynard Keynes and Category:Albert Einstein are some exceptions. Fans of celebrities should be cautious to avoid adding clutter to eponymous categories.
People associated with
- Examples: People associated with John McCain, People associated with Pope Pius XI, People associated with Madonna, People associated with the hippie movement
The problem with vaguely-named categories such as this is determining what degree or nature of "association" is necessary to qualify a person for inclusion in the category. The inclusion criteria for these "associated with X" categories are usually left unstated, which fails WP:OC#SUBJECTIVE; but applying some threshold of association may fail WP:OC#ARBITRARY.
However, it may be appropriate to have categories whose title clearly conveys a specific and defined relationship to another person, such as Category:Obama family or Category:Obama Administration personnel.
Unrelated subjects with shared names
Avoid categorising by a subject's name when it is a non-defining characteristic of the subject, or by characteristics of the name rather than the subject itself.
For example, a category for unrelated people who happen to be named "Jackson" is not useful. However, a category may be useful if the people, objects, or places are directly related—for example, a category grouping subarticles directly related to a specific Jackson family, such as Category:Jackson family (show business).
When confronted with subjects that share a name, a disambiguation page might be a possible solution.
Intersection by location
Geographical boundaries may be useful for dividing subjects into regions that are directly related to the subjects' characteristics (for example, Roman Catholic Bishops of the Diocese of Columbus, Ohio or New Orleans Saints quarterbacks).
In general, avoid subcategorizing subjects by geographical boundary if that boundary does not have any relevant bearing on the subjects' other characteristics. For example, quarterbacks' careers are not defined by the specific state that they once lived in (unless they played for a team within that state).
However, location may be used as a way to split a large category into subcategories. For example, Category:American writers by state.
Trivial characteristics or intersection
- Example: Celebrity Gamers, Red haired kings, Bald People, Famous redheads, Age of death, Mirrors in fiction
Avoid categorizing topics by characteristics that are unrelated or wholly peripheral to the topic's notability.
For biographical articles, it is usual to categorize by such aspects as his or her career, origins, and major accomplishments. In contrast, someone's tastes in food, their favorite holiday destination, or the number of tattoos they have would be considered trivial. Such things may be interesting information to include in an article, but not useful for categorization. If something could be easily left out of a biography, it is likely that it is a trivial characteristic.
Note that this form of overcategorization also applies to grouping people by trivial circumstances of their deaths, such as categorizing people by the age at which they died or the place of death or by whether they still had unreleased or unpublished work at the time of their death. Even though such categories may be interesting to some people, they aren't particularly encyclopedic.
Subjective inclusion criterion
Adjectives which imply a subjective or inherently non-neutral inclusion criterion should not be used in naming/defining a category. Examples include such subjective words as: famous, notable, great, etc.; any reference to size: large, small, tall, short, etc.; or distance: near, far, etc.; or character trait: beautiful, evil, friendly, greedy, honest, intelligent, old, popular, ugly, young, etc.
Non-notable intersections by ethnicity, religion, or sexual orientation
Dedicated group-subject subcategories, such as Category:LGBT writers or Category:African-American musicians, should only be created where that combination is itself recognized as a distinct and unique cultural topic in its own right. If a substantial and encyclopedic head article (not just a list) cannot be written for such a category, then the category should not be created. Please note that this does not mean that the head article must already exist before a category may be created, but that it must at least be reasonable to create one.
Likewise, people should only be categorized by ethnicity or religion if this has significant bearing on their career. For instance, in sports, a Roman Catholic athlete is not treated differently from a Lutheran or Methodist. Similarly, in criminology, a person's actions are more important than their race or sexual orientation. While "LGBT literature" is a specific genre and useful categorisation, "LGBT quantum physics" is not.
Opinion about a question or issue
Avoid categorizing people by their personal opinions, even if a reliable source can be found for the opinions. This includes supporters or critics of an issue, personal preferences (such as liking or disliking green beans), and opinions or allegations about the person by other people (e.g. "alleged criminals"). Please note, however, the distinction between holding an opinion and being an activist, the latter of which may be a defining characteristic (see Category:Activists).
Potential candidates and nominees
- Example: Potential 2008 Republican U.S. Presidential Candidates (deleted in November 2006)
Wikipedia is not a crystal ball. A candidate not yet nominated for public office, the possible next CEO of a certain corporation, a potential member of a sports team, an actor on the "short list" to play a role, or an award nominee (just to name a few examples) should not be grouped by category. Lists may sometimes be appropriate for such groupings, especially after the passage of the events to which they relate.
Award recipients
- Example: Category:MTV Movie Award winners, Category:Honorary citizens of Berlin, Category:People who have received honorary degrees from Harvard University
In general (though there are a few exceptions to this), recipients of an award should be grouped in a list rather than a category when receiving the award is not a defining characteristic.
Published list
- Example: Rolling Stone's 500 Greatest Albums
Magazines and books regularly publish lists of the "top 10" (or some other number) in any particular field. Such lists tend to be subjective and may be somewhat arbitrary. Some particularly well-known and unique lists such as the Billboard charts may constitute exceptions, although creating categories for them may risk violating the publisher's copyright or trademark.
Venues by event
- Example: WrestleMania venues, Republican National Convention venues, Democratic National Convention venues
There is no encyclopedic value in categorizing locations by the events or event types that have been held there, such as arenas that have hosted specific sports events or concerts, convention centers that have hosted specific conventions or meetings, or cities featured in specific television shows that film at multiple locations.
Likewise, avoid categorizing events by their hosting locations. Many notable locations (e.g. Madison Square Garden) have hosted so many sports events and conventions over time that categories listing all such events would not be readable.
However, categories that indicate how a specific facility is regularly used in a specific and notable way for some or all of the year (such as Category:National Basketball Association venues) may sometimes be appropriate.
- See also #Performers by performance venue.
Performers by performance
Avoid categorizing performers by their performances. Examples of "performers" include (but are not limited to) actors/actresses (including pornographic actors), comedians, dancers, models, orators, singers, etc.
This includes categorizing a production by performers' performances. For example, just as we shouldn't categorize a performer by action or appearance, we shouldn't categorize a production by a performer's action or appearance in that production.
Performers by action or appearance
- Examples: Actresses who have appeared veiled, Anal porn actress, Musicians who play left-handed, Saxophonists who are capable of circular breathing
Avoid categorising performers by some action they may have performed (such as a "pirouette", a "runway walk", a "spit take", a "pratfall", a "sword fight", "anal sex", etc.); some method of performance (such as while standing on their head, left-handed, etc.); or how they may have chosen to appear (such as bald, veiled, etc.)
Performers by role or composition
- Performers who have portrayed <character name>
- Performers who have portrayed <a type of character>
- Performers who have performed <a specific work>
- Examples: Fictional characters by actor and subcategories, American dramatic actors, Actors that portrayed heroes or villains, Jim Steinman artists, Actors & Actresses who portrayed, Actors who have played serial killers, Actors who have played gay characters, Actors who played HIV-positive characters, Actors who have played the President of the United States, and Category:Actors who have played Doctor Who.
Avoid categories which categorise performers by their portrayal of a role. This includes portraying a specific character (such as Darth Vader, or Hamlet). This also includes voicing animated characters (such as Donald Duck), or doing "impressions"; portraying a "type" of character (such as wealthy, poor, religious, homeless, gay, female, politician, Scottish, dead, etc.); or performing a specific work (such as Amazing Grace, "Waltz of the swans" from Swan Lake, "To be or not to be" from Hamlet (the play), "Why did the chicken cross the road?" (a joke), etc.).
Similarly, avoid categorizing artists based on producers, film directors or other artists they have worked with (such as "George Martin musicians" or "Steven Spielberg actors"). Performers are defined by their body of work, not by the people they have associated with professionally. For example, Tom Hanks is distinguished by his performances as an actor, not by the fact that he has appeared in Steven Spielberg's films.
Performers by series or performance venue
- Performers who have performed at <location>
- Performers who have performed on <production>
- Examples: Artists who played Coachella, Saturday Night Live musical guests, Ozzfest performers, Celebrity Poker Showdown players, Entertainers who performed for troops during the Vietnam War, and Actors by series
Avoid categorising performers by an appearance at an event or other performance venue. This also includes categorization by performance—even for permanent or recurring roles—in any specific radio, television, film, or theatrical production (such as The Jack Benny Program, M*A*S*H, Star Wars, or Phantom of the Opera).
Note also that performers should not be categorized into a general category which groups topics about a particular performance venue or production (e.g. Category:Star Trek), when the specific performance category would be deleted (e.g. Category:Star Trek script writers).
- See also #Venues by event.
Notes
- in prose, as opposed to a tabular or list form
See also
- Wikipedia:Categories, lists, and navigation templates
- Wikipedia:Categorization FAQ
- Wikipedia:Categorization of people
- Wikipedia:Overcategorization/User categories
- Wikipedia:Naming conventions (categories)
- Sortable tables
- Wikipedia:Category intersection – one of several open feature requests which seek to be an alternative way to address overcategorization.
- Wikipedia:Overcategorization/Intersection of location and occupation, an essay
- Wikipedia:What Wikipedia is not § Non-encyclopedic cross-categorizations