Talk:Disk storage

From Wikipedia, the free encyclopedia
WikiProject Computing / Hardware (Rated Start-class, High-importance)
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Start-Class on the project's quality scale.
This article has been rated as High-importance on the project's importance scale.
This article is supported by Computer hardware task force (marked as Top-importance).

Merge With Hard Disk Drive

While there is arguably a super-article, Disk Storage, of which hard disk drives could be a subordinate article, this article is almost entirely about HDDs. For example, the recently added section Disk_storage#Standard_Disk_Sizes applies only to HDDs. Accordingly, I propose we merge all HDD material from this article into Hard_Disk_Drive and, if nothing is left, delete this article (or leave a stub). Tom94022 (talk) 17:38, 18 January 2010 (UTC)

  • Your point about Disk Storage potentially being a super article (or an overview article) is well taken, as there are, for example, Zip, Jaz, REV, SyQuest, Floptical, various Floppy disc sizes and formats (from the original IBM 8 inch floppy, through the super floppies), and various other magnetic and optical disc drives. As well as audio records. Even flash drives, although they are not technically disc drives, in that they don't have rotating media, generally appear to the OS as a removable disc drive (using USB MSC drivers) and are mostly used as very large capacity "floppy discs". As 90+ percent of the article only relates to HDD content, and as I'm in favor of not having two articles covering essentially the same ground, merging the HDD content is very reasonable and I support it. Providing this article is left as a stub for expansion. I will add the summary paragraphs (with pointers to the main articles) once the merging is complete. — Becksguy (talk) 00:50, 19 January 2010 (UTC)
I disagree with the proposal for a merge and delete. Disk storage, as Becksguy points out, is so much more than just computer drives - I think that we will see the phasing out of memory sticks being recognised as disk drives, certainly my mobile phone is recognised as a phone and a mini pic of a phone is displayed in most progs when it is connected even though windows explorer still has it as a drive lol
I noticed that large block going in on HDD and was thinking about moving it into the HDD article along with most of the section on HDD. I suggest that once that is done the rest of the document should be expanded to include the sections (optical disk, floppy disk, etc.) with summaries and links to their main articles. It has been on my list of things to do since July but due to other things taking precedence I have only returned to editing properly in the last month.
I will keep an eye and move this to the top of my list for work to be carried out later today.
Chaosdruid (talk) 20:15, 19 January 2010 (UTC)
I have hidden the sections which are on HDD, and not general info, until we decide what to do with them.
Maybe I missed something, but I can find no hidden sections Tom94022 (talk) 03:54, 20 January 2010 (UTC)
  • Sorry, the hidden sections didn't show up in my change list, probably a buffering issue. Tom94022 (talk) 17:41, 20 January 2010 (UTC)
Chaos commented out some content, as done here. — Becksguy (talk) 10:13, 20 January 2010 (UTC)
I have also started to include more about CDs etc. and will do more later or tomorrow.
Chaosdruid (talk) 21:51, 19 January 2010 (UTC)
Most of the things you are proposing to add have their own articles. That is why this article can be greatly reduced in size. Both Becksguy (talk) and I support a stub. At this point, do we have agreement amongst the three of us? Tom94022 (talk) 03:54, 20 January 2010 (UTC)
  • I have been working offline on expanding the stub after merger of the HDD content. I placed my work done so far into a subpage User:Becksguy/Disk storage for any potential communal use. — Becksguy (talk) 10:06, 20 January 2010 (UTC)
As I noted in my recent edit, disk storage is not a subset of computer storage - there is a whole consumer side that needs to be covered. To that point, your categorization in the stub into removable and non-removable is not IMO particularly helpful. Tom94022 (talk) 17:41, 20 January 2010 (UTC)
The removable and non-removable categorization, if completed, would create some strange pairings, RAMAC and 1301 for example in Non Removable and most of the rest of the early IBM drives in removable. I suggest in the end the categorizations just be Hard disk drive, Floppy Disk Drives, and Optical Disk Drives and let the many flavors be linked off those pages. Otherwise this is going to become a very long list. That further suggests that we just create a category, and put the category on each relevant page. Tom94022 (talk) 17:50, 20 January 2010 (UTC)
Good point on categorization. My draft was just a first quick attempt to get something down. And yes, it applies only to computer related disk storage, as that is a subset of disk storage generally. DEC RP04 & RP06 disc packs (similar to the IBM 3330 DASD) were also removable, but they were the system drive and therefore don't really fit today's understanding of removable in the sense of being secondary storage. — Becksguy (talk) 18:38, 20 January 2010 (UTC)
  • The section Disk_storage#Access_methods is horrible, very HDD oriented and incomplete at that. I intend to do major surgery there unless someone objects. Tom94022 (talk) 17:41, 20 January 2010 (UTC)
Agree. Access methods even apply to phonograph records. Might that section be an initial good candidate for merging the specific HDD related stuff? — Becksguy (talk) 18:53, 20 January 2010 (UTC)
The idea is like all articles in this vein: each section has a brief description and a link to the main article on that topic.
At the moment this article is heading towards "Computer Disk Drives" not "Disk storage". Chaosdruid (talk) 02:48, 21 January 2010 (UTC)
Disagree. This article should be referenced in the disk storage article alongside "floppy disks" and "CDs".

Snake (talk) 13:09, 15 April 2010 (CST) —Preceding unsigned comment added by 207.157.94.254 (talk)

I believe consensus has been reached that the merge should not occur. Removing request links. Mamyles (talk) 01:59, 24 April 2010 (UTC)
Actually the consensus seems to be to merge and it is well on its way, but removing the tag won't stop the merge. Tom94022 (talk) 04:25, 24 April 2010 (UTC)

consensus

This is going too far now. Disk storage is NOT just about computers...Tom you really need to consult and gain consensus before going so far demolishing an article Chaosdruid (talk) 02:47, 21 January 2010 (UTC)

I'm not sure what you are objecting to, please be more specific or better yet add to the article more material on disc storage for other than computer usage.
For example, disk storage is not about drums so I took it out.
For example, the section on Access Mechanisms is mainly about accessing computer disk storage and is incomplete/inaccurate at that. To your point it does not cover accessing audio or video in their players and probably shouldn't. So when it is fixed it is likely to be very small or gone.
  • That is, the portion on file systems is both inaccurate and incomplete and should be replaced with a single sentence linking to File systems, something like "Data on computer storage disks are organized into File systems which are the basis for access by the computing system." Tom94022 (talk) 18:18, 22 January 2010 (UTC)
For example, Disk storage is about audio (CD Audio) and video (DVD video, BR) storage and it is somewhat covered, some of which I added.
If you are objecting to my focus on digital then I suppose you can add or expand the material on phonographs and their records and the early analog video disks and players. As I suspected when we started this, getting rid of the HDD specific material will not leave much in the article. Tom94022 (talk) 18:20, 21 January 2010 (UTC)

Nuke Sections 2 thru 6 ?

The more I look at this article, the more I conclude we should nuke Sections 2 thru 6 and replace them with a single Section, Types of disk storage. The section in turn would list the major categories by which disc storage is subdivided, perhaps a short description and then a link to the appropriate article#section. At this point I can think of the following categories:

2. Types of disk storage {some generic words about the categorization of disk storage below}
Category Types
Medium: Rigid Magnetic, Flexible Magnetic or Rigid Optical are the primary disk storage media.
Access method: File system for computer storage, or ??? for audio and video.
Rotation: Typically Constant angular velocity for magnetic disk storage and Constant linear velocity for optical storage.
Size: Early disk storage devices, e.g. Early IBM disk storage, were separate cabinets but over time the sizes have evolved into standard Form Factors.
Mechanism: Early magnetic disk storage devices used Linear actuators; however, starting in 1975 magnetic storage devices began using rotary actuators[1], which are universal today. In a rotary actuator the heads move across the disk in an arc that approximates a radius. Optical disk storage devices use both Linear actuators and rotary actuators.
etc. TBD

I don't think this list has to be exhaustive, just those key categories that significantly segment the market. For example, I did not include a category for Recording Method which would go into the distinction in disk storage between Analog recording and Digital recording because it seems like TMI. Note that in most cases I was able to link to an article so I didn't have to say much. In the case of "rotary actuator" where there is no article I put in a brief description. The idea is to keep this article short since so much of it is elsewhere. Comments?

Tom94022 (talk) 21:02, 23 January 2010 (UTC)

Sorry guys I have had conjunctivitis for a while - can only just see the computer screen for the first time in 5 days - should be back available in the next few days. Chaosdruid (talk) 17:57, 25 January 2010 (UTC)

Look out for possible copyright violations in this article

This article has been found to be edited by students of the Wikipedia:India Education Program project as part of their (still ongoing) course-work. Unfortunately, many of the edits in this program so far have been identified as plain copy-jobs from books and online resources and therefore had to be reverted. See the India Education Program talk page for details. In order to maintain the WP standards and policies, let's all have a careful eye on this and other related articles to ensure that no material violating copyrights remains in here. --Matthiaspaul (talk) 12:57, 31 October 2011 (UTC)

fixed-head disk

Information about fixed-head disk storage has apparently been removed from this disk storage article.

Is there some other article about fixed-head disks? Such fixed-head devices are alluded to in many Wikipedia articles -- paging, IBM System/360, 1ESS switch, OS/8, D-37C, Burroughs B2500, Autonetics Recomp II, etc. -- with a variety of names -- "fixed-head disk", "head-per-track disk drive", "head-per-track disk system", and probably other names as well.

In my opinion, it is misleading for "disk storage" in the above articles to link to this article -- which talks only about moving-head disk storage -- when those articles are actually referring to fixed-head disk storage.

Is there even a small section of some other article that discusses fixed-head disk storage, so I could point those links at something less misleading?

(Fixed-head storage has long been obsolete, but WP:RECENTISM and WP:DEFUNCTS imply that obsolescence alone is no reason not to have an article -- hence our articles on LaserDisc, IBM 305 RAMAC, disk pack, phonograph cylinder, etc.). --DavidCary (talk) 02:04, 24 January 2013 (UTC)

FWIW, Fixed-head disk drives are a subset of hard disk drives and IMO should be addressed as a section in the Hard disk drive article or perhaps in a separate article, but are not sufficiently notable to be added to this article. Tom94022 (talk) 17:46, 25 January 2013 (UTC)
Added a paragraph on head-per-track disk drives to the Hard disk drive article. That should be sufficient. I leave it to others to fix the links. Tom94022 (talk) 21:12, 26 January 2013 (UTC)

Is data are?

Some recent edits and reverts have been over whether we say "data are" or "data is", that is, whether "data" is the plural of "datum" (therefore "the data are", similar to "the trees are") or an uncountable noun (therefore "the data is", similar to "the forest is").

From what I've seen, those more aware of the original Latin root, such as some of those in hard sciences, may use "data are", but the majority, like the computer and storage industry, never use "datum" and say "data is".

Wikt:Data states This word is more often used as an uncountable noun....

Our own Data article starts with Data (/ˈdeɪtə/ DAY-tə, /ˈdætə/ DA-tə, or /ˈdɑːtə/ DAH-tə)[1] is... and in non-specialist, everyday writing, "data" is most commonly used in the singular, as a mass noun (like "information", "sand" or "rain").[5]. (This has a ref, though hardly the most authoritative one.)

My !vote:

  • data is, based on the above (admittedly less than perfect) sources, and my personal experience in computing/storage industry. --A D Monroe III(talk) 19:33, 15 October 2017 (UTC)
I don't think this is a subject for a vote. Either usage is acceptable in American English and I believe somewhere in the MOS (or some other guidance) there is a statement that unless there is a good reason otherwise (as in quotes) an article should be consistent in its usage and a consistent article should not thereafter be arbitrarily changed from one form to the other. So far the edits have been for consistency as the article is now all "data are." Tom94022 (talk) 06:52, 16 October 2017 (UTC)
What MOS? MOS:ENGVAR? Is "data are" British-English, and this article in British-English? The only "variety" of English here is the American storage industry, which is "data is". Per ENGVAR, even if somehow an editor made the article Brexit all American-English, it must be converted British-English because of strong ties to its specific subject. --A D Monroe III(talk) 14:55, 16 October 2017 (UTC)
  1. This MOS: "While Wikipedia does not prefer any national variety of English, within a given article the conventions of one particular variety should be followed consistently."
  2. And this MOS: "When an English variety's consistent usage has been established in an article, maintain it in the absence of consensus to the contrary."
  3. The disk storage industry is worldwide not just American so strong ties does not apply.
  4. "Data are" is acceptable in American English
Tom94022 (talk) 23:39, 16 October 2017 (UTC)
I don't get this. What is the point here? Why search through the MOS concerning "national varieties of English" for justification of something that isn't a national variety? Much less do it for something that does not improve the article? The purpose of this MOS is to avoid having any particular wording repeatedly jar the reader away from the prose -- to allow them to focus on the information, not the editor. That is the sole reason behind why we don't use "BC" dates in an article about Judaism, or say "defense" in an article on the British military, or flip-flop between "U.S." and "US" in the same article. Here, from the reader's perspective, how is using "data are" better? Laypersons, industry professionals, and sources use "data is"; many do not even know that "data are" can be (or used to be) technically "correct". Fifty years ago, okay, but computers have long since moved away from any "datum". The smallest unit of data is a byte; below that, it's all just signals. "Datum" doesn't exist in disks. Us saying "data are" is at best pointless, and at worst disruptive to our readers.
And seriously, MOS:TIES doesn't apply because disks are everywhere? Much like US GMC Trucks? Would calling them "lorries" in their articles improve anything?
The reader (and sources) should prevail, not editor preferences or rules. --A D Monroe III(talk) 17:13, 18 October 2017 (UTC)
This is an old and frequently debated issue, for which rules one and two above are what you should "get" here. The article is and was consistently written with "data are"; it was not necessary to search through the rules until you searched for and applied the one guideline which might justify a change, but somehow you didn't find the two most applicable guidelines, which say no change. The two reverts were for partial changes which are clearly inappropriate. Can we stop this dialog now? Tom94022 (talk) 00:19, 19 October 2017 (UTC)
The two rules claimed are in MOS for "national variations" -- this isn't one, so they do not apply here. Even if they did, they would be overridden by MOS:TIES, so are against "data are". We can't cherry-pick rules.
I'm not the one looking for any rules justifying "data are". There aren't any. Sources matter, and are against "data are".
What I am looking for is any actual reason for "data are" (besides WP:ILIKEIT). To repeat the question: how does "data are" improve the article for the reader? (That whole purpose of WP thing.) --A D Monroe III(talk) 20:03, 19 October 2017 (UTC)
"Data are" is an acceptable U.S. national variant even though "data is" is more common, so rules 1 and 2 do apply. In Wikipedia there are 20k "data are" and 40k "data is" which clearly establishes both usages. The answer to your question is neither "data is" nor "data are" will improve the article; someone will find either bothersome, which is why we have rules 1 and 2 to keep from wasting time. Tom94022 (talk) 06:09, 20 October 2017 (UTC)
Actually I think you are guilty of WP:IDONTLIKEIT - I'm actually agnostic, and I only edit articles to be consistent with, as best I can tell, the editorial intent. Sometimes intent is clear, but sometimes it's just a matter of counting instances of the two variants and going with the majority. Again I would point out the article now has 4 instances of "data are" and no "data is". My last edit was to revert an IP who changed one of the four to "data is." It's been that way for at least two years. Surely you can't support mixed usages in an article? Tom94022 (talk) 06:20, 20 October 2017 (UTC)
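As a rough illustration of the counting approach described above (tallying each form and going with the majority), here is a minimal Python sketch. The function names and the sample text are hypothetical and not taken from the article. The simple pattern also matches phrases such as "pieces of data are", so a real tally would still need a manual check.

```python
import re
from collections import Counter


def count_data_verbs(text: str) -> Counter:
    """Tally whole-word occurrences of 'data is' and 'data are', ignoring case."""
    counts = Counter({"data is": 0, "data are": 0})
    for match in re.finditer(r"\bdata\s+(is|are)\b", text, flags=re.IGNORECASE):
        counts[f"data {match.group(1).lower()}"] += 1
    return counts


def majority_form(text: str):
    """Return the more frequent form, or None when the counts are tied."""
    counts = count_data_verbs(text)
    if counts["data is"] == counts["data are"]:
        return None
    return max(counts, key=counts.get)


# Hypothetical sample text, used only to show the calls.
sample = "The data are stored on tracks. Data are read by the head. The data is cached."
print(count_data_verbs(sample))
print(majority_form(sample))
```

A tie returns None on purpose, since a count alone cannot settle which form to use in that case.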

────────────────────

Okay, so it's agreed that "data are" is not an improvement over "data is". It took a while to get to that.

Additionally, we also agree that both "data are" and "data is" are valid English; no one's suggested otherwise. We also agree that we should be consistent within the article; no one's suggested otherwise. It's kind of pointless to keep repeating these.

So far, so good.

Before I go on, I have a minor point of procedure about the "two rules": MOS:ARTCON and MOS:RETAIN. To help this discussion, I'm reducing this to just RETAIN, since what ARTCON states is an agreed given, and doesn't affect this issue either way; as just noted, no one's trying to mix the datas up, regardless of repeating ARTCON or not.

Now, something we don't fully agree on: applying RETAIN here. To me, it seems quite a stretch, since it's under the section "National varieties of English", and "data are" isn't a national variety -- American, British, and other English nations (AFAIK), use both. Just as someone favoring "BC" over "BCE" shouldn't use ENGVAR as justification, this issue shouldn't either.

But, I actually don't think it much matters if it applies or not, because if RETAIN applies, so would MOS:TIES -- in the same MOS:ENGVAR section, right above it. Since the rules aren't rules, I'm skeptical of spending a lot of focus on this only to have it later dismissed as irrelevant WP:GAMING. What I want is to get beyond personal opinions and get to actually providing valid WP evidence -- sources. If this helps get there, okay.

So, before going further, do we agree on using both RETAIN and TIES (or neither)? --A D Monroe III(talk) 19:47, 21 October 2017 (UTC)

I agree that RETAIN applies but TIES does not. The Disk Storage industry is world-wide with leadership in the optical segment in Japan and manufacturing mainly on the Pacific Rim. Tom94022 (talk) 21:46, 21 October 2017 (UTC)
I guess I should have worded that better. I meant "applies" as in "has relevance in this discussion". That is, showing that either RETAIN or TIES has been followed or not has bearing, in this case, for this article. Based on the fact that the above reply addresses TIES' determination details, and not its general applicability to any discussion involving ENGVAR, should I take it TIES does have relevance, just like RETAIN? (Sorry for being picky, but I'm trying to maintain some definite progress here, rather than wandering in circles.) --A D Monroe III(talk) 16:13, 22 October 2017 (UTC)
I should have read more carefully, to be specific:
  • ARTCON applies and in accordance with its teachings this article should not be changed to "data is"
  • RETAIN applies and in accordance with its teachings this article should not be changed to "data is"
  • TIES applies and in accordance with its teachings this article should not be changed to "data is" since the Disk Storage industry is world-wide with leadership in the magnetic segment in the US and Pacific Rim, leadership in the optical segment in Japan, manufacturing mainly on the Pacific Rim and research world wide. Accordingly there is no strong tie to any English variant.
Tom94022 (talk) 18:06, 22 October 2017 (UTC)
Well, I must take this as a "yes" to the question of having TIES be as relevant to this discussion as RETAIN, since they're being mentioned together. Okay.
(As previously stated, I'm going to ignore ARTCON; it changes nothing either way. Seriously, it's just a pointless distraction to keep bringing it up.)
So, therefore, on to discussing TIES, which, based on what's agreed, should be able to resolve this discussion:
Replying to the above statement on TIES, this is the English WP; what others do in other countries in other languages has zero bearing here. (Their sources can be referenced for other purposes, but not their language, especially not for a discussion on English language use.) --A D Monroe III(talk) 21:14, 23 October 2017 (UTC)
BTW, "data is" is not an improvement over "data are," either one will have proponents and opponents.
BTW, the "recent edits" you refer to are justified by MOSCON so it is relevant.
This is getting boring - you have not identified any strong tie to American English; the one reference cited in the article uses "data are" exclusively. The English WP covers all local variants of English, including but not limited to Canada, Australia, New Zealand, UK, Ireland, Singapore and others too numerous to mention. Disc storage is discussed in all these locations. Even if you could establish a strong tie to American English, both variants are used so that RETAIN requires the article be left as is. Tom94022 (talk) 06:45, 24 October 2017 (UTC)
(I assume MOSCON means ARTCON; if instead this is some other rule, please link. If it's just ARTCON again, I'm ignoring as a straw man to the discussion of "data is" vs "data are". The "recent edits" I mentioned once at the start was as the re-trigger of my long-standing thoughts on this, which existed before and regardless of them. I have not said, and will never say, that anyone should change just some but not all "data x" to "data y".)
It's true I have not yet identified the TIES to this subject and "data is". I wouldn't bother to spend effort to do so unless we agree that it will help reach consensus. So far, what I've done is lay the foundation for that, by rebutting claims that RETAIN somehow trumps TIES rather than the other way around, and rebutting several specific claims that there are no TIES at all.
The last such claim is unfortunate, because it goes back to saying that TIES is only about American, that is, only "national", variations. As I stated repeatedly, if TIES applies to only "national varieties", then so does all of ENGVAR (per its name), including RETAIN. One cannot say one of these two rules under ENGVAR applies to "data is/are" but the other does not. I'm okay either way, but I thought we agreed on both applying, not neither. Are we going back on this? Is that helpful? Please, which is it? --A D Monroe III(talk) 16:35, 24 October 2017 (UTC)

───────────────────────── At this point there is no consensus and I am not wasting my time responding to you. Tom94022 (talk) 05:41, 25 October 2017 (UTC)

Unilaterally declaring the discussion over just means refusing to discuss, which appears as giving up after one's points have been countered. --A D Monroe III(talk) 16:30, 25 October 2017 (UTC)
At this point there is no consensus and I am not wasting my time responding to your questions. If you have anything new to say I might consider it. Tom94022 (talk) 05:41, 25 October 2017 (UTC)
I haven't even gotten to my reasons for "data is"; so far, I've spent my time refuting the existence of any reasons for "data are", and then, whenever I thought I'd done that, carefully stating our agreements -- focusing on progress instead of backtracking circles around "rules". As I've hinted, reasons for "data is" are based on reader experience and sources. Any interest in that? --A D Monroe III(talk) 13:59, 26 October 2017 (UTC)
No. Tom94022 (talk) 20:40, 26 October 2017 (UTC)
Seriously? Actually, unequivocally, stating having no interest in the reader or sources, just cherry-picking the "rules". Okay then. --A D Monroe III(talk) 14:23, 27 October 2017 (UTC)

RfC on "data are" or "data is"

30 days have passed. Waiting for an uninvolved closer to evaluate the result and write up a summary --Guy Macon (talk) 19:11, 3 December 2017 (UTC)

The rough consensus seems to be for "data is" by default with context-specific exceptions if sane/appropriate. The argument seems to be predominantly reader accessibility/preference, and while some have cited MOS:STYLERET as rationale, the spirit of MOS:STYLERET is to prevent community-wide edit wars (specifically over dates; see also the corresponding citation which itself points to date contexts). Local, consensus-backed style decisions in an article aren't prohibited by STYLERET; it's quite the opposite: when an article decides on a style that's otherwise ambiguous, that style is used at that article. Whether that happens with the existing text without discussion (i.e., the assumed case when nobody's raised the question locally) or via explicit local discussion (i.e., like is happening here)... that is the case-by-case/internal-consistency basis implied by STYLERET (at least, given my interpretation).

Incidentally, I'd suggest in the future not archiving a discussion until someone comes by to close it; 30 days is a rough guideline for RfCs, but the real "expiration time" is when the discussion dies down and someone actually agrees by closing it.

--slakrtalk / 18:26, 27 December 2017 (UTC)
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

In this article, should "data" be used as a count noun (the plural of "datum") or mass noun (like "information")? This affects its grammatical use in many sentences; examples: "the data are stored" vs. "the data is stored", and "SSDs hold fewer data" vs. "SSDs hold less data".

Note that this RfC is not for all WP articles, but for this Disk storage article (though any consensus here might be applied to some other closely related articles, such as Hard disk drive, Data storage, or Computer, following discussion in those articles). --A D Monroe III(talk) 16:37, 27 October 2017 (UTC)

Survey

Data are or Data is?

  • Data is. As nominator, per sources for this subject and field, per reader expectations and their ease of reading, and per #MOS:STYLERET in Threaded Discussion. --A D Monroe III(talk) 16:37, 27 October 2017 (UTC)
  • Data are. This appears to be the convention in the article (at present). In arbitrary grammatical matters like this, I would say that clarity is most important, then after that convention. Is the article presently unclear in some way because "data are"? Not that I can see, so why change it? Attic Salt (talk) 17:30, 27 October 2017 (UTC)
  • Data are is acceptable and should not be changed in this article per #MOS:STYLERET in Threaded Discussion; it is a matter of style. Tom94022 (talk) 18:07, 27 October 2017 (UTC)
  • Data is. WP is a general purpose resource and the more common general usage is a) quite adequate b) endorsed by Oxford English Dictionaries. There are many instances within the article where minor rephrasing would 'side-step' the problem. Pincrete (talk) 17:01, 28 October 2017 (UTC)
  • Avoid the problem, by using words such as "information" when the collective meaning is intended. I don't actually see any cases in this article where such a change is needed, but in the example "SSDs hold fewer data" I'd change to "SSDs hold less information" perhaps. When talking about transferring bytes, blocks, sectors, etc., the plural data is good. Dicklyon (talk) 16:46, 29 October 2017 (UTC)
  • Data is. "Data", as used in the article is a mass noun. There's no justifiable reason for treating it as countable here. Data is to bytes and bits as money is to dollars and cents (or pick your favorite currency). There are certainly circumstances where it makes some sense to treat it as plural ("The questionnaire asks for several data about the subject, such as name, age, birth place..."), but this isn't one of them. It would make no sense to say that a disk holds 27 data unless, say, the data on the disk is organized into 27 pieces of information, each one of which is one datum. Also, changing to "information" to avoid the issue is a cop-out and artificially avoids a good word. --Deacon Vorbis (talk) 17:40, 29 October 2017 (UTC)
  • Data is - I don't think we could reasonably say "data are" here. It just sounds odd, because it isn't something that is really countable. For example, you wouldn't know what I am saying if I said 100 kilos of data are. That's because "data" isn't a valid unit. Thus, it should be treated as a mass noun. RileyBugz会話投稿記録 20:13, 29 October 2017 (UTC)
  • Decide on a case-by-case basis according to context. "All of that data is now stored on the hard disk" and "These three specific pieces of data are now stored on the hard disk" are both correct. --Guy Macon (talk) 22:08, 29 October 2017 (UTC)
    Sure, but with "pieces of data", it's "pieces" that's plural, not "data". --Deacon Vorbis (talk) 22:12, 29 October 2017 (UTC)
    That's what "Decide on a case-by-case basis according to context" means. I gave grammatically correct examples of "data is" and "data are". This proves that no "always use 'data is' " or "always use 'data are' " rule is needed or appropriate. --Guy Macon (talk) 21:49, 6 November 2017 (UTC)
    No, you gave an example of "pieces of data are", not "data are"; that's why I said that in your example, that "pieces" is what's plural. You can't really have pieces of a countable noun anyway (not that I can think of at least). Read the rest of the discussion and see how it's being used in the article. No one is saying that anyone should ever use "pieces of data is". And for the most part, I don't think people are advocating for always using "data is"; I'm sure not. I even said that there are times when "data are" is appropriate – just not for how it's being used in this article, which is what's under discussion here. (Also, see MOS:LISTGAP; the way I had the indenting was correct already.) --Deacon Vorbis (talk) 22:51, 6 November 2017 (UTC)
    You may very well not be advocating for always using "data is" or always using "data are", but the question at the top of this RfC is, and if one of those two choices gets enough !votes (including yours) that will be what will be enforced as being the consensus of this RfC. -Guy Macon (talk) 15:24, 7 November 2017 (UTC)
    Let's move this to #Clarification on grammar vs. data is/are !votes under Threaded discussion below. --A D Monroe III(talk) 00:29, 8 November 2017 (UTC)
  • It depends on context. To use clearer examples, "The project's data is backed up off-site" (mass noun – data treated as something dumped into a container); "the project's data were subject to peer review" (count noun – each datum was examined). Dicklyon is also correct in that it can simply be written around, anyway. The average case in this article's text appears to call for "data is", because it's talking about data in the abstract, as virtual piles of stuff being stored and moved around on volumes, and is not about any specific datum or set thereof.  — SMcCandlish ¢ >ʌⱷ҅ʌ<  01:57, 30 October 2017 (UTC)
The context here is all under Data storage, so would be "data is" throughout based on this reasoning. --A D Monroe III(talk) 22:13, 2 November 2017 (UTC)
  • Data is is correct. Summoned by bot. If it seems awkward to anybody, and using another word doesn’t make it more awkward, then that seems like a reasonable compromise. TimTempleton (talk) (cont) 15:56, 30 October 2017 (UTC)
  • Both - depending on context. Usually 'data is' seems likely, but a 'data are' usage can also be correct. Markbassett (talk) 19:38, 2 November 2017 (UTC)
Sigh. I think the only 100% agreement we had was to not mix them. --A D Monroe III(talk) 22:08, 2 November 2017 (UTC)
  • Data is While both can be grammatically correct, "data is" is the more common usage of the two. While WP:COMMONNAME doesn't apply here (it only applies to article titles), I think we should apply the general concept here: use the most commonly recognized phrasing. Also, using both phrasings in the same article can lead to confusion, especially for those who aren't as computer-literate. Even those who are computer-literate may be confused by us using both or even just "data are" due to the latter's rarer usage. Gestrid (talk) 14:56, 6 November 2017 (UTC)
  • By !voting for "data is" instead of "choose according to context", you are proposing a rule that would call for changing the sentence "These three specific pieces of data are now stored on the hard disk" to "These three specific pieces of data is now stored on the hard disk". Is that what you intended to do? --Guy Macon (talk) 21:56, 6 November 2017 (UTC)
Incorrect. "Pieces of data" is a plural noun phrase, like "slices of apple" or "piles of sand". "Context" like this would still follow standard English grammar, of course. --A D Monroe III(talk) 15:05, 7 November 2017 (UTC)
On what basis do you make that claim? It certainly isn't in the question asked at the top of this RfC, which makes zero mention of context. If you make a rule, people will enforce the rule as written. --Guy Macon (talk) 15:24, 7 November 2017 (UTC)
Sigh. In this article, should "data" be used as a count noun (the plural of "datum") or mass noun (like "information")? I'm sorry if my is/are simplification just for !voting somehow overrides the RfC statement as written. --A D Monroe III(talk) 23:38, 7 November 2017 (UTC)
Let's move this to #Clarification on grammar vs. data is/are !votes under Threaded discussion below. --A D Monroe III(talk) 00:29, 8 November 2017 (UTC)
  • Data is This is common and quite adequate because I see no compelling need ever to use "Data are". If one needs to refer to a plurality of data, then be descriptive about it as in "Data bytes flow here", "Data bits are read sequentially", etc. or "The three data streams enter the device." Blooteuth (talk) 14:01, 17 November 2017 (UTC)
  • Data are. In evaluating the Wiki guideline, please keep in mind that data is also jars, even for readers who don't know Latin. Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:09, 20 November 2017 (UTC)
    Any basis for that? I've seen no "data are" preference outside of those focused on Latin, which is outside this subject. --A D Monroe III(talk) 21:02, 20 November 2017 (UTC)
  • Data is It boils down to ILIKEIT per what I'm used to, but decades of examining computer manuals and books means "data are" looks pretentious. Data is stored on disk or data is transferred over a cable, just like water is (not are) transferred through a pipe. Johnuniq (talk) 21:25, 20 November 2017 (UTC)
  • Data is as a simple !vote jcc (tea and biscuits) 12:03, 3 December 2017 (UTC)

Threaded discussion

First, it's agreed that both "data are" and "data is" are currently used in English, so neither is "incorrect" (similar to "disc" vs. "disk"). Also note that this article currently uses "data are".

It can be noted that "data is" is the more favored use in modern dictionaries and in general English, but it's more important to note that these two uses are far from equally distributed.

The etymological origin of "data" is Latin, where it was the plural of the Latin "datum". This tends to be maintained in fields where knowledge of Latin is more common, like "hard" science and research where publishing in scientific journals is the goal. Here, using the word "datum" occurs, and "data" tends to be maintained as countable and the plural of "datum".

But outside of this, English use is otherwise. In newer "engineering" fields, where the only end-target is products for common consumers, "data is" dominates: industry sources, consumer articles, and user manuals. Here, "data" is a mass noun, used just like "information", and "datum" is virtually absent.

Thus the common reader for this subject will be used to, and expecting, "data is". When reading the article and coming across "data are", this can cause the reader to stop and wonder why our WP editor would choose that; is there some implied significance to this they've missed? It's an interruption, just like coming across "colour" in an article about an American film, or "BC" in an article on Judaism, or "phone it in" within an article about Shakespeare. Unusual wording just distracts the reader, with no benefit whatsoever. Since "data are" and "data is" are unequally correct English, we should simply go with the one more appropriate for the reader. In a few articles this choice may be difficult -- not here. --A D Monroe III(talk) 16:37, 27 October 2017 (UTC)

It's not actually similar to "disc" vs. "disk", which in this context have different referents (disc = optical and magneto-optical media). It's not okay to write "hard disc" and "compact disk" here.  — SMcCandlish ¢ >ʌⱷ҅ʌ<  01:59, 30 October 2017 (UTC)
Ah, but it was similar -- exactly my point. Seagate Technology, especially, used to insist that its magnetic media HDD products were "hard disc drives", and sources often did the same. Back then, WP could use either "disc" or "disk" for the same subjects. But English use evolved, and use of disc diminished to only optical, and uniformly disk for magnetic, and WP followed suit. Thus similar to "data are/is" use here: terms used to be equal, but now splitting to be area-specific. --A D Monroe III(talk) 16:02, 30 October 2017 (UTC)
Data are is acceptable and should not be changed in this article. A D Monroe III has presented no substantial reasoning that "data are" is "unequally correct English" nor that "data is" "dominates" for encyclopedic works. Arguably this is a question of style in which case we should retain the existing style. My recollection is that this is an old issue which as best I can recall has been resolved by consistent usage in any one article of either style, that is, editors choice. Note also that there are 14,631 instances of "data are" in the English Wikipedia articles so this discussion is more suitable to MOS talk than this one article's talk. I do not understand Monroe's reasoning that this change only applies to this and "closely related articles" - it should be taken to MOS talk. Tom94022 (talk) 18:02, 27 October 2017 (UTC)
I said "dictionaries", not "encyclopedic works".
So, maybe vaguely something was said about this before? Languages and WP:Consensus can change. Similar to how "disc" changed to "disk" a way back.
Yes, there's a lot of "data are" in WP, and a lot of "data is". So, again, neither is "incorrect".
As clearly stated, this RfC isn't WP-wide; why say we should try to do that at MOS? It would be wrong if all of WP was changed to only one of either "data are" or "data is" uniformly across all articles. Each article should be judged based on its own sources and readers. Articles on scientific subjects, where use of "datum" is prevalent, should probably adopt "data are". In fact, this RfC may help clear the way for doing that in those articles, where they currently have "data is" even though contrary to the subject's sources, and it's disruptive to its readers. --A D Monroe III(talk) 22:52, 27 October 2017 (UTC)
Oxford Eng Dict, says unequivocally that it is a mass noun. Longman says mass noun in general speech, but less so in academic and formal Eng. I don't see need or benefit in this article for academic or scientific-exactness. Pincrete (talk) 16:57, 28 October 2017 (UTC)
Perhaps the Oxford English Dictionary is equivocal:
A handy summary
Data can take either a singular or plural verb in standard English, but be consistent within a piece of writing, always check the style policy of your organization, and make yourself familiar with the grammatical debate that exists around them.

Oxford Dictionaries Blog

This article is consistent, there is no explicit policy and the author of this RfC doesn't want to take it to the MOS talk where a policy could be developed. It would seem the Oxford blog suggests leaving the article alone. Tom94022 (talk) 19:02, 28 October 2017 (UTC)
The same OED blog continues "The word data in English usage has evolved: a mass noun use, recorded in the OED from the 18th century, has become increasingly common over the past 70 years, particularly in computing and general contexts." I think as both uses are recognised as valid, the case needs to be made for deviating from common use in a "computing or general context". Pincrete (talk) 19:45, 2 November 2017 (UTC)
This article is inconsistent per that very blog summary. This subject is under the subject called "Data storage" (mass noun), not "Datum storage" (count noun, where singular is used for count noun collections in English). There is no debate on "data is" in WP. There's not even a debate in the industry, because it's overwhelmingly "data is". I do not favor any sort of one-rule-fits-all in MOS for this. Instead, for the readers' sake, I want articles with subjects that use data as a count noun to use it as a count noun, and articles with subjects that use data as a mass noun to use it as a mass noun. --A D Monroe III(talk) 15:46, 29 October 2017 (UTC)
BTW, For completeness, I've asked on MOS talk if MOS should decide this and by what means. We can see if anyone there is interested. --A D Monroe III(talk) 16:00, 29 October 2017 (UTC)
The blog at OxfordDictionaries.com isn't a particularly reliable source, but even if it were, what Oxford U. Pr. advises as a style "rule" has nothing to do with what WP does, unless MoS adopts the same rule. And MoS would never adopt that one, because it's "context-stupid" and defies real-world usage. It's no different from ignoring the fact that "fish" and "fishes" are both valid plurals in different contexts ("the aquarium has 7 fish" refers to 7 animals; "... has 7 fishes" refers to 7 species and maybe way more than 7 animals), and insisting on always using one or other ("the aquarium has 7 fish, meaning species, and 247 fish, meaning individual animals", or worse yet, "I ate two fishes for dinner; both were tilapia").  — SMcCandlish ¢ >ʌⱷ҅ʌ<  02:05, 30 October 2017 (UTC)
Ah now I understand The Feeding of the 5000! Pincrete (talk) 19:39, 2 November 2017 (UTC)

There's something that may be of interest here: a user essay called the "Yogurt Principle". It covers the long sad story of the 8 year ugly battle to move Yoghurt to Yogurt, which ended only when the move finally succeeded. The "principle" it proposes is to suggest asking the question, "if a proposed change were done, would anyone later bother to try and change it back"? It could be asked here: if this article was changed to "data is", would there be any more drive-by edits to change it back to "data are"? I really doubt it. Changing can end the current drive-by edits (see article edit history), and all further discussion. The only arguments for "data are" here are based on "let's just keep it the way it is". If changed, those same arguments would then join the consensus for "data is", making it overwhelming and permanent. --A D Monroe III(talk) 20:32, 29 October 2017 (UTC)

That's B2C's way of saying that he's OK with tyranny of the majority. I'm not so sure it applies here, as there are lots of editors who will generally try to fix treatment of data as singular. Dicklyon (talk) 21:45, 29 October 2017 (UTC)
Not quite. It's not saying "51% is good enough", it's suggesting that for !votes that are only "just leave it be", with absolutely no preference for X or Y, that if the article was changed from X to Y, those !votes would then become votes in favor of Y. Their !vote lacking any preference may be more "let's spend as little time on this as possible", which might best be served by making the change to the most stable way and be done with it for good. That would have to be very carefully judged by the closer, of course.
On "lots of editors" for "data are", the evidence seems strongly against that. In the Data storage WP subjects, articles that use "data is" get little or no drive-by edits changing to "data are", while ones that use "data are" (like this one) get repeated drive-bys for "data is"; apparently it seems obviously wrong to them. Now, I don't have counts of any of these, searching change logs for specific changes is tedious, but this is what my experience tells me.

(I have moved the following from MOS Talk as more appropriate here. --A D Monroe III(talk) 01:49, 11 November 2017 (UTC))

With counting of "data is" vs "data are" being presented as evidence for some WP preference, I should point out that's not evidence for any preference of data as a plural of datum vs. data as a mass noun. In phrases like "different types of data are used", the subject is "types", not "data", so "are" is used regardless of data being a mass noun. There are many other indicators of count vs. mass, such as "fewer data" vs. "less data". English being what it is, the pertinent variations on this are virtually endless. A better indicator (though still not perfect) would be counting the occurrence of "data" vs. "datum" in articles; in general, any article that uses "data" as a plural is at least somewhat likely to use its singular form as well, while articles using it as a mass noun are very unlikely to use "datum". --A D Monroe III(talk) 15:03, 10 November 2017 (UTC)

Search shows 190 instances of "of data are" in WP articles so Monroe's example above really doesn't illustrate much - regardless of the possible miscounting or mischaracterization by my somewhat crude search the fact is that there are thousands of ambiguous usages of these terms in WP articles which makes it a matter of style covered by MOS:STYLERET MOS:RETAIN which clearly states:
Retain existing styles
... editors should not change an article from one styling to another without "substantial reason"

WP Manual of style

The author of this RfC has stated neither "data is" nor "data are" is incorrect but he prefers "data is" "per reader expectations and their ease of reading." which is his POV about style that he is entitled to but is not supported by the actual usage in WP. Personally, my POV is that although I use "data are", this is a writing style issue and therefore we must apply the MOS guidance to retain existing styles. The reasoning hereinabove is POV, none of which amounts to a substantial reason to modify this article given the thousands of articles in WP using "data are".
BTW there are ample WP:RS reliable sources that this is a style question and other style guides do address it. Also see Oxford dictionary blog quoted above. So unless someone comes up with a substantial reason this article should remain unchanged. Tom94022 (talk) 20:36, 11 November 2017 (UTC)
Since this thread has been switched from counting "data are" in other articles (with one last attempt at cooked counting) to MOS:RETAIN, I'll continue with that. See MOS:RETAIN #MOS:STYLERET below.

Application of MOS:ENGVAR

Nothing to do with ENGVAR

While MOS:ENGVAR clearly applies to inter-language variations of English language we can consider its teachings in the context of this intra-American English language variation. Specifically:

  • MOS:ARTCON suggests the article should be consistent; the recent changes have been in accordance with this teaching
  • MOS:RETAIN suggests retaining the current language, "data are," since there is no valid reason for a change
  • MOS:STRONGNAT suggests retaining the current language, "data are" since it is not a valid reason for a change. There are no strong national ties to either variant in American English.

Monroe agrees that both variants are used in American English thereby admitting there are no "strong national ties" so that if we apply the teachings of MOS:ENGVAR we will not change the article. Tom94022 (talk) 19:49, 27 October 2017 (UTC)

MOS:ENGVAR, as fully titled, is WP:National varieties of English. It's a stretch to say this issue is all about a "national" variety; the variation here is not between nations, but (where the variance is significant) between some different fields of science and industry subjects, regardless and independent of whatever English-speaking nations are involved. But, even if ENGVAR were stretched to apply here, STRONGNAT (AKA MOS:TIES) under that same ENGVAR, would therefore trump RETAIN due to the strong ties of "data is" with this subject. If it's agreed that ENGVAR applies, then its TIES section would mean we change this article to "data is".
I can see no reason to bring up ARTCON here except as a straw-man. Of course no one would want a mix of both "data are" and "data is" in this (or any other) article.
(BTW, "clearly" is a weasel word. Using it can undermine the argument one may have.)
--A D Monroe III(talk) 22:52, 27 October 2017 (UTC)
I never said this issue is all about a national variety. This feels like a strawman argument but perhaps u didn't understand my phrase "this intra-American English language variation" - intra as within. All I am suggesting is that the policies regarding variants across languages might teach us about how to deal with variations within languages. You assert but have yet to demonstrate strong ties to the article by "data is" in any English language. A simple search will show substantial usage by both variants in any material related to Disk Storage so the teachings of MOS:TIES are that in the absence of strong ties "data are" should continue to be used. BTW, if "clearly" is a weasel word then yr use of it in the RfC discussion above doesn't give yr argument much weight. Tom94022 (talk) 07:07, 28 October 2017 (UTC)
But if this issue is not a national variety, then ENGVAR does not apply, per its own name and definition. Calling it "intra-American" specifically disqualifies it as ENGVAR. Yes, we should be aware of ENGVAR here, but repeatedly stating selected technical details of ENGVAR as the only argument supporting "data are" is weak, and won't be binding anyway, since editors can later dismiss any decision based on this as WP:GAMING.
As for TIES, there are only ties to "data is" in this subject. It begins with the name of subject -- not just its WP:COMMONNAME but its only name. The very term "Data storage" uses data as a mass noun; were it used as a count noun, it would be "Datum storage". In English, for countable nouns, the singular form is used for collections: Rock collection, Flower box, Seed storage, not Rocks collection, Flowers box, Seeds storage. "Data storage" plus "data is" occurs more than 10 times more often on the web than "Data storage" plus "data are". It goes higher than that if you restrict to reliable sources for Data storage. Using "data are" in this subject is just quirky and anachronistic. Insisting on following a very selected interpretation of ENGVAR while ignoring sources, and the reader, goes directly against WP's last pillar: Wikipedia has no firm rules and its specific wording The principles and spirit matter more than literal wording. --A D Monroe III(talk) 15:09, 29 October 2017 (UTC)
(Yes, the name of this article is "Disk storage", but it's under the wider subject "Data storage". Sorry for being sloppy in use between the two, but the logic still applies.) --A D Monroe III(talk) 15:49, 29 October 2017 (UTC)

WP:ENGVAR has nothing to do with this discussion at all, nor do its parts. It is solely about differences between dialects in different countries, such as the differences between British and American English. It is totally irrelevant here and a red herring. oknazevad (talk) 15:53, 29 October 2017 (UTC)

Agreed. I see no evidence this is a dialect matter. It's entirely a context matter.  — SMcCandlish ¢ >ʌⱷ҅ʌ<  02:05, 30 October 2017 (UTC)

Data as in Disk Storage

Several have asserted that "data" as in this Disk Storage article is a mass noun usage and therefore the article should use "data is." Disk storage is almost always enumerated in bytes which would justify the plural noun "data are" usage herein. The Access Methods section contains some enumeration information; it could be improved making the capacity ranges of the various disk storage media more clear and further supporting the plural noun usage. Tom94022 (talk) 06:51, 3 November 2017 (UTC)

That just means that "bytes" takes "are". Money is divided into dollars and cents, but we still write "money is...", not "money are...". --Deacon Vorbis (talk) 12:41, 3 November 2017 (UTC)
Money is a mass noun - data are not yet :-) Tom94022 (talk) 18:34, 3 November 2017 (UTC)
Data in Disk storage is a mass noun. Stating it could be broken down to something countable is irrelevant. Any mass noun can be broken down into atoms, so there are no mass nouns?
Disk storage is a subset of Data storage, which is called that because data in this industry is a mass-noun. If it were count, the article would be Datum storage, as collections use the singular form in English (Rock collection, Flower box, Seed storage, not Rocks collection, Flowers box, Seeds storage). The subject of data in this article is as a mass, and treating it otherwise in text for no reason is disruptive to read. --A D Monroe III(talk) 15:46, 5 November 2017 (UTC)

Should this discussion be moved to MOS?

MOS suggests this is a style retention issue Already rejected with an overwhelming consensus at MOS

(Edited from MOS)

There seem to be at least four sets of opinions:

  1. Only "data is" is correct since it has become a mass noun.
  2. Either "data is" or "data are" depending upon context
  3. Either "data is" or "data are" depending upon editorial style
  4. Only "data are" because it is a plural noun for publications such as WP (more of a technical than a general publication)

At least three of the four groups are represented above and it looks like no consensus will be reached here. Furthermore, it is an often repeated problem; a search of article talk pages turns up 1,022 instances of both variants. An analysis of the first 500 hits (about half) shows 288 articles discussing this issue in 2017 with hits going back to 2005 - here is the detailed analysis. So "this dispute is recurrent and flamey enough that MoS should say something specific about it." It really doesn't matter if MoS addresses the more general issue of variants within a language (more or less along the lines of MOS:ENGVAR) or just this issue - my recommendations specifically for "data" are:

  1. Either "data is" or "data are" may be consistently used in an English WP article (i.e., editorial style decision, #3 above).
  2. If an English WP article contains "datum" then "data is" should not be used unless justified by the article's content.

While this discussion attempts to limit itself to this article and closely related articles my research clearly shows this is enough of a repeated historical problem that it should be addressed in MOS. Tom94022 (talk) 18:16, 3 November 2017 (UTC)

This was discussed at Should MOS cover "data is" vs. "data are"?. The consensus rejected MOS covering this discussion, which is why the section above was closed. --A D Monroe III(talk) 15:51, 5 November 2017 (UTC)
Somewhat premature collapse since the discussion is still going on at MOS. The evidence suggests it is a large enough problem within WP that it should be covered there. Tom94022 (talk) 23:16, 6 November 2017 (UTC)
The only person in that discussion who thinks that this discussion should be moved to MOS is Tom94022. Reclosing.

Clarification on grammar vs. data is/are !votes

Per the RfC statement, this RfC asks:

In this article, should "data" be used as a count noun (the plural of "datum") or mass noun (like "information")? This affects its grammatical use in many sentences; examples: "the data are stored" vs. "the data is stored", and "SSDs hold fewer data" vs. "SSDs hold less data".

The #Survey starts with suggestions for !voting:

Data are or Data is?

The suggested shorthand for voting was in no way meant to override the RfC as written. "Data are" !votes mean "'Data' is grammatically the plural of 'datum' in this article" and "Data is" !votes mean "'Data' is grammatically a mass noun in this article". Standard English grammar and WP guidelines would still apply, regardless of the text of the shorthand !voting. As per the RfC as written, a !vote for either "data is" or "data are" affects its grammatical use (emphasis added); it will not blindly replace all instances of "data is" or "data are" with the shorthand !vote, overriding correct English grammar, and WP rules for not changing quotes, etc.

Specifically, "pieces of data are" would remain unchanged in this RfC, as "pieces of data" is a plural noun phrase, regardless of whether data is a mass noun or count noun, and plural noun phrases use "are". --A D Monroe III(talk) 00:03, 8 November 2017 (UTC)

MOS:RETAIN

This is not an English variety issue covered by MOS:RETAIN; it is a style retention issue

Per MOS:RETAIN, When no English variety has been established and discussion does not resolve the issue, use the variety found in the first post-stub revision that introduced an identifiable variety.

  • Currently, no English variety is established. Although there are a few instances of "data are" (data as count noun) currently in the article, this article also currently has these:
  • size of data to be stored (data as mass) instead of number of data to be stored or size of datum to be stored (data as count noun)
  • controlling the data transfer (mass) instead of controlling the datum transfer (count)
  • A link to Data transfer rate (mass) instead of Datum transfer rate (count)
  • user data bits are transferred (mass); there's no grammatical way to have two plurals in a row here -- it would be like saying trees leaves, the first noun has to be singular to make a noun phrase.
  • "gross" data transfer rate (mass) instead of "gross" datum transfer rate. (count)
So no variety is established, except for the several times that it was established completely as data as mass, but changed to be a mix of count and mass ([1], [2], [3]) all by the single editor here still pushing "data are" with no benefit to the article or the reader.
  • Since it's never been established as data as count in this article, per RETAIN, we use the variety found in the first post-stub revision that introduced an identifiable variety. That would be this version from May 2003, which establishes data as a mass noun for this article.
  • Data as mass has been fully established multiple times since then, even including during this RfC.

Following RETAIN requires that we go with "data is".

Oh, and once again, the whole article is part of Data storage (mass), not Datum storage (count). So even if the article was ever made consistent with "data are", per RETAIN, the article should be changed to reflect its fundamental ties with data as a mass noun. --A D Monroe III(talk) 15:14, 15 November 2017 (UTC)

My bad. I meant MOS:STYLERET, not MOS:RETAIN. Sorry. Let's try this again at #MOS:STYLERET, just below. --A D Monroe III(talk) 17:14, 18 November 2017 (UTC)

MOS:STYLERET

Per MOS:STYLERET, editors should not change an article from one styling to another without "substantial reason"

  • Currently, no style of data is consistently established in this article. Although there are a few instances of "data are" (data as count noun) currently in the article, this article also currently has these:
  • size of data to be stored (data as mass) instead of number of data to be stored or size of datum to be stored (data as count noun)
  • controlling the data transfer (mass) instead of controlling the datum transfer (count)
  • A link to Data transfer rate (mass) instead of Datum transfer rate (count)
  • user data bits are transferred (mass); there's no grammatical way to have two plurals in a row here -- it would be like saying trees leaves, the first noun has to be singular or a mass noun to make a noun phrase.
  • "gross" data transfer rate (mass) instead of "gross" datum transfer rate. (count)
So data as count style is not established, but several times it was established completely as data as mass, but changed back to be a mix of count and mass ([4], [5], [6]) all by the single editor here still pushing "data are" with no benefit to the article or the reader.
  • Since data as count has never been established in this article, we should consider the one data style found in the first post-stub revision that introduced an identifiable data style. That would be this version from May 2003, which established data as a mass noun for this article.
  • Data as mass has been fully established multiple times since then, even including during this RfC.

Following #MOS:STYLERET requires that we go with "data is".

Oh, and once again, the whole article is part of Data storage (mass), not Datum storage (count). So even if the article was ever made consistent with "data are", per MOS:STYLERET "substantial reason", the article should be changed to reflect its fundamental ties with data as a mass noun. --A D Monroe III(talk) 17:14, 18 November 2017 (UTC)

None of these observations by Monroe are substantial enough to change the four instances of "data are" to "data is." The observations are irrelevant, small and/or wrong as follows:
  • There is no requirement in English grammar for the consistency proposed by Monroe - a specific noun can appear in multiple forms in any one article and there is no requirement its particular usage in the title of a hierarchical article dictate the form of nouns in any subordinate material.
  • The statistics are not particularly relevant since both forms of data can be and are used in many WP articles, but if somehow statistics were relevant an accurate set would show usage of "data" is predominantly as a count noun in this article.
  • While early instances of this article did exclusively use "data is", by 2009 it was using both styles and has apparently been stable and consistent in using "data are" since 2013, so given this stable longevity in its current form the early usage doesn't rise to the level of a substantial justification of a style change now.
  • In his devotion to pushing "data is" Monroe misstates my recent actions ([7], [8], [9]). Rather than making the article "completely as data as mass" as Monroe asserts, in these three edits two IPs partially changed some but not all of the "data are" usages into "data is" thereby making the article inconsistent. In two instances data was clearly a count noun and one IP made additional changes to change the style. I reverted back since neither IP provided a substantial reason for their style change as required by MOS:STYLERET. Furthermore, I think we all agree that when the noun-verb pair starts with "data" and is followed by the irregular verb "to be" the form of "to be" should be consistent within an article. This specific consistency is not foolish so making these four nouns consistent would also be a substantial reason for the style change as required by MOS:STYLERET.
None of the other arguments made in the discussions above rise to the level of substantial justification for the proposed style change as required by MOS:STYLERET. Therefore the article should remain as is. Tom94022 (talk) 20:01, 22 November 2017 (UTC)
Arguments to justify the changes that reverted back to "data are" were explicitly based on STYLERET, by name, in the discussions above. Now that STYLERET has been shown to actually support the opposite, it's now claimed that STYLERET no longer applies? STYLERET applies, as agreed by this edit at MOS TALK by the above editor, after a long discussion with many editors at MOS. The only justification to revert changes back to "data are" is based on style and consistency. To now deny STYLERET is to remove the last remaining justification to revert back to "data are". --A D Monroe III(talk) 17:52, 23 November 2017 (UTC)
I am reminded about Emerson's remark on consistency. The article as it sits here has four instances of "data are" so there is no reversion under discussion, just a style change, and so far there have been no substantial reasons raised for such a change. Three recent reversions were partial style changes with no reason stated at all, which were violations of MOS:STYLERET, and therefore the absence of any reason is a substantial reason for those reversions. We can debate whether inconsistency in usage (as in fourth bullet above) is a substantial reason for a style change but, to be clear, MOS:STYLERET does apply and so far there is not a substantial reason to change "data are" to "data is" in this article as it now reads! Tom94022 (talk) 18:26, 23 November 2017 (UTC)
The style in question is data as count or data as mass. Currently, use of data in the article is mixed, but at the start, and several times since then, it has been completely consistent with data as mass. All changes that reverted these are against STYLERET. It cannot ever be made consistent with data as count, because this article's main subject is Data storage, named as data as mass; I can't think of a more substantial reason for it being data as mass. Bold and exclamation points aside, as STYLERET applies here, then it requires the article be made consistent with data as mass, changing the few remaining "data are" to "data is". No other resolution is possible. --A D Monroe III(talk) 16:09, 24 November 2017 (UTC)

Close this discussion

In the absence of consensus this discussion should be closed without changing the article.

An opinion at MOS on this specific subject is that this discussion falls under Retaining Existing Styles, which states, "... editors should not change an article from one styling to another without 'substantial reason'." I agree.

There are about 30,000 WP articles that use one form or the other of data are/data is. In these articles WP editors use "data are" over "data is" by a ratio greater than 4/3, so the actual style used by actual WP editors demonstrates that there is no "substantial reason" for a change. The best unsubstantial reason raised above is that "data is" is "more common" in general works. It is not relevant in WP where editors prefer "data are," making WP more of a technical publication where "data are" is more common. Either is acceptable.

I would note that this has been a very stable article and I am the only editor participating in this discussion who contributed to the article. It is one of about 17,000 WP articles exclusively using "data are." So I ask why put all this effort into imposing a rule in this one article and oppose taking it to MOS? There is nothing about its usage herein that is any different than its usage in the other 17,000 articles! If this is not a question of retaining an existing style then shouldn't it be discussed at MOS? If it is a retention question then the discussion should be closed with the article unchanged. Tom94022 (talk) 18:56, 8 November 2017 (UTC)

The RfC exists to establish a substantial reason to change: common use and reader expectation for this subject.
This RfC isn't for all of WP; what other articles do isn't relevant.
Claims that verge on WP:OWN are not relevant.
MOS is a dead horse here. --A D Monroe III(talk) 02:49, 9 November 2017 (UTC)
The only valid reason to close down an RfC before 30 days is WP:SNOW, and the votes are not one-sided enough to invoke SNOW. Closing down RfCs early is undesirable, because a significant number of people take two-week vacations and we don't want them to come back and discover that there was an RfC that they were not able to comment on. Let it run.
BTW, a careful reading of WP:LOCALCON may help some of the editors who are participating in this discussion. --Guy Macon (talk) 06:04, 9 November 2017 (UTC)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Note: The closure of this RfC was reviewed and endorsed; see Request review of closure at Disk storage. --A D Monroe III(talk) 21:27, 2 February 2018 (UTC)