Talk:Binary prefix

From Wikipedia, the free encyclopedia
WikiProject Measurement (Rated B-class, Mid-importance)
This article is within the scope of WikiProject Measurement, a collaborative effort to improve the coverage of Measurement on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as B-Class on the project's quality scale.
This article has been rated as Mid-importance on the project's importance scale.
WikiProject Computing / Software / Hardware (Rated B-class, High-importance)
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as B-Class on the project's quality scale.
This article has been rated as High-importance on the project's importance scale.
This article is supported by WikiProject Software.
This article is supported by Computer hardware task force (marked as High-importance).

Template-table references removed from article, preserved here

Bit rates
Name                  Symbol    Multiple
bit per second        bit/s     1        1
Decimal prefixes (SI)
kilobit per second    kbit/s    10^3     1000^1
megabit per second    Mbit/s    10^6     1000^2
gigabit per second    Gbit/s    10^9     1000^3
terabit per second    Tbit/s    10^12    1000^4
Binary prefixes (IEC 80000-13)
kibibit per second    Kibit/s   2^10     1024^1
mebibit per second    Mibit/s   2^20     1024^2
gibibit per second    Gibit/s   2^30     1024^3
tebibit per second    Tibit/s   2^40     1024^4
Multiples of bits
Decimal
Value             SI
1000     10^3     kbit    kilobit
1000^2   10^6     Mbit    megabit
1000^3   10^9     Gbit    gigabit
1000^4   10^12    Tbit    terabit
1000^5   10^15    Pbit    petabit
1000^6   10^18    Ebit    exabit
1000^7   10^21    Zbit    zettabit
1000^8   10^24    Ybit    yottabit
Binary
Value             IEC                 Customary
1024     2^10     Kibit   kibibit     Kbit   kilobit
1024^2   2^20     Mibit   mebibit     Mbit   megabit
1024^3   2^30     Gibit   gibibit     Gbit   gigabit
1024^4   2^40     Tibit   tebibit     -
1024^5   2^50     Pibit   pebibit     -
1024^6   2^60     Eibit   exbibit     -
1024^7   2^70     Zibit   zebibit     -
1024^8   2^80     Yibit   yobibit     -

Table that was formerly in lede, preserved here

IEC Binary Prefixes
Prefix   Symbol   Multiplier
kibi     Ki       2^10    1024
mebi     Mi       2^20    1024^2
gibi     Gi       2^30    1024^3
tebi     Ti       2^40    1024^4
pebi     Pi       2^50    1024^5
exbi     Ei       2^60    1024^6
zebi     Zi       2^70    1024^7
yobi     Yi       2^80    1024^8
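
The two multiplier columns in the table above are the same number written two ways, since 1024^n = 2^(10n). A minimal Python sketch that regenerates them is given here; the names `IEC_PREFIXES` and `multiplier` are made up for illustration and do not come from the article or any library.

```python
# Illustrative only: these names are hypothetical, not from the article.
IEC_PREFIXES = [
    ("kibi", "Ki"), ("mebi", "Mi"), ("gibi", "Gi"), ("tebi", "Ti"),
    ("pebi", "Pi"), ("exbi", "Ei"), ("zebi", "Zi"), ("yobi", "Yi"),
]


def multiplier(n: int) -> int:
    """Return the n-th IEC binary multiplier, 1024**n (equal to 2**(10*n))."""
    return 1024 ** n


for n, (name, symbol) in enumerate(IEC_PREFIXES, start=1):
    # Both forms describe the same integer; print them side by side.
    print(f"{name:5} {symbol}  2^{10 * n:<2}  1024^{n} = {multiplier(n)}")
```

Python integers have arbitrary precision, so even the yobi multiplier (2^80) is computed exactly.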

"are not be used as placeholders" ...

... is not correct English. Dondervogel 2 (talk) 21:27, 9 June 2017 (UTC)

Oh. Didn't see it before. Fixed? Jeh (talk) 21:38, 9 June 2017 (UTC)
The present sentence reads "Compliance with the SI requires that the prefixes take their 1000-based meaning, and are not to be used as placeholders for other numbers, like 1024", which is better than before but still does not seem correct to me. On first reading the "are" in the second half of the sentence appears to refer to the "compliance" in the first, which would not make sense. On second reading you realise it's actually meant to refer to "the prefixes", but the reader does not benefit from this tortuosity. I'm no linguist but the sentence does not read well. Dondervogel 2 (talk) 23:17, 9 June 2017 (UTC)
How about "... and that they are not to be used... "? Jeh (talk) 23:25, 9 June 2017 (UTC)
Yep - that works. Dondervogel 2 (talk) 11:58, 10 June 2017 (UTC)

Not neutral information

This article does not take a neutral point of view, and actually functions as a representative of one side of the argument by ignoring and neglecting to mention any and all usage of binary structures that would counter the argument to use decimal notations.

The fact is that ALL data structures in memory and on disk habitually make use of powers of 2 because the memory locations (variables) that hold their locations have maximum values revolving around powers of two, e.g. 16, 256, and so on.

Therefore, most data structures will be organized around those sizes. This is the reason that computer memory and files on disk are referred to in binary sizes.

If there had been a section on filesystems, this would very quickly become evident.

No mention is made whatsoever of these reasons FOR using binary numbers. The original article (before my edit) said that basically only computer RAM, and nothing else, uses powers of 2. This is false.

It said "most other" contexts use decimal numbers. This is false.

There are only two main contexts wherein decimal numbers are used: disk capacity and data transfer rates.

Data transfer rates have historically been denoted in bits/second, and there has never been a confusion around this.

To this day, such data rates are mentioned in "Kbit/s" or "Mbit/s" and always denote powers of 1000.

This article simply gives a very skewed representation of the real state of affairs. It seems arduous effort was put into hiding away all usages of binary numbers and the reasons for them (which can be found in their usages).

To understand anything about binary numbers you MUST mention data structures or file(system)-structures on disk.

You must mention WHY there are 512 and 1024 and 4096 byte sectors.

This is wholly neglected and avoided. Aye, it is silently kept back.

To lie by omission is still to lie. (talk) 19:13, 12 June 2017 (UTC)

Thank you for raising this on talk. From the above, you seem to have two main concerns:
  1. The article is (your word) "skewed"? Please provide specific examples of text that needs improvement.
  2. There is a need for a section on "filesystems". Why not add such a section? That way the new section can be judged on its merits, separately from your other edits.
Have I missed anything else important? Dondervogel 2 (talk) 19:38, 12 June 2017 (UTC)
This article is not promoting the use of binary or decimal prefixes for anything. It is merely descriptive of practice. If the IP wants its objections to be considered reasonably it should recast its objections in language that is not such a blatant failure to AGF. Jeh (talk) 20:11, 12 June 2017 (UTC)
I'll add a comment that whatever the IP thinks the reason is for 512, 1024 and 4096 byte disk sectors, he's probably wrong. I've worked with systems using 576 and 2080 octet blocks, too. He's mistaken that all data on disk is stored in powers of two. Generally it's stored in either a multiple or fraction of the host computer's main memory page size (which itself might not be 2^n, e.g., DEC-10 systems used 4608-bit blocks - 128 × 36-bit words), or some block size plus some overhead (e.g., early ZFS systems). I suspect the IP has a narrow view of the computer world. Tarl N. (discuss)
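
The block sizes quoted in the comment above can be checked with a few lines of arithmetic. This is only a sketch: the helper `is_power_of_two` is a hypothetical name, and the sizes are simply the ones mentioned in this thread. Note that 4608 bits works out to 576 octets.

```python
def is_power_of_two(n: int) -> bool:
    """True when n is a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


# DEC-10 figure quoted above: 128 words of 36 bits each.
block_bits = 128 * 36
print(block_bits)        # 4608 bits
print(block_bits // 8)   # 576 octets

# Sector and block sizes mentioned in this thread.
for size in (512, 1024, 4096, 576, 2080):
    print(size, is_power_of_two(size))
```

Only 512, 1024 and 4096 pass the check; 576 and 2080 do not.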
Let's look at a few of the IP's claims. The IP wrote: The fact is that ALL data structures in memory and on disk habitually make use of powers of 2 [...]
No, they don't.
Data structures are whatever size they are declared to be - possibly padded up slightly by the compiler and/or memory allocator, but there is no particular preference for larger powers of 2.
Here are a few examples from the Windows 10 kernel, 64-bit edition (you can easily verify these with the debugger):
IRP (I/O request packet) - 208 bytes, appended to which is an array of from 1 to an indeterminate number (usually 10 or fewer, but could be, say, three or seven) of "I/O stack locations" of 78 bytes each.
Executive process object (EPROCESS) - 1256 bytes. It contains a kernel process object (KPROCESS) - 352 bytes. It is associated with the Process environment block (PEB) - 896 bytes.
There's a similar trio of structures for threads. 1192, 872, and 6168 bytes, respectively.
File object (FILE_OBJECT) - 216 bytes.
Security descriptor (_SECURITY_DESCRIPTOR) - 40 bytes.
Page Frame Number array entry - 48 bytes.
etc., etc., etc.
And of course the size of an array, such as a character string, or the array of IO_STACK_LOCATIONs that follows an IRP, is whatever it is. Nobody who needs to store, say, an 83-byte string pads the allocation up to 128 bytes. Nor does the compiler.
Now, about the supposed cause. The IP claims that this is (continuing from the quote above):
[...] because the memory locations (variables) that hold their locations have maximum values revolving around powers of two, e.g. 16, 256, and so on.
Well, since the data structure sizes are obviously not being influenced by anything to be small powers of two in the first place, this assertion of a cause for powers-of-two sizes is obviously wrong.
I must say though that I find this proposed reason to be absurd (even beyond the fact that it doesn't seem to be "working"). When I define a data structure, the fact that the maximum address for that structure is e.g. 0xFFFFFFFF (minus the size of the structure) does not at all influence me to make the structure 32 instead of 27 bytes, or 512 instead of 490, or any such thing.
Besides, almost all in-memory data structures are so small that we wouldn't usually express their sizes using any prefix at all! So they're pretty irrelevant to any talk of prefixes, binary or otherwise.
I will grant that there are exceptions. An x86 or x64 page table page occupies exactly 4096 bytes, and furthermore is page-aligned. However, I then have to point out that the number of page table pages that exist is in no way constrained or even influenced to a power of two. It is a highly dynamic number, depending primarily on the number of processes (also not a power of two, let alone 1024) and on how much virtual address space each has defined. So the total space occupied by page tables, while it will always be a multiple of 4 Ki, will only be a multiple of 1 Mi or 1 Gi by happenstance. And even then for not very long.
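
To make the page table point concrete, here is a small sketch. The 4096-byte page size is the figure stated above; the page counts are hypothetical values chosen only for illustration, and `page_table_bytes` is a made-up helper name.

```python
PAGE_SIZE = 4096      # bytes per page table page, as stated above
MI = 1024 ** 2        # 1 Mi


def page_table_bytes(n_pages: int) -> int:
    """Total bytes occupied by n_pages page table pages."""
    return n_pages * PAGE_SIZE


# Hypothetical page counts; real systems change this number constantly.
for n_pages in (1, 37, 256, 1000):
    total = page_table_bytes(n_pages)
    print(n_pages, total,
          "multiple of 4 Ki:", total % PAGE_SIZE == 0,
          "multiple of 1 Mi:", total % MI == 0)
```

Every total is a multiple of 4 Ki by construction, but it is a multiple of 1 Mi only when the page count happens to be a multiple of 256.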
As for disks... as Tarl N. said, disk sector size has been all over the map, historically. The article already mentions the IBM 350 (sector size of 100 bytes) and the variable block length format (CKD) used by most drives attached to mainframes; the block size was picked by each installation. In the mid-70s I spent quite a bit of time on an HP system with disks that had a block size of 200 bytes. Yes, today, 4096 or 8192 byte memory pages are common, and that is why hard drives almost universally use 512 or 4096 byte sectors. But this was not always the case.
I must also mention that we're speaking here of the "end user" data in the sector. The actual length of the data field in each sector is considerably longer, to store the CRC data, the sector preamble, etc. Then there is the channel code... It is tempting to think that it must be easier for the drives' read/write circuitry to count to a nice binarily-even 0x200 bytes (or rather from 0 through 0x1FF) when reading or writing a block... since the end can then be noted by the counter overflowing... but since the drive actually has to read or write a somewhat larger number of bytes, and the resulting total size is not a power of two, this is a red herring. The number of bytes the drive actually reads or writes per block is just not a "convenient" binary number.
It is true that some elements of on-disk structure (file system metadata) are commonly small powers of two in their unit size. For example, one "file record" in NTFS is 1024 bytes. But that's because they conveniently fit into blocks - which are the size they are because of common machines' page size. It has nothing to do with what makes an efficient file system. But, like the page table example above, the number of file records in a file system is not influenced or "preferred" to be a power of two. In any case we never really see the "amount of space in file records" unless we're looking very very deep into the volume. This article is concerned with common uses of decimal and binary prefixes.
Similarly, also in NTFS, the security descriptor for a file may contain an arbitrary number of Access Control Entries.
Just like a hard drive can have an arbitrary number of cylinders. Which blows any "binary preference" for total hard drive capacity out the window.
As for the file itself, all modern file systems track the actual length of the file (e.g. byte offset to last byte written) down to the byte. Thus even though the file has to occupy an integral number of fixed-length sectors, its data length is not necessarily that.
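
The difference between a file's data length and the space it occupies can be sketched as follows. This is a simplification under stated assumptions: a 512-byte sector is assumed by default, the helper name `allocated_bytes` is hypothetical, and real file systems add further details (clusters, metadata) that are ignored here.

```python
def allocated_bytes(length: int, sector_size: int = 512) -> int:
    """Bytes occupied by the smallest whole number of sectors holding `length` bytes."""
    sectors = -(-length // sector_size)  # ceiling division on integers
    return sectors * sector_size


# Hypothetical file lengths, chosen only for illustration.
for length in (0, 1, 83, 512, 513, 5000):
    print(length, allocated_bytes(length))
```

The recorded length can be any byte count, while the occupied space is rounded up to a whole number of sectors.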
To sum up so far: Both in-memory data structures and file system structures are red herrings as far as examples of binary-sized things are concerned.
The IP is invited to come up with common examples of binary-sized things that are not currently reported by the article. But not data structures or file system metadata. Those claims just don't fit the facts.
The IP is also invited to cite specific wording that is "non-neutral" and suggest alternatives.
But please do avoid the argumentative and accusing verbiage with which the thread was started.
I would like to point out that the article's current form is the result of a lot of discussion that culminated in the article being replaced by a major rewrite. The replacement happened with this diff. The discussion of that effort, which began with copying the page into a sandbox and then editing on that while discussing, is archived starting with archive 15 and continuing through the first section of archive 16. Many opponents of the IEC binary prefixes participated. They felt it was key that the article not assign undue weight to those prefixes, to argue for their use, or to otherwise violate neutrality. As part of that, there was quite an effort to make the lists of "things with sizes quoted using powers of 1024" and "things with sizes quoted using powers of 1000" exhaustive. I really doubt we missed much on either side. Nobody brought up "data structures", either in memory or on disk. Jeh (talk) 09:44, 13 June 2017 (UTC)
Later thoughts: I believe that nobody mentioned data structures because the examples given in the article are products that one might buy (RAM, hard drives, etc.). The entire binary prefix confusion in the public (tech folks are, in general, not confused) arises from the fact that Windows and some other OSs display sizes of hard drives and similar devices using customary binary prefixes while hard drive makers use decimal. It is true that Windows displays file sizes using customary binary prefixes but there is no alternative display using decimal prefixes to compare these to, nor has anybody paid money for a "file" and then been disappointed that it's not as big as they thought it was!
And in general only developers have visibility of the sizes of file system metadata structures or of in-memory objects.
And... dare I say it? Wikipedia is written for a general audience. Jeh (talk) 02:14, 15 June 2017 (UTC)
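
The hard drive discrepancy described above is simple arithmetic. The sketch below assumes a hypothetical drive marketed as "500 GB" (500 × 1000^3 bytes); the function names are made up for illustration.

```python
def as_decimal_gb(n_bytes: int) -> float:
    """Capacity in decimal gigabytes (powers of 1000)."""
    return n_bytes / 1000 ** 3


def as_binary_gib(n_bytes: int) -> float:
    """Capacity in gibibytes (powers of 1024)."""
    return n_bytes / 1024 ** 3


capacity = 500 * 1000 ** 3   # hypothetical drive marketed as "500 GB"
print(f"{as_decimal_gb(capacity):.2f} GB  (decimal, as the maker quotes it)")
print(f"{as_binary_gib(capacity):.2f} GiB (what an OS using customary binary prefixes reports as 'GB')")
```

The same byte count yields 500.00 under the decimal convention and about 465.66 under the binary one, which is the gap that a buyer notices.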

External links modified

Hello fellow Wikipedians,

I have just modified 10 external links on Binary prefix. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

You may set the |checked=, on this template, to true or failed to let other editors know you reviewed the change. If you find any errors, please use the tools below to fix them or call an editor by setting |needhelp= to your help request.

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

If you are unable to use these tools, you may set |needhelp=<your help request> on this template to request help from an experienced user. Please include details about your problem, to help other editors.

Cheers.—InternetArchiveBot (Report bug) 15:43, 20 July 2017 (UTC)