Talk:Count key data
This is the talk page for discussing improvements to the Count key data article. This is not a forum for general discussion of the article's subject.
This article is rated Start-class on Wikipedia's content assessment scale.
This page has archives. Sections older than 180 days may be automatically archived by Lowercase sigmabot III when more than 3 sections are present.
Nomenclature: blocks versus records, hardware versus software
There is a disconnect in the vocabulary of IBM hardware manuals versus software manuals. The hardware manuals for CKD DASD use the term record while the software manuals use the term block or physical record. I believe that the lead should mention this.
@Tom94022: Also, edit special:permalink/1212325573 changed "or a copy of the highest key in the block, for 'blocked' records)" to the garbled "or a copy of the first n bytes in of the first data when the record has several 'blocks' data concatenated in one data field", which does not conform to either hardware or software nomenclature. I suggest "or a copy of the highest key in the record, for blocked logical[a] records)"
-- Shmuel (Seymour J.) Metz Username:Chatul (talk) 16:29, 7 March 2024 (UTC)
- I am not sure what substantive relevant distinction you are making - I am not aware of any reference manual for any CKD device that uses the term "block" to refer to "record." Ditto at the channel command reference manual level. Yes, at some level (e.g. to_IBM_Direct-Access_Storage_Devices_and_Organization_Methods_Dec75.pdf) IBM does distinguish between logical record and physical record, with "block" sometimes the same as "physical record", but I think that is a level of complexity we don't need here. IBM uses "record", not "physical record", in the relevant reference manuals, and I think that is what we should consistently use in this article. If someone wants to add a note someplace in the article that a CKD record is sometimes referred to as a "physical record" or "block", that should be more than enough. Tom94022 (talk) 22:05, 7 March 2024 (UTC)
- Exactly what I wrote; the hardware and software nomenclature differ.
- The quoted text in the cited edit broke consistency, even within a single sentence, and used nomenclature that is OR; since you removed it in permalink/1212432070, it is no longer an issue. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:33, 8 March 2024 (UTC)
Notes
- ^ IBM uses the terms block and physical record in software manuals for what it calls record in the hardware manuals.
Sections having nothing to do with CKD format
Some sections of the article deal with topics that seem unrelated to CKD format, such as Count key data § Dynamic paths, which discusses a feature that appears to concern data paths; the feature sounds like one that could also be used with FBA. Others that might be unrelated to CKD include Count key data § Multiple Requesting and Count key data § Command Retry. Perhaps there should be a page, or a section of a page, discussing the architecture of IBM mainframe storage hardware and firmware, and perhaps the general notion of channel, control unit, and device. Guy Harris (talk) 06:28, 8 March 2024 (UTC)
- The article lede does state that the article also covers CKD channel commands, but the sections cited above do seem to be inappropriate to this article. The problem is that IBM channels are only briefly covered in the various IBM System articles, e.g. IBM System/360 Channels, with the global article on Channel I/O being perhaps too generic for such material. Maybe these sections can be moved into the article on the IBM system in which such features were first announced. A bit of a research project, but maybe some of the editors will recall the announcement date and system. Tom94022 (talk) 19:22, 8 March 2024 (UTC)
- Upon further study it seems to me most of what is in the block multiplexer channel enhancements section of this article should be moved to History of IBM CKD Controllers article. Tom94022 (talk) 21:40, 8 March 2024 (UTC)
- History of IBM CKD Controllers looks as if it can serve as the page that "[discusses] the architecture of IBM mainframe storage hardware and firmware", at least for CKD drives. Guy Harris (talk) 21:55, 8 March 2024 (UTC)
- The CCW opcodes controlling Dynamic paths are specific to ECKD controllers, and belong here until there is an ECKD article. I'm not aware of any FBA controller supporting them; I don't know enough about SCSI drives accessed over FCP to comment on them.
- Some of the material belongs in IBM System/360#Input/Output. Some, however, is specific to DASD, e.g., the Set Sector and Read Sector CCW opcodes.
- There really ought to be an architecture article for S/360 through z/Architecture, as well as an ECKD article; much of what you mention would fit naturally there. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:25, 10 March 2024 (UTC)
- "Architecture article", singular, or "architecture articles", plural? There's already IBM System/360 architecture, with IBM System/360 covering the family as a product line, and z/Architecture, with IBM Z covering the family as a product line, but the intermediate architectures don't have such a split.
- (x86 has a single x86 page, which covers the product line of chips, and has some details common to 16-bit, 32-bit, and 64-bit architectures, and also IA-32 for the 32-bit version and x86-64 for the 64-bit version.) Guy Harris (talk) 23:17, 10 March 2024 (UTC)
- The IBM System/360 architecture doesn't cover S/370, 370/XA, ESA/370, ESA/390 or z/Architecture, and is already at 86K; a single architecture article would be too big. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:27, 11 March 2024 (UTC)
- Actually I think all architectures have homes either in articles or sections as follows: S/370, 370/XA, ESA/370, ESA/390 or z/Architecture, so all we have to do is find the right one to move the irrelevant material to. Tom94022 (talk) 17:42, 11 March 2024 (UTC)
The item about "virtualized" CKD should probably be expanded
The sentence
Originally CKD records had a one-to-one correspondence to a physical track of a DASD device; however over time the records have become more and more virtualized such that in modern IBM mainframes there is no longer a direct correspondence between a CKD record ID and the physical layout of a track.
should perhaps have more text explaining what that means. As I understand it, the drives are just ordinary fixed-block drives, and some combination of hardware, firmware, and software(?) makes that look like a CKD device. I don't know if the details of how that's done are available in any reference. Guy Harris (talk) 06:34, 8 March 2024 (UTC)
- Does US 5581743A, "CKD to fixed block mapping for optimum performance and space utilization", describe one way IBM does that CKD emulation? (US5581743A at Google Patents) Guy Harris (talk) 06:53, 8 March 2024 (UTC)
- I doubt if the existence of an IBM patent is sufficient to establish it was ever practiced in any actual product. We would have to find an RS linking the patent to a product to make such an assertion. It is OR, but I am pretty sure the first RAMAC RAID mapped CKD into FBA devices by writing the entire physical CKD track, gaps, ECC, etc. into fixed blocks and then staging the track into a buffer before reconnecting for a transfer. Not a very efficient use of storage, but it made the virtualization straightforward. I am pretty sure later virtualizations were more efficient but don't know any details. Also, IBM was not the first virtualizer; I believe EMC was, and I am not aware of any published material on its virtualization architecture. It is indeed a combination of hardware and mainly firmware in a storage subsystem, but maybe that is all we can say, and even then finding an RS might be difficult. I tried BARD and got no help. :-) Tom94022 (talk) 07:45, 8 March 2024 (UTC)
- This slideshow seems to suggest that the first emulated CKD device was from a product code-named "Iceberg" from Storage Technology Corporation. Guy Harris (talk) 07:51, 8 March 2024 (UTC)
- The "EMC Symmetrix Integrated Cached Disk Array" introduced in September 1990 was an IBM CKD emulating storage device using a RAID array of Seagate FBA HDDs; it was likely the first such system and well before STC's Iceberg in 1994. Tom94022 (talk) 19:03, 8 March 2024 (UTC)
- There are two things. One is that IBM sells boxes that to the hardware look like traditional disk drives, though (if you open the box) you find a bunch of ordinary drives. I suspect, though, that you are not supposed to open them. The second is the files used for P/370 and P/390 systems, by AWSCKD, where you can access the files on the host system. The former should be considered as black boxes, as IBM could change them at any time. Gah4 (talk) 08:17, 8 March 2024 (UTC)
- That sentence is iffy anyway: it is the logical track that was in 1-1 correspondence with the physical track, not the individual records, especially when there were multiple records per track.
- Virtual tracks go back at least as far as the IBM 3350, which could be configured to look like a 3330-11 or like a pair of 3330-1 volumes; in either case the logical volumes had a different track size from a native volume. I believe that the 3350 was the last CKD device to simulate a dissimilar device. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 21:11, 8 March 2024 (UTC)
- I'm not sure what AWS Cloud Development Kit had to do with this discussion, but maybe I have the wrong acronym?
- I am pretty sure the physical track on a 3350 emulating a 3330 was identical to the virtual track - the count field, gaps and ECC were different in content or length, but each record was in the same physical order, as were the key and data fields of each record. Ditto for the 3344 emulating a 3340. They could do this because the track length of the 3350/3344 was much greater than that of the 3330/3340, so the emulated physical track just had a bunch of unused bytes at the end (longer G4). That's somewhat different from using a bunch of FBA blocks to emulate a CKD track. Tom94022 (talk) 18:38, 11 March 2024 (UTC)
- I hate those overloaded acronyms :-(
- Every IBM software product, at least starting with OS/360 days, has a three letter code used for its modules and messages. For P/370 it is AWS. The module emulating CKD for P/370 is AWSCKD. (Note: not AWSCDK.) There is also AWSFBA for emulating FBA disks. Gah4 (talk) 06:56, 18 March 2024 (UTC)
- It's plausible that the 3350 used an entire physical track for each 3330 logical track, but some virtualization would still have been necessary so that only 13030 (13165) bytes were visible per track and the overhead factors yield compatible results. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 21:15, 11 March 2024 (UTC)
- The gaps have to be the same in time to prevent overrun, so, given the higher data rate, they are longer in byte count when written on a 3350 in 3330 mode than they are on a 3330, but a simple counter stops transfer when it reaches 13330, the number of raw bytes in a 3330 revolution. The counter is incremented by the length of each 3330 gap when the 3350 gap is actually written. Tom94022 (talk) 06:02, 12 March 2024 (UTC)
- I haven't thought about this for a while. It is 13030 bytes for the full track blocksize, plus 135 bytes overhead per block. So it should be 13165. For the 2314, 7294 for full track, and the overhead is only for other than the first track. As far as I know, the difference has some significance. Gah4 (talk) 06:52, 18 March 2024 (UTC)
- Yes, 13165 is the published value and also the value used in capacity calculations. Starting with the 3375 the calculation included another factor[a] reflecting sector size. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:41, 18 March 2024 (UTC)
- Prior to the 3330 all IBM DASD had a variable number of bytes from index to index due to speed variation. Beginning with the 3330 (2305 perhaps) IBM DASD wrote in synchronization with the rotating spindle, so there was a precise integer number of bytes per track, 13,330 to the best of my recollection for the 3330 and BTW exactly 1.5 times that for the CDC SMD. It is IBM's Gap 1 and the HA field that account for the difference between the published IBM numbers and the actual full track capacity; that, for example, is the reason behind the difference between 13165 published and 13440 (originally written as 13330) actual in the case of the 3330. Tom94022 (talk) 07:21, 19 March 2024 (UTC)
- For the 3330, the 135 overhead is per block. So half track is 13165/2-135 and 1/3 track is 13165/3-135. But ok, the difference between 13330 and 13165 is 165, or 135+30. Is there both HA and R0?
- The one I never knew about, from when I first knew about the 3330 51 years ago: for the 2314 the block overhead formula doesn't have overhead for the first block, but for the 3330 it does. There needs to be at least a gap for write splice, to turn on and off the writing of bits, as long as you don't rewrite the whole track. Well, then there is still a write splice at the end of the track. Gah4 (talk) 12:35, 19 March 2024 (UTC)
- There is always overhead for the first and last block; there are different ways to document that overhead. Yes, the 2314 and 3330 have both HA and R0. Further, IBM software assumes a standard (DL=8, KL=0) R0. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 13:37, 19 March 2024 (UTC)
- I found the table in this one on page 137. For the 2314, blocks other than the last have an overhead of 101 bytes, and also a factor of 534/512. (More with KL>0). So half track is 3520, as 101+3520*534/512+3520 is 7292.25. As noted in the footnote, there is no overhead for the last track. Seems to me that this complicated formula comes somehow from the low level format. And how convenient that 3520 is a multiple of 80. 50 years ago, if we didn't know if a data set was going to a 2314 or a 3330, the favorite was 3120, for LRECL=80, not so far off for either. Gah4 (talk) 13:08, 19 March 2024 (UTC)
- IBM gives separate 2314 capacity formulae for R1 and subsequent records, where the formula for R1 includes the overhead. ITYM last block, and there is overhead, but it's factored in elsewhere. Also, all published formulae assume that R0 has DL=8, KL=0. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 13:37, 19 March 2024 (UTC)
- For the 2314, it seems that the 534/512 overhead is only for the non-last block. You can easily add the 101 to any last gap, but harder to explain the 534/512. But gaps have to be big enough to allow for the possible variations, such as disk speed. (Unless, as Tom notes, it is synchronized to disk speed.) When you write a block, the hardware has to, sometime, figure out that you are the last block that fits. As far as I know, it can only do that by trying to write it, and then finding it doesn't fit. (For RECFM=U, it has no idea what is coming next.) So, in the case of other than the last block, the overhead allowance is slightly larger than it needs to be, just to be sure. Gah4 (talk) 18:55, 19 March 2024 (UTC)
- I was trying to understand Tom's 13330, which allows for one 135 byte overhead (presumably mostly gap), and 30 bytes of data. But maybe the 13330 does not include HA. Gah4 (talk) 18:55, 19 March 2024 (UTC)
- Memory failed me; the actual unformatted track capacity of the 3330 was 13,440 bytes. Simple proof is 13440 bytes/rev * 60 rev/sec = 806,400 bytes/sec, the data transfer rate (published rounded to 806 kB/sec). All other numbers do not arrive at the data rate. The various IBM published track numbers assume as a minimum an HA field, which requires a post index gap (G1). Tom94022 (talk) 06:45, 20 March 2024 (UTC)
- The published numbers also assume a standard (KL=0,DL=8) R0. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 16:56, 20 March 2024 (UTC)
- OK, so 13440 is enough for both HA and R0, and some gaps. If you never rewrite HA, you don't need write splice for it. As well as I know, in addition to the data rate servo, the sector values, and I suspect index are on that track. Is there a servo track for each cylinder? Gah4 (talk) 16:18, 20 March 2024 (UTC)
- Yes. Technically there is one servo surface in the cylinder dedicated to track position information, and the servo mechanism reading this servo information centers all data tracks in a cylinder between two servo tracks. This servo surface approach became obsolete beginning in the late 1980s with the introduction of embedded servos, in which the servo information was intermixed with data on all tracks of all cylinders, freeing the servo surface for data. Very important in small disk drives with only a few surfaces per cylinder (e.g. ST506 with 4 surfaces), less so, say, in the 3330 with 20 surfaces. Tom94022 (talk) 17:11, 20 March 2024 (UTC)
- The 13165 value is for the full track capacity, including overhead. The 13030 value is the largest possible unkeyed block size. As a side note, the 3330 had a synchronization track, which the 2314 did not; more precisely, the 3336 versus the 2316. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 13:37, 19 March 2024 (UTC)
- The 3330 and most subsequent IBM DASD into the 1990s had a servo surface which provided a synchronized write clock to its controller - on the 3336 disk pack for every track there were precisely 13440 servo bursts per revolution which were used to generate the write frequency of 13440 bytes/rev * 60 rev/sec = 806.4 KB/sec. Prior DASD, e.g. 2314, wrote with a constant frequency; the rotational speed varied with wear and voltage resulting in a varying number of bytes written per revolution. IBM accounted for this by including an allowance for speed tolerance within the various formulae for track capacity of such DASD. Tom94022 (talk) 06:45, 20 March 2024 (UTC)
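The block-size arithmetic worked through in this thread can be checked with a short Python sketch. This is only an illustration under stated assumptions, not IBM's published algorithm: the function names are hypothetical, the constants (13165, 135, 7294, 101, 534/512, 13440, 60) are the ones quoted in the comments above, keys are assumed absent (KL=0), and the 534/512 product is left unrounded, as in the 7292.25 calculation above.

```python
# Constants as quoted in the discussion above (unkeyed blocks only).
TRACK_3330 = 13165      # published 3330 track capacity, bytes
OVERHEAD_3330 = 135     # per-block overhead on the 3330, bytes
TRACK_2314 = 7294       # published 2314 full-track block size, bytes
OVERHEAD_2314 = 101     # overhead for every 2314 block except the last


def max_block_3330(blocks_per_track: int) -> int:
    """Largest unkeyed block size when n equal blocks share a 3330 track.

    Follows the "13165/n - 135" rule stated in the thread.
    """
    return TRACK_3330 // blocks_per_track - OVERHEAD_3330


def max_block_2314(blocks_per_track: int) -> int:
    """Largest unkeyed block size when n equal blocks share a 2314 track.

    Assumes each block except the last costs 101 + DL*534/512 bytes and
    the last block costs DL bytes, so DL must satisfy
        (n-1)*(101 + DL*534/512) + DL <= 7294.
    Both sides are multiplied by 512 to stay in integer arithmetic.
    """
    n = blocks_per_track
    numerator = 512 * (TRACK_2314 - OVERHEAD_2314 * (n - 1))
    denominator = 534 * (n - 1) + 512
    return numerator // denominator


def raw_data_rate(bytes_per_rev: int, revs_per_sec: int) -> int:
    """Raw transfer rate in bytes per second for a synchronized-clock drive."""
    return bytes_per_rev * revs_per_sec


if __name__ == "__main__":
    print(max_block_3330(1))        # full-track block on the 3330
    print(max_block_2314(2))        # half-track block on the 2314
    print(raw_data_rate(13440, 60)) # 3330 raw bytes per second
```

With these assumptions the sketch reproduces the figures mentioned above: 13030 for a full-track 3330 block, 3520 for a half-track 2314 block, and 806,400 bytes/sec for the 3330.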
Notes
- ^ CKD over sector space calculation
- 3375 with key
- 224 + ((KL + 191)/32)(32) + ((DL + 191)/32)(32)
- 3375 without key
- 224 + ((DL+ 191)/32)(32)
- 3380 with key
- 256 + ((KL + 267)/32)(32) + ((DL + 267)/32)(32)
- 3380 without key
- 256 + ((DL + 267)/32)(32)
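The four formulae in this note can be sketched in Python as follows. This is a hedged illustration only: it assumes that "/32" in the note means integer division with the remainder discarded before multiplying by 32 (the note itself does not say so), and the function and table names are hypothetical.

```python
# (base constant, per-field constant) for each device, taken from the note above.
SPACE_CONSTANTS = {
    "3375": (224, 191),
    "3380": (256, 267),
}


def record_space(device: str, data_length: int, key_length: int = 0) -> int:
    """Track space consumed by one record, per the formulae in the note.

    Assumption: the division by 32 truncates, so each field is counted
    in whole 32-byte units. A key_length of 0 selects the "without key"
    formula.
    """
    base, field = SPACE_CONSTANTS[device]
    space = base + ((data_length + field) // 32) * 32
    if key_length > 0:
        space += ((key_length + field) // 32) * 32
    return space


if __name__ == "__main__":
    print(record_space("3375", 4096))     # unkeyed 4096-byte record on a 3375
    print(record_space("3380", 4096))     # unkeyed 4096-byte record on a 3380
    print(record_space("3380", 80, 8))    # keyed record: KL=8, DL=80
```

Under the truncating-division assumption, an unkeyed 4096-byte record occupies 4480 bytes on a 3375 and 4608 bytes on a 3380.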