Au file format

Au
Filename extension	.au; .snd
Internet media type	audio/basic (headerless format)
Type code	public.au-audio; public.ulaw-audio (headerless variant)
Magic number	.snd (newer versions)
Developed by	Sun Microsystems
Type of format	audio file format, container format
Container for	Audio, most often μ-law

The Au file format is a simple audio file format introduced by Sun Microsystems. The format was common on NeXT systems and on early Web pages. Originally it was headerless, consisting only of 8-bit μ-law-encoded data at an 8000 Hz sample rate.^[1] Hardware from other vendors often used sample rates as high as 8192 Hz, often integer multiples of video clock signal frequencies. Newer files have a header of six unsigned 32-bit words, followed by an annotation chunk whose content is optional but whose size is always non-zero, and then the data (in big-endian format).

Although the format now supports many audio encoding formats, it remains associated with the μ-law logarithmic encoding. This encoding was native to the SPARCstation 1 hardware, where SunOS exposed the encoding to application programs through the /dev/audio device file interface. This encoding and interface became a de facto standard for Unix sound.

New format

All fields are stored in big-endian format, including the sample data.^[4]^[5]

uint32 word	Field	Description
0	Magic number	The value 0x2e736e64 (four ASCII characters ".snd")
1	Data offset	The offset to the data in bytes. (In the older Sun version, this had to be a multiple of 8.) The minimum valid number is 28 (decimal), since this is the header length (six 32-bit words) plus a minimal annotation size (4 bytes, another 32-bit word).
2	Data size	Data size in bytes, not including the header. If unknown, the value 0xffffffff should be used.
3	Encoding	Data encoding format: 0 = Unspecified; 1 = 8-bit G.711 μ-law; 2 = 8-bit linear PCM; 3 = 16-bit linear PCM; 4 = 24-bit linear PCM; 5 = 32-bit linear PCM; 6 = 32-bit IEEE floating point; 7 = 64-bit IEEE floating point; 8 = Fragmented sample data; 9 = Nested (unclear format); 10 = DSP program; 11 = 8-bit fixed point; 12 = 16-bit fixed point; 13 = 24-bit fixed point; 14 = 32-bit fixed point; 15 = (Unassigned); 16 = non-audio display data; 17 = μ-law Squelch format^[6]; 18 = 16-bit linear with emphasis; 19 = 16-bit linear compressed; 20 = 16-bit linear with emphasis and compression; 21 = Music kit DSP commands; 22 = Music kit DSP commands samples; 23 = ITU-T G.721 4-bit ADPCM; 24 = ITU-T G.722 SB-ADPCM; 25 = ITU-T G.723 3-bit ADPCM; 26 = ITU-T G.723 5-bit ADPCM; 27 = 8-bit G.711 A-law. Values 0 through 255 are supposed to be assigned by a file format authority (was NeXT, now Oracle). Other values can be used for custom formats.^[5]
4	Sample rate	The number of samples/second, e.g., 8000, 11025, 22050, 44100, and 48000.^[4] NeXT may use 8013.^[5]
5	Channels	The number of interleaved channels, e.g., 1 for mono, 2 for stereo; more channels possible, but may not be supported by all readers.
6	–	Optional annotation or description string, NULL-terminated. A minimum of 4 bytes must be stored even if unused. In the older Sun version, its length had to be a non-zero multiple of 8 bytes. In some older implementations, the string is not properly NULL-terminated, but the offset remains reliable.^[4]
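The header layout in the table above can be read with a few lines of code. The following is a minimal sketch in Python, not a reference implementation: the function name parse_au_header, the constant names, and the returned dictionary keys are illustrative choices, and accepting a data offset of 24 (rather than the stated minimum of 28) is a leniency assumed here.

```python
import struct

AU_MAGIC = 0x2E736E64        # the four ASCII characters ".snd"
UNKNOWN_SIZE = 0xFFFFFFFF    # "data size unknown" marker
FIXED_HEADER_BYTES = 24      # six 32-bit words

def parse_au_header(buf):
    """Parse the six big-endian uint32 header words and the annotation.

    Hypothetical helper for illustration; returns a plain dict.
    """
    if len(buf) < FIXED_HEADER_BYTES:
        raise ValueError("buffer shorter than the 24-byte fixed header")
    magic, offset, size, encoding, rate, channels = struct.unpack(
        ">6I", buf[:FIXED_HEADER_BYTES])
    if magic != AU_MAGIC:
        raise ValueError("missing .snd magic number")
    # The table gives 28 as the minimum offset; tolerating 24 is an
    # assumption made here, not something the format description states.
    if offset < FIXED_HEADER_BYTES:
        raise ValueError("data offset points inside the fixed header")
    # The annotation may lack its terminator in older files, so it is
    # bounded by the data offset first and cut at the first zero byte.
    annotation = buf[FIXED_HEADER_BYTES:offset].split(b"\x00", 1)[0]
    return {
        "data_offset": offset,
        "data_size": None if size == UNKNOWN_SIZE else size,
        "encoding": encoding,
        "sample_rate": rate,
        "channels": channels,
        "annotation": annotation,
    }
```

A caller would typically read the first few kilobytes of a file, pass them to this function, and then seek to data_offset to reach the samples.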

The type of encoding depends on the value of the "encoding" field (word 3 of the header). Formats 2 through 7 are uncompressed PCM (linear integer samples for 2 through 5, IEEE floating point for 6 and 7) and are therefore technically lossless, although not necessarily free of quantization error, especially in 8-bit form. Formats 1 and 27 are μ-law and A-law, respectively. Both are companding logarithmic representations of PCM and arguably lossy: they pack what would otherwise be almost 16 bits of dynamic range into 8 bits of encoded data, although this is achieved by an altered dynamic response rather than by discarding data. Formats 23 through 26 are ADPCM, an early form of lossy compression, usually with four bits of encoded data per audio sample (a 4:1 reduction from 16-bit input, or 2:1 from 8-bit; the same ratio as, e.g., encoding CD-quality audio as MP3 at about 352 kbit/s using a low-quality encoder). Several of the others (numbers 8 through 22) are DSP commands or data, designed to be processed by the NeXT Music Kit software.
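To illustrate how μ-law (format 1) spreads 8 encoded bits over a much wider linear range, the sketch below expands one μ-law byte to a signed linear value. The bit layout and the bias constant 0x84 are taken from the widely circulated G.711 reference code rather than from this article, and the function name ulaw_to_linear is an illustrative choice.

```python
ULAW_BIAS = 0x84  # bias constant used by common G.711 reference code

def ulaw_to_linear(byte):
    """Expand one 8-bit mu-law code to a signed linear sample (16-bit scale)."""
    u = ~byte & 0xFF                 # mu-law bytes are stored bit-inverted
    sign = u & 0x80                  # top bit: sign
    exponent = (u >> 4) & 0x07       # next three bits: segment number
    mantissa = u & 0x0F              # low four bits: step within the segment
    magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS
    return -magnitude if sign else magnitude

# The two extreme codes map to the ends of the linear range.
print(ulaw_to_linear(0x80), ulaw_to_linear(0x00))  # 32124 -32124
```

Each increase of the three-bit segment number doubles the step size, which is the logarithmic behaviour described above: quiet samples keep fine resolution while loud ones are quantized more coarsely.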

Note: PCM formats are encoded as signed data, as opposed to unsigned.
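As a small illustration of the signed, big-endian convention, the sketch below decodes 16-bit linear PCM (format 3) into per-frame tuples of interleaved channel samples. The function name decode_pcm16_be is hypothetical, and silently dropping a trailing partial frame is an assumption made for brevity.

```python
import struct

def decode_pcm16_be(data, channels):
    """Decode big-endian signed 16-bit PCM into a list of per-frame tuples."""
    frame_bytes = 2 * channels
    # Ignore any trailing bytes that do not form a complete frame.
    usable = len(data) - len(data) % frame_bytes
    fmt = ">%dh" % channels          # ">" = big-endian, "h" = signed 16-bit
    return [struct.unpack_from(fmt, data, pos)
            for pos in range(0, usable, frame_bytes)]

# Two stereo frames: (-1, 1) and (-32768, 32767).
print(decode_pcm16_be(b"\xff\xff\x00\x01\x80\x00\x7f\xff", 2))
```

Reading the same bytes as unsigned values would turn 0xFFFF into 65535 instead of -1, which is the kind of error the note above warns against.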

The current format supports only a single audio data segment per file. The variable-length annotation field is currently ignored by most audio applications.
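Because a file holds one header, one annotation and one data segment, writing an Au file is straightforward. The following sketch assembles such a file in memory. The function name build_au and its defaults (μ-law, 8000 Hz, mono) are illustrative assumptions, and padding the annotation to a multiple of 8 bytes is a conservative choice made here so that the stricter older Sun rule is also satisfied.

```python
import struct

AU_MAGIC = 0x2E736E64  # ".snd"

def build_au(samples, sample_rate=8000, channels=1, encoding=1, annotation=b""):
    """Return the bytes of an Au file with a single data segment.

    `samples` must already be encoded for `encoding` (1 = 8-bit mu-law).
    """
    # Zero-terminate the annotation, then pad to a non-zero multiple of 8
    # bytes; this also meets the 4-byte minimum.
    info = annotation + b"\x00"
    info += b"\x00" * (-len(info) % 8)
    data_offset = 24 + len(info)     # six header words plus the annotation
    header = struct.pack(">6I", AU_MAGIC, data_offset, len(samples),
                         encoding, sample_rate, channels)
    return header + info + samples

au_bytes = build_au(b"\xff" * 8000)  # one second of mu-law silence
print(len(au_bytes))                 # 8032
```

Readers that ignore the annotation only need the data offset to find the samples, so the padding choice does not affect them.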

References

[1] "audio/basic". IANA.org. Retrieved 23 February 2023.
[2] "AVFileTypeSunAU". Apple Developer Documentation. Apple Inc.
[3] "System-Declared Uniform Type Identifiers". Uniform Type Identifiers Reference. Apple Inc.
[4] Oracle man pages: au(4) - AU audio file format (current specification).
[5] "Audio File Formats FAQ: File Formats". sox.sourceforge.net. Archived from the original on 23 February 2023.
[6] "Audio File and Compression Formats". docs.oracle.com.

External links

Oracle man pages: audio(7i) — generic audio device interface (for information on the /dev/audio interface)
