Binary-safe

From Wikipedia, the free encyclopedia

Binary-safe is a computer programming term used mainly in connection with string-manipulating functions. A binary-safe function treats its input as a raw stream of bytes with no assumed format. It should therefore work with all 256 possible values that a character can take (assuming 8-bit characters).

Binary-safe file read and write

On Windows, an end of line is coded in text files using two successive characters: carriage return (\r aka 0x0D aka CR) followed by new line (\n aka 0x0A aka LF). File functions operating in text mode on Windows typically convert each CRLF sequence to a single LF when reading and each LF to CRLF when writing, so binary data that happens to contain these bytes is altered. Unix-like systems do not make this conversion.

Many programming languages provide special flags (or separate functions) for file reading and writing that prevent this conversion. For example, in the PHP programming language, developers have to use fopen($filename, "rb") instead of fopen($filename, "r") to treat the file in binary-safe mode.
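The difference between the two modes can be sketched in Python, which is used here only for illustration (the article's own example is PHP). The helper names roundtrip_binary and write_text_crlf are hypothetical. The newline="\r\n" argument to open() forces LF-to-CRLF translation on any platform, standing in for the default text-mode behaviour on Windows.

```python
import os
import tempfile


def roundtrip_binary(data: bytes) -> bytes:
    """Write bytes in binary mode ("wb") and read them back ("rb")."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(data)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


def write_text_crlf(text: str) -> bytes:
    """Write text with LF -> CRLF translation, then return the raw bytes on disk."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # newline="\r\n" makes the translation explicit and platform-independent.
        with open(path, "w", newline="\r\n", encoding="latin-1") as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


all_bytes = bytes(range(256))
unchanged = roundtrip_binary(all_bytes) == all_bytes  # binary mode preserves every value
translated = write_text_crlf("a\nb")                  # text mode inserts a CR before the LF
```

In binary mode all 256 byte values survive the round trip, whereas the translated write stores one more byte than the string contained.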

Special characters

A function is not binary-safe if it gives special meaning to particular characters, such as escape codes or markup characters, or if it expects null-terminated strings, because it will then stop at the first zero byte in the data. A possible exception would be a function whose explicit purpose is to search for a certain character in a binary string.
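As a rough illustration of the null-terminator problem, the following Python sketch imitates how a function that expects null-terminated strings would measure a buffer. The name c_style_length is hypothetical and only mimics that convention; it is not a real library call.

```python
def c_style_length(buf: bytes) -> int:
    """Length as seen by a function that stops at the first zero byte."""
    end = buf.find(b"\x00")
    return len(buf) if end == -1 else end


data = b"abc\x00def"

seen_by_c_style = c_style_length(data)  # 3: everything after the zero byte is ignored
actual_length = len(data)               # 7: the stored length covers the whole buffer
position_of_d = data.find(b"d")         # 4: a byte search still works past the zero byte
```

The byte search on the last line corresponds to the exception mentioned above: it treats the zero byte as ordinary data.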

Data format

Binary-safe functions are required when working with data of unknown format (otherwise the format would not be preserved), such as arbitrary files, encrypted data, and the like. Because any byte value may occur in such data, no value can serve as an end marker, so the function must be told the length of the data in order to operate on all of it.
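One common way to make the length known is to store it in front of the data. The sketch below assumes a 4-byte big-endian length prefix. The functions pack_blob and unpack_blob are hypothetical names chosen for this example, and the layout is just one of several possibilities.

```python
import struct


def pack_blob(data: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    return struct.pack(">I", len(data)) + data


def unpack_blob(buf: bytes) -> bytes:
    """Read the length prefix, then return exactly that many payload bytes."""
    (length,) = struct.unpack_from(">I", buf, 0)
    payload = buf[4:4 + length]
    if len(payload) != length:
        raise ValueError("buffer is shorter than its declared length")
    return payload


blob = b"\x00\xff\r\n\x00"               # zero bytes and CR/LF are ordinary data here
restored = unpack_blob(pack_blob(blob))  # identical to blob
```

Since the reader relies on the stored length rather than on a terminator, the payload may contain any byte value.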

Binary safety over the Internet

Issues with binary safety are often encountered when binary files are transferred over the Internet. This is especially true for large files, which can overflow the memory, buffer, or storage capacity of one or more servers. Sometimes transferred files are passed through functions that strip formatting codes or that incorrectly interpret certain byte sequences as formatting codes. For example, angle brackets can be falsely interpreted as delimiters of HTML tags, or desired tags may be lost in the transfer of an HTML file. Quotation marks in plain-text or ASCII format may not be read that way by a Web browser. An HTML editor will convert quotation marks (") into a string of characters (&quot;) to prevent this confusion. An extra space (" ") in a Web page appears as a string of characters (&nbsp;) when the HTML source code is viewed in a text editor. Conversely, such character strings are interpreted by Web browsers as quotation marks and spaces, even if the author of the file did not intend for them to be interpreted that way.[1]
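The conversion between characters and HTML entities can be illustrated with the html module from Python's standard library. This is a minimal sketch of the escaping described above, not of any particular HTML editor. The sample string is made up for the example.

```python
import html

raw = '<b>"x"</b>'

# Escape characters that a browser would otherwise treat as markup.
escaped = html.escape(raw)         # '&lt;b&gt;&quot;x&quot;&lt;/b&gt;'

# Reverse the conversion, as a browser does when rendering the entities.
restored = html.unescape(escaped)  # same as raw

# The entity for a non-breaking space decodes to the single character U+00A0.
nbsp = html.unescape("&nbsp;")
```

Data that merely happens to contain such sequences is changed by the same decoding step, which is why these functions are not binary-safe.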

References

  1. "Binary-Safe Function". Retrieved 29 May 2012.