= Comparison of data-serialization formats =

This article compares data-serialization formats: ways of converting complex objects into sequences of bits. It does not cover markup languages used exclusively as document file formats.
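As a rough illustration of what "converting objects to sequences of bits" means in practice, the following sketch serializes one object with two of the formats listed below, JSON (text) and Pickle (binary), using only the Python standard library. The `record` value is a hypothetical sample, not data from this article.

```python
import json
import pickle

# Hypothetical sample object, chosen only for illustration.
record = {"name": "A to Z", "values": [1, 2, 3], "active": True}

# Human-readable text format: json.dumps returns a str,
# which is then encoded to UTF-8 bytes.
json_bytes = json.dumps(record).encode("utf-8")

# Binary, Python-specific format: pickle.dumps returns bytes directly.
pickle_bytes = pickle.dumps(record, protocol=4)

# Both byte sequences can be turned back into an equal object.
json_roundtrip = json.loads(json_bytes.decode("utf-8"))
pickle_roundtrip = pickle.loads(pickle_bytes)
```

The JSON bytes can be read in any text editor, while the pickle bytes are only meaningful to a Python unpickler.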

==Overview==

| Name | Creator-maintainer | Based on | Standardized? | Specification | Binary? | Human-readable? | Supports references? | Schema-IDL? | Standard APIs | Supports zero-copy operations |
| Apache Arrow | Apache Software Foundation | | | Arrow Columnar Format | | | | | C, C++, C#, Go, Java, JavaScript, Julia, Matlab, Python, R, Ruby, Rust, Swift | |
| Apache Avro | Apache Software Foundation | | | Apache Avro™ Specification | | | | | C, C#, C++, Java, PHP, Python, Ruby | |
| Apache Parquet | Apache Software Foundation | | | Apache Parquet | | | | | Java, Python, C++ | |
| Apache Thrift | Facebook (creator), Apache (maintainer) | | | Original whitepaper | | | | | C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml, Delphi and other languages | |
| ASN.1 | ISO, IEC, ITU-T | | | ISO/IEC 8824 / ITU-T X.680 (syntax) and ISO/IEC 8825 / ITU-T X.690 (encoding rules) series. X.680, X.681, and X.683 define syntax and semantics. | | | | | | |
| Bencode | Bram Cohen (creator), BitTorrent, Inc. (maintainer) | | | Part of BitTorrent protocol specification | | | | | | |
| BSON | MongoDB | JSON | | BSON Specification | | | | | | |
| Cap'n Proto | Kenton Varda | | | Cap'n Proto Encoding Spec | | | | | | |
| CBOR | Carsten Bormann, P. Hoffman | MessagePack | | RFC 8949 | | | Yes, through tagging | | | |
| Comma-separated values (CSV) | RFC author: Yakov Shafranovich | | | RFC 4180 (among others) | | | | | | |
| Common Data Representation (CDR) | Object Management Group | | | General Inter-ORB Protocol | | | | | Ada, C, C++, Java, Cobol, Lisp, Python, Ruby, Smalltalk | |
| D-Bus Message Protocol | freedesktop.org | | | D-Bus Specification | | | | (Signature strings) | | |
| Efficient XML Interchange (EXI) | W3C | XML, Efficient XML | | Efficient XML Interchange (EXI) Format 1.0 | | | | | | |
| Extensible Data Notation (edn) | Rich Hickey / Clojure community | Clojure | | Official edn spec | | | | | Clojure, Ruby, Go, C++, JavaScript, Java, CLR, ObjC, Python | |
| FlatBuffers | Google | | | FlatBuffers GitHub | | | (internal to the buffer) | | C++, Java, C#, Go, Python, Rust, JavaScript, PHP, C, Dart, Lua, TypeScript | |
| Fast Infoset | ISO, IEC, ITU-T | XML | | ITU-T X.891 and ISO/IEC 24824-1:2007 | | | | | | |
| FHIR | Health Level 7 | REST basics | | Fast Healthcare Interoperability Resources | | | | | Hapi for FHIR JSON, XML, Turtle | |
| Ion | Amazon | JSON | | The Amazon Ion Specification | | | | | C, C#, Go, Java, JavaScript, Python, Rust | |
| Java serialization | Oracle Corporation | | | Java Object Serialization | | | | | | |
| JSON | Douglas Crockford | JavaScript syntax | | STD 90/RFC 8259 (ancillary: RFC 6901, RFC 6902), ECMA-404, ISO/IEC 21778:2017 | No, but see BSON, Smile, UBJSON | | | (JSON Schema Proposal, ASN.1 with JER, Kwalify, Rx, JSON-LD) | (Clarinet, JSONQuery / RQL, JSONPath), JSON-LD | |
| MessagePack | Sadayuki Furuhashi | JSON (loosely) | | MessagePack format specification | | | | | | |
| Netstrings | Dan Bernstein | | | netstrings.txt | | | | | | |
| OGDL | Rolf Veen | | | Specification | | | | | | |
| OPC-UA Binary | OPC Foundation | | | opcfoundation.org | | | | | | |
| OpenDDL | Eric Lengyel | C, PHP | | OpenDDL.org | | | | | | |
| PHP serialization format | PHP Group | | | | | | | | | |
| Pickle (Python) | Guido van Rossum | Python | | PEP 3154 – Pickle protocol version 4 | | | | | | |
| Property list | NeXT (creator), Apple (maintainer) | | | Public DTD for XML format | | | | | Cocoa, CoreFoundation, OpenStep, GNUstep | |
| Protocol Buffers (protobuf) | Google | | | Developer Guide: Encoding, proto2 specification, and proto3 specification | | | | | C++, Java, C#, Python, Go, Ruby, Objective-C, C, Dart, Perl, PHP, R, Rust, Scala, Swift, Julia, D, ActionScript, Delphi, Elixir, Elm, Erlang, GopherJS, Haskell, Haxe, JavaScript, Kotlin, Lua, Matlab, Mercury, OCaml, Prolog, Solidity, TypeScript, Vala, Visual Basic | |
| S-expressions | John McCarthy (original), Ron Rivest (Internet Draft) | Lisp, Netstrings | | "S-Expressions" Internet Draft | Yes, canonical representation | Yes, advanced transport representation | | | | |
| Smile | Tatu Saloranta | JSON | | Smile Format Specification | | | | (JSON Schema Proposal, other JSON schemas/IDLs) | (via JSON APIs implemented with Smile backend, on Jackson, Python) | |
| SOAP | W3C | XML | | SOAP/1.1, SOAP/1.2 | (MTOM, among others) | | | | | |
| Structured Data eXchange Formats (SDXF) | Max Wildgrube | | | RFC 3072 | | | | | | |
| UBJSON | The Buzz Media, LLC | JSON, BSON | | ubjson.org | | | | | | |
| eXternal Data Representation (XDR) | Sun Microsystems (creator), IETF (maintainer) | | | STD 67/RFC 4506 | | | | | | |
| XML | W3C | SGML | | 1.0 (Fifth Edition), 1.1 (Second Edition) | | | | | | |
| XML-RPC | Dave Winer | XML | | XML-RPC Specification | | | | | | |
| YAML | Clark Evans, Ingy döt Net, and Oren Ben-Kiki | C, Java, Perl, Python, Ruby, Email, HTML, MIME, URI, XML, SAX, SOAP, JSON | | Version 1.2 | | | | (Kwalify, Rx, built-in language type-defs) | | |

==Syntax comparison of human-readable formats==

| Format | Null | Boolean true | Boolean false | Integer | Floating-point | String | Array | Associative array/Object |
| ASN.1 (XML Encoding Rules) | | <foo>true</foo> | <foo>false</foo> | <foo>685230</foo> | <foo>6.8523015e+5</foo> | | <SeqOfUnrelatedDatatypes> | An object (the key is a field name): |
| CSV | null (or an empty element in the row) | 1 true | 0 false | 685230 -685230 | 6.8523015e+5 | | true,,-42.1e7,"A to Z" | 42,1 |
| edn | nil | true | false | 685230 -685230 | 6.8523015e+5 | "A to Z", "A \"up to\" Z" | [true nil -42.1e7 "A to Z"] | {:kw 1, "42" true, "A to Z" [1 2 3]} |
| Ion | | true | false | 685230 -685230 0xA74AE 0b111010010101110 | 6.8523015e5 | "A to Z" '''A to Z''' | | |
| Netstrings | 0:, 4:null, | 1:1, 4:true, | 1:0, 5:false, | 6:685230, | 9:6.8523e+5, | | 29:4:true,0:,7:-42.1e7,6:A to Z,, | |
| JSON | null | true | false | 685230 -685230 | 6.8523015e+5 | "A to Z" | [true, null, -42.1e7, "A to Z"] | {"42": true, "A to Z": [1, 2, 3]} |
| OGDL | null | true | false | 685230 | 6.8523015e+5 | "A to Z" 'A to Z' NoSpaces | true | 42 |
| OpenDDL | ref {null} | bool {true} | bool {false} | int32 {685230} int32 {0x74AE} int32 {0b111010010101110} | float {6.8523015e+5} | string {"A to Z"} | | dict |
| PHP serialization format | N; | b:1; | b:0; | i:685230; i:-685230; | d:685230.15; d:INF; d:-INF; d:NAN; | s:6:"A to Z"; | a:4:{i:0;b:1;i:1;N;i:2;d:-421000000;i:3;s:6:"A to Z";} | Associative array: a:2:{i:42;b:1;s:6:"A to Z";a:3:{i:0;i:1;i:1;i:2;i:2;i:3;}} Object: O:8:"stdClass":2:{s:4:"John";d:3.14;s:4:"Jane";d:2.718;} |
| Pickle (Python) | N. | I01\n. | I00\n. | I685230\n. | F685230.15\n. | S'A to Z'\n. | (lI01\na(laF-421000000.0\naS'A to Z'\na. | (dI42\nI01\nsS'A to Z'\n(lI1\naI2\naI3\nas. |
| Property list (plain text format) | | <*BY> | <*BN> | <*I685230> | <*R6.8523015e+5> | "A to Z" | ( <*BY>, <*R-42.1e7>, "A to Z" ) | { |
| Property list (XML format) | | <true /> | <false /> | <integer>685230</integer> | <real>6.8523015e+5</real> | | <array> | <dict> |
| Protocol Buffers | | true | false | 685230 -685230 | 20.0855369 | | field1: "value1" | |
| S-expressions | NIL nil | T #t true | NIL #f false | 685230 | 6.8523015e+5 | "A to Z" (quoted form); base64 between vertical bars, e.g. YWJj for abc | (T NIL -42.1e7 "A to Z") | ((42 T) ("A to Z" (1 2 3))) |
| TOML | | true | false | 685230 +685_230 -685230 0x_0A_74_AE 0b1010_0111_0100_1010_1110 | 6.8523015e+5 685.230_15e+03 685_230.15 inf -inf nan | "A to Z" 'A to Z' | ["y", -42.1e7, "A to Z"] | { John = 3.14, Jane = 2.718 } |
| YAML | ~ null Null NULL | y Y yes Yes YES on On ON true True TRUE | n N no No NO off Off OFF false False FALSE | 685230 +685_230 -685230 02472256 0x_0A_74_AE 0b1010_0111_0100_1010_1110 190:20:30 | 6.8523015e+5 685.230_15e+03 685_230.15 190:20:30.15 .inf -.inf .Inf .INF .NaN .nan .NAN | A to Z "A to Z" 'A to Z' | [y, ~, -42.1e7, "A to Z"] | {"John":3.14, "Jane":2.718} |
| XML and SOAP | | true | false | 685230 | 6.8523015e+5 | | | <map> |
| XML-RPC | | <value><boolean>1</boolean></value> | <value><boolean>0</boolean></value> | <value><int>685230</int></value> | <value><double>6.8523015e+5</double></value> | <value><string>A to Z</string></value> | <value><array> | <value><struct> |
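The Netstrings row above can be reproduced with a few lines of code. The following is a minimal sketch, not a reference implementation: the helper names `netstring`, `netstring_list`, and `parse_netstring` are hypothetical, and the sketch assumes the usual framing of decimal length, a colon, the payload, and a trailing comma.

```python
def netstring(payload: bytes) -> bytes:
    """Frame payload as <decimal length>:<payload>,"""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def netstring_list(items) -> bytes:
    """Encode a list as a netstring whose payload is concatenated netstrings."""
    return netstring(b"".join(netstring(item) for item in items))


def parse_netstring(data: bytes):
    """Return (payload, remaining bytes) for the first netstring in data."""
    length_text, sep, rest = data.partition(b":")
    if not sep or not length_text.isdigit():
        raise ValueError("missing or malformed length prefix")
    length = int(length_text)
    if rest[length:length + 1] != b",":
        raise ValueError("missing trailing comma")
    return rest[:length], rest[length + 1:]


# The array example from the table: true, empty (null), -42.1e7, "A to Z".
encoded = netstring_list([b"true", b"", b"-42.1e7", b"A to Z"])
payload, remainder = parse_netstring(encoded)
```

Here `encoded` equals `29:4:true,0:,7:-42.1e7,6:A to Z,,`, matching the table entry: the outer length 29 counts the bytes of the four inner netstrings.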

==Comparison of binary formats==

| Format | Null | Booleans | Integer | Floating-point | String | Array | Associative array/object |
| ASN.1 (BER, PER or OER encoding) | NULL type | BOOLEAN type | INTEGER type | REAL type | Multiple valid types | Data specifications SET OF (unordered) and SEQUENCE OF (guaranteed order) | User definable type |
| BSON | \x0A (1 byte) | True: \x08\x01 False: \x08\x00 (2 bytes) | int32: 32-bit little-endian 2's complement or int64: 64-bit little-endian 2's complement | Double: little-endian binary64 | UTF-8-encoded, preceded by int32-encoded string length in bytes | BSON embedded document with numeric keys | BSON embedded document |
| Concise Binary Object Representation (CBOR) | \xf6 (1 byte) | | | | | | |
| Efficient XML Interchange (EXI) | xsi:nil is not allowed in binary context. | 1–2 bit integer interpreted as boolean. | Boolean sign, plus arbitrary length 7-bit octets, parsed until most-significant bit is 0, in little-endian. The schema can set the zero-point to any arbitrary number. | | Length prefixed integer-encoded Unicode. Integers may represent enumerations or string table entries instead. | Length prefixed set of items. | |
| FlatBuffers | Encoded as absence of field in parent object | | Little-endian 2's complement signed and unsigned 8/16/32/64 bits | | UTF-8-encoded, preceded by 32-bit integer length of string in bytes | Vectors of any other type, preceded by 32-bit integer length of number of elements | Tables (schema defined types) or Vectors sorted by key (maps / dictionaries) |
| Ion | \x0f | | | | | \xbx Arbitrary length and overhead. Length in octets. | |
| MessagePack | \xc0 | | | Typecode (1 byte) + IEEE single/double | | | |
| Netstrings | | | | | Length-encoded as an ASCII string + ':' + data + ',' | | |
| OGDL Binary | | | | | | | |
| Property list (binary format) | | | | | | | |
| Protocol Buffers | | | | | UTF-8-encoded, preceded by varint-encoded integer length of string in bytes | Repeated value with the same tag or, for varint-encoded integers only, values packed contiguously and prefixed by tag and total byte length | |
| Smile | \x21 | | | IEEE single/double, BigDecimal | Length-prefixed "short" Strings (up to 64 bytes), marker-terminated "long" Strings and (optional) back-references | Arbitrary-length heterogeneous arrays with end-marker | Arbitrary-length key/value pairs with end-marker |
| Structured Data eXchange Formats (SDXF) | | | Big-endian signed 24-bit or 32-bit integer | Big-endian IEEE double | Either UTF-8 or ISO 8859-1 encoded | List of elements with identical ID and size, preceded by array header with int16 length | Chunks can contain other chunks to arbitrary depth. |
| Thrift | | | | | | | |
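Several cells in the table above describe the same few building blocks: fixed-width little-endian integers, IEEE binary64 floats, and length-prefixed UTF-8 strings. The sketch below illustrates those primitives with Python's standard `struct` module. The function names are hypothetical, and the output is only the raw payload described in the table, not a complete BSON document or Protocol Buffers message (which also carry type bytes or field tags).

```python
import struct


def encode_varint(value: int) -> bytes:
    """Base-128 varint: 7 data bits per byte, high bit set on all but the last."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)         # final byte
            return bytes(out)


def varint_prefixed_string(text: str) -> bytes:
    """UTF-8 bytes preceded by their varint-encoded byte length."""
    data = text.encode("utf-8")
    return encode_varint(len(data)) + data


def int32_le(value: int) -> bytes:
    """32-bit little-endian two's-complement integer."""
    return struct.pack("<i", value)


def binary64_le(value: float) -> bytes:
    """Little-endian IEEE 754 binary64 (double)."""
    return struct.pack("<d", value)


varint_300 = encode_varint(300)
prefixed = varint_prefixed_string("A to Z")
int_bytes = int32_le(685230)
double_bytes = binary64_le(1.0)
```

With these definitions, `encode_varint(300)` yields the two bytes `AC 02`, and `int32_le(685230)` yields `AE 74 0A 00`, the little-endian form of `0x000A74AE`.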

==See also==
- Comparison of document markup languages
