Comparison of JSON Like Serializations – JSON vs UBJSON vs MessagePack vs CBOR

Recently I’ve been working on some extensions to ASEXOR, adding direct support for messaging via WebSocket. I use JSON for the small messages that travel between the client (browser or standalone) and the backend. The messages look like this:

I wondered whether choosing a different serialization format (similar to JSON, but binary) could make the application more efficient, considering both message size and encoding/decoding time. I ran small tests in Python (see tests here on gist) with a few established serializers that can be used as quick replacements for JSON. The results are below:

Format | Total message size (bytes) | Processing time (10000 x encoding/decoding all messages)
JSON (standard library) | 798 | 833 ms
JSON (ujson) | 798 | 193 ms
MessagePack (official lib) | 591 | 289 ms
MessagePack (umsgpack) | 585 | 3.15 s
CBOR | 585 | 163 ms
UBJSON | 668 | 2.28 s
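
A measurement of this kind can be sketched roughly as follows. This is not the code from the gist, only a minimal illustration: the sample messages are hypothetical, and `msgpack` and `cbor2` are assumed third-party packages that are used only if they happen to be installed.

```python
import json
import time

# Hypothetical sample messages; the real ASEXOR messages are not reproduced here.
MESSAGES = [
    {"call_id": 1, "method": "echo", "args": ["hello"], "kwargs": {}},
    {"call_id": 1, "status": "ok", "result": "hello"},
]


def benchmark(dumps, loads, messages, rounds=1000):
    """Return (total encoded size in bytes, elapsed seconds) for one serializer."""
    encoded = [dumps(m) for m in messages]
    # Text serializers return str, binary ones return bytes; count bytes in both cases.
    size = sum(len(e.encode("utf-8")) if isinstance(e, str) else len(e) for e in encoded)
    start = time.perf_counter()
    for _ in range(rounds):
        for m in messages:
            loads(dumps(m))
    return size, time.perf_counter() - start


# The standard library serializer is always available.
candidates = {"json": (json.dumps, json.loads)}

# Optional third-party serializers (assumed packages) are added only if installed.
try:
    import msgpack
    candidates["msgpack"] = (msgpack.packb, msgpack.unpackb)
except ImportError:
    pass
try:
    import cbor2
    candidates["cbor2"] = (cbor2.dumps, cbor2.loads)
except ImportError:
    pass

results = {name: benchmark(d, l, MESSAGES) for name, (d, l) in candidates.items()}
for name, (size, seconds) in results.items():
    print(f"{name}: {size} bytes, {seconds * 1000:.1f} ms")
```

Absolute timings depend heavily on the machine and on library versions, so only the relative ordering is worth comparing.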

Since clients can run in a web browser, we can also look at the performance of some serializers in JavaScript on this page. As JSON serialization is part of the browser’s Web API, it is unsurprisingly the fastest there.

In Python, the pure-Python libraries (UBJSON, and MessagePack with the umsgpack package) are the slowest (though their performance might improve under PyPy). The standard library JSON serializer can easily be replaced by the better-performing ujson package.
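
One common way to make that swap is an import with a fallback, sketched below. The message is a hypothetical example, and the sketch assumes only plain `dumps(obj)` and `loads(text)` calls are needed, since ujson does not accept every keyword argument of the standard library functions.

```python
try:
    import ujson as json  # faster C implementation, used if installed
except ImportError:
    import json  # standard library fallback

message = {"call_id": 1, "result": [1, 2, 3]}  # hypothetical message

encoded = json.dumps(message)   # str in both implementations
decoded = json.loads(encoded)   # back to a dict equal to the original
```

The rest of the code keeps calling `json.dumps` and `json.loads` and does not need to know which implementation was picked.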

Conclusions

JSON is truly ubiquitous today, thanks to its ease of use and readability. It’s probably a good choice for many usage scenarios, and luckily JSON serializers show good performance. If message size is of some concern, CBOR looks like a great, almost drop-in replacement for JSON, with similar performance in Python (slower performance in the browser is not a big issue, as a browser will typically process only a few messages) and 27% smaller messages.
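
The 27% figure follows directly from the total sizes in the results table. A short calculation, using only those numbers, gives the relative savings for each format:

```python
# Total message sizes in bytes, taken from the results table above.
sizes = {
    "JSON": 798,
    "MessagePack (official lib)": 591,
    "MessagePack (umsgpack)": 585,
    "CBOR": 585,
    "UBJSON": 668,
}

baseline = sizes["JSON"]
# Percentage by which each format is smaller than JSON.
savings = {name: 100 * (baseline - size) / baseline for name, size in sizes.items()}

for name, pct in savings.items():
    print(f"{name}: {pct:.1f}% smaller than JSON")
```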

If message size is a big concern, a carefully designed binary protocol (with Protocol Buffers, for instance) can provide much smaller messages (but at additional development cost).
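
To give a feeling for why a fixed schema saves space, here is a minimal sketch using Python’s `struct` module. It is not Protocol Buffers, and the message fields and their layout are hypothetical. The point is only that when both sides agree on the fields in advance, no key names need to be sent.

```python
import json
import struct

# Hypothetical fixed layout: call_id (unsigned 32-bit), status (unsigned 8-bit),
# value (64-bit float), in network byte order without padding.
LAYOUT = struct.Struct("!IBd")

message = {"call_id": 42, "status": 0, "value": 3.5}

# Binary form: only the values, in the agreed order.
packed = LAYOUT.pack(message["call_id"], message["status"], message["value"])
# JSON form: values plus key names and punctuation.
as_json = json.dumps(message).encode("utf-8")

call_id, status, value = LAYOUT.unpack(packed)
print(len(packed), "bytes binary vs", len(as_json), "bytes JSON")
```

The price is the one mentioned above: every change to the message structure has to be coordinated between the client and the backend.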

One thought on “Comparison of JSON Like Serializations – JSON vs UBJSON vs MessagePack vs CBOR”

  1. Thanks for the great blog post.

    Another data format worth checking out is Smile.
    For a similar example I get these numbers.

    JSON: 744 bytes
    Smile: 470 bytes
    CBOR: 600 bytes
    Msgpack: 586 bytes

    The reason why Smile is much smaller is the built-in back reference feature. Formats like JSON, CBOR and MessagePack have the problem that they have to send the key name with every field. In your example, JSON, CBOR and MessagePack all contain the string ‘call_id’ 8 times in the output. But Smile only writes this string once and then adds a reference in all the other locations. When you send a lot of similar objects, this can save a lot of bandwidth.

    Text from Wikipedia: https://en.wikipedia.org/wiki/Smile_(data_interchange_format)
    Compared to JSON, Smile is both more compact and more efficient to process (both to read and write). Part of this is due to more efficient binary encoding (similar to BSON, CBOR and UBJSON), but an additional feature is optional use of back references for property names and values. Back referencing allows replacing of property names and/or short (64 bytes or less) String values with 1- or 2-byte reference ids.
