Locked out

created:2012-01-21
updated:2012-01-25

Let’s pretend you have this awesome Web 2.0 application with a shiny user interface, using the most recent technology available in the browser. Of course your application uses some kind of API to talk to your backend – the details don’t matter here. And everything is fine.

Until, one day, you discover someone wrote a free desktop client that uses your API, which was never meant to be “public”, to do exactly the task your Web 2.0 application was designed for. It is fast, lightweight and easy to use, but wait: It doesn’t show the user any advertising.

So, what are you going to do about it? Your first idea: Let’s encrypt the API requests. Encryption is widely used and there are algorithms considered “secure” by experts, so what could possibly go wrong?

But a few days after implementing the encryption and deploying it to your servers, this annoying little third-party client is updated: Now it encrypts the API requests too! Well, you say to yourself, we’ll just update the encryption keys and everything will be fine again. You do that, and not even a day after you deploy the change the third-party client is updated and works again.

What happened? Your Web 2.0 application has to encrypt its requests in the browser, so you had to put the key into the JavaScript, protected only by a layer of eval() obfuscation. The de-obfuscated cipher is shown below. You can sense the problem already: By shipping the key to every visitor you gave it away, rendering the encryption useless. The obfuscation didn’t protect you either.

/* Classic XOR cipher with plaintext x and key k, from www.pandora.com */
function _m(x, k) {
    var r = "";
    for(var i=0; i < x.length; i++) {
        r += String.fromCharCode(x.charCodeAt(i) ^ k[i % k.length]);
    }
    return r;
}
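
To see why the key is all an attacker needs, here is a small sketch of the same cipher run in both directions. The key and the request string are made up for illustration, and the key is assumed to be an array of byte values, which is what the index expression in the excerpt suggests.

```javascript
// Same XOR cipher as in the excerpt; k is assumed to be an array of byte values.
function xorCipher(x, k) {
    var r = "";
    for (var i = 0; i < x.length; i++) {
        r += String.fromCharCode(x.charCodeAt(i) ^ k[i % k.length]);
    }
    return r;
}

// Hypothetical key and request, not taken from any real service.
var key = [0x13, 0x37, 0x42];
var request = "method=getPlaylist";

// XOR is its own inverse: applying the cipher twice restores the input.
var encrypted = xorCipher(request, key);
var decrypted = xorCipher(encrypted, key);

console.log(decrypted === request);
```

Encrypting and decrypting are the same operation, so whoever reads the key out of your JavaScript can produce valid requests just as well as your own application.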

As time passes, more and more people start using this third-party client, new implementations with different user interfaces in different programming languages appear, and they’re all using your API. Encryption with rotating keys did not solve your problem at all. Is there anything else you can do? After all, you have to pay your bills from your advertisement revenue!

Of course there is. These third-party clients try to get rid of the browser to do what your awesome Web 2.0 application does. And this is, from your point of view, the weak point. Require some kind of authorization that only a real browser can pass, require them to parse HTML, parse CSS and execute JavaScript. In the end you come up with something that, once released from the eval() cage, looks as innocent as the following piece of code:

/* Authorization key decoder, from www.pandora.com */
function calcPageHeight(tvs) {
    var ts = [];
    ts.length = 128;
    var positions = [32, 50, 42, 36, 15, 34, 35, 7, 62, 47, 26, 0, 33, 40,
            30, 41, 5, 28, 19, 59, 22, 52, 2, 10, 48, 39, 49, 56, 61, 3,
            45, 44, 20, 23, 46, 63, 17, 4, 1, 27, 6, 53, 29, 21, 54, 57,
            25, 16, 58, 38, 12, 11, 37, 60, 55, 43, 9, 31, 8, 51, 18, 13,
            14, 24];
    var ps = document.getElementsByTagName("p");
    var pNodes = [];
    for(var i = 0; i < ps.length; i++) {
        if (ps[i].parentNode.className == '_spacer') {
            pNodes.push(ps[i]);
        }
    }

    for(i = 0; i < positions.length; i++) {
        // Explicit radix; parseInt stops at the trailing "px"
        var code = parseInt(pNodes[i].getAttribute("height"), 10);
        if (code < 128) {
            ts[positions[i]] = code;
        }
    }

    ts.length = 32;

    var s = "";
    for(i = 0; i < ts.length; i++) {
        s += String.fromCharCode(ts[i]);
    }

    tvs[unescape('%5fd%61t%61')][unescape('%63')] = s;
}

<!-- excerpt from www.pandora.com -->
<div class='_spacer'><p height='128px'/></div>
<div class='_spacer'><p height='128px'/></div>
<div class='_spacer'><p height='128px'/></div>
<div class='_spacer'><p height='49px'/></div>
<div class='_spacer'><p height='128px'/></div>
<div class='_spacer'><p height='50px'/></div>
<div class='_spacer'><p height='98px'/></div>
<div class='_spacer'><p height='128px'/></div>
<div class='_spacer'><p height='57px'/></div>
<div class='_spacer'><p height='97px'/></div>
<div class='_spacer'><p height='128px'/></div>
<!-- … -->

Of course the function name does not accurately describe what the code really does:

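  1. Collect all p tags whose parent element has the class _spacer
  2. Read the height attribute of every such p element; values of 128 or more are ignored
  3. Use the permutation table to move each remaining value to its position in the token
  4. Keep the first 32 values, convert them to ASCII characters and concatenate them into a 32 byte string, the authorization token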

This token was generated by the server while serving the page. And it defeats any attack scenario based on static analysis (unless you somehow manage to lose the algorithm that generates these tokens on the server).
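
The server side is not shown anywhere, so the following is only a guess at what the counterpart of calcPageHeight could look like, not how pandora.com actually does it. The names encodeToken and renderSpacers are invented, and filling every unused slot with 128 is a simplification; real pages may use other filler values.

```javascript
// Hypothetical inverse of calcPageHeight: turn a token into a list of heights.
// token: string of ASCII characters (codes below 128)
// positions: the permutation table that is also embedded in the client script
function encodeToken(token, positions) {
    var heights = [];
    for (var i = 0; i < positions.length; i++) {
        var slot = positions[i];
        // Slots beyond the token length carry no data; the client skips 128.
        heights.push(slot < token.length ? token.charCodeAt(slot) : 128);
    }
    return heights;
}

// Render the heights in the same shape as the HTML excerpt.
function renderSpacers(heights) {
    return heights.map(function (h) {
        return "<div class='_spacer'><p height='" + h + "px'/></div>";
    }).join("\n");
}

// Toy example with a four-entry table and a two-character token.
var toyPositions = [2, 0, 3, 1];
console.log(renderSpacers(encodeToken("ab", toyPositions)));
```

Changing the permutation table only requires shipping a new table with the page; the token generator itself never leaves the server.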

The third-party client could download the HTML page, extract all p tags, parse their heights and process them with its own calcPageHeight implementation. But you would simply permute the positions array now and then, move the height of some tags to CSS or change them with embedded JavaScript, which is, by the way, a Turing-complete language. Implementing a JavaScript engine (including all the HTML/CSS glue) is the only choice the third-party client has now. Game over.
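
For illustration, the naive reimplementation a third-party client might start with could look roughly like this. It is a sketch under the assumption that the markup keeps exactly the shape of the excerpt; extractHeights and decodeToken are invented names, and the toy table stands in for the real positions array.

```javascript
// Pull the height of every <p> that sits directly inside a _spacer div.
// The regular expression only works while the markup keeps exactly this shape.
function extractHeights(html) {
    var re = /<div class='_spacer'><p height='(\d+)px'\/><\/div>/g;
    var heights = [];
    var m;
    while ((m = re.exec(html)) !== null) {
        heights.push(parseInt(m[1], 10));
    }
    return heights;
}

// DOM-free variant of calcPageHeight: same filter, permutation and truncation.
function decodeToken(heights, positions, tokenLength) {
    var ts = [];
    for (var i = 0; i < positions.length && i < heights.length; i++) {
        if (heights[i] < 128) {
            ts[positions[i]] = heights[i];
        }
    }
    var s = "";
    for (i = 0; i < tokenLength; i++) {
        // Unset slots become "\0", like String.fromCharCode(undefined) would.
        s += String.fromCharCode(ts[i] || 0);
    }
    return s;
}

var page = "<div class='_spacer'><p height='128px'/></div>\n" +
           "<div class='_spacer'><p height='97px'/></div>\n" +
           "<div class='_spacer'><p height='128px'/></div>\n" +
           "<div class='_spacer'><p height='98px'/></div>";
console.log(decodeToken(extractHeights(page), [2, 0, 3, 1], 2)); // prints "ab"
```

Every assumption baked into this code (the attribute name, the quoting, the fixed table) is something you can change on the server with one deployment, while the client author has to notice, reverse-engineer and ship an update each time.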

Does it really work in practice, you may ask now? Yes, the personalized internet radio pandora.com does this to limit the number of playlists third-party clients like pianobar can fetch. Up to now nobody has found a way to circumvent this restriction that works permanently and can be implemented in the client itself.