Dropbox File Revisions with Node.js

Mike Lenz · Dec 13, 2012 · @Galler.io
Our previous examples have not actually served any files yet, but that is simple enough. We make it more interesting by respecting the browser cache. If the client already has an up-to-date copy of a file, there's no need to re-fetch it from Dropbox.
First, the simple case. Here's our Node GET handler for a file. We're using express.js for simplicity and node-dbox to wrap the Dropbox API. In this example we assume the file is a jpg, but it works for any file type.
app.get(/.*\.jpg$/, function(req, res) {
    var client = dropbox.client(auth_token);
    client.get(req.path, function(status, reply, metadata) {
        res.set('Content-Type', metadata.mime_type);
        res.set('ETag', metadata.rev);
        res.send(reply);
    });
});
The first line connects to the API using a previously authorized token. Then we fetch the file and set a couple of headers to return to the browser: Content-Type, which will be image/jpeg in this case, and ETag. The browser stores the ETag value in its cache as a unique fingerprint of the file. When requesting the file again, the browser passes that value in an If-None-Match header. The server determines whether the file has been modified by comparing fingerprints, returning a 304 "Not Modified" status if they match, or the current file contents otherwise.
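As a rough sketch of that comparison, independent of Express and Dropbox, the decision comes down to one string equality. The helper name conditionalResponse and the plain object it returns are assumptions made for illustration; neither is part of Express or node-dbox.

```javascript
// Hypothetical helper (not an Express or node-dbox API): choose between
// 304 and 200 from the request headers and the file's current rev.
function conditionalResponse(requestHeaders, currentRev) {
    // Node lower-cases incoming header names.
    var requestedRev = requestHeaders["if-none-match"];
    var headers = { "ETag": currentRev };
    if (requestedRev && requestedRev === currentRev) {
        // Fingerprints match: the browser's copy is current, so send no body.
        return { status: 304, headers: headers, sendBody: false };
    }
    // No fingerprint, or a stale one: send the file contents.
    return { status: 200, headers: headers, sendBody: true };
}
```

A first request arrives without If-None-Match and gets a 200 with the file; a repeat request carrying the same rev gets a 304.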
Express.js has built-in ETag handling but it generates those based on the file system, so we extend it here to respect Dropbox's revisioning scheme. In the API, metadata.rev is "A unique identifier for the current revision of a file." That suits our needs perfectly, so we hand that back to the client to cache.
On the server, recall we are caching the metadata per directory in MongoDB. With that we can implement the logic of comparing the browser's revision to ours and returning a 304 when needed in lieu of re-fetching the file from Dropbox. In our GET handler add a callback wrapper:
    ...
    ifFileModified(req.path, req, res, function() {
        client.get(req.path, function(status, reply, metadata) {
        ...
        });
    });
The ifFileModified method takes a callback, since it will query Mongo, allowing Node to handle other events and invoke our code when ready:
// check ETag header from browser. If blank or does not match the cached 'rev' metadata value,
// invoke callback to download new file. If rev matches, return a 304.
function ifFileModified(file, req, res, callback) {
    var requestedRev = req.headers['if-none-match'];
    if (!requestedRev) {
        return callback();
    }
    var dir = file.replace(/\/[^\/]*$/, "") || "/";   // strip filename, leaving / if at root
    db.collection("user", function(err, collection) {
        var query = {};
        query.uid = uid;   // the user's Dropbox id
        query["metadata." + dir + ".contents.rev"] = requestedRev;
        collection.findOne(query, {fields: {_id : 1}}, function(err, item) {
            if (item) {
                return res.send(304, "");
            } else {
                callback();
            }
        });
    });
}
If the browser didn't pass an ETag, it doesn't have a cached file, so we invoke the callback to fetch the file. Otherwise, get the rev value from our stored metadata, if any. As you recall, we are storing metadata on a per directory basis, with the contents field holding an array of files. A quick regex strips off the filename, giving us the directory.
A trick here, assuming rev values are unique across files, is simply to query Mongo for the existence of the desired rev in any file in the directory. The lookup key to Mongo is metadata.dir_name.contents.rev. Mongo recognizes that contents is an array and transforms that to the appropriate query. We only care about existence of a key/value pair with the given revision, so we minimize the result set by fetching only the _id field.
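To make the lookup key concrete, here is a minimal sketch of how the directory and the query are built, plus an in-memory stand-in for the array match. The function names, the sample document, and its rev values are all made up for illustration, and dirHasRev only imitates the matching described above; it is not how Mongo runs the query.

```javascript
// Strip the filename, leaving "/" for files at the root (same regex as above).
function parentDir(file) {
    return file.replace(/\/[^\/]*$/, "") || "/";
}

// Build the query object that would be passed to findOne.
function revQuery(uid, file, requestedRev) {
    var query = { uid: uid };
    query["metadata." + parentDir(file) + ".contents.rev"] = requestedRev;
    return query;
}

// In-memory stand-in for the array match (an assumption for illustration):
// true if any entry in the directory's contents has the given rev.
function dirHasRev(userDoc, dir, requestedRev) {
    var meta = userDoc.metadata && userDoc.metadata[dir];
    if (!meta || !Array.isArray(meta.contents)) {
        return false;
    }
    return meta.contents.some(function(entry) {
        return entry.rev === requestedRev;
    });
}

// Hypothetical user document; paths and rev values are invented.
var sampleUser = {
    uid: "123456",
    metadata: {
        "/Photos/Venice": {
            contents: [{ path: "/Photos/Venice/canal.jpg", rev: "rev-a" },
                       { path: "/Photos/Venice/bridge.jpg", rev: "rev-b" }]
        }
    }
};
```

With this sample, dirHasRev(sampleUser, "/Photos/Venice", "rev-b") finds a match and the handler would answer 304, while an unknown rev falls through to the callback.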
Finally, if a match is found, we return a 304 status and an empty body to the client. Otherwise, we invoke the callback to fetch the file. The way our client is written, it always queries the parent directory when fetching a file, which triggers the latest metadata to be fetched and stored, so the next time this file is requested we'll have an updated rev and be able to return a 304.
That's all for today! Check out the complete photo gallery site at galler.io and contact me on Twitter.

Dropbox Authentication with Node.js

Mike Lenz · Nov 28, 2012 · @Galler.io
Hi everyone, on the heels of our previous post about using the Dropbox API with Node and Mongo, I'm going to walk through an example of user authentication. Dropbox uses OAuth for a 3-step flow:
  1. Obtaining a temporary request token
  2. Directing the user to dropbox.com to authorize your app
  3. Acquiring a permanent access token
Here's the code for steps 1 and 2. We are using node-dbox to wrap the Dropbox API so we simply invoke that, then redirect the user's browser to the authorization page for our app. We have a variable callbackHost defined that's our app URL to which the user will return.
function requestToken(res) {
    dropbox.requesttoken(function(status, req_token) {
        res.writeHead(200, {
            "Set-Cookie" : ["oat=" + req_token.oauth_token,
                            "oats=" + req_token.oauth_token_secret]
        });
        res.write("<script>window.location='https://www.dropbox.com/1/oauth/authorize" +
                  "?oauth_token=" + req_token.oauth_token +
                  // URL-encode the callback since it is itself a full URL
                  "&oauth_callback=" + encodeURIComponent(callbackHost + "/authorized") + "';</script>");
        res.end();
    });
}
Note that we're storing the returned request token in a session cookie for use in the next step. You must also check the status code for errors, which I'm not doing here for brevity. The redirection happens by writing a piece of JavaScript to our HTTP response, but it could also be done with a redirect header or a meta tag.
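For the redirect-header variant, a sketch might look like the following. The helper names authorizeUrl and redirectResponse are hypothetical, and the returned object is just a convenient way to show the status and headers; the authorize URL and cookie names are the ones used above.

```javascript
// Hypothetical helper: build the Dropbox authorization URL from a request token.
function authorizeUrl(oauthToken, callbackHost) {
    return "https://www.dropbox.com/1/oauth/authorize" +
           "?oauth_token=" + encodeURIComponent(oauthToken) +
           "&oauth_callback=" + encodeURIComponent(callbackHost + "/authorized");
}

// Status and headers for a 302 redirect, as an alternative to the inline script.
function redirectResponse(reqToken, callbackHost) {
    return {
        status: 302,
        headers: {
            "Set-Cookie": ["oat=" + reqToken.oauth_token,
                           "oats=" + reqToken.oauth_token_secret],
            "Location": authorizeUrl(reqToken.oauth_token, callbackHost)
        }
    };
}
```

Inside the requesttoken callback this would be used as var r = redirectResponse(req_token, callbackHost); res.writeHead(r.status, r.headers); res.end();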
Using express.js simplifies registering endpoints in our Node application. The user upon successful authorization is directed to our app's /authorized page:
app.get(/\/authorized/, function(req, res) {
    // callback from dropbox authorization
    accessToken(req, res);
});
Now on to step 3, completing the authorization.
function accessToken(req, res) {
    var req_token = {oauth_token : req.cookies.oat, oauth_token_secret : req.cookies.oats};
    dropbox.accesstoken(req_token, function(status, access_token) {
        if (status == 401) {
            res.write("Sorry, Dropbox reported an error: " + JSON.stringify(access_token));
        }
        else {
            var expiry = new Date(Date.now() + 1000 * 60 * 60 * 24 * 30); // 30 days
            res.writeHead(302, {
                "Set-Cookie" : "uid=" + access_token.uid + "; Expires=" + expiry.toUTCString(),
                "Location" : "/"
            });
            db.collection("user", function(err, collection) {
                var entry = {};
                entry.uid = access_token.uid;
                entry.oauth_token = access_token.oauth_token;
                entry.oauth_token_secret = access_token.oauth_token_secret;
                collection.update({"uid": access_token.uid}, {$set: entry}, {upsert:true});
            });
        }
        res.end();
    });
}
First, we grab the request token from the user's cookie and pass that along to request an access token. If that returns a 401 Unauthorized status, various things could have gone wrong: the user didn't grant access, the request token has expired, and so on. No Dropbox API call will work at this point, so you need to ask the user to retry.
Upon a successful response, the API returns three things: an access token, a secret, and a user ID. We're storing the uid in a cookie with a 30-day expiration for use in future requests. This is a low-security way to manage the user's credentials, since the cookie can be spoofed and the user has granted our app access to their Dropbox account. If your app requires a higher level of security, you'll want to implement a login and password scheme or use a third-party authentication service.
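The cookie string itself is easy to get wrong, so here is a small sketch of building it in isolation. The helper name uidCookie and the injected now parameter are my own additions for illustration; passing the clock in simply makes the expiry easy to check.

```javascript
var THIRTY_DAYS_MS = 1000 * 60 * 60 * 24 * 30;

// Hypothetical helper: build the Set-Cookie value for the uid cookie.
// `now` is a millisecond timestamp, e.g. Date.now().
function uidCookie(uid, now) {
    var expiry = new Date(now + THIRTY_DAYS_MS);
    return "uid=" + uid + "; Expires=" + expiry.toUTCString();
}
```

In accessToken above, the header value would then be uidCookie(access_token.uid, Date.now()).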
For a final step, we're persisting the access token and secret in our MongoDB user collection, keyed by uid. This could just as easily go into a SQL database, or not be persisted at all, in which case the user re-authorizes whenever access is needed. The access token and secret should not be written to a cookie due to the ease of spoofing.

Using the Dropbox API with Node.js and MongoDB

Mike Lenz · Nov 27, 2012 · @Galler.io
Hi Dropbox fans! I'm going to share a quick example of integrating the Dropbox API with Node.js and MongoDB. Node.js is a fast, event-based server that is a convenient host for a JSON-based API. MongoDB is a NoSQL database which natively stores JSON objects. But you know all that, so let's get going.
Our web app is a photo gallery viewer: a simple application of querying Dropbox for a given user, fetching files and directory listings, and so on. The example I'll share has an already authenticated user fetching a directory of photos. As you know, the metadata API takes an optional hash which is used to determine if the directory contents have changed since last being fetched. We'll put our tools to work building a simple cache to make use of the hash feature.
To start, we use the excellent node-dbox SDK, and do some setup with your Dropbox API key and the URI of your Mongo instance.
var dbox = require("dbox");
var mongo = require("mongodb");
var http = require("http");

var dropbox = dbox.app({ "app_key" : ..., "app_secret" : ... });
mongo.connect(mongoUri, {auto_reconnect : true}, function(err, db) { ... } );
Next, I'll skip over the authentication flow and assume we have a Dropbox user ID and a directory we want to query. For example, this might come from a path requested by our client app in the form /user-id/dir-path, such as /123456/Photos/Venice. Here's a minimal Node server with a regex to parse our URL.
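http.createServer(function(request, response) {
    var parsedPath = /\/([0-9]+)(\/.*)$/.exec(request.url);
    if (!parsedPath) {   // URL is not of the form /user-id/dir-path
        response.writeHead(404);
        return response.end();
    }
    listFiles(parsedPath[1] /* uid */, parsedPath[2] /* dir */, response);
}).listen(8080, 'localhost');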
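To see what the regex captures, here is a standalone sketch of the parsing step. The helper name parsePath is hypothetical, and two details are my own assumptions rather than part of the server above: the pattern is anchored at the start of the URL, and the directory is percent-decoded so that a path like My%20Trip becomes "My Trip". Note that decodeURIComponent throws on malformed escapes, which a real handler would need to catch.

```javascript
// Hypothetical helper: split "/user-id/dir-path" into its two parts.
// Returns null when the URL does not have that form.
function parsePath(url) {
    var match = /^\/([0-9]+)(\/.*)$/.exec(url);
    if (!match) {
        return null;
    }
    return { uid: match[1], dir: decodeURIComponent(match[2]) };
}
```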
Before we query the Dropbox API, listFiles will check our cache to avoid unnecessary overhead. Where Mongo comes in handy is that it stores JSON natively, so the entire metadata structure that Dropbox returns can directly be inserted into the database. Here's what our function looks like:
function listFiles(uid, dir, response) {
    cachedMetadata(uid, dir, function(metadata) {
        var hash = metadata ? metadata.hash : null;
        var client = dropbox.client(auth_token);
        client.metadata(dir, {hash : hash}, function(status, reply) {
            if (status != 304) {
                metadata = reply;
                cacheMetadata(uid, dir, metadata);
            }
            response.write(JSON.stringify(metadata));   // write() needs a string, not an object
            response.end();
        });
    });
}
You'll see the common Node.js coding pattern of nested callbacks; any time we need to wait on an external resource, Node will work on other requests until our data is ready and our callback is invoked. One way to simplify that structure is to write helper functions which invoke callbacks of their own, which we do with the cachedMetadata method below.
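The hash round trip in listFiles can be tried out without Dropbox or Mongo. The sketch below is only an illustration under stated assumptions: cache is a plain in-memory object standing in for the Mongo-backed cache, fakeMetadata is a stub that imitates the callback shape of client.metadata with an invented hash value, and listFilesSketch is a hypothetical name.

```javascript
// In-memory stand-in for the Mongo-backed cache (assumption, illustration only).
var cache = {};

function cacheKey(uid, dir) {
    return uid + ":" + dir;
}

// Hypothetical stub imitating client.metadata(dir, {hash: ...}, callback).
// It answers 304 when the caller already holds the current hash.
function fakeMetadata(dir, options, callback) {
    var current = { hash: "h1", path: dir, contents: [] };
    if (options.hash === current.hash) {
        return callback(304, null);
    }
    callback(200, current);
}

// Same flow as listFiles: read cache, revalidate with the hash, update on change.
function listFilesSketch(uid, dir, fetchMetadata, callback) {
    var cached = cache[cacheKey(uid, dir)] || null;
    var hash = cached ? cached.hash : null;
    fetchMetadata(dir, { hash: hash }, function(status, reply) {
        if (status === 304) {
            return callback(cached, "cache");     // nothing changed
        }
        if (status !== 200) {
            return callback(null, "error");       // anything else is a failure
        }
        cache[cacheKey(uid, dir)] = reply;        // store the fresh listing
        callback(reply, "dropbox");
    });
}
```

Calling listFilesSketch twice for the same user and directory serves the first result from the stub and the second from the cache.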
First, listFiles gets the cached metadata, if any, for the user and directory. Among other properties, the metadata contains a hash of the directory contents that Dropbox computed for us. Pass that hash to the /metadata API: if we get a 304 response, nothing has changed; otherwise (assuming a 200 status; you'll need to check for other errors), we use the newly returned metadata and write it to our cache (not shown). Finally, any processing of the file listing for our app would happen here, and we write the JSON response and close the output socket. Expanding on the first step, MongoDB comes into play:
function cachedMetadata(uid, dir, callback) {
    db.collection("user", function(err, collection) {
        var query = {}, fields = {};
        query.uid = uid;
        query["metadata." + dir] = {$exists : true};
        fields["metadata." + dir] = 1;
        collection.findOne(query, {fields: fields}, function(err, result) {
            callback(result ? result.metadata[dir] : null);
        });
    });
}
Using the Mongo instance we opened earlier, we query a user collection whose primary key is the Dropbox user id. The user document has a metadata field which itself is a nested object, each key being a directory path and its value a metadata structure. For example, the path /Photos/Venice for user 123456 is stored in the key user["123456"].metadata["/Photos/Venice"]. We construct a "does this key exist" query, and the only field we need is the metadata itself; we run the query and pass either the returned metadata or null to our callback.
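Here is a minimal sketch of that document shape and of the query and field objects built for it. The sample document and its hash value are invented, and metadataQuery and lookup are hypothetical names; lookup is only an in-memory equivalent of what the findOne call returns.

```javascript
// Hypothetical user document; field values are made up for illustration.
var userDoc = {
    uid: "123456",
    metadata: {
        "/Photos/Venice": { hash: "abc123", path: "/Photos/Venice", contents: [] }
    }
};

// Build the query and field projection the same way cachedMetadata does.
function metadataQuery(uid, dir) {
    var query = { uid: uid }, fields = {};
    query["metadata." + dir] = { $exists: true };
    fields["metadata." + dir] = 1;
    return { query: query, fields: fields };
}

// In-memory equivalent of the lookup: the directory's metadata, or null.
function lookup(doc, uid, dir) {
    if (doc.uid !== uid || !doc.metadata || !(dir in doc.metadata)) {
        return null;
    }
    return doc.metadata[dir];
}
```

One caveat worth checking in your own setup: MongoDB treats dots in a query key as path separators, so a directory name containing a dot would need escaping before being used this way.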