Backbonification: migrating a large JavaScript project from DOM spaghetti to Backbone.js
We've all done it. Our code base has one huge monolithic file, packed full of JavaScript spaghetti. It's unwieldy, hard-to-debug, and has little to no separation of concerns. It is a nightmare to bring new engineers up to speed.
This blog post is about decomposing NewsBlur's single-file 8,500 line JavaScript application into its component parts: 8 models, 12 views, 3 routers, 3 collections. This post explores patterns, techniques, and common pitfalls in migrating from vanilla JavaScript to Backbone.js. It covers moving routers, models, and views, and the process used to migrate a living app.
NewsBlur is a free RSS feed reader and is open-source. The benefit of being open-source is that you can see all of the changes I made in this migration by looking through the commit history.
As a bit of background, I worked on Backbone.js in its infancy, when Jeremy Ashkenas and I worked on DocumentCloud's many open-source projects.
The Presentation
This post was written concurrently with a presentation. Depending on your style, you can either read on or flip through this deck. Both have the same content, but this post expands on every concept in the presentation.
There's no need to go through the presentation. Just read on for the whole kaboodle.
Pre-reqs: Libraries
There are only two libraries you need to be intimately familiar with in order to make the most of your Backbone transition: Underscore.js and Backbone.js. That means not only being comfortable with reading the source code of these two libraries, but also knowing all of the methods exposed so you can reach into your grab-bag of tricks and pull out the appropriate function.
Underscore.js
Underscore.js is another DocumentCloud library that makes your code more readable and compact by providing useful functions that massage, filter, and jumble data.
One popular use of Underscore is creating short pipelines that take a large collection of models and filter it based on conditions. That much is easy. But there are other uses that are beneficial to know.
You should be comfortable with all enumerable methods. Think about all of your model collections as reduce-able, filterable, and selectable.
Here are two examples of Underscore.js at work:
// Get ids of all active feeds
_.pluck(this.feeds.select(function(feed) {
return feed.get('active');
}), 'id');
// Returns: [42, 36, 72, ...]
// Count fetched/unfetched feeds
var counts = this.feeds.reduce(function(counts, feed) {
if (feed.get('active')) {
if (!feed.get('not_yet_fetched') || feed.get('has_exception')) {
counts['fetched_feeds'] += 1;
} else {
counts['unfetched_feeds'] += 1;
}
}
return counts;
}, {
'unfetched_feeds': 0,
'fetched_feeds': 0
});
// Returns: {'unfetched_feeds': 3, 'fetched_feeds': 42}
Backbone.js
The star of the show is Backbone.js. The entire backbone.js file is fewer than 1,500 lines long, and that's with 228/1478 lines of whitespace (15%) and 389/1478 lines of comments (26%).

This is a basic example of the layout of the four main classes: models, views, collections, and routers. A fifth meta-class called Events is mixed in to each of these classes.
How to start
The first step is no easy task. Take your existing design and visually decompose it into its component views. Each view will be represented by either a single model or a combination of models. In fact, you can even have a view not be backed by a model at all.
Take the NewsBlur UI for example. It's a standard three-pane view, with feeds, stories, and story detail:

Notice that there are multiple views inside other views. Some views are meant to be simple wrappers around other, more functional views.
On the left there is a list of feeds inside a list of folders. These folders and feeds can be embedded inside each other, creating a recursive structure that can be easily assembled by Backbone.js. Each feed view also contains an unread count, a feed title, a favicon, and a feed menu. All of these views are generated by their respective parents. Your job at this stage is to simply figure out what those views are so you can create the appropriate models, views, and controllers (routers).
Here's another example, coming from the DocumentCloud workspace, the original Backbone.js site:

A bit simpler than NewsBlur, this is a classic dual-pane view, with an organizer on the left and a detail pane on the right. Notice that there is a view collection, the document list, that holds numerous document views. It's important to make each view as granular as possible and then bring them together in view collections, which are simply views of model collections.
Moving routers
You have routers, even if you don't realize it yet. Not only that, but you probably have multiple routers. Routers are the smallest part of a Backbone.js project, but are vital because they serve as the entry point for execution.

Anytime a URL is involved, your router should be handling it. You can also have multiple routers in a project. Before version 0.5.0 routers used to be called controllers, if that shines a light on their purpose.
If an out-of-band AJAX call is necessary, and it doesn't correspond to a specific model, then the router is a great place to put it.
There's a lot written on conventions for writing your router. I suggest going directly to the source: the original router from the DocumentCloud workspace. This is the first router ever written and should give you as canonical an idea as possible for what your router should and can include.
// Main controller for the journalist workspace. Orchestrates subviews.
dc.controllers.Workspace = Backbone.Router.extend({
routes : {...},
// Initializes the workspace, binding it to body.
initialize : function() {
this.createSubViews();
this.renderSubViews();
Backbone.history.start({
pushState : true,
root : dc.account ? '/' : '/public/'
})
},
// Create all of the requisite subviews.
createSubViews : function() {
dc.app.paginator = new dc.ui.Paginator();
dc.app.navigation = new dc.ui.Navigation();
dc.app.toolbar = new dc.ui.Toolbar();
dc.app.organizer = new dc.ui.Organizer();
dc.app.searchBox = VS.init(this.searchOptions());
this.sidebar = new dc.ui.Sidebar();
this.documentList = new dc.ui.DocumentList();
if (!dc.account) return;
dc.app.uploader = new dc.ui.UploadDialog();
dc.app.accounts = new dc.ui.AccountDialog();
},
// Render all of the existing subviews and place them in the DOM.
renderSubViews : function() {
var content = $('#content');
content.append(this.sidebar.render().el);
content.append(this.panel.render().el);
dc.app.navigation.render();
dc.app.hotkeys.initialize();
this.panel.add('search_box', dc.app.searchBox.render().el);
this.panel.add('pagination', dc.app.paginator.el);
this.panel.add('document_list', this.documentList.render().el);
if (!dc.account) return;
this.sidebar.add('account_badge', this.accountBadge.render().el);
}
});
The router is used for laying out all of the workspace-level subviews. Each of these subviews is then responsible for laying out specific instances of documents, collections, toolbar items, search facets, etc.
Moving models
Before you can even start playing with Backbone models, you'll need to get your data in a format that Backbone can vivify. Your server should be sending arrays of dictionaries, each array containing models of a single type. This may force you to version your API, since the format of its responses has to change.
Versioning: Objects as dicts vs. arrays
Perhaps you were giving data in a format that made it easy for you to key into the dictionary to retrieve a model, like so:
{
64: {
'id': 64,
'title': "The NewsBlur Blog"
},
128: { ... }
}
However, in order to vivify these models, Backbone expects an array of dictionaries. We can modify the backend to provide models in this format by adding a v=2 query parameter:
[
{
'id': 64,
'title': "The NewsBlur Blog"
},
{ ... }
]
Backbone reads through and finds all id attributes and hashes them into the _byId object on the collection. You can do this client-side instead of versioning your API, but that would require you to write a custom parse method.
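If versioning the API is not an option, the client-side conversion could look something like the following. This is a minimal sketch in plain JavaScript, assuming the id-keyed response shape shown earlier; the helper name dictToArray and the second feed's title are made up for illustration.

```javascript
// Hypothetical helper: convert an id-keyed dictionary of models into the
// array of dictionaries that a Backbone collection expects.
function dictToArray(response) {
    // Responses that are already arrays pass through untouched.
    if (Array.isArray(response)) return response;
    return Object.keys(response).map(function(key) {
        return response[key];
    });
}

// Example response in the old, keyed format (sample data only).
var keyed = {
    64: { id: 64, title: "The NewsBlur Blog" },
    128: { id: 128, title: "Another Feed" }
};
var models = dictToArray(keyed);
```

Inside a collection, the same logic would sit in the parse method, which receives the raw server response and returns the list of model attributes.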
You can then override your collection's fetch method to add in the version information. This transparently handles appending a version parameter to your requests.
fetch: function(options) {
var data = {
'v': 2
};
options = _.extend({
data: data,
silent: true
}, options);
return Backbone.Collection.prototype.fetch.call(this, options);
},
Notice, by the way, that the last line is how you call super() in JavaScript. This is a clear demonstration of overriding methods in Backbone and then calling super at the appropriate time.
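To see the super call in isolation, here is a rough sketch with plain constructor functions instead of Backbone. BaseCollection and FeedCollection are hypothetical stand-ins, and the base fetch simply echoes its options rather than making a request.

```javascript
// Hypothetical stand-ins for Backbone.Collection and a subclass, written
// with plain constructor functions to show the "super" call on its own.
function BaseCollection() {}
BaseCollection.prototype.fetch = function(options) {
    // A real fetch would issue a request; this one just echoes its options.
    return options;
};

function FeedCollection() {}
FeedCollection.prototype = Object.create(BaseCollection.prototype);
FeedCollection.prototype.constructor = FeedCollection;

FeedCollection.prototype.fetch = function(options) {
    // Defaults come first so that caller-supplied options win.
    options = Object.assign({ data: { v: 2 }, silent: true }, options);
    // Equivalent of super.fetch(options): call the parent's method with
    // the current instance as `this`.
    return BaseCollection.prototype.fetch.call(this, options);
};

var sent = new FeedCollection().fetch({ silent: false });
```

Because the defaults are the first argument to the merge, a caller can still override silent or data while the version parameter is added for everyone else.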
Attributes
If your models were simple JavaScript object literals (dictionaries), then you were using one of these styles to work with attributes:
// JavaScript:
model.title
model['title']
var attr = 'title';
model[attr]
However, Backbone uses a get method to retrieve an attribute from a model:
// Backbone.js:
model.get('title')
The trick here is that during a large-scale refactor, you want to change as few things as possible. In this case, you can pass a Backbone model's model.attributes to old JavaScript methods. Then, when done, clean up by looking for .attributes.
var iconSrc = $.favicon(socialFeed.attributes);
this.$('.NB-header-icon').attr('src', iconSrc);
This way you do not have to immediately rewrite all of your model attribute getters until you have tested the modified parts of your code.
Populating a collection that has side-effects
Looking at feeds and folders above, in order to populate the list of folders you need the feed models. But when populating the feed models, their view is bound to the reset event, which will try to render the feeds, but there are no folders yet!
Pass {silent: true} to the initial reset for feeds, then manually trigger the reset event after the dependencies are met.
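The ordering can be sketched without Backbone at all. TinyCollection below is a hypothetical, stripped-down stand-in with just enough of bind, trigger, and reset to show the idea; the sample feeds and folder are invented.

```javascript
// Hypothetical, minimal stand-in for a collection with reset events.
function TinyCollection() {
    this.models = [];
    this.handlers = {};
}
TinyCollection.prototype.bind = function(name, handler) {
    (this.handlers[name] = this.handlers[name] || []).push(handler);
};
TinyCollection.prototype.trigger = function(name) {
    var self = this;
    (this.handlers[name] || []).forEach(function(handler) {
        handler(self);
    });
};
TinyCollection.prototype.reset = function(models, options) {
    this.models = models;
    // A silent reset swaps the models without notifying listeners.
    if (!(options && options.silent)) this.trigger('reset');
};

var feeds = new TinyCollection();
var folders = new TinyCollection();
var renderLog = [];

// The feed list renders on reset, and needs folders to exist first.
feeds.bind('reset', function() {
    renderLog.push('feeds rendered with ' + folders.models.length + ' folder(s)');
});

feeds.reset([{ id: 64 }, { id: 128 }], { silent: true }); // no render yet
folders.reset([{ title: 'Blogs' }], { silent: true });    // dependency met
feeds.trigger('reset');                                   // render exactly once
```

The listener runs only once, after both collections are populated, instead of firing early against an empty folder list.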
Listening for events on a collection's models
Any event that is triggered on a model in a collection will also be triggered on the collection directly, for convenience. This allows you to listen for changes to specific attributes in any model in a collection.
// Bind to all models
Documents.bind('reset', this.reset);
Documents.bind('add', this._addDocument);
Documents.bind('remove', this._removeDocument);
// Bind to specific attributes on the collection's models
Documents.bind('change:pages', this._renderPageCount);
The Documents collection contains all of the documents on the page, but you could also create specialized collections with a subset of those models that respond to change events without having to filter the change event on the bigger collection to only apply to those specific models.
Seeing changed attributes
In this case, a model is updated from elsewhere, so it needs to be refreshed on the page. Use model.hasChanged() and model.previousAttributes() to see what's changed.
Previously, you may have iterated over all of the new values and compared them to the existing values. Backbone has this built in.
// The collection is selective about changing attributes
this.bind('change', this._onModelChanged);
...
_onModelChanged : function(doc) {
if (doc.hasChanged('access') && doc.isPending())
this._checkForPending();
},
...
// The view is also selective about changing attributes.
// Re-renders the tile if a server-backed attribute changes.
_onDocumentChange : function() {
if (this.model.hasChanged('selected')) return;
this.render();
},
You can short-circuit the change event if you are looking for a specific attribute. But sometimes you want the change event fired only once yet you're looking to do different things based on which attribute has been changed. So instead of relying on each attribute's individual change event, you can wait for the bundled change event.
Naturally, you may be wondering what gets fired first? Each individual change event or the bundled change event? Well, to figure that out we just turn to the Backbone.js source code:
// Call this method to manually fire a `"change"` event for this model and
// a `"change:attribute"` event for each changed attribute.
// Calling this will cause all objects observing the model to update.
change: function(options) {
...
for (var i=0, l=triggers.length; i < l; i++) {
this.trigger('change:' + triggers[i], this,
current[triggers[i]], options);
}
...
this.trigger('change', this, options);
...
},
Notice that the individual change:attribute events are guaranteed to fire before the bundled change event. This goes to show that the source code for Backbone.js is not as thorny and cumbersome to read as other libraries may have led you to expect. It is easy to follow and is written as close to plain English as possible.
Intermediary models
Sometimes an active item needs a bit more meta-data than the non-active counterpart. Take the feed list, for instance. When a feed becomes selected, it needs to be referred to by many other components, each of which needs to know about the active feed. Storing a reference to this model, say ActiveFeed, then allows you to add view-specific meta-data that would be helpful in other views.
NEWSBLUR.Models.Feed.prototype.setSelected = function() {
NEWSBLUR.app.feeds.deselect();
NEWSBLUR.app.activeFeed = this;
}
NEWSBLUR.Views.FeedList.prototype.findSelected = function() {
return _.pluck(NEWSBLUR.app.activeFeed.views, '$el');
}
If you are operating in a loop, then you'll definitely want to cache a reference to a model like this.
Moving views
This part of the process is a bit more involved than moving models or simply constructing routers. This is where most of the cleanup is involved.
Writing Templates
The first thing that needs to be changed is how DOM fragments are constructed. In NewsBlur's case, we're moving from JavaScript element creation to interpolated templates.
Old style: Manual DOM element creation
This style is simply a wrapper around a variety of document.createElement calls, where $.make will take attributes and add them correctly to the newly created element, as well as appending each of the children in an easy-to-read function.
openSocialCommentReplyForm: function($comment) {
var profile = this.model.userProfile;
var $form = $.make('div', { className: '...' }, [
$.make('img', { className: '...', src: profile.get('url') }),
$.make('div', { className: '...' }, profile.get('username')),
$.make('input', { className: '...', type: 'text' }),
$.make('div', { className: '...' }, 'Post')
]);
$('.story-comment-reply-form', $comment).remove();
$comment.append($form);
$('.comments', $form).bind('keydown', 'enter, return',
_.bind(this.saveSocialCommentReply, this, $comment));
$('.comments', $form).bind('keydown', 'esc', function() {
$('.NB-story-comment-reply-form', $comment).remove();
});
$('input', $form).focus();
this.fetchStoryLocationsInFeedView();
},
While it's easy to read and write, it is not fast. This method is an order of magnitude slower than the better methods described below, each of which uses string interpolation to inject data into the template.
Template option #1: inline strings
This is the option that I eventually chose, if only because it was the simplest, could be easily cached by the browser, and was inline with the view code.
render: function() {
var $feed = $(_.template('\
<<%= listType %> class="feed">\
<img class="feed-favicon" src="<%= $.favicon(feed) %>">\
<span class="feed-title">\
<%= feed.get("feed_title") %>\
</span>\
<div class="feed-exception-icon"></div>\
<div class="feed-manage-icon"></div>\
</<%= listType %>>\
', {
feed : this.model,
listType : this.options.type == 'feed' ? 'li' : 'div',
}));
this.$el.replaceWith($feed);
this.setElement($feed);
this.renderCounts();
return this;
},
Notice that this.setElement is used on the new $feed. This is necessary because listType changes the top-level element depending on where the feed is displayed. In some cases it's part of a list, and in other cases it's a stand-alone feed title. To keep the markup semantically correct, different top-level tags are needed, so you can't just use this.$el.html(). That would leave the view's own top-most element (a div by default, which can be customized by setting tagName) wrapping your template.
Also, note that a better way to create these multi-line strings is to use the heredoc (multiline) strings in CoffeeScript. However, the template string still goes inline, which means you do not have to do any asset pre-compiling, which can be more or less difficult depending on your framework.
Template option #2: inline templates
These templates are just <script> tags that go in your HTML templates. The downside to this method is that your JavaScript templates are not cached by the browser and have to be downloaded as part of every page load.
<script type="text/html" id="feed-template">
<<%= listType %> class="feed">
<img class="feed-favicon" src="<%= $.favicon(feed) %>" />
<span class="feed-title">
<%= feed.get("feed_title") %>
</span>
<div class="feed-exception-icon"></div>
<div class="feed-manage-icon"></div>
</<%= listType %>>
</script>
...
render: function() {
    this.template = this.template || $('#feed-template').html();
    // Wrap the rendered string in $() so that replaceWith and setElement
    // both operate on the same element.
    var $feed = $(_.template(this.template, {
        feed : this.model,
        listType : this.options.type == 'feed' ? 'li' : 'div'
    }));
    this.$el.replaceWith($feed);
    this.setElement($feed);
    return this;
},
Not recommended, but easy enough to do without a pre-compiler or asset pipeline.
Template option #3: JSTs
This is the recommended method, but it requires you to have a pre-compiler that works in both development and production, referring to your templates individually in development and as a concatenated file in production. Not a big deal to implement, but many asset packagers neither handle JavaScript templates nor allow you to wrap them in the appropriate interpolator, such as Underscore.js' _.template.
// The compiled template holds only the markup itself; the <script> wrapper
// from option #2 is not needed here.
window.JSTs['feed'] = _.template(
    '<<%= listType %> class="feed"><img class="feed-favicon" '+
    'src="<%= $.favicon(feed) %>" /><span class="feed-title">'+
    '<%= feed.get("feed_title") %></span>'+
    '<div class="feed-exception-icon"></div>'+
    '<div class="feed-manage-icon"></div>'+
    '</<%= listType %>>');
...
render: function() {
    // Wrap the rendered string in $() so that replaceWith and setElement
    // both operate on the same element.
    var $feed = $(JSTs['feed']({
        feed : this.model,
        listType : this.options.type == 'feed' ? 'li' : 'div'
    }));
    this.$el.replaceWith($feed);
    this.setElement($feed);
    return this;
},
If you are on Ruby on Rails, you can use Jammit, another DocumentCloud project, to automatically handle JavaScript templates. Alternatively, node-jst and sprockets are worth a look.
Event delegation
The most common view change you'll make is moving from event binding to event delegation.
// From:
$(".feed", $feedList).bind('click', function(e) { ... });
// To:
NEWSBLUR.Views.FeedView = Backbone.View.extend({
...
events: {
"click" : "open"
}
...
});
One of the biggest benefits you'll receive by moving to Backbone.js is going from event binding to event delegation. If you're not already familiar with what this is, it is simply attaching all events to the top-level view element and then bubbling any events that happen to a child element up to the parent, where it is caught and delegated to the appropriate method.
This also means that you won't have events bound all over the DOM. And when you destroy views, you know where all of the events are bound and will not have as much work to do in order to prevent memory leaks from events bound to missing elements.
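If delegation is new to you, the mechanism can be sketched in a few lines without a browser. Everything here is a simplified assumption: FakeNode stands in for a DOM element, the events map is keyed by class name rather than Backbone's "click .selector" strings, and dispatch plays the role of a click bubbling up from its target.

```javascript
// Hypothetical stand-in for a DOM node: a class name and a parent pointer.
function FakeNode(className, parent) {
    this.className = className;
    this.parent = parent || null;
}

// Delegation: a single listener on the root looks at where the event
// started and walks up toward the root until an entry in the events map
// matches, then calls the named method on the view.
function delegate(root, events, view) {
    return function dispatch(target) {
        for (var node = target; node; node = node.parent) {
            var method = events[node.className];
            if (method) return view[method](node);
            if (node === root) break;
        }
    };
}

var root = new FakeNode('feed-list');
var feed = new FakeNode('feed', root);
var title = new FakeNode('feed-title', feed);

var opened = [];
var view = {
    open: function(node) { opened.push(node.className); }
};
var dispatch = delegate(root, { 'feed': 'open' }, view);
dispatch(title); // a click on the title bubbles up to the enclosing feed
```

Only one listener exists no matter how many feeds are in the list, which is why tearing down the view later is so much simpler.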
Delegating the same object from multiple views
Sometimes an object on your page can be better represented by different views attached to the same element.
NEWSBLUR.Views.FeedView = Backbone.View.extend({
initialize: function() {
this.menu = new NEWSBLUR.Views.FeedMenuView({el: this.el});
}
});
NEWSBLUR.Views.FeedMenuView = Backbone.View.extend({
...
events: {
"click" : "open"
}
...
});
In this case it's important to remember that both of these views will be listening for bubbling events. That means it is possible to create a race condition if you do not know the order in which these views are instantiated. Be careful that the separate views do not step on each other's toes.
In this instance the feed title view and the feed menu view are attached at the same place but serve completely different purposes. The feed title view is the "parent" of the menu view, even though they exist at the same level in the DOM hierarchy.
Which element to use
If your top-level element is complicated and you are creating it as part of your render, then you can use setElement instead of $(this.el).html().
var $feed = $(_.template('<li class="feed"> ... </li>', {
...
}));
this.setElement($feed);
// Would include a surrounding <div>
// $(this.el).html($feed)
But you can't just perform a setElement, because you need to replace the $el if it is on the page:
this.$el.replaceWith($feed);
this.setElement($feed);
This is meant for switching between element types. For example, if a view will sometimes go into a list as a list item but will also be displayed on its own, you will want to either control the tagName or include the top-level tag in your template and use setElement.
View collections
Models should not know about views. So in order to keep track of views, a parent view should encapsulate them and store references.
findFeedInFeedList: function(feedId) {
var $feeds = $('.feed', $feedList).map(function() {
var dataId = $(this).data('id');
if (dataId == feedId) {
return this;
}
});
return $feeds;
}
[turns into a view collection on FeedListView]
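One way the view collection on the parent might look is sketched below in plain JavaScript. The method names addFeedView and findFeedViews, the feedId property, and the sample folders are assumptions for illustration, not NewsBlur's actual code. It also assumes the same feed can appear in more than one folder, hence an array of views per id.

```javascript
// Hypothetical parent view that owns its child views and indexes them by
// feed id, so lookups no longer need to scan the DOM.
function FeedListView() {
    this.feedViews = {};   // feed id -> array of child views
}
FeedListView.prototype.addFeedView = function(feedView) {
    var id = feedView.feedId;
    (this.feedViews[id] = this.feedViews[id] || []).push(feedView);
    return feedView;
};
FeedListView.prototype.findFeedViews = function(feedId) {
    return this.feedViews[feedId] || [];
};

var feedList = new FeedListView();
// Plain objects stand in for real child views here.
feedList.addFeedView({ feedId: 64, folder: 'Blogs' });
feedList.addFeedView({ feedId: 64, folder: 'Favorites' });
feedList.addFeedView({ feedId: 128, folder: 'Blogs' });

var found = feedList.findFeedViews(64);
```

The lookup that used to be a jQuery scan over every .feed element becomes a dictionary access on the parent view.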
Recursive subviews

This is a typical case where you want to have a recursive subview that can contain more of itself. In this case, we have feeds and folders, and both can be children of folders. To accomplish this hierarchy, we just have to descend down the chain, rendering each subview and keeping track of each child view.
NEWSBLUR.Views.Folder = Backbone.View.extend({
render: function() {
var depth = this.options.depth;
var $feeds = this.collection.map(function(item) {
if (item.isFeed()) {
var feed = new NEWSBLUR.Views.Feed({
model: item.feed,
depth: depth
}).render();
item.feed.views.push(feed);
return feed.el;
} else {
var folder = new NEWSBLUR.Views.Folder({
model: item,
collection: item.folders,
depth: depth + 1
}).render();
item.folderViews.push(folder);
return folder.el;
}
});
var $folder = this.renderFolder();
$(this.el).html($folder);
this.$('.folder').append($feeds);
return this;
}
});
Traversing a view
When moving from story to story or feed to feed, you want to move in the order of what's on screen. The order is handled by the collection, but keeping track of the active model is something you have to do manually.
// Go to the next feed. Old style:
$('.feed.selected').next('.next')
// New style:
Feeds.activeFeed = Feeds.next();
Feeds.activeFeed.set('selected', true);
The Feeds.next() method can be a complicated method that walks your recursive hierarchy. But that's hidden away and you can just call that method idempotently.
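As a rough idea of what such a walk could involve, here is a sketch in plain JavaScript. The tree shape, flattenFeeds, and nextFeed are all hypothetical; it assumes folders carry a children array and that on-screen order is a depth-first traversal.

```javascript
// Hypothetical nested structure: a folder holds feeds and other folders.
var tree = {
    title: 'root',
    children: [
        { id: 64 },
        { title: 'Blogs', children: [{ id: 128 }, { id: 256 }] },
        { id: 512 }
    ]
};

// Flatten the hierarchy into on-screen order (depth-first).
function flattenFeeds(folder) {
    return folder.children.reduce(function(feeds, item) {
        return feeds.concat(item.children ? flattenFeeds(item) : [item]);
    }, []);
}

// Return the feed after the active one, or the first feed when there is no
// active feed. Returns undefined at the end of the list.
function nextFeed(folder, activeFeed) {
    var feeds = flattenFeeds(folder);
    var index = activeFeed ? feeds.indexOf(activeFeed) : -1;
    return feeds[index + 1];
}

var first = nextFeed(tree, null);
var second = nextFeed(tree, first);
var third = nextFeed(tree, second);
```

Callers never see the recursion; they only hand over the current feed and receive the next one in display order.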
Action hierarchy
Views handle their own actions, but what about cross-view actions? One view modifies another view. Propagate that up to the Router or parent view that is drawing the views. Have the parent talk to the necessary models, changing appropriate data.
Once the data is changed, the correct views will update, based on their triggers and bindings.
For instance, if you are deleting a feed, the context menu view, which knows which $feed is being deleted, sends that info to the model, which then scans its own views and triggers the removal on the correct one. Finally, an AJAX request is made (this is optimistic) in the Router.
NEWSBLUR.Views.FeedMenu = Backbone.View.extend({
deleteFeed: function() {
this.model.removeFeedFromFolder(this.folder);
}
});
NEWSBLUR.Models.Feed = Backbone.Model.extend({
removeFeedFromFolder: function(folder) {
this.feedViews.chain().select(function(feedView) {
return feedView.folder == folder;
}).each(function(feedView) {
feedView.animateDestroy();
});
}
});
No need for a model to back a view

Some views just don't have models. They control a visual element on the page but have no corresponding server model. The closest object they have to a model is the page itself or the user.
This means you need to keep a reference to the view. You can't just rely on the model to update the view. Other times you do have a model. If the model backing the view changes, you can just render a new view with the correct model and replace it on the page.
Common pitfalls
TypeError: 'undefined' is not an object (evaluating 'func.bind')
This error comes from trying to bind to a method that doesn't exist. But you don't get the name or line of the error, so the only way to debug it is to set a breakpoint and work your way up the stack.
_.bindAll(this, 'render', 'open', 'methodRemovedAndWillThrowTypeError');
Firing a change event while still setting up models and views
Add {silent: true} to a model.set() call if you're not ready to handle the change events.
Selectively re-render/toggle Classes based on specific change events.
Sometimes an attribute change merely results in a changed class on the element, not a full render. One technique you can use is to bind to the bundled change event and then selectively look for attributes that only result in a class change.
onChange: function() {
    var onlyClasses = _.all(_.keys(this.model.changed), function(change) {
        return _.contains(['selected', 'has_exception'], change);
    });
    if (onlyClasses) {
        this.toggleClasses();
        return;
    }
    this.render();
}
Here, we're checking that every changed attribute is one that results in a class change. Otherwise, do the full render.
Cleanup of ghost views
When removing a view, you need to both remove it from the DOM and then unbind it. The model still has bindings to the destroyed view.
// Parent view:
view.destroy();
// View:
initialize: function() {
this.model.bind('change', this.render);
},
destroy: function() {
this.remove();
this.model.unbind('change', this.render);
},
This has changed in the latest version of Backbone.js, but it's not yet released (the coming version 1.0), so you have to manually destroy both the views and view's event bindings.
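To make the leak concrete, here is a small sketch in plain JavaScript. TinyModel and TinyView are hypothetical stand-ins with only bind, unbind, and trigger; the point is that the view keeps the exact function reference it bound so that destroy can remove it again.

```javascript
// Hypothetical, minimal model with bind/unbind, standing in for Backbone's.
function TinyModel() { this.handlers = []; }
TinyModel.prototype.bind = function(name, handler) {
    this.handlers.push({ name: name, handler: handler });
};
TinyModel.prototype.unbind = function(name, handler) {
    this.handlers = this.handlers.filter(function(entry) {
        return !(entry.name === name && entry.handler === handler);
    });
};
TinyModel.prototype.trigger = function(name) {
    this.handlers.forEach(function(entry) {
        if (entry.name === name) entry.handler();
    });
};

function TinyView(model) {
    this.model = model;
    this.renderCount = 0;
    // Keep the bound function so the exact same reference can be unbound.
    this.onChange = this.render.bind(this);
    this.model.bind('change', this.onChange);
}
TinyView.prototype.render = function() { this.renderCount += 1; };
TinyView.prototype.destroy = function() {
    this.model.unbind('change', this.onChange);
};

var model = new TinyModel();
var view = new TinyView(model);
model.trigger('change');   // the live view renders
view.destroy();
model.trigger('change');   // the destroyed view no longer renders
```

Without the unbind in destroy, the model would keep calling render on a view that is no longer on the page, and the view could never be garbage collected.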
The disappearing view

Before Backbone, views would not automatically update from beneath you. Now that views are tied to models, check to see if you are modifying a view post-render, such as inserting a special sub-view that the parent view doesn't know about. When the view re-renders, it won't know to re-insert the subview.
Resources
A couple resources that I like are:
- Developing Backbone.js Applications by Addy Osmani
- Backbone Patterns by Rico Sta. Cruz
As always, make sure to read the source of Backbone.js to see if you can just figure out what's happening under the hood.
I'm @samuelclay on GitHub, where you can follow me to watch the development of the NewsBlur front-end, back-end, iOS and Android apps. And I'm @samuelclay on Twitter where you can ask me further questions about Backbone.js.
Old-style Mac OS X Leopard Exposé in Snow Leopard
Progress is progress, except when it gets in the way of your workflow. Let's compare these two screenshots:
Old-style Leopard Exposé
New-style Snow Leopard Exposé
Notice how much more pleasant the old-style Exposé is? Introduced in Mac OS X 10.3 Panther, and virtually unchanged until OS X 10.6 Snow Leopard, it featured proportional windows. By just looking at the size of the window relative to the other windows, you can get a fair idea of what the application is.
The proportional windows went out the window with the new Exposé. Now it features an inexplicable grid, with windows resized to all different dimensions relative to their original size.
Old-style Exposé in Snow Leopard
The great news is that you can get the old-school Exposé back. The beta builds of Snow Leopard included a new Dock.app that used the old-style Exposé. By installing the old Dock.app, you get the new Dock features of Snow Leopard, while preserving the legendary Exposé.
Installation
- Download the Snow Leopard beta-build of Dock.app
- Save to your Desktop and unzip.
Run the following commands in Terminal.app:
sudo chown -R root ~/Desktop/Dock.app;
sudo chgrp -R wheel ~/Desktop/Dock.app;
sudo killall Dock && \
sudo mv /System/Library/CoreServices/Dock.app ~/Desktop/OldDock.app && \
sudo mv ~/Desktop/Dock.app /System/Library/CoreServices/
Easy to do and indispensable now that you have it back. Hat-tip to miknos at MacRumors for the original find.
Note that you will have to repeat this process every time you upgrade your Mac OS to a new patch release (10.6.6 -> 10.6.7).
@samuelclay is on Twitter.
Use Google Reader? I built NewsBlur, a new feed reader with intelligence.
What Happened to NewsBlur: A Hacker News Effect Post-Mortem
Last week I submitted my project, NewsBlur, a feed reader with intelligence, to Hacker News. This was a big deal for me. For the entire 16 months that I have been working on the project, I was waiting for it to be Hacker News ready. It's open-source on GitHub, so I also had the extra incentive to do it right.
And last week, after I had launched premium accounts and had just started polishing the classifiers, I felt it was time to show it off. I want to show you what the Hacker News effect has been on both my server and my project.
Hacker News As the Audience
When I wasn't writing code on the subway every morning and evening, I would think about what the reaction on Hacker News would be. Would folks find NewsBlur too buggy? Would they be interested at all? Let me tell you, it's a great motivator to have an audience in mind and to constantly channel them and ask their opinion. Is a big-ticket feature like Google Reader import necessary before it's Hacker News ready? It would take time, and time was the only currency which I could pay with. In my mind, all I had to do was ask. ("Looks cool, but if there's no easy way to migrate from Google Reader, this thing is dead in the water.")
Kurt Vonnegut wrote: "Write to please just one person. If you open a window and make love to the world, so to speak, your story will get pneumonia." (From Vonnegut's Introduction to Bagombo Snuff Box.)
Let's consider Hacker News as that "one person," since for all intents, it is a single place. I wasn't working to please every Google Reader user: the die-hards, the once-in-a-seasons, or the twitter-over-rss'ers. For the initial version, I just wanted to please Hacker News. I know this crowd from seeing how they react to any new startup. What's the unique spin and what's the good use of technology, they would ask. What could make it better and is it good enough for now?
If you're outsourcing tech and just applying shiny visuals to your veneer, the Hacker News crowd sniffs it out faster than a beagle in a meat market. So I thought the best way to appeal to this crowd is to actually make decisions about the UI that would confuse a few people, but enormously please many people. From comments on the Hacker News thread, it looks like I didn't wait too long.
How the Server Handled the Traffic
Have I got some graphs to show you. I use munin, and god-love-it, it's fantastic for monitoring both server load and arbitrary data points. I watch the load on CPU, load average, memory consumption, disk usage, db queries, IO throughput, and network throughput (both to external users and to internal private IPs).
I also have a whole suite of custom graphs to watch how many intelligence classifiers users are making, how many feeds and subscriptions users are adding, the rate of new users, premium users, old users returning, new users sticking around, and load times of feeds (rolling max, min, and average).
Used to be that when a thundering herd of visitors came to NewsBlur, I'd have to watch the server nervously, as CPU would smack 400% (on a 4-core machine), the DB would thrash on disk, and inevitably some service or another would become overrun.
Let's see the CPU over the past week:
CPU - Past week
Spot the onslaught? NewsBlur's app server is only responsible for serving web requests, queueing feeds to be updated, and calculating unread counts. Needless to say, even with nearly a thousand new users, I offloaded so much of the CPU-intensive work to the task servers that I didn't have a single problem in serving requests.
This is a big deal. The task server was overwhelmed (partially due to a bug, but partially because I was fetching tens of thousands of new feeds), but everybody who wanted to see NewsBlur still could. Their web requests, and loading each feed, were near instantaneous. It was wonderful to watch it happen, knowing that everybody was being served.
CPU - Past year
Clearly, bugs have been fixed, and CPU-intensive work has been offloaded to task servers.
Load average - Past week
The load of the server went up and stayed up. Why did it not fall back down? Because the app server is calculating unread counts, it has more work to do even after the users are gone. This will become a pain point when one app server is not enough for the hundreds of concurrent users NewsBlur will soon have. But luckily, app servers are the easiest to scale out, since each user will only use one app server at a time, so the data only has to be consistent on that one server, as it propagates out to the other app servers (which may become db shards, too).
# of feeds and subscriptions - Past week
Economies of scale. The more feeds I have, the more likely a subscription to a feed will be on a feed that already exists. I want that yellow line to run off into space, leaving the green line to grow linearly. It's fewer feeds to fetch.
Memory - Past week
Memory doesn't move, because I'm CPU-bound. I'm not actually moving all that much more data around. I use gunicorn to rotate my web workers, so NewsBlur's few memory leaks can be smoothed over.
MongoDB Operations - Past week
I use MongoDB to serve stories. All indexes, no misses (there's a graph for this I won't bother showing). You can extrapolate traffic through this graph. Sure, you don't know average feeds per user, but you can take a guess.
My Way of Building NewsBlur
In order to build all of the separate pieces, I broke everything down into chunks that could be written down and crossed off. Literally written down. I have all of my priorities from the past 7 months. It's both a motivator and estimator. I've learned how to estimate work load far better than back in May, when these priorities start. I finish more of what I tried to start.
The way it works is simple: write down a priority for the month it's going to be built in, number it, then cross it off if it gets built before the end of the month. You get to go back and see how much you can actually do, and what it is you wanted to build. This means I'm setting myself up for a pivot every month, when I re-evaluate what it is I'm trying to build.
Google Reader as a Competitor
Lastly, what more could you ask for? A prominent competitor, known to every Gmail user as the empty inbox link. Feed reading is a complicated idea made simple by having most users already exposed to a product that fulfills the feed reading need. By improving over that experience, users can directly compare, instead of having to learn NewsBlur on top of learning how to use RSS and track every site they read.
If your space has a major competitor and the barrier to entry is an OAuth import away, then consider yourself lucky. Anybody can try your product and become a paying customer in moments. It's practically a Lotus 1-2-3 to Excel import/export, except you don't need to buy the software before you try it out.
Going Forward
I'm half-way to being profitable. I only need 35 more premium subscribers. But so far, people are thrilled about the work I'm doing. Here are some tweets from a sample of users:
I'm e-mailing blogs, chatting with folks who have a blog influence, and most importantly, continuing to launch new features and fix old ones. Thanks to Hacker News, I get to appeal to a graceful and sharp audience. And good looking.
I'm on Twitter as @samuelclay, and I'd love to hear from you.
There are Two Paper Towel Rolls
It's almost time to restock, but the shelf can only hold 5 rolls, so you might as well restock at an appropriate time. But you have to choose which of the two remaining rolls is going in the business end of the side-gripping dispenser.
I can choose the larger of the two rolls. The Mega-Roll. Or I can choose the standard size, which is visibly puny compared to the bigger choice. If you know the answer, it seems obvious, and that's because it's an obvious answer.
But it's not so obvious if you start thinking about why you would choose one in the first place. The larger roll is larger, but does that mean it should go first simply because it is preferable? The assumption is that you don't like changing rolls often and you don't think larger rolls look or work any differently than their smaller counterparts.
And maybe the smaller roll has preference, just to get it out of the way for more Megas when it's time to buy more. You need to remember to buy more. What causes you to remember to buy more? Absence or a dwindling stock. Once you get down to having one left, and it gets placed into service, you commit to memory that you need to stock up next time you remember. It's a modified version of The Game that you play with yourself, except that by remembering, you win.
The smaller roll goes in first, so that at exhaustion the larger roll has a longer opportunity for you to remember to buy more. Nothing shocks you more than an absence.
Migrating Django from MySQL to PostgreSQL the Easy Way
I recently moved NewsBlur from MySQL to PostgreSQL for a variety of reasons, but most of all because I wanted to use connection pooling and database replication using Slony, and Postgres has a great track record and community. But all of my data was stored in MySQL, and there is no super easy way to move from one database backend to another.
Luckily, since I was using the Django ORM, and with Django 1.2's multi-db support, I can use Django's serializers to move the data from MySQL's format into JSON and then back into Postgres.
Unfortunately, if I were to use the command line, every single row of my models would have to be loaded into memory. Issuing a command like this:
#!bash
python manage.py dumpdata --natural --indent=4 feeds > feeds.json
would take a long, long time, and it wouldn't even finish, since I don't have anywhere near enough memory.
Luckily, the dumpdata and loaddata management commands are actually just wrappers on the internal serializers in Django. I decided to iterate through my models and grab 500 rows at a time, serialize them and then immediately de-serialize them (so Django could move from database to database without complaining).
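#!python
import sys
from django.core import serializers

def migrate(model, size=500, start=0):
    # Copy every row of `model` from the 'mysql' database to 'default',
    # `size` rows at a time, so only one batch is ever held in memory.
    count = model.objects.using('mysql').count()
    print "%s objects in model %s" % (count, model)
    for i in range(start, count, size):
        # Print the current offset so progress is visible.
        print i,
        sys.stdout.flush()
        # Order explicitly so consecutive slices never overlap or skip rows.
        original_data = model.objects.using('mysql').order_by('pk')[i:i+size]
        # Round-trip through JSON so Django rebuilds the objects for 'default'.
        original_data_json = serializers.serialize("json", original_data)
        new_data = serializers.deserialize("json", original_data_json,
                                           using='default')
        for n in new_data:
            n.save(using='default')

# Feed is the model class, already imported in the Django console.
migrate(Feed)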
This assumes that you have both databases set up in your settings.py like so:
#!python
DATABASES = {
    'mysql': {
        'NAME': 'newsblur',
        'ENGINE': 'django.db.backends.mysql',
        'USER': 'newsblur',
        'PASSWORD': '',
    },
    'default': {
        'NAME': 'newsblur',
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'USER': 'newsblur',
        'PASSWORD': '',
    }
}
Note that I changed my default database to the Postgres database, because otherwise some management commands would still try to run on the default MySQL database. This has probably been resolved, or I did something wrong, but when I migrated, I changed Postgres to be the default database.
I just run the short script in the Django console and wait however long it takes. The script prints which batch it's working on, so you can at least track the progress. It might take a long, long time, but it is much less prone to crashing than dumpdata and loaddata.
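To see which slices that loop walks through, and how the `start` argument lets you pick up again after an interruption, here is a plain-Python sketch with no Django involved. `batch_windows` is a hypothetical helper, not part of Django or of the script above; it only mirrors the `range(start, count, size)` arithmetic.

```python
def batch_windows(count, size=500, start=0):
    """Yield (lower, upper) slice bounds, mirroring range(start, count, size)."""
    for lower in range(start, count, size):
        # The last window is clipped so it never runs past the row count.
        yield lower, min(lower + size, count)


# 1,200 rows in batches of 500 gives three windows.
windows = list(batch_windows(1200))

# Resuming from the last offset the script printed skips the finished batches.
resumed = list(batch_windows(1200, start=1000))

print(windows)
print(resumed)
```

So if the script stops after printing 1000, calling `migrate(Feed, start=1000)` would pick up at that batch instead of starting over.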
A word of warning to those with large datasets. Instead of iterating straight through the table, see if you have a handier index already built on the table. I have a table with a million rows, but there are a few indices which can quickly find stories throughout the table, rather than having to order and offset the entire table by primary key. Adapt the following code to suit your needs, but notice that I use an index on the Feed column in the Story table.
#!python
import sys
from django.core import serializers

def migrate_with_model(primary_model, secondary_model, offset=0):
    # Walk every secondary_model row (Feed) and copy its related
    # primary_model rows (Story), so each query uses the story_feed index.
    secondary_model_data = secondary_model.objects.using('mysql').all()
    # Start counting at `offset` so the printed index can be reused to resume.
    for i, feed in enumerate(secondary_model_data[offset:].iterator(), offset):
        stories = primary_model.objects.using('mysql').filter(story_feed=feed)
        print "[%s] %s: %s stories" % (i, feed, stories.count())
        sys.stdout.flush()
        original_data = serializers.serialize("json", stories)
        new_data = serializers.deserialize("json", original_data,
                                           using='default')
        for n in new_data:
            n.save(using='default')

migrate_with_model(primary_model=Story, secondary_model=Feed)
This makes it much faster, since I only have to sort a few hundred records rather than the entire Story table and its million rows.
Also of note is that while all of the data made it into the Postgres tables, the sequences (the counters behind the auto-incrementing ids) were all off. Many were at 0. To remedy this easily, take the highest id in each table and set that table's sequence to it, like so:
#!sql
select setval('rss_feeds_tag_id_seq', max(id)) from rss_feeds_tag;
select setval('analyzer_classifierauthor_id_seq', max(id)) from analyzer_classifierauthor;
select setval('analyzer_classifierfeed_id_seq', max(id)) from analyzer_classifierfeed;
select setval('analyzer_classifiertag_id_seq', max(id)) from analyzer_classifiertag;
select setval('analyzer_classifiertitle_id_seq', max(id)) from analyzer_classifiertitle;
select setval('analyzer_featurecategory_id_seq', max(id)) from analyzer_featurecategory;
I just made a quick text macro on the table names. This quickly set all of the sequences to their correct values.
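If you'd rather not build a text macro, the same statements can be generated with a few lines of Python. This is only a sketch: it assumes every table has an `id` primary key and a sequence named `<table>_id_seq`, as the six tables above do, and `setval_statements` is a hypothetical helper name.

```python
def setval_statements(tables):
    """Build one setval() statement per table, assuming <table>_id_seq naming."""
    template = "select setval('%s_id_seq', max(id)) from %s;"
    return [template % (table, table) for table in tables]


# A few of the table names from the statements above.
tables = [
    "rss_feeds_tag",
    "analyzer_classifierauthor",
    "analyzer_classifierfeed",
]

for statement in setval_statements(tables):
    print(statement)
```

Paste the printed statements into psql. Django also ships a `sqlsequencereset` management command that prints sequence-reset SQL for an app's models, which may be worth trying first.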
This post has been translated to Spanish by Maria Ramos.
NewsBlur: Most Watched This Week
It's always nice to see that after working on a project for 13 months, people are finally starting to use it. The source behind NewsBlur is available on GitHub: http://github.com/samuelclay/NewsBlur/. And recently, in response to a Hacker News thread about why RSS readers suck, I linked to NewsBlur and explained my rationale:
I think I created a very nice feed reading experience with NewsBlur: http://www.newsblur.com.
It shows the original site, allows you to read as you normally would, but keeps track of the stories you're scrolling past.
It also allows you to filter stories based on what you like and dislike about them: words/phrases in the title, tags and categories, authors, and the publisher themselves. There is a slider that allows you to show/hide stories based on this filter. It's very fast, too.
I am writing an iPhone app so you can use NewsBlur everywhere. It's just a hobby project, and people have so far been impressed. But I would love for NewsBlur to become a useful tool that people choose to use.
I wrote it because I was also dissatisfied with readers, especially Google Reader. I also knew Python (Django!), JavaScript, and wanted to put them together to test my abilities.
Currently, I am writing the iPhone app that will allow NewsBlur to be useful to a significant portion of Internet users who read RSS. Everybody that I have talked to says they are waiting for a good mobile version before they sink in time and curation into NewsBlur. Let's hope I am not underestimating when I say 1-2 months.
Code Snippet: jQuery Edit In Place
There are many solutions to the edit-in-place problem, but I wanted to make an easy solution that wasn't as complicated as some of the other edit-in-place JavaScript scripts, like jEditable.
Features:
- Detects surroundings and keeps the input container as either a block or inline display.
- Highlights text if it is the original text. If the text has changed, the entire text is not highlighted on edit.
- Easily customizable and styleable.
Demo
JavaScript Code
(function($) {
    $.fn.extend({
        edit_in_place: function(opts, callback) {
            var self = this;
            var defaults = {
                'input_type': 'text'
            };
            var options = $.extend({}, defaults, opts);

            return this.each(function() {
                var $this = $(this);
                var $input;
                // Remember the untouched text and display mode of the element.
                var original_value = $this.html().replace(/<br.*?>/g, '\n');
                var original_display = $this.css('display');

                $this.bind('click', function() {
                    var starting_value = $this.html().replace(/<br.*?>/g, '\n');
                    if (options['input_type'] == 'text') {
                        $input = $.make('input', { type: 'text', name: 'eip_input', value: starting_value });
                    } else if (options['input_type'] == 'textarea') {
                        $input = $.make('textarea', { name: 'eip_input' }, starting_value);
                    }
                    var $form = $.make('div', { className: 'eip-container' }, [
                        $input,
                        $.make('button', { className: 'eip-submit' }, 'OK'),
                        $.make('button', { className: 'eip-cancel' }, 'Cancel')
                    ]);

                    // Swap the text out for the input.
                    $this.css({'display': 'none'});
                    $this.after($form);
                    $input.focus();
                    // Only select everything if the text has never been edited.
                    if (original_value == starting_value) {
                        $input.select();
                    }

                    // Put the text back, saving `input` if one was given.
                    var restore_input = function(input) {
                        return function($this, $form) {
                            $this.css({'display': original_display});
                            $form.empty().remove();
                            if (input) {
                                $this.html(input.replace(/[\n\r]+/g, "<br /><br />"));
                                $.isFunction(callback) && callback.call(self, input);
                            }
                        }($this, $form);
                    };

                    // Wait a moment so the click that opened the editor
                    // doesn't immediately close it.
                    setTimeout(function() {
                        $(document).one('click.edit_in_place', function() {
                            restore_input($input.val());
                        });
                        $form.click(function(e) {
                            if (e.target.className == 'eip-cancel') {
                                restore_input();
                                $(document).unbind('click.edit_in_place');
                            } else if (e.target.className == 'eip-submit') {
                                restore_input($input.val());
                                $(document).unbind('click.edit_in_place');
                            }
                            e.stopPropagation();
                            return false;
                        });
                    }, 10);
                });
            });
        }
    });

    $.extend({
        // $.make(tagname, [props], [text or children]) builds a DOM element.
        make: function(){
            var $elem,text,children,type,name,props;
            var args = arguments;
            var tagname = args[0];
            if(args[1]){
                if (typeof args[1]=='string'){
                    text = args[1];
                }else if(typeof args[1]=='object' && args[1].push){
                    children = args[1];
                }else{
                    props = args[1];
                }
            }
            if(args[2]){
                if(typeof args[2]=='string'){
                    text = args[2];
                }else if(typeof args[2]=='object' && args[2].push){
                    children = args[2];
                }
            }
            if(tagname == 'text' && text){
                return document.createTextNode(text);
            }else{
                $elem = $(document.createElement(tagname));
                if(props){
                    for(var propname in props){
                        if (props.hasOwnProperty(propname)) {
                            if($elem.is(':input') && propname == 'value'){
                                $elem.val(props[propname]);
                            } else {
                                $elem.attr(propname, props[propname]);
                            }
                        }
                    }
                }
                if(children){
                    for(var i=0;i<children.length;i++){
                        if(children[i]){
                            $elem.append(children[i]);
                        }
                    }
                }
                if(text){
                    $elem.html(text);
                }
                return $elem;
            }
        }
    });
})(jQuery);
To use this code, simply use this HTML, CSS, and small JavaScript snippet:
<div class="eip">
    Test Input: <span class="eip-text">Click here to change this text.</span>
</div>
And this CSS:
.eip {
    font-family: Helvetica;
    font-size: 16px;
}
.eip .eip-text {
    font-weight: bold;
    padding: 2px 3px;
    border: 1px solid white;
}
.eip .eip-container {
    display: inline;
}
.eip input {
    font-family: Helvetica;
    font-size: 16px;
    font-weight: bold;
    padding: 2px;
    border: 1px solid #A0A0A0;
    display: inline;
    width: 250px;
}
And this simple piece of JavaScript, which includes a callback function that has the same scope as the original selectors:
$(document).ready(function() {
    $('.eip .eip-text').edit_in_place({}, function() {
        var $this = $(this);
        $this.animate({'backgroundColor': 'orange'}, {'duration': 300, 'queue': false, 'complete': function() {
            $this.animate({'backgroundColor': 'white'}, {'duration': 300, 'queue': false});
        }});
    });
});
Note that I am animating background colors in this small JavaScript snippet. To animate colors, you need John Resig's excellent jQuery.color.js.
Code snippet: Stopping a jQuery AJAX Request
I want JavaScript to feel as smooth as a native application. I think scrolling is one of the largest issues, but this code snippet is more about aborting the jQuery AJAX event before it has a chance to complete.
There's no good documentation in the jQuery docs about how to do this, other than to just call abort() on an existing AJAX request:
var request = $.ajax({url: '/url', data: data, success: callback});
request.abort();
That doesn't work. Well, it does work, but if you try to run it again or synchronously with other requests, you'll run into issues.
The issues are non-trivial, but avoidable. I'll cut to the chase: I came up with a solution, then found that somebody did it better and more correctly.
Rather than spreading incorrect (rather, incomplete) code, I'll just show the proper way to do it and then link to the source.
_isAbort: function(xhr, o){
    var ret = !!( o.abortIsNoSuccess
                  && ( !xhr
                       || xhr.readyState === 0
                       || this.lastAbort === o.xhrID ) );
    xhr = null;
    return ret;
},
That's a lot of work. Don't bother, just use jquery.ajaxManager v.3.0: http://www.protofunc.com/scripts/jquery/ajaxManager3/
Note, however, that if you just Google "jquery ajax manager" or some variant, you will end up at the old version, which is at: http://www.protofunc.com/scripts/jquery/ajaxManager/. They could do some work on their Google juice to point to the latest version.
Hope this helps somebody else, even if only as part of a Google search for "jquery ajax stop request" someday.
A jQuery Plugin: Default Values for Input Fields
One of the best ways to write code that you tend to have to re-use is to put it in the public domain. That way when you need it again, it's a Google search away from your own blog.
This is a rather simple working example of default text on an input field. Click on the field, the text disappears, only to reappear if the user clicks somewhere else on the page without typing. The input also has a special class signifying that it is empty, so you can style the empty input.
Demo
JavaScript Code
(function($) {
    $.fn.extend({
        input_default: function(default_text, opts) {
            // Accept (text), (text, opts), (opts), or no arguments at all.
            if (typeof default_text !== 'string') {
                opts = default_text;
            } else if (!opts) {
                opts = {
                    'default_text': default_text
                };
            } else {
                $.extend(opts, {'default_text': default_text});
            }
            var defaults = {
                'default_text': 'Type here...',
                'class_name': 'empty-input'
            };
            var options = $.extend({}, defaults, opts);

            return this.each(function () {
                var $this = $(this);
                // Show the default text if the field starts out empty.
                if ($this.val() == ''
                    || $this.val() == options['default_text']) {
                    $this.addClass(options['class_name'])
                         .val(options['default_text']);
                }
                $this.bind('focus', function() {
                    if ($this.val() == options['default_text']) {
                        $this.val('')
                             .removeClass(options['class_name']);
                    } else {
                        $this.select();
                    }
                }).bind('blur', function() {
                    if ($.trim($this.val()) == '') {
                        $this.val(options['default_text'])
                             .addClass(options['class_name']);
                    } else {
                        $this.removeClass(options['class_name']);
                    }
                });
            });
        }
    });
})(jQuery);
Usage
First, the HTML you can use:
You can call `input_default` with no arguments and get the defaults:
$('.text').input_default();
Specify an optional string or class:
$('.text').input_default('Enter text here...', {'class_name': 'empty'});
Here is some sample CSS to use:
.default-text {
    border: 1px solid #C0C0C0;
    padding: 2px;
    font-weight: bold;
    font-size: 14px;
}
.empty-input {
    color: #A0A0A0;
}
.default-text-label {
    font-size: 16px;
    font-weight: bold;
    color: #303030;
}
A Faulty Heist: A Storybird
This Storybird is written by thesundaybest, found on twitter: @thesundaybest.
A Faulty Heist by thesundaybest on Storybird
Storybirds like this remind me why I love working with a community of artists and children's literature.
