Netscape instability

July 3, 2006 @ 8:57am

No, I'm not talking about the mental health of the team that's now running the site. I'm talking about trying to put together the largest site of my career so far.

The earliest sites I worked on were TV Guide's first site and Business Week's first site, both in 1995. There was no such thing as a crush of traffic back then and everything I did was on the front-end anyway, just design work. I started doing database-driven sites with my Tech-Engine crew, but that never got popular enough to strain under heavy usage. I did a bunch more database/web apps for Business Week, but it was never something like the home page of, just little tools like the business school ROI calculator or a private PDF downloads site.

The first time I had to deal with a crush of traffic that was too much for my server-side code to handle was when I did the CMS for A List Apart. The tools were great and the site was an upgrade from the hand-rolled versions before it, but it was definitely not Slashdot-proof. It was running on my mail server. More accurately, it was running on my lone mail/DNS/SQL/FTP/web server.

As completely dynamic and uncached as my Capgemini site is, it must not have enough traffic to reveal any flaws in its ability to scale. Either that or they do have tons of traffic, but they are running on really good hardware. Or maybe it has no flaws!

The Kansas City Chiefs NFL team site is probably the only example of a Windows IIS/SQL site I've built that has also never shown any performance trouble, and I know it gets a ton of traffic. Again, I chalk that up to good hardware, since there's very little caching going on and no one's complaining about performance.

Weblogs, Inc. had its ups and downs. It started out on that same email server, sharing resources with A List Apart. The first two Windows versions of the Blogsmith platform had limited caching: certain dynamic portions of pages were written into include files, as were all RSS feeds. Even so, we had outages and growing pains. Things like a Slashdot article hotlinking straight to two of our video files didn't help. We quickly moved to two new dedicated servers and eventually split the work among ten servers, but the ASP application wasn't really designed to be run like that. It wasn't built with worst-case traffic levels in mind.

The latest Blogsmith version was built for scaling. It runs on Linux and it has smart caching and a ton of redundancy. We can add web servers and database servers as needed to handle new traffic levels. Fortunately, much of the same team that did the latest Blogsmith platform also worked on the new Netscape.

The Netscape application is way more difficult to cache. On Blogsmith, you're dealing with blog posts and comments. If a list of blog categories is stale for 30 minutes, no one can really tell, but on Netscape if someone adds a vote or a comment and the numbers of votes and comments don't change, the experience is ruined. We were also running servers on the east and west coasts and we ran into a problem with sessions where we could attach visitors to a single server to make the experience more real-time, but the network wouldn't lock visitors into a single city. So as you bounced between the two cities your vote numbers would change and you'd think the site was broken and unstable.
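To make the difference concrete, here is a minimal Python sketch of the two caching behaviors described above. It is not the actual Netscape or Blogsmith code. The `TTLCache` class, the in-memory `votes_db` and `categories_db` stand-ins, and the 30-minute TTL are all assumptions for illustration. The category list is allowed to go stale until its TTL expires, while a vote count is dropped from the cache the moment someone votes.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable clock, handy for testing
        self._store = {}        # key -> (expires_at, value)

    def get(self, key, loader):
        """Return the cached value, calling loader() only on a miss or expiry."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self._store[key] = (now + self.ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key so the next read goes back to the source of truth."""
        self._store.pop(key, None)


# Hypothetical stand-ins for the real database tables.
votes_db = {"story-1": 10}
categories_db = ["tech", "sports"]

category_cache = TTLCache(ttl_seconds=30 * 60)  # stale for 30 minutes is fine
vote_cache = TTLCache(ttl_seconds=30 * 60)      # same TTL, but invalidated on write


def get_categories():
    # Copy the list so later changes to the "database" do not leak into the cache.
    return category_cache.get("categories", lambda: list(categories_db))


def get_votes(story_id):
    return vote_cache.get(story_id, lambda: votes_db[story_id])


def add_vote(story_id):
    votes_db[story_id] += 1
    vote_cache.invalidate(story_id)  # the voter must see the new count right away
    return get_votes(story_id)
```

Note that a per-process cache like this only keeps one server honest. With servers in two cities, each holding its own cache, a visitor who bounces between them can still see different numbers, which is the problem described above.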

I always compare the editorial side of Netscape -- the news, voting and anchor reporting -- to the stock market. The public drives stock prices (stories) up and down and the anchors are just here to tell you about the top movers -- stocks on their way up or down -- and call attention to some that you might have missed. On the technical end, it's clear that we're dealing with the exact same thing as you'd see on a trading floor: we have to be able to scale to incredible levels and at the same time keep up the appearance of being a real-time application.

So we tuned a bunch during our beta testing in June and we stress tested eight different ways before converting over the DNS and adding the traffic of to the system that was only running at

Even with all of that preparation, Netscape's built-in traffic load was a whole new level of pain. The first night of the cutover was overwhelming. We turned off some features to lighten the load. We increased some caching times. We turned off some of the cool-factor Ajax for non-members.
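The kind of switch we flipped that night can be sketched in a few lines of Python. This is only an illustration, not our real configuration: the `SiteSettings` fields, the TTL values and the 0.8 load threshold are all made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteSettings:
    cache_ttl_seconds: int
    ajax_for_guests: bool
    optional_features: bool


# Hypothetical values; the real numbers were tuned by hand during the cutover.
NORMAL = SiteSettings(cache_ttl_seconds=60, ajax_for_guests=True, optional_features=True)
DEGRADED = SiteSettings(cache_ttl_seconds=600, ajax_for_guests=False, optional_features=False)


def settings_for(load_ratio, threshold=0.8):
    """Pick settings from current load, expressed as a fraction of capacity."""
    return DEGRADED if load_ratio >= threshold else NORMAL


def use_ajax(settings, is_member):
    """Members always keep Ajax; guests lose it when the site is degraded."""
    return is_member or settings.ajax_for_guests
```

Under heavy load, `settings_for(0.95)` returns the degraded settings: longer cache times, optional features off, and plain page loads instead of Ajax for visitors who are not logged in.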

It was unreal, but we survived it.

We're adding more equipment. We're fixing bugs and making the system smarter. We know where the remaining pain points are. Most importantly, we gained a ton of experience at taking a dynamic site beyond the previous heights of a global consulting firm, an NFL team and the world's largest blog publishing network. Soon we can get back to adding hot new features, because a whole lot of stuff didn't make it into last week's launch.
