18th June 2008

Office Hours

Brad Feld posts about office hours during college and how he tries to do something similar now at the TechStars incubator. We have a different take on office hours at DeepRockDrive, but so far it is working out really well and I thought it would be interesting to share.

When I started working with DeepRockDrive, the technology folks up here in the Seattle area didn’t have a real office yet. Folks just met most days in a coffee shop and would hang out and work on the code. There were a few contractors scattered around various parts of the world, and people would often work at home. Everyone would log in to a common Skype chat room all day, which gave us something close to shouting over to the guy at the desk near you.

We have had an office for several months now, but it’s over in Bellevue. Our staff is spread all over the Puget Sound area and traffic isn’t so wonderful around here most of the time. I was hoping we could keep some of the culture of not wasting 40-60 minutes a day in traffic, plus the advantage of being able to concentrate at my home office (not to mention reducing the environmental impact of all that driving, especially in stop-and-go traffic). At the same time, to act as a well-oiled agile startup we need great communication with each other, and it was sometimes really difficult to find a time when all the right people were around to discuss a given topic.

What we came up with was the concept of “core office hours”. This is roughly Monday, Wednesday and Friday from 10am to 3pm. During those times people are expected to be in the office (with the obvious exceptions for vacation, travel, important appointments, etc). Those are great times to schedule a meeting, plus you can usually pull together the right people for an impromptu meeting about just about anything. But the limited hours also help keep our schedules from filling up with constant meetings, so we have solid blocks of time to get code written and tested, write important documents, etc. On Tuesdays and Thursdays I can avoid getting in the car at all. On Monday, Wednesday and Friday, when I do need to go into the office, I can do it at a time when traffic is WAY better (20min vs 40-60), plus it’s a nicer time if you want to bike too.

So far overall I’d say this system is working great, but I do have a few thoughts about some considerations that are necessary to make it work-

  • It is not going to work for all job roles. Some types of jobs require everyone to be together in one central place. And the job needs to be something where the output is pretty measurable- if you can’t tell whether someone is goofing off, it’s going to breed ill will. If the job is something where the results speak for themselves (amount of code written, bugs found, etc) it is a good fit.
  • It is not going to work for all people. To make this work you need people that are very self-motivated and self-starting.
  • The Skype thing helps us a ton (although any other form of live chat room with presence information also works). It lets us communicate and get problems solved with colleagues in real time, and it helps people be visibly “on the job”.
  • It helps to have good network resources. We rely on a combination of the Skype stuff, as well as GMail, Google Docs, Basecamp, and an SVN and Trac server that we have deployed in our data-center. I’d also point out that all of those services are accessible without VPN so our staff can easily work on stuff from home / a cafe / vacation / the road. In theory having to VPN shouldn’t matter but I’ve always found it to be a big barrier to getting real work done.

posted in Business, Jobs, Management, Technology | 0 Comments

18th June 2008

Problems Installing Vista SP1

I’m having a problem installing Vista SP1 on my main workstation. Microsoft Update keeps throwing a Code 80070570 error which has no real description. The best hint I can find searching the net is that it might be related to some disk issues, but running a CHKDSK to fix errors didn’t make the problem go away.

My next attempt is to download the standalone upgrade- it’s possible the problem I’m experiencing is with Microsoft Update rather than the service pack itself.

posted in Microsoft, Software, Technology, Vista | 2 Comments

13th June 2008

Memcache Race Condition Fun

Dealing with cache expiration with memcache has some subtle gotchas that many people ignore at first. Let’s say you are using memcache to cache an object that you want to refresh at least every 5 minutes. The typical pattern for this would be (in PHP)-

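// Try the cache first; get() returns false on a miss, which == NULL also matches.
$myobj = $memcache->get('key');
if($myobj == NULL)
{
  // Cache miss: load from the database and cache the result for 300 seconds (5 minutes).
  // Note: the PECL Memcache class takes set(key, value, flag, expire), so there the TTL is the fourth argument.
  $myobj = LoadMyObjFromMySql();
  $memcache->set('key', $myobj, 300);
}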

With this code you should expect to see one query on your database every 5 minutes. But implement this on a high-traffic web site and you will probably see your MySQL database get pegged every 5 minutes with a spike of a ton of queries. What is going on?

Notice that there is a time window in between the memcache get, the database query and the memcache set. This window isn’t large, but it could easily be 10ms. On a web site that is running 1000 requests per second you could easily have 10 different requests where the memcache get fails (because the object is expired), and they each hit the database before they update the memcache and everything is back to normal.
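To make the arithmetic concrete, here is a tiny Python sketch (Python rather than PHP so it runs without a memcache server). It assumes requests arrive evenly spaced and that every request arriving before the first refill finishes sees a miss; the function name is made up for illustration.

```python
def count_stampede_misses(requests_per_second, refill_window_ms):
    """Count requests that see an expired key before the first refill lands.

    Assumes evenly spaced arrivals, which real traffic is not; this only
    illustrates the order of magnitude.
    """
    interval_ms = 1000.0 / requests_per_second
    misses = 0
    arrival_ms = 0.0
    while arrival_ms < refill_window_ms:
        misses += 1  # this request also goes to the database
        arrival_ms += interval_ms
    return misses


# 1000 requests per second with a 10ms get/query/set window
print(count_stampede_misses(1000, 10))
```

At 1000 requests per second and a 10ms window this counts the 10 requests mentioned above; a slower query or a busier site makes the spike proportionally bigger.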

The solution is to not rely on memcache to manage the expiration. We can rewrite the code above to look more like-

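$myobj = $memcache->get('key');
// Refresh on a miss, or once our own "soft" expiration plus 0-120 seconds of random jitter has passed.
if($myobj == NULL || $myobj->expire + rand(0,120) < time())
{
  $myobj = LoadMyObjFromMySql();
  // Soft expiration: our real 5 minute target, stored on the object itself.
  $myobj->expire = time() + 300;
  // Hard expiration in memcache: 10 minutes, longer than the soft target.
  $memcache->set('key', $myobj, 600);
}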

This way memcache still has an expiration for the objects, but it’s longer than our real target. For the first 300 seconds the objects always just get loaded from the memcache server. Starting at 300 seconds there is a tiny chance that any given request will refresh from the database and update the memcache, and that chance grows each second until a refresh is certain once 420 seconds have passed. So if there is a large load on the server, chances are that only 1 or 2 updates will happen, and they will happen right away. If there is a lighter load, the object will still get refreshed for sure within 7 minutes. In effect, how soon after the 5 minute mark the refresh happens scales with the current load, while the number of requests that hit the database stays small.

The above code of course relies on you storing PHP “objects” in the memcache so it can tag the extra expire property on. If you aren’t storing something that is already an object you can always create a new object that stores your real thing in a property called “data” and still uses expire. You will pay a small overhead for the object marshaling but it shouldn’t be too painful.
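Here is the same idea as a self-contained Python sketch, including the wrapper with “data” and “expire” fields. It is not the production code: FakeCache is a hypothetical in-memory stand-in for a memcache client, and the function name, parameter names and injectable clock are assumptions made so the example can run without a server.

```python
import random
import time


class FakeCache:
    """Hypothetical in-memory stand-in for a memcache client."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, hard_expiry = entry
        if self._clock() >= hard_expiry:
            del self._store[key]  # hard expiration, as memcache would do
            return None
        return value

    def set(self, key, value, ttl):
        self._store[key] = (value, self._clock() + ttl)


def get_with_soft_expiry(cache, key, loader, soft_ttl=300, jitter=120,
                         hard_ttl=600, clock=time.time, rng=random):
    """Return the cached value, refreshing it early with a random jitter."""
    wrapped = cache.get(key)
    # Same test as the PHP above: expire + rand(0, 120) < time()
    if wrapped is None or wrapped["expire"] + rng.randint(0, jitter) < clock():
        # Wrap the real value so the soft expiration can travel with it.
        wrapped = {"data": loader(), "expire": clock() + soft_ttl}
        cache.set(key, wrapped, hard_ttl)
    return wrapped["data"]
```

Passing the clock in makes it easy to check the three phases: no refresh before 300 seconds, a possible refresh between 300 and 420, and a guaranteed one after that.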

posted in Software, Technology | 0 Comments

10th June 2008

Cheap SSD Drive for my Laptop

A couple of weeks ago I wrote about problems trying to get my SSD drive working in my laptop. Since then I’ve done some experimentation and figured out the issues.

Initially I bought a RiData 32GB CF card (266x speed) and a SYBA SY-SATA2CF CF to SATA Adapter. It wasn’t working (it would hang in Windows setup or boot) but I couldn’t tell which component was at fault. Since then I noticed that Sans Digital has the CS2T CF adapter, which is shaped like a 2.5″ drive and accepts two CF cards. It’s a lot more expensive than the Syba adapter ($99 vs. $18), but it works, and $18 isn’t a bargain for a card that I just couldn’t get to work right.

Having the Sans Digital be shaped like a normal drive is also a huge help. With laptops you often need to insert the thing way back into the case and they all pretty much assume the standard drive form factor. The Syba was a big pain to get in but the Sans Digital fits into my Dell laptop easily. Once I could tell it worked I bought a second RiData 32GB card and was able to just insert it and expand my volume in Windows- Presto! 64GB SSD drive for $290. It runs Vista great and I’ve installed Voyager (flight planning software) so I should be able to use it in the airplane.

I should mention that I’ve bought several products from Sans Digital so far- I’ve also gotten both of their 4-drive SATA external enclosures, the USB TR4U and the eSATA TR4M. Both work great and are inexpensive and easy to manage ways to add massive storage to your computers.

posted in Hardware, Technology | 0 Comments

9th June 2008

Microsoft Velocity and Memcache

Just saw a post about Microsoft Velocity, Microsoft’s answer to memcache. I’m looking forward to checking this out soon- we have had a ton of success using memcache on the LAMP platform and it was a missing piece in the .NET world. I understand why it was a bit hard to see its importance in that environment- unlike PHP, .NET can persist things in memory in between requests. PHP needs memcache early on, since you can’t really save anything from request to request and so end up going to the database very quickly. But as your system grows, memcache fills an even more important role because of how easily you can scale out by adding more caching servers.

Memcache (at least the standard one) has a few problem scenarios, all revolving around how it so easily shares the load between servers. The coolest thing about memcache is how simple a mechanism it is- no complex configuration to tune (although there are a few things to tune), maintain, or debug. You simply configure the same N memcache servers on all your web servers, and when you save or retrieve an object, the client takes the ‘key’ (an arbitrary string), hashes it and uses the hash to pick one of the memcache servers (in effect it does H(key)%N where H is the hashing function and N is the number of servers). The result is that you automatically get a smooth distribution of your keys across your caching servers.
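A minimal Python sketch of the H(key)%N selection, to show how evenly keys spread out. MD5 is used here only as a convenient stand-in for H; real memcache clients use their own hash functions, and the function names are invented for this example.

```python
import hashlib
from collections import Counter


def pick_server(key, num_servers):
    """H(key) % N, with MD5 standing in for the client's real hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_servers


def distribution(keys, num_servers):
    """Count how many keys land on each server index."""
    return Counter(pick_server(key, num_servers) for key in keys)


keys = ["user-%d" % i for i in range(10000)]
for server, count in sorted(distribution(keys, 10).items()):
    print(server, count)
```

With 10,000 keys and 10 servers, each server should end up with roughly a tenth of the keys, and no server needs to know about any other.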

However, if you add or remove memcache servers from the array, N changes, so H(key)%N points most of your keys at different servers all of a sudden. Now, if your site is under low load, this isn’t that big a deal- you just dumped your whole cache and it will build back up and be fine. But if you are running memcache because you really need it, your site just went down as all the web servers started pounding directly on the database with every request. Right now DeepRockDrive has a pretty unique situation where we get huge spikes of traffic (that are mostly predictable- showtime) during which the memcache servers really save our bacon, and most of the time we can clear them out more or less safely, but most big sites are going to have a more consistent traffic pattern and would have a harder time with this.
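To see how much of the cache is effectively lost, this Python sketch counts the keys whose server changes when the array grows from 10 to 11 servers. As before, MD5 is only a stand-in for the client’s hash and the names are made up.

```python
import hashlib


def pick_server(key, num_servers):
    """H(key) % N, with MD5 standing in for the client's real hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_servers


def fraction_remapped(keys, old_count, new_count):
    """Fraction of keys that map to a different server index after resizing."""
    moved = sum(1 for key in keys
                if pick_server(key, old_count) != pick_server(key, new_count))
    return moved / len(keys)


keys = ["user-%d" % i for i in range(10000)]
print(fraction_remapped(keys, 10, 11))
```

For a well-mixed hash, a key keeps its server only when H(key)%10 happens to equal H(key)%11, which is about 1 key in 11. Adding a single server therefore invalidates roughly 90% of the cache rather than the 10% you might hope for.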

This also means that if a memcache server goes down you can’t just pull it out of your configuration (at least under normal load). You really, really need to replace it. The easiest (although resource intensive) way to do this is to just have a hot-spare server or two in your rack. If one of your memcache servers goes down, you map that spare to the same IP and bring it up. You just lost a portion of your cache (10% if you have 10 servers, etc.), but it’s way better than losing the whole thing. A more complicated setup would be to run multiple instances of memcache on every machine. So if you have 10 memcache machines, you run each with 3 IPs and it looks like your whole array is 30 “servers”. If one machine goes down, you bring those 3 IPs up on 3 of the other machines, distributing its load across them.
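The multiple-instance layout can be sketched as a table from “server” slot (one per IP) to the machine currently hosting it. This is only an illustration of the bookkeeping in Python; the function names are hypothetical and the actual IP takeover would happen at the network level, outside of anything shown here.

```python
def build_slots(machines, slots_per_machine=3):
    """Return a list where index = slot (IP) and value = hosting machine."""
    return [machine for machine in machines for _ in range(slots_per_machine)]


def fail_over(slot_owner, dead_machine):
    """Reassign the dead machine's slots to distinct surviving machines."""
    survivors = sorted(set(slot_owner) - {dead_machine})
    new_owner = list(slot_owner)
    moved = 0
    for slot, owner in enumerate(slot_owner):
        if owner == dead_machine:
            # Hand each orphaned slot to a different survivor in turn.
            new_owner[slot] = survivors[moved % len(survivors)]
            moved += 1
    return new_owner


machines = ["m%d" % i for i in range(10)]
slots = build_slots(machines)        # 30 slots on 10 machines
after = fail_over(slots, "m3")       # m3's 3 slots move elsewhere
print(len(after), after.count("m3"))
```

The key point is that the number of slots stays at 30 before and after the failure, so H(key)%N keeps sending every key to the same slot. Only the keys on the dead machine’s 3 slots need to be rebuilt.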

We haven’t had a chance to fully work out the scalability of the standard memcached running with multiple instances on the same box so far. We have played with it some, and on our 8-core boxes, even with the right threading libraries, we haven’t gotten close to maxing the box out with a single instance of memcache. It looks like there may be some I/O limits, but I can’t be sure what is actually going on. Still, running multiple instances on the same box seems like a fairly reasonable approach, both for scalability and for these fail-over cases.

The other tricky issue is that memcache gives you a balanced distribution of keys but does not necessarily give you a balanced distribution of access. Let’s say you had some runtime configuration information that you wanted to persist on your site. The easy thing to do would be to save it in a key called ‘config’ and just retrieve that key on every request.

What you have accomplished here is to introduce a nice hard scalability limit into your system. Memcache isn’t actually that much faster than MySQL for basic queries. If MySQL can cache the query well (as it would be able to for a query on a simple table that just gets hit over and over), the performance of the two will be pretty similar. Where memcache shines is that, because of how its keys and the hashing work, it can transparently distribute that load over multiple servers. So the person building the app with the ‘config’ key will have something that looks great while the site is small and on one caching server. Then when they try to apply it to a high-load site with multiple servers, all the traffic still goes to the one memcache server (since it’s one key that gets hashed the same way every time) and they will typically be stumped about why the performance isn’t better.

The way around this is to generate keys that look something like-

'config-'.rand(0,9)

(php syntax)

At first this is counter-intuitive. I’m storing the same thing in 10 keys? That means that when they expire I’m going to have to go back and do 10x the initial loading of this object (whether from the database or config files or whatnot). However, at the cost of a very small number of those database queries (they only happen once every 5 or 60 minutes, right?), I’m spreading my keys out across my memcache array, and the result is that the load gets spread smoothly across the whole array. I can even do a couple of slick extra things, like writing to all 10 keys at once every time I refresh the config data, resulting in no extra load on the database (except for a race condition I’ll cover in a future post) and just a small amount of occasional load on one web server.
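A rough Python sketch of the fan-out, including the “write to all 10 keys at once” refinement. A plain dict stands in for the memcache client and expiration is left out, so this only shows the key handling; the function names are invented for illustration.

```python
import random

NUM_COPIES = 10  # matches rand(0,9) in the PHP key above


def config_key(rng=random):
    """Pick one of the 10 copies at random, like 'config-'.rand(0,9)."""
    return "config-%d" % rng.randint(0, NUM_COPIES - 1)


def all_config_keys():
    return ["config-%d" % i for i in range(NUM_COPIES)]


def refresh_config(cache, load_config):
    """Load the config once and write it to every copy."""
    value = load_config()
    for key in all_config_keys():
        cache[key] = value  # a real client would also set an expiration here
    return value


def read_config(cache, load_config, rng=random):
    """Read a random copy; on a miss, refresh all copies from one load."""
    value = cache.get(config_key(rng))
    if value is None:
        value = refresh_config(cache, load_config)
    return value
```

Since each of the 10 keys hashes independently, reads end up spread over several memcache servers instead of one, while the database still sees a single load per refresh.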

I started this post mentioning Microsoft Velocity and then went into memcache- looping back to Velocity, in typical Microsoft fashion it looks like it’s a much more complicated solution, but it also automatically deals with some of the above issues. As far as I can tell from a few architecture diagrams, the servers maintain knowledge of a cluster (memcache servers have no idea about each other) and I’d assume they automatically deal with some of the fail-over and “add a server” cases. It also has a more explicit concurrency model- although memcache supports building things with much of the same concurrency behavior, you need to manage a lot more of it yourself.

Looking forward to checking out Velocity more later. In particular I’m interested in when the protocol to talk to the servers will be published and whether there will be support for PHP/Python clients talking to these servers.

posted in Developers, Microsoft, Software, Technology | 4 Comments

28th May 2008

FBCal- Calendar events from Facebook into your calendar

FBCal is one of the most useful Facebook apps yet. It just creates an iCal feed from your friends’ birthdays and/or your events in Facebook. You can subscribe in iCal, Windows Calendar or Outlook. Slick, and very useful.

posted in Facebook, Technology | 0 Comments

23rd May 2008

DIY Laptop Solid State Drive

The hard drive in my Dell laptop started acting poorly so I’m trying to replace it with a solid state drive. Being too cheap to go spend $600 on an “off the shelf” SSD I’m trying to make one using a CF->SATA adapter and a 32GB CF card. Total cost $160.

The only catch is it doesn’t work so far. The Vista install dies part way through “uncompressing files”. Same with XP. At this point I’m wondering if the problem is my CF card (RiDATA 32GB 233X) or the adapter?

Any thoughts? Anyone get this working? There is a really cool looking adapter that lets you use 3 CF cards, but it’s $180 and it’s only available from geekstuff4u, where the shipping to the US is another $45. I can’t find that part from any US place.

posted in Hardware, Storage, Technology | 1 Comment

22nd May 2008

Cool DeepRockDrive Intro

A couple of weeks back Bertrand Paré, DeepRockDrive’s interactive director, did this cool intro video that we run at the beginning of shows explaining how things work. We put together a version that folks can preview online and I think it’s pretty cool-

posted in Technology | 0 Comments

22nd May 2008

Facebook Stats Weirdness

Facebook has some really nice developer pages that let you track the stats of your application. They show you total users, active users each day, daily adds, removes, blocks, and more.

The only catch is that the stats appear to be wrong half the time which limits their usefulness quite a bit. At DeepRockDrive our “daily active users” is frequently LOWER than our daily application adds.

Even worse, the total user count, which should be a pretty easy statistic, swings wildly all over the place. On Tuesday it said we had about 15,000 users. On Wednesday morning it was reporting over 20,000 users, but only about 1,000 new ones??? And today it’s down to 12,000. From some other stats I suspect the 15,000 number is the closest to reality, but it does undercut my confidence in the overall stats.

posted in Facebook, Technology | 0 Comments

21st May 2008

DeepRockDrive Videos

Stuff at DeepRockDrive has been pretty exciting lately- bigger and bigger shows, some VC visits, scaling up the new site, and all that. But today we just came across something that was pretty amazing.

We have had people post videos of our shows for quite a while. All in all there are 30+ videos of the Sunday Marie Digby show but today we found a few videos on YouTube of people watching the shows. NOT the show itself, but fans enjoying the shows, clapping, waving hands in front of the screen…

http://youtube.com/watch?v=mSEa_1w9Wh8

http://youtube.com/watch?v=esIRYCvfok0

Whoa…

posted in Technology | 0 Comments