19th June 2008

Recent Interesting Posts- Functional Programming and Dark Launches

Back in March Dare Obasanjo said he was going to stop blogging. Luckily it seems like he didn’t carry through on his threat very well and has had a ton of great posts recently.

Dare posted on functional programming, Map/Reduce/Filter in C# 3.0, including some nice background on the topic. I’ve been getting into Python lately, which has some really elegant support for anonymous functions, list/collection operations, and specifically things like map/reduce/filter. This is also an interesting approach to some similar techniques in PHP.
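For anyone who hasn’t seen the Python flavor of this, here is a tiny sketch of map/filter/reduce with anonymous functions. The list and variable names are made up for illustration, and it is written for current Python, where reduce lives in functools.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]

# map: apply an anonymous function to every element
squares = list(map(lambda n: n * n, numbers))

# filter: keep only the elements the predicate accepts
evens = list(filter(lambda n: n % 2 == 0, numbers))

# reduce: fold the list down to a single value
total = reduce(lambda acc, n: acc + n, numbers, 0)

# A list comprehension combines the map and filter steps in one expression
even_squares = [n * n for n in numbers if n % 2 == 0]
```

In day-to-day Python the list comprehension form tends to be the more common way to write the map and filter steps.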

Dare also did an interesting post on techniques to dark launch / gradually ramp up new features. The idea is that you want to try out a new feature without risking taking down your entire user base at once.

Some other really simple approaches include putting a hidden iframe into existing pages that accesses the new feature. It’s a really easy way to isolate it from the rest of what is going on for your site. One thing to keep in mind is that as you dark-launch a feature it’s really important to figure out how you can throttle it up/down. If you have 20 front end servers, maybe you can deploy it only on a certain set of them, or else your pages can serve it up only a certain random percentage of the time. This way you can start it up on a very low load (say 1% of the requests) and slowly increase it to try it out more. If you start having some performance problems in your data center, just kick it down a couple of notches.
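As a rough sketch of the "random percentage of the time" throttle, something like the following would do, written in Python. The function names and the idea of keeping the percentage in a single tunable setting are my assumptions for illustration, not anything from Dare’s post.

```python
import random


def should_serve_dark_feature(rollout_percent, rng=random):
    """Return True for roughly rollout_percent percent of calls.

    rollout_percent is a hypothetical tunable (0 to 100) that you would
    keep somewhere easy to change, so the feature can be dialed up or down.
    """
    return rng.random() * 100 < rollout_percent


def server_has_feature(server_name, enabled_servers):
    """Alternative throttle: only a chosen subset of front ends serve the feature."""
    return server_name in enabled_servers
```

A page would call should_serve_dark_feature(1) to emit the hidden iframe on about 1% of requests, and the number can be raised gradually or dropped back to 0 if the data center starts to struggle.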

Deploying to only certain specific servers can be especially interesting because it can (if done right) focus the load on those servers. So those servers behave the way the whole system will once the system gets fully turned on, but don’t have the same risk of taking your whole environment down.

posted in Developers, Software, Technology | 0 Comments

18th June 2008

Problems Installing Vista SP1

I’m having a problem installing Vista SP1 on my main workstation. Microsoft Update keeps throwing a Code 80070570 error which has no real description. The best hint I can find searching the net is that it might be related to some disk issues, but running a CHKDSK to fix errors didn’t make the problem go away.

My next attempt is to download the standalone upgrade- it’s possible the problem I’m experiencing is with Microsoft Update rather than the service pack itself.

posted in Microsoft, Software, Technology, Vista | 2 Comments

13th June 2008

Memcache Race Condition Fun

Dealing with cache expiration with memcache has some subtle gotchas that many people ignore at first. Let’s say you are using memcache to cache an object that you want to refresh at least every 5 minutes. The typical pattern for this would be (in PHP)-

$myobj = $memcache->get('key');
if($myobj == NULL)
{
  $myobj = LoadMyObjFromMySql();
  $memcache->set('key', $myobj, 300);
}

With this code you should expect to see one query on your database every 5 minutes. But implement this on a high capacity web site and you will probably see your MySQL database get pegged every 5 minutes with a spike of a ton of queries. What is going on?

Notice that there is a time window in between the memcache get, the database query and the memcache set. This window isn’t large, but it could easily be 10ms. On a web site that is running 1000 requests per second you could easily have 10 different requests where the memcache get fails (because the object is expired), and they each hit the database before they update the memcache and everything is back to normal.

The solution is to not rely on memcache to manage the expiration. We can rewrite the code above to look more like-

$myobj = $memcache->get('key');
if($myobj == NULL || $myobj->expire + rand(0,120) < time())
{
  $myobj = LoadMyObjFromMySql();
  $myobj->expire = time() + 300;
  $memcache->set('key', $myobj, 600);
}

This way memcache still has an expiration for the objects, but it’s longer than our real target. For the first 300 seconds the objects always just get loaded from the memcache server. Starting at 300 seconds there is a tiny chance that each request will refresh from the database and update the memcache, and that chance grows with every second past the target. This way if there is a large load on the server, the chances are that only 1 or 2 updates will happen, but that it will happen right away. If there is a lighter load, the first request after the 7 minute mark (300 seconds plus the maximum 120 seconds of random slack) will refresh the object for sure. So the chance that any given request will refresh the cache ends up being inversely proportional to the current load.

The above code of course relies on you storing PHP “objects” in the memcache so it can tack the extra expire property on. If you aren’t storing something that is already an object you can always create a new object that stores your real thing in a property called “data” and still uses expire. You will pay a small overhead for the object marshaling but it shouldn’t be too painful.
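Here is a minimal sketch of that wrapper idea, in Python rather than PHP so it can run without a memcache server. FakeCache is a made-up in-memory stand-in for a real client, and the function name, the constants and the explicit now parameter are all assumptions for illustration. Only the “data” and “expire” fields and the 300/120/600 numbers come from the code above.

```python
import random

SOFT_TTL = 300   # our real refresh target, in seconds
JITTER = 120     # random slack added when checking the soft expiry
HARD_TTL = 600   # what we tell the cache itself; longer than SOFT_TTL + JITTER


class FakeCache:
    """Hypothetical in-memory stand-in for a memcache client."""

    def __init__(self):
        self._items = {}

    def get(self, key, now):
        entry = self._items.get(key)
        if entry is None or entry[1] <= now:
            return None  # missing, or past the hard expiry
        return entry[0]

    def set(self, key, value, ttl, now):
        self._items[key] = (value, now + ttl)


def get_with_soft_expiry(cache, key, loader, now, rng=random):
    """Return the cached value, refreshing it probabilistically after SOFT_TTL."""
    wrapped = cache.get(key, now)
    if wrapped is None or wrapped["expire"] + rng.randint(0, JITTER) < now:
        # Wrap the real value so the soft expiry travels with it.
        wrapped = {"data": loader(), "expire": now + SOFT_TTL}
        cache.set(key, wrapped, HARD_TTL, now)
    return wrapped["data"]
```

Passing now in explicitly is just to make the behavior easy to check by hand. In a real page you would use the current time instead.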

posted in Software, Technology | 0 Comments

9th June 2008

Microsoft Velocity and Memcache

Just saw a post about Microsoft Velocity, Microsoft’s answer to memcache. I’m looking forward to checking this out soon- we have had a ton of success using memcache on the LAMP platform and it was a missing piece in the .NET world. I understand why it was a bit hard to see its importance in that environment- Unlike PHP, .NET can persist things in memory in between requests. PHP really needs memcache quickly since you can’t really save anything from request to request so you go to the database very quickly. But as your system grows Memcache fills an even more important role because of how easily you can scale out adding more caching servers.

Memcache (at least the standard one) has a few problem scenarios all revolving around how it so easily shares the load between servers. The coolest thing about memcache is how simple a mechanism it is- no complex configuration to tune (there are a few things to tune), maintain, or debug. You simply configure N memcache servers on all your web servers, and when you save or retrieve an object, it takes the ‘key’ (an arbitrary string), hashes it and uses the hash to pick one of the memcache servers (in effect it does H(key)%N where H is the hashing function and N is the number of servers). The result is that you automatically get a smooth distribution of your keys across your caching servers.
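The H(key)%N selection can be sketched in a few lines of Python. Using CRC32 as H is an assumption here (real clients differ in which hash they use), and the server names are made up.

```python
import zlib


def pick_server(key, servers):
    """Pick a server with H(key) % N, using CRC32 as a stand-in for H."""
    index = zlib.crc32(key.encode("utf-8")) % len(servers)
    return servers[index]


# Hypothetical pool of 10 cache servers
servers = ["cache%d" % i for i in range(10)]
chosen = pick_server("config", servers)
```

The same key always lands on the same server, and every web server computes the same answer without the cache servers needing to know about each other.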

However, if you add or remove memcache servers from the array, it changes N, so H(key)%N now points most keys at a different server and all of a sudden your keys are on different servers. Now, if your site is under low load, this isn’t that big a deal- you just dumped your whole cache and it will build back up and be fine. But if you are running memcache because you really need it, your site just went down as all the web-servers just started pounding directly on the database with every request. Right now DeepRockDrive has a pretty unique situation where we get huge spikes of traffic (that are mostly predictable- showtime) during which the memcache servers really save our bacon, and most of the time we can clear them out more or less safely, but most big sites are going to have a more consistent traffic pattern and would have a harder time with this.
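To get a feel for how much of the cache gets dumped, here is a small Python sketch that counts how many keys change servers when a pool grows from 10 to 11. CRC32 and the key names are assumptions for illustration. With a uniform hash a key only stays put when H(key)%10 equals H(key)%11, which is about 1 key in 11.

```python
import zlib


def server_index(key, server_count):
    """H(key) % N with CRC32 standing in for the client's hash function."""
    return zlib.crc32(key.encode("utf-8")) % server_count


def fraction_remapped(keys, old_count, new_count):
    """Fraction of keys whose server changes when the pool size changes."""
    moved = sum(
        1 for key in keys
        if server_index(key, old_count) != server_index(key, new_count)
    )
    return moved / len(keys)


keys = ["user:%d" % i for i in range(10000)]
moved = fraction_remapped(keys, 10, 11)  # expect roughly 10 in 11 keys to move
```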

This also means that if a memcache server goes down you can’t just pull it out of your configuration (at least under normal load). You really really need to replace it. The easiest (although resource intensive) way to do this is to just have a hot-spare server or two in your rack. If one of your memcache servers goes down, you map that spare to the same IP and bring it up. You just lost a portion of your cache (10% if you have 10 servers, etc.), but it’s way better than losing the whole thing. A more complicated setup would be to run multiple instances of memcache on every machine. So if you have 10 memcache machines, you run each with 3 IPs and it looks like your whole array is 30 “servers”. If one goes down, you bring those 3 IPs up on 3 of the other machines, distributing some of its load to those machines.
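The multiple-instances setup can be modeled as a table from virtual “server” slots to physical machines. The sketch below is Python with made-up machine and slot names, and the round-robin choice of which survivors take over is my assumption. The point it shows is that the list of 30 slots the clients hash over never changes, so only the failed machine’s share of the cache is lost.

```python
import zlib


def build_slots(machines, slots_per_machine):
    """Map each virtual server slot (one IP) to the machine hosting it."""
    slots = {}
    for machine in machines:
        for i in range(slots_per_machine):
            slots["%s-ip%d" % (machine, i)] = machine
    return slots


def fail_over(slots, failed_machine):
    """Move the failed machine's slots onto surviving machines, round-robin."""
    survivors = sorted(set(slots.values()) - {failed_machine})
    if not survivors:
        raise ValueError("no surviving machines to take over")
    orphaned = sorted(s for s, m in slots.items() if m == failed_machine)
    for n, slot in enumerate(orphaned):
        slots[slot] = survivors[n % len(survivors)]
    return slots


def slot_for_key(key, slot_names):
    """Clients hash over the slot list, which stays the same across a failure."""
    return slot_names[zlib.crc32(key.encode("utf-8")) % len(slot_names)]
```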

We haven’t had a chance to fully work out the scalability of the standard memcached running with multiple instances on the same box so far. We have played with it some, and on our 8-core boxes, even with the right threading libraries, we haven’t gotten close to maxing it out with a single instance of memcache. It looks like there may be some I/O limits, but I can’t be sure what is actually going on. Still, the notion of running multiple instances on the same box seems like a fairly reasonable one for scalability and for these fail-over flexibility cases.

The other tricky issue is that memcache gives you a balanced distribution of keys but does not necessarily give you a balanced distribution of access. Let’s say you had some runtime configuration information that you wanted to persist on your site. The easy thing to do would be to save it in a key called ‘config’ and just retrieve that key on every request.

What you have accomplished here is to introduce a nice hard scalability limit into your system. Memcache isn’t actually that much faster than MySQL is for basic queries. If MySQL can cache the query well (as it would be able to for a query on a simple table that just gets hit over and over), the performance of the two will be pretty similar. Where memcache shines is that, because of how its keys and hashing work, it can transparently distribute that load over multiple servers. So the person building the app with the ‘config’ key will have something that looks great while it’s small and on one caching server. Then they try to apply it to a high-load site with multiple servers, and all the traffic still just goes to the one memcache server (since it’s one key that gets hashed the same every time), and they will typically be stumped as to why the performance isn’t better.

The way around this is to generate keys that look something like-

'config-'.rand(0,9)

(php syntax)

At first this is counter-intuitive. I’m storing the same thing in 10 keys? That means that when they expire I’m going to have to go back and do 10x the initial loading of this object (whether from the database or config files or whatnot). However, at the cost of a very small # of those database queries (they only happen once every 5 or 60 minutes right), I’m spreading my keys out across my memcache array and the result is that the load gets spread smoothly across my whole array. I can even do a couple of slick extra things like every time I refresh the config data write to all 10 keys at once, resulting in no extra load on the database (except for a race condition I’ll cover in a future post) and just a small amount of occasional load on one web server.
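A rough Python sketch of the spread-out key trick, including the “write to all 10 keys at once” refinement. DictCache is a made-up stand-in for a real client (it ignores the TTL), and read_config, load_config and the 300 second TTL are assumptions for illustration.

```python
import random

REPLICAS = 10  # matches the rand(0,9) range used for the key suffix


class DictCache:
    """Hypothetical stand-in for a memcache client; the TTL is ignored."""

    def __init__(self):
        self.items = {}

    def get(self, key):
        return self.items.get(key)  # None on a miss

    def set(self, key, value, ttl):
        self.items[key] = value


def read_config(cache, load_config, ttl=300, rng=random):
    """Read one randomly chosen replica; on a miss, reload and rewrite them all."""
    key = "config-%d" % rng.randint(0, REPLICAS - 1)
    value = cache.get(key)
    if value is None:
        value = load_config()  # one trip to the database or config files
        for i in range(REPLICAS):
            cache.set("config-%d" % i, value, ttl)
    return value
```

Because each replica key hashes independently, reads of the same config data end up spread over several cache servers instead of all hitting one.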

I started this post mentioning Microsoft Velocity and then went into memcache- looping back to Velocity, in typical Microsoft fashion it looks like it’s a much more complicated solution, but it also automatically deals with some of the above issues. As far as I can tell from a few architecture diagrams the servers maintain knowledge of a cluster (memcache servers have no idea about each other) and I’d assume they automatically deal with some of the fail-over and “add a server” cases. It also has a more explicit concurrency model- although memcache supports building things with much of the same concurrency behavior, you need to manage a lot more of it yourself.

Looking forward to checking out Velocity more later. In particular I’m interested in when the protocol to talk to the servers will be published and whether there will be support for PHP/Python clients talking to these servers.

posted in Developers, Microsoft, Software, Technology | 6 Comments

12th May 2008

Twitter, Ruby on Rails and Scalability

Blaine Cook, the former CTO of Twitter, writes about scalability. Twitter has often been pointed to as an example of the kind of problems that a Ruby on Rails application will encounter when trying to really scale big. He points out that languages don’t scale, architectures do.

Which is right. The problem isn’t Ruby. It’s Ruby on Rails. Ruby is just a language. Ruby on Rails is an architecture that makes database interactions sometimes _too_ automatic. Unfortunately Blaine’s post seems to miss this distinction.

It’s possible to build scalable applications with Ruby on Rails, but for all that Rails advocates making writing web apps the right way “on rails”, it leads you down some poor paths with respect to scalability. I know some great developers who understand enough about how the inner stuff works that I’m sure they can make scalable Rails applications, but most of the Rails applications I’ve seen aren’t scalable.

posted in Developers, Software, Technology | 0 Comments

9th May 2008

Facebook Connect

Facebook just announced Facebook Connect, which lets you use Facebook to authenticate on your own site. Except that, uh, it’s already possible to do that with the existing Facebook APIs, although other than our implementation at DeepRockDrive I haven’t seen many other sites do this. I suppose the Connect stuff makes the approach a bit more smooth and supports it more officially, but it’s all there already.

For us supporting Facebook authentication was a no-brainer. We care about getting a ton of people to our site to see our cool interactive shows. If we can skip a whole registration process and all that mess and have them just click the Facebook icon, and it works, perfect.

posted in Developers, Facebook, Software, Technology | 0 Comments

6th May 2008

Visual Studio 2008 Crashes

I’ve been having a ton of issues with Visual Studio 2008 since it came out. I mostly use it as a text editor at the moment for editing my PHP files. I know this is a bit of a wacky scenario, but you can open a web project, get a good view of your directories, and I’m very used to all its shortcuts, etc (I feel lost in Eclipse, other editors).

One of my favorite VS features has been the fairly powerful “find in files” command. I’m sure Eclipse has something equivalent, but I haven’t found it yet. Unfortunately this has been routinely crashing in VS 2008, to the point where I have a copy of VS 2005 open just to search in my project. Then today Visual Studio 2008 crashes just opening one of the PHP files in my project.

I’ve seen a bunch of reports of similar things on the net, although most seem to be issues with 64-bit systems. Searching around a bit I found Scott had posted a link to a hot-fix patch roll-up. The good news is that it looks like it fixed my “find in files” problem, but it doesn’t seem to have fixed the problem opening that one file. I’ll post more as I figure it out later.

Update- After editing that problem file in VS 2005, the crash went away in VS 2008- so far the patch is a big success. Also, Scott Guthrie continues to score points in the “most responsive and helpful Microsoft person ever” category by jumping on my post with an offer to help. Given his 5 gold stars (I’ve been playing too much Rock Band) and 1 million+ score in that contest, I’m sure no one can catch him anyway.

posted in Developers, Software, Technology | 1 Comment

2nd May 2008

Microsoft Mesh Second Impressions

I mentioned last week that I was going to have a hard time playing with Mesh because it won’t run with UAC disabled. First of all, Microsoft did post an explanation of why the current version requires UAC. I understand the explanation, but it is one of the typical cases where teams have to build on this internal technology and that one and this other one, none of which are ready for prime-time and all of which aren’t necessarily focused on actually providing user value…

In any case, I did install it on two XP machines. I’m about to shut it off. I noticed one of them was running a bit slow lately and checked the task manager. Mesh is using 490MB of memory, and has consumed 9 hours of CPU over the past couple of days. And in case it wasn’t clear, I haven’t really been asking it to DO anything…

I’m also mildly annoyed that the process is named “Moe.exe” so it took a small amount of digging to figure out what it was that was consuming all my system resources. They also have a second process called MoeMonitor.exe that somehow has consumed over 1.25 hours of CPU and 33MB of RAM. It appears to be responsible for the task-bar icon.

posted in Microsoft, Software, Technology | 2 Comments

24th April 2008

Mesh First Impressions

I wanted to post some first impressions playing with the new Microsoft Mesh. Unfortunately I’ve barely been able to use it. Many of my friends I’m sure will complain about the lack of Mac support so far, but for me the lack of Windows support is worse.

It doesn’t work on my Vista machines- for some reason at this point it requires User Account Control (UAC) to be “on” (but of course won’t run on an account that doesn’t have admin privileges). I’m not going to rant too much about this yet since I’m hoping it’s just because it’s an early release. But this one is really confusing unless it’s something some wacko did to try to push UAC (EVIL!) on people. There is really nothing you can do with UAC on that won’t work with it off. Again, I’m hoping this is just one of those wacky bugs that got punted at the last minute for this early release…

It also doesn’t work on Server. So one of the more useful machines for me to sync to- the server box that I keep all my important data on- can’t play. This isn’t a normal “home” scenario, except that Microsoft is trying to push Home Server (I’m not running Home Server, but the install error pretty clearly said XP SP2 and Vista only). Again, I’m hoping this is just early release stuff.

Then one of my XP machines has Firefox as its default browser. The Mesh client bits keep popping URLs, but since Firefox can’t run the ActiveX associated with them, it fails. Here I feel some sympathy with Microsoft. The ideal solution would be, if your UI requires Internet Explorer, to launch IE directly instead of the machine’s configured “default browser”. However, given the wacky government regulation of Microsoft they would likely get in trouble for doing this sensible thing, and so it’s yet another aspect that doesn’t really run correctly for me.

So, someday I hope to report on Mesh itself, but it’s not going to happen yet.

posted in Software, Technology, Vista | 2 Comments

6th February 2008

Antechinus JavaScript Editor v10

A good programmer’s editor is probably one of the most important tools you can have. I’ve been working with PHP and JavaScript a bunch lately and the fact that I mostly use Visual Studio is really quite sad. It doesn’t know anything about PHP and so far I’ve not been that impressed by editing JavaScript in it either (it’s OK for debugging JavaScript).

So I got mail today that the Antechinus JavaScript editor is out with version 10. The big deal in the new version is that they merged the PHP and JavaScript editors. I have played with both before but never stuck with them at all, mostly because it sucked that there were two different ones. The notion of one environment for both sounds great.

First of all, it doesn’t appear to have a “find in files” function. I use this all day long in VS. Find everywhere that calls foo(). Find this variable somewhere in the code-base. Especially when tackling a code-base that you aren’t familiar with yet this is crucial. The editor has a notion of a “project” but it seems limited to providing a file browser and uploading things via FTP.

It does have a handy thing that picks out all the functions in the current file, but again, it doesn’t know anything beyond the current file. So no help to find where foo() is declared.

It doesn’t really support mixing HTML and JavaScript. So debugging JavaScript in stand-alone JS files is fine, but if you put it in your HTML file you can only execute a little bit at a time by selecting it and saying “execute”. It’s also not clear when you do that (or otherwise try out your JavaScript) how it deals with bringing in includes and libraries and what-not.

It does let you run PHP stuff and do a syntax check, but it’s not really integrated. It’s just running PHP.exe to do that, and displaying the results in a text window. You can’t even click on errors to go to the right line, and they do nothing to help with the poor error messages in the PHP engine (two examples- you can be missing a close parenthesis but it tells you unexpected ‘{’, or forget to close a string and you get unexpected T_STRING, both referencing the wrong line. And yes, I know why the compiler gives the errors there, but it’s not user friendly and not really 2008-state-of-the-art).

I’ll play with it for a few more days but it feels like it’s still in the “not quite enough to be useful” category.

posted in Developers, Software, Technology | 0 Comments