9th June 2009

Browser File Upload

Uploading files larger than a few hundred KB from a browser has been a problem for years. The UI available in the browser is very limited, and relying on a single HTTP request that might take minutes or hours (and that you have to restart from scratch if it fails) often turns into a huge source of user frustration. There is an extra flaw as well: the TCP connection can fail before the whole file is transmitted, and depending on the circumstances the server might not be able to tell whether the whole file was actually received. A bunch of sites use various ActiveX or Java controls, but those have typically been a pain to install and/or flaky.

I just discovered that Silverlight can be used to create much more functional upload controls. Here is one example in the CodePlex code library. Granted, users need to have Silverlight installed already, but once they do it becomes much easier to build a good user interface while the actual transfer sends the file in chunks, so any piece that fails can be retried without starting over. Combine that with the Azure Blob chunked-PUT mechanism and you can build a very robust storage mechanism right in the browser. I’m looking forward to trying it out.

One last thought- it would be useful to define a standard protocol for uploading content in chunks (and yes, this is distinct from an HTTP PUT/POST with chunked transfer encoding). Something along the lines of what the Azure Blob store does, but defined as a standard so that various controls and services can all interoperate.
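To make the chunk-and-commit idea concrete, here is a minimal sketch in Python. It does not call any real Azure or Silverlight API: FakeBlockStore, put_block, commit and upload_in_chunks are hypothetical names invented for illustration, and the "server" is just an in-memory object that can simulate a dropped connection. The point is only that a failure costs one chunk rather than the whole file, and that the final commit lets the server know it has every piece.

```python
CHUNK_SIZE = 4  # tiny for illustration; a real upload would use much larger blocks


class FakeBlockStore:
    """Hypothetical in-memory stand-in for a service that stages blocks, then commits them."""

    def __init__(self, fail_on=()):
        self.blocks = {}               # block id -> bytes staged so far
        self.committed = None          # assembled content after a successful commit
        self._fail_on = set(fail_on)   # block ids that fail once, simulating a dropped connection

    def put_block(self, block_id, data):
        if block_id in self._fail_on:
            self._fail_on.discard(block_id)
            raise ConnectionError(f"simulated failure on block {block_id}")
        self.blocks[block_id] = data

    def commit(self, block_ids):
        # The commit names every block in order, so a missing piece is detected
        # instead of silently producing a truncated file.
        missing = [b for b in block_ids if b not in self.blocks]
        if missing:
            raise ValueError(f"missing blocks: {missing}")
        self.committed = b"".join(self.blocks[b] for b in block_ids)


def upload_in_chunks(data, store, max_attempts=3):
    """Send data block by block, retrying only the block that failed."""
    block_ids = []
    for offset in range(0, len(data), CHUNK_SIZE):
        block_id = f"{offset // CHUNK_SIZE:06d}"
        chunk = data[offset:offset + CHUNK_SIZE]
        for attempt in range(1, max_attempts + 1):
            try:
                store.put_block(block_id, chunk)
                break
            except ConnectionError:
                if attempt == max_attempts:
                    raise
        block_ids.append(block_id)
    store.commit(block_ids)
    return block_ids


payload = b"hello chunked upload"
store = FakeBlockStore(fail_on={"000002"})  # the third block fails once
ids = upload_in_chunks(payload, store)
print(len(ids), store.committed == payload)
```

A real client would also persist the list of block ids that have already succeeded, so that an upload interrupted by a browser restart could resume rather than begin again.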

posted in Developers, Networking, Silverlight, Technology | 0 Comments

9th June 2009

Regular Expressions Book

Coding Horror has a post highly recommending the new book “Regular Expressions Cookbook”. Now, I have mixed feelings about regular expressions and get concerned when I see them since they are often overused and when misused can result in code that is very hard to understand and debug. Having said that, when used in the right situation they can be a perfect solution to otherwise complicated text parsing & validation. But the art of creating them is often a lot of voodoo, so a book that has good reference materials and examples would be very helpful. I’ll report on this one next week after I have a chance to look through it a bit.

posted in Developers, Software, Technology | 0 Comments

5th June 2009

Azure Blob Storage as a Good HTTP Application

As an old HTTP guy I often get nervous about new services. They tend to violate all sorts of key HTTP architecture concepts and just take advantage of the flexibility to do whatever.

I’ve been really happy to see that Azure Blob storage actually gets this stuff right. They have valid REST semantics with a good URL namespace, and they support GET and PUT with the right kind of range headers, plus ETags and conditional operations. They also have a smart design for uploading a large blob in multiple pieces (which works around one of the bigger flaws in the older WebDAV support). Anyway, it’s great to see a team do all their homework and get these details right- I suspect this will really pay off over the long lifespan of a service, as it fits in cleanly with the rest of the web services world. (Note- I’m not saying other competing products aren’t also doing these things right; I haven’t researched those details lately.)
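For readers who have not used ETags and conditional operations, here is a rough sketch of what they buy you. This is not the Azure Blob API: FakeResource and PreconditionFailed are hypothetical names, and the resource lives in memory. It only mimics the HTTP ideas of a Range read, an ETag that changes whenever the content changes, and a PUT that is refused (as an HTTP 412 would be) when the caller’s If-Match value is stale.

```python
import hashlib


class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 Precondition Failed response."""


class FakeResource:
    """Hypothetical in-memory resource that mimics ETag and Range behaviour."""

    def __init__(self, body):
        self.body = body

    @property
    def etag(self):
        # Any value that changes whenever the body changes will do as an ETag.
        return hashlib.sha256(self.body).hexdigest()[:16]

    def get(self, byte_range=None):
        """Return (etag, bytes); byte_range is an inclusive (first, last) pair, like a Range header."""
        if byte_range is None:
            return self.etag, self.body
        first, last = byte_range
        return self.etag, self.body[first:last + 1]

    def put(self, new_body, if_match=None):
        """Replace the body only if the caller's ETag still matches (If-Match)."""
        if if_match is not None and if_match != self.etag:
            raise PreconditionFailed("resource changed since it was read")
        self.body = new_body
        return self.etag


resource = FakeResource(b"0123456789")
tag, part = resource.get(byte_range=(2, 5))   # partial read of bytes 2..5
resource.put(b"first writer", if_match=tag)   # succeeds, and the ETag changes
try:
    resource.put(b"second writer", if_match=tag)  # stale ETag, so this is refused
except PreconditionFailed as exc:
    print("rejected:", exc)
```

The second writer finds out that it was working from an old copy instead of silently overwriting the first writer’s change.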

posted in Azure, Developers, Networking, Standards, Technology | 0 Comments

7th January 2009

Calling a SOAP Service from PowerShell

Ever had that experience where you go do some work and then stop yourself and think “There has got to be an easier way to do this”?

I’ve been building some stuff that is mostly exposed as a web-service. I wanted to call it from PowerShell so I made a nifty set of Cmdlets that call it. But really each of them is just marshaling pretty much directly into the underlying SOAP service with no logic at all. I’ve got to assume someone already came up with a generic solution for this?

posted in Developers, Software, Technology | 0 Comments

6th January 2009

Database Design

So here is a classic database design quandary. Let’s say you are developing a service with accounts and each account has a set of parameters associated with it- have they bought the service, how much storage do they have, how many user licenses, etc.

Now, the set of things you want to store is going to change over time for sure. So option 1 is to design a completely flexible system for storing key-value pairs-

CREATE TABLE [AcctParams1] (
[AcctId] [int] NOT NULL,
[Key] [nvarchar] (16) NOT NULL,
[Value] [int] NOT NULL
)
-- One row per (account, key) pair, so AcctId alone cannot be the primary key.
-- The unique clustered index on (AcctId, Key) enforces uniqueness instead.
CREATE UNIQUE CLUSTERED INDEX [AcctParamsIdx] ON [AcctParams1] ([AcctId], [Key])

A couple of notes before I move on. First of all, it’s very important to create the clustered index with AcctId as the leading column. You want to make sure that all the rows for a given account are grouped together on disk. If not, when the database goes to load just one account it might have to do I/Os all over the place just to get the scattered AcctParam values.

As I said above, this approach is completely flexible, ideally involving no database schema changes ever. But the storage is somewhat less efficient and you are going to have to do multiple queries to retrieve everything you need about an account (one for the account itself, another for params).
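Here is a small sketch of what reading an account’s params looks like under option 1. It uses Python with SQLite rather than SQL Server, purely so it can run anywhere; the sample rows and the load_params helper are made up for illustration. SQLite has no CREATE CLUSTERED INDEX, so the sketch assumes a composite primary key on a WITHOUT ROWID table as the closest equivalent (rows are stored in key order, which keeps one account’s rows together).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE AcctParams1 (
        AcctId INTEGER NOT NULL,
        [Key]  TEXT    NOT NULL,
        Value  INTEGER NOT NULL,
        PRIMARY KEY (AcctId, [Key])
    ) WITHOUT ROWID
""")
conn.executemany(
    "INSERT INTO AcctParams1 (AcctId, [Key], Value) VALUES (?, ?, ?)",
    [
        (1, "Purchase", 1),
        (1, "UserLimit", 25),
        (2, "Purchase", 0),
    ],
)


def load_params(conn, acct_id):
    """Fetch every key-value row for one account and return them as a dict."""
    rows = conn.execute(
        "SELECT [Key], Value FROM AcctParams1 WHERE AcctId = ?", (acct_id,)
    ).fetchall()
    return dict(rows)


print(load_params(conn, 1))
```

Adding a brand new parameter is just another INSERT with a new key, which is the flexibility this option is buying. The cost is that the database no longer knows which keys are valid or what type each value should have.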

The other way to do it is to just build out a table with explicit columns for these parameters in the classic database way-

CREATE TABLE [AcctParams2] (
[AcctId] [int] NOT NULL IDENTITY,
[Purchase] [int] NOT NULL,
[UserLimit] [int] NOT NULL,
[StorageLimit] [int] NOT NULL
)

With this approach you need to change the database schema every time you want to add a new param. However, I think sometimes people get too freaked out by database schema changes (often because of painful “upgrades” in the past). For example, suppose you just needed to do this-

ALTER TABLE [AcctParams2] ADD [TransferLimit] [int] NOT NULL DEFAULT 5

On modern databases this kind of thing tends to execute pretty quickly, but most importantly, if your code is written carefully you can run this on your SQL box while your service is still online, without having to simultaneously update the code of the service. I ran a test: I created the above table and put a million rows of random data in it (which took 229 seconds on my test box). The above ALTER TABLE took 7 seconds to execute.

Guess what? We can do much better. Try this one-

ALTER TABLE [AcctParams2] ADD [TransferLimit] [int] NULL

By making it a nullable column it took… 0 seconds to execute, since the database only has to update the table definition instead of writing a default value into each of the million existing rows. And again, if your app is built the right way, existing code will continue to work unchanged. So you can use a database schema that really defines the parameter names and types (in effect, SQL is handling the key-value stuff for you), but still has minimal upgrade impact on the uptime of your service.
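What “built the right way” means here can be sketched as follows. Again this is Python with SQLite standing in for SQL Server, and old_reader and transfer_limit are hypothetical helpers. The assumption is that existing code names the columns it reads (no SELECT *), and that newer code treats NULL as “not set yet” and supplies the default itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE AcctParams2 (
        AcctId       INTEGER PRIMARY KEY,
        Purchase     INTEGER NOT NULL,
        UserLimit    INTEGER NOT NULL,
        StorageLimit INTEGER NOT NULL
    )
""")
conn.execute(
    "INSERT INTO AcctParams2 (Purchase, UserLimit, StorageLimit) VALUES (1, 25, 100)"
)


def old_reader(conn, acct_id):
    """Existing code: it names its columns, so an extra column does not disturb it."""
    return conn.execute(
        "SELECT Purchase, UserLimit, StorageLimit FROM AcctParams2 WHERE AcctId = ?",
        (acct_id,),
    ).fetchone()


def transfer_limit(conn, acct_id, default=5):
    """Newer code: a NULL means the value was never set, so fall back to a default."""
    row = conn.execute(
        "SELECT COALESCE(TransferLimit, ?) FROM AcctParams2 WHERE AcctId = ?",
        (default, acct_id),
    ).fetchone()
    return row[0]


before = old_reader(conn, 1)
# The schema change happens while the "old" code is still in service.
conn.execute("ALTER TABLE AcctParams2 ADD COLUMN TransferLimit INTEGER")
after = old_reader(conn, 1)

print(before == after, transfer_limit(conn, 1))
```

The trade-off is that the default of 5 now lives in application code rather than in the schema, so every reader of the new column has to agree on it.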

A third approach is to have something like-

CREATE TABLE [AcctParams3] (
[AcctId] [int] NOT NULL IDENTITY,
[Params] [text] NOT NULL
)

And stuff XML in the text column. Again, lots of flexibility, but you need to deal with all the XML parsing, and it tends to be difficult to get the database to help you with any queries or analysis of that data. I’ve also seen multiple efforts that went down this path where the XML parsers ended up rigid enough that they actually introduced tons of inflexibility and upgrade hassle. If you need to touch every row to upgrade the XML with this approach, you just made yourself a huge problem.
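To show why the database cannot help much with this layout, here is a sketch of answering a simple question (“which accounts have a user limit above 10?”) when the params live in an XML blob. It is Python with SQLite, and the XML element names and the accounts_with_user_limit_over helper are invented for illustration. With explicit columns this would be a one-line WHERE clause; here every row has to be pulled back and parsed in application code.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE AcctParams3 (AcctId INTEGER PRIMARY KEY, Params TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO AcctParams3 (AcctId, Params) VALUES (?, ?)",
    [
        (1, "<params><UserLimit>25</UserLimit><StorageLimit>100</StorageLimit></params>"),
        (2, "<params><UserLimit>5</UserLimit></params>"),
    ],
)


def accounts_with_user_limit_over(conn, threshold):
    """Scan every row and parse its XML, because SQL cannot see inside the blob."""
    matches = []
    query = "SELECT AcctId, Params FROM AcctParams3 ORDER BY AcctId"
    for acct_id, xml_text in conn.execute(query):
        node = ET.fromstring(xml_text).find("UserLimit")
        if node is not None and int(node.text) > threshold:
            matches.append(acct_id)
    return matches


print(accounts_with_user_limit_over(conn, 10))
```

Note that the parser has to tolerate a missing element, which is the kind of leniency that keeps old rows readable without rewriting them.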

One last note that applies to either approach 2 or 3- you can combine these params into the base [Account] table. Whether you want to do this or not depends on the access patterns. If 90%+ of the time you access one you also want the other (i.e., you are always writing SELECT * FROM [Account] JOIN [AcctParams] ON [Account].[AcctId] = [AcctParams].[AcctId]), you might as well combine them so that it’s always just one I/O. On the other hand, if you often want just one or the other AND they start to get large (lots of values, especially to the point where a row gets close to the page size), it can make sense to split them out.

As with all optimization, there are no hard and fast rules- just guidelines and good places to test alternatives.

posted in Developers, Storage, Technology | 0 Comments

3rd January 2009

Holiday Project- Windows Azure

Dare embarked on a holiday project too, building some stuff using Azure. I’ve been meaning to get around to that one myself. In any case, there are some great observations there. I’ll have to write up my own likes and dislikes about PowerShell shortly.

posted in Azure, Developers, Technology | 0 Comments

16th December 2008

ASP.NET Web Service Test Form

ASP.net has a great feature for building web services: it generates a nice HTML form that you can use to test your web service without constructing a SOAP client. The catch is that by default it only works for requests from the local machine, and I always forget how to turn it on for remote requests.

Here is the trick- add this to your web.config inside the system.web section-

<webServices>
    <protocols>
        <add name="HttpGet"/>
        <add name="HttpPost"/>
    </protocols>
</webServices>

posted in Developers, Software, Technology | 0 Comments

4th September 2008

Demo Version Problems

This morning I wanted to check out the latest stuff from Nero- I’m looking for some good video transcoding solutions. The problem with this space is that with all the proliferation of codecs and container formats it’s often pretty hard to tell whether a given product will work for a specific scenario or not. I’ve got the AVS Media tools and they work for some things but not for others, so I was hoping that Nero Recode could do the trick.

After a little poking around I see that Nero has a 15-day trial version. Great, I’ll check it out and if it works I’m happy to buy it. I apply for a trial key, download the 150 MB installer and suffer through two reboots (one to remove an older version of some Nero stuff, another to install the new stuff).

But then I find out that the feature of encoding into MPEG-4 is also removed from the trial. When I checked back on the trial download page, this information was there, but only if you opened a little “Additional Information” tab on the page- basically it was totally hidden.

Creating mechanisms for demo software can be a pain- most of them are somewhat annoying. For this kind of thing you can do 15-day trials, limit the encoding length (only 60 seconds?), watermark the output, etc. But the key is that you need to give people a real taste of the software, and this particular implementation sucks- the limitations were hidden on their web site, so I wasted my time downloading and installing this useless demo, plus I can’t even verify that the functionality I’m interested in works at all.

Oh well…

posted in Business, Developers, Software, Technology | 2 Comments

2nd September 2008

New ASP.net MVC stuff

Scott Guthrie has a post on the latest preview of the new ASP.net MVC stuff. This looks like a huge step forward in the standard model for ASP.net applications- a bunch of us have been ignoring much of the built-in controls model and doing stuff similar to this for years, but the new version from the ASP.net team looks more elegant than what I’ve been doing all along, and there is always a big bonus when you are using the standard built-in stuff.

I also think it’s worth pointing out that this stuff is some pretty good validation for the initial architecture of ASP.net. It feels like they did a really good job layering stuff, so they had the basic runtime, the HTTP Handler level over that, the ASPX page model over that, and then a specific page processing / control action model. All along it’s been possible to take or leave various pieces, for example the urlrewriting.net stuff that plugged in and changed URL namespace stuff while the rest continued to work. Now they have evolved the architecture with a really new way to wire up pages, and it’s still consistent with the existing stuff and as far as I can tell doesn’t break or change any of the basic architecture. It’s pretty rare to see a platform evolve so cleanly years and years after its initial version (for example, I’d have to say that it seems like the various Java technologies have had a rougher time evolving cleanly).

posted in Developers, Software, Technology | 0 Comments

17th August 2008

Dare on the Essence of REST

Dare Obasanjo posted an excellent description of the essence of the REST architectural style. Very good write-up, and it includes some good history and theory.

posted in Developers, Networking, Software, Standards, Technology | 0 Comments