Tuesday, April 29, 2008

On Scripts

I spent pretty much the entire day today onsite at a company for whom I've been writing custom software in my off hours for the last few months. When I reached the point where I was ready to import all their data (currently kept in a series of Excel spreadsheets) into my application, they told me they couldn't get all of it ready in time. They had most of it; the only column missing from their spreadsheets was the "county" column, which (of course) holds the county for each record. They had somebody manually looking up the zip code of each one of the 4000 records in the spreadsheet and filling in its county by hand.

Wow, you couldn't ask for a better opportunity to show off.

I asked them to give me an hour with an Internet connection, and here's what I came up with (code is in C#.NET):


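using System;
using System.Net;
using System.Text;
using System.Threading;
using Excel = Microsoft.Office.Interop.Excel;

class Program
{
    static void Main(string[] args)
    {
        // Open the workbook in a visible Excel instance so progress can be watched.
        Excel.Application excel = new Excel.Application();
        excel.Visible = true;
        Excel.Workbook excelWorkbook =
            excel.Workbooks.Open("store.xls",
                0, false, 5, "", "", false,
                Excel.XlPlatform.xlWindows, "",
                true, false, 0, true, false, false);

        Excel.Worksheet frontSheet =
            (Excel.Worksheet)excelWorkbook.Sheets.get_Item("Sheet1");

        // Walk rows 2 through 4123, one web service call per row.
        for (int i = 2; i < 4124; i++)
        {
            try
            {
                // Column J holds the zip code.
                Excel.Range zip = frontSheet.get_Range("J" + i.ToString(), Type.Missing);

                string builder = "http://ws.fraudlabs.com/ZIPCodeWorldUS_WebService.asmx/ZIPCodeWorld_US?";
                builder += "ZIPCODE=" + zip.Text.ToString().Trim();
                builder += "&LICENSE=05-Y34L-X86F";

                WebClient web_client = new WebClient();
                byte[] x_response = web_client.DownloadData(builder);
                string response = Encoding.Default.GetString(x_response);

                // Cut the county out of the XML response by hand.
                // "<COUNTY>" and "</COUNTY>" are assumed element names;
                // check them against the service's actual response.
                string strParam = "<COUNTY>";
                int indexOpen = response.IndexOf(strParam) + strParam.Length;
                int indexClose = response.IndexOf("</COUNTY>");
                string countyName = response.Substring(indexOpen, indexClose - indexOpen);

                // Column I receives the county.
                Excel.Range county = frontSheet.get_Range("I" + i.ToString(), Type.Missing);
                county.Value2 = countyName;

                // Pause between requests to go easy on the web service.
                Thread.Sleep(500);
            }
            catch (Exception e)
            {
                // Log the failing row number and keep going.
                Console.WriteLine("" + i.ToString() + ": " + e.Message);
            }
        }
    }
}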


No big thing. It's about 60 lines of code, and for some reason it's one of the most satisfying things I've ever written. Notice the following things about the code:

- It's not well abstracted.
- I did not package it into a library for reuse.
- There are many hard-coded values.
- File paths are hard-coded relative paths.
- It's not particularly efficient (I could have cached repeated results, for example, and applied them without making the web service call).
- It doesn't follow good naming conventions.
- The exception handling is minimal.
- It is not "flexible" by any definition of the word.

Each one of the above points is in direct violation of the software principles that I strongly advocate. But the fact of the matter is that this little chunk of code turned a 40-hour project for one of the client's office staff into about 15 minutes of wait time. It took a need and provided a solution as quickly as possible.
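
For the curious, the caching mentioned in the list above might look roughly like the sketch below. It was not part of the script that actually ran; `CountyCache` and its `lookup` delegate are hypothetical names, with `lookup` standing in for whatever code makes the web service call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: remembers the county returned for each zip code,
// so the lookup runs at most once per distinct zip.
class CountyCache
{
    private readonly Dictionary<string, string> cache =
        new Dictionary<string, string>();
    private readonly Func<string, string> lookup;

    // 'lookup' is an assumed stand-in for the web service call.
    public CountyCache(Func<string, string> lookup)
    {
        this.lookup = lookup;
    }

    public string GetCounty(string zip)
    {
        string county;
        if (!cache.TryGetValue(zip, out county))
        {
            // Only reached the first time a given zip is seen.
            county = lookup(zip);
            cache[zip] = county;
        }
        return county;
    }
}
```

With something like this in place, the loop would ask the cache for each row's county, and only zip codes that haven't been seen yet would cost a network round trip.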

I think it's possible to get lost in the ideals of "good" software. Believe me, I'm all for sound architecture and best practices. But honestly, the best measure of any piece of software is how much it helps the person who is using it, and sometimes what you really need is a script.

5 comments:

Damian Brady said...

I think the other thing to note about small scripts and applications like this is that they're clearly written for one small purpose and generally a single use.

Had there been a requirement to ship this as part of a product or use it continuously in an existing application, I'm sure you would have done it differently.

I totally agree with your assertion that in these cases, it's a mistake to get hung up on the "right" way to do things.

Stephen said...

5 minutes got me this "one liner" on my Linux machine...

zippy="30013"; curl -s "http://www.zipinfo.com/cgi-local/zipsrch.exe?cnty=cnty&zip=$zippy&Go=Go" | grep County | awk -F '>' '{print $29}' | awk -F '<' '{print $1}'

Just set "zippy" in a for loop feeding it the zip codes and format the output as db update statements (another 5 mins or less) and you're done.

Ethan Vizitei said...

@stephen

That's pretty slick, nicely done!

Anonymous said...

you just demonstrated how and why programming is an art.

It's also called developing to the need. All you needed was a script, sweet script. Anything more than a script in that situation would have been unethical if you were charging by the hour and it took extra time.

Anonymous said...

Heh, I just stumbled here because a friend bookmarked a previous post. This snippet of code proves one of my life's quests. I am a programmer because it found me. I _had_ to write code. But I also went to school and most people who suffer from a college education feel that they need to adhere to this whole set of known good practices all of the time, most of which you have in your list. And yet you recognize exactly what that code accomplished. You did 40 hours of work in 1 hour. That is awesome! I write this kind of stuff all the time. Once you write a _lot_ of code, every day, 12 hours a day, what difference does 60 lines make? None. So yeah, you abstract, and design your larger projects well; the code that future code is going to use and build upon should be designed well. But if you have to just get something done in the moment, just get it done. It's as simple as that. If there's one thing I've learned from programming, it's that there is only so much time. It's all about outsmarting it.