Monday, June 30, 2008

Background Jobs for Rails applications

This past week, I've been working pretty hard on my startup application, but not in the common areas that I'm used to. By that I mean that everything I've done this week has involved processes that aren't kicked off by a web request.

Most of my work involves grinding out new features for my web-app, but then on Monday I came to several items in a row in my backlog notes that were going to be outside this realm. Although I have several specific examples of my own, you could probably think of hundreds of tasks like this: purging database tables, receiving emails into the database, sending users SMS or email notifications based on time-sensitive data, processing batches, generating nightly reports... I'm sure the list goes on and on. What's disturbing is something I saw far too frequently as I was crawling the web for a good solution: a random script full of copy-and-paste code that manually loads the rails environment ahead of time (or worse, PIECES of rails). Let me say right now: "Oh god, please don't ever do that". Yes, it works, but there's no REASON to go to that sort of trouble.

There are two ways I know of that you can load the rails environment easily: script/runner, and Rake.

If you're writing a script that you just want to run once, and it needs to hook into your rails app, just write it the way you would inside your application (go ahead and assume the presence of all the rails libraries) and then run it from the root directory of your application like this:

$> ruby script/runner my_script.rb


If you're building something that you want to run frequently, or even at regular intervals, just write the code the way you would in a rails model and package it into a rake task like this (this file goes in the "lib/tasks" directory of your rails app):

#server_tasks.rake

namespace :server do

  desc 'Send SMS Notifications'
  task :send_notifications => :environment do
    Notification.find(:all).each do |n|
      # assumes Notification defines its own #send (it shadows Object#send)
      n.send
      n.destroy
    end
  end

  desc 'Build Batch File'
  task :build_batch => :environment do
    t = Translator.new
    data_string = t.translate(Notification.find(:all))
    # the block form closes the file once the batch has been written
    File.open("new_batch.txt", "w+") do |file|
      file.puts(data_string)
    end
  end

  desc 'process emails'
  task :check_mail => :environment do
    MailFetcher.check_mail
  end

end



Now you can run it any time with the following:


$> rake server:check_mail


or you can have a cron job run it regularly with this entry in your crontab file:


5 0 * * * cd /path/to/your/app && /path/to/your/rake server:check_mail RAILS_ENV='production'


MUCH better.
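
One caveat once cron is doing the driving: if a run ever takes longer than the interval between runs, two copies of the same task can end up working at once. Here's a minimal sketch of one way to guard against that, assuming a lock file is acceptable for your setup. The helper name `with_exclusive_lock` and the `tmp/check_mail.lock` path are hypothetical, not part of Rails or Rake; only `File#flock` is standard Ruby.

```ruby
# Hypothetical helper: run the block only if no other process holds the lock.
# Returns the block's value, or nil when another run is still in progress.
def with_exclusive_lock(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0644) do |lock|
    # LOCK_NB makes flock return false right away instead of waiting.
    return nil unless lock.flock(File::LOCK_EX | File::LOCK_NB)
    yield
  end
  # Closing the file (when File.open's block ends) releases the lock.
end

# Possible use inside the rake task from above (sketch only):
#
#   task :check_mail => :environment do
#     with_exclusive_lock('tmp/check_mail.lock') { MailFetcher.check_mail }
#   end
```

A skipped run simply returns nil, so the next cron tick picks up whatever is waiting.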

Friday, June 27, 2008

using Ruby for IMAP with Gmail

Yesterday, I spent the day working on a problem involving emails. Basically, it can be summed up as follows: the Rails application I'm working on has certain model objects that need to be able to have some data populated via emails (an email gets sent to a certain address, and the app processes it and populates data based on the content).

Now, I'd read about ActionMailer before, and I knew that it had a nifty way to receive email messages and process them as they arrive, so I went and looked at its documentation on how to tackle that problem.

The most advocated method of handling the incoming email is to configure the Postfix server on your box to forward each email into your mailer script as it comes in, but I'm not a fan of that path for two reasons. First, I don't have any desire to worry about configuring a mail server. That probably makes me less "hardcore" in the eyes of any *nix masters out there, but given our time constraints I am forced to accept that I'm just not strong enough in that area yet to trust a business's success to it. Additionally, and probably more importantly, using a real-time forwarding approach means kicking up a new rails process EVERY TIME an email comes in. What if I get hundreds of emails in a single hour? Thousands? All of a sudden, crashing my server becomes as easy as spamming its mailbox. No, there must be a better way.

And there is. Rather than try and accept each email as it comes in, you could poll the email box with POP or IMAP every so often and thus use one session to process however many emails have arrived in the time since the last session. To me this was obviously superior, and after checking several blogs of other ruby community members I was ready to give it a shot.

So I started a gmail account for ease of testing, and wrote a script that looked something like this:


pop = Net::POP3.new("pop.gmail.com", 995) # 995 is the standard POP3-over-SSL port
pop.enable_ssl
pop.start('YourAccount', 'YourPassword')
if pop.mails.empty?
  puts 'No mail.'
else
  i = 0
  pop.each_mail do |m|
    File.open("inbox/#{i}", 'w') do |f|
      f.write m.pop
    end
    m.delete
    i += 1
  end
  puts "#{pop.mails.size} mails popped."
end
pop.finish


And this probably would have worked fine if not for one thing. For my purposes, I need SSL to be enabled, and the method on line 2 of the code example above ("enable_ssl") is an enhancement made in Ruby 1.9. Unfortunately, that's a development release, not yet production ready, and 1.8.6 (the current stable version, which I'm using) doesn't have SSL support in its POP library.

I briefly thought about adding the functionality to the 1.8.6 source code myself, but decided that would have to be a last resort tactic. Instead I switched to the IMAP library to see if I'd have any better luck. Here's a typical code sample you'd see for fetching mail with the ruby IMAP library:

imap = Net::IMAP.new('imap.gmail.com')
imap.authenticate('LOGIN', 'username', 'password')
imap.select('INBOX')
imap.search(['ALL']).each do |message_id|
  msg = imap.fetch(message_id,'RFC822')[0].attr['RFC822']
  MailReader.receive(msg)
  imap.store(message_id, "+FLAGS", [:Deleted])
end
imap.expunge()


This looked promising, but every time I tried to run it I got a very strange error back from the server. Basically, the script would die and print out the words "Not Supported" and a filename coming back from the authentication request (line 2 above, imap.authenticate). I went and looked at the docs, and saw that the authenticate method supports 2 types of authentication: 'LOGIN' (like above) and 'CRAM-MD5'. Either one can be passed in as the first argument, and furthermore some servers won't support one or the other.

Sure I had found my fix, I replaced 'LOGIN' above with 'CRAM-MD5', and in an anticlimactic script run I found I got the exact same error.

If you're having this problem when trying to connect to gmail, I'm going to save you some time. The authenticate method in the IMAP library sends an "AUTHENTICATE" IMAP command to the server, which gmail does not support (you can prove this by sending a "CAPABILITY" command to the server, and seeing what options it has available. AUTHENTICATE ain't there).
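
If you'd rather run that check from Ruby than by hand, here's a rough sketch. It assumes two things: that `Net::IMAP#capability` returns the server's capabilities as an array of strings, and that AUTHENTICATE mechanisms are advertised as `AUTH=<NAME>` entries (the RFC 3501 convention). The helper names and the sample list are made up for illustration; the list is not Gmail's actual response.

```ruby
# Pull the advertised AUTHENTICATE mechanisms out of a CAPABILITY list,
# e.g. ["IMAP4REV1", "AUTH=PLAIN"] gives ["PLAIN"].
def advertised_auth_mechanisms(capabilities)
  capabilities.map { |c| c[/\AAUTH=(.+)\z/i, 1] }.compact.map(&:upcase)
end

# True when imap.authenticate(mechanism, ...) has a chance of being accepted.
def supports_authenticate?(capabilities, mechanism)
  advertised_auth_mechanisms(capabilities).include?(mechanism.upcase)
end

# Against a live server you would pass in imap.capability; here a
# hypothetical list stands in for the server's answer.
sample = %w[IMAP4REV1 IDLE AUTH=PLAIN]
puts supports_authenticate?(sample, 'CRAM-MD5') ? 'authenticate' : 'fall back to login'
```

With the sample list this prints "fall back to login", since no `AUTH=CRAM-MD5` entry is present.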

Instead, using the "login" method (which sends the plain IMAP "LOGIN" command rather than "AUTHENTICATE"), you can connect successfully. Here's my final script:


imap = Net::IMAP.new(@config['server'],@config['port'],true)
imap.login(@config['username'], @config['password'])
imap.select('INBOX')
imap.search(["NOT", "DELETED"]).each do |message_id|
  MailFetcher.receive(imap.fetch(message_id, "RFC822")[0].attr["RFC822"])
  imap.store(message_id, "+FLAGS", [:Deleted])
end
imap.logout()
imap.disconnect()
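
One possible refinement, sketched under my own assumptions: wrap the same calls in a method so that the connection is always closed and a single bad email doesn't stop the rest of the batch. The method name `drain_inbox` and its block-based interface are hypothetical; the IMAP calls are the same ones used in the script above.

```ruby
# Processes every undeleted message in INBOX over a single IMAP session.
# `imap` is a connected Net::IMAP (or anything that responds the same way);
# the block receives each message's raw RFC822 text.
# Returns the number of messages handled and flagged as deleted.
def drain_inbox(imap, username, password)
  imap.login(username, password)
  imap.select('INBOX')
  processed = 0
  imap.search(['NOT', 'DELETED']).each do |message_id|
    raw = imap.fetch(message_id, 'RFC822')[0].attr['RFC822']
    begin
      yield raw
    rescue StandardError => e
      # Leave the message on the server so a later run can retry it.
      warn "message #{message_id} skipped: #{e.message}"
      next
    end
    imap.store(message_id, '+FLAGS', [:Deleted])
    processed += 1
  end
  imap.logout
  processed
ensure
  # Runs even when login or a fetch raises, so the socket is never leaked.
  imap.disconnect
end

# Possible use with the same objects as the script above (sketch only):
#
#   imap = Net::IMAP.new(@config['server'], @config['port'], true)
#   drain_inbox(imap, @config['username'], @config['password']) do |raw|
#     MailFetcher.receive(raw)
#   end
```

Taking the connection as an argument also means the loop can be exercised with a stand-in object, without touching a real mailbox.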


Enjoy!

Thursday, June 26, 2008

Time Management

"Wow, it's really been a while since I've had the time to sit down and write a blog post."

What a pleasant trap it would be for me to fall into, if I actually believed that sentence above. I could use the same format to justify just about anything: "I'm too busy to learn anything new about my craft", "I'm too busy to spend time with my wife in the evenings", "I'm too busy to keep up with technological advances", etc, etc.

But I don't believe it, not at all. Working on a startup is hard, no doubt, and it consumes a lot of time; but it certainly does not preclude any other activity besides work. If I allow that to happen, it's only because I myself choose to not make time for other things that are important to me, and mostly it's because I'm wasting time in other areas of my work. Want an example? Here's a common one for me:

I'm working on a new feature for my product, and I hit a snag. Unsure of the best way to solve my problem, I fire up my browser and google for similar examples. As long as I've got my browser open, I open a 2nd tab to check my email, and a 3rd tab to examine my RSS aggregator, just to see if anything has come up. If I have any new emails, I'll probably go ahead and read and respond to them, and then I'll read the two or three news articles or blog posts that have shown up in my RSS client, and THEN I'll take a look at the google results for my query. POW! I just lost half an hour of time that I didn't need to spend right then (yes, I know email can be important, but nobody needs to check their email 14 times a day; anything urgent enough should result in a phone call).

So after 12-14 hours of "work", I've probably only spent 7 hours actually producing. Of course, by that time, I've lost the majority of the day, and whatever's left I'm forced to use for other daily tasks (feeding pets, feeding me, paying bills, etc). Now, am I really "too busy" to do anything else, or am I being a poor steward of my time?

I think the answer to that question is pretty obvious; I'm wasting a lot of time.

Time is a resource both abundant and precious. Until the day you die, you will always have more time; but by the same token you cannot HELP but spend it. You can't be conservative with your time during one month, and then splurge the next; you will spend the 16 hours you have in a day on SOMETHING whether you intend to or not.

So my feeling is, if you're going to spend that precious resource so freely, shouldn't you at least be benefiting from it? Shouldn't you choose to spend it on activities that are meaningful, rather than on admitted "time wasters"?

In light of the above revelation, I have modified my approach to my day as follows:

-I spend at least 8 hours of the day working.

-During my "work" time (which I take in 90-120 minute intervals), I work hard. I don't check my email, or browse the blogosphere; I take care of business.

-I have several goals I want to accomplish throughout any given day, and I use my breaks between work periods for doing so:

-GOALS:
--1/2 hour of physical exercise
--1/2 hour of honing my craft (technical research or code-reading)
--1/2 hour of studying Russian (the language)
--1/2 hour of musical practice (piano, usually)

Altogether (work plus goals), that's 10 hours, less than I usually spend in a workday, but infinitely more rewarding. After only being on this plan for a week, I can highly recommend it to anyone who feels "too busy". The rewards of taking ownership of your time are best described by a shedding of the general malaise that is an all-too-common part of American office culture. So if I were you, I would stop reading this blog post and go do something you'll be proud of when the day is over.