Friday, October 29, 2010

the whenever gem and EngineYard AppCloud

If you have any sort of jobs you need run on a regular basis for your rails app, I can highly recommend the whenever gem. Make a schedule.rb file with all your cron jobs in a pleasant ruby DSL; it's lovely! Every time you deploy, you can set up a hook in whatever deploy process you're using (chef, capistrano) to run the whenever executable and update your crontab. Sweet!
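If you haven't seen the DSL before, a schedule.rb looks something like this (a quick sketch; the job and task names here are made up):

every 1.day, :at => '4:30 am' do
  runner "Billing.run_nightly_jobs"  # hypothetical runner job
end

every :hour do
  rake "reports:refresh"             # hypothetical rake task
end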

But we hit a problem recently when we deployed our app to EngineYard AppCloud. We put in a deploy hook to run the whenever script, so far so good. The only problem was that deployment wasn't the only time we needed our crontab updated. You see, as with most cloud setups, we could rebuild our cluster at any time to make configuration changes or what have you, and it should pick up where it left off. That would bring our code back, but our crontab would be squeaky clean (we found that out the hard way when we rebuilt our cluster and suddenly our billing jobs weren't running).

Fortunately, EngineYard uses Chef for their cluster configuration, and they provide a way to apply custom Chef recipes to your environments that will run during your cluster rebuild.

Taking advantage of this, I built a small custom recipe for "whenever" that simply executes the whenever script in your current app directory (only if it's a utility instance or a solo instance, so your app servers aren't running your big jobs). Check it out at this gist!
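The heart of it looks roughly like this (a sketch; the instance-role values and /data path follow EngineYard's conventions as I understand them, and "app_name" is a placeholder for your app):

if ['solo', 'util'].include?(node[:instance_role])
  # regenerate the crontab from the app's schedule.rb on every rebuild
  execute "update whenever crontab" do
    command "cd /data/app_name/current && whenever --update-crontab"
    user node[:owner_name]
  end
end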

Also, if you'd like to see it in its native environment as part of your ey-cloud-recipes project, you can check out our public git repo here (recipe is in /cookbooks/whenever/recipes/default.rb).

Cheers!

Saturday, October 23, 2010

tag-it 0.2.2 is released!

If you read this blog regularly, you know that last week I released a new gem called "tag-it" to interface with RFID receivers and the tags that come into their range. Well, there's a hot new version.

Before, there were only 2 events thrown out by the tag tracker: one when a tag arrived (came into range) and another when a tag departed. The problem was, unless people were actually moving into and out of range of the receiver, nothing was going on. That's not inherently bad, but from my web-app's view it WAS very similar to what would happen if the client were totally non-functional (powered down, errored out, etc). In order to let applications be sure that nothing's wrong, tag-it now also sends out a "pulse" event every 180 seconds (including an array of the currently in-range tags). Get the latest with

  gem install tag-it
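Handling the new event in your watcher might look like this (a sketch; I'm assuming the pulse delivers the tag array through the same three-argument update method, and record_heartbeat is a hypothetical method in your app):

class YourCustomWatcher
  def update(tags, strength, event)
    if event == :pulse
      # the client is alive, and `tags` holds everything currently in range
      record_heartbeat(tags)
    end
  end
end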

Tuesday, October 19, 2010

Arrival!

One of my most attainable dreams has finally come true, as my most recent gem was mentioned today on Ruby 5. I've been listening to the show since its creation, and to hear my own name called out (albeit pronounced incorrectly) was a true joy. :)

Monday, October 18, 2010

How One Person Unknowingly Destroyed our delayed_job queue

This post is one for the "aren't you glad you found this here on my blog due to some furious googling rather than trying to dig into it yourself" category.

We use delayed_job for our background queue, and it's great. You can do this really cool thing where you call "send_later" on any object, with the method you want eventually called, and it will queue it up for later execution. Very lightweight.

This is what we do for our reporting features. A user selects a date range and which report they want to run, and it gets offloaded to the background queue, sent to them later as an email link to an S3 file.
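The pattern is roughly this (a sketch; Report and generate_and_email are made-up names for illustration):

report = Report.new(current_user, params[:start_date], params[:end_date])
# send_later queues the method call as a delayed_job instead of running it inline
report.send_later(:generate_and_email)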

Today, suddenly things aren't working. Reports aren't coming through; in fact, the whole queue has frozen, it's backed up to 85 jobs, and when we look at "top" on one of our EC2 instances, we can see the ruby processes dying as quickly as they spool up. Shit.

Frantically we search the logs. Nothing. Configuration? Still the same, and looks good. We try manually running a few jobs: no issues. Finally I try spooling up a worker in-process and telling it to work off the queue, and I see this:

ArgumentError: argument out of range
 from /usr/lib/ruby/1.8/yaml.rb:133:in `utc'
 from /usr/lib/ruby/1.8/yaml.rb:133:in `node_import'
 from /usr/lib/ruby/1.8/yaml.rb:133:in `load'
 from /usr/lib/ruby/1.8/yaml.rb:133:in `load'

From YAML? Why?!

Fortunately, we got some help from our friends at EngineYard (we've always had top notch support from them) who pointed out this little beauty on the ruby bug list. A bad date could cause YAML to fail hard, and this happens during the "deserialization" of our jobs (when the process loads them out of the database to decide which one to lock and run), so it would hit any worker that came across that job, killing the process and destroying our job queue by halting it in its path. Upon investigation, we found that someone was indeed trying to run a report from 08/01/2010 to 10/01/12010. Dammit!

Deleting that job allowed our queue to get back up and start processing again (and since we use AppCloud, we were able to spool up another utility instance to work off the now VERY long backlog of jobs). Nevertheless, this means we now need to sanitize all of our report inputs, because it's not enough to be a VALID date (which technically that is, albeit far in the future); it also has to be within a reasonable distance from now (at least until we're on Ruby 1.9.2, where this is fixed).
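The guard we're adding looks something like this (a sketch; the bounds are arbitrary and validate_report_range is a made-up name):

def validate_report_range(start_date, end_date)
  earliest = 10.years.ago.to_date
  latest   = 1.year.from_now.to_date
  [start_date, end_date].each do |date|
    unless date.to_date.between?(earliest, latest)
      # reject dates YAML can serialize but no sane report needs
      raise ArgumentError, "report date #{date} is out of range"
    end
  end
end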

RFID in Ruby

Anyone who's been reading my blog recently knows that I've been playing with some RFID pieces for a local project. The project itself isn't important, though I may go into that in another post. For now, what I want to talk about is the gem I've written to help you get a project up and running with active RFID technology.

It was actually kind of fun to slog through the process of pulling the input stream off this device's serial port, character by character, and trying to put together something useful; however, I wouldn't want to do it again. Enter tag-it, my newest micro-gem hosted on rubygems.org.

The problem was this: I bought this RFID receiver, which works great. It's got a built-in serial-to-USB adapter, so as long as you have the right driver installed, you can just plug it into a USB port and it will report tag names and their relative signal strength as they come into range (see my last post for details). The receiver then just keeps repeating all the in-range tags every 2.5 seconds until they go out of range again.

That's cool, but it needs to be filtered. If you have a web service tracking which tags are near which receivers, it's not performance-friendly to just keep pinging the server over and over again with the same information (this tag is still here, this tag is still here, etc). In my case, I really only care about a tag "arriving" (coming into range) and "departing" (moving out of range). Thus, I built tag-it as a gem that sets up a class monitoring your serial port and dispatching events as it decides they occur.

Example:


require "tag_it"
port = SerialPort.new("/dev/tty.yourport",{:baud=>9600})
tracker = TagIt::TagTracker.new(port)
watcher = YourCustomWatcher.new
tracher.add_observer(watcher)
tracker.start!

tag-it uses Ruby's standard observer library to dispatch events through an "update" method defined on the watcher, passing three parameters (tag_name, RSSI, and event).

So, your watcher should be built like this:

class YourCustomWatcher
  def update(tag,strength,event)
    if event == :tag_arrived
      # do something because a tag has come into range
    elsif event == :tag_departed
      # do something because a tag has gone out of range
    end
  end
end

Thus far, that's it. Couldn't be simpler (well, maybe it could, but the gem is open-sourced on GitHub, so if you COULD make it simpler, send me a patch!).

So far this thing only interacts with this receiver (the one linked above), but as I procure other hardware, I hope to make this a pretty general library able to interact with whatever other input formats are out there from other manufacturers' devices.

Happy tagging!

Sunday, October 17, 2010

Real World Integration

As part of a new project, I'm playing with some RFID technology, and I decided to document my setup process here so I could duplicate it later if I needed to do it again on another computer. The idea? Use active RFID tags to track people as they come and go from a building. Here are the steps for getting the tag processing up and running on my Macbook Pro:

0) Buy an RFID receiver and tags from http://cliste.sailwhatcom.com/

1) Download the Mac OS X driver for the serial-to-USB adapter built into the receiver from
http://www.prolific.com.tw/eng/downloads.asp?ID=31. Run the downloaded pkg file from the mounted DMG (this will restart your computer).

2) Make sure the receiver is sending data. Plug it into a USB port, and you can use (on Mac OS X) "screen /dev/tty.usbserial 9600" to watch the port's data come across your screen. (BTW, I was only able to receive telemetry from my RFID receiver from the USB port on the left side of my Macbook Pro's body. When I hooked into the USB port to the right of the keyboard, I got nothing. No idea why, just something to play with if you have trouble here.)

3) Make sure you are using ruby 1.8.7; 1.9.x does not appear to work with this next gem

4) Install the ruby-serialport gem with "gem install ruby-serialport"

5) Write some code to read from the port, something like this:


require 'rubygems'
require 'serialport'

# open the receiver's port: 9600 baud, 8 data bits, 1 stop bit
port = SerialPort.new("/dev/tty.usbserial", :baud => 9600, :data_bits => 8, :stop_bits => 1)

# echo the next 1000 characters the receiver sends
count = 0
while count < 1000
  printf("%c", port.getc)
  count += 1
end

port.close

That's a simple program that just reads the tag names as they come in range, along with their relative signal strengths, and prints them to the console one character at a time. Output when I have 2 Active RFID 20 meter transmitters sitting on the table next to the receiver looks something like this:

1nri85 1nwP79 1nri85 1nwP79 1nri85 1nwP79 1nri85 1nwP79 1nri86 1nwP79 1nri85 1nwP79 1nri85 1nwP79 1nri85 1nwP79 1nri85 1nwP79 1nri85 1nwP78 1nri85 1nwP79 1nri85 1nwP79 1nri85 1nwP79 1nri85 1nwP78 1nri85 1nwP78 1nri85 1nwP79 1nri84 1nwP78 1nri84 1nwP79 1nri84 1nwP79 1nri85 1nwP79 1nri86 1nwP79 1nri85 1nwP79 1nri85 1nwP79 1nri86 1nwP79 1nri86 1nwP79 1nri85 1nwP80 1nri86 1nwP80 1nri87 1nwP80 1nri86 1nwP80 1nri85 1nwP79 1nri85 1nwP79 1nri85 1nwP81 1nri85 1nwP79 1nri85 1nwP79 1nri85 1nwP79 1nri86 1nwP79 1nri86 1nwP79 1nri85 1nwP79 1nri85 1nwP79 1nri85 1nwP80 1nri87 1nwP79 1nri88 1nwP79 1nri85 1nwP80 1nri85 1nwP80 1nri85 1nwP79 1nri86 1nwP80 1nri86 1nwP80 1nri86 1nwP78 1nri86 1nwP79 1nri85 1nwP79 1nri85 1nwP81 1nri85 1nwP79 1nri85 1nwP79 1nri86 1nwP78 1nri85 1nwP78 1nri85 1nwP78 1nri86 1nwP79 1nri85 1nwP78 1nri85 1nwP78 1nri85 1nwP78 1nri85 1nwP78 1nri85 1nwP78 1nri85 1nwP78 1nri86 1nwP79 1nri85 

Note: BE SURE TO CLOSE YOUR PORT! On a couple of occasions I've terminated the program without closing the port object; that seems to leave the port open, and any future attempts to connect to it get bounced with a "resource busy" message. Only a restart has solved that problem for me.
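A cheap way to protect yourself is to wrap the read loop in begin/ensure, so the port gets closed even if you blow up or interrupt mid-read (a sketch):

port = SerialPort.new("/dev/tty.usbserial", :baud => 9600)
begin
  # read until interrupted; Ctrl-C raises and falls through to the ensure
  loop { printf("%c", port.getc) }
ensure
  port.close # runs no matter how we exit the block
end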

Friday, October 15, 2010

you will_paginate in FBML!

I hate writing Facebook applications. I don't know anyone who enjoys it.

Nevertheless, it's a necessity in today's contracting marketplace, and sometimes we just have to live with the fact that we're going to write Facebook apps.

One of the big problems is that all these super useful gems and plugins we use to speed up our development with Ruby on Rails (including elements of the Rails stack itself) don't really work inside the Facebook application paradigm (especially if you're using FBML, which I don't want to, but one of the requirements for this app is a profile tab, and right now there are no other options).

Today, I was saddened when my favorite pagination tool, will_paginate, was rendered useless by the fact that in FBML any links have to work according to the Facebook routing scheme rather than Rails's defaults. It makes sense why, but by default will_paginate was generating URLs for my pagination links that pointed to my routes on the server, which are not the same as my routes within the Facebook application (maybe I should have done some better configuration up front. Doh!).

Normally I would start googling like crazy for a solution, but I read a very persuasive blog post yesterday by John Nunemaker about just reading the code of gems you want to learn about rather than trying to google up their interface. So (determined to at least try following the advice of an industry icon such as John), I installed gemedit, opened up the will_paginate source for the 3.0 prerelease, and found that I could actually very easily customize how will_paginate's links get generated.

In will_paginate, when you call your link-generating view helper, you can specify just what class you want to render the links. The default is WillPaginate::ViewHelpers::LinkRenderer, which does a whole lot of nifty work boxing up your pagination situation into a nice set of links (and is very well written for readability, in my opinion). In my case, I just wanted to change how the urls were built, so I created a new class that subclasses LinkRenderer, overrode just one method (the one that builds a link, cleverly named "link"), and reimplemented it to do the routing the way I needed it. Then, in my view, I specified my custom renderer as the class to use for generating links, and presto, my favorite pagination gem is running again even under the arduous unpleasantries of working from within Facebook. See the code below:


class FacebookLinkRenderer <
  WillPaginate::ViewHelpers::LinkRenderer

  def link(text, target, attributes = {})
    if target.is_a? Fixnum
      attributes[:rel] = rel_value(target)
      target = url(target)
    end
    # rewrite the server-side route into the facebook app's route
    target = target.gsub "/what_is_wrong", "/what_is_right"
    attributes[:href] = target
    tag(:a, text, attributes)
  end

end

And in the view:


#...haml haml haml....

=will_paginate(@items,:renderer=>"FacebookLinkRenderer")

#...haml haml haml....

Nice!

Wednesday, October 13, 2010

Classic ASP on a Mac

Although I work almost exclusively on my own business, I do occasionally take on side contract work if it's to help out a friend or local organization. Usually this involves doing a small web application or some sort of data processing, but this week I got a unique request.

My local fire department is interested in redoing their web presence entirely. I'm all game to help with that, but they also need some emergency help because their current application is flat out broken. This means actually diving into their current code, and stabilizing it before we even talk about moving to a more current platform (cough...Ruby on Rails...).

Having been part of these rescue missions before, I know that the problems are usually deeper than the description: "there's just a few things that need tweaking and fixing, no big deal really". In anticipation of this, I've already taken the code I copied off of their production server (yes, some organizations still do this), put it into a git repo, and checked it out to 2 separate computers.

The further complication? The entire app is in classic ASP. And all three of my computers are Macs.

But I'm not the last person who will ever have this problem, so we're going to follow along as I get this dinosaur up and running here on my OS X 10.6 Macbook Pro.

1) Download and Install the Mono Framework

Mono is basically an open source implementation of the .NET Framework. If you want to know more than that, check out mono-project.com. For now, just know that you need this if you want to run an ASP application on your Mac, and you can download the latest version here. The dmg file has a package installer inside of it, and it should run just fine with the default configuration.

2) Download and Install the Mono CSDK

This is available from the same page, and is actually the link just to the RIGHT of the link for downloading the framework itself. I'd be lying if I said I knew what this was, but many people have said in the forums that it's necessary. :)

3) Download MonoDevelop and copy the .app file to the Applications folder

MonoDevelop is a development environment for the .NET framework that you can run on a mac.

4) Forget everything about steps 1-3...

...when you realize that Mono isn't going to be much help running Classic ASP, which is not on the .NET framework.

5) Find Apache::ASP

Sweet! Execute your classic ASP in Perl with http://www.apache-asp.org/

6) Make sure Apache is installed with mod_perl

If you're on a Mac, it's already there. Just add this to your httpd.conf:

LoadModule perl_module libexec/apache2/mod_perl.so

and add a perl.conf file to Apache's "other" directory to allow the execution of perl scripts:


<IfModule perl_module>
  AddHandler cgi-script .pl
  <IfModule dir_module>
    DirectoryIndex index.pl index.html
  </IfModule>
</IfModule>

7) Download and install the latest Apache::ASP

The tar file can be found here; I used 2.6.1.

First, edit your httpd.conf file again:


<Directory $DOCUMENT_ROOT/asp/eg>
  Options FollowSymLinks
  AllowOverride All
</Directory>

Then, in the root directory of the Apache::ASP download, run the following:
perl Makefile.PL
Which will tell you if you're missing any modules. I had to install MLDBM::Sync.
From there you can do your make process like any other:
make
make test
make install

8) Test to make sure you can run an ASP script
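A minimal page is enough to prove the pipeline works. Apache::ASP embeds Perl between <% %> delimiters and exposes a $Response object, so something like this (a sketch; save it into the directory you configured above and hit it in a browser) should render a greeting:

<html>
  <body>
    <!-- the Perl between the delimiters runs server-side -->
    <% $Response->Write("Hello from Apache::ASP!"); %>
  </body>
</html>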

With these steps in place, you should now at least be able to debug your inherited Classic ASP application on your Mac. Now, hurry up and rewrite it on a better platform!

Refactoring in Ruby with Blocks

The block can be a mysterious construct for the aspiring rubyist. Sure, they're probably comfortable using one to iterate over a collection; that's common. But when would I ever write my own? Why would this confusing jump in control flow be useful?

Fortunately for you, if you're one who would ask that question, I solved a refactoring problem just today by using a block to de-duplicate some code, and I think it makes a good example for showing what kind of situations blocks are good for helping solve. Enjoy!

The problem

I have a class that merges some active record models together when it turns out we have duplicate information in the database. At first glance, this isn't particularly complex: merge the authoritative attributes over the secondary attributes, combine the collections of child records, and they're merged. The problem is, some of the child record collections need to be reduced. For example, if I have two user models that represent the same person, and they both have child records for each of the last 12 months indicating that they're eligible for something every month (doesn't matter what), then when merged I'll have one user with 2 eligibility records for every month going back a year. That's not what I want; I want to reduce those child collections based on criteria unique to each collection so that we only have one record representing each discrete period of time. So, to do this for two separate child collections, I started out by writing (after the tests) some code like this:


removed_scripts = []
user.prescriptions.each do |script|
  if removed_scripts.index(script.id).nil?
    other_scripts = user.prescriptions.find(:all,
      :conditions => ["id != #{script.id} and
                       start_date = ? and
                       expiration_date = ?",
                      script.start_date, script.expiration_date])
    other_scripts.each do |os|
      removed_scripts << os.id
      os.destroy
    end
  end
end

removed_eligibility = []
user.eligibility_records.each do |rec|
  if removed_eligibility.index(rec.id).nil?
    other_elig = user.eligibility_records.find(:all,
      :conditions => ["id != #{rec.id} and
                       year = ? and month = ?",
                      rec.year, rec.month])
    other_elig.each do |oe|
      removed_eligibility << oe.id
      oe.destroy
    end
  end
end

This was functional, but obviously I have two blocks of code that are more similar than different. How can I combine them into one function that can handle both cases?

The Solution

What makes this situation tough is that one of the things that is different between the two chunks of code above is within an iteration over a set of child records, and is different as it's applied to each record (that is, the conditions array for finding other records to destroy). You can't pass in static conditions, because the conditions are dependent on the particular child record being used at that time for data binding. What you need to do is pass a function that can take that child record, and give you your conditions. Hence, my refactor:

# reduce a child collection to one record per discrete period; the block
# takes a child record and returns the conditions for finding its duplicates
def reduce_recs(child_class, conditions)
  removed_recs = []
  child_class.find(:all, :conditions => conditions).each do |cld|
    if removed_recs.index(cld.id).nil?
      other_recs = child_class.find(:all, :conditions => yield(cld))
      other_recs.each do |oth_rec|
        removed_recs << oth_rec.id
        oth_rec.destroy
      end
    end
  end
end

By putting in the "yield" right where the main difference in implementation was, I can submit a different function for each child collection to do the necessary matching. Thus, my new reduce_recs function can be used like this:


reduce_recs(Prescription, "user_id = #{user.id}") do |child|
  ["id != #{child.id} and start_date = ? and expiration_date = ?",
   child.start_date, child.expiration_date]
end

reduce_recs(EligibilityRecord, "user_id = #{user.id}") do |child|
  ["id != #{child.id} and year = ? and month = ?",
   child.year, child.month]
end

Now, are there some other things that could be done to make this code more expressive? Absolutely, and maybe I'll look at those in another post. But the lesson you can take away today is: when you have a duplicated chunk of code that only really differs at some point within nested iteration, consider using a block to abstract the functionality into a single function that yields out right at that critical point, letting the calling code specify what needs to be done for that small difference.

Or, to refactor my last sentence: Blocks are good for making a lightweight template pattern.

Cheers,

~Ethan

Monday, October 11, 2010

Websolr: make your application do a REAL man's search

Rails 3 and all the various updates that come with it have been a really great experience for me so far. I'm fond of ActiveRelation, I like the routing updates, and I love the fact that Heroku already supports it for deployment (and has for some time).

One thing that's not great when using just the baked-in tools, though, is search. Most applications use some kind of searching, but unfortunately many (including a few of my own) do something like this (yes, this is an excerpt from one of my own codebases; I'm sorry, it won't happen again):


class User < ActiveRecord::Base
  named_scope :matching_search_term,lambda{|search_term| 
    {:conditions=>["users.id = ? OR 
       users.last_name LIKE '#{search_term}%' OR 
       users.alt_id LIKE '#{search_term}%' OR 
       users.email LIKE '#{search_term}%'",search_term]}}
end

I shouldn't have to tell you why this sucks. But I will. First, it's really only matching the front of the string on each field. If I'm searching for "Vizitei" then I'll find anybody with the last name "Vizitei", but not anyone who has somehow ended up with " Vizitei" in the database (which would never happen because you sanitize your inputs, right?). Second, those fields all have to be indexed in the database, or there's going to be a problem, and using OR means the results of the first part of the query don't actually narrow down what has to be compared in the rest of it. That is, even though I might have found a few users with the last name "stevens", when the query gets to the "email" part it still has to check the whole table for emails that match "stevens". Third, any long text fields (like "description" or "bio") are going to present a performance problem if you're looking to match a word anywhere in the body of that text.

What's great, though, is that the problem of integrating the kind of search that would easily cover a situation like this has already been solved, and pretty simply if you're using Heroku for your deployment platform. Today I implemented this kind of fulltext search in about 10 minutes by following the instructions provided here for the "websolr" addon, adding the sunspot_rails gem to my app, and changing my user class to look like this:


class User < ActiveRecord::Base
  searchable do
    string  :first_name
    string  :last_name
    string  :alt_id
    string  :email
    text    :bio
  end
end

Then I just ran

rake sunspot:reindex

and we're done.

Why is this better? Well, for one I can now just call #search on my model class to do a loose search on all those fields that are indexed by my new solr server. For another, there are a ton of cool things you can do with solr that I don't want to go into now, but that you can find here on the solr welcome page. For the impatient: it's faster. Capiche?
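For reference, the query side looks something like this (a sketch using sunspot's block DSL; "stevens" is just an example term):

search = User.search do
  fulltext "stevens"
end
search.results # => the matching User records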

I'm a little stunned that this was so easy, but in today's world of cloud deployments and pluggable architecture, tasks like this are stupid-simple. Hooray for finishing early and going home for a beer!