Offloading stuff to a background process
September 18th, 2007
I recently built a document management application for my company. It uses a third-party OCR app that I call from the command line. I deployed it and everything worked great... until a user said, “let’s try this big document” (180 pages). Amazingly, 15 minutes later the document had been OCR’d and the text stuffed into the database, but the user’s browser had timed out after about 3 minutes and showed a screen that suggested the document import had failed.
I wrote a script and put it in the root of my Rails app directory. The script uses the environment files and ActiveRecord objects from the main app. I could probably have trimmed some of my require statements had I used script/runner, but I developed this script while working on a solution involving the win32-service gem.
# Load the Rails environment so the models and CONFIG are available.
load 'config/environment.rb'
require 'app/models/document'
require 'app/models/notifications'

loop do
  # Pick up the next document that has not been OCR'd yet.
  @document = Document.find(:first, :conditions => 'processed=0')
  if !@document.nil?
    begin
      cmd = "#{CONFIG[:ocr]} -k 6 -6 -y 2 -o %FILENAME.pdf2 -p -t #{File.dirname(@document.full_filename)} #{File.expand_path(@document.full_filename)}"
      # Convert forward slashes to Windows-style backslashes.
      cmd.gsub!(/\//, "\\")
      output = `#{cmd} 2>&1`
      # Try to delete the original and rename the searchable one.
      File.delete(File.expand_path(@document.full_filename))
      File.rename(File.expand_path(@document.full_filename) + "2", File.expand_path(@document.full_filename))
      # Read the extracted text written next to the PDF.
      text = File.read(File.dirname(@document.full_filename) + "\\" + @document.filename.gsub(/pdf$/, "txt"))
      @document.text_data = text
      @document.processed = true
      @document.save
      Notifications.deliver_processed(@document.created_by, @document, output)
    rescue
      # Mail the error; the document is left unprocessed.
      Notifications.deliver_error($!)
    end
  end
  # Wait before polling the database again.
  sleep 5
end
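The pattern in the script above (the web request only records the upload, and a separate loop picks up unprocessed rows) can be sketched without Rails. This is a minimal illustration under stated assumptions, not the production script: `Doc`, `FakeStore`, `enqueue` and `process_next` are hypothetical names, an in-memory array stands in for the documents table, and the OCR step is passed in as a block. It also marks failed documents as processed, which the script above does not do.

```ruby
# Hypothetical in-memory stand-in for a row in the documents table.
Doc = Struct.new(:filename, :processed, :text_data, :error)

class FakeStore
  def initialize
    @docs = []
  end

  # What the web request would do: record the upload and return at once.
  def enqueue(filename)
    doc = Doc.new(filename, false, nil, nil)
    @docs << doc
    doc
  end

  # Plays the role of Document.find(:first, :conditions => 'processed=0').
  def next_unprocessed
    @docs.find { |d| !d.processed }
  end
end

# One pass of the worker loop. Returns the document handled, or nil when idle.
def process_next(store)
  doc = store.next_unprocessed
  return nil if doc.nil?
  begin
    doc.text_data = yield(doc.filename)   # the slow OCR call goes here
  rescue StandardError => e
    doc.error = e.message                 # keep the failure for reporting
  end
  # Assumption: failures are marked processed too, so a bad file is not
  # retried on every poll.
  doc.processed = true
  doc
end

store = FakeStore.new
store.enqueue("big.pdf")
store.enqueue("broken.pdf")

# The real service would wrap this in `loop do ... sleep 5 end`.
while (doc = process_next(store) { |name| name == "broken.pdf" ? raise("ocr failed") : "text of #{name}" })
  puts "#{doc.filename}: #{doc.error || doc.text_data}"
end
```

Whether a failed document should be retried, skipped, or flagged for a human is a design choice; the sketch simply skips it after recording the error.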
So I started digging around for ways of offloading the OCR portion into a background process. There are a lot of different ways to do this... if you run on a Unix/Linux system. I, unfortunately, deploy to Windows machines. After trying several different paths, I remembered my old friend SRVANY.exe. This utility is available in the Windows NT Resource Kit (downloadable from Microsoft).
The Resource Kit contains a lot of files, but two of them, instsrv.exe and srvany.exe, are what interest us right now. The format to install a service is as follows:
resource_kit\instsrv.exe MyServiceName resource_kit\srvany.exe
Your path to the resource kit may vary. Once you run that command, you should get some output like so:
The service was successfully added! Make sure that you go into the Control Panel and use the Services applet to change the Account Name and Password that this newly installed service will use for its Security Context.
Now, open up regedit and navigate to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services.
Locate your new service in the left panel. Highlight its name, right-click, choose New -> Key, and name the new key Parameters, so that the tree contains HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\MyServiceName\Parameters.
Now, with the Parameters key selected, right-click in the right panel and choose New -> String Value. Name it “Application”. Repeat this step two more times, naming the new values “AppDirectory” and “AppParameters”. Double-click the “Application” value and type in the path to your Ruby interpreter.

For AppDirectory, type in the working directory for your project, e.g. c:\sites\myRailsApp
For AppParameters, type in the name of the script relative to the working directory, e.g. lib\import.rb. Since my script was in the root of the working directory, I was able to use just the script name.
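If you would rather script the registry steps than click through regedit, the same three values can be written with the Windows `reg add` command. The sketch below only builds the command lines so you can review them before running anything. `srvany_reg_commands` is a hypothetical helper, and the service name and the interpreter path `c:\ruby\bin\ruby.exe` are placeholder assumptions, not values from my setup.

```ruby
# Build `reg add` command lines for the three string values srvany reads.
# All names and paths passed in below are placeholders.
def srvany_reg_commands(service_name, application, app_directory, app_parameters)
  key = "HKLM\\SYSTEM\\CurrentControlSet\\Services\\#{service_name}\\Parameters"
  values = {
    "Application"   => application,     # program srvany should launch
    "AppDirectory"  => app_directory,   # working directory for that program
    "AppParameters" => app_parameters   # arguments passed to the program
  }
  values.map do |name, data|
    # /t REG_SZ creates a string value; /f overwrites without prompting.
    %(reg add "#{key}" /v #{name} /t REG_SZ /d "#{data}" /f)
  end
end

puts srvany_reg_commands("MyServiceName",
                         'c:\ruby\bin\ruby.exe',
                         'c:\sites\myRailsApp',
                         'lib\import.rb')
```

Each printed line can be pasted into a command prompt run with administrator rights; `reg add` creates the Parameters key if it does not exist yet.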
OK, next step... wait, there is no next step. You’re done! Now you can change all of the service-specific settings in the Services snap-in: service login, start mode, etc.
Have fun with this!