rlucas.net: The Next Generation Rotating Header Image

May 9th, 2007:

Ruby’s ActiveRecord Makes Dropping to Raw SQL a Royal Pain (Probably on Purpose)

The opinionated programmers behind Rails have generally done a good job. (There are couple of FUBARs in their bag of tricks, such as the boneheaded choice to use pluralized table names (in some places) and use automagical pluralization code to try and mediate between the singular and plural.)

There’s another item I’d like to bring up, however, and that’s the fact that ActiveRecord intentionally cripples your ability to do raw SQL queries. This is, I’m sure, done to discourage raw SQL hacking in favor of using the ActiveRecord objects (which, for small numbers of objects, is admittedly a superior way to do many things, because of concerns for clarity, maintainability, etc.).

However, sometimes you need SQL, dammit. Especially when you’re doing a correlation between, say, different tags that describe business plans, and the people that link those business plans together, plus the number of times that such tags appear, there’s just no sense in pulling thousands of records into memory, instantiating Ruby objects, and improperly reimplementing basic CS sorting algorithms to link them up. You’ve got all that sitting right there in your RDBMS.

ActiveRecord lets you do something like this:

BusinessPlan.find_by_sql(    [      'SELECT s2.id FROM (COMPLICATED_SUBSELECT) AS s2',      var1, var2, var3    ] )  

Which will run the complicated SQL and replace the bind vars (question marks) in the raw SQL with var1, var2, var3, etc., and give you a bunch of BusinessPlan objects that it’s instantiated off those IDs. Easy enough.

But what if you need not merely to get the objects, but to get some other important info (say, COUNT(something)) out? You’re shit out of luck with ActiveRecord. The .connection.select_all method returns you an array of record hashes, but it requires fully-baked SQL (no bind vars).

  • You could manually construct the SQL and manually quote each bind variable into its place, but avoiding that kind of retarded scut work is exactly why you’re using Rails in the first place.
  • You could try and get the DBI handle that underlies ActiveRecord (does it?), but it’s very unclear as to how or if you can do that. If you call .connection.raw_connection you get a PGConn object (for PostgreSQL), not a DBI handle.
  • You could open up your own new DBI handle, which involves recapitulating the Rails initialization code for ripping the config values and rewriting connection-pooling code, which is bad for all sorts of reasons, not least of which is that you’re already f’ing connected to the DB!

WTF? If you read the code for the find_by_sql method, you’ll see:

def find_by_sql(sql)   connection.select_all(     sanitize_sql(sql),      "#{name} Load"   ).collect! { |record| instantiate(record) } end  

Given this, you might think: “aha, I’ll just use a similar method and pass to sanitize_sql an array with my SQL and bind vars, then pass that on to select_all. No can do. sanitize_sql is a protected method.

So, here’s my encapsulation-breaking, OO-unfriendly, scofflaw workaround to let you have access to what you should already get: a decent bit of code for binding SQL parameters:

(In helpers/application_helper.rb)

arb = ActiveRecord::Base def arb.sanitize_fucking_sql(*args)   sanitize_sql(*args) end  

Now, you can happily go about your business and, when necessary, call ActiveRecord.sanitize_fucking_sql(...) to get ‘er done. No special-purpose DB connections, no wrangling thousands of objects in memory.

Caveats:

  1. I suck at Ruby, I know. There’s a more elegant way to add a public method to a class, but that works and I understand it.
  2. Eventually, this will break. But it will probably be a long time and the, er, unique method signature I suggest should be easily globally replaced.

Making Subversion Set Reasonable Default Properties Like Keyword Substitution

(Programmers: skip down to the Meat section below.)

If you are so bored as to actually have read all the articles on this blog, you may have noticed that the “Id: lucas blah blah blah” string that shows up at the bottom of the articles. This is an interpolated keyword, put in by the revision control system, that shows in the text of the document when it was last committed to revision control.

Subversion (svn) is a revision control system — probably the leading such system for new deployments (I am not counting Microsoft-land, where old, proprietary systems still abound). It’s most familiar as the replacement for the venerable CVS, which used to do replacement more or less automatically. But a stock Subversion install won’t do keyword replacement unless you do a “property setting,” or propset, on a file-by-file basis (this is to prevent clobbering a file that happens to have the magical string in it).

Therefore, you have to remember to do something like this to your new files:

svn propset svn:keywords Id myfile.txt  

This tells svn to set the “svn:keywords” property to “Id,” meaning it will replace instances of $Word: $ with the ID string (when “Word” is “Id”).

However, this is a pain, and although you could conceivably script this action as a hook upon new additions to svn, there’s an easier way.

Meat

Find your local ~/.subversion/config file and edit it. Set the following:

[miscellany] enable-auto-props=yes  [auto-props] *.txt=svn:keywords=Id  

(The default config file has a bunch of examples commented out for you to base your settings on, but the above is the minimal set to get textfiles set with keyword substitution for the “Id” keyword.)