Archive for the ‘bugfix’ Category

Making Subversion Set Reasonable Default Properties Like Keyword Substitution

Wednesday, May 9th, 2007

(Programmers: skip down to the Meat section below.)

If you are so bored as to actually have read all the articles on this blog, you may have noticed the “Id: lucas blah blah blah” string that shows up at the bottom of each article. This is an interpolated keyword, put in by the revision control system, that shows, right in the text of the document, when it was last committed to revision control.

Subversion (svn) is a revision control system — probably the leading such system for new deployments (I am not counting Microsoft-land, where old, proprietary systems still abound). It’s most familiar as the replacement for the venerable CVS, which did keyword replacement more or less automatically, out of the box. But a stock Subversion install won’t do keyword replacement unless you do a “property setting,” or propset, on a file-by-file basis (this is to prevent clobbering a file that happens to have the magical string in it).

Therefore, you have to remember to do something like this to your new files:

svn propset svn:keywords Id myfile.txt  

This tells svn to set the “svn:keywords” property to “Id,” meaning that on each commit it will replace instances of $Id$ (or $Id: … $) in the file with the current ID string. The other supported keywords, like Date, Rev, and Author, work the same way.
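To see the round trip in action, here’s a quick session (file name and revision details invented):

echo 'Last changed: $Id$' > myfile.txt
svn add myfile.txt
svn propset svn:keywords Id myfile.txt
svn commit -m "add myfile.txt with keyword substitution"
cat myfile.txt
# Last changed: $Id: myfile.txt 1234 2007-05-09 18:00:00Z lucas $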

However, this is a pain, and although you could conceivably script this action as a hook upon new additions to svn, there’s an easier way.

Meat

Find your local ~/.subversion/config file and edit it. Set the following:

[miscellany]
enable-auto-props = yes

[auto-props]
*.txt = svn:keywords=Id

(The default config file has a bunch of commented-out examples for you to base your settings on, but the above is the minimal set to get text files set up with keyword substitution for the “Id” keyword.)
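You can extend the same idea to other file types; these patterns are purely illustrative, so adjust to taste:

[auto-props]
*.txt = svn:keywords=Id
*.rb = svn:keywords=Id
*.pl = svn:keywords=Id
*.html = svn:keywords=Id;svn:eol-style=native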

Ruby’s ActiveRecord Makes Dropping to Raw SQL a Royal Pain (Probably on Purpose)

Wednesday, May 9th, 2007

The opinionated programmers behind Rails have generally done a good job. (There are a couple of FUBARs in their bag of tricks, such as the boneheaded choice to use pluralized table names (in some places) and automagical pluralization code to try to mediate between the singular and plural.)

There’s another item I’d like to bring up, however, and that’s the fact that ActiveRecord intentionally cripples your ability to do raw SQL queries. This is, I’m sure, done to discourage raw SQL hacking in favor of using the ActiveRecord objects (which, for small numbers of objects, is admittedly a superior way to do many things, because of concerns for clarity, maintainability, etc.).

However, sometimes you need SQL, dammit. When you’re correlating, say, the different tags that describe business plans, the people who link those business plans together, and the number of times each tag appears, there’s just no sense in pulling thousands of records into memory, instantiating Ruby objects, and badly reimplementing basic CS sorting algorithms to join them up. You’ve got all of that sitting right there in your RDBMS.

ActiveRecord lets you do something like this:

BusinessPlan.find_by_sql(
  [
    'SELECT s2.id FROM (COMPLICATED_SUBSELECT) AS s2',
    var1, var2, var3
  ]
)

This runs the complicated SQL, replaces the bind vars (question marks) in the raw SQL with var1, var2, var3, etc., and gives you back a bunch of BusinessPlan objects instantiated off those IDs. Easy enough.

But what if you need not merely to get the objects, but to get some other important info (say, COUNT(something)) out? You’re shit out of luck with ActiveRecord. The .connection.select_all method returns you an array of record hashes, but it requires fully-baked SQL (no bind vars).

  • You could manually construct the SQL and manually quote each bind variable into its place, but avoiding that kind of mindless scut work is exactly why you’re using Rails in the first place.
  • You could try to get the DBI handle that underlies ActiveRecord (is there one?), but it’s unclear how, or even whether, you can do that. If you call .connection.raw_connection you get a PGConn object (for PostgreSQL), not a DBI handle.
  • You could open up your own new DBI handle, which means recapitulating the Rails initialization code to rip out the config values, plus rewriting connection-pooling code; that’s bad for all sorts of reasons, not least of which is that you’re already f’ing connected to the DB!

WTF? If you read the code for the find_by_sql method, you’ll see:

def find_by_sql(sql)
  connection.select_all(
    sanitize_sql(sql),
    "#{name} Load"
  ).collect! { |record| instantiate(record) }
end

Given this, you might think: “Aha, I’ll just use a similar method: pass sanitize_sql an array with my SQL and bind vars, then pass that on to select_all.” No can do. sanitize_sql is a protected method.
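That is, the naive call blows up (paraphrased from a console session; the exact message varies by version):

ActiveRecord::Base.sanitize_sql(['id = ?', 42])
# => NoMethodError: protected method `sanitize_sql' called for ActiveRecord::Base:Class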

So, here’s my encapsulation-breaking, OO-unfriendly, scofflaw workaround to let you have access to what you should already get: a decent bit of code for binding SQL parameters:

(In helpers/application_helper.rb)

arb = ActiveRecord::Base
def arb.sanitize_fucking_sql(*args)
  sanitize_sql(*args)
end

Now, you can happily go about your business and, when necessary, call ActiveRecord::Base.sanitize_fucking_sql(...) to get ’er done. No special-purpose DB connections, no wrangling thousands of objects in memory.
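For example (table and column names invented), you can now sanitize a bind-var query yourself and hand the fully-baked SQL to select_all:

# Hypothetical query: usage counts per tag, with a bind variable.
sql = ActiveRecord::Base.sanitize_fucking_sql(
  [
    'SELECT tag, COUNT(*) AS uses FROM taggings WHERE created_at > ? GROUP BY tag',
    30.days.ago
  ]
)
rows = ActiveRecord::Base.connection.select_all(sql, 'Tag counts')
rows.each { |row| puts "#{row['tag']}: #{row['uses']}" }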

Caveats:

  1. I suck at Ruby, I know. There’s a more elegant way to add a public method to a class, but this works and I understand it.
  2. Eventually, this will break. But that’s probably a long way off, and the, er, distinctive method name I suggest should be easy to find and globally replace.

Hurrah for Kwiki

Tuesday, May 8th, 2007

… and for Jeremy Smith. Thanks to Jeremy’s hack on top of Ingy’s quickie wiki, we can now get proper behavior inside of table cells.

In a nutshell, Kwiki didn’t handle markup like italics inside of a table cell. This should fix it; previous posts that used the stock Kwiki should render properly now.

Now, to remedy blockquotes …

Installing RMagick on OS X with Fink

Thursday, March 29th, 2007

Hold on: I’m not sure that the below works right. Don’t use it yet.

There are lots of instructions out there for installing RMagick, which is a graphics manipulation library used by many Ruby-istas for things like thumbnailing, resizing, etc. I wanted to use it for an internal database I’m building in Rails.

Some of the sites offering instructions:

  • The RMagick site itself. This one is tilted toward using DarwinPorts (the BSD-ish way to do third-party package management on your Mac; I prefer the Debian-ish “Fink”).
  • Hivelogic. This one involves manual downloads of tarballs and configure; make; make install type loving. I don’t like this way of going about it because you lose the package management features.

But nobody seemed to have a Fink-friendly way to do this.

If you naively try to install with gem install rmagick, you’ll get something like:

configure: error: Can't install RMagick. Can't find libMagick or one of the dependent libraries. Check the config.log file for more detailed information.

My solution:

1. Install the needed dependencies from binaries using Fink.
2. Use gem install to install RMagick (the Ruby bit) itself.

The dependencies include (as best I can tell):

freetype freetype-shlibs imagemagick imagemagick-dev imagemagick-shlibs ghostscript ghostscript-fonts gv libpng-shlibs libjpeg libjpeg-bin libjpeg-shlibs lcms lcms-bin lcms-shlibs libtiff libtiff-bin libtiff-shlibs

Therefore, you should probably be able to install simply by doing:

sudo apt-get install freetype freetype-shlibs imagemagick imagemagick-dev imagemagick-shlibs ghostscript ghostscript-fonts gv libpng-shlibs libjpeg libjpeg-bin libjpeg-shlibs lcms lcms-bin lcms-shlibs libtiff libtiff-bin libtiff-shlibs

sudo gem install rmagick

(I realize that this is probably overkill and that you don’t actually need all those packages above. If you figure out the minimal subset, why don’t you post a similar blog entry of your own?)
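Afterwards, a quick sanity check (this assumes the classic ‘RMagick’ require name) should print the linked ImageMagick version:

ruby -rrubygems -e "require 'RMagick'; puts Magick::Magick_version"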

Good luck!

HOWTO: Subversion Export for Legal Discovery

Wednesday, February 7th, 2007

The more interesting things you do in life, the more likely it is that some jackass will sue you. If this happens, you will probably be faced with giving one or more sets of attorneys access to your electronic documents.

If, like all right-thinking citizens, you store your documents in a Subversion repository organized hierarchically by subject, you may find yourself needing to provide access to not only the current version, but all previous versions, of all documents on a particular subject.

With CVS, you could just have made a copy of the relevant part of the repository root, since the repository itself stores things in a directory structure.

But SVN has some nice features like renaming, symlinks, and binary diffs, which necessitate a more sophisticated repository structure. You’ll notice that if you go to your repository root, there’s no analog to what you see in CVS.

If you don’t want to give the lawyers your whole SVN repository (and there’s no reason you should), then you need a way to dump out every revision, ever, of a particular directory and all its subdirectories.

Here’s what worked for me (using Subversion on Cygwin on Win XP).

$ mkdir /cygdrive/c/temp/discovery
$ mkdir /cygdrive/c/temp/discovery/working_copy
$ mkdir /cygdrive/c/temp/discovery/repository
$ cd doc/all_stuff/
$ cp -par target_subject /cygdrive/c/temp/discovery/working_copy
$ svn up -rHEAD target_subject
$ svn log target_subject > /cygdrive/c/temp/discovery/repository/log

Now, look through the log for all of the revisions mentioned. For me they were mercifully few; you may need to do a touch more scripting if you have, e.g., hundreds of revisions (think svn log | grep '^r').

$ for i in 1 49 103 106 107 112 HEAD
> do svn up -r$i target_subject
> cp -par target_subject/ /cygdrive/c/temp/discovery/repository/r$i
> done
$
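(If you do have hundreds of revisions, a grep-driven sketch along these lines, assuming the standard “rNNN | author | date” log format, saves the typing:)

$ for i in $(grep '^r[0-9]' /cygdrive/c/temp/discovery/repository/log | cut -d ' ' -f 1 | tr -d r)
> do svn up -r$i target_subject
> cp -par target_subject/ /cygdrive/c/temp/discovery/repository/r$i
> done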

This gives a nice, clean structure with your current working copy (including uncommitted changes) in one directory, and all revisions up to that one in another directory, along with the svn log of comments on all commits. This should satisfy all but the most KY-equipped of legal eagles.

Curses::UI Escape key bindings are slow; here’s why.

Thursday, September 14th, 2006

I am throwing together a quick Curses (console / terminal) based UI for a database here, prior to putting (or hiring someone to put!) a Web front-end on it. In keeping with my experience with elinks, I wanted the menubar activation key to be Escape. However, it was running slower than molasses in February — it seemed to take a FULL SECOND before the Esc key would register and focus / unfocus the menubar.

Well, poking around a bit gave me the answer. From man ncurses(3):

ESCDELAY
  Specifies the total time, in milliseconds, for which ncurses will
  await a character sequence, e.g., a function key. The default
  value, 1000 milliseconds, is enough for most uses. However, it is
  made a variable to accommodate unusual applications.

Duh. It was taking exactly a full second.
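The fix is to shrink that timeout. ncurses honors an ESCDELAY environment variable, so something like this (script name invented) makes Escape feel snappy, at the minor risk of garbling function-key escape sequences over a slow link:

ESCDELAY=200 perl my_curses_ui.pl    # wait only 200 ms, not 1000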

A Rational Scheme for Medical Laboratory Results

Sunday, August 20th, 2006

Medical laboratory results these days are a hodgepodge of numbers on various scales and with various units. For example, the Merck Manual lists various laboratory test normal ranges and their units:

Hematocrit: Male 41-50%, Female 35-46%
Hemoglobin: Male 13.8-17.2 g/dL, Female 12.0-15.6 g/dL
...
Sodium: 135-146 mmol/L

These “normal ranges” can be sort of misleading. If your value is numerically half of the lower end of the hematocrit range, for example (say, 20%), you would be sick but still alive. However, if you had only half the lower end of the normal sodium range (say, 70 mmol/L), you’d be dead.

This is crappy. It imposes a high cognitive load on doctors by requiring them to know a variety of “normal” ranges, it makes lab results opaque to patients and the uninitiated, and it has a “hidden memorization cost” of knowing the implications of going outside the normal range (such as the difference between having half the normal measurement for hematocrit vs. sodium, above).

I propose a replacement scheme for all scalar laboratory values (at least those in the main test batteries, like the chem-N and CBCs). In my scheme, all “unit” lab results are replaced (realistically, augmented) by “rational” values. Rational values are normalized at 100 for the center of the range. The “normal range” is represented by the range 90-110. The values associated with roughly 50% mortality are set at 50-150. The ranges 80-120, 70-130, and 60-140 will be pegged at some statistics-based measurement, either based upon standard deviation or upon increased chances of negative outcomes, whichever an appropriate standards body decides best (there are some labs for which it might not make sense to have it be standard deviation-based, others for which it would).

The correspondence of “unit” to “rational” measurements is not necessarily linear; the formulae to determine this will be decided per-test, reviewed annually by the standards body, and published as an appendix to standard references and on the Web.
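As a toy sketch of what one of those per-test formulae might look like (the anchor numbers here are invented, and simple piecewise-linear interpolation stands in for whatever the standards body would actually adopt):

# Invented anchor points for sodium: [unit value in mmol/L, rational value].
SODIUM_ANCHORS = [[70, 50], [135, 90], [140.5, 100], [146, 110], [200, 150]]

# Piecewise-linear mapping from a unit measurement to a "rational" value.
def rational_value(unit, anchors)
  return anchors.first[1] if unit <= anchors.first[0]
  return anchors.last[1] if unit >= anchors.last[0]
  anchors.each_cons(2) do |(u1, r1), (u2, r2)|
    return r1 + (r2 - r1) * (unit - u1) / (u2 - u1).to_f if unit <= u2
  end
end

puts rational_value(128, SODIUM_ANCHORS)   # => about 86: mildly low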

The “core rational” lab values are those which are unadjusted for average adults. “Adjusted rational” lab values are adjusted for sex and body mass. “Peds adjusted rational” values are adjusted as above but with age ranges.

All lab reports will show these values on the summary page; “unit” measurements will be provided as well (they will doubtless remain indispensable for certain purposes). Color-coding would be straightforward: green for +/- 10, yellow for +/- 20, orange for +/- 30, and red for +/- 40.

This will become an ever more crucial part of diagnosis as we move toward greater automation (e.g. field lab-testing machines that paramedics could carry) and de-skilling of the medical profession (nurse practitioners, paramedics, self-administered care and monitoring, etc.). It also becomes a key part of the understanding required for personal medical choice as we move the economics of health care toward a (partial) patient-pays model.

If someone wants to give me a grant for a year of my time with a couple of assistants, I’ll go ahead and set this up. Drop me an email – rlucas@tercent.com.

Linux Software RAID and GRUB – Recovering From a Failure

Friday, August 18th, 2006

A couple of weeks ago, I had the bright idea to move an internal server here at Voyager from my office into a data room. I issued the customary sudo shutdown now and proceeded to move the box.

I was dismayed not to see it boot right back up afterwards. Ouch! I had specifically configured it with software RAID because the hard drives in the old spare box were a bit dodgy. Turns out that was a good idea, since one of the drives had failed (apparently the one that had the GRUB bootloader appropriately loaded on it).

I was faced with two 20 Gig HDDs, only one of which worked exactly right, and a computer which failed to boot off the remaining HDD. A quick trip to uBid and $70 later, I had two 60 Gig drives ready to use (20 Gigs are darn near impossible to find). I knew enough about partitions and whatnot to get this much done:

– Got a bootable rescue CD with a good set of utils (PLD Linux) downloaded and burned (it’s good to have one of these handy, rather than trying to burn them as-needed — see below under “Tricky stuff” for my unfortunate experiences with that).

– Trial-and-errored the two old HDDs to find which one was failing. Removed the bad one and replaced with New HDD #1.

– Used cfdisk, the curses-based fdisk, to read the exact size and type of the good RAID partition from Old HDD. Used that information to create an identical physical partition at the beginning of New HDD #1, including the same (Linux Software RAID) partition type.

– Used dd, the bit-for-bit copier, to copy the entire partition verbatim from the Old HDD’s main partition, /dev/hda1, to New HDD #1’s identically situated partition, /dev/hdc1, both of which were unmounted at the time. (A sketch of these commands appears after this list.)

– Swapped out the Old HDD with New #2, and repeated the last couple steps to make a new partition on New #2 and copy New #1’s first partition to it.

– Used mdadm --assemble to put the two identical RAID partitions — /dev/hda1 and /dev/hdc1 — back together into a RAID array and let it re-sync them until mdadm reported them to be in good health.

– Used GRUB to re-install the MBR on both HDDs. This was a damn sight harder than it sounds (see below).
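For the cfdisk / dd steps above, the commands were along these lines (a reconstruction, not a transcript; device names as in the text, and note that dd will cheerfully destroy the wrong disk if you swap if= and of=):

cfdisk /dev/hda                       # read the good RAID partition's size and type
cfdisk /dev/hdc                       # create an identical partition on the new drive
dd if=/dev/hda1 of=/dev/hdc1 bs=1M    # bit-for-bit copy; both partitions unmounted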

All in all, it was a far cry from sliding a replacement hot-swap SCSI drive into a nice hardware-based array — but at $70, a fraction of the cost, though hardly timely. (I use this server as a document archive, web proxy, cron-job runner, and general workhorse for background processing and speculative automated information-gathering projects — none of which are mission-critical for us at Voyager.)

Tricky stuff:

– Windows XP apparently doesn’t come with ANY ability to burn ISOs. WTF, Microsoft? Operating system ISOs are just about the only legal thing I have ever wanted to burn to a CD, and that’s the one thing you won’t support? (Well, duh, really.)

– The latest Knoppix (5.0?) just plain barfed. It may have been the speed at which the dodgy ISO-burning software I downloaded wrote it (errors?). In any case, I burned about an hour of my life trying different “nousb” and similar switches, to no avail.

– PLD Linux’s rescue disk was small and booted perfectly (though I took care to burn it at a low speed).

– BLACK MAGICK: When booting from the rescue disk, there weren’t any md devices in /dev on which I could assemble the RAID. A couple of times I needed to create the node /dev/md0 by issuing the command:

mknod /dev/md0 b 9 0  

Which, as I understand it, means “make node /dev/md0, of type block, with major device number 9 (the md driver’s number) and minor number 0.” Then, since mdadm refused to automatically find and assemble the drives for /dev/md0, I had to find the UUID for the RAID volume thus:

mdadm --examine --scan  

And then copy the UUID (thanks, GNU Screen!) into the command:

mdadm /dev/md0 --assemble --uuid=<WHATEVER>  

– Getting GRUB installed on the hard drives was, in the end, easier than I thought, but it was rocky, due to the complexity of the issues involved and my not understanding them fully.

If you search for “software raid grub” you’ll find a number of web pages that more or less get you there with what you need to know.

To get GRUB to work, I did the following.

– First, I had the /dev/md0 partition (the “RAID partition”) holding my / (root) partition, with NO separate /boot partition. That means I had to make each of /dev/hda1 and /dev/hdc1 (the “RAID constituents”) bootable. Much of what you read follows the old advice of having a separate / and /boot, which I did not have.

– Second, I had to boot from the rescue CD, get the RAID partition assembled, mount it, and chroot into the mount point of the RAID partition. Like:

mknod /dev/md0 b 9 0
mdadm /dev/md0 --assemble --uuid=<WHATEVER>
mkdir /tmp/md0
mount /dev/md0 /tmp/md0
cd /tmp/md0
chroot .

Note that since the partition names and numbers were the same on my New #1 and #2 drives as on the old ones (hd[ac]1), the legacy /etc/mdadm/mdadm.conf still works as-is, and it can tell the kernel how to assemble and mount the RAID partition (important for below).
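(For reference, the relevant mdadm.conf lines look something like the following, with the UUID being the one reported by mdadm --examine --scan:)

DEVICE /dev/hda1 /dev/hdc1
ARRAY /dev/md0 UUID=<WHATEVER>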

– Then, once chroot‘ed into the RAID partition, I ran run-time GRUB from there. (“Run-time GRUB” is when you run grub as root on an already-booted machine for purposes of installing stuff or whatnot; this is opposed to “boot-time GRUB,” which looks pretty damn similar but is what runs off of the master boot record — MBR — of a bootable partition onto which GRUB has been installed.) Run-time GRUB used the existing Ubuntu menu.lst file. For some reason, that file ended up binary and corrupted, so I had to scrap it and come up with a new one. Here’s the meat of my new menu.lst:

# Boot automatically after 15 secs.
timeout 15

# By default, boot the first entry.
default 0

# Fallback to the second entry.
fallback 1

# For booting Linux
title  Linux
root (hd0,0)
kernel /vmlinuz root=/dev/md0
initrd /initrd.img

# For booting Linux
title  Linux
root (hd1,0)
kernel /vmlinuz root=/dev/md0
initrd /initrd.img

– Using that menu.lst, the commands I entered in the GRUB shell were as follows:

device (hd0) /dev/hda
root (hd0,0)
setup (hd0)
device (hd0) /dev/hdc
root (hd0,0)
setup (hd0)

The rationale: the first three lines install GRUB (pointing at the current menu.lst, that is, whichever one it finds in /boot/grub/menu.lst) onto the MBR of /dev/hda, the first bootable HDD; the second three lines install onto the MBR of /dev/hdc, the second HDD, but fake out the installation there so that GRUB acts as though it’s on the first, bootable HDD (hd0).

Do you get it? After chrooting, I fired up run-time GRUB, which automatically looks in the current /boot/grub for menu.lst. I told it to put MBRs on both /dev/hda and /dev/hdc to make boot-time GRUB behave as specified in the menu.lst. The menu.lst lines say: “use (hd0,0) (e.g. hda1) as the root directory, find the kernel (vmlinuz) and initrd there, and once the kernel gets loaded, tell it to use /dev/md0 as the real root directory, which it can do because it reads /etc/mdadm/mdadm.conf or /etc/mdadm.conf to figure out its md devices.”

What puzzled me at first was: “how does the kernel get loaded when the root is /dev/md0, and you obviously must run mdadm in order to assemble and mount /dev/md0?” The answer is that the installation commands listed above tell boot-time GRUB to act as though (hd0,0) (AKA /dev/hda1 or /dev/hdc1, depending on whether the BIOS points to hda or hdc for booting) is its root directory. So boot-time GRUB, all the way up through the loading of the kernel, treats /dev/hda1 (or hdc1) as its root, and only at the stage where the kernel is loaded enough to check mdadm.conf and run mdadm does it then do a little “chroot” of its own. If I’ve got this completely wrong, please email me (rlucas@tercent.com) and tell me I’m a bonehead (and include a link to your more informed writeup).

There's an elegance to the whole Linux software RAID thing, but it took a darn long time to comprehend.

Broken Quoting of Spaces in Table Names in Ruby’s ActiveRecord

Friday, July 21st, 2006

Two-part posting.

1. The hot-shit developer boys at http://dev.rubyonrails.org apparently use Python (Trac) for their bug-tracking system, and for extra chuckles, it’s broken.

From http://dev.rubyonrails.org/newticket#preview

(removed a Python stack trace that came down to a NOT NULL constraint violation in the underlying database — the error message was messing up the blog formatting software.)

Not the most helpful when I’m trying to post a bug report!

2. So, I’m posting my bug report below. Essentially, having spaces in your table names throws a major monkey wrench into the “convention over configuration” mantra of Ruby on Rails.

I might fix this, if I can find the time (the lack thereof being why, in the first place, Rails seemed so appealing). If so, I will post a patch.

Legacy apps built against SQL Server databases may encounter spaces in table names. These can be addressed superficially by a:

set_table_name '"Spacey Table"'  

or

set_table_name '[Spacey Table]'  

This approach makes parent classes behave properly when directly interpolating table_name into a string, avoiding such errors as:

Invalid object name 'Company'.: SELECT count(*) AS count_all FROM Spacey Table   

HOWEVER, pre-escaping the table names in this way breaks the SQLServer ConnectionAdapter’s ability to get info out of SQL Server, as in sqlserver_adapter.rb line 246 (line breaks added):

sql = "SELECT COLUMN_NAME as ColName, COLUMN_DEFAULT as DefaultValue,  DATA_TYPE as ColType, IS_NULLABLE As IsNullable,  COL_LENGTH('#{table_name}', COLUMN_NAME) as Length,  COLUMNPROPERTY(OBJECT_ID('#{table_name}'), COLUMN_NAME, 'IsIdentity')  as IsIdentity, NUMERIC_SCALE as Scale  FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = '#{table_name}'"  

As you can see, it will end up trying to match ‘Spacey Table’ against ‘[Spacey Table]’ in the brackets case (or against ‘“Spacey Table”’ in the double-quotes case), and find nothing.

Also, get_table_name(sql) will have trouble with this.

To make this work without breaking encapsulation will probably require using an escaped table_name and then selectively unescaping it for the SQL Server-specific uses.

All in all, MSFT’s behavior is fairly satanic on this; see below for a link describing syntax and escaping. Note that the only backwards-compatible solution is to SET QUOTED_IDENTIFIER ON and then use “This Table”.”This Column” notation, in order not to run up against problems with SQL Server < 6.5.
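In other words, the backwards-compatible form is something like:

SET QUOTED_IDENTIFIER ON
SELECT COUNT(*) AS count_all FROM "Spacey Table"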

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/acdata/ac_8_con_03_89rn.asp

Fixing a package remove failure on Debian (or Ubuntu) when dpkg-divert barfs.

Sunday, June 18th, 2006

2006-06-18 Randall Lucas

If you get stuck with a half-uninstalled (technically “half-installed”; the Debian dpkg system is an optimist, I guess) package that barfs on running its postrm (post-removal) script with something like:

"dpkg-divert: mismatch on divert-to ..."  

this can sometimes be fixed, or at least made to shut up, by finding the appropriate line in /var/lib/dpkg/info/PACKAGE.postrm and editing it to reflect the right filenames on the dpkg-divert remove line.
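For example (package and file names invented here), check what diversion actually exists, then make the postrm agree with it:

dpkg-divert --list | grep somefile

# then, in /var/lib/dpkg/info/PACKAGE.postrm, fix the line to match, e.g.:
dpkg-divert --package PACKAGE --remove --rename \
    --divert /usr/bin/somefile.distrib /usr/bin/somefile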

If you really need to get down and dirty (like, if the removal of a non-critical package is stuck halfway and that is stopping you from doing an installation of a really necessary package), you could just comment out that whole bit with the goal of getting the postrm script to return success (0).

Of course, if you’re running any important services, you shouldn’t be using “unstable” or Ubuntu; just run Debian stable so you can sleep at night.