rlucas.net: The Next Generation


[WARN] MD5 sums irredeemably broken

The MD5 hash function is dangerously unusable at this point.  I
was under the impression, casually following crypto over the last
couple years, that it was weak but likely “good enough” for
non-military, non-banking types of applications.  Dead wrong.

There are now known attacks — and doubtless toolchains for specific
exploits — that permit creating two completely different (but valid)
pieces of plaintext that generate the same MD5 sum.

See http://www.doxpara.com for an example of two mocked-up HTML pages,
one for “Lockheed” and one for “Boeing,” that share the same MD5 hash.

See also Wikipedia's MD5 entry (which does not NEARLY sufficiently raise the alarum on this) at http://en.wikipedia.org/wiki/Md5

You might pooh-pooh my admittedly somewhat superficial take on this,
but ignore me at your peril: bad guys are doubtless developing toolkits
for creating two docs, one legit, one malicious, that share the same
MD5 sum.

Bottom line: time to use SHA1 (for a while, until someone figures out
how to do the same thing to it).  Simple enough on Debian; “sha1sum” is
in coreutils and is a seeming drop-in replacement for md5sum.
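To see the drop-in-ness concretely, here are both tools on the standard “abc” test vector (these digests are the published reference values from the MD5 and SHA-1 specs, so you can sanity-check your own binaries at the same time):

```shell
# md5sum and sha1sum take the same input and emit "digest  filename";
# swapping one for the other in a script is usually a one-word change.
printf 'abc' | md5sum    # -> 900150983cd24fb0d6963f7d28e17f72
printf 'abc' | sha1sum   # -> a9993e364706816aba3e25717850c26c9cd0d89d
```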

[SANITY CHECK] Apache 2 hangs with lots of STDERR output from CGI

You are not crazy. It is not an infinite recursion in your logic. Your code doesn't take that long to execute.

If you output to STDERR (in Perl, this means Carp, warn, or the venerable
print STDERR, among others) from a CGI script under Apache 2.0, and you
end up dumping more than approximately 4k (note that if you are using
“warn” or “Carp” you may have extra stuff on there, so that you only
output 3k or so but the extras bring it up to 4k), Apache 2 will hang
forever (as of today, 30 March 2004).

See this bug report: http://nagoya.apache.org/bugzilla/show_bug.cgi?id=22030

There are some patches proposed in the link above on the Apache project bugzilla, but they are not production releases.
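Until a fixed release lands, one blunt way to sidestep the hang (my own sketch, not from the bug report) is to make sure STDERR never points at Apache's pipe at all: reopen it onto a log file at the top of the CGI script. The log path here is made up; use one your server can write to.

```perl
#!/usr/bin/perl
use strict;
use warnings;

BEGIN {
    # Reopen STDERR onto a log file before anything can warn(); once
    # this succeeds, no amount of Carp/warn output goes near the 4k
    # pipe buffer that wedges Apache 2.
    open STDERR, '>>', '/tmp/my-cgi-error.log'
        or die "cannot reopen STDERR: $!";
}

print "Content-type: text/plain\n\n";
warn "a " x 4096;          # ~8k of warnings, now harmless
print "still alive\n";
```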

In case you were wondering,
http://blogs.law.harvard.edu/rlucas/2003/08/26#a13 shows some helpful
hints on how you can back down to version 1.3.

Question to all:
what are folks' recommendations for an Apache 1.3 packaged install? I
would tend to prefer statically linked with SSL and mod_perl, but the
only one I've seen folks using is from n0i.net which isn't entirely
satisfying because I don't speak Romanian.

Update: Using Apachetoolbox (see Google), you can fairly simply
compile Apache 1.3 + mod_ssl + mod_perl + PHP and whatever 3rd-party
modules you like.  This makes for a fine alternative to RPM versioning
hell, or even to traipsing around your src tree typing make.  If you
compile mod_perl in, be sure to specify mod_perl 1.29 rather than 1.99.

Solving a Real Problem

OK, I have determined what blogs are for. They give an easy way to publish aggregated technical fix information in a search-engine-friendly format.

Aggregated: quite often, fixing a specific technical problem (even a common one!) requires looking around the web at a number of false leads on mailing list archives, tech docs, knowledge bases, etc. Putting the whole solution, once found, into a single blog entry (including links / attribution to the original solvers) makes sense.

Search-engine-friendly: mailing list archives are good, but only if they get web-published and Googled. Realistically, if it's not in Google, it doesn't exist — especially in the realm of technical problems that could have any number of origins (imagine compiling an XML-to-Excel Perl module on Mac OS X: is your problem with GCC, libxml, Excel, Perl, or Mac OS X? Which mailing lists do you search first?). Additionally, most computer problems have a characteristic error message which appears with some limited amount of variation. That message is easy to post on a blog.

I originally believed that the solution to aggregating technical fix information was a search-engine feeder backed by an RDBMS, but it is clear that any schema will be too inflexible for the variety of problems. Better to post error messages verbatim, try to be as explicit about keywords as practicable (if it segfaults, include the words “crash”, “segmentation fault”, and “segfault” as an aid to searching), and let Google handle the hard stuff far better than a full-text search through an RDBMS could hope to.

Why do this? Well, this is a case of the comedy of the commons: figuring out a solution like this on one's own, or by searching through mailing lists piecemeal, could consume hours or days of productive time. However, posting a solution once found is trivial, taking mere minutes. If even one other person posts a solution that I find which saves me 3-4 hours, it's worth all the time I'll ever spend in posting such things.

Unwire Portland (OR) Project: Public benefit through the "drinking fountain" model

Portland, Oregon is working toward a citywide, privately-operated
wireless network, under a public-private partnership model that
leverages city rights-of-way, among other assets, in return for certain
“public benefits.”  I strongly support this effort (the “Unwire
Portland” project).

The issue at hand is that the currently-proposed public benefit
structure is to create a “walled garden” of hand-picked sites that will
be freely available to the public.  A few moments' reflection
should alarm the reader: who will pick these sites, using what criteria
and what process for review, etc.?  Who will get sued when someone
inevitably disagrees with the choices?

My answer to these concerns is to do away with the “walled garden” and
in its place put a “drinking fountain” model, where each passerby may
take a small “trickle” of an unrestricted Internet connection for free.

I have put together a document supporting the adoption of the drinking fountain model here: http://rlucas.tercent.com/wifi.html

Your comments and suggestions are welcome.

[FIX] XFree86 stuck at 640 x 480 under Linux with Dell Dimension or Optiplex

With a fresh install of Red Hat 9 on a Dell Dimension 4600, the only video mode that would work with XFree86 was 640 x 480, which is ludicrously big on a decent-sized monitor.  Changing the config didn't do anything, even though the config was well within my monitor's limits.

The solution was to go into the BIOS setup and change the Integrated Devices (LegacySelect Options) / Onboard Video Buffer setting from 1 MB to 8 MB.  I'm not sure what the tradeoff with other aspects of the system is, but X nicely starts up at 1280 x 1024.  Apparently, this is the solution for other Dell models as well, including the Optiplex GX260; mine had Dell BIOS Revision A08.  Also, it seems to be the case that the problem is general to XFree86, although it manifested for me under Red Hat 9.
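For the record, the sort of XF86Config-4 stanza I had been fruitlessly editing is the usual Screen section (identifiers and mode list below are examples, not my literal config); with the 1 MB video buffer, X fell back to 640 x 480 no matter what Modes said.

```
Section "Screen"
    Identifier   "Screen0"
    Device       "Videocard0"
    Monitor      "Monitor0"
    DefaultDepth 24
    SubSection "Display"
        Depth 24
        Modes "1280x1024" "1024x768" "800x600" "640x480"
    EndSubSection
EndSection
```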

Thanks to Erick Tonnel at Dell, who kindly provided the solution.

Apple Security Update 2003-03-24 Breaks Many Things?

For Mac OS X users who installed the Software Update with a security component on 24 March 2003, some things might be broken if you use Apache, Sendmail, or the Perl PostgreSQL module DBD::Pg.

1) Regarding Sendmail:

See http://www.macosxhints.com/article.php?story=20030306145838840

(relevant error message: Sendmail might complain in /var/log/mail.log of “Deferred: Connection refused by localhost”)

(summary: Apple makes sendmail look at /etc/mail/submit.cf instead of sendmail.cf)

2) Regarding Apache:

See http://ganter.dyndns.org/misc/apple_ssl.php and http://apple.slashdot.org/comments.pl?sid=58276&threshold=1&commentsort=0&tid=172&mode=thread&cid=5640470

(relevant error message: Apache segfaults out on some SSL requests with the crash message [try /var/log/system.log]

Exception: EXC_BAD_ACCESS (0x0001) Codes: KERN_INVALID_ADDRESS (0x0001)

… and specifically complains that the error is in ssl_var_lookup_ssl)

(summary: Apple supplies a faulty libssl.so for Apache; a working version is provided)

(fix: see dyndns link above or try http://cyber.law.harvard.edu/blogs/gems/rlucas/libssl.so NO WARRANTY courtesy only mirror)

3) Regarding DBD::Pg
See http://gborg.postgresql.org/pipermail/dbdpg-general/2003-March/000039.html

(relevant error message:

dyld: perl Undefined symbols:

… and more, whenever a script uses DBD::Pg.)

(summary: Perl scripts now crash out. Might be because PostgreSQL was compiled before the security update. Does anyone know otherwise?)

I am going to install the July Security Update to see if it fixes things at all.

UPDATE: The July security update does not fix it. However, recompiling PostgreSQL fixes most of the errors (see 25 July 2003 entry for a persistent error with utf-8 support).

[BUG] Mail::Mailer, Mail::Internet, and MIME::Entity fork / eval oddity

The Perl module Mail::Mailer, and those modules that rely upon it (at
least, Mail::Internet and MIME::Entity), have an undocumented fork that
can wreak havoc with your code if you call the send() method within an
eval {} block.  The solution is to either be very anal about
checking for PIDs or to use a different means for sending your
messages, like MIME::Lite.

Briefly, the problem is that the sending procedure forks, using the
open("|-") idiom to create a filehandle for writing to the child, which
immediately exec()'s a sendmail (or whatever) process.  The parent
returns the filehandle, to which the message is printed; the filehandle
is then closed for final sending (this is all hidden in the
Mail::Internet and MIME::Entity classes' send() method).  However,
if you are running in taint mode with an insecure path (for one
example), the exec() will fail in the child and will die.

If you were running this in an eval {} block, and didn't account for
the possibility of a fork within the eval{}, you could find that both
code paths — the success AND the failure code blocks — get
executed.  Since this is often done for db transactions or other
things that might be shared external resources, this could lead to some
nasty race conditions.
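The failure mode is easy to reproduce without Mail::Mailer at all. This distilled sketch (mine, not the module's code) shows a single eval {} yielding both branches, once per process:

```perl
use strict;
use warnings;

my $ok = eval {
    # Mail::Mailer's send() does essentially this: fork via open("|-")
    # and exec() the mailer in the child.
    my $pid = open(my $fh, '|-');
    die "fork failed: $!\n" unless defined $pid;
    if ($pid == 0) {
        # Child: simulate the exec() failing (e.g. under taint mode).
        die "exec failed\n";
    }
    close $fh;    # parent: everything appears fine
    1;
};

# BOTH lines appear in the output: the parent ($ok true) prints the
# first, and the forked child ($ok false) prints the second -- one
# eval block, two processes, two code paths.
print "pid $$: success branch\n" if $ok;
print "pid $$: failure branch\n" unless $ok;
```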

In defense of Mail::Mailer, it is *technically* the job of the coder to
check on forks, but that argument, taken ad absurdum, would have every
line that calls module code wrapped in an elaborate eval with checking
of the PIDs.  Clearly not OK.

I have explained this bug and opened it up to discussion on
perlmonks.org, at http://perlmonks.org/index.pl?node_id=459739 and have
reported the bug in Mail::Mailer under the MailTools distribution at

The workaround at present is to either 1. obsessively check the PIDs
before and after the eval, or 2. use MIME::Lite, which appears not to
fork.  NOT a valid workaround is to ignore this because your
exec() hasn't died yet, or to turn off taint mode.

FIX: Can't locate object method "select_val" via package "DBI::st" under Class::DBI 0.95

[warning: see updates below]

Between Class::DBI 0.94 and 0.95, something changed causing my classes that override db_Main() to set the db connection (instead of, e.g. using set_db) to stop working.  Specifically, when their sequence functions were invoked, I got an error of:


Can't locate object method “select_val” via package “DBI::st” (perhaps you forgot to load “DBI::st”?) at /usr/lib/perl5/site_perl/5.6.1/Class/DBI.pm line …


I was able to replicate this on Mac OS X 10.2 with Perl 5.6.0 and Red Hat 9 with 5.6.1 (don't ask about the latter version magic…).


If you get select_val errors with Class::DBI 0.95, here are two workarounds:



I am not sure why this is (comments are welcome) and have submitted a bug to the developers as CPAN #5522.

Update: Thanks to Tony Bowden, maintainer of Class::DBI, for his reply:


Full context and any attached attachments can be found at:
<URL: http://rt.cpan.org/NoAuth/Bug.html?id=5522 >

On Mon, Mar 01, 2004 at 06:42:38PM -0500, Guest via RT wrote:
> In Class::DBI 0.95, setting up the DB connection by overriding db_Main
> breaks Ima::DBI->select_val and the methods that rely on it (like sequence
> and count_all)

You need to call Ima::DBI->connect rather than DBI->connect in your
overriden db_Main.
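In code, the maintainer's advice looks something like this (the class name and DSN are made up for illustration, and note the 23 April update below before adopting this wholesale):

```perl
package Music::DBI;    # hypothetical base class for your table classes
use strict;
use warnings;
use base 'Class::DBI';
use Ima::DBI;

sub db_Main {
    my $class = shift;
    # Wrong (what broke under 0.95): a plain DBI->connect returns a
    # handle without Ima::DBI's select_val and friends:
    #   return DBI->connect('dbi:Pg:dbname=music', 'user', 'pass');
    # Right, per the maintainer: connect through Ima::DBI instead.
    return Ima::DBI->connect('dbi:Pg:dbname=music', 'user', 'pass',
                             { AutoCommit => 0 });
}

1;
```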



Still not certain, though, why it is that it breaks in 0.95 and not 0.94.

Update: Thanks again to Tony, who writes:

The reason this broke in going from 0.94 to 0.95, btw, is that the
select_val stuff was only added to Ima::DBI recently, so Class::DBI .94
didn't use these, but instead rolled its own long winded versions.

0.95 uses the new methods in Ima::DBI and is a lot better off for it! 🙂


Update 23 April 2004: Things are all wacky now.  You should be careful and should probably NOT do any of the above.  See the 06 April 2004 exchanges on the CDBI mailing list.  If you do what is described above you stand to get reconnections that will mess up things like auto_increments in MySQL.  At present the issue of how to deal with db_Main() seems unresolved.

Once Upon A Time

[Update: As is often the case when lots and lots of people (say, the whole Internet)
look at a problem, I came to this conclusion independently along with a
whole bunch of other folks.  I wrote this freshman effort at
blogging prior to becoming aware of the “Eternal September” concept;
however, this trope of pre/post-1993 Internet quality is much more
concisely described by the “Eternal September” entry in Wikipedia:
http://en.wikipedia.org/wiki/Eternal_September .  My take on it
doesn't put as much blame directly on AOL users as the folk wisdom of
Eternal September does; I try to look at structural differences in the
modes of communication and speculate as to their effects on the types
of interactions that went on.]

Once upon a time, the Internet was cool (circa pre-1993). At that time
there was a lot of info with a decent signal to noise ratio, and a lot
of knowledgeable people. You could read the FAQs for a newsgroup on a
subject (anything from hang gliding to Germany) and get a fairly good
dose of knowledge on the topic, as well as a direct line to a bunch of
people who knew it well. Is there a way to get something as cool as
that back out of today's incarnation of the Internet (that is, the
largely Web-mediated experience)? I hold that maybe there is some hope
and that we can get the Internet back to being somewhat collaborative
and useful again.

If the Internet was so grand, what did people
do with it back then? There was the normal Internet stuff that still
goes on today and will probably go on forever: email and FTP, which
respectively served the most personal and most technical needs of its
users (sending letters and distributing software). There was real-time
chatting of various types, much as there is today. But the big
difference in the way people interacted then and now is the difference
between Usenet and the Web.

Usenet (a.k.a. netnews or
newsgroups) provided for the syndication of so-called “news” messages
grouped into subject-matter categories. In practice, these newsgroups
weren't really news per se. They were rather forums for discussion and
debate by people, often quite knowledgeable people, about defined
subject areas (of all sorts, but most commonly political/religious
debate, hobbies, and computer/technical issues). People built up their
reputations by contributing constructively to these discussions, but the
most prestigious thing you could do within the context of a newsgroup
was to help maintain its FAQ. The Frequently Asked Questions list was
kind of a “greatest hits” of the newsgroup's content. Most of the
active newsgroups had these FAQs, and they were routinely made
available in the context of the newsgroup itself as well as being
archived and distributed as ends in themselves. The maintainers of an
FAQ of course had to be able contributors who would structure and even
add novel material to the FAQ, but the document really represented a
collaborative effort of the group's active members, and was often
largely paraphrased or excerpted from newsgroup postings (with
attribution; another honor for the constructive group member).

(There was of course no such thing as a newsgroup that had only one member who
wrote the FAQ based upon his own discussion with himself and the
questions he had answered. The idea would be preposterous; newsgroups
were collaborative centers.)

(Note that the kind of knowledge
I'm discussing here is not the easy kind, like stock quotes, movie
times, sports scores, etc., which various companies have already
handled quite well [and which, I may add, were not nearly so easily
available during the Usenet era]. I call that the “easy” kind of
information because it's easy to imagine the SQL statement that
retrieves it, e.g. select showtime, location from movie_showings where
film_id = 38372 and city_name = 'boston'. I'm more interested in domain
knowledge of a particular field, such as “what are some good books I
should read to learn about hang gliding,” or “what does it mean if
program foo version 4.21 says 'error xyz-2?'”)

Sometime after
1993 a bunch of things started happening: commercial spam began to fill
up Usenet and folks' email boxes; waves of the uninitiated began
incurring the wrath of old-timers by their breaches of netiquette,
leading to a general lowering of the signal-to-noise ratio; and, of
course, people got turned on to this whole idea of the Web. Here was a
medium in which anyone could become a publisher! If you were expert on
a topic, or if you had a cool digital photo, or if you just happened to
know HTML, you could publish a Web site and become sort of famous! Of
course, this was a pain in the ass: posting on Usenet just meant typing
an email message, but having a web page required knowing and doing a
lot of tedious but not very interesting stuff, so you really had to
have some time to put into it.

However, the Web had pictures and
clicking with the mouse, while Usenet had boring words and typing —
and AOL users were starting to come onto the Internet. So the Web took
off.

The dominant mode for interaction on the Internet — but
more importantly, for publishing of subject-matter knowledge — moved
away from Usenet to the Web. (Of course, Usenet is still around, and
the newsgroups generally put their FAQs on the Web, but a newcomer to
the Internet might never even hear of Usenet during his Web-mediated
experience.) Rather than posting an article to a group and waiting to
read other articles posted in response, you now published a “site” and
counted how many visitors came. (Plus, you could enjoy hours on the web
without ever using your keyboard, which meant of course that its users
were even physically disconnected from the means of actually inputting
any information.)

Everyone who was an aspirant to Web fame and
had an interest in model trains, say, would create his own model trains
Web site, provide his own set of (supposedly) interesting content, and,
often, maintain his own FAQ of questions asked of him by visitors to
the site. At first, these aspirants were individuals, but soon enough
affinity groups or associations and commercial interests got involved,
doing basically the same thing. Perhaps you see where I am going with
this, gentle reader. The way in which personal knowledge was packaged
up and distributed became centered on the individual, and the
relationship changed from one of collaboration between peers to one of
publisher and reader.

A well-known lament about web publishing
is that unlike print publishing, the cost is so low as to admit
amateurs, crazies, and just plain bad authors — anyone with sufficient
motivation to brave the arcana of FTP and HTML. On the other hand, I
have just complained that the model simultaneously changed from a
peer-to-peer to a client-server relationship. Could it be that both of
these charges are true? It seems this would be the worst of both
worlds: not only are people no longer as engaged in the constructive
addition to the commons, but those that control the production and
distribution of knowledge aren't even filtered out by the requirements
of capital investment. It's like creating a legislature by taking the
worst parts each from the House and Senate. Sadly, this describes much
of the past ten years of the Internet's history.

However, there
is some hope. Whereas previously, “anyone” could have a Web site but
precious few put in the many hours it required in practice, the promise
of Weblogs is to actually open Web publishing to “anyone.” This won't
filter out the crazies, but at least it won't artificially inflate
their importance by raising the bar just high enough to keep everyone
else out. Comment forums, content-management systems, Wikis,
trackbacks, and the like are helping to re-enable the sort of
collaboration that made the Usenet system work.

Bottom line: it rather feels like we're almost back to 1993.

Next time: future directions, pitfalls, and why blogging (alone) is not the answer.

FIX: Compiling SWI-Prolog on Mac OS X 10.2

SWI-Prolog is available as prepackaged binaries for Mac OS X 10.3+, but
not for 10.2.  If you try to install the 10.3 binary package, you
will get errors (at least, I did).  The answer is to compile from
source.  You are probably compile-savvy if you are looking to
install a Prolog interpreter, but if not, it's a fairly painless
./configure, make, make install process.

1. However, the docs warn that you'll want readline and a number of
other libraries installed.  There are some binary packages on the
SWI-Prolog site.  If you want to use those, and you don't have any
other versions of the libraries, so be it — but I would recommend
using Fink instead, so that you can install the most up-to-date
versions.
2. Especially if using Fink, be sure to alert the ./configure script to
the locations by including CFLAGS="-I/sw/include" and
LDFLAGS="-L/sw/lib" (or wherever).

For me, all it took was pointing the configure script to the /sw tree and it compiled with no further questions.
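Concretely, the invocation I mean is along these lines (paths assume a stock Fink install under /sw; adjust for your setup):

```shell
# -I belongs in CFLAGS (header search for the compiler) and -L in
# LDFLAGS (library search for the linker).  /sw is Fink's default
# prefix, not a universal location.
CFLAGS="-I/sw/include" LDFLAGS="-L/sw/lib" ./configure
make
sudo make install
```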