Feb 13 2006

I heart google.

I know there is a trend amongst some to decry google, but they consistently produce little nuggets of software delight. I went back to their labs, trying to find the google homepage api to see if I could satisfy my epiphling. The epiphling awoke to me clicking on the first result from google, and thinking “why doesn’t it know that I feel lucky?” To which it answered,

〖well, you already have all your search history, so why not just use a greasemonkey script to auto-forward to the site I would have clicked on? Further, if we can’t determine with certainty which site we should go to, leave the page open but open in tabs the five sites we were most likely to click on. Subsequently watch to see how the opened tabs are treated (Are they closed without scrolling? Scrolled all the way? Have multiple pages opened from it?) and use those values to refine the tab-opening weighting.〗
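The weighting part of that idea can be sketched roughly as below. This is only the scoring logic, in Python rather than as an actual greasemonkey script, and every name and constant in it (`TabOutcome`, `engagement`, `refine`, `pick_tabs`, the 0.5 starting weight, the 0.9 "certainty" threshold) is an assumption made up for illustration, not anything google or greasemonkey provides.

```python
from dataclasses import dataclass


@dataclass
class TabOutcome:
    """What happened to a tab that was opened automatically (hypothetical)."""
    scrolled_fraction: float  # 0.0 = never scrolled, 1.0 = scrolled to the bottom
    pages_opened: int         # how many further pages were opened from the tab


def engagement(outcome: TabOutcome) -> float:
    """Squash the observed behaviour into a score between 0 and 1."""
    score = 0.5 * outcome.scrolled_fraction + 0.25 * min(outcome.pages_opened, 2)
    return min(1.0, score)


def refine(weights: dict, site: str, outcome: TabOutcome, rate: float = 0.2) -> dict:
    """Nudge a site's weight toward the engagement just observed."""
    old = weights.get(site, 0.5)  # unknown sites start in the middle
    weights[site] = old + rate * (engagement(outcome) - old)
    return weights


def pick_tabs(weights: dict, candidates: list, certain: float = 0.9, n: int = 5) -> list:
    """Return one site if we are 'certain', otherwise the n most likely ones."""
    ranked = sorted(candidates, key=lambda s: weights.get(s, 0.5), reverse=True)
    if ranked and weights.get(ranked[0], 0.5) >= certain:
        return ranked[:1]   # auto-forward case
    return ranked[:n]       # open-in-tabs case
```

A tab closed without scrolling drags its site's weight down; one that was read to the end and clicked through pulls it up, so the tab list should drift toward the sites that actually get read.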

Having found the API I also came across this spiffy “site flavored” search tool. Basically you give it your website, and it tries to auto-classify it. It could not classify mine (which, I must confess, made me preen a bit; though in all honesty it should shame me) and so I provided it with all sorts of details about the categories of my site – to their credit they let you pick many leaves from their categorical tree. What follows is the result, which I will integrate into this site as soon as I take the time to de-fugly the code:

[Google site-flavored search box]


So then I thought I’d try to use google sets as a sort of oracle, and gave it five terms that seem related in my head; its expansion of the set was quite pleasant. Almost makes me want a google set API :)


Feb 1 2006

Geek Heaven

My lovely and talented wife had finally succumbed to my remorseless campaign for a wifi enabled Palm (despite the parlous state of our finances, the $2500 mac at my desk, and the Tungsten T), and so I won a Tungsten C. Unfortunately “won” has turned into a bit of a euphemism. After failing to get in touch with the seller three times over six days, I was beginning to hope I could back out of the auction and get a Sony Clié UX-50 instead – it is around the same price-point and has a nicer keyboard, bigger monitor, and a built-in camera, but has the downside of being a discontinued product line and only accepting Sony’s memory-dick cards.

So, of course, within an hour of me emailing eBay’s customer service, the guy emails me and gives me enough information to convince me he’s legit and has a valid enough excuse (he was setting up his new intel iMac). So I paypal him the next day. A day or so after that, I get an email from PayPal saying “Your package will be shipped by PayPal shipping.” Great! I thought, he’s got a pre-printed label and everything, it’s going to be here soon!

Turns out that the “will” was meant more indefinitely than my hope indicated, and as of today USPS says only that they’ve been notified by the shipper. So when a school comrade said he’d buy it off me if it ever showed up, my heart leaped for joy.

So I just bought a brand fucking new Palm T|X, which with my trial Prime subscription will get here Friday, and I am very, very excited.

I will be able to sync project-related program activities over wifi to my server running OpenGroupware, read scientific papers without lugging about printouts, and continue writing for Winds of Titan (unlike most people, I seem to be fully acclimatized to Graffiti, and the T|X’s write-anywhere feature gets rid of the one thing that was slowing me down – the need to keep my hand from wandering from the input area).

There are a bunch of other things I want to do with it; I’ll be posting here with what works for me. Thanks, Vika!


Jan 30 2006

Installing OpenGroupware on Ubuntu/Debian

All of these instructions relate to OpenGroupware 1.0beta2. Although the instructions on the site suggest otherwise, it seems like all of OpenGroupware’s myriad dependencies are available if you have debian unstable in your /etc/apt/sources.list. So if you download OGo source, untar it, cd into the created directory, and do dpkg-buildpackage -b or debuild -b (for which you may need to install the relevant packages), you’ll get an error that tells you which packages are missing on your system. sudo apt-get install those, and run debuild -b again.

If you have failed to fully appease the gods, it will error again with a new list of packages you need to install. Repeat this process until your obeisance is complete, and the compile begins. In addition to those packages, you will also need the ngobjweb adaptor for apache (1 or 2: I use 2), available via sudo apt-get install libapache2-mod-ngobjweb

Before the next step, you should have an operating PostgreSQL database (I installed 8.1, it seems to work fine with the 7.x adapter provided in OGo) and a user for opengroupware that can create new databases. Since the later install scripts assume this user is called ogo, I did sudo -u postgres createuser -r -d -l ogo. Then you need to make sure that postgresql is listening on localhost: just uncomment the line in /etc/postgresql/…/postgresql.conf that mentions localhost and restart postgresql.

After you are done getting things up and running, you can remove the db creation permission from the OGo user, but the install will fail with a useless and mostly wrong error message if you’ve had the temerity to create the user and its database. If you get this error message, don’t panic – just create the database yourself and load in the initial database schema:

$ sudo -u postgres createdb ogo
$ sudo -u postgres psql ogo
ogo=# \i /usr/lib/opengroupware.org-1.0/commands/OGo.model/Resources/pg-build-schema.psql
==snip==
ogo=# GRANT ALL PRIVILEGES ON DATABASE ogo TO ogo;

When you’re done compiling, you’ll have a bunch of .deb packages in the directory above the directory you ran the build command in.

At this point, before you continue, you must have postgresql running, listening on localhost, and with a password-protected db user that can create new databases.

Now you want to run sudo dpkg -i *.deb. This will install all the OGo packages. You may have some errors during this install; go ahead and install whatever packages it kvetches about not having.

When you are prompted for information about what versions of the various components you want to install you can just accept the default answer. I have no idea whether it matters for the user to be able to create roles, but answering yes here seems sane. If you want the network hotsync daemon (nhsd) to run, you need to change the default answer (none) to 1.0. (There is no port information because it runs on the standard network-hotsync ports.)

Now you’re almost home free. Unfortunately, /etc/apache2/conf.d/mod_ngobjweb-ogo.conf contains an error: where it says LocationMatch "^/OpenGroupware/*" it should read LocationMatch "^/OpenGroupware*" (no slash after the e). You can use any string in substitution for OpenGroupware: I like gw, for instance.

You should be good to go: please feel free to comment on this post if you have any issues.

There are some more fiddly bits involved in getting OGo to talk to other clients and whatnot but I will get to that in another article.


Jan 14 2006

upgrades.

Well, being the early adopter monkey I am, I have gone ahead and updated to wordpress 2.0. Plusses: the new admin interface is quite nice. I’m enjoying the wysiwyg editor; hopefully it will still be enjoyable when I resort to more advanced htmlishness. Unfortunately the old theme I used no longer works, so I’m using this orchid theme for the moment, though I’m also looking into K2. I’ve hooked up to the new anti-spam plugin that comes in 2.0, but am still uncertain whether wp-hashcash, my previous fave anti-spam plugin, works with 2.0. Other plugins I’ve installed are Sitemap Generator – I’ve stuck with the 2.7 version – and flickrRSS – so as to selectively include flickr posts in my site.

In other news, my struggle to install horde is both complete and not yet begun. Things work for the moment, but after I finally punted on getting ldap to work and got mysql-based authentication to work I realized that one of the main drivers of using ldap was so that contacts would work in ldap-aware imap clients like Thunderbird.

Also it seems like my fantasy about being able to sync everything from horde to mac is still in beta, and the only way to get it installed is to go ahead and move to a HEAD cvs checkout … which is lovely, but I do need to use it. So I guess on the agenda is making a head-horde, so as to not interfere with the progress I’ve made so far.

Still, horde seems like the most actively developed tool of its kind – if anyone knows of OSS that’s better for webmail/project management/calendaring, I’d love to hear about it. I’ve seen some sexy AJAX toys around, but my attempts to install them have been for naught.


Jan 6 2006

A hard day’s night and (back!)

I’ve just completed getting ldap cooperating with horde … what this means is that all five of you using mindlace.net services will get One Username/Password to Rule them All: it’ll work for webmail, calendar, dav, subversion, jabber, and even yer blog if’n you like.

Plus, there will be a spiffy note-taking app, a todo-list app, and a server-side filtering app for your imap email.

I also learned, in the way one might describe a cow learning her owner when the searing hot iron presses into her flank and the warm smell of seared cow flesh drifts into the cold morning air, that if you are using Apache 2 and php 5, setting post_max_size = 2048M in your php.ini file is really a fancy way of saying “please drop my POSTs on the floor”. In other words, none of the form data arrived with the request … but the request itself did. These being well behaved php programs, when you send them nonsensical input they give you somewhat reasonable error conditions; it just took me forever to realize that php was eating my POSTs when I had asked it to accept up to two gigabytes of POST data.
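The post doesn’t say why 2048M misbehaves. A likely explanation, offered here as an assumption rather than something verified on that server, is that a 32-bit php build stores the parsed size in a signed 32-bit integer, and 2048M is exactly one byte-count past the largest value that fits. The Python sketch below imitates that arithmetic; `parse_shorthand` and `wrap_int32` are made-up helper names, not php functions.

```python
def parse_shorthand(value: str) -> int:
    """Turn a php.ini-style size such as '2048M' into a byte count."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = value[-1].upper()
    if suffix in units:
        return int(value[:-1]) * units[suffix]
    return int(value)  # no suffix: plain bytes


def wrap_int32(n: int) -> int:
    """Simulate storing n in a signed 32-bit integer (two's complement wraparound)."""
    return (n + 2 ** 31) % 2 ** 32 - 2 ** 31


for setting in ("2047M", "2048M"):
    wanted = parse_shorthand(setting)
    print(setting, "asks for", wanted, "bytes, stored as", wrap_int32(wanted))
```

If that is what happened, the limit ends up negative, every POST body is “too large”, and staying at 2047M or below would avoid the wraparound.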


Jul 7 2005

mindlace.net jabber server

Gentlebeings,

I have added an ssl-encrypted jabber server. If you would like to connect to it, sign up for an account:

User Name
Password
Repeat Password

You will get additional instructions after filling out the form.


Apr 25 2005

Britannica: The Old Guard

Britannica is the oldest English-language encyclopedia, published since 1768. The first edition was printed in Scotland, and was an instrument of the Enlightenment that fomented there.

The former Editor in Chief of the Britannica does not have a high opinion of Wikipedia; he calls it “the Faith Based Encyclopedia”. Wikipedia offers a “random page” feature; Britannica does not. I used it to choose 5 articles, and attempted to find the same, or a similar article in Britannica. I’m not providing links to any of the Britannica articles because it is behind a paywall and I’m accessing it via reverse proxy.

The first random page was “Cahoots”, an album put out by a band called The Band. Wikipedia features the album, reviews, a picture of the album cover, and a link to an article about The Band itself. Britannica has an article for The Band that mentions “Cahoots”. It has no links to reviews, but there are two ‘interactive’ shockwave elements that look like they’re from the CD version. Both come out strong on this one; Wikipedia has more links to other sources, but Britannica has more media.

Next article was Hillsdale, New Jersey. Wikipedia features a map that shows the borough’s relationship to its county and New Jersey as a whole, GPS coordinates, demographics, and links to the official site along with other map and photo information. Britannica mentions it in its entry on New Jersey. Wikipedia took the prize this round.

Then came the Chislehurst Caves, which are abandoned mines in Chislehurst, England. Wikipedia has two paragraphs about them; one about their history and another about their mythology, with two links, one to the ‘official’ page, and the other to an English guidebook’s opinion of the area, which itself runs to over a page. Wikipedia again.

The page on John Murray Gibbon was a brief two-paragraph discussion of the Canadian historian, and his advocacy of folk arts. There was no information on him in Britannica.

Wikipedia has a page of a paragraph or so on Dollis Hill, an area of London, England. It describes its location, how it got its name, and famous residents and visitors. Britannica mentions the Dollis Hill house when discussing the borough of Brent.

While both Britannica and Wikipedia show a strong Anglo-Saxon bent, Wikipedia is clearly the more useful reference source, if only for its extensive linking to other reference sources. That seems to obliterate the argument advanced by the Britannica editor – that it was a bad thing for the unwashed masses to be able to change anything – the level of trust you need to click on a link is really low, and it’s obvious (and rapidly remedied) when that trust is breached.


Apr 22 2005

I am a horrible person to have on a tech support line

Here’s the context. I have a SuperDrive in my PowerMac G5 2GHz that intermittently fails to read certain data CDs. It has essentially no problem (aside from shitty error recovery from read failures) reading DVDs. It has no problem reading CD-Rs. But it does have a problem where it gets locked in seeking for the start of the disk, and seeks over and over again for minutes before ejecting. It does this on bootup, before the OS is loaded. These CDs work flawlessly in Vika’s 12″ PowerBook G4. I do not have any original media for this machine anymore.

Now, I make bupkus. Of that bupkus, Apple Corporation receives more than what the Catholic Church would have asked of me were I not an apostate. I paid $200 to get the extra-special care package, and I spent more money than was strictly rational to get this tower.

I am invariably wildly irrational with people who walk through a script without understanding the context, ask me to do things I’ve already tried, and then don’t know enough about the boot sequence to recognize that a drive error before the root partition is found is not the fucking OS’s fault.

When I begged him to at least note the fact that data CDs that read fine in other drives don’t read in this drive, he refused. I suggested that while Apple did not see fit to actually require any product knowledge, they surely required him to be able to type.

We went back and forth like that for a while. Eventually he put me on hold having not typed anything. After 10 minutes, someone else came on the line, and I made an ass of myself carrying over my frustration from the last conversation.

Finally he agreed to a replacement. I asked him if I could pay extra to have a better drive than the one I started with. He said there was no way to do that.

I guess it’s time to install Linux, and see if I can wean myself off of Apple hardware.


Conclusion: The replacement drive arrived. It works flawlessly. Thanks for making me waste three hours of my life proving that my drive was dysfunctional, Apple.


Jan 10 2005

Comment spam fix

I have been struggling with comment spam, but I think I’ve vanquished it with this latest plugin. It makes the user compute an md5 hash in javascript before submitting a page. Basically, an md5 hash is an effectively unique fingerprint of some binary sequence, that can only be computed by running the md5 algorithm. What this script does is take the IP address of the user agent hitting your post page, a site-specific string, the user agent string, and the time down to the hour. Then it md5 encodes that string, which means analysis of the string itself can’t reveal how it is generated.

It inserts into each page a randomly-named javascript function that, upon submission, computes the md5 for the md5-encoded string and makes that the name of a hidden form variable, which in turn has a unique value.

What this means is this:

  1. A spammer must visit every comment page in order to get the information needed to comment on each page.
  2. A spammer must implement enough javascript & DOM to handle the function.
  3. A spammer must compute an md5 sum.

All of these things are expensive, computationally speaking. Calculating the md5 sum itself is not so bad, but rendering a DOM tree & loading a javascript engine is fairly expensive if you’re trying to spam millions of pages.
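The scheme described above can be sketched in a few lines of Python. This is a reading of the description, not the plugin’s actual code: the function names, the `|` separator, the hour format, and the sample values are all assumptions, and the “unique value” the plugin stores in the hidden field is left out.

```python
import hashlib
from datetime import datetime, timezone


def page_token(ip: str, user_agent: str, site_secret: str, now: datetime) -> str:
    """String embedded in the page: md5 of IP, site string, user agent, and hour."""
    hour = now.strftime("%Y-%m-%d %H")  # time down to the hour
    raw = "|".join([ip, site_secret, user_agent, hour])
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def expected_field_name(token: str) -> str:
    """What the in-page javascript must compute: the md5 of the embedded string."""
    return hashlib.md5(token.encode("utf-8")).hexdigest()


def is_valid(form: dict, ip: str, user_agent: str, site_secret: str, now: datetime) -> bool:
    """Server-side check: was the hidden field with the right name submitted?"""
    return expected_field_name(page_token(ip, user_agent, site_secret, now)) in form


# Hypothetical visitor and site values, for illustration only.
now = datetime(2005, 1, 10, 14, 30, tzinfo=timezone.utc)
token = page_token("192.0.2.7", "ExampleBrowser/1.0", "my-site-string", now)
form = {"comment": "nice post", expected_field_name(token): "1"}
print(is_valid(form, "192.0.2.7", "ExampleBrowser/1.0", "my-site-string", now))
```

Because the IP, user agent, and hour all feed the hash, a field name harvested from one page or one visitor is useless anywhere else, which is what forces the spammer to fetch and run every page.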

This may not be the ultimate fix, but I suspect it may put the “cost” of generating spam too high for most spammers. If you have a WordPress blog, I can’t recommend the WP-Hashcash plugin enough.


Jan 18 2004

Want a DOI for your webpage/site? hah!

In case you were thinking hey, maybe I should get me a DOI for some thing I just wrote, so as to make it easier for people to find in the future – think again.

The DOI system is loosely modelled after the domain system, but unlike it – where the process to become a registrar is clear, and anyone can buy from any registrar – the DOI system is farmed out to cartels, who appear to have near exclusive rights over their ‘regions of interest’. None of these cartels actually allows you to walk up and get a DOI number, but all of them will allow you to beg for it via email, and they’ll get back to you.

To join the cartel costs $35k/yr, and then a fee per DOI number – it’s as if you had to pay a penny to your registrar every time you added a new url on your website. I would be shocked if these costs weren’t passed on to the people they license out this technology to.

This cartel exists “so that the integrity of the DOI system as a whole is maintained at the highest possible level (delivering reliable and consistent results to users).”

Essentially, what they’re selling is a tiny url with one extra ability: once you register a url, you can go back and change where it points.
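Stripped of the politics, that service amounts to a lookup table whose keys never change and whose values can. A minimal Python sketch of that idea follows; `Resolver` and the identifier `10.9999/example` are invented for illustration and have nothing to do with how the real DOI infrastructure is built.

```python
class Resolver:
    """A permanent identifier that can be re-pointed at a new url later."""

    def __init__(self):
        self._targets = {}

    def register(self, identifier: str, url: str) -> None:
        """Claim an identifier once; it can never be handed out again."""
        if identifier in self._targets:
            raise ValueError(f"{identifier} is already registered")
        self._targets[identifier] = url

    def update(self, identifier: str, url: str) -> None:
        """Change where an existing identifier points."""
        if identifier not in self._targets:
            raise KeyError(identifier)
        self._targets[identifier] = url

    def resolve(self, identifier: str) -> str:
        """Return the url the identifier currently points at."""
        return self._targets[identifier]


resolver = Resolver()
resolver.register("10.9999/example", "http://old.example.org/paper.html")
resolver.update("10.9999/example", "http://new.example.org/papers/1")
print(resolver.resolve("10.9999/example"))
```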

Anyway, a bunch of people in scientific publishing have signed on to this farce, so we can’t imagine it’ll go away. From my angle, however, it looks like a triage opportunity.