Java, Pascal Classes; Fall 1998; Becoming
In the Fall of 1998 I started to take classes in Pascal and Java at Santa Monica College – the Pascal class was interesting for a while, but I dropped it. My own reading at the time was doing what I needed. At the time the only programming I was doing was some Perl and some JavaScript. The Java class was pretty mind-blowing. I grokked object-oriented code, but the class rapidly got ahead of my own experience with OO code. I remember being right there with the instructor for about 3 weeks, then I was in way over my head. I’m glad I took those classes, but I am only a dilettante programmer. My skills are multivariate; this is my strength. I need to get comfortable with that.
I have been going through all my old papers and objects. This morning I came across my notebook from SMC. I write it here because I’m throwing out the notebook.
I’ve got a zen calm about letting go of these things I have packratted.
I am ever in the process of becoming. Refining. Rehabilitating. Remaking. Reclaiming.
Onward.
Tags: Java, Pascal Classes, Perl, Santa Monica College
Search Engine Upgrade to ht://Dig
Last night and this morning I installed htdig as the new internal search engine for this site. Back in August I mentioned that I would start using Google because the solution I was trying had stopped working with any reliability. ht://Dig is open source and originated here in San Diego at SDSU.
Total time for installation and customization was about 5 hours. This is valuable information in case I ever need to install an htdig search engine for a client. There are lots of small details in doing this installation. I downloaded the installation as a tar.gz file, then decompressed that to a suitable location (cgi-bin). Then I had to do configure, make, make install. Installing unix software is always an adventure. This site runs FreeBSD (see: colophon), and I was delighted that it went pretty smoothly.
Then I was ready to start running it. This got tricky, but it stayed manageable because I was able to tweak the conf/htdig.conf file to do what I like. rundig is the key to indexing a site. At first I had broken images, but the search itself was working properly. Initially it indexes the htdig site itself. Just like any web robot, it goes out and looks at that site just as a browser would. This put my mind at ease, as I was not sure how it would deal with databased content, or the fact that the pages on my site are very include() driven. I was also concerned that because it is a local search engine, it would index files I don’t want indexed. The Perl search engine I had originally installed had this problem. It would find older versions of files and garbage files that had become garbage for a reason.
As I got it working, and pointed it at artlung.com, I found a problem. The indexing process was taking far too long. Seems I had an infinite loop happening! The culprit was my accessibility slideshow from 1999. The [next] and [previous] links did not give any thought to whether they should actually show or not. I had written the PHP for that when I really knew very little PHP, and I ended up with the search engine indexing not just /words/accessibility/?i=0 to /words/accessibility/?i=10, but iteratively visiting the “next” and “previous” links like crazy. ?i=-1, ?i=-2, ?i=-3, and on until I stopped it at ?i=-115. That would have been 115 versions of the “previous” page that were no different from the “first” page. The PHP I had written in 1999 was smart enough to handle bad values for $i, but not smart enough to realize that there were no “previous” pages for those pages. The “next” links had the same problem. The htdig indexer was not smart enough to know that it was indexing hundreds of nearly identical pages. The solution was to fix the slideshow code so that it would not produce spurious links like that. After that fix, it was indexed properly and quickly. This is probably another reason that many search engines simply won’t touch pages with querystrings.
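Here is a minimal sketch of that kind of fix, written in Python rather than the original PHP (which isn't reproduced here). The function name `nav_links`, the `slide_count` parameter, and the default `base` path are assumptions for illustration only. The idea is to clamp bad values of `i` to a valid slide and to emit a "previous" or "next" link only when that slide actually exists, so a crawler has nothing to follow past either end.

```python
def nav_links(i, slide_count, base="/words/accessibility/"):
    """Return (previous_url, next_url); None means 'show no link'."""
    # Clamp out-of-range values (e.g. ?i=-115) to a valid slide index.
    i = max(0, min(i, slide_count - 1))

    # Only offer "previous" when there is an earlier slide.
    previous_url = f"{base}?i={i - 1}" if i > 0 else None

    # Only offer "next" when there is a later slide.
    next_url = f"{base}?i={i + 1}" if i < slide_count - 1 else None

    return previous_url, next_url
```

With slides numbered 0 through 10 (so `slide_count` is 11), the first slide gets no "previous" link and the last gets no "next" link, which gives an indexer a finite set of pages to visit.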
The next problem I had was that it was showing bad search results for certain pages. Example: I searched for the word “Zappa” – and I got far more results than I would have expected. Granted, I am a Frank Zappa Fan, but why would the bio page come up in a result for that? Turns out the indexer found the entry inside the bottom
Tags: bad search results, htdig search engine, htdig site, html, include-driven site, local search engine, Perl, perl search engine, PHP, san-diego, search page, search engine, search engines, search results, unix, unix software, unix system, web robot
Comparative Anatomy for Server Side Scripting – two small, trivial examples of the same thing, done different ways – in Perl, in Cold Fusion, in PHP, and ASP.
Tags: Perl
Initially this site had a search powered by a script from Matt’s Script Archive. Based on the timestamps on files I think I installed it in March of 1999. Last month I replaced it with another service. Problem is, a few weeks later, based on testing it now and again – it failed more often than it worked.
So what’s the answer?
Google! of course. Google comes around and indexes roughly monthly. What that means is that the blog will not always be instantly searchable. And day-to-day changes will only be searchable with a lag. But Google is good, and predictable.
If I can find a PHP solution that’s painless I may change again.
I learned a great deal on that first Perl-based search though. Installing, configuring, testing.
