Tuesday, July 01, 2008

Update on Gambit-C Scheme benchmarks

Thanks to some helpful comments from a few people on my previous post about Gambit-C Scheme, I rebuilt and ran (with a larger minimum heap size specified) those few benchmarks I was bothered pulling off their site and the shootout site for SBCL (which is a very fast Common Lisp implementation I'm interested in using as a kind of reference).

The results in general are much better:

time ./fannkuch -:m20000 11 > output_fannkuch

real 0m9.931s
user 0m9.465s
sys 0m0.172s

time ./fasta -:m20000 25000000 > output_fasta

real 0m29.843s
user 0m25.220s
sys 0m2.700s

time ./binary-trees -:m20000 16 > output-binary-trees

real 0m2.770s
user 0m2.581s
sys 0m0.089s

time ./pidigits -:m20000 2500 > output
real 0m3.322s
user 0m3.093s
sys 0m0.110s


A range of improvement is apparent, with binary-trees benefitting a lot and pidigits not really getting much better.
I picked a 20 meg heap because a) it seems wasteful for individual programs to reserve (and probably waste) huge chunks of memory, if you're running a lot of programs b) when I tried higher values up to 50 megs on the fasta benchmark, it was faster, but only by another .2 seconds or so; about 1-2%... and c) Most other languages, and in this case SBCL, don't seem to request more than about 20 megs for their initial heap size either.

Of note is the fact that the prebuilt version produced slightly faster code than the version I compiled myself, specifying the optimisations mentioned in the INSTALL.txt document. Not sure why... These numbers are for the Gambit I compiled myself, and I didn't feel like dumping it and re-installing the prebuilt version just yet.

Anyway, these are nice results!

Sunday, June 29, 2008

How fast is Scheme? Well....

Thinking that it'd be a good idea to become familiar with Scheme as well as Common Lisp (but, er... probably after I've read Practical Common Lisp and amn't mentally overloaded, what with me being mental so busy and needing to focus my efforts in a more single-threaded manner), I checked out Gambit Scheme in the hopes that it'd produce faster code than the other free Scheme implementations.
I don't know much about Scheme (or Lisp at all really; I've only become interested in it during the last month or so) but it seems that the Scheme compilers produce quite sluggish code, at least looking through the grainy, distorted lens that is the Computer Language Benchmarks Game.

From some of the microbenchmarks published by the Gambit guy(s) on their homepage, it looks as if it fares really well against the Chicken and Mzscheme compilers, but I wanted to examine its performance on problems relative to SBCL, which is the Common Lisp compiler I've arbitrarily chosen.
Since it's not featured on the language shootout, I pulled the SBCL code for a few tests there as well as the Gambit code which was already prepared on their site (why hasn't Gambit appeared on the shootout site for the nosy hackers who want to find out if less mainstream (man I hate that word) languages are ridiculously slow? Many of them aren't! And they're more expressive and comfortable than C++/Java, use them!) and ran a few comparisons.
SBCL comes out very much on top; in general the Gambit programs take two or three times as long to do their job (although I haven't looked at memory usage). But as far as Scheme compilers go, Gambit seems to be improving things.

(Really dodgy) timings on my Macbook (input values = whatever's on the shootout site):

Gambit:
pidigits
real 0m3.611s
user 0m3.508s
sys 0m0.066s

binary-trees
real 0m7.519s
user 0m7.309s
sys 0m0.130s

fannkuch
real 0m12.571s
user 0m12.089s
sys 0m0.205s

fasta
real 0m46.334s
user 0m42.284s
sys 0m2.668s

Lisp SBCL:
pidigits
real 0m2.422s
user 0m2.018s
sys 0m0.308s

binary-trees
real 0m1.899s
user 0m1.723s
sys 0m0.127s

fannkuch
real 0m6.221s
user 0m5.982s
sys 0m0.098s

fasta
real 0m20.730s
user 0m15.267s
sys 0m4.375s

(ratio-pidigits: 1.5
ratio-binary-trees: 3.959
ratio-fannkuch: 2.021
ratio-fasta: 2.235)


One thing I've noticed from these coarse measurements is the "sys" component of the execution time is mostly higher for SBCL; maybe this means a lot of allocation and de-allocation of stuff is going on ...Or something.

Extra legs on a dog

This hi-larious entry on the wikiwikiweb offers a series of biological analogies for the origins of some popular/not-so-popular programming languages, on the premise "how other languages became ObjectOriented octopi".

I particularly like:

JavaLanguage
  • ..made by taking an 8-legged dog, housebreaking it, and tearing out its skeleton. Then giving it a machine shop.

    • a boneless 8-legged dog manages to be less useful than either an octopus or a dog - but it comes with a machine shop, so that's good.

That said, this is exactly the kind of thing that I would chuckle ceaselessly at, for which even fellow coders would shun me for being lame and non computer scientists would simply presume me to be insane. Fair play, really.

Tuesday, June 24, 2008

Lisp problem 10: RLE of a list

This one worked out much easier, but again, I went at it non-recursively. Bitching about my struggles with the previous problems made me realise that I had solved most of the previous problems recursively but non-tail-recursively.
I guess there's a danger there of writing really poorly-performing functions on the assumption that the compiler will optimise them. Actually, I don't even know if SBCL does tail-call optimisation... It somehow seems easier to express tail-recursive solutions in Haskell, so far anyway - as a complete newbie to both languages.

;; P10: Run-length encoding of a list.
(defun encode (list)
(let ((chunked (pack list)) result)
(dolist (chunk chunked)
(setf result (append result
(list (list (length chunk) (car chunk))))))
result))


Oh yeah, how do you get around doing that ugly (append result (list (list x y))) stuff? I assume that's super slow, unless Lisp lists have a pointer to the last element for quick access since (you'd think?) appending is common. But then, some of the stuff I read in Practical Common Lisp suggests a push/reverse pattern, which to me seems a bit awkward.

99 lisp problems. Some are hard!

This much publicised set of 99 Lisp problems (taken from an original set of Prolog problems) seems like a nice step for me to take in learning Lisp as a novice to the language.

I figured the first few would be really easy with a smooth learning curve as the questions get more difficult. Certainly, I didn't expect to spend at least two hours screwing up question 9; an ostensibly straightforward problem requiring you to convert a sequence into a list of sublists, grouped by consecutively equal elements (i.e. (1 4 4 8 1 9 1 1 1 2 2 3) -> ((1) (4 4) (8) (1) (9) (1 1 1) (2 2) (3))).

After trying to solve this one a few different ways, I rejected a recursive approach and simply looped through all the elements, doing dirty stuff:
(defun pack (list)
(let (a main-list)
(dolist (x list)
(if (and (not (null a)) (eq x (car (last a))))
(setf a (append a (list x)))
(if (null a)
(setf a (list x))
(progn (setf main-list (append main-list (list a)))
(setf a (list x))))))
(setf main-list (append main-list (list a)))
main-list))


I had a quick peek at the solution provided on the site and didn't bother going through it except to note that it was indeed recursive (and not tail-recursive). Partly because some of the identifier names are in Portuguese and partly because it looks similar to the mess I had when I tried to do it recursively; I'm just not comfortable with how lists work yet - I was getting stuff like ((1 (1 (2 3)))) and ((1 (1 (2))) . 3) and all sorts of other crap.
I'm sure I could have done this quicker (and much longer and uglier) in Java, but not figuring it out properly in Lisp is making me feel a bit stupid. Wah.

Friday, May 23, 2008

From I Ran to Eye Rack

In a thread on codeproject, someone humorously commented that "pedophile" must mean somebody with a foot-fetish: American pronunciation has even changed it from "peed-o-file" to "peddo-file", which is almost equivalent to "encyclo-peddy-ah" (since American spelling similarly converts that 'ae' glyph to an 'e').

Another funny dodgy American pronunciation is the popular "eye-raq / eye-ran" thing. It was entertaining to hear that our ex-Taoiseach Bertie Ahern was introduced as "Bertie Ay-hern" by Nanci Pelosi on a recent trip to the US.
As culturally-prevalent-and-nobody-corrects-themselves pronunciation errors go, it seems as bad as the Japanese 'l'->'r' transliteration (take the eravator to the rower reveru, prease), since at least Japanese have the excuse that the 'l' sound doesn't appear in their language and they thus have trouble pronouncing it.

Tuesday, May 06, 2008

gymnastic bounces, late

Received 3/5/2008 (happy birthday):

Your membership in the mailing list Announce-gymnast has been disabled due to excessive bounces The last bounce received from you was dated 29-Dec-2003. You will not get any more messages from this list until you re-enable your membership. You will receive 2 more reminders like this before your membership in the list is deleted.

Excessive bounces, but the last one my account apparently sent was four and a half years ago? I don't think I even subscribed myself to that list, though I was in the gymnastics club in first year of college. Weird.

Friday, April 11, 2008

Java's arithmetic operators

Here's a weird little one that confused me at first:

 @Test
public void testWeirdArithmeticOperators() {
double a = 1.0 / + + + 2.0;
assertTrue(0.5 == a);
String s = + + + + 0 + "abc" + "def";
assertEquals("0abcdef", s);
}


After playing about for a bit, switching some +es to -es and vice versa, I realised that the effect is that each + or - symbol could be rewritten as follows:
+ -> (+1)*
- -> (-1)*

So that the expression "+ + + 2.0" becomes "(+1) * (+1) * (+1) * 2.0" = 2.0, and the expression "- + + 2.0" becomes "-1 * +1 * +1 * 2.0" = -2.0. Thus, the sign of the result is negative for an odd number of - symbols and positive for an even number.

Seems a little pointless, but I suppose the logical way to see it is:
- + - x = - (+ - x) = - (+ (-x)). It makes sense then that the parser would accept it, but in it that form it looks wrong to (my) human eye.

Also, in writing this post I noticed that JUnit's assertEquals doesn't really like doubles:
assertEquals(0.5, 0) passes, as does assertEquals(Double.valueOf(0.5), Double.valueOf(0.0)) and other combinations. That's why I needed to use assertTrue(...) and == above. Huh.

Tuesday, April 01, 2008

the poor man's breakpoint

Since Eclipse is kind of very ridiculously slow betimes, I tend to avoid making it do too much work, for fear of it getting stuck in a rut and slowing down. Partly because my work machine is pretty much a doorstop, for the purposes of working with Eclipse (512mb RAM just doesn't cut it, if you want a browser and other apps open at the same time - I had to stop using the latest Opera snapshots while coding, as they use too much RAM at the moment, compared to Firefox 3.0b4).

So instead of running tests/etc in the debug perspective and adding breakpoints where I want to stop (if that block of code is executed), I've taken to doing things like this instead:
throw new RuntimeException("FUCK");
This means I get a stack trace showing how we got to the current state, which is often enough to realise what the problem is (at least, it was just now; I saw that a method I didn't expect to be called was being called by a parser for poorly-formed XML messages. Until that point, I didn't realise the messages were now poorly-formed as they weren't before I swapped XMLSerializer for LSSerializer earlier today).
Also, it's easy to search and remove your debugging throws afterward, since they contain the reasonably memorable, short string "FUCK". Of course, you wouldn't want to forget about it and leave one in a currently unused (as far as you know) method, lest an end-user find that their application is throwing an exception because FUCK. :)

Perhaps a more subtle keyword could be selected (how about "Fire in the hole" or "Get to the choppa!")...