[cups.development] why not hash?

Hidetomo Katsura Hidetomo.Katsura at efi.com
Tue Apr 24 11:33:09 PDT 2007


thanks for the actual numbers. i just implemented my own hash table lookup
code (instead of using CFDictionary/CFString) and got the actual numbers of
1000 ppdConflicts() calls + 1 ppdOpenFile() call.

k: keywords, c: constraints, bs: binary search (cups 1.3r6458), ht: hash
table

k: 67, c: 1076
  bs: 1.337919, ht: 1.021581 (1.31x)

k: 19, c: 0
  bs: 0.002012, ht: 0.002060 (0.94x)

k: 30, c: 26
  bs: 0.030796, ht: 0.024412 (1.26x)

the improvement is intended for the keyword (option->keyword) lookup, so we
don't get much benefit when the number of keywords is small.

what do you think?

katsura

-----Original Message-----
From: cups-dev-bounces at easysw.com on behalf of Michael Sweet
Sent: Mon 4/23/2007 9:00 PM
To: cups-dev at easysw.com
Subject: Re: [cups.development] why not hash?
 
Hidetomo Katsura wrote:
> thanks for the quick response, Michael.
> 
> i guess i wasn't clear what i meant by "hash". i meant to use the hash
table
> lookup which is O(1) assuming the collision is minimum, not the binary
search
> or linear search of hash values.
> 
> http://en.wikipedia.org/wiki/Hash_table
> 
> anyways, i'll post the changes when i have something.

Thanks.  FWIW, I coded up something simple for the array API that
you can play with, along with a benchmark program to test the
performance of ppdConflicts().  The code is available at:

     http://svn.easysw.com/public/cups/branches/hasharray

When I run the benchmark (which calls ppdConflicts 1000 times) against
a rather large Canon PPD file that has an enormous number of
UIConstraint lines (2056), I get the following results:

     CUPS 1.1.23: 2.346 seconds
     CUPS 1.2.10: 2.659 seconds
     CUPS hasharray: 2.168 seconds

So, at least for this particular PPD the old O(n) lookup code was
slightly faster than the current code, with the new hashed code
being just slightly faster than the 1.1.x code.

I'm guessing that another optimization such as a caching the lookups
in a shadow constraints array may yield the best results - that way
we only have the overhead of one set of lookups (1/1000th of the times
above) followed by quick loops through the cached constraints.

-- 
______________________________________________________________________
Michael Sweet, Easy Software Products           mike at easysw dot com
Internet Printing and Publishing Software        http://www.easysw.com
_______________________________________________
cups-dev mailing list
cups-dev at easysw.com
http://lists.easysw.com/mailman/listinfo/cups-dev





More information about the cups mailing list