Juha Heinanen wrote:
Alex Balashov writes:
What about some sort of hashing algorithm that
operates on regular
expression sums or is otherwise tuned by regex elements?
the problem is: given a string, find from set of prefixes the longest
one that matches the string. if prefixes are known strings, they can be
organized in a data structure (tree, hash table, etc), where matching is
fast to execute.
if prefixes are regular expressions, i don't know any other solution
than linearly match them one by one to string.
The solution, I imagine, would be to devise a tree or hash structure
wherein certain generalisable and atomic elements of regular expression
parsing are used to place the keys.
That said, I do not know what such an algorithm would be, cannot point
to any concrete implementations as examples, etc. and probably ought to
shut up. :)
Besides, I frequently make the erroneous assumption that such data
structures and optimisations exist where they don't, or at least are not
implemented. For instance, I was recently party to a thread on a list
for network service providers using Cisco gear in which it was revealed
to me, contrary to what I have characteristically and habitually
assumed, that ordinary access lists are actually evaluated in a linear
way; thus, it makes sense to place the most frequently-matched rules
near the top of the list if it is long.
I had no idea. I just took for granted that of course the matching is
done using some sort of compiled tree or hash structure that gives
matching performance somewhere between O(1) and O(N).
So, don't listen to me. :)
Cheers,
--
Alex Balashov
Evariste Systems
Web :
http://www.evaristesys.com/
Tel : (+1) (678) 954-0670
Direct : (+1) (678) 954-0671
Mobile : (+1) (706) 338-8599