2

I have a long list of arbitrary strings and I'd like to determine if my given string "ABADCAFE" starts with any of the strings in my list. Is there a library class somewhere that can do this for me reasonably efficiently ?

(I suppose it's much like the state machine built by regex, but I don't think composing a regex is the way to go here - my list is too long)

3
  • Not sure if this will actually work, so I won't post it as an answer: Perhaps build a Tree where the nodes are the letters in the string. A parent may only have one child. A parent's child is its next letter, and a leaf has a child of null. Iterate over the Tree to search for possible words. Commented Aug 26, 2010 at 20:05
  • foreach instruction would be too inefficient? Commented Aug 26, 2010 at 20:06
  • I suppose I should've added that the iteration is repeated; i.e. the long list is used multiple times Commented Aug 27, 2010 at 6:45

2 Answers 2

3

What you're looking for is probably a Patricia Tree or Radix Tree: http://en.wikipedia.org/wiki/Radix_tree

Apache Commons Collections and Google Collections Library appear to have the same implementation: http://code.google.com/p/patricia-trie/

Sign up to request clarification or add additional context in comments.

Comments

0

I don't think there is some magic algorithm here that would give you super efficiency; after all, the algorithm has to take a look at each string. Building a finite state machine or a radix tree, or using the foreach -- they're all linear in the number of strings and for this application just quite the same.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.