I have a very large set of string URL patterns like {http://www.imdb.com, http://www.amazon.com,...} in a list.

I am getting input URLs like this:

http://www.imdb.com/title/tt1409024/

For the purposes of my application, this URL is derived from http://www.imdb.com, so the two should be treated as equal.

To implement this, I can extract the base URL from the input URL:

http://www.imdb.com/title/tt1409024/ => http://www.imdb.com
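
A minimal sketch of this extraction step, assuming the standard `java.net.URI` class is acceptable and that "base URL" means scheme plus host (plus an explicit port, if one is present). The class name `BaseUrls` and the method name `baseUrl` are made up for illustration:

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

final class BaseUrls {
    // Returns "scheme://host" (with ":port" appended only when the URL has an
    // explicit port), or null if the input is not an absolute URL with a host.
    static String baseUrl(String url) {
        try {
            URI uri = new URI(url);
            String scheme = uri.getScheme();
            String host = uri.getHost();
            if (scheme == null || host == null) {
                return null;
            }
            // Scheme and host are case-insensitive, so normalise them
            // before comparing against the master list.
            String base = scheme.toLowerCase(Locale.ROOT) + "://" + host.toLowerCase(Locale.ROOT);
            int port = uri.getPort(); // -1 when no port is given
            return port == -1 ? base : base + ":" + port;
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(baseUrl("http://www.imdb.com/title/tt1409024/"));
    }
}
```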

Now I need to compare this extracted URL with the master list of URLs and, if a match is found, store the base URL in a database. In essence, for each input URL, I look up its extracted base URL in the master list, and if a match is found I store that base URL in the database.

To implement the equality/matching logic, I have two possible solutions. Please weigh in as to which is better:

  1. Put the master list of URLs in an ArrayList, and use the ArrayList's contains method
  2. Put the master list in a database, and use a query to check the input URL against it

Can anyone tell me which one will be better in terms of performance?

2 Answers

4

Neither of your suggestions would be appropriate. For an ArrayList, contains is a linear search: for every URL you want to check, it scans half the list on average when the URL is present, and the whole list when it is not.

For a database (presumably on disk?), you would incur a potentially expensive database lookup for every query.

1000 URL patterns isn't very many. Keep the list in memory and use an appropriate data structure - a HashSet would do a good job.
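
As a rough sketch of that suggestion (the class name `MasterList` and the method name `matches` are made up for illustration, and the lookup assumes the base URL has already been extracted from the input URL):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class MasterList {
    // The master list, held in memory as a hash set.
    private final Set<String> patterns;

    MasterList(List<String> urls) {
        this.patterns = new HashSet<>(urls);
    }

    // Exact-match lookup; expected constant time, however long the list is.
    boolean matches(String baseUrl) {
        return patterns.contains(baseUrl);
    }

    public static void main(String[] args) {
        MasterList master = new MasterList(
                Arrays.asList("http://www.imdb.com", "http://www.amazon.com"));
        System.out.println(master.matches("http://www.imdb.com"));
        System.out.println(master.matches("http://www.example.org"));
    }
}
```

The set is built once up front, so each incoming URL costs one hash lookup rather than a scan of the list or a round trip to the database.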

1

If you put the URLs of the sites into a HashSet, you will get the same behaviour as your ArrayList solution, but each lookup will take constant time (on average) instead of time that grows with the length of your list.

The database solution is probably overkill for your problem, as the overhead of querying it will outweigh any gain in search efficiency.
