I'm trying to create a 'search engine' (sorta) for a simple library database system.
In brief, I want to create a search program connected to an Access database that searches for a "tag" associated with a book.
The tag, or keyword, is one of the options the search box can use. So instead of searching by name, you can search by a tag attached to the book. A tag can contain anything: the theme of the book, whether it's a regular book or a light novel, and so on.
Example: if I type "Horror" into the search box, all the books with "Horror" in their tags are listed in a listbox.
That works, but as a search engine it's not accurate enough. With my current code, any book that matches at least one tag gets listed, and that's it. If I type "Horror adventure", for example, it still lists every book that only has "Horror" in its tags.
My code splits the search input on spaces and splits each book's tags on commas. In the database, the tags are stored as "Horror, Adventure, Romance", for example. Both arrays are then iterated in two nested For Each loops that compare each search term against each tag. If a tag contains the search term, the book is added to the list. The code for it is:
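    Dim comparenew As String
    Dim splittag As String
    ' search: the search input split on spaces; compare: the book's tags split on commas
    For Each comparenew In search
        For Each splittag In compare
            ' Skip empty and single-character search terms
            If splittag.Contains(comparenew) AndAlso comparenew <> "" AndAlso comparenew.Length <> 1 Then
                ' Add each book only once
                If Not frmList.lstBooks.Items.Contains(dr("BookName").ToString()) Then
                    frmList.lstBooks.Items.Add(dr("BookName").ToString())
                End If
            End If
        Next
    Next
    Next ' closes an outer loop that is not shown here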
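The splitting step itself is not part of the snippet above. A minimal sketch of what it could look like is below; `searchText` and `tagText` are hypothetical names standing in for the search box text and the tag column of one book, and trimming the tags is an assumption made so that the space after each comma does not end up in the tag.

```vbnet
Imports System

Module SplitSketch
    Sub Main()
        ' Hypothetical inputs: searchText stands in for the search box text,
        ' tagText for the tag column value of one book.
        Dim searchText As String = "Horror adventure"
        Dim tagText As String = "Horror, Adventure, Romance"

        ' Split the search input on spaces, dropping empty entries.
        Dim search As String() = searchText.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)

        ' Split the tags on commas, then trim the space that follows each comma.
        Dim compare As String() = tagText.Split(New Char() {","c}, StringSplitOptions.RemoveEmptyEntries)
        For i As Integer = 0 To compare.Length - 1
            compare(i) = compare(i).Trim()
        Next

        Console.WriteLine(String.Join("|", search))   ' prints Horror|adventure
        Console.WriteLine(String.Join("|", compare))  ' prints Horror|Adventure|Romance
    End Sub
End Module
```

Note that `String.Contains` is case-sensitive, so with these inputs "adventure" would not match the tag "Adventure" unless both sides are normalized first.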
Normally this would add the same book to the list several times, but I've already prevented duplicates in the listbox with the "If Not" check.
However, I want to use that duplication as a measure of accuracy: the more times a book is duplicated, the better the search result.
Let's say a user's search produces books that are duplicated at most two (2) times. The books duplicated twice are added to the other listbox first, and the rest of the duplicated books are added after them.
In other words, it's not only the 'most duplicated' books that get added to the listbox; books with a lower duplicate count are added as well.
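To make the "more duplicates means a better result" idea concrete, here is a minimal sketch that counts matches per book instead of adding to the listbox right away, then orders the books by that count. Everything in it is an assumption for illustration: the names `CountMatches` and `RankBooks` are hypothetical, the books come from an in-memory dictionary rather than the Access database, each tag is counted at most once, and the comparison ignores case.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module RankSketch
    ' Counts how many of a book's tags contain at least one search term.
    Function CountMatches(searchTerms As String(), tags As String()) As Integer
        Dim count As Integer = 0
        For Each tag As String In tags
            For Each term As String In searchTerms
                ' Skip empty and single-character terms; compare ignoring case (assumption).
                If term.Length > 1 AndAlso tag.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0 Then
                    count += 1
                    Exit For ' count each tag once, even if several terms hit it
                End If
            Next
        Next
        Return count
    End Function

    ' Returns book names ordered from most to fewest matches, skipping books with none.
    Function RankBooks(searchTerms As String(), books As Dictionary(Of String, String())) As List(Of String)
        Return books.
            Select(Function(b) New With {.Name = b.Key, .Score = CountMatches(searchTerms, b.Value)}).
            Where(Function(x) x.Score > 0).
            OrderByDescending(Function(x) x.Score).
            Select(Function(x) x.Name).
            ToList()
    End Function

    Sub Main()
        ' Hypothetical sample data standing in for rows read from the database.
        Dim books As New Dictionary(Of String, String()) From {
            {"Book A", New String() {"Horror", "Romance"}},
            {"Book B", New String() {"Horror", "Adventure"}},
            {"Book C", New String() {"Comedy"}}
        }
        Dim ranked As List(Of String) = RankBooks(New String() {"horror", "adventure"}, books)
        Console.WriteLine(String.Join(", ", ranked)) ' prints Book B, Book A
    End Sub
End Module
```

The ranked names could then be added to the listbox in order, which puts the books with the most matching tags at the top.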
Here's another example. Let's say the user searches for: "Love comedy adventure story, with time travel".
5 books match "Love", "Comedy", "Adventure", and "Time Travel". They're automatically added to the list. (These 5 books were duplicated 4 times each.)
2 books match "Comedy", "Adventure", and "Time Travel". They're added to the list. (These 2 books were duplicated 3 times each.)
10 books match "Love", "Comedy", and "Adventure". They're also added to the list. (10 books with 3 duplicates each.)
25 books match "Love" and "Comedy", but they're not added to the list. (They were only duplicated 2 times.)
Hopefully the example is clear. As you can see, there's a 'level' at which results are added even though they're not the 'most duplicated' items: books that are just one count below the maximum are still included. That's the part of the search engine I don't know how to code. I was thinking of counting the duplicates first and then putting the counts in an array for comparison, but I'm not sure how the code for that would work.
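A minimal sketch of that "count first, then compare" idea is below. It assumes the duplicate counts have already been collected into a dictionary keyed by book name (a dictionary is used here in place of an array); the function name `FilterByLevel` and the `slack` parameter (how far below the best count a book may fall and still be listed) are hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module ThresholdSketch
    ' Keeps books whose match count is within `slack` of the best count,
    ' ordered from highest to lowest count.
    Function FilterByLevel(scores As Dictionary(Of String, Integer), slack As Integer) As List(Of String)
        If scores.Count = 0 Then Return New List(Of String)()
        Dim best As Integer = scores.Values.Max()
        Return scores.
            Where(Function(kv) kv.Value > 0 AndAlso kv.Value >= best - slack).
            OrderByDescending(Function(kv) kv.Value).
            Select(Function(kv) kv.Key).
            ToList()
    End Function

    Sub Main()
        ' Hypothetical counts mirroring the example: 4, 3 and 2 matching tags.
        Dim scores As New Dictionary(Of String, Integer) From {
            {"Four tags", 4},
            {"Three tags", 3},
            {"Two tags", 2}
        }
        ' With slack = 1, counts of 4 and 3 are kept and a count of 2 is dropped.
        Console.WriteLine(String.Join(", ", FilterByLevel(scores, 1))) ' prints Four tags, Three tags
    End Sub
End Module
```

With `slack` set to 1, this reproduces the 'level' from the example: books with the maximum count and books one count below it are listed, and everything lower is left out.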
Can anyone help me with this? Let me know if you need more of the code or if I should make anything clearer. I tried looking for something similar, but only found results about counting duplicates or simply removing them.