I'm working on a small search engine project. In the following functions I try to build an index and a database from the content of a website. The index is a dictionary with index[word] = [url], so each word maps to a list of URLs where it appears. The db is a dictionary with db[url] = [(title , score)], so each URL found maps to its title and a predefined score.

Now I get AttributeError: 'str' object has no attribute 'append' in the add_to_index function. I have also included get_content, which calls add_to_index. The code is below; any help would be appreciated. Thank you!

def get_content( url , soup , index , db):
 title = soup.title.text.strip()
 head = soup.head.text.strip()
 body = soup.body.text.strip()
 p = soup.find_all('p')
 h1 = soup.find_all('h1')
 h2 = soup.find_all('h2')
 h3 = soup.find_all('h3')

 #Introduce all of what we have got
 add_to_index( index , db, url , title , 1 )
 add_to_index( index , db, url , head , 2 )
 add_to_index( index , db, url , body , 3 )
 add_to_index( index , db, url , p , 4 )
 add_to_index( index , db, url , h1 , 4 )
 add_to_index( index , db, url , h2 , 4)
 add_to_index( index , db, url , h3 , 4)

def add_to_index( index , db , url ,  section , score ):
    for word in section:
        if word in index:
            index[word].append(url)
        else:
            index[word] = ( url )
            if word == title and not url in db:
                db[url] = ( title , score )
  • Well, it's true, str has no append. The question is, what are you surprised about? Did you not think that index[word[0]] was a string? Did you think strings could be mutated in-place with methods like append? Or something different? Or, to put it a different way, what did you want that line index[word[0]].append(url) to do? Commented Jun 9, 2018 at 0:33
  • I would like to find a solution for it, maybe another method, so that I can add more URLs to a word's entry in the index! Commented Jun 9, 2018 at 0:38

1 Answer

This line, which creates the initial value of index[word], sets it to a string, because parentheses around a single value do not make a tuple or a list:

        index[word] = ( url )

You should set it to a list:

        index[word] = [ url ]

Then you can append to it.
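
A minimal sketch of that fix, under some assumptions: the signature is simplified to (index, url, words), the URLs are made up for illustration, and the duplicate check is an optional addition that is not part of the original function.

```python
def add_to_index(index, url, words):
    """Map each word to the list of URLs it appears on (simplified sketch)."""
    for word in words:
        if word in index:
            # Optional extra, not in the original: skip URLs already listed.
            if url not in index[word]:
                index[word].append(url)
        else:
            # A list literal; ( url ) would just be the string itself.
            index[word] = [url]


index = {}
# Hypothetical URLs and word lists, for illustration only.
add_to_index(index, "http://example.com/a", ["search", "engine"])
add_to_index(index, "http://example.com/b", ["search"])

print(index["search"])  # ['http://example.com/a', 'http://example.com/b']
print(index["engine"])  # ['http://example.com/a']
```

The same effect can be had with index.setdefault(word, []).append(url), which creates the empty list on first use.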

3 Comments

Great, thanks! Do you know why each word comes out as a single character instead of a whole word? I thought .strip() would break the text down into words.
strip() just removes any surrounding whitespace, why would it break it into words? You're thinking of .split()
Perfect! Thank you!!!
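
To illustrate the difference discussed in the comments above, here is a small sketch with a made-up sample string. It shows why looping over the result of .strip() yields characters, while looping over the result of .split() yields words.

```python
text = "  A little search engine  "  # sample text, for illustration only

stripped = text.strip()   # removes surrounding whitespace; still one string
words = stripped.split()  # splits on runs of whitespace into a list of words

print(stripped)  # A little search engine
print(words)     # ['A', 'little', 'search', 'engine']

# Iterating over a string yields characters, not words.
print(list(stripped)[:3])  # ['A', ' ', 'l']
```

So in get_content, a value such as title would need .split() before being passed as section if the loop in add_to_index is meant to see whole words.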
