The code uses OCR to read text from the image URLs in the list 'url_list'. I am trying to append the output, a string 'txt', to an empty pandas column 'url_text'. However, the code does not append anything to the column 'url_text'.
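import cv2
import pandas as pd
import pytesseract
from skimage import io # assumption: io.imread below is skimage.io.imread, which accepts URLs

df = pd.read_csv(r'path') # main dataframe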
df['url_text'] = "" # create empty column that will later contain the text of the url_image
url_list = (df.iloc[:, 5]).tolist() # convert column with urls to a list
print(url_list)
['https://pbs.twimg.com/media/ExwMPFDUYAEHKn0.jpg',
'https://pbs.twimg.com/media/ExuBd4-WQAMgTTR.jpg',
'https://pbs.twimg.com/media/ExuBd5BXMAU2-p_.jpg',
' ',
'https://pbs.twimg.com/media/Ext0Np0WYAEUBXy.jpg',
'https://pbs.twimg.com/media/ExsJrOtWUAMgVxk.jpg',
'https://pbs.twimg.com/media/ExrGetoWUAEhOt0.jpg',
' ',
' ']
for img_url in url_list: # loop over all urls in list url_list
    try:
        img = io.imread(img_url) # convert image/url to cv2/numpy.ndarray format
        # Preprocessing of image
        gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        (h, w) = gry.shape[:2]
        gry = cv2.resize(gry, (w*3, h*3))
        thr = cv2.threshold(gry, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
        txt = pytesseract.image_to_string(thr) # read tweet image text
        df['url_text'].append(txt)
        print(txt)
    except: # ignore any errors. Some of the rows do not contain a URL, causing the loop to fail
        pass
print(df)
To check whether txt contains anything, I commented out the append line (#df['url_text'].append(txt)): txt is then printed in the console one by one. However, when I add df['url_text'].append(txt) back, I cannot see txt in the console. The txt object is a string.
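
A minimal sketch of what is probably happening, under stated assumptions. Series.append does not accept a plain string: older pandas versions raise a TypeError and pandas 2.0 and later no longer have the method at all, so the call raises either way. The bare except then swallows the error, which would explain why print(txt) on the following line never runs. Even with a valid argument, Series.append returned a new Series instead of modifying the column in place. The sketch below writes each result to its row by index instead. The helper fake_ocr and the example.com URLs are hypothetical stand-ins for the imread and pytesseract steps, not part of the original code.

```python
import pandas as pd


def fake_ocr(url):
    """Hypothetical stand-in for the io.imread + pytesseract steps."""
    if not url.strip():
        raise ValueError("no URL in this row")
    return "text from " + url


# Small illustrative frame; the URLs are placeholders, not real images.
df = pd.DataFrame({"url": ["https://example.com/a.jpg", " ", "https://example.com/b.jpg"]})
df["url_text"] = ""

# 1) The original pattern: appending a str to a Series raises an exception
#    (TypeError on older pandas, AttributeError on pandas >= 2.0).
errors = []
try:
    df["url_text"].append("some text")
except Exception as exc:
    errors.append(type(exc).__name__)

# 2) One possible fix: assign the text to the matching row by index label.
for idx, url in df["url"].items():
    try:
        df.at[idx, "url_text"] = fake_ocr(url)
    except ValueError:
        pass  # rows without a URL keep the empty string

print(errors)
print(df)
```

Catching a specific exception type and printing it, rather than using a bare except, makes this kind of silent failure visible while still skipping the rows that have no URL.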