I have a script which predicts product names from input files. The code is as follows:
import csv
import re

import spacy

output_dir = "C:\\Users\\Lenovo\\.spyder-py3\\NER_training"
DIR = 'C:\\Users\\Lenovo\\.spyder-py3\\Testing\\'
print("Loading from", output_dir)
nlp2 = spacy.load(output_dir)
with open('eng_productnames.csv', newline='') as myFile:
    reader = csv.reader(myFile)
    for rowz in reader:
        try:
            filenamez = rowz[1]
            file = open(DIR+filenamez, "r", encoding ='utf-8')
            filecontentszz = file.read()
            for s in filecontentszz:
                filecontentszz = re.sub(r'\s+', ' ', filecontentszz)
                #filecontents = filecontents.encode().decode('unicode-escape')
                filecontentszz = ''.join([line.lower() for line in filecontentszz])
                doc2 = nlp2(filecontentszz)
                for ent in doc2.ents:
                    print(filenamez, ent.label_, ent.text)
                break
        except Exception as e:
            pass  # error handling omitted here
which gives me output in the form of strings such as:
07-09-18 N021024s16PASBUNDLEACK - Acknowledgement P.txt PRODUCT ABC1
06-22-18 Letter from Supl.txt PRODUCT ABC2
06-22-18 Letter from Req to Change .txt PRODUCT ABC3
Now I want to export these details to a CSV file with two columns, FILENAME and PRODUCT, with every filename and product name listed under the corresponding column. In each output string, the product name is the text that follows the label PRODUCT. How can I do this?
The output CSV should look like:
Filename,PRODUCT
07-09-18 Acknowledgement P.txt,ABC1
06-22-18 Letter Req to Change.txt,ABC2
The strings come from the `for ent in doc2.ents: print(filenamez, ent.label_, ent.text)` statement, which prints a line like '10-26-18 Letter from Req - Written Resp.txt PRODUCT ABC3'.
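One possible approach, sketched under assumptions rather than tested against the actual model: instead of printing each entity, collect the filename, label and text as a tuple, then write the tuples with `csv.writer`. The helper name `write_product_rows` and the sample rows below are hypothetical and only stand in for the values of `filenamez`, `ent.label_` and `ent.text`.

```python
import csv
import io


def write_product_rows(rows, out_file):
    """Write (filename, label, text) triples to out_file as a two-column CSV.

    Only entities labelled PRODUCT are kept, and the label itself is not
    written. Returns the number of data rows written.
    """
    writer = csv.writer(out_file)
    writer.writerow(["Filename", "PRODUCT"])
    count = 0
    for filename, label, text in rows:
        if label == "PRODUCT":
            writer.writerow([filename, text])
            count += 1
    return count


# Hypothetical sample standing in for (filenamez, ent.label_, ent.text).
sample = [
    ("06-22-18 Letter from Supl.txt", "PRODUCT", "ABC2"),
    ("06-22-18 Letter from Req to Change .txt", "PRODUCT", "ABC3"),
]

# An in-memory buffer keeps the sketch self-contained; a real file works the same way.
buffer = io.StringIO()
written = write_product_rows(sample, buffer)
print(buffer.getvalue())
```

In the script itself, this would presumably mean appending `(filenamez, ent.label_, ent.text)` to a list inside the `for ent in doc2.ents:` loop and, after all files are processed, calling the helper on a file opened with `open(..., "w", newline="", encoding="utf-8")`. Passing the fields separately avoids having to split the printed string on the word PRODUCT afterwards.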