I have been trying to write a tool to look at Word documents in a specified path for broken links.
I gave up on having it search a folder, thinking I just need to get it to do a document first. With my limited skills, reading, and trying some suggestions from copilot. (Breaking it down to individual tasks) I put this together:
import docx
import os
import url
doc = docx.Document('C:/Users/demo.docx')
allText = []
def find_hyperlinks(doc):
hyperlinks = []
rels = doc.part.rels
for rel in rels:
if "hyperlink" in rels[rel].target_ref:
hyperlinks.append(rels[rel].target_ref)
return hyperlinks
def find_broken_links_in_docx(doc):
broken_links = []
hyperlinks = find_hyperlinks(doc)
for url in hyperlinks:
try:
response = requests.head(url, allow_redirects=True)
if response.status_code >= 400:
broken_links.append(url)
except requests.RequestException:
broken_links.append(url)
return broken_links
def write_report(report, output_file):
with open(output_file, 'w') as f:
for file_path, links in report.items():
f.write(f"File: {file_path}\n")
for link in links:
f.write(f" Broken link: {link}\n")
f.write("\n")
if __name__ == "__main__":
output_file = "C:/Results/broken_links_report.txt"
report = find_broken_links_in_docx(doc)
write_report(report, output_file)
print(f"Report written to {output_file}")
Here is the error:
File "c:\Users\Scripts\playground\openinganddocx.py", line 41, in <module>
write_report(report, output_file)
File "c:\Users\Scripts\playground\openinganddocx.py", line 31, in write_report
for file_path, links in report.items():
AttributeError: 'list' object has no attribute 'items'
For reference:
Line 31
f.write(f"File: {file_path}\n")Line 41
print(f"Report written to {output_file}")
reportis alistwhich does not support.items(). I'm assuming the error message saysAttributeError: 'list' object has no attribute 'items'{str: list}. Tackle the problem with that in mind. The keys to your dict are filepaths, the values are the urls per file. Don't worry about writing the file just yet, parse the links out and see if you can fill out your data structure correctly. I'll add an answer, but I won't solve the whole problem