Assuming I have the following string with line breaks:

<table>
<tr>
<td valign="top"><a href="ABext.html">House Exterior:</a></td><td>Round</td>
</tr>
<tr>
<td>EF</td><td><a href="AB.html">House AB</a></td></tr>
<tr>
<td valign="top">Settlement Date:</td>
<td valign="top">2/3/2013</td>
</tr>
</table>

I want to extract the Settlement Date into a simple Python dictionary, or at least capture it with a regex match. What is the best way to do this?

NOTE: A sample using some utility is fine, but I am looking for something better than holding this text in a variable and chaining .next.next.next.next.next until I finally reach the settlement date, which is why I posted this question in the first place.

1 Answer

If the data is highly regular, then a regex isn't a bad choice. Here's a straightforward approach:

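import re

# Match the label cell, skip to the end of the next opening tag, then capture the text up to the next '<'.
regex = re.compile(r'>Settlement Date:</td>[^>]*>([^<]*)')
match = regex.search(data)
print(match.group(1))  # 2/3/2013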
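If the markup is less regular, or you want every label/value pair in a dict rather than just one field, a small parser may hold up better than a regex. The following is only a sketch using the standard library's html.parser; the names TableToDict and table_to_dict are made up for illustration, and it assumes each row of interest has exactly two <td> cells, as in the sample from the question.

```python
from html.parser import HTMLParser


class TableToDict(HTMLParser):
    """Hypothetical helper: collect the text of each <td>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # one list of cell texts per <tr>
        self._cell = None  # text fragments of the <td> being read, else None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self._cell = []

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None

    def handle_data(self, data):
        # Text nested inside <a> still arrives here, so links need no special case.
        if self._cell is not None:
            self._cell.append(data)


def table_to_dict(html):
    parser = TableToDict()
    parser.feed(html)
    parser.close()
    # Assumption: only two-cell rows are label/value pairs; drop the trailing colon.
    return {row[0].rstrip(":"): row[1] for row in parser.rows if len(row) == 2}


data = """<table>
<tr>
<td valign="top"><a href="ABext.html">House Exterior:</a></td><td>Round</td>
</tr>
<tr>
<td>EF</td><td><a href="AB.html">House AB</a></td></tr>
<tr>
<td valign="top">Settlement Date:</td>
<td valign="top">2/3/2013</td>
</tr>
</table>"""

fields = table_to_dict(data)
print(fields["Settlement Date"])  # 2/3/2013
```

This avoids walking the tree node by node: the lookup is by label, so it keeps working if rows are reordered or extra whitespace appears between tags.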