I am trying to write a regular expression that inserts a space at a specific location. I am reading an HTML file and want to insert a space in #WORD<tag so that it becomes #WORD <
where WORD is a variable: it could be anything, as long as it is a real word (a string).
<p style="text-align: left;" data-redator="true"> #deeds</p><p style="text-align: left;" data-redator="true"></p><p style="text-align: left;" data-redator="true">this is it #$%$%$ dkfj dlkjf dklfj </p>
In the example above, I want to insert a space in #deeds</p> so that it becomes #deeds </p>
I tried string replace and re.sub, but I don't know how to do the replacement while keeping the variable part in between.
Any advice?
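For reference, one common way to keep the variable part during a replacement is to capture it in a group and refer back to it in the replacement string. The snippet below is only a minimal sketch of that idea (written for Python 3); the variable names html and spaced are made up for illustration, and the input is a shortened version of the example above.

```python
import re

# Shortened sample input, taken from the example in the question.
html = '<p style="text-align: left;" data-redator="true"> #deeds</p>'

# Group 1 captures "#" plus the word characters after it;
# group 2 captures the "<" that immediately follows.
# The replacement puts both groups back with a space between them.
spaced = re.sub(r'(#\w+)(<)', r'\1 \2', html)

print(spaced)
```

Since \w only matches word characters, something like #$%$%$ is left untouched by this pattern.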
Update
I tried the solution provided in one of the answers and it works well, but the problem is that it does not work on Unicode characters. I tried the following adjustment; it picks up English words but not Unicode words such as Arabic ones:
re.sub(ur'(#\w+)(<)', ur'\1 \2', c, flags=re.UNICODE)
Below is an example of the HTML:
<p style="text-align: left;" data-redator="true"> #$^$%^</p><p style="text-align: left;" data-redator="true"></p><p style="text-align: left;" data-redator="true"> #sdkjf #الكويت</p><p style="text-align: left;" data-redator="true"></p><p style="text-align: left;" data-redator="true"></p>
Any suggestions? I used the re.UNICODE flag and put the ur prefix on the regular expression strings so they would be treated as Unicode, but no luck.
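One possible cause, and this is only an assumption because the code that reads the file is not shown, is that the content is still a byte string when it reaches re.sub. In that case \w cannot recognise Arabic letters, since it only sees individual bytes. The sketch below (Python 3) decodes the bytes first and then applies the same pattern; the function name add_space_before_tag and the UTF-8 encoding are hypothetical choices for illustration.

```python
import re

def add_space_before_tag(html_bytes, encoding="utf-8"):
    """Hypothetical helper: decode raw bytes, then insert the space.

    The encoding is assumed to be UTF-8; adjust it to match the file.
    """
    text = html_bytes.decode(encoding)
    # On decoded text, \w matches Unicode letters, including Arabic ones.
    # re.UNICODE is already the default for str patterns in Python 3;
    # it is passed here only to mirror the attempt in the question.
    return re.sub(r'(#\w+)(<)', r'\1 \2', text, flags=re.UNICODE)

word = 'الكويت'
# Simulate content that was read from the file as raw bytes.
raw = (' #sdkjf #' + word + '</p>').encode('utf-8')

result = add_space_before_tag(raw)
print(result)
```

In Python 2 the equivalent step would be to call .decode('utf-8') on the string read from the file (or to open the file with codecs.open and an explicit encoding) before passing it to re.sub.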
#deeds from the one inside your example without having the context and structure of the page?