I'm trying to get an implementation of GitHub Flavored Markdown working in Python, with no luck. I don't have much in the way of regex skills.

Here's the Ruby code from GitHub:

# in very clear cases, let newlines become <br /> tags
text.gsub!(/(\A|^$\n)(^\w[^\n]*\n)(^\w[^\n]*$)+/m) do |x|
  x.gsub(/^(.+)$/, "\\1  ")
end

And here's what I've come up with so far in Python 2.5:

def newline_callback(matchobj):
    return re.sub(r'^(.+)$','\1 ',matchobj.group(0))     
text = re.sub(r'(\A|^$\n)(^\w[^\n]*\n)(^\w[^\n]*$)+', newline_callback, text)

There just doesn't seem to be any effect at all :-/

If anyone has a fully working implementation of GitHub Flavored Markdown in Python, other than this one (which doesn't seem to work for newlines), I'd love to hear about it. The newline handling is what I'm most concerned about.

These are the tests for the regex, from GitHub's Ruby code:

>>> gfm_pre_filter('apple\\npear\\norange\\n\\nruby\\npython\\nerlang')
'apple  \\npear  \\norange\\n\\nruby  \\npython  \\nerlang'
>>> gfm_pre_filter('test \\n\\n\\n something')
'test \\n\\n\\n something'
>>> gfm_pre_filter('# foo\\n# bar')
'# foo\\n# bar'
>>> gfm_pre_filter('* foo\\n* bar')
'* foo\\n* bar'
  • Please post an example of something that should work. What's expected and what you're really getting. Commented Jan 28, 2011 at 10:48

2 Answers

1

That Ruby version has the multiline modifier on the regex, and Ruby's ^ and $ always match at line boundaries, so in Python you need re.M to get the same anchoring:

def newline_callback(matchobj):
    return re.sub(re.compile(r'^(.+)$', re.M),r'\1  ',matchobj.group(0))     

text = re.sub(re.compile(r'(\A|^$\n)(^\w[^\n]*\n)(^\w[^\n]*$)+', re.M), newline_callback, text)

So that code will (like the Ruby version) add two spaces before each newline, except where there are two newlines in a row (a paragraph break).
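
Put together as one function, the fix might look roughly like the sketch below. The name gfm_pre_filter is borrowed from the question's tests, the two pattern names are made up for illustration, and only the newline rule is covered, not the rest of GitHub's filter. The patterns are precompiled because re.sub only accepts a flags argument from Python 2.7 onward.

```python
import re

# Hypothetical names; only the regexes themselves come from the code above.
# re.M makes ^ and $ match at every line boundary, as they do in Ruby.
PARAGRAPH_RE = re.compile(r'(\A|^$\n)(^\w[^\n]*\n)(^\w[^\n]*$)+', re.M)
LINE_RE = re.compile(r'^(.+)$', re.M)


def newline_callback(matchobj):
    # Append two spaces to every non-empty line of the matched block.
    return LINE_RE.sub(r'\1  ', matchobj.group(0))


def gfm_pre_filter(text):
    # Only the newline rule is handled here.
    return PARAGRAPH_RE.sub(newline_callback, text)


# Lines starting with a word character are rewritten; the '#' lines are not.
print(repr(gfm_pre_filter('one\ntwo\n\n# heading\n# other')))
```

Lines that start with '#' or '*' are left alone because the pattern requires each line to begin with \w.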

Are the test strings you gave correct? The file you linked has this, and it works with the fixed code:

"apple\npear\norange\n\nruby\npython\nerlang"
->
"apple  \npear  \norange\n\nruby  \npython  \nerlang"

2 Comments

Thanks a ton... Could you point out to me where the multiline modifier is in the Ruby regex? It's not /m, is it?
Yes, it's that /m. /foo/m is short for Regexp.new("foo", Regexp::MULTILINE); see "Regular Expression Options": ruby-doc.org/docs/ruby-doc-bundle/ProgrammingRuby/book/…
0
return re.sub(r'^(.+)$',r'\1 ',matchobj.group(0))
                       ^^^--------------------------- you forgot this. 
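
To see why the r prefix matters, here is a small illustrative check (the variable names are made up). Without the prefix, Python turns '\1' into the single control character with code 1 before the re module ever sees it, so the replacement no longer refers to group 1.

```python
import re

# Without the r prefix, '\1' is an octal escape for the character chr(1).
assert '\1 ' == chr(1) + ' '
# With the r prefix the backslash survives, and re reads \1 as "group 1".
assert r'\1 ' == '\\1 '

plain = re.sub(r'^(.+)$', '\1 ', 'apple')   # replacement is chr(1) + ' '
raw = re.sub(r'^(.+)$', r'\1 ', 'apple')    # replacement is group 1 + ' '

print(repr(plain))
print(repr(raw))
```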

1 Comment

Yes, but that didn't really solve the problem... I've added tests that need to pass when the regex is applied.
