Sample Text

    outline: 4 0
      corner: 1 347980000 -2540000 0
      corner: 2 347980000 -20320000 0
      corner: 3 482600000 -20320000 0
      corner: 4 482600000 -2540000 0

    outline: 4 1
      corner: 1 0 -2540000 0
      corner: 2 345440000 -2540000 0
      corner: 3 345440000 -20320000 0
      corner: 4 0 -20320000 0

    outline: 8 2
      corner: 1 0 0 0
      corner: 2 0 35560000 0
      corner: 3 53340000 35560000 0
      corner: 4 53340000 76200000 0
      corner: 5 449580000 76200000 0
      corner: 6 449580000 30226000 0
      corner: 7 482600000 30226000 0
      corner: 8 482600000 0 0

    outline: 4 3
      corner: 1 0 38100000 0
      corner: 2 50800000 38100000 0
      corner: 3 50800000 76200000 0
      corner: 4 0 76200000 0

    outline: 4 4
      corner: 1 482600000 76200000 0
      corner: 2 482854000 31750000 0
      corner: 3 450850000 31750000 0
      corner: 4 450850000 76200000 0

/^\s+corner:\s*(\d+)\s+(-?\d+)\s+(-?\d+)\s+(\d+)/m Captures all values for corners.

/^\s*outline:\s*(\d+)\s+(\d+)$.*?\s+corner:\s*(\d+)\s+(-?\d+)\s+(-?\d+)\s+(\d+)/m Captures all outlines, but only the first corner of each outline.

/^\s*outline:\s*(\d+)\s+(\d+)$.*?(^\s+corner:\s*(\d+)\s+(-?\d+)\s+(-?\d+)\s+(\d+)$).*?/m Matches the same text as the second, but because of the extra group the captures for each match look like this:

4
0
corner: 1 347980000 -2540000 0
1
347980000
-2540000
0

I am trying to capture every outline together with all of its corners. The groups are obviously not set up properly. Any suggestions?

Thank you ;-)

2 Answers

Since the number of captures you want varies (probably without limit), you cannot do that in one regex: a pattern has a fixed number of groups, and each group holds only one match. String#scan comes in handy in such a case.

text.scan(/^\s*outline:\s*(\d+)\s+(\d+)\n(.*?)(?:\n\n|\z)/m)
.map{|a, b, corners| [a, b, corners.scan(/^\s+corner:\s*(\d+)\s+(-?\d+)\s+(-?\d+)\s+(\d+)/)]}

will give you:

[["4", "0",
  [["1", "347980000", "-2540000", "0"],
   ["2", "347980000", "-20320000", "0"],
   ["3", "482600000", "-20320000", "0"],
   ["4", "482600000", "-2540000", "0"]]],
 ["4", "1",
  [["1", "0", "-2540000", "0"],
   ["2", "345440000", "-2540000", "0"],
   ["3", "345440000", "-20320000", "0"],
   ["4", "0", "-20320000", "0"]]],
 ["8", "2",
  [["1", "0", "0", "0"],
   ["2", "0", "35560000", "0"],
   ["3", "53340000", "35560000", "0"],
   ["4", "53340000", "76200000", "0"],
   ["5", "449580000", "76200000", "0"],
   ["6", "449580000", "30226000", "0"],
   ["7", "482600000", "30226000", "0"],
   ["8", "482600000", "0", "0"]]],
 ["4", "3",
  [["1", "0", "38100000", "0"],
   ["2", "50800000", "38100000", "0"],
   ["3", "50800000", "76200000", "0"],
   ["4", "0", "76200000", "0"]]],
 ["4", "4",
  [["1", "482600000", "76200000", "0"],
   ["2", "482854000", "31750000", "0"],
   ["3", "450850000", "31750000", "0"],
   ["4", "450850000", "76200000", "0"]]]]

If you want numbers instead of strings,

text.scan(/^\s*outline:\s*(\d+)\s+(\d+)\n(.*?)(?:\n\n|\z)/m)
.map{|a, b, corners| [a.to_i, b.to_i, corners.scan(/^\s+corner:\s*(\d+)\s+(-?\d+)\s+(-?\d+)\s+(\d+)/).map{|c| c.map(&:to_i)}]}

will give you:

[[4, 0,
  [[1, 347980000, -2540000, 0],
   [2, 347980000, -20320000, 0],
   [3, 482600000, -20320000, 0],
   [4, 482600000, -2540000, 0]]],
 [4, 1,
  [[1, 0, -2540000, 0],
   [2, 345440000, -2540000, 0],
   [3, 345440000, -20320000, 0],
   [4, 0, -20320000, 0]]],
 [8, 2,
  [[1, 0, 0, 0],
   [2, 0, 35560000, 0],
   [3, 53340000, 35560000, 0],
   [4, 53340000, 76200000, 0],
   [5, 449580000, 76200000, 0],
   [6, 449580000, 30226000, 0],
   [7, 482600000, 30226000, 0],
   [8, 482600000, 0, 0]]],
 [4, 3,
  [[1, 0, 38100000, 0],
   [2, 50800000, 38100000, 0],
   [3, 50800000, 76200000, 0],
   [4, 0, 76200000, 0]]],
 [4, 4,
  [[1, 482600000, 76200000, 0],
   [2, 482854000, 31750000, 0],
   [3, 450850000, 31750000, 0],
   [4, 450850000, 76200000, 0]]]]
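
Note that map(&:to_i) relies on Symbol#to_proc, which is only built in from Ruby 1.8.7 onward. On an older interpreter, an explicit block does the same job without patching Symbol. The following is a minimal sketch, not a drop-in for your app: the variable names (outline_re, corner_re, outlines) are mine, and the sample text is cut down to two outlines.

```ruby
# Cut-down sample in the same layout as the question.
text = <<EOS
    outline: 4 0
      corner: 1 347980000 -2540000 0
      corner: 2 347980000 -20320000 0
      corner: 3 482600000 -20320000 0
      corner: 4 482600000 -2540000 0

    outline: 4 1
      corner: 1 0 -2540000 0
      corner: 2 345440000 -2540000 0
      corner: 3 345440000 -20320000 0
      corner: 4 0 -20320000 0
EOS

# Same two patterns as above, pulled into variables for readability.
outline_re = /^\s*outline:\s*(\d+)\s+(\d+)\n(.*?)(?:\n\n|\z)/m
corner_re  = /^\s+corner:\s*(\d+)\s+(-?\d+)\s+(-?\d+)\s+(\d+)/

outlines = text.scan(outline_re).map do |count, index, corners|
  # Explicit blocks instead of map(&:to_i), so no Symbol#to_proc is needed.
  rows = corners.scan(corner_re).map { |row| row.map { |s| s.to_i } }
  [count.to_i, index.to_i, rows]
end
```

The result has the same shape as the numeric output above: one [count, index, corners] entry per outline.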

4 Comments

There you go again, with the goods - dude, you rock! I'm using string.scan(/regex/) for everything in this project. I knew I had to break them up somehow, but I just wasn't getting it - strings or numbers will both work, but numbers would probably be better. Your assumption is correct, while most will only have 1 outline, with 4 corners, it still needs to be flexible - just in case someone (like me), decides to panelize the project, with multiple, odd shaped outlines. I'll post results, when I test the API for it. Thanks again ;-)
I had to add class Symbol def to_proc proc { |*args| args[0].send(self, *args[1...args.size]) } end end to get output from the second version - as I'm using Ruby 1.8.6
A map(&:symbol) call results in wrong argument type Symbol (expected Proc) (TypeError)
Update: Using the numbers example, it worked like a charm in my app - no errors. Thanks again ;-)

I'm not sure I even want to know what the purpose of scanning that file with a regex is.

But you know, it would be easy to parse using virtually any technique other than regular expressions.
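
For instance, a plain line-by-line walk would probably do. This is only a sketch under my own assumptions: parse_outlines is a hypothetical helper name, the hash keys are made up for illustration, and I am assuming every corner line belongs to the most recent outline line above it.

```ruby
# Hypothetical helper: walks the text line by line and groups each
# "corner:" line under the most recent "outline:" line.
def parse_outlines(text)
  outlines = []
  text.each_line do |line|
    fields = line.split              # split on whitespace; blank lines give []
    case fields.first
    when 'outline:'
      outlines << { :count => fields[1].to_i,
                    :index => fields[2].to_i,
                    :corners => [] }
    when 'corner:'
      next if outlines.empty?        # skip corners that precede any outline
      outlines.last[:corners] << fields[1..-1].map { |s| s.to_i }
    end
  end
  outlines
end

sample = <<EOS
    outline: 4 3
      corner: 1 0 38100000 0
      corner: 2 50800000 38100000 0
      corner: 3 50800000 76200000 0
      corner: 4 0 76200000 0
EOS

outlines = parse_outlines(sample)
# outlines.first[:corners].first is [1, 0, 38100000, 0]
```

Explicit blocks are used instead of map(&:to_i) so the sketch does not depend on Symbol#to_proc.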

And in fact, with just a slight change in syntax [1], it's a good YAML file:

- outline: 4 0
  - corner: 1 347980000 -2540000 0
  - corner: 2 347980000 -20320000 0
  - corner: 3 482600000 -20320000 0
  - corner: 4 482600000 -2540000 0

- outline: 4 1
. . .
. . .

And there you go, a perfectly organized data structure with one line of Ruby:

 > pp YAML::load_file 'corners.yaml'
[{"outline"=>
   [{"corner"=>"1 347980000 -2540000 0"},
    {"corner"=>"2 347980000 -20320000 0"},
    {"corner"=>"3 482600000 -20320000 0"},
    {"corner"=>"4 482600000 -2540000 0"}]},
 {"outline"=>
   [{"corner"=>"1 0 -2540000 0"},
    {"corner"=>"2 345440000 -2540000 0"},
    {"corner"=>"3 345440000 -20320000 0"},
    {"corner"=>"4 0 -20320000 0"}]},
 {"outline"=>
   [{"corner"=>"1 0 0 0"},
    {"corner"=>"2 0 35560000 0"},
    {"corner"=>"3 53340000 35560000 0"},
    {"corner"=>"4 53340000 76200000 0"},
    {"corner"=>"5 449580000 76200000 0"},
    {"corner"=>"6 449580000 30226000 0"},
    {"corner"=>"7 482600000 30226000 0"},
    {"corner"=>"8 482600000 0 0"}]},
 {"outline"=>
   [{"corner"=>"1 0 38100000 0"},
    {"corner"=>"2 50800000 38100000 0"},
    {"corner"=>"3 50800000 76200000 0"},
    {"corner"=>"4 0 76200000 0"}]},
 {"outline"=>
   [{"corner"=>"1 482600000 76200000 0"},
    {"corner"=>"2 482854000 31750000 0"},
    {"corner"=>"3 450850000 31750000 0"},
    {"corner"=>"4 450850000 76200000 0"}]}]
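
Each corner is still a single string at this point. Assuming the loaded structure has the shape printed above, splitting the strings into integers could look roughly like this. The loaded literal below stands in for the result of the YAML load (one outline only), and the variable names are mine.

```ruby
# Stand-in for the loaded data, copied from one outline in the output above.
loaded = [
  { "outline" => [
      { "corner" => "1 0 38100000 0" },
      { "corner" => "2 50800000 38100000 0" },
      { "corner" => "3 50800000 76200000 0" },
      { "corner" => "4 0 76200000 0" } ] }
]

# One array of integer rows per outline.
corners_by_outline = loaded.map do |entry|
  entry["outline"].map { |c| c["corner"].split.map { |s| s.to_i } }
end
```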

1. Admittedly, I did use a vim(1) regex to convert the file: :%s/^ */&- /

2 Comments

YAML does look good, but the syntax of the file is not under my control. The 2 attributes following outline are <number of corners in board outline> <outline index>. Each outline is a separate entity, and I need to handle them as such in order to incorporate them in another project. The sample text is just part of a larger file. This section happens to have repeated, similar child elements, whereas all the other sections have individual child elements, which made a regex the simplest solution. I'm not familiar with YAML, and I need to keep this in Ruby 1.8.6, as I'm using it with the SketchUp API.
Ran out of room - Thank You ;-)
