
I have a file like this:

$urls = [
      {name:'Venture Capitals',
       sites: [
           'http://blog.ycombinator.com/posts.atom',
           'http://themacro.com/feed.xml',
           'http://a16z.com/feed/',
           'http://firstround.com/review/feed.xml',
           'http://www.kpcb.com/blog.rss',
           'https://library.gv.com/feed',
           'http://theaccelblog.squarespace.com/blog?format=RSS',
           'https://medium.com/feed/accel-insights',
           'http://500.co/blog/posts/feed/',
           'http://feeds.feedburner.com/upfrontinsights?format=xml',
           'http://versionone.vc/feed/',
           'http://nextviewventures.com/blog/feed/',
       ]},

      {name:'Companies and Groups',
       sites: [
           {name:'Product Companies',
            sites: [
              'https://m.signalvnoise.com/feed',
              'http://feeds.feedburner.com/insideintercom',
              'http://www.kickstarter.com/blog.atom',
              'http://blog.invisionapp.com/feed/',
              'http://feeds.feedburner.com/bufferapp',
              'https://open.buffer.com/feed/',
              'https://blog.asana.com/feed/',
              'http://blog.drift.com/rss.xml',
              'https://www.groovehq.com/blog/feed',]},
           {name:'Consulting Groups, Studios',
            sites: [
              'http://svpg.com/articles/rss',
              'http://www.thoughtworks.com/rss/insights.xml',
              'http://zurb.com/blog/rss',]},
           {name:'Communities',
            sites: [
              'http://alistapart.com/main/feed',
              'https://www.mindtheproduct.com/feed/',]},
       ]},


  ]

I have organized $urls into different groups. Now I want to extract all the URLs (the links inside sites). How should I do this?

The main problem is that there are sites within sites, as the file above shows.

My problems are:

  1. Am I using a proper data structure to save these links (arrays within arrays)? If not, what would be a good way to save and group them?

  2. How can I extract all the URLs into a flattened array, so I can later iterate through the list?

I can do this pretty manually, like the code shown below.

require 'yaml'  # needed for Array#to_yaml

sites = []
$urls.each do |item|
  item[:sites].each do |sub_item|
    if sub_item.is_a?(Hash)
      sites.concat sub_item[:sites]
    else
      sites.append sub_item
    end
  end
end

File.open('lib/flatten_sites.yaml', 'w') { |fo| fo.puts sites.to_yaml }

But I just feel this is bad code.

An alternative in this specific case is to collect all the sites attributes, but I feel this is also very constrained and may not help in other cases.

2 Answers


If you have a Hash, you can use this recursive method.

Input

urls = [
  {
    :name => 'Venture Capitals',
    :sites => [
      'http://blog.ycombinator.com/posts.atom',
      'http://themacro.com/feed.xml',
      'http://a16z.com/feed/',
      'http://firstround.com/review/feed.xml',
      'http://www.kpcb.com/blog.rss',
      'https://library.gv.com/feed',
      'http://theaccelblog.squarespace.com/blog?format=RSS',
      'https://medium.com/feed/accel-insights',
      'http://500.co/blog/posts/feed/',
      'http://feeds.feedburner.com/upfrontinsights?format=xml',
      'http://versionone.vc/feed/',
      'http://nextviewventures.com/blog/feed/',
    ]
  },
  {
    :name => 'Companies and Groups',
    :sites => [
      {
        :name => 'Product Companies',
        :sites => [
          'https://m.signalvnoise.com/feed',
          'http://feeds.feedburner.com/insideintercom',
          'http://www.kickstarter.com/blog.atom',
          'http://blog.invisionapp.com/feed/',
          'http://feeds.feedburner.com/bufferapp',
          'https://open.buffer.com/feed/',
          'https://blog.asana.com/feed/',
          'http://blog.drift.com/rss.xml',
          'https://www.groovehq.com/blog/feed',]
      },
      {
        :name => 'Consulting Groups, Studios',
        :sites => [
          'http://svpg.com/articles/rss',
          'http://www.thoughtworks.com/rss/insights.xml',
          'http://zurb.com/blog/rss',]
      },
      {
        :name => 'Communities',
        :sites => [
          'http://alistapart.com/main/feed',
          'https://www.mindtheproduct.com/feed/',]
      }
    ]
  }
]

Method

def get_all_sites(data)
  data[:sites].map { |r| Hash === r ? get_all_sites(r) : r }
end

urls.map { |r| get_all_sites(r) }.flatten

Output

[
  "http://blog.ycombinator.com/posts.atom",
  "http://themacro.com/feed.xml",
  "http://a16z.com/feed/", 
  "http://firstround.com/review/feed.xml", 
  "http://www.kpcb.com/blog.rss", 
  "https://library.gv.com/feed", 
  "http://theaccelblog.squarespace.com/blog?format=RSS",
  "https://medium.com/feed/accel-insights",
  "http://500.co/blog/posts/feed/",
  "http://feeds.feedburner.com/upfrontinsights?format=xml",
  "http://versionone.vc/feed/", 
  "http://nextviewventures.com/blog/feed/",
  "https://m.signalvnoise.com/feed", 
  "http://feeds.feedburner.com/insideintercom",
  "http://www.kickstarter.com/blog.atom",
  "http://blog.invisionapp.com/feed/", 
  "http://feeds.feedburner.com/bufferapp", 
  "https://open.buffer.com/feed/", 
  "https://blog.asana.com/feed/", 
  "http://blog.drift.com/rss.xml", 
  "https://www.groovehq.com/blog/feed",
  "http://svpg.com/articles/rss", 
  "http://www.thoughtworks.com/rss/insights.xml", 
  "http://zurb.com/blog/rss", 
  "http://alistapart.com/main/feed", 
  "https://www.mindtheproduct.com/feed/"
]

I hope this helps.



A solution similar to what Lukas Baliak proposed, but using a Proc instead of a separate method (it works for any depth of nesting):

deep_map = ->(data) do 
  data[:sites].flat_map { |r| r.is_a?(String) ? r : deep_map.(r) }
end
urls.flat_map(&deep_map)
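For instance, on a small input with made-up feed URLs, the lambda flattens arbitrary depth in one pass:

```ruby
# Recursive lambda: strings are kept, nested hashes are descended into
deep_map = ->(data) do
  data[:sites].flat_map { |r| r.is_a?(String) ? r : deep_map.(r) }
end

urls = [
  { name: 'A', sites: ['http://a.example/feed'] },
  { name: 'B',
    sites: [
      { name: 'B1', sites: ['http://b1.example/rss'] }
    ] }
]

p urls.flat_map(&deep_map)
# => ["http://a.example/feed", "http://b1.example/rss"]
```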

3 Comments

Really nice lambda. And if I understand correctly, deep_map.(r) is the same as deep_map.call(r)?
@LukasBaliak exactly.
It is impossible to code Ruby without a solid understanding of Procs.
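To confirm the point from the comments: Ruby offers several equivalent ways to invoke a Proc, shown here with a trivial lambda.

```ruby
square = ->(x) { x * x }

p square.call(3)  # => 9, the explicit form
p square.(3)      # => 9, syntactic sugar for .call
p square[3]       # => 9, index-style invocation
```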
