Ruby: collect a target key's value into array from nested hash

Question

I have a file like this:

$urls = [
      {name:'Venture Capitals',
       sites: [
           'http://blog.ycombinator.com/posts.atom',
           'http://themacro.com/feed.xml',
           'http://a16z.com/feed/',
           'http://firstround.com/review/feed.xml',
           'http://www.kpcb.com/blog.rss',
           'https://library.gv.com/feed',
           'http://theaccelblog.squarespace.com/blog?format=RSS',
           'https://medium.com/feed/accel-insights',
           'http://500.co/blog/posts/feed/',
           'http://feeds.feedburner.com/upfrontinsights?format=xml',
           'http://versionone.vc/feed/',
           'http://nextviewventures.com/blog/feed/',
       ]},

      {name:'Companies and Groups',
       sites: [
           {name:'Product Companies',
            sites: [
              'https://m.signalvnoise.com/feed',
              'http://feeds.feedburner.com/insideintercom',
              'http://www.kickstarter.com/blog.atom',
              'http://blog.invisionapp.com/feed/',
              'http://feeds.feedburner.com/bufferapp',
              'https://open.buffer.com/feed/',
              'https://blog.asana.com/feed/',
              'http://blog.drift.com/rss.xml',
              'https://www.groovehq.com/blog/feed',]},
           {name:'Consulting Groups, Studios',
            sites: [
              'http://svpg.com/articles/rss',
              'http://www.thoughtworks.com/rss/insights.xml',
              'http://zurb.com/blog/rss',]},
           {name:'Communities',
            sites: [
              'http://alistapart.com/main/feed',
              'https://www.mindtheproduct.com/feed/',]},
       ]},


  ]

I have organized $urls into different groups. Now I want to extract all the URLs (the links inside sites). How should I do this?

The main problem is that there are sites nested within sites, as the file above shows.

My questions are:

Am I using a proper structure to store these links (arrays within arrays)? If not, what would be a good way to store and group them?
How can I extract all the URLs into a flattened array, so I can later iterate through the list?

I can do this fairly manually, as in the code below.

require 'yaml'

# Handles only one level of nested groups.
sites = []
$urls.each do |item|
  item[:sites].each do |sub_item|
    if sub_item.is_a?(Hash)
      sites.concat(sub_item[:sites])  # nested group: add all of its links
    else
      sites << sub_item               # plain URL string
    end
  end
end

File.open('lib/flatten_sites.yaml', 'w') { |fo| fo.puts sites.to_yaml }

But I feel this is bad code.

An alternative in this specific case is to collect every sites attribute, but that also feels very constrained and may not help in other cases.

Lukas Baliak · Accepted Answer · 2016-06-17 13:44:17Z

If you have a Hash, you can use this recursive method:

Input

urls = [
  {
    :name => 'Venture Capitals',
    :sites => [
      'http://blog.ycombinator.com/posts.atom',
      'http://themacro.com/feed.xml',
      'http://a16z.com/feed/',
      'http://firstround.com/review/feed.xml',
      'http://www.kpcb.com/blog.rss',
      'https://library.gv.com/feed',
      'http://theaccelblog.squarespace.com/blog?format=RSS',
      'https://medium.com/feed/accel-insights',
      'http://500.co/blog/posts/feed/',
      'http://feeds.feedburner.com/upfrontinsights?format=xml',
      'http://versionone.vc/feed/',
      'http://nextviewventures.com/blog/feed/',
    ]
  },
  {
    :name => 'Companies and Groups',
    :sites => [
      {
        :name => 'Product Companies',
        :sites => [
          'https://m.signalvnoise.com/feed',
          'http://feeds.feedburner.com/insideintercom',
          'http://www.kickstarter.com/blog.atom',
          'http://blog.invisionapp.com/feed/',
          'http://feeds.feedburner.com/bufferapp',
          'https://open.buffer.com/feed/',
          'https://blog.asana.com/feed/',
          'http://blog.drift.com/rss.xml',
          'https://www.groovehq.com/blog/feed',]
      },
      {
        :name => 'Consulting Groups, Studios',
        :sites => [
          'http://svpg.com/articles/rss',
          'http://www.thoughtworks.com/rss/insights.xml',
          'http://zurb.com/blog/rss',]
      },
      {
        :name => 'Communities',
        :sites => [
          'http://alistapart.com/main/feed',
          'https://www.mindtheproduct.com/feed/',]
      }
    ]
  }
]

Method

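# Recursively collect the entries under :sites; nested groups yield nested arrays.
def get_all_sites(data)
  data[:sites].map { |r| Hash === r ? get_all_sites(r) : r }
end

# flatten removes the array nesting left by the recursion.
urls.map { |r| get_all_sites(r) }.flatten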

Output

[
  "http://blog.ycombinator.com/posts.atom",
  "http://themacro.com/feed.xml",
  "http://a16z.com/feed/", 
  "http://firstround.com/review/feed.xml", 
  "http://www.kpcb.com/blog.rss", 
  "https://library.gv.com/feed", 
  "http://theaccelblog.squarespace.com/blog?format=RSS",
  "https://medium.com/feed/accel-insights",
  "http://500.co/blog/posts/feed/",
  "http://feeds.feedburner.com/upfrontinsights?format=xml",
  "http://versionone.vc/feed/", 
  "http://nextviewventures.com/blog/feed/",
  "https://m.signalvnoise.com/feed", 
  "http://feeds.feedburner.com/insideintercom",
  "http://www.kickstarter.com/blog.atom",
  "http://blog.invisionapp.com/feed/", 
  "http://feeds.feedburner.com/bufferapp", 
  "https://open.buffer.com/feed/", 
  "https://blog.asana.com/feed/", 
  "http://blog.drift.com/rss.xml", 
  "https://www.groovehq.com/blog/feed",
  "http://svpg.com/articles/rss", 
  "http://www.thoughtworks.com/rss/insights.xml", 
  "http://zurb.com/blog/rss", 
  "http://alistapart.com/main/feed", 
  "https://www.mindtheproduct.com/feed/"
]
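
To see why the final flatten is needed, here is a minimal sketch on a reduced input. The sample data and its example URLs are made up for illustration; only get_all_sites comes from the answer above.

```ruby
# Hypothetical, reduced sample with the same shape as `urls`.
sample = [
  { name: 'A', sites: ['http://a.example/feed'] },
  { name: 'B', sites: [
    { name: 'B1', sites: ['http://b1.example/feed', 'http://b1.example/rss'] }
  ] }
]

# Same method as in the answer.
def get_all_sites(data)
  data[:sites].map { |r| Hash === r ? get_all_sites(r) : r }
end

# Each level of nesting in the input leaves one extra array level here.
nested = sample.map { |r| get_all_sites(r) }

# Array#flatten with no argument removes every level of nesting.
flat = nested.flatten
```

Here nested is [["http://a.example/feed"], [["http://b1.example/feed", "http://b1.example/rss"]]], while flat is a single list of the three strings.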

I hope this helps

Aleksei Matiushkin · Accepted Answer · 2016-06-17 13:51:42Z


This solution is similar to what Lukas Baliak proposed, but uses a more suitable Proc instead of a redundant method (it works for any level of nesting):

deep_map = ->(data) do 
  data[:sites].flat_map { |r| r.is_a?(String) ? r : deep_map.(r) }
end
urls.flat_map(&deep_map)
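
If the target key should not be hard-coded, the same idea can be generalized. The following is only a sketch under assumptions: collect_leaves and the sample data are hypothetical names introduced here, not part of either answer, and it assumes every non-Hash, non-Array value under the key is a leaf worth keeping.

```ruby
# Hypothetical helper: gather all leaf values stored under `key`,
# descending through any mix of nested hashes and arrays.
def collect_leaves(node, key)
  case node
  when Hash  then collect_leaves(node[key], key)
  when Array then node.flat_map { |child| collect_leaves(child, key) }
  when nil   then []      # a hash that lacks the target key
  else            [node]  # a leaf value, such as a URL string
  end
end

# Made-up sample with the same shape as `urls`.
sample = [
  { name: 'A', sites: ['http://a.example/feed'] },
  { name: 'B', sites: [
    { name: 'B1', sites: ['http://b1.example/feed', 'http://b1.example/rss'] }
  ] }
]

result = collect_leaves(sample, :sites)
```

Because flat_map is applied at every array level, result is already flat and no trailing flatten is needed.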

edited Jun 17, 2016 at 13:51

answered Jun 17, 2016 at 13:44


3 Comments

Lukas Baliak Over a year ago

Really nice lambda. And if I understand correctly, deep_map.(r) is the same as deep_map.call(r)?

Aleksei Matiushkin Over a year ago

@LukasBaliak exactly.

Aleksei Matiushkin Over a year ago

It is impossible to code Ruby without a solid understanding of Procs.
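
The equivalence discussed in these comments can be checked with a small sketch. The lambdas double and countdown are made-up examples; the second one shows that a lambda can call itself through the local variable it is assigned to, which is what deep_map relies on.

```ruby
double = ->(x) { x * 2 }

a = double.call(21)  # explicit call
b = double.(21)      # shorthand for .call
c = double[21]       # Proc#[] invokes the lambda as well

# A lambda may refer to its own variable, because the body only runs
# after the assignment has completed.
countdown = ->(n) { n.zero? ? [0] : [n] + countdown.(n - 1) }
steps = countdown.(2)
```

All three call forms return 42, and steps is [2, 1, 0].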
