I have a file like this:
$urls = [
  {name:'Venture Capitals',
   sites: [
     'http://blog.ycombinator.com/posts.atom',
     'http://themacro.com/feed.xml',
     'http://a16z.com/feed/',
     'http://firstround.com/review/feed.xml',
     'http://www.kpcb.com/blog.rss',
     'https://library.gv.com/feed',
     'http://theaccelblog.squarespace.com/blog?format=RSS',
     'https://medium.com/feed/accel-insights',
     'http://500.co/blog/posts/feed/',
     'http://feeds.feedburner.com/upfrontinsights?format=xml',
     'http://versionone.vc/feed/',
     'http://nextviewventures.com/blog/feed/',
   ]},
  {name:'Companies and Groups',
   sites: [
     {name:'Product Companies',
      sites: [
        'https://m.signalvnoise.com/feed',
        'http://feeds.feedburner.com/insideintercom',
        'http://www.kickstarter.com/blog.atom',
        'http://blog.invisionapp.com/feed/',
        'http://feeds.feedburner.com/bufferapp',
        'https://open.buffer.com/feed/',
        'https://blog.asana.com/feed/',
        'http://blog.drift.com/rss.xml',
        'https://www.groovehq.com/blog/feed',]},
     {name:'Consulting Groups, Studios',
      sites: [
        'http://svpg.com/articles/rss',
        'http://www.thoughtworks.com/rss/insights.xml',
        'http://zurb.com/blog/rss',]},
     {name:'Communities',
      sites: [
        'http://alistapart.com/main/feed',
        'https://www.mindtheproduct.com/feed/',]},
   ]},
]
I have organized $urls into different groups. Now I want to extract all the URLs (the links inside each sites array). How should I do that?
The main problem is that there are sites nested within sites, as the file above shows.
My questions are:
Am I using a proper structure to store these links (arrays within arrays)? If not, what would be a good way to store and group them?
How can I extract all the URLs into a flattened array, so that I can later iterate through the list?
I can do this fairly manually, as in the code below.
require 'yaml'

sites = []
$urls.each do |item|
  item[:sites].each do |sub_item|
    if sub_item.is_a?(Hash)
      # A nested group: take its URLs (this handles only one level of nesting)
      sites.concat sub_item[:sites]
    else
      # A plain URL string
      sites << sub_item
    end
  end
end
File.open('lib/flatten_sites.yaml', 'w') { |fo| fo.puts sites.to_yaml }
But I feel this is bad code: it only handles one level of nesting.
An alternative in this specific case is to collect every sites attribute, but that also feels very constrained and may not help in other cases.
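For illustration, a recursive version that does not depend on how deeply the groups are nested might look roughly like the sketch below. The helper name collect_sites and the sample data are hypothetical, and it assumes every group is a hash with a :sites key and every leaf is a URL string.

```ruby
# Hypothetical helper: recursively collect every URL string from a nested
# structure of { name: ..., sites: [...] } hashes.
def collect_sites(node)
  case node
  when String then [node]                        # a leaf URL
  when Hash   then collect_sites(node[:sites])   # a group: descend into its sites
  when Array  then node.flat_map { |child| collect_sites(child) }
  else []                                        # ignore anything unexpected (e.g. nil)
  end
end

# Small made-up sample with the same shape as $urls
sample = [
  { name: 'A', sites: ['http://a.example/feed'] },
  { name: 'B', sites: [
    { name: 'B1', sites: ['http://b1.example/feed'] },
    'http://b.example/feed'
  ] }
]

flat = collect_sites(sample)
```

With the real data this would be called as collect_sites($urls), and the result could be written out with to_yaml as before.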