3

Lets say we have the following array of strings (this array is a lot bigger):

[
  'http://www.example.com?id=123456',
  'http://www.example.com?id=234567'
]

As you can see, everything up to the first digit is the same in both strings. Is there a way to easily find what both strings have in common and what is different? So that I get a string like 'http://www.example.com?id=' and and array like ['123456', '234567'].

10
  • 2
    What should happen for ['http://www.example.com?id=123456', 'http://www.example.com?id=123457']? Commented Aug 21, 2014 at 12:13
  • Regexp might not be the ideal tool for this. Building a trie is straightforward, not horribly slow, and solves your problem. Commented Aug 21, 2014 at 12:13
  • It should output (as said in the question) 'example.com?id=', since both strings have it in common. Commented Aug 21, 2014 at 12:14
  • @Taemyr Could you give an example of how you would do that? Commented Aug 21, 2014 at 12:14
  • @Severin Never programmed in Ruby. stackoverflow.com/questions/9042426/… discussed it, but the answer invokes commands that I am not familiar with so I can't vouch for it's function. Trie's are a common data structure so it's fairly googleable. - Also note that it's a bit of an overkill for your problem. Commented Aug 21, 2014 at 12:21

2 Answers 2

2

Here's a method to find the longest common prefix in an array.

def _lcp(str1, str2)
  end_index = [str1.length, str2.length].min - 1
  end_index.downto(0) do |i|
    return str1[0..i] if str1[0..i] == str2[0..i]
  end
  ''
end

def lcp(strings)
  strings.inject do |acc, str|
    _lcp(acc, str)
  end
end


lcp [
  'http://www.example.com?id=123456',
  'http://www.example.com?id=234567',
  'http://www.example.com?id=987654'
]
#=> "http://www.example.com?id="

lcp [
  'http://www.example.com?id=123456',
  'http://www.example.com?id=123457'
]
#=> "http://www.example.com?id=12345"
Sign up to request clarification or add additional context in comments.

1 Comment

Edit: changed max to min, because the longest prefix can only be as long as the shorter one of the strings.
0
# This is an approach using higher level ruby std-lib components instead of a regex.
# Why re-invent the wheel?
module UriHelper
    require 'uri'
    require 'cgi'

    # Take an array of urls and extract the id parameter.
    # @param urls {Array} an array of urls to parse
    # @returns {Array}
    def UriHelper.get_id_params( urls )
        urls.map do |u| 
            puts u
            uri = URI(u)
            params = CGI::parse(uri.query)  
            params["id"].first # returned
        end
    end
end

require "test/unit"
# This is unit test proving our helper works as intended
class TestUriHelper < Test::Unit::TestCase
  def test_get_id_params
    urls = [
        'http://www.example.com?id=123456',
        'http://www.example.com?id=234567'
    ]
    assert_equal("123456", UriHelper.get_id_params(urls).first )
    assert_equal("234567", UriHelper.get_id_params(urls).last )
  end
end

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.