I copied a smaller portion of a large string and tried to match it against the large string. However, match does not return the matched value, and the negated call returns true. Is there something about the match function that I am missing, or could there be hidden characters?

times = File.readlines('timesplit')
stringcomp = "created_at : Tue Jul 02 03:30:50 +0000 2013  id : 351905778745094144  id_str : 351905778745094144"
times.each do |t|
  r = t.split('|')
  timestamp = r[1]
  puts !stringcomp.match(timestamp)
  puts stringcomp.match(timestamp)
end

Below are the contents for timesplit.

Jul_01|created_at : Tue Jul 02 03:30:50 +0000 2013  id :
Jul_02|created_at : Tue Sep 03 05:08:44 +0000 2013  id :

2 Answers

The problem is subtle. String#match expects a regular expression as its parameter and, if it doesn't get one, it tries to turn the parameter into one. From the documentation:

Converts pattern to a Regexp (if it isn’t already one), then invokes its match method on str.

So:

created_at : Tue Jul 02 03:30:50 +0000 2013  id :

goes in as a plain string, not a pattern, and gets converted to one.

The problem is the +. In regular expressions, + means one or more of the preceding character, group, or character set. Here it follows a space, so the pattern asks for one or more spaces followed by 0000, and the literal + in stringcomp never matches.

To match stringcomp literally, the pattern would need to be:

created_at : Tue Jul 02 03:30:50 \+0000 2013  id :

Notice the \+. That means the + is now a literal character, not a quantifier.
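
The difference can be reproduced in plain Ruby. This is a minimal sketch using the strings from the question; the variable names unescaped and escaped are only illustrative.

```ruby
stringcomp = "created_at : Tue Jul 02 03:30:50 +0000 2013  id : 351905778745094144  id_str : 351905778745094144"
timestamp  = "created_at : Tue Jul 02 03:30:50 +0000 2013  id :"

# String#match converts the string to a Regexp, so " +0000" means
# "one or more spaces, then 0000"; the literal "+" is never matched.
unescaped = stringcomp.match(timestamp)

# Single quotes keep the backslash, so "\+" reaches the Regexp
# and matches a literal plus sign.
escaped = stringcomp.match('created_at : Tue Jul 02 03:30:50 \+0000 2013  id :')

puts unescaped.inspect  # nil, because the pattern did not match
puts escaped[0]         # the matched substring
```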

For visual proof, try the escaped and unescaped patterns against stringcomp in Rubular.

All that said, the simple fix is to skip match and use a substring search instead:

times = [
  'Jul_01|created_at : Tue Jul 02 03:30:50 +0000 2013  id :',
  'Jul_02|created_at : Tue Sep 03 05:08:44 +0000 2013  id :'
]

stringcomp = "created_at : Tue Jul 02 03:30:50 +0000 2013  id : 351905778745094144  id_str : 351905778745094144"
times.each do |t|
  timestamp = t.split('|').last
  puts stringcomp[timestamp] || 'sub-string not found'
end

Which outputs:

created_at : Tue Jul 02 03:30:50 +0000 2013  id :
sub-string not found

If you want a boolean result instead of the matching substring, you can use:

!!stringcomp[timestamp]

For example:

!!stringcomp['created_at : Tue Jul 02 03:30:50 +0000 2013  id :'] # => true

Alternatively, you could call Regexp.escape on your string before passing it to match, but I think that's overkill when a substring search will accomplish what you want.
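
For completeness, here is a rough sketch of the Regexp.escape route, again with the strings from the question. The names pattern and result are placeholders.

```ruby
stringcomp = "created_at : Tue Jul 02 03:30:50 +0000 2013  id : 351905778745094144  id_str : 351905778745094144"
timestamp  = "created_at : Tue Jul 02 03:30:50 +0000 2013  id :"

# Regexp.escape backslash-escapes every character that is special in a
# regular expression (including "+"), so the text is matched literally.
pattern = Regexp.escape(timestamp)
result  = stringcomp.match(pattern)

puts result ? result[0] : 'no match'
```

The escaped string is only useful as a pattern; keep the original timestamp around if you also need to print or compare it.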

2 Comments

I tried your method and changed the code to puts stringcomp[timestamp] || 'sub-string not found'. However, it prints sub-string not found each time.
Then there is a difference between the substring and the text you're trying to find.
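
One likely difference, assuming the lines still come from File.readlines as in the question: readlines keeps the trailing newline on each line, so the timestamp ends in "\n", which is a hidden character that never appears at that position in stringcomp. A small sketch, where line stands in for one entry returned by readlines:

```ruby
stringcomp = "created_at : Tue Jul 02 03:30:50 +0000 2013  id : 351905778745094144  id_str : 351905778745094144"

# Simulates one line as File.readlines returns it: the "\n" is kept.
line = "Jul_01|created_at : Tue Jul 02 03:30:50 +0000 2013  id :\n"

raw   = line.split('|').last        # still ends with "\n"
clean = line.chomp.split('|').last  # newline removed before splitting

puts stringcomp.include?(raw)    # false
puts stringcomp.include?(clean)  # true
```

On Ruby 2.4 or later, File.readlines('timesplit', chomp: true) strips the line endings while reading. If the file might also contain trailing spaces, strip is a broader alternative to chomp.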
1

You could also use String#include?, which returns true or false:

stringcomp.include? timestamp
