Ruby string concatenation with integers (codepoints?)

Question

So, I'm bored, and I've found what appears to be a strange inconsistency I was hoping to find more information on. This deals with string concatenation in Ruby, particularly appending what the string documentation refers to as a "codepoint".

Here're some examples:

'' << 233 #=> "é"
'' << 256 #=> "Ā"

Now, the curious thing is that in IRB both of those examples work. However, if you create a ruby class in a file, load the file, and execute the code, it blows up. See following example:

class MyConcatenationTest
  def self.test
    '' << 233
    '' << 256
  end
end

And then in IRB:

load 'my_concatenation_test.rb'  #=> true
MyConcatenationTest.test         #=> RangeError: 256 out of char range

So, my question is this: Why does this work in IRB, but not when I load a script that runs the same line of code?

Some other things to notice, if you alter the class:

class MyConcatenationTest
  def self.test
    '' << 233
    #'' << 256
  end
end

... and then reload/run the method, it returns the \x escaped value for 233 instead of the "é" from before:

load 'my_concatenation_test.rb'
MyConcatenationTest.test          #=> "\xE9"

So... what's up with that? Both strings have the same encoding (UTF-8), and changing it to ASCII doesn't seem to make any difference.

EDIT: I should mention that I used 256 in the example above because that's the lowest number it blows up on. It's pretty obvious that it's freaking out because it can't properly deal with anything higher than "\xFF". To clarify my question, I'm curious to know why this limitation exists when the code exists in a loaded ruby file, but not in IRB.

No, it works for me; I didn't get any error. Which Ruby version are you on?
– Arup Rakshit, Commented Sep 12, 2013 at 20:07
@ArupRakshit It fails in Ruby 1.9.3-p429, but works in 2.0.0 for me.
– lurker, Commented Sep 12, 2013 at 20:09
Maybe IRB runs a different Ruby version than your system. Add puts RUBY_VERSION to check that.
– steenslag, Commented Sep 12, 2013 at 20:12
@mbratch Yes, I tested on Ruby 2.0.0-p0, but forgot to mention that. Sorry. :)
– Arup Rakshit, Commented Sep 12, 2013 at 20:13

tessi · Accepted Answer · 2013-09-12 20:43:54Z

3

Which Ruby version are you using? This is probably because in Ruby 1.9 (and earlier) UTF-8 is not the default source encoding; 1.9 defaults to US-ASCII.

Adding a magic comment at the top of your file, as shown here, tells Ruby to treat the file's source as UTF-8.

# ~coding: utf-8
class MyConcatenationTest
  def self.test
    '' << 233
    '' << 256
  end
end

If you execute the file in Ruby 2.0, it works as expected without the magic comment, because UTF-8 is the default source encoding in Ruby 2.0.
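To see that the receiver's encoding decides the outcome, independent of irb, here is a minimal sketch. It assumes that an empty string literal in a US-ASCII source file behaves like a string forced to US-ASCII; `append_codepoint` is a hypothetical helper for illustration, not part of Ruby.

```ruby
# Hypothetical helper: append an integer codepoint to an empty string
# that has been given the requested encoding.
def append_codepoint(encoding, codepoint)
  "".dup.force_encoding(encoding) << codepoint
end

# UTF-8 receiver: the integer is treated as a Unicode codepoint.
utf8 = append_codepoint(Encoding::UTF_8, 256)      # => "Ā"

# US-ASCII receiver: values up to 255 are appended as a single raw byte.
ascii = append_codepoint(Encoding::US_ASCII, 233)  # => "\xE9"

# US-ASCII receiver: anything above 255 cannot fit in one byte.
error = begin
  append_codepoint(Encoding::US_ASCII, 256)
  nil
rescue RangeError => e
  e                                                # 256 out of char range
end

puts utf8.inspect, ascii.inspect, error.inspect
```

If the code has to behave the same no matter how the file is parsed, passing the encoding explicitly, for example `256.chr(Encoding::UTF_8)`, avoids relying on the source encoding at all.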

Why does it work in irb (even with ruby 1.9.3)?

irb uses the $LANG environment variable to determine which encoding it should use. My (and maybe your?) $LANG is set to en_US.UTF-8, which makes irb use UTF-8 encoding.

You may start your irb with irb -EISO-8859-1 (or some other encoding) to change that.

$ irb -EISO-8859-1 # start irb with ISO-8859-1 encoding
irb(main):001:0> "".encoding
=> #<Encoding:ISO-8859-1>
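To compare what irb and a loaded file each see, a small diagnostic like the following can be run in both places. This is only a sketch; `encoding_report` is a hypothetical helper name, not something Ruby or irb provides.

```ruby
# Hypothetical diagnostic helper; not part of Ruby or irb.
# Collects the values that influence how '' << integer behaves.
def encoding_report
  {
    ruby_version: RUBY_VERSION,                   # interpreter version
    source: __ENCODING__,                         # encoding used to parse this code
    literal: "".encoding,                         # encoding given to string literals here
    default_external: Encoding.default_external   # derived from the locale (e.g. LANG)
  }
end

encoding_report.each { |name, value| puts "#{name}: #{value}" }
```

If the `source` value differs between irb and the loaded file, that difference is the likely reason the same line behaves differently in the two places.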

edited Sep 12, 2013 at 20:43

answered Sep 12, 2013 at 20:07

tessi


7 Comments

lurker Over a year ago

It fails on Ruby 1.9.3-p429

tessi Over a year ago

@mbratch with or without the magic comment? It works for me with the comment in 1.9.3-p392 (oh, I should update my dusty 1.9.3 :)

Arup Rakshit Over a year ago

I have seen #encoding 'utf-8' but not the magic one, # ~coding: utf-8. Could you point me to the documentation? :P

SeanMeh Over a year ago

How very interesting. I didn't think about the encoding used for the code-file at all, and only checked that the string encoding was UTF-8. Thanks much for the fast answer.

tessi Over a year ago

@ArupRakshit You can find the MRI implementation of the magic comment here: github.com/ruby/ruby/blob/…
