20

I'm using Ruby 1.9 to open several files and copy them into an archive. Some of the files are binary and some are not. Since Ruby 1.9 does not automatically open binary files in binary mode, is there a way to make it do so? (So ".class" would be binary, ".txt" not.)

3 Answers

39

Actually, the previous answer by Alex D is incomplete. While it's true that Unix file systems have no "text" mode, Ruby does distinguish between opening files in binary and non-binary mode:

s = File.open('/tmp/test.jpg', 'r') { |io| io.read }
s.encoding
# => #<Encoding:UTF-8> (the default external encoding; UTF-8 on most systems)

is different from the following (note the "rb"):

s = File.open('/tmp/test.jpg', 'rb') { |io| io.read }
s.encoding
# => #<Encoding:ASCII-8BIT>

The latter, as the docs say, sets the external encoding to ASCII-8BIT, which tells Ruby not to attempt to interpret the result as UTF-8. You can get the same encoding on a string you have already read by setting it explicitly with s.force_encoding('ASCII-8BIT'). This is key if you want to read binary data into a string and move it around (e.g. saving it to a database).
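To see the two routes to a binary-tagged string side by side, here is a minimal sketch. The method names (read_both_ways, demo_encodings), the SAMPLE_BYTES constant, and the four sample bytes are made up for illustration, and Tempfile.create needs a much newer Ruby than 1.9.

```ruby
require 'tempfile'

# Hypothetical sample data: four bytes that are not valid UTF-8.
SAMPLE_BYTES = [0xFF, 0xD8, 0xFF, 0xE0].pack('C*')

# Reads the same file once in text mode ('r') and once in binary mode ('rb').
def read_both_ways(path)
  text   = File.open(path, 'r')  { |io| io.read }
  binary = File.open(path, 'rb') { |io| io.read }
  [text, binary]
end

# Writes the sample bytes to a temporary file and compares the two reads.
def demo_encodings
  Tempfile.create('encoding-demo') do |tmp|
    tmp.binmode
    tmp.write(SAMPLE_BYTES)
    tmp.flush

    text, binary = read_both_ways(tmp.path)
    # Retag a copy of the text-mode string as binary; the bytes are untouched.
    retagged = text.dup.force_encoding('ASCII-8BIT')

    {
      text_encoding:   text.encoding,    # depends on Encoding.default_external
      binary_encoding: binary.encoding,  # ASCII-8BIT regardless of locale
      retagged_equal:  retagged == binary
    }
  end
end

p demo_encodings
```

Note that force_encoding only changes the label on the string. It does not undo any line-break conversion that text mode may already have applied on Windows, so opening with 'rb' is the safer choice for real binary data.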

19

Since Ruby 1.9.1 there is a dedicated method for binary reading (IO.binread), and since 1.9.3 there is one for binary writing as well (IO.binwrite):

For reading:

content = IO.binread(file)

For writing:

IO.binwrite(file, content)

Since IO is the parent class of File, you could also do the following, which is probably more expressive:

content = File.binread(file)
File.binwrite(file, content)
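Putting the two calls together gives a byte-for-byte copy, which is close to what the question needs before adding a file to an archive. This is only a sketch: copy_binary and demo_copy are hypothetical helper names, the file names and payload are invented, and the whole file is read into memory, so it suits small files best.

```ruby
require 'tmpdir'

# Copies src to dest byte for byte and returns the number of bytes written.
def copy_binary(src, dest)
  File.binwrite(dest, File.binread(src))
end

# Round-trips a small made-up payload through a temporary directory.
def demo_copy
  Dir.mktmpdir do |dir|
    src  = File.join(dir, 'Example.class')
    dest = File.join(dir, 'Copy.class')

    # Includes CR LF bytes, which text mode could alter on Windows.
    payload = [0xCA, 0xFE, 0xBA, 0xBE, 0x0D, 0x0A].pack('C*')
    File.binwrite(src, payload)

    written = copy_binary(src, dest)
    [written, File.binread(dest) == payload]
  end
end

p demo_copy
```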

1 Comment

Yes, since the parent of the File class is the IO class.

1

On Unix-like platforms, there is no difference between opening files in "binary" and "text" modes. On Windows, "text" mode converts line breaks to DOS style, and "binary" mode does not.

Unless you need linebreak conversion on Windows platforms, just open all the files in "binary" mode. There is no harm in reading a text file in "binary" mode.

If you really want to distinguish, you will have to match File.extname(filename) against a list of known extensions like ".txt" and ".class".
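A minimal sketch of that extension check might look like the following. The BINARY_EXTENSIONS list and the read_mode_for helper are assumptions for illustration; adjust the list to the file types you actually archive.

```ruby
# Hypothetical list of extensions to treat as binary.
BINARY_EXTENSIONS = %w[.class .jpg .png .zip].freeze

# Returns the File.open mode string for a filename based on its extension.
# The comparison is case-insensitive; files without an extension get text mode.
def read_mode_for(filename)
  BINARY_EXTENSIONS.include?(File.extname(filename).downcase) ? 'rb' : 'r'
end

%w[Main.class notes.txt README].each do |name|
  puts "#{name}: #{read_mode_for(name)}"
end
```

The returned mode can be passed straight to File.open, for example File.open(path, read_mode_for(path)) { |io| io.read }.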

2 Comments

Note that this answer is wrong. Ruby reads into a string, and as of 1.9 that string has an encoding associated with it. See the more highly voted answer for details and ignore this one. If Alex can delete it, that would be preferable.
If I just delete it, the existing answer won't make sense ("the answer by Alex D..."). It would be better if the information in this answer (the effect of the 'b' flag on line-break conversion) were first consolidated with the information in the other one.
