I am considering writing a C-based Ruby gem to speed up text wrapping in Prawn. I've read a very small portion of the C source for MRI before, but don't know the API for building extensions well yet.

Within my C code, I'd like to get a direct pointer to the data within a Ruby String, and walk over it byte by byte. Further to that, I'd like to store pointers within the buffer in my own struct, and use them not only within the scope of a single call, but within subsequent calls into the extension code.

Is this possible? Could the GC move the Strings, rendering my pointers invalid? And also, how can I let Ruby know that I am holding pointers to the Strings in my own structs (so the GC doesn't try to reclaim them)? Can this code be written in a way which is compatible with both MRI 1.8 and 1.9?

And since I'm asking about using pointers safely in a C-based Ruby extension: can I use malloc and free the same as I would in a "regular" C-based project?

  • Have you seen this: media.pragprog.com/titles/ruby3/ext_ruby.pdf (PDF)? Commented Aug 21, 2012 at 17:21
  • You might also want to look at the README.EXT file: github.com/ruby/ruby/blob/v1_9_3_194/README.EXT. The Pickaxe chapter (which is what the other link is) is based on it. It might also be useful to compare the README.EXT files of different Ruby versions to spot any differences. Commented Aug 22, 2012 at 21:32

1 Answer

The link that matt offers is really good. It would have saved me days if I had found it before.

You can keep references to Ruby Strings and pointers into them. I would suggest freezing the String, so that every attempt to modify it raises an error instead of possibly reallocating the buffer under your pointers. The function Data_Wrap_Struct() lets you wrap your own data structure in a Ruby object. Besides the data structure and the class of the wrapping object, it takes two function arguments: mark, which tells the garbage collector which other Ruby objects your structure references, and free, which is called when the wrapping object is collected.
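To make that concrete, here is a minimal sketch of how the pieces might fit together. The class name WrapSketch, the struct wrap_state and the method next_byte are made up for illustration; only the Ruby API calls are real. It assumes RSTRING_PTR and RSTRING_LEN are available, which to my knowledge is the case from 1.8.6 on. Note that it freezes the String the caller passes in.

```c
#include <ruby.h>

/* Hypothetical state: a String plus pointers into its byte buffer. */
typedef struct {
    VALUE text;          /* the Ruby String we point into */
    const char *cursor;  /* next unread byte inside text's buffer */
    const char *end;     /* one past the last byte */
} wrap_state;

/* mark: tell the GC that this struct references another Ruby object. */
static void wrap_state_mark(void *p)
{
    wrap_state *st = p;
    rb_gc_mark(st->text);
}

/* free: called when the wrapping Ruby object is collected. */
static void wrap_state_free(void *p)
{
    xfree(p);
}

/* WrapSketch.new(str): freeze str and remember pointers into it. */
static VALUE wrap_sketch_s_new(VALUE klass, VALUE str)
{
    wrap_state *st;

    StringValue(str);    /* raises TypeError unless str is String-like */
    rb_obj_freeze(str);  /* later modification attempts now raise */

    st = ALLOC(wrap_state);
    st->text = str;
    st->cursor = RSTRING_PTR(str);
    st->end = st->cursor + RSTRING_LEN(str);
    return Data_Wrap_Struct(klass, wrap_state_mark, wrap_state_free, st);
}

/* WrapSketch#next_byte: the stored pointer is reused across calls. */
static VALUE wrap_sketch_next_byte(VALUE self)
{
    wrap_state *st;

    Data_Get_Struct(self, wrap_state, st);
    if (st->cursor >= st->end) return Qnil;
    return INT2FIX((unsigned char)*st->cursor++);
}

void Init_wrap_sketch(void)
{
    VALUE cWrapSketch = rb_define_class("WrapSketch", rb_cObject);
    rb_define_singleton_method(cWrapSketch, "new", wrap_sketch_s_new, 1);
    rb_define_method(cWrapSketch, "next_byte", wrap_sketch_next_byte, 0);
}
```

As long as the WrapSketch object is reachable, the mark function keeps the String alive, and freezing it means its buffer should not be reallocated while you hold pointers into it.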

What took me some time to understand is that the garbage collector really does scan the stacks of all Ruby threads, looking for references to Ruby objects. So keeping VALUEs on the stack is also a safe way to keep objects referenced.

Can this code be written in a way which is compatible with both MRI 1.8 and 1.9?

The basic API for extensions didn't change very much (I think) from 1.8 to 1.9. But I've used only 1.9 so far.

can I use malloc and free the same as I would in a "regular" C-based project?

Sure, I cannot think of any reason why this should not be possible, as long as you don't expect the garbage collector to take care of the allocated memory.
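As an illustration, here is a sketch in plain C, with no Ruby API involved, of the kind of malloc/free bookkeeping a text wrapper might do. The names word_index, word_index_build and word_index_free are hypothetical. It walks a byte buffer it does not own and records a pointer to the start of each whitespace-separated word.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical table of word starts inside a buffer owned by someone else. */
typedef struct {
    const char **starts;  /* malloc'ed array of pointers into the buffer */
    size_t count;         /* number of entries in use */
    size_t capacity;      /* number of entries allocated */
} word_index;

/* Walk buf byte by byte; returns NULL on allocation failure. */
word_index *word_index_build(const char *buf, size_t len)
{
    word_index *idx = malloc(sizeof *idx);
    const char *p, *end = buf + len;
    int in_word = 0;

    if (!idx) return NULL;
    idx->starts = NULL;
    idx->count = 0;
    idx->capacity = 0;

    for (p = buf; p < end; p++) {
        if (*p == ' ' || *p == '\n' || *p == '\t') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            if (idx->count == idx->capacity) {
                /* Grow the pointer array geometrically. */
                size_t cap = idx->capacity ? idx->capacity * 2 : 8;
                const char **grown = realloc(idx->starts, cap * sizeof *grown);
                if (!grown) {
                    free(idx->starts);
                    free(idx);
                    return NULL;
                }
                idx->starts = grown;
                idx->capacity = cap;
            }
            idx->starts[idx->count++] = p;
        }
    }
    return idx;
}

/* Release everything word_index_build allocated; the buffer is untouched. */
void word_index_free(word_index *idx)
{
    if (!idx) return;
    free(idx->starts);
    free(idx);
}
```

In an extension, the free function you hand to Data_Wrap_Struct() would be the natural place to call something like word_index_free, since the garbage collector knows nothing about memory you obtained from malloc.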

I had a hard time mixing in C++ code compiled with a different version of gcc than the one the Ruby interpreter was built with. If you experience strange startup behavior, I would check for compiler version differences.

3 Comments

Awesome info, +1. When you say you "can't think of any reason" not to use malloc and free within a C-based Ruby extension, I wonder: did you not use dynamic memory allocation in your own extension? If you used C++, it's hard to imagine doing without new and delete.
@AlexD Of course, I'm using new and delete. I'm experiencing some problems now, but so far they all boil down to 'usual' memory access errors.
@AlexD btw: the project where I used C++ for a ruby extension is open sourced: github.com/TorstenRobitzki/Sioux
