Adding `opt_respond_to` to the Ruby VM: part 1

@byroot has been posting a series on optimizations he and others have made to the json gem, and it’s been 🔥🔥🔥. It’s enjoyable and informative, and I highly recommend reading what he’s posted so far.

In his second post, he mentions the possibility of improving performance by adding an additional method cache. This would involve compiling respond_to? calls in a special way:

It actually wouldn’t be too hard to add such a cache, we’d need to modify the Ruby compiler to compile respond_to? calls into a specialized opt_respond_to instruction that does have two caches instead of one. The first cache would be used to look up respond_to? on the object to make sure it wasn’t redefined, and the second one to look up the method we’re interested in. Or perhaps even 3 caches, as you also need to check if the object has a respond_to_missing? method defined in some cases.

That’s an idea I remember discussing in the past with some fellow committers, but I can’t quite remember if there was a reason we didn’t do it yet.
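To make the "wasn’t redefined" part concrete for myself, here is a rough Ruby-level illustration of the two situations those caches would have to guard against. This is only a sketch: the class names and the builtin_respond_to? helper are made up for illustration, and the real checks would happen inside the VM, not in Ruby code.

```ruby
class Plain; end

class Overridden
  # A user-defined respond_to? means the default behaviour can't be assumed.
  def respond_to?(name, include_all = false)
    name == :anything || super
  end
end

class Dynamic
  # The default respond_to? falls back to this hook for undefined methods.
  def respond_to_missing?(name, include_all = false)
    name.to_s.start_with?("find_by_") || super
  end
end

# Hypothetical helper: is respond_to? still the built-in one from Kernel?
def builtin_respond_to?(obj)
  obj.method(:respond_to?).owner == Kernel
end

p builtin_respond_to?(Plain.new)          # => true
p builtin_respond_to?(Overridden.new)     # => false
p Dynamic.new.respond_to?(:find_by_name)  # => true, via respond_to_missing?
p Dynamic.new.respond_to?(:save)          # => false
```

The first two calls correspond to the first cache (is respond_to? itself untouched?), and the last two show why respond_to_missing? might need a cache of its own.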

Inspired by his comment, I’m going to add a new bytecode instruction - opt_respond_to - to the Ruby VM, for fun. I don’t know how to add a new bytecode instruction (yet). I don’t know if one would get accepted by the Ruby team. I don’t know if adding it will actually provide a meaningful enhancement to performance. But let’s give it a try, shall we?

Understanding the requirements

I know I want to add a new Ruby Virtual Machine bytecode called opt_respond_to. What does code using respond_to? look like today, after being compiled? Here’s some code to evaluate:

if $stdout.respond_to?(:write)
  puts "Did you know you can write to $stdout?"
end

We can compile it using RubyVM::InstructionSequence. We compile the code, then we disassemble it to see the actual Ruby bytecode:

puts RubyVM::InstructionSequence.compile(DATA.read).disassemble

__END__
if $stdout.respond_to?(:write)
  puts "Did you know you can write to $stdout?"
end

📝 The __END__ format is just a convenient way of supplying some text to your program. Here we put all our Ruby code we want to compile after __END__ and it will be available to our program as an IO object called DATA. Thanks to Drew Bragg for the tip!

Compiling using InstructionSequence gives us:

== disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(3,3)>
0000 getglobal                    :$stdout                  (   1)[Li]
0002 putobject                    :write
0004 opt_send_without_block       <calldata!mid:respond_to?, argc:1, ARGS_SIMPLE>
0006 branchunless                 14
0008 putself                                                (   2)[Li]
0009 putstring                    "Did you know you can write to $stdout?"
0011 opt_send_without_block       <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>
0013 leave
0014 putnil
0015 leave

I’m going to guess at the parts I think matter most - putobject, opt_send_without_block and branchunless.

putobject pushes the symbol :write onto the VM stack
opt_send_without_block is given <calldata!mid:respond_to?, argc:1, ARGS_SIMPLE>. I’m guessing calldata is a format for specifying metadata about what is being called. We’ve got the method name, respond_to?, how many args are being used, argc:1, and that the arguments are “simple”, ARGS_SIMPLE. mid stands for… “method id”?
branchunless isn’t specifically related to creating a new instruction, but I think it’s informative for how the respond_to? result is used. I believe it means “if the value on top of the stack is falsy (false or nil), jump to instruction 14”. 14 in this case is the putnil near the bottom of the bytecode. Each instruction is prefixed with its offset into the instruction sequence, printed as a zero-padded decimal number (not hex). putobject is located at 0002, opt_send_without_block is located at 0004, branchunless is located at 0006 and putnil is located at 0014. The offsets skip numbers because an instruction’s operands take up slots too.
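To double check my reading of those offsets, here is a small sketch that rebuilds them from RubyVM::InstructionSequence#to_a instead of reading the disassembly. The helper name instructions_with_offsets is mine, and I’m assuming that every Array in the last element of to_a is one instruction whose length (name plus operands) is the number of slots it occupies.

```ruby
# The same snippet compiled earlier in this post.
RESPOND_TO_SOURCE = <<~RUBY
  if $stdout.respond_to?(:write)
    puts "Did you know you can write to $stdout?"
  end
RUBY

# Returns [[offset, instruction_array], ...] for the top-level iseq.
# Assumption: integers and symbols in the body are line numbers, events
# and labels; only Arrays are instructions.
def instructions_with_offsets(source)
  body = RubyVM::InstructionSequence.compile(source).to_a.last
  offset = 0
  body.grep(Array).map do |insn|
    pair = [offset, insn]
    offset += insn.length # instruction name + one slot per operand
    pair
  end
end

instructions_with_offsets(RESPOND_TO_SOURCE).each do |offset, (name, *operands)|
  puts format("%04d %-24s %s", offset, name, operands.map(&:inspect).join(", "))
end
```

In this form the call data shows up as a plain Hash operand, and the branchunless target appears as a label symbol rather than a bare number. The exact instruction names can differ between Ruby versions.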

I think the only thing I will be changing is compiling calls to respond_to? into an opt_respond_to instruction instead of opt_send_without_block. We’ll see!

There are actually a few variations for respond_to? that I wasn’t aware of. The interface takes a symbol or a string as the first parameter, and then a boolean for whether to include private and protected methods:

$stdout.respond_to?("write")
$stdout.respond_to?("write", true)
$stdout.respond_to?(:write, true)
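Here is a quick sketch of what that second parameter changes at runtime. The Vault class and its methods are made up purely for illustration.

```ruby
class Vault
  def unlock
    :unlocked
  end

  private

  def combination
    1234
  end
end

vault = Vault.new

p vault.respond_to?(:unlock)               # => true  (public method)
p vault.respond_to?(:combination)          # => false (private, hidden by default)
p vault.respond_to?(:combination, true)    # => true  (include_all is true)
p vault.respond_to?("combination", true)   # => true  (a string name works too)
```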

Let’s see the bytecode for these variations:

== disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(3,33)>
0000 getglobal                    :$stdout                  (   1)[Li]
0002 putstring                    "write"
0004 opt_send_without_block       <calldata!mid:respond_to?, argc:1, ARGS_SIMPLE>
0006 pop
0007 getglobal                    :$stdout                  (   2)[Li]
0009 putstring                    "write"
0011 putobject                    true
0013 opt_send_without_block       <calldata!mid:respond_to?, argc:2, ARGS_SIMPLE>
0015 pop
0016 getglobal                    :$stdout                  (   3)[Li]
0018 putobject                    :write
0020 putobject                    true
0022 opt_send_without_block       <calldata!mid:respond_to?, argc:2, ARGS_SIMPLE>
0024 leave

It doesn’t make a huge difference. The string version is nearly identical, except the argument is pushed with putstring "write" instead of putobject :write. The boolean versions add an additional putobject which pushes the boolean onto the stack. Then the opt_send_without_block call has argc:2 instead of argc:1. Good to understand, but functionally the same.
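The same comparison can be done programmatically. This is a sketch that compiles each variation on its own and pulls out the instructions that run before the respond_to? call, plus the argc recorded in its call data. The respond_to_call_site helper is my own name, and I’m assuming the call data appears in to_a as a Hash with :mid and :orig_argc keys.

```ruby
RESPOND_TO_VARIANTS = [
  '$stdout.respond_to?(:write)',
  '$stdout.respond_to?("write")',
  '$stdout.respond_to?("write", true)',
  '$stdout.respond_to?(:write, true)',
].freeze

# Returns the instruction names that run before the respond_to? call,
# and the argument count stored in that call's call data.
def respond_to_call_site(source)
  insns = RubyVM::InstructionSequence.compile(source).to_a.last.grep(Array)
  index = insns.index do |insn|
    insn.drop(1).any? { |op| op.is_a?(Hash) && op[:mid] == :respond_to? }
  end
  calldata = insns[index].drop(1).find { |op| op.is_a?(Hash) }
  { setup: insns.first(index).map(&:first), argc: calldata[:orig_argc] }
end

RESPOND_TO_VARIANTS.each do |source|
  site = respond_to_call_site(source)
  puts "#{source.ljust(36)} argc=#{site[:argc]} setup=#{site[:setup].inspect}"
end
```

The instruction used to push the string literal may have a different name depending on the Ruby version, so I wouldn’t rely on the exact names in the setup list, only on the overall shape.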

I’m going to keep these explorations shorter, and break them into parts, so I’m going to stop here. So far we’ve:

Identified that I want to create an opt_respond_to instruction for the Ruby VM
Compiled a simple respond_to? example, and examined what the current bytecode looks like
Identified what I think are the relevant instructions that need to be converted to opt_respond_to

Next up, I’m going to walk through the path CRuby takes from starting the program to compiling our code, starting with main.c. Here we go!