Age | Commit message (Collapse) | Author |
---|
| If we have a frozen array `[..., a, ...]` and a compile-time fixnum index `i`, we can do the array load at compile-time. |
| Notes: Merged: https://.com/ruby/ruby/pull/13626 |
| Now that the shape_id gives us all the same information, it's no longer needed. Notes: Merged: https://.com/ruby/ruby/pull/13612 |
| This allow checking if an object has ivars with just a shape_id mask. Notes: Merged: https://.com/ruby/ruby/pull/13606 |
| Notes: Merged: https://.com/ruby/ruby/pull/13596 |
| This behave almost exactly as a T_OBJECT, the layout is entirely compatible. This aims to solve two problems. First, it solves the problem of namspaced classes having a single `shape_id`. Now each namespaced classext has an object that can hold the namespace specific shape. Second, it open the door to later make class instance variable writes atomics, hence be able to read class variables without locking the VM. In the future, in multi-ractor mode, we can do the write on a copy of the `fields_obj` and then atomically swap it. Considerations: - Right now the `RClass` shape_id is always synchronized, but with namespace we should likely mark classes that have multiple namespace with a specific shape flag. Notes: Merged: https://.com/ruby/ruby/pull/13411 |
| Previously, `asm.mov(m32, imm32)` panicked when `imm32 > 0x80000000`. It attempted to split imm32 into a register before doing the store, but then the register size didn't match the destination size. Instead of splitting, use the `MOV r/m32, imm32` form which works for all 32-bit values. Adjust asserts that assumed that all forms undergo sign extension, which is not true for this case. See: 54edc930f9f0a658da45cfcef46648d1b6f82467 Notes: Merged: https://.com/ruby/ruby/pull/13576 |
| Notes: Merged: https://.com/ruby/ruby/pull/13556 |
| Notes: Merged: https://.com/ruby/ruby/pull/13524 |
| Now all flags are only in the `shape_id_t`, and can all be checked without needing to dereference a pointer. Notes: Merged: https://.com/ruby/ruby/pull/13515 |
| Instead it's now a `shape_id` flag. This allows to check if an object is complex without having to chase the `rb_shape_t` pointer. Notes: Merged: https://.com/ruby/ruby/pull/13511 |
| Followup: https://.com/ruby/ruby/pull/13341 / [Feature #21353] Even thought `shape_id_t` has been make 32bits, we were still limited to use only the lower 16 bits because they had to fit alongside `attr_index_t` inside a `uintptr_t` in inline caches. By enlarging inline caches we can unlock the full 32bits on all platforms, allowing to use these extra bits for tagging. Notes: Merged: https://.com/ruby/ruby/pull/13500 |
| Whenever we run into an inline cache miss when we try to set an ivar, we may need to take the global lock, just to be able to lookup inside `shape->edges`. To solve that, when we're in multi-ractor mode, we can treat the `shape->edges` as immutable. When we need to add a new edge, we first copy the table, and then replace it with CAS. This increases memory allocations, however we expect that creating new transitions becomes increasingly rare over time. ```ruby class A def initialize(bool) @a = 1 if bool @b = 2 else @c = 3 end end def test @d = 4 end end def bench(iterations) i = iterations while i > 0 A.new(true).test A.new(false).test i -= 1 end end if ARGV.first == "ractor" ractors = 8.times.map do Ractor.new do bench(20_000_000 / 8) end end ractors.each(&:take) else bench(20_000_000) end ``` The above benchmark takes 27 seconds in Ractor mode on Ruby 3.4, and only 1.7s with this branch. Co-Authored-By: Étienne Barrié <[email protected]> Notes: Merged: https://.com/ruby/ruby/pull/13441 |
| Previously we used a flag to set whether a module was uninitialized. When checked whether a class was initialized, we first had to check that it had a non-zero superclass, as well as that it wasn't BasicObject. With the advent of namespaces, RCLASS_SUPER is now an expensive operation, and though we could just check for the prime superclass, we might as well take this opportunity to use a flag so that we can perform the initialized check with as few instructions as possible. It's possible in the future that we could prevent uninitialized classes from being available to the user, but currently there are a few ways to do that. Notes: Merged: https://.com/ruby/ruby/pull/13443 |
| Notes: Merged: https://.com/ruby/ruby/pull/13450 |
| Further reduce exposure of `rb_shape_t`. Notes: Merged: https://.com/ruby/ruby/pull/13450 |
| We should avoid conversions from `rb_shape_t *` into `shape_id_t` outside of `shape.c` as the short term goal is to have `shape_id_t` contain tags. Notes: Merged: https://.com/ruby/ruby/pull/13448 |
| ``` # frozen_string_ltieral: true hash["literal"] = value ``` Notes: Merged: https://.com/ruby/ruby/pull/13342 |
| This commit allows building YJIT and ZJIT simultaneously, a "combo build". Previously, `./configure --enable-yjit --enable-zjit` failed. At runtime, though, only one of the two can be enabled at a time. Add a root Cargo workspace that contains both the yjit and zjit crate. The common Rust build integration mechanisms are factored out into defs/jit.mk. Combo YJIT+ZJIT s are supported; if either JIT uses `--enable-*=dev`, both of them are built in dev mode. The combo build requires Cargo, but building one JIT at a time with only rustc in release build remains supported. Notes: Merged: https://.com/ruby/ruby/pull/13262 |
| Notes: Merged-By: k0kubun <[email protected]> |
| |
| As well as `RB_OBJ_SHAPE_ID` -> `rb_obj_shape_id` and `RSHAPE` is now a simple alias for `rb_shape_lookup`. I tried to turn all these into `static inline` but I'm having trouble with `RUBY_EXTERN rb_shape_tree_t *rb_shape_tree_ptr;` not being exposed as I'd expect. Notes: Merged: https://.com/ruby/ruby/pull/13283 |
| And `rb_shape_get_shape` -> `RB_OBJ_SHAPE`. Notes: Merged: https://.com/ruby/ruby/pull/13283 |
| Also rename it, and change parameters to be consistent with other transition functions. Notes: Merged: https://.com/ruby/ruby/pull/13283 |
| Notes: Merged: https://.com/ruby/ruby/pull/13283 |
| Notes: Merged: https://.com/ruby/ruby/pull/13283 |
| And get rid of the `obj_to_id_tbl` It's no longer needed, the `object_id` is now stored inline in the object alongside instance variables. We still need the inverse table in case `_id2ref` is invoked, but we lazily build it by walking the heap if that happens. The `object_id` concern is also no longer a GC implementation concern, but a generic implementation. Co-Authored-By: Matt Valentine-House <[email protected]> Notes: Merged: https://.com/ruby/ruby/pull/13159 |
| Also refactor checks for `->type == SHAPE_OBJ_TOO_COMPLEX`. Notes: Merged: https://.com/ruby/ruby/pull/13159 |
| Ivars will longer be the only thing stored inline via shapes, so keeping the `iv_index` and `ivptr` names would be confusing. Instance variables won't be the only thing stored inline via shapes, so keeping the `ivptr` name would be confusing. `field` encompass anything that can be stored in a VALUE array. Similarly, `gen_ivtbl` becomes `gen_fields_tbl`. Notes: Merged: https://.com/ruby/ruby/pull/13159 |
| Notes: Merged-By: k0kubun <[email protected]> |
| Notes: Merged: https://.com/ruby/ruby/pull/13257 |
| Working towards having YJIT and ZJIT in the same build, we need to deduplicate some glue code that would otherwise cause name collision. Add jit.c for this and build it for YJIT and ZJIT builds. Update bindgen to look at jit.c; some shuffling of functions in the output, but the set of functions shouldn't have changed. Notes: Merged: https://.com/ruby/ruby/pull/13229 |
| * ZJIT: Disable ZJIT instructions when USE_ZJIT is 0 * Test the order of ZJIT instructions * Add more jobs that disable JITs * Show instruction names in the message Notes: Merged-By: k0kubun <[email protected]> |
| Notes: Merged-By: k0kubun <[email protected]> |
| Avoid generating an infinite loop in the case where: 1. Block `first` is adjacent to block `second`, and the branch from `first` to `second` is a fallthrough, and 2. Block `second` immediately exits to the interpreter, and 3. Block `second` is invalidated and YJIT is OOM While pondering how to fix this, I think I've stumbled on another related edge case: 1. Block `incoming_one` and `incoming_two` both branch to block `second`. Block `incoming_one` has a fallthrough 2. Block `second` immediately exits to the interpreter (so it starts with its exit) 3. When Block `second` is invalidated, the incoming fallthrough branch from `incoming_one` might be rewritten first, which overwrites the start of block `second` with a jump to a new branch stub. 4. YJIT runs of out memory 5. The incoming branch from `incoming_two` is then rewritten, but because we're OOM we can't generate a new stub, so we use `second`'s exit as the branch target. However `second`'s exit was already overwritten with a jump to the branch stub for `incoming_one`, so `incoming_two` will end up jumping to `incoming_one`'s branch stub. Fixes [Bug #21257] Notes: Merged: https://.com/ruby/ruby/pull/13186 Merged-By: XrXr |
| This commit inlines instructions for Class#new. To make this work, we added a new YARV instructions, `opt_new`. `opt_new` checks whether or not the `new` method is the default allocator method. If it is, it allocates the object, and pushes the instance on the stack. If not, the instruction jumps to the "slow path" method call instructions. Old instructions: ``` > ruby --dump=insns -e'Object.new' == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,10)> 0000 opt_getconstant_path <ic:0 Object> ( 1)[Li] 0002 opt_send_without_block <calldata!mid:new, argc:0, ARGS_SIMPLE> 0004 leave ``` New instructions: ``` > ./miniruby --dump=insns -e'Object.new' == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,10)> 0000 opt_getconstant_path <ic:0 Object> ( 1)[Li] 0002 putnil 0003 swap 0004 opt_new <calldata!mid:new, argc:0, ARGS_SIMPLE>, 11 0007 opt_send_without_block <calldata!mid:initialize, argc:0, FCALL|ARGS_SIMPLE> 0009 jump 14 0011 opt_send_without_block <calldata!mid:new, argc:0, ARGS_SIMPLE> 0013 swap 0014 pop 0015 leave ``` This commit speeds up basic object allocation (`Foo.new`) by 60%, but classes that take keyword parameters see an even bigger benefit because no hash is allocated when instantiating the object (3x to 6x faster). Here is an example that uses `Hash.new(capacity: 0)`: ``` > hyperfine "ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'" "./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'" Benchmark 1: ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' Time (mean ± σ): 1.082 s ± 0.004 s [User: 1.074 s, System: 0.008 s] Range (min … max): 1.076 s … 1.088 s 10 runs Benchmark 2: ./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' Time (mean ± σ): 627.9 ms ± 3.5 ms [User: 622.7 ms, System: 4.8 ms] Range (min … max): 622.7 ms … 633.2 ms 10 runs Summary ./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' ran 1.72 ± 0.01 times faster than ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' ``` This commit changes the backtrace for `initialize`: ``` aaron@tc ~/g/ruby (inline-new)> cat test.rb class Foo def initialize puts caller end end def hello Foo.new end hello aaron@tc ~/g/ruby (inline-new)> ruby -v test.rb ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [arm64-darwin24] test.rb:8:in 'Class#new' test.rb:8:in 'Object#hello' test.rb:11:in '<main>' aaron@tc ~/g/ruby (inline-new)> ./miniruby -v test.rb ruby 3.5.0dev (2025-03-28T23:59:40Z inline-new c4157884e4) +PRISM [arm64-darwin24] test.rb:8:in 'Object#hello' test.rb:11:in '<main>' ``` It also increases memory usage for calls to `new` by 122 bytes: ``` aaron@tc ~/g/ruby (inline-new)> cat test.rb require "objspace" class Foo def initialize puts caller end end def hello Foo.new end puts ObjectSpace.memsize_of(RubyVM::InstructionSequence.of(method(:hello))) aaron@tc ~/g/ruby (inline-new)> make runruby RUBY_ON_BUG='gdb -x ./.gdbinit -p' ./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems ./test.rb 656 aaron@tc ~/g/ruby (inline-new)> ruby -v test.rb ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [arm64-darwin24] 544 ``` Thanks to @ko1 for coming up with this idea! Co-Authored-By: John Hawthorn <[email protected]> |
| Notes: Merged: https://.com/ruby/ruby/pull/13131 |
| We filed https://.com/Shopify/zjit/pull/65 and https://.com/Shopify/zjit/pull/64 concurrently. Notes: Merged: https://.com/ruby/ruby/pull/13131 |
| Key here is calling rb_call_builtin_inits(), which sticking to public API for robustness is done by calling ruby_options(). Fixes: https://.com/Shopify/zjit/issues/61 Notes: Merged: https://.com/ruby/ruby/pull/13131 |
| Notes: Merged: https://.com/ruby/ruby/pull/13131 |
| This will be used for local type inference and potentially SCCP. Notes: Merged: https://.com/ruby/ruby/pull/13131 |
| (https://.com/Shopify/zjit/pull/16) * Add zjit_* instructions to profile the interpreter * Rename FixnumPlus to FixnumAdd * Update a comment about Invalidate * Rename Guard to GuardType * Rename Invalidate to Point * Drop unneeded debug!() * Plan on profiling the types * Use the output of GuardType as type refined outputs Notes: Merged: https://.com/ruby/ruby/pull/13131 |
| Notes: Merged: https://.com/ruby/ruby/pull/13131 |
| Notes: Merged: https://.com/ruby/ruby/pull/13131 |
| Notes: Merged: https://.com/ruby/ruby/pull/13131 |
| Notes: Merged: https://.com/ruby/ruby/pull/13131 |
| Notes: Merged: https://.com/ruby/ruby/pull/13131 |
| Notes: Merged: https://.com/ruby/ruby/pull/13131 |
| When YJIT is forced to discard all the code, that's bad for performance, so there should be an easy way to know about it. Notes: Merged: https://.com/ruby/ruby/pull/12882 |
| Notes: Merged-By: maximecb <[email protected]> |