ruby.git - The Ruby Programming Language

Age	Commit message	Author
9 days	Fix generic_ivar_set_shape_field for table rebuild	John Hawthorn
	[Bug #21438] Previously GC could trigger a table rebuild of the generic fields st_table in the middle of calling the st_update callback. This could cause entries to be reallocated or rearranged and the update to be for the wrong entry. This commit adds an assertion to make that case easier to detect, and replaces the st_update with a separate st_lookup and st_insert. Co-authored-by: Aaron Patterson <[email protected]> Co-authored-by: Jean Boussier <[email protected]> Notes: Merged: https://.com/ruby/ruby/pull/13589
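The hazard described above can be illustrated with a toy model. This is a minimal sketch, not the st.c code: `ToyTable`, `rebuild!` and `update_with_callback` are hypothetical names, and the rebuild is triggered by hand where GC would trigger it in the real VM. It shows why holding an entry across a callback can lose an update, and why a separate lookup followed by an insert does not.

```ruby
# Toy model only: ToyTable and its methods are hypothetical, not Ruby's st.c API.
class ToyTable
  Entry = Struct.new(:key, :value)

  def initialize
    @entries = []
  end

  # Simulate a table rebuild: every entry is copied into fresh storage,
  # so references to the old entries become stale.
  def rebuild!
    @entries = @entries.map { |old| Entry.new(old.key, old.value) }
  end

  def lookup(key)
    found = @entries.find { |entry| entry.key == key }
    found && found.value
  end

  def insert(key, value)
    found = @entries.find { |entry| entry.key == key }
    if found
      found.value = value
    else
      @entries << Entry.new(key, value)
    end
  end

  # Callback-style update: holds on to one entry while the block runs.
  def update_with_callback(key)
    found = @entries.find { |entry| entry.key == key }
    found.value = yield(found.value) # may write to a stale entry
  end
end

table = ToyTable.new
table.insert(:obj, 1)

# The block triggers a rebuild, as GC could in the middle of the callback.
table.update_with_callback(:obj) { |value| table.rebuild!; value + 1 }
stale_result = table.lookup(:obj) # the update went to the old entry

# Separate lookup and insert: nothing is held across the rebuild.
current = table.lookup(:obj)
table.rebuild!
table.insert(:obj, current + 1)
fixed_result = table.lookup(:obj)
```

In this model the callback-style update is silently lost, while the lookup-then-insert sequence lands in the rebuilt storage.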
14 days	Optimize callcache invalidation for refinements	alpaca-tc
	Fixes [Bug #21201] This change addresses a performance regression where defining methods inside `refine` blocks caused severe slowdowns. The issue was due to `rb_clear_all_refinement_method_cache()` triggering a full object space scan via `rb_objspace_each_objects` to find and invalidate affected callcaches, which is very inefficient. To fix this, I introduce `vm->cc_refinement_table` to track callcaches related to refinements. This allows us to invalidate only the necessary callcaches without scanning the entire heap, resulting in significant performance improvement. Notes: Merged: https://.com/ruby/ruby/pull/13077
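The difference between the two invalidation strategies can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyVM` and `CallCache` are hypothetical classes, the array stands in for the whole heap, and only the table name echoes `cc_refinement_table` from the message above.

```ruby
# Toy model only: ToyVM and CallCache are hypothetical names.
class CallCache
  attr_reader :refined
  attr_accessor :valid

  def initialize(refined)
    @refined = refined
    @valid = true
  end
end

class ToyVM
  def initialize
    @all_caches = []                              # stands in for the whole heap
    @cc_refinement_table = {}.compare_by_identity # only refinement-related caches
  end

  def new_cache(refined: false)
    cache = CallCache.new(refined)
    @all_caches << cache
    @cc_refinement_table[cache] = true if refined
    cache
  end

  # Old strategy: visit every object to find the caches to invalidate.
  def invalidate_by_full_scan
    visited = 0
    @all_caches.each do |cache|
      visited += 1
      cache.valid = false if cache.refined
    end
    visited
  end

  # New strategy: visit only the tracked caches, then forget them.
  def invalidate_by_table
    visited = @cc_refinement_table.size
    @cc_refinement_table.each_key { |cache| cache.valid = false }
    @cc_refinement_table.clear
    visited
  end
end

vm = ToyVM.new
1_000.times { vm.new_cache }
refined = Array.new(3) { vm.new_cache(refined: true) }

scan_visits = vm.invalidate_by_full_scan
refined.each { |cache| cache.valid = true } # reset for the second strategy
table_visits = vm.invalidate_by_table
```

The work done by the table-based strategy depends only on the number of refinement-related caches, not on the size of the heap.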
2025-05-08	Rename `ivptr` -> `fields`, `next_iv_index` -> `next_field_index`	Jean Boussier
	Ivars will no longer be the only thing stored inline via shapes, so keeping the `iv_index` and `ivptr` names would be confusing. `field` encompasses anything that can be stored in a VALUE array. Similarly, `gen_ivtbl` becomes `gen_fields_tbl`. Notes: Merged: https://.com/ruby/ruby/pull/13159
2025-04-29	st.c: Removed unused `set_add_direct_with_hash` function	Jean Boussier
	Notes: Merged: https://.com/ruby/ruby/pull/13208
2025-04-26	Use `set_table` to track const caches	Jean Boussier
	Now that we have a `set_table` implementation, we can use it to track const caches and save some memory. We could even save some more memory if `numtable` didn't store a copy of the `hash` and instead recomputed it every time, but this is a quick win. Notes: Merged: https://.com/ruby/ruby/pull/13184
2025-04-26	Implement Set as a core class	Jeremy Evans
	Set has been an autoloaded standard library since Ruby 3.2. The standard library Set is less efficient than it could be, as it uses Hash for storage, which stores unnecessary values for each key. Implementation details: * Core Set uses a modified version of `st_table`, named `set_table`. Other than `s/st_/set_/`, the main difference is that the stored records do not have values, making them 1/3 smaller. `st_table_entry` stores `hash`, `key`, and `record` (value), while `set_table_entry` only stores `hash` and `key`. This results in large sets using ~33% less memory compared to stdlib Set. For small sets, core Set uses 12% more memory (160 byte object slot and 64 malloc bytes, while stdlib set uses 40 for Set and 160 for Hash). More memory is used because the set_table is embedded and 72 bytes in the object slot are currently wasted. Hopefully we can make this more efficient and have it stored in an 80 byte object slot in the future. * All methods are implemented as cfuncs, except the pretty_print methods, which were moved to `lib/pp.rb` (which is where the pretty_print methods for other core classes are defined). As is typical for core classes, internal calls call C functions and not Ruby methods. For example, to check if something is a Set, `rb_obj_is_kind_of` is used, instead of calling `is_a?(Set)` on the related object. * Almost all methods use the same algorithm that the pure-Ruby implementation used. The exception is when calling `Set#divide` with a block with 2-arity. The pure-Ruby method used tsort to implement this. I developed an algorithm that only allocates a single intermediate hash and does not need tsort. * The `flatten_merge` protected method is no longer necessary, so it is not implemented (it could be). * Similar to Hash/Array, subclasses of Set are no longer reflected in `inspect` output. * RDoc from stdlib Set was moved to core Set, with minor updates. This includes a comprehensive benchmark suite for all public Set methods. 
As you would expect, the native version is faster in the vast majority of cases, and multiple times faster in many cases. There are a few cases where it is significantly slower: * Set.new with no arguments (~1.6x) * Set#compare_by_identity for small sets (~1.3x) * Set#clone for small sets (~1.5x) * Set#dup for small sets (~1.7x) These are slower as Set does not currently use the AR table optimization that Hash does, so a new set_table is initialized for each call. I'm not sure it's worth the complexity to have an AR table-like optimization for small sets (for hashes it makes sense, as small hashes are used everywhere in Ruby). The rbs and repl_type_completor bundled gems will need updates to support core Set. The pull request marks them as allowed failures. This passes all set tests with no changes. The following specs needed modification: * Modifying frozen set error message (changed for the better) * `Set#divide` when passed a 2-arity block no longer yields the same object as both the first and second argument (this seems like an issue with the previous implementation). * Set-like objects that override `is_a?` such that `is_a?(Set)` returns `true` are no longer treated as Set instances. * `Set.allocate.hash` is no longer the same as `nil.hash` * `Set#join` no longer calls `Set#to_a` (it calls the underlying C function). * `Set#flatten_merge` protected method is not implemented. Previously, `set.rb` added a `SortedSet` autoload, which loads `set/sorted_set.rb`. This replaces the `Set` autoload in `prelude.rb` with a `SortedSet` autoload, but I recommend removing it and `set/sorted_set.rb`. This moves `test/set/test_set.rb` to `test/ruby/test_set.rb`, reflecting the switch to a core class. This does not move the spec files, as I'm not sure how they should be handled. Internally, this uses the st_* types and functions as much as possible, and only adds set_* types and functions as needed. 
The underlying set_table implementation is stored in st.c, but there is no public C-API for it, nor is there one planned, in order to keep the ability to change the internals going forward. For internal uses of st_table with Qtrue values, those can probably be replaced with set_table. To do that, include internal/set_table.h. To handle symbol visibility (rb_ prefix), internal/set_table.h uses the same macro approach that include/ruby/st.h uses. The Set class (rb_cSet) and all methods are defined in set.c. There isn't currently a C-API for the Set class, though C-API functions can be added as needed going forward. Implements [Feature #21216] Co-authored-by: Jean Boussier <[email protected]> Co-authored-by: Oliver Nutter <[email protected]>
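The memory figures quoted in this commit message can be checked with a little arithmetic. The sketch below assumes a 64-bit build in which each of `hash`, `key` and `record` occupies one 8-byte word; that word size is an assumption, while the small-set byte counts are taken directly from the message.

```ruby
WORD = 8 # bytes per field, assuming a 64-bit build

st_entry_bytes  = 3 * WORD # st_table_entry: hash, key, record
set_entry_bytes = 2 * WORD # set_table_entry: hash, key

# Fraction saved per entry, which dominates for large sets.
entry_saving = 1 - set_entry_bytes.to_f / st_entry_bytes

# Small-set figures quoted in the commit message, in bytes.
core_small   = 160 + 64 # object slot + malloc
stdlib_small = 40 + 160 # Set object + Hash object
small_overhead = core_small.to_f / stdlib_small - 1
```

Under these assumptions the per-entry saving is one third, matching the "1/3 smaller" and "~33%" figures, and the small-set totals of 224 and 200 bytes give the quoted 12% overhead.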
2025-03-05	Replace tombstone when converting AR to ST hash	John Hawthorn
	[Bug #21170] st_table reserves -1 as a special hash value to indicate that an entry has been deleted. So that -1 remains a valid value for a hash function to return, do_hash replaces -1 with 0 so that it is not mistaken for the sentinel. Previously, when upgrading an AR table to an ST table, rb_st_add_direct_with_hash was used, which did not perform the same conversion. This could lead to a hash in a broken state where one of its entries, which was supposed to exist, was marked as a tombstone. The hash could then become further corrupted when the ST table required resizing, as the falsely tombstoned entry would be skipped but still counted in num entries, leading to an uninitialized entry at index 15. In most cases this will be really rare, unless using a very poorly implemented custom hash function. This also adds two debug assertions: one that st_add_direct_with_hash does not receive the reserved hash value, and a second in rebuild_table_with, which ensures that after we rebuild/compact a table it contains the expected number of elements. Co-authored-by: Alan Wu <[email protected]> Notes: Merged: https://.com/ruby/ruby/pull/12852
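The sentinel problem can be sketched in a few lines. This is a toy model under stated assumptions: the constant names are borrowed from the st.c constants named elsewhere in this log, `safe_hash`, `ToyEntry` and `live_entries` are hypothetical, and the table is reduced to a flat list of entries.

```ruby
# Toy model only; the table layout is greatly simplified.
RESERVED_HASH_VAL = -1             # marks a deleted entry (tombstone)
RESERVED_HASH_SUBSTITUTION_VAL = 0 # used instead when a key hashes to -1

# Mirrors the idea of do_hash: never hand the reserved value to the table.
def safe_hash(raw_hash)
  raw_hash == RESERVED_HASH_VAL ? RESERVED_HASH_SUBSTITUTION_VAL : raw_hash
end

ToyEntry = Struct.new(:hash_value, :key)

# Entries carrying the reserved value are treated as deleted and skipped.
def live_entries(entries)
  entries.reject { |entry| entry.hash_value == RESERVED_HASH_VAL }
end

raw_hashes = { a: 42, b: -1, c: 7 } # :b comes from a poor custom hash function

# Copying raw hashes directly: :b looks like a tombstone and is skipped.
broken = raw_hashes.map { |key, raw| ToyEntry.new(raw, key) }
# Applying the substitution first keeps every entry alive.
fixed = raw_hashes.map { |key, raw| ToyEntry.new(safe_hash(raw), key) }

broken_live = live_entries(broken).map(&:key)
fixed_live  = live_entries(fixed).map(&:key)
```

Without the substitution, the entry count says three while only two entries are visible, which is the kind of mismatch the new assertion in rebuild_table_with is meant to catch.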
2025-02-13	Remove dead rb_st_nth_key	Peter Zhu
	Notes: Merged: https://.com/ruby/ruby/pull/12742
2024-02-09	Move clean-up after table rebuilding	Nobuyoshi Nakada
	Suppress a false positive alert by CodeQL.
2023-12-25	Move internal ST functions to internal/st.h	Peter Zhu
	st_replace and st_init_existing_table_with_size are functions used internally in Ruby and should not be publicly visible.
2023-12-15	check modification while ar->st	Koichi Sasada
	* delete `ar_try_convert` and use `ar_force_convert_table` instead, to keep the program simple. * `ar_force_convert_table` checks for hash modification while calling the `#hash` method, with the following strategy: 1. copy keys (and vals) of ar_table 2. calc hashes from keys 3. check copied keys against the hash's keys. If they do not match, repeat from 1. Fixes [Bug #20050]
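The three-step strategy above can be sketched as a retry loop. This is only an illustration: `convert_with_retry` and `SneakyKey` are hypothetical, and a plain array of pairs stands in for the ar_table.

```ruby
# Toy model only: convert_with_retry is a hypothetical helper, not hash.c code.
def convert_with_retry(pairs)
  loop do
    keys = pairs.map(&:first) # 1. copy the keys
    hashes = keys.map(&:hash) # 2. #hash may run arbitrary Ruby code
    unchanged = keys.size == pairs.size &&
                keys.each_with_index.all? { |key, i| key.equal?(pairs[i].first) }
    return keys.zip(hashes) if unchanged # 3. nothing changed: safe to use
    # The table was modified while hashing; start over from step 1.
  end
end

# A key whose #hash mutates the table the first time it is called.
class SneakyKey
  def initialize(pairs)
    @pairs = pairs
    @fired = false
  end

  def hash
    unless @fired
      @fired = true
      @pairs << [:added_during_hash, 0]
    end
    super
  end
end

pairs = []
pairs << [SneakyKey.new(pairs), 1] << [:plain, 2]
result = convert_with_retry(pairs)
```

The first pass detects that the table grew while hashes were being computed and discards its work; the second pass sees a stable table and returns hashes for all three keys.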
2023-11-11	[Bug #19969] Compact st_table after deleted if possible	Nobuyoshi Nakada

2023-07-01	Define `NO_SANITIZE` with reference to ext/bigdecimal/missing.c	jinroq

2023-07-01	Suppress `warning: ‘unsigned-integer-overflow’ attribute directive ↵	jinroq
	ignored [-Wattributes]`
2023-06-30	Don't check for null pointer in calls to free	Peter Zhu
	According to the C99 specification section 7.20.3.2 paragraph 2: > If ptr is a null pointer, no action occurs. So we do not need to check that the pointer is a null pointer. Notes: Merged: https://.com/ruby/ruby/pull/8004
2023-06-29	Fix memory leak when copying ST tables	Peter Zhu
	st_copy allocates a st_table, which is not needed for hashes since it is allocated by VWA and embedded, so this causes a memory leak. The following script demonstrates the issue:
```ruby
20.times do
  100_000.times do
    {a: 1, b: 2, c: 3, d: 4, e: 5, f: 6, g: 7, h: 8, i: 9}
  end
  puts `ps -o rss= -p #{$$}`
end
```
Notes: Merged: https://.com/ruby/ruby/pull/8000
2023-06-24	De-duplicate parse_st.c code from st.c	Nobuyoshi Nakada
	Notes: Merged: https://.com/ruby/ruby/pull/7956
2023-06-17	Use ruby functions if `RUBY` is defined	Nobuyoshi Nakada
	Notes: Merged: https://.com/ruby/ruby/pull/7949
2023-06-17	Expand `#ifdef RUBY` region	Nobuyoshi Nakada
	Include the functions which are only used for `rb_hash_bulk_insert_into_st_table`. Notes: Merged: https://.com/ruby/ruby/pull/7949
2023-05-17	Implement Hash ST tables on VWA	Peter Zhu
	Notes: Merged: https://.com/ruby/ruby/pull/7742
2023-03-20	Use an st table for "too complex" objects	Aaron Patterson
	st tables will maintain insertion order so we can marshal dump / load objects with instance variables in the same order they were set on that particular instance [ruby-core:112926] [Bug #19535] Co-Authored-By: Jemma Issroff <[email protected]> Notes: Merged: https://.com/ruby/ruby/pull/7560
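The ordering property this commit preserves can be observed from Ruby. The sketch below is only a small demonstration with a hypothetical `Point` class; it does not force an object into the "too complex" state, it just shows that instance variables come back from a Marshal round trip in the order they were set.

```ruby
# Point is a hypothetical class used only for this demonstration.
class Point
  def initialize
    @y = 2 # deliberately set before @x
    @x = 1
  end
end

original = Point.new
copy = Marshal.load(Marshal.dump(original))

original_order = original.instance_variables
copy_order = copy.instance_variables
```

Both `original_order` and `copy_order` list `@y` before `@x`, reflecting insertion order rather than alphabetical order.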
2023-02-10	st.c: spell `perturb' properly	Eric Wong
	Otherwise, a reader may wonder who `Peter B.' is and why a variable is named after them...
2022-10-19	Fix and improve coroutines for Darwin (macOS) ppc/ppc64. (#5975)	Sergey Fedorov
	Notes: Merged-By: ioquatix <[email protected]>
2022-10-06	[Bug #19038] Fix corruption of generic_iv_tbl when compacting	Peter Zhu
	When the generic_iv_tbl is resized up, rebuild_table performs allocations that can trigger GC. If autocompaction is enabled, then moved objects are removed from and inserted into the generic_iv_tbl. This may cause another call to rebuild_table to resize the generic_iv_tbl. When returning back to the original rebuild_table, some of the data may be stale, causing the generic_iv_tbl to be corrupted. This commit changes rebuild_table to only read data from the st_table after the allocations have completed. Co-Authored-By: Matt Valentine-House <[email protected]> Notes: Merged: https://.com/ruby/ruby/pull/6494
2022-07-21	Expand tabs [ci skip]	Takashi Kokubun
	[Misc #18891] Notes: Merged: https://.com/ruby/ruby/pull/6094
2022-02-28	st.c: Fix a typo in a comment	Yusuke Endoh

2022-02-10	st.c: Do not clear entries_bound when calling Hash#shift for empty hash	Yusuke Endoh
	tab->entries_bound is used to check if the bins are full in rebuild_table_if_necessary. Hash#shift against an empty hash assigned 0 to tab->entries_bound, but didn't clear the bins. Thus, the table is not rebuilt even when the bins are full. Attempting to add a new element into full-bin hash gets stuck. This change stops clearing tab->entries_bound in Hash#shift. [Bug #18578] Notes: Merged: https://.com/ruby/ruby/pull/5539
2021-06-17	Adjust styles [ci skip]	Nobuyoshi Nakada
	* --braces-after-func-def-line * --dont-cuddle-else * --procnames-start-lines * --space-after-for * --space-after-if * --space-after-while
2021-04-11	st.c: skip all deleted entries [Bug #17779]	tompng (tomoya ishida)
	Update the start entry skipping all already deleted entries. Fixes performance issue of `Hash#first` in a certain case.
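The idea of advancing the start entry past every deleted slot can be sketched with a toy ordered table. `ToyOrderedTable` is hypothetical and much simpler than st.c; it only tracks keys, a deleted marker and a start index.

```ruby
# Toy model only: ToyOrderedTable is hypothetical and far simpler than st.c.
class ToyOrderedTable
  DELETED = Object.new

  attr_reader :entries_start

  def initialize(keys)
    @entries = keys.dup
    @entries_start = 0
  end

  def delete(key)
    index = @entries.index(key)
    return unless index
    @entries[index] = DELETED
    # Skip every deleted entry at the front, not just the one just removed.
    while @entries_start < @entries.size && @entries[@entries_start].equal?(DELETED)
      @entries_start += 1
    end
  end

  # Returns the first live key and how many slots were examined.
  def first
    steps = 0
    (@entries_start...@entries.size).each do |i|
      steps += 1
      return [@entries[i], steps] unless @entries[i].equal?(DELETED)
    end
    [nil, steps]
  end
end

table = ToyOrderedTable.new([:a, :b, :c, :d])
table.delete(:b) # leaves a hole behind the start entry
table.delete(:c)
table.delete(:a) # the start now jumps over :a, :b and :c in one go
first_key, steps = table.first
```

Because the start index already points at the first live slot, `first` examines a single slot instead of walking over the deleted ones on every call.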
2021-01-19	Replace "iff" with "if and only if"	Gannon McGibbon
	iff means if and only if, but readers without that knowledge might assume this to be a spelling mistake. To me, this seems like exclusionary language that is unnecessary. Simply using "if and only if" instead should suffice. Notes: Merged: https://.com/ruby/ruby/pull/4035
2020-11-30	[DOC] Fixed st_update comment [ci skip]	Nobuyoshi Nakada
	Clarified that the first and second arguments to the callback function are pointers to the KEY and the VALUE, but not those values themselves.
2020-10-17	sync RClass::ext::iv_index_tbl	Koichi Sasada
	iv_index_tbl manages instance variable indexes (ID -> index). This data structure should be synchronized with other ractors, so introduce some VM locks. This also introduces an atomic ivar cache used by set/getinlinecache instructions. To make updating the ivar cache (IVC) atomic, we changed the iv_index_tbl data structure to manage (ID -> entry), where an entry holds a serial and an index. IVC points to this entry so that the cache update becomes atomic. Notes: Merged: https://.com/ruby/ruby/pull/3662
2020-08-14	Enable arm64 optimizations that exist for power/x86 (#3393)	AGSaidi
	* Enable unaligned accesses on arm64
64-bit Arm platforms support unaligned accesses. Running the string benchmarks, this change improves performance by an average of 1.04x, min .96x, max 1.21x, median 1.01x.
* arm64 enable gc optimizations
Similar to x86 and powerpc optimizations.
|       |compare-ruby|built-ruby|
|:------|-----------:|---------:|
|hash1  |       0.225|     0.237|
|       |           -|     1.05x|
|hash2  |       0.110|     0.110|
|       |       1.00x|         -|
* vm_exec.c: improve performance for arm64
|                               |compare-ruby|built-ruby|
|:------------------------------|-----------:|---------:|
|vm_array                       |     26.501M|   27.959M|
|                               |           -|     1.06x|
|vm_attr_ivar                   |     21.606M|   31.429M|
|                               |           -|     1.45x|
|vm_attr_ivar_set               |     21.178M|   26.113M|
|                               |           -|     1.23x|
|vm_backtrace                   |       6.621|     6.668|
|                               |           -|     1.01x|
|vm_bigarray                    |     26.205M|   29.958M|
|                               |           -|     1.14x|
|vm_bighash                     |    504.155k|  479.306k|
|                               |       1.05x|         -|
|vm_block                       |     16.692M|   21.315M|
|                               |           -|     1.28x|
|block_handler_type_iseq        |       5.083|     7.004|
|                               |           -|     1.38x|
Notes: Merged-By: nurse <[email protected]>
2020-06-04	Removed no longer used constants [Bug #16934]	Nobuyoshi Nakada
	`RESERVED_HASH_VAL` and `RESERVED_HASH_SUBSTITUTION_VAL` have not been used directly in hash.c since 72825c35b0d8.
2020-03-16	Adjusted indents [ci skip]	Nobuyoshi Nakada

2020-03-11	Fix typos (#2958)	K.Takata
	* Fix a typo * Fix typos in st.[ch] Notes: Merged-By: k0kubun <[email protected]>
2020-02-27	st.c: remove variables that are no longer used	Yusuke Endoh
	to suppress a warning "variable 'check' set but not used"
2020-02-26	kill ST_DEBUG [Bug #16521]	卜部昌平
	This compile-time option has been broken for years (at least since commit 4663c224fa6c925ce54af32fd1c1cbac9508f5ec, according to git bisect). Let's delete code that no longer works. Notes: Merged: https://.com/ruby/ruby/pull/2926
2020-02-07	more on NULL versus functions.	卜部昌平
	Function pointers are not void*. See also ce4ea956d24eab5089a143bba38126f2b11b55b6 8427fca49bd85205f5a8766292dd893f003c0e48
2019-12-26	decouple internal.h headers	卜部昌平
	Saves committers' daily life by avoiding #include-ing everything from internal.h, making each file include what it needs instead. This would significantly speed up incremental builds. We take the following inclusion order in this changeset: 1. "ruby/config.h", where _GNU_SOURCE is defined (must be the very first thing among everything). 2. RUBY_EXTCONF_H if any. 3. Standard C headers, sorted alphabetically. 4. Other system headers, maybe guarded by #ifdef 5. Everything else, sorted alphabetically. Exceptions are those win32-related headers, which tend not to be self-contained (headers have inclusion order dependencies). Notes: Merged: https://.com/ruby/ruby/pull/2711
2019-12-20	Fixed misspellings	Nobuyoshi Nakada
	Fixed misspellings reported at [Bug #16437], only in ruby and rubyspec.
2019-10-21	st: Do error check only on non-Ruby	K.Takata
	Notes: Merged: https://.com/ruby/ruby/pull/2304
2019-10-21	st: Add NULL checking	K.Takata
	These are found by Coverity. Notes: Merged: https://.com/ruby/ruby/pull/2304
2019-09-22	st.c: Use rb_st_* prefix instead of st_* (#2479)	Yusuke Endoh
	The original st.c was a public domain hash table implementation, but Ruby's st.c is highly modified, and its data structure is not compatible with the original one. Therefore, when creating an extension library to wrap C code that uses the original st.c, the symbols conflict, which leads to a segfault. This changes the prefix `st_` of st.c functions to `rb_st_` to reflect that they are specific to Ruby, and to avoid symbol conflicts. Notes: Merged-By: mame <[email protected]>
2019-09-22	st.c (st_add_direct_with_hash): make it "static inline"	Yusuke Endoh
	It was originally static inline, but seemed to be accidentally published at 8f675cdd00e2c5b5a0f143f5e508dbbafdb20ccd.
2019-08-28	optimize get_power2 [Feature #15631]	pavel
	Merged: https://.com/ruby/ruby/pull/2292
2019-08-27	struct st_hash_type now free from ANYARGS	卜部昌平
	After 5e86b005c0f2ef30df2f9906c7e2f3abefe286a2, I now think ANYARGS is dangerous and should be extinct. This commit adds function prototypes for struct st_hash_type. Honestly I don't understand why they were commented out in the first place.
2019-08-27	st_foreach now free from ANYARGS	卜部昌平
	After 5e86b005c0f2ef30df2f9906c7e2f3abefe286a2, I now think ANYARGS is dangerous and should be extinct. This commit deletes ANYARGS from st_foreach. I strongly believe that this commit should have come with b0af0592fdd9e9d4e4b863fde006d67ccefeac21, which added an extra parameter to st_foreach callbacks.
2019-04-20	Add `GC.compact` again.	tenderlove
	🙏 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67620 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-17	Reverting compaction for now	tenderlove
	For some reason symbols (or classes) are being overridden in trunk git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67598 b2dd03c8-39d4-4d8f-98ff-823fe69b080e