path: root/test/ruby/test_regexp.rb
Age Commit message Author
2025-05-16Prevent double free for too big repetition quantifiers (#13332)Hiroya Fujinami
The previous implementation called `free(node)` twice when an error occurred (once while parsing and once while compiling the regexp), which led to a double free. This commit ensures `free(node)` is called only once by introducing a temporary pointer to hold parsing nodes. Notes: Merged-By: makenowjust <[email protected]>
2025-04-18[Feature #20724] Bump Unicode version to 16.0.0Mari Imaizumi
Notes: Merged: https://.com/ruby/ruby/pull/13117
2025-03-28TestRegexp#test_match_cache_positive_look_ahead_complex: Extend the timeout limitYusuke Endoh
2025-03-18[Feature #19908] Update Unicode headers to 15.1.0Mari Imaizumi
Notes: Merged: https://.com/ruby/ruby/pull/12798
2025-03-18Fix case folding in single byte encodingMari Imaizumi
Notes: Merged: https://.com/ruby/ruby/pull/12889
2025-03-11Fix memory leak in rb_reg_search_set_matchPeter Zhu
https://.com/ruby/ruby/pull/12801 changed regexp matches to reuse the backref, which causes memory to leak if the original registers of the match are not freed. For example, the following script leaks memory:

    10.times do
      1_000_000.times do
        "aaaaaaaaaaa".gsub(/a/, "")
      end

      puts `ps -o rss= -p #{$$}`
    end

Before: 774256 1535152 2297360 3059280 3821296 4583552 5160304 5091456 5114256 4980192
After: 12480 11440 11696 11632 11632 11760 11824 11824 11824 11888

Notes: Merged: https://.com/ruby/ruby/pull/12905
2025-02-28Improve tests for small UTF regex with case fold.Maciek Rząsa
Co-authored-by: Nobuyoshi Nakada <[email protected]> Notes: Merged: https://.com/ruby/ruby/pull/12787
2025-02-28Use mbuf instead of bitset for character class for small UTF. Fixes #16145Maciej Rzasa
Notes: Merged: https://.com/ruby/ruby/pull/12787
2024-11-11Fix regex timeout double-free after stack_doubleJohn Hawthorn
As of 10574857ce167869524b97ee862b610928f6272f, it's possible to crash on a double free due to `stk_alloc` AKA `msa->stack_p` being freed twice, once at the end of match_at and a second time in `FREE_MATCH_ARG` in the parent caller. Fixes [Bug #20886] Notes: Merged: https://.com/ruby/ruby/pull/12030
2024-07-25Fix memory leak in Regexp capture group when timeoutPeter Zhu
[Bug #20650] The capture group allocates memory that is leaked when it times out. For example:

    re = Regexp.new("^#{"(a*)" * 10_000}x$", timeout: 0.000001)
    str = "a" * 1000000 + "x"

    10.times do
      100.times do
        re =~ str
      rescue Regexp::TimeoutError
      end

      puts `ps -o rss= -p #{$$}`
    end

Before: 34688 56416 78288 100368 120784 140704 161904 183568 204320 224800
After: 16288 16288 16880 16896 16912 16928 16944 17184 17184 17200

Notes: Merged: https://.com/ruby/ruby/pull/11238
2024-07-16Add MatchData#bytebegin and MatchData#byteendShugo Maeda
These methods return the byte-based offset of the beginning or end of the specified match. [Feature #20576]
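A minimal sketch of how byte offsets differ from character offsets for a multibyte string. `MatchData#bytebegin` and `MatchData#byteend` are the methods named in this commit; the sample string, the variable names, and the `respond_to?` fallback for older Rubies are assumptions added for illustration only.

```ruby
# Each of these characters occupies 3 bytes in UTF-8.
str = "あいうえお"
m = /う/.match(str)

# Character-based offsets, available on every Ruby version.
char_begin = m.begin(0)
char_end   = m.end(0)

# Byte-based offsets. Fall back to measuring the prefix when the
# methods from this commit are not available (assumed fallback).
byte_begin = m.respond_to?(:bytebegin) ? m.bytebegin(0) : str[0, char_begin].bytesize
byte_end   = m.respond_to?(:byteend)   ? m.byteend(0)   : str[0, char_end].bytesize

puts "chars: #{char_begin}...#{char_end}, bytes: #{byte_begin}...#{byte_end}"
```

The byte offsets are the ones to use with byte-oriented APIs such as `String#byteslice`.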
2024-06-07TestRegexp#test_match_cache_positive_look_behind: Extend the timeout limitYusuke Endoh
2024-06-07TestRegexp#test_timeout_shorter_than_global: Extend the timeout limitYusuke Endoh
2024-06-07TestRegexp#test_s_timeout: accept timeout errors more tolerantlyYusuke Endoh
This test seems flaky on macOS Actions
2024-04-25Don't use assert_separately in Bug 20453 testDaniel Colson
https://.com/ruby/ruby/pull/10630#discussion_r1579565056 The PR was merged before I had a chance to address this feedback. `assert_separately` is not necessary for this test if I don't use a global timeout.
2024-04-25[Bug #20453] segfault in Regexp timeoutDaniel Colson
https://bugs.ruby-lang.org/issues/20228 started freeing `stk_base` to avoid a memory leak. But `stk_base` is sometimes stack allocated (using `xalloca`), so the free only works if the regex stack has grown enough to hit `stack_double` (which uses `xmalloc` and `xrealloc`). To reproduce the problem on master and 3.3.1:

```ruby
Regexp.timeout = 0.001
/^(a*)x$/ =~ "a" * 1000000 + "x"
```

Some details about this potential fix: `stk_base == stk_alloc` on [init](https://.com/ruby/ruby/blob/dde99215f2bc60c22a00fc941ff7f714f011e920/regexec.c#L1153), so if `stk_base != stk_alloc` we can be sure we called [`stack_double`](https://.com/ruby/ruby/blob/dde99215f2bc60c22a00fc941ff7f714f011e920/regexec.c#L1210) and it's safe to free. It's also safe to free if we've [saved](https://.com/ruby/ruby/blob/dde99215f2bc60c22a00fc941ff7f714f011e920/regexec.c#L1187-L1189) the stack to `msa->stack_p`, since we do the `stk_base != stk_alloc` check before saving. This matches the check we do inside [`stack_double`](https://.com/ruby/ruby/blob/dde99215f2bc60c22a00fc941ff7f714f011e920/regexec.c#L1221).
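The reproduction in this commit relies on the process-wide `Regexp.timeout`. As a rough sketch (not taken from the commit), the snippet below shows how the global setting and a per-regexp `timeout:` keyword relate, and how a caller can rescue `Regexp::TimeoutError`. The helper name `try_match` and the chosen limits are hypothetical.

```ruby
# Hypothetical helper: returns true/false for a normal match attempt,
# or :timeout when the regexp engine gives up.
def try_match(re, str)
  re.match?(str)
rescue Regexp::TimeoutError
  :timeout
end

previous = Regexp.timeout  # remember the process-wide default

begin
  Regexp.timeout = 1.0     # global limit in seconds

  with_own_limit    = Regexp.new("^(a*)x$", timeout: 0.5) # per-regexp limit
  with_global_limit = Regexp.new("^(a*)x$")               # no limit of its own

  own_limit = with_own_limit.timeout     # the value passed as timeout:
  inherited = with_global_limit.timeout  # nil, so Regexp.timeout applies

  result = try_match(with_own_limit, "aaax")
ensure
  Regexp.timeout = previous # restore so other code is unaffected
end
```

Restoring the previous value in `ensure` keeps the global setting from leaking into unrelated code, which is the same concern that led the tests in this file to run global-timeout cases in a separate process.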
2024-02-22Skip under_gc_compact_stress on s390x (#10073)Takashi Kokubun
2024-02-13Fix [Bug #20246]: Don't set next_head_exact if a capture is called (#9897)Hiroya Fujinami
2024-02-02Add memory test for Regexp timeoutPeter Zhu
[Bug #20228]
2024-01-29Fix RegExp warning causing flaky Ripper failureAlan Wu
Sometimes this file gets picked up and breaks Ripper tests:

    TestRipper::Generic#test_parse_files:test/ruby
    assert_separately failed with error message
    pid 63392 exit 0
    | test_regexp.rb:2025: warning: character class has duplicated range

https://.com/ruby/ruby/actions/runs/7699956651/job/20982702553#step:12:103
2024-01-29Correctly handle consecutive lookarounds (#9738)Hiroya Fujinami
Fix [Bug #20207] Fix [Bug #20212] The handling of consecutive lookarounds in init_cache_opcodes is buggy and causes the invalid memory accesses reported in [Bug #20207] and [Bug #20212]. This fixes it by using recursive functions to detect lookaround nesting correctly.
2024-01-11Prevent syntax warnings in test/ruby/test_regexp.rbYusuke Endoh
2024-01-10Fix test case for `test_match_cache_with_peek_optimization` (#9466)Hiroya Fujinami
2024-01-10Fix to work match cache with peek next optimization (#9459)Hiroya Fujinami
2024-01-01Don't create T_MATCH object if /regexp/.match(string) doesn't matchLuke Gruber
Fixes [Bug #20104]
2023-12-29Fix [Bug #20098]: set counter value for {n,m} repetition correctly (#9391)Hiroya Fujinami
2023-12-28Fix [Bug #20083]: correct a cache point size for atomic groups (#9367)Hiroya Fujinami
2023-12-24Fix Regexp#inspect for GC compactionPeter Zhu
rb_reg_desc was not safe for GC compaction because it took in the C string and length but not the backing String object, so the string could get moved during compaction. This commit changes rb_reg_desc to use the string from the Regexp object. The test fails when RGENGC_CHECK_MODE is turned on: TestRegexp#test_inspect_under_gc_compact_stress [test/ruby/test_regexp.rb:474]: <"(?-mix:\\/)|"> expected but was <"/\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00/">.
2023-12-24Fix Regexp#match for GC compactionPeter Zhu
The test fails when RGENGC_CHECK_MODE is turned on: TestRegexp#test_match_under_gc_compact_stress: NoMethodError: undefined method `match' for nil test_regexp.rb:878:in `block in test_match_under_gc_compact_stress'
2023-12-23Fix Regexp#to_s for GC compactionPeter Zhu
The test fails when RGENGC_CHECK_MODE is turned on: TestRegexp#test_to_s_under_gc_compact_stress = 13.46 s 1) Failure: TestRegexp#test_to_s_under_gc_compact_stress [test/ruby/test_regexp.rb:81]: <"(?-mix:abcd\u3042)"> expected but was <"(?-mix:\u5C78\u3030\u5C78\u3030\u5C78\u3030\u5C78\u3030\u5C78\u3030)">.
2023-12-06Copy encoding flags when copying a regex [Bug #20039]Dustin Brown
* :bug: Fixes [Bug #20039](https://bugs.ruby-lang.org/issues/20039) When a Regexp is initialized with another Regexp, we simply copy the properties from the original. However, the flags on the original were not being copied correctly. This caused an issue when the original had multibyte characters and was being compared with an ASCII string. Without the forced encoding flag (`KCODE_FIXED`) transferred on to the new Regexp, the comparison would fail. See the included test for an example. Co-authored-by: Nobuyoshi Nakada <[email protected]>
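A small sketch of the situation this commit describes: a Regexp built from another Regexp whose literal contains a multibyte character. The sample pattern and variable names are assumptions, and the snippet only inspects the flags rather than reproducing the original failure.

```ruby
# A literal containing a non-ASCII character gets a fixed (forced) encoding.
original = /あ/

# Initializing a Regexp from another Regexp copies its properties.
copy = Regexp.new(original)

original_fixed = original.fixed_encoding?
copy_fixed     = copy.fixed_encoding? # expected to agree once the flags are copied

puts "original: fixed=#{original_fixed} encoding=#{original.encoding}"
puts "copy:     fixed=#{copy_fixed} encoding=#{copy.encoding}"
```

On a Ruby that includes this fix, both lines should report the same flag; on an affected version the copy may differ, which is what made comparisons with ASCII strings fail.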
2023-11-08Improve error and memory handlingAdam Hess
Apply Nobu's suggestions which improve style, memory handling and error correction. Co-authored-by: Nobuyoshi Nakada <[email protected]>
2023-10-30Optimize regexp matching for look-around and atomic groups (#7931)Hiroya Fujinami
2023-10-18Skip some timeout tests on s390xYusuke Endoh
They are too unstable on the machine.

```
  1) Failure:
TestRegexp#test_timeout_shorter_than_global [/home/chkbuild/chkbuild/tmp/build/20231018T230003Z/ruby/test/ruby/test_regexp.rb:1788]:
Expected |0.2 - 0.962938869| (0.7629388690000001) to be <= 0.15000000000000002.
```

https://rubyci.s3.amazonaws.com/s390x/ruby-master/log/20231018T230003Z.fail.html.gz

```
  1) Failure:
TestRegexp#test_timeout_longer_than_global [/home/chkbuild/chkbuild/tmp/build/20231017T140006Z/ruby/test/ruby/test_regexp.rb:1788]:
Expected |0.5 - 1.040696078| (0.5406960780000001) to be <= 0.375.
```

https://rubyci.s3.amazonaws.com/s390x/ruby-master/log/20231017T140006Z.fail.html.gz
2023-10-01Move repeating `matches` and `unmatches` to keyword argumentsNobuyoshi Nakada
And default to the corresponding instance variables.
2023-10-01Add tests for Unicode age property 15.0Nobuyoshi Nakada
2023-05-22Allow the match cache optimization for atomic groups (#7804)TSUYUSATO Kitsune
Notes: Merged-By: makenowjust <[email protected]>
2023-04-23Use UTF-8 encoding for literal extended regexps with UTF-8 characters in commentsJeremy Evans
Fixes [Bug #19455] Notes: Merged: https://.com/ruby/ruby/pull/7592
2023-04-19* remove trailing spaces. [ci skip]git
2023-04-19Refactor `Regexp#match` cache implementation (#7724)TSUYUSATO Kitsune
* Refactor Regexp#match cache implementation
  Improved variable and function names
  Fixed [Bug 19537] (Maybe fixed in https://.com/ruby/ruby/pull/7694)
* Add a comment of the glossary for "match cache"
* Skip to reset match cache when no cache point on null check
Notes: Merged-By: makenowjust <[email protected]>
2023-04-19MatchData#named_captures: add optional symbolize_names keyword (#6952)Vladimir Dementyev
Notes: Merged-By: ioquatix <[email protected]>
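For illustration, a short sketch of the `symbolize_names` keyword added by this commit. The pattern and input are made up, and the `rescue ArgumentError` fallback for Rubies that predate the keyword is an assumption, not part of the commit.

```ruby
m = /(?<year>\d+)-(?<month>\d+)/.match("2023-04")

# Default behaviour: group names are returned as String keys.
string_keys = m.named_captures

# With the keyword from this commit, names are returned as Symbol keys.
# The fallback covers versions without the keyword (assumed equivalent).
symbol_keys =
  begin
    m.named_captures(symbolize_names: true)
  rescue ArgumentError
    m.named_captures.transform_keys(&:to_sym)
  end

p string_keys
p symbol_keys
```

Symbol keys make the result convenient to splat into keyword arguments or to use with pattern matching.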
2023-04-12[Bug #19587] Fix `reset_match_cache` argumentsNobuyoshi Nakada
Notes: Merged: https://.com/ruby/ruby/pull/7694
2023-03-18core_assertions.rb: Relax `assert_linear_performance`Nobuyoshi Nakada
* Use an `Enumerable` as factors, instead of three arguments.
* Include `assert_operator` time in rehearsal time.
* Round up max expected time.
Notes: Merged: https://.com/ruby/ruby/pull/7554
2023-03-16Revert "core_assertions.rb: Refine `assert_linear_performance`"Takashi Kokubun
This reverts commit cae4342dd559e34c1ce6219593f77f0ad80286da. This is failing a lot of CIs and nobody is actively looking into fixing it. Let me revert this until we have a solution to it.
2023-03-16core_assertions.rb: Refine `assert_linear_performance`Nobuyoshi Nakada
* Use an `Enumerable` as factors, instead of three arguments.
2023-03-13[Bug #19476]: correct cache index computation for repetition (#7457)TSUYUSATO Kitsune
Notes: Merged-By: makenowjust <[email protected]>
2023-03-13* remove trailing spaces. [ci skip]git
2023-03-13[Bug #19467] correct cache points and counting failure on `OP_ANYCHAR_STAR_PEEK_NEXT` (#7454)TSUYUSATO Kitsune
Notes: Merged-By: makenowjust <[email protected]>
2023-03-12Add test for linear performanceNobuyoshi Nakada
Notes: Merged: https://.com/ruby/ruby/pull/7506
2023-03-03[Bug #19471] `Regexp.compile` should handle keyword argumentsNobuyoshi Nakada
Like `Regexp.new`, it should pass keyword arguments to the `Regexp#initialize` method. Notes: Merged: https://.com/ruby/ruby/pull/7431
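A brief sketch of the behaviour this fix is meant to guarantee, assuming a Ruby version that supports the `timeout:` keyword. The pattern and the limit are arbitrary.

```ruby
# Regexp.compile forwards keyword arguments to Regexp#initialize,
# so it should behave the same as Regexp.new.
compiled    = Regexp.compile("ab+c", timeout: 1.0)
constructed = Regexp.new("ab+c", timeout: 1.0)

compiled_timeout    = compiled.timeout
constructed_timeout = constructed.timeout

puts "compile: #{compiled_timeout.inspect}, new: #{constructed_timeout.inspect}"
```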