lukemanley

Perf improvement for concat(objects, axis=1) when objects have different indexes.
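
For readers unfamiliar with the code path being optimized, here is a minimal sketch of the operation: concatenating along `axis=1` objects whose indexes do not match, which forces pandas to align every input on a combined index. The Series names and values are made up for illustration and are not taken from this PR.

```python
import pandas as pd

# Hypothetical inputs: three Series whose indexes only partly overlap.
s1 = pd.Series([1, 2, 3], index=[0, 1, 2], name="a")
s2 = pd.Series([10, 20, 30], index=[1, 2, 3], name="b")
s3 = pd.Series([100, 200], index=[5, 0], name="c")

# axis=1 places each Series in its own column. With the default join="outer",
# rows are aligned on the union of the three indexes and gaps become NaN.
result = pd.concat([s1, s2, s3], axis=1)
print(result)

# sort=True additionally asks for the combined index to be sorted.
result_sorted = pd.concat([s1, s2, s3], axis=1, sort=True)
print(result_sorted)
```

The alignment step (building the combined index and reindexing each input to it) is the work that happens only when the indexes differ.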

> asv continuous -f 1.1 upstream/main perf-concat-axis-1 -b join_merge.ConcatIndexDtype

       before           after         ratio
     [e58d1ba1]       [b8fbc611]
     <main>           <perf-concat-axis-1>
-         121±9μs          109±3μs     0.90  join_merge.ConcatIndexDtype.time_concat_series('Int64', 'has_na', 0, True)
-      2.67±0.2ms      2.40±0.02ms     0.90  join_merge.ConcatIndexDtype.time_concat_series('datetime64[ns]', 'non_monotonic', 1, False)
-     4.84±0.07ms      2.78±0.02ms     0.57  join_merge.ConcatIndexDtype.time_concat_series('Int64', 'non_monotonic', 1, True)
-     4.14±0.03ms      2.25±0.04ms     0.54  join_merge.ConcatIndexDtype.time_concat_series('Int64', 'monotonic', 1, True)
-     4.04±0.03ms      2.18±0.03ms     0.54  join_merge.ConcatIndexDtype.time_concat_series('Int64', 'monotonic', 1, False)
-     5.70±0.05ms       2.82±0.2ms     0.50  join_merge.ConcatIndexDtype.time_concat_series('Int64', 'has_na', 1, True)
-     4.27±0.08ms      2.11±0.03ms     0.49  join_merge.ConcatIndexDtype.time_concat_series('Int64', 'non_monotonic', 1, False)
-     4.76±0.07ms      2.27±0.06ms     0.48  join_merge.ConcatIndexDtype.time_concat_series('int64', 'non_monotonic', 1, True)
-      4.34±0.1ms      2.01±0.03ms     0.46  join_merge.ConcatIndexDtype.time_concat_series('int64', 'monotonic', 1, False)
-     5.21±0.09ms       2.32±0.2ms     0.45  join_merge.ConcatIndexDtype.time_concat_series('Int64', 'has_na', 1, False)
-     4.33±0.03ms      1.93±0.03ms     0.45  join_merge.ConcatIndexDtype.time_concat_series('int64', 'non_monotonic', 1, False)
-      4.59±0.2ms      1.98±0.04ms     0.43  join_merge.ConcatIndexDtype.time_concat_series('int64', 'monotonic', 1, True)
-      20.2±0.2ms       3.94±0.3ms     0.19  join_merge.ConcatIndexDtype.time_concat_series('int64[pyarrow]', 'has_na', 1, True)
-      18.9±0.5ms      2.96±0.04ms     0.16  join_merge.ConcatIndexDtype.time_concat_series('int64[pyarrow]', 'non_monotonic', 1, True)
-      19.0±0.1ms      2.94±0.06ms     0.15  join_merge.ConcatIndexDtype.time_concat_series('int64[pyarrow]', 'has_na', 1, False)
-      18.2±0.1ms       2.67±0.2ms     0.15  join_merge.ConcatIndexDtype.time_concat_series('int64[pyarrow]', 'monotonic', 1, False)
-      18.2±0.1ms      2.38±0.03ms     0.13  join_merge.ConcatIndexDtype.time_concat_series('int64[pyarrow]', 'monotonic', 1, True)
-      18.1±0.1ms      2.28±0.04ms     0.13  join_merge.ConcatIndexDtype.time_concat_series('int64[pyarrow]', 'non_monotonic', 1, False)

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
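
A rough way to observe the effect locally without asv is sketched below. This is not the `join_merge.ConcatIndexDtype` benchmark itself: the size `N`, the helper `make_series`, and the way the indexes are offset are assumptions chosen only to produce Series with different indexes of a given index dtype.

```python
import timeit

import numpy as np
import pandas as pd

N = 10_000  # hypothetical size, not the value used by the asv benchmark


def make_series(dtype, n=N):
    """Build three Series whose indexes are shifted slices of one base index."""
    idx = pd.Index(np.arange(n), dtype=dtype)
    slices = (idx[:-2], idx[1:-1], idx[2:])
    return [pd.Series(np.arange(len(i)), index=i) for i in slices]


for dtype in ("int64", "Int64"):
    objs = make_series(dtype)
    repeats = 20
    # Average wall-clock time of one axis=1 concat over mismatched indexes.
    seconds = timeit.timeit(
        lambda: pd.concat(objs, axis=1, sort=False), number=repeats
    ) / repeats
    print(f"{dtype}: {seconds * 1e3:.2f} ms per call")
```

Running the same script on `main` before and after the merge commit should show the difference, though absolute numbers will vary by machine.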

@lukemanley added the Performance (Memory or execution speed performance) and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels on Apr 8, 2023
@topper-123 added this to the 2.1 milestone on Apr 8, 2023

Looks good, pending the tests passing.

@mroeschke merged commit 2028d9a into pandas-dev:main on Apr 10, 2023
@mroeschke

Thanks @lukemanley

@lukemanley deleted the perf-concat-axis-1 branch on April 18, 2023 11:03