
lukemanley

Adds support for Series.str.join with ArrowDtype(pa.string()):

In [1]: from pandas import Series, ArrowDtype

In [2]: import pyarrow as pa

In [3]: ser = Series(["abc", "123", None], dtype=ArrowDtype(pa.string()))

In [4]: ser.str.join("-")
Out[4]: 
0    a-b-c
1    1-2-3
2     <NA>
dtype: string[pyarrow]

This is already supported by the string[python] and string[pyarrow] dtypes, and it mirrors how Python's built-in str.join treats a plain string:

In [1]: "-".join("abc")
Out[1]: 'a-b-c'
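
For readers comparing the two examples above, here is a minimal plain-Python sketch of the element-wise behavior: each string is treated as a sequence of characters joined by the separator, and missing values stay missing. The helper name `join_elements` is hypothetical and used only for illustration. It is not the pandas or pyarrow implementation.

```python
from typing import List, Optional, Sequence


def join_elements(values: Sequence[Optional[str]], sep: str) -> List[Optional[str]]:
    """Hypothetical helper mimicking the element-wise semantics shown above.

    Each non-missing string is split into characters and re-joined with
    ``sep``; ``None`` entries are passed through unchanged.
    """
    return [None if value is None else sep.join(value) for value in values]


# Same input values as the Series example above.
print(join_elements(["abc", "123", None], "-"))
# prints ['a-b-c', '1-2-3', None]
```

The sketch returns a plain list, so the missing entry shows up as None rather than the <NA> that pandas displays.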

This is work towards being able to add ArrowDtype(pa.string()) to the string benchmarks.

lukemanley added the Strings (String extension data type and string data) and Arrow (pyarrow functionality) labels Jun 13, 2023
lukemanley added this to the 2.1 milestone Jun 13, 2023
mroeschke merged commit 1f3c9bc into pandas-dev:main Jun 13, 2023

mroeschke

Nice find thanks @lukemanley

lukemanley deleted the str-join-pyarrow-string branch June 13, 2023 23:27
Daquisu pushed a commit to Daquisu/pandas that referenced this pull request Jul 8, 2023
* Series.str.join to support ArrowDtype(pa.string())

* gh ref