pandas.to_numeric fails to coerce PyArrow decimal Series that contain NA values: the NA values get dropped, leading to a length mismatch with the index:
import pandas as pd
import pyarrow as pa

decimal_type = pd.ArrowDtype(pa.decimal128(3, scale=2))
series = pd.Series([1, None], dtype=decimal_type)
pd.to_numeric(series, errors="coerce")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[13], line 8
      4 decimal_type = pd.ArrowDtype(pa.decimal128(3, scale=2))
      6 series = pd.Series([1, None], dtype=decimal_type)
----> 8 pd.to_numeric(series, errors="coerce")

File /opt/homebrew/lib/python3.13/site-packages/pandas/core/tools/numeric.py:319, in to_numeric(arg, errors, downcast, dtype_backend)
    316     values = ArrowExtensionArray(values.__arrow_array__())
    318 if is_series:
--> 319     return arg._constructor(values, index=arg.index, name=arg.name)
    320 elif is_index:
    321     # because we want to coerce to numeric if possible,
    322     # do not use _shallow_copy
    323     from pandas import Index

File /opt/homebrew/lib/python3.13/site-packages/pandas/core/series.py:575, in Series.__init__(self, data, index, dtype, name, copy, fastpath)
    573     index = default_index(len(data))
    574 elif is_list_like(data):
--> 575     com.require_length_match(data, index)
    577 # create/copy the manager
    578 if isinstance(data, (SingleBlockManager, SingleArrayManager)):

File /opt/homebrew/lib/python3.13/site-packages/pandas/core/common.py:573, in require_length_match(data, index)
    569 """
    570 Check the length of data matches the length of the index.
    571 """
    572 if len(data) != len(index):
--> 573     raise ValueError(
    574         "Length of values "
    575         f"({len(data)}) "
    576         "does not match length of index "
    577         f"({len(index)})"
    578     )

ValueError: Length of values (1) does not match length of index (2)
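The mismatch in the traceback (1 value against an index of length 2) is what one would expect from a drop, convert, restore pattern in which the restore step never runs. The following is a minimal pure-Python sketch of that pattern, not the pandas implementation: the helper `coerce_with_mask` and its `restore_missing` parameter are hypothetical names introduced only for illustration.

```python
from decimal import Decimal


def coerce_with_mask(values, restore_missing=True):
    """Convert non-missing entries to Decimal, optionally restoring gaps.

    Hypothetical helper for illustration only; it is not pandas code.
    """
    # Record which positions are missing, then convert only the rest.
    mask = [v is None for v in values]
    converted = [Decimal(v) for v, missing in zip(values, mask) if not missing]

    if not restore_missing:
        # Skipping the restore step returns fewer items than the input.
        return converted

    # Re-insert None at every masked position so the length is preserved.
    remaining = iter(converted)
    return [None if missing else next(remaining) for missing in mask]


values = [1, None]
index = ["a", "b"]

short = coerce_with_mask(values, restore_missing=False)
full = coerce_with_mask(values)

print(len(short), len(index))  # the lengths differ when the restore is skipped
print(full)                    # same length as the index, None kept in place
```

With the restore step skipped, the result cannot be paired with the original index, which matches the shape of the failure reported here.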
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.

Issue Description
In the example above, the NA values seem to get dropped because the conversion to a NumPy type sets the dtype to object. That causes a condition to be false, which skips re-adding the NA values and leaves a final values array shorter than the original index.

Expected Behavior
I'd expect the series to get converted (to values of decimal.Decimal type, with dtype=object) without raising an exception, preserving the null elements.

Installed Versions
INSTALLED VERSIONS
------------------
commit : 0691c5c
python : 3.13.2
python-bits : 64
OS : Darwin
OS-release : 24.5.0
Version : Darwin Kernel Version 24.5.0: Tue Apr 22 19:53:27 PDT 2025; root:xnu-11417.121.6~2/RELEASE_ARM64_T6041
machine : arm64
processor : arm
byteorder : little
LC_ALL : en_CA.UTF-8
LANG : None
LOCALE : en_CA.UTF-8
pandas : 2.2.3
numpy : 2.2.2
pytz : 2025.1
dateutil : 2.9.0.post0
pip : 25.0
Cython : None
sphinx : None
IPython : 8.32.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.13.4
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : 2025.2.0
html5lib : None
hypothesis : 6.125.2
gcsfs : None
jinja2 : 3.1.5
lxml.etree : None
matplotlib : 3.10.3
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : 19.0.0
pyreadstat : None
pytest : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.15.2
sqlalchemy : 2.0.38
tables : None
tabulate : None
xarray : 2025.1.2
xlrd : None
xlsxwriter : None
zstandard : 0.23.0
tzdata : 2025.1
qtpy : None
pyqt5 : None