I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1, 2]})
df.index = pd.MultiIndex.from_arrays(
    [pd.Categorical(np.array([1, 2], dtype=np.uint64))]
)
df2.index = pd.MultiIndex.from_arrays(
    [pd.Categorical(np.array([1, 2], dtype=np.int64))]
)
pd.testing.assert_frame_equal(df, df2, check_categorical=False, check_index_type=False)
```
When I compare two indexes and ask not to check the index type (check_index_type=False), I don't think the dtype of an index level's categories should be checked either.
Note that for a normal index things behave better:
```python
df = pd.DataFrame({"x": [1, 2]})
df.index = pd.Index(pd.Categorical(np.array([1, 2], dtype=np.int64)))
df2.index = pd.Index(pd.Categorical(np.array([1, 2], dtype=np.uint64)))
pd.testing.assert_frame_equal(df, df2, check_categorical=False)
```
Passes as I'd expect.
```python
df = pd.DataFrame({"x": [1, 2]})
df.index = pd.MultiIndex.from_arrays(
    [pd.Categorical(np.array([1, 2], dtype=np.uint64))]
)
df2.index = pd.MultiIndex.from_arrays(
    [pd.Categorical(np.array([1, 2], dtype=np.int64))]
)
pd.testing.assert_frame_equal(df, df2, check_categorical=False, check_index_type=False)
```
Expected behavior: passes. Nothing is printed.
Actual behavior:
```
AssertionError                            Traceback (most recent call last)
Cell In[163], line 8
      2 df.index = pd.MultiIndex.from_arrays(
      3     [pd.Categorical(np.array([1, 2], dtype=np.uint64))]
      4 )
      5 df2.index = pd.MultiIndex.from_arrays(
      6     [pd.Categorical(np.array([1, 2], dtype=np.int64))]
      7 )
----> 8 pd.testing.assert_frame_equal(df, df2, check_categorical=False, check_index_type=False)

    [... skipping hidden 3 frame]

File ~/.local/lib/python3.10/site-packages/pandas/_testing/asserters.py:479, in assert_categorical_equal(left, right, check_dtype, check_category_order, obj)
    476     exact = True
    478 if check_category_order:
--> 479     assert_index_equal(
    480         left.categories, right.categories, obj=f"{obj}.categories", exact=exact
    481     )
    482     assert_numpy_array_equal(
    483         left.codes, right.codes, check_dtype=check_dtype, obj=f"{obj}.codes"
    484     )
    485 else:

    [... skipping hidden 1 frame]

File ~/.local/lib/python3.10/site-packages/pandas/_testing/asserters.py:247, in assert_index_equal.<locals>._check_types(left, right, obj)
    244         assert_index_equal(left.categories, right.categories, exact=exact)
    245     return
--> 247 assert_attr_equal("dtype", left, right, obj=obj)

    [... skipping hidden 1 frame]

File ~/.local/lib/python3.10/site-packages/pandas/_testing/asserters.py:595, in raise_assert_detail(obj, message, left, right, diff, first_diff, index_values)
    592 if first_diff is not None:
    593     msg += f"\n{first_diff}"
--> 595 raise AssertionError(msg)

AssertionError: MultiIndex level [0] category.categories are different

Attribute "dtype" are different
[left]:  uint64
[right]: int64
```
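The dtype mismatch named in the last lines of the traceback can be inspected directly on the two categoricals, without going through the asserters. This is only a minimal sketch: the names `left`, `right`, `same_codes`, `same_values`, and `dtypes_differ` are illustrative and do not come from the report.

```python
import numpy as np
import pandas as pd

# The same two categoricals used as index levels in the report.
left = pd.Categorical(np.array([1, 2], dtype=np.uint64))
right = pd.Categorical(np.array([1, 2], dtype=np.int64))

# The integer codes and the category values agree ...
same_codes = np.array_equal(left.codes, right.codes)
same_values = left.categories.tolist() == right.categories.tolist()

# ... and only the dtype of the categories differs.
dtypes_differ = left.categories.dtype != right.categories.dtype

print(left.categories.dtype, right.categories.dtype)  # uint64 int64
print(same_codes, same_values, dtypes_differ)
```

So the two levels hold the same data, and the categories dtype is the only attribute that differs between them.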
INSTALLED VERSIONS
commit : 5c15588
python : 3.10.9.final.0
python-bits : 64
OS : Linux
OS-release : 5.19.11-1rodete1-amd64
Version : #1 SMP PREEMPT_DYNAMIC Debian 5.19.11-1rodete1 (2022-10-31)
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.1.0.dev0+284.g5c155883fd
numpy : 1.21.5
pytz : 2022.6
dateutil : 2.8.2
setuptools : 65.6.3
pip : 22.3.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.1
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.10.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli : 1.0.9
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : 1.2.0
zstandard : None
tzdata : 2022.7
qtpy : None
pyqt5 : None
Thanks for the report @caneff. I was able to reproduce the bug, but df2 is not defined in your example.
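For reference, here is a self-contained variant of the reproducer with `df2` defined. It assumes `df2` was meant to be a second frame holding the same data as `df`, which the report does not state. The `make_frame` helper and the `outcome` variable are hypothetical names added for illustration. The call is wrapped in `try`/`except` because the result depends on the pandas version: on the build listed under the installed versions it raises the `AssertionError` shown in the report.

```python
import numpy as np
import pandas as pd


def make_frame(category_dtype):
    """Build a one-column frame whose single MultiIndex level is categorical.

    Hypothetical helper, not part of the original report.
    """
    frame = pd.DataFrame({"x": [1, 2]})
    frame.index = pd.MultiIndex.from_arrays(
        [pd.Categorical(np.array([1, 2], dtype=category_dtype))]
    )
    return frame


df = make_frame(np.uint64)
df2 = make_frame(np.int64)  # assumed intent: same data, int64 categories

try:
    pd.testing.assert_frame_equal(
        df, df2, check_categorical=False, check_index_type=False
    )
    outcome = "passed"
except AssertionError as exc:
    # Affected versions report the categories dtype mismatch here.
    outcome = f"failed: {exc}"

print(outcome)
```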