Comments
This may have been unintentionally changed by me in https://.com/pandas-dev/pandas/pull/36911/files |
I looked into this, I think the new case is more consistent maybe?
returned
While a one-dimensional group key returned what you showed above. The missing categories case would be tricky to handle with multidimensional keys. Maybe it would be better to remove unused categories from groups too? Or should the one-dimensional case be special here? |
@mroeschke no that was not the reason. I think this was caused by c4226d4 |
Thanks for confirming @phofl |
Addition: We are no longer running through there since #36842 |
@phofl thanks for looking at it!
Indeed for multiple keys, we seem to not include unobserved categories. But, here both So to fully make it consistent, then for example also |
Passing
>>> import pandas as pd
>>> df = pd.DataFrame({"key": pd.Categorical(["b"] * 5, categories=["a", "b", "c", "d"]), "col": range(5)})
>>> gb = df.groupby("key", observed=True)
>>> list(gb.indices)
['b']
>>> gb = df.groupby("key", observed=False)
>>> list(gb.indices)
['a', 'b', 'c', 'd'] |
Have to correct myself, this was changed by #36842. @jorisvandenbossche When testing this on 1.1.0 and 1.1.5 I get
both times. Edit: Changed the example a bit.
|
I think the pointer of @mroeschke to #36911 might be more correct, since that was a PR for 1.1.4, while #36842 was only for 1.2.0. And unlike what I said earlier (I thought it was only working on 1.0, and not in 1.1.x), this actually only changed from 1.1.3 to 1.1.4. |
can confirm, first bad commit: [345efdd] BUG: RollingGroupby not respecting sort=False (#36911) |
Hm, I just looked at the PR numbers, not at when they were merged. Nevertheless, we have to change both commits to get the original result, because the code path from #36911 is currently not used on master. |
Does anybody know if this was an intentional change? (I can't find anything about it in the whatsnew)
vs
This already changed in pandas 1.1, so not a recent change.
The consequence of this is that iterating over gb vs iterating over gb.indices is not consistent anymore.
cc @mroeschke @rhshadrach
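To check on a given pandas version whether iterating over a groupby object and iterating over its indices agree, something like the following minimal sketch could be used. It reuses the example frame posted in this thread; the helper name group_key_views is hypothetical and not part of pandas. What it prints depends on the installed pandas version, so no expected output is shown.

```python
import pandas as pd


def group_key_views(df, key, observed):
    """Return the group keys seen via iteration and via .indices.

    Hypothetical helper for illustration only; not a pandas API.
    """
    gb = df.groupby(key, observed=observed)
    iterated = [name for name, _ in gb]  # keys yielded by `for name, group in gb`
    indexed = list(gb.indices)           # keys of the gb.indices dict
    return iterated, indexed


# Same shape as the example in the thread: only "b" is observed,
# while "a", "c" and "d" are unobserved categories.
df = pd.DataFrame(
    {
        "key": pd.Categorical(["b"] * 5, categories=["a", "b", "c", "d"]),
        "col": range(5),
    }
)

for observed in (True, False):
    iterated, indexed = group_key_views(df, "key", observed)
    consistent = set(iterated) == set(indexed)
    print(f"observed={observed}: iterated={iterated} "
          f"indices={indexed} consistent={consistent}")
```

Running this on the versions discussed here (1.1.3 vs 1.1.4, for example) would show whether unobserved categories appear in one view but not the other.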