Conversation

gvwilson

This PR replaces #5173 by rebasing @bmaranville's changes on top of main. Please see #5173 for detailed notes about the motivation and changes. The important changes are in:

  • codegen/init.py

  • codegen/datatypes.py

  • codegen/validators.py

  • doc/python/marker-style.md

  • plotly/_subplots.py

  • plotly/basedatatypes.py

  • plotly/figure_factory/_annotated_heatmap.py

  • plotly/io/_templates.py

  • plotly/validator_cache.py

  • I have read through the contributing notes and understand the structure of the package. In particular, if my PR modifies code of plotly.graph_objects, my modifications concern the codegen files and not generated files.

  • I have added tests (if submitting a new feature or correcting a bug) or
    modified existing tests.

  • For a new feature, I have added documentation examples in an existing or
    new tutorial notebook (please see the doc checklist as well).

  • I have added a CHANGELOG entry if fixing/changing/adding anything substantial.

  • For a new feature or a change in behaviour, I have updated the relevant docstrings in the code to describe the feature or behaviour (please see the doc checklist as well).

@gvwilson requested a review from emilykl June 4, 2025 13:41
@gvwilson self-assigned this Jun 4, 2025
@gvwilson mentioned this pull request Jun 4, 2025
| File      | Before (bytes) | After (bytes) |
| --------- | -------------: | ------------: |
| `.whl`    |       16265568 |       9447645 |
| `.tar.gz` |        7666222 |       6615305 |
@gvwilson force-pushed the refactor-validators branch from f0fddc2 to 0184bd4 Compare June 4, 2025 13:59
@gvwilson requested a review from alexcjohnson June 4, 2025 19:31
@gvwilson

@alexcjohnson if you have time to look this over we'd be grateful for your feedback

@emilykl

@bmaranville The file plotly/validators/_validators.json needs to be flagged in the pyproject.toml so that it's included in the built package.

Looks like the correct way is to add validators/_validators.json to the list under [tool.setuptools.package-data].
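One way to confirm the file actually ships is to inspect the built wheel, which is an ordinary zip archive. A minimal sketch, assuming a wheel has already been built; the helper name `wheel_contains` is hypothetical and not part of the project:

```python
import zipfile


def wheel_contains(wheel_path, member):
    """Return True if the wheel (a zip archive) contains the given member path."""
    with zipfile.ZipFile(wheel_path) as archive:
        return member in archive.namelist()
```

Calling it with the path of the built `.whl` and `"plotly/validators/_validators.json"` should return `True` once the `package-data` entry is in place.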

@emilykl

Thanks for the PR @bmaranville! Other than my comments above, looks good to me. The tests are passing and validation seems to be working as usual in the examples I've tried. (@alexcjohnson I'm curious whether you can see any possible pitfalls or edge cases here -- happy to do more testing of specific cases if you have concerns.)

@alexcjohnson

@bmaranville nice work, this is a clever solution! Two small thoughts about it:

  • We should be able to reduce the whitespace significantly, to make the file smaller. Could even collapse it to a single line, I'm not sure there's much value keeping it human-readable, but maybe @emilykl has opinions about this. If we really want to optimize file size we could also change its structure a bit - like if every entry is a dict {params, superclass} this could be converted to a length-2 list. But the whitespace is the biggest piece of this.
  • json.load is pretty fast, ~41ms on my computer. But it's even faster if we just make this a Python file. In my quick test I just added:
true=True
false=False
null=None
v = 

to the beginning of the file to convert the JSON to Python, and then `from validators._validators import v` took only ~25ms. (If we do this for real we should be able to tweak the `json.dump` to output Python in the first place so we don't need to alias `true`, `false`, and `null`.)
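Emitting Python directly could look roughly like the sketch below. It relies on `repr()` of the parsed data already spelling `True`, `False`, and `None` the Python way, so no aliases are needed. The function name and the sample entry are hypothetical, and the sketch assumes the data contains no NaN or Infinity values, whose `repr()` is not a valid Python literal:

```python
import json


def json_text_to_python_source(json_text, name="v"):
    """Render parsed JSON as a single Python assignment statement."""
    data = json.loads(json_text)
    # repr() writes True/False/None instead of true/false/null.
    return f"{name} = {data!r}\n"


# Hypothetical sample entry; the real file's contents are not reproduced here.
sample = (
    '{"demo.entry": {"params": {"flag": true, "dflt": null},'
    ' "superclass": "DemoValidator"}}'
)
source = json_text_to_python_source(sample)

# exec() stands in for importing the generated module.
namespace = {}
exec(source, namespace)
assert namespace["v"] == json.loads(sample)
```

Whether this keeps the import-time advantage measured above would need to be timed on the real file.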

@emilykl the only thing I'm not 100% confident of is whether this has any type checking implications. I don't think so, I think the only type checking that matters is on `graph_objs`, not `validators`, but we do currently have `if TYPE_CHECKING` blocks in all the `__init__.py` files in `validators`, ensuring that type checkers will load them all. If you haven't done so already it's probably worth playing around on this branch a bit to convince ourselves that we don't lose anything important with this.
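For readers unfamiliar with the pattern, here is a generic illustration of an `if TYPE_CHECKING` block. It is not the code in the `validators` `__init__.py` files; the function name `parse_amount` and the choice of `decimal.Decimal` are purely illustrative:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only static type checkers follow this import; it is skipped at runtime.
    from decimal import Decimal


def parse_amount(text: str) -> "Decimal":
    # The runtime import is deferred until the function is called.
    from decimal import Decimal

    return Decimal(text)
```

The concern above is the static half of this pattern: if the guarded imports go away, a type checker no longer sees those names, even though runtime behaviour is unchanged.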

@gvwilson

@alexcjohnson 👍 on reducing whitespace, but I'd like to merge this one first, then combine it with #5218, and then tidy up the whitespace in the result - #5218 switches code formatting and checking from black to ruff, so if we're going to figure out how to configure a tool to stay silent about things like long lines, I'd like to do that once. I've created #5225 and will assign it to myself.

@gvwilson force-pushed the refactor-validators branch from 48daf32 to a39e0e6 Compare June 10, 2025 14:08
-   fix black exclude parameter in pyproject.toml to properly exclude ONLY directories
-   reintroduce py38 in codegen/__init__.py
-   remove commented-out lines
-   add plotly/validators/_validators.json to pyproject.toml
-   regenerate code
@gvwilson force-pushed the refactor-validators branch from a39e0e6 to 513337c Compare June 10, 2025 15:02
@gvwilson merged commit 9dc08a1 into main Jun 10, 2025
10 checks passed
@emilykl
>   • We should be able to reduce the whitespace significantly, to make the file smaller. Could even collapse it to a single line, I'm not sure there's much value keeping it human-readable, but maybe @emilykl has opinions about this.

@alexcjohnson I'm in favor of this. I don't think there's any particular reason to keep it human-readable, although it's also not a HUGE difference with compression. Looks like about 194 KB compressed with whitespace vs. 163 KB compressed without whitespace, from a quick test I did locally.
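A comparison along these lines can be sketched as follows. This is not the script used for the numbers above; the data is a hypothetical stand-in for the real `_validators.json`, so the printed sizes will differ:

```python
import gzip
import json

# Hypothetical stand-in data; the real file is much larger.
data = {
    f"trace{i}.attr{j}": {
        "params": {"plotly_name": f"attr{j}", "parent_name": f"trace{i}"},
        "superclass": "DemoValidator",
    }
    for i in range(20)
    for j in range(20)
}

# Indented output versus output with all optional whitespace removed.
pretty = json.dumps(data, indent=4).encode("utf-8")
compact = json.dumps(data, separators=(",", ":")).encode("utf-8")


def gzipped_size(payload):
    """Return the size in bytes of the gzip-compressed payload."""
    return len(gzip.compress(payload))


print("raw bytes:    ", len(pretty), len(compact))
print("gzipped bytes:", gzipped_size(pretty), gzipped_size(compact))
```

Both encodings parse back to the same data, so the choice only affects file size and readability.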

>   • json.load is pretty fast, ~41ms on my computer. But it's even faster if we just make this a Python file. ... convert the JSON to Python, and then from validators._validators import v took only ~25ms. (If we do this for real we should be able to tweak the json.dump to output Python in the first place so we don't need to alias true, false, and null)

@alexcjohnson @gvwilson I'm in favor of this -- that's a big performance improvement. Something for a future PR? (as a bonus, then we wouldn't have to worry about including this file in the pyproject.toml, since Python files are included automatically.)

> @emilykl the only thing I'm not 100% confident of is whether this has any type checking implications. I don't think so, I think the only type checking that matters is on graph_objs, not validators, but we do currently have if TYPE_CHECKING blocks in all the __init__.py files in validators, ensuring that type checkers will load them all. If you haven't done so already it's probably worth playing around on this branch a bit to convince ourselves that we don't lose anything important with this.

I'm not a heavy user of type checking so it's possible I'll miss something here. Syntax highlighting is working as expected with these changes. I turned on stricter Pylance type checking and haven't noticed any issues yet but I'll keep an eye out. Hopefully we can put these changes out in an RC to catch any issues before full release.

@gvwilson mentioned this pull request Jun 10, 2025