Conversation

shink

Currently, the examples can only target an accelerator through the --cuda or --mps flags.
This change adds a --device argument so that the examples can run on a specific device, e.g., --device cpu. This should benefit all device manufacturers, including those with out-of-tree backends.

cc: @Yikun @FFFrog @hipudding @jgong5 @EikanWang

@netlify

Deploy Preview for pytorch-examples-preview canceled.

Name | Link
🔨 Latest commit | 2846161
🔍 Latest deploy log | https://app.netlify.com/sites/pytorch-examples-preview/deploys/66f11872584dd60008bf36a1

@shink marked this pull request as draft on September 23, 2024 03:35
@shink

Could someone please review this change? If it makes sense to you, I will proceed.

@shink marked this pull request as ready for review on September 23, 2024 06:40
@shink requested a review from msaroufim as a code owner on September 23, 2024 07:03
@msaroufim

This change would be easier to merge if we started testing M1 in CI.

Do you have any experience with this kind of work?

Basically, you can copy this workflow: https://.com/pytorch/examples/blob/main/./workflows/main_python.yml, change this line to support M1 (macos-latest): https://.com/pytorch/examples/blob/main/./workflows/main_python.yml#L16, and then make sure you have a test file where you always pass in the M1 device.
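For illustration, a test file that always passes the M1 device could look roughly like the sketch below. This is only an assumption about its shape, not the repository's actual test setup: the helper names and the script path are hypothetical, and the only flag used is the --mps flag mentioned in this thread.

```python
import subprocess
import sys


def example_command(script, device_flag="--mps", extra_args=()):
    """Build the command line for running one example on a fixed device.

    `script` is a hypothetical path to an example's entry point;
    `device_flag` defaults to the --mps flag discussed in this thread.
    """
    return [sys.executable, script, device_flag, *extra_args]


def run_example(script, device_flag="--mps", extra_args=()):
    """Run the example and raise CalledProcessError if it exits non-zero."""
    subprocess.run(example_command(script, device_flag, extra_args), check=True)
```

On a macOS runner, a CI step would then call something like run_example("some_example/main.py"), where the path is a placeholder for whichever example is under test.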

@shink

@msaroufim Thanks for your review! The code snippet below is the key part of this change. args.device defaults to cpu, so the change is backward compatible. As you can see, --device has lower priority than --cuda and --mps.

I tested this change on my out-of-tree backend device, and the run reported OK.

if args.cuda:
    device = torch.device("cuda")
elif args.mps:
    device = torch.device("mps")
else:
-   device = torch.device("cpu")
+   device = torch.device(args.device)
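
To make the priority order concrete, here is a minimal, self-contained sketch of the same logic. It is an illustration under assumptions rather than the code in this PR: build_parser and resolve_device_name are hypothetical helper names, and the function returns a device name string that an example would then pass to torch.device(...), so the sketch runs without PyTorch installed. The device name "npu" is only a placeholder for an out-of-tree backend.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags discussed in this thread."""
    parser = argparse.ArgumentParser(description="device selection sketch")
    parser.add_argument("--cuda", action="store_true", help="use CUDA")
    parser.add_argument("--mps", action="store_true", help="use MPS")
    parser.add_argument("--device", type=str, default="cpu",
                        help="device to use when neither --cuda nor --mps is set")
    return parser


def resolve_device_name(args):
    """Return the device name; --cuda and --mps take priority over --device."""
    if args.cuda:
        return "cuda"
    if args.mps:
        return "mps"
    # Falls back to "cpu" when --device is not given, so old invocations still work.
    return args.device


parser = build_parser()
print(resolve_device_name(parser.parse_args([])))                            # cpu
print(resolve_device_name(parser.parse_args(["--device", "npu"])))           # npu
print(resolve_device_name(parser.parse_args(["--cuda", "--device", "npu"])))  # cuda
```

Because --device is consulted only in the final branch, existing command lines that use --cuda or --mps behave exactly as before.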

> This change would be easier to merge if we started testing M1 in CI

Ah yes! So should I test this change on MPS and add a workflow in this PR?

@shink mentioned this pull request on Dec 4, 2024
9 tasks
@framoncg

Hi, just to help with this PR: some examples have since been changed to handle accelerator devices with the help of the new accelerate API; see #1334 for reference. Rebasing the branch and updating the PR might help a bit.
