Conversation

rohansjoshi

Summary:
Added quantization to the evaluation script. Quantization causes a substantial deterioration in accuracy: word perplexity lands in the 10^5–10^6 range across the settings below.

On the wikitext task:

| Model Name | max_seq_len | ptq | word_perplexity |
|----------|----------|----------|-----------|
| Llama 3.2-1B Instruct | 128 | 16a4w | 5821003.055178451 |
| Llama 3.2-1B Instruct | 128 | 16a4w_block | 5396240.078572427 |
| Llama 3.2-1B Instruct | 128 | 8a8w | 533154.970440251 |

Differential Revision: D76837572
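
For context on the `ptq` column: `16a4w` means 16-bit activations with 4-bit weights, `16a4w_block` is the same but with weight scales computed per block (group) of values rather than per output channel, and `8a8w` means 8-bit activations and weights. Below is a minimal sketch of block-wise symmetric int4 weight quantization, the general technique the `_block` variant refers to; the function name and the `group_size=32` value are illustrative assumptions, not this script's actual implementation:

```python
import torch

def quantize_int4_blockwise(w: torch.Tensor, group_size: int = 32):
    """Symmetric block-wise int4 quantization of a 2-D weight matrix.

    Each row is split into blocks of `group_size` values that share one
    scale; smaller blocks track local weight ranges more closely, which
    is the usual motivation for a "_block" scheme. Assumes in_features
    is divisible by group_size.
    """
    out_features, in_features = w.shape
    blocks = w.reshape(out_features, in_features // group_size, group_size)
    # One scale per block: map the block's max magnitude to 7 (int4 range is [-8, 7]).
    scales = (blocks.abs().amax(dim=-1, keepdim=True) / 7.0).clamp_min(1e-8)
    q = torch.clamp(torch.round(blocks / scales), -8, 7)
    w_hat = (q * scales).reshape(out_features, in_features)  # dequantized weights
    return q.to(torch.int8), scales, w_hat  # int4 values stored in an int8 container

w = torch.randn(16, 64)
_, _, w_hat = quantize_int4_blockwise(w)
print((w - w_hat).abs().max())  # error introduced by 4-bit weights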

@rohansjoshi requested a review from cccclai as a code owner June 20, 2025 15:29
@pytorch-bot

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11822

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 1 Pending

As of commit 6cd35a3 with merge base a12a005:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Jun 20, 2025
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D76837572

@github-actionsGitHub Actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

rohansjoshi added a commit to rohansjoshi/executorch that referenced this pull request Jun 20, 2025
Summary:

Added quantization to the evaluation script. Quantization causes a deterioration in accuracy.

On the wikitext task:
| Model Name | max_seq_len | ptq | word_perplexity |
|----------|----------|----------|-----------|
| Llama 3.2-1B Instruct  | 128   | 16a4w |  5821003.055178451 |
| Llama 3.2-1B Instruct  | 128   | 16a4w_block |  5396240.078572427 |
| Llama 3.2-1B Instruct  | 128   | 8a8w |  533154.970440251 |

Differential Revision: D76837572
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D76837572

rohansjoshi added a commit to rohansjoshi/executorch that referenced this pull request Jun 20, 2025
Summary:
Pull Request resolved: pytorch#11822

Added quantization to the evaluation script. Quantization causes a deterioration in accuracy.

On the wikitext task:
| Model Name | max_seq_len | ptq | word_perplexity |
|----------|----------|----------|-----------|
| Llama 3.2-1B Instruct  | 128   | 16a4w |  5821003.055178451 |
| Llama 3.2-1B Instruct  | 128   | 16a4w_block |  5396240.078572427 |
| Llama 3.2-1B Instruct  | 128   | 8a8w |  533154.970440251 |

Differential Revision: D76837572


Thank you for the ppl summary table
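
As a side note on how the ppl numbers are defined: for wikitext, `word_perplexity` in the lm-eval style is exp(total negative log-likelihood / number of words), normalizing by words rather than tokens so scores are comparable across tokenizers. A minimal sketch under the assumption of a HuggingFace-style causal LM and tokenizer (hypothetical interface; not the evaluation script's actual code):

```python
import math
import torch

def word_perplexity(model, tokenizer, text: str) -> float:
    """Word-level perplexity: exp(total NLL in nats / number of words)."""
    ids = tokenizer(text, return_tensors="pt").input_ids  # (1, T)
    with torch.no_grad():
        logits = model(ids).logits  # (1, T, vocab) for a HF-style causal LM
    # Negative log-likelihood of each token given the tokens before it.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    nll = -log_probs.gather(-1, ids[:, 1:].unsqueeze(-1)).sum()
    # Normalize by whitespace-delimited words, not tokens.
    return math.exp(nll.item() / len(text.split()))
```

Because of the word-count normalization, the absolute values aren't directly comparable to token-level perplexities; what the table conveys is the relative ordering of the three ptq settings, with 8a8w (8-bit weights) degrading least.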

rohansjoshi added a commit to rohansjoshi/executorch that referenced this pull request Jun 20, 2025
Summary:

Added quantization to the evaluation script. Quantization causes a deterioration in accuracy.

On the wikitext task:
| Model Name | max_seq_len | ptq | word_perplexity |
|----------|----------|----------|-----------|
| Llama 3.2-1B Instruct  | 128   | 16a4w |  5821003.055178451 |
| Llama 3.2-1B Instruct  | 128   | 16a4w_block |  5396240.078572427 |
| Llama 3.2-1B Instruct  | 128   | 8a8w |  533154.970440251 |

Reviewed By: cccclai

Differential Revision: D76837572
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D76837572

rohansjoshi added a commit to rohansjoshi/executorch that referenced this pull request Jun 20, 2025
Summary:
Pull Request resolved: pytorch#11822

Added quantization to the evaluation script. Quantization causes a deterioration in accuracy.

On the wikitext task:
| Model Name | max_seq_len | ptq | word_perplexity |
|----------|----------|----------|-----------|
| Llama 3.2-1B Instruct  | 128   | 16a4w |  5821003.055178451 |
| Llama 3.2-1B Instruct  | 128   | 16a4w_block |  5396240.078572427 |
| Llama 3.2-1B Instruct  | 128   | 8a8w |  533154.970440251 |

Reviewed By: cccclai

Differential Revision: D76837572
rohansjoshi added a commit to rohansjoshi/executorch that referenced this pull request Jun 21, 2025
Summary:

Added quantization to the evaluation script. Quantization causes a deterioration in accuracy.

On the wikitext task:
| Model Name | max_seq_len | ptq | word_perplexity |
|----------|----------|----------|-----------|
| Llama 3.2-1B Instruct  | 128   | 16a4w |  5821003.055178451 |
| Llama 3.2-1B Instruct  | 128   | 16a4w_block |  5396240.078572427 |
| Llama 3.2-1B Instruct  | 128   | 8a8w |  533154.970440251 |

Reviewed By: cccclai

Differential Revision: D76837572
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D76837572

@facebook-github-bot merged commit 608a745 into pytorch:main Jun 21, 2025
102 of 104 checks passed
Labels: CLA Signed, fb-exported