path: root/mlir/test/python/dialects/quant.py
Age | Commit message | Author
2025-03-23 | Sub-channel quantized type implementation (#120172) | Sandeep Dasgupta
This is an implementation for [RFC: Supporting Sub-Channel Quantization in MLIR](https://discourse.llvm.org/t/rfc-supporting-sub-channel-quantization-in-mlir/82694). To make the review process easier, the PR has been divided into the following commit labels:

1. **Add implementation for sub-channel type:** Includes the class design for `UniformQuantizedSubChannelType`, printer/parser, and bytecode read/write support. The existing types (per-tensor and per-axis) are unaltered.
2. **Add implementation for sub-channel type:** Lowering of `quant.qcast` and `quant.dcast` operations to Linalg operations.
3. **Adding C/Python Apis:** We first define the C-APIs and build the Python-APIs on top of those.
4. **Add pass to normalize generic ....:** This pass normalizes sub-channel quantized types to per-tensor or per-axis types, if possible.

A design note:

- **Explicitly storing the `quantized_dimensions`, even when they can be derived for ranked tensor.** While it's possible to infer quantized dimensions from the static shape of the scales (or zero-points) tensor for ranked data tensors ([ref](https://discourse.llvm.org/t/rfc-supporting-sub-channel-quantization-in-mlir/82694/3) for background), there are cases where this can lead to ambiguity and issues with round-tripping.

  Consider the example:

  ```
  tensor<2x4x!quant.uniform<i8:f32:{0:2, 1:2}, {{s00:z00, s01:z01}}>>
  ```

  The shape of the scales tensor is [1, 2], which might suggest that only axis 1 is quantized. While this inference is technically correct, as the block size for axis 0 is a degenerate case (equal to the dimension size), it can cause problems with round-tripping. Therefore, even for ranked tensors, we are explicitly storing the quantized dimensions. Suggestions welcome!

PS: I understand that the upcoming holidays may impact your schedule, so please take your time with the review. There's no rush.
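The round-tripping ambiguity in the design note can be sketched in plain Python. This is only an illustration and does not use the MLIR bindings: `scales_shape` and `infer_quantized_axes` are hypothetical helpers, and the input assumes a 2x4 data tensor with block size 2 on both axes.

```python
def scales_shape(data_shape, block_sizes):
    """Shape of the scales tensor for a ranked data tensor.

    `block_sizes` maps a quantized axis to its block size (hypothetical
    representation). An axis without an entry is treated as one block
    spanning the whole dimension.
    """
    shape = []
    for axis, dim in enumerate(data_shape):
        block = block_sizes.get(axis, dim)
        if dim % block != 0:
            raise ValueError(f"block size {block} does not divide dim {dim}")
        shape.append(dim // block)
    return shape


def infer_quantized_axes(shape):
    """Naive inference: an axis counts as quantized if it has more than one block."""
    return [axis for axis, blocks in enumerate(shape) if blocks > 1]


explicit = {0: 2, 1: 2}                   # both axes are declared quantized
shape = scales_shape([2, 4], explicit)    # axis 0 is degenerate: one block
inferred = infer_quantized_axes(shape)    # inference only recovers axis 1

print(shape, inferred, sorted(explicit))
```

Because the declared axes and the inferred axes differ, a type printed from the inferred form would not match the one that was parsed, which is the motivation given above for storing the quantized dimensions explicitly.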
2024-11-19 | [mlir][Bindings] Fix missing return value of functions and incorrect type hint in pyi. (#116731) | annuasd
The zero points of `UniformQuantizedPerAxisType` should be `List[int]`, and two methods were missing return values. Co-authored-by: 牛奕博 <niuyibo@niuyibodeMacBook-Pro.local>
2023-05-26 | [NFC][Py Reformat] Reformat python files in mlir subdir | Tobias Hieta
This is an ongoing series of commits that are reformatting our Python code. Reformatting is done with `black`.

If you end up having problems merging this commit because you have made changes to a python file, the best way to handle that is to run `git checkout --ours <yourfile>` and then reformat it with black.

If you run into any problems, post to discourse about it and we will try to help.

RFC Thread below: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Differential Revision: https://reviews.llvm.org/D150782
2022-01-05 | [mlir] Introduce Python bindings for the quantization dialect | Alex Zinenko
So far, only the custom dialect types are exposed. The build and packaging are the same as for Linalg and SparseTensor, and are in need of refactoring that is beyond the scope of this patch.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D116605