JointQ Quantizer

JointQ (dataclass)
```python
JointQ(
    name: str = None,
    num_layers: int = None,
    calc_quant_error: bool = False,
    include_layer_names: list[str] = None,
    exclude_layer_names: list[str] = ['lm_head'],
    include_layer_keywords: list[str] = None,
    exclude_layer_keywords: list[str] = ['per_layer_model_projection'],
    target_layer_types: tuple = (Linear,),
    hessian_dtype: dtype = torch.float64,
    module_to_name: dict = {},
    results: dict = {},
    flag_calibration: bool = True,
    flag_hessian: bool = False,
    flag_xtx: bool = True,
    bits: int = 4,
    symmetric: bool = False,
    group_size: Optional[int] = 128,
    log_level: int = 0,
    device: Optional[device] = None,
    regularization_lambda: Optional[float] = 0.1,
    regularization_mode: str = 'diagonal',
    regularization_gamma: float = 0.5,
    lambda_mode: str = 'fixed_lambda',
    lambda_list: Optional[List[float]] = None,
    incremental_eps_y: float = 0.03,
    incremental_eps_w: float = 0.1,
    incremental_initial_skip_ew_threshold: Optional[float] = 0.3,
    actorder: bool = False,
    ils_enabled: bool = False,
    ils_num_iterations: int = 10,
    ils_num_clones: int = 8,
    ils_num_channels: Optional[int] = None,
    enable_clip_optimize: bool = True,
    enable_clip_optimize_ep: bool = False,
    enable_gptq: bool = True,
    gptq: Optional[GPTQ] = None,
)
```
Bases: Quantizer
JointQ quantizer class.
JointQ is a post-training quantization method that combines multiple initialization strategies (Clip-Optimize, Clip-Optimize-EP, GPTQ) with local search optimization to find high-quality quantized weights.
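The candidate-selection idea can be sketched as follows. The helper names (`output_error`, `select_initial_solution`) and the trace-form proxy loss are illustrative assumptions, not the library's actual API: each initialization strategy produces a candidate quantized weight matrix, and the one with the lowest layer output error is handed to the local search stage.

```python
import torch

def output_error(w, w_q, xtx):
    # Proxy for layer output error: tr((W - Wq) X^T X (W - Wq)^T),
    # where xtx is the accumulated X^T X of calibration activations.
    d = w - w_q
    return torch.trace(d @ xtx @ d.T).item()

def select_initial_solution(w, xtx, candidates):
    # Evaluate each initialization candidate (e.g. from Clip-Optimize,
    # Clip-Optimize-EP, GPTQ) and keep the lowest-error one.
    return min(candidates, key=lambda w_q: output_error(w, w_q, xtx))
```

Local search then perturbs and refines the selected candidate rather than starting from scratch, which is why supplying several initialization strategies can help.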
Attributes:

| Name | Type | Description |
|---|---|---|
| `bits` | `int` | Number of bits for quantization. Default is 4. |
| `symmetric` | `bool` | Whether to use symmetric quantization. Default is False. |
| `group_size` | `int` or `None` | Group size for quantization. Default is 128. If None, per-channel quantization is used (group_size = in_features). |
| `log_level` | `int` | Log level (0: none, 1: minimal, 2: detailed). Default is 0. |
| `device` | `device` or `None` | Device for quantization. If None, uses the device of the module being quantized. |
| `regularization_lambda` | `float` or `None` | Tikhonov regularization strength. Default is 0.1. Replaces X^T X with X^T X + n·lambda·R, where R depends on `regularization_mode`. |
| `regularization_mode` | `str` | Shape of the regularization matrix R. Default is `'diagonal'`. |
| `regularization_gamma` | `float` | Exponent for the diagonal weights in `'diagonal'` mode. Default is 0.5. |
| `lambda_mode` | `str` | Regularization mode. Default is `'fixed_lambda'`. |
| `lambda_list` | `list of float` or `None` | Ascending list of lambda values to try in `'incremental_lambda'` mode. |
| `incremental_eps_y` | `float` | Maximum tolerated relative output-error increase when accepting a candidate in `'incremental_lambda'` mode. Default is 0.03. |
| `incremental_eps_w` | `float` | Minimum required relative weight-error decrease to accept a candidate whose output error worsened in `'incremental_lambda'` mode. Default is 0.1. |
| `incremental_initial_skip_ew_threshold` | `float` or `None` | If the first incremental candidate uses … Default is 0.3. |
| `actorder` | `bool` | Whether to reorder columns by activation magnitude (Hessian diagonal) before quantization. Default is False. When enabled, columns with larger activations are grouped together, improving group quantization efficiency and GPTQ initial solution quality. |
| `ils_enabled` | `bool` | Whether to enable Iterated Local Search. Default is False. |
| `ils_num_iterations` | `int` | Number of ILS iterations. Default is 10. |
| `ils_num_clones` | `int` | Number of clones per row in ILS. Default is 8. |
| `ils_num_channels` | `int` or `None` | Number of rows targeted per ILS iteration. When None, automatically set to min(dim_p, 1024). Default is None. |
| `enable_clip_optimize` | `bool` | Whether to use Clip-Optimize initialization. Default is True. |
| `enable_clip_optimize_ep` | `bool` | Whether to use Clip-Optimize with Error Propagation initialization. Default is False. |
| `enable_gptq` | `bool` | Whether to use GPTQ initialization. Default is True. |
| `gptq` | `GPTQ` or `None` | GPTQ instance for initial solution generation. If None, a default GPTQ is created from bits/group_size/symmetric. Pass a custom GPTQ instance to control parameters like blocksize, percdamp, mse, q_grid, q_norm. The GPTQ instance must have wbits/groupsize/sym matching JointQ's bits/group_size/symmetric, and actorder must be False. |
Example

Basic usage:

```python
from onecomp.quantizer.jointq import JointQ

quantizer = JointQ(
    bits=4,
    symmetric=False,
    group_size=128,
)
```

With all initialization strategies enabled:

```python
quantizer = JointQ(
    bits=4,
    symmetric=False,
    group_size=128,
    enable_clip_optimize=True,
    enable_clip_optimize_ep=True,
    enable_gptq=True,
)
```

With custom GPTQ parameters:

```python
from onecomp.quantizer.gptq import GPTQ

quantizer = JointQ(
    bits=4,
    symmetric=False,
    group_size=128,
    gptq=GPTQ(wbits=4, groupsize=128, sym=False, mse=True),
)
```

With incremental lambda mode:

```python
quantizer = JointQ(
    bits=4,
    symmetric=False,
    group_size=128,
    lambda_mode="incremental_lambda",
)
```
validate_params

Validate JointQ and GPTQ parameters.

Called once during setup(). Validates:

JointQ parameters:

- `bits`: int >= 1
- `group_size`: int >= 1 or None
- `log_level`: int in {0, 1, 2}
- `ils_num_iterations`: int >= 1 (when `ils_enabled`)
- `ils_num_clones`: int >= 1 (when `ils_enabled`)
- `ils_num_channels`: int >= 1 or None (when `ils_enabled`)

GPTQ consistency:

- `gptq.wbits == bits`
- `gptq.groupsize == group_size` (or -1 when `group_size` is None)
- `gptq.sym == symmetric`
- `gptq.actorder == False`

Also delegates to `self.gptq.validate_params()` for GPTQ's own parameter validation (blocksize, percdamp, etc.).
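The JointQ-side checks can be sketched as a standalone helper. The function name and flat-argument form are hypothetical (the real method validates attributes on `self`), but the rules mirror the list above:

```python
def validate_jointq_params(bits, group_size, log_level, ils_enabled,
                           ils_num_iterations, ils_num_clones,
                           ils_num_channels):
    # bits: int >= 1
    if not (isinstance(bits, int) and bits >= 1):
        raise ValueError("bits must be an int >= 1")
    # group_size: int >= 1 or None
    if group_size is not None and not (isinstance(group_size, int)
                                       and group_size >= 1):
        raise ValueError("group_size must be an int >= 1 or None")
    # log_level: int in {0, 1, 2}
    if log_level not in (0, 1, 2):
        raise ValueError("log_level must be 0, 1, or 2")
    # ILS parameters are only checked when ILS is enabled.
    if ils_enabled:
        if ils_num_iterations < 1:
            raise ValueError("ils_num_iterations must be >= 1")
        if ils_num_clones < 1:
            raise ValueError("ils_num_clones must be >= 1")
        if ils_num_channels is not None and ils_num_channels < 1:
            raise ValueError("ils_num_channels must be >= 1 or None")
```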
quantize_layer

Quantize a single layer.

Processing flow:

1. Extract weight matrix from module
2. Prepare matrix_XX (= X^T X) from input or use precomputed
3. Apply activation ordering (actorder) if enabled
4. Generate GPTQ initial solution (if enable_gptq=True), using the pre-regularization hessian
5. Convert GPTQ result to JointQ Solution format
6. Prepare ILS parameters
7. Apply Tikhonov regularization to matrix_XX
8. Run JointQ quantization with initial solutions
9. Return quantization result
When lambda_mode="incremental_lambda", steps 7-8 are replaced by
an iterative loop that tries each value in lambda_list and keeps
the solution as long as it improves weight error without substantially
degrading output error.
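The acceptance rule can be sketched as follows. The function names are illustrative, and `solve` is a stand-in for one full quantization pass at a given lambda returning `(solution, output_error, weight_error)`; the thresholds correspond to `incremental_eps_y` and `incremental_eps_w`:

```python
def accept_candidate(e_y_best, e_w_best, e_y_cand, e_w_cand,
                     eps_y=0.03, eps_w=0.1):
    # Always accept when the output error did not worsen.
    if e_y_cand <= e_y_best:
        return True
    # Otherwise, accept only if the output-error increase stays within
    # eps_y AND the weight error dropped by at least eps_w (relative).
    rel_y_increase = (e_y_cand - e_y_best) / e_y_best
    rel_w_decrease = (e_w_best - e_w_cand) / e_w_best
    return rel_y_increase <= eps_y and rel_w_decrease >= eps_w

def incremental_lambda_loop(lambda_list, solve):
    # Try each lambda in ascending order, keeping a candidate only
    # while it improves weight error without substantially degrading
    # output error.
    best_sol, e_y, e_w = solve(lambda_list[0])
    for lam in lambda_list[1:]:
        cand, cy, cw = solve(lam)
        if accept_candidate(e_y, e_w, cy, cw):
            best_sol, e_y, e_w = cand, cy, cw
    return best_sol
```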
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `module` | `Module` | The layer module to quantize. | required |
| `input` | `tuple` or `Tensor` | Input activations. Used to compute matrix_XX when matrix_XX is not provided. | None |
| `hessian` | `Tensor` | Not used in JointQ (ignored). | None |
| `matrix_XX` | `Tensor` | Precomputed X^T X (FP64). If provided, this is used instead of input. | None |
| `dim_n` | `int` | Number of samples. Required when matrix_XX is provided. | None |
Returns:

| Type | Description |
|---|---|
| `JointQResult` | Quantization result containing `scale`, `zero_point`, `assignment`, and `perm` (column permutation when `actorder` is used). |
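A sketch of preparing the precomputed inputs, per the parameter descriptions above. The layer sizes and calibration batch are hypothetical, and the final commented call assumes a `quantizer` constructed as in the earlier examples:

```python
import torch
import torch.nn as nn

# Hypothetical calibration batch for a small Linear layer.
layer = nn.Linear(16, 8)
calib = torch.randn(256, 16)

# matrix_XX is the precomputed X^T X, accumulated in FP64.
# dim_n (the sample count) is required whenever matrix_XX is supplied.
matrix_XX = calib.double().T @ calib.double()
dim_n = calib.shape[0]

# result = quantizer.quantize_layer(layer, matrix_XX=matrix_XX, dim_n=dim_n)
```

Passing `matrix_XX` lets X^T X be accumulated once over the calibration stream instead of re-deriving it from raw activations at quantization time.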
execute_post_processing
Log accepted_lambda statistics after all layers are quantized.