

JointQ Quantizer

JointQ dataclass

JointQ(
    name: str = None,
    num_layers: int = None,
    calc_quant_error: bool = False,
    include_layer_names: list[str] = None,
    exclude_layer_names: list[str] = ['lm_head'],
    include_layer_keywords: list[str] = None,
    exclude_layer_keywords: list[str] = ['per_layer_model_projection'],
    target_layer_types: tuple = (Linear,),
    hessian_dtype: dtype = torch.float64,
    module_to_name: dict = dict(),
    results: dict = dict(),
    flag_calibration: bool = True,
    flag_hessian: bool = False,
    flag_xtx: bool = True,
    bits: int = 4,
    symmetric: bool = False,
    group_size: Optional[int] = 128,
    log_level: int = 0,
    device: Optional[device] = None,
    regularization_lambda: Optional[float] = 0.1,
    regularization_mode: str = 'diagonal',
    regularization_gamma: float = 0.5,
    lambda_mode: str = 'fixed_lambda',
    lambda_list: Optional[List[float]] = None,
    incremental_eps_y: float = 0.03,
    incremental_eps_w: float = 0.1,
    incremental_initial_skip_ew_threshold: Optional[float] = 0.3,
    actorder: bool = False,
    ils_enabled: bool = False,
    ils_num_iterations: int = 10,
    ils_num_clones: int = 8,
    ils_num_channels: Optional[int] = None,
    enable_clip_optimize: bool = True,
    enable_clip_optimize_ep: bool = False,
    enable_gptq: bool = True,
    gptq: Optional[GPTQ] = None,
)

Bases: Quantizer

JointQ quantizer class.

JointQ is a post-training quantization method that combines multiple initialization strategies (Clip-Optimize, Clip-Optimize-EP, GPTQ) with local search optimization to find high-quality quantized weights.
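
At a high level, each enabled initializer produces a candidate quantized weight matrix, the candidate with the lowest layer output error is kept, and local search then refines it. A minimal sketch of that idea with hypothetical helper names (the real pipeline is quantize_layer below; the tr((W - Q) X^T X (W - Q)^T) proxy objective is an assumption based on the GPTQ-style setup, not stated explicitly here)::

import torch

def jointq_sketch(W, matrix_XX, initializers, local_search):
    """Pick the best initial quantized weight, then refine it by local search.

    `initializers` and `local_search` are hypothetical callables standing in
    for Clip-Optimize / Clip-Optimize-EP / GPTQ and the ILS refinement.
    """
    def output_error(Q):
        D = (W - Q).to(matrix_XX.dtype)
        # tr((W - Q) X^T X (W - Q)^T), the layer-wise proxy loss
        return torch.einsum("ij,jk,ik->", D, matrix_XX, D).item()

    candidates = [init(W, matrix_XX) for init in initializers]
    best = min(candidates, key=output_error)   # lowest proxy loss wins
    return local_search(best, W, matrix_XX)    # ILS-style refinement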

Attributes:

Name Type Description
bits int

Number of bits for quantization. Default is 4.

symmetric bool

Whether to use symmetric quantization. Default is False.

group_size int or None

Group size for quantization. Default is 128. If None, per-channel quantization is used (group_size = in_features).

log_level int

Log level (0: none, 1: minimal, 2: detailed). Default is 0.

device device or None

Device for quantization. If None, uses the device of the module being quantized.

regularization_lambda float or None

Tikhonov regularization strength. Default is 0.1. Replaces X^T X with X^T X + n * lambda * R, where R depends on regularization_mode. lambda is relative to the normalized Hessian (1/n) X^T X, so its meaning is consistent across different calibration sample sizes. Recommended range: 0.1 to 1.0. Set to None or 0.0 to disable. Used only in fixed_lambda mode.

regularization_mode str

Shape of the regularization matrix R. Default is "diagonal". "identity": R = I (standard Tikhonov). "diagonal": R = diag(a) where a_i = (diag(X^T X)_i / mean(diag(X^T X)))^gamma, which makes regularization importance-aware: columns with larger activations receive stronger regularization. Only supported with lambda_mode="fixed_lambda". A minimal sketch of this construction follows the attribute list below.

regularization_gamma float

Exponent for the diagonal weights in "diagonal" mode. Default is 0.5. Smaller values reduce the spread between weak and strong columns.

lambda_mode str

Regularization mode. Default is "fixed_lambda". "fixed_lambda": Use a single fixed regularization_lambda for all layers (existing behavior). "incremental_lambda": For each layer, try increasing lambda values from lambda_list and accept the solution as long as it improves weight error without substantially degrading output error.

lambda_list list of float or None

Ascending list of lambda values to try in incremental_lambda mode. Ignored in fixed_lambda mode. Default is [0.001, 0.01, 0.05, 0.1, 0.15, 0.2, 0.3, 0.5].

incremental_eps_y float

Maximum tolerated relative output-error increase when accepting a candidate in incremental_lambda mode. Default is 0.03 (3%).

incremental_eps_w float

Minimum required relative weight-error decrease to accept a candidate whose output error worsened in incremental_lambda mode. Default is 0.10 (10%).

incremental_initial_skip_ew_threshold float or None

If the first incremental candidate uses lambda=0.0 and its relative weight error exceeds this threshold, skip that candidate and try the next lambda instead of accepting it as the initial solution. This guard is only relevant when lambda_list starts with 0.0. Default is 0.3 (30%). Set to None to disable this guard.

actorder bool

Whether to reorder columns by activation magnitude (Hessian diagonal) before quantization. Default is False. When enabled, columns with larger activations are grouped together, improving group quantization efficiency and GPTQ initial solution quality.

ils_enabled bool

Whether to enable Iterated Local Search. Default is False.

ils_num_iterations int

Number of ILS iterations. Default is 10.

ils_num_clones int

Number of clones per row in ILS. Default is 8.

ils_num_channels int or None

Number of rows targeted per ILS iteration. When None, automatically set to min(dim_p, 1024). Default is None.

enable_clip_optimize bool

Whether to use Clip-Optimize initialization. Default is True.

enable_clip_optimize_ep bool

Whether to use Clip-Optimize with Error Propagation initialization. Default is False.

enable_gptq bool

Whether to use GPTQ initialization. Default is True.

gptq GPTQ or None

GPTQ instance for initial solution generation. If None, a default GPTQ is created from bits/group_size/symmetric. Pass a custom GPTQ instance to control parameters like blocksize, percdamp, mse, q_grid, q_norm. The GPTQ instance must have wbits/groupsize/sym matching JointQ's bits/group_size/symmetric, and actorder must be False.
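
A minimal sketch of how the regularized Hessian described by regularization_lambda, regularization_mode, and regularization_gamma could be formed (illustrative only; assumes matrix_XX already holds the accumulated X^T X and n is the number of calibration samples)::

import torch

def regularize_xtx(matrix_XX, n, lam, mode="diagonal", gamma=0.5):
    """Illustrative Tikhonov regularization: X^T X + n * lam * R."""
    if lam is None or lam == 0.0:
        return matrix_XX
    if mode == "identity":
        # R = I: standard Tikhonov
        R = torch.eye(matrix_XX.shape[0], dtype=matrix_XX.dtype,
                      device=matrix_XX.device)
    else:
        # "diagonal": a_i = (diag(X^T X)_i / mean(diag(X^T X))) ** gamma
        d = torch.diagonal(matrix_XX)
        R = torch.diag((d / d.mean()).pow(gamma))
    # lam is defined relative to the normalized Hessian (1/n) X^T X,
    # hence the explicit factor n here.
    return matrix_XX + n * lam * R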

Example

Basic usage::

from onecomp.quantizer.jointq import JointQ

quantizer = JointQ(
    bits=4,
    symmetric=False,
    group_size=128,
)

With all initialization strategies enabled::

quantizer = JointQ(
    bits=4,
    symmetric=False,
    group_size=128,
    enable_clip_optimize=True,
    enable_clip_optimize_ep=True,
    enable_gptq=True,
)

With custom GPTQ parameters::

from onecomp.quantizer.gptq import GPTQ

quantizer = JointQ(
    bits=4,
    symmetric=False,
    group_size=128,
    gptq=GPTQ(
        wbits=4, groupsize=128, sym=False, mse=True
    ),
)

With incremental lambda mode::

quantizer = JointQ(
    bits=4,
    symmetric=False,
    group_size=128,
    lambda_mode="incremental_lambda",
)

validate_params

validate_params()

Validate JointQ and GPTQ parameters.

Called once during setup(). Validates:

JointQ parameters

bits: int >= 1
group_size: int >= 1 or None
log_level: int in {0, 1, 2}
ils_num_iterations: int >= 1 (when ils_enabled)
ils_num_clones: int >= 1 (when ils_enabled)
ils_num_channels: int >= 1 or None (when ils_enabled)

GPTQ consistency

gptq.wbits == bits
gptq.groupsize == group_size (or -1 when group_size is None)
gptq.sym == symmetric
gptq.actorder == False

Also delegates to self.gptq.validate_params() for GPTQ's own parameter validation (blocksize, percdamp, etc.).
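
A sketch of the GPTQ-consistency portion of these checks (illustrative; the actual error handling and messages may differ)::

def check_gptq_consistency(jointq, gptq):
    """Illustrative version of the JointQ/GPTQ consistency checks."""
    expected_groupsize = -1 if jointq.group_size is None else jointq.group_size
    assert gptq.wbits == jointq.bits
    assert gptq.groupsize == expected_groupsize
    assert gptq.sym == jointq.symmetric
    assert gptq.actorder is False
    gptq.validate_params()  # GPTQ checks its own blocksize, percdamp, etc.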

quantize_layer

quantize_layer(module, input=None, hessian=None, matrix_XX=None, dim_n=None)

Quantize a single layer.

Processing flow
  1. Extract weight matrix from module
  2. Prepare matrix_XX (= X^T X) from input or use precomputed
  3. Apply activation ordering (actorder) if enabled
  4. Generate GPTQ initial solution (if enable_gptq=True), using the pre-regularization hessian
  5. Convert GPTQ result to JointQ Solution format
  6. Prepare ILS parameters
  7. Apply Tikhonov regularization to matrix_XX
  8. Run JointQ quantization with initial solutions
  9. Return quantization result
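
Step 3 (activation ordering) can be sketched as follows: when actorder is enabled, columns are sorted by the Hessian diagonal and the permutation is kept so it can be reported in the result (illustrative, not the exact implementation)::

import torch

def activation_order(W, matrix_XX):
    """Sort weight columns by descending Hessian diagonal (activation magnitude)."""
    perm = torch.argsort(torch.diagonal(matrix_XX), descending=True)
    W_perm = W[:, perm]                 # reorder weight columns
    XX_perm = matrix_XX[perm][:, perm]  # reorder rows and columns of X^T X
    return W_perm, XX_perm, perm        # perm is returned as part of JointQResult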

When lambda_mode="incremental_lambda", steps 7-8 are replaced by an iterative loop that tries each value in lambda_list and keeps the solution as long as it improves weight error without substantially degrading output error.
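
One reading of the acceptance rule in that loop, expressed as a sketch (eps_y and eps_w correspond to incremental_eps_y and incremental_eps_w; the exact implementation may differ)::

def accept_candidate(err_y_new, err_y_cur, err_w_new, err_w_cur,
                     eps_y=0.03, eps_w=0.10):
    """Illustrative acceptance rule for incremental_lambda mode."""
    rel_y_increase = (err_y_new - err_y_cur) / max(err_y_cur, 1e-12)
    rel_w_decrease = (err_w_cur - err_w_new) / max(err_w_cur, 1e-12)
    if rel_y_increase > eps_y:
        return False                    # output error degraded too much
    if rel_y_increase > 0.0:
        return rel_w_decrease >= eps_w  # worse output must buy a large weight-error gain
    return rel_w_decrease > 0.0         # otherwise accept any weight-error improvement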

Parameters:

Name Type Description Default
module Module

The layer module to quantize.

required
input tuple or Tensor

Input activations. Used to compute matrix_XX when matrix_XX is not provided.

None
hessian Tensor

Not used in JointQ (ignored).

None
matrix_XX Tensor

Precomputed X^T X (FP64). If provided, this is used instead of input.

None
dim_n int

Number of samples. Required when matrix_XX is provided.

None

Returns:

Name Type Description
JointQResult

Quantization result containing scale, zero_point, assignment, and perm (column permutation when actorder is used).
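
A usage sketch with a precomputed X^T X (layer sizes and calibration data are illustrative; depending on the surrounding pipeline, setup() and calibration may be required before calling quantize_layer directly)::

import torch
from torch.nn import Linear

from onecomp.quantizer.jointq import JointQ

layer = Linear(1024, 1024)
X = torch.randn(512, 1024)               # calibration activations (n x in_features)
matrix_XX = (X.T @ X).to(torch.float64)  # precomputed X^T X in FP64

quantizer = JointQ(bits=4, symmetric=False, group_size=128, actorder=True)
result = quantizer.quantize_layer(layer, matrix_XX=matrix_XX, dim_n=X.shape[0])
# result.scale, result.zero_point, result.assignment hold the quantization parameters;
# result.perm holds the column permutation because actorder=True.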

execute_post_processing

execute_post_processing()

Log accepted_lambda statistics after all layers are quantized.