JointQ¶
JointQ Quantizer¶
JointQ dataclass ¶
JointQ(name: str = None, num_layers: int = None, calc_quant_error: bool = False, include_layer_names: list[str] = None, exclude_layer_names: list[str] = (lambda: ['lm_head'])(), include_layer_keywords: list[str] = None, exclude_layer_keywords: list[str] = None, target_layer_types: tuple = (lambda: (Linear,))(), hessian_dtype: dtype = torch.float64, module_to_name: dict = dict(), results: dict = dict(), flag_calibration: bool = True, flag_hessian: bool = False, flag_xtx: bool = True, bits: int = 4, symmetric: bool = False, group_size: int = 128, batch_size: Optional[int] = None, log_level: int = 0, device: Optional[device] = None, regularization_lambda: Optional[float] = 0.2, actorder: bool = False, ils_enabled: bool = False, ils_num_iterations: int = 10, ils_num_clones: int = 8, ils_num_channels: Optional[int] = None)
Bases: Quantizer
JointQ quantizer class
JointQ is a quantization method that uses the jointq package.
Attributes:

| Name | Type | Description |
|---|---|---|
| bits | int | Number of bits for quantization. Default is 4. |
| symmetric | bool | Whether to use symmetric quantization. Default is False. |
| group_size | int or None | Group size for quantization. Default is 128. If None, per-channel quantization is used. |
| batch_size | int or None | Batch size for quantization. Default is None (solve all at once). |
| log_level | int | Log level (0: none, 1: minimal, 2: detailed). Default is 0. |
| device | device | Device for quantization. |
| regularization_lambda | float | Tikhonov regularization strength. Default is 0.2. Replaces X^T X with X^T X + nλI, where n = dim_n. λ is relative to the normalized Hessian (1/n)X^T X, so its meaning is consistent across different calibration sample sizes. Recommended range: 0.1 to 1.0. |
| actorder | bool | Whether to reorder columns by activation magnitude (Hessian diagonal) before quantization. Default is False. When enabled, columns with larger activations are grouped together, improving group quantization efficiency and the quality of the GPTQ initial solution. |
| ils_enabled | bool | Whether to enable Iterated Local Search (ILS). Default is False. |
| ils_num_iterations | int | Number of ILS iterations. Default is 10. |
| ils_num_clones | int | Number of ILS clones. Default is 8. |
| ils_num_channels | int or None | Number of ILS channels. Default is None. |
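The regularization_lambda and actorder behavior described above can be sketched as follows. This is an illustrative NumPy sketch of the math, not the onecomp implementation; the function names here are assumptions.

```python
import numpy as np

def regularize_gram(X: np.ndarray, lam: float = 0.2) -> np.ndarray:
    """Tikhonov regularization as described above: replace X^T X with
    X^T X + n*lam*I, where n is the number of calibration samples, so
    lam is relative to the normalized Hessian (1/n) X^T X."""
    n = X.shape[0]
    XtX = X.T @ X
    return XtX + n * lam * np.eye(X.shape[1])

def actorder_perm(XtX: np.ndarray) -> np.ndarray:
    """actorder sketch: a permutation that sorts columns by the Hessian
    diagonal (activation magnitude), largest first."""
    return np.argsort(-np.diag(XtX))
```

Applying `actorder_perm` to the rows and columns of the Gram matrix (and the corresponding weight columns) groups high-activation columns together before quantization.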
Example

Basic usage:

```python
import torch
from onecomp.quantizer.jointq import JointQ

quantizer = JointQ(
    bits=4,
    symmetric=False,
    group_size=128,
    device=torch.device(0),
)
```

With batch_size:

```python
import torch
from onecomp.quantizer.jointq import JointQ

quantizer = JointQ(
    bits=4,
    symmetric=False,
    group_size=128,
    batch_size=4096,
    device=torch.device(0),
)
```

Without Iterated Local Search (ILS):

```python
import torch
from onecomp.quantizer.jointq import JointQ

quantizer = JointQ(
    bits=4,
    symmetric=False,
    group_size=128,
    device=torch.device(0),
    ils_enabled=False,
)
```
validate_params ¶
Validate JointQ parameters once in setup().
Validated ranges:

- bits: int >= 1
- group_size: int >= 1
- batch_size: int >= 1 or None
- log_level: int in {0, 1, 2}
- ils_num_iterations: int >= 1 (when ils_enabled=True)
- ils_num_clones: int >= 1 (when ils_enabled=True)
- ils_num_channels: int >= 1 or None (when ils_enabled=True)
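A minimal sketch of these checks, assuming a plain-function form; the actual validate_params in onecomp may be structured differently:

```python
def validate_params(bits, group_size, batch_size, log_level,
                    ils_enabled, ils_num_iterations, ils_num_clones,
                    ils_num_channels):
    # Checks mirror the validated ranges listed above.
    if not (isinstance(bits, int) and bits >= 1):
        raise ValueError("bits must be an int >= 1")
    if not (isinstance(group_size, int) and group_size >= 1):
        raise ValueError("group_size must be an int >= 1")
    if batch_size is not None and not (isinstance(batch_size, int) and batch_size >= 1):
        raise ValueError("batch_size must be an int >= 1 or None")
    if log_level not in (0, 1, 2):
        raise ValueError("log_level must be 0, 1, or 2")
    if ils_enabled:
        # ILS parameters are only validated when ILS is enabled.
        if not (isinstance(ils_num_iterations, int) and ils_num_iterations >= 1):
            raise ValueError("ils_num_iterations must be an int >= 1")
        if not (isinstance(ils_num_clones, int) and ils_num_clones >= 1):
            raise ValueError("ils_num_clones must be an int >= 1")
        if ils_num_channels is not None and not (
            isinstance(ils_num_channels, int) and ils_num_channels >= 1
        ):
            raise ValueError("ils_num_channels must be an int >= 1 or None")
```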
quantize_layer ¶
Quantize the layer.
If matrix_XX and dim_n are provided, the precomputed X^T X is used. Otherwise, matrix_X is computed from input (legacy behavior).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| module | Module | The layer module | required |
| input | tuple or Tensor | The input to the layer (input activations) | None |
| hessian | Tensor | The Hessian matrix (not used in JointQ) | None |
| matrix_XX | Tensor | Precomputed X^T X (FP64). If provided, this is used instead of input. | None |
| dim_n | int | Number of samples. Required when matrix_XX is provided. | None |
Returns:

| Name | Type | Description |
|---|---|---|
|  | JointQResult | JointQ quantization result object |
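Since matrix_XX is a precomputed X^T X accumulated in FP64, the calibration pass can build it batch by batch and hand quantize_layer the accumulated Gram matrix together with dim_n. A hedged NumPy sketch of that accumulation (the helper name is illustrative, not part of the onecomp API):

```python
import numpy as np

def accumulate_xtx(batches):
    """Accumulate X^T X in float64 across calibration batches, as expected
    for the matrix_XX / dim_n arguments: returns (XtX, n) where n is the
    total number of samples seen."""
    XtX, n = None, 0
    for X in batches:
        X64 = X.astype(np.float64)  # accumulate in FP64 for stability
        XtX = X64.T @ X64 if XtX is None else XtX + X64.T @ X64
        n += X64.shape[0]
    return XtX, n
```

The resulting pair would then be passed as quantize_layer(module, matrix_XX=XtX, dim_n=n) instead of raw input activations.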