# DBF Quantizer

## DBF (dataclass)

```python
DBF(name: str = None, num_layers: int = None, calc_quant_error: bool = False, include_layer_names: list[str] = None, exclude_layer_names: list[str] = ['lm_head'], include_layer_keywords: list[str] = None, exclude_layer_keywords: list[str] = None, target_layer_types: tuple = (Linear,), hessian_dtype: dtype = torch.float32, module_to_name: dict = dict(), results: dict = dict(), flag_calibration: bool = True, flag_hessian: bool = True, flag_xtx: bool = False, target_bits: float = 1.5, iters: int = 600, reg: float = 0.03, use_balancing: bool = True, balance_iters: int = 40, balance_alpha: float = 1.0, balance_mode: str = 'l1', use_adaptive_rho: bool = True, mlp_target_bits: Optional[float] = None, module_target_bits: Optional[dict[str, float]] = None)
```

Bases: `Quantizer`

DBF quantizer. Runs DBF (Double Binary Factorization) quantization per layer.
Attributes:

| Name | Type | Description |
|---|---|---|
| `flag_calibration` | `bool` | Calibration mode flag. |
| `flag_hessian` | `bool` | Hessian computation flag. |
| `target_bits` | `float` | Target bit-width (e.g., 1.5). |
| `iters` | `int` | Optimization iterations. |
| `reg` | `float` | Regularization coefficient. |
| `use_balancing` | `bool` | Whether to apply weight balancing. |
| `balance_iters` | `int` | Balancing iterations. |
| `balance_alpha` | `float` | Balancing alpha. |
| `balance_mode` | `str` | Balancing mode (e.g., `"l1"`). |
| `use_adaptive_rho` | `bool` | Whether to adapt ADMM rho. |
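For illustration, a typical configuration using the tuning fields above. The import path is hypothetical (adjust to the actual package layout); the field names and defaults match the dataclass signature.

```python
# Hypothetical import path -- adjust to the actual package layout.
from dbf import DBF

quantizer = DBF(
    target_bits=1.5,       # target bit-width per weight
    iters=600,             # optimization iterations
    reg=0.03,              # regularization coefficient
    use_balancing=True,    # apply weight balancing
    balance_iters=40,      # balancing iterations
    balance_alpha=1.0,     # balancing alpha
    balance_mode="l1",     # "l1" or "l2"
    use_adaptive_rho=True, # adapt ADMM rho during optimization
)
```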
Methods:

| Name | Description |
|---|---|
| `quantize_layer` | Quantizes a given layer using DBF. |
### resolve_bits (staticmethod)

```python
resolve_bits(layer_name: Optional[str], default_bits: float, mlp_bits: Optional[float] = None, module_bits: Optional[dict[str, float]] = None) -> float
```

Resolve the bit-width from overrides, with DBF precedence: module > mlp > default.

Used by the quantizer and by the config loader. If `layer_name` is None, returns `default_bits`. Does not validate the range; the caller may validate.
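The precedence rule (module override > MLP override > default) can be sketched as a standalone function. This is an illustrative re-implementation, not the library's actual code; in particular, identifying MLP layers by an `"mlp"` substring in the name is an assumption.

```python
from typing import Optional

def resolve_bits_sketch(
    layer_name: Optional[str],
    default_bits: float,
    mlp_bits: Optional[float] = None,
    module_bits: Optional[dict[str, float]] = None,
) -> float:
    """Illustrative sketch of the documented precedence: module > mlp > default."""
    if layer_name is None:
        return default_bits
    # A per-module override wins first.
    if module_bits and layer_name in module_bits:
        return module_bits[layer_name]
    # Then the MLP-wide override (assuming MLP layers carry "mlp" in their name).
    if mlp_bits is not None and "mlp" in layer_name:
        return mlp_bits
    return default_bits

# A module-level override beats the MLP-wide override.
print(resolve_bits_sketch("model.layers.0.mlp.up_proj", 1.5,
                          mlp_bits=2.0,
                          module_bits={"model.layers.0.mlp.up_proj": 1.0}))  # -> 1.0
```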
### validate_params

Validate DBF parameters once in `setup()`.

Validated ranges:

- `target_bits`: float > 0
- `iters`: int >= 1
- `reg`: float >= 0
- `balance_iters`: int >= 1 (when `use_balancing=True`)
- `balance_alpha`: float > 0 (when `use_balancing=True`)
- `balance_mode`: str in {"l1", "l2"} (when `use_balancing=True`)
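The range checks can be sketched as plain guard clauses. This mirrors the documented ranges only; the actual `validate_params` implementation may differ.

```python
def validate_dbf_params_sketch(target_bits, iters, reg, use_balancing,
                               balance_iters, balance_alpha, balance_mode):
    """Illustrative validation mirroring the documented ranges."""
    if not target_bits > 0:
        raise ValueError("target_bits must be > 0")
    if iters < 1:
        raise ValueError("iters must be >= 1")
    if reg < 0:
        raise ValueError("reg must be >= 0")
    if use_balancing:
        # Balancing-specific parameters are only checked when balancing is on.
        if balance_iters < 1:
            raise ValueError("balance_iters must be >= 1 when use_balancing=True")
        if not balance_alpha > 0:
            raise ValueError("balance_alpha must be > 0 when use_balancing=True")
        if balance_mode not in {"l1", "l2"}:
            raise ValueError('balance_mode must be "l1" or "l2" when use_balancing=True')

validate_dbf_params_sketch(1.5, 600, 0.03, True, 40, 1.0, "l1")  # passes silently
```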
### quantize_layer

Quantize the layer.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `module` | `Module` | The layer module. | required |
| `input` | `tuple` or `Tensor` | The input to the layer (activations). | `None` |
| `hessian` | `Tensor` | The Hessian matrix. | `None` |
Returns:

| Name | Type | Description |
|---|---|---|
| `DBFResult` | `DBFResult` | DBF quantization result object containing quantized weights and parameters. |
### get_quant_config

Return the `quantization_config` dict for `save_quantized_model`. All keys are at the top level (`quant_method`, `bits`, `iters`, `reg`, etc.).

### create_inference_layer

Build a `DoubleBinaryLinear` layer from a `DBFResult`.
## DBFResult (dataclass)

```python
DBFResult(dequantized_weight: Tensor = None, quantization_time: float = None, output_squared_error: float = None, mean_output_squared_error: float = None, weight_squared_error: float = None, mean_weight_squared_error: float = None, relative_output_squared_error: float = None, relative_weight_squared_error: float = None, target_bits: float = None, iters: int = None, reg: float = None, use_balancing: bool = None, balance_iters: int = None, balance_alpha: float = None, balance_mode: str = None, use_adaptive_rho: bool = None, is_dbf_quantized: Optional[bool] = None, dbf_Da: Optional[Tensor] = None, dbf_A: Optional[Tensor] = None, dbf_mid: Optional[Tensor] = None, dbf_B: Optional[Tensor] = None, dbf_Db: Optional[Tensor] = None)
```

Bases: `QuantizationResult`

DBF quantization result.
Attributes:

| Name | Type | Description |
|---|---|---|
| `target_bits` | `float` | Target bit-width (e.g., 1.5). |
| `iters` | `int` | Optimization iterations. |
| `reg` | `float` | Regularization coefficient. |
| `use_balancing` | `bool` | Whether to apply weight balancing. |
| `balance_iters` | `int` | Balancing iterations. |
| `balance_alpha` | `float` | Balancing alpha. |
| `balance_mode` | `str` | Balancing mode. |
| `use_adaptive_rho` | `bool` | Whether to adapt ADMM rho. |
| `is_dbf_quantized` | `Optional[bool]` | Whether DBF quantization was applied. |
| `dbf_Da` | `Optional[Tensor]` | Scaling vector paired with A. |
| `dbf_A` | `Optional[Tensor]` | Binary A matrix. |
| `dbf_mid` | `Optional[Tensor]` | Middle scaling vector. |
| `dbf_B` | `Optional[Tensor]` | Binary B matrix. |
| `dbf_Db` | `Optional[Tensor]` | Scaling vector paired with B. |
### compute_dequantized_weight

Compute the dequantized weight from the quantized data and quantization parameters.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `device` | `str` or `device` | Device to compute on. | `None` |

Returns:

| Type | Description |
|---|---|
| `Tensor` | Dequantized weight tensor (FP16, CPU). |
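A sketch of what this reconstruction might look like. The factorization shape `W ≈ diag(Da) · A · diag(mid) · B · diag(Db)`, with `A` and `B` binary (±1), is inferred from the attribute names above and is an assumption, not confirmed by this reference.

```python
import torch

def dequantize_dbf_sketch(Da, A, mid, B, Db):
    """Illustrative reconstruction of the dense weight from DBF factors.

    Assumed factorization (inferred from the attribute names):
        W ~= diag(Da) @ A @ diag(mid) @ B @ diag(Db)
    with A (out, r) and B (r, in) binary (+1/-1) matrices.
    """
    # Row-scale A by Da, row-scale B by mid, then multiply.
    W = (Da.unsqueeze(1) * A.float()) @ (mid.unsqueeze(1) * B.float())
    # Column-scale by Db; return FP16 on CPU, as documented.
    return (W * Db.unsqueeze(0)).half().cpu()

# Tiny shape check with random factors.
out_f, rank, in_f = 4, 3, 5
Da = torch.rand(out_f)
A = torch.randint(0, 2, (out_f, rank)).float() * 2 - 1   # entries in {-1, +1}
mid = torch.rand(rank)
B = torch.randint(0, 2, (rank, in_f)).float() * 2 - 1    # entries in {-1, +1}
Db = torch.rand(in_f)
print(dequantize_dbf_sketch(Da, A, mid, B, Db).shape)  # -> torch.Size([4, 5])
```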