
Installation

This page describes how to install Fujitsu One Compression (OneComp).

Requirements

  • Python 3.12 or later (< 3.14)
  • PyTorch (CPU, CUDA, or MPS on macOS)
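
The Python version requirement can be checked from the interpreter before installing anything. The snippet below is a minimal sketch; the helper name `python_version_supported` is hypothetical and not part of OneComp.

```python
import sys

# Bounds taken from the Requirements list: 3.12 <= version < 3.14.
MIN_VERSION = (3, 12)
MAX_VERSION_EXCLUSIVE = (3, 14)


def python_version_supported(version=None):
    """Return True if the (major, minor) pair is in the supported range.

    `version` may be any tuple-like value such as sys.version_info;
    it defaults to the running interpreter.
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


# Report whether the current interpreter satisfies the requirement.
print(python_version_supported())
```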

For Users (pip)

Step 1: Install PyTorch

Install the PyTorch build that matches your system. Run only one of the following commands.

# CPU only
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
# CUDA 11.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# CUDA 12.1
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# CUDA 12.4
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
# CUDA 12.6
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
# CUDA 12.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
# CUDA 13.0
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu130
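
The install commands differ only in the last path segment of the index URL. For scripted setups, a small helper can build the command from a variant name. This is an illustrative sketch; `pytorch_install_command` is a hypothetical name and the variant list simply mirrors the commands on this page.

```python
PYTORCH_INDEX_BASE = "https://download.pytorch.org/whl"

# Variants listed on this page; other index paths may exist but are not covered here.
VARIANTS = ("cpu", "cu118", "cu121", "cu124", "cu126", "cu128", "cu130")


def pytorch_install_command(variant):
    """Return the pip command string for one of the listed PyTorch variants."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {VARIANTS}")
    return (
        "pip install torch torchvision torchaudio "
        f"--index-url {PYTORCH_INDEX_BASE}/{variant}"
    )


print(pytorch_install_command("cu128"))
```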

On macOS, install PyTorch from PyPI (default wheels include MPS support). You do not need the CUDA index URLs above.

pip install torch torchvision torchaudio

Verify MPS:

import torch
print(torch.backends.mps.is_available())

Then install OneComp (Step 2 below). GPTQ quantization and Hugging Face generate() inference on MPS are supported; vLLM serving requires Linux with an NVIDIA GPU. An editable install from a git clone is not required for MPS use. See For Developers (pip) only if you are contributing to OneComp.

For usage (device="mps", VRAM budget, limitations), see the macOS / MPS guide.

Check your CUDA version (Linux / Windows with NVIDIA GPU):

nvcc --version
# or
nvidia-smi

Verify PyTorch GPU support (CUDA):

import torch
print(torch.cuda.is_available())
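
The CUDA and MPS checks can be combined into one device-selection step. The sketch below assumes a preference order of CUDA, then MPS, then CPU; that order is an assumption for illustration, not something OneComp prescribes, and the function names are hypothetical.

```python
def pick_device(cuda_available, mps_available):
    """Choose a device string from two availability flags.

    Assumed preference order: "cuda", then "mps", then "cpu".
    """
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device():
    """Query PyTorch for the flags; return None if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return None
    return pick_device(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )


print(detect_device())
```

Keeping `pick_device` free of any PyTorch import makes the selection logic easy to test on machines without a GPU.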

Step 2: Install OneComp

pip install onecomp

To enable visualization features (matplotlib), install with the visualize extra:

pip install "onecomp[visualize]"

To enable multi-GPU training features (DeepSpeed), install with the distributed extra:

pip install "onecomp[distributed]"

For Developers (uv)

uv is a fast Python package and project manager written in Rust. It provides deterministic, reproducible environments via its lockfile.

# Install uv (macOS or Linux)
curl -LsSf https://astral.sh/uv/install.sh | sh

git clone https://github.com/FujitsuResearch/OneCompression.git
cd OneCompression

The uv sync command creates a virtual environment and installs all dependencies.

Linux (CUDA quantization / vLLM)

uv sync --extra cu128 --extra dev --extra visualize

The --extra cu128 option installs the CUDA-enabled version of PyTorch (along with torchvision from the same CUDA index). Replace cu128 with the appropriate variant for your environment: cpu, cu118, cu121, cu124, cu126, cu128, or cu130. PyTorch will be automatically downloaded by uv, so you do not need to install it beforehand.

macOS (development / MPS inference)

uv sync --extra mps --extra dev --extra visualize

On macOS, use --extra mps only. CUDA extras (cu118 through cu130), --extra cpu (Linux-only), and --extra vllm are not supported on macOS. After uv sync, you can run GPTQ quantization and Hugging Face generate() inference on MPS; vLLM serving still requires Linux with an NVIDIA GPU. See the macOS / MPS guide for device placement and usage details.

Adding --extra dev installs development tools (black, pytest, pylint). Adding --extra visualize installs matplotlib for visualization features. Adding --extra distributed installs DeepSpeed for multi-GPU training.

To use vLLM for serving quantized models on Linux, add --extra vllm together with --extra cu130:

uv sync --extra cu130 --extra dev --extra visualize --extra vllm

vLLM requires the cu130 extra

Recent vLLM releases depend on torch>=2.10, whose wheels are only published for the cu130 index. The --extra vllm declaration in pyproject.toml therefore conflicts with cpu, mps, cu118, cu121, cu124, cu126, and cu128; combining any of these with --extra vllm is rejected by uv at lock time.
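
The extra-combination rules described on this page can be restated as a small pre-check to run before calling uv sync. This is only a sketch of the documented rules, not a reimplementation of uv's resolver; `check_extras` and its `platform` parameter are hypothetical names.

```python
# PyTorch-selecting extras named on this page.
TORCH_EXTRAS = {"cpu", "mps", "cu118", "cu121", "cu124", "cu126", "cu128", "cu130"}
VLLM_TORCH_EXTRA = "cu130"


def check_extras(extras, platform="linux"):
    """Return a list of human-readable problems for a set of uv extras.

    An empty list means the combination matches the rules documented here.
    `platform` is "linux" or "macos" (an illustrative parameter).
    """
    chosen = set(extras)
    torch_extras = chosen & TORCH_EXTRAS
    problems = []

    if "vllm" in chosen:
        conflicting = sorted(torch_extras - {VLLM_TORCH_EXTRA})
        if conflicting:
            problems.append(f"vllm conflicts with: {', '.join(conflicting)}")
        if VLLM_TORCH_EXTRA not in chosen:
            problems.append("vllm requires the cu130 extra")

    if platform == "macos":
        unsupported = sorted((torch_extras - {"mps"}) | (chosen & {"vllm"}))
        if unsupported:
            problems.append(f"not supported on macOS: {', '.join(unsupported)}")

    return problems


print(check_extras(["cu128", "dev", "vllm"]))
```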

vLLM 0.22+ is not supported

vLLM 0.22.0 removed the legacy Exllama GPTQ kernel that OneComp's GPTQ serving relies on for low bit-widths (2-/3-bit, and Marlin-ineligible 4-/8-bit), so pyproject.toml pins vllm>=0.10,<0.22. See vLLM Inference for details.
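
To confirm that an environment respects the vllm>=0.10,<0.22 pin, the installed version can be compared against those bounds. The sketch below handles only plain dotted release numbers such as 0.21.3; pre-release or local version suffixes are not parsed, and the function names are hypothetical.

```python
from importlib import metadata


def parse_release(version):
    """Turn a plain dotted version like '0.21.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def vllm_version_supported(version):
    """Return True if `version` satisfies >=0.10 and <0.22."""
    return (0, 10) <= parse_release(version) < (0, 22)


def installed_vllm_version():
    """Return the installed vllm version string, or None if it is absent."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


print(installed_vllm_version())
```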

Warning

Do not install vLLM with uv pip install vllm after uv sync. Packages installed via uv pip are not tracked by the lockfile and will be removed or overwritten by subsequent uv sync or uv run commands. Always use --extra vllm instead.

Running Commands

# Option 1: run through uv (no activation needed)
uv run onecomp --version
uv run onecomp TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
uv run pytest tests/ -v
uv run python example/example_gptq.py

# Option 2: activate the virtual environment first
source .venv/bin/activate
onecomp --version
onecomp TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
pytest tests/ -v
python example/example_gptq.py

For Developers (pip)

Note

The editable install below is for developing OneComp from a local clone. macOS users who only want MPS inference or quantization should use the For Users (pip) flow (pip install torch then pip install onecomp from PyPI); pip install -e is not needed for MPS.

git clone https://github.com/FujitsuResearch/OneCompression.git
cd OneCompression

# First, install PyTorch for your environment
pip install torch --index-url https://download.pytorch.org/whl/cu128
# Then install onecomp with development dependencies
pip install -e ".[dev]"

Replace cu128 with the appropriate variant for your environment: cpu, cu118, cu121, cu124, cu126, cu128, or cu130. On macOS, install PyTorch from PyPI instead (see macOS (MPS) above).

Building Documentation Locally

--extra docs alone is enough. PyTorch extras (mps, cu*, cpu) are not required to build or serve the documentation.

uv sync --extra docs
uv run mkdocs serve

Then open http://127.0.0.1:8000 in your browser.