Open-source AI compression
smaller. faster. open.
Prune, quantize, and distill neural networks. Ship models that are smaller and faster, on any hardware.
from fasterai.prune.all import *
from fasterbench import benchmark
pruner = Pruner(model, 50, 'global', large_final)  # 50% sparsity, ranked globally, large_final criterion
pruner.prune_model()
benchmark(model, dummy).summary()

faster
smaller
less CO₂
Compared to the original uncompressed model. Best-case results from combined pruning, quantization, and distillation.
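The `Pruner(model, 50, 'global', large_final)` call in the snippet above asks for 50% sparsity, ranking weights by magnitude across all layers at once. The idea behind that global criterion can be shown without any framework; the function and toy weights below are an illustrative sketch, not fasterai's API:

```python
def prune_global(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights across all layers."""
    flat = sorted(abs(w) for layer in weights for w in layer)
    k = int(len(flat) * sparsity)               # number of weights to drop
    if k == 0:
        return [layer[:] for layer in weights]  # nothing to prune
    threshold = flat[k - 1]                     # k-th smallest magnitude
    # Keep a weight only if its magnitude exceeds the threshold
    return [[0.0 if abs(w) <= threshold else w for w in layer]
            for layer in weights]

weights = [[0.9, -0.05, 0.4], [-0.7, 0.1, 0.02]]
pruned = prune_global(weights, 0.5)  # 50% global sparsity
# → [[0.9, 0.0, 0.4], [-0.7, 0.0, 0.0]]
```

Because the threshold is computed over the whole network rather than per layer, heavily over-parameterized layers end up more sparse than sensitive ones.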
Choose your path
Two ways to optimize.
Use our open-source tools yourself, or let us handle it for you.
DIY
Open-source, Apache 2.0 licensed
Pruning, quantization, distillation, benchmarking
Full documentation and tutorials
Community support via Discord
Browse Libraries
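Quantization, the second technique in the list above, trades precision for size: float32 weights become 8-bit integers plus one scale factor. A minimal, framework-free sketch of uniform symmetric quantization (the names and toy weights are illustrative, not fasterai's API):

```python
def quantize_int8(weights):
    """Uniform symmetric quantization: floats -> int8 values plus a scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # guard all-zero input
    q = [round(w / scale) for w in weights]            # ints in [-127, 127]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the int8 values."""
    return [qi * scale for qi in q]

w = [0.9, -0.05, 0.4, -0.7]
q, scale = quantize_int8(w)
restored = dequantize(q, scale)  # each value within scale/2 of the original
```

Storing `q` as int8 cuts the weight footprint 4× versus float32, at the cost of a rounding error of at most half the scale per weight.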
Use our tools
Done for you
We audit your model and recommend a compression strategy
We apply our proprietary optimization pipeline
We deliver a production-ready compressed model
Typical results: 3–10× speedup, minimal accuracy loss
Book a Call