FRAPPE: Full Input, Residual Output Autoencoding with Projection Pursuit Encoder
Abstract
A novel autoencoding framework called FRAPPE uses a projection pursuit encoder to predict residuals from full input, enabling efficient variable-rate image compression with fast CPU-based encoding.
Media compression standards have reached a plateau in terms of the rate-distortion-complexity trade-off, limiting the ability to offload expensive AI perception to the cloud in applications like robotics, wearables, and remote sensing. DNN-based codecs improve compression efficiency, but at a cost: they cannot easily adapt to large changes in available bitrate, and real-time encoding requires expensive, power-hungry GPUs that prohibit use on low-cost or resource-constrained platforms. To address these limitations, we propose a novel autoencoding framework (FRAPPE) that uses the Full input to predict the Residual output via a Projection Pursuit Encoder. FRAPPE's encoding objective naturally sorts latent channels by importance, allowing zero-overhead variable-rate coding. Unlike RNN-based learned codecs, whose encoder consumes the previous reconstruction's residual, or RVQ-style codecs, whose codebooks must be applied sequentially, FRAPPE's analysis path is an embarrassingly parallel DAG of independent input projections. Using FRAPPE, we build a variable-rate RGB image codec (FRAPPE-Image), and evaluate its rate-distortion-complexity trade-off against standard image codecs. At high compression ratios (approx. 0.1 bpp) FRAPPE-Image provides higher perceptual quality than AVIF with 47 times faster encoding, making it capable of real-time 1080p, 30fps CPU-only encoding. Our code and pre-trained models are available: https://github.com/UT-SysML/FRAPPE .
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Spatial Competition for Low-Complexity Learned Image Compression (2026)
- Differentiable Vector Quantization for Rate-Distortion Optimization of Generative Image Compression (2026)
- What Matters in Practical Learned Image Compression (2026)
- Dual-Latent Collaborative Decoding for Fidelity-Perception Balanced Image Compression (2026)
- CATRF: Codec-Adaptive TriPlane Radiance Fields for Volumetric Content Delivery (2026)
- DiV-INR: Extreme Low-Bitrate Diffusion Video Compression with INR Conditioning (2026)
- ViTok-v2: Scaling Native Resolution Auto-Encoders to 5 Billion Parameters (2026)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any Paper on Hugging Face checkout this Space
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend
Get this paper in your agent:
hf papers read 2605.28992 Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash Models citing this paper 0
No model linking this paper
Datasets citing this paper 0
No dataset linking this paper
Spaces citing this paper 0
No Space linking this paper
Collections including this paper 0
No Collection including this paper