This repository contains code and data for "Learning to Compress Prompts with Gist Tokens." Training results are reproducible only with DeepSpeed version 0.8.3. For some (currently unknown) reason, ...
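Because results depend on one exact DeepSpeed release, it can help to check the installed version before launching training. The following is a minimal sketch, not part of the repository: the helper name `check_pinned_version` and the constant `REQUIRED_DEEPSPEED` are hypothetical, and it assumes DeepSpeed is installed under the distribution name `deepspeed`.

```python
from importlib import metadata

# Version stated in the README; the constant name is illustrative only.
REQUIRED_DEEPSPEED = "0.8.3"


def check_pinned_version(package: str, required: str):
    """Return (matches, installed_version).

    installed_version is None when the package is not installed.
    """
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False, None
    return installed == required, installed


if __name__ == "__main__":
    ok, installed = check_pinned_version("deepspeed", REQUIRED_DEEPSPEED)
    if not ok:
        print(f"Expected deepspeed=={REQUIRED_DEEPSPEED}, found {installed}")
```

The check compares version strings exactly rather than accepting a range, since only 0.8.3 is stated to reproduce the results.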
Abstract: Domain-specific hardware is becoming a promising topic against the backdrop of slowing improvement in general-purpose processors, driven by the foreseeable end of Moore's Law. Machine learning, ...