Abstract: An improved variant of the precise-integration time-domain (PITD) method is proposed to eliminate the inverse matrix calculation and optimize the storage burden with the help of sparse ...
Vector Post-Training Quantization (VPTQ) is a novel Post-Training Quantization method that leverages Vector Quantization to high accuracy on LLMs at an extremely low bit-width (<2-bit). VPTQ can ...
Abstract: Long Short-Term Memory (LSTM) and its variants have been widely adopted in many sequential learning tasks, such as speech recognition and machine translation. The low-latency and ...