vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.23.1rc0, vLLM's /v1/audio/transcriptions endpoint limits compressed upload size but not decoded PCM output. A 25MB OPUS file expands to ~14.9GB of float32 PCM at decode time. This vulnerability is fixed in 0.23.1rc0.
Add your gear to cvedb and we'll alert you only when a vendor you run ships something exploited.
Check my exposure →This product uses data from the NVD API but is not endorsed or certified by the NVD. Informational only; not professional security advice.