vLLM is a library for LLM inference and serving. vllm/model_executor/weight_utils.py implements hf_model_weights_iterator to load the model checkpoint, which is downloaded from huggingface. It uses the torch.load function and the weights_only parameter defaults to False. When torch.load loads malicious pickle data, it will execute arbitrary code during unpickling. This vulnerability is fixed in v0.7.0.
Add your gear to cvedb and we'll alert you only when vllm ships something exploited.
Check my exposure →This product uses data from the NVD API but is not endorsed or certified by the NVD. Informational only; not professional security advice.