llama.cpp is an inference of several LLM models in C/C++. Prior to version b5721, there is a signed vs. unsigned integer overflow in llama.cpp's tokenizer implementation (llama_vocab::tokenize) (src/llama-vocab.cpp:3036) resulting in unintended behavior in tokens copying size comparison. Allowing heap-overflowing llama.cpp inferencing engine with carefully manipulated text input during tokenization process. This issue has been patched in version b5721.
Add your gear to cvedb and we'll alert you only when ggml ships something exploited.
Check my exposure →This product uses data from the NVD API but is not endorsed or certified by the NVD. Informational only; not professional security advice.