Open-source C/C++ software for running and serving large language models locally, with support for GGUF models and efficient inference on CPUs and hardware accelerators.

