Puma aims to be a lightweight, high-performance inference engine for heterogeneous devices. It is currently under active development.
Run make build to build the puma binary. Run ./puma help to list all available commands. For example, running ./puma version prints the binary version.
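The steps above can be collected into a short session. This is a sketch based only on the commands mentioned here; it assumes the repository provides a Makefile with a build target that produces a puma binary in the current directory.

```shell
# Build the puma binary (assumes a Makefile with a `build` target)
make build

# List all available commands
./puma help

# Example: print the binary version
./puma version
```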
llama.cpp is used as the default backend for quick prototyping; a custom backend will be implemented in the future.