This project is licensed under Apache 2.0 with The Commons Clause.
The Helicone Helm chart deploys a complete Helicone stack including web interface, API, OpenAI proxy, and supporting services.
- Use `values.example.yaml` as your starting point
  - Copy `values.example.yaml` to `values.yaml` to create your configuration
  - The example file is configured with a standard setup that routes all services through a single domain
  - Customize the domain and other settings to match your environment
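The copy step above is simply (run from the chart directory):

```shell
cp values.example.yaml values.yaml
```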
- Ingress Configuration
  - The main ingress configuration is in the `extraObjects` section at the bottom of the values file
  - This creates a single ingress that routes to different services based on path:
    - `/` - Web interface
    - `/jawn(/|$)(.*)` - Jawn service
    - `/oai(/|$)(.*)` - OpenAI proxy
    - `/api2(/|$)(.*)` - API service
    - `/supabase(/|$)(.*)` - Supabase/Kong
  - You should only need to change the `host` value to your domain
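The `extraObjects` ingress takes roughly this shape. This is a sketch only: the metadata name, backend service name, and port number below are illustrative placeholders, so check the actual entries in `values.example.yaml`.

```yaml
extraObjects:
  - apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: helicone-ingress            # illustrative name
      annotations:
        kubernetes.io/ingress.class: nginx
        nginx.ingress.kubernetes.io/rewrite-target: /$2
    spec:
      rules:
        - host: helicone.example.com    # change to your domain
          http:
            paths:
              - path: /oai(/|$)(.*)
                pathType: ImplementationSpecific
                backend:
                  service:
                    name: helicone-oai  # illustrative service name and port
                    port:
                      number: 80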
- Accessing the Web Interface
  - Once deployed, the web interface will be accessible at your configured domain
  - No port-forwarding is needed when ingress is properly configured
- Understanding the Routing Strategy
  - All Helicone services are accessed through a single domain with different path prefixes
  - Example URLs for the domain `helicone.example.com`:
    - Web UI: `https://helicone.example.com/`
    - OpenAI Proxy: `https://helicone.example.com/oai/v1/chat/completions`
    - API: `https://helicone.example.com/api2/v1/...`
    - Supabase: `https://helicone.example.com/supabase/`
    - Jawn: `https://helicone.example.com/jawn/`
  - This routing is configured in the `extraObjects` section of the values file
  - Individual service ingress configurations are disabled by default as they're not needed
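Keeping the per-service ingresses disabled looks roughly like this in the values file (a sketch; the exact keys depend on the chart's values schema, so verify against `values.example.yaml`):

```yaml
helicone:
  web:
    ingress:
      enabled: false   # traffic reaches the web UI via the extraObjects ingress
  oai:
    ingress:
      enabled: false
```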
- Supabase Studio Configuration
  - Supabase Studio can be accessed through the main domain at `/supabase`
  - If you prefer a separate domain for Supabase Studio, you can enable its dedicated ingress:

    ```yaml
    supabase:
      studio:
        ingress:
          enabled: true
          hostname: "studio-your-domain.com"
          annotations:
            kubernetes.io/ingress.class: nginx
            cert-manager.io/cluster-issuer: letsencrypt-prod
          tls: true
    ```

  - This configuration has been tested and works well with cert-manager and TLS
- S3 Configuration
  - Create a bucket with your cloud provider
  - For GCP, open the bucket's Interoperability section and create an access key
  - Create the required secret:

    ```shell
    # For GCP
    kubectl -n default create secret generic helicone-s3 \
      --from-literal=access_key='' \
      --from-literal=bucket_name='helicone-bucket' \
      --from-literal=endpoint='https://storage.googleapis.com' \
      --from-literal=secret_key=''
    ```

    ```shell
    # For MinIO (example)
    kubectl -n default create secret generic helicone-s3 \
      --from-literal=access_key='minio' \
      --from-literal=bucket_name='request-response-storage' \
      --from-literal=endpoint='http://localhost:9000' \
      --from-literal=secret_key='minioadmin'
    ```

  - Configure CORS for your bucket using the provided `bucketCorsConfig.json` file
By default, the Helm chart uses your cluster's default StorageClass for both ClickHouse and PostgreSQL (managed by Supabase). You can override this behavior by specifying storage classes in your values file:

```yaml
# For ClickHouse storage
helicone:
  clickhouse:
    persistence:
      storageClass: "your-clickhouse-storage-class"

# For PostgreSQL (Supabase) storage
supabase:
  postgresql:
    primary:
      persistence:
        storageClass: "your-postgres-storage-class"
  storage:
    persistence:
      storageClass: "your-storage-storage-class"
```

This allows you to use storage classes optimized for database workloads or to meet specific requirements of your environment.
Google Cloud's Artifact Registry is used to store the Helm chart. Follow the steps below to release a new version of the chart; see Google's documentation for background.
```shell
gcloud auth application-default login
gcloud container clusters get-credentials helicone --location us-west1-b
```
- Create a new GKE cluster with the following command:

  ```shell
  gcloud container clusters create helicone \
    --enable-stackdriver-kubernetes \
    --subnetwork default \
    --num-nodes 1 \
    --machine-type e2-standard-8 \
    --zone us-west1-b
  ```
- Install the chart with the following command:

  ```shell
  helm install helicone ./
  ```
- Connect via K9s and verify the pods are running:

  ```shell
  k9s -n default
  ```
- Port forward to the following services:
  - web
  - oai
  - api
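A sketch of those port-forwards, assuming default release and service names. The names (`helicone-web`, `helicone-oai`, `helicone-api`) and port numbers below are placeholders; check `kubectl get svc -n default` for the real ones.

```shell
# Forward each service to a local port; run in separate terminals or background them
kubectl -n default port-forward svc/helicone-web 3000:3000 &
kubectl -n default port-forward svc/helicone-oai 8787:8787 &
kubectl -n default port-forward svc/helicone-api 8788:8788 &
```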
- Send a request to the oai and api services and verify the requests show up in the web interface.
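For example, assuming the oai service has been port-forwarded to `localhost:8787` (the port is a placeholder):

```shell
curl -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_OPENAI_API_KEY" \
  -H "Helicone-Auth: Bearer YOUR_HELICONE_API_KEY" \
  http://localhost:8787/v1/chat/completions \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "ping"}]}'
```

The logged request should then appear in the web UI.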
- If everything is working as expected, delete the cluster with the following command.

  Important: as the cluster is expensive to run, remember to delete it after testing.

  ```shell
  gcloud container clusters delete helicone
  ```
- Increase the number of nodes in the cluster:

  ```shell
  gcloud container clusters resize helicone --node-pool default-pool --num-nodes [NUM_NODES]
  ```
- Upgrade the helm chart:

  ```shell
  helm upgrade helicone ./ -f values.yaml
  ```
- Decrease the number of nodes in the cluster:

  ```shell
  gcloud container clusters resize helicone --node-pool default-pool --num-nodes 0
  ```
- Update the `Chart.yaml` file with the new version number.
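  A sketch of the fields to bump in `Chart.yaml` (the version numbers are placeholders):

  ```yaml
  apiVersion: v2
  name: helicone
  version: 1.2.3        # chart version: bump on every release
  appVersion: "1.2.3"   # version of the app the chart deploys, if tracked
  ```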
- Package the chart with:

  ```shell
  helm package .
  ```
- Authenticate:

  ```shell
  gcloud auth print-access-token | helm registry login -u oauth2accesstoken \
    --password-stdin https://us-central1-docker.pkg.dev
  ```
- Push the chart to the repository with:

  ```shell
  helm push helicone-[VERSION].tgz oci://us-central1-docker.pkg.dev/helicone-416918/helicone-helm
  ```
- Notify the consumers of the new version.
- Auth with gcloud docker:

  ```shell
  gcloud auth configure-docker us-central1-docker.pkg.dev
  ```
- Configure helm auth:

  ```shell
  gcloud auth application-default print-access-token | helm registry login -u oauth2accesstoken \
    --password-stdin https://us-central1-docker.pkg.dev
  ```
- Or, to impersonate a service account:

  ```shell
  gcloud auth application-default print-access-token \
    --impersonate-service-account=SERVICE_ACCOUNT | helm registry login -u oauth2accesstoken \
    --password-stdin https://us-central1-docker.pkg.dev
  ```
- Pull the chart locally:

  ```shell
  helm pull oci://us-central1-docker.pkg.dev/helicone-416918/helicone-helm/helicone \
    --version [VERSION] \
    --untar
  ```
- To install directly from the OCI registry:

  ```shell
  helm install helicone oci://us-central1-docker.pkg.dev/helicone-416918/helicone-helm/helicone \
    --version [VERSION]
  ```
- Add CORS for the S3 bucket:

  ```shell
  gcloud storage buckets update gs://<BUCKET_NAME> --cors-file=bucketCorsConfig.json
  ```
The following steps will help you deploy Helicone on Google Kubernetes Engine (GKE):
```shell
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm upgrade --install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true
```
Create a file named `prod_issuer.yaml` with the following content:
```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
  namespace: cert-manager
spec:
  acme:
    # The ACME server URL
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: [email protected]
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-prod
    # Enable the HTTP-01 challenge provider
    solvers:
    - http01:
        ingress:
          class: nginx
```
Then apply it:
```shell
kubectl apply -f prod_issuer.yaml
```
```shell
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install nginx ingress-nginx/ingress-nginx \
  --namespace nginx \
  --create-namespace \
  --set rbac.create=true \
  --set controller.publishService.enabled=true
```
```shell
helm upgrade helicone ./ -f values.yaml --install
```
Important Note: Ensure your domain's A record is pointing to the load balancer IP address that is assigned to your ingress.
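To find that IP, list the services in the ingress controller's namespace (`nginx` matches the install command above) and read the `EXTERNAL-IP` column of the LoadBalancer service:

```shell
kubectl get svc -n nginx
```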
Once your Helicone instance is deployed and accessible, you can use it to proxy and log LLM API calls. Here are example requests:
```shell
curl -k -H "Accept: application/json" \
  -H "Accept-Encoding: identity" \
  -H "Authorization: Bearer YOUR_OPENAI_API_KEY" \
  -H "Helicone-Auth: Bearer YOUR_HELICONE_API_KEY" \
  -H "Content-Type: application/json" \
  https://your-domain.com/oai/v1/chat/completions \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello, tell me a short joke"}]
  }'
```
Helicone also provides a gateway for more advanced routing and experimentation:
```shell
curl -k -H "Accept: application/json" \
  -H "Accept-Encoding: identity" \
  -H "Authorization: Bearer YOUR_OPENAI_API_KEY" \
  -H "Helicone-Auth: Bearer YOUR_HELICONE_API_KEY" \
  -H "Content-Type: application/json" \
  https://your-domain.com/jawn/v1/gateway/oai/v1/chat/completions \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello, tell me a short joke about programming"}]
  }'
```
Key points for making API requests:
- Use the `/jawn/v1/gateway` path prefix for requests through the gateway
- Add the `Accept-Encoding: identity` header to prevent binary/compressed responses