🌇 Sunset Kubernetes deployments
This page covers our PostHog Kubernetes deployment, which we sunset and no longer support. We will continue to provide security updates for Kubernetes deployments until at least May 31, 2024.
For existing customers
We highly recommend migrating to PostHog Cloud (US or EU). Take a look at this guide for more information on the migration process.
Looking to continue self-hosting?
We still maintain our open-source Docker Compose deployment. Instructions for deploying can be found here.
If you are looking for routine procedures and operations to manage PostHog installations, such as starting, stopping, monitoring, and debugging a PostHog infrastructure, please take a look at the runbook section.
Troubleshooting
Helm failed for not enough resources
While running helm upgrade --install, you might run into an error like timed out waiting for the condition.
One of the potential causes is that Kubernetes doesn't have enough resources to schedule all the services PostHog needs to run. To know if resources are a problem we can check pod status and errors while the helm command is still running:
- Check the output of kubectl get pods -n posthog. If any pods stay pending for a long time, that could be the problem.
- Check whether the pending pod has scheduling errors using kubectl describe pod <podname> -n posthog. For example, at the end of the events section we could see that we didn't have enough memory to schedule the pod.
How to fix this: add more nodes to your Kubernetes cluster.
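The pending-pod check above can be narrowed with a small filter. This is a minimal sketch rather than part of the PostHog tooling: the function name pending_pods is hypothetical, and it assumes the default kubectl get pods column layout (NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print
# the names of pods whose STATUS column is "Pending".
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
pending_pods() {
  awk 'NR > 1 && $3 == "Pending" { print $1 }'
}

# Usage (requires access to the cluster):
#   kubectl get pods -n posthog | pending_pods
```

Each name it prints can then be passed to kubectl describe pod to read the scheduling events.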
Connection is not secure
First, check that DNS is set up properly:
Note that when using a browser, there are various layers of caching and other logic that could make the resolution work (temporarily) even if it's not correctly set up.
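Because browser and operating-system caches can mask a broken record, it can help to query a public resolver directly. The following is a minimal sketch, assuming dig is installed; the function name resolve_via, the hostname posthog.example.com, and the choice of 1.1.1.1 as the resolver are illustrative assumptions.

```shell
# Hypothetical helper: resolve a hostname against a specific DNS server,
# bypassing browser and local resolver caches.
#   $1 = hostname of your PostHog instance
#   $2 = resolver IP (defaults to 1.1.1.1, an arbitrary public resolver)
resolve_via() {
  dig +short "$1" "@${2:-1.1.1.1}"
}

# Usage: the printed address should match your ingress / load balancer IP.
#   resolve_via posthog.example.com
```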
Kafka crash looping (disk full)
You might see an error similar to this one in the Kafka pod:
This tells us that the data disk is full. To resize the disk, please follow the runbook.
Why did we run into this problem and how to avoid it in the future?
There isn't a way for us to say "if there's less than X% of disk space left, then nuke the oldest data". Instead, we have two conditions that restrict when data can be deleted:
- size (logRetentionBytes: _22_000_000_000) for the minimum size of data on disk before allowed deletion.
- time (logRetentionHours: 24) for the minimum age before allowed deletion.
We need to configure these well, but a disk monitoring utility can help catch this problem before we end up in a crash loop.
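As a rough illustration of the disk monitoring idea, the sketch below checks the used percentage reported by df -P against a threshold. The function name disk_usage_ok and the 80% default are assumptions, and the Kafka pod name and data path in the usage comment are placeholders you need to fill in for your cluster.

```shell
# Hypothetical helper: read `df -P <path>` output on stdin and fail
# (non-zero exit) when the used percentage reaches the threshold.
#   $1 = threshold in percent (defaults to 80, an arbitrary choice)
disk_usage_ok() {
  awk -v limit="${1:-80}" '
    NR == 2 {
      used = $5
      sub(/%/, "", used)   # strip the percent sign from the Capacity column
      if (used + 0 >= limit + 0) { bad = 1 }
    }
    END { exit bad }
  '
}

# Usage (requires access to the cluster; pod name and path are placeholders):
#   kubectl exec -n posthog <kafka-pod> -- df -P <kafka-data-path> \
#     | disk_usage_ok 80 || echo "Kafka data disk is filling up"
```

Running a check like this on a schedule gives a warning before Kafka runs out of space, leaving time to resize the disk or tighten the retention settings.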
See more in these Stack Overflow questions (1, 2, 3).
Upgrade failed due to cert-manager conflicts
If a deploy fails with the following error:
The issue might be that the cert-manager custom resource definitions already exist and cannot be upgraded.
Try running helm upgrade without --atomic to fix this issue.
Namespace deletion stuck at terminating
While deleting the namespace, if your Helm release uses clickhouse.enabled: true, the operation might get stuck indefinitely.
This is a known behavior of the clickhouse-operator finalizer. Workaround:
- Patch the CHI to remove the finalizer:
  kubectl patch chi posthog -n posthog -p '{"metadata":{"finalizers":null}}' --type=merge
- Delete the CHI:
  kubectl delete chi posthog -n posthog
FAQ
How can I increase storage size?
To increase the storage size of the ClickHouse, Kafka or PostgreSQL service, take a look at our runbook section.
Are the errors I'm seeing important?
Here are some examples of log spam that currently exists in our app and is safe to ignore:
The following messages in the ClickHouse pod happen when ClickHouse reshuffles how it consumes from the topics. So, anytime ClickHouse or Kafka restarts we'll get a bit of noise and the following log entries are safe to ignore:
The following error is produced by some low-priority celery tasks. We haven't seen any actual impact, so it can safely be ignored. It shows up in Sentry as well.
How do I see logs for a pod?
- Find the name of the pod you want logs for:
  kubectl get pods -n posthog
  This command lists all pods in the posthog namespace. If you want app/plugin server logs, for example, look for a pod whose name starts with posthog-plugins. It will be something like posthog-plugins-54f324b649-66afm.
- Get the logs for that pod using the name from the previous step:
  kubectl logs posthog-plugins-54f324b649-66afm -n posthog
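The two steps above can be combined so that the generated pod suffix does not have to be copied by hand. A minimal sketch, assuming the default kubectl get pods column layout; the function name pods_with_prefix is hypothetical.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print
# the names of pods that start with the given prefix.
pods_with_prefix() {
  awk -v prefix="$1" 'NR > 1 && index($1, prefix) == 1 { print $1 }'
}

# Usage (requires access to the cluster): print logs for every matching pod.
#   for pod in $(kubectl get pods -n posthog | pods_with_prefix posthog-plugins); do
#     kubectl logs "$pod" -n posthog
#   done
```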
How do I connect to the web server's shell?
PostHog is built on Django, which comes with some useful utilities. One of them is a Python shell. You can connect to it like so:
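One way to open that shell is sketched here under two assumptions: the web pod's name contains -web-, and manage.py sits in the web container's working directory. The function name django_shell is hypothetical, and python manage.py shell is Django's standard interactive shell command.

```shell
# Hypothetical helper: open Django's interactive shell inside the web pod.
# Assumptions: the web pod's name contains "-web-", and manage.py is in the
# container's working directory.
django_shell() {
  # Pick the first pod whose name contains "-web-".
  web_pod=$(kubectl get pods -n posthog | grep -- '-web-' | awk 'NR == 1 { print $1 }')
  kubectl exec -n posthog -it "$web_pod" -- python manage.py shell
}

# Usage (requires access to the cluster):
#   django_shell
```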
In a moment you should see the shell load and finally a message like this appear:
That means you can now type Python code and run it with PostHog context (such as models) already loaded! For example, to see the number of users in your instance run:
How do I connect to Postgres?
- Find out your Postgres password from the web pod:
  # First we need to determine the name of the web pod (see "How do I see logs for a pod?" for more on this)
  POSTHOG_WEB_POD_NAME=$(kubectl get pods -n posthog | grep -- '-web-' | awk '{print $1}')
  # Then we can get the password from the pod's environment variables
  kubectl exec -n posthog -it "$POSTHOG_WEB_POD_NAME" -- sh -c 'echo The Postgres password is: $POSTHOG_DB_PASSWORD'
- Connect to your Postgres pod's shell:
  # We need to determine the name of the Postgres pod (usually it's 'posthog-posthog-postgresql-0')
  POSTHOG_POSTGRES_POD_NAME=$(kubectl get pods -n posthog | grep -- '-postgresql-' | awk '{print $1}')
  # Then we open a shell in the Postgres pod (psql is started in the next step)
  kubectl exec -n posthog -it "$POSTHOG_POSTGRES_POD_NAME" -- /bin/bash
- Connect to the posthog database:
  You're connecting to your production database, proceed with caution!
  psql -d posthog -U postgres
  Postgres will ask you for the password. Use the value you found in step 1. Now you can run SQL queries! Just remember that an SQL query needs to be terminated with a semicolon (;) to run.
How do I connect to ClickHouse?
Tip: Find out your pod names with
kubectl get pods -n posthog
- Find out your ClickHouse user and password from the web pod:
  kubectl exec -n posthog -it <your-posthog-web-pod> \
    -- sh -c 'echo user:$CLICKHOUSE_USER password:$CLICKHOUSE_PASSWORD'
- Connect to the chi-posthog-posthog-0-0-0 pod:
  kubectl exec -n posthog -it chi-posthog-posthog-0-0-0 -- /bin/bash
- Connect to ClickHouse using clickhouse-client:
  Note: You're connecting to your production database, proceed with caution!
  clickhouse-client -d posthog --user <user_from_step_1> --password <password_from_step_1>
How do I restart all pods for a service?
Important: Not all services can be safely restarted this way. It is safe to do this for the app/plugin server. If you have any doubts, ask someone from the PostHog team.
- Terminate all running pods for the service:
  # replace posthog-plugins with the desired service
  kubectl scale deployment posthog-plugins --replicas=0 -n posthog
- Start new pods for the service:
  # replace posthog-plugins with the desired service
  kubectl scale deployment posthog-plugins --replicas=1 -n posthog
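The steps above scale back up to a single replica. If the deployment was running more than one, a variant that remembers the previous count may be preferable. The following is a minimal sketch under that assumption; the function name restart_deployment is hypothetical, and the same caveat about which services are safe to restart applies.

```shell
# Hypothetical helper: scale a deployment to zero and back to the replica
# count it had before.
#   $1 = deployment name, $2 = namespace (defaults to posthog)
restart_deployment() {
  deployment="$1"
  namespace="${2:-posthog}"
  # Remember how many replicas were configured before scaling down.
  replicas=$(kubectl get deployment "$deployment" -n "$namespace" -o jsonpath='{.spec.replicas}')
  kubectl scale deployment "$deployment" --replicas=0 -n "$namespace"
  kubectl scale deployment "$deployment" --replicas="$replicas" -n "$namespace"
}

# Usage (requires access to the cluster):
#   restart_deployment posthog-plugins
```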