16 June, 2025 · Blog Post

Hibernating apps

Pod hibernation. Ever heard of it? It's a great feature if you need it, but a bit tricky to set up.

There are numerous reasons why you'd want to hibernate your apps. For example, Laravel Cloud uses pod hibernation: you can configure your apps to hibernate after a set period of time, and in return Laravel Cloud doesn't charge you for the time your pods are hibernated.

All the reasons usually involve some sort of resource usage reduction.

For example, Codecannon lets you use a UI/AI builder to define an application specification, which we then use to deterministically generate a full-stack app codebase without AI. We want users to see the apps in demo deployment environments before they decide whether to purchase them. This can lead to a lot of apps deployed on our Kubernetes cluster, sucking up resources and costing money even when the demo apps aren't being used.

Enter pod hibernation.

If users aren't using the apps, why keep them running?

How it works

This is a high-level overview of our cluster architecture as it relates to hibernating user apps.

Hibernation Diagram 1

User land

This is usually the browser or any other source that makes requests to the user app.

ingress-nginx

We use nginx as our ingress server, which routes incoming HTTP requests to the appropriate pods.

We use the ingress-nginx helm chart with the custom error pages backend enabled, so we can serve custom error pages (we'll explain this later).

user-app-*

The user applications that we want to hibernate when they aren't being used.

prometheus

Prometheus is an open-source systems monitoring and alerting toolkit.

It's used here to collect request metrics from the nginx ingress pod.

keda

KEDA is a tool that helps Kubernetes scale applications based on real-world events.

We use it here to scale pods up and down.

We use the ingress-nginx request metrics, collected and exposed by Prometheus, to decide when specific pods should be scaled up or down.

When a request comes in for a hibernated app, we wake it up. When no requests have been made for a while, we hibernate it again.

Request path

Let's run through a couple of scenarios to get a better feel for how the entire system operates.

Hibernation Diagram 2

Scenario 1:

  • Pod is running
  • User makes request
  • User gets app UI

Scenario 2:

  • Pod is not running
  • User makes request
  • Request enters ingress-nginx; from here, two concurrent paths are triggered:
    1. The request is picked up by Prometheus
      • Prometheus stores the request metric in its database
      • KEDA pings Prometheus and learns that a request to a hibernated pod has been made
      • KEDA wakes up the hibernated pod
    2. Nginx tries to contact the pod
      • But the pod is not running yet, so nginx returns a 404 or 503 error
      • A custom Nginx error page is returned to the user
      • The error page presents a loader, notifying the user that the pod is starting
      • The error page pings the URL the user tried to access every X seconds
      • When the pinging HTTP request returns 200, a page reload is triggered
      • When the page is reloaded, the request follows "Scenario 1" above

A bit more complicated but the concept is still pretty simple.

Example Error Page

How does one set this up

Fair warning! This is not intended as a complete guide, and none of the configs below are intended as production recommendations or guaranteed to work on their own.

It's intended to give you a bird's-eye overview of how such a system might be set up, and hopefully give you some useful ideas or insight into how to set it up yourself.

Before building this yourself, we strongly recommend you read the docs and other examples to figure out what the correct configuration is for you, because there's a lot of configuration to explore.

We won't go into too much detail here, but we'll add some configs to help you on your way.

ingress-nginx

To add the ingress-nginx Helm repo, we run:

bash
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

After that, we define some custom values for the Helm chart:

yaml
controller:
  config:
    custom-http-errors: "404,503"
  metrics:
    enabled: true
    service:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "10254"
    serviceMonitor:
      enabled: true
      scrapeInterval: 1s

defaultBackend:
  enabled: true
  image:
    registry: registry.k8s.io
    image: ingress-nginx/custom-error-pages
    tag: v1.1.3@sha256:5aeaf5d01470bcc7d73b8846458b00dbc62d54277cd110cec8f28e663c11f93e
  extraVolumes:
  - name: custom-error-pages
    configMap:
      name: custom-error-pages
      items:
      - key: "404"
        path: "404.html"
      - key: "503"
        path: "503.html"
  extraVolumeMounts:
  - name: custom-error-pages
    mountPath: /www

The custom error pages guide can be found in the ingress-nginx docs (the custom-errors customization example).

The metrics service needs to be enabled, so Prometheus picks up the request metrics.
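
If you want to double-check that the controller is actually exposing these metrics (this isn't part of the setup itself, just a handy sanity check, and the service name assumes the chart release is called ingress-nginx), you can port-forward the metrics service and look for the request counter:

bash
# Port-forward the controller metrics service (default metrics port is 10254)
kubectl -n ingress-nginx port-forward svc/ingress-nginx-controller-metrics 10254:10254
# In another terminal: this counter is what we query from Prometheus later
curl -s http://localhost:10254/metrics | grep nginx_ingress_controller_requests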

The custom error pages are defined in a ConfigMap that you need to apply to your cluster manually, ours looks something like this:

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-error-pages
  namespace: ingress-nginx
data:
  "404": |
    <!doctype html>
    <html>
      <head>
        <script src="https://unpkg.com/vue@3/dist/vue.global.js"></script>
        <title>404 Not Found</title>
        <style>
          /* Our custom styles */
        </style>
      </head>
      <body>
        <div id="app">
          <!-- The default error page in case users are accessing some non-user-app url -->
          <span v-if="!isUserApp"
            ><center>
              <h1>404 Not Found</h1>
            </center>
            <hr />
            <center>nginx</center></span>
          <!-- Loader page in case user app is accessed -->
          <div v-else class="message-container">
            <!-- Loader goes here -->
            <p>Starting app. Please wait a moment...</p>
          </div>
        </div>
      <script>
        const { createApp, ref, onMounted } = Vue;
        createApp({
          setup() {
            const isUserApp = ref(window.location.href.startsWith("https://user-app-prefix"));
            async function pingApp() {
              try {
                const response = await fetch(window.location.href);
                if (response.ok) {
                  // The app is up again, reload to get the real page
                  window.location.reload();
                  return;
                }
              } catch (e) {
                // Network errors are expected while the pod is still starting
              }
              // Not ready yet, try again in a second
              setTimeout(pingApp, 1000);
            }
            onMounted(() => {
              if (isUserApp.value) {
                pingApp();
              }
            });
            return {
              isUserApp,
            };
          },
        }).mount("#app");
      </script>
      </body>
    </html>

We recommend storing the HTML in a separate file to get a better HTML/JS editing experience, then just pasting the file into the ConfigMap.
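
One way to do that (just a suggestion, with hypothetical file paths) is to let kubectl generate the ConfigMap manifest from the standalone HTML files instead of pasting them by hand:

bash
# Generate the ConfigMap manifest from standalone HTML files (paths are examples)
kubectl create configmap custom-error-pages \
  -n ingress-nginx \
  --from-file=404=./error-pages/404.html \
  --from-file=503=./error-pages/503.html \
  --dry-run=client -o yaml > ./ingress-nginx/custom-error-pages.yml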

We omitted some details here, and we skipped the 503 error page, which is mostly the same but with different text. For the 503 page (or any other error page), you'll need to add the corresponding key to the ConfigMap, mount it via extraVolumes, and list the status code in custom-http-errors.


Install the chart and ConfigMap with the following commands:

bash
helm install -n ingress-nginx --values ./ingress-nginx/values.yaml ingress-nginx ingress-nginx/ingress-nginx
kubectl apply -n ingress-nginx -f ./ingress-nginx/custom-error-pages.yml
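
As a quick sanity check (not part of the setup itself, and the service name assumes the default release naming above), you can confirm the custom error backend is wired up by requesting a path that doesn't match any Ingress:

bash
# Find the external IP of the controller service
kubectl -n ingress-nginx get svc ingress-nginx-controller
# A request to an unknown path should now return the custom 404 page
curl -i http://<INGRESS_IP>/does-not-exist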

Prometheus

For Prometheus, we use the kube-prometheus-stack chart, which installs Prometheus and Grafana alongside some other goodies. On the backend we also use Alloy, installed using the k8s-monitoring chart, but that's a bit too much to get into here.

If you're interested in more detail you can ping us in our Discord server.

yaml
prometheus:
  prometheusSpec:
    serviceMonitorSelectorNilUsesHelmValues: false # disable serviceMonitorSelector overwrite
    serviceMonitorSelector: {}                   # match all ServiceMonitors
    serviceMonitorNamespaceSelector:
      any: true                                  # watch all namespaces
    podMonitorSelectorNilUsesHelmValues: false # disable podMonitorSelector overwrite
    scrapeInterval: 1s
    retention: 90d
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: do-block-storage
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 50Gi

grafana:
  persistence:
    enabled: true
    type: pvc
    storageClassName: do-block-storage
    accessModes:
    - ReadWriteOnce
    size: 4Gi
    finalizers:
    - kubernetes.io/pvc-protection

With this we install it using these commands:

bash
kubectl create namespace monitoring

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

helm install -n monitoring --values ./kube-prometheus-stack/values.yaml prometheus prometheus-community/kube-prometheus-stack --set prometheusOperator.installCRDs=true

With this, Prometheus is scraping the ingress-nginx metrics endpoint and collecting the request metrics.
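
To confirm the scrape is actually working (again, just a sanity check; the service name matches the serverAddress we use in the ScaledObject below and assumes the release is called prometheus), you can port-forward Prometheus and run the same kind of query KEDA will use:

bash
kubectl -n monitoring port-forward svc/prometheus-kube-prometheus-prometheus 9090:9090
# In another terminal (or in the Prometheus UI at http://localhost:9090):
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=sum(increase(nginx_ingress_controller_requests[1m])) by (exported_namespace)'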

KEDA

For KEDA, we pretty much just use the default KEDA helm chart without custom values.

bash
helm repo add kedacore https://kedacore.github.io/charts
helm repo update

helm install keda kedacore/keda --namespace keda --create-namespace
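
Before applying any ScaledObjects, it's worth checking that the operator pods are up and the CRDs are installed (just a sanity check):

bash
kubectl -n keda get pods
kubectl get crd scaledobjects.keda.sh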

User apps

For user apps, we have a custom deployment config, and what it looks like is pretty much irrelevant, because all you need to add to enable scaling is a ScaledObject config (this is a KEDA CRD). Ours looks like this:

yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: http-app-scaledobject
  namespace: <NAMESPACE>
spec:
  scaleTargetRef:
    name: app
  minReplicaCount: 0
  maxReplicaCount: 1
  pollingInterval: 1
  cooldownPeriod: 3600
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-kube-prometheus-prometheus.monitoring.svc.cluster.local:9090
        metricName: http_requests_total
        threshold: '1'
        query: sum(increase(nginx_ingress_controller_requests{ exported_namespace=~"<NAMESPACE>" }[1m]))

It's pretty simple but the main parts to look at are:

  • minReplicaCount: This needs to be 0, so KEDA can scale the pods down to 0 when there's no traffic
  • maxReplicaCount: We have this set to 1, as we never scale the demo deployments to more than 1 pod; if you need to scale further, look at the official docs
  • cooldownPeriod: This defines how long the pod should keep running after the condition (request count in our example) is no longer met. We run our apps for 1 hour after the last request was made; after this period, the app scales back down
  • triggers.threshold: This configures the scaling to trigger if at least 1 request was made
  • triggers.query: This query checks how many requests were made to the user app in the last minute
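
For context, scaleTargetRef.name has to match the Deployment you want to hibernate. Our real deployment config isn't important here, but a minimal hypothetical target (the labels, image, and port below are placeholders, not our actual setup) could look something like this:

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app                  # must match spec.scaleTargetRef.name in the ScaledObject
  namespace: <NAMESPACE>
spec:
  replicas: 1                # KEDA (via an HPA) takes over the replica count once the ScaledObject exists
  selector:
    matchLabels:
      app: user-app
  template:
    metadata:
      labels:
        app: user-app
    spec:
      containers:
        - name: app
          image: <YOUR_APP_IMAGE>
          ports:
            - containerPort: 8080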

With this, KEDA will query Prometheus every second to check whether an app was accessed in the last minute. If it was, it scales the app up to 1 replica. Once the app hasn't received any requests for the 1 hour cooldown period, it scales back down to 0.
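
If you want to watch this happen (again, just a sanity check), KEDA creates an HPA behind the scenes, so you can observe the scale-up and scale-down with:

bash
kubectl -n <NAMESPACE> get scaledobject,hpa
# Watch the pod appear after a request and disappear after the cooldown
kubectl -n <NAMESPACE> get pods -w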


That's it. With the infrastructure installed and configured, all you need to do to scale a service is define one of these ScaledObject configs and you're off to the races.

All in all...

The concepts used here are actually pretty simple. If you're new to this, however, there are a lot of concepts, a lot of documentation to read, and a lot of configuration to figure out.

Like we mentioned above, this is not a recommended production config nor is it guaranteed to work, so please, read the docs. In the long run, it's worth it, because there are a lot of functionalities and concepts here that are applicable to general cluster management, monitoring, alerting and scaling.

If you liked this post, please let us know, and tell us if you'd like us to go more in depth on some of these concepts; we'd be happy to oblige.

If you find yourself stuck on something, we invite you to reach out to us on our Discord server, and we'd be happy to chat and help you figure out your deployment. We're still figuring some of this stuff out ourselves, but we're confident we'll be able to help in some way.

That's it from us, hope you build some cool stuff using this, and if you're starting a new project, and want to cut down on your boilerplating time dramatically, please check out Codecannon.


Andrej Fidel

Co-Founder @ Codecannon
