Improve your Statefulsets' reliability on GCP with the GKE Stateful HA Operator


Most workloads we deploy on Kubernetes are Deployments, which dynamically manage Pods and ReplicaSets.

However, it can be useful to control the identity of Pods and how they scale. For instance, to install a distributed database such as MongoDB on top of Kubernetes, stable Pod names are mandatory for setting up the cluster and its member discovery.

For that purpose, we may use StatefulSets.

When comparing the two, it is important to note that Deployments are designed to manage and host stateless applications, while StatefulSets are specifically tailored for stateful applications.
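To illustrate the stable identity a StatefulSet provides, here is a minimal sketch (all names and the MongoDB image are illustrative assumptions, not from the setup described in this article) of a StatefulSet paired with a headless Service. Each Pod gets a predictable name and DNS entry, such as `mongo-0.mongo`:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mongo            # headless Service used for Pod discovery
spec:
  clusterIP: None        # headless: per-Pod DNS records instead of a single VIP
  selector:
    app: mongo
  ports:
    - port: 27017
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo     # ties the Pods' DNS names to the headless Service
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
        - name: mongo
          image: mongo:7
          ports:
            - containerPort: 27017
```

The Pods are created in order as mongo-0, mongo-1, and mongo-2, which is what makes cluster discovery possible.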

Now, imagine you have set up a single-replica application that uses a persistent disk for storage. How do you recover from failures, and especially from node crashes?

Google has introduced a new operator in its Kubernetes stack: the Stateful HA Operator.

From my perspective, it removes the need to set up a full cluster: you keep a single-replica configuration and let Kubernetes manage the failover in two ways, through StatefulSet restarts driven by liveness probes, and through this operator. To some extent, it helped me simplify the setup (yes, I can use Kubernetes and simplification in the same sentence).
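The liveness-probe half of that failover story can be sketched as follows; this is a hypothetical fragment of a StatefulSet Pod template (the endpoint, port, and timings are illustrative assumptions). When the probe fails repeatedly, the kubelet restarts the container on the same node:

```yaml
# Fragment of a StatefulSet Pod template (illustrative values)
containers:
  - name: app
    image: DOCKER_IMAGE
    livenessProbe:
      httpGet:
        path: /healthz          # hypothetical health endpoint
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
      failureThreshold: 3       # restart after ~15s of consecutive failures
```

Note that a liveness probe only restarts the container in place; moving the workload to another node when the node itself becomes unreachable is where the operator comes in.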

Unfortunately, this feature comes with some restrictions, notably that regional persistent storage is limited to two zones and that the service remains unavailable while failover is in progress.

Let's see how to put it in place.

Here’s a representation of the main infrastructure components required:

Stateful HA Operator

The StatefulSet is exposed through a regional load balancer to ensure connectivity after the workload moves to another node during the failover process.
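A sketch of such a Service is shown below. The labels and ports are assumptions on my side; the `cloud.google.com/l4-rbs` annotation is, to my knowledge, how GKE opts a Service into the backend-service-based (regional) external passthrough Network Load Balancer:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: stateful-service-ha-stateful-operator
  annotations:
    # Assumption: requests GKE's backend-service-based regional
    # external passthrough Network Load Balancer
    cloud.google.com/l4-rbs: "enabled"
spec:
  type: LoadBalancer
  selector:
    app: stateful-service     # hypothetical Pod label
  ports:
    - port: 80
      targetPort: 8080
```

Because the load balancer targets the Pods by label rather than by node, the same external IP keeps working after the Pod is rescheduled.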

First, enable the addon:

```bash
gcloud container clusters update gke-cluster \
      --region MY_REGION --project MY_GCP_PROJECT \
      --update-addons=StatefulHA=ENABLED
```

For more details, you can refer to the documentation.

You can now set up this Kubernetes object:

```yaml
kind: HighAvailabilityApplication
apiVersion: ha.gke.io/v1
metadata:
  name: APP_NAME
  namespace: APP_NAMESPACE
spec:
  resourceSelection:
    resourceKind: StatefulSet
  policy:
    storageSettings:
      requireRegionalStorage: true
    failoverSettings:
      forceDeleteStrategy: AfterNodeUnreachable
      afterNodeUnreachable:
        afterNodeUnreachableSeconds: 20
```
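Since the policy requires regional storage (`requireRegionalStorage: true`), the StatefulSet's PersistentVolumeClaims should use a StorageClass that provisions regional persistent disks. A minimal sketch, where the StorageClass name, disk type, and zones are illustrative assumptions:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: regional-pd
provisioner: pd.csi.storage.gke.io    # GCE Persistent Disk CSI driver
parameters:
  type: pd-balanced
  replication-type: regional-pd       # replicate the disk across two zones
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
  - matchLabelExpressions:
      - key: topology.gke.io/zone
        values:
          - europe-west1-b            # illustrative zones
          - europe-west1-c
```

This is what makes the failover possible: the replicated disk can be attached from either zone, so the Pod can come back on a node in the surviving zone.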

After applying the HighAvailabilityApplication and simulating a node failure, we get the following events:

```bash
1s          Warning   NodeNotReady                     pod/stateful-service-ha-stateful-operator-0                                                       Node is not ready
0s          Normal    TaintManagerEviction             pod/stateful-service-ha-stateful-operator-0                                                       Marking for deletion Pod namespace/stateful-service-ha-stateful-operator-0
0s          Normal    PodFailoverAfterNodeUnreachable   highavailabilityapplication/stateful-service-ha-stateful-operator                                 Triggering failover for pod stateful-service-ha-stateful-operator-0
0s          Normal    PodFailoverAfterNodeUnreachable   highavailabilityapplication/stateful-service-ha-stateful-operator                                 Triggering failover for pod stateful-service-ha-stateful-operator-0
0s          Normal    PodFailoverAfterNodeUnreachable   highavailabilityapplication/stateful-service-ha-stateful-operator                                 Triggering failover for pod stateful-service-ha-stateful-operator-0
0s          Normal    PodFailoverAfterNodeUnreachable   highavailabilityapplication/stateful-service-ha-stateful-operator                                 Triggering failover for pod stateful-service-ha-stateful-operator-0
0s          Normal    PodFailoverAfterNodeUnreachable   highavailabilityapplication/stateful-service-ha-stateful-operator                                 Failover for pod stateful-service-ha-stateful-operator-0 successful
1s          Normal    PodFailoverAfterNodeUnreachable   highavailabilityapplication/stateful-service-ha-stateful-operator                                 Triggering failover for pod stateful-service-ha-stateful-operator-0
0s          Normal    Scheduled                         pod/stateful-service-ha-stateful-operator-0                                                       Successfully assigned namespace/stateful-service-ha-stateful-operator-0 to gke-gke-cluster-dev-node-pool20250114-5e2dc459-pafo
0s          Normal    SuccessfulCreate                  statefulset/stateful-service-ha-stateful-operator                                                 create Pod stateful-service-ha-stateful-operator-0 in StatefulSet stateful-service-ha-stateful-operator successful
0s          Normal    PodFailoverAfterNodeUnreachable   highavailabilityapplication/stateful-service-ha-stateful-operator                                 Triggering failover for pod stateful-service-ha-stateful-operator-0
0s          Normal    SuccessfulAttachVolume            pod/stateful-service-ha-stateful-operator-0                                                       AttachVolume.Attach succeeded for volume "pvc-8839935b-6637-4f70-b5b8-17e2bbf31b04"
0s          Normal    Pulling                           pod/stateful-service-ha-stateful-operator-0                                                       Pulling image "DOCKER_IMAGE"
0s          Normal    Pulled                            pod/stateful-service-ha-stateful-operator-0                                                       Successfully pulled image "DOCKER_IMAGE" in 694ms (694ms including waiting). Image size: XXXX bytes.
0s          Normal    Created                           pod/stateful-service-ha-stateful-operator-0                                                       Created container stateful-service-ha-stateful-operator
0s          Normal    Started                           pod/stateful-service-ha-stateful-operator-0                                                       Started container stateful-service-ha-stateful-operator
0s          Normal    SyncLoadBalancerSuccessful        service/stateful-service-ha-stateful-operator                                                     Successfully ensured IPv4 External LoadBalancer resources
```

We now have the HighAvailabilityApplication object:

```bash
$ kubectl get highavailabilityapplications
NAME                                    PROTECTED
stateful-service-ha-stateful-operator   True
```

The description would be:

```bash
$ kubectl describe highavailabilityapplications/stateful-service-ha-stateful-operator
Name:         stateful-service-ha-stateful-operator
Namespace:    namespace
Annotations:  meta.helm.sh/release-name: release
              meta.helm.sh/release-namespace: namespace
API Version:  ha.gke.io/v1
Kind:         HighAvailabilityApplication
Metadata:
  Creation Timestamp:  XXXXX
  Generation:          1
  Resource Version:    70249501
  UID:                 UUID
Spec:
  Policy:
    Failover Settings:
      After Node Unreachable:
        After Node Unreachable Seconds:  20
      Force Delete Strategy:             AfterNodeUnreachable
    Storage Settings:
      Require Regional Storage:  true
  Resource Selection:
    Resource Kind:  StatefulSet
Status:
  Conditions:
    Last Transition Time:  2025-03-12T21:57:57Z
    Message:               Application is protected
    Observed Generation:   1
    Reason:                ApplicationProtected
    Status:                True
    Type:                  Protected
Events:                    <none>
```

The Google HA Operator is a great option to simplify your architecture, as it avoids the need to run a full cluster (e.g., a database cluster) on top of Google Kubernetes Engine. Unfortunately, as always, this technology comes with constraints: storage availability is limited to two zones, and the service is unavailable during failover.

I’ve summarized the key characteristics, along with their pros and cons, in the table below:

| Architecture characteristic | GKE HA Stateful Operator | Comment |
|---|---|---|
| Partitioning type | The same as for StatefulSet | You need to enable the add-on first |
| Deployability | ⭐⭐⭐⭐⭐ | Very easy to deploy, as you don’t need to configure a cluster manually |
| Elasticity | ⭐⭐⭐ | Matches the elasticity of a StatefulSet |
| Evolutionary | ⭐⭐⭐ | You must stick to the operator’s requirements |
| Fault tolerance | ⭐⭐⭐⭐ | Requires a retry mechanism in the client code to handle potential node failover |
| Modularity | ⭐⭐⭐ | |
| Overall cost | ⭐⭐⭐⭐⭐ | No extra cost. Can be deployed on either an Autopilot or self-managed GKE cluster |
| Reliability | ⭐⭐⭐⭐ | If using CSI persistent storage, it can only be configured across a maximum of two zones |
| Scalability | ⭐⭐⭐ | Limited by the scalability of the application deployed via StatefulSet |
| Simplicity | ⭐⭐⭐⭐⭐ | Really easy to set up. You can move to another solution easily. |
| Testability | ⭐⭐⭐⭐⭐ | |
| Cloud-agnosticism | A proprietary Google extension | |

You probably understand that, depending on your constraints, this tool could be a good fit—or not.

To conclude, in my view it’s worth starting with this setup if your architecture is Kubernetes-centric and you can tolerate brief downtime during failover. This approach allows you to take advantage of Kubernetes’ built-in failover mechanisms.