Deploying HA Vault with integrated storage in Kubernetes using the AWS dynamic secrets engine with auto-rotation

ryan · Published in AWS Tip · Mar 13, 2023 · 9 min read

This guide walks through a step-by-step deployment of a multi-node, highly available Vault cluster with integrated storage inside a Kubernetes cluster, issuing dynamic credentials through Vault's AWS secrets engine. Integrated storage (Raft) gives every node in the Vault cluster a replicated copy of Vault's data, which is what provides the high availability. I have added an explanation for each step; if anything is still fuzzy or incorrect, please let me know in the comments.
BTW: this tutorial uses Helm charts for deploying the Vault resources, and I have abbreviated Kubernetes as k8s in some places.

Solution Overview

[Architecture diagram: how this setup works]

Steps

  1. Add the HashiCorp Vault repository to the helm repo list
helm repo add hashicorp https://helm.releases.hashicorp.com

2. Install Vault: For this demo we will use the latest chart version, but you can always pin a specific one with --version.
The command below installs a high-availability Vault with two replicas. Note the parameters: csi.enabled also deploys the Vault CSI provider we will need later, and server.ha.raft.enabled turns on integrated (Raft) storage.

helm install vault hashicorp/vault \
  --set "csi.enabled=true" \
  --set "server.ha.enabled=true" \
  --set "server.ha.replicas=2" \
  --set "server.ha.raft.enabled=true"

# Once done check if installed correctly,

helm ls
#status should be "Deployed"

Check the pods and their states: only the agent-injector pod should be fully Ready at this point; the vault pods will show STATUS Running but 0/1 READY until they are initialized and unsealed.
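If you prefer a values file over a string of --set flags, the same options can be captured like this (a minimal sketch; the file name override-values.yaml is just an example):

# override-values.yaml -- mirrors the --set flags above
csi:
  enabled: true
server:
  ha:
    enabled: true
    replicas: 2
    raft:
      enabled: true

# then install with:
# helm install vault hashicorp/vault -f override-values.yaml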

3. Initialize Vault and store the key file. Key generation uses Shamir's secret sharing; for this demo we use a single key share with a threshold of one.

kubectl exec vault-0 -- vault operator init -key-shares=1 -key-threshold=1 -format=json > cluster-keys.json
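For reference, the generated cluster-keys.json will look roughly like this (field names come from vault operator init -format=json; values redacted). Needless to say, guard this file carefully: it holds both the unseal key and the root token.

{
  "unseal_keys_b64": ["XXXXXXXX..."],
  "unseal_keys_hex": ["xxxxxxxx..."],
  "unseal_shares": 1,
  "unseal_threshold": 1,
  "recovery_keys_b64": [],
  "recovery_keys_hex": [],
  "recovery_keys_shares": 0,
  "recovery_keys_threshold": 0,
  "root_token": "hvs.XXXXXXXX"
}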

4. Unseal the first Vault instance using the unseal key

# record in a variable
VAULT_UNSEAL_KEY=$(jq -r ".unseal_keys_b64[]" cluster-keys.json)

# unseal the first instance
kubectl exec vault-0 -- vault operator unseal $VAULT_UNSEAL_KEY

# check the vault status now, the "SEALED" value should be false as below
kubectl exec vault-0 -- vault status
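The status output should look roughly like this (a sketch with some fields omitted):

Key             Value
---             -----
Seal Type       shamir
Initialized     true
Sealed          false
Total Shares    1
Threshold       1
Storage Type    raft
HA Enabled      true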

5. Raft-join the other Vault instance: since we are using 'server.ha.raft.enabled=true', we need to join the vault-1 pod to the vault-0 pod to form an HA cluster (we are only using two replicas, so vault-1 is the only pod that needs to join vault-0)

kubectl exec vault-1 -- vault operator raft join http://vault-0.vault-internal:8200

#The output should be as below

Key       Value
---       -----
Joined    true
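If you later scale the StatefulSet beyond two replicas, every additional pod joins the same leader; a small loop sketch (assuming pods named vault-1 through vault-N):

# adjust the range to match your replica count
for i in 1 2; do
  kubectl exec vault-$i -- vault operator raft join http://vault-0.vault-internal:8200
done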

6. Unseal the second vault instance and verify cluster setup

kubectl exec vault-1 -- vault operator unseal $VAULT_UNSEAL_KEY

#To check if the cluster join is complete, run the below command and confirm you have
#a cluster configured (a leader and a follower)

kubectl exec vault-0 -- env VAULT_TOKEN=$(cat cluster-keys.json | jq -r '.root_token') vault operator raft list-peers

#OUTPUT
Node                  Address                        State       Voter
----                  -------                        -----       -----
XXXXXXXXXXXXXXXXXX    vault-0.vault-internal:8201    leader      true
YYYYYYYYYYYYYYYYYY    vault-1.vault-internal:8201    follower    false

7. For the next steps, it is best to execute them from inside the "leader" container (usually vault-0 in this case; confirm with the list-peers output above) using the Vault CLI. You will need to log in via the CLI as below

cat cluster-keys.json | jq -r '.root_token'
#copy the output value (in a file or clipboard)

kubectl exec -it vault-0 -- /bin/sh

# execute the below from inside the container
vault login

#it will ask you to enter the token
/ $ vault login
Token (will be hidden):

#paste the root token copied at the start of this code block; you will be
# presented with the following output

Success! You are now authenticated. The token information displayed below
is already stored in the token helper. You do NOT need to run "vault login"
again. Future Vault requests will automatically use this token.

8. Since we are playing with Vault's AWS dynamic secrets engine, we need to enable the AWS secrets engine at a specific path. For convenience we will use the default "aws/" path

# these should be run from inside the container 
# (as logged-in in the previous step)
vault secrets enable aws

#once executed check the enabled engines and their paths
vault secrets list

#you should now see AWS in the list

Path          Type         Accessor              Description
----          ----         --------              -----------
aws/          aws          aws_9037ea5b          n/a
cubbyhole/    cubbyhole    cubbyhole_b357bfe5    per-token private
...

9. Now it is time to configure the Vault AWS engine. The first step is to add credentials so that Vault can connect to the AWS account containing the target resource (in this case I have used S3).
You will need the access_key and secret_key of an IAM user in that account with sufficient privileges to create users and attach policies (Vault will be creating temporary credentials in AWS IAM to access the resource, S3 in our case). Please note this must be an IAM user and not an SSO/federated user.

# these should be run from inside the container
# (you can use a different path than "root" as well)
vault write aws/config/root \
    access_key=XXXXXXXXXXXXXXXXXXXX \
    secret_key=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
    region=eu-central-1

#output
Success! Data written to: aws/config/root

With this, Vault now has the credentials to access the AWS account where our resource lives.

10. Now Vault needs to know what IAM permissions its temporary/dynamic user gets when accessing the resource (S3). Per the principle of least privilege, this temporary user should only have the access it needs (in this case, reading and writing to S3). With credential_type=iam_user, Vault creates a fresh IAM user and attaches the inline policy below to it.

# these should be run from inside the container
# (you can use a better role name)
vault write aws/roles/myrole \
    credential_type=iam_user \
    policy_document=-<<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucketMultipartUploads",
        "s3:ListBucketVersions",
        "s3:ListBucket",
        "s3:DeleteObject",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::*"
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": [
        "s3:ListStorageLensConfigurations",
        "s3:ListAccessPointsForObjectLambda",
        "s3:ListAllMyBuckets",
        "s3:ListAccessPoints",
        "s3:ListJobs",
        "s3:ListMultiRegionAccessPoints"
      ],
      "Resource": "*"
    }
  ]
}
EOF

# output
Success! Data written to: aws/roles/myrole

# to check run the below and make sure it is not complaining with any error
vault read aws/roles/myrole

Disclaimer: I have used a deliberately broad example policy for this demonstration; you can (and in production should) use a more restrictive one.
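Before wiring Kubernetes into the picture, you can sanity-check the role by generating a credential directly. Note that this read actually creates a short-lived IAM user in AWS; the output below is a sketch with redacted values:

# run from inside the container
vault read aws/creds/myrole

#output (roughly)
Key                Value
---                -----
lease_id           aws/creds/myrole/XXXXXXXXXXXXXXXX
lease_duration     768h
lease_renewable    true
access_key         AKIAXXXXXXXXXXXXXXXX
secret_key         xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
security_token     <nil>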

11. Now it is time to enable and configure Kubernetes authentication for Vault. This step is necessary so that workloads in the k8s cluster (the CSI provider acting for our pod, in this case) can authenticate to Vault using their service account tokens.

# these should be run from inside the container
vault auth enable kubernetes

#output
Success! Enabled kubernetes auth method at: kubernetes/

# after enabling, configure the kubernetes authentication.
# Here we point Vault at the JWT token and the CA certificate it should use
# to authenticate against the k8s API server.
vault write auth/kubernetes/config \
token_reviewer_jwt="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
kubernetes_host=https://${KUBERNETES_PORT_443_TCP_ADDR}:443 \
kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt

#output
Success! Data written to: auth/kubernetes/config
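You can read the config back as a quick sanity check that the host and certificate were stored:

# run from inside the container
vault read auth/kubernetes/config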

12. Create a Vault policy that grants read access to the dynamic credentials, so they can be served to the consuming applications.

# these should be run from inside the container, make sure to use the 
# same role name in path "aws/creds/myrole" as configured in Step 10
vault policy write internal-app - <<EOF
path "aws/creds/myrole" {
capabilities = ["read","list"]
}
EOF

#output
Success! Uploaded policy: internal-app

# you can verify with the read
vault policy read internal-app

13. Create a k8s service account that the consuming pod will use to authenticate to Vault.

# exit from the Vault container and execute the below in the main terminal
kubectl create sa internal-app

# if you're using k8s server v1.24 or later, you may not see a
# service account token secret created with the sa, but that is not a problem

14. Bind a Vault role to this service account

kubectl exec -it vault-0 -- /bin/sh

# the below commands should be run from inside the container; make sure to get
# 'bound_service_account_namespaces' right: it should be the namespace where your sa was created,
# and it's best to keep it the same as the one where you deployed your Vault helm chart.
# policies - should be the one created in step 12
vault write auth/kubernetes/role/internal-app \
    bound_service_account_names=internal-app \
    bound_service_account_namespaces=myns \
    policies=internal-app \
    ttl=24h

#output
Success! Data written to: auth/kubernetes/role/internal-app
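Reading the role back confirms the service account and namespace bindings:

# run from inside the container
vault read auth/kubernetes/role/internal-app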

15. (Optional) Reduce the TTL of the dynamic secrets. This is something I have experienced while working with Vault dynamic secrets, and it is not a very pleasant surprise: by default the dynamic secrets carry a maximum lease TTL of 768h (32 days), which is far too long for a demo. If you want to see your secret auto-rotate (and do not wish to wait 32 days for it), it is a good idea to reduce/cap this value accordingly.

# the below commands should be run from inside the container
# This command caps the max TTL for the secret at 30 minutes;
# note the trailing path so that only the AWS secrets engine is tuned
vault secrets tune -max-lease-ttl=30m aws/

#output
Success! Tuned the secrets engine at: aws/
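To confirm the cap took effect, you can read the mount tuning back; the output shape below is illustrative:

# run from inside the container
vault read sys/mounts/aws/tune

#output (roughly)
Key                  Value
---                  -----
default_lease_ttl    768h
force_no_cache       false
max_lease_ttl        30m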

16. Install the Secrets Store CSI driver for Kubernetes. This driver is responsible for syncing external secret stores such as Vault with Kubernetes. To make it work with Vault it needs a Vault-specific provider, which the Vault chart already deployed for us because we set csi.enabled=true in step 2.

# exit from the Vault container and execute the below in the main terminal
# add the driver's chart repository first if you haven't already
helm repo add secrets-store-csi-driver https://kubernetes-sigs.github.io/secrets-store-csi-driver/charts

# syncSecret.enabled - enables creation of k8s secrets from the Vault AWS engine
# enableSecretRotation - lets mounted secrets update when the Vault secret changes
# rotationPollInterval - how often the driver polls for changed dynamic secrets
helm install csi secrets-store-csi-driver/secrets-store-csi-driver \
  --set syncSecret.enabled=true \
  --set enableSecretRotation=true \
  --set rotationPollInterval=600s \
  --set rbac.pspEnabled=false

#check the status is deployed

17. Now your k8s cluster needs to know about the external Vault it should sync with. This information is provided through the SecretProviderClass CRD, which describes which secrets you want to sync (the AWS access key and secret key in this case). The vaultAddress field is the ClusterIP address of your Vault Kubernetes service (the default port is 8200); it tells the CSI driver where it can reach Vault.

Use this ClusterIP as the vaultAddress parameter in the SecretProviderClass.
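You can look it up like this (output sketched, IP redacted to match the manifest below):

kubectl get svc vault

#output (roughly)
NAME    TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)             AGE
vault   ClusterIP   XX.X.XX.136   <none>        8200/TCP,8201/TCP   9h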
#execute the below in the main terminal
# as you can see we have created 2 k8s secrets "accesskey" and "secretkey"
# access_key and secret_key are name of the keys respectively
cat << 'EOF' > spc.yaml
> apiVersion: secrets-store.csi.x-k8s.io/v1
> kind: SecretProviderClass
> metadata:
> name: vault-database
> spec:
> provider: vault
> secretObjects:
> - data:
> - key: access_key
> objectName: aws-access-key
> secretName: accesskey
> type: Opaque
> - data:
> - key: secret_key
> objectName: aws-secret-key
> secretName: secretkey
> type: Opaque
> parameters:
> vaultAddress: "http://XX.X.XX.136:8200"
> roleName: "internal-app"
> objects: |
> - objectName: "aws-access-key"
> secretPath: "aws/creds/myrole"
> secretKey: "access_key"
> - objectName: "aws-secret-key"
> secretPath: "aws/creds/myrole"
> secretKey: "secret_key"
> EOF

#apply the created file
kubectl apply -f spc.yaml

#output
secretproviderclass.secrets-store.csi.x-k8s.io/vault-database created
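A quick check that the object landed:

kubectl get secretproviderclass vault-database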

18. Now let's create a dummy pod that uses the SecretProviderClass created above and mounts the secrets at a path inside the container. This will test whether our setup finally works.

#execute the below in the main terminal
# make sure to use the secretProviderClass created above, and note
# the mountPath where this secret will be mounted
cat << 'EOF' > app.yaml
kind: Pod
apiVersion: v1
metadata:
  name: webapp
spec:
  serviceAccountName: internal-app
  containers:
  - image: jweissig/app:0.0.1
    name: webapp
    volumeMounts:
    - name: secrets-store-inline
      mountPath: "/mnt/secrets-store"
      readOnly: true
  volumes:
  - name: secrets-store-inline
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        secretProviderClass: "vault-database"
EOF

#apply the created file
kubectl apply -f app.yaml

#check if all the pods are in running state, you will see there are pods
# related to the csi driver and vault provider as well
kubectl get po

#output
NAME                                  READY   STATUS    RESTARTS   AGE
csi-secrets-store-csi-driver-xxxxx    3/3     Running   0          6h20m
csi-secrets-store-csi-driver-yyyyy    3/3     Running   0          6h20m
csi-secrets-store-csi-driver-zzzzz    3/3     Running   0          6h20m
vault-0                               1/1     Running   0          9h
vault-1                               1/1     Running   0          9h
vault-agent-injector-XXXXX-xxxxxxxx   1/1     Running   0          9h
vault-csi-provider-xxxxx              1/1     Running   0          9h
vault-csi-provider-yyyyy              1/1     Running   0          9h
vault-csi-provider-zzzzz              1/1     Running   0          9h
webapp                                1/1     Running   0          37m


# Once everything is in running state, check the pod volume mountPath if
# the secret values are getting loaded

kubectl exec webapp -- cat /mnt/secrets-store/aws-access-key

# output should show the access key
AXXXXXXXXXXXXXXXXXXX

# the below command should show the same output as above, where you are querying the
# synced k8s secret
kubectl get secrets/accesskey --template='{{.data.access_key}}' | base64 -d

# you can run similar checks for the "secretkey" k8s secret

# based on the lease duration set/capped for the secret, you can wait that
# interval, run the above queries again and VOILA! you get new values,
# i.e. the secrets are getting auto-rotated.
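If you want to watch the rotation happen (assuming the 30m cap from step 15 and the 600s rotationPollInterval from step 16), a small sketch:

# re-run every minute; the key should change once the lease expires and
# the driver's next poll picks up the fresh credential
watch -n 60 "kubectl exec webapp -- cat /mnt/secrets-store/aws-access-key"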
