Properly install EFS CSI driver on EKS cluster

Benoit MOUQUET
Mar 13, 2023

Amazon EFS provides a simple, serverless, set-and-forget elastic file system. With Amazon EFS, you can create a file system, mount the file system on an Amazon EC2 instance, and then read and write data to and from your file system. You can mount an Amazon EFS file system in your virtual private cloud (VPC), through the Network File System versions 4.0 and 4.1 (NFSv4) protocol.

By installing the Amazon EFS CSI Driver on an EKS cluster, you will be able to consume EFS storage to provision persistent volumes (PVs).

EFS vs EBS

The standard way of provisioning PVs on an EKS cluster is to use Elastic Block Store (EBS) to provide storage volumes to stateful applications (like databases). These volumes are formatted with a classic file system like ext4, XFS or NTFS and are well suited for standard or IO-intensive workloads.

Starting with Kubernetes version 1.23, EKS requires the EBS CSI Driver in order to provision, attach and mount EBS volumes for Kubernetes pods.

But EBS volumes have two major limitations:

  • They are bound to an AWS availability zone (AZ) and cannot be attached to an EC2 instance in another AZ
  • They can be attached to only one node at a time

So what can I do if an application requires a volume shared between replicas? Generally, the preferred solution is to use object storage like AWS S3 or an equivalent. But this may require significant changes in the application and may not always be possible (for example in a lift and shift strategy).

Fortunately, AWS has a solution for us named EFS. EFS is a network file system based on the NFS v4.1 protocol that allows cross-AZ access and multi-attachment of a data volume. EFS is fully integrated with Kubernetes through the EFS CSI Driver.

Installation of EFS CSI Driver

AWS provides documentation to easily install this driver on the cluster. But this documentation omits a critical part… security. In order to use an EFS file system in your infrastructure, you first need to create mount targets to expose the network storage to the EC2 instances in your VPC. Generally, one mount target (corresponding to an ENI) per availability zone will be created.

EFS with one mount target per AZ
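
For reference, mount targets can be created with the AWS CLI, one per subnet (a minimal sketch; the subnet and security group IDs are placeholders, and the security group must allow inbound NFS traffic on TCP 2049 from your worker nodes):

# Create one mount target per AZ (repeat for each private subnet)
aws efs create-mount-target \
  --file-system-id <fs-id> \
  --subnet-id <subnet-id> \
  --security-groups <sg-id>

# List the mount targets of the file system
aws efs describe-mount-targets --file-system-id <fs-id>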

If no particular restriction is applied to the EFS, the file system can be mounted on any EC2 instance connected to the VPC (without any IAM role or specific credentials). Of course, some limitations can be configured by using a security group, but this is not enough, especially on a shared (mutualised) EKS cluster.

You can verify it by using the following bash commands:

sudo yum install amazon-efs-utils -y
sudo mkdir /mnt/efs
sudo mount -t efs <fs-id> /mnt/efs

Protect EFS

To prevent anonymous mounts of the EFS storage, you can add a file system policy to it, like this one:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "RemoveAllDefaultPermission",
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": "",
      "Resource": "arn:aws:elasticfilesystem:eu-west-3:<account-id>:file-system/<efs-id>"
    },
    {
      "Sid": "ForceTLSUsage",
      "Effect": "Deny",
      "Principal": {
        "AWS": "*"
      },
      "Action": "*",
      "Resource": "arn:aws:elasticfilesystem:eu-west-3:<account-id>:file-system/<efs-id>",
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    }
  ]
}

The policy does two things:

  • Removes all default permissions (once a file system policy is attached, the default full access for any client with network connectivity no longer applies)
  • Forces the usage of TLS for NFS mounts (arguably optional, because authenticated mounts always require TLS encryption)
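
Assuming the policy above is saved locally as efs-policy.json (a hypothetical file name), it can be attached to the file system with the AWS CLI:

aws efs put-file-system-policy \
  --file-system-id <fs-id> \
  --policy file://efs-policy.json

# Check the policy currently in place
aws efs describe-file-system-policy --file-system-id <fs-id>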

Kubernetes Volume provisioning

Static Provisioning

After installing the EFS CSI Driver on the cluster, you have two ways to use EFS storage with your pods.

Static provisioning is the first solution: you manually create a storage class (SC), a persistent volume (PV) and the corresponding persistent volume claim (PVC). Normally the PV will be automatically bound to the PVC (or you can enforce it by setting the volumeName attribute). The volume can be attached directly to the root directory of the EFS storage, or to a specific subpath by using an access point. This option is suitable for some cases, especially if you require standard file permissions, but it does not allow Kubernetes users to be autonomous in creating their EFS volumes.
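
As an illustration, here is a minimal sketch of static provisioning (the names efs-static-pv and pdf-reports-static are hypothetical, the storageClassName only has to match between the PV and the PVC, and the volumeHandle can also reference an access point using the <efs-id>::<access-point-id> form):

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-static-pv
spec:
  capacity:
    storage: 5Gi                  # required by the API, not enforced by EFS
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: <efs-id>        # or <efs-id>::<access-point-id>
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pdf-reports-static
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
  volumeName: efs-static-pv       # enforce the binding to the PV above
EOF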

Dynamic Provisioning

In order to dynamically create EFS volumes in Kubernetes (without requiring AWS permissions to create a new EFS file system or access points), we will take advantage of the access point feature of EFS.

An access point, required for dynamic provisioning, applies an operating system user, group and file system path to any storage request made through it. A path inside the EFS, corresponding to the PV name in our case, will serve as the root folder of the volume (the pod will not have access to other files on the network storage).

With dynamic provisioning, users of the cluster are autonomous and can create EFS volumes by themselves. Access points require TLS encryption (standard EFS traffic is not encrypted by default), so no further configuration is needed. The only drawback is that all files in the volume will have the same UID and GID assigned (the ones specified on the access point). Given the typical usage of EFS volumes, this should not be a big deal. Also, dynamic provisioning is not compatible with Fargate nodes.
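
You can list the access points present on a file system (for example the ones created by the driver during dynamic provisioning) with the AWS CLI:

aws efs describe-access-points --file-system-id <efs-id>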

Deploying EFS CSI Controller

First of all, we need to create the necessary IAM role and policy (Terraform code):

data "aws_iam_policy_document" "efs_csi" {
statement {
effect = "Allow"
actions = [
"elasticfilesystem:DescribeAccessPoints",
"elasticfilesystem:DescribeFileSystems",
"elasticfilesystem:DescribeMountTargets",
"ec2:DescribeAvailabilityZones"
]
resources = ["*"]
}

statement {
effect = "Allow"
actions = [
"elasticfilesystem:CreateAccessPoint"
]
resources = ["<efs-arn>"]
condition {
test = "StringLike"
variable = "aws:RequestTag/efs.csi.aws.com/cluster"
values = ["<cluster-name>"]
}
}

statement {
effect = "Allow"
actions = [
"elasticfilesystem:DeleteAccessPoint"
]
resources = ["*"]
condition {
test = "StringLike"
variable = "aws:ResourceTag/efs.csi.aws.com/cluster"
values = ["<cluster-name>"]
}
}

statement {
effect = "Allow"
actions = [
"elasticfilesystem:ClientRootAccess",
"elasticfilesystem:ClientWrite",
"elasticfilesystem:ClientMount"
]
resources = ["<efs-id>"]
}
}

resource "aws_iam_policy" "efs_csi" {
name = "<cluster-name>-efs-csi-driver-policy"
description = "Policy for efs-csi-driver service account"
policy = data.aws_iam_policy_document.efs_csi.json
}

module "efs_csi_iam_assumable_role" {
source = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
version = "5.11.2"
create_role = true
role_name = "<cluster-name>-efs-csi-driver-sa-role"
provider_url = local.eks_oidc_provider_schemeless_url
role_policy_arns = [
aws_iam_policy.efs_csi.arn
]
oidc_fully_qualified_subjects = [
"system:serviceaccount:kube-system:efs-csi-controller-sa",
"system:serviceaccount:kube-system:efs-csi-node-sa"
]
}

This policy is designed so the controller can only manage the EFS file system used by the cluster (and not other EFS file systems that might exist).

Now we can deploy the controller on the cluster. We will use the official Helm chart (GitHub repo):

resource "helm_release" "efs_controller" {
name = "aws-efs-csi-driver"
chart = "aws-efs-csi-driver"
repository = "https://kubernetes-sigs.github.io/aws-efs-csi-driver/"
version = "2.3.6"
namespace = "kube-system"

values = [
<<EOF
clusterName: <cluster-name>
controller:
create: true
deleteAccessPointRootDir: true
serviceAccount:
name: efs-csi-controller-sa
annotations:
eks.amazonaws.com/role-arn: ${module.efs_csi_iam_assumable_role.iam_role_arn}
tags:
efs.csi.aws.com/cluster: <cluster-name>
node:
serviceAccount:
name: efs-csi-node-sa
annotations:
eks.amazonaws.com/role-arn: ${module.efs_csi_iam_assumable_role.iam_role_arn}
EOF
]
}

Note the option deleteAccessPointRootDir set to true. This option automatically deletes the data on EFS when a PVC is removed, which avoids manual operations to reclaim EFS space.

To be able to use EFS volumes, we create the corresponding StorageClass with the following parameters:

---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap # Only possible value, uses dynamic provisioning with access points
  fileSystemId: <efs-id> # ID of the EFS file system
  directoryPerms: "700" # Permissions on the access point root directory
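
Assuming the manifest above is saved as efs-sc.yaml (a hypothetical file name), apply it and check that the class is registered:

kubectl apply -f efs-sc.yaml
kubectl get storageclass efs-sc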

Now we can easily create a PVC that uses this StorageClass:

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pdf-reports
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi

At this point, the PVC can be used by multiple pods at the same time, and it can be attached across multiple AWS AZs. We specify a limit of 5Gi on the volume because this field is required by the Kubernetes PVC specification, but it will not be taken into consideration.
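
As a quick sanity check (assuming the PVC was created in the current namespace), you can verify that the driver provisioned a PV and bound it to the claim:

kubectl get pvc pdf-reports   # STATUS should be Bound
kubectl get pv                # shows the dynamically provisioned volume backed by efs.csi.aws.com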

We can test the configuration with the following deployment. The pod will be replicated 3 times, with an anti-affinity rule enforced on the hostname to spread the pods over separate nodes and force multiple attachments.

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: back-office
spec:
  selector:
    matchLabels:
      app: back-office
  replicas: 3
  template:
    metadata:
      labels:
        app: back-office
    spec:
      containers:
        - name: back-office
          image: registry.mycompany.com/back-office:v1.2.3
          resources:
            limits:
              memory: "4Gi"
              cpu: "1"
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: pdf-reports
              mountPath: /pdf-reports
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - back-office
              topologyKey: "kubernetes.io/hostname"
      volumes:
        - name: pdf-reports
          persistentVolumeClaim:
            claimName: pdf-reports

Pods on different nodes sharing the same PVC

The pdf-reports volume mounted inside each pod
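
You can verify the spread and the shared mount with a few kubectl commands (a sketch; the file name hello is just an example):

kubectl get pods -l app=back-office -o wide           # each pod should be scheduled on a different node
kubectl exec deploy/back-office -- df -h /pdf-reports
kubectl exec deploy/back-office -- sh -c 'touch /pdf-reports/hello && ls /pdf-reports'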

Why is the network mount point a localhost IP? To allow data encryption in transit, a TLS tunnel (using stunnel) is created between the node and AWS EFS. This TLS tunnel is also used to handle IAM authentication. On a standard EC2 instance, this is managed by the following tool: EFS-Utils
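
For reference, the same mechanism can be reproduced manually on an EC2 instance with the efs-utils mount helper; the tls and iam options start the stunnel process and authenticate the mount with the instance credentials (the mount point path is arbitrary):

sudo mount -t efs -o tls,iam <fs-id> /mnt/efs
# the NFS traffic then goes through a local stunnel listener on 127.0.0.1
mount | grep nfs4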

Conclusion

Now we are able to mount and write to the same volume from multiple pods spread over different nodes. Using access points, TLS encryption and IAM authentication, we reduce the attack surface of this network storage. Remember that this kind of storage does not offer outstanding performance, so use it sparingly and prefer an object storage approach (like an S3 bucket) when possible.
