Securely Managing AWS Infrastructure without Passwords with GitLab
Intro
I’m looking at how to manage infrastructure without the use of passwords or static keys. These authentication mechanisms have been the bane of platform engineers everywhere, so removing them wherever possible is a goal worth shooting for. But why?
Leaked keys can expose your infrastructure and data to bad actors
Rotating passwords is part of multiple compliance requirements, including PCI-DSS
Multiple platforms require that keys have a lifespan, necessitating rotation
If you automate the rotation, you need to be aware of platform limitations
I still remember the 2 AM war-room calls with developers and leadership when our app went down because a database password was automatically rotated. The new password had an exclamation point (!) which broke shell parsing in the app, causing it to fail.
This leads us to where we are today: How do we manage infrastructure without dealing with any of these headaches? By using the GitLab OIDC authentication mechanism.
If you’re not aware, the way this feature works is that GitLab acts as an OIDC provider, and each CI job can receive its own JWT. This JWT has a lifespan as long as the current job, and after that, it’s invalid. No passwords, no rotations, just clean and easy ID tokens that get automatically injected into your pipeline, allowing for seamless Terraform, CDK or AWS CLI commands as your infrastructure requires. Note that this works with other cloud providers - I’m just using AWS as an example.
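To make the token less abstract, here is a minimal Python sketch that decodes the payload of a JWT so you can look at the iss, aud and sub claims discussed later. It does not verify the signature, so treat it as a debugging aid only. The claim values and the unsigned stand-in token are hypothetical, built locally so the sketch runs without a real pipeline.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(raw: bytes) -> str:
    """Encode bytes as unpadded base64url, the way JWT segments are written."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_claims(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))


# Hypothetical claims, shaped like the fields this article refers to.
example_claims = {
    "iss": "https://gitlab.com",
    "aud": "https://gitlab.com",
    "sub": "project_path:my-group/my-project:ref_type:branch:ref:main",
}

# Unsigned stand-in token (assumption: a real job would read $STS_TOKEN instead).
example_token = ".".join(
    [
        b64url_encode(json.dumps({"alg": "none", "typ": "JWT"}).encode()),
        b64url_encode(json.dumps(example_claims).encode()),
        "placeholder-signature",
    ]
)

if __name__ == "__main__":
    for name, value in decode_claims(example_token).items():
        print(f"{name}: {value}")
```

In a real job you would pass the value of the ID token variable to decode_claims instead of the stand-in token.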
The examples below assume you have the following:
A GitLab group or project that you have at least maintainer access to
The ability to merge/write to the default branch
An AWS account already created
The ability to provision the necessary resources in that AWS account
Let’s run through the steps:
Provision the AWS IAM OIDC provider
Provision an administrative IAM role for Terraform
Configure GitLab to send an ID Token to the job
Configure the job to use the token
Unfortunately, the initial setup must be done via some other authentication mechanism. In this example, I created a temporary IAM user and an access key/secret key (AK/SK) pair for that user for the initial Terraform run. Note that I’ll be showing snippets throughout; if you want the full code, scroll to the bottom. The key portion is this:
resource "aws_iam_openid_connect_provider" "gitlab" {
  url             = "https://gitlab.com"
  client_id_list  = ["https://gitlab.com"]
  thumbprint_list = ["2b8f1b57330dbba2d07a6c51f70ee90ddab9ad8e"]

  tags = {
    Name = "GitLab"
  }
}
This is the resource that allows AWS IAM to accept the JWT for authentication. The most important parts here are the url, which must match the iss field of your JWT, and the client_id_list field, which lists the values accepted in the JWT’s aud field. Note that the thumbprint specified is NOT the thumbprint of the gitlab.com certificate, but rather that of the root CA’s certificate (USERTrust RSA Certification Authority for gitlab.com).
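If you need to produce a thumbprint yourself, it is the hex-encoded SHA-1 digest of the certificate’s DER bytes. Below is a minimal Python sketch, assuming you have already saved the CA certificate as a PEM file; the function names and the file name in the usage note are hypothetical, and the sketch does not fetch or validate the certificate chain.

```python
import base64
import hashlib


def pem_to_der(pem: str) -> bytes:
    """Drop the BEGIN/END marker lines of a PEM block and decode the base64 body."""
    lines = (line.strip() for line in pem.splitlines())
    body = "".join(line for line in lines if line and not line.startswith("-----"))
    return base64.b64decode(body)


def thumbprint(pem: str) -> str:
    """Lowercase hex SHA-1 digest of the DER bytes, as used in thumbprint_list."""
    return hashlib.sha1(pem_to_der(pem)).hexdigest()
```

For example, thumbprint(open("root-ca.pem").read()) would return a 40-character string that you can compare against the value in thumbprint_list.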
After this is created, you’ll want to create a role to assume so you can use the new provider. The key part here is the assume role policy (AKA trust policy) and the conditions on it. We don’t want just anyone with a gitlab.com account to be able to access your infrastructure, so let’s limit that to your group and/or projects. Here’s an example:
data "aws_iam_policy_document" "gitlab_assume_policy" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    effect  = "Allow"

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.gitlab.arn]
    }

    condition {
      test     = "StringLike"
      variable = "gitlab.com:sub"
      values   = ["project_path:mschultz.consulting/*:ref_type:*:ref:*"]
    }
  }
}
The two most common condition operators I use are StringLike and StringEquals on the sub field. Note that the variable starts with gitlab.com, which is the provider URL (the iss value without the https:// prefix), so it changes if you use a different GitLab host. The sub field comes with a lot of information in it, and can be limited however you like. In this example, I’m allowing any project in my top-level group, mschultz.consulting, to access this role. Additionally, any ref type (tag or branch) and any ref name (such as a branch name or tag name) is allowed. In an ideal world, you would have multiple roles: a read-only role for use in Merge Requests that’s allowed to run terraform plan on any branch, while this admin role might be locked down to only allow access from main.
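To get a feel for what a given pattern lets through before you apply it, here is a small Python sketch that models StringLike matching locally (where * matches any run of characters and ? matches exactly one) against sub claims in the layout used above. This is an approximation for reasoning about patterns, not IAM’s actual evaluator. The MAIN_ONLY pattern and the project name infra are hypothetical.

```python
import re


def string_like(pattern: str, value: str) -> bool:
    """Approximate IAM StringLike: '*' is any run of characters, '?' is one."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def sub_claim(project_path: str, ref_type: str, ref: str) -> str:
    """Assemble a sub claim in the layout used by the trust policy above."""
    return f"project_path:{project_path}:ref_type:{ref_type}:ref:{ref}"


# The pattern from the trust policy above: any project, any ref.
GROUP_WIDE = "project_path:mschultz.consulting/*:ref_type:*:ref:*"
# Hypothetical tighter pattern for an admin role: only the main branch.
MAIN_ONLY = "project_path:mschultz.consulting/*:ref_type:branch:ref:main"

if __name__ == "__main__":
    candidates = [
        sub_claim("mschultz.consulting/infra", "branch", "main"),
        sub_claim("mschultz.consulting/infra", "branch", "feature/x"),
        sub_claim("someone-else/infra", "branch", "main"),
    ]
    for claim in candidates:
        print(claim, string_like(GROUP_WIDE, claim), string_like(MAIN_ONLY, claim))
```

The group-wide pattern accepts both branches of the in-group project and rejects the outside project, while the main-only pattern accepts only the first claim.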
Next up, let’s connect this to GitLab. GitLab provides the id_tokens stanza, which creates the token and provides it to the job. That stanza looks like this:
.job:
  id_tokens:
    STS_TOKEN: # This will be the variable passed into the job
      aud: https://gitlab.com
Now this is the annoying part - the AWS SDK doesn’t allow the web identity token to be an environment variable; it must be a file. You can call aws sts assume-role-with-web-identity directly and pass the token in as an argument, but if you’re using e.g. the Terraform AWS provider or the AWS CLI without calling assume-role yourself, it must be a file. There are a few ways to manage this:
Echo the token into a file (echo $STS_TOKEN > $IDFILE)
Specify the token manually via arguments and variables
But both of those are a bit clunky. There is a neat trick I’ll go into later. But for now, this gives us one of two job options that look like this:
.terraform:
  id_tokens:
    TF_VAR_identity_token:
      aud: https://gitlab.com
  variables:
    TF_VAR_assumed_role_arn: arn:aws:iam::$AWS_ACCOUNT_ID:role/Terraform_Admin

.other_job_using_aws_cli_or_sdk:
  id_tokens:
    STS_TOKEN:
      aud: https://gitlab.com
  variables:
    AWS_ROLE_ARN: arn:aws:iam::$AWS_ACCOUNT_ID:role/OtherRole
    AWS_WEB_IDENTITY_TOKEN_FILE: "/tmp/id_file"
    AWS_ROLE_SESSION_NAME: gitlab
  before_script:
    - echo $STS_TOKEN > $AWS_WEB_IDENTITY_TOKEN_FILE
Use whichever of these makes the most sense for what you’re doing. Most of my work using the AWS CLI, such as S3 sync, CloudFront invalidations, etc., uses the second job, whereas anything in Terraform uses the first. See the full code at the bottom for how to set up those variables to pass through to the provider.
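If your job runs a script rather than plain shell commands, the "echo the token into a file" step can live in that script instead. Here is a minimal Python sketch of the same idea, assuming the token arrives in the STS_TOKEN environment variable as in the job above; the helper name write_token_file is hypothetical.

```python
import os
import tempfile


def write_token_file(token: str, path: str) -> str:
    """Write the token to `path` and return the path.

    If the file does not exist yet, it is created readable and writable
    by the owner only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(token)
    return path


if __name__ == "__main__":
    # Placeholder value so the sketch also runs outside a pipeline.
    token = os.environ.get("STS_TOKEN", "placeholder-token")
    target = os.path.join(tempfile.gettempdir(), "id_file")
    # Only this process and its children see the variable set here.
    os.environ["AWS_WEB_IDENTITY_TOKEN_FILE"] = write_token_file(token, target)
    print(os.environ["AWS_WEB_IDENTITY_TOKEN_FILE"])
```

Because the environment variable is set inside the Python process, anything that needs it has to be started from that same script.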
Now that we’ve got the boring out of the way, let’s have some fun:
The Hack
By using GitLab CI/CD variables through the UI, you can have GitLab create the file for you! Then, you can pass all of that information to the SDK/CLI/providers through environment variables, without needing before_script commands or passing variables around. To do this:
Go to your project or group’s CI/CD variables via Settings->CI/CD->Variables
Add a new variable called AWS_WEB_IDENTITY_TOKEN_FILE
Type: File
Visibility: Visible
Not protected
Expanded
Value: $STS_TOKEN
Now things start to get a bit interesting. Using the list of supported env vars, we have three that need to be set in total: AWS_WEB_IDENTITY_TOKEN_FILE, AWS_ROLE_ARN, and AWS_ROLE_SESSION_NAME. With all of these set, some very interesting possibilities open up.
.other_job_using_aws_cli_or_sdk:
  id_tokens:
    STS_TOKEN:
      aud: https://gitlab.com
  variables:
    AWS_ROLE_ARN: arn:aws:iam::$AWS_ACCOUNT_ID:role/OtherRole
    AWS_ROLE_SESSION_NAME: gitlab
Note the lack of a need to specify the web identity token, since that’s now automatically injected by GitLab, and no use of the before_script. That part is key - by removing the need for the before_script section, other jobs can use it without needing to resort to yaml anchors or other messy workarounds.
You might be asking: Why am I calling this a hack? Because how variables get interpolated as part of the CI/CD process isn’t well documented. This behavior may be an unintended consequence, and it could be patched out later if GitLab decides that id_tokens shouldn’t be a part of variable expansion for security or other reasons. As it stands, I don’t immediately foresee this breaking, because it’s incredibly useful and still has all of the other limitations of id_tokens without exposing any additional attack surface.
So with all of that, we’re done! Include that template job, and you’re off to running AWS commands through GitLab CI without having to manage any keys, passwords, or other authentication mechanisms. Feel free to play around with it! You can limit your assume role document to only allow certain emails to auth, or other limitations as needed by your organization.
But before I let you go…
One last thing
You see that AWS_ROLE_SESSION_NAME? You can use variable interpolation on that too! For example, if you were to set that to something like $CI_PROJECT_ID-$CI_PIPELINE_ID, you’d be able to trace any updates via AWS CloudTrail to the project and pipeline ID. That pipeline ID has the user who submitted the pipeline, as well as the git history attached to it. This allows full traceability back to individual contributors for exact changes at exact times - an auditor’s dream, and an easy way for your platform engineer to do a full RCA if anything goes wrong.
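One thing to watch when interpolating into the session name: STS only accepts names of 2 to 64 characters drawn from letters, digits, underscore and the characters +=,.@-. The Python sketch below builds a project-and-pipeline session name, checks it against those constraints, and splits it back apart when you find it in an audit trail. The function names are hypothetical, and the IDs are assumed to be numeric as GitLab’s project and pipeline IDs are.

```python
import re

# Constraints documented for the STS RoleSessionName parameter.
SESSION_NAME_RE = re.compile(r"[\w+=,.@-]{2,64}", flags=re.ASCII)


def session_name(project_id: str, pipeline_id: str) -> str:
    """Build '<project>-<pipeline>' and reject names STS would not accept."""
    name = f"{project_id}-{pipeline_id}"
    if not SESSION_NAME_RE.fullmatch(name):
        raise ValueError(f"not a valid role session name: {name!r}")
    return name


def split_session_name(name: str) -> tuple[str, str]:
    """Recover (project_id, pipeline_id) from a name built by session_name."""
    project_id, pipeline_id = name.split("-", 1)
    return project_id, pipeline_id


if __name__ == "__main__":
    name = session_name("1234", "5678")
    print(name, split_session_name(name))
```

The session name is the last segment of the assumed-role ARN (arn:aws:sts::ACCOUNT_ID:assumed-role/ROLE_NAME/SESSION_NAME), so splitting it gives you the IDs to look up in GitLab.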
Curious about this? Have any questions or comments? Want to see how I can help you accelerate your development teams and reduce complexity by eliminating annoyances such as long-lived keys?
The Magic
Initial Setup
Run the below Terraform utilizing a standard AK/SK setup in order to provision the GitLab OIDC provider and admin-level Terraform role for future use. Remember to de-provision the AK/SK after running this!
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.7"
    }
  }
}

provider "aws" {
}

resource "aws_iam_openid_connect_provider" "gitlab" {
  url             = "https://gitlab.com"
  client_id_list  = ["https://gitlab.com"]
  thumbprint_list = ["2b8f1b57330dbba2d07a6c51f70ee90ddab9ad8e"]

  tags = {
    Name = "GitLab"
  }
}

resource "aws_iam_role" "terraform_admin" {
  name                = "Terraform_Admin"
  assume_role_policy  = data.aws_iam_policy_document.gitlab_assume_policy.json
  managed_policy_arns = ["arn:aws:iam::aws:policy/AdministratorAccess"]
}

data "aws_iam_policy_document" "gitlab_assume_policy" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    effect  = "Allow"

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.gitlab.arn]
    }

    condition {
      test     = "StringLike"
      variable = "gitlab.com:sub"
      values   = ["project_path:mschultz.consulting/*:ref_type:*:ref:*"]
      # Note that this will allow access from any project in your top level group.
      # Filter your ref_type (allowed: branch, tag) and ref accordingly.
    }
  }
}
The Rest of the Way
Below are snippets that you’ll be able to put into your GitLab CI file and Terraform in order to use this the rest of the way.
Terraform Boilerplate:
variable "identity_token" {
  type        = string
  description = "The GitLab OIDC identity token"
}

variable "assumed_role_arn" {
  type        = string
  description = "The AWS STS role to assume"
}

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.7"
    }
  }
}

provider "aws" {
  # Assuming we're not doing the variable hack above.
  # If you are, omit this stanza. You can also remove the
  # variables above. Clean!
  assume_role_with_web_identity {
    web_identity_token = var.identity_token
    role_arn           = var.assumed_role_arn
  }
}
GitLab CI Boilerplate:
.terraform:
  id_tokens:
    TF_VAR_identity_token:
      aud: https://gitlab.com
  variables:
    TF_VAR_assumed_role_arn: arn:aws:iam::$AWS_ACCOUNT_ID:role/Terraform_Admin

.other_job_using_aws_cli_or_sdk:
  id_tokens:
    STS_TOKEN:
      aud: https://gitlab.com
  variables:
    AWS_ROLE_ARN: arn:aws:iam::$AWS_ACCOUNT_ID:role/OtherRole
    AWS_WEB_IDENTITY_TOKEN_FILE: "/tmp/id_file"
  before_script:
    - echo $STS_TOKEN > $AWS_WEB_IDENTITY_TOKEN_FILE

.job_using_gitlab_variable_hack:
  id_tokens:
    STS_TOKEN:
      aud: https://gitlab.com
  variables:
    AWS_ROLE_ARN: arn:aws:iam::$AWS_ACCOUNT_ID:role/OtherRole
    AWS_ROLE_SESSION_NAME: $CI_PIPELINE_ID
Sources
https://docs.gitlab.com/ee/ci/secrets/id_token_authentication.html
https://docs.gitlab.com/ee/ci/cloud_services/aws/
https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role
https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_create_oidc_verify-thumbprint.html
https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_openid_connect_provider