Vault
Best practices for AppRole authentication
At the core of Vault's usage is authentication and authorization. Understanding the methods that Vault surfaces these to the client is the key to understanding how to configure and manage Vault.
Vault provides authentication to a client by the use of auth methods.
Vault provides authorization to a client by the use of policies.
Vault provides several internal and external authentication methods. External methods are called trusted third-party authenticators such as AWS, LDAP, GitHub, and so on. A trusted third-party authenticator is not available in some situations, so Vault has an alternate approach which is AppRole. If another platform method of authentication is available via a trusted third-party authenticator, the best practice is to use that instead of AppRole.
This guide relies heavily on two fundamental principles for Vault: limiting both the blast-radius of an identity and the duration of authentication.
Blast-radius of an identity
Vault is an identity-based secrets management solution, where access to a secret is based on the known and verified identity of a client. It is crucial that authenticating identities to Vault are identifiable and only have access to the secrets they are the users of. Secrets should never be proxied between Vault and the secret end-user and a client should never have access to secrets they are not the end-user of.
Duration of authentication
When Vault verifies an entity's identity, Vault then provides that entity with a token. The client uses this token for all subsequent interactions with Vault to prove authentication, so this token should be both handled securely and have a limited lifetime. A token should only live for as long as access to the secrets it authorizes access to are needed.
Glossary of terms
- Authentication - The process of confirming identity. Often abbreviated to AuthN
- Authorization - The process of verifying what an entity has access to and at what level. Often abbreviated to AuthZ
- RoleID - The semi-secret identifier for the role that will authenticate to Vault. Think of this as the username portion of an authentication pair.
- SecretID - The secret identifier for the role that will authenticate to Vault. Think of this as the password portion of an authentication pair.
- AppRole role - The role configured in Vault that contains the authorization and usage parameters for the authentication.
What is AppRole auth method?
The AppRole authentication method is for machine authentication to Vault. Because AppRole is designed to be flexible, it has many ways to be configured. The burden of security is on the configurator rather than a trusted third party, as is the case in other Vault auth methods.
AppRole is not a trusted third-party authenticator, but a trusted broker method. The difference is that in AppRole authentication, the onus of trust rests in a securely-managed broker system that brokers authentication between clients and Vault.
The central tenet of this security is that during the brokering of the authentication to Vault, the RoleID and SecretID are only ever together on the end-user system that needs to consume the secret.
In an AppRole authentication, there are three players:
- Vault - The Vault service
- The broker - This is the trusted and secured system that brokers the authentication.
- The secret consumer - This is the final consumer of the secret from Vault.
Platform credential delivery method
To prevent any one system, other than the target client, from obtaining the complete set of credentials (RoleID and SecretID), the recommended implementation is to deliver those values separately through two different channels. This enables you to provide narrowly-scoped tokens to each trusted orchestrator to access either RoleID or SecretID, but never both.
RoleID delivery best practices
RoleID is an identifier that selects the AppRole against which the other credentials are evaluated. Think of it as a username for an application; therefore, RoleID is not a secret value. It's a static UUID that identifies a specific role configuration. Generally, you create a role per application to ensure that each application will have a unique RoleID.
Because it is not a secret, you can embed the RoleID value into a machine image or container as a text file or environment variable.
For example:
- Build an image with Packer with RoleID stored as an environment variable.
- Use Terraform to provision a machine embedded with RoleID.
There are a number of different patterns through which this value can be delivered.
The application running on the machine or container will read the RoleID from the file or environment variable to authenticate with Vault.
Policy requirement
An appropriate policy is required to read RoleID from Vault. For example, to get the RoleID for a role named, "jenkins", the policy should look as below.
# Grant 'read' permission on the 'auth/approle/role/<role_name>/role-id' path
path "auth/approle/role/jenkins/role-id" {
capabilities = [ "read" ]
}
SecretID delivery best practices
SecretID is a credential that is required by default for any login and is intended to always be secret. While RoleID is similar to a username, SecretID is equivalent to a password for its corresponding RoleID.
There are two additional considerations when distributing the SecretID, since it is a secret and should be secured so that only the intended recipient is able to read it.
- Binding CIDRs
- AppRole response wrapping
Binding CIDRs
When defining an AppRole, you can use the secretid_bound_cidrs
parameter to specify blocks of IP addresses which can perform the login operation for this role. You can further limit the IP range per token using token_bound_cidrs
.
Example:
$ vault write auth/approle/role/jenkins \
secret_id_bound_cidrs="0.0.0.0/0","127.0.0.1/32" \
secret_id_ttl=60m \
secret_id_num_uses=5 \
enable_local_secret_ids=false \
token_bound_cidrs="0.0.0.0/0","127.0.0.1/32" \
token_num_uses=10 \
token_ttl=1h \
token_max_ttl=3h \
token_type=default \
period="" \
policies="default","test"
CIDR consideration
While there is no hard limit to how many CIDR blocks you can set using the
token_bound_cidrs
parameter, there are limiting factors. One is the amount of
time it takes for the Vault to compare an IP with the list provided. Another is
the maximum request size of the HTTP when you create the list.
AppRole response wrapping
To guarantee confidentiality, integrity, and non-repudiation of SecretID, you can use the -wrap-ttl
flag when generating the SecretID. Instead of providing the SecretID in plaintext, it puts it into a new token’s Cubbyhole with a token use count of 1. When the application attempts to read the SecretID, we can guarantee that only this application can read it.
Example: The following CLI command retrieves the SecretID for a role named, "jenkins". The generated SecretID is wrapped in a token which is valid for 60 seconds to unwrap.
$ vault write -wrap-ttl=60s -force auth/approle/role/jenkins/secret-id
Key Value
--- -----
wrapping_token: s.yzbznr9NlZNzsgEtz3SI56pX
wrapping_accessor: Smi4CO0Sdhn8FJvL8XvOT30y
wrapping_token_ttl: 1m
wrapping_token_creation_time: 2021-06-07 20:02:01.019838 -0700 PDT
wrapping_token_creation_path: auth/approle/role/jenkins/secret-id
Finally, you can monitor your audit logs for attempted read access of your SecretID. If Vault throws a use-limit error when an application tries to read the SecretID, you know that someone else has read the SecretID and alert on that. The audit logs will indicate where the SecretID read attempt originated.
Policy requirement
An appropriate policy is required to read SecretID from Vault. For example, to get the SecretID for a role named, "jenkins", the policy should look as below.
# Grant 'update' permission on the 'auth/approle/role/<role_name>/secret-id' path
path "auth/approle/role/jenkins/secret-id" {
capabilities = [ "update" ]
}
Token lifetime considerations
Tokens must be maintained client side and upon expiration can be renewed. For short lived workflows, traditionally tokens would be created with a lifetime that would match the average deploy time and left to expire, securing new tokens with each deployment.
A long token time-to-live (TTL) can cause out of memory when trying to purge millions of AppRole leases. To avoid this, we recommend that you reduce TTLs for AppRole tokens and implement token renewal where possible. You can increase the memory on the Vault server; however, it won't be a long-term solution.
In general, with any auth method, it's preferable for applications to keep using the same Vault token to fetch secrets repeatedly instead of a new authentication each time. Authentication is an expensive operation and results in a token that Vault must keep track of. If high authentication throughput, 1000s of authentications per second, are expected we recommend using batch tokens which are issued from memory and do not consume storage.
Vault Agent
Consider running Vault Agent on the client host, and let the agent manage the token's lifecycle. Vault Agent reduces the number of tokens used by the client applications. In addition, it eliminates the need to implement the Vault APIs to authenticate with Vault and renew the token TTL if necessary.
To learn more about Vault Agent, read the following tutorials:
Jenkins CI/CD
When you are using Jenkins as a CI tool, Jenkins itself will need an identity; however, you should never have Jenkins log into Vault and pass a client token to the application via workflow. Jenkins needs to give the application its own identity so that the application gets its own secret. The best practice is to use the Vault Agent as much as possible with Jenkins so that Vault token is not managed by Jenkins. You can deliver a SecretID every morning or before every run for x number of uses. Let Vault Agent authenticate with Vault and get the token for Jenkins. Then, Jenkins uses that token for x number of operations against Vault.
A key benefit of AppRole for applications is that it enables you to more easily migrate the application between platforms.
When you use an AppRole for the application, the best practice is to obscure the RoleID from Jenkins but allow Jenkins to deliver a wrapped SecretID to the application.
Usage workflow
Jenkins needs to run a job requiring some data classified as secret and stored in Vault. It has a master and a worker node where the worker node runs jobs on spawned container runners that are short-lived.
The process would look like:
- Jenkins worker authenticates to Vault
- Vault returns a token
- Worker uses token to retrieve a wrapped SecretID for the role of the job it will spawn
- Wrapped SecretID returned by Vault
- Worker spawns job runner and passes wrapped SecretID as a variable to the job
- Runner container requests unwrap of SecretID
- Vault returns SecretID
- Runner uses RoleID and SecretID to authenticate to Vault
- Vault returns a token with policies that allow read of the required secrets
- Runner uses the token to get secrets from Vault
Here are more details on the more complicated steps of that process.
Secrets wrapping
If you are unfamiliar with secrets wrapping, refer to the response wraping documentation.
CI worker authenticates to Vault
The CI worker will need to authenticate to Vault to retrieve wrapped SecretIDs for the AppRoles of the jobs it will spawn.
If the worker can use a platform method of authentication, then the worker should use that. Otherwise, the only option is to pre-authenticate the worker to Vault in some other way.
Vault returns a token
The worker's Vault token should be of limited scope and should only retrieve wrapped SecretIDs. Because of this the worker could be pre-seeded with a long-lived Vault token or use a hard-coded RoleID and SecretID as this would present only a minor risk.
The policy the worker should have would be:
path "auth/approle/role/+/secret*" {
capabilities = [ "create", "read", "update" ]
min_wrapping_ttl = "100s"
max_wrapping_ttl = "300s"
}
Worker uses token to retrieve a wrapped SecretID
The CI worker now needs to be able to retrieve a wrapped SecretID. This command would be something like:
$ vault write -wrap-ttl=120s -f auth/approle/role/my-role/secret-id
Notice that the worker only needs to know the role for the job it is spawning. In the example above, that is my-role
but not the RoleID.
Worker spawns job runner and passes wrapped SecretID
This could be achieved by passing the wrapped token as an environment variable. Below is an example of how to do this in Jenkins:
environment {
WRAPPED_SID = """$s{sh(
returnStdout: true,
Script: ‘curl --header "X-Vault-Token: $VAULT_TOKEN"
--header "X-Vault-Namespace: ${PROJ_NAME}_namespace"
--header "X-Vault-Wrap-Ttl: 300s"
$VAULT_ADDR/v1/auth/approle/role/$JOB_NAME/secret-id’
| jq -r '.wrap_info.token'
)}"""
}
Runner uses RoleID and SecretID to authenticate to Vault
The runner would authenticate to Vault and it would only receive the policy to read the exact secrets it needed. It could not get anything else. An example policy would be:
path "kv/my-role_secrets/*" {
capabilities = [ "read" ]
}
Implementation specifics
As additional security measures, create the required role for the App bearing in mind the following:
secret_id_bound_cidrs
(array: []) - Comma-separated string or list of CIDR blocks; if set, specifies blocks of IP addresses which can perform the login operation.secret_id_num_uses
(integer: 0) - Number of times any particular SecretID can be used to fetch a token from this AppRole, after which the SecretID will expire. A value of zero will allow unlimited uses.
Recommendation
For best security, set secret_id_num_uses
to 1
use. Also, consider changing secret_id_bound_cidrs
to restrict the source IP range of the connecting devices.
Anti-patterns
Consider avoiding these anti-patterns when using Vault's AppRole auth method.
CI worker retrieves secrets
The CI worker could just authenticate to Vault and retrieve the secrets for the job and pass these to the runner, but this would break the first of the two best practices listed above.
The CI worker may likely have to run many different types of jobs, many of which require secrets. If you use this method, the worker would have to have the authorization (policy) to retrieve many secrets, none of which is the consumer. Additionally, if a single secret were to become compromised, then there would be no way to tie an identity to it and initiate break-glass procedures on that identity. So all secrets would have to be considered compromised.
CI worker passes RoleID and SecretID to the runner
The worker could be authorized to Vault to retrieve the RoleId and SecretID and pass both to the runner to use. While this prevents the worker from having Vault's authorization to retrieve all secrets, it has that capability as it has both RoleID and SecretID. This is against best practice.
CI worker passes a Vault token to the runner
The worker could be authorized to Vault to generate child tokens that have the authorization to retrieve secrets for the pipeline.
Again, this avoids authorization to Vault to retrieve secrets for the worker, but the worker will have access to the child tokens that would have authorization and so it is against best practices.
Security considerations
In any trusted broker situation, the broker (in this case, the Jenkins worker) must be secured and treated as a critical system. This means that users should have minimal access to it and the access should be closely monitored and audited.
Also, as the Vault audit logs provide time-stamped events, monitor the whole process with alerts on two events:
- When a wrapped SecretID is requested for an AppRole, and no Jenkins job is running
- When the Jenkins slave attempts to unwrap the token and Vault refuses as the token has already been used
In both cases, this shows that the trusted-broker workflow has likely been compromised and the event should investigated.