Using Let’s Encrypt with OCI

Sven Illert -

Security is one of the biggest topics and concerns in the IT industry nowadays, and since more and more services are hosted in cloud environments, the need for secure configurations increases steadily. One small part of this is securing connections so that nobody can read your precious data just by capturing data streams. For generic TCP connections the most common way to do this is Transport Layer Security (TLS), formerly known as Secure Sockets Layer (SSL). With the arrival of Let’s Encrypt it became much easier to obtain TLS certificates that are widely accepted.

The most common way to obtain such a certificate is to use certbot from the Electronic Frontier Foundation. It can configure various webservers like Apache or nginx directly and integrates with cloud services like Cloudflare, Google and others. But many other cloud services are not supported directly, and sadly one of them is Oracle Cloud Infrastructure (OCI). In addition, if you want to use the built-in webserver configuration support, the webserver must be reachable by Let’s Encrypt via the public internet, which might not always be desirable. Fortunately certbot allows you to provide your own hooks for the stages of the certificate deployment, and I’d like to show one way to handle that for OCI.

Certbot Stages

First, let’s have a look at the three stages certbot goes through when acquiring a certificate and placing it at the desired location.

Authentication
During this stage certbot requests a token from the Let’s Encrypt servers and places it in a special location. If the servers find the token there when they look it up via the domain you want a certificate for, the authentication is successful.
Cleanup
After successful authentication the token should be removed again, so that there are no leftovers.
Deployment
Since the system is now authenticated and Let’s Encrypt can be sure that the domain in question is owned by you, the certbot requests a certificate and places it in your desired location.

The authentication phase can be handled using two techniques. The first and obvious way is to place the token into a directory that is served by your webserver under a special path (HTTP challenge). The second is to publish the token in a DNS record of the domain (DNS challenge), which comes in handy as soon as you have to ask: what if the webserver is not accessible via the internet?

Loadbalancer

Let’s assume you have a very common cloud setup: a loadbalancer that listens on the public internet and has one or more backend servers acting as webservers. But the loadbalancer is only accessible from one or more specific IP addresses, enforced by firewall rules. The image below shows this setup, with the restricted set of clients on the right side that might be your select customers allowed to access the resources from the webserver.

Common loadbalancer setup in OCI


OK, now let’s start to implement a solution that works with certbot without the Let’s Encrypt servers having direct access to the load balancer.

Prerequisites

There are some prerequisites to be met before I go into the details of the implementation. One that is out of scope here is that the domain in question, or just the required token subdomain, is managed by the OCI DNS. The others are explained now in detail.

Control Server

You need a server where certbot will run and which is allowed to manage the OCI DNS and certificate resources. For simple setups this role can be fulfilled by the webserver from the drawing above. To install the required software you can just install the following packages on Oracle Linux 9.

% dnf install oracle-epel-release-el9
% dnf config-manager --set-enabled ol9_developer_EPEL
% dnf install oci-included-release-el9
% dnf config-manager --set-enabled ol9_oci_included
% dnf install certbot
% dnf install python39-oci-cli
% dnf install jq

OCI Policies and resources

The control server should have HTTP(S) access to the Let’s Encrypt servers and the OCI API. I’ll leave that kind of network configuration up to you. More important is that you enable the control server to manage certain resources in your OCI tenancy. Also, for the whole setup to work, you should delegate either the whole domain in question or the above-mentioned token subdomain to the Oracle DNS. The general process is described in the documentation. If you want to delegate just the subdomain that is required for this setup and leave the rest in your preferred DNS management, then delegate _acme-challenge.yourdomain.tld to the Oracle DNS, as you need to manage a TXT record for that name.
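
The name of the challenge record can be derived mechanically from the domain. The following is a minimal sketch, not part of the original setup: the function names acme_record_name and show_delegation are made up for illustration, and show_delegation assumes that dig (from bind-utils) is installed, which is not part of the package list above.

```shell
#!/bin/sh
# Sketch: derive the name of the ACME challenge record for a domain.
# A leading "*." is stripped, because the DNS challenge for a wildcard
# name is validated on the base domain.
acme_record_name() {
	domain=${1#\*.}
	printf '_acme-challenge.%s\n' "$domain"
}

# Hypothetical helper: print the name servers that answer for the
# challenge subdomain. Assumes dig is available.
show_delegation() {
	dig +short NS "$(acme_record_name "$1")"
}
```

Calling show_delegation yourdomain.tld should then list Oracle name servers if the delegation of the subdomain is in place.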

To enable the server to act on DNS and certificate resources using the OCI API, it needs to be part of a dynamic group, because policies grant permissions to groups rather than to individual instances. With this dynamic group in place, permissions to work with specific resources can be granted in corresponding policies. If you use IaC, the following Terraform code can be used to implement this. Of course you can also implement it using clickops.

resource "oci_identity_dynamic_group" "dg-cert-mgmt" {
  name           = "dg-cert-mgmt"
  description    = "Dynamic group for instances managing certificates"
  compartment_id = var.tenancy-id
  matching_rule  = "All {instance.id = '${oci_core_instance.ek-web01.id}'}"
}

resource "oci_identity_policy" "pol-dns-record-mgmt" {
  name           = "pol-dns-record-mgmt"
  description    = "Policy to manage DNS records"
  compartment_id = var.tenancy-id
  statements = [
    "Allow dynamic-group ${oci_identity_dynamic_group.dg-cert-mgmt.name} to manage dns-records in tenancy",
    "Allow dynamic-group ${oci_identity_dynamic_group.dg-cert-mgmt.name} to use dns-zones in tenancy"
  ]
}

resource "oci_identity_policy" "pol-cert-mgmt" {
  name           = "pol-cert-mgmt"
  description    = "Policy to manage certificates"
  compartment_id = oci_identity_compartment.cp-ek-web.id
  statements = [
    "Allow dynamic-group ${oci_identity_dynamic_group.dg-cert-mgmt.name} to manage leaf-certificate-family in compartment ${oci_identity_compartment.cp-ek-web.name}"
  ]
}

To check whether the above prerequisites are met, you can try to access the already existing zone for your public DNS domain using the following oci command (maybe wait a minute or two to let the changes propagate to all involved backend systems).

% oci --auth instance_principal dns zone get --zone-name-or-id your.domain.tld

Implementation

Once the requirements are met you can use certbot to manage your certificates. Since it does not support OCI out of the box, you need some custom scripts to handle the DNS and certificate management part.

Configuration File

Two scripts will be created, and a small configuration file holds the settings they share so that later changes and additions happen in one place. I placed all resources under the path /opt/le-oci and named the configuration file le-oci.cfg. The content of the file should look like this.

AUTH_CONFIG="instance_principal"
COMPARTMENT_ID="ocid1.compartment.oc1..xxxxyyyyyyzzzzz"

The AUTH_CONFIG variable defines the method of authentication to the OCI API. The easiest way is to implement the prerequisites above and use instance principal authentication. The next variable, COMPARTMENT_ID, holds the OCID of the compartment in which resources like DNS zones are looked up and certificates are placed.
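
Since both scripts depend on these two settings, it can help to validate them right after the file has been sourced. This is only a sketch: check_cfg is a hypothetical helper that is not part of the scripts below, and it assumes that a valid value is either a compartment OCID or the tenancy OCID (for the root compartment).

```shell
#!/bin/sh
# Sketch: validate the two settings from le-oci.cfg after sourcing it.
# Returns 0 if both look plausible, 1 otherwise.
check_cfg() {
	if [ -z "${AUTH_CONFIG}" ]; then
		echo "AUTH_CONFIG is not set." >&2
		return 1
	fi
	case "${COMPARTMENT_ID}" in
		ocid1.compartment.*|ocid1.tenancy.*)
			return 0
			;;
		*)
			echo "COMPARTMENT_ID does not look like a compartment OCID." >&2
			return 1
			;;
	esac
}
```

A script could call check_cfg || exit 1 directly after loading the configuration file.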

Authentication Script

The first script places the authentication token in the corresponding DNS record and cleans it up after success. I named this script /opt/le-oci/le-auth.sh.

#!/bin/sh

cfgfile="/opt/le-oci/le-oci.cfg"

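# Abort with a message if the configuration file is missing.
if [ ! -f "${cfgfile}" ]; then
	echo "Configuration file ${cfgfile} not found." >&2
	exit 1
fi

# Load AUTH_CONFIG and COMPARTMENT_ID.
. "${cfgfile}"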

# Publish the validation token as a TXT record and wait for it to propagate.
auth() {
	if [ -z "${CERTBOT_VALIDATION}" ]; then
		echo "No CERTBOT_VALIDATION specified."
		exit 1
	fi

	oci --auth "${AUTH_CONFIG}" dns record domain patch --zone-name-or-id "${zoneid}" --domain "${patchdom}" --items "[${item}]" >> /var/log/le-auth.log 2>&1
	sleep 60
}

# Remove all TXT records of the challenge name again.
cleanup() {
	oci --auth "${AUTH_CONFIG}" dns record rrset delete --force --zone-name-or-id "${zoneid}" --domain "${patchdom}" --rtype "TXT" >> /var/log/le-auth.log 2>&1
}

# The action is taken from the name the script was invoked under.
command=$(basename "$0")

if [ "$command" != "auth" ] && [ "$command" != "cleanup" ]; then
	echo "Invalid script invocation. Either 'auth' or 'cleanup' is allowed."
	exit 1
fi

if [ -z "${CERTBOT_DOMAIN}" ]; then
	echo "No CERTBOT_DOMAIN specified."
	exit 1
fi


# Walk up from the requested domain until a zone of that name is found.
zoneid=""
zonedom=${CERTBOT_DOMAIN}
while ! echo "$zoneid" | grep -q ocid; do
	if ! echo "$zonedom" | grep -q "\."; then
		echo "No zone found for domain ${CERTBOT_DOMAIN}"
		exit 1
	fi
	zoneid=$(oci --auth "${AUTH_CONFIG}" dns zone list --compartment-id "${COMPARTMENT_ID}" --name "${zonedom}" --all | jq -Mr '.data[].id' 2>/dev/null)
	# Strip the leftmost label for the next attempt.
	zonedom=$(echo "$zonedom" | sed 's/^[^.]*\.//')
done

patchdom="_acme-challenge.${CERTBOT_DOMAIN}"
item=$(printf '{"domain":"%s","rdata":"%s","rtype":"TXT","ttl":30}' "${patchdom}" "${CERTBOT_VALIDATION}")

${command}

The script automatically finds the zone in your DNS configuration in which to place the ACME challenge record. It starts with the most specific name and walks up through the parent domains until a zone matches.
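
To make the traversal order visible without calling the OCI API, the following sketch only prints the zone names that would be tried. The function zone_candidates is a hypothetical helper for illustration and does not appear in the script above.

```shell
#!/bin/sh
# Sketch: print the zone names the lookup would try, most specific first.
# Names without a dot (for example a bare "tld") are never tried.
zone_candidates() {
	name=$1
	while :; do
		case "$name" in
			*.*) printf '%s\n' "$name" ;;
			*) break ;;
		esac
		# Strip the leftmost label for the next round.
		name=${name#*.}
	done
}
```

For www.shop.example.com this prints www.shop.example.com, shop.example.com and example.com, one per line.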

This script needs to be started using a special name which is used as the command name to be executed. To achieve this you should create two symlinks as follows.

% ln -s /opt/le-oci/le-auth.sh /opt/le-oci/auth
% ln -s /opt/le-oci/le-auth.sh /opt/le-oci/cleanup
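
The symlinks work because a shell script sees the path it was called under in $0, so the same file can behave differently depending on the link that was used. A minimal sketch of that dispatch, with hook_action as a made-up name for illustration:

```shell
#!/bin/sh
# Sketch: map the path a script was invoked under to an action name.
# Prints "auth" or "cleanup"; returns 1 for any other name.
hook_action() {
	action=$(basename "$1")
	case "$action" in
		auth|cleanup)
			printf '%s\n' "$action"
			;;
		*)
			echo "Invalid script invocation. Either 'auth' or 'cleanup' is allowed." >&2
			return 1
			;;
	esac
}
```

To try the hooks without certbot, you could export CERTBOT_DOMAIN and CERTBOT_VALIDATION by hand, call /opt/le-oci/auth and check /var/log/le-auth.log afterwards.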

Deployment Script

The script above only covers two stages of the certbot certificate retrieval process. After the authentication has succeeded and certbot has retrieved the certificate for your domain(s), it should be deployed to your webserver or load balancer. In this case we don’t want to configure the former, but provide a certificate for the latter.

In OCI you can use a certificate in a load balancer in two different ways. The first is to import the certificate into the load balancer itself as a load balancer certificate. This is probably what most users would do at first, since it is directly accessible from the configuration page of the load balancer. But managing certificates this way isn’t that easy and does not provide features like versioning. So the second way is the one you should probably go for: the central certificate management, where you can also create your own certification authority or an intermediate authority. This supports versioning and can be used not only for load balancer purposes. I saved the deployment script into the file /opt/le-oci/le-deploy.sh.

#!/bin/sh

cfgfile="/opt/le-oci/le-oci.cfg"

if [ ! -f "${cfgfile}" ]; then
	exit 1
fi

# Load AUTH_CONFIG and COMPARTMENT_ID ("." instead of "source" keeps the script POSIX sh).
. "${cfgfile}"

if [ ! -d "${RENEWED_LINEAGE}" ]; then
	exit 1
fi

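# The certificate in OCI is named after the certbot lineage directory.
certname=$(basename "${RENEWED_LINEAGE}")
certocid=$(oci --auth "${AUTH_CONFIG}" certs-mgmt certificate list --all --name "${certname}" -c "${COMPARTMENT_ID}" | jq -Mr '.data[] | .[0].id')

# Update if the lookup returned an OCID; an empty result or "null" means the certificate does not exist yet.
if [ -n "${certocid}" ] && [ "${certocid}" != "null" ]; then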
	oci --auth ${AUTH_CONFIG} certs-mgmt certificate update-certificate-by-importing-config-details --certificate-id ${certocid} --cert-chain-pem "$(cat ${RENEWED_LINEAGE}/chain.pem)" --certificate-pem "$(cat ${RENEWED_LINEAGE}/cert.pem)" --private-key-pem "$(cat ${RENEWED_LINEAGE}/privkey.pem)" >> /var/log/le-deploy.log 2>&1
else
	oci --auth ${AUTH_CONFIG} certs-mgmt certificate create-by-importing-config --name ${certname} --compartment-id ${COMPARTMENT_ID} --cert-chain-pem "$(cat ${RENEWED_LINEAGE}/chain.pem)" --certificate-pem "$(cat ${RENEWED_LINEAGE}/cert.pem)" --private-key-pem "$(cat ${RENEWED_LINEAGE}/privkey.pem)" >> /var/log/le-deploy.log 2>&1
fi

The script checks whether the certificate has already been added and then either updates the existing one or creates it if it doesn’t exist yet. To keep things consistent I’d create a symlink for the deployment too.
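
The update-or-create decision can be looked at in isolation. The sketch below uses a hypothetical function cert_action and assumes that an empty lookup result or the literal string null both mean that no certificate of that name exists yet.

```shell
#!/bin/sh
# Sketch: decide between update and create from the lookup result.
# An empty string or "null" means that no certificate was found.
cert_action() {
	case "$1" in
		""|null) echo "create" ;;
		*) echo "update" ;;
	esac
}
```

Called with the value of certocid from the deployment script, it prints the name of the branch that would be taken.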

% ln -s /opt/le-oci/le-deploy.sh /opt/le-oci/deploy

Usage

This is probably the most interesting part. After all that theoretical implementation above you may ask yourself: how do I use all that stuff? That’s a good question and the answer is the following execution of the certbot command that retrieves a certificate for a domain and deploys it to the load balancer.

% certbot certonly --manual --domain your.domain.tld --preferred-challenges dns --manual-auth-hook /opt/le-oci/auth --manual-cleanup-hook /opt/le-oci/cleanup --deploy-hook /opt/le-oci/deploy

The command tells certbot to only retrieve a certificate for the domain your.domain.tld and to authenticate via DNS challenge rather than via HTTP (the default). The hooks for the challenge and for the deployment of the certificate point to the scripts created above. For multiple domains I’d suggest using multiple --domain parameters, just as you can repeat any of the --*-hook parameters if you want to execute other scripts as well.

Certbot also ships with a timer which runs periodically and renews certificates that are close to expiry. With the default of 30 days before expiry shown in the configuration below, this currently works out to roughly every 60 days for Let’s Encrypt’s 90-day certificates. By default the configuration of your certonly run is saved to a configuration file in /etc/letsencrypt/renewal, so that the initially specified hooks are used upon renewal. Just have a look at the configuration, which might look as follows.

# renew_before_expiry = 30 days
version = 2.11.0
archive_dir = /etc/letsencrypt/archive/your.domain.tld
cert = /etc/letsencrypt/live/your.domain.tld/cert.pem
privkey = /etc/letsencrypt/live/your.domain.tld/privkey.pem
chain = /etc/letsencrypt/live/your.domain.tld/chain.pem
fullchain = /etc/letsencrypt/live/your.domain.tld/fullchain.pem

# Options used in the renewal process
[renewalparams]
account = 1234
pref_challs = dns-01,
renew_hook = /opt/le-oci/deploy
authenticator = manual
manual_auth_hook = /opt/le-oci/auth
manual_cleanup_hook = /opt/le-oci/cleanup
server = https://acme-v02.api.letsencrypt.org/directory
key_type = ecdsa
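
To confirm quickly that the hooks ended up in the renewal configuration, a single key can be read from the file. This is a minimal sketch: renewal_param is a made-up helper, and it assumes simple "key = value" lines like the ones shown above.

```shell
#!/bin/sh
# Sketch: print the value of one key from a certbot renewal configuration.
# Commented-out lines are ignored because they do not start with the key.
renewal_param() {
	file=$1
	key=$2
	sed -n "s/^${key} = //p" "$file"
}
```

For example, renewal_param /etc/letsencrypt/renewal/your.domain.tld.conf renew_hook should print /opt/le-oci/deploy, assuming the file carries the name of the certificate lineage.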

So that’s it. If you now check your certificates (Identity & Security -> Certificates -> Certificates) in your OCI account and select the corresponding compartment, you can see the certificates that are ready to be used in your loadbalancer configuration.