Exadata@Azure with OpenTofu
Sven Illert -
In one of my recent projects I am building an application environment in Azure using OpenTofu. That in itself is quite a challenge, since until this year I had mostly worked with Oracle Cloud Infrastructure (OCI). Another challenge is handling the Exadata@Azure part of the project, since two different cloud worlds are involved.
At first I had some problems deploying an Exadata cluster into an Azure virtual network with an IPv4/IPv6 dual-stack configuration. According to the documentation that isn’t recommended, but since only one virtual network was to be used, I didn’t bother and tried to create the cluster in the dual-stack vnet anyway. Although no IPv6 prefix was configured for the delegated subnet that OCI should use, a bug in the creation of the network security group (NSG) meant that the IPv6 prefix of the vnet was not ignored. That made the deployment fail, because the virtual network in OCI is IPv4-only, so it is not possible to add IPv6 CIDRs to an NSG. Since the project has quite some relevance even for Oracle, they fixed that in the record time of about two weeks.
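To make the network layout described above more concrete, here is a minimal sketch of a dual-stack vnet with an IPv4-only delegated subnet. The resource names mirror those referenced in the cluster configuration of this post, but the location, all address ranges, the delegation block name and the delegation actions are assumptions for illustration and not taken from the actual project.

```hcl
# Sketch only: location and address ranges are assumed values.
resource "azurerm_resource_group" "rg-exadata-run" {
  name     = "rg-exadata-run"
  location = "germanywestcentral" # assumption
}

# Dual-stack virtual network: one IPv4 and one IPv6 address space.
resource "azurerm_virtual_network" "vnet-exadata-run" {
  name                = "vnet-exadata-run"
  location            = azurerm_resource_group.rg-exadata-run.location
  resource_group_name = azurerm_resource_group.rg-exadata-run.name
  address_space       = ["10.0.0.0/16", "fd00:db8:deca::/48"] # assumption
}

# Delegated subnet for Exadata: deliberately IPv4 only, no IPv6 prefix.
resource "azurerm_subnet" "subnet-oracle-exadata" {
  name                 = "subnet-oracle-exadata"
  resource_group_name  = azurerm_resource_group.rg-exadata-run.name
  virtual_network_name = azurerm_virtual_network.vnet-exadata-run.name
  address_prefixes     = ["10.0.1.0/24"] # assumption

  delegation {
    name = "oracle-database" # hypothetical name

    service_delegation {
      name = "Oracle.Database/networkAttachments"
      # Assumed action list; check the azurerm_subnet documentation for your provider version.
      actions = [
        "Microsoft.Network/networkinterfaces/*",
        "Microsoft.Network/virtualNetworks/subnets/join/action",
      ]
    }
  }
}
```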
The first attempt at creating that cluster was done by click ops via the Azure web portal. The next step was to transfer that cluster configuration into code to make things easily reproducible. So I went to the azurerm documentation for Oracle Cloud VM clusters and followed the specs. I came up with the following configuration, where I specified only the required resource arguments and those I intended to change from the default.
resource "azurerm_oracle_cloud_vm_cluster" "exacl-fz3" {
  name = "exacl-fz3"
  display_name = "exacl-fz3"
  location = azurerm_resource_group.rg-exadata-run.location
  resource_group_name = azurerm_resource_group.rg-exadata-run.name
  cloud_exadata_infrastructure_id = azurerm_oracle_exadata_infrastructure.exa-platform-zone3.id
  db_servers = [
    data.azurerm_oracle_db_servers.exacl-zone3.db_servers[0].ocid,
    data.azurerm_oracle_db_servers.exacl-zone3.db_servers[1].ocid,
  ]
  hostname = "sd-fz3"
  virtual_network_id = azurerm_virtual_network.vnet-exadata-run.id
  subnet_id = azurerm_subnet.subnet-oracle-exadata.id
  cpu_core_count = 4
  license_model = "BringYourOwnLicense"
  gi_version = "23.0.0.0"
  ssh_public_keys = ["ssh-rsa xxx"]
  data_storage_percentage = "80"
  time_zone = "Europe/Berlin"
  db_node_storage_size_in_gbs = 1000
  sparse_diskgroup_enabled = false
  data_storage_size_in_tbs = 192
  local_backup_enabled = false
}
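The db_servers argument above reads its OCIDs from a data source that is not shown in the snippet. A minimal sketch of how that data source could be defined, assuming it is looked up via the Exadata infrastructure resource used above; the local value exacl_db_server_ocids is a hypothetical helper and not part of the original configuration.

```hcl
# Assumed definition of the data source referenced by db_servers.
data "azurerm_oracle_db_servers" "exacl-zone3" {
  resource_group_name               = azurerm_oracle_exadata_infrastructure.exa-platform-zone3.resource_group_name
  cloud_exadata_infrastructure_name = azurerm_oracle_exadata_infrastructure.exa-platform-zone3.name
}

locals {
  # Hypothetical helper: OCIDs of the first two DB servers,
  # the same selection as indexing db_servers[0] and db_servers[1].
  exacl_db_server_ocids = slice(data.azurerm_oracle_db_servers.exacl-zone3.db_servers[*].ocid, 0, 2)
}
```

With such a local value, the cluster resource could use db_servers = local.exacl_db_server_ocids instead of listing each index by hand.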
When applying the above configuration, the creation process starts in Azure and tries to hand over the workload to OCI. But after a minute or so the process fails with the following error.
azurerm_oracle_cloud_vm_cluster.exacl-fz3: Creating...
╷
│ Error: creating Cloud V M Cluster (Subscription: "xxx-yyy-4711"
│ Resource Group Name: "RG-ExaDATA-RUN"
│ Cloud Vm Cluster Name: "exacl-fz3"): performing CreateOrUpdate: unexpected status 400 (400 Bad Request) with error: 400: DbServerList is not empty, DataStorageOption params can't be null.
│
│ with azurerm_oracle_cloud_vm_cluster.exacl-fz3,
│ on exadata.tf line 25, in resource "azurerm_oracle_cloud_vm_cluster" "exacl-fz3":
│ 25: resource "azurerm_oracle_cloud_vm_cluster" "exacl-fz3" {
│
│ creating Cloud V M Cluster (Subscription: "xxx-yyy-4711"
│ Resource Group Name: "rg-exadata-run"
│ Cloud Vm Cluster Name: "exacl-fz3"): performing CreateOrUpdate: unexpected status 400 (400 Bad Request) with error: 400:
│ DbServerList is not empty, DataStorageOption params can't be null.
---
The error message here probably comes from the OCI side, because the job to deploy the cluster is created successfully in Azure but then fails with the above error. The output contains the message “DataStorageOption params can’t be null”, which suggests that some data storage configuration is missing. But as the code snippet above shows, all arguments of the azurerm_oracle_cloud_vm_cluster resource containing the term storage are given and definitely not null.
So I contacted some knowledgeable people at Oracle to help me out here. After an analysis confirmed that I had made no mistake according to the documentation, it was time to create a service request (SR) to find out what was wrong. To make a not so long story even shorter: the fix is quite easy and consists of specifying just one more, obviously not so optional, argument.
The error message is unfortunately a little misleading, and the problem shouldn’t occur in the first place. The problem is that I didn’t specify the memory_size_in_gbs parameter (see below), which configures the amount of RAM a cluster VM will have. Of course this is something that needs to be configured; with click ops it is set to a default value at the lower end. For some reason the azurerm provider neither requires this argument nor defines a default value for it, and the error message could be more specific. So if you run into that problem, this might help you get that Oracle cluster on Exadata flying using code :-) Special thanks go out to Martin + Martin.
resource "azurerm_oracle_cloud_vm_cluster" "exacl-fz3" {
  …
  time_zone = "Europe/Berlin"
  db_node_storage_size_in_gbs = 1000
  memory_size_in_gbs = 600
  sparse_diskgroup_enabled = false
  data_storage_size_in_tbs = 192
  …
}
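Since the provider does not enforce the argument, one way to avoid forgetting it again is to route the value through an input variable without a default, so OpenTofu refuses to plan until it is set. This is just a sketch: the variable name is hypothetical, and the validation only checks for a positive whole number because I don't know which sizes the service actually accepts.

```hcl
# Hypothetical variable; having no default makes the value mandatory.
variable "exacl_memory_size_in_gbs" {
  description = "Value passed to memory_size_in_gbs of the Exadata VM cluster."
  type        = number

  validation {
    # Only a basic sanity check; the valid sizes are not verified here.
    condition     = var.exacl_memory_size_in_gbs > 0 && floor(var.exacl_memory_size_in_gbs) == var.exacl_memory_size_in_gbs
    error_message = "exacl_memory_size_in_gbs must be a positive whole number of GB."
  }
}

# In the cluster resource, the literal would then be replaced by:
#   memory_size_in_gbs = var.exacl_memory_size_in_gbs
```

The value used in this post would then be supplied with, for example, -var 'exacl_memory_size_in_gbs=600' or an entry in a tfvars file.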