Pedro_FTNT
Staff
April 30, 2026

Technical Tip: Troubleshooting an OCI SDN connector that is down with Invalid Certs and Not-Authenticated on a FortiGate HA member

Description

This article describes how to troubleshoot an OCI SDN connector that appears down on a FortiGate VM deployed in Oracle Cloud Infrastructure (OCI), where OCI instance metadata is available, but the authentication flow used by the 'ocid' daemon fails during X.509 token generation.

Scope

FortiGate VM in OCI.

Solution

A common symptom pattern includes the following:

  • The OCI SDN connector is enabled and configured with metadata IAM.

  • OCI endpoints resolve successfully.

  • Instance metadata is available and shows valid instance identity information.

  • The token request to https://auth.<region>.oraclecloud.com/v1/x509 fails with HTTP 400, code InvalidParameter, and message Invalid Certs.

  • Subsequent requests to the OCI Identity API fail with:


401 NotAuthenticated


  • The connector reports that no active compartment is available.


This workflow is intended to isolate the exact failure domain and distinguish:

  • Metadata reachability.

  • Endpoint resolution.

  • Token generation.

  • Compartment validation.

  • HA role-related behavior.


This workflow applies to environments where:

  • The OCI SDN connector is operational on one HA member.

  • The same connector appears down on another member.

  • The affected member shows authentication failures during OCI token generation.


Troubleshooting steps:

  • Confirm the HA role before interpreting OCI connector behavior.

Check HA status first:


get system ha status
diagnose sys ha status
diagnose sys ha checksum show


Expected result:

  • The primary node should be identified clearly.

  • The secondary node may report passive/standby behavior for connector updates.

  • HA checksum should be in sync if the configuration is consistent across members.


Example observations:

  • A healthy member may operate as primary and update the connector normally.

  • An affected member may report secondary mode and not actively update the connector, but local authentication tests should still be evaluated separately.


Relevant behavior observed in the case:

  • Healthy member:


HA state: primary


  • Affected member:


HA state: secondary
ocid running in secondary mode, won't update


Important note:

'secondary' status alone does not explain 'Invalid Certs' or 'NotAuthenticated'. The HA role explains update behavior, but not a local token generation failure.


  • Validate the OCI SDN connector configuration.


Check the full connector definition.


show full-configuration system sdn-connector


Expected result:

The connector should show the following settings, along with a valid tenant-id and compartment-list and the expected oci-region-type and update-interval:


set type oci
set use-metadata-iam enable
set ha-status enable


Example connector definition:


config system sdn-connector
edit "OCI"
set status enable
set type oci
set use-metadata-iam enable
set ha-status enable
set tenant-id "ocid1.tenancy.oc1...."
config compartment-list
edit "ocid1.compartment.oc1...."
next
end
set oci-region-type commercial
set update-interval 60
next
end


If the healthy and affected nodes show the same connector configuration and HA checksum is in sync, the problem is likely not caused by connector configuration drift.
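As an optional aid, the comparison between members can be sketched offline in Python. This is a minimal sketch, assuming the output of 'show full-configuration system sdn-connector' from each member has been saved as text; the function name config_drift and the sample strings are hypothetical and not part of FortiOS.

```python
import difflib


def config_drift(healthy, affected):
    """Return unified-diff lines between two saved connector configurations."""
    # Strip trailing whitespace so cosmetic differences are not reported.
    a = [line.rstrip() for line in healthy.strip().splitlines()]
    b = [line.rstrip() for line in affected.strip().splitlines()]
    return list(difflib.unified_diff(a, b, "healthy", "affected", lineterm=""))


# Hypothetical captures; in practice these would be read from saved CLI output.
HEALTHY_CFG = """
config system sdn-connector
edit "OCI"
set status enable
set type oci
set use-metadata-iam enable
set ha-status enable
set update-interval 60
next
end
"""
DRIFTED_CFG = HEALTHY_CFG.replace("set update-interval 60", "set update-interval 30")

if __name__ == "__main__":
    # Identical captures produce no diff lines.
    print(config_drift(HEALTHY_CFG, HEALTHY_CFG))
    # A drifted capture lists the changed lines with - and + prefixes.
    for line in config_drift(HEALTHY_CFG, DRIFTED_CFG):
        print(line)
```

An empty result from this comparison, together with an in-sync HA checksum, supports ruling out configuration drift.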


  • Check the connector status.


Run:


diagnose sys sdn status


Healthy output example:


SDN Connector                       Type        Status
-------------------------------------------------------------
OCI-PRD-ASH                         oci         Up


Failing output example:


SDN Connector                       Type        Status
-------------------------------------------------------------
OCI-PRD-ASH                         oci         Down


In the analyzed case, the healthy member reported 'Up' and the affected member reported 'Down'.
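When the status output is collected from several members, it can be parsed for comparison. The following is a rough sketch, assuming the three-column layout shown above and connector names without spaces; parse_sdn_status and connectors_down are hypothetical helper names.

```python
def parse_sdn_status(output):
    """Map connector name -> (type, status) from captured status text."""
    connectors = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows have exactly three fields; the header row splits into
        # four words and the dashed separator into one, so both are skipped.
        if len(fields) == 3:
            name, ctype, status = fields
            connectors[name] = (ctype, status)
    return connectors


def connectors_down(connectors):
    """Return the names of connectors whose status is not 'Up'."""
    return [n for n, (_, status) in connectors.items() if status.lower() != "up"]


FAILING_STATUS = """\
SDN Connector                       Type        Status
-------------------------------------------------------------
OCI-PRD-ASH                         oci         Down
"""

if __name__ == "__main__":
    parsed = parse_sdn_status(FAILING_STATUS)
    print(parsed)
    print(connectors_down(parsed))
```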


  • Review available 'ocid' test functions.


List the supported test options:


diagnose test application ocid -1


Example output:


1. list sdn connectors
2. filter list test
3. list available compartment
4. HA test
5. print nic metadata
6. instance metadata
7. force token refresh
8. list compartments in HA
99. restart


These commands help isolate the stage where the failure occurs. The same command set was available on the healthy member used as the baseline.


  • Check whether the connector has an active compartment.


Run:


diagnose test application ocid 3


Healthy output example:


Available Compartments for OCI:
HUB_Network (ocid1.compartment.oc1....)


Failing output example:


OCI has no active compartment


In the analyzed case, the failing member repeatedly reported 'OCI has no active compartment'. This condition followed the authentication failure and should usually be interpreted as a downstream effect, not the first failure point.


  • Validate endpoint resolution and X.509 token generation.


Run:


diagnose test application ocid 5


This command is useful because it may show both:

  • OCI endpoint resolution.

  • Token/authentication behavior.


Healthy pattern:

A healthy member may show successful endpoint resolution and continue with compartment validation or inventory collection.

Example:


core api endpoint iaas.sa-xxxxx-1.oraclecloud.com is resolved at 140.x.x.x
identity api endpoint iaas.sa-xxxxx-1.oraclecloud.com is resolved at 140.x.x.x


Failing pattern:

A failing member may show the following sequence:


ocid api url: https://auth.sa-xxxxx-1.oraclecloud.com/v1/x509, ret: 400
http response err: 400
{
"code" : "InvalidParameter",
"message" : "Invalid Certs"
}
OCID failed to get metadata token
core api endpoint iaas.sa-xxxxx-1.oraclecloud.com is resolved at 140.x.x.x
identity api endpoint identity.sa-xxxxx-1.oraclecloud.com is resolved at 140.x.x.x
rsa key file open error: /etc/cert/local/root_.key
ocid api url: https://identity.sa-xxxxx-1.oraclecloud.com/20160918/compartments/ocid1.compartment.oc1...., ret: 401
http response err: 401
{
"code" : "NotAuthenticated",
"message" : "The required information to complete authentication was not provided or was incorrect."
}
rsa key file open error: /etc/cert/local/root_.key
ocid api url: https://identity.sa-xxxxx-1.oraclecloud.com/20160918/availabilityDomains?compartmentId=ocid1.compartment.oc1...., ret: 400
http response err: 400
<h1>Bad Message 400</h1><pre>reason: Illegal character CNTL=0x1</pre>
ocid failed to list availability domain


Interpretation:

  • Endpoint resolution is successful.

  • The failure happens first at /v1/x509.

  • 401 NotAuthenticated follows the failed token request.

  • Availability domain lookup fails after authentication has already failed.


This exact sequence was observed on the affected member.
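To make the "first failure point" reading repeatable, the 'ocid api url: ..., ret: N' lines can be extracted in order from a saved capture. This is a minimal sketch, assuming the log line format shown in the failing pattern above; api_calls and first_failure are hypothetical names, and the sample reuses the redacted lines from this article.

```python
import re

# Matches lines such as: ocid api url: https://host/path, ret: 400
API_LINE = re.compile(r"ocid api url: (?P<url>\S+), ret: (?P<code>\d+)")


def api_calls(output):
    """Return (url, status_code) tuples in the order they appear."""
    return [(m["url"], int(m["code"])) for m in API_LINE.finditer(output)]


def first_failure(output):
    """Return the first API call with a 4xx/5xx return code, or None."""
    for url, code in api_calls(output):
        if code >= 400:
            return url, code
    return None


FAILING_CAPTURE = """\
ocid api url: https://auth.sa-xxxxx-1.oraclecloud.com/v1/x509, ret: 400
http response err: 400
OCID failed to get metadata token
ocid api url: https://identity.sa-xxxxx-1.oraclecloud.com/20160918/compartments/ocid1.compartment.oc1...., ret: 401
http response err: 401
ocid api url: https://identity.sa-xxxxx-1.oraclecloud.com/20160918/availabilityDomains?compartmentId=ocid1.compartment.oc1...., ret: 400
http response err: 400
"""

if __name__ == "__main__":
    print(first_failure(FAILING_CAPTURE))
```

If the first failing call is the /v1/x509 request, later 401 and 400 responses can reasonably be read as consequences rather than independent faults.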


  • Confirm that OCI instance metadata is available.


Run:


diagnose test application ocid 6


Expected result:

  • Instance name.

  • Instance OCID.

  • Compartment OCID.

  • OCI region.

  • Availability domain.

  • Realm domain.


Example output:


Instance Name: xxxxxfortinet02
Instance Id: ocid1.instance.oc1.sa-xxxxx-1....
Compartment Id: ocid1.compartment.oc1....
OCI Region: sa-xxxxx-1
Availability Domain: gVeo:SA-xxxxx-1-AD-1
Realm Domain: oraclecloud.com


If this command succeeds while 'ocid 5' fails with 'Invalid Certs', the failure domain is narrowed to the token/authentication stage rather than metadata access. This exact condition was observed on the affected node.
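A quick completeness check of the metadata output can be scripted. The sketch below assumes the 'Label: value' layout shown in the example output; the field list mirrors the expected result above, and the function names are hypothetical.

```python
EXPECTED_FIELDS = (
    "Instance Name",
    "Instance Id",
    "Compartment Id",
    "OCI Region",
    "Availability Domain",
    "Realm Domain",
)


def parse_instance_metadata(output):
    """Parse 'Label: value' lines into a dict, splitting on the first colon."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def missing_fields(fields):
    """Return expected labels that are absent or empty."""
    return [name for name in EXPECTED_FIELDS if not fields.get(name)]


METADATA_CAPTURE = """\
Instance Name: xxxxxfortinet02
Instance Id: ocid1.instance.oc1.sa-xxxxx-1....
Compartment Id: ocid1.compartment.oc1....
OCI Region: sa-xxxxx-1
Availability Domain: gVeo:SA-xxxxx-1-AD-1
Realm Domain: oraclecloud.com
"""

if __name__ == "__main__":
    parsed = parse_instance_metadata(METADATA_CAPTURE)
    print(missing_fields(parsed))
```

Splitting on the first colon only keeps values such as the availability domain, which itself contains a colon, intact.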


  • Validate the token refresh directly.


Run:


diagnose test application ocid 7


Healthy output example:


Instance Principal Token has been refreshed


Failing output example:


metadata url: http://169.x.x.x/opc/v2/identity/cert.pem
metadata url: http://169.x.x.x/opc/v2/identity/key.pem
metadata url: http://169.x.x.x/opc/v2/identity/intermediate.pem
ocid api url: https://auth.sa-xxxxx-1.oraclecloud.com/v1/x509, ret: 400
http response err: 400
{
"code" : "InvalidParameter",
"message" : "Invalid Certs"
}
OCID failed to get metadata token
Failed to refresh token


In the analyzed case, the healthy member refreshed the token successfully, while the affected member failed with the same 'Invalid Certs' pattern during token refresh.


  • Enable 'ocid' debug for low-level tracing.


Use the following sequence:


diagnose debug reset
diagnose debug console timestamp enable
diagnose debug application ocid -1
diagnose debug enable


Then repeat relevant tests:


diagnose test application ocid 4
diagnose test application ocid 5
diagnose test application ocid 7


Disable debug when finished:


diagnose debug disable


Typical failing pattern:

  • Metadata URLs are accessed.

  • /v1/x509 returns Invalid Certs.

  • Token creation fails.

  • Compartment validation fails.

  • NotAuthenticated is returned.


Example debug fragment:


2026-03-19 14:45:25 ocid stats: secondary
2026-03-19 14:45:25 metadata url: http://169.254.169.254/opc/v2/identity/cert.pem
2026-03-19 14:45:25 metadata url: http://169.254.169.254/opc/v2/identity/key.pem
2026-03-19 14:45:25 metadata url: http://169.254.169.254/opc/v2/identity/intermediate.pem
2026-03-19 14:45:25 ocid api url: https://auth.sa-santiago-1.oraclecloud.com/v1/x509, ret: 400
2026-03-19 14:45:25 http response err: 400
{
"code" : "InvalidParameter",
"message" : "Invalid Certs"
}
2026-03-19 14:45:25 OCID failed to get metadata token


This pattern indicates that the failure is in OCI token generation/authentication, not in initial metadata access.
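The ordering argument above (metadata fetched first, token request failing afterwards) can be checked against a saved debug capture. This is a minimal sketch, assuming the timestamp prefix produced by 'diagnose debug console timestamp enable' has the 'YYYY-MM-DD HH:MM:SS' form seen in the fragment; parse_debug and metadata_before_x509 are hypothetical names.

```python
from datetime import datetime


def parse_debug(output):
    """Return (timestamp, message) for each timestamped line, in order."""
    events = []
    for line in output.splitlines():
        try:
            stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # untimestamped lines, such as the JSON error body
        events.append((stamp, line[20:].strip()))
    return events


def metadata_before_x509(events):
    """True if every metadata URL fetch precedes the first /v1/x509 call."""
    messages = [msg for _, msg in events]
    meta = [i for i, m in enumerate(messages) if m.startswith("metadata url:")]
    x509 = [i for i, m in enumerate(messages) if "/v1/x509" in m]
    return bool(meta) and bool(x509) and max(meta) < min(x509)


DEBUG_CAPTURE = """\
2026-03-19 14:45:25 ocid stats: secondary
2026-03-19 14:45:25 metadata url: http://169.254.169.254/opc/v2/identity/cert.pem
2026-03-19 14:45:25 metadata url: http://169.254.169.254/opc/v2/identity/key.pem
2026-03-19 14:45:25 metadata url: http://169.254.169.254/opc/v2/identity/intermediate.pem
2026-03-19 14:45:25 ocid api url: https://auth.sa-santiago-1.oraclecloud.com/v1/x509, ret: 400
2026-03-19 14:45:25 http response err: 400
{
"code" : "InvalidParameter",
"message" : "Invalid Certs"
}
2026-03-19 14:45:25 OCID failed to get metadata token
"""

if __name__ == "__main__":
    events = parse_debug(DEBUG_CAPTURE)
    print(len(events), metadata_before_x509(events))
```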


  • Check time synchronization.

Run:


execute time
diagnose sys ntp status


Healthy example:


synchronized: yes


Abnormal example:


synchronized: no
reachable(0xff)
no data


Time synchronization was healthy on the baseline member and unhealthy on the affected member in the analyzed case. This difference should be corrected, even though the strongest failure signature remained 'Invalid Certs' during X.509 token generation.
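For side-by-side comparison of members, the synchronization flag can be pulled from saved 'diagnose sys ntp status' output. The sketch below assumes only the 'synchronized: yes/no' line shown in the examples; the rest of the output format is not assumed, and ntp_synchronized is a hypothetical name.

```python
def ntp_synchronized(output):
    """Return True/False from the 'synchronized:' line, or None if absent."""
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "synchronized":
            return value.strip().lower() == "yes"
    return None


HEALTHY_NTP = "synchronized: yes\n"
ABNORMAL_NTP = "synchronized: no\nreachable(0xff)\nno data\n"

if __name__ == "__main__":
    print(ntp_synchronized(HEALTHY_NTP), ntp_synchronized(ABNORMAL_NTP))
```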


  • About RSA key file open error: /etc/cert/local/root_.key.


This message may appear during the failing flow:


rsa key file open error: /etc/cert/local/root_.key


This message alone should not be treated as the root cause.


Reason:

  • It appears during the failing token flow.

  • However, a missing or unreadable key path does not, by itself, distinguish the failing member from the member that authenticates successfully.

  • The more reliable discriminator is the full sequence:

    • healthy member: active connector, active compartment, successful token refresh.

    • affected member: /v1/x509 returns Invalid Certs, followed by NotAuthenticated, followed by no active compartment.


This interpretation is consistent with the comparative analysis performed on the two cluster members.


  • Troubleshooting summary.


If all of the following are true:


  • diagnose test application ocid 6 succeeds.

  • OCI endpoints resolve correctly.

  • diagnose test application ocid 5 fails at /v1/x509 with Invalid Certs, and 401 NotAuthenticated follows.

  • diagnose test application ocid 3 reports no active compartment.

  • diagnose test application ocid 7 fails to refresh the token.


The most likely failure domain is:

OCI token generation / OCI metadata IAM authentication.


Rather than:

  • Connector syntax.

  • Missing metadata.

  • Tenant or compartment mismatch.

  • Basic endpoint reachability.


That was the exact fault pattern observed on the affected member, while the healthy member kept the connector up and refreshed the token successfully.
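The summary conditions can be expressed as a small decision helper. This is only an illustrative sketch of the reasoning in this article, not a Fortinet tool; the Observations fields and the likely_failure_domain function are hypothetical names, and each flag is assumed to be set manually from the corresponding test result.

```python
from dataclasses import dataclass, astuple


@dataclass
class Observations:
    metadata_ok: bool            # ocid 6 succeeds
    endpoints_resolve: bool      # OCI endpoints resolve correctly
    x509_invalid_certs: bool     # ocid 5 fails at /v1/x509 with Invalid Certs
    not_authenticated: bool      # 401 NotAuthenticated follows
    no_active_compartment: bool  # ocid 3 reports no active compartment
    token_refresh_failed: bool   # ocid 7 fails to refresh the token


def likely_failure_domain(obs):
    """Apply the 'all conditions true' rule from the troubleshooting summary."""
    if all(astuple(obs)):
        return "OCI token generation / OCI metadata IAM authentication"
    # Anything else falls outside the pattern described in this article.
    return "undetermined"


if __name__ == "__main__":
    affected = Observations(True, True, True, True, True, True)
    print(likely_failure_domain(affected))
```

Returning "undetermined" for any other combination is deliberate: the article only characterizes the case where all conditions hold.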


  • Recommended next actions.


  • Capture the same 'ocid' debug flow on a healthy HA member for direct comparison.

  • Restart the ocid daemon and repeat token-related tests:


diagnose test application ocid 99
diagnose test application ocid 5
diagnose test application ocid 7


  • Correct NTP if the affected node is not synchronized.

  • Correlate the issue with OCI using the specific instance OCID of the affected node.

  • If HA failover testing is required, perform it only in a controlled maintenance window.