How to configure external OpenAI-compatible API connectors
OpenAI-compatible API connectors are recommended in the following scenario:
- You would like to leverage the Aleph Alpha (AA) PhariaAI stack and direct your requests to the AA inference API.
- You do not have enough GPUs to deploy your own inference backend (i.e., workers) for all desired models.
- You cannot use shared inference to connect to another AA inference deployment in a different environment.
In this case, you can connect your on-premises inference API scheduler to external inference APIs that serve as the inference backend. This lets you keep all features of the API scheduler, such as queueing, authentication, etc.
Provide credentials
Create a secret containing the API keys and configuration, for example from a local file called secret-external-api-connectors.toml:
kubectl create secret generic inference-api-external-api-connectors \
--from-file=external-api-connectors.toml=./secret-external-api-connectors.toml
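To confirm the secret was created with the expected key, you can decode the stored TOML back out of the cluster. This is a minimal sanity check using the secret and key names from the command above:
# Decode the TOML stored in the secret to verify its contents
kubectl get secret inference-api-external-api-connectors \
  -o jsonpath='{.data.external-api-connectors\.toml}' | base64 -d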
Replacing an existing model or checkpoint with an external API connector requires manual steps and a short downtime of the model. See "Replacing an existing model or checkpoint" for details.
secret-external-api-connectors.toml has the following structure:
[openai]
base_url = "https://api.openai.com/v1"
api_key = "[redacted]"
[openai.o3]
internal = "o3-mini-do-not-use"
external = "o3-mini"
[openai.gpt4]
internal = "gpt-4o-do-not-use"
external = "gpt-4o-2024-11-20"
[provider2]
base_url = "https://example.com/second/v7"
api_key = "[redacted]"
[provider2.deepseek]
internal = "deepseek-example"
external = "deepseek-r1"
Each provider requires a base_url, an api_key, and a list of models. The external model name refers to the name of the model in the external API, while the internal name is the name under which the model is available at the AA inference API. The internal model name is also used as the name of the checkpoint.
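To illustrate how the names map, a client request against the AA inference API uses the internal name, and the scheduler forwards it to the external model at the configured provider. The chat-completions path below is an assumption for illustration and may differ in your deployment:
# Hypothetical request: "deepseek-example" is the internal name from the config above
# and is forwarded to "deepseek-r1" at provider2 (endpoint path is an assumption)
curl -H "Authorization: $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek-example", "messages": [{"role": "user", "content": "Hello"}]}' \
  https://inference-api.example.com/chat/completions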
Activation in Helm config
To activate the feature, reference the secret in the inference-api section of your configuration:
inference-api:
  externalApiConnectors:
    enabled: true
    secretName: inference-api-external-api-connectors
After modifying the secret, the inference-api needs to be restarted for the settings to take effect.
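A minimal sketch of rolling this out, assuming a Helm release named pharia-ai and a deployment named inference-api (both names are assumptions; adjust them to your installation):
# Apply the updated values (release name and chart reference are placeholders)
helm upgrade pharia-ai <chart> -f values.yaml

# Restart the inference-api so it re-reads the secret (deployment name is an assumption)
kubectl rollout restart deployment/inference-api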
Replacing an existing model or checkpoint
1. Test if your user has the required privileges. This manual migration requires a user with administrative permissions. You can DELETE https://inference-api.example.com/model-packages/some-non-existent-checkpoint to see if that's the case. If you get back an HTTP 404 Not Found, then you're good to go. Example curl request:

   curl -H "Authorization: $TOKEN" -X DELETE \
     https://inference-api.example.com/model-packages/some-non-existent-checkpoint

2. Prepare the configuration for the external API connector as outlined above.

3. Shut down all workers serving the model or checkpoint you want to replace with the external API connector. The model will not be available again until the new configuration is deployed in the last step.

4. Delete the checkpoint: DELETE https://inference-api.example.com/model-packages/{checkpoint_name}. Example curl request:

   curl -H "Authorization: $TOKEN" -X DELETE \
     https://inference-api.example.com/model-packages/{checkpoint_name}

5. Deploy your new configuration for the inference-api service. This will restart the pod, and the external API connector should immediately create the checkpoint/model again.
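After the redeploy, you might confirm that the restart went through before testing the replaced checkpoint; the deployment name below is an assumption, and the smoke-test request from the earlier example can then be pointed at the internal model name:
# Wait for the restarted inference-api pod to become ready
# (deployment name is an assumption; adjust to your installation)
kubectl rollout status deployment/inference-api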
Troubleshooting
If a model does not become available, inspect the logs of the inference-api pod for potential configuration issues.
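A minimal sketch of pulling those logs, assuming the pod belongs to a deployment named inference-api (adjust the name to your installation):
# Show recent inference-api logs and filter for errors
# (deployment name is an assumption)
kubectl logs deployment/inference-api --tail=200 | grep -i error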
Please note that not all parameters offered by OpenAI API endpoints are supported.