How to configure external OpenAI-compatible API connectors
OpenAI-compatible API connectors are recommended in the following scenario:
- You would like to leverage the Aleph Alpha (AA) PhariaAI stack and direct your requests to the AA inference API.
- You do not have enough GPUs to deploy your own inference backend (i.e., workers) for all desired models.
- You cannot use shared inference to connect to another AA inference deployment in a different environment.
In this case, you can connect your on-premises inference API scheduler to external inference APIs that serve as the inference backend. This lets you keep all features of the API scheduler, such as queueing, authentication, etc.
Provide credentials
Create a secret containing the API keys and configuration, for example from a local file called secret-external-api-connectors.toml:
kubectl create secret generic inference-api-external-api-connectors \
--from-file=external-api-connectors.toml=./secret-external-api-connectors.toml
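To confirm the secret was created with the expected key, you can decode the stored TOML back out of the cluster. This is a minimal sanity check using the secret and key names from the command above:
# Decode the TOML stored in the secret to verify its contents
kubectl get secret inference-api-external-api-connectors \
  -o jsonpath='{.data.external-api-connectors\.toml}' | base64 -d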
Replacing an existing model or checkpoint with an external API connector requires manual steps and a short downtime of the model. See "Replacing an existing model or checkpoint" for details.
secret-external-api-connectors.toml has the following structure:
[openai]
base_url = "https://api.openai.com/v1"
api_key = "[redacted]"
[openai.o3]
internal = "o3-mini-do-not-use"
external = "o3-mini"
[openai.gpt4]
internal = "gpt-4o-do-not-use"
external = "gpt-4o-2024-11-20"
[provider2]
base_url = "https://example.com/second/v7"
api_key = "[redacted]"
[provider2.deepseek]
internal = "deepseek-example"
external = "deepseek-r1"
Each provider requires a base_url, an api_key, and a list of models. The external model name refers to the name of the model in the external API, while the internal name is the name under which the model is available at the AA inference API. The internal model name is also used as the name of the checkpoint.
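To illustrate how the names map, a client request against the AA inference API uses the internal name, and the scheduler forwards it to the external model at the configured provider. The chat-completions path below is an assumption for illustration and may differ in your deployment:
# Hypothetical request: "deepseek-example" is the internal name from the config above
# and is forwarded to "deepseek-r1" at provider2 (endpoint path is an assumption)
curl -H "Authorization: $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek-example", "messages": [{"role": "user", "content": "Hello"}]}' \
  https://inference-api.example.com/chat/completions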
Activation in Helm config
To activate the feature, reference the secret in the inference-api section of your configuration:
inference-api:
  externalApiConnectors:
    enabled: true
    secretName: inference-api-external-api-connectors
After modifying the secret, the inference-api needs to be restarted for the settings to take effect.
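A minimal sketch of rolling this out, assuming a Helm release named pharia-ai and a deployment named inference-api (both names are assumptions; adjust them to your installation):
# Apply the updated values (release name and chart reference are placeholders)
helm upgrade pharia-ai <chart> -f values.yaml

# Restart the inference-api so it re-reads the secret (deployment name is an assumption)
kubectl rollout restart deployment/inference-api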
Replacing an existing model or checkpoint
1. Test if your user has the required privileges. This manual migration requires a user with administrative permissions. You can DELETE https://inference-api.example.com/model-packages/some-non-existent-checkpoint to see if that's the case. If you get back an HTTP 404 Not Found, then you're good to go. Example curl request:

   curl -H "Authorization: $TOKEN" -X DELETE \
     https://inference-api.example.com/model-packages/some-non-existent-checkpoint

2. Prepare the configuration for the external API connector as outlined above.

3. Shut down all workers serving the model or checkpoint you want to replace with the external API connector. The model will not be available again until the new configuration is deployed in the last step.

4. Delete the checkpoint: DELETE https://inference-api.example.com/model-packages/{checkpoint_name}. Example curl request:

   curl -H "Authorization: $TOKEN" -X DELETE \
     https://inference-api.example.com/model-packages/{checkpoint_name}

5. Deploy your new configuration for the inference-api service. This will restart the pod, and the external API connector should immediately create the checkpoint/model again.
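After the redeploy, you might confirm that the restart went through before testing the replaced checkpoint; the deployment name below is an assumption, and the smoke-test request from the earlier example can then be pointed at the internal model name:
# Wait for the restarted inference-api pod to become ready
# (deployment name is an assumption; adjust to your installation)
kubectl rollout status deployment/inference-api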
Troubleshooting
If a model does not become available, inspect the logs of the inference-api pod for potential configuration issues.
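A minimal sketch of pulling those logs, assuming the pod belongs to a deployment named inference-api (adjust the name to your installation):
# Show recent inference-api logs and filter for errors
# (deployment name is an assumption)
kubectl logs deployment/inference-api --tail=200 | grep -i error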
Please note that not all parameters offered by OpenAI API endpoints are supported.