LLMs: Replacing OpenAI’s ChatGPT Endpoints with AptAI API’s Private Endpoints

AptAI APIs provide endpoints that can replace OpenAI’s ChatGPT endpoints. This means that OpenAI’s API clients (the Python and Node.js libraries) can be used with AptAI APIs.

Migration is straightforward: simply replace OpenAI’s Base URL and API Key with your AptAI API Running Server URL and API Key.

The Running Server URL is a randomly generated URL created when you create your private server. The API Key can be found on the AptAI Admin page by clicking ‘Show Credentials’ for your running server.

Python
from openai import OpenAI

# Point the OpenAI client at your AptAI private server.
client = OpenAI(
    api_key="<YOUR-PRIVATE-API-ENDPOINT-API-KEY>",
    base_url="<YOUR-RUNNING-SERVER-URL>",
)
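
If you prefer not to hard-code credentials, you can read them from environment variables instead. The sketch below assumes you have exported the values yourself; the variable names APTAI_API_KEY and APTAI_SERVER_URL are placeholders of our choosing, not names defined by AptAI.

Python
import os

from openai import OpenAI

# APTAI_API_KEY and APTAI_SERVER_URL are assumed to be set in your shell
# (hypothetical names; use whatever convention you prefer).
client = OpenAI(
    api_key=os.environ["APTAI_API_KEY"],
    base_url=os.environ["APTAI_SERVER_URL"],
)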

After setting the Base URL and the API Key, the OpenAI client works as before, except for the functionalities that are not yet supported by AptAI APIs. A list of supported functionalities can be found at the bottom of this page. For comprehensive documentation of the OpenAI API library, refer to OpenAI’s API reference.
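
As a quick sanity check that the client is talking to your private server, you can list the available models (a minimal sketch; models is one of the supported functionalities listed at the bottom of this page):

Python
# List the models served by your private endpoint.
models = client.models.list()
for model in models.data:
    print(model.id)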

Fine-Tuning LLMs with AptAI

As mentioned above, the supported functionalities of the AptAI LLM API work with the OpenAI clients. Here is an example of fine-tuning the Mistral model.

Step 1: Upload your training file

First you need to prepare a JSONL file (newline-delimited JSON) in the same format as described in OpenAI’s API reference and upload it to your private server.
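
For chat-style fine-tuning, each line of the file is a standalone JSON object holding one training example. A minimal sketch with made-up example content, following OpenAI’s chat fine-tuning format:

JSONL
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Hello"}, {"role": "assistant", "content": "Hi! How can I help you today?"}]}
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is JSONL?"}, {"role": "assistant", "content": "JSONL is newline-delimited JSON: one JSON object per line."}]}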

Python
# Upload the JSONL training file to your private server.
response = client.files.create(
    file=open("<PATH-TO-JSONL-FILE>", "rb"),
    purpose="fine-tune"
)

file_id = response.id

Step 2: Create a fine-tuning job

Once the file id is retrieved, it can be used to create a fine-tuning job. Specify the hyperparameters as you see fit, and note that additional optional hyperparameters can be provided. Please refer to your private server’s documentation (the <YOUR-RUNNING-SERVER-URL>/docs page) for more details.

Python
# Launch a fine-tuning job on the uploaded training file.
response = client.fine_tuning.jobs.create(
    model="mistralai/Mistral-7B-Instruct-v0.2",
    training_file=file_id,
    hyperparameters={
        "batch_size": 2,
        "n_epochs": 100,
    },
    suffix="<YOUR-FINE-TUNING-MODEL-SUFFIX>",
)
job_id = response.id
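
You can also list the recent fine-tuning jobs on your server, which is handy if you have lost track of a job id (a minimal sketch):

Python
# List the most recent fine-tuning jobs and their statuses.
jobs = client.fine_tuning.jobs.list(limit=10)
for job in jobs.data:
    print(job.id, job.status)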

Step 3: Check the status of the fine-tuning job

You can check the status of the fine-tuning job using the following code. Once the job is complete, response.fine_tuned_model contains the name of the fine-tuned model; while the job is still running, it is None.

Python
response = client.fine_tuning.jobs.retrieve(job_id)
fine_tuned_model_name = response.fine_tuned_model  # None until the job has succeeded
print(fine_tuned_model_name)
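
Fine-tuning can take a while, so in practice you may want to poll the job until it reaches a terminal state. A minimal sketch; the status strings follow OpenAI’s API and are assumed to match on your private server:

Python
import time

# Poll the job until it reaches a terminal state.
while True:
    response = client.fine_tuning.jobs.retrieve(job_id)
    print(f"Job status: {response.status}")
    if response.status in ("succeeded", "failed", "cancelled"):
        break
    time.sleep(30)  # check every 30 seconds

fine_tuned_model_name = response.fine_tuned_model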

Step 4: Use your fine-tuned model

Once the fine-tuning job is complete, you can apply the fine-tuned weights on top of the base model and use the resulting model for chat completion or completion jobs. Here is an example with streaming.

Python
response = client.chat.completions.create(
    model=fine_tuned_model_name,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "<YOUR-QUESTION>"},
    ],
    stream=True,
)

collected_messages = []

for chunk in response:
    chunk_message = chunk.choices[0].delta.content
    if chunk_message is not None:  # the final chunk's delta carries no content
        collected_messages.append(chunk_message)
        print(f"Message chunk received: {chunk_message}")

print("".join(collected_messages))

List of OpenAI API Client Supported Functionalities

– chat (streaming supported)
– completions (streaming supported)
– files
– fine-tuning
– models
– assistants (coming soon)
– embeddings (coming soon)
– moderations (coming soon)