Getting Started

The Vector Inference Platform is available to all Vector Institute community members. For an up-to-date list of available models and their specifications, visit inference.vectorinstitute.ai.

Prerequisites

Install the OpenAI Python client:

pip install openai

Usage

The platform exposes an OpenAI-compatible API at https://proxy.vectorinstitute.ai/v1. Any OpenAI client can use it as a drop-in replacement for the OpenAI endpoint: set base_url to this address, api_key to your platform key, and model to one of the platform's model IDs.

from openai import OpenAI

client = OpenAI(
    base_url="https://proxy.vectorinstitute.ai/v1",
    api_key="<your-api-key>"
)

stream = client.chat.completions.create(
    model="<model-id>",  # see inference.vectorinstitute.ai for available models
    messages=[{"role": "user", "content": "Explain attention mechanisms in transformers."}],
    stream=True,
)

for chunk in stream:
    # Skip chunks that carry no choices.
    if chunk.choices:
        # delta.content may be None, so fall back to an empty string.
        print(chunk.choices[0].delta.content or "", end="", flush=True)
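
If you need the full reply as a single string rather than printed output, the streamed deltas can be accumulated instead. The following is a minimal sketch: collect_stream is a hypothetical helper name, not part of the openai package, and it assumes chunks shaped like those handled in the loop above.

```python
def collect_stream(stream):
    """Join the text deltas of a streamed chat completion into one string.

    Hypothetical helper (not part of the openai package). It assumes each
    chunk exposes `choices[0].delta.content`, as in the loop above.
    """
    parts = []
    for chunk in stream:
        # Some chunks may arrive with an empty `choices` list; skip them.
        if not chunk.choices:
            continue
        # `content` can be None, so fall back to an empty string.
        parts.append(chunk.choices[0].delta.content or "")
    return "".join(parts)
```

Pass it the stream object returned by client.chat.completions.create with stream=True, in place of the printing loop.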

You can also use curl:

curl https://proxy.vectorinstitute.ai/v1/chat/completions \
  -H "Authorization: Bearer <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<model-id>",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Getting an API Key

API keys are managed by the AI Engineering team. To request access, reach out via the Slack channel #vector-inference-platform.

Listing Available Models

You can retrieve the current list of enabled models programmatically via the API:

curl https://proxy.vectorinstitute.ai/v1/models \
  -H "Authorization: Bearer <your-api-key>"
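
The same request can be made from Python. The sketch below uses only the standard library and assumes the endpoint returns the OpenAI-style list format (a "data" array of objects that each have an "id" field), which this guide does not confirm. The function names build_models_request, parse_model_ids, and list_model_ids are illustrative, not part of any library.

```python
import json
import urllib.request

BASE_URL = "https://proxy.vectorinstitute.ai/v1"


def build_models_request(api_key, base_url=BASE_URL):
    """Build (but do not send) a GET request for the /models endpoint."""
    return urllib.request.Request(
        f"{base_url}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def parse_model_ids(payload):
    """Extract model IDs from a JSON response body (str or bytes).

    Assumption: the body follows the OpenAI list format, i.e. a "data"
    array of objects that each carry an "id" field.
    """
    return [model["id"] for model in json.loads(payload)["data"]]


def list_model_ids(api_key):
    """Send the request and return the IDs of the enabled models."""
    request = build_models_request(api_key)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return parse_model_ids(resp.read())
```

Calling list_model_ids("<your-api-key>") should return a list of strings that can be passed as the model argument. If the OpenAI client is already installed, client.models.list() is an alternative way to issue the same request.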

Alternatively, visit inference.vectorinstitute.ai for a visual overview.