
Getting Started

How to send queries to the Vector Inference Platform.

The Vector Inference Platform is available to users of Vector's Bon Echo cluster. At the time of this document, this includes the AI Engineering team, the Industry team, and a select group of researchers. (We will be opening up the environment to all Vector researchers later in 2026.)

There are currently 3 models available to use:

Usage Instructions

The Inference Platform is only available on the Bon Echo cluster. Start by logging in to the cluster:

ssh v.vectorinstitute.ai

Create a new Python script called qwen3-omni-30b-test.py:

from openai import OpenAI

# Point the client at the platform's OpenAI-compatible endpoint on Bon Echo.
client = OpenAI(
    base_url="http://aieng01:4000/v1/",
    timeout=300.0,  # generous timeout: large models can take a while to respond
    api_key="sk-M5WOZhbca-Avj-CiI7pAEg",
)

# Send a single-turn chat completion request to the Qwen3 model.
response = client.chat.completions.create(
    model="openai/Qwen3-Omni-30B-A3B-Instruct",
    messages=[{"role": "user", "content": "Who is better, Messi or Ronaldo?"}],
)

# Print the assistant's reply (the full response object also carries token usage stats).
print(response.choices[0].message.content)
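Run the script on the cluster with python qwen3-omni-30b-test.py. Because the endpoint follows the OpenAI wire format, the same request can also be issued with any plain HTTP client. The sketch below only constructs the JSON payload and headers the SDK sends under the hood; it does not contact the server, and the URL and key are the same values used above:

```python
import json

# The chat completions payload that client.chat.completions.create(...) serializes.
payload = {
    "model": "openai/Qwen3-Omni-30B-A3B-Instruct",
    "messages": [{"role": "user", "content": "Who is better, Messi or Ronaldo?"}],
}

# The API key travels as a bearer token, per the OpenAI-compatible format.
headers = {
    "Authorization": "Bearer sk-M5WOZhbca-Avj-CiI7pAEg",
    "Content-Type": "application/json",
}

# This JSON body would be POSTed to http://aieng01:4000/v1/chat/completions.
body = json.dumps(payload)
print(body)
```

Seeing the raw payload is useful for debugging: it is what you would pass to curl or any non-Python client when the OpenAI SDK is not available.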