Getting Started

The Vector Inference Platform is available to all users of Vector's Bon Echo cluster. At the time of writing, this includes the AI Engineering team, the Industry team, and a select group of researchers. (We will be opening the environment up to all Vector researchers later in 2026.)

Three models are currently available on the platform; the example below uses Qwen3-Omni-30B-A3B-Instruct.

Usage Instructions

The Inference Platform is available only on the Bon Echo cluster. Start by logging in to the cluster:

ssh v.vectorinstitute.ai

Create a new Python script called qwen3-omni-30b-test.py:

from openai import OpenAI

# Point the client at the platform's OpenAI-compatible endpoint on the
# aieng01 node, using the API key issued for the Inference Platform.
client = OpenAI(
    base_url="http://aieng01:4000/v1/",
    timeout=300.0,  # generation can be slow; allow up to 5 minutes
    api_key="sk-M5WOZhbca-Avj-CiI7pAEg"
)

# Send a single-turn chat request to the Qwen3-Omni-30B model.
response = client.chat.completions.create(
    model="openai/Qwen3-Omni-30B-A3B-Instruct",
    messages=[{"role": "user", "content": "Who is better, Messi or Ronaldo?"}]
)

# Print the full ChatCompletion object returned by the server.
print(response)
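The script prints the full ChatCompletion object. Usually you only want the assistant's reply, which lives at `response.choices[0].message.content` (this access path follows the standard OpenAI chat-completions schema). A minimal sketch of that structure, using an illustrative hard-coded response rather than a live call:

```python
# An OpenAI-style ChatCompletion, serialized to a dict. Field names follow
# the OpenAI chat-completions schema; the values here are illustrative only.
sample_response = {
    "id": "chatcmpl-example",
    "model": "openai/Qwen3-Omni-30B-A3B-Instruct",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Both are great players."},
            "finish_reason": "stop",
        }
    ],
}

# With the live client object, the equivalent access is:
#   print(response.choices[0].message.content)
reply = sample_response["choices"][0]["message"]["content"]
print(reply)
```

Against the live server, replacing `print(response)` with `print(response.choices[0].message.content)` in the script above prints just the model's answer.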