
Getting Started

How to send queries to the Vector Inference Platform.

The Vector Inference Platform is available to users of Vector's Bon Echo cluster. At the time of this document, this includes the AI Engineering team, the Industry team, and a select group of researchers. (We will be opening up the environment to all Vector researchers later in 2026.)

There are currently 3 models available to use:

Usage Instructions

The Inference Platform is only available on the Bon Echo cluster. Start by logging in to the cluster:

ssh v.vectorinstitute.ai

Create a new Python script called qwen3-omni-30b-test.py:

from openai import OpenAI

# Point the client at the platform's OpenAI-compatible endpoint on Bon Echo.
client = OpenAI(
    base_url="http://aieng01:4000/v1/",
    timeout=300.0,  # generous timeout: large models can take a while to respond
    api_key="sk-M5WOZhbca-Avj-CiI7pAEg",
)

# Send a single-turn chat completion request to the Qwen3 model.
response = client.chat.completions.create(
    model="openai/Qwen3-Omni-30B-A3B-Instruct",
    messages=[{"role": "user", "content": "Who is better, Messi or Ronaldo?"}],
)

# Print the assistant's reply (the full response object also carries token usage stats).
print(response.choices[0].message.content)
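Run the script on the cluster with python qwen3-omni-30b-test.py. Because the endpoint follows the OpenAI wire format, the same request can also be issued with any plain HTTP client. The sketch below only constructs the JSON payload and headers the SDK sends under the hood; it does not contact the server, and the URL and key are the same values used above:

```python
import json

# The chat completions payload that client.chat.completions.create(...) serializes.
payload = {
    "model": "openai/Qwen3-Omni-30B-A3B-Instruct",
    "messages": [{"role": "user", "content": "Who is better, Messi or Ronaldo?"}],
}

# The API key travels as a bearer token, per the OpenAI-compatible format.
headers = {
    "Authorization": "Bearer sk-M5WOZhbca-Avj-CiI7pAEg",
    "Content-Type": "application/json",
}

# This JSON body would be POSTed to http://aieng01:4000/v1/chat/completions.
body = json.dumps(payload)
print(body)
```

Seeing the raw payload is useful for debugging: it is what you would pass to curl or any non-Python client when the OpenAI SDK is not available.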