Overview

The Vector Inference Platform is a service provided by the AI Engineering team at Vector Institute. It hosts large, state-of-the-art open-source language models that anyone in the Vector community can use freely and easily.

Unlike previous efforts to provide inference services on Vector's compute environment, this platform is a production-grade, always-available service. Users do not need to spin up their own models via Slurm jobs or worry about time limits; models remain persistently online.

The source code and technical documentation for this project are available in the GitHub repository: https://github.com/VectorInstitute/inference-platform

For the current list of available models and their specifications, visit inference.vectorinstitute.ai.


The AI Engineering team will continue to add new models as the service matures. Feedback and model requests are welcome in the Slack channel #vector-inference-platform.