Overview
The Vector Inference Platform is a new service provided by the AI Engineering team at Vector Institute, rolling out in early 2026. It hosts large, state-of-the-art open-source language models that anyone in the Vector community can use freely and easily.
Unlike previous efforts to provide inference services on Vector's compute environment, this new platform is a production-grade, always-available service. Users do not need to spin up their own models via Slurm jobs or worry about time limits; the models remain persistently online.
The source code and advanced technical documentation for this project are available in the GitHub repository: https://github.com/VectorInstitute/inference-platform
As of February 2026, three models are already available. For the current list of available models and their specifications, visit inference.vectorinstitute.ai.
The AI Engineering team will continue to add new models as the service matures and we gain access to better hardware. Feedback and new model requests are welcome via the Slack channel #vector-inference-platform.