Build machine learning APIs

Cortex makes deploying, scaling, and managing machine learning systems in production simple. We believe that developers in any organization should be able to add natural language processing, computer vision, and other machine learning capabilities to their applications without having to worry about infrastructure.

Key features


  • Run Cortex locally or as a production cluster on your AWS account.

  • Deploy TensorFlow, PyTorch, Keras, ONNX, XGBoost, scikit-learn, and other models as realtime APIs or batch APIs.

  • Define preprocessing and postprocessing steps in Python.


  • Update APIs with no downtime.

  • Stream logs from your APIs to your CLI.

  • Monitor API performance and track predictions.

  • Run A/B tests.


  • Automatically scale APIs to handle production traffic.

  • Reduce your cloud infrastructure spend with spot instances.

  • Maximize resource utilization by deploying multiple models per API.

How it works

Implement your predictor in Python, configure your deployment in cortex.yaml, and run cortex deploy.

Here's how to deploy GPT-2 as a scalable text generation API:
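As a rough illustration of the predictor step, here is a minimal sketch in Python. It makes several assumptions: the class name `PythonPredictor`, the `__init__(config)` and `predict(payload)` method shapes, and the `num_words` and `vocabulary` config keys are illustrative and may not match the interface of your Cortex version. `StubTextGenerator` is a hypothetical stand-in for a real GPT-2 model so the example runs without downloading weights.

```python
import random


class StubTextGenerator:
    """Hypothetical stand-in for GPT-2 so the sketch runs without model weights."""

    def __init__(self, vocabulary, seed=0):
        self.vocabulary = list(vocabulary)
        self.rng = random.Random(seed)

    def generate(self, prompt, num_words):
        # Append randomly chosen words to the prompt; a real model would
        # produce a learned continuation here instead.
        words = [self.rng.choice(self.vocabulary) for _ in range(num_words)]
        return prompt + " " + " ".join(words)


class PythonPredictor:
    """Assumed predictor shape: load the model once, then serve each request."""

    def __init__(self, config):
        # `config` is assumed to be a plain dict of deployment settings.
        self.num_words = config.get("num_words", 20)
        vocabulary = config.get("vocabulary", ["machine", "learning", "model", "api"])
        self.model = StubTextGenerator(vocabulary)

    def predict(self, payload):
        # `payload` is assumed to be the parsed JSON request body.
        return self.model.generate(payload["text"], self.num_words)


if __name__ == "__main__":
    predictor = PythonPredictor({"num_words": 5})
    print(predictor.predict({"text": "Cortex makes"}))
```

Loading the model in the constructor rather than in `predict` keeps the expensive setup out of the request path. In a real deployment you would replace the stub with an actual GPT-2 model and follow the predictor interface documented for your Cortex version.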


Get started


bash -c "$(curl -sS"

See our installation guide, then deploy one of our examples or bring your own models to build realtime APIs and batch APIs.

Learn more

Check out our docs and join our community.