Cortex makes deploying, scaling, and managing machine learning systems in production simple. We believe that developers in any organization should be able to add natural language processing, computer vision, and other machine learning capabilities to their applications without having to worry about infrastructure.
Run Cortex locally or as a production cluster on your AWS account.
Deploy TensorFlow, PyTorch, Keras, ONNX, XGBoost, scikit-learn, and other models as realtime APIs or batch APIs.
Define preprocessing and postprocessing steps in Python.
Update APIs with no downtime.
Stream logs from your APIs to your CLI.
Monitor API performance and track predictions.
Run A/B tests.
Automatically scale APIs to handle production traffic.
Reduce your cloud infrastructure spend with spot instances.
Maximize resource utilization by deploying multiple models per API.
Implement your predictor in predictor.py, configure your deployment in cortex.yaml, and deploy with the Cortex CLI.
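As a rough illustration of what a predictor.py might contain, here is a minimal sketch. It assumes the Python predictor interface is a class named PythonPredictor with an __init__(self, config) method and a predict(self, payload) method, and that config is a dictionary supplied from cortex.yaml. The preprocess, generate, and postprocess helpers, the max_words setting, and the toy vocabulary are all hypothetical, standing in for a real model such as GPT-2.

```python
# predictor.py (illustrative sketch; the "model" below is a toy stand-in, not GPT-2)


class PythonPredictor:
    def __init__(self, config):
        # Assumption: `config` is a dict passed in from the deployment configuration.
        # `max_words` is a hypothetical setting used only in this example.
        self.max_words = config.get("max_words", 5)
        # A real predictor would load model weights here, once per replica.
        self.vocabulary = ["machine", "learning", "in", "production"]

    def preprocess(self, payload):
        # Preprocessing step: normalize the incoming text.
        return payload["text"].strip().lower()

    def generate(self, prompt):
        # Placeholder for model inference: cycle through the toy vocabulary.
        words = [self.vocabulary[i % len(self.vocabulary)] for i in range(self.max_words)]
        return prompt + " " + " ".join(words)

    def postprocess(self, text):
        # Postprocessing step: capitalize the first letter and wrap in a response dict.
        return {"text": text[:1].upper() + text[1:]}

    def predict(self, payload):
        # Called once per request with the parsed request body.
        prompt = self.preprocess(payload)
        return self.postprocess(self.generate(prompt))
```

Keeping preprocessing and postprocessing as separate methods makes it possible to exercise the class locally, for example with PythonPredictor({"max_words": 2}).predict({"text": "Deploying"}), before deploying it.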
Here's how to deploy GPT-2 as a scalable text generation API:
bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/0.20/get-cli.sh)"