Task APIs let you create endpoints that run arbitrary jobs, such as training or fine-tuning a model. The example below defines a task that trains a logistic regression classifier on the iris dataset and uploads the trained model to S3.
```python
# train_iris.py

import os
import boto3
import pickle
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression


class Task:
    def __call__(self, config):
        # get the iris flower dataset
        iris = load_iris()
        data, labels = iris.data, iris.target
        training_data, test_data, training_labels, test_labels = train_test_split(data, labels)
        print("loaded dataset")

        # train the model
        model = LogisticRegression(solver="lbfgs", multi_class="multinomial", max_iter=1000)
        model.fit(training_data, training_labels)
        accuracy = model.score(test_data, test_labels)
        print("model trained; accuracy: {:.2f}".format(accuracy))

        # upload the model
        dest_dir = config["dest_s3_dir"]
        bucket, key = dest_dir.replace("s3://", "").split("/", 1)
        pickle.dump(model, open("model.pkl", "wb"))
        s3 = boto3.client("s3")
        s3.upload_file("model.pkl", bucket, os.path.join(key, "model.pkl"))
        print(f"model uploaded to {dest_dir}/model.pkl")
```
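Since `Task` is just a callable class, one way to sanity-check it before deploying is to instantiate it and call it directly with a config dict. This is only a minimal local smoke-test sketch: the bucket path is a placeholder, and it assumes your local AWS credentials can write to it.

```python
# local smoke test (not part of the deployment); "s3://my-bucket/dir" is a placeholder
from train_iris import Task

task = Task()
task({"dest_s3_dir": "s3://my-bucket/dir"})
```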
```text
# requirements.txt

boto3
scikit-learn==0.23.2
```
```yaml
# cortex.yaml

- name: train-iris
  kind: TaskAPI
  definition:
    path: train_iris.py
```
Deploy the Task API:

```bash
$ cortex deploy
```

Then retrieve its endpoint:

```bash
$ cortex get train-iris

# > endpoint: http://***.elb.us-west-2.amazonaws.com/train-iris
```
You can submit a job by making a POST request to the Task API's endpoint.
Using `curl`:
```bash
$ export TASK_API_ENDPOINT=<TASK_API_ENDPOINT>  # e.g. export TASK_API_ENDPOINT=https://***.elb.us-west-2.amazonaws.com/train-iris
$ export DEST_S3_DIR=<YOUR_S3_DIRECTORY>        # e.g. export DEST_S3_DIR=s3://my-bucket/dir

$ curl $TASK_API_ENDPOINT \
    -X POST -H "Content-Type: application/json" \
    -d "{\"config\": {\"dest_s3_dir\": \"$DEST_S3_DIR\"}}"

# > {"job_id":"69b183ed6bdf3e9b","api_name":"train-iris",...}
```
Or, using Python `requests`:
```python
import cortex
import requests

cx = cortex.client("aws")  # "aws" is the name of the Cortex environment used in this example
task_endpoint = cx.get_api("train-iris")["endpoint"]

dest_s3_dir = ...  # S3 directory where the model will be uploaded, e.g. "s3://my-bucket/dir"
job_spec = {"config": {"dest_s3_dir": dest_s3_dir}}

response = requests.post(task_endpoint, json=job_spec)
print(response.text)
# > {"job_id":"69b183ed6bdf3e9b","api_name":"train-iris",...}
```
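The response body is JSON (as shown in the sample output above), so, as a small follow-up sketch, you can pull out the job ID for use with the CLI:

```python
# extract the job ID from the submission response (keys shown in the sample output above)
job_id = response.json()["job_id"]
print(f"check status with: cortex get train-iris {job_id}")
```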
You can monitor the job's status with `cortex get`, passing the job ID returned by the submission request:

```bash
$ cortex get train-iris 69b183ed6bdf3e9b
```
Once the job is complete, you should be able to find the trained model in the directory you've specified.
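As a rough sketch of how you might verify this (assuming the same `DEST_S3_DIR` environment variable as the `curl` example and local AWS credentials with read access to that bucket), you can download the pickled model and run a quick prediction:

```python
# verify the uploaded model; assumes DEST_S3_DIR is set, e.g. "s3://my-bucket/dir"
import os
import pickle
import boto3
from sklearn.datasets import load_iris

dest_s3_dir = os.environ["DEST_S3_DIR"]
bucket, key = dest_s3_dir.replace("s3://", "").split("/", 1)

# download the model.pkl file that the task uploaded
s3 = boto3.client("s3")
s3.download_file(bucket, os.path.join(key, "model.pkl"), "model.pkl")

# load the model and predict on a few iris samples as a sanity check
model = pickle.load(open("model.pkl", "rb"))
iris = load_iris()
print(model.predict(iris.data[:5]))
```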
Finally, delete the Task API when you no longer need it:

```bash
$ cortex delete train-iris
```