
Saving Mothers with ML: How CareSource Uses MLOps to Improve Healthcare in High-Risk Obstetrics


This blog post is in collaboration with Russ Scoville (Vice President of Enterprise Data Services), Arpit Gupta (Director of Predictive Analytics and Data Science), and Alvaro Aleman (Senior Data Scientist) at CareSource.

In the United States, roughly 7 out of every 1,000 mothers suffer from both pregnancy and delivery complications each year [1]. Of those mothers with pregnancy complications, 700 die, but 60% of those deaths are preventable with the right medical attention, according to the CDC. Even among the 3.7 million successful births, 19% have either low birthweight or are delivered preterm. These high-risk pregnancies and deliveries, medically known as obstetrics, impose not only a risk to human life but also a considerable emotional and economic burden on families. A high-risk pregnancy can be roughly seven times as expensive as a normal birth outcome, averaging $57,000 for a high-risk pregnancy vs $8,000 for a typical pregnancy [2]. CareSource, one of the largest Medicaid providers in the United States, aims not only to triage these high-risk pregnancies, but also to partner with medical providers so they can provide lifesaving obstetrics care for their patients before it’s too late. However, there are data bottlenecks that need to be solved.

CareSource wrestled with the challenge of not being able to use the entirety of their historical data for training their machine learning (ML) models. Systematically tracking ML experiments and triggering model refreshes was also a pain point. All these constraints led to delays in sending time-sensitive obstetrics risk predictions to medical partners. In this blog post, we will briefly discuss how CareSource developed an ML model to identify high-risk obstetrics and then focus on how we built a standardized and automated production framework to accelerate ML model deployment.

Environment and People Context

CareSource has a team of data scientists and DevOps engineers. Data scientists are responsible for developing ML pipelines, while DevOps engineers configure the necessary infrastructure to support the ML pipelines in production.

In terms of environment setup, CareSource uses a single Azure Databricks workspace for dev, staging, and production. The team leverages different Git branches, backed by an Azure Repo, to differentiate between environments:

  • dev or feature branches: development
  • main branch: staging
  • release branch: production
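
As a rough illustration of this branch convention, a CI job could resolve its target environment from the branch name. The function name resolve_environment and the short environment labels are assumptions made for this sketch, not part of the CareSource pipeline.

```python
def resolve_environment(branch: str) -> str:
    """Map a Git branch name to the environment it deploys to."""
    if branch == "release":
        return "prod"
    if branch == "main":
        return "staging"
    # Anything else (dev or feature branches) is treated as development.
    return "dev"


print(resolve_environment("main"))  # staging
```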

ML Development

What stands out about high-risk obstetrics (HROB) data is that it contains not only health profiles but also other circumstantial factors, such as economic stability, that may affect the pregnant patient’s well-being. There are over 500 features altogether, and many of these clinical features are useful for related ML models, such as a re-admission risk model. Hence, we used Databricks Feature Store to store all cleaned and engineered features to allow reuse and collaboration across projects and teams.

For easier experimentation with different feature combinations, we expressed all feature selection and imputation methods in the form of YAML files, without changing the actual model training code. We first mapped features into different groups in feature_mappings.yml. Then, we defined which feature groups to keep or drop in feature_selection_config.yml as shown below. The benefit of this approach is that we did not need to edit model training code directly.

# feature_selection_config.yml

keep:
  - primary_key_group
  - group_a
  - group_b
drop:
  - feature_c
  - feature_d
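
A minimal sketch of how training code might consume such a configuration is shown here. It assumes the file has top-level keep and drop lists and that feature_mappings.yml parses to a dictionary of group name to column names. The column names, the select_features helper, and the inline dictionaries (standing in for the parsed YAML files) are all hypothetical.

```python
# Hypothetical parsed contents of feature_mappings.yml:
# group name -> list of column names in that group.
FEATURE_MAPPINGS = {
    "primary_key_group": ["member_id"],
    "group_a": ["age", "prior_preterm_birth"],
    "group_b": ["feature_c", "feature_d", "feature_e"],
}

# Hypothetical parsed contents of feature_selection_config.yml.
SELECTION_CONFIG = {
    "keep": ["primary_key_group", "group_a", "group_b"],
    "drop": ["feature_c", "feature_d"],
}


def select_features(mappings, config):
    """Expand the kept groups into columns, then remove dropped columns."""
    dropped = set(config.get("drop", []))
    selected = []
    for group in config.get("keep", []):
        for column in mappings[group]:
            if column not in dropped and column not in selected:
                selected.append(column)
    return selected


columns = select_features(FEATURE_MAPPINGS, SELECTION_CONFIG)
print(columns)  # ['member_id', 'age', 'prior_preterm_birth', 'feature_e']
```

Because the selection lives in data rather than code, trying a new feature combination only means editing the YAML file.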

To allow training at scale on a full set of historical data, we used the distributed PySpark framework for data processing. We also used Hyperopt, an open-source tool that provides Bayesian hyperparameter search, leveraging results from past model configuration runs, to tune a PySpark model. With MLflow, all of these hyperparameter trials were automatically captured, including their hyperparameters, metrics, and any arbitrary files (e.g. images or feature importance files). Using MLflow removed the manual struggle of keeping track of various experimentation runs. Consistent with a 2022 perinatal health report released by the Center for American Progress, we found from our preliminary experimentation that pregnancy risk is indeed a multi-faceted problem, influenced not only by health history but also by other socioeconomic determinants.
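
To make the idea of tracked trials concrete, here is a dependency-free sketch. It deliberately does not use Hyperopt or MLflow: plain random search stands in for Hyperopt's Bayesian search, a list of dictionaries stands in for MLflow's run tracking, and validation_loss is a toy function rather than real model training. All names and parameter ranges are assumptions for illustration.

```python
import random


def validation_loss(params):
    """Toy stand-in for training a model and scoring it on held-out data."""
    return (params["max_depth"] - 6) ** 2 + (params["learning_rate"] - 0.1) ** 2


def run_search(n_trials, seed=0):
    """Sample hyperparameters at random and record every trial."""
    rng = random.Random(seed)
    trials = []  # one record per trial, similar to one tracked run each
    for trial_id in range(n_trials):
        params = {
            "max_depth": rng.randint(2, 10),
            "learning_rate": rng.uniform(0.01, 0.3),
        }
        trials.append(
            {"trial": trial_id, "params": params, "loss": validation_loss(params)}
        )
    best = min(trials, key=lambda t: t["loss"])
    return trials, best


trials, best = run_search(n_trials=20)
print(best["params"], round(best["loss"], 4))
```

A Bayesian search differs in that each new sample is informed by the losses already recorded, which is why keeping the full trial history matters.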

ML Productionization

Typically, how models are productionized varies widely across projects and teams, even within the same organization. The same was true for CareSource: varying productionization standards across projects slowed down model deployment. Furthermore, increased variability means more engineering overhead and more onboarding complications. Hence, the chief goal that we wanted to achieve at CareSource was to enable a standardized and automated framework to productionize models.

At the heart of our workflow is a templating tool, Stacks (a Databricks product under private preview), which generates standardized and automated CI/CD workflows for deploying and testing ML models.

Introducing Stacks*

Stacks leverages the deploy-code pattern, through which we promote training code, rather than model artifacts, to staging or production. (You can read more about deploy-code vs deploy-model in the Big Book of MLOps.) It provides a cookiecutter template to set up infrastructure-as-code (IaC) and CI/CD pipelines for ML models in production. Using cookiecutter prompts, we configured the template with Azure Databricks environment values such as the Databricks workspace URL and the Azure storage account name. Stacks, by default, assumes different Databricks workspaces for staging and production. Therefore, we customized how Azure service principals are created, so that we could have two SPs, i.e. staging-sp and prod-sp, in the same workspace. With the CI/CD pipelines in place, we proceeded to adapt our ML code to the cookiecutter template. The diagram below shows the overall architecture of the ML development and automated productionization workflow that we implemented.

*Note: Stacks is a Databricks product under private preview and is continually evolving to make future model deployments even easier. Stay tuned for the upcoming release!

Production Architecture and Workflow

Note: MLflow Model Registry is also used in staging, but not shown in this picture for simplicity.

In the dev environment:

  • Data scientists are free to create any feature branches for model development and exploration
  • They commit code to Git regularly to save any work-in-progress
  • Once data scientists identify a candidate model to move forward to production:
    – They further modularize and parameterize ML code if need be
    – They implement unit and integration tests
    – They define paths to store MLflow experiments and MLflow models, as well as training and inference job frequencies
  • Finally, they submit a pull request (PR) against the staging environment, i.e. the main branch

In the staging environment:

  • The PR triggers a series of unit tests and integration tests under the Continuous Integration (CI) step defined in Azure DevOps
    – Verify that the feature engineering and model training pipelines run successfully and produce results within expectation
  • Register the candidate model in MLflow Model Registry and transition its stage to Staging
  • Once all tests pass, merge the PR into the main branch

In the prod environment:

  • Data scientists cut a version of the main branch to the release branch to push the model to production
  • A Continuous Delivery (CD) step in Azure DevOps is triggered
    – Similar to the staging environment, verify that the feature engineering and model training pipelines run successfully
  • Once all tests pass, register the candidate model in the MLflow Model Registry and transition it to Production, if this is the first model version
    – For future model version upgrades, the challenger model (version 2) has to exceed a performance threshold when compared to the current model in production (version 1) before it transitions to Production
  • Load the model from MLflow Model Registry and generate batch predictions
    – Persist those predictions in Delta tables and conduct any post-processing steps
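
The promotion rule in the prod environment can be sketched as a small gate function. This is only an illustration: it assumes a "higher is better" metric such as AUC and interprets the performance threshold as a minimum improvement margin over the current production model. The function name should_promote and the default margin are hypothetical.

```python
def should_promote(challenger_metric, champion_metric=None, min_improvement=0.01):
    """Decide whether a newly trained model version may move to Production.

    Metrics are assumed to be "higher is better" (for example, AUC).
    """
    if champion_metric is None:
        # First model version: there is nothing in production to compare against.
        return True
    # Later versions must beat the production model by more than the margin.
    return challenger_metric - champion_metric > min_improvement


print(should_promote(0.84))                          # True
print(should_promote(0.87, champion_metric=0.84))    # True
print(should_promote(0.845, champion_metric=0.84))   # False
```

Requiring a margin rather than any improvement at all helps avoid swapping models on differences that are within noise.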

The standardized workflow described above can now be applied to all other ML projects at CareSource. Another crucial element that simplifies model management is automation. We don’t want to trigger tests manually when we have numerous models to manage. The embedded component within Stacks that enables automation is Terraform. We expressed all configurations as code, including the compute resources that run the feature engineering, model training, and inference jobs. An added bonus of Terraform is that we can now build and version these infra changes as code. Setting up IaC via Terraform and CI/CD from scratch is non-trivial, but luckily Stacks provides both bootstrapping automation and reference CI/CD code out of the box. For instance, using the Terraform resource below, we scheduled a prod batch inference job to run at 11am UTC daily, while pulling code from the release branch.

resource "databricks_job" "batch_inference_job" {
  name = "${local.env_prefix}-batch-inference-job"

  # Job cluster that runs as the environment's service principal
  new_cluster {
    num_workers        = 2
    spark_version      = "11.3.x-cpu-ml-scala2.12"
    node_type_id       = "Standard_D3_v2"
    single_user_name   = data.databricks_current_user.service_principal.user_name
    data_security_mode = "SINGLE_USER"
  }

  notebook_task {
    notebook_path = "notebooks/04_batch_inference"
    base_parameters = {
      env = local.env
    }
  }

  # Pull the notebook code from the release branch
  git_source {
    url      = var.git_repo_url
    provider = "azureDevOpsServices"
    branch   = "release"
  }

  schedule {
    quartz_cron_expression = "0 0 11 * * ?" # daily at 11am
    timezone_id            = "UTC"
  }
}
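
For readers unfamiliar with Quartz syntax: the expression has six fields (seconds, minutes, hours, day-of-month, month, day-of-week), so "0 0 11 * * ?" fires at 11:00:00 every day. The sketch below computes the next such run time with the standard library only. It is not a general Quartz parser, the helper name next_daily_run is hypothetical, and it assumes a timezone-aware input.

```python
from datetime import datetime, time, timedelta, timezone


def next_daily_run(now, hour=11):
    """Return the next run strictly after `now` for a job scheduled daily at hour:00 UTC.

    `now` is assumed to be a timezone-aware datetime.
    """
    now = now.astimezone(timezone.utc)
    candidate = datetime.combine(now.date(), time(hour, 0), tzinfo=timezone.utc)
    if candidate <= now:
        # Today's slot has already passed, so the next run is tomorrow.
        candidate += timedelta(days=1)
    return candidate


print(next_daily_run(datetime(2023, 3, 23, 9, 30, tzinfo=timezone.utc)))
# 2023-03-23 11:00:00+00:00
```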

In this project, we also leveraged both project-wide and environment-specific configuration files. This enabled easy toggling between different configurations as the environment changed from dev to staging, for example. In general, parameterized files help keep our ML pipeline clean and free of bugs introduced by parameter changes:

# configurations/configs.yml 

  data_dir_path: &dir table_path 
  feature_selection_config: feature_selection_config.yml
    split_config: [0.8, 0.2]
      hyperparam_1: 10
      hyperparam_2: 0.01
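
One common way to combine project-wide and environment-specific settings is a recursive merge in which the environment-specific values win. The sketch below shows that pattern under stated assumptions: the nesting under a train key, the staging values, and the deep_merge helper are all hypothetical, and plain dictionaries stand in for the parsed YAML files.

```python
def deep_merge(base, override):
    """Return a new dict where `override` values win over `base`, recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical project-wide and environment-specific settings.
project_config = {
    "train": {"data_dir_path": "table_path", "split_config": [0.8, 0.2]},
}
staging_config = {
    "train": {"data_dir_path": "staging_table_path"},
}

config = deep_merge(project_config, staging_config)
print(config["train"]["data_dir_path"])  # staging_table_path
print(config["train"]["split_config"])   # [0.8, 0.2]
```

Keys that the environment file does not mention fall back to the project-wide defaults, so each environment file only lists what actually differs.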

Outcome

To recap, we used Databricks Feature Store, MLflow, and Hyperopt to develop, tune, and track the ML model that predicts obstetrics risk. Then, we leveraged Stacks to help instantiate a production-ready template for deployment and to send prediction results to medical partners on a timely schedule. An end-to-end ML framework, complete with production best practices, can be challenging and time-consuming to implement. However, we established the ML development and productionization architecture detailed above within approximately 6 weeks.

So how did Stacks help us accelerate the productionization process at CareSource?


Stacks provides a standardized yet fully customizable ML project structure, infra-as-code, and CI/CD template. It is agnostic to how model development code is written, so we had complete flexibility over how we wrote our ML code and which packages we used. The data scientists at CareSource can own this process completely and deploy models to production in a self-service fashion by following the guardrails Stacks provides. (As mentioned earlier, Stacks will get even easier to leverage as it undergoes improvements during this private preview phase!)

The CareSource team can now easily extend this template to support other ML use cases. An important learning from this work was that early collaboration between the data science and DevOps (ML) engineering teams is instrumental to ensuring smooth productionization.

Migrating this high-risk obstetrics model to Databricks is only the beginning for CareSource. The accelerated transition between ML development and productionization not only enables data practitioners to fully unleash the power of data and ML, but at CareSource, it also means having a chance to directly impact patients’ health and lives before it’s too late.

CareSource was selected as one of the Best Places to Work 2020 and won the Clinical Innovator Award. If you would like to join CareSource to improve their members’ well-being, check out their career openings.


  1. Blue Cross Blue Shield Association – The Health of America. (2020, June 17). Trends in Pregnancy and Childbirth Complications in the U.S. Retrieved March 23, 2023, from
  2. M. Lopez. (2020, August 6). Managing Costs in High-Risk Obstetrics. AJMC.
