CDK Shorts #2 – Parallel Deployments

11 Aug 2021

The ability to deploy stacks in parallel is beyond the scope of the CDK and CloudFormation. It is up to the caller to orchestrate the deployment and specify the order of the stacks when this granularity is desired.

In this post we show how the deployment time of a basic three-stack application can be reduced by deploying stacks in parallel where possible. The stacks in question are:

stacks/Infrastructure contains all the resources used by the Service stacks, like VPCs, DBs etc. In this example it only contains a DynamoDB Table.
stacks/ServiceA is one of the Service stacks. It only contains a Lambda that receives the DynamoDB Table name as an environment variable from the Infrastructure stack. It is thus dependent on the Infrastructure stack, which needs to be deployed first.
stacks/ServiceB is exactly the same as stacks/ServiceA.

In theory, we could have looped over an array and created as many stacks as we wanted (ServiceX), but the example keeps it concrete and simple with only two Service stacks.

The project that is referenced in this post can be found here: https://github.com/rehanvdm/aws-cdk-parallel-deploy

The Problem

You have to wait a long time when you have a large CDK project that needs to deploy many stacks. The default behaviour of the CDK is to deploy the stacks sequentially, in order of dependency, when you specify the * wildcard to deploy all stacks. The CDK does a great job of keeping track of which stacks depend on each other, but it does not determine which stacks can be deployed in parallel.

The deploy * command will deploy the stacks one after the other in the following order, for a theoretical total of 3 minutes: (theoretical time indicated next to each stack)

stacks/Infrastructure - 1 minute
stacks/ServiceA - 1 minute
stacks/ServiceB - 1 minute

Solution

Do not specify the * in the deploy command. Explicitly deploy the stacks in the correct order and use the --exclusively true argument on the deploy command.
Synthesize the cloud assembly output.
Pass the cloud assembly output as input to all the deploy commands.

1. Deploy Order

So our deployment order needs to change to the following, which brings the theoretical total down to 2 minutes:

stacks/Infrastructure - 1 minute
Parallel deployment of stacks/ServiceA and stacks/ServiceB - 1 minute
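The new order can also be derived from the dependencies instead of being written out by hand. The snippet below is only a sketch of that idea: the dependencies map and the deploymentWaves function are hypothetical and not part of the project. It groups stacks into waves, where every stack in a wave depends only on stacks from earlier waves and can therefore be deployed in parallel with the others in that wave.

```javascript
// Hypothetical dependency map (an assumption for illustration):
// stack name -> the stacks it depends on.
const dependencies = {
  "stacks/Infrastructure": [],
  "stacks/ServiceA": ["stacks/Infrastructure"],
  "stacks/ServiceB": ["stacks/Infrastructure"],
};

// Group stacks into waves. A stack joins a wave once all of its
// dependencies were placed in an earlier wave.
function deploymentWaves(deps) {
  const deployed = new Set();
  const remaining = new Set(Object.keys(deps));
  const waves = [];

  while (remaining.size > 0) {
    const wave = [...remaining].filter((stack) =>
      deps[stack].every((dep) => deployed.has(dep))
    );
    if (wave.length === 0) {
      throw new Error("Circular or unknown dependency between stacks");
    }
    for (const stack of wave) {
      remaining.delete(stack);
      deployed.add(stack);
    }
    waves.push(wave);
  }
  return waves;
}

const waves = deploymentWaves(dependencies);

// With the theoretical 1 minute per stack, a sequential deployment takes
// one minute per stack and a parallel deployment one minute per wave.
const sequentialMinutes = Object.keys(dependencies).length;
const parallelMinutes = waves.length;

console.log(waves, sequentialMinutes, parallelMinutes);
```

For a project with only three stacks, hardcoding the order in the build script is simpler. Something like this only starts to pay off when the number of stacks grows.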

This is entirely up to your build/deployment script. In this project we use a GULP file as a build script to make the process platform-agnostic. This is a basic implementation of the fourth method, as explained in one of my other blog posts, 4 Methods to configure multiple environments in the AWS CDK.

It is important to specify the --exclusively true option when deploying the ServiceX stacks so that they don’t both try to deploy the Infrastructure stack at the same time.
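As a rough illustration of where the flag goes, the hypothetical helper below (buildDeployArgs is not part of the project) builds the argument list for a single cdk deploy call and only adds --exclusively true when asked to. The stack names are the ones used later in this post.

```javascript
// Hypothetical helper (an assumption for illustration): returns the
// arguments for one `cdk deploy` call.
function buildDeployArgs(stackName, exclusively) {
  const args = ["deploy", stackName];
  if (exclusively) {
    // Only deploy the named stack, not the stacks it depends on.
    args.push("--exclusively", "true");
  }
  return args;
}

// The Infrastructure stack has no dependencies, so the flag is not needed.
const infraArgs = buildDeployArgs("parallel-deploy-infra", false);

// The Service stacks depend on the Infrastructure stack, which has already
// been deployed by the time they run, so they are deployed exclusively.
const serviceArgs = ["parallel-service-a", "parallel-service-b"].map(
  (stackName) => buildDeployArgs(stackName, true)
);

console.log("cdk " + infraArgs.join(" "));
serviceArgs.forEach((args) => console.log("cdk " + args.join(" ")));
```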

2. Synth Cloud Assembly outputs

The CDK synth command writes its Cloud Assembly output to a directory of your choice when you specify the --output <cloud_asm_path> option. AWS mentions the Cloud Assembly but does not highlight its benefit of allowing parallel deployments. It allows you to run one synth command and then specify --app <cloud_asm_path> for every subsequent deployment.

This is required when we run the deploy command in parallel. We are specifying which stacks to deploy (--exclusively true), but without a pre-generated cloud assembly the CDK will rebuild the cloud assembly for the whole project every time. This creates race conditions and wastes a lot of compute resources. It is thus better to do this step only once and then pass the result down as an artifact to the rest of the deployments, which only deploy their exclusive stacks.

3. Use Cloud Assembly outputs

The --app option, which by default takes its value from the app property in the cdk.json file, needs to be overridden with --app <cloud_asm_path>. This instructs the CDK to use the pre-generated cloud assembly output instead of running the app command from the cdk.json file to regenerate the cloud assembly every time.
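Because every deploy command now relies on the synth step having run first, a deploy script may want to check this before it starts. The snippet below is a minimal sketch of such a guard. Both function names are hypothetical, and it assumes that a synthesized cloud assembly directory contains a manifest.json file.

```javascript
const fs = require("fs");
const path = require("path");

// Returns true when `dir` looks like a synthesized cloud assembly, which is
// assumed here to mean that it contains a manifest.json file.
function looksLikeCloudAssembly(dir) {
  return fs.existsSync(path.join(dir, "manifest.json"));
}

// Hypothetical guard for a deploy script: fail fast with a clear message
// when the synth step has not produced the cloud assembly yet.
function resolveAppArgument(dir) {
  if (!looksLikeCloudAssembly(dir)) {
    throw new Error(
      `No cloud assembly found in ${dir}, run "cdk synth --output ${dir}" first`
    );
  }
  return ["--app", dir];
}
```

The returned pair can then be appended to the arguments of each deploy command, for example resolveAppArgument("./cloud_assembly_output").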

Putting it all together:

The high level commands:

> tsc
> cdk synth --output ./cloud_assembly_output
> cdk deploy "parallel-deploy-infra" --app ./cloud_assembly_output
IN PARALLEL
    > cdk deploy "parallel-service-a" --app ./cloud_assembly_output --exclusively true
    > cdk deploy "parallel-service-b" --app ./cloud_assembly_output --exclusively true

This is what it looks like in my GULP deploy script:

gulp.task("deploy", async callback =>
{
  try
  {
    let config = await getConfig();
    printConfig(config);

    /* Compile TypeScript to JS for the CDK */
    await CommandExec("npm", ["run build"], paths.workingDir);

    /* Create Cloud Assembly */
    await CommandExec("cdk",[`synth "${stackNames.infra}" --profile ${config.AWSProfileName} ` +
      ` --output ${paths.cloudAssemblyOutPath}`], paths.workingDir);

    /* Deploy Infra stack */
    await CommandExec("cdk",[`deploy "${stackNames.infra}" --require-approval=never ` +
      ` --profile ${config.AWSProfileName} --progress=events --app ${paths.cloudAssemblyOutPath}`], paths.workingDir);

    /* Deploy Service Stacks in parallel */
    let serviceStacks = [stackNames.serviceA, stackNames.serviceB];
    let arrPromises = [];
    for (let stackName of serviceStacks)
    {
      arrPromises.push(
        CommandExec("cdk",[`deploy "${stackName}" --require-approval=never ` +
        ` --profile ${config.AWSProfileName} --progress=events --app ${paths.cloudAssemblyOutPath} --exclusively true`],
        paths.workingDir, true, process.env, `[${stackName}] `)
      );
    }
    await Promise.all(arrPromises);

    callback();
  }
  catch (e)
  {
    callback(e);
  }
});

This is a section of the GULP deploy script; the complete file can be found in the project repository.
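The CommandExec helper is not shown in the section above. The snippet below is a hypothetical stand-in and not the project's actual implementation. The parameter order is inferred from the calls in the script (command, arguments, working directory, echo flag, environment, output prefix), and the prefixLines function is an assumption about how the prefix might be applied so that interleaved output from parallel deployments can still be attributed to a stack.

```javascript
const { spawn } = require("child_process");

// Prefix every non-empty line of `text` with `prefix`.
function prefixLines(prefix, text) {
  return text
    .split(/\r?\n/)
    .filter((line) => line.length > 0)
    .map((line) => prefix + line)
    .join("\n");
}

// Hypothetical stand-in for CommandExec. Resolves with the combined output
// when the command exits with code 0 and rejects otherwise.
function commandExec(command, args, cwd, echo = true, env = process.env, prefix = "") {
  return new Promise((resolve, reject) => {
    // The script passes all arguments as a single string, so the command is
    // joined into one line and run through the shell.
    const child = spawn([command, ...args].join(" "), { cwd, env, shell: true });
    let output = "";

    // Simplification: a chunk may end in the middle of a line, in which case
    // that line is printed as two prefixed parts.
    const onData = (chunk) => {
      const text = chunk.toString();
      output += text;
      const prefixed = prefixLines(prefix, text);
      if (echo && prefixed) {
        console.log(prefixed);
      }
    };

    child.stdout.on("data", onData);
    child.stderr.on("data", onData);
    child.on("error", reject);
    child.on("close", (code) =>
      code === 0
        ? resolve(output)
        : reject(new Error(`${command} exited with code ${code}`))
    );
  });
}
```

Note that Promise.all rejects as soon as one deployment fails while the other deployments keep running in the background. Promise.allSettled can be used instead if the script should wait for every deployment to finish before reporting failures.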

TL;DR

CDK stacks can be deployed in parallel by generating a cloud assembly output and then specifying the order explicitly.

The project that is referenced in this post can be found here: https://github.com/rehanvdm/aws-cdk-parallel-deploy