What should you know about AWS CodePipeline?

8 min read · May 9, 2020

AWS CodePipeline is one of the most widely used continuous delivery services at present. It provides an automated release pipeline for CI/CD processes, enabling fast and reliable software releases with little infrastructure to maintain, thanks to the many fully managed services in AWS. As we all know, the Continuous Delivery (CD) process (referred to as Continuous Deployment in some places) is the key part of CI/CD that reaches end users and clients at the other end. Therefore, choosing a good mechanism to fulfill this requirement adds real value to your software product. In this story, I will explain the key facts that everyone should know about AWS CodePipeline and its overall execution process.

Let’s have a look at the following diagram first. It shows a sample CI/CD process that uses AWS CodePipeline for the CD part. First, try to work out on your own what is going on.

** Note: I am providing a simple figure with only Dev and Prod stages in the pipeline (the source stage is a general stage). However, the number of stages can vary. **

Figure 1: Sample CI/CD process using AWS CodePipeline in CD

As you can see, developers only make the code changes. CodePipeline takes responsibility for everything from the source change to releasing the new version of the product to production. Now, back to the subject!

You may have noticed that I highlighted three main items in the diagram above: stages, actions, and transitions. Apart from these three, AWS CodePipeline has one more major entity, artifacts, which does not appear in the diagram. To work with AWS CodePipeline, everyone should understand these four concepts first. Of course, there are many more advanced concepts behind them!

Let’s see what they are and how CodePipeline defines them, even though we have heard the terms already. Before moving forward, I hope you know what a pipeline is :). For the purposes of this story, a pipeline can be considered a set of activities that describes how software changes go through a release process.

1. Stages

A stage is a logical unit in a pipeline that can be used to isolate an environment. Isolation is important for handling concurrent changes in the environment. Consider the Dev stage in figure 1 above. What happens when one execution is running in that stage and the Source stage meanwhile hands over another execution to run? Isolation is what makes it possible to handle such situations, and AWS CodePipeline has a strong built-in mechanism for handling them properly. I will explain it later in this story.

2. Actions

As its name implies, an action is a set of operations defined to run at a specified point in the pipeline. Actions are created inside stages. AWS CodePipeline defines a valid set of action types that can be used in a pipeline: source, build, test, deploy, approval, and invoke. For each of these action types, there is a set of action providers as well. You can see all valid action types and action providers in the AWS CodePipeline documentation. As an example, in figure 1 I have used source as the action type and GitHub as the action provider.

3. Transitions

A transition is the bridge between two consecutive stages, where a pipeline execution moves from one stage to the next. When we create a pipeline, all transitions are enabled by default. However, we can disable the inbound transition of any stage as we wish. Wait! Why would we need to disable a transition? The reasons depend on the situation, but in general disabling a transition lets us control the flow of pipeline executions. There will be more on this later in this story.

4. Artifacts

An artifact here refers to a collection of data that is used within AWS CodePipeline. There are two types of artifacts: input artifacts and output artifacts. Artifacts are used by actions in the pipeline. Most actions consume or produce at least one artifact, although the numbers allowed depend on the action type (a source action, for example, has no input artifact, and a manual approval action uses none). Basically, artifacts carry information between the actions in the pipeline throughout the pipeline execution.
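To make the four concepts a little more concrete, here is a minimal Python sketch that models the pipeline from figure 1 as plain data. It is only an illustration, not how CodePipeline itself stores a pipeline: the class names, the field names, the artifact names (SourceOutput, BuildOutput), and the CodeBuild and CodeDeploy providers are all assumptions I made up for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Action:
    name: str
    category: str   # action type, e.g. "Source", "Build", "Test", "Deploy", "Approval"
    provider: str   # action provider, e.g. "GitHub"
    input_artifacts: List[str] = field(default_factory=list)
    output_artifacts: List[str] = field(default_factory=list)


@dataclass
class Stage:
    name: str
    actions: List[Action]
    inbound_transition_enabled: bool = True  # transitions are enabled by default


@dataclass
class Pipeline:
    name: str
    stages: List[Stage]


def build_sample_pipeline() -> Pipeline:
    """Model the Source -> Dev -> Prod pipeline of figure 1 (all names are hypothetical)."""
    return Pipeline(
        name="sample-pipeline",
        stages=[
            Stage("Source", [
                Action("FetchSource", "Source", "GitHub", output_artifacts=["SourceOutput"]),
            ]),
            Stage("Dev", [
                Action("Build", "Build", "CodeBuild", ["SourceOutput"], ["BuildOutput"]),
                Action("Test", "Test", "CodeBuild", ["BuildOutput"]),
            ]),
            Stage("Prod", [
                Action("Approve", "Approval", "Manual"),
                Action("Deploy", "Deploy", "CodeDeploy", ["BuildOutput"]),
            ]),
        ],
    )


pipeline = build_sample_pipeline()
for stage in pipeline.stages:
    print(stage.name, [action.name for action in stage.actions])
```

Stages hold actions, every stage has an inbound transition flag, and artifacts are just names that one action writes and a later action reads.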

AWS CodePipeline workflow

Let’s reuse figure 1 with some modifications to show the execution flow, as in figure 2 below. Imagine that developers introduce new features or fixes, or simply push a code change to the source code repository. The flow can then be explained as follows.

Figure 2: AWS CodePipeline execution flow

The source code change triggers AWS CodePipeline. The source stage starts the pipeline execution by taking the source changes and producing the output artifact in the source action.
The pipeline execution transitions to the Dev (next) stage. The first action in the Dev stage, the build action, takes the output artifact of the source action in the source stage as its input artifact.
The pipeline execution moves on to the test action in the same stage, which takes its input artifact from the build action.
The pipeline execution transitions to the Prod (next) stage, where the manual approval action waits for a reviewer’s decision. The approval action itself does not consume an artifact.
If the reviewer approves, the pipeline execution moves on to the deploy action in the Prod stage, which takes its input artifact and deploys it to the production environment. If the reviewer rejects, the manual approval action goes into the Rejected status and the stage goes into the Failed status.
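The steps above can be sketched as a tiny Python simulation. This is a toy model under simplifying assumptions: every action other than the manual approval is assumed to succeed, and the step names and the `run_execution` function are hypothetical names for this example, not part of any AWS API.

```python
from typing import Dict, List, Tuple

# (stage, action) pairs in the order the walkthrough visits them.
STEPS: List[Tuple[str, str]] = [
    ("Source", "source"),
    ("Dev", "build"),
    ("Dev", "test"),
    ("Prod", "manual_approval"),
    ("Prod", "deploy"),
]


def run_execution(approved: bool) -> Dict[str, str]:
    """Return the status of every action (and failed stage) the execution reached."""
    statuses: Dict[str, str] = {}
    for stage, action in STEPS:
        if action == "manual_approval":
            statuses[action] = "Approved" if approved else "Rejected"
            if not approved:
                statuses[f"{stage} stage"] = "Failed"
                break  # the deploy action is never reached
        else:
            statuses[action] = "Succeeded"  # assumption: other actions always succeed
    return statuses


print(run_execution(approved=True))
print(run_execution(approved=False))
```

With `approved=False` the result contains no entry for the deploy action, which mirrors the last step of the flow: a rejection stops the execution in the Prod stage.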

If you read the flow above carefully, you might have some questions about these input and output artifacts. How are they actually exchanged? That is another key concept you should know about AWS CodePipeline. When we create a pipeline, we have to provide an Amazon S3 bucket for it. This bucket serves as the Artifact Store for the pipeline, so every artifact created by every action is stored in this bucket. As the execution proceeds, actions further down the line take specific artifacts from this bucket as their input artifacts. Likewise, when we create a pipeline and configure the actions in its stages, we have to choose the input artifacts of each action from the output artifacts created by earlier actions. Further, an action is not restricted to a single artifact: multiple artifacts can be configured as input artifacts for an action. I will write more about them in future posts.
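The artifact exchange can be pictured with a short Python sketch in which an ordinary dictionary stands in for the S3 bucket. This is an assumption-heavy illustration: `run_action`, the artifact names, and the byte payloads are invented for the example, and the real service stores artifact files in S3 rather than values in memory.

```python
from typing import Callable, Dict, List

ArtifactStore = Dict[str, bytes]  # stands in for the S3 bucket (the Artifact Store)


def run_action(
    store: ArtifactStore,
    inputs: List[str],
    outputs: List[str],
    work: Callable[[List[bytes]], bytes],
) -> None:
    """Read the input artifacts from the store, do the work, write the output artifacts."""
    missing = [name for name in inputs if name not in store]
    if missing:
        raise KeyError(f"input artifacts not in the store: {missing}")
    payload = work([store[name] for name in inputs])
    for name in outputs:
        store[name] = payload


store: ArtifactStore = {}
# The source action has no input artifact and produces SourceOutput.
run_action(store, [], ["SourceOutput"], lambda _: b"source code")
# The build action consumes SourceOutput and produces BuildOutput.
run_action(store, ["SourceOutput"], ["BuildOutput"], lambda ins: ins[0] + b" -> built")
print(sorted(store))
```

An action that asks for an artifact nobody has produced yet fails immediately, which is why input artifacts have to be chosen from the outputs of earlier actions.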

There are several ways to create an AWS CodePipeline for your software release automation process. One way is to use the AWS CodePipeline console. If you are new to this, I suggest you first play around with creating a pipeline via the console, because it creates the supporting resources for you automatically. For example, when you use the console, you do not have to worry about creating the Artifact Store for the pipeline, the Service Role that executes the pipeline, and a few other things. Other than this, you can use the AWS Command Line Interface (AWS CLI), the AWS SDKs, or any combination of these to create or manage a pipeline. Since there is a lot of detail behind each of these, I am not going to explain everything in this single story.

Apart from that information, you may also need the following important facts when you create a CodePipeline.

There MUST BE at least two stages: a Source stage plus one or more other stages.
In other words, a Source stage is a MUST for every CodePipeline, it must be the first stage, and at least one more stage must follow it.
Every input artifact configured for an action must be available at that point in the pipeline. This means the artifact must be generated as an output artifact by an action that runs earlier in the pipeline.
Valid actions and action providers must be used during the pipeline creation.
A manual approval action can be used in any stage you wish except the source stage.
Be careful when you grant permissions to the service roles that are configured for the different services in the pipeline. Always grant permissions only for the resources that are actually required.
AWS CodePipeline can be triggered in multiple ways, such as a GitHub webhook, the pipeline polling for source changes, or an Amazon CloudWatch Events rule.
AWS CodeBuild is a key companion service for AWS CodePipeline. If you already know this service, no further explanation is needed. If you do not, please be patient and I will bring you another story about it.
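A few of the structural rules in the list above can be expressed as a small checker. The following Python sketch works on a simplified, made-up pipeline description (lists of dictionaries with `name`, `category`, `inputs`, and `outputs` keys); it is not the format the service uses, and it simplifies ordering by treating the actions of a stage as running in the order they are listed.

```python
from typing import Dict, List

VALID_CATEGORIES = {"Source", "Build", "Test", "Deploy", "Approval", "Invoke"}


def validate_pipeline(stages: List[Dict]) -> List[str]:
    """Return a list of rule violations; an empty list means all checks passed."""
    errors: List[str] = []
    if len(stages) < 2:
        errors.append("a pipeline needs a source stage plus at least one more stage")

    produced = set()  # output artifacts created so far, in pipeline order
    for index, stage in enumerate(stages):
        for action in stage["actions"]:
            name, category = action["name"], action["category"]
            if category not in VALID_CATEGORIES:
                errors.append(f"{name}: unknown action category {category!r}")
            if index == 0 and category != "Source":
                errors.append(f"{name}: the first stage may contain only source actions")
            if index > 0 and category == "Source":
                errors.append(f"{name}: source actions belong in the first stage")
            for artifact in action.get("inputs", []):
                if artifact not in produced:
                    errors.append(f"{name}: input artifact {artifact!r} is not produced earlier")
            produced.update(action.get("outputs", []))
    return errors


good = [
    {"name": "Source", "actions": [
        {"name": "FetchSource", "category": "Source", "outputs": ["SourceOutput"]}]},
    {"name": "Dev", "actions": [
        {"name": "Build", "category": "Build",
         "inputs": ["SourceOutput"], "outputs": ["BuildOutput"]}]},
]
bad = [
    {"name": "Source", "actions": [{"name": "Approve", "category": "Approval"}]},
]

print(validate_pipeline(good))  # → []
for problem in validate_pipeline(bad):
    print(problem)
```

The `bad` pipeline breaks two rules at once: it has only one stage, and it places a manual approval action in the source stage.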

Why are stages used to isolate an environment and how does it work?

As I promised above, here we go!

Basically, concurrent executions can be handled because of this environment isolation. Each stage allows only one pipeline execution inside it at a time. When a pipeline execution enters a stage, the stage is locked, and no other execution can enter while it is locked. When the current execution finishes and leaves the stage, the stage is unlocked and a waiting pipeline execution can enter. Did you get it? What if multiple pipeline executions are waiting to enter the stage? Which one gets in?

Figure 3: Stage locking against pipeline executions

When multiple pipeline executions are waiting outside a stage for it to be unlocked, the latest pipeline execution supersedes all the others. Thus, only the latest execution continues from there onwards, and the others stop there. That is the main idea behind, and the importance of, isolating environments in AWS CodePipeline. Take a look at figure 4 below to refresh your memory!

Figure 4: Latest execution supersedes old waiting executions
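The locking and superseding behavior shown in figures 3 and 4 can be imitated with a few lines of Python. Treat this as a toy model, not as the service’s implementation: the `LockedStage` class and the execution IDs are hypothetical, and "Waiting" is only a label used inside this sketch rather than one of the statuses CodePipeline reports.

```python
from typing import Dict, Optional


class LockedStage:
    """A stage that holds one running execution and at most one waiting execution."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.running: Optional[str] = None  # execution currently holding the lock
        self.waiting: Optional[str] = None  # latest execution waiting at the entrance
        self.status: Dict[str, str] = {}

    def arrive(self, execution_id: str) -> None:
        if self.running is None:
            # The stage is unlocked: the execution enters and locks it.
            self.running = execution_id
            self.status[execution_id] = "InProgress"
            return
        if self.waiting is not None:
            # A newer execution replaces the one that was already waiting.
            self.status[self.waiting] = "Superseded"
        self.waiting = execution_id
        self.status[execution_id] = "Waiting"

    def finish(self) -> None:
        """The running execution leaves the stage; the waiting one (if any) enters."""
        if self.running is not None:
            self.status[self.running] = "Succeeded"
        self.running, self.waiting = self.waiting, None
        if self.running is not None:
            self.status[self.running] = "InProgress"


dev = LockedStage("Dev")
for execution_id in ["exec-1", "exec-2", "exec-3"]:
    dev.arrive(execution_id)
print(dev.status)
dev.finish()
print(dev.status)
```

After the three arrivals, exec-1 is in progress, exec-2 has been superseded by exec-3, and exec-3 is the only one waiting. Once exec-1 finishes, exec-3 enters the stage.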

This behavior does not change when there is a disabled transition. While a transition is disabled, the next stage does not receive pipeline executions. However, the stages before it may keep running new executions, which pile up at the transition until it is enabled again. As explained above, the latest pipeline execution supersedes the others and is the one that continues. Therefore, you may need to manage the enabling and disabling of transitions in some situations.
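A disabled transition can be modelled in the same spirit. In this hedged Python sketch, the `Transition` class and its method names are invented for illustration; it only captures the idea that executions collect at a disabled transition and that the latest one is released when the transition is enabled again.

```python
from typing import List, Optional


class Transition:
    """The gate in front of a stage; executions pass only while it is enabled."""

    def __init__(self) -> None:
        self.enabled = True               # transitions are enabled by default
        self.held: Optional[str] = None   # latest execution waiting at the gate
        self.superseded: List[str] = []   # older executions replaced while waiting

    def disable(self) -> None:
        self.enabled = False

    def offer(self, execution_id: str) -> Optional[str]:
        """Return the execution if it may pass now, or None if it is held back."""
        if self.enabled:
            return execution_id
        if self.held is not None:
            self.superseded.append(self.held)
        self.held = execution_id
        return None

    def enable(self) -> Optional[str]:
        """Re-enable the transition and release the latest held execution, if any."""
        self.enabled = True
        released, self.held = self.held, None
        return released


to_prod = Transition()
to_prod.disable()
for execution_id in ["exec-1", "exec-2", "exec-3"]:
    to_prod.offer(execution_id)
print(to_prod.superseded)
print(to_prod.enable())
```

Only exec-3 reaches the Prod stage here, so disabling a transition is a simple way to let several changes accumulate and release just the newest one.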

What are Pipeline Executions and Action Executions?

Can you figure it out? Yes, of course! Pipeline executions are the sets of changes released by the pipeline. An execution may start because of a change to the source repository, or someone can start the pipeline execution manually. An action execution, on the other hand, is the process of completing an action configured in the pipeline, operating on its input artifacts, its output artifacts, or both. While these executions run, there are valid statuses defined for an action as well as for the pipeline.

Valid AWS CodePipeline statuses: InProgress, Stopping, Stopped, Succeeded, Superseded, and Failed.

Valid Action statuses: InProgress, Succeeded, and Failed.
(plus Rejected and Approved for a Manual Approval action)

Ways to stop a pipeline execution

As we come to the end of the story, let’s finish by stopping pipeline executions. There are two ways to stop an execution via the AWS CodePipeline console: Stop and Wait, and Stop and Abandon.

Stop and Wait
This method allows all in-progress actions to complete their execution. Once those actions have completed, the pipeline execution goes into the Stopped status. Because the in-progress actions finish cleanly, the work can be picked up from that point later without errors.
Stop and Abandon
This method does not wait for in-progress actions to complete; every in-progress action execution is abandoned. There is no guarantee that those actions have completed after the execution is stopped this way, so they have to be run again when the work is picked up later.
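The difference between the two methods can be summarized in a short Python sketch. It rests on assumptions: `stop_execution` is a made-up helper, an in-progress action that is allowed to finish is assumed to succeed, and "Abandoned" is the label this sketch gives to actions cut off by Stop and Abandon.

```python
from typing import Dict, Tuple


def stop_execution(
    action_statuses: Dict[str, str], abandon: bool
) -> Tuple[str, Dict[str, str]]:
    """Return the final execution status and the final status of each action."""
    final: Dict[str, str] = {}
    for name, status in action_statuses.items():
        if status == "InProgress":
            # Stop and Wait lets the action finish (assumed successful here);
            # Stop and Abandon cuts it off without waiting.
            final[name] = "Abandoned" if abandon else "Succeeded"
        else:
            final[name] = status
    return "Stopped", final


current = {"build": "Succeeded", "test": "InProgress"}
print(stop_execution(current, abandon=False))
print(stop_execution(current, abandon=True))
```

In both cases the execution ends up Stopped; what differs is whether the in-progress test action was given the chance to complete.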

Hope you could get something new from this.


Written by Sahan Gunathilaka
