Productionize ArgoCD
ArgoCD is like the shiny new toy that every kid wants to play with. It’s special and designed for the Gen Z of applications: applications that are native to the cloud. Building a proof of concept is easy, but rolling ArgoCD out to production can be hard because it involves integrating with the existing tooling and third-party systems used in the organization. We are going to discuss the challenges we faced while implementing ArgoCD as the operator for our Continuous Delivery process.
After you have concluded that ArgoCD is the right choice, you need to start thinking about how to onboard your applications and users. What is the right permission model, one that causes minimal intrusion for users and uses the systems already in place? The users we are talking about are a mix of developers and DevOps engineers.
In this blog, we will discuss the Day 1 tasks, identified below, that are required to make ArgoCD a useful tool across the organization. Day 2 tasks such as monitoring, alerting and support could be discussed in another blog.
Onboarding Applications
Application onboarding requires you to think about the relationship between your teams, applications and environments. We want each team to have freedom to build, test and deploy their apps ensuring a high degree of confidence before they bring the apps to the production pipeline.
The Relationship
The diagram below shows the cardinality between Teams, Applications and Environments. Please note that the Dev Environments that each team owns are not full-blown environments. That means only the services used by the team, plus the upstream and downstream services they depend on, are deployed.
Environments are not limited to only teams. In some cases, developers would like to have their own environment to try out new stuff.
The environments shown in the green box on the right side are full-blown environments, meaning all services are deployed there. The box shows the pipeline to production. When teams feel confident, they push their applications to an integration environment (the first step in the pipeline), where we have built automation to test end-to-end functionality.
One team could own multiple applications and need one or more environments to deploy and test these inter-dependent applications before they are confident to promote their application to the integration environment.
Referring to the diagram below, Team1 owns applications a1, b1, c1 and d1 and manages env1 and env2. `john`, a user in Team1, has his own environment. Env1 and env2 could also run services from other teams if any of a1, b1, c1 or d1 needs them.
A team maps to an ArgoCD Project, which ties the team to ownership of its applications. We can use custom labels to identify and group Applications by environment, application, etc.
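As a rough illustration of the team-to-Project mapping, an AppProject for one team could look like the sketch below. The team name team-flash is the example team used later in this post; the repository URL, the destination cluster and the namespace pattern are assumptions made up for illustration, not values from our setup.

```yaml
# Sketch of one ArgoCD Project per team (values are illustrative).
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: team-flash              # project name == team name
  namespace: argocd             # namespace where ArgoCD is installed (assumed)
spec:
  description: Applications owned by team-flash
  sourceRepos:
    # Hypothetical git repository holding the team's helmfiles.
    - https://git.example.com/org/deployments.git
  destinations:
    # Hypothetical: restrict the team to its own namespaces on one cluster.
    - server: https://kubernetes.default.svc
      namespace: "team-flash-*"
```

Applications that set `spec.project: team-flash` are then limited to the repositories and destinations listed in the Project.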
Defining the Application
We use helmfile to manage Helm releases. When ArgoCD reads a helmfile, it deploys all the releases contained in that helmfile. To achieve separation of concerns, we mapped each application to exactly one ArgoCD Application. The 1-1 mapping gives more granular control. Value overrides let us manage a different configuration for each environment. An example of a helmfile is shown below. "ENV_VALUES_DIR" is evaluated at runtime and points to the values for a specific environment. We use an OCI (Open Container Initiative) based registry to store Helm charts.
helmDefaults:
  timeout: 300
releases:
  - name: mychart-app
    chart: oci://localhost:5000/helm-charts/mychart
    values:
      - {{ env "ENV_VALUES_DIR" }}/mychart-app/mychart.values.yaml
The above helmfile definition translates into an ArgoCD application as shown below. (Sensitive text has been erased.)
To simplify application onboarding, we automated ArgoCD application creation using the CLI tool so that users do not have to go to the UI to create Applications.
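In manifest form, such an Application might look roughly like the sketch below. ArgoCD does not render helmfiles natively, so this assumes a config management plugin registered under the hypothetical name `helmfile`. The repository URL, path, values directory, destination and label keys are likewise assumptions for illustration.

```yaml
# Sketch: one ArgoCD Application per helmfile release (values are illustrative).
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: dev-mychart-app           # <envname>-appname
  namespace: argocd
  labels:                         # custom labels for grouping (assumed keys)
    environment: dev
    application: mychart-app
spec:
  project: team-flash             # the owning team's Project
  source:
    repoURL: https://git.example.com/org/deployments.git  # hypothetical
    targetRevision: main
    path: mychart-app             # hypothetical directory containing helmfile.yaml
    plugin:
      name: helmfile              # hypothetical plugin name
      env:
        - name: ENV_VALUES_DIR
          value: environments/dev # hypothetical values directory
  destination:
    server: https://kubernetes.default.svc
    namespace: mychart-app
```

Recent ArgoCD versions hand `plugin.env` entries to the plugin with an `ARGOCD_ENV_` prefix, so the plugin command may need to map the variable back to `ENV_VALUES_DIR` before invoking helmfile.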
Onboarding Users
Most likely, you have a system in place that is used for deploying applications and your users have become comfortable using it. We want to offer the path of least resistance for users to sign up for this new tool. We have 3 main categories of users here.
- DevOps Team: This group of users focuses on developing the tools and applications that developers use to deploy applications into different types of environments.
- Service Team: These users own the applications and focus on delivering features correctly. They want to test the functionality end-to-end before it is released to production environments.
- SRE Team: We think this user group is often overlooked. Imagine there is a production issue and you are paged at 2 am. If you need to roll back or roll forward, you need permission to deploy the fix to prod in the middle of the night.
We provided read-only access to ArgoCD for all engineering teams.
Most organizations have an Identity and Access Management tool that provides user management and Single Sign On (SSO) functionality. In our case, we use a combination of WorkspaceOne and Active Directory (AD). An Active Directory Group (AD Group) is a nice way to provide a unique identifier for a team. In our design, the AD Group sits at the center of permission management.
Permission Management
ArgoCD manages permissions declaratively using a CSV file. We mapped each AD Group to a role, which in our case has the same name as the team. For example, for the team team-flash we create the AD Group g.argocd-team-flash and bind it to the role role:team-flash.
g, g.argocd-team-flash, role:team-flash
We use the role and the team name to assign permissions to applications in an environment. Below is the format that we use.
p, role:team-name, applications, action, team-name/<envname>-appname, allow
Here is an example using the above format. It grants team-flash permission to sync its applications in the environment prduse1.
p, role:team-flash, applications, sync, team-flash/prduse1*, allow
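In a standard installation these policy lines live in the `argocd-rbac-cm` ConfigMap. The sketch below combines the two example lines with a read-only default for everyone else; the `argocd` namespace and the use of the `groups` claim are assumptions about the installation.

```yaml
# Sketch: RBAC policy for team-flash plus a read-only default.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd               # assumed install namespace
data:
  # Any authenticated user without a more specific role gets read-only access.
  policy.default: role:readonly
  # Token claim(s) matched against the group names in policy.csv (assumed).
  scopes: '[groups]'
  policy.csv: |
    p, role:team-flash, applications, sync, team-flash/prduse1*, allow
    g, g.argocd-team-flash, role:team-flash
```

The object pattern `team-flash/prduse1*` matches every Application in the team-flash Project whose name starts with the environment name, which is why the `<envname>-appname` naming convention matters.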
Single Sign On
VMware Workspace One (WS1) provides SSO and integrates with AD Groups. To onboard a new team or user, you need to create a new ArgoCD Project and a new AD Group. WS1 can register the new AD Group and enable SSO for the group. Adding users to and removing them from AD Groups is automated using another tool called AccessNow. You can manage membership manually, but having a tool to manage the workflow makes it easier.
The user authenticates with WS1 and, on success, is redirected to the ArgoCD URL.
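On the ArgoCD side, SSO is configured in the `argocd-cm` ConfigMap. The sketch below assumes the identity provider is connected over OIDC and emits group membership in the token; the protocol choice, every URL, the client ID, the secret key name and the scope names are assumptions for illustration and depend on how the identity provider is set up.

```yaml
# Sketch: OIDC-based SSO settings for ArgoCD (all values are illustrative).
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd                     # assumed install namespace
data:
  url: https://argocd.example.com       # hypothetical external ArgoCD URL
  oidc.config: |
    name: Workspace ONE
    issuer: https://ws1.example.com     # hypothetical issuer URL
    clientID: argocd                    # hypothetical client ID
    # Resolved from the key oidc.ws1.clientSecret in the argocd-secret Secret.
    clientSecret: $oidc.ws1.clientSecret
    # Scope names vary by identity provider; group membership must reach
    # ArgoCD in the token for the AD Group to role mapping to work.
    requestedScopes: ["openid", "profile", "email", "groups"]
```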
Plugging into your CI-CD system
ArgoCD provides a nice, user-friendly interface for interacting with your applications across environments. However, when your CI-CD is automated, every interaction has to happen in the background. Developers own the service definitions and might push changes at any time. We use the ArgoCD CLI tool to automate application creation and updates. ArgoCD detects any change, and we can choose to deploy applications either automatically or manually. We prefer manual deployments for production clusters and automated deployments for development and test clusters.
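The automatic versus manual choice is controlled per Application through `syncPolicy`. The fragments below are a minimal sketch of the two modes; enabling `prune` and `selfHeal` is an assumption for illustration, not necessarily what we run.

```yaml
# Fragment of a development or test Application: sync automatically on change.
spec:
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from git (assumed)
      selfHeal: true   # revert manual drift in the cluster (assumed)
---
# Fragment of a production Application: no `automated` block, so ArgoCD only
# reports the Application as OutOfSync and waits for an explicit sync.
spec:
  syncPolicy: {}
```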
Managing Multiple Environments
ArgoCD makes it easier to manage multiple environments across multiple clusters. It is especially useful for debugging, when we want to identify the root cause of a deployment failure quickly and visually. Our design focuses on minimizing the blast radius when bad code is merged into main.
We use branches to achieve this isolation. As shown in the diagram below, ArgoCD listens to a single git repository, but the environments are deployed from separate branches.
The Applications for each environment point to a branch dedicated to that environment.
For example, Applications deployed in the DEV environment point to the main branch of the git repository.
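In manifest terms, the only difference between environments is `targetRevision`. The fragments below sketch this for two Applications; the repository URL, the path and the production branch name are hypothetical, and only the DEV to main mapping comes from the description above.

```yaml
# Fragment: the DEV Application tracks the main branch.
metadata:
  name: dev-mychart-app
spec:
  source:
    repoURL: https://git.example.com/org/deployments.git  # hypothetical
    path: mychart-app                                      # hypothetical
    targetRevision: main
---
# Fragment: the PRD Application tracks its own branch in the same repository.
metadata:
  name: prduse1-mychart-app
spec:
  source:
    repoURL: https://git.example.com/org/deployments.git  # same repository
    path: mychart-app
    targetRevision: prd                                    # hypothetical branch name
```

A bad merge into main therefore only reaches the Applications that track main, until the change is promoted to the next branch.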
The CD pipeline manages the promotion of Helm charts through DEV->TEST->PRD, with a gate at each step that provides checks and balances and maintains quality.