Advanced GitOps Implementation with ArgoCD and Flux: Enterprise Production Framework 2026
GitOps represents a paradigm shift in how we approach continuous delivery and infrastructure management, leveraging Git as the single source of truth for declarative infrastructure and application deployment. This comprehensive guide explores advanced GitOps implementation patterns using ArgoCD and Flux, covering enterprise-scale multi-cluster architectures, progressive delivery strategies, and production-ready automation frameworks.
Enterprise GitOps Architecture Framework
GitOps Principles and Implementation Strategy
GitOps fundamentally transforms deployment workflows by treating Git repositories as the declarative source of truth for both application configurations and infrastructure state, enabling automated, auditable, and rollback-capable deployment processes.
Core GitOps Implementation Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Enterprise GitOps Platform │
├─────────────────┬─────────────────┬─────────────────┬───────────┤
│ Source │ GitOps │ Target │ Policy │
│ Management │ Controllers │ Clusters │ Engine │
├─────────────────┼─────────────────┼─────────────────┼───────────┤
│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────┐ │ ┌───────┐ │
│ │ Git Repos │ │ │ ArgoCD │ │ │ Production │ │ │ OPA │ │
│ │ - Apps │ │ │ Flux │ │ │ Staging │ │ │ Policies│ │
│ │ - Config │ │ │ Tekton │ │ │ Development │ │ │ RBAC │ │
│ │ - Infra │ │ │ Jenkins X │ │ │ Edge/IoT │ │ │ Security│ │
│ └─────────────┘ │ └─────────────┘ │ └─────────────┘ │ └───────┘ │
│ │ │ │ │
│ • Versioning │ • Sync engines │ • Multi-cluster │ • Compliance│
│ • Reviews │ • Health checks │ • Multi-cloud │ • Governance│
│ • Approvals │ • Rollbacks │ • Edge compute │ • Audit │
└─────────────────┴─────────────────┴─────────────────┴───────────┘
Advanced ArgoCD Enterprise Configuration
ArgoCD provides sophisticated declarative continuous delivery capabilities with support for complex application dependencies, multi-source configurations, and enterprise security requirements.
# argocd-enterprise-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: argocd-server-config
namespace: argocd
data:
# Multi-source application support
application.instanceLabelKey: argocd.argoproj.io/instance
server.rbac.log.enforce.enable: "true"
# Enterprise authentication configuration
url: https://argocd.company.com
dex.config: |
connectors:
- type: oidc
id: corporate-sso
name: Corporate SSO
config:
issuer: https://auth.company.com
clientID: argocd-client
clientSecret: $corporate-sso:clientSecret
requestedScopes: ["openid", "profile", "email", "groups"]
requestedIDTokenClaims: {"groups": {"essential": true}}
- type: ldap
id: corporate-ldap
name: Corporate LDAP
config:
host: ldap.company.com:636
insecureNoSSL: false
bindDN: "cn=service-account,ou=apps,dc=company,dc=com"
bindPW: $corporate-ldap:bindPW
usernamePrompt: "Corporate Username"
userSearch:
baseDN: "ou=users,dc=company,dc=com"
filter: "(objectClass=person)"
username: sAMAccountName
idAttr: DN
emailAttr: mail
nameAttr: displayName
groupSearch:
baseDN: "ou=groups,dc=company,dc=com"
filter: "(objectClass=group)"
userMatchers:
- userAttr: DN
groupAttr: member
nameAttr: cn
# Advanced RBAC configuration
policy.default: role:readonly
policy.csv: |
p, role:admin, applications, *, */*, allow
p, role:admin, clusters, *, *, allow
p, role:admin, repositories, *, *, allow
p, role:deployer, applications, create, */*, allow
p, role:deployer, applications, update, */*, allow
p, role:deployer, applications, sync, */*, allow
p, role:deployer, applications, action/*, */*, allow
p, role:developer, applications, get, dev/*, allow
p, role:developer, applications, sync, dev/*, allow
p, role:readonly, applications, get, */*, allow
p, role:readonly, repositories, get, *, allow
p, role:readonly, clusters, get, *, allow
g, company:platform-team, role:admin
g, company:sre-team, role:deployer
g, company:dev-team, role:developer
# Resource customizations and health checks
resource.customizations.health.argoproj.io_Application: |
hs = {}
hs.status = "Progressing"
hs.message = ""
if obj.status ~= nil then
if obj.status.health ~= nil then
hs.status = obj.status.health.status
if obj.status.health.message ~= nil then
hs.message = obj.status.health.message
end
end
end
return hs
# Application controller configuration
application.controller.repo.server.timeout.seconds: "300"
application.controller.status.processors: "20"
application.controller.operation.processors: "10"
application.controller.self.heal.timeout.seconds: "300"
application.controller.repo.server.plaintext: "false"
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: argocd-application-controller
namespace: argocd
spec:
replicas: 2
selector:
matchLabels:
app.kubernetes.io/name: argocd-application-controller
template:
metadata:
labels:
app.kubernetes.io/name: argocd-application-controller
spec:
serviceAccountName: argocd-application-controller
containers:
- name: application-controller
image: quay.io/argoproj/argocd:v2.10.0
command:
- argocd-application-controller
args:
- --status-processors=20
- --operation-processors=10
- --app-resync=180
- --repo-server-timeout-seconds=300
- --self-heal-timeout-seconds=300
- --metrics-port=8082
- --redis=argocd-redis:6379
env:
- name: ARGOCD_CONTROLLER_REPLICAS
value: "2"
- name: ARGOCD_RECONCILIATION_TIMEOUT
value: "300s"
- name: ARGOCD_HARD_RECONCILIATION_TIMEOUT
value: "0s"
- name: ARGOCD_APPLICATION_CONTROLLER_REPO_SERVER_TIMEOUT_SECONDS
value: "300"
resources:
requests:
cpu: 500m
memory: 1Gi
limits:
cpu: 2000m
memory: 4Gi
readinessProbe:
httpGet:
path: /healthz
port: 8082
initialDelaySeconds: 30
periodSeconds: 10
livenessProbe:
httpGet:
path: /healthz
port: 8082
initialDelaySeconds: 60
periodSeconds: 30
Advanced Flux v2 Enterprise Implementation
Flux v2 provides a comprehensive GitOps toolkit with advanced capabilities for multi-tenancy, progressive delivery, and infrastructure automation through a collection of specialized controllers.
# flux-system-enterprise.yaml
apiVersion: v1
kind: Namespace
metadata:
name: flux-system
labels:
app.kubernetes.io/instance: flux-system
app.kubernetes.io/part-of: flux
app.kubernetes.io/version: v2.2.0
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
name: flux-system
namespace: flux-system
spec:
interval: 1m0s
ref:
branch: main
secretRef:
name: flux-system
url: ssh://git@github.com/company/platform-gitops
verify:
mode: strict
secretRef:
name: cosign-public-key
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: flux-system
namespace: flux-system
spec:
interval: 10m0s
path: "./clusters/production"
prune: true
sourceRef:
kind: GitRepository
name: flux-system
validation: client
decryption:
provider: sops
secretRef:
name: sops-age
postBuild:
substitute:
cluster_name: "production-eks-us-west-2"
cluster_region: "us-west-2"
environment: "production"
healthChecks:
- apiVersion: apps/v1
kind: Deployment
name: kustomize-controller
namespace: flux-system
- apiVersion: apps/v1
kind: Deployment
name: source-controller
namespace: flux-system
patches:
- patch: |
- op: replace
path: /spec/template/spec/containers/0/resources/limits/memory
value: 2Gi
target:
kind: Deployment
name: kustomize-controller
namespace: flux-system
---
# Multi-source Helm repository configuration
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
name: enterprise-charts
namespace: flux-system
spec:
interval: 5m0s
url: https://charts.company.com
secretRef:
name: enterprise-charts-auth
oci: false
type: default
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: OCIRepository
metadata:
name: enterprise-oci-charts
namespace: flux-system
spec:
interval: 5m0s
url: oci://registry.company.com/helm-charts
secretRef:
name: oci-registry-auth
ref:
tag: "latest"
---
# Advanced notification configuration
apiVersion: notification.toolkit.fluxcd.io/v1beta2
kind: Provider
metadata:
name: slack-alerts
namespace: flux-system
spec:
type: slack
channel: "#platform-alerts"
secretRef:
name: slack-webhook
---
apiVersion: notification.toolkit.fluxcd.io/v1beta2
kind: Alert
metadata:
name: flux-system-alerts
namespace: flux-system
spec:
providerRef:
name: slack-alerts
eventSeverity: info
eventSources:
- kind: GitRepository
name: flux-system
- kind: Kustomization
name: flux-system
- kind: HelmRelease
name: "*"
namespace: "*"
summary: "Flux GitOps Event: {{ .InvolvedObject.kind }}/{{ .InvolvedObject.name }} in {{ .InvolvedObject.namespace }}"
inclusionList:
- ".*succeeded.*"
- ".*failed.*"
- ".*error.*"
exclusionList:
- ".*waiting.*"
- ".*pending.*"
Multi-Cluster GitOps Management
Enterprise Multi-Cluster ArgoCD Configuration
Managing multiple Kubernetes clusters requires sophisticated cluster registration, RBAC configuration, and application distribution strategies that ensure consistent deployment across diverse environments.
# argocd-cluster-management.yaml
apiVersion: v1
kind: Secret
metadata:
name: production-cluster
namespace: argocd
labels:
argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
name: production-eks-us-west-2
server: https://A1B2C3D4E5F6.gr7.us-west-2.eks.amazonaws.com
config: |
{
"bearerToken": "eyJhbGciOiJSUzI1NiIsImtpZCI6...",
"tlsClientConfig": {
"insecure": false,
"caData": "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0t..."
},
"awsAuthConfig": {
"clusterName": "production-eks-us-west-2",
"roleARN": "arn:aws:iam::123456789012:role/ArgoCD-EKS-Access"
},
"execProviderConfig": {
"command": "aws",
"args": ["eks", "get-token", "--cluster-name", "production-eks-us-west-2"],
"env": {
"AWS_REGION": "us-west-2"
},
"apiVersion": "client.authentication.k8s.io/v1beta1",
"installHint": "Install aws-cli and configure AWS credentials"
}
}
---
apiVersion: v1
kind: Secret
metadata:
name: staging-cluster
namespace: argocd
labels:
argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
name: staging-eks-us-east-1
server: https://F6E5D4C3B2A1.sk8.us-east-1.eks.amazonaws.com
config: |
{
"bearerToken": "eyJhbGciOiJSUzI1NiIsImtpZCI6...",
"tlsClientConfig": {
"insecure": false,
"caData": "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0t..."
},
"awsAuthConfig": {
"clusterName": "staging-eks-us-east-1",
"roleARN": "arn:aws:iam::123456789012:role/ArgoCD-EKS-Access"
}
}
---
# ApplicationSet for multi-cluster deployment
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: multi-cluster-applications
namespace: argocd
spec:
generators:
- clusters:
selector:
matchLabels:
environment: production
- clusters:
selector:
matchLabels:
environment: staging
- git:
repoURL: https://github.com/company/platform-applications
revision: HEAD
directories:
- path: applications/*
- path: infrastructure/*
template:
metadata:
name: '{{path.basename}}-{{name}}'
labels:
environment: '{{metadata.labels.environment}}'
cluster: '{{name}}'
spec:
project: default
source:
repoURL: https://github.com/company/platform-applications
targetRevision: HEAD
path: '{{path}}'
helm:
valueFiles:
- values.yaml
- values-{{metadata.labels.environment}}.yaml
parameters:
- name: cluster.name
value: '{{name}}'
- name: cluster.environment
value: '{{metadata.labels.environment}}'
- name: cluster.region
value: '{{metadata.labels.region}}'
destination:
server: '{{server}}'
namespace: '{{path.basename}}'
syncPolicy:
automated:
prune: true
selfHeal: true
allowEmpty: false
syncOptions:
- CreateNamespace=true
- PrunePropagationPolicy=foreground
- PruneLast=true
retry:
limit: 5
backoff:
duration: 5s
factor: 2
maxDuration: 3m
Flux Multi-Cluster Tenant Management
Flux provides sophisticated multi-tenancy capabilities that enable secure, isolated deployment environments while maintaining centralized policy enforcement and operational oversight.
# flux-multi-tenant-setup.yaml
apiVersion: v1
kind: Namespace
metadata:
name: tenant-team-alpha
labels:
toolkit.fluxcd.io/tenant: team-alpha
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: team-alpha
namespace: tenant-team-alpha
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: team-alpha-reconciler
namespace: tenant-team-alpha
subjects:
- kind: ServiceAccount
name: team-alpha
namespace: tenant-team-alpha
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: flux-tenant-reconciler
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
name: team-alpha-apps
namespace: tenant-team-alpha
spec:
interval: 5m0s
ref:
branch: main
url: https://github.com/company/team-alpha-applications
secretRef:
name: team-alpha-git-auth
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: team-alpha-apps
namespace: tenant-team-alpha
spec:
interval: 10m0s
path: "./environments/production"
prune: true
sourceRef:
kind: GitRepository
name: team-alpha-apps
serviceAccountName: team-alpha
validation: client
dependsOn:
- name: shared-infrastructure
namespace: flux-system
postBuild:
substitute:
tenant_name: "team-alpha"
environment: "production"
resource_quota_cpu: "10"
resource_quota_memory: "20Gi"
patches:
- patch: |
- op: add
path: /metadata/labels/tenant
value: team-alpha
target:
labelSelector: "!metadata.labels.tenant"
---
# Network policy for tenant isolation
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: team-alpha-isolation
namespace: tenant-team-alpha
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
ingress:
- from:
- namespaceSelector:
matchLabels:
name: istio-system
- namespaceSelector:
matchLabels:
toolkit.fluxcd.io/tenant: team-alpha
- namespaceSelector:
matchLabels:
name: shared-services
egress:
- to:
- namespaceSelector:
matchLabels:
name: istio-system
- namespaceSelector:
matchLabels:
toolkit.fluxcd.io/tenant: team-alpha
- to: []
ports:
- protocol: TCP
port: 53
- protocol: UDP
port: 53
- to: []
ports:
- protocol: TCP
port: 443
- protocol: TCP
port: 80
Progressive Delivery and Canary Deployments
ArgoCD Rollouts Integration
ArgoCD Rollouts provides advanced deployment capabilities including canary releases, blue-green deployments, and traffic-based progressive delivery with integrated analysis and automated rollback mechanisms.
# argocd-rollouts-configuration.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
name: application-rollout
namespace: production
spec:
replicas: 10
strategy:
canary:
maxSurge: "25%"
maxUnavailable: 0
analysis:
templates:
- templateName: success-rate
startingStep: 2
args:
- name: service-name
value: application-service
- name: namespace
valueFrom:
fieldRef:
fieldPath: metadata.namespace
canaryService: application-canary
stableService: application-stable
trafficRouting:
istio:
virtualService:
name: application-virtualservice
routes:
- primary
destinationRule:
name: application-destinationrule
canarySubsetName: canary
stableSubsetName: stable
nginx:
stableIngress: application-stable
annotationPrefix: nginx.ingress.kubernetes.io
additionalIngressAnnotations:
canary-by-header: X-Canary
canary-by-header-value: "true"
steps:
- setWeight: 5
- pause:
duration: 2m
- setWeight: 10
- pause:
duration: 5m
- analysis:
templates:
- templateName: success-rate
- templateName: latency
args:
- name: service-name
value: application-service
- setWeight: 20
- pause:
duration: 10m
- setWeight: 40
- pause:
duration: 15m
- setWeight: 60
- pause:
duration: 20m
- setWeight: 80
- pause:
duration: 30m
- setWeight: 100
- pause:
duration: 5m
abortCondition: "result.successRate < 0.95 || result.latencyP99 > 1000"
selector:
matchLabels:
app: application
template:
metadata:
labels:
app: application
spec:
containers:
- name: application
image: company/application:v2.1.0
ports:
- name: http
containerPort: 8080
protocol: TCP
resources:
requests:
memory: 256Mi
cpu: 100m
limits:
memory: 512Mi
cpu: 500m
readinessProbe:
httpGet:
path: /health/ready
port: http
initialDelaySeconds: 10
periodSeconds: 5
livenessProbe:
httpGet:
path: /health/live
port: http
initialDelaySeconds: 30
periodSeconds: 10
env:
- name: ENVIRONMENT
value: "production"
- name: VERSION
value: "v2.1.0"
---
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
name: success-rate
namespace: production
spec:
args:
- name: service-name
- name: namespace
value: default
metrics:
- name: success-rate
interval: 60s
count: 5
successCondition: result[0] >= 0.95
failureLimit: 3
provider:
prometheus:
address: http://prometheus.monitoring.svc.cluster.local:9090
query: |
sum(rate(http_requests_total{job="{{args.service-name}}",code!~"5.."}[2m])) /
sum(rate(http_requests_total{job="{{args.service-name}}"}[2m]))
- name: latency
interval: 60s
count: 5
successCondition: result[0] <= 1000
failureLimit: 3
provider:
prometheus:
address: http://prometheus.monitoring.svc.cluster.local:9090
query: |
histogram_quantile(0.99,
sum(rate(http_request_duration_seconds_bucket{job="{{args.service-name}}"}[2m])) by (le)
) * 1000
Flagger Progressive Delivery
Flagger provides automated canary deployments with advanced traffic management, metrics analysis, and rollback capabilities integrated with service mesh technologies.
# flagger-canary-configuration.yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
name: application-canary
namespace: production
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: application
progressDeadlineSeconds: 600
service:
port: 80
targetPort: 8080
name: http
appProtocol: http
headers:
request:
add:
x-canary-region: us-west-2
x-canary-version: v2.1.0
response:
remove:
x-envoy-upstream-service-time: ""
gateways:
- istio-system/public-gateway
hosts:
- app.company.com
trafficPolicy:
tls:
mode: ISTIO_MUTUAL
retries:
attempts: 3
perTryTimeout: 30s
retryOn: 5xx,reset,connect-failure,refused-stream
corsPolicy:
allowOrigins:
- exact: "https://company.com"
- prefix: "https://*.company.com"
allowMethods:
- GET
- POST
- PUT
- DELETE
allowHeaders:
- authorization
- content-type
- x-requested-with
maxAge: 24h
analysis:
interval: 60s
threshold: 5
maxWeight: 50
stepWeight: 10
iterations: 10
match:
- headers:
x-canary:
exact: "true"
- headers:
user-agent:
regex: ".*Chrome.*"
metrics:
- name: request-success-rate
thresholdRange:
min: 99
interval: 60s
query: |
sum(
rate(
istio_requests_total{
reporter="destination",
destination_service_name="application",
destination_service_namespace="production",
response_code!~"5.*"
}[1m]
)
) /
sum(
rate(
istio_requests_total{
reporter="destination",
destination_service_name="application",
destination_service_namespace="production"
}[1m]
)
) * 100
- name: request-duration
thresholdRange:
max: 500
interval: 60s
query: |
histogram_quantile(0.99,
sum(
rate(
istio_request_duration_milliseconds_bucket{
reporter="destination",
destination_service_name="application",
destination_service_namespace="production"
}[1m]
)
) by (le)
)
- name: error-rate
thresholdRange:
max: 1
interval: 60s
query: |
sum(
rate(
istio_requests_total{
reporter="destination",
destination_service_name="application",
destination_service_namespace="production",
response_code=~"5.*"
}[1m]
)
) /
sum(
rate(
istio_requests_total{
reporter="destination",
destination_service_name="application",
destination_service_namespace="production"
}[1m]
)
) * 100
webhooks:
- name: pre-rollout
type: pre-rollout
url: http://webhook-service.monitoring.svc.cluster.local:8080/pre-rollout
timeout: 30s
metadata:
environment: production
application: application
version: v2.1.0
- name: rollout
type: rollout
url: http://webhook-service.monitoring.svc.cluster.local:8080/rollout
timeout: 30s
metadata:
environment: production
application: application
- name: post-rollout
type: post-rollout
url: http://webhook-service.monitoring.svc.cluster.local:8080/post-rollout
timeout: 30s
metadata:
environment: production
application: application
status: "{{ .Status }}"
reason: "{{ .Reason }}"
Enterprise Security and Compliance
Secret Management Integration
Enterprise GitOps implementations require sophisticated secret management that integrates with external secret stores while maintaining security boundaries and audit trails.
# gitops-secret-management.yaml
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
name: vault-backend
namespace: flux-system
spec:
provider:
vault:
server: "https://vault.company.com"
path: "secret"
version: "v2"
auth:
kubernetes:
mountPath: "kubernetes"
role: "flux-operator"
serviceAccountRef:
name: "external-secrets-operator"
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
name: argocd-repo-credentials
namespace: argocd
spec:
refreshInterval: 300s
secretStoreRef:
name: vault-backend
kind: SecretStore
target:
name: repo-credentials
creationPolicy: Owner
template:
type: Opaque
metadata:
labels:
argocd.argoproj.io/secret-type: repo-creds
data:
type: git
url: "https://github.com/company/platform-gitops"
password: "{{ .token }}"
username: "git"
data:
- secretKey: token
remoteRef:
key: gitops/github
property: access_token
---
# SOPS encryption configuration for Flux
apiVersion: v1
kind: Secret
metadata:
name: sops-age
namespace: flux-system
type: Opaque
data:
age.agekey: |
AGE-SECRET-KEY-1234567890ABCDEF...
---
# Kustomization with SOPS decryption
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: encrypted-secrets
namespace: flux-system
spec:
interval: 10m0s
path: "./secrets/production"
prune: true
sourceRef:
kind: GitRepository
name: flux-system
decryption:
provider: sops
secretRef:
name: sops-age
validation: client
postBuild:
substitute:
cluster_name: "production-eks-us-west-2"
Policy-as-Code Implementation
Advanced GitOps implementations integrate policy engines to enforce security, compliance, and operational standards across all deployed resources.
# gitops-policy-enforcement.yaml
apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
name: config
namespace: gatekeeper-system
spec:
match:
- excludedNamespaces: ["kube-system", "gatekeeper-system", "flux-system", "argocd"]
processes: ["*"]
validation:
traces:
- user:
kind:
group: "*"
version: "*"
kind: "*"
kind:
group: "*"
version: "*"
kind: "*"
parameters:
assign:
assignImage:
- image: "company-registry.com/*"
allowlist: ["company-registry.com/approved/*"]
---
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
name: k8srequiredlabels
spec:
crd:
spec:
names:
kind: K8sRequiredLabels
validation:
openAPIV3Schema:
type: object
properties:
labels:
type: array
items:
type: string
targets:
- target: admission.k8s.gatekeeper.sh
rego: |
package k8srequiredlabels
violation[{"msg": msg}] {
required := input.parameters.labels
provided := input.review.object.metadata.labels
missing := required[_]
not provided[missing]
msg := sprintf("Missing required label: %v", [missing])
}
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
name: deployment-labels
spec:
match:
kinds:
- apiGroups: ["apps"]
kinds: ["Deployment"]
parameters:
labels: ["app.kubernetes.io/name", "app.kubernetes.io/version", "app.kubernetes.io/managed-by", "environment", "team"]
---
# Conftest policy for pre-commit validation
# .conftest/policy/deployment.rego
package main
deny[msg] {
input.kind == "Deployment"
not input.metadata.labels["app.kubernetes.io/name"]
msg := "Deployment must have app.kubernetes.io/name label"
}
deny[msg] {
input.kind == "Deployment"
not input.spec.template.spec.securityContext.runAsNonRoot == true
msg := "Containers must run as non-root user"
}
deny[msg] {
input.kind == "Deployment"
container := input.spec.template.spec.containers[_]
not container.securityContext.allowPrivilegeEscalation == false
msg := sprintf("Container %s must not allow privilege escalation", [container.name])
}
deny[msg] {
input.kind == "Deployment"
container := input.spec.template.spec.containers[_]
not container.resources.limits.memory
msg := sprintf("Container %s must have memory limits", [container.name])
}
deny[msg] {
input.kind == "Deployment"
container := input.spec.template.spec.containers[_]
not container.resources.limits.cpu
msg := sprintf("Container %s must have CPU limits", [container.name])
}
Monitoring and Observability
GitOps Operations Monitoring
Comprehensive monitoring of GitOps operations requires tracking sync status, performance metrics, and operational health across multiple clusters and applications.
# gitops-monitoring-configuration.yaml
apiVersion: v1
kind: ServiceMonitor
metadata:
name: argocd-metrics
namespace: argocd
spec:
selector:
matchLabels:
app.kubernetes.io/name: argocd-metrics
endpoints:
- port: metrics
interval: 30s
path: /metrics
---
apiVersion: v1
kind: ServiceMonitor
metadata:
name: flux-metrics
namespace: flux-system
spec:
selector:
matchLabels:
app: source-controller
endpoints:
- port: http-prom
interval: 30s
path: /metrics
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: gitops-alerts
namespace: monitoring
spec:
groups:
- name: argocd.rules
rules:
- alert: ArgoCDApplicationSyncFailed
expr: argocd_app_sync_total{phase="Failed"} > 0
for: 5m
labels:
severity: critical
service: argocd
annotations:
summary: "ArgoCD application sync failed"
description: "Application {{ $labels.name }} in namespace {{ $labels.namespace }} sync failed"
- alert: ArgoCDApplicationHealthDegraded
expr: argocd_app_health_status{health_status!="Healthy"} > 0
for: 10m
labels:
severity: warning
service: argocd
annotations:
summary: "ArgoCD application health degraded"
description: "Application {{ $labels.name }} health status is {{ $labels.health_status }}"
- alert: ArgoCDRepositoryConnectionFailed
expr: argocd_repo_connection_status == 0
for: 5m
labels:
severity: critical
service: argocd
annotations:
summary: "ArgoCD repository connection failed"
description: "Repository {{ $labels.repo }} connection failed"
- name: flux.rules
rules:
- alert: FluxReconciliationFailure
expr: gotk_reconcile_condition{type="Ready",status="False"} == 1
for: 10m
labels:
severity: critical
service: flux
annotations:
summary: "Flux reconciliation failure"
description: "{{ $labels.kind }}/{{ $labels.name }} reconciliation failed in namespace {{ $labels.namespace }}"
- alert: FluxSourceNotReady
expr: gotk_source_condition{type="Ready",status="False"} == 1
for: 15m
labels:
severity: warning
service: flux
annotations:
summary: "Flux source not ready"
description: "Source {{ $labels.kind }}/{{ $labels.name }} is not ready in namespace {{ $labels.namespace }}"
- alert: FluxSuspendedResource
expr: gotk_suspend_status == 1
for: 1h
labels:
severity: info
service: flux
annotations:
summary: "Flux resource suspended"
description: "Resource {{ $labels.kind }}/{{ $labels.name }} is suspended in namespace {{ $labels.namespace }}"
---
# Grafana dashboard for GitOps monitoring
apiVersion: v1
kind: ConfigMap
metadata:
name: gitops-dashboard
namespace: monitoring
labels:
grafana_dashboard: "1"
data:
gitops-overview.json: |
{
"dashboard": {
"id": null,
"title": "GitOps Operations Overview",
"tags": ["gitops", "argocd", "flux"],
"timezone": "browser",
"panels": [
{
"id": 1,
"title": "Application Sync Status",
"type": "stat",
"targets": [
{
"expr": "count by (phase) (argocd_app_sync_total)",
"legendFormat": "{{ phase }}"
}
],
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"steps": [
{"color": "green", "value": null},
{"color": "yellow", "value": 1},
{"color": "red", "value": 5}
]
}
}
}
},
{
"id": 2,
"title": "Flux Reconciliation Rate",
"type": "graph",
"targets": [
{
"expr": "rate(gotk_reconcile_duration_seconds_count[5m])",
"legendFormat": "{{ kind }}/{{ name }}"
}
]
},
{
"id": 3,
"title": "Repository Sync Health",
"type": "table",
"targets": [
{
"expr": "argocd_repo_connection_status",
"legendFormat": "{{ repo }}"
}
]
}
]
}
}
Advanced GitOps Patterns
Application of Applications Pattern
The Application of Applications pattern enables hierarchical application management and dependency resolution in complex enterprise environments.
# app-of-apps-pattern.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: platform-bootstrap
namespace: argocd
finalizers:
- resources-finalizer.argocd.argoproj.io
spec:
project: default
source:
repoURL: https://github.com/company/platform-bootstrap
targetRevision: HEAD
path: applications
destination:
server: https://kubernetes.default.svc
namespace: argocd
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true
---
# Infrastructure application
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: infrastructure
namespace: argocd
spec:
project: default
source:
repoURL: https://github.com/company/platform-infrastructure
targetRevision: HEAD
path: base
helm:
valueFiles:
- ../environments/production/values.yaml
destination:
server: https://kubernetes.default.svc
namespace: infrastructure
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true
- ServerSideApply=true
syncWaves:
- wave: 0
---
# Platform services application
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: platform-services
namespace: argocd
spec:
project: default
source:
repoURL: https://github.com/company/platform-services
targetRevision: HEAD
path: overlays/production
destination:
server: https://kubernetes.default.svc
namespace: platform
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true
syncWaves:
- wave: 1
dependsOn:
- group: argoproj.io
kind: Application
name: infrastructure
namespace: argocd
GitOps Workflows and Automation
Advanced GitOps implementations require sophisticated automation workflows that integrate with existing development processes and provide comprehensive audit trails.
#!/bin/bash
# gitops-automation-workflow.sh
set -euo pipefail
# GitOps automation script for enterprise deployments
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(git rev-parse --show-toplevel)"
ENVIRONMENT="${1:-staging}"
APPLICATION="${2:-all}"
DRY_RUN="${3:-false}"
# Configuration
ARGOCD_SERVER="argocd.company.com"
VAULT_ADDR="https://vault.company.com"
GITHUB_ORG="company"
GITOPS_REPO="platform-gitops"
# Logging configuration
LOG_LEVEL="${LOG_LEVEL:-INFO}"
LOG_FILE="/var/log/gitops-automation.log"
log() {
local level="$1"
shift
local message="$*"
local timestamp=$(date '+%Y-%m-%d %H:%M:%S')
echo "[$timestamp] [$level] $message" | tee -a "$LOG_FILE"
}
# Validate environment and prerequisites
validate_prerequisites() {
log "INFO" "Validating prerequisites for GitOps deployment"
# Check required tools
local required_tools=("kubectl" "argocd" "flux" "git" "jq" "yq" "helm")
for tool in "${required_tools[@]}"; do
if ! command -v "$tool" &> /dev/null; then
log "ERROR" "Required tool $tool is not installed"
exit 1
fi
done
# Validate Kubernetes context
local current_context=$(kubectl config current-context)
if [[ "$current_context" != *"$ENVIRONMENT"* ]]; then
log "ERROR" "Invalid Kubernetes context: $current_context. Expected: *$ENVIRONMENT*"
exit 1
fi
# Validate ArgoCD connectivity
if ! argocd cluster list --server "$ARGOCD_SERVER" &> /dev/null; then
log "ERROR" "Cannot connect to ArgoCD server: $ARGOCD_SERVER"
exit 1
fi
log "INFO" "Prerequisites validation completed successfully"
}
# Generate application manifests
generate_manifests() {
local app_name="$1"
local environment="$2"
log "INFO" "Generating manifests for application: $app_name, environment: $environment"
# Create application directory structure
local app_dir="$REPO_ROOT/applications/$app_name"
local env_dir="$app_dir/environments/$environment"
mkdir -p "$env_dir"
# Generate base kustomization
cat > "$app_dir/kustomization.yaml" << EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- base
commonLabels:
app.kubernetes.io/name: $app_name
app.kubernetes.io/managed-by: gitops
images:
- name: $app_name
newTag: \${IMAGE_TAG}
replicas:
- name: $app_name
count: \${REPLICA_COUNT}
EOF
# Generate environment-specific overlay
cat > "$env_dir/kustomization.yaml" << EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../base
patchesStrategicMerge:
- deployment-patch.yaml
- service-patch.yaml
configMapGenerator:
- name: $app_name-config
files:
- application.properties
- logging.properties
secretGenerator:
- name: $app_name-secrets
envs:
- secrets.env
vars:
- name: ENVIRONMENT
objref:
kind: ConfigMap
name: $app_name-config
apiVersion: v1
fieldref:
fieldpath: data.environment
EOF
log "INFO" "Manifests generated successfully for $app_name"
}
# Validate manifests against policies
validate_manifests() {
local app_path="$1"
log "INFO" "Validating manifests against security and compliance policies"
# Run Conftest policy validation
if command -v conftest &> /dev/null; then
if ! conftest verify --policy "$REPO_ROOT/.conftest/policy" "$app_path"/*.yaml; then
log "ERROR" "Policy validation failed for $app_path"
return 1
fi
fi
# Run OPA Gatekeeper dry-run validation
if command -v gator &> /dev/null; then
if ! gator test "$REPO_ROOT/.gatekeeper/constraints" "$app_path"; then
log "ERROR" "Gatekeeper validation failed for $app_path"
return 1
fi
fi
# Validate Kubernetes manifests
if ! kubectl apply --dry-run=client -f "$app_path"; then
log "ERROR" "Kubernetes manifest validation failed for $app_path"
return 1
fi
log "INFO" "Manifest validation completed successfully"
}
# Deploy applications using GitOps
deploy_application() {
local app_name="$1"
local environment="$2"
local dry_run="$3"
log "INFO" "Deploying application: $app_name to environment: $environment (dry-run: $dry_run)"
# Generate ArgoCD application manifest
local app_manifest="$REPO_ROOT/argocd/applications/$environment/$app_name.yaml"
mkdir -p "$(dirname "$app_manifest")"
cat > "$app_manifest" << EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: $app_name-$environment
namespace: argocd
labels:
environment: $environment
application: $app_name
annotations:
argocd.argoproj.io/sync-wave: "10"
spec:
project: $environment
source:
repoURL: https://github.com/$GITHUB_ORG/$GITOPS_REPO
targetRevision: HEAD
path: applications/$app_name/environments/$environment
kustomize:
images:
- $app_name=\${IMAGE_REGISTRY}/$app_name:\${IMAGE_TAG}
destination:
server: https://kubernetes.default.svc
namespace: $app_name-$environment
syncPolicy:
automated:
prune: true
selfHeal: true
allowEmpty: false
syncOptions:
- CreateNamespace=true
- PrunePropagationPolicy=foreground
- PruneLast=true
retry:
limit: 5
backoff:
duration: 5s
factor: 2
maxDuration: 3m
revisionHistoryLimit: 10
EOF
if [[ "$dry_run" == "true" ]]; then
log "INFO" "Dry-run mode: Application manifest generated at $app_manifest"
cat "$app_manifest"
return 0
fi
# Apply the application to ArgoCD
if kubectl apply -f "$app_manifest"; then
log "INFO" "Application $app_name deployed successfully to $environment"
# Wait for sync completion
argocd app wait "$app_name-$environment" \
--server "$ARGOCD_SERVER" \
--timeout 600 \
--health
log "INFO" "Application $app_name sync completed successfully"
else
log "ERROR" "Failed to deploy application $app_name to $environment"
return 1
fi
}
# Monitor deployment status
monitor_deployment() {
local app_name="$1"
local environment="$2"
log "INFO" "Monitoring deployment status for $app_name in $environment"
# Get application status
local app_status=$(argocd app get "$app_name-$environment" \
--server "$ARGOCD_SERVER" \
--output json | jq -r '.status.sync.status')
local health_status=$(argocd app get "$app_name-$environment" \
--server "$ARGOCD_SERVER" \
--output json | jq -r '.status.health.status')
log "INFO" "Application $app_name sync status: $app_status"
log "INFO" "Application $app_name health status: $health_status"
# Check for sync errors
if [[ "$app_status" != "Synced" ]]; then
log "WARNING" "Application $app_name is not in sync"
argocd app diff "$app_name-$environment" --server "$ARGOCD_SERVER"
fi
# Check for health issues
if [[ "$health_status" != "Healthy" ]]; then
log "WARNING" "Application $app_name is not healthy"
kubectl get pods -n "$app_name-$environment" -o wide
fi
}
# Rollback application
rollback_application() {
local app_name="$1"
local environment="$2"
local revision="$3"
log "INFO" "Rolling back application $app_name in $environment to revision $revision"
# Perform rollback
if argocd app rollback "$app_name-$environment" "$revision" \
--server "$ARGOCD_SERVER"; then
log "INFO" "Rollback initiated for $app_name to revision $revision"
# Wait for rollback completion
argocd app wait "$app_name-$environment" \
--server "$ARGOCD_SERVER" \
--timeout 300 \
--health
log "INFO" "Rollback completed successfully for $app_name"
else
log "ERROR" "Failed to rollback application $app_name"
return 1
fi
}
# Main execution
main() {
log "INFO" "Starting GitOps automation workflow"
log "INFO" "Environment: $ENVIRONMENT, Application: $APPLICATION, Dry-run: $DRY_RUN"
# Validate prerequisites
validate_prerequisites
# Process applications
if [[ "$APPLICATION" == "all" ]]; then
local apps=($(find "$REPO_ROOT/applications" -maxdepth 1 -type d -exec basename {} \; | grep -v applications))
for app in "${apps[@]}"; do
if [[ -d "$REPO_ROOT/applications/$app/environments/$ENVIRONMENT" ]]; then
validate_manifests "$REPO_ROOT/applications/$app/environments/$ENVIRONMENT"
deploy_application "$app" "$ENVIRONMENT" "$DRY_RUN"
if [[ "$DRY_RUN" == "false" ]]; then
monitor_deployment "$app" "$ENVIRONMENT"
fi
fi
done
else
if [[ -d "$REPO_ROOT/applications/$APPLICATION/environments/$ENVIRONMENT" ]]; then
validate_manifests "$REPO_ROOT/applications/$APPLICATION/environments/$ENVIRONMENT"
deploy_application "$APPLICATION" "$ENVIRONMENT" "$DRY_RUN"
if [[ "$DRY_RUN" == "false" ]]; then
monitor_deployment "$APPLICATION" "$ENVIRONMENT"
fi
else
log "ERROR" "Application $APPLICATION not found for environment $ENVIRONMENT"
exit 1
fi
fi
log "INFO" "GitOps automation workflow completed successfully"
}
# Handle script arguments and execution
case "${1:-}" in
deploy)
shift
main "$@"
;;
rollback)
shift
rollback_application "$@"
;;
monitor)
shift
monitor_deployment "$@"
;;
*)
echo "Usage: $0 {deploy|rollback|monitor} [environment] [application] [dry-run]"
echo " deploy - Deploy applications using GitOps"
echo " rollback - Rollback application to previous revision"
echo " monitor - Monitor deployment status"
exit 1
;;
esac
This comprehensive GitOps implementation guide provides enterprise-ready patterns and configurations for advanced continuous delivery using ArgoCD and Flux. The framework supports multi-cluster management, progressive delivery, security integration, and operational monitoring necessary for production environments.
Key benefits of this advanced GitOps approach include:
- Declarative Infrastructure: Complete infrastructure and application state managed through Git
- Multi-Cluster Orchestration: Centralized management of diverse Kubernetes environments
- Progressive Delivery: Automated canary deployments with integrated analysis and rollback
- Security Integration: Policy-as-code enforcement and secret management integration
- Operational Excellence: Comprehensive monitoring, alerting, and automation workflows
- Compliance and Audit: Complete audit trails and governance frameworks
The implementation patterns demonstrated here enable organizations to achieve reliable, secure, and scalable continuous delivery at enterprise scale while maintaining operational excellence and security standards.