Compliance Automation Frameworks: Continuous Compliance for Platform Engineering
Compliance automation transforms manual audit processes into continuous validation, reducing risk and enabling faster deployment cycles. This guide demonstrates implementing comprehensive compliance frameworks using policy as code, automated scanning, and continuous monitoring for enterprise Kubernetes platforms.
Compliance Automation Frameworks: Continuous Compliance for Platform Engineering
Compliance Framework Architecture
┌──────────────────────────────────────────────────────────┐
│ Policy Definition Layer │
│ ┌────────────┐ ┌────────────┐ ┌──────────┐ ┌──────┐│
│ │ OPA │ │ Kyverno │ │ Custom │ │ CIS ││
│ │ Policies │ │ Policies │ │ Policies │ │Benchmarks││
│ └────────────┘ └────────────┘ └──────────┘ └──────┘│
└──────────────────────────────────────────────────────────┘
│
┌──────────────────────────────────────────────────────────┐
│ Policy Enforcement Layer │
│ ┌────────────┐ ┌────────────┐ ┌──────────────────┐ │
│ │ Admission │ │ Runtime │ │ Build-Time │ │
│ │ Control │ │ Enforcement│ │ Validation │ │
│ └────────────┘ └────────────┘ └──────────────────┘ │
└──────────────────────────────────────────────────────────┘
│
┌──────────────────────────────────────────────────────────┐
│ Compliance Monitoring & Reporting │
│ ┌────────────┐ ┌────────────┐ ┌──────────────────┐ │
│ │Continuous │ │ Audit │ │ Compliance │ │
│ │ Scanning │ │ Logging │ │ Dashboard │ │
│ └────────────┘ └────────────┘ └──────────────────┘ │
└──────────────────────────────────────────────────────────┘
Policy as Code with OPA
OPA Gatekeeper Installation
# Install Gatekeeper
apiVersion: v1
kind: Namespace
metadata:
name: gatekeeper-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: gatekeeper-controller-manager
namespace: gatekeeper-system
spec:
replicas: 3
selector:
matchLabels:
app: gatekeeper
template:
metadata:
labels:
app: gatekeeper
spec:
serviceAccountName: gatekeeper-admin
containers:
- name: manager
image: openpolicyagent/gatekeeper:v3.14.0
args:
- --port=8443
- --logtostderr
- --audit-interval=60
- --constraint-violations-limit=20
ports:
- containerPort: 8443
name: webhook-server
- containerPort: 8888
name: metrics
- containerPort: 9090
name: healthz
livenessProbe:
httpGet:
path: /healthz
port: 9090
readinessProbe:
httpGet:
path: /readyz
port: 9090
Compliance Policy Templates
# Constraint Template: Required Labels
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
name: k8srequiredlabels
spec:
crd:
spec:
names:
kind: K8sRequiredLabels
validation:
openAPIV3Schema:
properties:
labels:
type: array
items:
type: string
targets:
- target: admission.k8s.gatekeeper.sh
rego: |
package k8srequiredlabels
violation[{"msg": msg, "details": {"missing_labels": missing}}] {
provided := {label | input.review.object.metadata.labels[label]}
required := {label | label := input.parameters.labels[_]}
missing := required - provided
count(missing) > 0
msg := sprintf("Missing required labels: %v", [missing])
}
---
# Constraint: Enforce compliance labels
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
name: require-compliance-labels
spec:
match:
kinds:
- apiGroups: [""]
kinds: ["Namespace"]
- apiGroups: ["apps"]
kinds: ["Deployment", "StatefulSet"]
parameters:
labels:
- "compliance/soc2"
- "compliance/pci"
- "data-classification"
- "owner"
- "cost-center"
---
# Constraint Template: Container Image Registry
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
name: k8sallowedrepos
spec:
crd:
spec:
names:
kind: K8sAllowedRepos
validation:
openAPIV3Schema:
properties:
repos:
type: array
items:
type: string
targets:
- target: admission.k8s.gatekeeper.sh
rego: |
package k8sallowedrepos
violation[{"msg": msg}] {
container := input.review.object.spec.containers[_]
satisfied := [good | repo = input.parameters.repos[_] ; good = startswith(container.image, repo)]
not any(satisfied)
msg := sprintf("Container image %v from unauthorized registry", [container.image])
}
---
# Constraint: Only allow approved registries
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
name: allowed-container-registries
spec:
match:
kinds:
- apiGroups: [""]
kinds: ["Pod"]
- apiGroups: ["apps"]
kinds: ["Deployment", "StatefulSet", "DaemonSet"]
parameters:
repos:
- "gcr.io/company/"
- "company.azurecr.io/"
- "123456789012.dkr.ecr.us-east-1.amazonaws.com/"
---
# Constraint Template: No Privileged Containers
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
name: k8spsprivilegedcontainer
spec:
crd:
spec:
names:
kind: K8sPSPrivilegedContainer
targets:
- target: admission.k8s.gatekeeper.sh
rego: |
package k8spsprivileged
violation[{"msg": msg}] {
c := input_containers[_]
c.securityContext.privileged
msg := sprintf("Privileged container not allowed: %v", [c.name])
}
input_containers[c] {
c := input.review.object.spec.containers[_]
}
input_containers[c] {
c := input.review.object.spec.initContainers[_]
}
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPrivilegedContainer
metadata:
name: psp-privileged-container
spec:
match:
kinds:
- apiGroups: [""]
kinds: ["Pod"]
Kyverno Policy Engine
# Install Kyverno
apiVersion: v1
kind: Namespace
metadata:
name: kyverno
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: kyverno
namespace: kyverno
spec:
replicas: 3
selector:
matchLabels:
app: kyverno
template:
metadata:
labels:
app: kyverno
spec:
serviceAccountName: kyverno
containers:
- name: kyverno
image: ghcr.io/kyverno/kyverno:v1.11.0
ports:
- containerPort: 9443
name: https
- containerPort: 8000
name: metrics
---
# Kyverno Policy: Require resource limits
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: require-requests-limits
annotations:
policies.kyverno.io/title: Require Resource Requests and Limits
policies.kyverno.io/category: Best Practices, EKS Best Practices
policies.kyverno.io/severity: medium
policies.kyverno.io/description: >-
Resource requests and limits control resource allocation and consumption.
This policy requires all containers define requests and limits.
spec:
validationFailureAction: enforce
background: true
rules:
- name: validate-resources
match:
any:
- resources:
kinds:
- Pod
validate:
message: "CPU and memory resource requests and limits are required"
pattern:
spec:
containers:
- resources:
requests:
memory: "?*"
cpu: "?*"
limits:
memory: "?*"
cpu: "?*"
---
# Kyverno Policy: Add network policy
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: add-default-network-policy
annotations:
policies.kyverno.io/title: Add Default Network Policy
policies.kyverno.io/category: Security, Network
policies.kyverno.io/severity: medium
spec:
rules:
- name: add-networkpolicy
match:
any:
- resources:
kinds:
- Namespace
exclude:
any:
- resources:
namespaces:
- kube-system
- kube-public
generate:
kind: NetworkPolicy
name: default-deny-all
namespace: "{{request.object.metadata.name}}"
synchronize: true
data:
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
---
# Kyverno Policy: Mutate image pull policy
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: set-image-pull-policy
spec:
rules:
- name: set-image-pull-policy-always
match:
any:
- resources:
kinds:
- Pod
mutate:
patchStrategicMerge:
spec:
containers:
- (name): "*"
imagePullPolicy: Always
---
# Kyverno Policy: Block latest tag
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: disallow-latest-tag
spec:
validationFailureAction: enforce
background: true
rules:
- name: require-image-tag
match:
any:
- resources:
kinds:
- Pod
validate:
message: "Using a specific image tag is required"
pattern:
spec:
containers:
- image: "!*:latest"
Compliance Scanning
Trivy Security Scanner
# Trivy Operator deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: trivy-operator
namespace: trivy-system
spec:
replicas: 1
selector:
matchLabels:
app: trivy-operator
template:
metadata:
labels:
app: trivy-operator
spec:
serviceAccountName: trivy-operator
containers:
- name: operator
image: aquasec/trivy-operator:0.16.0
env:
- name: OPERATOR_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: OPERATOR_TARGET_NAMESPACES
value: "" # Scan all namespaces
- name: OPERATOR_SCAN_JOB_TIMEOUT
value: "5m"
- name: TRIVY_SEVERITY
value: "CRITICAL,HIGH,MEDIUM"
---
# Trivy ConfigMap for compliance checks
apiVersion: v1
kind: ConfigMap
metadata:
name: trivy-operator-config
namespace: trivy-system
data:
scanJob.complianceFailEntriesLimit: "10"
compliance.failEntriesLimit: "10"
compliance.reportType: "summary"
vulnerabilityReports.scanner: "Trivy"
configAuditReports.scanner: "Trivy"
---
# CronJob for periodic scanning
apiVersion: batch/v1
kind: CronJob
metadata:
name: trivy-scan-all
namespace: trivy-system
spec:
schedule: "0 2 * * *" # Daily at 2 AM
jobTemplate:
spec:
template:
spec:
serviceAccountName: trivy-scanner
containers:
- name: trivy
image: aquasec/trivy:latest
command:
- trivy
args:
- k8s
- --report=summary
- --severity=HIGH,CRITICAL
- all
restartPolicy: Never
Falco Runtime Security
# Falco DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: falco
namespace: falco
spec:
selector:
matchLabels:
app: falco
template:
metadata:
labels:
app: falco
spec:
serviceAccountName: falco
hostNetwork: true
hostPID: true
containers:
- name: falco
image: falcosecurity/falco-no-driver:latest
securityContext:
privileged: true
volumeMounts:
- name: docker-socket
mountPath: /host/var/run/docker.sock
- name: dev-fs
mountPath: /host/dev
- name: proc-fs
mountPath: /host/proc
readOnly: true
- name: boot-fs
mountPath: /host/boot
readOnly: true
- name: lib-modules
mountPath: /host/lib/modules
- name: usr-fs
mountPath: /host/usr
readOnly: true
- name: falco-config
mountPath: /etc/falco
volumes:
- name: docker-socket
hostPath:
path: /var/run/docker.sock
- name: dev-fs
hostPath:
path: /dev
- name: proc-fs
hostPath:
path: /proc
- name: boot-fs
hostPath:
path: /boot
- name: lib-modules
hostPath:
path: /lib/modules
- name: usr-fs
hostPath:
path: /usr
- name: falco-config
configMap:
name: falco-config
---
# Falco custom rules for compliance
apiVersion: v1
kind: ConfigMap
metadata:
name: falco-config
namespace: falco
data:
falco_rules.yaml: |
- rule: Write below root
desc: Detect writes below root directory (compliance violation)
condition: >
evt.type = open and
evt.arg.flags contains O_WRONLY and
fd.name glob /root/*
output: "Write below root directory (user=%user.name command=%proc.cmdline file=%fd.name)"
priority: ERROR
tags: [filesystem, compliance]
- rule: Unauthorized process in container
desc: Detect unauthorized processes (whitelist approach)
condition: >
spawned_process and
container and
not proc.name in (nginx, java, python, node)
output: "Unauthorized process in container (user=%user.name command=%proc.cmdline container=%container.name)"
priority: WARNING
tags: [process, compliance]
- rule: Sensitive file access
desc: Detect access to sensitive files
condition: >
open_read and
fd.name in (/etc/shadow, /etc/sudoers, /root/.ssh/id_rsa)
output: "Sensitive file accessed (user=%user.name file=%fd.name command=%proc.cmdline)"
priority: CRITICAL
tags: [filesystem, compliance, security]
Audit Logging and Monitoring
# Elasticsearch for audit logs
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
name: compliance-logs
namespace: elastic-system
spec:
version: 8.11.0
nodeSets:
- name: default
count: 3
config:
node.store.allow_mmap: false
volumeClaimTemplates:
- metadata:
name: elasticsearch-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Gi
---
# Kibana for compliance dashboards
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
name: compliance-dashboard
namespace: elastic-system
spec:
version: 8.11.0
count: 1
elasticsearchRef:
name: compliance-logs
---
# Filebeat for K8s audit logs
apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
name: audit-logs
namespace: elastic-system
spec:
type: filebeat
version: 8.11.0
elasticsearchRef:
name: compliance-logs
config:
filebeat.inputs:
- type: log
paths:
- /var/log/kubernetes/kube-apiserver-audit.log
json.keys_under_root: true
json.add_error_key: true
processors:
- add_cloud_metadata: {}
- add_kubernetes_metadata: {}
daemonSet:
podTemplate:
spec:
dnsPolicy: ClusterFirstWithHostNet
hostNetwork: true
containers:
- name: filebeat
volumeMounts:
- name: audit-logs
mountPath: /var/log/kubernetes
readOnly: true
volumes:
- name: audit-logs
hostPath:
path: /var/log/kubernetes
Compliance Reporting
# Compliance report generator
import yaml
from datetime import datetime, timedelta
from kubernetes import client, config
class ComplianceReporter:
def __init__(self):
config.load_kube_config()
self.api = client.CustomObjectsApi()
self.core_api = client.CoreV1Api()
def generate_compliance_report(self, framework='SOC2'):
"""
Generate comprehensive compliance report
"""
report = {
'generated_at': datetime.now().isoformat(),
'framework': framework,
'clusters': self.get_cluster_info(),
'policy_compliance': self.check_policy_compliance(),
'vulnerability_summary': self.get_vulnerability_summary(),
'audit_findings': self.get_audit_findings(),
'recommendations': []
}
# Analyze and add recommendations
report['recommendations'] = self.generate_recommendations(report)
return report
def check_policy_compliance(self):
"""
Check OPA Gatekeeper policy compliance
"""
constraints = self.api.list_cluster_custom_object(
group="constraints.gatekeeper.sh",
version="v1beta1",
plural="k8srequiredlabels"
)
compliance = {
'total_constraints': len(constraints['items']),
'violations': 0,
'details': []
}
for constraint in constraints['items']:
status = constraint.get('status', {})
violations = status.get('violations', [])
if violations:
compliance['violations'] += len(violations)
compliance['details'].append({
'constraint': constraint['metadata']['name'],
'violations': violations
})
return compliance
def get_vulnerability_summary(self):
"""
Get vulnerability scan summary from Trivy
"""
vuln_reports = self.api.list_cluster_custom_object(
group="aquasecurity.github.io",
version="v1alpha1",
plural="vulnerabilityreports"
)
summary = {
'critical': 0,
'high': 0,
'medium': 0,
'low': 0,
'total_scanned': len(vuln_reports['items'])
}
for report in vuln_reports['items']:
vulns = report.get('report', {}).get('vulnerabilities', [])
for vuln in vulns:
severity = vuln.get('severity', '').lower()
if severity in summary:
summary[severity] += 1
return summary
def get_audit_findings(self):
"""
Query Elasticsearch for audit log findings
"""
# Query last 30 days of audit logs
findings = {
'unauthorized_access_attempts': 0,
'privilege_escalations': 0,
'policy_violations': 0,
'suspicious_activities': []
}
# Implementation would query Elasticsearch
# This is a simplified example
return findings
def generate_recommendations(self, report):
"""
Generate actionable recommendations based on report
"""
recommendations = []
# Check vulnerability levels
if report['vulnerability_summary']['critical'] > 0:
recommendations.append({
'severity': 'critical',
'category': 'vulnerabilities',
'finding': f"{report['vulnerability_summary']['critical']} critical vulnerabilities found",
'recommendation': 'Patch critical vulnerabilities within 24 hours',
'remediation': 'Update affected container images to patched versions'
})
# Check policy violations
if report['policy_compliance']['violations'] > 0:
recommendations.append({
'severity': 'high',
'category': 'policy',
'finding': f"{report['policy_compliance']['violations']} policy violations detected",
'recommendation': 'Remediate policy violations immediately',
'remediation': 'Review and update resource configurations to meet policy requirements'
})
return recommendations
def export_report(self, report, format='json'):
"""
Export report in various formats
"""
if format == 'json':
return json.dumps(report, indent=2)
elif format == 'html':
return self.render_html_report(report)
elif format == 'pdf':
return self.render_pdf_report(report)
Best Practices
Policy Development
- Start with Audit Mode: Test policies before enforcement
- Incremental Rollout: Deploy policies gradually
- Clear Documentation: Document all policy requirements
- Exception Process: Define process for policy exceptions
- Regular Reviews: Quarterly policy review and updates
Automation
- Continuous Scanning: Automated daily/weekly scans
- Shift Left: Integrate compliance checks in CI/CD
- Auto-Remediation: Automatic fixes where safe
- Real-Time Alerting: Immediate notification of violations
- Dashboard Visibility: Real-time compliance status
Compliance Frameworks
- SOC 2: Focus on security, availability, confidentiality
- PCI-DSS: Payment card data protection requirements
- HIPAA: Healthcare data privacy and security
- GDPR: Data protection and privacy regulations
- ISO 27001: Information security management
Conclusion
Compliance automation enables continuous security and regulatory adherence. Success requires:
- Policy as Code: Automated, version-controlled policies
- Multi-Layer Defense: Admission, runtime, and audit controls
- Continuous Monitoring: Real-time compliance validation
- Clear Reporting: Actionable compliance dashboards
- Regular Updates: Keep policies current with regulations
The goal is embedding compliance into development workflows rather than treating it as a periodic audit exercise.