Migrating from RKE1 to RKE2: A Seamless Transition with SUSE Rancher Prime
Migrating from RKE1 to RKE2 is an essential transition for organizations relying on Rancher-managed Kubernetes clusters. With RKE1 reaching end-of-life (EOL) on July 31, 2025, moving to RKE2 ensures ongoing support, security updates, and performance improvements.
Why Migrate from RKE1 to RKE2?
RKE1 will no longer receive security patches or updates beyond its EOL date. According to the official SUSE announcement, RKE1 support ends July 31, 2025.
Delaying migration increases operational risk due to:
- Lack of security updates
- Compatibility issues with future Kubernetes versions
- Missing out on critical performance improvements
Here are the key advantages of moving to RKE2:
- Improved Security: SELinux support, FIPS compliance, and Pod Security Standards.
- Better Performance: RKE2 uses containerd, optimizing resource utilization and reducing overhead.
- Long-Term Stability: RKE2 aligns closely with upstream Kubernetes for better future compatibility.
- Seamless Rancher Integration: Multi-cluster management with built-in rolling upgrades.
Beyond RKE1 EOL: Other Reasons for Cluster Migration
While RKE1’s EOL is a pressing concern, there are other scenarios where cluster migration becomes necessary:
Moving Into and Out of the Cloud
Organizations frequently move workloads between cloud and on-premises environments for cost savings, compliance, performance, and vendor flexibility.
Common Challenges:
- Networking Differences: VPC configurations, CNI plugins, and ingress controllers need reconfiguration
- Cloud Storage Differences: Persistent volume formats are cloud-specific (e.g., AWS EBS vs. Azure Disks)
- IAM & Security Policies: RBAC and firewall rules require updates
Example Use Cases:
- AWS EKS → RKE2 on-prem for cost control and compliance
- Self-managed Kubernetes → managed services (EKS, AKS, GKE)
- Hybrid & multi-cloud scaling for resilience
Disaster Recovery (DR) & High Availability
Cluster migration also underpins business continuity, whether you maintain failover clusters or run workloads across multiple regions.
Benefits:
- Minimize downtime during failures
- Protection against outages (cloud, network, hardware)
- Regulatory compliance with business continuity requirements
Key Challenges:
- Keeping stateful applications in sync
- Failover orchestration using DNS, load balancers, or BGP
- Storage and data replication across environments
Foundational Changes & Infrastructure Upgrades
Major infrastructure changes often require migration rather than in-place upgrades.
Common Scenarios:
- Adopting new Kubernetes architectures
- Improving performance and scalability
- Enhancing security and compliance
- Switching container runtimes (Docker → containerd)
- Upgrading storage solutions
Choosing the Right Migration Strategy
Migration isn’t a one-size-fits-all approach. The right strategy depends on several factors:
- Timeline: How quickly do you need to complete the migration?
- Risk Tolerance: Can you afford downtime, or do you need a gradual transition?
- Team Involvement: Will this be admin-driven or do app teams need control?
- Cluster Differences: Are you making minimal changes or a major infrastructure shift?
Below are the three common migration strategies to consider:
1. Lift-and-Shift (Fastest, but Riskier)
What is Lift-and-Shift?
- You as the cluster admin move all workloads from one cluster to another in one big move
- Little to no changes are made to applications or configurations
- Best when workloads are compatible with the new cluster
Pros:
- Fastest migration method - Everything moves at once
- Minimal app team involvement - Admin-driven process
- Works well when clusters are nearly identical (same Kubernetes version, storage, etc.)
Cons:
- Higher risk of failures - No gradual testing phase
- Potential downtime - Some workloads may need to restart in the new cluster
- Infrastructure differences may require post-move fixes
2. Rolling Migration (Balanced Approach)
What is Rolling Migration?
- You as the cluster admin move applications one at a time in coordination with app teams
- Small to medium-size changes to applications may be made to better utilize the new environment
- Each app team tests and validates their services in the new cluster before fully migrating
Pros:
- Minimized risk - Applications are moved gradually with validation
- App teams validate their own workloads - Less troubleshooting after migration
- No major downtime - Old cluster stays online while workloads migrate
Cons:
- Slower migration process - Requires coordination with multiple teams
- Potential inconsistencies - If teams don’t migrate in sync, dependencies may break
- Higher resource costs - Both clusters run in parallel during migration
3. Phased Migration (Most Flexible, Requires App Team Cooperation)
What is Phased Migration?
- You as the cluster admin build a new cluster and inform app teams that they need to migrate
- Responsibility is on app teams to move their workloads when ready
- Original cluster stays online until everything is moved, then decommissioned
Pros:
- Less work for cluster admins - App teams handle their own migrations
- Flexibility - Teams move on their own timeline, reducing coordination pressure
- Great for major infrastructure changes - Teams can refactor if needed before moving
Cons:
- Unpredictable timeline - Some teams may delay migration, leaving two clusters running longer
- Potential inconsistencies - If teams don’t migrate in a structured way, dependencies may break
- May require temporary workarounds - Cross-cluster communication might be needed during migration
Migration Methods – Choosing the Right Approach
Different workloads and environments require different migration techniques. When selecting a method, consider:
- Are your workloads stateless or stateful?
- Do you need a fast migration or a controlled process?
- How critical is data consistency?
- What’s your team’s expertise with various migration tools?
1. YAML Export/Import
How It Works:
- Export workloads using:
kubectl get <resource-type> -o yaml > backup.yaml
- Apply them in the new cluster with:
kubectl apply -f backup.yaml
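As a rough illustration of the export and apply steps, the sketch below wraps them in two shell functions. The context names, namespace, directory, and the list of kinds are placeholders chosen for this example, not values from any particular environment. Exported manifests usually still carry cluster-specific fields (such as resourceVersion, uid, status, and a Service's clusterIP), so plan to review and clean the files between the two steps.

```shell
# Export a few common namespaced kinds from the source cluster, one file per kind.
# Arguments (all hypothetical names): kubeconfig context, namespace, output directory.
export_namespace() (
  src_context="$1"; namespace="$2"; out_dir="$3"
  mkdir -p "$out_dir" || exit 1
  # Extend this list to match the kinds your workloads actually use.
  for kind in configmaps services deployments; do
    kubectl --context "$src_context" -n "$namespace" get "$kind" -o yaml \
      > "$out_dir/$kind.yaml" || exit 1
  done
)

# Apply every exported file in the directory to the target cluster.
# Review and clean the files first; this function applies them as they are.
import_namespace() (
  dst_context="$1"; namespace="$2"; in_dir="$3"
  kubectl --context "$dst_context" -n "$namespace" apply -f "$in_dir"
)
```

A typical run would be export_namespace rke1-prod shop ./export, a manual review of the files, then import_namespace rke2-prod shop ./export. The target namespace has to exist before the import.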
Pros:
- Fast and simple, no extra tools required
- Good for stateless workloads (Deployments, Services, ConfigMaps)
Cons:
- No Persistent Volume (PV) migration, must move storage separately
- Manual and error-prone, requires careful dependency handling
Best for:
- Small workloads, quick transitions, and environments without persistent data
- Detailed YAML Export/Import guide
2. DR-Syncer
How It Works:
- Replicates Deployments, Services, ConfigMaps, Secrets, Persistent Volumes across clusters
- Ensures scheduled syncing for seamless migration
Pros:
- Purpose-built for Kubernetes migrations/DR – Handles both workloads and PVs
- Minimizes downtime – Keeps namespaces and data synchronized
- More efficient than manual YAML exports – Reduces human error
Cons:
- Requires setup & configuration
- May need cluster connectivity – Ensure network policies allow cross-cluster syncs
- Requires similar cluster setup – Target cluster should match source
- Target cluster must have storage configured for PV replication
Best for:
- Stateless and Stateful applications that need replication between clusters
Open Source Tool: DR-Syncer
3. Backup and Restore Tools
How It Works:
- Backup workloads in the old cluster
- Restore them in the new cluster, including Persistent Volumes
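With Velero as one example of such a tool, the two steps might look like the sketch below. It assumes Velero is already installed in both clusters and that both point at the same object storage location; the context and backup names are hypothetical. Whether volume data is included depends on how volume snapshots or file-system backup are configured in your Velero installation.

```shell
# Back up one namespace on the source cluster and wait for the backup to finish.
velero_backup() (
  src_context="$1"; backup_name="$2"; namespace="$3"
  velero --kubecontext "$src_context" backup create "$backup_name" \
    --include-namespaces "$namespace" --wait
)

# Restore that backup on the target cluster, which must see the same storage location.
velero_restore() (
  dst_context="$1"; backup_name="$2"
  velero --kubecontext "$dst_context" restore create \
    --from-backup "$backup_name" --wait
)
```

For example: velero_backup rke1-prod pre-migration shop, followed by velero_restore rke2-prod pre-migration once the backup is visible from the target cluster.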
Pros:
- Works across cloud and on-prem clusters
- Backs up all workloads including PVs, RBAC, and secrets
Cons:
- Requires object storage (AWS S3, MinIO, Azure Blob)
- May be slow for large clusters with many Persistent Volumes
- Some solutions require paid licenses
Best for:
- Full-cluster migrations needing persistent storage and security settings
- Backup and disaster recovery strategies
Detailed guides available for:
- Velero - Open-source Kubernetes backup/restore with plugin architecture
- CloudCasa - Cloud-based backup solution with comprehensive resource coverage
- Kasten K10 - Application-centric Kubernetes data management platform
4. Redeploy (GitOps)
How It Works:
- Update the target cluster in your pipelines to reflect the new environment
- Deploy a fresh environment in the new cluster using Helm, Kustomize, or GitOps (ArgoCD, Flux)
- Migrate data separately using snapshots, database replication, or manual restores
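For a Helm-based pipeline, retargeting a release at the new cluster can be as small as changing the kubeconfig context. The sketch below assumes Helm 3; the release name, chart path, and values file are illustrative and would come from your own repository.

```shell
# Install (or upgrade) a release on the target cluster from the same chart
# the old cluster used, with a values file adjusted for the new environment.
deploy_release() (
  dst_context="$1"; release="$2"; chart="$3"; namespace="$4"; values_file="$5"
  helm upgrade --install "$release" "$chart" \
    --kube-context "$dst_context" \
    --namespace "$namespace" --create-namespace \
    --values "$values_file" --wait
)
```

A call such as deploy_release rke2-prod shop ./charts/shop shop values-rke2.yaml deploys only the application objects; persistent data still has to be moved separately.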
Pros:
- Ensures a clean deployment, avoiding legacy config issues
- Best for infrastructure upgrades or Kubernetes version changes
Cons:
- No automatic PV migration, must handle database and storage manually
- Takes more time, especially for complex applications
- Requires applications to be fully defined as code (IaC/GitOps)
Best for:
- Organizations following Infrastructure-as-Code (IaC) or GitOps practices
- Teams migrating to declarative deployments for better reproducibility
5. Cattle-Drive for Rancher Resources
How It Works:
- Migrates Rancher-specific objects from source to target cluster
- Includes Projects, Namespaces, Rancher Permissions, Cluster Apps, and Catalog Repos
Pros:
- Automates the migration of Rancher resources between clusters
- Preserves project structure and access controls
Cons:
- Does not migrate your applications
- Limited to Rancher-specific resources
Best for:
- Use with redeployment migrations where you don’t want to manually recreate Projects and permissions
Open Source Tool: Cattle-Drive
Data Migration Methods
Migrating persistent data is crucial to maintaining application stability. Here are the recommended approaches:
Longhorn DR Volumes
How It Works:
- Longhorn’s Disaster Recovery (DR) volumes sync with a backup cluster on a scheduled basis
- Uses incremental restores to minimize transfer time
- DR volume is created from a volume’s backup in the backupstore
- Scheduled backup intervals determine how frequently data is updated
Pros:
- Scheduled Data Syncing – Uses periodic snapshots and incremental restoration
- Faster Recovery vs. Full Backup Restores – Avoids recovering entire volumes from scratch
- Built-in with Longhorn – No additional tools required for Longhorn users
Cons:
- Not real-time replication – Data is only as current as the last scheduled backup
- No snapshots or backups can be taken on a DR volume until it is activated
- Recovery Point Objective (RPO) depends on backup frequency
Best for:
- Organizations already using Longhorn for persistent storage
- Detailed guide on migrating using Longhorn DR volumes
pv-migrate
How It Works:
- CLI tool that migrates Persistent Volume Claims (PVCs) across namespaces, clusters, or storage backends
- Uses rsync over SSH with Load Balancers, Bind Mounts, and Port-Forwarding for data transfer
- Supports multiple migration strategies, automatically selecting the most efficient method
Pros:
- Works across namespaces, clusters, and storage backends – Not tied to a specific CSI driver
- Secure migrations – Uses SSH and rsync for encrypted data transfer
- Multiple migration strategies – Falls back to different approaches when needed
- Highly customizable – Configure rsync/SSH images, affinity, and network settings
Cons:
- Requires storage compatibility – Target storage class must support expected access modes
- Live data requires careful handling – Works best for pre-migration syncing
- Networking considerations – Cross-cluster migrations require proper network connectivity
Best for:
- Moving Persistent Volumes across namespaces or clusters
- Changing storage classes
- Step-by-step instructions for PVC migration
Open Source Tool: pv-migrate
Backup and Restore Solutions
Backup and restore solutions can also move persistent data, and they work across cloud and on-prem environments.
Pros: Support full-cluster backups, including PVCs, RBAC, and custom resources
Cons: Slower for large clusters; require object storage (e.g., AWS S3)
Detailed guides available for:
- Velero - Open-source backup/restore tool
- CloudCasa - SaaS Kubernetes backup solution
- Kasten K10 - Enterprise data management platform
Common Migration Failures & Troubleshooting
Even with careful planning, migrations can encounter issues. Here are common problems and their solutions:
1. Missing Critical Cluster Services
Issue: After migration, applications fail due to missing dependencies like cert-manager, monitoring, or GitOps tools.
Fix:
- Ensure required cluster services are installed first (cert-manager, Prometheus, ArgoCD)
- Deploy cluster-wide services before migrating workloads
2. Forgetting Cluster-Scoped Resources
Issue: Applications fail to start because ClusterRoles, ClusterRoleBindings, or CRDs are missing.
Fix:
- Export and apply CRDs before migrating workloads:
kubectl get crd -o yaml > crds.yaml
kubectl apply -f crds.yaml
- Ensure RBAC rules (ClusterRoleBindings, ClusterRoles) are migrated properly
- List cluster-wide resources with:
kubectl api-resources --verbs=list --namespaced=false
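The export and apply steps for cluster-scoped resources can be sketched as two functions with a review step in between. The context names and directory are assumptions for this example. The export includes built-in objects (for instance the system: ClusterRoles), so the files normally need trimming before anything is applied to the new cluster.

```shell
# Export CRDs and cluster-wide RBAC objects from the source cluster, one file per kind.
export_cluster_scoped() (
  src_context="$1"; out_dir="$2"
  mkdir -p "$out_dir" || exit 1
  for kind in crd clusterroles clusterrolebindings; do
    kubectl --context "$src_context" get "$kind" -o yaml \
      > "$out_dir/$kind.yaml" || exit 1
  done
)

# Apply the reviewed files to the target cluster.
# CRDs are applied first so that custom resources migrated later have a registered type.
apply_cluster_scoped() (
  dst_context="$1"; in_dir="$2"
  for kind in crd clusterroles clusterrolebindings; do
    kubectl --context "$dst_context" apply -f "$in_dir/$kind.yaml" || exit 1
  done
)
```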
3. Secrets Not Stored Externally
Issue: Applications crash because Secrets were lost during migration.
Fix:
- Externalize secrets using Vault, AWS Secrets Manager, or the External Secrets Operator
- Backup secrets before migration:
kubectl get secrets -A -o yaml > secrets-backup.yaml
- Restore secrets manually or via GitOps after migration
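Because Secret values in an exported file are only base64-encoded, not encrypted, the backup file deserves restrictive permissions. A minimal sketch, with a hypothetical context name and file path:

```shell
# Export all Secrets from the source cluster into a file that only the
# current user can read. The umask applies to files created by this function.
backup_secrets() (
  src_context="$1"; out_file="$2"
  umask 077
  kubectl --context "$src_context" get secrets -A -o yaml > "$out_file"
)
```

The function runs in a subshell, so the umask change does not leak into the calling shell. Treat the resulting file like any other credential store and delete it once the migration is verified.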
4. CNI Changes Impact Network Policies
Issue: Switching to a different CNI (Calico, Cilium, etc.) can change how NetworkPolicies are enforced, causing communication failures.
Fix:
- Check existing network policies before migration:
kubectl get networkpolicy -A
- Verify pod-to-pod and pod-to-service communication is allowed
- Update network policies to match the new CNI’s behavior before migration
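One way to check pod-to-service communication on the new cluster is to start a short-lived pod in the relevant namespace and request a Service URL from it. In the sketch below, the pod name netcheck, the busybox image tag, and the URL are assumptions for illustration. A pod started this way does not carry your application's labels, so policies that select on labels may treat it differently from the real workload.

```shell
# Start a temporary pod in the given namespace and try to fetch a URL from it.
# The pod is removed when the command finishes (--rm needs an attached session, hence -i).
check_service_reachable() (
  dst_context="$1"; namespace="$2"; url="$3"
  kubectl --context "$dst_context" -n "$namespace" run netcheck \
    --rm -i --restart=Never --image=busybox:1.36 \
    -- wget -q -O /dev/null -T 5 "$url"
)
```

For example, check_service_reachable rke2-prod shop http://api.shop.svc:8080/healthz would test one Service from inside the shop namespace.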
Best Practices for a Smooth Migration
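- Pre-flight validation: Run kubectl get all -A on both clusters and compare the results to detect missing resources (note that all covers only common workload kinds, not ConfigMaps, Secrets, or custom resources)
- Test migration in staging: Never migrate production workloads without a test run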
- Use GitOps for consistency: Store and redeploy cluster-wide resources via ArgoCD or Flux
- Document dependencies: Ensure all external services, cluster-scoped resources, and security policies are accounted for
- Inventory Resources: Run kubectl api-resources on both clusters to identify potential CRD compatibility issues
- Resource Planning: Ensure RKE2 nodes have sufficient capacity for all workloads
- Version Compatibility: Verify compatibility of operators and controllers between clusters
- Network Testing: Validate network connectivity between clusters before migration
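The resource inventory can be partly automated. The sketch below, which assumes two kubeconfig contexts with hypothetical names, lists API resource types that exist on the source cluster but not on the target, which usually points at CRDs or operators that still need to be installed.

```shell
# Print API resource names present on the source cluster but absent on the target.
missing_api_resources() (
  src_context="$1"; dst_context="$2"
  tmp_dir=$(mktemp -d) || exit 1
  kubectl --context "$src_context" api-resources -o name | sort > "$tmp_dir/src"
  kubectl --context "$dst_context" api-resources -o name | sort > "$tmp_dir/dst"
  # comm -23 keeps lines that appear only in the first (source) file.
  comm -23 "$tmp_dir/src" "$tmp_dir/dst"
  status=$?
  rm -rf "$tmp_dir"
  exit "$status"
)
```

Running missing_api_resources rke1-prod rke2-prod with no output suggests every resource type from the old cluster is at least registered on the new one; it does not compare API versions or schemas.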
Conclusion
Migrating from RKE1 to RKE2 is a critical step to ensure your Kubernetes clusters remain secure, performant, and supported. With RKE1 reaching end-of-life in 2025, organizations need to plan their transition strategy now.
By understanding the different migration strategies and choosing the right migration method for your specific workloads, you can transition seamlessly with minimal downtime. The key is thorough preparation, testing, and addressing common challenges before they impact your production environment.
We’ve created detailed guides for several migration methods to help you through the process:
- YAML Export/Import Migration - Simple method for stateless workloads
- Velero Migration - Open-source backup/restore approach
- CloudCasa Migration - Cloud-based backup solution
- Kasten K10 Migration - Application-centric data management
- Longhorn DR Volumes Migration - Kubernetes-native replication
- pv-migrate Migration - Targeted PVC migration
For further discussion, feel free to connect with me at support.tools or check out my book Rancher Deep Dive for in-depth insights into Kubernetes and Rancher management.
Additional Resources
- Official SUSE RKE1 EOL Announcement: SUSE KB