When disaster strikes and your entire RKE2 cluster or Longhorn control plane becomes unavailable, you can still recover critical data from a single Longhorn volume replica using a static pod definition. This emergency recovery method works even when the Kubernetes API server is completely offline and Docker is not available on your RKE2 nodes.

Recovering Longhorn Replica Data in RKE2

In this guide, you’ll learn how to safely access and export data from a Longhorn volume when traditional recovery methods aren’t possible. This approach leverages RKE2’s static pod capability to temporarily mount a volume replica without requiring the Longhorn controller or Kubernetes API.

Prerequisites

  • Access to at least one RKE2 node containing a healthy replica of your Longhorn volume
  • Basic knowledge of Linux filesystem commands
  • Root access to the node

Step 1: Locate the Replica Data on Disk

First, identify where Longhorn stores its replica data. Run this command to find the replica storage path:

find / -name longhorn-disk.cfg

You might see:

/var/lib/longhorn/longhorn-disk.cfg
  

Then list the replicas:

ls /var/lib/longhorn/replicas/

Example:

pvc--<8charUUID>
pvc-27c076f8-5710-416f-9729-83194cad4aac-7fb2c32d
  

Placeholder: /var/lib/longhorn/replicas/pvc-<your-volume-name>-<uuid>

This command searches your entire filesystem for the Longhorn configuration file, which indicates where replicas are stored.

Step 2: Determine the Volume Size from Metadata

To correctly mount the volume, you need its exact size. Examine the volume metadata file:

cat /var/lib/longhorn/replicas/pvc-<volume-name>-<uuid>/volume.meta

Look for the Size field:

{"Size":10737418240, ...}

Placeholder: Size: <volume-size-in-bytes>

Example:

cat /var/lib/longhorn/replicas/pvc-27c076f8-5710-416f-9729-83194cad4aac-7fb2c32d/volume.meta

Yields:

{"Size":10737418240, "Head":"volume-head-000.img", ...}

The Size field contains the volume’s size in bytes, which you’ll need in the next step. The JSON output also includes other useful metadata about the volume structure.

Step 3: Create a Static Pod Manifest to Launch the Longhorn Engine

Now you’ll create a static pod definition that RKE2 will automatically deploy. This pod will run the Longhorn engine and expose your volume as a block device:

/var/lib/rancher/rke2/agent/pod-manifests/longhorn-recovery.yaml

Template:

apiVersion: v1
kind: Pod
metadata:
  name: longhorn-launch
spec:
  hostPID: true
  containers:
  - name: engine
    image: longhornio/longhorn-engine:v<version>
    securityContext:
      privileged: true
    command: ["launch-simple-longhorn"]
    args: ["<volume-name>", "<volume-size-in-bytes>"]
    volumeMounts:
    - name: dev
      mountPath: /host/dev
    - name: proc
      mountPath: /host/proc
    - name: data
      mountPath: /volume
  volumes:
  - name: dev
    hostPath:
      path: /dev
  - name: proc
    hostPath:
      path: /proc
  - name: data
    hostPath:
      path: <host-path-to-replica>
  restartPolicy: Never

Example:

apiVersion: v1
kind: Pod
metadata:
  name: longhorn-launch
spec:
  hostPID: true
  containers:
  - name: engine
    image: longhornio/longhorn-engine:v1.8.0
    securityContext:
      privileged: true
    command: ["launch-simple-longhorn"]
    args: ["pvc-27c076f8-5710-416f-9729-83194cad4aac", "10737418240"]
    volumeMounts:
    - name: dev
      mountPath: /host/dev
    - name: proc
      mountPath: /host/proc
    - name: data
      mountPath: /volume
  volumes:
  - name: dev
    hostPath:
      path: /dev
  - name: proc
    hostPath:
      path: /proc
  - name: data
    hostPath:
      path: /var/lib/longhorn/replicas/pvc-27c076f8-5710-416f-9729-83194cad4aac-7fb2c32d
  restartPolicy: Never

This manifest creates a privileged pod that mounts your replica data and exposes it as a standard block device. Be sure to replace the placeholders with your actual values, including:

  • The correct Longhorn engine version
  • Your volume name (from the replica path)
  • The exact volume size in bytes (from volume.meta)
  • The full path to your replica directory

Step 4: Monitor the Recovery Process Through Pod Logs

To verify the recovery process is working, check the pod logs:

export CRI_CONFIG_FILE=/var/lib/rancher/rke2/agent/etc/crictl.yaml

Find the pod:

/var/lib/rancher/rke2/bin/crictl pods | grep longhorn-launch

Then tail the logs:

/var/lib/rancher/rke2/bin/crictl logs <container-id>

Once the pod is running, you should see log messages indicating that the Longhorn engine has started and the volume is available.

Step 5: Mount and Access the Recovered Volume Data

After the Longhorn engine initializes successfully, a new block device will appear on your system:

/dev/longhorn/
  

Mount this device in read-only mode to prevent any accidental data corruption:

mkdir -p /mnt/longhorn
mount -o ro /dev/longhorn/pvc-27c076f8-5710-416f-9729-83194cad4aac /mnt/longhorn

At this point, all your volume data is accessible under /mnt/longhorn. You can use standard file operations to copy data to a safe location:

# Example: Create a backup archive
tar -czf /tmp/volume-backup.tar.gz -C /mnt/longhorn .

# Or copy specific files
cp -rp /mnt/longhorn/important-data /tmp/backup/

# Or use rsync for large data sets
rsync -av /mnt/longhorn/ /tmp/backup/

You can also create a volume in Longhorn and mount it in maintense mode to this same node copy the data directly to the new volume.

Step 6: Clean Up After Recovery

Once you’ve recovered your data, clean up the resources:

rm /var/lib/rancher/rke2/agent/pod-manifests/longhorn-recovery.yaml

After removing the manifest file, RKE2 will automatically stop the static pod, and the block device will disappear. You should also unmount the filesystem before this happens:

umount /mnt/longhorn

Best Practice: Always mount Longhorn recovery volumes as read-only (-o ro) to prevent accidental data corruption. Any writes to an isolated replica could cause data inconsistencies if you later restore the Longhorn system.

Troubleshooting Common Issues

Block Device Doesn’t Appear

If the /dev/longhorn/<volume-name> device doesn’t appear:

  • Check the pod logs for errors using the crictl commands from Step 4
  • Verify that the replica path and volume size match exactly with the metadata
  • Ensure the Longhorn engine image version is compatible with your volume format

Mount Operation Fails

If you encounter filesystem errors when mounting:

  • The filesystem might be corrupted within the volume
  • Try using filesystem recovery tools like fsck before mounting
  • Consider using data recovery tools on the raw block device

Conclusion

This emergency recovery technique provides a reliable method to access your data from a Longhorn volume even when the entire Kubernetes control plane or Longhorn system is unavailable. By leveraging RKE2’s static pod capability, you can temporarily bring up just enough of the Longhorn engine to access your volume data without requiring the full orchestration system.

While this method is intended for emergency recovery scenarios, understanding the underlying structure of Longhorn volumes gives you a powerful option for data recovery. Remember to perform regular backups using Longhorn’s built-in snapshot and backup features to minimize the need for such emergency measures in the future.