Stablepoint Knowledge Base

Repairing AWS Volumes (Slightly Dangerous)

Posted 6th August, 2019

This method has the potential to cause data loss! Be sure there are good backups at hand before you start.

  1. Power down the server that has the affected volume attached, and make a note of the server's availability zone (e.g. eu-west-1b).
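
    If you prefer the CLI, something along these lines should work (the instance ID is a placeholder; substitute the real one):

      aws ec2 stop-instances --instance-ids i-0123456789abcdef0
      # Note the availability zone for later:
      aws ec2 describe-instances --instance-ids i-0123456789abcdef0 \
        --query 'Reservations[0].Instances[0].Placement.AvailabilityZone' --output text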

  2. Click Launch Instance at the top of the Instances page and select the Amazon Linux 2 AMI (HVM), SSD Volume Type image. Set the instance type to t2.micro and click Configure Instance Details at the bottom right.

  3. On the Instance Details page, in the Subnet dropdown, choose the one that matches the availability zone of the old server. Then just click Review and Launch, then Launch. You'll be prompted for a Key Pair; just generate a new one and a .pem file will download to your machine.

    • You'll be sent back to the Instance list. Take a note of the Instance ID for the new server. You'll need this later.
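
    The launch can also be scripted. This is just a sketch: the AMI and subnet IDs are placeholders, and it assumes a key pair named rescue-key already exists (the CLI won't generate one for you the way the console does):

      aws ec2 run-instances --image-id ami-0abcdef1234567890 \
        --instance-type t2.micro --subnet-id subnet-0abc1234 \
        --key-name rescue-key \
        --query 'Instances[0].InstanceId' --output text
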
  4. Whilst the new server is launching, select the server that the volume is attached to in the list, and in the bottom panel, click /dev/sda1. In the pop-up tooltip, click the EBS ID (e.g. vol-04fs4fs4q3rs) to go to the console for that volume.

  5. Click the Actions button at the top of the page and choose Detach Volume.
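
    Steps 4 and 5 can likewise be done from the CLI (IDs below are placeholders):

      # Find the volume attached to the original instance
      aws ec2 describe-volumes \
        --filters Name=attachment.instance-id,Values=i-0123456789abcdef0 \
        --query 'Volumes[].{ID:VolumeId,Device:Attachments[0].Device}' --output table
      # Detach it and wait until it shows as available
      aws ec2 detach-volume --volume-id vol-04fs4fs4q3rs
      aws ec2 wait volume-available --volume-ids vol-04fs4fs4q3rs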

  6. Using the .pem file that was generated, SSH to the new Amazon Linux instance's public IP with the username ec2-user. Once in, run sudo poweroff.
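
    For example (the key file name and IP are placeholders; the key must not be world-readable or ssh will refuse to use it):

      chmod 400 ~/Downloads/rescue-key.pem
      ssh -i ~/Downloads/rescue-key.pem ec2-user@203.0.113.10
      # then, once on the instance:
      sudo poweroff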

  7. Once the new server has shut down (it will show as Stopped in the instance list), go to the Volumes list and find the volume that matches the ID you clicked earlier; its state should show as Available, marked with a blue icon. Select it, then click the Actions button and choose Attach Volume. In the instance dropdown, choose the instance with the ID you noted earlier, and set the device to the highest possible value (e.g. /dev/sdp).
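
    Or from the CLI, with placeholder IDs as before (i-0fedcba9876543210 standing in for the new rescue instance):

      aws ec2 attach-volume --volume-id vol-04fs4fs4q3rs \
        --instance-id i-0fedcba9876543210 --device /dev/sdp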

  8. Power up the instance and, once it's up, SSH to it. N.B. it will likely have a new public IP, but just use the same key and the ec2-user username.

  9. Use lsblk to find the device name of the volume (it will not always match the device name you set when attaching, so double-check). It'll be the one that isn't mounted.
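
    Typical lsblk output looks something like this (sizes are illustrative); note that a volume attached as /dev/sdp usually shows up as /dev/xvdp on Amazon Linux:

      NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
      xvda    202:0    0   8G  0 disk
      └─xvda1 202:1    0   8G  0 part /
      xvdp    202:240  0  40G  0 disk

    Here xvdp is the unmounted rescue target.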

  10. Run sudo xfs_repair -L /dev/sdp (swap sdp for whatever the actual device name ended up being). The -L flag zeroes the corrupt XFS log, which is why this procedure can lose recently written data. This will take a minute or two.

  11. Once done, run sudo poweroff and wait until the instance's status is Stopped in the Instance list. As with the main server, click the device in the bottom panel and click the EBS ID to go to the volume's page.

  12. Click Actions at the top left and again choose Detach Volume. Then click Actions again and choose Attach Volume. Now select the instance it was originally attached to, but make sure the device name is set to /dev/sda1. It must be exactly /dev/sda1, not sda or anything else.
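
    The CLI equivalent, again with placeholder IDs:

      aws ec2 detach-volume --volume-id vol-04fs4fs4q3rs
      aws ec2 wait volume-available --volume-ids vol-04fs4fs4q3rs
      # Reattach to the ORIGINAL instance; the device must be exactly /dev/sda1
      aws ec2 attach-volume --volume-id vol-04fs4fs4q3rs \
        --instance-id i-0123456789abcdef0 --device /dev/sda1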

  13. Boot up the original instance. Once the status shows as running, click Actions, then select Instance Settings and finally Get Instance Screenshot.

  14. Refresh the screenshot now and then to make sure no XFS errors prevent booting. Once you see a login prompt or can SSH in, you should be all set.
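
    Steps 13 and 14 can also be done from the CLI; re-run the screenshot command to "refresh" it (the image comes back base64-encoded):

      aws ec2 start-instances --instance-ids i-0123456789abcdef0
      aws ec2 get-console-screenshot --instance-id i-0123456789abcdef0 \
        --query ImageData --output text | base64 --decode > screenshot.jpg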

If the server still won't boot after this, ask Bruce or Dom for help.