{"id":1117,"date":"2020-06-22T17:08:12","date_gmt":"2020-06-23T00:08:12","guid":{"rendered":"http:\/\/blog.nillsf.com\/?p=1117"},"modified":"2020-06-22T17:08:20","modified_gmt":"2020-06-23T00:08:20","slug":"vm-broken-use-os-disk-swap-in-azure-to-fix-and-restore","status":"publish","type":"post","link":"https:\/\/blog.nillsf.com\/index.php\/2020\/06\/22\/vm-broken-use-os-disk-swap-in-azure-to-fix-and-restore\/","title":{"rendered":"VM Broken? Use OS disk swap in Azure to fix and restore"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">I can&#8217;t begin to count the amount of times I screwed up a virtual machine that wouldn&#8217;t boot anymore. In most cases, that was due to messing up <code>\/etc\/fstab<\/code>, which controls which disks get mounted in Linux. If that file is broken, your machine won&#8217;t boot anymore. <\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In Azure, there&#8217;s an ability to swap the OS disk of a machine. That means you can swap in a working drive to fix the issues with your VM. I hope you never need to use it, but if you need it, it&#8217;s there for you.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In this post, I want to quickly explain how to do this and how you can use OS disk swap to get your VM back to a working state.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Process for fixing issues<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Let&#8217;s walk through the process I typically use in cases where I need to swap the OS drive:<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>Take a snapshot of the broken OS disk.<\/li><li>Create a new disk based on this snapshot.<\/li><li>Attach the new disk to a working VM.<\/li><li>Fix the issues in the working VM.<\/li><li>Detach the disk.<\/li><li>Swap the OS disk.<\/li><\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">In summary, I create a copy of the disk, fix the issues and then swap the OS drive. 
You cannot detach the OS drive from a VM, since a VM always needs to have an OS drive (except for VMs using ephemeral storage). <\/p>\n\n\n\n<p class=\"wp-block-paragraph\">With the process covered, let&#8217;s have a look at a working example:<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Fixing a broken VM using OS disk swap<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Breaking my VM<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">For my test setup, I pre-created two VMs running Ubuntu 18.04. Let&#8217;s log in to the VM we plan to break, and make a change in <code>\/etc\/fstab<\/code>. I&#8217;ll remove the last digit of the ID of the boot disk.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"732\" height=\"207\" src=\"\/wp-content\/uploads\/2020\/06\/image-39.png\" alt=\"\" class=\"wp-image-1118\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-39.png 732w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-39-300x85.png 300w\" sizes=\"auto, (max-width: 732px) 100vw, 732px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">And then I&#8217;ll do a reboot of the VM (in my case, from the Azure portal). The reboot should cause the disk failure to present itself. I wasn&#8217;t expecting the VM to actually boot &#8211; but it did &#8211; and I could still connect to it using SSH. But while working on the VM, I ran into many issues. Just one of those was that I couldn&#8217;t restore the fstab file. Apparently, the file system was mounted read-only, not read-write. 
<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"740\" height=\"334\" src=\"\/wp-content\/uploads\/2020\/06\/image-40.png\" alt=\"\" class=\"wp-image-1119\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-40.png 740w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-40-300x135.png 300w\" sizes=\"auto, (max-width: 740px) 100vw, 740px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Also, trying to run <code>sudo apt update<\/code> resulted in many errors:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"446\" src=\"\/wp-content\/uploads\/2020\/06\/image-41-1024x446.png\" alt=\"\" class=\"wp-image-1120\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-41-1024x446.png 1024w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-41-300x131.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-41-768x334.png 768w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-41-1140x497.png 1140w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-41.png 1142w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">In summary, not the issue I was expecting, but broken nonetheless. Let&#8217;s go ahead and fix it!<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Take a snapshot of the broken OS disk<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">First step is to take a snapshot of the OS disk. This can be done either via PowerShell\/CLI or in the portal. For the purpose of this post, I&#8217;ll use the portal route. <\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To take the snapshot, first navigate to your broken OS disk. 
You can get there via the Disks part of the VM blade:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"754\" height=\"426\" src=\"\/wp-content\/uploads\/2020\/06\/image-42.png\" alt=\"\" class=\"wp-image-1121\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-42.png 754w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-42-300x169.png 300w\" sizes=\"auto, (max-width: 754px) 100vw, 754px\" \/><figcaption>Go to the OS disk.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">In the resulting disk blade, you&#8217;ll see the &#8216;Create snapshot&#8217; option. Click that.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"790\" height=\"455\" src=\"\/wp-content\/uploads\/2020\/06\/image-43.png\" alt=\"\" class=\"wp-image-1122\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-43.png 790w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-43-300x173.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-43-768x442.png 768w\" sizes=\"auto, (max-width: 790px) 100vw, 790px\" \/><figcaption>Create a snapshot<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">In the next blade, provide the necessary details for the snapshot and hit Review and Create.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"776\" height=\"612\" src=\"\/wp-content\/uploads\/2020\/06\/image-44.png\" alt=\"\" class=\"wp-image-1123\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-44.png 776w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-44-300x237.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-44-768x606.png 768w\" sizes=\"auto, (max-width: 776px) 100vw, 776px\" \/><figcaption>Provide details for the 
snapshot<\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Create a new disk based on this snapshot<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Next up, we&#8217;ll create a disk from this snapshot. This cannot be done from the snapshot blade itself; you&#8217;ll have to navigate to the disks blade for this. To get there, just type disks in the search bar. <\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"456\" height=\"198\" src=\"\/wp-content\/uploads\/2020\/06\/image-45.png\" alt=\"\" class=\"wp-image-1124\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-45.png 456w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-45-300x130.png 300w\" sizes=\"auto, (max-width: 456px) 100vw, 456px\" \/><figcaption>Look for the disks blade<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Once there, hit the Add button on the top.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"687\" height=\"418\" src=\"\/wp-content\/uploads\/2020\/06\/image-46.png\" alt=\"\" class=\"wp-image-1125\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-46.png 687w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-46-300x183.png 300w\" sizes=\"auto, (max-width: 687px) 100vw, 687px\" \/><figcaption>Create a new disk.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">What you&#8217;ll need to do here is set the source type to snapshot, and refer to the snapshot we just created.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"771\" height=\"707\" src=\"\/wp-content\/uploads\/2020\/06\/image-47.png\" alt=\"\" class=\"wp-image-1126\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-47.png 771w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-47-300x275.png 300w, 
https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-47-768x704.png 768w\" sizes=\"auto, (max-width: 771px) 100vw, 771px\" \/><figcaption>Create a new disk based on the snapshot.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Hit create, and give it some time to create. Then we can attach it to the working VM.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Attach new disk to a working VM<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">In the VM blade of the VM that you&#8217;ll use to fix this issue, navigate to the disks blade and hit the &#8220;Attach existing disks&#8221; button. Look for the disk we just created, and hit the save button. <\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"898\" height=\"585\" src=\"\/wp-content\/uploads\/2020\/06\/image-49.png\" alt=\"\" class=\"wp-image-1128\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-49.png 898w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-49-300x195.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-49-768x500.png 768w\" sizes=\"auto, (max-width: 898px) 100vw, 898px\" \/><figcaption>Attach the existing disk, and hit the save button.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Give it a couple of seconds to attach the disk. Once the disk is attached, connect to your VM, and mount the new disk. The commands I typically use for this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># List all disks and partitions, then look for the \/dev\/sdX of the new disk.\nsudo fdisk -l\n# Create a mount point (-p avoids an error if it already exists).\nsudo mkdir -p \/mnt\/broken-disk\n# Replace sdX1 with the root partition of the new disk.\nsudo mount \/dev\/sdX1 \/mnt\/broken-disk<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">And with the mount done, we can now fix the issue.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Fix issues in working VM<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Fixing the issue in this case is as easy as adding the digit back into the <code>\/etc\/fstab<\/code> file. 
In this case, that file will be at location <code>\/mnt\/broken-disk\/etc\/fstab<\/code>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I added the &#8216;3&#8217; back in, and saved the updated file. Let&#8217;s move to the next step.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Detach disk<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">To detach the disk, first unmount it inside the fixing VM with <code>sudo umount \/mnt\/broken-disk<\/code>, then navigate back to the disks blade of that VM. On the line of the disk we just attached, navigate to the right part of the screen, hit the &#8216;X&#8217; and hit the save button.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"999\" height=\"566\" src=\"\/wp-content\/uploads\/2020\/06\/image-50.png\" alt=\"\" class=\"wp-image-1129\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-50.png 999w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-50-300x170.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-50-768x435.png 768w\" sizes=\"auto, (max-width: 999px) 100vw, 999px\" \/><figcaption>Detach the disk<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">With the disk detached, we can now swap the OS drive.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Swap OS disk<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Navigate to your broken VM, open the disks section of the VM blade and hit the Swap OS disk button.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"701\" height=\"316\" src=\"\/wp-content\/uploads\/2020\/06\/image-51.png\" alt=\"\" class=\"wp-image-1130\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-51.png 701w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-51-300x135.png 300w\" sizes=\"auto, (max-width: 701px) 100vw, 701px\" \/><figcaption>Hit the Swap OS disk button<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Pick the disk you want to swap in, and 
confirm the action by typing in the VM name.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"313\" height=\"466\" src=\"\/wp-content\/uploads\/2020\/06\/image-52.png\" alt=\"\" class=\"wp-image-1131\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-52.png 313w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-52-202x300.png 202w\" sizes=\"auto, (max-width: 313px) 100vw, 313px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Interestingly enough, hitting the OK button will cause Azure to stop your VM first. I was wondering if this was going to create an error because my VM was still running, but Azure was smart enough to first stop the VM.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1006\" height=\"456\" src=\"\/wp-content\/uploads\/2020\/06\/image-54.png\" alt=\"\" class=\"wp-image-1133\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-54.png 1006w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-54-300x136.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-54-768x348.png 768w\" sizes=\"auto, (max-width: 1006px) 100vw, 1006px\" \/><figcaption>Azure will stop the VM if it is still running.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Once the VM is stopped, this will trigger the OS disk swap:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"513\" height=\"306\" src=\"\/wp-content\/uploads\/2020\/06\/image-55.png\" alt=\"\" class=\"wp-image-1134\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-55.png 513w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-55-300x179.png 300w\" sizes=\"auto, (max-width: 513px) 100vw, 513px\" \/><figcaption>After the VM is stopped, the swap will 
happen.<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">It doesn&#8217;t trigger an automatic start of the VM, so hit the start button to see if our issues have been fixed. And in my case &#8211; as expected &#8211; everything worked normally again. I could run <code>sudo apt update<\/code> without issue (where it failed earlier).<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"592\" src=\"\/wp-content\/uploads\/2020\/06\/image-56-1024x592.png\" alt=\"\" class=\"wp-image-1135\" srcset=\"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-56-1024x592.png 1024w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-56-300x173.png 300w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-56-768x444.png 768w, https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-56.png 1126w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Everything working fine again.<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Summary<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">There are numerous ways in which you could mess up a VM and make it unable to boot. Having the swap OS disk option is a great way to quickly fix issues. This shouldn&#8217;t deter you from setting up good backup and restore capabilities (e.g. Azure Backup); however, swapping the OS disk can be a great quick fix for a number of issues.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Although I described a Linux-focused solution here, you can do the same thing with Windows boxes, if you know which files to fix.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I can&#8217;t begin to count the amount of times I screwed up a virtual machine that wouldn&#8217;t boot anymore. In most cases, that was due to messing up \/etc\/fstab, which controls which disks get mounted in Linux. If that file is broken, your machine won&#8217;t boot anymore. 
In Azure, there&#8217;s an ability to swap the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":1130,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[2,4],"tags":[8,126,125,127],"class_list":["post-1117","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-azure","category-management","tag-azure","tag-backup","tag-disk","tag-snapshot"],"jetpack_featured_media_url":"https:\/\/nillsfblog.blob.core.windows.net\/media\/2020\/06\/image-51.png","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/posts\/1117","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/comments?post=1117"}],"version-history":[{"count":1,"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/posts\/1117\/revisions"}],"predecessor-version":[{"id":1136,"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/posts\/1117\/revisions\/1136"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/media\/1130"}],"wp:attachment":[{"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/media?parent=1117"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/categories?post=1117"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.nillsf.com\/index.php\/wp-json\/wp\/v2\/tags?post=1117"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}