From 3f103e49a531e9aa4c2a6528dbfb808d7f7ad9db Mon Sep 17 00:00:00 2001 From: jessica Date: Mon, 1 Dec 2025 19:58:25 -0500 Subject: [PATCH 1/8] create and update guides --- .../raid_soft/guide.en-gb.md | 621 +++++++--- .../raid_soft/guide.fr-fr.md | 593 ++++++--- .../raid_soft_uefi/guide.en-gb.md | 1064 +++++++++++++++++ .../raid_soft_uefi/guide.fr-fr.md | 1064 +++++++++++++++++ .../raid_soft_uefi/meta.yaml | 2 + 5 files changed, 3045 insertions(+), 299 deletions(-) create mode 100644 pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md create mode 100644 pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md create mode 100644 pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/meta.yaml diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md index 89e4dac2e54..8fe329f0549 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md @@ -1,7 +1,7 @@ --- -title: How to configure and rebuild software RAID -excerpt: Find out how to verify the state of the software RAID of your server and rebuild it after a disk replacement -updated: 2023-08-21 +title: Managing and rebuilding software RAID on servers in legacy boot (BIOS) mode +excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode +updated: 2025-12-02 --- ## Objective @@ -10,21 +10,65 @@ Redundant Array of Independent Disks (RAID) is a technology that mitigates data The default RAID level for OVHcloud server installations is RAID 1, which doubles the space taken up by your data, effectively halving the useable disk space. -**This guide explains how to configure your server’s RAID array in the event that it needs to be rebuilt due to corruption or disk failure.** +**This guide explains how to manage and rebuild a software RAID in the event of a disk replacement on your server in legacy boot mode (BIOS).** + +Before we begin, please note that this guide focuses on Dedicated servers that use legacy boot (BIOS) mode. If your server uses the UEFI mode (newer motherboards), refer to this guide [Managing and rebuilding software RAID on servers in UEFI boot mode](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). + +To check whether a server runs on legacy BIOS or UEFI mode, run the following command: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` ## Requirements -- A [dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration +- A [Dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration - Administrative (sudo) access to the server via SSH +- Understanding of RAID and partitions ## Instructions -### Removing the disk +When you purchase a new server, you may feel the need to perform a series of tests and actions. One such test could be to simulate a disk failure in order to understand the RAID rebuild process and prepare yourself in case it ever happens. 
+ +### Content overview + +- [Basic Information](#basicinformation) +- [Simulating a disk failure](#diskfailure) + - [Removing the failed disk](#diskremove) +- [Rebuilding the RAID](#raidrebuild) + - [Rebuilding the RAID in rescue mode](#rescuemode) + - [Adding the label to the SWAP partition (if applicable)](#swap-partition) + - [Rebuilding the RAID in normal mode](#normalmode) + + + +### Basic Information In a command line session, type the following code to determine the current RAID status: ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 nvme0n1p2[1] nvme0n1p20] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +This command shows us that we have two software RAID devices currently set up, with **md4** being the largest one. The **md4** RAID device consists of two partitions, which are known as **nvme1n1p4** and **nvme0n1p4**. + +The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. + +If you have a server with SATA disks, you would get the following results: + +```sh +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda2[1] sdb2[0] @@ -38,12 +82,10 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -This command shows us that we have two RAID arrays currently set up, with md4 being the largest partition. The partition consists of two disks, which are known as sda4 and sdb4. The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. - Although this command returns our RAID volumes, it doesn't tell us the size of the partitions themselves. We can find this information with the following command: ```sh -fdisk -l +[user@server_ip ~]# sudo fdisk -l Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors Disk model: HGST HUS724020AL @@ -87,71 +129,14 @@ I/O size (minimum/optimal): 512 bytes / 512 bytes The `fdisk -l` command also allows you to identify your partition type. This is an important information when it comes to rebuilding your RAID in case of a disk failure. -For **GPT** partitions, the command will return: `Disklabel type: gpt`. +For **GPT** partitions, line 6 will display: `Disklabel type: gpt`. This information can only been seen when the server is in normal mode. -```sh -Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors -Disk model: HGST HUS724020AL -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: gpt' -Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F -``` - -For **MBR** partitions, the command will return: `Disklabel type: dos`. - -```sh -Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors -Disk model: QEMU HARDDISK -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: dos' -Disk identifier: 0x150f6797 -``` - -We can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. If we were to run the mount command we can also find out the layout of the disk. 
- -```sh -mount - -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -/dev/md4 on /home type ext3 (rw,relatime) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) -``` +Still going by the results of `fdisk -l`, we can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. Alternatively, the `lsblk` command offers a different view of the partitions: ```sh -lsblk +[user@server_ip ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk @@ -171,90 +156,151 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -As the disks are currently mounted by default, to remove a disk from the RAID, we first need to unmount the disk, then simulate a failure, and finally remove it. We will remove `/dev/sda4` from the RAID with the following command: +We take note of the devices, partitions and their mount points. From the above commands and results, we have: + +- Two RAID arrays: `/dev/md2` and `/dev/md4`. +- Four partitions are part of the RAID with the mount points: `/` and `/home`. + + + +### Simulating a disk failure + +Now that we have all the necessary information, we can simulate a disk failure and proceed with the tests. In this example, we will fail the disk `sda`. + +The preferred way to do this is via the OVHcloud rescue mode environment. 
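Before rebooting into rescue mode, it can be useful to keep a copy of the current partition layout and RAID configuration so that you can compare it once the RAID has been rebuilt. This step is optional and is not part of the procedure itself; the following is a minimal sketch using the device names from the example above (the file names are placeholders, adapt everything to your server):

```sh
[user@server_ip ~]# sudo sfdisk -d /dev/sda > sda_partition_table.txt
[user@server_ip ~]# sudo mdadm --detail --scan > raid_arrays.txt
[user@server_ip ~]# lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT > disk_layout.txt
```

Keep these files somewhere other than the server itself, for example by copying them to your local machine with `scp`.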
+ +First reboot the server in rescue mode and log in with the provided credentials. + +To remove a disk from the RAID, the first step is to mark it as **Failed** and remove the partitions from their respective RAID arrays. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1] sdb2[0] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0] sdb4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +From the above output, sda consists of two partitions in RAID which are **sda2** and **sda4**. + + + +#### Removing the failed disk + +First we mark the partitions **sda2** and **sda4** as failed. ```sh -umount /dev/md4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 +# mdadm: set /dev/sda2 faulty in /dev/md2 ``` -> [!warning] -> Please note that if you are connected as the user `root`, you may get the following message when you try to unmount the partition (in our case, where our md4 partition is mounted in /home): -> ->
umount: /home: target is busy
-> -> In this case, you must log out as the user root and connect as a local user (in our case `debian`), and use the following command: -> ->
debian@ns000000:/$ sudo umount /dev/md4
-> -> If you do not have a local user, you need to [create one](/pages/bare_metal_cloud/dedicated_servers/changing_root_password_linux_ds). +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 --fail /dev/sda4 +# mdadm: set /dev/sda4 faulty in /dev/md4 +``` -This will provide us with the following output: +We have now simulated a failure of the RAID, when we run the `cat /proc/mdstat` command, we have the following output: ```sh -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1](F) sdb2[0] + 931954688 blocks super 1.2 [2/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +As we can see above, the [F] next to the partitions indicates that the disk has failed or is faulty. + +Next, we remove these partitions from the RAID arrays. 
+ +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 +# mdadm: hot removed /dev/sda2 from /dev/md2 +``` + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md4 --remove /dev/sda4 +# mdadm: hot removed /dev/sda4 from /dev/md4 ``` -As we can see the, entry of `/dev/md4` is no longer mounted. However, the RAID is still active, so we need to simulate a failure to remove the disk. We can do this with the following command: +To make sure that we get a disk that is similar to an empty disk, we use the following command. Replace **sda** with your own values: ```sh -sudo mdadm --fail /dev/md4 /dev/sda4 +shred -s10M -n1 /dev/sda1 +shred -s10M -n1 /dev/sda2 +shred -s10M -n1 /dev/sda3 +shred -s10M -n1 /dev/sda4 +shred -s10M -n1 /dev/sda +``` + +The disk now appears as a new, empty drive: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +sda 8:0 0 1.8T 0 disk +sdb 8:16 0 1.8T 0 disk +├─sdb1 8:17 0 1M 0 part +├─sdb2 8:18 0 888.9G 0 part +│ └─md2 9:2 0 888.8G 0 raid1 / +├─sdb3 8:19 0 512M 0 part [SWAP] +└─sdb4 8:20 0 973.6G 0 part + └─md4 9:4 0 973.5G 0 raid1 /home ``` -We have now simulated a failure of the RAID. The next step is to remove the partition from the RAID array with the following command: +If we run the following command, we see that our disk has been successfully "wiped": ```sh -sudo mdadm --remove /dev/md4 /dev/sda4 +parted /dev/sda +GNU Parted 3.5 +Using /dev/sda +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/sda: unrecognised disk label +Model: HGST HUS724020AL (SATA) +Disk /dev/sda: 1.8T +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: ``` -You can verify that the partition has been removed with the following command: +Our RAID status should now look like this: ```sh -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] - 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 4/7 pages [16KB], 65536KB chunk +md2 : active raid1 sdb2[0] + 931954688 blocks super 1.2 [1/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk md4 : active raid1 sdb4[1] - 1020767232 blocks super 1.2 [2/1] [_U] + 1020767232 blocks super 1.2 [1/2] [_U] bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -The following command will verify that the partition has been removed: +From the results above, we can see that only two partitions now appear in the RAID arrays. We have successfully failed the disk **sda** and we can now proceed with the disk replacement. 
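When you submit the replacement request, you will usually be asked for the serial number of the disk to be replaced, so it is a good idea to note it down before the intervention. A quick way to list the serial numbers is shown below (a sketch; the `SERIAL` column requires a recent version of `lsblk`, and `smartctl -i` from the `smartmontools` package can be used as an alternative; the serial numbers shown are placeholders):

```sh
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -d -o NAME,MODEL,SERIAL
NAME MODEL            SERIAL
sda  HGST HUS724020AL xxxxxxxxxxxx
sdb  HGST HUS724020AL xxxxxxxxxxxx
```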
+ +For more information on how to prepare and request for a disk replacement, consult this [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement) + +If you run the following command, you can have more details on the RAID array(s): ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -286,56 +332,225 @@ Consistency Policy : bitmap 1 8 20 1 active sync /dev/sdb4 ``` + + ### Rebuilding the RAID -Once the disk has been replaced, we need to copy the partition table from a healthy disk (in this example, sdb) to the new one (sda) with the following command: +> [!warning] +> +> For most servers in software RAID, after a disk replacement, the server is able to boot in normal mode (on the healthy disk) to rebuild the RAID. However, if the server is not able to boot in normal mode, it will be rebooted in rescue mode to proceed with the RAID rebuild. +> -**For GPT partitions** + + +#### Rebuilding the RAID in normal mode + +The following steps are performed in normal mode. In our example, we have replaced the disk **sda**. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sdb) to the new one (sda). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> sudo sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk`. +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdX +>> ``` +>> +>> If you receive the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. If you still cannot see the newly-created partitions (e.g. with `lsblk`), you need to reboot the server before continuing. +>> +> **For MBR partitions** +>> +>> ```sh +>> [user@server_ip ~]# sudo sfdisk -d /dev/sdX | sfdisk /dev/sdX +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +>> + +Next, we add the partitions to the RAID: ```sh -sgdisk -R /dev/sda /dev/sdb +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/sda2 +# mdadm: added /dev/sda2 + +[user@server_ip ~]# sudo mdadm --add /dev/md4 /dev/sda4 +# mdadm: re-added /dev/sda4 ``` -The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` +Use the following command to monitor the RAID rebuild: -Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +```sh +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[0] sdb2[1] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 4/4 pages [16KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +Lastly, we add a label and mount the [SWAP] partition (if applicable). 
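Before doing so, you can optionally wait for the resynchronisation to finish: `mdadm` can block until any recovery activity on the given arrays is complete. A minimal sketch, using the array names from this example:

```sh
[user@server_ip ~]# sudo mdadm --wait /dev/md2 /dev/md4
```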
+ +To add a label the SWAP partition: ```sh -sgdisk -G /dev/sda +[user@server_ip ~]# sudo mkswap /dev/sda4 -L swap-sda4 ``` -**For MBR partitions** +Next, retrieve the UUIDs of both swap partitions: + +```sh +[user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +[user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` -Once the disk has been replaced, we need to copy the partition table from a healthy disk (in this example, sdb) to the new one (sda) with the following command: +We replace the old UUID of the swap partition (**sda4**) with the new one in `/etc/fstab`. + +Example: + +```sh +[user@server_ip ~]# sudo nano etc/fstab + +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=BIOS /boot vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Based on the above results, the old UUID is `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` and should be replaced with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the coorect UUID. + +Next, we verify that everything is properly mounted with the following command: + +```sh +[user@server_ip ~]# sudo mount -av +/ : ignored +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Run the following command to enable the swap partition: + +```sh +[user@server_ip ~]# sudo swapon -av +``` + +Then reload the system with the following command: ```sh -sfdisk -d /dev/sdb | sfdisk /dev/sda +[user@server_ip ~]# sudo systemctl daemon-reload ``` -The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +We have now successfully completed the RAID rebuild. + + + +/// details | **Rebuilding the RAID in rescue mode** + +If you server is unable to reboot in normal mode after a disk replacement, it will be rebooted in rescue mode. + +In this example, we are replacing the disk `sdb`. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sda) to the new one (sdb). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` +>> +>> Example: +>> +>> ```sh +>> sudo sgdisk -R /dev/sdb /dev/sda +>> ``` +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdb +>> ``` +>> +>> If you the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. +>> +> **For MBR partitions** +>> +>> ```sh +>> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +>> + +We can now rebuild the RAID array. The following code snippet shows how we can add the new partitions (sdb2 and sdb4) back in the RAID array. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 +# mdadm: added /dev/sdb2 + +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 +# mdadm: re-added /dev/sdb4 +``` -We can now rebuild the RAID array. The following code snippet shows how we can rebulid the `/dev/md4` partition layout with the recently-copied sda partition table: +Use the `cat /proc/mdstat` command to monitor the RAID rebuild: ```sh -mdadm --add /dev/md4 /dev/sda4 -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 1/7 pages [4KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk -md4 : active raid1 sda4[0] sdb4[1] - 1020767232 blocks super 1.2 [2/2] [UU] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -We can verify the RAID details with the following command: +For more details on the RAID array(s): ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -362,16 +577,118 @@ mdadm --detail /dev/md4 Events : 0.95 Number Major Minor RaidDevice State - 0 8 2 0 spare rebuilding /dev/sda4 - 1 8 18 1 active sync /dev/sdb4 + 0 8 2 0 active sync /dev/sda4 + 1 8 18 1 spare rebuilding /dev/sdb4 +``` + + + +#### Adding the label to the SWAP partition (if applicable) + +Once the RAID rebuild is complete, we mount the partition containing the root of our operating system on `/mnt`. In our example, that partition is `md4`. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt +``` + +We add the label to our swap partition with the command: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sdb4 -L swap-sdb4 +mkswap: /dev/sdb4: warning: wiping old swap signature. 
+Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-sdb4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Next, we mount the following directories to make sure any manipulation we make in the chroot environment works properly: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Next, we access the `chroot` environment: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt +``` + +We retrieve the UUIDs of both swap partitions: + +```sh +root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 +root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 +``` + +Example: + +```sh +blkid /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +blkid /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -The RAID has now been rebuilt, but we still need to mount the partition (`/dev/md4` in this example) with the following command: +Next, we replace the old UUID of the swap partition (**sdb4**) with the new one in `/etc/fstab`: ```sh -mount /dev/md4 /home +root@rescue12-customer-eu:/# nano etc/fstab ``` +Example: + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /home ext4 defaults 0 0 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Make sure you replace the proper UUID. In our example above, the UUID to replace is `d6af33cf-fc15-4060-a43c-cb3b5537f58a` with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the correct UUID. + +Next, we make sure everything is properly mounted: + +```sh +root@rescue12-customer-eu:/# mount -av +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Activate the swap partition the following command: + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/sda4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sda4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sda4 +swapon: /dev/sdb4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sdb4 +``` + +We exit the `chroot` environment with exit and reload the system: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload +``` + +We umount all the disks: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` + +We have now successfully completed the RAID rebuild on the server and we can now reboot it in normal mode. + ## Go Further [Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) @@ -382,4 +699,10 @@ mount /dev/md4 /home [Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) +For specialised services (SEO, development, etc.), contact [OVHcloud partners](/links/partner). + +If you would like assistance using and configuring your OVHcloud solutions, please refer to our [support offers](/links/support). 
+ +If you need training or technical assistance to implement our solutions, contact your sales representative or click on [this link](/links/professional-services) to get a quote and ask our Professional Services experts for assisting you on your specific use case of your project. + Join our [community of users](/links/community). diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md index 811e36b6932..31333738d6b 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md @@ -1,7 +1,7 @@ --- -title: Configuration et reconstruction du RAID logiciel -excerpt: "Découvrez comment vérifier l'état du RAID logiciel de votre serveur et le reconstruire après un remplacement de disque" -updated: 2023-08-21 +title: Gestion et reconstruction du RAID logiciel sur les serveurs en mode legacy boot (BIOS) +excerpt: "Découvrez comment gérer et reconstruire le RAID logiciel après un remplacement de disque sur votre serveur en mode legacy boot (BIOS)" +updated: 2025-12-02 --- ## Objectif @@ -10,23 +10,69 @@ Le RAID (Redundant Array of Independent Disks) est un ensemble de techniques pr Le niveau RAID par défaut pour les installations de serveurs OVHcloud est RAID 1, ce qui double l'espace occupé par vos données, réduisant ainsi de moitié l'espace disque utilisable. -**Ce guide va vous aider à configurer la matrice RAID de votre serveur dans l'éventualité où elle doit être reconstruite en raison d'une corruption ou d'une panne de disque.** +**Ce guide explique comment gérer et reconstruire un RAID logiciel en cas de remplacement d'un disque sur votre serveur en mode legacy boot (BIOS).** +Avant de commencer, veuillez noter que ce guide se concentre sur les serveurs dédiés qui utilisent le mode legacy boot (BIOS). Si votre serveur utilise le mode UEFI (cartes mères plus récentes), reportez-vous à ce guide [Gestion et reconstruction du RAID logiciel sur les serveurs en mode boot UEFI](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). + +Pour vérifier si un serveur s'exécute en mode BIOS ou en mode UEFI, exécutez la commande suivante : + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` + ## Prérequis - Posséder un [serveur dédié](/links/bare-metal/bare-metal) avec une configuration RAID logiciel. - Avoir accès à votre serveur via SSH en tant qu'administrateur (sudo). 
+- Compréhension du RAID et des partitions ## En pratique +### Présentation du contenu + +- [Informations de base](#basicinformation) +- [Simuler une panne de disque](#diskfailure) + - [Retrait du disque défaillant](#diskremove) +- [Reconstruction du RAID](#raidrebuild) + - [Reconstruction du RAID en mode rescue](#rescuemode) + - [Ajout du label à la partition SWAP (le cas échéant)](#swap-partition) + - [Reconstruction du RAID en mode normal](#normalmode) + + + + +### Informations de base + +Dans une session de ligne de commande, tapez le code suivant pour déterminer l'état actuel du RAID ### Retrait du disque La vérification de l’état actuel du RAID s’effectue via la commande suivante : ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 nvme0n1p2[1] nvme0n1p20] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +Cette commande nous indique que deux périphériques RAID logiciels sont actuellement configurés, **md4** étant le plus grand. Le périphérique RAID **md4** se compose de deux partitions, appelées **nvme1n1p4** et **nvme0n1p4**. + +Le [UU] signifie que tous les disques fonctionnent normalement. Un `_` indique un disque défectueux. + +Si vous possédez un serveur avec des disques SATA, vous obtenez les résultats suivants : + +```sh +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda2[1] sdb2[0] @@ -40,12 +86,10 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Cette commande montre deux matrices RAID actuellement configurées, « md4 » étant la plus grande partition. La partition se compose de deux disques, appelés « sda4 » et « sdb4 ». Le [UU] signifie que tous les disques fonctionnent normalement. Un « _ » indiquerait un disque défectueux. - -Bien que cette commande affiche les volumes RAID, elle n'indique pas la taille des partitions elles-mêmes. Vous pouvez obtenir cette information via la commande suivante : +Bien que cette commande renvoie nos volumes RAID, elle ne nous indique pas la taille des partitions elles-mêmes. Nous pouvons retrouver cette information avec la commande suivante : ```sh -fdisk -l +[user@server_ip ~]# sudo fdisk -l Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors Disk model: HGST HUS724020AL @@ -87,73 +131,16 @@ Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes ``` -La commande `fdisk -l` vous permet également d'identifier votre type de partition. C'est une information importante à connaître lorsqu'il s'agit de reconstruire votre RAID en cas de défaillance d'un disque. - -Pour les partitions **GPT**, la commande retournera : `Disklabel type: gpt`. - -```sh -Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors -Disk model: HGST HUS724020AL -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: gpt' -Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F -``` +La commande `fdisk -l` vous permet également d'identifier votre type de partition. Il s’agit d’une information importante pour reconstruire votre RAID en cas de défaillance d’un disque. 
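À titre indicatif, le type de table de partitions peut également être affiché avec `lsblk` via la colonne `PTTYPE` (valeur `gpt` ou `dos`), disponible sur les versions récentes de l'outil ; les paragraphes suivants détaillent la sortie de `fdisk -l`. Exemple minimal, à adapter à vos disques :

```sh
[user@server_ip ~]# lsblk -d -o NAME,SIZE,PTTYPE
```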
-Pour les partitions **MBR**, la commande retournera : `Disklabel type: dos`. +Pour les partitions **GPT**, la ligne 6 affichera : `Disklabel type: gpt`. Ces informations ne sont visibles que lorsque le serveur est en mode normal. -```sh -Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors -Disk model: QEMU HARDDISK -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: dos' -Disk identifier: 0x150f6797 -``` - -Cette commande montre que `/dev/md2` se compose de 888,8 Go et `/dev/md4` contient 973,5 Go. Exécuter la commande « mount » montre la disposition du disque. - -```sh -mount - -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -/dev/md4 on /home type ext3 (rw,relatime) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) -``` +Toujours en se basant sur les résultats de `fdisk -l`, on peut voir que `/dev/md2` se compose de 888.8GB et `/dev/md4` contient 973.5GB. Alternativement, la commande `lsblk` offre une vue différente des partitions : ```sh -lsblk +[user@server_ip ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk @@ -173,91 +160,141 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Les disques sont actuellement montés par défaut. 
Pour retirer un disque du RAID, vous devez dans un premier temps démonter le disque, puis simuler un échec pour enfin le supprimer. -Nous allons supprimer `/dev/sda4` du RAID avec la commande suivante : +Nous prenons en compte les périphériques, les partitions et leurs points de montage. À partir des commandes et des résultats ci-dessus, nous avons : + +- Deux baies RAID : `/dev/md2` et `/dev/md4`. +- Quatre partitions font partie du RAID avec les points de montage : `/` et `/home`. + + + +### Simuler une panne de disque + +Maintenant que nous disposons de toutes les informations nécessaires, nous pouvons simuler une panne de disque et poursuivre les tests. Dans cet exemple, nous allons faire échouer le disque `sda`. + +Le moyen privilégié pour y parvenir est l’environnement en mode rescue d’OVHcloud. + +Redémarrez d'abord le serveur en mode rescue et connectez-vous avec les informations d'identification fournies. + +Pour retirer un disque du RAID, la première étape consiste à le marquer comme **Failed** et à retirer les partitions de leurs matrices RAID respectives. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1] sdb2[0] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0] sdb4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +À partir de la sortie ci-dessus, sda se compose de deux partitions en RAID qui sont **sda2** et **sda4**. + + + +#### Retrait du disque défaillant + +Nous commençons par marquer les partitions **sda2** et **sda4** comme **failed**. ```sh -umount /dev/md4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 +# mdadm: set /dev/sda2 faulty in /dev/md2 ``` -> [!warning] -> Veuillez noter que si vous êtes connecté en tant qu'utilisateur `root`, vous pouvez obtenir le message suivant lorsque vous essayez de démonter la partition (dans notre cas, où notre partition md4 est montée dans /home) : -> ->
umount: /home: target is busy
-> -> Dans ce cas, vous devez vous déconnecter en tant qu'utilisateur root et vous connecter en tant qu'utilisateur local (dans notre cas, `debian`) et utiliser la commande suivante : -> ->
debian@ns000000:/$ sudo umount /dev/md4
-> -> Si vous ne disposez pas d'utilisateur local, [vous devez en créer un](/pages/bare_metal_cloud/dedicated_servers/changing_root_password_linux_ds). +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 --fail /dev/sda4 +# mdadm: set /dev/sda4 faulty in /dev/md4 +``` + +Nous avons maintenant simulé une défaillance du RAID, lorsque nous exécutons la commande `cat /proc/mdstat`, nous obtenons le résultat suivant : + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1](F) sdb2[0] + 931954688 blocks super 1.2 [2/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +Comme nous pouvons le voir ci-dessus, le [F] à côté des partitions indique que le disque est défaillant ou défectueux. -Le résultat obtenu sera le suivant : +Ensuite, nous retirons ces partitions des baies RAID. ```sh -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 +# 
mdadm: hot removed /dev/sda2 from /dev/md2 ``` -L'entrée de `/dev/md4` n'est maintenant plus montée. Cependant, le RAID est toujours actif. Il est donc nécessaire de simuler un échec pour retirer le disque, ce qui peut être effectué grâce à la commande suivante : +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md4 --remove /dev/sda4 +# mdadm: hot removed /dev/sda4 from /dev/md4 +``` + +Pour nous assurer que nous obtenons un disque qui est similaire à un disque vide, nous utilisons la commande suivante. Remplacez **sda** par vos propres valeurs : ```sh -sudo mdadm --fail /dev/md4 /dev/sda4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +sda 8:0 0 1.8T 0 disk +sdb 8:16 0 1.8T 0 disk +├─sdb1 8:17 0 1M 0 part +├─sdb2 8:18 0 888.9G 0 part +│ └─md2 9:2 0 888.8G 0 raid1 / +├─sdb3 8:19 0 512M 0 part [SWAP] +└─sdb4 8:20 0 973.6G 0 part + └─md4 9:4 0 973.5G 0 raid1 /home ``` -Nous avons maintenant simulé un échec du RAID. L'étape suivante consiste à supprimer la partition du RAID avec la commande suivante : +Si nous exécutons la commande suivante, nous voyons que notre disque a été correctement « nettoyé » : ```sh -sudo mdadm --remove /dev/md4 /dev/sda4 +parted /dev/sda +GNU Parted 3.5 +Using /dev/sda +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/sda: unrecognised disk label +Model: HGST HUS724020AL (SATA) +Disk /dev/sda: 1.8T +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: ``` -Vous pouvez vérifier que la partition a été supprimée avec la commande suivante : +L'état de notre RAID devrait maintenant ressembler à ceci : ```sh -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] - 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 4/7 pages [16KB], 65536KB chunk +md2 : active raid1 sdb2[0] + 931954688 blocks super 1.2 [1/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk md4 : active raid1 sdb4[1] - 1020767232 blocks super 1.2 [2/1] [_U] + 1020767232 blocks super 1.2 [1/2] [_U] bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -La commande ci-dessous vérifie que la partition a été supprimée : +Les résultats ci-dessus montrent que seules deux partitions apparaissent désormais dans les matrices RAID. Nous avons réussi à faire échouer le disque **sda** et nous pouvons maintenant procéder au remplacement du disque. + +Pour plus d'informations sur la préparation et la demande de remplacement d'un disque, consultez ce [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement) + +La commande suivante permet d'avoir plus de détails sur la ou les matrices RAID : ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -289,56 +326,200 @@ Consistency Policy : bitmap 1 8 20 1 active sync /dev/sdb4 ``` -### Reconstruction du RAID + + +### Reconstruire le RAID + +> [!warning] +> +> Pour la plupart des serveurs en RAID logiciel, après un remplacement de disque, le serveur est capable de démarrer en mode normal (sur le disque sain) pour reconstruire le RAID. Cependant, si le serveur ne parvient pas à démarrer en mode normal, il sera redémarré en mode rescue pour procéder à la reconstruction du RAID. 
+> + + + +#### Reconstruire le RAID in normal mode + +Les étapes suivantes sont réalisées en mode normal. Dans notre exemple, nous avons remplacé le disque **sda**. + +Une fois le disque remplacé, nous devons copier la table de partition du disque sain (dans cet exemple, sdb) vers le nouveau (sda). + +> [!tabs] +> **Pour les partitions GPT** +>> +>> ```sh +>> sudo sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> La commande doit être au format suivant : `sgdisk -R /dev/nouveau disque /dev/disque sain`. +>> +>> Une fois cette opération effectuée, l'étape suivante consiste à attribuer un GUID aléatoire au nouveau disque afin d'éviter tout conflit avec les GUID d'autres disques : +>> +>> ```sh +>> sudo sgdisk -G /dev/sdX +>> ``` +>> +>> Si le message suivant s'affiche : +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> Vous pouvez simplement exécuter la commande `partprobe`. Si vous ne voyez toujours pas les partitions nouvellement créées (par exemple avec `lsblk`), vous devez redémarrer le serveur avant de continuer. +>> +> **Pour les partitions MBR** +>> +>> ```sh +>> [user@server_ip ~]# sudo sfdisk -d /dev/sdX | sfdisk /dev/sdX +>> ``` +>> +>> La commande doit être au format suivant : `sfdisk -d /dev/disksain | sfdisk /dev/nnouveaudisk`. +>> + +Ensuite, nous ajoutons les partitions au RAID : + +```sh +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/sda2 +# mdadm: added /dev/sda2 + +[user@server_ip ~]# sudo mdadm --add /dev/md4 /dev/sda4 +# mdadm: re-added /dev/sda4 +``` + +Utilisez la commande suivante pour surveiller la reconstruction du RAID : + +```sh +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[0] sdb2[1] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 4/4 pages [16KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` -Une fois le disque remplacé, copiez la table de partition à partir d'un disque sain (« sdb » dans cet exemple) dans la nouvelle (« sda ») avec la commande suivante : +Enfin, nous ajoutons un label et montons la partition [SWAP] (le cas échéant). -**Pour les partitions GPT** +Pour ajouter un libellé à la partition SWAP : ```sh -sgdisk -R /dev/sda /dev/sdb +[user@server_ip ~]# sudo mkswap /dev/sdb4 -L swap-sdb4 ``` -La commande doit être au format suivant : `sgdisk -R /dev/nouveaudisque /dev/disquesain` +Ensuite, récupérez les UUID des deux partitions swap : + +```sh +[user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +[user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +``` -Une fois cette opération effectuée, l’étape suivante consiste à rendre aléatoire le GUID du nouveau disque afin d’éviter tout conflit de GUID avec les autres disques : +Nous remplaçons l'ancien UUID de la partition swap (**sda4**) par le nouveau dans `/etc/fstab` : ```sh -sgdisk -G /dev/sda +[user@server_ip ~]# sudo nano etc/fstab ``` -**Pour les partitions MBR** +Assurez-vous de remplacer le bon UUID. 
+ +Ensuite, rechargez le système avec la commande suivante : -Une fois le disque remplacé, copiez la table de partition à partir d'un disque sain (« sdb » dans cet exemple) dans la nouvelle (« sda ») avec la commande suivante : +```sh +[user@server_ip ~]# sudo systemctl daemon-reload +``` + +Exécutez la commande suivante pour activer la partition swap : ```sh -sfdisk -d /dev/sdb | sfdisk /dev/sda +[user@server_ip ~]# sudo swapon -av ``` -La commande doit être au format suivant : `sfdisk -d /dev/disquesain | sfdisk /dev/nouveaudisque` +La reconstruction du RAID est maintenant terminée. + + + +/// details | **Reconstruction du RAID en mode rescue** + +Une fois le disque remplacé, nous devons copier la table de partition du disque sain (dans cet exemple, sda) vers le nouveau (sdb). + +> [!tabs] +> **Pour les partitions GPT** +>> +>> ```sh +>> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> La commande doit être au format suivant : `sgdisk -R /dev/nouveau disque /dev/disque sain` +>> +>> Exemple : +>> +>> ```sh +>> sudo sgdisk -R /dev/sdb /dev/sda +>> ``` +>> +>> Une fois cette opération effectuée, l'étape suivante consiste à attribuer un GUID aléatoire au nouveau disque afin d'éviter tout conflit avec les GUID d'autres disques : +>> +>> ```sh +>> sudo sgdisk -G /dev/sdb +>> ``` +>> +>> Si le message suivant s'affiche : +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> Vous pouvez simplement exécuter la commande `partprobe`. Si vous ne voyez toujours pas les partitions nouvellement créées (par exemple avec `lsblk`), vous devez redémarrer le serveur avant de continuer. +>> +> **Pour les partitions MBR** +>> +>> ```sh +>> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb +>> ``` +>> +>> La commande doit être au format suivant : `sfdisk -d /dev/disque sain | sfdisk /dev/nouveau disque` +>> + +Nous pouvons maintenant reconstruire la matrice RAID. L'extrait de code suivant montre comment ajouter les nouvelles partitions (sdb2 et sdb4) dans la matrice RAID. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 +# mdadm: added /dev/sdb2 + +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 +# mdadm: re-added /dev/sdb4 +``` -Il est maintenant possible de reconstruire la matrice RAID. L'extrait de code ci-dessous montre comment reconstruire la disposition de la partition `/dev/md4` avec la table de partition « sda » copiée récemment : +Utilisez la commande `cat /proc/mdstat` pour surveiller la reconstruction du RAID : ```sh -mdadm --add /dev/md4 /dev/sda4 -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 1/7 pages [4KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk -md4 : active raid1 sda4[0] sdb4[1] - 1020767232 blocks super 1.2 [2/2] [UU] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] 
recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -Vérifiez les détails du RAID avec la commande suivante : +Pour plus de détails sur la ou les baies RAID : ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -369,12 +550,118 @@ mdadm --detail /dev/md4 1 8 18 1 active sync /dev/sdb4 ``` -Le RAID a maintenant été reconstruit. Montez la partition (`/dev/md4` dans cet exemple) avec cette commande : + + +#### Ajout du label à la partition SWAP (le cas échéant) + +Une fois la reconstruction du RAID terminée, nous montons la partition contenant la racine de notre système d'exploitation sur `/mnt`. Dans notre exemple, cette partition est `md4`. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt +``` + +Nous ajoutons le label à notre partition swap avec la commande : + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sda4 -L swap-sda4 +mkswap: /dev/sda4: warning: wiping old swap signature. +Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Ensuite, nous montons les répertoires suivants pour nous assurer que toute manipulation que nous faisons dans l'environnement chroot fonctionne correctement : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Ensuite, nous accédons à l'environnement `chroot` : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt +``` + +Nous récupérons les UUID des deux partitions swap : + +```sh +root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 +root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 +``` + +Exemple: + +```sh +blkid /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +``` + +```sh +blkid /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +Ensuite, nous remplaçons l'ancien UUID de la partition swap (**sdb4**) par le nouveau dans `/etc/fstab` : + +```sh +root@rescue12-customer-eu:/# nano etc/fstab +``` + +Exemple: + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /home ext4 defaults 0 0 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Assurez-vous de remplacer l'UUID approprié. Dans notre exemple ci-dessus, l'UUID à remplacer est `d6af33cf-fc15-4060-a43c-cb3b5537f58a` par le nouveau `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Assurez-vous de remplacer le bon UUID. 
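Pour contrôler rapidement la modification avant de poursuivre, vous pouvez par exemple n'afficher que les lignes swap du fichier (sortie indicative, avec les valeurs de l'exemple ci-dessus) :

```sh
root@rescue12-customer-eu:/# grep swap /etc/fstab
UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0
UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd15 swap swap defaults 0 0
```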
+ +Ensuite, nous nous assurons que tout est correctement monté : + +```sh +root@rescue12-customer-eu:/# mount -av +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Rechargez le système avec la commande suivante : + +```sh +root@rescue12-customer-eu:/# systemctl daemon-reload +``` + +Activez la partition swap avec la commande suivante : + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/sda4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sda4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sda4 +swapon: /dev/sdb4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sdb4 +``` + +Quittez l'environnement Chroot avec `exit` et démontez tous les disques : ```sh -mount /dev/md4 /home +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt ``` +Nous avons maintenant terminé avec succès la reconstruction du RAID sur le serveur et nous pouvons maintenant le redémarrer en mode normal. + + ## Aller plus loin [Remplacement à chaud - RAID logiciel](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) @@ -385,4 +672,10 @@ mount /dev/md4 /home [Remplacement à chaud - RAID Matériel](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) -Échangez avec notre [communauté d'utilisateurs](/links/community). +Pour des prestations spécialisées (référencement, développement, etc), contactez les [partenaires OVHcloud](/links/partner). + +Si vous souhaitez bénéficier d'une assistance à l'usage et à la configuration de vos solutions OVHcloud, nous vous proposons de consulter nos différentes [offres de support](/links/support). + +Si vous avez besoin d'une formation ou d'une assistance technique pour la mise en oeuvre de nos solutions, contactez votre commercial ou cliquez sur [ce lien](/links/professional-services) pour obtenir un devis et demander une analyse personnalisée de votre projet à nos experts de l’équipe Professional Services. + +Échangez avec notre [communauté d'utilisateurs](/links/community). \ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md new file mode 100644 index 00000000000..f91dbd5e6bf --- /dev/null +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md @@ -0,0 +1,1064 @@ +--- +title: Managing and rebuilding software RAID on servers using UEFI boot mode +excerpt: Find out how to manage and rebuild software RAID after a disk replacement on a server using UEFI boot mode +updated: 2025-12-02 +--- + +## Objective + +Redundant Array of Independent Disks (RAID) is a technology that mitigates data loss on a server by replicating data across two or more disks. + +The default RAID level for OVHcloud server installations is RAID 1, which doubles the space taken up by your data, effectively halving the useable disk space. + +**This guide explains how to manage and rebuild software RAID after a disk replacement on a server using UEFI boot mode** + +Before we begin, please note that this guide focuses on Dedicated servers that use UEFI as the boot mode. This is the case with modern motherboards. If your server uses the legacy boot (BIOS) mode, refer to this guide: [Managing and rebuilding software RAID on servers in legacy boot (BIOS) mode](/pages/bare_metal_cloud/dedicated_servers/raid_soft_bios). 
+ +To check whether a server runs on legacy BIOS mode or UEFI boot mode, run the following command: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` + +For more information on UEFI, consult the following [article](https://uefi.org/about). + +## Requirements + +- A [dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration +- Administrative (sudo) access to the server via SSH +- Understanding of RAID, partitions and GRUB + +Throughout this guide, we use the terms **primary disk** and **secondary disk**. In this context: + +- The primary disk is the disk whose ESP (EFI System Partition) is mounted by Linux +- The secondary disk(s) are all the other disks in the RAID + +## Instructions + +When you purchase a new server, you may feel the need to perform a series of tests and actions. One such test could be to simulate a disk failure in order to understand the RAID rebuild process and prepare yourself in case it ever happens. + +### Content overview + +- [Basic Information](#basicinformation) +- [Understanding the EFI System Partition (ESP)](#efisystemparition) +- [Simulating a disk failure](#diskfailure) + - [Removing the failed disk](#diskremove) +- [Rebuilding the RAID](#raidrebuild) + - [Rebuilding the RAID after the main disk is replaced (rescue mode)](#rescuemode) + - [Recreating the EFI System Partition](#recreateesp) + - [Rebuilding RAID when EFI partitions are not synchronized after major system updates (e.g GRUB)](efiraodgrub) + - [Adding the label to the SWAP partition (if applicable)](#swap-partition) + - [Rebuilding the RAID in normal mode](#normalmode) + + + +### Basic Information + +In a command line session, type the following code to determine the current RAID status: + +```sh +[user@server_ip ~]# cat /proc/mdstat +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md3 : active raid1 nvme1n1p3[1] nvme0n1p3[0] + 497875968 blocks super 1.2 [2/2] [UU] + bitmap: 2/4 pages [8KB], 65536KB chunk + +md2 : active raid1 nvme1n1p2[1] nvme0n1p2[0] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +This command shows us that we currently have two software RAID devices configured, **md2** and **md3**, with **md3** being the larger of the two. **md3** consists of two partitions, called **nvme1n1p3** and **nvme0n1p3**. + +The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. + +If you have a server with SATA disks, you would get the following results: + +```sh +[user@server_ip ~]# cat /proc/mdstat +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md3 : active raid1 sda3[0] sdb3[1] + 3904786432 blocks super 1.2 [2/2] [UU] + bitmap: 2/30 pages [8KB], 65536KB chunk + +md2 : active raid1 sda2[0] sdb2[1] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +Although this command returns our RAID volumes, it doesn't tell us the size of the partitions themselves. 
We can find this information with the following command: + +```sh +[user@server_ip ~]# sudo fdisk -l + +Disk /dev/nvme1n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors +Disk model: WDC CL SN720 SDAQNTW-512G-2000 +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +Disklabel type: gpt +Disk identifier: A11EDAA3-A984-424B-A6FE-386550A92435 + +Device Start End Sectors Size Type +/dev/nvme1n1p1 2048 1048575 1046528 511M EFI System +/dev/nvme1n1p2 1048576 3145727 2097152 1G Linux RAID +/dev/nvme1n1p3 3145728 999161855 996016128 474.9G Linux RAID +/dev/nvme1n1p4 999161856 1000210431 1048576 512M Linux files + + +Disk /dev/nvme0n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors +Disk model: WDC CL SN720 SDAQNTW-512G-2000 +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +Disklabel type: gpt +Disk identifier: F03AC3C3-D7B7-43F9-88DB-9F12D7281D94 + +Device Start End Sectors Size Type +/dev/nvme0n1p1 2048 1048575 1046528 511M EFI System +/dev/nvme0n1p2 1048576 3145727 2097152 1G Linux RAID +/dev/nvme0n1p3 3145728 999161855 996016128 474.9G Linux RAID +/dev/nvme0n1p4 999161856 1000210431 1048576 512M Linux file +/dev/nvme0n1p5 1000211120 1000215182 4063 2M Linux file + + +Disk /dev/md2: 1022 MiB, 1071644672 bytes, 2093056 sectors +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes + + +Disk /dev/md3: 474.81 GiB, 509824991232 bytes, 995751936 sectors +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +``` + +The `fdisk -l` command also allows you to identify your partition type. This is an important information when it comes to rebuilding your RAID in case of a disk failure. + +For **GPT** partitions, line 6 will display: `Disklabel type: gpt`. + +Still going by the results of `fdisk -l`, we can see that `/dev/md2` consists of 1022 MiB and `/dev/md3` contains 474.81 GiB. If we were to run the `mount` command we can also find out the layout of the disk. 
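+
+For example, filtering the output of `mount` on the `md` devices shows which array is mounted where. The output below is illustrative and based on the example layout above; the exact mount options depend on your system:
+
+```sh
+[user@server_ip ~]# mount | grep /dev/md
+/dev/md3 on / type ext4 (rw,relatime)
+/dev/md2 on /boot type ext4 (rw,relatime)
+```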
+ +Alternatively, the `lsblk` command offers a different view of the partitions: + +```sh +[user@server_ip ~]# lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:7 0 511M 0 part +├─nvme1n1p2 259:8 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 /boot +├─nvme1n1p3 259:9 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 / +└─nvme1n1p4 259:10 0 512M 0 part [SWAP] +nvme0n1 259:1 0 476.9G 0 disk +├─nvme0n1p1 259:2 0 511M 0 part /boot/efi +├─nvme0n1p2 259:3 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 /boot +├─nvme0n1p3 259:4 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 / +├─nvme0n1p4 259:5 0 512M 0 part [SWAP] +└─nvme0n1p5 259:6 0 2M 0 part +``` + +Furthermore, if we run `lsblk -f`, we obtain more information about these partitions, such as the LABEL and UUID: + +```sh +[user@server_ip ~]# sudo lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA +├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme1n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +└─nvme1n1p4 swap 1 swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7 [SWAP] +nvme0n1 +├─nvme0n1p1 vfat FAT16 EFI_SYSPART B486-9781 504.9M 1% /boot/efi +├─nvme0n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme0n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +├─nvme0n1p4 swap 1 swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b [SWAP] +└─nvme0n1p5 iso9660 Joliet Extension config-2 2025-08-05-14-55-41-00 +``` + +Take note of the devices, partitions, and their mount points; this is important, especially after replacing a disk. + +From the above commands and results, we have: + +- Two RAID arrays: `/dev/md2` and `/dev/md3`. +- Four partitions are part of the RAID: **nvme0n1p2**, **nvme0n1p3**, **nvme1n1p2**, **nvme0n1p3** with the mount points `/boot` and `/`. +- Two partitions not part of the RAID, with mount points: `/boot/efi` and [SWAP]. +- One partition does not have a mount point: **nvme1n1p1** + +The `nvme0n1p5` partition is a configuration partition, i.e. a read-only volume connected to the server that provides it with the initial configuration data. + + + +### Understanding the EFI System Partition (ESP) + +***What is an EFI System Partition?*** + +An EFI System Partition is a partition which can contain the boot loaders, boot managers, or kernel images of an installed operating system. It may also contain system utility programs designed to be run before the operating system boots, as well as data files such as error logs. + +***Is the EFI System Partition mirrored in RAID?*** + +No, as of August 2025, when the OS installation is performed by OVHcloud, the ESP is not included in the RAID. When you use our OS templates to install your server with software RAID, several EFI System Partitions are created: one per disk. However, only one EFI partition is mounted at once. All ESPs created at the time of installation contain the same files. + +The EFI System Partition is mounted at `/boot/efi` and the disk on which it is mounted is selected by Linux at boot. 
+ +Example: + +```sh +[user@server_ip ~]# sudo lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA +├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme1n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +└─nvme1n1p4 swap 1 swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7 [SWAP] +nvme0n1 +├─nvme0n1p1 vfat FAT16 EFI_SYSPART B486-9781 504.9M 1% /boot/efi +├─nvme0n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme0n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +├─nvme0n1p4 swap 1 swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b [SWAP] +└─nvme0n1p5 iso9660 Joliet Extension config-2 2025-08-05-14-55-41-00 +``` + +From the output above, we see that we have two identical EFI System Partitions (nvme0n1p1 and nvme1n1p1) but only **nvme0n1p1** is mounted on `/boot/efi`. Both partitions have the LABEL: `EFI_SYSPART` (this naming is specific to OVHcloud). + +***Does the content of the EFI System Partition change regularly?*** + +In general, the contents of this partition do not change much, its content should only change on bootloader (e.g. GRUB) updates. + +However, we recommend running an automatic or manual script to synchronise all ESPs, so that they all contain the same up-to-date files. This way, if the drive on which this partition is mounted fails, the server will be able to restart on the ESP of one of the other drives. + +***What if the primary disk mounted on `boot/efi` fails?*** + +> [!primary] +> Please note that we explore the most common cases below, but there are several other reasons why a server may not start in normal mode after a disk replacement. +> + +**Case study 1** - There have been no changes or major system updates (e.g GRUB) to the OS + +- The server is able to boot in normal mode and you can proceed with the RAID rebuild. +- The server is unable to boot in normal mode, the server is rebooted into rescue mode, where you can rebuild the RAID and recreate the EFI partition on the new disk. + +**Case study 2** - There have been major system updates (e.g GRUB) to the OS and the ESPs have been synchronised + +- The server is able to boot in normal mode because all the ESPs contain up-to-date information and the RAID rebuild can be carried out in normal mode. +- The server is unable to boot in normal mode, the server is rebooted into rescue mode, where you can rebuild the RAID and recreate the EFI partition on the new disk. + +**Case study 3** - There have been major system updates (e.g GRUB) to the OS and the ESPs partitions have not been synchronised + +- The server is unable to boot in normal mode, the server is rebooted in rescue mode, where you can rebuild the RAID, recreate the EFI System partition on the new disk and reinstall the bootloader (e.g. GRUB) on it. +- The server is able to boot in normal mode (this can occur when an operating system is updated to a newer version, but the version of GRUB remains unchanged) and you can proceed with the RAID rebuild. + +Indeed, in some cases, booting from an out-of-date ESP may not work. 
For instance, a major GRUB update could cause the old GRUB binary present in the ESP to be incompatible with newer GRUB modules installed in the `/boot` partition. + +***How can I synchronise my EFI System Partitions, and how often should I synchronise them?*** + +> [!primary] +> Please note that depending on your operating system, the process might be different. Ubuntu for example is able to keep several EFI System Partitions synchronized at every GRUB update. However, it is the only operating system doing so. We recommend that you consult the official documentation of your operating system to understand how to manage ESPs. +> +> In this guide, the operating system used is Debian. + +We recommend that you synchronise your ESPs regularly or after each major system update. By default, all the EFI System partitions contain the same files after installation. However, if a major system update is involved, synchronising the ESPs is essential to keep the content up-to-date. + + + +#### Script + +Below is a script that you can use to manually synchronise them. You can also run an automated script to synchronise the partitions daily or whenever the service boots up. + +Before you execute the script, make sure `rsync` is installed on your system: + +**Debian/Ubuntu** + +```sh +sudo apt install rsync +``` + +**CentOS, Red Hat and Fedora** + +```sh +sudo yum install rsync +``` + +To execute a script in linux, you need an executable file: + +- Start by creating a .sh file in the directory of your choice, replacing `script-name` with the name of your choice + +```sh +sudo touch script-name.sh +``` + +- Open the file with a text editor and include the following lines + +```sh +sudo nano script-name.sh +``` + +```sh +#!/bin/bash + +set -euo pipefail + +MOUNTPOINT="/var/lib/grub/esp" +MAIN_PARTITION=$(findmnt -n -o SOURCE /boot/efi) + +echo "${MAIN_PARTITION} is the main partition" + +mkdir -p "${MOUNTPOINT}" + +while read -r partition; do + if [[ "${partition}" == "${MAIN_PARTITION}" ]]; then + continue + fi + echo "Working on ${partition}" + mount "${partition}" "${MOUNTPOINT}" + rsync -ax "/boot/efi/" "${MOUNTPOINT}/" + umount "${MOUNTPOINT}" +done < <(blkid -o device -t LABEL=EFI_SYSPART) +``` + +Save and exit the file. + +- Make the script executable + +```sh +sudo chmod +x script-name.sh +``` + +- Run the script + +```sh +sudo ./script-name.sh +``` + +- If you are not in the folder + +```sh +./path/to/folder/script-name.sh +``` + +When the script is executed, the contents of the mounted EFI partition will be synchronised with the others. To access the contents, you can mount any of these unmounted EFI partitions on the mount point: `/var/lib/grub/esp`. + + + +### Simulating a disk failure + +Now that we have all the necessary information, we can simulate a disk failure and proceed with the tests. In this first example, we will fail the primary disk `nvme0n1`. + +The preferred way to do this is via the OVHcloud rescue mode environment. + +First reboot the server in rescue mode and log in with the provided credentials. + +To remove a disk from the RAID, the first step is to mark it as **Failed** and remove the partitions from their respective RAID arrays. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[0] nvme1n1p3[1] + 497875968 blocks super 1.2 [2/2] [UU] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +From the above output, nvme0n1 consists of two partitions in RAID which are **nvme0n1p2** and **nvme0n1p3**. + + + +#### Removing the failed disk + +First we mark the partitions **nvme0n1p2** and **nvme0n1p3** as failed. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/nvme0n1p2 +# mdadm: set /dev/nvme0n1p2 faulty in /dev/md2 +``` + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --fail /dev/nvme0n1p3 +# mdadm: set /dev/nvme0n1p3 faulty in /dev/md3 +``` + +When we run the `cat /proc/mdstat` command, we have the following output: + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2](F) nvme1n1p2[1] + 1046528 blocks super 1.2 [2/1] [_U] + +unused devices: +``` + +As we can see above, the [F] next to the partitions indicates that the disk has failed or is faulty. + +Next, we remove these partitions from the RAID arrays to completely remove the disk from RAID. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --remove /dev/nvme0n1p2 +# mdadm: hot removed /dev/nvme0n1p2 from /dev/md2 +``` + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --remove /dev/nvme0n1p3 +# mdadm: hot removed /dev/nvme0n1p3 from /dev/md3 +``` + +Our RAID status should now look like this: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme1n1p2[1] + 1046528 blocks super 1.2 [2/1] [_U] + +unused devices: +``` + +From the results above, we can see that only two partitions now appear in the RAID arrays. We have successfully failed the disk **nvme0n1**. 
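+
+As a side note, `mdadm` can mark a partition as faulty and remove it from the array in a single call. The sketch below combines the two operations performed above, shown here for `md2` only:
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/nvme0n1p2 --remove /dev/nvme0n1p2
+```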
+ +To make sure that we get a disk that is similar to an empty disk, we use the following command on each partition, then on the disk: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +shred -s10M -n1 /dev/nvme0n1p1 +shred -s10M -n1 /dev/nvme0n1p2 +shred -s10M -n1 /dev/nvme0n1p3 +shred -s10M -n1 /dev/nvme0n1p4 +shred -s10M -n1 /dev/nvme0n1p5 +shred -s10M -n1 /dev/nvme0n1 +``` + +The disk now appears as a new, empty drive: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk + +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:1 0 511M 0 part +├─nvme1n1p2 259:2 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 +├─nvme1n1p3 259:3 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 +└─nvme1n1p4 259:4 0 512M 0 part +nvme0n1 259:5 0 476.9G 0 disk +``` + +If we run the following command, we see that our disk has been successfully "wiped": + +```sh +parted /dev/nvme0n1 +GNU Parted 3.5 +Using /dev/nvme0n1 +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/nvme0n1: unrecognised disk label +Model: WDC CL SN720 SDAQNTW-512G-2000 (nvme) +Disk /dev/nvme0n1: 512GB +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: +``` + +For more information on how to prepare and request for a disk replacement, consult this [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement). + +If you run the following command, you can have more details on the RAID arrays: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md3 + +/dev/md3: + Version : 1.2 + Creation Time : Fri Aug 1 14:51:13 2025 + Raid Level : raid1 + Array Size : 497875968 (474.81 GiB 509.82 GB) + Used Dev Size : 497875968 (474.81 GiB 509.82 GB) + Raid Devices : 2 + Total Devices : 1 + Persistence : Superblock is persistent + + Intent Bitmap : Internal + + Update Time : Fri Aug 1 15:56:17 2025 + State : clean, degraded + Active Devices : 1 + Working Devices : 1 + Failed Devices : 0 + Spare Devices : 0 + +Consistency Policy : bitmap + + Name : md3 + UUID : b383c3d5:7fb1bb5e:6b7c4d96:6ea817ff + Events : 215 + + Number Major Minor RaidDevice State + - 0 0 0 removed + 1 259 4 1 active sync /dev/nvme1n1p3 +``` + +We can now proceed with the disk replacement. + + + +### Rebuilding the RAID + +> [!primary] +> This process might be different depending on the operating system you have installed on your server. We recommend that you consult the official documentation of your operating system to have access to the proper commands. +> + +> [!warning] +> +> For most servers in software RAID, after a disk replacement, the server is able to boot in normal mode (on the healthy disk) and the rebuild can be done in normal mode. However, if the server is not able to boot in normal mode after a disk replacement, it will be rebooted in rescue mode to proceed with the RAID rebuild. +> +> If your server is able to boot in normal mode after the disk replacement, simply proceed with the steps from [this section](#rebuilding-the-raid-in-normal-mode). + + + +#### Rebuilding the RAID in rescue mode + +Once the disk has been replaced, the next step is to copy the partition table from the healthy disk (in this example, nvme1n1) to the new one (nvme0n1). 
+ +**For GPT partitions** + +The command should be in this format: `sgdisk -R /dev/new disk /dev/healthy disk` + +In our example: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/nvme0n1 /dev/nvme1n1 +``` + +Run `lsblk` to make sure the partition tables have been properly copied: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk + +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:1 0 511M 0 part +├─nvme1n1p2 259:2 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 +├─nvme1n1p3 259:3 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 +└─nvme1n1p4 259:4 0 512M 0 part +nvme0n1 259:5 0 476.9G 0 disk +├─nvme0n1p1 259:10 0 511M 0 part +├─nvme0n1p2 259:11 0 1G 0 part +├─nvme0n1p3 259:12 0 474.9G 0 part +└─nvme0n1p4 259:13 0 512M 0 part +``` + +Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -G /dev/nvme0n1 +``` + +If you receive the message below: + +```console +Warning: The kernel is still using the old partition table. +The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8) +The operation has completed successfully. +``` + +Simply run the `partprobe` command. + +We can now rebuild the RAID array. The following code snippet shows how to add the new partitions (nvme0n1p2 and nvme0n1p3) back in the RAID array. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md2 /dev/nvme0n1p2 +# mdadm: added /dev/nvme0n1p2 + +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md3 /dev/nvme0n1p3 +# mdadm: re-added /dev/nvme0n1p3 +``` + +To check the rebuild process: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[2] nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + [>....................] recovery = 0.1% (801920/497875968) finish=41.3min speed=200480K/sec + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] + 1046528 blocks super 1.2 [2/2] [UU] +``` + +Once the RAID rebuild is complete, run the following command to make sure that the partitions have been properly added to the RAID: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART 4629-D183 +├─nvme1n1p2 linux_raid_member 1.2 md2 83719c5c-2a27-2a56-5268-7d49d8a1d84f +│ └─md2 ext4 1.0 boot 4de80ae0-dd90-4256-9135-1735e7be4b4d +├─nvme1n1p3 linux_raid_member 1.2 md3 b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff +│ └─md3 ext4 1.0 root 9bf386b6-9523-46bf-b8e5-4b8cc7c5786f +└─nvme1n1p4 swap 1 swap-nvme1n1p4 9bf292e8-0145-4d2f-b891-4cef93c0d209 +nvme0n1 +├─nvme0n1p1 +├─nvme0n1p2 linux_raid_member 1.2 md2 83719c5c-2a27-2a56-5268-7d49d8a1d84f +│ └─md2 ext4 1.0 boot 4de80ae0-dd90-4256-9135-1735e7be4b4d +├─nvme0n1p3 linux_raid_member 1.2 md3 b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff +│ └─md3 ext4 1.0 root 9bf386b6-9523-46bf-b8e5-4b8cc7c5786f +└─nvme0n1p4 +``` + +Based on the above results, the partitions on the new disk have been correctly added to the RAID. However, the EFI System Partition and the SWAP partition (in some cases) have not been duplicated, which is normal as they are not included in the RAID. 
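+
+If you prefer to wait until the resynchronisation of both arrays has finished before recreating those partitions, `mdadm --wait` blocks until recovery completes. This step is optional and assumes the array names used in this example:
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --wait /dev/md2 /dev/md3
+```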
+ +> [!warning] +> The examples above are merely illustrating the necessary steps based on a default server configuration. The information in the output table depends on your server's hardware and its partition scheme. When in doubt, consult the documentation of your operating system. +> +> If you require professional assistance with server administration, consider the details in the [Go further](#go-further) section of this guide. +> + + + +#### Recreating the EFI System Partition + +To recreate the EFI system partition, we need to format **nvme0n1p1** and then replicate the contents of the healthy partition (in our example: nvme1n1p1) onto it. + +Here, we assume that both partitions have been synchronised and contain up-to-date files. + +> [!warning] +> If there was a major system update such as kernel or grub and both partitions were not synchronised, consult this [section](#rebuilding-raid-when-efi-partitions-are-not-synchronized-after-major-system-updates-eg-grub) once you are done creating the new EFI System Partition. +> + +First, we format the partition: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkfs.vfat /dev/nvme0n1p1 +``` + +Next, we label the partition as `EFI_SYSPART` (this naming is specific to OVHcloud): + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # fatlabel /dev/nvme0n1p1 EFI_SYSPART +``` + +Next, we duplicate the contents of nvme1n1p1 to nvme0n1p1. We start by creating two folders, We start by creating two folders, which we name "old" and "new" in our example: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkdir old new +``` + +Next, we mount **nvme1n1p1** in the 'old' folder and **nvme0n1p1** in the 'new' folder to make the distinction: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme1n1p1 old +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme0n1p1 new +``` + +Next, we copy the files from the 'old' folder to 'new' one: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # rsync -axv old/ new/ +sending incremental file list +EFI/ +EFI/debian/ +EFI/debian/BOOTX64.CSV +EFI/debian/fbx64.efi +EFI/debian/grub.cfg +EFI/debian/grubx64.efi +EFI/debian/mmx64.efi +EFI/debian/shimx64.efi + +sent 6,099,848 bytes received 165 bytes 12,200,026.00 bytes/sec +total size is 6,097,843 speedup is 1.00 +``` + +Once this is done, we unmount both partitions: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme0n1p1 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme1n1p1 +``` + +Next, we mount the partition containing the root of our operating system on `/mnt`. In our example, that partition is **md3**. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md3 /mnt +``` + +We mount the following directories to make sure any manipulation we make in the `chroot` environment works properly: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Next, we use the `chroot` command to access the mount point and make sure the new EFI System Partition has been properly created and the system recongnises both ESPs: + +```sh +root@rescue12-customer-eu:/# chroot /mnt +``` + +To view the ESP partitions, we run the command `blkid -t LABEL=EFI_SYSPART`: + +```sh +root@rescue12-customer-eu:/# blkid -t LABEL=EFI_SYSPART +/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" +/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" +``` + +The above results show that the new EFI partition has been created correctly and that the LABEL has been applied correctly. + + + +#### Rebuilding RAID when EFI partitions are not synchronized after major system updates (GRUB) + +/// details | Unfold this section + +> [!warning] +> Please only follow the steps in this section if it applies to your case. +> + +When EFI system partitions are not synchronised after major system updates that modify/affect the GRUB, and the primary disk on which the partition is mounted is replaced, booting from a secondary disk containing an out-of-date ESP may not work. + +In this case, in addition to rebuilding the RAID and recreating the EFI system partition in rescue mode, you must also reinstall GRUB on it. + +So once we have recreated the EFI partition and made sure the system recognises both partitions (previous steps in `chroot`), we create the `/boot/efi` folder in order to mount the new EFI System Partition **nvme0n1p1**: + +```sh +root@rescue12-customer-eu:/# mount /boot +root@rescue12-customer-eu:/# mount /dev/nvme0n1p1 /boot/efi +``` + +Next, we reinstall the GRUB bootloader: + +```sh +root@rescue12-customer-eu:/# grub-install --efi-directory=/boot/efi /dev/nvme0n1p1 +``` + +Once done, run the following command: + +```sh +root@rescue12-customer-eu:/# update-grub +``` +/// + + + +#### Adding the label to the SWAP partition (if applicable) + +Once we are done with the EFI partition, we move to the SWAP partition. 
+ +We exit the `chroot` environment with `exit` in order to recreate our [SWAP] partition **nvme0n1p4** and add the label `swap-nvmenxxx`: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 +Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +We verify that the label has been properly applied: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS +nvme1n1 + +├─nvme1n1p1 +│ vfat FAT16 EFI_SYSPART +│ BA77-E844 504.9M 1% /root/old +├─nvme1n1p2 +│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 +│ └─md2 +│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac +├─nvme1n1p3 +│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c +│ └─md3 +│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt +└─nvme1n1p4 + swap 1 swap-nvme1n1p4 + d6af33cf-fc15-4060-a43c-cb3b5537f58a +nvme0n1 + +├─nvme0n1p1 +│ vfat FAT16 EFI_SYSPART +│ 477D-6658 +├─nvme0n1p2 +│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 +│ └─md2 +│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac +├─nvme0n1p3 +│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c +│ └─md3 +│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt +└─nvme0n1p4 + swap 1 swap-nvme0n1p4 + b3c9e03a-52f5-4683-81b6-cc10091fcd15 +``` + +We then access the `chroot` environment again: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt +``` + +We retrieve the UUID of both swap partitions: + +```sh +root@rescue12-customer-eu:/# blkid -s UUID blkid /dev/nvme0n1p4 +/dev/nvme0n1p4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" + +root@rescue12-customer-eu:/# blkid -s UUID blkid /dev/nvme1n1p4 +/dev/nvme1n1p4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +Next, we replace the old UUID of the swap partition (**nvme0n1p4**) with the new one in the `/etc/fstab` file: + +```sh +root@rescue12-customer-eu:/# nano /etc/fstab +``` + +Example: + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Based on the above results, the old UUID is `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` and should be replaced with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the correct UUID. 
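+
+If you prefer not to edit the file by hand, a `sed` substitution performs the same replacement. The UUIDs below are the ones from this example; adjust them to your own values before running the command:
+
+```sh
+root@rescue12-customer-eu:/# sed -i 's/b7b5dd38-9b51-4282-8f2d-26c65e8d58ec/b3c9e03a-52f5-4683-81b6-cc10091fcd15/' /etc/fstab
+```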
+ +Next, we verify that everything is properly mounted with the following command: + +```sh +root@rescue12-customer-eu:/# mount -av +/ : ignored +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +We activate the swap partition: + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/nvme0n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme0n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme0n1p4 +swapon: /dev/nvme1n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme1n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme1n1p4 +``` + +We exit the chroot environment with `exit` and reload the system: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload +``` + +We umount all the disks: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -Rl /mnt +``` + +We have now successfully completed the RAID rebuild on the server and we can now reboot it in normal mode. + + + +#### Rebuilding the RAID in normal mode + +/// details | Unfold this section + +If your server is able to boot in normal mode after a disk replacement, you can proceed with the following steps to rebuild the RAID. + +Once the disk has been replaced, we copy the partition table from the healthy disk (in this example, nvme1n1) to the new one (nvme0n1). + +**For GPT partitions** + +```sh +sgdisk -R /dev/nvme0n1 /dev/nvme1n1 +``` + +The command should be in this format: `sgdisk -R /dev/new disk /dev/healthy disk`. + +Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: + +```sh +sgdisk -G /dev/nvme0n1 +``` + +If you receive the following message: + +```console +Warning: The kernel is still using the old partition table. +The new table will be used at the next reboot or after you +run partprobe(8) or kpartx(8) +The operation has completed successfully. +``` + +Simply run the `partprobe` command. If you still cannot see the newly-created partitions (e.g. with `lsblk`), you need to reboot the server before continuing. + +Next, we add the partitions to the RAID: + +```sh +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/nvme0n1p2 + +# mdadm: added /dev/nvme0n1p2 + +[user@server_ip ~]# sudo mdadm --add /dev/md3 /dev/nvme0n1p3 + +# mdadm: re-added /dev/nvme0n1p3 +``` + +Use the following command to follow the RAID rebuild: `cat /proc/mdstat`. + +**Recreating the EFI System Partition on the disk** + +First, we install the necessary tools: + +**Debian and Ubuntu** + +```sh +[user@server_ip ~]# sudo apt install dosfstools +``` + +**CentOS** + +```sh +[user@server_ip ~]# sudo yum install dosfstools +``` + +Next, we format the partition. In our example `nvme0n1p1`: + +```sh +[user@server_ip ~]# sudo mkfs.vfat /dev/nvme0n1p1 +``` + +Next, we label the partition as `EFI_SYSPART` (this naming is specific to OVHcloud) + +```sh +[user@server_ip ~]# sudo fatlabel /dev/nvme0n1p1 EFI_SYSPART +``` + +Once done, you can synchronize both partitions using the script we provided [here](#script). 
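+
+If you prefer a one-off manual copy instead of the script, the sketch below mounts the freshly formatted partition on a temporary mount point and mirrors the contents of the currently mounted ESP onto it. The mount point is arbitrary and the device name follows the example above; adapt both to your layout:
+
+```sh
+[user@server_ip ~]# sudo mkdir -p /mnt/esp-new
+[user@server_ip ~]# sudo mount /dev/nvme0n1p1 /mnt/esp-new
+[user@server_ip ~]# sudo rsync -ax /boot/efi/ /mnt/esp-new/
+[user@server_ip ~]# sudo umount /mnt/esp-new
+```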
+ +We verify that the new EFI System Partition has been properly created and the system recongnises it: + +```sh +[user@server_ip ~]# sudo blkid -t LABEL=EFI_SYSPART +/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" +/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" +``` + +Lastly, we activate the [SWAP] partition (if applicable): + + +- We create and add the label: + +```sh +[user@server_ip ~]# sudo mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 +``` + +- We retrieve the UUIDs of both swap partitions: + +```sh +[user@server_ip ~]# sudo blkid -s /dev/nvme0n1p4 +/dev/nvme0n1p4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +[user@server_ip ~]# sudo blkid -s /dev/nvme1n1p4 +/dev/nvme1n1p4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +- We replace the old UUID of the swap partition (**nvme0n1p4)** with the new one in `/etc/fstab`: + +```sh +[user@server_ip ~]# sudo nano /etc/fstab +``` + +Example: + +```sh +[user@server_ip ~]# sudo nano /etc/fstab +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Based on the above results, the old UUID is `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` and should be replaced with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. + +Make sure you replace the correct UUID. + +Next, we run the following command to activate the swap partition: + +```sh +[user@server_ip ~]# sudo swapon -av +swapon: /dev/nvme0n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme0n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme0n1p4 +swapon: /dev/nvme1n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme1n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme1n1p4 +``` + +Next, we reload the system: + +```sh +[user@server_ip ~]# sudo systemctl daemon-reload +``` + +We have now successfully completed the RAID rebuild. + +## Go Further + +[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) + +[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) + +[Managing hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) + +[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) + +For specialised services (SEO, development, etc.), contact [OVHcloud partners](/links/partner). + +If you would like assistance using and configuring your OVHcloud solutions, please refer to our [support offers](/links/support). + +If you need training or technical assistance to implement our solutions, contact your sales representative or click on [this link](/links/professional-services) to get a quote and ask our Professional Services experts for assisting you on your specific use case of your project. + +Join our [community of users](/links/community). 
\ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md new file mode 100644 index 00000000000..42a8241cca1 --- /dev/null +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md @@ -0,0 +1,1064 @@ +--- +title: "Gestion et reconstruction d'un RAID logiciel sur les serveurs utilisant le mode de démarrage UEFI" +excerpt: Découvrez comment gérer et reconstruire un RAID logiciel après un remplacement de disque sur un serveur utilisant le mode de démarrage UEFI +updated: 2025-12-02 +--- + +## Objectif + +Un Redundant Array of Independent Disks (RAID) est une technologie qui atténue la perte de données sur un serveur en répliquant les données sur deux disques ou plus. + +Le niveau RAID par défaut pour les installations de serveurs OVHcloud est le RAID 1, qui double l'espace occupé par vos données, réduisant ainsi l'espace disque utilisable de moitié. + +**Ce guide explique comment gérer et reconstruire un RAID logiciel après un remplacement de disque sur votre serveur en mode EFI** + +Avant de commencer, veuillez noter que ce guide se concentre sur les serveurs dédiés qui utilisent le mode UEFI comme mode de démarrage. C'est le cas des cartes mères modernes. Si votre serveur utilise le mode de démarrage legacy (BIOS), veuillez consulter ce guide : [Gestion et reconstruction d'un RAID logiciel sur des serveurs en mode de démarrage legacy (BIOS)](/pages/bare_metal_cloud/dedicated_servers/raid_soft_bios). + +Pour vérifier si un serveur fonctionne en mode BIOS legacy ou en mode UEFI, exécutez la commande suivante : + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` + +Pour plus d'informations sur l'UEFI, consultez l'article suivant : [https://uefi.org/about](https://uefi.org/about). + +## Prérequis + +- Un [serveur dédié](/links/bare-metal/bare-metal) avec une configuration RAID logiciel +- Un accès administrateur (sudo) au serveur via SSH +- Une compréhension du RAID, des partitions et de GRUB + +Au cours de ce guide, nous utilisons les termes **disque principal** et **disque secondaire**. Dans ce contexte : + +- Le disque principal est le disque dont l'ESP (EFI System Partition) est monté par Linux +- Les disques secondaires sont tous les autres disques du RAID + +## En pratique + +Lorsque vous achetez un nouveau serveur, vous pouvez ressentir le besoin d'effectuer une série de tests et d'actions. Un tel test pourrait être de simuler une panne de disque afin de comprendre le processus de reconstruction du RAID et de vous préparer en cas de problème. + +### Aperçu du contenu + +- [Informations de base](#basicinformation) +- [Compréhension de la partition système EFI (ESP)](#efisystemparition) +- [Simulation d'une panne de disque](#diskfailure) + - [Suppression du disque défectueux](#diskremove) +- [Reconstruction du RAID](#raidrebuild) + - [Reconstruction du RAID après le remplacement du disque principal (mode de secours)](#rescuemode) + - [Re création de la partition système EFI](#recreateesp) + - [Reconstruction du RAID lorsque les partitions EFI ne sont pas synchronisées après des mises à jour majeures du système (ex. 
GRUB)](efiraodgrub) + - [Ajout de l'étiquette à la partition SWAP (si applicable)](#swap-partition) + - [Reconstruction du RAID en mode normal](#normalmode) + + + +### Informations de base + +Dans une session de ligne de commande, tapez la commande suivante pour déterminer l'état actuel du RAID : + +```sh +[user@server_ip ~]# cat /proc/mdstat +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md3 : active raid1 nvme1n1p3[1] nvme0n1p3[0] + 497875968 blocks super 1.2 [2/2] [UU] + bitmap: 2/4 pages [8KB], 65536KB chunk + +md2 : active raid1 nvme1n1p2[1] nvme0n1p2[0] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +Cette commande nous montre que nous avons actuellement deux volumes RAID logiciels configurés, **md2** et **md3**, avec **md3** étant le plus grand des deux. **md3** se compose de deux partitions, appelées **nvme1n1p3** et **nvme0n1p3**. + +Le [UU] signifie que tous les disques fonctionnent normalement. Un `_` indiquerait un disque défectueux. + +Si vous avez un serveur avec des disques SATA, vous obtiendrez les résultats suivants : + +```sh +[user@server_ip ~]# cat /proc/mdstat +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md3 : active raid1 sda3[0] sdb3[1] + 3904786432 blocks super 1.2 [2/2] [UU] + bitmap: 2/30 pages [8KB], 65536KB chunk + +md2 : active raid1 sda2[0] sdb2[1] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +Bien que cette commande retourne nos volumes RAID, elle ne nous indique pas la taille des partitions elles-mêmes. Nous pouvons trouver cette information avec la commande suivante : + +```sh +[user@server_ip ~]# sudo fdisk -l + +Disk /dev/nvme1n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors +Disk model: WDC CL SN720 SDAQNTW-512G-2000 +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +Disklabel type: gpt +Disk identifier: A11EDAA3-A984-424B-A6FE-386550A92435 + +Device Start End Sectors Size Type +/dev/nvme1n1p1 2048 1048575 1046528 511M EFI System +/dev/nvme1n1p2 1048576 3145727 2097152 1G Linux RAID +/dev/nvme1n1p3 3145728 999161855 996016128 474.9G Linux RAID +/dev/nvme1n1p4 999161856 1000210431 1048576 512M Linux files + + +Disk /dev/nvme0n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors +Disk model: WDC CL SN720 SDAQNTW-512G-2000 +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +Disklabel type: gpt +Disk identifier: F03AC3C3-D7B7-43F9-88DB-9F12D7281D94 + +Device Start End Sectors Size Type +/dev/nvme0n1p1 2048 1048575 1046528 511M EFI System +/dev/nvme0n1p2 1048576 3145727 2097152 1G Linux RAID +/dev/nvme0n1p3 3145728 999161855 996016128 474.9G Linux RAID +/dev/nvme0n1p4 999161856 1000210431 1048576 512M Linux file +/dev/nvme0n1p5 1000211120 1000215182 4063 2M Linux file + + +Disk /dev/md2: 1022 MiB, 1071644672 bytes, 2093056 sectors +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes + + +Disk /dev/md3: 474.81 GiB, 509824991232 bytes, 995751936 sectors +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +``` + +La commande `fdisk -l` permet également d'identifier le type de vos partitions. 
C'est une information importante lors de la reconstruction de votre RAID en cas de panne de disque. + +Pour les partitions **GPT**, la ligne 6 affichera : `Disklabel type: gpt`. + +Toujours en se basant sur les résultats de `fdisk -l`, nous pouvons voir que `/dev/md2` se compose de 1022 MiB et `/dev/md3` contient 474,81 GiB. Si nous exécutons la commande `mount`, nous pouvons également trouver la disposition des disques. + +En alternative, la commande `lsblk` offre une vue différente des partitions : + +```sh +[user@server_ip ~]# lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:7 0 511M 0 part +├─nvme1n1p2 259:8 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 /boot +├─nvme1n1p3 259:9 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 / +└─nvme1n1p4 259:10 0 512M 0 part [SWAP] +nvme0n1 259:1 0 476.9G 0 disk +├─nvme0n1p1 259:2 0 511M 0 part /boot/efi +├─nvme0n1p2 259:3 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 /boot +├─nvme0n1p3 259:4 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 / +├─nvme0n1p4 259:5 0 512M 0 part [SWAP] +└─nvme0n1p5 259:6 0 2M 0 part +``` + +De plus, si nous exécutons `lsblk -f`, nous obtenons davantage d'informations sur ces partitions, telles que le LABEL et l'UUID : + +```sh +[user@server_ip ~]# sudo lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA +├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme1n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +└─nvme1n1p4 swap 1 swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7 [SWAP] +nvme0n1 +├─nvme0n1p1 vfat FAT16 EFI_SYSPART B486-9781 504.9M 1% /boot/efi +├─nvme0n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme0n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +├─nvme0n1p4 swap 1 swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b [SWAP] +└─nvme0n1p5 iso9660 Joliet Extension config-2 2025-08-05-14-55-41-00 +``` + +Notez les dispositifs, les partitions et leurs points de montage ; c'est important, surtout après le remplacement d'un disque. + +À partir des commandes et résultats ci-dessus, nous avons : + +- Deux matrices RAID : `/dev/md2` et `/dev/md3`. +- Quatre partitions qui font partie du RAID : **nvme0n1p2**, **nvme0n1p3**, **nvme1n1p2**, **nvme0n1p3** avec les points de montage `/boot` et `/`. +- Deux partitions non incluses dans le RAID, avec les points de montage : `/boot/efi` et [SWAP]. +- Une partition qui ne possède pas de point de montage : **nvme1n1p1** + +La partition `nvme0n1p5` est une partition de configuration, c'est-à-dire un volume en lecture seule connecté au serveur qui lui fournit les données de configuration initiale. + + + +### Comprendre la partition système EFI (ESP) + +***Qu'est-ce qu'une partition système EFI ?*** + +**Une partition système EFI est une partition sur laquelle le serveur demarre. Elle contient les fichiers de démarrage, mais aussi les gestionnaires de démarrage ou les images de noyau d'un système d'exploitation installé. 
Elle peut également contenir des programmes utilitaires conçus pour être exécutés avant que le système d'exploitation ne démarre, ainsi que des fichiers de données tels que des journaux d'erreurs. + +***La partition système EFI est-elle incluse dans le RAID ?*** + +Non, à partir d'août 2025, lorsqu'une installation du système d'exploitation est effectuée par OVHcloud, la partition ESP n'est pas incluse dans le RAID. Lorsque vous utilisez nos modèles d'OS pour installer votre serveur avec un RAID logiciel, plusieurs partitions système EFI sont créées : une par disque. Cependant, seule une partition EFI est montée à la fois. Toutes les ESP créées contiennent les mêmes fichiers. Tous les ESP créés au moment de l'installation contiennent les mêmes fichiers. + +La partition système EFI est montée à `/boot/efi` et le disque sur lequel elle est montée est sélectionné par Linux au démarrage. + +Exemple : + +```sh +[user@server_ip ~]# sudo lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA +├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme1n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +└─nvme1n1p4 swap 1 swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7 [SWAP] +nvme0n1 +├─nvme0n1p1 vfat FAT16 EFI_SYSPART B486-9781 504.9M 1% /boot/efi +├─nvme0n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme0n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +├─nvme0n1p4 swap 1 swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b [SWAP] +└─nvme0n1p5 iso9660 Joliet Extension config-2 2025-08-05-14-55-41-00 +``` + +D'après les résultats ci-dessus, nous voyons que nous avons deux partitions système EFI identiques (nvme0n1p1 et nvme1n1p1), mais seule **nvme0n1p1** est montée sur `/boot/efi`. Les deux partitions ont le LABEL : `EFI_SYSPART` (ce nommage est spécifique à OVHcloud). + +***Le contenu de la partition système EFI change-t-il régulièrement ?*** + +En général, le contenu de cette partition ne change pas beaucoup, son contenu ne devrait changer que lors des mises à jour du chargeur d'amorçage (*bootloader*). + +Cependant, nous recommandons d'exécuter un script automatique ou manuel pour synchroniser tous les ESP, afin qu'ils contiennent tous les mêmes fichiers à jour. De cette façon, si le disque sur lequel cette partition est montée tombe en panne, le serveur pourra redémarrer sur l'ESP de l'un des autres disques. + +***Que se passe-t-il si le disque principal monté sur `boot/efi` tombe en panne ?*** + +> [!primary] +> Veuillez noter que nous explorons ci-dessous les cas les plus courants, mais il existe plusieurs autres raisons pour lesquelles un serveur ne pourrait pas démarrer en mode normal après un remplacement de disque. +> + +**Étude de cas 1** - Il n'y a eu aucun changement ou mise à jour majeure du système (par exemple GRUB) + +- Le serveur est capable de démarrer en mode normal et vous pouvez procéder à la reconstruction du RAID. +- Le serveur n'est pas capable de démarrer en mode normal, le serveur est redémarré en mode rescue, où vous pouvez reconstruire le RAID et recréer la partition EFI sur le nouveau disque. 
+ +**Étude de cas 2** - Il y a eu des mises à jour majeures du système (par exemple GRUB) et les ESP ont été synchronisées + +- Le serveur est capable de démarrer en mode normal car toutes les ESP contiennent des informations à jour et la reconstruction du RAID peut être effectuée en mode normal. +- Le serveur n'est pas capable de démarrer en mode normal, le serveur est redémarré en mode rescue, où vous pouvez reconstruire le RAID et recréer la partition système EFI sur le nouveau disque. + +**Étude de cas 3** - Il y a eu des mises à jour majeures du système (par exemple GRUB) et les partitions ESP n'ont pas été synchronisées + +- Le serveur n'est pas capable de démarrer en mode normal, le serveur est redémarré en mode rescue, où vous pouvez reconstruire le RAID, recréer la partition système EFI sur le nouveau disque et réinstaller le chargeur de démarrage (bootloader) sur celui-ci. +- Le serveur est capable de démarrer en mode normal (cela pourrait arriver dans le cas où un système d'exploitation est mis à jour vers une version plus récente mais que la version de GRUB reste inchangée) et vous pouvez procéder à la reconstruction du RAID. + +En effet, dans certains cas, le démarrage à partir d'une ESP obsolète ne fonctionne pas. Par exemple, une mise à jour majeure de GRUB pourrait rendre l'ancienne version de GRUB présente dans l'ESP incompatible avec les modules GRUB plus récents installés dans la partition `/boot`. + +***Comment puis-je synchroniser mes partitions système EFI, et à quelle fréquence devrais-je les synchroniser ?*** + +> [!primary] +> Veuillez noter que selon votre système d'exploitation, le processus peut être différent. Par exemple, Ubuntu est capable de garder plusieurs partitions système EFI synchronisées à chaque mise à jour de GRUB. Cependant, c'est le seul système d'exploitation qui le fait. Nous vous recommandons de consulter la documentation officielle de votre système d'exploitation pour comprendre comment gérer les ESP. +> +> Dans ce guide, le système d'exploitation utilisé est Debian. + +Nous vous recommandons de synchroniser vos ESP régulièrement ou après chaque mise à jour majeure du système. Par défaut, toutes les partitions système EFI contiennent les mêmes fichiers après l'installation. Cependant, si une mise à jour majeure du système est impliquée, la synchronisation des ESP est essentielle pour garder le contenu à jour. + + + +#### Script + +Voici un script que vous pouvez utiliser pour les synchroniser manuellement. Vous pouvez également exécuter un script automatisé pour synchroniser les partitions quotidiennement ou chaque fois que le service démarre. + +Avant d'exécuter le script, assurez-vous que `rsync` est installé sur votre système : + +**Debian/Ubuntu** + +```sh +sudo apt install rsync +``` + +**CentOS, Red Hat et Fedora** + +```sh +sudo yum install rsync +``` + +Pour exécuter un script sous Linux, vous avez besoin d'un fichier exécutable : + +- Commencez par créer un fichier .sh dans le répertoire de votre choix, en remplaçant `nom-du-script` par le nom de votre choix. 
+ +```sh +sudo touch nom-du-script.sh +``` + +- Ouvrez le fichier avec un éditeur de texte et ajoutez les lignes suivantes : + +```sh +sudo nano nom-du-script.sh +``` + +```sh +#!/bin/bash + +set -euo pipefail + +MOUNTPOINT="/var/lib/grub/esp" +MAIN_PARTITION=$(findmnt -n -o SOURCE /boot/efi) + +echo "${MAIN_PARTITION} est la partition principale" + +mkdir -p "${MOUNTPOINT}" + +while read -r partition; do + if [[ "${partition}" == "${MAIN_PARTITION}" ]]; then + continue + fi + echo "Travail sur ${partition}" + mount "${partition}" "${MOUNTPOINT}" + rsync -ax "/boot/efi/" "${MOUNTPOINT}/" + umount "${MOUNTPOINT}" +done < <(blkid -o device -t LABEL=EFI_SYSPART) +``` + +Enregistrez et fermez le fichier. + +- Rendez le script exécutable + +```sh +sudo chmod +x nom-du-script.sh +``` + +- Exécutez le script + +```sh +sudo ./nom-du-script.sh +``` + +- Si vous n'êtes pas dans le dossier + +```sh +./chemin/vers/dossier/nom-du-script.sh +``` + +Lorsque le script est exécuté, le contenu de la partition EFI montée sera synchronisé avec les autres. Pour accéder au contenu, vous pouvez monter l'une de ces partitions EFI non montées sur le point de montage : `/var/lib/grub/esp`. + + + +### Simulation d'une panne de disque + +Maintenant que nous avons toutes les informations nécessaires, nous pouvons simuler une panne de disque et procéder aux tests. Dans ce premier exemple, nous allons provoquer une défaillance du disque principal `nvme0n1`. + +La méthode préférée pour le faire est via le mode rescue d'OVHcloud. + +Redémarrez d'abord le serveur en mode rescue et connectez-vous avec les identifiants fournis. + +Pour retirer un disque du RAID, la première étape est de le marquer comme **Failed** et de retirer les partitions de leurs tableaux RAID respectifs. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[0] nvme1n1p3[1] + 497875968 blocks super 1.2 [2/2] [UU] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +À partir du résultat ci-dessus, nvme0n1 comporte deux partitions en RAID qui sont **nvme0n1p2** et **nvme0n1p3**. + + + +#### Retrait du disque défectueux + +Tout d'abord, nous marquons les partitions **nvme0n1p2** et **nvme0n1p3** comme défectueuses. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/nvme0n1p2 +# mdadm: set /dev/nvme0n1p2 faulty in /dev/md2 +``` + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --fail /dev/nvme0n1p3 +# mdadm: set /dev/nvme0n1p3 faulty in /dev/md3 +``` + +Lorsque nous exécutons la commande `cat /proc/mdstat`, nous obtenons : + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2](F) nvme1n1p2[1] + 1046528 blocks super 1.2 [2/1] [_U] + +unused devices: +``` + +Comme nous pouvons le voir ci-dessus, le [F] à côté des partitions indique que le disque est défectueux ou en panne. + +Ensuite, nous retirons ces partitions des tableaux RAID pour supprimer complètement le disque du RAID. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --remove /dev/nvme0n1p2 +# mdadm: hot removed /dev/nvme0n1p2 from /dev/md2 +``` + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --remove /dev/nvme0n1p3 +# mdadm: hot removed /dev/nvme0n1p3 from /dev/md3 +``` + +L'état de notre RAID devrait maintenant ressembler à ceci : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme1n1p2[1] + 1046528 blocks super 1.2 [2/1] [_U] + +unused devices: +``` + +D'après les résultats ci-dessus, nous pouvons voir qu'il n'y a désormais que deux partitions dans les tableaux RAID. Nous avons réussi à dégrader le disque **nvme0n1**. + +Pour nous assurer d'obtenir un disque similaire à un disque vide, nous utilisons la commande suivante sur chaque partition, puis sur le disque lui-même : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +shred -s10M -n1 /dev/nvme0n1p1 +shred -s10M -n1 /dev/nvme0n1p2 +shred -s10M -n1 /dev/nvme0n1p3 +shred -s10M -n1 /dev/nvme0n1p4 +shred -s10M -n1 /dev/nvme0n1p5 +shred -s10M -n1 /dev/nvme0n1 +``` + +Le disque apparaît désormais comme un disque neuf et vide : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk + +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:1 0 511M 0 part +├─nvme1n1p2 259:2 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 +├─nvme1n1p3 259:3 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 +└─nvme1n1p4 259:4 0 512M 0 part +nvme0n1 259:5 0 476.9G 0 disk +``` + +Si nous exécutons la commande suivante, nous constatons que notre disque a été correctement "effacé" : + +```sh +parted /dev/nvme0n1 +GNU Parted 3.5 +Using /dev/nvme0n1 +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/nvme0n1: unrecognised disk label +Model: WDC CL SN720 SDAQNTW-512G-2000 (nvme) +Disk /dev/nvme0n1: 512GB +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: +``` + +Pour plus d'informations sur la préparation et la demande de remplacement d'un disque, consultez ce [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement). + +Si vous exécutez la commande suivante, vous pouvez obtenir davantage de détails sur les tableaux RAID : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md3 + +/dev/md3: + Version : 1.2 + Creation Time : Fri Aug 1 14:51:13 2025 + Raid Level : raid1 + Array Size : 497875968 (474.81 GiB 509.82 GB) + Used Dev Size : 497875968 (474.81 GiB 509.82 GB) + Raid Devices : 2 + Total Devices : 1 + Persistence : Superblock is persistent + + Intent Bitmap : Internal + + Update Time : Fri Aug 1 15:56:17 2025 + State : clean, degraded + Active Devices : 1 + Working Devices : 1 + Failed Devices : 0 + Spare Devices : 0 + +Consistency Policy : bitmap + + Name : md3 + UUID : b383c3d5:7fb1bb5e:6b7c4d96:6ea817ff + Events : 215 + + Number Major Minor RaidDevice State + - 0 0 0 removed + 1 259 4 1 active sync /dev/nvme1n1p3 +``` + +Nous pouvons maintenant procéder au remplacement du disque. + + + +### Reconstruction du RAID + +> [!primary] +> Ce processus peut varier selon le système d'exploitation installé sur votre serveur. 
Nous vous recommandons de consulter la documentation officielle de votre système d'exploitation pour obtenir les commandes appropriées. +> + +> [!warning] +> +> Sur la plupart des serveurs en RAID logiciel, après un remplacement de disque, le serveur est capable de démarrer en mode normal (sur le disque sain) et la reconstruction peut être effectuée en mode normal. Cependant, si le serveur ne parvient pas à démarrer en mode normal après le remplacement du disque, il redémarrera en mode rescue pour procéder à la reconstruction du RAID. +> +> Si votre serveur est capable de démarrer en mode normal après le remplacement du disque, suivez simplement les étapes de [cette section](#rebuilding-the-raid-in-normal-mode). + + + +#### Reconstruction du RAID en mode rescue + +Une fois le disque remplacé, l'étape suivante consiste à copier la table de partitions du disque sain (dans cet exemple, nvme1n1) sur le nouveau (nvme0n1). + +**Pour les partitions GPT** + +La commande doit être dans ce format : `sgdisk -R /dev/nouveau disque /dev/disque sain` + +Dans notre exemple : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/nvme0n1 /dev/nvme1n1 +``` + +Exécutez `lsblk` pour vous assurer que les tables de partitions ont été correctement copiées : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk + +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:1 0 511M 0 part +├─nvme1n1p2 259:2 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 +├─nvme1n1p3 259:3 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 +└─nvme1n1p4 259:4 0 512M 0 part +nvme0n1 259:5 0 476.9G 0 disk +├─nvme0n1p1 259:10 0 511M 0 part +├─nvme0n1p2 259:11 0 1G 0 part +├─nvme0n1p3 259:12 0 474.9G 0 part +└─nvme0n1p4 259:13 0 512M 0 part +``` + +Une fois cela fait, l'étape suivante consiste à attribuer un GUID aléatoire au nouveau disque afin d'éviter les conflits de GUID avec d'autres disques : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -G /dev/nvme0n1 +``` + +Si vous recevez le message suivant : + +```console +Warning: The kernel is still using the old partition table. +The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8) +The operation has completed successfully. +``` + +Exécutez simplement la commande `partprobe`. + +Nous pouvons maintenant reconstruire la matrice RAID. L'extrait de code suivant montre comment ajouter à nouveau les nouvelles partitions (nvme0n1p2 et nvme0n1p3) à la matrice RAID. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md2 /dev/nvme0n1p2 +# mdadm: added /dev/nvme0n1p2 + +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md3 /dev/nvme0n1p3 +# mdadm: re-added /dev/nvme0n1p3 +``` + +Pour vérifier le processus de reconstruction : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[2] nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + [>....................] 
recovery = 0.1% (801920/497875968) finish=41.3min speed=200480K/sec + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] + 1046528 blocks super 1.2 [2/2] [UU] +``` + +Une fois la reconstruction du RAID terminée, exécutez la commande suivante pour vous assurer que les partitions ont été correctement ajoutées au RAID : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART 4629-D183 +├─nvme1n1p2 linux_raid_member 1.2 md2 83719c5c-2a27-2a56-5268-7d49d8a1d84f +│ └─md2 ext4 1.0 boot 4de80ae0-dd90-4256-9135-1735e7be4b4d +├─nvme1n1p3 linux_raid_member 1.2 md3 b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff +│ └─md3 ext4 1.0 root 9bf386b6-9523-46bf-b8e5-4b8cc7c5786f +└─nvme1n1p4 swap 1 swap-nvme1n1p4 9bf292e8-0145-4d2f-b891-4cef93c0d209 +nvme0n1 +├─nvme0n1p1 +├─nvme0n1p2 linux_raid_member 1.2 md2 83719c5c-2a27-2a56-5268-7d49d8a1d84f +│ └─md2 ext4 1.0 boot 4de80ae0-dd90-4256-9135-1735e7be4b4d +├─nvme0n1p3 linux_raid_member 1.2 md3 b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff +│ └─md3 ext4 1.0 root 9bf386b6-9523-46bf-b8e5-4b8cc7c5786f +└─nvme0n1p4 +``` + +D'après les résultats ci-dessus, les partitions du nouveau disque ont été correctement ajoutées au RAID. Toutefois, la partition EFI System et la partition SWAP (dans certains cas) n'ont pas été dupliquées, ce qui est normal car elles ne font pas partie du RAID. + +> [!warning] +> Les exemples ci-dessus illustrent simplement les étapes nécessaires sur la base d'une configuration de serveur par défaut. Les résultats de chaque commande dépendent du type de matériel installé sur votre serveur et de la structure de ses partitions. En cas de doute, consultez la documentation de votre système d'exploitation. +> +> Si vous avez besoin d'une assistance professionnelle pour l'administration de votre serveur, consultez les détails de la section [Aller plus loin](#go-further) de ce guide. +> + + + +#### Recréation de la partition EFI System + +Pour recréer la partition EFI System, nous devons formater **nvme0n1p1** et répliquer le contenu de la partition EFI System saine (dans notre exemple : nvme1n1p1) sur celle-ci. + +Ici, nous supposons que les deux partitions ont été synchronisées et contiennent des fichiers à jour ou n'ont tout simplement pas subi de mises à jour système ayant un impact sur le *bootloader*. + +> [!warning] +> Si une mise à jour majeure du système, telle qu'une mise à jour du noyau ou de GRUB, a eu lieu et que les deux partitions n'ont pas été synchronisées, consultez cette [section](#rebuilding-raid-when-efi-partitions-are-not-synchronized-after-major-system-updates-eg-grub) une fois que vous avez terminé la création de la nouvelle partition EFI System. +> + +Tout d'abord, nous formattons la partition : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkfs.vfat /dev/nvme0n1p1 +``` + +Ensuite, nous attribuons l'étiquette `EFI_SYSPART` à la partition. (ce nommage est spécifique à OVHcloud) : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # fatlabel /dev/nvme0n1p1 EFI_SYSPART +``` + +Ensuite, nous dupliquons le contenu de nvme1n1p1 vers nvme0n1p1. 
Nous commençons par créer deux dossiers, que nous nommons « old » et « new » dans notre exemple : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkdir old new +``` + +Ensuite, nous montons **nvme1n1p1** dans le dossier « old » et **nvme0n1p1** dans le dossier « new » pour faire la distinction : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme1n1p1 old +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme0n1p1 new +``` + +Ensuite, nous copions les fichiers du dossier 'old' vers 'new' : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # rsync -axv old/ new/ +sending incremental file list +EFI/ +EFI/debian/ +EFI/debian/BOOTX64.CSV +EFI/debian/fbx64.efi +EFI/debian/grub.cfg +EFI/debian/grubx64.efi +EFI/debian/mmx64.efi +EFI/debian/shimx64.efi + +sent 6,099,848 bytes received 165 bytes 12,200,026.00 bytes/sec +total size is 6,097,843 speedup is 1.00 +``` + +Une fois cela fait, nous démontons les deux partitions : + + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme0n1p1 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme1n1p1 +``` + +Ensuite, nous montons la partition contenant la racine de notre système d'exploitation sur `/mnt`. Dans notre exemple, cette partition est **md3**: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md3 /mnt +``` + +Nous montons les répertoires suivants pour nous assurer que toute manipulation que nous effectuons dans l'environnement `chroot` fonctionne correctement : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Ensuite, nous utilisons la commande `chroot` pour accéder au point de montage et nous assurer que la nouvelle partition système EFI a été correctement créée et que le système reconnaît les deux ESP : + +```sh +root@rescue12-customer-eu:/# chroot /mnt +``` + +Pour afficher les partitions ESP, nous exécutons la commande `blkid -t LABEL=EFI_SYSPART` : + +```sh +root@rescue12-customer-eu:/# blkid -t LABEL=EFI_SYSPART +/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" +/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" +``` + +Les résultats ci-dessus montrent que la nouvelle partition EFI a été créée correctement et que le LABEL a été appliqué correctement. + + + +#### Reconstruction du RAID lorsque les partitions EFI ne sont pas synchronisées après des mises à jour majeures du système (GRUB) + +/// details | Développer cette section + +> [!warning] +> Veuillez suivre les étapes de cette section uniquement si cela s'applique à votre cas. +> + +Lorsque les partitions système EFI ne sont pas synchronisées après des mises à jour majeures du système qui modifient/affectent le GRUB, et que le disque principal sur lequel la partition est montée est remplacé, le démarrage à partir d'un disque secondaire contenant une ESP obsolète peut ne pas fonctionner. 
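+
+À titre d'illustration uniquement (le point de montage `/mnt/esp-verif` est arbitraire et le chemin `EFI/debian` correspond à notre exemple sous Debian, à adapter à votre configuration), vous pouvez contrôler depuis le mode rescue si l'ESP du disque sain semble obsolète en la montant en lecture seule et en examinant la date des fichiers du chargeur de démarrage :
+
+```sh
+# Montage en lecture seule de l'ESP du disque sain (ici nvme1n1p1) sur un point de montage temporaire
+mkdir -p /mnt/esp-verif
+mount -o ro /dev/nvme1n1p1 /mnt/esp-verif
+
+# La date des fichiers GRUB indique si l'ESP a été synchronisée après la dernière mise à jour du chargeur de démarrage
+ls -l /mnt/esp-verif/EFI/debian/
+
+# Démontage une fois la vérification terminée
+umount /mnt/esp-verif
+```
+
+Si ces fichiers sont antérieurs à la dernière mise à jour de GRUB, l'ESP est probablement obsolète.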
+ +Dans ce cas, en plus de reconstruire le RAID et de recréer la partition système EFI en mode rescue, vous devez également réinstaller le GRUB sur celle-ci. + +Une fois que nous avons recréé la partition EFI et nous sommes assurés que le système reconnaît les deux partitions (étapes précédentes dans `chroot`), nous créons le dossier `/boot/efi` afin de monter la nouvelle partition système EFI **nvme0n1p1** : + +```sh +root@rescue12-customer-eu:/# mount /boot +root@rescue12-customer-eu:/# mount /dev/nvme0n1p1 /boot/efi +``` + +Ensuite, nous réinstallons le chargeur de démarrage GRUB (*bootloader*) : + +```sh +root@rescue12-customer-eu:/# grub-install --efi-directory=/boot/efi /dev/nvme0n1p1 +``` + +Une fois fait, exécutez la commande suivante : + +```sh +root@rescue12-customer-eu:/# update-grub +``` +/// + + + +#### Ajout de l'étiquette à la partition SWAP (si applicable) + +Une fois que nous avons terminé avec la partition EFI, nous passons à la partition SWAP. + +Nous sortons de l'environnement `chroot` avec `exit` afin de recréer notre partition [SWAP] **nvme0n1p4** et ajouter l'étiquette `swap-nvme0n1p4` : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 +Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Nous vérifions que l'étiquette a été correctement appliquée : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS +nvme1n1 + +├─nvme1n1p1 +│ vfat FAT16 EFI_SYSPART +│ BA77-E844 504.9M 1% /root/old +├─nvme1n1p2 +│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 +│ └─md2 +│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac +├─nvme1n1p3 +│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c +│ └─md3 +│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt +└─nvme1n1p4 + swap 1 swap-nvme1n1p4 + d6af33cf-fc15-4060-a43c-cb3b5537f58a +nvme0n1 + +├─nvme0n1p1 +│ vfat FAT16 EFI_SYSPART +│ 477D-6658 +├─nvme0n1p2 +│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 +│ └─md2 +│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac +├─nvme0n1p3 +│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c +│ └─md3 +│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt +└─nvme0n1p4 + swap 1 swap-nvme0n1p4 + b3c9e03a-52f5-4683-81b6-cc10091fcd15 +``` + +Nous accédons ensuite à nouveau à l'environnement `chroot` : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt +``` + +Nous récupérons l'UUID des deux partitions swap : + +```sh +root@rescue12-customer-eu:/# blkid -s UUID blkid /dev/nvme0n1p4 +/dev/nvme0n1p4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" + +root@rescue12-customer-eu:/# blkid -s UUID blkid /dev/nvme1n1p4 +/dev/nvme1n1p4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +Ensuite, nous remplaçons l'ancien UUID de la partition swap (**nvme0n1p4**) par le nouveau dans le fichier `/etc/fstab` : + +```sh +root@rescue12-customer-eu:/# nano /etc/fstab +``` + +Exemple : + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Sur la base des résultats ci-dessus, l'ancien UUID est `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` et doit être remplacé par le nouveau 
`b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Assurez-vous de remplacer le bon UUID. + +Ensuite, nous vérifions que tout est correctement monté avec la commande suivante : + +```sh +root@rescue12-customer-eu:/# mount -av +/ : ignored +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Nous activons la partition swap : + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/nvme0n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme0n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme0n1p4 +swapon: /dev/nvme1n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme1n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme1n1p4 +``` + +Nous sortons de l'environnement chroot avec `exit` et rechargeons le système : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload +``` + +Nous démontons tous les disques : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -Rl /mnt +``` + +Nous avons maintenant terminé avec succès la reconstruction RAID sur le serveur et nous pouvons désormais le redémarrer en mode normal. + + + +#### Reconstruction du RAID en mode normal + +/// details | Développer cette section + +Si votre serveur est capable de démarrer en mode normal après un remplacement de disque, vous pouvez suivre les étapes suivantes pour reconstruire le RAID. + +Une fois le disque remplacé, nous copions la table de partition du disque sain (dans cet exemple, nvme1n1) vers le nouveau (nvme0n1). + +**Pour les partitions GPT** + +```sh +sgdisk -R /dev/nvme0n1 /dev/nvme1n1 +``` + +La commande doit être dans ce format : `sgdisk -R /dev/nouveau disque /dev/disque sain`. + +Une fois cela fait, l'étape suivante consisteà attribuer un GUID aléatoire au nouveau disque pour éviter les conflits de GUID avec d'autres disques : + +```sh +sgdisk -G /dev/nvme0n1 +``` + +Si vous recevez le message suivant : + +```console +Warning: The kernel is still using the old partition table. +The new table will be used at the next reboot or after you +run partprobe(8) or kpartx(8) +The operation has completed successfully. +``` + +Exécutez simplement la commande `partprobe`. Si vous ne voyez toujours pas les nouvelles partitions créées (ex. avec `lsblk`), vous devez redémarrer le serveur avant de continuer. + +Ensuite, nous ajoutons les partitions au RAID : + +```sh +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/nvme0n1p2 + +# mdadm: added /dev/nvme0n1p2 + +[user@server_ip ~]# sudo mdadm --add /dev/md3 /dev/nvme0n1p3 + +# mdadm: re-added /dev/nvme0n1p3 +``` + +Utilisez la commande suivante pour suivre la reconstruction du RAID : `cat /proc/mdstat`. + +**Recreation de la partition EFI System sur le disque** + +Tout d'abord, nous installons les outils nécessaires : + +**Debian et Ubuntu** + +```sh +[user@server_ip ~]# sudo apt install dosfstools +``` + +**CentOS** + +```sh +[user@server_ip ~]# sudo yum install dosfstools +``` + +Ensuite, nous formattons la partition. Dans notre exemple `nvme0n1p1` : + +```sh +[user@server_ip ~]# sudo mkfs.vfat /dev/nvme0n1p1 +``` + +Ensuite, nous attribuons l'étiquette `EFI_SYSPART` à la partition. (ce nommage est spécifique à OVHcloud): + +```sh +[user@server_ip ~]# sudo fatlabel /dev/nvme0n1p1 EFI_SYSPART +``` + +Une fois cela fait, vous pouvez synchroniser les deux partitions à l'aide du script que nous avons fourni [ici](#script). 
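+
+Si vous préférez effectuer une synchronisation ponctuelle sans passer par le script (exemple indicatif : le point de montage `/mnt/esp` est arbitraire et `nvme0n1p1` désigne la nouvelle partition de notre exemple), vous pouvez copier manuellement le contenu de l'ESP actuellement montée vers la nouvelle partition :
+
+```sh
+# Monter temporairement la nouvelle partition système EFI (nvme0n1p1 dans notre exemple)
+sudo mkdir -p /mnt/esp
+sudo mount /dev/nvme0n1p1 /mnt/esp
+
+# Recopier le contenu de l'ESP montée sur /boot/efi vers la nouvelle partition
+sudo rsync -ax /boot/efi/ /mnt/esp/
+
+# Démonter la partition une fois la copie terminée
+sudo umount /mnt/esp
+```
+
+Le script fourni plus haut automatise simplement cette même copie pour toutes les ESP non montées.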
+
+Nous vérifions que la nouvelle partition EFI System a été correctement créée et que le système la reconnaît :
+
+```sh
+[user@server_ip ~]# sudo blkid -t LABEL=EFI_SYSPART
+/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f"
+/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d"
+```
+
+Enfin, nous activons la partition [SWAP] (si applicable) :
+
+- Nous créons et ajoutons l'étiquette :
+
+```sh
+[user@server_ip ~]# sudo mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4
+```
+
+- Nous récupérons les UUID des deux partitions swap :
+
+```sh
+[user@server_ip ~]# sudo blkid -s UUID /dev/nvme0n1p4
+/dev/nvme0n1p4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15"
+[user@server_ip ~]# sudo blkid -s UUID /dev/nvme1n1p4
+/dev/nvme1n1p4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a"
+```
+
+- Nous remplaçons l'ancien UUID de la partition swap (**nvme0n1p4**) par le nouveau dans `/etc/fstab` :
+
+```sh
+[user@server_ip ~]# sudo nano /etc/fstab
+```
+
+Exemple :
+
+```sh
+UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1
+UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0
+LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1
+UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0
+UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0
+```
+
+D'après les résultats ci-dessus, l'ancien UUID est `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` et doit être remplacé par le nouveau `b3c9e03a-52f5-4683-81b6-cc10091fcd15`.
+
+Assurez-vous de remplacer le bon UUID.
+
+Ensuite, nous exécutons la commande suivante pour activer la partition swap :
+
+```sh
+[user@server_ip ~]# sudo swapon -av
+swapon: /dev/nvme0n1p4: found signature [pagesize=4096, signature=swap]
+swapon: /dev/nvme0n1p4: pagesize=4096, swapsize=536870912, devsize=536870912
+swapon /dev/nvme0n1p4
+swapon: /dev/nvme1n1p4: found signature [pagesize=4096, signature=swap]
+swapon: /dev/nvme1n1p4: pagesize=4096, swapsize=536870912, devsize=536870912
+swapon /dev/nvme1n1p4
+```
+
+Ensuite, nous rechargeons le système :
+
+```sh
+[user@server_ip ~]# sudo systemctl daemon-reload
+```
+
+Nous avons maintenant terminé avec succès la reconstruction RAID.
+
+## Aller plus loin
+
+[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft)
+
+[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh)
+
+[Managing hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard)
+
+[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard)
+
+Pour les services spécialisés (SEO, développement, etc.), contactez [les partenaires OVHcloud](/links/partner).
+
+Si vous avez besoin d'une assistance pour utiliser et configurer vos solutions OVHcloud, veuillez consulter nos [offres de support](/links/support).
+
+Si vous avez besoin de formation ou d'une assistance technique pour mettre en place nos solutions, contactez votre représentant commercial ou cliquez sur [ce lien](/links/professional-services) pour obtenir un devis et demander à nos experts de Services Professionnels d'intervenir sur votre cas d'utilisation spécifique.
+
+Rejoignez notre [communauté d'utilisateurs](/links/community).
\ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/meta.yaml b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/meta.yaml new file mode 100644 index 00000000000..fc2cce15b0e --- /dev/null +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/meta.yaml @@ -0,0 +1,2 @@ +id: 026d2da0-e852-4c24-b78c-39660ef19c06 +full_slug: dedicated-servers-raid-soft-uefi \ No newline at end of file From 238df1efb3f828ebff2e005d0b0f4c7182239fe3 Mon Sep 17 00:00:00 2001 From: jessica Date: Mon, 1 Dec 2025 20:07:36 -0500 Subject: [PATCH 2/8] update --- .../bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md index 8fe329f0549..465a28eb56f 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md @@ -1,5 +1,5 @@ --- -title: Managing and rebuilding software RAID on servers in legacy boot (BIOS) mode +title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode updated: 2025-12-02 --- From c0876ba039c47d05a9097f81fe90fe646136fd8c Mon Sep 17 00:00:00 2001 From: jessica Date: Mon, 1 Dec 2025 20:08:30 -0500 Subject: [PATCH 3/8] index update --- pages/index.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/pages/index.md b/pages/index.md index 94e8de20140..d7fbb892efe 100644 --- a/pages/index.md +++ b/pages/index.md @@ -164,7 +164,8 @@ + [How to assign a tag to a Bare Metal server](bare_metal_cloud/dedicated_servers/resource-tag-assign) + [How to install VMware ESXi 8 on a dedicated server](bare_metal_cloud/dedicated_servers/esxi-partitioning) + [Storage](bare-metal-cloud-dedicated-servers-configuration-storage) - + [How to configure and rebuild software RAID](bare_metal_cloud/dedicated_servers/raid_soft) + + [Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode](bare_metal_cloud/dedicated_servers/raid_soft) + + [Managing and rebuilding software RAID on servers using UEFI boot mode](bare_metal_cloud/dedicated_servers/raid_soft_uefi) + [Managing hardware RAID](bare_metal_cloud/dedicated_servers/raid_hard) + [Hot swap - Hardware RAID](bare_metal_cloud/dedicated_servers/hotswap_raid_hard) + [Hot swap - Software RAID](bare_metal_cloud/dedicated_servers/hotswap_raid_soft) From 4bda2271f8ee8ffef248c8ef6764681c547056f2 Mon Sep 17 00:00:00 2001 From: jessica Date: Mon, 1 Dec 2025 20:22:45 -0500 Subject: [PATCH 4/8] date update --- .../bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md | 2 +- .../bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md | 2 +- .../dedicated_servers/raid_soft_uefi/guide.en-gb.md | 2 +- .../dedicated_servers/raid_soft_uefi/guide.fr-fr.md | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md index 465a28eb56f..c9e4c5fcc92 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md @@ -1,7 +1,7 @@ --- title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode excerpt: Find out 
how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode -updated: 2025-12-02 +updated: 2025-12-03 --- ## Objective diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md index 31333738d6b..4f488f10f6a 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md @@ -1,7 +1,7 @@ --- title: Gestion et reconstruction du RAID logiciel sur les serveurs en mode legacy boot (BIOS) excerpt: "Découvrez comment gérer et reconstruire le RAID logiciel après un remplacement de disque sur votre serveur en mode legacy boot (BIOS)" -updated: 2025-12-02 +updated: 2025-12-03 --- ## Objectif diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md index f91dbd5e6bf..337e8a7c5ee 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md @@ -1,7 +1,7 @@ --- title: Managing and rebuilding software RAID on servers using UEFI boot mode excerpt: Find out how to manage and rebuild software RAID after a disk replacement on a server using UEFI boot mode -updated: 2025-12-02 +updated: 2025-12-03 --- ## Objective diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md index 42a8241cca1..719ffa7a887 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md @@ -1,7 +1,7 @@ --- title: "Gestion et reconstruction d'un RAID logiciel sur les serveurs utilisant le mode de démarrage UEFI" excerpt: Découvrez comment gérer et reconstruire un RAID logiciel après un remplacement de disque sur un serveur utilisant le mode de démarrage UEFI -updated: 2025-12-02 +updated: 2025-12-03 --- ## Objectif From 33de926509a544d7f0a7c995751b007d8fac3533 Mon Sep 17 00:00:00 2001 From: Yoann Cosse Date: Fri, 5 Dec 2025 14:55:54 +0100 Subject: [PATCH 5/8] Proofreading --- .../raid_soft/guide.en-gb.md | 2 +- .../raid_soft/guide.fr-fr.md | 7 ++--- .../raid_soft_uefi/guide.en-gb.md | 2 +- .../raid_soft_uefi/guide.fr-fr.md | 27 +++++++++---------- 4 files changed, 19 insertions(+), 19 deletions(-) diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md index c9e4c5fcc92..d3295c1c924 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md @@ -1,7 +1,7 @@ --- title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode -updated: 2025-12-03 +updated: 2025-12-05 --- ## Objective diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md index 4f488f10f6a..8b04a0d900c 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md @@ -1,7 +1,7 @@ --- title: Gestion et reconstruction du RAID logiciel sur les serveurs en mode 
legacy boot (BIOS) excerpt: "Découvrez comment gérer et reconstruire le RAID logiciel après un remplacement de disque sur votre serveur en mode legacy boot (BIOS)" -updated: 2025-12-03 +updated: 2025-12-05 --- ## Objectif @@ -45,7 +45,8 @@ Pour vérifier si un serveur s'exécute en mode BIOS ou en mode UEFI, exécutez ### Informations de base -Dans une session de ligne de commande, tapez le code suivant pour déterminer l'état actuel du RAID +Dans une session de ligne de commande, tapez le code suivant pour déterminer l'état actuel du RAID. + ### Retrait du disque La vérification de l’état actuel du RAID s’effectue via la commande suivante : @@ -337,7 +338,7 @@ Consistency Policy : bitmap -#### Reconstruire le RAID in normal mode +#### Reconstruire le RAID en mode normal Les étapes suivantes sont réalisées en mode normal. Dans notre exemple, nous avons remplacé le disque **sda**. diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md index 337e8a7c5ee..c5fab1c787f 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md @@ -1,7 +1,7 @@ --- title: Managing and rebuilding software RAID on servers using UEFI boot mode excerpt: Find out how to manage and rebuild software RAID after a disk replacement on a server using UEFI boot mode -updated: 2025-12-03 +updated: 2025-12-05 --- ## Objective diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md index 719ffa7a887..96d2370ae91 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md @@ -1,7 +1,7 @@ --- title: "Gestion et reconstruction d'un RAID logiciel sur les serveurs utilisant le mode de démarrage UEFI" excerpt: Découvrez comment gérer et reconstruire un RAID logiciel après un remplacement de disque sur un serveur utilisant le mode de démarrage UEFI -updated: 2025-12-03 +updated: 2025-12-05 --- ## Objectif @@ -45,7 +45,7 @@ Lorsque vous achetez un nouveau serveur, vous pouvez ressentir le besoin d'effec - [Suppression du disque défectueux](#diskremove) - [Reconstruction du RAID](#raidrebuild) - [Reconstruction du RAID après le remplacement du disque principal (mode de secours)](#rescuemode) - - [Re création de la partition système EFI](#recreateesp) + - [Recréation de la partition système EFI](#recreateesp) - [Reconstruction du RAID lorsque les partitions EFI ne sont pas synchronisées après des mises à jour majeures du système (ex. GRUB)](efiraodgrub) - [Ajout de l'étiquette à la partition SWAP (si applicable)](#swap-partition) - [Reconstruction du RAID en mode normal](#normalmode) @@ -186,7 +186,7 @@ nvme0n1 └─nvme0n1p5 iso9660 Joliet Extension config-2 2025-08-05-14-55-41-00 ``` -Notez les dispositifs, les partitions et leurs points de montage ; c'est important, surtout après le remplacement d'un disque. +Prenez note des dispositifs, des partitions et de leurs points de montage ; c'est important, surtout après le remplacement d'un disque. À partir des commandes et résultats ci-dessus, nous avons : @@ -383,7 +383,7 @@ unused devices: #### Retrait du disque défectueux -Tout d'abord, nous marquons les partitions **nvme0n1p2** et **nvme0n1p3** comme défectueuses. 
+Tout d'abord, nous marquons les partitions **nvme0n1p2** et **nvme0n1p3** comme défectueuses. ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/nvme0n1p2 @@ -655,7 +655,7 @@ Ici, nous supposons que les deux partitions ont été synchronisées et contienn > Si une mise à jour majeure du système, telle qu'une mise à jour du noyau ou de GRUB, a eu lieu et que les deux partitions n'ont pas été synchronisées, consultez cette [section](#rebuilding-raid-when-efi-partitions-are-not-synchronized-after-major-system-updates-eg-grub) une fois que vous avez terminé la création de la nouvelle partition EFI System. > -Tout d'abord, nous formattons la partition : +Tout d'abord, nous formatons la partition : ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkfs.vfat /dev/nvme0n1p1 @@ -700,7 +700,6 @@ total size is 6,097,843 speedup is 1.00 Une fois cela fait, nous démontons les deux partitions : - ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme0n1p1 root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme1n1p1 @@ -745,13 +744,13 @@ Les résultats ci-dessus montrent que la nouvelle partition EFI a été créée #### Reconstruction du RAID lorsque les partitions EFI ne sont pas synchronisées après des mises à jour majeures du système (GRUB) -/// details | Développer cette section +/// details | Développez cette section > [!warning] > Veuillez suivre les étapes de cette section uniquement si cela s'applique à votre cas. > -Lorsque les partitions système EFI ne sont pas synchronisées après des mises à jour majeures du système qui modifient/affectent le GRUB, et que le disque principal sur lequel la partition est montée est remplacé, le démarrage à partir d'un disque secondaire contenant une ESP obsolète peut ne pas fonctionner. +Lorsque les partitions système EFI ne sont pas synchronisées après des mises à jour majeures du système qui modifient/affectent le GRUB, et que le disque principal sur lequel la partition est montée est remplacé, le démarrage à partir d'un disque secondaire contenant une ESP obsolète peut ne pas fonctionner. Dans ce cas, en plus de reconstruire le RAID et de recréer la partition système EFI en mode rescue, vous devez également réinstaller le GRUB sur celle-ci. @@ -904,9 +903,9 @@ Nous avons maintenant terminé avec succès la reconstruction RAID sur le serveu #### Reconstruction du RAID en mode normal -/// details | Développer cette section +/// details | Développez cette section -Si votre serveur est capable de démarrer en mode normal après un remplacement de disque, vous pouvez suivre les étapes suivantes pour reconstruire le RAID. +Si votre serveur est capable de démarrer en mode normal après un remplacement de disque, vous pouvez suivre les étapes ci-dessous pour reconstruire le RAID. Une fois le disque remplacé, nous copions la table de partition du disque sain (dans cet exemple, nvme1n1) vers le nouveau (nvme0n1). @@ -918,7 +917,7 @@ sgdisk -R /dev/nvme0n1 /dev/nvme1n1 La commande doit être dans ce format : `sgdisk -R /dev/nouveau disque /dev/disque sain`. 
-Une fois cela fait, l'étape suivante consisteà attribuer un GUID aléatoire au nouveau disque pour éviter les conflits de GUID avec d'autres disques : +Une fois cela fait, l'étape suivante consiste à attribuer un GUID aléatoire au nouveau disque pour éviter les conflits de GUID avec d'autres disques : ```sh sgdisk -G /dev/nvme0n1 @@ -965,7 +964,7 @@ Tout d'abord, nous installons les outils nécessaires : [user@server_ip ~]# sudo yum install dosfstools ``` -Ensuite, nous formattons la partition. Dans notre exemple `nvme0n1p1` : +Ensuite, nous formatons la partition. Dans notre exemple `nvme0n1p1` : ```sh [user@server_ip ~]# sudo mkfs.vfat /dev/nvme0n1p1 @@ -1056,9 +1055,9 @@ Nous avons maintenant terminé avec succès la reconstruction RAID. [Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) Pour les services spécialisés (SEO, développement, etc.), contactez [les partenaires OVHcloud](/links/partner). - + Si vous avez besoin d'une assistance pour utiliser et configurer vos solutions OVHcloud, veuillez consulter nos [offres de support](/links/support). -Si vous avez besoin de formation ou d'une assistance technique pour mettre en place nos solutions, contactez votre représentant commercial ou cliquez sur [ce lien](/links/professional-services) pour obtenir un devis et demander à nos experts de Services Professionnels d'intervenir sur votre cas d'utilisation spécifique. +Si vous avez besoin de formation ou d'une assistance technique pour mettre en place nos solutions, contactez votre représentant commercial ou cliquez sur [ce lien](/links/professional-services) pour obtenir un devis et demander à nos experts de l'équipe Professional Services d'intervenir sur votre cas d'utilisation spécifique. Rejoignez notre [communauté d'utilisateurs](/links/community). \ No newline at end of file From c50ffa5cd908f7a9b3eac22b38fd1695b82a33b4 Mon Sep 17 00:00:00 2001 From: Montrealhub <89825661+Jessica41@users.noreply.github.com> Date: Fri, 5 Dec 2025 15:11:47 -0500 Subject: [PATCH 6/8] Update guide.fr-fr.md --- .../dedicated_servers/raid_soft/guide.fr-fr.md | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md index 8b04a0d900c..ba75b528735 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md @@ -47,10 +47,6 @@ Pour vérifier si un serveur s'exécute en mode BIOS ou en mode UEFI, exécutez Dans une session de ligne de commande, tapez le code suivant pour déterminer l'état actuel du RAID. -### Retrait du disque - -La vérification de l’état actuel du RAID s’effectue via la commande suivante : - ```sh [user@server_ip ~]# cat /proc/mdstat @@ -679,4 +675,4 @@ Si vous souhaitez bénéficier d'une assistance à l'usage et à la configuratio Si vous avez besoin d'une formation ou d'une assistance technique pour la mise en oeuvre de nos solutions, contactez votre commercial ou cliquez sur [ce lien](/links/professional-services) pour obtenir un devis et demander une analyse personnalisée de votre projet à nos experts de l’équipe Professional Services. -Échangez avec notre [communauté d'utilisateurs](/links/community). \ No newline at end of file +Échangez avec notre [communauté d'utilisateurs](/links/community). 
From d7cdf1ff6ae0738e1139011fb5d0c1b959cbdebc Mon Sep 17 00:00:00 2001 From: Yoann Cosse Date: Wed, 10 Dec 2025 18:00:44 +0100 Subject: [PATCH 7/8] Translations and duplication --- .../raid_soft/guide.de-de.md | 552 +++++++---- .../raid_soft/guide.en-asia.md | 619 +++++++++--- .../raid_soft/guide.en-au.md | 621 +++++++++--- .../raid_soft/guide.en-ca.md | 621 +++++++++--- .../raid_soft/guide.en-gb.md | 2 +- .../raid_soft/guide.en-ie.md | 621 +++++++++--- .../raid_soft/guide.en-sg.md | 621 +++++++++--- .../raid_soft/guide.en-us.md | 621 +++++++++--- .../raid_soft/guide.es-es.md | 544 +++++++---- .../raid_soft/guide.es-us.md | 546 +++++++---- .../raid_soft/guide.fr-ca.md | 592 +++++++++--- .../raid_soft/guide.fr-fr.md | 2 +- .../raid_soft/guide.it-it.md | 538 +++++++---- .../raid_soft/guide.pl-pl.md | 555 +++++++---- .../raid_soft/guide.pt-pt.md | 543 +++++++---- .../dedicated_servers/raid_soft/meta.yaml | 3 +- .../raid_soft_uefi/guide.de-de.md | 849 ++++++++++++++++ .../raid_soft_uefi/guide.en-gb.md | 2 +- .../raid_soft_uefi/guide.es-es.md | 908 ++++++++++++++++++ .../raid_soft_uefi/guide.fr-fr.md | 4 +- .../raid_soft_uefi/guide.it-it.md | 896 +++++++++++++++++ .../raid_soft_uefi/guide.pl-pl.md | 841 ++++++++++++++++ .../raid_soft_uefi/guide.pt-pt.md | 905 +++++++++++++++++ .../raid_soft_uefi/meta.yaml | 3 +- 24 files changed, 9804 insertions(+), 2205 deletions(-) create mode 100644 pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.de-de.md create mode 100644 pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.es-es.md create mode 100644 pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.it-it.md create mode 100644 pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pl-pl.md create mode 100644 pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pt-pt.md diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.de-de.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.de-de.md index faea64a9352..694eb58108a 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.de-de.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.de-de.md @@ -1,32 +1,74 @@ --- -title: Software-RAID konfigurieren und neu erstellen -excerpt: "Erfahren Sie hier, wie Sie den Status des Software-RAID Ihres Servers überprüfen und im Fall eines Hardware-Austausches rekonfigurieren" -updated: 2022-10-11 +title: Verwalten und Neuaufbauen von Software-RAID auf Servern im Legacy-Boot-Modus (BIOS) +excerpt: Erfahren Sie, wie Sie Software-RAID verwalten und nach einem Wechsel der Festplatte auf Ihrem Server im Legacy-Boot-Modus (BIOS) neu aufbauen können +updated: 2025-12-11 --- -> [!primary] -> Diese Übersetzung wurde durch unseren Partner SYSTRAN automatisch erstellt. In manchen Fällen können ungenaue Formulierungen verwendet worden sein, z.B. bei der Beschriftung von Schaltflächen oder technischen Details. Bitte ziehen Sie im Zweifelsfall die englische oder französische Fassung der Anleitung zu Rate. Möchten Sie mithelfen, diese Übersetzung zu verbessern? Dann nutzen Sie dazu bitte den Button "Beitragen" auf dieser Seite. -> - ## Ziel -RAID (Redundant Array of Independent Disks) ist ein System, das Datenverlust auf Servern entgegenwirkt, indem es diese Daten auf mehreren Disks speichert. +Redundant Array of Independent Disks (RAID) ist eine Technologie, die Datenverluste auf einem Server durch die Replikation von Daten auf zwei oder mehr Festplatten minimiert. 
+ +Die Standard-RAID-Ebene für OVHcloud-Serverinstallationen ist RAID 1, wodurch der Platz, den Ihre Daten einnehmen, verdoppelt wird und der nutzbare Festplattenplatz effektiv halbiert wird. + +**Dieses Handbuch erklärt, wie Sie ein Software-RAID verwalten und nach einem Festplattentausch auf Ihrem Server im Legacy-Boot-Modus (BIOS) neu aufbauen können.** -Das RAID Level für OVHcloud Server-Installationen ist standardmäßig RAID-1, was den von Ihren Daten verbrauchten Speicherplatz verdoppelt und somit den nutzbaren Platz halbiert. +Bevor wir beginnen, beachten Sie bitte, dass dieses Handbuch sich auf Dedicated Server konzentriert, die den Legacy-Boot-Modus (BIOS) verwenden. Wenn Ihr Server den UEFI-Modus verwendet (neuere Motherboards), konsultieren Sie bitte dieses Handbuch [Verwalten und Neuaufbauen von Software-RAID auf Servern im UEFI-Boot-Modus](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). -**Diese Anleitung erklärt, wie Sie das RAID Array Ihres Servers konfigurieren, falls dieses aufgrund von Störungen oder Beschädigung neu eingerichtet werden muss.** +Um zu prüfen, ob ein Server im Legacy-BIOS- oder UEFI-Modus läuft, führen Sie den folgenden Befehl aus: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` ## Voraussetzungen -- Sie haben einen [Dedicated Server](/links/bare-metal/bare-metal) mit Software-RAID-Konfiguration. -- Sie haben administrativen Zugriff (sudo) auf Ihren Server über SSH. +- Ein [Dedicated Server](/links/bare-metal/bare-metal) mit Software-RAID-Konfiguration +- Administrative (sudo) Zugriffsrechte auf den Server über SSH +- Grundkenntnisse zu RAID und Partitionen + +## ## In der praktischen Anwendung + +Wenn Sie einen neuen Server erwerben, könnten Sie sich möglicherweise entscheiden, eine Reihe von Tests und Aktionen durchzuführen. Ein solcher Test könnte darin bestehen, einen Festplattenausfall zu simulieren, um den Rebuild-Prozess des RAIDs zu verstehen und sich darauf vorzubereiten, falls dies jemals tatsächlich passiert. + +### Inhaltsoverview + +- [Grundlegende Informationen](#basicinformation) +- [Simulieren eines Festplattenausfalls](#diskfailure) + - [Entfernen der defekten Festplatte](#diskremove) +- [Neuaufbau des RAIDs](#raidrebuild) + - [Neuaufbau des RAIDs im Rescue-Modus](#rescuemode) + - [Hinzufügen des Labels zur SWAP-Partition (falls zutreffend)](#swap-partition) + - [Neuaufbau des RAIDs im Normalmodus](#normalmode) + + + +### Grundlegende Informationen + +Geben Sie in einer Befehlszeilen-Sitzung den folgenden Code ein, um den aktuellen RAID-Status zu ermitteln: + +```sh +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 nvme0n1p2[1] nvme0n1p20] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +Dieser Befehl zeigt uns, dass wir zwei Software-RAID-Geräte eingerichtet haben, wobei **md4** das größte ist. Das **md4**-RAID-Gerät besteht aus zwei Partitionen, die als **nvme1n1p4** und **nvme0n1p4** bezeichnet werden. -## In der praktischen Anwendung +Die [UU] bedeutet, dass alle Festplatten normal funktionieren. Ein `_` würde eine defekte Festplatte anzeigen. 
-Den aktuellen Status des RAID erhalten Sie über folgenden Befehl: +Wenn Sie einen Server mit SATA-Festplatten haben, erhalten Sie die folgenden Ergebnisse: ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda2[1] sdb2[0] @@ -40,12 +82,10 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Dieser Befehl zeigt, dass aktuell zwei RAID Arrays eingerichtet sind, wobei "md4" die größte Partition ist. Diese Partition besteht aus zwei Disks: “sda4” und “sdb4”. `[UU]` zeigt an, dass alle Disks normal funktionieren. Ein “`_`” an dieser Stelle bedeutet, dass eine Disk defekt ist. - -Dieser Befehl zeigt zwar die RAID Disks an, jedoch nicht die Größe der Partitionen selbst. Diese Information erhalten Sie mit folgendem Befehl: +Obwohl dieser Befehl unsere RAID-Volumes zurückgibt, sagt er uns nicht die Größe der Partitionen selbst. Wir können diese Informationen mit dem folgenden Befehl erhalten: ```sh -fdisk -l +[user@server_ip ~]# sudo fdisk -l Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors Disk model: HGST HUS724020AL @@ -87,73 +127,16 @@ Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes ``` -Mit dem Befehl `fdisk -l` können Sie auch Ihren Partitionstyp identifizieren. Dies ist eine wichtige Information, die Sie bei der Rekonstruktion Ihres RAID im Falle eines Ausfalls beachten müssen. +Der Befehl `fdisk -l` erlaubt es Ihnen auch, den Typ Ihrer Partition zu identifizieren. Dies ist eine wichtige Information, wenn es darum geht, Ihr RAID im Falle eines Festplattenausfalls neu aufzubauen. -Für **GPT** Partitionen wird zurückgegeben: `Disklabel type: gpt`. +Für **GPT**-Partitionen wird in Zeile 6 angezeigt: `Disklabel type: gpt`. Diese Information ist nur sichtbar, wenn der Server im Normalmodus läuft. -```sh -Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors -Disk model: HGST HUS724020AL -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: gpt' -Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F -``` +Basierend auf den Ergebnissen von `fdisk -l`, können wir erkennen, dass `/dev/md2` 888,8 GB umfasst und `/dev/md4` 973,5 GB enthält. -Für **MBR** Partitionen wird zurückgegeben: `Disklabel type: dos`. +Alternativ bietet der Befehl `lsblk` eine andere Ansicht der Partitionen: ```sh -Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors -Disk model: QEMU HARDDISK -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: dos' -Disk identifier: 0x150f6797 -``` - -Die Ausgabe zeigt, das `/dev/md2` 888,8 GB und `/dev/md4` 973,5 GB enthält. Über den Befehl “mount” erhalten Sie das Layout der Disk. 
- -```sh -mount - -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -/dev/md4 on /home type ext3 (rw,relatime) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) -``` - -Alternativ kann mit dem Befehl `lsblk` eine andere Ansicht zu den Partitionen angezeigt werden: - -```sh -lsblk +[user@server_ip ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk @@ -173,169 +156,253 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Die Disks sind aktuell standardmäßig gemountet. Um eine Disk aus dem RAID zu entfernen, muss diese zuerst ausgehängt und dann ein Fehler simuliert werden, um sie endgültig zu entfernen. Um `/dev/sda4` aus dem RAID zu entfernen, folgen Sie den nachstehenden Schritten und verwenden Sie zunächst folgenden Befehl: +Wir notieren uns die Geräte, Partitionen und ihre Mountpoints. Aus den oben genannten Befehlen und Ergebnissen haben wir: + +- Zwei RAID-Arrays: `/dev/md2` und `/dev/md4`. +- Vier Partitionen, die Teil des RAIDs sind, mit den Mountpoints: `/` und `/home`. + + + +### Simulieren eines Festplattenausfalls + +Jetzt, da wir alle notwendigen Informationen haben, können wir einen Festplattenausfall simulieren und die Tests durchführen. In diesem Beispiel werden wir die Festplatte `sda` als defekt markieren. 
+ +Die bevorzugte Methode, dies zu tun, ist über den Rescue-Modus-Umgebung von OVHcloud. + +Starten Sie zunächst den Server im Rescue-Modus neu und melden Sie sich mit den bereitgestellten Anmeldeinformationen an. + +Um eine Festplatte aus dem RAID zu entfernen, ist der erste Schritt, sie als **defekt** zu markieren und die Partitionen aus ihren jeweiligen RAID-Arrays zu entfernen. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1] sdb2[0] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0] sdb4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +Aus der obigen Ausgabe ergibt sich, dass sda aus zwei Partitionen besteht, die im RAID sind, nämlich **sda2** und **sda4**. + + + +#### Entfernen der defekten Festplatte + +Zunächst markieren wir die Partitionen **sda2** und **sda4** als defekt. ```sh -umount /dev/md4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 +# mdadm: set /dev/sda2 faulty in /dev/md2 ``` -> [!warning] -> Beachten Sie, dass, falls Sie mit dem Account `root` eingeloggt sind, folgende Nachricht erhalten können, wenn Sie versuchen, die Partition zu unmounten (in unserem Fall wird die Partition md4 in `/home` gemountet): -> ->
umount: /home: target is busy
-> -> Wechseln Sie in diesem Fall zu einem anderen sudo-Benutzer (in diesem Fall `debian`) und verwenden Sie folgenden Befehl: -> ->
debian@ns000000:/$ sudo umount /dev/md4
-> -> Wenn Sie noch keine anderen User-Accounts haben, [erstellen Sie einen](/pages/bare_metal_cloud/dedicated_servers/changing_root_password_linux_ds). +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 --fail /dev/sda4 +# mdadm: set /dev/sda4 faulty in /dev/md4 +``` -Als Ergebnis erhalten Sie: +Wir haben nun einen RAID-Ausfall simuliert. Wenn wir den Befehl `cat /proc/mdstat` ausführen, erhalten wir die folgende Ausgabe: ```sh -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1](F) sdb2[0] + 931954688 blocks super 1.2 [2/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: ``` -`/dev/md4` ist nicht länger gemountet. Das RAID ist jedoch noch aktiv. Daher ist es notwendig, einen Fehler zu simulieren, um die Disk zu entfernen. Dies geschieht über folgenden Befehl: +Wie wir oben sehen können, zeigt das [F] neben den Partitionen an, dass die Festplatte fehlerhaft ist oder defekt ist. 
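+
+Der fehlerhafte Zustand lässt sich bei Bedarf auch pro Array genauer prüfen; eine kleine Skizze (Array-Namen wie im Beispiel angenommen):
+
+```sh
+# Zeigt Status und Anzahl der als fehlerhaft markierten Geräte pro Array
+mdadm --detail /dev/md2 | grep -E 'State|Failed Devices|faulty'
+mdadm --detail /dev/md4 | grep -E 'State|Failed Devices|faulty'
+```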
+
+Als nächstes entfernen wir diese Partitionen aus den RAID-Arrays.
+
+```sh
+root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2
+# mdadm: hot removed /dev/sda2 from /dev/md2
+```
+
+```sh
+root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md4 --remove /dev/sda4
+# mdadm: hot removed /dev/sda4 from /dev/md4
+```
+
+Um sicherzustellen, dass wir eine Festplatte erhalten, die einem leeren Laufwerk ähnelt, verwenden wir den folgenden Befehl. Ersetzen Sie **sda** durch Ihre eigenen Werte:
 
 ```sh
-sudo mdadm --remove /dev/md4 /dev/sda4
+shred -s10M -n1 /dev/sda1
+shred -s10M -n1 /dev/sda2
+shred -s10M -n1 /dev/sda3
+shred -s10M -n1 /dev/sda4
+shred -s10M -n1 /dev/sda
 ```
 
-Um zu überprüfen, ob die Partition entfernt wurde, verwenden Sie folgenden Befehl:
+Sobald die Festplatte ersetzt und die Partitionen wieder zum RAID hinzugefügt wurden (wie im Abschnitt zum Neuaufbau weiter unten beschrieben), verwenden Sie den folgenden Befehl, um den RAID-Neuaufbau zu überwachen:
 
 ```sh
-cat /proc/mdstat
+[user@server_ip ~]# cat /proc/mdstat
 
 Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
-md2 : active raid1 sda2[1] sdb2[0]
+md2 : active raid1 sda2[0] sdb2[1]
       931954688 blocks super 1.2 [2/2] [UU]
-      bitmap: 4/7 pages [16KB], 65536KB chunk
+      bitmap: 4/4 pages [16KB], 65536KB chunk
 
-md4 : active raid1 sdb4[1]
-      1020767232 blocks super 1.2 [2/1] [_U]
+md4 : active raid1 sda4[0] sdb4[1]
+      1020767232 blocks super 1.2 [2/1] [_U]
+      [============>........]  recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec
       bitmap: 0/8 pages [0KB], 65536KB chunk
-
 unused devices: 
 ```
 
-Die Ausgabe des nachfolgenden Befehls bestätigt, dass die Partition entfernt wurde.
+Zuletzt versehen wir die [SWAP]-Partition mit einer Bezeichnung (Label) und binden sie wieder ein (falls zutreffend).
+
+Um der SWAP-Partition eine Bezeichnung hinzuzufügen:
 
 ```sh
-mdadm --detail /dev/md4
+[user@server_ip ~]# sudo mkswap /dev/sda4 -L swap-sda4
+```
 
-/dev/md4:
-        Version : 1.2
-  Creation Time : Tue Jan 24 15:35:02 2023
-     Raid Level : raid1
-     Array Size : 1020767232 (973.48 GiB 1045.27 GB)
-  Used Dev Size : 1020767232 (973.48 GiB 1045.27 GB)
-   Raid Devices : 2
-  Total Devices : 1
-    Persistence : Superblock is persistent
+Rufen Sie als nächstes die UUIDs beider Swap-Partitionen ab:
 
-  Intent Bitmap : Internal
+```sh
+[user@server_ip ~]# sudo blkid -s UUID /dev/sda4
+/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15"
+[user@server_ip ~]# sudo blkid -s UUID /dev/sdb4
+/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a"
+```
 
-    Update Time : Tue Jan 24 16:28:03 2023
-          State : clean, degraded
- Active Devices : 1
-Working Devices : 1
- Failed Devices : 0
-  Spare Devices : 0
+Wir ersetzen die alte UUID der Swap-Partition (**sda4**) durch die neue in `/etc/fstab`. 
-Consistency Policy : bitmap +Beispiel: - Name : md4 - UUID : 7b5c1d80:0a7ab4c2:e769b5e5:9c6eaa0f - Events : 21 +```sh +[user@server_ip ~]# sudo nano etc/fstab - Number Major Minor RaidDevice State - - 0 0 0 removed - 1 8 20 1 active sync /dev/sdb4 +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=BIOS /boot vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 ``` -### RAID neu einrichten - -Wenn die Disk ersetzt wurde, kopieren Sie die Partitionstabelle einer funktionsfähigen Disk (in unserem Beispiel “sdb”) zur neuen Disk (“sda”): +Basierend auf den oben genannten Ergebnissen ist die alte UUID `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` und sollte durch die neue `b3c9e03a-52f5-4683-81b6-cc10091fcd15` ersetzt werden. Stellen Sie sicher, dass Sie die richtige UUID ersetzen. -**Für GPT Partitionen** +Als nächstes prüfen wir, ob alles ordnungsgemäß gemountet ist, mit dem folgenden Befehl: ```sh -sgdisk -R /dev/sda /dev/sdb +[user@server_ip ~]# sudo mount -av +/ : ignored +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored ``` -Der Befehl muss im folgenden Format sein: `sgdisk -R /dev/newdisk /dev/healthydisk`. - -Nach diesem Vorgang ist der nächste Schritt, die GUID für die neue Disk zu randomisieren, um Konflikte mit anderen Disks zu vermeiden: +Führen Sie den folgenden Befehl aus, um die Swap-Partition zu aktivieren: ```sh -sgdisk -G /dev/sda +[user@server_ip ~]# sudo swapon -av ``` -**Für MBR Partitionen** +Laden Sie anschließend das System mit dem folgenden Befehl neu: ```sh -sfdisk -d /dev/sdb | sfdisk /dev/sda +[user@server_ip ~]# sudo systemctl daemon-reload ``` -Der Befehl muss im folgenden Format sein: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +Wir haben nun erfolgreich das RAID-Neuaufbau abgeschlossen. + + + +/// details | **Neuaufbau des RAIDs im Rescue-Modus** + +Falls Ihr Server nach einem Wechsel der Festplatte nicht im normalen Modus neu starten kann, wird er im Rescue-Modus neu gestartet. + +In diesem Beispiel ersetzen wir die Festplatte `sdb`. + +Nachdem die Festplatte ausgetauscht wurde, müssen wir die Partitionstabelle von der gesunden Festplatte (in diesem Beispiel sda) auf die neue (sdb) kopieren. + +> [!tabs] +> **Für GPT-Partitionen** +>> +>> ```sh +>> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> Der Befehl sollte in diesem Format lauten: `sgdisk -R /dev/newdisk /dev/healthydisk` +>> +>> Beispiel: +>> +>> ```sh +>> sudo sgdisk -R /dev/sdb /dev/sda +>> ``` +>> +>> Sobald dies erledigt ist, ist der nächste Schritt, die GUID der neuen Festplatte zu randomisieren, um Konflikte mit anderen Festplatten zu vermeiden: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdb +>> ``` +>> +>> Falls Sie die folgende Meldung erhalten: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> Können Sie einfach den Befehl `partprobe` ausführen. +>> +> **Für MBR-Partitionen** +>> +>> ```sh +>> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb +>> ``` +>> +>> Der Befehl sollte in diesem Format lauten: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +>> + +Wir können nun das RAID-Array neu aufbauen. 
Der folgende Code zeigt, wie wir die neuen Partitionen (sdb2 und sdb4) wieder ins RAID-Array einfügen können. -Jetzt können Sie das RAID Array neu konfigurieren. Der nachstehende Code zeigt, wie das Layout der Partition `/dev/md4` mit der zuvor kopierten Partitionstabelle von “sda” wiederhergestellt werden kann: +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 +# mdadm: added /dev/sdb2 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 +# mdadm: re-added /dev/sdb4 +``` + +Verwenden Sie den Befehl `cat /proc/mdstat`, um das RAID-Neuaufbau zu überwachen: ```sh -mdadm --add /dev/md4 /dev/sda4 -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 1/7 pages [4KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk -md4 : active raid1 sda4[0] sdb4[1] - 1020767232 blocks super 1.2 [2/2] [UU] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -Überprüfen Sie die Details des RAID mit folgendem Befehl: +Für weitere Details zu den RAID-Array(s): ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -362,24 +429,129 @@ mdadm --detail /dev/md4 Events : 0.95 Number Major Minor RaidDevice State - 0 8 2 0 spare rebuilding /dev/sda4 - 1 8 18 1 active sync /dev/sdb4 + 0 8 2 0 active sync /dev/sda4 + 1 8 18 1 spare rebuilding /dev/sdb4 +``` + + + +#### Bezeichnung der SWAP-Partition hinzufügen (falls zutreffend) + +Sobald das RAID-Neuaufbau abgeschlossen ist, mounten wir die Partition, die die Wurzel unseres Betriebssystems enthält, auf `/mnt`. In unserem Beispiel ist dies die Partition `md4`. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt +``` + +Wir fügen die Bezeichnung unserer Swap-Partition mit dem folgenden Befehl hinzu: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sdb4 -L swap-sdb4 +mkswap: /dev/sdb4: warning: wiping old swap signature. 
+Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-sdb4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Als nächstes mounten wir die folgenden Verzeichnisse, um sicherzustellen, dass alle Manipulationen im chroot-Umgebung ordnungsgemäß funktionieren: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Als nächstes greifen wir in die `chroot`-Umgebung: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt +``` + +Wir rufen die UUIDs beider Swap-Partitionen ab: + +```sh +root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 +root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 +``` + +Beispiel: + +```sh +blkid /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +blkid /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +Als nächstes ersetzen wir die alte UUID der Swap-Partition (**sdb4**) durch die neue in `/etc/fstab`: + +```sh +root@rescue12-customer-eu:/# nano etc/fstab ``` -Das RAID wurde neu eingerichtet. Mounten Sie die Partition (in diesem Beispiel `/dev/md4`) mit folgendem Befehl: +Beispiel: ```sh -mount /dev/md4 /home +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /home ext4 defaults 0 0 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 ``` +Stellen Sie sicher, dass Sie die richtige UUID ersetzen. In unserem obigen Beispiel ist die UUID, die ersetzt werden muss, `d6af33cf-fc15-4060-a43c-cb3b5537f58a` durch die neue `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Stellen Sie sicher, dass Sie die richtige UUID ersetzen. + +Als nächstes stellen wir sicher, dass alles ordnungsgemäß gemountet ist: + +```sh +root@rescue12-customer-eu:/# mount -av +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Aktivieren Sie die Swap-Partition mit dem folgenden Befehl: + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/sda4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sda4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sda4 +swapon: /dev/sdb4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sdb4 +``` + +Wir verlassen die `chroot`-Umgebung mit exit und laden das System neu: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload +``` + +Wir entmounten alle Festplatten: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` + +Wir haben nun erfolgreich das RAID-Neuaufbau auf dem Server abgeschlossen und können ihn nun im normalen Modus neu starten. 
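+
+Hinweis: Vor dem Neustart im normalen Modus kann es sinnvoll sein, das Ende der Synchronisation abzuwarten und den Status noch einmal zu prüfen. Eine minimale Skizze (unter der Annahme, dass die Arrays wie im Beispiel `md2` und `md4` heißen):
+
+```sh
+# Blockiert, bis Resync bzw. Recovery der Arrays abgeschlossen ist
+mdadm --wait /dev/md2 /dev/md4
+# Abschließende Kontrolle: beide Arrays sollten wieder [UU] anzeigen
+cat /proc/mdstat
+```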
+ ## Weiterführende Informationen +[Hot Swap - Software-RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) + +[OVHcloud API und Speicher](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) -[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) (Englisch) +[Verwalten von Hardware-RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) -[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) (Englisch) +[Hot Swap - Hardware-RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) -[Managing hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard)(Englisch) +Für spezialisierte Dienstleistungen (SEO, Entwicklung usw.) kontaktieren Sie [OVHcloud Partner](/links/partner). -[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) (Englisch) +Wenn Sie bei der Nutzung und Konfiguration Ihrer OVHcloud-Lösungen Unterstützung benötigen, wenden Sie sich an unsere [Support-Angebote](/links/support). -Für den Austausch mit unserer User Community gehen Sie auf . +Wenn Sie Schulungen oder technische Unterstützung benötigen, um unsere Lösungen umzusetzen, wenden Sie sich an Ihren Vertriebsmitarbeiter oder klicken Sie auf [diesen Link](/links/professional-services), um ein Angebot zu erhalten und unsere Expert \ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-asia.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-asia.md index de7a4d03eeb..2a20f7e415d 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-asia.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-asia.md @@ -1,7 +1,7 @@ --- -title: How to configure and rebuild software RAID -excerpt: Find out how to verify the state of your software RAID and rebuild it after a disk replacement -updated: 2023-08-21 +title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode +excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode +updated: 2025-12-11 --- ## Objective @@ -10,21 +10,65 @@ Redundant Array of Independent Disks (RAID) is a technology that mitigates data The default RAID level for OVHcloud server installations is RAID 1, which doubles the space taken up by your data, effectively halving the useable disk space. -**This guide explains how to configure your server’s RAID array in the event that it needs to be rebuilt due to corruption or disk failure.** +**This guide explains how to manage and rebuild a software RAID in the event of a disk replacement on your server in legacy boot mode (BIOS).** + +Before we begin, please note that this guide focuses on Dedicated servers that use legacy boot (BIOS) mode. If your server uses the UEFI mode (newer motherboards), refer to this guide [Managing and rebuilding software RAID on servers in UEFI boot mode](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). + +To check whether a server runs on legacy BIOS or UEFI mode, run the following command: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` ## Requirements - A [Dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration - Administrative (sudo) access to the server via SSH +- Understanding of RAID and partitions ## Instructions -### Removing the disk +When you purchase a new server, you may feel the need to perform a series of tests and actions. 
One such test could be to simulate a disk failure in order to understand the RAID rebuild process and prepare yourself in case it ever happens. + +### Content overview + +- [Basic Information](#basicinformation) +- [Simulating a disk failure](#diskfailure) + - [Removing the failed disk](#diskremove) +- [Rebuilding the RAID](#raidrebuild) + - [Rebuilding the RAID in rescue mode](#rescuemode) + - [Adding the label to the SWAP partition (if applicable)](#swap-partition) + - [Rebuilding the RAID in normal mode](#normalmode) + + + +### Basic Information In a command line session, type the following code to determine the current RAID status: ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 nvme0n1p2[1] nvme0n1p20] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +This command shows us that we have two software RAID devices currently set up, with **md4** being the largest one. The **md4** RAID device consists of two partitions, which are known as **nvme1n1p4** and **nvme0n1p4**. + +The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. + +If you have a server with SATA disks, you would get the following results: + +```sh +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda2[1] sdb2[0] @@ -38,12 +82,10 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -This command shows us that we have two RAID arrays currently set up, with md4 being the largest partition. The partition consists of two disks, which are known as sda4 and sdb4. The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. - Although this command returns our RAID volumes, it doesn't tell us the size of the partitions themselves. We can find this information with the following command: ```sh -fdisk -l +[user@server_ip ~]# sudo fdisk -l Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors Disk model: HGST HUS724020AL @@ -87,71 +129,14 @@ I/O size (minimum/optimal): 512 bytes / 512 bytes The `fdisk -l` command also allows you to identify your partition type. This is an important information when it comes to rebuilding your RAID in case of a disk failure. -For **GPT** partitions, the command will return: `Disklabel type: gpt`. +For **GPT** partitions, line 6 will display: `Disklabel type: gpt`. This information can only been seen when the server is in normal mode. -```sh -Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors -Disk model: HGST HUS724020AL -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: gpt' -Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F -``` - -For **MBR** partitions, the command will return: `Disklabel type: dos`. - -```sh -Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors -Disk model: QEMU HARDDISK -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: dos' -Disk identifier: 0x150f6797 -``` - -We can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. 
If we were to run the mount command we can also find out the layout of the disk. - -```sh -mount - -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -/dev/md4 on /home type ext3 (rw,relatime) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) -``` +Still going by the results of `fdisk -l`, we can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. Alternatively, the `lsblk` command offers a different view of the partitions: ```sh -lsblk +[user@server_ip ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk @@ -171,90 +156,151 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -As the disks are currently mounted by default, to remove a disk from the RAID, we first need to unmount the disk, then simulate a failure, and finally remove it. We will remove `/dev/sda4` from the RAID with the following command: +We take note of the devices, partitions and their mount points. From the above commands and results, we have: + +- Two RAID arrays: `/dev/md2` and `/dev/md4`. +- Four partitions are part of the RAID with the mount points: `/` and `/home`. + + + +### Simulating a disk failure + +Now that we have all the necessary information, we can simulate a disk failure and proceed with the tests. In this example, we will fail the disk `sda`. 
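+
+Before marking anything as failed, it can help to double-check which arrays the partitions of `sda` belong to; a minimal sketch, using the device and array names from the example above:
+
+```sh
+# List the sda partitions that are currently part of a RAID array
+grep sda /proc/mdstat
+# Show the member devices and their state for each array
+sudo mdadm --detail /dev/md2 /dev/md4
+```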
+ +The preferred way to do this is via the OVHcloud rescue mode environment. + +First reboot the server in rescue mode and log in with the provided credentials. + +To remove a disk from the RAID, the first step is to mark it as **Failed** and remove the partitions from their respective RAID arrays. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1] sdb2[0] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0] sdb4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +From the above output, sda consists of two partitions in RAID which are **sda2** and **sda4**. + + + +#### Removing the failed disk + +First we mark the partitions **sda2** and **sda4** as failed. ```sh -umount /dev/md4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 +# mdadm: set /dev/sda2 faulty in /dev/md2 ``` -> [!warning] -> Please note that if you are connected as the user `root`, you may get the following message when you try to unmount the partition (in our case, where our md4 partition is mounted in /home): -> ->
umount: /home: target is busy
-> -> In this case, you must log out as the user root and connect as a local user (in our case `debian`), and use the following command: -> ->
debian@ns000000:/$ sudo umount /dev/md4
-> -> If you do not have a local user, you need to [create one](/pages/bare_metal_cloud/dedicated_servers/changing_root_password_linux_ds). +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 --fail /dev/sda4 +# mdadm: set /dev/sda4 faulty in /dev/md4 +``` -This will provide us with the following output: +We have now simulated a failure of the RAID, when we run the `cat /proc/mdstat` command, we have the following output: ```sh -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1](F) sdb2[0] + 931954688 blocks super 1.2 [2/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +As we can see above, the [F] next to the partitions indicates that the disk has failed or is faulty. + +Next, we remove these partitions from the RAID arrays. 
+ +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 +# mdadm: hot removed /dev/sda2 from /dev/md2 +``` + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md4 --remove /dev/sda4 +# mdadm: hot removed /dev/sda4 from /dev/md4 ``` -As we can see the, entry of `/dev/md4` is no longer mounted. However, the RAID is still active, so we need to simulate a failure to remove the disk. We can do this with the following command: +To make sure that we get a disk that is similar to an empty disk, we use the following command. Replace **sda** with your own values: ```sh -sudo mdadm --fail /dev/md4 /dev/sda4 +shred -s10M -n1 /dev/sda1 +shred -s10M -n1 /dev/sda2 +shred -s10M -n1 /dev/sda3 +shred -s10M -n1 /dev/sda4 +shred -s10M -n1 /dev/sda +``` + +The disk now appears as a new, empty drive: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +sda 8:0 0 1.8T 0 disk +sdb 8:16 0 1.8T 0 disk +├─sdb1 8:17 0 1M 0 part +├─sdb2 8:18 0 888.9G 0 part +│ └─md2 9:2 0 888.8G 0 raid1 / +├─sdb3 8:19 0 512M 0 part [SWAP] +└─sdb4 8:20 0 973.6G 0 part + └─md4 9:4 0 973.5G 0 raid1 /home ``` -We have now simulated a failure of the RAID. The next step is to remove the partition from the RAID array with the following command: +If we run the following command, we see that our disk has been successfully "wiped": ```sh -sudo mdadm --remove /dev/md4 /dev/sda4 +parted /dev/sda +GNU Parted 3.5 +Using /dev/sda +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/sda: unrecognised disk label +Model: HGST HUS724020AL (SATA) +Disk /dev/sda: 1.8T +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: ``` -You can verify that the partition has been removed with the following command: +Our RAID status should now look like this: ```sh -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] - 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 4/7 pages [16KB], 65536KB chunk +md2 : active raid1 sdb2[0] + 931954688 blocks super 1.2 [1/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk md4 : active raid1 sdb4[1] - 1020767232 blocks super 1.2 [2/1] [_U] + 1020767232 blocks super 1.2 [1/2] [_U] bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -The following command will verify that the partition has been removed: +From the results above, we can see that only two partitions now appear in the RAID arrays. We have successfully failed the disk **sda** and we can now proceed with the disk replacement. 
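+
+When preparing the replacement request, it can also be useful to note the serial number of the disk to be replaced; a minimal sketch, assuming the `smartmontools` package is available in rescue mode:
+
+```sh
+# Print the model and serial number of the failed disk (here: /dev/sda)
+smartctl -i /dev/sda | grep -iE 'model|serial'
+```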
+ +For more information on how to prepare and request for a disk replacement, consult this [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement) + +If you run the following command, you can have more details on the RAID array(s): ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -286,56 +332,225 @@ Consistency Policy : bitmap 1 8 20 1 active sync /dev/sdb4 ``` + + ### Rebuilding the RAID -Once the disk has been replaced, we need to copy the partition table from a healthy disk (in this example, sdb) to the new one (sda) with the following command: +> [!warning] +> +> For most servers in software RAID, after a disk replacement, the server is able to boot in normal mode (on the healthy disk) to rebuild the RAID. However, if the server is not able to boot in normal mode, it will be rebooted in rescue mode to proceed with the RAID rebuild. +> -**For GPT partitions** + + +#### Rebuilding the RAID in normal mode + +The following steps are performed in normal mode. In our example, we have replaced the disk **sda**. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sdb) to the new one (sda). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> sudo sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk`. +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdX +>> ``` +>> +>> If you receive the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. If you still cannot see the newly-created partitions (e.g. with `lsblk`), you need to reboot the server before continuing. +>> +> **For MBR partitions** +>> +>> ```sh +>> [user@server_ip ~]# sudo sfdisk -d /dev/sdX | sfdisk /dev/sdX +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +>> + +Next, we add the partitions to the RAID: ```sh -sgdisk -R /dev/sda /dev/sdb +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/sda2 +# mdadm: added /dev/sda2 + +[user@server_ip ~]# sudo mdadm --add /dev/md4 /dev/sda4 +# mdadm: re-added /dev/sda4 ``` -The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` +Use the following command to monitor the RAID rebuild: -Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +```sh +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[0] sdb2[1] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 4/4 pages [16KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +Lastly, we add a label and mount the [SWAP] partition (if applicable). 
+ +To add a label the SWAP partition: ```sh -sgdisk -G /dev/sda +[user@server_ip ~]# sudo mkswap /dev/sda4 -L swap-sda4 ``` -**For MBR partitions** +Next, retrieve the UUIDs of both swap partitions: + +```sh +[user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +[user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` -Once the disk has been replaced, we need to copy the partition table from a healthy disk (in this example, sdb) to the new one (sda) with the following command: +We replace the old UUID of the swap partition (**sda4**) with the new one in `/etc/fstab`. + +Example: + +```sh +[user@server_ip ~]# sudo nano etc/fstab + +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=BIOS /boot vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Based on the above results, the old UUID is `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` and should be replaced with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the coorect UUID. + +Next, we verify that everything is properly mounted with the following command: + +```sh +[user@server_ip ~]# sudo mount -av +/ : ignored +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Run the following command to enable the swap partition: + +```sh +[user@server_ip ~]# sudo swapon -av +``` + +Then reload the system with the following command: ```sh -sfdisk -d /dev/sdb | sfdisk /dev/sda +[user@server_ip ~]# sudo systemctl daemon-reload ``` -The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +We have now successfully completed the RAID rebuild. + + + +/// details | **Rebuilding the RAID in rescue mode** + +If you server is unable to reboot in normal mode after a disk replacement, it will be rebooted in rescue mode. + +In this example, we are replacing the disk `sdb`. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sda) to the new one (sdb). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` +>> +>> Example: +>> +>> ```sh +>> sudo sgdisk -R /dev/sdb /dev/sda +>> ``` +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdb +>> ``` +>> +>> If you the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. +>> +> **For MBR partitions** +>> +>> ```sh +>> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +>> + +We can now rebuild the RAID array. The following code snippet shows how we can add the new partitions (sdb2 and sdb4) back in the RAID array. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 +# mdadm: added /dev/sdb2 + +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 +# mdadm: re-added /dev/sdb4 +``` -We can now rebuild the RAID array. The following code snippet shows how we can rebulid the `/dev/md4` partition layout with the recently-copied sda partition table: +Use the `cat /proc/mdstat` command to monitor the RAID rebuild: ```sh -mdadm --add /dev/md4 /dev/sda4 -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 1/7 pages [4KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk -md4 : active raid1 sda4[0] sdb4[1] - 1020767232 blocks super 1.2 [2/2] [UU] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -We can verify the RAID details with the following command: +For more details on the RAID array(s): ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -362,16 +577,118 @@ mdadm --detail /dev/md4 Events : 0.95 Number Major Minor RaidDevice State - 0 8 2 0 spare rebuilding /dev/sda4 - 1 8 18 1 active sync /dev/sdb4 + 0 8 2 0 active sync /dev/sda4 + 1 8 18 1 spare rebuilding /dev/sdb4 +``` + + + +#### Adding the label to the SWAP partition (if applicable) + +Once the RAID rebuild is complete, we mount the partition containing the root of our operating system on `/mnt`. In our example, that partition is `md4`. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt +``` + +We add the label to our swap partition with the command: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sdb4 -L swap-sdb4 +mkswap: /dev/sdb4: warning: wiping old swap signature. 
+Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-sdb4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Next, we mount the following directories to make sure any manipulation we make in the chroot environment works properly: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Next, we access the `chroot` environment: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt +``` + +We retrieve the UUIDs of both swap partitions: + +```sh +root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 +root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 +``` + +Example: + +```sh +blkid /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +blkid /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -The RAID has now been rebuilt, but we still need to mount the partition (`/dev/md4` in this example) with the following command: +Next, we replace the old UUID of the swap partition (**sdb4**) with the new one in `/etc/fstab`: ```sh -mount /dev/md4 /home +root@rescue12-customer-eu:/# nano etc/fstab ``` +Example: + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /home ext4 defaults 0 0 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Make sure you replace the proper UUID. In our example above, the UUID to replace is `d6af33cf-fc15-4060-a43c-cb3b5537f58a` with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the correct UUID. + +Next, we make sure everything is properly mounted: + +```sh +root@rescue12-customer-eu:/# mount -av +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Activate the swap partition the following command: + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/sda4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sda4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sda4 +swapon: /dev/sdb4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sdb4 +``` + +We exit the `chroot` environment with exit and reload the system: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload +``` + +We umount all the disks: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` + +We have now successfully completed the RAID rebuild on the server and we can now reboot it in normal mode. + ## Go Further [Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) @@ -382,4 +699,10 @@ mount /dev/md4 /home [Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) +For specialised services (SEO, development, etc.), contact [OVHcloud partners](/links/partner). + +If you would like assistance using and configuring your OVHcloud solutions, please refer to our [support offers](/links/support). 
+ +If you need training or technical assistance to implement our solutions, contact your sales representative or click on [this link](/links/professional-services) to get a quote and ask our Professional Services experts for assisting you on your specific use case of your project. + Join our [community of users](/links/community). diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-au.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-au.md index 0de4ca168ec..2a20f7e415d 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-au.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-au.md @@ -1,7 +1,7 @@ --- -title: How to configure and rebuild software RAID -excerpt: Find out how to verify the state of your software RAID and rebuild it after a disk replacement -updated: 2023-08-21 +title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode +excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode +updated: 2025-12-11 --- ## Objective @@ -10,21 +10,65 @@ Redundant Array of Independent Disks (RAID) is a technology that mitigates data The default RAID level for OVHcloud server installations is RAID 1, which doubles the space taken up by your data, effectively halving the useable disk space. -**This guide explains how to configure your server’s RAID array in the event that it needs to be rebuilt due to corruption or disk failure.** +**This guide explains how to manage and rebuild a software RAID in the event of a disk replacement on your server in legacy boot mode (BIOS).** + +Before we begin, please note that this guide focuses on Dedicated servers that use legacy boot (BIOS) mode. If your server uses the UEFI mode (newer motherboards), refer to this guide [Managing and rebuilding software RAID on servers in UEFI boot mode](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). + +To check whether a server runs on legacy BIOS or UEFI mode, run the following command: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` ## Requirements -- A [dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration +- A [Dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration - Administrative (sudo) access to the server via SSH +- Understanding of RAID and partitions ## Instructions -### Removing the disk +When you purchase a new server, you may feel the need to perform a series of tests and actions. One such test could be to simulate a disk failure in order to understand the RAID rebuild process and prepare yourself in case it ever happens. 
+ +### Content overview + +- [Basic Information](#basicinformation) +- [Simulating a disk failure](#diskfailure) + - [Removing the failed disk](#diskremove) +- [Rebuilding the RAID](#raidrebuild) + - [Rebuilding the RAID in rescue mode](#rescuemode) + - [Adding the label to the SWAP partition (if applicable)](#swap-partition) + - [Rebuilding the RAID in normal mode](#normalmode) + + + +### Basic Information In a command line session, type the following code to determine the current RAID status: ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 nvme0n1p2[1] nvme0n1p20] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +This command shows us that we have two software RAID devices currently set up, with **md4** being the largest one. The **md4** RAID device consists of two partitions, which are known as **nvme1n1p4** and **nvme0n1p4**. + +The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. + +If you have a server with SATA disks, you would get the following results: + +```sh +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda2[1] sdb2[0] @@ -38,12 +82,10 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -This command shows us that we have two RAID arrays currently set up, with md4 being the largest partition. The partition consists of two disks, which are known as sda4 and sdb4. The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. - Although this command returns our RAID volumes, it doesn't tell us the size of the partitions themselves. We can find this information with the following command: ```sh -fdisk -l +[user@server_ip ~]# sudo fdisk -l Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors Disk model: HGST HUS724020AL @@ -87,71 +129,14 @@ I/O size (minimum/optimal): 512 bytes / 512 bytes The `fdisk -l` command also allows you to identify your partition type. This is an important information when it comes to rebuilding your RAID in case of a disk failure. -For **GPT** partitions, the command will return: `Disklabel type: gpt`. +For **GPT** partitions, line 6 will display: `Disklabel type: gpt`. This information can only been seen when the server is in normal mode. -```sh -Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors -Disk model: HGST HUS724020AL -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: gpt' -Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F -``` - -For **MBR** partitions, the command will return: `Disklabel type: dos`. - -```sh -Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors -Disk model: QEMU HARDDISK -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: dos' -Disk identifier: 0x150f6797 -``` - -We can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. If we were to run the mount command we can also find out the layout of the disk. 
- -```sh -mount - -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -/dev/md4 on /home type ext3 (rw,relatime) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) -``` +Still going by the results of `fdisk -l`, we can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. Alternatively, the `lsblk` command offers a different view of the partitions: ```sh -lsblk +[user@server_ip ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk @@ -171,90 +156,151 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -As the disks are currently mounted by default, to remove a disk from the RAID, we first need to unmount the disk, then simulate a failure, and finally remove it. We will remove `/dev/sda4` from the RAID with the following command: +We take note of the devices, partitions and their mount points. From the above commands and results, we have: + +- Two RAID arrays: `/dev/md2` and `/dev/md4`. +- Four partitions are part of the RAID with the mount points: `/` and `/home`. + + + +### Simulating a disk failure + +Now that we have all the necessary information, we can simulate a disk failure and proceed with the tests. In this example, we will fail the disk `sda`. + +The preferred way to do this is via the OVHcloud rescue mode environment. 
+ +First reboot the server in rescue mode and log in with the provided credentials. + +To remove a disk from the RAID, the first step is to mark it as **Failed** and remove the partitions from their respective RAID arrays. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1] sdb2[0] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0] sdb4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +From the above output, sda consists of two partitions in RAID which are **sda2** and **sda4**. + + + +#### Removing the failed disk + +First we mark the partitions **sda2** and **sda4** as failed. ```sh -umount /dev/md4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 +# mdadm: set /dev/sda2 faulty in /dev/md2 ``` -> [!warning] -> Please note that if you are connected as the user `root`, you may get the following message when you try to unmount the partition (in our case, where our md4 partition is mounted in /home): -> ->
umount: /home: target is busy
-> -> In this case, you must log out as the user root and connect as a local user (in our case `debian`), and use the following command: -> ->
debian@ns000000:/$ sudo umount /dev/md4
-> -> If you do not have a local user, you need to [create one](/pages/bare_metal_cloud/dedicated_servers/changing_root_password_linux_ds). +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 --fail /dev/sda4 +# mdadm: set /dev/sda4 faulty in /dev/md4 +``` -This will provide us with the following output: +We have now simulated a failure of the RAID, when we run the `cat /proc/mdstat` command, we have the following output: ```sh -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1](F) sdb2[0] + 931954688 blocks super 1.2 [2/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +As we can see above, the [F] next to the partitions indicates that the disk has failed or is faulty. + +Next, we remove these partitions from the RAID arrays. 
+ +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 +# mdadm: hot removed /dev/sda2 from /dev/md2 +``` + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md4 --remove /dev/sda4 +# mdadm: hot removed /dev/sda4 from /dev/md4 ``` -As we can see the, entry of `/dev/md4` is no longer mounted. However, the RAID is still active, so we need to simulate a failure to remove the disk. We can do this with the following command: +To make sure that we get a disk that is similar to an empty disk, we use the following command. Replace **sda** with your own values: ```sh -sudo mdadm --fail /dev/md4 /dev/sda4 +shred -s10M -n1 /dev/sda1 +shred -s10M -n1 /dev/sda2 +shred -s10M -n1 /dev/sda3 +shred -s10M -n1 /dev/sda4 +shred -s10M -n1 /dev/sda +``` + +The disk now appears as a new, empty drive: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +sda 8:0 0 1.8T 0 disk +sdb 8:16 0 1.8T 0 disk +├─sdb1 8:17 0 1M 0 part +├─sdb2 8:18 0 888.9G 0 part +│ └─md2 9:2 0 888.8G 0 raid1 / +├─sdb3 8:19 0 512M 0 part [SWAP] +└─sdb4 8:20 0 973.6G 0 part + └─md4 9:4 0 973.5G 0 raid1 /home ``` -We have now simulated a failure of the RAID. The next step is to remove the partition from the RAID array with the following command: +If we run the following command, we see that our disk has been successfully "wiped": ```sh -sudo mdadm --remove /dev/md4 /dev/sda4 +parted /dev/sda +GNU Parted 3.5 +Using /dev/sda +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/sda: unrecognised disk label +Model: HGST HUS724020AL (SATA) +Disk /dev/sda: 1.8T +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: ``` -You can verify that the partition has been removed with the following command: +Our RAID status should now look like this: ```sh -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] - 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 4/7 pages [16KB], 65536KB chunk +md2 : active raid1 sdb2[0] + 931954688 blocks super 1.2 [1/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk md4 : active raid1 sdb4[1] - 1020767232 blocks super 1.2 [2/1] [_U] + 1020767232 blocks super 1.2 [1/2] [_U] bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -The following command will verify that the partition has been removed: +From the results above, we can see that only two partitions now appear in the RAID arrays. We have successfully failed the disk **sda** and we can now proceed with the disk replacement. 
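+
+Before requesting the replacement, it can also help to note the serial number of the healthy disk you want to keep (**sdb** in this example), so that the faulty disk can be identified unambiguously. A minimal sketch, assuming the `smartmontools` package is available in your environment:
+
+```sh
+# Print the serial number of the disk to keep
+smartctl -i /dev/sdb | grep -i 'serial'
+```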
+ +For more information on how to prepare and request for a disk replacement, consult this [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement) + +If you run the following command, you can have more details on the RAID array(s): ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -286,56 +332,225 @@ Consistency Policy : bitmap 1 8 20 1 active sync /dev/sdb4 ``` + + ### Rebuilding the RAID -Once the disk has been replaced, we need to copy the partition table from a healthy disk (in this example, sdb) to the new one (sda) with the following command: +> [!warning] +> +> For most servers in software RAID, after a disk replacement, the server is able to boot in normal mode (on the healthy disk) to rebuild the RAID. However, if the server is not able to boot in normal mode, it will be rebooted in rescue mode to proceed with the RAID rebuild. +> -**For GPT partitions** + + +#### Rebuilding the RAID in normal mode + +The following steps are performed in normal mode. In our example, we have replaced the disk **sda**. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sdb) to the new one (sda). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> sudo sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk`. +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdX +>> ``` +>> +>> If you receive the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. If you still cannot see the newly-created partitions (e.g. with `lsblk`), you need to reboot the server before continuing. +>> +> **For MBR partitions** +>> +>> ```sh +>> [user@server_ip ~]# sudo sfdisk -d /dev/sdX | sfdisk /dev/sdX +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +>> + +Next, we add the partitions to the RAID: ```sh -sgdisk -R /dev/sda /dev/sdb +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/sda2 +# mdadm: added /dev/sda2 + +[user@server_ip ~]# sudo mdadm --add /dev/md4 /dev/sda4 +# mdadm: re-added /dev/sda4 ``` -The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` +Use the following command to monitor the RAID rebuild: -Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +```sh +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[0] sdb2[1] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 4/4 pages [16KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +Lastly, we add a label and mount the [SWAP] partition (if applicable). 
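+
+If you are not sure whether your installation uses a dedicated swap partition, you can check it before continuing; a quick sketch:
+
+```sh
+# List active swap devices and the swap entries declared in /etc/fstab
+swapon --show
+grep -i swap /etc/fstab
+```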
+ +To add a label the SWAP partition: ```sh -sgdisk -G /dev/sda +[user@server_ip ~]# sudo mkswap /dev/sda4 -L swap-sda4 ``` -**For MBR partitions** +Next, retrieve the UUIDs of both swap partitions: + +```sh +[user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +[user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` -Once the disk has been replaced, we need to copy the partition table from a healthy disk (in this example, sdb) to the new one (sda) with the following command: +We replace the old UUID of the swap partition (**sda4**) with the new one in `/etc/fstab`. + +Example: + +```sh +[user@server_ip ~]# sudo nano etc/fstab + +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=BIOS /boot vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Based on the above results, the old UUID is `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` and should be replaced with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the coorect UUID. + +Next, we verify that everything is properly mounted with the following command: + +```sh +[user@server_ip ~]# sudo mount -av +/ : ignored +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Run the following command to enable the swap partition: + +```sh +[user@server_ip ~]# sudo swapon -av +``` + +Then reload the system with the following command: ```sh -sfdisk -d /dev/sdb | sfdisk /dev/sda +[user@server_ip ~]# sudo systemctl daemon-reload ``` -The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +We have now successfully completed the RAID rebuild. + + + +/// details | **Rebuilding the RAID in rescue mode** + +If you server is unable to reboot in normal mode after a disk replacement, it will be rebooted in rescue mode. + +In this example, we are replacing the disk `sdb`. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sda) to the new one (sdb). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` +>> +>> Example: +>> +>> ```sh +>> sudo sgdisk -R /dev/sdb /dev/sda +>> ``` +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdb +>> ``` +>> +>> If you the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. +>> +> **For MBR partitions** +>> +>> ```sh +>> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +>> + +We can now rebuild the RAID array. The following code snippet shows how we can add the new partitions (sdb2 and sdb4) back in the RAID array. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 +# mdadm: added /dev/sdb2 + +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 +# mdadm: re-added /dev/sdb4 +``` -We can now rebuild the RAID array. The following code snippet shows how we can rebulid the `/dev/md4` partition layout with the recently-copied sda partition table: +Use the `cat /proc/mdstat` command to monitor the RAID rebuild: ```sh -mdadm --add /dev/md4 /dev/sda4 -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 1/7 pages [4KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk -md4 : active raid1 sda4[0] sdb4[1] - 1020767232 blocks super 1.2 [2/2] [UU] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -We can verify the RAID details with the following command: +For more details on the RAID array(s): ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -362,16 +577,118 @@ mdadm --detail /dev/md4 Events : 0.95 Number Major Minor RaidDevice State - 0 8 2 0 spare rebuilding /dev/sda4 - 1 8 18 1 active sync /dev/sdb4 + 0 8 2 0 active sync /dev/sda4 + 1 8 18 1 spare rebuilding /dev/sdb4 +``` + + + +#### Adding the label to the SWAP partition (if applicable) + +Once the RAID rebuild is complete, we mount the partition containing the root of our operating system on `/mnt`. In our example, that partition is `md4`. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt +``` + +We add the label to our swap partition with the command: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sdb4 -L swap-sdb4 +mkswap: /dev/sdb4: warning: wiping old swap signature. 
+Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-sdb4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Next, we mount the following directories to make sure any manipulation we make in the chroot environment works properly: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Next, we access the `chroot` environment: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt +``` + +We retrieve the UUIDs of both swap partitions: + +```sh +root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 +root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 +``` + +Example: + +```sh +blkid /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +blkid /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -The RAID has now been rebuilt, but we still need to mount the partition (`/dev/md4` in this example) with the following command: +Next, we replace the old UUID of the swap partition (**sdb4**) with the new one in `/etc/fstab`: ```sh -mount /dev/md4 /home +root@rescue12-customer-eu:/# nano etc/fstab ``` +Example: + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /home ext4 defaults 0 0 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Make sure you replace the proper UUID. In our example above, the UUID to replace is `d6af33cf-fc15-4060-a43c-cb3b5537f58a` with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the correct UUID. + +Next, we make sure everything is properly mounted: + +```sh +root@rescue12-customer-eu:/# mount -av +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Activate the swap partition the following command: + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/sda4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sda4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sda4 +swapon: /dev/sdb4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sdb4 +``` + +We exit the `chroot` environment with exit and reload the system: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload +``` + +We umount all the disks: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` + +We have now successfully completed the RAID rebuild on the server and we can now reboot it in normal mode. + ## Go Further [Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) @@ -382,4 +699,10 @@ mount /dev/md4 /home [Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) +For specialised services (SEO, development, etc.), contact [OVHcloud partners](/links/partner). + +If you would like assistance using and configuring your OVHcloud solutions, please refer to our [support offers](/links/support). 
+ +If you need training or technical assistance to implement our solutions, contact your sales representative or click on [this link](/links/professional-services) to get a quote and ask our Professional Services experts for assisting you on your specific use case of your project. + Join our [community of users](/links/community). diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-ca.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-ca.md index 0de4ca168ec..2a20f7e415d 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-ca.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-ca.md @@ -1,7 +1,7 @@ --- -title: How to configure and rebuild software RAID -excerpt: Find out how to verify the state of your software RAID and rebuild it after a disk replacement -updated: 2023-08-21 +title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode +excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode +updated: 2025-12-11 --- ## Objective @@ -10,21 +10,65 @@ Redundant Array of Independent Disks (RAID) is a technology that mitigates data The default RAID level for OVHcloud server installations is RAID 1, which doubles the space taken up by your data, effectively halving the useable disk space. -**This guide explains how to configure your server’s RAID array in the event that it needs to be rebuilt due to corruption or disk failure.** +**This guide explains how to manage and rebuild a software RAID in the event of a disk replacement on your server in legacy boot mode (BIOS).** + +Before we begin, please note that this guide focuses on Dedicated servers that use legacy boot (BIOS) mode. If your server uses the UEFI mode (newer motherboards), refer to this guide [Managing and rebuilding software RAID on servers in UEFI boot mode](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). + +To check whether a server runs on legacy BIOS or UEFI mode, run the following command: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` ## Requirements -- A [dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration +- A [Dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration - Administrative (sudo) access to the server via SSH +- Understanding of RAID and partitions ## Instructions -### Removing the disk +When you purchase a new server, you may feel the need to perform a series of tests and actions. One such test could be to simulate a disk failure in order to understand the RAID rebuild process and prepare yourself in case it ever happens. 
+ +### Content overview + +- [Basic Information](#basicinformation) +- [Simulating a disk failure](#diskfailure) + - [Removing the failed disk](#diskremove) +- [Rebuilding the RAID](#raidrebuild) + - [Rebuilding the RAID in rescue mode](#rescuemode) + - [Adding the label to the SWAP partition (if applicable)](#swap-partition) + - [Rebuilding the RAID in normal mode](#normalmode) + + + +### Basic Information In a command line session, type the following code to determine the current RAID status: ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 nvme0n1p2[1] nvme0n1p20] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +This command shows us that we have two software RAID devices currently set up, with **md4** being the largest one. The **md4** RAID device consists of two partitions, which are known as **nvme1n1p4** and **nvme0n1p4**. + +The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. + +If you have a server with SATA disks, you would get the following results: + +```sh +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda2[1] sdb2[0] @@ -38,12 +82,10 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -This command shows us that we have two RAID arrays currently set up, with md4 being the largest partition. The partition consists of two disks, which are known as sda4 and sdb4. The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. - Although this command returns our RAID volumes, it doesn't tell us the size of the partitions themselves. We can find this information with the following command: ```sh -fdisk -l +[user@server_ip ~]# sudo fdisk -l Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors Disk model: HGST HUS724020AL @@ -87,71 +129,14 @@ I/O size (minimum/optimal): 512 bytes / 512 bytes The `fdisk -l` command also allows you to identify your partition type. This is an important information when it comes to rebuilding your RAID in case of a disk failure. -For **GPT** partitions, the command will return: `Disklabel type: gpt`. +For **GPT** partitions, line 6 will display: `Disklabel type: gpt`. This information can only been seen when the server is in normal mode. -```sh -Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors -Disk model: HGST HUS724020AL -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: gpt' -Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F -``` - -For **MBR** partitions, the command will return: `Disklabel type: dos`. - -```sh -Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors -Disk model: QEMU HARDDISK -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: dos' -Disk identifier: 0x150f6797 -``` - -We can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. If we were to run the mount command we can also find out the layout of the disk. 
- -```sh -mount - -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -/dev/md4 on /home type ext3 (rw,relatime) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) -``` +Still going by the results of `fdisk -l`, we can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. Alternatively, the `lsblk` command offers a different view of the partitions: ```sh -lsblk +[user@server_ip ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk @@ -171,90 +156,151 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -As the disks are currently mounted by default, to remove a disk from the RAID, we first need to unmount the disk, then simulate a failure, and finally remove it. We will remove `/dev/sda4` from the RAID with the following command: +We take note of the devices, partitions and their mount points. From the above commands and results, we have: + +- Two RAID arrays: `/dev/md2` and `/dev/md4`. +- Four partitions are part of the RAID with the mount points: `/` and `/home`. + + + +### Simulating a disk failure + +Now that we have all the necessary information, we can simulate a disk failure and proceed with the tests. In this example, we will fail the disk `sda`. + +The preferred way to do this is via the OVHcloud rescue mode environment. 
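+
+Note that device names (`sda`, `sdb`, `nvme0n1`, ...) are not guaranteed to be identical in normal mode and in rescue mode. Before rebooting, it can therefore be useful to record the model and serial number of each disk so that you can match the devices once you are logged in to the rescue environment. A minimal check (the available columns may vary slightly depending on your `lsblk` version):
+
+```sh
+# List each physical disk with its model, serial number and size
+lsblk -d -o NAME,MODEL,SERIAL,SIZE
+```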
+ +First reboot the server in rescue mode and log in with the provided credentials. + +To remove a disk from the RAID, the first step is to mark it as **Failed** and remove the partitions from their respective RAID arrays. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1] sdb2[0] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0] sdb4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +From the above output, sda consists of two partitions in RAID which are **sda2** and **sda4**. + + + +#### Removing the failed disk + +First we mark the partitions **sda2** and **sda4** as failed. ```sh -umount /dev/md4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 +# mdadm: set /dev/sda2 faulty in /dev/md2 ``` -> [!warning] -> Please note that if you are connected as the user `root`, you may get the following message when you try to unmount the partition (in our case, where our md4 partition is mounted in /home): -> ->
umount: /home: target is busy
-> -> In this case, you must log out as the user root and connect as a local user (in our case `debian`), and use the following command: -> ->
debian@ns000000:/$ sudo umount /dev/md4
-> -> If you do not have a local user, you need to [create one](/pages/bare_metal_cloud/dedicated_servers/changing_root_password_linux_ds). +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 --fail /dev/sda4 +# mdadm: set /dev/sda4 faulty in /dev/md4 +``` -This will provide us with the following output: +We have now simulated a failure of the RAID, when we run the `cat /proc/mdstat` command, we have the following output: ```sh -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1](F) sdb2[0] + 931954688 blocks super 1.2 [2/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +As we can see above, the [F] next to the partitions indicates that the disk has failed or is faulty. + +Next, we remove these partitions from the RAID arrays. 
+ +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 +# mdadm: hot removed /dev/sda2 from /dev/md2 +``` + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md4 --remove /dev/sda4 +# mdadm: hot removed /dev/sda4 from /dev/md4 ``` -As we can see the, entry of `/dev/md4` is no longer mounted. However, the RAID is still active, so we need to simulate a failure to remove the disk. We can do this with the following command: +To make sure that we get a disk that is similar to an empty disk, we use the following command. Replace **sda** with your own values: ```sh -sudo mdadm --fail /dev/md4 /dev/sda4 +shred -s10M -n1 /dev/sda1 +shred -s10M -n1 /dev/sda2 +shred -s10M -n1 /dev/sda3 +shred -s10M -n1 /dev/sda4 +shred -s10M -n1 /dev/sda +``` + +The disk now appears as a new, empty drive: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +sda 8:0 0 1.8T 0 disk +sdb 8:16 0 1.8T 0 disk +├─sdb1 8:17 0 1M 0 part +├─sdb2 8:18 0 888.9G 0 part +│ └─md2 9:2 0 888.8G 0 raid1 / +├─sdb3 8:19 0 512M 0 part [SWAP] +└─sdb4 8:20 0 973.6G 0 part + └─md4 9:4 0 973.5G 0 raid1 /home ``` -We have now simulated a failure of the RAID. The next step is to remove the partition from the RAID array with the following command: +If we run the following command, we see that our disk has been successfully "wiped": ```sh -sudo mdadm --remove /dev/md4 /dev/sda4 +parted /dev/sda +GNU Parted 3.5 +Using /dev/sda +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/sda: unrecognised disk label +Model: HGST HUS724020AL (SATA) +Disk /dev/sda: 1.8T +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: ``` -You can verify that the partition has been removed with the following command: +Our RAID status should now look like this: ```sh -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] - 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 4/7 pages [16KB], 65536KB chunk +md2 : active raid1 sdb2[0] + 931954688 blocks super 1.2 [1/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk md4 : active raid1 sdb4[1] - 1020767232 blocks super 1.2 [2/1] [_U] + 1020767232 blocks super 1.2 [1/2] [_U] bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -The following command will verify that the partition has been removed: +From the results above, we can see that only two partitions now appear in the RAID arrays. We have successfully failed the disk **sda** and we can now proceed with the disk replacement. 
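+
+Before requesting the replacement, it can also help to note the serial number of the healthy disk you want to keep (**sdb** in this example), so that the faulty disk can be identified unambiguously. A minimal sketch, assuming the `smartmontools` package is available in your environment:
+
+```sh
+# Print the serial number of the disk to keep
+smartctl -i /dev/sdb | grep -i 'serial'
+```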
+ +For more information on how to prepare and request for a disk replacement, consult this [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement) + +If you run the following command, you can have more details on the RAID array(s): ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -286,56 +332,225 @@ Consistency Policy : bitmap 1 8 20 1 active sync /dev/sdb4 ``` + + ### Rebuilding the RAID -Once the disk has been replaced, we need to copy the partition table from a healthy disk (in this example, sdb) to the new one (sda) with the following command: +> [!warning] +> +> For most servers in software RAID, after a disk replacement, the server is able to boot in normal mode (on the healthy disk) to rebuild the RAID. However, if the server is not able to boot in normal mode, it will be rebooted in rescue mode to proceed with the RAID rebuild. +> -**For GPT partitions** + + +#### Rebuilding the RAID in normal mode + +The following steps are performed in normal mode. In our example, we have replaced the disk **sda**. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sdb) to the new one (sda). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> sudo sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk`. +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdX +>> ``` +>> +>> If you receive the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. If you still cannot see the newly-created partitions (e.g. with `lsblk`), you need to reboot the server before continuing. +>> +> **For MBR partitions** +>> +>> ```sh +>> [user@server_ip ~]# sudo sfdisk -d /dev/sdX | sfdisk /dev/sdX +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +>> + +Next, we add the partitions to the RAID: ```sh -sgdisk -R /dev/sda /dev/sdb +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/sda2 +# mdadm: added /dev/sda2 + +[user@server_ip ~]# sudo mdadm --add /dev/md4 /dev/sda4 +# mdadm: re-added /dev/sda4 ``` -The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` +Use the following command to monitor the RAID rebuild: -Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +```sh +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[0] sdb2[1] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 4/4 pages [16KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +Lastly, we add a label and mount the [SWAP] partition (if applicable). 
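+
+If you are not sure whether your installation uses a dedicated swap partition, you can check it before continuing; a quick sketch:
+
+```sh
+# List active swap devices and the swap entries declared in /etc/fstab
+swapon --show
+grep -i swap /etc/fstab
+```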
+ +To add a label the SWAP partition: ```sh -sgdisk -G /dev/sda +[user@server_ip ~]# sudo mkswap /dev/sda4 -L swap-sda4 ``` -**For MBR partitions** +Next, retrieve the UUIDs of both swap partitions: + +```sh +[user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +[user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` -Once the disk has been replaced, we need to copy the partition table from a healthy disk (in this example, sdb) to the new one (sda) with the following command: +We replace the old UUID of the swap partition (**sda4**) with the new one in `/etc/fstab`. + +Example: + +```sh +[user@server_ip ~]# sudo nano etc/fstab + +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=BIOS /boot vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Based on the above results, the old UUID is `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` and should be replaced with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the coorect UUID. + +Next, we verify that everything is properly mounted with the following command: + +```sh +[user@server_ip ~]# sudo mount -av +/ : ignored +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Run the following command to enable the swap partition: + +```sh +[user@server_ip ~]# sudo swapon -av +``` + +Then reload the system with the following command: ```sh -sfdisk -d /dev/sdb | sfdisk /dev/sda +[user@server_ip ~]# sudo systemctl daemon-reload ``` -The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +We have now successfully completed the RAID rebuild. + + + +/// details | **Rebuilding the RAID in rescue mode** + +If you server is unable to reboot in normal mode after a disk replacement, it will be rebooted in rescue mode. + +In this example, we are replacing the disk `sdb`. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sda) to the new one (sdb). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` +>> +>> Example: +>> +>> ```sh +>> sudo sgdisk -R /dev/sdb /dev/sda +>> ``` +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdb +>> ``` +>> +>> If you the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. +>> +> **For MBR partitions** +>> +>> ```sh +>> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +>> + +We can now rebuild the RAID array. The following code snippet shows how we can add the new partitions (sdb2 and sdb4) back in the RAID array. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 +# mdadm: added /dev/sdb2 + +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 +# mdadm: re-added /dev/sdb4 +``` -We can now rebuild the RAID array. The following code snippet shows how we can rebulid the `/dev/md4` partition layout with the recently-copied sda partition table: +Use the `cat /proc/mdstat` command to monitor the RAID rebuild: ```sh -mdadm --add /dev/md4 /dev/sda4 -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 1/7 pages [4KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk -md4 : active raid1 sda4[0] sdb4[1] - 1020767232 blocks super 1.2 [2/2] [UU] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -We can verify the RAID details with the following command: +For more details on the RAID array(s): ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -362,16 +577,118 @@ mdadm --detail /dev/md4 Events : 0.95 Number Major Minor RaidDevice State - 0 8 2 0 spare rebuilding /dev/sda4 - 1 8 18 1 active sync /dev/sdb4 + 0 8 2 0 active sync /dev/sda4 + 1 8 18 1 spare rebuilding /dev/sdb4 +``` + + + +#### Adding the label to the SWAP partition (if applicable) + +Once the RAID rebuild is complete, we mount the partition containing the root of our operating system on `/mnt`. In our example, that partition is `md4`. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt +``` + +We add the label to our swap partition with the command: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sdb4 -L swap-sdb4 +mkswap: /dev/sdb4: warning: wiping old swap signature. 
+Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-sdb4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Next, we mount the following directories to make sure any manipulation we make in the chroot environment works properly: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Next, we access the `chroot` environment: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt +``` + +We retrieve the UUIDs of both swap partitions: + +```sh +root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 +root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 +``` + +Example: + +```sh +blkid /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +blkid /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -The RAID has now been rebuilt, but we still need to mount the partition (`/dev/md4` in this example) with the following command: +Next, we replace the old UUID of the swap partition (**sdb4**) with the new one in `/etc/fstab`: ```sh -mount /dev/md4 /home +root@rescue12-customer-eu:/# nano etc/fstab ``` +Example: + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /home ext4 defaults 0 0 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Make sure you replace the proper UUID. In our example above, the UUID to replace is `d6af33cf-fc15-4060-a43c-cb3b5537f58a` with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the correct UUID. + +Next, we make sure everything is properly mounted: + +```sh +root@rescue12-customer-eu:/# mount -av +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Activate the swap partition the following command: + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/sda4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sda4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sda4 +swapon: /dev/sdb4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sdb4 +``` + +We exit the `chroot` environment with exit and reload the system: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload +``` + +We umount all the disks: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` + +We have now successfully completed the RAID rebuild on the server and we can now reboot it in normal mode. + ## Go Further [Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) @@ -382,4 +699,10 @@ mount /dev/md4 /home [Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) +For specialised services (SEO, development, etc.), contact [OVHcloud partners](/links/partner). + +If you would like assistance using and configuring your OVHcloud solutions, please refer to our [support offers](/links/support). 
+ +If you need training or technical assistance to implement our solutions, contact your sales representative or click on [this link](/links/professional-services) to get a quote and ask our Professional Services experts for assisting you on your specific use case of your project. + Join our [community of users](/links/community). diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md index d3295c1c924..2a20f7e415d 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-gb.md @@ -1,7 +1,7 @@ --- title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode -updated: 2025-12-05 +updated: 2025-12-11 --- ## Objective diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-ie.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-ie.md index 0de4ca168ec..2a20f7e415d 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-ie.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-ie.md @@ -1,7 +1,7 @@ --- -title: How to configure and rebuild software RAID -excerpt: Find out how to verify the state of your software RAID and rebuild it after a disk replacement -updated: 2023-08-21 +title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode +excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode +updated: 2025-12-11 --- ## Objective @@ -10,21 +10,65 @@ Redundant Array of Independent Disks (RAID) is a technology that mitigates data The default RAID level for OVHcloud server installations is RAID 1, which doubles the space taken up by your data, effectively halving the useable disk space. -**This guide explains how to configure your server’s RAID array in the event that it needs to be rebuilt due to corruption or disk failure.** +**This guide explains how to manage and rebuild a software RAID in the event of a disk replacement on your server in legacy boot mode (BIOS).** + +Before we begin, please note that this guide focuses on Dedicated servers that use legacy boot (BIOS) mode. If your server uses the UEFI mode (newer motherboards), refer to this guide [Managing and rebuilding software RAID on servers in UEFI boot mode](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). + +To check whether a server runs on legacy BIOS or UEFI mode, run the following command: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` ## Requirements -- A [dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration +- A [Dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration - Administrative (sudo) access to the server via SSH +- Understanding of RAID and partitions ## Instructions -### Removing the disk +When you purchase a new server, you may feel the need to perform a series of tests and actions. One such test could be to simulate a disk failure in order to understand the RAID rebuild process and prepare yourself in case it ever happens. 
+ +### Content overview + +- [Basic Information](#basicinformation) +- [Simulating a disk failure](#diskfailure) + - [Removing the failed disk](#diskremove) +- [Rebuilding the RAID](#raidrebuild) + - [Rebuilding the RAID in rescue mode](#rescuemode) + - [Adding the label to the SWAP partition (if applicable)](#swap-partition) + - [Rebuilding the RAID in normal mode](#normalmode) + + + +### Basic Information In a command line session, type the following code to determine the current RAID status: ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 nvme0n1p2[1] nvme0n1p20] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +This command shows us that we have two software RAID devices currently set up, with **md4** being the largest one. The **md4** RAID device consists of two partitions, which are known as **nvme1n1p4** and **nvme0n1p4**. + +The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. + +If you have a server with SATA disks, you would get the following results: + +```sh +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda2[1] sdb2[0] @@ -38,12 +82,10 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -This command shows us that we have two RAID arrays currently set up, with md4 being the largest partition. The partition consists of two disks, which are known as sda4 and sdb4. The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. - Although this command returns our RAID volumes, it doesn't tell us the size of the partitions themselves. We can find this information with the following command: ```sh -fdisk -l +[user@server_ip ~]# sudo fdisk -l Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors Disk model: HGST HUS724020AL @@ -87,71 +129,14 @@ I/O size (minimum/optimal): 512 bytes / 512 bytes The `fdisk -l` command also allows you to identify your partition type. This is an important information when it comes to rebuilding your RAID in case of a disk failure. -For **GPT** partitions, the command will return: `Disklabel type: gpt`. +For **GPT** partitions, line 6 will display: `Disklabel type: gpt`. This information can only been seen when the server is in normal mode. -```sh -Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors -Disk model: HGST HUS724020AL -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: gpt' -Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F -``` - -For **MBR** partitions, the command will return: `Disklabel type: dos`. - -```sh -Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors -Disk model: QEMU HARDDISK -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: dos' -Disk identifier: 0x150f6797 -``` - -We can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. If we were to run the mount command we can also find out the layout of the disk. 
- -```sh -mount - -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -/dev/md4 on /home type ext3 (rw,relatime) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) -``` +Still going by the results of `fdisk -l`, we can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. Alternatively, the `lsblk` command offers a different view of the partitions: ```sh -lsblk +[user@server_ip ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk @@ -171,90 +156,151 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -As the disks are currently mounted by default, to remove a disk from the RAID, we first need to unmount the disk, then simulate a failure, and finally remove it. We will remove `/dev/sda4` from the RAID with the following command: +We take note of the devices, partitions and their mount points. From the above commands and results, we have: + +- Two RAID arrays: `/dev/md2` and `/dev/md4`. +- Four partitions are part of the RAID with the mount points: `/` and `/home`. + + + +### Simulating a disk failure + +Now that we have all the necessary information, we can simulate a disk failure and proceed with the tests. In this example, we will fail the disk `sda`. + +The preferred way to do this is via the OVHcloud rescue mode environment. 
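+
+Note that device names (`sda`, `sdb`, `nvme0n1`, ...) are not guaranteed to be identical in normal mode and in rescue mode. Before rebooting, it can therefore be useful to record the model and serial number of each disk so that you can match the devices once you are logged in to the rescue environment. A minimal check (the available columns may vary slightly depending on your `lsblk` version):
+
+```sh
+# List each physical disk with its model, serial number and size
+lsblk -d -o NAME,MODEL,SERIAL,SIZE
+```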
+ +First reboot the server in rescue mode and log in with the provided credentials. + +To remove a disk from the RAID, the first step is to mark it as **Failed** and remove the partitions from their respective RAID arrays. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1] sdb2[0] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0] sdb4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +From the above output, sda consists of two partitions in RAID which are **sda2** and **sda4**. + + + +#### Removing the failed disk + +First we mark the partitions **sda2** and **sda4** as failed. ```sh -umount /dev/md4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 +# mdadm: set /dev/sda2 faulty in /dev/md2 ``` -> [!warning] -> Please note that if you are connected as the user `root`, you may get the following message when you try to unmount the partition (in our case, where our md4 partition is mounted in /home): -> ->
umount: /home: target is busy
-> -> In this case, you must log out as the user root and connect as a local user (in our case `debian`), and use the following command: -> ->
debian@ns000000:/$ sudo umount /dev/md4
-> -> If you do not have a local user, you need to [create one](/pages/bare_metal_cloud/dedicated_servers/changing_root_password_linux_ds). +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 --fail /dev/sda4 +# mdadm: set /dev/sda4 faulty in /dev/md4 +``` -This will provide us with the following output: +We have now simulated a failure of the RAID, when we run the `cat /proc/mdstat` command, we have the following output: ```sh -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1](F) sdb2[0] + 931954688 blocks super 1.2 [2/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +As we can see above, the [F] next to the partitions indicates that the disk has failed or is faulty. + +Next, we remove these partitions from the RAID arrays. 
+ +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 +# mdadm: hot removed /dev/sda2 from /dev/md2 +``` + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md4 --remove /dev/sda4 +# mdadm: hot removed /dev/sda4 from /dev/md4 ``` -As we can see the, entry of `/dev/md4` is no longer mounted. However, the RAID is still active, so we need to simulate a failure to remove the disk. We can do this with the following command: +To make sure that we get a disk that is similar to an empty disk, we use the following command. Replace **sda** with your own values: ```sh -sudo mdadm --fail /dev/md4 /dev/sda4 +shred -s10M -n1 /dev/sda1 +shred -s10M -n1 /dev/sda2 +shred -s10M -n1 /dev/sda3 +shred -s10M -n1 /dev/sda4 +shred -s10M -n1 /dev/sda +``` + +The disk now appears as a new, empty drive: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +sda 8:0 0 1.8T 0 disk +sdb 8:16 0 1.8T 0 disk +├─sdb1 8:17 0 1M 0 part +├─sdb2 8:18 0 888.9G 0 part +│ └─md2 9:2 0 888.8G 0 raid1 / +├─sdb3 8:19 0 512M 0 part [SWAP] +└─sdb4 8:20 0 973.6G 0 part + └─md4 9:4 0 973.5G 0 raid1 /home ``` -We have now simulated a failure of the RAID. The next step is to remove the partition from the RAID array with the following command: +If we run the following command, we see that our disk has been successfully "wiped": ```sh -sudo mdadm --remove /dev/md4 /dev/sda4 +parted /dev/sda +GNU Parted 3.5 +Using /dev/sda +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/sda: unrecognised disk label +Model: HGST HUS724020AL (SATA) +Disk /dev/sda: 1.8T +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: ``` -You can verify that the partition has been removed with the following command: +Our RAID status should now look like this: ```sh -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] - 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 4/7 pages [16KB], 65536KB chunk +md2 : active raid1 sdb2[0] + 931954688 blocks super 1.2 [1/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk md4 : active raid1 sdb4[1] - 1020767232 blocks super 1.2 [2/1] [_U] + 1020767232 blocks super 1.2 [1/2] [_U] bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -The following command will verify that the partition has been removed: +From the results above, we can see that only two partitions now appear in the RAID arrays. We have successfully failed the disk **sda** and we can now proceed with the disk replacement. 
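+Before the disk is physically replaced, it can also be worth keeping a copy of the healthy disk's partition table and of the current RAID layout somewhere off the server, so you have a reference to compare against after the operation. A short sketch, assuming GPT partitioning and that `sgdisk` is available in the rescue environment (the file names and the `backup_host` destination are only examples):
+
+```sh
+# Save the GPT partition table of the healthy disk to a file
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk --backup=/tmp/sdb-gpt.bak /dev/sdb
+
+# Record the current RAID layout
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail --scan > /tmp/mdadm-layout.txt
+
+# Copy both files to another machine before the disk is swapped
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # scp /tmp/sdb-gpt.bak /tmp/mdadm-layout.txt user@backup_host:~/
+```
+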
+ +For more information on how to prepare and request for a disk replacement, consult this [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement) + +If you run the following command, you can have more details on the RAID array(s): ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -286,56 +332,225 @@ Consistency Policy : bitmap 1 8 20 1 active sync /dev/sdb4 ``` + + ### Rebuilding the RAID -Once the disk has been replaced, we need to copy the partition table from a healthy disk (in this example, sdb) to the new one (sda) with the following command: +> [!warning] +> +> For most servers in software RAID, after a disk replacement, the server is able to boot in normal mode (on the healthy disk) to rebuild the RAID. However, if the server is not able to boot in normal mode, it will be rebooted in rescue mode to proceed with the RAID rebuild. +> -**For GPT partitions** + + +#### Rebuilding the RAID in normal mode + +The following steps are performed in normal mode. In our example, we have replaced the disk **sda**. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sdb) to the new one (sda). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> sudo sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk`. +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdX +>> ``` +>> +>> If you receive the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. If you still cannot see the newly-created partitions (e.g. with `lsblk`), you need to reboot the server before continuing. +>> +> **For MBR partitions** +>> +>> ```sh +>> [user@server_ip ~]# sudo sfdisk -d /dev/sdX | sfdisk /dev/sdX +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +>> + +Next, we add the partitions to the RAID: ```sh -sgdisk -R /dev/sda /dev/sdb +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/sda2 +# mdadm: added /dev/sda2 + +[user@server_ip ~]# sudo mdadm --add /dev/md4 /dev/sda4 +# mdadm: re-added /dev/sda4 ``` -The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` +Use the following command to monitor the RAID rebuild: -Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +```sh +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[0] sdb2[1] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 4/4 pages [16KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +Lastly, we add a label and mount the [SWAP] partition (if applicable). 
+ +To add a label the SWAP partition: ```sh -sgdisk -G /dev/sda +[user@server_ip ~]# sudo mkswap /dev/sda4 -L swap-sda4 ``` -**For MBR partitions** +Next, retrieve the UUIDs of both swap partitions: + +```sh +[user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +[user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` -Once the disk has been replaced, we need to copy the partition table from a healthy disk (in this example, sdb) to the new one (sda) with the following command: +We replace the old UUID of the swap partition (**sda4**) with the new one in `/etc/fstab`. + +Example: + +```sh +[user@server_ip ~]# sudo nano etc/fstab + +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=BIOS /boot vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Based on the above results, the old UUID is `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` and should be replaced with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the coorect UUID. + +Next, we verify that everything is properly mounted with the following command: + +```sh +[user@server_ip ~]# sudo mount -av +/ : ignored +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Run the following command to enable the swap partition: + +```sh +[user@server_ip ~]# sudo swapon -av +``` + +Then reload the system with the following command: ```sh -sfdisk -d /dev/sdb | sfdisk /dev/sda +[user@server_ip ~]# sudo systemctl daemon-reload ``` -The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +We have now successfully completed the RAID rebuild. + + + +/// details | **Rebuilding the RAID in rescue mode** + +If you server is unable to reboot in normal mode after a disk replacement, it will be rebooted in rescue mode. + +In this example, we are replacing the disk `sdb`. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sda) to the new one (sdb). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` +>> +>> Example: +>> +>> ```sh +>> sudo sgdisk -R /dev/sdb /dev/sda +>> ``` +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdb +>> ``` +>> +>> If you the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. +>> +> **For MBR partitions** +>> +>> ```sh +>> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +>> + +We can now rebuild the RAID array. The following code snippet shows how we can add the new partitions (sdb2 and sdb4) back in the RAID array. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 +# mdadm: added /dev/sdb2 + +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 +# mdadm: re-added /dev/sdb4 +``` -We can now rebuild the RAID array. The following code snippet shows how we can rebulid the `/dev/md4` partition layout with the recently-copied sda partition table: +Use the `cat /proc/mdstat` command to monitor the RAID rebuild: ```sh -mdadm --add /dev/md4 /dev/sda4 -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 1/7 pages [4KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk -md4 : active raid1 sda4[0] sdb4[1] - 1020767232 blocks super 1.2 [2/2] [UU] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -We can verify the RAID details with the following command: +For more details on the RAID array(s): ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -362,16 +577,118 @@ mdadm --detail /dev/md4 Events : 0.95 Number Major Minor RaidDevice State - 0 8 2 0 spare rebuilding /dev/sda4 - 1 8 18 1 active sync /dev/sdb4 + 0 8 2 0 active sync /dev/sda4 + 1 8 18 1 spare rebuilding /dev/sdb4 +``` + + + +#### Adding the label to the SWAP partition (if applicable) + +Once the RAID rebuild is complete, we mount the partition containing the root of our operating system on `/mnt`. In our example, that partition is `md4`. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt +``` + +We add the label to our swap partition with the command: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sdb4 -L swap-sdb4 +mkswap: /dev/sdb4: warning: wiping old swap signature. 
+Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-sdb4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Next, we mount the following directories to make sure any manipulation we make in the chroot environment works properly: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Next, we access the `chroot` environment: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt +``` + +We retrieve the UUIDs of both swap partitions: + +```sh +root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 +root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 +``` + +Example: + +```sh +blkid /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +blkid /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -The RAID has now been rebuilt, but we still need to mount the partition (`/dev/md4` in this example) with the following command: +Next, we replace the old UUID of the swap partition (**sdb4**) with the new one in `/etc/fstab`: ```sh -mount /dev/md4 /home +root@rescue12-customer-eu:/# nano etc/fstab ``` +Example: + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /home ext4 defaults 0 0 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Make sure you replace the proper UUID. In our example above, the UUID to replace is `d6af33cf-fc15-4060-a43c-cb3b5537f58a` with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the correct UUID. + +Next, we make sure everything is properly mounted: + +```sh +root@rescue12-customer-eu:/# mount -av +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Activate the swap partition the following command: + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/sda4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sda4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sda4 +swapon: /dev/sdb4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sdb4 +``` + +We exit the `chroot` environment with exit and reload the system: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload +``` + +We umount all the disks: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` + +We have now successfully completed the RAID rebuild on the server and we can now reboot it in normal mode. + ## Go Further [Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) @@ -382,4 +699,10 @@ mount /dev/md4 /home [Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) +For specialised services (SEO, development, etc.), contact [OVHcloud partners](/links/partner). + +If you would like assistance using and configuring your OVHcloud solutions, please refer to our [support offers](/links/support). 
+ +If you need training or technical assistance to implement our solutions, contact your sales representative or click on [this link](/links/professional-services) to get a quote and ask our Professional Services experts for assisting you on your specific use case of your project. + Join our [community of users](/links/community). diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-sg.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-sg.md index 0de4ca168ec..2a20f7e415d 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-sg.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-sg.md @@ -1,7 +1,7 @@ --- -title: How to configure and rebuild software RAID -excerpt: Find out how to verify the state of your software RAID and rebuild it after a disk replacement -updated: 2023-08-21 +title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode +excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode +updated: 2025-12-11 --- ## Objective @@ -10,21 +10,65 @@ Redundant Array of Independent Disks (RAID) is a technology that mitigates data The default RAID level for OVHcloud server installations is RAID 1, which doubles the space taken up by your data, effectively halving the useable disk space. -**This guide explains how to configure your server’s RAID array in the event that it needs to be rebuilt due to corruption or disk failure.** +**This guide explains how to manage and rebuild a software RAID in the event of a disk replacement on your server in legacy boot mode (BIOS).** + +Before we begin, please note that this guide focuses on Dedicated servers that use legacy boot (BIOS) mode. If your server uses the UEFI mode (newer motherboards), refer to this guide [Managing and rebuilding software RAID on servers in UEFI boot mode](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). + +To check whether a server runs on legacy BIOS or UEFI mode, run the following command: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` ## Requirements -- A [dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration +- A [Dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration - Administrative (sudo) access to the server via SSH +- Understanding of RAID and partitions ## Instructions -### Removing the disk +When you purchase a new server, you may feel the need to perform a series of tests and actions. One such test could be to simulate a disk failure in order to understand the RAID rebuild process and prepare yourself in case it ever happens. 
+ +### Content overview + +- [Basic Information](#basicinformation) +- [Simulating a disk failure](#diskfailure) + - [Removing the failed disk](#diskremove) +- [Rebuilding the RAID](#raidrebuild) + - [Rebuilding the RAID in rescue mode](#rescuemode) + - [Adding the label to the SWAP partition (if applicable)](#swap-partition) + - [Rebuilding the RAID in normal mode](#normalmode) + + + +### Basic Information In a command line session, type the following code to determine the current RAID status: ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 nvme0n1p2[1] nvme0n1p20] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +This command shows us that we have two software RAID devices currently set up, with **md4** being the largest one. The **md4** RAID device consists of two partitions, which are known as **nvme1n1p4** and **nvme0n1p4**. + +The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. + +If you have a server with SATA disks, you would get the following results: + +```sh +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda2[1] sdb2[0] @@ -38,12 +82,10 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -This command shows us that we have two RAID arrays currently set up, with md4 being the largest partition. The partition consists of two disks, which are known as sda4 and sdb4. The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. - Although this command returns our RAID volumes, it doesn't tell us the size of the partitions themselves. We can find this information with the following command: ```sh -fdisk -l +[user@server_ip ~]# sudo fdisk -l Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors Disk model: HGST HUS724020AL @@ -87,71 +129,14 @@ I/O size (minimum/optimal): 512 bytes / 512 bytes The `fdisk -l` command also allows you to identify your partition type. This is an important information when it comes to rebuilding your RAID in case of a disk failure. -For **GPT** partitions, the command will return: `Disklabel type: gpt`. +For **GPT** partitions, line 6 will display: `Disklabel type: gpt`. This information can only been seen when the server is in normal mode. -```sh -Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors -Disk model: HGST HUS724020AL -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: gpt' -Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F -``` - -For **MBR** partitions, the command will return: `Disklabel type: dos`. - -```sh -Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors -Disk model: QEMU HARDDISK -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: dos' -Disk identifier: 0x150f6797 -``` - -We can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. If we were to run the mount command we can also find out the layout of the disk. 
- -```sh -mount - -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -/dev/md4 on /home type ext3 (rw,relatime) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) -``` +Still going by the results of `fdisk -l`, we can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. Alternatively, the `lsblk` command offers a different view of the partitions: ```sh -lsblk +[user@server_ip ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk @@ -171,90 +156,151 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -As the disks are currently mounted by default, to remove a disk from the RAID, we first need to unmount the disk, then simulate a failure, and finally remove it. We will remove `/dev/sda4` from the RAID with the following command: +We take note of the devices, partitions and their mount points. From the above commands and results, we have: + +- Two RAID arrays: `/dev/md2` and `/dev/md4`. +- Four partitions are part of the RAID with the mount points: `/` and `/home`. + + + +### Simulating a disk failure + +Now that we have all the necessary information, we can simulate a disk failure and proceed with the tests. In this example, we will fail the disk `sda`. + +The preferred way to do this is via the OVHcloud rescue mode environment. 
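+Before rebooting into rescue mode, it can be helpful to note which serial number belongs to each device, so that the device you fail in software (here `sda`) can later be matched to the physical drive that gets replaced. A minimal way to record this while the server is still in normal mode; the output below is illustrative only:
+
+```sh
+[user@server_ip ~]# lsblk -o NAME,SIZE,MODEL,SERIAL
+NAME   SIZE MODEL            SERIAL
+sda    1.8T HGST HUS724020AL XXXXXXXXXXXX
+sdb    1.8T HGST HUS724020AL YYYYYYYYYYYY
+```
+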
+ +First reboot the server in rescue mode and log in with the provided credentials. + +To remove a disk from the RAID, the first step is to mark it as **Failed** and remove the partitions from their respective RAID arrays. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1] sdb2[0] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0] sdb4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +From the above output, sda consists of two partitions in RAID which are **sda2** and **sda4**. + + + +#### Removing the failed disk + +First we mark the partitions **sda2** and **sda4** as failed. ```sh -umount /dev/md4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 +# mdadm: set /dev/sda2 faulty in /dev/md2 ``` -> [!warning] -> Please note that if you are connected as the user `root`, you may get the following message when you try to unmount the partition (in our case, where our md4 partition is mounted in /home): -> ->
umount: /home: target is busy
-> -> In this case, you must log out as the user root and connect as a local user (in our case `debian`), and use the following command: -> ->
debian@ns000000:/$ sudo umount /dev/md4
-> -> If you do not have a local user, you need to [create one](/pages/bare_metal_cloud/dedicated_servers/changing_root_password_linux_ds). +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 --fail /dev/sda4 +# mdadm: set /dev/sda4 faulty in /dev/md4 +``` -This will provide us with the following output: +We have now simulated a failure of the RAID, when we run the `cat /proc/mdstat` command, we have the following output: ```sh -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1](F) sdb2[0] + 931954688 blocks super 1.2 [2/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +As we can see above, the [F] next to the partitions indicates that the disk has failed or is faulty. + +Next, we remove these partitions from the RAID arrays. 
+ +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 +# mdadm: hot removed /dev/sda2 from /dev/md2 +``` + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md4 --remove /dev/sda4 +# mdadm: hot removed /dev/sda4 from /dev/md4 ``` -As we can see the, entry of `/dev/md4` is no longer mounted. However, the RAID is still active, so we need to simulate a failure to remove the disk. We can do this with the following command: +To make sure that we get a disk that is similar to an empty disk, we use the following command. Replace **sda** with your own values: ```sh -sudo mdadm --fail /dev/md4 /dev/sda4 +shred -s10M -n1 /dev/sda1 +shred -s10M -n1 /dev/sda2 +shred -s10M -n1 /dev/sda3 +shred -s10M -n1 /dev/sda4 +shred -s10M -n1 /dev/sda +``` + +The disk now appears as a new, empty drive: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +sda 8:0 0 1.8T 0 disk +sdb 8:16 0 1.8T 0 disk +├─sdb1 8:17 0 1M 0 part +├─sdb2 8:18 0 888.9G 0 part +│ └─md2 9:2 0 888.8G 0 raid1 / +├─sdb3 8:19 0 512M 0 part [SWAP] +└─sdb4 8:20 0 973.6G 0 part + └─md4 9:4 0 973.5G 0 raid1 /home ``` -We have now simulated a failure of the RAID. The next step is to remove the partition from the RAID array with the following command: +If we run the following command, we see that our disk has been successfully "wiped": ```sh -sudo mdadm --remove /dev/md4 /dev/sda4 +parted /dev/sda +GNU Parted 3.5 +Using /dev/sda +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/sda: unrecognised disk label +Model: HGST HUS724020AL (SATA) +Disk /dev/sda: 1.8T +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: ``` -You can verify that the partition has been removed with the following command: +Our RAID status should now look like this: ```sh -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] - 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 4/7 pages [16KB], 65536KB chunk +md2 : active raid1 sdb2[0] + 931954688 blocks super 1.2 [1/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk md4 : active raid1 sdb4[1] - 1020767232 blocks super 1.2 [2/1] [_U] + 1020767232 blocks super 1.2 [1/2] [_U] bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -The following command will verify that the partition has been removed: +From the results above, we can see that only two partitions now appear in the RAID arrays. We have successfully failed the disk **sda** and we can now proceed with the disk replacement. 
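+Before the disk is physically replaced, it can also be worth keeping a copy of the healthy disk's partition table and of the current RAID layout somewhere off the server, so you have a reference to compare against after the operation. A short sketch, assuming GPT partitioning and that `sgdisk` is available in the rescue environment (the file names and the `backup_host` destination are only examples):
+
+```sh
+# Save the GPT partition table of the healthy disk to a file
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk --backup=/tmp/sdb-gpt.bak /dev/sdb
+
+# Record the current RAID layout
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail --scan > /tmp/mdadm-layout.txt
+
+# Copy both files to another machine before the disk is swapped
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # scp /tmp/sdb-gpt.bak /tmp/mdadm-layout.txt user@backup_host:~/
+```
+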
+ +For more information on how to prepare and request for a disk replacement, consult this [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement) + +If you run the following command, you can have more details on the RAID array(s): ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -286,56 +332,225 @@ Consistency Policy : bitmap 1 8 20 1 active sync /dev/sdb4 ``` + + ### Rebuilding the RAID -Once the disk has been replaced, we need to copy the partition table from a healthy disk (in this example, sdb) to the new one (sda) with the following command: +> [!warning] +> +> For most servers in software RAID, after a disk replacement, the server is able to boot in normal mode (on the healthy disk) to rebuild the RAID. However, if the server is not able to boot in normal mode, it will be rebooted in rescue mode to proceed with the RAID rebuild. +> -**For GPT partitions** + + +#### Rebuilding the RAID in normal mode + +The following steps are performed in normal mode. In our example, we have replaced the disk **sda**. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sdb) to the new one (sda). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> sudo sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk`. +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdX +>> ``` +>> +>> If you receive the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. If you still cannot see the newly-created partitions (e.g. with `lsblk`), you need to reboot the server before continuing. +>> +> **For MBR partitions** +>> +>> ```sh +>> [user@server_ip ~]# sudo sfdisk -d /dev/sdX | sfdisk /dev/sdX +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +>> + +Next, we add the partitions to the RAID: ```sh -sgdisk -R /dev/sda /dev/sdb +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/sda2 +# mdadm: added /dev/sda2 + +[user@server_ip ~]# sudo mdadm --add /dev/md4 /dev/sda4 +# mdadm: re-added /dev/sda4 ``` -The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` +Use the following command to monitor the RAID rebuild: -Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +```sh +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[0] sdb2[1] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 4/4 pages [16KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +Lastly, we add a label and mount the [SWAP] partition (if applicable). 
+ +To add a label the SWAP partition: ```sh -sgdisk -G /dev/sda +[user@server_ip ~]# sudo mkswap /dev/sda4 -L swap-sda4 ``` -**For MBR partitions** +Next, retrieve the UUIDs of both swap partitions: + +```sh +[user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +[user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` -Once the disk has been replaced, we need to copy the partition table from a healthy disk (in this example, sdb) to the new one (sda) with the following command: +We replace the old UUID of the swap partition (**sda4**) with the new one in `/etc/fstab`. + +Example: + +```sh +[user@server_ip ~]# sudo nano etc/fstab + +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=BIOS /boot vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Based on the above results, the old UUID is `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` and should be replaced with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the coorect UUID. + +Next, we verify that everything is properly mounted with the following command: + +```sh +[user@server_ip ~]# sudo mount -av +/ : ignored +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Run the following command to enable the swap partition: + +```sh +[user@server_ip ~]# sudo swapon -av +``` + +Then reload the system with the following command: ```sh -sfdisk -d /dev/sdb | sfdisk /dev/sda +[user@server_ip ~]# sudo systemctl daemon-reload ``` -The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +We have now successfully completed the RAID rebuild. + + + +/// details | **Rebuilding the RAID in rescue mode** + +If you server is unable to reboot in normal mode after a disk replacement, it will be rebooted in rescue mode. + +In this example, we are replacing the disk `sdb`. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sda) to the new one (sdb). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` +>> +>> Example: +>> +>> ```sh +>> sudo sgdisk -R /dev/sdb /dev/sda +>> ``` +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdb +>> ``` +>> +>> If you the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. +>> +> **For MBR partitions** +>> +>> ```sh +>> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +>> + +We can now rebuild the RAID array. The following code snippet shows how we can add the new partitions (sdb2 and sdb4) back in the RAID array. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 +# mdadm: added /dev/sdb2 + +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 +# mdadm: re-added /dev/sdb4 +``` -We can now rebuild the RAID array. The following code snippet shows how we can rebulid the `/dev/md4` partition layout with the recently-copied sda partition table: +Use the `cat /proc/mdstat` command to monitor the RAID rebuild: ```sh -mdadm --add /dev/md4 /dev/sda4 -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 1/7 pages [4KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk -md4 : active raid1 sda4[0] sdb4[1] - 1020767232 blocks super 1.2 [2/2] [UU] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -We can verify the RAID details with the following command: +For more details on the RAID array(s): ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -362,16 +577,118 @@ mdadm --detail /dev/md4 Events : 0.95 Number Major Minor RaidDevice State - 0 8 2 0 spare rebuilding /dev/sda4 - 1 8 18 1 active sync /dev/sdb4 + 0 8 2 0 active sync /dev/sda4 + 1 8 18 1 spare rebuilding /dev/sdb4 +``` + + + +#### Adding the label to the SWAP partition (if applicable) + +Once the RAID rebuild is complete, we mount the partition containing the root of our operating system on `/mnt`. In our example, that partition is `md4`. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt +``` + +We add the label to our swap partition with the command: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sdb4 -L swap-sdb4 +mkswap: /dev/sdb4: warning: wiping old swap signature. 
+Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-sdb4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Next, we mount the following directories to make sure any manipulation we make in the chroot environment works properly: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Next, we access the `chroot` environment: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt +``` + +We retrieve the UUIDs of both swap partitions: + +```sh +root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 +root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 +``` + +Example: + +```sh +blkid /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +blkid /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -The RAID has now been rebuilt, but we still need to mount the partition (`/dev/md4` in this example) with the following command: +Next, we replace the old UUID of the swap partition (**sdb4**) with the new one in `/etc/fstab`: ```sh -mount /dev/md4 /home +root@rescue12-customer-eu:/# nano etc/fstab ``` +Example: + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /home ext4 defaults 0 0 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Make sure you replace the proper UUID. In our example above, the UUID to replace is `d6af33cf-fc15-4060-a43c-cb3b5537f58a` with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the correct UUID. + +Next, we make sure everything is properly mounted: + +```sh +root@rescue12-customer-eu:/# mount -av +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Activate the swap partition the following command: + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/sda4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sda4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sda4 +swapon: /dev/sdb4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sdb4 +``` + +We exit the `chroot` environment with exit and reload the system: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload +``` + +We umount all the disks: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` + +We have now successfully completed the RAID rebuild on the server and we can now reboot it in normal mode. + ## Go Further [Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) @@ -382,4 +699,10 @@ mount /dev/md4 /home [Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) +For specialised services (SEO, development, etc.), contact [OVHcloud partners](/links/partner). + +If you would like assistance using and configuring your OVHcloud solutions, please refer to our [support offers](/links/support). 
+ +If you need training or technical assistance to implement our solutions, contact your sales representative or click on [this link](/links/professional-services) to get a quote and ask our Professional Services experts for assisting you on your specific use case of your project. + Join our [community of users](/links/community). diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-us.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-us.md index 0de4ca168ec..2a20f7e415d 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-us.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.en-us.md @@ -1,7 +1,7 @@ --- -title: How to configure and rebuild software RAID -excerpt: Find out how to verify the state of your software RAID and rebuild it after a disk replacement -updated: 2023-08-21 +title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode +excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode +updated: 2025-12-11 --- ## Objective @@ -10,21 +10,65 @@ Redundant Array of Independent Disks (RAID) is a technology that mitigates data The default RAID level for OVHcloud server installations is RAID 1, which doubles the space taken up by your data, effectively halving the useable disk space. -**This guide explains how to configure your server’s RAID array in the event that it needs to be rebuilt due to corruption or disk failure.** +**This guide explains how to manage and rebuild a software RAID in the event of a disk replacement on your server in legacy boot mode (BIOS).** + +Before we begin, please note that this guide focuses on Dedicated servers that use legacy boot (BIOS) mode. If your server uses the UEFI mode (newer motherboards), refer to this guide [Managing and rebuilding software RAID on servers in UEFI boot mode](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). + +To check whether a server runs on legacy BIOS or UEFI mode, run the following command: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` ## Requirements -- A [dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration +- A [Dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration - Administrative (sudo) access to the server via SSH +- Understanding of RAID and partitions ## Instructions -### Removing the disk +When you purchase a new server, you may feel the need to perform a series of tests and actions. One such test could be to simulate a disk failure in order to understand the RAID rebuild process and prepare yourself in case it ever happens. 
+ +### Content overview + +- [Basic Information](#basicinformation) +- [Simulating a disk failure](#diskfailure) + - [Removing the failed disk](#diskremove) +- [Rebuilding the RAID](#raidrebuild) + - [Rebuilding the RAID in rescue mode](#rescuemode) + - [Adding the label to the SWAP partition (if applicable)](#swap-partition) + - [Rebuilding the RAID in normal mode](#normalmode) + + + +### Basic Information In a command line session, type the following code to determine the current RAID status: ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 nvme0n1p2[1] nvme0n1p20] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +This command shows us that we have two software RAID devices currently set up, with **md4** being the largest one. The **md4** RAID device consists of two partitions, which are known as **nvme1n1p4** and **nvme0n1p4**. + +The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. + +If you have a server with SATA disks, you would get the following results: + +```sh +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda2[1] sdb2[0] @@ -38,12 +82,10 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -This command shows us that we have two RAID arrays currently set up, with md4 being the largest partition. The partition consists of two disks, which are known as sda4 and sdb4. The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. - Although this command returns our RAID volumes, it doesn't tell us the size of the partitions themselves. We can find this information with the following command: ```sh -fdisk -l +[user@server_ip ~]# sudo fdisk -l Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors Disk model: HGST HUS724020AL @@ -87,71 +129,14 @@ I/O size (minimum/optimal): 512 bytes / 512 bytes The `fdisk -l` command also allows you to identify your partition type. This is an important information when it comes to rebuilding your RAID in case of a disk failure. -For **GPT** partitions, the command will return: `Disklabel type: gpt`. +For **GPT** partitions, line 6 will display: `Disklabel type: gpt`. This information can only been seen when the server is in normal mode. -```sh -Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors -Disk model: HGST HUS724020AL -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: gpt' -Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F -``` - -For **MBR** partitions, the command will return: `Disklabel type: dos`. - -```sh -Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors -Disk model: QEMU HARDDISK -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: dos' -Disk identifier: 0x150f6797 -``` - -We can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. If we were to run the mount command we can also find out the layout of the disk. 
- -```sh -mount - -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -/dev/md4 on /home type ext3 (rw,relatime) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) -``` +Still going by the results of `fdisk -l`, we can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. Alternatively, the `lsblk` command offers a different view of the partitions: ```sh -lsblk +[user@server_ip ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk @@ -171,90 +156,151 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -As the disks are currently mounted by default, to remove a disk from the RAID, we first need to unmount the disk, then simulate a failure, and finally remove it. We will remove `/dev/sda4` from the RAID with the following command: +We take note of the devices, partitions and their mount points. From the above commands and results, we have: + +- Two RAID arrays: `/dev/md2` and `/dev/md4`. +- Four partitions are part of the RAID with the mount points: `/` and `/home`. + + + +### Simulating a disk failure + +Now that we have all the necessary information, we can simulate a disk failure and proceed with the tests. In this example, we will fail the disk `sda`. + +The preferred way to do this is via the OVHcloud rescue mode environment. 
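+Before rebooting into rescue mode, it can be helpful to note which serial number belongs to each device, so that the device you fail in software (here `sda`) can later be matched to the physical drive that gets replaced. A minimal way to record this while the server is still in normal mode; the output below is illustrative only:
+
+```sh
+[user@server_ip ~]# lsblk -o NAME,SIZE,MODEL,SERIAL
+NAME   SIZE MODEL            SERIAL
+sda    1.8T HGST HUS724020AL XXXXXXXXXXXX
+sdb    1.8T HGST HUS724020AL YYYYYYYYYYYY
+```
+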
+ +First reboot the server in rescue mode and log in with the provided credentials. + +To remove a disk from the RAID, the first step is to mark it as **Failed** and remove the partitions from their respective RAID arrays. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1] sdb2[0] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0] sdb4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +From the above output, sda consists of two partitions in RAID which are **sda2** and **sda4**. + + + +#### Removing the failed disk + +First we mark the partitions **sda2** and **sda4** as failed. ```sh -umount /dev/md4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 +# mdadm: set /dev/sda2 faulty in /dev/md2 ``` -> [!warning] -> Please note that if you are connected as the user `root`, you may get the following message when you try to unmount the partition (in our case, where our md4 partition is mounted in /home): -> ->
umount: /home: target is busy
-> -> In this case, you must log out as the user root and connect as a local user (in our case `debian`), and use the following command: -> ->
debian@ns000000:/$ sudo umount /dev/md4
-> -> If you do not have a local user, you need to [create one](/pages/bare_metal_cloud/dedicated_servers/changing_root_password_linux_ds). +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 --fail /dev/sda4 +# mdadm: set /dev/sda4 faulty in /dev/md4 +``` -This will provide us with the following output: +We have now simulated a failure of the RAID, when we run the `cat /proc/mdstat` command, we have the following output: ```sh -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1](F) sdb2[0] + 931954688 blocks super 1.2 [2/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +As we can see above, the [F] next to the partitions indicates that the disk has failed or is faulty. + +Next, we remove these partitions from the RAID arrays. 
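+
+If you want to double-check beforehand that the kernel really flags both partitions as faulty, `mdadm --detail` can be filtered for that state. A quick check, assuming the same array names as above:
+
+```sh
+# Each member reported as "faulty" here is a partition to remove in the next step.
+mdadm --detail /dev/md2 | grep -E 'State :|faulty'
+mdadm --detail /dev/md4 | grep -E 'State :|faulty'
+```
+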
+ +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 +# mdadm: hot removed /dev/sda2 from /dev/md2 +``` + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md4 --remove /dev/sda4 +# mdadm: hot removed /dev/sda4 from /dev/md4 ``` -As we can see the, entry of `/dev/md4` is no longer mounted. However, the RAID is still active, so we need to simulate a failure to remove the disk. We can do this with the following command: +To make sure that we get a disk that is similar to an empty disk, we use the following command. Replace **sda** with your own values: ```sh -sudo mdadm --fail /dev/md4 /dev/sda4 +shred -s10M -n1 /dev/sda1 +shred -s10M -n1 /dev/sda2 +shred -s10M -n1 /dev/sda3 +shred -s10M -n1 /dev/sda4 +shred -s10M -n1 /dev/sda +``` + +The disk now appears as a new, empty drive: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +sda 8:0 0 1.8T 0 disk +sdb 8:16 0 1.8T 0 disk +├─sdb1 8:17 0 1M 0 part +├─sdb2 8:18 0 888.9G 0 part +│ └─md2 9:2 0 888.8G 0 raid1 / +├─sdb3 8:19 0 512M 0 part [SWAP] +└─sdb4 8:20 0 973.6G 0 part + └─md4 9:4 0 973.5G 0 raid1 /home ``` -We have now simulated a failure of the RAID. The next step is to remove the partition from the RAID array with the following command: +If we run the following command, we see that our disk has been successfully "wiped": ```sh -sudo mdadm --remove /dev/md4 /dev/sda4 +parted /dev/sda +GNU Parted 3.5 +Using /dev/sda +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/sda: unrecognised disk label +Model: HGST HUS724020AL (SATA) +Disk /dev/sda: 1.8T +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: ``` -You can verify that the partition has been removed with the following command: +Our RAID status should now look like this: ```sh -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] - 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 4/7 pages [16KB], 65536KB chunk +md2 : active raid1 sdb2[0] + 931954688 blocks super 1.2 [1/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk md4 : active raid1 sdb4[1] - 1020767232 blocks super 1.2 [2/1] [_U] + 1020767232 blocks super 1.2 [1/2] [_U] bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -The following command will verify that the partition has been removed: +From the results above, we can see that only two partitions now appear in the RAID arrays. We have successfully failed the disk **sda** and we can now proceed with the disk replacement. 
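+
+Before requesting the replacement, it can also help to note which physical drive is which. A simple way to do this, assuming the `smartmontools` package is available in the rescue environment, is to record the model and serial number reported by each disk:
+
+```sh
+# Print the identity (model, serial number) of the healthy disk sdb;
+# wiping the partitions of sda does not change the serial number it reports.
+smartctl -i /dev/sdb
+smartctl -i /dev/sda
+```
+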
+ +For more information on how to prepare and request for a disk replacement, consult this [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement) + +If you run the following command, you can have more details on the RAID array(s): ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -286,56 +332,225 @@ Consistency Policy : bitmap 1 8 20 1 active sync /dev/sdb4 ``` + + ### Rebuilding the RAID -Once the disk has been replaced, we need to copy the partition table from a healthy disk (in this example, sdb) to the new one (sda) with the following command: +> [!warning] +> +> For most servers in software RAID, after a disk replacement, the server is able to boot in normal mode (on the healthy disk) to rebuild the RAID. However, if the server is not able to boot in normal mode, it will be rebooted in rescue mode to proceed with the RAID rebuild. +> -**For GPT partitions** + + +#### Rebuilding the RAID in normal mode + +The following steps are performed in normal mode. In our example, we have replaced the disk **sda**. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sdb) to the new one (sda). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> sudo sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk`. +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdX +>> ``` +>> +>> If you receive the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. If you still cannot see the newly-created partitions (e.g. with `lsblk`), you need to reboot the server before continuing. +>> +> **For MBR partitions** +>> +>> ```sh +>> [user@server_ip ~]# sudo sfdisk -d /dev/sdX | sfdisk /dev/sdX +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +>> + +Next, we add the partitions to the RAID: ```sh -sgdisk -R /dev/sda /dev/sdb +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/sda2 +# mdadm: added /dev/sda2 + +[user@server_ip ~]# sudo mdadm --add /dev/md4 /dev/sda4 +# mdadm: re-added /dev/sda4 ``` -The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` +Use the following command to monitor the RAID rebuild: -Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +```sh +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[0] sdb2[1] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 4/4 pages [16KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +Lastly, we add a label and mount the [SWAP] partition (if applicable). 
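+
+Before running `mkswap`, double-check which partition on your disks actually carries the swap signature: depending on the partitioning scheme it can be partition 3 rather than partition 4 (in the `lsblk` output earlier in this guide, for example, the 512M partition flagged `[SWAP]` is the third one). A quick way to verify, using the healthy disk `sdb` as the reference:
+
+```sh
+# The partition whose FSTYPE column reports "swap" is the one to recreate
+# and label on the new disk.
+lsblk -o NAME,SIZE,FSTYPE,LABEL,MOUNTPOINT /dev/sdb
+```
+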
+ +To add a label the SWAP partition: ```sh -sgdisk -G /dev/sda +[user@server_ip ~]# sudo mkswap /dev/sda4 -L swap-sda4 ``` -**For MBR partitions** +Next, retrieve the UUIDs of both swap partitions: + +```sh +[user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +[user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` -Once the disk has been replaced, we need to copy the partition table from a healthy disk (in this example, sdb) to the new one (sda) with the following command: +We replace the old UUID of the swap partition (**sda4**) with the new one in `/etc/fstab`. + +Example: + +```sh +[user@server_ip ~]# sudo nano etc/fstab + +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=BIOS /boot vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Based on the above results, the old UUID is `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` and should be replaced with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the coorect UUID. + +Next, we verify that everything is properly mounted with the following command: + +```sh +[user@server_ip ~]# sudo mount -av +/ : ignored +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Run the following command to enable the swap partition: + +```sh +[user@server_ip ~]# sudo swapon -av +``` + +Then reload the system with the following command: ```sh -sfdisk -d /dev/sdb | sfdisk /dev/sda +[user@server_ip ~]# sudo systemctl daemon-reload ``` -The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +We have now successfully completed the RAID rebuild. + + + +/// details | **Rebuilding the RAID in rescue mode** + +If you server is unable to reboot in normal mode after a disk replacement, it will be rebooted in rescue mode. + +In this example, we are replacing the disk `sdb`. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sda) to the new one (sdb). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` +>> +>> Example: +>> +>> ```sh +>> sudo sgdisk -R /dev/sdb /dev/sda +>> ``` +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdb +>> ``` +>> +>> If you the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. +>> +> **For MBR partitions** +>> +>> ```sh +>> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +>> + +We can now rebuild the RAID array. The following code snippet shows how we can add the new partitions (sdb2 and sdb4) back in the RAID array. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 +# mdadm: added /dev/sdb2 + +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 +# mdadm: re-added /dev/sdb4 +``` -We can now rebuild the RAID array. The following code snippet shows how we can rebulid the `/dev/md4` partition layout with the recently-copied sda partition table: +Use the `cat /proc/mdstat` command to monitor the RAID rebuild: ```sh -mdadm --add /dev/md4 /dev/sda4 -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 1/7 pages [4KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk -md4 : active raid1 sda4[0] sdb4[1] - 1020767232 blocks super 1.2 [2/2] [UU] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -We can verify the RAID details with the following command: +For more details on the RAID array(s): ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -362,16 +577,118 @@ mdadm --detail /dev/md4 Events : 0.95 Number Major Minor RaidDevice State - 0 8 2 0 spare rebuilding /dev/sda4 - 1 8 18 1 active sync /dev/sdb4 + 0 8 2 0 active sync /dev/sda4 + 1 8 18 1 spare rebuilding /dev/sdb4 +``` + + + +#### Adding the label to the SWAP partition (if applicable) + +Once the RAID rebuild is complete, we mount the partition containing the root of our operating system on `/mnt`. In our example, that partition is `md4`. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt +``` + +We add the label to our swap partition with the command: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sdb4 -L swap-sdb4 +mkswap: /dev/sdb4: warning: wiping old swap signature. 
+Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-sdb4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Next, we mount the following directories to make sure any manipulation we make in the chroot environment works properly: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Next, we access the `chroot` environment: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt +``` + +We retrieve the UUIDs of both swap partitions: + +```sh +root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 +root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 +``` + +Example: + +```sh +blkid /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +blkid /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -The RAID has now been rebuilt, but we still need to mount the partition (`/dev/md4` in this example) with the following command: +Next, we replace the old UUID of the swap partition (**sdb4**) with the new one in `/etc/fstab`: ```sh -mount /dev/md4 /home +root@rescue12-customer-eu:/# nano etc/fstab ``` +Example: + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /home ext4 defaults 0 0 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Make sure you replace the proper UUID. In our example above, the UUID to replace is `d6af33cf-fc15-4060-a43c-cb3b5537f58a` with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the correct UUID. + +Next, we make sure everything is properly mounted: + +```sh +root@rescue12-customer-eu:/# mount -av +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Activate the swap partition the following command: + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/sda4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sda4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sda4 +swapon: /dev/sdb4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sdb4 +``` + +We exit the `chroot` environment with exit and reload the system: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload +``` + +We umount all the disks: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` + +We have now successfully completed the RAID rebuild on the server and we can now reboot it in normal mode. + ## Go Further [Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) @@ -382,4 +699,10 @@ mount /dev/md4 /home [Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) +For specialised services (SEO, development, etc.), contact [OVHcloud partners](/links/partner). + +If you would like assistance using and configuring your OVHcloud solutions, please refer to our [support offers](/links/support). 
+ +If you need training or technical assistance to implement our solutions, contact your sales representative or click on [this link](/links/professional-services) to get a quote and ask our Professional Services experts for assisting you on your specific use case of your project. + Join our [community of users](/links/community). diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-es.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-es.md index ccf7a08920c..864886451bf 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-es.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-es.md @@ -1,34 +1,73 @@ --- -title: Configuración y reconstrucción del RAID por software -excerpt: "Cómo verificar el estado del RAID por software de su servidor y reconstruirlo después de un reemplazo de disco" -updated: 2023-08-21 +title: Gestión y reconstrucción del RAID software en servidores en modo de arranque legacy (BIOS) +excerpt: "Descubra cómo gestionar y reconstruir el RAID software tras un reemplazo de disco en su servidor en modo de arranque legacy (BIOS)" +updated: 2025-12-11 --- -> [!primary] -> Esta traducción ha sido generada de forma automática por nuestro partner SYSTRAN. En algunos casos puede contener términos imprecisos, como en las etiquetas de los botones o los detalles técnicos. En caso de duda, le recomendamos que consulte la versión inglesa o francesa de la guía. Si quiere ayudarnos a mejorar esta traducción, por favor, utilice el botón «Contribuir» de esta página. -> - ## Objetivo -El RAID (Redundant Array of Independent Disks) es un conjunto de técnicas diseñadas para prevenir la pérdida de datos en un servidor, replicándolos en varios discos. +El RAID (Redundant Array of Independent Disks) es un conjunto de técnicas diseñadas para mitigar la pérdida de datos en un servidor replicándolos en varios discos. + +El nivel de RAID predeterminado para las instalaciones de servidores de OVHcloud es RAID 1, lo que duplica el espacio ocupado por sus datos, reduciendo así a la mitad el espacio de disco utilizable. + +**Este guía explica cómo gestionar y reconstruir un RAID software en caso de reemplazar un disco en su servidor en modo de arranque legacy (BIOS).** -El nivel de RAID por defecto en los servidores de OVHcloud es RAID 1. Con este nivel de RAID, el volumen que ocupan los datos se duplica, por lo que el espacio en disco útil se reduce a la mitad. +Antes de comenzar, tenga en cuenta que esta guía se centra en los servidores dedicados que utilizan el modo de arranque legacy (BIOS). Si su servidor utiliza el modo UEFI (tarjetas madre más recientes), consulte esta guía [Gestión y reconstrucción del RAID software en servidores en modo de arranque UEFI](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). -**Esta guía explica cómo configurar el RAID de un servidor en caso de que sea necesario reconstruirlo por corrupción o fallo del disco.** +Para verificar si un servidor se ejecuta en modo BIOS o en modo UEFI, ejecute el siguiente comando: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` ## Requisitos -- Tener un [servidor dedicado](/links/bare-metal/bare-metal) con RAID por software. -- Tener acceso al servidor por SSH como administrador (sudo). +- Tener un [servidor dedicado](/links/bare-metal/bare-metal) con una configuración de RAID software. +- Tener acceso a su servidor mediante SSH como administrador (sudo). 
+- Conocimiento del RAID y las particiones ## Procedimiento -### Eliminación del disco +### Presentación del contenido + +- [Información básica](#basicinformation) +- [Simular una falla de disco](#diskfailure) + - [Retirar el disco defectuoso](#diskremove) +- [Reconstrucción del RAID](#raidrebuild) + - [Reconstrucción del RAID en modo rescue](#rescuemode) + - [Añadir la etiqueta a la partición SWAP (si aplica)](#swap-partition) + - [Reconstrucción del RAID en modo normal](#normalmode) + + + + +### Información básica + +En una sesión de línea de comandos, escriba el siguiente código para determinar el estado actual del RAID. + +```sh +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 nvme0n1p2[1] nvme0n1p20] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +Este comando nos indica que dos dispositivos de RAID software están actualmente configurados, **md4** siendo el más grande. El dispositivo de RAID **md4** está compuesto por dos particiones, llamadas **nvme1n1p4** y **nvme0n1p4**. -Para comprobar el estado actual del RAID, utilice el siguiente comando: +El [UU] significa que todos los discos funcionan normalmente. Un `_` indica un disco defectuoso. + +Si posee un servidor con discos SATA, obtendrá los siguientes resultados: ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda2[1] sdb2[0] @@ -42,12 +81,10 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Este comando muestra dos conjuntos RAID configurados actualmente. La partición de mayor tamaño es **md4**, y está formada por dos discos llamados **sda4** y **sdb4**. **[UU]** significa que todos los discos funcionan con normalidad. Un guion bajo (**_**) indicaría un fallo en un disco. - -Aunque este comando muestra los volúmenes RAID, no indica el tamaño de las particiones. Para obtener esta información, utilice el siguiente comando: +Aunque este comando devuelve nuestros volúmenes de RAID, no nos indica el tamaño de las particiones mismas. Podemos encontrar esta información con el siguiente comando: ```sh -fdisk -l +[user@server_ip ~]# sudo fdisk -l Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors Disk model: HGST HUS724020AL @@ -89,73 +126,16 @@ Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes ``` -El comando `fdisk -l` también le permite identificar su tipo de partición. Esta es una información importante que deberá conocer cuando se trate de reconstruir su RAID en caso de fallo de un disco. +El comando `fdisk -l` también le permite identificar el tipo de partición. Esta es una información importante para reconstruir su RAID en caso de fallo de un disco. -Para las particiones **GPT**, el comando devolverá: `Disklabel type: gpt`. +Para las particiones **GPT**, la línea 6 mostrará: `Disklabel type: gpt`. Esta información solo es visible cuando el servidor está en modo normal. 
-```sh -Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors -Disk model: HGST HUS724020AL -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: gpt' -Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F -``` +Siempre basándonos en los resultados de `fdisk -l`, podemos ver que `/dev/md2` se compone de 888.8GB y `/dev/md4` contiene 973.5GB. -Para las particiones **MBR**, el comando devolverá: `Disklabel type: dos`. +Alternativamente, el comando `lsblk` ofrece una vista diferente de las particiones: ```sh -Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors -Disk model: QEMU HARDDISK -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: dos' -Disk identifier: 0x150f6797 -``` - -Este comando muestra que **/dev/md2** contiene 888,8 GB y **/dev/md4** contiene 973,5 GB. Para ver la disposición del disco, ejecute el comando `mount`. - -```sh -mount - -sysfs on /sys type sysfs (rw,nosed,nodev,noexec,relatime) -proc on /proc type proc (rw,nosed,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosed,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosed,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosed,noexec,relatime,size=326656k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosed,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosed,nodev) -tmpfs on /run/lock type tmpfs (rw,nosed,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosed,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified tipo cgroup2 (rw,nosed,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosed,nodev,noexec,relatime,xlute,name=systemd) -pstore on /sys/fs/pstore tipo pstore (rw,nosed,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosed,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosed,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosed,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosed,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosed,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosed,nodev,noexec,relatime,net_cls,net_prio) -cpuacct on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosed,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosed,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosed,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosed,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosed,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw, relatime, pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -/dev/md4 on /home type ext3 (rw,relatime) -tmpfs on /run/user/1000 type tmpfs (rw,nosed,nodev,relatime,size=326652k,mode=700,uid=1000,gid=1000) -``` - -Como alternativa, el comando `lsblk` ofrece una vista diferente de las particiones: - -```sh -lsblk 
+[user@server_ip ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk @@ -175,172 +155,241 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Ahora los discos están montados. Para sacar un disco del RAID, es necesario desmontar el disco en primer lugar y luego marcarlo como defectuoso para poder eliminarlo. -A continuación, podrá sacar **/dev/sda4** del RAID. +Tomamos en cuenta los dispositivos, las particiones y sus puntos de montaje. A partir de los comandos y resultados anteriores, tenemos: -```sh -umount /dev/md4 -``` +- Dos matrices RAID: `/dev/md2` y `/dev/md4`. +- Cuatro particiones forman parte del RAID con los puntos de montaje: `/` y `/home`. + + + +### Simular una falla de disco + +Ahora que disponemos de toda la información necesaria, podemos simular una falla de disco y continuar con las pruebas. En este ejemplo, haremos que el disco `sda` falle. -> [!warning] -> Tenga en cuenta que, si está conectado como usuario `root`, puede obtener el siguiente mensaje cuando intente desmontar la partición (en nuestro caso, la partición md4 está montada en /home): -> ->
umount: /home: target is busy
-> -> En ese caso, deberá desconectarse como usuario root y conectarse como usuario local (en nuestro caso, `debian`) y utilizar el siguiente comando: -> ->
debian@ns000000:/$ sudo umount /dev/md4
-> -> Si no tiene un usuario local, [debe crear uno](/pages/bare_metal_cloud/dedicated_servers/changing_root_password_linux_ds). +El medio preferido para lograrlo es el entorno en modo rescue de OVHcloud. -Obtendrá la siguiente respuesta: +Reinicie primero el servidor en modo rescue y conéctese con las credenciales proporcionadas. + +Para retirar un disco del RAID, el primer paso es marcarlo como **Failed** y retirar las particiones de sus matrices RAID respectivas. ```sh -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1] sdb2[0] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0] sdb4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: ``` -La entrada **/dev/md4** ya no está montada. Sin embargo, el RAID sigue activo. Para poder retirar el disco, debe marcarlo como defectuoso con el siguiente comando: +A partir de la salida anterior, sda se compone de dos particiones en RAID que son **sda2** y **sda4**. 
+ + + +#### Retirar el disco defectuoso + +Comenzamos marcando las particiones **sda2** y **sda4** como **failed**. ```sh -sudo mdadm --fail /dev/md4 /dev/sda4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 +# mdadm: set /dev/sda2 faulty in /dev/md2 ``` -De este modo, hemos simultado un fallo del RAID. Ya puede proceder a eliminar la partición del RAID con el siguiente comando: - ```sh -sudo mdadm --remove /dev/md4 /dev/sda4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 --fail /dev/sda4 +# mdadm: set /dev/sda4 faulty in /dev/md4 ``` -Para comprobar el nuevo estado del RAID, utilice el siguiente comando: +Hemos simulado ahora una falla del RAID, cuando ejecutamos el comando `cat /proc/mdstat`, obtenemos el siguiente resultado: ```sh -cat /proc/mdstat +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] - 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 4/7 pages [16KB], 65536KB chunk +md2 : active raid1 sda2[1](F) sdb2[0] + 931954688 blocks super 1.2 [2/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk -md4 : active raid1 sdb4[1] - 1020767232 blocks super 1.2 [2/1] [_U] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/2] [_U] bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -El siguiente comando permite comprobar que la partición se ha eliminado: +Como podemos ver arriba, el [F] junto a las particiones indica que el disco está fallando o defectuoso. + +A continuación, retiramos estas particiones de las matrices RAID. ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 +# mdadm: hot removed /dev/sda2 from /dev/md2 +``` -/dev/md4: - Version : 1.2 - Creation Time : Tue Jan 24 15:35:02 2023 - Raid Level : raid1 - Array Size : 1020767232 (973.48 GiB 1045.27 GB) - Used Dev Size : 1020767232 (973.48 GiB 1045.27 GB) - Raid Devices : 2 - Total Devices : 1 - Persistence : Superblock is persistent +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md4 --remove /dev/sda4 +# mdadm: hot removed /dev/sda4 from /dev/md4 +``` - Intent Bitmap : Internal +Para asegurarnos de obtener un disco que sea similar a un disco vacío, utilizamos el siguiente comando. Reemplace **sda** por sus propios valores: - Update Time : Tue Jan 24 16:28:03 2023 - State : clean, degraded - Active Devices : 1 - Working Devices : 1 - Failed Devices : 0 - Spare Devices : 0 +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +sda 8:0 0 1.8T 0 disk +sdb 8:16 0 1.8T 0 disk +├─sdb1 8:17 0 1M 0 part +├─sdb2 8:18 0 888.9G 0 part +│ └─md2 9:2 0 888.8G 0 raid1 / +├─sdb3 8:19 0 512M 0 part [SWAP] +└─sdb4 8:20 0 973.6G 0 part + └─md4 9:4 0 973.5G 0 raid1 /home +``` -Consistency Policy : bitmap +Si ejecutamos el siguiente comando, vemos que nuestro disco ha sido correctamente «limpiado»: - Name : md4 - UUID : 7b5c1d80:0a7ab4c2:e769b5e5:9c6eaa0f - Events : 21 +```sh +parted /dev/sda +GNU Parted 3. 
- Number Major Minor RaidDevice State - - 0 0 0 removed - 1 8 20 1 active sync /dev/sdb4 +# mdadm: re-added /dev/sda4 ``` -### Reconstrucción del RAID +Use el siguiente comando para supervisar la reconstrucción del RAID: -Una vez sustituido el disco, copie la tabla de particiones desde un disco « sano » (**sdb** en el ejemplo) a la nueva partición (**sda**) con el siguiente comando: +```sh +[user@server_ip ~]# cat /proc/mdstat -**Para las particiones GPT** +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[0] sdb2[1] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 4/4 pages [16KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +Finalmente, añadimos una etiqueta y montamos la partición [SWAP] (si aplica). + +Para añadir una etiqueta a la partición SWAP: ```sh -sgdisk -R /dev/sda /dev/sdb +[user@server_ip ~]# sudo mkswap /dev/sdb4 -L swap-sdb4 ``` -El comando debe tener el siguiente formato: `sgdisk -R /dev/newdisk /dev/healthydisk`. +A continuación, obtenga los UUID de ambas particiones de intercambio: + +```sh +[user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +[user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +``` -Una vez realizada esta operación, podrá aleatoriamente utilizar el GUID del nuevo disco para evitar cualquier conflicto de GUID con el resto de discos: +Reemplazamos el antiguo UUID de la partición de intercambio (**sda4**) por el nuevo en `/etc/fstab`: ```sh -sgdisk -G /dev/sda +[user@server_ip ~]# sudo nano etc/fstab ``` -**Para las particiones MBR** +Asegúrese de reemplazar el UUID correcto. + +A continuación, recargue el sistema con el siguiente comando: + +```sh +[user@server_ip ~]# sudo systemctl daemon-reload +``` -Una vez sustituido el disco, copie la tabla de particiones desde un disco sano (**sdb** en el ejemplo) en la nueva partición (**sda**) con el siguiente comando: +Ejecute el siguiente comando para activar la partición de intercambio: ```sh -sfdisk -d /dev/sdb | sfdisk /dev/sda +[user@server_ip ~]# sudo swapon -av ``` -El comando debe tener el siguiente formato: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +La reconstrucción del RAID ahora está terminada. + + + +/// details | **Reconstrucción del RAID en modo rescue** + +Una vez reemplazado el disco, debemos copiar la tabla de particiones del disco sano (en este ejemplo, sda) al nuevo (sdb). + +> [!tabs] +> **Para particiones GPT** +>> +>> ```sh +>> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> El comando debe tener el siguiente formato: `sgdisk -R /dev/nuevo disco /dev/disco sano` +>> +>> Ejemplo: +>> +>> ```sh +>> sudo sgdisk -R /dev/sdb /dev/sda +>> ``` +>> +>> Una vez realizada esta operación, el siguiente paso consiste en asignar un GUID aleatorio al nuevo disco para evitar conflictos con los GUID de otros discos: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdb +>> ``` +>> +>> Si aparece el siguiente mensaje: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> Puede simplemente ejecutar el comando `partprobe`. 
Si aún no ve las nuevas particiones (por ejemplo, con `lsblk`), deberá reiniciar el servidor antes de continuar. +>> +> **Para particiones MBR** +>> +>> ```sh +>> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb +>> ``` +>> +>> El comando debe tener el siguiente formato: `sfdisk -d /dev/disco sano | sfdisk /dev/nuevo disco` +>> + +Ahora podemos reconstruir la matriz RAID. El siguiente fragmento de código muestra cómo añadir las nuevas particiones (sdb2 y sdb4) a la matriz RAID. -Ya puede reconstruir el RAID. El siguiente fragmento de código muestra cómo reconstruir la disposición de la partición **/dev/md4** con la tabla de particiones « sda » anteriormente copiada: +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 +# mdadm: added /dev/sdb2 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 +# mdadm: re-added /dev/sdb4 +``` + +Use el comando `cat /proc/mdstat` para supervisar la reconstrucción del RAID: ```sh -mdadm --add /dev/md4 /dev/sda4 -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 1/7 pages [4KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk -md4 : active raid1 sda4[0] sdb4[1] - 1020767232 blocks super 1.2 [2/2] [UU] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -Para comprobar los detalles del RAID, utilice el siguiente comando: +Para obtener más detalles sobre la o las matrices RAID: ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -371,18 +420,131 @@ mdadm --detail /dev/md4 1 8 18 1 active sync /dev/sdb4 ``` -Una vez reconstruido el RAID, monte la partición (**/dev/md4**, en el ejemplo) con el siguiente comando: + + +#### Añadimos la etiqueta a la partición SWAP (si aplica) + +Una vez finalizada la reconstrucción del RAID, montamos la partición que contiene la raíz de nuestro sistema operativo en `/mnt`. En nuestro ejemplo, esta partición es `md4`. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt +``` + +Añadimos la etiqueta a nuestra partición de intercambio con el siguiente comando: + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sda4 -L swap-sda4 +mkswap: /dev/sda4: warning: wiping old swap signature. 
+Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +A continuación, montamos los siguientes directorios para asegurarnos de que cualquier manipulación que realicemos en el entorno chroot funcione correctamente: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +A continuación, accedemos al entorno `chroot`: ```sh -mount /dev/md4 /home +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt ``` +Recuperamos los UUID de ambas particiones de intercambio: + +```sh +root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 +root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 +``` + +Ejemplo: + +```sh +blkid /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +``` + +```sh +blkid /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +A continuación, reemplazamos el antiguo UUID de la partición de intercambio (**sdb4**) por el nuevo en `/etc/fstab`: + +```sh +root@rescue12-customer-eu:/# nano etc/fstab +``` + +Ejemplo: + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /home ext4 defaults 0 0 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Asegúrese de reemplazar el UUID correcto. En nuestro ejemplo anterior, el UUID a reemplazar es `d6af33cf-fc15-4060-a43c-cb3b5537f58a` por el nuevo `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Asegúrese de reemplazar el UUID correcto. + +A continuación, nos aseguramos de que todo esté correctamente montado: + +```sh +root@rescue12-customer-eu:/# mount -av +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Recargue el sistema con el siguiente comando: + +```sh +root@rescue12-customer-eu:/# systemctl daemon-reload +``` + +Active la partición de intercambio con el siguiente comando: + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/sda4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sda4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sda4 +swapon: /dev/sdb4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sdb4 +``` + +Salga del entorno Chroot con `exit` y desmonte todos los discos: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` + +Hemos terminado con éxito la reconstrucción del RAID en el servidor y ahora podemos reiniciar el servidor en modo normal. + + ## Más información +[Reemplazo a caliente - RAID software](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) + +[API OVHcloud y Almacenamiento](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) + +[Gestión del RAID hardware](/pages/bare_metal_cloud/dedicated_servers/raid_hard) + +[Reemplazo a caliente - RAID hardware](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) -[Sustituir un disco en caliente en un servidor con RAID por software](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) +Para servicios especializados (posicionamiento, desarrollo, etc.), contacte con los [socios OVHcloud](/links/partner). 
-[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) +Si desea beneficiarse de una asistencia en el uso y configuración de sus soluciones OVHcloud, le invitamos a consultar nuestras distintas [ofertas de soporte](/links/support). -[RAID por hardware](/pages/bare_metal_cloud/dedicated_servers/raid_hard) +Si necesita una formación o asistencia técnica para la implementación de nuestras soluciones, contacte con su comercial o haga clic en [este enlace](/links/professional-services) para obtener un presupuesto y solicitar un análisis personalizado de su proyecto a nuestros expertos del equipo Professional Services. -Interactúe con nuestra comunidad de usuarios en . +Interactúe con nuestra [comunidad de usuarios](/links/community). \ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-us.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-us.md index ccf7a08920c..c9db8b1e837 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-us.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-us.md @@ -1,34 +1,73 @@ --- -title: Configuración y reconstrucción del RAID por software -excerpt: "Cómo verificar el estado del RAID por software de su servidor y reconstruirlo después de un reemplazo de disco" -updated: 2023-08-21 +title: Gestión y reconstrucción del RAID software en servidores en modo de arranque legacy (BIOS) +excerpt: "Descubra cómo gestionar y reconstruir el RAID software tras un reemplazo de disco en su servidor en modo de arranque legacy (BIOS)" +updated: 2025-12-11 --- -> [!primary] -> Esta traducción ha sido generada de forma automática por nuestro partner SYSTRAN. En algunos casos puede contener términos imprecisos, como en las etiquetas de los botones o los detalles técnicos. En caso de duda, le recomendamos que consulte la versión inglesa o francesa de la guía. Si quiere ayudarnos a mejorar esta traducción, por favor, utilice el botón «Contribuir» de esta página. -> - ## Objetivo -El RAID (Redundant Array of Independent Disks) es un conjunto de técnicas diseñadas para prevenir la pérdida de datos en un servidor, replicándolos en varios discos. +El RAID (Redundant Array of Independent Disks) es un conjunto de técnicas diseñadas para mitigar la pérdida de datos en un servidor replicándolos en varios discos. + +El nivel de RAID predeterminado para las instalaciones de servidores de OVHcloud es RAID 1, lo que duplica el espacio ocupado por sus datos, reduciendo así a la mitad el espacio de disco utilizable. + +**Este guía explica cómo gestionar y reconstruir un RAID software en caso de reemplazar un disco en su servidor en modo de arranque legacy (BIOS).** -El nivel de RAID por defecto en los servidores de OVHcloud es RAID 1. Con este nivel de RAID, el volumen que ocupan los datos se duplica, por lo que el espacio en disco útil se reduce a la mitad. + -**Esta guía explica cómo configurar el RAID de un servidor en caso de que sea necesario reconstruirlo por corrupción o fallo del disco.** +Antes de comenzar, tenga en cuenta que esta guía se centra en los servidores dedicados que utilizan el modo de arranque legacy (BIOS). Si su servidor utiliza el modo UEFI (tarjetas madre más recientes), consulte esta guía [Gestión y reconstrucción del RAID software en servidores en modo de arranque UEFI](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). 
+ +Para verificar si un servidor se ejecuta en modo BIOS o en modo UEFI, ejecute el siguiente comando: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` ## Requisitos +- Tener un [servidor dedicado](/links/bare-metal/bare-metal) con una configuración de RAID software. +- Tener acceso a su servidor mediante SSH como administrador (sudo). +- Conocimiento del RAID y las particiones + +## En práctica +### Presentación del contenido + +- [Información básica](#basicinformation) +- [Simular una falla de disco](#diskfailure) + - [Retirar el disco defectuoso](#diskremove) +- [Reconstrucción del RAID](#raidrebuild) + - [Reconstrucción del RAID en modo rescue](#rescuemode) + - [Añadir la etiqueta a la partición SWAP (si aplica)](#swap-partition) + - [Reconstrucción del RAID en modo normal](#normalmode) + + + -- Tener un [servidor dedicado](/links/bare-metal/bare-metal) con RAID por software. -- Tener acceso al servidor por SSH como administrador (sudo). +### Información básica + +En una sesión de línea de comandos, escriba el siguiente código para determinar el estado actual del RAID. + +```sh +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 nvme0n1p2[1] nvme0n1p20] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` -## Procedimiento +Este comando nos indica que dos dispositivos de RAID software están actualmente configurados, **md4** siendo el más grande. El dispositivo de RAID **md4** está compuesto por dos particiones, llamadas **nvme1n1p4** y **nvme0n1p4**. -### Eliminación del disco +El [UU] significa que todos los discos funcionan normalmente. Un `_` indica un disco defectuoso. -Para comprobar el estado actual del RAID, utilice el siguiente comando: +Si posee un servidor con discos SATA, obtendrá los siguientes resultados: ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda2[1] sdb2[0] @@ -42,12 +81,10 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Este comando muestra dos conjuntos RAID configurados actualmente. La partición de mayor tamaño es **md4**, y está formada por dos discos llamados **sda4** y **sdb4**. **[UU]** significa que todos los discos funcionan con normalidad. Un guion bajo (**_**) indicaría un fallo en un disco. - -Aunque este comando muestra los volúmenes RAID, no indica el tamaño de las particiones. Para obtener esta información, utilice el siguiente comando: +Aunque este comando devuelve nuestros volúmenes de RAID, no nos indica el tamaño de las particiones mismas. Podemos encontrar esta información con el siguiente comando: ```sh -fdisk -l +[user@server_ip ~]# sudo fdisk -l Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors Disk model: HGST HUS724020AL @@ -89,73 +126,16 @@ Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes ``` -El comando `fdisk -l` también le permite identificar su tipo de partición. Esta es una información importante que deberá conocer cuando se trate de reconstruir su RAID en caso de fallo de un disco. +El comando `fdisk -l` también le permite identificar el tipo de partición. 
Esta es una información importante para reconstruir su RAID en caso de fallo de un disco. -Para las particiones **GPT**, el comando devolverá: `Disklabel type: gpt`. +Para las particiones **GPT**, la línea 6 mostrará: `Disklabel type: gpt`. Esta información solo es visible cuando el servidor está en modo normal. -```sh -Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors -Disk model: HGST HUS724020AL -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: gpt' -Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F -``` +Siempre basándonos en los resultados de `fdisk -l`, podemos ver que `/dev/md2` se compone de 888.8GB y `/dev/md4` contiene 973.5GB. -Para las particiones **MBR**, el comando devolverá: `Disklabel type: dos`. +Alternativamente, el comando `lsblk` ofrece una vista diferente de las particiones: ```sh -Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors -Disk model: QEMU HARDDISK -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: dos' -Disk identifier: 0x150f6797 -``` - -Este comando muestra que **/dev/md2** contiene 888,8 GB y **/dev/md4** contiene 973,5 GB. Para ver la disposición del disco, ejecute el comando `mount`. - -```sh -mount - -sysfs on /sys type sysfs (rw,nosed,nodev,noexec,relatime) -proc on /proc type proc (rw,nosed,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosed,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosed,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosed,noexec,relatime,size=326656k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosed,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosed,nodev) -tmpfs on /run/lock type tmpfs (rw,nosed,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosed,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified tipo cgroup2 (rw,nosed,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosed,nodev,noexec,relatime,xlute,name=systemd) -pstore on /sys/fs/pstore tipo pstore (rw,nosed,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosed,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosed,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosed,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosed,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosed,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosed,nodev,noexec,relatime,net_cls,net_prio) -cpuacct on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosed,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosed,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosed,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosed,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosed,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw, relatime, pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs 
(rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -/dev/md4 on /home type ext3 (rw,relatime) -tmpfs on /run/user/1000 type tmpfs (rw,nosed,nodev,relatime,size=326652k,mode=700,uid=1000,gid=1000) -``` - -Como alternativa, el comando `lsblk` ofrece una vista diferente de las particiones: - -```sh -lsblk +[user@server_ip ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk @@ -175,172 +155,241 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Ahora los discos están montados. Para sacar un disco del RAID, es necesario desmontar el disco en primer lugar y luego marcarlo como defectuoso para poder eliminarlo. -A continuación, podrá sacar **/dev/sda4** del RAID. +Tomamos en cuenta los dispositivos, las particiones y sus puntos de montaje. A partir de los comandos y resultados anteriores, tenemos: -```sh -umount /dev/md4 -``` +- Dos matrices RAID: `/dev/md2` y `/dev/md4`. +- Cuatro particiones forman parte del RAID con los puntos de montaje: `/` y `/home`. + + + +### Simular una falla de disco + +Ahora que disponemos de toda la información necesaria, podemos simular una falla de disco y continuar con las pruebas. En este ejemplo, haremos que el disco `sda` falle. + +El medio preferido para lograrlo es el entorno en modo rescue de OVHcloud. -> [!warning] -> Tenga en cuenta que, si está conectado como usuario `root`, puede obtener el siguiente mensaje cuando intente desmontar la partición (en nuestro caso, la partición md4 está montada en /home): -> ->
umount: /home: target is busy
-> -> En ese caso, deberá desconectarse como usuario root y conectarse como usuario local (en nuestro caso, `debian`) y utilizar el siguiente comando: -> ->
debian@ns000000:/$ sudo umount /dev/md4
-> -> Si no tiene un usuario local, [debe crear uno](/pages/bare_metal_cloud/dedicated_servers/changing_root_password_linux_ds). +Reinicie primero el servidor en modo rescue y conéctese con las credenciales proporcionadas. -Obtendrá la siguiente respuesta: +Para retirar un disco del RAID, el primer paso es marcarlo como **Failed** y retirar las particiones de sus matrices RAID respectivas. ```sh -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1] sdb2[0] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0] sdb4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: ``` -La entrada **/dev/md4** ya no está montada. Sin embargo, el RAID sigue activo. Para poder retirar el disco, debe marcarlo como defectuoso con el siguiente comando: +A partir de la salida anterior, sda se compone de dos particiones en RAID que son **sda2** y **sda4**. + + + +#### Retirar el disco defectuoso + +Comenzamos marcando las particiones **sda2** y **sda4** como **failed**. 
```sh -sudo mdadm --fail /dev/md4 /dev/sda4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 +# mdadm: set /dev/sda2 faulty in /dev/md2 ``` -De este modo, hemos simultado un fallo del RAID. Ya puede proceder a eliminar la partición del RAID con el siguiente comando: - ```sh -sudo mdadm --remove /dev/md4 /dev/sda4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 --fail /dev/sda4 +# mdadm: set /dev/sda4 faulty in /dev/md4 ``` -Para comprobar el nuevo estado del RAID, utilice el siguiente comando: +Hemos simulado ahora una falla del RAID, cuando ejecutamos el comando `cat /proc/mdstat`, obtenemos el siguiente resultado: ```sh -cat /proc/mdstat +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] - 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 4/7 pages [16KB], 65536KB chunk +md2 : active raid1 sda2[1](F) sdb2[0] + 931954688 blocks super 1.2 [2/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk -md4 : active raid1 sdb4[1] - 1020767232 blocks super 1.2 [2/1] [_U] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/2] [_U] bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -El siguiente comando permite comprobar que la partición se ha eliminado: +Como podemos ver arriba, el [F] junto a las particiones indica que el disco está fallando o defectuoso. + +A continuación, retiramos estas particiones de las matrices RAID. ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 +# mdadm: hot removed /dev/sda2 from /dev/md2 +``` -/dev/md4: - Version : 1.2 - Creation Time : Tue Jan 24 15:35:02 2023 - Raid Level : raid1 - Array Size : 1020767232 (973.48 GiB 1045.27 GB) - Used Dev Size : 1020767232 (973.48 GiB 1045.27 GB) - Raid Devices : 2 - Total Devices : 1 - Persistence : Superblock is persistent +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md4 --remove /dev/sda4 +# mdadm: hot removed /dev/sda4 from /dev/md4 +``` - Intent Bitmap : Internal +Para asegurarnos de obtener un disco que sea similar a un disco vacío, utilizamos el siguiente comando. Reemplace **sda** por sus propios valores: - Update Time : Tue Jan 24 16:28:03 2023 - State : clean, degraded - Active Devices : 1 - Working Devices : 1 - Failed Devices : 0 - Spare Devices : 0 +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +sda 8:0 0 1.8T 0 disk +sdb 8:16 0 1.8T 0 disk +├─sdb1 8:17 0 1M 0 part +├─sdb2 8:18 0 888.9G 0 part +│ └─md2 9:2 0 888.8G 0 raid1 / +├─sdb3 8:19 0 512M 0 part [SWAP] +└─sdb4 8:20 0 973.6G 0 part + └─md4 9:4 0 973.5G 0 raid1 /home +``` -Consistency Policy : bitmap +Si ejecutamos el siguiente comando, vemos que nuestro disco ha sido correctamente «limpiado»: - Name : md4 - UUID : 7b5c1d80:0a7ab4c2:e769b5e5:9c6eaa0f - Events : 21 +```sh +parted /dev/sda +GNU Parted 3. 
- Number Major Minor RaidDevice State - - 0 0 0 removed - 1 8 20 1 active sync /dev/sdb4 +# mdadm: re-added /dev/sda4 ``` -### Reconstrucción del RAID +Use el siguiente comando para supervisar la reconstrucción del RAID: -Una vez sustituido el disco, copie la tabla de particiones desde un disco « sano » (**sdb** en el ejemplo) a la nueva partición (**sda**) con el siguiente comando: +```sh +[user@server_ip ~]# cat /proc/mdstat -**Para las particiones GPT** +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[0] sdb2[1] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 4/4 pages [16KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +Finalmente, añadimos una etiqueta y montamos la partición [SWAP] (si aplica). + +Para añadir una etiqueta a la partición SWAP: ```sh -sgdisk -R /dev/sda /dev/sdb +[user@server_ip ~]# sudo mkswap /dev/sdb4 -L swap-sdb4 ``` -El comando debe tener el siguiente formato: `sgdisk -R /dev/newdisk /dev/healthydisk`. +A continuación, obtenga los UUID de ambas particiones de intercambio: + +```sh +[user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +[user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +``` -Una vez realizada esta operación, podrá aleatoriamente utilizar el GUID del nuevo disco para evitar cualquier conflicto de GUID con el resto de discos: +Reemplazamos el antiguo UUID de la partición de intercambio (**sda4**) por el nuevo en `/etc/fstab`: ```sh -sgdisk -G /dev/sda +[user@server_ip ~]# sudo nano etc/fstab ``` -**Para las particiones MBR** +Asegúrese de reemplazar el UUID correcto. + +A continuación, recargue el sistema con el siguiente comando: + +```sh +[user@server_ip ~]# sudo systemctl daemon-reload +``` -Una vez sustituido el disco, copie la tabla de particiones desde un disco sano (**sdb** en el ejemplo) en la nueva partición (**sda**) con el siguiente comando: +Ejecute el siguiente comando para activar la partición de intercambio: ```sh -sfdisk -d /dev/sdb | sfdisk /dev/sda +[user@server_ip ~]# sudo swapon -av ``` -El comando debe tener el siguiente formato: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +La reconstrucción del RAID ahora está terminada. + + + +/// details | **Reconstrucción del RAID en modo rescue** + +Una vez reemplazado el disco, debemos copiar la tabla de particiones del disco sano (en este ejemplo, sda) al nuevo (sdb). + +> [!tabs] +> **Para particiones GPT** +>> +>> ```sh +>> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> El comando debe tener el siguiente formato: `sgdisk -R /dev/nuevo disco /dev/disco sano` +>> +>> Ejemplo: +>> +>> ```sh +>> sudo sgdisk -R /dev/sdb /dev/sda +>> ``` +>> +>> Una vez realizada esta operación, el siguiente paso consiste en asignar un GUID aleatorio al nuevo disco para evitar conflictos con los GUID de otros discos: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdb +>> ``` +>> +>> Si aparece el siguiente mensaje: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> Puede simplemente ejecutar el comando `partprobe`. 
Si aún no ve las nuevas particiones (por ejemplo, con `lsblk`), deberá reiniciar el servidor antes de continuar. +>> +> **Para particiones MBR** +>> +>> ```sh +>> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb +>> ``` +>> +>> El comando debe tener el siguiente formato: `sfdisk -d /dev/disco sano | sfdisk /dev/nuevo disco` +>> + +Ahora podemos reconstruir la matriz RAID. El siguiente fragmento de código muestra cómo añadir las nuevas particiones (sdb2 y sdb4) a la matriz RAID. -Ya puede reconstruir el RAID. El siguiente fragmento de código muestra cómo reconstruir la disposición de la partición **/dev/md4** con la tabla de particiones « sda » anteriormente copiada: +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 +# mdadm: added /dev/sdb2 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 +# mdadm: re-added /dev/sdb4 +``` + +Use el comando `cat /proc/mdstat` para supervisar la reconstrucción del RAID: ```sh -mdadm --add /dev/md4 /dev/sda4 -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 1/7 pages [4KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk -md4 : active raid1 sda4[0] sdb4[1] - 1020767232 blocks super 1.2 [2/2] [UU] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -Para comprobar los detalles del RAID, utilice el siguiente comando: +Para obtener más detalles sobre la o las matrices RAID: ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -371,18 +420,131 @@ mdadm --detail /dev/md4 1 8 18 1 active sync /dev/sdb4 ``` -Una vez reconstruido el RAID, monte la partición (**/dev/md4**, en el ejemplo) con el siguiente comando: + + +#### Añadimos la etiqueta a la partición SWAP (si aplica) + +Una vez finalizada la reconstrucción del RAID, montamos la partición que contiene la raíz de nuestro sistema operativo en `/mnt`. En nuestro ejemplo, esta partición es `md4`. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt +``` + +Añadimos la etiqueta a nuestra partición de intercambio con el siguiente comando: + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sda4 -L swap-sda4 +mkswap: /dev/sda4: warning: wiping old swap signature. 
+Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +A continuación, montamos los siguientes directorios para asegurarnos de que cualquier manipulación que realicemos en el entorno chroot funcione correctamente: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +A continuación, accedemos al entorno `chroot`: ```sh -mount /dev/md4 /home +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt ``` +Recuperamos los UUID de ambas particiones de intercambio: + +```sh +root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 +root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 +``` + +Ejemplo: + +```sh +blkid /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +``` + +```sh +blkid /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +A continuación, reemplazamos el antiguo UUID de la partición de intercambio (**sdb4**) por el nuevo en `/etc/fstab`: + +```sh +root@rescue12-customer-eu:/# nano etc/fstab +``` + +Ejemplo: + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /home ext4 defaults 0 0 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Asegúrese de reemplazar el UUID correcto. En nuestro ejemplo anterior, el UUID a reemplazar es `d6af33cf-fc15-4060-a43c-cb3b5537f58a` por el nuevo `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Asegúrese de reemplazar el UUID correcto. + +A continuación, nos aseguramos de que todo esté correctamente montado: + +```sh +root@rescue12-customer-eu:/# mount -av +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Recargue el sistema con el siguiente comando: + +```sh +root@rescue12-customer-eu:/# systemctl daemon-reload +``` + +Active la partición de intercambio con el siguiente comando: + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/sda4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sda4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sda4 +swapon: /dev/sdb4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sdb4 +``` + +Salga del entorno Chroot con `exit` y desmonte todos los discos: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` + +Hemos terminado con éxito la reconstrucción del RAID en el servidor y ahora podemos reiniciar el servidor en modo normal. + + ## Más información +[Reemplazo a caliente - RAID software](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) + +[API OVHcloud y Almacenamiento](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) + +[Gestión del RAID hardware](/pages/bare_metal_cloud/dedicated_servers/raid_hard) + +[Reemplazo a caliente - RAID hardware](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) -[Sustituir un disco en caliente en un servidor con RAID por software](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) +Para servicios especializados (posicionamiento, desarrollo, etc.), contacte con los [socios OVHcloud](/links/partner). 
-[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) +Si desea beneficiarse de una asistencia en el uso y configuración de sus soluciones OVHcloud, le invitamos a consultar nuestras distintas [ofertas de soporte](/links/support). -[RAID por hardware](/pages/bare_metal_cloud/dedicated_servers/raid_hard) +Si necesita una formación o asistencia técnica para la implementación de nuestras soluciones, contacte con su comercial o haga clic en [este enlace](/links/professional-services) para obtener un presupuesto y solicitar un análisis personalizado de su proyecto a nuestros expertos del equipo Professional Services. -Interactúe con nuestra comunidad de usuarios en . +Interactúe con nuestra [comunidad de usuarios](/links/community). \ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-ca.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-ca.md index 811e36b6932..7173ac0de0f 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-ca.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-ca.md @@ -1,7 +1,7 @@ --- -title: Configuration et reconstruction du RAID logiciel -excerpt: "Découvrez comment vérifier l'état du RAID logiciel de votre serveur et le reconstruire après un remplacement de disque" -updated: 2023-08-21 +title: Gestion et reconstruction du RAID logiciel sur les serveurs en mode legacy boot (BIOS) +excerpt: "Découvrez comment gérer et reconstruire le RAID logiciel après un remplacement de disque sur votre serveur en mode legacy boot (BIOS)" +updated: 2025-12-11 --- ## Objectif @@ -10,23 +10,66 @@ Le RAID (Redundant Array of Independent Disks) est un ensemble de techniques pr Le niveau RAID par défaut pour les installations de serveurs OVHcloud est RAID 1, ce qui double l'espace occupé par vos données, réduisant ainsi de moitié l'espace disque utilisable. -**Ce guide va vous aider à configurer la matrice RAID de votre serveur dans l'éventualité où elle doit être reconstruite en raison d'une corruption ou d'une panne de disque.** +**Ce guide explique comment gérer et reconstruire un RAID logiciel en cas de remplacement d'un disque sur votre serveur en mode legacy boot (BIOS).** +Avant de commencer, veuillez noter que ce guide se concentre sur les serveurs dédiés qui utilisent le mode legacy boot (BIOS). Si votre serveur utilise le mode UEFI (cartes mères plus récentes), reportez-vous à ce guide [Gestion et reconstruction du RAID logiciel sur les serveurs en mode boot UEFI](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). + +Pour vérifier si un serveur s'exécute en mode BIOS ou en mode UEFI, exécutez la commande suivante : + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` + ## Prérequis - Posséder un [serveur dédié](/links/bare-metal/bare-metal) avec une configuration RAID logiciel. - Avoir accès à votre serveur via SSH en tant qu'administrateur (sudo). 
+- Compréhension du RAID et des partitions ## En pratique -### Retrait du disque +### Présentation du contenu + +- [Informations de base](#basicinformation) +- [Simuler une panne de disque](#diskfailure) + - [Retrait du disque défaillant](#diskremove) +- [Reconstruction du RAID](#raidrebuild) + - [Reconstruction du RAID en mode rescue](#rescuemode) + - [Ajout du label à la partition SWAP (le cas échéant)](#swap-partition) + - [Reconstruction du RAID en mode normal](#normalmode) + + + + +### Informations de base + +Dans une session de ligne de commande, tapez le code suivant pour déterminer l'état actuel du RAID. + +```sh +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 nvme0n1p2[1] nvme0n1p20] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +Cette commande nous indique que deux périphériques RAID logiciels sont actuellement configurés, **md4** étant le plus grand. Le périphérique RAID **md4** se compose de deux partitions, appelées **nvme1n1p4** et **nvme0n1p4**. + +Le [UU] signifie que tous les disques fonctionnent normalement. Un `_` indique un disque défectueux. -La vérification de l’état actuel du RAID s’effectue via la commande suivante : +Si vous possédez un serveur avec des disques SATA, vous obtenez les résultats suivants : ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda2[1] sdb2[0] @@ -40,12 +83,10 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Cette commande montre deux matrices RAID actuellement configurées, « md4 » étant la plus grande partition. La partition se compose de deux disques, appelés « sda4 » et « sdb4 ». Le [UU] signifie que tous les disques fonctionnent normalement. Un « _ » indiquerait un disque défectueux. - -Bien que cette commande affiche les volumes RAID, elle n'indique pas la taille des partitions elles-mêmes. Vous pouvez obtenir cette information via la commande suivante : +Bien que cette commande renvoie nos volumes RAID, elle ne nous indique pas la taille des partitions elles-mêmes. Nous pouvons retrouver cette information avec la commande suivante : ```sh -fdisk -l +[user@server_ip ~]# sudo fdisk -l Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors Disk model: HGST HUS724020AL @@ -87,73 +128,16 @@ Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes ``` -La commande `fdisk -l` vous permet également d'identifier votre type de partition. C'est une information importante à connaître lorsqu'il s'agit de reconstruire votre RAID en cas de défaillance d'un disque. - -Pour les partitions **GPT**, la commande retournera : `Disklabel type: gpt`. - -```sh -Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors -Disk model: HGST HUS724020AL -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: gpt' -Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F -``` +La commande `fdisk -l` vous permet également d'identifier votre type de partition. Il s’agit d’une information importante pour reconstruire votre RAID en cas de défaillance d’un disque. 
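+
+À titre indicatif (commande standard `lsblk`, exemple à adapter aux disques de votre serveur), vous pouvez également afficher directement le type de table de partitions de chaque disque :
+
+```sh
+[user@server_ip ~]# lsblk -o NAME,PTTYPE /dev/sda /dev/sdb
+```
+
+La colonne `PTTYPE` affiche `gpt` pour une table GPT ou `dos` pour une table MBR.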
-Pour les partitions **MBR**, la commande retournera : `Disklabel type: dos`. +Pour les partitions **GPT**, la ligne 6 affichera : `Disklabel type: gpt`. Ces informations ne sont visibles que lorsque le serveur est en mode normal. -```sh -Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors -Disk model: QEMU HARDDISK -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: dos' -Disk identifier: 0x150f6797 -``` - -Cette commande montre que `/dev/md2` se compose de 888,8 Go et `/dev/md4` contient 973,5 Go. Exécuter la commande « mount » montre la disposition du disque. - -```sh -mount - -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -/dev/md4 on /home type ext3 (rw,relatime) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) -``` +Toujours en se basant sur les résultats de `fdisk -l`, on peut voir que `/dev/md2` se compose de 888.8GB et `/dev/md4` contient 973.5GB. Alternativement, la commande `lsblk` offre une vue différente des partitions : ```sh -lsblk +[user@server_ip ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk @@ -173,91 +157,141 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Les disques sont actuellement montés par défaut. 
Pour retirer un disque du RAID, vous devez dans un premier temps démonter le disque, puis simuler un échec pour enfin le supprimer. -Nous allons supprimer `/dev/sda4` du RAID avec la commande suivante : +Nous prenons en compte les périphériques, les partitions et leurs points de montage. À partir des commandes et des résultats ci-dessus, nous avons : + +- Deux baies RAID : `/dev/md2` et `/dev/md4`. +- Quatre partitions font partie du RAID avec les points de montage : `/` et `/home`. + + + +### Simuler une panne de disque + +Maintenant que nous disposons de toutes les informations nécessaires, nous pouvons simuler une panne de disque et poursuivre les tests. Dans cet exemple, nous allons faire échouer le disque `sda`. + +Le moyen privilégié pour y parvenir est l’environnement en mode rescue d’OVHcloud. + +Redémarrez d'abord le serveur en mode rescue et connectez-vous avec les informations d'identification fournies. + +Pour retirer un disque du RAID, la première étape consiste à le marquer comme **Failed** et à retirer les partitions de leurs matrices RAID respectives. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1] sdb2[0] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0] sdb4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +À partir de la sortie ci-dessus, sda se compose de deux partitions en RAID qui sont **sda2** et **sda4**. + + + +#### Retrait du disque défaillant + +Nous commençons par marquer les partitions **sda2** et **sda4** comme **failed**. ```sh -umount /dev/md4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 +# mdadm: set /dev/sda2 faulty in /dev/md2 ``` -> [!warning] -> Veuillez noter que si vous êtes connecté en tant qu'utilisateur `root`, vous pouvez obtenir le message suivant lorsque vous essayez de démonter la partition (dans notre cas, où notre partition md4 est montée dans /home) : -> ->
umount: /home: target is busy
-> -> Dans ce cas, vous devez vous déconnecter en tant qu'utilisateur root et vous connecter en tant qu'utilisateur local (dans notre cas, `debian`) et utiliser la commande suivante : -> ->
debian@ns000000:/$ sudo umount /dev/md4
-> -> Si vous ne disposez pas d'utilisateur local, [vous devez en créer un](/pages/bare_metal_cloud/dedicated_servers/changing_root_password_linux_ds). +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 --fail /dev/sda4 +# mdadm: set /dev/sda4 faulty in /dev/md4 +``` -Le résultat obtenu sera le suivant : +Nous avons maintenant simulé une défaillance du RAID, lorsque nous exécutons la commande `cat /proc/mdstat`, nous obtenons le résultat suivant : ```sh -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1](F) sdb2[0] + 931954688 blocks super 1.2 [2/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +Comme nous pouvons le voir ci-dessus, le [F] à côté des partitions indique que le disque est défaillant ou défectueux. + +Ensuite, nous retirons ces partitions des baies RAID. 
+ +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 +# mdadm: hot removed /dev/sda2 from /dev/md2 +``` + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md4 --remove /dev/sda4 +# mdadm: hot removed /dev/sda4 from /dev/md4 ``` -L'entrée de `/dev/md4` n'est maintenant plus montée. Cependant, le RAID est toujours actif. Il est donc nécessaire de simuler un échec pour retirer le disque, ce qui peut être effectué grâce à la commande suivante : +Pour nous assurer que nous obtenons un disque qui est similaire à un disque vide, nous utilisons la commande suivante. Remplacez **sda** par vos propres valeurs : ```sh -sudo mdadm --fail /dev/md4 /dev/sda4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +sda 8:0 0 1.8T 0 disk +sdb 8:16 0 1.8T 0 disk +├─sdb1 8:17 0 1M 0 part +├─sdb2 8:18 0 888.9G 0 part +│ └─md2 9:2 0 888.8G 0 raid1 / +├─sdb3 8:19 0 512M 0 part [SWAP] +└─sdb4 8:20 0 973.6G 0 part + └─md4 9:4 0 973.5G 0 raid1 /home ``` -Nous avons maintenant simulé un échec du RAID. L'étape suivante consiste à supprimer la partition du RAID avec la commande suivante : +Si nous exécutons la commande suivante, nous voyons que notre disque a été correctement « nettoyé » : ```sh -sudo mdadm --remove /dev/md4 /dev/sda4 +parted /dev/sda +GNU Parted 3.5 +Using /dev/sda +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/sda: unrecognised disk label +Model: HGST HUS724020AL (SATA) +Disk /dev/sda: 1.8T +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: ``` -Vous pouvez vérifier que la partition a été supprimée avec la commande suivante : +L'état de notre RAID devrait maintenant ressembler à ceci : ```sh -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] - 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 4/7 pages [16KB], 65536KB chunk +md2 : active raid1 sdb2[0] + 931954688 blocks super 1.2 [1/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk md4 : active raid1 sdb4[1] - 1020767232 blocks super 1.2 [2/1] [_U] + 1020767232 blocks super 1.2 [1/2] [_U] bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -La commande ci-dessous vérifie que la partition a été supprimée : +Les résultats ci-dessus montrent que seules deux partitions apparaissent désormais dans les matrices RAID. Nous avons réussi à faire échouer le disque **sda** et nous pouvons maintenant procéder au remplacement du disque. + +Pour plus d'informations sur la préparation et la demande de remplacement d'un disque, consultez ce [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement) + +La commande suivante permet d'avoir plus de détails sur la ou les matrices RAID : ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -289,56 +323,200 @@ Consistency Policy : bitmap 1 8 20 1 active sync /dev/sdb4 ``` -### Reconstruction du RAID + + +### Reconstruire le RAID + +> [!warning] +> +> Pour la plupart des serveurs en RAID logiciel, après un remplacement de disque, le serveur est capable de démarrer en mode normal (sur le disque sain) pour reconstruire le RAID. 
Cependant, si le serveur ne parvient pas à démarrer en mode normal, il sera redémarré en mode rescue pour procéder à la reconstruction du RAID. +> + + + +#### Reconstruire le RAID en mode normal + +Les étapes suivantes sont réalisées en mode normal. Dans notre exemple, nous avons remplacé le disque **sda**. + +Une fois le disque remplacé, nous devons copier la table de partition du disque sain (dans cet exemple, sdb) vers le nouveau (sda). + +> [!tabs] +> **Pour les partitions GPT** +>> +>> ```sh +>> sudo sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> La commande doit être au format suivant : `sgdisk -R /dev/nouveau disque /dev/disque sain`. +>> +>> Une fois cette opération effectuée, l'étape suivante consiste à attribuer un GUID aléatoire au nouveau disque afin d'éviter tout conflit avec les GUID d'autres disques : +>> +>> ```sh +>> sudo sgdisk -G /dev/sdX +>> ``` +>> +>> Si le message suivant s'affiche : +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> Vous pouvez simplement exécuter la commande `partprobe`. Si vous ne voyez toujours pas les partitions nouvellement créées (par exemple avec `lsblk`), vous devez redémarrer le serveur avant de continuer. +>> +> **Pour les partitions MBR** +>> +>> ```sh +>> [user@server_ip ~]# sudo sfdisk -d /dev/sdX | sfdisk /dev/sdX +>> ``` +>> +>> La commande doit être au format suivant : `sfdisk -d /dev/disksain | sfdisk /dev/nnouveaudisk`. +>> + +Ensuite, nous ajoutons les partitions au RAID : + +```sh +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/sda2 +# mdadm: added /dev/sda2 + +[user@server_ip ~]# sudo mdadm --add /dev/md4 /dev/sda4 +# mdadm: re-added /dev/sda4 +``` + +Utilisez la commande suivante pour surveiller la reconstruction du RAID : + +```sh +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[0] sdb2[1] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 4/4 pages [16KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` -Une fois le disque remplacé, copiez la table de partition à partir d'un disque sain (« sdb » dans cet exemple) dans la nouvelle (« sda ») avec la commande suivante : +Enfin, nous ajoutons un label et montons la partition [SWAP] (le cas échéant). -**Pour les partitions GPT** +Pour ajouter un libellé à la partition SWAP : ```sh -sgdisk -R /dev/sda /dev/sdb +[user@server_ip ~]# sudo mkswap /dev/sdb4 -L swap-sdb4 ``` -La commande doit être au format suivant : `sgdisk -R /dev/nouveaudisque /dev/disquesain` +Ensuite, récupérez les UUID des deux partitions swap : + +```sh +[user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +[user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +``` -Une fois cette opération effectuée, l’étape suivante consiste à rendre aléatoire le GUID du nouveau disque afin d’éviter tout conflit de GUID avec les autres disques : +Nous remplaçons l'ancien UUID de la partition swap (**sda4**) par le nouveau dans `/etc/fstab` : ```sh -sgdisk -G /dev/sda +[user@server_ip ~]# sudo nano etc/fstab ``` -**Pour les partitions MBR** +Assurez-vous de remplacer le bon UUID. 
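+
+À titre d'illustration uniquement (UUID fictif, à remplacer par la valeur renvoyée par `blkid` sur votre serveur), la ligne de swap mise à jour dans `/etc/fstab` pourrait ressembler à ceci :
+
+```sh
+# Exemple fictif : utilisez l'UUID retourné par blkid pour votre nouvelle partition swap
+UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx swap swap defaults 0 0
+```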
+ +Ensuite, rechargez le système avec la commande suivante : + +```sh +[user@server_ip ~]# sudo systemctl daemon-reload +``` -Une fois le disque remplacé, copiez la table de partition à partir d'un disque sain (« sdb » dans cet exemple) dans la nouvelle (« sda ») avec la commande suivante : +Exécutez la commande suivante pour activer la partition swap : ```sh -sfdisk -d /dev/sdb | sfdisk /dev/sda +[user@server_ip ~]# sudo swapon -av ``` -La commande doit être au format suivant : `sfdisk -d /dev/disquesain | sfdisk /dev/nouveaudisque` +La reconstruction du RAID est maintenant terminée. + + + +/// details | **Reconstruction du RAID en mode rescue** + +Une fois le disque remplacé, nous devons copier la table de partition du disque sain (dans cet exemple, sda) vers le nouveau (sdb). + +> [!tabs] +> **Pour les partitions GPT** +>> +>> ```sh +>> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> La commande doit être au format suivant : `sgdisk -R /dev/nouveau disque /dev/disque sain` +>> +>> Exemple : +>> +>> ```sh +>> sudo sgdisk -R /dev/sdb /dev/sda +>> ``` +>> +>> Une fois cette opération effectuée, l'étape suivante consiste à attribuer un GUID aléatoire au nouveau disque afin d'éviter tout conflit avec les GUID d'autres disques : +>> +>> ```sh +>> sudo sgdisk -G /dev/sdb +>> ``` +>> +>> Si le message suivant s'affiche : +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> Vous pouvez simplement exécuter la commande `partprobe`. Si vous ne voyez toujours pas les partitions nouvellement créées (par exemple avec `lsblk`), vous devez redémarrer le serveur avant de continuer. +>> +> **Pour les partitions MBR** +>> +>> ```sh +>> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb +>> ``` +>> +>> La commande doit être au format suivant : `sfdisk -d /dev/disque sain | sfdisk /dev/nouveau disque` +>> + +Nous pouvons maintenant reconstruire la matrice RAID. L'extrait de code suivant montre comment ajouter les nouvelles partitions (sdb2 et sdb4) dans la matrice RAID. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 +# mdadm: added /dev/sdb2 + +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 +# mdadm: re-added /dev/sdb4 +``` -Il est maintenant possible de reconstruire la matrice RAID. L'extrait de code ci-dessous montre comment reconstruire la disposition de la partition `/dev/md4` avec la table de partition « sda » copiée récemment : +Utilisez la commande `cat /proc/mdstat` pour surveiller la reconstruction du RAID : ```sh -mdadm --add /dev/md4 /dev/sda4 -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 1/7 pages [4KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk -md4 : active raid1 sda4[0] sdb4[1] - 1020767232 blocks super 1.2 [2/2] [UU] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] 
recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -Vérifiez les détails du RAID avec la commande suivante : +Pour plus de détails sur la ou les baies RAID : ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -369,12 +547,118 @@ mdadm --detail /dev/md4 1 8 18 1 active sync /dev/sdb4 ``` -Le RAID a maintenant été reconstruit. Montez la partition (`/dev/md4` dans cet exemple) avec cette commande : + + +#### Ajout du label à la partition SWAP (le cas échéant) + +Une fois la reconstruction du RAID terminée, nous montons la partition contenant la racine de notre système d'exploitation sur `/mnt`. Dans notre exemple, cette partition est `md4`. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt +``` + +Nous ajoutons le label à notre partition swap avec la commande : ```sh -mount /dev/md4 /home +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sda4 -L swap-sda4 +mkswap: /dev/sda4: warning: wiping old swap signature. +Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd ``` +Ensuite, nous montons les répertoires suivants pour nous assurer que toute manipulation que nous faisons dans l'environnement chroot fonctionne correctement : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Ensuite, nous accédons à l'environnement `chroot` : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt +``` + +Nous récupérons les UUID des deux partitions swap : + +```sh +root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 +root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 +``` + +Exemple: + +```sh +blkid /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +``` + +```sh +blkid /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +Ensuite, nous remplaçons l'ancien UUID de la partition swap (**sdb4**) par le nouveau dans `/etc/fstab` : + +```sh +root@rescue12-customer-eu:/# nano etc/fstab +``` + +Exemple: + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /home ext4 defaults 0 0 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Assurez-vous de remplacer l'UUID approprié. Dans notre exemple ci-dessus, l'UUID à remplacer est `d6af33cf-fc15-4060-a43c-cb3b5537f58a` par le nouveau `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Assurez-vous de remplacer le bon UUID. 
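+
+Pour vérifier rapidement que `/etc/fstab` référence bien le nouvel UUID avant de poursuivre, vous pouvez par exemple afficher les lignes de swap (commande standard `grep`, donnée à titre indicatif) :
+
+```sh
+root@rescue12-customer-eu:/# grep swap /etc/fstab
+```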
+ +Ensuite, nous nous assurons que tout est correctement monté : + +```sh +root@rescue12-customer-eu:/# mount -av +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Rechargez le système avec la commande suivante : + +```sh +root@rescue12-customer-eu:/# systemctl daemon-reload +``` + +Activez la partition swap avec la commande suivante : + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/sda4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sda4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sda4 +swapon: /dev/sdb4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sdb4 +``` + +Quittez l'environnement Chroot avec `exit` et démontez tous les disques : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` + +Nous avons maintenant terminé avec succès la reconstruction du RAID sur le serveur et nous pouvons maintenant le redémarrer en mode normal. + + ## Aller plus loin [Remplacement à chaud - RAID logiciel](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) @@ -385,4 +669,10 @@ mount /dev/md4 /home [Remplacement à chaud - RAID Matériel](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) +Pour des prestations spécialisées (référencement, développement, etc), contactez les [partenaires OVHcloud](/links/partner). + +Si vous souhaitez bénéficier d'une assistance à l'usage et à la configuration de vos solutions OVHcloud, nous vous proposons de consulter nos différentes [offres de support](/links/support). + +Si vous avez besoin d'une formation ou d'une assistance technique pour la mise en oeuvre de nos solutions, contactez votre commercial ou cliquez sur [ce lien](/links/professional-services) pour obtenir un devis et demander une analyse personnalisée de votre projet à nos experts de l’équipe Professional Services. + Échangez avec notre [communauté d'utilisateurs](/links/community). 
diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md index ba75b528735..7173ac0de0f 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.fr-fr.md @@ -1,7 +1,7 @@ --- title: Gestion et reconstruction du RAID logiciel sur les serveurs en mode legacy boot (BIOS) excerpt: "Découvrez comment gérer et reconstruire le RAID logiciel après un remplacement de disque sur votre serveur en mode legacy boot (BIOS)" -updated: 2025-12-05 +updated: 2025-12-11 --- ## Objectif diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.it-it.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.it-it.md index 40d97a4274c..39f7fa84d0e 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.it-it.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.it-it.md @@ -1,34 +1,72 @@ --- -title: Configurazione e ricostruzione del RAID software -excerpt: "Come verificare lo stato del RAID software del tuo server e ricostruirlo dopo la sostituzione del disco" -updated: 2023-08-21 +title: Gestione e ricostruzione del RAID software sui server in modalità legacy boot (BIOS) +excerpt: "Scopri come gestire e ricostruire il RAID software dopo il sostituzione di un disco su un server in modalità legacy boot (BIOS)" +updated: 2025-12-11 --- -> [!primary] -> Questa traduzione è stata generata automaticamente dal nostro partner SYSTRAN. I contenuti potrebbero presentare imprecisioni, ad esempio la nomenclatura dei pulsanti o alcuni dettagli tecnici. In caso di dubbi consigliamo di fare riferimento alla versione inglese o francese della guida. Per aiutarci a migliorare questa traduzione, utilizza il pulsante "Contribuisci" di questa pagina. -> - ## Obiettivo -Il RAID (Redundant Array of Independent Disks) consiste in un insieme di tecniche che consentono di limitare la perdita delle informazioni presenti su un server grazie alla replica dei dati su più dischi. +Il RAID (Redundant Array of Independent Disks) è un insieme di tecniche progettate per ridurre la perdita di dati su un server replicandoli su più dischi. + +Il livello RAID predefinito per le installazioni dei server OVHcloud è RAID 1, che raddoppia lo spazio occupato dai vostri dati, riducendo quindi a metà lo spazio disco utilizzabile. + +**Questa guida spiega come gestire e ricostruire un RAID software in caso di sostituzione di un disco su un server in modalità legacy boot (BIOS).** -Il livello RAID implementato di default sui server OVHcloud è RAID 1, un sistema che raddoppia lo spazio occupato dai dati dimezzando quindi quello utile. +Prima di iniziare, notate che questa guida si concentra sui Server dedicati che utilizzano la modalità legacy boot (BIOS). Se il vostro server utilizza la modalità UEFI (schede madri più recenti), fate riferimento a questa guida [Gestione e ricostruzione del RAID software sui server in modalità boot UEFI](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). 
-**Questa guida ti mostra come configurare il volume RAID del tuo server nel caso in cui sia necessario ricostruirlo in seguito alla corruzione o guasto del disco.** +Per verificare se un server è in esecuzione in modalità BIOS o in modalità UEFI, eseguite il comando seguente : + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` ## Prerequisiti -- Disporre di un [server dedicato](/links/bare-metal/bare-metal) con configurazione RAID software -- Avere accesso al server via SSH con l’utente root +- Possedere un [server dedicato](/links/bare-metal/bare-metal) con una configurazione RAID software. +- Avere accesso al server tramite SSH come amministratore (sudo). +- Conoscenza del RAID e delle partizioni ## Procedura -### Rimozione del disco +### Panoramica del contenuto -Per verificare lo stato corrente del RAID è necessario eseguire questo comando: +- [Informazioni di base](#basicinformation) +- [Simulare un guasto del disco](#diskfailure) + - [Rimozione del disco guasto](#diskremove) +- [Ricostruzione del RAID](#raidrebuild) + - [Ricostruzione del RAID in modalità rescue](#rescuemode) + - [Aggiunta dell'etichetta alla partizione SWAP (se necessario)](#swap-partition) + - [Ricostruzione del RAID in modalità normale](#normalmode) + + + +### Informazioni di base + +Nella sessione della riga di comando, digitate il codice seguente per determinare lo stato attuale del RAID. ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 nvme0n1p2[1] nvme0n1p20] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +Questo comando ci indica che due dispositivi RAID software sono attualmente configurati, **md4** essendo il più grande. Il dispositivo RAID **md4** è composto da due partizioni, denominate **nvme1n1p4** e **nvme0n1p4**. + +Il [UU] significa che tutti i dischi funzionano normalmente. Un `_` indica un disco guasto. + +Se possedete un server con dischi SATA, otterrete i seguenti risultati : + +```sh +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda2[1] sdb2[0] @@ -42,12 +80,10 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Il comando mostra due volumi RAID configurati. La partizione più grande è “md4” ed è composta dai due dischi “sda4” e “ sdb4”. [UU] indica che i dischi funzionano normalmente: in caso di disco difettoso sarebbe infatti presente una “`_`”. - -In questo modo è possibile visualizzare i volumi RAID, ma non la dimensione delle partizioni. È possibile ottenere questa informazione utilizzando un altro comando: +Sebbene questo comando restituisca i nostri volumi RAID, non ci indica la dimensione delle partizioni stesse. Possiamo trovare questa informazione con il comando seguente : ```sh -fdisk -l +[user@server_ip ~]# sudo fdisk -l Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors Disk model: HGST HUS724020AL @@ -89,75 +125,16 @@ Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes ``` -Il comando `fdisk -l` ti permette anche di identificare il tuo tipo di partizione. È un'informazione importante da conoscere quando si tratta di ricostruire il tuo RAID in caso di guasto di un disco. 
+Il comando `fdisk -l` vi permette inoltre di identificare il tipo di partizione. Si tratta di un'informazione importante per ricostruire il vostro RAID in caso di guasto di un disco. -Per le partizioni **GPT**, il comando restituisce: `Disklabel type: gpt`. +Per le partizioni **GPT**, la riga 6 mostrerà: `Disklabel type: gpt`. Queste informazioni sono visibili solo quando il server è in modalità normale. -```sh -Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors -Disk model: HGST HUS724020AL -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: gpt' -Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F -``` +Ancora in base ai risultati di `fdisk -l`, possiamo vedere che `/dev/md2` è composto da 888.8GB e `/dev/md4` contiene 973.5GB. -Per le partizioni **MBR**, il comando restituisce: `Disklabel type: dos`. +In alternativa, il comando `lsblk` offre una visione diversa delle partizioni : ```sh -Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors -Disk model: QEMU HARDDISK -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: dos' -Disk identifier: 0x150f6797 -``` - -Questo comando mostra che `/dev/md2` è composto da 888,8 GB e `/dev/md4` contiene 973,5 GB. Eseguire il comando "mount" mostra la disposizione del disco. - -Per visualizzare lo stato del disco utilizza il comando “mount”: - -```sh -mount - -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on 
/dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -/dev/md4 on /home type ext3 (rw,relatime) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) -``` - -In alternativa, il comando `lsblk` offre una vista differente delle partizioni: - -```sh -lsblk +[user@server_ip ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk @@ -177,171 +154,233 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Al momento i dischi sono montati di default. Per rimuovere un disco dal RAID è necessario effettuarne l’unmount, indicarlo come difettoso e infine eliminarlo. Ad esempio, per rimuovere `/dev/sda4` dal RAID esegui il comando: +Prendiamo in considerazione i dispositivi, le partizioni e i loro punti di montaggio. Dai comandi e dai risultati sopra, abbiamo : -```sh -umount /dev/md4 -``` +- Due array RAID : `/dev/md2` e `/dev/md4`. +- Quattro partizioni fanno parte del RAID con i punti di montaggio : `/` e `/home`. + + + +### Simulare un guasto del disco + +Ora che abbiamo tutte le informazioni necessarie, possiamo simulare un guasto del disco e procedere ai test. In questo esempio, faremo fallire il disco `sda`. -> [!warning] -> Se sei connesso come utente `root`, puoi ottenere questo messaggio quando cerchi di smontare la partizione (nel nostro caso, la partizione md4 è montata in /home): -> ->
umount: /home: target is busy
-> -> In questo caso, dovrai disconnetterti come utente root e connetterti come utente locale (nel nostro caso, `debian`) utilizzando il comando: -> ->
debian@ns000000:/$ sudo umount /dev/md4
-> -> Se non disponi di un utente locale, [è necessario crearne un](/pages/bare_metal_cloud/dedicated_servers/changing_root_password_linux_ds). +Il metodo preferito per farlo è l'ambiente in modalità rescue di OVHcloud. -Il risultato restituito sarà di questo tipo: +Riavviate prima il server in modalità rescue e connettetevi con le credenziali fornite. + +Per rimuovere un disco dal RAID, il primo passo consiste nel marcarlo come **Failed** e rimuovere le partizioni dai loro array RAID rispettivi. ```sh -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1] sdb2[0] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0] sdb4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: ``` -A questo punto la partizione `/dev/md4` non è più montata, ma il RAID è ancora attivo. Per rimuovere il disco è necessario indicarlo come difettoso con il comando: +Dall'output sopra, sda è composto da due partizioni in RAID che sono **sda2** e **sda4**. 
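Prima di contrassegnare le partizioni come guaste, può essere utile annotare il modello e il numero di serie del disco interessato, in modo da identificarlo con certezza al momento della sostituzione. L'esempio seguente è puramente indicativo: i nomi dei dispositivi sono quelli dell'esempio precedente e vanno adattati al vostro server.

```sh
# Elenca i dischi con modello e numero di serie (esempio indicativo)
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -d -o NAME,SIZE,MODEL,SERIAL

# Verifica a quale array RAID appartengono le partizioni di sda
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md2 | grep sda
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 | grep sda
```

Il numero di serie consente di indicare senza ambiguità quale disco deve essere sostituito fisicamente.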
+ + + +#### Rimozione del disco guasto + +Iniziamo marciando le partizioni **sda2** e **sda4** come **failed**. ```sh -sudo mdadm --fail /dev/md4 /dev/sda4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 +# mdadm: set /dev/sda2 faulty in /dev/md2 ``` -In questo modo abbiamo simulato un malfunzionamento del RAID. Lo step successivo consiste nella rimozione della partizione dal RAID: - ```sh -sudo mdadm --remove /dev/md4 /dev/sda4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 --fail /dev/sda4 +# mdadm: set /dev/sda4 faulty in /dev/md4 ``` -Per verificare che l’operazione sia stata effettuata correttamente, utilizza il comando: +Ora abbiamo simulato un guasto al RAID, quando eseguiamo il comando `cat /proc/mdstat`, otteniamo il risultato seguente : ```sh -cat /proc/mdstat +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] - 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 4/7 pages [16KB], 65536KB chunk +md2 : active raid1 sda2[1](F) sdb2[0] + 931954688 blocks super 1.2 [2/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk -md4 : active raid1 sdb4[1] - 1020767232 blocks super 1.2 [2/1] [_U] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/2] [_U] bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -Per verificare la corretta rimozione della partizione esegui il comando: +Come possiamo vedere sopra, il [F] accanto alle partizioni indica che il disco è guasto o difettoso. + +Successivamente, rimuoviamo queste partizioni dagli array RAID. ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 +# mdadm: hot removed /dev/sda2 from /dev/md2 +``` -/dev/md4: - Version : 1.2 - Creation Time : Tue Jan 24 15:35:02 2023 - Raid Level : raid1 - Array Size : 1020767232 (973.48 GiB 1045.27 GB) - Used Dev Size : 1020767232 (973.48 GiB 1045.27 GB) - Raid Devices : 2 - Total Devices : 1 - Persistence : Superblock is persistent +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md4 --remove /dev/sda4 +# mdadm: hot removed /dev/sda4 from /dev/md4 +``` - Intent Bitmap : Internal +Per assicurarci di ottenere un disco simile a un disco vuoto, utilizziamo il comando seguente. 
Sostituite **sda** con i vostri valori : - Update Time : Tue Jan 24 16:28:03 2023 - State : clean, degraded - Active Devices : 1 - Working Devices : 1 - Failed Devices : 0 - Spare Devices : 0 +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +sda 8:0 0 1.8T 0 disk +sdb 8:16 0 1.8T 0 disk +├─sdb1 8:17 0 1M 0 part +├─sdb2 8:18 0 888.9G 0 part +│ └─md2 9:2 0 888.8G 0 raid1 / +├─sdb3 8:19 0 512M 0 part [SWAP] +└─sdb4 8:20 0 973.6G -Consistency Policy : bitmap +# mdadm: re-added /dev/sda4 +``` - Name : md4 - UUID : 7b5c1d80:0a7ab4c2:e769b5e5:9c6eaa0f - Events : 21 +Utilizza il comando seguente per monitorare la ricostruzione del RAID : - Number Major Minor RaidDevice State - - 0 0 0 removed - 1 8 20 1 active sync /dev/sdb4 -``` +```sh +[user@server_ip ~]# cat /proc/mdstat -### Ricostruzione del RAID +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[0] sdb2[1] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 4/4 pages [16KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` -Una volta sostituito il disco, copia la tabella delle partizioni da un disco funzionante (nell’esempio, “sdb”) in quello nuovo (“sda”) con il comando: +Infine, aggiungiamo un'etichetta e montiamo la partizione [SWAP] (se necessario). -**Per le partizioni GPT** +Per aggiungere un'etichetta alla partizione SWAP : ```sh -sgdisk -R /dev/sda /dev/sdb +[user@server_ip ~]# sudo mkswap /dev/sdb4 -L swap-sdb4 ``` -L'ordine deve essere nel seguente formato: `sgdisk -R /dev/newdisk /dev/healthydisk`. +Successivamente, recuperiamo gli UUID delle due partizioni swap : + +```sh +[user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +[user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +``` -Una volta effettuata questa operazione, lo step successivo consiste nel rendere aleatoria la guida del nuovo disco per evitare qualsiasi conflitto di GUID con gli altri dischi: +Sostituiamo l'UUID vecchio della partizione swap (**sda4**) con il nuovo in `/etc/fstab` : ```sh -sgdisk -G /dev/sda +[user@server_ip ~]# sudo nano etc/fstab ``` -**Per le partizioni MBR** +Assicurati di sostituire l'UUID corretto. -Una volta sostituito il disco, copia la tabella delle partizioni da un disco funzionante (nell’esempio, “sdb”) in quello nuovo (“sda”) con il comando: +Successivamente, ricarica il sistema con il comando seguente : ```sh -sfdisk -d /dev/sdb | sfdisk /dev/sda +[user@server_ip ~]# sudo systemctl daemon-reload ``` -L'ordine deve essere nel seguente formato: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +Esegui il comando seguente per attivare la partizione swap : + +```sh +[user@server_ip ~]# sudo swapon -av +``` + +La ricostruzione del RAID è ora completata. + + + +/// details | **Ricostruzione del RAID in modalità rescue** + +Una volta sostituito il disco, dobbiamo copiare la tabella delle partizioni del disco sano (in questo esempio, sda) verso il nuovo (sdb). 
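Prima di copiare la tabella delle partizioni, è possibile, per precauzione, salvarne una copia di backup dal disco sano. Si tratta di un passaggio facoltativo; l'esempio seguente è puramente indicativo e i nomi dei file (`/root/sda.gpt.bak`, `/root/sda.mbr.bak`) sono scelti solo come esempio.

```sh
# Backup della tabella delle partizioni GPT del disco sano (esempio indicativo)
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk --backup=/root/sda.gpt.bak /dev/sda

# Per una tabella MBR si può usare, ad esempio:
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sfdisk -d /dev/sda > /root/sda.mbr.bak
```

Se necessario, la tabella salvata può essere ripristinata con `sgdisk --load-backup=/root/sda.gpt.bak /dev/sda` (GPT) oppure con `sfdisk /dev/sda < /root/sda.mbr.bak` (MBR).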
+ +> [!tabs] +> **Per le partizioni GPT** +>> +>> ```sh +>> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> Il comando deve essere nel formato seguente : `sgdisk -R /dev/nuovo disco /dev/disco sano` +>> +>> Esempio : +>> +>> ```sh +>> sudo sgdisk -R /dev/sdb /dev/sda +>> ``` +>> +>> Una volta completata questa operazione, il passo successivo consiste nell'assegnare un GUID casuale al nuovo disco per evitare conflitti con i GUID di altri dischi : +>> +>> ```sh +>> sudo sgdisk -G /dev/sdb +>> ``` +>> +>> Se appare il seguente messaggio : +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> È sufficiente eseguire il comando `partprobe`. Se non riesci comunque a visualizzare le nuove partizioni (ad esempio con `lsblk`), devi riavviare il server prima di procedere. +>> +> **Per le partizioni MBR** +>> +>> ```sh +>> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb +>> ``` +>> +>> Il comando deve essere nel formato seguente : `sfdisk -d /dev/disco sano | sfdisk /dev/nuovo disco` +>> + +Possiamo ora ricostruire l'array RAID. L'estratto di codice seguente mostra come aggiungere le nuove partizioni (sdb2 e sdb4) nell'array RAID. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 +# mdadm: added /dev/sdb2 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 +# mdadm: re-added /dev/sdb4 +``` -A questo punto è possibile ricostruire il volume RAID. Il codice seguente mostra come ricostruire la partizione `/dev/md4` tramite la tabella di “sda” copiata precedentemente: +Utilizza il comando `cat /proc/mdstat` per monitorare la ricostruzione del RAID : ```sh -mdadm --add /dev/md4 /dev/sda4 -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 1/7 pages [4KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk -md4 : active raid1 sda4[0] sdb4[1] - 1020767232 blocks super 1.2 [2/2] [UU] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -Per verificare i dettagli del RAID, utilizza il comando: +Per ulteriori dettagli su una o più matrici RAID : ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -372,20 +411,131 @@ mdadm --detail /dev/md4 1 8 18 1 active sync /dev/sdb4 ``` -Una volta che il RAID è stato ricostruito, effettua il mount della partizione (nell’esempio, `/dev/md4`) con il comando: + + +#### Aggiunta dell'etichetta alla partizione SWAP (se necessario) + +Una volta completata la ricostruzione del RAID, montiamo la partizione che contiene la radice del nostro sistema operativo su `/mnt`. Nell'esempio, questa partizione è `md4`. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt +``` + +Aggiungiamo l'etichetta alla nostra partizione swap con il comando : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sda4 -L swap-sda4 +mkswap: /dev/sda4: warning: wiping old swap signature. +Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Successivamente, montiamo le seguenti directory per assicurarci che qualsiasi modifica che effettuiamo nell'ambiente chroot funzioni correttamente : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Successivamente, accediamo all'ambiente `chroot` : ```sh -mount /dev/md4 /home +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt ``` +Recuperiamo gli UUID delle due partizioni swap : + +```sh +root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 +root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 +``` + +Esempio: + +```sh +blkid /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +``` + +```sh +blkid /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +Successivamente, sostituiamo l'UUID vecchio della partizione swap (**sdb4**) con il nuovo in `/etc/fstab` : + +```sh +root@rescue12-customer-eu:/# nano etc/fstab +``` + +Esempio: + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /home ext4 defaults 0 0 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Assicurati di sostituire l'UUID corretto. Nell'esempio sopra, l'UUID da sostituire è `d6af33cf-fc15-4060-a43c-cb3b5537f58a` con il nuovo `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Assicurati di sostituire l'UUID corretto. + +Successivamente, verifichiamo che tutto sia correttamente montato : + +```sh +root@rescue12-customer-eu:/# mount -av +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Ricarica il sistema con il comando seguente : + +```sh +root@rescue12-customer-eu:/# systemctl daemon-reload +``` + +Attiva la partizione swap con il comando seguente : + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/sda4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sda4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sda4 +swapon: /dev/sdb4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sdb4 +``` + +Esci dall'ambiente Chroot con `exit` e smonta tutti i dischi : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` + +Abbiamo ora completato con successo la ricostruzione del RAID sul server e possiamo ora riavviarlo in modalità normale. 
+ ## Per saperne di più -[Hot Swap – RAID Software](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) +[Hotswap - RAID software](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) + +[API OVHcloud e Archiviazione](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) + +[Gestione del RAID hardware](/pages/bare_metal_cloud/dedicated_servers/raid_hard) + +[Hotswap - RAID hardware](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) -[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) +Per servizi specializzati (posizionamento, sviluppo, ecc.), contatta i [partner OVHcloud](/links/partner). -[Hot Swap - RAID Hardware](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) +Se desideri ricevere un supporto sull'utilizzo e la configurazione delle tue soluzioni OVHcloud, consulta le nostre diverse [offerte di supporto](/links/support). -[Gestire il RAID Hardware](/pages/bare_metal_cloud/dedicated_servers/raid_hard) (in inglese) +Se hai bisogno di un corso o di un supporto tecnico per l'implementazione delle nostre soluzioni, contatta il tuo commerciale o clicca su [questo link](/links/professional-services) per ottenere un preventivo e richiedere un'analisi personalizzata del tuo progetto ai nostri esperti del team Professional Services. -Contatta la nostra Community di utenti all’indirizzo . +Contatta la nostra [Community di utenti](/links/community). \ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pl-pl.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pl-pl.md index 003629b1631..b26dc5ac034 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pl-pl.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pl-pl.md @@ -1,34 +1,74 @@ --- -title: Konfiguracja i rekonstrukcja programowej macierzy RAID -excerpt: "Dowiedz się, jak sprawdzić stan programowej macierzy RAID Twojego serwera i odtworzyć ją po wymianie dysku" -updated: 2023-08-21 +title: Zarządzanie i odbudowanie oprogramowania RAID na serwerach w trybie rozruchu legacy (BIOS) +excerpt: Dowiedz się, jak zarządzać i odbudować oprogramowanie RAID po wymianie dysku na serwerze w trybie rozruchu legacy (BIOS) +updated: 2025-12-11 --- -> [!primary] -> Tłumaczenie zostało wygenerowane automatycznie przez system naszego partnera SYSTRAN. W niektórych przypadkach mogą wystąpić nieprecyzyjne sformułowania, na przykład w tłumaczeniu nazw przycisków lub szczegółów technicznych. W przypadku jakichkolwiek wątpliwości zalecamy zapoznanie się z angielską/francuską wersją przewodnika. Jeśli chcesz przyczynić się do ulepszenia tłumaczenia, kliknij przycisk "Zgłóś propozycję modyfikacji" na tej stronie. -> - ## Wprowadzenie -RAID (Redundant Array of Independent Disks) to narzędzie pozwalające zminimalizować ryzyko utraty danych zapisanych na serwerze poprzez ich replikację na wielu dyskach. +Redundantny zbiór niezależnych dysków (RAID) to technologia, która zmniejsza utratę danych na serwerze, replikując dane na dwóch lub więcej dyskach. + +Domyślny poziom RAID dla instalacji serwerów OVHcloud to RAID 1, który podwaja przestrzeń zajmowaną przez dane, skutecznie zmniejszając wykorzystywalną przestrzeń dyskową. + +**Ta instrukcja wyjaśnia, jak zarządzać i odbudować oprogramowanie RAID w przypadku wymiany dysku na serwerze w trybie rozruchu legacy (BIOS).** -Domyślny poziom RAID dla serwerów OVHcloud to RAID 1. 
Dzięki temu przestrzeń zajmowana przez dane zwiększa się dwukrotnie, natomiast wielkość użytkowanej przestrzeni dyskowej zmniejsza się o połowę. +Zanim zaczniemy, zwróć uwagę, że ta instrukcja koncentruje się na Serwerach dedykowanych, które używają trybu rozruchu legacy (BIOS). Jeśli Twój serwer używa trybu UEFI (nowsze płyty główne), odwiedź tę instrukcję [Zarządzanie i odbudowanie oprogramowania RAID na serwerach w trybie rozruchu UEFI](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). -**W tym przewodniku wyjaśniamy, jak skonfigurować macierz RAID Twojego serwera w przypadku, gdy musi ona zostać odtworzona z powodu awarii lub uszkodzenia dysku.** +Aby sprawdzić, czy serwer działa w trybie legacy BIOS czy UEFI, uruchom następujące polecenie: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` ## Wymagania początkowe -- Posiadanie [serwera dedykowanego](/links/bare-metal/bare-metal) ze skonfigurowaną programową macierzą RAID -- Dostęp do serwera przez SSH przy użyciu uprawnień administratora (sudo) +- Serwer [Dedykowany](/links/bare-metal/bare-metal) z konfiguracją oprogramowania RAID +- Dostęp administracyjny (sudo) do serwera przez SSH +- Zrozumienie RAID i partycji ## W praktyce -### Usuwanie dysku +Kiedy zakupisz nowy serwer, możesz czuć potrzebę wykonania szeregu testów i działań. Jednym z takich testów może być symulacja awarii dysku, aby zrozumieć proces odbudowy RAID i przygotować się na wypadek, gdyby to się kiedykolwiek zdarzyło. + +### Omówienie treści + +- [Podstawowe informacje](#basicinformation) +- [Symulowanie awarii dysku](#diskfailure) + - [Usuwanie uszkodzonego dysku](#diskremove) +- [Odbudowanie RAID](#raidrebuild) + - [Odbudowanie RAID w trybie ratunkowym](#rescuemode) + - [Dodawanie etykiety do partycji SWAP (jeśli dotyczy)](#swap-partition) + - [Odbudowanie RAID w trybie normalnym](#normalmode) + + + +### Podstawowe informacje -Weryfikacja aktualnego stanu RAID za pomocą polecenia: +W sesji wiersza poleceń wpisz poniższe polecenie, aby określić bieżący stan RAID: ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 nvme0n1p2[1] nvme0n1p20] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +To polecenie pokazuje nam, że mamy dwa urządzenia RAID oprogramowania obecnie skonfigurowane, z **md4** będącym największym z nich. Urządzenie RAID **md4** składa się z dwóch partycji, które są znane jako **nvme1n1p4** i **nvme0n1p4**. + +[UU] oznacza, że wszystkie dyski działają normalnie. `_` wskazuje na uszkodzony dysk. + +Jeśli masz serwer z dyskami SATA, otrzymasz następujące wyniki: + +```sh +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda2[1] sdb2[0] @@ -42,12 +82,10 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Polecenie wskazuje dwie aktualnie skonfigurowane macierze RAID, przy czym “md4” jest największą partycją. Partycja składa się z dwóch dysków o nazwach: “sda4” i “sdb4”. [UU] oznacza, że wszystkie dyski działają prawidłowo. “`_`” wskazuje wadliwy dysk. - -W poleceniu ukazane są wielkości macierzy RAID, nie podane są jednak rozmiary samych partycji. 
Informację tę można uzyskać za pomocą polecenia: +Choć to polecenie zwraca nasze objętości RAID, nie mówi nam o rozmiarze samych partycji. Te informacje możemy znaleźć za pomocą poniższego polecenia: ```sh -fdisk -l +[user@server_ip ~]# sudo fdisk -l Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors Disk model: HGST HUS724020AL @@ -89,73 +127,16 @@ Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes ``` -Komenda `fdisk -l` pozwala również na zidentyfikowanie partycji. Pamiętaj, że w przypadku awarii dysku możesz odtworzyć RAID. +Polecenie `fdisk -l` pozwala również zidentyfikować typ partycji. Jest to ważna informacja, gdy chodzi o odbudowanie RAID w przypadku awarii dysku. -W przypadku partycji **GPT** komenda zwróci: `Disklabel type: gpt`. +Dla partycji **GPT**, linia 6 będzie wyświetlać: `Disklabel type: gpt`. Ta informacja może być widoczna tylko, gdy serwer działa w trybie normalnym. -```sh -Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors -Disk model: HGST HUS724020AL -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: gpt' -Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F -``` +Zgodnie z wynikami `fdisk -l`, możemy stwierdzić, że `/dev/md2` składa się z 888,8 GB, a `/dev/md4` zawiera 973,5 GB. -W przypadku partycji **MBR** polecenie zwróci: `Disklabel type: dos`. +Alternatywnie, polecenie `lsblk` oferuje inny widok partycji: ```sh -Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors -Disk model: QEMU HARDDISK -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: dos' -Disk identifier: 0x150f6797 -``` - -Polecenie pokazuje, że `/dev/md2` ma wielkość 888,8 GB, a `/dev/md4` 973,5 GB. Zastosuj polecenie “mount”, aby zobaczyć stan dysku. 
- -```sh -mount - -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -/dev/md4 on /home type ext3 (rw,relatime) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) -``` - -Komenda `lsblk` oferuje inny widok na partycje: - -```sh -lsblk +[user@server_ip ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk @@ -175,172 +156,250 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Aktualnie dyski są zamontowane domyślnie. Aby usunąć dysk z macierzy RAID, najpierw odmontuj dysk, po czym wykonaj symulację błędu, aby ostatecznie go usunąć. -Następnie usuń `/dev/sda4` z macierzy RAID za pomocą polecenia: +Zwracamy uwagę na urządzenia, partycje i ich punkty montowania. Z powyższych poleceń i wyników mamy: + +- Dwa tablice RAID: `/dev/md2` i `/dev/md4`. +- Cztery partycje należące do RAID z punktami montowania: `/` i `/home`. + + + +### Symulowanie awarii dysku + +Teraz, gdy mamy wszystkie niezbędne informacje, możemy zasymulować awarię dysku i kontynuować testy. W tym przykładzie zasymulujemy awarię dysku `sda`. + +Preferowany sposób to wykonanie tego za pośrednictwem środowiska ratunkowego OVHcloud. + +Najpierw uruchom serwer w trybie ratunkowym i zaloguj się przy użyciu dostarczonych poświadczeń. 
+ +Aby usunąć dysk z RAID, pierwszym krokiem jest oznaczenie go jako **Awaryjny** i usunięcie partycji z ich odpowiednich tablic RAID. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1] sdb2[0] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0] sdb4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +Z powyższego wyniku wynika, że sda składa się z dwóch partycji w RAID, które to **sda2** i **sda4**. + + + +#### Usuwanie uszkodzonego dysku + +Najpierw oznaczamy partycje **sda2** i **sda4** jako awaryjne. ```sh -umount /dev/md4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 +# mdadm: set /dev/sda2 faulty in /dev/md2 ``` -> [!warning] -> Pamiętaj, że jeśli jesteś zalogowany jako użytkownik `root`, możesz uzyskać następujący komunikat podczas próby odmontowania partycji (w naszym przypadku, kiedy nasza partycja md4 jest zamontowana w /home): -> ->
umount: /home: target is busy
-> -> W tym przypadku należy wylogować się jako użytkownik root i zalogować się jako użytkownik lokalny (w naszym przypadku, `debian`) i użyć następującej komendy: -> ->
debian@ns000000:/$ sudo umount /dev/md4
-> -> Jeśli nie posiadasz lokalnego użytkownika, [musisz go utworzyć](/pages/bare_metal_cloud/dedicated_servers/changing_root_password_linux_ds). +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 --fail /dev/sda4 +# mdadm: set /dev/sda4 faulty in /dev/md4 +``` -Wynik będzie następujący: +Teraz zasymulowaliśmy awarię RAID, a po uruchomieniu polecenia `cat /proc/mdstat` mamy następujące dane wyjściowe: ```sh -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1](F) sdb2[0] + 931954688 blocks super 1.2 [2/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: ``` -Wpis `/dev/md4` nie jest już zamontowany. Jednak macierz RAID jest nadal aktywna. Konieczna jest zatem symulacja błędu umożliwiająca usunięcie dysku. W tym celu zastosuj polecenie: +Jak widać powyżej, [F] obok partycji wskazuje, że dysk uległ awarii lub jest uszkodzony. + +Następnie usuwamy te partycje z tablic RAID. 
```sh -sudo mdadm --fail /dev/md4 /dev/sda4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 +# mdadm: hot removed /dev/sda2 from /dev/md2 ``` -Symulacja błędu RAID została wykonana. Następny krok to usunięcie partycji z macierzy RAID za pomocą polecenia: +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md4 --remove /dev/sda4 +# mdadm: hot removed /dev/sda4 from /dev/md4 +``` + +Aby upewnić się, że otrzymamy dysk podobny do pustego dysku, używamy poniższego polecenia. Zamień **sda** na swoje własne wartości: ```sh -sudo mdadm --remove /dev/md4 /dev/sda4 +shred -s + +# mdadm: ponownie dodano /dev/sda4 ``` -Możesz sprawdzić, czy partycja została usunięta, stosując polecenie: +Aby monitorować odbudowę RAID, użyj poniższego polecenia: ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 4/7 pages [16KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk -md4 : active raid1 sdb4[1] - 1020767232 blocks super 1.2 [2/1] [_U] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -Poniższe polecenie pozwala upewnić się, czy partycja została usunięta: +Na koniec dodajemy etykietę i montujemy partycję [SWAP] (jeśli dotyczy). + +Aby dodać etykietę do partycji SWAP: ```sh -mdadm --detail /dev/md4 +[user@server_ip ~]# sudo mkswap /dev/sda4 -L swap-sda4 +``` -/dev/md4: - Version : 1.2 - Creation Time : Tue Jan 24 15:35:02 2023 - Raid Level : raid1 - Array Size : 1020767232 (973.48 GiB 1045.27 GB) - Used Dev Size : 1020767232 (973.48 GiB 1045.27 GB) - Raid Devices : 2 - Total Devices : 1 - Persistence : Superblock is persistent +Następnie pobierz UUID obu partycji SWAP: - Intent Bitmap : Internal +```sh +[user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +[user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` - Update Time : Tue Jan 24 16:28:03 2023 - State : clean, degraded - Active Devices : 1 - Working Devices : 1 - Failed Devices : 0 - Spare Devices : 0 +Zastępujemy stary UUID partycji SWAP (**sda4**) nowym w pliku `/etc/fstab`. -Consistency Policy : bitmap +Przykład: - Name : md4 - UUID : 7b5c1d80:0a7ab4c2:e769b5e5:9c6eaa0f - Events : 21 +```sh +[user@server_ip ~]# sudo nano etc/fstab - Number Major Minor RaidDevice State - - 0 0 0 removed - 1 8 20 1 active sync /dev/sdb4 +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=BIOS /boot vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 ``` -### Rekonstrukcja RAID - -Po wymianie dysku skopiuj tablicę partycji ze zdrowego dysku, (w tym przykładzie dysk “sdb”) do nowego dysku “sda” za pomocą następującego polecenia: +Na podstawie powyższych wyników, stary UUID to `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` i powinien zostać zastąpiony nowym `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Upewnij się, że zastępujesz poprawny UUID. 
-**Dla partycji GPT** +Następnie sprawdzamy, czy wszystko zostało poprawnie zamontowane, używając poniższego polecenia: ```sh -sgdisk -R /dev/sda /dev/sdb +[user@server_ip ~]# sudo mount -av +/ : ignored +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored ``` -Polecenie musi mieć następujący format: `sgdisk -R /dev/newdisk /dev/healthydisk`. - -Po wykonaniu tej operacji kolejny krok polega na losowym odwzorowaniu GUID nowego dysku, aby uniknąć konfliktu GUID z innymi dyskami: +Uruchom poniższe polecenie, aby włączyć partycję SWAP: ```sh -sgdisk -G /dev/sda +[user@server_ip ~]# sudo swapon -av ``` -**Dla partycji MBR** - -Po wymianie dysku skopiuj tablicę partycji ze zdrowego dysku, (w tym przykładzie dysk “sdb”) do nowego dysku “sda” za pomocą następującego polecenia: +Następnie przeładuj system poniższym poleceniem: ```sh -sfdisk -d /dev/sdb | sfdisk /dev/sda +[user@server_ip ~]# sudo systemctl daemon-reload ``` -Polecenie musi mieć następujący format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +W ten sposób skończyliśmy pomyślnie odbudowę RAID. + + + +/// details | **Odbudowanie RAID w trybie ratunkowym** + +Jeśli Twój serwer nie może uruchomić się w trybie normalnym po wymianie dysku, zostanie on uruchomiony w trybie ratunkowym. + +W tym przykładzie wymieniamy dysk `sdb`. + +Po wymianie dysku musimy skopiować tablicę partycji z dysku sprawnego (w tym przykładzie sda) na nowy (sdb). + +> [!tabs] +> **Dla partycji GPT** +>> +>> ```sh +>> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> Polecenie powinno mieć ten format: `sgdisk -R /dev/newdisk /dev/healthydisk` +>> +>> Przykład: +>> +>> ```sh +>> sudo sgdisk -R /dev/sdb /dev/sda +>> ``` +>> +>> Po wykonaniu tego kroku następnym krokiem jest zrandomizowanie GUID nowego dysku, aby uniknąć konfliktów GUID z innymi dyskami: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdb +>> ``` +>> +>> Jeśli otrzymasz następującą wiadomość: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> Możesz po prostu uruchomić polecenie `partprobe`. +>> +> **Dla partycji MBR** +>> +>> ```sh +>> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb +>> ``` +>> +>> Polecenie powinno mieć ten format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +>> + +Teraz możemy odbudować tablicę RAID. Poniższy fragment kodu pokazuje, jak możemy ponownie dodać nowe partycje (sdb2 i sdb4) do tablicy RAID. -Teraz możesz odtworzyć macierz RAID. 
Poniższy fragment kodu pokazuje, jak odtworzyć układ partycji `/dev/md4` za pomocą skopiowanej tablicy partycji “sda”: +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 +# mdadm: added /dev/sdb2 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 +# mdadm: re-added /dev/sdb4 +``` + +Użyj polecenia `cat /proc/mdstat`, aby monitorować odbudowę RAID: ```sh -mdadm —add /dev/md4 /dev/sda4 -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities: [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2: active raid1 sda2[1] sdb2[0] +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 1/7 stron [4KB], 65536KB chunk - -md4: active raid1 sda4[0] sdb4[1] - 1020767232 blocks super 1.2 [2/2] [UU] - bitmap: 0/8 stron [0KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec + bitmap: 0/8 pages [0KB], 65536KB chunk unused devices: ``` -Sprawdź szczegóły dotyczące RAID za pomocą polecenia: +Aby uzyskać więcej szczegółów na temat tablicy RAID: ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -367,22 +426,130 @@ mdadm --detail /dev/md4 Events : 0.95 Number Major Minor RaidDevice State - 0 8 2 0 spare rebuilding /dev/sda4 - 1 8 18 1 active sync /dev/sdb4 + 0 8 2 0 active sync /dev/sda4 + 1 8 18 1 spare rebuilding /dev/sdb4 +``` + + + +#### Dodanie etykiety do partycji SWAP (jeśli dotyczy) + +Po zakończeniu odbudowy RAID montujemy partycję zawierającą korzeń naszego systemu operacyjnego na `/mnt`. W naszym przykładzie tą partycją jest `md4`. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt +``` + +Dodajemy etykietę do naszej partycji SWAP za pomocą polecenia: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sdb4 -L swap-sdb4 +mkswap: /dev/sdb4: warning: wiping old swap signature. 
+Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-sdb4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Następnie montujemy poniższe katalogi, aby upewnić się, że wszystkie operacje w środowisku chroot będą działać poprawnie: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Następnie wchodzimy do środowiska `chroot`: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt +``` + +Pobieramy UUID obu partycji SWAP: + +```sh +root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 +root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 +``` + +Przykład: + +```sh +blkid /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +blkid /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +Następnie zastępujemy stary UUID partycji SWAP (**sdb4**) nowym w pliku `/etc/fstab`: + +```sh +root@rescue12-customer-eu:/# nano etc/fstab +``` + +Przykład: + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /home ext4 defaults 0 0 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Upewnij się, że zastępujesz poprawny UUID. W powyższym przykładzie UUID do zastąpienia to `d6af33cf-fc15-4060-a43c-cb3b5537f58a` nowym `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Upewnij się, że zastępujesz poprawny UUID. + +Następnie upewniamy się, że wszystko zostało poprawnie zamontowane: + +```sh +root@rescue12-customer-eu:/# mount -av +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored ``` -Macierz RAID została odtworzona. Zamontuj partycję (w tym przykładzie `/dev/md4`) za pomocą polecenia: +Włącz partycję SWAP poniższym poleceniem: ```sh -mount /dev/md4 /home +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/sda4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sda4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sda4 +swapon: /dev/sdb4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sdb4 ``` +Wyjdź z środowiska `chroot` za pomocą `exit` i przeładuj system: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload +``` + +Odmontuj wszystkie dyski: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` + +W ten sposób pomyślnie zakończyliśmy odbudowę RAID na serwerze i teraz możemy go ponownie uruchomić w trybie normalnym. 
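Na serwerach uruchamianych w trybie legacy (BIOS) program rozruchowy GRUB znajduje się poza macierzą RAID, w obszarze rozruchowym każdego dysku. Po wymianie dysku warto więc rozważyć ponowną instalację GRUB-a na nowym dysku, jeszcze w środowisku chroot (przed poleceniem `umount -R /mnt`). Poniższy szkic zakłada dystrybucję opartą na Debianie oraz dysk `/dev/sdb` z powyższego przykładu; dostosuj go do własnej konfiguracji.

```sh
# Szkic (w środowisku chroot): ponowna instalacja GRUB-a na wymienionym dysku
root@rescue12-customer-eu:/# grub-install /dev/sdb
root@rescue12-customer-eu:/# update-grub
```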
+ ## Sprawdź również -[Wymiana dysku bez wyłączania serwera – Programowa macierz RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) +[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) + +[OVHcloud API i Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) + +[Zarządzanie hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) -[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) +[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) -[Sprzętowa macierz RAID (EN)](/pages/bare_metal_cloud/dedicated_servers/raid_hard) +Dla usług specjalistycznych (SEO, rozwój, itp.), skontaktuj się z [partnerami OVHcloud](/links/partner). + +Jeśli potrzebujesz pomocy w użyciu i konfiguracji rozwiązań OVHcloud, skorzystaj z naszych [ofert wsparcia](/links/support). -Przyłącz się do społeczności naszych użytkowników na stronie . +Jeśli potrzebujesz szkoleń lub pomocy technicznej w wdroż \ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pt-pt.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pt-pt.md index 81cd9e6eaf2..c171f2ece6f 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pt-pt.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pt-pt.md @@ -1,34 +1,73 @@ --- -title: Configuração e reconstrução do software RAID -excerpt: "Descubra como verificar o estado do RAID software do seu servidor e reconstruí-lo após uma substituição de disco" -updated: 2023-08-21 +title: Gestão e reconstrução do RAID software nos servidores em modo de arranque legado (BIOS) +excerpt: "Aprenda a gerir e reconstruir o RAID software após a substituição de um disco no seu servidor em modo de arranque legado (BIOS)" +updated: 2025-12-11 --- -> [!primary] -> Esta tradução foi automaticamente gerada pelo nosso parceiro SYSTRAN. Em certos casos, poderão ocorrer formulações imprecisas, como por exemplo nomes de botões ou detalhes técnicos. Recomendamos que consulte a versão inglesa ou francesa do manual, caso tenha alguma dúvida. Se nos quiser ajudar a melhorar esta tradução, clique em "Contribuir" nesta página. -> - ## Objetivo -O RAID (Redundant Array of Independent Disks) é um conjunto de técnicas concebidas para atenuar a perda de dados num servidor através da replicação dos dados em vários discos. +O RAID (Redundant Array of Independent Disks) é um conjunto de técnicas concebidas para mitigar a perda de dados num servidor replicando-os em vários discos. + +O nível RAID predefinido para as instalações de servidores OVHcloud é o RAID 1, o que duplica o espaço ocupado pelos seus dados, reduzindo assim metade do espaço de disco utilizável. + +**Este guia explica como gerir e reconstruir um RAID software em caso de substituição de um disco no seu servidor em modo de arranque legado (BIOS).** -O nível de RAID predefinido nos servidores da OVHcloud é RAID 1, ou seja, o dobro do espaço ocupado pelos dados, reduzindo assim para metade o espaço de disco utilizável. +Antes de começar, note que este guia se concentra nos servidores dedicados que utilizam o modo de arranque legado (BIOS). Se o seu servidor utiliza o modo UEFI (placas-mãe mais recentes), consulte este guia [Gestão e reconstrução do RAID software nos servidores em modo de arranque UEFI](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). 
-**Este manual explica-lhe como configurar a matriz RAID de um servidor em caso de ter de ser reconstruída por motivos de corrupção ou de avaria de disco.** +Para verificar se um servidor está a executar em modo BIOS ou em modo UEFI, execute o seguinte comando : + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` ## Requisitos -- Dispor de um [servidor dedicado](/links/bare-metal/bare-metal) com uma configuração RAID por software. -- Ter acesso ao servidor através de SSH enquanto administrador (sudo). +- Ter um [servidor dedicado](/links/bare-metal/bare-metal) com uma configuração de RAID software. +- Ter acesso ao seu servidor via SSH com privilégios de administrador (sudo). +- Conhecimento de RAID e partições ## Instruções -### Retirada do disco +### Apresentação do conteúdo + +- [Informações básicas](#basicinformation) +- [Simular uma falha de disco](#diskfailure) + - [Remover o disco defeituoso](#diskremove) +- [Reconstrução do RAID](#raidrebuild) + - [Reconstrução do RAID em modo rescue](#rescuemode) + - [Adicionar o rótulo à partição SWAP (se aplicável)](#swap-partition) + - [Reconstrução do RAID em modo normal](#normalmode) + + + + +### Informações básicas + +Numa sessão de linha de comandos, introduza o seguinte código para determinar o estado atual do RAID. + +```sh +[user@server_ip ~]# cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 nvme0n1p2[1] nvme0n1p20] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: +``` + +Este comando indica-nos que dois dispositivos RAID software estão atualmente configurados, sendo **md4** o maior. O dispositivo RAID **md4** é composto por duas partições, denominadas **nvme1n1p4** e **nvme0n1p4**. -A verificação do estado atual do RAID pode ser efetuado através do seguinte comando: +O [UU] significa que todos os discos estão a funcionar normalmente. Um `_` indica um disco defeituoso. + +Se tiver um servidor com discos SATA, obterá os seguintes resultados : ```sh -cat /proc/mdstat +[user@server_ip ~]# cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md2 : active raid1 sda2[1] sdb2[0] @@ -42,12 +81,10 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Este comando mostra as duas matrizes RAID que estão configuradas, sendo que “md4” é a maior partição. Uma partição é composta dois discos, chamados “sda4” e ”sdb4”. `[UU]` significa que todos os discos funcionam normalmente. Um “`_`” indicará um disco defeituoso. - -Embora este comando mostre os volumes RAID, este não indica o tamanho das próprias partições. Para obter esta informação, utilize o seguinte comando: +Embora este comando nos devolva os nossos volumes RAID, não nos indica o tamanho das próprias partições. Podemos encontrar esta informação com o seguinte comando : ```sh -fdisk -l +[user@server_ip ~]# sudo fdisk -l Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors Disk model: HGST HUS724020AL @@ -89,73 +126,16 @@ Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes ``` -O comando `fdisk -l` permite-lhe também identificar o seu tipo de partição. Esta é uma informação importante para saber quando se trata de reconstruir o seu RAID em caso de falha de um disco. 
+O comando `fdisk -l` permite-nos também identificar o tipo de partição. Esta é uma informação importante para reconstruir o seu RAID em caso de falha de um disco. -Para as partições **GPT**, o comando voltará: `Disklabel type: gpt`. +Para as partições **GPT**, a linha 6 mostrará: `Disklabel type: gpt`. Estas informações só são visíveis quando o servidor está em modo normal. -```sh -Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors -Disk model: HGST HUS724020AL -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: gpt' -Disk identifier: F92B6C5B-2518-4B2D-8FF9-A311DED5845F -``` +Ainda com base nos resultados de `fdisk -l`, podemos ver que `/dev/md2` é composto por 888.8GB e `/dev/md4` contém 973.5GB. -Para as partições **MBR**, o comando voltará: `Disklabel type: dos`. +Alternativamente, o comando `lsblk` oferece uma visão diferente das partições : ```sh -Disk /dev/sda: 2.5 GiB, 2621440000 bytes, 5120000 sectors -Disk model: QEMU HARDDISK -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -'Disklabel type: dos' -Disk identifier: 0x150f6797 -``` - -Este comando mostra que `/dev/md2` é composto por 888,8 GB e `/dev/md4` contém 973,5 GB. Para mostrar a disposição do disco, execute o comando “mount”. - -```sh -mount - -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) 
-systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -/dev/md4 on /home type ext3 (rw,relatime) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) -``` - -Alternativamente, a encomenda `lsblk` oferece uma visão diferente das partições: - -```sh -lsblk +[user@server_ip ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk @@ -175,172 +155,237 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Os discos estão montados por predefinição. Para retirar um disco, em primeiro lugar deve desmontar o disco e, a seguir, simular uma falha para o poder eliminar. -De seguida, elimine `/dev/sda4` do RAID com o seguinte comando: +Temos em conta os dispositivos, as partições e os seus pontos de montagem. A partir dos comandos e resultados acima, temos : -```sh -umount /dev/md4 -``` +- Dois arrays RAID: `/dev/md2` e `/dev/md4`. +- Quatro partições fazem parte do RAID com os pontos de montagem: `/` e `/home`. + + + +### Simular uma falha de disco + +Agora que dispomos de todas as informações necessárias, podemos simular uma falha de disco e continuar com os testes. Neste exemplo, vamos fazer falhar o disco `sda`. -> [!warning] -> Atenção: se estiver conectado como utilizador `root`, pode obter a seguinte mensagem quando estiver a tentar desmontar a partição (no nosso caso, em que a nossa partição md4 está montada em /home): -> ->
umount: /home: target is busy
-> -> Neste caso, deve desligar-se enquanto utilizador root e ligar-se como utilizador local (no nosso caso, `debian`) e utilizar o seguinte comando: -> ->
debian@ns000000:/$ sudo umount /dev/md4
-> -> Se não dispõe de um utilizador local, [deve criar um](/pages/bare_metal_cloud/dedicated_servers/changing_root_password_linux_ds). +O meio preferido para isso é o ambiente em modo rescue da OVHcloud. -O resultado deverá ser este: +Reinicie primeiro o servidor em modo rescue e faça login com as credenciais fornecidas. + +Para remover um disco do RAID, o primeiro passo é marcá-lo como **Failed** e remover as partições dos seus arrays RAID respetivos. ```sh -sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) -proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) -udev on /dev type devtmpfs (rw,nosuid,relatime,size=16315920k,nr_inodes=4078980,mode=755) -devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) -tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=3266556k,mode=755) -/dev/md2 on / type ext4 (rw,relatime) -securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) -tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) -tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) -tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) -cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) -cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) -pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) -bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) -cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) -cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) -cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) -cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma) -cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) -cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) -cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) -cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) -cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) -cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) -debugfs on /sys/kernel/debug type debugfs (rw,relatime) -hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M) -mqueue on /dev/mqueue type mqueue (rw,relatime) -systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=45,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10340) -tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3266552k,mode=700,uid=1000,gid=1000) +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[1] sdb2[0] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sda4[0] sdb4[1] + 1020767232 blocks super 1.2 [2/2] [UU] + bitmap: 0/8 pages [0KB], 65536KB chunk + +unused devices: ``` -A entrada de `/dev/md4` já não está montada. No entanto, o RAID ainda está ativo. Assim, é necessário simular uma falha para retirar o disco, o que pode ser efetuado graças ao seguinte comando: +A partir da saída acima, sda é composto por duas partições em RAID que são **sda2** e **sda4**. 
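Se pretender confirmar a que array RAID pertence cada uma destas partições antes de continuar, pode, a título meramente ilustrativo, examinar o superbloco de cada partição com `mdadm --examine` (adapte os nomes dos dispositivos à sua própria configuração):

```sh
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --examine /dev/sda2 | grep -E 'Array UUID|Name'
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --examine /dev/sda4 | grep -E 'Array UUID|Name'
```

O campo `Name` indica, normalmente, o array (por exemplo, `md2` ou `md4`) ao qual a partição está associada.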
+ + + +#### Remover o disco defeituoso + +Começamos por marcar as partições **sda2** e **sda4** como **failed**. ```sh -sudo mdadm --fail /dev/md4 /dev/sda4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 +# mdadm: set /dev/sda2 faulty in /dev/md2 ``` -Uma vez simulada a falha no RAID, pode eliminar a partição com o seguinte comando: - ```sh -sudo mdadm --remove /dev/md4 /dev/sda4 +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 --fail /dev/sda4 +# mdadm: set /dev/sda4 faulty in /dev/md4 ``` -Poderá verificar que a partição foi eliminada com o seguinte comando: +Agora simulámos uma falha do RAID, quando executamos o comando `cat /proc/mdstat`, obtemos o seguinte resultado : ```sh -cat /proc/mdstat +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] - 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 4/7 pages [16KB], 65536KB chunk +md2 : active raid1 sda2[1](F) sdb2[0] + 931954688 blocks super 1.2 [2/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk -md4 : active raid1 sdb4[1] - 1020767232 blocks super 1.2 [2/1] [_U] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/2] [_U] bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -O comando abaixo verifica que a partição foi eliminada: +Como podemos ver acima, o [F] ao lado das partições indica que o disco está defeituoso ou com falha. -```sh -mdadm --detail /dev/md4 +Em seguida, removemos estas partições dos arrays RAID. -/dev/md4: - Version : 1.2 - Creation Time : Tue Jan 24 15:35:02 2023 - Raid Level : raid1 - Array Size : 1020767232 (973.48 GiB 1045.27 GB) - Used Dev Size : 1020767232 (973.48 GiB 1045.27 GB) - Raid Devices : 2 - Total Devices : 1 - Persistence : Superblock is persistent +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 +# mdadm: hot removed /dev/sda2 from /dev/md2 +``` - Intent Bitmap : Internal +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md4 --remove /dev/sda4 +# mdadm: hot removed /dev/sda4 from /dev/md4 +``` - Update Time : Tue Jan 24 16:28:03 2023 - State : clean, degraded - Active Devices : 1 - Working Devices : 1 - Failed Devices : 0 - Spare Devices : 0 +Para nos certificarmos de que obtemos um disco semelhante a um disco vazio, utilizamos o seguinte comando. 
Substitua **sda** pelos seus próprios valores : -Consistency Policy : bitmap +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +sda 8:0 0 1.8T 0 disk +sdb 8:16 0 1.8T 0 disk +├─sdb1 8:17 0 1M 0 part +├─sdb2 8:18 0 888.9G 0 part +│ └─md2 9:2 0 888.8G 0 raid1 / +├─sdb3 8:19 0 512M 0 part [SWAP] +└─sdb4 8:20 0 973.6G 0 part + └─md4 9:4 0 973.5G 0 raid1 /home +``` - Name : md4 - UUID : 7b5c1d80:0a7ab4c2:e769b5e5:9c6eaa0f - Events : 21 +Se executarmos o seguinte comando, vemos que o - Number Major Minor RaidDevice State - - 0 0 0 removed - 1 8 20 1 active sync /dev/sdb4 +# mdadm: re-added /dev/sda4 ``` -### Reconstrução do RAID +Utilize o seguinte comando para monitorizar a reconstrução do RAID: -Uma vez substituído o disco, copie a tabela de partição a partir de um disco são (“sdb” neste exemplo) para a nova (“sda”), com o seguinte comando: +```sh +[user@server_ip ~]# cat /proc/mdstat -**Para as partições GPT** +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sda2[0] sdb2[1] + 931954688 blocks super 1.2 [2/2] [UU] + bitmap: 4/4 pages [16KB], 65536KB chunk + +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +Por fim, adicionamos um rótulo e montamos a partição [SWAP] (se aplicável). + +Para adicionar um rótulo à partição SWAP: ```sh -sgdisk -R /dev/sda /dev/sdb +[user@server_ip ~]# sudo mkswap /dev/sdb4 -L swap-sdb4 ``` -O comando deve ter o seguinte formato: `sgdisk -R /dev/newdisk /dev/healthydisk`. +Em seguida, recupere os UUID das duas partições swap: + +```sh +[user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +[user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +``` -Uma vez efetuada esta operação, o passo seguinte consiste em tornar aleatório o GUID do novo disco, a fim de evitar qualquer conflito do GUID com os outros discos: +Substituímos o antigo UUID da partição swap (**sda4**) pelo novo em `/etc/fstab`: ```sh -sgdisk -G /dev/sda +[user@server_ip ~]# sudo nano etc/fstab ``` -**Para as partições MBR** +Certifique-se de substituir o UUID correto. + +Em seguida, recarregue o sistema com o seguinte comando: + +```sh +[user@server_ip ~]# sudo systemctl daemon-reload +``` -Uma vez substituído o disco, copie a tabela de partição a partir de um disco são (“sdb” neste exemplo) para a nova (“sda”), com o seguinte comando: +Execute o seguinte comando para ativar a partição swap: ```sh -sfdisk -d /dev/sdb | sfdisk /dev/sda +[user@server_ip ~]# sudo swapon -av ``` -O comando deve ter o seguinte formato: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +A reconstrução do RAID está agora concluída. + + + +/// details | **Reconstrução do RAID no modo rescue** + +Depois de substituir o disco, devemos copiar a tabela de partições do disco saudável (neste exemplo, sda) para o novo (sdb). 
+ +> [!tabs] +> **Para as partições GPT** +>> +>> ```sh +>> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> O comando deve estar no seguinte formato: `sgdisk -R /dev/novo disco /dev/disco saudável` +>> +>> Exemplo: +>> +>> ```sh +>> sudo sgdisk -R /dev/sdb /dev/sda +>> ``` +>> +>> Depois de realizar esta operação, o passo seguinte consiste em atribuir um GUID aleatório ao novo disco para evitar quaisquer conflitos com os GUID de outros discos: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdb +>> ``` +>> +>> Se aparecer a seguinte mensagem: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> Pode simplesmente executar o comando `partprobe`. Se não conseguir ver as novas partições criadas (por exemplo, com `lsblk`), terá de reiniciar o servidor antes de continuar. +>> +> **Para as partições MBR** +>> +>> ```sh +>> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb +>> ``` +>> +>> O comando deve estar no seguinte formato: `sfdisk -d /dev/disco saudável | sfdisk /dev/novo disco` +>> + +Agora podemos reconstruir a matriz RAID. O seguinte código mostra como adicionar as novas partições (sdb2 e sdb4) na matriz RAID. -Já pode reconstruir a matriz RAID. O seguinte extrato do código mostra como reconstruir a disposição da partição `/dev/md4` com a tabela de partição “sda” que acaba de copiar: +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 +# mdadm: added /dev/sdb2 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 +# mdadm: re-added /dev/sdb4 +``` + +Utilize o comando `cat /proc/mdstat` para monitorizar a reconstrução do RAID: ```sh -mdadm --add /dev/md4 /dev/sda4 -cat /proc/mdstat +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md2 : active raid1 sda2[1] sdb2[0] +md2 : active raid1 sda2[0] sdb2[1] 931954688 blocks super 1.2 [2/2] [UU] - bitmap: 1/7 pages [4KB], 65536KB chunk + bitmap: 4/4 pages [16KB], 65536KB chunk -md4 : active raid1 sda4[0] sdb4[1] - 1020767232 blocks super 1.2 [2/2] [UU] +md4 : active raid1 sda4[0](F) sdb4[1] + 1020767232 blocks super 1.2 [2/1] [UU] + [============>........] recovery = 64.8% (822969856/1020767232) finish=7.2min speed=401664K/sec bitmap: 0/8 pages [0KB], 65536KB chunk - unused devices: ``` -Verifique os detalhes do RAID com o seguinte comando: +Para mais detalhes sobre a(s) baia(s) RAID: ```sh -mdadm --detail /dev/md4 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 /dev/md4: Version : 1.2 @@ -371,20 +416,132 @@ mdadm --detail /dev/md4 1 8 18 1 active sync /dev/sdb4 ``` -O RAID foi reconstruído. Para montar a partição (`/dev/md4`, no exemplo), utilize o seguinte comando: + + +#### Adição do rótulo à partição SWAP (se aplicável) + +Depois de concluir a reconstrução do RAID, montamos a partição que contém a raiz do nosso sistema operativo em `/mnt`. No nosso exemplo, esta partição é `md4`. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt +``` + +Adicionamos o rótulo à nossa partição swap com o seguinte comando: + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sda4 -L swap-sda4 +mkswap: /dev/sda4: warning: wiping old swap signature. 
+Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Em seguida, montamos os seguintes diretórios para garantir que todas as operações que realizamos no ambiente chroot funcionem corretamente: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Agora, acedemos ao ambiente `chroot`: ```sh -mount /dev/md4 /home +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt ``` +Recuperamos os UUID das duas partições swap: + +```sh +root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 +root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 +``` + +Exemplo: + +```sh +blkid /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +``` + +```sh +blkid /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +Em seguida, substituímos o antigo UUID da partição swap (**sdb4**) pelo novo em `/etc/fstab`: + +```sh +root@rescue12-customer-eu:/# nano etc/fstab +``` + +Exemplo: + +```sh +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /home ext4 defaults 0 0 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Certifique-se de substituir o UUID correto. No nosso exemplo acima, o UUID a substituir é `d6af33cf-fc15-4060-a43c-cb3b5537f58a` pelo novo `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Certifique-se de substituir o UUID correto. + +Em seguida, verificamos que tudo está corretamente montado: + +```sh +root@rescue12-customer-eu:/# mount -av +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored +``` + +Recarregue o sistema com o seguinte comando: + +```sh +root@rescue12-customer-eu:/# systemctl daemon-reload +``` + +Ative a partição swap com o seguinte comando: + +```sh +root@rescue12-customer-eu:/# swapon -av + +swapon: /dev/sda4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sda4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sda4 +swapon: /dev/sdb4: found signature [pagesize=4096, signature=swap] +swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/sdb4 +``` + +Saia do ambiente Chroot com `exit` e desmonte todos os discos: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` + +Concluímos com sucesso a reconstrução do RAID no servidor e agora podemos reiniciá-lo no modo normal. + + ## Quer saber mais? -[Hot Swap – RAID por hardware (EN)](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) +[Remplacement à chaud - RAID logiciel](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) + +[API OVHcloud e Armazenamento](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) + +[Gestão do RAID físico](/pages/bare_metal_cloud/dedicated_servers/raid_hard) + +[Remplacement à chaud - RAID Matériel](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) -[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) +Para serviços especializados (referênciação, desenvolvimento, etc), contacte os [parceiros OVHcloud](/links/partner). 
-[Substituir um disco a quente num servidor com RAID por software](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) +Se desejar beneficiar de assistência no uso e configuração das suas soluções OVHcloud, consulte as nossas diferentes [ofertas de suporte](/links/support). -[RAID por hardware (EN)](/pages/bare_metal_cloud/dedicated_servers/raid_hard) +Se precisar de formação ou assistência técnica para a implementação das nossas soluções, contacte o seu contacto comercial ou clique [neste link](/links/professional-services) para obter um orçamento e solicitar uma análise personalizada do seu projeto aos nossos especialistas da equipa Professional Services. -Fale com a nossa comunidade de utilizadores em . +Fale com a nossa [comunidade de utilizadores](/links/community). \ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/meta.yaml b/pages/bare_metal_cloud/dedicated_servers/raid_soft/meta.yaml index be427964bb9..b14684e9fd2 100755 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/meta.yaml +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/meta.yaml @@ -1,2 +1,3 @@ id: 415bbcf2-436c-4b1c-966f-903832fc0f8d -full_slug: dedicated-servers-raid-soft \ No newline at end of file +full_slug: dedicated-servers-raid-soft +translation_banner: true \ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.de-de.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.de-de.md new file mode 100644 index 00000000000..641b711a60a --- /dev/null +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.de-de.md @@ -0,0 +1,849 @@ +--- +title: Verwalten und Neuaufbauen von Software-RAID auf Servern mit UEFI-Boot-Modus +excerpt: Erfahren Sie, wie Sie Software-RAID nach einem Wechsel der Festplatte auf einem Server mit UEFI-Boot-Modus verwalten und neu aufbauen können +updated: 2025-12-11 +--- + +## Ziel + +Ein Redundanter Array unabhängiger Festplatten (RAID) ist eine Technologie, die den Datenverlust auf einem Server durch die Replikation von Daten auf zwei oder mehr Festplatten minimiert. + +Die Standard-RAID-Ebene für OVHcloud-Serverinstallationen ist RAID 1, wodurch der von Ihren Daten belegte Platz verdoppelt wird, was effektiv den nutzbaren Festplattenplatz halbiert. + +**Dieses Handbuch erklärt, wie Sie Software-RAID nach einem Festplattentausch auf einem Server mit UEFI-Boot-Modus verwalten und neu aufbauen können** + +Bevor wir beginnen, beachten Sie bitte, dass dieses Handbuch sich auf dedizierte Server konzentriert, die den UEFI-Boot-Modus verwenden. Dies ist bei modernen Motherboards der Fall. Wenn Ihr Server den Legacy-Boot-Modus (BIOS) verwendet, konsultieren Sie bitte dieses Handbuch: [Verwalten und Neuaufbauen von Software-RAID auf Servern im Legacy-Boot-Modus (BIOS)](/pages/bare_metal_cloud/dedicated_servers/raid_soft_bios). + +Um zu prüfen, ob ein Server im Legacy-BIOS-Modus oder im UEFI-Boot-Modus läuft, führen Sie den folgenden Befehl aus: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` + +Weitere Informationen zu UEFI finden Sie in diesem [Artikel](https://uefi.org/about). + +## Voraussetzungen + +- Ein [dedizierter Server](/links/bare-metal/bare-metal) mit Software-RAID-Konfiguration +- Administrative (sudo-)Zugriffsrechte auf den Server über SSH +- Grundkenntnisse zu RAID, Partitionen und GRUB + +Im Laufe dieses Handbuchs verwenden wir die Begriffe **primäre Festplatte** und **sekundäre Festplatte**. 
In diesem Zusammenhang: + +- Die primäre Festplatte ist die Festplatte, deren ESP (EFI-Systempartition) von Linux eingehängt wird +- Die sekundäre(n) Festplatte(n) sind alle anderen Festplatten im RAID + +## In der praktischen Anwendung + +Wenn Sie einen neuen Server erwerben, können Sie sich möglicherweise dazu entschließen, eine Reihe von Tests und Aktionen durchzuführen. Ein solcher Test könnte darin bestehen, einen Festplattenausfall zu simulieren, um den RAID-Wiederherstellungsprozess zu verstehen und sich darauf vorzubereiten, falls dies jemals tatsächlich eintritt. + +### Inhaltsübersicht + +- [Grundlegende Informationen](#basicinformation) +- [Verständnis der EFI-Systempartition (ESP)](#efisystemparition) +- [Simulieren eines Festplattenausfalls](#diskfailure) + - [Entfernen der defekten Festplatte](#diskremove) +- [Neuaufbau des RAIDs](#raidrebuild) + - [Neuaufbau des RAIDs nach Austausch der Hauptfestplatte (Rettungsmodus)](#rescuemode) + - [Neuanlegen der EFI-Systempartition](#recreateesp) + - [Neuaufbau des RAIDs, wenn die EFI-Partitionen nach wichtigen Systemaktualisierungen (z. B. GRUB) nicht synchronisiert sind](efiraodgrub) + - [Hinzufügen der Bezeichnung zur SWAP-Partition (falls zutreffend)](#swap-partition) + - [Neuaufbau des RAIDs im normalen Modus](#normalmode) + + + +### Grundlegende Informationen + +In einer Befehlszeilensitzung geben Sie den folgenden Code ein, um den aktuellen RAID-Status zu ermitteln: + +```sh +[user@server_ip ~]# cat /proc/mdstat +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md3 : active raid1 nvme1n1p3[1] nvme0n1p3[0] + 497875968 blocks super 1.2 [2/2] [UU] + bitmap: 2/4 pages [8KB], 65536KB chunk + +md2 : active raid1 nvme1n1p2[1] nvme0n1p2[0] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +Dieser Befehl zeigt uns, dass wir derzeit zwei Software-RAID-Geräte konfiguriert haben, **md2** und **md3**, wobei **md3** das größere der beiden ist. **md3** besteht aus zwei Partitionen, genannt **nvme1n1p3** und **nvme0n1p3**. + +Die [UU] bedeutet, dass alle Festplatten normal funktionieren. Ein `_` würde eine defekte Festplatte anzeigen. + +Wenn Sie einen Server mit SATA-Festplatten haben, erhalten Sie die folgenden Ergebnisse: + +```sh +[user@server_ip ~]# cat /proc/mdstat +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md3 : active raid1 sda3[0] sdb3[1] + 3904786432 blocks super 1.2 [2/2] [UU] + bitmap: 2/30 pages [8KB], 65536KB chunk + +md2 : active raid1 sda2[0] sdb2[1] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +Obwohl dieser Befehl unsere RAID-Volumes zurückgibt, sagt er uns nicht die Größe der Partitionen selbst. 
Wir können diese Informationen mit dem folgenden Befehl finden: + +```sh +[user@server_ip ~]# sudo fdisk -l + +Disk /dev/nvme1n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors +Disk model: WDC CL SN720 SDAQNTW-512G-2000 +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +Disklabel type: gpt +Disk identifier: A11EDAA3-A984-424B-A6FE-386550A92435 + +Device Start End Sectors Size Type +/dev/nvme1n1p1 2048 1048575 1046528 511M EFI System +/dev/nvme1n1p2 1048576 3145727 2097152 1G Linux RAID +/dev/nvme1n1p3 3145728 999161855 996016128 474.9G Linux RAID +/dev/nvme1n1p4 999161856 1000210431 1048576 512M Linux files + + +Disk /dev/nvme0n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors +Disk model: WDC CL SN720 SDAQNTW-512G-2000 +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +Disklabel type: gpt +Disk identifier: F03AC3C3-D7B7-43F9-88DB-9F12D7281D94 + +Device Start End Sectors Size Type +/dev/nvme0n1p1 2048 1048575 1046528 511M EFI System +/dev/nvme0n1p2 1048576 3145727 2097152 1G Linux RAID +/dev/nvme0n1p3 3145728 999161855 996016128 474.9G Linux RAID +/dev/nvme0n1p4 999161856 1000210431 1048576 512M Linux file +/dev/nvme0n1p5 1000211120 1000215182 4063 2M Linux file + + +Disk /dev/md2: 1022 MiB, 1071644672 bytes, 2093056 sectors +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes + + +Disk /dev/md3: 474.81 GiB, 509824991232 bytes, 995751936 sectors +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +``` + +Der Befehl `fdisk -l` erlaubt es Ihnen auch, den Typ Ihrer Partition zu identifizieren. Dies ist eine wichtige Information, wenn es darum geht, Ihr RAID bei einem Festplattenausfall wiederherzustellen. + +Für **GPT**-Partitionen wird in Zeile 6 angezeigt: `Disklabel type: gpt`. + +Trotz der Ergebnisse von `fdisk -l` können wir sehen, dass `/dev/md2` aus 1022 MiB besteht und `/dev/md3` 474,81 GiB enthält. Wenn wir den Befehl `mount` ausführen, können wir auch die Struktur der Festplatte ermitteln. + +Alternativ bietet der Befehl `lsblk` eine andere Ansicht der Partitionen: + +```sh +[user@server_ip ~]# lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:7 0 511M 0 part +├─nvme1n1p2 259:8 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 /boot +├─nvme1n1p3 259:9 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 / +└─nvme1n1p4 259:10 0 512M 0 part [SWAP] +nvme0n1 259:1 0 476.9G 0 disk +├─nvme0n1p1 259:2 0 511M 0 part /boot/efi +├─nvme0n1p2 259:3 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 /boot +├─nvme0n1p3 259:4 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 / +├─nvme0n1p4 259:5 0 512M 0 part [SWAP] +└─nvme0n1p5 259:6 0 2M 0 part +``` + +Außerdem erhalten wir mit `lsblk -f` weitere Informationen zu diesen Partitionen, wie z. B. 
die Bezeichnung und UUID: + +```sh +[user@server_ip ~]# sudo lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA +├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme1n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +└─nvme1n1p4 swap 1 swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7 [SWAP] +nvme0n1 +├─nvme0n1p1 vfat FAT16 EFI_SYSPART B486-9781 504.9M 1% /boot/efi +├─nvme0n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme0n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +├─nvme0n1p4 swap 1 swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b [SWAP] +└─nvme0n1p5 iso9660 Joliet Extension config-2 2025-08-05-14-55-41-00 +``` + +Notieren Sie sich die Geräte, Partitionen und ihre Einhängepunkte; dies ist besonders wichtig, nachdem Sie eine Festplatte ersetzt haben. + +Aus den oben genannten Befehlen und Ergebnissen haben wir: + +- Zwei RAID-Arrays: `/dev/md2` und `/dev/md3`. +- Vier Partitionen, die zum RAID gehören: **nvme0n1p2**, **nvme0n1p3**, **nvme1n1p2**, **nvme0n1p3** mit den Einhängepunkten `/boot` und `/`. +- Zwei Partitionen, die nicht zum RAID gehören, mit Einhängepunkten: `/boot/efi` und [SWAP]. +- Eine Partition, die keinen Einhängepunkt hat: **nvme1n1p1** + +Die Partition **nvme0n1p5** ist eine Konfigurationspartition, d. h. ein schreibgeschütztes Volume, das mit dem Server verbunden ist und diesem die Anfangskonfigurationsdaten bereitstellt. + + + +### Verständnis der EFI-Systempartition (ESP) + +***Was ist eine EFI-Systempartition?*** + +Eine EFI-Systempartition ist eine Partition, die die Bootloader, Bootmanager oder Kernels eines installierten Betriebssystems enthalten kann. Sie kann auch Systemhilfeprogramme enthalten, die vor dem Start des Betriebssystems ausgeführt werden sollen, sowie Datendateien wie Fehlerprotokolle. + +***Wird die EFI-Systempartition in einem RAID gespiegelt?*** + +Nein, Stand August 2025, wenn die Installation des Betriebssystems von OVHcloud durchgeführt wird, ist die ESP nicht im RAID enthalten. Wenn Sie unsere Betriebssystemvorlagen verwenden, um Ihren Server mit Software-RAID zu installieren, werden mehrere EFI-Systempartitionen erstellt: eine pro Festplatte. Allerdings wird nur eine EFI-Partition gleichzeitig eingehängt. Alle ESPs, die zum Zeitpunkt der Installation erstellt wurden, enthalten die gleichen Dateien. + +Die EFI-Systempartition wird unter `/boot/efi` eingehängt und die Festplatte, auf der sie eingehängt ist, wird vom Linux-System beim Start ausgewählt. + +Beispiel: + +```sh +[user@server_ip ~]# sudo lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA +├─nvme1n + +while read -r partition; do + if [[ "${partition}" == "${MAIN_PARTITION}" ]]; then + continue + fi + echo "Working on ${partition}" + mount "${partition}" "${MOUNTPOINT}" + rsync -ax "/boot/efi/" "${MOUNTPOINT}/" + umount "${MOUNTPOINT}" +done < <(blkid -o device -t LABEL=EFI_SYSPART) +``` + +Speichern Sie die Datei und beenden Sie den Editor. 
+ +- Machen Sie das Skript ausführbar + +```sh +sudo chmod +x script-name.sh +``` + +- Führen Sie das Skript aus + +```sh +sudo ./script-name.sh +``` + +- Wenn Sie sich nicht im richtigen Verzeichnis befinden + +```sh +./path/to/folder/script-name.sh +``` + +Wenn das Skript ausgeführt wird, werden die Inhalte der eingehängten EFI-Partition mit den anderen synchronisiert. Um auf den Inhalt zuzugreifen, können Sie eine dieser nicht eingehängten EFI-Partitionen am Einhängepunkt `/var/lib/grub/esp` einhängen. + + + +### Simulieren eines Festplattenausfalls + +Nachdem wir nun alle notwendigen Informationen haben, können wir einen Festplattenausfall simulieren und die Tests durchführen. In diesem ersten Beispiel simulieren wir den Ausfall der primären Festplatte `nvme0n1`. + +Die bevorzugte Methode hierzu ist die Nutzung des Rescue-Modus der OVHcloud. + +Starten Sie zunächst den Server im Rescue-Modus neu und melden Sie sich mit den bereitgestellten Anmeldeinformationen an. + +Um eine Festplatte aus dem RAID zu entfernen, ist der erste Schritt, sie als **fehlerhaft** zu markieren und die Partitionen aus ihren jeweiligen RAID-Arrays zu entfernen. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[0] nvme1n1p3[1] + 497875968 blocks super 1.2 [2/2] [UU] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +Aus der obigen Ausgabe ergibt sich, dass `nvme0n1` aus zwei Partitionen besteht, die sich im RAID befinden, nämlich **nvme0n1p2** und **nvme0n1p3**. + + + +#### Entfernen der fehlerhaften Festplatte + +Zunächst markieren wir die Partitionen **nvme0n1p2** und **nvme0n1p3** als fehlerhaft. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/nvme0n1p2 +# mdadm: set /dev/nvme0n1p2 faulty in /dev/md2 +``` + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --fail /dev/nvme0n1p3 +# mdadm: set /dev/nvme0n1p3 faulty in /dev/md3 +``` + +Wenn wir den Befehl `cat /proc/mdstat` ausführen, erhalten wir die folgende Ausgabe: + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2](F) nvme1n1p2[1] + 1046528 blocks super 1.2 [2/1] [_U] + +unused devices: +``` + +Wie oben zu sehen ist, zeigt das [F] neben den Partitionen an, dass die Festplatte fehlerhaft oder defekt ist. + +Als nächstes entfernen wir diese Partitionen aus den RAID-Arrays, um die Festplatte vollständig aus dem RAID zu entfernen. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --remove /dev/nvme0n1p2 +# mdadm: hot removed /dev/nvme0n1p2 from /dev/md2 +``` + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --remove /dev/nvme0n1p3 +# mdadm: hot removed /dev/nvme0n1p3 from /dev/md3 +``` + +Der Status unseres RAIDs sollte nun wie folgt aussehen: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme1n1p2[1] + 1046528 blocks super 1.2 [2/1] [_U] + +unused devices: +``` + +Aus den oben genannten Ergebnissen können wir erkennen, dass nun nur noch zwei Partitionen in den RAID-Arrays erscheinen. Wir haben die Festplatte **nvme0n1** erfolgreich als fehlerhaft markiert. + +Um sicherzustellen, dass wir eine Festplatte erhalten, die einem leeren Laufwerk ähnelt, verwenden wir den folgenden Befehl auf jeder Partition und anschließend auf der Festplatte: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +shred -s10M -n1 /dev/nvme0n1p1 +shred -s10M -n1 /dev/nvme0n1p2 +shred -s10M -n1 /dev/nvme0n1p3 +shred -s10M -n1 /dev/nvme0n1p4 +shred -s10M -n1 /dev/nvme0n1p5 +shred -s10M -n1 /dev/nvme0n1 +``` + +Die Festplatte erscheint nun als neues, leeres Laufwerk: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk + +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:1 0 511M 0 part +├─nvme1n1p2 259:2 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 +├─nvme1n1p3 259:3 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 +└─nvme1n1p4 259:4 0 512M 0 part +nvme0n1 259:5 0 476.9G 0 disk +``` + +Wenn wir den folgenden Befehl ausführen, sehen wir, dass unsere Festplatte erfolgreich „gelöscht“ wurde: + +```sh +parted /dev/nvme0n1 +GNU Parted 3.5 +Using /dev/nvme0n1 +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/nvme0n1: unrecognised disk label +Model: WDC CL SN720 SDAQNTW-512G-2000 (nvme) +Disk /dev/nvme0n1: 512GB +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: +``` + +Weitere Informationen zum Vorbereiten und Anfordern eines Festplattentauschs finden Sie in diesem [Leitfaden](/pages/bare_metal_cloud/dedicated_servers/disk_replacement). + +Wenn Sie den folgenden Befehl ausführen, erhalten Sie weitere Details zu den RAID-Arrays: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md3 + +/dev/md3: + Version : 1.2 + Creation Time : Fri Aug 1 14:51:13 2025 + Raid Level : raid1 + Array Size : 497875968 (474.81 GiB 509.82 GB) + Used Dev Size : 497875968 (474.81 GiB 509.82 GB) + Raid Devices : 2 + Total Devices : 1 + Persistence : Superblock is persistent + + Intent Bitmap : Internal + + Update Time : Fri Aug 1 15:56:17 2025 + State : clean, degraded + Active Devices : 1 + Working Devices : 1 + Failed Devices : 0 + Spare Devices : 0 + +Consistency Policy : bitmap + + Name : md3 + UUID : b383c3d5:7fb1bb5e:6b7c4d96:6ea817ff + Events : 215 + + Number Major Minor RaidDevice State + - 0 0 0 removed + 1 259 4 1 active sync /dev/nvme1n1p3 +``` + +Wir können nun mit dem Festplattentausch fortfahren. + + + +### Neuaufbauen des RAIDs + +> [!primary] +> Dieser Prozess kann je nach installiertem Betriebssystem auf Ihrem Server variieren. 
Wir empfehlen Ihnen, die offizielle Dokumentation Ihres Betriebssystems zu konsultieren, um auf die richtigen Befehle zugreifen zu können. +> + +> [!warning] +> +> Bei den meisten Servern mit Software-RAID ist es nach einem Festplattentausch möglich, dass der Server im normalen Modus (auf der gesunden Festplatte) startet und das Neuaufbauen des RAIDs im normalen Modus durchgeführt werden kann. Wenn der Server nach einem Festplattentausch nicht im normalen Modus starten kann, wird er im Rescue-Modus neu gestartet, um das RAID-Neuaufbauen fortzusetzen. +> +> Wenn Ihr Server nach dem Festplattentausch im normalen Modus starten kann, führen Sie einfach die Schritte aus [diesem Abschnitt](#rebuilding-the-raid-in-normal-mode) aus. + + + +#### Neuaufbauen des RAIDs im Rescue-Modus + +Nachdem die Festplatte ersetzt wurde, ist der nächste Schritt, die Partitionstabelle von der gesunden Festplatte (in diesem Beispiel `nvme1n1`) auf die neue (`nvme0n1`) zu kopieren. + +**Für GPT-Partitionen** + +Der Befehl sollte in diesem Format lauten: `sgdisk -R /dev/new disk /dev/healthy disk` + +In unserem Beispiel: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/nvme0n1 /dev/nvme1n1 +``` + +Führen Sie `lsblk` aus, um sicherzustellen, dass die Partitionstabellen ordnungsgemäß kopiert wurden: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk + +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:1 0 511M 0 part +├─nvme1n1p2 259:2 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 +├─nvme1n1p3 259:3 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 +└─nvme1n1p4 259:4 0 512M 0 part +nvme0n1 259:5 0 476.9G 0 disk +├─nvme0n1p1 259:10 0 511M 0 part +├─nvme0n1p2 259:11 0 1G 0 part +├─nvme0n1p3 259:12 0 474.9G 0 part +└─nvme0n1p4 259:13 0 512M 0 part +``` + +Sobald dies erledigt ist, ist der nächste Schritt, die GUID der neuen Festplatte zu randomisieren, um Konflikte mit anderen Festplatten zu vermeiden: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -G /dev/nvme0n1 +``` + +Wenn Sie die folgende Meldung erhalten: + +```console +Warning: The kernel is still using the old partition table. +The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8) +The operation has completed successfully. +``` + +Führen Sie einfach den Befehl `partprobe` aus. + +Wir können nun das RAID-Array neu aufbauen. Der folgende Codeausschnitt zeigt, wie die neuen Partitionen (nvme0n1p2 und nvme0n1p3) in das RAID-Array zurückgefügt werden können. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md2 /dev/nvme0n1p2 +# mdadm: added /dev/nvme0n1p2 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md3 /dev/nvme0n1p3 +``` + +# mdadm: /dev/nvme0n1p3 wurde wieder hinzugefügt +``` + +Um den Rebuild-Prozess zu prüfen: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[2] nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + [>....................] 
recovery = 0,1% (801920/497875968) finish=41,3min speed=200480K/sec + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] + 1046528 blocks super 1.2 [2/2] [UU] +``` + +Sobald der RAID-Rebuild abgeschlossen ist, führen Sie den folgenden Befehl aus, um sicherzustellen, dass die Partitionen ordnungsgemäß dem RAID hinzugefügt wurden: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART 4629-D183 +├─nvme1n1p2 linux_raid_member 1.2 md2 83719c5c-2a27-2a56-5268-7d49d8a1d84f +│ └─md2 ext4 1.0 boot 4de80ae0-dd90-4256-9135-1735e7be4b4d +├─nvme1n1p3 linux_raid_member 1.2 md3 b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff +│ └─md3 ext4 1.0 root 9bf386b6-9523-46bf-b8e5-4b8cc7c5786f +└─nvme1n1p4 swap 1 swap-nvme1n1p4 9bf292e8-0145-4d2f-b891-4cef93c0d209 +nvme0n1 +├─nvme0n1p1 +├─nvme0n1p2 linux_raid_member 1.2 md2 83719c5c-2a27-2a56-5268-7d49d8a1d84f +│ └─md2 ext4 1.0 boot 4de80ae0-dd90-4256-9135-1735e7be4b4d +├─nvme0n1p3 linux_raid_member 1.2 md3 b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff +│ └─md3 ext4 1.0 root 9bf386b6-9523-46bf-b8e5-4b8cc7c5786f +└─nvme0n1p4 +``` + +Basierend auf den oben genannten Ergebnissen wurden die Partitionen auf der neuen Festplatte korrekt dem RAID hinzugefügt. Allerdings wurden die EFI-Systempartition und die SWAP-Partition (in einigen Fällen) nicht dupliziert, was normal ist, da sie nicht in das RAID einbezogen werden. + +> [!warning] +> Die oben genannten Beispiele illustrieren lediglich die notwendigen Schritte anhand einer Standardserverkonfiguration. Die Informationen in der Ausgabetabelle hängen von der Hardware Ihres Servers und seinem Partitionsschema ab. Bei Unsicherheiten konsultieren Sie bitte die Dokumentation Ihres Betriebssystems. +> +> Wenn Sie professionelle Unterstützung bei der Serververwaltung benötigen, beachten Sie bitte die Details im Abschnitt [Weiterführende Informationen](#go-further) dieses Leitfadens. +> + + + +#### Wiederherstellen der EFI-Systempartition + +Um die EFI-Systempartition zu wiederherstellen, müssen wir **nvme0n1p1** formatieren und anschließend den Inhalt der gesunden Partition (in unserem Beispiel: nvme1n1p1) darauf kopieren. + +Wir gehen davon aus, dass beide Partitionen synchronisiert wurden und aktuelle Dateien enthalten. + +> [!warning] +> Falls es eine große Systemaktualisierung gab, z. B. Kernel oder GRUB, und beide Partitionen nicht synchronisiert wurden, beachten Sie bitte nach Abschluss der Erstellung der neuen EFI-Systempartition diesen [Abschnitt](#rebuilding-raid-when-efi-partitions-are-not-synchronized-after-major-system-updates-eg-grub). +> + +Zunächst formatieren wir die Partition: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkfs.vfat /dev/nvme0n1p1 +``` + +Anschließend versehen wir die Partition mit dem Label `EFI_SYSPART` (dieser Name ist spezifisch für OVHcloud): + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # fatlabel /dev/nvme0n1p1 EFI_SYSPART +``` + +Nun kopieren wir den Inhalt von nvme1n1p1 auf nvme0n1p1. 
Zunächst erstellen wir zwei Ordner, die wir im Beispiel „old“ und „new“ nennen: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkdir old new +``` + +Anschließend mounten wir **nvme1n1p1** im Ordner „old“ und **nvme0n1p1** im Ordner „new“, um den Unterschied zu verdeutlichen: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme1n1p1 old +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme0n1p1 new +``` + +Nun kopieren wir die Dateien vom Ordner „old“ in den Ordner „new“: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # rsync -axv old/ new/ +sending incremental file list +EFI/ +EFI/debian/ +EFI/debian/BOOTX64.CSV +EFI/debian/fbx64.efi +EFI/debian/grub.cfg +EFI/debian/grubx64.efi +EFI/debian/mmx64.efi +EFI/debian/shimx64.efi + +sent 6.099.848 bytes received 165 bytes 12.200.026,00 bytes/sec +total size is 6.097.843 speedup is 1,00 +``` + +Sobald dies abgeschlossen ist, trennen wir beide Partitionen: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme0n1p1 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme1n1p1 +``` + +Nun mounten wir die Partition, die die Wurzel unseres Betriebssystems enthält, auf `/mnt`. In unserem Beispiel ist dies die Partition **md3**. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md3 /mnt +``` + +Wir mounten die folgenden Ordner, um sicherzustellen, dass alle Manipulationen im `chroot`-Umgebung ordnungsgemäß funktionieren: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Nun verwenden wir den Befehl `chroot`, um auf den Mount-Punkt zuzugreifen und sicherzustellen, dass die neue EFI-Systempartition ordnungsgemäß erstellt wurde und das System beide ESPs erkennt: + +```sh +root@rescue12-customer-eu:/# chroot /mnt +``` + +Um die ESP-Partitionen anzuzeigen, führen wir den Befehl `blkid -t LABEL=EFI_SYSPART` aus: + +```sh +root@rescue12-customer-eu:/# blkid -t LABEL=EFI_SYSPART +/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" +/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" +``` + +Die oben genannten Ergebnisse zeigen, dass die neue EFI-Partition ordnungsgemäß erstellt wurde und das Label korrekt angewendet wurde. + + + +#### RAID neu aufbauen, wenn EFI-Partitionen nach größeren Systemaktualisierungen (GRUB) nicht synchronisiert sind + +/// details | Diesen Abschnitt ausklappen + +> [!warning] +> Bitte folgen Sie nur den Schritten in diesem Abschnitt, wenn sie auf Ihren Fall zutreffen. +> + +Wenn die EFI-Systempartitionen nach größeren Systemaktualisierungen, die GRUB modifizieren oder beeinflussen, nicht synchronisiert sind und die primäre Festplatte, auf der die Partition montiert ist, ersetzt wurde, kann das Starten von einer sekundären Festplatte mit einer veralteten ESP nicht funktionieren. + +In diesem Fall müssen Sie neben dem Neuaufbauen des RAIDs und dem Wiederherstellen der EFI-Systempartition im Rescue-Modus auch GRUB darauf neu installieren. 
+ +Sobald wir die EFI-Partition wiederhergestellt und sichergestellt haben, dass das System beide Partitionen erkennt (vorige Schritte in `chroot`), erstellen wir den Ordner `/boot/efi`, um die neue EFI-Systempartition **nvme0n1p1** zu mounten: + +```sh +root@rescue12-customer-eu:/# mount /boot +root@rescue12-customer-eu:/# mount /dev/nvme0n1p1 /boot/efi +``` + +Anschließend installieren wir den GRUB-Bootloader erneut: + +```sh +root@rescue12-customer-eu:/# grub-install --efi-directory=/boot/efi /dev/nvme0n1p1 +``` + +Sobald dies abgeschlossen ist, führen Sie den folgenden Befehl aus: + +```sh +root@rescue12-customer-eu:/# update-grub +``` +/// + + + +#### Label zur SWAP-Partition hinzufügen (falls zutreffend) + +Nachdem wir die EFI-Partition abgeschlossen haben, wechseln wir zur SWAP-Partition. + +Wir verlassen die `chroot`-Umgebung mit `exit`, um unsere [SWAP]-Partition **nvme0n1p4** zu erstellen und das Label `swap-nvme0n1p4` hinzuzufügen: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 +Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Wir prüfen, ob das Label ordnungsgemäß angewendet wurde: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS +nvme1n1 + +├─nvme1n1p1 +│ vfat FAT16 EFI_SYSPART +│ BA77-E844 504,9M 1% /root/old +├─nvme1n1p2 +│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 +│ └─md2 +│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac +├─nvme1n1p3 +│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c +│ └─md3 +│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441,2G 0% /mnt +└─nvme1n1p4 + swap 1 swap-nvme1n1p4 + d6af33cf-fc15-4060-a43c-cb3b5537f58a +nvme0n1 + +├─nvme0n1p1 +│ vfat FAT16 EFI_SYSPART +│ 477D-6658 +├─nvme0n1p2 +│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 +│ └─md2 +│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac +├─nvme0n1p3 +│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c +│ └─md3 +│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441,2G 0% /mnt +└─nvme0n1p4 + swap 1 swap-nvme0n1p4 + b3c9e03a-52f5-4683-81b6-cc10091 + +# mdadm: /dev/nvme0n1p3 erneut hinzugefügt +``` + +Verwenden Sie den folgenden Befehl, um den RAID-Neuaufbau zu verfolgen: `cat /proc/mdstat`. + +**Erstellen der EFI-Systempartition auf der Festplatte** + +Zunächst installieren wir die erforderlichen Tools: + +**Debian und Ubuntu** + +```sh +[user@server_ip ~]# sudo apt install dosfstools +``` + +**CentOS** + +```sh +[user@server_ip ~]# sudo yum install dosfstools +``` + +Als nächstes formatieren wir die Partition. In unserem Beispiel `nvme0n1p1`: + +```sh +[user@server_ip ~]# sudo mkfs.vfat /dev/nvme0n1p1 +``` + +Als nächstes versehen wir die Partition mit dem Label `EFI_SYSPART` (dieser Name ist spezifisch für OVHcloud) + +```sh +[user@server_ip ~]# sudo fatlabel /dev/nvme0n1p1 EFI_SYSPART +``` + +Sobald dies abgeschlossen ist, können Sie beide Partitionen mithilfe des von uns bereitgestellten Skripts [hier](#script) synchronisieren. 
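Alternativ lässt sich diese Synchronisierung auch einmalig von Hand durchführen. Die folgende Skizze geht davon aus, dass `/boot/efi` bereits von der gesunden Festplatte eingehängt ist und dass **nvme0n1p1** die neu formatierte, noch leere ESP ist (passen Sie die Gerätenamen an Ihre eigene Konfiguration an):

```sh
# Annahme: nvme0n1p1 ist die neue, leere ESP; /boot/efi ist von der gesunden Platte eingehängt
[user@server_ip ~]# sudo mkdir -p /mnt/esp-neu
[user@server_ip ~]# sudo mount /dev/nvme0n1p1 /mnt/esp-neu
[user@server_ip ~]# sudo rsync -ax /boot/efi/ /mnt/esp-neu/
[user@server_ip ~]# sudo umount /mnt/esp-neu
```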
+ +Wir prüfen, ob die neue EFI-Systempartition ordnungsgemäß erstellt wurde und vom System erkannt wird: + +```sh +[user@server_ip ~]# sudo blkid -t LABEL=EFI_SYSPART +/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" +/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" +``` + +Zuletzt aktivieren wir die [SWAP]-Partition (sofern zutreffend): + + +- Wir erstellen und fügen das Label hinzu: + +```sh +[user@server_ip ~]# sudo mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 +``` + +- Wir rufen die UUIDs beider Swap-Partitionen ab: + +```sh +[user@server_ip ~]# sudo blkid -s /dev/nvme0n1p4 +/dev/nvme0n1p4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +[user@server_ip ~]# sudo blkid -s /dev/nvme1n1p4 +/dev/nvme1n1p4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +- Wir ersetzen die alte UUID der Swap-Partition (**nvme0n1p4)** durch die neue in `/etc/fstab`: + +```sh +[user@server_ip ~]# sudo nano /etc/fstab +``` + +Beispiel: + +```sh +[user@server_ip ~]# sudo nano /etc/fstab +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Basierend auf den obigen Ergebnissen ist die alte UUID `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` und sollte durch die neue `b3c9e03a-52f5-4683-81b6-cc10091fcd15` ersetzt werden. + +Stellen Sie sicher, dass Sie die richtige UUID ersetzen. + +Als nächstes führen wir den folgenden Befehl aus, um die Swap-Partition zu aktivieren: + +```sh +[user@server_ip ~]# sudo swapon -av +swapon: /dev/nvme0n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme0n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme0n1p4 +swapon: /dev/nvme1n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme1n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme1n1p4 +``` + +Als nächstes laden wir das System neu: + +```sh +[user@server_ip ~]# sudo systemctl daemon-reload +``` + +Wir haben nun erfolgreich den RAID-Neuaufbau abgeschlossen. + +## Weiterführende Informationen + +[Hot Swap - Software-RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) + +[OVHcloud API und Speicher](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) + +[Verwalten von Hardware-RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) + +[Hot Swap - Hardware-RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) + +Für spezialisierte Dienstleistungen (SEO, Entwicklung usw.) wenden Sie sich an [OVHcloud Partner](/links/partner). + +Wenn Sie bei der Nutzung und Konfiguration Ihrer OVHcloud-Lösungen Unterstützung benötigen, wenden Sie sich bitte an unsere [Support-Angebote](/links/support). + +Wenn Sie Schulungen oder technische Unterstützung benötigen, um unsere Lösungen umzusetzen, wenden Sie sich an Ihren Vertriebsmitarbeiter oder klicken Sie auf [diesen Link](/links/professional-services), um ein Angebot anzufordern und unsere Experten für Professional Services um Unterstützung bei Ihrem spezifischen Anwendungsfall zu bitten. 
+ +Treten Sie unserer [User Community](/links/community) bei. \ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md index c5fab1c787f..808bbff24e4 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.en-gb.md @@ -1,7 +1,7 @@ --- title: Managing and rebuilding software RAID on servers using UEFI boot mode excerpt: Find out how to manage and rebuild software RAID after a disk replacement on a server using UEFI boot mode -updated: 2025-12-05 +updated: 2025-12-11 --- ## Objective diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.es-es.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.es-es.md new file mode 100644 index 00000000000..3ff2210a88b --- /dev/null +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.es-es.md @@ -0,0 +1,908 @@ +--- +title: "Gestión y reconstrucción de un RAID software en servidores que utilizan el modo de arranque UEFI" +excerpt: Aprenda a gestionar y reconstruir un RAID software tras un reemplazo de disco en un servidor que utiliza el modo de arranque UEFI +updated: 2025-12-11 +--- + +## Objetivo + +Un Redundant Array of Independent Disks (RAID) es una tecnología que atenúa la pérdida de datos en un servidor al replicar los datos en dos discos o más. + +El nivel RAID predeterminado para las instalaciones de servidores de OVHcloud es el RAID 1, que duplica el espacio ocupado por sus datos, reduciendo así el espacio de disco utilizable a la mitad. + +**Este tutorial explica cómo gestionar y reconstruir un RAID software tras un reemplazo de disco en su servidor en modo EFI** + +Antes de comenzar, tenga en cuenta que este tutorial se centra en los servidores dedicados que utilizan el modo UEFI como modo de arranque. Este es el caso de las placas base modernas. Si su servidor utiliza el modo de arranque legacy (BIOS), consulte este tutorial: [Gestión y reconstrucción de un RAID software en servidores en modo de arranque legacy (BIOS)](/pages/bare_metal_cloud/dedicated_servers/raid_soft_bios). + +Para verificar si un servidor funciona en modo BIOS legacy o en modo UEFI, ejecute el siguiente comando: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` + +Para obtener más información sobre UEFI, consulte el siguiente artículo: [https://uefi.org/about](https://uefi.org/about). + +## Requisitos + +- Un [servidor dedicado](/links/bare-metal/bare-metal) con una configuración de RAID software +- Acceso administrativo (sudo) al servidor a través de SSH +- Comprensión del RAID, las particiones y GRUB + +Durante este tutorial, utilizamos los términos **disco principal** y **disco secundario**. En este contexto: + +- El disco principal es el disco cuya ESP (partición del sistema EFI) está montada por Linux +- Los discos secundarios son todos los demás discos del RAID + +## Instrucciones + +Cuando adquiere un nuevo servidor, puede sentir la necesidad de realizar una serie de pruebas y acciones. Una de estas pruebas podría ser simular una falla de disco para comprender el proceso de reconstrucción del RAID y prepararse en caso de problemas. 
+ +### Vista previa del contenido + +- [Información básica](#basicinformation) +- [Comprensión de la partición del sistema EFI (ESP)](#efisystemparition) +- [Simulación de una falla de disco](#diskfailure) + - [Eliminación del disco defectuoso](#diskremove) +- [Reconstrucción del RAID](#raidrebuild) + - [Reconstrucción del RAID después del reemplazo del disco principal (modo de rescate)](#rescuemode) + - [Recreación de la partición del sistema EFI](#recreateesp) + - [Reconstrucción del RAID cuando las particiones EFI no están sincronizadas después de actualizaciones importantes del sistema (ej. GRUB)](efiraodgrub) + - [Añadido de la etiqueta a la partición SWAP (si aplica)](#swap-partition) + - [Reconstrucción del RAID en modo normal](#normalmode) + + + +### Información básica + +En una sesión de línea de comandos, escriba el siguiente comando para determinar el estado actual del RAID : + +```sh +[user@server_ip ~]# cat /proc/mdstat +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md3 : active raid1 nvme1n1p3[1] nvme0n1p3[0] + 497875968 blocks super 1.2 [2/2] [UU] + bitmap: 2/4 pages [8KB], 65536KB chunk + +md2 : active raid1 nvme1n1p2[1] nvme0n1p2[0] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +Este comando nos muestra que actualmente tenemos dos volúmenes RAID software configurados, **md2** y **md3**, con **md3** siendo el más grande de los dos. **md3** se compone de dos particiones, llamadas **nvme1n1p3** y **nvme0n1p3**. + +El [UU] significa que todos los discos funcionan normalmente. Un `_` indicaría un disco defectuoso. + +Si tiene un servidor con discos SATA, obtendrá los siguientes resultados : + +```sh +[user@server_ip ~]# cat /proc/mdstat +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md3 : active raid1 sda3[0] sdb3[1] + 3904786432 blocks super 1.2 [2/2] [UU] + bitmap: 2/30 pages [8KB], 65536KB chunk + +md2 : active raid1 sda2[0] sdb2[1] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +Aunque este comando devuelve nuestros volúmenes RAID, no nos indica el tamaño de las particiones en sí. 
Podemos encontrar esta información con el siguiente comando : + +```sh +[user@server_ip ~]# sudo fdisk -l + +Disk /dev/nvme1n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors +Disk model: WDC CL SN720 SDAQNTW-512G-2000 +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +Disklabel type: gpt +Disk identifier: A11EDAA3-A984-424B-A6FE-386550A92435 + +Device Start End Sectors Size Type +/dev/nvme1n1p1 2048 1048575 1046528 511M EFI System +/dev/nvme1n1p2 1048576 3145727 2097152 1G Linux RAID +/dev/nvme1n1p3 3145728 999161855 996016128 474.9G Linux RAID +/dev/nvme1n1p4 999161856 1000210431 1048576 512M Linux files + + +Disk /dev/nvme0n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors +Disk model: WDC CL SN720 SDAQNTW-512G-2000 +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +Disklabel type: gpt +Disk identifier: F03AC3C3-D7B7-43F9-88DB-9F12D7281D94 + +Device Start End Sectors Size Type +/dev/nvme0n1p1 2048 1048575 1046528 511M EFI System +/dev/nvme0n1p2 1048576 3145727 2097152 1G Linux RAID +/dev/nvme0n1p3 3145728 999161855 996016128 474.9G Linux RAID +/dev/nvme0n1p4 999161856 1000210431 1048576 512M Linux file +/dev/nvme0n1p5 1000211120 1000215182 4063 2M Linux file + + +Disk /dev/md2: 1022 MiB, 1071644672 bytes, 2093056 sectors +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes + + +Disk /dev/md3: 474.81 GiB, 509824991232 bytes, 995751936 sectors +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +``` + +El comando `fdisk -l` también permite identificar el tipo de sus particiones. Esta es una información importante durante la reconstrucción de su RAID en caso de falla de disco. + +Para las particiones **GPT**, la línea 6 mostrará : `Disklabel type: gpt`. + +Siempre basándonos en los resultados de `fdisk -l`, podemos ver que `/dev/md2` se compone de 1022 MiB y `/dev/md3` contiene 474,81 GiB. Si ejecutamos el comando `mount`, también podemos encontrar la disposición de los discos. 
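+
+A título ilustrativo, una salida de `mount` filtrada sobre los dispositivos de bloque (los valores exactos dependen de su sistema) podría tener este aspecto:
+
+```sh
+[user@server_ip ~]# mount | grep '^/dev/'
+/dev/md3 on / type ext4 (rw,relatime)
+/dev/md2 on /boot type ext4 (rw,relatime)
+/dev/nvme0n1p1 on /boot/efi type vfat (rw,relatime)
+```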
+ +Como alternativa, el comando `lsblk` ofrece una vista diferente de las particiones : + +```sh +[user@server_ip ~]# lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:7 0 511M 0 part +├─nvme1n1p2 259:8 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 /boot +├─nvme1n1p3 259:9 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 / +└─nvme1n1p4 259:10 0 512M 0 part [SWAP] +nvme0n1 259:1 0 476.9G 0 disk +├─nvme0n1p1 259:2 0 511M 0 part /boot/efi +├─nvme0n1p2 259:3 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 /boot +├─nvme0n1p3 259:4 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 / +├─nvme0n1p4 259:5 0 512M 0 part [SWAP] +└─nvme0n1p5 259:6 0 2M 0 part +``` + +Además, si ejecutamos `lsblk -f`, obtenemos más información sobre estas particiones, tales como el LABEL y el UUID : + +```sh +[user@server_ip ~]# sudo lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA +├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme1n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +└─nvme1n1p4 swap 1 swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7 [SWAP] +nvme0n1 +├─nvme0n1p1 vfat FAT16 EFI_SYSPART B486-9781 504.9M 1% /boot/efi +├─nvme0n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme0n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +├─nvme0n1p4 swap 1 swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b [SWAP] +└─nvme0n1p5 iso9660 Joliet Extension config-2 2025-08-05-14-55-41-00 +``` + +Tome nota de los dispositivos, las particiones y sus puntos de montaje; esto es importante, especialmente después del reemplazo de un disco. + +A partir de los comandos y resultados anteriores, tenemos : + +- Dos matrices RAID : `/dev/md2` y `/dev/md3`. +- Cuatro particiones que forman parte del RAID : **nvme0n1p2**, **nvme0n1p3**, **nvme1n1p2**, **nvme0n1p3** con los puntos de montaje `/boot` y `/`. +- Dos particiones no incluidas en el RAID, con los puntos de montaje : `/boot/efi` y [SWAP]. +- Una partición que no tiene punto de montaje : **nvme1n1p1** + +La partición `nvme0n1p5` es una partición de configuración, es decir, un volumen de solo lectura conectado al servidor que le proporciona los datos de configuración inicial. + + + +### Comprender la partición del sistema EFI (ESP) + +***¿Qué es una partición del sistema EFI ?*** + +Una partición del sistema EFI es una partición en la que el servidor inicia. Contiene los archivos de inicio, así como los controladores de inicio o las imágenes del kernel de un sistema operativo instalado. También puede contener programas útiles diseñados para ejecutarse antes de que el sistema operativo inicie, así como archivos de datos tales como registros de errores. + +***¿La partición del sistema EFI está incluida en el RAID ?*** + +No, a partir de agosto de 2025, cuando se realiza una instalación del sistema operativo por parte de OVHcloud, la partición ESP no está incluida en el RAID. Cuando utiliza nuestros modelos de sistema operativo para instalar su servidor con un RAID software, se crean varias particiones del sistema EFI: una por disco. Sin embargo, solo se monta una partición EFI a la vez. 
Todas las ESP creadas contienen los mismos archivos. Todos los ESP creados en el momento de la instalación contienen los mismos archivos. + +La partición del sistema EFI se monta en `/boot/efi` y el disco en el que se monta se selecciona por Linux al iniciar. + +Ejemplo : + +```sh +[user@server_ip ~]# sudo lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA +├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 85 + +Le recomendamos sincronizar sus ESP regularmente o después de cada actualización importante del sistema. Por defecto, todas las particiones del sistema EFI contienen los mismos archivos después de la instalación. Sin embargo, si se implica una actualización importante del sistema, la sincronización de los ESP es esencial para mantener el contenido actualizado. + + + +#### Script + +Aquí tiene un script que puede utilizar para sincronizarlos manualmente. También puede ejecutar un script automatizado para sincronizar las particiones diariamente o cada vez que se inicie el servicio. + +Antes de ejecutar el script, asegúrese de que `rsync` esté instalado en su sistema : + +**Debian/Ubuntu** + +```sh +sudo apt install rsync +``` + +**CentOS, Red Hat y Fedora** + +```sh +sudo yum install rsync +``` + +Para ejecutar un script en Linux, necesita un archivo ejecutable : + +- Empiece creando un archivo .sh en el directorio que elija, reemplazando `nombre-del-script` por el nombre que elija. + +```sh +sudo touch nombre-del-script.sh +``` + +- Abra el archivo con un editor de texto y agregue las siguientes líneas : + +```sh +sudo nano nombre-del-script.sh +``` + +```sh +#!/bin/bash + +set -euo pipefail + +MOUNTPOINT="/var/lib/grub/esp" +MAIN_PARTITION=$(findmnt -n -o SOURCE /boot/efi) + +echo "${MAIN_PARTITION} es la partición principal" + +mkdir -p "${MOUNTPOINT}" + +while read -r partition; do + if [[ "${partition}" == "${MAIN_PARTITION}" ]]; then + continue + fi + echo "Trabajo en ${partition}" + mount "${partition}" "${MOUNTPOINT}" + rsync -ax "/boot/efi/" "${MOUNTPOINT}/" + umount "${MOUNTPOINT}" +done < <(blkid -o device -t LABEL=EFI_SYSPART) +``` + +Guarde y cierre el archivo. + +- Haga que el script sea ejecutable + +```sh +sudo chmod +x nombre-del-script.sh +``` + +- Ejecute el script + +```sh +sudo ./nombre-del-script.sh +``` + +- Si no está en el directorio + +```sh +./ruta/hacia/el/directorio/nombre-del-script.sh +``` + +Cuando se ejecuta el script, el contenido de la partición EFI montada se sincronizará con las demás. Para acceder al contenido, puede montar una de estas particiones EFI no montadas en el punto de montaje: `/var/lib/grub/esp`. + + + +### Simulación de una falla de disco + +Ahora que tenemos toda la información necesaria, podemos simular una falla de disco y proceder a los tests. En este primer ejemplo, provocaremos una falla del disco principal `nvme0n1`. + +El método preferido para hacerlo es a través del modo rescue de OVHcloud. + +Reinicie primero el servidor en modo rescue y conéctese con las credenciales proporcionadas. + +Para retirar un disco del RAID, el primer paso es marcarlo como **Failed** y retirar las particiones de sus matrices RAID respectivas. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[0] nvme1n1p3[1] + 497875968 blocks super 1.2 [2/2] [UU] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +A partir del resultado anterior, nvme0n1 contiene dos particiones en RAID que son **nvme0n1p2** y **nvme0n1p3**. + + + +#### Retiro del disco defectuoso + +En primer lugar, marcamos las particiones **nvme0n1p2** y **nvme0n1p3** como defectuosas. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/nvme0n1p2 +# mdadm: set /dev/nvme0n1p2 faulty in /dev/md2 +``` + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --fail /dev/nvme0n1p3 +# mdadm: set /dev/nvme0n1p3 faulty in /dev/md3 +``` + +Cuando ejecutamos el comando `cat /proc/mdstat`, obtenemos : + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2](F) nvme1n1p2[1] + 1046528 blocks super 1.2 [2/1] [_U] + +unused devices: +``` + +Como podemos ver arriba, el [F] al lado de las particiones indica que el disco está defectuoso o fallido. + +A continuación, retiramos estas particiones de las matrices RAID para eliminar completamente el disco del RAID. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --remove /dev/nvme0n1p2 +# mdadm: hot removed /dev/nvme0n1p2 from /dev/md2 +``` + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --remove /dev/nvme0n1p3 +# mdadm: hot removed /dev/nvme0n1p3 from /dev/md3 +``` + +El estado de nuestro RAID debería parecerse ahora a esto : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme1n1p2[1] + 1046528 blocks super 1.2 [2/1] [_U] + +unused devices: +``` + +De acuerdo con los resultados anteriores, podemos ver que ahora solo hay dos particiones en las matrices RAID. Hemos logrado degradar el disco **nvme0n1**. 
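+
+Si lo desea, puede confirmar además el estado degradado de cada matriz con `mdadm --detail` (salida abreviada y puramente ilustrativa):
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md2
+/dev/md2:
+        Raid Level : raid1
+      Raid Devices : 2
+     Total Devices : 1
+             State : clean, degraded
+    Active Devices : 1
+   Working Devices : 1
+    Failed Devices : 0
+
+    Number   Major   Minor   RaidDevice State
+       -       0        0        0      removed
+       1     259        8        1      active sync   /dev/nvme1n1p2
+```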
+ +Para asegurarnos de obtener un disco similar a un disco vacío, utilizamos el siguiente comando en cada partición, y luego en el disco mismo : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +shred -s10M -n1 /dev/nvme0n1p1 +shred -s10M -n1 /dev/nvme0n1p2 +shred -s10M -n1 /dev/nvme0n1p3 +shred -s10M -n1 /dev/nvme0n1p4 +shred -s10M -n1 /dev/nvme0n1p5 +shred -s10M -n1 /dev/nvme0n1 +``` + +El disco ahora aparece como un disco nuevo y vacío : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk + +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:1 0 511M 0 part +├─nvme1n1p2 259:2 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 +├─nvme1n1p3 259:3 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 +└─nvme1n1p4 259:4 0 512M 0 part +nvme0n1 259:5 0 476.9G 0 disk +``` + +Si ejecutamos el siguiente comando, verificamos que nuestro disco ha sido correctamente "borrado" : + +```sh +parted /dev/nvme0n1 +GNU Parted 3.5 +Using /dev/nvme0n1 +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/nvme0n1: unrecognised disk label +Model: WDC CL SN720 SDAQNTW-512G-2000 (nvme) +Disk /dev/nvme0n1: 512GB +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: +``` + +Para obtener más información sobre la preparación y la solicitud de reemplazo de un disco, consulte este [guía](/pages/bare_metal_cloud/dedicated_servers/disk_replacement). + +Si ejecuta el siguiente comando, puede obtener más detalles sobre las matrices RAID : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md3 + +/dev/md3: + Version : 1.2 + Creation Time : Fri Aug 1 14:51:13 2025 + Raid Level : raid1 + Array Size : 497875968 (474.81 GiB 509.82 GB) + Used Dev Size : 497875968 (474.81 GiB 509.82 GB) + Raid Devices : 2 + Total Devices : 1 + Persistence : Superblock is persistent + + Intent Bitmap : Internal + + Update Time : Fri Aug 1 15:56:17 2025 + State : clean, degraded + Active Devices : 1 + Working Devices : 1 + Failed Devices : 0 + Spare Devices : 0 + +Consistency Policy : bitmap + + Name : md3 + UUID : b383c3d5:7fb1bb5e:6b7c4d96:6ea817ff + Events : 215 + + Number Major Minor RaidDevice State + - 0 0 0 removed + 1 259 4 1 active sync /dev/nvme1n1p3 +``` + +Ahora podemos proceder al reemplazo del disco. + + + +### Reconstrucción del RAID + +> [!primary] +> Este proceso puede variar según el sistema operativo instalado en su servidor. Le recomendamos consultar la documentación oficial de su sistema operativo para obtener los comandos adecuados. +> + +> [!warning] +> +> En la mayoría de los servidores con RAID software, después de un reemplazo de disco, el servidor puede arrancar en modo normal (sobre el disco sano) y la reconstrucción puede realizarse en modo normal. Sin embargo, si el servidor no puede arrancar en modo normal después del reemplazo del disco, reiniciará en modo rescue para proceder a la reconstrucción del RAID. +> +> Si su servidor puede arrancar en modo normal después del reemplazo del disco, simplemente siga los pasos de [esta sección](#rebuilding-the-raid-in-normal-mode). + + + +#### Reconstrucción del RAID en modo rescue + +Una vez reemplazado el disco, el siguiente paso consiste en copiar la tabla de particiones del disco sano (en este ejemplo, nvme1n1) en el nuevo (nvme0n1). 
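+
+Antes de lanzar la copia, conviene verificar una última vez cuál es el disco nuevo y cuál el disco sano, por ejemplo comparando el modelo y el número de serie de cada disco (los números de serie mostrados aquí son simples marcadores):
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -d -o NAME,SIZE,MODEL,SERIAL
+NAME      SIZE MODEL                          SERIAL
+nvme0n1 476.9G WDC CL SN720 SDAQNTW-512G-2000 XXXXXXXXXXXX
+nvme1n1 476.9G WDC CL SN720 SDAQNTW-512G-2000 YYYYYYYYYYYY
+```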
+
+**Para particiones GPT**
+
+El comando debe tener este formato: `sgdisk -R /dev/nuevo disco /dev/disco sano`
+
+En nuestro ejemplo :
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/nvme0n1 /dev/nvme1n1
+```
+
+Ejecute `lsblk` para asegurarse de que las tablas de particiones se hayan copiado correctamente :
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk
+
+NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
+nvme1n1     259:0    0 476.9G  0 disk
+├─nvme1n1p1 259:1    0   511M  0 part
+├─nvme1n1p2 259:2    0     1G  0 part
+│ └─md2       9:2    0  1022M  0 raid1
+├─nvme1n1p3 259:3    0 474.9G  0 part
+│ └─md3       9:3    0 474.8G  0 raid1
+└─nvme1n1p4 259:4    0   512M  0 part
+nvme0n1     259:5    0 476.9G  0 disk
+├─nvme0n1p1 259:10   0   511M  0 part
+├─nvme0n1p2 259:11   0     1G  0 part
+├─nvme0n1p3 259:12   0 474.9G  0 part
+└─nvme0n1p4 259:13   0   512M  0 part
+```
+
+Una vez hecho esto, el siguiente paso consiste en asignar un GUID aleatorio al nuevo disco para evitar conflictos de GUID con otros discos :
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -G /dev/nvme0n1
+```
+
+Si recibe el siguiente mensaje :
+
+```console
+Warning: The kernel is still using the old partition table.
+The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8)
+The operation has completed successfully.
+```
+
+Simplemente ejecute el comando `partprobe`.
+
+Ahora podemos reconstruir la matriz RAID. El siguiente fragmento de código muestra cómo agregar nuevamente las nuevas particiones (nvme0n1p2 y nvme0n1p3) a la matriz RAID.
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md2 /dev/nvme0n1p2
+# mdadm: added /dev/nvme0n1p2
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md3 /dev/nvme0n1p3
+# mdadm: re-added /dev/nvme0n1p3
+```
+
+Para verificar el proceso de reconstrucción:
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat
+Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
+md3 : active raid1 nvme0n1p3[2] nvme1n1p3[1]
+      497875968 blocks super 1.2 [2/1] [_U]
+      [>....................]  recovery =  0.1% (801920/497875968) finish=41.3min speed=200480K/sec
+      bitmap: 0/4 pages [0KB], 65536KB chunk
+
+md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1]
+      1046528 blocks super 1.2 [2/2] [UU]
+```
+
+Una vez que la reconstrucción del RAID esté terminada, ejecute el siguiente comando para asegurarse de que las particiones se hayan agregado correctamente al RAID:
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f
+NAME        FSTYPE            FSVER LABEL          UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
+nvme1n1
+├─nvme1n1p1 vfat              FAT16 EFI_SYSPART    4629-D183
+├─nvme1n1p2 linux_raid_member 1.2   md2            83719c5c-2a27-2a56-5268-7d49d8a1d84f
+│ └─md2     ext4              1.0   boot           4de80ae0-dd90-4256-9135-1735e7be4b4d
+├─nvme1n1p3 linux_raid_member 1.2   md3            b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff
+│ └─md3     ext4              1.0   root           9bf386b6-9523-46bf-b8e5-4b8cc7c5786f
+└─nvme1n1p4 swap              1     swap-nvme1n1p4 9bf292e8-0145-4d2f-b891-4cef93c0d209
+nvme0n1
+├─nvme0n1p1
+├─nvme0n1p2 linux_raid_member 1.2   md2            83719c5c-2a27-2a56-5268-7d49d8a1d84f
+│ └─md2     ext4              1.0   boot           4de80ae0-dd90-4256-9135-1735e7be4b4d
+├─nvme0n1p3 linux_raid_member 1.2   md3            b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff
+│ └─md3     ext4              1.0   root           9bf386b6-9523-46bf-b8e5-4b8cc7c5786f
+└─nvme0n1p4
+```
+
+Según los resultados anteriores, las particiones del nuevo disco se han agregado correctamente al RAID.
Sin embargo, la partición del sistema EFI y la partición SWAP (en algunos casos) no se han duplicado, lo cual es normal ya que no forman parte del RAID. + +> [!warning] +> Los ejemplos anteriores ilustran simplemente los pasos necesarios basados en una configuración de servidor predeterminada. Los resultados de cada comando dependen del tipo de hardware instalado en su servidor y de la estructura de sus particiones. En caso de duda, consulte la documentación de su sistema operativo. +> +> Si necesita asistencia profesional para la administración de su servidor, consulte los detalles de la sección [Más información](#go-further) de esta guía. +> + + + +#### Recreación de la partición del sistema EFI + +Para recrear la partición del sistema EFI, debemos formatear **nvme0n1p1** y replicar el contenido de la partición del sistema EFI sana (en nuestro ejemplo: nvme1n1p1) en esta última. + +Aquí, asumimos que ambas particiones se han sincronizado y contienen archivos actualizados o simplemente no han sufrido actualizaciones del sistema que afecten al *bootloader*. + +> [!warning] +> Si se ha realizado una actualización importante del sistema, como una actualización del kernel o de GRUB, y las dos particiones no se han sincronizado, consulte esta [sección](#rebuilding-raid-when-efi-partitions-are-not-synchronized-after-major-system-updates-eg-grub) una vez que haya terminado de crear la nueva partición del sistema EFI. +> + +Primero, formateamos la partición: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkfs.vfat /dev/nvme0n1p1 +``` + +A continuación, asignamos la etiqueta `EFI_SYSPART` a la partición. (este nombre es específico de OVHcloud): + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # fatlabel /dev/nvme0n1p1 EFI_SYSPART +``` + +Luego, duplicamos el contenido de nvme1n1p1 en nvme0n1p1. Comenzamos creando dos directorios, que llamamos « old » y « new » en nuestro ejemplo: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkdir old new +``` + +A continuación, montamos **nvme1n1p1** en el directorio « old » y **nvme0n1p1** en el directorio « new » para diferenciarlos: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme1n1p1 old +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme0n1p1 new +``` + +Luego, copiamos los archivos del directorio 'old' a 'new': + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # rsync -axv old/ new/ +sending incremental file list +EFI/ +EFI/debian/ +EFI/debian/BOOTX64.CSV +EFI/debian/fbx64.efi +EFI/debian/grub.cfg +EFI/debian/grubx64.efi +EFI/debian/mmx64.efi +EFI/debian/shimx64.efi + +sent 6,099,848 bytes received 165 bytes 12,200,026.00 bytes/sec +total size is 6,097,843 speedup is 1.00 +``` + +Una vez hecho esto, desmontamos ambas particiones: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme0n1p1 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme1n1p1 +``` + +A continuación, montamos la partición que contiene la raíz de nuestro sistema operativo en `/mnt`. 
En nuestro ejemplo, esta partición es **md3**: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md3 /mnt +``` + +Montamos los siguientes directorios para asegurarnos de que cualquier manipulación que realicemos en el entorno `chroot` funcione correctamente: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Luego, utilizamos el comando `chroot` para acceder al punto de montaje y asegurarnos de que la nueva partición del sistema EFI se ha creado correctamente y que el sistema reconoce las dos ESP: + +```sh +root@rescue12-customer-eu:/# chroot /mnt +``` + +Para mostrar las particiones ESP, ejecutamos el comando `blkid -t LABEL=EFI_SYSPART`: + +```sh +root@rescue12-customer-eu:/# blkid -t LABEL=EFI_SYSPART +/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" +/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" +``` + +Los resultados anteriores muestran que la nueva partición EFI se ha creado correctamente y que la etiqueta se ha aplicado correctamente. + + + +#### Reconstrucción del RAID cuando las particiones EFI no están sincronizadas después de actualizaciones importantes del sistema (GRUB) + +/// details | Despliegue esta sección + +> [!warning] +> Siga los pasos de esta sección solo si se aplica a su caso. +> + +Cuando las particiones del sistema EFI no están sincronizadas después de actualizaciones importantes del sistema que modifican/afectan a GRUB, y se reemplaza el disco principal en el que se monta la partición, el arranque desde un disco secundario que contiene una ESP obsoleta puede no funcionar. + +En este caso, además de reconstruir el RAID y recrear la partición del sistema EFI en modo rescue, también debe reinstalar GRUB en esta última. + +Una vez que hayamos recreado la partición EFI y nos aseguremos de que el sistema reconoce las dos particiones (pasos anteriores en `chroot`), creamos el directorio `/boot/efi` para montar la nueva partición del sistema EFI **nvme0n1p1**: + +```sh +root@rescue12-customer-eu:/# mount /boot +root@rescue12-customer-eu:/# mount /dev/nvme0n1p1 /boot/efi +``` + +A continuación, reinstalamos el cargador de arranque GRUB (*bootloader*): + +```sh +root@rescue12-customer-eu:/# grub-install --efi-directory=/boot/efi /dev/nvme0n1p1 +``` + +Una vez hecho esto, ejecute el siguiente comando: + +```sh +root@rescue12-customer-eu:/# update-grub +``` +/// + + + +#### Añadimos la etiqueta a la partición SWAP (si aplica) + +Una vez que hayamos terminado con la partición EFI, pasamos a la partición SWAP. 
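+
+Antes de recrear la partición SWAP, puede comprobar la etiqueta y el UUID utilizados en el disco sano (aquí, **nvme1n1p4**), por ejemplo con `blkid` (salida abreviada):
+
+```sh
+root@rescue12-customer-eu:/# blkid /dev/nvme1n1p4
+/dev/nvme1n1p4: LABEL="swap-nvme1n1p4" UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" TYPE="swap"
+```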
+ +Salimos del entorno `chroot` con `exit` para recrear nuestra partición [SWAP] **nvme0n1p4** y añadir la etiqueta `swap-nvme0n1p4`: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 +Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Verificamos que la etiqueta se haya aplicado correctamente: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS +nvme1n1 + +├─nvme1n1p1 +│ vfat FAT16 EFI_SYSPART +│ BA77-E844 504.9M 1% /root/old +├─nvme1n1p2 +│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 +│ └─md2 +│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac +├─nvme1n1p3 +│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c +│ └─md3 +│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt +└─nvme1n1p4 + swap 1 swap-nvme1n1p4 + d6af33cf-fc15-4060-a43c-cb3b5537f58a +nvme0n1 + +├─nvme0n1p1 +│ vfat FAT16 EFI_SYSPART +│ 477D-6658 +├─nvme0n1p2 +│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 +│ └─md2 +│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac +├─nvme0n1p3 +│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c +│ └─md3 +│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt +└─nvme0n1p4 + swap 1 swap-nvme0n1p4 + b3c9e03a-52f5-4683-81b6-cc10091fcd15 +``` + +Accedemos nuevamente al entorno `chroot`: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt +``` + +Recuperamos el UUID de ambas particiones swap: + +```sh + +# mdadm: re-added /dev/nvme0n1p3 +``` + +Utilice el siguiente comando para seguir la reconstrucción del RAID: `cat /proc/mdstat`. + +**Recreación de la partición EFI System en el disco** + +En primer lugar, instalamos las herramientas necesarias: + +**Debian y Ubuntu** + +```sh +[user@server_ip ~]# sudo apt install dosfstools +``` + +**CentOS** + +```sh +[user@server_ip ~]# sudo yum install dosfstools +``` + +A continuación, formateamos la partición. En nuestro ejemplo `nvme0n1p1`: + +```sh +[user@server_ip ~]# sudo mkfs.vfat /dev/nvme0n1p1 +``` + +A continuación, asignamos la etiqueta `EFI_SYSPART` a la partición. (este nombre es específico de OVHcloud): + +```sh +[user@server_ip ~]# sudo fatlabel /dev/nvme0n1p1 EFI_SYSPART +``` + +Una vez hecho esto, puede sincronizar las dos particiones utilizando el script que hemos proporcionado [aquí](#script). 
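+
+Por ejemplo, si guardó el script de sincronización como `nombre-del-script.sh` en el directorio actual (salida ilustrativa):
+
+```sh
+[user@server_ip ~]# sudo ./nombre-del-script.sh
+/dev/nvme1n1p1 es la partición principal
+Trabajo en /dev/nvme0n1p1
+```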
+ +Comprobamos que la nueva partición EFI System se ha creado correctamente y que el sistema la reconoce: + +```sh +[user@server_ip ~]# sudo blkid -t LABEL=EFI_SYSPART +/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" +/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" +``` + +Finalmente, activamos la partición [SWAP] (si aplica): + +- Creamos y añadimos la etiqueta: + +```sh +[user@server_ip ~]# sudo mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 +``` + +- Recuperamos los UUID de las dos particiones swap: + +```sh +[user@server_ip ~]# sudo blkid -s /dev/nvme0n1p4 +/dev/nvme0n1p4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +[user@server_ip ~]# sudo blkid -s /dev/nvme1n1p4 +/dev/nvme1n1p4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +- Reemplazamos el antiguo UUID de la partición swap (**nvme0n1p4)** por el nuevo en `/etc/fstab`: + +```sh +[user@server_ip ~]# sudo nano /etc/fstab +``` + +Ejemplo: + +```sh +[user@server_ip ~]# sudo nano /etc/fstab +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Según los resultados anteriores, el antiguo UUID es `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` y debe ser reemplazado por el nuevo `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. + +Asegúrese de reemplazar el UUID correcto. + +A continuación, ejecutamos el siguiente comando para activar la partición swap: + +```sh +[user@server_ip ~]# sudo swapon -av +swapon: /dev/nvme0n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme0n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme0n1p4 +swapon: /dev/nvme1n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme1n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme1n1p4 +``` + +A continuación, recargamos el sistema: + +```sh +[user@server_ip ~]# sudo systemctl daemon-reload +``` + +Hemos terminado con éxito la reconstrucción del RAID. + +## Más información + +[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) + +[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) + +[Managing hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) + +[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) + +Para servicios especializados (SEO, desarrollo, etc.), contacte con [los socios de OVHcloud](/links/partner). + +Si necesita asistencia para utilizar y configurar sus soluciones OVHcloud, consulte nuestras [ofertas de soporte](/links/support). + +Si necesita formación o asistencia técnica para implementar nuestras soluciones, contacte con su representante comercial o haga clic en [este enlace](/links/professional-services) para obtener un presupuesto y solicitar que los expertos del equipo de Professional Services intervengan en su caso de uso específico. + +Únase a nuestra [comunidad de usuarios](/links/community). 
\ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md index 96d2370ae91..e958d4c3c0a 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.fr-fr.md @@ -1,7 +1,7 @@ --- title: "Gestion et reconstruction d'un RAID logiciel sur les serveurs utilisant le mode de démarrage UEFI" excerpt: Découvrez comment gérer et reconstruire un RAID logiciel après un remplacement de disque sur un serveur utilisant le mode de démarrage UEFI -updated: 2025-12-05 +updated: 2025-12-11 --- ## Objectif @@ -203,7 +203,7 @@ La partition `nvme0n1p5` est une partition de configuration, c'est-à-dire un vo ***Qu'est-ce qu'une partition système EFI ?*** -**Une partition système EFI est une partition sur laquelle le serveur demarre. Elle contient les fichiers de démarrage, mais aussi les gestionnaires de démarrage ou les images de noyau d'un système d'exploitation installé. Elle peut également contenir des programmes utilitaires conçus pour être exécutés avant que le système d'exploitation ne démarre, ainsi que des fichiers de données tels que des journaux d'erreurs. +Une partition système EFI est une partition sur laquelle le serveur demarre. Elle contient les fichiers de démarrage, mais aussi les gestionnaires de démarrage ou les images de noyau d'un système d'exploitation installé. Elle peut également contenir des programmes utilitaires conçus pour être exécutés avant que le système d'exploitation ne démarre, ainsi que des fichiers de données tels que des journaux d'erreurs. ***La partition système EFI est-elle incluse dans le RAID ?*** diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.it-it.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.it-it.md new file mode 100644 index 00000000000..28d39c3c16d --- /dev/null +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.it-it.md @@ -0,0 +1,896 @@ +--- +title: "Gestione e ricostruzione di un RAID software sui server in modalità di avvio UEFI" +excerpt: Scopri come gestire e ricostruire un RAID software dopo il ripristino di un disco su un server in modalità di avvio UEFI +updated: 2025-12-11 +--- + +## Obiettivo + +Un Redundant Array of Independent Disks (RAID) è una tecnologia che riduce la perdita di dati su un server replicando i dati su due dischi o più. + +Il livello RAID predefinito per le installazioni dei server OVHcloud è il RAID 1, che raddoppia lo spazio occupato dai vostri dati, riducendo quindi la capacità di archiviazione utilizzabile a metà. + +**Questa guida spiega come gestire e ricostruire un RAID software dopo il ripristino di un disco sul vostro server in modalità EFI** + +Prima di iniziare, notate che questa guida si concentra sui Server dedicati che utilizzano la modalità UEFI come modalità di avvio. Questo è il caso delle schede madri moderne. Se il vostro server utilizza la modalità di avvio legacy (BIOS), consultate questa guida: [Gestione e ricostruzione di un RAID software su server in modalità di avvio legacy (BIOS)](/pages/bare_metal_cloud/dedicated_servers/raid_soft_bios). 
+ +Per verificare se un server funziona in modalità BIOS legacy o in modalità UEFI, eseguite il comando seguente: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` + +Per ulteriori informazioni sull'UEFI, consultate l'articolo seguente: [https://uefi.org/about](https://uefi.org/about). + +## Prerequisiti + +- Un [server dedicato](/links/bare-metal/bare-metal) con una configurazione RAID software +- Un accesso amministrativo (sudo) al server tramite SSH +- Una comprensione del RAID, delle partizioni e di GRUB + +Durante questa guida utilizzeremo i termini **disco principale** e **disco secondario**. In questo contesto: + +- Il disco principale è il disco il cui ESP (EFI System Partition) è montato da Linux +- I dischi secondari sono tutti gli altri dischi del RAID + +## Procedura + +Quando acquisti un nuovo server, potresti sentire il bisogno di effettuare una serie di test e azioni. Un tale test potrebbe consistere nel simulare un guasto del disco per comprendere il processo di ricostruzione del RAID e prepararti in caso di problemi. + +### Panoramica del contenuto + +- [Informazioni di base](#basicinformation) +- [Comprendere la partizione del sistema EFI (ESP)](#efisystemparition) +- [Simulazione di un guasto del disco](#diskfailure) + - [Rimozione del disco guasto](#diskremove) +- [Ricostruzione del RAID](#raidrebuild) + - [Ricostruzione del RAID dopo la sostituzione del disco principale (modalità di salvataggio)](#rescuemode) + - [Ricreazione della partizione del sistema EFI](#recreateesp) + - [Ricostruzione del RAID quando le partizioni EFI non sono sincronizzate dopo aggiornamenti importanti del sistema (es. GRUB)](efiraodgrub) + - [Aggiunta dell'etichetta alla partizione SWAP (se applicabile)](#swap-partition) + - [Ricostruzione del RAID in modalità normale](#normalmode) + + + +### Informazioni di base + +In una sessione della riga di comando, digita il comando seguente per determinare lo stato corrente del RAID : + +```sh +[user@server_ip ~]# cat /proc/mdstat +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md3 : active raid1 nvme1n1p3[1] nvme0n1p3[0] + 497875968 blocks super 1.2 [2/2] [UU] + bitmap: 2/4 pages [8KB], 65536KB chunk + +md2 : active raid1 nvme1n1p2[1] nvme0n1p2[0] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +Questo comando ci mostra che attualmente abbiamo due volumi RAID software configurati, **md2** e **md3**, con **md3** che è il più grande dei due. **md3** è composto da due partizioni, chiamate **nvme1n1p3** e **nvme0n1p3**. + +Il [UU] significa che tutti i dischi funzionano normalmente. Un `_` indicherebbe un disco guasto. + +Se hai un server con dischi SATA, otterrai i seguenti risultati : + +```sh +[user@server_ip ~]# cat /proc/mdstat +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md3 : active raid1 sda3[0] sdb3[1] + 3904786432 blocks super 1.2 [2/2] [UU] + bitmap: 2/30 pages [8KB], 65536KB chunk + +md2 : active raid1 sda2[0] sdb2[1] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +Sebbene questo comando restituisca i nostri volumi RAID, non ci indica la dimensione delle partizioni stesse. 
Possiamo trovare queste informazioni con il comando seguente : + +```sh +[user@server_ip ~]# sudo fdisk -l + +Disk /dev/nvme1n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors +Disk model: WDC CL SN720 SDAQNTW-512G-2000 +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +Disklabel type: gpt +Disk identifier: A11EDAA3-A984-424B-A6FE-386550A92435 + +Device Start End Sectors Size Type +/dev/nvme1n1p1 2048 1048575 1046528 511M EFI System +/dev/nvme1n1p2 1048576 3145727 2097152 1G Linux RAID +/dev/nvme1n1p3 3145728 999161855 996016128 474.9G Linux RAID +/dev/nvme1n1p4 999161856 1000210431 1048576 512M Linux files + + +Disk /dev/nvme0n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors +Disk model: WDC CL SN720 SDAQNTW-512G-2000 +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +Disklabel type: gpt +Disk identifier: F03AC3C3-D7B7-43F9-88DB-9F12D7281D94 + +Device Start End Sectors Size Type +/dev/nvme0n1p1 2048 1048575 1046528 511M EFI System +/dev/nvme0n1p2 1048576 3145727 2097152 1G Linux RAID +/dev/nvme0n1p3 3145728 999161855 996016128 474.9G Linux RAID +/dev/nvme0n1p4 999161856 1000210431 1048576 512M Linux file +/dev/nvme0n1p5 1000211120 1000215182 4063 2M Linux file + + +Disk /dev/md2: 1022 MiB, 1071644672 bytes, 2093056 sectors +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes + + +Disk /dev/md3: 474.81 GiB, 509824991232 bytes, 995751936 sectors +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +``` + +Il comando `fdisk -l` consente anche di identificare il tipo delle tue partizioni. Questa informazione è importante durante la ricostruzione del tuo RAID in caso di guasto del disco. + +Per le partizioni **GPT**, la riga 6 mostrerà: `Disklabel type: gpt`. + +Ancora basandosi sui risultati di `fdisk -l`, possiamo vedere che `/dev/md2` è composto da 1022 MiB e `/dev/md3` contiene 474,81 GiB. Se eseguiamo il comando `mount`, possiamo anche trovare la disposizione dei dischi. 
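+
+A titolo illustrativo, un output di `mount` filtrato sui dispositivi a blocchi (i valori esatti dipendono dal vostro sistema) potrebbe avere questo aspetto:
+
+```sh
+[user@server_ip ~]# mount | grep '^/dev/'
+/dev/md3 on / type ext4 (rw,relatime)
+/dev/md2 on /boot type ext4 (rw,relatime)
+/dev/nvme0n1p1 on /boot/efi type vfat (rw,relatime)
+```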
+ +In alternativa, il comando `lsblk` offre una visione diversa delle partizioni : + +```sh +[user@server_ip ~]# lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:7 0 511M 0 part +├─nvme1n1p2 259:8 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 /boot +├─nvme1n1p3 259:9 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 / +└─nvme1n1p4 259:10 0 512M 0 part [SWAP] +nvme0n1 259:1 0 476.9G 0 disk +├─nvme0n1p1 259:2 0 511M 0 part /boot/efi +├─nvme0n1p2 259:3 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 /boot +├─nvme0n1p3 259:4 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 / +├─nvme0n1p4 259:5 0 512M 0 part [SWAP] +└─nvme0n1p5 259:6 0 2M 0 part +``` + +Inoltre, se eseguiamo `lsblk -f`, otteniamo ulteriori informazioni su queste partizioni, come l'etichetta (LABEL) e l'UUID : + +```sh +[user@server_ip ~]# sudo lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA +├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme1n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +└─nvme1n1p4 swap 1 swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7 [SWAP] +nvme0n1 +├─nvme0n1p1 vfat FAT16 EFI_SYSPART B486-9781 504.9M 1% /boot/efi +├─nvme0n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme0n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +├─nvme0n1p4 swap 1 swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b [SWAP] +└─nvme0n1p5 iso9660 Joliet Extension config-2 2025-08-05-14-55-41-00 +``` + +Prendi nota dei dispositivi, delle partizioni e dei loro punti di montaggio; è importante, soprattutto dopo la sostituzione di un disco. + +Dalle comandi e risultati sopra, abbiamo : + +- Due matrici RAID : `/dev/md2` e `/dev/md3`. +- Quattro partizioni che fanno parte del RAID : **nvme0n1p2**, **nvme0n1p3**, **nvme1n1p2**, **nvme0n1p3** con i punti di montaggio `/boot` e `/`. +- Due partizioni non incluse nel RAID, con i punti di montaggio : `/boot/efi` e [SWAP]. +- Una partizione che non possiede un punto di montaggio : **nvme1n1p1** + +La partizione `nvme0n1p5` è una partizione di configurazione, cioè un volume in sola lettura connesso al server che gli fornisce i dati di configurazione iniziale. + + + +### Comprendere la partizione del sistema EFI (ESP) + +***Cos'è una partizione del sistema EFI ?*** + +Una partizione del sistema EFI è una partizione su cui il server si avvia. Contiene i file di avvio, ma anche i gestori di avvio o le immagini del kernel di un sistema operativo installato. Può anche contenere programmi utili progettati per essere eseguiti prima che il sistema operativo si avvii, così come file di dati come registri degli errori. + +***La partizione del sistema EFI è inclusa nel RAID ?*** + +No, a partire da agosto 2025, quando un'installazione del sistema operativo viene effettuata da OVHcloud, la partizione ESP non è inclusa nel RAID. Quando si utilizzano i nostri modelli OS per installare il server con un RAID software, vengono create più partizioni del sistema EFI: una per disco. Tuttavia, solo una partizione EFI è montata alla volta. Tutte le ESP create contengono gli stessi file. 
Tutte le ESP create al momento dell'installazione contengono gli stessi file. + +La partizione del sistema EFI è montata a `/boot/efi` e il disco su cui è montata viene selezionato da Linux all'avvio. + +Esempio : + +```sh +[user@server_ip ~]# sudo lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA +├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f90 + +Ti consigliamo di sincronizzare regolarmente i tuoi ESP o dopo ogni aggiornamento importante del sistema. Per default, tutte le partizioni EFI del sistema contengono gli stessi file dopo l'installazione. Tuttavia, se è coinvolto un aggiornamento importante del sistema, la sincronizzazione degli ESP è essenziale per mantenere aggiornato il contenuto. + + + +#### Script + +Ecco uno script che puoi utilizzare per sincronizzarli manualmente. Puoi anche eseguire uno script automatizzato per sincronizzare le partizioni quotidianamente o ogni volta che il servizio parte. + +Prima di eseguire lo script, assicurati che `rsync` sia installato sul tuo sistema : + +**Debian/Ubuntu** + +```sh +sudo apt install rsync +``` + +**CentOS, Red Hat e Fedora** + +```sh +sudo yum install rsync +``` + +Per eseguire uno script su Linux, hai bisogno di un file eseguibile : + +- Inizia creando un file .sh nella directory di tuo interesse, sostituendo `nome-script` con il nome che preferisci. + +```sh +sudo touch nome-script.sh +``` + +- Apri il file con un editor di testo e aggiungi le seguenti righe : + +```sh +sudo nano nome-script.sh +``` + +```sh +#!/bin/bash + +set -euo pipefail + +MOUNTPOINT="/var/lib/grub/esp" +MAIN_PARTITION=$(findmnt -n -o SOURCE /boot/efi) + +echo "${MAIN_PARTITION} è la partizione principale" + +mkdir -p "${MOUNTPOINT}" + +while read -r partition; do + if [[ "${partition}" == "${MAIN_PARTITION}" ]]; then + continue + fi + echo "Lavoro su ${partition}" + mount "${partition}" "${MOUNTPOINT}" + rsync -ax "/boot/efi/" "${MOUNTPOINT}/" + umount "${MOUNTPOINT}" +done < <(blkid -o device -t LABEL=EFI_SYSPART) +``` + +Salva e chiudi il file. + +- Rendi lo script eseguibile + +```sh +sudo chmod +x nome-script.sh +``` + +- Esegui lo script + +```sh +sudo ./nome-script.sh +``` + +- Se non sei nella directory + +```sh +./percorso/verso/la/cartella/nome-script.sh +``` + +Quando lo script viene eseguito, il contenuto della partizione EFI montata verrà sincronizzato con le altre. Per accedere al contenuto, puoi montare una di queste partizioni EFI non montate sul punto di montaggio : `/var/lib/grub/esp`. + + + +### Simulazione di un guasto del disco + +Ora che abbiamo tutte le informazioni necessarie, possiamo simulare un guasto del disco e procedere ai test. In questo primo esempio, provocheremo un guasto del disco principale `nvme0n1`. + +Il metodo preferito per farlo è attraverso la modalità rescue di OVHcloud. + +Riavvia prima il server in modalità rescue e collegati con le credenziali fornite. + +Per rimuovere un disco dal RAID, il primo passo è contrassegnarlo come **Failed** e rimuovere le partizioni dai rispettivi array RAID. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[0] nvme1n1p3[1] + 497875968 blocks super 1.2 [2/2] [UU] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +Dai risultati sopra, nvme0n1 contiene due partizioni in RAID che sono **nvme0n1p2** e **nvme0n1p3**. + + + +#### Rimozione del disco guasto + +In primo luogo, contrassegniamo le partizioni **nvme0n1p2** e **nvme0n1p3** come guaste. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/nvme0n1p2 +# mdadm: set /dev/nvme0n1p2 faulty in /dev/md2 +``` + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --fail /dev/nvme0n1p3 +# mdadm: set /dev/nvme0n1p3 faulty in /dev/md3 +``` + +Quando eseguiamo il comando `cat /proc/mdstat`, otteniamo : + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2](F) nvme1n1p2[1] + 1046528 blocks super 1.2 [2/1] [_U] + +unused devices: +``` + +Come possiamo vedere sopra, il [F] accanto alle partizioni indica che il disco è guasto o in panne. + +Successivamente, rimuoviamo queste partizioni dagli array RAID per eliminarle completamente dal RAID. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --remove /dev/nvme0n1p2 +# mdadm: hot removed /dev/nvme0n1p2 from /dev/md2 +``` + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --remove /dev/nvme0n1p3 +# mdadm: hot removed /dev/nvme0n1p3 from /dev/md3 +``` + +Lo stato del nostro RAID dovrebbe ora assomigliare a questo : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme1n1p2[1] + 1046528 blocks super 1.2 [2/1] [_U] + +unused devices: +``` + +Dai risultati sopra, possiamo vedere che ora ci sono solo due partizioni negli array RAID. Abbiamo riuscito a degradare il disco **nvme0n1**. 
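+
+Se lo desideri, puoi inoltre confermare lo stato degradato di ogni array con `mdadm --detail` (output abbreviato e puramente illustrativo):
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md2
+/dev/md2:
+        Raid Level : raid1
+      Raid Devices : 2
+     Total Devices : 1
+             State : clean, degraded
+    Active Devices : 1
+   Working Devices : 1
+    Failed Devices : 0
+
+    Number   Major   Minor   RaidDevice State
+       -       0        0        0      removed
+       1     259        8        1      active sync   /dev/nvme1n1p2
+```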
+ +Per assicurarci di ottenere un disco simile a un disco vuoto, utilizziamo il comando seguente su ogni partizione, quindi sul disco stesso : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +shred -s10M -n1 /dev/nvme0n1p1 +shred -s10M -n1 /dev/nvme0n1p2 +shred -s10M -n1 /dev/nvme0n1p3 +shred -s10M -n1 /dev/nvme0n1p4 +shred -s10M -n1 /dev/nvme0n1p5 +shred -s10M -n1 /dev/nvme0n1 +``` + +Il disco appare ora come un disco nuovo e vuoto : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk + +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:1 0 511M 0 part +├─nvme1n1p2 259:2 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 +├─nvme1n1p3 259:3 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 +└─nvme1n1p4 259:4 0 512M 0 part +nvme0n1 259:5 0 476.9G 0 disk +``` + +Se eseguiamo il comando seguente, constatiamo che il nostro disco è stato correttamente "cancellato" : + +```sh +parted /dev/nvme0n1 +GNU Parted 3.5 +Using /dev/nvme0n1 +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/nvme0n1: unrecognised disk label +Model: WDC CL SN720 SDAQNTW-512G-2000 (nvme) +Disk /dev/nvme0n1: 512GB +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: +``` + +Per ulteriori informazioni sulla preparazione e la richiesta di sostituzione di un disco, consulta questo [guida](/pages/bare_metal_cloud/dedicated_servers/disk_replacement). + +Se esegui il comando seguente, puoi ottenere ulteriori dettagli sugli array RAID : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md3 + +/dev/md3: + Version : 1.2 + Creation Time : Fri Aug 1 14:51:13 2025 + Raid Level : raid1 + Array Size : 497875968 (474.81 GiB 509.82 GB) + Used Dev Size : 497875968 (474.81 GiB 509.82 GB) + Raid Devices : 2 + Total Devices : 1 + Persistence : Superblock is persistent + + Intent Bitmap : Internal + + Update Time : Fri Aug 1 15:56:17 2025 + State : clean, degraded + Active Devices : 1 + Working Devices : 1 + Failed Devices : 0 + Spare Devices : 0 + +Consistency Policy : bitmap + + Name : md3 + UUID : b383c3d5:7fb1bb5e:6b7c4d96:6ea817ff + Events : 215 + + Number Major Minor RaidDevice State + - 0 0 0 removed + 1 259 4 1 active sync /dev/nvme1n1p3 +``` + +Possiamo ora procedere alla sostituzione del disco. + + + +### Ricostruzione del RAID + +> [!primary] +> Questo processo può variare a seconda del sistema operativo installato sul tuo server. Ti consigliamo di consultare la documentazione ufficiale del tuo sistema operativo per ottenere i comandi appropriati. +> + +> [!warning] +> +> Su la maggior parte dei server in RAID software, dopo la sostituzione di un disco, il server è in grado di avviarsi in modalità normale (sul disco sano) e la ricostruzione può essere effettuata in modalità normale. Tuttavia, se il server non riesce ad avviarsi in modalità normale dopo la sostituzione del disco, si riavvierà in modalità rescue per procedere alla ricostruzione del RAID. +> +> Se il tuo server è in grado di avviarsi in modalità normale dopo la sostituzione del disco, segui semplicemente le fasi di [questa sezione](#rebuilding-the-raid-in-normal-mode). + + + +#### Ricostruzione del RAID in modalità rescue + +Una volta sostituito il disco, il passo successivo consiste nel copiare la tabella delle partizioni del disco sano (in questo esempio, nvme1n1) sul nuovo (nvme0n1). 
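+
+Prima di lanciare la copia, conviene verificare un'ultima volta quale sia il disco nuovo e quale il disco sano, ad esempio confrontando il modello e il numero di serie di ogni disco (i numeri di serie mostrati qui sono semplici segnaposto):
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -d -o NAME,SIZE,MODEL,SERIAL
+NAME      SIZE MODEL                          SERIAL
+nvme0n1 476.9G WDC CL SN720 SDAQNTW-512G-2000 XXXXXXXXXXXX
+nvme1n1 476.9G WDC CL SN720 SDAQNTW-512G-2000 YYYYYYYYYYYY
+```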
+
+**Per le partizioni GPT**
+
+Il comando deve essere in questo formato : `sgdisk -R /dev/nuovo disco /dev/disco sano`
+
+Nel nostro esempio :
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/nvme0n1 /dev/nvme1n1
+```
+
+Esegui `lsblk` per assicurarti che le tabelle delle partizioni siano state correttamente copiate :
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk
+
+NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
+nvme1n1     259:0    0 476.9G  0 disk
+├─nvme1n1p1 259:1    0   511M  0 part
+├─nvme1n1p2 259:2    0     1G  0 part
+│ └─md2       9:2    0  1022M  0 raid1
+├─nvme1n1p3 259:3    0 474.9G  0 part
+│ └─md3       9:3    0 474.8G  0 raid1
+└─nvme1n1p4 259:4    0   512M  0 part
+nvme0n1     259:5    0 476.9G  0 disk
+├─nvme0n1p1 259:10   0   511M  0 part
+├─nvme0n1p2 259:11   0     1G  0 part
+├─nvme0n1p3 259:12   0 474.9G  0 part
+└─nvme0n1p4 259:13   0   512M  0 part
+```
+
+Una volta fatto questo, il passo successivo consiste nell'assegnare un GUID casuale al nuovo disco per evitare conflitti di GUID con altri dischi :
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -G /dev/nvme0n1
+```
+
+Se ricevi il seguente messaggio :
+
+```console
+Warning: The kernel is still using the old partition table.
+The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8)
+The operation has completed successfully.
+```
+
+Esegui semplicemente il comando `partprobe`.
+
+Possiamo ora ricostruire l'array RAID. L'estratto di codice seguente mostra come aggiungere nuovamente le nuove partizioni (nvme0n1p2 e nvme0n1p3) all'array RAID.
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md2 /dev/nvme0n1p2
+# mdadm: added /dev/nvme0n1p2
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md3 /dev/nvme0n1p3
+# mdadm: re-added /dev/nvme0n1p3
+```
+
+Per verificare il processo di ricostruzione:
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat
+Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
+md3 : active raid1 nvme0n1p3[2] nvme1n1p3[1]
+      497875968 blocks super 1.2 [2/1] [_U]
+      [>....................]  recovery =  0.1% (801920/497875968) finish=41.3min speed=200480K/sec
+      bitmap: 0/4 pages [0KB], 65536KB chunk
+
+md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1]
+      1046528 blocks super 1.2 [2/2] [UU]
+```
+
+Una volta completata la ricostruzione del RAID, esegui il comando seguente per verificare che le partizioni siano state correttamente aggiunte al RAID:
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f
+NAME        FSTYPE            FSVER LABEL          UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
+nvme1n1
+├─nvme1n1p1 vfat              FAT16 EFI_SYSPART    4629-D183
+├─nvme1n1p2 linux_raid_member 1.2   md2            83719c5c-2a27-2a56-5268-7d49d8a1d84f
+│ └─md2     ext4              1.0   boot           4de80ae0-dd90-4256-9135-1735e7be4b4d
+├─nvme1n1p3 linux_raid_member 1.2   md3            b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff
+│ └─md3     ext4              1.0   root           9bf386b6-9523-46bf-b8e5-4b8cc7c5786f
+└─nvme1n1p4 swap              1     swap-nvme1n1p4 9bf292e8-0145-4d2f-b891-4cef93c0d209
+nvme0n1
+├─nvme0n1p1
+├─nvme0n1p2 linux_raid_member 1.2   md2            83719c5c-2a27-2a56-5268-7d49d8a1d84f
+│ └─md2     ext4              1.0   boot           4de80ae0-dd90-4256-9135-1735e7be4b4d
+├─nvme0n1p3 linux_raid_member 1.2   md3            b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff
+│ └─md3     ext4              1.0   root           9bf386b6-9523-46bf-b8e5-4b8cc7c5786f
+└─nvme0n1p4
+```
+
+In base ai risultati sopra riportati, le partizioni del nuovo disco sono state correttamente aggiunte al RAID. Tuttavia, la partizione EFI System e la partizione SWAP (in alcuni casi) non sono state duplicate, il che è normale poiché non fanno parte del RAID.
+
+> [!warning]
+> Gli esempi sopra illustrano semplicemente le fasi necessarie in base a una configurazione di server predefinita.
I risultati di ogni comando dipendono dal tipo di hardware installato sul tuo server e dalla struttura delle sue partizioni. In caso di dubbi, consulta la documentazione del tuo sistema operativo. +> +> Se hai bisogno di un supporto professionale per l'amministrazione del tuo server, consulta i dettagli della sezione [Per saperne di più](#go-further) di questa guida. +> + + + +#### Ricostruzione della partizione EFI System + +Per ricostruire la partizione EFI System, dobbiamo formattare **nvme0n1p1** e replicare il contenuto della partizione EFI System sana (nel nostro esempio: nvme1n1p1) su questa. + +In questo caso, assumiamo che le due partizioni siano state sincronizzate e contengano file aggiornati o non abbiano subito aggiornamenti del sistema che influenzano il *bootloader*. + +> [!warning] +> Se è avvenuto un aggiornamento importante del sistema, ad esempio un aggiornamento del kernel o di GRUB, e le due partizioni non sono state sincronizzate, consulta questa [sezione](#rebuilding-raid-when-efi-partitions-are-not-synchronized-after-major-system-updates-eg-grub) una volta completata la creazione della nuova partizione EFI System. +> + +In primo luogo, formattiamo la partizione: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkfs.vfat /dev/nvme0n1p1 +``` + +Successivamente, assegniamo l'etichetta `EFI_SYSPART` alla partizione. (questo nome è specifico di OVHcloud): + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # fatlabel /dev/nvme0n1p1 EFI_SYSPART +``` + +Successivamente, duplichiamo il contenuto di nvme1n1p1 in nvme0n1p1. Creiamo prima due cartelle, che chiamiamo « old » e « new » nel nostro esempio: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkdir old new +``` + +Successivamente, montiamo **nvme1n1p1** nella cartella « old » e **nvme0n1p1** nella cartella « new » per distinguerle: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme1n1p1 old +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme0n1p1 new +``` + +Successivamente, copiamo i file della cartella 'old' in 'new': + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # rsync -axv old/ new/ +sending incremental file list +EFI/ +EFI/debian/ +EFI/debian/BOOTX64.CSV +EFI/debian/fbx64.efi +EFI/debian/grub.cfg +EFI/debian/grubx64.efi +EFI/debian/mmx64.efi +EFI/debian/shimx64.efi + +sent 6,099,848 bytes received 165 bytes 12,200,026.00 bytes/sec +total size is 6,097,843 speedup is 1.00 +``` + +Una volta completato, smontiamo le due partizioni: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme0n1p1 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme1n1p1 +``` + +Successivamente, montiamo la partizione che contiene la radice del nostro sistema operativo su `/mnt`. 
Nell'esempio, questa partizione è **md3**: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md3 /mnt +``` + +Montiamo i seguenti directory per assicurarci che qualsiasi operazione che eseguiamo nell'ambiente `chroot` funzioni correttamente: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Successivamente, utilizziamo il comando `chroot` per accedere al punto di montaggio e verificare che la nuova partizione EFI System sia stata correttamente creata e che il sistema riconosca entrambe le ESP: + +```sh +root@rescue12-customer-eu:/# chroot /mnt +``` + +Per visualizzare le partizioni ESP, eseguiamo il comando `blkid -t LABEL=EFI_SYSPART`: + +```sh +root@rescue12-customer-eu:/# blkid -t LABEL=EFI_SYSPART +/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" +/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" +``` + +I risultati sopra mostrano che la nuova partizione EFI è stata creata correttamente e che l'etichetta è stata applicata correttamente. + + + +#### Ricostruzione del RAID quando le partizioni EFI non sono sincronizzate dopo aggiornamenti importanti del sistema (GRUB) + +/// details | Espandi questa sezione + +> [!warning] +> Segui le fasi di questa sezione solo se si applica al tuo caso. +> + +Quando le partizioni del sistema EFI non sono sincronizzate dopo aggiornamenti importanti del sistema che modificano/colpiscono il GRUB, e il disco principale su cui è montata la partizione viene sostituito, l'avvio da un disco secondario che contiene un'ESP obsoleta potrebbe non funzionare. + +In questo caso, oltre a ricostruire il RAID e a ricreare la partizione del sistema EFI in modalità rescue, devi anche reinstallare il GRUB su quest'ultima. + +Una volta che abbiamo ricreato la partizione EFI e ci siamo assicurati che il sistema riconosca entrambe le partizioni (fasi precedenti in `chroot`), creiamo la directory `/boot/efi` per montare la nuova partizione del sistema EFI **nvme0n1p1**: + +```sh +root@rescue12-customer-eu:/# mount /boot +root@rescue12-customer-eu:/# mount /dev/nvme0n1p1 /boot/efi +``` + +Successivamente, reinstalliamo il caricatore di avvio GRUB (*bootloader*): + +```sh +root@rescue12-customer-eu:/# grub-install --efi-directory=/boot/efi /dev/nvme0n1p1 +``` + +Una volta fatto, esegui il comando seguente: + +```sh +root@rescue12-customer-eu:/# update-grub +``` +/// + + + +#### Aggiunta dell'etichetta alla partizione SWAP (se applicabile) + +Una volta completata la partizione EFI, passiamo alla partizione SWAP. 
+ +Usciamo dall'ambiente `chroot` con `exit` per ricreare la nostra partizione [SWAP] **nvme0n1p4** e aggiungere l'etichetta `swap-nvme0n1p4`: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 +Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Verifichiamo che l'etichetta sia stata correttamente applicata: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS +nvme1n1 + +├─nvme1n1p1 +│ vfat FAT16 EFI_SYSPART +│ BA77-E844 504.9M 1% /root/old +├─nvme1n1p2 +│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 +│ └─md2 +│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac +├─nvme1n1p3 +│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c +│ └─md3 +│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt +└─nvme1n1p4 + swap 1 swap-nvme1n1p4 + d6af33cf-fc15-4060-a43c-cb3b5537f58a +nvme0n1 + +├─nvme0n1p1 +│ vfat FAT16 EFI_SYSPART +│ 477D-6658 +├─nvme0n1p2 +│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 +│ └─md2 +│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac +├─nvme0n1p3 +│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c +│ └─md3 +│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt +└─nvme0n1p4 + swap 1 swap-nvme0n1p4 + b3c9e03a-52f5-4683-81b6-cc10091fcd15 +``` + +Accediamo nuovamente all'ambiente `chroot` + +# mdadm: re-added /dev/nvme0n1p3 +``` + +Utilizza il comando seguente per monitorare la ricostruzione del RAID: `cat /proc/mdstat`. + +**Ricreazione della partizione EFI System sul disco** + +Per prima cosa installiamo gli strumenti necessari: + +**Debian e Ubuntu** + +```sh +[user@server_ip ~]# sudo apt install dosfstools +``` + +**CentOS** + +```sh +[user@server_ip ~]# sudo yum install dosfstools +``` + +Successivamente formattiamo la partizione. Nel nostro esempio `nvme0n1p1`: + +```sh +[user@server_ip ~]# sudo mkfs.vfat /dev/nvme0n1p1 +``` + +Successivamente assegniamo l'etichetta `EFI_SYSPART` alla partizione. (questo nome è specifico per OVHcloud): + +```sh +[user@server_ip ~]# sudo fatlabel /dev/nvme0n1p1 EFI_SYSPART +``` + +Una volta completato, puoi sincronizzare le due partizioni utilizzando lo script che abbiamo fornito [qui](#script). 
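A titolo puramente indicativo, puoi anche pianificare l'esecuzione periodica dello script di sincronizzazione tramite `cron`. Il percorso e il nome del file indicati qui sotto sono ipotetici e vanno adattati alla posizione reale dello script sul tuo server:

```sh
# Esempio ipotetico: apri il crontab dell'utente root
[user@server_ip ~]# sudo crontab -e

# Aggiungi una riga simile a questa per eseguire lo script ogni giorno alle 03:00
0 3 * * * /root/nome-dello-script.sh >> /var/log/esp-sync.log 2>&1
```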
+ +Verifichiamo che la nuova partizione EFI System sia stata creata correttamente e che il sistema la riconosca: + +```sh +[user@server_ip ~]# sudo blkid -t LABEL=EFI_SYSPART +/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" +/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" +``` + +Infine, attiviamo la partizione [SWAP] (se applicabile): + +- Creiamo e aggiungiamo l'etichetta: + +```sh +[user@server_ip ~]# sudo mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 +``` + +- Recuperiamo gli UUID delle due partizioni di swap: + +```sh +[user@server_ip ~]# sudo blkid -s /dev/nvme0n1p4 +/dev/nvme0n1p4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +[user@server_ip ~]# sudo blkid -s /dev/nvme1n1p4 +/dev/nvme1n1p4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +- Sostituiamo l'UUID vecchio della partizione swap (**nvme0n1p4)** con il nuovo in `/etc/fstab`: + +```sh +[user@server_ip ~]# sudo nano /etc/fstab +``` + +Esempio: + +```sh +[user@server_ip ~]# sudo nano /etc/fstab +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Secondo i risultati sopra, l'UUID vecchio è `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` e deve essere sostituito con il nuovo `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. + +Assicurati di sostituire l'UUID corretto. + +Successivamente, eseguiamo il comando seguente per attivare la partizione di swap: + +```sh +[user@server_ip ~]# sudo swapon -av +swapon: /dev/nvme0n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme0n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme0n1p4 +swapon: /dev/nvme1n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme1n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme1n1p4 +``` + +Successivamente, ricarichiamo il sistema: + +```sh +[user@server_ip ~]# sudo systemctl daemon-reload +``` + +Abbiamo completato con successo la ricostruzione del RAID. + +## Per saperne di più + +[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) + +[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) + +[Managing hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) + +[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) + +Per servizi specializzati (SEO, sviluppo, ecc.), contatta [i partner OVHcloud](/links/partner). + +Se hai bisogno di un supporto per utilizzare e configurare le tue soluzioni OVHcloud, consulta le [nostre offerte di supporto](/links/support). + +Se hai bisogno di formazione o di un supporto tecnico per implementare le nostre soluzioni, contatta il tuo rappresentante commerciale o clicca su [questo link](/links/professional-services) per richiedere un preventivo e chiedere ai nostri esperti del team Professional Services di intervenire sul tuo caso d'uso specifico. + +Contatta la nostra [Community di utenti](/links/community). 
\ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pl-pl.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pl-pl.md new file mode 100644 index 00000000000..57a245df19f --- /dev/null +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pl-pl.md @@ -0,0 +1,841 @@ +--- +title: Zarządzanie i odbudowa oprogramowania RAID na serwerach w trybie uruchamiania UEFI +excerpt: Dowiedz się, jak zarządzać i odbudować oprogramowanie RAID po wymianie dysku na serwerze w trybie uruchamiania UEFI +updated: 2025-12-11 +--- + +## Wprowadzenie + +Redundant Array of Independent Disks (RAID) to technologia, która zmniejsza utratę danych na serwerze, replikując dane na dwóch lub więcej dyskach. + +Domyślny poziom RAID dla instalacji serwerów OVHcloud to RAID 1, który podwaja zajęte przez dane miejsce, skutecznie zmniejszając dostępne miejsce na dysku o połowę. + +**Ten przewodnik wyjaśnia, jak zarządzać i odbudować oprogramowanie RAID po wymianie dysku na serwerze w trybie uruchamiania UEFI** + +Zanim zaczniemy, zwróć uwagę, że ten przewodnik skupia się na Serwerach dedykowanych, które używają UEFI jako trybu uruchamiania. Jest to typowe dla nowoczesnych płyt głównych. Jeśli Twój serwer używa trybu uruchamiania zgodnego (BIOS), odwiedź ten przewodnik: [Zarządzanie i odbudowa oprogramowania RAID na serwerach w trybie uruchamiania zgodnym (BIOS)](/pages/bare_metal_cloud/dedicated_servers/raid_soft_bios). + +Aby sprawdzić, czy serwer działa w trybie zgodnym BIOS czy trybie uruchamiania UEFI, uruchom następującą komendę: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` + +Aby uzyskać więcej informacji na temat UEFI, zapoznaj się z poniższym [artykułem](https://uefi.org/about). + +## Wymagania początkowe + +- Serwer [dedykowany](/links/bare-metal/bare-metal) z konfiguracją oprogramowania RAID +- Dostęp administracyjny (sudo) do serwera przez SSH +- Zrozumienie RAID, partycji i GRUB + +W trakcie tego przewodnika używamy pojęć **główny dysk** i **dyski pomocnicze**. W tym kontekście: + +- Główny dysk to dysk, którego ESP (EFI System Partition) jest montowany przez system Linux +- Dyski pomocnicze to wszystkie inne dyski w RAID + +## Instrukcje + +Kiedy zakupisz nowy serwer, możesz poczuć potrzebę wykonania serii testów i działań. Jednym z takich testów może być symulacja awarii dysku, aby zrozumieć proces odbudowy RAID i przygotować się na wypadek, gdyby to się kiedykolwiek zdarzyło. + +### Omówienie treści + +- [Podstawowe informacje](#basicinformation) +- [Zrozumienie partycji systemu EFI (ESP)](#efisystemparition) +- [Symulowanie awarii dysku](#diskfailure) + - [Usunięcie awaryjnego dysku](#diskremove) +- [Odbudowanie RAID](#raidrebuild) + - [Odbudowanie RAID po wymianie głównego dysku (tryb ratunkowy)](#rescuemode) + - [Ponowne utworzenie partycji systemu EFI](#recreateesp) + - [Odbudowanie RAID, gdy partycje EFI nie są zsynchronizowane po dużych aktualizacjach systemu (np. 
GRUB)](efiraodgrub) + - [Dodanie etykiety do partycji SWAP (jeśli dotyczy)](#swap-partition) + - [Odbudowanie RAID w trybie normalnym](#normalmode) + + + +### Podstawowe informacje + +W sesji linii poleceń wpisz następujące polecenie, aby określić bieżący stan RAID: + +```sh +[user@server_ip ~]# cat /proc/mdstat +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md3 : active raid1 nvme1n1p3[1] nvme0n1p3[0] + 497875968 blocks super 1.2 [2/2] [UU] + bitmap: 2/4 pages [8KB], 65536KB chunk + +md2 : active raid1 nvme1n1p2[1] nvme0n1p2[0] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +To polecenie pokazuje nam, że obecnie mamy skonfigurowane dwa urządzenia RAID oprogramowania, **md2** i **md3**, z **md3** będącym większym z nich. **md3** składa się z dwóch partycji o nazwach **nvme1n1p3** i **nvme0n1p3**. + +[UU] oznacza, że wszystkie dyski działają normalnie. `_` wskazywałby na awaryjny dysk. + +Jeśli masz serwer z dyskami SATA, otrzymasz następujące wyniki: + +```sh +[user@server_ip ~]# cat /proc/mdstat +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md3 : active raid1 sda3[0] sdb3[1] + 3904786432 blocks super 1.2 [2/2] [UU] + bitmap: 2/30 pages [8KB], 65536KB chunk + +md2 : active raid1 sda2[0] sdb2[1] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +Choć to polecenie zwraca nasze objęte RAID woluminy, nie mówi nam o rozmiarze partycji samych w sobie. Możemy znaleźć tę informację za pomocą poniższego polecenia: + +```sh +[user@server_ip ~]# sudo fdisk -l + +Disk /dev/nvme1n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors +Disk model: WDC CL SN720 SDAQNTW-512G-2000 +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +Disklabel type: gpt +Disk identifier: A11EDAA3-A984-424B-A6FE-386550A92435 + +Device Start End Sectors Size Type +/dev/nvme1n1p1 2048 1048575 1046528 511M EFI System +/dev/nvme1n1p2 1048576 3145727 2097152 1G Linux RAID +/dev/nvme1n1p3 3145728 999161855 996016128 474.9G Linux RAID +/dev/nvme1n1p4 999161856 1000210431 1048576 512M Linux files + + +Disk /dev/nvme0n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors +Disk model: WDC CL SN720 SDAQNTW-512G-2000 +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +Disklabel type: gpt +Disk identifier: F03AC3C3-D7B7-43F9-88DB-9F12D7281D94 + +Device Start End Sectors Size Type +/dev/nvme0n1p1 2048 1048575 1046528 511M EFI System +/dev/nvme0n1p2 1048576 3145727 2097152 1G Linux RAID +/dev/nvme0n1p3 3145728 999161855 996016128 474.9G Linux RAID +/dev/nvme0n1p4 999161856 1000210431 1048576 512M Linux file +/dev/nvme0n1p5 1000211120 1000215182 4063 2M Linux file + + +Disk /dev/md2: 1022 MiB, 1071644672 bytes, 2093056 sectors +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes + + +Disk /dev/md3: 474.81 GiB, 509824991232 bytes, 995751936 sectors +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +``` + +Polecenie `fdisk -l` umożliwia również identyfikację typu partycji. Jest to ważna informacja przy odbudowie RAID w przypadku awarii dysku. + +Dla partycji **GPT**, linia 6 będzie wyświetlać: `Disklabel type: gpt`. 
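Jeśli chcesz szybko sprawdzić sam typ tablicy partycji, bez przeglądania pełnego wyniku `fdisk -l`, możesz na przykład odfiltrować odpowiednią linię. Poniższy przykład jest wyłącznie poglądowy, a nazwę dysku należy dostosować do własnego serwera:

```sh
# Wyświetla wyłącznie typ tablicy partycji (gpt lub dos) wskazanego dysku
[user@server_ip ~]# sudo fdisk -l /dev/nvme0n1 | grep "Disklabel type"
Disklabel type: gpt
```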
+ +Zgodnie z wynikami `fdisk -l`, możemy stwierdzić, że `/dev/md2` składa się z 1022 MiB, a `/dev/md3` zawiera 474,81 GiB. Jeśli uruchomimy polecenie `mount`, możemy również ustalić układ dysku. + +Alternatywnie, polecenie `lsblk` oferuje inny widok partycji: + +```sh +[user@server_ip ~]# lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:7 0 511M 0 part +├─nvme1n1p2 259:8 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 /boot +├─nvme1n1p3 259:9 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 / +└─nvme1n1p4 259:10 0 512M 0 part [SWAP] +nvme0n1 259:1 0 476.9G 0 disk +├─nvme0n1p1 259:2 0 511M 0 part /boot/efi +├─nvme0n1p2 259:3 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 /boot +├─nvme0n1p3 259:4 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 / +├─nvme0n1p4 259:5 0 512M 0 part [SWAP] +└─nvme0n1p5 259:6 0 2M 0 part +``` + +Ponadto, jeśli uruchomimy `lsblk -f`, otrzymamy więcej informacji o tych partycjach, takich jak Eтикетка i UUID: + +```sh +[user@server_ip ~]# sudo lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA +├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme1n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +└─nvme1n1p4 swap 1 swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7 [SWAP] +nvme0n1 +├─nvme0n1p1 vfat FAT16 EFI_SYSPART B486-9781 504.9M 1% /boot/efi +├─nvme0n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme0n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +├─nvme0n1p4 swap 1 swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b [SWAP] +└─nvme0n1p5 iso9660 Joliet Extension config-2 2025-08-05-14-55-41-00 +``` + +Zwróć uwagę na urządzenia, partycje i ich punkty montowania; to jest ważne, zwłaszcza po wymianie dysku. + +Z powyższych poleceń i wyników mamy: + +- Dwa tablice RAID: `/dev/md2` i `/dev/md3`. +- Cztery partycje, które są częścią RAID: **nvme0n1p2**, **nvme0n1p3**, **nvme1n1p2**, **nvme0n1p3** z punktami montowania `/boot` i `/`. +- Dwie partycje, które nie są częścią RAID, z punktami montowania: `/boot/efi` i [SWAP]. +- Jedna partycja, która nie ma punktu montowania: **nvme1n1p1** + +Partycja **nvme0n1p5** to partycja konfiguracyjna, czyli tylko do odczytu, połączona z serwerem, która dostarcza mu początkowe dane konfiguracyjne. + + + +### Zrozumienie partycji systemu EFI (ESP) + +***Co to jest partycja systemu EFI?*** + +Partycja systemu EFI to partycja, która może zawierać programy uruchamiające system operacyjny, zarządzacze uruchamiania, obrazy jądra lub inne programy systemowe. Może również zawierać programy narzędziowe systemowe zaprojektowane do uruchomienia przed uruchomieniem systemu operacyjnego, a także pliki danych, takie jak dzienniki błędów. + +***Czy partycja systemu EFI jest lustrzana w RAID?*** + +Nie, jak na sierpień 2025, gdy instalacja systemu operacyjnego jest wykonywana przez OVHcloud, ESP nie jest włączona do RAID. Gdy używasz naszych szablonów systemów operacyjnych do instalacji serwera z oprogramowaniem RAID, tworzone są kilka partycji systemu EFI: jedna na dysku. Jednak tylko jedna partycja EFI jest montowana jednocześnie. 
Wszystkie ESP utworzone w czasie instalacji zawierają te same pliki. + +Partycja systemu EFI jest montowana w `/boot/efi` i dysk, na którym jest montowana, jest wybierany przez Linux w czasie uruchamiania. + +Przykład: + +```sh +[user@server_ip ~]# sudo lsblk -f +NAME FSTYPE + +while read -r partition; do + if [[ "${partition}" == "${MAIN_PARTITION}" ]]; then + continue + fi + echo "Working on ${partition}" + mount "${partition}" "${MOUNTPOINT}" + rsync -ax "/boot/efi/" "${MOUNTPOINT}/" + umount "${MOUNTPOINT}" +done < <(blkid -o device -t LABEL=EFI_SYSPART) +``` + +Zapisz i zamknij plik. + +- Ustaw skrypt jako wykonywalny + +```sh +sudo chmod +x script-name.sh +``` + +- Uruchom skrypt + +```sh +sudo ./script-name.sh +``` + +- Jeśli nie jesteś w odpowiednim folderze + +```sh +./path/to/folder/script-name.sh +``` + +Po wykonaniu skryptu zawartość zainstalowanej partycji EFI zostanie zsynchronizowana z pozostałymi. Aby uzyskać dostęp do zawartości, możesz zainstalować dowolną z tych niezainstalowanych partycji EFI na punkcie montażu: `/var/lib/grub/esp`. + + + +### Symulowanie awarii dysku + +Teraz, gdy mamy wszystkie niezbędne informacje, możemy zasymulować awarię dysku i przystąpić do testów. W tym pierwszym przykładzie zasymulujemy awarię głównego dysku `nvme0n1`. + +Preferowanym sposobem jest użycie środowiska trybu ratunkowego OVHcloud. + +Najpierw uruchom serwer w trybie ratunkowym i zaloguj się przy użyciu dostarczonych poświadczeń. + +Aby usunąć dysk z tablicy RAID, pierwszym krokiem jest oznaczenie go jako **Nieprawidłowy** i usunięcie partycji z odpowiednich tablic RAID. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[0] nvme1n1p3[1] + 497875968 blocks super 1.2 [2/2] [UU] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +Z powyższego wyniku wynika, że dysk `nvme0n1` składa się z dwóch partycji w RAID, które to są **nvme0n1p2** i **nvme0n1p3**. + + + +#### Usunięcie zepsutego dysku + +Najpierw oznacz partycje **nvme0n1p2** i **nvme0n1p3** jako zepsute. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/nvme0n1p2 +# mdadm: set /dev/nvme0n1p2 faulty in /dev/md2 +``` + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --fail /dev/nvme0n1p3 +# mdadm: set /dev/nvme0n1p3 faulty in /dev/md3 +``` + +Po uruchomieniu polecenia `cat /proc/mdstat`, otrzymujemy następujące dane wyjściowe: + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2](F) nvme1n1p2[1] + 1046528 blocks super 1.2 [2/1] [_U] + +unused devices: +``` + +Jak widać powyżej, [F] obok partycji wskazuje, że dysk uległ awarii lub jest uszkodzony. + +Następnie usuwamy te partycje z tablic RAID, aby całkowicie usunąć dysk z RAID. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --remove /dev/nvme0n1p2 +# mdadm: hot removed /dev/nvme0n1p2 from /dev/md2 +``` + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --remove /dev/nvme0n1p3 +# mdadm: hot removed /dev/nvme0n1p3 from /dev/md3 +``` + +Status naszego RAID powinien teraz wyglądać tak: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme1n1p2[1] + 1046528 blocks super 1.2 [2/1] [_U] + +unused devices: +``` + +Z powyższych wyników widać, że teraz tylko dwie partycje pojawiają się w tablicach RAID. Pomyślnie zakończyliśmy symulację awarii dysku **nvme0n1**. + +Aby upewnić się, że otrzymamy dysk podobny do pustego, używamy poniższego polecenia na każdej partycji, a następnie na samym dysku: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +shred -s10M -n1 /dev/nvme0n1p1 +shred -s10M -n1 /dev/nvme0n1p2 +shred -s10M -n1 /dev/nvme0n1p3 +shred -s10M -n1 /dev/nvme0n1p4 +shred -s10M -n1 /dev/nvme0n1p5 +shred -s10M -n1 /dev/nvme0n1 +``` + +Dysk teraz wygląda jak nowy, pusty dysk: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk + +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:1 0 511M 0 part +├─nvme1n1p2 259:2 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 +├─nvme1n1p3 259:3 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 +└─nvme1n1p4 259:4 0 512M 0 part +nvme0n1 259:5 0 476.9G 0 disk +``` + +Jeśli uruchomimy poniższe polecenie, zobaczymy, że nasz dysk został pomyślnie "wyczyszczony": + +```sh +parted /dev/nvme0n1 +GNU Parted 3.5 +Using /dev/nvme0n1 +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/nvme0n1: unrecognised disk label +Model: WDC CL SN720 SDAQNTW-512G-2000 (nvme) +Disk /dev/nvme0n1: 512GB +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: +``` + +Aby uzyskać więcej informacji na temat przygotowania i złożenia wniosku o wymianę dysku, zapoznaj się z tym [przewodnikiem](/pages/bare_metal_cloud/dedicated_servers/disk_replacement). + +Jeśli uruchomisz poniższe polecenie, możesz uzyskać więcej szczegółów na temat tablic RAID: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md3 + +/dev/md3: + Version : 1.2 + Creation Time : Fri Aug 1 14:51:13 2025 + Raid Level : raid1 + Array Size : 497875968 (474.81 GiB 509.82 GB) + Used Dev Size : 497875968 (474.81 GiB 509.82 GB) + Raid Devices : 2 + Total Devices : 1 + Persistence : Superblock is persistent + + Intent Bitmap : Internal + + Update Time : Fri Aug 1 15:56:17 2025 + State : clean, degraded + Active Devices : 1 + Working Devices : 1 + Failed Devices : 0 + Spare Devices : 0 + +Consistency Policy : bitmap + + Name : md3 + UUID : b383c3d5:7fb1bb5e:6b7c4d96:6ea817ff + Events : 215 + + Number Major Minor RaidDevice State + - 0 0 0 removed + 1 259 4 1 active sync /dev/nvme1n1p3 +``` + +Teraz możemy przystąpić do wymiany dysku. + + + +### Odbudowanie RAID + +> [!primary] +> Ten proces może się różnić w zależności od systemu operacyjnego zainstalowanego na Twoim serwerze. Zalecamy, abyś zapoznał się z oficjalną dokumentacją swojego systemu operacyjnego, aby uzyskać dostęp do odpowiednich poleceń. 
>

> [!warning]
>
> Dla większości serwerów w oprogramowaniu RAID po wymianie dysku serwer jest w stanie uruchomić się w normalnym trybie (na zdrowym dysku) i odbudować RAID w normalnym trybie. Jednak, jeśli serwer nie będzie mógł uruchomić się w normalnym trybie po wymianie dysku, zostanie uruchomiony w trybie ratunkowym, aby kontynuować odbudowę RAID.
>
> Jeśli Twój serwer będzie mógł uruchomić się w normalnym trybie po wymianie dysku, po prostu wykonaj kroki z [tej sekcji](#rebuilding-the-raid-in-normal-mode).

<a name="rescuemode"></a>

#### Odbudowanie RAID w trybie ratunkowym

Po wymianie dysku następnym krokiem jest skopiowanie tabeli partycji ze zdrowego dysku (w tym przykładzie `nvme1n1`) na nowy (dysk `nvme0n1`).

**Dla partycji GPT**

Polecenie powinno mieć następującą postać: `sgdisk -R /dev/nowy dysk /dev/zdrowy dysk`

W naszym przykładzie:

```sh
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/nvme0n1 /dev/nvme1n1
```

Uruchom `lsblk`, aby upewnić się, że tabele partycji zostały poprawnie skopiowane:

```sh
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk

NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
nvme1n1     259:0    0 476.9G  0 disk
├─nvme1n1p1 259:1    0   511M  0 part
├─nvme1n1p2 259:2    0     1G  0 part
│ └─md2       9:2    0  1022M  0 raid1
├─nvme1n1p3 259:3    0 474.9G  0 part
│ └─md3       9:3    0 474.8G  0 raid1
└─nvme1n1p4 259:4    0   512M  0 part
nvme0n1     259:5    0 476.9G  0 disk
├─nvme0n1p1 259:10   0   511M  0 part
├─nvme0n1p2 259:11   0     1G  0 part
├─nvme0n1p3 259:12   0 474.9G  0 part
└─nvme0n1p4 259:13   0   512M  0 part
```

Po wykonaniu tego kroku należy nadać nowemu dyskowi losowy identyfikator GUID, aby uniknąć konfliktów GUID z innymi dyskami:

```sh
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -G /dev/nvme0n1
```

Jeśli otrzymasz poniższy komunikat:

```console
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8)
The operation has completed successfully.
```

Po prostu uruchom polecenie `partprobe`.

Teraz możemy odbudować tablicę RAID. Poniższy fragment kodu pokazuje, jak dodać nowe partycje (nvme0n1p2 i nvme0n1p3) z powrotem do tablicy RAID.

```sh
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md2 /dev/nvme0n1p2
# mdadm: added /dev/nvme0n1p2
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md3 /dev/nvme0n1p3
# mdadm: re-added /dev/nvme0n1p3
```

Aby sprawdzić proces odbudowy:

```sh
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
md3 : active raid1 nvme0n1p3[2] nvme1n1p3[1]
      497875968 blocks super 1.2 [2/1] [_U]
      [>....................] 
recovery = 0.1% (801920/497875968) finish=41.3min speed=200480K/sec + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] + 1046528 blocks super 1.2 [2/2] [UU] +``` + +Po zakończeniu odbudowy RAID uruchom poniższe polecenie, aby upewnić się, że partycje zostały poprawnie dodane do RAID: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART 4629-D183 +├─nvme1n1p2 linux_raid_member 1.2 md2 83719c5c-2a27-2a56-5268-7d49d8a1d84f +│ └─md2 ext4 1.0 boot 4de80ae0-dd90-4256-9135-1735e7be4b4d +├─nvme1n1p3 linux_raid_member 1.2 md3 b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff +│ └─md3 ext4 1.0 root 9bf386b6-9523-46bf-b8e5-4b8cc7c5786f +└─nvme1n1p4 swap 1 swap-nvme1n1p4 9bf292e8-0145-4d2f-b891-4cef93c0d209 +nvme0n1 +├─nvme0n1p1 +├─nvme0n1p2 linux_raid_member 1.2 md2 83719c5c-2a27-2a56-5268-7d49d8a1d84f +│ └─md2 ext4 1.0 boot 4de80ae0-dd90-4256-9135-1735e7be4b4d +├─nvme0n1p3 linux_raid_member 1.2 md3 b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff +│ └─md3 ext4 1.0 root 9bf386b6-9523-46bf-b8e5-4b8cc7c5786f +└─nvme0n1p4 +``` + +Na podstawie powyższych wyników partycje na nowym dysku zostały poprawnie dodane do RAID. Jednak partycja systemowa EFI i partycja SWAP (w niektórych przypadkach) nie zostały zduplikowane, co jest normalne, ponieważ nie są one uwzględniane w RAID. + +> [!warning] +> Powyższe przykłady ilustrują tylko niezbędne kroki na podstawie domyślnej konfiguracji serwera. Informacje w tabeli wyników zależą od sprzętu serwera i jego schematu partycji. W przypadku wątpliwości skonsultuj dokumentację swojego systemu operacyjnego. +> +> Jeśli potrzebujesz profesjonalnej pomocy z administracją serwerem, zapoznaj się z sekcją [Sprawdź również](#go-further) tego przewodnika. +> + + + +#### Odbudowanie partycji systemowej EFI + +Aby odbudować partycję systemową EFI, należy sformatować **nvme0n1p1** i następnie zrekopilować zawartość zdrowej partycji (w naszym przykładzie: nvme1n1p1) na nią. + +Zakładamy, że obie partycje zostały zsynchronizowane i zawierają aktualne pliki. + +> [!warning] +> Jeśli miało miejsce znaczące uaktualnienie systemu, takie jak jądro lub GRUB, i partycje nie zostały zsynchronizowane, skorzystaj z tej [sekcji](#rebuilding-raid-when-efi-partitions-are-not-synchronized-after-major-system-updates-eg-grub), gdy skończysz tworzyć nową partycję systemową EFI. +> + +Najpierw formatujemy partycję: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkfs.vfat /dev/nvme0n1p1 +``` + +Następnie nadajemy partycji etykietę `EFI_SYSPART` (ta nazwa jest specyficzna dla OVHcloud): + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # fatlabel /dev/nvme0n1p1 EFI_SYSPART +``` + +Następnie kopiujemy zawartość nvme1n1p1 do nvme0n1p1. 
Najpierw tworzymy dwa katalogi, które nazwiemy "old" i "new" w naszym przykładzie: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkdir old new +``` + +Następnie montujemy **nvme1n1p1** w katalogu "old" i **nvme0n1p1** w katalogu "new", aby odróżnić je od siebie: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme1n1p1 old +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme0n1p1 new +``` + +Następnie kopiujemy pliki z katalogu "old" do "new": + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # rsync -axv old/ new/ +sending incremental file list +EFI/ +EFI/debian/ +EFI/debian/BOOTX64.CSV +EFI/debian/fbx64.efi +EFI/debian/grub.cfg +EFI/debian/grubx64.efi +EFI/debian/mmx64.efi +EFI/debian/shimx64.efi + +sent 6,099,848 bytes received 165 bytes 12,200,026.00 bytes/sec +total size is 6,097,843 speedup is 1.00 +``` + +Po wykonaniu tej czynności odmontowujemy obie partycje: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme0n1p1 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme1n1p1 +``` + +Następnie montujemy partycję zawierającą korzeń naszego systemu operacyjnego na `/mnt`. W naszym przykładzie jest to partycja **md3**. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md3 /mnt +``` + +Montujemy następujące katalogi, aby upewnić się, że wszystkie operacje w środowisku `chroot` przebiegną poprawnie: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Następnie korzystamy z polecenia `chroot`, aby uzyskać dostęp do punktu montażowego i upewnić się, że nowa partycja systemowa EFI została poprawnie utworzona i system rozpoznaje obie partycje ESP: + +```sh +root@rescue12-customer-eu:/# chroot /mnt +``` + +Aby wyświetlić partycje ESP, uruchamiamy polecenie `blkid -t LABEL=EFI_SYSPART`: + +```sh +root@rescue12-customer-eu:/# blkid -t LABEL=EFI_SYSPART +/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" +/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" +``` + +Wyniki powyższe pokazują, że nowa partycja EFI została poprawnie utworzona i etykieta została poprawnie zastosowana. + + + +#### Odbudowa RAID, gdy partycje EFI nie są zsynchronizowane po znaczących uaktualnieniach systemu (GRUB) + +/// details | Rozwiń tę sekcję + +> [!warning] +> Postępuj zgodnie z krokami w tej sekcji tylko wtedy, gdy dotyczy to Twojego przypadku. +> + +Gdy partycje systemowe EFI nie są zsynchronizowane po znaczących uaktualnieniach systemu, które modyfikują/lub wpływają na GRUB, a podstawowy dysk, na którym jest zamontowana partycja, zostaje wymieniony, uruchomienie z dysku pomocniczego zawierającego przestarzałą partycję ESP może się nie powieść. + +W takim przypadku, oprócz odbudowy RAID i ponownego utworzenia partycji systemowej EFI w trybie ratunkowym, należy również ponownie zainstalować GRUB na niej. 
+ +Po utworzeniu partycji EFI i upewnieniu się, że system rozpoznaje obie partycje (poprzednie kroki w `chroot`), tworzymy katalog `/boot/efi`, aby zamontować nową partycję systemową EFI **nvme0n1p1**: + +```sh +root@rescue12-customer-eu:/# mount /boot +root@rescue12-customer-eu:/# mount /dev/nvme0n1p1 /boot/efi +``` + +Następnie ponownie instalujemy bootloader GRUB: + +```sh +root@rescue12-customer-eu:/# grub-install --efi-directory=/boot/efi /dev/nvme0n1p1 +``` + +Po wykonaniu tej czynności uruchamiamy poniższe polecenie: + +```sh +root@rescue12-customer-eu:/# update-grub +``` +/// + + + +#### Dodanie etykiety do partycji SWAP (jeśli dotyczy) + +Po zakończeniu pracy z partycją EFI przechodzimy do partycji SWAP. + +Wyjdź z środowiska `chroot` za pomocą `exit`, aby ponownie utworzyć naszą [SWAP] partycję **nvme0n1p4** i dodać etykietę `swap-nvme0n1p4`: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 +Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Sprawdzamy, czy etykieta została poprawnie zastosowana: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS +nvme1n1 + +├─nvme1n1p1 +│ vfat FAT16 EFI_SYSPART +│ BA77-E844 504.9M 1% /root/old +├─nvme1n1p2 +│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 +│ └─md2 +│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac +├─nvme1n1p3 +│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c +│ └─md3 +│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt +└─nvme1n1p4 + swap 1 swap-nvme1n1p4 + d6af33cf-fc15-4060-a43c-cb3b5537f58a +nvme0n1 + +├─nvme0n1p1 +│ vfat FAT16 EFI_SYSPART +│ 477D-6658 +├─nvme0n1p2 +│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 +│ └─md2 +│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac +├─nvme0n1p3 +│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685 + +# mdadm: ponownie dodano /dev/nvme0n1p3 +``` + +Użyj poniższego polecenia, aby śledzić odbudowę RAID: `cat /proc/mdstat`. + +**Odbudowanie partycji systemu EFI na dysku** + +Najpierw instalujemy niezbędne narzędzia: + +**Debian i Ubuntu** + +```sh +[user@server_ip ~]# sudo apt install dosfstools +``` + +**CentOS** + +```sh +[user@server_ip ~]# sudo yum install dosfstools +``` + +Następnie formatujemy partycję. W naszym przykładzie `nvme0n1p1`: + +```sh +[user@server_ip ~]# sudo mkfs.vfat /dev/nvme0n1p1 +``` + +Następnie nadajemy partycji etykietę `EFI_SYSPART` (ta nazwa jest specyficzna dla OVHcloud) + +```sh +[user@server_ip ~]# sudo fatlabel /dev/nvme0n1p1 EFI_SYSPART +``` + +Po wykonaniu tej czynności możesz zsynchronizować obie partycje za pomocą skryptu, który udostępniliśmy [tutaj](#script). 
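Poglądowy przykład dodatkowej weryfikacji (przy założeniu, że nowo utworzona partycja ESP to **nvme0n1p1**): po synchronizacji możesz tymczasowo zamontować tę partycję we wspomnianym wcześniej punkcie montowania i sprawdzić, czy zawiera ona skopiowane pliki:

```sh
# Tymczasowo montujemy nową partycję ESP i wyświetlamy jej zawartość
[user@server_ip ~]# sudo mkdir -p /var/lib/grub/esp
[user@server_ip ~]# sudo mount /dev/nvme0n1p1 /var/lib/grub/esp
[user@server_ip ~]# sudo ls /var/lib/grub/esp/EFI
[user@server_ip ~]# sudo umount /var/lib/grub/esp
```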
+ +Sprawdzamy, czy nowa partycja systemu EFI została poprawnie utworzona i system ją rozpoznaje: + +```sh +[user@server_ip ~]# sudo blkid -t LABEL=EFI_SYSPART +/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" +/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" +``` + +Na koniec aktywujemy partycję [SWAP] (jeśli dotyczy): + + +- Tworzymy i dodajemy etykietę: + +```sh +[user@server_ip ~]# sudo mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 +``` + +- Pobieramy UUID obu partycji swap: + +```sh +[user@server_ip ~]# sudo blkid -s /dev/nvme0n1p4 +/dev/nvme0n1p4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +[user@server_ip ~]# sudo blkid -s /dev/nvme1n1p4 +/dev/nvme1n1p4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +- Zastępujemy stary UUID partycji swap (**nvme0n1p4)** nowym w pliku `/etc/fstab`: + +```sh +[user@server_ip ~]# sudo nano /etc/fstab +``` + +Przykład: + +```sh +[user@server_ip ~]# sudo nano /etc/fstab +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +Na podstawie powyższych wyników, stary UUID to `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` i powinien zostać zastąpiony nowym `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. + +Upewnij się, że zastępujesz poprawny UUID. + +Następnie uruchamiamy poniższe polecenie, aby aktywować partycję swap: + +```sh +[user@server_ip ~]# sudo swapon -av +swapon: /dev/nvme0n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme0n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme0n1p4 +swapon: /dev/nvme1n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme1n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme1n1p4 +``` + +Następnie ponownie ładowujemy system: + +```sh +[user@server_ip ~]# sudo systemctl daemon-reload +``` + +Pomyślnie ukończono odbudowę RAID. + +## Sprawdź również + +[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) + +[OVHcloud API i Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) + +[Zarządzanie hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) + +[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) + +Dla usług specjalistycznych (SEO, programowanie itp.), skontaktuj się z [partnerami OVHcloud](/links/partner). + +Jeśli potrzebujesz pomocy w użyciu i konfiguracji rozwiązań OVHcloud, zapoznaj się z naszymi [ofertami wsparcia](/links/support). + +Jeśli potrzebujesz szkoleń lub pomocy technicznej w wdrożeniu naszych rozwiązań, skontaktuj się ze swoim przedstawicielem handlowym lub kliknij [ten link](/links/professional-services), aby uzyskać wycenę i zapytać ekspertów z Professional Services o pomoc w konkretnym przypadku użycia projektu. + +Dołącz do [grona naszych użytkowników](/links/community). 
\ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pt-pt.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pt-pt.md new file mode 100644 index 00000000000..d3053240d60 --- /dev/null +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pt-pt.md @@ -0,0 +1,905 @@ +--- +title: "Gestão e reconstrução de um RAID software nos servidores que utilizam o modo de arranque UEFI" +excerpt: Descubra como gerir e reconstruir um RAID software após a substituição de disco num servidor que utiliza o modo de arranque UEFI +updated: 2025-12-11 +--- + +## Objetivo + +Um Redundant Array of Independent Disks (RAID) é uma tecnologia que atenua a perda de dados num servidor ao replicar os dados em dois discos ou mais. + +O nível RAID predefinido para as instalações de servidores OVHcloud é o RAID 1, que duplica o espaço ocupado pelos seus dados, reduzindo assim o espaço de disco utilizável para metade. + +**Este guia explica como gerir e reconstruir um RAID software após a substituição de disco no seu servidor em modo EFI** + +Antes de começar, note que este guia foca-se nos servidores dedicados que utilizam o modo UEFI como modo de arranque. Este é o caso das placas-mãe modernas. Se o seu servidor utiliza o modo de arranque legacy (BIOS), consulte este guia: [Gestão e reconstrução de um RAID software em servidores no modo de arranque legacy (BIOS)](/pages/bare_metal_cloud/dedicated_servers/raid_soft_bios). + +Para verificar se um servidor está a funcionar no modo BIOS legacy ou no modo UEFI, execute o seguinte comando: + +```sh +[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS +``` + +Para mais informações sobre a UEFI, consulte o seguinte artigo: [https://uefi.org/about](https://uefi.org/about). + +## Requisitos + +- Um [servidor dedicado](/links/bare-metal/bare-metal) com uma configuração RAID software +- Acesso administrativo (sudo) ao servidor através de SSH +- Compreensão do RAID, partições e GRUB + +Ao longo deste guia, utilizamos os termos **disco principal** e **disco secundário**. Neste contexto: + +- O disco principal é o disco cuja ESP (EFI System Partition) está montada pelo Linux +- Os discos secundários são todos os outros discos do RAID + +## Instruções + +Quando compra um novo servidor, pode sentir a necessidade de realizar uma série de testes e ações. Um destes testes pode ser simular uma falha de disco para compreender o processo de reconstrução do RAID e preparar-se em caso de problema. + +### Visão geral do conteúdo + +- [Informações básicas](#basicinformation) +- [Compreensão da partição do sistema EFI (ESP)](#efisystemparition) +- [Simulação de uma falha de disco](#diskfailure) + - [Remoção do disco defeituoso](#diskremove) +- [Reconstrução do RAID](#raidrebuild) + - [Reconstrução do RAID após a substituição do disco principal (modo de recuperação)](#rescuemode) + - [Recriação da partição do sistema EFI](#recreateesp) + - [Reconstrução do RAID quando as partições EFI não estão sincronizadas após atualizações maiores do sistema (ex. 
GRUB)](efiraodgrub) + - [Adição da etiqueta à partição SWAP (se aplicável)](#swap-partition) + - [Reconstrução do RAID em modo normal](#normalmode) + + + +### Informações básicas + +Numa sessão de linha de comandos, introduza o seguinte comando para determinar o estado atual do RAID : + +```sh +[user@server_ip ~]# cat /proc/mdstat +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md3 : active raid1 nvme1n1p3[1] nvme0n1p3[0] + 497875968 blocks super 1.2 [2/2] [UU] + bitmap: 2/4 pages [8KB], 65536KB chunk + +md2 : active raid1 nvme1n1p2[1] nvme0n1p2[0] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +Este comando mostra-nos que temos atualmente dois volumes RAID software configurados, **md2** e **md3**, com **md3** sendo o maior dos dois. **md3** é composto por duas partições, chamadas **nvme1n1p3** e **nvme0n1p3**. + +O [UU] significa que todos os discos estão a funcionar normalmente. Um `_` indicaria um disco defeituoso. + +Se tiver um servidor com discos SATA, obterá os seguintes resultados : + +```sh +[user@server_ip ~]# cat /proc/mdstat +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md3 : active raid1 sda3[0] sdb3[1] + 3904786432 blocks super 1.2 [2/2] [UU] + bitmap: 2/30 pages [8KB], 65536KB chunk + +md2 : active raid1 sda2[0] sdb2[1] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +Embora este comando devolva os nossos volumes RAID, não nos indica o tamanho das próprias partições. Podemos encontrar esta informação com o seguinte comando : + +```sh +[user@server_ip ~]# sudo fdisk -l + +Disk /dev/nvme1n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors +Disk model: WDC CL SN720 SDAQNTW-512G-2000 +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +Disklabel type: gpt +Disk identifier: A11EDAA3-A984-424B-A6FE-386550A92435 + +Device Start End Sectors Size Type +/dev/nvme1n1p1 2048 1048575 1046528 511M EFI System +/dev/nvme1n1p2 1048576 3145727 2097152 1G Linux RAID +/dev/nvme1n1p3 3145728 999161855 996016128 474.9G Linux RAID +/dev/nvme1n1p4 999161856 1000210431 1048576 512M Linux files + + +Disk /dev/nvme0n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors +Disk model: WDC CL SN720 SDAQNTW-512G-2000 +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +Disklabel type: gpt +Disk identifier: F03AC3C3-D7B7-43F9-88DB-9F12D7281D94 + +Device Start End Sectors Size Type +/dev/nvme0n1p1 2048 1048575 1046528 511M EFI System +/dev/nvme0n1p2 1048576 3145727 2097152 1G Linux RAID +/dev/nvme0n1p3 3145728 999161855 996016128 474.9G Linux RAID +/dev/nvme0n1p4 999161856 1000210431 1048576 512M Linux file +/dev/nvme0n1p5 1000211120 1000215182 4063 2M Linux file + + +Disk /dev/md2: 1022 MiB, 1071644672 bytes, 2093056 sectors +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes + + +Disk /dev/md3: 474.81 GiB, 509824991232 bytes, 995751936 sectors +Units: sectors of 1 * 512 = 512 bytes +Sector size (logical/physical): 512 bytes / 512 bytes +I/O size (minimum/optimal): 512 bytes / 512 bytes +``` + +O comando `fdisk -l` permite também identificar o tipo das suas partições. Esta é uma informação importante durante a reconstrução do seu RAID em caso de falha de disco. 
+ +Para as partições **GPT**, a linha 6 mostrará: `Disklabel type: gpt`. + +Também com base nos resultados de `fdisk -l`, podemos ver que `/dev/md2` é composto por 1022 MiB e `/dev/md3` contém 474,81 GiB. Se executarmos o comando `mount`, também podemos encontrar a disposição dos discos. + +Como alternativa, o comando `lsblk` oferece uma visão diferente das partições : + +```sh +[user@server_ip ~]# lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:7 0 511M 0 part +├─nvme1n1p2 259:8 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 /boot +├─nvme1n1p3 259:9 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 / +└─nvme1n1p4 259:10 0 512M 0 part [SWAP] +nvme0n1 259:1 0 476.9G 0 disk +├─nvme0n1p1 259:2 0 511M 0 part /boot/efi +├─nvme0n1p2 259:3 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 /boot +├─nvme0n1p3 259:4 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 / +├─nvme0n1p4 259:5 0 512M 0 part [SWAP] +└─nvme0n1p5 259:6 0 2M 0 part +``` + +Além disso, se executarmos `lsblk -f`, obtemos mais informações sobre estas partições, tais como o LABEL e o UUID : + +```sh +[user@server_ip ~]# sudo lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA +├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme1n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +└─nvme1n1p4 swap 1 swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7 [SWAP] +nvme0n1 +├─nvme0n1p1 vfat FAT16 EFI_SYSPART B486-9781 504.9M 1% /boot/efi +├─nvme0n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot +├─nvme0n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 +│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / +├─nvme0n1p4 swap 1 swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b [SWAP] +└─nvme0n1p5 iso9660 Joliet Extension config-2 2025-08-05-14-55-41-00 +``` + +Note os dispositivos, as partições e os seus pontos de montagem; isto é importante, especialmente após a substituição de um disco. + +A partir dos comandos e resultados acima, temos : + +- Duas matrizes RAID: `/dev/md2` e `/dev/md3`. +- Quatro partições que fazem parte do RAID: **nvme0n1p2**, **nvme0n1p3**, **nvme1n1p2**, **nvme0n1p3** com os pontos de montagem `/boot` e `/`. +- Duas partições não incluídas no RAID, com os pontos de montagem: `/boot/efi` e [SWAP]. +- Uma partição que não possui ponto de montagem: **nvme1n1p1** + +A partição `nvme0n1p5` é uma partição de configuração, ou seja, um volume somente leitura ligado ao servidor que lhe fornece os dados de configuração inicial. + + + +### Compreender a partição do sistema EFI (ESP) + +***O que é uma partição do sistema EFI ?*** + +Uma partição do sistema EFI é uma partição na qual o servidor inicia. Contém os ficheiros de arranque, bem como os gestores de arranque ou as imagens do núcleo de um sistema operativo instalado. Pode também conter programas utilitários concebidos para serem executados antes que o sistema operativo inicie, bem como ficheiros de dados tais como registos de erros. + +***A partição do sistema EFI está incluída no RAID ?*** + +Não, a partir de agosto de 2025, quando uma instalação do sistema operativo é efetuada pela OVHcloud, a partição ESP não está incluída no RAID. 
Quando utiliza os nossos modelos de SO para instalar o seu servidor com um RAID software, várias partições do sistema EFI são criadas: uma por disco. No entanto, apenas uma partição EFI é montada de cada vez. Todas as ESP criadas contêm os mesmos ficheiros. Todas as ESP criadas no momento da instalação contêm os mesmos ficheiros. + +A partição do sistema EFI é montada em `/boot/efi` e o disco no qual está montada é selecionado pelo Linux no arranque. + +Exemplo : + +```sh +[user@server_ip ~]# sudo lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT +nvme1n1 +├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA +├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea +│ └─md2 ext4 1.0 boot 96850c4e-e2b + +Recomendamos que sincronize os seus ESP regularmente ou após cada atualização importante do sistema. Por defeito, todas as partições do sistema EFI contêm os mesmos ficheiros após a instalação. No entanto, se uma atualização importante do sistema estiver envolvida, a sincronização dos ESP é essencial para manter o conteúdo atualizado. + + + +#### Script + +Aqui está um script que pode utilizar para os sincronizar manualmente. Também pode executar um script automatizado para sincronizar as partições diariamente ou sempre que o serviço iniciar. + +Antes de executar o script, certifique-se de que o `rsync` está instalado no seu sistema : + +**Debian/Ubuntu** + +```sh +sudo apt install rsync +``` + +**CentOS, Red Hat e Fedora** + +```sh +sudo yum install rsync +``` + +Para executar um script em Linux, necessita de um ficheiro executável : + +- Comece por criar um ficheiro .sh no diretório da sua escolha, substituindo `nome-do-script` pelo nome da sua escolha. + +```sh +sudo touch nome-do-script.sh +``` + +- Abra o ficheiro com um editor de texto e adicione as seguintes linhas : + +```sh +sudo nano nome-do-script.sh +``` + +```sh +#!/bin/bash + +set -euo pipefail + +MOUNTPOINT="/var/lib/grub/esp" +MAIN_PARTITION=$(findmnt -n -o SOURCE /boot/efi) + +echo "${MAIN_PARTITION} é a partição principal" + +mkdir -p "${MOUNTPOINT}" + +while read -r partition; do + if [[ "${partition}" == "${MAIN_PARTITION}" ]]; then + continue + fi + echo "Trabalhando em ${partition}" + mount "${partition}" "${MOUNTPOINT}" + rsync -ax "/boot/efi/" "${MOUNTPOINT}/" + umount "${MOUNTPOINT}" +done < <(blkid -o device -t LABEL=EFI_SYSPART) +``` + +Guarde e feche o ficheiro. + +- Torne o script executável + +```sh +sudo chmod +x nome-do-script.sh +``` + +- Execute o script + +```sh +sudo ./nome-do-script.sh +``` + +- Se não estiver no diretório + +```sh +./caminho/para/o/diretório/nome-do-script.sh +``` + +Quando o script é executado, o conteúdo da partição EFI montada será sincronizado com as outras. Para aceder ao conteúdo, pode montar uma destas partições EFI não montadas no ponto de montagem: `/var/lib/grub/esp`. + + + +### Simulação de uma falha de disco + +Agora que temos todas as informações necessárias, podemos simular uma falha de disco e proceder aos testes. Neste primeiro exemplo, vamos provocar uma falha no disco principal `nvme0n1`. + +O método preferido para o fazer é através do modo rescue da OVHcloud. + +Reinicie primeiro o servidor em modo rescue e ligue-se com as credenciais fornecidas. + +Para retirar um disco do RAID, o primeiro passo é marcá-lo como **Failed** e retirar as partições dos seus arrays RAID respetivos. 
+ +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[0] nvme1n1p3[1] + 497875968 blocks super 1.2 [2/2] [UU] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] + 1046528 blocks super 1.2 [2/2] [UU] + +unused devices: +``` + +A partir do resultado acima, nvme0n1 contém duas partições em RAID que são **nvme0n1p2** e **nvme0n1p3**. + + + +#### Remoção do disco defeituoso + +Primeiro, marcamos as partições **nvme0n1p2** e **nvme0n1p3** como defeituosas. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/nvme0n1p2 +# mdadm: set /dev/nvme0n1p2 faulty in /dev/md2 +``` + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --fail /dev/nvme0n1p3 +# mdadm: set /dev/nvme0n1p3 faulty in /dev/md3 +``` + +Quando executamos o comando `cat /proc/mdstat`, obtemos : + +```sh +root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme0n1p2[2](F) nvme1n1p2[1] + 1046528 blocks super 1.2 [2/1] [_U] + +unused devices: +``` + +Como podemos ver acima, o [F] ao lado das partições indica que o disco está defeituoso ou em falha. + +Em seguida, retiramos estas partições dos arrays RAID para eliminar completamente o disco do RAID. + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --remove /dev/nvme0n1p2 +# mdadm: hot removed /dev/nvme0n1p2 from /dev/md2 +``` + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --remove /dev/nvme0n1p3 +# mdadm: hot removed /dev/nvme0n1p3 from /dev/md3 +``` + +O estado do nosso RAID deverá agora assemelhar-se a isto : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] +md3 : active raid1 nvme1n1p3[1] + 497875968 blocks super 1.2 [2/1] [_U] + bitmap: 0/4 pages [0KB], 65536KB chunk + +md2 : active raid1 nvme1n1p2[1] + 1046528 blocks super 1.2 [2/1] [_U] + +unused devices: +``` + +De acordo com os resultados acima, podemos ver que agora existem apenas duas partições nos arrays RAID. Conseguimos degradar com sucesso o disco **nvme0n1**. 
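Nota: durante estes testes, se pretender acompanhar a evolução do estado das matrizes RAID em tempo real, pode, por exemplo, utilizar o comando `watch`, se estiver disponível no ambiente rescue. O exemplo abaixo é meramente ilustrativo:

```sh
# Atualiza a saída de /proc/mdstat a cada 2 segundos (Ctrl+C para sair)
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # watch -n 2 cat /proc/mdstat
```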
+ +Para nos certificarmos de obter um disco semelhante a um disco vazio, utilizamos o seguinte comando em cada partição, seguido do próprio disco : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +shred -s10M -n1 /dev/nvme0n1p1 +shred -s10M -n1 /dev/nvme0n1p2 +shred -s10M -n1 /dev/nvme0n1p3 +shred -s10M -n1 /dev/nvme0n1p4 +shred -s10M -n1 /dev/nvme0n1p5 +shred -s10M -n1 /dev/nvme0n1 +``` + +O disco aparece agora como um disco novo e vazio : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk + +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS +nvme1n1 259:0 0 476.9G 0 disk +├─nvme1n1p1 259:1 0 511M 0 part +├─nvme1n1p2 259:2 0 1G 0 part +│ └─md2 9:2 0 1022M 0 raid1 +├─nvme1n1p3 259:3 0 474.9G 0 part +│ └─md3 9:3 0 474.8G 0 raid1 +└─nvme1n1p4 259:4 0 512M 0 part +nvme0n1 259:5 0 476.9G 0 disk +``` + +Se executarmos o seguinte comando, verificamos que o nosso disco foi corretamente "apagado" : + +```sh +parted /dev/nvme0n1 +GNU Parted 3.5 +Using /dev/nvme0n1 +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/nvme0n1: unrecognised disk label +Model: WDC CL SN720 SDAQNTW-512G-2000 (nvme) +Disk /dev/nvme0n1: 512GB +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: +``` + +Para mais informações sobre a preparação e a solicitação de substituição de um disco, consulte este [guia](/pages/bare_metal_cloud/dedicated_servers/disk_replacement). + +Se executar o seguinte comando, pode obter mais detalhes sobre os arrays RAID : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md3 + +/dev/md3: + Version : 1.2 + Creation Time : Fri Aug 1 14:51:13 2025 + Raid Level : raid1 + Array Size : 497875968 (474.81 GiB 509.82 GB) + Used Dev Size : 497875968 (474.81 GiB 509.82 GB) + Raid Devices : 2 + Total Devices : 1 + Persistence : Superblock is persistent + + Intent Bitmap : Internal + + Update Time : Fri Aug 1 15:56:17 2025 + State : clean, degraded + Active Devices : 1 + Working Devices : 1 + Failed Devices : 0 + Spare Devices : 0 + +Consistency Policy : bitmap + + Name : md3 + UUID : b383c3d5:7fb1bb5e:6b7c4d96:6ea817ff + Events : 215 + + Number Major Minor RaidDevice State + - 0 0 0 removed + 1 259 4 1 active sync /dev/nvme1n1p3 +``` + +Agora podemos proceder à substituição do disco. + + + +### Reconstrução do RAID + +> [!primary] +> Este processo pode variar consoante o sistema operativo instalado no seu servidor. Recomendamos que consulte a documentação oficial do seu sistema operativo para obter os comandos adequados. +> + +> [!warning] +> +> Na maioria dos servidores com RAID software, após a substituição de um disco, o servidor é capaz de arrancar em modo normal (no disco saudável) e a reconstrução pode ser efetuada em modo normal. No entanto, se o servidor não conseguir arrancar em modo normal após a substituição do disco, reiniciará em modo rescue para proceder à reconstrução do RAID. +> +> Se o seu servidor for capaz de arrancar em modo normal após a substituição do disco, siga apenas as etapas da [secção seguinte](#rebuilding-the-raid-in-normal-mode). + + + +#### Reconstrução do RAID em modo rescue + +Uma vez o disco substituído, o próximo passo consiste em copiar a tabela de partições do disco saudável (neste exemplo, nvme1n1) para o novo (nvme0n1). 
+
+**Para as partições GPT**
+
+O comando deve seguir este formato: `sgdisk -R /dev/disco_novo /dev/disco_saudável`
+
+No nosso exemplo:
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/nvme0n1 /dev/nvme1n1
+```
+
+Execute `lsblk` para se certificar de que as tabelas de partições foram corretamente copiadas:
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk
+
+NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
+nvme1n1     259:0    0 476.9G  0 disk
+├─nvme1n1p1 259:1    0   511M  0 part
+├─nvme1n1p2 259:2    0     1G  0 part
+│ └─md2       9:2    0  1022M  0 raid1
+├─nvme1n1p3 259:3    0 474.9G  0 part
+│ └─md3       9:3    0 474.8G  0 raid1
+└─nvme1n1p4 259:4    0   512M  0 part
+nvme0n1     259:5    0 476.9G  0 disk
+├─nvme0n1p1 259:10   0   511M  0 part
+├─nvme0n1p2 259:11   0     1G  0 part
+├─nvme0n1p3 259:12   0 474.9G  0 part
+└─nvme0n1p4 259:13   0   512M  0 part
+```
+
+Uma vez feito isto, o próximo passo consiste em atribuir um GUID aleatório ao novo disco para evitar conflitos de GUID com outros discos:
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -G /dev/nvme0n1
+```
+
+Se receber a seguinte mensagem:
+
+```console
+Warning: The kernel is still using the old partition table.
+The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8)
+The operation has completed successfully.
+```
+
+Execute simplesmente o comando `partprobe`.
+
+Agora podemos reconstruir o array RAID. O seguinte trecho de código mostra como adicionar novamente as novas partições (nvme0n1p2 e nvme0n1p3) ao array RAID.
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md2 /dev/nvme0n1p2
+# mdadm: added /dev/nvme0n1p2
+
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md3 /dev/nvme0n1p3
+# mdadm: re-added /dev/nvme0n1p3
+```
+
+Para verificar o processo de reconstrução:
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat
+Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
+md3 : active raid1 nvme0n1p3[2] nvme1n1p3[1]
+      497875968 blocks super 1.2 [2/1] [_U]
+      [>....................]  recovery =  0.1% (801920/497875968) finish=41.3min speed=200480K/sec
+      bitmap: 0/4 pages [0KB], 65536KB chunk
+
+md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1]
+      1046528 blocks super 1.2 [2/2] [UU]
+```
+
+Uma vez a reconstrução do RAID terminada, execute o seguinte comando para se assegurar de que as partições foram corretamente adicionadas ao RAID:
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f
+NAME        FSTYPE            FSVER LABEL          UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
+nvme1n1
+├─nvme1n1p1 vfat              FAT16 EFI_SYSPART    4629-D183
+├─nvme1n1p2 linux_raid_member 1.2   md2            83719c5c-2a27-2a56-5268-7d49d8a1d84f
+│ └─md2     ext4              1.0   boot           4de80ae0-dd90-4256-9135-1735e7be4b4d
+├─nvme1n1p3 linux_raid_member 1.2   md3            b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff
+│ └─md3     ext4              1.0   root           9bf386b6-9523-46bf-b8e5-4b8cc7c5786f
+└─nvme1n1p4 swap              1     swap-nvme1n1p4 9bf292e8-0145-4d2f-b891-4cef93c0d209
+nvme0n1
+├─nvme0n1p1
+├─nvme0n1p2 linux_raid_member 1.2   md2            83719c5c-2a27-2a56-5268-7d49d8a1d84f
+│ └─md2     ext4              1.0   boot           4de80ae0-dd90-4256-9135-1735e7be4b4d
+├─nvme0n1p3 linux_raid_member 1.2   md3            b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff
+│ └─md3     ext4              1.0   root           9bf386b6-9523-46bf-b8e5-4b8cc7c5786f
+└─nvme0n1p4
+```
+
+De acordo com os resultados acima, as partições do novo disco foram corretamente adicionadas ao RAID. No entanto, a partição EFI System e a partição SWAP (em alguns casos) não foram duplicadas, o que é normal, pois não fazem parte do RAID.
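+
+Em vez de repetir `cat /proc/mdstat` até ao fim da reconstrução, também é possível bloquear até a ressincronização terminar — um esboço opcional:
+
+```sh
+# Esboço: aguardar o fim da ressincronização dos dois arrays antes de continuar
+mdadm --wait /dev/md2 /dev/md3
+
+# Alternativa interativa: acompanhar o progresso a cada 5 segundos
+watch -n 5 cat /proc/mdstat
+```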
+ +> [!warning] +> Os exemplos acima ilustram apenas as etapas necessárias com base numa configuração de servidor predefinida. Os resultados de cada comando dependem do tipo de hardware instalado no seu servidor e da estrutura das suas partições. Em caso de dúvida, consulte a documentação do seu sistema operativo. +> +> Se precisar de assistência profissional para a administração do seu servidor, consulte os detalhes da secção [Quer saber mais?](#go-further) deste guia. +> + + + +#### Recriação da partição EFI System + +Para recolocar a partição EFI System, temos de formatar **nvme0n1p1** e replicar o conteúdo da partição EFI System saudável (no nosso exemplo: nvme1n1p1) para esta. + +Aqui, assumimos que as duas partições foram sincronizadas e contêm ficheiros actualizados ou simplesmente não sofreram actualizações do sistema com impacto no *bootloader*. + +> [!warning] +> Se uma actualização importante do sistema, tal como uma actualização do núcleo ou do GRUB, ocorreu e as duas partições não foram sincronizadas, consulte esta [secção](#rebuilding-raid-when-efi-partitions-are-not-synchronized-after-major-system-updates-eg-grub) uma vez que tenha terminado a criação da nova partição EFI System. +> + +Em primeiro lugar, formatamos a partição : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkfs.vfat /dev/nvme0n1p1 +``` + +Em seguida, atribuímos a etiqueta `EFI_SYSPART` à partição. (este nome é específico da OVHcloud) : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # fatlabel /dev/nvme0n1p1 EFI_SYSPART +``` + +Em seguida, replicamos o conteúdo de nvme1n1p1 para nvme0n1p1. Começamos por criar dois diretórios, que chamamos « old » e « new » no nosso exemplo : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkdir old new +``` + +Em seguida, montamos **nvme1n1p1** no diretório « old » e **nvme0n1p1** no diretório « new » para diferenciá-los : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme1n1p1 old +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme0n1p1 new +``` + +Em seguida, copiamos os ficheiros do diretório 'old' para 'new' : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # rsync -axv old/ new/ +sending incremental file list +EFI/ +EFI/debian/ +EFI/debian/BOOTX64.CSV +EFI/debian/fbx64.efi +EFI/debian/grub.cfg +EFI/debian/grubx64.efi +EFI/debian/mmx64.efi +EFI/debian/shimx64.efi + +sent 6,099,848 bytes received 165 bytes 12,200,026.00 bytes/sec +total size is 6,097,843 speedup is 1.00 +``` + +Uma vez feito isto, desmontamos as duas partições : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme0n1p1 +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme1n1p1 +``` + +Em seguida, montamos a partição contendo a raiz do nosso sistema operativo em `/mnt`. 
No nosso exemplo, esta partição é **md3**: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md3 /mnt +``` + +Montamos os seguintes diretórios para nos assegurarmos que qualquer manipulação que realizamos no ambiente `chroot` funciona corretamente : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # +mount --types proc /proc /mnt/proc +mount --rbind /sys /mnt/sys +mount --make-rslave /mnt/sys +mount --rbind /dev /mnt/dev +mount --make-rslave /mnt/dev +mount --bind /run /mnt/run +mount --make-slave /mnt/run +``` + +Em seguida, utilizamos o comando `chroot` para aceder ao ponto de montagem e assegurar-nos que a nova partição do sistema EFI foi corretamente criada e que o sistema reconhece as duas ESP : + +```sh +root@rescue12-customer-eu:/# chroot /mnt +``` + +Para mostrar as partições ESP, executamos o comando `blkid -t LABEL=EFI_SYSPART` : + +```sh +root@rescue12-customer-eu:/# blkid -t LABEL=EFI_SYSPART +/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" +/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" +``` + +Os resultados acima mostram que a nova partição EFI foi criada corretamente e que a etiqueta foi aplicada corretamente. + + + +#### Reconstrução do RAID quando as partições EFI não estão sincronizadas após actualizações importantes do sistema (GRUB) + +/// details | Desenvolva esta secção + +> [!warning] +> Siga as etapas desta secção apenas se se aplicarem ao seu caso. +> + +Quando as partições do sistema EFI não estão sincronizadas após actualizações importantes do sistema que modificam/afetam o GRUB, e o disco principal no qual a partição está montada é substituído, o arranque a partir de um disco secundário contendo uma ESP obsoleta pode não funcionar. + +Neste caso, para além de reconstruir o RAID e recolocar a partição do sistema EFI no modo rescue, também deve reinstalar o GRUB nela. + +Uma vez que tenhamos recolocado a partição EFI e nos certificamos que o sistema reconhece as duas partições (etapas anteriores no `chroot`), criamos a pasta `/boot/efi` para montar a nova partição do sistema EFI **nvme0n1p1** : + +```sh +root@rescue12-customer-eu:/# mount /boot +root@rescue12-customer-eu:/# mount /dev/nvme0n1p1 /boot/efi +``` + +Em seguida, reinstalamos o carregador de arranque GRUB (*bootloader*) : + +```sh +root@rescue12-customer-eu:/# grub-install --efi-directory=/boot/efi /dev/nvme0n1p1 +``` + +Uma vez feito isto, execute o seguinte comando : + +```sh +root@rescue12-customer-eu:/# update-grub +``` +/// + + + +#### Adição da etiqueta à partição SWAP (se aplicável) + +Uma vez que tenhamos terminado com a partição EFI, passamos à partição SWAP. 
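+
+Como verificação opcional, a convenção de etiquetas pode ser confirmada na partição SWAP do disco saudável (no nosso exemplo, **nvme1n1p4**) — um esboço:
+
+```sh
+# Esboço: consultar a etiqueta e o UUID da partição SWAP existente para replicar o padrão
+blkid -s LABEL -s UUID /dev/nvme1n1p4
+```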
+ +Sair do ambiente `chroot` com `exit` para recolocar a nossa partição [SWAP] **nvme0n1p4** e adicionar a etiqueta `swap-nvme0n1p4` : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 +Setting up swapspace version 1, size = 512 MiB (536866816 bytes) +LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +``` + +Verificamos que a etiqueta foi corretamente aplicada : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f +NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS +nvme1n1 + +├─nvme1n1p1 +│ vfat FAT16 EFI_SYSPART +│ BA77-E844 504.9M 1% /root/old +├─nvme1n1p2 +│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 +│ └─md2 +│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac +├─nvme1n1p3 +│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c +│ └─md3 +│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt +└─nvme1n1p4 + swap 1 swap-nvme1n1p4 + d6af33cf-fc15-4060-a43c-cb3b5537f58a +nvme0n1 + +├─nvme0n1p1 +│ vfat FAT16 EFI_SYSPART +│ 477D-6658 +├─nvme0n1p2 +│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 +│ └─md2 +│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac +├─nvme0n1p3 +│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c +│ └─md3 +│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt +└─nvme0n1p4 + swap 1 swap-nvme0n1p4 + b3c9e03a-52f5-4683-81b6-cc10091fcd15 +``` + +Acedemos novamente ao ambiente `chroot` : + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt +``` + +Recuperamos o UUID das + +# mdadm: re-added /dev/nvme0n1p3 +``` + +Utilize o seguinte comando para seguir a reconstrução do RAID: `cat /proc/mdstat`. + +**Recriação da partição do sistema EFI no disco** + +Primeiro, instalamos as ferramentas necessárias: + +**Debian e Ubuntu** + +```sh +[user@server_ip ~]# sudo apt install dosfstools +``` + +**CentOS** + +```sh +[user@server_ip ~]# sudo yum install dosfstools +``` + +Em seguida, formatamos a partição. No nosso exemplo `nvme0n1p1`: + +```sh +[user@server_ip ~]# sudo mkfs.vfat /dev/nvme0n1p1 +``` + +Em seguida, atribuímos a etiqueta `EFI_SYSPART` à partição. (este nome é específico da OVHcloud): + +```sh +[user@server_ip ~]# sudo fatlabel /dev/nvme0n1p1 EFI_SYSPART +``` + +Depois disso, pode sincronizar as duas partições com o script que fornecemos [aqui](#script). 
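+
+Se preferir não utilizar o script, uma sincronização pontual pode ser feita manualmente. Um esboço, assumindo que a ESP ativa está montada em `/boot/efi` e que a nova partição é `nvme0n1p1`:
+
+```sh
+# Esboço: copiar o conteúdo da ESP montada para a nova partição EFI
+sudo mkdir -p /var/lib/grub/esp
+sudo mount /dev/nvme0n1p1 /var/lib/grub/esp
+sudo rsync -ax /boot/efi/ /var/lib/grub/esp/
+sudo umount /var/lib/grub/esp
+```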
+ +Verificamos que a nova partição do sistema EFI foi corretamente criada e que o sistema a reconhece: + +```sh +[user@server_ip ~]# sudo blkid -t LABEL=EFI_SYSPART +/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" +/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" +``` + +Por fim, ativamos a partição [SWAP] (se aplicável): + +- Criamos e adicionamos a etiqueta: + +```sh +[user@server_ip ~]# sudo mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 +``` + +- Recuperamos os UUID das duas partições swap: + +```sh +[user@server_ip ~]# sudo blkid -s /dev/nvme0n1p4 +/dev/nvme0n1p4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" +[user@server_ip ~]# sudo blkid -s /dev/nvme1n1p4 +/dev/nvme1n1p4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" +``` + +- Substituímos o antigo UUID da partição swap (**nvme0n1p4)** pelo novo em `/etc/fstab`: + +```sh +[user@server_ip ~]# sudo nano /etc/fstab +``` + +Exemplo: + +```sh +[user@server_ip ~]# sudo nano /etc/fstab +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 +``` + +De acordo com os resultados acima, o antigo UUID é `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` e deve ser substituído pelo novo `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. + +Certifique-se de substituir o UUID correto. + +Em seguida, executamos o seguinte comando para ativar a partição swap: + +```sh +[user@server_ip ~]# sudo swapon -av +swapon: /dev/nvme0n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme0n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme0n1p4 +swapon: /dev/nvme1n1p4: found signature [pagesize=4096, signature=swap] +swapon: /dev/nvme1n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 +swapon /dev/nvme1n1p4 +``` + +Em seguida, recarregamos o sistema: + +```sh +[user@server_ip ~]# sudo systemctl daemon-reload +``` + +Agora terminámos com sucesso a reconstrução do RAID. + +## Quer saber mais? + +[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) + +[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) + +[Managing hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) + +[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) + +Para serviços especializados (SEO, desenvolvimento, etc.), contacte [os parceiros da OVHcloud](/links/partner). + +Se precisar de assistência para utilizar e configurar as suas soluções OVHcloud, consulte as [nossas ofertas de suporte](/links/support). + +Se precisar de formação ou de assistência técnica para implementar as nossas soluções, contacte o seu representante comercial ou clique [neste link](/links/professional-services) para obter um orçamento e solicitar que a equipa de Professional Services intervenha no seu caso de utilização específico. + +Fale com a nossa [comunidade de utilizadores](/links/community). 
\ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/meta.yaml b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/meta.yaml index fc2cce15b0e..5a256dae3a2 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/meta.yaml +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/meta.yaml @@ -1,2 +1,3 @@ id: 026d2da0-e852-4c24-b78c-39660ef19c06 -full_slug: dedicated-servers-raid-soft-uefi \ No newline at end of file +full_slug: dedicated-servers-raid-soft-uefi +translation_banner: true \ No newline at end of file From a54849017089bbe7b9ef7609d356a5674403ae8b Mon Sep 17 00:00:00 2001 From: Yoann Cosse Date: Thu, 11 Dec 2025 16:15:56 +0100 Subject: [PATCH 8/8] Reverting translations --- .../raid_soft/guide.de-de.md | 347 +++++-- .../raid_soft/guide.es-es.md | 380 +++++--- .../raid_soft/guide.es-us.md | 384 +++++--- .../raid_soft/guide.it-it.md | 387 +++++--- .../raid_soft/guide.pl-pl.md | 345 +++++-- .../raid_soft/guide.pt-pt.md | 381 +++++--- .../raid_soft_uefi/guide.de-de.md | 849 ---------------- .../raid_soft_uefi/guide.es-es.md | 908 ------------------ .../raid_soft_uefi/guide.it-it.md | 896 ----------------- .../raid_soft_uefi/guide.pl-pl.md | 841 ---------------- .../raid_soft_uefi/guide.pt-pt.md | 905 ----------------- 11 files changed, 1586 insertions(+), 5037 deletions(-) delete mode 100644 pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.de-de.md delete mode 100644 pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.es-es.md delete mode 100644 pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.it-it.md delete mode 100644 pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pl-pl.md delete mode 100644 pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pt-pt.md diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.de-de.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.de-de.md index 694eb58108a..2a20f7e415d 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.de-de.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.de-de.md @@ -1,50 +1,50 @@ --- -title: Verwalten und Neuaufbauen von Software-RAID auf Servern im Legacy-Boot-Modus (BIOS) -excerpt: Erfahren Sie, wie Sie Software-RAID verwalten und nach einem Wechsel der Festplatte auf Ihrem Server im Legacy-Boot-Modus (BIOS) neu aufbauen können +title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode +excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode updated: 2025-12-11 --- -## Ziel +## Objective -Redundant Array of Independent Disks (RAID) ist eine Technologie, die Datenverluste auf einem Server durch die Replikation von Daten auf zwei oder mehr Festplatten minimiert. +Redundant Array of Independent Disks (RAID) is a technology that mitigates data loss on a server by replicating data across two or more disks. -Die Standard-RAID-Ebene für OVHcloud-Serverinstallationen ist RAID 1, wodurch der Platz, den Ihre Daten einnehmen, verdoppelt wird und der nutzbare Festplattenplatz effektiv halbiert wird. +The default RAID level for OVHcloud server installations is RAID 1, which doubles the space taken up by your data, effectively halving the useable disk space. 
-**Dieses Handbuch erklärt, wie Sie ein Software-RAID verwalten und nach einem Festplattentausch auf Ihrem Server im Legacy-Boot-Modus (BIOS) neu aufbauen können.** +**This guide explains how to manage and rebuild a software RAID in the event of a disk replacement on your server in legacy boot mode (BIOS).** -Bevor wir beginnen, beachten Sie bitte, dass dieses Handbuch sich auf Dedicated Server konzentriert, die den Legacy-Boot-Modus (BIOS) verwenden. Wenn Ihr Server den UEFI-Modus verwendet (neuere Motherboards), konsultieren Sie bitte dieses Handbuch [Verwalten und Neuaufbauen von Software-RAID auf Servern im UEFI-Boot-Modus](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). +Before we begin, please note that this guide focuses on Dedicated servers that use legacy boot (BIOS) mode. If your server uses the UEFI mode (newer motherboards), refer to this guide [Managing and rebuilding software RAID on servers in UEFI boot mode](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). -Um zu prüfen, ob ein Server im Legacy-BIOS- oder UEFI-Modus läuft, führen Sie den folgenden Befehl aus: +To check whether a server runs on legacy BIOS or UEFI mode, run the following command: ```sh [user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS ``` -## Voraussetzungen +## Requirements -- Ein [Dedicated Server](/links/bare-metal/bare-metal) mit Software-RAID-Konfiguration -- Administrative (sudo) Zugriffsrechte auf den Server über SSH -- Grundkenntnisse zu RAID und Partitionen +- A [Dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration +- Administrative (sudo) access to the server via SSH +- Understanding of RAID and partitions -## ## In der praktischen Anwendung +## Instructions -Wenn Sie einen neuen Server erwerben, könnten Sie sich möglicherweise entscheiden, eine Reihe von Tests und Aktionen durchzuführen. Ein solcher Test könnte darin bestehen, einen Festplattenausfall zu simulieren, um den Rebuild-Prozess des RAIDs zu verstehen und sich darauf vorzubereiten, falls dies jemals tatsächlich passiert. +When you purchase a new server, you may feel the need to perform a series of tests and actions. One such test could be to simulate a disk failure in order to understand the RAID rebuild process and prepare yourself in case it ever happens. 
-### Inhaltsoverview +### Content overview -- [Grundlegende Informationen](#basicinformation) -- [Simulieren eines Festplattenausfalls](#diskfailure) - - [Entfernen der defekten Festplatte](#diskremove) -- [Neuaufbau des RAIDs](#raidrebuild) - - [Neuaufbau des RAIDs im Rescue-Modus](#rescuemode) - - [Hinzufügen des Labels zur SWAP-Partition (falls zutreffend)](#swap-partition) - - [Neuaufbau des RAIDs im Normalmodus](#normalmode) +- [Basic Information](#basicinformation) +- [Simulating a disk failure](#diskfailure) + - [Removing the failed disk](#diskremove) +- [Rebuilding the RAID](#raidrebuild) + - [Rebuilding the RAID in rescue mode](#rescuemode) + - [Adding the label to the SWAP partition (if applicable)](#swap-partition) + - [Rebuilding the RAID in normal mode](#normalmode) -### Grundlegende Informationen +### Basic Information -Geben Sie in einer Befehlszeilen-Sitzung den folgenden Code ein, um den aktuellen RAID-Status zu ermitteln: +In a command line session, type the following code to determine the current RAID status: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -61,11 +61,11 @@ md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] unused devices: ``` -Dieser Befehl zeigt uns, dass wir zwei Software-RAID-Geräte eingerichtet haben, wobei **md4** das größte ist. Das **md4**-RAID-Gerät besteht aus zwei Partitionen, die als **nvme1n1p4** und **nvme0n1p4** bezeichnet werden. +This command shows us that we have two software RAID devices currently set up, with **md4** being the largest one. The **md4** RAID device consists of two partitions, which are known as **nvme1n1p4** and **nvme0n1p4**. -Die [UU] bedeutet, dass alle Festplatten normal funktionieren. Ein `_` würde eine defekte Festplatte anzeigen. +The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. -Wenn Sie einen Server mit SATA-Festplatten haben, erhalten Sie die folgenden Ergebnisse: +If you have a server with SATA disks, you would get the following results: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -82,7 +82,7 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Obwohl dieser Befehl unsere RAID-Volumes zurückgibt, sagt er uns nicht die Größe der Partitionen selbst. Wir können diese Informationen mit dem folgenden Befehl erhalten: +Although this command returns our RAID volumes, it doesn't tell us the size of the partitions themselves. We can find this information with the following command: ```sh [user@server_ip ~]# sudo fdisk -l @@ -127,13 +127,13 @@ Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes ``` -Der Befehl `fdisk -l` erlaubt es Ihnen auch, den Typ Ihrer Partition zu identifizieren. Dies ist eine wichtige Information, wenn es darum geht, Ihr RAID im Falle eines Festplattenausfalls neu aufzubauen. +The `fdisk -l` command also allows you to identify your partition type. This is an important information when it comes to rebuilding your RAID in case of a disk failure. -Für **GPT**-Partitionen wird in Zeile 6 angezeigt: `Disklabel type: gpt`. Diese Information ist nur sichtbar, wenn der Server im Normalmodus läuft. +For **GPT** partitions, line 6 will display: `Disklabel type: gpt`. This information can only been seen when the server is in normal mode. -Basierend auf den Ergebnissen von `fdisk -l`, können wir erkennen, dass `/dev/md2` 888,8 GB umfasst und `/dev/md4` 973,5 GB enthält. +Still going by the results of `fdisk -l`, we can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. 
-Alternativ bietet der Befehl `lsblk` eine andere Ansicht der Partitionen: +Alternatively, the `lsblk` command offers a different view of the partitions: ```sh [user@server_ip ~]# lsblk @@ -156,22 +156,22 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Wir notieren uns die Geräte, Partitionen und ihre Mountpoints. Aus den oben genannten Befehlen und Ergebnissen haben wir: +We take note of the devices, partitions and their mount points. From the above commands and results, we have: -- Zwei RAID-Arrays: `/dev/md2` und `/dev/md4`. -- Vier Partitionen, die Teil des RAIDs sind, mit den Mountpoints: `/` und `/home`. +- Two RAID arrays: `/dev/md2` and `/dev/md4`. +- Four partitions are part of the RAID with the mount points: `/` and `/home`. -### Simulieren eines Festplattenausfalls +### Simulating a disk failure -Jetzt, da wir alle notwendigen Informationen haben, können wir einen Festplattenausfall simulieren und die Tests durchführen. In diesem Beispiel werden wir die Festplatte `sda` als defekt markieren. +Now that we have all the necessary information, we can simulate a disk failure and proceed with the tests. In this example, we will fail the disk `sda`. -Die bevorzugte Methode, dies zu tun, ist über den Rescue-Modus-Umgebung von OVHcloud. +The preferred way to do this is via the OVHcloud rescue mode environment. -Starten Sie zunächst den Server im Rescue-Modus neu und melden Sie sich mit den bereitgestellten Anmeldeinformationen an. +First reboot the server in rescue mode and log in with the provided credentials. -Um eine Festplatte aus dem RAID zu entfernen, ist der erste Schritt, sie als **defekt** zu markieren und die Partitionen aus ihren jeweiligen RAID-Arrays zu entfernen. +To remove a disk from the RAID, the first step is to mark it as **Failed** and remove the partitions from their respective RAID arrays. ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -188,13 +188,13 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Aus der obigen Ausgabe ergibt sich, dass sda aus zwei Partitionen besteht, die im RAID sind, nämlich **sda2** und **sda4**. +From the above output, sda consists of two partitions in RAID which are **sda2** and **sda4**. -#### Entfernen der defekten Festplatte +#### Removing the failed disk -Zunächst markieren wir die Partitionen **sda2** und **sda4** als defekt. +First we mark the partitions **sda2** and **sda4** as failed. ```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 @@ -206,7 +206,7 @@ root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 - # mdadm: set /dev/sda4 faulty in /dev/md4 ``` -Wir haben nun einen RAID-Ausfall simuliert. Wenn wir den Befehl `cat /proc/mdstat` ausführen, erhalten wir die folgende Ausgabe: +We have now simulated a failure of the RAID, when we run the `cat /proc/mdstat` command, we have the following output: ```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -222,9 +222,9 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Wie wir oben sehen können, zeigt das [F] neben den Partitionen an, dass die Festplatte fehlerhaft ist oder defekt ist. +As we can see above, the [F] next to the partitions indicates that the disk has failed or is faulty. -Als nächstes entfernen wir diese Partitionen aus den RAID-Arrays. +Next, we remove these partitions from the RAID arrays. 
```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 @@ -236,18 +236,165 @@ root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/ # mdadm: hot removed /dev/sda4 from /dev/md4 ``` -Um sicherzustellen, dass wir eine Festplatte erhalten, die einem leeren Laufwerk ähnelt, verwenden wir den folgenden Befehl. Ersetzen Sie **sda** durch Ihre eigenen Werte: +To make sure that we get a disk that is similar to an empty disk, we use the following command. Replace **sda** with your own values: ```sh shred -s10M -n1 /dev/sda1 shred -s10M -n1 /dev/sda2 shred -s10M -n1 /dev/sda3 -shred -s10M -n +shred -s10M -n1 /dev/sda4 +shred -s10M -n1 /dev/sda +``` + +The disk now appears as a new, empty drive: -# mdadm: /dev/sda4 erneut hinzugefügt +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +sda 8:0 0 1.8T 0 disk +sdb 8:16 0 1.8T 0 disk +├─sdb1 8:17 0 1M 0 part +├─sdb2 8:18 0 888.9G 0 part +│ └─md2 9:2 0 888.8G 0 raid1 / +├─sdb3 8:19 0 512M 0 part [SWAP] +└─sdb4 8:20 0 973.6G 0 part + └─md4 9:4 0 973.5G 0 raid1 /home ``` -Verwenden Sie den folgenden Befehl, um das RAID-Neuaufbau zu überwachen: +If we run the following command, we see that our disk has been successfully "wiped": + +```sh +parted /dev/sda +GNU Parted 3.5 +Using /dev/sda +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/sda: unrecognised disk label +Model: HGST HUS724020AL (SATA) +Disk /dev/sda: 1.8T +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: +``` + +Our RAID status should now look like this: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sdb2[0] + 931954688 blocks super 1.2 [1/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sdb4[1] + 1020767232 blocks super 1.2 [1/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +From the results above, we can see that only two partitions now appear in the RAID arrays. We have successfully failed the disk **sda** and we can now proceed with the disk replacement. + +For more information on how to prepare and request for a disk replacement, consult this [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement) + +If you run the following command, you can have more details on the RAID array(s): + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 + +/dev/md4: + Version : 1.2 + Creation Time : Tue Jan 24 15:35:02 2023 + Raid Level : raid1 + Array Size : 1020767232 (973.48 GiB 1045.27 GB) + Used Dev Size : 1020767232 (973.48 GiB 1045.27 GB) + Raid Devices : 2 + Total Devices : 1 + Persistence : Superblock is persistent + + Intent Bitmap : Internal + + Update Time : Tue Jan 24 16:28:03 2023 + State : clean, degraded + Active Devices : 1 + Working Devices : 1 + Failed Devices : 0 + Spare Devices : 0 + +Consistency Policy : bitmap + + Name : md4 + UUID : 7b5c1d80:0a7ab4c2:e769b5e5:9c6eaa0f + Events : 21 + + Number Major Minor RaidDevice State + - 0 0 0 removed + 1 8 20 1 active sync /dev/sdb4 +``` + + + +### Rebuilding the RAID + +> [!warning] +> +> For most servers in software RAID, after a disk replacement, the server is able to boot in normal mode (on the healthy disk) to rebuild the RAID. 
However, if the server is not able to boot in normal mode, it will be rebooted in rescue mode to proceed with the RAID rebuild. +> + + + +#### Rebuilding the RAID in normal mode + +The following steps are performed in normal mode. In our example, we have replaced the disk **sda**. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sdb) to the new one (sda). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> sudo sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk`. +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdX +>> ``` +>> +>> If you receive the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. If you still cannot see the newly-created partitions (e.g. with `lsblk`), you need to reboot the server before continuing. +>> +> **For MBR partitions** +>> +>> ```sh +>> [user@server_ip ~]# sudo sfdisk -d /dev/sdX | sfdisk /dev/sdX +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +>> + +Next, we add the partitions to the RAID: + +```sh +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/sda2 +# mdadm: added /dev/sda2 + +[user@server_ip ~]# sudo mdadm --add /dev/md4 /dev/sda4 +# mdadm: re-added /dev/sda4 +``` + +Use the following command to monitor the RAID rebuild: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -264,15 +411,15 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Zuletzt fügen wir eine Bezeichnung hinzu und mounten die [SWAP]-Partition (falls zutreffend). +Lastly, we add a label and mount the [SWAP] partition (if applicable). -Um eine Bezeichnung für die SWAP-Partition hinzuzufügen: +To add a label the SWAP partition: ```sh [user@server_ip ~]# sudo mkswap /dev/sda4 -L swap-sda4 ``` -Rufen Sie als nächstes die UUIDs beider Swap-Partitionen ab: +Next, retrieve the UUIDs of both swap partitions: ```sh [user@server_ip ~]# sudo blkid -s UUID /dev/sda4 @@ -281,9 +428,9 @@ Rufen Sie als nächstes die UUIDs beider Swap-Partitionen ab: /dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -Wir ersetzen die alte UUID der Swap-Partition (**sda4**) durch die neue in `/etc/fstab`. +We replace the old UUID of the swap partition (**sda4**) with the new one in `/etc/fstab`. -Beispiel: +Example: ```sh [user@server_ip ~]# sudo nano etc/fstab @@ -295,9 +442,9 @@ UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 ``` -Basierend auf den oben genannten Ergebnissen ist die alte UUID `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` und sollte durch die neue `b3c9e03a-52f5-4683-81b6-cc10091fcd15` ersetzt werden. Stellen Sie sicher, dass Sie die richtige UUID ersetzen. +Based on the above results, the old UUID is `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` and should be replaced with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the coorect UUID. 
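+
+If you prefer to script the substitution instead of editing the file by hand, a `sed` one-liner can do it — a sketch using the example UUIDs above; replace them with the values returned by `blkid` on your own server:
+
+```sh
+# Sketch: substitute the old swap UUID with the new one in /etc/fstab
+sudo sed -i 's/b7b5dd38-9b51-4282-8f2d-26c65e8d58ec/b3c9e03a-52f5-4683-81b6-cc10091fcd15/' /etc/fstab
+```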
-Als nächstes prüfen wir, ob alles ordnungsgemäß gemountet ist, mit dem folgenden Befehl: +Next, we verify that everything is properly mounted with the following command: ```sh [user@server_ip ~]# sudo mount -av @@ -308,52 +455,52 @@ swap : ignored swap : ignored ``` -Führen Sie den folgenden Befehl aus, um die Swap-Partition zu aktivieren: +Run the following command to enable the swap partition: ```sh [user@server_ip ~]# sudo swapon -av ``` -Laden Sie anschließend das System mit dem folgenden Befehl neu: +Then reload the system with the following command: ```sh [user@server_ip ~]# sudo systemctl daemon-reload ``` -Wir haben nun erfolgreich das RAID-Neuaufbau abgeschlossen. +We have now successfully completed the RAID rebuild. -/// details | **Neuaufbau des RAIDs im Rescue-Modus** +/// details | **Rebuilding the RAID in rescue mode** -Falls Ihr Server nach einem Wechsel der Festplatte nicht im normalen Modus neu starten kann, wird er im Rescue-Modus neu gestartet. +If you server is unable to reboot in normal mode after a disk replacement, it will be rebooted in rescue mode. -In diesem Beispiel ersetzen wir die Festplatte `sdb`. +In this example, we are replacing the disk `sdb`. -Nachdem die Festplatte ausgetauscht wurde, müssen wir die Partitionstabelle von der gesunden Festplatte (in diesem Beispiel sda) auf die neue (sdb) kopieren. +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sda) to the new one (sdb). > [!tabs] -> **Für GPT-Partitionen** +> **For GPT partitions** >> >> ```sh >> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX >> ``` >> ->> Der Befehl sollte in diesem Format lauten: `sgdisk -R /dev/newdisk /dev/healthydisk` +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` >> ->> Beispiel: +>> Example: >> >> ```sh >> sudo sgdisk -R /dev/sdb /dev/sda >> ``` >> ->> Sobald dies erledigt ist, ist der nächste Schritt, die GUID der neuen Festplatte zu randomisieren, um Konflikte mit anderen Festplatten zu vermeiden: +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: >> >> ```sh >> sudo sgdisk -G /dev/sdb >> ``` >> ->> Falls Sie die folgende Meldung erhalten: +>> If you the following message: >> >> ```console >> Warning: The kernel is still using the old partition table. @@ -362,27 +509,28 @@ Nachdem die Festplatte ausgetauscht wurde, müssen wir die Partitionstabelle von >> The operation has completed successfully. >> ``` >> ->> Können Sie einfach den Befehl `partprobe` ausführen. +>> You can simply run the `partprobe` command. >> -> **Für MBR-Partitionen** +> **For MBR partitions** >> >> ```sh >> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb >> ``` >> ->> Der Befehl sollte in diesem Format lauten: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` >> -Wir können nun das RAID-Array neu aufbauen. Der folgende Code zeigt, wie wir die neuen Partitionen (sdb2 und sdb4) wieder ins RAID-Array einfügen können. +We can now rebuild the RAID array. The following code snippet shows how we can add the new partitions (sdb2 and sdb4) back in the RAID array. 
```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 # mdadm: added /dev/sdb2 + root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 # mdadm: re-added /dev/sdb4 ``` -Verwenden Sie den Befehl `cat /proc/mdstat`, um das RAID-Neuaufbau zu überwachen: +Use the `cat /proc/mdstat` command to monitor the RAID rebuild: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -399,7 +547,7 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Für weitere Details zu den RAID-Array(s): +For more details on the RAID array(s): ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 @@ -435,15 +583,15 @@ root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 -#### Bezeichnung der SWAP-Partition hinzufügen (falls zutreffend) +#### Adding the label to the SWAP partition (if applicable) -Sobald das RAID-Neuaufbau abgeschlossen ist, mounten wir die Partition, die die Wurzel unseres Betriebssystems enthält, auf `/mnt`. In unserem Beispiel ist dies die Partition `md4`. +Once the RAID rebuild is complete, we mount the partition containing the root of our operating system on `/mnt`. In our example, that partition is `md4`. ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt ``` -Wir fügen die Bezeichnung unserer Swap-Partition mit dem folgenden Befehl hinzu: +We add the label to our swap partition with the command: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sdb4 -L swap-sdb4 @@ -452,7 +600,7 @@ Setting up swapspace version 1, size = 512 MiB (536866816 bytes) LABEL=swap-sdb4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd ``` -Als nächstes mounten wir die folgenden Verzeichnisse, um sicherzustellen, dass alle Manipulationen im chroot-Umgebung ordnungsgemäß funktionieren: +Next, we mount the following directories to make sure any manipulation we make in the chroot environment works properly: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # @@ -465,20 +613,20 @@ mount --bind /run /mnt/run mount --make-slave /mnt/run ``` -Als nächstes greifen wir in die `chroot`-Umgebung: +Next, we access the `chroot` environment: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt ``` -Wir rufen die UUIDs beider Swap-Partitionen ab: +We retrieve the UUIDs of both swap partitions: ```sh root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 ``` -Beispiel: +Example: ```sh blkid /dev/sda4 @@ -487,13 +635,13 @@ blkid /dev/sdb4 /dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -Als nächstes ersetzen wir die alte UUID der Swap-Partition (**sdb4**) durch die neue in `/etc/fstab`: +Next, we replace the old UUID of the swap partition (**sdb4**) with the new one in `/etc/fstab`: ```sh root@rescue12-customer-eu:/# nano etc/fstab ``` -Beispiel: +Example: ```sh UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 @@ -502,9 +650,9 @@ UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 ``` -Stellen Sie sicher, dass Sie die richtige UUID ersetzen. In unserem obigen Beispiel ist die UUID, die ersetzt werden muss, `d6af33cf-fc15-4060-a43c-cb3b5537f58a` durch die neue `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Stellen Sie sicher, dass Sie die richtige UUID ersetzen. +Make sure you replace the proper UUID. 
In our example above, the UUID to replace is `d6af33cf-fc15-4060-a43c-cb3b5537f58a` with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the correct UUID. -Als nächstes stellen wir sicher, dass alles ordnungsgemäß gemountet ist: +Next, we make sure everything is properly mounted: ```sh root@rescue12-customer-eu:/# mount -av @@ -514,7 +662,7 @@ swap : ignored swap : ignored ``` -Aktivieren Sie die Swap-Partition mit dem folgenden Befehl: +Activate the swap partition the following command: ```sh root@rescue12-customer-eu:/# swapon -av @@ -527,31 +675,34 @@ swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 swapon /dev/sdb4 ``` -Wir verlassen die `chroot`-Umgebung mit exit und laden das System neu: +We exit the `chroot` environment with exit and reload the system: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload ``` -Wir entmounten alle Festplatten: +We umount all the disks: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt ``` -Wir haben nun erfolgreich das RAID-Neuaufbau auf dem Server abgeschlossen und können ihn nun im normalen Modus neu starten. +We have now successfully completed the RAID rebuild on the server and we can now reboot it in normal mode. + +## Go Further -## Weiterführende Informationen -[Hot Swap - Software-RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) +[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) -[OVHcloud API und Speicher](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) +[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) -[Verwalten von Hardware-RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) +[Managing hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) -[Hot Swap - Hardware-RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) +[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) -Für spezialisierte Dienstleistungen (SEO, Entwicklung usw.) kontaktieren Sie [OVHcloud Partner](/links/partner). +For specialised services (SEO, development, etc.), contact [OVHcloud partners](/links/partner). + +If you would like assistance using and configuring your OVHcloud solutions, please refer to our [support offers](/links/support). -Wenn Sie bei der Nutzung und Konfiguration Ihrer OVHcloud-Lösungen Unterstützung benötigen, wenden Sie sich an unsere [Support-Angebote](/links/support). +If you need training or technical assistance to implement our solutions, contact your sales representative or click on [this link](/links/professional-services) to get a quote and ask our Professional Services experts for assisting you on your specific use case of your project. -Wenn Sie Schulungen oder technische Unterstützung benötigen, um unsere Lösungen umzusetzen, wenden Sie sich an Ihren Vertriebsmitarbeiter oder klicken Sie auf [diesen Link](/links/professional-services), um ein Angebot zu erhalten und unsere Expert \ No newline at end of file +Join our [community of users](/links/community). 
diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-es.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-es.md index 864886451bf..2a20f7e415d 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-es.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-es.md @@ -1,49 +1,50 @@ --- -title: Gestión y reconstrucción del RAID software en servidores en modo de arranque legacy (BIOS) -excerpt: "Descubra cómo gestionar y reconstruir el RAID software tras un reemplazo de disco en su servidor en modo de arranque legacy (BIOS)" +title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode +excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode updated: 2025-12-11 --- -## Objetivo +## Objective -El RAID (Redundant Array of Independent Disks) es un conjunto de técnicas diseñadas para mitigar la pérdida de datos en un servidor replicándolos en varios discos. +Redundant Array of Independent Disks (RAID) is a technology that mitigates data loss on a server by replicating data across two or more disks. -El nivel de RAID predeterminado para las instalaciones de servidores de OVHcloud es RAID 1, lo que duplica el espacio ocupado por sus datos, reduciendo así a la mitad el espacio de disco utilizable. +The default RAID level for OVHcloud server installations is RAID 1, which doubles the space taken up by your data, effectively halving the useable disk space. -**Este guía explica cómo gestionar y reconstruir un RAID software en caso de reemplazar un disco en su servidor en modo de arranque legacy (BIOS).** +**This guide explains how to manage and rebuild a software RAID in the event of a disk replacement on your server in legacy boot mode (BIOS).** -Antes de comenzar, tenga en cuenta que esta guía se centra en los servidores dedicados que utilizan el modo de arranque legacy (BIOS). Si su servidor utiliza el modo UEFI (tarjetas madre más recientes), consulte esta guía [Gestión y reconstrucción del RAID software en servidores en modo de arranque UEFI](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). +Before we begin, please note that this guide focuses on Dedicated servers that use legacy boot (BIOS) mode. If your server uses the UEFI mode (newer motherboards), refer to this guide [Managing and rebuilding software RAID on servers in UEFI boot mode](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). -Para verificar si un servidor se ejecuta en modo BIOS o en modo UEFI, ejecute el siguiente comando: +To check whether a server runs on legacy BIOS or UEFI mode, run the following command: ```sh [user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS ``` -## Requisitos +## Requirements -- Tener un [servidor dedicado](/links/bare-metal/bare-metal) con una configuración de RAID software. -- Tener acceso a su servidor mediante SSH como administrador (sudo). -- Conocimiento del RAID y las particiones +- A [Dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration +- Administrative (sudo) access to the server via SSH +- Understanding of RAID and partitions -## Procedimiento +## Instructions -### Presentación del contenido +When you purchase a new server, you may feel the need to perform a series of tests and actions. One such test could be to simulate a disk failure in order to understand the RAID rebuild process and prepare yourself in case it ever happens. 
-- [Información básica](#basicinformation) -- [Simular una falla de disco](#diskfailure) - - [Retirar el disco defectuoso](#diskremove) -- [Reconstrucción del RAID](#raidrebuild) - - [Reconstrucción del RAID en modo rescue](#rescuemode) - - [Añadir la etiqueta a la partición SWAP (si aplica)](#swap-partition) - - [Reconstrucción del RAID en modo normal](#normalmode) +### Content overview +- [Basic Information](#basicinformation) +- [Simulating a disk failure](#diskfailure) + - [Removing the failed disk](#diskremove) +- [Rebuilding the RAID](#raidrebuild) + - [Rebuilding the RAID in rescue mode](#rescuemode) + - [Adding the label to the SWAP partition (if applicable)](#swap-partition) + - [Rebuilding the RAID in normal mode](#normalmode) -### Información básica +### Basic Information -En una sesión de línea de comandos, escriba el siguiente código para determinar el estado actual del RAID. +In a command line session, type the following code to determine the current RAID status: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -60,11 +61,11 @@ md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] unused devices: ``` -Este comando nos indica que dos dispositivos de RAID software están actualmente configurados, **md4** siendo el más grande. El dispositivo de RAID **md4** está compuesto por dos particiones, llamadas **nvme1n1p4** y **nvme0n1p4**. +This command shows us that we have two software RAID devices currently set up, with **md4** being the largest one. The **md4** RAID device consists of two partitions, which are known as **nvme1n1p4** and **nvme0n1p4**. -El [UU] significa que todos los discos funcionan normalmente. Un `_` indica un disco defectuoso. +The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. -Si posee un servidor con discos SATA, obtendrá los siguientes resultados: +If you have a server with SATA disks, you would get the following results: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -81,7 +82,7 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Aunque este comando devuelve nuestros volúmenes de RAID, no nos indica el tamaño de las particiones mismas. Podemos encontrar esta información con el siguiente comando: +Although this command returns our RAID volumes, it doesn't tell us the size of the partitions themselves. We can find this information with the following command: ```sh [user@server_ip ~]# sudo fdisk -l @@ -126,13 +127,13 @@ Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes ``` -El comando `fdisk -l` también le permite identificar el tipo de partición. Esta es una información importante para reconstruir su RAID en caso de fallo de un disco. +The `fdisk -l` command also allows you to identify your partition type. This is an important information when it comes to rebuilding your RAID in case of a disk failure. -Para las particiones **GPT**, la línea 6 mostrará: `Disklabel type: gpt`. Esta información solo es visible cuando el servidor está en modo normal. +For **GPT** partitions, line 6 will display: `Disklabel type: gpt`. This information can only been seen when the server is in normal mode. -Siempre basándonos en los resultados de `fdisk -l`, podemos ver que `/dev/md2` se compone de 888.8GB y `/dev/md4` contiene 973.5GB. +Still going by the results of `fdisk -l`, we can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. 
-Alternativamente, el comando `lsblk` ofrece una vista diferente de las particiones: +Alternatively, the `lsblk` command offers a different view of the partitions: ```sh [user@server_ip ~]# lsblk @@ -155,22 +156,22 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Tomamos en cuenta los dispositivos, las particiones y sus puntos de montaje. A partir de los comandos y resultados anteriores, tenemos: +We take note of the devices, partitions and their mount points. From the above commands and results, we have: -- Dos matrices RAID: `/dev/md2` y `/dev/md4`. -- Cuatro particiones forman parte del RAID con los puntos de montaje: `/` y `/home`. +- Two RAID arrays: `/dev/md2` and `/dev/md4`. +- Four partitions are part of the RAID with the mount points: `/` and `/home`. -### Simular una falla de disco +### Simulating a disk failure -Ahora que disponemos de toda la información necesaria, podemos simular una falla de disco y continuar con las pruebas. En este ejemplo, haremos que el disco `sda` falle. +Now that we have all the necessary information, we can simulate a disk failure and proceed with the tests. In this example, we will fail the disk `sda`. -El medio preferido para lograrlo es el entorno en modo rescue de OVHcloud. +The preferred way to do this is via the OVHcloud rescue mode environment. -Reinicie primero el servidor en modo rescue y conéctese con las credenciales proporcionadas. +First reboot the server in rescue mode and log in with the provided credentials. -Para retirar un disco del RAID, el primer paso es marcarlo como **Failed** y retirar las particiones de sus matrices RAID respectivas. +To remove a disk from the RAID, the first step is to mark it as **Failed** and remove the partitions from their respective RAID arrays. ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -187,13 +188,13 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -A partir de la salida anterior, sda se compone de dos particiones en RAID que son **sda2** y **sda4**. +From the above output, sda consists of two partitions in RAID which are **sda2** and **sda4**. -#### Retirar el disco defectuoso +#### Removing the failed disk -Comenzamos marcando las particiones **sda2** y **sda4** como **failed**. +First we mark the partitions **sda2** and **sda4** as failed. ```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 @@ -205,7 +206,7 @@ root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 - # mdadm: set /dev/sda4 faulty in /dev/md4 ``` -Hemos simulado ahora una falla del RAID, cuando ejecutamos el comando `cat /proc/mdstat`, obtenemos el siguiente resultado: +We have now simulated a failure of the RAID, when we run the `cat /proc/mdstat` command, we have the following output: ```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -221,9 +222,9 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Como podemos ver arriba, el [F] junto a las particiones indica que el disco está fallando o defectuoso. +As we can see above, the [F] next to the partitions indicates that the disk has failed or is faulty. -A continuación, retiramos estas particiones de las matrices RAID. +Next, we remove these partitions from the RAID arrays. 
```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 @@ -235,7 +236,17 @@ root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/ # mdadm: hot removed /dev/sda4 from /dev/md4 ``` -Para asegurarnos de obtener un disco que sea similar a un disco vacío, utilizamos el siguiente comando. Reemplace **sda** por sus propios valores: +To make sure that we get a disk that is similar to an empty disk, we use the following command. Replace **sda** with your own values: + +```sh +shred -s10M -n1 /dev/sda1 +shred -s10M -n1 /dev/sda2 +shred -s10M -n1 /dev/sda3 +shred -s10M -n1 /dev/sda4 +shred -s10M -n1 /dev/sda +``` + +The disk now appears as a new, empty drive: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk @@ -250,16 +261,140 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Si ejecutamos el siguiente comando, vemos que nuestro disco ha sido correctamente «limpiado»: +If we run the following command, we see that our disk has been successfully "wiped": ```sh parted /dev/sda -GNU Parted 3. +GNU Parted 3.5 +Using /dev/sda +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/sda: unrecognised disk label +Model: HGST HUS724020AL (SATA) +Disk /dev/sda: 1.8T +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: +``` + +Our RAID status should now look like this: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sdb2[0] + 931954688 blocks super 1.2 [1/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk +md4 : active raid1 sdb4[1] + 1020767232 blocks super 1.2 [1/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +From the results above, we can see that only two partitions now appear in the RAID arrays. We have successfully failed the disk **sda** and we can now proceed with the disk replacement. + +For more information on how to prepare and request for a disk replacement, consult this [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement) + +If you run the following command, you can have more details on the RAID array(s): + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 + +/dev/md4: + Version : 1.2 + Creation Time : Tue Jan 24 15:35:02 2023 + Raid Level : raid1 + Array Size : 1020767232 (973.48 GiB 1045.27 GB) + Used Dev Size : 1020767232 (973.48 GiB 1045.27 GB) + Raid Devices : 2 + Total Devices : 1 + Persistence : Superblock is persistent + + Intent Bitmap : Internal + + Update Time : Tue Jan 24 16:28:03 2023 + State : clean, degraded + Active Devices : 1 + Working Devices : 1 + Failed Devices : 0 + Spare Devices : 0 + +Consistency Policy : bitmap + + Name : md4 + UUID : 7b5c1d80:0a7ab4c2:e769b5e5:9c6eaa0f + Events : 21 + + Number Major Minor RaidDevice State + - 0 0 0 removed + 1 8 20 1 active sync /dev/sdb4 +``` + + + +### Rebuilding the RAID + +> [!warning] +> +> For most servers in software RAID, after a disk replacement, the server is able to boot in normal mode (on the healthy disk) to rebuild the RAID. However, if the server is not able to boot in normal mode, it will be rebooted in rescue mode to proceed with the RAID rebuild. +> + + + +#### Rebuilding the RAID in normal mode + +The following steps are performed in normal mode. In our example, we have replaced the disk **sda**. 
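+
+Before going any further, it is worth confirming that the replacement disk is detected and shows up without any partitions. A quick check, assuming the device names used in this example:
+
+```sh
+[user@server_ip ~]# lsblk /dev/sda /dev/sdb
+```
+
+The new disk (`sda` here) should appear with no child partitions, while the healthy disk (`sdb`) still carries its RAID partitions.
+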
+ +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sdb) to the new one (sda). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> sudo sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk`. +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdX +>> ``` +>> +>> If you receive the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. If you still cannot see the newly-created partitions (e.g. with `lsblk`), you need to reboot the server before continuing. +>> +> **For MBR partitions** +>> +>> ```sh +>> [user@server_ip ~]# sudo sfdisk -d /dev/sdX | sfdisk /dev/sdX +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +>> + +Next, we add the partitions to the RAID: + +```sh +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/sda2 +# mdadm: added /dev/sda2 + +[user@server_ip ~]# sudo mdadm --add /dev/md4 /dev/sda4 # mdadm: re-added /dev/sda4 ``` -Use el siguiente comando para supervisar la reconstrucción del RAID: +Use the following command to monitor the RAID rebuild: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -276,72 +411,97 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Finalmente, añadimos una etiqueta y montamos la partición [SWAP] (si aplica). +Lastly, we add a label and mount the [SWAP] partition (if applicable). -Para añadir una etiqueta a la partición SWAP: +To add a label the SWAP partition: ```sh -[user@server_ip ~]# sudo mkswap /dev/sdb4 -L swap-sdb4 +[user@server_ip ~]# sudo mkswap /dev/sda4 -L swap-sda4 ``` -A continuación, obtenga los UUID de ambas particiones de intercambio: +Next, retrieve the UUIDs of both swap partitions: ```sh [user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" [user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -Reemplazamos el antiguo UUID de la partición de intercambio (**sda4**) por el nuevo en `/etc/fstab`: +We replace the old UUID of the swap partition (**sda4**) with the new one in `/etc/fstab`. + +Example: ```sh [user@server_ip ~]# sudo nano etc/fstab + +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=BIOS /boot vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 ``` -Asegúrese de reemplazar el UUID correcto. +Based on the above results, the old UUID is `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` and should be replaced with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the coorect UUID. 
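+
+Rather than editing the file by hand, the substitution can also be scripted. A minimal sketch, assuming the standard `/etc/fstab` path and the example UUIDs shown above (adapt them to your own values):
+
+```sh
+[user@server_ip ~]# sudo sed -i 's/b7b5dd38-9b51-4282-8f2d-26c65e8d58ec/b3c9e03a-52f5-4683-81b6-cc10091fcd15/' /etc/fstab
+```
+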
-A continuación, recargue el sistema con el siguiente comando: +Next, we verify that everything is properly mounted with the following command: ```sh -[user@server_ip ~]# sudo systemctl daemon-reload +[user@server_ip ~]# sudo mount -av +/ : ignored +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored ``` -Ejecute el siguiente comando para activar la partición de intercambio: +Run the following command to enable the swap partition: ```sh [user@server_ip ~]# sudo swapon -av ``` -La reconstrucción del RAID ahora está terminada. +Then reload the system with the following command: + +```sh +[user@server_ip ~]# sudo systemctl daemon-reload +``` + +We have now successfully completed the RAID rebuild. -/// details | **Reconstrucción del RAID en modo rescue** +/// details | **Rebuilding the RAID in rescue mode** -Una vez reemplazado el disco, debemos copiar la tabla de particiones del disco sano (en este ejemplo, sda) al nuevo (sdb). +If you server is unable to reboot in normal mode after a disk replacement, it will be rebooted in rescue mode. + +In this example, we are replacing the disk `sdb`. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sda) to the new one (sdb). > [!tabs] -> **Para particiones GPT** +> **For GPT partitions** >> >> ```sh >> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX >> ``` >> ->> El comando debe tener el siguiente formato: `sgdisk -R /dev/nuevo disco /dev/disco sano` +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` >> ->> Ejemplo: +>> Example: >> >> ```sh >> sudo sgdisk -R /dev/sdb /dev/sda >> ``` >> ->> Una vez realizada esta operación, el siguiente paso consiste en asignar un GUID aleatorio al nuevo disco para evitar conflictos con los GUID de otros discos: +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: >> >> ```sh >> sudo sgdisk -G /dev/sdb >> ``` >> ->> Si aparece el siguiente mensaje: ->> +>> If you the following message: +>> >> ```console >> Warning: The kernel is still using the old partition table. >> The new table will be used at the next reboot or after you @@ -349,27 +509,28 @@ Una vez reemplazado el disco, debemos copiar la tabla de particiones del disco s >> The operation has completed successfully. >> ``` >> ->> Puede simplemente ejecutar el comando `partprobe`. Si aún no ve las nuevas particiones (por ejemplo, con `lsblk`), deberá reiniciar el servidor antes de continuar. +>> You can simply run the `partprobe` command. >> -> **Para particiones MBR** +> **For MBR partitions** >> >> ```sh >> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb >> ``` >> ->> El comando debe tener el siguiente formato: `sfdisk -d /dev/disco sano | sfdisk /dev/nuevo disco` +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` >> -Ahora podemos reconstruir la matriz RAID. El siguiente fragmento de código muestra cómo añadir las nuevas particiones (sdb2 y sdb4) a la matriz RAID. +We can now rebuild the RAID array. The following code snippet shows how we can add the new partitions (sdb2 and sdb4) back in the RAID array. 
```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 # mdadm: added /dev/sdb2 + root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 # mdadm: re-added /dev/sdb4 ``` -Use el comando `cat /proc/mdstat` para supervisar la reconstrucción del RAID: +Use the `cat /proc/mdstat` command to monitor the RAID rebuild: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -386,7 +547,7 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Para obtener más detalles sobre la o las matrices RAID: +For more details on the RAID array(s): ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 @@ -416,30 +577,30 @@ root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 Events : 0.95 Number Major Minor RaidDevice State - 0 8 2 0 spare rebuilding /dev/sda4 - 1 8 18 1 active sync /dev/sdb4 + 0 8 2 0 active sync /dev/sda4 + 1 8 18 1 spare rebuilding /dev/sdb4 ``` -#### Añadimos la etiqueta a la partición SWAP (si aplica) +#### Adding the label to the SWAP partition (if applicable) -Una vez finalizada la reconstrucción del RAID, montamos la partición que contiene la raíz de nuestro sistema operativo en `/mnt`. En nuestro ejemplo, esta partición es `md4`. +Once the RAID rebuild is complete, we mount the partition containing the root of our operating system on `/mnt`. In our example, that partition is `md4`. ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt ``` -Añadimos la etiqueta a nuestra partición de intercambio con el siguiente comando: +We add the label to our swap partition with the command: ```sh -root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sda4 -L swap-sda4 -mkswap: /dev/sda4: warning: wiping old swap signature. +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sdb4 -L swap-sdb4 +mkswap: /dev/sdb4: warning: wiping old swap signature. 
Setting up swapspace version 1, size = 512 MiB (536866816 bytes) -LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +LABEL=swap-sdb4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd ``` -A continuación, montamos los siguientes directorios para asegurarnos de que cualquier manipulación que realicemos en el entorno chroot funcione correctamente: +Next, we mount the following directories to make sure any manipulation we make in the chroot environment works properly: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # @@ -452,38 +613,35 @@ mount --bind /run /mnt/run mount --make-slave /mnt/run ``` -A continuación, accedemos al entorno `chroot`: +Next, we access the `chroot` environment: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt ``` -Recuperamos los UUID de ambas particiones de intercambio: +We retrieve the UUIDs of both swap partitions: ```sh root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 ``` -Ejemplo: +Example: ```sh blkid /dev/sda4 /dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" -``` - -```sh blkid /dev/sdb4 /dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -A continuación, reemplazamos el antiguo UUID de la partición de intercambio (**sdb4**) por el nuevo en `/etc/fstab`: +Next, we replace the old UUID of the swap partition (**sdb4**) with the new one in `/etc/fstab`: ```sh root@rescue12-customer-eu:/# nano etc/fstab ``` -Ejemplo: +Example: ```sh UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 @@ -492,9 +650,9 @@ UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 ``` -Asegúrese de reemplazar el UUID correcto. En nuestro ejemplo anterior, el UUID a reemplazar es `d6af33cf-fc15-4060-a43c-cb3b5537f58a` por el nuevo `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Asegúrese de reemplazar el UUID correcto. +Make sure you replace the proper UUID. In our example above, the UUID to replace is `d6af33cf-fc15-4060-a43c-cb3b5537f58a` with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the correct UUID. -A continuación, nos aseguramos de que todo esté correctamente montado: +Next, we make sure everything is properly mounted: ```sh root@rescue12-customer-eu:/# mount -av @@ -504,13 +662,7 @@ swap : ignored swap : ignored ``` -Recargue el sistema con el siguiente comando: - -```sh -root@rescue12-customer-eu:/# systemctl daemon-reload -``` - -Active la partición de intercambio con el siguiente comando: +Activate the swap partition the following command: ```sh root@rescue12-customer-eu:/# swapon -av @@ -523,28 +675,34 @@ swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 swapon /dev/sdb4 ``` -Salga del entorno Chroot con `exit` y desmonte todos los discos: +We exit the `chroot` environment with exit and reload the system: ```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload ``` -Hemos terminado con éxito la reconstrucción del RAID en el servidor y ahora podemos reiniciar el servidor en modo normal. +We umount all the disks: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` +We have now successfully completed the RAID rebuild on the server and we can now reboot it in normal mode. 
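+
+After the reboot, the arrays may still be resynchronising in the background. If you prefer to follow the progress live instead of re-running `cat /proc/mdstat` manually, you can use `watch` (assuming it is available on your system, which is the case on most distributions):
+
+```sh
+[user@server_ip ~]# watch -n 5 cat /proc/mdstat
+```
+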
-## Más información -[Reemplazo a caliente - RAID software](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) +## Go Further -[API OVHcloud y Almacenamiento](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) +[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) -[Gestión del RAID hardware](/pages/bare_metal_cloud/dedicated_servers/raid_hard) +[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) -[Reemplazo a caliente - RAID hardware](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) +[Managing hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) -Para servicios especializados (posicionamiento, desarrollo, etc.), contacte con los [socios OVHcloud](/links/partner). +[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) -Si desea beneficiarse de una asistencia en el uso y configuración de sus soluciones OVHcloud, le invitamos a consultar nuestras distintas [ofertas de soporte](/links/support). +For specialised services (SEO, development, etc.), contact [OVHcloud partners](/links/partner). + +If you would like assistance using and configuring your OVHcloud solutions, please refer to our [support offers](/links/support). -Si necesita una formación o asistencia técnica para la implementación de nuestras soluciones, contacte con su comercial o haga clic en [este enlace](/links/professional-services) para obtener un presupuesto y solicitar un análisis personalizado de su proyecto a nuestros expertos del equipo Professional Services. +If you need training or technical assistance to implement our solutions, contact your sales representative or click on [this link](/links/professional-services) to get a quote and ask our Professional Services experts for assisting you on your specific use case of your project. -Interactúe con nuestra [comunidad de usuarios](/links/community). \ No newline at end of file +Join our [community of users](/links/community). diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-us.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-us.md index c9db8b1e837..2a20f7e415d 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-us.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.es-us.md @@ -1,49 +1,50 @@ --- -title: Gestión y reconstrucción del RAID software en servidores en modo de arranque legacy (BIOS) -excerpt: "Descubra cómo gestionar y reconstruir el RAID software tras un reemplazo de disco en su servidor en modo de arranque legacy (BIOS)" +title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode +excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode updated: 2025-12-11 --- -## Objetivo +## Objective -El RAID (Redundant Array of Independent Disks) es un conjunto de técnicas diseñadas para mitigar la pérdida de datos en un servidor replicándolos en varios discos. +Redundant Array of Independent Disks (RAID) is a technology that mitigates data loss on a server by replicating data across two or more disks. -El nivel de RAID predeterminado para las instalaciones de servidores de OVHcloud es RAID 1, lo que duplica el espacio ocupado por sus datos, reduciendo así a la mitad el espacio de disco utilizable. +The default RAID level for OVHcloud server installations is RAID 1, which doubles the space taken up by your data, effectively halving the useable disk space. 
-**Este guía explica cómo gestionar y reconstruir un RAID software en caso de reemplazar un disco en su servidor en modo de arranque legacy (BIOS).** +**This guide explains how to manage and rebuild a software RAID in the event of a disk replacement on your server in legacy boot mode (BIOS).** - +Before we begin, please note that this guide focuses on Dedicated servers that use legacy boot (BIOS) mode. If your server uses the UEFI mode (newer motherboards), refer to this guide [Managing and rebuilding software RAID on servers in UEFI boot mode](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). -Antes de comenzar, tenga en cuenta que esta guía se centra en los servidores dedicados que utilizan el modo de arranque legacy (BIOS). Si su servidor utiliza el modo UEFI (tarjetas madre más recientes), consulte esta guía [Gestión y reconstrucción del RAID software en servidores en modo de arranque UEFI](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). - -Para verificar si un servidor se ejecuta en modo BIOS o en modo UEFI, ejecute el siguiente comando: +To check whether a server runs on legacy BIOS or UEFI mode, run the following command: ```sh [user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS ``` -## Requisitos -- Tener un [servidor dedicado](/links/bare-metal/bare-metal) con una configuración de RAID software. -- Tener acceso a su servidor mediante SSH como administrador (sudo). -- Conocimiento del RAID y las particiones +## Requirements + +- A [Dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration +- Administrative (sudo) access to the server via SSH +- Understanding of RAID and partitions -## En práctica -### Presentación del contenido +## Instructions -- [Información básica](#basicinformation) -- [Simular una falla de disco](#diskfailure) - - [Retirar el disco defectuoso](#diskremove) -- [Reconstrucción del RAID](#raidrebuild) - - [Reconstrucción del RAID en modo rescue](#rescuemode) - - [Añadir la etiqueta a la partición SWAP (si aplica)](#swap-partition) - - [Reconstrucción del RAID en modo normal](#normalmode) +When you purchase a new server, you may feel the need to perform a series of tests and actions. One such test could be to simulate a disk failure in order to understand the RAID rebuild process and prepare yourself in case it ever happens. +### Content overview + +- [Basic Information](#basicinformation) +- [Simulating a disk failure](#diskfailure) + - [Removing the failed disk](#diskremove) +- [Rebuilding the RAID](#raidrebuild) + - [Rebuilding the RAID in rescue mode](#rescuemode) + - [Adding the label to the SWAP partition (if applicable)](#swap-partition) + - [Rebuilding the RAID in normal mode](#normalmode) -### Información básica +### Basic Information -En una sesión de línea de comandos, escriba el siguiente código para determinar el estado actual del RAID. +In a command line session, type the following code to determine the current RAID status: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -60,11 +61,11 @@ md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] unused devices: ``` -Este comando nos indica que dos dispositivos de RAID software están actualmente configurados, **md4** siendo el más grande. El dispositivo de RAID **md4** está compuesto por dos particiones, llamadas **nvme1n1p4** y **nvme0n1p4**. +This command shows us that we have two software RAID devices currently set up, with **md4** being the largest one. 
The **md4** RAID device consists of two partitions, which are known as **nvme1n1p4** and **nvme0n1p4**. -El [UU] significa que todos los discos funcionan normalmente. Un `_` indica un disco defectuoso. +The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. -Si posee un servidor con discos SATA, obtendrá los siguientes resultados: +If you have a server with SATA disks, you would get the following results: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -81,7 +82,7 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Aunque este comando devuelve nuestros volúmenes de RAID, no nos indica el tamaño de las particiones mismas. Podemos encontrar esta información con el siguiente comando: +Although this command returns our RAID volumes, it doesn't tell us the size of the partitions themselves. We can find this information with the following command: ```sh [user@server_ip ~]# sudo fdisk -l @@ -126,13 +127,13 @@ Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes ``` -El comando `fdisk -l` también le permite identificar el tipo de partición. Esta es una información importante para reconstruir su RAID en caso de fallo de un disco. +The `fdisk -l` command also allows you to identify your partition type. This is an important information when it comes to rebuilding your RAID in case of a disk failure. -Para las particiones **GPT**, la línea 6 mostrará: `Disklabel type: gpt`. Esta información solo es visible cuando el servidor está en modo normal. +For **GPT** partitions, line 6 will display: `Disklabel type: gpt`. This information can only been seen when the server is in normal mode. -Siempre basándonos en los resultados de `fdisk -l`, podemos ver que `/dev/md2` se compone de 888.8GB y `/dev/md4` contiene 973.5GB. +Still going by the results of `fdisk -l`, we can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. -Alternativamente, el comando `lsblk` ofrece una vista diferente de las particiones: +Alternatively, the `lsblk` command offers a different view of the partitions: ```sh [user@server_ip ~]# lsblk @@ -155,22 +156,22 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Tomamos en cuenta los dispositivos, las particiones y sus puntos de montaje. A partir de los comandos y resultados anteriores, tenemos: +We take note of the devices, partitions and their mount points. From the above commands and results, we have: -- Dos matrices RAID: `/dev/md2` y `/dev/md4`. -- Cuatro particiones forman parte del RAID con los puntos de montaje: `/` y `/home`. +- Two RAID arrays: `/dev/md2` and `/dev/md4`. +- Four partitions are part of the RAID with the mount points: `/` and `/home`. -### Simular una falla de disco +### Simulating a disk failure -Ahora que disponemos de toda la información necesaria, podemos simular una falla de disco y continuar con las pruebas. En este ejemplo, haremos que el disco `sda` falle. +Now that we have all the necessary information, we can simulate a disk failure and proceed with the tests. In this example, we will fail the disk `sda`. -El medio preferido para lograrlo es el entorno en modo rescue de OVHcloud. +The preferred way to do this is via the OVHcloud rescue mode environment. -Reinicie primero el servidor en modo rescue y conéctese con las credenciales proporcionadas. +First reboot the server in rescue mode and log in with the provided credentials. 
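+
+Before failing anything, it can be useful to note which physical drive is which. `lsblk` can print the disk model and serial number next to each device name. A short sketch (the `MODEL` and `SERIAL` columns may be empty if the environment does not expose this information):
+
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -d -o NAME,SIZE,MODEL,SERIAL
+```
+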
-Para retirar un disco del RAID, el primer paso es marcarlo como **Failed** y retirar las particiones de sus matrices RAID respectivas. +To remove a disk from the RAID, the first step is to mark it as **Failed** and remove the partitions from their respective RAID arrays. ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -187,13 +188,13 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -A partir de la salida anterior, sda se compone de dos particiones en RAID que son **sda2** y **sda4**. +From the above output, sda consists of two partitions in RAID which are **sda2** and **sda4**. -#### Retirar el disco defectuoso +#### Removing the failed disk -Comenzamos marcando las particiones **sda2** y **sda4** como **failed**. +First we mark the partitions **sda2** and **sda4** as failed. ```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 @@ -205,7 +206,7 @@ root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 - # mdadm: set /dev/sda4 faulty in /dev/md4 ``` -Hemos simulado ahora una falla del RAID, cuando ejecutamos el comando `cat /proc/mdstat`, obtenemos el siguiente resultado: +We have now simulated a failure of the RAID, when we run the `cat /proc/mdstat` command, we have the following output: ```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -221,9 +222,9 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Como podemos ver arriba, el [F] junto a las particiones indica que el disco está fallando o defectuoso. +As we can see above, the [F] next to the partitions indicates that the disk has failed or is faulty. -A continuación, retiramos estas particiones de las matrices RAID. +Next, we remove these partitions from the RAID arrays. ```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 @@ -235,7 +236,17 @@ root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/ # mdadm: hot removed /dev/sda4 from /dev/md4 ``` -Para asegurarnos de obtener un disco que sea similar a un disco vacío, utilizamos el siguiente comando. Reemplace **sda** por sus propios valores: +To make sure that we get a disk that is similar to an empty disk, we use the following command. Replace **sda** with your own values: + +```sh +shred -s10M -n1 /dev/sda1 +shred -s10M -n1 /dev/sda2 +shred -s10M -n1 /dev/sda3 +shred -s10M -n1 /dev/sda4 +shred -s10M -n1 /dev/sda +``` + +The disk now appears as a new, empty drive: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk @@ -250,16 +261,140 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Si ejecutamos el siguiente comando, vemos que nuestro disco ha sido correctamente «limpiado»: +If we run the following command, we see that our disk has been successfully "wiped": ```sh parted /dev/sda -GNU Parted 3. +GNU Parted 3.5 +Using /dev/sda +Welcome to GNU Parted! Type 'help' to view a list of commands. 
+(parted) p +Error: /dev/sda: unrecognised disk label +Model: HGST HUS724020AL (SATA) +Disk /dev/sda: 1.8T +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: +``` + +Our RAID status should now look like this: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sdb2[0] + 931954688 blocks super 1.2 [1/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sdb4[1] + 1020767232 blocks super 1.2 [1/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +From the results above, we can see that only two partitions now appear in the RAID arrays. We have successfully failed the disk **sda** and we can now proceed with the disk replacement. + +For more information on how to prepare and request for a disk replacement, consult this [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement) + +If you run the following command, you can have more details on the RAID array(s): + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 + +/dev/md4: + Version : 1.2 + Creation Time : Tue Jan 24 15:35:02 2023 + Raid Level : raid1 + Array Size : 1020767232 (973.48 GiB 1045.27 GB) + Used Dev Size : 1020767232 (973.48 GiB 1045.27 GB) + Raid Devices : 2 + Total Devices : 1 + Persistence : Superblock is persistent + + Intent Bitmap : Internal + + Update Time : Tue Jan 24 16:28:03 2023 + State : clean, degraded + Active Devices : 1 + Working Devices : 1 + Failed Devices : 0 + Spare Devices : 0 + +Consistency Policy : bitmap + + Name : md4 + UUID : 7b5c1d80:0a7ab4c2:e769b5e5:9c6eaa0f + Events : 21 + + Number Major Minor RaidDevice State + - 0 0 0 removed + 1 8 20 1 active sync /dev/sdb4 +``` + + + +### Rebuilding the RAID + +> [!warning] +> +> For most servers in software RAID, after a disk replacement, the server is able to boot in normal mode (on the healthy disk) to rebuild the RAID. However, if the server is not able to boot in normal mode, it will be rebooted in rescue mode to proceed with the RAID rebuild. +> + + + +#### Rebuilding the RAID in normal mode + +The following steps are performed in normal mode. In our example, we have replaced the disk **sda**. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sdb) to the new one (sda). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> sudo sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk`. +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdX +>> ``` +>> +>> If you receive the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. If you still cannot see the newly-created partitions (e.g. with `lsblk`), you need to reboot the server before continuing. +>> +> **For MBR partitions** +>> +>> ```sh +>> [user@server_ip ~]# sudo sfdisk -d /dev/sdX | sfdisk /dev/sdX +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. 
+>> + +Next, we add the partitions to the RAID: + +```sh +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/sda2 +# mdadm: added /dev/sda2 +[user@server_ip ~]# sudo mdadm --add /dev/md4 /dev/sda4 # mdadm: re-added /dev/sda4 ``` -Use el siguiente comando para supervisar la reconstrucción del RAID: +Use the following command to monitor the RAID rebuild: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -276,72 +411,97 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Finalmente, añadimos una etiqueta y montamos la partición [SWAP] (si aplica). +Lastly, we add a label and mount the [SWAP] partition (if applicable). -Para añadir una etiqueta a la partición SWAP: +To add a label the SWAP partition: ```sh -[user@server_ip ~]# sudo mkswap /dev/sdb4 -L swap-sdb4 +[user@server_ip ~]# sudo mkswap /dev/sda4 -L swap-sda4 ``` -A continuación, obtenga los UUID de ambas particiones de intercambio: +Next, retrieve the UUIDs of both swap partitions: ```sh [user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" [user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -Reemplazamos el antiguo UUID de la partición de intercambio (**sda4**) por el nuevo en `/etc/fstab`: +We replace the old UUID of the swap partition (**sda4**) with the new one in `/etc/fstab`. + +Example: ```sh [user@server_ip ~]# sudo nano etc/fstab + +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=BIOS /boot vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 ``` -Asegúrese de reemplazar el UUID correcto. +Based on the above results, the old UUID is `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` and should be replaced with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the coorect UUID. -A continuación, recargue el sistema con el siguiente comando: +Next, we verify that everything is properly mounted with the following command: ```sh -[user@server_ip ~]# sudo systemctl daemon-reload +[user@server_ip ~]# sudo mount -av +/ : ignored +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored ``` -Ejecute el siguiente comando para activar la partición de intercambio: +Run the following command to enable the swap partition: ```sh [user@server_ip ~]# sudo swapon -av ``` -La reconstrucción del RAID ahora está terminada. +Then reload the system with the following command: + +```sh +[user@server_ip ~]# sudo systemctl daemon-reload +``` + +We have now successfully completed the RAID rebuild. -/// details | **Reconstrucción del RAID en modo rescue** +/// details | **Rebuilding the RAID in rescue mode** -Una vez reemplazado el disco, debemos copiar la tabla de particiones del disco sano (en este ejemplo, sda) al nuevo (sdb). +If you server is unable to reboot in normal mode after a disk replacement, it will be rebooted in rescue mode. + +In this example, we are replacing the disk `sdb`. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sda) to the new one (sdb). 
> [!tabs] -> **Para particiones GPT** +> **For GPT partitions** >> >> ```sh >> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX >> ``` >> ->> El comando debe tener el siguiente formato: `sgdisk -R /dev/nuevo disco /dev/disco sano` +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` >> ->> Ejemplo: +>> Example: >> >> ```sh >> sudo sgdisk -R /dev/sdb /dev/sda >> ``` >> ->> Una vez realizada esta operación, el siguiente paso consiste en asignar un GUID aleatorio al nuevo disco para evitar conflictos con los GUID de otros discos: +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: >> >> ```sh >> sudo sgdisk -G /dev/sdb >> ``` >> ->> Si aparece el siguiente mensaje: ->> +>> If you the following message: +>> >> ```console >> Warning: The kernel is still using the old partition table. >> The new table will be used at the next reboot or after you @@ -349,27 +509,28 @@ Una vez reemplazado el disco, debemos copiar la tabla de particiones del disco s >> The operation has completed successfully. >> ``` >> ->> Puede simplemente ejecutar el comando `partprobe`. Si aún no ve las nuevas particiones (por ejemplo, con `lsblk`), deberá reiniciar el servidor antes de continuar. +>> You can simply run the `partprobe` command. >> -> **Para particiones MBR** +> **For MBR partitions** >> >> ```sh >> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb >> ``` >> ->> El comando debe tener el siguiente formato: `sfdisk -d /dev/disco sano | sfdisk /dev/nuevo disco` +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` >> -Ahora podemos reconstruir la matriz RAID. El siguiente fragmento de código muestra cómo añadir las nuevas particiones (sdb2 y sdb4) a la matriz RAID. +We can now rebuild the RAID array. The following code snippet shows how we can add the new partitions (sdb2 and sdb4) back in the RAID array. ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 # mdadm: added /dev/sdb2 + root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 # mdadm: re-added /dev/sdb4 ``` -Use el comando `cat /proc/mdstat` para supervisar la reconstrucción del RAID: +Use the `cat /proc/mdstat` command to monitor the RAID rebuild: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -386,7 +547,7 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Para obtener más detalles sobre la o las matrices RAID: +For more details on the RAID array(s): ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 @@ -416,30 +577,30 @@ root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 Events : 0.95 Number Major Minor RaidDevice State - 0 8 2 0 spare rebuilding /dev/sda4 - 1 8 18 1 active sync /dev/sdb4 + 0 8 2 0 active sync /dev/sda4 + 1 8 18 1 spare rebuilding /dev/sdb4 ``` -#### Añadimos la etiqueta a la partición SWAP (si aplica) +#### Adding the label to the SWAP partition (if applicable) -Una vez finalizada la reconstrucción del RAID, montamos la partición que contiene la raíz de nuestro sistema operativo en `/mnt`. En nuestro ejemplo, esta partición es `md4`. +Once the RAID rebuild is complete, we mount the partition containing the root of our operating system on `/mnt`. In our example, that partition is `md4`. 
```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt ``` -Añadimos la etiqueta a nuestra partición de intercambio con el siguiente comando: +We add the label to our swap partition with the command: ```sh -root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sda4 -L swap-sda4 -mkswap: /dev/sda4: warning: wiping old swap signature. +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sdb4 -L swap-sdb4 +mkswap: /dev/sdb4: warning: wiping old swap signature. Setting up swapspace version 1, size = 512 MiB (536866816 bytes) -LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +LABEL=swap-sdb4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd ``` -A continuación, montamos los siguientes directorios para asegurarnos de que cualquier manipulación que realicemos en el entorno chroot funcione correctamente: +Next, we mount the following directories to make sure any manipulation we make in the chroot environment works properly: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # @@ -452,38 +613,35 @@ mount --bind /run /mnt/run mount --make-slave /mnt/run ``` -A continuación, accedemos al entorno `chroot`: +Next, we access the `chroot` environment: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt ``` -Recuperamos los UUID de ambas particiones de intercambio: +We retrieve the UUIDs of both swap partitions: ```sh root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 ``` -Ejemplo: +Example: ```sh blkid /dev/sda4 /dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" -``` - -```sh blkid /dev/sdb4 /dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -A continuación, reemplazamos el antiguo UUID de la partición de intercambio (**sdb4**) por el nuevo en `/etc/fstab`: +Next, we replace the old UUID of the swap partition (**sdb4**) with the new one in `/etc/fstab`: ```sh root@rescue12-customer-eu:/# nano etc/fstab ``` -Ejemplo: +Example: ```sh UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 @@ -492,9 +650,9 @@ UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 ``` -Asegúrese de reemplazar el UUID correcto. En nuestro ejemplo anterior, el UUID a reemplazar es `d6af33cf-fc15-4060-a43c-cb3b5537f58a` por el nuevo `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Asegúrese de reemplazar el UUID correcto. +Make sure you replace the proper UUID. In our example above, the UUID to replace is `d6af33cf-fc15-4060-a43c-cb3b5537f58a` with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the correct UUID. 
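+
+Optionally, you can sanity-check the edited file before going further. Recent versions of util-linux ship `findmnt --verify` for this purpose; treat its availability inside the chroot as an assumption:
+
+```sh
+root@rescue12-customer-eu:/# findmnt --verify
+```
+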
-A continuación, nos aseguramos de que todo esté correctamente montado: +Next, we make sure everything is properly mounted: ```sh root@rescue12-customer-eu:/# mount -av @@ -504,13 +662,7 @@ swap : ignored swap : ignored ``` -Recargue el sistema con el siguiente comando: - -```sh -root@rescue12-customer-eu:/# systemctl daemon-reload -``` - -Active la partición de intercambio con el siguiente comando: +Activate the swap partition the following command: ```sh root@rescue12-customer-eu:/# swapon -av @@ -523,28 +675,34 @@ swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 swapon /dev/sdb4 ``` -Salga del entorno Chroot con `exit` y desmonte todos los discos: +We exit the `chroot` environment with exit and reload the system: ```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload ``` -Hemos terminado con éxito la reconstrucción del RAID en el servidor y ahora podemos reiniciar el servidor en modo normal. +We umount all the disks: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` +We have now successfully completed the RAID rebuild on the server and we can now reboot it in normal mode. -## Más información -[Reemplazo a caliente - RAID software](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) +## Go Further -[API OVHcloud y Almacenamiento](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) +[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) -[Gestión del RAID hardware](/pages/bare_metal_cloud/dedicated_servers/raid_hard) +[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) -[Reemplazo a caliente - RAID hardware](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) +[Managing hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) -Para servicios especializados (posicionamiento, desarrollo, etc.), contacte con los [socios OVHcloud](/links/partner). +[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) -Si desea beneficiarse de una asistencia en el uso y configuración de sus soluciones OVHcloud, le invitamos a consultar nuestras distintas [ofertas de soporte](/links/support). +For specialised services (SEO, development, etc.), contact [OVHcloud partners](/links/partner). + +If you would like assistance using and configuring your OVHcloud solutions, please refer to our [support offers](/links/support). -Si necesita una formación o asistencia técnica para la implementación de nuestras soluciones, contacte con su comercial o haga clic en [este enlace](/links/professional-services) para obtener un presupuesto y solicitar un análisis personalizado de su proyecto a nuestros expertos del equipo Professional Services. +If you need training or technical assistance to implement our solutions, contact your sales representative or click on [this link](/links/professional-services) to get a quote and ask our Professional Services experts for assisting you on your specific use case of your project. -Interactúe con nuestra [comunidad de usuarios](/links/community). \ No newline at end of file +Join our [community of users](/links/community). 
diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.it-it.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.it-it.md index 39f7fa84d0e..2a20f7e415d 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.it-it.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.it-it.md @@ -1,48 +1,50 @@ --- -title: Gestione e ricostruzione del RAID software sui server in modalità legacy boot (BIOS) -excerpt: "Scopri come gestire e ricostruire il RAID software dopo il sostituzione di un disco su un server in modalità legacy boot (BIOS)" +title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode +excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode updated: 2025-12-11 --- -## Obiettivo +## Objective -Il RAID (Redundant Array of Independent Disks) è un insieme di tecniche progettate per ridurre la perdita di dati su un server replicandoli su più dischi. +Redundant Array of Independent Disks (RAID) is a technology that mitigates data loss on a server by replicating data across two or more disks. -Il livello RAID predefinito per le installazioni dei server OVHcloud è RAID 1, che raddoppia lo spazio occupato dai vostri dati, riducendo quindi a metà lo spazio disco utilizzabile. +The default RAID level for OVHcloud server installations is RAID 1, which doubles the space taken up by your data, effectively halving the useable disk space. -**Questa guida spiega come gestire e ricostruire un RAID software in caso di sostituzione di un disco su un server in modalità legacy boot (BIOS).** +**This guide explains how to manage and rebuild a software RAID in the event of a disk replacement on your server in legacy boot mode (BIOS).** -Prima di iniziare, notate che questa guida si concentra sui Server dedicati che utilizzano la modalità legacy boot (BIOS). Se il vostro server utilizza la modalità UEFI (schede madri più recenti), fate riferimento a questa guida [Gestione e ricostruzione del RAID software sui server in modalità boot UEFI](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). +Before we begin, please note that this guide focuses on Dedicated servers that use legacy boot (BIOS) mode. If your server uses the UEFI mode (newer motherboards), refer to this guide [Managing and rebuilding software RAID on servers in UEFI boot mode](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). -Per verificare se un server è in esecuzione in modalità BIOS o in modalità UEFI, eseguite il comando seguente : +To check whether a server runs on legacy BIOS or UEFI mode, run the following command: ```sh [user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS ``` -## Prerequisiti +## Requirements -- Possedere un [server dedicato](/links/bare-metal/bare-metal) con una configurazione RAID software. -- Avere accesso al server tramite SSH come amministratore (sudo). -- Conoscenza del RAID e delle partizioni +- A [Dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration +- Administrative (sudo) access to the server via SSH +- Understanding of RAID and partitions -## Procedura +## Instructions -### Panoramica del contenuto +When you purchase a new server, you may feel the need to perform a series of tests and actions. One such test could be to simulate a disk failure in order to understand the RAID rebuild process and prepare yourself in case it ever happens. 
-- [Informazioni di base](#basicinformation) -- [Simulare un guasto del disco](#diskfailure) - - [Rimozione del disco guasto](#diskremove) -- [Ricostruzione del RAID](#raidrebuild) - - [Ricostruzione del RAID in modalità rescue](#rescuemode) - - [Aggiunta dell'etichetta alla partizione SWAP (se necessario)](#swap-partition) - - [Ricostruzione del RAID in modalità normale](#normalmode) +### Content overview + +- [Basic Information](#basicinformation) +- [Simulating a disk failure](#diskfailure) + - [Removing the failed disk](#diskremove) +- [Rebuilding the RAID](#raidrebuild) + - [Rebuilding the RAID in rescue mode](#rescuemode) + - [Adding the label to the SWAP partition (if applicable)](#swap-partition) + - [Rebuilding the RAID in normal mode](#normalmode) -### Informazioni di base +### Basic Information -Nella sessione della riga di comando, digitate il codice seguente per determinare lo stato attuale del RAID. +In a command line session, type the following code to determine the current RAID status: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -59,11 +61,11 @@ md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] unused devices: ``` -Questo comando ci indica che due dispositivi RAID software sono attualmente configurati, **md4** essendo il più grande. Il dispositivo RAID **md4** è composto da due partizioni, denominate **nvme1n1p4** e **nvme0n1p4**. +This command shows us that we have two software RAID devices currently set up, with **md4** being the largest one. The **md4** RAID device consists of two partitions, which are known as **nvme1n1p4** and **nvme0n1p4**. -Il [UU] significa che tutti i dischi funzionano normalmente. Un `_` indica un disco guasto. +The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. -Se possedete un server con dischi SATA, otterrete i seguenti risultati : +If you have a server with SATA disks, you would get the following results: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -80,7 +82,7 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Sebbene questo comando restituisca i nostri volumi RAID, non ci indica la dimensione delle partizioni stesse. Possiamo trovare questa informazione con il comando seguente : +Although this command returns our RAID volumes, it doesn't tell us the size of the partitions themselves. We can find this information with the following command: ```sh [user@server_ip ~]# sudo fdisk -l @@ -125,13 +127,13 @@ Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes ``` -Il comando `fdisk -l` vi permette inoltre di identificare il tipo di partizione. Si tratta di un'informazione importante per ricostruire il vostro RAID in caso di guasto di un disco. +The `fdisk -l` command also allows you to identify your partition type. This is an important information when it comes to rebuilding your RAID in case of a disk failure. -Per le partizioni **GPT**, la riga 6 mostrerà: `Disklabel type: gpt`. Queste informazioni sono visibili solo quando il server è in modalità normale. +For **GPT** partitions, line 6 will display: `Disklabel type: gpt`. This information can only been seen when the server is in normal mode. -Ancora in base ai risultati di `fdisk -l`, possiamo vedere che `/dev/md2` è composto da 888.8GB e `/dev/md4` contiene 973.5GB. +Still going by the results of `fdisk -l`, we can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. 
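+
+For a compact, one-line summary of each array (device name, metadata version and UUID), `mdadm` can also be queried directly. A short example based on the setup above:
+
+```sh
+[user@server_ip ~]# sudo mdadm --detail --scan
+```
+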
-In alternativa, il comando `lsblk` offre una visione diversa delle partizioni : +Alternatively, the `lsblk` command offers a different view of the partitions: ```sh [user@server_ip ~]# lsblk @@ -154,22 +156,22 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Prendiamo in considerazione i dispositivi, le partizioni e i loro punti di montaggio. Dai comandi e dai risultati sopra, abbiamo : +We take note of the devices, partitions and their mount points. From the above commands and results, we have: -- Due array RAID : `/dev/md2` e `/dev/md4`. -- Quattro partizioni fanno parte del RAID con i punti di montaggio : `/` e `/home`. +- Two RAID arrays: `/dev/md2` and `/dev/md4`. +- Four partitions are part of the RAID with the mount points: `/` and `/home`. -### Simulare un guasto del disco +### Simulating a disk failure -Ora che abbiamo tutte le informazioni necessarie, possiamo simulare un guasto del disco e procedere ai test. In questo esempio, faremo fallire il disco `sda`. +Now that we have all the necessary information, we can simulate a disk failure and proceed with the tests. In this example, we will fail the disk `sda`. -Il metodo preferito per farlo è l'ambiente in modalità rescue di OVHcloud. +The preferred way to do this is via the OVHcloud rescue mode environment. -Riavviate prima il server in modalità rescue e connettetevi con le credenziali fornite. +First reboot the server in rescue mode and log in with the provided credentials. -Per rimuovere un disco dal RAID, il primo passo consiste nel marcarlo come **Failed** e rimuovere le partizioni dai loro array RAID rispettivi. +To remove a disk from the RAID, the first step is to mark it as **Failed** and remove the partitions from their respective RAID arrays. ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -186,13 +188,13 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Dall'output sopra, sda è composto da due partizioni in RAID che sono **sda2** e **sda4**. +From the above output, sda consists of two partitions in RAID which are **sda2** and **sda4**. -#### Rimozione del disco guasto +#### Removing the failed disk -Iniziamo marciando le partizioni **sda2** e **sda4** come **failed**. +First we mark the partitions **sda2** and **sda4** as failed. ```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 @@ -204,7 +206,7 @@ root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 - # mdadm: set /dev/sda4 faulty in /dev/md4 ``` -Ora abbiamo simulato un guasto al RAID, quando eseguiamo il comando `cat /proc/mdstat`, otteniamo il risultato seguente : +We have now simulated a failure of the RAID, when we run the `cat /proc/mdstat` command, we have the following output: ```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -220,9 +222,9 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Come possiamo vedere sopra, il [F] accanto alle partizioni indica che il disco è guasto o difettoso. +As we can see above, the [F] next to the partitions indicates that the disk has failed or is faulty. -Successivamente, rimuoviamo queste partizioni dagli array RAID. +Next, we remove these partitions from the RAID arrays. 
```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 @@ -234,7 +236,17 @@ root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/ # mdadm: hot removed /dev/sda4 from /dev/md4 ``` -Per assicurarci di ottenere un disco simile a un disco vuoto, utilizziamo il comando seguente. Sostituite **sda** con i vostri valori : +To make sure that we get a disk that is similar to an empty disk, we use the following command. Replace **sda** with your own values: + +```sh +shred -s10M -n1 /dev/sda1 +shred -s10M -n1 /dev/sda2 +shred -s10M -n1 /dev/sda3 +shred -s10M -n1 /dev/sda4 +shred -s10M -n1 /dev/sda +``` + +The disk now appears as a new, empty drive: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk @@ -245,12 +257,144 @@ sdb 8:16 0 1.8T 0 disk ├─sdb2 8:18 0 888.9G 0 part │ └─md2 9:2 0 888.8G 0 raid1 / ├─sdb3 8:19 0 512M 0 part [SWAP] -└─sdb4 8:20 0 973.6G +└─sdb4 8:20 0 973.6G 0 part + └─md4 9:4 0 973.5G 0 raid1 /home +``` + +If we run the following command, we see that our disk has been successfully "wiped": + +```sh +parted /dev/sda +GNU Parted 3.5 +Using /dev/sda +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/sda: unrecognised disk label +Model: HGST HUS724020AL (SATA) +Disk /dev/sda: 1.8T +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: +``` + +Our RAID status should now look like this: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sdb2[0] + 931954688 blocks super 1.2 [1/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sdb4[1] + 1020767232 blocks super 1.2 [1/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +From the results above, we can see that only two partitions now appear in the RAID arrays. We have successfully failed the disk **sda** and we can now proceed with the disk replacement. + +For more information on how to prepare and request for a disk replacement, consult this [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement) + +If you run the following command, you can have more details on the RAID array(s): + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 + +/dev/md4: + Version : 1.2 + Creation Time : Tue Jan 24 15:35:02 2023 + Raid Level : raid1 + Array Size : 1020767232 (973.48 GiB 1045.27 GB) + Used Dev Size : 1020767232 (973.48 GiB 1045.27 GB) + Raid Devices : 2 + Total Devices : 1 + Persistence : Superblock is persistent + + Intent Bitmap : Internal + + Update Time : Tue Jan 24 16:28:03 2023 + State : clean, degraded + Active Devices : 1 + Working Devices : 1 + Failed Devices : 0 + Spare Devices : 0 + +Consistency Policy : bitmap + + Name : md4 + UUID : 7b5c1d80:0a7ab4c2:e769b5e5:9c6eaa0f + Events : 21 + + Number Major Minor RaidDevice State + - 0 0 0 removed + 1 8 20 1 active sync /dev/sdb4 +``` + + + +### Rebuilding the RAID + +> [!warning] +> +> For most servers in software RAID, after a disk replacement, the server is able to boot in normal mode (on the healthy disk) to rebuild the RAID. However, if the server is not able to boot in normal mode, it will be rebooted in rescue mode to proceed with the RAID rebuild. +> + + + +#### Rebuilding the RAID in normal mode + +The following steps are performed in normal mode. In our example, we have replaced the disk **sda**. 
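+
+If you want an extra check on the replacement drive before rebuilding onto it, a quick SMART health query can be run first. A sketch, assuming the `smartmontools` package is installed and using the device name from this example:
+
+```sh
+[user@server_ip ~]# sudo smartctl -H /dev/sda
+```
+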
+ +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sdb) to the new one (sda). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> sudo sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk`. +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdX +>> ``` +>> +>> If you receive the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. If you still cannot see the newly-created partitions (e.g. with `lsblk`), you need to reboot the server before continuing. +>> +> **For MBR partitions** +>> +>> ```sh +>> [user@server_ip ~]# sudo sfdisk -d /dev/sdX | sfdisk /dev/sdX +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +>> + +Next, we add the partitions to the RAID: +```sh +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/sda2 +# mdadm: added /dev/sda2 + +[user@server_ip ~]# sudo mdadm --add /dev/md4 /dev/sda4 # mdadm: re-added /dev/sda4 ``` -Utilizza il comando seguente per monitorare la ricostruzione del RAID : +Use the following command to monitor the RAID rebuild: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -267,72 +411,97 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Infine, aggiungiamo un'etichetta e montiamo la partizione [SWAP] (se necessario). +Lastly, we add a label and mount the [SWAP] partition (if applicable). -Per aggiungere un'etichetta alla partizione SWAP : +To add a label the SWAP partition: ```sh -[user@server_ip ~]# sudo mkswap /dev/sdb4 -L swap-sdb4 +[user@server_ip ~]# sudo mkswap /dev/sda4 -L swap-sda4 ``` -Successivamente, recuperiamo gli UUID delle due partizioni swap : +Next, retrieve the UUIDs of both swap partitions: ```sh [user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" [user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -Sostituiamo l'UUID vecchio della partizione swap (**sda4**) con il nuovo in `/etc/fstab` : +We replace the old UUID of the swap partition (**sda4**) with the new one in `/etc/fstab`. + +Example: ```sh [user@server_ip ~]# sudo nano etc/fstab + +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=BIOS /boot vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 ``` -Assicurati di sostituire l'UUID corretto. +Based on the above results, the old UUID is `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` and should be replaced with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the coorect UUID. 
-Successivamente, ricarica il sistema con il comando seguente : +Next, we verify that everything is properly mounted with the following command: ```sh -[user@server_ip ~]# sudo systemctl daemon-reload +[user@server_ip ~]# sudo mount -av +/ : ignored +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored ``` -Esegui il comando seguente per attivare la partizione swap : +Run the following command to enable the swap partition: ```sh [user@server_ip ~]# sudo swapon -av ``` -La ricostruzione del RAID è ora completata. +Then reload the system with the following command: + +```sh +[user@server_ip ~]# sudo systemctl daemon-reload +``` + +We have now successfully completed the RAID rebuild. -/// details | **Ricostruzione del RAID in modalità rescue** +/// details | **Rebuilding the RAID in rescue mode** + +If you server is unable to reboot in normal mode after a disk replacement, it will be rebooted in rescue mode. + +In this example, we are replacing the disk `sdb`. -Una volta sostituito il disco, dobbiamo copiare la tabella delle partizioni del disco sano (in questo esempio, sda) verso il nuovo (sdb). +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sda) to the new one (sdb). > [!tabs] -> **Per le partizioni GPT** +> **For GPT partitions** >> >> ```sh >> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX >> ``` >> ->> Il comando deve essere nel formato seguente : `sgdisk -R /dev/nuovo disco /dev/disco sano` +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` >> ->> Esempio : +>> Example: >> >> ```sh >> sudo sgdisk -R /dev/sdb /dev/sda >> ``` >> ->> Una volta completata questa operazione, il passo successivo consiste nell'assegnare un GUID casuale al nuovo disco per evitare conflitti con i GUID di altri dischi : +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: >> >> ```sh >> sudo sgdisk -G /dev/sdb >> ``` >> ->> Se appare il seguente messaggio : ->> +>> If you the following message: +>> >> ```console >> Warning: The kernel is still using the old partition table. >> The new table will be used at the next reboot or after you @@ -340,27 +509,28 @@ Una volta sostituito il disco, dobbiamo copiare la tabella delle partizioni del >> The operation has completed successfully. >> ``` >> ->> È sufficiente eseguire il comando `partprobe`. Se non riesci comunque a visualizzare le nuove partizioni (ad esempio con `lsblk`), devi riavviare il server prima di procedere. +>> You can simply run the `partprobe` command. >> -> **Per le partizioni MBR** +> **For MBR partitions** >> >> ```sh >> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb >> ``` >> ->> Il comando deve essere nel formato seguente : `sfdisk -d /dev/disco sano | sfdisk /dev/nuovo disco` +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` >> -Possiamo ora ricostruire l'array RAID. L'estratto di codice seguente mostra come aggiungere le nuove partizioni (sdb2 e sdb4) nell'array RAID. +We can now rebuild the RAID array. The following code snippet shows how we can add the new partitions (sdb2 and sdb4) back in the RAID array. 
```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 # mdadm: added /dev/sdb2 + root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 # mdadm: re-added /dev/sdb4 ``` -Utilizza il comando `cat /proc/mdstat` per monitorare la ricostruzione del RAID : +Use the `cat /proc/mdstat` command to monitor the RAID rebuild: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -377,7 +547,7 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Per ulteriori dettagli su una o più matrici RAID : +For more details on the RAID array(s): ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 @@ -407,30 +577,30 @@ root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 Events : 0.95 Number Major Minor RaidDevice State - 0 8 2 0 spare rebuilding /dev/sda4 - 1 8 18 1 active sync /dev/sdb4 + 0 8 2 0 active sync /dev/sda4 + 1 8 18 1 spare rebuilding /dev/sdb4 ``` -#### Aggiunta dell'etichetta alla partizione SWAP (se necessario) +#### Adding the label to the SWAP partition (if applicable) -Una volta completata la ricostruzione del RAID, montiamo la partizione che contiene la radice del nostro sistema operativo su `/mnt`. Nell'esempio, questa partizione è `md4`. +Once the RAID rebuild is complete, we mount the partition containing the root of our operating system on `/mnt`. In our example, that partition is `md4`. ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt ``` -Aggiungiamo l'etichetta alla nostra partizione swap con il comando : +We add the label to our swap partition with the command: ```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sda4 -L swap-sda4 -mkswap: /dev/sda4: warning: wiping old swap signature. +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sdb4 -L swap-sdb4 +mkswap: /dev/sdb4: warning: wiping old swap signature. 
Setting up swapspace version 1, size = 512 MiB (536866816 bytes) -LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd +LABEL=swap-sdb4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd ``` -Successivamente, montiamo le seguenti directory per assicurarci che qualsiasi modifica che effettuiamo nell'ambiente chroot funzioni correttamente : +Next, we mount the following directories to make sure any manipulation we make in the chroot environment works properly: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # @@ -443,38 +613,35 @@ mount --bind /run /mnt/run mount --make-slave /mnt/run ``` -Successivamente, accediamo all'ambiente `chroot` : +Next, we access the `chroot` environment: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt ``` -Recuperiamo gli UUID delle due partizioni swap : +We retrieve the UUIDs of both swap partitions: ```sh root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 ``` -Esempio: +Example: ```sh blkid /dev/sda4 /dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" -``` - -```sh blkid /dev/sdb4 /dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -Successivamente, sostituiamo l'UUID vecchio della partizione swap (**sdb4**) con il nuovo in `/etc/fstab` : +Next, we replace the old UUID of the swap partition (**sdb4**) with the new one in `/etc/fstab`: ```sh root@rescue12-customer-eu:/# nano etc/fstab ``` -Esempio: +Example: ```sh UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 @@ -483,9 +650,9 @@ UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 ``` -Assicurati di sostituire l'UUID corretto. Nell'esempio sopra, l'UUID da sostituire è `d6af33cf-fc15-4060-a43c-cb3b5537f58a` con il nuovo `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Assicurati di sostituire l'UUID corretto. +Make sure you replace the proper UUID. In our example above, the UUID to replace is `d6af33cf-fc15-4060-a43c-cb3b5537f58a` with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the correct UUID. -Successivamente, verifichiamo che tutto sia correttamente montato : +Next, we make sure everything is properly mounted: ```sh root@rescue12-customer-eu:/# mount -av @@ -495,13 +662,7 @@ swap : ignored swap : ignored ``` -Ricarica il sistema con il comando seguente : - -```sh -root@rescue12-customer-eu:/# systemctl daemon-reload -``` - -Attiva la partizione swap con il comando seguente : +Activate the swap partition the following command: ```sh root@rescue12-customer-eu:/# swapon -av @@ -514,28 +675,34 @@ swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 swapon /dev/sdb4 ``` -Esci dall'ambiente Chroot con `exit` e smonta tutti i dischi : +We exit the `chroot` environment with exit and reload the system: ```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload ``` -Abbiamo ora completato con successo la ricostruzione del RAID sul server e possiamo ora riavviarlo in modalità normale. +We umount all the disks: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt +``` -## Per saperne di più +We have now successfully completed the RAID rebuild on the server and we can now reboot it in normal mode. 
-[Hotswap - RAID software](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) +## Go Further -[API OVHcloud e Archiviazione](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) +[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) -[Gestione del RAID hardware](/pages/bare_metal_cloud/dedicated_servers/raid_hard) +[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) -[Hotswap - RAID hardware](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) +[Managing hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) -Per servizi specializzati (posizionamento, sviluppo, ecc.), contatta i [partner OVHcloud](/links/partner). +[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) -Se desideri ricevere un supporto sull'utilizzo e la configurazione delle tue soluzioni OVHcloud, consulta le nostre diverse [offerte di supporto](/links/support). +For specialised services (SEO, development, etc.), contact [OVHcloud partners](/links/partner). + +If you would like assistance using and configuring your OVHcloud solutions, please refer to our [support offers](/links/support). -Se hai bisogno di un corso o di un supporto tecnico per l'implementazione delle nostre soluzioni, contatta il tuo commerciale o clicca su [questo link](/links/professional-services) per ottenere un preventivo e richiedere un'analisi personalizzata del tuo progetto ai nostri esperti del team Professional Services. +If you need training or technical assistance to implement our solutions, contact your sales representative or click on [this link](/links/professional-services) to get a quote and ask our Professional Services experts for assisting you on your specific use case of your project. -Contatta la nostra [Community di utenti](/links/community). \ No newline at end of file +Join our [community of users](/links/community). diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pl-pl.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pl-pl.md index b26dc5ac034..2a20f7e415d 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pl-pl.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pl-pl.md @@ -1,50 +1,50 @@ --- -title: Zarządzanie i odbudowanie oprogramowania RAID na serwerach w trybie rozruchu legacy (BIOS) -excerpt: Dowiedz się, jak zarządzać i odbudować oprogramowanie RAID po wymianie dysku na serwerze w trybie rozruchu legacy (BIOS) +title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode +excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode updated: 2025-12-11 --- -## Wprowadzenie +## Objective -Redundantny zbiór niezależnych dysków (RAID) to technologia, która zmniejsza utratę danych na serwerze, replikując dane na dwóch lub więcej dyskach. +Redundant Array of Independent Disks (RAID) is a technology that mitigates data loss on a server by replicating data across two or more disks. -Domyślny poziom RAID dla instalacji serwerów OVHcloud to RAID 1, który podwaja przestrzeń zajmowaną przez dane, skutecznie zmniejszając wykorzystywalną przestrzeń dyskową. +The default RAID level for OVHcloud server installations is RAID 1, which doubles the space taken up by your data, effectively halving the useable disk space. 
-**Ta instrukcja wyjaśnia, jak zarządzać i odbudować oprogramowanie RAID w przypadku wymiany dysku na serwerze w trybie rozruchu legacy (BIOS).** +**This guide explains how to manage and rebuild a software RAID in the event of a disk replacement on your server in legacy boot mode (BIOS).** -Zanim zaczniemy, zwróć uwagę, że ta instrukcja koncentruje się na Serwerach dedykowanych, które używają trybu rozruchu legacy (BIOS). Jeśli Twój serwer używa trybu UEFI (nowsze płyty główne), odwiedź tę instrukcję [Zarządzanie i odbudowanie oprogramowania RAID na serwerach w trybie rozruchu UEFI](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). +Before we begin, please note that this guide focuses on Dedicated servers that use legacy boot (BIOS) mode. If your server uses the UEFI mode (newer motherboards), refer to this guide [Managing and rebuilding software RAID on servers in UEFI boot mode](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). -Aby sprawdzić, czy serwer działa w trybie legacy BIOS czy UEFI, uruchom następujące polecenie: +To check whether a server runs on legacy BIOS or UEFI mode, run the following command: ```sh [user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS ``` -## Wymagania początkowe +## Requirements -- Serwer [Dedykowany](/links/bare-metal/bare-metal) z konfiguracją oprogramowania RAID -- Dostęp administracyjny (sudo) do serwera przez SSH -- Zrozumienie RAID i partycji +- A [Dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration +- Administrative (sudo) access to the server via SSH +- Understanding of RAID and partitions -## W praktyce +## Instructions -Kiedy zakupisz nowy serwer, możesz czuć potrzebę wykonania szeregu testów i działań. Jednym z takich testów może być symulacja awarii dysku, aby zrozumieć proces odbudowy RAID i przygotować się na wypadek, gdyby to się kiedykolwiek zdarzyło. +When you purchase a new server, you may feel the need to perform a series of tests and actions. One such test could be to simulate a disk failure in order to understand the RAID rebuild process and prepare yourself in case it ever happens. -### Omówienie treści +### Content overview -- [Podstawowe informacje](#basicinformation) -- [Symulowanie awarii dysku](#diskfailure) - - [Usuwanie uszkodzonego dysku](#diskremove) -- [Odbudowanie RAID](#raidrebuild) - - [Odbudowanie RAID w trybie ratunkowym](#rescuemode) - - [Dodawanie etykiety do partycji SWAP (jeśli dotyczy)](#swap-partition) - - [Odbudowanie RAID w trybie normalnym](#normalmode) +- [Basic Information](#basicinformation) +- [Simulating a disk failure](#diskfailure) + - [Removing the failed disk](#diskremove) +- [Rebuilding the RAID](#raidrebuild) + - [Rebuilding the RAID in rescue mode](#rescuemode) + - [Adding the label to the SWAP partition (if applicable)](#swap-partition) + - [Rebuilding the RAID in normal mode](#normalmode) -### Podstawowe informacje +### Basic Information -W sesji wiersza poleceń wpisz poniższe polecenie, aby określić bieżący stan RAID: +In a command line session, type the following code to determine the current RAID status: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -61,11 +61,11 @@ md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] unused devices: ``` -To polecenie pokazuje nam, że mamy dwa urządzenia RAID oprogramowania obecnie skonfigurowane, z **md4** będącym największym z nich. Urządzenie RAID **md4** składa się z dwóch partycji, które są znane jako **nvme1n1p4** i **nvme0n1p4**. 
+This command shows us that we have two software RAID devices currently set up, with **md4** being the largest one. The **md4** RAID device consists of two partitions, which are known as **nvme1n1p4** and **nvme0n1p4**. -[UU] oznacza, że wszystkie dyski działają normalnie. `_` wskazuje na uszkodzony dysk. +The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. -Jeśli masz serwer z dyskami SATA, otrzymasz następujące wyniki: +If you have a server with SATA disks, you would get the following results: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -82,7 +82,7 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Choć to polecenie zwraca nasze objętości RAID, nie mówi nam o rozmiarze samych partycji. Te informacje możemy znaleźć za pomocą poniższego polecenia: +Although this command returns our RAID volumes, it doesn't tell us the size of the partitions themselves. We can find this information with the following command: ```sh [user@server_ip ~]# sudo fdisk -l @@ -127,13 +127,13 @@ Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes ``` -Polecenie `fdisk -l` pozwala również zidentyfikować typ partycji. Jest to ważna informacja, gdy chodzi o odbudowanie RAID w przypadku awarii dysku. +The `fdisk -l` command also allows you to identify your partition type. This is an important information when it comes to rebuilding your RAID in case of a disk failure. -Dla partycji **GPT**, linia 6 będzie wyświetlać: `Disklabel type: gpt`. Ta informacja może być widoczna tylko, gdy serwer działa w trybie normalnym. +For **GPT** partitions, line 6 will display: `Disklabel type: gpt`. This information can only been seen when the server is in normal mode. -Zgodnie z wynikami `fdisk -l`, możemy stwierdzić, że `/dev/md2` składa się z 888,8 GB, a `/dev/md4` zawiera 973,5 GB. +Still going by the results of `fdisk -l`, we can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. -Alternatywnie, polecenie `lsblk` oferuje inny widok partycji: +Alternatively, the `lsblk` command offers a different view of the partitions: ```sh [user@server_ip ~]# lsblk @@ -156,22 +156,22 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Zwracamy uwagę na urządzenia, partycje i ich punkty montowania. Z powyższych poleceń i wyników mamy: +We take note of the devices, partitions and their mount points. From the above commands and results, we have: -- Dwa tablice RAID: `/dev/md2` i `/dev/md4`. -- Cztery partycje należące do RAID z punktami montowania: `/` i `/home`. +- Two RAID arrays: `/dev/md2` and `/dev/md4`. +- Four partitions are part of the RAID with the mount points: `/` and `/home`. -### Symulowanie awarii dysku +### Simulating a disk failure -Teraz, gdy mamy wszystkie niezbędne informacje, możemy zasymulować awarię dysku i kontynuować testy. W tym przykładzie zasymulujemy awarię dysku `sda`. +Now that we have all the necessary information, we can simulate a disk failure and proceed with the tests. In this example, we will fail the disk `sda`. -Preferowany sposób to wykonanie tego za pośrednictwem środowiska ratunkowego OVHcloud. +The preferred way to do this is via the OVHcloud rescue mode environment. -Najpierw uruchom serwer w trybie ratunkowym i zaloguj się przy użyciu dostarczonych poświadczeń. +First reboot the server in rescue mode and log in with the provided credentials. -Aby usunąć dysk z RAID, pierwszym krokiem jest oznaczenie go jako **Awaryjny** i usunięcie partycji z ich odpowiednich tablic RAID. 
+To remove a disk from the RAID, the first step is to mark it as **Failed** and remove the partitions from their respective RAID arrays. ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -188,13 +188,13 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Z powyższego wyniku wynika, że sda składa się z dwóch partycji w RAID, które to **sda2** i **sda4**. +From the above output, sda consists of two partitions in RAID which are **sda2** and **sda4**. -#### Usuwanie uszkodzonego dysku +#### Removing the failed disk -Najpierw oznaczamy partycje **sda2** i **sda4** jako awaryjne. +First we mark the partitions **sda2** and **sda4** as failed. ```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 @@ -206,7 +206,7 @@ root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 - # mdadm: set /dev/sda4 faulty in /dev/md4 ``` -Teraz zasymulowaliśmy awarię RAID, a po uruchomieniu polecenia `cat /proc/mdstat` mamy następujące dane wyjściowe: +We have now simulated a failure of the RAID, when we run the `cat /proc/mdstat` command, we have the following output: ```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -222,9 +222,9 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Jak widać powyżej, [F] obok partycji wskazuje, że dysk uległ awarii lub jest uszkodzony. +As we can see above, the [F] next to the partitions indicates that the disk has failed or is faulty. -Następnie usuwamy te partycje z tablic RAID. +Next, we remove these partitions from the RAID arrays. ```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 @@ -236,15 +236,165 @@ root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/ # mdadm: hot removed /dev/sda4 from /dev/md4 ``` -Aby upewnić się, że otrzymamy dysk podobny do pustego dysku, używamy poniższego polecenia. Zamień **sda** na swoje własne wartości: +To make sure that we get a disk that is similar to an empty disk, we use the following command. Replace **sda** with your own values: ```sh -shred -s +shred -s10M -n1 /dev/sda1 +shred -s10M -n1 /dev/sda2 +shred -s10M -n1 /dev/sda3 +shred -s10M -n1 /dev/sda4 +shred -s10M -n1 /dev/sda +``` + +The disk now appears as a new, empty drive: -# mdadm: ponownie dodano /dev/sda4 +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +sda 8:0 0 1.8T 0 disk +sdb 8:16 0 1.8T 0 disk +├─sdb1 8:17 0 1M 0 part +├─sdb2 8:18 0 888.9G 0 part +│ └─md2 9:2 0 888.8G 0 raid1 / +├─sdb3 8:19 0 512M 0 part [SWAP] +└─sdb4 8:20 0 973.6G 0 part + └─md4 9:4 0 973.5G 0 raid1 /home ``` -Aby monitorować odbudowę RAID, użyj poniższego polecenia: +If we run the following command, we see that our disk has been successfully "wiped": + +```sh +parted /dev/sda +GNU Parted 3.5 +Using /dev/sda +Welcome to GNU Parted! Type 'help' to view a list of commands. 
+(parted) p +Error: /dev/sda: unrecognised disk label +Model: HGST HUS724020AL (SATA) +Disk /dev/sda: 1.8T +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: +``` + +Our RAID status should now look like this: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sdb2[0] + 931954688 blocks super 1.2 [1/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sdb4[1] + 1020767232 blocks super 1.2 [1/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +From the results above, we can see that only two partitions now appear in the RAID arrays. We have successfully failed the disk **sda** and we can now proceed with the disk replacement. + +For more information on how to prepare and request for a disk replacement, consult this [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement) + +If you run the following command, you can have more details on the RAID array(s): + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 + +/dev/md4: + Version : 1.2 + Creation Time : Tue Jan 24 15:35:02 2023 + Raid Level : raid1 + Array Size : 1020767232 (973.48 GiB 1045.27 GB) + Used Dev Size : 1020767232 (973.48 GiB 1045.27 GB) + Raid Devices : 2 + Total Devices : 1 + Persistence : Superblock is persistent + + Intent Bitmap : Internal + + Update Time : Tue Jan 24 16:28:03 2023 + State : clean, degraded + Active Devices : 1 + Working Devices : 1 + Failed Devices : 0 + Spare Devices : 0 + +Consistency Policy : bitmap + + Name : md4 + UUID : 7b5c1d80:0a7ab4c2:e769b5e5:9c6eaa0f + Events : 21 + + Number Major Minor RaidDevice State + - 0 0 0 removed + 1 8 20 1 active sync /dev/sdb4 +``` + + + +### Rebuilding the RAID + +> [!warning] +> +> For most servers in software RAID, after a disk replacement, the server is able to boot in normal mode (on the healthy disk) to rebuild the RAID. However, if the server is not able to boot in normal mode, it will be rebooted in rescue mode to proceed with the RAID rebuild. +> + + + +#### Rebuilding the RAID in normal mode + +The following steps are performed in normal mode. In our example, we have replaced the disk **sda**. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sdb) to the new one (sda). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> sudo sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk`. +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdX +>> ``` +>> +>> If you receive the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. If you still cannot see the newly-created partitions (e.g. with `lsblk`), you need to reboot the server before continuing. +>> +> **For MBR partitions** +>> +>> ```sh +>> [user@server_ip ~]# sudo sfdisk -d /dev/sdX | sfdisk /dev/sdX +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. 
+>> + +Next, we add the partitions to the RAID: + +```sh +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/sda2 +# mdadm: added /dev/sda2 + +[user@server_ip ~]# sudo mdadm --add /dev/md4 /dev/sda4 +# mdadm: re-added /dev/sda4 +``` + +Use the following command to monitor the RAID rebuild: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -261,15 +411,15 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Na koniec dodajemy etykietę i montujemy partycję [SWAP] (jeśli dotyczy). +Lastly, we add a label and mount the [SWAP] partition (if applicable). -Aby dodać etykietę do partycji SWAP: +To add a label the SWAP partition: ```sh [user@server_ip ~]# sudo mkswap /dev/sda4 -L swap-sda4 ``` -Następnie pobierz UUID obu partycji SWAP: +Next, retrieve the UUIDs of both swap partitions: ```sh [user@server_ip ~]# sudo blkid -s UUID /dev/sda4 @@ -278,9 +428,9 @@ Następnie pobierz UUID obu partycji SWAP: /dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -Zastępujemy stary UUID partycji SWAP (**sda4**) nowym w pliku `/etc/fstab`. +We replace the old UUID of the swap partition (**sda4**) with the new one in `/etc/fstab`. -Przykład: +Example: ```sh [user@server_ip ~]# sudo nano etc/fstab @@ -292,9 +442,9 @@ UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 ``` -Na podstawie powyższych wyników, stary UUID to `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` i powinien zostać zastąpiony nowym `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Upewnij się, że zastępujesz poprawny UUID. +Based on the above results, the old UUID is `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` and should be replaced with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the coorect UUID. -Następnie sprawdzamy, czy wszystko zostało poprawnie zamontowane, używając poniższego polecenia: +Next, we verify that everything is properly mounted with the following command: ```sh [user@server_ip ~]# sudo mount -av @@ -305,52 +455,52 @@ swap : ignored swap : ignored ``` -Uruchom poniższe polecenie, aby włączyć partycję SWAP: +Run the following command to enable the swap partition: ```sh [user@server_ip ~]# sudo swapon -av ``` -Następnie przeładuj system poniższym poleceniem: +Then reload the system with the following command: ```sh [user@server_ip ~]# sudo systemctl daemon-reload ``` -W ten sposób skończyliśmy pomyślnie odbudowę RAID. +We have now successfully completed the RAID rebuild. -/// details | **Odbudowanie RAID w trybie ratunkowym** +/// details | **Rebuilding the RAID in rescue mode** -Jeśli Twój serwer nie może uruchomić się w trybie normalnym po wymianie dysku, zostanie on uruchomiony w trybie ratunkowym. +If you server is unable to reboot in normal mode after a disk replacement, it will be rebooted in rescue mode. -W tym przykładzie wymieniamy dysk `sdb`. +In this example, we are replacing the disk `sdb`. -Po wymianie dysku musimy skopiować tablicę partycji z dysku sprawnego (w tym przykładzie sda) na nowy (sdb). +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sda) to the new one (sdb). 
> [!tabs] -> **Dla partycji GPT** +> **For GPT partitions** >> >> ```sh >> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX >> ``` >> ->> Polecenie powinno mieć ten format: `sgdisk -R /dev/newdisk /dev/healthydisk` +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` >> ->> Przykład: +>> Example: >> >> ```sh >> sudo sgdisk -R /dev/sdb /dev/sda >> ``` >> ->> Po wykonaniu tego kroku następnym krokiem jest zrandomizowanie GUID nowego dysku, aby uniknąć konfliktów GUID z innymi dyskami: +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: >> >> ```sh >> sudo sgdisk -G /dev/sdb >> ``` >> ->> Jeśli otrzymasz następującą wiadomość: +>> If you the following message: >> >> ```console >> Warning: The kernel is still using the old partition table. @@ -359,27 +509,28 @@ Po wymianie dysku musimy skopiować tablicę partycji z dysku sprawnego (w tym p >> The operation has completed successfully. >> ``` >> ->> Możesz po prostu uruchomić polecenie `partprobe`. +>> You can simply run the `partprobe` command. >> -> **Dla partycji MBR** +> **For MBR partitions** >> >> ```sh >> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb >> ``` >> ->> Polecenie powinno mieć ten format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` >> -Teraz możemy odbudować tablicę RAID. Poniższy fragment kodu pokazuje, jak możemy ponownie dodać nowe partycje (sdb2 i sdb4) do tablicy RAID. +We can now rebuild the RAID array. The following code snippet shows how we can add the new partitions (sdb2 and sdb4) back in the RAID array. ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 # mdadm: added /dev/sdb2 + root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 # mdadm: re-added /dev/sdb4 ``` -Użyj polecenia `cat /proc/mdstat`, aby monitorować odbudowę RAID: +Use the `cat /proc/mdstat` command to monitor the RAID rebuild: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -396,7 +547,7 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Aby uzyskać więcej szczegółów na temat tablicy RAID: +For more details on the RAID array(s): ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 @@ -432,15 +583,15 @@ root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 -#### Dodanie etykiety do partycji SWAP (jeśli dotyczy) +#### Adding the label to the SWAP partition (if applicable) -Po zakończeniu odbudowy RAID montujemy partycję zawierającą korzeń naszego systemu operacyjnego na `/mnt`. W naszym przykładzie tą partycją jest `md4`. +Once the RAID rebuild is complete, we mount the partition containing the root of our operating system on `/mnt`. In our example, that partition is `md4`. 
```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt ``` -Dodajemy etykietę do naszej partycji SWAP za pomocą polecenia: +We add the label to our swap partition with the command: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sdb4 -L swap-sdb4 @@ -449,7 +600,7 @@ Setting up swapspace version 1, size = 512 MiB (536866816 bytes) LABEL=swap-sdb4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd ``` -Następnie montujemy poniższe katalogi, aby upewnić się, że wszystkie operacje w środowisku chroot będą działać poprawnie: +Next, we mount the following directories to make sure any manipulation we make in the chroot environment works properly: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # @@ -462,20 +613,20 @@ mount --bind /run /mnt/run mount --make-slave /mnt/run ``` -Następnie wchodzimy do środowiska `chroot`: +Next, we access the `chroot` environment: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt ``` -Pobieramy UUID obu partycji SWAP: +We retrieve the UUIDs of both swap partitions: ```sh root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4 root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4 ``` -Przykład: +Example: ```sh blkid /dev/sda4 @@ -484,13 +635,13 @@ blkid /dev/sdb4 /dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -Następnie zastępujemy stary UUID partycji SWAP (**sdb4**) nowym w pliku `/etc/fstab`: +Next, we replace the old UUID of the swap partition (**sdb4**) with the new one in `/etc/fstab`: ```sh root@rescue12-customer-eu:/# nano etc/fstab ``` -Przykład: +Example: ```sh UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 @@ -499,9 +650,9 @@ UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 ``` -Upewnij się, że zastępujesz poprawny UUID. W powyższym przykładzie UUID do zastąpienia to `d6af33cf-fc15-4060-a43c-cb3b5537f58a` nowym `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Upewnij się, że zastępujesz poprawny UUID. +Make sure you replace the proper UUID. In our example above, the UUID to replace is `d6af33cf-fc15-4060-a43c-cb3b5537f58a` with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the correct UUID. -Następnie upewniamy się, że wszystko zostało poprawnie zamontowane: +Next, we make sure everything is properly mounted: ```sh root@rescue12-customer-eu:/# mount -av @@ -511,7 +662,7 @@ swap : ignored swap : ignored ``` -Włącz partycję SWAP poniższym poleceniem: +Activate the swap partition the following command: ```sh root@rescue12-customer-eu:/# swapon -av @@ -524,32 +675,34 @@ swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912 swapon /dev/sdb4 ``` -Wyjdź z środowiska `chroot` za pomocą `exit` i przeładuj system: +We exit the `chroot` environment with exit and reload the system: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload ``` -Odmontuj wszystkie dyski: +We umount all the disks: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt ``` -W ten sposób pomyślnie zakończyliśmy odbudowę RAID na serwerze i teraz możemy go ponownie uruchomić w trybie normalnym. +We have now successfully completed the RAID rebuild on the server and we can now reboot it in normal mode. 
-## Sprawdź również +## Go Further [Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) -[OVHcloud API i Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) +[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) -[Zarządzanie hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) +[Managing hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) [Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) -Dla usług specjalistycznych (SEO, rozwój, itp.), skontaktuj się z [partnerami OVHcloud](/links/partner). +For specialised services (SEO, development, etc.), contact [OVHcloud partners](/links/partner). -Jeśli potrzebujesz pomocy w użyciu i konfiguracji rozwiązań OVHcloud, skorzystaj z naszych [ofert wsparcia](/links/support). +If you would like assistance using and configuring your OVHcloud solutions, please refer to our [support offers](/links/support). + +If you need training or technical assistance to implement our solutions, contact your sales representative or click on [this link](/links/professional-services) to get a quote and ask our Professional Services experts for assisting you on your specific use case of your project. -Jeśli potrzebujesz szkoleń lub pomocy technicznej w wdroż \ No newline at end of file +Join our [community of users](/links/community). diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pt-pt.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pt-pt.md index c171f2ece6f..2a20f7e415d 100644 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pt-pt.md +++ b/pages/bare_metal_cloud/dedicated_servers/raid_soft/guide.pt-pt.md @@ -1,49 +1,50 @@ --- -title: Gestão e reconstrução do RAID software nos servidores em modo de arranque legado (BIOS) -excerpt: "Aprenda a gerir e reconstruir o RAID software após a substituição de um disco no seu servidor em modo de arranque legado (BIOS)" +title: Managing and rebuilding software RAID on servers using legacy boot (BIOS) mode +excerpt: Find out how to manage and rebuild software RAID after a disk replacement on your server in legacy boot (BIOS) mode updated: 2025-12-11 --- -## Objetivo +## Objective -O RAID (Redundant Array of Independent Disks) é um conjunto de técnicas concebidas para mitigar a perda de dados num servidor replicando-os em vários discos. +Redundant Array of Independent Disks (RAID) is a technology that mitigates data loss on a server by replicating data across two or more disks. -O nível RAID predefinido para as instalações de servidores OVHcloud é o RAID 1, o que duplica o espaço ocupado pelos seus dados, reduzindo assim metade do espaço de disco utilizável. +The default RAID level for OVHcloud server installations is RAID 1, which doubles the space taken up by your data, effectively halving the useable disk space. -**Este guia explica como gerir e reconstruir um RAID software em caso de substituição de um disco no seu servidor em modo de arranque legado (BIOS).** +**This guide explains how to manage and rebuild a software RAID in the event of a disk replacement on your server in legacy boot mode (BIOS).** -Antes de começar, note que este guia se concentra nos servidores dedicados que utilizam o modo de arranque legado (BIOS). 
Se o seu servidor utiliza o modo UEFI (placas-mãe mais recentes), consulte este guia [Gestão e reconstrução do RAID software nos servidores em modo de arranque UEFI](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). +Before we begin, please note that this guide focuses on Dedicated servers that use legacy boot (BIOS) mode. If your server uses the UEFI mode (newer motherboards), refer to this guide [Managing and rebuilding software RAID on servers in UEFI boot mode](/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi). -Para verificar se um servidor está a executar em modo BIOS ou em modo UEFI, execute o seguinte comando : +To check whether a server runs on legacy BIOS or UEFI mode, run the following command: ```sh [user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS ``` -## Requisitos +## Requirements -- Ter um [servidor dedicado](/links/bare-metal/bare-metal) com uma configuração de RAID software. -- Ter acesso ao seu servidor via SSH com privilégios de administrador (sudo). -- Conhecimento de RAID e partições +- A [Dedicated server](/links/bare-metal/bare-metal) with a software RAID configuration +- Administrative (sudo) access to the server via SSH +- Understanding of RAID and partitions -## Instruções +## Instructions -### Apresentação do conteúdo +When you purchase a new server, you may feel the need to perform a series of tests and actions. One such test could be to simulate a disk failure in order to understand the RAID rebuild process and prepare yourself in case it ever happens. -- [Informações básicas](#basicinformation) -- [Simular uma falha de disco](#diskfailure) - - [Remover o disco defeituoso](#diskremove) -- [Reconstrução do RAID](#raidrebuild) - - [Reconstrução do RAID em modo rescue](#rescuemode) - - [Adicionar o rótulo à partição SWAP (se aplicável)](#swap-partition) - - [Reconstrução do RAID em modo normal](#normalmode) +### Content overview +- [Basic Information](#basicinformation) +- [Simulating a disk failure](#diskfailure) + - [Removing the failed disk](#diskremove) +- [Rebuilding the RAID](#raidrebuild) + - [Rebuilding the RAID in rescue mode](#rescuemode) + - [Adding the label to the SWAP partition (if applicable)](#swap-partition) + - [Rebuilding the RAID in normal mode](#normalmode) -### Informações básicas +### Basic Information -Numa sessão de linha de comandos, introduza o seguinte código para determinar o estado atual do RAID. +In a command line session, type the following code to determine the current RAID status: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -60,11 +61,11 @@ md4 : active raid1 nvme0n1p4[0] nvme1n1p4[1] unused devices: ``` -Este comando indica-nos que dois dispositivos RAID software estão atualmente configurados, sendo **md4** o maior. O dispositivo RAID **md4** é composto por duas partições, denominadas **nvme1n1p4** e **nvme0n1p4**. +This command shows us that we have two software RAID devices currently set up, with **md4** being the largest one. The **md4** RAID device consists of two partitions, which are known as **nvme1n1p4** and **nvme0n1p4**. -O [UU] significa que todos os discos estão a funcionar normalmente. Um `_` indica um disco defeituoso. +The [UU] means that all the disks are working normally. A `_` would indicate a failed disk. 
-Se tiver um servidor com discos SATA, obterá os seguintes resultados : +If you have a server with SATA disks, you would get the following results: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -81,7 +82,7 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -Embora este comando nos devolva os nossos volumes RAID, não nos indica o tamanho das próprias partições. Podemos encontrar esta informação com o seguinte comando : +Although this command returns our RAID volumes, it doesn't tell us the size of the partitions themselves. We can find this information with the following command: ```sh [user@server_ip ~]# sudo fdisk -l @@ -126,13 +127,13 @@ Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes ``` -O comando `fdisk -l` permite-nos também identificar o tipo de partição. Esta é uma informação importante para reconstruir o seu RAID em caso de falha de um disco. +The `fdisk -l` command also allows you to identify your partition type. This is an important information when it comes to rebuilding your RAID in case of a disk failure. -Para as partições **GPT**, a linha 6 mostrará: `Disklabel type: gpt`. Estas informações só são visíveis quando o servidor está em modo normal. +For **GPT** partitions, line 6 will display: `Disklabel type: gpt`. This information can only been seen when the server is in normal mode. -Ainda com base nos resultados de `fdisk -l`, podemos ver que `/dev/md2` é composto por 888.8GB e `/dev/md4` contém 973.5GB. +Still going by the results of `fdisk -l`, we can see that `/dev/md2` consists of 888.8GB and `/dev/md4` contains 973.5GB. -Alternativamente, o comando `lsblk` oferece uma visão diferente das partições : +Alternatively, the `lsblk` command offers a different view of the partitions: ```sh [user@server_ip ~]# lsblk @@ -155,22 +156,22 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Temos em conta os dispositivos, as partições e os seus pontos de montagem. A partir dos comandos e resultados acima, temos : +We take note of the devices, partitions and their mount points. From the above commands and results, we have: -- Dois arrays RAID: `/dev/md2` e `/dev/md4`. -- Quatro partições fazem parte do RAID com os pontos de montagem: `/` e `/home`. +- Two RAID arrays: `/dev/md2` and `/dev/md4`. +- Four partitions are part of the RAID with the mount points: `/` and `/home`. -### Simular uma falha de disco +### Simulating a disk failure -Agora que dispomos de todas as informações necessárias, podemos simular uma falha de disco e continuar com os testes. Neste exemplo, vamos fazer falhar o disco `sda`. +Now that we have all the necessary information, we can simulate a disk failure and proceed with the tests. In this example, we will fail the disk `sda`. -O meio preferido para isso é o ambiente em modo rescue da OVHcloud. +The preferred way to do this is via the OVHcloud rescue mode environment. -Reinicie primeiro o servidor em modo rescue e faça login com as credenciais fornecidas. +First reboot the server in rescue mode and log in with the provided credentials. -Para remover um disco do RAID, o primeiro passo é marcá-lo como **Failed** e remover as partições dos seus arrays RAID respetivos. +To remove a disk from the RAID, the first step is to mark it as **Failed** and remove the partitions from their respective RAID arrays. 
```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -187,13 +188,13 @@ md4 : active raid1 sda4[0] sdb4[1] unused devices: ``` -A partir da saída acima, sda é composto por duas partições em RAID que são **sda2** e **sda4**. +From the above output, sda consists of two partitions in RAID which are **sda2** and **sda4**. -#### Remover o disco defeituoso +#### Removing the failed disk -Começamos por marcar as partições **sda2** e **sda4** como **failed**. +First we mark the partitions **sda2** and **sda4** as failed. ```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/sda2 @@ -205,7 +206,7 @@ root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md4 - # mdadm: set /dev/sda4 faulty in /dev/md4 ``` -Agora simulámos uma falha do RAID, quando executamos o comando `cat /proc/mdstat`, obtemos o seguinte resultado : +We have now simulated a failure of the RAID, when we run the `cat /proc/mdstat` command, we have the following output: ```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -221,9 +222,9 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Como podemos ver acima, o [F] ao lado das partições indica que o disco está defeituoso ou com falha. +As we can see above, the [F] next to the partitions indicates that the disk has failed or is faulty. -Em seguida, removemos estas partições dos arrays RAID. +Next, we remove these partitions from the RAID arrays. ```sh root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/md2 --remove /dev/sda2 @@ -235,7 +236,17 @@ root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --manage /dev/ # mdadm: hot removed /dev/sda4 from /dev/md4 ``` -Para nos certificarmos de que obtemos um disco semelhante a um disco vazio, utilizamos o seguinte comando. Substitua **sda** pelos seus próprios valores : +To make sure that we get a disk that is similar to an empty disk, we use the following command. Replace **sda** with your own values: + +```sh +shred -s10M -n1 /dev/sda1 +shred -s10M -n1 /dev/sda2 +shred -s10M -n1 /dev/sda3 +shred -s10M -n1 /dev/sda4 +shred -s10M -n1 /dev/sda +``` + +The disk now appears as a new, empty drive: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk @@ -250,12 +261,140 @@ sdb 8:16 0 1.8T 0 disk └─md4 9:4 0 973.5G 0 raid1 /home ``` -Se executarmos o seguinte comando, vemos que o +If we run the following command, we see that our disk has been successfully "wiped": + +```sh +parted /dev/sda +GNU Parted 3.5 +Using /dev/sda +Welcome to GNU Parted! Type 'help' to view a list of commands. +(parted) p +Error: /dev/sda: unrecognised disk label +Model: HGST HUS724020AL (SATA) +Disk /dev/sda: 1.8T +Sector size (logical/physical): 512B/512B +Partition Table: unknown +Disk Flags: +``` + +Our RAID status should now look like this: + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat + +Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] +md2 : active raid1 sdb2[0] + 931954688 blocks super 1.2 [1/2] [_U] + bitmap: 2/7 pages [8KB], 65536KB chunk + +md4 : active raid1 sdb4[1] + 1020767232 blocks super 1.2 [1/2] [_U] + bitmap: 0/8 pages [0KB], 65536KB chunk +unused devices: +``` + +From the results above, we can see that only two partitions now appear in the RAID arrays. We have successfully failed the disk **sda** and we can now proceed with the disk replacement. 
+ +For more information on how to prepare and request for a disk replacement, consult this [guide](/pages/bare_metal_cloud/dedicated_servers/disk_replacement) + +If you run the following command, you can have more details on the RAID array(s): + +```sh +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 + +/dev/md4: + Version : 1.2 + Creation Time : Tue Jan 24 15:35:02 2023 + Raid Level : raid1 + Array Size : 1020767232 (973.48 GiB 1045.27 GB) + Used Dev Size : 1020767232 (973.48 GiB 1045.27 GB) + Raid Devices : 2 + Total Devices : 1 + Persistence : Superblock is persistent + + Intent Bitmap : Internal + + Update Time : Tue Jan 24 16:28:03 2023 + State : clean, degraded + Active Devices : 1 + Working Devices : 1 + Failed Devices : 0 + Spare Devices : 0 + +Consistency Policy : bitmap + + Name : md4 + UUID : 7b5c1d80:0a7ab4c2:e769b5e5:9c6eaa0f + Events : 21 + + Number Major Minor RaidDevice State + - 0 0 0 removed + 1 8 20 1 active sync /dev/sdb4 +``` + + + +### Rebuilding the RAID + +> [!warning] +> +> For most servers in software RAID, after a disk replacement, the server is able to boot in normal mode (on the healthy disk) to rebuild the RAID. However, if the server is not able to boot in normal mode, it will be rebooted in rescue mode to proceed with the RAID rebuild. +> + + + +#### Rebuilding the RAID in normal mode + +The following steps are performed in normal mode. In our example, we have replaced the disk **sda**. + +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sdb) to the new one (sda). + +> [!tabs] +> **For GPT partitions** +>> +>> ```sh +>> sudo sgdisk -R /dev/sdX /dev/sdX +>> ``` +>> +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk`. +>> +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: +>> +>> ```sh +>> sudo sgdisk -G /dev/sdX +>> ``` +>> +>> If you receive the following message: +>> +>> ```console +>> Warning: The kernel is still using the old partition table. +>> The new table will be used at the next reboot or after you +>> run partprobe(8) or kpartx(8) +>> The operation has completed successfully. +>> ``` +>> +>> You can simply run the `partprobe` command. If you still cannot see the newly-created partitions (e.g. with `lsblk`), you need to reboot the server before continuing. +>> +> **For MBR partitions** +>> +>> ```sh +>> [user@server_ip ~]# sudo sfdisk -d /dev/sdX | sfdisk /dev/sdX +>> ``` +>> +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk`. +>> + +Next, we add the partitions to the RAID: + +```sh +[user@server_ip ~]# sudo mdadm --add /dev/md2 /dev/sda2 +# mdadm: added /dev/sda2 +[user@server_ip ~]# sudo mdadm --add /dev/md4 /dev/sda4 # mdadm: re-added /dev/sda4 ``` -Utilize o seguinte comando para monitorizar a reconstrução do RAID: +Use the following command to monitor the RAID rebuild: ```sh [user@server_ip ~]# cat /proc/mdstat @@ -272,72 +411,97 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Por fim, adicionamos um rótulo e montamos a partição [SWAP] (se aplicável). +Lastly, we add a label and mount the [SWAP] partition (if applicable). 
-Para adicionar um rótulo à partição SWAP: +To add a label the SWAP partition: ```sh -[user@server_ip ~]# sudo mkswap /dev/sdb4 -L swap-sdb4 +[user@server_ip ~]# sudo mkswap /dev/sda4 -L swap-sda4 ``` -Em seguida, recupere os UUID das duas partições swap: +Next, retrieve the UUIDs of both swap partitions: ```sh [user@server_ip ~]# sudo blkid -s UUID /dev/sda4 +/dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" [user@server_ip ~]# sudo blkid -S UUID /dev/sdb4 +/dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" ``` -Substituímos o antigo UUID da partição swap (**sda4**) pelo novo em `/etc/fstab`: +We replace the old UUID of the swap partition (**sda4**) with the new one in `/etc/fstab`. + +Example: ```sh [user@server_ip ~]# sudo nano etc/fstab + +UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 +UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 +LABEL=BIOS /boot vfat defaults 0 1 +UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 +UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 ``` -Certifique-se de substituir o UUID correto. +Based on the above results, the old UUID is `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` and should be replaced with the new one `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Make sure you replace the coorect UUID. -Em seguida, recarregue o sistema com o seguinte comando: +Next, we verify that everything is properly mounted with the following command: ```sh -[user@server_ip ~]# sudo systemctl daemon-reload +[user@server_ip ~]# sudo mount -av +/ : ignored +/boot : successfully mounted +/boot/efi : successfully mounted +swap : ignored +swap : ignored ``` -Execute o seguinte comando para ativar a partição swap: +Run the following command to enable the swap partition: ```sh [user@server_ip ~]# sudo swapon -av ``` -A reconstrução do RAID está agora concluída. +Then reload the system with the following command: + +```sh +[user@server_ip ~]# sudo systemctl daemon-reload +``` + +We have now successfully completed the RAID rebuild. -/// details | **Reconstrução do RAID no modo rescue** +/// details | **Rebuilding the RAID in rescue mode** + +If you server is unable to reboot in normal mode after a disk replacement, it will be rebooted in rescue mode. + +In this example, we are replacing the disk `sdb`. -Depois de substituir o disco, devemos copiar a tabela de partições do disco saudável (neste exemplo, sda) para o novo (sdb). +Once the disk has been replaced, we need to copy the partition table from the healthy disk (in this example, sda) to the new one (sdb). > [!tabs] -> **Para as partições GPT** +> **For GPT partitions** >> >> ```sh >> root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/sdX /dev/sdX >> ``` >> ->> O comando deve estar no seguinte formato: `sgdisk -R /dev/novo disco /dev/disco saudável` +>> The command should be in this format: `sgdisk -R /dev/newdisk /dev/healthydisk` >> ->> Exemplo: +>> Example: >> >> ```sh >> sudo sgdisk -R /dev/sdb /dev/sda >> ``` >> ->> Depois de realizar esta operação, o passo seguinte consiste em atribuir um GUID aleatório ao novo disco para evitar quaisquer conflitos com os GUID de outros discos: +>> Once this is done, the next step is to randomize the GUID of the new disk to prevent GUID conflicts with other disks: >> >> ```sh >> sudo sgdisk -G /dev/sdb >> ``` >> ->> Se aparecer a seguinte mensagem: ->> +>> If you the following message: +>> >> ```console >> Warning: The kernel is still using the old partition table. 
>> The new table will be used at the next reboot or after you @@ -345,27 +509,28 @@ Depois de substituir o disco, devemos copiar a tabela de partições do disco sa >> The operation has completed successfully. >> ``` >> ->> Pode simplesmente executar o comando `partprobe`. Se não conseguir ver as novas partições criadas (por exemplo, com `lsblk`), terá de reiniciar o servidor antes de continuar. +>> You can simply run the `partprobe` command. >> -> **Para as partições MBR** +> **For MBR partitions** >> >> ```sh >> sudo sfdisk -d /dev/sda | sfdisk /dev/sdb >> ``` >> ->> O comando deve estar no seguinte formato: `sfdisk -d /dev/disco saudável | sfdisk /dev/novo disco` +>> The command should be in this format: `sfdisk -d /dev/healthydisk | sfdisk /dev/newdisk` >> -Agora podemos reconstruir a matriz RAID. O seguinte código mostra como adicionar as novas partições (sdb2 e sdb4) na matriz RAID. +We can now rebuild the RAID array. The following code snippet shows how we can add the new partitions (sdb2 and sdb4) back in the RAID array. ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md2 /dev/sdb2 # mdadm: added /dev/sdb2 + root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sudo mdadm --add /dev/md4 /dev/sdb4 # mdadm: re-added /dev/sdb4 ``` -Utilize o comando `cat /proc/mdstat` para monitorizar a reconstrução do RAID: +Use the `cat /proc/mdstat` command to monitor the RAID rebuild: ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat @@ -382,7 +547,7 @@ md4 : active raid1 sda4[0](F) sdb4[1] unused devices: ``` -Para mais detalhes sobre a(s) baia(s) RAID: +For more details on the RAID array(s): ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 @@ -412,30 +577,30 @@ root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md4 Events : 0.95 Number Major Minor RaidDevice State - 0 8 2 0 spare rebuilding /dev/sda4 - 1 8 18 1 active sync /dev/sdb4 + 0 8 2 0 active sync /dev/sda4 + 1 8 18 1 spare rebuilding /dev/sdb4 ``` -#### Adição do rótulo à partição SWAP (se aplicável) +#### Adding the label to the SWAP partition (if applicable) -Depois de concluir a reconstrução do RAID, montamos a partição que contém a raiz do nosso sistema operativo em `/mnt`. No nosso exemplo, esta partição é `md4`. +Once the RAID rebuild is complete, we mount the partition containing the root of our operating system on `/mnt`. In our example, that partition is `md4`. ```sh root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md4 /mnt ``` -Adicionamos o rótulo à nossa partição swap com o seguinte comando: +We add the label to our swap partition with the command: ```sh -root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sda4 -L swap-sda4 -mkswap: /dev/sda4: warning: wiping old swap signature. +root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/sdb4 -L swap-sdb4 +mkswap: /dev/sdb4: warning: wiping old swap signature. 
 Setting up swapspace version 1, size = 512 MiB (536866816 bytes)
-LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd
+LABEL=swap-sdb4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd
 ```
 
-Em seguida, montamos os seguintes diretórios para garantir que todas as operações que realizamos no ambiente chroot funcionem corretamente:
+Next, we mount the following directories to make sure any manipulation we make in the chroot environment works properly:
 
 ```sh
 root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ #
@@ -448,38 +613,35 @@ mount --bind /run /mnt/run
 mount --make-slave /mnt/run
 ```
 
-Agora, acedemos ao ambiente `chroot`:
+Next, we access the `chroot` environment:
 
 ```sh
 root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt
 ```
 
-Recuperamos os UUID das duas partições swap:
+We retrieve the UUIDs of both swap partitions:
 
 ```sh
 root@rescue12-customer-eu:/# blkid -s UUID /dev/sda4
 root@rescue12-customer-eu:/# blkid -s UUID /dev/sdb4
 ```
 
-Exemplo:
+Example:
 
 ```sh
 blkid /dev/sda4
 /dev/sda4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15"
-```
-
-```sh
-blkid /dev/sdb4
+blkid /dev/sdb4
 /dev/sdb4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a"
 ```
 
-Em seguida, substituímos o antigo UUID da partição swap (**sdb4**) pelo novo em `/etc/fstab`:
+Next, we replace the old UUID of the swap partition (**sdb4**) with the new one in `/etc/fstab`:
 
 ```sh
 root@rescue12-customer-eu:/# nano etc/fstab
 ```
 
-Exemplo:
+Example:
 
 ```sh
 UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1
@@ -488,9 +650,9 @@ UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0
 UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0
 ```
 
-Certifique-se de substituir o UUID correto. No nosso exemplo acima, o UUID a substituir é `d6af33cf-fc15-4060-a43c-cb3b5537f58a` pelo novo `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. Certifique-se de substituir o UUID correto.
+Make sure you replace the correct UUID. In our example above, the old UUID `d6af33cf-fc15-4060-a43c-cb3b5537f58a` should be replaced with the new one, `b3c9e03a-52f5-4683-81b6-cc10091fcd15`.
 
-Em seguida, verificamos que tudo está corretamente montado:
+Next, we make sure everything is properly mounted:
 
 ```sh
 root@rescue12-customer-eu:/# mount -av
@@ -500,13 +662,7 @@ swap : ignored
 swap : ignored
 ```
 
-Recarregue o sistema com o seguinte comando:
-
-```sh
-root@rescue12-customer-eu:/# systemctl daemon-reload
-```
-
-Ative a partição swap com o seguinte comando:
+Activate the swap partition with the following command:
 
 ```sh
 root@rescue12-customer-eu:/# swapon -av
@@ -519,29 +675,34 @@ swapon: /dev/sdb4: pagesize=4096, swapsize=536870912, devsize=536870912
 swapon /dev/sdb4
 ```
 
-Saia do ambiente Chroot com `exit` e desmonte todos os discos:
+We exit the `chroot` environment with `exit` and reload the system:
 
 ```sh
-root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # systemctl daemon-reload
 ```
 
-Concluímos com sucesso a reconstrução do RAID no servidor e agora podemos reiniciá-lo no modo normal.
+We unmount all the disks:
 
+```sh
+root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount -R /mnt
+```
 
-## Quer saber mais?
+We have now successfully completed the RAID rebuild on the server and we can now reboot it in normal mode.
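+
+After the reboot, you can optionally verify that the server came back with both disks in the array and the expected mounts. This is a minimal post-reboot check, assuming the same array and partition names as in the example above:
+
+```sh
+# Both members of each array should be listed again, with [UU] indicating a healthy mirror
+[user@server_ip ~]# cat /proc/mdstat
+
+# Check that the swap partitions are active and the filesystems are mounted as expected
+[user@server_ip ~]# sudo swapon --show
+[user@server_ip ~]# lsblk -f
+```
+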
 
-[Remplacement à chaud - RAID logiciel](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft)
+## Go Further
 
-[API OVHcloud e Armazenamento](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh)
+[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft)
 
-[Gestão do RAID físico](/pages/bare_metal_cloud/dedicated_servers/raid_hard)
+[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh)
 
-[Remplacement à chaud - RAID Matériel](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard)
+[Managing hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard)
 
-Para serviços especializados (referênciação, desenvolvimento, etc), contacte os [parceiros OVHcloud](/links/partner).
+[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard)
 
-Se desejar beneficiar de assistência no uso e configuração das suas soluções OVHcloud, consulte as nossas diferentes [ofertas de suporte](/links/support).
+For specialised services (SEO, development, etc.), contact [OVHcloud partners](/links/partner).
 
+If you would like assistance using and configuring your OVHcloud solutions, please refer to our [support offers](/links/support).
 
-Se precisar de formação ou assistência técnica para a implementação das nossas soluções, contacte o seu contacto comercial ou clique [neste link](/links/professional-services) para obter um orçamento e solicitar uma análise personalizada do seu projeto aos nossos especialistas da equipa Professional Services.
+If you need training or technical assistance to implement our solutions, contact your sales representative or click on [this link](/links/professional-services) to get a quote and ask our Professional Services experts to assist you with your specific use case.
 
-Fale com a nossa [comunidade de utilizadores](/links/community).
\ No newline at end of file
+Join our [community of users](/links/community).
diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.de-de.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.de-de.md
deleted file mode 100644
index 641b711a60a..00000000000
--- a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.de-de.md
+++ /dev/null
@@ -1,849 +0,0 @@
----
-title: Verwalten und Neuaufbauen von Software-RAID auf Servern mit UEFI-Boot-Modus
-excerpt: Erfahren Sie, wie Sie Software-RAID nach einem Wechsel der Festplatte auf einem Server mit UEFI-Boot-Modus verwalten und neu aufbauen können
-updated: 2025-12-11
----
-
-## Ziel
-
-Ein Redundanter Array unabhängiger Festplatten (RAID) ist eine Technologie, die den Datenverlust auf einem Server durch die Replikation von Daten auf zwei oder mehr Festplatten minimiert.
-
-Die Standard-RAID-Ebene für OVHcloud-Serverinstallationen ist RAID 1, wodurch der von Ihren Daten belegte Platz verdoppelt wird, was effektiv den nutzbaren Festplattenplatz halbiert.
-
-**Dieses Handbuch erklärt, wie Sie Software-RAID nach einem Festplattentausch auf einem Server mit UEFI-Boot-Modus verwalten und neu aufbauen können**
-
-Bevor wir beginnen, beachten Sie bitte, dass dieses Handbuch sich auf dedizierte Server konzentriert, die den UEFI-Boot-Modus verwenden. Dies ist bei modernen Motherboards der Fall. Wenn Ihr Server den Legacy-Boot-Modus (BIOS) verwendet, konsultieren Sie bitte dieses Handbuch: [Verwalten und Neuaufbauen von Software-RAID auf Servern im Legacy-Boot-Modus (BIOS)](/pages/bare_metal_cloud/dedicated_servers/raid_soft_bios).
- -Um zu prüfen, ob ein Server im Legacy-BIOS-Modus oder im UEFI-Boot-Modus läuft, führen Sie den folgenden Befehl aus: - -```sh -[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS -``` - -Weitere Informationen zu UEFI finden Sie in diesem [Artikel](https://uefi.org/about). - -## Voraussetzungen - -- Ein [dedizierter Server](/links/bare-metal/bare-metal) mit Software-RAID-Konfiguration -- Administrative (sudo-)Zugriffsrechte auf den Server über SSH -- Grundkenntnisse zu RAID, Partitionen und GRUB - -Im Laufe dieses Handbuchs verwenden wir die Begriffe **primäre Festplatte** und **sekundäre Festplatte**. In diesem Zusammenhang: - -- Die primäre Festplatte ist die Festplatte, deren ESP (EFI-Systempartition) von Linux eingehängt wird -- Die sekundäre(n) Festplatte(n) sind alle anderen Festplatten im RAID - -## In der praktischen Anwendung - -Wenn Sie einen neuen Server erwerben, können Sie sich möglicherweise dazu entschließen, eine Reihe von Tests und Aktionen durchzuführen. Ein solcher Test könnte darin bestehen, einen Festplattenausfall zu simulieren, um den RAID-Wiederherstellungsprozess zu verstehen und sich darauf vorzubereiten, falls dies jemals tatsächlich eintritt. - -### Inhaltsübersicht - -- [Grundlegende Informationen](#basicinformation) -- [Verständnis der EFI-Systempartition (ESP)](#efisystemparition) -- [Simulieren eines Festplattenausfalls](#diskfailure) - - [Entfernen der defekten Festplatte](#diskremove) -- [Neuaufbau des RAIDs](#raidrebuild) - - [Neuaufbau des RAIDs nach Austausch der Hauptfestplatte (Rettungsmodus)](#rescuemode) - - [Neuanlegen der EFI-Systempartition](#recreateesp) - - [Neuaufbau des RAIDs, wenn die EFI-Partitionen nach wichtigen Systemaktualisierungen (z. B. GRUB) nicht synchronisiert sind](efiraodgrub) - - [Hinzufügen der Bezeichnung zur SWAP-Partition (falls zutreffend)](#swap-partition) - - [Neuaufbau des RAIDs im normalen Modus](#normalmode) - - - -### Grundlegende Informationen - -In einer Befehlszeilensitzung geben Sie den folgenden Code ein, um den aktuellen RAID-Status zu ermitteln: - -```sh -[user@server_ip ~]# cat /proc/mdstat -Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md3 : active raid1 nvme1n1p3[1] nvme0n1p3[0] - 497875968 blocks super 1.2 [2/2] [UU] - bitmap: 2/4 pages [8KB], 65536KB chunk - -md2 : active raid1 nvme1n1p2[1] nvme0n1p2[0] - 1046528 blocks super 1.2 [2/2] [UU] - -unused devices: -``` - -Dieser Befehl zeigt uns, dass wir derzeit zwei Software-RAID-Geräte konfiguriert haben, **md2** und **md3**, wobei **md3** das größere der beiden ist. **md3** besteht aus zwei Partitionen, genannt **nvme1n1p3** und **nvme0n1p3**. - -Die [UU] bedeutet, dass alle Festplatten normal funktionieren. Ein `_` würde eine defekte Festplatte anzeigen. - -Wenn Sie einen Server mit SATA-Festplatten haben, erhalten Sie die folgenden Ergebnisse: - -```sh -[user@server_ip ~]# cat /proc/mdstat -Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md3 : active raid1 sda3[0] sdb3[1] - 3904786432 blocks super 1.2 [2/2] [UU] - bitmap: 2/30 pages [8KB], 65536KB chunk - -md2 : active raid1 sda2[0] sdb2[1] - 1046528 blocks super 1.2 [2/2] [UU] - -unused devices: -``` - -Obwohl dieser Befehl unsere RAID-Volumes zurückgibt, sagt er uns nicht die Größe der Partitionen selbst. 
Wir können diese Informationen mit dem folgenden Befehl finden: - -```sh -[user@server_ip ~]# sudo fdisk -l - -Disk /dev/nvme1n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors -Disk model: WDC CL SN720 SDAQNTW-512G-2000 -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -Disklabel type: gpt -Disk identifier: A11EDAA3-A984-424B-A6FE-386550A92435 - -Device Start End Sectors Size Type -/dev/nvme1n1p1 2048 1048575 1046528 511M EFI System -/dev/nvme1n1p2 1048576 3145727 2097152 1G Linux RAID -/dev/nvme1n1p3 3145728 999161855 996016128 474.9G Linux RAID -/dev/nvme1n1p4 999161856 1000210431 1048576 512M Linux files - - -Disk /dev/nvme0n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors -Disk model: WDC CL SN720 SDAQNTW-512G-2000 -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -Disklabel type: gpt -Disk identifier: F03AC3C3-D7B7-43F9-88DB-9F12D7281D94 - -Device Start End Sectors Size Type -/dev/nvme0n1p1 2048 1048575 1046528 511M EFI System -/dev/nvme0n1p2 1048576 3145727 2097152 1G Linux RAID -/dev/nvme0n1p3 3145728 999161855 996016128 474.9G Linux RAID -/dev/nvme0n1p4 999161856 1000210431 1048576 512M Linux file -/dev/nvme0n1p5 1000211120 1000215182 4063 2M Linux file - - -Disk /dev/md2: 1022 MiB, 1071644672 bytes, 2093056 sectors -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes - - -Disk /dev/md3: 474.81 GiB, 509824991232 bytes, 995751936 sectors -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -``` - -Der Befehl `fdisk -l` erlaubt es Ihnen auch, den Typ Ihrer Partition zu identifizieren. Dies ist eine wichtige Information, wenn es darum geht, Ihr RAID bei einem Festplattenausfall wiederherzustellen. - -Für **GPT**-Partitionen wird in Zeile 6 angezeigt: `Disklabel type: gpt`. - -Trotz der Ergebnisse von `fdisk -l` können wir sehen, dass `/dev/md2` aus 1022 MiB besteht und `/dev/md3` 474,81 GiB enthält. Wenn wir den Befehl `mount` ausführen, können wir auch die Struktur der Festplatte ermitteln. - -Alternativ bietet der Befehl `lsblk` eine andere Ansicht der Partitionen: - -```sh -[user@server_ip ~]# lsblk -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT -nvme1n1 259:0 0 476.9G 0 disk -├─nvme1n1p1 259:7 0 511M 0 part -├─nvme1n1p2 259:8 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 /boot -├─nvme1n1p3 259:9 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 / -└─nvme1n1p4 259:10 0 512M 0 part [SWAP] -nvme0n1 259:1 0 476.9G 0 disk -├─nvme0n1p1 259:2 0 511M 0 part /boot/efi -├─nvme0n1p2 259:3 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 /boot -├─nvme0n1p3 259:4 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 / -├─nvme0n1p4 259:5 0 512M 0 part [SWAP] -└─nvme0n1p5 259:6 0 2M 0 part -``` - -Außerdem erhalten wir mit `lsblk -f` weitere Informationen zu diesen Partitionen, wie z. B. 
die Bezeichnung und UUID: - -```sh -[user@server_ip ~]# sudo lsblk -f -NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT -nvme1n1 -├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA -├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea -│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot -├─nvme1n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 -│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / -└─nvme1n1p4 swap 1 swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7 [SWAP] -nvme0n1 -├─nvme0n1p1 vfat FAT16 EFI_SYSPART B486-9781 504.9M 1% /boot/efi -├─nvme0n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea -│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot -├─nvme0n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 -│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / -├─nvme0n1p4 swap 1 swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b [SWAP] -└─nvme0n1p5 iso9660 Joliet Extension config-2 2025-08-05-14-55-41-00 -``` - -Notieren Sie sich die Geräte, Partitionen und ihre Einhängepunkte; dies ist besonders wichtig, nachdem Sie eine Festplatte ersetzt haben. - -Aus den oben genannten Befehlen und Ergebnissen haben wir: - -- Zwei RAID-Arrays: `/dev/md2` und `/dev/md3`. -- Vier Partitionen, die zum RAID gehören: **nvme0n1p2**, **nvme0n1p3**, **nvme1n1p2**, **nvme0n1p3** mit den Einhängepunkten `/boot` und `/`. -- Zwei Partitionen, die nicht zum RAID gehören, mit Einhängepunkten: `/boot/efi` und [SWAP]. -- Eine Partition, die keinen Einhängepunkt hat: **nvme1n1p1** - -Die Partition **nvme0n1p5** ist eine Konfigurationspartition, d. h. ein schreibgeschütztes Volume, das mit dem Server verbunden ist und diesem die Anfangskonfigurationsdaten bereitstellt. - - - -### Verständnis der EFI-Systempartition (ESP) - -***Was ist eine EFI-Systempartition?*** - -Eine EFI-Systempartition ist eine Partition, die die Bootloader, Bootmanager oder Kernels eines installierten Betriebssystems enthalten kann. Sie kann auch Systemhilfeprogramme enthalten, die vor dem Start des Betriebssystems ausgeführt werden sollen, sowie Datendateien wie Fehlerprotokolle. - -***Wird die EFI-Systempartition in einem RAID gespiegelt?*** - -Nein, Stand August 2025, wenn die Installation des Betriebssystems von OVHcloud durchgeführt wird, ist die ESP nicht im RAID enthalten. Wenn Sie unsere Betriebssystemvorlagen verwenden, um Ihren Server mit Software-RAID zu installieren, werden mehrere EFI-Systempartitionen erstellt: eine pro Festplatte. Allerdings wird nur eine EFI-Partition gleichzeitig eingehängt. Alle ESPs, die zum Zeitpunkt der Installation erstellt wurden, enthalten die gleichen Dateien. - -Die EFI-Systempartition wird unter `/boot/efi` eingehängt und die Festplatte, auf der sie eingehängt ist, wird vom Linux-System beim Start ausgewählt. - -Beispiel: - -```sh -[user@server_ip ~]# sudo lsblk -f -NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT -nvme1n1 -├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA -├─nvme1n - -while read -r partition; do - if [[ "${partition}" == "${MAIN_PARTITION}" ]]; then - continue - fi - echo "Working on ${partition}" - mount "${partition}" "${MOUNTPOINT}" - rsync -ax "/boot/efi/" "${MOUNTPOINT}/" - umount "${MOUNTPOINT}" -done < <(blkid -o device -t LABEL=EFI_SYSPART) -``` - -Speichern Sie die Datei und beenden Sie den Editor. 
- -- Machen Sie das Skript ausführbar - -```sh -sudo chmod +x script-name.sh -``` - -- Führen Sie das Skript aus - -```sh -sudo ./script-name.sh -``` - -- Wenn Sie sich nicht im richtigen Verzeichnis befinden - -```sh -./path/to/folder/script-name.sh -``` - -Wenn das Skript ausgeführt wird, werden die Inhalte der eingehängten EFI-Partition mit den anderen synchronisiert. Um auf den Inhalt zuzugreifen, können Sie eine dieser nicht eingehängten EFI-Partitionen am Einhängepunkt `/var/lib/grub/esp` einhängen. - - - -### Simulieren eines Festplattenausfalls - -Nachdem wir nun alle notwendigen Informationen haben, können wir einen Festplattenausfall simulieren und die Tests durchführen. In diesem ersten Beispiel simulieren wir den Ausfall der primären Festplatte `nvme0n1`. - -Die bevorzugte Methode hierzu ist die Nutzung des Rescue-Modus der OVHcloud. - -Starten Sie zunächst den Server im Rescue-Modus neu und melden Sie sich mit den bereitgestellten Anmeldeinformationen an. - -Um eine Festplatte aus dem RAID zu entfernen, ist der erste Schritt, sie als **fehlerhaft** zu markieren und die Partitionen aus ihren jeweiligen RAID-Arrays zu entfernen. - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] -md3 : active raid1 nvme0n1p3[0] nvme1n1p3[1] - 497875968 blocks super 1.2 [2/2] [UU] - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] - 1046528 blocks super 1.2 [2/2] [UU] - -unused devices: -``` - -Aus der obigen Ausgabe ergibt sich, dass `nvme0n1` aus zwei Partitionen besteht, die sich im RAID befinden, nämlich **nvme0n1p2** und **nvme0n1p3**. - - - -#### Entfernen der fehlerhaften Festplatte - -Zunächst markieren wir die Partitionen **nvme0n1p2** und **nvme0n1p3** als fehlerhaft. - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/nvme0n1p2 -# mdadm: set /dev/nvme0n1p2 faulty in /dev/md2 -``` - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --fail /dev/nvme0n1p3 -# mdadm: set /dev/nvme0n1p3 faulty in /dev/md3 -``` - -Wenn wir den Befehl `cat /proc/mdstat` ausführen, erhalten wir die folgende Ausgabe: - -```sh -root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] -md3 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1] - 497875968 blocks super 1.2 [2/1] [_U] - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme0n1p2[2](F) nvme1n1p2[1] - 1046528 blocks super 1.2 [2/1] [_U] - -unused devices: -``` - -Wie oben zu sehen ist, zeigt das [F] neben den Partitionen an, dass die Festplatte fehlerhaft oder defekt ist. - -Als nächstes entfernen wir diese Partitionen aus den RAID-Arrays, um die Festplatte vollständig aus dem RAID zu entfernen. 
- -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --remove /dev/nvme0n1p2 -# mdadm: hot removed /dev/nvme0n1p2 from /dev/md2 -``` - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --remove /dev/nvme0n1p3 -# mdadm: hot removed /dev/nvme0n1p3 from /dev/md3 -``` - -Der Status unseres RAIDs sollte nun wie folgt aussehen: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] -md3 : active raid1 nvme1n1p3[1] - 497875968 blocks super 1.2 [2/1] [_U] - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme1n1p2[1] - 1046528 blocks super 1.2 [2/1] [_U] - -unused devices: -``` - -Aus den oben genannten Ergebnissen können wir erkennen, dass nun nur noch zwei Partitionen in den RAID-Arrays erscheinen. Wir haben die Festplatte **nvme0n1** erfolgreich als fehlerhaft markiert. - -Um sicherzustellen, dass wir eine Festplatte erhalten, die einem leeren Laufwerk ähnelt, verwenden wir den folgenden Befehl auf jeder Partition und anschließend auf der Festplatte: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # -shred -s10M -n1 /dev/nvme0n1p1 -shred -s10M -n1 /dev/nvme0n1p2 -shred -s10M -n1 /dev/nvme0n1p3 -shred -s10M -n1 /dev/nvme0n1p4 -shred -s10M -n1 /dev/nvme0n1p5 -shred -s10M -n1 /dev/nvme0n1 -``` - -Die Festplatte erscheint nun als neues, leeres Laufwerk: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk - -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS -nvme1n1 259:0 0 476.9G 0 disk -├─nvme1n1p1 259:1 0 511M 0 part -├─nvme1n1p2 259:2 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 -├─nvme1n1p3 259:3 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 -└─nvme1n1p4 259:4 0 512M 0 part -nvme0n1 259:5 0 476.9G 0 disk -``` - -Wenn wir den folgenden Befehl ausführen, sehen wir, dass unsere Festplatte erfolgreich „gelöscht“ wurde: - -```sh -parted /dev/nvme0n1 -GNU Parted 3.5 -Using /dev/nvme0n1 -Welcome to GNU Parted! Type 'help' to view a list of commands. -(parted) p -Error: /dev/nvme0n1: unrecognised disk label -Model: WDC CL SN720 SDAQNTW-512G-2000 (nvme) -Disk /dev/nvme0n1: 512GB -Sector size (logical/physical): 512B/512B -Partition Table: unknown -Disk Flags: -``` - -Weitere Informationen zum Vorbereiten und Anfordern eines Festplattentauschs finden Sie in diesem [Leitfaden](/pages/bare_metal_cloud/dedicated_servers/disk_replacement). - -Wenn Sie den folgenden Befehl ausführen, erhalten Sie weitere Details zu den RAID-Arrays: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md3 - -/dev/md3: - Version : 1.2 - Creation Time : Fri Aug 1 14:51:13 2025 - Raid Level : raid1 - Array Size : 497875968 (474.81 GiB 509.82 GB) - Used Dev Size : 497875968 (474.81 GiB 509.82 GB) - Raid Devices : 2 - Total Devices : 1 - Persistence : Superblock is persistent - - Intent Bitmap : Internal - - Update Time : Fri Aug 1 15:56:17 2025 - State : clean, degraded - Active Devices : 1 - Working Devices : 1 - Failed Devices : 0 - Spare Devices : 0 - -Consistency Policy : bitmap - - Name : md3 - UUID : b383c3d5:7fb1bb5e:6b7c4d96:6ea817ff - Events : 215 - - Number Major Minor RaidDevice State - - 0 0 0 removed - 1 259 4 1 active sync /dev/nvme1n1p3 -``` - -Wir können nun mit dem Festplattentausch fortfahren. - - - -### Neuaufbauen des RAIDs - -> [!primary] -> Dieser Prozess kann je nach installiertem Betriebssystem auf Ihrem Server variieren. 
Wir empfehlen Ihnen, die offizielle Dokumentation Ihres Betriebssystems zu konsultieren, um auf die richtigen Befehle zugreifen zu können. -> - -> [!warning] -> -> Bei den meisten Servern mit Software-RAID ist es nach einem Festplattentausch möglich, dass der Server im normalen Modus (auf der gesunden Festplatte) startet und das Neuaufbauen des RAIDs im normalen Modus durchgeführt werden kann. Wenn der Server nach einem Festplattentausch nicht im normalen Modus starten kann, wird er im Rescue-Modus neu gestartet, um das RAID-Neuaufbauen fortzusetzen. -> -> Wenn Ihr Server nach dem Festplattentausch im normalen Modus starten kann, führen Sie einfach die Schritte aus [diesem Abschnitt](#rebuilding-the-raid-in-normal-mode) aus. - - - -#### Neuaufbauen des RAIDs im Rescue-Modus - -Nachdem die Festplatte ersetzt wurde, ist der nächste Schritt, die Partitionstabelle von der gesunden Festplatte (in diesem Beispiel `nvme1n1`) auf die neue (`nvme0n1`) zu kopieren. - -**Für GPT-Partitionen** - -Der Befehl sollte in diesem Format lauten: `sgdisk -R /dev/new disk /dev/healthy disk` - -In unserem Beispiel: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/nvme0n1 /dev/nvme1n1 -``` - -Führen Sie `lsblk` aus, um sicherzustellen, dass die Partitionstabellen ordnungsgemäß kopiert wurden: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk - -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS -nvme1n1 259:0 0 476.9G 0 disk -├─nvme1n1p1 259:1 0 511M 0 part -├─nvme1n1p2 259:2 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 -├─nvme1n1p3 259:3 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 -└─nvme1n1p4 259:4 0 512M 0 part -nvme0n1 259:5 0 476.9G 0 disk -├─nvme0n1p1 259:10 0 511M 0 part -├─nvme0n1p2 259:11 0 1G 0 part -├─nvme0n1p3 259:12 0 474.9G 0 part -└─nvme0n1p4 259:13 0 512M 0 part -``` - -Sobald dies erledigt ist, ist der nächste Schritt, die GUID der neuen Festplatte zu randomisieren, um Konflikte mit anderen Festplatten zu vermeiden: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -G /dev/nvme0n1 -``` - -Wenn Sie die folgende Meldung erhalten: - -```console -Warning: The kernel is still using the old partition table. -The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8) -The operation has completed successfully. -``` - -Führen Sie einfach den Befehl `partprobe` aus. - -Wir können nun das RAID-Array neu aufbauen. Der folgende Codeausschnitt zeigt, wie die neuen Partitionen (nvme0n1p2 und nvme0n1p3) in das RAID-Array zurückgefügt werden können. - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md2 /dev/nvme0n1p2 -# mdadm: added /dev/nvme0n1p2 -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md3 /dev/nvme0n1p3 -``` - -# mdadm: /dev/nvme0n1p3 wurde wieder hinzugefügt -``` - -Um den Rebuild-Prozess zu prüfen: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] -md3 : active raid1 nvme0n1p3[2] nvme1n1p3[1] - 497875968 blocks super 1.2 [2/1] [_U] - [>....................] 
recovery = 0,1% (801920/497875968) finish=41,3min speed=200480K/sec - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] - 1046528 blocks super 1.2 [2/2] [UU] -``` - -Sobald der RAID-Rebuild abgeschlossen ist, führen Sie den folgenden Befehl aus, um sicherzustellen, dass die Partitionen ordnungsgemäß dem RAID hinzugefügt wurden: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f -NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS -nvme1n1 -├─nvme1n1p1 vfat FAT16 EFI_SYSPART 4629-D183 -├─nvme1n1p2 linux_raid_member 1.2 md2 83719c5c-2a27-2a56-5268-7d49d8a1d84f -│ └─md2 ext4 1.0 boot 4de80ae0-dd90-4256-9135-1735e7be4b4d -├─nvme1n1p3 linux_raid_member 1.2 md3 b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff -│ └─md3 ext4 1.0 root 9bf386b6-9523-46bf-b8e5-4b8cc7c5786f -└─nvme1n1p4 swap 1 swap-nvme1n1p4 9bf292e8-0145-4d2f-b891-4cef93c0d209 -nvme0n1 -├─nvme0n1p1 -├─nvme0n1p2 linux_raid_member 1.2 md2 83719c5c-2a27-2a56-5268-7d49d8a1d84f -│ └─md2 ext4 1.0 boot 4de80ae0-dd90-4256-9135-1735e7be4b4d -├─nvme0n1p3 linux_raid_member 1.2 md3 b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff -│ └─md3 ext4 1.0 root 9bf386b6-9523-46bf-b8e5-4b8cc7c5786f -└─nvme0n1p4 -``` - -Basierend auf den oben genannten Ergebnissen wurden die Partitionen auf der neuen Festplatte korrekt dem RAID hinzugefügt. Allerdings wurden die EFI-Systempartition und die SWAP-Partition (in einigen Fällen) nicht dupliziert, was normal ist, da sie nicht in das RAID einbezogen werden. - -> [!warning] -> Die oben genannten Beispiele illustrieren lediglich die notwendigen Schritte anhand einer Standardserverkonfiguration. Die Informationen in der Ausgabetabelle hängen von der Hardware Ihres Servers und seinem Partitionsschema ab. Bei Unsicherheiten konsultieren Sie bitte die Dokumentation Ihres Betriebssystems. -> -> Wenn Sie professionelle Unterstützung bei der Serververwaltung benötigen, beachten Sie bitte die Details im Abschnitt [Weiterführende Informationen](#go-further) dieses Leitfadens. -> - - - -#### Wiederherstellen der EFI-Systempartition - -Um die EFI-Systempartition zu wiederherstellen, müssen wir **nvme0n1p1** formatieren und anschließend den Inhalt der gesunden Partition (in unserem Beispiel: nvme1n1p1) darauf kopieren. - -Wir gehen davon aus, dass beide Partitionen synchronisiert wurden und aktuelle Dateien enthalten. - -> [!warning] -> Falls es eine große Systemaktualisierung gab, z. B. Kernel oder GRUB, und beide Partitionen nicht synchronisiert wurden, beachten Sie bitte nach Abschluss der Erstellung der neuen EFI-Systempartition diesen [Abschnitt](#rebuilding-raid-when-efi-partitions-are-not-synchronized-after-major-system-updates-eg-grub). -> - -Zunächst formatieren wir die Partition: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkfs.vfat /dev/nvme0n1p1 -``` - -Anschließend versehen wir die Partition mit dem Label `EFI_SYSPART` (dieser Name ist spezifisch für OVHcloud): - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # fatlabel /dev/nvme0n1p1 EFI_SYSPART -``` - -Nun kopieren wir den Inhalt von nvme1n1p1 auf nvme0n1p1. 
Zunächst erstellen wir zwei Ordner, die wir im Beispiel „old“ und „new“ nennen: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkdir old new -``` - -Anschließend mounten wir **nvme1n1p1** im Ordner „old“ und **nvme0n1p1** im Ordner „new“, um den Unterschied zu verdeutlichen: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme1n1p1 old -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme0n1p1 new -``` - -Nun kopieren wir die Dateien vom Ordner „old“ in den Ordner „new“: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # rsync -axv old/ new/ -sending incremental file list -EFI/ -EFI/debian/ -EFI/debian/BOOTX64.CSV -EFI/debian/fbx64.efi -EFI/debian/grub.cfg -EFI/debian/grubx64.efi -EFI/debian/mmx64.efi -EFI/debian/shimx64.efi - -sent 6.099.848 bytes received 165 bytes 12.200.026,00 bytes/sec -total size is 6.097.843 speedup is 1,00 -``` - -Sobald dies abgeschlossen ist, trennen wir beide Partitionen: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme0n1p1 -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme1n1p1 -``` - -Nun mounten wir die Partition, die die Wurzel unseres Betriebssystems enthält, auf `/mnt`. In unserem Beispiel ist dies die Partition **md3**. - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md3 /mnt -``` - -Wir mounten die folgenden Ordner, um sicherzustellen, dass alle Manipulationen im `chroot`-Umgebung ordnungsgemäß funktionieren: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # -mount --types proc /proc /mnt/proc -mount --rbind /sys /mnt/sys -mount --make-rslave /mnt/sys -mount --rbind /dev /mnt/dev -mount --make-rslave /mnt/dev -mount --bind /run /mnt/run -mount --make-slave /mnt/run -``` - -Nun verwenden wir den Befehl `chroot`, um auf den Mount-Punkt zuzugreifen und sicherzustellen, dass die neue EFI-Systempartition ordnungsgemäß erstellt wurde und das System beide ESPs erkennt: - -```sh -root@rescue12-customer-eu:/# chroot /mnt -``` - -Um die ESP-Partitionen anzuzeigen, führen wir den Befehl `blkid -t LABEL=EFI_SYSPART` aus: - -```sh -root@rescue12-customer-eu:/# blkid -t LABEL=EFI_SYSPART -/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" -/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" -``` - -Die oben genannten Ergebnisse zeigen, dass die neue EFI-Partition ordnungsgemäß erstellt wurde und das Label korrekt angewendet wurde. - - - -#### RAID neu aufbauen, wenn EFI-Partitionen nach größeren Systemaktualisierungen (GRUB) nicht synchronisiert sind - -/// details | Diesen Abschnitt ausklappen - -> [!warning] -> Bitte folgen Sie nur den Schritten in diesem Abschnitt, wenn sie auf Ihren Fall zutreffen. -> - -Wenn die EFI-Systempartitionen nach größeren Systemaktualisierungen, die GRUB modifizieren oder beeinflussen, nicht synchronisiert sind und die primäre Festplatte, auf der die Partition montiert ist, ersetzt wurde, kann das Starten von einer sekundären Festplatte mit einer veralteten ESP nicht funktionieren. - -In diesem Fall müssen Sie neben dem Neuaufbauen des RAIDs und dem Wiederherstellen der EFI-Systempartition im Rescue-Modus auch GRUB darauf neu installieren. 
- -Sobald wir die EFI-Partition wiederhergestellt und sichergestellt haben, dass das System beide Partitionen erkennt (vorige Schritte in `chroot`), erstellen wir den Ordner `/boot/efi`, um die neue EFI-Systempartition **nvme0n1p1** zu mounten: - -```sh -root@rescue12-customer-eu:/# mount /boot -root@rescue12-customer-eu:/# mount /dev/nvme0n1p1 /boot/efi -``` - -Anschließend installieren wir den GRUB-Bootloader erneut: - -```sh -root@rescue12-customer-eu:/# grub-install --efi-directory=/boot/efi /dev/nvme0n1p1 -``` - -Sobald dies abgeschlossen ist, führen Sie den folgenden Befehl aus: - -```sh -root@rescue12-customer-eu:/# update-grub -``` -/// - - - -#### Label zur SWAP-Partition hinzufügen (falls zutreffend) - -Nachdem wir die EFI-Partition abgeschlossen haben, wechseln wir zur SWAP-Partition. - -Wir verlassen die `chroot`-Umgebung mit `exit`, um unsere [SWAP]-Partition **nvme0n1p4** zu erstellen und das Label `swap-nvme0n1p4` hinzuzufügen: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 -Setting up swapspace version 1, size = 512 MiB (536866816 bytes) -LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd -``` - -Wir prüfen, ob das Label ordnungsgemäß angewendet wurde: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f -NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS -nvme1n1 - -├─nvme1n1p1 -│ vfat FAT16 EFI_SYSPART -│ BA77-E844 504,9M 1% /root/old -├─nvme1n1p2 -│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 -│ └─md2 -│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac -├─nvme1n1p3 -│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c -│ └─md3 -│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441,2G 0% /mnt -└─nvme1n1p4 - swap 1 swap-nvme1n1p4 - d6af33cf-fc15-4060-a43c-cb3b5537f58a -nvme0n1 - -├─nvme0n1p1 -│ vfat FAT16 EFI_SYSPART -│ 477D-6658 -├─nvme0n1p2 -│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 -│ └─md2 -│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac -├─nvme0n1p3 -│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c -│ └─md3 -│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441,2G 0% /mnt -└─nvme0n1p4 - swap 1 swap-nvme0n1p4 - b3c9e03a-52f5-4683-81b6-cc10091 - -# mdadm: /dev/nvme0n1p3 erneut hinzugefügt -``` - -Verwenden Sie den folgenden Befehl, um den RAID-Neuaufbau zu verfolgen: `cat /proc/mdstat`. - -**Erstellen der EFI-Systempartition auf der Festplatte** - -Zunächst installieren wir die erforderlichen Tools: - -**Debian und Ubuntu** - -```sh -[user@server_ip ~]# sudo apt install dosfstools -``` - -**CentOS** - -```sh -[user@server_ip ~]# sudo yum install dosfstools -``` - -Als nächstes formatieren wir die Partition. In unserem Beispiel `nvme0n1p1`: - -```sh -[user@server_ip ~]# sudo mkfs.vfat /dev/nvme0n1p1 -``` - -Als nächstes versehen wir die Partition mit dem Label `EFI_SYSPART` (dieser Name ist spezifisch für OVHcloud) - -```sh -[user@server_ip ~]# sudo fatlabel /dev/nvme0n1p1 EFI_SYSPART -``` - -Sobald dies abgeschlossen ist, können Sie beide Partitionen mithilfe des von uns bereitgestellten Skripts [hier](#script) synchronisieren. 
- -Wir prüfen, ob die neue EFI-Systempartition ordnungsgemäß erstellt wurde und vom System erkannt wird: - -```sh -[user@server_ip ~]# sudo blkid -t LABEL=EFI_SYSPART -/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" -/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" -``` - -Zuletzt aktivieren wir die [SWAP]-Partition (sofern zutreffend): - - -- Wir erstellen und fügen das Label hinzu: - -```sh -[user@server_ip ~]# sudo mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 -``` - -- Wir rufen die UUIDs beider Swap-Partitionen ab: - -```sh -[user@server_ip ~]# sudo blkid -s /dev/nvme0n1p4 -/dev/nvme0n1p4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" -[user@server_ip ~]# sudo blkid -s /dev/nvme1n1p4 -/dev/nvme1n1p4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" -``` - -- Wir ersetzen die alte UUID der Swap-Partition (**nvme0n1p4)** durch die neue in `/etc/fstab`: - -```sh -[user@server_ip ~]# sudo nano /etc/fstab -``` - -Beispiel: - -```sh -[user@server_ip ~]# sudo nano /etc/fstab -UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 -UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 -LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1 -UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 -UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 -``` - -Basierend auf den obigen Ergebnissen ist die alte UUID `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` und sollte durch die neue `b3c9e03a-52f5-4683-81b6-cc10091fcd15` ersetzt werden. - -Stellen Sie sicher, dass Sie die richtige UUID ersetzen. - -Als nächstes führen wir den folgenden Befehl aus, um die Swap-Partition zu aktivieren: - -```sh -[user@server_ip ~]# sudo swapon -av -swapon: /dev/nvme0n1p4: found signature [pagesize=4096, signature=swap] -swapon: /dev/nvme0n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 -swapon /dev/nvme0n1p4 -swapon: /dev/nvme1n1p4: found signature [pagesize=4096, signature=swap] -swapon: /dev/nvme1n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 -swapon /dev/nvme1n1p4 -``` - -Als nächstes laden wir das System neu: - -```sh -[user@server_ip ~]# sudo systemctl daemon-reload -``` - -Wir haben nun erfolgreich den RAID-Neuaufbau abgeschlossen. - -## Weiterführende Informationen - -[Hot Swap - Software-RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) - -[OVHcloud API und Speicher](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) - -[Verwalten von Hardware-RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) - -[Hot Swap - Hardware-RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) - -Für spezialisierte Dienstleistungen (SEO, Entwicklung usw.) wenden Sie sich an [OVHcloud Partner](/links/partner). - -Wenn Sie bei der Nutzung und Konfiguration Ihrer OVHcloud-Lösungen Unterstützung benötigen, wenden Sie sich bitte an unsere [Support-Angebote](/links/support). - -Wenn Sie Schulungen oder technische Unterstützung benötigen, um unsere Lösungen umzusetzen, wenden Sie sich an Ihren Vertriebsmitarbeiter oder klicken Sie auf [diesen Link](/links/professional-services), um ein Angebot anzufordern und unsere Experten für Professional Services um Unterstützung bei Ihrem spezifischen Anwendungsfall zu bitten. 
- -Treten Sie unserer [User Community](/links/community) bei. \ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.es-es.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.es-es.md deleted file mode 100644 index 3ff2210a88b..00000000000 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.es-es.md +++ /dev/null @@ -1,908 +0,0 @@ ---- -title: "Gestión y reconstrucción de un RAID software en servidores que utilizan el modo de arranque UEFI" -excerpt: Aprenda a gestionar y reconstruir un RAID software tras un reemplazo de disco en un servidor que utiliza el modo de arranque UEFI -updated: 2025-12-11 ---- - -## Objetivo - -Un Redundant Array of Independent Disks (RAID) es una tecnología que atenúa la pérdida de datos en un servidor al replicar los datos en dos discos o más. - -El nivel RAID predeterminado para las instalaciones de servidores de OVHcloud es el RAID 1, que duplica el espacio ocupado por sus datos, reduciendo así el espacio de disco utilizable a la mitad. - -**Este tutorial explica cómo gestionar y reconstruir un RAID software tras un reemplazo de disco en su servidor en modo EFI** - -Antes de comenzar, tenga en cuenta que este tutorial se centra en los servidores dedicados que utilizan el modo UEFI como modo de arranque. Este es el caso de las placas base modernas. Si su servidor utiliza el modo de arranque legacy (BIOS), consulte este tutorial: [Gestión y reconstrucción de un RAID software en servidores en modo de arranque legacy (BIOS)](/pages/bare_metal_cloud/dedicated_servers/raid_soft_bios). - -Para verificar si un servidor funciona en modo BIOS legacy o en modo UEFI, ejecute el siguiente comando: - -```sh -[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS -``` - -Para obtener más información sobre UEFI, consulte el siguiente artículo: [https://uefi.org/about](https://uefi.org/about). - -## Requisitos - -- Un [servidor dedicado](/links/bare-metal/bare-metal) con una configuración de RAID software -- Acceso administrativo (sudo) al servidor a través de SSH -- Comprensión del RAID, las particiones y GRUB - -Durante este tutorial, utilizamos los términos **disco principal** y **disco secundario**. En este contexto: - -- El disco principal es el disco cuya ESP (partición del sistema EFI) está montada por Linux -- Los discos secundarios son todos los demás discos del RAID - -## Instrucciones - -Cuando adquiere un nuevo servidor, puede sentir la necesidad de realizar una serie de pruebas y acciones. Una de estas pruebas podría ser simular una falla de disco para comprender el proceso de reconstrucción del RAID y prepararse en caso de problemas. - -### Vista previa del contenido - -- [Información básica](#basicinformation) -- [Comprensión de la partición del sistema EFI (ESP)](#efisystemparition) -- [Simulación de una falla de disco](#diskfailure) - - [Eliminación del disco defectuoso](#diskremove) -- [Reconstrucción del RAID](#raidrebuild) - - [Reconstrucción del RAID después del reemplazo del disco principal (modo de rescate)](#rescuemode) - - [Recreación de la partición del sistema EFI](#recreateesp) - - [Reconstrucción del RAID cuando las particiones EFI no están sincronizadas después de actualizaciones importantes del sistema (ej. 
GRUB)](efiraodgrub) - - [Añadido de la etiqueta a la partición SWAP (si aplica)](#swap-partition) - - [Reconstrucción del RAID en modo normal](#normalmode) - - - -### Información básica - -En una sesión de línea de comandos, escriba el siguiente comando para determinar el estado actual del RAID : - -```sh -[user@server_ip ~]# cat /proc/mdstat -Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md3 : active raid1 nvme1n1p3[1] nvme0n1p3[0] - 497875968 blocks super 1.2 [2/2] [UU] - bitmap: 2/4 pages [8KB], 65536KB chunk - -md2 : active raid1 nvme1n1p2[1] nvme0n1p2[0] - 1046528 blocks super 1.2 [2/2] [UU] - -unused devices: -``` - -Este comando nos muestra que actualmente tenemos dos volúmenes RAID software configurados, **md2** y **md3**, con **md3** siendo el más grande de los dos. **md3** se compone de dos particiones, llamadas **nvme1n1p3** y **nvme0n1p3**. - -El [UU] significa que todos los discos funcionan normalmente. Un `_` indicaría un disco defectuoso. - -Si tiene un servidor con discos SATA, obtendrá los siguientes resultados : - -```sh -[user@server_ip ~]# cat /proc/mdstat -Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md3 : active raid1 sda3[0] sdb3[1] - 3904786432 blocks super 1.2 [2/2] [UU] - bitmap: 2/30 pages [8KB], 65536KB chunk - -md2 : active raid1 sda2[0] sdb2[1] - 1046528 blocks super 1.2 [2/2] [UU] - -unused devices: -``` - -Aunque este comando devuelve nuestros volúmenes RAID, no nos indica el tamaño de las particiones en sí. Podemos encontrar esta información con el siguiente comando : - -```sh -[user@server_ip ~]# sudo fdisk -l - -Disk /dev/nvme1n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors -Disk model: WDC CL SN720 SDAQNTW-512G-2000 -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -Disklabel type: gpt -Disk identifier: A11EDAA3-A984-424B-A6FE-386550A92435 - -Device Start End Sectors Size Type -/dev/nvme1n1p1 2048 1048575 1046528 511M EFI System -/dev/nvme1n1p2 1048576 3145727 2097152 1G Linux RAID -/dev/nvme1n1p3 3145728 999161855 996016128 474.9G Linux RAID -/dev/nvme1n1p4 999161856 1000210431 1048576 512M Linux files - - -Disk /dev/nvme0n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors -Disk model: WDC CL SN720 SDAQNTW-512G-2000 -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -Disklabel type: gpt -Disk identifier: F03AC3C3-D7B7-43F9-88DB-9F12D7281D94 - -Device Start End Sectors Size Type -/dev/nvme0n1p1 2048 1048575 1046528 511M EFI System -/dev/nvme0n1p2 1048576 3145727 2097152 1G Linux RAID -/dev/nvme0n1p3 3145728 999161855 996016128 474.9G Linux RAID -/dev/nvme0n1p4 999161856 1000210431 1048576 512M Linux file -/dev/nvme0n1p5 1000211120 1000215182 4063 2M Linux file - - -Disk /dev/md2: 1022 MiB, 1071644672 bytes, 2093056 sectors -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes - - -Disk /dev/md3: 474.81 GiB, 509824991232 bytes, 995751936 sectors -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -``` - -El comando `fdisk -l` también permite identificar el tipo de sus particiones. Esta es una información importante durante la reconstrucción de su RAID en caso de falla de disco. 
- -Para las particiones **GPT**, la línea 6 mostrará : `Disklabel type: gpt`. - -Siempre basándonos en los resultados de `fdisk -l`, podemos ver que `/dev/md2` se compone de 1022 MiB y `/dev/md3` contiene 474,81 GiB. Si ejecutamos el comando `mount`, también podemos encontrar la disposición de los discos. - -Como alternativa, el comando `lsblk` ofrece una vista diferente de las particiones : - -```sh -[user@server_ip ~]# lsblk -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT -nvme1n1 259:0 0 476.9G 0 disk -├─nvme1n1p1 259:7 0 511M 0 part -├─nvme1n1p2 259:8 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 /boot -├─nvme1n1p3 259:9 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 / -└─nvme1n1p4 259:10 0 512M 0 part [SWAP] -nvme0n1 259:1 0 476.9G 0 disk -├─nvme0n1p1 259:2 0 511M 0 part /boot/efi -├─nvme0n1p2 259:3 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 /boot -├─nvme0n1p3 259:4 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 / -├─nvme0n1p4 259:5 0 512M 0 part [SWAP] -└─nvme0n1p5 259:6 0 2M 0 part -``` - -Además, si ejecutamos `lsblk -f`, obtenemos más información sobre estas particiones, tales como el LABEL y el UUID : - -```sh -[user@server_ip ~]# sudo lsblk -f -NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT -nvme1n1 -├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA -├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea -│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot -├─nvme1n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 -│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / -└─nvme1n1p4 swap 1 swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7 [SWAP] -nvme0n1 -├─nvme0n1p1 vfat FAT16 EFI_SYSPART B486-9781 504.9M 1% /boot/efi -├─nvme0n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea -│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot -├─nvme0n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 -│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / -├─nvme0n1p4 swap 1 swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b [SWAP] -└─nvme0n1p5 iso9660 Joliet Extension config-2 2025-08-05-14-55-41-00 -``` - -Tome nota de los dispositivos, las particiones y sus puntos de montaje; esto es importante, especialmente después del reemplazo de un disco. - -A partir de los comandos y resultados anteriores, tenemos : - -- Dos matrices RAID : `/dev/md2` y `/dev/md3`. -- Cuatro particiones que forman parte del RAID : **nvme0n1p2**, **nvme0n1p3**, **nvme1n1p2**, **nvme0n1p3** con los puntos de montaje `/boot` y `/`. -- Dos particiones no incluidas en el RAID, con los puntos de montaje : `/boot/efi` y [SWAP]. -- Una partición que no tiene punto de montaje : **nvme1n1p1** - -La partición `nvme0n1p5` es una partición de configuración, es decir, un volumen de solo lectura conectado al servidor que le proporciona los datos de configuración inicial. - - - -### Comprender la partición del sistema EFI (ESP) - -***¿Qué es una partición del sistema EFI ?*** - -Una partición del sistema EFI es una partición en la que el servidor inicia. Contiene los archivos de inicio, así como los controladores de inicio o las imágenes del kernel de un sistema operativo instalado. También puede contener programas útiles diseñados para ejecutarse antes de que el sistema operativo inicie, así como archivos de datos tales como registros de errores. 
- -***¿La partición del sistema EFI está incluida en el RAID ?*** - -No, a partir de agosto de 2025, cuando se realiza una instalación del sistema operativo por parte de OVHcloud, la partición ESP no está incluida en el RAID. Cuando utiliza nuestros modelos de sistema operativo para instalar su servidor con un RAID software, se crean varias particiones del sistema EFI: una por disco. Sin embargo, solo se monta una partición EFI a la vez. Todas las ESP creadas contienen los mismos archivos. Todos los ESP creados en el momento de la instalación contienen los mismos archivos. - -La partición del sistema EFI se monta en `/boot/efi` y el disco en el que se monta se selecciona por Linux al iniciar. - -Ejemplo : - -```sh -[user@server_ip ~]# sudo lsblk -f -NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT -nvme1n1 -├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA -├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea -│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 85 - -Le recomendamos sincronizar sus ESP regularmente o después de cada actualización importante del sistema. Por defecto, todas las particiones del sistema EFI contienen los mismos archivos después de la instalación. Sin embargo, si se implica una actualización importante del sistema, la sincronización de los ESP es esencial para mantener el contenido actualizado. - - - -#### Script - -Aquí tiene un script que puede utilizar para sincronizarlos manualmente. También puede ejecutar un script automatizado para sincronizar las particiones diariamente o cada vez que se inicie el servicio. - -Antes de ejecutar el script, asegúrese de que `rsync` esté instalado en su sistema : - -**Debian/Ubuntu** - -```sh -sudo apt install rsync -``` - -**CentOS, Red Hat y Fedora** - -```sh -sudo yum install rsync -``` - -Para ejecutar un script en Linux, necesita un archivo ejecutable : - -- Empiece creando un archivo .sh en el directorio que elija, reemplazando `nombre-del-script` por el nombre que elija. - -```sh -sudo touch nombre-del-script.sh -``` - -- Abra el archivo con un editor de texto y agregue las siguientes líneas : - -```sh -sudo nano nombre-del-script.sh -``` - -```sh -#!/bin/bash - -set -euo pipefail - -MOUNTPOINT="/var/lib/grub/esp" -MAIN_PARTITION=$(findmnt -n -o SOURCE /boot/efi) - -echo "${MAIN_PARTITION} es la partición principal" - -mkdir -p "${MOUNTPOINT}" - -while read -r partition; do - if [[ "${partition}" == "${MAIN_PARTITION}" ]]; then - continue - fi - echo "Trabajo en ${partition}" - mount "${partition}" "${MOUNTPOINT}" - rsync -ax "/boot/efi/" "${MOUNTPOINT}/" - umount "${MOUNTPOINT}" -done < <(blkid -o device -t LABEL=EFI_SYSPART) -``` - -Guarde y cierre el archivo. - -- Haga que el script sea ejecutable - -```sh -sudo chmod +x nombre-del-script.sh -``` - -- Ejecute el script - -```sh -sudo ./nombre-del-script.sh -``` - -- Si no está en el directorio - -```sh -./ruta/hacia/el/directorio/nombre-del-script.sh -``` - -Cuando se ejecuta el script, el contenido de la partición EFI montada se sincronizará con las demás. Para acceder al contenido, puede montar una de estas particiones EFI no montadas en el punto de montaje: `/var/lib/grub/esp`. - - - -### Simulación de una falla de disco - -Ahora que tenemos toda la información necesaria, podemos simular una falla de disco y proceder a los tests. En este primer ejemplo, provocaremos una falla del disco principal `nvme0n1`. - -El método preferido para hacerlo es a través del modo rescue de OVHcloud. 
- -Reinicie primero el servidor en modo rescue y conéctese con las credenciales proporcionadas. - -Para retirar un disco del RAID, el primer paso es marcarlo como **Failed** y retirar las particiones de sus matrices RAID respectivas. - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] -md3 : active raid1 nvme0n1p3[0] nvme1n1p3[1] - 497875968 blocks super 1.2 [2/2] [UU] - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] - 1046528 blocks super 1.2 [2/2] [UU] - -unused devices: -``` - -A partir del resultado anterior, nvme0n1 contiene dos particiones en RAID que son **nvme0n1p2** y **nvme0n1p3**. - - - -#### Retiro del disco defectuoso - -En primer lugar, marcamos las particiones **nvme0n1p2** y **nvme0n1p3** como defectuosas. - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/nvme0n1p2 -# mdadm: set /dev/nvme0n1p2 faulty in /dev/md2 -``` - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --fail /dev/nvme0n1p3 -# mdadm: set /dev/nvme0n1p3 faulty in /dev/md3 -``` - -Cuando ejecutamos el comando `cat /proc/mdstat`, obtenemos : - -```sh -root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] -md3 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1] - 497875968 blocks super 1.2 [2/1] [_U] - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme0n1p2[2](F) nvme1n1p2[1] - 1046528 blocks super 1.2 [2/1] [_U] - -unused devices: -``` - -Como podemos ver arriba, el [F] al lado de las particiones indica que el disco está defectuoso o fallido. - -A continuación, retiramos estas particiones de las matrices RAID para eliminar completamente el disco del RAID. - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --remove /dev/nvme0n1p2 -# mdadm: hot removed /dev/nvme0n1p2 from /dev/md2 -``` - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --remove /dev/nvme0n1p3 -# mdadm: hot removed /dev/nvme0n1p3 from /dev/md3 -``` - -El estado de nuestro RAID debería parecerse ahora a esto : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] -md3 : active raid1 nvme1n1p3[1] - 497875968 blocks super 1.2 [2/1] [_U] - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme1n1p2[1] - 1046528 blocks super 1.2 [2/1] [_U] - -unused devices: -``` - -De acuerdo con los resultados anteriores, podemos ver que ahora solo hay dos particiones en las matrices RAID. Hemos logrado degradar el disco **nvme0n1**. 
- -Para asegurarnos de obtener un disco similar a un disco vacío, utilizamos el siguiente comando en cada partición, y luego en el disco mismo : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # -shred -s10M -n1 /dev/nvme0n1p1 -shred -s10M -n1 /dev/nvme0n1p2 -shred -s10M -n1 /dev/nvme0n1p3 -shred -s10M -n1 /dev/nvme0n1p4 -shred -s10M -n1 /dev/nvme0n1p5 -shred -s10M -n1 /dev/nvme0n1 -``` - -El disco ahora aparece como un disco nuevo y vacío : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk - -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS -nvme1n1 259:0 0 476.9G 0 disk -├─nvme1n1p1 259:1 0 511M 0 part -├─nvme1n1p2 259:2 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 -├─nvme1n1p3 259:3 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 -└─nvme1n1p4 259:4 0 512M 0 part -nvme0n1 259:5 0 476.9G 0 disk -``` - -Si ejecutamos el siguiente comando, verificamos que nuestro disco ha sido correctamente "borrado" : - -```sh -parted /dev/nvme0n1 -GNU Parted 3.5 -Using /dev/nvme0n1 -Welcome to GNU Parted! Type 'help' to view a list of commands. -(parted) p -Error: /dev/nvme0n1: unrecognised disk label -Model: WDC CL SN720 SDAQNTW-512G-2000 (nvme) -Disk /dev/nvme0n1: 512GB -Sector size (logical/physical): 512B/512B -Partition Table: unknown -Disk Flags: -``` - -Para obtener más información sobre la preparación y la solicitud de reemplazo de un disco, consulte este [guía](/pages/bare_metal_cloud/dedicated_servers/disk_replacement). - -Si ejecuta el siguiente comando, puede obtener más detalles sobre las matrices RAID : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md3 - -/dev/md3: - Version : 1.2 - Creation Time : Fri Aug 1 14:51:13 2025 - Raid Level : raid1 - Array Size : 497875968 (474.81 GiB 509.82 GB) - Used Dev Size : 497875968 (474.81 GiB 509.82 GB) - Raid Devices : 2 - Total Devices : 1 - Persistence : Superblock is persistent - - Intent Bitmap : Internal - - Update Time : Fri Aug 1 15:56:17 2025 - State : clean, degraded - Active Devices : 1 - Working Devices : 1 - Failed Devices : 0 - Spare Devices : 0 - -Consistency Policy : bitmap - - Name : md3 - UUID : b383c3d5:7fb1bb5e:6b7c4d96:6ea817ff - Events : 215 - - Number Major Minor RaidDevice State - - 0 0 0 removed - 1 259 4 1 active sync /dev/nvme1n1p3 -``` - -Ahora podemos proceder al reemplazo del disco. - - - -### Reconstrucción del RAID - -> [!primary] -> Este proceso puede variar según el sistema operativo instalado en su servidor. Le recomendamos consultar la documentación oficial de su sistema operativo para obtener los comandos adecuados. -> - -> [!warning] -> -> En la mayoría de los servidores con RAID software, después de un reemplazo de disco, el servidor puede arrancar en modo normal (sobre el disco sano) y la reconstrucción puede realizarse en modo normal. Sin embargo, si el servidor no puede arrancar en modo normal después del reemplazo del disco, reiniciará en modo rescue para proceder a la reconstrucción del RAID. -> -> Si su servidor puede arrancar en modo normal después del reemplazo del disco, simplemente siga los pasos de [esta sección](#rebuilding-the-raid-in-normal-mode). - - - -#### Reconstrucción del RAID en modo rescue - -Una vez reemplazado el disco, el siguiente paso consiste en copiar la tabla de particiones del disco sano (en este ejemplo, nvme1n1) en el nuevo (nvme0n1). 
-
-**Para particiones GPT**
-
-El comando debe tener este formato: `sgdisk -R /dev/nuevo disco /dev/disco sano`
-
-En nuestro ejemplo :
-
-```sh
-root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/nvme0n1 /dev/nvme1n1
-```
-
-Ejecute `lsblk` para asegurarse de que las tablas de particiones se hayan copiado correctamente :
-
-```sh
-root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk
-
-NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
-nvme1n1     259:0    0 476.9G  0 disk
-├─nvme1n1p1 259:1    0   511M  0 part
-├─nvme1n1p2 259:2    0     1G  0 part
-│ └─md2       9:2    0  1022M  0 raid1
-├─nvme1n1p3 259:3    0 474.9G  0 part
-│ └─md3       9:3    0 474.8G  0 raid1
-└─nvme1n1p4 259:4    0   512M  0 part
-nvme0n1     259:5    0 476.9G  0 disk
-├─nvme0n1p1 259:10   0   511M  0 part
-├─nvme0n1p2 259:11   0     1G  0 part
-├─nvme0n1p3 259:12   0 474.9G  0 part
-└─nvme0n1p4 259:13   0   512M  0 part
-```
-
-Una vez hecho esto, el siguiente paso consiste en asignar un GUID aleatorio al nuevo disco para evitar conflictos de GUID con otros discos :
-
-```sh
-root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -G /dev/nvme0n1
-```
-
-Si recibe el siguiente mensaje :
-
-```console
-Warning: The kernel is still using the old partition table.
-The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8)
-The operation has completed successfully.
-```
-
-Simplemente ejecute el comando `partprobe`.
-
-Ahora podemos reconstruir la matriz RAID. El siguiente fragmento de código muestra cómo agregar nuevamente las nuevas particiones (nvme0n1p2 y nvme0n1p3) a la matriz RAID.
-
-```sh
-root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md2 /dev/nvme0n1p2
-# mdadm: added /dev/nvme0n1p2
-root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md3 /dev/nvme0n1p3
-# mdadm: re-added /dev/nvme0n1p3
-```
-
-Para verificar el proceso de reconstrucción:
-
-```sh
-root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat
-Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
-md3 : active raid1 nvme0n1p3[2] nvme1n1p3[1]
-      497875968 blocks super 1.2 [2/1] [_U]
-      [>....................]  recovery =  0.1% (801920/497875968) finish=41.3min speed=200480K/sec
-      bitmap: 0/4 pages [0KB], 65536KB chunk
-
-md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1]
-      1046528 blocks super 1.2 [2/2] [UU]
-```
-
-Una vez que la reconstrucción del RAID esté terminada, ejecute el siguiente comando para asegurarse de que las particiones se hayan agregado correctamente al RAID:
-
-```sh
-root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f
-NAME        FSTYPE            FSVER LABEL          UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
-nvme1n1
-├─nvme1n1p1 vfat              FAT16 EFI_SYSPART    4629-D183
-├─nvme1n1p2 linux_raid_member 1.2   md2            83719c5c-2a27-2a56-5268-7d49d8a1d84f
-│ └─md2     ext4              1.0   boot           4de80ae0-dd90-4256-9135-1735e7be4b4d
-├─nvme1n1p3 linux_raid_member 1.2   md3            b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff
-│ └─md3     ext4              1.0   root           9bf386b6-9523-46bf-b8e5-4b8cc7c5786f
-└─nvme1n1p4 swap              1     swap-nvme1n1p4 9bf292e8-0145-4d2f-b891-4cef93c0d209
-nvme0n1
-├─nvme0n1p1
-├─nvme0n1p2 linux_raid_member 1.2   md2            83719c5c-2a27-2a56-5268-7d49d8a1d84f
-│ └─md2     ext4              1.0   boot           4de80ae0-dd90-4256-9135-1735e7be4b4d
-├─nvme0n1p3 linux_raid_member 1.2   md3            b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff
-│ └─md3     ext4              1.0   root           9bf386b6-9523-46bf-b8e5-4b8cc7c5786f
-└─nvme0n1p4
-```
-
-Según los resultados anteriores, las particiones del nuevo disco se han agregado correctamente al RAID. 
Sin embargo, la partición del sistema EFI y la partición SWAP (en algunos casos) no se han duplicado, lo cual es normal ya que no forman parte del RAID. - -> [!warning] -> Los ejemplos anteriores ilustran simplemente los pasos necesarios basados en una configuración de servidor predeterminada. Los resultados de cada comando dependen del tipo de hardware instalado en su servidor y de la estructura de sus particiones. En caso de duda, consulte la documentación de su sistema operativo. -> -> Si necesita asistencia profesional para la administración de su servidor, consulte los detalles de la sección [Más información](#go-further) de esta guía. -> - - - -#### Recreación de la partición del sistema EFI - -Para recrear la partición del sistema EFI, debemos formatear **nvme0n1p1** y replicar el contenido de la partición del sistema EFI sana (en nuestro ejemplo: nvme1n1p1) en esta última. - -Aquí, asumimos que ambas particiones se han sincronizado y contienen archivos actualizados o simplemente no han sufrido actualizaciones del sistema que afecten al *bootloader*. - -> [!warning] -> Si se ha realizado una actualización importante del sistema, como una actualización del kernel o de GRUB, y las dos particiones no se han sincronizado, consulte esta [sección](#rebuilding-raid-when-efi-partitions-are-not-synchronized-after-major-system-updates-eg-grub) una vez que haya terminado de crear la nueva partición del sistema EFI. -> - -Primero, formateamos la partición: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkfs.vfat /dev/nvme0n1p1 -``` - -A continuación, asignamos la etiqueta `EFI_SYSPART` a la partición. (este nombre es específico de OVHcloud): - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # fatlabel /dev/nvme0n1p1 EFI_SYSPART -``` - -Luego, duplicamos el contenido de nvme1n1p1 en nvme0n1p1. Comenzamos creando dos directorios, que llamamos « old » y « new » en nuestro ejemplo: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkdir old new -``` - -A continuación, montamos **nvme1n1p1** en el directorio « old » y **nvme0n1p1** en el directorio « new » para diferenciarlos: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme1n1p1 old -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme0n1p1 new -``` - -Luego, copiamos los archivos del directorio 'old' a 'new': - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # rsync -axv old/ new/ -sending incremental file list -EFI/ -EFI/debian/ -EFI/debian/BOOTX64.CSV -EFI/debian/fbx64.efi -EFI/debian/grub.cfg -EFI/debian/grubx64.efi -EFI/debian/mmx64.efi -EFI/debian/shimx64.efi - -sent 6,099,848 bytes received 165 bytes 12,200,026.00 bytes/sec -total size is 6,097,843 speedup is 1.00 -``` - -Una vez hecho esto, desmontamos ambas particiones: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme0n1p1 -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme1n1p1 -``` - -A continuación, montamos la partición que contiene la raíz de nuestro sistema operativo en `/mnt`. 
En nuestro ejemplo, esta partición es **md3**: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md3 /mnt -``` - -Montamos los siguientes directorios para asegurarnos de que cualquier manipulación que realicemos en el entorno `chroot` funcione correctamente: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # -mount --types proc /proc /mnt/proc -mount --rbind /sys /mnt/sys -mount --make-rslave /mnt/sys -mount --rbind /dev /mnt/dev -mount --make-rslave /mnt/dev -mount --bind /run /mnt/run -mount --make-slave /mnt/run -``` - -Luego, utilizamos el comando `chroot` para acceder al punto de montaje y asegurarnos de que la nueva partición del sistema EFI se ha creado correctamente y que el sistema reconoce las dos ESP: - -```sh -root@rescue12-customer-eu:/# chroot /mnt -``` - -Para mostrar las particiones ESP, ejecutamos el comando `blkid -t LABEL=EFI_SYSPART`: - -```sh -root@rescue12-customer-eu:/# blkid -t LABEL=EFI_SYSPART -/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" -/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" -``` - -Los resultados anteriores muestran que la nueva partición EFI se ha creado correctamente y que la etiqueta se ha aplicado correctamente. - - - -#### Reconstrucción del RAID cuando las particiones EFI no están sincronizadas después de actualizaciones importantes del sistema (GRUB) - -/// details | Despliegue esta sección - -> [!warning] -> Siga los pasos de esta sección solo si se aplica a su caso. -> - -Cuando las particiones del sistema EFI no están sincronizadas después de actualizaciones importantes del sistema que modifican/afectan a GRUB, y se reemplaza el disco principal en el que se monta la partición, el arranque desde un disco secundario que contiene una ESP obsoleta puede no funcionar. - -En este caso, además de reconstruir el RAID y recrear la partición del sistema EFI en modo rescue, también debe reinstalar GRUB en esta última. - -Una vez que hayamos recreado la partición EFI y nos aseguremos de que el sistema reconoce las dos particiones (pasos anteriores en `chroot`), creamos el directorio `/boot/efi` para montar la nueva partición del sistema EFI **nvme0n1p1**: - -```sh -root@rescue12-customer-eu:/# mount /boot -root@rescue12-customer-eu:/# mount /dev/nvme0n1p1 /boot/efi -``` - -A continuación, reinstalamos el cargador de arranque GRUB (*bootloader*): - -```sh -root@rescue12-customer-eu:/# grub-install --efi-directory=/boot/efi /dev/nvme0n1p1 -``` - -Una vez hecho esto, ejecute el siguiente comando: - -```sh -root@rescue12-customer-eu:/# update-grub -``` -/// - - - -#### Añadimos la etiqueta a la partición SWAP (si aplica) - -Una vez que hayamos terminado con la partición EFI, pasamos a la partición SWAP. 
- -Salimos del entorno `chroot` con `exit` para recrear nuestra partición [SWAP] **nvme0n1p4** y añadir la etiqueta `swap-nvme0n1p4`: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 -Setting up swapspace version 1, size = 512 MiB (536866816 bytes) -LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd -``` - -Verificamos que la etiqueta se haya aplicado correctamente: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f -NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS -nvme1n1 - -├─nvme1n1p1 -│ vfat FAT16 EFI_SYSPART -│ BA77-E844 504.9M 1% /root/old -├─nvme1n1p2 -│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 -│ └─md2 -│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac -├─nvme1n1p3 -│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c -│ └─md3 -│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt -└─nvme1n1p4 - swap 1 swap-nvme1n1p4 - d6af33cf-fc15-4060-a43c-cb3b5537f58a -nvme0n1 - -├─nvme0n1p1 -│ vfat FAT16 EFI_SYSPART -│ 477D-6658 -├─nvme0n1p2 -│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 -│ └─md2 -│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac -├─nvme0n1p3 -│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c -│ └─md3 -│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt -└─nvme0n1p4 - swap 1 swap-nvme0n1p4 - b3c9e03a-52f5-4683-81b6-cc10091fcd15 -``` - -Accedemos nuevamente al entorno `chroot`: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt -``` - -Recuperamos el UUID de ambas particiones swap: - -```sh - -# mdadm: re-added /dev/nvme0n1p3 -``` - -Utilice el siguiente comando para seguir la reconstrucción del RAID: `cat /proc/mdstat`. - -**Recreación de la partición EFI System en el disco** - -En primer lugar, instalamos las herramientas necesarias: - -**Debian y Ubuntu** - -```sh -[user@server_ip ~]# sudo apt install dosfstools -``` - -**CentOS** - -```sh -[user@server_ip ~]# sudo yum install dosfstools -``` - -A continuación, formateamos la partición. En nuestro ejemplo `nvme0n1p1`: - -```sh -[user@server_ip ~]# sudo mkfs.vfat /dev/nvme0n1p1 -``` - -A continuación, asignamos la etiqueta `EFI_SYSPART` a la partición. (este nombre es específico de OVHcloud): - -```sh -[user@server_ip ~]# sudo fatlabel /dev/nvme0n1p1 EFI_SYSPART -``` - -Una vez hecho esto, puede sincronizar las dos particiones utilizando el script que hemos proporcionado [aquí](#script). 
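-
-Si desea automatizar esta sincronización (por ejemplo, una vez al día), un enfoque posible es una entrada de cron. Ejemplo meramente ilustrativo, suponiendo que haya guardado el script como `/usr/local/sbin/sincronizar-esp.sh`:
-
-```sh
-# Ejemplo hipotético: sincronizar las particiones EFI cada día a las 04:00
-echo '0 4 * * * root /usr/local/sbin/sincronizar-esp.sh' | sudo tee /etc/cron.d/sincronizar-esp
-```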
- -Comprobamos que la nueva partición EFI System se ha creado correctamente y que el sistema la reconoce: - -```sh -[user@server_ip ~]# sudo blkid -t LABEL=EFI_SYSPART -/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" -/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" -``` - -Finalmente, activamos la partición [SWAP] (si aplica): - -- Creamos y añadimos la etiqueta: - -```sh -[user@server_ip ~]# sudo mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 -``` - -- Recuperamos los UUID de las dos particiones swap: - -```sh -[user@server_ip ~]# sudo blkid -s /dev/nvme0n1p4 -/dev/nvme0n1p4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" -[user@server_ip ~]# sudo blkid -s /dev/nvme1n1p4 -/dev/nvme1n1p4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" -``` - -- Reemplazamos el antiguo UUID de la partición swap (**nvme0n1p4)** por el nuevo en `/etc/fstab`: - -```sh -[user@server_ip ~]# sudo nano /etc/fstab -``` - -Ejemplo: - -```sh -[user@server_ip ~]# sudo nano /etc/fstab -UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 -UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 -LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1 -UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 -UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 -``` - -Según los resultados anteriores, el antiguo UUID es `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` y debe ser reemplazado por el nuevo `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. - -Asegúrese de reemplazar el UUID correcto. - -A continuación, ejecutamos el siguiente comando para activar la partición swap: - -```sh -[user@server_ip ~]# sudo swapon -av -swapon: /dev/nvme0n1p4: found signature [pagesize=4096, signature=swap] -swapon: /dev/nvme0n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 -swapon /dev/nvme0n1p4 -swapon: /dev/nvme1n1p4: found signature [pagesize=4096, signature=swap] -swapon: /dev/nvme1n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 -swapon /dev/nvme1n1p4 -``` - -A continuación, recargamos el sistema: - -```sh -[user@server_ip ~]# sudo systemctl daemon-reload -``` - -Hemos terminado con éxito la reconstrucción del RAID. - -## Más información - -[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) - -[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) - -[Managing hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) - -[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) - -Para servicios especializados (SEO, desarrollo, etc.), contacte con [los socios de OVHcloud](/links/partner). - -Si necesita asistencia para utilizar y configurar sus soluciones OVHcloud, consulte nuestras [ofertas de soporte](/links/support). - -Si necesita formación o asistencia técnica para implementar nuestras soluciones, contacte con su representante comercial o haga clic en [este enlace](/links/professional-services) para obtener un presupuesto y solicitar que los expertos del equipo de Professional Services intervengan en su caso de uso específico. - -Únase a nuestra [comunidad de usuarios](/links/community). 
\ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.it-it.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.it-it.md deleted file mode 100644 index 28d39c3c16d..00000000000 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.it-it.md +++ /dev/null @@ -1,896 +0,0 @@ ---- -title: "Gestione e ricostruzione di un RAID software sui server in modalità di avvio UEFI" -excerpt: Scopri come gestire e ricostruire un RAID software dopo il ripristino di un disco su un server in modalità di avvio UEFI -updated: 2025-12-11 ---- - -## Obiettivo - -Un Redundant Array of Independent Disks (RAID) è una tecnologia che riduce la perdita di dati su un server replicando i dati su due dischi o più. - -Il livello RAID predefinito per le installazioni dei server OVHcloud è il RAID 1, che raddoppia lo spazio occupato dai vostri dati, riducendo quindi la capacità di archiviazione utilizzabile a metà. - -**Questa guida spiega come gestire e ricostruire un RAID software dopo il ripristino di un disco sul vostro server in modalità EFI** - -Prima di iniziare, notate che questa guida si concentra sui Server dedicati che utilizzano la modalità UEFI come modalità di avvio. Questo è il caso delle schede madri moderne. Se il vostro server utilizza la modalità di avvio legacy (BIOS), consultate questa guida: [Gestione e ricostruzione di un RAID software su server in modalità di avvio legacy (BIOS)](/pages/bare_metal_cloud/dedicated_servers/raid_soft_bios). - -Per verificare se un server funziona in modalità BIOS legacy o in modalità UEFI, eseguite il comando seguente: - -```sh -[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS -``` - -Per ulteriori informazioni sull'UEFI, consultate l'articolo seguente: [https://uefi.org/about](https://uefi.org/about). - -## Prerequisiti - -- Un [server dedicato](/links/bare-metal/bare-metal) con una configurazione RAID software -- Un accesso amministrativo (sudo) al server tramite SSH -- Una comprensione del RAID, delle partizioni e di GRUB - -Durante questa guida utilizzeremo i termini **disco principale** e **disco secondario**. In questo contesto: - -- Il disco principale è il disco il cui ESP (EFI System Partition) è montato da Linux -- I dischi secondari sono tutti gli altri dischi del RAID - -## Procedura - -Quando acquisti un nuovo server, potresti sentire il bisogno di effettuare una serie di test e azioni. Un tale test potrebbe consistere nel simulare un guasto del disco per comprendere il processo di ricostruzione del RAID e prepararti in caso di problemi. - -### Panoramica del contenuto - -- [Informazioni di base](#basicinformation) -- [Comprendere la partizione del sistema EFI (ESP)](#efisystemparition) -- [Simulazione di un guasto del disco](#diskfailure) - - [Rimozione del disco guasto](#diskremove) -- [Ricostruzione del RAID](#raidrebuild) - - [Ricostruzione del RAID dopo la sostituzione del disco principale (modalità di salvataggio)](#rescuemode) - - [Ricreazione della partizione del sistema EFI](#recreateesp) - - [Ricostruzione del RAID quando le partizioni EFI non sono sincronizzate dopo aggiornamenti importanti del sistema (es. 
GRUB)](efiraodgrub) - - [Aggiunta dell'etichetta alla partizione SWAP (se applicabile)](#swap-partition) - - [Ricostruzione del RAID in modalità normale](#normalmode) - - - -### Informazioni di base - -In una sessione della riga di comando, digita il comando seguente per determinare lo stato corrente del RAID : - -```sh -[user@server_ip ~]# cat /proc/mdstat -Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md3 : active raid1 nvme1n1p3[1] nvme0n1p3[0] - 497875968 blocks super 1.2 [2/2] [UU] - bitmap: 2/4 pages [8KB], 65536KB chunk - -md2 : active raid1 nvme1n1p2[1] nvme0n1p2[0] - 1046528 blocks super 1.2 [2/2] [UU] - -unused devices: -``` - -Questo comando ci mostra che attualmente abbiamo due volumi RAID software configurati, **md2** e **md3**, con **md3** che è il più grande dei due. **md3** è composto da due partizioni, chiamate **nvme1n1p3** e **nvme0n1p3**. - -Il [UU] significa che tutti i dischi funzionano normalmente. Un `_` indicherebbe un disco guasto. - -Se hai un server con dischi SATA, otterrai i seguenti risultati : - -```sh -[user@server_ip ~]# cat /proc/mdstat -Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md3 : active raid1 sda3[0] sdb3[1] - 3904786432 blocks super 1.2 [2/2] [UU] - bitmap: 2/30 pages [8KB], 65536KB chunk - -md2 : active raid1 sda2[0] sdb2[1] - 1046528 blocks super 1.2 [2/2] [UU] - -unused devices: -``` - -Sebbene questo comando restituisca i nostri volumi RAID, non ci indica la dimensione delle partizioni stesse. Possiamo trovare queste informazioni con il comando seguente : - -```sh -[user@server_ip ~]# sudo fdisk -l - -Disk /dev/nvme1n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors -Disk model: WDC CL SN720 SDAQNTW-512G-2000 -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -Disklabel type: gpt -Disk identifier: A11EDAA3-A984-424B-A6FE-386550A92435 - -Device Start End Sectors Size Type -/dev/nvme1n1p1 2048 1048575 1046528 511M EFI System -/dev/nvme1n1p2 1048576 3145727 2097152 1G Linux RAID -/dev/nvme1n1p3 3145728 999161855 996016128 474.9G Linux RAID -/dev/nvme1n1p4 999161856 1000210431 1048576 512M Linux files - - -Disk /dev/nvme0n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors -Disk model: WDC CL SN720 SDAQNTW-512G-2000 -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -Disklabel type: gpt -Disk identifier: F03AC3C3-D7B7-43F9-88DB-9F12D7281D94 - -Device Start End Sectors Size Type -/dev/nvme0n1p1 2048 1048575 1046528 511M EFI System -/dev/nvme0n1p2 1048576 3145727 2097152 1G Linux RAID -/dev/nvme0n1p3 3145728 999161855 996016128 474.9G Linux RAID -/dev/nvme0n1p4 999161856 1000210431 1048576 512M Linux file -/dev/nvme0n1p5 1000211120 1000215182 4063 2M Linux file - - -Disk /dev/md2: 1022 MiB, 1071644672 bytes, 2093056 sectors -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes - - -Disk /dev/md3: 474.81 GiB, 509824991232 bytes, 995751936 sectors -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -``` - -Il comando `fdisk -l` consente anche di identificare il tipo delle tue partizioni. Questa informazione è importante durante la ricostruzione del tuo RAID in caso di guasto del disco. 
- -Per le partizioni **GPT**, la riga 6 mostrerà: `Disklabel type: gpt`. - -Ancora basandosi sui risultati di `fdisk -l`, possiamo vedere che `/dev/md2` è composto da 1022 MiB e `/dev/md3` contiene 474,81 GiB. Se eseguiamo il comando `mount`, possiamo anche trovare la disposizione dei dischi. - -In alternativa, il comando `lsblk` offre una visione diversa delle partizioni : - -```sh -[user@server_ip ~]# lsblk -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT -nvme1n1 259:0 0 476.9G 0 disk -├─nvme1n1p1 259:7 0 511M 0 part -├─nvme1n1p2 259:8 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 /boot -├─nvme1n1p3 259:9 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 / -└─nvme1n1p4 259:10 0 512M 0 part [SWAP] -nvme0n1 259:1 0 476.9G 0 disk -├─nvme0n1p1 259:2 0 511M 0 part /boot/efi -├─nvme0n1p2 259:3 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 /boot -├─nvme0n1p3 259:4 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 / -├─nvme0n1p4 259:5 0 512M 0 part [SWAP] -└─nvme0n1p5 259:6 0 2M 0 part -``` - -Inoltre, se eseguiamo `lsblk -f`, otteniamo ulteriori informazioni su queste partizioni, come l'etichetta (LABEL) e l'UUID : - -```sh -[user@server_ip ~]# sudo lsblk -f -NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT -nvme1n1 -├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA -├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea -│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot -├─nvme1n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 -│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / -└─nvme1n1p4 swap 1 swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7 [SWAP] -nvme0n1 -├─nvme0n1p1 vfat FAT16 EFI_SYSPART B486-9781 504.9M 1% /boot/efi -├─nvme0n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea -│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot -├─nvme0n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 -│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / -├─nvme0n1p4 swap 1 swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b [SWAP] -└─nvme0n1p5 iso9660 Joliet Extension config-2 2025-08-05-14-55-41-00 -``` - -Prendi nota dei dispositivi, delle partizioni e dei loro punti di montaggio; è importante, soprattutto dopo la sostituzione di un disco. - -Dalle comandi e risultati sopra, abbiamo : - -- Due matrici RAID : `/dev/md2` e `/dev/md3`. -- Quattro partizioni che fanno parte del RAID : **nvme0n1p2**, **nvme0n1p3**, **nvme1n1p2**, **nvme0n1p3** con i punti di montaggio `/boot` e `/`. -- Due partizioni non incluse nel RAID, con i punti di montaggio : `/boot/efi` e [SWAP]. -- Una partizione che non possiede un punto di montaggio : **nvme1n1p1** - -La partizione `nvme0n1p5` è una partizione di configurazione, cioè un volume in sola lettura connesso al server che gli fornisce i dati di configurazione iniziale. - - - -### Comprendere la partizione del sistema EFI (ESP) - -***Cos'è una partizione del sistema EFI ?*** - -Una partizione del sistema EFI è una partizione su cui il server si avvia. Contiene i file di avvio, ma anche i gestori di avvio o le immagini del kernel di un sistema operativo installato. Può anche contenere programmi utili progettati per essere eseguiti prima che il sistema operativo si avvii, così come file di dati come registri degli errori. - -***La partizione del sistema EFI è inclusa nel RAID ?*** - -No, a partire da agosto 2025, quando un'installazione del sistema operativo viene effettuata da OVHcloud, la partizione ESP non è inclusa nel RAID. 
Quando si utilizzano i nostri modelli OS per installare il server con un RAID software, vengono create più partizioni del sistema EFI: una per disco. Tuttavia, solo una partizione EFI è montata alla volta. Tutte le ESP create al momento dell'installazione contengono gli stessi file.
-
-La partizione del sistema EFI è montata a `/boot/efi` e il disco su cui è montata viene selezionato da Linux all'avvio.
-
-Esempio :
-
-```sh
-[user@server_ip ~]# sudo lsblk -f
-NAME        FSTYPE            FSVER LABEL          UUID                                 FSAVAIL FSUSE% MOUNTPOINT
-nvme1n1
-├─nvme1n1p1 vfat              FAT16 EFI_SYSPART    B493-9DFA
-├─nvme1n1p2 linux_raid_member 1.2   md2            baae988b-bef3-fc07-615f-6f9043cfd5ea
-│ └─md2     ext4              1.0   boot           96850c4e-e2b5-4048-8c39-525194e441aa  851.8M     7% /boot
-├─nvme1n1p3 linux_raid_member 1.2   md3            ce0c7fac-0032-054c-eef7-7463b2245519
-│ └─md3     ext4              1.0   root           6fea39e9-6297-4ea3-82f1-bf1a3e88106a  441.3G     0% /
-└─nvme1n1p4 swap              1     swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7                [SWAP]
-nvme0n1
-├─nvme0n1p1 vfat              FAT16 EFI_SYSPART    B486-9781                             504.9M     1% /boot/efi
-├─nvme0n1p2 linux_raid_member 1.2   md2            baae988b-bef3-fc07-615f-6f9043cfd5ea
-│ └─md2     ext4              1.0   boot           96850c4e-e2b5-4048-8c39-525194e441aa  851.8M     7% /boot
-├─nvme0n1p3 linux_raid_member 1.2   md3            ce0c7fac-0032-054c-eef7-7463b2245519
-│ └─md3     ext4              1.0   root           6fea39e9-6297-4ea3-82f1-bf1a3e88106a  441.3G     0% /
-├─nvme0n1p4 swap              1     swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b                [SWAP]
-└─nvme0n1p5 iso9660           Joliet Extension    config-2       2025-08-05-14-55-41-00
-```
-
-In questo esempio, la partizione del sistema EFI montata è `nvme0n1p1`, con punto di montaggio `/boot/efi`.
-
-Ti consigliamo di sincronizzare regolarmente le tue ESP o dopo ogni aggiornamento importante del sistema. Per default, tutte le partizioni del sistema EFI contengono gli stessi file dopo l'installazione. Tuttavia, dopo un aggiornamento importante del sistema, la sincronizzazione delle ESP è essenziale per mantenerne aggiornato il contenuto.
-
-<a name="script"></a>
-
-#### Script
-
-Ecco uno script che puoi utilizzare per sincronizzarle manualmente. Puoi anche eseguire uno script automatizzato per sincronizzare le partizioni quotidianamente o ogni volta che il servizio parte.
-
-Prima di eseguire lo script, assicurati che `rsync` sia installato sul tuo sistema :
-
-**Debian/Ubuntu**
-
-```sh
-sudo apt install rsync
-```
-
-**CentOS, Red Hat e Fedora**
-
-```sh
-sudo yum install rsync
-```
-
-Per eseguire uno script su Linux, hai bisogno di un file eseguibile :
-
-- Inizia creando un file .sh nella directory di tuo interesse, sostituendo `nome-script` con il nome che preferisci.
-
-```sh
-sudo touch nome-script.sh
-```
-
-- Apri il file con un editor di testo e aggiungi le seguenti righe :
-
-```sh
-sudo nano nome-script.sh
-```
-
-```sh
-#!/bin/bash
-
-set -euo pipefail
-
-MOUNTPOINT="/var/lib/grub/esp"
-MAIN_PARTITION=$(findmnt -n -o SOURCE /boot/efi)
-
-echo "${MAIN_PARTITION} è la partizione principale"
-
-mkdir -p "${MOUNTPOINT}"
-
-while read -r partition; do
-    if [[ "${partition}" == "${MAIN_PARTITION}" ]]; then
-        continue
-    fi
-    echo "Lavoro su ${partition}"
-    mount "${partition}" "${MOUNTPOINT}"
-    rsync -ax "/boot/efi/" "${MOUNTPOINT}/"
-    umount "${MOUNTPOINT}"
-done < <(blkid -o device -t LABEL=EFI_SYSPART)
-```
-
-Salva e chiudi il file.
-
-- Rendi lo script eseguibile
-
-```sh
-sudo chmod +x nome-script.sh
-```
-
-- Esegui lo script
-
-```sh
-sudo ./nome-script.sh
-```
-
-- Se non sei nella directory dello script
-
-```sh
-./percorso/verso/la/cartella/nome-script.sh
-```
-
-Quando lo script viene eseguito, il contenuto della partizione EFI montata verrà sincronizzato con le altre. Per accedere al contenuto, puoi montare una delle partizioni EFI non montate sul punto di montaggio `/var/lib/grub/esp`.
-
-<a name="diskfailure"></a>
-
-### Simulazione di un guasto del disco
-
-Ora che abbiamo tutte le informazioni necessarie, possiamo simulare un guasto del disco e procedere ai test. In questo primo esempio, provocheremo un guasto del disco principale `nvme0n1`.
-
-Il metodo preferito per farlo è attraverso la modalità rescue di OVHcloud.
-
-Riavvia prima il server in modalità rescue e collegati con le credenziali fornite.
-
-Per rimuovere un disco dal RAID, il primo passo è contrassegnarlo come **Failed** e rimuovere le partizioni dai rispettivi array RAID.
- -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] -md3 : active raid1 nvme0n1p3[0] nvme1n1p3[1] - 497875968 blocks super 1.2 [2/2] [UU] - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] - 1046528 blocks super 1.2 [2/2] [UU] - -unused devices: -``` - -Dai risultati sopra, nvme0n1 contiene due partizioni in RAID che sono **nvme0n1p2** e **nvme0n1p3**. - - - -#### Rimozione del disco guasto - -In primo luogo, contrassegniamo le partizioni **nvme0n1p2** e **nvme0n1p3** come guaste. - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/nvme0n1p2 -# mdadm: set /dev/nvme0n1p2 faulty in /dev/md2 -``` - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --fail /dev/nvme0n1p3 -# mdadm: set /dev/nvme0n1p3 faulty in /dev/md3 -``` - -Quando eseguiamo il comando `cat /proc/mdstat`, otteniamo : - -```sh -root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] -md3 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1] - 497875968 blocks super 1.2 [2/1] [_U] - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme0n1p2[2](F) nvme1n1p2[1] - 1046528 blocks super 1.2 [2/1] [_U] - -unused devices: -``` - -Come possiamo vedere sopra, il [F] accanto alle partizioni indica che il disco è guasto o in panne. - -Successivamente, rimuoviamo queste partizioni dagli array RAID per eliminarle completamente dal RAID. - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --remove /dev/nvme0n1p2 -# mdadm: hot removed /dev/nvme0n1p2 from /dev/md2 -``` - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --remove /dev/nvme0n1p3 -# mdadm: hot removed /dev/nvme0n1p3 from /dev/md3 -``` - -Lo stato del nostro RAID dovrebbe ora assomigliare a questo : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] -md3 : active raid1 nvme1n1p3[1] - 497875968 blocks super 1.2 [2/1] [_U] - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme1n1p2[1] - 1046528 blocks super 1.2 [2/1] [_U] - -unused devices: -``` - -Dai risultati sopra, possiamo vedere che ora ci sono solo due partizioni negli array RAID. Abbiamo riuscito a degradare il disco **nvme0n1**. 
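-
-A titolo indicativo, `mdadm` consente anche di concatenare le due operazioni (contrassegnare la partizione come guasta e rimuoverla) in un'unica chiamata. Si tratta di una scorciatoia facoltativa, equivalente ai due passaggi precedenti:
-
-```sh
-# Scorciatoia facoltativa: contrassegnare la partizione come guasta e rimuoverla con un solo comando
-mdadm --manage /dev/md2 --fail /dev/nvme0n1p2 --remove /dev/nvme0n1p2
-mdadm --manage /dev/md3 --fail /dev/nvme0n1p3 --remove /dev/nvme0n1p3
-```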
- -Per assicurarci di ottenere un disco simile a un disco vuoto, utilizziamo il comando seguente su ogni partizione, quindi sul disco stesso : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # -shred -s10M -n1 /dev/nvme0n1p1 -shred -s10M -n1 /dev/nvme0n1p2 -shred -s10M -n1 /dev/nvme0n1p3 -shred -s10M -n1 /dev/nvme0n1p4 -shred -s10M -n1 /dev/nvme0n1p5 -shred -s10M -n1 /dev/nvme0n1 -``` - -Il disco appare ora come un disco nuovo e vuoto : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk - -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS -nvme1n1 259:0 0 476.9G 0 disk -├─nvme1n1p1 259:1 0 511M 0 part -├─nvme1n1p2 259:2 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 -├─nvme1n1p3 259:3 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 -└─nvme1n1p4 259:4 0 512M 0 part -nvme0n1 259:5 0 476.9G 0 disk -``` - -Se eseguiamo il comando seguente, constatiamo che il nostro disco è stato correttamente "cancellato" : - -```sh -parted /dev/nvme0n1 -GNU Parted 3.5 -Using /dev/nvme0n1 -Welcome to GNU Parted! Type 'help' to view a list of commands. -(parted) p -Error: /dev/nvme0n1: unrecognised disk label -Model: WDC CL SN720 SDAQNTW-512G-2000 (nvme) -Disk /dev/nvme0n1: 512GB -Sector size (logical/physical): 512B/512B -Partition Table: unknown -Disk Flags: -``` - -Per ulteriori informazioni sulla preparazione e la richiesta di sostituzione di un disco, consulta questo [guida](/pages/bare_metal_cloud/dedicated_servers/disk_replacement). - -Se esegui il comando seguente, puoi ottenere ulteriori dettagli sugli array RAID : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md3 - -/dev/md3: - Version : 1.2 - Creation Time : Fri Aug 1 14:51:13 2025 - Raid Level : raid1 - Array Size : 497875968 (474.81 GiB 509.82 GB) - Used Dev Size : 497875968 (474.81 GiB 509.82 GB) - Raid Devices : 2 - Total Devices : 1 - Persistence : Superblock is persistent - - Intent Bitmap : Internal - - Update Time : Fri Aug 1 15:56:17 2025 - State : clean, degraded - Active Devices : 1 - Working Devices : 1 - Failed Devices : 0 - Spare Devices : 0 - -Consistency Policy : bitmap - - Name : md3 - UUID : b383c3d5:7fb1bb5e:6b7c4d96:6ea817ff - Events : 215 - - Number Major Minor RaidDevice State - - 0 0 0 removed - 1 259 4 1 active sync /dev/nvme1n1p3 -``` - -Possiamo ora procedere alla sostituzione del disco. - - - -### Ricostruzione del RAID - -> [!primary] -> Questo processo può variare a seconda del sistema operativo installato sul tuo server. Ti consigliamo di consultare la documentazione ufficiale del tuo sistema operativo per ottenere i comandi appropriati. -> - -> [!warning] -> -> Su la maggior parte dei server in RAID software, dopo la sostituzione di un disco, il server è in grado di avviarsi in modalità normale (sul disco sano) e la ricostruzione può essere effettuata in modalità normale. Tuttavia, se il server non riesce ad avviarsi in modalità normale dopo la sostituzione del disco, si riavvierà in modalità rescue per procedere alla ricostruzione del RAID. -> -> Se il tuo server è in grado di avviarsi in modalità normale dopo la sostituzione del disco, segui semplicemente le fasi di [questa sezione](#rebuilding-the-raid-in-normal-mode). - - - -#### Ricostruzione del RAID in modalità rescue - -Una volta sostituito il disco, il passo successivo consiste nel copiare la tabella delle partizioni del disco sano (in questo esempio, nvme1n1) sul nuovo (nvme0n1). 
-
-**Per le partizioni GPT**
-
-Il comando deve essere in questo formato : `sgdisk -R /dev/nuovo disco /dev/disco sano`
-
-Nel nostro esempio :
-
-```sh
-root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/nvme0n1 /dev/nvme1n1
-```
-
-Esegui `lsblk` per assicurarti che le tabelle delle partizioni siano state correttamente copiate :
-
-```sh
-root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk
-
-NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
-nvme1n1     259:0    0 476.9G  0 disk
-├─nvme1n1p1 259:1    0   511M  0 part
-├─nvme1n1p2 259:2    0     1G  0 part
-│ └─md2       9:2    0  1022M  0 raid1
-├─nvme1n1p3 259:3    0 474.9G  0 part
-│ └─md3       9:3    0 474.8G  0 raid1
-└─nvme1n1p4 259:4    0   512M  0 part
-nvme0n1     259:5    0 476.9G  0 disk
-├─nvme0n1p1 259:10   0   511M  0 part
-├─nvme0n1p2 259:11   0     1G  0 part
-├─nvme0n1p3 259:12   0 474.9G  0 part
-└─nvme0n1p4 259:13   0   512M  0 part
-```
-
-Una volta fatto questo, il passo successivo consiste nell'assegnare un GUID casuale al nuovo disco per evitare conflitti di GUID con altri dischi :
-
-```sh
-root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -G /dev/nvme0n1
-```
-
-Se ricevi il seguente messaggio :
-
-```console
-Warning: The kernel is still using the old partition table.
-The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8)
-The operation has completed successfully.
-```
-
-Esegui semplicemente il comando `partprobe`.
-
-Possiamo ora ricostruire l'array RAID. L'estratto di codice seguente mostra come aggiungere nuovamente le nuove partizioni (nvme0n1p2 e nvme0n1p3) all'array RAID.
-
-```sh
-root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md2 /dev/nvme0n1p2
-# mdadm: added /dev/nvme0n1p2
-root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md3 /dev/nvme0n1p3
-# mdadm: re-added /dev/nvme0n1p3
-```
-
-Per verificare il processo di ricostruzione:
-
-```sh
-root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat
-Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
-md3 : active raid1 nvme0n1p3[2] nvme1n1p3[1]
-      497875968 blocks super 1.2 [2/1] [_U]
-      [>....................]  recovery =  0.1% (801920/497875968) finish=41.3min speed=200480K/sec
-      bitmap: 0/4 pages [0KB], 65536KB chunk
-
-md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1]
-      1046528 blocks super 1.2 [2/2] [UU]
-```
-
-Una volta completata la ricostruzione del RAID, esegui il comando seguente per verificare che le partizioni siano state correttamente aggiunte al RAID:
-
-```sh
-root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f
-NAME        FSTYPE            FSVER LABEL          UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
-nvme1n1
-├─nvme1n1p1 vfat              FAT16 EFI_SYSPART    4629-D183
-├─nvme1n1p2 linux_raid_member 1.2   md2            83719c5c-2a27-2a56-5268-7d49d8a1d84f
-│ └─md2     ext4              1.0   boot           4de80ae0-dd90-4256-9135-1735e7be4b4d
-├─nvme1n1p3 linux_raid_member 1.2   md3            b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff
-│ └─md3     ext4              1.0   root           9bf386b6-9523-46bf-b8e5-4b8cc7c5786f
-└─nvme1n1p4 swap              1     swap-nvme1n1p4 9bf292e8-0145-4d2f-b891-4cef93c0d209
-nvme0n1
-├─nvme0n1p1
-├─nvme0n1p2 linux_raid_member 1.2   md2            83719c5c-2a27-2a56-5268-7d49d8a1d84f
-│ └─md2     ext4              1.0   boot           4de80ae0-dd90-4256-9135-1735e7be4b4d
-├─nvme0n1p3 linux_raid_member 1.2   md3            b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff
-│ └─md3     ext4              1.0   root           9bf386b6-9523-46bf-b8e5-4b8cc7c5786f
-└─nvme0n1p4
-```
-
-In base ai risultati sopra riportati, le partizioni del nuovo disco sono state correttamente aggiunte al RAID. Tuttavia, la partizione EFI System e la partizione SWAP (in alcuni casi) non sono state duplicate, il che è normale poiché non fanno parte del RAID.
-
-> [!warning]
-> Gli esempi sopra illustrano semplicemente le fasi necessarie in base a una configurazione di server predefinita. 
I risultati di ogni comando dipendono dal tipo di hardware installato sul tuo server e dalla struttura delle sue partizioni. In caso di dubbi, consulta la documentazione del tuo sistema operativo. -> -> Se hai bisogno di un supporto professionale per l'amministrazione del tuo server, consulta i dettagli della sezione [Per saperne di più](#go-further) di questa guida. -> - - - -#### Ricostruzione della partizione EFI System - -Per ricostruire la partizione EFI System, dobbiamo formattare **nvme0n1p1** e replicare il contenuto della partizione EFI System sana (nel nostro esempio: nvme1n1p1) su questa. - -In questo caso, assumiamo che le due partizioni siano state sincronizzate e contengano file aggiornati o non abbiano subito aggiornamenti del sistema che influenzano il *bootloader*. - -> [!warning] -> Se è avvenuto un aggiornamento importante del sistema, ad esempio un aggiornamento del kernel o di GRUB, e le due partizioni non sono state sincronizzate, consulta questa [sezione](#rebuilding-raid-when-efi-partitions-are-not-synchronized-after-major-system-updates-eg-grub) una volta completata la creazione della nuova partizione EFI System. -> - -In primo luogo, formattiamo la partizione: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkfs.vfat /dev/nvme0n1p1 -``` - -Successivamente, assegniamo l'etichetta `EFI_SYSPART` alla partizione. (questo nome è specifico di OVHcloud): - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # fatlabel /dev/nvme0n1p1 EFI_SYSPART -``` - -Successivamente, duplichiamo il contenuto di nvme1n1p1 in nvme0n1p1. Creiamo prima due cartelle, che chiamiamo « old » e « new » nel nostro esempio: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkdir old new -``` - -Successivamente, montiamo **nvme1n1p1** nella cartella « old » e **nvme0n1p1** nella cartella « new » per distinguerle: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme1n1p1 old -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme0n1p1 new -``` - -Successivamente, copiamo i file della cartella 'old' in 'new': - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # rsync -axv old/ new/ -sending incremental file list -EFI/ -EFI/debian/ -EFI/debian/BOOTX64.CSV -EFI/debian/fbx64.efi -EFI/debian/grub.cfg -EFI/debian/grubx64.efi -EFI/debian/mmx64.efi -EFI/debian/shimx64.efi - -sent 6,099,848 bytes received 165 bytes 12,200,026.00 bytes/sec -total size is 6,097,843 speedup is 1.00 -``` - -Una volta completato, smontiamo le due partizioni: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme0n1p1 -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme1n1p1 -``` - -Successivamente, montiamo la partizione che contiene la radice del nostro sistema operativo su `/mnt`. 
Nell'esempio, questa partizione è **md3**: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md3 /mnt -``` - -Montiamo i seguenti directory per assicurarci che qualsiasi operazione che eseguiamo nell'ambiente `chroot` funzioni correttamente: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # -mount --types proc /proc /mnt/proc -mount --rbind /sys /mnt/sys -mount --make-rslave /mnt/sys -mount --rbind /dev /mnt/dev -mount --make-rslave /mnt/dev -mount --bind /run /mnt/run -mount --make-slave /mnt/run -``` - -Successivamente, utilizziamo il comando `chroot` per accedere al punto di montaggio e verificare che la nuova partizione EFI System sia stata correttamente creata e che il sistema riconosca entrambe le ESP: - -```sh -root@rescue12-customer-eu:/# chroot /mnt -``` - -Per visualizzare le partizioni ESP, eseguiamo il comando `blkid -t LABEL=EFI_SYSPART`: - -```sh -root@rescue12-customer-eu:/# blkid -t LABEL=EFI_SYSPART -/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" -/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" -``` - -I risultati sopra mostrano che la nuova partizione EFI è stata creata correttamente e che l'etichetta è stata applicata correttamente. - - - -#### Ricostruzione del RAID quando le partizioni EFI non sono sincronizzate dopo aggiornamenti importanti del sistema (GRUB) - -/// details | Espandi questa sezione - -> [!warning] -> Segui le fasi di questa sezione solo se si applica al tuo caso. -> - -Quando le partizioni del sistema EFI non sono sincronizzate dopo aggiornamenti importanti del sistema che modificano/colpiscono il GRUB, e il disco principale su cui è montata la partizione viene sostituito, l'avvio da un disco secondario che contiene un'ESP obsoleta potrebbe non funzionare. - -In questo caso, oltre a ricostruire il RAID e a ricreare la partizione del sistema EFI in modalità rescue, devi anche reinstallare il GRUB su quest'ultima. - -Una volta che abbiamo ricreato la partizione EFI e ci siamo assicurati che il sistema riconosca entrambe le partizioni (fasi precedenti in `chroot`), creiamo la directory `/boot/efi` per montare la nuova partizione del sistema EFI **nvme0n1p1**: - -```sh -root@rescue12-customer-eu:/# mount /boot -root@rescue12-customer-eu:/# mount /dev/nvme0n1p1 /boot/efi -``` - -Successivamente, reinstalliamo il caricatore di avvio GRUB (*bootloader*): - -```sh -root@rescue12-customer-eu:/# grub-install --efi-directory=/boot/efi /dev/nvme0n1p1 -``` - -Una volta fatto, esegui il comando seguente: - -```sh -root@rescue12-customer-eu:/# update-grub -``` -/// - - - -#### Aggiunta dell'etichetta alla partizione SWAP (se applicabile) - -Una volta completata la partizione EFI, passiamo alla partizione SWAP. 
- -Usciamo dall'ambiente `chroot` con `exit` per ricreare la nostra partizione [SWAP] **nvme0n1p4** e aggiungere l'etichetta `swap-nvme0n1p4`: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 -Setting up swapspace version 1, size = 512 MiB (536866816 bytes) -LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd -``` - -Verifichiamo che l'etichetta sia stata correttamente applicata: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f -NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS -nvme1n1 - -├─nvme1n1p1 -│ vfat FAT16 EFI_SYSPART -│ BA77-E844 504.9M 1% /root/old -├─nvme1n1p2 -│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 -│ └─md2 -│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac -├─nvme1n1p3 -│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c -│ └─md3 -│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt -└─nvme1n1p4 - swap 1 swap-nvme1n1p4 - d6af33cf-fc15-4060-a43c-cb3b5537f58a -nvme0n1 - -├─nvme0n1p1 -│ vfat FAT16 EFI_SYSPART -│ 477D-6658 -├─nvme0n1p2 -│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 -│ └─md2 -│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac -├─nvme0n1p3 -│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c -│ └─md3 -│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt -└─nvme0n1p4 - swap 1 swap-nvme0n1p4 - b3c9e03a-52f5-4683-81b6-cc10091fcd15 -``` - -Accediamo nuovamente all'ambiente `chroot` - -# mdadm: re-added /dev/nvme0n1p3 -``` - -Utilizza il comando seguente per monitorare la ricostruzione del RAID: `cat /proc/mdstat`. - -**Ricreazione della partizione EFI System sul disco** - -Per prima cosa installiamo gli strumenti necessari: - -**Debian e Ubuntu** - -```sh -[user@server_ip ~]# sudo apt install dosfstools -``` - -**CentOS** - -```sh -[user@server_ip ~]# sudo yum install dosfstools -``` - -Successivamente formattiamo la partizione. Nel nostro esempio `nvme0n1p1`: - -```sh -[user@server_ip ~]# sudo mkfs.vfat /dev/nvme0n1p1 -``` - -Successivamente assegniamo l'etichetta `EFI_SYSPART` alla partizione. (questo nome è specifico per OVHcloud): - -```sh -[user@server_ip ~]# sudo fatlabel /dev/nvme0n1p1 EFI_SYSPART -``` - -Una volta completato, puoi sincronizzare le due partizioni utilizzando lo script che abbiamo fornito [qui](#script). 
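-
-Se desideri automatizzare questa sincronizzazione (ad esempio una volta al giorno), un approccio possibile è una voce cron. Esempio puramente indicativo, supponendo che lo script sia stato salvato come `/usr/local/sbin/sincronizza-esp.sh`:
-
-```sh
-# Esempio ipotetico: sincronizzare le partizioni EFI ogni giorno alle 04:00
-echo '0 4 * * * root /usr/local/sbin/sincronizza-esp.sh' | sudo tee /etc/cron.d/sincronizza-esp
-```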
- -Verifichiamo che la nuova partizione EFI System sia stata creata correttamente e che il sistema la riconosca: - -```sh -[user@server_ip ~]# sudo blkid -t LABEL=EFI_SYSPART -/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" -/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" -``` - -Infine, attiviamo la partizione [SWAP] (se applicabile): - -- Creiamo e aggiungiamo l'etichetta: - -```sh -[user@server_ip ~]# sudo mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 -``` - -- Recuperiamo gli UUID delle due partizioni di swap: - -```sh -[user@server_ip ~]# sudo blkid -s /dev/nvme0n1p4 -/dev/nvme0n1p4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" -[user@server_ip ~]# sudo blkid -s /dev/nvme1n1p4 -/dev/nvme1n1p4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" -``` - -- Sostituiamo l'UUID vecchio della partizione swap (**nvme0n1p4)** con il nuovo in `/etc/fstab`: - -```sh -[user@server_ip ~]# sudo nano /etc/fstab -``` - -Esempio: - -```sh -[user@server_ip ~]# sudo nano /etc/fstab -UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 -UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 -LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1 -UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 -UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 -``` - -Secondo i risultati sopra, l'UUID vecchio è `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` e deve essere sostituito con il nuovo `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. - -Assicurati di sostituire l'UUID corretto. - -Successivamente, eseguiamo il comando seguente per attivare la partizione di swap: - -```sh -[user@server_ip ~]# sudo swapon -av -swapon: /dev/nvme0n1p4: found signature [pagesize=4096, signature=swap] -swapon: /dev/nvme0n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 -swapon /dev/nvme0n1p4 -swapon: /dev/nvme1n1p4: found signature [pagesize=4096, signature=swap] -swapon: /dev/nvme1n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 -swapon /dev/nvme1n1p4 -``` - -Successivamente, ricarichiamo il sistema: - -```sh -[user@server_ip ~]# sudo systemctl daemon-reload -``` - -Abbiamo completato con successo la ricostruzione del RAID. - -## Per saperne di più - -[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) - -[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) - -[Managing hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) - -[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) - -Per servizi specializzati (SEO, sviluppo, ecc.), contatta [i partner OVHcloud](/links/partner). - -Se hai bisogno di un supporto per utilizzare e configurare le tue soluzioni OVHcloud, consulta le [nostre offerte di supporto](/links/support). - -Se hai bisogno di formazione o di un supporto tecnico per implementare le nostre soluzioni, contatta il tuo rappresentante commerciale o clicca su [questo link](/links/professional-services) per richiedere un preventivo e chiedere ai nostri esperti del team Professional Services di intervenire sul tuo caso d'uso specifico. - -Contatta la nostra [Community di utenti](/links/community). 
\ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pl-pl.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pl-pl.md deleted file mode 100644 index 57a245df19f..00000000000 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pl-pl.md +++ /dev/null @@ -1,841 +0,0 @@ ---- -title: Zarządzanie i odbudowa oprogramowania RAID na serwerach w trybie uruchamiania UEFI -excerpt: Dowiedz się, jak zarządzać i odbudować oprogramowanie RAID po wymianie dysku na serwerze w trybie uruchamiania UEFI -updated: 2025-12-11 ---- - -## Wprowadzenie - -Redundant Array of Independent Disks (RAID) to technologia, która zmniejsza utratę danych na serwerze, replikując dane na dwóch lub więcej dyskach. - -Domyślny poziom RAID dla instalacji serwerów OVHcloud to RAID 1, który podwaja zajęte przez dane miejsce, skutecznie zmniejszając dostępne miejsce na dysku o połowę. - -**Ten przewodnik wyjaśnia, jak zarządzać i odbudować oprogramowanie RAID po wymianie dysku na serwerze w trybie uruchamiania UEFI** - -Zanim zaczniemy, zwróć uwagę, że ten przewodnik skupia się na Serwerach dedykowanych, które używają UEFI jako trybu uruchamiania. Jest to typowe dla nowoczesnych płyt głównych. Jeśli Twój serwer używa trybu uruchamiania zgodnego (BIOS), odwiedź ten przewodnik: [Zarządzanie i odbudowa oprogramowania RAID na serwerach w trybie uruchamiania zgodnym (BIOS)](/pages/bare_metal_cloud/dedicated_servers/raid_soft_bios). - -Aby sprawdzić, czy serwer działa w trybie zgodnym BIOS czy trybie uruchamiania UEFI, uruchom następującą komendę: - -```sh -[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS -``` - -Aby uzyskać więcej informacji na temat UEFI, zapoznaj się z poniższym [artykułem](https://uefi.org/about). - -## Wymagania początkowe - -- Serwer [dedykowany](/links/bare-metal/bare-metal) z konfiguracją oprogramowania RAID -- Dostęp administracyjny (sudo) do serwera przez SSH -- Zrozumienie RAID, partycji i GRUB - -W trakcie tego przewodnika używamy pojęć **główny dysk** i **dyski pomocnicze**. W tym kontekście: - -- Główny dysk to dysk, którego ESP (EFI System Partition) jest montowany przez system Linux -- Dyski pomocnicze to wszystkie inne dyski w RAID - -## Instrukcje - -Kiedy zakupisz nowy serwer, możesz poczuć potrzebę wykonania serii testów i działań. Jednym z takich testów może być symulacja awarii dysku, aby zrozumieć proces odbudowy RAID i przygotować się na wypadek, gdyby to się kiedykolwiek zdarzyło. - -### Omówienie treści - -- [Podstawowe informacje](#basicinformation) -- [Zrozumienie partycji systemu EFI (ESP)](#efisystemparition) -- [Symulowanie awarii dysku](#diskfailure) - - [Usunięcie awaryjnego dysku](#diskremove) -- [Odbudowanie RAID](#raidrebuild) - - [Odbudowanie RAID po wymianie głównego dysku (tryb ratunkowy)](#rescuemode) - - [Ponowne utworzenie partycji systemu EFI](#recreateesp) - - [Odbudowanie RAID, gdy partycje EFI nie są zsynchronizowane po dużych aktualizacjach systemu (np. 
GRUB)](efiraodgrub) - - [Dodanie etykiety do partycji SWAP (jeśli dotyczy)](#swap-partition) - - [Odbudowanie RAID w trybie normalnym](#normalmode) - - - -### Podstawowe informacje - -W sesji linii poleceń wpisz następujące polecenie, aby określić bieżący stan RAID: - -```sh -[user@server_ip ~]# cat /proc/mdstat -Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md3 : active raid1 nvme1n1p3[1] nvme0n1p3[0] - 497875968 blocks super 1.2 [2/2] [UU] - bitmap: 2/4 pages [8KB], 65536KB chunk - -md2 : active raid1 nvme1n1p2[1] nvme0n1p2[0] - 1046528 blocks super 1.2 [2/2] [UU] - -unused devices: -``` - -To polecenie pokazuje nam, że obecnie mamy skonfigurowane dwa urządzenia RAID oprogramowania, **md2** i **md3**, z **md3** będącym większym z nich. **md3** składa się z dwóch partycji o nazwach **nvme1n1p3** i **nvme0n1p3**. - -[UU] oznacza, że wszystkie dyski działają normalnie. `_` wskazywałby na awaryjny dysk. - -Jeśli masz serwer z dyskami SATA, otrzymasz następujące wyniki: - -```sh -[user@server_ip ~]# cat /proc/mdstat -Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md3 : active raid1 sda3[0] sdb3[1] - 3904786432 blocks super 1.2 [2/2] [UU] - bitmap: 2/30 pages [8KB], 65536KB chunk - -md2 : active raid1 sda2[0] sdb2[1] - 1046528 blocks super 1.2 [2/2] [UU] - -unused devices: -``` - -Choć to polecenie zwraca nasze objęte RAID woluminy, nie mówi nam o rozmiarze partycji samych w sobie. Możemy znaleźć tę informację za pomocą poniższego polecenia: - -```sh -[user@server_ip ~]# sudo fdisk -l - -Disk /dev/nvme1n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors -Disk model: WDC CL SN720 SDAQNTW-512G-2000 -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -Disklabel type: gpt -Disk identifier: A11EDAA3-A984-424B-A6FE-386550A92435 - -Device Start End Sectors Size Type -/dev/nvme1n1p1 2048 1048575 1046528 511M EFI System -/dev/nvme1n1p2 1048576 3145727 2097152 1G Linux RAID -/dev/nvme1n1p3 3145728 999161855 996016128 474.9G Linux RAID -/dev/nvme1n1p4 999161856 1000210431 1048576 512M Linux files - - -Disk /dev/nvme0n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors -Disk model: WDC CL SN720 SDAQNTW-512G-2000 -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -Disklabel type: gpt -Disk identifier: F03AC3C3-D7B7-43F9-88DB-9F12D7281D94 - -Device Start End Sectors Size Type -/dev/nvme0n1p1 2048 1048575 1046528 511M EFI System -/dev/nvme0n1p2 1048576 3145727 2097152 1G Linux RAID -/dev/nvme0n1p3 3145728 999161855 996016128 474.9G Linux RAID -/dev/nvme0n1p4 999161856 1000210431 1048576 512M Linux file -/dev/nvme0n1p5 1000211120 1000215182 4063 2M Linux file - - -Disk /dev/md2: 1022 MiB, 1071644672 bytes, 2093056 sectors -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes - - -Disk /dev/md3: 474.81 GiB, 509824991232 bytes, 995751936 sectors -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -``` - -Polecenie `fdisk -l` umożliwia również identyfikację typu partycji. Jest to ważna informacja przy odbudowie RAID w przypadku awarii dysku. - -Dla partycji **GPT**, linia 6 będzie wyświetlać: `Disklabel type: gpt`. 
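-
-Orientacyjnie: typ tablicy partycji każdego dysku można też szybko sprawdzić poleceniem `lsblk` (kolumna `PTTYPE`), bez przeglądania pełnego wyniku `fdisk -l`:
-
-```sh
-# Przykład poglądowy: wyświetlenie typu tablicy partycji (gpt lub dos) dla każdego dysku
-lsblk -ndo NAME,PTTYPE /dev/nvme0n1 /dev/nvme1n1
-```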
- -Zgodnie z wynikami `fdisk -l`, możemy stwierdzić, że `/dev/md2` składa się z 1022 MiB, a `/dev/md3` zawiera 474,81 GiB. Jeśli uruchomimy polecenie `mount`, możemy również ustalić układ dysku. - -Alternatywnie, polecenie `lsblk` oferuje inny widok partycji: - -```sh -[user@server_ip ~]# lsblk -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT -nvme1n1 259:0 0 476.9G 0 disk -├─nvme1n1p1 259:7 0 511M 0 part -├─nvme1n1p2 259:8 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 /boot -├─nvme1n1p3 259:9 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 / -└─nvme1n1p4 259:10 0 512M 0 part [SWAP] -nvme0n1 259:1 0 476.9G 0 disk -├─nvme0n1p1 259:2 0 511M 0 part /boot/efi -├─nvme0n1p2 259:3 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 /boot -├─nvme0n1p3 259:4 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 / -├─nvme0n1p4 259:5 0 512M 0 part [SWAP] -└─nvme0n1p5 259:6 0 2M 0 part -``` - -Ponadto, jeśli uruchomimy `lsblk -f`, otrzymamy więcej informacji o tych partycjach, takich jak Eтикетка i UUID: - -```sh -[user@server_ip ~]# sudo lsblk -f -NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT -nvme1n1 -├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA -├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea -│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot -├─nvme1n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 -│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / -└─nvme1n1p4 swap 1 swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7 [SWAP] -nvme0n1 -├─nvme0n1p1 vfat FAT16 EFI_SYSPART B486-9781 504.9M 1% /boot/efi -├─nvme0n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea -│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot -├─nvme0n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 -│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / -├─nvme0n1p4 swap 1 swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b [SWAP] -└─nvme0n1p5 iso9660 Joliet Extension config-2 2025-08-05-14-55-41-00 -``` - -Zwróć uwagę na urządzenia, partycje i ich punkty montowania; to jest ważne, zwłaszcza po wymianie dysku. - -Z powyższych poleceń i wyników mamy: - -- Dwa tablice RAID: `/dev/md2` i `/dev/md3`. -- Cztery partycje, które są częścią RAID: **nvme0n1p2**, **nvme0n1p3**, **nvme1n1p2**, **nvme0n1p3** z punktami montowania `/boot` i `/`. -- Dwie partycje, które nie są częścią RAID, z punktami montowania: `/boot/efi` i [SWAP]. -- Jedna partycja, która nie ma punktu montowania: **nvme1n1p1** - -Partycja **nvme0n1p5** to partycja konfiguracyjna, czyli tylko do odczytu, połączona z serwerem, która dostarcza mu początkowe dane konfiguracyjne. - - - -### Zrozumienie partycji systemu EFI (ESP) - -***Co to jest partycja systemu EFI?*** - -Partycja systemu EFI to partycja, która może zawierać programy uruchamiające system operacyjny, zarządzacze uruchamiania, obrazy jądra lub inne programy systemowe. Może również zawierać programy narzędziowe systemowe zaprojektowane do uruchomienia przed uruchomieniem systemu operacyjnego, a także pliki danych, takie jak dzienniki błędów. - -***Czy partycja systemu EFI jest lustrzana w RAID?*** - -Nie, jak na sierpień 2025, gdy instalacja systemu operacyjnego jest wykonywana przez OVHcloud, ESP nie jest włączona do RAID. Gdy używasz naszych szablonów systemów operacyjnych do instalacji serwera z oprogramowaniem RAID, tworzone są kilka partycji systemu EFI: jedna na dysku. Jednak tylko jedna partycja EFI jest montowana jednocześnie. 
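
Aby sprawdzić, która partycja ESP jest w danej chwili zamontowana (a więc który dysk jest dyskiem głównym), można na przykład wykonać poniższe polecenia, oparte na etykiecie `EFI_SYSPART` używanej w naszych instalacjach:

```sh
# Partycja ESP aktualnie zamontowana w /boot/efi (dysk główny)
findmnt -n -o SOURCE /boot/efi

# Wszystkie partycje ESP oznaczone etykietą EFI_SYSPART
sudo blkid -o device -t LABEL=EFI_SYSPART
```
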
Wszystkie ESP utworzone w czasie instalacji zawierają te same pliki.

Partycja systemu EFI jest montowana w `/boot/efi`, a dysk, na którym jest montowana, jest wybierany przez Linux w czasie uruchamiania.

Przykład:

```sh
[user@server_ip ~]# sudo lsblk -f
NAME        FSTYPE            FSVER LABEL          UUID                                 FSAVAIL FSUSE% MOUNTPOINT
nvme1n1
├─nvme1n1p1 vfat              FAT16 EFI_SYSPART    B493-9DFA
├─nvme1n1p2 linux_raid_member 1.2   md2            baae988b-bef3-fc07-615f-6f9043cfd5ea
│ └─md2     ext4              1.0   boot           96850c4e-e2b5-4048-8c39-525194e441aa  851.8M     7% /boot
├─nvme1n1p3 linux_raid_member 1.2   md3            ce0c7fac-0032-054c-eef7-7463b2245519
│ └─md3     ext4              1.0   root           6fea39e9-6297-4ea3-82f1-bf1a3e88106a  441.3G     0% /
└─nvme1n1p4 swap              1     swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7                [SWAP]
nvme0n1
├─nvme0n1p1 vfat              FAT16 EFI_SYSPART    B486-9781                             504.9M     1% /boot/efi
├─nvme0n1p2 linux_raid_member 1.2   md2            baae988b-bef3-fc07-615f-6f9043cfd5ea
│ └─md2     ext4              1.0   boot           96850c4e-e2b5-4048-8c39-525194e441aa  851.8M     7% /boot
├─nvme0n1p3 linux_raid_member 1.2   md3            ce0c7fac-0032-054c-eef7-7463b2245519
│ └─md3     ext4              1.0   root           6fea39e9-6297-4ea3-82f1-bf1a3e88106a  441.3G     0% /
├─nvme0n1p4 swap              1     swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b                [SWAP]
└─nvme0n1p5 iso9660           Joliet Extension config-2 2025-08-05-14-55-41-00
```

W powyższym przykładzie w `/boot/efi` zamontowana jest partycja ESP dysku `nvme0n1`, a więc to `nvme0n1` jest dyskiem głównym.

Zalecamy regularne synchronizowanie partycji ESP, zwłaszcza po każdej większej aktualizacji systemu. Domyślnie, zaraz po instalacji, wszystkie partycje systemu EFI zawierają te same pliki. Jeśli jednak przeprowadzono większą aktualizację systemu, synchronizacja ESP jest niezbędna, aby ich zawartość pozostała aktualna.

#### Skrypt

Poniżej znajduje się skrypt, którego możesz użyć do ręcznej synchronizacji partycji ESP. Możesz go również uruchamiać automatycznie, np. codziennie lub przy każdym starcie systemu.

Przed uruchomieniem skryptu upewnij się, że w systemie zainstalowany jest `rsync` (pakiet `rsync` w Debian/Ubuntu oraz CentOS/Red Hat/Fedora).

Utwórz plik `script-name.sh` w wybranym katalogu, otwórz go w edytorze tekstu i dodaj następujące linie:

```sh
#!/bin/bash

set -euo pipefail

MOUNTPOINT="/var/lib/grub/esp"
MAIN_PARTITION=$(findmnt -n -o SOURCE /boot/efi)

echo "${MAIN_PARTITION} is the main partition"

mkdir -p "${MOUNTPOINT}"

while read -r partition; do
    if [[ "${partition}" == "${MAIN_PARTITION}" ]]; then
        continue
    fi
    echo "Working on ${partition}"
    mount "${partition}" "${MOUNTPOINT}"
    rsync -ax "/boot/efi/" "${MOUNTPOINT}/"
    umount "${MOUNTPOINT}"
done < <(blkid -o device -t LABEL=EFI_SYSPART)
```

Zapisz i zamknij plik.

- Ustaw skrypt jako wykonywalny

```sh
sudo chmod +x script-name.sh
```

- Uruchom skrypt

```sh
sudo ./script-name.sh
```

- Jeśli nie znajdujesz się w katalogu ze skryptem

```sh
./path/to/folder/script-name.sh
```

Po wykonaniu skryptu zawartość zamontowanej partycji EFI zostanie zsynchronizowana z pozostałymi. Aby uzyskać dostęp do ich zawartości, możesz zamontować dowolną z niezamontowanych partycji EFI w punkcie montowania `/var/lib/grub/esp`.

### Symulowanie awarii dysku

Teraz, gdy mamy wszystkie niezbędne informacje, możemy zasymulować awarię dysku i przystąpić do testów. W tym pierwszym przykładzie zasymulujemy awarię głównego dysku `nvme0n1`.

Preferowanym sposobem jest użycie środowiska trybu ratunkowego (rescue) OVHcloud.

Najpierw uruchom serwer w trybie ratunkowym i zaloguj się przy użyciu dostarczonych poświadczeń.

Aby usunąć dysk z tablicy RAID, pierwszym krokiem jest oznaczenie go jako uszkodzonego (**Failed**) i usunięcie jego partycji z odpowiednich tablic RAID.

```sh
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
md3 : active raid1 nvme0n1p3[0] nvme1n1p3[1]
      497875968 blocks super 1.2 [2/2] [UU]
      bitmap: 0/4 pages [0KB], 65536KB chunk

md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1]
      1046528 blocks super 1.2 [2/2] [UU]

unused devices:
```

Z powyższego wyniku wynika, że dysk `nvme0n1` zawiera dwie partycje w RAID, którymi są **nvme0n1p2** i **nvme0n1p3**.

#### Usunięcie uszkodzonego dysku

Najpierw oznacz partycje **nvme0n1p2** i **nvme0n1p3** jako uszkodzone.

```sh
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/nvme0n1p2
# mdadm: set /dev/nvme0n1p2 faulty in /dev/md2
```

```sh
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --fail /dev/nvme0n1p3
# mdadm: set /dev/nvme0n1p3 faulty in /dev/md3
```

Po uruchomieniu polecenia `cat /proc/mdstat` otrzymujemy następujące dane wyjściowe:

```sh
root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
md3 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1]
      497875968 blocks super 1.2 [2/1] [_U]
      bitmap: 0/4 pages [0KB], 65536KB chunk

md2 : active raid1 nvme0n1p2[2](F) nvme1n1p2[1]
      1046528 blocks super 1.2 [2/1] [_U]

unused devices:
```

Jak widać powyżej, znacznik (F) obok partycji wskazuje, że dysk uległ awarii lub jest uszkodzony.

Następnie usuwamy te partycje z tablic RAID, aby całkowicie usunąć dysk z RAID.
- -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --remove /dev/nvme0n1p2 -# mdadm: hot removed /dev/nvme0n1p2 from /dev/md2 -``` - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --remove /dev/nvme0n1p3 -# mdadm: hot removed /dev/nvme0n1p3 from /dev/md3 -``` - -Status naszego RAID powinien teraz wyglądać tak: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] -md3 : active raid1 nvme1n1p3[1] - 497875968 blocks super 1.2 [2/1] [_U] - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme1n1p2[1] - 1046528 blocks super 1.2 [2/1] [_U] - -unused devices: -``` - -Z powyższych wyników widać, że teraz tylko dwie partycje pojawiają się w tablicach RAID. Pomyślnie zakończyliśmy symulację awarii dysku **nvme0n1**. - -Aby upewnić się, że otrzymamy dysk podobny do pustego, używamy poniższego polecenia na każdej partycji, a następnie na samym dysku: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # -shred -s10M -n1 /dev/nvme0n1p1 -shred -s10M -n1 /dev/nvme0n1p2 -shred -s10M -n1 /dev/nvme0n1p3 -shred -s10M -n1 /dev/nvme0n1p4 -shred -s10M -n1 /dev/nvme0n1p5 -shred -s10M -n1 /dev/nvme0n1 -``` - -Dysk teraz wygląda jak nowy, pusty dysk: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk - -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS -nvme1n1 259:0 0 476.9G 0 disk -├─nvme1n1p1 259:1 0 511M 0 part -├─nvme1n1p2 259:2 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 -├─nvme1n1p3 259:3 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 -└─nvme1n1p4 259:4 0 512M 0 part -nvme0n1 259:5 0 476.9G 0 disk -``` - -Jeśli uruchomimy poniższe polecenie, zobaczymy, że nasz dysk został pomyślnie "wyczyszczony": - -```sh -parted /dev/nvme0n1 -GNU Parted 3.5 -Using /dev/nvme0n1 -Welcome to GNU Parted! Type 'help' to view a list of commands. -(parted) p -Error: /dev/nvme0n1: unrecognised disk label -Model: WDC CL SN720 SDAQNTW-512G-2000 (nvme) -Disk /dev/nvme0n1: 512GB -Sector size (logical/physical): 512B/512B -Partition Table: unknown -Disk Flags: -``` - -Aby uzyskać więcej informacji na temat przygotowania i złożenia wniosku o wymianę dysku, zapoznaj się z tym [przewodnikiem](/pages/bare_metal_cloud/dedicated_servers/disk_replacement). - -Jeśli uruchomisz poniższe polecenie, możesz uzyskać więcej szczegółów na temat tablic RAID: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md3 - -/dev/md3: - Version : 1.2 - Creation Time : Fri Aug 1 14:51:13 2025 - Raid Level : raid1 - Array Size : 497875968 (474.81 GiB 509.82 GB) - Used Dev Size : 497875968 (474.81 GiB 509.82 GB) - Raid Devices : 2 - Total Devices : 1 - Persistence : Superblock is persistent - - Intent Bitmap : Internal - - Update Time : Fri Aug 1 15:56:17 2025 - State : clean, degraded - Active Devices : 1 - Working Devices : 1 - Failed Devices : 0 - Spare Devices : 0 - -Consistency Policy : bitmap - - Name : md3 - UUID : b383c3d5:7fb1bb5e:6b7c4d96:6ea817ff - Events : 215 - - Number Major Minor RaidDevice State - - 0 0 0 removed - 1 259 4 1 active sync /dev/nvme1n1p3 -``` - -Teraz możemy przystąpić do wymiany dysku. - - - -### Odbudowanie RAID - -> [!primary] -> Ten proces może się różnić w zależności od systemu operacyjnego zainstalowanego na Twoim serwerze. Zalecamy, abyś zapoznał się z oficjalną dokumentacją swojego systemu operacyjnego, aby uzyskać dostęp do odpowiednich poleceń. 
>

> [!warning]
>
> W przypadku większości serwerów z programowym RAID serwer po wymianie dysku jest w stanie uruchomić się w trybie normalnym (ze zdrowego dysku), a odbudowę RAID można przeprowadzić w trybie normalnym. Jeśli jednak serwer nie może uruchomić się w trybie normalnym po wymianie dysku, zostanie uruchomiony w trybie ratunkowym, aby kontynuować odbudowę RAID.
>
> Jeśli Twój serwer może uruchomić się w trybie normalnym po wymianie dysku, po prostu wykonaj kroki z [tej sekcji](#rebuilding-the-raid-in-normal-mode).

#### Odbudowanie RAID w trybie ratunkowym

Po wymianie dysku następnym krokiem jest skopiowanie tabeli partycji ze zdrowego dysku (w tym przykładzie `nvme1n1`) na nowy dysk (`nvme0n1`).

**Dla partycji GPT**

Polecenie powinno mieć następującą postać: `sgdisk -R /dev/nowy_dysk /dev/zdrowy_dysk`

W naszym przykładzie:

```sh
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/nvme0n1 /dev/nvme1n1
```

Uruchom `lsblk`, aby upewnić się, że tabela partycji została poprawnie skopiowana:

```sh
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk

NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
nvme1n1     259:0    0 476.9G  0 disk
├─nvme1n1p1 259:1    0   511M  0 part
├─nvme1n1p2 259:2    0     1G  0 part
│ └─md2       9:2    0  1022M  0 raid1
├─nvme1n1p3 259:3    0 474.9G  0 part
│ └─md3       9:3    0 474.8G  0 raid1
└─nvme1n1p4 259:4    0   512M  0 part
nvme0n1     259:5    0 476.9G  0 disk
├─nvme0n1p1 259:10   0   511M  0 part
├─nvme0n1p2 259:11   0     1G  0 part
├─nvme0n1p3 259:12   0 474.9G  0 part
└─nvme0n1p4 259:13   0   512M  0 part
```

Następnie należy nadać nowemu dyskowi losowy identyfikator GUID, aby uniknąć konfliktów GUID z innymi dyskami:

```sh
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -G /dev/nvme0n1
```

Jeśli otrzymasz poniższy komunikat:

```console
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8)
The operation has completed successfully.
```

po prostu uruchom polecenie `partprobe`.

Teraz możemy odbudować tablicę RAID. Poniższy fragment pokazuje, jak dodać nowe partycje (nvme0n1p2 i nvme0n1p3) z powrotem do tablic RAID.

```sh
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md2 /dev/nvme0n1p2
# mdadm: added /dev/nvme0n1p2
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md3 /dev/nvme0n1p3
# mdadm: re-added /dev/nvme0n1p3
```

Aby sprawdzić postęp odbudowy:

```sh
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
md3 : active raid1 nvme0n1p3[2] nvme1n1p3[1]
      497875968 blocks super 1.2 [2/1] [_U]
      [>....................]
recovery = 0.1% (801920/497875968) finish=41.3min speed=200480K/sec - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] - 1046528 blocks super 1.2 [2/2] [UU] -``` - -Po zakończeniu odbudowy RAID uruchom poniższe polecenie, aby upewnić się, że partycje zostały poprawnie dodane do RAID: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f -NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS -nvme1n1 -├─nvme1n1p1 vfat FAT16 EFI_SYSPART 4629-D183 -├─nvme1n1p2 linux_raid_member 1.2 md2 83719c5c-2a27-2a56-5268-7d49d8a1d84f -│ └─md2 ext4 1.0 boot 4de80ae0-dd90-4256-9135-1735e7be4b4d -├─nvme1n1p3 linux_raid_member 1.2 md3 b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff -│ └─md3 ext4 1.0 root 9bf386b6-9523-46bf-b8e5-4b8cc7c5786f -└─nvme1n1p4 swap 1 swap-nvme1n1p4 9bf292e8-0145-4d2f-b891-4cef93c0d209 -nvme0n1 -├─nvme0n1p1 -├─nvme0n1p2 linux_raid_member 1.2 md2 83719c5c-2a27-2a56-5268-7d49d8a1d84f -│ └─md2 ext4 1.0 boot 4de80ae0-dd90-4256-9135-1735e7be4b4d -├─nvme0n1p3 linux_raid_member 1.2 md3 b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff -│ └─md3 ext4 1.0 root 9bf386b6-9523-46bf-b8e5-4b8cc7c5786f -└─nvme0n1p4 -``` - -Na podstawie powyższych wyników partycje na nowym dysku zostały poprawnie dodane do RAID. Jednak partycja systemowa EFI i partycja SWAP (w niektórych przypadkach) nie zostały zduplikowane, co jest normalne, ponieważ nie są one uwzględniane w RAID. - -> [!warning] -> Powyższe przykłady ilustrują tylko niezbędne kroki na podstawie domyślnej konfiguracji serwera. Informacje w tabeli wyników zależą od sprzętu serwera i jego schematu partycji. W przypadku wątpliwości skonsultuj dokumentację swojego systemu operacyjnego. -> -> Jeśli potrzebujesz profesjonalnej pomocy z administracją serwerem, zapoznaj się z sekcją [Sprawdź również](#go-further) tego przewodnika. -> - - - -#### Odbudowanie partycji systemowej EFI - -Aby odbudować partycję systemową EFI, należy sformatować **nvme0n1p1** i następnie zrekopilować zawartość zdrowej partycji (w naszym przykładzie: nvme1n1p1) na nią. - -Zakładamy, że obie partycje zostały zsynchronizowane i zawierają aktualne pliki. - -> [!warning] -> Jeśli miało miejsce znaczące uaktualnienie systemu, takie jak jądro lub GRUB, i partycje nie zostały zsynchronizowane, skorzystaj z tej [sekcji](#rebuilding-raid-when-efi-partitions-are-not-synchronized-after-major-system-updates-eg-grub), gdy skończysz tworzyć nową partycję systemową EFI. -> - -Najpierw formatujemy partycję: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkfs.vfat /dev/nvme0n1p1 -``` - -Następnie nadajemy partycji etykietę `EFI_SYSPART` (ta nazwa jest specyficzna dla OVHcloud): - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # fatlabel /dev/nvme0n1p1 EFI_SYSPART -``` - -Następnie kopiujemy zawartość nvme1n1p1 do nvme0n1p1. 
Najpierw tworzymy dwa katalogi, które nazwiemy "old" i "new" w naszym przykładzie: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkdir old new -``` - -Następnie montujemy **nvme1n1p1** w katalogu "old" i **nvme0n1p1** w katalogu "new", aby odróżnić je od siebie: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme1n1p1 old -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme0n1p1 new -``` - -Następnie kopiujemy pliki z katalogu "old" do "new": - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # rsync -axv old/ new/ -sending incremental file list -EFI/ -EFI/debian/ -EFI/debian/BOOTX64.CSV -EFI/debian/fbx64.efi -EFI/debian/grub.cfg -EFI/debian/grubx64.efi -EFI/debian/mmx64.efi -EFI/debian/shimx64.efi - -sent 6,099,848 bytes received 165 bytes 12,200,026.00 bytes/sec -total size is 6,097,843 speedup is 1.00 -``` - -Po wykonaniu tej czynności odmontowujemy obie partycje: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme0n1p1 -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme1n1p1 -``` - -Następnie montujemy partycję zawierającą korzeń naszego systemu operacyjnego na `/mnt`. W naszym przykładzie jest to partycja **md3**. - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md3 /mnt -``` - -Montujemy następujące katalogi, aby upewnić się, że wszystkie operacje w środowisku `chroot` przebiegną poprawnie: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # -mount --types proc /proc /mnt/proc -mount --rbind /sys /mnt/sys -mount --make-rslave /mnt/sys -mount --rbind /dev /mnt/dev -mount --make-rslave /mnt/dev -mount --bind /run /mnt/run -mount --make-slave /mnt/run -``` - -Następnie korzystamy z polecenia `chroot`, aby uzyskać dostęp do punktu montażowego i upewnić się, że nowa partycja systemowa EFI została poprawnie utworzona i system rozpoznaje obie partycje ESP: - -```sh -root@rescue12-customer-eu:/# chroot /mnt -``` - -Aby wyświetlić partycje ESP, uruchamiamy polecenie `blkid -t LABEL=EFI_SYSPART`: - -```sh -root@rescue12-customer-eu:/# blkid -t LABEL=EFI_SYSPART -/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" -/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" -``` - -Wyniki powyższe pokazują, że nowa partycja EFI została poprawnie utworzona i etykieta została poprawnie zastosowana. - - - -#### Odbudowa RAID, gdy partycje EFI nie są zsynchronizowane po znaczących uaktualnieniach systemu (GRUB) - -/// details | Rozwiń tę sekcję - -> [!warning] -> Postępuj zgodnie z krokami w tej sekcji tylko wtedy, gdy dotyczy to Twojego przypadku. -> - -Gdy partycje systemowe EFI nie są zsynchronizowane po znaczących uaktualnieniach systemu, które modyfikują/lub wpływają na GRUB, a podstawowy dysk, na którym jest zamontowana partycja, zostaje wymieniony, uruchomienie z dysku pomocniczego zawierającego przestarzałą partycję ESP może się nie powieść. - -W takim przypadku, oprócz odbudowy RAID i ponownego utworzenia partycji systemowej EFI w trybie ratunkowym, należy również ponownie zainstalować GRUB na niej. 
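
Zanim w ogóle dojdzie do wymiany dysku, możesz w trybie normalnym ocenić, czy partycje ESP rzeczywiście się rozjechały (np. po aktualizacji GRUB-a), porównując ich zawartość. Poniższy szkic zakłada, że `/boot/efi` jest zamontowane z dysku głównego, a `/dev/nvme1n1p1` jest drugą, niezamontowaną partycją ESP:

```sh
# Montujemy drugą partycję ESP tylko do odczytu w tymczasowym katalogu
sudo mkdir -p /mnt/esp2
sudo mount -o ro /dev/nvme1n1p1 /mnt/esp2

# -r: porównanie rekurencyjne, -q: wypisywane są tylko różniące się pliki
sudo diff -rq /boot/efi /mnt/esp2

sudo umount /mnt/esp2
```
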
- -Po utworzeniu partycji EFI i upewnieniu się, że system rozpoznaje obie partycje (poprzednie kroki w `chroot`), tworzymy katalog `/boot/efi`, aby zamontować nową partycję systemową EFI **nvme0n1p1**: - -```sh -root@rescue12-customer-eu:/# mount /boot -root@rescue12-customer-eu:/# mount /dev/nvme0n1p1 /boot/efi -``` - -Następnie ponownie instalujemy bootloader GRUB: - -```sh -root@rescue12-customer-eu:/# grub-install --efi-directory=/boot/efi /dev/nvme0n1p1 -``` - -Po wykonaniu tej czynności uruchamiamy poniższe polecenie: - -```sh -root@rescue12-customer-eu:/# update-grub -``` -/// - - - -#### Dodanie etykiety do partycji SWAP (jeśli dotyczy) - -Po zakończeniu pracy z partycją EFI przechodzimy do partycji SWAP. - -Wyjdź z środowiska `chroot` za pomocą `exit`, aby ponownie utworzyć naszą [SWAP] partycję **nvme0n1p4** i dodać etykietę `swap-nvme0n1p4`: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 -Setting up swapspace version 1, size = 512 MiB (536866816 bytes) -LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd -``` - -Sprawdzamy, czy etykieta została poprawnie zastosowana: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f -NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS -nvme1n1 - -├─nvme1n1p1 -│ vfat FAT16 EFI_SYSPART -│ BA77-E844 504.9M 1% /root/old -├─nvme1n1p2 -│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 -│ └─md2 -│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac -├─nvme1n1p3 -│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c -│ └─md3 -│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt -└─nvme1n1p4 - swap 1 swap-nvme1n1p4 - d6af33cf-fc15-4060-a43c-cb3b5537f58a -nvme0n1 - -├─nvme0n1p1 -│ vfat FAT16 EFI_SYSPART -│ 477D-6658 -├─nvme0n1p2 -│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 -│ └─md2 -│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac -├─nvme0n1p3 -│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685 - -# mdadm: ponownie dodano /dev/nvme0n1p3 -``` - -Użyj poniższego polecenia, aby śledzić odbudowę RAID: `cat /proc/mdstat`. - -**Odbudowanie partycji systemu EFI na dysku** - -Najpierw instalujemy niezbędne narzędzia: - -**Debian i Ubuntu** - -```sh -[user@server_ip ~]# sudo apt install dosfstools -``` - -**CentOS** - -```sh -[user@server_ip ~]# sudo yum install dosfstools -``` - -Następnie formatujemy partycję. W naszym przykładzie `nvme0n1p1`: - -```sh -[user@server_ip ~]# sudo mkfs.vfat /dev/nvme0n1p1 -``` - -Następnie nadajemy partycji etykietę `EFI_SYSPART` (ta nazwa jest specyficzna dla OVHcloud) - -```sh -[user@server_ip ~]# sudo fatlabel /dev/nvme0n1p1 EFI_SYSPART -``` - -Po wykonaniu tej czynności możesz zsynchronizować obie partycje za pomocą skryptu, który udostępniliśmy [tutaj](#script). 
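
Jeśli wolisz nie korzystać ze skryptu, możesz jednorazowo zsynchronizować świeżo utworzoną partycję ESP ręcznie. Poniższy szkic wykonuje te same operacje, które skrypt wykonuje dla każdej niezamontowanej partycji ESP, przy założeniu, że `/boot/efi` jest zamontowane, a `/dev/nvme0n1p1` to nowa, niezamontowana partycja ESP:

```sh
# Montujemy nową partycję ESP w punkcie montowania używanym przez skrypt
sudo mkdir -p /var/lib/grub/esp
sudo mount /dev/nvme0n1p1 /var/lib/grub/esp

# Kopiujemy zawartość aktualnie zamontowanej partycji ESP
sudo rsync -ax /boot/efi/ /var/lib/grub/esp/

sudo umount /var/lib/grub/esp
```
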
- -Sprawdzamy, czy nowa partycja systemu EFI została poprawnie utworzona i system ją rozpoznaje: - -```sh -[user@server_ip ~]# sudo blkid -t LABEL=EFI_SYSPART -/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" -/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" -``` - -Na koniec aktywujemy partycję [SWAP] (jeśli dotyczy): - - -- Tworzymy i dodajemy etykietę: - -```sh -[user@server_ip ~]# sudo mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 -``` - -- Pobieramy UUID obu partycji swap: - -```sh -[user@server_ip ~]# sudo blkid -s /dev/nvme0n1p4 -/dev/nvme0n1p4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" -[user@server_ip ~]# sudo blkid -s /dev/nvme1n1p4 -/dev/nvme1n1p4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" -``` - -- Zastępujemy stary UUID partycji swap (**nvme0n1p4)** nowym w pliku `/etc/fstab`: - -```sh -[user@server_ip ~]# sudo nano /etc/fstab -``` - -Przykład: - -```sh -[user@server_ip ~]# sudo nano /etc/fstab -UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 -UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 -LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1 -UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 -UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 -``` - -Na podstawie powyższych wyników, stary UUID to `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` i powinien zostać zastąpiony nowym `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. - -Upewnij się, że zastępujesz poprawny UUID. - -Następnie uruchamiamy poniższe polecenie, aby aktywować partycję swap: - -```sh -[user@server_ip ~]# sudo swapon -av -swapon: /dev/nvme0n1p4: found signature [pagesize=4096, signature=swap] -swapon: /dev/nvme0n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 -swapon /dev/nvme0n1p4 -swapon: /dev/nvme1n1p4: found signature [pagesize=4096, signature=swap] -swapon: /dev/nvme1n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 -swapon /dev/nvme1n1p4 -``` - -Następnie ponownie ładowujemy system: - -```sh -[user@server_ip ~]# sudo systemctl daemon-reload -``` - -Pomyślnie ukończono odbudowę RAID. - -## Sprawdź również - -[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) - -[OVHcloud API i Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) - -[Zarządzanie hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) - -[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) - -Dla usług specjalistycznych (SEO, programowanie itp.), skontaktuj się z [partnerami OVHcloud](/links/partner). - -Jeśli potrzebujesz pomocy w użyciu i konfiguracji rozwiązań OVHcloud, zapoznaj się z naszymi [ofertami wsparcia](/links/support). - -Jeśli potrzebujesz szkoleń lub pomocy technicznej w wdrożeniu naszych rozwiązań, skontaktuj się ze swoim przedstawicielem handlowym lub kliknij [ten link](/links/professional-services), aby uzyskać wycenę i zapytać ekspertów z Professional Services o pomoc w konkretnym przypadku użycia projektu. - -Dołącz do [grona naszych użytkowników](/links/community). 
\ No newline at end of file diff --git a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pt-pt.md b/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pt-pt.md deleted file mode 100644 index d3053240d60..00000000000 --- a/pages/bare_metal_cloud/dedicated_servers/raid_soft_uefi/guide.pt-pt.md +++ /dev/null @@ -1,905 +0,0 @@ ---- -title: "Gestão e reconstrução de um RAID software nos servidores que utilizam o modo de arranque UEFI" -excerpt: Descubra como gerir e reconstruir um RAID software após a substituição de disco num servidor que utiliza o modo de arranque UEFI -updated: 2025-12-11 ---- - -## Objetivo - -Um Redundant Array of Independent Disks (RAID) é uma tecnologia que atenua a perda de dados num servidor ao replicar os dados em dois discos ou mais. - -O nível RAID predefinido para as instalações de servidores OVHcloud é o RAID 1, que duplica o espaço ocupado pelos seus dados, reduzindo assim o espaço de disco utilizável para metade. - -**Este guia explica como gerir e reconstruir um RAID software após a substituição de disco no seu servidor em modo EFI** - -Antes de começar, note que este guia foca-se nos servidores dedicados que utilizam o modo UEFI como modo de arranque. Este é o caso das placas-mãe modernas. Se o seu servidor utiliza o modo de arranque legacy (BIOS), consulte este guia: [Gestão e reconstrução de um RAID software em servidores no modo de arranque legacy (BIOS)](/pages/bare_metal_cloud/dedicated_servers/raid_soft_bios). - -Para verificar se um servidor está a funcionar no modo BIOS legacy ou no modo UEFI, execute o seguinte comando: - -```sh -[user@server_ip ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS -``` - -Para mais informações sobre a UEFI, consulte o seguinte artigo: [https://uefi.org/about](https://uefi.org/about). - -## Requisitos - -- Um [servidor dedicado](/links/bare-metal/bare-metal) com uma configuração RAID software -- Acesso administrativo (sudo) ao servidor através de SSH -- Compreensão do RAID, partições e GRUB - -Ao longo deste guia, utilizamos os termos **disco principal** e **disco secundário**. Neste contexto: - -- O disco principal é o disco cuja ESP (EFI System Partition) está montada pelo Linux -- Os discos secundários são todos os outros discos do RAID - -## Instruções - -Quando compra um novo servidor, pode sentir a necessidade de realizar uma série de testes e ações. Um destes testes pode ser simular uma falha de disco para compreender o processo de reconstrução do RAID e preparar-se em caso de problema. - -### Visão geral do conteúdo - -- [Informações básicas](#basicinformation) -- [Compreensão da partição do sistema EFI (ESP)](#efisystemparition) -- [Simulação de uma falha de disco](#diskfailure) - - [Remoção do disco defeituoso](#diskremove) -- [Reconstrução do RAID](#raidrebuild) - - [Reconstrução do RAID após a substituição do disco principal (modo de recuperação)](#rescuemode) - - [Recriação da partição do sistema EFI](#recreateesp) - - [Reconstrução do RAID quando as partições EFI não estão sincronizadas após atualizações maiores do sistema (ex. 
GRUB)](efiraodgrub) - - [Adição da etiqueta à partição SWAP (se aplicável)](#swap-partition) - - [Reconstrução do RAID em modo normal](#normalmode) - - - -### Informações básicas - -Numa sessão de linha de comandos, introduza o seguinte comando para determinar o estado atual do RAID : - -```sh -[user@server_ip ~]# cat /proc/mdstat -Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md3 : active raid1 nvme1n1p3[1] nvme0n1p3[0] - 497875968 blocks super 1.2 [2/2] [UU] - bitmap: 2/4 pages [8KB], 65536KB chunk - -md2 : active raid1 nvme1n1p2[1] nvme0n1p2[0] - 1046528 blocks super 1.2 [2/2] [UU] - -unused devices: -``` - -Este comando mostra-nos que temos atualmente dois volumes RAID software configurados, **md2** e **md3**, com **md3** sendo o maior dos dois. **md3** é composto por duas partições, chamadas **nvme1n1p3** e **nvme0n1p3**. - -O [UU] significa que todos os discos estão a funcionar normalmente. Um `_` indicaria um disco defeituoso. - -Se tiver um servidor com discos SATA, obterá os seguintes resultados : - -```sh -[user@server_ip ~]# cat /proc/mdstat -Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] -md3 : active raid1 sda3[0] sdb3[1] - 3904786432 blocks super 1.2 [2/2] [UU] - bitmap: 2/30 pages [8KB], 65536KB chunk - -md2 : active raid1 sda2[0] sdb2[1] - 1046528 blocks super 1.2 [2/2] [UU] - -unused devices: -``` - -Embora este comando devolva os nossos volumes RAID, não nos indica o tamanho das próprias partições. Podemos encontrar esta informação com o seguinte comando : - -```sh -[user@server_ip ~]# sudo fdisk -l - -Disk /dev/nvme1n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors -Disk model: WDC CL SN720 SDAQNTW-512G-2000 -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -Disklabel type: gpt -Disk identifier: A11EDAA3-A984-424B-A6FE-386550A92435 - -Device Start End Sectors Size Type -/dev/nvme1n1p1 2048 1048575 1046528 511M EFI System -/dev/nvme1n1p2 1048576 3145727 2097152 1G Linux RAID -/dev/nvme1n1p3 3145728 999161855 996016128 474.9G Linux RAID -/dev/nvme1n1p4 999161856 1000210431 1048576 512M Linux files - - -Disk /dev/nvme0n1: 476.94 GiB, 512110190592 bytes, 1000215216 sectors -Disk model: WDC CL SN720 SDAQNTW-512G-2000 -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -Disklabel type: gpt -Disk identifier: F03AC3C3-D7B7-43F9-88DB-9F12D7281D94 - -Device Start End Sectors Size Type -/dev/nvme0n1p1 2048 1048575 1046528 511M EFI System -/dev/nvme0n1p2 1048576 3145727 2097152 1G Linux RAID -/dev/nvme0n1p3 3145728 999161855 996016128 474.9G Linux RAID -/dev/nvme0n1p4 999161856 1000210431 1048576 512M Linux file -/dev/nvme0n1p5 1000211120 1000215182 4063 2M Linux file - - -Disk /dev/md2: 1022 MiB, 1071644672 bytes, 2093056 sectors -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes - - -Disk /dev/md3: 474.81 GiB, 509824991232 bytes, 995751936 sectors -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -``` - -O comando `fdisk -l` permite também identificar o tipo das suas partições. Esta é uma informação importante durante a reconstrução do seu RAID em caso de falha de disco. 
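
Para uma verificação rápida do tipo de tabela de partições, sem percorrer toda a saída de `fdisk -l`, pode, por exemplo, utilizar os comandos abaixo (esboço baseado nos discos da nossa configuração de exemplo, `nvme0n1` e `nvme1n1`):

```sh
# Tipo de tabela de partições (gpt ou dos) dos discos indicados
sudo lsblk -d -o NAME,PTTYPE /dev/nvme0n1 /dev/nvme1n1

# Em alternativa, para um único disco
sudo blkid -o value -s PTTYPE /dev/nvme0n1
```
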
- -Para as partições **GPT**, a linha 6 mostrará: `Disklabel type: gpt`. - -Também com base nos resultados de `fdisk -l`, podemos ver que `/dev/md2` é composto por 1022 MiB e `/dev/md3` contém 474,81 GiB. Se executarmos o comando `mount`, também podemos encontrar a disposição dos discos. - -Como alternativa, o comando `lsblk` oferece uma visão diferente das partições : - -```sh -[user@server_ip ~]# lsblk -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT -nvme1n1 259:0 0 476.9G 0 disk -├─nvme1n1p1 259:7 0 511M 0 part -├─nvme1n1p2 259:8 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 /boot -├─nvme1n1p3 259:9 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 / -└─nvme1n1p4 259:10 0 512M 0 part [SWAP] -nvme0n1 259:1 0 476.9G 0 disk -├─nvme0n1p1 259:2 0 511M 0 part /boot/efi -├─nvme0n1p2 259:3 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 /boot -├─nvme0n1p3 259:4 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 / -├─nvme0n1p4 259:5 0 512M 0 part [SWAP] -└─nvme0n1p5 259:6 0 2M 0 part -``` - -Além disso, se executarmos `lsblk -f`, obtemos mais informações sobre estas partições, tais como o LABEL e o UUID : - -```sh -[user@server_ip ~]# sudo lsblk -f -NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINT -nvme1n1 -├─nvme1n1p1 vfat FAT16 EFI_SYSPART B493-9DFA -├─nvme1n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea -│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot -├─nvme1n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 -│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / -└─nvme1n1p4 swap 1 swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7 [SWAP] -nvme0n1 -├─nvme0n1p1 vfat FAT16 EFI_SYSPART B486-9781 504.9M 1% /boot/efi -├─nvme0n1p2 linux_raid_member 1.2 md2 baae988b-bef3-fc07-615f-6f9043cfd5ea -│ └─md2 ext4 1.0 boot 96850c4e-e2b5-4048-8c39-525194e441aa 851.8M 7% /boot -├─nvme0n1p3 linux_raid_member 1.2 md3 ce0c7fac-0032-054c-eef7-7463b2245519 -│ └─md3 ext4 1.0 root 6fea39e9-6297-4ea3-82f1-bf1a3e88106a 441.3G 0% / -├─nvme0n1p4 swap 1 swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b [SWAP] -└─nvme0n1p5 iso9660 Joliet Extension config-2 2025-08-05-14-55-41-00 -``` - -Note os dispositivos, as partições e os seus pontos de montagem; isto é importante, especialmente após a substituição de um disco. - -A partir dos comandos e resultados acima, temos : - -- Duas matrizes RAID: `/dev/md2` e `/dev/md3`. -- Quatro partições que fazem parte do RAID: **nvme0n1p2**, **nvme0n1p3**, **nvme1n1p2**, **nvme0n1p3** com os pontos de montagem `/boot` e `/`. -- Duas partições não incluídas no RAID, com os pontos de montagem: `/boot/efi` e [SWAP]. -- Uma partição que não possui ponto de montagem: **nvme1n1p1** - -A partição `nvme0n1p5` é uma partição de configuração, ou seja, um volume somente leitura ligado ao servidor que lhe fornece os dados de configuração inicial. - - - -### Compreender a partição do sistema EFI (ESP) - -***O que é uma partição do sistema EFI ?*** - -Uma partição do sistema EFI é uma partição na qual o servidor inicia. Contém os ficheiros de arranque, bem como os gestores de arranque ou as imagens do núcleo de um sistema operativo instalado. Pode também conter programas utilitários concebidos para serem executados antes que o sistema operativo inicie, bem como ficheiros de dados tais como registos de erros. - -***A partição do sistema EFI está incluída no RAID ?*** - -Não, a partir de agosto de 2025, quando uma instalação do sistema operativo é efetuada pela OVHcloud, a partição ESP não está incluída no RAID. 
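
Para identificar a qualquer momento qual das partições ESP está montada em `/boot/efi` (ou seja, qual é o disco principal), pode, por exemplo, executar os comandos abaixo, baseados na etiqueta `EFI_SYSPART` utilizada nas nossas instalações:

```sh
# Partição ESP atualmente montada em /boot/efi (disco principal)
findmnt -n -o SOURCE /boot/efi

# Todas as partições ESP com a etiqueta EFI_SYSPART
sudo blkid -o device -t LABEL=EFI_SYSPART
```
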
Quando utiliza os nossos modelos de SO para instalar o seu servidor com um RAID software, várias partições do sistema EFI são criadas: uma por disco. No entanto, apenas uma partição EFI é montada de cada vez. Todas as ESP criadas no momento da instalação contêm os mesmos ficheiros.

A partição do sistema EFI é montada em `/boot/efi` e o disco no qual está montada é selecionado pelo Linux no arranque.

Exemplo:

```sh
[user@server_ip ~]# sudo lsblk -f
NAME        FSTYPE            FSVER LABEL          UUID                                 FSAVAIL FSUSE% MOUNTPOINT
nvme1n1
├─nvme1n1p1 vfat              FAT16 EFI_SYSPART    B493-9DFA
├─nvme1n1p2 linux_raid_member 1.2   md2            baae988b-bef3-fc07-615f-6f9043cfd5ea
│ └─md2     ext4              1.0   boot           96850c4e-e2b5-4048-8c39-525194e441aa  851.8M     7% /boot
├─nvme1n1p3 linux_raid_member 1.2   md3            ce0c7fac-0032-054c-eef7-7463b2245519
│ └─md3     ext4              1.0   root           6fea39e9-6297-4ea3-82f1-bf1a3e88106a  441.3G     0% /
└─nvme1n1p4 swap              1     swap-nvme1n1p4 483b9b41-ada3-4143-8cac-5bff7afb73c7                [SWAP]
nvme0n1
├─nvme0n1p1 vfat              FAT16 EFI_SYSPART    B486-9781                             504.9M     1% /boot/efi
├─nvme0n1p2 linux_raid_member 1.2   md2            baae988b-bef3-fc07-615f-6f9043cfd5ea
│ └─md2     ext4              1.0   boot           96850c4e-e2b5-4048-8c39-525194e441aa  851.8M     7% /boot
├─nvme0n1p3 linux_raid_member 1.2   md3            ce0c7fac-0032-054c-eef7-7463b2245519
│ └─md3     ext4              1.0   root           6fea39e9-6297-4ea3-82f1-bf1a3e88106a  441.3G     0% /
├─nvme0n1p4 swap              1     swap-nvme0n1p4 51e7172b-adb0-4729-b0f8-613e5dede38b                [SWAP]
└─nvme0n1p5 iso9660           Joliet Extension config-2 2025-08-05-14-55-41-00
```

No exemplo acima, a ESP montada em `/boot/efi` é a do disco `nvme0n1`, que é portanto o disco principal.

Recomendamos que sincronize os seus ESP regularmente ou após cada atualização importante do sistema. Por defeito, todas as partições do sistema EFI contêm os mesmos ficheiros após a instalação. No entanto, se uma atualização importante do sistema estiver envolvida, a sincronização dos ESP é essencial para manter o conteúdo atualizado.

#### Script

Aqui está um script que pode utilizar para os sincronizar manualmente. Também pode executar um script automatizado para sincronizar as partições diariamente ou sempre que o serviço iniciar.

Antes de executar o script, certifique-se de que o `rsync` está instalado no seu sistema:

**Debian/Ubuntu**

```sh
sudo apt install rsync
```

**CentOS, Red Hat e Fedora**

```sh
sudo yum install rsync
```

Para executar um script em Linux, necessita de um ficheiro executável:

- Comece por criar um ficheiro .sh no diretório da sua escolha, substituindo `nome-do-script` pelo nome da sua escolha.

```sh
sudo touch nome-do-script.sh
```

- Abra o ficheiro com um editor de texto e adicione as seguintes linhas:

```sh
sudo nano nome-do-script.sh
```

```sh
#!/bin/bash

set -euo pipefail

MOUNTPOINT="/var/lib/grub/esp"
MAIN_PARTITION=$(findmnt -n -o SOURCE /boot/efi)

echo "${MAIN_PARTITION} é a partição principal"

mkdir -p "${MOUNTPOINT}"

while read -r partition; do
    if [[ "${partition}" == "${MAIN_PARTITION}" ]]; then
        continue
    fi
    echo "Trabalhando em ${partition}"
    mount "${partition}" "${MOUNTPOINT}"
    rsync -ax "/boot/efi/" "${MOUNTPOINT}/"
    umount "${MOUNTPOINT}"
done < <(blkid -o device -t LABEL=EFI_SYSPART)
```

Guarde e feche o ficheiro.

- Torne o script executável

```sh
sudo chmod +x nome-do-script.sh
```

- Execute o script

```sh
sudo ./nome-do-script.sh
```

- Se não estiver no diretório do script

```sh
./caminho/para/o/diretório/nome-do-script.sh
```

Quando o script é executado, o conteúdo da partição EFI montada será sincronizado com as outras. Para aceder ao conteúdo, pode montar uma destas partições EFI não montadas no ponto de montagem `/var/lib/grub/esp`.

### Simulação de uma falha de disco

Agora que temos todas as informações necessárias, podemos simular uma falha de disco e proceder aos testes. Neste primeiro exemplo, vamos provocar uma falha no disco principal `nvme0n1`.

O método preferido para o fazer é através do modo rescue da OVHcloud.

Reinicie primeiro o servidor em modo rescue e ligue-se com as credenciais fornecidas.

Para retirar um disco do RAID, o primeiro passo é marcá-lo como **Failed** e retirar as partições dos seus arrays RAID respetivos.
- -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] -md3 : active raid1 nvme0n1p3[0] nvme1n1p3[1] - 497875968 blocks super 1.2 [2/2] [UU] - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] - 1046528 blocks super 1.2 [2/2] [UU] - -unused devices: -``` - -A partir do resultado acima, nvme0n1 contém duas partições em RAID que são **nvme0n1p2** e **nvme0n1p3**. - - - -#### Remoção do disco defeituoso - -Primeiro, marcamos as partições **nvme0n1p2** e **nvme0n1p3** como defeituosas. - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --fail /dev/nvme0n1p2 -# mdadm: set /dev/nvme0n1p2 faulty in /dev/md2 -``` - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --fail /dev/nvme0n1p3 -# mdadm: set /dev/nvme0n1p3 faulty in /dev/md3 -``` - -Quando executamos o comando `cat /proc/mdstat`, obtemos : - -```sh -root@rescue12-customer-ca (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] -md3 : active raid1 nvme0n1p3[0](F) nvme1n1p3[1] - 497875968 blocks super 1.2 [2/1] [_U] - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme0n1p2[2](F) nvme1n1p2[1] - 1046528 blocks super 1.2 [2/1] [_U] - -unused devices: -``` - -Como podemos ver acima, o [F] ao lado das partições indica que o disco está defeituoso ou em falha. - -Em seguida, retiramos estas partições dos arrays RAID para eliminar completamente o disco do RAID. - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md2 --remove /dev/nvme0n1p2 -# mdadm: hot removed /dev/nvme0n1p2 from /dev/md2 -``` - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --manage /dev/md3 --remove /dev/nvme0n1p3 -# mdadm: hot removed /dev/nvme0n1p3 from /dev/md3 -``` - -O estado do nosso RAID deverá agora assemelhar-se a isto : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] -md3 : active raid1 nvme1n1p3[1] - 497875968 blocks super 1.2 [2/1] [_U] - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme1n1p2[1] - 1046528 blocks super 1.2 [2/1] [_U] - -unused devices: -``` - -De acordo com os resultados acima, podemos ver que agora existem apenas duas partições nos arrays RAID. Conseguimos degradar com sucesso o disco **nvme0n1**. 
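
Opcionalmente, antes de apagar o conteúdo do disco, pode limpar os superblocos RAID das partições que acabou de retirar, para evitar que o `mdadm` as volte a detetar como membros dos arrays. Um esboço, assumindo as partições do nosso exemplo:

```sh
# Apaga os metadados RAID (superblocos) das partições retiradas dos arrays
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --zero-superblock /dev/nvme0n1p2
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --zero-superblock /dev/nvme0n1p3
```
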
- -Para nos certificarmos de obter um disco semelhante a um disco vazio, utilizamos o seguinte comando em cada partição, seguido do próprio disco : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # -shred -s10M -n1 /dev/nvme0n1p1 -shred -s10M -n1 /dev/nvme0n1p2 -shred -s10M -n1 /dev/nvme0n1p3 -shred -s10M -n1 /dev/nvme0n1p4 -shred -s10M -n1 /dev/nvme0n1p5 -shred -s10M -n1 /dev/nvme0n1 -``` - -O disco aparece agora como um disco novo e vazio : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk - -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS -nvme1n1 259:0 0 476.9G 0 disk -├─nvme1n1p1 259:1 0 511M 0 part -├─nvme1n1p2 259:2 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 -├─nvme1n1p3 259:3 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 -└─nvme1n1p4 259:4 0 512M 0 part -nvme0n1 259:5 0 476.9G 0 disk -``` - -Se executarmos o seguinte comando, verificamos que o nosso disco foi corretamente "apagado" : - -```sh -parted /dev/nvme0n1 -GNU Parted 3.5 -Using /dev/nvme0n1 -Welcome to GNU Parted! Type 'help' to view a list of commands. -(parted) p -Error: /dev/nvme0n1: unrecognised disk label -Model: WDC CL SN720 SDAQNTW-512G-2000 (nvme) -Disk /dev/nvme0n1: 512GB -Sector size (logical/physical): 512B/512B -Partition Table: unknown -Disk Flags: -``` - -Para mais informações sobre a preparação e a solicitação de substituição de um disco, consulte este [guia](/pages/bare_metal_cloud/dedicated_servers/disk_replacement). - -Se executar o seguinte comando, pode obter mais detalhes sobre os arrays RAID : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --detail /dev/md3 - -/dev/md3: - Version : 1.2 - Creation Time : Fri Aug 1 14:51:13 2025 - Raid Level : raid1 - Array Size : 497875968 (474.81 GiB 509.82 GB) - Used Dev Size : 497875968 (474.81 GiB 509.82 GB) - Raid Devices : 2 - Total Devices : 1 - Persistence : Superblock is persistent - - Intent Bitmap : Internal - - Update Time : Fri Aug 1 15:56:17 2025 - State : clean, degraded - Active Devices : 1 - Working Devices : 1 - Failed Devices : 0 - Spare Devices : 0 - -Consistency Policy : bitmap - - Name : md3 - UUID : b383c3d5:7fb1bb5e:6b7c4d96:6ea817ff - Events : 215 - - Number Major Minor RaidDevice State - - 0 0 0 removed - 1 259 4 1 active sync /dev/nvme1n1p3 -``` - -Agora podemos proceder à substituição do disco. - - - -### Reconstrução do RAID - -> [!primary] -> Este processo pode variar consoante o sistema operativo instalado no seu servidor. Recomendamos que consulte a documentação oficial do seu sistema operativo para obter os comandos adequados. -> - -> [!warning] -> -> Na maioria dos servidores com RAID software, após a substituição de um disco, o servidor é capaz de arrancar em modo normal (no disco saudável) e a reconstrução pode ser efetuada em modo normal. No entanto, se o servidor não conseguir arrancar em modo normal após a substituição do disco, reiniciará em modo rescue para proceder à reconstrução do RAID. -> -> Se o seu servidor for capaz de arrancar em modo normal após a substituição do disco, siga apenas as etapas da [secção seguinte](#rebuilding-the-raid-in-normal-mode). - - - -#### Reconstrução do RAID em modo rescue - -Uma vez o disco substituído, o próximo passo consiste em copiar a tabela de partições do disco saudável (neste exemplo, nvme1n1) para o novo (nvme0n1). 
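
Antes de copiar a tabela de partições, convém confirmar qual dos discos é o novo (vazio): no comando `sgdisk -R`, o disco indicado logo a seguir a `-R` é o destino e o último argumento é a origem; inverter a ordem sobrescreveria a tabela do disco saudável. Um exemplo de verificação:

```sh
# O disco novo não deve apresentar uma tabela de partições reconhecível
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -p /dev/nvme0n1

# O disco saudável deve listar as partições existentes
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -p /dev/nvme1n1
```
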
- -**Para as partições GPT** - -O comando deve estar neste formato : `sgdisk -R /dev/novo disco /dev/disco saudável` - -No nosso exemplo : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -R /dev/nvme0n1 /dev/nvme1n1 -``` - -Execute `lsblk` para se certificar de que as tabelas de partições foram corretamente copiadas : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk - -NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS -nvme1n1 259:0 0 476.9G 0 disk -├─nvme1n1p1 259:1 0 511M 0 part -├─nvme1n1p2 259:2 0 1G 0 part -│ └─md2 9:2 0 1022M 0 raid1 -├─nvme1n1p3 259:3 0 474.9G 0 part -│ └─md3 9:3 0 474.8G 0 raid1 -└─nvme1n1p4 259:4 0 512M 0 part -nvme0n1 259:5 0 476.9G 0 disk -├─nvme0n1p1 259:10 0 511M 0 part -├─nvme0n1p2 259:11 0 1G 0 part -├─nvme0n1p3 259:12 0 474.9G 0 part -└─nvme0n1p4 259:13 0 512M 0 part -``` - -Uma vez feito isto, o próximo passo consiste em atribuir um GUID aleatório ao novo disco para evitar conflitos de GUID com outros discos : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # sgdisk -G /dev/nvme0n1 -``` - -Se receber a seguinte mensagem : - -```console -Warning: The kernel is still using the old partition table. -The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8) -The operation has completed successfully. -``` - -Execute simplesmente o comando `partprobe`. - -Agora podemos reconstruir a matriz RAID. O seguinte trecho de código mostra como adicionar novamente as novas partições (nvme0n1p2 e nvme0n1p3) à matriz RAID. - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --add /dev/md2 /dev/nvme0n1p2 -# mdadm: added /dev/nvme0n1p2 -root@res - -# mdadm: re-added /dev/nvme0n1p3 -``` - -Para verificar o processo de reconstrução : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # cat /proc/mdstat -Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] -md3 : active raid1 nvme0n1p3[2] nvme1n1p3[1] - 497875968 blocks super 1.2 [2/1] [_U] - [>....................] recovery = 0.1% (801920/497875968) finish=41.3min speed=200480K/sec - bitmap: 0/4 pages [0KB], 65536KB chunk - -md2 : active raid1 nvme0n1p2[2] nvme1n1p2[1] - 1046528 blocks super 1.2 [2/2] [UU] -``` - -Uma vez a reconstrução do RAID terminada, execute o seguinte comando para se assegurar que as partições foram corretamente adicionadas ao RAID : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f -NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS -nvme1n1 -├─nvme1n1p1 vfat FAT16 EFI_SYSPART 4629-D183 -├─nvme1n1p2 linux_raid_member 1.2 md2 83719c5c-2a27-2a56-5268-7d49d8a1d84f -│ └─md2 ext4 1.0 boot 4de80ae0-dd90-4256-9135-1735e7be4b4d -├─nvme1n1p3 linux_raid_member 1.2 md3 b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff -│ └─md3 ext4 1.0 root 9bf386b6-9523-46bf-b8e5-4b8cc7c5786f -└─nvme1n1p4 swap 1 swap-nvme1n1p4 9bf292e8-0145-4d2f-b891-4cef93c0d209 -nvme0n1 -├─nvme0n1p1 -├─nvme0n1p2 linux_raid_member 1.2 md2 83719c5c-2a27-2a56-5268-7d49d8a1d84f -│ └─md2 ext4 1.0 boot 4de80ae0-dd90-4256-9135-1735e7be4b4d -├─nvme0n1p3 linux_raid_member 1.2 md3 b383c3d5-7fb1-bb5e-6b7c-4d966ea817ff -│ └─md3 ext4 1.0 root 9bf386b6-9523-46bf-b8e5-4b8cc7c5786f -└─nvme0n1p4 -``` - -De acordo com os resultados acima, as partições do novo disco foram corretamente adicionadas ao RAID. No entanto, a partição EFI System e a partição SWAP (em alguns casos) não foram duplicadas, o que é normal, pois não fazem parte do RAID. 
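
Se a reconstrução ainda estiver em curso, pode acompanhá-la até ao fim sem repetir o comando manualmente, por exemplo:

```sh
# Atualiza o estado do RAID a cada 5 segundos (sair com Ctrl+C)
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # watch -n 5 cat /proc/mdstat

# Ou bloquear até a ressincronização de md3 terminar
root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mdadm --wait /dev/md3
```
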
- -> [!warning] -> Os exemplos acima ilustram apenas as etapas necessárias com base numa configuração de servidor predefinida. Os resultados de cada comando dependem do tipo de hardware instalado no seu servidor e da estrutura das suas partições. Em caso de dúvida, consulte a documentação do seu sistema operativo. -> -> Se precisar de assistência profissional para a administração do seu servidor, consulte os detalhes da secção [Quer saber mais?](#go-further) deste guia. -> - - - -#### Recriação da partição EFI System - -Para recolocar a partição EFI System, temos de formatar **nvme0n1p1** e replicar o conteúdo da partição EFI System saudável (no nosso exemplo: nvme1n1p1) para esta. - -Aqui, assumimos que as duas partições foram sincronizadas e contêm ficheiros actualizados ou simplesmente não sofreram actualizações do sistema com impacto no *bootloader*. - -> [!warning] -> Se uma actualização importante do sistema, tal como uma actualização do núcleo ou do GRUB, ocorreu e as duas partições não foram sincronizadas, consulte esta [secção](#rebuilding-raid-when-efi-partitions-are-not-synchronized-after-major-system-updates-eg-grub) uma vez que tenha terminado a criação da nova partição EFI System. -> - -Em primeiro lugar, formatamos a partição : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkfs.vfat /dev/nvme0n1p1 -``` - -Em seguida, atribuímos a etiqueta `EFI_SYSPART` à partição. (este nome é específico da OVHcloud) : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # fatlabel /dev/nvme0n1p1 EFI_SYSPART -``` - -Em seguida, replicamos o conteúdo de nvme1n1p1 para nvme0n1p1. Começamos por criar dois diretórios, que chamamos « old » e « new » no nosso exemplo : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkdir old new -``` - -Em seguida, montamos **nvme1n1p1** no diretório « old » e **nvme0n1p1** no diretório « new » para diferenciá-los : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme1n1p1 old -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/nvme0n1p1 new -``` - -Em seguida, copiamos os ficheiros do diretório 'old' para 'new' : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # rsync -axv old/ new/ -sending incremental file list -EFI/ -EFI/debian/ -EFI/debian/BOOTX64.CSV -EFI/debian/fbx64.efi -EFI/debian/grub.cfg -EFI/debian/grubx64.efi -EFI/debian/mmx64.efi -EFI/debian/shimx64.efi - -sent 6,099,848 bytes received 165 bytes 12,200,026.00 bytes/sec -total size is 6,097,843 speedup is 1.00 -``` - -Uma vez feito isto, desmontamos as duas partições : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme0n1p1 -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # umount /dev/nvme1n1p1 -``` - -Em seguida, montamos a partição contendo a raiz do nosso sistema operativo em `/mnt`. 
No nosso exemplo, esta partição é **md3**: - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mount /dev/md3 /mnt -``` - -Montamos os seguintes diretórios para nos assegurarmos que qualquer manipulação que realizamos no ambiente `chroot` funciona corretamente : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # -mount --types proc /proc /mnt/proc -mount --rbind /sys /mnt/sys -mount --make-rslave /mnt/sys -mount --rbind /dev /mnt/dev -mount --make-rslave /mnt/dev -mount --bind /run /mnt/run -mount --make-slave /mnt/run -``` - -Em seguida, utilizamos o comando `chroot` para aceder ao ponto de montagem e assegurar-nos que a nova partição do sistema EFI foi corretamente criada e que o sistema reconhece as duas ESP : - -```sh -root@rescue12-customer-eu:/# chroot /mnt -``` - -Para mostrar as partições ESP, executamos o comando `blkid -t LABEL=EFI_SYSPART` : - -```sh -root@rescue12-customer-eu:/# blkid -t LABEL=EFI_SYSPART -/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" -/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" -``` - -Os resultados acima mostram que a nova partição EFI foi criada corretamente e que a etiqueta foi aplicada corretamente. - - - -#### Reconstrução do RAID quando as partições EFI não estão sincronizadas após actualizações importantes do sistema (GRUB) - -/// details | Desenvolva esta secção - -> [!warning] -> Siga as etapas desta secção apenas se se aplicarem ao seu caso. -> - -Quando as partições do sistema EFI não estão sincronizadas após actualizações importantes do sistema que modificam/afetam o GRUB, e o disco principal no qual a partição está montada é substituído, o arranque a partir de um disco secundário contendo uma ESP obsoleta pode não funcionar. - -Neste caso, para além de reconstruir o RAID e recolocar a partição do sistema EFI no modo rescue, também deve reinstalar o GRUB nela. - -Uma vez que tenhamos recolocado a partição EFI e nos certificamos que o sistema reconhece as duas partições (etapas anteriores no `chroot`), criamos a pasta `/boot/efi` para montar a nova partição do sistema EFI **nvme0n1p1** : - -```sh -root@rescue12-customer-eu:/# mount /boot -root@rescue12-customer-eu:/# mount /dev/nvme0n1p1 /boot/efi -``` - -Em seguida, reinstalamos o carregador de arranque GRUB (*bootloader*) : - -```sh -root@rescue12-customer-eu:/# grub-install --efi-directory=/boot/efi /dev/nvme0n1p1 -``` - -Uma vez feito isto, execute o seguinte comando : - -```sh -root@rescue12-customer-eu:/# update-grub -``` -/// - - - -#### Adição da etiqueta à partição SWAP (se aplicável) - -Uma vez que tenhamos terminado com a partição EFI, passamos à partição SWAP. 
- -Sair do ambiente `chroot` com `exit` para recolocar a nossa partição [SWAP] **nvme0n1p4** e adicionar a etiqueta `swap-nvme0n1p4` : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 -Setting up swapspace version 1, size = 512 MiB (536866816 bytes) -LABEL=swap-nvme0n1p4, UUID=b3c9e03a-52f5-4683-81b6-cc10091fcd -``` - -Verificamos que a etiqueta foi corretamente aplicada : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # lsblk -f -NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS -nvme1n1 - -├─nvme1n1p1 -│ vfat FAT16 EFI_SYSPART -│ BA77-E844 504.9M 1% /root/old -├─nvme1n1p2 -│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 -│ └─md2 -│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac -├─nvme1n1p3 -│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c -│ └─md3 -│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt -└─nvme1n1p4 - swap 1 swap-nvme1n1p4 - d6af33cf-fc15-4060-a43c-cb3b5537f58a -nvme0n1 - -├─nvme0n1p1 -│ vfat FAT16 EFI_SYSPART -│ 477D-6658 -├─nvme0n1p2 -│ linux_ 1.2 md2 53409058-480a-bc65-4e1d-6acc848fe233 -│ └─md2 -│ ext4 1.0 boot f925a033-0087-40ec-817e-44efab0351ac -├─nvme0n1p3 -│ linux_ 1.2 md3 a3b8816c-a5c3-7f01-ee17-e1aa9685c35c -│ └─md3 -│ ext4 1.0 root 6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 441.2G 0% /mnt -└─nvme0n1p4 - swap 1 swap-nvme0n1p4 - b3c9e03a-52f5-4683-81b6-cc10091fcd15 -``` - -Acedemos novamente ao ambiente `chroot` : - -```sh -root@rescue12-customer-eu (nsxxxxx.ip-xx-xx-xx.eu) ~ # chroot /mnt -``` - -Recuperamos o UUID das - -# mdadm: re-added /dev/nvme0n1p3 -``` - -Utilize o seguinte comando para seguir a reconstrução do RAID: `cat /proc/mdstat`. - -**Recriação da partição do sistema EFI no disco** - -Primeiro, instalamos as ferramentas necessárias: - -**Debian e Ubuntu** - -```sh -[user@server_ip ~]# sudo apt install dosfstools -``` - -**CentOS** - -```sh -[user@server_ip ~]# sudo yum install dosfstools -``` - -Em seguida, formatamos a partição. No nosso exemplo `nvme0n1p1`: - -```sh -[user@server_ip ~]# sudo mkfs.vfat /dev/nvme0n1p1 -``` - -Em seguida, atribuímos a etiqueta `EFI_SYSPART` à partição. (este nome é específico da OVHcloud): - -```sh -[user@server_ip ~]# sudo fatlabel /dev/nvme0n1p1 EFI_SYSPART -``` - -Depois disso, pode sincronizar as duas partições com o script que fornecemos [aqui](#script). 
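
Se preferir não utilizar o script, pode sincronizar pontualmente a nova partição ESP de forma manual. O esboço abaixo reproduz as operações que o script efetua para cada ESP não montada, assumindo que `/boot/efi` está montado e que `/dev/nvme0n1p1` é a nova partição ESP, ainda não montada:

```sh
# Montamos a nova partição ESP no ponto de montagem utilizado pelo script
sudo mkdir -p /var/lib/grub/esp
sudo mount /dev/nvme0n1p1 /var/lib/grub/esp

# Copiamos o conteúdo da partição ESP atualmente montada
sudo rsync -ax /boot/efi/ /var/lib/grub/esp/

sudo umount /var/lib/grub/esp
```
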
- -Verificamos que a nova partição do sistema EFI foi corretamente criada e que o sistema a reconhece: - -```sh -[user@server_ip ~]# sudo blkid -t LABEL=EFI_SYSPART -/dev/nvme1n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="4629-D183" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="889f241b-49c3-4031-b5c9-60df0746f98f" -/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="EFI_SYSPART" LABEL="EFI_SYSPART" UUID="521F-300B" BLOCK_SIZE="512" TYPE="vfat" PARTLABEL="primary" PARTUUID="02bf2b2d-7ada-4461-ba50-07683519f65d" -``` - -Por fim, ativamos a partição [SWAP] (se aplicável): - -- Criamos e adicionamos a etiqueta: - -```sh -[user@server_ip ~]# sudo mkswap /dev/nvme0n1p4 -L swap-nvme0n1p4 -``` - -- Recuperamos os UUID das duas partições swap: - -```sh -[user@server_ip ~]# sudo blkid -s /dev/nvme0n1p4 -/dev/nvme0n1p4: UUID="b3c9e03a-52f5-4683-81b6-cc10091fcd15" -[user@server_ip ~]# sudo blkid -s /dev/nvme1n1p4 -/dev/nvme1n1p4: UUID="d6af33cf-fc15-4060-a43c-cb3b5537f58a" -``` - -- Substituímos o antigo UUID da partição swap (**nvme0n1p4)** pelo novo em `/etc/fstab`: - -```sh -[user@server_ip ~]# sudo nano /etc/fstab -``` - -Exemplo: - -```sh -[user@server_ip ~]# sudo nano /etc/fstab -UUID=6abfaa3b-e630-457a-bbe0-e00e5b4b59e5 / ext4 defaults 0 1 -UUID=f925a033-0087-40ec-817e-44efab0351ac /boot ext4 defaults 0 0 -LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1 -UUID=b7b5dd38-9b51-4282-8f2d-26c65e8d58ec swap swap defaults 0 0 -UUID=d6af33cf-fc15-4060-a43c-cb3b5537f58a swap swap defaults 0 0 -``` - -De acordo com os resultados acima, o antigo UUID é `b7b5dd38-9b51-4282-8f2d-26c65e8d58ec` e deve ser substituído pelo novo `b3c9e03a-52f5-4683-81b6-cc10091fcd15`. - -Certifique-se de substituir o UUID correto. - -Em seguida, executamos o seguinte comando para ativar a partição swap: - -```sh -[user@server_ip ~]# sudo swapon -av -swapon: /dev/nvme0n1p4: found signature [pagesize=4096, signature=swap] -swapon: /dev/nvme0n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 -swapon /dev/nvme0n1p4 -swapon: /dev/nvme1n1p4: found signature [pagesize=4096, signature=swap] -swapon: /dev/nvme1n1p4: pagesize=4096, swapsize=536870912, devsize=536870912 -swapon /dev/nvme1n1p4 -``` - -Em seguida, recarregamos o sistema: - -```sh -[user@server_ip ~]# sudo systemctl daemon-reload -``` - -Agora terminámos com sucesso a reconstrução do RAID. - -## Quer saber mais? - -[Hot Swap - Software RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_soft) - -[OVHcloud API and Storage](/pages/bare_metal_cloud/dedicated_servers/partitioning_ovh) - -[Managing hardware RAID](/pages/bare_metal_cloud/dedicated_servers/raid_hard) - -[Hot Swap - Hardware RAID](/pages/bare_metal_cloud/dedicated_servers/hotswap_raid_hard) - -Para serviços especializados (SEO, desenvolvimento, etc.), contacte [os parceiros da OVHcloud](/links/partner). - -Se precisar de assistência para utilizar e configurar as suas soluções OVHcloud, consulte as [nossas ofertas de suporte](/links/support). - -Se precisar de formação ou de assistência técnica para implementar as nossas soluções, contacte o seu representante comercial ou clique [neste link](/links/professional-services) para obter um orçamento e solicitar que a equipa de Professional Services intervenha no seu caso de utilização específico. - -Fale com a nossa [comunidade de utilizadores](/links/community). \ No newline at end of file