Expanding RAID 5 on a Physical Server by Adding a Disk

February 13, 2019 | Original | Linux Basics


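`Background: a MySQL replica is running out of disk space. The server has 4 disks in a single RAID 5 array that also holds the operating system; the data partition is only 745G and already holds 621G of data, so usage has reached 88%. The usual approach is to take the machine out of service, expand the array, reinstall the system, partition and format, reload the data, and only then serve traffic again. With this much data the service would be unavailable for a long time. To keep downtime short and avoid the trouble of restoring data, can the array be expanded online instead?`

    [root@localhost ~]$ df -h
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda2        79G   25G   51G  33% /
    tmpfs           7.8G     0  7.8G   0% /dev/shm
    /dev/sda1       194M   28M  157M  15% /boot
    /dev/sda3       745G  621G   86G  88% /data

#### Preparation

- Confirm that the partition to be expanded is the last one on the disk; otherwise it cannot be grown this way. (Partitions managed by LVM are not subject to this restriction.)
- File a ticket with the data center to hot-add one disk of the same specification (it can simply be inserted, provided a free drive bay is available).

#### Adding the disk to the array

Once the disk has been inserted, the following command shows the newly added disk:

    [root@localhost ~]$ MegaCli64 -pdlist -aall|egrep 'Enclosure Device ID|Slot Number|Raw Size'
    Enclosure Device ID: 32
    Slot Number: 0
    Raw Size: 279.396 GB [0x22ecb25c Sectors]
    Enclosure Device ID: 32
    Slot Number: 1
    Raw Size: 279.396 GB [0x22ecb25c Sectors]
    Enclosure Device ID: 32
    Slot Number: 2
    Raw Size: 279.396 GB [0x22ecb25c Sectors]
    Enclosure Device ID: 32
    Slot Number: 3
    Raw Size: 279.396 GB [0x45dd2fb0 Sectors]
    Enclosure Device ID: 32
    Slot Number: 4
    Raw Size: 279.396 GB [0x22ecb25c Sectors]

The last entry, with Slot Number 4, is the new disk. The RAID array itself still has only 4 drives at this point, as the following command shows:

    [root@localhost ~]$ MegaCli64 -ldinfo -lall -aall|egrep 'Number Of Drives|Span Depth'
    Number Of Drives    : 4
    Span Depth          : 1

Number Of Drives is 4, meaning the array is built from 4 disks. Next, add the disk in slot 4 to this array with the command below; after that it is just a matter of waiting. (Normal service is not affected while waiting.)

    MegaCli64 -LDRecon -Start -r5 -Add -PhysDrv[32:4] -L0 -a0

`Here [32:4] is the disk's Enclosure Device ID and Slot Number, as listed by MegaCli64 -pdlist -aall. -L0 is the Virtual Drive number; if there were two RAID arrays there would also be a Virtual Drive 1. -a0 is the adapter number, which is the number shown after "Adapter". Both can be checked with MegaCli64 -ldinfo -lall -aall.`

After the command has been issued, progress can be checked with the command below. Reconstruction runs first, followed by initialization. Both phases take a long time, possibly several days, but normal use is not affected during that period.

    MegaCli64 -ldinfo -lall -aall
    Reconstruction           : Completed 8%, Taken 184 min.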
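Because reconstruction and initialization can run for days, it may be convenient to pull just the progress figure out of the `-ldinfo` output rather than reading the full dump each time. The following is a minimal sketch, not part of the original procedure: the function name `ld_progress` is hypothetical, and it assumes the progress lines look like the two samples quoted in this post.

```shell
#!/usr/bin/env bash
# Extract "<phase> <percent>" from MegaCli-style -ldinfo text read on stdin.
# Assumed line shapes (taken from the output quoted in this post):
#   Reconstruction           : Completed 8%, Taken 184 min.
#   Background Initialization: Completed 34%, Taken 927 min.
ld_progress() {
    sed -n -E 's/^[[:space:]]*(Reconstruction|Background Initialization)[[:space:]]*:[[:space:]]*Completed ([0-9]+)%.*/\1 \2/p'
}

# Demo on a sample line, so no RAID controller is needed to try it.
sample='Reconstruction : Completed 8%, Taken 184 min.'
printf '%s\n' "$sample" | ld_progress   # prints "Reconstruction 8"

# On the real host this could be fed from the controller instead, e.g.:
#   MegaCli64 -ldinfo -lall -aall | ld_progress
```

Lines that do not describe an ongoing operation produce no output, so an empty result suggests that neither phase is currently running.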
During the initialization phase, the number of drives has already changed to 5 and the size has grown:

    Virtual Drive: 0 (Target Id: 0)
    Name                :Virtual Disk 0
    RAID Level          : Primary-5, Secondary-0, RAID Level Qualifier-3
    Size                : 1.088 TB
    State               : Optimal
    Strip Size          : 64 KB
    Number Of Drives    : 5
    Span Depth          : 1
    Default Cache Policy: WriteBack, ReadAheadNone, Direct, Write Cache OK if Bad BBU
    Current Cache Policy: WriteBack, ReadAheadNone, Direct, Write Cache OK if Bad BBU
    Access Policy       : Read/Write
    Disk Cache Policy   : Disk's Default
    Ongoing Progresses:
      Background Initialization: Completed 34%, Taken 927 min.
    Encryption Type     : None

When the addition has finished, the state looks like this:

    [root@localhost ~]$ MegaCli64 -ldinfo -lall -aall

    Adapter 0 -- Virtual Drive Information:
    Virtual Drive: 0 (Target Id: 0)
    Name                :Virtual Disk 0
    RAID Level          : Primary-5, Secondary-0, RAID Level Qualifier-3
    Size                : 1.088 TB
    State               : Optimal
    Strip Size          : 64 KB
    Number Of Drives    : 5
    Span Depth          : 1
    Default Cache Policy: WriteBack, ReadAheadNone, Direct, Write Cache OK if Bad BBU
    Current Cache Policy: WriteBack, ReadAheadNone, Direct, Write Cache OK if Bad BBU
    Access Policy       : Read/Write
    Disk Cache Policy   : Disk's Default
    Encryption Type     : None

#### Expanding the partition

At this point the block device has not grown yet, as fdisk /dev/sda shows. The operating system has to be rebooted so that sda picks up the newly added space, which takes it from a little over 800G to 1.088T. Take the replica out of traffic, stop mysql, then reboot.

After the reboot, fdisk /dev/sda shows the expanded disk size, 1197.8 GB:

    [root@localhost ~]$ fdisk /dev/sda

    WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
             switch off the mode (command 'c') and change display units to
             sectors (command 'u').

    Command (m for help): p

    Disk /dev/sda: 1197.8 GB, 1197759004672 bytes
    255 heads, 63 sectors/track, 145619 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000080

       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *           1          26      204800   83  Linux
    Partition 1 does not end on cylinder boundary.
    /dev/sda2              26       10469    83886080   83  Linux
    /dev/sda3           10469      109215   793172992   83  Linux

Now expand the partition: delete partition 3, the one to be expanded, then create it again and write the table.

    Command (m for help): d
    Partition number (1-4): 3

    Command (m for help): p

    Disk /dev/sda: 1197.8 GB, 1197759004672 bytes
    255 heads, 63 sectors/track, 145619 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000080

       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *           1          26      204800   83  Linux
    Partition 1 does not end on cylinder boundary.
    /dev/sda2              26       10469    83886080   83  Linux

    Command (m for help): n
    Command action
       e   extended
       p   primary partition (1-4)
    p
    Partition number (1-4): 3
    First cylinder (10469-145619, default 10469):
    Using default value 10469
    Last cylinder, +cylinders or +size{K,M,G} (10469-145619, default 145619):
    Using default value 145619

    Command (m for help): p

    Disk /dev/sda: 1197.8 GB, 1197759004672 bytes
    255 heads, 63 sectors/track, 145619 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000080

       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *           1          26      204800   83  Linux
    Partition 1 does not end on cylinder boundary.
    /dev/sda2              26       10469    83886080   83  Linux
    /dev/sda3           10469      145619  1085592713+  83  Linux

    Command (m for help): w
    The partition table has been altered!

    Calling ioctl() to re-read partition table.

    WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
    The kernel still uses the old table. The new table will be used at
    the next reboot or after you run partprobe(8) or kpartx(8)
    Syncing disks.

The partition has now been recreated. Pay attention to its Start value: it must stay the same as the original partition's (10469 here), because the existing filesystem begins at that position. Then run the following two steps:

    [root@localhost ~]$ partprobe /dev/sda3
    [root@localhost ~]$ resize2fs -p /dev/sda3
    resize2fs 1.41.12 (17-May-2010)
    The filesystem is already 198293248 blocks long.  Nothing to do!
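The "Nothing to do!" message can be cross-checked with a little arithmetic. The sketch below converts the Blocks column from the fdisk listings (in 1 KiB units) into filesystem blocks, assuming the filesystem uses 4 KiB blocks; that block size is an assumption here, and `tune2fs -l /dev/sda3` would confirm it. The function name `kib_to_fs_blocks` is hypothetical.

```shell
#!/usr/bin/env bash
# Convert a partition size in 1 KiB blocks (the "Blocks" column of the
# fdisk listings above) into whole filesystem blocks.
# The second argument is the filesystem block size in KiB; 4 is assumed.
kib_to_fs_blocks() {
    local kib=$1 fs_block_kib=${2:-4}
    echo $(( kib / fs_block_kib ))
}

old=$(kib_to_fs_blocks 793172992)    # /dev/sda3 before repartitioning
new=$(kib_to_fs_blocks 1085592713)   # /dev/sda3 after (trailing "+" dropped)

echo "old partition: $old blocks"    # prints "old partition: 198293248 blocks"
echo "new partition: $new blocks"    # prints "new partition: 271398178 blocks"
```

The old figure equals the 198293248 blocks that resize2fs reported, which suggests resize2fs was still being shown the old partition size by the kernel. The new figure is roughly what the filesystem should be able to grow to once the new table is in use.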
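When resize2fs reports "Nothing to do", the kernel is still using the old partition table for /dev/sda3 (as the fdisk warning above said), so the system has to be rebooted once more. After that reboot, running resize2fs -p /dev/sda3 again works:

    [root@localhost ~]$ resize2fs -p /dev/sda3
    resize2fs 1.41.12 (17-May-2010)
    Filesystem at /dev/sda3 is mounted on /data; on-line resizing required
    old desc_blocks = 48, new_desc_blocks = 65
    Performing an on-line resize of /dev/sda3 to 271398178 (4k) blocks.
    The filesystem on /dev/sda3 is now 271398178 blocks long.

    [root@localhost ~]$ df -h
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda2        79G   25G   51G  33% /
    tmpfs           7.8G     0  7.8G   0% /dev/shm
    /dev/sda1       194M   28M  157M  15% /boot
    /dev/sda3      1020G  621G  347G  65% /data

That completes the partition expansion. Over the whole process, service was unavailable only during the two reboots; the rest of the time it ran normally.

#### Summary

- Expanding this way is fairly convenient and the steps are simple. Because the disks are in a RAID array, adding a disk triggers a rebuild of the RAID layout and a data resync, which takes a long time.
- Cloud vendors now all offer online disk expansion. AWS does it best and can grow any partition truly online. (Alibaba Cloud, for example, still needs an operating system reboot to recognize the expanded disk.)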
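As a rough sanity check on the sizes seen in this walkthrough, RAID 5 gives up one disk's worth of space to parity, so usable capacity is about (number of disks - 1) × per-disk size. The sketch below applies that to the 279.396 GB Raw Size reported by MegaCli64; the function name `raid5_usable_gb` is hypothetical, and the result is only an estimate.

```shell
#!/usr/bin/env bash
# Estimate usable RAID 5 capacity: one disk's worth of space holds parity,
# so usable = (disks - 1) * per_disk_size. Sizes are in GB as reported by
# the "Raw Size" field; LC_ALL=C keeps the decimal point locale-independent.
raid5_usable_gb() {
    local disks=$1 disk_gb=$2
    LC_ALL=C awk -v n="$disks" -v s="$disk_gb" \
        'BEGIN { printf "%.3f\n", (n - 1) * s }'
}

raid5_usable_gb 4 279.396   # before adding the disk: prints 838.188
raid5_usable_gb 5 279.396   # after adding the disk:  prints 1117.584
```

The first figure is consistent with the "a little over 800G" device seen before the expansion. The second, about 1.09 TB when divided by 1024, is slightly above the 1.088 TB the controller reports, presumably because the controller does not expose the full raw size of each disk.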

Last updated: 2019-02-13 13:25:37