Optimize and save on provisioned storage space in virtualized environments

In the past I often over-provisioned storage for VMs.

For example, AWS and other cloud services commonly charge per gigabyte of provisioned EBS and similar storage (Mongolian providers such as iTools, Mobicom and others usually charge by subscription plan, so paying for unused disk space is unavoidable). In general it is not recommended to provision more storage space than necessary. It is fine to leave a reasonable amount of free space for your VMs, but not too much (it is still very common to see a VM that utilizes only 3-5% of a huge vmdk, AWS EBS volume and so on).

The same “generosity” can be seen in company data centers with expensive SAN storage (for physical servers as well as for virtual machines on VMware/Hyper-V). For Windows VM servers the process is transparent and easy: first expand the SAN LUN, then re-scan and expand the VMware datastore, then manually grow the vmdk, and finally re-scan and extend the C:/D: volumes from Disk Manager online, without rebooting the guest OS. The mistake described above is mostly seen on Linux/UNIX VMs or physical servers, and in my opinion the core reason is a lack of understanding of LVM. Too many VMs are deployed without any consideration of future maintenance and scaling, in other words without planning and storage design. For Windows administrators LVM can be thought of as the same concept as Dynamic Disks: without converting an ordinary disk into a Dynamic Disk you cannot create Windows software RAIDs, and LVM serves the same purpose on UNIX and Linux.
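
For a Windows administrator, the quickest way to see the Dynamic Disks analogy is to list the three layers of the LVM stack. A minimal sketch on any Linux system with LVM installed:

    pvs    # physical volumes: the disks/partitions handed over to LVM
    vgs    # volume groups: pools of space built from those PVs
    lvs    # logical volumes: the "dynamic volumes" carved out of a VG that carry the filesystems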

Let’s assume we have VMware hosts running Linux VMs.

As usual, I suggest first creating a VM template with minimal disk sizes:

    1. create a new VM:
      • enable hot-add of vRAM/vCPU (so that RAM/CPU resources can be added to the Linux system without a reboot) and install VMware Tools, or the equivalent software for other hypervisors, to be able to use this feature.
      • add three vmdks (as SCSI disks with SCSI IDs 0:0, 0:1, 0:2 – these IDs are very useful later to identify which vmdk to expand; the sketch after this list item shows how they look from inside the guest)
      • even if you create the vmdks as named disks for the VM template with vmkfstools (hoping to use these names later to identify the disks), the disks will be renamed into faceless vmdk files during deployment from the template anyway, so you will not be able to tell them apart by name.
      • for this reason I recommend creating new VM disks without gaps in the SCSI IDs, at least for the VM template (gaps usually appear when you create and delete disks: you end up with 0:0, 0:3, 0:4 because the vmdks with IDs 0:1 and 0:2 were deleted, and the next vmdk you add will get ID 0:1, which can mess everything up)
      • and so on
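      • sketch: from inside the guest, the SCSI address (H:C:T:L) tells you which vmdk you are looking at; one way to map SCSI IDs to /dev/sdX names is lsscsi (it may need to be installed first, e.g. yum install lsscsi; ls /sys/class/scsi_device/ works too):
        lsscsi    # prints lines like "[2:0:1:0]  disk  VMware  Virtual disk ...  /dev/sdb"
                  # the first number (the host adapter) may differ; the middle "0:1" is the vmdk's SCSI ID and the last column is the Linux device name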
    2. install Linux, for example RHEL 7:
      • install everything on the first vmdk – for example base.vmdk (do everything in manual partitioning mode; I use named vmdks to simplify the explanation, in real life you should use the SCSI IDs as mentioned above) – deselect the second and third disks during installation (or, even better, create them after the installation)
      • it is mandatory to use LVM for all partitions/logical volumes, except /boot
      • create as many logical volumes as possible (at least for swap, /opt, /tmp, /usr, /home, /var and others). It is OK to place all of them in one LVM volume group – for example the default “rhel”. If you do not create separate LVM logical volumes, you will not be able to expand them later online, without a reboot and without interrupting the LOB (Line Of Business) applications.
      • this is only a VM template – a skeleton VM – so you are free to create all logical volumes with reasonably minimal sizes (for example: 500M for /boot, 2G for /usr and only 0.5-1G for each of the others on the first 10G vmdk, with the leftover going to the root directory; check that this leftover is not too small for root). A rough sketch of such a layout follows after this list item.
      • install RHEL and convert the VM into a VMware template (deployment will rename the disks, assign new MAC addresses to all vNICs and so on)
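      • sketch: the same layout expressed as plain LVM commands (roughly what the installer's manual partitioning ends up creating; the VG name "rhel" and the sizes are just the ones used in this post):
        lvcreate -L 2G   -n usr  rhel   # /usr
        lvcreate -L 512M -n var  rhel   # /var (will be grown to 30G later)
        lvcreate -L 512M -n swap rhel   # swap (will be grown to 4G later)
        lvcreate -L 512M -n home rhel   # /home
        lvcreate -L 512M -n opt  rhel   # /opt
        lvcreate -L 512M -n tmp  rhel   # /tmp
        lvs rhel                        # list the result; the leftover space in the 10G VG goes to the root LV created by the installer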
    3. now we have a Linux template with minimal disk sizes – it cannot be used directly for anything serious, except VM deployment.
    4. OK, let's deploy a new VM from this template
    5. start the new VM
    6. now it's time to use the second and third vmdks; let's call them extends.vmdk (to extend our first vmdk) and apps.vmdk (to create and extend folders for our LOB applications). In reality the names will be [vmname]_1.vmdk, [vmname]_2.vmdk or something like that.
    7. if you have read this post carefully, you might ask why extends.vmdk is needed at all – why not just resize and reuse the first vmdk? It is possible, but too many Linux distributions have problems with the partprobe command on the disk holding the root directory, so to avoid these problems I suggest a dedicated vmdk for all extensions.
    8. now let's assume we have the task of re-configuring a new RHEL VM server, created from the template above, with the following parameters:
      • /var should be 30G – for example for a web server
      • swap should be at least 4G
      • all data should be stored in a separate folder /data – 100G
      • the Linux server must not be rebooted !!! (it is already in use)
    9. let's first expand the logical volume for /var:
      • the current logical volume for /var was created by us on base.vmdk with only 500M, so we increase the whole extends.vmdk (SCSI ID 0:1) to 29.5G
      • we are not allowed to reboot the server to make it recognize the new vmdk size, so we need to run the command
        echo 1 > /sys/class/scsi_device/[device-id]/device/rescan

        (where [device-id] is 2:0:1:0 or something similar – the easiest way is to cd into /sys/class/scsi_device/ and pick the right entry; the middle two numbers are our SCSI ID mentioned above – again, 0:0 is base.vmdk and 0:1 is extends.vmdk; see also the lsscsi sketch after step 1)

      • if the above command worked, we will see the changed size of /dev/sdb (the second vmdk) with fdisk -l /dev/sdb
      • now we should create a new partition on /dev/sdb (I prefer cfdisk; it is better not to create a primary partition if you plan to expand multiple times in the future, since there is a limit on the number of primary partitions). Let's assume your new 29.5G partition is named /dev/sdb5 (don't forget to write the partition table :))
      • we created the new partition, but Linux does not know about it without a reboot, so we run partprobe -s
      • so everything is ready for the LVM commands:
        pvcreate /dev/sdb5
        vgscan  (to check that the VG name is "rhel")
        vgextend rhel /dev/sdb5 (extend the "rhel" VG with the whole space of the new partition)
        vgdisplay | grep Free (to check that the VG is extended and now has free space)
        lvextend -r -l +100%FREE /dev/rhel/var /dev/sdb5 (lvextend needs a size, hence "-l +100%FREE";
        instead of "-r" you can run "resize2fs /dev/rhel/var" for ext4 or "xfs_growfs /var" for XFS afterwards)
        df -h (to check that the size of /var has changed)
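      • sketch: a quick way to double-check which filesystem /var uses (and therefore whether resize2fs or xfs_growfs applies) and that the LV really grew:
        df -hT /var   # shows the filesystem type (xfs is the RHEL 7 default, ext4 is also common) and the new size
        lvs rhel      # shows the new size of the var LV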
    10. now we need to increase swap from 500M to 4G:
      • again grow extends.vmdk by 3.5G (the delta between 0.5G and 4G)
      • again force the VM to recognize that the size of extends.vmdk has changed by issuing echo 1 > rescan in the appropriate folder (see above)
      • again create a 3.5G partition /dev/sdb6
      • again force the VM to recognize this new partition by running partprobe -s
      • check with "lvscan" that the current swap LV is still 500M, and note the name of the swap logical volume
      • swapoff /dev/rhel/swap
      • check with "top" that swap is disabled
      • pvcreate /dev/sdb6
      • vgextend rhel /dev/sdb6
      • lvextend -l +100%FREE /dev/rhel/swap /dev/sdb6 (a size such as "-l +100%FREE" is required; no need for "-r" – swap does not use an xfs or ext4 filesystem)
      • mkswap /dev/rhel/swap
      • swapon /dev/rhel/swap
      • use "top" to check that the swap size has changed from 0.5G to 4G (or use the quick checks sketched below)
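      • sketch: if you prefer not to rely on "top", a couple of other quick checks:
        cat /proc/swaps   # the swap device and its size
        free -h           # total swap should now show about 4G
        lvs rhel/swap     # the swap LV itself should be 4G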
    11. now we need to create an elastic directory for data with an initial size of 100G (data disks always tend to grow and grow, so the elastic LVM disks we are discussing are exactly what is needed; it is always better to have a dedicated vmdk for such a purpose, i.e. for the LOB apps – that is why it is named apps.vmdk):
      • almost the same as above, except that we will create a new VG and LV from scratch on the third vmdk, apps.vmdk:
      • increase the size of apps.vmdk up to 100G from the VMware side
      • again force the VM to recognize that the size of apps.vmdk has changed by issuing echo 1 > rescan in the appropriate folder (see above; this is the third disk, SCSI ID 0:2)
      • again create a 100G partition /dev/sdc1 with cfdisk /dev/sdc
      • again force the VM to recognize this new partition by running partprobe -s
      • pvcreate /dev/sdc1
      • vgcreate extra /dev/sdc1
      • lvcreate -l 100%FREE -n apps extra (use all free space of “extra” VG to create LV with name “apps”)
      • mkfs.ext4 /dev/extra/apps
      • run lvscan to check that the newly created LV is 100G
      • now, if necessary, add an /etc/fstab record to mount this 100G volume on the mount point /data: "/dev/extra/apps /data ext4 defaults 0 0" (see the sketch below for creating and mounting /data)
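      • sketch: the mount point itself still has to be created and mounted (assuming /data does not exist yet):
        mkdir -p /data   # create the mount point
        mount /data      # mount it using the new /etc/fstab record
        df -h /data      # verify that about 100G is available under /data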
    12. PS:
      • LVM volumes extended this way act like a JBOD: no redundancy and no striping, so not especially reliable on their own, but VMware datastores are usually backed by SAN LUNs (which are already protected at the physical level by RAID). If you need to, you can also use mirrored or striped LVM volumes. When the number of extensions exceeds a few dozen, I think it is better to create one big disk, migrate the multiple expanded pieces onto it and then remove the old ones (this should be done very carefully; see the pvmove sketch at the end of this post).
      • in a virtualized environment I would also recommend placing all three vmdks in the same folder on the same VMware datastore – this way there is less risk of accidentally removing extends.vmdk (if you spread the vmdk files of one LVM logical volume around, the risk of damaging such an LV is much higher).
      • it is possible to skip partitions entirely – for example pvcreate /dev/sdc directly (no partition created and no "partprobe -s" needed). But in that case you must be very careful never to create a partition on such a vmdk later (to fdisk it will look empty and unused) !!! Another consideration: such a vmdk can be used by only one VG.
      • shrinking through LVM is also possible, but in the majority of cases you need to run fsck and resize the filesystem offline, and for that you need to umount the volume – in other words the LOB application has to be stopped for maintenance 🙁 (and an XFS filesystem cannot be shrunk at all)
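
For the consolidation mentioned in the PS (replacing a long chain of small extensions with one big disk), LVM's pvmove can migrate the data online. A rough sketch, assuming the new big disk shows up as /dev/sdd with a single partition /dev/sdd1 (the names are illustrative only, and as noted above this should be done very carefully):

    pvcreate /dev/sdd1            # turn the new big partition into a PV
    vgextend rhel /dev/sdd1       # add it to the existing VG
    pvmove /dev/sdb5 /dev/sdd1    # move all extents off one of the old small PVs (repeat for sdb6, sdb7, ...)
    vgreduce rhel /dev/sdb5       # remove the emptied PV from the VG
    pvremove /dev/sdb5            # wipe its LVM label; the old partition/vmdk can then be removed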