Monday, August 5, 2013

Storage Distributed Resource Scheduler (SDRS) algorithms and metrics

Storage Distributed Resource Scheduler (SDRS) provides initial placement of virtual machine disks and load balancing recommendations based on datastore latency and capacity.

Initial Placement

The initial placement of a virtual machine disk is computed based on space utilization and I/O load. When a datastore cluster is selected for one of the following scenarios, an initial placement is triggered:

  1. A virtual machine is created

  2. A virtual machine is cloned

  3. A virtual machine is migrated to a new datastore cluster

  4. A virtual machine is assigned a new disk


The intent of initial placement of a virtual machine disk is to reduce administrative complexity and ensure datastore performance. The administrator still has to select an appropriate datastore cluster, but SDRS removes the burden of manually calculating datastore I/O load and storage capacity. Based on the datastore cluster selected, SDRS will choose the most appropriate location to prevent cluster imbalance.

Profile Driven Storage

Choosing a datastore cluster becomes easier still when SDRS is coupled with profile driven storage. With the VM Storage Profiles feature in vCenter, new to vSphere 5, the administrator may narrow the list of available datastore clusters by selecting a pre-defined profile when establishing a location for a new virtual machine. A common example of storage tiering (profiles) is shown below:

  1. Platinum – RAID 1, Enterprise Flash Drives

  2. Gold – RAID 10, 15K FC

  3. Silver – RAID 5, 10K FC

  4. Bronze – RAID 5, 7K SATA


Before going any further, I would like to point out that although you may easily select a datastore cluster based on a storage tier and SDRS may move the location of a virtual machine’s disk, it is still in the very best interest of your organization to keep a record of static virtual machine settings. Documentation and change control are key to establishing, maintaining, and supporting a healthy IT environment. /end rant

Datastore clusters

Datastore clusters were introduced in vSphere 5. A datastore cluster represents an aggregate of datastores. SDRS is automatically enabled upon the creation of a datastore cluster.

When designing a datastore cluster, you should ideally group disks with similar characteristics and take advantage of vSphere Storage APIs for Storage Awareness (VASA) features if they are available for your storage arrays. Although it is technically feasible to have datastores with different storage characteristics as members of a datastore cluster, doing so is not in the best interest of the datastore cluster’s or your virtual machines’ performance. Note that VMFS and NFS datastores cannot be part of the same datastore cluster.

Datastore cluster SDRS automation modes

A datastore cluster may be configured to operate in one of two modes:

  1. No Automation (Manual Mode) – Initial placement and migration recommendations are provided but it is the administrator’s responsibility to review and take action for each recommendation. This is the default mode after creating a datastore cluster.

  2. Fully Automated – Migration recommendations are executed automatically. Initial placement still requires administrator approval.




Virtual machine SDRS modes and operations

When the automation level is adjusted for individual virtual machines, the following options apply:


  1. Default (Manual) – Initial placement and migration recommendations are made but are not executed without administrator approval.

  2. Fully Automated – Initial placement and migrations occur automatically.

  3. Disabled – SDRS initial placement and migration recommendations are disabled. The resources in use by the virtual machine are still considered in the overall assessment of a datastore cluster. When SDRS is disabled, all settings relative to automation level, rules, thresholds and aggressiveness are saved until SDRS is reactivated.
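The three per-VM options above can be sketched as a small decision function. This is only an illustration of the behavior described in this post: the mode strings and the helper names `handle_recommendation` and `cluster_used_gb` are my own and are not part of any vSphere API.

```python
from typing import Dict, List


def handle_recommendation(vm_mode: str) -> str:
    """Describe what happens to an SDRS recommendation for one VM.

    vm_mode is a hypothetical label: "manual", "fully_automated" or "disabled".
    """
    if vm_mode == "disabled":
        return "not generated"
    if vm_mode == "fully_automated":
        return "applied automatically"
    # "manual" is the default: recommendations wait for the administrator.
    return "awaiting administrator approval"


def cluster_used_gb(vms: List[Dict]) -> float:
    """Sum space used by every VM, including those with SDRS disabled."""
    return sum(vm["used_gb"] for vm in vms)


vms = [
    {"name": "web01", "mode": "manual", "used_gb": 100},
    {"name": "db01", "mode": "disabled", "used_gb": 250},
]
for vm in vms:
    print(vm["name"], "->", handle_recommendation(vm["mode"]))
print("Space counted for the cluster (GB):", cluster_used_gb(vms))
```

Note that the disabled virtual machine receives no recommendations, yet its disk usage is still counted toward the datastore cluster total.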



Storage DRS Thresholds 




  1. Utilized Space – An adjustable value, initially set at 80%. When a datastore’s space utilization exceeds this threshold, SDRS will make recommendations.

  2. I/O Latency – The default value is 15 milliseconds. Adjust this value to reflect the type of disks used by the array that backs your datastores, and consult your storage vendor for best practices before changing it. When the 90th-percentile I/O latency measured over the day exceeds this threshold, SDRS will make recommendations.
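To make the two thresholds concrete, here is a minimal sketch that checks one datastore against both. It assumes a nearest-rank 90th percentile over a list of latency samples; the function names and the sample data are hypothetical, and the real SDRS calculation is more involved than this.

```python
from typing import List


def percentile_90(samples_ms: List[float]) -> float:
    """Nearest-rank 90th percentile of a non-empty list of samples."""
    ordered = sorted(samples_ms)
    rank = (9 * len(ordered) + 9) // 10  # ceil(0.9 * n) in integer arithmetic
    return ordered[rank - 1]


def exceeded_thresholds(
    used_gb: float,
    capacity_gb: float,
    latency_samples_ms: List[float],
    space_threshold_pct: float = 80.0,   # default from this post
    latency_threshold_ms: float = 15.0,  # default from this post
) -> List[str]:
    """Return which thresholds ("space", "latency") the datastore exceeds."""
    reasons = []
    if used_gb / capacity_gb * 100 > space_threshold_pct:
        reasons.append("space")
    if latency_samples_ms and percentile_90(latency_samples_ms) > latency_threshold_ms:
        reasons.append("latency")
    return reasons


# 85% full, but latency is low all day: only the space threshold is exceeded.
print(exceeded_thresholds(850, 1000, [5.0] * 10))
# 70% full, but 2 of 10 samples are at 20 ms: only the latency threshold is exceeded.
print(exceeded_thresholds(700, 1000, [5.0] * 8 + [20.0] * 2))
```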


Advanced Options





  1. Evaluate I/O load every – This value sets the interval at which SDRS is invoked and may be adjusted from 60 minutes to 30 days. By default, SDRS load balancing algorithms are invoked at 8-hour intervals.

  2. No recommendations until utilization difference between source and destination is – This setting ensures that there is a minimum difference in space utilization between the source and target datastores. For example, suppose the datastore cluster utilization threshold is set at 80% and datastore ‘A’ exceeds it at 81%. SDRS will consider migrating to datastore ‘B’ only if the utilization difference between the two is at least the value of this setting. If the setting is 5 (the default) and datastore ‘B’ is at 77%, the difference of 4 is not great enough to trigger the migration. However, if datastore ‘C’ is at 76%, the difference of 5 makes datastore ‘C’ a viable migration target.

  3. I/O imbalance threshold – This setting is adjustable between conservative and aggressive. A conservative setting will only generate recommendations that would greatly improve the datastore cluster balance. An aggressive setting will generate recommendations for even the smallest benefit.
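The utilization difference example from option 2 can be worked through in a few lines. This sketch assumes the comparison is inclusive (a difference equal to the setting qualifies), which is what the datastore ‘C’ case implies; the function names are hypothetical.

```python
from typing import Dict, List


def is_viable_target(source_pct: float, target_pct: float,
                     min_difference_pct: float = 5.0) -> bool:
    """True if the target is at least min_difference_pct less utilized."""
    return source_pct - target_pct >= min_difference_pct


def viable_targets(source_pct: float, candidates: Dict[str, float],
                   space_threshold_pct: float = 80.0,
                   min_difference_pct: float = 5.0) -> List[str]:
    """List candidate datastores that qualify as space migration targets."""
    if source_pct <= space_threshold_pct:
        return []  # the source has not exceeded the utilized space threshold
    return [name for name, pct in candidates.items()
            if is_viable_target(source_pct, pct, min_difference_pct)]


# Datastore A is at 81%; B is at 77% (difference 4), C is at 76% (difference 5).
print(viable_targets(81.0, {"B": 77.0, "C": 76.0}))
```

With the defaults, only datastore ‘C’ is returned, matching the example above.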





I/O Metric Inclusion

If this option is disabled, I/O metrics will not be considered for any SDRS recommendations. All calculations will be based on space utilization.


SDRS Rules

SDRS honors anti-affinity rules both during initial placement and when generating migration recommendations, and it also makes recommendations to correct any rule that is violated. You can set up anti-affinity rules to prevent two virtual disks from residing on the same datastore. SDRS anti-affinity rules apply only to the datastore cluster to which they are assigned. If a datastore is moved out of the cluster, the rule no longer applies.

SDRS supports the following rule types:

  1. Inter-VM Anti-Affinity Rules – This prevents virtual machines from residing on the same datastore.

  2. Intra-VM Anti-Affinity Rules – This prevents the virtual disks of a single virtual machine from residing on the same datastore. For example, if a virtual machine has two disks, you may want them to reside on different datastores.

  3. VMDK Affinity – By default, a virtual machine’s virtual disks are all kept on the same datastore. This may be overridden by adjusting the Keep VMDKs together option within the datastore cluster virtual machine settings dialog.
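As a rough illustration of how an anti-affinity rule constrains placement, the sketch below checks a placement map for violations and filters the datastores a rule member may move to. The data model (a dict of item name to datastore name) and both helper names are assumptions of mine; the same check works for VMs (inter-VM rules) or for individual disks (intra-VM rules).

```python
from typing import Dict, List, Set


def violates_anti_affinity(placement: Dict[str, str], rule_members: Set[str]) -> bool:
    """True if two or more members of the rule share a datastore."""
    used = [placement[m] for m in rule_members if m in placement]
    return len(used) != len(set(used))


def allowed_datastores(item: str, rule_members: Set[str],
                       placement: Dict[str, str], datastores: List[str]) -> List[str]:
    """Datastores that do not already hold another member of the rule."""
    taken = {placement[m] for m in rule_members if m != item and m in placement}
    return [ds for ds in datastores if ds not in taken]


placement = {"db01": "ds1", "db02": "ds1", "web01": "ds2"}
rule = {"db01", "db02"}  # keep the two database VMs on different datastores

print(violates_anti_affinity(placement, rule))
print(allowed_datastores("db02", rule, placement, ["ds1", "ds2", "ds3"]))
```

Here the rule is violated because both database VMs sit on ds1, and db02 may be moved to ds2 or ds3 to correct it.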


Load balancing assessments


The basis of SDRS recommendations is the consideration of both space utilization and I/O load analysis.

SDRS collects space utilization statistics for the datastores within a datastore cluster every two hours and compares them with the space utilization threshold. This assessment is repeated for all datastores within the datastore cluster prior to making a recommendation. When making migration recommendations based on space utilization, SDRS favors moving virtual machines that are powered off over those that are powered on.
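The preference for powered-off virtual machines can be pictured as a simple ordering of migration candidates. This is a deliberately simplified sketch with a made-up data model; the actual candidate selection in SDRS weighs more than power state.

```python
from typing import Dict, List


def order_space_candidates(vms: List[Dict]) -> List[Dict]:
    """Put powered-off VMs first; keep the original order otherwise."""
    # False sorts before True, and Python's sort is stable.
    return sorted(vms, key=lambda vm: vm["powered_on"])


vms = [
    {"name": "web01", "powered_on": True},
    {"name": "old-test", "powered_on": False},
    {"name": "db01", "powered_on": True},
]
print([vm["name"] for vm in order_space_candidates(vms)])
```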

SDRS analyzes historical statistics covering the previous 24 hours of I/O load at 8-hour intervals. The interval is controlled by the advanced setting Evaluate I/O load every, which defaults to 8 hours; you may adjust it, although this is not recommended. Historical performance statistics, together with an assessment of the workload capabilities of each datastore, are baselined by an algorithm that produces a normalized load (a standard deviation) for each datastore. This value is ultimately compared to the I/O latency threshold defined for SDRS. If the normalized load exceeds this threshold, a cost-benefit analysis is conducted prior to making any recommendations.

Any recommendations that are not acted upon expire at the next scheduled assessment.

SDRS issues migration recommendations for the following events:

  1. Space utilization thresholds have been exceeded on a datastore

  2. I/O response time thresholds have been exceeded on a datastore

  3. A significant imbalance of capacity among datastores

  4. A significant imbalance of I/O among datastores


An SDRS assessment is triggered for the following events:

  1. When SDRS is manually executed

  2. During initial placement events

  3. When a datastore is added to a datastore cluster

  4. When a datastore is changed to maintenance mode

  5. When the SDRS configuration is updated

  6. When a threshold is exceeded

  7. At the defined interval (Default is 8 hours)




Storage I/O Control (SIOC) and SDRS

Both SIOC and SDRS have latency thresholds. The SDRS latency threshold should be set lower than the SIOC latency threshold. SIOC throttles I/O during times of contention, whereas SDRS’s role is to avoid contention in the first place. If resources are available, it is better to rebalance the cluster than to throttle the workload.

It should be noted that the calculation for measuring latency is different for SIOC than SDRS. The SIOC latency threshold only considers device latency whereas SDRS considers device latency and queue latency.
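A tiny sketch of both points: the two features look at different latency figures, and the SDRS threshold should sit below the SIOC one. The function names are hypothetical, and the 30 ms SIOC threshold used in the example is an assumed value for illustration only, so check what your environment is actually set to.

```python
def sioc_observed_latency(device_ms: float, queue_ms: float) -> float:
    """SIOC, as described above, looks at device latency only."""
    return device_ms


def sdrs_observed_latency(device_ms: float, queue_ms: float) -> float:
    """SDRS, as described above, looks at device plus queue latency."""
    return device_ms + queue_ms


def thresholds_ordered(sdrs_threshold_ms: float, sioc_threshold_ms: float) -> bool:
    """True if SDRS would act before SIOC starts throttling."""
    return sdrs_threshold_ms < sioc_threshold_ms


device_ms, queue_ms = 10.0, 4.0
print("SIOC sees:", sioc_observed_latency(device_ms, queue_ms), "ms")
print("SDRS sees:", sdrs_observed_latency(device_ms, queue_ms), "ms")
# 15 ms is the SDRS default from this post; 30 ms for SIOC is an assumption.
print("Thresholds ordered correctly:", thresholds_ordered(15.0, 30.0))
```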


Maintenance Mode

When a datastore is placed in maintenance mode, SDRS generates recommendations only for registered virtual machines residing on that datastore. It is the administrator’s responsibility to determine whether orphaned, unregistered, or other files reside on the datastore and to manually take the appropriate action to preserve that data. Furthermore, existing anti-affinity rules may prevent a datastore from entering maintenance mode. You may disable Storage DRS rules for maintenance mode by setting the Ignore Affinity Rules for Maintenance option.


Scheduling SDRS

vCenter contains a feature for scheduling SDRS activity. For example, you may only want SDRS migrations to occur during off-peak hours and recommendations to occur during on-peak hours. Another example would be the need to disable migrations during backups. You can schedule the adjustment of many SDRS settings:



  1. Automation level

  2. Inclusion of I/O metrics for Storage DRS Recommendation

  3. Utilized space threshold

  4. I/O latency threshold

  5. I/O imbalance threshold
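The off-peak example can be sketched as a function that picks an automation level from the time of day. This only illustrates the idea and is not how vCenter scheduled tasks are defined; the window of 22:00 to 06:00 and the level names are made up for the example.

```python
from datetime import time


def automation_level_at(now: time,
                        off_peak_start: time = time(22, 0),  # hypothetical window
                        off_peak_end: time = time(6, 0)) -> str:
    """Return "fully_automated" during the off-peak window, else "manual"."""
    if off_peak_start <= off_peak_end:
        in_window = off_peak_start <= now < off_peak_end
    else:
        # The window wraps past midnight, e.g. 22:00 to 06:00.
        in_window = now >= off_peak_start or now < off_peak_end
    return "fully_automated" if in_window else "manual"


print(automation_level_at(time(23, 0)))  # inside the off-peak window
print(automation_level_at(time(12, 0)))  # during the business day
```

During the day SDRS would only make recommendations, and the migrations themselves would run automatically overnight.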



 
