How to Test Hard Disk with windows Performance Counter
PhysicalDisk:
Single IO size
Avg. Disk Bytes/Read
Avg. Disk Bytes/Write
IO response time
Avg. Disk sec/Read
Avg. Disk sec/Write
IOPS
Disk Reads/sec
Disk Writes/sec
Disk Transfers/sec
IO throughput
Disk Bytes/sec
Disk Read Bytes/sec
Disk Write Bytes/sec
A disk has two important latency parameters: seek time and rotational delay.
The normal I/O count is given by formula ①: 1000 / (seek time + rotational delay) × 0.75; an observed I/O count within this range is normal, and once it reaches 85% or more of this value an I/O bottleneck can basically be assumed. Theoretically, a single disk sustains about 125 random reads per second and 225 sequential reads per second. Data files are read and written randomly while log files are read and written sequentially, so it is recommended to place data files on RAID 5 and log files on RAID 10 or RAID 1.
Attached:
15000 rpm: 150 random IOPS.
10000 rpm: 110 random IOPS.
5400 rpm: 50 random IOPS
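Formula ① is easy to sketch in Python. The 4 ms average seek time below is an assumed figure chosen so the result lands on the theoretical 125 random IOPS quoted above, and the function names are mine:

```python
def rotational_delay_ms(rpm: int) -> float:
    """Average rotational delay: half a revolution, in milliseconds."""
    return 60_000 / rpm / 2

def normal_io_count(seek_ms: float, rpm: int) -> float:
    """Formula (1): 1000 / (seek time + rotational delay) * 0.75."""
    return 0.75 * 1000 / (seek_ms + rotational_delay_ms(rpm))

# Assumed 4 ms average seek on a 15,000 RPM disk (rotational delay 2 ms):
print(normal_io_count(4.0, 15_000))   # 125.0
```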
Suppose we observe the following PhysicalDisk counter values on a RAID 5 array of 4 disks:
Avg. Disk Queue Length: 12 (queue length)
Avg. Disk sec/Read: 0.035 (time taken to read data, i.e. 35 ms)
Avg. Disk sec/Write: 0.045 (time taken to write data, i.e. 45 ms)
Disk Reads/sec: 320
Disk Writes/sec: 100
Avg. Disk Queue Length: 12/4 = 3 per disk; the recommended average queue per disk is no more than 2.
Avg. Disk sec/Read should generally not exceed 11-15 ms.
Avg. Disk sec/Write is usually recommended to be less than 12 ms.
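As a rough sketch, the three guideline checks above can be coded directly; the function name and warning strings are mine, and the thresholds are the ones just listed:

```python
def check_physical_disk(queue_len: float, sec_read: float,
                        sec_write: float, spindles: int) -> list[str]:
    """Flag PhysicalDisk counter values that exceed the usual guidelines."""
    warnings = []
    if queue_len / spindles > 2:
        warnings.append(f"avg queue per disk {queue_len / spindles:.1f} > 2")
    if sec_read > 0.015:     # Avg. Disk sec/Read above ~15 ms
        warnings.append(f"read latency {sec_read * 1000:.0f} ms > 15 ms")
    if sec_write > 0.012:    # Avg. Disk sec/Write above ~12 ms
        warnings.append(f"write latency {sec_write * 1000:.0f} ms > 12 ms")
    return warnings

# The 4-disk RAID 5 example above: queue 12, 35 ms reads, 45 ms writes.
for w in check_physical_disk(12, 0.035, 0.045, 4):
    print(w)
```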
From these results we can see that the disks themselves meet our I/O requirements; the queueing comes from a flood of requests, most likely large table scans triggered by your SQL statements. If, after optimization, the requirements still cannot be met, the following formulas help you calculate how many disks are needed to satisfy such concurrency:
RAID 0: I/Os per disk = (reads + writes) / number of disks
RAID 1: I/Os per disk = [reads + (2 × writes)] / 2
RAID 5: I/Os per disk = [reads + (4 × writes)] / number of disks
RAID 10: I/Os per disk = [reads + (2 × writes)] / number of disks
Our result for RAID 5 is: (320 + 4 × 100) / 4 = 720 / 4 = 180 I/Os per disk. Now use formula ① to obtain the normal I/O value of a single disk; suppose the normal I/O count is 125. To carry the total load: 720 / 125 = 5.76. In other words, six disks are needed to meet this requirement.
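The RAID 5 sizing arithmetic above can be checked with a short script (the function names are mine; the per-disk formula is the RAID 5 one just listed, and 125 is the assumed normal I/O value):

```python
import math

def raid5_per_disk_io(reads: float, writes: float, disks: int) -> float:
    """RAID 5: each logical write costs 4 back-end IOs (read data, read
    parity, write data, write parity)."""
    return (reads + 4 * writes) / disks

def raid5_disks_needed(reads: float, writes: float,
                       per_disk_limit: float) -> int:
    """Smallest disk count whose per-disk load fits under the limit."""
    return math.ceil((reads + 4 * writes) / per_disk_limit)

print(raid5_per_disk_io(320, 100, 4))     # 180.0 -- over the 125 limit
print(raid5_disks_needed(320, 100, 125))  # 6
```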
However, Disk Reads/sec and Disk Writes/sec are difficult to estimate accurately in advance, so we can only take an average measured while the system is busy as the basis for the formula. Likewise, it is hard to obtain the seek time and rotational latency figures from customers, so you may have to fall back on the theoretical value of 125.
Preface
As a database administrator, watching system performance is one of the most important daily tasks, and of all the aspects of performance, IO performance causes the most headaches. Faced with all kinds of unfamiliar parameters and dazzling new terms, plus the hype from storage vendors, we easily feel lost. This series of articles attempts to summarize the concepts related to disk storage from the ground up, so that everyone can gain a more complete understanding of the basic concepts of IO performance and of how to monitor and tune it.
In this part we set aside storage systems with complex structures and study the performance of a single disk directly, in order to understand the various metrics that measure IO system performance and how they relate to one another.
Several basic concepts
Before studying disk performance, we must first understand the structure and working principle of the disk. But I won't repeat the explanation here. For the structure and working principle of hard disk, you can refer to the related entries in Wikipedia-hard disk (English) and hard disk (Chinese).
IO (read/write IO operations)
Disks store and retrieve data for us, so every IO operation comes in two flavors: a write IO when storing data and a read IO when fetching data.
Single IO operation
When the controller that manages the disk receives a read IO instruction from the operating system, it sends the read command and the address of the data block to the disk; the disk returns the data to the controller, and the controller hands it back to the operating system, completing one read IO. A write IO is similar: the controller receives the write instruction together with the data to be written and sends both to the disk; after the data is written, the disk reports the result to the controller, which returns it to the operating system, completing one write IO. A single IO operation means completing one such read or write.
Random access and sequential access.
Random access means that the sector address of this IO differs greatly from the sector address of the previous IO, so the head must make a large move between the two operations before it can start reading or writing again. Conversely, if the sector address of this IO is the same as, or close to, the address where the previous IO ended, the head can begin this IO quickly; such a series of IOs is called sequential access. Therefore, even if two adjacent IOs are issued at the same moment, if the sector addresses they request are far apart, they can only be called random access, not sequential access.
Sequential IO mode/concurrent IO mode.
A disk controller can issue a series of IO commands to a disk group at a time. If a disk group can only execute one IO command at a time, it is called sequential IO. When a disk group can execute multiple IO commands at the same time, it is called concurrent IO. Concurrent IO can only occur on a disk group composed of multiple disks, and a single disk can only process one IO command at a time.
The size of a single IO (IO block size).
Anyone familiar with databases knows that database storage has a basic block size; the default for both SQL Server and Oracle is 8 KB, that is, every database read or write is 8 KB. So what happens when the application's fixed 8 KB reads and writes reach the disk level? In other words, what is the data size of a single IO to the disk, and is it also a fixed value? The answer is: not necessarily. First, the operating system introduces the file system cache to improve IO performance; the system gathers multiple IO requests in the cache and then submits them to the disk together, so several 8 KB block reads issued by the database may be handled by one disk read IO. Some storage systems also have their own cache, which merges multiple IO requests from the operating system into one for processing. Whether at the operating system level or the disk controller level, the cache has a single purpose: to improve the efficiency of reading and writing data. Consequently the size of each single IO varies; it depends mainly on the system's own judgment about read/write efficiency.
When the data involved in an IO operation is relatively small, we call it a small IO, e.g. 1 KB, 4 KB, 8 KB; when it is relatively large, we call it a large IO, e.g. 32 KB, 64 KB or more.
When we talk about block size, we run into many similar concepts. For example, the smallest unit of data management in the database mentioned above: Oracle calls it a block, with a size of 8 KB, and SQL Server calls it a page, generally also 8 KB. In the file system we also encounter the file system block, which is 4 KB on many Linux systems. Its role is the same as the database block/page: to make data management easier. But the size of a single IO is not directly related to the size of these blocks; in English, the size of a single IO is usually called the IO Chunk Size, not the IO Block Size.
IOPS (IO per second)
IOPS, that is, the number of IO operations performed by the IO system per second, is an important parameter to measure the IO capability of the system. For an IO system composed of a single disk, it is not difficult to calculate its IOPS. As long as we know the time required for the system to complete an IO, we can calculate the system IOPS.
Now let's calculate the IOPS of a single disk. Assume the disk spins at 15,000 RPM, the average seek time is 5 ms, and the maximum transfer rate is 40 MB/s (here we treat read and write speeds as the same, though in practice they differ considerably).
For the disk, a complete IO operation proceeds as follows: when the controller sends the IO command to the disk, the disk's actuator arm moves the read/write head away from the landing zone (a data-free zone in the inner ring) to directly above the track holding the first data block to be accessed. This process is called seeking, and the time it consumes is the seek time. Once the right track is found, the data still cannot be read immediately; the head must wait until the platter rotates to the sector holding the first data block, and the time spent waiting for the platter to rotate into position is called the rotational delay. Then, as the platter rotates, the head keeps reading/writing data blocks until all the data required by this IO has been handled; this step is called data transfer, and its time the transfer time. With these three steps done, one IO operation is complete.
On hard disk manufacturers' spec sheets we usually find three parameters, namely average seek time, spindle speed, and maximum transfer rate, which give us what we need to compute the time for each of the three steps above.
First, the seek time: since the data to be read or written may lie on any track, from the innermost ring of the platter (shortest seek) to the outermost ring (longest seek), we use the average seek time quoted in the disk's parameters in our calculation; here we take the 5 ms assumed above.
Second, the rotational delay: as with seeking, after the head reaches the track, the desired sector may happen to be right under the head, in which case data can be read or written immediately with no extra delay; but in the worst case the platter must make nearly a full rotation before the head can read the data. So here too we use the average rotational delay, which for a 15,000 RPM disk is (60 s / 15000) / 2 = 2 ms.
Third, the transfer time: the disk parameters give us the maximum transfer rate. Actual transfers rarely reach that speed, but it is the rate at which the disk reads and writes data, so given the size of a single IO we know how long the transfer takes: IO Chunk Size / Max Transfer Rate.
Putting these together gives a formula for the single IO time:
IO time = seek time + 60 s / rotational speed / 2 + IO chunk size / transfer rate
We can calculate IOPS like this.
IOPS = 1 / IO time = 1 / (seek time + 60 s / rotational speed / 2 + IO chunk size / transfer rate)
For different IO sizes, we can get the following series of data.
4K (1/7.1 ms = 140 IOPS)
5 ms + (60s/15000rpm/2) + 4K/40MB = 5 + 2 + 0.1 = 7.1
8K (1/7.2 ms = 139 IOPS)
5 ms + (60s/15000rpm/2) + 8K/40MB = 5 + 2 + 0.2 = 7.2
16K (1/7.4 ms = 135 IOPS)
5 ms + (60s/15000rpm/2) + 16K/40MB = 5 + 2 + 0.4 = 7.4
32K (1/7.8 ms = 128 IOPS)
5 ms + (60s/15000rpm/2) + 32K/40MB = 5 + 2 + 0.8 = 7.8
64K (1/8.6 ms = 116 IOPS)
5 ms + (60s/15000rpm/2) + 64K/40MB = 5 + 2 + 1.6 = 8.6
As can be seen from the above data, the smaller a single IO, the less time it takes for a single IO, and the larger the corresponding IOPS.
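The series above can be reproduced with the single-IO-time formula. The defaults below encode the stated assumptions (5 ms seek, 15,000 RPM, 40 MB/s, with 1 MB taken as 1000 KB); small rounding differences from the listed IOPS are possible:

```python
def io_time_ms(chunk_kb: float, seek_ms: float = 5.0,
               rpm: int = 15_000, rate_mb_s: float = 40.0) -> float:
    """Single IO time = seek time + average rotational delay + transfer time."""
    rotation_ms = 60_000 / rpm / 2      # half a revolution on average
    transfer_ms = chunk_kb / rate_mb_s  # with 1 MB taken as 1000 KB
    return seek_ms + rotation_ms + transfer_ms

for kb in (4, 8, 16, 32, 64):
    t = io_time_ms(kb)
    print(f"{kb}K: {t:.1f} ms -> {1000 / t:.0f} IOPS")
```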
All of the figures above were obtained under an idealized assumption, namely that every IO pays the average seek time and the average rotational delay. This actually matches random access fairly well: in random reads and writes, the seek time and rotational delay of each IO cannot be ignored, and these two components are what limit IOPS. Now consider the opposite extreme, a purely sequential workload, such as reading a large file stored contiguously on disk. Because the file's blocks are contiguous, the head needs no new seek and suffers no rotational delay after finishing one read IO, and in that case we get much larger IOPS values, as shown below.
4K (1/0.1 ms = 10000 IOPS)
0 ms + 0 ms + 4K/40MB = 0.1
8K (1/0.2 ms = 5000 IOPS)
0 ms + 0 ms + 8K/40MB = 0.2
16K (1/0.4 ms = 2500 IOPS)
0 ms + 0 ms + 16K/40MB = 0.4
32K (1/0.8 ms = 1250 IOPS)
0 ms + 0 ms + 32K/40MB = 0.8
64K (1/1.6 ms = 625 IOPS)
0 ms + 0 ms + 64K/40MB = 1.6
Compared with the first set of data, the gap is huge. Therefore, when using IOPS to measure an IO system's capability, we must be clear about the conditions under which it was obtained, that is, the access pattern and the size of a single IO. In practice, especially for OLTP systems, random small-IO IOPS is the most meaningful measure.
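A sketch of the sequential case: with seek and rotational delay set to zero, the single-IO time reduces to the pure transfer time (same 40 MB/s assumption as above):

```python
def sequential_iops(chunk_kb: float, rate_mb_s: float = 40.0) -> float:
    """With no seek and no rotational delay, IO time is just the transfer."""
    transfer_ms = chunk_kb / rate_mb_s  # 1 MB taken as 1000 KB
    return 1000 / transfer_ms

for kb in (4, 8, 16, 32, 64):
    print(f"{kb}K: {sequential_iops(kb):.0f} IOPS")
```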
Transfer Rate / Throughput
The transfer rate we discuss now (another popular name is throughput) is not the maximum or ideal transfer rate printed on the disk, but the amount of data that actually flows across the disk system bus per unit time in real use. With the IOPS figures we can easily compute the corresponding transfer rate.
Transfer rate = IOPS × IO chunk size
Using the first set of IOPS figures above, the corresponding transfer rates and bus utilization are:
4K: 140 × 4K = 560K / 40M = 1.37%
8K: 139 × 8K = 1112K / 40M = 2.71%
16K: 135 × 16K = 2160K / 40M = 5.27%
32K: 128 × 32K = 4096K / 40M = 10.0%
64K: 116 × 64K = 7424K / 40M = 18.1%
As you can see, the actual transfer rate is tiny, and so is the bus utilization.
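Since transfer rate = IOPS × IO chunk size, the utilization percentages above come from one line of arithmetic (here 1 MB = 1024 KB, matching the list; the function name is mine):

```python
def bus_utilization(iops: float, chunk_kb: float,
                    max_rate_mb_s: float = 40.0) -> float:
    """Percentage of the disk's maximum transfer rate actually used."""
    throughput_kb_s = iops * chunk_kb           # transfer rate in KB/s
    return throughput_kb_s / (max_rate_mb_s * 1024) * 100

print(f"{bus_utilization(140, 4):.2f}%")   # 1.37% -- 560 KB/s out of 40 MB/s
print(f"{bus_utilization(139, 8):.2f}%")   # 2.71%
```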
One thing must be made clear: although we used IOPS to compute the transfer rate, there is actually no direct relationship between the two; absent caching, both are determined by the access pattern and the size of a single IO. IOPS is a good measure of a disk system's performance under random access, where the transfer rate will never be high; under sequential access, IOPS has little reference value, and what limits the actual transfer rate is the disk's maximum transfer rate. Therefore, in practice, IOPS is used only to measure random small-IO performance, while for large sequential IO the transfer rate, not IOPS, should be used.
IO Response Time
Finally, let's look at IO response time, which describes IO performance most directly. IO response time is also called IO latency. It is the time from when the operating system kernel issues a read or write IO command until the kernel receives the IO response. Be careful not to confuse it with the single IO time: the single IO time covers only the disk's internal processing of the operation, while the IO response time also includes the time the IO spends waiting in the IO queue.
The time an IO spends in the queue can be computed with the M/M/1 queuing model, which derives from Little's Law. Owing to the complexity of the queuing theory involved, I have not fully mastered it yet (guidance from anyone proficient in the M/M/1 model is welcome); below I only list the final results, again based on the figures computed above.
8K IO chunk size (139 IOPS, 7.2 ms)
135 IOPS => 240.0 ms
105 IOPS => 29.5 ms
75 IOPS => 15.7 ms
45 IOPS => 10.6 ms
64K IO chunk size (116 IOPS, 8.6 ms)
135 IOPS => no response...
105 IOPS => 88.6 ms
75 IOPS => 24.6 ms
45 IOPS => 14.6 ms
As the data shows, IO response time grows nonlinearly as the system's actual IOPS approaches the theoretical maximum, exploding near the limit far beyond what one might expect. In practice, a guideline value of 70% is commonly used: while the load stays below 70% of the maximum IOPS, response time grows slowly and remains acceptable; once it exceeds 70%, response time rises sharply. So when a system's IO pressure exceeds 70% of its maximum capacity, it is time to consider tuning or upgrading.
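The listed response times are roughly what the standard M/M/1 formula R = S / (1 - rho) gives, where S is the single-IO service time and rho the utilization (arrival rate over maximum IOPS); the article's own numbers appear to carry some rounding. A minimal sketch:

```python
import math

def response_time_ms(service_ms: float, arrival_iops: float) -> float:
    """M/M/1 response time R = S / (1 - rho), with rho = lambda / mu."""
    mu = 1000 / service_ms       # max IOPS the disk can service
    rho = arrival_iops / mu      # utilization
    if rho >= 1:
        return math.inf          # queue grows without bound: "no response"
    return service_ms / (1 - rho)

for iops in (135, 105, 75, 45):
    print(f"{iops} IOPS -> {response_time_ms(7.2, iops):.1f} ms")  # 8K chunks
```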
Incidentally, the 70% guideline also applies to CPU utilization, as practice has borne out: once CPU usage exceeds 70%, the system becomes unbearably slow. An interesting coincidence.
From the calculations in the previous part, we saw that a 15,000 RPM disk delivers only about 140 IOPS under random access, yet in practice we see many storage systems rated at 5000 IOPS or even higher. Where does such a large IOPS figure come from? It comes from various storage technologies, of which cache and RAID are the most widely used. This part discusses how caching and RAID improve storage IO performance.
Cache
Among today's storage media, the speed ranking is memory > flash > disk > tape (tape has essentially disappeared), but the faster the medium, the higher the price. Although flash is developing rapidly, its price still keeps it from broad adoption, so for now disk remains king. Compared with CPU and memory speed, disk speed is undoubtedly the biggest bottleneck in a computer system. So when disks must be used and performance matters, a compromise emerged: embed some fast memory along the disk path to hold frequently accessed data, improving read/write efficiency. That embedded memory is the cache.
Caching is everywhere nowadays: from upper-layer applications, to the operating system, to the disk controller, to the CPU, and even inside the individual disk itself. All these caches exist for the same purpose, improving the system's efficiency. Here we only care about the caches related to IO performance: the file system cache, the disk controller cache, and the disk's own cache. Since the file system cache is not counted when measuring a disk system's performance, we focus on the controller cache and the disk cache.
Whether controller cache or disk cache, the functions fall into three main parts: data caching, read-ahead, and write-back.
Data caching
First, data read by the system is kept in the cache, so the next time the same data is needed it can be fetched from the cache instead of the disk. Of course, data cannot stay in the cache forever; cached data is generally managed with an LRU algorithm, which evicts data that has not been used for a long time while keeping frequently accessed data in the cache.
Read ahead
Pre-reading refers to reading data from the disk into the cache in advance when the system has no IO request, and then when the system makes an IO request, it will check whether there is any data to be read in the cache, and if there is (that is, hit), it will directly return the result. At this time, the disk does not need to address, wait for rotation, read data and other operations, which can save a lot of time. If there is no hit, issue a real command to read the disk to get the required data.
Cache hit rate depends heavily on cache size: in theory, the larger the cache, the more data it holds and the higher the hit rate. Of course the cache cannot be arbitrarily large; cost is a factor. If a very large storage system is paired with a small read cache, the problem becomes acute: the small cache holds only a tiny fraction of the total data, so under random reads (the common case for database systems) the hit rate is naturally low. Such a cache not only fails to improve efficiency (since most read IOs still go to disk) but actually wastes time checking the cache on every request.
For read IO, the fraction of requested data served from the cache is called the Read Cache Hit Ratio. Suppose a storage system without cache can deliver 150 IOPS of random small reads, and its cache provides a 10% hit rate; its effective IOPS then becomes 150 / (1 − 10%) ≈ 166.7.
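That arithmetic generalizes: if a fraction of reads is served from cache at effectively zero cost, only the misses consume disk time (a simplification that ignores cache-lookup overhead):

```python
def effective_iops(base_iops: float, hit_ratio: float) -> float:
    """Only (1 - hit_ratio) of read requests actually reach the disk."""
    return base_iops / (1 - hit_ratio)

print(round(effective_iops(150, 0.10), 1))  # 166.7
print(round(effective_iops(150, 0.50), 1))  # 300.0
```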
Write-back cache
The portion of the cache used for the write-back function is called the write cache. With the write cache enabled, a series of write IO commands issued by the operating system is not executed one by one; the commands are first written into the cache, and the accumulated changes are then flushed to disk together. This effectively merges duplicate IOs into one, merges multiple small adjacent IOs into one large IO, and turns batches of random writes into groups of sequential writes, reducing the time spent on seeking and similar operations.
Although the write cache clearly improves efficiency, it also brings a serious problem: like ordinary memory, the cache loses all its data when power is lost. A write IO is acknowledged as successful as soon as it lands in the cache, before the data has actually reached the disk; if power fails at that moment, the cached data is lost forever, which is disastrous. The usual remedy today is to back the cache with a battery so its contents survive a power failure.
Like reads, writes also have a write cache hit rate, but its meaning differs from the read cache hit rate: even on a write cache hit, the actual IO is not avoided, only deferred and merged.
Besides the functions above, the controller cache and disk cache play other roles too. For example, the disk cache holds the IO command queue: a single disk can process only one IO command at a time, but it can accept multiple commands, and the commands that have arrived but not yet been processed wait in a queue in the cache.
RAID (Redundant Array of Inexpensive Disks)
If you are a database administrator or often work with servers, RAID should be familiar. As the cheapest storage solution, RAID has long been standard in server storage. Among the RAID levels, RAID 10 and RAID 5 are the most widely used (although RAID 5 is nearing the end of its road and RAID 6 is on the rise). Below we take RAID 0, RAID 1, RAID 5, RAID 6 and RAID 10 as examples to discuss how disk arrays affect disk performance. Before reading on, make sure you are familiar with the structure and working principle of each RAID level to avoid confusion; the Wikipedia entries on RAID, standard RAID levels, and nested RAID levels are recommended.