What is S.M.A.R.T. for hard drives: decoding SMART attributes

A hard drive is a complex electromechanical device with built-in self-diagnostics that can predict its own imminent failure, which is usually a very sad event...

S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology) is a technology for assessing the state of a hard disk using its built-in self-diagnostic hardware, as well as a mechanism for predicting when it will fail.

We will not consider this technology in detail, because the subject is too broad and each drive manufacturer has its own vision and its own number of monitored parameters. Let us look at the ones that matter most in practice.

To do this, we need a program to view the monitored parameters.

In it, on the "Data storage-> SMART" tab, select the hard disk and the monitored parameters are displayed in the window:

01 Raw Read Error Rate- the number of read errors. Modern disks have a very high data storage density, so they constantly read data with errors, and the information is recovered using the error-correcting code (ECC). This parameter counts exactly those errors. Among hard drives, these non-critical errors are reported by Seagate; other manufacturers prefer to modestly keep silent about them. For Seagate drives, the state can be considered very good when the Raw Read Error Rate and Hardware ECC Recovered parameters are equal: every error that occurred was corrected by the correction code. If the values are not equal, there is no need to be alarmed. This is not a critical parameter, and the disk can live for years without any problems.

03 Spinup Time- the time needed to spin the disk up to working speed. You should only worry if the value is less than half of the initial value. There are a few nuances, though, such as how many platters the hard drive has. The current maximum is 5 platters (Hitachi), and spinning up such a stack naturally takes longer than spinning up a single platter. Nobody has repealed inertia.

04 Start/Stop Count- total number of starts/stops of the spindle. For Seagate, the number of times the spindle stops when going into power save mode.

05 Reallocated Sector Count- the number of reassigned sectors. When a drive detects a read/write error, it marks the sector as "remapped" and moves the data to a specially reserved spare area. In general, this is a frightening parameter: a value above 10 means, at the very least, that it is time to scan the entire disk surface to find out whether the process is still going on. Judging by practice, laptop disks start to suffer from reassigned sectors after about a year of use, because they work in very harsh conditions. I do not mean shocks, since most drives are more or less protected against those. The reason is temperature. A laptop case is usually poorly ventilated and the disk overheats; then we switch the laptop off and go where? That's right, outside! And it is -10 Celsius out there. It is the rate of heating and cooling that destroys the delicate magnetic layer on the platters. According to the specifications of all disk manufacturers, the so-called "temporal temperature gradient", that is, the rate of temperature change, should not exceed 20 degrees per hour while operating and 30 degrees per hour while powered off. This rule is violated all the time, but in laptops especially often and especially brutally.

09 Power-on Time Count (Power-on Hours)- the amount of time the drive has spent powered on. Modern drives usually measure it in hours (Fujitsu in seconds). Old Maxtor drives (the original Maxtor drives, not those now produced by Seagate under this brand) measure it in minutes. This is a very useful parameter when you buy a used disk and want to know how long it has worked in its life. Besides, this time usually coincides with the running time of the computer, so you can estimate how much time a person spends at the computer on average. As practice and my survey on one of the major computer hardware forums show, disks with more than 20,000 power-on hours (approximately 2.3 years of continuous operation) already have some kind of defect, for example the same "reassigned" sectors, and are not far from dying of old age. From the same manufacturer specifications you can learn that drives intended for desktop computers are not designed to run around the clock but for an 8/5 duty cycle, that is, 8 hours a day, 5 days a week. This works out to about 2,080 hours per year (8 x 5 x 52), so a 3-year warranty covers roughly 6,240 hours and a 5-year warranty roughly 10,400 hours. Not so much, considering that there are 8,760 hours in a year.
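
The duty-cycle arithmetic is easy to check with a few lines of Python. This is only a sketch: the function names are made up for illustration, and a 52-week year is an assumption.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours of continuous operation


def duty_hours_per_year(hours_per_day=8, days_per_week=5, weeks_per_year=52):
    """Hours per year for an 8/5 duty cycle (52 weeks is an assumption)."""
    return hours_per_day * days_per_week * weeks_per_year


def years_of_continuous_use(power_on_hours):
    """Convert a Power-On Hours raw value into years of 24/7 operation."""
    return power_on_hours / HOURS_PER_YEAR


per_year = duty_hours_per_year()
print("8/5 duty cycle, hours per year:", per_year)
print("3-year and 5-year budgets:", 3 * per_year, 5 * per_year)
print("20,000 hours in years of 24/7 use:",
      round(years_of_continuous_use(20000), 2))
```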

0A Spinup Retry Count- the number of retries to spin the disks up to operating speed after a failed first attempt. If the attribute value increases, mechanical or bearing damage is likely. This is very rare: modern drives use fluid dynamic bearings, and a faulty bearing of this kind either seizes immediately and for good or keeps working happily ever after. Not so long ago, Toshiba drives and, to a lesser extent, Western Digital drives suffered greatly from this. The seizure is caused by overheating.

0C Power Cycle Count- number of disk on/off cycles.

C2 Temperature- disk temperature. Unfortunately, different manufacturers place the temperature sensor in different spots on the drive, so the reading can overestimate or underestimate the real temperature. On average, as a recent Google study showed, the optimal operating temperature is in the range of 35 to 45 degrees. Operation above 50 degrees is strongly discouraged, but such temperatures and even higher ones can often be seen in laptops.

C5 Current Pending Sector Count- the number of sectors that are candidates for replacement. They have not yet been identified as bad, but reading from them differs from reading a stable sector; these are the so-called suspicious or unstable sectors. If a later read of the sector succeeds, it is removed from the list of candidates. If reads keep failing, the drive tries to recover the sector and performs a remapping operation. A non-zero value usually occurs when there are already remapped sectors on the disk. In that case it is highly probable that the disk is actively "shedding", that is, the magnetic layer of the platters is being destroyed.

C6 Offline Uncorrectable Sector Count- the number of uncorrected errors, that is, severe damage to the disk surface. Such errors appear when the spare area reserved for sector remapping runs out. They can also appear when the power is cut suddenly while the disk is writing data; these are the so-called "software bad blocks". If there are one or two of them and the other parameters describing the disk surface are normal, you should not worry. If the number is large, save your data and get ready to "carry out the body". :)

C7 Ultra ATA CRC Error Rate- number of transmission errors in the external interface. Usually the cable or poor contact of the cable with the connectors is to blame, especially on SATA drives. Occurs quite often.

C8 Write Error Rate- errors when writing to disk. Occurs rarely, usually on very old disks. If there are errors, they indicate physical wear of the hard disk drive or serious damage to the disk surface (when the number of reassigned sectors and uncorrected errors exceeds all reasonable values).

So we have briefly reviewed the main parameters of the hard drive self-diagnosis system. If you want to know more, you can refer to the Wikipedia materials on the subject.

Unfortunately, SMART cannot always predict disk death. As the same Google study showed, about 50% of disks die abruptly and for no apparent reason. But in one respect this technology is definitely useful: it lets you quickly find out the state of the disk surface, that is, the parameters:

05 Reallocated Sector Count

C5 Current Pending Sector Count

C6 Offline Uncorrectable Sector Count

And it is very useful to know the time that a disk has worked in its life in order to roughly guess what you can expect from it.
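
As a rough illustration of the surface check described above, the sketch below looks at the raw values of the three surface attributes. The limits are the article's own rules of thumb (more than 10 reallocated sectors, any pending sectors, more than "one or two" uncorrectable sectors), not manufacturer limits, and the function name and input format are hypothetical.

```python
# Rules of thumb taken from the text above, not from any vendor specification.
REALLOCATED_LIMIT = 10      # "if its value is more than 10 ..."
UNCORRECTABLE_LIMIT = 2     # "one or two" is tolerated


def surface_warnings(raw_values):
    """raw_values maps attribute ID (0x05, 0xC5, 0xC6) to its raw counter.

    Returns a list of human-readable warnings; an empty list means the
    surface looks fine by these rules of thumb.
    """
    warnings = []
    if raw_values.get(0x05, 0) > REALLOCATED_LIMIT:
        warnings.append("05 Reallocated Sector Count is high: scan the whole surface")
    if raw_values.get(0xC5, 0) > 0:
        warnings.append("C5 Current Pending Sector Count is non-zero: unstable sectors")
    if raw_values.get(0xC6, 0) > UNCORRECTABLE_LIMIT:
        warnings.append("C6 Offline Uncorrectable Sector Count is high: back up the data")
    return warnings


print(surface_warnings({0x05: 0, 0xC5: 0, 0xC6: 0}))
print(surface_warnings({0x05: 12, 0xC5: 3, 0xC6: 1}))
```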

And now a little about the future. Plenty of truly solid-state "hard drives" have already appeared on sale. They are built on flash memory chips and are much more robust with respect to both mechanical stress and temperature. However, manufacturers have not yet agreed on a standard self-diagnostic system for this type of drive. It will be much simpler than for the good old electromechanical drives and, most importantly, it will predict failure with a much higher probability, because flash memory is more predictable in this sense. Well, let's wait for this bright future!

    Modern hard disks are quite "smart" devices. In addition to their main function as data storage and processing devices, they support S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology), a technology for self-testing, state analysis and accumulation of statistics on the deterioration of their own characteristics. The foundations of S.M.A.R.T. were laid in 1995 by the joint efforts of the leading hard disk drive (HDD) manufacturers. In subsequent years the S.M.A.R.T. standards were refined to follow changes in technology and equipment (SMART II and SMART III), and they continue to evolve today.

    From the moment it is manufactured, a hard drive constantly monitors certain parameters of its condition and records them in special characteristics called attributes. They are stored in non-volatile storage, as a rule in the service area, a specially allocated part of the disk surface accessible only to the drive's internal firmware. Under the ATA (AT Attachment) specification, attribute data can be read with the SMART commands (SMART READ DATA and more than a dozen others), which are sent to the drive by special software such as utilities from equipment manufacturers or universal HDD testing and monitoring programs (udisks, smartctl, GSmartControl, gnome-disks, etc.). Modern ATA standards include support for the SCT (SMART Command Transport) protocol, which reads device statistics logs. The device statistics log is a read-only SMART log returned by the drive when it receives a READ LOG EXT, READ LOG DMA EXT or SMART READ LOG command.

    An attribute characterizes a certain aspect of the hard drive's state that changes during operation. It takes a numeric value ranging from the maximum, set when the device is manufactured, down to the minimum, at which the drive is no longer guaranteed to work. Every attribute is identified by a number, and most of them are interpreted in the same way by hard drives of different models. Some attributes may be used only by a specific hardware manufacturer and supported only by certain drive models. For example, the attribute with ID 7, Seek_Error_Rate, which counts errors in positioning the heads on the required track of the disk surface, makes no sense for solid-state drives (SSD) and is therefore not supported by them, while the attribute with ID 9, Power_On_Hours, which records the total operating time of the drive over its whole life, is supported by both SSDs and traditional HDDs.

    Attributes consist of several fields (most commonly labeled Val, Worst, Thresh, RAW), each of which is an indicator of the technical condition of the drive at the current moment. S.M.A.R.T. readers display the contents of the attributes, usually as several columns:

  • ID#- numeric attribute identifier
  • attribute- attribute name
  • Flags- attribute flags set by the HDD manufacturer. They characterize the attribute type (most programs show the flags as the symbols k,c,r,s,o,p or as abbreviations, for example EC for Event Count).

    Pre-Failure (PF, 01h) - when an attribute of this type reaches its threshold value, the disk needs to be replaced. This flag bit is sometimes labeled Life Critical (CR) or Pre-Failure Warranty (PW);
    Online test (OC, 02h) - the attribute is updated while the built-in SMART tests run, both off-line and on-line;
    Performance Related (PE or PR, 04h) - the attribute characterizes performance;
    Error Rate (ER, 08h) - the attribute reflects hardware error counters;
    Event Counts (EC, 10h) - the attribute is an event counter;
    Self Preserving (SP, 20h) - the attribute is self-preserving;
    Some programs render the flags as textual descriptions similar in meaning to those above. One attribute can have several flags set at once. For example, the attribute with ID 05, which counts the sectors reassigned from the spare area because of failures, has the SP + EC + OC flags set: self-preserving, event counter, updated both offline and online.

  • Value- the current value of the attribute
  • Threshold- the minimum threshold value of the attribute
  • Worst- the worst value the attribute has had over the whole life of the drive
  • Raw- the absolute value of the attribute
  • Type- in this optional field some programs display information from the attribute flags or an indication of criticality (Critical or Prefail for attributes reflecting degradation of the equipment, and Old-age for attributes reflecting consumption of the service life);
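
The flag bits listed above can be decoded with a simple bit mask. The following is a minimal sketch; the function name is hypothetical, and the abbreviations are the ones used in the list above.

```python
# Bit values and abbreviations as given in the flag list above.
FLAG_BITS = [
    (0x01, "PF"),  # Pre-Failure
    (0x02, "OC"),  # Online test
    (0x04, "PR"),  # Performance Related
    (0x08, "ER"),  # Error Rate
    (0x10, "EC"),  # Event Counts
    (0x20, "SP"),  # Self Preserving
]


def decode_flags(flags):
    """Return the abbreviations of all flag bits set in `flags`."""
    return [abbr for bit, abbr in FLAG_BITS if flags & bit]


print(decode_flags(0x0032))  # OC + EC + SP, the combination described for ID 05
print(decode_flags(0x000F))  # PF + OC + PR + ER
```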

        To analyze the state of the drive, perhaps the most important attribute field is Value, a conventional number (usually from 0 to 100 or up to 253) set by the manufacturer. Value is initially at its maximum when the drive is manufactured and decreases as the drive degrades. Each attribute also has a threshold, the Threshold field, at which the manufacturer no longer guarantees the drive's operation. If Value approaches Threshold or falls below it, it is time to replace the drive.
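
The comparison of Value against Threshold can be sketched as follows. The return strings imitate the WHEN_FAILED column of smartctl, and treating a zero threshold as "can never fail" is an assumption about how such attributes are usually handled, not something stated in this article.

```python
def when_failed(value, worst, threshold):
    """Classify one attribute from its normalized fields.

    Assumption: a threshold of 0 means the attribute is informational
    and never counts as failed.
    """
    if threshold == 0:
        return "-"
    if value <= threshold:
        return "FAILING_NOW"
    if worst <= threshold:
        return "In_the_past"
    return "-"


print(when_failed(100, 100, 36))  # healthy Reallocated_Sector_Ct
print(when_failed(30, 30, 36))    # currently at or below threshold
print(when_failed(50, 36, 36))    # recovered, but was at threshold earlier
```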

    The list of attributes and their values is not rigidly standardized, and some of them may be defined by the drive manufacturer, but most are interpreted in the same way. For example, the attribute with ID 05 (Reallocated Sector Count) describes the number of disk sectors rejected and reassigned from the spare area both on Seagate Technology devices and on Western Digital devices. The set of supported attributes depends on the drive model and may differ significantly between models.

        The most common software tool for obtaining S.M.A.R.T. data in a Linux environment is the smartctl utility from the smartmontools package, which is usually available in any distribution. If necessary, you can update it, as well as download documentation in English, from the smartmontools.org project website.

    Running smartctl requires superuser (root) privileges.

    smartctl command-line format:

    smartctl options device

    Examples of using smartctl

    smartctl --help or smartctl --usage- display a hint about using the command.

    Options smartctl:

    -V, --version, --copyright, --license- display version, copyright and license information.

    -i, --info- display identification information for the device.

    -g NAME, --get=NAME- display disk settings options (all, aam, apm, lookahead, security, wcache, rcache, wcreorder)

    -a, --all- display all SMART data of the specified drive.

    -x, --xall- display all technical data for the specified drive.

    --scan- search for disk devices.

    -q TYPE, --quietmode=TYPE- set the output verbosity mode for smartctl (errorsonly, silent, noserial)

    -d TYPE, --device=TYPE- set the device type (ata, scsi, sat[,auto][,N][+TYPE], usbcypress[,X], usbjmicron[,p][,x][,N], usbsunplus, marvell, areca,N/E, 3ware,N, hpt,L/M/N, megaraid,N, cciss,N, auto, test). Used when smartctl cannot determine the type automatically.

    -b TYPE, --badsum=TYPE- set the reaction to detected checksum errors (warn, exit, ignore)

    -r TYPE, --report=TYPE- an option intended for smartmontools developers; it prints detailed information about the ioctl() device-control transactions performed (ioctl, ataioctl, scsiioctl, each with an optional debug level). For details, see man smartctl

    -n MODE, --nocheck=MODE- do not run checks when the drive is in the given power-saving mode (never, sleep, standby, idle). Typically used to keep the smartctl command from spinning up the spindle motor.

    -s VALUE, --smart=VALUE- enable or disable SMART (on/off)

    -o VALUE, --offlineauto=VALUE- enable or disable automatic offline testing (run while the drive is idle); accepted values are on/off

    -S VALUE, --saveauto=VALUE- enable or disable attribute autosave (on/off)

    -s NAME[,VALUE], --set=NAME[,VALUE]- enable, disable or change drive hardware settings (aam, apm, lookahead, security-freeze, standby, wcache, rcache, wcreorder)

    -H, --health- display the drive status (SMART health status)

    -c, --capabilities- display information about the supported SMART capabilities of the specified hard drive.

    -A, --attributes- display SMART attributes

    -f FORMAT, --format=FORMAT- set the display format of SMART attributes (old, brief, hex[,id|val]). It mainly affects how attribute identifiers and flags are displayed:
    old- attribute identifiers are displayed in decimal; flag values are displayed in hexadecimal and interpreted as text.
    hex- the same as the previous format, but attribute IDs are displayed in hexadecimal.
    brief- compact output; identifiers are displayed in decimal, and flags are displayed as characters with a legend at the bottom of the table:
    ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
      1 Raw_Read_Error_Rate     POSR--   114   100   006    -    78309029
    . . . . . .
    254 Free_Fall_Sensor        -O--CK   100   100   000    -    0
                                ||||||_ K auto-keep
                                |||||__ C event count
                                ||||___ R error rate
                                |||____ S speed/performance
                                ||_____ O updated online
                                |______ P prefailure warning

    -l TYPE, --log=TYPE- display the specified device log (error, selftest, selective, directory[,g|s], xerror[,N][,error], xselftest[,N][,selftest], background, sasphy[,reset], sataphy[,reset], scttemp, scttempint,N[,p], scterc[,N,M], devstat[,N], ssd, gplog,N[,RANGE], smartlog,N[,RANGE])

    -v N,OPTION , --vendorattribute=N,OPTION- set a parameter for a manufacturer-defined attribute with identifier N

    -F TYPE, --firmwarebug=TYPE- adaptation of the program to account for errors in the hardware firmware of the drive (none, nologdir, samsung, samsung2, samsung3, xerrorlba, swapid)

    -P TYPE, --presets=TYPE- drive-specific presets. By default, when smartctl finds the drive in its database, it applies the set of options stored there for that model. The values are: use- apply the presets for this drive; ignore- do not apply them; show- display the presets for this disk; showall- display the presets for the specified model. Examples:

    smartctl -P ignore /dev/hdb- ignore presets for /dev/hdb;
    smartctl -P show /dev/sdb- display presets for the specified drive;
    smartctl -P showall 'ST9250315AS'- display presets for the specified disk model, ST9250315AS;
    smartctl -P showall 'ST3750515AS' 'SD15'- display presets for the specified disk model ST3750515AS with SD15 firmware;

    -B [+]FILE, --drivedb=[+]FILE- read and change the database of disk models from the file FILE. The “+” sign in front of the file name means adding new records to the database before the existing ones.

    By default, the database is stored in /usr/share/smartmontools/drivedb.h

    Device self-test options:

    -t TEST, --test=TEST- start the self-test TEST. Possible values of TEST: offline, short, long, conveyance, force, vendor,N, select,M-N, pending,N, afterselect,[on|off]

    -C, --captive- run tests in captive mode. Used together with the -t option for tests other than offline. With this option the device may be busy for the duration of the test, which can disrupt the system and cause data loss. Do not use the -C option to test drives with mounted partitions. For SCSI devices, this option runs the built-in tests in "Foreground mode".

    -X, --abort- abort a test that was started without the --captive option.

    Examples of using smartctl.

    smartctl --info /dev/sdb- display identification information for the /dev/sdb device. Example command output:

    === START OF INFORMATION SECTION ===
    Device Model:     ST9500620NS
    Serial Number:    9XF0AW8T
    Firmware Version: SN01
    User Capacity:    500,107,862,016 bytes
    Device is:        Not in smartctl database
    ATA Version is:   8
    ATA Standard is:  ATA-8-ACS revision 4
    Local Time is:    Tue Oct 28 15:05:31 2014 MSK
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled

    smartctl --all /dev/hda- display all SMART data for the device /dev/hda

    Example of displayed data:

    === START OF INFORMATION SECTION ===
    Device Model:     ST9500620NS
    Serial Number:    9XF0AW8T
    Firmware Version: SN01
    User Capacity:    500,107,862,016 bytes
    Device is:        Not in smartctl database
    ATA Version is:   8
    ATA Standard is:  ATA-8-ACS revision 4
    Local Time is:    Tue Oct 28 15:05:45 2014 MSK
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled

    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

    General SMART Values:
    Offline data collection status: (0x82) Offline data collection activity was completed without error.
                                           Auto Offline Data Collection: Enabled.
    Self-test execution status: (0) The previous self-test routine completed without error
                                    or no self-test has ever been run.
    Total time to complete Offline data collection: (634) seconds.
    Offline data collection capabilities: (0x7b) SMART execute Offline immediate.
                                                 Auto offline data collection on/off support.
                                                 Suspend Offline collection upon new command.
                                                 Offline surface scan supported.
                                                 Self test supported.
                                                 Conveyance Self-test supported.
                                                 Selective Self-test supported.
    SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode.
                                 Supports SMART auto save timer.
    Error logging capability: (0x01) Error logging supported.
                                     General Purpose Logging supported.
    Short self-test routine recommended polling time: (1) minutes.
    Extended self-test routine recommended polling time: (102) minutes.
    Conveyance self-test routine recommended polling time: (2) minutes.
    SCT capabilities: (0x10bd) SCT Status supported.
                               SCT Feature Control supported.
                               SCT Data Table supported.

    SMART Attributes Data Structure revision number: 10
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000f   082   064   044    Pre-fail  Always       -       190274202
      3 Spin_Up_Time            0x0003   096   096   000    Pre-fail  Always       -       0
      4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       72
      5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x000f   070   060   030    Pre-fail  Always       -       11302732
      9 Power_On_Hours          0x0032   073   073   000    Old_age   Always       -       24037
     10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       72
    184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
    187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
    188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
    189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
    190 Airflow_Temperature_Cel 0x0022   081   048   045    Old_age   Always       -       19
    191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
    192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       38
    193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       73
    194 Temperature_Celsius     0x0022   019   052   000    Old_age   Always       -       19 (0 14 0 0)
    195 Hardware_ECC_Recovered  0x001a   118   100   000    Old_age   Always       -       190274202
    197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

    SMART Error log Version: 1
    No Errors Logged

    SMART Self-test log structure revision number 1
    No self-tests have been logged.

    SMART Selective self-test log data structure revision number 1
     SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
        1        0        0  Not_testing
        2        0        0  Not_testing
        3        0        0  Not_testing
        4        0        0  Not_testing
        5        0        0  Not_testing
    Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
    If Selective self-test is pending on power-up, resume after 0 minute delay.
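
For scripting, the attribute table in this output can be split into fields. The sketch below parses a few rows copied from the sample above (abridged and whitespace-aligned); the function name and dictionary keys are hypothetical, and it assumes the ten-column "old" format shown here.

```python
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   082   064   044    Pre-fail  Always       -       190274202
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   073   073   000    Old_age   Always       -       24037
194 Temperature_Celsius     0x0022   019   052   000    Old_age   Always       -       19 (0 14 0 0)
"""


def parse_attributes(text):
    """Parse attribute rows (ten-column layout) into a dict keyed by ID."""
    rows = {}
    for line in text.splitlines():
        # At most 9 splits: the tenth field keeps the whole raw value,
        # including any text in parentheses.
        parts = line.split(None, 9)
        if len(parts) < 10 or not parts[0].isdigit():
            continue  # skip the header and anything that is not a data row
        rows[int(parts[0])] = {
            "name": parts[1],
            "flag": int(parts[2], 16),
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "type": parts[6],
            "updated": parts[7],
            "when_failed": parts[8],
            "raw": parts[9].strip(),
        }
    return rows


attrs = parse_attributes(SAMPLE)
print(attrs[9]["name"], attrs[9]["raw"])
print(attrs[194]["raw"])
```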

    smartctl -A -v 9,minutes /dev/hda- display the SMART attributes of /dev/hda, interpreting the raw value of the attribute with ID 9 (power-on time) as minutes rather than hours.

    smartctl --smart=on --offlineauto=on --saveauto=on /dev/hda- enable SMART for the /dev/hda disk and turn on automatic offline testing and attribute autosave. The command can be run on a live system. In effect, this sets the standard operating parameters for an ordinary disk drive.

    smartctl --test=long /dev/hda- run extended built-in tests for the /dev/hda drive. The command can be used on a running system. To view the results of test execution, use the command to display the internal log after the test is completed.
    smartctl -l selftest /dev/hda

    smartctl --attributes --log=selftest --quietmode=errorsonly /dev/hda- display the attributes and the internal self-test log, printing only the entries that indicate errors.

    smartctl -s on -t offline /dev/hdc- enable SMART and perform an offline test for the /dev/hdc drive. If an error is detected during testing, information on it will be written to the internal log, which can be viewed using the parameter -l error.

    smartctl -q silent -a /dev/hda- check the SMART data without printing anything. Usually used in scripts. After the command finishes, its return code (the shell variable $?) is checked to determine whether any attribute has crossed its threshold or whether the device logs contain error entries.
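
A script can decode that return code bit by bit. The sketch below follows the bit meanings as I understand them from the smartctl man page; verify them against the version installed on your system. The function names are hypothetical, and run_silent is only defined here, since calling it requires smartctl and root privileges.

```python
import subprocess

# Exit-status bits as described in the smartctl man page (paraphrased).
EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed or device is in a low-power mode",
    2: "a SMART/ATA command failed or a checksum error was found",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes at or below threshold",
    5: "some attributes were at or below threshold in the past",
    6: "device error log contains records of errors",
    7: "device self-test log contains records of errors",
}


def decode_exit_status(status):
    """Return the list of conditions encoded in a smartctl exit status."""
    return [text for bit, text in EXIT_BITS.items() if status & (1 << bit)]


def run_silent(device):
    """Run 'smartctl -q silent -a DEVICE' and return the decoded status."""
    result = subprocess.run(["smartctl", "-q", "silent", "-a", device])
    return decode_exit_status(result.returncode)


print(decode_exit_status(0))
print(decode_exit_status(0b01001000))  # bits 3 and 6 set
```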

    smartctl -q errorsonly -H -l selftest /dev/hda- print information only if the SMART health status is bad or if any of the internal self-tests has failed.

    smartctl -t select,10-100 -t select,30-300 -t afterselect,on -t pending,45 /dev/hda- run a selective self-test on the specified LBA ranges and, after it completes, scan the rest of the disk. If the power is turned off during the scan, resume scanning 45 minutes after the power is turned back on.

    smartctl --all --device=3ware,0 /dev/sda- get SMART data for the first ATA disk connected to the 3ware RAID controller.

    smartctl -a -d 3ware,0 /dev/twe0- get SMART data for the first ATA disk connected to a 3ware RAID 6000/7000/8000 RAID controller.

    smartctl -a -d 3ware,0 /dev/twa0- get SMART data for the first ATA disk connected to a 3ware RAID 9000 RAID controller

    smartctl -t short -d 3ware,3 /dev/sdb- run a short self-test on the fourth ATA disk connected to the 3ware RAID controller that appears as the second SCSI device, /dev/sdb

    smartctl -a -d hpt,1/3 /dev/sda- get SMART data for the drive connected to channel 3 of the first HighPoint RocketRAID controller

    Explanation of S.M.A.R.T attributes

    Attribute identifiers are given in decimal notation, and those in brackets are in hexadecimal.

  • 001 (01h) Raw Read Error Rate- the absolute number of read errors. Manufacturers differ in how they form the value of this attribute. From practice, I can say that Seagate drives can show a gigantic RAW value here while actually being in good condition, and Western Digital drives can show zero while having critical readings for other characteristics. Some models do not support this attribute at all.
  • 002 (02h) Throughput Performance- average hard drive performance. A rare attribute.
  • 003 (03h) Spin Up Time- the average time to spin the spindle up from 0 RPM to operating speed. Not supported by SSDs.
  • 004 (04h) Start/Stop Count- the number of spindle start/stop cycles.
  • 005 (05h) Reallocated Sector Count- the number of reassigned (reallocated) sectors. Modern drives have a spare surface area that is used when blocks in the main zone deteriorate. If the drive firmware detects write or read errors in any block of the working surface, it starts a mechanism that redirects requests for the defective block (sector) to a block in the spare area. The drive automatically moves the data to the spare area and marks the block as "reassigned". This process is often referred to as "remapping" or "automatic defect reassignment". The reassignment of bad sectors to spare ones is performed automatically by the drive's internal firmware and is invisible to the user (the operating system). The fact of reassignment and the number of reassigned sectors are available only from the SMART logs. The Raw Value field contains the total number of remapped sectors. The normalized Value reflects the percentage of the allowed number of defective blocks. When the spare area is exhausted, remapping becomes impossible and the disk must be replaced. Even a non-critical but high value of this field can reduce the data transfer speed, since the drive has to perform additional head movements to the tracks of the spare area, which is usually located at the end of the working surface of the disk.
  • 007 (07h) Seek Error Rate- the rate of errors in positioning the magnetic head assembly. The drive checks that the heads have been positioned correctly on the required track of the surface. If the positioning was performed incorrectly, an error is recorded and the operation is repeated. In practice, a large number of positioning errors can be caused not only by hardware problems but also by external factors such as unsuitable temperature conditions or vibration.
  • 008 (8h) Seek Time Performance
  • 009 (09h) Power-On Hours (POH)- the number of hours the drive has been powered on since manufacture, as an integer. Some drive models store the internal value of this attribute as the number of minutes or seconds rather than hours. Reaching the threshold value of this attribute means the service life specified by the manufacturer (MTBF, Mean Time Between Failures) has been used up.
  • 010 (0Ah) Spin Retry Count- the number of retries to start the spindle. After power-on, the drive spins up the disks and checks that they reach the operating rotation speed set by the manufacturer for this model. If the operating speed is not reached within the allotted time, the value of this attribute is increased and the motor is restarted.
  • 011 (0Bh) Recalibration Retries- the number of repeated recalibrations after an unsuccessful first attempt. If the attribute value increases, problems with the mechanical part of the drive are highly likely. An increase in the absolute value of this attribute can also be caused by the drive's internal firmware using the recalibration procedure to correct other types of errors.
  • 012 (0Ch) Device Power Cycle Count- the absolute Raw Value is the number of drive power on/off cycles over the whole period of operation. The normalized Value usually stays unchanged at 100.
  • 013 (0Dh) Soft Read Error Rate- the total number of software read failures. The normalized Value starts at 100 and shows the percentage of tolerable software failures that remains.
  • 100 (64h) Erase/Program Cycles- the number of erase-write cycles of reprogrammable memory (flash) for SSD drives. The number of such cycles is limited and depends on the permanent rewritable memory chips used in this SSD model.
  • 103 (67h) Translation Table Rebuild - the number of events in which the translator's internal tables were corrupted and then rebuilt.
  • 170 (AAh) Reserved Block Count - the number of available spare blocks for remapping bad sectors (see attribute E8h).
  • 171 (ABh) Program Fail Count- write errors in SSD flash memory
  • 172 (ACh) Erase Fail Count - SSD flash erase errors. Writing to flash memory consists of two parts, erasing and writing. The erase procedure is always performed before data is written.
  • 173 (ADh) Wear Leveller Worst Case Erase Count - the largest number of erase operations performed on any single block of the SSD.
  • 174 (AEh) Unexpected Power Loss- unexpected power outage for SSD. This indicator is also called the "Number of emergency shutdowns" in the terminology of hard drives with magnetic media. Absolute Raw Value: The cumulative number of abnormal shutdowns over the lifetime of the device.
  • 175 (AFh) Program Fail Count - In Intel SSDs this attribute reports failures of the SSD's power-loss protection. Raw Value: bytes 0-1 hold the result of the last test as the number of microseconds before the capacitor discharges, saturating at the maximum value; the result should be in the range 25 - 5,000,000, and a lower value indicates a specific error code. Bytes 2-3 hold the number of minutes since the last test, saturating at the maximum value. Bytes 4-5 hold the number of tests over the life of the device; this count does not increase with power-on and power-off cycles and saturates at the maximum value. The normalized Value is set to 1 if the test fails, or 11 if the capacitor was tested under unacceptable temperature conditions; otherwise it is set to 100.
  • 183 (B7h) SATA Downshifts - The number of SATA speed downshifts. Raw Value: the number of times the SATA interface was switched to a reduced data rate because of errors (from 6 Gb/s to 3 Gb/s or 1.5 Gb/s, or from 3 Gb/s to 1.5 Gb/s). Very often this attribute points to a poor-quality power supply, oxidized contacts on the interface cable, or a faulty cable.
  • 184 (B8h) End-to-End Error - The number of end-to-end errors detected in the disk cache. Absolute value: the number of end-to-end errors detected and corrected by the hardware.
  • 187 (BBh) Reported Uncorrectable Errors - The number of unrecoverable errors. Raw Value: the number of errors that could not be corrected by the drive's internal routines.
  • 188 (BCh) Command Timeout- the number of commands interrupted by timeout.
  • 189 (BDh) High Fly Writes - the number of events recorded by the Fly Height Monitor when the write heads were in a position that does not guarantee normal operation. If the fly height of a head above the magnetic surface exceeds the optimum, even briefly, the data it writes may later be unreadable. Modern drives monitor head fly height and do not write data at a non-optimal height: the counter of this attribute is incremented, and the write is performed once the normal fly height is restored. An increased value of this attribute can be caused by external shocks or vibration, abnormal temperatures, or deterioration of the magnetic surface or head.
  • 190 (BEh) Airflow Temperature - the temperature of the air flow (case). Raw Value: case temperature statistics. Bytes 0-1: current case temperature in degrees Celsius; byte 2: recent minimum case temperature in degrees Celsius; byte 3: recent maximum case temperature in degrees Celsius; bytes 4-5: overtemperature counter, the number of times the recorded temperature exceeded the drive's maximum operating temperature.
  • 191 (BFh) G-Sense Error Rate - the number of errors resulting from shock loads. The attribute stores the readings of the built-in accelerometer, which registers all shocks, jolts, falls, and even careless installation of the disk into the computer case. It usually characterizes the operating conditions of laptops quite accurately: a large value indicates sharp shocks and falls while the device was in operation.
  • 192 (C0h) Emergency Retract Cycle Count - the total number of emergency (abnormal) power-off events over the entire period of use of the device. For SSDs, "abnormal shutdown" means turning off the power of the device without first issuing the STANDBY IMMEDIATE command.
  • 194 (C2h) HDA Temperature - the temperature of the drive itself (HDA - Head Disk Assembly). This attribute stores the readings of the built-in temperature sensor, which is usually one of the magnetic heads (usually the lower one). In SSDs, the thermal sensor is located inside the case on the printed circuit board. The attribute fields record the current, minimum and maximum temperatures. The Worst field shows the worst temperature reached during the operation of the drive (from which you can establish whether the drive overheated and by how much); Raw Value is the current temperature. Some drive models may support attribute 205 (CDh) Thermal Asperity Rate (TAR), which records the number of dangerous temperature fluctuations.
  • 195 (C3h) Hardware ECC Recovered- the number of read errors corrected by the drive hardware using the error correction code. Such errors do not require re-reading of the sector, and do not lead to a loss of data exchange rate, but a large number of them indicates a deterioration in the read path parameters.
  • 196 (C4h) Reallocation Event Count - the number of remapping events. The Raw Value field stores the total number of attempts to transfer data from unstable sectors to the spare area; both successful and unsuccessful attempts are counted.
  • 197 (C5h) Current Pending Sector Count - the Raw Value field of this attribute shows the total number of sectors that the drive currently considers candidates for remapping to the spare area. If any of these sectors is later read successfully, it is removed from the list of candidates. If reading the sector produces errors, the drive tries to recover the data and move it to the spare area, and marks the sector itself as remapped.
  • 198 (C6h) Uncorrectable Sector Count - the counter of uncorrectable errors, that is, errors that the drive's internal hardware correction could not fix. Such errors show up in the file system as classic bad blocks. The cause may be the failure of individual components or a lack of free sectors in the spare area of the disk when remapping was needed.
  • 199 (C7h) UltraDMA CRC Error Rate - the number of errors during data transfer in direct memory access mode, detected by cyclic redundancy check (CRC). When the hardware that controls data transfer from the drive to RAM finds a checksum error, it corrects it on the fly if the error is correctable, and normal disk operation continues unchanged. An unrecoverable error is handled by the system. Typically, this attribute counts CRC errors of any kind. Often this type of error is caused not so much by the drive hardware as by a faulty interface cable, oxidized contacts, a poor power supply, an overclocked PCI bus, overheating of the motherboard chipset, and so on.
  • 200 (C8h) Write Error Rate (Multi Zone Error Rate)- data recording errors.
  • 232 (E8h) Total Count of Write Sectors - for SSDs, the number of sectors written. Raw Value increases by 1 for every 65,536 sectors (32 MB) written by the system. On Intel SSDs this attribute is Available Reserved Space: the percentage of the spare area still available for reassigning bad blocks.
  • 233 (E9h) Power On Hours - operating time. For SSDs this attribute is interpreted as Remaining Life, a media wear indicator based on the number of NAND erase cycles. The normalized value decreases linearly from 100 to 1 as the average number of erase cycles increases from 0 to the maximum. It stops decreasing after reaching 1, but in all likelihood the device will withstand significant additional wear.
  • 241 (F1h) Total LBAs Written- Total number of recorded LBA sectors. Raw Value: The total number of sectors written by the system. The value increases by 1 for every 65,536 sectors (32 MB) written by the system.
  • 242 (F2h) Total LBAs Read- Total number of LBA sectors read. Raw Value increases by 1 for every 65,536 sectors (32 MB) read by the system.
  • 254 (FEh) Free Fall Event Count - the number of free-fall events detected while the disk was in operation (how many times the disk was dropped).
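
    Several of the raw values above are packed fields or scaled counters rather than plain numbers. The sketch below is only an illustration, not a vendor's reference decoder: it unpacks the raw value of attribute 190 using the byte layout described above, and converts the 32 MB units of attributes 241 and 242 to bytes. It assumes that byte 0 is the least significant byte of the raw value and that a sector is 512 bytes; the function names and the sample raw value are hypothetical.

```python
SECTOR_BYTES = 512          # assumption: 512-byte logical sectors
SECTORS_PER_UNIT = 65_536   # one raw count = 65,536 sectors (32 MB)


def decode_airflow_temperature(raw):
    """Split the 6-byte raw value of attribute 190 into its fields.

    Assumes byte 0 is the least significant byte of the raw value.
    """
    data = raw.to_bytes(6, "little")
    return {
        "current_c": int.from_bytes(data[0:2], "little"),       # bytes 0-1
        "recent_min_c": data[2],                                 # byte 2
        "recent_max_c": data[3],                                 # byte 3
        "overtemp_count": int.from_bytes(data[4:6], "little"),   # bytes 4-5
    }


def lba_units_to_bytes(raw):
    """Convert a raw count from attribute 241 or 242 to bytes."""
    return raw * SECTORS_PER_UNIT * SECTOR_BYTES


# Made-up raw value: 36 C now, 30 C minimum, 41 C maximum, 2 overtemperature events
raw_190 = 36 | (30 << 16) | (41 << 24) | (2 << 32)
print(decode_airflow_temperature(raw_190))
print(lba_units_to_bytes(1000) / 2**30, "GiB written")
```

    Real drives do not all follow this layout, so the decoded fields should be checked against the manufacturer's documentation before they are trusted.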

    Assessment of the technical condition of the hard drive according to S.M.A.R.T data

    The set of attributes supported by a specific hard drive model, even a minimal one, lets you determine the technical condition and remaining service prospects of the device with high reliability. The value of attribute 9 gives the time spent powered on; combined with attribute 12, the number of power on/off cycles, it shows whether the drive ran around the clock or intermittently. Intensity of use, temperature, and harmful external influences are all easily tracked through the absolute values of the corresponding attributes. Similarly, you can assess the level of hardware wear and the quality of the surface and of the read/write path.
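
    The relationship between attributes 9 and 12 can be turned into a rough number. The following is a minimal sketch: it divides power-on hours by the power cycle count to get the average length of one powered-on session. The 24-hour cut-off used to call a drive "mostly continuous" is an arbitrary assumption made for illustration, and the function names are hypothetical.

```python
def average_hours_per_power_cycle(power_on_hours, power_cycles):
    """Average length of one powered-on session (attribute 9 / attribute 12)."""
    if power_cycles <= 0:
        raise ValueError("power cycle count must be positive")
    return power_on_hours / power_cycles


def usage_pattern(power_on_hours, power_cycles, continuous_threshold_hours=24.0):
    """Label the drive's usage; the 24-hour threshold is an assumption."""
    average = average_hours_per_power_cycle(power_on_hours, power_cycles)
    if average >= continuous_threshold_hours:
        return "mostly continuous"
    return "periodic"


# Made-up raw values of attributes 9 and 12 for two drives
print(usage_pattern(8760, 10))     # a year of hours over only 10 power cycles
print(usage_pattern(2000, 1000))   # about 2 hours per session
```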

    Basic disk status monitoring can be performed even at the BIOS level. If S.M.A.R.T. status monitoring is enabled in the BIOS settings and any health-related attribute reaches its critical value, loading of the operating system is suspended and the following message is displayed on the screen:

    Primary Master Hard Disk: S.M.A.R.T status BAD!, Backup and Replace.
    Press F1 to Resume

    Thus, without installing or running additional software, the Basic Input/Output System (BIOS) can report a critical drive state promptly when you turn on the computer.

    The technical condition of a hard disk that has not reached the critical threshold is characterized by the absolute value of the attributes that reflect the counters of failures detected and corrected by the drive hardware.

  • 001 (1) Raw Read Error Rate - absolute number of read errors. Manufacturers form the value of this attribute in different ways. In practice, Seagate drives can show a gigantic RAW value here while being in really good condition, whereas Western Digital drives can show zero while other attributes are at critical values. Some models may not support this attribute at all.
  • 005 (5) Reallocated Sector Count- Number of reassigned sectors. A non-zero value of this counter indicates that defective blocks were detected, the data of which was transferred to the spare area.
  • 196 (C4) Reallocation Event Count - Number of bad sector remapping events. The Raw Value field of this attribute stores the total number of attempts to transfer data from unstable sectors to the spare area. Both successful and unsuccessful attempts are counted.
  • 197 (C5) Current Pending Sector Count - The current number of unstable sectors. The Raw Value field of this attribute shows the total number of sectors that the drive currently considers candidates for remapping. If any of these sectors is later read successfully, it is removed from the list of candidates. If reading the sector produces errors, the drive tries to recover the data and move it to the spare area, and marks the sector itself as remapped. If the values of attributes 5, 196 and 197 increase over a short period of time (days, or even hours), this is a warning sign: either the condition of the drive itself is deteriorating, or it is being affected by external factors.
  • 007 (07h) Seek Error Rate - The frequency of errors in positioning the magnetic head assembly. A large value indicates problems with the positioning mechanism, although it can also be caused by external factors such as overheating or increased vibration.
  • 008 (08h) Seek Time Performance- average positioning speed of magnetic heads. If the attribute value decreases (positioning slowdown), then there is a high probability of problems with the mechanical part of the actuator.
  • 199 (C7) UltraDMA CRC Error Count- Counter of errors that occurred during data transfer in UltraDMA mode. An increase in the absolute value indicates problems when the disk controller transfers data to RAM. Most often, it is caused by a bad cable and unstable power supply.

    Changes in the absolute values of attributes must be considered over time and in logical relation to one another.
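
    Watching attributes over time comes down to comparing snapshots of raw values taken at different moments. The sketch below assumes each snapshot is a plain dictionary mapping an attribute ID to its raw value (how the snapshots are collected is left out), and reports which of attributes 5, 196 and 197 have grown. The function name and sample numbers are hypothetical.

```python
WATCHED_IDS = (5, 196, 197)  # reallocated, reallocation events, pending sectors


def growing_attributes(earlier, later, watched=WATCHED_IDS):
    """Return {attribute id: increase} for watched raw counters that grew."""
    growth = {}
    for attr_id in watched:
        # Skip attributes that the drive does not report in both snapshots
        if attr_id in earlier and attr_id in later:
            delta = later[attr_id] - earlier[attr_id]
            if delta > 0:
                growth[attr_id] = delta
    return growth


# Made-up snapshots taken one day apart
yesterday = {5: 11, 196: 11, 197: 0, 9: 20000}
today = {5: 19, 196: 20, 197: 4, 9: 20024}

changes = growing_attributes(yesterday, today)
if changes:
    print("Warning, counters are growing:", changes)
```

    Growth in several of these counters at once within days or hours is the pattern described above as a warning sign.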

    Running built-in S.M.A.R.T tests

    The set of built-in S.M.A.R.T. tests is determined by the manufacturer and may vary significantly between hard drive models. The main built-in SMART tests are the short self-test and the long (extended) self-test. The short test scans a small portion of the disk surface, as defined by the manufacturer, and runs for about 1 minute on average. The long test scans the entire working surface of the disk and, depending on the speed and capacity of the disk, can run for several hours. Modern disks also support selective self-tests, whose parameters are set by the user, and conveyance self-tests, which are run after the device has been transported. A test can be aborted if it was not started in captive mode and the drive supports the abort command. Captive mode should be used with care if the disk is in use by the system.

    Examples:

    smartctl --test=short /dev/sdb - runs a short test. The command prints the following information:

    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
    Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
    Testing has begun (previous test aborted).
    Please wait 1 minute for test to complete.
    Test will complete after Fri Dec 5 16:08:09 2014
    Use smartctl -X to abort test.

    This means that a command to perform a short test was sent to the disk, the disk accepted it successfully, the test will last 1 minute, and you can use the smartctl -X command to stop it early.

    The result of the test can be checked by viewing the self-test log with the command smartctl -l selftest /dev/sdb. The command prints the self-test log:

    === START OF READ SMART DATA SECTION ===
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline       Completed without error       00%       831         -

    Log columns:
    Num - record number.
    Test_Description - description of the test.
    Status - completion status (here, completed without error).
    Remaining - percentage of the test remaining, if it has not yet completed (00%).
    LifeTime(hours) - the drive's power-on time in hours at the moment the test was run.
    LBA_of_first_error - number of the LBA logical block where the first error was detected during the test. In this example, there are no errors.
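
    If the log needs to be checked by a script rather than by eye, its rows can be split into the columns listed above. The sketch below is one possible parser, not part of smartctl: it assumes that columns are separated by at least two spaces, as in the terminal output, and that the LBA column holds either "-" or a decimal number. The sample text is the log from this example.

```python
import re

# One log row: "# N", test description, status, "NN%", lifetime hours, LBA or "-"
LOG_LINE = re.compile(
    r"^#\s*(?P<num>\d+)\s{2,}(?P<test>.+?)\s{2,}(?P<status>.+?)\s{2,}"
    r"(?P<remaining>\d+)%\s+(?P<lifetime>\d+)\s+(?P<lba>\S+)\s*$"
)


def parse_selftest_log(text):
    """Return one dict per self-test record found in smartctl log output."""
    entries = []
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip section headers and the column title row
        lba = match["lba"]
        entries.append({
            "num": int(match["num"]),
            "test": match["test"],
            "status": match["status"],
            "remaining_percent": int(match["remaining"]),
            "lifetime_hours": int(match["lifetime"]),
            # assumption: the LBA is printed as a decimal number
            "lba_of_first_error": None if lba == "-" else int(lba),
        })
    return entries


SAMPLE = """\
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%       831         -
"""

for entry in parse_selftest_log(SAMPLE):
    print(entry)
```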

    To run a long test, use the command:

    smartctl --test=long /dev/sdb

    In response to the command, information about the start of the test is displayed:

    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
    Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
    Testing has begun.
    Please wait 70 minutes for test to complete.
    Test will complete after Fri Dec 5 17:15:44 2014

    As you can see, the long test for this drive model will run for 70 minutes.

    The result can be checked with the command smartctl -l selftest /dev/sdb
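
    To script these steps, the smartctl invocations can be assembled as argument lists and handed to a process runner. The sketch below only builds and prints the commands; it does not execute them, since running smartctl normally requires administrator rights and a real device. It uses the options shown in this section plus smartctl's --captive option for captive mode; the helper names are hypothetical.

```python
import shlex

SUPPORTED_KINDS = ("short", "long", "conveyance")


def selftest_command(device, kind="short", captive=False):
    """Build the smartctl command that starts a self-test on a device."""
    if kind not in SUPPORTED_KINDS:
        raise ValueError(f"unsupported self-test kind: {kind}")
    command = ["smartctl", f"--test={kind}"]
    if captive:
        command.append("--captive")  # use with care on a disk that is in use
    command.append(device)
    return command


def selftest_log_command(device):
    """Build the command that prints the self-test log."""
    return ["smartctl", "-l", "selftest", device]


def abort_command(device):
    """Build the command that aborts a running non-captive self-test."""
    return ["smartctl", "-X", device]


for command in (
    selftest_command("/dev/sdb", "long"),
    selftest_log_command("/dev/sdb"),
    abort_command("/dev/sdb"),
):
    print(shlex.join(command))
```

    Each list can be passed unchanged to subprocess.run when the script is run with sufficient privileges.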

    List of ATA commands for working with S.M.A.R.T

    SMART_READ_VALUES        0xd0
    SMART_READ_THRESHOLDS    0xd1
    SMART_AUTOSAVE           0xd2
    SMART_SAVE               0xd3
    SMART_IMMEDIATE_OFFLINE  0xd4
    SMART_READ_LOG_SECTOR    0xd5
    SMART_WRITE_LOG_SECTOR   0xd6
    SMART_ENABLE             0xd8
    SMART_DISABLE            0xd9
    SMART_STATUS             0xda
    SMART_AUTO_OFFLINE       0xdb


  • What to do when S.M.A.R.T. reports hard drive or SSD errors, and how to fix the disk and recover lost data. Does a S.M.A.R.T. hard drive or SSD error appear when you boot your computer or laptop? Does the computer no longer work as before, and are you worried about the safety of your data? Not sure how to fix the error?

    Applies to: Windows 10, Windows 8.1, Windows Server 2012, Windows 8, Windows Home Server 2011, Windows 7 (Seven), Windows Small Business Server, Windows Server 2008, Windows Home Server, Windows Vista, Windows XP, Windows 2000, Windows NT.

    What to do about a SMART error?

    Step 1: Stop using the failed HDD

    Receiving an error message from the system's diagnostics does not mean that the drive has already failed. But when S.M.A.R.T. reports errors, you need to understand that the disk is already in the process of failing. Complete failure may come within a few minutes, or after a month or a year. Either way, you can no longer trust your data to such a disk.

    You need to take care of the safety of your data: create a backup or transfer the files to another storage medium. At the same time, you should take steps to replace the hard drive. A hard drive with S.M.A.R.T. errors should not remain in use: even if it does not fail completely, it can partially damage your data.

    Of course, an HDD may also fail without any S.M.A.R.T. warning. But this technology gives you the advantage of a warning that a drive is about to fail.

    Step 2: Recover deleted disk data

    A SMART error does not always mean that data recovery is required. When an error appears, it is recommended to copy important data immediately, as the disk may fail at any time. But some errors make it impossible to copy data. In that case, you can use a hard disk data recovery program such as Hetman Partition Recovery.

    To do this:

    1. Download the program, install and run it.
    2. By default, you will be prompted to use the File Recovery Wizard. After you click "Next", the program will prompt you to select the drive from which you want to recover files.
    3. Double-click the failed drive and select the type of analysis you want. Choose "Full Analysis" and wait for the disk scanning process to complete.
    4. After scanning is complete, you will be shown the files available for recovery. Select the files you need and click the "Recover" button.
    5. Choose one of the suggested ways to save the files. Do not save recovered files to the disk with the error.

    Step 3: Scan the disk for bad sectors

    Run a scan of all hard disk partitions and try to fix any errors found.

    To do this, open "This PC" and right-click the disk with the SMART error. Select Properties / Tools / Check in the Error checking section.

    As a result of scanning, errors found on the disk can be corrected.

    Step 4: Reduce disk temperature

    Sometimes a S.M.A.R.T. error is caused by the disk exceeding its maximum allowable operating temperature. This error can be fixed by improving the ventilation of the computer. First, check that your computer has sufficient ventilation and that all fans are working properly.

    If you find and fix a ventilation problem, after which the disk's operating temperature drops to a normal level, then the SMART error may no longer occur.

    Step 5: Defragment and optimize the disk

    Open "This PC" and right-click the disk with the error. Select Properties / Tools / Optimize in the Optimize and defragment drive section.

    Select the drive you want to optimize and click Optimize.

    Note. In Windows 10, disk defragmentation and optimization can be configured to run automatically.

    Step 6: Buy a new hard drive

    If you encounter a SMART hard drive error, purchasing a new drive is only a matter of time. Which hard drive you need depends on the type of computer and the purpose for which it is used.

    What to look for when purchasing a new drive:

    1. Disk type: HDD, SSD or SSHD. Each type has its pros and cons, which are not critical for some users and very important for others. The main ones are read and write speed, capacity, and write endurance.
    2. Size. There are two main drive form factors: 3.5" and 2.5". The size is determined by the mounting location in the particular computer or laptop.
    3. Interface. Main hard drive interfaces:
      • SATA;
      • IDE, ATAPI, ATA;
      • SCSI;
      • External drive (USB, FireWire, etc.).
    4. Specifications and performance:
      • Capacity;
      • Read and write speed;
      • The size of the memory buffer or cache;
      • Response time;
      • Fault tolerance.
    5. S.M.A.R.T. Support for this technology helps detect possible errors in the disk's operation and prevent data loss in time.
    6. Package contents. This item covers whether interface or power cables are included, as well as warranty and service.

    How to reset a SMART error?

    SMART errors can easily be reset in the BIOS (or UEFI). But the developers of all operating systems strongly advise against doing this. If the data on the hard disk is of no value to you, the display of SMART errors can be disabled.

    To do this, do the following:

    1. Restart your computer and, by pressing the key or key combination indicated on the boot screen (it differs between manufacturers, usually F2 or Del), enter the BIOS (or UEFI).
    2. Go to Advanced > SMART settings > SMART self test and set the value to Disabled.

    Note: the location of this setting is given approximately, since it may vary slightly depending on the BIOS or UEFI version.

    Is HDD repair worth it?

    It is important to understand that any of the ways to eliminate SMART errors is self-deception. It is impossible to completely eliminate the cause of the error, since the main cause of its occurrence is often the physical wear of the hard drive mechanism.

    To repair or replace faulty hard drive components, you can contact a service center or a specialized hard drive laboratory.

    But the cost of work in this case will be higher than the cost of a new device. Therefore, it makes sense to do repairs only if it is necessary to restore data from an already inoperable disk.

    SMART error for SSD drive

    Even if you have no complaints about how your SSD works, its performance gradually decreases. The reason is that SSD memory cells have a limited number of write cycles. Wear leveling minimizes this effect but does not completely eliminate it.

    SSDs have their own specific SMART attributes that signal the state of the disk's memory cells, for example "209 Remaining Drive Life", "231 SSD life left", etc. These errors can occur when cells are degraded, which means that the information stored in them can be corrupted or lost.

    Failed SSD cells cannot be restored or replaced.

    Hard drives are equipped with special self-diagnostic firmware, S.M.A.R.T. (self-monitoring, analysis and reporting technology). This technology lets you monitor the state of the HDD, analyze its operation and predict failure. SMART monitors over 40 parameters, and the result for each is entered in a special table. Analyzing S.M.A.R.T. data lets you detect weak points and predict the failure of a hard drive.

    This article explains how to view a hard drive's SMART data, how to interpret its readings, and which parameters deserve special attention. Note that the drive stores this information in a structured form, but special software is required to read it.

    How to view a hard drive's S.M.A.R.T. data. Interpreting the parameters.

    To check the SMART parameters, this function must be enabled in the system. This applies to computers manufactured before 2010: their BIOS has an HDD S.M.A.R.T. Capability option, which must be enabled to track SMART data fully. On newer PCs the question of how to enable S.M.A.R.T. on the hard drive is irrelevant, since everything is enabled by default.

    To view HDD status parameters, you need a dedicated hard drive utility (Victoria, HD Tune, HDD Scan) or a general-purpose diagnostic program (Everest or its "successor" AIDA64). They display the table in an easy-to-understand form.

    Let's analyze the parameters using Victoria as an example. As you can see from the image, the hard drive (in this case a 200 GB Seagate with the outdated IDE interface) does not support all SMART commands and records only some of the parameters.

    In the header of the table, you can see the parameter ID, its name, VAL, Wrst, Tresh and Raw values, as well as the estimated Health column.

    • ID – parameter number in the general list of analyzed criteria.
    • VAL is its current value in abstract units (usually a percentage of the ideal indicator).
    • Wrst is the worst value that the hard drive has ever reached.
    • Tresh is the threshold for the VAL value; when VAL reaches it, the system warns of the impending "death" of the HDD.
    • RAW is the parameter's raw value in numerical form (the number of hours of operation, failures or errors).

    The Health column lets people who are unfamiliar with the intricacies of computer hardware or with English assess the state of the HDD. It gives each parameter a familiar score from 1 to 5 points.

    When analyzing the state of a hard drive, pay attention to VAL (comparing it with the Tresh column) and to RAW (for an objective assessment). In the above example, the hard drive has logged many read errors (for Seagate, Fujitsu and Samsung drives you can ignore this value, since every error is recorded here) and has a long operating time (parameter 9). The table also shows that the number of hardware error corrections (parameter 195) is quite high. The rest of the SMART values are normal or close to it. Importantly, parameter 5 (Reallocated Sectors Count) is OK: the number of bad sectors is small (11 in this case) and nothing threatens the disk itself yet.
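
    The comparison of VAL with Tresh can be written as a small rule. The sketch below follows the convention commonly used by SMART tools, stated here as an assumption: an attribute counts as failing when its normalized value is less than or equal to a non-zero threshold, and a threshold of 0 means the attribute is informational only. The function name and the sample numbers are hypothetical.

```python
def attribute_status(value, worst, threshold):
    """Classify one attribute from its VAL, Wrst and Tresh columns."""
    if threshold == 0:
        return "informational"        # no threshold, so it can never fail
    if value <= threshold:
        return "failing now"
    if worst <= threshold:
        return "failed in the past"   # VAL recovered, but Wrst crossed Tresh
    return "ok"


# Made-up rows: (ID, VAL, Wrst, Tresh)
rows = [
    (5, 100, 100, 36),
    (9, 77, 77, 0),
    (194, 58, 40, 45),
    (197, 30, 30, 36),
]

for attr_id, value, worst, threshold in rows:
    print(attr_id, attribute_status(value, worst, threshold))
```

    This check says nothing about RAW values, which still need to be read separately for an objective picture.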

    If parameter 5 shows alarming values, the health of the HDD is at risk. In the above screenshot, the Reallocated Sectors Count row indicates that the hard drive is close to failure. In this case it is a software glitch (indicated by the mismatch between the zero RAW value and the critical VAL), and the drive's SMART data must be restored to bring it back to normal. But usually such readings mean that the HDD is about to break down and can no longer be used normally.

    How to reset or restore a hard drive's S.M.A.R.T. data

    We can't go into detail on how to reset a hard drive's SMART data. Although this action is not criminal (unlike, say, changing a smartphone's IMEI), it can help unscrupulous dealers sell faulty hard drives as new. But for users who need to know how to restore a hard drive's SMART data to return the drive to service after a software failure, let's explain the situation in general terms.

    • Resetting S.M.A.R.T. (like other service tasks) requires connecting the hard drive through a COM (serial) interface. For this purpose, manufacturers equip the HDD with a special connector of 4 or 5 pins, located next to the connectors for the data and power cables. Newer computers often have no COM port on the rear panel, so a special USB-to-COM adapter takes over its function.

    Hard drive interface connectors


    Hi all! Today we will look at how to check the health of a hard disk, for example to make sure that nothing is going to happen to it in the near future. Or, if something has already happened, to find out while you still have time to save the data.

    To get started, download the free program:

    Then run it:

    1. Select the disk whose health you want to check
    2. Next, click on the magnifying glass
    3. And press SMART

    The Attribute Name column contains the name of each SMART attribute. You can find more detailed information in the file available via the download button; this is information from Wikipedia. The file also lists which attributes are critical and which are not. If your critical attributes are outside the norm, think about replacing the hard drive.

    It is in Russian and has fewer features.

    Also pay attention to temperature. I ran an experiment on this: my SSD sits on the side wall (Zalman cases have a special mount for it), and the second hard disk is in its regular bay, with a fan in front that provides extra cooling. With and without the fan, the difference is 4 degrees. So I will move the SSD closer to the fan. After all, temperature is the first cause of hard drive failure.

    Critical values

    Pay special attention to the following parameters:

    • 01 (01) Raw Read Error Rate - how often errors occur when reading data from the disk.
    • 03 (03) Spin-Up Time - how long the platters take to spin up from rest to operating speed.
    • 05 (05) Reallocated Sectors Count - the number of reassigned sectors. If the supply of spare sectors runs out, bad sectors will appear.
    • 07 (07) Seek Error Rate - if the head does not land exactly on the track, this indicates damage to the mechanics. It may also be due to overheating. The more often the head misses a track, the higher the value.
    • 10 (0A) Spin-Up Retry Count - another sign of mechanical failure. The error is counted when the disk cannot spin up to operating speed.
    • 196 (C4) Reallocation Event Count - how many times bad sectors were reassigned to the spare area.
    • 197 (C5) Current Pending Sector Count (unstable sectors) - how many sectors are candidates for reassignment. These sectors are not yet bad, but they respond poorly.
    • 198 (C6) Uncorrectable Sector Count - the number of failed attempts to read sectors, caused by damaged mechanics.
    • 220 (DC) Disk Shift - an impact can knock the platters off their axis.

    That's all. Non-critical errors and their descriptions can be found in the document available for download above. This is how you can check the health of your hard drive using these 2 programs. Which one to use is up to you.