Hard Disk Failure

From IPitomy Wiki

Hard Drive Problems

For troubleshooting hard drive problems, go to Drive_Hardware_Troubleshooting.

Hard Drives Need Some Consideration for PBX Maintenance

There are several basic types of hard drive failures. Software or firmware damage may cause the disk to become unreadable, resulting in the inability to interact properly with the computer. Problems with the controller board on the hard disk may result in electronic failure. Mechanical failure can occur when components on the disk become faulty. And logical corruption may occur when there is a problem with the information on the disk.

Hard drive failures result from a handful of destructive forces. The six worst hard drive destroyers are simpler than you might think.

  • Heat: The primary cause of hard drive failures is hardware overheating. Inadequate ventilation and cooling in and around your computer hardware can cause severe damage to the equipment. Overworked hardware with little or no downtime and smoke or fire damage can wreak havoc on a system.
    • For IPitomy Dealers, always make sure your IPitomy server is located in a well-ventilated, air-conditioned environment. Fans should be checked periodically, and any accumulation of dust on the fan blades, cooling fins, air intakes and dust filters should be cleared out. This should be included in a quarterly maintenance service routine to assure your customer of uninterrupted service.
  • Physical damage to your computer: Any type of physical force, such as bumping, jarring, or dropping your computer may lead to physical damage to the hard drive. If your computer is in use at the time of the impact, the read/write heads may touch or gouge the disks, causing damage to the platter’s magnetic surfaces where the data is embedded. This is known as a head crash, and the damage can be significant. Even if your computer is powered down, the likelihood of jarring your computer’s components is still present.
    • Machinery that causes excessive vibration, such as a malfunctioning HVAC system, should not be located close to the server if possible.
  • Power Surges: A power surge can be caused by lightning strikes, interference with power lines, or by any event which causes the flow of energy to be interrupted and then restarted. Power surges can result in data loss when the read/write heads fail to function properly, and in the worst case, a complete computer crash.
    • Make sure that your server is properly grounded and is isolated from any equipment that could cause a power surge or brownout. Electric motors are common causes of brownouts. Having a qualified UPS system that has adequate surge protection is imperative.
  • Water Damage: Moisture caused by roof leaks, flooding or even by spilling a liquid onto your computer is almost always bad news. The casing that holds the hard disk drive is not designed to be a barrier against water penetration. Water will have damaging effects on a computer’s electronic parts and disk components, possibly even causing unwanted electrical currents which can further damage your computer.
  • Corrupted files: Turning off your server before shutting down the system risks corrupting data. IPitomy runs a file check when it reboots, and corrupt files can sometimes be repaired. If the files in the boot sector are corrupt, you will need to reformat the drive with an Emergency Recovery Dongle from IPitomy and load your backup. Power failures and accidental computer shutdowns can contribute to corrupted files, causing damage to the hard drive. IPitomy will integrate with an American Power Conversion (APC) UPS system. When the USB connector that comes with each system is plugged into a USB port on the server, the UPS can notify IPitomy when the power is down to 3 minutes of battery life. When IPitomy gets this notification, it can gracefully shut down the system without corrupting any files. As part of a quarterly maintenance routine, it is recommended that the UPS be checked and its performance verified. UPS systems have been known to report that they are OK when in fact, when the power drops, they drop too, causing the server to suddenly lose power.
  • Human error: The functions of the hard drive can be impaired by human tampering with the system files. Accidental deletion of files essential to the disk drive is not uncommon. Improper installation and removal of files from your computer can cause the hard disk to malfunction. Removing the power without shutting down the server can cause damage to the disk. In the case of an IPitomy server, pressing the power switch momentarily and waiting for the system to shut down on its own avoids this kind of damage to the disk. Pulling the plug while the system is operating increases the chance of data corruption and physical disk damage.
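
The UPS behavior described under "Corrupted files" comes down to a simple rule: begin a graceful shutdown once the server is running on battery and the remaining runtime reaches the 3 minute mark. The sketch below illustrates only that rule. It is not IPitomy's implementation, and the names UpsReading and should_shut_down are hypothetical; how the reading is actually obtained from the UPS over USB is left out.

```python
from dataclasses import dataclass


@dataclass
class UpsReading:
    """One status sample from the UPS (hypothetical structure for illustration)."""
    on_battery: bool      # True when mains power has been lost
    minutes_left: float   # estimated battery runtime remaining, in minutes


# Threshold taken from the text: notify at 3 minutes of battery life.
SHUTDOWN_THRESHOLD_MINUTES = 3.0


def should_shut_down(reading: UpsReading,
                     threshold: float = SHUTDOWN_THRESHOLD_MINUTES) -> bool:
    """Return True when a graceful shutdown should begin."""
    # While mains power is present, never shut down, whatever the estimate says.
    if not reading.on_battery:
        return False
    # On battery: shut down once the remaining runtime reaches the threshold.
    return reading.minutes_left <= threshold


if __name__ == "__main__":
    samples = [
        UpsReading(on_battery=False, minutes_left=25.0),
        UpsReading(on_battery=True, minutes_left=12.5),
        UpsReading(on_battery=True, minutes_left=3.0),
    ]
    for sample in samples:
        print(sample, "->", should_shut_down(sample))
```

Because a UPS can report that it is OK and still drop the load, a rule like this is only as good as the UPS behind it, which is why the quarterly check of the UPS still matters.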

Hard drive damage can destroy your hard disk drive in an instant, and recovery of important information may be difficult. Backing up your data and using simple preventive techniques can help to save you and your customers from the headache of a hard drive failure. IPitomy makes available an Emergency Recovery Dongle in case of hard drive failure due to corruption of the boot sector on the hard disk. The dongle can be created by following some simple online instructions in the IPitomy Wiki. The Emergency Recovery Dongle will completely load a fresh new install of IPitomy on your hard drive. If the drive was the victim of data corruption, this will fix it. You will need to have a backup to restore all of the original programming, voice messages, music on hold files and other data back on the drive once it is rebuilt. If the hard drive is physically damaged, it will need to be replaced. Back up the data often and automatically to a location outside of the server to assure that the system can be fully restored quickly and with minimal downtime for the customer.
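
As an illustration of keeping a copy of the backup outside of the server, the following is a minimal sketch that copies a backup file to another location and confirms the copy by comparing SHA-256 checksums. It assumes the backup has already been produced as a single file and that the off-server location is reachable as an ordinary directory (for example a mounted network share). The function names and paths are hypothetical and are not part of IPitomy's backup feature.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_backup_off_server(backup_file: Path, destination_dir: Path) -> Path:
    """Copy backup_file into destination_dir and verify the copy.

    Raises OSError if the checksum of the copy does not match the original.
    """
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / backup_file.name
    shutil.copy2(backup_file, target)  # copies contents and file metadata

    # A copy that cannot be read back identically is not a usable backup.
    if sha256_of(backup_file) != sha256_of(target):
        target.unlink()
        raise OSError(f"checksum mismatch after copying {backup_file} to {target}")
    return target


# Example call (hypothetical paths):
# copy_backup_off_server(Path("/tmp/pbx-backup.tar"), Path("/mnt/offsite/pbx"))
```

Running something like this on a schedule, rather than by hand, is what makes the backup "often and automatic" in the sense described above.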

A quick note regarding Solid State Drives (SSDs)

In a recent test comparing SSDs to standard hard drives, the results were not encouraging. The tests were performed on 15 drives. Of the 15 drives (10 different models, from five vendors), only one drive model, from one vendor, had no failures of any sort. One device failed completely (SSD #1), while one-third of SSD #3 became unusable due to metadata corruption. The other SSDs all exhibited various types of data corruption when they unexpectedly lost power, including the high-end enterprise SSDs with SLC NAND and supercapacitors. According to the research team, part of the problem is that virtually none of the devices actually behave as expected under fault conditions. While all the drives claim to use ECC RAM, for example, many exhibited single-bit errors of the kind that ECC is meant to prevent. While one of the two included hard drives also developed errors, the HDDs are both far cheaper and showed no sign of the disastrous failures that characterized the SSDs. SSDs do not offer much protection from power failure. The technology is improving daily, and IPitomy is watching for performance improvements to SSDs with an eye toward incorporating them into the products as an option once the testing comes back with more positive results.

USB Install Dongle

As long as you have a backup of the system, the above process will allow you to restore the boot sector and software of a hard drive and reload. Then it is simply a matter of restoring your backup. If the drive is actually in a failure state, you can also simply purchase a new drive and reload it using this same process. We have found that in more than 70% of cases where an actual drive failure was suspected, the problem proved to be only boot sector corruption, and the drive was actually fine and able to be returned to service. If you need assistance with the process, feel free to contact Technical Support.