Bones Break… so do RAID 5 Arrays – Data Recovery for a Physiotherapist Practice


Last week, we got a call from a Dublin physiotherapist’s practice. Their Dell PowerEdge server, configured in RAID 5, had failed.

Their I.T. support technician identified the problem immediately. However, data recovery from a RAID 5 server was unknown territory for him. For this blog post, here is an abridged version of the RAID recovery process we used.

For the recovery we decided to use mdadm, a powerful Linux-based RAID utility. A good knowledge of the Linux command line and in-depth experience of this tool are essential prerequisites for its operation.

The first step in the recovery process was to determine the status of the server’s drives in situ.

We used the following command on every disk in the array:

            mdadm --examine

We were able to determine that drives /dev/sdc1 and /dev/sdd1 had failed (sdc1 being in worse condition). In other words, mdadm revealed that this RAID 5 had experienced a double-disk failure. We then carefully labelled each drive and removed them from the server. Then, using a specialised hardware disk imager, we imaged the disks. This meant that we would be working on copies of the disks rather than the originals. In the unlikely event of the data recovery process being unsuccessful, the original configuration and data, as we received it, would still be intact.
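To illustrate the idea behind imaging a failing disk, here is a minimal Python sketch. It is not the hardware imager we used; the `FlakySource` class, the block size and the zero-fill policy are all assumptions made for illustration. The point it shows is that an imager copies block by block, and when a block cannot be read it records the offset and moves on rather than aborting.

```python
import io

def image_device(src, dst, total_size, block_size=4096):
    """Copy src to dst block by block; zero-fill and record unreadable blocks."""
    bad_offsets = []
    for offset in range(0, total_size, block_size):
        length = min(block_size, total_size - offset)
        try:
            src.seek(offset)
            chunk = src.read(length)
            if len(chunk) != length:
                raise OSError("short read")
        except OSError:
            # Keep the image the same size as the source and note the gap
            chunk = b"\x00" * length
            bad_offsets.append(offset)
        dst.seek(offset)
        dst.write(chunk)
    return bad_offsets

class FlakySource(io.BytesIO):
    """Hypothetical stand-in for a failing disk: reads at bad offsets raise OSError."""
    def __init__(self, data, bad_offsets):
        super().__init__(data)
        self.bad_offsets = set(bad_offsets)

    def read(self, size=-1):
        if self.tell() in self.bad_offsets:
            raise OSError("unreadable sector")
        return super().read(size)

source = FlakySource(bytes(range(16)), bad_offsets=[4])
image = io.BytesIO()
bad = image_device(source, image, total_size=16, block_size=4)
print(bad, image.getvalue().hex())
```

The list of bad offsets is the useful by-product: it tells the technician exactly which parts of the copy cannot be trusted.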

The imaging process completed successfully, and we put the imaged drives into the server. With all the prep work completed, it was time to take the RAID array “offline”, which can be achieved with the “mdadm --stop” command. The last thing we wanted was for the RAID rebuilding process to start using a failed disk in bad condition (e.g. /dev/sdc1). To prevent this from happening, we cleared the superblock of this drive using the command:

            mdadm --zero-superblock /dev/sdc1

Now, using the output we got from mdadm --examine, we used the following command to re-create the array:

            mdadm --verbose --create --metadata=0.90 /dev/md0 --chunk=128 --level=5 --raid-devices=5 /dev/sdd1 /dev/sde1 missing /dev/sda1 /dev/sdb1
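Why can a five-disk RAID 5 be re-created with one slot marked “missing”? Because the parity chunk in each stripe is the XOR of the data chunks, any single absent chunk can be recomputed from the survivors. The Python sketch below shows this for one stripe. The four-byte chunks are made-up sample data, and real arrays also rotate the parity position from stripe to stripe, which is left out here.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a hypothetical 5-disk RAID 5: four data chunks plus parity
data_chunks = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data_chunks)

# Simulate the "missing" slot: drop the third chunk and rebuild it from the rest
survivors = data_chunks[:2] + data_chunks[3:] + [parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_chunks[2])
```

This is also why a second failed disk is so serious: with two chunks absent from a stripe, the XOR no longer has a unique answer.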

We now had to check whether the array was aligned correctly using the command:

            e2fsck -B 4096 -n /dev/md0

With e2fsck it is always helpful to specify the block size before a scan to get a more accurate status of the array. We also used the -n option, which opens the file system read-only and answers “no” to every prompt, so that e2fsck could not attempt any repairs if the array turned out to be misaligned. (e2fsck should never be allowed to make changes on an array that is potentially misaligned.)

E2fsck completed successfully and correctly identified the status and alignment of the array.

It was now safe to proceed with the repair and fix command:

            e2fsck -B 4096 /dev/md0

Notice that no “-n” was used this time. The scan took around 5.5 hours to complete. It found over 26 inode errors, hundreds of group errors and some bitmap errors.

Now, it was time to add the first failed drive back into the array. We used the command:

            mdadm --manage /dev/md0 --add /dev/sdc1

The RAID array now began to rebuild. After a couple of hours, the RAID 5 was totally re-created, albeit in degraded mode. But the volume was mountable again and all data was now accessible.

The client had over 4 years of Sage Micropay and Sage 50 accounts files on the server. In addition, they had over 6 years’ worth of PhysioTools data files. This is a software package which they used to create customised exercise regimes for their patients. Reconstructing accounts and staff payslips would have been very time-consuming and costly. Re-creating patient exercise regimes would have placed a huge time burden on their staff. Moreover, it would probably have been damaging to their professional reputation if they had to inform patients that their customised exercise regimes had been “lost”.

We advised the client on some best-practice back-up strategies so they could prevent data loss in the future. It is deeply satisfying to help a customer like this when the “plan B” option would have been so disruptive for them. They could now get back to helping their patients with minimum downtime to their business.


 

The mystery of the continually degrading RAID 5 array



A couple of weeks ago, an I.T. support administrator for a Dublin finance company called us. He was in a spot of bother. The previous week, the RAID array on their HP ProLiant server had failed. Luckily, they had a complete back-up and no data had been lost. The I.T. admin decided to replace the four Western Digital (WD2500YS) enterprise S-ATA disks, which had been set up in a RAID 5 configuration, with four Western Digital Caviar Green disks (WD10EZRX). Using S-ATA to SAS adaptors, he connected them to the HP Smart Array controller card.

He rebuilt the server and re-installed Windows Server 2008. But three days later, the server was down… again. In this short space of time, the RAID array had changed from “normal” to “degraded” status. He ran diagnostics on the disks. All of them passed. He suspected that the Smart Array controller card was at fault. He had a redundant server in his office using the same model of RAID controller card. He removed the card, installed it in the problematic server and, for a second time, rebuilt the array. He tested it for a couple of hours. It worked fine. Then, just as he was about to start the data transfer process from the local backup, he rebooted the server. The dreaded “degraded” message appeared on the screen again.

Being a past customer of Drive Rescue data recovery, the I.T. admin telephoned us for advice about the mysterious problem. After his description of the problem, we had a fairly good inkling as to what the cause might be. But inklings and assumptions are dangerous. Most of the great technological failures of mankind (nuclear power plant explosions, aircraft disasters, etc.) can be traced to someone, somewhere making a wrong assumption. The same applies to the data recovery process. Good data recovery methodology is not, and never has been, based on assumptions. We asked him to email us his server event logs, hard drive model numbers and the exact model of RAID controller. After we had looked at his server logs, the specs of his controller card and the model of hard disks used, it was becoming clearer what the root cause of the problem might be.

 

He got the disks delivered to us and we tested them using our own equipment. His Western Digital Green disks were indeed perfectly healthy. The problem with a lot of WD Green disks (and other non-enterprise-class disks) is that when they are used in a server or NAS, the RAID controller can erroneously detect them as faulty. The reason for this is quite simple. In some RAID setups, if the controller card detects that a disk is taking too long to respond, it simply drops it out of the array. But it is normal for error recovery in some non-enterprise / non-NAS classified disks to take approximately 8 seconds or longer to complete, during which time the disk does not respond. With error recovery control (ERC) enabled on a disk, error recovery is usually limited to 8 seconds or under. (For example, with Hitachi-branded disks ERC is limited to 7 seconds.) Once that limit is reached, the disk reports the error to the controller instead of continuing to retry. This means the RAID controller will be less likely to report ERC-related false positives.
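The interaction between a disk’s error recovery and a controller’s patience can be modelled in a few lines of Python. This is a toy model only: the 8-second controller timeout, the function name and the three outcome strings are assumptions for illustration, not the behaviour of any particular controller firmware.

```python
CONTROLLER_TIMEOUT_S = 8.0  # assumed controller patience, for illustration only

def controller_verdict(recovery_time_s, erc_limit_s=None):
    """Return how a hypothetical RAID controller sees one slow read."""
    # With ERC, the disk stops retrying and reports the error at its limit
    if erc_limit_s is None:
        response_time = recovery_time_s
    else:
        response_time = min(recovery_time_s, erc_limit_s)

    if response_time > CONTROLLER_TIMEOUT_S:
        return "disk dropped from array"
    if erc_limit_s is not None and recovery_time_s > erc_limit_s:
        return "read error reported; sector rebuilt from parity"
    return "read succeeded"

print(controller_verdict(20.0))                   # desktop disk, long internal retry
print(controller_verdict(20.0, erc_limit_s=7.0))  # same fault on a disk with ERC
print(controller_verdict(2.0))                    # ordinary read
```

In the second case the disk is just as “unwell” as in the first, but because it answers within the controller’s window, the array handles a single bad sector instead of losing a whole member disk.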

In this case, the Smart Array RAID controller, commonly used in HP ProLiant servers, was detecting some of these disks as faulty when they were not. The most common type of error recovery control used by Western Digital is TLER (Time Limited Error Recovery). Most WD Caviar Green drives do not have this function. WD Red (NAS disks) and WD enterprise-class disks do have error recovery control.

RAID controllers (especially dedicated hardware controllers like those from LSI, PERC etc.) are very sensitive to read / write time delays. When a hard disk does not use error recovery control, a RAID controller will often report false positives as to the status of the array or, as happened in this case, the controller will simply drop the “defective” disk out of the array.

Enterprise-class disks (such as the WD Caviar RE2 and RE2-GP) or disks made specifically for NAS devices (WD Red) have error recovery control enabled by default.

In this case, the I.T. admin replaced the WD Caviar Green disks with four 1TB WD RE SAS drives. He then rebuilt the RAID 5 array.

Yesterday, we got a nice email from him. The server has been running smoothly ever since. He has rebooted it a couple of times. The event logs are free from disk errors. He no longer has to worry about the uncertainty of the company’s server continually degrading.  He can even sleep more soundly at night.

Firmware Failure on Western Digital Blue – 1 TB Drive

Firmware is low-level software stored on the PCB and System Area of a hard disk drive. It provides the lowest level of direct hardware control; one can think of it as the disk’s operating system. It contains the manufacturer’s parameters needed to initialise the disk and the servo-adaptive information needed for the drive to operate smoothly. Usually, hard drive firmware also holds the disk’s model number, date of production, error logs, and the defect tables (the P-List and the G-List). The firmware can typically be found on the ROM chip of the PCB and on the System Area of the drive platters.

In early hard drives (such as those produced during the 1990s), firmware was read-only. Nowadays, firmware is stored on EEPROM chips, which means it can be modified by data recovery professionals using specialist firmware emulator equipment with EEPROM read / write functionality.

Fig. 1 – Hard Disk Layout. The firmware is kept totally separate from the user data.

The System Area (usually before LBA 0) also stores firmware information. Typically, it contains the P-List and the G-List. Every hard disk will contain a small number of defective sectors before it even leaves the factory. These are recorded in the P-List. When the disk is put into everyday use, more defects will develop on it. These “grown defects” are recorded in the G-List (grown defect list). For example, if the disk develops a bad sector, the firmware will add it to the G-List and the sector will subsequently be remapped to the reserved area.

Fig. 2 – P-List Remapping

Fig. 3 – G-List Remapping. Bad sectors are remapped to the reserved area. This type of error correction has been so successful for the storage industry that it is even emulated on the most sophisticated solid-state drives.
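The two defect lists can be pictured as part of the drive’s translator, the component that turns a logical block address (LBA) into a physical sector. The Python sketch below is a simplified model under stated assumptions: factory defects are simply skipped over (often called sector slipping), grown defects are redirected to spare sectors, and the class and method names are hypothetical rather than taken from any vendor’s firmware.

```python
class Translator:
    """Toy LBA translator: P-list sectors are skipped, G-list sectors are remapped."""

    def __init__(self, p_list, spare_start):
        self.p_list = sorted(p_list)   # factory defects (physical sector numbers)
        self.g_list = {}               # grown defects: LBA -> spare physical sector
        self.next_spare = spare_start  # first sector of the reserved area

    def _slipped(self, lba):
        # Step over factory-defective sectors so they never receive an LBA
        physical = lba
        for bad in self.p_list:
            if bad <= physical:
                physical += 1
        return physical

    def mark_grown_defect(self, lba):
        # A sector that failed in service is redirected to the next spare
        self.g_list[lba] = self.next_spare
        self.next_spare += 1

    def physical_sector(self, lba):
        return self.g_list.get(lba, self._slipped(lba))


translator = Translator(p_list=[2, 5], spare_start=1000)
print([translator.physical_sector(lba) for lba in range(6)])
translator.mark_grown_defect(3)
print(translator.physical_sector(3))
```

The sketch also hints at why a corrupt defect list is so disruptive: if the list is wrong, every LBA after the first bad entry translates to the wrong physical sector, and the host can no longer find the partition table or anything else.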

The system of bad-sector mapping and relocation is all very clever. But sometimes, due to physical damage to the EEPROM chip or the System Area, or corruption of the firmware code, the partition table(s) will become invisible to the host system. A very common occurrence of this type of failure happened with Seagate’s 11th generation of Barracuda series drives. However, it is not only on Seagate drives that firmware failure can occur. Firmware failure is also commonly seen on the Western Digital WD5000AAKS family of drives.

Thankfully, most disk firmware failures can be recovered from successfully. Experience, a sound knowledge of EEPROM firmware, specialised firmware repair equipment and a proper methodology often end in fruitful results. Take for instance a case we were dealing with last week: a firmware failure on a 1TB Western Digital “Blue” drive. We received the drive from a Dublin architects’ office. One of the partners in the practice had all of his Archicad drawings from the last 4 months on his 4-month-old laptop. These were projects he was currently working on. He was incredulous when his I.T. support company told him that the hard drive in his relatively new laptop had failed.


We performed diagnosis on his drive and it was immediately apparent that the firmware on the disk’s System Area had gone corrupt. Once diagnosis had been completed, we contacted the client for formal go-ahead to proceed with the recovery. Once formal approval had been received, we first backed up the firmware modules on the EEPROM chip and the System Area to an external source.

Once a complete backup had been made, it was safe to manipulate or edit the firmware. We used our specialised firmware recovery equipment to check the G-List and P-List. The P-List was corrupt. This was the root cause of the problem. After some careful editing of the faulty P-List and a regeneration of the translator, we were almost done. We switched the drive’s power supply off and on to make the drive initialise itself again with the new parameters. Success… the data was now accessible again. The recovered data was transferred onto a 1 terabyte external hard drive for delivery to the customer. All the files the client needed (Archicad, Word, Excel and .PDF files) were recovered successfully.

 

 

Data Recovery from Kingston DataTraveler Flash Drive


The owner of a small logistics business in Athlone recently contacted us. His business distributes freight between Athlone and Dublin, Cork, Galway and Limerick five days a week. Last week the hard drive in his laptop experienced catastrophic failure. However, two weeks previously, he had made a backup of his “My Documents” folder and his Sage accounts file onto his USB stick. He connected his Kingston USB memory stick to his desktop computer but it was not recognised. He got the message “Please insert a disk into drive E”. He believed that the age of his desktop computer was the problem. The following day he bought a new laptop. He connected his USB stick to one of its ports but got the same message. He brought the USB stick to his I.T. support provider so they could examine it. Unfortunately, they were not able to retrieve any data from it. They recommended that he contact Drive Rescue.

Once we had received his USB stick, we tried accessing it using our own systems but we got the same error message. The first step in the data recovery process for this case was to open the drive in our clean room and physically inspect the inside of the device. The PCB tracks and diodes all appeared to be physically okay. We then tested the diodes using a multimeter. They all appeared to be working. The device was showing all the symptoms of a failed controller. While the NAND chip stores the actual user data, the controller chip stores a software module which handles ECC and wear-levelling for the drive. Error Correction Code (ECC) is an algorithm built into USB memory sticks (and most NAND memory devices) which helps to fix bit errors that occur in the stored data during the life of the drive. Meanwhile, the wear-levelling function helps writes to be evenly distributed throughout the memory cells of the device.
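To give a feel for how an error correction code repairs a flipped bit, here is a small Python sketch of the classic Hamming(7,4) code. It is a teaching stand-in: flash controllers use much stronger codes, and nothing here reflects the actual algorithm inside the controller in this case. Four data bits are stored alongside three parity bits, and the pattern of failed parity checks points directly at the damaged bit.

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    # Codeword positions 1..7 hold: p1, p2, d1, p3, d2, d3, d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_pos = s1 + 2 * s2 + 4 * s3  # 1-based position of the bad bit, 0 if none
    if error_pos:
        c[error_pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

data = [1, 0, 1, 1]
stored = hamming74_encode(data)
stored[4] ^= 1  # simulate a single bit flipping inside the memory cell
print(hamming74_decode(stored) == data)
```

When the controller that performs this bookkeeping dies, the raw NAND contents still include the parity information, which is why the exact controller model matters so much later in the recovery.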


 

We used a hot-air rework station to remove the NAND memory chip from the PCB. This is an intricate and time-consuming job, as the temperature has to be hot enough to melt the solder but not so hot as to damage the actual memory chip. During the process, it is also important to note whether lead-free solder was used to join the chip to the PCB. Lead-free solder usually requires a higher temperature to remove than a leaded solder joint. Different temperatures will also have to be used depending on the chip size and the type of PCB. Using our precision tools, experience and a steady pair of hands, we successfully removed the NAND memory chip from its PCB.


In order to read the data from the extracted NAND chip, our emulator would need the exact controller information as used on the original USB stick. While the controller chips on USB sticks are usually easy to spot, their exact model number is sometimes hard to identify because often you will find only a part number and a manufacturer logo on them. Common controller chips found on USB memory sticks include those from manufacturers such as ALCOR, KTC, Silicon Motion (SMI), Ramos, Phison, OTI and SSS. In this case, the controller in use was an SSS (Solid State System) with model number 6692.

Once the exact controller module had been uploaded to our emulator, we determined the memory block sizes and erase block sizes of the data. After some manual reconstruction of the FAT32 file system using a hex editor, we were finally able to see the client’s data. We recovered his Sage 50 accounts file, along with all of his Word, Excel and scanned document folders. For the client’s peace of mind, we invited him to log into our systems remotely to view his recovered files. Everything that he needed was there. The delighted customer was sent his recovered data on a new USB memory stick. This time, he will be backing up his data to three different places instead of just two.
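Much of the manual FAT32 work in a hex editor comes down to reading a handful of fields from the boot sector and checking that they make sense. The Python sketch below shows where those fields sit, using the standard FAT32 boot sector offsets. The function name and the synthetic demo sector are ours, invented for illustration; the values in it are not from the client’s device.

```python
import struct

def parse_fat32_boot_sector(sector):
    """Pull the key layout fields out of a 512-byte FAT32 boot sector."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature")
    # Offsets 11..16: bytes/sector, sectors/cluster, reserved sectors, FAT count
    bytes_per_sector, sectors_per_cluster, reserved, fat_count = struct.unpack_from(
        "<HBHB", sector, 11
    )
    (sectors_per_fat,) = struct.unpack_from("<I", sector, 36)
    (root_cluster,) = struct.unpack_from("<I", sector, 44)
    return {
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "reserved_sectors": reserved,
        "fat_count": fat_count,
        "sectors_per_fat": sectors_per_fat,
        "root_cluster": root_cluster,
        # The data region (cluster 2) starts after the reserved area and the FATs
        "first_data_sector": reserved + fat_count * sectors_per_fat,
    }

# Build a synthetic boot sector with made-up values to exercise the parser
demo = bytearray(512)
struct.pack_into("<HBHB", demo, 11, 512, 8, 32, 2)
struct.pack_into("<I", demo, 36, 1000)
struct.pack_into("<I", demo, 44, 2)
demo[510:512] = b"\x55\xaa"

info = parse_fat32_boot_sector(bytes(demo))
print(info["first_data_sector"])
```

If these values are damaged, a technician can often infer them from the surviving structures (for example, by locating the two copies of the FAT) and write corrected values back into a copy of the image.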

 

NAS Data Recovery from a RAID 1 Buffalo Linkstation


NAS devices have never been so popular. They are compact, relatively easy to administer and can often perform the same storage functions as a traditional server.

There are numerous brands of NAS widely available in the Irish market to cater for most storage requirements and budgets. Some of these manufacturers include Drobo, LaCie, Buffalo, Iomega, ReadyNAS (from Netgear), Seagate, Western Digital, QNAP, ZyXEL, Synology and G-Technology. All of these manufacturers have a wide variety of models available with different capacities, I/O specs, RAID levels and file systems.

One popular file system commonly used in NAS servers is XFS (developed by Silicon Graphics in 1993). It is a powerful, fast and scalable file system which can handle a whopping 8 exabytes of data. It is not surprising therefore that organisations such as the CERN research laboratory and NASA’s supercomputing division use XFS for many of their projects requiring high-capacity data storage.

One of the main reasons for the growing popularity of XFS is its speed. It is significantly faster than EXT3 and, for operations which rely on sequential buffered writes, it will also be faster than EXT4. How does it do this? XFS cleverly divides the volume into Allocation Groups, which allows multiple I/O requests to be serviced in parallel. Moreover, XFS has another trick up its sleeve. It supports a feature known as Direct I/O, which means data can be transferred directly between the disk and an application’s memory, bypassing the operating system’s cache.

Clever stuff, but the dexterity of the XFS file system does not end there. XFS does not just offer great IOPS rates; it also supports sparse files. A lot of files contain long runs of zeros. XFS, hating to see wasted space, records these runs in metadata instead of writing the zeros to disk. When the file is read, the zeros are supplied again and the file appears at its original size. In addition, XFS supports online defragmentation: fragmented data can be rearranged into contiguous blocks while the volume is mounted and in use. Both of these attributes mean that space on XFS-formatted drives is used very efficiently.
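The zero-skipping behaviour described above is generally known as sparse-file support, and it is easy to observe from Python on any file system that offers it (XFS is one of several). The sketch below creates a file with a large logical size but writes only a single byte, then compares the logical size with the space actually allocated. The file name and the 64 MB size are arbitrary choices, and on a file system without sparse-file support the two numbers will simply be close to each other.

```python
import os
import tempfile

def make_sparse_file(path, logical_size):
    """Create a file of logical_size bytes without writing the zeros in between."""
    with open(path, "wb") as f:
        f.seek(logical_size - 1)  # jump past the "hole"
        f.write(b"\x00")          # a single real byte marks the end of the file

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sparse.bin")
    make_sparse_file(path, 64 * 1024 * 1024)
    info = os.stat(path)
    logical_size = info.st_size
    # st_blocks counts 512-byte units actually allocated (POSIX systems only)
    allocated = getattr(info, "st_blocks", 0) * 512
    print("logical size:", logical_size, "allocated:", allocated)
```

On a sparse-aware file system the allocated figure is typically a few kilobytes, even though any program reading the file sees 64 MB of zeros.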

So, taking into account its great storage capacity, speed and efficient use of space, you might be thinking: “if Carlsberg did file systems…” it would be XFS? Maybe, but XFS does have some drawbacks. For example, XFS does not handle sudden power loss very well. Whilst it is a journaled file system, the journal is designed to keep the file system’s metadata consistent, not to offer redundancy for the contents of your files. If there is a power failure on a NAS or server running XFS, data that was being written at the time may be lost for good (hence, a UPS is indispensable when using XFS). Moreover, like any file system, XFS can go corrupt.

Take for instance a case we were dealing with last week. A multinational medical device company from Limerick contacted us. In one of their research laboratories, their Buffalo Linkstation NAS device became inaccessible. Their IT administrators removed the hard drives (two Seagate Barracuda 2TB, model ST2000DM001) and attached them to a Linux system. They ran the “xfs_repair” command. This is a fairly standard repair command for XFS which, unlike “fsck”, is not invoked automatically upon system start-up. However, this operation was continually aborting for them. Errors were being returned about corruption of the primary superblock. Not to be defeated so easily, they unmounted the volume again and ran “xfs_repair -n”, which checks the file system in no-modify mode and only reports what it finds. Alas, this operation was also aborting for them.

They sent the Buffalo NAS box to our Dublin lab. We removed its drives and performed diagnostics on them. Drive 0 had several thousand bad sectors on it. This explained why the “xfs_repair” and “xfs_repair -n” commands were unable to complete. As always, in order to maintain the integrity of the data, we imaged the drives using a hardware imager designed for data recovery purposes. This means we can perform data recovery using a copy of the volume rather than the original drives themselves.

Once imaging had completed, we examined the inode maps on the volume. We removed all duplicate blocks. We cleared the lost and found directory. Then we rebuilt the volume’s trees and headers. Any disconnected inodes which we found, we moved to the lost and found folder. The volume was then mounted. We now had a workable volume with which we could rebuild their RAID array. From our examination of the volume’s stripe size, RAID header and parity, we determined they were using RAID 1. Their array was rebuilt. The rebuilt volume was then mounted and we saw what looked like a valid structure and folder directory.

We then invited our client to log in to our systems remotely to view and verify their retrieved files. The client was delighted. Every file they needed had been successfully recovered. Without this data, their R&D team would have had to replicate months of work. Their recovered data was extracted onto a high-capacity external USB hard drive and, along with their Buffalo NAS box and original drives, was dispatched back to Limerick.

The lesson: even the best file systems can go corrupt if the underlying hardware starts to fail. NAS devices are not bulletproof. The data stored on them should be backed up onto another drive. Better still, many NAS manufacturers like QNAP and Synology have options in their software to back up directly to the Cloud.

Warning: Apple Mavericks OS 10.9 – Mysterious Partition Loss on External Drives


We have been helping a lot of users recently recover data from external and NAS drives. A lot of these cases had two factors in common. 1) There was sudden partition loss and 2) the user had recently upgraded to Apple’s new Mavericks operating system.

We immediately began to surmise that maybe Mavericks was becoming a little too maverick in the way that it managed external drives.

The first instances of this problem began appearing with Apple users who had Western Digital external hard drives connected to their newly installed operating system. For example, one user, after connecting his MyBook to Mavericks, was shocked to discover his HFS+ partition had disappeared overnight. In its place he got one EFI partition and one called MyBook (which incidentally was completely blank). Three years of business documents and family photos had disappeared. (We performed data recovery on his drive and successfully restored all his files.)

We were beginning to suspect that maybe Western Digital’s ancillary drive software (the software which comes free with external hard drives) had an incompatibility with Mavericks which was causing this problem. Western Digital even issued a press release advising users to remove their Drive Manager and SmartWare software.
We have tried the following Mac commands and they work pretty well in uninstalling WD’s Drive Manager service:

sudo launchctl unload /Library/LaunchDaemons/com.wdc.WDDMservice.plist
sudo rm -R /Library/LaunchDaemons/com.wdc.drivemanagerservice.plist
sudo rm -R /var/tmp/com.WD.WDDriveManagerService

A Twist in the Tale

But then, an interesting twist developed. We had users of LaCie and Buffalo drives who were reporting similar mysterious partition loss. For example, the owner of a LaCie Rugged drive in the south-east of Ireland had the primary partition on his device disappear. Meanwhile, the owner of a 16 TB Buffalo RAID 5 NAS device in Dublin had his array turn into individual 4 TB disks. (Our RAID data recovery process restored all his data, mainly Sage Accounts and AutoDesk files.)

It is really surprising that Apple did not pick up on this bug in the beta-testing of their new operating system. I hope this is not a portent of things to come with Apple and that they do not go down the route of Microsoft in rushing sloppily coded software to market. Mavericks is one of Apple’s first “free” operating systems. But most users would rather pay for a quality product than have a free product that makes their data do a Houdini-like disappearance.

Our advice is to hold off updating to Mavericks until Apple releases an update which addresses this serious problem.

Data Recovery from a CCTV DVR Hard Drive


CCTV is almost everywhere these days. George Orwell’s prediction in his novel “1984” that surveillance would be ubiquitous has proved to be remarkably prescient. CCTV is often watching us wherever we go, whether we like it or not. Walking down the streets of Ireland’s cities and towns, CCTV cameras are easy to spot. Walk into any supermarket and you’re probably being watched by ceiling-mounted dome cameras. Walk into any big-chain fast-food restaurant and more discreet cameras are watching you, whether you realise it or not. The arguments for and against this type of surveillance can be discussed ad infinitum.

Behind even the most sophisticated CCTV systems, the humble hard drive is often used as the primary storage device. VHS tapes used to be used for analogue-based systems but were made redundant by DVRs (digital video recorders), which were more convenient to use and better suited to storing digital footage. The hard drive, while it removes the need to be continually “changing the tape”, can often be the weakest link when it comes to failure of these systems.

One such case we were dealing with during the week illustrates this. A well known multinational logistics firm contacted us about failure of one of their DVR’s hard drives. Recently, there was a theft from their reception area by (possibly) a member of the general public. Their staff were in the process of trying to extract footage from the device onto a USB drive. During this process the DVR froze several times. At this stage they were suspecting something was seriously wrong with the device. Eventually, they turned it off believing that a reboot of the system might resolve the problem. However, upon turning it back on, it would not initialise. To their horror, a red LED appeared on the system indicating “disk fault”. It was at this juncture that they decided to bring the failed DVR system to us.

The data recovery of footage from CCTV DVR systems poses some unique challenges not found with other cases. For example:

1) The file systems used will often be Ext2 or Ext3.

2) The encoding of video footage will often not be standard. It is extremely seldom that CCTV systems use formats such as .AVI or .MOV. The encoding of footage will often be in a proprietary format.

3) The time-stamp on recovered files is often crucial. The data recovery technician must use a recovery technique which endeavours to keep all file/folder time-stamps intact. Moreover, the data recovery technician must ascertain the time/date settings on the DVR device itself. If the time on the host device is not correct or has not been adjusted for DST (daylight saving time) the timestamps of the recovered folders/files will also be wrong.
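The timestamp issue in point 3 can be handled by measuring how far the DVR’s clock is from the true time and reporting a corrected time alongside each original timestamp, rather than altering the recovered files. The Python sketch below shows the arithmetic. All of the dates and times are invented for illustration, and the one-hour offset stands for a DVR that was never adjusted for daylight saving time.

```python
from datetime import datetime

def clock_offset(dvr_clock_reading, reference_time):
    """How much must be added to DVR time to get true local time."""
    return reference_time - dvr_clock_reading

def correct_timestamps(recorded, offset):
    """Return (original, corrected) pairs; the originals are never altered."""
    return [(stamp, stamp + offset) for stamp in recorded]

# Hypothetical check at the bench: the DVR shows 13:00 when it is really 14:00
dvr_now = datetime(2013, 10, 1, 13, 0)
true_now = datetime(2013, 10, 1, 14, 0)
offset = clock_offset(dvr_now, true_now)

recovered = [datetime(2013, 9, 30, 9, 15), datetime(2013, 9, 30, 17, 42)]
for original, corrected in correct_timestamps(recovered, offset):
    print(original, "->", corrected)
```

A single fixed offset is only valid if the DVR’s clock was wrong by the same amount for the whole recording period, which is why the device’s settings need to be checked before the drive is removed.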

We took delivery of the DVR system. Our technicians checked the sequence of warning lights on the device to double-check that there was not a problem with the device itself. The time and date were also checked on the DVR. Once these had been verified, the hard drive (a 2TB Seagate Barracuda S-ATA drive) was disconnected from the DVR’s motherboard. Our diagnostics revealed that the drive had numerous bad sectors and corrupt firmware on the Service Area.

Our technicians first imaged the drive. This is a routine safety procedure to make sure that the data is recovered in the safest way possible. Once this had been completed, our data recovery technicians restored the Service Area firmware modules on the drive. After this had been completed, a surface scan was performed. All the bad sectors were relocated and the drive’s defect tables (the G-List) were then updated.

This brought us to a point where we now had an imaged drive which was functioning properly. The video footage was appearing in .stream format (with all timestamps intact) but, alas, it was completely unreadable to our systems. We contacted the European headquarters of the CCTV manufacturer, who very kindly emailed us their proprietary codec pack for reading the video footage. This was installed and, after a quick reboot, all of the files became readable on our system. All of the recovered footage was delivered to the client on a USB drive. Big Brother is watching you but he might not be backing up his data…

Report from Gigaom Structure Europe Conference – London


Last week Drive Rescue had the pleasure of attending the Gigaom Structure Europe conference in London. There were lots of interesting speakers from a wide variety of I.T. disciplines and industry sectors. There were some great talks from the CTOs of Ericsson, Netflix and BMW.

From a data storage technologies perspective, perhaps the most interesting speakers were from Western Digital. WD have now acquired the hard drive division of Hitachi. However, due to competition regulations imposed by the EU and the Ministry of Commerce in China, they have had to keep the two hard drive operations as separate and competing entities (these being Western Digital and HGST).

Hard drive technology moves at a blistering pace. Over the years hard drive manufacturers have devised some very clever ways of increasing areal density (the amount of data that can be stored on a given area of disk platter). These have included PMR (perpendicular magnetic recording), shingled PMR and HAMR (heat-assisted magnetic recording). But, even using all of these technologies, the maximum capacity of hard drives in mass-scale production today is 4 terabytes. It is possible to produce drives over 5TB but they are not “consumer ready”. In other words, they can be manufactured, but they are not stable or reliable enough for mass production. So Western Digital researchers have thought outside the box. Their hard drive and production process researchers have found one more aspect of the drive’s architecture that can be tweaked to eke out even greater capacity. But this finding does not involve a new type of magnetic layering or a new type of read-write head. It involves removing the air from inside a hard drive and replacing it with helium.

Helium to the rescue

How can air restrict the amount of data that can be stored on a hard drive? Well, the platters of a hard drive spin at around 7200 or 10000 rpm. When these platters spin at such a high speed, the air inside the drive causes a lot of flutter and resistance, which makes it harder for the drive heads to do their job properly. (Think of it as a kind of turbulence.) If you replace this air with helium, which has only one seventh the density of air, flutter is substantially reduced. It will also mean that seven platters can be deployed instead of just five. This will mean Western Digital will be able to bring a 5.6 terabyte drive to market at some stage during 2014. For I.T. administrators and end-users alike, helium-filled drives will mean not only more capacity but also less power consumption and less heat generated.
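The 5.6 terabyte figure follows from simple arithmetic if we assume, and it is our assumption rather than something stated at the conference, that today’s 4 terabyte drive uses five platters of equal capacity and that the helium drive uses seven of the same platters:

```python
current_capacity_tb = 4.0  # largest mass-produced drive mentioned above
current_platters = 5       # assumed platter count of that air-filled drive
helium_platters = 7        # platter count quoted for the helium-filled design

per_platter_tb = current_capacity_tb / current_platters
helium_capacity_tb = per_platter_tb * helium_platters

print(round(per_platter_tb, 1), round(helium_capacity_tb, 1))
```

In other words, under these assumptions the gain comes entirely from fitting more platters into the same enclosure, not from storing more data on each platter.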

The idea of putting helium inside a hard drive is not entirely new. Storage researchers from Seagate have toyed with the idea before and rejected it. The real problem has always been: how do you hermetically seal helium into a hard drive and manufacture it on a mass scale? Well, according to Western Digital, they have finally found the solution. Alas, their representatives were not keen to divulge this proprietary process to attendees. “Even in 2014, we will only have one factory that will be able to do this and supply is going to be very restricted,” the attendees were told. The public reaction to helium-filled hard drives has so far been good, but Western Digital have already received comments from some doomsayers such as “people will die from helium poisoning”. These claims are unfounded. Helium is an inert and non-toxic gas. It is the same gas that is used to inflate balloons in theme parks. If it were poisonous, people would be dropping dead in Disneyland every day.

A three horse race

With only three main players left in the mechanical hard drive market (Western Digital, Seagate and Toshiba), it has become a three-horse race, and it will be interesting to see which manufacturer breaks the 4 terabyte ceiling first. Would Western Digital be better off just adding some flash storage to their 4TB drives? Will Seagate refine their HAMR technology even further to enable more capacity? Or will Toshiba be the dark horse and be first to market with a 5.6TB drive using a new technology from their venerable R&D labs? Who knows? It’s going to be an interesting few years ahead in the area of high-capacity drives.

How just one phone call led to data loss for one user…

Here at Drive Rescue we come across some unusual cases. It is one of the reasons why data recovery is such an interesting job. There is always a new and different challenge.

Last week was one such case. A lady who was finalising her PhD thesis was getting into her car. Her phone rang. She placed her Western Digital Passport external drive on the car’s roof. She continued with her phone conversation and then proceeded to sit behind the wheel. After a few minutes, the conversation ended and she started her ignition and drove off.

The fact that her external hard drive was still resting on her roof had unfortunately escaped her. She drove on for around one kilometre until she reached the motorway. Just as she arrived on the slip road of the motorway, she braked and saw a black object with a wire attached to it flying across her windscreen, bouncing onto the road and into the ditch. To her horror, she realised where she had put her hard drive half an hour previously. She drove a little further up to the hard shoulder, put her hazard lights on and started looking in the ditch. After around ten minutes of searching amongst the overgrown grass and ragwort, she saw a metallic object glistening in the undergrowth. Luckily, it was not a discarded Coke can but her hard drive. It had broken loose from its plastic enclosure. After some more searching, she found the plastic enclosure and the USB connection cable. She got into her car and headed home. She eagerly connected the drive to her computer, but to her dismay heard only a clicking noise. She phoned a friend who works in I.T. in the south of Ireland. He advised her to take it to a data recovery company and recommended Drive Rescue.

We first performed a media test on the drive. Two of the drive heads had failed. The whole head disk assembly would have to be replaced. We now needed to find an exact-match Head Disk Assembly (HDA) to transplant into the drive. After a lot of phone calls and emails, we found that one of our suppliers in Germany had the exact part in stock. We got it sent to us via express courier. The damaged drive was brought into our clean-room where the old Head Disk Assembly was carefully removed. The replacement Head Disk Assembly was then carefully inserted. It took another few hours before we were satisfied that the torque pressure applied to the HDA was perfect to ensure the precise “flying-height” needed by the drive heads. We then configured the drive’s servo-adaptive parameters as close as possible to the old configuration. If the servo-adaptive parameters are not “tuned” right, PRML (the type of read/write encoding used) will not function correctly and the data will not be read properly. Once we were satisfied that these were accurate, we imaged the drive. The imaging process took around 6 hours. Once this had been completed, we were able to check the data. It all looked perfect. We got the client to email us a list of important files as confirmation. Our recovery set had everything on the client’s “most wanted” list and more.

This accident could have happened to anyone. We are all human. We are living in the “connected age”, where we can get distracted from even the most routine of tasks. The PhD thesis (which took two years to complete) and accompanying scans of research documents were all recovered successfully. The recovered data was delivered to the client on a brand new USB drive. The lesson, as always, is: back up your data, expect the unexpected and never put your hard drive on the roof of a car!

Important Factors for a Successful Head Disk Assembly Swap

We recently performed a successful data recovery operation for a multinational bio-tech company based in south-west Ireland. In one of their laboratories, there was a desktop PC which experienced hard drive failure. Their laboratory staff had recently noticed the system getting slower and less responsive. Last week, it shut down on them and would not boot up again. They thought they had a full back-up of their files, but on further investigation, it transpired that their backup was a few months out-of-date. Their own I.T. administrators tried to extract data from the drive but, to their dismay, they heard the “click-of-death” sound the moment they connected it to another PC.

The client had used our data recovery service successfully before. They sent us the drive – a 3.5” 2TB Seagate Barracuda S-ATA drive. Our diagnosis revealed that it had 6 failed heads and also showed a lot of evidence of overheating. The increased temperatures which Ireland experienced this July probably did not help.

A head disk assembly swap is perhaps one of the most intricate data recovery operations to perform. It requires excellent theoretical knowledge of the workings of a magnetic hard drive, specialised data recovery tools, a Class-100 clean room, years of experience, patience and a steady hand! The recovery went smoothly and was a complete success. One hundred percent of their data was recovered.

Like any process, there are a number of factors which make the difference between a half-baked recovery and a recovery which is a complete success. For a head disk assembly swap, there are a number of variables which a competent data recovery technician will observe.

Firstly, the technician must acquire an exact-match donor part. For example, in this case, we already had an exact-match head disk assembly part (from an identical 1TB Seagate drive) in stock. This saved the client (and us) time. The part matched the original drive’s model number and revision number, and both drives were manufactured in the same month and year.
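The matching criteria above can be expressed as a simple check. This is only a sketch: the `DriveLabel` type, its field names and the sample values are hypothetical, and a real donor search usually involves further label details that we do not cover here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriveLabel:
    """Details read from a drive's label (hypothetical field names)."""
    model: str      # model number
    revision: str   # revision number
    mfg_year: int   # year of manufacture
    mfg_month: int  # month of manufacture (1-12)


def is_exact_match(patient: DriveLabel, donor: DriveLabel) -> bool:
    """True if the donor matches on model, revision and manufacture month/year."""
    return (
        patient.model == donor.model
        and patient.revision == donor.revision
        and patient.mfg_year == donor.mfg_year
        and patient.mfg_month == donor.mfg_month
    )


# Placeholder values for illustration only, not real part numbers.
patient = DriveLabel(model="MODEL-A", revision="REV-1", mfg_year=2012, mfg_month=3)
good_donor = DriveLabel(model="MODEL-A", revision="REV-1", mfg_year=2012, mfg_month=3)
bad_donor = DriveLabel(model="MODEL-A", revision="REV-2", mfg_year=2012, mfg_month=3)

print(is_exact_match(patient, good_donor))
print(is_exact_match(patient, bad_donor))
```

A donor that differs in even one of these fields, such as the revision number in the second example, is rejected.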

The donor heads should be removed carefully from the donor drive using the proper tools. The head disk assembly should be lifted away from the platters without actually touching them. In this case, we used a special spacer tool (customised for Seagate data recovery) to carefully remove the head disk assembly from the donor drive without any platter contact. Likewise, the old faulty heads were removed from the drive needing recovery using a similar process.

The new head-stack must be aligned in exactly the same position as the old one. If the HDA is off-kilter, this excessive head-to-platter “eccentricity” cannot be tracked out by the drive’s servo.

The centre of the platters should line up with one another perfectly. This can be helped by using platter alignment tools, but technician experience will be an even greater asset for a perfect alignment.

Lastly, it is very important during this type of data recovery operation that the donor head disk assembly is properly torqued. If the wrong amount of pressure is applied, the heads will be at a “flying height” that is either too low or too high. Heads that are flying too close to the platters risk touching, or worse still, scouring them. If the heads are too high, the read signal will be too attenuated for the drive’s pre-amplifier, and little or no data will be readable.
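To give a feel for why a small change in flying height matters so much, here is a rough sketch using the textbook Wallace spacing-loss approximation, in which the read signal drops by about 54.6 dB for every recorded wavelength of head-to-platter spacing. This is a general rule of thumb from magnetic recording theory and not a measurement from this recovery; the wavelength and spacing values below are hypothetical.

```python
import math


def spacing_loss_db(spacing_nm: float, wavelength_nm: float) -> float:
    """Wallace spacing loss in dB: 20*log10(e) * 2*pi * spacing / wavelength."""
    return 20 * math.log10(math.e) * 2 * math.pi * spacing_nm / wavelength_nm


# Hypothetical recorded wavelength, for illustration only.
WAVELENGTH_NM = 40.0

# Extra flying height (in nanometres) above the intended value.
for extra_height_nm in (0.0, 2.0, 5.0, 10.0):
    loss = spacing_loss_db(extra_height_nm, WAVELENGTH_NM)
    print(f"+{extra_height_nm:.0f} nm of flying height: about {loss:.1f} dB of signal lost")
```

Because the loss grows in proportion to the spacing when measured in decibels, the signal amplitude itself falls off exponentially as the heads fly higher.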

There are a lot more issues involved in a head stack replacement which go beyond the scope of one blog post. Technical processes and other minutiae of a recovery operation count for little if there are no results. The most important aspect of the data recovery process is the final outcome. In this case, the bio-tech firm got all their data retrieved, and it was extracted and delivered to them on an external USB drive.