One of the system administrators of a healthcare organisation recently contacted us.
They were decommissioning around 18 of their Dell laptops. For data security purposes, the administrator removed all the Crucial MX500 S-ATA SSDs from the systems and attempted to use the Crucial Storage Executive software (hosted on a desktop PC) to perform a SecureErase function on them. The only problem was that SecureErase would not execute on any of them. This left him in a bit of a pickle because simply formatting the SSDs using Windows Disk Management is not considered secure. There is a high probability that a “Windows format” is going to miss areas on the NAND flash of the SSD like the user space area, the overprovisioned space, the spare blocks and bad block locations. SecureErase is designed to get into all of these nooks and crannies.
He was beginning to think the problem was related to the TPM chips inside the Dell laptops and was not relishing the prospect of re-inserting all the SSDs. As a previous customer of Drive Rescue, he contacted us – did we have any suggestions?
Get the Sequence Right…
We did actually! This is a known problem with the Crucial Storage Executive software. Sometimes, the “PSID revert” utility has to be run before “Sanitize”. PSID revert involves reading the label of the disk and inputting the PSID code, as written on Crucial MX500 series SSDs, into the CSE software. Without following this sequence, the Sanitize (SecureErase) function will not work. This is just a quirk of the SSD management software.
This morning we got a nice Starbucks gift card in the post from the kindly systems admin who was very relieved to have found a quick and secure solution to this problem.
Successful data recovery from a hard drive which has been exposed to a residential, office or industrial fire depends on a number of factors. These include factors such as the level of exposure to the fire. It depends on the level of smoke particle ingress. It depends on whether the label has been burnt or not. It depends on whether the disk is a hard disk drive (HDD) or solid state drive (SSD). And recovery can also depend on how much exposure the disk had to fire suppression agents such as water.
Burnt disk labels – If you have an HDD or SSD damaged by fire, sometimes the biggest challenge can be a burnt label. The reason for this is simple. If you have an HDD with a fire-damaged PCB which is otherwise mechanically sound, its firmware can be emulated and the volume read using specialised data recovery equipment such as PC-3000. However, in order to emulate a disk’s firmware, you need to know the disk family and the model number. Without this information, firmware emulation cannot take place. Similarly, if an HDD involved in a fire requires a head-disk assembly (HDA) swap, it’s also imperative to know the model number because HDA swap operations need exact-match donor parts. Likewise, you might have a fire-damaged SSD which could be read using disk emulation, but you need to know the model first. You also need to know what controller chip the disk is using. We really wish disk manufacturers would use fire-retardant labels…
SSDs will survive a fire better than HDDs – The NAND chips on SSDs can survive temperatures of up to 300 degrees Celsius (controller chips are much more sensitive to heat though). In contrast, with HDDs exposed to temperatures of over 60 degrees Celsius, you will see bit errors start to multiply. Moreover, with HDDs exposed to fire, the disk-heads are liable to warp and to make contact with the platters due to the excessive heat.
The water damage incurred by sprinkler systems or fire crews can be worse than the damage incurred by the fire itself – This one surprises a lot of people, but water (used for fire suppression purposes) often does more damage to hard disks than the fire itself. Within a very short space of time, micro corrosion sets in on the PCB components (such as diodes, capacitors and tracks) causing short-circuits. These short circuits can prevent a disk from initialising.
Smoke Damage – Electro-mechanical hard disks are hermetically sealed units designed to block out any contaminated air. They use a rubber gasket to secure the seal between the chamber and the lid. Even in polluted industrial environments, this mechanism works well at keeping contaminants out. However, the intense heat of a fire can cause a disk’s rubber gasket to deform or melt, paving the way for the ingress of smoke particles. For the disk, this can be catastrophic. Smoke particles on the platters are the equivalent of rocks on a railway track. These particles can accumulate under the disk-heads, blocking the read/write signals and scouring the platter surface, and can also cause the disk-heads to overheat.
Off-site backup provides the best protection against data loss due to fire damage. Even if you think your premises has a low fire risk, it can often be an adjoining premises that’s the source.
Your server or comms room should have a high-sensitivity smoke detection (HSSD) system installed which is regularly tested.
Try to maintain an off-site inventory of disks inside your systems. A record of disk model numbers can sometimes make the difference between a failed or successful recovery. IT asset management tools like LanSweeper can automate this task.
If adopting a belt-and-braces approach in mitigating the fire risk to your data, you could consider fire-retardant DAS and NAS solutions from ioSafe. These storage devices running DSM (from Synology) offer protection of your disks from fires up to 840 degrees Celsius for up to 30 minutes. They also offer IP68 water protection – very useful protection from sprinkler systems and over-zealous fire crews.
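As a rough illustration of the off-site disk inventory suggested above, the sketch below parses the JSON that the Linux command `lsblk -J -d -o NAME,MODEL,SERIAL,TYPE` emits and keeps the model and serial number of each whole disk. The sample output, the disk serial and the function name are made up for illustration; a real inventory would be produced by an asset management tool or by running the command on each system and storing the result somewhere off-site.

```python
import json


def parse_lsblk_inventory(lsblk_json: str) -> list:
    """Return name, model and serial for every whole disk in `lsblk -J` output."""
    devices = json.loads(lsblk_json).get("blockdevices", [])
    inventory = []
    for dev in devices:
        if dev.get("type") != "disk":
            continue  # skip optical drives, loop devices and so on
        inventory.append({
            "name": dev.get("name"),
            "model": (dev.get("model") or "").strip(),
            "serial": (dev.get("serial") or "").strip(),
        })
    return inventory


# Hypothetical sample of what lsblk might print on one system.
SAMPLE = """{"blockdevices": [
  {"name": "sda", "model": "ST2000LM015-2E8174", "serial": "EXAMPLE123", "type": "disk"},
  {"name": "sr0", "model": "DVD-RW", "serial": null, "type": "rom"}
]}"""

inventory = parse_lsblk_inventory(SAMPLE)
for disk in inventory:
    print(disk["name"], disk["model"], disk["serial"])
```

Keeping a copy of this list away from the premises is the point: if a label burns, the model number is still on record.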
HP Proliant servers are a very common on-premise server in Ireland. These systems come in two main form factors – rack or tower. Their rack series includes models such as the DL360, DL380 and DL385, while their tower series includes models such as the ML10, ML110 and ML350.
Recently, we recovered data from an HP Proliant ML350. This Windows Server 2019 server was running VMware virtual machines. Using 4 x HP SAS disks, its RAID 5 array had gone into degraded mode. While this can be very frustrating, “degraded mode” is actually a self-protection mechanism of the server. It occurs when unrecoverable errors are detected in one or more of the disks. Its role is to prevent any further damage that might occur due to silent data corruption. The server subsequently became unbootable.
Examination of the 4 x 1.2TB HP SAS (EG001200JWFUT) disks (formatted in EXT4) proved interesting. Disk 0 was fine. Disks 1 and 2 were seriously over-heating: our infrared thermometer recorded temperatures of 48 and 49 degrees Celsius respectively. Disk 3, meanwhile, was clicking. Great…
Using an SFF-8482 cable, we connected each of the 3 working disks to our Areca SAS card and made bit images of them. It is important to note that this PCIe card was not a RAID controller. The last thing you want is for a RAID rebuild process to initiate with a missing disk. Using a non-RAID SAS card means the integrity of the images remains sound. To reassemble the array from those images, specialised software is required.
We now had to ascertain the exact RAID parameters used in the original array. If you don’t use the exact parameters, corrupted files will be inevitable. The HP documentation on the parameters used was, unsurprisingly, lousy. Therefore, we used a hex editor to find the original RAID parameters – namely the block size, the offset and the block order. With these parameters now electronically recorded using the high-tech medium of Microsoft Notepad and using specialised RAID re-build software, we could start the re-build process. This took a number of hours, but eventually, we had on our recovery system several VMDK and -flat.vmdk files. Exactly what we were looking for! Our file integrity checks revealed all files to be intact. The client was extremely fortunate. Some VMDK virtual disk files can be unwieldy, fragmented and liable to corruption during RAID array failure events. Anyway, the client’s data (Excel files, PDFs, ROS certificates and BrightPay payroll data) could now be extracted onto a 4TB external disk for delivery.
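To show why the block size, offset and block order matter so much, here is a minimal sketch of reassembling a RAID 5 volume from disk images when one member is missing. It assumes a "left-symmetric" parity rotation, which is only one of several layouts in use; the layout of the array in this case is not stated, and the function names and toy data are hypothetical. It is not the specialised software mentioned above.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(stripe, n_disks):
    # Assumed left-symmetric rotation: parity starts on the last disk.
    return (n_disks - 1 - stripe) % n_disks


def data_disk_order(stripe, n_disks):
    # Data blocks start on the disk after the parity disk and wrap around.
    first = parity_disk(stripe, n_disks) + 1
    return [(first + i) % n_disks for i in range(n_disks - 1)]


def build_raid5(data, n_disks, block_size):
    """Toy encoder, used here only to produce test images."""
    stripe_bytes = block_size * (n_disks - 1)
    data += bytes(-len(data) % stripe_bytes)  # zero-pad to a whole stripe
    disks = [bytearray() for _ in range(n_disks)]
    for stripe, start in enumerate(range(0, len(data), stripe_bytes)):
        chunks = [data[start + i * block_size:start + (i + 1) * block_size]
                  for i in range(n_disks - 1)]
        for disk, chunk in zip(data_disk_order(stripe, n_disks), chunks):
            disks[disk] += chunk
        disks[parity_disk(stripe, n_disks)] += xor_blocks(chunks)
    return [bytes(d) for d in disks]


def read_raid5(images, block_size, offset=0):
    """Reassemble the volume; `None` in `images` marks the missing disk."""
    n_disks = len(images)
    present = [img for img in images if img is not None]
    n_stripes = (len(present[0]) - offset) // block_size
    volume = bytearray()
    for stripe in range(n_stripes):
        pos = offset + stripe * block_size
        blocks = [img[pos:pos + block_size] if img is not None else None
                  for img in images]
        if None in blocks:
            # The missing block is the XOR of all the surviving blocks.
            missing = blocks.index(None)
            blocks[missing] = xor_blocks([b for b in blocks if b is not None])
        for disk in data_disk_order(stripe, n_disks):
            volume += blocks[disk]
    return bytes(volume)


original = bytes(range(256)) * 3
images = build_raid5(original, n_disks=4, block_size=64)
images[3] = None  # simulate the clicking disk
recovered = read_raid5(images, block_size=64)
print(recovered == original)
```

Calling `read_raid5` with a different block size on the same images produces a volume that does not match the original, which is the "corrupted files will be inevitable" outcome described above.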
This recovery process saved this Dublin accountancy practice hours and hours of labour time that would otherwise have been spent reconstructing files.
How RAID 5 failure and recovery could have been prevented…
First of all, RAID 5 should not be considered a backup. In this particular case, the client should have had a valid up-to-date backup of their main server. There are swathes of virtual machine backup applications (such as Veeam and Nakivo) out there which can backup locally and to the cloud.
RAID 6, which uses dual parity, can sometimes be a better and safer alternative to RAID 5. This is especially true where disks are over 1TB in size, which is now commonplace in even the most basic servers.
If your on-premise RAID server stores a lot of data that is infrequently accessed, you should have a data scrubbing regime in place. The scrubbing process reads all the data and checks for consistency. For some file systems like Btrfs, you can use the “btrfs scrub” command. EXT4 does not checksum data. However, it does allow for metadata checksumming, which can help detect disk problems early.
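Where the file system itself does not checksum data, the idea behind scrubbing can be approximated at file level. The sketch below is only an illustration of that idea, not a substitute for a file system or controller-level scrub: it records a SHA-256 hash per file in a manifest and later re-reads every file to flag anything that no longer matches. The function names and the manifest format are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hash."""
    root = Path(root)
    return {str(p.relative_to(root)): sha256_of(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def scrub(root, manifest):
    """Re-read every file in the manifest; return those missing or changed."""
    root = Path(root)
    damaged = []
    for relative, expected in manifest.items():
        path = root / relative
        if not path.is_file() or sha256_of(path) != expected:
            damaged.append(relative)
    return damaged
```

A manifest built right after a known-good backup, and a `scrub` run on a schedule, would at least surface silent corruption in rarely opened files before the array degrades. Files that are legitimately edited will also show up as changed, so the manifest needs refreshing after intended updates.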
Drive Rescue is based in Dublin, Ireland. We offer a complete RAID data recovery service for HP Proliant and HP Microserver systems. Whether your data is stored in bare metal format or VMDK, VDI or VHD virtual disk formats – we can help recover your data.
If threat actors or intelligence agencies cannot access an encrypted storage device such as a laptop HDD because they lack the encryption key, they will not, contrary to popular belief, try to brute force it. Nor will they use a quantum computer. If it’s really important, more likely than not, they will deploy what is known as a side-channel attack. Such an approach does not endeavour to “break” the encryption of the storage device, but rather to gain access to the protected volume by side-stepping it.
One of the most common side-channel attacks exploits DMA ports. But what are DMA ports? Well, first some context. In the 1990s, with the proliferation of multimedia use, some computer manufacturers wanted to equip their devices with data transfer speeds faster than the 1.5 Mbps or 12 Mbps afforded by USB 1.0 and USB 1.1. This gave rise to DMA ports such as FireWire (IEEE 1394) which allow peripheral hardware devices to access the host memory directly. In the mid-1990s, Sony and Apple were pioneers in equipping their devices with FireWire ports, giving their multimedia users vastly improved data transfer speeds. So, for example, in the early 2000s, USB 2.0 allowed transfer speeds of up to 480 Mbps while FireWire 800 (IEEE 1394b) offered up to 800 Mbps. Today, on consumer and enterprise-class computing devices, the most common DMA ports in use are Thunderbolt and Thunderbolt-capable USB Type-C ports. Lesser known hardware components having DMA access include network cards and external GPUs.
How DMA ports can provide a backdoor to your data
Ok, so let’s say you have a HDD or SSD in a laptop which is using a full-disk encryption application such as BitLocker. Could a threat actor access your data? Theoretically, yes! Here are a few side-channel permutations to consider.
Cold Boot Attack – This type of attack occurs when a threat actor performs a memory dump from a computer system’s RAM. This attack vector exploits remanence – a phenomenon where some data still resides in RAM shortly after the power of the host system has been turned off.
Recovering a BitLocker Key using an FPGA and data sniffing software – Microsoft and many hardware manufacturers extol the virtues of using a Trusted Platform Module (TPM) to store the cryptographic keys of BitLocker. Unfortunately, this is not as secure as most people think. For example, a field programmable gate array (FPGA) card (such as a Lattice iCE40) combined with software like LPC_Sniffer can sniff BitLocker Volume Master Keys from the Low Pin Count bus used by the TPM chip. However, this only works if BitLocker’s pre-boot authentication is disabled.
Bypassing Apple FileVault Encryption using ThunderClap – Some Apple users believe that if their MacBook is encrypted with FileVault 2, they are immune from such attacks. Not according to the developers of ThunderClap, however. This powerful software, used in conjunction with an FPGA card (such as Intel Arria), mimics an Ethernet card and enables the sniffing of data packets to and from an encrypted macOS system.
But surely, software and hardware vendors have implemented protections against DMA attacks?
Software and hardware vendors are well aware of such attacks. This is why they have introduced input-output memory management units (IOMMUs). An IOMMU acts as a gatekeeper to the system memory, only allowing privileged devices to access sensitive memory regions. Apple was one of the first mainstream computer manufacturers to embrace this technology, enabling it by default on OS X 10.8.2 Mountain Lion. Today, macOS is one of the few mainstream operating systems that has IOMMU enabled by default. However, even in macOS, its implementation is not fully watertight. Some security researchers have found that a single IOMMU page uses shared mappings (i.e. user data could be stored in the same memory space as the peripheral used by the attacker). So, for example, a threat actor or investigator could, in theory, use a modified hardware device such as a trojanised Thunderbolt dock to access the memory of a macOS system. This operating system is supposed to be protected from rogue hardware devices (like a modified Thunderbolt dock) by hardware whitelisting. However, this security mechanism could be easily thwarted by using an “Apple approved” PCIe bridge board (taken from a Thunderbolt dock, for example) and using that to bridge a nefarious DMA device.
Aside from IOMMU, there are other protections against DMA attacks. For example, Microsoft provides Kernel DMA protection for Windows 10 and Windows 11. But, in Microsoft documentation, there is a rather worrying admonition that “This feature doesn’t protect against DMA attacks via 1394/FireWire, PCMCIA, CardBus, ExpressCard, and so on”.
How to access an encrypted SSD just like the CIA…
The “DarkMatter” files of WikiLeaks gave us a brief insight into how intelligence agencies like the CIA access encrypted hard disks. Not surprisingly, they don’t use any FileVault or BitLocker “bruteforcing software” which tries to use multiple combinations of passwords to bypass disk authentication. Instead, and perhaps not surprisingly, they exploit DMA ports. More specifically, it was discovered that they use a device known as a Sonic Screwdriver. This device, using modified firmware of a Thunderbolt-to-Ethernet adaptor can change the boot path of MacBooks whilst injecting keylogging malware into system files which have the ability to harvest encryption credentials.
We need to talk about self-encrypting SSDs…
The term “Self-encrypting disks” (SEDs) has to be the biggest misnomer in the data storage world ever! SEDs basically use an AES processor to enable encryption. The data is protected using a disk encryption key (DEK). Each disk is automatically encrypted with this. For users such as governments and corporate entities, it means that disks can be erased by simply deleting the key, facilitating easier asset decommissioning and disposal. And while “self-encrypting” drives are encrypted, for most SSD manufacturers, by default, any sort of authentication protocol is disabled. This means that while their users are very reassured by using a “self-encrypting disk”, the reality is that if that disk was lost or stolen, any dog on the street could connect it to a standard PC system and all their files would be accessible. Moreover, even if authentication on self-encrypting drives (SEDs) is enabled, many S-ATA SEDs can be subject to what are known as “hot plugging attacks”. This involves an adversary or investigator disconnecting the S-ATA data connector of a disk and connecting a data cable of another system without cutting its power. In a substantial number of cases, this grants access to the data because the SED, even with authentication enabled, still thinks it is connected to the original host. The main condition needed for this approach to work is that the second system, to which the disk is being connected, must have a hot-swap compatible motherboard.
And another problem with self-encrypting drives is the unknowns involved with Vendor Specific Commands (VSC). Basically, every SSD manufacturer has their own language command set for their disk models. These commands can be used for diagnostics, maintenance and firmware repair. They are also proprietary – therefore not very open to public scrutiny. And, like with any proprietary software, this opaqueness presents a security problem. In fact, security researchers from the Netherlands have successfully used SSD VSCs to access encrypted data on some models of Crucial MX, Samsung T3 and T5 SSDs. And it is also rumoured that the NSA’s Equation Group extensively used Seagate and Western Digital VSCs in designing their HDD firmware rootkits. These vulnerabilities remind us of the importance of projects such as the OpenSSD Project which advocates for SSD firmware to be open-source and fully transparent.
WD My Passport disks provide a classic example of the weaknesses of hardware encryption. This line up of portable disks has encryption keys which can be bruteforced. Some of these models use a very leaky random number generator for key protection. Other My Passport models use hard-coded AES-256 credentials. Moreover, their ROM can be “patched” by data recovery systems.
Practical Prevention: To protect highly confidential information using BitLocker, it is essential that the application is configured correctly. BitLocker should always be set up with pre-boot authentication using an alphanumeric PIN. Make sure you have SecureBoot enabled, which helps prevent devices with unsigned firmware code booting up. A BIOS password is recommended. In standby or hibernation state, some Windows systems will store the BitLocker encryption key in RAM, therefore it is recommended that you disable standby or hibernate mode on the systems you wish to protect. To enable IOMMU in Windows systems, you will need to access the BIOS. The protection will be listed as “IOMMU”, “I/O Memory Management”, “Intel VT-d” or “AMD-Vi”. For protection of external storage devices, you might want to give hardware encryption a wide berth. Instead, you can use an open-source encryption application like VeraCrypt for whole disk encryption.
Drive Rescue, Dublin, Ireland provide a full hard disk recovery service for disks encrypted with BitLocker, FileVault, VeraCrypt and many other leading encryption applications. We also provide a recovery service for WD My Passport external disks including My Passport Slim, My Passport Ultra and My Passport for Mac.
ECC (Error Correction Code) plays a crucial role in maintaining the integrity of data stored inside your solid state drive. ECC is a bit like a quality-control inspector inside your disk. When it detects soft bit errors, it automatically corrects them, helping to keep your data intact.
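To make the "detect and automatically correct" idea concrete, here is a minimal sketch using the classic Hamming(7,4) code, which stores 4 data bits alongside 3 parity bits and can repair any single flipped bit. This is a teaching example only: the codes actually used inside SSD controllers are far stronger and are not described in this article, and the function names here are made up for illustration.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword: p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position); position 0 means no error found."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # 1-based index of the flipped bit
    if position:
        c[position - 1] ^= 1  # correct the single-bit error
    return [c[2], c[4], c[5], c[6]], position


stored = hamming74_encode([1, 0, 1, 1])
corrupted = list(stored)
corrupted[4] ^= 1  # simulate a soft bit error in the NAND cell
data, position = hamming74_decode(corrupted)
print(data, position)
```

Flip two bits instead of one and this small code miscorrects, which mirrors the point made next: once errors exceed what the ECC can handle, something else has to step in.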
However, sometimes, due to defects such as wear of the oxide layer, ECC failure will occur. Here your SSD controller has another trick up its sleeve. Many SSDs employ what are known as “superpages”. These are tracts of data spread across multiple dies. For example, you might have an SSD with 4 dies (NAND chips). If you have data (a 200 page PDF document, for instance) stored on your SSD, the file probably won’t just be stored on one chip. Instead, it will be spread out among the 4 chips. The data is then XOR’ed to produce parity. This is kind of analogous to the way data is stored in RAID volumes. The spreading of multiple I/O requests to multiple dies means much faster processing times. Now, even if ECC is unable to rectify the bit-errors, using superpage-level parity, data recovery is still possible.
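The parity idea can be sketched in a few lines. The example below assumes a hypothetical superpage made of three data pages and one parity page, one per die; real controllers use their own proprietary layouts, so the page contents, sizes and names here are purely illustrative.

```python
def xor_pages(pages):
    """Byte-wise XOR of equally sized pages."""
    result = bytearray(len(pages[0]))
    for page in pages:
        for i, byte in enumerate(page):
            result[i] ^= byte
    return bytes(result)


# Hypothetical superpage: three 16-byte data pages, one per die.
data_pages = [b"page-on-die-0...", b"page-on-die-1...", b"page-on-die-2..."]
# The fourth die holds the XOR of the other three.
parity_page = xor_pages(data_pages)

# Die 1 returns an uncorrectable ECC error, so its page is rebuilt
# from the two surviving data pages plus the parity page.
survivors = [data_pages[0], data_pages[2], parity_page]
rebuilt = xor_pages(survivors)
print(rebuilt == data_pages[1])
```

As with RAID parity, this only works when a single page in the superpage is unreadable; two failed pages in the same superpage cannot be rebuilt from one parity page.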
For example, a client recently presented us with an Asenno 240GB SSD. There were numerous un-correctable bit-errors showing. Using the power of superpages along with some powerful data recovery equipment, we were able to recover the complete NTFS volume for the client.
Drive Rescue (Dublin, Ireland) offer a complete data recovery service for Asenno 240GB, 480GB, 512GB, 960GB and 1TB (S-ATA and NVMe PCIe) SSDs. Typical problems we help with include:
Your Asenno SSD is not showing up in Windows or macOS
Your Asenno SSD appears to be corrupted
Your Asenno SSD is appearing as “unallocated” in Windows Disk Management
You’ve accidentally deleted files from your Asenno SSD
You’ve got a BIOS-level warning that “SMART failure” is predicted on your Asenno disk.
Your Asenno disk appears as “unformatted” in Windows Explorer or macOS Finder
DAS (or direct-attached storage) devices are ideal for tasks involving high data throughput such as photo or 4K video editing. Unlike a NAS, no network equipment such as routers or switches is required. The device can simply be attached to a host system using a USB or Thunderbolt connection.
To increase the I/O (input / output) rates of these devices, it is common for manufacturers to use a RAID 0 configuration. This simply means that two (usually S-ATA HDDs) disks are joined at the hip (using software) to form one (NTFS, HFS+, EXT3 etc.,) data volume. If large files need to be transferred to the volume, the data is written concurrently to the two disks (instead of just one) making the write operation faster. For example, typical write speeds would be around 320 MB/s. This is a relatively fast speed for spinning metal platters and is a perfect example of how storage devices can exploit data parallelism afforded by RAID.
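A minimal sketch of how RAID 0 striping distributes a file across two disks is shown below. The 4-byte block size and the function names are chosen purely for readability; real devices use much larger blocks and the exact parameters of any given G-RAID unit are not assumed here.

```python
def stripe_raid0(data, n_disks=2, block_size=4):
    """Deal consecutive blocks of `data` out to the disks in turn."""
    disks = [bytearray() for _ in range(n_disks)]
    for index, start in enumerate(range(0, len(data), block_size)):
        disks[index % n_disks] += data[start:start + block_size]
    return [bytes(d) for d in disks]


def join_raid0(disks, block_size=4):
    """Interleave the blocks from each disk to rebuild the original data."""
    volume = bytearray()
    longest = max(len(d) for d in disks)
    for pos in range(0, longest, block_size):
        for disk in disks:
            volume += disk[pos:pos + block_size]
    return bytes(volume)


data = b"ABCDEFGHIJ"
disks = stripe_raid0(data)
print(disks)
print(join_raid0(disks) == data)
```

Each disk ends up holding only every other block, so writes can proceed in parallel. It also means neither disk contains a usable copy of any file larger than one block, and there is no parity to fall back on.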
However, there is a downside to using RAID’ed disks like this. Namely, if one disk fails, the whole volume topples over like a proverbial house of cards. And this is exactly what happened to a customer we were helping last week. His 8TB G-RAID (12V 5Amp) device had two HGST 4TB S-ATA disks (HUS726040ALE614) in a RAID 0 configuration. One of the HGST disks developed firmware issues and bad sectors, causing the volume to become unrecognised by Windows. When our customer connected his G-RAID drive to his Windows 10 system, it was no longer showing up. Instead, he got an ominous red light on power up.
Reasons why your G-Technology G-RAID drive no longer shows up in Windows or Mac.
One or more of the disks inside your G-RAID drive might have developed physical faults such as issues with the read-write heads. For example, heads can physically deform due to shock damage while some heads will just fail due to wear-and-tear. Problems with the read-write heads can make the MBR (master boot record), the firmware on the servo-tracks and the user-created data of each disk unreadable.
One or more of the disks inside your drive might have developed firmware faults. Firmware is microcode used by hard disks to manage the drive. It is typically stored on the ROM chip of the PCB and on the servo tracks of the disk. Firmware code helps manage errors on the disk and is also involved in crucial functions such as logical block addressing.
Another reason why your G-RAID drive is no longer being detected is that one or more of the disks inside your 4TB or 8TB G-RAID drive might have developed bad sectors. Sectors are the smallest area in a hard drive where data is stored. Some bad sectors are normal. In fact, most electro-mechanical hard disks leave the factory with some bad sectors already in place (this is recorded in the P-List). As the disk gets used, more bad sectors start to develop – these are recorded on the G-List. Then, after a while, a surfeit of bad sectors may culminate in your G-RAID drive failing to initialise.
The PCB (printed circuit board) inside your G-Technology G-RAID drive might have failed. This can occur due to thermal stress, over-voltage (e.g. a power surge) or due to liquid damage.
Data recovery from a G-RAID device
Thankfully, most of the problems with G-RAID drives can be fixed. In this particular case, we resolved the issues with the firmware and bad sectors. Then, using byte-for-byte disk images, we performed a detailed analysis on them, ascertaining the key RAID parameters that were used such as disk order, block size, block order and disk offsets. All of these parameters are needed for the RAID rebuild process. We eventually rebuilt the RAID. We now had a complete NTFS volume and were able to recover all the drive’s data (video footage and .tiff images) for our very pleased customer.
Is your G-RAID drive not mounting on your Mac? Is your G-RAID drive not showing in Windows? Is your G-RAID disk freezing? Is your iMac or MacBook reporting that your G-RAID disk is “not readable”? Is your G-RAID drive showing a red light? Drive Rescue offer a complete data recovery service for G-RAID devices such as G-RAID Thunderbolt 4TB, 8TB and 12TB. We have extensive experience in recovering and repairing the hard disks (usually HGST 3.5” S-ATA) used inside these drives.
So called “Ultra small form factor” PCs have never been so popular for their compactness, versatility and low power consumption. You can hold them easily with one hand and most are lighter than a dictionary. In fact, during the pandemic, some organisations were able to dispatch these book-size PCs to their home-working employees in the post. All the employee had to do was connect the system to a monitor cable (HDMI, DPI), mouse and keyboard and they were up and running in no time.
While all this sounds great, here at Drive Rescue we’ve noticed one thing. Some of the disks inside these ultra small form factor PCs seem to experience higher-than-average failure rates. This is not surprising. While very convenient, most of these systems do not offer the same level of airflow as their more internally capacious brethren. Even with sophisticated heat sink designs, lower levels of internal airflow mean that the components (such as the northbridge chip) and disks inside these systems can get hotter than a Tokyo metro train at rush hour during a heatwave.
And that’s not good news for HDDs or SSDs. Conventional hard disks (P-ATA, S-ATA) never liked the heat. They have too many metal components (such as platters, spindles, sliders and voice-coil motors) inside which expand when exposed to heat. SSDs (such as M.2 NVMe) on the other hand, actually run better when hot, but after a while this heat-induced performance boost begins to take a toll on the disk’s controller. Too much heat can cause the controller to execute failed bad block management operations, failed logical block addressing and eventually the thermal stress can culminate in complete failure of the controller IC itself.
How to recover data from a Lenovo small form factor PC?
Take last week for example, when we were recovering data from a Lenovo IdeaCentre Q180. The disk inside, a WD Blue 500GB S-ATA (WD5000LPVX), had a failed head-disk assembly (HDA). More specifically, the head-gimbal assembly at the end of the HDA had “lifted” from the fly zone. This damage was consistent with thermal stress. Anyway, we replaced the HDA in our clean-room and then imaged the disk, enabling a full data extraction from its NTFS partition.
Drive Rescue (Dublin) offer a complete data recovery service for small form factor PCs such as Fujitsu Esprimo E420, G5011, G5010, Q520, Q910, Q958, Intel NUC, Lenovo ThinkCentre M700, M900, M710s, M710q, ThinkStation P350, Dell Optiplex 780, 790, 3020, 3050, 7010, 7040, 9020m and Asus PN50, PN60. We recover from most disk types used in these systems including M.2 NVMe (SSD) disks, m-SATA and S-ATA disks.
For conventional hard disks (HDDs), the smallest unit of storage is called a sector. This traditionally has been 512 bytes with most hard disks of the last 10 years or so using 4096-byte sectors (Advanced Format). Each sector will hold the user-generated data, sync bytes but will also hold some ECC (Error Correction Code) to maintain the integrity of the data. The ECC acts as a sort of checksum to filter out corrupt data before it’s transmitted to the host’s RAM.
The problem with ECC
Modern ECC algorithms (such as Reed-Solomon and Bose-Chaudhuri-Hocquenghem) are great. They help prevent bit-rot and other corrupting processes. However, when you have a failing hard disk with bad sectors and try to read it on a standard PC, ECC will probably be the reason that the disk can’t be read. The host computer attempts to read the sectors once but ECC will report the sectors as unreadable. The user will probably see a “not responding” error or similar on their GUI. ECC is a fusspot in this regard – any corruption at all and it won’t let the host PC read the data.
ECC and consumer-grade data recovery software
ECC is not only problematic for reading failing disks via an operating system, but it is also one of the main reasons why so many consumer-grade data recovery software applications can’t recover data. Like with operating systems, data recovery applications cannot always read from sectors whose ECC is reporting errors. In order to bypass this, these applications will read and re-read inaccessible sectors multiple times in the hope that ECC might allow a successful read. However, for a hard disk that is failing or damaged, these repeated attempts of reading are the equivalent of torture for your disk.
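The cost of re-reading can be illustrated with a small simulation. The sketch below images a simulated disk sector by sector, records unreadable sectors in a list instead of hammering them, and counts how many read commands the disk had to service. The `FlakyDisk` class, the `image_disk` function and the retry counts are all hypothetical; this is a software illustration of the principle, not a description of how any particular recovery product or hardware imager works.

```python
class FlakyDisk:
    """Simulated disk where some sectors always fail to read."""

    def __init__(self, sectors, bad_lbas):
        self.sectors = sectors
        self.bad_lbas = set(bad_lbas)
        self.reads = 0  # how many read commands the disk had to service

    def read_sector(self, lba):
        self.reads += 1
        if lba in self.bad_lbas:
            raise OSError(f"unrecoverable read error at LBA {lba}")
        return self.sectors[lba]


def image_disk(read_sector, n_sectors, sector_size=512, max_attempts=1):
    """Copy every sector once; zero-fill and log the ones that fail."""
    image = bytearray()
    bad_lbas = []
    for lba in range(n_sectors):
        data = None
        for _ in range(max_attempts):
            try:
                data = read_sector(lba)
                break
            except OSError:
                continue
        if data is None:
            bad_lbas.append(lba)
            data = bytes(sector_size)  # placeholder so offsets stay aligned
        image += data
    return bytes(image), bad_lbas


sectors = [bytes([n]) * 512 for n in range(8)]

gentle = FlakyDisk(sectors, bad_lbas=[2, 5])
image, bad = image_disk(gentle.read_sector, n_sectors=8)

stubborn = FlakyDisk(sectors, bad_lbas=[2, 5])
image_disk(stubborn.read_sector, n_sectors=8, max_attempts=5)

print(bad, gentle.reads, stubborn.reads)
```

Both runs end up with the same image and the same two unreadable sectors. The only difference is how much extra work the failing disk was asked to do.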
It’s not only ECC…
ATA controllers, as used in standard PCs, require that data transfers from disk to host use the host’s RAM. This can be problematic, especially when processing disks with bad sectors or read-media issues as BSOD events are likely. In addition to this, ATA controllers in standard PCs cannot perform disk re-set operations.
How professional data recovery equipment circumvents ECC errors and the problems associated with standard ATA disk controllers…
Data recovery technicians use dedicated hardware systems that enable disk-reads that bypass the BIOS and the operating system. They use systems which can ignore ECC errors.
Moreover, technicians use equipment which can directly read the disk’s error register. This gives the technician (and equipment) much more specific information about the underlying problem. For example, this could be a UNC (un-correctable) data error or a TONF (track not found) error. When the equipment knows what the underlying fault is, it can choose a recovery algorithm to maximise the probability of a successful recovery.
Data recovery technicians will typically use systems with ATA disk controllers equipped with Ultra Direct Memory Access. This enables direct data transfers whilst bypassing the host’s RAM.
ATA controllers used in standard computers cannot perform disk re-set operations if the disk becomes unresponsive. A disk re-set operation is much less stressful on a failing hard disk compared to a re-power operation.
Only last week, we were dealing with a very frustrated end-user who was trying to extract data off his LaCie Rugged Thunderbolt USB 3.0 2TB external hard drive. Every time he connected the disk via a Thunderbolt port to his macOS system, it would freeze. He had thousands of Adobe PhotoShop (PSD) and Adobe Premiere Pro (PRPROJ) files which he needed to transfer to another working disk. Our diagnostics revealed that the disk inside (Seagate Barracuda 2TB ST2000LM015) had developed extensive bad sectors. Using our ECC-bypassing and UDMA-enabled data recovery systems, we were able to transfer his data to his second disk within 48 hours.
Drive Rescue (Dublin, Ireland) offers a complete data recovery service for LaCie Rugged disks. We regularly recover from models such as LaCie Rugged Mini, LaCie Rugged USB-C, LaCie Rugged 3TB, LaCie Rugged 4TB and LaCie Rugged 5TB which are not mounting or not recognised in Mac. Likewise, we recover from LaCie external disks which are not showing up in Windows (10 or 11) or from LaCie disks which are making a clicking or buzzing noise.
Having your Apple MacOS stuck in a dreaded boot loop can be an exasperating experience. (For those of you lucky enough not to know what a boot loop is, it occurs when an operating system cannot successfully boot to the desktop screen. Instead, on system power-up, the OS goes through the familiar boot-up process but halts at a certain point. If you’re lucky, you’ll get an error message which might give a hint of what the problem is.) In MacOS, boot loops can occur out of the blue due to OS corruption, or they can occur after the user has attempted to install a fresh or updated version of their operating system.
Recently, we had a client who experienced this very problem. They tried to upgrade their operating system from Catalina to Big Sur. However, their 256GB SSD did not have enough space. The installation of the OS update files never completed, but now on start-up of their system they would receive a message that “An error occurred preparing the software update”. As a result, they were unable to access their desktop and they had no recent backup.
Luckily, we had heard about this problem before. The earlier versions of the MacOS Big Sur (11.6.1, 11.6.2) installer files have a bug in them. Namely, the installer does not check the free space on the disk before the installation process proper begins. Therefore, if you don’t have the prerequisite 35GB of free space needed to store the temporary install files, this re-boot loop problem manifests itself. This bug also interferes with FileVault 2 encryption, making the APFS data volume invisible to Target Disk Mode (TDM). TDM will see “Macintosh HD” but not “Macintosh HD – Data”, which is the volume you want! And if you’re thinking some bootable Linux tool could image the disk, that avenue is also closed off because of the problem with FileVault 2.
Thankfully, there is a solution to this problem, albeit a convoluted one, which goes beyond the scope of this blog. But the long and short of it is this: we got all the data back for our delighted client. The lessons of this case are simple. Firstly, always have a complete backup before you start upgrading your MacOS system (or any OS for that matter). Secondly, try to avoid deploying the first iterations of an operating system because, even with MacOS, these versions can be more bug-prone.
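Since the installer in this case did not check free space itself, a quick manual check before any upgrade is cheap insurance. The following is a minimal Python sketch, assuming the 35GB figure quoted above (always confirm the requirement for your own installer); the function name and the default path are illustrative only.

```python
import shutil

# Figure quoted in this post; confirm the real requirement for your installer.
REQUIRED_FREE_GB = 35


def has_enough_space(path: str = "/", required_gb: float = REQUIRED_FREE_GB) -> bool:
    """Return True if the volume holding `path` has at least `required_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    # Disk vendors and MacOS count in decimal gigabytes (10**9 bytes).
    return free_bytes >= required_gb * 1000**3


if __name__ == "__main__":
    if has_enough_space("/"):
        print("Enough free space to attempt the upgrade.")
    else:
        print("Not enough free space. Free some up (and back up) first.")
```

This only checks free space on the volume you point it at. It is no substitute for a complete backup.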
Why is SSD firmware super-important to the running of your disk?
The host system does not interface directly with the NAND containing your data. Instead, it interfaces with the drive’s firmware. The firmware holds the Flash Translation Layer (FTL), which maps the logical blocks seen by the host to physical blocks on the NAND. The firmware also performs crucial tasks like data scrambling, bad block management, interleaving, wear levelling and TRIM.
Isn’t firmware the code that’s also used in personal printers, toasters and fitness monitors?
Yes, but in storage devices such as HDDs and SSDs it tends to be more multi-faceted and much more complex. For example, Travis Goodspeed, giving his talk “Implementation and implications of a stealth hard-drive backdoor” at Sec-T (2014), revealed how it took him “10 man months” to reverse engineer a Seagate Barracuda hard disk. He and his team also had to “kill” 15 hard disks in the process. So yes, the firmware found in your HDD or SSD is in a different ballpark from the firmware found in your Fitbit.
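To illustrate the logical-to-physical mapping idea, here is a toy Python sketch of a translation layer. It is a deliberate simplification and not how any particular vendor’s firmware works: real firmware also handles erase blocks, garbage collection, wear levelling and error correction. The class and method names are hypothetical.

```python
class ToyFTL:
    """Toy translation layer mapping logical block addresses (LBAs) to NAND pages."""

    def __init__(self, physical_pages: int):
        self.flash = [None] * physical_pages      # simulated NAND pages
        self.mapping = {}                         # LBA -> physical page index
        self.free = list(range(physical_pages))   # pages not yet written
        self.stale = set()                        # pages holding superseded data

    def write(self, lba: int, data: bytes) -> int:
        # NAND pages cannot be overwritten in place, so each write uses a fresh page.
        if not self.free:
            raise RuntimeError("no free pages; real firmware would garbage-collect here")
        page = self.free.pop(0)
        self.flash[page] = data
        old = self.mapping.get(lba)
        if old is not None:
            self.stale.add(old)                   # the old copy is now garbage
        self.mapping[lba] = page
        return page

    def read(self, lba: int) -> bytes:
        # The host only ever asks for an LBA; the map locates the physical page.
        return self.flash[self.mapping[lba]]

    def trim(self, lba: int) -> None:
        # TRIM tells the drive the host no longer needs this LBA.
        page = self.mapping.pop(lba, None)
        if page is not None:
            self.stale.add(page)


if __name__ == "__main__":
    ftl = ToyFTL(physical_pages=4)
    ftl.write(7, b"first version")
    ftl.write(7, b"second version")   # same LBA, different physical page
    print(ftl.read(7), ftl.mapping, ftl.stale)
```

Even in this toy version, you can see why a corrupted map is so serious: the data may still be on the NAND, but nothing records where each logical block lives.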
So, why bother updating the firmware on your SSD?
Well, if a potential problem is discovered, it can often be remedied by a pre-emptive firmware update. Now you might be thinking that it’s the disk manufacturers themselves who discover these faults, right? Well, in most cases, it’s their customers, such as gamers, PC enthusiasts and sys admins, who discover them. Such problems could be related to TRIM, ECC, bad block management or write amplification. When a problem is discovered, and assuming the disk model in question has a sufficiently large user base, it is generally expected that the manufacturer will release a firmware update to remedy the issue.
Could a firmware update for my SSD brick my drive?
Quite frankly, yes. This is why you should avoid the temptation of hastily applying recently released firmware updates from manufacturers. It’s not unknown for a vendor to release a firmware update which provokes undesirable side-effects (such as dramatic slow-downs of the disk) or, in worst-case scenarios, turns your SSD into a doorstop. This can happen if, for example, the PMIC (power management IC) or flash translation layer (FTL) gets corrupted. Of course, you’re also looking at potential data loss. This is why you should always perform a complete disk backup before attempting any firmware update on your SSD.
So, I’ve backed up my data. Now, I can’t apply the firmware update using the manufacturer’s SSD utility (such as Samsung Magician, Crucial Storage Executive, Kingston SSD Manager etc.). What now?
OK, truth be told, updating your SSD’s firmware, even with the manufacturer’s dedicated utility software, is rarely a click-and-go process. Some questions to ask before even starting include: are you using the latest version of the utility? Are you running the tool as an administrator? Have you performed a re-boot of your system after installing the SSD utility for the first time? Have you tried disabling your anti-virus or other end-point security software? Is your disk attached directly to your motherboard via an S-ATA or M.2 connection?
I’ve tried all of the above but still can’t apply the firmware update to my SSD. What do I do now?
If all of the above suggestions fail, you may need to create a bootable tool from an ISO file provided by your manufacturer. Such a tool can avoid the layers of abstraction presented by an operating system such as Windows. It can also make the firmware update process run more smoothly. So, after you’ve downloaded the ISO file, you need to make it bootable. You can do this using a tool such as the excellent Rufus USB creator. Once your bootable USB SSD utility has been created, boot up your system with it. It should allow you to update your disk’s firmware without the operating system getting in the way.
I think my SSD is failing, will a firmware update fix it?
Applying a firmware update to a failing SSD might actually exacerbate your problem. Writing new firmware to a disk often means that the existing firmware gets wiped. However, if your disk is failing and the new firmware cannot be written to your SSD, this leaves you in a sort of firmware no man’s land, with potentially irreversible data loss. Professional data recovery companies such as Drive Rescue circumvent this problem by using a firmware “loader”. This basically means that the new firmware is loaded onto one of our host systems first and is then used as a “translator” to read the NAND whilst leaving the original firmware intact.
Drive Rescue, Dublin offers a complete data recovery service for faulty or inaccessible SSDs. Popular models we recover from include SK Hynix PC300, PC401, PC601, PC711, Micron 1100 M.2, Micron 1100 S-ATA, Micron 2200, Micron 2300, Samsung MZVLB256HBHQ-000L7, MZVLB512HAJQ, PM853T, PM871, PM883, PM991, Kingston A400, Kingston SSDNow SV300, SSDNow V300 and Toshiba THNSNK256GVN8.