Garbage Collection – the silent enemy of data recovery

The three SSDs which we loaded with two data sets. The data sets were then deleted. Which disk would still have its data intact after 24 hours?

Drive Rescue recently gave a guest lecture to the computer science class of a well-known Dublin third-level institution. Their lecturer wanted to give his class some real-world insights into how the world of practitioners sometimes differs from the world of academic theory. So, in the name of science and knowledge enhancement for all, we duly obliged.

The topic we decided to talk about was garbage collection in solid-state disks (SSDs). Garbage collection is a silent process, run by the disk controller in the background of most solid-state disks, which operates as a sort of clean-up mechanism for data that has recently been subject to the delete command. This makes read, write and erase operations in SSDs more efficient. However, for the forensic investigator, the security analyst, the systems administrator or indeed the data recovery technician, the garbage collection feature has the potential to complicate investigations and recovery cases.

Data Deletion from HDDs

When data is deleted from traditional electro-mechanical hard disks, the space on the volume is simply marked as free by the file system. The actual data is not destroyed until new data is written over the same location.
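A toy model may help make this concrete. The sketch below is not how any real file system is implemented; the `ToyVolume` class and its methods are invented purely for illustration. It only shows the principle that deleting a file frees its space while leaving the bytes readable until something else is written to the same block.

```python
class ToyVolume:
    """Hypothetical model of a volume on a hard disk (illustration only)."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks      # raw contents of each block
        self.free = set(range(num_blocks))    # blocks marked as free
        self.directory = {}                   # file name -> block number

    def write_file(self, name, data):
        block = min(self.free)                # take the lowest free block
        self.free.remove(block)
        self.blocks[block] = data
        self.directory[name] = block
        return block

    def delete_file(self, name):
        # Deletion only drops the directory entry and marks the block free.
        block = self.directory.pop(name)
        self.free.add(block)
        return block

    def raw_read(self, block):
        # What a recovery tool sees when it reads the block directly.
        return self.blocks[block]


volume = ToyVolume(num_blocks=4)
block = volume.write_file("report.docx", b"quarterly figures")
volume.delete_file("report.docx")
after_delete = volume.raw_read(block)         # old bytes are still there
volume.write_file("new.txt", b"overwritten")  # reuses the same free block
after_overwrite = volume.raw_read(block)      # old bytes are now gone
```

In this model the deleted file stays recoverable for as long as nothing new lands on its block, which is the behaviour recovery tools rely on with hard disks.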

Why Garbage Collection is a problem

File deletion with SSDs works differently. Unlike HDDs, they cannot simply overwrite data in an arbitrary area of the disk. SSDs must write to blank pages. Moreover, an SSD cannot erase data at page level; erasure must happen at block level. For this reason, SSDs use TRIM and garbage collection to make sure there are always pages ready and available for writing.
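The page and block constraint can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions, not the algorithm of any real controller: `ToyBlock`, `garbage_collect` and the three page states are hypothetical names, and real firmware adds wear levelling, mapping tables and much more.

```python
ERASED, VALID, INVALID = "erased", "valid", "invalid"


class ToyBlock:
    """Hypothetical NAND block: pages are written singly, erased together."""

    def __init__(self, pages_per_block):
        self.state = [ERASED] * pages_per_block
        self.data = [None] * pages_per_block

    def program(self, payload):
        # A page can only be written if it is currently erased (blank).
        page = self.state.index(ERASED)   # raises ValueError if block is full
        self.state[page] = VALID
        self.data[page] = payload
        return page

    def invalidate(self, page):
        # Deleting (e.g. after TRIM) only flags the page; the bytes remain.
        self.state[page] = INVALID

    def erase(self):
        # Erasure works on the whole block, never on a single page.
        self.state = [ERASED] * len(self.state)
        self.data = [None] * len(self.data)


def garbage_collect(victim, spare):
    """Copy valid pages to a spare block, then erase the victim block."""
    for state, payload in zip(victim.state, victim.data):
        if state == VALID:
            spare.program(payload)
    victim.erase()


block_a, block_b = ToyBlock(4), ToyBlock(4)
kept = block_a.program(b"kept file")
deleted = block_a.program(b"deleted file")
block_a.invalidate(deleted)
recoverable_before = block_a.data[deleted]   # stale bytes still in NAND
garbage_collect(block_a, block_b)
recoverable_after = block_a.data[deleted]    # gone once the block is erased
```

The point of the sketch is the gap between `invalidate` and `erase`: deleted data survives only until the controller decides to reclaim the block, and that decision is made inside the drive.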

Most academic texts discussing data deletion in SSDs invariably discuss the topic of TRIM (a command sent from the operating system telling the disk which data has been deleted). However, the less discussed and underplayed topic is garbage collection. TRIM can be prevented simply by disconnecting the disk from the host system. But with garbage collection, because the process is initiated by the disk’s controller (MCU), the disk only has to be powered up for this process to start. This is a massive problem because as soon as the disk is powered up, deleted data or evidence of deleted data starts getting destroyed. It means that the MD5 hash of an SSD can change within minutes, making an SSD forensically unsound.
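To illustrate why a changing hash matters, here is a minimal sketch using Python's standard `hashlib` module. The two byte strings are invented stand-ins for two acquisitions of the same disk taken a few minutes apart, and we assume (for illustration only) that the reclaimed region reads back as zeros. The helper name `md5_of_stream` is our own.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


# Stand-ins for two acquisitions of the same disk, taken minutes apart.
# In the second, the region holding deleted data has been reclaimed.
first_image = b"live data" + b"deleted data" + b"\x00" * 16
second_image = b"live data" + b"\x00" * 12 + b"\x00" * 16

hash_before = md5_of_stream(io.BytesIO(first_image))
hash_after = md5_of_stream(io.BytesIO(second_image))
hashes_match = hash_before == hash_after   # False: the evidence has changed
```

No write command was issued by the host between the two "acquisitions", yet the hashes differ. That is the situation an examiner would struggle to explain in court.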

In order for the class to understand this process a little better, Drive Rescue set up a small experiment. We took three SSDs, all of a similar size.

Crucial MX 500 (500GB) – SM2258H

WD Blue (500GB) – Marvell 88SS1074 (Custom WD)

Kingston A400 (480GB) – Phison S11

We copied the same two data sets onto each of them. Then, using Windows Explorer, we deleted the two data sets from each SSD. But first, a little background information. All the disks were brand new, and the data sets were designed to emulate as closely as possible the file contents of a standard Windows 10 computer. Data Sample 01 (11.9GB) contained Office documents (.docx, .pptx, .xlsx), video files (.avi and .m4v), photos (JPEG), and application and operating system files. Data Sample 02 (30.8GB) contained PDF, PST and application files.

So, we connected all three solid-state disks to a standard Windows 10 Professional desktop system using three separate disk caddies (all Orico 2.5”). These were then connected to the USB 3.0 ports of the host. Fifteen minutes after the delete command was issued, we scanned each of the disks using Forensic Toolkit (AccessData). The Crucial MX 500 and WD Blue still had their data intact. The Kingston A400 SSD had already lost its Data Sample 01.

It was now approaching 5pm. We would leave all disks connected to the host overnight. In a move which would have incurred the wrath of Greta Thunberg, we disabled all power saving features of the Windows 10 host system.

At 9am the next morning, we checked the disks again. They still had their data intact. (Obviously, Data Sample 01 on the Kingston was still undetectable.) We checked again at 11am. The result was the same. Finally, at 12 noon, we discovered that Data Sample 02 of the Crucial MX 500 was no longer appearing in FTK.


Discussion

It would appear that the Phison S11 controller used by the Kingston A400 SSD has a very aggressive garbage collection algorithm, deleting all evidence of Data Sample 01 in under 15 minutes. We were expecting that the Crucial and WD disks would lose their Data Sample 01 in line with the Kingston, but this did not happen. Instead, the Crucial relinquished all evidence of Data Sample 02 some 19 hours later. And under our twenty-four-hour test conditions, all the data on the WD Blue SSD would have been recoverable. This certainly contradicts the wisdom found on internet forums that once data is deleted from an SSD, it’s gone. Our little experiment showed otherwise. It also showed that there is very little uniformity in the way SSDs from different manufacturers, or SSDs using different controllers, handle deleted data.

Mitigating the effects of garbage collection

Data Sample 01 on the Kingston SSD was undetectable after just 15 minutes. Had this been a real-life case, it could have posed a major problem for a forensic investigator, system administrator or data recovery technician. One participant in the class suggested that a write blocker could have been used. However, write blockers are traditionally used to block I/O requests from the operating system and not internal commands from the disk controller.

Other Possible Solutions

One possible solution would be to disconnect the NAND chip from the PCB of the SSD in order to prevent garbage collection from operating. However, this “chip off” solution is a high-risk procedure because the controller is needed to read the data. And even when reading the NAND chips using an emulator, the investigator might not have the exact controller microcode for the disk model to upload. Some forensic investigators claim that activating “auto-dismount” on the host system can mitigate the effects of garbage collection, while others claim that using a write blocker can dampen the garbage collection process. However, none of these researchers has explained specifically how these measures interact with the disk controller to slow or stop the garbage collection process. There is also the option to image the SSD completely; with an unstable SSD, however, this might not be possible.

Further investigation

Further investigation of this issue will be difficult as garbage collection algorithms used by SSD and controller vendors are usually proprietary and a source of competitive advantage. The test-and-observe method might prove to be one of the richest sources of information on this topic. For those involved in disk forensics and recovery, it means there are going to be some interesting years ahead.
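For readers who want to try a test-and-observe approach themselves, one possible starting point is to compare successive images of the same disk chunk by chunk and record which regions have changed. The sketch below is only an outline under our own assumptions: the function names and sample data are invented, the images are held in memory as bytes, and this is not the method we used in the experiment above (we used FTK).

```python
import hashlib


def chunk_hashes(image, chunk_size):
    """Hash each fixed-size chunk of an image held in memory as bytes."""
    return [
        hashlib.sha256(image[offset:offset + chunk_size]).hexdigest()
        for offset in range(0, len(image), chunk_size)
    ]


def changed_chunks(earlier, later, chunk_size):
    """Return the indices of chunks whose contents differ between images."""
    before = chunk_hashes(earlier, chunk_size)
    after = chunk_hashes(later, chunk_size)
    return [i for i, (a, b) in enumerate(zip(before, after)) if a != b]


# Invented sample data: four 4-byte chunks, two of which later read as zeros.
snapshot_15_min = b"AAAA" + b"BBBB" + b"CCCC" + b"DDDD"
snapshot_next_day = b"AAAA" + b"\x00" * 4 + b"CCCC" + b"\x00" * 4

changed = changed_chunks(snapshot_15_min, snapshot_next_day, chunk_size=4)
```

Logging the changed chunk indices against the time each image was taken would give a rough picture of how quickly a particular controller reclaims deleted data.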