Let’s face it, some SSD models belch out more heat than a small nuclear power station. For these models, running hot is the normal mode of operation. In fact, with some SATA-based SSDs, the metal chassis is designed not only to protect the disk’s electronics but also to act as a passive heat-sink. For a standard computer, a typical temperature for an SSD under load is between 30°C and 50°C (86°F and 122°F), but this can vary a little between manufacturers. It is also normal to see heat spikes when your SSD goes from being idle to performing an intensive task, such as a large data transfer.
SSDs use NAND flash memory. This type of storage is non-volatile, which means it doesn’t require a continuous power supply to retain data. The floating-gate transistor (aka FGT, a metal-oxide semiconductor) is a popular type of NAND cell used in SSDs (such as those produced by Intel). Another cell technology used in NAND memory is Charge Trap Flash (CTF), but its thermal properties are similar to those of FGT, so for the purposes of this blog, the impact of heat on FGT-based SSDs will be discussed.
The FGT is built around two gates: the floating gate (FG) and the control gate (CG). Removing the electric charge from the FG is the Erase operation (erase data), whereas storing charge on the FG is the Program operation (write data). Both operations require power, and the temperature can increase significantly when the SSD is subjected to large workloads.
The “electron tunnelling” process used during Program/Erase (write/erase) cycles can damage the cell (FGT). The tunnel oxide, one of the layers that make up the FGT (as presented in Figure 1), wears out over time when it is exposed to high temperatures. This wear-out results in electron leakage and bit errors.
When an SSD is overheating, the controller can malfunction, leading to all sorts of erratic disk behaviour such as:
Your SSD is not recognised by Windows.
Your computer can’t see your SSD.
Your SSD appears as unformatted.
When you try to copy files off your SSD, your computer keeps on freezing.
You cannot copy files off your SSD.
Some files seem to have disappeared off your SSD for no particular reason.
The Catch–22 of SSDs and Heat
Be careful here! Many internet commentators mention that read/write operations in SSDs perform better at higher temperatures. This is correct; NAND programming has always worked optimally at higher temperatures. Put simply, when your SSD is hot, the read, write and erase operations will be quicker and smoother compared to a cooler disk. Degradation of the cell oxide layers during these operations is also reduced because the heat causes less stress. The catch is that, as described above, sustained high temperatures also promote electron leakage and can cause the controller to malfunction.
The M.2 Form Factor and Heat
User demand for lighter and thinner devices is not helping the situation. For example, the M.2 “stick of chewing gum” sized form factor has a relatively small surface area coupled with high data densities. A disk of this specification can draw up to 7 watts of power and can push temperatures up to 100°C. (At least the SATA-based SSDs have a larger surface area for heat dissipation and can use their chassis, which is often metal, as a heat-sink.)
Enter Thermal Throttling to Cool Things a Bit but also Slow Them Down…
Many SSD manufacturers use a function known as Thermal Throttling to prevent their devices from overheating. This monitors the temperature of the SSD via a built-in sensor. When the disk temperature reaches a pre-defined threshold, the thermal management function slows down the SSD’s performance to prevent it exceeding its maximum temperature. This results in fewer bits flipping due to heat and ultimately prevents premature failure. A simplified process of the Thermal Throttling technique is presented in Figure 2. It can be seen that the temperature of operation is above 70°C (158°F) which is “normal” for an M.2. However, to ascertain the normal operating temperature of your SSD, refer to the manufacturer’s specification sheet.
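The throttling decision described above can be sketched in a few lines of Python. This is only an illustration, not any manufacturer's firmware logic: the function name, the 75°C trigger (borrowed from the typical figure quoted in this post), the 70°C resume point and the 50% throttled speed are all assumptions, as is the use of a gap between the trigger and resume temperatures to stop the disk flip-flopping between states.

```python
from dataclasses import dataclass


@dataclass
class ThrottleState:
    """Remembers whether the (hypothetical) disk is currently throttled."""
    throttled: bool = False


def next_speed_fraction(temp_c: float, state: ThrottleState,
                        throttle_at_c: float = 75.0,      # assumed trigger temperature
                        resume_at_c: float = 70.0,        # assumed resume temperature
                        throttled_fraction: float = 0.5   # assumed reduced speed
                        ) -> float:
    """Return the fraction of full speed to run at for this sensor reading."""
    if temp_c >= throttle_at_c:
        # Threshold reached: slow the disk down.
        state.throttled = True
    elif temp_c <= resume_at_c:
        # Cooled off enough: return to full speed.
        state.throttled = False
    # Between the two temperatures, keep whatever state we were already in.
    return throttled_fraction if state.throttled else 1.0


if __name__ == "__main__":
    state = ThrottleState()
    for reading in (62.0, 71.0, 76.0, 73.0, 69.0):
        print(reading, next_speed_fraction(reading, state))
```

In this sketch the disk stays throttled at 73°C, because it has not yet cooled to the resume point, and only returns to full speed once a reading of 70°C or lower arrives.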
Each manufacturer will implement thermal throttling differently. For example, Samsung SSDs use Dynamic Thermal Guard (DTG). If a disk exceeds a threshold temperature, DTG will reduce the power to the NAND and MCU (controller). This disk self-preservation mechanism usually kicks in at around 75°C. For a lot of their SSD models, such as the 950 Pro, 960 Pro and 970 Pro, thermal throttling can be a fairly common occurrence under sustained workloads, such as heavy video editing or when the disk is being used in a busy VM server.
Data Recovery from an Intel SSD PCIe 660p M.2 Disk
Last week, we were dealing with an Intel SSD 660p which was proving toasty even after being connected for only ten minutes. This was making sector reads very difficult. We first had to bring the core temperature of the disk down. For this, we used a custom cooling device made for failing SSDs. It uses a heat sink with a very high surface area, which maximises the dissipation of heat. It also uses a high-velocity fan, which cools the disk further using convection. This enabled us to bring the disk’s temperature down from 80 to 52 degrees Celsius. Once the Intel 660p’s temperature had stabilised, we were able to connect it to our PCIe data recovery system. Normal reads were proving impossible. Therefore, we had to use a special PCIe disk reader with adjustable read timeout settings, controller power settings and disk reset functions. At a glacial speed of only 64 sectors per read, the disk took around two days to image. Even after this process, the MFT of the disk’s NTFS partition needed some repair. However, the effort was worth it: most of the client’s files (.DOC, PDF, XLSX, PPTX) were successfully recovered.
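To put “64 sectors per read” and “two days” into perspective, here is a rough back-of-the-envelope calculation in Python. The capacity of the disk is not stated above, so the 512 GB figure used in the example is purely an assumption, as are the 512-byte logical sector size and the function names.

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def bytes_per_read(sectors_per_read: int, sector_bytes: int = SECTOR_BYTES) -> int:
    """Amount of data fetched by a single read command."""
    return sectors_per_read * sector_bytes


def required_reads_per_second(capacity_bytes: int, sectors_per_read: int,
                              duration_s: float) -> float:
    """Average read rate needed to image the whole disk in the given time."""
    total_reads = capacity_bytes / bytes_per_read(sectors_per_read)
    return total_reads / duration_s


if __name__ == "__main__":
    capacity = 512 * 10**9    # assumed 512 GB disk (not stated in the post)
    two_days = 2 * 24 * 3600  # seconds
    rate = required_reads_per_second(capacity, 64, two_days)
    throughput_mb_s = rate * bytes_per_read(64) / 10**6
    print(f"{rate:.1f} reads/s, {throughput_mb_s:.2f} MB/s")
```

Under those assumptions, each read fetches only 32 KiB, and imaging the disk in two days works out to roughly 90 reads per second, or about 3 MB/s on average.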
Drive Rescue Dublin, Ireland offers an advanced data recovery service for failed SSDs such as the Intel 660p, Intel 7600p, Intel H10 SSD M.2, Micron 1100, 1300, 2200, 2300, 5100, WD SN550, SN750 and SK Hynix PC601, HFM256GDJTNG, HFM512GDJTNG. Serving satisfied customers in Dublin since 2007.