Search MFT number now surpassed expected size?

Using TestDisk to repair the filesystem
diskresc
Posts: 7
Joined: 30 Sep 2019, 05:02

#1 Post by diskresc »

Hi,

I have an 8-disk USB storage unit that suddenly ran into issues while I was moving a lot of files between disks on the same USB interface. Lesson learned, I guess... Two of the 8 disks have the same problem: they mount in TrueCrypt, but the drive is impossible to access.

I have backup headers for all disks, and I tried restoring the header on the first problematic disk, but it made no difference. My guess is that the MFT is corrupt.

I've been running TestDisk overnight on the disk to rebuild the MFT, and to my surprise the number in "Search MFT" is now above the number it predicted it had to search (see screenshot).

Image

How come? Is this normal? As you can see, the number on the left started at 0, and I expected it to be done once it reached 2743151280, but it has passed that by far. How far should I let it go?

Thanks.

8 Disk USB Drive critical failure, 5 out of 8 disks MFT corrupt simultaneously?

#2 Post by diskresc »

Hello,

I am in distress; here's my scenario. I've been using a StarTech 8-disk USB enclosure in JBOD mode, with every disk encrypted with TrueCrypt (full-disk encryption).

Yesterday I started a large data copy via Explorer from one USB drive to another USB drive (inside the same enclosure). The copy ran for hours.
Later that day, Event Viewer suddenly reported:

The IO operation at logical block address 0x252c2bb30 for Disk 12 (PDO name: \Device\0000006e) was retried.

This event was followed by a large number of:

A corruption was discovered in the file system structure on volume S:

I didn't notice the issue until 3-4 hours after this message appeared in the event log, when I tried to copy a file from one of the drives over the network and got "file not found". Once I logged in to the server, I saw that all the drives were mounted in TrueCrypt but none of the files were accessible. I figured it was a Windows bug, so I dismounted all the encrypted drives and rebooted the entire system.

Once back online, 5 out of 8 drives are inaccessible. They all mount fine in TrueCrypt, so the header appears valid, but I guess the MFT is broken on all 5 drives. What's even stranger is that the S: drive, the one Windows flagged in Event Viewer, works perfectly fine.

Now to the extremely interesting part... There are 8 drives in the enclosure, but only 2 different types of drive.
  • 5 drives are model ST8000AS0002, also known as "Seagate Archive V2", installed in enclosure slots 1, 4, 5, 6, 7
  • 3 drives are model ST8000VN0022, also known as "Seagate IronWolf", installed in enclosure slots 2, 3, 8
All the Seagate Archive V2 drives are inaccessible; all of them appear to have a broken MFT.
All of the Seagate IronWolf drives are accessible and work fine, as if nothing ever happened.

So what exactly happened here? All of the USB drives were running in "Quick Removal" mode, which should mean the drives can safely be removed without notice. I also have TrueCrypt backup headers for every drive, but as I said, the headers are fine and the drives mount fine.

Since all of these drives are 8 TB, I have yet to manage to analyze one of them fully. I ran TestDisk and it searched for the MFT for about 15 hours until I cancelled it manually, because it wasn't making any sense whatsoever. The displayed progress of 0/270000000000 made sense at the beginning: I assumed it would be done once the 0 reached the number on the other side of the "/". But it surpassed that number, then looped back to 0 and kept going. (See screenshot: https://i.imgur.com/EJcfWF7.png)

I am really out of luck here; I cannot believe this happened. I have taken every damn precaution against data loss. I opted for a JBOD solution out of worry that a bad RAID card would write corrupt data to the array and make me lose everything; I felt that losing 1 drive is better than losing an entire array. I ran the enclosure in Quick Removal mode to ensure no corruption would occur in a power outage. It is also connected to a UPS, and I monitor SMART daily via email reports.

Even though I did all of this... this somehow happened? I have no idea how. Even if I can somehow salvage things and rebuild the MFT index, I am still very unsure whether I ever want to use the enclosure again. It has been running for over 1 year with zero problems, and suddenly bam: 60% of the entire data index is gone.

Help appreciated.

cgrenier
Site Admin
Posts: 5432
Joined: 18 Feb 2012, 15:08
Location: Le Perreux Sur Marne, France

#3 Post by cgrenier »

I think it's a display problem (an integer-size problem).
But searching for the MFT will not work on an encrypted volume.
You need to run TestDisk or PhotoRec on the unlocked volume (i.e. S:), or decrypt the whole disk before running the tools on the physical drive.
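
The "integer size" explanation is consistent with the number in the first post. As a rough check, assume the disk exposes 15,628,053,168 sectors of 512 bytes, which is what the IDEMA capacity formula gives for an 8 TB drive (this figure is an assumption; the thread never states the sector count). Truncating that count to 32 bits gives exactly the 2743151280 shown as the total, so a position counter that is not truncated would be expected to run past it. The sketch below only reproduces that arithmetic. It is not TestDisk's code, and `truncate_to_uint32` is a hypothetical helper.

```python
UINT32_MODULUS = 1 << 32  # number of distinct values in a 32-bit unsigned field


def truncate_to_uint32(value: int) -> int:
    """Return value as it would read back from a 32-bit unsigned field."""
    return value % UINT32_MODULUS


# Assumption: sector count from the IDEMA formula for an 8000 GB drive
# with 512-byte logical sectors. The thread does not state this number.
ASSUMED_TOTAL_SECTORS = 97_696_368 + 1_953_504 * (8000 - 50)

displayed_total = truncate_to_uint32(ASSUMED_TOTAL_SECTORS)

print(ASSUMED_TOTAL_SECTORS)  # 15628053168
print(displayed_total)        # 2743151280, the total shown in the first post
```

If this is what happened, the progress display says nothing about whether the scan itself was working, and the scan would only have finished at the real sector count.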

#4 Post by diskresc »

Thanks for the reply cgrenier.

Do you think there's a better chance of MFT recovery if I decrypt the drive first, or is that pointless? I'm currently running GetDataBack on one of the drives to see if it can find any MFT records. It's at 60% and has found 300 000 files, which doesn't make sense: the drive only contained a total of 129 000 files when it crashed (I have logs of this). I'm not sure where the other files are coming from.

I'm also surprised that 5 drives crashed when absolutely nothing was being written to or read from them; I thought drives that were not being used were idle. Something tells me the enclosure's USB controller is to blame for all of this horseshit. What a way to go. RIP.

Cheers for your wonderful tools and this forum; there's a lot of useful information here for someone who's a complete stranger to filesystems.

recuperation
Posts: 2720
Joined: 04 Jan 2019, 09:48
Location: Hannover, Deutschland (Germany, Allemagne)

#5 Post by recuperation »

diskresc wrote: 30 Sep 2019, 20:27
Thanks for the reply cgrenier.

Do you think there's a bigger chance of MFT recovery if I decrypt the drive first or is that pointless?

As long as you run your recovery on the unlocked (decrypted) volume, it should not make any difference.
Decrypting a volume puts a heavy load on the encrypted drive, so check the SMART parameters first or duplicate the drive(s).
Rebuilding the MFT is technically impossible because its information is unique; the MFT backup only contains a few entries. Furthermore, MFT damage has not been proven yet.

diskresc wrote:
I'm currently running GetDataBack on one of the drives to see if it can find any MFT records, right now it's at 60% it's found 300 000 files which doesn't make sense, the drive only contained a total of 129 000 files when it crashed (I have logs of this), not sure where the other files are coming from?

Depending on how your recovery program operates, recovering garbage is a common experience. Maybe you are scanning the raw, still-encrypted device.

diskresc wrote:
I also am surprised that 5 drives crash when there's absolutely nothing being written or read from them, I thought drives that were not being used were idle. Something tells me the enclosure usb controller is to blame for all of this horseshit, what a way to go. rip

Failure is a part of operation. Unfortunately you are to blame because you never mentioned the word "backup".
At least your JBOD enables you to deal with your problems for each drive separately.
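
One quick way to check which device a recovery tool is really looking at is to inspect its first sector. A decrypted NTFS volume normally starts with a boot sector that carries the OEM ID `NTFS` (padded with four spaces) at byte offset 3 and the signature bytes 55 AA at offset 510, whereas the raw encrypted device should look like random data. A minimal sketch, assuming the first 512 bytes have already been read from the device or from an image of it; the function name is hypothetical, and a damaged boot sector would fail this check even on the right volume.

```python
NTFS_OEM_ID = b"NTFS    "     # 8 bytes at offset 3 of an NTFS boot sector
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of the 512-byte boot sector


def looks_like_ntfs_boot_sector(sector: bytes) -> bool:
    """Return True if the given first sector has the usual NTFS markers."""
    if len(sector) < 512:
        return False
    return sector[3:11] == NTFS_OEM_ID and sector[510:512] == BOOT_SIGNATURE


def read_first_sector(path: str) -> bytes:
    """Read the first 512 bytes of a device or image file, read-only."""
    with open(path, "rb") as source:
        return source.read(512)


# Synthetic example: a blank sector with only the two markers filled in.
fake_sector = bytearray(512)
fake_sector[3:11] = NTFS_OEM_ID
fake_sector[510:512] = BOOT_SIGNATURE

print(looks_like_ntfs_boot_sector(bytes(fake_sector)))  # True
print(looks_like_ntfs_boot_sector(bytes(512)))          # False
```

A negative result on the device a tool is scanning suggests it is the encrypted one, or that the boot sector itself is damaged.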

#6 Post by diskresc »

recuperation wrote: 01 Oct 2019, 11:40
Yeah, GetDataBack was actually running on the encrypted volume even though I had it mounted. It's hard to tell which is which, but GetDataBack found only one volume, and it was the raw one, so I wasted around 20 hours there.

I am now running TestDisk again, this time with the mounted volume selected. It actually finds the boot sector and it looks OK, but I chose to rebuild the BS anyway. It's now searching for the MFT; I'm hoping it can find something.

As for backups, I didn't have enough money. This was the solution I could afford that would give me the most redundancy in most scenarios, except this one, which I didn't plan for.

Having a mirror backup for 50+ TB is very expensive. The data itself is important to me, but I could deal with the loss of 1 disk. I was hoping that by monitoring SMART I could move data off a drive that was failing, and if I lost an entire drive and its data after all, I could deal with it. Losing 5 drives at once is something that never happens, ever... I didn't go for a RAID solution with 2 spare drives because I don't trust RAID. If you lose one 8 TB drive in a huge array, the load on the other disks and the system while rebuilding that data is immense and could potentially crash the entire array. Not to mention a broken RAID controller that starts writing corrupt data all over the array, and that's a wrap for *.*.

I mean, this must be some kind of once-in-a-lifetime thing. I feel like running anything on a USB interface is a big no-no, as it's not safe in any way, even though you may be led to believe it is. I have most of these files in the cloud, but it's a mess, and restoring the data on the drives as it was is important to me; that's why the filenames are important too.

Anyway, I will post updates as I go along. I have 5 identical drives with the exact same issue, and nothing was being written to any of them when this occurred, so I would be very surprised if the data is completely corrupted. All disks look fine in SMART; they are not bad in any way.

#7 Post by diskresc »

So a full Rebuild BS on one of the drives does absolutely nothing: I still cannot list files afterwards, and it gives me the same messages as before. (This time it was run on the unencrypted volume.)

I tried to decrypt the volume, but TrueCrypt tells me the volume is not encrypted, even though it's mounted in TrueCrypt with AES encryption.

I have no idea what has happened. All 5 drives behave the same way. Considering no data was being written to any of the drives when this occurred, I am really confused about how this could even happen. The last resort is running PhotoRec and getting back unidentifiable data, which doesn't help me much, because the names of the files and the directory structure are just as important as the data itself. I will have to start rebuilding the data from my logs and cloud backups, something I'd prefer not to do, but it appears to be the only way.

In the future, what can I do to prevent this from happening? I still do not understand what happened. A USB disk enclosure was sitting idle when something suddenly started fixing detected MFT issues on one of the drives (my guess is that it's some scheduled job in Windows that looks for disk problems from time to time). Once this occurred, the entire fleet of drives became unresponsive and no data could be accessed on any of them.

I dismounted all the encrypted drives and unmounted the USB device before shutting down the system for a reboot. The system raised no complaints; usually it will tell you "drive is in use" if the drive is being used while you attempt to dismount it.

Once rebooted, I am only able to access 3 out of 8 drives in the JBOD enclosure. The drives report zero issues in the SMART application. The drives that work are perfectly fine and operate normally; the ones that don't work can't even be detected as encrypted by TrueCrypt anymore, even though I had all the "headers" backed up, and restoring those makes zero difference.

I'm not sure how to proceed once I have rebuilt my data. I can't keep using this solution if I have no idea what triggered the problem to begin with. It had been running 24/7 for over a year with zero issues, then suddenly this.

Any input is appreciated. Now begins the daunting task of piecing together my data from the cloud.

#8 Post by diskresc »

Holy smokes, I just lost another drive. It's the cabinet that is corrupting the drives... I have no idea what to do.

I inserted a working drive into the cabinet in order to access some files. I ran chkdsk on the drive first:
Volume D: (\Device\TrueCryptVolumeD) is healthy. No action is needed.
After 3-4 minutes, Event Viewer suddenly reported:
Volume D: (\Device\TrueCryptVolumeD) needs to be taken offline to perform a Full Chkdsk. Please run "CHKDSK /F" locally via the command line, or run "REPAIR-VOLUME <drive:>" locally or remotely via PowerShell.
Once I rebooted, the drive was in the same state as the other 5 that were previously corrupted. This drive ran outside the cabinet for about 5 hours yesterday with no issues, and now its data is gone after a few minutes inside the cabinet of death. WTF?

The cabinet had been working flawlessly until this problem occurred. I am definitely not inserting any other drives into it; I lost another fucking 8 TB of data.

To anyone thinking that buying one of these cheap drive bays is a good idea: let this 50+ TB data loss be a reminder that it is not. Stay away from this product: https://www.startech.com/HDD/Enclosures ... 358BU33ERM

I'm fucking furious... This is horrific, the worst data loss I have ever experienced. Signing out, fuck my life.

#9 Post by recuperation »

diskresc wrote: 02 Oct 2019, 07:42 Holy smokes i just lost another drive, it's the cabinette that is corrupting the drives... i have no idea what to do.
Try to exclude uncertainties. Use a different machine. Connect one questionable drive internally.
Post a SMART log; you haven't provided any documentation yet.
Copy your drive before writing to it. Use the "backup header" function to make another backup of your header, and label it unambiguously before trying to restore the old backup header.

Test the data path on the computer in question: attach just one drive through the USB port you used for the cabinet and run h2testw under Windows, for instance, or F3 on Linux.
If you set up a new system, perform a load test with IOMeter. I don't know whether IOMeter has any error detection; alternatively, run multiple instances of h2testw on multiple drives simultaneously.
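
The idea behind such a write-and-verify test can be sketched in a few lines: fill a file with data that can be regenerated from a seed, read it back, and report which chunks differ. This is only an illustration of the principle, not how h2testw or F3 are implemented. The function names, chunk size and seed are assumptions, and it should only be pointed at a scratch file on a drive that holds no data you need.

```python
import hashlib
import os

CHUNK_SIZE = 1024 * 1024  # 1 MiB per chunk (arbitrary choice)
DEFAULT_SEED = b"write-verify-sketch"


def chunk_pattern(seed: bytes, index: int) -> bytes:
    """Deterministic pseudo-random chunk derived from the seed and chunk index."""
    material = seed + index.to_bytes(8, "little")
    return hashlib.shake_256(material).digest(CHUNK_SIZE)


def write_test_file(path: str, chunks: int, seed: bytes = DEFAULT_SEED) -> None:
    """Write `chunks` reproducible chunks to `path` and flush them to the device."""
    with open(path, "wb") as target:
        for index in range(chunks):
            target.write(chunk_pattern(seed, index))
        target.flush()
        os.fsync(target.fileno())


def verify_test_file(path: str, chunks: int, seed: bytes = DEFAULT_SEED) -> list:
    """Return the indices of chunks whose content differs from what was written."""
    mismatched = []
    with open(path, "rb") as source:
        for index in range(chunks):
            if source.read(CHUNK_SIZE) != chunk_pattern(seed, index):
                mismatched.append(index)
    return mismatched
```

Reading the file back straight after writing may be served from the operating system's cache rather than from the drive, so a more meaningful check would reconnect the drive (or reboot) between `write_test_file` and `verify_test_file`.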

Finally, try to get comfortable with the idea of backups. Your concept is like buying a Ferrari and refusing to pay for the necessary engine belt change.

Read the documentation:
[...
Max Drive Capacity: Currently tested with up to 6TB 7200 RPM hard drives
...]

#10 Post by diskresc »

The cabinet was connected to another machine when the 6th disk crashed. I am now trying to reproduce a crash on another 8 TB drive that wasn't in the cabinet when it all started. I cannot get it to crash, so maybe the cabinet isn't at fault after all. My guess is that whatever damage was done when it all went down may have been done to all the drives, and that even the 3 that appeared to work will, if used long enough, trigger Windows to detect MFT errors; once that happens, it instantly scrambles the drive and it becomes inaccessible. I am now mounting everything read-only, because I think it's Windows trying to fix the MFT, but since it's a TrueCrypt volume it somehow fails, and rather than fixing anything it makes things worse?

Either way, I am now running PhotoRec just to see whether the data is indeed there, and it is. That much I knew, but to my surprise I am getting filenames too. I thought filenames would not be possible without the MFT? Time will tell how good the data is and how much of it comes with filenames.

Whatever happens, the worst part is that I've learned nothing; I still do not understand what happened. I am trying to reproduce a crash but it won't happen, and the disks that crashed are not reporting any SMART errors. Another theory is that the USB cable is faulty. I tried jiggling it around to see if that made the drives lose connectivity, but I couldn't make it happen. I have no idea what happened, but I am facing a month or more of work just to get back on track, with no real idea how to proceed and the risk of everything collapsing out of the blue without any real problem found anywhere...

I may stop using encryption altogether. I believe this recovery would not be as difficult without it. I thought I was safe by backing up the volume headers; I didn't see this coming.
