When trying to recover a FAT32(?) filesystem, PhotoRec hangs in photorec_find_blocksize()
Posted: 10 Jan 2022, 05:36
I somehow "corrupted" (it's unreadable on Windows, but I later discovered it mounts fine on Linux) a FAT32 filesystem on an old hard drive. When trying to recover it in TestDisk, I got the message "No file found, filesystem may be damaged". Looking online I found viewtopic.php?t=3052:
"Try PhotoRec. In Options enable the expert mode, start a recovery on the Whole space of the FAT32 partition, when asked, tell PhotoRec to try the unformat method. You may be able to recover your files with the original filenames."
I tried running it on my disk, but with 7.1 on Windows, 7.1 on Arch Linux, 7.2 WIP, and git, every time it prints "FAT filesystem was beginning before the actual partition.", and when I confirm Ok, it hangs burning a CPU core and not performing any disk IO at all (according to Windows taskmgr and Arch Linux iotop):
Code: Select all
PhotoRec 7.2-WIP, Data Recovery Utility, May 2021
Christophe GRENIER <grenier@cgsecurity.org>
https://www.cgsecurity.org
Disk /dev/sdc - 500 GB / 465 GiB (RO) - Seagate FreeAgent GoFlex
Partition Start End Size in sectors
1 P FAT32 0 32 33 12959 179 20 208195584
...
Stop
I couldn't debug this on Windows, but on Arch Linux, I managed to build PhotoRec with debug symbols enabled, and trace it in gdb. I ran PhotoRec to the "FAT filesystem was beginning before the actual partition." message, then attached gdb and added a breakpoint at photorec_find_blocksize with stack trace:
Code: Select all
#0 photorec_find_blocksize (params=params@entry=0x7ffca2a4d3d0, options=options@entry=0x7ffca2a4d3b0, list_search_space=list_search_space@entry=0x55acb525a300 <list_search_space>) at phbs.c:74
#1 0x000055acb5225b7e in photorec (params=params@entry=0x7ffca2a4d3d0, options=options@entry=0x7ffca2a4d3b0, list_search_space=list_search_space@entry=0x55acb525a300 <list_search_space>) at phrecn.c:338
#2 0x000055acb5226f9a in menu_photorec (params=params@entry=0x7ffca2a4d3d0, options=options@entry=0x7ffca2a4d3b0, list_search_space=list_search_space@entry=0x55acb525a300 <list_search_space>) at ppartseln.c:288
#3 0x000055acb522265d in photorec_disk_selection_ncurses (list_search_space=0x55acb525a300 <list_search_space>, list_disk=0x55acb6b0c9a0, options=0x7ffca2a4d3b0, params=0x7ffca2a4d3d0) at pdiskseln.c:252
#4 do_curses_photorec (params=params@entry=0x7ffca2a4d3d0, options=options@entry=0x7ffca2a4d3b0, list_disk=list_disk@entry=0x55acb6b0c9a0) at pdiskseln.c:348
#5 0x000055acb51cd3a0 in main (argc=2, argv=0x7ffca2a4d558) at phmain.c:393
Afterwards, I added a breakpoint on line 103 (the beginning of the endless while loop). Before the first loop iteration, list_search_space was (alloc_data_t *) 0x55acb525a300, and current_search_space was (alloc_data_t *) 0x55acb6bfc410. I typed continue, which broke on line 103 again, and current_search_space was still (alloc_data_t *) 0x55acb6bfc410 (unchanged!). If I delete all breakpoints and type finish, the function never returns.
I'm not sure how to fix this bug, and didn't debug further than that. I could run commands or supply partial disk images if needed.
----
EDIT: I found what was wrong with my disk: https://bugzilla.gnome.org/show_bug.cgi?id=759916#c21:
"In case it can help someone, here's the oneliner I've used to fix the broken FS on a USB key, based on the comments above:
Code: Select all
$ echo -ne '\xeb\x58\x90' | sudo dd conv=notrunc bs=1 count=3 of=/dev/sdb1
Be careful to target the right partition (/dev/sdb1 in my case, probably something else in yours) and to first test the command line on a text file to make sure the hexadecimal is properly interpreted by your shell (the above works in ZSH, but with other shell, you might have to double the backslashes)."
I ran this command in bash (not zsh), and it fixed a partition on my hard drive with the same issue (previously, Windows and testdisk couldn't recognize the filesystem on the partition). I didn't test on fish though.
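For anyone who wants to follow the quoted advice and dry-run the write on a file first, here is a rough sketch of how that could look. It is not from the bugzilla comment: the scratch file from mktemp is just a stand-in for the partition, and I swapped `echo -ne` for printf with octal escapes on the assumption that this avoids the shell-specific backslash handling the comment warns about. Nothing here touches a real device.

```shell
# Scratch file standing in for the partition (hypothetical; no real device is touched).
img=$(mktemp)

# Fill it with 512 zero bytes, the size of one sector.
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null

# Octal escapes (\353 \130 \220 = 0xeb 0x58 0x90) are part of POSIX printf,
# so this does not depend on how a given shell's echo treats \x escapes.
printf '\353\130\220' | dd conv=notrunc bs=1 count=3 of="$img" 2>/dev/null

# Read the first three bytes back as hex, and check the file was not truncated.
first3=$(od -An -tx1 -N3 "$img" | tr -d ' \n')
size=$(( $(wc -c < "$img") ))
echo "first bytes: $first3, size: $size"

rm -f "$img"
```

If the shell interprets the escapes as intended, this should print "first bytes: eb5890, size: 512". Only once that looks right would I point of= at the actual partition (with sudo), and I'd double-check the device name before doing so.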
I still think Linux's fsck.vfat needs to be changed to recognize and fix this error, and testdisk should ideally recognize this type of corrupted disk and restore it for you, and photorec shouldn't enter an infinite loop when trying to unformat this type of broken partition.