Photorec recovers weird .doc files
Posted: 18 Nov 2012, 13:45
by SadlyMistaken
Hello!
I just read the list of file types PhotoRec recovers.
In my test, it recovered a lot of file extensions:
.ini .jpg .png .txt .mpg .avi
This is perfect!
But there are some .doc files that I think are not really .doc, nor .ods or any similar kind of document file.
I am on Ubuntu (Linux) and I don't usually have .doc files.
I would love to know how to find the real extension of these files. They usually contain something like this:
Code:
����������R#o#o#t# #E#n#t#r#y#################################################��������################################` g�`��#####�#######1###################################################################����####����########################################�#######C#a#t#a#l#o#g#################################
where I can read a lot of symbols and the strings "Root Entry" and "Catalog".
I tried changing the file extension to .zip, .jpg, .png, .gif, .mp3, .flv, .ods and many others, but it never works.
What kind of files are these?
Are they former folders?
I hope you can help me.
I am sorry for my bad English.
Thanks a lot.
Re: Photorec recovers weird .doc files
Posted: 02 Dec 2012, 16:29
by cgrenier
These are probably thumbs.db files.
You are probably running PhotoRec 6.11. Use PhotoRec 6.14-WIP to get the correct file extension.
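For anyone who wants to check such files by hand, here is a minimal sketch and not PhotoRec's own logic. It relies on two things: every OLE2 compound file (.doc, .xls, thumbs.db and others) starts with the same 8-byte signature, and the stream names inside it are stored as UTF-16LE, which is why the dump above shows "R#o#o#t# #E#n#t#r#y". The function names and the short list of stream names are assumptions chosen for illustration; a "Catalog" stream is typical of older thumbs.db files, but this is only a heuristic.

```python
# Signature shared by all OLE2 compound files (.doc, .xls, thumbs.db, ...).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def utf16(name: str) -> bytes:
    """Stream names inside an OLE2 file are stored as UTF-16LE."""
    return name.encode("utf-16-le")


def guess_ole2_kind(data: bytes) -> str:
    """Return a rough guess for what an OLE2 container holds (heuristic)."""
    if not data.startswith(OLE2_MAGIC):
        return "not-ole2"
    if utf16("WordDocument") in data:
        return "doc"        # main stream of a Word document
    if utf16("Workbook") in data:
        return "xls"        # main stream of an Excel workbook
    if utf16("Catalog") in data:
        return "thumbs.db"  # thumbnail index, assumed to indicate thumbs.db
    return "ole2-unknown"
```

To try it on a recovered file, read the file in binary mode and pass the bytes, for example `guess_ole2_kind(open(path, "rb").read())`.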
Re: Photorec recovers weird .doc files
Posted: 04 Dec 2012, 00:15
by andy16h
I'm running 6.14-WIP and getting the same problem. I also have a lot of .txt files that have
Code:
UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUULAME3.89 (alpha)UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU
or
Code:
:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#' xmlns:iX='http://ns.adobe.com/iX/1.0/'>
<rdf:Description rdf:about='uuid:aabe4b29-aca1-11d9-8bcd-98e32862e0fa'
xmlns:pdf='http://ns.adobe.com/pdf/1.3/'>
</rdf:Description>
<rdf:Description rdf:about='uuid:aabe4b29-aca1-11d9-8bcd-98e32862e0fa'
xmlns:photoshop='http://ns.adobe.com/photoshop/1.0/'>
<photoshop:History></photoshop:History>
</rdf:Description>
<rdf:Description rdf:about='uuid:aabe4b29-aca1-11d9-8bcd-98e32862e0fa'
xmlns:tiff='http://ns.adobe.com/tiff/1.0/'>
</rdf:Description>
<rdf:Description rdf:about='uuid:aabe4b29-aca1-11d9-8bcd-98e32862e0fa'
xmlns:xap='http://ns.adobe.com/xap/1.0/'>
<xap:CreateDate>2005-04-14T10:30:24-08:00</xap:CreateDate>
<xap:ModifyDate>2005-04-14T10:30:24-08:00</xap:ModifyDate>
<xap:MetadataDate>2005-04-14T10:30:24-08:00</xap:MetadataDate>
<xap:CreatorTool>Adobe Photoshop CS Windows</xap:CreatorTool>
</rdf:Description>
<rdf:Description rdf:about='uuid:aabe4b29-aca1-11d9-8bcd-98e32862e0fa'
xmlns:xapMM='http://ns.adobe.com/xap/1.0/mm/'>
<xapMM:DocumentID>adobe:docid:photoshop:aabe4b28-aca1-11d9-8bcd-98e32862e0fa</xapMM:DocumentID>
</rdf:Description>
<rdf:Description rdf:about='uuid:aabe4b29-aca1-11d9-8bcd-98e32862e0fa'
xmlns:dc='http://purl.org/dc/elements/1.1/'>
<dc:format>image/tiff</dc:format>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end='w'?>8BIM
I would love to know what these files are.
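Both samples above carry recognisable markers: "LAME3.89" is the version tag the LAME MP3 encoder writes into frame padding, and the second one is the tail of an XMP metadata packet followed by "8BIM", the signature of a Photoshop image resource block. A rough sketch of how such fragments could be labelled follows. The marker table and the function name are made up for illustration and are far from complete; this is not how PhotoRec identifies files.

```python
from typing import List

# Marker -> hint about the file the fragment was probably cut from.
# Illustrative only; a real tool would need many more entries.
MARKERS = {
    b"LAME3.": "MP3 audio (LAME encoder tag inside frame padding)",
    b"<?xpacket": "XMP metadata packet embedded in an image or PDF",
    b"8BIM": "Photoshop image resource block (PSD, TIFF or JPEG)",
    b"<dc:format>image/tiff</dc:format>": "XMP metadata describing a TIFF image",
}


def fragment_hints(data: bytes) -> List[str]:
    """Return a hint for every known marker found in the data."""
    return [hint for marker, hint in MARKERS.items() if marker in data]
```

A fragment that yields hints like these is most likely a piece of a larger MP3 or image file that was carved out on its own because it happened to look like text.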
Re: Photorec recovers weird .doc files
Posted: 05 Dec 2012, 17:16
by stumpyuk
Text files do not have a signature. Essentially, PhotoRec checks whether the data it has found has the characteristics of text. Thus it may, for instance, detect a fragment of XML (which is text-based) and recover it as plain text. There is a fuller explanation here:
http://www.cgsecurity.org/wiki/PhotoRec_Data_Carving
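To make the idea of "has the characteristics of text" concrete, here is a deliberately simple sketch that measures the share of printable ASCII bytes in a block. It is an assumption about what such a check could look like, not PhotoRec's actual algorithm; the 0.95 threshold and the function names are arbitrary choices for illustration.

```python
# Bytes treated as "text": tab, LF, CR and printable ASCII.
TEXT_BYTES = frozenset({9, 10, 13}) | frozenset(range(32, 127))


def printable_ratio(data: bytes) -> float:
    """Fraction of bytes in the block that look like plain text."""
    if not data:
        return 0.0
    return sum(b in TEXT_BYTES for b in data) / len(data)


def looks_like_text(data: bytes, threshold: float = 0.95) -> bool:
    """Heuristic: call the block text if nearly all bytes are printable."""
    return printable_ratio(data) >= threshold
```

A check of this kind accepts an XML fragment or a run of "U" padding from an MP3 just as readily as a real text file, which matches the behaviour described in this thread.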
Re: Photorec recovers weird .doc files
Posted: 08 Dec 2012, 17:13
by cgrenier
andy16h, run PhotoRec, disable the txt and tx? file families in FileOpts, and start a recovery. Please send me some of the tiff or jpg files that are now recovered but weren't previously; I will try to improve PhotoRec to deal with this problem.
Re: Photorec recovers weird .doc files
Posted: 05 Jan 2013, 21:25
by MikeHalloran
On that subject, it would be nice to have an option to recover ONLY .txt and NOT .xml, .c, .h and related files.
Okay, I know that xml tags might occasionally appear in a simple text file, and even lines of legitimate C code and such, but unprintable characters are never present, and there has to be a way to distinguish a file that's mostly plain text from a file that's mostly something else.
Thanks for anything you can do.
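One way the distinction asked for above could be approached, purely as a sketch: count how many non-empty lines look like markup or like C source, and only call the file plain text when neither share is large. The regular expressions, the 0.5 threshold and the function name are hypothetical, and nothing here reflects an existing PhotoRec option.

```python
import re

# A line "looks like markup" if it contains something shaped like a tag.
TAG = re.compile(r"</?[A-Za-z_][\w:.-]*[^<>]*>")
# A line "looks like C" if it is a preprocessor directive or ends in ; { or }.
C_LINE = re.compile(r"^\s*(#\s*(include|define|ifn?def|endif)\b|.*[;{}]\s*$)")


def classify_text(text: str, threshold: float = 0.5) -> str:
    """Label text as 'xml-like', 'c-like', 'plain' or 'empty' (heuristic)."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return "empty"
    xml_share = sum(bool(TAG.search(line)) for line in lines) / len(lines)
    c_share = sum(bool(C_LINE.match(line)) for line in lines) / len(lines)
    if xml_share >= threshold and xml_share >= c_share:
        return "xml-like"
    if c_share >= threshold:
        return "c-like"
    return "plain"
```

Because it works on line shares, an occasional tag or a stray line of code inside an ordinary letter does not change the result, while a file that is mostly markup or mostly code is filtered out.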