Photorec recovers weird .doc files

Posted: 18 Nov 2012, 13:45
by SadlyMistaken
Hello!

I just read the list of file types PhotoRec recovers.
In my test, it recovered a lot of file extensions:
.ini .jpg .png .txt .mpg .avi
This is perfect!
But some of the recovered .doc files don't seem to be real .doc files, and there are no .ods files or other documents of that kind.
I am on Ubuntu (Linux) and I don't usually have .doc files.
I would love to know how to find the real extension of these files. They usually contain something like this:

Code:

����������R#o#o#t# #E#n#t#r#y#################################################��������################################`	g�`��#####�#######1###################################################################����####����########################################�#######C#a#t#a#l#o#g#################################
In it I can read a lot of symbols plus "Root Entry" and "Catalog".
I tried renaming the file extension to .zip, .jpg, .png, .gif, .mp3, .flv, .ods, and many others, but none of them work.

What kind of files are these?
Are they former folders?

I hope you can help me.
I am sorry for my bad English.
Thanks a lot.

Re: Photorec recovers weird .doc files

Posted: 02 Dec 2012, 16:29
by cgrenier
These are probably thumbs.db files.
You are probably running PhotoRec 6.11. Use PhotoRec 6.14-WIP to get the correct file extension.
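
For anyone who wants to check this by hand: thumbs.db and old-style .doc files both use the OLE2 (Compound File Binary) container, which is why "Root Entry" shows up in the dump, with every letter followed by a NUL byte because the names are stored as UTF-16LE. The sketch below is only a rough heuristic and not how PhotoRec decides; the function name `guess_ole2_kind` and the returned labels are made up for illustration. It assumes that a "Catalog" stream points to an older thumbs.db and a "WordDocument" stream points to a real Word file.

```python
# Magic bytes that open every OLE2 / Compound File Binary container.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def utf16(name: str) -> bytes:
    """Stream names inside an OLE2 container are stored as UTF-16LE."""
    return name.encode("utf-16-le")


def guess_ole2_kind(data: bytes) -> str:
    """Return a rough guess at what an OLE2 container holds (hypothetical labels)."""
    if not data.startswith(OLE2_MAGIC):
        return "not-ole2"
    if utf16("WordDocument") in data:
        return "doc"          # main stream of a Word 97-2003 document
    if utf16("Catalog") in data:
        return "thumbs.db"    # thumbnail index stream of older thumbs.db files
    return "ole2-unknown"
```

To try it on a recovered file, read it in binary mode and pass the bytes in, for example `guess_ole2_kind(open(path, "rb").read())`.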

Re: Photorec recovers weird .doc files

Posted: 04 Dec 2012, 00:15
by andy16h
I'm running 6.14-WIP and getting the same problem. I also have a lot of .txt files that contain

Code:

UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUULAME3.89 (alpha)UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU
or

Code:

:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#' xmlns:iX='http://ns.adobe.com/iX/1.0/'>

 <rdf:Description rdf:about='uuid:aabe4b29-aca1-11d9-8bcd-98e32862e0fa'
  xmlns:pdf='http://ns.adobe.com/pdf/1.3/'>
 </rdf:Description>

 <rdf:Description rdf:about='uuid:aabe4b29-aca1-11d9-8bcd-98e32862e0fa'
  xmlns:photoshop='http://ns.adobe.com/photoshop/1.0/'>
  <photoshop:History></photoshop:History>
 </rdf:Description>

 <rdf:Description rdf:about='uuid:aabe4b29-aca1-11d9-8bcd-98e32862e0fa'
  xmlns:tiff='http://ns.adobe.com/tiff/1.0/'>
 </rdf:Description>

 <rdf:Description rdf:about='uuid:aabe4b29-aca1-11d9-8bcd-98e32862e0fa'
  xmlns:xap='http://ns.adobe.com/xap/1.0/'>
  <xap:CreateDate>2005-04-14T10:30:24-08:00</xap:CreateDate>
  <xap:ModifyDate>2005-04-14T10:30:24-08:00</xap:ModifyDate>
  <xap:MetadataDate>2005-04-14T10:30:24-08:00</xap:MetadataDate>
  <xap:CreatorTool>Adobe Photoshop CS Windows</xap:CreatorTool>
 </rdf:Description>

 <rdf:Description rdf:about='uuid:aabe4b29-aca1-11d9-8bcd-98e32862e0fa'
  xmlns:xapMM='http://ns.adobe.com/xap/1.0/mm/'>
  <xapMM:DocumentID>adobe:docid:photoshop:aabe4b28-aca1-11d9-8bcd-98e32862e0fa</xapMM:DocumentID>
 </rdf:Description>

 <rdf:Description rdf:about='uuid:aabe4b29-aca1-11d9-8bcd-98e32862e0fa'
  xmlns:dc='http://purl.org/dc/elements/1.1/'>
  <dc:format>image/tiff</dc:format>
 </rdf:Description>

</rdf:RDF>
</x:xmpmeta>
                                                                                                    
 <?xpacket end='w'?>8BIM
I would love to know what these files are.
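
Both samples carry recognizable markers: "LAME3.89" is the tag the LAME encoder writes into MP3 audio, and the second sample is an XMP metadata packet that declares `image/tiff` and is followed by "8BIM", the signature of a Photoshop image resource block. So these look like pieces of an MP3 and of a Photoshop-saved TIFF rather than real text files. A minimal sketch for scanning recovered .txt files for such markers follows; the marker table and the name `find_markers` are assumptions for illustration, and a hit only hints at the original file type.

```python
# Byte markers that hint at the file a fragment was cut from.
# The labels are this sketch's own wording, not PhotoRec output.
MARKERS = {
    b"LAME3.": "MP3 audio (LAME encoder tag)",
    b"8BIM": "Photoshop image resource block",
    b"<dc:format>image/tiff</dc:format>": "XMP metadata from a TIFF image",
}


def find_markers(data: bytes) -> list[str]:
    """Return the labels of all known markers found in data, in table order."""
    return [label for marker, label in MARKERS.items() if marker in data]
```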

Re: Photorec recovers weird .doc files

Posted: 05 Dec 2012, 17:16
by stumpyuk
Text files do not have a signature. Essentially, PhotoRec checks whether the data it has found has the characteristics of text. Thus, it may detect a fragment of XML (which is text-based), for instance, and recover it as plain text. There is more of an explanation here:
http://www.cgsecurity.org/wiki/PhotoRec_Data_Carving
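
To illustrate the idea (this is not PhotoRec's actual code, just a simplified sketch of a "does it look like text" test), one can measure what share of the bytes are printable ASCII or common whitespace. The function name `looks_like_text` and the 0.95 threshold are assumptions chosen for the example.

```python
def looks_like_text(data: bytes, threshold: float = 0.95) -> bool:
    """Guess whether data is text from the share of printable ASCII bytes."""
    if not data:
        return False
    # Printable ASCII (space through tilde) plus tab, line feed, carriage return.
    printable = sum(1 for b in data if 32 <= b < 127 or b in (9, 10, 13))
    return printable / len(data) >= threshold
```

An XML fragment such as `<rdf:Description>` passes this test just as easily as a letter does, which is why a carver working this way saves it as text.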

Re: Photorec recovers weird .doc files

Posted: 08 Dec 2012, 17:13
by cgrenier
andy16h, run PhotoRec and, in FileOpts, disable the txt and tx? file families, then start a recovery. Please send me some of the tiff or jpg files that are now recovered but weren't previously; I will try to improve PhotoRec to deal with this problem.

Re: Photorec recovers weird .doc files

Posted: 05 Jan 2013, 21:25
by MikeHalloran
On that subject, it would be nice to have an option to recover ONLY .txt and NOT .xml, .c, .h and related files.

Okay, I know that XML tags might occasionally appear in a simple text file, and even lines of legitimate C code and such, but unprintable characters are never present in plain text, and there has to be a way to distinguish a file that's mostly plain text from a file that's mostly something else.
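
As a rough sketch of what "mostly plain text" could mean (a hypothetical post-processing heuristic, not an existing PhotoRec option), one could count how many non-empty lines look like markup or C-style code and only call a file plain text when neither dominates. The name `classify_text`, the two regular expressions, and the 0.5 threshold are all assumptions for illustration.

```python
import re

# A line "looks like XML" if it contains something shaped like a tag.
TAG = re.compile(r"</?[A-Za-z][\w:.-]*[^<>]*>")
# A line "looks like C" if it starts with a common preprocessor directive
# or ends with ';', '{' or '}'.
CODE = re.compile(r"^\s*(#include|#define)\b|[;{}]\s*$")


def classify_text(text: str, threshold: float = 0.5) -> str:
    """Label text as 'xml', 'c-like', 'plain' or 'empty' by line counts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return "empty"
    xml = sum(1 for line in lines if TAG.search(line))
    code = sum(1 for line in lines if CODE.search(line))
    if xml / len(lines) >= threshold:
        return "xml"
    if code / len(lines) >= threshold:
        return "c-like"
    return "plain"
```

Working on ratios rather than single matches means a note that happens to quote one tag or one line of code still comes out as plain text.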

Thanks for anything you can do.