Normally, we use file extensions to identify a file's type. Without an extension, the next step is to look at the file's first few bytes, or 'signature'.
I downloaded a few video files without extensions. I tried inserting all of the common extensions, but none of them would play. Is there a way to determine which format the files are in?
When it comes to video files, my gut answer is to say, "I don't know". Video file formats are a complex maze of twisty passages that are all alike.
Perhaps we can get a few clues - not only about your video files, but about other types of files.
Warning: This one gets geeky.
Many, though certainly not all, files begin with a series of fixed values that identify the type of file that they are.
A great example is the .EXE file. All .EXE files begin with two bytes: 4D, 5A. That's the hex value for the upper case letters MZ, the initials of the Microsoft engineer who defined the original file format. If the first two bytes of a file are 'MZ', then you're looking at a .EXE file (or a .DLL file and some other variants - all executable files are based on the same format).
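To make that concrete, here's a minimal Python sketch of the 'MZ' check. The function names are my own for illustration, and a match only means that the file starts like an executable, not that it's a valid one.

```python
def looks_like_exe(first_bytes: bytes) -> bool:
    """Return True if the data starts with the two-byte 'MZ' signature."""
    # b"MZ" is exactly the two bytes 0x4D, 0x5A.
    return first_bytes[:2] == b"MZ"


def file_looks_like_exe(path: str) -> bool:
    """Read only the first two bytes of the file and test them."""
    with open(path, "rb") as f:  # "rb": raw bytes, no text decoding
        return looks_like_exe(f.read(2))
```

You would call file_looks_like_exe() with the path to whatever file you're curious about.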
Many other file types have these so-called "signatures" as well.
Our goal will be to examine the first few bytes of a file and then use what we find there to see if we can determine the file format.
Unfortunately, the example of 'MZ' is too simple. Those happen to be printable characters, and yes, if you open a .EXE file in Notepad, you'll see 'MZ' at the beginning. Signatures in general, though, aren't necessarily printable characters that you can see.
That means that we need to look at the contents of the file in hexadecimal.
So we'll start by downloading a Hex Editor/Viewer - in this case, the freeware HxD.
Caution: HxD is a Hex editor, which means that you can also modify files. Be careful not to accidentally make changes unless you're absolutely positive that you know what you're doing - you can corrupt files, your system, and/or your hard drive by modifying the wrong things. Fortunately, HxD makes it obvious that you're changing things by displaying changes in red and includes proper confirmations and backup files by default.
Let's say that we're looking at a file called "example.foo". I'll open it up in HxD:
Here, we can see that the file begins with the hexadecimal byte values 3F, 5F, 03, 00, 00 and so on. The first two also happen to be the values for the question mark character and the underscore character. At this point, we don't know if that's intentional, but it doesn't matter. What we care about are the values in hexadecimal.
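If you'd rather not install a hex editor, a few lines of Python can show the same leading bytes without any risk of changing the file. This is only a sketch: the function names and the 16-byte default are my own choices, not anything HxD does.

```python
def hex_preview(data: bytes) -> str:
    """Format bytes as two-digit hex values plus a printable-text column."""
    hex_part = " ".join(f"{b:02X}" for b in data)
    # Show printable ASCII as-is and everything else as a dot.
    text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in data)
    return f"{hex_part}  {text_part}"


def preview_file(path: str, count: int = 16) -> str:
    """Read the first `count` bytes of a file (read-only) and format them."""
    with open(path, "rb") as f:
        return hex_preview(f.read(count))


# The five bytes from the example above:
print(hex_preview(bytes([0x3F, 0x5F, 0x03, 0x00, 0x00])))
# prints: 3F 5F 03 00 00  ?_...
```

Because the file is opened in read-only mode, there's no way to modify it by accident.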
Naturally, there is no definitive list of which files have which signatures. However, there are a couple of collections. I was able to find this: File Signatures Table. It appears to be relatively complete and recently updated.
We simply scan down the table to look for an entry that begins with the first byte - 3F:
In fact, there's only one. As you can see, files that begin with the characters 3F, 5F, 03, 00 are typically files associated with the Windows Help utility.
I can confirm that because I was on that team for a while. The fact that 3F, 5F represents a question mark and underscore (?_) is not a coincidence - those values were chosen as a signature because of their printable appearance.†
The file that I used for my example was C:\Windows\System32\speech.hlp, which is indeed an old-style Windows Help (WinHelp) file.
As if the search and display weren't geeky enough, I also have to caution you to take care when scanning the table of signatures for possible matches. In short, make sure that every byte of a table entry matches the bytes at the start of your file, in order; if more than one entry matches, choose the longest one.
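The matching rule can be sketched in Python like this. The tiny table holds only the three signatures mentioned in this article; a real table would be far larger, and the names SIGNATURES and identify are my own.

```python
# Only the signatures mentioned in this article; nowhere near complete.
SIGNATURES = {
    bytes([0x3F, 0x5F, 0x03, 0x00]): "Windows Help file",
    bytes([0x4D, 0x5A]): "Windows executable (.EXE, .DLL and variants)",
    bytes([0x4C, 0x4E]): "Microsoft Advisor help file",
}


def identify(first_bytes: bytes, table=SIGNATURES):
    """Return the description of the longest matching signature, or None."""
    # An entry matches only if every one of its bytes appears, in order,
    # at the very start of the data.
    matches = [sig for sig in table if first_bytes.startswith(sig)]
    if not matches:
        return None  # the format may simply not be in the table
    return table[max(matches, key=len)]


print(identify(bytes([0x3F, 0x5F, 0x03, 0x00, 0x00])))
# prints: Windows Help file

# Hypothetical overlapping entries, only to show the longest-match rule.
demo = {b"AB": "short entry", b"ABCD": "long entry"}
print(identify(b"ABCDEF", demo))
# prints: long entry
```

A result of None doesn't mean that the file is junk; it only means that this particular table has no entry for it.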
The file format that you need may not be there. As I said, I've not found an exhaustive list.
The file format that you need might be ambiguous. Several of the signatures list more than one possible application. Perhaps, with whatever additional knowledge you have about where the file came from, you can distinguish among the possibilities.
Knowing the file format might not be enough. AVI files are a great example; they're container files that can contain audio and video in many different encodings.
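As an illustration of the container problem, here's a hedged sketch that goes slightly beyond what I covered above. It relies on the RIFF layout that AVI files use: the four letters 'RIFF', then a four-byte size, then a four-letter form type such as 'AVI ' (with a trailing space). The function name is my own.

```python
def riff_form_type(header: bytes):
    """Return the 4-character RIFF form type (e.g. 'AVI '), or None."""
    # Bytes 0-3 are 'RIFF', bytes 4-7 are a size, bytes 8-11 name the form.
    if len(header) >= 12 and header[:4] == b"RIFF":
        return header[8:12].decode("ascii", errors="replace")
    return None


# A made-up 12-byte header, just for demonstration.
sample = b"RIFF" + bytes(4) + b"AVI "
print(riff_form_type(sample) == "AVI ")
# prints: True
```

Even when this reports 'AVI ', it tells you only that the container is AVI. The audio and video encodings are recorded deeper inside the file, so a player may still refuse to play it.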
But hopefully armed with this information about file signatures, you can gather some additional clues as to what kind of file you might have.
† I also worked on a different help system prior to Windows Help where I defined the signature for the help file format. If a file begins with 4C, 4E, which represent the letters 'LN', you're probably looking at a Microsoft Advisor character mode help file.