Normally, we use file extensions to identify a file's type. Without that, the next step is to look at the file's first few bytes or 'signature'.

I downloaded a few video files without extensions. I tried inserting all of the common extensions, but none of them would play. Is there a way to determine which format the files are in?

When it comes to video files, my gut answer is to say, "I don't know". Video file formats are a complex maze of twisty passages that are all alike.

Perhaps we can get a few clues - not only about your video files, but about other types of files.

Warning: This one gets geeky. :-)

File signatures

Many, though certainly not all, files begin with a series of fixed values that identify the type of file that they are.

A great example is the .EXE file. All .EXE files begin with two bytes: 4D, 5A. Those are the hex values for the uppercase letters MZ, the initials of Mark Zbikowski, the Microsoft engineer who defined the original file format. If the first two bytes of a file are 'MZ', then you're looking at a .EXE file (or a .DLL file and some other variants - all Windows executable files are based on the same format).

Similarly, many other file types have these so-called "signatures" as well.

Our goal will be to examine the first few bytes of a file and then use what we find there to see if we can determine the file format.

Examining the file

Unfortunately, the example of 'MZ' is too simple. Those happen to be printable characters, and yes, if you open a .EXE file in Notepad, you'll see 'MZ' at the beginning. Signatures in general, though, aren't necessarily printable characters that you can see.

That means that we need to look at the contents of the file in hexadecimal.

So we'll start by downloading a Hex Editor/Viewer - in this case, the freeware HxD.

Caution: HxD is a Hex editor, which means that you can also modify files. Be careful not to accidentally make changes unless you're absolutely positive that you know what you're doing - you can corrupt files, your system, and/or your hard drive by modifying the wrong things. Fortunately, HxD makes it obvious that you're changing things by displaying changes in red and includes proper confirmations and backup files by default.

Let's say that we're looking at a file called "example.foo". I'll open it up in HxD:

HxD Hex Editor open on our example file, example.foo

Here, we can see that the file begins with the hex character values 3F, 5F, 03, 00, 00 and so on. The first two also happen to be values for the question mark character and the underscore character. At this point, we don't know if that's intentional, but it doesn't matter. What we care about are the values in hexadecimal.
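If you'd rather not install a hex editor, a few lines of Python can show the same leading bytes. This is only a minimal sketch, not a description of what HxD does internally; the function names read_header and as_hex are made up for illustration.

```python
def read_header(path, count=16):
    """Return up to the first `count` bytes of the file at `path`.

    The file is opened in binary mode so that no text decoding or
    newline translation alters the raw byte values.
    """
    with open(path, "rb") as f:
        return f.read(count)


def as_hex(data):
    """Format bytes as space-separated, two-digit uppercase hex values."""
    return " ".join(f"{b:02X}" for b in data)
```

Calling as_hex(read_header("example.foo")) on the example file above should give a string that starts with 3F 5F 03 00, the same values shown in the hex editor.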

Finding the signature

Naturally, there is no definitive list of what files have what signatures. However, there are a couple of collections. I was able to find this: File Signatures Table. It appears to be relatively complete and recently updated.

We simply scan down the table to look for an entry that begins with the first byte - 3F:

3F 5F file signature table entry

In fact, there's only one. As you can see, files that begin with the bytes 3F, 5F, 03, 00 are typically files associated with the Windows Help utility.

I can confirm that because I was on that team for a while.† The fact that 3F, 5F represents a question mark and underscore (?_) is not a coincidence - those values were chosen as a signature because of their printable appearance.

The file that I used for my example was C:\Windows\System32\speech.hlp, which is indeed an old-style Windows Help (WinHelp) help file:

Our example file, renamed back to speech.hlp, and opened in the WinHelp viewer.

Finding signatures

As if the search and display weren't geeky enough, I also have to caution you to take care when scanning the table of signatures for possible matches. In short, make sure that what you have matches what you see; if there is more than one possibility, choose the longest candidate that matches.
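The "choose the longest candidate" rule can be sketched in code. The table below is a tiny illustrative subset that I'm assuming for the example (it is not the File Signatures Table mentioned above), and the names SIGNATURES and identify are hypothetical. The two JPEG entries are there only to show a shorter and a longer signature matching the same file.

```python
# A deliberately tiny, illustrative signature table (an assumption for this
# sketch, not a complete or authoritative list).
SIGNATURES = {
    bytes.fromhex("4D5A"): "DOS/Windows executable (MZ)",
    bytes.fromhex("3F5F0300"): "Windows Help file",
    bytes.fromhex("FFD8FF"): "JPEG image",
    bytes.fromhex("FFD8FFE0"): "JPEG image (JFIF)",
}


def identify(header):
    """Return the description for the longest signature that `header`
    starts with, or None if nothing in the table matches."""
    matches = [sig for sig in SIGNATURES if header.startswith(sig)]
    if not matches:
        return None
    # Several entries may match; prefer the most specific (longest) one.
    return SIGNATURES[max(matches, key=len)]
```

A header beginning FF D8 FF E0 matches both JPEG entries, and the longer, more specific one wins. A header that matches nothing returns None, which is the "may not be there" case described next.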

The file format that you need may not be there. As I said, I've not found an exhaustive list.

The file format that you need might be ambiguous. Several of the signatures list more than one possible application. Perhaps with whatever additional knowledge that you may have of where the file came from, you can distinguish among the possibilities.

Knowing the file format might not be enough. AVI files are a great example; they're container files that can contain audio and video in many different encodings.
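To make the container point concrete: an AVI file is a RIFF container, so it starts with the four bytes 'RIFF', then a four-byte size, then a four-character form type such as 'AVI ' (with a trailing space) or 'WAVE'. The sketch below reads that form type; the function name riff_form_type is my own invention for this example.

```python
def riff_form_type(header):
    """Return the four-character RIFF form type (for example 'AVI ' or
    'WAVE') found in bytes 8-11, or None if `header` is not RIFF data.

    `header` must hold at least the first 12 bytes of the file.
    """
    if len(header) < 12 or header[:4] != b"RIFF":
        return None
    # Bytes 4-7 hold the chunk size; the form type follows it.
    return header[8:12].decode("ascii", errors="replace")
```

Notice what this can't tell you: it identifies the container, but says nothing about which audio and video encodings are inside it. That's exactly why knowing the file format might not be enough to play the file.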

But hopefully, armed with this information about file signatures, you can gather some additional clues as to what kind of file you might have.

† I also worked on a different help system prior to Windows Help where I defined the signature for the help file format. If a file begins with 4C, 4E, which represent the letters 'LN', you're probably looking at a Microsoft Advisor character mode help file.

Article C4855 - June 24, 2011

Leo A. Notenboom has been playing with computers since he was required to take a programming class in 1976. An 18-year career as a programmer at Microsoft soon followed. After "retiring" in 2001, Leo started Ask Leo! in 2003 as a place for answers to common computer and technical questions.

17 Comments
anonymous
June 27, 2011 7:35 PM

Nice info. Now I know what the two bytes at the beginning of QBASIC.HLP stands for :-) The contents of the file seems to be compressed, as it is garbled and contains a word list of some sort. I wonder whether it is compressed using one of the compression algorithms you got patents for.

It is indeed. This patent and a subsequent one cover the compression algorithms developed for that help system. The word list you see is a list of frequently occurring words that are replaced by shorter tokens in one phase of the compression.
Leo
29-Jun-2011

Mark J
June 27, 2011 11:13 PM

In the case of a video file without an extension, I've found that adding the .avi extension works most of the time, even though it may not be the correct one. I use VLC media player, and in most cases VLC reads the signature and opens it correctly.

Coly Moore
June 28, 2011 8:46 AM

It's been ages since I saw "maze of twisty passages that are all alike". Nostalgia time..

Plugh :-)
Leo
29-Jun-2011

Venus
June 28, 2011 9:09 AM

This is extraordinarily helpful, thank you!

I wonder, what's your opinion of the file characterization (exiftool-based) that is used at virustotal.com or the characterizations at threatsense.com? If you have ever submitted a file or link at a file/link evaluation engine, I'd love to know which ones you find useful.

Look forward to your newsletters, your writing is simple to understand on complex topics, and you cover the basics that most people skip. Thanks!
Venus V.

HmS-PA
June 28, 2011 9:49 AM

Why not set the Windows Explorer (not IExplorer - web browser) to show extensions? Many viruses may come looking like Foobar.txt and actually be Foobar.txt.exe, making it an executable that you get zinged with. It doesn't click that the .txt is showing and no other extensions show. Setting Explorer to show those could save a lot of grief.

The questioner actually had a file with no extension at all. But yes, Windows Explorer should be set to always show extensions for safety's sake as you describe.
Leo
29-Jun-2011

Steve K
June 28, 2011 10:54 AM

I'm sure that the OP is dealing with a file that was saved with no extension. Otherwise, just double-clicking it (even if extensions are hidden) would invoke the correct video player to open it.

Mike
June 28, 2011 11:18 AM

I've gotten some video files in which the extension was misattributed (i.e., mpg for a wmv file). WMP will advise that it's misnamed, but offer to try playing it anyway, which it does most of the time. As others have noted, VLC tends to play any video file that isn't actually corrupted (and even then, sometimes). As a purist (who leaves extensions visible), I still like to appropriate the correct extension, regardless. Obviously, all the info in Properties hasn't been forthcoming, so I'm left with trying every usual extension until WMP no longer gives me the warning. Not exactly scientific, and not even assuredly correct, but it's the best I have. At least I'm comforted that it's not due to my singular lack of knowledge.

Gigi
June 28, 2011 12:40 PM

The easiest solution is to open the file with http://mediainfo.sourceforge.net/en find the correct format and rename the file.

Tomas
June 28, 2011 1:10 PM

There is a free tool which will analyze the video file and present what types of video and audio codecs are used in the analyzed file. You'll find it here:
http://www.headbands.com/gspot/

Greg Jackson
June 28, 2011 1:29 PM

Egads- Personally, I wouldn't want to deal with (or touch) a file w/o an extension - unless it was life or death. Is it? If it was.... know the source, scan with AV. Then, use VLC or MediaPlayer - if they can't play it, I would dump it. For reasons of security and state of mind, the end results may never be justified (in my previous experience, they never have been). I'm just saying....

sharon
June 28, 2011 3:06 PM

CCCP-Insurgent-2007-01-01.exe (downloadable from www.cccp-project.net) is also useful for getting information out of a video file. Tools > Media Information.

Bob
June 28, 2011 4:06 PM

Regarding: "... complex maze of twisty passages, all alike." I imagine most people won't catch this, but it immediately took me back to the original Adventure game and the moment in the 70's when I finally understood what was going on in the maze of twisty passages, all alike. Thanks for including the (implicit) reference.

I figured a couple of people might recognize it. For those that don't it's a reference to Colossal Cave Adventure which I played years ago on a CDC mainframe in school, and then later on my first home computer, an Apple ][.
Leo
29-Jun-2011

Will
June 29, 2011 8:10 AM

In a parallel case, I sometimes download pictures that arrive with no file type attached. I add .jpg. Even if it isn't a jpg anything opening JPGs will probably open whatever it is.

Cezar
June 30, 2011 4:50 AM

Sounds complicated with all that programming language. I thought changing the file settings in control panel would solve the problem, but this is out of my league.

G. Adams
July 3, 2011 12:11 PM

For video files, download the avidemux program, which is freeware that is useful and powerful, but complicated. When you load a video file into the program, you can click in the "i" button to get details on what type of file it is.
KMPlayer, also free, will do the same when you right click on the screen and ask for the file information, video or audio. It will also tell you if an mp3 is DRM protected. Good media player, too.

BreezeRichard(TERRY)
July 3, 2011 4:06 PM

My mentor( Local) as opposed To ASK-Leo,
Sent me a file I can't open So I downloaded these pages to try out its ideas!
Will advise success (if it works)
Thank you for addressing yet another help thingey
Terry B

Robin Clay
July 5, 2011 12:18 PM

So... having read the comments, I thought VLC might be a useful program to have, so I tried to fetch it from
http://www.vlcmediaplayer.org/download.html
whither Google had sent me.

But McAfee threw out:-
Adware-HotBar.d
as a "potentially unwanted program"
so I aborted the download.

Just f.y.i.

Comments on this entry are closed.
