Domain names are simple in concept and yet can be constructed in ways that might fool you. I'll look at some examples, and discuss what's important.
Security when clicking onto a web site confounds me. Some sites put the section of the site you are wanting ahead of the web address, for example http://photos.kodak.com, and some put the section after, for example http://kodak.com/photos. These examples are just made up but I hope you understand what I'm saying. How do I know if I'm on the secure website I'm supposed to be on? At times I see other addresses flashing by on the toolbar that are not the site I clicked on before the actual site appears. I've never seen anyone bring up this question.
This simple little question opens up a veritable Pandora's box when it comes to URLs, and understanding what is and is not safe to click on.
The concepts are actually very simple, but the complexity in how those concepts can be combined is staggering. Particularly if someone is attempting to deceive you.
I'll try to make some sense of it all.
"URL" is short for Uniform Resource Locator. The most common one we know of is the web address - something like "http://ask-leo.com/how_do_i_know_that_this_web_address_is_safe.html".
There are three primary components to a URL; let's start by looking at what those are. We'll use this URL as our example for discussion:
http://www.somerandomservice.com/folder/page.html?parameter1=value1&parameter2=value2
http://www.somerandomservice.com - Server. This identifies the protocol (http - the language of web pages) and the server to contact. www.somerandomservice.com identifies a specific server on the internet from which what follows will be requested.
folder/page.html - Page. The page specifies exactly what it is you are requesting from the server. Typically it's a web page - perhaps within a folder on that server, but it might be a program to run on the server or a file to be downloaded.
parameter1=value1&parameter2=value2 - Parameters. Parameters are information that is being supplied to the page. Since "pages" can often be small (or large) computer programs, information from the parameters part of a URL can be given to those programs as items for them to act on.
URL Safety Rule #1: The Server specification ends at the first "/" that occurs after the "http://" start of the URL, and the Page specification ends at the first question mark after that. This rule is important to understanding whether a URL is valid, bogus or misleading.
I'll restate the first part of that rule to focus on what we care about:
The server being contacted begins after the "http://" and ends at the next "/".
Or, in our example URL, this part: "www.somerandomservice.com".
That's the part that matters, because that's the part that tells your browser what server to connect to. Everything else is secondary. Important, yes, but not nearly as important.
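For the technically inclined, here's a minimal sketch of that rule using Python's standard urllib.parse module. The function name split_url is my own invention for illustration, and the example URL is assumed to be the three components described above joined together.

```python
from urllib.parse import urlsplit, parse_qsl

def split_url(url):
    """Return the (server, page, parameters) parts of a URL."""
    parts = urlsplit(url)
    server = parts.hostname                    # host name found between "//" and the next "/"
    page = parts.path.lstrip("/")              # everything up to the first "?"
    parameters = dict(parse_qsl(parts.query))  # name=value pairs after the "?"
    return server, page, parameters

# Assumed example URL, assembled from the three components discussed above.
example = ("http://www.somerandomservice.com/folder/page.html"
           "?parameter1=value1&parameter2=value2")

server, page, parameters = split_url(example)
print("Server:    ", server)
print("Page:      ", page)
print("Parameters:", parameters)
```

The first value returned is the one to look at first: it's the server your browser will actually contact.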
Let's look at one of the ways that phishing attempts often try to fool you. Check out this URL:
http://www.somerandomservice.com/www.paypal.com
It might be tempting to look at that quickly and say "oh, that ends in paypal.com, therefore it's Paypal!"
No, it's not. Look again: that URL actually loads a page called "www.paypal.com" (a valid page name) from the server www.somerandomservice.com.
Now, my example is probably pretty lame, as "www.somerandomservice.com" is big and obvious at the front of that URL. But scammers use all sorts of variations on this theme to make it look like you're going to some place trusted, when you're not if you don't look closely.
For this we need to pick apart the way server names are created and used.
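Server names are built from right to left, and the individual components are separated by periods. Consider "www.somerandomservice.com": ".com" is the top-level domain, "somerandomservice.com" is the domain its owner registered, and "www" is a subdomain created by that owner.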
In general, fully qualified domain names like "www.somerandomservice.com" identify a server on the internet. "photos.somerandomserver.com" would typically be a different server, though it doesn't have to be.
The choice between using something like "photos.somerandomserver.com" versus "somerandomserver.com/photos" is purely one of site design and has no security implications. That's just how the person building the website chose to do it. There are geeky pros and cons to each, but for you as a typical web user it doesn't really matter.
What does matter is how subdomains can be abused. For example, it's perfectly possible for this to be a valid domain:
http://www.paypal.com.somerandomservice.com
Once again, with only a quick glance, you might think it was actually paypal.com since it started with "http://www.paypal.com".
In that example "www.paypal.com." is just a subdomain created by the owner of "somerandomservice.com".
Here's a worse example:
Once again, it's designed to look like paypal.com, but in fact it's not - and it's especially convincing if your browser shows you only the first part of the URL in its status bar because the URL is so long.
And once again, scammers often use many different variations on this technique to trick you.
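Here's a rough sketch, in Python, of the right-to-left check described above. The function name belongs_to is hypothetical, and the test URLs are made up in the same spirit as my other examples. It only compares names; it makes no attempt to decide who really owns a domain.

```python
from urllib.parse import urlsplit

def belongs_to(url, trusted_domain):
    """True if the URL's server is trusted_domain or one of its subdomains."""
    # Take only the server part; the page and parameters are ignored.
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    trusted = trusted_domain.lower()
    # Match from the right: the exact name, or anything ending in ".<name>".
    return host == trusted or host.endswith("." + trusted)

print(belongs_to("http://www.paypal.com/", "paypal.com"))
print(belongs_to("http://www.somerandomservice.com/www.paypal.com", "paypal.com"))
print(belongs_to("http://www.paypal.com.somerandomservice.com/", "paypal.com"))
```

Note the "." added in front of the trusted name before comparing. Without it, a lookalike such as "notpaypal.com" would slip through.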
This was brought up by a comment on this article (thanks Ken!), and is important enough to warrant an update.
Characters in URLs can be "encoded" with a special representation that acts the same as the character it encodes. The format is a percent sign followed by a two-digit hexadecimal number (individual digits will be 0-9 or A-F). A space character, for example, is %20, and you'll actually see that in legitimate URLs from time to time since an actual space character cannot be used.
%2F is the slash character "/".
So this rule:
The server being contacted begins after the "http://" and ends at the next "/".
Still applies, but %2F could be seen in place of "/". More correctly:
The server being contacted begins after the "http:" and the two slashes that follow it (each of which may appear as "/" or "%2F"), and ends at the next "/" or "%2F".
It gets ugly, but the thing to remember is just this: %2F is exactly the same as "/".
Here's an example of how it might be abused:
That is NOT Paypal. Replace the %2F with "/" and you'll see instead:
Clearly it goes to somerandomservice.com.
As Ken points out in his comment, any URL with a % notation in the server portion is suspect. % notation after the server portion (in the page or more commonly the parameters) is typically OK.
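As a small illustration, here's a Python sketch that decodes a couple of % sequences and flags any URL with a "%" in its server portion. The function name server_part_has_percent and the two test URLs are hypothetical, and the sketch assumes the URL contains a literal "://".

```python
import re
from urllib.parse import unquote

def server_part_has_percent(url):
    """True if a "%" appears in the server portion of the URL."""
    # Assumes a literal "://"; without one there is no server part to check.
    _, _, rest = url.partition("://")
    # The server portion ends at the first "/", "?" or "#".
    server_part = re.split(r"[/?#]", rest, maxsplit=1)[0]
    return "%" in server_part

print(unquote("%2F") == "/")   # %2F decodes to the slash character
print(unquote("%20") == " ")   # %20 decodes to a space

# Hypothetical URLs: the first hides a slash in the server portion.
print(server_part_has_percent("http://www.paypal.com%2Flogin.somerandomservice.com/"))
print(server_part_has_percent("http://www.somerandomservice.com/my%20page.html"))
```

This is only a warning sign, not proof. It tells you to look much more closely before you click.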
All of the above is unrelated to what we normally think of as a "secure" website: namely the use of https (note the "s") as the protocol. Https does two important things:
It encrypts the data flowing between your computer and the server.
It validates that the server you connect to is, in fact, the server you requested.
Note that https doesn't validate you're connecting to the server you think you are, it validates that you're connecting to the server you requested. Those are two different things.
For example, let's say you fall for one of my lame examples above and click on a link like this:
https://www.paypal.com.somerandomservice.com
That's an https connection. It is very possible - not even all that hard actually - for the owner of somerandomservice.com to purchase and install a completely valid https certificate for www.paypal.com.somerandomservice.com.
Thus when you click on that link your browser will confirm that you are indeed connecting to what you asked for: www.paypal.com.somerandomservice.com. That might not be what you think you asked for, if you fell for a scammer's trick, but that's all that https can validate for you: you got what you asked for.
It's unfortunate that something that's fairly simple is actually quite complex once you assume that people will attempt to deceive you.
I'll sum it up with this:
Pay close attention to the domain name (everything between "http://" and the next "/") in any URL you are about to click on. Remember that domain names build from the right, so if the domain name ends in, for example, ".paypal.com", you can be assured that it's a domain or subdomain owned by paypal.com.