
(2010-10-15) Can I Haz Cookie?

The world entered a new epoch in the early 90s, when Tim Berners-Lee invented what would later become the World Wide Web as we now know it. From its humble beginnings it has grown to become the ubiquitous platform for almost everything you do on the Internet.

But there was no way anyone could have guessed how people would actually use the new invention, and thus it entered this world ill-prepared for its coming task. The standard that defined how a web page should be displayed was a markup language, meaning it only defined things like text styles and paragraphs. It had no conditional logic, so a web page could not do something like changing its background to black if the page was loaded at night, unless the web server itself did this before serving the page to the user.

But there was a more fundamental limitation to the World Wide Web. The HTTP protocol, which was and still is used to let the user request and receive web pages and objects like pictures, was stateless. In layman’s terms: it had amnesia. The server had no way of remembering that it was you who downloaded a page five seconds ago. Therefore there was no way of saving preferences or customizing the data for the user. Your IP address may change over time, so it could not be used as identification either. Please note that all this was not an oversight. It was what the creators intended: a way of retrieving and reading texts from all over the world. But as time passed it became clear that the web really needed to keep track of state.

Over the years that followed, many technologies emerged that bridged the gap between the capabilities of the protocols and standards and what people wanted to use the web for. The conditional logic problem was solved by technologies such as JavaScript, Java and ActiveX. The security problems those technologies brought with them are outside the scope of this discussion, but you probably already know something about them.

The state problem was solved primarily by something called cookies. A cookie is a token that your web browser can store and use to identify your session or store any type of information. At this point I must point out that when you surf the web, most of the work is done by your web browser on the “client side”. The web server itself is responsible for receiving the requests and delivering the content to the client. This, I admit, is a huge oversimplification. But the most common reason for cookies is to make sure you are who you claim to be. This is called authentication and is hereafter referred to as cookie authentication.

Rule: When I claim to be a user, I identify myself. When the server challenges me to prove that I am who I say that I am, I go through the process of authentication. When the server accepts my proof of identity (e.g. that I know the password) I am considered to be authenticated. When I ask for access and permission to use a resource in a certain manner, the process that checks that I am allowed to do so is called authorization.
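
The three steps in the rule can be sketched in a few lines of Python. This is only an illustration: the user table, the permission table and the function names are made up for the example, and a single shared salt is used to keep it short (a real application would use one salt per user).

```python
import hashlib
import hmac
import os

def hash_password(password, salt):
    # Slow, salted hash so stored passwords are hard to recover.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

# Hypothetical user database and permission table (illustration only).
SALT = os.urandom(16)
USERS = {"alice": hash_password("correct horse", SALT)}
PERMISSIONS = {"alice": {"read_mail"}}

def authenticate(user, password):
    """The user name is the claimed identity; the password is the proof."""
    stored = USERS.get(user)
    if stored is None:
        return False
    # Constant-time comparison of the stored and the offered hash.
    return hmac.compare_digest(stored, hash_password(password, SALT))

def authorize(user, action):
    """Check whether an already authenticated user may perform the action."""
    return action in PERMISSIONS.get(user, set())

print(authenticate("alice", "correct horse"))  # proof accepted
print(authenticate("alice", "wrong guess"))    # proof rejected
print(authorize("alice", "read_mail"))         # allowed
print(authorize("alice", "delete_users"))      # not allowed
```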

The general flow of cookie authentication is that cookies are set by the server and then stored by the client’s web browser. Inside each cookie is a randomly assigned session number that corresponds to the session information that the server keeps. The client then presents the cookies to the server as you request pages and objects from the server. The server recognizes the number as a valid session and thus knows what the user should be able to see. Please note that the cookie here is used to authenticate the user.

The session uses the cookie you got like a badge at a convention: it shows that you at some time were successfully authenticated.
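
A minimal sketch of this flow, using only Python's standard library, could look like the following. The in-memory session table, the cookie name "session" and the function names are assumptions made for the example; a real web framework handles these details for you.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory session table: session id -> user name.
SESSIONS = {}

def log_in(user):
    """Create a session after a successful login; return the cookie to set."""
    session_id = secrets.token_urlsafe(32)  # randomly assigned, hard to guess
    SESSIONS[session_id] = user
    cookie = SimpleCookie()
    cookie["session"] = session_id
    return cookie["session"].OutputString()

def who_is(cookie_header):
    """Look up the user for the cookie the browser presented, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session")
    if morsel is None:
        return None
    return SESSIONS.get(morsel.value)

badge = log_in("alice")
print(who_is(badge))              # the browser presents the cookie it stored
print(who_is("session=made-up"))  # unknown session id, so no user
```

Note that the cookie itself carries no user name or password, only the number that the server can match against its own records.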

This is the backbone of most authentication on the web. There is of course the possibility to authenticate on every request, which is how many Microsoft IIS web applications do it. This can create a huge overhead and slow the process down.

For the process to be secure the cookie must be sent over an encrypted connection and the session must time out. A session must only be good for a limited time. The server generally keeps track of this as no web server must ever rely on the client to handle the authentication and authorization process.
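
As a rough sketch of those two requirements, the server can mark the cookie as Secure (so the browser only sends it over an encrypted connection) and keep the expiry time in its own session table. The 15 minute lifetime, the table layout and the function names are arbitrary choices for the example.

```python
import secrets
import time
from http.cookies import SimpleCookie

SESSION_LIFETIME = 15 * 60  # seconds; an arbitrary value for illustration

# Hypothetical session table: session id -> (user, expiry timestamp).
SESSIONS = {}

def new_session_cookie(user, now=None):
    """Register a session and return (session id, cookie string)."""
    now = time.time() if now is None else now
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = (user, now + SESSION_LIFETIME)
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["secure"] = True  # only sent over HTTPS
    return session_id, cookie["session"].OutputString()

def user_for(session_id, now=None):
    """Return the user for a live session; expired sessions are forgotten."""
    now = time.time() if now is None else now
    entry = SESSIONS.get(session_id)
    if entry is None:
        return None
    user, expires_at = entry
    if now >= expires_at:
        del SESSIONS[session_id]  # the server, not the client, enforces this
        return None
    return user

sid, header = new_session_cookie("alice", now=1000.0)
print(user_for(sid, now=1060.0))                     # still valid
print(user_for(sid, now=1000.0 + SESSION_LIFETIME))  # timed out
```

The expiry check lives entirely on the server, so a client that keeps an old cookie around gains nothing from it.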

What gets stored in the cookie is up to the web application/site. The most common thing is a session number that the server uses to keep track of you. Without this mechanism cookie authentication will not work, so it’s commonly used by webmail and everywhere else that you must log in.

The problem is that cookies may have other, more debatable uses such as the “third party cookies” used by advertisement companies. Those companies generally have a large number of sites that show their advertisements; they act as a broker between the companies that buy ad space and the web sites showing the ads. This means that throughout the day you will most likely see ads on a number of sites that use the same ad company. The company stores cookies that identify your session as you go from site to site. So the ad company will know that your ID has, for example, surfed between a car manufacturer, an online gaming site and the site of a local brewery. Those patterns are valuable for the advertisers. A more common name for those cookies is “tracking cookies”. This is old news and those cookies are generally, albeit often reluctantly, accepted, so let’s leave this for now and get into another security problem.
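
To make the tracking idea concrete, here is a toy model of such an ad broker. The class, the site names and the way the ID is handed out are all invented for the illustration; real ad networks are far more elaborate.

```python
import secrets
from collections import defaultdict

class AdBroker:
    """Hypothetical ad company that serves ads on many unrelated sites."""

    def __init__(self):
        self.history = defaultdict(list)  # tracking id -> sites visited

    def serve_ad(self, site, tracking_cookie=None):
        # On first contact the browser has no cookie, so hand out a new id.
        tracking_id = tracking_cookie or secrets.token_hex(8)
        self.history[tracking_id].append(site)
        return tracking_id  # the browser stores this as a third party cookie

broker = AdBroker()
cookie = broker.serve_ad("cars.example")
cookie = broker.serve_ad("games.example", cookie)
cookie = broker.serve_ad("brewery.example", cookie)
print(broker.history[cookie])
```

The three sites never talk to each other, yet the broker ends up with the full browsing pattern because the same cookie comes back each time one of its ads is loaded.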

Cookies add a number of security threats, and the ways to exploit them change as web browsers get better security but also as they get new features. The latter often increases the risk. When tabbed browsing became popular, people often logged into different sites in different tabs. This made it possible for malicious sites to access the sites in the other tabs, posing as the logged-in user. This was often a “shot in the dark”, but could work quite well with a bit of guesswork. Web browsers now impose restrictions that hinder those things.

Another risk is vulnerabilities that allow malicious sites to access the cookies directly. This is normally not possible, but every now and then vulnerabilities are found in different vendors’ browsers that make this possible. A well-behaved web application should only save a randomly generated session key in the cookie, but cookies containing the user’s name and password have not been unheard of. And there have even been attacks against session cookies due to flawed randomization in some web applications. This meant you could send a large number of random session numbers to a server until you matched a live session and then take it over.
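
The session guessing attack can be illustrated with a deliberately flawed generator. In this sketch the weak IDs are four-digit numbers and the attacker simply tries them in order rather than at random; both choices, like the function names, are assumptions made to keep the example small.

```python
import random
import secrets

def weak_session_id(rng):
    # Flawed on purpose: only 10,000 possible values.
    return f"{rng.randrange(10_000):04d}"

def strong_session_id():
    # 32 random bytes from the operating system's secure source.
    return secrets.token_urlsafe(32)

def guesses_until_hit(live_sessions):
    """Count the guesses needed to hit a live four-digit id, or None."""
    for attempt in range(10_000):
        if f"{attempt:04d}" in live_sessions:
            return attempt + 1
    return None

rng = random.Random(2010)  # fixed seed so the run is repeatable
weak_live = {weak_session_id(rng) for _ in range(50)}
strong_live = {strong_session_id() for _ in range(50)}

print(guesses_until_hit(weak_live))    # a small number of requests is enough
print(guesses_until_hit(strong_live))  # the same sweep finds nothing
```

With a large, unpredictable ID space the attacker cannot realistically stumble on a live session, which is why the session key must come from a proper random source.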

The combination of JavaScript and cookie-based authentication can be particularly nasty, as it may allow an attacker to grab authentication data and send it to a hacker’s site. Those attacks are known as “Cross Site Scripting” (XSS).

So what can you do about this? Common sense applies. I generally use one browser for work and another one to browse the web. I am also a fan of the NoScript plug-in for Firefox. It disables most unsafe scripting unless I choose to add the site I’m accessing to a trust list. Apply the latest patches and think about where you “go”. Run your browser unprivileged, or even better: log in as a non-privileged user. And don’t forget that software such as Adobe Acrobat Reader automatically installs plug-ins into all browsers it can find when you install it. A vulnerability in Acrobat Reader can then open a way for a hacker to attack your client through your web browser. This applies to other applications such as Skype, because they add themselves into the browser.

As a system administrator or developer, I suggest reading up on those issues. OWASP has a very good guide that lists the most common attacks and how to avoid them.

Posted: 2010-10-15 by Erik Zalitis
Changed: 2013-03-19 by Erik Zalitis
