[language-agnostic] What are sessions? How do they work?

I am just beginning to start learning web application development, using python. I am coming across the terms 'cookies' and 'sessions'. I understand cookies in that they store some info in a key value pair on the browser. But I have a little confusion regarding sessions, in a session too we store data in a cookie on the user's browser.

For example - I login using username='rasmus' and password='default'. In such a case the data will be posted to the server which is supposed to check and log me in if authenticated. However during the entire process the server also generates a session ID which will be stored in a cookie on my browser. Now the server also stores this session ID in its file system or datastore.

But based on just the session ID, how would it be able to know my username during my subsequent traversal through the site? Does it store the data on the server as a dict where the key would be a session ID and details like username, email etc. be the values?

I am getting quite confused here. Need help.

This question is related to language-agnostic session

The answer is


Think of HTTP as a person(A) who has SHORT TERM MEMORY LOSS and forgets every person as soon as that person goes out of sight.

Now, to remember different persons, A takes a photo of that person and keeps it. Each Person's pic has an ID number. When that person comes again in sight, that person tells it's ID number to A and A finds their picture by ID number. And voila !!, A knows who is that person.

Same is with HTTP. It is suffering from SHORT TERM MEMORY LOSS. It uses Sessions to record everything you did while using a website, and then, when you come again, it identifies you with the help of Cookies(Cookie is like a token). Picture is the Session here, and ID is the Cookie here.


Simple Explanation by analogy

Imagine you are in a bank, trying to get some money out of your account. But it's dark; the bank is pitch black: there's no light and you can't see your hand in front of your face. You are surrounded by another 20 people. They all look the same. And everybody has the same voice. And everyone is a potential bad guy. In other words, HTTP is stateless.

This bank is a funny type of bank - for the sake of argument here's how things work:

  1. you wait in line (or on-line) and you talk to the teller: you make a request to withdraw money, and then
  2. you have to wait briefly on the sofa, and 20 minutes later
  3. you have to go and actually collect your money from the teller.

But how will the teller tell you apart from everyone else?

The teller can't see or readily recognise you, remember, because the lights are all out. What if your teller gives your $10,000 withdrawal to someone else - the wrong person?! It's absolutely vital that the teller can recognise you as the one who made the withdrawal, so that you can get the money (or resource) that you asked for.

Solution:

When you first appear to the teller, he or she tells you something in secret:

"When ever you are talking to me," says the teller, "you should first identify yourlself as GNASHEU329 - that way I know it's you".

Nobody else knows the secret passcode.

Example of How I Withdrew Cash:

So I decide to go to and chill out for 20 minutes and then later i go to the teller and say "I'd like to collect my withdrawal"

The teller asks me: "who are you??!"

"It's me, Mr George Banks!"

"Prove it!"

And then I tell them my passcode: GNASHEU329

"Certainly Mr Banks!"

That basically is how a session works. It allows one to be uniquely identified in a sea of millions of people. You need to identify yourself every time you deal with the teller.

If you got any questions or are unclear - please post comment and i will try to clear it up for you. The following is not strictly speaking, completely accurate in its terminology, but I hope it's helpful to you in understanding concepts.

Explanation via Pictures:

Sessions explained via Picture


HTTP is stateless connection protocol, that is, the server cannot differentiate between different connections of different users.

Hence comes cookie, once a client connects first time to a server, the server generates a new session id, which later will be sent to the client as cookie value. And from now on, this session id will identify that client connection, because within each HTTP request it will see the appropriate session id inside cookies.

Now for each session id, the server keeps some data structure, which enables him to store data specific to user, this data structure you can abstractly call session.


"Session" is the term used to refer to a user's time browsing a web site. It's meant to represent the time between their first arrival at a page in the site until the time they stop using the site. In practice, it's impossible to know when the user is done with the site. In most servers there's a timeout that automatically ends a session unless another page is requested by the same user.

The first time a user connects some kind of session ID is created (how it's done depends on the web server software and the type of authentication/login you're using on the site). Like cookies, this usually doesn't get sent in the URL anymore because it's a security problem. Instead it's stored along with a bunch of other stuff that collectively is also referred to as the session. Session variables are like cookies - they're name-value pairs sent along with a request for a page, and returned with the page from the server - but their names are defined in a web standard.

Some session variables are passed as HTTP headers. They're passed back and forth behind the scenes of every page browse so they don't show up in the browser and tell everybody something that may be private. Among them are the USER_AGENT, or type of browser requesting the page, the REFERRER or the page that linked to the page being requested, etc. Some web server software adds their own headers or transfer additional session data specific to the server software. But the standard ones are pretty well documented.

Hope that helps.