Tor, Concealed Identity, and Privacy: How It Works and What Its Limitations Are
Tor is one of the open source projects I’m most fond of, both for the quality of the service and for the admirable mission behind it: improving privacy and concealing the identity of users as they access web resources. Tor is easy to use on Android devices, but before explaining how to browse the web anonymously on your smartphone, I think it is worth explaining how Tor works and what its limitations are.
Tor has numerous benefits, but the concealment it offers is limited, which is why I highly recommend reading this article before you start using it. However, if you cannot wait, read the article on how to browse using Tor on your Android device.
Please note: this is not an exhaustive article and it doesn’t replace Tor’s official guidelines. If you need a particularly strong level of anonymity for serious reasons, please make sure to read Tor’s official documentation and ask the community that maintains the project for support.
HOW TOR WORKS (IN A NUTSHELL)
Tor is a network of servers (also known as nodes) operated by volunteers who donate their resources and bandwidth to increase privacy and safety on the Internet. When a user or a client connects to a website or a web service through Tor, the connection doesn’t happen directly; it passes through a series of Tor servers. The data travel as if in a relay, handed like a torch from one server to the next until they reach their destination. Before being sent out, the data are encrypted as many times as there are servers involved. The data are thus protected by an encrypted structure with many layers, like an onion. The name Tor in fact comes from The Onion Router, and its logo features an onion for this reason. Each layer is a level of encryption that only the corresponding Tor server can remove. This way, each Tor node learns only which node comes immediately before it and which comes immediately after it in the chain; nobody besides the client knows the whole path the data take. The same holds for the ISPs (Internet Service Providers) involved in the communication: they have no view of the complete path.
If you have already had enough technical details, you can skip to the next section. If the topic is growing on you, here are a few examples that explain in detail what I briefly summarized above.
In the following examples we assume that client A wants to view a page of website B, and therefore sends B a message saying “MESSAGE”.
WHAT HAPPENS IF YOU DON’T USE TOR?
In this case, as you can see in the top part of the picture, A sends the message through the ISP that provides its Internet connection (if the device is a smartphone, the ISP will be the mobile carrier). Besides the ISP, the message passes through several servers before getting to B. I marked these servers with small grey clouds and called them intermediaries (“i” for short).
The question is: who knows what?
- A: knows the sender, the recipient, and the content of the message.
- i: knows the sender, the recipient, and the content of the message.
- B: knows the sender, the recipient, and the content of the message.
Clearly, this is the worst case from a privacy standpoint.
Things improve slightly if the connection with B uses a secure protocol such as HTTPS (as shown at the bottom of the image). In this case, the intermediary knows the sender and the recipient, but it doesn’t know the content of the message.
WHAT CHANGES WHEN WE USE TOR?
Let’s now send the same message, from A to B, over the Tor network. Client A has access to the list of available Tor servers and picks at least 3 of them; it then arranges them in a sequence that defines the path the message will follow before eventually reaching B.
In the next picture you can see the individual messages received by each node. To show how the “onion” structure works, I enclosed the encrypted data in colored parentheses (the data are colored as well). Each color matches the color of the server able to decrypt those data; the other servers cannot decrypt them.
Let’s take a closer look at the steps the client takes to compose the message it sends to server 1.
The original message “MESSAGE” is enclosed in purple parentheses to indicate that it has been encrypted so that only the recipient B is able to read it. This layer of encryption does not depend on Tor, but on the fact that we’re assuming an https page (notice that it was also present in the previous image, for the encrypted connection without Tor). If the web page were served over http, the purple encryption would disappear.
To the message described in the previous step, the client adds information about the recipient, B. I indicated this information with “next:B”. The combination of the message and “next:B” is encrypted again, so that only server 3 is able to read it. This encryption is highlighted in green.
The message obtained in the previous step is extended with information about the server that precedes the recipient, which is server 3; I indicated this information as “next:3”. The combination of the message and “next:3” is encrypted once again so that only server 8 is able to read it. This encryption is shown in the picture in blue.
It keeps going like this, backwards. The message obtained in the previous step is extended with information about the preceding server, which is 8; I indicated this information as “next:8”. The combination of the message and “next:8” is encrypted again so that only server 1 can read it. This encryption is shown in the picture in orange.
The message is now ready to be sent to server 1, which decrypts it and obtains the information about the next server, “next:8”, along with a message that is still encrypted. The latter is sent to server 8, which operates in the same way: it decrypts the content to obtain information about the next server, “next:3”, but it cannot decrypt the rest of the message. The handover proceeds like this all the way to the recipient B.
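The wrapping and unwrapping steps above can be sketched in a few lines of Python. This is only a toy model and not how Tor is actually implemented: the cipher is a made-up XOR stream derived from SHA-256 (not secure, and not what Tor uses), the keys are fixed strings I invented, and the function names `build_onion`, `peel`, and `route` are hypothetical. The server names 1, 8, 3 and B are the ones from the picture.

```python
import hashlib
import json


def _keystream(key: bytes, length: int) -> bytes:
    """Derive a pseudo-random byte stream from a key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream. NOT secure, for illustration only."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


# XOR is its own inverse, so decryption is the same operation.
toy_decrypt = toy_encrypt


def build_onion(message: bytes, path: list, destination: str, keys: dict) -> bytes:
    """Wrap the message backwards: innermost layer for B, outermost for server 1."""
    payload = toy_encrypt(keys[destination], message)  # the "purple" https layer
    next_hop = destination
    for hop in reversed(path):
        layer = json.dumps({"next": next_hop, "data": payload.hex()}).encode()
        payload = toy_encrypt(keys[hop], layer)  # only `hop` can remove this layer
        next_hop = hop
    return payload


def peel(hop: str, payload: bytes, keys: dict):
    """Remove one layer: the hop learns only the next hop and an opaque blob."""
    layer = json.loads(toy_decrypt(keys[hop], payload))
    return layer["next"], bytes.fromhex(layer["data"])


def route(onion: bytes, first_hop: str, destination: str, keys: dict):
    """Hand the onion from server to server until it reaches the destination."""
    hop, payload, trace = first_hop, onion, []
    while hop != destination:
        next_hop, payload = peel(hop, payload, keys)
        trace.append((hop, next_hop))
        hop = next_hop
    return trace, toy_decrypt(keys[destination], payload)


# Hypothetical fixed keys; a real client negotiates a fresh key with each server.
keys = {"1": b"key-1", "8": b"key-8", "3": b"key-3", "B": b"key-B"}
onion = build_onion(b"MESSAGE", ["1", "8", "3"], "B", keys)
trace, message = route(onion, "1", "B", keys)
print(trace)
print(message)
```

Each entry of `trace` is a pair (server, next server), which is all a single server learns when it peels its layer; only B, at the end, recovers the original message.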
Let’s answer the question we encountered earlier: who knows what?
- A: knows everything: the sender, the recipient, and the content of the message.
- i1: knows the sender and knows that it is connecting to the Tor network through server 1. It knows neither the message nor the recipient.
- 1: knows the sender. It knows that it has to forward the message to server 8. It knows neither the message nor the recipient.
- i2: knows that two servers of the Tor network are communicating; it doesn’t know the sender, the recipient, or the message.
- 8: knows it received a message from server 1, and that it has to forward it to server 3; it doesn’t know the sender, the recipient, or the message.
- i3: knows that two servers of the Tor network are communicating; it doesn’t know the sender, the recipient, or the message.
- 3: knows the recipient, but it doesn’t know the sender or the message*.
- i4: knows the recipient and knows that the message comes from the Tor network, but it knows neither the sender nor the message*.
- B: knows the message and knows that it comes from server 3 on the Tor network. It doesn’t know the sender.
*Please note: if the connection is not encrypted (which happens, for instance, when you visit a web page whose URL starts with http:// rather than https://), both i4 and server 3 know the message as well.
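The “who knows what” list for the Tor servers can be summarized with a small Python sketch. It is just a restatement of the list above as code, under the same simplified model used in this article; the function name `who_knows_what` and the labels “sender”, “recipient”, and “message” are my own choices, and the intermediaries i1 to i4 are left out to keep it short.

```python
def who_knows_what(path, https=True):
    """Return, for each server in the path and for B, the set of facts it learns.

    Simplified model from the article: the first server sees the sender,
    the last server sees the recipient, and the message is visible to the
    last server only when the page is served over plain http.
    """
    knowledge = {}
    last = len(path) - 1
    for position, server in enumerate(path):
        facts = set()
        if position == 0:
            facts.add("sender")      # the entry server talks directly to A
        if position == last:
            facts.add("recipient")   # the exit server talks directly to B
            if not https:
                facts.add("message")  # no https layer protects the content
        knowledge[server] = facts
    # B always reads the message, but sees it coming from the last server.
    knowledge["B"] = {"recipient", "message"}
    return knowledge


print(who_knows_what(["1", "8", "3"], https=True))
print(who_knows_what(["1", "8", "3"], https=False))
```

In this model no single server on a three-server path ever holds both “sender” and “recipient”, which is the property the whole scheme relies on.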
LIMITATIONS OF THE USE OF TOR
From what we just explained, it’s clear that privacy on the Tor network has some limitations and requires a few precautions.
Your ISP knows that you’re using the Tor network, even though it doesn’t know the message or the recipient. This is usually not a problem, but it’s better to be aware of it.
The data that travel from the last node of the Tor chain to the recipient are not encrypted by Tor and are delivered just as you sent them. This means that if the URL of the page you’re visiting starts with https://, the data are still encrypted, but if it begins with http://, the data travel in the clear. In that case, the last node of the Tor network and the other servers involved in the last leg of the communication can read the message, even though they won’t know that it comes from you. Precautions: the same ones you should adopt when you browse the Internet without Tor. If you’re sending sensitive data such as passwords or credit card information, make sure the URL of the page starts with https://. Tor cannot protect your personal information once it leaves its network, and the only way to protect it is to connect to websites that use encrypted connections.
The website you visit is always able to read the message you send. That’s obvious; otherwise it couldn’t reply with the information you requested. However, thanks to the Tor network, it doesn’t know who sent the message; it only knows that it comes from the Tor network. As a precaution, if your goal is to hide your identity from the website you’re visiting, do not log in with your credentials. If you enter a username and password, you’re giving your personal information to the website, which will be able to identify you even over Tor.
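The http:// versus https:// precaution described above is easy to check programmatically. Here is a minimal Python sketch using the standard library’s `urllib.parse`; the function name `last_node_can_read` and the example URLs are hypothetical, and the check looks only at the URL scheme, nothing more.

```python
from urllib.parse import urlparse


def last_node_can_read(url: str) -> bool:
    """Return True if the last Tor node could read the content sent to this URL.

    Only the scheme is inspected: anything other than https is treated
    as unencrypted on the final leg of the journey.
    """
    return urlparse(url).scheme != "https"


for url in ("http://example.com/login", "https://example.com/login"):
    print(url, "readable by the last node:", last_node_can_read(url))
```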