Real data on the filtering methods used for web censorship in Syria in 2011 have been analysed for the first time by researchers at INRIA (the French Institute for Research in Computer Science and Automation), NICTA (Australia) and UCL. The analysis reveals the techniques used to monitor, filter and block traffic from Syrian users. These include blocking addresses based on geographic location and blocking requests that contain specific keywords related to censorship avoidance.
Studying how Internet censorship is practiced in the real world is difficult due to a lack of data. Previous studies have attempted to access web services in restricted areas and observed which attempts succeeded, but this does not reveal the full extent of censorship, or how regular users may be affected or work around it. Unusually, this work analyses 600GB of records from the censor’s side, taken from filtering ‘proxies’ inserted between the local network and the main routes for long-distance communications.
This dataset records the outcomes of millions of requests for content, each of which may be allowed, denied, or returned from stored records. The logs were captured and published on the Internet by the Telecomix group of hacking activists in October 2011. The records were subsequently confirmed as genuine by the device manufacturer Blue Coat Systems, although the company has stated that it did not authorize the use of its proxies in Syria.
While the records give a rare opportunity to examine how censorship is applied in practice at large scale, they are also messy and include mistakes and dead ends. To handle this large and challenging collection, the international group of privacy researchers developed a method that balances accuracy against ease of use, and which they hope will be transferable to others doing similar work. Only aggregated traffic was analysed in order to preserve users’ privacy.
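As a rough illustration of what analysing only aggregated traffic can mean, the sketch below counts request outcomes per destination and discards the client identifier. The record layout, host names and outcome labels are hypothetical and are not taken from the leaked logs or from the researchers' method.

```python
from collections import Counter

# Hypothetical records: (client_id, requested_host, outcome).
# The real log format is not described here; this layout is an assumption.
records = [
    ("client-a", "example.org", "allowed"),
    ("client-b", "example.org", "denied"),
    ("client-a", "example.net", "cached"),
    ("client-c", "example.org", "denied"),
]


def aggregate(rows):
    """Count outcomes per host, dropping the client identifier."""
    counts = Counter()
    for _client, host, outcome in rows:
        counts[(host, outcome)] += 1
    return counts


totals = aggregate(records)
```

Because the keys hold only the host and the outcome, the resulting counts say nothing about which client made which request.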
They found that four main kinds of filtering were employed, based on URLs, keywords, destination IP addresses and customized categories. These aimed to restrict access to instant messaging such as Skype, video sharing, and Wikipedia. Sites and Facebook pages related to news and opposition parties were off-limits, as were the keyword ‘Israel’, the domain .il, and some subnetworks. Detailed inspection of requests (Deep Packet Inspection, DPI) was used to deny those related to anti-censorship tools (for instance, any searches containing the word ‘proxy’), although this also blocked a large amount of content unrelated to censorship evasion.
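The four kinds of filtering can be pictured with a toy classifier. This is a minimal sketch under stated assumptions, not the Blue Coat configuration or the researchers' code: the rule lists, the host-to-category table, the example addresses and the `classify` function are all hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical rules for illustration only; none come from the leaked logs.
BLOCKED_DOMAIN_SUFFIXES = (".il",)
BLOCKED_KEYWORDS = ("proxy", "israel")
BLOCKED_SUBNETS = (ipaddress.ip_network("203.0.113.0/24"),)  # documentation range
BLOCKED_CATEGORIES = {"instant-messaging"}
HOST_CATEGORIES = {"im.example.org": "instant-messaging"}


def classify(url, dest_ip):
    """Return the first rule type that would deny the request, or None."""
    host = (urlsplit(url).hostname or "").lower()
    # URL-based rule: the host falls under a blocked domain suffix.
    if host.endswith(BLOCKED_DOMAIN_SUFFIXES):
        return "url"
    # Keyword rule: a blocked string appears anywhere in the request URL.
    if any(word in url.lower() for word in BLOCKED_KEYWORDS):
        return "keyword"
    # Destination IP rule: the address lies inside a blocked subnetwork.
    addr = ipaddress.ip_address(dest_ip)
    if any(addr in net for net in BLOCKED_SUBNETS):
        return "ip"
    # Category rule: the host is assigned to a blocked custom category.
    if HOST_CATEGORIES.get(host) in BLOCKED_CATEGORIES:
        return "category"
    return None
```

In this sketch a request for `http://example.com/reverse-proxy-setup` is denied by the keyword rule even though it has nothing to do with evading censorship, which mirrors the over-blocking described above.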
The researchers found that, in response, people in Syria engaged in self-censorship, taking precautions and limiting their online behavior because of surveillance. However, they also used technology to get around the restrictions. Peer-to-peer networks like BitTorrent were used for sharing instant messaging software and privacy-related tools. Google Cache, which stores versions of pages suggested by the search engine, was also used to avoid blocking, as the content is located on Google servers rather than at the banned addresses.
Dr Emiliano De Cristofaro (UCL Computer Science) said:
“We have exposed strengths and weaknesses of Web filtering conducted using Blue Coat proxies (which are still used in many countries worldwide). Through the lessons we learn we hope to give recommendations to users in how they can adapt to surveillance and censorship, and also guide research in the design of censorship-evading tools.”
Dr Mohamed-Ali Kaafar (NICTA Australia and INRIA Rhône-Alpes) said:
“Our analysis shows that compared to other censoring regimes, Internet censorship in Syria is less invasive and quite targeted. In fact, while aggressively censoring Instant Messaging, the censorship selectively targets a few Facebook pages and geo-politically significant content. This, however, does not necessarily mean minor information control or less ubiquitous surveillance, but rather shows that censorship aims at a more subtle control of the Internet. This is achievable today as the proxy appliances seamlessly support Deep Packet Inspection (DPI) which allows fine-grained censorship in real-time.”
The technical report “Censorship in the Wild: Analyzing Web Filtering in Syria” has been published as a pre-print on the arXiv repository and is now freely available. The manuscript will soon be submitted for peer review.
Emiliano De Cristofaro, UCL Computer Science
email@example.com +44 2076790349
Mohamed-Ali (Dali) Kaafar, NICTA (Australia)
Kate Oliver, Communications Manager, UCL Engineering
firstname.lastname@example.org +44 2031084085