CryptoParty handbook - Appendices -

The necessity of Open Source
Cryptography and Encryption
Glossary

The last 20 years have seen network technology reaching ever more deeply into our lives, informing how we communicate and act within the world. With this come inherent risks: the less we understand the network environment we depend upon, the more vulnerable we are to exploitation.

This ignorance is something traditionally enjoyed by criminals. In recent years, however, some corporations and governments have exploited civilian ignorance in a quest for increased control. This flagrant and often covert denial of dignity breaches many basic rights, the right to privacy in particular.

Closed source software has been a great boon to such exploitation – primarily because there is no code available for open, decentralised security auditing by the community. Under the auspices of protecting trade secrets, closed-source software developers have proven unwilling to explain to users how their programs work. This might not always be an issue were it not for the high stakes: identity theft, the distribution of deeply personal opinion and sentiment, a person's diverse interests and even his/her home increasingly come into close contact with software in a world-wide network context. As such, many people find themselves using software for personal purposes with full trust that it is secure. The Windows operating system itself is the most obvious real-world example. Apple’s OS X follows close behind, with large portions of the operating system’s inner workings barred from public inspection.

In cryptography there is a strong principle, established in the 19th century by Auguste Kerckhoffs (and hence named after him), which demands that a cryptosystem remain secure even if everything about it, except the key, is public knowledge.

While this principle has been taken further by most scientific and (of course) open source communities – publishing their methods and inner workings upfront, so potential weaknesses can be pointed out and fixed before further distribution – most distributors of proprietary software rely on obfuscation to hide the weaknesses of their software. As such they often address newly discovered vulnerabilities in a non-transparent way – leaving many trusting users at risk of exploitation.

Of course it must be said that Open Source Software is only as secure as you make it (and there is a lot of OSS written by beginners). However, there are many good examples of well written, well managed software which has such a large (and concerned) user base that even the tiniest of mistakes are quickly found and dealt with. This is especially the case with software depended upon in a network context.

To use closed source software in a network context is not only to be a minority, it is to be overlooked by a vast community of concerned researchers and specialists that have your privacy and safety in mind.

N.B. There is also a more cynical view of Open Source Software. It points out that nobody is paid full time to constantly review and regression test the latest tinkering by unskilled or deliberately malicious programmers, so major security weaknesses can go undetected for long periods in complicated software, leaving it vulnerable to hackers, criminals, intelligence agencies and so on. One example is the (now fixed) Debian Linux predictable random number generator problem, which led to the creation of lots of weak cryptographic keys.

Cryptography and Encryption

Cryptography and encryption are similar terms, the former being the science and the latter the implementation of it. The history of the subject can be traced back to ancient civilisations, when the first humans began to organise themselves into groups. This was driven in part by the realisation that we were in competition for resources, and that tribal organisation, warfare and so forth were necessary to keep on top of the heap. In this respect cryptography and encryption are rooted in warfare, progression and resource management, where it was necessary to send secret messages to each other without the enemy deciphering one's moves.

Writing is actually one of the earliest forms of cryptography as not everyone could read. The word cryptography stems from the Greek words kryptos (hidden) and graphein (writing). In this respect cryptography and encryption in their simplest form refer to the writing of hidden messages, which require a system or rule to decode and read them. Essentially this enables you to protect your privacy by scrambling information in a way that it is only recoverable with certain knowledge (passwords or passphrases) or possession (a key).

Put another way, encryption is the translation of information written in plaintext into a non-readable form (ciphertext) using algorithmic schemes (ciphers). The goal is to use the right key to unlock the ciphertext and return it to its original plaintext form so it becomes readable again.

Although most encryption methods refer to the written word, during World War Two the US military used Navajo Indians, who traveled between camps sending messages in their native tongue. The reason the army used the Navajo tribe was to protect the information they were sending from the Japanese troops, who famously could not decipher the Navajo’s spoken language. This is a very simple example of using a language to send messages that you do not want other people to listen in on or understand.

Why is encryption important?

Computer and telecommunication networks store digital echoes or footprints of our thoughts and records of personal lives.

From banking, to booking, to socialising: we submit a variety of detailed, personalised information, which is driving new modes of business, social interaction and behavior. We have now become accustomed to giving away what was (and still is) considered private information in exchange for what is presented as more personalised and tailored services, which might meet our needs, but cater to our greed.

Let's consider a scenario whereby we all thought it was fine to send all our communication on open handwritten postcards: from conversations with our doctor, to intimate moments with our lovers, to legal discussions we may have with lawyers or accountants. It’s unlikely that we would want all people to be able to read such communications. So instead we have written letters in sealed envelopes, tracking methods for sending post, closed offices and confidential agreements, which help to keep such communication private. However, given the shift in how we communicate, much more of this type of interaction is taking place online. More importantly, it is taking place through online spaces which are not private by default and are open to people with little technical skill to snoop into the matters that can mean the most to our lives.

Online privacy and encryption are therefore something we need to be aware of and practice daily, in the same way we would put an important letter into an envelope or have a conversation behind a closed door. Given that so much of our private communication is now happening in networked and online spaces, we should consider the interfaces which protect this material, like envelopes or seals, as a basic necessity and human right.

Encryption examples

Throughout history we can find examples of cipher methods, which have been used to keep messages private and secret.

A Warning!

This chapter first explains a number of historical cryptographic systems and then provides a summary of modern techniques. The historical examples illustrate how cryptography emerged, but are considered broken in the face of modern computers. They can be fun to learn, but please don’t use them for anything sensitive!

Historical ciphers

Classical ciphers refer to historical ciphers, which are now out of popular use or no longer applicable. There are two general categories of classical ciphers: transposition and substitution ciphers.

In a transposition cipher, the letters themselves are kept unchanged, but their order within the message is scrambled according to some well-defined scheme. An example of a transposition cipher is the Skytale, which was used in ancient Rome and Greece. A strip of paper was wrapped around a stick and the message written across it. That way the message could not be read unless the strip was wound around a stick of similar diameter again.
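The winding and unwinding described above can be sketched in Python as a grid transposition. This is only a rough model: it assumes the stick holds a fixed number of letters per turn (`letters_per_turn`, a hypothetical parameter) and that short messages are padded with `_`. The function names are illustrative, not taken from any historical source.

```python
def scytale_encrypt(message: str, letters_per_turn: int, pad: str = "_") -> str:
    """Write the message in rows of `letters_per_turn`, then read it column by column."""
    # Pad the message so that the last row of the grid is complete.
    remainder = len(message) % letters_per_turn
    if remainder:
        message += pad * (letters_per_turn - remainder)
    rows = [message[i:i + letters_per_turn]
            for i in range(0, len(message), letters_per_turn)]
    # Reading down each column scrambles the order but keeps every letter.
    return "".join(row[col] for col in range(letters_per_turn) for row in rows)


def scytale_decrypt(ciphertext: str, letters_per_turn: int) -> str:
    """Undo scytale_encrypt; assumes the ciphertext length is a multiple of letters_per_turn."""
    row_count = len(ciphertext) // letters_per_turn
    # Transposing the grid a second time restores the original order.
    return scytale_encrypt(ciphertext, row_count)


secret = scytale_encrypt("HELPMEIAMUNDERATTACK", 5)
print(secret)                      # HENTEIDTLAEAPMRCMUAK
print(scytale_decrypt(secret, 5))  # HELPMEIAMUNDERATTACK
```

Note that every letter of the plaintext survives unchanged; only the positions differ, which is what distinguishes transposition from substitution.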


A substitution cipher is a form of classical cipher whereby letters or groups of letters are systematically replaced throughout the message by other letters (or groups of letters). Substitution ciphers are divided into monoalphabetic and polyalphabetic substitutions. The Caesar Shift cipher is a common example of a monoalphabetic substitution cipher, where the letters in the alphabet are shifted in one direction or another.
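A minimal sketch of the Caesar shift in Python, assuming the 26-letter Latin alphabet and leaving any other character (spaces, punctuation) untouched. The function name `caesar` and the example shift of 3 are chosen here for illustration.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABC...XYZ"


def caesar(text: str, shift: int) -> str:
    """Replace each letter by the one `shift` places further along the alphabet."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            # Wrap around at the end of the alphabet with the modulo operator.
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)  # keep spaces and punctuation as they are
    return "".join(result)


ciphertext = caesar("ATTACK AT DAWN", 3)
print(ciphertext)              # DWWDFN DW GDZQ
print(caesar(ciphertext, -3))  # ATTACK AT DAWN
```

Shifting by the negative of the original amount reverses the substitution, so the shift itself acts as the key.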


Polyalphabetic substitutions are more complex than monoalphabetic ones, as they use more than one alphabet and rotate between them. For example, the Alberti cipher, the first polyalphabetic cipher, was created by Leon Battista Alberti, a 15th century Italian Renaissance polymath and humanist who is also credited as the godfather of western cryptography. His cipher is similar to the Vigenère cipher, where every letter of the alphabet gets a unique number (e.g. 1-26). The message is then encrypted by writing down the message along with the password repeatedly written beneath it.

In the Vigenère cipher the corresponding numbers of the letters of message and key are summed (with sums exceeding the size of the alphabet wrapping around to the beginning), making the message so unreadable that it could not be deciphered for centuries (nowadays, with the help of computers, this obviously isn’t true anymore).
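The summing of message and key letters can be sketched as follows. This is a simplified illustration that assumes the message and key contain only the letters A-Z and numbers them 0-25 rather than 1-26 (which makes the wrap-around a plain modulo operation). The function name `vigenere` is illustrative.

```python
from itertools import cycle


def vigenere(message: str, key: str, decrypt: bool = False) -> str:
    """Add (or subtract, when decrypting) the key letters to the message letters."""
    sign = -1 if decrypt else 1
    result = []
    # cycle() repeats the key for as long as the message needs it.
    for m, k in zip(message.upper(), cycle(key.upper())):
        total = (ord(m) - ord("A")) + sign * (ord(k) - ord("A"))
        # Sums outside 0-25 wrap around the alphabet.
        result.append(chr(total % 26 + ord("A")))
    return "".join(result)


ciphertext = vigenere("ATTACKATDAWN", "LEMON")
print(ciphertext)                                  # LXFOPVEFRNHR
print(vigenere(ciphertext, "LEMON", decrypt=True))  # ATTACKATDAWN
```

Because the key letter changes from position to position, the same plaintext letter (here `A` or `T`) is not always replaced by the same ciphertext letter, unlike in the Caesar shift.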


During World War 2 there was a surge in cryptography, which brought techniques such as the one-time pad (OTP) into wider use. The OTP algorithm combines plaintext with a random key that is as long as the plaintext, so that each character of the key is only used once. To use it you need two copies of the pad, one kept by each user, which are exchanged via a secure channel. Once the message is encoded with the pad, the pad is destroyed and the encoded message is sent. On the recipient’s side, the plaintext message is recovered from the encoded message using the duplicate copy of the pad. A good way to look at OTP is to think of it as a 100% noise source, which is used to mask the message. Since both parties of the communication have copies of the noise source, they are the only people who can filter it out.
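The masking idea can be sketched in a few lines of Python. This example assumes the message is handled as bytes and that pad and message are combined with XOR, a common modern formulation (paper pads typically added letters or digits instead). The names `make_pad` and `otp_xor` are illustrative.

```python
import secrets


def make_pad(length: int) -> bytes:
    """Generate a random pad; both parties need an identical copy."""
    return secrets.token_bytes(length)


def otp_xor(data: bytes, pad: bytes) -> bytes:
    """Combine data with the pad byte by byte; the same call encodes and decodes."""
    if len(pad) < len(data):
        raise ValueError("the pad must be at least as long as the message")
    return bytes(d ^ p for d, p in zip(data, pad))


message = b"MEET AT NOON"
pad = make_pad(len(message))          # shared in advance over a secure channel
ciphertext = otp_xor(message, pad)    # sender masks the message with the noise
recovered = otp_xor(ciphertext, pad)  # recipient filters the noise out again
print(recovered)                      # b'MEET AT NOON'
```

Applying the same pad twice cancels it out, which is why the recipient needs an exact duplicate and why a pad must never be reused for a second message.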

OTP lies behind modern day stream ciphers, which are explained below. Claude Shannon (a key player in modern cryptography and information theory) demonstrated in his seminal 1949 paper “Communication Theory of Secrecy Systems” that any theoretically unbreakable cipher must share the properties of the OTP, which is impossible to crack if used correctly.

Modern ciphers

After the World Wars the field of cryptography became less of a public service and fell more within the domain of governance. Major advances in the field began to reemerge in the mid-1970s with the advent of personal computers and the introduction of the Data Encryption Standard (DES, developed at IBM and adopted as a U.S. government standard in 1977). Since 2001 we have used the Advanced Encryption Standard (AES), which is also a symmetric cipher.

Contemporary cryptography can be generally divided into what is called symmetric, asymmetric and quantum cryptography.

Symmetric cryptography, or secret key cryptography, refers to ciphers where the same key is used to both encrypt and decrypt the text or information involved. In this class of ciphers the key is shared and kept secret within a restricted group, and therefore it is not possible to view the encrypted information without having the key. A simple analogy to secret key cryptography is a community garden with a single gate key that is shared by the community. You cannot open the gate unless you have the key. Obviously the issue here, with the garden key and with symmetric cryptography, is that if the key falls into the wrong hands, an intruder or attacker can get in and the security of the garden, or of the data or information, is compromised. Consequently one of the main issues with this form of cryptography is key management. As a result this method is best employed within single-user contexts or small group environments.

Despite this limitation symmetric key methods are considerably faster than asymmetric methods and so are the preferred mechanism for encrypting large chunks of text.

Symmetric ciphers are usually implemented using block ciphers or stream ciphers.

Block ciphers work by looking at the input data in blocks of 8, 16 or 32 bytes at a time and spreading the input and key within those blocks. Different modes of operation are applied to the data in order to transform and spread it between blocks. Such ciphers use a secret key to convert a fixed block of plaintext into ciphertext. The same key is then used to decrypt the ciphertext.
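To make the idea of fixed-size blocks concrete, here is a small Python sketch of only the first step: cutting the input into blocks. It performs no encryption. The 16-byte block size and the padding rule (each padding byte stores the padding length, in the style of PKCS#7) are assumptions chosen for illustration, and the function names are hypothetical.

```python
BLOCK_SIZE = 16  # bytes; assumed here, the text mentions 8, 16 or 32


def pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Fill the last block; every padding byte holds the number of bytes added."""
    missing = block_size - len(data) % block_size
    return data + bytes([missing]) * missing


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return the padded input as a list of equally sized blocks."""
    padded = pad(data, block_size)
    return [padded[i:i + block_size] for i in range(0, len(padded), block_size)]


blocks = split_blocks(b"A" * 20)
print(len(blocks))         # 2
print(len(blocks[-1]))     # 16
```

A real block cipher would then transform each of these blocks with the secret key, and a mode of operation would decide how neighbouring blocks influence one another.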

In comparison, stream ciphers (also known as state ciphers) work on each plaintext digit in turn, combining it with the corresponding digit of a keystream to form the ciphertext. The keystream is a stream of random or pseudorandom characters (bits, bytes, numbers or letters); each one is combined with a character of the plaintext message through an additive or subtractive function, which produces the ciphertext. When the keystream is truly random and as long as the message, this method is very secure, but it is not always practical, since the key needs to be transmitted in some secure way so that the receiver can decipher the message. Another limitation is that the key can only be used once and is then discarded. Although this can mean almost watertight security, it does limit the use of the cipher.
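The combination of plaintext and keystream can be sketched as below. This is a toy for illustration only and not a vetted cipher: it assumes the keystream is stretched out of a short shared key by hashing the key together with a counter using SHA-256, and that the combining function is XOR. Both choices, and the function names, are assumptions made for this example.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` keystream bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        # Each hash call yields 32 more keystream bytes.
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def stream_xor(data: bytes, key: bytes) -> bytes:
    """Combine every data byte with the matching keystream byte."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


ciphertext = stream_xor(b"stream ciphers work byte by byte", b"shared key")
print(stream_xor(ciphertext, b"shared key"))  # b'stream ciphers work byte by byte'
```

Since both sides can regenerate the same keystream from the key, the receiver repeats the operation to recover the plaintext. Reusing a keystream for two messages would leak information about both, which mirrors the one-time restriction described above.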

Asymmetric ciphers work with much more complex mathematical problems that have back doors (often called trapdoors): the problem is hard to solve in general, but quick for whoever holds the secret. Because they are slower than symmetric ciphers, they are used on smaller, highly important pieces of data. They also work on fixed data sizes, typically 1024-2048 bits and 384 bits. What makes them special is that they help solve some of the issues with key distribution by allocating one public and one private key per person, so that everyone just needs to know everyone else’s public portion. Asymmetric ciphers are also used for digital signatures, whereas symmetric ciphers are generally used for message authenticity. Symmetric ciphers cannot provide non-repudiation signatures (i.e., signatures that you cannot later deny having made). Digital signatures are very important in modern day cryptography. They are similar to wax seals in that they verify who the message is from and, like seals, are unique to that person. Digital signatures are one of the methods used within public key systems, which have transformed the field of cryptography and are central to modern day Internet security and online transactions.
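The public/private pairing can be illustrated with a toy version of RSA, one well-known asymmetric scheme (the text above does not name a specific one, so this choice is an assumption). The primes are deliberately tiny and there is no padding, so this sketch is for understanding only and offers no security. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
# Toy key generation with tiny primes (real keys use primes hundreds of digits long).
p, q = 61, 53
n = p * q                 # modulus shared by both keys: 3233
phi = (p - 1) * (q - 1)   # 3120, only computable if you know p and q
e = 17                    # public exponent, chosen coprime with phi
d = pow(e, -1, phi)       # private exponent: the "trapdoor" (2753)

public_key = (e, n)       # may be handed to everyone
private_key = (d, n)      # must be kept secret


def apply_key(value: int, key: tuple) -> int:
    """Raise value to the key's exponent modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65
ciphertext = apply_key(message, public_key)    # anyone can encrypt
recovered = apply_key(ciphertext, private_key)  # only the key holder can decrypt

signature = apply_key(message, private_key)     # only the key holder can sign
is_valid = apply_key(signature, public_key) == message  # anyone can verify
print(recovered, is_valid)  # 65 True
```

The same pair of keys serves both purposes: encrypting with the public half gives confidentiality, while transforming with the private half gives a signature that anyone holding the public half can check.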

Quantum Cryptography

Quantum cryptography is the term used to describe the type of cryptography that is now necessary to deal with the speed at which we process information and the security measures this requires. Essentially it deals with how we use quantum communication to securely exchange and distribute a key. As the machines we use become faster, public-key encryption and digital signatures become easier to break, and quantum cryptography deals with the types of algorithms that are necessary to keep pace with more advanced networks.

Challenges & Implications

At the heart of cryptography lies the challenge of how we use and communicate information. The above methods describe how we encrypt written communication but obviously as shown in the Navajo example other forms of communication (speech, sound, image etc) can also be encrypted using different methods.

The main goal and skill of encryption is to apply the right methods to support trustworthy communication. This is achieved by understanding the tradeoffs, strengths and weaknesses of different cipher methods and how they relate to the level of security and privacy required. Getting this right depends on the task and context.

Importantly, when we speak about communication, we are speaking about trust. Traditionally cryptography dealt with hypothetical scenarios, where the challenge was to address how ‘Bob’ could speak to ‘Alice’ in a private and secure manner.

Our lives are now heavily mediated via computers and the Internet, so the boundaries between Bob, Alice and the ‘other’ (Eve, Oscar, Big Brother, your boss, ex-boyfriend or the government) are a lot more blurred. Given the quantum leaps in computer processing, in order for ‘us’, the Bobs and Alices, to have trust in the system, we need to know who we are talking to, who is listening and, importantly, who has the potential to eavesdrop. What becomes important is how we navigate this complexity and feel in control and secure, so that we can engage and communicate in a trustful manner, which respects our individual freedoms and privacy.

Glossary

aggregator

An aggregator is a service that gathers syndicated information from one or many sites and makes it available at a different address. Sometimes called an RSS aggregator, a feed aggregator, a feed reader, or a news reader. (Not to be confused with a Usenet News reader.)

anonymity

Anonymity on the Internet is the ability to use services without leaving clues to one’s identity or being spied upon. The level of protection depends on the anonymity techniques used and the extent of monitoring. The strongest techniques in use to protect anonymity involve creating a chain of communication using a random process to select some of the links, in which each link has access to only partial information about the process. The first knows the user’s Internet address (IP) but not the content, destination, or purpose of the communication, because the message contents and destination information are encrypted. The last knows the identity of the site being contacted, but not the source of the session. One or more steps in between prevents the first and last links from sharing their partial knowledge in order to connect the user and the target site.

anonymous remailer

An anonymous remailer is a service that accepts e-mail messages containing instructions for delivery, and sends them out without revealing their sources. Since the remailer has access to the user’s address, the content of the message, and the destination of the message, remailers should be used as part of a chain of multiple remailers so that no one remailer knows all this information.

ASP (application service provider)

An ASP is an organization that offers software services over the Internet, allowing the software to be upgraded and maintained centrally.

backbone

A backbone is one of the high-bandwidth communications links that tie together networks in different countries and organizations around the world to form the Internet.

badware

bandwidth

The bandwidth of a connection is the maximum rate of data transfer on that connection, limited by its capacity and the capabilities of the computers at both ends of the connection.

bash (Bourne-again shell)

The bash shell is a command-line interface for Linux/Unix operating systems, based on the Bourne shell.

BitTorrent

BitTorrent is a peer-to-peer file-sharing protocol invented by Bram Cohen in 2001. It allows individuals to cheaply and effectively distribute large files, such as CD images, video, or music files.

blacklist

A blacklist is a list of forbidden things. In Internet censorship, lists of forbidden Web sites or the IP addresses of computers may be used as blacklists; censorware may allow access to all sites except for those specifically listed on its blacklist. An alternative to a blacklist is a whitelist, or a list of permitted things. A whitelist system blocks access to all sites except for those specifically listed on the whitelist. This is a less common approach to Internet censorship. It is possible to combine both approaches, using string matching or other conditional techniques on URLs that do not match either list.

bluebar

The blue URL bar (called the Bluebar in Psiphon lingo) is the form at the top of your Psiphon node browser window, which allows you to access a blocked site by typing its URL inside.

block

To block is to prevent access to an Internet resource, using any number of methods.

bookmark

A bookmark is a placeholder within software that contains a reference to an external resource. In a browser, a bookmark is a reference to a Web page – by choosing the bookmark you can quickly load the Web site without needing to type in the full URL.

bridge

brute-force attack

A brute force attack consists of trying every possible code, combination, or password until you find the right one. These are some of the most trivial hacking attacks.
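As a small illustration of trying every possible combination, the following Python sketch searches for a numeric PIN whose SHA-256 hash is known. The scenario (a 4-digit PIN stored as an unsalted hash) and the function name `brute_force_pin` are assumptions made up for this example.

```python
import hashlib
from itertools import product


def brute_force_pin(target_hash: str, digits: int = 4):
    """Try every PIN of the given length until one hashes to target_hash."""
    for combo in product("0123456789", repeat=digits):
        guess = "".join(combo)
        if hashlib.sha256(guess.encode()).hexdigest() == target_hash:
            return guess
    return None  # the secret was not in the search space


# The "unknown" secret, known here only through its hash.
stored_hash = hashlib.sha256(b"0427").hexdigest()
print(brute_force_pin(stored_hash))  # 0427
```

With only 10,000 candidates the search finishes almost instantly. Each extra character multiplies the work, which is why long passphrases resist this kind of attack far better than short codes.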

cache

A cache is a part of an information-processing system used to store recently used or frequently used data to speed up repeated access to it. A Web cache holds copies of Web page files.

censor

To censor is to prevent publication or retrieval of information, or take action, legal or otherwise, against publishers and readers.

censorware

Censorware is software used to filter or block access to the Internet. This term is most often used to refer to Internet filtering or blocking software installed on the client machine (the PC which is used to access the Internet). Most such client-side censorware is used for parental control purposes.

Sometimes the term censorware is also used to refer to software used for the same purpose installed on a network server or router.

CGI (Common Gateway Interface)

CGI is a common standard used to let programs on a Web server run as Web applications. Many Web-based proxies use CGI and thus are also called “CGI proxies”. (One popular CGI proxy application written by James Marshall using the Perl programming language is called CGIProxy.)

chat

Chat, also called instant messaging, is a common method of communication among two or more people in which each line typed by a participant in a session is echoed to all of the others. There are numerous chat protocols, including those created by specific companies (AOL, Yahoo!, Microsoft, Google, and others) and publicly defined protocols. Some chat client software uses only one of these protocols, while others use a range of popular protocols.

cipher

In cryptography, a cipher (or cypher) is an algorithm for performing encryption or decryption.

circumvention

Circumvention is publishing or accessing content in spite of attempts at censorship.

Common Gateway Interface

See CGI.

command-line interface

A method of controlling the execution of software using commands entered on a keyboard, such as a Unix shell or the Windows command line.

cookie

A cookie is a text string sent by a Web server to the user’s browser to store on the user’s computer, containing information needed to maintain continuity in sessions across multiple Web pages, or across multiple sessions. Some Web sites cannot be used without accepting and storing a cookie. Some people consider this an invasion of privacy or a security risk.

country code top-level domain (ccTLD)

Each country has a two-letter country code, and a TLD (top-level domain) based on it, such as .ca for Canada; this domain is called a country code top-level domain. Each such ccTLD has a DNS server that lists all second-level domains within the TLD. The Internet root servers point to all TLDs, and cache frequently-used information on lower-level domains.

cryptography

Cryptography is the practice and study of techniques for secure communication in the presence of third parties (called adversaries). More generally, it is about constructing and analyzing protocols that overcome the influence of adversaries and which are related to various aspects in information security such as data confidentiality, data integrity, authentication, and non-repudiation. Modern cryptography intersects the disciplines of mathematics, computer science, and electrical engineering. Applications of cryptography include ATM cards, computer passwords, and electronic commerce.

DARPA (Defense Advanced Research Projects Agency)

DARPA is the successor to ARPA, which funded the Internet and its predecessor, the ARPAnet.

decryption

Decryption is recovering plain text or other messages from encrypted data with the use of a key.

disk encryption

Disk encryption is a technology which protects information by converting it into unreadable code that cannot be deciphered easily by unauthorized people. Disk encryption uses disk encryption software or hardware to encrypt every bit of data that goes on a disk or disk volume. Disk encryption prevents unauthorized access to data storage.

domain

DNS (Domain Name System)

The Domain Name System (DNS) converts domain names, made up of easy-to-remember combinations of letters, to IP addresses, which are hard-to-remember strings of numbers. Every computer on the Internet has a unique address (a little bit like an area code+telephone number).

DNS leak

A DNS leak occurs when a computer configured to use a proxy for its Internet connection nonetheless makes DNS queries without using the proxy, thus exposing the user’s attempts to connect with blocked sites. Some Web browsers have configuration options to force the use of the proxy.

DNS server

A DNS server, or name server, is a server that provides the look-up function of the Domain Name System. It does this either by accessing an existing cached record of the IP address of a specific domain, or by sending a request for information to another name server.

DNS tunnel

A DNS tunnel is a way to carry other kinds of traffic inside DNS queries to nameservers. Because you “abuse” the DNS system for an unintended purpose, it only allows a very slow connection of about 3 kb/s, which is even less than the speed of an analog modem. That is not enough for YouTube or file sharing, but should be sufficient for instant messengers like ICQ or MSN Messenger and also for plain text e-mail.

On the connection you want to use a DNS tunnel, you only need port 53 to be open; therefore it even works on many commercial Wi-Fi providers without the need to pay.

The main problem is that there are no public modified nameservers that you can use. You have to set up your own. You need a server running Linux with a permanent connection to the Internet. There you can install the free software OzymanDNS, and in combination with SSH and a proxy like Squid you can use the tunnel. More information on this is available at http://www.dnstunnel.de.

eavesdropping

Eavesdropping is listening to voice traffic or reading or filtering data traffic on a telephone line or digital data connection, usually to detect or prevent illegal or unwanted activities or to control or monitor what people are talking about.

e-mail

E-mail, short for electronic mail, is a method to send and receive messages over the Internet. It is possible to use a Web mail service or to send e-mails with the SMTP protocol and receive them with the POP3 protocol by using an e-mail client such as Outlook Express or Thunderbird. It is comparatively rare for a government to block e-mail, but e-mail surveillance is common. If e-mail is not encrypted, it could be read easily by a network operator or government.

embedded script

encryption

Encryption is any method for recoding and scrambling data or transforming it mathematically to make it unreadable to a third party who doesn’t know the secret key to decrypt it. It is possible to encrypt data on your local hard drive using software like TrueCrypt (http://www.truecrypt.org) or to encrypt Internet traffic with TLS/SSL or SSH.

exit node

file sharing

File sharing refers to any computer system where multiple people can use the same information, but often refers to making music, films or other materials available to others free of charge over the Internet.

file spreading engine

A file spreading engine is a Web site a publisher can use to get around censorship. A user only has to upload a file to publish once and the file spreading engine uploads that file to some set of sharehosting services (like Rapidshare or Megaupload).

filter

To filter is to search in various ways for specific data patterns to block or permit communications.

Firefox

Firefox is the most popular free and open source Web browser, developed by the Mozilla Foundation.

forum

On a Web site, a forum is a place for discussion, where users can post messages and comment on previously posted messages. It is distinguished from a mailing list or a Usenet newsgroup by the persistence of the pages containing the message threads. Newsgroup and mailing list archives, in contrast, typically display messages one per page, with navigation pages listing only the headers of the messages in a thread.

frame

A frame is a portion of a Web page with its own separate URL. For example, frames are frequently used to place a static menu next to a scrolling text window.

FTP (File Transfer Protocol)

The FTP protocol is used for file transfers. Many people use it mostly for downloads; it can also be used to upload Web pages and scripts to some Web servers. It normally uses ports 20 and 21, which are sometimes blocked. Some FTP servers listen to an uncommon port, which can evade port-based blocking.

A popular free and open source FTP client for Windows and Mac OS is FileZilla. There are also some Web-based FTP clients that you can use with a normal Web browser like Firefox.

full disk encryption

See disk encryption.

gateway

A gateway is a node connecting two networks on the Internet. An important example is a national gateway that requires all incoming or outgoing traffic to go through it.

GNU Privacy Guard

GNU Privacy Guard (GnuPG or GPG) is a GPL-licensed alternative to the PGP suite of cryptographic software. GnuPG is compliant with RFC 4880, which is the current IETF standards track specification of OpenPGP.

GPG

See GNU Privacy Guard.

honeypot

A honeypot is a site that pretends to offer a service in order to entice potential users to use it, and to capture information about them or their activities.

hop

A hop is a link in a chain of packet transfers from one computer to another, or any computer along the route. The number of hops between computers can give a rough measure of the delay (latency) in communications between them. Each individual hop is also an entity that has the ability to eavesdrop on, block, or tamper with communications.

HTTP (Hypertext Transfer Protocol)

HTTP is the fundamental protocol of the World Wide Web, providing methods for requesting and serving Web pages, querying and generating answers to queries, and accessing a wide range of services.

HTTPS (Secure HTTP)

Secure HTTP is a protocol for secure communication using encrypted HTTP messages. Messages between client and server are encrypted in both directions, using keys generated when the connection is requested and exchanged securely. Source and destination IP addresses are in the headers of every packet, so HTTPS cannot hide the fact of the communication, just the contents of the data transmitted and received.

IANA (Internet Assigned Numbers Authority)

IANA is the organization responsible for technical work in managing the infrastructure of the Internet, including assigning blocks of IP addresses for top-level domains and licensing domain registrars for ccTLDs and for the generic TLDs, running the root name servers of the Internet, and other duties.

ICANN (Internet Corporation for Assigned Names and Numbers)

ICANN is a corporation created by the US Department of Commerce to manage the highest levels of the Internet. Its technical work is performed by IANA.

Instant Messaging (IM)

Instant messaging is either certain proprietary forms of chat using proprietary protocols, or chat in general. Common instant messaging clients include MSN Messenger, ICQ, AIM or Yahoo! Messenger.

intermediary

Internet

The Internet is a network of networks interconnected using TCP/IP and other communication protocols.

IP (Internet Protocol) Address

An IP address is a number identifying a particular computer on the Internet. In the previous version 4 of the Internet Protocol an IP address consisted of four bytes (32 bits), often represented as four integers in the range 0-255 separated by dots, such as 74.54.30.85. In IPv6, which the Net is currently switching to, an IP address is four times longer, and consists of 16 bytes (128 bits). It can be written as 8 groups of 4 hex digits separated by colons, such as 2001:0db8:85a3:0000:0000:8a2e:0370:7334.
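As a rough illustration of the two formats described above, the following sketch uses Python's standard ipaddress module to parse both example addresses and report their size in bits. The helper name describe is only an illustrative choice, not part of any tool mentioned in this handbook.

```python
import ipaddress

def describe(text):
    """Return (IP version, size in bits, canonical text form) for an address string."""
    addr = ipaddress.ip_address(text)
    # max_prefixlen is 32 for IPv4 addresses and 128 for IPv6 addresses.
    return addr.version, addr.max_prefixlen, str(addr)

for example in ("74.54.30.85", "2001:0db8:85a3:0000:0000:8a2e:0370:7334"):
    version, bits, canonical = describe(example)
    print(f"IPv{version}, {bits} bits: {canonical}")
```

Note that the canonical form of an IPv6 address drops leading zeros and collapses one run of all-zero groups into "::", so the same address can be written in more than one way.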

IRC (Internet relay chat)

IRC is a more than 20-year-old Internet protocol used for real-time text conversations (chat or instant messaging). There exist several IRC networks – the largest have more than 50 000 users.

ISP (Internet Service Provider)

An ISP (Internet service provider) is a business or organization that provides access to the Internet for its customers.

JavaScript

JavaScript is a scripting language, commonly used in Web pages to provide interactive functions.

KeePass, KeePassX

keychain software

keyword filter

A keyword filter scans all Internet traffic going through a server for forbidden words or terms to block.

latency

Latency is a measure of time delay experienced in a system, here in a computer network. It is measured as the time from the start of packet transmission to the start of packet reception, between one end of the network (e.g. you) and the other (e.g. the Web server). One very powerful way of Web filtering is maintaining a very high latency, which makes many circumvention tools very difficult to use.

log file

A log file is a file that records a sequence of messages from a software process, which can be an application or a component of the operating system. For example, Web servers or proxies may keep log files containing records about which IP addresses used these services when and what pages were accessed.

low-bandwidth filter

A low-bandwidth filter is a Web service that removes extraneous elements such as advertising and images from a Web page and otherwise compresses it, making page download much quicker.

malware

Malware is a general term for malicious software, including viruses, that may be installed or executed without your knowledge. Malware may take control of your computer for purposes such as sending spam. (Malware is also sometimes called badware.)

man in the middle

A man in the middle or man-in-the-middle is a person or computer capturing traffic on a communication channel, especially to selectively change or block content in a way that undermines cryptographic security. Generally the man-in-the-middle attack involves impersonating a Web site, service, or individual in order to record or alter communications. Governments can run man-in-the-middle attacks at country gateways where all traffic entering or leaving the country must pass.

middleman node

A middleman node is a Tor node that is not an exit node. Running a middleman node can be safer than running an exit node because a middleman node will not show up in third parties’ log files. (A middleman node is sometimes called a non-exit node.)

monitor

network address translation (NAT)

NAT is a router function for hiding an address space by remapping. All traffic going out from the router then uses the router’s IP address, and the router knows how to route incoming traffic to the requestor. NAT is frequently implemented by firewalls. Because incoming connections are normally forbidden by NAT, NAT makes it difficult to offer a service to the general public, such as a Web site or public proxy. On a network where NAT is in use, offering such a service requires some kind of firewall configuration or NAT traversal method.

network operator

A network operator is a person or organization who runs or controls a network and thus is in a position to monitor, block, or alter communications passing through that network.

node

A node is an active device on a network. A router is an example of a node. In the Psiphon and Tor networks, a server is referred to as a node.

non-exit node

obfuscation

Obfuscation means obscuring text using easily-understood and easily-reversed transformation techniques that will withstand casual inspection but not cryptanalysis, or making minor changes in text strings to prevent simple matches. Web proxies often use obfuscation to hide certain names and addresses from simple text filters that might be fooled by the obfuscation. As another example, any domain name can optionally contain a final dot, as in “somewhere.com.”, but some filters might search only for “somewhere.com” (without the final dot).
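The final-dot example can be sketched in a few lines of Python. Both filter functions below are hypothetical and deliberately simplistic; they only show why an exact string match is easy to evade and how normalizing the name first closes that particular gap.

```python
def naive_filter_blocks(hostname, blocked=("somewhere.com",)):
    """A simplistic filter that blocks only exact string matches."""
    return hostname in blocked

def normalizing_filter_blocks(hostname, blocked=("somewhere.com",)):
    """A more careful filter: strip the optional final dot and ignore case first."""
    return hostname.rstrip(".").lower() in blocked

for name in ("somewhere.com", "somewhere.com."):
    print(name, naive_filter_blocks(name), normalizing_filter_blocks(name))
```

The naive filter misses the name with the final dot, while the normalizing filter catches both spellings. Real filters vary widely, so this trick may or may not work against any given one.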

open node

An open node is a specific Psiphon node which can be used without logging in. It automatically loads a particular homepage, and presents itself in a particular language, but can then be used to browse elsewhere.

OTR/Off-the-Record messaging

Off-the-Record Messaging, commonly referred to as OTR, is a cryptographic protocol that provides strong encryption for instant messaging conversations.

packet

A packet is a data structure defined by a communication protocol to contain specific information in specific forms, together with arbitrary data to be communicated from one point to another. Messages are broken into pieces that will fit in a packet for transmission, and reassembled at the other end of the link.
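The splitting and reassembly described above can be sketched as follows. This is a simplified model, not a real protocol: the sequence number stands in for the header information that an actual protocol would define, and the function names are illustrative assumptions.

```python
def split_into_packets(message: bytes, payload_size: int):
    """Break a message into (sequence number, payload) pairs of limited size."""
    return [
        (seq, message[start:start + payload_size])
        for seq, start in enumerate(range(0, len(message), payload_size))
    ]

def reassemble(packets):
    """Rebuild the message, even if the packets arrived out of order."""
    return b"".join(payload for _, payload in sorted(packets))

packets = split_into_packets(b"hello, world", payload_size=5)
# Deliver the packets in reverse order to show that ordering is restored.
print(reassemble(reversed(packets)))
```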

password manager

A password manager is software that helps a user organize passwords and PIN codes. The software typically has a local database or a file that holds the encrypted password data for secure logon onto computers, networks, web sites and application data files. KeePass http://keepass.info/ is an example of a password manager.

pastebin

A web service where any kind of text can be dumped and read without registration. All text will be visible publicly.

peer-to-peer

A peer-to-peer (or P2P) network is a computer network between equal peers. Unlike client-server networks, there is no central server, so the traffic is distributed only among the clients. This technology is mostly applied to file-sharing programs like BitTorrent, eMule and Gnutella, but the very old Usenet technology and the VoIP program Skype can also be categorized as peer-to-peer systems.

perfect forward secrecy

In an authenticated key-agreement protocol that uses public key cryptography, perfect forward secrecy (or PFS) is the property that ensures that a session key derived from a set of long-term public and private keys will not be compromised if one of the (long-term) private keys is compromised in the future.

Pretty Good Privacy (PGP)

Pretty Good Privacy (PGP) is a data encryption and decryption computer program that provides cryptographic privacy and authentication for data communication. PGP is often used for signing, encrypting and decrypting texts, e-mails, files, directories and whole disk partitions to increase the security of e-mail communications.

PGP and similar products follow the OpenPGP standard (RFC 4880) for encrypting and decrypting data.

PHP

PHP is a scripting language designed to create dynamic Web sites and web applications. It is installed on a Web server. For example, the popular Web proxy PHProxy uses this technology.

plain text

Plain text is unformatted text consisting of a sequence of character codes, as in ASCII plain text or Unicode plain text.

plaintext

privacy

Protection of personal privacy means preventing disclosure of personal information without the permission of the person concerned. In the context of circumvention, it means preventing observers from finding out that a person has sought or received information that has been blocked or is illegal in the country where that person is at the time.

private key

POP3

Post Office Protocol version 3 is used to receive mail from a server, by default on port 110 with an e-mail program such as Outlook Express or Thunderbird.

port

A hardware port on a computer is a physical connector for a specific purpose, using a particular hardware protocol. Examples are a VGA display port or a USB connector.

Software ports also connect computers and other devices over networks using various protocols, but they exist in software only as numbers. Ports are somewhat like numbered doors into different rooms, each for a special service on a server or PC. They are identified by numbers from 0 to 65535.

protocol

A formal definition of a method of communication, and the form of data to be transmitted to accomplish it. Also, the purpose of such a method of communication. For example, Internet Protocol (IP) for transmitting data packets on the Internet, or Hypertext Transfer Protocol for interactions on the World Wide Web.

proxy server

A proxy server is a server, a computer system or an application program which acts as a gateway between a client and a Web server. A client connects to the proxy server to request a Web page from a different server. Then the proxy server accesses the resource by connecting to the specified server, and returns the information to the requesting site. Proxy servers can serve many different purposes, including restricting Web access or helping users route around obstacles.

Psiphon node

A Psiphon node is a secured web proxy designed to evade Internet censorship. It is developed by Psiphon inc. Psiphon nodes can be open or private.

private node

A private node is a Psiphon node working with authentication, which means that you have to register before you can use it. Once registered, you will be able to send invitations to your friends and relatives to use this specific node.

public key

public key encryption/public-key cryptography

Public-key cryptography refers to a cryptographic system requiring two separate keys, one of which is secret and one of which is public. Although different, the two parts of the key pair are mathematically linked. One key locks or encrypts the plaintext, and the other unlocks or decrypts the ciphertext. Neither key can perform both functions. One of these keys is published or public, while the other is kept private.

Public-key cryptography uses asymmetric key algorithms (such as RSA), and can also be referred to by the more generic term “asymmetric key cryptography.”
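To make the idea of two mathematically linked keys concrete, here is a toy sketch of textbook RSA in Python. The numbers are tiny and chosen purely for illustration; real keys use primes hundreds of digits long together with padding schemes, so nothing like this should ever be used to protect data. The function and variable names are illustrative assumptions.

```python
# Toy numbers only: this is insecure and purely illustrative.
p, q = 61, 53
n = p * q                  # modulus, shared by both keys
phi = (p - 1) * (q - 1)    # used only while generating the key pair
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)

public_key, private_key = (e, n), (d, n)

def encrypt(m, key):
    """Lock a small integer message with the public key."""
    exponent, modulus = key
    return pow(m, exponent, modulus)

def decrypt(c, key):
    """Unlock a ciphertext with the matching private key."""
    exponent, modulus = key
    return pow(c, exponent, modulus)

message = 65
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)
print(message, ciphertext, recovered)
```

The public key (e, n) can be handed to anyone, while the private exponent d has to stay secret: whoever knows d can decrypt everything encrypted with the public key.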

publicly routable IP address

Publicly routable IP addresses (sometimes called public IP addresses) are those reachable in the normal way on the Internet, through a chain of routers. Some IP addresses are private, such as the 192.168.x.x block, and many are unassigned.
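As a small sketch of the distinction, Python's standard ipaddress module can report whether an address falls into one of the reserved private ranges. The helper name is_private_address is an illustrative assumption, and the 192.168.1.10 address is just an arbitrary example from the 192.168.x.x block.

```python
import ipaddress

def is_private_address(text):
    """Return True if the address belongs to a reserved private range."""
    return ipaddress.ip_address(text).is_private

for example in ("192.168.1.10", "74.54.30.85"):
    kind = "private" if is_private_address(example) else "not private"
    print(example, "is", kind)
```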

regular expression

A regular expression (also called a regexp or RE) is a text pattern that specifies a set of text strings in a particular regular expression implementation such as the UNIX grep utility. A text string “matches” a regular expression if the string conforms to the pattern, as defined by the regular expression syntax. In each RE syntax, some characters have special meanings, to allow one pattern to match multiple other strings. For example, the regular expression lo+se matches lose, loose, and looose.
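The lo+se example can be tried out with Python's re module, whose syntax for this simple pattern agrees with grep's extended syntax. This is only a sketch: the helper name matches is an illustrative choice, and fullmatch is used so that the whole string has to conform to the pattern.

```python
import re

# "o+" means one or more occurrences of the letter "o".
pattern = re.compile(r"lo+se")

def matches(word):
    """Return True if the entire word conforms to the pattern lo+se."""
    return pattern.fullmatch(word) is not None

for word in ("lose", "loose", "looose", "lse", "loser"):
    print(word, matches(word))
```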

remailer

An anonymous remailer is a service which allows users to send e-mails anonymously. The remailer receives messages via e-mail and forwards them to their intended recipient after removing information that would identify the original sender. Some also provide an anonymous return address that can be used to reply to the original sender without disclosing her identity. Well-known Remailer services include Cypherpunk, Mixmaster and Nym.

router

A router is a computer that determines the route for forwarding packets. It uses address information in the packet header and cached information on the server to match address numbers with hardware connections.

root name server

A root name server or root server is any of thirteen server clusters that direct traffic to all of the TLDs, as the core of the DNS. The root zone they serve is managed by IANA.

RSS (Really Simple Syndication)

RSS is a method and protocol for allowing Internet users to subscribe to content from a Web page, and receive updates as soon as they are posted.

scheme

On the Web, a scheme is a mapping from a name to a protocol. Thus the HTTP scheme maps URLs that begin with http: to the Hypertext Transfer Protocol. The protocol determines the interpretation of the rest of the URL, so that http://www.example.com/dir/content.html identifies a Web site and a specific file in a specific directory, whereas a URL that begins with mailto: is an e-mail address of a specific person or group at a specific domain.
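As a brief sketch, Python's standard urllib.parse module can split a URL into its scheme and the remainder, which shows how the scheme governs the interpretation of everything after it. The mailto address used here is a made-up placeholder, and scheme_and_rest is an illustrative helper name.

```python
from urllib.parse import urlsplit

def scheme_and_rest(url):
    """Return the scheme plus the host and path parts that follow it."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path

# The mailto address is a hypothetical placeholder, not a real contact.
for url in ("http://www.example.com/dir/content.html",
            "mailto:someone@example.org"):
    print(scheme_and_rest(url))
```

For the http URL the remainder is read as a host name followed by a path; for the mailto URL there is no host part and the remainder is read as an e-mail address.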

shell

A UNIX shell is the traditional command line user interface for the UNIX/Linux operating systems. The most common shells are sh and bash.

SOCKS

A SOCKS proxy is a special kind of proxy server. In the ISO/OSI model it operates between the application layer and the transport layer. The standard port for SOCKS proxies is 1080, but they can also run on different ports. Many programs support a connection through a SOCKS proxy. If a program does not, you can install a SOCKS client like FreeCap, ProxyCap or SocksCap, which can force programs to run through the SOCKS proxy using dynamic port forwarding. It is also possible to use SSH tools such as OpenSSH as a SOCKS proxy server.

screenlogger

A screenlogger is software able to record everything your computer displays on the screen. The main feature of a screenlogger is to capture the screen and log it into files to view at any time in the future. Screenloggers can be used as a powerful monitoring tool. You should be aware of any screenlogger running on any computer you are using, at any time.

script

A script is a program, usually written in an interpreted, non-compiled language such as JavaScript, or in a command interpreter language such as bash. Many Web pages include scripts to manage user interaction with a Web page, so that the server does not have to send a new page for each change.

smartphone

A smartphone is a mobile phone that offers more advanced computing ability and connectivity than a contemporary feature phone, such as Web access and the ability to run elaborate operating systems and built-in applications.

spam

Spam is messages that overwhelm a communications channel used by people, most notably commercial advertising sent to large numbers of individuals or discussion groups. Most spam advertises products or services that are illegal in one or more ways, almost always including fraud. Content filtering of e-mail to block spam, with the permission of the recipient, is almost universally approved of.

SSH (Secure Shell)

SSH or Secure Shell is a network protocol that allows encrypted communication between computers. It was invented as a successor of the unencrypted Telnet protocol and is also used to access a shell on a remote server.

The standard SSH port is 22. It can be used to bypass Internet censorship with port forwarding or it can be used to tunnel other programs like VNC.

SSL (Secure Sockets Layer)

SSL (or Secure Sockets Layer) is one of several cryptographic standards used to make Internet transactions secure. It was used as the basis for the creation of the related Transport Layer Security (TLS). You can easily see if you are using SSL by looking at the URL in your browser (like Firefox or Internet Explorer): if it starts with https instead of http, your connection is encrypted.

steganography

Steganography, from the Greek for hidden writing, refers to a variety of methods of sending hidden messages where not only the content of the message is hidden but the very fact that something covert is being sent is also concealed. Usually this is done by concealing something within something else, like a picture or a text about something innocent or completely unrelated. Unlike cryptography, where it is clear that a secret message is being transmitted, steganography does not attract attention to the fact that someone is trying to conceal or encrypt a message.

subdomain

A subdomain is part of a larger domain. If for example “wikipedia.org” is the domain for the Wikipedia, “en.wikipedia.org” is the subdomain for the English version of the Wikipedia.

threat analysis

A security threat analysis is properly a detailed, formal study of all known ways of attacking the security of servers or protocols, or of methods for using them for a particular purpose such as circumvention. Threats can be technical, such as code-breaking or exploiting software bugs, or social, such as stealing passwords or bribing someone who has special knowledge. Few companies or individuals have the knowledge and skill to do a comprehensive threat analysis, but everybody involved in circumvention has to make some estimate of the issues.

Top-Level Domain (TLD)

In Internet names, the TLD is the last component of the domain name. There are several generic TLDs, most notably .com, .org, .edu, .net, .gov, .mil, .int, and one two-letter country code (ccTLD) for each country in the system, such as .ca for Canada. The European Union also has the two-letter code .eu.

TLS (Transport Layer Security)

TLS or Transport Layer Security is a cryptographic standard based on SSL, used to make Internet transactions secure.

TCP/IP (Transmission Control Protocol over Internet Protocol)

TCP and IP are the fundamental protocols of the Internet, handling packet transmission and routing. There are a few alternative protocols that are used at this level of Internet structure, such as UDP.

Tor bridge

A bridge is a middleman Tor node that is not listed in the main public Tor directory, and so is possibly useful in countries where the public relays are blocked. Unlike the case of exit nodes, IP addresses of bridge nodes never appear in server log files and never pass through monitoring nodes in a way that can be connected with circumvention.

traffic analysis

Traffic analysis is statistical analysis of encrypted communications. In some circumstances traffic analysis can reveal information about the people communicating and the information being communicated.

tunnel

A tunnel is an alternate route from one computer to another, usually including a protocol that specifies encryption of messages.

UDP (User Datagram Protocol)

UDP is an alternate protocol used with IP. Most Internet services can be accessed using either TCP or UDP, but there are some that are defined to use only one of these alternatives. UDP is especially useful for real-time multimedia applications like Internet phone calls (VoIP).

URL (Uniform Resource Locator)

The URL (Uniform Resource Locator) is the address of a Web site. For example, the URL for the World News section of the NY Times is http://www.nytimes.com/pages/world/index.html. Many censoring systems can block a single URL. Sometimes an easy way to bypass the block is to obscure the URL. It is for example possible to add a dot after the site name, so the URL http://en.cship.org/wiki/URL becomes http://en.cship.org./wiki/URL. If you are lucky with this little trick you can access blocked Web sites.

Usenet

Usenet is a more than 20-year-old discussion forum system accessed using the NNTP protocol. The messages are not stored on one server but on many servers which distribute their content constantly. Because of that it is impossible to censor Usenet as a whole; however, access to Usenet can be and often is blocked, and any particular server is likely to carry only a subset of locally-acceptable Usenet newsgroups. Google archives the entire available history of Usenet messages for searching.

VoIP (Voice over Internet Protocol)

VoIP refers to any of several protocols for real-time two-way voice communication on the Internet, which is usually much less expensive than calling over telephone company voice networks. It is not subject to the kinds of wiretapping practiced on telephone networks, but can be monitored using digital technology. Many companies produce software and equipment to eavesdrop on VoIP calls; securely encrypted VoIP technologies have only recently begun to emerge.

VPN (virtual private network)

A VPN (virtual private network) is a private communication network used by many companies and organizations to connect securely over a public network. On the Internet it is usually encrypted, so nobody except the endpoints of the communication can look at the data traffic. There are various standards like IPSec, SSL and TLS. Using a VPN provider is a very fast, secure and convenient method to bypass Internet censorship with little risk, but it generally costs money every month. Note also that the VPN standard PPTP is no longer considered secure and should be avoided.

whitelist

A whitelist is a list of sites specifically authorized for a particular form of communication. Filtering traffic can be done either by a whitelist (block everything but the sites on the list), a blacklist (allow everything but the sites on the list), a combination of the two, or by other policies based on specific rules and conditions.
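The difference between the two policies can be sketched in a few lines of Python. The site names and function names are placeholders chosen for illustration; real filters usually match on more than an exact host name.

```python
def allowed_by_whitelist(site, whitelist):
    """Whitelist policy: block everything except the listed sites."""
    return site in whitelist

def allowed_by_blacklist(site, blacklist):
    """Blacklist policy: allow everything except the listed sites."""
    return site not in blacklist

# Placeholder lists, assumed purely for the sake of the example.
whitelist = {"example.org"}
blacklist = {"example.com"}

for site in ("example.org", "example.com", "example.net"):
    print(site,
          allowed_by_whitelist(site, whitelist),
          allowed_by_blacklist(site, blacklist))
```

A site that appears on neither list, such as example.net here, is blocked under the whitelist policy but allowed under the blacklist policy, which is why whitelisting is the stricter of the two.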

World Wide Web (WWW)

The World Wide Web is the network of hyperlinked domains and content pages accessible using the Hypertext Transfer Protocol and its numerous extensions. The World Wide Web is the most famous part of the Internet.

Webmail

Webmail is e-mail service through a Web site. The service sends and receives mail messages for users in the usual way, but provides a Web interface for reading and managing messages, as an alternative to running a mail client such as Outlook Express or Thunderbird on the user’s computer. For example a popular and free webmail service is https://mail.google.com/

Web proxy

A Web proxy is a script running on a Web server which acts as a proxy/gateway. Users can access such a Web proxy with their normal Web browser (like Firefox) and enter any URL in the form located on that Web site. Then the Web proxy program on the server receives that Web content and displays it to the user. This way the ISP only sees a connection to the server with the Web proxy since there is no direct connection.

WHOIS

WHOIS (who is) is the aptly named Internet function that allows one to query remote WHOIS databases for domain registration information. By performing a simple WHOIS search you can discover when and by whom a domain was registered, contact information, and more.

A WHOIS search can also reveal the name or network mapped to a numerical IP address.