
Bad Egg.

This website's just a number, baby.

Syntax.

Throughout this website, there will be maths and pseudo-code samples.

This is a code block.

To ensure you understand what's going on in them, here's a quick rundown of syntax:

// Text starting with // is a comment.
+ // Addition symbol.
- // Subtraction symbol.
* // Multiplication symbol.
/ // Division symbol. There are two kinds of number: integer (whole numbers only), and decimal (or floating-point). Integer division truncates the decimals.
% // Remainder, or modulo symbol. 3 % 2 is 1, 4 % 2 is 0.
^ // Power symbol; 2^3 is 2 to the power of 3, or 2 cubed.
= // Assignment symbol; a = 1 means a is assigned the value 1.
== // Equality symbol; a == 1 means a is equal to 1 (may be used in conditionals).
!= // Inequality symbol; a != 1 means a is not equal to 1.
< // Less than symbol; a < 10 means a is less than 10.
> // Greater than symbol; a > 10 means a is greater than 10.
<= // Less than or equal to symbol.
>= // Greater than or equal to symbol.

// A keyword declaring a function; a named piece of code which can be 'called'.
// A function may take inputs (arguments), and may return a value.
function

// Declare a function named 'do_something' which takes the following arguments:
// - 'a': A string, or piece of text.
// - 'b': A list of strings.
function do_something(string a, List<string> b)

// A conditional statement.
if x
then
 // Do something if x evaluates to true.
else
 // Do something if x evaluates to false.

// A keyword used to exit the current function.
return

// Exit the current function with a return value of x.
return x 

// Access the value named 'b' attached to the value 'a'.
a.b

// Call the function named 'b' attached to the value 'a'.
// Pass in the number '0', the string "Hello, World!", and the value of 'c' as arguments.
a.b(0, "Hello, World!", c)

// Set the value 'a' to the return value of the function 'b'.
a = b()

[0, 5, 9] // A list containing the values 0, 5 and 9.

Data.

To understand computers, you first need to understand data. All computers do is operate on data; even the instructions which tell a computer what to do are, themselves, data. Below, we'll explore data from the ground up, covering binary, encodings, compression, cryptography, and the instructions telling your computer how to render the text you're reading. As well as theoretical explanations, we'll define and build upon toy examples as demonstrative tools, and explore how these things are applied in the real world, in ways you might use every day.

Binary.

You've heard that computers operate on binary, but what does that actually mean?

Binary is just a number system. You're already familiar with a number system, which you use every day without thinking: decimal, or base-10.

In base-10, numbers are a string of digits from 0-9. The least significant (smallest) digit is on the right, and the most significant (largest) digit is on the left. Consider the number 1,234; the 1 corresponds to the largest numerical value (1,000), and the 4 corresponds to the smallest numerical value (4). Each digit has a corresponding numerical value equal to digit * base^significance, and a number is the sum of the numerical value of its digits:

1 * 10^3 +
2 * 10^2 +
3 * 10^1 +
4 * 10^0 // Note: anything to the power of 0 is 1.
==
1 * 1000 +
2 * 100  +
3 * 10   +
4 * 1
==
1000 +
 200 +
  30 +
   4
==
1,234

Binary, also known as base-2, is exactly the same as base-10, only it uses the digits 0-1. For example, the base-10 number 1,234 represented in binary is:

1 * 2^10 +
0 * 2^9  +
0 * 2^8  +
1 * 2^7  +
1 * 2^6  +
0 * 2^5  +
1 * 2^4  +
0 * 2^3  +
0 * 2^2  +
1 * 2^1  +
0 * 2^0
==
1,024 + 128 + 64 + 16 + 2

Because 1 * n == n, and 0 * n == 0, the above can be simplified, dropping the 0 digits, and the multiplications by 1:

2^10 + 2^7 + 2^6 + 2^4 + 2^1
==
1,024 + 128 + 64 + 16 + 2
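
If you'd like to see the place-value rule as code, here's a quick Python sketch (digits_to_number is my name for it, nothing standard):

```python
def digits_to_number(digits, base):
    """Sum digit * base^significance, most significant digit first."""
    total = 0
    for digit in digits:
        # Shifting the running total left one digit, then adding the next
        # digit, is the same as summing digit * base^significance.
        total = total * base + digit
    return total

# 1,234 in base-10:
assert digits_to_number([1, 2, 3, 4], 10) == 1234
# The same number in binary (base-2):
assert digits_to_number([1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0], 2) == 1234
```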

Just like how in base-10 we can multiply or divide a number by a power of 10 by shifting its digits to the left or right (all numbers have an infinite number of leading zeroes):

001234 * 100 == 123400
123400 / 10  == 012340

We can multiply and divide by a power of 2 in binary by shifting digits:

01010 * 010 == 10100 // 10 * 2 == 20
10100 / 100 == 00101 // 20 / 4 == 5
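
Most programming languages expose these digit shifts directly as the bit-shift operators << and >>; a quick Python check of the sums above:

```python
x = 0b01010                      # 10 in decimal.
assert x << 1 == 0b10100         # Shift left one digit: 10 * 2 == 20.
assert (x << 1) >> 2 == 0b00101  # Shift right two digits: 20 / 4 == 5.
```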

All other calculations work exactly as they would in base-10. For example, to add two binary numbers, just like with base-10, we add each corresponding digit, tracking the carries as we go.

1001 +
0101
==
   1 +
   1 =
  10 // Write the 0, carry the 1.
   0 // Intermediate sum.
  0 +
  0 +
  1 = // The carry from the last column.
  1
  10 // Intermediate sum.
 0 +
 1 =
 1
 110 // Intermediate sum.
1 +
0 =
1
1110
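
The same carry-tracking addition, sketched in Python over lists of bits (add_binary is my name; real code would just use +):

```python
def add_binary(a, b):
    """Add two equal-length bit lists (most significant bit first),
    tracking the carry from right to left."""
    result = [0] * len(a)
    carry = 0
    for i in range(len(a) - 1, -1, -1):
        total = a[i] + b[i] + carry
        result[i] = total % 2   # The digit we keep in this column.
        carry = total // 2      # The carry into the next column.
    return result

# 1001 + 0101 == 1110 (9 + 5 == 14).
assert add_binary([1, 0, 0, 1], [0, 1, 0, 1]) == [1, 1, 1, 0]
```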

We use binary in computers, because representing two values in electronics is easy; a switch can be on, or off. Sometimes, we call a value of 1 high, and a value of 0 low.

Digits in a binary number are called bits, and in computing, bits usually live in groups of eight, called bytes. A byte can represent 2^8 == 256 possible values, most often numbers from 0-255.

There's nothing special about the numbers 2, or 10. We can represent numbers in any base, and programmers often use hexadecimal, or base-16 to represent numbers with digits ranging from 0-F. Binary isn't very ergonomic, because we need a lot of digits to represent relatively small numbers. Base-16 requires fewer digits than binary or decimal, and is easy to convert to and from binary, because 16 is a power of 2.

It's common in programming to prefix numbers with a 'base indicator': 0b11010110 is a binary number, 0xF5 a hex number, and 0d123 a decimal number.

Computer memory operates at a byte level, but a maximum value of 255 is very limiting. If we want to represent a number larger than 255, but we're constrained to representing data in 8-bit bytes, we can combine multiple bytes to represent a base-256 number:

11111111 10000000 00010000
==
255 128 16
==
255 * 256^2 +
128 * 256^1 +
 16 * 256^0
==
16,744,464

This is a simple example of an encoding (for integers of size N).
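
The multi-byte decoding above is the same place-value rule with a base of 256; a Python sketch (bytes_to_int is my name; Python's built-in int.from_bytes does the same job):

```python
def bytes_to_int(bs):
    """Interpret a sequence of 8-bit bytes as one base-256 number,
    most significant byte first."""
    total = 0
    for b in bs:
        total = total * 256 + b
    return total

assert bytes_to_int([255, 128, 16]) == 16_744_464
# The built-in equivalent:
assert int.from_bytes(bytes([255, 128, 16]), "big") == 16_744_464
```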

Encodings.

All data is binary, meaning that all data is just a number, this website included. But how can that be? You're reading text right now.

The magic is encoding; encodings are rulesets which allow computers to interpret numbers as text, audio, video, instructions, and anything else a computer can interpret.

Text.

The text in this website is encoded in UTF-8. Basic characters, like those in the Latin alphabet, Arabic numerals, and common punctuation symbols are each represented with a single byte. Some byte values correspond to control characters, which tell the computer to do something like start a new line of text, or interpret the next 1-3 bytes as part of a multi-byte character.

A simpler text encoding, and the encoding UTF-8 itself was based on, is ASCII. ASCII is a 7-bit character encoding (many older computers used 7-bit bytes) which includes only letters, numbers, basic symbols, and a few control characters like tabs and newlines. Because ASCII defines only 128 symbols (2^7 == 128), UTF-8 is able to maintain compatibility with it without requiring each character to take up an extra byte; UTF-8 uses numbers between 128 and 255 as its multi-byte character markers.
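
You can see this compatibility directly in Python; ASCII characters encode to single bytes below 128, while anything else becomes a multi-byte sequence whose bytes are all 128 or above:

```python
assert "A".encode("utf-8") == b"\x41"           # 65; identical to ASCII.
assert list("é".encode("utf-8")) == [195, 169]  # Two bytes, both >= 128.
assert len("€".encode("utf-8")) == 3            # Three bytes.
```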

Below is a table of ASCII characters, and their corresponding numerical values:

Binary  Decimal Character               Binary  Decimal Character
0000000 0       [Null]                  0100110 38      &
0000001 1       [Start of Heading]      0100111 39      '
0000010 2       [Start of Text]         0101000 40      (
0000011 3       [End of Text]           0101001 41      )
0000100 4       [End of Transmission]   0101010 42      *
0000101 5       [Enquiry]               0101011 43      +
0000110 6       [Ack]                   0101100 44      ,
0000111 7       [Bell]                  0101101 45      -
0001000 8       [Backspace]             0101110 46      .
0001001 9       [Horizontal Tab]        0101111 47      /
0001010 10      [Line Feed]             0110000 48      0
0001011 11      [Vertical Tab]          ...
0001100 12      [Form Feed]             0111001 57      9
0001101 13      [Carriage Return]       0111010 58      :
0001110 14      [Shift Out]             0111011 59      ;
0001111 15      [Shift In]              0111100 60      <
0010000 16      [Data Link Escape]      0111101 61      =
0010001 17      [Device Ctrl 1]         0111110 62      >
0010010 18      [Device Ctrl 2]         0111111 63      ?
0010011 19      [Device Ctrl 3]         1000000 64      @
0010100 20      [Device Ctrl 4]         1000001 65      A
0010101 21      [Neg Ack]               ...
0010110 22      [Sync Idle]             1011010 90      Z
0010111 23      [End of Transmit Block] 1011011 91      [
0011000 24      [Cancel]                1011100 92      \
0011001 25      [End of Medium]         1011101 93      ]
0011010 26      [Substitute]            1011110 94      ^
0011011 27      [Escape]                1011111 95      _
0011100 28      [File Separator]        1100000 96      `
0011101 29      [Group Separator]       1100001 97      a
0011110 30      [Record Separator]      ...
0011111 31      [Unit Separator]        1111010 122     z
0100000 32      [Space]                 1111011 123     {
0100001 33      !                       1111100 124     |
0100010 34      "                       1111101 125     }
0100011 35      #                       1111110 126     ~
0100100 36      $                       1111111 127     [Delete]
0100101 37      %

The string "Hello, world!" can therefore be transformed into the bytes 72 101 108 108 111 44 32 119 111 114 108 100 33, and as long as anyone who wants to read it knows to expect ASCII-encoded text, they can decode it with the same mapping.
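
Python's ord and chr expose exactly this mapping, so the round trip is a one-liner each way:

```python
text = "Hello, world!"
codes = [ord(c) for c in text]  # Each character's numerical value.
assert codes == [72, 101, 108, 108, 111, 44, 32, 119, 111, 114, 108, 100, 33]
# Anyone expecting ASCII can decode with the reverse mapping:
assert "".join(chr(n) for n in codes) == text
```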

Images.

Images can be represented in a similar manner; if I define a pixel to be three bytes wide, with a red, green and blue value in that order, I can encode an image as simply as:

Width // 4 byte integer.
Pixels... // String of 3-byte pixels.

To interpret such an image, we read the width, then fill in each pixel in a row of the defined width by reading the next three bytes, moving on to the next row when we've reached the defined width. For example, an 8x3 pixel image with a row of red pixels, then a row of black pixels, and finally a row of white pixels, would be represented (ignoring whitespace and comments) as:

0 0 0 8 // Width of 8 px.
255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000
000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000
255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255

A basic image format like this is known as a bitmap; most image encodings include in-built compression, as well as additional metadata such as pixel format definitions (eg: the ability to specify that its pixels are monochrome, requiring only a single bit per pixel).
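
A decoder for this toy bitmap format might look like the following Python sketch (decode_bitmap is my name; the byte layout is the one defined above):

```python
def decode_bitmap(data):
    """Read a 4-byte big-endian width, then group the remaining
    3-byte (red, green, blue) pixels into rows of that width."""
    width = int.from_bytes(data[:4], "big")
    pixels = [tuple(data[i:i + 3]) for i in range(4, len(data), 3)]
    return [pixels[r:r + width] for r in range(0, len(pixels), width)]

# The 8x3 red/black/white image from above:
data = bytes([0, 0, 0, 8]) \
    + bytes([255, 0, 0] * 8 + [0, 0, 0] * 8 + [255, 255, 255] * 8)
rows = decode_bitmap(data)
assert len(rows) == 3
assert rows[0][0] == (255, 0, 0) and rows[2][7] == (255, 255, 255)
```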

Videos.

Encodings can be layered on top of one another to represent more complex datatypes. The above image format consists of an image metadata encoding (just the width, in this case), a pixel format (three bytes per pixel, with red, green and blue values in that order), and an image content encoding (the image is represented with a single string of pixels, from left to right, top to bottom).

We could define a video encoding which reuses the pixel and image content encodings for frames, with a new header (the metadata part) as follows:

Width // 4 byte integer.
Height // 4 byte integer.
Frames per second // 2 byte integer.
Pixels... // String of 3-byte pixels.

In this encoding, the pixel string can be broken into rows according to the width, then separate images (or frames) according to the height. The frames are played back at the specified framerate.
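
Breaking the pixel string into frames is just one more level of grouping; a Python sketch (split_frames is my name):

```python
def split_frames(pixels, width, height):
    """Break a flat pixel list into frames of width * height pixels,
    each frame itself broken into rows."""
    per_frame = width * height
    frames = []
    for f in range(0, len(pixels), per_frame):
        frame = pixels[f:f + per_frame]
        frames.append([frame[r:r + width] for r in range(0, per_frame, width)])
    return frames

# An 8x3 video with 3 frames has 72 pixels in total.
frames = split_frames(list(range(72)), 8, 3)
assert len(frames) == 3
assert frames[1][0][0] == 24  # Frame 2 starts at pixel 24.
```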

A three second video of the previously defined image, with its rows rotating vertically, would be defined as:

0 0 0 8 // Width of 8 px.
0 0 0 3 // Height of 3 px.
0 1 // 1 frame per second.
// Frame 1:
255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000
000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000
255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255
// Frame 2:
000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000
255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255
255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000
// Frame 3:
255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255
255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000
000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000

Scaled up, and looping, the above defined video looks like:

These are, of course, basic encodings which aren't very practical for real-world use. However, we could achieve something much more practical by simply layering on additional encodings, such as an audio encoding, and an encoding layer for compression.

Compression.

Compression is commonly used as an encoding layer to reduce the size of files. Our previously defined uncompressed video format requires 10 bytes for metadata, then three bytes per pixel. A 60 second 1920x1080p video running at 30fps would require 10 + 3 * 1920 * 1080 * 30 * 60 == 11,197,440,010B == 11.2GB! That's a bitrate of ~187MB/s, which is a lot; you'd need ~1,493Mb/s (megabits per second) of bandwidth to stream such a video over the internet, which you almost certainly don't have.

As you can hopefully see, compression is necessary for modern computing; the internet as we know it couldn't function without it.

There are many compression techniques, and they can be applied in multiple ways. Compression is a gigantic topic, so today, we'll just look at implementing two basic compression types on the simple video we used to demonstrate our video encoding:

0 0 0 8 // Width of 8 px.
0 0 0 3 // Height of 3 px.
0 1 // 1 frame per second.
// Frame 1:
255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000
000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000
255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255
// Frame 2:
000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000
255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255
255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000
// Frame 3:
255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255  255 255 255
255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000  255 000 000
000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000  000 000 000

Dictionaries.

First, we'll look at dictionary compression. Our video only has 5 unique byte values:

0, 1, 3, 8, 255

To represent 5 values, we only need three bits (0b111 == 0d7). We can define our available values at the start, then encode the actual data as a bit string:

005 // The length of our dictionary.
000
001
003
008
255 // The last entry in our dictionary; everything after has 3-bit 'words'.
// Metadata:
0 0 0 3 // Width (dictionary index 3 == 8).
0 0 0 2 // Height (dictionary index 2 == 3).
0 1 // Framerate.
// Frame 1:
4 0 0  4 0 0  4 0 0  4 0 0  4 0 0  4 0 0  4 0 0  4 0 0
0 0 0  0 0 0  0 0 0  0 0 0  0 0 0  0 0 0  0 0 0  0 0 0
4 4 4  4 4 4  4 4 4  4 4 4  4 4 4  4 4 4  4 4 4  4 4 4
// Frame 2:
0 0 0  0 0 0  0 0 0  0 0 0  0 0 0  0 0 0  0 0 0  0 0 0
4 4 4  4 4 4  4 4 4  4 4 4  4 4 4  4 4 4  4 4 4  4 4 4
4 0 0  4 0 0  4 0 0  4 0 0  4 0 0  4 0 0  4 0 0  4 0 0
// Frame 3:
4 4 4  4 4 4  4 4 4  4 4 4  4 4 4  4 4 4  4 4 4  4 4 4
4 0 0  4 0 0  4 0 0  4 0 0  4 0 0  4 0 0  4 0 0  4 0 0
0 0 0  0 0 0  0 0 0  0 0 0  0 0 0  0 0 0  0 0 0  0 0 0

Our original encoding required 10 bytes + 3 colours * 8px * 3px * 3fr == 226 bytes == 1808 bits to encode our sample video. Now, we have 6 bytes, or 48 bits of dictionary data, then our actual video is encoded in 3-bit chunks for 6 bytes * 8 bits + (10 words + 3 colours * 8px * 3px * 3fr) * 3 bits == 726 bits == 90.75 bytes. We have to round up, as the file will live in memory and/or on a drive which can only address whole bytes, but at 91 bytes we've more than halved the size of our video.
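
The dictionary-building and size arithmetic above can be checked in Python (dictionary_compress is my name; it reports sizes rather than emitting a real bit string):

```python
def dictionary_compress(values):
    """Map each byte value to an index into a dictionary of the unique
    values, and report the compressed size in bits."""
    dictionary = sorted(set(values))
    # Bits needed per word to index every dictionary entry.
    bits_per_word = max(1, (len(dictionary) - 1).bit_length())
    words = [dictionary.index(v) for v in values]
    # 1 length byte + 1 byte per entry, then one fixed-width word per value.
    size_bits = (1 + len(dictionary)) * 8 + len(words) * bits_per_word
    return dictionary, words, size_bits

# The 226 bytes of our sample video (metadata, then pixels for 3 frames):
video = [0, 0, 0, 8, 0, 0, 0, 3, 0, 1] \
    + ([255, 0, 0] * 8 + [0] * 24 + [255] * 24) * 3
dictionary, words, size_bits = dictionary_compress(video)
assert dictionary == [0, 1, 3, 8, 255]
assert size_bits == 726  # 48 bits of dictionary + 226 words * 3 bits each.
```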

You might notice that we still have a lot of repetition! Our pixel data still uses 3 words per pixel, but we only have three unique pixels:

4 0 0 // 255 000 000; red.
0 0 0 // 000 000 000; black.
4 4 4 // 255 255 255; white.

If we tweak our video format to have a second, internal pixel dictionary, we can save even more space:

005 // The length of our dictionary.
000
001
003
008
255 // The last entry in our dictionary; everything after has 3-bit 'words'.
// Metadata:
0 0 0 3 // Width (dictionary index 3 == 8).
0 0 0 2 // Height (dictionary index 2 == 3).
0 1 // Framerate.
2 // Pixel dictionary length (index 2 == 3); there are 3 pixels in it.
0 0 0
4 0 0
4 4 4 // We know we're at the end of our pixel dictionary; everything after is 2-bit 'words'.
// Frame 1:
1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0
2 2 2 2 2 2 2 2
// Frame 2:
0 0 0 0 0 0 0 0
2 2 2 2 2 2 2 2
1 1 1 1 1 1 1 1
// Frame 3:
2 2 2 2 2 2 2 2
1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0

Now we have 48 bits in our first dictionary, 30 bits in our video metadata, 30 bits in our pixel dictionary, and 8px * 3px * 2 bits * 3fr == 144 bits in our frame data. Our video only requires 252 bits, or 32 bytes on disk!

Run Length Encoding.

Our video is pretty small at this point, but we still have a fair bit of duplication in our frames! What if each pixel was represented by a 2 bit pixel identifier, and a 3 bit repetition count (sourced from our main dictionary)?

01 011 00 011 10 011 // Frame 1
00 011 10 011 01 011 // Frame 2
10 011 01 011 00 011 // Frame 3

In this case, our frame data requires only 45 bits! The total size of our video is just 153 bits, or 20 bytes.

This technique is called run-length encoding. We could even define a run-length dictionary, rather than falling back on our first dictionary for the run lengths, but in this case, all our run lengths are the same, and we'd end up eliding the run length altogether with a 1-length dictionary.
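
A run-length encoder and decoder over pixel identifiers takes only a few lines of Python (names are mine; real formats also cap run lengths at what the count field can hold):

```python
def rle_encode(pixels):
    """Collapse consecutive runs of equal pixel ids into (id, count) pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1] = (p, runs[-1][1] + 1)  # Extend the current run.
        else:
            runs.append((p, 1))              # Start a new run.
    return runs

def rle_decode(runs):
    return [p for p, count in runs for _ in range(count)]

# Frame 1: a row of red (1), a row of black (0), a row of white (2).
frame1 = [1] * 8 + [0] * 8 + [2] * 8
assert rle_encode(frame1) == [(1, 8), (0, 8), (2, 8)]
assert rle_decode(rle_encode(frame1)) == frame1
```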

Format Specificity.

Practically all widely used file formats include compression. The examples we tried here are known as 'lossless' compression, because we can reverse the compression process, and get back the original data. This form of compression is used by file formats such as PNG and ZIP, while lossy compression allows for some bounded simplification of data to save additional space, such as in JPEG or MP4.

While some techniques can be applied broadly, the best compression algorithm will depend on the data being compressed; data with lots of long runs will benefit greatly from run-length encoding, but data with few, or short runs, might take up more space when run-length encoded.

Compression algorithms which can make use of dictionary or run-length encoding can include complex logic to decide the size of dictionary elements, when (or whether) to split dictionaries up, where to apply run-length encoding, and more. Many more complicated (but more effective) algorithms include a trade-off between compression speed, and compression ratio (how much the data ends up being compressed), because trying many variations of these parameters can become computationally expensive fast.

Encryption.

Another common kind of encoding layer is encryption; encryption takes advantage of the fact that all data is just a number by performing mathematical (and logical) operations on data to render it unreadable to anyone who doesn't have the necessary secret key (which is, itself, just a number).

Symmetric Encryption.

The simpler form of encryption is the one you're likely already familiar with, and imagine when you hear the word 'encryption'.

In symmetric encryption schemes, you have a single key. You use that key to encrypt data (perform a complex set of mathematical and logical operations on it), and you need that same key to decrypt it. Unencrypted data is often referred to as plaintext, and its encrypted form is often referred to as cyphertext.

The standard symmetric encryption algorithm is AES256. This scheme uses a random 256-bit key (a 32-byte long integer) to encrypt and decrypt data. This scheme is quantum-resistant; even the best cryptographically relevant quantum computers (which are still only theoretical and may never materialize) could only roughly halve the bitwise security of AES256. This would mean a 256-bit AES key would have 128-bit security.

You might have caught that slightly odd wording. In cryptography, we use bits to refer to data, as you might be familiar with, but we also use it to refer to computational complexity. In rough terms, an algorithm with 128 bits of security should require, on average, 2^128 operations to break (often by pseudo-randomly guessing keys until you find the correct one).

An 'operation' can be complex, or simple. 128 bits of security is generally considered to be cryptographically secure (meaning mathematically implausible to break), even where the operation in question is computationally simple. If you designed a computer capable of executing 1 trillion of those operations per second, crammed 1 billion of those computers into a data centre, and built 1 million of those data centres, it would take 2^128 / (1,000,000,000,000 ops/s * 1,000,000,000 computers * 1,000,000 datacentres * 60 seconds * 60 minutes * 24 hours * 365.2425 days) ~= 10,783 years on average to attack one key, using all quadrillion of those computers at full utilization the entire time.
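
The arithmetic above, spelled out in Python:

```python
# 1T ops/s per computer * 1B computers per DC * 1M data centres.
ops_per_second = 10**12 * 10**9 * 10**6
seconds_per_year = 60 * 60 * 24 * 365.2425
years = 2**128 / (ops_per_second * seconds_per_year)
assert 10_700 < years < 10_800  # ~10,783 years.
```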

For every bit of security, the computational work doubles; at 129 bits, you'd either need twice as much time, or twice as much compute to manage it in the same time. At 140 bits, it takes 2^(140-128) == 4,096 times as much work, and at 160 bits, 4,294,967,296 (4.3 billion) times as much!

In case that wasn't a sufficient demonstration of the power of exponentials, consider going the other way; at just 20 bits less security (108 bits), it'd only take about 3.75 days for all that compute to crack your key! That's still a mind-boggling amount of compute, mind you; the entire Bitcoin network has done about 96 bits of work since the first block, which is less than 1/4000th of the work required to break that 'weak' 108-bit key.

A very simple (though poor) demonstrative cypher could perform bit and byte rotations on a fixed-size block of input data, with rotation distances determined by the key's bits:

1000111010011100 // 16-Bit Key
072 101 108 108 111 044 032 119 111 114 108 100 033 // ASCII Plaintext of 'Hello, world!'
072 101 108 108 111 044 032 119 111 114 108 100 033 000 000 000 // 0-padded plaintext for length of 16.
// Padded plaintext bits.
01001000 01100101 01101100 01101100
01101111 00101100 00100000 01110111
01101111 01110010 01101100 01100100
00100001 00000000 00000000 00000000
// Right-rotate bits in each byte by bits 1-3 of key (4).
10000100 01010110 11000110 11000110
11110110 11000010 00000010 01110111
11110110 00100111 11000110 01000110
00010010 00000000 00000000 00000000
// Right-rotate bits in even bytes (1-indexed) by bits 4-6 of key (3).
10000100 11001010 11000110 11011000
11110110 01011000 00000010 11101110
11110110 11100100 11000110 11001000
00010010 00000000 00000000 00000000
// Right-rotate bits in odd bytes (1-indexed) by bits 7-9 of key (5).
00100100 11001010 00110110 11011000
10110111 01011000 00010000 11101110
10110111 11100100 00110110 11001000
10010000 00000000 00000000 00000000
// Right-rotate bits in % 3 == 0 bytes (1-indexed) by bits 10-12 of key (1).
00100100 11001010 00011011 11011000
10110111 00101100 00010000 11101110
11011011 11100100 00110110 01100100
10010000 00000000 00000000 00000000
// Right-rotate bits in % 3 == 1 bytes (1-indexed) by bits 12-14 of key (7).
01001000 11001010 00011011 10110001
10110111 00101100 00100000 11101110
11011011 11001001 00110110 01100100
00100001 00000000 00000000 00000000
// Right-rotate bits in % 3 == 2 bytes (1-indexed) by bits 14-16 of key (4).
01001000 10101100 00011011 10110001
01111011 00101100 00100000 11101110
11011011 11001001 01100011 01100100
00100001 00000000 00000000 00000000
// Right-rotate bytes by bits 1-4 of key (8).
11011011 11001001 01100011 01100100
00100001 00000000 00000000 00000000
01001000 10101100 00011011 10110001
01111011 00101100 00100000 11101110
// Right-rotate bytes in groups of 8 by bits 5-7 of key (7).
11001001 01100011 01100100 00100001
00000000 00000000 00000000 11011011
10101100 00011011 10110001 01111011
00101100 00100000 11101110 01001000 
// Right-rotate bytes in groups of 4 by bits 8-9 of key (1).
01100011 01100100 00100001 11001001
00000000 00000000 11011011 00000000
00011011 10110001 01111011 10101100
00100000 11101110 01001000 00101100
// Right-rotate bytes in groups of 2 by bit 10 of key (0).
01100011 01100100 00100001 11001001
00000000 00000000 11011011 00000000
00011011 10110001 01111011 10101100
00100000 11101110 01001000 00101100
// Right-rotate bytes in groups of 4 by bits 11-12 of key (1).
11001001 01100011 01100100 00100001
00000000 00000000 00000000 11011011
10101100 00011011 10110001 01111011
00101100 00100000 11101110 01001000
// Right-rotate bytes by bits 13-16 of key (12).
00000000 00000000 00000000 11011011
10101100 00011011 10110001 01111011
00101100 00100000 11101110 01001000
11001001 01100011 01100100 00100001

The above cypher is not very secure; the most obvious flaws are its key length, and its preservation of high/low bit distributions; a zero byte will always be zero, a byte with a value of 255 will always be 255. It is, as a result, more vulnerable to statistical analysis than it should be; if you knew you were looking for ASCII text, you might be able to optimize over a bruteforce search (trying every possible key) by considering the number of high bits in each byte, and performing a dictionary attack.

It is, however, a useful demonstration; the operations involved are simple, and to decrypt all we need to do is reverse the order and direction of operations:

// Cyphertext Bits
00000000 00000000 00000000 11011011
10101100 00011011 10110001 01111011
00101100 00100000 11101110 01001000
11001001 01100011 01100100 00100001
1000111010011100 // 16-Bit Key
// Left-rotate bytes by bits 13-16 of key (12).
11001001 01100011 01100100 00100001
00000000 00000000 00000000 11011011
10101100 00011011 10110001 01111011
00101100 00100000 11101110 01001000
// Left-rotate bytes in groups of 4 by bits 11-12 of key (1).
01100011 01100100 00100001 11001001
00000000 00000000 11011011 00000000
00011011 10110001 01111011 10101100
00100000 11101110 01001000 00101100
// Left-rotate bytes in groups of 2 by bit 10 of key (0).
01100011 01100100 00100001 11001001
00000000 00000000 11011011 00000000
00011011 10110001 01111011 10101100
00100000 11101110 01001000 00101100
// Left-rotate bytes in groups of 4 by bits 8-9 of key (1).
11001001 01100011 01100100 00100001
00000000 00000000 00000000 11011011
10101100 00011011 10110001 01111011
00101100 00100000 11101110 01001000 
// Left-rotate bytes in groups of 8 by bits 5-7 of key (7).
11011011 11001001 01100011 01100100
00100001 00000000 00000000 00000000
01001000 10101100 00011011 10110001
01111011 00101100 00100000 11101110
// Left-rotate bytes by bits 1-4 of key (8).
01001000 10101100 00011011 10110001
01111011 00101100 00100000 11101110
11011011 11001001 01100011 01100100
00100001 00000000 00000000 00000000
// Left-rotate bits in % 3 == 2 bytes (1-indexed) by bits 14-16 of key (4).
01001000 11001010 00011011 10110001
10110111 00101100 00100000 11101110
11011011 11001001 00110110 01100100
00100001 00000000 00000000 00000000
// Left-rotate bits in % 3 == 1 bytes (1-indexed) by bits 12-14 of key (7).
00100100 11001010 00011011 11011000
10110111 00101100 00010000 11101110
11011011 11100100 00110110 01100100
10010000 00000000 00000000 00000000
// Left-rotate bits in % 3 == 0 bytes (1-indexed) by bits 10-12 of key (1).
00100100 11001010 00110110 11011000
10110111 01011000 00010000 11101110
10110111 11100100 00110110 11001000
10010000 00000000 00000000 00000000
// Left-rotate bits in odd bytes (1-indexed) by bits 7-9 of key (5).
10000100 11001010 11000110 11011000
11110110 01011000 00000010 11101110
11110110 11100100 11000110 11001000
00010010 00000000 00000000 00000000
// Left-rotate bits in even bytes (1-indexed) by bits 4-6 of key (3).
10000100 01010110 11000110 11000110
11110110 11000010 00000010 01110111
11110110 00100111 11000110 01000110
00010010 00000000 00000000 00000000
// Left-rotate bits in each byte by bits 1-3 of key (4).
01001000 01100101 01101100 01101100
01101111 00101100 00100000 01110111
01101111 01110010 01101100 01100100
00100001 00000000 00000000 00000000
072 101 108 108 111 044 032 119 111 114 108 100 033 000 000 000 // ASCII Bytes
Hello, world!\0\0\0 // ASCII Value (note: \0 is the ASCII null character (0 value) escape code).

The operations performed during encryption must be reversible; as long as this criterion is met, to decrypt you need only execute the inverse operations in reverse order, as above. Addition and subtraction are reversible (though in the case of fixed-size integers they must wrap on overflow, eg: 5 - 10 in an unsigned 1-byte integer (values 0-255) must yield 251, and 251 + 10 must yield 5), but multiplication and division, in a fixed-size integer context, are not. The former operations could, however, be used to resolve the issue of high-bit counts not changing in the above cypher.
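
Here's a much-simplified Python sketch of a reversible rotation cypher in the same spirit (not the exact cypher above; just each byte rotated by a key-derived distance, to show that running the inverse operations recovers the plaintext):

```python
def rotate_right(byte, n):
    """Rotate the 8 bits of a byte right by n places."""
    n %= 8
    return ((byte >> n) | (byte << (8 - n))) & 0xFF

def rotate_left(byte, n):
    return rotate_right(byte, -n)  # A left rotation is a negative right one.

def encrypt(data, key):
    """Toy cypher: rotate each byte by a distance derived from the key
    and the byte's position. For demonstration only; trivially breakable."""
    return bytes(rotate_right(b, key + i) for i, b in enumerate(data))

def decrypt(data, key):
    # The inverse operation, applied with the same (symmetric) key.
    return bytes(rotate_left(b, key + i) for i, b in enumerate(data))

plaintext = b"Hello, world!"
cyphertext = encrypt(plaintext, 0b1000111010011100)
assert cyphertext != plaintext
assert decrypt(cyphertext, 0b1000111010011100) == plaintext
```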

In the case of symmetric encryption, operations are reversible with the same key, but that doesn't have to be the case...

Asymmetric Encryption.

In asymmetric encryption schemes, you have two keys: a public key, and a private key. Your private key is just a large number, much like a symmetric key, but the public key is derived from your private key using complex mathematics which we won't be covering today.

If you encrypt data with my public key, only I, with the corresponding private key, can decrypt it. This allows us to communicate secret information without a shared secret key, as long as we know one another's public keys (which are, as the name suggests, non-privileged, and can be shared around freely).
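
A toy Python sketch of the idea, using RSA-style maths on deliberately tiny numbers (real keys are thousands of bits; these values are hopelessly insecure, for demonstration only):

```python
# The primes p and q are the secret behind the private key.
p, q = 61, 53
n = p * q                          # 3233; shared by both keys.
e = 17                             # (n, e) is the public key.
d = pow(e, -1, (p - 1) * (q - 1))  # (n, d) is the private key.

message = 42                       # A message is just a number below n.
cyphertext = pow(message, e, n)    # Encrypt with the public key...
assert pow(cyphertext, d, n) == message  # ...decrypt with the private key.
```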

Asymmetric encryption is more computationally expensive than symmetric encryption and, as such, it's commonplace to establish a 'session key', which is a symmetric encryption key shared secretly between communicating parties by encrypting it with their public keys.

There's another interesting thing you can do with an asymmetric scheme: if I encrypt data with my private key, anybody with my public key can decrypt it. This might not seem particularly useful at first glance, because my public key is public, and so the contents of a message encrypted with my private key can't be a secret, but this is roughly the basis of cryptographic signatures.

Say you have my public key, which you attained by trusted means (possibly we exchanged public keys in person last time we met up), and you want to talk to me. We are, however, communicating over an insecure channel, like an IRC chatroom; anybody could pretend to be me, and you'd simply have to guess whether it was really me, something that might be harder nowadays given the prevalence of generative 'AI'.

Even if we each have an account on a common website or application, and we exchanged our usernames, somebody could gain unauthorized access to my account, or anybody with administrative permissions on the site could hijack my account and pretend to be me.

Enter cryptographic signatures: you can ask me to 'sign' a message with my private key. I encrypt it with my private key, and if, when you attempt to decrypt it with my public key, you get a well-formed message (which matches what you were expecting), you know that I must possess the corresponding private key.

This isn't a 100% guarantee that I am who I say I am, however. It's a cryptographic guarantee that the person you're communicating with has access to the private key corresponding to the public key you're verifying against. I can generate a private key myself, and so I don't need to trust a website administrator not to pretend to be me; they can't get my private key. I can keep my private key secure, on my own computer, even in an airgapped (not internet connected) environment to avoid it being stolen by malware. I can use full-disk (symmetric) encryption to prevent anybody who gains access to my computer being able to steal my private key. As long as I'm successful, any message signed by that key can be trusted.

In the case that my private key is compromised, I can sign and widely distribute a message saying as much; anybody can verify that message against my public key, and know that somebody holding the private key signed it. Even if it wasn't me, the key would have to be compromised for such a signature to exist! I couldn't, however, share my new public key in that signed message; you'd have no way of knowing whether that key belongs to the real me, or someone who compromised my old one. We'd instead have to go through a trusted process (preferably meet in person) to exchange public keys again.
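To make the sign-and-verify round trip concrete, here's a toy RSA sketch in Python. The numbers are the classic textbook ones and far too small to be secure; `sign` and `verify` are illustrative names, not a real library API.

```python
# Toy RSA signatures: 'encrypting' with the private key is signing,
# 'decrypting' with the public key is verification.
# WARNING: textbook numbers, wildly insecure; for illustration only.
p, q = 61, 53
n = p * q                 # public modulus
phi = (p - 1) * (q - 1)   # used to derive the private exponent
e = 17                    # public exponent; (e, n) is the public key
d = pow(e, -1, phi)       # private exponent; (d, n) is the private key

def sign(message: int, d: int, n: int) -> int:
    # Only the holder of d can produce this value.
    return pow(message, d, n)

def verify(message: int, signature: int, e: int, n: int) -> bool:
    # Anybody with the public key can check it.
    return pow(signature, e, n) == message

m = 1234
sig = sign(m, d, n)
print(verify(m, sig, e, n))      # True
print(verify(m + 1, sig, e, n))  # False: the message was tampered with
```

Real signature schemes sign a hash of the message and add padding, but the principle is exactly this asymmetry: only the private key can produce a value the public key accepts.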

Asymmetric cryptography is used in many ways beyond simple communication; it enables you to browse the internet with more privacy than you'd otherwise have, and is the basis for Bitcoin.

HTTPS.

When you visit a website using the HTTPS protocol (the URL starts with https://), only your computer and the website's server can read the messages being exchanged between the two of you. Your ISP can see which server you're connecting to, but they can't see what you're doing on that website (as long as the website isn't sharing that information with them!).

This is only possible because of asymmetric cryptography. The HTTPS protocol relies on cryptographic signatures, asymmetric encryption, and symmetric encryption (for session keys, as previously described).

TODO: Overview of CAs.

TODO: Overview of DH key exchange.

Bitcoin.

Bitcoin doesn't have accounts; it has UTXOs. A UTXO is an unspent transaction output: transactions have inputs, which they spend, and outputs, which they send coin to. A UTXO is literally the output of a past transaction which hasn't yet been used as an input to any other transaction.

A transaction output pays an amount to an address. An address is derived from a 'spend script', most commonly a hash of a public key whose corresponding private key must sign any transaction which uses the UTXO as an input (spending it).

To spend a UTXO, I must have the private key whose public key matches the key defined in the spend script (hash). I derive the public key, include it in the transaction, and sign the transaction with my private key. Everybody else on the network checks that the transaction is signed with the public key the transaction reveals, then independently checks that the public key is actually the one which the UTXO was locked to.
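The check described above could be sketched like this in Python. This is a simplification, not Bitcoin's real rules: a real P2PKH address uses SHA-256 followed by RIPEMD-160 plus an encoding step, and the signature check (a plain boolean here) is a full ECDSA or Schnorr verification.

```python
import hashlib

def address_from_pubkey(pubkey: bytes) -> bytes:
    # Stand-in for the spend script hash; one SHA-256 instead of Bitcoin's
    # SHA-256 + RIPEMD-160 + encoding.
    return hashlib.sha256(pubkey).digest()

def can_spend(utxo_address: bytes, revealed_pubkey: bytes, signature_ok: bool) -> bool:
    # 1. The revealed public key must hash to the address the UTXO is locked to.
    # 2. The transaction must be signed by the matching private key
    #    (modelled as a boolean here; real nodes verify the signature itself).
    return address_from_pubkey(revealed_pubkey) == utxo_address and signature_ok

my_pubkey = b"my public key bytes"
utxo_address = address_from_pubkey(my_pubkey)
print(can_spend(utxo_address, my_pubkey, signature_ok=True))   # True
print(can_spend(utxo_address, b"attacker's key", True))        # False
```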

This means that ownership of Bitcoin is dictated solely by control of the private key(s) from which any address containing Bitcoin was derived. If somebody steals my private key, they can spend my Bitcoin, and the network has no way of knowing the difference. If I don't have a private key (eg, I have an account on an exchange), I don't own Bitcoin; I own an IOU. However, if I can protect my private key (eg: by airgapping it; signing data doesn't require an internet connection), nobody can gain access to my coins unless I choose to give up my private key.

Onion Nets.

TODO

Instructions.

Your computer has three main components: the CPU, the RAM, and storage.

The CPU is the part which actually executes instructions, while the memory (RAM and storage both) stores the data the CPU operates on, including the instructions it's processing.

The CPU itself has some very small, very fast regions of memory, known as the cache, and registers. One of these registers is the Instruction Pointer: a number which points to the location of the next byte of code to execute in RAM. Every time an instruction is executed, the instruction pointer is updated.

CPUs operate on 'cycles'. Each instruction takes some number of cycles to execute, quite often just one. Your CPU's clock speed tells you how many cycles each core in your CPU performs per second; a speed of 4.0GHz means each core performs 4 billion cycles per second, and can therefore execute up to 4 billion instructions per second.

Each core in a CPU has its own registers and cache (note: there are different kinds of cache, but the L1 Cache is usually core-specific), but the RAM and storage are shared between all the cores, and must be synchronized. To avoid having to wait for other cores to finish what they're doing, cores load memory they think they'll need soon, including code, into their cache. Regions of memory can be marked as read-only, in which case no synchronization of that region is necessary, because no other core can modify it.

But what is an instruction? Well, it depends on your instruction set. Most computers today use either the x86_64 instruction set (Intel and AMD), the AArch64 instruction set (mobile devices, and other ARM devices like newer Apple computers), or RISC-V (most common in embedded devices and small servers). The instruction set is just an encoding, like ASCII, but rather than defining which bytes correspond to which character, it defines which bytes correspond to which behaviour.

Consider the following toy instruction set, with 64-bit (8-byte) memory addresses and registers:

// Register - 1 Byte
0x00: Instruction Pointer.
0x01-0xFE: Working Register.
0xFF: Invalid.
// Size - 1 Byte
// Note: registers have an implicit size of 8; where size is mismatched, 0 padding/truncation is used.
0x00: 1 byte.
0x01: 2 bytes.
0x02: 4 bytes.
0x03: 8 bytes.
// Location - Variable (1 byte or 10 bytes)
0x00: Instruction Pointer.
0x01-0xFE: Working Register.
0xFF SIZE ADDR: A location in memory; ADDR is 8 bytes.
// Comparison Result - 1 Byte
0x00: Less than.
0x01: Equal to.
0x02: Greater than.
// Instruction - Variable Length
0x00: Halt execution until the instruction pointer changes.
0x01 LOC VAL: Write VAL to LOC.
0x02 LOC1 LOC2: Copy the value at LOC2 to LOC1.
0x03 REG LOC: Add the value at LOC to REG.
0x04 REG LOC: Subtract the value at LOC from REG.
0x05 REG LOC: Multiply REG by the value in LOC.
0x06 REG LOC: Divide REG by the value in LOC.
0x07 REG LOC: Bitwise AND REG with the value in LOC; 0b1011 AND 0b0110 == 0b0010.
0x08 REG LOC: Bitwise OR REG with the value in LOC; 0b1010 OR 0b1100 == 0b1110.
0x09 REG LOC: Bitwise XOR (exclusive OR) REG with the value in LOC; 0b1101 XOR 0b1011 == 0b0110.
0x0A REG LOC: Left Shift REG by the value in LOC.
0x0B REG LOC: Right Shift REG by the value in LOC.
0x0C REG: Invert the bits in REG.
0x0D REG LOC1 LOC2: Compare LOC1 to LOC2 and write the result to REG.
0x0E REG LOC1 LOC2 LOC3: Write the value at LOC1 to REG if LOC2 equals LOC3.

To write the value 0xFF to memory address 0x8000000000000000, the following bytes are used:

0x01 0xFF 0x00 0x80 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xFF

Interpretation of the above bytecode is as follows: 0x01 is the write instruction; 0xFF begins a memory location, 0x00 gives it a size of 1 byte, and the next 8 bytes are the address (0x8000000000000000); the final 0xFF is the 1-byte value to write.

To write the value 0x80FF00FF00FF00AA to the first working register, the following bytecode is used:

0x01 0x01 0x80 0xFF 0x00 0xFF 0x00 0xFF 0x00 0xAA

Interpretation as follows: 0x01 is the write instruction; 0x01 is a location byte naming the first working register (with an implicit size of 8 bytes); the remaining 8 bytes are the value (0x80FF00FF00FF00AA).

To write the same value to the instruction pointer, executing a 'jump' to a different piece of code, the same bytecode as above is used, with a location byte of 0x00. Execution would continue from 0x80FF00FF00FF00AA, rather than the previous (incremented) value of the instruction pointer.
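As a sketch of how such decoding works, here's a tiny hypothetical emulator in Python which understands just the write instruction. `step`, `read_location`, `REGS` and `MEM` are inventions for illustration, not part of any real instruction set, and the memory is shrunk to 64 KiB.

```python
# A minimal sketch of an emulator for the toy instruction set above.
# It decodes only the 'write' instruction (0x01); registers are 8 bytes wide.

REGS = [0] * 255          # registers 0x00 (instruction pointer) to 0xFE
MEM = bytearray(1 << 16)  # a small 64 KiB memory for illustration

SIZES = {0x00: 1, 0x01: 2, 0x02: 4, 0x03: 8}

def read_location(code, i):
    """Decode a location operand; returns (kind, target, size, next_index)."""
    if code[i] == 0xFF:                       # memory location: 0xFF SIZE ADDR
        size = SIZES[code[i + 1]]
        addr = int.from_bytes(code[i + 2:i + 10], "big")
        return ("mem", addr, size, i + 10)
    return ("reg", code[i], 8, i + 1)         # register location: one byte

def step(code, i):
    """Execute one instruction starting at index i; returns the next index."""
    op = code[i]
    if op == 0x01:                            # WRT LOC VAL
        kind, target, size, i = read_location(code, i + 1)
        if kind == "reg":
            REGS[target] = int.from_bytes(code[i:i + size], "big")
        else:
            MEM[target:target + size] = code[i:i + size]
        return i + size
    raise NotImplementedError(f"opcode {op:#04x}")

# The bytecode from the example above: write 0x80FF00FF00FF00AA to WR1.
code = bytes([0x01, 0x01, 0x80, 0xFF, 0x00, 0xFF, 0x00, 0xFF, 0x00, 0xAA])
step(code, 0)
print(hex(REGS[0x01]))  # 0x80ff00ff00ff00aa
```

A real CPU does this same decoding in silicon, which is exactly why a compact binary encoding beats a textual one.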

Our instruction set doesn't have a 'remainder' instruction. To calculate the remainder of the 4-byte value in the 5th working register, divided by the value at memory location 0x10FFA7481CDE99CE, the following bytecode, consisting of multiple instructions, could be used:

0x02 0x06 0x05 // Copy WR5 to WR6.
0x06 0x06 0xFF 0x02 0x10 0xFF 0xA7 0x48 0x1C 0xDE 0x99 0xCE // Divide WR6 by value at 0x10FFA7481CDE99CE; integer division truncates decimals.
0x05 0x06 0xFF 0x02 0x10 0xFF 0xA7 0x48 0x1C 0xDE 0x99 0xCE // Multiply WR6 by value at 0x10FFA7481CDE99CE.
0x04 0x05 0x06 // Subtract WR6 from WR5.
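The copy/divide/multiply/subtract dance above is just integer remainder; in Python, the same steps look like this:

```python
def remainder(a: int, b: int) -> int:
    q = a // b      # DIV: integer division truncates the decimals
    q = q * b       # MUL: scale the quotient back up
    return a - q    # SUB: what's left over is the remainder

print(remainder(23, 5))            # 3
print(remainder(23, 5) == 23 % 5)  # True
```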

Assembly.

As you can hopefully see, instructions are just an encoding for behaviours a CPU can perform. The above encoding is the sort your computer works with; it's simpler and faster for instructions to be represented in a minimal, binary encoding because the interpretation (and execution) of instructions needs to be implemented in hardware (that is, physical circuitry!). Bytecode is not, however, very ergonomic for humans.

Assembly languages are an (approximate) 1:1 mapping of byte code to mnemonic (text). They, too, are just an encoding, but rather than encoding the behaviour directly, they encode the machine code which, in turn, encodes the behaviour. Our toy instruction set could have the following mnemonics:

// Register
ISP: Instruction Pointer, maps to 0x00.
WR001-WR254: Working Register, maps to 0x01-0xFE.
// Size
b: Byte; maps to 0x00.
s: Short; 2 bytes, maps to 0x01.
m: Medium; 4 bytes, maps to 0x02.
l: Long; 8 bytes, maps to 0x03.
// Location
ISP: Instruction Pointer, maps to 0x00.
WR001-WR254: Working Register, maps to 0x01-0xFE.
SIZE HEX_ADDR: A location in memory, mapping to 0xFF SIZE ADDR. Eg: b80808080FEFEFEFE.
// Comparison Result
LT: Less than, maps to 0x00.
EQ: Equal to, maps to 0x01.
GT: Greater than, maps to 0x02.
// Instruction
// Instructions are terminated with ';'.
HLT: Halt, maps to 0x00.
WRT LOC VAL: Write, maps to 0x01.
MOV LOC1 LOC2: Move, maps to 0x02.
ADD REG LOC: Add, maps to 0x03.
SUB REG LOC: Subtract, maps to 0x04.
MUL REG LOC: Multiply, maps to 0x05.
DIV REG LOC: Divide, maps to 0x06.
AND REG LOC: Bitwise and, maps to 0x07.
ORR REG LOC: Bitwise or, maps to 0x08.
XOR REG LOC: Bitwise xor, maps to 0x09.
LSH REG LOC: Left shift, maps to 0x0A.
RSH REG LOC: Right shift, maps to 0x0B.
INV REG: Bit invert, maps to 0x0C.
CMP REG LOC1 LOC2: Compare, maps to 0x0D.
CMV REG LOC1 LOC2 LOC3: Conditional move, maps to 0x0E.

With the above assembly language encoding, our remainder calculating code becomes much more readable:

MOV WR006 WR005;
DIV WR006 m10FFA7481CDE99CE;
MUL WR006 m10FFA7481CDE99CE;
SUB WR005 WR006;

We can implement an if statement equivalent with something like:

CMP WR003 WR001 WR002; // Compare WR1 with WR2, writing the result to WR3 (CMP's destination must be a register; we could reuse WR1 or WR2 if we don't need their current values anymore).
WRT WR004 GT; // Write the 'greater than' constant to WR4.
WRT WR005 0x0000000000000000; // Write our jump location to WR5.
CMV ISP WR005 WR003 WR004; // If WR001 > WR002, jump to 0x0000000000000000.
// 'Else' code goes here.

Working out the address of code is a lot of hassle, however, and assembly languages usually have labels which allow you to reference a specific line of code, or location in memory, by name. In real instruction sets, jumps by an amount, rather than to a specific location, are common, but one could achieve the same with our toy example by copying out the instruction pointer, adding to, or subtracting from it, and conditionally writing it back (or, where the jump is unconditional, just adding to/subtracting from it directly).

Computers don't understand assembly language. They only speak machine code. If you tried to change your instruction pointer to a piece of assembly code somewhere in memory, your CPU would raise an exception (stop because of an error) in short order. Assembly language requires an 'assembler', which parses (decodes) your assembly code, does any necessary calculation for 'fancy' features like labels, and encodes the result as bytecode. An assembler's output is a 'binary': a file you can actually execute, containing bytecode (and, in the case of programs on an operating system, required metadata).
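A minimal assembler for the register-only mnemonics above might look like this in Python. It's a sketch: real assemblers also handle labels, memory operands, sizes, and binary file metadata.

```python
# Map a handful of the toy mnemonics to their opcodes.
OPCODES = {"HLT": 0x00, "MOV": 0x02, "ADD": 0x03, "SUB": 0x04}

def encode_register(token: str) -> int:
    # ISP maps to 0x00; WR001-WR254 map to 0x01-0xFE.
    if token == "ISP":
        return 0x00
    assert token.startswith("WR"), f"not a register: {token}"
    return int(token[2:])

def assemble(line: str) -> bytes:
    # Encode one 'MNEMONIC REG REG;' instruction as bytecode.
    parts = line.rstrip(";").split()
    return bytes([OPCODES[parts[0]], *[encode_register(t) for t in parts[1:]]])

print(assemble("MOV WR006 WR005;").hex())  # 020605
print(assemble("SUB WR005 WR006;").hex())  # 040506
```

Note that the output is exactly the bytecode from the remainder example: the assembler is a pure re-encoding, adding no behaviour of its own.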

Most people don't write assembly, because it's barebones and instruction-set specific. Instead, high-level programming languages with features like if statements (and lots more) are preferred. Make no mistake, however: those programming languages are just behavioural encodings, like machine code, and they have to be decoded, evaluated, and reencoded into machine code before they can be run. Many programming languages use a compiler, where the source code is compiled ahead of time, and the resulting binaries are distributed to the users, while others use an interpreter, which reads the source files and translates them into machine code live. Whatever the approach, your computer executes bytecode, not C, or Rust, or JavaScript.

Compilers and interpreters usually perform a sort of compression as part of their encoding stack. Rather than optimizing for byte size of the resulting binary (though this is also an option), they attempt to optimize for CPU cycles taken. Just as an image compressor attempts to encode the same image in fewer bytes, a compiler attempts to encode the same behaviour in fewer cycles.

Hashing.

Encodings are two-way functions; you can encode and decode, compress and extract, encrypt and decrypt: go forwards, and back. There's another class of data function, which is less well understood, but just as important: the one-way function.

Hashing is a one-way function. It takes data of arbitrary length, and performs mathematical and logical operations on it (comparable to, but not the same as, those used by encryption) to transform it into a fixed-length, pseudo-random output.

A common hashing algorithm is SHA256. SHA256 can be fed any number of bytes, and then 'closed' to produce a hash of 32 bytes, or 256 bits. A hash, like all other data, is just a number. In this case, it's a very large number whose bits each have a 50/50 chance of being high, or low. Feed the same data to a hashing algorithm, and you'll always get the same hash out. Change even a single bit of the input, and, at least in the case of cryptographically-secure hashing algorithms like SHA256, you'll get a totally different, completely unpredictable hash.
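You can see this avalanche effect for yourself with Python's standard library:

```python
import hashlib

a = hashlib.sha256(b"hello").hexdigest()
b = hashlib.sha256(b"helln").hexdigest()  # 'n' differs from 'o' by one bit
print(a)       # 64 hex characters, i.e. 256 bits
print(b)       # completely different; no resemblance to the hash above
print(a == b)  # False
```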

Given the arbitrary-length input, and fixed-length output, hashing is necessarily irreversible. While a poor hashing algorithm might leak some information about the input data, each hash has an infinite number of inputs which could result in it. Consider a (horrible) hashing algorithm which just truncates or pads the input data:

HASH_SIZE = 4
function bad_hash(List<byte> input)
 if input.length < HASH_SIZE
  then
   return input.pad(HASH_SIZE, 0)
  else
   return input.truncate(HASH_SIZE)

bad_hash([1]) // Returns [1, 0, 0, 0].
bad_hash([1, 0]) // Returns [1, 0, 0, 0].
bad_hash([1, 2, 3, 4]) // Returns [1, 2, 3, 4].
bad_hash([1, 2, 3, 4, 5]) // Returns [1, 2, 3, 4].

There's no way for someone with a given hash from the above (again, horrible) hashing algorithm to know whether the input merely started with the bytes in the hash, or if zero bytes at the end of a hash were part of the input, or just padding. Information is lost, and in a real hashing algorithm, no information about the input is leaked.

Passwords.

The most common use for hashing is one you use every single day, without realizing it: passwords. When you register an account with a website, your password isn't (or, really, shouldn't be) stored on their server.

Instead, their server hashes the password and stores the hash (preferably with some random bytes appended to the password, and stored alongside the hash). When you try to log in, the server hashes the password you provide, and checks whether the hash matches the hash they have stored. If it matches, the password was correct, and you're logged in.

function register_user(string username, string password)
 salt = generate_random_bytes(32)
 hash = calculate_hash(password.append_bytes(salt))
 store(username, salt, hash)

function check_password(string username, string password)
 (salt, hash) = get_auth_values_for(username)
 if hash == calculate_hash(password.append_bytes(salt))
 then
  return SUCCESS
 else
  return FAILURE

Even if their server is compromised, and their database leaked, your password is nowhere to be found; the attacker gets the hash of your password which, if they attempted to enter it into another website, would get hashed again, and not match the hash that site has stored, even if you used the same password across multiple websites (you shouldn't, though).

In the case of weak passwords, attackers can (and do) build tables mapping passwords to hashes, by generating weak passwords and hashing them. Given a compromised database, they can look up all the password hashes in their table, and quickly find an input value for any weak password. The random bytes appended to your password are known as a salt, and prevent this attack from functioning, because the hash will be totally different with and without the salt.
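The register/check flow above maps directly onto Python's standard library. One big caveat: this sketch uses bare SHA-256 for brevity, but real systems should use a deliberately slow password hash (bcrypt, scrypt, or Argon2) to make table-building and brute force expensive. `DB` here is a stand-in for a real database.

```python
import hashlib
import os

DB = {}  # username -> (salt, hash); stands in for a real database table

def register_user(username: str, password: str) -> None:
    salt = os.urandom(32)  # fresh random salt per user
    digest = hashlib.sha256(password.encode() + salt).digest()
    DB[username] = (salt, digest)

def check_password(username: str, password: str) -> bool:
    salt, digest = DB[username]
    # Hash the attempt with the stored salt and compare against the stored hash.
    return hashlib.sha256(password.encode() + salt).digest() == digest

register_user("alice", "correct horse battery staple")
print(check_password("alice", "correct horse battery staple"))  # True
print(check_password("alice", "hunter2"))                       # False
```

Because each user gets a fresh random salt, two users with the same password still end up with different hashes, and a precomputed table is useless against the whole database at once.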

Proof of Work.

TODO