Table of Contents
1. Introduction
Welcome to my little article about Base64 encoding. I want to share with you my experience to understand and implement Base64 encoding and decoding. As always, it' probably not the best or most efficient way, but it should be a good starter. So let's begin …
Base64 is an algorithm to convert a stream of bytes into a stream of printable characters (and back).
The origin of such binary-to-text encoding scheme like Base64 is the requirement to send a stream of bytes over a communication channel which does not allow binary data but only text-based data.
E.g. the ASCII standard is based on seven-bit values, thus uses a range of 0-127. Still an ASCII value is normally stored in a byte which has eight bits (0-255). The usage of the most significant bit of an ascii value is not completely defined (could be used as a parity bit for example). So if a stream which is expected to contain only ASCII values actually also contain non-ASCII values, this could potentially lead to problems. However, it is not convenient that a programm shall know whether the data bytes to process are 'text' or 'binary' - and often binary data shall be processed / transmitted via a text-based protocol / algorithm. This is where Base64 encoding & decoding could come into play.
Base64 theory
Base64 uses 64 characters that are available (and printable) in most text encoding schemes to represent its data.
So how to represent a data byte which can hold 256 different values by a set of only 64 characters? Well, of course this is not possible by a 1:1 mapping, so the Base64 encoded data stream needs more characters than the original byte-based. In fact three data byte values are encoded to four Base64 values: This makes sense as three 8bit values are in total 24bit, while a value in range 0-63 requires 6bit for representation, thus four Base64 characters fit in 24bits.
Before we look at the encoding process, first here is the Base64 index mapping table:
Index | Character | Index | Character | Index | Character | Index | Character | |||
0 | A | 1 | B | 2 | C | 3 | D | |||
4 | E | 5 | F | 6 | G | 7 | H | |||
8 | I | 9 | J | 10 | K | 11 | L | |||
12 | M | 13 | N | 14 | O | 15 | P | |||
16 | Q | 17 | R | 18 | S | 19 | T | |||
20 | U | 21 | V | 22 | W | 23 | X | |||
24 | Y | 25 | Z | 26 | a | 27 | b | |||
28 | c | 29 | d | 30 | e | 31 | f | |||
32 | g | 33 | h | 34 | i | 35 | j | |||
36 | k | 37 | l | 38 | m | 39 | n | |||
40 | o | 41 | p | 42 | q | 43 | r | |||
44 | s | 45 | t | 46 | u | 47 | v | |||
48 | w | 49 | x | 50 | y | 51 | z | |||
52 | 0 | 53 | 1 | 54 | 2 | 55 | 3 | |||
56 | 4 | 57 | 5 | 58 | 6 | 59 | 7 | |||
60 | 8 | 61 | 9 | 62 | + | 63 | / |
2.1 Encoding Base64
The encoding algorithm is pretty simple:
Take three character bytes from the input stream (24bits), divide them into four 6bit parts and convert each 6bit value according to the table above. Repeat this until no more input character bytes are left.
Input character bytes: S u n ASCII Hex value: 0x53 0x75 0x6E Binary 01010011 01110101 01101110 6bit parts: 010100 110111 010101 101110 Decimal (index)values: 20 55 21 46 Base64 characters: U 3 V u
So the string "Sun" has encoded in Base64 the value "U3Vu".
What to do if the number of input character bytes is not divisible by three?
In this case, the input buffer is filled up with zeros until it is divisable by three. Then each 6bit part which was filled up with zero is encoded with the special padding character '='.
So actually there are three cases to distinct:
- There is only one input character byte left in the last triple:
The most significant 6 bits are normally encoded, the other 2 bits are expanded with zeros and also encoded. To fill up the based64 encoded quadruple, two padding '=' characters are appended. - There are two input character bytes left in the last triple:
From those 16 bits, the most significant 2 x 6 bits are encoded to two base64 output characters. The remaining 4 bits are expanded with zeros and also encoded. Then a single padding character is appended to fill up the base64 quadruple. - The number of input character bytes is divisable by three:
No padding, thus no special handling necessary.
Let's see an example for case 1:
Input character bytes: S Hex value: 0x53 Binary 01010011 6bit parts: 010100 110000 xxxxxx xxxxxx Decimal (index)values: 20 48 N/A N/A Base64 characters: U w = =
After encoding the first 6 bits (010100) of 'S', the lower two bits (11) are expanded with zeros. Two padding characters are inserted. So the string "S" has base64-encoded the value "Uw=="
Let's see an example for case 2:
Input character bytes: S u Hex value: 0x53 0x75 Binary 01010011 01110101 6bit parts: 010100 110111 010100 xxxxxx Decimal (index)values: 20 55 20 x Base64 characters: U 3 U =
The four remaining bits (0101) uf 'u' are expanded with two zeros. So 010100 are normally encoded and one padding character is inserted. So the string "Su" has base64-encoded the value "U3U="
2.2 Decoding Base64
The decoding algorithm is also pretty simple:
Due to the padding during encoding, the number of characters of a Base64 string is always divisable by four. Thus we can process four characters of the string in one step to retrieve three decoded bytes. This means: Extract the next four characters, get the index value from the lookup table (reverse lookup), merge the four 6bit values to three 8bit values - those are the three decoded character bytes.
Base64 characters : U 3 V u Decimal (index) values : 20 55 21 46 6bit parts: 010100 110111 010101 101110 Binary 01010011 01110101 01101110 Hex value: 0x53 0x75 0x6E Character bytes : S u n
So the Base64-encoded string "U3Vu" has decoded the value "Sun"
Only special care has to taken for the last quadruple if it contains padding characters.
Summarized, there are three cases to distinct:
- The third and fourth byte of the quadruple equal the padding byte '='. (Two padding characters)
- The fourth byte of the quadruple is the padding byte '=', but not the third byte. (One padding character)
- The third and fourth byte of the quadruple do not equal the padding byte '='. This is the standard case from above.
Let's see an example for case 1:
Base64 characters : U w = = Decimal (index) values : 20 48 N/A N/A 6bit parts: 010100 110000 N/A N/A Binary 01010011 0111xxxx Hex value: 0x53 N/A N/A Character bytes : S
So the Base64-encoded string "Uw==" has decoded the value "S".
Let's see an example for case 2:
Base64 characters : U 3 U = Decimal (index) values : 20 55 21 N/A 6bit parts: 010100 110111 010100 N/A Binary 01010011 01110101 00xxxx Hex value: 0x53 0x75 N/A Character bytes: S u
So the Base64-encoded string "U3U" has decoded the value "Su"
2.3 Length calculation
Sometimes it could be useful to statically allocate the required storage of the encoding or decoding result. So we shall answer the question: From a string of x input characters, how many characters y has the resulting Base64-encoded result string? And the other way round: From a Base64-encoded string with y characters, how many bytes x has the decoded array?
To find it out, let's analyze those values for some input (x) and output (y) lengths to retrieve these formula:
Number of bytes of input stream: x | Number of characters of Base64 stream: y |
1 | 4 |
2 | 4 |
3 | 4 |
4 | 8 |
5 | 8 |
6 | 8 |
7 | 12 |
8 | 12 |
9 | 12 |
10 | 16 |
... | ... |
Using above table, we can retrieve the wanted formulas:
The number of required Base64 characters (y) from an input stream with x bytes is calculated as:
From this formula we see that for input length x = 1 we get y = 4 output characters, which is also the worst case for output-input ratio. In general, we see that for large x, the formula limits to y ≈ 4x/3, thus having a output to input ratio of ≈ 1,33, thus roughly 33% overhead.
The length of a decoded byte stream (x) from a Base64 character stream with length y is calculated as:
This second formula correctly results the maximum value out of the three possible lengths (e.g. a 12 character long Base64-encoded string could be made of a 7, 8 or 9 byte stream, so the formula returns 9.).
3. Implementation
3.1 Encoding thoughts
The actual conversion from an input byte to an output character is above listed as a mapping table, so the most obvious way is an lookup table for this conversion. So let's define a 1:1 mapping, where the index into this string matches exactly the Base64 character (e.g. index 1 = 'B'):
Another point is the extraction of the index values. Remember that three 8bit characters result in four 6bit Base64 characters, so we need a way to extract those 4 values from three consecutive bytes. This is easily achieved with some bit operations. Assume inputBytes is a 3-element array of bytes, so to extract the four index values idx1 to idx4 following code can be used:
byte idx1 = inputBytes[0] >> 2;
// take the lower two bits of first byte and place them to the most significant bit locations
// take the four most significant bits of second byte and place them to the four lowest bit locations
byte idx2 = ((inputBytes[0] & 0x03) << 4) | ((inputBytes[1] & 0xF0) >> 4);
// take the four less significant bits of second byte and place them to the most significant bit locations
// take the two most significant bits of third byte and place them to the two lowest bit locations
byte idx3 = ((inputBytes[1] & 0x0F) << 2) | ((inputBytes[2] & 0xC0) >> 6);
// take the six less significant bits of third byte
byte idx4 = inputBytes[2] & 0x3F;
The index value is used to access the lookup table to retrieve the actual Base64 character, e.g. char c1 = encLookupTable[idx1];
Before going further, here some conventions:
- The input data is declared as byte[] toEncodeAsBytes.
- A Stringbuilder called sb is used to store the encoded Base64 characters. The initialization is left out.
3.2 Encoding algorithm A
The first idea is to convert first all complete input bytes triples to four output characters. Afterwards, the special cases must be handled: Either no, one or two bytes are left and must be encoded and finally filled up with the padding characters.
Using this approach and above mentioned notes, a possible implementation looks like this:
int indexOfLastCompleteTriple = (int)(toEncodeAsBytes.Length / 3) * 3;
/* handle triples of input characters per loop */
for (i = 0; i < indexOfLastCompleteTriple; i += 3)
{
sb.Append(encLookupTable[toEncodeAsBytes[i] >> 2]);
sb.Append(encLookupTable[((toEncodeAsBytes[i] & 0x03) << 4) | ((toEncodeAsBytes[i + 1] & 0xF0) >> 4)]);
sb.Append(encLookupTable[((toEncodeAsBytes[i + 1] & 0x0F) << 2) | ((toEncodeAsBytes[i + 2] & 0xC0) >> 6)]);
sb.Append(encLookupTable[toEncodeAsBytes[i + 2] & 0x3F]);
}
if (i < toEncodeAsBytes.Length)
{ /* last triple incomplete, either one or two input characters 'missing' */
/* get first index value, always available */
byte idx1 = toEncodeAsBytes[i];
/* get second index value, if second input byte of last triple not available 'fill up with zeros' */
byte idx2 = (i + 1 < toEncodeAsBytes.Length) ? toEncodeAsBytes[i + 1] : (byte)0;
/* encode first byte of last incomplete triple */
sb.Append(encLookupTable[idx1 >> 2]);
sb.Append(encLookupTable[((idx1 & 0x03) << 4) | ((idx2 & 0xF0) >> 4)]);
if (i + 1 < toEncodeAsBytes.Length)
{ /* only one byte 'missing', encode last character = second byte in last triple */
sb.Append(encLookupTable[((idx2 & 0x0F) << 2)]);
}
else
{ /* two bytes 'missing', add one padding character */
sb.Append('=');
}
sb.Append('=');
}
return sb.ToString()
3.3 Encoding algorithm B
Another approach is to always try to handle the next input triple, and in each iteration check if the triple is complete or if we are already at the last, possibly incomplete triple.
The advantage of this approach is a slightly shorter implementation, however the checks in each iteration makes it less performant. Here is a sample implementation:
{ /* first encoding byte is never '=' */
sb.Append(encLookupTable[toEncodeAsBytes[i] >> 2]);
if (i < toEncodeAsBytes.Length - 1)
{ /* second input byte available, encode second output character */
sb.Append(encLookupTable[((toEncodeAsBytes[i] & 0x03) << 4) | ((toEncodeAsBytes[i + 1] & 0xF0) >> 4)]);
if (i < toEncodeAsBytes.Length - 2)
{ /* all three input bytes available, encode complete triple */
sb.Append(encLookupTable[((toEncodeAsBytes[i + 1] & 0x0F) << 2) | ((toEncodeAsBytes[i + 2] & 0xC0) >> 6)]);
sb.Append(encLookupTable[toEncodeAsBytes[i + 2] & 0x3F]);
}
else
{ /* last input byte missing, encode third output character with zeros expanded and add padding char */
sb.Append(encLookupTable[((toEncodeAsBytes[i + 1] & 0x0F) << 2)]);
sb.Append('=');
}
}
else
{ /* only one input byte available, encode second output character with zeros expanded and add two padding chars */
sb.Append(encLookupTable[((toEncodeAsBytes[i] & 0x03) << 4) | 0]);
sb.Append('=');
sb.Append('=');
}
}
return sb.ToString();
4. Decoding implementation
4.1 Decoding thoughts
In the encoding process, a lookup table was used to directly map an input value in range [0,63] to the appropriate Base64 character. Why not use a reverse lookup table for decoding?
This decoding lookup table must map a Base64 character to its appropriate index. E.g. the character 'B' which has ASCII value 66 must be mapped to 1 according to the Base64 mapping table.
The Base64 character with the highest ASCII value is 'z' = 122. So the decoding lookup table must have at least 123 entries [0-122].
However, because there are only 64 Base64 characters this means there are gaps in the decoding table. For example, ASCII value 36 ('$') is not part of the Base64 character set, so there is a dummy element at position 36 in the decoding table. I used the value 80 for such dummy entrys.
If a 123-element decoding lookup table is used, but an invalid Base64 stream is given as input which e.g. contains the invalid value '}' (125 decimal), a range check would be required to avoid an out-of-bounds access in this case. Because the maximum ASCII value is 127, it's common to fill the decoding table with five additional pattern values to get a 128-element table - then no illegal memory access is possible.
80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, /* 0 - 15 */
80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, /* 16 - 31 */
80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 62, 80, 80, 80, 63, /* 32 - 47 */
52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 80, 80, 80, 64, 80, 80, /* 48 - 63 */
80, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, /* 64 - 79 */
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 80, 80, 80, 80, 80, /* 80 - 96 */
80, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, /* 87 - 111 */
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 80, 80, 80, 80, 80 /* 112 - 127 */
};
Four Base64 characters make three output bytes. After decoding a quadruple using the lookup table, the six lower of bits of each Base64 character has to be extracted and 'merged' by using bit shifting into three 8bit values. This is shown below:
byte b2 = decLookupTable[base64str[i + 1]];
byte b3 = decLookupTable[base64str[i + 2]];
byte b4 = decLookupTable[base64str[i + 3]];
/* get 3 bytes from 4 base64 characters */
c1 = (byte)(b1 << 2 | ((b2 & 0xF0) >> 4));
c2 = (byte)(((b2 & 0x0F) << 4) | ((b3 & 0x3C) >> 2));
c3 = (byte)(((b3 & 0x03) << 6) | (b4 & 0x3F));
With this information, it's a piece of cake to develop a decoding function. At first, some conventions:
- The input Base64 stream is a string named base64Str.
- The result is a byte array named resBytes with correct length.
- The lookup table is declared as decLookupTable.
- Error checking / input length validation is left out. It is assumed that the length of each input Base64 stream is divisable by four.
4.2 Decoding algorithm A
Corresponding to encoding algorithm A, this first approach decodes as a first step all complete quadruples (containing no padding characters). The determination if the last quadruple is complete is accomplished by checking the last input stream character: if it's equal to '=', then it's incomplete, otherwise it's complete. After decoding all complete quadruples, there is additional special handling of the last incomplete quadruple.
Here a possible implementation:
bool isLastQuadrupleIncomplete = false;
if (base64str[base64str.Length - 1] == '=')
{
streamEndIndexOLastCompleteQuadruple = ((base64str.Length / 4) - 1) * 4;
isLastQuadrupleIncomplete = true;
}
else
{
streamEndIndexOLastCompleteQuadruple = (base64str.Length / 4) * 4;
}
/* decode all complete quadruples */
int outPos = 0;
int i;
for (i = 0; i < streamEndIndexOLastCompleteQuadruple; i += 4)
{
byte b1 = decLookupTable[base64str[i]];
byte b2 = decLookupTable[base64str[i + 1]];
byte b3 = decLookupTable[base64str[i + 2]];
byte b4 = decLookupTable[base64str[i + 3]];
resBytes[outPos++] = (byte)(b1 << 2 | ((b2 & 0xF0) >> 4));
resBytes[outPos++] = (byte)(((b2 & 0x0F) << 4) | ((b3 & 0x3C) >> 2));
resBytes[outPos++] = (byte)(((b3 & 0x03) << 6) | (b4 & 0x3F));
}
/* decode last quadruple if incomplete */
if (isLastQuadrupleIncomplete)
{
byte b1 = decLookupTable[base64str[i]];
byte b2 = decLookupTable[base64str[i + 1]];
resBytes[outPos++] = (byte)(b1 << 2 | ((b2 & 0xF0) >> 4));
if (base64str[i + 2] == '=')
{
/* already added first output character */
}
else
{
/* decode second byte of triple */
byte b3 = decLookupTable[base64str[i + 2]];
resBytes[outPos++] = (byte)(((b2 & 0x0F) << 4) | ((b3 & 0x3C) >> 2));
}
}
return resBytes;
4.3 Decoding algorithm B
Another idea is to process the input stream one character after one character, always checking if a padding character occurs and special handling is required. This results in a shorter bus less efficient implementation:
int i;
byte c1, c2, c3;
for (i = 0; i < base64str.Length; i += 4)
{
byte b1 = decLookupTable[base64str[i]];
byte b2 = decLookupTable[base64str[i + 1]];
byte b3 = decLookupTable[base64str[i + 2]];
byte b4 = decLookupTable[base64str[i + 3]];
if (base64str[i + 3] == '=')
{
if (base64str[i + 2] == '=')
{ /* two padding characters, only one byte to decode */
c1 = (byte)(b1 << 2 | ((b2 & 0xF0) >> 4));
resBytes[outPos++] = c1;
}
else
{ /* one padding character, two bytes to decode */
c1 = (byte)(b1 << 2 | ((b2 & 0xF0) >> 4));
c2 = (byte)(((b2 & 0x0F) << 4) | ((b3 & 0x3C) >> 2));
resBytes[outPos++] = c1;
resBytes[outPos++] = c2;
}
}
else
{
/* get 3 bytes from 4 base64 characters */
c1 = (byte)(b1 << 2 | ((b2 & 0xF0) >> 4));
c2 = (byte)(((b2 & 0x0F) << 4) | ((b3 & 0x3C) >> 2));
c3 = (byte)(((b3 & 0x03) << 6) | (b4 & 0x3F));
resBytes[outPos++] = c1;
resBytes[outPos++] = c2;
resBytes[outPos++] = c3;
}
}
return resBytes;
5. Notes to source code
The Visual Studio project with complete source code (download link at top of the article) and contains some test little test cases used by the different encoding and decoding algorithms of verification. Check it out for yourself.
Furthermore, a performance comparison to the .NET Base64 encoding / decoding functions of the System.Convert is also implemented. Interestingly, those functions are twice as fast as my implementations!
What I did is to implement an 'unsafe' version of each algorithm which allowed to access the strings with a fixed char pointer - this speed things up, but does not reach the .NET implementation.
To come to an end, I did not investigate the root cause why the .NET implementation is so fast - if you feel free, I would really like to hear the cause of this performance difference.
6. Conclusion & References
I hope you found this article interesting and could learn something! Drop me a line for corrections, hints, criticism, praise ;)
Sunshine, January 2016 (last update: March 2024)
Updates
- 2024/03/13: Fixed an issue in the decoding example case 2 in chapter 2.2.