Tutor Section: How to translate into base64 and back


How does the actual translation from bytes to base64 characters occur? We must first set up a mapping of values (0 through 63) to base64 characters (A-Z, a-z, 0-9, '+', and '/'). We can do this by simply indexing an array.
So, for a value of 25 (which is 011001 in binary) the base64 character would be 'Z', and for binary 101010 (which is 42 in decimal) the base64 character would be 'q'.
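
As a minimal sketch of that mapping, the lookup table can be a single 64-character string indexed by value. The name BASE64_CHARS is only an illustrative choice, not something defined elsewhere in this tutorial.

```javascript
// Hypothetical name: the 64-entry lookup table, indexed by value 0..63.
const BASE64_CHARS =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZ" + // values 0-25
  "abcdefghijklmnopqrstuvwxyz" + // values 26-51
  "0123456789" +                 // values 52-61
  "+/";                          // values 62 and 63

console.log(BASE64_CHARS[25]);                    // Z
console.log(BASE64_CHARS[parseInt("101010", 2)]); // q
```

Indexing a string works the same way as indexing an array of single characters here, and is a little shorter to write.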

Let's start with something simple, a text-to-base64 conversion. We will convert the string "Hello World!" to a base64 representation. We will start by getting the ASCII byte values for each letter.
Remember that for base64, we will be using three bytes at a time. Each ASCII character is one byte, so we will be working with "Hel", "lo ", "Wor", and "ld!" separately.

Let's start with the first three characters.
  1. Convert the characters to binary.
    "Hel" is 01001000 01100101 01101100 in binary. (Notice there are 24 bits).
  2. Convert the 24 bits from three 8 bit groups to four 6 bit groups.
    01001000 01100101 01101100 becomes 010010 000110 010101 101100.
  3. Convert each of the four 6 bit groups into decimal.
    010010 = 18
    000110 = 6
    010101 = 21
    101100 = 44
  4. Use each of the four decimals to look up the base64 character code.
    18 = 'S'
    6 = 'G'
    21 = 'V'
    44 = 's'
    You now have your first three ASCII characters ("Hel") encoded as base64 ("SGVs").
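
The four steps above can be sketched in JavaScript with a few bit operations. This is only one possible way to write it; the function name encodeTriple and its parameter names are made up for this example.

```javascript
// Hypothetical helper: turn exactly three bytes into four base64 characters.
function encodeTriple(b0, b1, b2) {
  const chars =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  // Steps 1 and 2: join the three 8-bit bytes into one 24-bit number.
  const bits = (b0 << 16) | (b1 << 8) | b2;
  // Step 3: cut the 24 bits into four 6-bit values (each 0..63).
  const values = [
    (bits >> 18) & 63,
    (bits >> 12) & 63,
    (bits >> 6) & 63,
    bits & 63,
  ];
  // Step 4: look each value up in the table.
  return values.map((v) => chars[v]).join("");
}

const word = "Hel";
console.log(
  encodeTriple(word.charCodeAt(0), word.charCodeAt(1), word.charCodeAt(2))
); // SGVs
```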
Follow these steps for the next 9 ASCII characters and you get the following results:
"Hel" = SGVs
"lo " = bG8g
"Wor" = V29y
"ld!" = bGQh
The phrase "Hello World!" has been converted to "SGVsbG8gV29ybGQh". The original phrase has exactly 12 ASCII characters and is represented by 16 base64 characters, exactly one and one third times the length of the original text.

So what happens, you might ask, if you don't have exact sets of three bytes? What if you had remainder bytes left over? For example, what if the data you had was "Hello" (5 ASCII characters, 5 bytes)? What if it was "blue" (4 ASCII characters, 4 bytes)? In those cases, the last group has fewer than three letters: "Hello" groups into "Hel" "lo", and "blue" groups into "blu" "e".

To handle these cases, we throw one more readable character into our base64 character list. This character is not in the lookup table because it is reserved for the two cases where you have one or two remainder bytes after grouping. We use the "=" character. Let's start with "Hello".
Follow the exact same steps for the first three characters as above. Your first three ASCII characters "Hel" encode to the same base64 as before, "SGVs". For the remaining 2 characters, follow these steps:
  1. Convert the characters into binary.
    "lo" is 01101100 01101111 in binary.
  2. Starting from the left, separate the bytes into 6 bit chunks as best as possible.
    01101100 01101111 becomes 011011 000110 1111.
As you can see, we still need two more bits for the last group, plus a whole other six bits for the full four base64 characters. What we need is something looking like 011011 000110 1111xx xxxxxx. We can convert 011011 and 000110 to decimal just fine.
011011 = 27
000110 = 6
1111xx = what?
xxxxxx = what?

To resolve this problem, we fill the last two bits of 1111xx with 0's, so 111100 = 60, which is '8' in the lookup table. Our base64 characters so far are "bG8". Since we are still one base64 character short of a full group of four, we add one of our special "=" characters to the end to signify that the last group was missing one byte. Our complete converted base64 string is now "bG8=". So the word "Hello" translates to "SGVsbG8=".

We do the same thing for the word "blue", whose last group is missing 2 bytes.
The first three characters should be easy to convert by now. "blu" is 01100010 01101100 01110101. Translate that to 6 bit groups and you get 011000 100110 110001 110101. These convert to "Ymx1" in base64. Now you have one remaining character, "e". We do the exact same thing as last time. "e" in binary is 01100101. When you split that into four 6 bit groups, you get 011001 01xxxx xxxxxx xxxxxx. Fill the second group with 0's to be able to look it up. 011001 010000 xxxxxx xxxxxx becomes "ZQ". Because the group was missing two complete bytes, add two of our special characters to the end. So the letter "e" in ASCII becomes "ZQ==". The word "blue" becomes "Ymx1ZQ==".
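
Putting the whole walkthrough together, here is a rough sketch of an encoder that handles both full groups and the one or two leftover bytes. It assumes the input is plain ASCII text (one character per byte), and the function name encodeBase64 is just a name picked for this example.

```javascript
// Hypothetical helper: encode an ASCII string as base64, with "=" padding.
function encodeBase64(text) {
  const chars =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  // Assumption: every character is ASCII, so its char code is its byte value.
  const bytes = Array.from(text, (c) => c.charCodeAt(0));
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const remaining = bytes.length - i; // bytes left, including this group
    const b0 = bytes[i];
    const b1 = remaining > 1 ? bytes[i + 1] : 0; // missing bytes act as 0 bits
    const b2 = remaining > 2 ? bytes[i + 2] : 0;
    const bits = (b0 << 16) | (b1 << 8) | b2;
    out += chars[(bits >> 18) & 63];
    out += chars[(bits >> 12) & 63];
    // The third and fourth characters become "=" when their bytes are missing.
    out += remaining > 1 ? chars[(bits >> 6) & 63] : "=";
    out += remaining > 2 ? chars[bits & 63] : "=";
  }
  return out;
}

console.log(encodeBase64("Hello World!")); // SGVsbG8gV29ybGQh
console.log(encodeBase64("Hello"));        // SGVsbG8=
console.log(encodeBase64("blue"));         // Ymx1ZQ==
```

Filling the missing bytes with zero is what gives the "fill with 0's" step from the examples: the zero bits complete the last real 6 bit group, and the groups made up entirely of missing bits are written as "=" instead of being looked up.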

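Note: I said before that base64 encoding is one and one third times the size of the byte representation. In the cases where the last group is missing a byte or two, it is actually slightly more than this, because that last group is still written out as four full characters (including the "=" padding). The encoded length is always a multiple of four: four characters for every group of three bytes, with a partial group counted as a whole one.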
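
To make the size rule concrete, here is a small sketch that computes the padded base64 length from a byte count. The function name encodedLength is made up for this example.

```javascript
// Hypothetical helper: padded base64 length for a given number of bytes.
function encodedLength(byteCount) {
  // Every group of three bytes (or part of one) becomes four characters.
  return 4 * Math.ceil(byteCount / 3);
}

console.log(encodedLength(12)); // 16  ("Hello World!")
console.log(encodedLength(5));  // 8   ("Hello" -> "SGVsbG8=")
console.log(encodedLength(4));  // 8   ("blue"  -> "Ymx1ZQ==")
```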
