Base 36

Base 36 is a positional numeral system using 36 as the radix. The choice of 36 is convenient in that the digits can be represented using the Arabic numerals 0–9 and the Latin letters A–Z[1] (the ISO basic Latin alphabet). Base 36 is therefore the most compact case-insensitive alphanumeric numeral system using ASCII characters, although its radix economy is poor. While this article uses upper case letters, lower case letters are often used in practice to avoid confusion between digits and the upper case letters that resemble them, for example 0 and O, 1 and I, 8 and B, and 5 and S.
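
As a rough illustration of the positional rule described above, the following Python sketch evaluates a base-36 string one digit at a time and compares the result with Python's built-in int(text, 36). The name base36_value is a hypothetical helper introduced here for illustration; it is not part of any library.

```python
import string

# Digit alphabet: 0-9 followed by A-Z, the ordering used in this article.
DIGITS = string.digits + string.ascii_uppercase

def base36_value(text):
    """Evaluate a base-36 string with Horner's rule (case-insensitive)."""
    value = 0
    for ch in text.upper():
        # DIGITS.index raises ValueError for a character outside 0-9, A-Z.
        value = value * 36 + DIGITS.index(ch)
    return value

print(base36_value("2S"))                            # 2*36 + 28 = 100
print(base36_value("kf12oi") == int("KF12OI", 36))   # True
```

Folding to upper case before the lookup is what makes the notation case-insensitive in this sketch.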

From a mathematical viewpoint, 36, like all highly composite numbers, is a convenient choice for a base in that it is divisible by both 2 and 3, and by their multiples 4, 6, 9, 12 and 18. Because 36 is the square of 6, each base 36 digit can be represented as exactly two senary (base 6) digits.
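
The correspondence between one base-36 digit and two senary digits can be sketched in Python as follows. Both function names are hypothetical helpers chosen for this example; the second one is only there to cross-check the digit-pair substitution against an ordinary conversion.

```python
def base36_to_senary(text):
    """Replace each base-36 digit by exactly two base-6 digits."""
    pairs = []
    for ch in text:
        d = int(ch, 36)          # value of one base-36 digit, 0..35
        hi, lo = divmod(d, 6)    # d == 6*hi + lo, with hi and lo in 0..5
        pairs.append(f"{hi}{lo}")
    # Drop leading zeros produced by the first pair, but keep a lone "0".
    return "".join(pairs).lstrip("0") or "0"

def to_senary(n):
    """Convert a non-negative integer to base 6 by repeated division."""
    digits = ""
    while True:
        n, r = divmod(n, 6)
        digits = str(r) + digits
        if n == 0:
            return digits

print(base36_to_senary("RS"))   # 4344
print(to_senary(1000))          # 4344
```

No carries are needed because each pair is converted independently, which is what makes the substitution purely digit-by-digit.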

The most common latinate name for base 36 seems to be hexatridecimal, although sexatrigesimal would arguably be more correct. The intermediate form hexatrigesimal is also sometimes used. For more background on this naming confusion, see the entry for hexadecimal. Another name occasionally seen for base 36 is alphadecimal, a neologism coined based on the fact that the system uses the decimal digits and the letters of the Latin alphabet.

Examples

Multiplication table:

*   1   2   3   4   5   6   7   8   9   A
1   1   2   3   4   5   6   7   8   9   A
2   2   4   6   8   A   C   E   G   I   K
3   3   6   9   C   F   I   L   O   R   U
4   4   8   C   G   K   O   S   W   10  14
5   5   A   F   K   P   U   Z   14  19  1E
6   6   C   I   O   U   10  16  1C  1I  1O
7   7   E   L   S   Z   16  1D  1K  1R  1Y
8   8   G   O   W   14  1C  1K  1S  20  28
9   9   I   R   10  19  1I  1R  20  29  2I
A   A   K   U   14  1E  1O  1Y  28  2I  2S

Conversion table:

Decimal  1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17  18
Base 6   1   2   3   4   5   10  11  12  13  14  15  20  21  22  23  24  25  30
Base 36  1   2   3   4   5   6   7   8   9   A   B   C   D   E   F   G   H   I

Decimal  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
Base 6   31  32  33  34  35  40  41  42  43  44  45  50  51  52  53  54  55  100
Base 36  J   K   L   M   N   O   P   Q   R   S   T   U   V   W   X   Y   Z   10


Some numbers in decimal, base 6 and base 36:

Decimal              Base 6                 Base 36
1                    1                      1
10                   14                     A
100                  244                    2S
1,000                4344                   RS
10,000               11 4144                7PS
100,000              205 0544               255S
1,000,000            3323 3344              LFLS
1,000,000,000        2431 2124 5344         GJDGXS
1,000,000,000,000    2043 2210 1030 1344    CRE66I9S

Base 6                    Base 36      Decimal
1                         1            1
100                       10           36
1 0000                    100          1,296
100 0000                  1000         46,656
1 0000 0000               10000        1,679,616
100 0000 0000             100000       60,466,176
1 0000 0000 0000          1000000      2,176,782,336
100 0000 0000 0000        10000000     78,364,164,096
1 0000 0000 0000 0000     100000000    2,821,109,907,456
52 3032 3041 2221 3014    WIKIPEDIA    91,730,738,691,298

In the following table, digits in parentheses repeat indefinitely.

Fraction  Decimal      Base 6    Base 36
1/2       0.5          0.3       0.I
1/3       0.(3)        0.2       0.C
1/4       0.25         0.13      0.9
1/5       0.2          0.(1)     0.(7)
1/6       0.1(6)       0.1       0.6
1/7       0.(142857)   0.(05)    0.(5)
1/8       0.125        0.043     0.4I
1/9       0.(1)        0.04      0.4
1/10      0.1          0.0(3)    0.3(L)
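
The fractional expansions in the last table can be reproduced with long division. The sketch below is one possible approach, not a standard routine: expand_fraction is a hypothetical helper, and writing the repeating block in parentheses is a convention chosen for this example.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def expand_fraction(frac, base=36):
    """Expand a fraction with 0 <= frac < 1 in the given base (2..36)."""
    num, den = frac.numerator, frac.denominator
    digits = []
    seen = {}  # remainder -> position of the digit that remainder produces
    while num and num not in seen:
        seen[num] = len(digits)
        num *= base
        digits.append(DIGITS[num // den])
        num %= den
    text = "".join(digits) or "0"
    if num:
        # The remainder has recurred, so everything from its first
        # occurrence onward repeats forever.
        start = seen[num]
        text = text[:start] + "(" + text[start:] + ")"
    return "0." + text

print(expand_fraction(Fraction(1, 8)))       # 0.4I
print(expand_fraction(Fraction(1, 10)))      # 0.3(L)
print(expand_fraction(Fraction(1, 7), 6))    # 0.(05)
```

A fraction terminates in base 36 exactly when its reduced denominator has no prime factors other than 2 and 3, which is why 1/5, 1/7 and 1/10 repeat while 1/8 and 1/9 do not.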


Conversion

The largest signed 32-bit and 64-bit integers are 6 and 13 base-36 digits long, respectively, so these types cannot hold every 6-digit or 13-digit base-36 number. For example, the 64-bit signed integer maximum value of 9223372036854775807 is 1Y2P0IJ32E8E7 in base 36. For numbers with more digits, one can use the functions mpz_set_str and mpz_get_str in the GMP arbitrary-precision math library. For floating-point numbers the corresponding functions are called mpf_set_str and mpf_get_str.
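
The digit counts for fixed-width integers can be checked with a few lines of Python, whose integers have arbitrary precision. The helper name digits_needed is an assumption made for this sketch.

```python
def digits_needed(value, base=36):
    """Number of digits needed to write a non-negative integer in the given base."""
    count = 1
    while value >= base:
        value //= base
        count += 1
    return count

for bits in (32, 64):
    signed_max = 2 ** (bits - 1) - 1
    unsigned_max = 2 ** bits - 1
    print(bits, digits_needed(signed_max), digits_needed(unsigned_max))
# prints:
# 32 6 7
# 64 13 13
```

The largest 6-digit base-36 number, ZZZZZZ, equals 36^6 − 1 = 2,176,782,335, which exceeds the signed 32-bit maximum of 2,147,483,647 but fits in an unsigned 32-bit integer.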

C implementation

#include <stdlib.h>	/* strtoul */
#include <string.h>	/* strdup (POSIX) */

/* Returns a newly allocated string that the caller must free(),
   or NULL if the allocation fails. */
static char *base36enc(unsigned long value)
{
	static const char base36[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
	/* log(2**64) / log(36) = 12.38 => max 13 char + '\0' */
	char buffer[14];
	size_t offset = sizeof(buffer);

	buffer[--offset] = '\0';
	do {
		buffer[--offset] = base36[value % 36];
	} while (value /= 36);

	return strdup(&buffer[offset]);
}

/* strtoul accepts both upper- and lower-case letters as digits. */
static unsigned long base36dec(const char *text)
{
	return strtoul(text, NULL, 36);
}

C# implementation

using System;
using System.Text;

public static class Base36
{
    private const string Clist = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static readonly char[] Clistarr = Clist.ToCharArray();

    // Returns -1 if the input contains a character that is not an
    // upper-case base-36 digit.
    public static long Base36Decode(string inputString)
    {
        long result = 0;
        foreach (var c in inputString) {
            var pos = Clist.IndexOf(c);
            if (pos < 0)
                return -1;
            // Horner's rule keeps the arithmetic in integers,
            // avoiding the floating-point Math.Pow.
            result = result * Clist.Length + pos;
        }
        return result;
    }

    public static string Base36Encode(ulong inputNumber)
    {
        var sb = new StringBuilder();
        // Digits are produced least significant first, then reversed.
        do {
            sb.Append(Clistarr[inputNumber % (ulong)Clist.Length]);
            inputNumber /= (ulong)Clist.Length;
        } while (inputNumber != 0);
        return Reverse(sb.ToString());
    }

    public static string Reverse(string s)
    {
        var charArray = s.ToCharArray();
        Array.Reverse(charArray);
        return new string(charArray);
    }
}

Erlang implementation

list_to_integer("kf12oi",36). %% =>1234567890
 
integer_to_list(1234567890, 36). %% => "KF12OI"

Go implementation

package main
 
import "strconv"
 
func main() {
	encode := strconv.FormatInt(1234567890, 36)
	println(encode) // => "kf12oi"
 
	decode, _ := strconv.ParseInt("kf12oi", 36, 64)
	println(decode) // => 1234567890
}

Java implementation

public class Base36 {
  public static long decode(final String value) {
    return Long.parseLong(value, 36);
  }
 
  public static String encode(final long value) {
    return Long.toString(value, 36);
  }
}

PHP implementation

The decimal value of 12abcxyz is <?php print base_convert("12abcxyz",36,10); ?>

The base_convert function converts the value to a floating-point number, which loses accuracy for numbers above implementation-specific limits. For PHP 5.3.6 on 64-bit Linux, the highest base-36 integer that can be represented accurately is 1Y2P0IJ32E8E7, equal to 2^63 − 1 or 9223372036854775807. Negative signs, decimal points and any characters outside the range 0–Z are stripped prior to conversion, so converting −1.5 from decimal to base 36 gives F (the base-36 form of 15), rather than −1.I as might be expected.

Python implementation

def base36encode(number, alphabet='0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
    """Converts an integer to a base36 string."""
    if not isinstance(number, int):
        raise TypeError('number must be an integer')

    base36 = ''
    sign = ''

    if number < 0:
        sign = '-'
        number = -number

    if 0 <= number < len(alphabet):
        return sign + alphabet[number]

    while number != 0:
        number, i = divmod(number, len(alphabet))
        base36 = alphabet[i] + base36

    return sign + base36

def base36decode(number):
    """Converts a base36 string (upper or lower case) to an integer."""
    return int(number, 36)

print(base36encode(1412823931503067241))
print(base36decode('AQF8AA0006EH'))

Ruby implementation

1412823931503067241.to_s(36)  #=> "aqf8aa0006eh"
"aqf8aa0006eh".to_i(36)  #=> 1412823931503067241

JavaScript implementation

(1234567890).toString(36)  // => "kf12oi"
parseInt("kf12oi",36) // => 1234567890

bash implementation

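b36() {
    # With obase=36, bc prints each digit as a zero-padded decimal
    # number, e.g. " 20 15 01 02 24 18" for 1234567890.
    local b36arr=(0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)
    local i
    for i in $(echo "obase=36; $1" | bc); do
        # ${i#0} strips the padding zero so that "08" is not read as octal.
        printf '%s' "${b36arr[${i#0}]}"
    done
    echo
}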

Visual Basic implementation

Public Function ConvertBase10(ByVal d As Double, ByVal sNewBaseDigits As String) As String
    ' call using ConvertBase10(12345, "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ") for base36
    ' can be used to convert to any base
    ' from http://www.freevbcode.com/ShowCode.asp?ID=6604

    Dim S As String, tmp As Double, i As Integer, lastI As Integer
    Dim BaseSize As Integer
    BaseSize = Len(sNewBaseDigits)
    Do While Val(d) <> 0
        tmp = d
        i = 0
        Do While tmp >= BaseSize
            i = i + 1
            tmp = tmp / BaseSize
        Loop
        If i <> lastI - 1 And lastI <> 0 Then S = S & String(lastI - i - 1, Left(sNewBaseDigits, 1)) 'get the zero digits inside the number
        tmp = Int(tmp) 'truncate decimals
        S = S + Mid(sNewBaseDigits, tmp + 1, 1)
        d = d - tmp * (BaseSize ^ i)
        lastI = i
    Loop
    S = S & String(i, Left(sNewBaseDigits, 1)) 'get the zero digits at the end of the number
    ConvertBase10 = S
End Function

Uses in practice

  • The Remote Imaging Protocol for bulletin board systems used base 36 notation for transmitting coordinates in a compact form.
  • Many URL redirection systems like TinyURL or SnipURL/Snipr also use base 36 integers as compact alphanumeric identifiers.
  • Geohash-36, a coordinate encoding algorithm, uses radix 36 but uses a mixture of lowercase and uppercase alphabet characters in order to avoid vowels, vowel-looking numbers, and other character confusion.
  • Various systems such as RickDate use base 36 as a compact representation of Gregorian dates in file names, using one digit each for the day and the month.
  • Dell uses a 5- or 7-digit base 36 number (Service Tag) as a compact version of their Express Service Codes.
  • The software package SalesLogix uses base 36 as part of its database identifiers.[2]
  • The TreasuryDirect website, which allows individuals to buy and redeem securities directly from the U.S. Department of the Treasury in paperless electronic form, serializes security purchases in an account using a 4-digit base 36 number. However, the Latin letters A–Z are used before the Arabic numerals 0–9, so that the purchases are listed as AAAA, AAAB... AAAZ, AAA0, AAA1... AAA9, AABA...
  • The e-mail client PMMail encodes the UNIX time of an email's arrival in base 36 and uses the result as the first six characters of the message's filename.
  • MediaWiki stores uploaded files in directories with names derived from the base-36 representation of an uploaded file's checksum.[3]
  • Siteswap, a type of juggling notation, frequently employs 0–9 and a–z to signify the dwell time of a toss (which may roughly be thought of as the height of the throw). Throws higher than 'z' may be made but no notation has widespread acceptance for these throws.
  • In SEDOL securities identifiers, the check digit is computed from a weighted sum of the first six characters, each character interpreted in base-36.
  • In the International Securities Identification Number (ISIN), the check digit is computed by first taking the value of each character in base-36, concatenating the numbers together, then doing a weighted sum.
  • Reddit uses base-36 for identifying posts and comments.
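
The two check-digit schemes mentioned in the list above can be sketched in Python as follows. The list does not give the details, so two things here are assumptions taken from the commonly published descriptions of these identifiers: the SEDOL weights 1, 3, 1, 7, 3, 9, and the use of the Luhn algorithm as the ISIN "weighted sum". The function names are hypothetical.

```python
def char_value(ch):
    """Value of one character read as a base-36 digit (0-9, then A-Z as 10-35)."""
    return int(ch, 36)

def sedol_check_digit(first_six):
    """Check digit for the first six characters of a SEDOL."""
    weights = (1, 3, 1, 7, 3, 9)  # assumed weights, not stated in the article
    total = sum(w * char_value(c) for w, c in zip(weights, first_six))
    return (10 - total % 10) % 10

def isin_check_digit(first_eleven):
    """Check digit for the first eleven characters of an ISIN (Luhn, assumed)."""
    # Concatenate the decimal values of all characters, e.g. "US" -> "3028".
    digits = "".join(str(char_value(c)) for c in first_eleven)
    total = 0
    # Walk from the right, doubling every other digit starting with the rightmost.
    for i, d in enumerate(reversed(digits)):
        n = int(d)
        if i % 2 == 0:
            n *= 2
            if n > 9:
                n -= 9  # same as adding the two digits of the product
        total += n
    return (10 - total % 10) % 10

print(sedol_check_digit("026349"))       # 4
print(isin_check_digit("US037833100"))   # 5
```

In both cases the base-36 reading is only used to turn letters into numbers; the identifiers themselves are not treated as a single base-36 integer.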

References

  1. ^ Hope, Paco; Walther, Ben (2008), Web Security Testing Cookbook, Sebastopol, CA: O'Reilly Media, Inc., ISBN 978-0-596-51483-9 
  2. ^ Sage SalesLogix base-36 identifiers: http://www.slxdeveloper.com/page.aspx?action=viewarticle&articleid=87
  3. ^ FileStore http://www.mediawiki.org/wiki/FileStore

External links