what's the big deal with BigInt?
November 14, 2019
If you’ve ever been interested in binary numbers and operations in JavaScript, you may have run into the problem of not being able to represent a binary integer over or under the decimal value of 231 or -231, respectively. This is because JavaScript handles binary operations by using 32-bit integers only. What if you need to set the 76th bit of an integer? You’d be out of luck if the BigInt type didn’t exist. BigInt allows us to have integer representations of an arbitrary size, letting us safely manipulate the 76th bit of a binary number - with some important stipulations.
Introduction
When saving a number to a variable name, JavaScript will store that number initially as a 64-bit floating-point number. Even the number 5 will be stored initially as a floating point number. JavaScript uses the IEEE 754 Standard for Floating-Point Arithmetic that most other languages adhere to, ensuring a precise implementation of storing and manipulating decimal numbers. It could be warranted to go down this rabbit hole, finding out how we can only obtain 53 bits of precision on integers according to this standard, but the remainder of this article will not be dealing with storage and manipulation of decimal numbers - rather, we will be focusing on binary numbers and operations, their limitations, and finally how the BigInt object helps liberate us from those constraints.
Performing binary operations on numbers stored as 64-bit floating-point numbers does something similar to trying to add a string and a number together. In the latter scenario, the number is coerced to a string and ultimately the return value is a string. Similarly, the operands of a binary operation are casted into 32-bit integers and the return value from an operation is a 32-bit integer as well. While most development in JavaScript will never touch binary operations defined by the developer and even fewer will come into contact with the limitations of a 32-bit integer, there are some special cases where the ability to go beyond 32-bits would create an optimal solution.
Permutation Palindrome
Given a string, find if any of the permutations of its character form a valid palindrome.
To solve this problem, we could generate a collection of all of the permutations of the given string, and then run each of those through an isPalindrome helper function until we receive a truthy return value. That works with the exception of large strings - the amount of possibilities for a string of 17 characters, somewhere around 3.55e14, is far larger than the capacity of elements for a JavaScript array. A better solution involves incorporating the frequencies of characters and the length of the string. If a string is a palindrome, it will have an even frequency for each character if the string length is even and conversely, it will have one odd frequency of a character (the rest being even) if the string length is odd.
How would we store these character frequencies? Of course we could iterate over the string once and use an object with keys as characters and values as the respective frequencies. This would definitely work, but let’s challenge ourselves - solve this problem with linear time and constant space. We can achieve constant space if we use specific bits on an integer to hold whether or not a particular character appears an odd or even amount of times.
For every character in the string in the string, we will find its ASCII character code, and flip the specific bit on our integer - if a character appears three times, the bit’s value will be 1, and if a character appears four times, the bit’s value will be 0. It should be known that most strings we encounter will be comprised of alphanumerics and we will be given ASCII codes somewhere between 41 (“A”) and 122 (“z”). In the introduction, we learned that any number that deals with binary operations gets stored as a 32-bit integer - if we try to access the 41th bit of a 32-bit integer to store the frequency of “A”, we will not get our intended outcome. Let’s explore integer overflow and later on we will see how BigInt will provide us a path around this obstacle.
Integer Overflow
Before we explore how integer overflow causes a problem for 32-bit integers, let’s simplify our approach and use 8-bit integers for the time being. The number 1 as an 8-bit integer is expressed as 0000 0001. We can perform many binary operations on this number, but the one we will be looking at is left shift - “<<”. Given a decimal number, left shift will move the 1 to the left the specified number of indices. A left shift of 3 on the number 1, written 1 << 3, will give the binary result 0000 1000, or the decimal value of 8. Remembering that we are dealing with 8-bit integers at the moment, what would happen if we left shifted the number 1 by 8? The result may be surprising, but 1 << 8 would come out to 0000 0001 or 1, in their respective binary and decimal forms. This is one way you can achieve integer overflow - shifting the 8th bit will return the 1 to the smallest position in the binary number at the right.
Because this principle will apply to numbers with any specified amount of bits, we will now return to our 32-bit example and see how integer overflow will get in the way of us solving Permutation Palindrome in constant space. Using a 32-bit integer to store frequencies of characters, lining up their ASCII codes with specific bits, presents an obvious problem - most of the characters we will deal with are alphanumerics and the ASCII values range from 41 to 122. For the letter “a”, with an ASCII code of 41, we would have to overflow the integer once and land at the 10th index of the integer to record an odd frequency - the integer would look like …0010 0000 0000. Additionally, if we wanted to record an occurence of the letter “I”, with an ASCII value of 73, we would overflow the integer twice and land at the 10th index again. We would flip the bit and the binary representation of our integer would …0000 0000 0000. That’s a big problem if we just recorded two characters with one occurrence each, but our integer says it’s either never seen a recording at the 10th index or the frequency of that specific character is even.
To solve this, one would intuitively think to use an integer with a higher bit capacity - maybe a 128-bit integer would suffice. Sadly, those don’t exist in the JavaScript language, nor do they in most others. All is not lost, the BigInt implementation is a TC39 - Stage 4 proposal, which means it is a standard part of the language and available in most mainstream browsers and in Node. Let’s explore how using BigInt will help us achieve a constant space solution to Permutation Palindrome.
BigInt
You can think of BigInt as an object with a few key properties:
{
type: 'BigInt',
sign: 0,
num_digits: 3,
digits: [0x12…, 0x34…, 0x56…],
}While the type property is self-explanatory, let’s dive into the others. The property sign intuitively keeps track of whether the specific instance of BigInt is positive or negative. Let’s now what digits and num_digits means. You can think of each element inside the digits array as one 32-bit integer. For our purposes, the first element would be the 0th-31st indices - subsequently, the second element would represent the 32nd-63rd indices, and so on. While we don’t have a single number to keep track of what we need, we do have a dynamic object that provides an API to do comparisons and operations with other BigInt objects. There are a couple interesting methods and data structures associated with BigInt, but we will continue to use BigInt to solve Permutation Palindrome in constant space.
Bringing back our original scenario, to record an occurrence of the character “a”, ASCII code 41, the second element in the digits array would be selected and the 10th bit would be flipped. Subsequently, when recording the occurrence of “I” with an ASCII code of 73, the BigInt object will find the appropriate digit and flip the appropriate bit on that integer - no overflow issues.
Check out the solution below to Permutation Palindrome and try to check out where and how BigInt’s are used:
const permPalin = str => {
/**
* Storing frequency of characters in an integer (BigInt)
*
* Achieves constant space by using an integer to memoize the character frequencies.
*
* - If string length is even, the bitVector should be 0.
* - If string length is odd, the bitVector should contain one bit of value 1
*
* Time Complexity: O(n)
* Space Complexity: O(1)
*
*/
/**
* toggles specific bit at index on the bitVector
*
* @param {BigInt} bitVector
* @param {Integer} index
*
* toggle(0b1000, 3) => bitVector = 0
* toggle(0, 3) => bitVector = 0b1000
*/
const toggle = (bitVector, index) => {
// index out of bounds
if (index < 0) return bitVector;
// mask used for comparison against bitVector, detecting whether specific bit is 1 or not
// index = 3, mask = 0b1000
const mask = BigInt(1 << index);
// if ANDing bitVector and mask results in 0, the specific bit at index will be set to 1
// 0b1000 & 0b100 = 0 => bitVector = 0b1000 | 0b100 = 0b1100
if ((bitVector & mask) === 0n) {
bitVector = bitVector | mask
}
// else the specific bit on bitVector has a value of 1,
// the bitVector is ANDed with the inverse of mask
// if 0b10 & 0b10 = 0b10 then bitVector = 0b10 & 0b01 = 0
else {
bitVector = bitVector & ~mask
}
return bitVector;
}
// BigInt to handle bit indexes over 53
let bitVector = 0n;
const strLengthEven = str.length % 2 === 0;
// toggle on/off each character of the string using character codes
str.split('').forEach(char => {
bitVector = toggle(bitVector, char.charCodeAt())
});
if (strLengthEven && bitVector === 0n) return true
// if bitVector = 0b1000 and bitVector - 1 = 0b111, ANDing them gives the result 0,
// indicating only one bit on the integer had a value of 1
// conversely, if bitVector = 0b1100 and bitVector - 1 = 0b1011, ANDing gives the result 0b1000,
// indicating more than one specific bit on bitVector had a value of 1
else if (!strLengthEven && ((bitVector & (bitVector - 1n)) === 0n)) return true
else return false
}There are a few bitwise operations and conventions used (bit mask, inverse, AND operator) that will not be discussed at the moment, hopefully the comments in the code are useful and references will be made at the bottom of this article. What is important is how BigInt objects are made and how they are compared and manipulated. The first most common scenario for creating a BigInt object is to invoke it - BigInt(number). The second is the use the letter n after a normal decimal number - either of these will return the same result.
Dealing with comparisons, it is important to remember that regular numbers and BigInt objects cannot be compared against each other. A BigInt and a regular number cannot be operated upon in the same expression - you cannot add a BigInt to a regular number. This explains why most numbers in the solution to Permutation Palindrome are numbers, even just the numbers 0 or 1. The constant strLengthEven is not a BigInt because it never comes into comparison with another BigInt.
Conclusion
In a situation where we would need a number of indices to keep track of boolean frequency (even or odd in our case), we could of course use an array, but we also learned how to use a number - sort of. A BigInt is obviously not a primitive data type like a number, but it allows us to use an object like a number to gain access to two a couple incredible features that we normally don’t have - integer representation over 2**53, and what we explored in our article, access to binary-like representations over 32-bits in length. If you have ever been handicapped by the capability or precision of the primitive number type, BigInt may be able to help you come to a more optimized solution.
Additional Resources
Adding BigInts to V8
BigInt: arbitrary-precision integers in JavaScript
BigInt MDN Documentation
Permutation Palindrome Solution with BigInt