Garbage Collection in JavaScript

Garbage Collection in JavaScript

·

5 min read

The best way to improve our coding/learning skills is by re-iterating the same or similar task so that the concept has been absorbed completely. Thus I started out with a 151 days coding challenge from 1st Jan 2021. I have been learning at least 1 topic everyday and have been tracking my progress on Twitter.

While I was mostly learning the practical uses of various functions within JavaScript, I came across Akshay Saini who has released a fantastic course on understanding and falling in ❤ with JavaScript - Namaste🙏JavaScript. This playlist currently contains 19 videos, all of which were binge watched by me in 3 days. Yes, it took me 3 days to absorb this playlist because each and every video of this course is filled with such useful content.

On the Episode 16 of this playlist, while explaining how the JS Engine is designed and how it works, Akshay mentioned about Garbage Collection in JavaScript. This is how I decided to write a blog about it.

What is Garbage Collection?

Whenever we create bindings (variables) or functions in JavaScript, there is some memory allocated for the same on the heap area within the JS engine. If you couldn't make sense of what you just read, I would highly recommend you to go watch Episode 16 first and then continue from here.

Once the JS program is executed, these variables and functions are no longer required to be kept in the memory. This is when memory management comes into picture.

In some of the low level languages (such as C), memory management have to be taken care manually by the developer using malloc() and free() functions. Then, there are programming languages like JavaScript which automatically allocates memory when objects are created and frees it when they are not used anymore. This automatic process is taken care by Garbage Collection.

In general, the memory life cycle is as follows:

  1. Allocate the memory you need
  2. Use the allocated memory (read, write)
  3. Release the allocated memory when it is not needed anymore.
//Allocate the memory you need
let a = 1;
var str = "hello";
function add(num){
    return num + 4;
}

//Use the allocated memory (read, write)
a = add(a);
str += "world";
console.log(str);
console.log("Value of a is:" + a);

//Release the allocated memory when it is not needed anymore.

How does Garbage Collector work?

As mentioned above, JavaScript automatically allocates memory when objects are created and frees it when they are not used anymore. However, the general problem of automatically finding whether some memory "is not needed anymore" is undecidable. As a consequence, garbage collectors implement a restriction of a solution to the general problem. Let us understand how this is implemented within JavaScript.

The main concept of garbage collection algorithms rely on the concept of reference. Until the object is pointing to a reference or containing a value within it, the object is treated to be useful. Once all the execution is completed, it is no longer required and that is when the object is said to be "garbage".

//set some memory to the variable named "user"
let user = {
    fname: "John";
}
//data manipulation - write
user.lname: "Doe";
//reading the data 
console.log("Welcome "+user.fname+" "+user.lname);
// Output: Welcome John Doe

//further this data is not required anywhere. Thus we can perform the following
user = null;

In the above example, the last step where (developer specifies) user=null; is executed, JS engine would free up memory which was pointing to the variable named 'user'. This is similar to how a garbage collector would work.

JavaScript implements the Mark and sweep algorithm in order to perform garbage collection. Let's see how this algorithm works!

Mark and Sweep Algorithm

Any garbage collection algorithm must perform two basic operations.

  • It should be able to detect all the unreachable objects
  • It must reclaim the heap space used by these garbage objects and revive space in order to program once again.

The above operations are performed by this algorithm in 2 phases:

  • Mark phase
  • Sweep phase

This algorithm assumes the knowledge of a set of objects called roots. In JavaScript, the root is the global object. Periodically, the garbage collector will start from these roots, find all objects that are referenced from these roots, then all objects referenced from these, etc. Starting from the roots, the garbage collector will thus find all reachable objects and collect all non-reachable objects.

Mark Phase

When an object is created, its "mark" bit is set to 0 (false). In the Mark phase, this bit is set to 1(true) for all the reachable objects. This operations is performed using graph traversal depth first search approach. Using DFS, we start by visiting the root object and then further visit all the objects reachable from the root object and so on until all the reachable objects are visited. If there are more than one root object present, the mark phase would be called on all the root variables.

Algorithm for Mark Phase

  1. When the object is created, "mark" bit is set to 0 (false).

    object.markedBit = false;
    

    MarkPhase1.png

  2. The mark phase is called for all of the objects that are reachable.

Mark(root){
    if(!root.markedBit){
        root.markedBit = true;
    }
}

forEach(obj in root.next){
    Mark(obj);
}

MarkPhase2.png

Sweep Phase

As the name suggests, the garbage collector "sweeps" all the unreachable objects by clearing its memory from the heap. All those objects with markedBit = false are wiped out of the heap memory leaving only those objects which have the markedBit as 1(true).

Now, the markedBit value for all the reachable objects is set to false, since the whole algorithm would be called again - Mark phase followed by the Sweep phase.

Algorithm for Sweep Phase

forEach(obj in heap){
    if(obj.markedBit){
        obj.markedBit = false;
    }
    else
    heap.release(obj);
}

SweepPhase.png

The mark-and-sweep algorithm is called a tracing gabage collector because it traces out the entire collection of objects that are directly or indirectly accessible by the program.

As of 2012, all modern browsers ship a mark-and-sweep garbage-collector. All improvements made in the field of JavaScript garbage collection (generational/incremental/concurrent/parallel garbage collection) over the last few years are implementation improvements of this algorithm, but not improvements over the garbage collection algorithm itself nor its reduction of the definition of when "an object is no longer needed".

I hope this article provided a good understanding on how JavaScript uses the mark and sweep algorithm for garbage collection.

Peace!