Does my NodeJS application leak memory? – 2

Memory Management, Introducing the Stack

Introduction

Catching up from last time, a memory leak happens when regions of memory cannot be reclaimed for reuse. The process of reclaiming memory is the job of the garbage collector which is a part of the V8 engine. A garbage collection cycle or an event is triggered when the memory requested by an application is not available. Goes without saying that garbage collection(GC) and memory management work hand in hand. Even though the inner workings of memory management are a core part of V8 and the OS(Linux, Windows…) and do not affect day to day programming, it is important to get a certain degree of understanding to appreciate how it affects leaks and performance issues.

In this post I will talk a little bit of the overall memory organization and attempt to explain how the stack works in a very simplistic way.

Memory Organization

Memory is organized into multiple sections or segments, each section with a specific purpose. The two segments that are relevant to us are the stack and the heap. The stack is used to track and manage executing functions. In this functions store their arguments,  local variables, and housekeeping data on the stack. The heap is used for storing everything else that the stack does not store such as objects and large data.

In V8 the stack only stores immediate small integers(Smi). These are 31 bit signed integers which are only stored on the stack and never on the heap. The stack also stores references (called handles in V8 speak) to data stored on the heap.

The following figure(2.1) makes it clear:

memory-management.001

In the figure above:

  1. The local variable aSmallInt is a 31 bit signed integer and is hence placed directly on the stack.
  2. Variable aFloat and anObject are placed on the heap with references stored on the stack.

The Stack

A stack is simple LIFO (Last In First Out) data structure akin to a stack of pancakes or bills. The stack grows down, in other words, it starts off at the highest address and grows down in memory addresses. Every executing function pushes its local variables and arguments on the stack along with  housekeeping data such as the return address.

The stack has the following important properties:

  • It is small in size (984K Bytes for 64 bit systems, 492K bytes for 32 bit), very fast and extremely efficient in managing its resources.
  • It is statically allocated and de-allocated, that is, no special request has to be made for allocation. Stacks are managed automatically.
  • A stack is sub divided into stack frames. A stack frame is the area in the stack allotted to each function. Each stack frame houses the calling function’s arguments, local variables and housekeeping data.
  • A typical stack stores value types such as integers, floats and booleans. However in V8 the stack only stores immediate small integers(31bit).

For the stack to manage itself it maintains two pointers. The stack pointer and the base pointer. The stack pointer(SP) as the name suggests points to the last value pushed to the stack. In other words it is always pointing to the top of the stack. The base pointer (a.k.a the frame pointer) points to the start of the frame.

The best way to understand the operation and simplicity of the stack is to go through an example. Consider a simple piece of code in which a function add is called and the result put the variable ret.

Step – 1: The state of the stack at the start is:

  • The base pointer is just starting up so contains null.
  • The stack pointer is pointing to its latest push.
  • Each address slot in the stack is 4 bytes(32 bit) away from its immediate neighbor.
  • The highest address is at the bottom. The stack pointer moves up, towards lower memory addresses.

memory-management.002

Step – 2: Before add is called space is reserved for the return variable ret and arguments n and m and set to values 2 and 3.

Note:

  • The stack pointer moved up after every push.
  • Each memory address is offset by 4 bytes from the base pointer.

memory-management.003

Step – 3: Function add is now called. The return address of add is saved(pushed). This is the address where the function will return with the results.

Note:

  • I have used the term ‘return address’ instead of the actual memory address in hex. This is for pure simplicity.
  • The pointer which points to the next instruction to be executed is called the instruction pointer(eip).

memory-management.004

Step – 4: Before starting to execute add the address of the old frame is first saved. The base pointer now moves to the new frame.

Note:

  • The housekeeping data consists of return address and the frame address.
  • Return address takes back the program where the program left off.
  • The frame address(value of base pointer) allows the stack to connect one frame with another.

memory-management.005

Step – 5: On entering function add, space for variable r is reserved.

Note:

  • The stack frame consists of  the arguments(2 and 3), the address of the previous frame, the return address and the local variable r.

memory-management.006

Step – 6: Addition of n and m is performed and the result of 5 is put into the variable r and now its time to return. This means items need to be popped off the stack. In other words unwind the stack.

memory-management.007

Step – 7: The result of 5 is popped off the stack and held in a temporary register(eax). The stack pointer moves down.

Note:

  • The temporary register is located in the CPU and is named “eax”.

 

memory-management.008

Step – 8: The saved address of the previous frame is popped off and the base pointer moves back to the previous frame. The return address is popped off and control is returned to module. The result of 5 is put into the variable ret.

memory-management.009

Step – 9: The frame is now popped off, stack is empty and the program terminates.

Note:

In reality the stack is not empty at this pointer. It has frames for the module which is calling  program. However we do not need to go into that kind of detail.

Summary

The important lesson to learn here is that the stack is merciless in terms of cleaning up after itself. It does not need any garbage collection process. However, it does hold the key to the garbage collection process. Namely any references(handles) to objects on the heap. To understand that connection we need to look at the heap next which I will explain in my next post.

 

References:

For this post I have relied heavily on two excellent posts by Gustavo Duarte:

Other well written posts that I have referenced:

Advertisements

One thought on “Does my NodeJS application leak memory? – 2

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s