Module 0052: Stopping memory leak and memory corruption

Tak Auyeung, Ph.D.

December 14, 2016

1 About this module

2 Background

In module 0043, we discussed how to implement the list ADT using linked lists. In a linked list, each node is dynamically allocated. In our implementation, other than List_delete, the nodes are never deallocated. This can lead to memory leak as we remove nodes in a linked list.

Memory leak and corruption are common when dynamically allocated memory is used in structures that required individual objects to point to other objects. Although it possible to integrate ADT specific logic to free objects only when it is appropriate, there are also other methods to minimize leak and corruption.

3 The problems

3.1 Memory leak

The first problem is memory leak. Observe the following example:

{  
  struct X *pX;  
 
  pX = (struct X *)malloc(sizeof(struct X));  
  // do something with what pX points to  
}

Here, the problem is that the object allocated by malloc is never freed. After we exit this block, pX is both out-of-scope and dead. This means we lost the only way to access the object allocated by malloc.

Although this is a simplistic case that can be avoided, we encounter similar situations very often. In actual programming situations, pX is most likely a pointer data member of an object that is dynamically allocated itself.

3.2 Memory corruption

The second problem is corruption due to doubly freeing objects:

{  
  struct X *pX, *pY;  
 
  pX = pY = (struct X *)malloc(sizeof(struct X));  
  // do something with pX and pY  
  free(pX);  
  // continue to do something with pY  
  free(pY);  
}

We have a serious problem here. After free(pX), the object is already freed! This means the code that uses pY after free(pX) is using an object that is freed already. Depending on whether dynamic memory is reallocated or not in this code, symptoms may or may not show up. The free(pY) statement is wrong, as well, since the object is already freed by free(pX) earlier.

Again, the example is just here to illustrate the nature of the problem. In reality, pX and pY can be two pointer data member in two different dynamically allocated object. That makes it nearly impossible to see that we may continue to use what a pointer points to when the object is already freed.

3.3 The Right solution

C is really ill-suited to handle complex data structures that involve many pointers in dynamically allocated objects pointing to other dynamically allocated objects. If a program or algorithm requires such complex structures, a programmer should consider using some other programming languages.

C++ is better than C in many ways. For example, C++ has constructors and destructors. There member methods are called automatically when an object is created and destroyed, respectively. They can be utilized to invoke the logic to clean up dynamically allocated objects.

Better yet, scripting languages like Perl deals with dynamically allocated objects natively. There is no free or delete. Java has similar capabilities.

4 Alias and deletable

Just so that we do not use well-defined terms like “pointer”, “reference” and “handle”, we’ll define some new terms.

A deletable is an object that is dynamically allocated, and it is can be deleted (freed or deallocated, same meaning). Not only that, a deletable also knows the logic of deleting itself. In other words, when a deletable is deleted, it may trigger the deletion of other deletables.

An alias is a way to access a deletable. It is “kind of” like a pointer, but it is not. There are two kinds of aliases. A named alias is one that is either a variable (with a name). A nameless alias is one that is in dynamically allocated memory.

4.1 ADT

The ADT of an alias is defined as follows:

The ADT of a deletable is as follows:

4.2 Representation

The internal structure of an alias is as follows:

struct _Alias  
{  
  struct _Deletable *pD;  
  unsigned isNamed : 1; // a bit field  
};

The internal structure of a deletable is as follows:

struct _Deletable  
{  
  unsigned namedRefCount; // direct named reference counter  
  unsigned namelessRefCount; // direct nameless ref counter  
  void (**methods)(void *);  
  void *payload;  
  unsigned reached : 1;  
};

The array of methods can use an enumerated type for indexing:

enum alias_methods  
{  
  logicalDelete;  
  reach,  
  resetReach  
};

4.3