Module 0052: Stopping memory leak and memory corruption

Tak Auyeung, Ph.D.

December 14, 2016

1 About this module

Prerequisites: 0043
Objectives: This module discusses how we can minimize memory corruption and leak using references and reference counters.

2 Background

In module 0043, we discussed how to implement the list ADT using linked lists. In a linked list, each node is dynamically allocated. In our implementation, other than List_delete, the nodes are never deallocated. This can lead to memory leak as we remove nodes in a linked list.

Memory leak and corruption are common when dynamically allocated memory is used in structures that required individual objects to point to other objects. Although it possible to integrate ADT speciﬁc logic to free objects only when it is appropriate, there are also other methods to minimize leak and corruption.

3 The problems

3.1 Memory leak

The ﬁrst problem is memory leak. Observe the following example:

{
  struct X *pX;

  pX = (struct X *)malloc(sizeof(struct X));
  // do something with what pX points to
}

Here, the problem is that the object allocated by malloc is never freed. After we exit this block, pX is both out-of-scope and dead. This means we lost the only way to access the object allocated by malloc.

Although this is a simplistic case that can be avoided, we encounter similar situations very often. In actual programming situations, pX is most likely a pointer data member of an object that is dynamically allocated itself.

3.2 Memory corruption

The second problem is corruption due to doubly freeing objects:

{
  struct X *pX, *pY;

  pX = pY = (struct X *)malloc(sizeof(struct X));
  // do something with pX and pY
  free(pX);
  // continue to do something with pY
  free(pY);
}

We have a serious problem here. After free(pX), the object is already freed! This means the code that uses pY after free(pX) is using an object that is freed already. Depending on whether dynamic memory is reallocated or not in this code, symptoms may or may not show up. The free(pY) statement is wrong, as well, since the object is already freed by free(pX) earlier.

Again, the example is just here to illustrate the nature of the problem. In reality, pX and pY can be two pointer data member in two diﬀerent dynamically allocated object. That makes it nearly impossible to see that we may continue to use what a pointer points to when the object is already freed.

3.3 The Right solution

C is really ill-suited to handle complex data structures that involve many pointers in dynamically allocated objects pointing to other dynamically allocated objects. If a program or algorithm requires such complex structures, a programmer should consider using some other programming languages.

C++ is better than C in many ways. For example, C++ has constructors and destructors. There member methods are called automatically when an object is created and destroyed, respectively. They can be utilized to invoke the logic to clean up dynamically allocated objects.

Better yet, scripting languages like Perl deals with dynamically allocated objects natively. There is no free or delete. Java has similar capabilities.

4 Alias and deletable

Just so that we do not use well-deﬁned terms like “pointer”, “reference” and “handle”, we’ll deﬁne some new terms.

A deletable is an object that is dynamically allocated, and it is can be deleted (freed or deallocated, same meaning). Not only that, a deletable also knows the logic of deleting itself. In other words, when a deletable is deleted, it may trigger the deletion of other deletables.

An alias is a way to access a deletable. It is “kind of” like a pointer, but it is not. There are two kinds of aliases. A named alias is one that is either a variable (with a name). A nameless alias is one that is in dynamically allocated memory.

4.1 ADT

The ADT of an alias is deﬁned as follows:

initialize nameless alias: this operation initializes an alias that is a part of an object that is dynamically allocated. A nameless alias is an alias of nothing after initialization.
initialize named alias: this operation initializes an alias that is a variable. A named alias is an alias of nothing after initialization.
clone nameless alias: this operation changes a nameless alias so it becomes the clone of another alias (named or nameless).
clone named alias: this operation changes a named alias so it becomes the clone of another alias (named or nameless).
alias of nothing: this is a predeﬁned named alias that is an alias of nothing.
is alias of nothing: this returns a boolean to indicate whether an alias is an alias of nothing.
get payload: this returns the payload of the deletable of an alias. If an alias is an alias of nothing, then the payload is a NULL pointer.
set payload: this sets the payload of the deletable of an alias. Is an alias is an alias of nothing, setting the payload creates a deletable and automatically set the alias to become an alias of the newly created deletable. The payload is speciﬁed by a pointer to a dynamically allocated object, and an array of pointers to functions related to that type of objects.

The ADT of a deletable is as follows:

attach: a deletable can be attached to an alias.
detach: a deletable can be detached from an alias. If a deletable is no longer attached to any named alias directly or indirectly, the detach operation uses the table of functions stored in the alias to logically delete the deletable and its payload.
get payload: returns the payload of a deletable
set payload: sets the payload of a deletable

4.2 Representation

The internal structure of an alias is as follows:

struct _Alias
{
struct _Deletable *pD;
unsigned isNamed : 1; // a bit field
};

The internal structure of a deletable is as follows:

struct _Deletable
{
  unsigned namedRefCount; // direct named reference counter
  unsigned namelessRefCount; // direct nameless ref counter
  void (**methods)(void *);
  void *payload;
  unsigned reached : 1;
};

The array of methods can use an enumerated type for indexing:

enum alias_methods
{
  logicalDelete;
  reach,
  resetReach
};