Module 0081: Types and Records

Tak Auyeung, Ph.D.

November 8, 2018

Contents

1 About this module
2 Life without records
3 Types
 3.1 Integers
 3.2 Boolean
 3.3 String
4 Notation of types
5 Records
 5.1 An example
 5.2 Record formalized
 5.3 Array of records?

1 About this module

2 Life without records

We have got individual (atomic) variables, and we have got arrays. What else do we need (for organizing data)? Let us first consider an example that computes the sum of two complex numbers in listing 1.


Listing 1: complex sum
1define sub complexsum 
2  by value a_r 
3  by value a_i 
4  by value b_r 
5  by value b_i 
6  by reference sum_r 
7  by reference sum_i 
8 
9  sum_r  a_r + b_r 
10  sum_i  a_i + b_i 
11end define sub

Note how a complex number is represented by two parameters. This is because conceptually, a complex number has a real r portion, and an imaginary portion i. This is why parameter a has two components, a_r and a_i.

While the code in listing 1 is correct, it is a bit tedious. Furthermore, it is more prone to errors. Let us assume that we have to complex numbers, x and y. Let x = 3 + 6i (which means the real portion is 3, and the imaginery portion is 6), and y = 2 + 2i. Listing 2 calls complexsum for the calculation:


Listing 2: use complex sum
1local x_r 
2local x_i 
3local y_r 
4local y_i 
5local sum_r 
6local sum_i 
7x_r  3 
8x_i  6 
9y_r  
10y_i  2 
11invoke complexsum x_r  a_r,x_i  a_i,y_r  b_r,y_r  b_r,sum_r  sum_r,sum_r  sum_r

This code is not exactly short. However, more importantly, it is easy to make mistakes. What if I specified x_r to specify parameters a_r and b_r? In the case of a real programming language, we can even change the order of arguments incorrectly because the matching of parameters and arguments is done by positioning, not by parameter names.

3 Types

Before we move any further, we should first discuss what is a type. Up to this module, we do not explicitly state the “type” of a variable or parameter. Our only types up to this point are all integers (or numbers).

The type of a value indicates what can be done with the value.

For example, “23” is a number. As a result, we can add it to another number, yield a sum (that is also a number). We can compare “23” with another number, yielding a boolean value (true or false). We can copy the value “23” into a variable.

However, it makes little sense to say “23 or (x < y)” because 23 is not a boolean type. A disjunction expects both the right hand side and the left hand side be true/false values.

We can also specify the type of a parameter. This way, we can double check to see if an argument (in an invoke statement) match the expected use of a parameter (in a subroutine definition).

Let us now explore the common built-in types of most programming languages.

3.1 Integers

Integers include position whole numbers, negative whole numbers and zero. Examples include -25, 299, 0, 2293, -1028. 4.2 is not an integer because it is not a whole number. π is not an integer because it is not a whole number.

With integers, we can perform the usual operations like multiply, divide, add and subtract. We can also compare integers with the operators <, , >, , =, and . Note that the comparison operators produce values of boolean type, not integer type.

In other words, 2 * 3 is an integer value (6), but 2 < 3 is a boolean value (true).

3.2 Boolean

A boolean value an either be “true” or “false”. Comparison operators result in boolean values. Logical operators like “and”, “or” and “not” both expect and produce boolean values. The condition of a condition statement or loop (prechecking or postchecking) must be of boolean type.

3.3 String

A string is basically some text value. For example, a name like “Tak” is of string type.

Strings can be compared using alphabetical order. For example, (“Tak” < “Auyeung”) is false because the letter ‘A’ is before (smaller) than the letter “T”.

In some scripting languages, strings can be concatinated using the ‘+’ operator. For example, the result of (“Tak ” + ‘Auyeung”) is “Tak Auyeung”.

4 Notation of types

We can add type information to many algorithms that we have explored in other modules. For example, listing 3 is a subroutine that finds factors, but this time with type information for each parameter.


Listing 3: findfactor
1define sub findfactor 
2  by value n : integer 
3  return type : integer 
4  local f : integer 
5  ... 
6end define sub

In this example, we skipped the content of the actual code because that is not important. However, we did indicate the type of each name used in the subroutine. We use a colon (:) to separate a name from its type. Note that the “return type” also has its own type. In this example, the “return type” is integer, which means the invocation of “findfactor” can take the place of any integer (in an expression, comparison and etc).

We can also express the type of each element in an array. We can define the algorithm that finds a value in an array as in listing 4.


Listing 4: binary search
1define sub bsearch 
2  by value a : array of integer 
3  by value k : integer 
4  return type boolean 
5  local b : integer 
6  local e : integer 
7  local m : integer 
8 
9  b  0 
10  e  | a |  1 
11  repeat 
12    m  (b+e)/2 
13    if a[m] < k then 
14      b  m+1 
15    else if a[m] > k then 
16      e  m
17    end if 
18  until (b > e) or (a[m] = k) 
19  return (a[m] = k) 
20  m  
21end define sub

In this example, the parameter a is an array of integers. However, the “return type” is boolean. This mean the invocatin of bsearch can take the place of any boolean (true/false) value.

5 Records

5.1 An example

A record is a named collection of data members, and it serves as a user-defined type. Is this confusing? Let us use some examples.

Getting back to the complex number example, we could change the code as in listing 5 to 7.


Listing 5: record def
1record Complex 
2  i : number 
3  r : number 
4end record

This is the definition of a record. The name of the record is Complex. It is a convention that use an upper case character to begin the name of a record. Next, it has two members, i and r. Both data members, in this case, are of the type number. In general, however, the type of each data member in a record definition is independent to others.

Next, let us redefine the subroutine to add two complex numbers as in listing 6.


Listing 6: complex add
5define sub complexadd 
6  by value a : Complex 
7  by value b : Complex 
8  by reference sum : Complex 
9  sum.r  a.r + b.r 
10  sum.i  a.i + b.i 
11end define sub

We can now change the main program as in listing 7.


Listing 7: use complexadd
12local x : Complex 
13local y : Complex 
14local sum : Complex 
15x.r  3 
16x.i  6 
17y.r  
18y.i  2 
19invoke complexsum x  a,y  b, sum  sum

5.2 Record formalized

After the example, let us discuss records in a more formal and general way.

First of all, the code in listing 5 is the definition of a record type. It does not create a parameter or a variable. You can look at this as “all variables and parameters of the type Complex will look like this...” In other words, Complex is the name of a template, a cookie cutter.

In the definition of a record, we first give the record template a name Complex. This makes it possible to later on define variables and parameters of this type.

i and r are “data members” of the Complex record type. Each data member has a type. Although in this example, both i and r have the same type, the types of data members are independent from each other in general. You can see i and r as the names of components of any variable or parameter of the type Complex.

In listing 6, we utilize the Complex record type. Parameters a, b and sum are all of the type Complex. Using the cookie analogy, Complex is the name of the cookie cutter, but a and b are actual cookies cut out using the Complex cookie cutter. sum is the alias of a cookie cut from the cookie cutter Complex.

When we need to refer to a data member of a variable or parameter of a record type, we use the dot (.) notation. In the subroutine definition, we use sum.r to indicate “refer to just the r component of the record parameter sum”. This dot notation is fairly universal across all programming languages.

5.3 Array of records?

Sure, why not? Let us consider an example in which we define a student record as in listing 8.


Listing 8: student record
1record Student 
2  firstname : string 
3  lastname : string 
4  gpa : number 
5end record

We can store information about students in a class using an array of Student records. Given such an array, we can then search for a student with a particular last name. We can define a binary search subroutine for this purpose as in listing 9.


Listing 9: class binary search
6define sub class_bsearch 
7  by value students : array of Student 
8  by value v : string 
9  return type : boolean 
10  local b : integer 
11  local e : integer 
12  local m : integer 
13  b  0 
14  e  | students |  1 
15  repeat 
16    m  (b + e) / 2 
17    if students[m].lastname < v then 
18      b  m + 1 
19    else if students[m].lastname > v then 
20      e  m  1 
21    end if 
22  until (b > e) or (students[m].lastname = v) 
23  return students[m].lastname = v 
24end define sub

Note that when we compare, we need to specify which part of a Student record to use for comparing. We cannot compare an entire Student record with a string.

We can also define another subroutine to search for the highest GPA in a class, and print out the name of the student with that GPA:


Listing 10: highest GPA
25define sub highest_GPA 
26  by value students : array of Student 
27  local i : integer
28  local m : integer
29  m  0 
30  i  1 
31  while i < | students | do 
32    if students[m].gpa < students[i].gpa then 
33      m  i 
34    end if 
35    i  i + 1 
36  end while 
37  print students[m].firstname " " students[m].lastname " has the highest GPA" 
38end define sub

In this example, we use m as the index of the element in students that has the highest GPA. We start with assuming the first record has the highest score. Variable i is used to scan through the entire array. In each iteration, we check to see if the record at index m has a lower GPA than the one at index i. If that is true, we update m so it is the same as i. Each iteration also advances i so we can view the next element in students in the following iteration.