We have got individual (atomic) variables, and we have got arrays. What else do we need (for organizing data)? Let us first consider an example that computes the sum of two complex numbers in listing 1.
Note how a complex number is represented by two parameters. This is because conceptually, a complex number has a real r portion, and an imaginary portion i. This is why parameter a has two components, a_r and a_i.
While the code in listing 1 is correct, it is a bit tedious. Furthermore, it is more prone to errors. Let us assume that we have to complex numbers, x and y. Let (which means the real portion is 3, and the imaginery portion is 6), and . Listing 2 calls complexsum for the calculation:
This code is not exactly short. However, more importantly, it is easy to make mistakes. What if I specified x_r to specify parameters a_r and b_r? In the case of a real programming language, we can even change the order of arguments incorrectly because the matching of parameters and arguments is done by positioning, not by parameter names.
Before we move any further, we should first discuss what is a type. Up to this module, we do not explicitly state the “type” of a variable or parameter. Our only types up to this point are all integers (or numbers).
The type of a value indicates what can be done with the value.
For example, “23” is a number. As a result, we can add it to another number, yield a sum (that is also a number). We can compare “23” with another number, yielding a boolean value (true or false). We can copy the value “23” into a variable.
However, it makes little sense to say “23 or (x < y)” because 23 is not a boolean type. A disjunction expects both the right hand side and the left hand side be true/false values.
We can also specify the type of a parameter. This way, we can double check to see if an argument (in an invoke statement) match the expected use of a parameter (in a subroutine definition).
Let us now explore the common built-in types of most programming languages.
Integers include position whole numbers, negative whole numbers and zero. Examples include -25, 299, 0, 2293, -1028. 4.2 is not an integer because it is not a whole number. is not an integer because it is not a whole number.
With integers, we can perform the usual operations like multiply, divide, add and subtract. We can also compare integers with the operators , , , , , and . Note that the comparison operators produce values of boolean type, not integer type.
In other words, 2 * 3 is an integer value (6), but 2 < 3 is a boolean value (true).
A boolean value an either be “true” or “false”. Comparison operators result in boolean values. Logical operators like “and”, “or” and “not” both expect and produce boolean values. The condition of a condition statement or loop (prechecking or postchecking) must be of boolean type.
A string is basically some text value. For example, a name like “Tak” is of string type.
Strings can be compared using alphabetical order. For example, (“Tak” < “Auyeung”) is false because the letter ‘A’ is before (smaller) than the letter “T”.
In some scripting languages, strings can be concatinated using the ‘+’ operator. For example, the result of (“Tak ” + ‘Auyeung”) is “Tak Auyeung”.
We can add type information to many algorithms that we have explored in other modules. For example, listing 3 is a subroutine that finds factors, but this time with type information for each parameter.
In this example, we skipped the content of the actual code because that is not important. However, we did indicate the type of each name used in the subroutine. We use a colon (:) to separate a name from its type. Note that the “return type” also has its own type. In this example, the “return type” is integer, which means the invocation of “findfactor” can take the place of any integer (in an expression, comparison and etc).
We can also express the type of each element in an array. We can define the algorithm that finds a value in an array as in listing 4.
In this example, the parameter a is an array of integers. However, the “return type” is boolean. This mean the invocatin of bsearch can take the place of any boolean (true/false) value.
A record is a named collection of data members, and it serves as a user-defined type. Is this confusing? Let us use some examples.
Getting back to the complex number example, we could change the code as in listing 5 to 7.
This is the definition of a record. The name of the record is Complex. It is a convention that use an upper case character to begin the name of a record. Next, it has two members, i and r. Both data members, in this case, are of the type number. In general, however, the type of each data member in a record definition is independent to others.
Next, let us redefine the subroutine to add two complex numbers as in listing 6.
We can now change the main program as in listing 7.
After the example, let us discuss records in a more formal and general way.
First of all, the code in listing 5 is the definition of a record type. It does not create a parameter or a variable. You can look at this as “all variables and parameters of the type Complex will look like this...” In other words, Complex is the name of a template, a cookie cutter.
In the definition of a record, we first give the record template a name Complex. This makes it possible to later on define variables and parameters of this type.
i and r are “data members” of the Complex record type. Each data member has a type. Although in this example, both i and r have the same type, the types of data members are independent from each other in general. You can see i and r as the names of components of any variable or parameter of the type Complex.
In listing 6, we utilize the Complex record type. Parameters a, b and sum are all of the type Complex. Using the cookie analogy, Complex is the name of the cookie cutter, but a and b are actual cookies cut out using the Complex cookie cutter. sum is the alias of a cookie cut from the cookie cutter Complex.
When we need to refer to a data member of a variable or parameter of a record type, we use the dot (.) notation. In the subroutine definition, we use sum.r to indicate “refer to just the r component of the record parameter sum”. This dot notation is fairly universal across all programming languages.
Sure, why not? Let us consider an example in which we define a student record as in listing 8.
We can store information about students in a class using an array of Student records. Given such an array, we can then search for a student with a particular last name. We can define a binary search subroutine for this purpose as in listing 9.
Note that when we compare, we need to specify which part of a Student record to use for comparing. We cannot compare an entire Student record with a string.
We can also define another subroutine to search for the highest GPA in a class, and print out the name of the student with that GPA:
In this example, we use m as the index of the element in students that has the highest GPA. We start with assuming the first record has the highest score. Variable i is used to scan through the entire array. In each iteration, we check to see if the record at index m has a lower GPA than the one at index i. If that is true, we update m so it is the same as i. Each iteration also advances i so we can view the next element in students in the following iteration.