Ira Pohl CMPS 012A: Homework 5

C Course Homework

Homework 5 - Insertion Sort and Binary Lookup

Note: Due to inconsistencies with the clock() function in time.h, you don't have to do the timing portion of this homework assignment. However, instead of doing the timing, you will have to write some code to test your binary lookup routine (only for the smallest array, which has 10 elements). I have removed references to timing (or marked them as optional).

Here is some sample output...

Things You Will Need to Know/Learn:

Arrays (Chapters 8 and 9)
Using C's Random Number Generator
Insertion Sort (as described in lecture)
Timing the duration of a function using time.h (see the book's appendix). This is optional and not necessary for this assignment.

Program Description

For this homework assignment, you will write a sorting routine and a lookup routine, and test both for correctness. You will be implementing a function which performs Insertion Sort (a sorting method) to sort an array of integers in increasing order. The Insertion Sort method will be described in detail in class. You will also be writing a function which performs a binary lookup to see if an element exists in the array. Note that although these two routines are the only code you are required to implement as functions, you may also wish to implement other parts of the program as functions to make your program easier to read and understand. To test your sorting routine you will have to build random arrays of various sizes by using the random number generator to fill the array slots.

Insertion Sort Routine

The first function you will have to write is the function which does the actual sorting. This function will work on an array, data, containing size elements. The prototype for the function is shown below (you should not modify the function prototype in any way).

void insertion_sort(int data[], int size);
/* An integer array, "data", with "size" elements is sorted. */

Briefly, an insertion sort assumes that the first k elements of an array are already sorted. The sorting method then takes the (k+1)st element and moves it back through the array until it is in its appropriate position. So if we have an array with 6 elements (3, 7, 18, 9, 15, 6), it is sorted as follows:

The first 1 elements (3) are assumed to be sorted (always true as long as the array is non-empty). The 2nd element (7) is moved back in the array until it is in the appropriate position for the elements checked so far. 7 is already in the correct position, so the array looks like:
3, 7, 18, 9, 15, 6
The first 2 elements (3, 7) are assumed to be sorted. The 3rd element (18) is moved back in the array until it is in the appropriate position for the elements checked so far. 18 is already in the correct position, so the array looks like:
3, 7, 18, 9, 15, 6
The first 3 elements (3, 7, 18) are assumed to be sorted. The 4th element (9) is moved back in the array until it is in the appropriate position for the elements checked so far. 9 is swapped with 18, so the array looks like:
3, 7, 9, 18, 15, 6
The first 4 elements (3, 7, 9, 18) are assumed to be sorted. The 5th element (15) is moved back in the array until it is in the appropriate position for the elements checked so far. 15 is swapped with 18, so the array looks like:
3, 7, 9, 15, 18, 6
The first 5 elements (3, 7, 9, 15, 18) are assumed to be sorted. The 6th element (6) is moved back in the array until it is in the appropriate position for the elements checked so far. 6 is swapped with 18, then 6 is swapped with 15, then 6 is swapped with 9, and finally 6 is swapped with 7, so the array looks like:
3, 6, 7, 9, 15, 18
Finally, the first 6 elements are assumed to be sorted which is how many elements we initially started off to sort. Done.
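
The walkthrough above can be sketched in C roughly as follows. This is only one possible shape, not the official solution: it assumes the loop ranges mentioned in the grading notes (outer loop from 0 to size-2, inner loop from i+1 down to 1) and stops the inner loop early once the element is in place. The assert is an added sanity check, not part of the assignment.

```c
#include <assert.h>

/* Sort the first "size" elements of "data" into increasing order. */
void insertion_sort(int data[], int size)
{
    int i, j, tmp;

    assert(size >= 0);   /* sanity check added for this sketch */

    /* data[0..i] is already sorted; insert data[i+1] into it. */
    for (i = 0; i < size - 1; i++) {
        for (j = i + 1; j >= 1; j--) {
            if (data[j] < data[j - 1]) {
                /* Conditional swap: move the new element back one slot. */
                tmp = data[j];
                data[j] = data[j - 1];
                data[j - 1] = tmp;
            } else {
                break;   /* element has reached its position */
            }
        }
    }
}
```

Calling insertion_sort on the example array (3, 7, 18, 9, 15, 6) with size 6 performs the same swaps as the trace above.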

Binary Lookup Routine

The second function you have to write is a way of looking up or finding an element in a sorted array if the element exists. The method you should use is a binary lookup. Binary lookup takes a key as input (the key which is being looked for) and looks in the middle of the sorted array of data. The key is repeatedly (or recursively) compared to the middle element until the key is found, or all possible locations in the array where key might be found have been searched. Each of these comparisons has three possible outcomes:

key = middle element ==> the key has been found at the position (index) of the middle element.
key < middle element ==> the key must be located in the left half (smaller half) of the sorted array if the key exists.
Repeat the comparison with only the left half of the array using the middle element of the left half to compare against.
key > middle element ==> the key must be located in the right half (larger half) of the sorted array if the key exists.
Repeat the comparison with only the right half of the array using the middle element of the right half to compare against.

By repeating this process, the key will eventually be found if it exists; otherwise the next half of the array to be checked (the left or right half) will be empty (contain no elements), which implies that the key does not exist in the sorted array.

The binary lookup routine looks for a key in a sorted array, data, with size elements, and stores the index (position) where the key is found (if it exists) in the memory location position. If key is not found in the array, the function returns 0 and sets position to the value of the nearest element in the array. On the other hand, if key is found, the function returns 1 and sets position to the index where key was found. Here is the function prototype (again, you should not change this):

int lookup(int data[], int size, int key, int *position);
/* Lookup "key" in "data". Return as the position value, the */
/* index "key" is found at or the value of a nearest element. */
/* If "key" is not found, the return value should be 0. If it */
/* is found, the return value should be 1. */
/* "data" is assumed to be sorted. */
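
A minimal sketch of a routine with this prototype is shown below. It is an illustration, not the official solution. It assumes the iterative form with left and right indices described in the grading notes, and on a miss it stores the index that was checked last, which is one reading of "nearest element"; storing the element's value instead would be a different reading of the same comment. The assert is an added sanity check.

```c
#include <assert.h>

/* Look up "key" in the sorted array "data" with "size" elements.
 * Returns 1 and stores the matching index in *position if found;
 * returns 0 and stores the last index checked otherwise (assumption). */
int lookup(int data[], int size, int key, int *position)
{
    int left = 0;           /* leftmost index still to be checked  */
    int right = size - 1;   /* rightmost index still to be checked */
    int mid = 0;

    assert(position != 0);  /* sanity check added for this sketch */

    while (left <= right) {
        mid = (left + right) / 2;
        if (key == data[mid]) {
            *position = mid;
            return 1;
        } else if (key < data[mid]) {
            right = mid - 1;    /* continue in the left half  */
        } else {
            left = mid + 1;     /* continue in the right half */
        }
    }

    *position = mid;
    return 0;
}
```

For the array sizes in this assignment (at most 10000), left + right cannot overflow an int; for very large arrays, left + (right - left) / 2 is the safer midpoint formula.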

Testing Your Program and Main

Your main program will have to test 4 different array sizes, 10, 100, 1000, and 10000 elements. You will have to do the following for each array:

Fill the array with random integers using rand().
For array size 10 print out the contents of the array.
Sort the array using the insertion sort routine.
For array size 10 print out the contents of the array.
Test the lookup procedure on the sorted array by doing 100 random lookups. For array size 10 do only 20 random lookups and print out the results of the 20 lookups (see the sample output).
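
The fill step and a simple correctness check for the sort step might look like the sketch below. The helper names fill_random and is_sorted and the constant RANGE are hypothetical, chosen only for illustration. Reducing the values with '%' is optional; it just makes lookups on the 10-element array more likely to hit.

```c
#include <assert.h>
#include <stdlib.h>

#define RANGE 100   /* hypothetical upper bound on the random values */

/* Fill the first "size" elements of "data" with values in 0..RANGE-1. */
void fill_random(int data[], int size)
{
    int i;

    assert(size >= 0);   /* sanity check added for this sketch */
    for (i = 0; i < size; i++)
        data[i] = rand() % RANGE;
}

/* Return 1 if the first "size" elements are in increasing order
 * (equal neighbors allowed), 0 otherwise. */
int is_sorted(int data[], int size)
{
    int i;

    for (i = 1; i < size; i++)
        if (data[i] < data[i - 1])
            return 0;
    return 1;
}
```

Calling is_sorted after the sort routine gives a quick check for the larger arrays, which are too long to inspect by eye.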

As for hw4, your program should not take any user input.

Grading

Style -- 2 points total

proper comments and variable names 1 point
- Name, class, date, instructor, program description, and useful
  in-line comments.
- Variable names which hint towards the use of the variable.
proper indentation, use of whitespace, and overall appearance 1 point

Correctness -- 6 points total

correct main() 2 points
- For 4 different array sizes (10, 100, 1000, and 10000), the main program should (1) fill the array with random numbers, (2) print the array if size == 10, (3) sort the array, (4) print the array if size == 10, (5) do 100 random binary lookups on the sorted array (if size == 10, only do 20 lookups and print out the results). This can be done with a maximum-size array or by explicitly writing out the code. The main program should not accept any user input.
correct implementation of insertion_sort 2 points
- Part 1 - overall (1 point):
  The function should have the prototype:
  void insertion_sort(int data[], int size);
  
  The student should not have changed the prototype. The function should contain two nested loops. The inner loop should contain a conditional swap.
- Part 2 - insertion sort specifics (1 point):
  The function should implement insertion sort (not bubble sort or other sorting algorithm).
  Make sure the outer loop starts at 0 and ends at size-2 and the inner loop starts at i+1 and ends at 1, given the swap condition shown above. The students' solutions may have slightly different ranges for their loop variables, but in this case check to make sure no "off-by-one" errors exist. In other words, make sure they don't index off the end of the array.
correct implementation of lookup 2 points
Part 1 - overall (1 point):
- The function should have the prototype:
  int lookup(int data[], int size, int key, int *position);
- The student should not have changed the prototype. The function should contain variables which store the leftmost index to be checked and the rightmost index to be checked.
- The function should initialize the leftmost index to 0 and the rightmost index to size-1.
- The function should also contain a way of keeping track of the current range's midpoint. This should be updated every time the leftmost or rightmost index changes using the formula, midpoint = (leftIdx + rightIdx)/2.
- Before exiting the function must set *position to the midpoint (indicating the index at which key was found or the index last checked). The function should return 0 if key is not found and 1 if key is found.
Part 2 - within the loop (1 point):
- The function should contain a loop which is executed until the key is found (key == data[midpoint]), or until no more elements are left to be checked (leftIdx > rightIdx).
- The body of the loop should be an if-else if-else clause.
- One condition is for key == data[midpoint], a second condition is for key < data[midpoint], and a third condition is for key > data[midpoint]. In the first condition, the function must terminate and return 1. In the second condition, the function must update the rightIdx to be midpoint-1, and run the loop again (as long as leftIdx <= rightIdx). In the third condition, the function must update the leftIdx to be midpoint+1, and run the loop again (as long as leftIdx <= rightIdx). (See my sample solution if you have questions.)

Program execution -- 2 points total

Compile the program using "% gcc hw5.c".
Run the program and verify:

For 1 point:
The program prints the contents of the array for array size 10 BEFORE sorting.
The program prints the contents of the array for array size 10 AFTER sorting--verify that the output is sorted.
For 1 point:
The program prints the results of 20 random binary lookups for array size 10--verify that the binary lookup routine returns 0 (not found) or 1 (found) properly and sets *position appropriately. If the student does not give you enough information in the output to verify that a key was found or not (and at which position), do not award this point.

You can see the sample output to see what my program produced. Note that the student's output may differ, but should convey the same information. Do not deduct points if the student did not reduce the number of possible random numbers using the '%' operator.

Deductions

Deduct 3 points if the program does not compile.
(use "gcc <filename.c>" or "gcc <filename.c> -lm")

Hints

Check out the sample output.
To run your code on the different array sizes (10, 100, 1000, and 10000), you can declare an array of size 10000, and then do a loop which varies the actual number of elements used in an iteration. Here's some pseudo-code for main:

int main(void)
{
    declare array of size MAXSIZE (assuming you #define MAXSIZE to 10000);

    loop (size used varies from 10 to 100 to 1000 to MAXSIZE) {
        Fill array;
        Print array (if size used equals 10);
        Sort array;
        Print array (if size used equals 10);
        Do 100 binary lookups on array (if size used equals 10, only do
        20 binary lookups and print the results of each lookup similar
        to what is shown in the sample output);
    }
}

To show an example, if we are in the third iteration of the loop, we will be using the same variable which refers to the array we declared at the beginning of main, but throughout that iteration of the loop we will only use the first 1000 elements (index ranges from 0 to 999). Similarly, for the first iteration we only use the first 10 elements, and for the second iteration we only use the first 100 elements.

Bubble Sort vs. Insertion Sort

From a high-level perspective, bubble sort repeatedly starts at the end of the array, and "pushes" (by swapping elements) the smallest element to the beginning of the array.
Insertion sort repeatedly starts at the beginning of the array and inserts the next element into the already-sorted part, one element at a time (like sorting a deck of cards).
From more of a programming perspective, bubble sort and insertion sort have similarities and differences.

Similarities between Bubble Sort and Insertion Sort:

Both sort an array of elements.
Both have two nested loops.
Both have an outer loop which varies the loop variable (let's call the outer loop variable i) from the beginning of the array to the end of the array.
Both have an inner loop which determines how many swaps might have to be done. (Let's call the inner loop variable j.)
For both, the number of times the inner loop is executed is the same as the number of possible swaps which may occur (for that iteration of the outer loop).
Both do a swap within the inner loop if the jth element in the array is smaller than the (j-1)th element.

The range of the inner loop is where bubble sort and insertion sort differ.

Differences between Bubble Sort and Insertion Sort:

Bubble sort has an inner loop which varies j from the end of the array down to the value of the outer loop variable, i.
Insertion sort has an inner loop which varies j from the value of the outer loop variable, i, down to the beginning of the array.
Therefore, in bubble sort the number of inner loop iterations (same as number of possible swaps) decreases by 1 with every iteration of the outer loop.
On the other hand, in insertion sort the number of inner loop iterations increases by 1 with every iteration of the outer loop.
So bubble sort starts by working very hard at the beginning and looking at the entire array (doing a lot of swaps), and with every iteration of the outer loop, it slacks off a bit, and by the time it reaches the last iteration of the outer loop, it only has one last possible swap to do.
Insertion sort starts with as little work as possible, and with every iteration of the outer loop, it adds a bit of work, and by the time it reaches the last iteration of the outer loop, it has to consider the entire array.
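
To make the difference in workload concrete, the sketch below counts how many times each inner loop could run for a given outer iteration i. The function names are hypothetical, the loop bounds are one plausible reading of the ranges described above (comparing element j with element j-1), and the insertion sort count is the worst case in which the inner loop never stops early.

```c
#include <assert.h>

/* Inner-loop iterations of bubble sort for outer iteration i:
 * j runs from the end of the array down to i+1. */
int bubble_inner_iterations(int size, int i)
{
    int j, count = 0;

    assert(i >= 0 && i < size);   /* sanity check added for this sketch */
    for (j = size - 1; j > i; j--)
        count++;
    return count;
}

/* Worst-case inner-loop iterations of insertion sort for outer
 * iteration i: j runs from i+1 down to 1. */
int insertion_inner_iterations(int i)
{
    int j, count = 0;

    assert(i >= 0);               /* sanity check added for this sketch */
    for (j = i + 1; j >= 1; j--)
        count++;
    return count;
}
```

For a 6-element array, these bounds give bubble sort 5, 4, 3, 2, 1 possible swaps as i goes from 0 to 4, while insertion sort gets 1, 2, 3, 4, 5.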

The bottom line is both sorting methods accomplish their goal, but they distribute their workloads differently.
YOUR program should implement insertion sort.

Star Solutions - Insertion Sort and Binary Lookup

Here's my version of hw5.