When dealing with large amounts of data, it is common to want to find a particular piece of data. This is referred to as searching for the desired data.
Sorting a list places the list in a specified order, such as alphabetical order or rank order (smallest to largest).
In situations where we want to repeatedly search for various data elements in a list, having the list presorted can greatly reduce the amount of work to find things.
As we discussed in the first lecture of the semester, the design and analysis of algorithms comprises a major branch of computer science - Theory and Algorithms.
Searching and sorting algorithms play an important role in the field of Theory and Algorithms.
Random search
Given a list of numbers, one way to find the one that we want is to randomly look at elements of the list until we find the one we want.
Suppose we have an array of N integers, and the array has been filled with a set of numbers, and we want to find the element of the array that is equal to m.
Using a random search, we would have code that looks something like this:
int random_search(int list[], int N, int m) { int i; while(1) { // Generate a number between 0 and N-1 i = random(N); // See if that element of the array contains the number we // are looking for if(list[i] == m) break; } return(i); }It should be obvious to everyone that this algorithm is extremely inefficient. But how inefficient?
Best case: The first guess is correct.
Average case: 1/N chance per try => 50/50 chance after N/2 tries. Worst case: Never guess the correct one.
Even though the best case and average case aren't bad, the worst case behaviour suggests that this would not be a good algorithm to use in most cases.
Even worse, if the desired element is not in the list, this search algorithm will never finish.
Linear search
Another way to find a specific element in a list is to simply look at every element of the list in order until we find the one we are looking for. This is called Linear search .
int linear_search(int list[], int N, int m) { int i; for(i = 0; i < N; i++) { if(list[i] == m) break; } return(i); }Note that if the element we are looking for is not in the list, then linear_search() returns N, the size of the list.
This algorithm has the same best case and average case times as the random search we discussed above, but the worst case behaviour is much better.
Best case: The first element in the list is the one we want.
Average case: We look at N/2 elements before we find the one we want.
Worst case: We look at N elements before we find the one we want.
The principle strength of linear search is that it doesn't require that the elements of the list be in any particular order, only that we can step through the list.
Insertion sort
Insertion sort adds one element to the list at a time, placing it in the proper place.
Cards analogy
for(i = 0; i < N; i++) { for(j = 0; j < i; j++) if(sorted_list[j] > list[i]) break; for(k = i-1; k >= j; k--) sorted_list[k+1] = sorted_list[k]; sorted_list[j] = list[i]; }Best case: N Worst case: N^2
Selection sort
Selection sort iterates over the list, placing one element in the correct place each time.
One version, described in the book, finds the smallest element in the list and swaps it with whatever is first, then repeats with each element of the list.
Note that this also uses linear search to find the smallest element each time.
for(i = 0; i < N; i++) { min = i; // Find the ith smallest element for(j = i; j < N-i; j++) { if(list[j] < min) min = j; } // Swap the ith element and the ith smallest element temp = list[i]; list[i] = list[min]; list[min] = temp; }
Bubble sort
bound = N while(1) { if(bound == 0) break; for(i = 0; i < bound; i++) { if(list[i] > list[i+1]) { temp = list[i]; list[i] = list[i+1]; list[i+1] = temp; last = i; } } bound = last; }