Ira Pohl CMPS109 Advanced Programming Winter 03
Homework 5: File IO and hashing
Write a program that opens a text file and computes an alphabetized list of its words with their frequency of occurrence. Upper and lower case should not be distinguished. The file should be large: at least 50,000 words.
You should use hashing or a balanced tree approach to count and order the words efficiently. Ignore non-word text such as integers and punctuation.
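As one way to picture the word-extraction step, here is a minimal sketch. It assumes that a "word" is a maximal run of letters, so "don't" would split into "don" and "t"; that definition is an assumption of the sketch, not something the assignment fixes. The class name WordScanner is hypothetical.

```java
import java.util.Locale;

// Sketch: pull lowercase words out of one line of text, skipping digits,
// punctuation, and whitespace. WordScanner is an illustrative name only.
class WordScanner {
    private final String line;
    private int pos = 0;

    WordScanner(String line) {
        this.line = line;
    }

    // Returns the next word in lowercase, or null when the line is exhausted.
    String next() {
        int n = line.length();
        // Skip every character that is not a letter.
        while (pos < n && !Character.isLetter(line.charAt(pos))) {
            pos++;
        }
        if (pos >= n) {
            return null;
        }
        int start = pos;
        // Consume the run of letters that forms the word.
        while (pos < n && Character.isLetter(line.charAt(pos))) {
            pos++;
        }
        // Locale.ROOT keeps case folding independent of the machine's locale.
        return line.substring(start, pos).toLowerCase(Locale.ROOT);
    }
}
```

Calling next() repeatedly on each input line yields the words to count; case folding happens here so the container only ever sees lowercase keys.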
Your program must include an original container class that provides functionality at least equivalent to a map (associative array). The implementation of this container must not use any containers or algorithms provided by the environment or language (Java's HashMap, STL's sort(), etc.). Your program must also produce the alphabetized word list from the contents of your container without using sorting functionality provided by or derived from the environment or language.
The output should be an alphabetized list of all the words with their frequency.
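One possible shape for such a container is sketched below, assuming a balanced-tree design (an AA tree, a simplified red-black variant). It is an illustration under stated assumptions, not a required design: the names WordTree, add, get, and appendSorted are hypothetical, and the sketch assumes that comparing keys with String.compareTo is acceptable because it is a comparison, not a container or sorting routine. An in-order walk of the tree visits the keys in alphabetical order, so no separate sort is needed.

```java
// Sketch of a word-frequency map built on an AA tree (a balanced binary
// search tree). All names are illustrative.
class WordTree {
    private static final class Node {
        final String key;
        int count = 1;   // occurrences of this word
        int level = 1;   // AA-tree level, used for balancing
        Node left, right;
        Node(String key) { this.key = key; }
    }

    private Node root;
    private int distinct;  // number of distinct words stored

    // Records one occurrence of word.
    void add(String word) {
        root = insert(root, word);
    }

    private Node insert(Node t, String word) {
        if (t == null) {
            distinct++;
            return new Node(word);
        }
        int c = word.compareTo(t.key);
        if (c < 0) {
            t.left = insert(t.left, word);
        } else if (c > 0) {
            t.right = insert(t.right, word);
        } else {
            t.count++;       // word already present: no rebalancing needed
            return t;
        }
        return split(skew(t));
    }

    // Right rotation that removes a left child on the same level.
    private static Node skew(Node t) {
        if (t.left != null && t.left.level == t.level) {
            Node l = t.left;
            t.left = l.right;
            l.right = t;
            return l;
        }
        return t;
    }

    // Left rotation that removes two consecutive right links on one level.
    private static Node split(Node t) {
        if (t.right != null && t.right.right != null
                && t.right.right.level == t.level) {
            Node r = t.right;
            t.right = r.left;
            r.left = t;
            r.level++;
            return r;
        }
        return t;
    }

    // Returns the count for word, or 0 if it was never added.
    int get(String word) {
        Node t = root;
        while (t != null) {
            int c = word.compareTo(t.key);
            if (c == 0) return t.count;
            t = (c < 0) ? t.left : t.right;
        }
        return 0;
    }

    int size() {
        return distinct;
    }

    // In-order traversal: appends "word count" lines in alphabetical order.
    void appendSorted(StringBuilder out) {
        walk(root, out);
    }

    private static void walk(Node t, StringBuilder out) {
        if (t == null) return;
        walk(t.left, out);
        out.append(t.key).append(' ').append(t.count).append('\n');
        walk(t.right, out);
    }
}
```

Each insertion or lookup touches O(log N) nodes because the tree stays balanced, and the traversal is linear. A hash table with a hand-written sort would be an equally valid approach.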
The program should take the name of the source file and the name of the destination file from the command line, through String[] args.
Use exception handling to detect any errors, such as an incorrect number of arguments or a file that cannot be opened.
You should create at least one unique exception type of your own that is derived from an appropriate standard exception class.
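A minimal sketch of a custom exception type and an argument check follows. The names BadArgumentsException and Args are hypothetical, and deriving from IllegalArgumentException is only one reasonable choice of standard base class; a checked exception derived from Exception or IOException would also fit.

```java
// Sketch: a program-specific exception derived from a standard one.
// The name and the choice of base class are assumptions of this example.
class BadArgumentsException extends IllegalArgumentException {
    BadArgumentsException(String message) {
        super(message);
    }
}

class Args {
    // Throws BadArgumentsException unless exactly two names were supplied.
    static void check(String[] args) {
        int got = (args == null) ? 0 : args.length;
        if (got != 2) {
            throw new BadArgumentsException(
                "expected 2 arguments (source, destination) but got " + got);
        }
    }
}
```

In main, Args.check(args) would sit inside a try block, with a catch clause that prints the message and exits; problems opening the files would surface separately as IOException or FileNotFoundException.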
Do not use tio; instead, stay with the standard stream and file classes provided by Java. (You should examine the tio code at the end of the text; after doing this exercise it should be clear how tio was developed and how it works.)
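For orientation, here is a small sketch of line-oriented input and output using only the standard java.io classes. The class name LineCopier is hypothetical, and the try-with-resources form assumes Java 7 or later; on an older compiler the same effect needs try/finally with explicit close() calls.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.Writer;

// Sketch: read text line by line and write it back out.
// LineCopier is an illustrative name, not part of the assignment.
class LineCopier {
    // Copies every line from src to dst and returns the number of lines read.
    // Both streams are closed when the method returns.
    static int copy(Reader src, Writer dst) throws IOException {
        int lines = 0;
        try (BufferedReader in = new BufferedReader(src);
             PrintWriter out = new PrintWriter(dst)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line);   // the word counter would process line here
                lines++;
            }
        }
        return lines;
    }
}
```

With files, the call would look like LineCopier.copy(new FileReader(args[0]), new FileWriter(args[1])), where FileReader and FileWriter also come from java.io and their constructors throw an IOException subtype if the file cannot be opened.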
Print the word count of the source file and the time the program took to process it. The order of the computation should be no worse than O(N log N).
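Processing time can be measured with the standard clock calls. The sketch below uses System.nanoTime(), which assumes Java 5 or later; System.currentTimeMillis() is the older alternative. The class name Stopwatch is hypothetical.

```java
// Sketch: measure elapsed wall-clock time around the word-counting work.
class Stopwatch {
    // Captured when the object is created.
    private final long start = System.nanoTime();

    // Milliseconds elapsed since this Stopwatch was created.
    double elapsedMillis() {
        return (System.nanoTime() - start) / 1000000.0;
    }
}
```

Creating the Stopwatch just before the file is opened and calling elapsedMillis() after the output is written gives the figure to print next to the word count.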
Due Date: Submit by 10pm, March 3, 2003.