Specifically your program should recognize strings of the form: "<a href=url>". There can be extra spaces before "href", around the "=" and before the closing ">". There can also be double quotation marks around the url.
All tag designators and URLs are INSENSITIVE to case, so "<A hREf=http://wWw.soe.UcSc.edu>" works just the same as "<a href=http://www.soe.ucsc.edu>". A good method to know about is String.toLowerCase().
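As a rough illustration of the tag form described above, here is a minimal sketch of an href-parsing method. It uses java.util.regex with a case-insensitive pattern rather than String.toLowerCase(), which is a swapped-in technique and not necessarily the approach the assignment intends. The class name HrefSketch and the method name extractHrefs are hypothetical, and the pattern only accepts an a-tag whose sole attribute is href.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the assignment.
class HrefSketch {
    // "<a", spaces, "href", optional spaces around "=", an optional opening
    // double quote, the url itself, an optional closing quote, optional
    // spaces, then ">". Matching ignores case.
    private static final Pattern ANCHOR = Pattern.compile(
            "<a\\s+href\\s*=\\s*\"?([^\"\\s>]+)\"?\\s*>",
            Pattern.CASE_INSENSITIVE);

    // Returns every href value found in the given text, in order of
    // appearance, with the url spelled exactly as it was written.
    static List<String> extractHrefs(String text) {
        List<String> urls = new ArrayList<>();
        Matcher m = ANCHOR.matcher(text);
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        String line = "<A  hREf = \"http://wWw.soe.UcSc.edu\" > and <a href=hw/prog1.html>";
        for (String url : extractHrefs(line)) {
            System.out.println(url);
        }
    }
}
```

The sketch leaves the case of the url untouched, since the sample output further down prints mixed-case urls such as http://www.lulu.com/JavaByDissection. Whether your program should also lower-case the url is a decision the sketch does not make for you.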
A url may or may not begin with a protocol specifier such as "http:" or "ftp:". The protocol specifier will then be followed by a path. You will need to recognize three different forms for the path.
1. //hostname/theRest, for example http://soe.ucsc.edu/classes/cmps012a, where hostname is the name of the server and theRest is the path to the page on that server. The hostname may be empty, in which case it defaults to localhost (that is, literally the string "localhost").
2. A path relative to the base url, for example hw/prog1.html.
3. An absolute path without an explicit server name, for example /classes/cmps012a.
Case one is the easiest: if the path begins with two slashes followed by a non-slash, just print the url (prepending http: if necessary).
For case two you must prepend the base url onto the path value to get the complete url for printing. Determining the base url is discussed below.
For case three you must prepend "//hostname" onto the path value to get the complete url for printing. Determining the hostname is discussed below.
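The three cases can be sketched as a single method. This is only an illustration under stated assumptions: the class name UrlResolveSketch, the method name resolve, and its parameters baseUrl and hostname are hypothetical, and both values are assumed to have been determined already. The sketch treats everything up to the first ":" as the protocol specifier only when that ":" comes before any "/", and it reads an empty hostname as a path starting with three slashes; both are interpretations, not rules stated by the assignment.

```java
// Hypothetical helper class; the names are illustrative, not from the assignment.
class UrlResolveSketch {
    // url:      the href value as found in the tag
    // baseUrl:  assumed to include the protocol and to be ready for direct
    //           concatenation, e.g. "http://www.soe.ucsc.edu/classes/cmps012a/Fall06/"
    // hostname: assumed already known, e.g. "www.soe.ucsc.edu"
    static String resolve(String url, String baseUrl, String hostname) {
        String protocol = "http:";          // default when none is given
        String path = url;

        // Assumption: a ':' that appears before any '/' ends a protocol specifier.
        int colon = url.indexOf(':');
        int slash = url.indexOf('/');
        if (colon > 0 && (slash == -1 || colon < slash)) {
            protocol = url.substring(0, colon + 1);
            path = url.substring(colon + 1);
        }

        // Assumption: "///theRest" means an empty hostname, which defaults to localhost.
        if (path.startsWith("///")) {
            return protocol + "//localhost" + path.substring(2);
        }
        // Case one: "//hostname/theRest" is already complete apart from the protocol.
        if (path.startsWith("//")) {
            return protocol + path;
        }
        // Case three: absolute path without a server name.
        if (path.startsWith("/")) {
            return protocol + "//" + hostname + path;
        }
        // Case two: relative path, so prepend the base url.
        return baseUrl + path;
    }

    public static void main(String[] args) {
        String base = "http://www.soe.ucsc.edu/classes/cmps012a/Fall06/";
        String host = "www.soe.ucsc.edu";
        System.out.println(resolve("//soe.ucsc.edu/classes/cmps012a", base, host));
        System.out.println(resolve("hw/prog1.html", base, host));
        System.out.println(resolve("/classes/cmps012a", base, host));
    }
}
```

Note that case two is plain string concatenation, so the result depends on whether the base url ends with a slash.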
java PrintAnchors startingURL
Here is a test page with a variety of tags (the output for this page is shown below). Note that the "localhost" links are "broken" (by design), in that they would only be valid if you happen to be running a web browser while logged into the soe web server.
Here are some sample executions. The third one shows that you can actually use the "file:" protocol for testing on your local machine without a web server. You will of course have to change the path to make sense for your machine.
os-prompt% java PrintAnchors http://www.soe.ucsc.edu/classes/cmps012a/Fall06/hw/Hw3TestPage.html
http://www.soe.ucsc.edu/classes
http://www.soe.ucsc.edu/classes
http://www.soe.ucsc.edu/classes
http://www.soe.ucsc.edu/classes/cmps012a/Fall06/hw/prog1.html
http://www.soe.ucsc.edu/classes/cmps012a/Fall06/hw/prog1.html
http://www.soe.ucsc.edu/classes/cmps012a/Fall06/labinfo/index.html
http://www.soe.ucsc.edu/classes/cmps012a/Fall06/labinfo/index.html
http://localhost/classes/cmps012a/Fall06/hw/prog2.html
http://localhost/classes/cmps012a/Fall06/hw/prog2.html
os-prompt%

os-prompt% java PrintAnchors http://www.soe.ucsc.edu/~charlie
http://www.soe.ucsc.edu/~charlie/official.html
http://www.soe.ucsc.edu/~charlie/personal.html
http://www.soe.ucsc.edu/~charlie/projects/index.html
http://www.soe.ucsc.edu/~charlie/classes
http://www.soe.ucsc.edu/~charlie/jarel
http://www.mtsu.edu/~untch/karel/
http://www.lulu.com/JavaByDissection
http://www.cse.ucsc.edu/~pohl/java.html
http://www.soe.ucsc.edu/~charlie/research.html
http://www.soe.ucsc.edu/~charliedirections.html
os-prompt%

os-prompt% java PrintAnchors file://localhost/Users/charlie/class/12a/webpage/index.html
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/labinfo
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/faq
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/supplements/
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/hw
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/stars
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/exams
http://ic.ucsc.edu/docs/webct/create-account.php
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/notes
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/labinfo
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/labinfo
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/supplements/supplements.html
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/faq/faq.html
http://www.lulu.com/JavaByDissection
http://www.abebooks.com/servlet/SearchResults?&isbn=0201725991&nsa=1
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/hw/pairProgramming.html
http://www.cs.berkeley.edu/~aiken/moss.html
http://oasas.ucsc.edu/avcue/integrity/
http://ic.ucsc.edu:8000/webct/public/home.pl
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/hw
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/hw
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/labinfo
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/faq
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/supplements/
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/hw
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/stars
http://www.cse.ucsc.edu/classes/cmps012a/Fall06/exams
5 points - properly handles a-tags without extra spaces or quotation marks
5 points - properly handles a-tags with extra spaces
5 points - properly handles a-tags with quotation marks
5 points - properly adjusts the base for base-tags
5 points - properly abstracts href attribute parsing with a method
5 points - properly handles fully specified url paths
5 points - properly handles url paths relative to a base url
5 points - properly handles absolute url paths without an explicit server name