Every HTML document should contain certain standard
HTML tags. Each document consists of head and body text.
The head contains the title, and the body contains the
actual text that is made up of paragraphs, lists, and
other elements.
Required elements are shown in this sample bare-bones
document:
<html>
<head>
<TITLE>A Simple HTML Example</TITLE>
</head>
<body>
<H1>HTML is Easy To Learn</H1>
<P>Welcome to the world of HTML.
This is the first paragraph. While short it is
still a paragraph!</P>
<P>And this is the second paragraph.</P>
</body>
</html>
The required elements are the <html>,
<head>, <title>, and <body> tags (and
their corresponding end tags). Because you should include
these tags in each file, you might want to create a
template file with them.
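A template along the following lines is one possible starting point. It is only a sketch: the placeholder text is illustrative and should be replaced in each new document.

```html
<html>
<head>
<TITLE>Replace with your document title</TITLE>
</head>
<body>
<H1>Replace with your main heading</H1>
<P>Replace with your first paragraph.</P>
</body>
</html>
```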
To see a copy of the file that your browser reads to
generate the information in your current window, select
View Source (or the equivalent) from the browser menu.
This is an excellent way to see how HTML is used and
to learn tips and constructs. Of course, the HTML might
not be technically correct.
The <html> element tells your browser that the file contains
HTML-coded information. The file extension .html
also indicates that this is an HTML document and must be used.
(If you are restricted to 8.3 filenames, e.g., LeeHome.htm,
use only .htm for your extension.)
The head element identifies the first part of your
HTML-coded document that contains the title. The title is
shown as part of your browser's window (see below).
The title element contains your document title and
identifies its content in a global context. The title is
displayed somewhere on the browser window (usually at the
top), but not within the text area.
For example, you might include a shortened title of a
book along with the chapter contents: NCSA Mosaic
Guide (Windows): Installation. This tells the
software name, the platform, and the chapter contents,
which is more useful than simply calling the document Installation.
Generally you should keep your titles to 64 characters or
fewer.
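Coded in the head of a document, the example title above might look like the following sketch, which stays well within the suggested length:

```html
<head>
<TITLE>NCSA Mosaic Guide (Windows): Installation</TITLE>
</head>
```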
The second--and largest--part of your HTML document is
the body, which contains the content of your document
(displayed within the text area of your browser window).
The tags explained below are used within the body of your
HTML document.
HTML has six levels of headings, numbered 1 through 6,
with 1 being the most prominent. Headings are displayed
in larger and/or bolder fonts than normal body text. The
first heading in each document should be tagged <H1>.
The syntax of the heading element is:
<Hy>Text of heading</Hy>
where y is a number between 1 and 6 specifying
the level of the heading.
Do not skip levels of headings in your document. For
example, don't start with a level-one heading
(<H1>) and then next use a level-three
(<H3>) heading.
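As a brief illustration, the outline below steps down one level at a time and returns to level two without skipping a level. The heading text is invented for this example.

```html
<H1>User Manual</H1>
<H2>Installation</H2>
<H3>System Requirements</H3>
<H2>Configuration</H2>
```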
Unlike documents in most word processors, carriage
returns in HTML files aren't significant. Multiple spaces
are collapsed into a single space by your browser.
In the bare-bones example shown in the Minimal HTML
Document section, the first paragraph is coded as
<P>Welcome to the world of HTML.
This is the first paragraph.
While short it is
still a paragraph!</P>
In the source file there is a line break between the
sentences. A Web browser ignores this line break and
starts a new paragraph only when it encounters another <P>
tag.
Important: You must indicate
paragraphs with <P> elements. A browser ignores any
indentations or blank lines in the source text. Without
<P> elements, the document becomes one large
paragraph. (One exception is text tagged as
"preformatted," which is explained below.) For
example, the following would produce output identical to
that of the first bare-bones HTML example:
<H1>HTML is Easy To Learn</H1> <P>Welcome to the world of HTML. This is the
first paragraph. While short it is still a
paragraph! </P> <P>And this is the second paragraph.</P>
To preserve readability in HTML files, put headings on
separate lines, use a blank line or two where it helps
identify the start of a new section, and separate
paragraphs with blank lines (in addition to the <P>
tags). These extra spaces will help you when you edit
your files.
NOTE: The </P>
closing tag can be omitted. This is because browsers
understand that when they encounter a <P> tag, it
implies that there is an end to the previous paragraph.
Using the <P> and </P> as a paragraph
container means that you can center a paragraph by
including the ALIGN=alignment
attribute in your source file.
<P ALIGN=CENTER>
This is a centered paragraph. [See the formatted version below.]
</P>
This is a centered paragraph.
HTML supports unnumbered, numbered, and definition
lists. You can nest lists too, but use this feature
sparingly because too many nested items can get difficult
to follow.
Unnumbered Lists
To make an unnumbered, bulleted list,
- start with an opening list <UL>
(for unnumbered list) tag
- enter the <LI> (list item) tag
followed by the individual item; no closing </LI>
tag is needed
- end the entire list with a closing list </UL>
tag
Below is a sample three-item list:
<UL>
<LI> apples
<LI> bananas
<LI> grapefruit
</UL>
The output is:
- apples
- bananas
- grapefruit
The <LI> items can contain multiple
paragraphs. Indicate the paragraphs with the <P>
paragraph tags.
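A list item holding two paragraphs might be coded as in this sketch. The paragraph text is made up for illustration.

```html
<UL>
<LI> apples
<P>Apples keep well in a cool place.</P>
<P>They are usually harvested in the fall.</P>
<LI> bananas
</UL>
```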
Numbered Lists
A numbered list (also called an ordered list,
from which the tag name derives) is identical to an
unnumbered list, except it uses <OL>
instead of <UL>. The items are tagged
using the same <LI> tag. The following
HTML code:
<OL>
<LI> oranges
<LI> peaches
<LI> grapes
</OL>
produces this formatted output:
1. oranges
2. peaches
3. grapes
Definition Lists
A definition list (coded as <DL>)
usually consists of alternating definition terms
(coded as <DT>) and their definitions
(coded as <DD>). Web
browsers generally format the definition on a new line.
The following is an example of a definition list:
<DL>
<DT> NCSA
<DD> NCSA, the National Center for Supercomputing Applications,
is located on the campus of the University of Illinois
at Urbana-Champaign.
<DT> Cornell Theory Center
<DD> CTC is located on the campus of Cornell University in Ithaca,
New York.
</DL>
The output looks like:
NCSA
    NCSA, the National Center for Supercomputing
    Applications, is located on the campus of the
    University of Illinois at Urbana-Champaign.
Cornell Theory Center
    CTC is located on the campus of Cornell
    University in Ithaca, New York.
The <DT> and <DD>
entries can contain multiple paragraphs (indicated by <P>
paragraph tags), lists, or other definition information.
The COMPACT attribute can be used
when your definition terms are very short.
If, for example, you are showing some computer options,
the options may fit on the same line as the start of the
definition.
<DL COMPACT>
<DT> -i
<DD>invokes NCSA Mosaic for Microsoft Windows using the
initialization file defined in the path
<DT> -k
<DD>invokes NCSA Mosaic for Microsoft Windows in kiosk mode
</DL>
The output looks like:
-i  invokes NCSA Mosaic for Microsoft Windows using
    the initialization file defined in the path.
-k  invokes NCSA Mosaic for Microsoft Windows in
    kiosk mode.
Nested Lists
Lists can be nested. You can also have a number of
paragraphs, each containing a nested list, in a single
list item.
Here is a sample nested list:
<UL>
<LI> A few New England states:
<UL>
<LI> Vermont
<LI> New Hampshire
<LI> Maine
</UL>
<LI> Two Midwestern states:
<UL>
<LI> Michigan
<LI> Indiana
</UL>
</UL>
The nested list is displayed as
- A few New England states:
    - Vermont
    - New Hampshire
    - Maine
- Two Midwestern states:
    - Michigan
    - Indiana
Use the <PRE> tag (which stands for
"preformatted") to generate text in a
fixed-width font. This tag also makes spaces, new lines,
and tabs significant (multiple spaces are displayed as
multiple spaces, and lines break in the same locations as
in the source HTML file). For example, the following
lines:
<PRE>
#!/bin/csh
cd $SCR
cfs get mysrc.f:mycfsdir/mysrc.f
cfs get myinfile:mycfsdir/myinfile
fc -02 -o mya.out mysrc.f
rm *
</PRE>
display as:
#!/bin/csh
cd $SCR
cfs get mysrc.f:mycfsdir/mysrc.f
cfs get myinfile:mycfsdir/myinfile
fc -02 -o mya.out mysrc.f
rm *
The <PRE> tag can be used with an
optional WIDTH attribute that specifies the
maximum number of characters for a line. WIDTH
also signals your browser to choose an appropriate font
and indentation for the text.
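For instance, the sketch below asks for a line width of 40 characters. The value 40 is an arbitrary choice for illustration, and a browser is free to ignore the attribute.

```html
<PRE WIDTH="40">
cd $SCR
cfs get mysrc.f:mycfsdir/mysrc.f
</PRE>
```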
Hyperlinks can be used within <PRE>
sections. You should avoid using other HTML tags within <PRE>
sections, however.
Note that because <, >, and & have special
meanings in HTML, you must use their escape sequences (&lt;,
&gt;, and &amp;,
respectively) to enter these characters. See the section Escape Sequences for more information.
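As a small sketch, here is a preformatted fragment of C code (chosen only for illustration) with the special characters escaped:

```html
<PRE>
if (a &lt; b &amp;&amp; b &gt; c)
    printf("b is the largest\n");
</PRE>
```

A browser displays the first line as: if (a < b && b > c)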
Use the <BLOCKQUOTE> tag to include
lengthy quotations in a separate block on the screen.
Most browsers change the margins for the
quotation to separate it from surrounding text.
In the example:
<BLOCKQUOTE>
<P>Omit needless words.</P>
<P>Vigorous writing is concise. A sentence should contain no
unnecessary words, a paragraph no unnecessary sentences, for the
same reason that a drawing should have no unnecessary lines and a
machine no unnecessary parts.</P>
--William Strunk, Jr., 1918
</BLOCKQUOTE>
the result is:
Omit needless words.
Vigorous writing is concise. A sentence should
contain no unnecessary words, a paragraph no
unnecessary sentences, for the same reason that a
drawing should have no unnecessary lines and a
machine no unnecessary parts.
--William Strunk, Jr., 1918
The <ADDRESS> tag is generally used
to specify the author of a document, a way to contact the
author (e.g., an email address), and a revision date. It
is usually the last item in a file.
For example, the last line of the online version of
this guide is:
<ADDRESS>
A Beginner's Guide to HTML / NCSA / pubs@ncsa.uiuc.edu / revised April 96
</ADDRESS>
The result is:
A Beginner's Guide to HTML / NCSA /
pubs@ncsa.uiuc.edu / revised April 96
NOTE: <ADDRESS> is
not used for postal addresses. See "Forced
Line Breaks" below to see how to format postal
addresses.
The <BR> tag forces a line break
with no extra (white) space between lines. Using <P>
elements for short lines of text such as postal addresses
results in unwanted additional white space. For example,
with <BR>:
National Center for Supercomputing Applications<BR>
605 East Springfield Avenue<BR>
Champaign, Illinois 61820-5518<BR>
The output is:
National Center for Supercomputing Applications
605 East Springfield Avenue
Champaign, Illinois 61820-5518
The <HR> tag produces a horizontal
line the width of the browser window. A horizontal rule
is useful to separate sections of your document. For
example, many people add a rule at the end of their text
and before the <address> information.
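The end of a document laid out this way might look like the following sketch, which reuses the address line from this guide. The closing paragraph text is invented.

```html
<P>This is the last paragraph of the document.</P>
<HR>
<ADDRESS>
A Beginner's Guide to HTML / NCSA / pubs@ncsa.uiuc.edu / revised April 96
</ADDRESS>
```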
You can vary a rule's size (thickness) and width (the
percentage of the window covered by the rule). Experiment
with the settings until you are satisfied with the
presentation. For example:
<HR SIZE=4 WIDTH="50%">
displays as a rule four pixels thick that spans half the width of the browser window.
HTML has two types of styles for individual words or
sentences: logical and physical. Logical styles
tag text according to its meaning, while physical
styles indicate the specific appearance of a
section. For example, in the preceding sentence, the
words "logical styles" were tagged as a
"definition." The same effect (formatting those
words in italics) could have been achieved via a
different tag that tells your browser to "put these
words in italics."
NOTE: Some browsers don't attach any
style to the <DFN> tag, so you might
not see the indicated phrases in the previous paragraph
in italics.
If physical and logical styles produce the same result
on the screen, why are there both?
In the ideal SGML universe, content is divorced from
presentation. Thus SGML tags a level-one heading as a
level-one heading, but does not specify that the
level-one heading should be, for instance, 24-point bold
Times centered. The advantage of this approach (it's
similar in concept to style sheets in many word
processors) is that if you decide to change level-one
headings to be 20-point left-justified Helvetica, all you
have to do is change the definition of the level-one
heading in your Web browser.
Another advantage of logical tags is that they help
enforce consistency in your documents. It's easier to tag
something as <H1> than to remember
that level-one headings are 24-point bold Times centered
or whatever. For example, consider the <STRONG>
tag. Most browsers render it in bold text. However, it is
possible that a reader would prefer that these sections
be displayed in red instead. Logical styles offer this
flexibility.
Of course, if you want something to be displayed in
italics (for example) and do not want a browser's setting
to display it differently, use physical styles. Physical
styles, therefore, offer consistency in that something
you tag a certain way will always be displayed that way
for readers of your document.
Try to be consistent about which type of style you
use. If you tag with physical styles, do so throughout a
document. If you use logical styles, stick with them
within a document. Keep in mind that future releases of
HTML might not support physical styles, which could mean
that browsers will not display physical style coding.
Logical Styles
- <DFN>
- for a word being defined. Typically displayed in
italics. (NCSA Mosaic is a World Wide
Web browser.)
- <EM>
- for emphasis. Typically displayed in italics. (Consultants
cannot reset your password unless you call the
help line.)
- <CITE>
- for titles of books, films, etc. Typically
displayed in italics. (A Beginner's Guide
to HTML)
- <CODE>
- for computer code. Displayed in a fixed-width
font. (The <stdio.h> header
file)
- <KBD>
- for user keyboard entry. Typically displayed in
plain fixed-width font. (Enter passwd
to change your password.)
- <SAMP>
- for a sequence of literal characters. Displayed
in a fixed-width font. (Segmentation fault:
Core dumped.)
- <STRONG>
- for strong emphasis. Typically displayed in bold.
(NOTE: Always check your links.)
- <VAR>
- for a variable, where you will replace the
variable with specific information. Typically
displayed in italics. (rm filename
deletes the file.)
Physical Styles
- <B>
- bold text
- <I>
- italic text
- <TT>
- typewriter text, i.e., fixed-width font.
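To compare the two approaches, the sketch below tags the same phrase first with a logical style and then with a physical style. Most browsers would show both in italics, but only the first says why the text stands out.

```html
<P><EM>Consultants cannot reset your password</EM> unless you call the help line.</P>
<P><I>Consultants cannot reset your password</I> unless you call the help line.</P>
```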
Character entities have two functions:
- escaping special characters
- displaying other characters not available in the
plain ASCII character set (primarily characters
with diacritical marks)
Three ASCII characters--the left angle bracket (<),
the right angle bracket (>), and the ampersand
(&)--have special meanings in HTML and therefore
cannot be used "as is" in text. (The angle
brackets are used to indicate the beginning and end of
HTML tags, and the ampersand is used to indicate the
beginning of an escape sequence.) Double quote marks may
be used as is, but a character entity may also be used
(&quot;).
To use one of the three characters in an HTML
document, you must enter its escape sequence
instead:
- &lt;
- the escape sequence for <
- &gt;
- the escape sequence for >
- &amp;
- the escape sequence for &
Additional escape sequences support accented
characters, such as:
- &ouml;
- the escape sequence for a lowercase o with an
umlaut: ö
- &ntilde;
- the escape sequence for a lowercase n with a
tilde: ñ
- &Egrave;
- the escape sequence for an uppercase E with a
grave accent: È
You can substitute other letters for the o, n,
and E shown above. Check this online reference for
a longer list of special
characters.
NOTE: Unlike the rest of HTML, the
escape sequences are case sensitive. You cannot, for
instance, use &LT; instead of &lt;.
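A short paragraph that combines both kinds of escape sequences might be coded as in this sketch. The sentence itself is invented for illustration.

```html
<P>Se&ntilde;or M&ouml;ller wrote: if x &lt; y &amp; y &gt; z, stop.</P>
```

A browser displays this as: Señor Möller wrote: if x < y & y > z, stop.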
The chief power of HTML comes from its ability to link
text and/or an image to another document or section of a
document. A browser highlights the identified text or
image with color and/or underlines to indicate that it is
a hypertext link (often shortened to hyperlink
or link).
HTML's single hypertext-related tag is <A>,
which stands for anchor. To include an anchor
in your document:
- start the anchor with <A (include
a space after the A)
- specify the document you're linking to by
entering the parameter HREF="filename"
followed by a closing right angle bracket (>)
- enter the text that will serve as the hypertext
link in the current document
- enter the ending anchor tag: </A>
(no space is needed before the end anchor tag)
Here is a sample hypertext reference in a file called US.html:
<A HREF="MaineStats.html">Maine</A>
This entry makes the word Maine the hyperlink
to the document MaineStats.html, which is in
the same directory as the first document.
You can link to documents in other directories by
specifying the relative path from the current
document to the linked document. For example, a link to a
file NYStats.html located in the
subdirectory AtlanticStates would be:
<A HREF="AtlanticStates/NYStats.html">New York</A>
These are called relative links because you
are specifying the path to the linked file relative to
the location of the current file. You can also use the
absolute pathname (the complete URL) of the file, but
relative links are more efficient in accessing a server.
Pathnames use the standard UNIX syntax. The UNIX
syntax for the parent directory (the directory that
contains the current directory) is "..".
If you were in the NYStats.html file and
were referring to the original document US.html,
your link would look like this:
<A HREF="../US.html">United States</A>
In general, you should use relative links because:
- it's easier to move a group of documents to
another location (because the relative path names
will still be valid)
- it's more efficient connecting to the server
- there is less to type
However, use absolute pathnames when linking to
documents that are not directly related. For example,
consider a group of documents that make up a user
manual. Links within this group should be relative links.
Links to other documents (perhaps a reference to related
software) should use full pathnames. This way, if you
move the user manual to a different directory, none of
the links will have to be updated.
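A page in such a manual might mix the two kinds of links as sketched below. The filename install.html and the www.example.org address are hypothetical and stand in for your own files and servers.

```html
<!-- Relative link: another chapter of the same manual -->
<A HREF="install.html">Installation</A>

<!-- Absolute link: a document outside the manual (hypothetical URL) -->
<A HREF="http://www.example.org/software/index.html">Related software</A>
```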
The World Wide Web uses Uniform Resource Locators
(URLs) to specify the location of files on other servers.
A URL includes the type of resource being accessed (e.g.,
Web, gopher, WAIS), the address of the server, and the
location of the file. The syntax is:
scheme://host.domain[:port]/path/filename
where scheme is one of
- file
- a file on your local system
- ftp
- a file on an anonymous FTP server
- http
- a file on a World Wide Web server
- gopher
- a file on a Gopher server
- wais
- a file on a WAIS server
- news
- a Usenet newsgroup
- telnet
- a connection to a Telnet-based service
The port number can generally be omitted.
(That means unless someone tells you otherwise, leave it
out.)
For example, to include a link to this primer in your
document, enter:
<A HREF="http://www.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimer.html">
NCSA's Beginner's Guide to HTML</A>
This entry makes the text NCSA's Beginner's
Guide to HTML a hyperlink to this document.
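Other schemes, and the optional port number, fit the same pattern. In the sketch below the hosts, port, and paths are hypothetical and only illustrate the syntax.

```html
<!-- A Web server listening on a non-default port (hypothetical address) -->
<A HREF="http://www.example.org:8080/docs/index.html">Documentation</A>

<!-- A file on an anonymous FTP server (hypothetical address) -->
<A HREF="ftp://ftp.example.org/pub/README">README file</A>
```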
Anchors can also be used to move a reader to a particular
section in a document (either the same or a different
document) rather than to the top, which is the default.
This type of an anchor is commonly called a named
anchor because to create the links, you insert HTML
names within the document.
This guide is a good example of using named anchors in
one document. The guide is constructed as one document to
make printing easier. But as one (long) document, it can
be time-consuming to move through when all you really
want to know about is one bit of information about HTML.
Internal hyperlinks are used to create a "table of
contents" at the top of this document. These
hyperlinks move you from one location in the document to
another location in the same document. (Go to the top of this document and then click on
the Links to Specific Sections hyperlink in
the table of contents. You will wind up back here.)
You can also link to a specific section in another
document. That information is presented first because
understanding that helps you understand linking within
one document.
Links Between Sections of Different Documents
Suppose you want to set a link from document A (documentA.html)
to a specific section in another document (MaineStats.html).
Enter the HTML coding for a link to a named anchor:
documentA.html:
In addition to the many state parks, Maine is also home to
<a href="MaineStats.html#ANP">Acadia National Park</a>.
Think of the characters after the hash (#) mark as a
tab within the MaineStats.html file. This
tab tells your browser what should be displayed at the
top of the window when the link is activated. In other
words, the first line in your browser window should be
the Acadia National Park heading.
Next, create the named anchor (in this
example "ANP") in MaineStats.html:
<H2><A NAME="ANP">Acadia National Park</a></H2>
With both of these elements in place, you can bring a
reader directly to the Acadia reference in MaineStats.html.
NOTE: You cannot make links to
specific sections within a different document unless
either you have write permission to the coded source of
that document or that document already contains
in-document named anchors. For example, you could include
named anchors to this primer in a document you are
writing because there are named anchors in this guide
(use View Source in your browser to see the coding). But
if this document did not have named anchors, you
could not make a link to a specific section because you
cannot edit the original file on NCSA's server.
Links to Specific Sections within the Current
Document
The technique is the same except the filename is
omitted.
For example, to link to the ANP anchor
from within MaineStats, enter:
...More information about <A HREF="#ANP">Acadia National Park</a>
is available elsewhere in this document.
Be sure to include the <A NAME=>
tag at the place in your document where you want the link
to jump to (<H2><A NAME="ANP">Acadia National Park</a></H2>).
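Put together, a MaineStats.html file containing both the named anchor and a link to it might look like the following sketch. The title, level-one heading, and filler paragraph are assumptions added for illustration.

```html
<html>
<head>
<TITLE>Maine Statistics</TITLE>
</head>
<body>
<H1>Maine Statistics</H1>
<P>More information about <A HREF="#ANP">Acadia National Park</A>
is available elsewhere in this document.</P>

<H2><A NAME="ANP">Acadia National Park</A></H2>
<P>Details about the park go here.</P>
</body>
</html>
```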
