
Section 02: Client-side Development (I)


HyperText Transfer Protocol (HTTP)

HyperText Transfer Protocol (HTTP) is the protocol that allows Web browsers and servers to communicate. It forms the basis of what a Web server must do to perform its most basic operations. HTTP started out as a very simple protocol, and even though it has had numerous enhancements, it is still relatively simple. As with other standard Internet protocols, control information is passed as plain text via a TCP connection. In fact, as a simple experiment, try the following:

Before starting, you should consider firewalling issues. In DCU for example, you typically use a "proxy server" when downloading web pages. Hence, you make the request to the proxy server, which retrieves the pages on your behalf. This means that you will only be able to connect to "local" sites INSIDE the firewall. So, if you are attempting this from DCU, you will find that you will likely be able to connect to sites like: www.dcu.ie, www.eeng.dcu.ie, www.rince.ie etc. but not to www.apache.org. Likewise, if you are attempting this from work and are firewalled then you should try a local web server. If you are trying this from home, then all addresses should work. As a secondary point, some sites are not set up to support HTTP/1.1 persistent connections for various reasons (some good and some bad). At the last edit of these notes, www.dcu.ie was one of these sites. If HTTP/1.1 requests are behaving like HTTP/1.0 requests, then try an alternative site for the purpose of demonstration.
  • Using the standard telnet command (START->Run->telnet on Windows), connect to www.dcu.ie on port 80, which is the standard port for Web requests. From the command line this can be done by typing telnet www.dcu.ie 80. Assuming no firewall issues, on Linux/Solaris telnet will report the following:

    Trying 136.206.1.33...
    Connected to www.dcu.ie.
    Escape character is '^]'.

    Note: Invoking telnet from the Windows command prompt will sometimes show no connection message and will not echo what you type in the following step. Simply type the command "blind" and the output will still be returned, or alternatively use a different telnet program.

  • Once connected, we can type in and send an HTTP request line, followed by any request headers. In order to retrieve the home page, the following can be typed:

    GET /index.html HTTP/1.0

    After entering the line, press RETURN twice - the first ends the request line, and the second marks the end of the optional request headers. 

  • In response to this HTTP GET command, the Web server returns the page "index.html" across the telnet session, and then closes the connection to signify the end of the document. On the client side, a browser such as Internet Explorer or Netscape renders this HTML code into a graphical web page.

  • It is often more convenient to send a 'HEAD' request instead of 'GET'. This makes the server behave exactly as if it were handling a GET, except that it does not send the actual document. This makes it easier to see the response headers, and means you do not have to wait for the document itself to download. For example, to see which response headers the above example returns, use:

    HEAD /index.html HTTP/1.0

    The server will return response headers similar to the following:

    HTTP/1.1 200 OK
    Date: Sun, 03 Feb 2008 11:28:13 GMT
    Server: Apache/2.0.52 (Red Hat) mod_perl/1.99_16 Perl/v5.8.5 DAV/2 PHP/5.2.3 ...
    Last-Modified: Wed, 28 Nov 2007 21:09:17 GMT
    ETag: "6fc891-3195-9c3a7940"
    Accept-Ranges: bytes
    Content-Length: 12693
    Connection: close
    Content-Type: text/html
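
The telnet experiment above can also be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original exercise: the function names (build_request, parse_response_head, send_request) are made up for this example, and send_request needs the same network access (and is subject to the same firewall caveats) as the telnet session.

```python
import socket


def build_request(method, path, version="HTTP/1.0", headers=None):
    """Return the raw bytes of a request: request line, headers, blank line."""
    lines = [f"{method} {path} {version}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # The empty line (the second RETURN in telnet) ends the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_response_head(raw):
    """Split a raw response into (version, status code, reason, headers)."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers


def send_request(host, raw_request, port=80, timeout=10):
    """Send the request and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(raw_request)
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # an empty read means the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

With network access, parse_response_head(send_request("www.dcu.ie", build_request("HEAD", "/index.html"))) would play the role of the HEAD request typed into telnet. The headers a server returns today will differ from the 2008 sample shown above.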

HTTP is a simple, stateless protocol: a client, such as a web browser, makes a request, the web server responds, and the transaction is done. When the client sends a request, the first thing it specifies is an HTTP command, called a method, that tells the server the type of action it wants performed. In HTTP/1.0, a separate connection must be made to the Web server for each object the browser wishes to download. Many web pages are graphics-intensive, which means that in addition to downloading the base HTML page, the browser must also retrieve a number of images. Establishing a connection for each one is wasteful, as several packets have to be exchanged between the Web browser and Web server before the image data can start transmitting.


HTTP/1.1

It wasn't long before HTTP was refined into a more complex protocol, HTTP/1.1, which was standardised by the Internet Engineering Task Force (IETF) in cooperation with the World Wide Web Consortium (http://www.w3.org). HTTP/1.1 addressed a number of issues that had emerged with HTTP/1.0. The basic operation of HTTP/1.1 remains the same as for HTTP/1.0, and the protocol ensures that browsers and servers of different versions can all interoperate correctly. If the browser understands version 1.1, it uses HTTP/1.1 on the request line instead of HTTP/1.0.

As previously stated, using separate connections for each item on a web page can be very slow, especially across the Internet where there is a delay involved in each connection and disconnection. To make pages with inline elements quicker to download, HTTP/1.1 defines persistent connections, where a number of documents can be requested over a single connection, one at a time. An early implementation of persistent connections was known as keep-alive, and Apache as well as a number of other servers and browsers supported this sort of connection. However, persistent connections were first officially documented in HTTP/1.1 and are implemented slightly differently from keep-alives. For a start, in HTTP/1.1, persistent connections are the default. Unless the browser explicitly tells the server not to use persistent connections, the server should assume that it might be getting multiple requests on a single connection. Persistent connections are controlled by the Connection header. Unless a Connection: close header is given, the connection will remain open. This can be tested by connecting to www.apache.org and sending a simple request, for example:

% telnet www.apache.org 80
HEAD /index.html HTTP/1.1
Host: www.apache.org

HTTP/1.1 200 OK
Date: Sun, 03 Feb 2008 11:54:13 GMT
Server: Apache/2.2.8 (Unix)
Last-Modified: Wed, 23 Jan 2008 23:24:46 GMT
ETag: "164127-4cd-3b69606d"
Accept-Ranges: bytes
Content-Length: 17129
Cache-Control: max-age=86400
Vary: Accept-Encoding
Content-Type: text/html


Two effects are immediately noticeable. Firstly, the Connection: close header is no longer in the output, signifying that the connection will not automatically close after the item is received. Secondly, there will be a pause of a few seconds before the connection closes automatically. This time-out is configured at the server. Using HTTP/1.1, the request can still be sent with a Connection: close header, and this will cause the server to close the connection immediately after the response has been sent.
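
The rule described above can be written as a small decision function. This is a simplified sketch for illustration only: the function name connection_persists is made up, and headers is assumed to be a dictionary keyed by lower-case header names.

```python
def connection_persists(version, headers):
    """Return True if the connection should stay open after this exchange.

    `headers` maps lower-case header names to their values.
    """
    tokens = {t.strip().lower() for t in headers.get("connection", "").split(",")}
    if "close" in tokens:
        return False  # either side may ask for the connection to be closed
    if version == "HTTP/1.1":
        return True  # persistent connections are the default in HTTP/1.1
    # In HTTP/1.0 the older keep-alive extension had to be requested explicitly.
    return "keep-alive" in tokens
```

For the HEAD request sent to www.apache.org above, connection_persists("HTTP/1.1", {"host": "www.apache.org"}) returns True, which matches the pause observed before the server's time-out closes the connection.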

In the HTTP/1.1 example we were required to provide the "Host: " header line.  The "Host" header distinguishes between various DNS names sharing a single IP address, allowing name-based virtual hosting.  Name-based virtual hosting is where a number of "sites" can be run off the same web/application server on the same machine using the same IP address.  For example, we could run www.wesellcds.com and www.wesellbooks.com on the same web server.   Hence this header, while optional in HTTP/1.0, is mandatory in HTTP/1.1.
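
As a rough sketch of how a server might use the Host header for name-based virtual hosting, consider the following. The directory paths, the VIRTUAL_HOSTS table and the function name document_root are all hypothetical and chosen purely for illustration; real servers such as Apache configure this quite differently.

```python
# Hypothetical mapping from host name to the directory holding that site.
VIRTUAL_HOSTS = {
    "www.wesellcds.com": "/var/www/cds",
    "www.wesellbooks.com": "/var/www/books",
}


def document_root(version, headers, default_root="/var/www/default"):
    """Pick the site directory for a request based on its Host header.

    `headers` maps lower-case header names to their values.
    """
    host = headers.get("host")
    if host is None:
        if version == "HTTP/1.1":
            # Host is mandatory in HTTP/1.1, so the request is rejected.
            raise ValueError("400 Bad Request: HTTP/1.1 requires a Host header")
        return default_root  # HTTP/1.0 clients may omit the header
    # Drop an optional ":port" suffix; host names are case-insensitive.
    hostname = host.split(":", 1)[0].lower()
    return VIRTUAL_HOSTS.get(hostname, default_root)
```

Both sites share one IP address and one port, so the Host header is the only information in the request that tells the server which site is wanted.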

As well as a number of other useful changes, HTTP/1.1 includes improved documentation and a range of new features for people implementing proxies and caches. In particular, these features reduce network traffic by allowing proxies and caches to send more 'conditional' requests and to do transparent content negotiation. A conditional request is like a normal request, except the sender (the proxy or cache server) includes some information about whether it really needs the document. For example, a proxy or cache can send an entity-tag which identifies the version of a document it already has, and the server sends back the document only if it differs from that version. Conditional requests can also be based on the last-modified time of the document.
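
The server-side decision for a conditional request can be sketched as follows. This is a simplified illustration: the function name response_status is made up, only a single entity-tag is compared (real If-None-Match headers may carry a list of tags or "*"), and headers is assumed to be keyed by lower-case names. A status of 304 (Not Modified) tells the cache to reuse its copy, while 200 means the full document is sent.

```python
from email.utils import parsedate_to_datetime


def response_status(request_headers, current_etag, last_modified):
    """Return 304 if the client's cached copy is still current, else 200.

    `last_modified` is an HTTP date string such as the Last-Modified header.
    """
    client_etag = request_headers.get("if-none-match")
    if client_etag is not None:
        # When an entity-tag is supplied it takes precedence over the date.
        return 304 if client_etag == current_etag else 200
    since = request_headers.get("if-modified-since")
    if since is not None:
        if parsedate_to_datetime(last_modified) <= parsedate_to_datetime(since):
            return 304  # unchanged since the cached copy was fetched
    return 200
```

Using the ETag and Last-Modified values from the earlier www.dcu.ie response, a cache that sends If-None-Match: "6fc891-3195-9c3a7940" would receive a 304 for as long as the document is unchanged.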


Method Types

When a client connects to a server in order to make an HTTP request, the request can be one of a number of different types, known as methods.

  • GET - this method is most used for simply retrieving information from the server. During the most common transaction, where a user simply requests a web page from the server, it is the GET method which is used. Although it is used primarily for reading information, the GET method can include additional information, via the URL which better describes what it wishes to obtain. For example, consider a server-side programme called phonesearch which you could normally call by using the following URL:

    http://www.server.com/scripts/phonesearch

    The script called as above might, for example, return all the names and phone numbers of employees in a company. However, we can pass information as a query string, placing the extra parameters into the URL. Considering the same example, we might want to search for a particular person's phone number (assuming the server-side programme can do this!) and our new URL might be:

    http://www.server.com/scripts/phonesearch?firstname=david&lastname=molloy

    Placing the extra information in the URL in this way allows the page to be bookmarked or emailed like any other link. The query string in this example is firstname=david&lastname=molloy. Because GET requests shouldn't be used for sending large quantities of information, some servers and browsers limit the combined length of a URL and its query string, historically to as little as about 240 characters. Additionally, writing information into URLs is not suitable for passing security or authentication parameters, as the URLs are sent in plain text across the web, pass through proxy servers and can lie in browser history on multiple user computers. Therefore, we need a more suitable method of passing such information.

  • POST - The POST method uses a different technique to transfer data to a server. It is often used to send large quantities of information, for which GET is entirely unsuitable. A POST request passes all of its data, of unlimited length, directly over the socket connection as part of its HTTP request body. The URL for the transaction remains the same regardless of what information is passed to the server programme, and the data does not appear anywhere in the URL. As a result, POST requests cannot usefully be bookmarked or emailed, and often cannot even be reloaded: although the browser will create a bookmark, revisiting it will likely fail because the required data is not present. Reloading might be prevented where, for example, a bank transaction had just occurred to transfer 1,000 euro from one account to another - even though the user might click reload, they would not necessarily want to transfer this money twice. As an example of a POST request, consider a webpage:

    http://www.server.com/login.html

    This page has a simple, common form which requests a username and password. In the form code, it defines the method to be POST and the server-side programme to deal with logging in as being:

    http://www.server.com/scripts/dologin

    Using the POST method the information is passed transparently when compared to the GET method - the data being transferred cannot be seen in the URL and the dologin page could not be simply bookmarked and revisited. It would most likely return an error. The only way to access the dologin programme correctly is to fill out a POST request using the expected data.

  • HEAD - As was mentioned previously, the HEAD method makes the server behave exactly as though it was handling a GET, but doesn't bother to send the actual document. It is essentially a GET with a blank page, which enables the user to see the response headers more clearly.

  • Others - OPTIONS, TRACE, DELETE and PUT are further request methods, some of which originated in HTTP/1.0 and others of which arrived with HTTP/1.1. These methods are less frequently used, but serve useful purposes in their own right. PUT is used to place documents directly on the server and DELETE is used to delete documents from the server (obviously these need to be used carefully and in the right scenarios). TRACE is used as a debugging aid: the server returns to the client the exact contents of its request. Finally, OPTIONS can be used by the client to ask the server which methods it supports or what options are available for a particular resource on the server. However, the methods of primary interest are the commonly used GET, POST and HEAD.
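
The difference between GET and POST described above can be seen by building both kinds of request by hand. This is a sketch for illustration only: the helper name build_post and the form field names username and password are assumptions (the notes do not say what the login form fields are called), and nothing is actually sent over the network.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# GET: the parameters travel in the URL as a query string.
params = {"firstname": "david", "lastname": "molloy"}
encoded = urlencode(params)
get_url = "http://www.server.com/scripts/phonesearch?" + encoded


def build_post(path, host, fields):
    """Build a raw POST request whose form data travels in the body."""
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


# POST: the URL stays the same whatever data is submitted.
post_request = build_post(
    "/scripts/dologin",
    "www.server.com",
    {"username": "david", "password": "secret"},  # hypothetical field names
)

# The server-side programme can recover the GET parameters from the URL alone.
recovered = parse_qs(urlsplit(get_url).query)
```

Note that the POST body uses the same name=value&name=value encoding as the query string. It is simply carried after the headers instead of in the URL, and it is still plain text unless the connection itself is encrypted.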


Hypertext Markup Language (HTML)

This course makes no pretence at being a web design course and only very basic HTML will be covered. To learn more about web design, there are any number of resources on the web and literally thousands of books on the subject. However, in order to use many of the development tools in this course, an understanding of basic HTML is needed as a foundation. Indeed, the most common responses from server programmes take the form of HTML sent back to the client's browser.

What are HTML Documents?

HTML stands for HyperText Markup Language. HTML documents are plain-text (also known as ASCII) files that can be created using any plain-text editor: for example, emacs or vi on Unix, SimpleText on Mac, or Notepad on Windows. It is also possible to use word-processing software, such as WordPad on Windows, provided the files are saved as plain text. Perhaps the most common method used to write HTML files is to use a WYSIWYG (What you see is what you get!) editor, such as DreamWeaver or Frontpage. These programs allow you to design your HTML documents visually, as if you were using a word processor, instead of writing the markup tags in a plain-text file and imagining what the resulting page will look like. It is very useful, however, to know enough HTML to create a document before you begin to use a WYSIWYG editor, in case you later want to add features which your editor does not support.

When writing HTML manually (which is what we will be doing!), you add "tags" to the text in order to create the structure. These tags tell the browser how to display the text or graphics in the document. These .html files are generally placed on the web server using a remote transfer program such as FTP (File Transfer Protocol). Each web page on a server has what is known as a URL (Uniform Resource Locator), which looks something like:

http://www.server.com/directory/file.html

The URL or address of a web page depends both on the server configuration and the location of where the file had been placed on the server. When the client types in the correct address for a web page, the browser downloads that file from the server (using HTTP GET method), interprets the HTML code and displays the page graphically.
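
To connect this back to the HTTP section, the following short sketch splits the example URL into the parts a browser uses: the host it connects to and the path it asks for in the GET request line. It uses only Python's standard urllib.parse module, and the HTTP/1.0 request line is shown just for illustration.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.server.com/directory/file.html")

scheme = parts.scheme  # the protocol to speak: "http"
host = parts.netloc    # the server to connect to (port 80 by default for http)
path = parts.path      # the document to ask that server for

# The request line the browser would send once connected to the host.
request_line = f"GET {path} HTTP/1.0"
```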


Example HTML Document

The following is an example of a very basic HTML document which simply displays a heading and some text. Following it is a screenshot showing what this page would look like when viewed in a browser.

<HTML>
<HEAD>
<TITLE>Simple HTML Page</TITLE>
</HEAD>
<BODY>
<H1>Hello World!</H1>
<P>This is a basic webpage, consisting of a title, a heading (heading 1 type) and a short paragraph!</P>
</BODY>
</HTML>


Figure 2.1. Basic HTML Document shown in Browser



It can be seen in the above figure that the URL for the page is a local filename. It is possible to simply save HTML files on your local hard drive and load them in a browser, just as you would load pages from the Internet. Obviously, in this case, unless your machine is set up as a server (i.e. is on the Internet and runs some web server software), these files cannot be seen from other computers on the Internet. Generally, web designers will develop their HTML on a local machine until it is ready for deployment, at which point they will FTP the files to the server.

The required elements in the above basic example are the <html>, <head>, <title> and <body> tags (and their corresponding end tags!). Because you should include these tags in each file, designers generally create a template which already includes these items and then modify that file. Alternatively, most WYSIWYG editors will generate these tags automatically when you choose to create a New Page. Many browsers will correctly display poorly written HTML that omits these tags, but some browsers won't. So beware!

Before introducing a slightly more complicated example, we will introduce a few of the tags and their usage. Many tags have additional configurable attributes, which are used inside the tags to further define settings. The list below shows a subset of the more common tags used and a few available additional attributes for some tags:


Primary HTML Tags

The following list shows some of the most commonly used HTML tags and a very basic example of how each is used. Students should familiarise themselves with these common tags and be able to create webpages. To repeat, this is not a web design course, but HTML is needed as a precursor to JavaScript and forms, which combined form the most common client-side interface to server-side applications.

Some Common HTML Tags
Note: HTML is case-insensitive.  Hence, we can use lower case tags and attributes, or alternatively upper-case tags and attributes.
  • <html> - This element tells your browser that the file contains HTML-coded information. The file extension .html (or .htm) also indicates that this is an HTML document and should be used. On some browsers, it is not *required* that the html tag is present in order to display a page, but other browsers, or indeed future browsers, may not handle its absence gracefully.

  • <head> - The head element identifies the first part of the HTML-coded document, which contains the title, style formatting/references and META-data regarding the document. Like the html tag, it can sometimes be omitted, but this is not recommended.

  • <title> - The title element contains your document title and identifies its content in a global context. The title is typically displayed in the title bar at the top of the browser window, but not inside the window itself. The title is also what is displayed on someone's hotlist or bookmark list, so something descriptive, unique and relatively short should be chosen. It is this title which is also used to identify your page to search engines such as Infoseek or Google. The title tag is used within the heading (head) section of the document.

  • <body> - The body element contains the largest part of most HTML documents, which consists of the content of the document (displayed within the main area of the browser window). The body tag simply signifies where the body starts and ends.

  • <br/> - To force a new line on your displayed webpage simply use <br/>

  • <p> - Unlike documents in most word processors, carriage returns in HTML files aren't significant. In fact, any amount of whitespace -- including spaces, linefeeds, and carriage returns -- is automatically compressed into a single space when your HTML document is displayed in a browser. Therefore, in order to indicate paragraphs you must use the P tag. The P tag can be used with additional attributes such as ALIGN, which controls the alignment of the text in the paragraph. For example:

    <p align="center">This is a very short paragraph!</p>
  • <h1> - The heading tags range from H1 (largest) to H6 (smallest) and are simply tags for adding in headers. Before the Font tag appeared, it was these tags which were used for setting font sizes on titles for various sections. They are still commonly used (especially with stylesheets, which we won't go into here!) and are supported by all browsers.

  • <font> - The font tag had commonly been used by web designers and has since been deprecated in the releases of the HTML specifications by the World Wide Web Consortium (http://www.w3.org). It can be used to simply change the font formatting and color of the text nested between the opening and closing tags. The three primary attributes which can be used are size, color and face. While it is not recommended to use the font tag in any serious website development, it is no major issue to use it as an example to show it in action:

    <font face="Arial" size="5" color="red">The text which is affected</font> is only that which was encapsulated by the tags!
  • <b> - The bold tag is a basic tag which simply can be placed around text in order to emphasise the text.

    <b>NOTE:</b> The previous NOTE: has been bolded!
  • <i> - The italics tag works in the same way as the bold tag. It is simply placed around text in order to italicise the text.

    There was not a sufficient <i>quorum</i> to hold the meeting.
  • <a> - In order to create links to other pages, locations within pages, files or documents, the <a> tag is used. This tag requires attributes to be of use, in particular the 'href' attribute, which informs the browser of the location to be linked. An <a> tag can be placed around text, for a standard text link, or alternatively around an image, for a more advanced link. The display area for any link can be set to be the same window (the default), a separate frame, a floating window or a new blank window - this is set using the 'target' attribute. Note that the href must include the http:// prefix, otherwise the browser treats it as a file relative to the current page. So for example:

    Click <a href="http://www.dcu.ie" target="_blank">here</a> to go to the DCU homepage.
  • Lists - HTML supports unnumbered, numbered, and definition lists. You can nest lists too, but this should be used sparingly as too many nested items can get difficult to follow. To create an unnumbered (bulleted) list you use the following:

    <ul>
    <li>Phone</li>
    <li>House</li>
    <li>Car</li>
    </ul>

    As an example of an ordered (numbered) list you could use:

    <ol>
    <li>Cat</li>
    <li>Dog</li>
    <li>Horse</li>
    </ol>
  • <pre> - PRE is short for "preformatted", and is used to display text in a fixed-width font. In contrast to the rest of HTML, these tags make encapsulated spaces, new lines, and tabs significant - new lines break in the same location in the browser as in the HTML code and multiple spaces are displayed as multiple spaces. <pre> is most commonly used for program listings within web pages, or precisely formatted plain text.  <pre> can be used with an optional 'WIDTH' attribute that specifies the maximum number of characters for a line. Note that because <, > and & have special meanings in HTML, you must use their escape sequences (&lt;, &gt; and &amp; respectively) to enter these characters. As an example:

    <pre>
    #!/bin/csh
    cd /usr/tmp
    rm *
    </pre>
  • Images - Most web browsers can display images which are in .bmp, .gif, .png or .jpg format. When a page is accessed which contains images (as most do to varying extents), the images referenced in the web page are downloaded and laid out according to the HTML code. Each image takes additional time to download and slows the overall loading of the page, so images should be used sparingly and preferably in a compressed format such as .gif or .jpg. To insert an image into your web page, we use the following as an example:

    <img src="images/intro1.gif" alt="Sample Image" title="This will appear on mouseover" />

    When using images, "ALT" attributes should *always* be used - traditionally it was these attributes which explained what the images represented to text-only browsers. More importantly, today they are used primarily for "accessibility" for visually impaired readers. If visually impaired users visit a graphical site which has omitted ALT attributes, then the site is meaningless to them - if the attributes are included, their computer software can read out the description. Consider even the simple case of 'Submit' and 'Clear' buttons for a form, and the difficulties which would be encountered if these were unlabelled images.

    The "title" attribute is used to give a tool tip when the mouse is hovered over an image and should not be confused with the "alt" attribute.

  • <table> - Tables are used in two primary ways in the deployment of web pages. Firstly, they can be used in their primary intended way, as a means of displaying tabulated data, such as a timetable or a price list. Secondly, they can be used for the purposes of "layout" on precisely structured sites and pages. Many graphical-based sites use tables to place their images and text in a rigid structure. A simple example of this would be in the layout of a newspaper type page, with two columns and an image. Individual cells can be made to "span" rows and columns.  However, in modern web-site design CSS (Cascading Stylesheets), introduced later in this section, are used for the purposes of layout and the use of tables for this purpose is discouraged.  

    There are two essential tags to be used in any table, <tr> and <td>, which are used for delimiting rows and the cells (columns) within each row respectively. As with most other tags, <table>, <tr> and <td> can all be used with specific attributes. The following example consists of two columns, with an image spanning both columns of the second row.  Using additional tags such as <th>, header rows can be specified, but this has not been included in the following example.

    <TABLE WIDTH="600" CELLPADDING=2 CELLSPACING=5 BORDER=1>
      <TR>
        <TD ALIGN="Center">Column 1</TD>
        <TD ALIGN="Center">Column 2</TD>
      </TR>
      <TR>
        <TD COLSPAN="2" ALIGN="Center">
          <IMG SRC="images/intro1.gif" ALT="Image in a Table">
        </TD>
      </TR>
      <TR>
        <TD ALIGN="left">Left Aligned Text</TD>
        <TD ALIGN="right">Right Aligned Text</TD>
      </TR>
    </TABLE>
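
Two of the rules in the list above lend themselves to a quick experiment: the collapsing of whitespace outside <pre>, and the escape sequences needed for <, > and &. The sketch below uses Python's standard html module. The helper names collapse_whitespace and to_pre_block are made up for this illustration, and collapse_whitespace is only a rough approximation of what a browser does.

```python
import html


def collapse_whitespace(text):
    """Approximate how a browser renders runs of whitespace outside <pre>."""
    return " ".join(text.split())


def to_pre_block(source):
    """Wrap text in <pre>, escaping &, < and > so they display literally."""
    return "<pre>\n" + html.escape(source, quote=False) + "\n</pre>"


# Spaces, tabs and new lines all collapse to single spaces.
rendered = collapse_whitespace("This  is\n\n a   paragraph\t!")

# A line of program code that would otherwise be mistaken for markup.
escaped = html.escape("if (a < b && b > c)", quote=False)
```

Server-side programmes that generate HTML from user-supplied text normally apply this kind of escaping, so that the text is displayed rather than interpreted as tags.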

HTML vs XHTML

XHTML 1.0 is simply a reformulation of HTML 4.0 into an XML (eXtensible Markup Language) form.  While it seems strange to immediately introduce something new (when we've just barely introduced HTML) we will see great similarities between the two markup languages.  There are some differences however, principally:
  • HTML is much more forgiving and less strict on following syntax
  • XHTML is case sensitive, with all tags and attributes in lower case letters
  • XHTML, being XML must be well-formed (more on this in the XML chapter).  All elements (tags) must have a start tag and an end tag and must be correctly ordered.

Well-formed XHTML

<b>This is a <i>test</i></b>
<img src="images/test.jpg" alt="Test Image"/>        ("<img ..../>" is shorthand for writing <img ....></img>)

Invalid XHTML

<b>This is a <i>test</i>                         (missing closing tag)
<b>This is a <i>test</b></i>                     (elements are nested incorrectly and do not have a parent-child relationship)
<IMG src="images/test.jpg" alt="Test Image">     (two causes, no closing tag and uppercase tag name)
  • In XHTML all attribute values must be surrounded by single ( ' ) or double quotes ( " ).  HTML allows quotes to be omitted in certain circumstances (and will most likely render correctly even if omitted in other circumstances).

There are a number of other differences but, for our purposes, these are the main ones which affect us.  Rather than get confused between both markup languages, we will simply provide further examples in XHTML.  XHTML provides a number of advantages for us, particularly in relation to DOM, jQuery, JavaScript and other topics we will introduce later.
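
Because XHTML is XML, the well-formedness rules above can be checked with any XML parser. The following sketch uses Python's standard xml.etree.ElementTree module. The function name is_well_formed is made up for this example, and note the limitation: a generic XML parser checks nesting, closing tags and quoting, but it does not know that XHTML tag names must be lower case, so it would accept <IMG ... /> if it were properly closed.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as a single well-formed XML element."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True


# The two well-formed examples from above.
good = [
    '<b>This is a <i>test</i></b>',
    '<img src="images/test.jpg" alt="Test Image"/>',
]

# The invalid examples from above, plus an unquoted attribute value.
bad = [
    '<b>This is a <i>test</i>',                      # missing closing tag
    '<b>This is a <i>test</b></i>',                  # incorrectly nested
    '<IMG src="images/test.jpg" alt="Test Image">',  # never closed
    '<img src=images/test.jpg />',                   # attribute value not quoted
]

results_good = [is_well_formed(f) for f in good]
results_bad = [is_well_formed(f) for f in bad]
```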

Example XHTML Document

In previous years, this was actually a second HTML example.  Now, we have taken the example, changed the tags so that they are lowercase, tidied up attributes and ensured that tags are well-formed.  Otherwise, we are simply using the tags introduced for HTML above.  

So, combining these new tags together to form a relatively more "complicated" XHTML document we get:

<html>
<head>
  <title>Second HTML Page</title>
</head>
<body>
  <h1>Web Application Development</h1>
  <p>The obvious importance of <i>HTML</i> lies in the fact that it is the primary language used for webpages across the Internet.</p>
  <p>Perhaps, the most surprising element of <i>HTML</i> is the fact <b>it is so easy to write!</b></p>
  <br/>
  <br/>
  To visit the web pages for this module in a <b>new window!</b> please click <a href="http://www.eeng.dcu.ie/~ee553" target="_blank">here</a>.
  <p>Some of the concepts covered in this module are:</p>
  <ul>
    <li>Servlets</li>
    <li>JSPs</li>
    <li>EJBs</li>
  </ul>
  <p>Lastly, we will combine some of the remaining elements together into a table to show how everything can be combined: </p>
  <table style="width:600px;border:1px solid black;">
    <tr>
      <td><b>PRE Example</b></td>
      <td><b>Linked Image Example</b></td>
    </tr>
    <tr>
      <td>
<pre>
#!/bin/csh
cd /usr/tmp
rm *
</pre>
      </td>
      <td>Click for DCU Home<br/>
        <a href="http://www.dcu.ie">
          <img src="https://www.dcu.ie/sites/default/files/tealhoriz120h-b.png" alt="DCU Home" style="border:none" />
        </a>
      </td>
    </tr>
    <tr>
      <td colspan="2"><b>NOTE:</b> You have probably noticed that earlier examples switched between lower and upper case for tags. HTML browsers recognise either, but XHTML requires lower case, so this example uses lower case throughout. By the way, these sentences span both columns of the table!</td>
    </tr>
    <tr>
      <td style="text-align:left">Left Aligned Text</td>
      <td style="text-align:right">Right Aligned Text</td>
    </tr>
  </table>
</body>
</html>


Figure 2.2. Example XHTML Document shown in Browser

XHTML Output


Note:

Technically, this XHTML document will not validate against a validator, such as the W3C XHTML Validator.  In order to get it to validate, we need to add in a few additional XML-based tags at the beginning of our document.  

<!DOCTYPE html>

<html xmlns="http://www.w3.org/1999/xhtml" lang="EN">                     (To replace the <html> tag currently above)

The first line declares the document type, telling the browser or validator that the file is an HTML document.  The second line declares the XHTML namespace to which the document's elements belong, along with the language of the content.

To test your own validation, try copying and pasting the source code from the example xhtml_example.html  into the W3C XHTML Validator.  It should validate successfully.

When you are writing HTML/XHTML in the majority of situations (for pages, JSPs etc. later on) you can ignore these additional lines.  However, the practices of using lowercase, making all HTML well-formed, quoting attributes etc. are useful practices for us to follow and will have benefits later on.


Practicing your HTML Skills

This module is not a web design course and, as a result, the level of HTML expected is not high.  For most web application developers working in a company, you will likely be referring the web design aspect to a graduate of a graphical design programme.  Graphical design for the web medium is surprisingly similar to that of the print medium, requiring skills in using graphical design programmes such as Photoshop and Illustrator.

However, all web application developers are required to know the basics of HTML pages.  If a developer is building a web application, it is likely that they will develop a fully working interface for their application, and it would only be passed to the graphical designer for improving the look & feel and for professional branding and marketing.

For our purposes, it is important to be familiar with the basics of developing web pages, in particular all of the tags introduced above.  The best way of practicing HTML is to write your own and to spend a couple of hours practicing it.  It really isn't very difficult and is simply a case of familiarising yourself with the available tags.

To aid this process, the following tool is recommended:

This tool allows you to type HTML in the window on the left hand side and simply click a button to render it to a webpage on the right hand side.  It is very useful for practicing basic tags, although it does give a strong feeling of placing files on a server (on the internet) for all to see.  

To start you off, try copy/pasting the Example documents above into the editor and viewing the output.


Did you actually practice?

It's hard to write this section without coming across as a school teacher waving a big stick!  

This is the first place in the notes where it has been recommended that you practice some "source code" to get familiar with the topic.  There are many people who prefer to learn the tags off by heart and move on.  If you start to do this now, you will most likely FAIL this module.  The course has been designed to be practical in nature, which means that, when I assess your progress through examination and assignment, I will be looking to assess your ability to write web application software.  

If you attempt to learn off the module, you will learn nothing from this course.  If you work through the examples, developing your own skills, you will learn a whole lot more, do better in the examinations and be in a position to take on a graduate role in web application development.

More and more, I am moving the examination to be entirely practical questions intended to reward those that study in this manner more.  If you are having difficulty understanding the practical aspects of this module and instead find yourself "learning off" then please come and talk to me.  


HTML Accessibility

HTML accessibility refers to the inclusive practice of designing web pages so that websites are usable by people of all abilities and disabilities.  For example, by simply including "alt" attributes in "<img>" elements, we improve usability for visually impaired users, who may rely on text-to-speech converters when visiting web sites.  Likewise, choosing large, clear fonts for a website assists users with poor sight.

The Web Content Accessibility Guidelines (WCAG) 1.0 were published by the World Wide Web Consortium (W3C) in 1999.  These guidelines were widely accepted as the definitive approach to designing accessible websites.  In December 2008, the WCAG 2.0 recommendation was published, which aims to be more up to date and technology neutral, and to cater for a number of newer features appearing in browsers.

While there is a moral and ethical responsibility on developers to ensure that websites cater for users with disabilities, in general there are legal requirements in most countries.  In Ireland, the Disability Act 2005 makes reference to this.

Section 28.2
"Where a public body communicates in electronic form with one or more persons, the head of the body shall ensure, that as far as practicable, the contents of the communication are accessible to persons with a visual impairment to whom adaptive technology is available."


There are dozens of accessibility tools available for checking certain aspects of web sites - try pasting the XHTML code above into http://wave.webaim.org/ and view the output.  You will see how, despite there being "no accessibility errors" detected, there are a number of recommendations.  Often with regards to accessibility, there is a good deal of common sense involved.

For example, you could "technically" pass an accessibility check by giving a submit image button (note: whether you should use this is another question) an alt attribute = 'Button'.  Of course, this doesn't distinguish it from the 'Reset' image button beside it, which also has alt attribute = 'Button'.  Despite both images having an 'alt' attribute, there is now a 50% chance that a blind user might reset their form rather than submit it (or vice versa).
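
As a minimal sketch of the point above, the two image buttons can be given alt text that describes what each one actually does.  The image file names (submit.png, reset.png), the form action and the field name are hypothetical and are used only for illustration:

```html
<!-- Hypothetical form: the action URL, field name and image files are assumptions -->
<form action="/example-handler" method="post">
  <p>
    <label for="email">Email address:</label>
    <input type="text" id="email" name="email" />
  </p>
  <p>
    <!-- Each alt text describes the action performed, not merely "Button" -->
    <button type="submit"><img src="submit.png" alt="Submit form" /></button>
    <button type="reset"><img src="reset.png" alt="Reset form" /></button>
  </p>
</form>
```

With markup along these lines, a text-to-speech converter should be able to announce two distinct actions rather than "Button" twice.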



HTML5

Work began on HTML5 in 2004, but it was not finalised and published as a recommendation by the World Wide Web Consortium (W3C) until October 2014.  It is the fifth revision of the HTML standard; the previous version, HTML4, was standardised in 1997.

Features of HTML5

  • Markup: a range of new elements to reflect modern websites (tags like <video>, <audio>, <footer> etc.).  Other elements are removed, such as <font> and <center>.  Overall it should promote simplification of HTML design.
  • Drawing on the Page: HTML5 introduces <canvas> for 2D drawing directly in the browser.  For example, this could be used to render charts from a server-side database and present them directly in the client browser.  Other examples: http://www.devlounge.net/code/10-awesome-html5-canvas-examples
  • Video and Audio: HTML5 comes with a built-in video player which can be modified through CSS (we'll come to this next).  This should remove the need for plugins to access audio and video content on the web.
  • Application Front-end: HTML5 actually allows the generation of a web application front end through the use of Canvas.  However, it also introduces a range of new FORM controls (we'll come to this later).  You can find a list of these at: http://www.scriptol.com/html5/forms.php
  • Offline Storage Database: In a very simple explanation, this could be considered to be an improvement on the current cookies system, providing the ability to store more data and providing a better programming interface.  It also facilitates the idea of offline web applications (which can make sense in some scenarios).  Consider an application like Google Docs, where it may make sense at times to work offline until the document is complete and ready for saving on the server.
  • <datagrid>: proposed in early HTML5 drafts for presenting tabular data with extra features such as selecting rows and columns, sorting etc.  Note that this element was dropped before the specification was finalised.
  • Drag and Drop: the ability to drag and drop a designated element into another element without the requirement of major JavaScript libraries.  Also provides the ability to drag and drop files from folders and the desktop into web applications.
  • Inline Document Editing: HTML5 allows us to make elements editable, so that the user can edit their content.  In fact, it is also possible to make entire documents editable.  There is an attribute contenteditable which is used for this purpose.  Users can edit content directly in the page without an editor.  Try Google sites for an example of this - these notes are deployed on Google sites and I am currently typing directly onto this web page.
  • ... and many other features, too numerous to list here
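
To make a few of the features above concrete, the following is a small sketch of an HTML5 page using <video>, two of the newer form controls, the contenteditable attribute and <footer>.  The video file name (lecture.mp4), the form action and the field names are hypothetical; how each feature is rendered will vary between browsers:

```html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>HTML5 Features Sketch</title>
</head>

<body>
<h2>Video without a plugin</h2>
<!-- "lecture.mp4" is a hypothetical file name -->
<video src="lecture.mp4" controls="controls" width="320">
  Your browser does not support the video element.
</video>

<h2>New form controls</h2>
<!-- The action URL and field names are assumptions for illustration -->
<form action="/example-handler" method="post">
  <p><label for="when">Date:</label> <input type="date" id="when" name="when" /></p>
  <p><label for="mail">Email:</label> <input type="email" id="mail" name="mail" /></p>
  <p><input type="submit" value="Send" /></p>
</form>

<h2>Inline editing</h2>
<p contenteditable="true">Click on this paragraph and start typing to edit it.</p>

<footer>
<p>Footer content marked up with the new footer element.</p>
</footer>
</body>
</html>
```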


Further HTML Information

Sites of particular interest for both learning and reference are:



Cascading StyleSheets (CSS)

Cascading Style Sheets were first introduced in late 1996 and represented an exciting new opportunity to create more sophisticated page design, both in layout and content. They also greatly simplified the process of making web pages accessible to as many readers as possible, regardless of the device they use to read a page, and regardless of any disability they might have. CSS addresses the distinction between what a document should look like (referred to as its appearance) and the underlying structure of the document. When HTML first burst upon the scene, problematic ways of coding page appearance took off; among these were the <font> and <b> elements and other presentational HTML elements. Some of these elements have been shown in the previous section and indeed, many web developers still implement their web site appearance in this way.  In addition, a number of pseudo web tools, such as using Microsoft Word to save a document as a HTML file, will result in pages with large amounts of styling bundled into these HTML files.

Cascading Style Sheets [CSS] is a recommendation of the World Wide Web Consortium (W3C) and provides the means for web authors to separate the appearance of web pages from the content of web pages. This powerful tool allows developers to simplify the task of maintaining web sites while providing sophisticated layout and design features for web pages, without the need for plugins and long downloads. Specifically, the W3C made two recommendations, namely: Cascading Style Sheets 1 (CSS1) and Cascading Style Sheets 2 (CSS2), which incorporates and extends CSS1.  At the time of updating these notes, CSS Level 3 is under development.

Unlike CSS1 and CSS2, which were released as full specifications, CSS3 is being released as a number of separate modules.  As of January 2016, the current status looked something like this (from green to red -> Recommendation, Candidate Recommendation, Last Call, Working Draft).

CSS3 Specification Status
(By Krauss (Own work) [CC BY-SA 4.0 (http://creativecommons.org/licenses/by-sa/4.0)], via Wikimedia Commons)

During the initial years of CSS, there was little uptake in usage, for a number of reasons but principally due to browser compatibility.  CSS only really worked in Netscape 4.0 and Internet Explorer 3 upwards. While older browsers are little used today, it does explain why the early uptake on CSS was slow.  Even today, despite the best attempts of the W3C, there are a number of differences in support for CSS between the various browsers.

So essentially, CSS involves more time spent learning a new web technology, cross browser issues and possible headaches! So why bother?

  • Basic Principle - Web pages should separate content from appearance.  This means that the information in your web site should go into your HTML files, but those files should not contain information about how that information should be displayed. CSS implements this, as seen in the following figure (taken from The Complete CSS Guide):


    Figure 2.3. CSS: Separating Content from Appearance


  • Style Benefits - CSS styling allows access to a vast array of layout and presentation options, which are not available simply within plain HTML. Virtually everything relating to layout and presentation may be modified.

  • Single Style File - When using CSS the most common route is to create a .CSS file on your website, which is referred to by the HTML files on the site. Because all HTML files reference the same file dealing with style, a single change in the .CSS file propagates the change across the entire website. As an example, consider a web designer who wishes to change the standard text on the site to be 10pt and Arial (perhaps readers complained that the text was too small!); without CSS the web designer would need to change *every* font tag on *every* page in the site; with CSS, only one word and one number need to be changed. In large companies, various subsections can use the same set of style sheets, to maintain consistency across the company web site.


CSS Example

The following section provides a basic example of some of the features of CSS. Course participants should try their own examples and attempt to grasp the concepts behind CSS for themselves. In the examples, a style sheet is provided as well as a HTML file which "uses" the style sheet and the resulting browser output shown. By all means, if you have any older browsers, check how well these style elements are supported (as an example, link rollovers don't highlight in Netscape 4.7).

Example CSS File: example.css

a:link { font-family: Verdana, Geneva; font-weight: normal; font-style: normal; font-size: 9pt; color: orange; text-decoration: none; }
a:visited { font-family: Verdana, Geneva; font-weight: normal; font-style: normal; font-size: 9pt; color: orange; text-decoration: none; }
@media screen {    /* hide from IE3 */
    a:hover { color: blue; text-decoration: underline; }
}
body { font-family: Verdana, Geneva; font-style: normal; font-size: 8pt; color: black; background: green; }
h1 { font-family: Arial, Verdana; font-weight: normal; font-size: 14pt; color: blue; background: yellow; }
b { font-style: normal; font-weight: bold; }
b.somename { color: blue; }
td { font-family: Verdana, Geneva; font-size: 8pt; color: blue; background: yellow; }
td.heading { font-weight: bold; font-size: 10pt; color: white; background: blue; }
It should be noted that we are not restricted to basic named colours when specifying them in CSS files or style attributes.  One can define essentially any colour using RGB (Red Green Blue) values.  Example:   b { color: #ff0000; }  This would result in a red colour, as it represents in hexadecimal 255 red, 0 green, 0 blue.  There are 256 (0-255) x 256 x 256 possible combinations of colour using this approach, which is over 16.7 million combinations (2^24).  For more information and a useful colour selection chart, visit http://en.wikipedia.org/wiki/Web_colors.  Personally, I use the following site when choosing colours for websites, diagrams etc.: http://www.colorpicker.com/
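
As a brief sketch of hexadecimal colours in use, the page below sets text colour through the standard CSS color property.  The class names are hypothetical and chosen only for illustration:

```html
<!DOCTYPE html>
<html>
<head>
<title>Hex Colour Sketch</title>
<style type="text/css">
/* Each pair of hex digits is one channel, #RRGGBB, from 00 (0) to ff (255) */
.red    { color: #ff0000; }   /* 255 red,   0 green,   0 blue */
.green  { color: #00ff00; }   /*   0 red, 255 green,   0 blue */
.blue   { color: #0000ff; }   /*   0 red,   0 green, 255 blue */
.grey   { color: #808080; }   /* 128 red, 128 green, 128 blue */
.orange { color: #ff8000; }   /* 255 red, 128 green,   0 blue */
</style>
</head>
<body>
<p>
<b class="red">Red</b>, <b class="green">green</b>, <b class="blue">blue</b>,
<b class="grey">grey</b> and <b class="orange">orange</b> text.
</p>
</body>
</html>
```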

Example HTML file using example.css

<html>
<head>
<title>Sample CSS HTML File</title>
<link rel="stylesheet" type="text/css" href="example.css" />
</head>
<body>
<h1>A Really <b>Ugly Web</b> Page</h1>
<b>Please, Please, Please</b> take the time to choose a nice style for your webpages!
This should be much easier now that you are using CSS files in future!
<b class="somename">Please!!!!</b>
<br/>
<br/>We can put in a really ugly table now - obviously to highlight styles!
Perhaps, you should rewrite the style sheet to show a decent style! Practice!
<table>
<tr>
<td class="heading">Title Column 1</td><td class="heading">Title Column 2</td>
</tr>
<tr>
<td>Data A</td><td>Data B</td>
</tr>
<tr>
<td>Data C</td><td>Data D</td>
</tr>
</table>
<br/>Do you notice the way, that this file does not contain any font tags!
We are separating content from presentation!
<br/>
<br/>Why not visit the <a href="http://www.dcu.ie" target="_blank">DCU Website</a>
</body>
</html>

Now you should take a look at the resulting output from this web page and then return here for an explanation! View the Output.


CSS Explanation

The example above consists of two files, the CSS file, which handles style and layout and the HTML file, which handles content. Concentrating on the CSS file first:

CSS files consist of rules, which are made up of two main parts: selector ('h1') and declaration ('font-family .....'). The declaration itself is split up into a number of name-value pairs (properties) referring to elements of style for that selector (eg. font-size: 14pt). The selector is the link between the HTML document and the style sheet, and all HTML element types are possible selectors. The list of available properties, which can determine the presentation of an HTML document, can be found in the W3C CSS1 Specification.

Inheritance plays an important part in how CSS rules are applied. Consider the following from the HTML file:

<h1>A Really <b>Ugly Web</b> Page</h1>

Due to its position, "Ugly Web" will inherit all of the properties of the parent element h1, ie. Arial, 14pt, blue with yellow background, normal font. However, as it is also encapsulated by the <b> tags with its own set of properties these will override the parent element properties. So considering the two applicable lines from the example CSS file we have:

h1 { font-family: Arial, Verdana; font-weight: normal; font-size: 14pt; color: blue; background: yellow; }
b { font-style: normal; font-weight: bold; }

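It can be seen that "Ugly Web" will take font-style: normal and font-weight: bold from the 'b' selector, while it will also inherit font-family, font-size: 14pt and color: blue from the h1 selector.  (Strictly speaking, background is not an inherited property; the yellow h1 background simply shows through behind the <b> element, which has no background of its own.)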

To increase the granularity of control over elements, a new attribute has been added to HTML, namely: 'class'. All elements inside the 'body' element of a HTML document can be classed, and the class can be addressed in the style sheet. Normal inheritance rules apply to classed elements: they inherit values from their parent in the document structure. One example of this can be seen in the b.somename selector, which automatically picks up all properties from the b selector. Effectively what 'class' allows us to do is to use the same standard HTML tag in countless ways: for example, on different sections of the same site we might want different styles of bold text, so we can define b.newsitem, b.highlightedlarge and so on.
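
The b.newsitem and b.highlightedlarge selectors mentioned above could be sketched as follows.  The selector names come from the text, but the property values chosen for them are assumptions made purely for illustration:

```html
<!DOCTYPE html>
<html>
<head>
<title>Class Selector Sketch</title>
<style type="text/css">
/* Applies to every <b> element */
b { font-style: normal; font-weight: bold; }
/* Applies only to <b class="newsitem">, in addition to the b rule */
b.newsitem { color: darkgreen; }
/* Applies only to <b class="highlightedlarge">, in addition to the b rule */
b.highlightedlarge { font-size: 14pt; background: yellow; }
</style>
</head>
<body>
<p>
Ordinary <b>bold text</b>, a <b class="newsitem">bold news item</b> and
some <b class="highlightedlarge">large highlighted bold text</b>.
</p>
</body>
</html>
```

All three pieces of text remain bold, since each classed element still matches the plain b rule; the class rules only add to it.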

Now moving on to the HTML file:

In order to specify which style sheet (css file) is to be used on any page, a line such as the following is added to the HEAD section of each webpage:

<link rel="stylesheet" type="text/css" href="example.css" />

Obviously the href must point at an existing style sheet, such as that in the example - one which will recognise the subsequent tags in the HTML file. So for example, in a large website, each individual page would include the above line to "point" at its applicable stylesheet. At the client, when the browser views a webpage and discovers a stylesheet link, it will check all subsequent tags for a relevant entry in the stylesheet and display the content appropriately. Where 'class' is used it is simply defined by an extra attribute within the standard HTML tag, such as:

<b class="somename">Please!!!!</b>
The above code gives plenty of examples of how we can affect the style and presentation of our HTML pages by combining them with CSS files.  Again, this is not a web design module and we won't be looking at CSS styling in more depth.  However, we should also provide some discussion regarding CSS for layout.

CSS for Layout

In addition to the standard styling CSS properties affecting font color, size, font, background colour, borders, opacity, etc., CSS also provides a number of options relating to positioning and layout.
The CSS positioning properties allow the developer to position elements.  However, before we talk about these positioning properties, let us first introduce <div> and <span>.  These two elements are extremely useful to us when working with CSS.  

  • <div> provides the opportunity to apply CSS style to whole sections of HTML elements.  In addition, it also provides the facility to provide an identifier for sections of HTML.  This will prove very useful later in the module when we start working with JavaScript and in particular jQuery.  The primary attributes of the <div> (and <span>) element are:  style, class and id.  Let us take an example:
<h2>Div Example</h2>
<p>
Some text before the div we are demonstrating.....

<div id="news_section" style="font-size:8pt" class="news">
    Today in the news were a number of startling revelations...
</div>
Some text after the div we are demonstrating....
</p>

In a corresponding CSS file, there could be an entry such as:

div.news { font-family: Arial, Verdana; color: black; background: #eeeeee; margin:20px; padding:10px; }

Of course, the style attribute could be made redundant by placing the 'font-size' property into the .news entry of the CSS file.  As this code stands, the styling properties from both the CSS .news entry and the style attribute are applied to this div.  By wrapping other elements, such as headings and lists, in this div, we would effectively apply the presentation properties of that div to those underlying elements.  Of course, these child elements are free to override any of these styling properties with their own (e.g. h3 { color: red; font-size:14pt; } )

It is possible to use either .news or div.news when applying style to this div.  Technically, div.news is better practice, since .news applies to any elements of class name 'news' which might exist in any pages referencing this CSS file.  div.news is a little more specific and less likely to cause strange issues later.
  • <span> actually operates very similarly to <div> in that it can be used to change the properties of the elements that it encloses.  The primary difference between <div> and <span> is that <span> does not do any formatting of its own.  <div> actually acts as a paragraph type break, moving the content of the <div> to a new "line".  <span> can be used inline inside other elements, such as paragraphs, without affecting the paragraph.  

Let us see this with a similar example demonstrating both <div> and <span> and view the output:

<html>
<head>
<title>Div and Span Example</title>
<style type="text/css">
div.news { font-family: Arial, Verdana; color: black; background: #eeeeee; margin:20px; padding:10px; }
span.announcements { font-family: Arial, Verdana; color: black; background: lightblue; margin:20px; padding:10px; }
</style>
</head>

<body>
<h2>Div Example</h2>
<p>
Some text before the div we are demonstrating.....
<div id="news_section" style="font-size:8pt" class="news">
    Today in the news were a number of startling revelations...
</div>
Some text after the div we are demonstrating....
</p>

<hr/>   <!-- Putting in a horizontal rule to separate the examples -->

<h2>Span Example</h2>
<p>
Some text before the span we are demonstrating.....
<span id="nameofspan" style="font-size:10pt" class="announcements">
    Today a number of announcements were made...
</span>
Some text after the span we are demonstrating...
</p>
</body>
</html>
          Source Code: div_span_example.html

Note: For ease of demonstration in this example, I have provided an internal style sheet.  This is achieved by wrapping the CSS in <style> tags inside the <head> section of a webpage.  In general, a separate file is typically used, as multiple pages would have common styling.


Figure 2.4. Output from Div and Span example


Both the <div> and <span> elements have had their background colours changed to demonstrate the difference between these two elements.  As can be seen, the <div> example has effectively placed the <div> into its own block, whereas the <span> example has operated inline with no style changes other than those specified in the CSS.

This brings us nicely to the two main different types of HTML elements:
  • Block-level Elements : block level elements are elements such as <p> (paragraph) or <div>.  They start and end their content with new lines and may contain other block or inline elements.  For example, the block-level element <p> may contain inline elements such as <b> or <i>:
<p>This is a paragraph with some <b>emphasised text</b> and some <i>italicised text</i></p>
  • Inline Elements : inline elements are elements such as <b>, <i> and <span>.  They define text or data in a document but don't start new lines when they are used.  They typically only contain other inline elements and text/data.  A common mistake by individuals writing HTML/CSS is to try to set widths and heights for inline elements - this should not be attempted and is not supported by standards compliant browsers (in fact only a few earlier IE browsers support it, due to bad standards compliance).
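
The difference in how width and height are treated can be seen in the following sketch, where the same (hypothetical) class is applied to one block-level and one inline element.  In a standards compliant browser, only the <div> should take on the specified dimensions:

```html
<!DOCTYPE html>
<html>
<head>
<title>Block vs Inline Width Sketch</title>
<style type="text/css">
/* The same class is applied to a block-level and an inline element */
.sized { width: 200px; height: 60px; background: lightblue; }
</style>
</head>
<body>
<!-- Block-level: the 200px width and 60px height are applied -->
<div class="sized">This div is 200 pixels wide and 60 pixels high.</div>

<p>
Some paragraph text,
<!-- Inline: width and height do not apply; the box is only as big as its text -->
<span class="sized">this span ignores the width and height,</span>
and the paragraph carries on.
</p>
</body>
</html>
```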

Creating Page Structures

Traditionally, when creating more complicated layouts, tables were often used.  In general, this is bad practice as tables were originally intended for containing tabular data.  By combining <div>s with CSS positioning properties, we remove the need to use the <table> element for layout.

One of the things you may have noticed about the <div> element in the example above, is that it defaults to 100% of the available width of the page.   However, unlike inline elements, we are fully entitled to modify the width and height of block elements.  So we will use <div> for this purpose.  Widths and heights can be specified in units or as a percentage of the overall space.  Let's take an example where we want to place three columns side by side to lay our page out like a newspaper:

<html>
<head>
<title>Newspaper Layout Example</title>
<style type="text/css">
body { width: 900px; }
.columnleft { float:left; width:120px; background-color:lightblue }
.columnmiddle { float:left; width:600px; background-color:lightgreen }
.columnright { float:left; width:180px; background-color:brown }
</style>
</head>

<body>
<h2>Newspaper Column Examples</h2>
<p>Below we show three columns with different widths:</p>

<div class="columnleft">
<h4>Navigation</h4>
<p>This is the text content for the leftmost column</p>
</div>

<div class="columnmiddle">
<h4>Main Content</h4>
<p>The quick brown fox jumped over the lazy dogs.</p>
</div>

<div class="columnright">
<h4>News Column</h4>
<p>This is the text content for the rightmost column</p>
</div>

</body>
</html>

Figure 2.5: Output from columns_example.html


There are a few points of note to be made regarding this example:
  • We have introduced a new style property called float.  Float allows us to "float" a block element, positioning it either to the left or right of existing elements.  In this situation, we have floated each of the elements to the left of the remaining space in the containing element (<body> in this situation).  The <body> element is 900 pixels wide.  After the browser positions the 120 pixel first column there are 780 pixels remaining, and so there is room to position the 600 pixel middle column to the left of that remaining space.  Finally, we can squeeze the 180 pixel right column into the final 180 pixels remaining in the <body> parent element.
What happens if we change the width property of body to 899 pixels?  Try it and see - it is probably not quite what you would expect.  Now try and set all of the heights of the divs to be 200px and see what happens now? Probably a bit closer to what you were expecting in the first instance.  When working with float you will find that while it is extremely powerful, it will take some time before it becomes intuitive to use, particularly on more complicated layouts and when margin and padding properties are used.
  • This layout provides us with a page of fixed width.  What width should that be?  There have been raging debates in web circles for years regarding fixed width vs dynamic widths.  Fixed widths are easier to work with but can result in a lot of browser whitespace for users with larger resolution monitors.  Equally, making a page 1200 pixels wide might work great on a high resolution machine, but introduces horrible horizontal scrollbars for users with 800x600 resolution screens.  
Generally, my preference is to work with stretchable pages with a minimum specified width (any stretchable page is going to look rubbish when reduced to 100 pixels wide if someone makes their browser tiny).  This is slightly complicated to demonstrate, so we're just going to provide an example of dynamic widths and we'll introduce a head column to make it more interesting (one where you might typically place a logo, search tool etc.).  We will also introduce margin and padding to stop content ending up on top of each other.  So let's just modify the previous example somewhat.

<html>
<head>
<title>Newspaper Layout Example</title>
<style type="text/css">
body { }
.header { float:left; width:100%; background-color:blue }
.columnleft { float:left; width:15%; background-color:lightblue }
.columnmiddle { float:left; width: 60%; background-color:lightgreen }
.columnright { float:left; width:25%; background-color:brown }
.paddedcontent { padding:10px; }
</style>
</head>

<body>
<div class="header">
<h2>Newspaper Column Examples</h2>
<p>Below we show three columns with different widths:</p>
</div>

<div class="columnleft">
<div class="paddedcontent">
<h4>Navigation</h4>
<p>This is the text content for the leftmost column</p>
</div>
</div>

<div class="columnmiddle">
<div class="paddedcontent">
<h4>Main Content</h4>
<p>The quick brown fox jumped over the lazy dogs. The quick brown fox jumped over the lazy dogs.
The quick brown fox jumped over the lazy dogs.The quick brown fox jumped over the lazy dogs.</p>
</div>
</div>

<div class="columnright">
<h4>News Column</h4>
<p>This is the text content for the rightmost column</p>
</div>

</body>
</html>

Figure 2.6: Output from columns_2_example.html


This example will expand to the full size of the browser screen, regardless of whether the browser is small or large.  This occurs because we have removed the width specification from the body rule and changed all of the other divs to be a percentage of the total width, rather than a specific pixel dimension.  The header is quite simply a 100% div, so the next div box ('columnleft') is forced to be below this header when it attempts to "float:left".  

The only other aspect we have introduced is a new div class 'paddedcontent' which we use to provide some padding in the navigation and main content columns.  You can see how the 'News Column' has not had the padding applied and the text runs right to the edge of the column.  Without any padding, the text from the columns would pretty much run right up to each other, resulting in confusing reading for users of the site, particularly if columns were not highlighted in bright colours.
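
The minimum width mentioned earlier was not demonstrated.  One possible way of sketching it is with the CSS min-width property on the body, as below.  This is only one approach, not necessarily the one the author has in mind, and the 600px value is an arbitrary assumption:

```html
<!DOCTYPE html>
<html>
<head>
<title>Stretchable Layout with Minimum Width (Sketch)</title>
<style type="text/css">
/* Assumption: 600px is an arbitrary minimum chosen for illustration */
body { min-width: 600px; margin: 0px; }
.header { float:left; width:100%; background-color:blue }
.columnleft { float:left; width:15%; background-color:lightblue }
.columnmiddle { float:left; width:60%; background-color:lightgreen }
.columnright { float:left; width:25%; background-color:brown }
</style>
</head>

<body>
<div class="header"><h2>Header</h2></div>
<div class="columnleft"><p>Navigation</p></div>
<div class="columnmiddle"><p>Main content</p></div>
<div class="columnright"><p>News</p></div>
</body>
</html>
```

The columns still stretch with the browser window, but once the window is narrower than 600 pixels the browser should show a horizontal scrollbar rather than squashing the columns any further.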

The Box Model

The final concept we will introduce relating to CSS is the "Box Model", which is extremely important in HTML layout. All HTML block-level elements have five spacing properties:
  1. width
  2. height
  3. margin
  4. border
  5. padding
Let us take a look at this in a diagram: (Reference: http://www.mandalatv.net/itp/drivebys/css/)
Figure 2.7: The Box Model
  
Margins set the outward spacing, while padding sets the inward spacing.  The border properties allow the creation of drawn borders around elements.  

Taking margin as an example, we can either define style properties for the entire margin (top,bottom,left and right) or we can individually apply styles using margin-bottom, margin-top etc.  
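  • Default margins, borders and padding for a plain block-level element such as <div> are all set to 0 (browsers do apply default margins to some elements, such as <body>, <p> and the headings).
  • Default widths for block-level elements are 100% of the available width.
  • Default heights rely entirely on the contents of that element, although specific heights may be set.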
To demonstrate the box model, let us take an example of one div placed inside another:

<html>
<head>
<title>Box Model Example</title>
<style type="text/css">
body { background-color:green; margin:0px; }    /* body actually defaults with some margin  */
.outerbox { background-color:red; width:300px; margin:0px; padding: 20px; }
.innerbox { background-color:blue; width:200px; margin-top:50px; padding:10px; border:8px dashed black; }
</style>
</head>

<body>
<div class="outerbox">
This is the outer box
<div class="innerbox">
This is the inner box
</div>
</div>

</body>
</html>

Figure 2.8: Box Model Example with Measurements

One important thing to note from this example is that although the style properties give the inner box a width of 200px, the box it actually occupies is larger than this.  To calculate the dimensions of either box:

Outer Box
Width:  300px + 20px (padding left) + 20px (padding right) = 340px
Height: Height of content + 20px (padding top) + 20px (padding bottom) = 40px + height of content  (we can specify a value for height if we want)

Inner Box
Width:  200px + 10px (padding left) + 10px (padding right) + 8px (border left) + 8px (border right) = 236px
Height: Height of content + 10px (padding top) + 10px (padding bottom) + 8px (border top) + 8px (border bottom) + 50px (margin top) 
    = 86px + height of content (we can specify a value for height if we want)

When laying out block-level elements in websites, it is important to understand the box model and, in particular, the implications of margin, padding and border for the overall dimensions of these boxes.  For example, consider again columns_example.html.  If we were to add any margin, padding or border properties to any of the columns, the columns would no longer fit side by side and the third column would be forced below the others.  This would occur despite the sum of the width properties still being 900px, the defined width of the body.
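
As a sketch of how this can be handled, the columns from columns_example.html can be given padding provided that each width is reduced by the same amount.  The 10px padding and the reduced widths below are assumptions chosen so that the totals still add up to 900px:

```html
<!DOCTYPE html>
<html>
<head>
<title>Columns with Padding (Sketch)</title>
<style type="text/css">
body { width: 900px; }
/* Each box occupies: width + padding left + padding right */
/* 100 + 10 + 10 = 120px */
.columnleft { float:left; width:100px; padding:10px; background-color:lightblue }
/* 580 + 10 + 10 = 600px */
.columnmiddle { float:left; width:580px; padding:10px; background-color:lightgreen }
/* 160 + 10 + 10 = 180px */
.columnright { float:left; width:160px; padding:10px; background-color:brown }
/* Total: 120 + 600 + 180 = 900px, so the three columns still fit in the body */
</style>
</head>

<body>
<div class="columnleft"><h4>Navigation</h4></div>
<div class="columnmiddle"><h4>Main Content</h4></div>
<div class="columnright"><h4>News Column</h4></div>
</body>
</html>
```

Restoring any one of the widths to its original value (for example 120px for the left column) would bring the total above 900px, and the right column should then drop below the others.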

Further CSS Information

There are a range of further features within CSS regarding layout, such as the position property (relative, fixed, static and absolute).  These will not be covered in this module, as we do not wish to focus overly on CSS.  For further information on the above topics, positioning and other aspects, please try the following links:


HTML Forms

The motivation for dealing with Forms in this course lies in their use as a client front-end to our server-side applications. Hence this section only provides a brief introduction to forms, to enable students to write their own forms, which they will use at a later stage, particularly in assignments. Forms exist on the client-side, within the browser. A form is simply a web page with some additional markup tags to instruct a web browser how to display the various form elements, such as checkboxes, selection lists, buttons and user-editable text areas. However, the web page itself does not process the data, nor does the web server, which doesn't know what you'd like to do with the user's answers. A separate program or script (typically server-side), must process that data, in whatever way you wish.

An HTML form is a section of a document containing normal content, markup, special elements called controls and labels on those controls. Examples of controls include checkboxes, menus, radio buttons and ordinary buttons. Forms are commonly seen on any interactive website, in particular where the user is expected to submit information or make choices. Users typically complete a form by modifying its controls, such as typing in a text box, selecting items etc. before selecting to submit the form to an external server to process. Controls are defined similarly to standard HTML elements using attributes. A control's "control name" is given by its name attribute. Each control has both an initial value and a current value, both of which are character strings. The details can change between form elements, but generally a control's initial value may be specified with the control element's value attribute. A control's initial value does not change, and hence when a form is reset, this initial value is used to reset the control's value.

Rather than replicating the very detailed specification for forms, students are recommended to examine the following W3C Guide to Forms, which is part of its HTML 4.0 recommendation. Students should be familiar with the available form types and the principal ways in which they are used. It is more important that students understand forms and how to apply them, rather than trying to learn off the W3C recommendations by heart! (Don't do this :) ) During the assignments/exam stage, students may be required to create their own forms, which will interact with server-side applications.

However, a brief example of a form is provided below to help with the understanding of forms.


Forms Example

The following code provides an example, which covers the more common form elements used. From the previous section regarding HTML you should at this stage be able to distinguish the newer form HTML tags from those already covered. The first line encountered relating to the form is the <form> line, with its method and action attributes. It is the action attribute which provides the destination address (query URL) for the form to the browser. The destination typically takes the form of a server-side application, such as a servlet or cgi/php script, which has been coded to know exactly what to do with the form upon receipt of the data. It may, for example, email the data to an administrator, save the data in a file/database or perform some calculations/queries and return some dynamic data. The method attribute of the same element is the HTTP method used to submit the filled-out form to the query server, such as GET or POST, which were described previously in this chapter. The exact structure and available attributes for the controls can be viewed in the above W3C guide. Perhaps one of the easier ways of learning to write your own forms is to take the example below and experiment with it, adding and removing fields and studying the effects.

<html>
<head>
  <title>Form Example</title>
</head>
<body>
  <h2>Form Example</h2>
  <p>This is a simple example of a form, with multiple controls:</p>

  <form method="post" action="http://www.aserversomewhere.com/cgi-bin/formhandler" name="ourform">
    <br/>
    <h3>Text Entry Example</h3>
    <p>The first text entry field, with default value "test" is here: <input name="text1" value="test"></p>
    <p>The second text entry field, with no default value is here: <input name="text2"></p>
    <br/>

    <h3>Checkboxes Example</h3>
    <p>Now, here's three checkboxes right in a row:</p>
    <ol>
      <li><input type="checkbox" name="box1" value="cb1" checked> Checkbox 1, on by default.</li>
      <li><input type="checkbox" name="box2" value="cb2"> Checkbox 2, off by default.</li>
      <li><input type="checkbox" name="box3" value="cb3" checked> Checkbox 3, on by default.</li>
    </ol>
    <br/>

    <h3>Radio Buttons Example</h3>
    <p>The thing about radio buttons, is that although they look a lot like checkboxes, you can only select one option at a time:</p>
    <ol>
      <li><input type="radio" name="paymethod" value="cash" checked> Cash</li>
      <li><input type="radio" name="paymethod" value="check"> Cheque</li>
      <li><i>Credit card:</i>
        <ul>
          <li><input type="radio" name="paymethod" value="mastercard"> Mastercard</li>
          <li><input type="radio" name="paymethod" value="visa"> Visa</li>
          <li><input type="radio" name="paymethod" value="americanexpress"> AMEX</li>
        </ul>
      </li>
    </ol>
    <br/>

    <h3>Select Example</h3>
    <p>Selects perform a similar function to radio buttons, except that they are displayed on one line and hence are more compact and common.</p>
    <br/>
    <br/>
    Which selection would you like to make?
    <select name="select1">
      <option selected>This one</option>
      <option>No this one!</option>
      <option>The other one</option>
      <option>The last one!</option>
    </select>
    <br/>
    <br/>

    <font color="red"><b>Normally you would submit the form using this button. However, the form needs to be sent somewhere!
    <br/>This is where the server-side element kicks in!
    See the notes for more details!</b></font>
    <br/><input type="submit" value="Submit Query">

    <p>To reset the various elements to their default states, press this button: <input type="reset" value="Reset To Default Values"></p>
  </form>
</body>
</html>

Now, you should view the resulting HTML page.

Note: if you copy/paste the above example directly from the notes and save it using Notepad as test.html on your local machine, you can then open it in a browser (Select File/Open/Browse). It is then easy to modify the test.html file, save it and click 'Refresh/Reload' in your browser, to experiment with the effects of your modifications.
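To make the link between the form and the server-side program a little more concrete, the sketch below shows roughly what a browser would send if the example form were submitted with its default values untouched. It uses Python's standard urllib.parse module purely for illustration. The list of name/value pairs was worked out by hand from the markup (the checked boxes, the checked radio button and the selected option) and is an assumption about typical browser behaviour rather than captured output.

```python
from urllib.parse import parse_qs, urlencode

# Controls that would be submitted with the form's default values.
# Unchecked checkboxes (box2) and unselected radio buttons are left out;
# the submit button has no name attribute, so it is left out as well.
default_fields = [
    ("text1", "test"),
    ("text2", ""),            # empty text fields are still sent
    ("box1", "cb1"),
    ("box3", "cb3"),
    ("paymethod", "cash"),
    ("select1", "This one"),  # an <option> without a value attribute submits its text
]

# Body of the POST request (application/x-www-form-urlencoded encoding).
request_body = urlencode(default_fields)

# What a server-side script might do first: decode the pairs again.
decoded = parse_qs(request_body, keep_blank_values=True)
```

Note that only the control names and their current values travel to the server; the labels and the rest of the page do not.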

As mentioned before, there is a range of new form controls available in HTML5, but for now we will be content with the options available in HTML 4.01.


Further Form Information

Sites of particular interest for both learning and reference are:



Document Object Model (DOM)

The Document Object Model, DOM for short, allows us to programmatically access and manipulate the contents of a web page (or XML document).  It provides us with an object-oriented representation of the elements and content of a page combined with methods for retrieving and setting properties for any of these elements.  The Document Object Model is platform and language neutral and allows programs to access and update the content, style and structure of these pages.  It does this through the provision of an Application Programming Interface (API).    

After we introduce JavaScript in the next section we will implement coding examples where we analyse and manipulate the DOM structure of web pages, but for now we will deal mainly with the referencing of elements and content.

History & Standardisation

DOM is standardised by the World Wide Web Consortium (W3C), with the initial DOM standard, known as 'DOM Level 1', recommended in 1998.  DOM Level 2 followed in 2000 and introduced a range of additional functionality, including the important "getElementById" function.  DOM Level 3, the current release of DOM, was introduced in 2004 and in turn added a range of new features.

Since 2005, DOM has been well supported in the majority of modern browsers, including IE, Firefox, Chrome, Safari and Opera.

DOM Document Tree

When a browser loads up a page, it forms a hierarchical representation of the contents of that page, resulting in a tree-like organisation of nodes.

Figure 2.9: Document Object Model Tree Structure

Each of the nodes represents an element, an attribute, some content or some other object.  Figure 2.10 shows the same diagram represented simply as nodes, each of which we will label for the purpose of demonstration.

Figure 2.10: DOM Structure showing Nodes

As can be seen, there are a number of different types of nodes shown.  Let us talk about these in the table below:

Node Type | Description | Typical Children*
----------|-------------|------------------
Document (NodeA) | Represents the entire document.  The root node of the DOM tree structure. | Element (max one), DocumentType, Comment, others*
DocumentType (NodeA1) | Provides access to the attributes of the Document Type Definition (DTD).  More on this later in the course. | None
Element (e.g. NodeA2) | Represents an element | Element, Attributes, Text, Comment, others*
Attribute (e.g. NodeA2c) | Represents an attribute | Text
Text (e.g. NodeA2d) | Typically the textual content of an attribute or element | None
* Some other child possibilities can occur, but we are concerned with the basic common ideas.  More details on node types can be found at W3Schools.
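Because the DOM is language neutral, these node types can be inspected from any language with a DOM implementation. The sketch below uses Python's standard xml.dom.minidom module rather than a browser; the tiny <note> document is invented purely for illustration.

```python
from xml.dom import Node, minidom

# A made-up document with a doctype, one element, one attribute and some text.
doc = minidom.parseString('<!DOCTYPE note><note id="n1">Remember this</note>')

doctype = doc.firstChild                     # DocumentType node
element = doc.documentElement                # the <note> element
attribute = element.getAttributeNode("id")   # Attribute node
text = element.firstChild                    # Text node inside <note>

# Every node reports its type through the nodeType property.
is_document = doc.nodeType == Node.DOCUMENT_NODE
is_doctype = doctype.nodeType == Node.DOCUMENT_TYPE_NODE
is_element = element.nodeType == Node.ELEMENT_NODE
is_attribute = attribute.nodeType == Node.ATTRIBUTE_NODE
is_text = text.nodeType == Node.TEXT_NODE
```

The nodeType property is a convenient check when walking a tree, because a child may just as easily be a text node as an element.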

We indicated that DOM provides us with a way of modelling and modifying the hierarchical structure of a document.  To help with this, the Node object provides us with a number of properties used for navigating this tree structure of further nodes:

  • Node.firstChild: Returns the first child node belonging to this node
    e.g. NodeA.firstChild = NodeA1
         NodeA2a.firstChild = NodeA2a1
  • Node.lastChild: Returns the last child node belonging to this node
    e.g. NodeA2.lastChild = NodeA2d
  • Node.childNodes: Returns the 0-indexed array of child nodes belonging to this node
    e.g. NodeA2.childNodes.length = 4
         NodeA2.childNodes[0] = NodeA2a
         NodeA2a.childNodes[2] = NodeA2a3
  • Node.parentNode: Returns the parent of this node
    e.g. NodeA2a.parentNode = NodeA2
  • Node.nextSibling: Returns the next sibling node for this node
    e.g. NodeA2a.nextSibling = NodeA2b
  • Node.previousSibling: Returns the previous sibling node for this node
    e.g. NodeA2b.previousSibling = NodeA2a
  • Combining properties: We can combine these properties in the following manner
    e.g. NodeA2b1.parentNode.parentNode = NodeA2
         NodeA.lastChild.firstChild = NodeA2a
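The navigation properties listed above can be tried out without a browser. The following is a minimal sketch using Python's standard xml.dom.minidom module, which exposes the same property names; the small document is made up for the example and deliberately contains no whitespace between tags.

```python
from xml.dom import minidom

# An invented document: a <body> with three element children.
doc = minidom.parseString(
    "<body><h1>Heading</h1><p>First</p><p>Second <b>bold</b></p></body>"
)
body = doc.documentElement
h1, first_p, second_p = body.childNodes      # childNodes is 0-indexed

child_count = body.childNodes.length          # number of children of <body>
first_is_h1 = body.firstChild is h1
last_is_second_p = body.lastChild is second_p
parent_is_body = first_p.parentNode is body
next_is_second_p = first_p.nextSibling is second_p
previous_is_first_p = second_p.previousSibling is first_p

# Combining properties: from <b> up two levels, and from the document down two.
bold = second_p.lastChild
grandparent_is_body = bold.parentNode.parentNode is body
doc_path_is_h1 = doc.lastChild.firstChild is h1

# A node with no sibling on that side reports None.
no_previous = h1.previousSibling is None
```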

In our description above, we also indicated that DOM can be used not only to reference parts of documents, but also to update and modify the content of these documents.  The Node interface also provides a number of methods for adding, removing and updating nodes, such as removeChild(), appendChild(), replaceChild(), cloneNode() and insertBefore().  We will not discuss these in any more detail however.

The Whitespace Issue

As has been mentioned, an element may have children of type "text".  Consider the following example:

<div>
    <p>A paragraph <b>with bold </b> and normal text</p>
</div>

Let us consider <div> to be NodeA.  What is the expected outcome of NodeA.childNodes.length?  A smart reader would say 'One', the paragraph element <p>.  An even smarter one would say 'Three'.  The reason for this is that under the DOM approach, "whitespace" (i.e. spaces, carriage returns, tabs etc.) is viewed as being a text node.  So looking again at the same snippet, with the two whitespace text nodes labelled NodeA1 and NodeA3:

<div>    
  NodeA1   <p>A paragraph <b>with bold </b> and normal text</p>  NodeA3 
</div>

The presence of whitespace in DOM can cause a number of unforeseen issues when programming in the Document Object Model.  There are a number of methods in JavaScript which reduce the effect of this problem.  In addition, it is possible when generating HTML code dynamically to remove the effect by simply avoiding whitespace.  For example:

<div><p>A paragraph <b>with bold </b> and normal text</p></div>

NodeA.childNodes.length is now 1, the paragraph element (which in turn has three child nodes: text, the <b> element and text).
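The two versions of the snippet can be compared directly. The sketch below again uses Python's xml.dom.minidom as a stand-in DOM implementation (an assumption of this example; a browser would be driven from JavaScript instead), and simply counts the children of <div> in each case.

```python
from xml.dom import minidom

WITH_WHITESPACE = """<div>
    <p>A paragraph <b>with bold </b> and normal text</p>
</div>"""

WITHOUT_WHITESPACE = "<div><p>A paragraph <b>with bold </b> and normal text</p></div>"

spaced_div = minidom.parseString(WITH_WHITESPACE).documentElement
compact_div = minidom.parseString(WITHOUT_WHITESPACE).documentElement

spaced_count = spaced_div.childNodes.length    # whitespace, <p>, whitespace
compact_count = compact_div.childNodes.length  # just <p>

# The first child of the spaced version is a text node holding only whitespace.
leading_text = spaced_div.firstChild.data

# In both versions <p> itself has three children: text, <b>, text.
paragraph_children = compact_div.firstChild.childNodes.length
```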

HTML Sample Tree Structure Question

Let us consider <html> as being called Node1 in this document:
<html>
<head><title>My Page</title></head>
<body>

<h1>My First Heading</h1>

<p>My first paragraph which contains
some <b>bold</b> and some <i>italics</i></p>

<ul>
   <li>Item 1</li>
   <li>Item 2</li>
   <li>Item 3</li>
</ul>

</body>
</html>

Under DOM, answer the following questions:
- What is Node1.firstChild?  The whitespace before the <head> element.
- How would we reference the <body> element? Node1.childNodes[3] *
- How would we reference the text ' and some ' ? Node1.childNodes[3].childNodes[3].childNodes[2]

* Note: There is not always only one answer to these questions.  As an example, the following would also answer this question (although it is longer):  Node1.firstChild.nextSibling.nextSibling.nextSibling
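The answers above can be checked mechanically. The sketch below feeds the same page to Python's xml.dom.minidom, which is a strict XML parser that keeps every whitespace text node. This is an assumption of the example: a browser's more forgiving HTML parser may treat the whitespace around <head> and <body> differently, so the indices should be read as answers under the DOM rules described in these notes.

```python
from xml.dom import Node, minidom

PAGE = """<html>
<head><title>My Page</title></head>
<body>

<h1>My First Heading</h1>

<p>My first paragraph which contains
some <b>bold</b> and some <i>italics</i></p>

<ul>
   <li>Item 1</li>
   <li>Item 2</li>
   <li>Item 3</li>
</ul>

</body>
</html>"""

node1 = minidom.parseString(PAGE).documentElement   # the <html> element

# Question 1: the first child is the whitespace before <head>.
first_child = node1.firstChild
first_is_whitespace = (
    first_child.nodeType == Node.TEXT_NODE and first_child.data.strip() == ""
)

# Question 2: the <body> element.
body = node1.childNodes[3]

# Question 3: the text between <b> and <i>.
middle_text = node1.childNodes[3].childNodes[3].childNodes[2].data

# The alternative route to <body> given in the note.
body_via_siblings = node1.firstChild.nextSibling.nextSibling.nextSibling
```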

HTML, XHTML and DOM

At this point in the course, we are mostly concerned with web page documents.  Reading the above material, you may also have realised the benefits of properly written XHTML over that of badly-written HTML.  If elements aren't properly closed, are poorly nested, have attributes without quotes, changes from upper case to lowercase etc. then the DOM model effectively breaks down (although the problem is with your HTML rather than the DOM model).  Let's take two snippets of HTML code:

Badly written HTML firstly:
<P>This is a paragraph with some <b>bolded and <i>italicised text</b></i>
<p>This is the next paragraph.

Now let's write it properly:
<p>This is a paragraph with some <b>bolded and <i>italicised text</i></b></p>
<p>This is the next paragraph.</p>

So now, let's try copying the code from each of these examples to the HTML previewer we've seen before: W3Schools TryIt HTML Editor

What do we learn from this?  Seemingly at first glance, it makes no difference.  Both the badly written code and properly written code result in the exact same output on this previewer (and on any popular browser if you try it).   In fact, most pages on the internet are badly written and non standards-compliant.  It is for this reason that web browsers are so forgiving in their ability to handle badly written code.

So why bother?  The problems with badly written code will manifest themselves when the developer attempts to write JavaScript and DOM code to dynamically reference and modify existing document structures.  For example - consider the badly written code above and let's call <P> Node1.  What is the effect of calling Node1.childNodes[1] and Node1.childNodes[1].firstChild?  As these elements are not well formed, it will result in errors or unreliable referencing at best.  
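One way to see the difference between the two snippets is to hand them to a strict parser instead of a forgiving browser. The sketch below uses Python's xml.dom.minidom as a rough stand-in for how XHTML is handled when it is treated as XML; the wrapping <div> and the helper name is_well_formed are assumptions added only so that each snippet has a single root element and can be tested.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

BADLY_WRITTEN = (
    "<div>"
    "<P>This is a paragraph with some <b>bolded and <i>italicised text</b></i>\n"
    "<p>This is the next paragraph."
    "</div>"
)

PROPERLY_WRITTEN = (
    "<div>"
    "<p>This is a paragraph with some <b>bolded and <i>italicised text</i></b></p>\n"
    "<p>This is the next paragraph.</p>"
    "</div>"
)

def is_well_formed(markup):
    """Hypothetical helper: True if a strict XML parser can build a DOM tree."""
    try:
        minidom.parseString(markup)
        return True
    except ExpatError:
        return False

bad_result = is_well_formed(BADLY_WRITTEN)
good_result = is_well_formed(PROPERLY_WRITTEN)

# Only the properly written version yields a tree with predictable references.
tree = minidom.parseString(PROPERLY_WRITTEN).documentElement
nested_text = tree.firstChild.childNodes[1].childNodes[1].firstChild.data
```

A browser will still display both snippets, but only for the well-formed one can the position of each node be predicted from the source.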

Making things easier in JavaScript: getElementById()

In the situation where we simply want to reference a <div>, <span> or other element, it would not make sense to try and traverse a whole HTML document.  We would end up with references like:  document.firstChild.childNodes[3].firstChild.childNodes[4].nextSibling.firstChild etc.  If a change were ever made to the document, it would have the potential to break any references such as these used in our JavaScript code.

To the rescue comes getElementById().  This accesses the first element with the specified id.  In JavaScript (which we haven't introduced yet) it is referred to by:

document.getElementById("id")

"id" is required and is the id of the element that we wish to access or manipulate.  So considering the following sample HTML code:

<html>
<head>
<title>Sample Page</title>
</head>
<body>
    <h1>This is the title</h1>
    <div id="firstparagraph"><p>This is the contents of the first paragraph</p></div>
</body>
</html>

To start at the top of this document and "navigate" to the <div> node would be cumbersome.  Instead, now we can simply use document.getElementById("firstparagraph") to reference this div.  Now using other JavaScript methods we can read the contents, remove the contents or update the contents (or indeed use this as a point of navigation).  
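To illustrate the difference between the two approaches before JavaScript is introduced, the sketch below uses Python's xml.dom.minidom on the sample page. The helper find_by_id is hypothetical and written only for this example: it performs the kind of search that getElementById() carries out for us, and is not how any particular browser implements it.

```python
from xml.dom import Node, minidom

PAGE = """<html>
<head>
<title>Sample Page</title>
</head>
<body>
    <h1>This is the title</h1>
    <div id="firstparagraph"><p>This is the contents of the first paragraph</p></div>
</body>
</html>"""

def find_by_id(node, wanted_id):
    """Hypothetical helper: depth-first search for the first element
    whose id attribute equals wanted_id, or None if there is none."""
    for child in node.childNodes:
        if child.nodeType == Node.ELEMENT_NODE:
            if child.getAttribute("id") == wanted_id:
                return child
            found = find_by_id(child, wanted_id)
            if found is not None:
                return found
    return None

document = minidom.parseString(PAGE)

# Lookup by id: unaffected by whitespace or by elements added elsewhere.
div = find_by_id(document, "firstparagraph")

# Manual navigation to the same node: breaks as soon as the page changes.
div_by_position = document.documentElement.childNodes[3].childNodes[3]

paragraph_text = div.firstChild.firstChild.data
```

Both routes reach the same node here, but only the lookup by id would keep working if, say, another heading were added above the <div>.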

We will return to DOM at a later stage after we have covered JavaScript and again later when we introduce AJAX (Asynchronous JavaScript and XML).
