HyperText Transfer Protocol (HTTP)

HyperText Transfer Protocol (HTTP) is the protocol that allows Web browsers and servers to communicate. It forms the basis of what a Web server must do to perform its most basic operations. HTTP started out as a very simple protocol and, even though it has had numerous enhancements, it is still relatively simple. As with other standard Internet protocols, control information is passed as plain text over a TCP connection. In fact, as a simple experiment try the following:

Before starting, you should consider firewalling issues. In DCU, for example, you typically use a "proxy server" when downloading web pages: you make the request to the proxy server, which retrieves the pages on your behalf. This means that you will only be able to connect directly to "local" sites INSIDE the firewall. So, if you are attempting this from DCU, you will likely be able to connect to sites like www.dcu.ie, www.eeng.dcu.ie and www.rince.ie, but not to www.apache.org. Likewise, if you are attempting this from work and are firewalled, then you should try a local web server. If you are trying this from home, then all addresses should work.

As a secondary point, some sites are not set up to support HTTP/1.1 persistent connections for various reasons (some good and some bad). At the last edit of these notes, www.dcu.ie was one of these sites. If HTTP/1.1 requests are behaving like HTTP/1.0 requests, then try an alternative site for the purpose of demonstration.
HTTP is a simple, stateless protocol: a client, such as a web browser, makes a request, the web server responds, and the transaction is done. When the client sends a request, the first thing it specifies is an HTTP command, called a method, that tells the server the type of action it wants performed. In HTTP/1.0, a connection must be made to the Web server for each object the browser wishes to download. Many web pages are very graphics-intensive, which means that in addition to downloading the base HTML page, the browser must also retrieve a number of images. Establishing a connection for each one is wasteful, as several packets have to be exchanged between the Web browser and Web server before the image data can start transmitting.

HTTP/1.1

It wasn't long before HTTP was refined into a more complex protocol, standardised by the IETF in cooperation with the World Wide Web Consortium (http://www.w3.org). HTTP/1.1 addressed a number of issues which had arisen since HTTP/1.0. The basic operation of HTTP/1.1 remains the same as for HTTP/1.0, and the protocol ensures that browsers and servers of different versions can all interoperate correctly. If the browser understands version 1.1, it uses HTTP/1.1 on the request line instead of HTTP/1.0.

As previously stated, using separate connections for each item on a web page can be very slow, especially across the Internet where there is a delay involved in each connection and disconnection. To help make pages with inline elements quicker to download, HTTP/1.1 defines persistent connections, where a number of documents can be requested over a single connection, one at a time. An early implementation of persistent connections was known as keep-alive, and Apache as well as a number of other servers and browsers supported this sort of connection. However, persistent connections were first officially documented in HTTP/1.1 and are implemented slightly differently from keep-alives. For a start, in HTTP/1.1, persistent connections are the default.
Unless the browser explicitly tells the server not to use persistent connections, the server should assume that it might be getting multiple requests on a single connection. Persistent connections are controlled by the Connection header. Unless a Connection: close header is given, the connection will remain open. This can be tested by connecting to www.apache.org and sending a simple request, for example:
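A request of this kind can also be sketched in Python. This is only a minimal illustration, not the session these notes refer to: the function names `build_request` and `send_request` are invented for this sketch, and port 80 and the five-second timeout are assumptions. `build_request` simply assembles the plain-text request; `send_request` opens a TCP connection and reads until the server closes the connection or the timeout expires.

```python
import socket


def build_request(host, path="/", version="1.1", close=False):
    """Return the raw bytes of a simple GET request (illustrative helper)."""
    lines = [f"GET {path} HTTP/{version}"]
    # The Host header is mandatory in HTTP/1.1 and optional in HTTP/1.0.
    if version == "1.1":
        lines.append(f"Host: {host}")
    # Persistent connections are the default in HTTP/1.1;
    # "Connection: close" asks the server to close after responding.
    if close:
        lines.append("Connection: close")
    # A blank line terminates the request headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def send_request(host, request, port=80, timeout=5.0):
    """Send raw request bytes and return whatever the server sends back
    before it closes the connection or the timeout expires."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break  # server closed the connection
                chunks.append(data)
        except socket.timeout:
            pass  # server kept the connection open (persistent connection)
    return b"".join(chunks)
```

Calling `send_request("www.apache.org", build_request("www.apache.org"))` from an unfirewalled machine should return the response headers and body; passing `close=True` to `build_request` adds the Connection: close header so that the two behaviours can be compared.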
Two effects are immediately noticeable. Firstly, the Connection: close header is no longer in the output, signifying that the connection will not automatically close after the item is received. Secondly, there will be a pause of a few seconds before the connection closes automatically. This time-out is configured at the server. Using HTTP/1.1, the request can still be sent with a Connection: close header, and this will cause the connection to close immediately after the response has been sent.

In the HTTP/1.1 example we were required to provide the "Host: " header line. The "Host" header distinguishes between various DNS names sharing a single IP address, allowing name-based virtual hosting. Name-based virtual hosting is where a number of "sites" can be run off the same web/application server on the same machine using the same IP address. For example, we could run www.wesellcds.com and www.wesellbooks.com on the same web server. Hence this header, while optional in HTTP/1.0, is mandatory in HTTP/1.1.

As well as a number of other useful changes and improved documentation, HTTP/1.1 includes a range of new features to make implementing proxies and caches easier, and in particular to reduce network traffic by allowing proxies and caches to send more 'conditional' requests and to do transparent content negotiation. A conditional request is like a normal request, except that the sender (the proxy or cache server) includes some information about whether it really needs the document. For example, a proxy or cache can send an entity-tag which identifies the copy of a document it already has, and the server only sends back the document if it differs from that copy. Conditional requests can also be based on the last-modified time of the document.
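The server-side decision behind an entity-tag conditional request can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `respond` and the dictionary of headers are invented here, and a real server must also handle lists of entity-tags, the `*` value and weak validators, which this sketch ignores.

```python
def respond(request_headers, current_etag, body):
    """Return (status, headers, body) for a GET of one document.

    request_headers: dict of header name -> value sent by the cache.
    current_etag:    the server's entity-tag for the document as it is now.
    """
    cached_tag = request_headers.get("If-None-Match")
    if cached_tag is not None and cached_tag == current_etag:
        # The cache's copy is still current: send 304 Not Modified, no body.
        return 304, {"ETag": current_etag}, b""
    # No cached copy, or the copy is out of date: send the full document.
    return 200, {"ETag": current_etag}, body
```

A cache holding the version tagged `"v1"` would send `If-None-Match: "v1"` and receive a bodiless 304 for as long as the document is unchanged, which is where the saving in network traffic comes from.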
Method Types

When a client connects to a server in order to make an HTTP request, the request can be one of a number of different types, known as methods.
Hypertext Markup Language (HTML)

This course makes no pretence at being a web design course and only very basic HTML will be covered. To learn more about web design, there are any number of resources on the web and literally thousands of books on the subject. However, in order to use many of the development tools in this course, an understanding of basic HTML is needed as a foundation. Indeed, the most common responses from server programmes take the form of HTML sent back to the client's browser.

What are HTML Documents?

HTML stands for HyperText Markup Language. HTML documents are plain-text (also known as ASCII) files that can be created using any plain-text editor: for example, emacs or vi on Unix, SimpleText on Mac, or Notepad on Windows. It is also possible to use word-processing software, such as WordPad on Windows, provided the files are saved as plain text. Perhaps the most common method used to write HTML files is to use a WYSIWYG (What You See Is What You Get) editor, such as DreamWeaver or FrontPage. These programs allow you to design your HTML documents visually, as if you were using a word processor, instead of writing the markup tags in a plain-text file and imagining what the resulting page will look like. It is very useful, however, to know enough HTML to create a document before you begin to use a WYSIWYG editor, in case you later want to add features which your editor does not support.

When writing HTML manually (which is what we will be doing!), you add "tags" to the text in order to create the structure. These tags tell the browser how to display the text or graphics in the document. These .html files are generally placed on the web server using a remote transfer program such as FTP (File Transfer Protocol). Each web page on a server has what is known as a URL (Uniform Resource Locator), which looks something like:
The URL or address of a web page depends both on the server configuration and on where the file has been placed on the server. When the client types in the correct address for a web page, the browser downloads that file from the server (using the HTTP GET method), interprets the HTML code and displays the page graphically.

Example HTML Document

The following is an example of a very basic HTML document which simply displays a few headings and some text. Following it is a screenshot showing what this page would look like when viewed in a browser.
Figure 2.1. Basic HTML Document shown in Browser

It can be seen in the above figure that the URL for the page is a local filename. It is possible to simply save HTML files on your local hard drive and load them in a browser, just as you would a page from the Internet. Obviously, in this case, unless your machine is set up as a server (i.e. is on the Internet and runs some web server software), these files cannot be seen from other computers on the Internet. Generally, web designers will develop their HTML on a local machine until it is ready for deployment, at which point they will FTP the files to the server.

The required elements in the above basic example are the <html>, <head>, <title> and <body> tags (and their corresponding end tags!). Because you should include these tags in each file, designers generally create a template which already includes these items and then modify the file - alternatively, most WYSIWYG editors will generate these tags automatically when you choose to create a New Page. Many browsers will correctly display poorly written HTML that omits these tags, but some browsers won't. So beware!

Before introducing a slightly more complicated example, we will introduce a few of the tags and their usage. Many tags have additional configurable attributes, which are used inside the tags to further define settings. The list below shows a subset of the more common tags used and a few available additional attributes for some tags:

Primary HTML Tags

The following list shows some of the most commonly used HTML tags and a very basic example of how they are used. Students should familiarise themselves with the use of these common tags and be able to create webpages. Again, repeating the fact that this is not a web design course, it should be noted that HTML is needed as a precursor to JavaScript and forms, which combined form the most common client-side interface to server-side applications.

Some Common HTML Tags

Note: HTML is case-insensitive.
Hence, we can use lower case tags and attributes, or alternatively upper-case tags and attributes.
HTML vs XHTML

XHTML 1.0 is simply a reformulation of HTML 4.0 into an XML (eXtensible Markup Language) form. While it seems strange to immediately introduce something new (when we've just barely introduced HTML), we will see great similarities between the two markup languages. There are some differences, however, principally:
There are a number of other differences but, for our purposes, these are the main ones which affect us. Rather than get confused between both markup languages, we will simply provide further examples in XHTML. XHTML provides a number of advantages for us (particularly in relation to DOM, jQuery, JavaScript and other topics we will introduce later).

Example XHTML Document

In previous years, this was actually a second HTML example. Now, we have taken the example, changed the tags so that they are lowercase, tidied up attributes and ensured that tags are well-formed. Otherwise, we are simply using the tags introduced for HTML above. So, combining these new tags together to form a relatively more "complicated" XHTML document we get:
Figure 2.2. Example XHTML Document shown in Browser

Note: Technically, this XHTML document will not validate against a validator, such as the W3C XHTML Validator. In order to get it to validate, we need to add in a few additional XML-based tags at the beginning of our document.
The first line declares the file to be an HTML document type file. The second line specifies the rules file against which validation is performed. To test your own validation, try copying and pasting the source code from the example xhtml_example.html into the W3C XHTML Validator. It should validate successfully. When you are writing HTML/XHTML in the majority of situations (for pages, JSPs etc. later on) you can ignore these three lines. However, the practices of using lowercase, making all HTML well-formed, quoting attributes etc. are useful practices for us to follow and will have benefits later on.

Practicing your HTML Skills

This module is not a web design course and, as a result, the level of HTML expected is not high. For most web application developers working in a company, you will likely be referring the web design aspect to a graduate of a graphical design programme. Graphical design for the web medium is surprisingly similar to that of the print medium, requiring skills in using graphical design programmes such as Photoshop and Illustrator. However, all web application developers are required to know the basics of HTML pages. If a developer is building a web application, it is likely that they will develop a fully working interface for their application, and it would only be passed to the graphical designer for improving the look & feel and for professional branding and marketing. For our purposes, it is important to be familiar with the basics of developing web pages, in particular all of the tags introduced above.

The best way of practicing HTML is to write your own and to spend a couple of hours practicing it. It really isn't very difficult and is simply a case of familiarising yourself with the available tags. To aid this process, the following tool is recommended:

This tool allows you to type HTML in the window on the left hand side and simply click a button to render it to a webpage on the right hand side.
It is very useful for practicing basic tags, although it does give a strong feeling of placing files on a server (on the internet) for all to see. To start you off, try copy/pasting the Example documents above into the editor and viewing the output.

Did you actually practice?

It's hard to write this section without coming across as a school teacher waving a big stick! This is the first place in the notes where it has been recommended that you practice some "source code" to get familiar with the topic. There are many people who prefer to learn the tags off by heart and move on. If you start to do this now, you will most likely FAIL this module. The course has been designed to be practical in nature, which means that, when I assess your progress through examination and assignment, I will be looking to assess your ability to write web application software. If you attempt to learn off the module, you will learn nothing from this course. If you work through the examples, developing your own skills, you will learn a whole lot more, do better in the examinations and be in a position to take on a graduate role in web application development. More and more, I am moving the examination towards entirely practical questions, intended to reward those who study in this manner. If you are having difficulty understanding the practical aspects of this module and instead find yourself "learning off", then please come and talk to me.

HTML Accessibility

HTML accessibility refers to the inclusive approach of designing web pages so as to make websites usable by people of all abilities and disabilities. For example, by the simple inclusion of "alt" attributes in "<img>" elements we are able to improve usability for visually impaired users, who perhaps rely on text-to-speech converters when visiting web sites. Likewise, if the fonts chosen for a website are large and clear, this provides assistance to users with poor sight.
The Web Content Accessibility Guidelines (WCAG) 1.0 were published by the World Wide Web Consortium (W3C) in 1999. These guidelines were widely accepted as the definitive approach to designing websites to be accessible. In December 2008, the Web Content Accessibility Guidelines (WCAG) 2.0 recommendation was published, which aims to be more up to date and technology neutral, and to cater for a number of newer aspects appearing in browsers.

While there is a moral and ethical responsibility on developers to ensure that websites cater for users with disabilities, there are also legal requirements in most countries. In Ireland, the Disability Act 2005 makes reference to this. Section 28.2: "Where a public body communicates in electronic form with one or more persons, the head of the body shall ensure, that as far as practicable, the contents of the communication are accessible to persons with a visual impairment to whom adaptive technology is available."

There are dozens of accessibility tools available for checking certain aspects of web sites - try pasting the XHTML code above into http://wave.webaim.org/ and view the output. You will see how, despite there being "no accessibility errors" detected, there are a number of recommendations. Often with regard to accessibility, there is a good deal of common sense involved. For example, you could "technically" pass an accessibility check by giving a submit image button (note: whether you should use this is another question) an alt attribute = 'Button'. Of course, this doesn't distinguish it from the 'Reset' image button beside it with alt attribute = 'Button'. Despite both images having an 'alt' attribute, there is now a 50% chance that a blind user might reset their form rather than submit it (or vice versa).

HTML5

Work began on HTML5 in 2004, but it was only finalised and published in October 2014 by the World Wide Web Consortium (W3C).
It was the fifth revision of the HTML standard; the previous version, HTML4, was standardised in 1997.

Features of HTML5
Further HTML Information

Sites of particular interest for both learning and reference are:
Cascading Style Sheets (CSS)

Cascading Style Sheets were first introduced in late 1996 and represented an exciting new opportunity to create more sophisticated page design, both in layout and content. They also greatly simplified the process of making web pages accessible to as many readers as possible, regardless of the device they use to read a page, and regardless of any disability they might have. CSS addresses the distinction between what a document should look like (referred to as its appearance) and the underlying structure of the document.

When HTML first burst upon the scene, problematic ways of coding page appearance took off; among these were the <font> and <b> elements and other presentational HTML elements. Some of these elements have been shown in the previous section and, indeed, many web developers still implement their web site appearance in this way. In addition, a number of pseudo web tools, such as using Microsoft Word to save a document as an HTML file, will result in pages with large amounts of styling bundled into the HTML files.

Cascading Style Sheets (CSS) is a recommendation of the World Wide Web Consortium (W3C) and provides the means for web authors to separate the appearance of web pages from the content of web pages. This powerful tool allows developers to simplify the task of maintaining web sites while providing sophisticated layout and design features for web pages, without the need for plugins and long downloads.

Specifically, the W3C made two recommendations, namely Cascading Style Sheets 1 (CSS1) and Cascading Style Sheets 2 (CSS2), which incorporates and extends CSS1. At the time of updating these notes, CSS Level 3 is under development. Unlike CSS1 and CSS2, which were released as full specifications, CSS3 is being released as a number of separate modules. As of January 2016, the current status looked something like this (from green to red -> Recommendation, Candidate Recommendation, Last Call, Working Draft).
[Figure: status of the CSS3 modules. By Krauss (Own work), CC BY-SA 4.0 (http://creativecommons.org/licenses/by-sa/4.0), via Wikimedia Commons]

During the initial years of CSS, there was little uptake in usage, for a number of reasons but principally due to browser compatibility. CSS only really worked in Netscape 4.0 and Internet Explorer 3 upwards. While older browsers are little used today, this does explain why the early uptake of CSS was slow. Even today, despite the best attempts of the W3C, there are a number of differences in support for CSS among the various browsers. So essentially, CSS involves more time spent learning a new web technology, cross-browser issues and possible headaches! So why bother?
CSS Example

The following section provides a basic example of some of the features of CSS. Course participants should try their own examples and attempt to grasp the concepts behind CSS for themselves. In the examples, a style sheet is provided as well as an HTML file which "uses" the style sheet, and the resulting browser output is shown. By all means, if you have any older browsers, check how well these style elements are supported (as an example, link rollovers don't highlight in Netscape 4.7).

Example CSS File: example.css
It should be noted that we are not restricted to "primary" colours when specifying them in CSS files or style attributes. One can define essentially any colour using RGB (Red Green Blue) values. Example:

    b { color: #ff0000; }

This would result in a red colour, as the hexadecimal value represents 255 red, 0 green, 0 blue. There are 256 (0-255) x 256 x 256 possible combinations of colour using this approach, which is over 16.7 million combinations (2^24). For more information and a useful colour selection chart, visit http://en.wikipedia.org/wiki/Web_colors. Personally, I use the following site when choosing colours for websites, diagrams etc.: http://www.colorpicker.com/

Example HTML file using example.css
Now you should take a look at the resulting output from this web page and then return here for an explanation!

CSS Explanation

The example above consists of two files: the CSS file, which handles style and layout, and the HTML file, which handles content. Concentrating on the CSS file first: CSS files consist of rules, which are made up of two main parts: a selector ('H1') and a declaration ('font-family .....'). The declaration itself is split up into a number of name-value pairs (properties) referring to elements of style for that selector (e.g. font-size: 14pt). The selector is the link between the HTML document and the style sheet, and all HTML element types are possible selectors. The list of available properties which can determine the presentation of an HTML document can be found in the W3C CSS1 Specification.

Inheritance plays an important part in the layout of CSS files. Consider the following from the HTML file:
Due to its position, "Ugly Web" will inherit all of the properties of the parent element h1, i.e. Arial, 14pt, blue with yellow background, normal font. However, as it is also enclosed by the <b> tags, which have their own set of properties, these will override the parent element's properties. So considering the two applicable lines from the example CSS file we have:
It can be seen that "Ugly Web" will take font-style: normal and font-weight: bold from the 'b' selector, while it will also inherit font-size: 14pt, color: blue and background: yellow from the h1 selector.

To increase the granularity of control over elements, a new attribute has been added to HTML, namely 'class'. All elements inside the 'body' element of an HTML document can be classed, and the class can be addressed in the style sheet. Normal inheritance rules apply to classed elements; they inherit values from their parent in the document structure. One example of this can be seen in the b.somename selector, which automatically inherits all properties from the b selector. Effectively, what 'class' allows us to do is to use the same standard HTML tag in countless ways: for example, on different sections of the same site we might want different styles of bold text, so we can define b.newsitem, b.highlightedlarge and so on.

Now moving on to the HTML file: in order to specify which style sheet (CSS file) is to be used on any page, a line such as the following is added to the HEAD section of each webpage:
Obviously the href must point at an existing style sheet, such as that in the example - one which will recognise the subsequent tags in the HTML file. So for example, in a large website, each individual page would include the above line to "point" at its applicable stylesheet. At the client, when the browser views a webpage and discovers a stylesheet link, it will check all subsequent tags for a relevant entry in the stylesheet and display the content appropriately. Where 'class' is used it is simply defined by an extra attribute within the standard HTML tag, such as:
The above code gives plenty of examples of how we can affect the style and presentation of our HTML pages by combining them with CSS files. Again, this is not a web design module and we won't be looking at CSS styling in more depth. However, we should also provide some discussion regarding CSS for layout.

CSS for Layout

In addition to the standard styling properties affecting font, font size, colour, background colour, borders, opacity, etc., CSS also provides a number of options relating to positioning and layout. The CSS positioning properties allow the developer to position elements. However, before we talk about these positioning properties, let us first introduce <div> and <span>. These two elements are extremely useful to us when working with CSS.
Source Code: div_span_example.html

Note: For ease of demonstration in this example, I have provided an internal style sheet. This is achieved by wrapping the CSS in <style> tags inside the <head> section of a webpage. In general, a separate file is typically used, as multiple pages would have common styling.

This brings us nicely to the two main different types of HTML elements:
<p>This is a paragraph with some <b>emphasised text</b> and some <i>italicised text</i></p>
Creating Page Structures

Traditionally, when creating more complicated layouts, tables were often used. In general, this is bad practice, as tables were originally intended for containing tabular data. By combining <div>s with CSS positioning properties, we avoid the need to use the <table> element for layout.

One of the things you may have noticed about the <div> element in the example above is that it defaults to 100% of the available width of the page. However, unlike inline elements, we are fully entitled to modify the width and height of block elements. So we will use <div> for this purpose. Widths and heights can be specified in units or as a percentage of the overall space. Let's take an example where we want to place three columns side by side to lay our page out like a newspaper:

    <html>
    <head>
      <title>Newspaper Layout Example</title>
      <style type="text/css">
        body { width: 900px; }
        .columnleft { float:left; width:120px; background-color:lightblue }
        .columnmiddle { float:left; width:600px; background-color:lightgreen }
        .columnright { float:left; width:180px; background-color:brown }
      </style>
    </head>
    <body>
      <h2>Newspaper Column Examples</h2>
      <p>Below we show three columns with different widths:</p>
      <div class="columnleft">
        <h4>Navigation</h4>
        <p>This is the text content for the leftmost column</p>
      </div>
      <div class="columnmiddle">
        <h4>Main Content</h4>
        <p>The quick brown fox jumped over the lazy dogs.</p>
      </div>
      <div class="columnright">
        <h4>News Column</h4>
        <p>This is the text content for the rightmost column</p>
      </div>
    </body>
    </html>

Source: columns_example.html

Figure 2.5: Output from columns_example.html

There are a few points of note to be made regarding this example:
    <html>
    <head>
      <title>Newspaper Layout Example</title>
      <style type="text/css">
        body { }
        .header { float:left; width:100%; background-color:blue }
        .columnleft { float:left; width:15%; background-color:lightblue }
        .columnmiddle { float:left; width: 60%; background-color:lightgreen }
        .columnright { float:left; width:25%; background-color:brown }
        .paddedcontent { padding:10px; }
      </style>
    </head>
    <body>
      <div class="header">
        <h2>Newspaper Column Examples</h2>
        <p>Below we show three columns with different widths:</p>
      </div>
      <div class="columnleft">
        <div class="paddedcontent">
          <h4>Navigation</h4>
          <p>This is the text content for the leftmost column</p>
        </div>
      </div>
      <div class="columnmiddle">
        <div class="paddedcontent">
          <h4>Main Content</h4>
          <p>The quick brown fox jumped over the lazy dogs. The quick brown fox jumped over the lazy dogs. The quick brown fox jumped over the lazy dogs.The quick brown fox jumped over the lazy dogs.</p>
        </div>
      </div>
      <div class="columnright">
        <h4>News Column</h4>
        <p>This is the text content for the rightmost column</p>
      </div>
    </body>
    </html>

Source: columns_2_example.html

Figure 2.6: Output from columns_2_example.html

This example will expand to the full size of the browser window, regardless of whether the window is small or large. This occurs because we have removed the width specification from the body rule and changed all of the other divs to be a percentage of the total width, rather than a specific pixel dimension. The header is quite simply a 100%-wide div, so the next div box ('columnleft') is forced to be below this header when it attempts to "float:left". The only other aspect we have introduced is a new div class 'paddedcontent', which we use to provide some padding in the navigation and main content columns. You can see how the 'News Column' has not had the padding applied and the text runs right to the edge of the column.
Without any padding, the text from the columns would pretty much run right up to each other, resulting in confusing reading for users of the site, particularly if columns were not highlighted in bright colours.

The Box Model

The final concept we will introduce relating to CSS is the "Box Model", which is extremely important in HTML layout. All HTML block-level elements have five spacing properties:
Figure 2.7: The Box Model

Margins set the outward spacing, while padding sets the inward spacing. The border properties allow the creation of drawn borders around elements. Taking margin as an example, we can either define style properties for the entire margin (top, bottom, left and right) or we can individually apply styles using margin-bottom, margin-top etc.
To demonstrate the box model, let us take an example of one div placed inside another:

    <html>
    <head>
      <title>Newspaper Layout Example</title>
      <style type="text/css">
        body { background-color:green; margin:0px; } /* body actually defaults with some margin */
        .outerbox { background-color:red; width:300px; margin:0px; padding: 20px; }
        .innerbox { background-color:blue; width:200px; margin-top:50px; padding:10px; border:8px dashed black; }
      </style>
    </head>
    <body>
      <div class="outerbox">
        This is the outer box
        <div class="innerbox">
          This is the inner box
        </div>
      </div>
    </body>
    </html>

Source: box_model_example.html

Figure 2.8: Box Model Example with Measurements

One important thing to note from this example is that, despite the style properties indicating that the inner box has a width of 200px, it is contained in a box model which is actually larger than this. To calculate the dimensions of either box:

Outer Box
    Width: 300px + 20px (padding left) + 20px (padding right) = 340px
    Height: height of content + 20px (padding top) + 20px (padding bottom) = 40px + height of content (we can specify a value for height if we want)

Inner Box
    Width: 200px + 10px (padding left) + 10px (padding right) + 8px (border left) + 8px (border right) = 236px
    Height: height of content + 10px (padding top) + 10px (padding bottom) + 8px (border top) + 8px (border bottom) + 50px (margin top) = 86px + height of content (we can specify a value for height if we want)

When laying out block-level elements in websites, it is important to understand the box model and, in particular, the implications of margin, padding and border for the overall dimensions of these boxes. For example, consider again columns_example.html. If we were to add any margin, padding or border properties to any of the columns, the combined width would overflow and the third column would be forced below the others. This would occur even though the width properties still sum to 900px, the defined width of the body.
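The arithmetic above can be checked with a short Python sketch. The `Box` class and its method names are invented for this illustration, and it assumes the default sizing used in these calculations, where the CSS width covers the content only and the same padding and border apply on every side.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A block-level element's spacing, all values in pixels (illustrative)."""
    width: int            # the CSS 'width' property (content width only)
    padding: int = 0      # same padding on all four sides
    border: int = 0       # same border width on all four sides
    margin_top: int = 0   # only the top margin is used in the example

    def outer_width(self):
        # content + left/right padding + left/right border
        return self.width + 2 * self.padding + 2 * self.border

    def extra_height(self):
        # pixels added to the height of the content:
        # top/bottom padding + top/bottom border + top margin
        return 2 * self.padding + 2 * self.border + self.margin_top


outer = Box(width=300, padding=20)
inner = Box(width=200, padding=10, border=8, margin_top=50)

print(outer.outer_width(), outer.extra_height())  # 340 40
print(inner.outer_width(), inner.extra_height())  # 236 86
```

The same calculation shows why adding padding to a column in columns_example.html breaks the layout: three columns whose widths already sum to 900px no longer fit inside a 900px body once any of them grows.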
Further CSS Information

There are a range of further layout features within CSS, such as the position property (static, relative, absolute and fixed). These will not be covered in this module, as we do not wish to focus overly on CSS. For further information on the above topics, positioning and other aspects of CSS, please try the following links:
HTML Forms

The motivation for dealing with forms in this course lies in their use as a client front-end to our server-side applications. Hence this section only provides a brief introduction to forms, to enable students to write their own forms, which they will use at a later stage, particularly in assignments.

Forms exist on the client side, within the browser. A form is simply a web page with some additional markup tags that instruct a web browser how to display the various form elements, such as checkboxes, selection lists, buttons and user-editable text areas. However, the web page itself does not process the data, nor does the web server, which doesn't know what you'd like to do with the user's answers. A separate program or script (typically server-side) must process that data, in whatever way you wish.

An HTML form is a section of a document containing normal content, markup, special elements called controls, and labels on those controls. Examples of controls include checkboxes, menus, radio buttons and ordinary buttons. Forms are commonly seen on any interactive website, in particular where the user is expected to submit information or make choices. Users typically complete a form by modifying its controls (typing in a text box, selecting items etc.) before submitting the form to a server for processing.

Controls are defined similarly to standard HTML elements, using attributes. A control's "control name" is given by its name attribute. Each control has both an initial value and a current value, both of which are character strings. The details can change between form elements, but generally a control's initial value may be specified with the control element's value attribute. A control's initial value does not change, and hence when a form is reset, this initial value is used to reset the control's current value.
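To make the idea of control names and current values concrete, the sketch below shows how a set of name/value pairs might be encoded when a form is submitted. It is an illustration rather than what a browser literally runs: the control names (username, newsletter), their values, the helper name encodeControls and the URL http://www.example.com/register are all invented for this example. URLSearchParams is a standard JavaScript class available in modern browsers and in Node.js.

```javascript
// Hypothetical controls: each has a name and a current value (both strings).
const controls = [
  { name: "username", value: "Jane Doe" },
  { name: "newsletter", value: "yes" },
];

// Encode the name/value pairs in the URL-encoded format used for form data.
function encodeControls(controlList) {
  const params = new URLSearchParams();
  for (const control of controlList) {
    params.append(control.name, control.value);
  }
  return params.toString();
}

const queryString = encodeControls(controls);
console.log(queryString); // username=Jane+Doe&newsletter=yes

// With GET, the encoded pairs are appended to the (hypothetical) destination URL.
// With POST, the same string would travel in the request body instead.
console.log("http://www.example.com/register?" + queryString);
```

Note that the space in the value is encoded as a plus sign, which is why the server-side program must decode the data before using it.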
Rather than replicating the very detailed specification for forms, students are recommended to examine the following W3C Guide to Forms, which is part of its HTML 4.0 recommendation. Students should be familiar with the available form types and the principal ways in which they are used. It is more important that students understand forms and how to apply them, rather than trying to learn off the W3C recommendations by heart! (Don't do this :) ) During the assignments/exam stage, students may be required to create their own forms, which will interact with server-side applications. However, a brief example of a form is provided below to help with the understanding of forms.

Forms Example

The following code provides an example which covers the more common form elements used. From the previous section regarding HTML you should at this stage be able to distinguish the newer form HTML tags from those already covered.

The first line encountered relating to the form is the "FORM METHOD" line. It is this line which, through its action attribute, provides the destination address (query URL) for the form to the browser. The destination typically takes the form of a server-side application, such as a servlet or CGI/PHP script, which has been coded to know exactly what to do with the form data upon receipt. It may, for example, email the data to an administrator, save the data in a file/database or perform some calculations/queries and return some dynamic data. The method attribute of the same element is the HTTP method used to submit the filled-out form to the query server, such as GET or POST, which were described previously in this chapter.

The exact structure and available attributes for the controls can be viewed in the above W3C guide. Perhaps one of the easier ways of learning to write your own forms is to take the example below and experiment with it, adding and removing fields and studying the effects.
Now, you should view the resulting HTML page. Note: if you copy/paste the above example directly from the notes and save it using Notepad as test.html on your local machine, you can then open it in a browser (select File/Open/Browse). It is then easy to modify the test.html file, save it and click 'Refresh/Reload' in your browser, to experiment with the effects of your modifications. As mentioned before, there are a range of new form controls available in HTML5, but for now we will be content with the options available in HTML 4.01.

Further Form Information

Sites of particular interest for both learning and reference are:
Document Object Model (DOM)

The Document Object Model, DOM for short, allows us to programmatically access and manipulate the contents of a web page (or XML document). It provides us with an object-oriented representation of the elements and content of a page, combined with methods for retrieving and setting properties for any of these elements. The Document Object Model is platform and language neutral and allows programs to access and update the content, style and structure of these pages. It does this through the provision of an Application Programming Interface (API). After we introduce JavaScript in the next section we will implement coding examples where we analyse and manipulate the DOM structure of web pages, but for now we will deal mainly with the referencing of elements and content.

History & Standardisation

DOM is standardised by the World Wide Web Consortium (W3C), with the initial DOM standard, known as 'DOM Level 1', recommended in 1998. DOM Level 2 followed in 2000 and added a range of functionality, including the important "getElementById" function. DOM Level 3, the current release of DOM, was introduced in 2004 and in turn added a range of new features. Since 2005, DOM could be considered well supported in the majority of modern browsers, including IE, Firefox, Chrome, Safari and Opera.

DOM Document Tree

When a browser loads a page, it forms a hierarchical representation of the contents of that page, resulting in a tree-like organisation of nodes.

Figure 2.9: Document Object Model Tree Structure

Each of the nodes represents an element, an attribute, some content or some other object. Figure 2.10 shows the same diagram represented simply as nodes, each of which we will label for the purpose of demonstration.

Figure 2.10: DOM Structure showing Nodes

As can be seen, there are a number of different types of nodes shown. Let us talk about these in the table below:
We indicated that DOM provides us with a way of modelling and modifying the hierarchical structure of a document. To help with this, the Node object provides us with a number of properties used for navigating this tree structure of further nodes:
In our description above, we also indicated that DOM can be used not only to reference parts of documents, but also to update and modify the content of these documents. The Node interface also provides a number of methods for adding, removing and updating nodes, such as removeChild(), appendChild(), replaceChild(), cloneNode() and insertBefore(). We will not discuss these in any more detail however.

The Whitespace Issue

As has been mentioned, an element may have children of type "text". Consider the following example:

<div>
<p>A paragraph <b>with bold </b> and normal text</p>
</div>

Let us consider <div> to be NodeA. What is the expected outcome of NodeA.childNodes.length? A reasonable first answer would be 'One', the paragraph element <p>. The correct answer, however, is 'Three'. The reason for this is that under the DOM approach, "whitespace" (i.e. spaces, carriage returns, tabs etc.) is treated as a text node. So looking again at the same snippet:

<div> NodeA1 <p>A paragraph <b>with bold </b> and normal text</p> NodeA3 </div>

Here NodeA1 and NodeA3 mark the whitespace text nodes on either side of the paragraph element.

The presence of whitespace in DOM can cause a number of unforeseen issues when programming with the Document Object Model. There are a number of methods in JavaScript which reduce the effect of this problem. In addition, when generating HTML code dynamically, it is possible to remove the effect by simply avoiding whitespace. For example:

<div><p>A paragraph <b>with bold </b> and normal text</p></div>

NodeA.childNodes.length is now 1, the paragraph element (which in turn has three child nodes: text, the <b> element and text).

HTML Sample Tree Structure Question

Let us consider <html> as being called Node1 in this document:

<html>
<head><title>My Page</title></head>
<body>
<h1>My First Heading</h1>
<p>My first paragraph which contains some <b>bold</b> and some <i>italics</i></p>
<ul>
<li>Item 1</li>
<li>Item 2</li>
<li>Item 3</li>
</ul>
</body>
</html>

Under DOM, answer the following questions:
- What is Node1.firstChild? The whitespace before the <head> element.
- How would we reference the <body> element? Node1.childNodes[3] *
- How would we reference the text ' and some '? Node1.childNodes[3].childNodes[3].childNodes[2] *

* Note: There is not always one answer to these questions. As an example, the following would also reference the <body> element (although it is longer): Node1.firstChild.nextSibling.nextSibling.nextSibling

HTML, XHTML and DOM

At this point in the course, we are mostly concerned with web page documents. Reading the above material, you may also have realised the benefits of properly written XHTML over badly written HTML. If elements aren't properly closed, are poorly nested, have attributes without quotes, change from upper case to lower case etc., then the DOM model effectively breaks down (although the problem is with your HTML rather than the DOM model). Let's take two snippets of HTML code.

Badly written HTML firstly:

<P>This is a paragraph with some <b>bolded and <i>italicised text</b></i>
<p>This is the next paragraph.

Now let's write it properly:

<p>This is a paragraph with some <b>bolded and <i>italicised text</i></b></p>
<p>This is the next paragraph.</p>

So now, let's try copying the code from each of these examples to the HTML previewer we've seen before: W3Schools TryIt HTML Editor

What do we learn from this? At first glance, it seemingly makes no difference. Both the badly written code and the properly written code result in the exact same output on this previewer (and on any popular browser if you try it). In fact, many pages on the internet are badly written and not standards-compliant. It is for this reason that web browsers are so forgiving in their handling of badly written code. So why bother?
The problems with badly written code will manifest themselves when the developer attempts to write JavaScript and DOM code to dynamically reference and modify existing document structures. For example, consider the badly written code above and let's call <P> Node1. What is the effect of calling Node1.childNodes[1] and Node1.childNodes[1].firstChild? As these elements are not well formed, it will result in errors, or unreliable referencing at best.

In the situation where we simply want to reference a <div>, <span> or other element, it would not make sense to try and traverse a whole HTML document. We would end up with references like:

document.firstChild.childNodes[3].firstChild.childNodes[4].nextSibling.firstChild etc.

If a change was ever made to the document, it would have the potential to break any references such as these used in our JavaScript code. To the rescue comes getElementById(). This accesses the first element with the specified id. In JavaScript (which we haven't introduced yet) it is referred to by:

document.getElementById("id")

The "id" argument is required and is the id of the element that we wish to access or manipulate. So consider the following sample HTML code:

<html>
<head>
<title>Sample Page</title>
</head>
<body>
<h1>This is the title</h1>
<div id="firstparagraph"><p>This is the contents of the first paragraph</p></div>
</body>
</html>

To start at the top of this document and "navigate" to the <div> node would be cumbersome. Instead, we can simply use document.getElementById("firstparagraph") to reference this div. Then, using other JavaScript methods, we can read the contents, remove the contents or update the contents (or indeed use this as a point of navigation).

We will return to DOM at a later stage after we have covered JavaScript and again later when we introduce AJAX (Asynchronous JavaScript and XML).
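Both ideas from this section, the whitespace issue and the fragility of position-based references, can be sketched in JavaScript. The helper names element, text, elementChildren and findById are hypothetical and written only for this example. In particular, findById merely illustrates the kind of tree walk that document.getElementById() saves us from writing; it is not how browsers implement that method. So that the sketch can run outside a browser, the tree is built from plain objects that mimic the nodeType, childNodes and id properties of real DOM nodes (a nodeType of 1 is an element, 3 is a text node).

```javascript
const ELEMENT_NODE = 1; // same numeric values as the DOM's node type constants
const TEXT_NODE = 3;

// Hypothetical helpers that build plain objects shaped like DOM nodes.
function element(tagName, id, children) {
  return { nodeType: ELEMENT_NODE, tagName: tagName, id: id, childNodes: children };
}
function text(value) {
  return { nodeType: TEXT_NODE, nodeValue: value, childNodes: [] };
}

// Return only the element children, skipping text nodes such as whitespace.
function elementChildren(node) {
  return node.childNodes.filter((child) => child.nodeType === ELEMENT_NODE);
}

// Depth-first search for the first element with the given id (null if none).
function findById(node, id) {
  if (node.nodeType === ELEMENT_NODE && node.id === id) {
    return node;
  }
  for (const child of node.childNodes) {
    const match = findById(child, id);
    if (match !== null) {
      return match;
    }
  }
  return null;
}

// Mock of the <body> from the sample page, including its whitespace text nodes.
const body = element("BODY", "", [
  text("\n "),
  element("H1", "", [text("This is the title")]),
  text("\n "),
  element("DIV", "firstparagraph", [
    element("P", "", [text("This is the contents of the first paragraph")]),
  ]),
  text("\n "),
]);

console.log(body.childNodes.length);                   // 5 (whitespace counted)
console.log(elementChildren(body).length);             // 2 (elements only)
console.log(findById(body, "firstparagraph").tagName); // DIV
```

Looking an element up by its id keeps working if headings or whitespace are later added around the div, whereas an index such as childNodes[3] would silently point at a different node.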