HTML to XML: Extending the Markup Language
Most people in the computer industry know what HTML (Hypertext Markup Language) is. It has been around for a very long time and has been used so extensively in web page design that, although it is now rare to see web pages written solely in HTML, it is still considered basic knowledge for anyone creating web pages.
XML (Extensible Markup Language), on the other hand, is a more recent and much less widely known technology. It was created in 1996 by a working group of 11 people as an adaptation of SGML (Standard Generalized Markup Language) for use on the World Wide Web. XML is more structured and strict than HTML, and it lets users define their own elements and write modular markup. It was designed as a standardized specification for creating custom markup languages, which are now known as XML dialects. It might not be instantly apparent, but markup languages such as XHTML, RSS, and Atom were all built on XML as a way of increasing the usability of the internet. (HTML itself grew out of SGML rather than XML; XHTML is its XML-based reformulation.)
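To make the idea of a custom dialect concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The recipe vocabulary (the element names recipe and ingredient, and the attributes name, amount, and unit) is invented purely for illustration and does not come from any standard; the helper list_ingredients is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# A made-up XML dialect: these element and attribute names are assumptions
# chosen for this example, not part of any published vocabulary.
RECIPE_XML = """<?xml version="1.0"?>
<recipe name="Pancakes">
  <ingredient amount="1" unit="cup">flour</ingredient>
  <ingredient amount="2" unit="piece">egg</ingredient>
</recipe>"""


def list_ingredients(xml_text):
    """Return (name, amount, unit) tuples for each <ingredient> element."""
    root = ET.fromstring(xml_text)
    return [
        (item.text, item.get("amount"), item.get("unit"))
        for item in root.findall("ingredient")
    ]


if __name__ == "__main__":
    for name, amount, unit in list_ingredients(RECIPE_XML):
        print(f"{amount} {unit} of {name}")
```

The same generic parser reads this invented vocabulary without any special configuration, which is the sense in which XML is a tool for building other markup languages.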
Since XML was adapted from SGML, it inherits many of SGML's rules and techniques, such as its strictness and the requirement known as well-formedness. These characteristics extend to the descendants of XML, so certain rules must always be followed when writing markup based on XML. A document can even begin with a declaration that states what type of document it is and which rules its processing should follow. This is very different from the relaxed coding that HTML tolerates.
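A few of the well-formedness rules can be illustrated with a short Python sketch. It assumes the standard library's xml.etree.ElementTree parser as the checker; the helper name is_well_formed and the sample strings are made up for this example, and the samples cover only a handful of the rules.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Illustrative samples (hypothetical content), each exercising one rule.
SAMPLES = {
    "properly nested": "<note><to>Ann</to></note>",
    "unclosed element": "<note><to>Ann</note>",
    "overlapping elements": "<b><i>text</b></i>",
    "unquoted attribute value": "<note id=1></note>",
    "two root elements": "<a></a><b></b>",
}

if __name__ == "__main__":
    for label, text in SAMPLES.items():
        print(f"{label}: {is_well_formed(text)}")
```

Only the first sample passes. Each of the others breaks a single rule: every element must be closed, elements must nest without overlapping, attribute values must be quoted, and a document must have exactly one root element.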
When you process an HTML page, you get some sort of result regardless of the input. The HTML processor tries to make sense of the document and produces the output it thinks best represents the input. This is not true when it comes to XML. XML employs an error-handling mechanism that is often described as ‘draconian’. Whenever the XML processor encounters markup that is not well-formed, it reports an error and stops processing the file. That leaves you with an error message and no result at all, unlike in HTML.
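The contrast can be sketched by feeding the same sloppy markup to a lenient HTML parser and a strict XML parser. This example uses Python's standard html.parser and xml.etree.ElementTree modules as stand-ins; html.parser is only a tokenizer, not a full browser engine, so it approximates the forgiving behavior described above rather than reproducing it. The names BROKEN, TextCollector, parse_as_html, and parse_as_xml are hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Markup with unclosed <p> and <b> elements (made up for this example).
BROKEN = "<p>First paragraph<p>Second <b>bold text</p>"


class TextCollector(HTMLParser):
    """Collects text content and ignores structural problems."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def parse_as_html(markup):
    """Lenient path: always returns whatever text could be recovered."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


def parse_as_xml(markup):
    """Strict path: returns 'ok' or the parser's error message."""
    try:
        ET.fromstring(markup)
        return "ok"
    except ET.ParseError as err:
        return f"error: {err}"


if __name__ == "__main__":
    print("HTML:", parse_as_html(BROKEN))
    print("XML: ", parse_as_xml(BROKEN))
```

The HTML path recovers all of the text despite the missing closing tags, while the XML path stops at the first mismatched tag and returns only an error message.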
To put it in perspective, HTML is a markup language used to quickly and easily display some manner of output. It does not concern itself with the correctness of the input and simply tries to produce output from whatever it is given. XML, on the other hand, is a very strict markup language that is not usually used to create content directly. Its primary use is as a tool for creating other markup languages, which in turn describe the needed content.