XML has not seen a major version update in a long time. The W3C adopted XML 1.0 in 1998, and that specification is currently in its fifth edition. XML’s strength lies in a solid conceptual design that emphasizes simplicity, usability and extensibility, which has allowed it to be adopted and supported widely across the internet. Applications built on it include RSS, Atom, SOAP and XHTML.
XML stands for Extensible Markup Language and is simply a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
An XML document can be accompanied by a Document Type Definition (DTD) or, as in the example below, an XML Schema that sets the rules for how the document is structured and used. This has the benefit of providing a schema against which the document can be validated: all elements in the document must abide by these rules. Validity is a separate check from “well-formedness”, which only requires the document to follow XML’s basic syntax rules. The end result is a structured document that holds “data” that can be read and processed by machines as well as humans. Say a human (H) creates an XML document and feeds it to a machine (M1) to carry out some action; say it provides information about an address that the machine can use to find its geo-location on a map. That machine can read the “data”, perform its function and destroy the XML document. Another machine (M2) then asks M1 for information it needs, say to ship a parcel. Using XML, M1 can recreate the same document and feed it to M2, and M2 can now read and process the data in the XML document for its own use to complete its task. Beautiful, right!?
<?xml version="1.0" encoding="UTF-8"?>
<shiporder orderid="889923"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:noNamespaceSchemaLocation="shiporder.xsd">
  <orderperson>John Smith</orderperson>
  <shipto>
    <name>Ola Nordmann</name>
    <address>Langgt 23</address>
    <city>4000 Stavanger</city>
    <country>Norway</country>
  </shipto>
  <item>
    <title>Empire Burlesque</title>
    <note>Special Edition</note>
    <quantity>1</quantity>
    <price>10.90</price>
  </item>
  <item>
    <title>Hide your heart</title>
    <quantity>1</quantity>
    <price>9.90</price>
  </item>
</shiporder>
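To make the M1-to-M2 hand-off concrete, here is a minimal sketch of how a receiving machine might read the ship order using Python's standard `xml.etree.ElementTree` module. The function name `read_order` and the trimmed-down copy of the document are illustrative assumptions, not part of any shipping system.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# A trimmed copy of the ship order, kept inline so the example is self-contained.
ORDER_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<shiporder orderid="889923">
  <orderperson>John Smith</orderperson>
  <shipto>
    <name>Ola Nordmann</name>
    <address>Langgt 23</address>
    <city>4000 Stavanger</city>
    <country>Norway</country>
  </shipto>
  <item>
    <title>Empire Burlesque</title>
    <quantity>1</quantity>
    <price>10.90</price>
  </item>
  <item>
    <title>Hide your heart</title>
    <quantity>1</quantity>
    <price>9.90</price>
  </item>
</shiporder>"""


def read_order(xml_bytes):
    """Hypothetical helper: return (order id, shipping address, order total)."""
    root = ET.fromstring(xml_bytes)
    # Collect every child of <shipto> into a plain dictionary.
    address = {child.tag: child.text for child in root.find("shipto")}
    # Sum price * quantity over all <item> elements, using Decimal for money.
    total = sum(
        Decimal(item.findtext("price")) * int(item.findtext("quantity"))
        for item in root.findall("item")
    )
    return root.get("orderid"), address, total


order_id, address, total = read_order(ORDER_XML)
print(order_id, address["city"], total)
```

Note that this only checks that the document is well-formed; validating it against `shiporder.xsd` would need a schema-aware library, which the standard library does not provide.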
Why XML 1.1
This conceptual design has been so robust that no new version was needed for years. The W3C based the definition of XML 1.0 on Unicode 2.0, the then-current version of the Unicode Standard. Unicode aims to provide a unique code number for every character in the world, so that each one is represented and processed correctly by a machine. For example, A has the decimal code 65 and can be written in XML as the character reference &#65;. Unicode itself has taken several years and versions to grow into what it is today. Because XML 1.0 is based on Unicode 2.0, its designers limited constructs such as element and attribute names to the range of characters known at the time, which places strict limits on what can be done in XML 1.0. XML therefore needed a revision to handle Unicode 3.0 and later versions, whose newer characters XML 1.0 does not support in names.
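As a small illustration of character references, the sketch below builds the reference for a character from its Unicode code point and then lets Python's standard XML parser decode it again. The helper name `char_ref` is made up for this example.

```python
import xml.etree.ElementTree as ET


def char_ref(ch, hexadecimal=False):
    """Hypothetical helper: build an XML character reference for one character."""
    code = ord(ch)  # the character's Unicode code point
    return f"&#x{code:X};" if hexadecimal else f"&#{code};"


print(char_ref("A"))        # decimal form
print(char_ref("A", True))  # hexadecimal form

# The parser resolves both forms back to the same character.
decoded = ET.fromstring("<c>&#65;&#x41;</c>").text
print(decoded)
```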
Differences between XML 1.1 and XML 1.0
The first difference you notice is how processors handle names, thanks to forward compatibility. Unlike XML 1.0, XML 1.1 is forward compatible with the Unicode Standard, so XML processors can handle documents whose names use characters assigned only in future versions of Unicode. XML 1.0 defines names by explicitly allowing certain characters and excluding the rest; XML 1.1 allows every character except those that are explicitly excluded. (The fifth edition of XML 1.0 later relaxed its own name rules in the same way.)
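The "allow everything except what is excluded" approach can be sketched as a simple range check. The ranges below are my transcription of the NameStartChar production from the XML 1.1 Recommendation, so verify them against the specification before relying on them; the function name is hypothetical.

```python
# Code point ranges allowed as the first character of a name in XML 1.1
# (transcribed from the NameStartChar production; treat as an assumption).
XML11_NAME_START_RANGES = [
    (0x3A, 0x3A),        # ":"
    (0x41, 0x5A),        # A-Z
    (0x5F, 0x5F),        # "_"
    (0x61, 0x7A),        # a-z
    (0xC0, 0xD6),
    (0xD8, 0xF6),
    (0xF8, 0x2FF),
    (0x370, 0x37D),
    (0x37F, 0x1FFF),
    (0x200C, 0x200D),
    (0x2070, 0x218F),
    (0x2C00, 0x2FEF),
    (0x3001, 0xD7FF),
    (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD),
    (0x10000, 0xEFFFF),  # covers planes that Unicode may still assign later
]


def is_xml11_name_start_char(ch):
    """Return True if ch may start an XML 1.1 name under the ranges above."""
    code = ord(ch)
    return any(low <= code <= high for low, high in XML11_NAME_START_RANGES)


for sample in ["A", "_", "1", "-", "\u1200"]:
    print(repr(sample), is_xml11_name_start_char(sample))
```

The broad blocks are what give XML 1.1 its forward compatibility: a character that Unicode assigns tomorrow inside one of these ranges is already a legal name character today.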
XML 1.1 improves on some of the shortcomings of XML 1.0, notably in how it recognizes the end of a line. XML 1.0 and Unicode disagree on which characters mark the end of a line; this particularly affects IBM mainframe systems, as well as systems communicating with them.
XML 1.1 therefore recognizes two additional end-of-line characters: NEL (next line, 0x85) and the Unicode line separator (0x2028). NEL marks the end of a line, and 0x2028 is included for completeness. Both are normalized to a linefeed when the document is parsed, which resolves the inconsistent line-end handling that plagues XML 1.0.
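The line-end handling can be sketched in a few lines. This is an approximation of the rule as I understand it from the XML 1.1 Recommendation (the sequences CR LF and CR NEL, and the single characters NEL, 0x2028 and a lone CR, all become one linefeed); the function name is hypothetical and this is not how any particular parser implements it.

```python
import re

# Two-character sequences come first in the alternation so that
# "\r\n" and "\r\x85" collapse into a single linefeed rather than two.
_XML11_LINE_END = re.compile("\r\n|\r\x85|\x85|\u2028|\r")


def normalize_line_ends_xml11(text):
    """Replace every XML 1.1 line-end sequence with a single linefeed."""
    return _XML11_LINE_END.sub("\n", text)


raw = "first\r\nsecond\x85third\u2028fourth"
print(normalize_line_ends_xml11(raw).split("\n"))
```

An XML 1.0 processor would only apply the first and last alternatives, leaving NEL and 0x2028 in the text as ordinary characters.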
The last improvements in XML 1.1 concern character normalization and control characters. Unicode intends to provide a unique number for every character, but certain characters can be represented in more than one way. The good news is that Unicode also defines several ways to normalize strings before they are processed, and an XML 1.1 processor can check that a document is normalized. XML 1.1 also allows control characters to be included by means of character references; this concerns characters 0x01 through 0x1F.
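To see why normalization matters, consider the letter é, which Unicode can encode either as one precomposed character or as a plain e followed by a combining accent. The sketch below uses Python's standard `unicodedata` module and Normalization Form C (NFC); the helper `is_nfc` is an illustrative name, and the full normalization checking described in XML 1.1 involves more than this single test.

```python
import unicodedata

composed = "\u00e9"     # "é" as a single precomposed code point
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT


def is_nfc(text):
    """Return True if text is already in Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text


# The two strings look identical on screen but compare as different.
print(composed == decomposed)
# After normalization they are the same string.
print(unicodedata.normalize("NFC", decomposed) == composed)
print(is_nfc(composed), is_nfc(decomposed))
```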
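|XML 1.0||XML 1.1|
|Based on Unicode 2.0; names are limited to explicitly allowed characters.||Forward compatible with later Unicode versions; names may use any character that is not explicitly excluded.|
|Does not recognize NEL (0x85) or the line separator (0x2028) as end-of-line characters.||Recognizes NEL (0x85) and the line separator (0x2028) as end-of-line characters and normalizes them to a linefeed.|
|Character normalization is not checked, so the same character may be encoded in more than one way.||Processors can check that a document is normalized.|
|Control characters 0x01 through 0x1F cannot be used, apart from tab, linefeed and carriage return.||Control characters 0x01 through 0x1F can be included by means of character references.|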