Wednesday, July 4, 2012

Parsing XML Documents in JavaScript


Today XML has become the backbone of many web applications, and server-side programming languages support it extensively. JavaScript also supports parsing XML files on the client side. The one thing JavaScript programmers would still love to have is the ability to write XML files from JavaScript. In this tutorial we will see how to parse an XML file in JavaScript.
This tutorial is divided into three sections:
  • What is an XML Document?
  • XML Parsers in JavaScript
  • Properties of XML Parsers in JavaScript
What is an XML Document?
XML stands for Extensible Markup Language. It is called extensible because it allows its users to define their own tags. The primary purpose of an XML document is to share structured data across different information systems, particularly over the Internet.
HTML files and XML files look similar at first glance, but there is an important difference between them:
1. XML was designed to describe the data and to focus on what data is.
2. HTML was designed to display data and to focus on how data will look.
Rules for an XML Document:
  • Every start tag must have a matching end tag
  • An empty element may be written as a self-closing tag such as <element />, which is equivalent to <element></element>
  • All attribute values must be quoted with either single quotes (') or double quotes (")
  • Tags may be nested but must not overlap
  • Element tag names are case-sensitive. Example: opening with <Year> and closing with </year> must be avoided
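Example of an XML Document:
    <?xml version="1.0" encoding="UTF-8" ?>
    <!-- the name of the wrapper element around the year entries did not survive; "turnover" is an assumption -->
    <company>
      <employee id="001">John</employee>
      <turnover>
        <year id="2000">100,000</year>
        <year id="2001">140,000</year>
        <year id="2002">200,000</year>
      </turnover>
    </company>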
    XML Parsers in JavaScript
    To manipulate an XML document in JavaScript, you need an XML parser. Today all browsers come with built-in parsers that can parse an XML document. The parser loads the document into your computer’s memory. Once the document is loaded, its data can be manipulated using the DOM (Document Object Model). There are significant differences between the XML parser in Microsoft browsers and the XML parser in Mozilla-based browsers.
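Because the two parser families are created differently, a script usually has to find out which one it is running on. The following is a minimal sketch of such a check, not a definitive implementation. The function name `detectXmlParser` and its `env` parameter (a stand-in for the browser's `window` object) are assumptions made for illustration.

```javascript
// Sketch: decide which of the two parser styles an environment offers.
// `env` is a hypothetical stand-in for the browser's `window` object.
function detectXmlParser(env) {
  // Older Microsoft browsers expose the COM-based parser through ActiveXObject.
  if (typeof env.ActiveXObject === "function") {
    return "activex";
  }
  // Mozilla-style browsers create an empty XML document through the DOM.
  var impl = env.document && env.document.implementation;
  if (impl && typeof impl.createDocument === "function") {
    return "dom";
  }
  // Neither style was found.
  return "none";
}
```

In a browser this would be called as `detectXmlParser(window)`, and the result decides which of the two loading sequences to follow.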
    XML Parser in Microsoft Browser:
    Microsoft’s XML parser is a COM component that comes with Internet Explorer 5 and higher. To load an XML document with it in JavaScript, follow these steps.
    1. Create an instance of the XML parser:
    <script type="text/javascript">
         var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
    </script>
    This loads the XML parser into memory, where it waits for an XML document. The component is released automatically when you close the browser window or the browser. Here xmlDoc holds the XML object for JavaScript.
    2. Load the XML data synchronously
    <script type="text/javascript">
        xmlDoc.async = false;
    </script>
    This line turns off asynchronous loading, so that the script does not continue executing before the document is fully loaded.
    3. Callback function
    <script type="text/javascript">
        xmlDoc.onreadystatechange = functionName;
    </script>
    This calls the callback function functionName on every state change while the XML document is loading.
    ReadyStates in Microsoft Browsers
    State  Name         Description
    1      Loading      Preparing to read the XML file; no read has been attempted yet
    2      Loaded       Parsing the XML file; the object model is not yet available
    3      Interactive  Part of the XML file has been parsed and read in; the object model is partially available, read-only
    4      Completed    Loading of the XML file has finished, successfully or unsuccessfully
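Inside a callback it is often handy to turn the numeric state into its name, for example when logging progress. This is a small sketch that uses the four states listed above. The names `READY_STATE_NAMES` and `describeReadyState` are made up for this example.

```javascript
// The four ready states listed above, keyed by their numeric value.
var READY_STATE_NAMES = {
  1: "Loading",
  2: "Loaded",
  3: "Interactive",
  4: "Completed"
};

// Return the name for a readyState value, or "Unknown" for anything else.
function describeReadyState(state) {
  return READY_STATE_NAMES[state] || "Unknown";
}
```

A callback would typically call `describeReadyState(xmlDoc.readyState)` and only touch the document once the result is "Completed".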
    4. Load XML Document
    <script type="text/javascript">
        xmlDoc.load("note.xml");
    </script>
    This tells the parser to load the note.xml file.
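The four steps above can be put together in one function. This is only a sketch under stated assumptions: `loadXmlSynchronously` is a hypothetical name, and the constructor is passed in as the parameter `XmlDomCtor` (in Internet Explorer that would be `ActiveXObject`) so that the flow can also be exercised with a stand-in object.

```javascript
// Sketch of the four steps for Microsoft browsers.
// XmlDomCtor is a hypothetical parameter; in Internet Explorer pass ActiveXObject.
function loadXmlSynchronously(XmlDomCtor, url, onStateChange) {
  // Step 1: create the parser instance.
  var xmlDoc = new XmlDomCtor("Microsoft.XMLDOM");
  // Step 2: switch to synchronous loading.
  xmlDoc.async = false;
  // Step 3: report every ready-state change to the caller.
  xmlDoc.onreadystatechange = function () {
    onStateChange(xmlDoc.readyState);
  };
  // Step 4: load the document.
  xmlDoc.load(url);
  return xmlDoc;
}
```

The callback is assigned before `load` is called so that no state change is missed.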
    XML Parser in Firefox/Opera Browsers:
    Like Microsoft IE, Mozilla Firefox and Opera come with their own XML parsers. To load an XML document in JavaScript, follow these steps:
    1. Create an instance of the XML parser
    <script type="text/javascript">
        var xmlDoc = document.implementation.createDocument("", "", null);
    </script>
    This loads the XML parser into memory, where it waits for an XML document. The component is released automatically when you close the browser window or the browser. Here xmlDoc holds the XML object for JavaScript.
    2. Callback function
    <script type="text/javascript">
        xmlDoc.onload = functionName;
    </script>
    This line tells the parser to call the function functionName once the XML document is loaded. Assign it before starting the load, because these browsers load the document asynchronously by default.
    3. Load XML Document
    <script type="text/javascript">
        xmlDoc.load("note.xml");
    </script>
    This line tells the parser to load an XML document called “note.xml”.
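The steps for Firefox and Opera can be sketched as one function as well. `loadXmlWithCallback` is a hypothetical name, and `doc` is a parameter standing in for the page's `document`. The sketch attaches the handler before calling `load`, since the load is asynchronous. Note that `load` on such a document is the non-standard method this tutorial relies on; a browser that lacks it will throw at that call.

```javascript
// Sketch of the steps for Firefox/Opera-style parsers.
// `doc` is a hypothetical parameter; in a page pass the global `document`.
function loadXmlWithCallback(doc, url, onLoaded) {
  // Create an empty XML document.
  var xmlDoc = doc.implementation.createDocument("", "", null);
  // Attach the handler first so the load event cannot be missed.
  xmlDoc.onload = function () {
    onLoaded(xmlDoc);
  };
  // Start loading; relies on the non-standard load() method.
  xmlDoc.load(url);
  return xmlDoc;
}
```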

    Properties of XML Parsers in JavaScript:

    The XML object comes with built-in properties that make it easy to iterate over the XML document. To explain these properties, I will refer to the XML file below; the code for loading it follows.
    Employee.xml
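    <?xml version="1.0" encoding="UTF-8" ?>
    <!-- the name of the wrapper element around the year entries did not survive; "turnover" is an assumption -->
    <company>
      <employee id="001">John</employee>
      <turnover>
        <year id="2000">100,000</year>
        <year id="2001">140,000</year>
        <year id="2002">200,000</year>
      </turnover>
    </company>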
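    JavaScript code snippet for loading employee.xml (Microsoft browsers):
    [The code listing was not preserved in this copy; only its page title survives: "Read XML in Microsoft Browsers". The line numbers cited in the following sections refer to that listing.]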

    Note:
    XML Object properties are case-sensitive.
    The various properties of XML Parsers are:
    documentElement
    The documentElement property always points to the root element of the XML document.
    Syntax:
    xml_object.documentElement
    Referring to the above code, Line 20 points to the company tag, and tagName returns company.
    childNodes
    The childNodes property holds an array-like list of the child nodes of the current node.
    Syntax:
    xml_object.current_pointing_node.childNodes[Array Index]
    firstChild 
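    This property points to the first child node of the current node. The property itself is available in all browsers, but the result described here is IE-specific: Mozilla-based browsers also count the whitespace between tags as text nodes, so firstChild may be a text node there.
    Syntax:
    xml_object.firstChild
    Referring to the above code, Line 24 points to the year tag, and tagName returns year.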
    lastChild 
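    This property points to the last child node of the current node. The property itself is available in all browsers, but the result described here is IE-specific: in Mozilla-based browsers the last child may be a whitespace text node.
    Syntax:
    xml_object.lastChild
    Referring to the above code, Line 28 points to the average tag, and tagName returns average.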
    attributes
    Gives you access to the attributes of an element. This property is used together with the nodeValue property.
    Syntax:
    xml_object.current_pointing_node.attributes[Attribute Name/Array Index]
    Referring to the above code, Lines 33 and 34 point to the employee tag and return the value assigned to its id attribute.
    nodeValue
    Returns the value of a node. For an attribute node this is the value assigned to the attribute, which is how it is used in this tutorial. For an element node nodeValue is null, so it cannot be used to read an element's text directly.
    Syntax:
    xmlDoc.current_pointing_node.attributes[Attribute Name/Array Index].nodeValue
    Referring to the above code, Lines 33 and 34 point to the employee tag and return the value assigned to its id attribute.
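As an illustration of how attributes and nodeValue work together, here is a minimal sketch. The helper name `readAttribute` is made up for this example. It looks the attribute up with `getNamedItem` rather than with the bracket syntax shown above, and returns null when the attribute is missing.

```javascript
// Sketch: read one attribute value from an element node.
// Returns null when the node has no attributes or the attribute is absent.
function readAttribute(node, name) {
  if (!node.attributes) {
    return null;
  }
  // Look the attribute node up by name.
  var attr = node.attributes.getNamedItem(name);
  // nodeValue of an attribute node is the attribute's value.
  return attr ? attr.nodeValue : null;
}
```

For the employee element of the sample file, `readAttribute(employeeNode, "id")` would be expected to yield the id value, where `employeeNode` is whatever variable holds that element.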
    getElementsByTagName
    This method lets you reach any tag by passing its name as a parameter, so you need to know the XML structure to use it. It always returns an array-like list of nodes.
    Syntax:
    xml_object.getElementsByTagName("Tag Name")
    Referring to the above code, Line 39 points to the year tag and returns the value assigned to the id attribute of the first year tag only.
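Since getElementsByTagName returns a list, reading more than the first match means looping over it. A short sketch, assuming a hypothetical helper named `collectAttribute` that gathers one attribute from every matching element:

```javascript
// Sketch: collect the value of one attribute from every element with a given tag name.
function collectAttribute(xmlDoc, tagName, attrName) {
  var nodes = xmlDoc.getElementsByTagName(tagName);
  var values = [];
  // The result is array-like: it has a length and can be indexed.
  for (var i = 0; i < nodes.length; i++) {
    values.push(nodes[i].getAttribute(attrName));
  }
  return values;
}
```

Called as `collectAttribute(xmlDoc, "year", "id")` on the sample file, this would gather the id of every year tag instead of only the first one.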
    text
    Returns the text content of an element. This property cannot be used to read attribute content, and it is only available in Microsoft browsers.
    Syntax:
    xml_object.current_pointing_node.text
    Referring to the above code, Line 43 points to the employee tag and returns John.
    textContent
    Returns the text content of an element. This property cannot be used to read attribute content. It is available in Mozilla Firefox and Opera, but not in Internet Explorer 8 and earlier.
    Syntax:
    xml_object.current_pointing_node.textContent
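Because text and textContent do the same job in different browsers, a small wrapper can hide the difference. This is a sketch only, and `getNodeText` is a name chosen for this example.

```javascript
// Sketch: read an element's text in either browser family.
function getNodeText(node) {
  // Firefox/Opera style.
  if (typeof node.textContent === "string") {
    return node.textContent;
  }
  // Microsoft style.
  if (typeof node.text === "string") {
    return node.text;
  }
  // Neither property is present.
  return "";
}
```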
    hasChildNodes
    Returns a boolean value indicating whether the current node has child nodes. It is a method, so it must be called with parentheses.
    Syntax:
    xml_object.current_pointing_node.hasChildNodes()
    Referring to the above code, Line 47 returns true, as the company tag has child nodes.
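A typical use of hasChildNodes is to stop a recursive walk at the leaves of the tree. The sketch below counts every node beneath a starting node. The name `countDescendants` is an assumption made for this example.

```javascript
// Sketch: count all nodes below `node`, using hasChildNodes() to stop at leaves.
function countDescendants(node) {
  if (!node.hasChildNodes()) {
    return 0;
  }
  var total = 0;
  for (var i = 0; i < node.childNodes.length; i++) {
    // Count the child itself plus everything beneath it.
    total += 1 + countDescendants(node.childNodes[i]);
  }
  return total;
}
```

The count includes text nodes, so the number can differ between browsers for the same file.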
    tagName
    Returns the name of the element.
    Syntax:
    xml_object.current_pointing_node.tagName
    Referring to the above code, Line 20 points to the company tag and returns company.
    Important Note on Cross Browser XML Scripting:
    To implement cross-browser functionality, the childNodes property plays an important role.
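One reason childNodes matters here is that Mozilla-based parsers keep the whitespace between tags as text nodes, while Microsoft's parser drops it by default, so the same index can point at different nodes. A minimal sketch of one way to get consistent results is to keep only element nodes. The helper name `elementChildren` is made up for this example.

```javascript
// nodeType value that the DOM assigns to element nodes.
var ELEMENT_NODE = 1;

// Sketch: return only the element children of a node, skipping text and other nodes.
function elementChildren(node) {
  var result = [];
  for (var i = 0; i < node.childNodes.length; i++) {
    var child = node.childNodes[i];
    if (child.nodeType === ELEMENT_NODE) {
      result.push(child);
    }
  }
  return result;
}
```

With this helper, `elementChildren(xmlDoc.documentElement)[0]` would be expected to refer to the same element in both browser families.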
    http://www.hiteshagrawal.com/javascript/javascript-parsing-xml-in-javascript