Introduction to XML

Monday, May 3, 2010

XML stands for Extensible Markup Language. It has some similarities to HTML (HyperText Markup Language), which is also a markup language.

An important difference between XML and HTML is that XML is used to describe data, while HTML is mainly used to display and format text. The main use of HTML is to build web pages, whereas XML is used in a wide variety of applications and has become a standard for data exchange between different platforms.
Unlike HTML, XML has no predefined set of tags. You can define your own tags to structure and organize your data. Let us see some sample XML:




 
  
  
  
 
 
 
  
  
  
 
The above piece of XML is quite self-explanatory. It represents data related to a company. One thing to note here is that its tags are not predefined. Any XML document can have its own set of tags. The only requirement is that a well-formed XML document must follow some rules, for example each opening tag must have a matching closing tag. There are more such rules.
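
The matching-closing-tag rule can be checked in code. The following is a minimal sketch, assuming that a failed parse is an acceptable signal that the text breaks one of the rules. The tag names `Company` and `Employee` and the class name `WellFormedCheck` are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Xml;

public static class WellFormedCheck
{
    // Returns true when the text parses as XML, i.e. it follows the basic rules.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            new XmlDocument().LoadXml(xml);
            return true;
        }
        catch (XmlException)
        {
            // The parser rejects text that breaks a rule, such as an unclosed tag.
            return false;
        }
    }

    public static void Main()
    {
        // Hypothetical tag names, used only for this illustration.
        string good = "<Company><Employee>Fred</Employee></Company>";
        string bad = "<Company><Employee>Fred</Company>"; // <Employee> is never closed

        Console.WriteLine(IsWellFormed(good)); // True
        Console.WriteLine(IsWellFormed(bad));  // False
    }
}
```

Catching `XmlException` keeps the check to a single yes/no answer. The exception's message and line information could be shown instead when you need to know which rule was broken.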

Need for XML

XML is a platform-neutral standard. Since it is represented in plain text, any platform can read it. You can use XML to exchange data between Unix and Windows, between mainframes and Windows applications, and between virtually any other platforms. It is a simple, easy-to-use *language* for exchanging and storing data.

In the real world, if you have to store data, you have several choices. One of the most common mechanisms is a database. Databases are the best choice if you have a lot of data to store and manipulate. But if you have to send data to another application or platform, a database is not much help. Suppose your data is stored in a SQL Server database. What if your manager asks you for the list of employees in the Sales department? How will you give him the data? You cannot hand over your SQL Server database. He may not have SQL Server installed, or he may not have the technical expertise to view data in SQL Server. XML is very helpful in such scenarios. You just need to generate an XML file from the SQL Server database with the required records and hand it over as a printout or by email. XML is simple enough that anyone can read it using any text editor.
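
As a rough sketch of that scenario, the code below filters employee rows by department and writes the result as XML. It assumes the query result is already available as a `DataTable`. Here an in-memory table with made-up rows stands in for SQL Server, and the column names, the `Accounts` department, and the class name `EmployeeExport` are hypothetical.

```csharp
using System;
using System.Data;
using System.IO;

public static class EmployeeExport
{
    // Returns the rows of the given department as an XML string.
    public static string ExportDepartment(DataTable employees, string department)
    {
        // Filter with a DataView, then copy the matching rows into a new table.
        DataView view = new DataView(employees);
        view.RowFilter = "Department = '" + department.Replace("'", "''") + "'";
        DataTable selected = view.ToTable("Employee");

        using (StringWriter writer = new StringWriter())
        {
            // WriteXml emits one element per row and one child element per column.
            selected.WriteXml(writer);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        // In-memory stand-in for the result of a database query (made-up rows).
        DataTable employees = new DataTable("Employee");
        employees.Columns.Add("Name", typeof(string));
        employees.Columns.Add("Department", typeof(string));
        employees.Rows.Add("Fred", "Sales");
        employees.Rows.Add("Joe", "Accounts");

        Console.WriteLine(ExportDepartment(employees, "Sales"));
    }
}
```

The resulting text can be saved to a file and emailed, and the recipient needs nothing more than a text editor to read it.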

There are several XML viewers and editors available that show XML as a tree view. Internet Explorer is a good XML viewer. Simply copy the above XML, save it into a file called 'sample.xml', then open the file in Internet Explorer and see how it works!

You can define your own XML tags based on your needs. You can have nested tags too, which help you organize the data any way you want it. See a slightly different version of the above XML:





 
  
  
  
 
 
 
  
   
Fred
3000 Briarcliif Way
Fred@spiderkerala.com
616 304 1093
616 304 1093

Joe
3960 Whispering Way
Joe@spiderkerala.com
616 304 1093
616 304 1093

An entire body of XML data is called an 'XML document'. .NET provides several classes to read and manipulate XML documents. We will explore some of them in upcoming chapters.
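
To make the idea of nested tags concrete, here is a small sketch that loads a nested document with `XmlDocument` and reads a value from inside it. Only the names, addresses and email values come from the sample data. The tag names (`Company`, `Employee`, `Name`, `Contact`, `Address`, `Email`) and the class name `NestedXmlDemo` are assumptions made for this illustration.

```csharp
using System;
using System.Xml;

public static class NestedXmlDemo
{
    // Tag names are hypothetical; only the values come from the sample data.
    public const string SampleXml =
        "<Company>" +
          "<Employee>" +
            "<Name>Fred</Name>" +
            "<Contact>" +
              "<Address>3000 Briarcliif Way</Address>" +
              "<Email>Fred@spiderkerala.com</Email>" +
            "</Contact>" +
          "</Employee>" +
          "<Employee>" +
            "<Name>Joe</Name>" +
            "<Contact>" +
              "<Address>3960 Whispering Way</Address>" +
              "<Email>Joe@spiderkerala.com</Email>" +
            "</Contact>" +
          "</Employee>" +
        "</Company>";

    // Returns the email nested under the named employee, or null if not found.
    public static string FindEmail(string xml, string employeeName)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        foreach (XmlNode employee in doc.SelectNodes("Company/Employee"))
        {
            if (employee.SelectSingleNode("Name").InnerText == employeeName)
            {
                // The path walks down through the nested Contact element.
                return employee.SelectSingleNode("Contact/Email").InnerText;
            }
        }
        return null;
    }

    public static void Main()
    {
        Console.WriteLine(FindEmail(SampleXml, "Joe")); // Joe@spiderkerala.com
    }
}
```

Grouping the address and email under one `Contact` element is one possible layout. The same data could be nested differently, which is exactly the freedom that user-defined tags give you.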

You might have seen our Tutorial Index, which lists all chapters. You can click on any chapter title and navigate to the corresponding page. This index is populated from an XML file.

Click here to view the XML file, from which we load our tutorial chapters.

We load it from an XML file because our chapters change continuously and we add new chapters frequently. Every chapter has a link to the next chapter. If we hard-coded these links in all chapters, we would have to make changes in several places whenever we reordered chapters or added new ones. There is every chance that we would make mistakes, and some chapters would lead to the wrong next chapter. In our current implementation, we programmatically read the XML file and display the Tutorial Index. Each page also has a piece of code that reads the current page URL and finds the next page URL from the XML file based on it. This mechanism lets us add new chapters without making any changes to existing pages. All we have to do is insert a one-line entry in the XML file, and the change is automatically reflected in all other relevant pages. Even if we want to reorder some of the chapters, we just need to rearrange the entries in the XML file.

We use the following code to load our Chapter Index from the XML file: 


using System.Text;
using System.Xml;

public string GetChapters( string filePath )
{
 // Create an xml document.
 XmlDocument doc = new XmlDocument();

 // Load the XML from the file.
 doc.Load(filePath);

 // Retrieve all categories from the xml.
 XmlNodeList categories = doc.SelectNodes("DotNetSpider/tutorials/Category");

 // Build the output in a StringBuilder to avoid repeated string copies.
 StringBuilder chapterString = new StringBuilder();
 int chapter = 0;

 // Iterate through all the categories.
 foreach ( XmlNode categoryNode in categories )
 {
  // Add the category name to the string.
  // NOTE: the HTML tags inside the original string literals were lost;
  // the <h3>, <ul> and <li> tags used here are an assumed reconstruction.
  chapterString.Append("<h3>" + categoryNode.Attributes["Name"].Value + "</h3><ul>");

  // Get all chapters in the current category.
  XmlNodeList chapters = categoryNode.SelectNodes("Chapter");

  // Loop through all chapters in the current category and add them to the string.
  foreach ( XmlNode chapterNode in chapters )
  {
   ++chapter;

   chapterString.Append("<li>Chapter " + chapter + " : " + chapterNode.Attributes["Name"].Value + "</li>");
  }

  // Close the list opened for this category.
  chapterString.Append("</ul>");
 }

 // Return the string, which contains the list of chapters.
 return chapterString.ToString();
}

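
The next-chapter lookup described earlier could be sketched along the same lines. This is not the site's actual code. It reuses the element path from GetChapters, and it assumes each Chapter entry carries a `Url` attribute, which is a hypothetical name since the source shows only `Name`. The class name, category names and sample URLs are made up as well.

```csharp
using System;
using System.Xml;

public static class ChapterNavigator
{
    // Returns the Url of the chapter that follows currentUrl in document order,
    // or null if currentUrl is the last chapter or is not listed.
    // The "Url" attribute name is an assumption; the source shows only "Name".
    public static string GetNextChapterUrl(XmlDocument doc, string currentUrl)
    {
        // Chapters from all categories, in the order they appear in the file.
        XmlNodeList chapters = doc.SelectNodes("DotNetSpider/tutorials/Category/Chapter");

        for (int i = 0; i < chapters.Count - 1; i++)
        {
            XmlAttribute url = chapters[i].Attributes["Url"];
            if (url != null && url.Value == currentUrl)
            {
                XmlAttribute next = chapters[i + 1].Attributes["Url"];
                return next == null ? null : next.Value;
            }
        }
        return null;
    }

    public static void Main()
    {
        // Made-up index content that follows the element path used by GetChapters.
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<DotNetSpider><tutorials>" +
              "<Category Name='Basics'>" +
                "<Chapter Name='Introduction to XML' Url='/xml-intro.aspx' />" +
              "</Category>" +
              "<Category Name='Advanced'>" +
                "<Chapter Name='Reading XML' Url='/xml-read.aspx' />" +
              "</Category>" +
            "</tutorials></DotNetSpider>");

        Console.WriteLine(GetNextChapterUrl(doc, "/xml-intro.aspx")); // /xml-read.aspx
    }
}
```

Because the lookup depends only on the order of entries in the file, reordering chapters or inserting a new one needs no change to this code or to any existing page.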
We explored some areas where XML can be used, but XML can be used in a very wide range of ways. We will explore more advanced uses of XML in another chapter.

XML Specification

This chapter gave only a very brief introduction to XML and left out many important aspects. We will explain most of them in other chapters.

XML is now a standard, and you can read its rules and specification here: http://www.w3.org/TR/2004/REC-xml-20040204
