DOC_LM_Unit-1 (1)
What is XML?
▪ XML stands for Extensible Markup Language. It is a textual data format that
applications use to communicate (send and receive data) with other applications.
▪ It is a text-based markup language derived from Standard Generalized Markup Language
(SGML).
▪ As a markup language, it uses tags to describe data.
▪ It is used to define and store data in a shareable manner. XML supports information
exchange between computer systems such as websites, databases, and third-party applications.
HTML vs XML
HTML                                  | XML
We can only use pre-defined tags      | We define our own (user-defined) tags
Ex. <p> <b> <br>                      | Ex. <friendlist> <channel>
Not case sensitive                    | Case sensitive
Why XML?
In the real world, XML is used to build GUIs and Java-based web applications, and to
improve the searchability of a website.
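Analogy: Raj speaks English, Hindi and Gujarati; Ram speaks Chinese, Korean and English.
They can talk to each other in English, the one language they share.
In the same way, Application1 and Application2 exchange data in a format they both
understand, such as JSON or XML.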
Note: JSON (JavaScript Object Notation) is a data interchange format that uses
human-readable text to store and transmit data.
XML allows data to be exchanged between databases, user desktops, and third-party
applications on any platform and in any programming language.
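The exchange described above can be sketched in a few lines of Python. This is only an illustration using the standard-library xml.etree.ElementTree module; the function names to_xml and from_xml and the root element name person are hypothetical, chosen for this example.

```python
import xml.etree.ElementTree as ET

def to_xml(record: dict) -> str:
    """Application1: serialize a flat dictionary into an XML string."""
    root = ET.Element("person")  # hypothetical root element name
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")

def from_xml(text: str) -> dict:
    """Application2: rebuild the dictionary from the XML string."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}

sent = {"name": "Raj", "language": "English"}
wire = to_xml(sent)        # the text that travels between the applications
received = from_xml(wire)  # what the receiving application reconstructs

print(wire)
print(received)
```

Because the data on the wire is plain text, the receiving side does not need to be written in Python; any language with an XML parser could read it.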
▪ Data structure
XML uses tags to define the structure and meaning of data, such as the beginning and
end of a paragraph or the location of an image.
▪ Data sharing
XML is a flexible way to create information formats and share structured data
electronically.
▪ Data readability
XML is designed to be readable by both humans and machines.
▪ Data customization
XML is highly customizable and can be used to create different content types, such as
web, print, and mobile content.
▪ Web searching
Search engines can use XML tags to make searches more accurate. For example, a search
engine can limit search results to pages in which a specific tag, such as the author's
name, contains the search term.
o This makes it much easier to create data that can be shared by different applications.
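To illustrate how tags give data a structure that a program can search, here is a small sketch. The catalog content and the function name titles_by_author are made up for this example. A plain text search for "Ravi" matches both books, while a search restricted to the author tag matches only one.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: "Ravi" appears once as an author and once in a title.
CATALOG = """
<library>
  <book><title>XML Basics</title><author>Ravi</author></book>
  <book><title>Ravi's Travels</title><author>Sita</author></book>
</library>
"""

def titles_by_author(xml_text: str, author: str) -> list:
    """Return the titles of books whose <author> element equals the given name."""
    root = ET.fromstring(xml_text)
    return [
        book.findtext("title")
        for book in root.iter("book")
        if book.findtext("author") == author
    ]

print(CATALOG.count("Ravi"))              # occurrences found by a plain text search
print(titles_by_author(CATALOG, "Ravi"))  # matches restricted to the <author> tag
```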
Characteristics of XML
▪ There are three important characteristics of XML that make it useful in a variety of
systems and solutions:
o XML is extensible: XML allows you to create your own self-descriptive tags, or
language, that suit your application.
o XML carries the data, does not present it: XML allows you to store the data
irrespective of how it will be presented. An XML document can also carry data
drawn from other data files.
o XML is a public standard: XML was developed by an organization called the
World Wide Web Consortium (W3C) and is available as an open standard.
1. XML is a structured format, which means that we can define exactly how the data is
to be arranged, organized and expressed within the file. When we are given a file, we
can validate that it conforms to a specific structure, prior to importing the data.
2. XML is a described format, which means that within the text file, every item of data
has a name that is both human- and machine-readable as well as being uniquely
identifiable. Ex: an element <channel> whose schema type is xs:string.
3. XML can easily describe hierarchical data and the relationships between data.
<youtube>
<channel>sony</channel>
<subscriber>1k</subscriber>
</youtube>
4. XML can be validated, which means we can provide a second XML file – an XML
Schema Definition file – that describes exactly how the XML data file should be
structured.
5. XML is a strongly-typed format, which means the schema definition file specifies
the data type of each element. When importing the data, the application can check the
schema definition to identify the data type to import it as.
6. XML is a global format. There is only one way to express a number in an XML file
(with a period as the decimal separator, as in US number formats) and only one way to
express a date (YYYY-MM-DD for the schema type xs:date). The most common types are
xs:string, xs:decimal and xs:integer.
7. XML is a standard format. This allows different applications to read, write,
understand and validate the same XML files, so data can be shared between
applications in a reliable manner.
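Points 3 and 5 above can be sketched together: the parser exposes the hierarchy, and a type table tells the application how to import each value. Python's standard library does not read XML Schema files, so the TYPES dictionary below is a hand-written, hypothetical stand-in for a schema; the rating element and the numeric subscriber value are also assumptions added for illustration.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

DOC = (
    "<youtube>"
    "<channel>sony</channel>"
    "<subscriber>1000</subscriber>"
    "<rating>4.5</rating>"
    "</youtube>"
)

# Hypothetical stand-in for a schema: element name -> (schema type, Python converter)
TYPES = {
    "channel": ("xs:string", str),
    "subscriber": ("xs:integer", int),
    "rating": ("xs:decimal", Decimal),
}

def load_typed(xml_text: str) -> dict:
    """Walk the children of the root and convert each value to its declared type."""
    root = ET.fromstring(xml_text)
    result = {}
    for child in root:
        _schema_type, convert = TYPES[child.tag]
        result[child.tag] = convert(child.text)
    return result

data = load_typed(DOC)
print(data)
```

A real project would normally let a schema-aware library perform this step; the sketch only shows why knowing the type of each element matters when importing data.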
Structure of XML
▪ An XML document is a well-organized collection of components and associated markup.
▪ An XML document can hold a wide range of information, for instance a database
of numbers or a mathematical equation.
▪ An XML document can have the following parts:
1) Document Prolog: It contains the XML declaration and the document type declaration.
These components must appear before the root element, and the XML declaration must be
on the very first line of the document. The prolog is optional.
<?xml
version="version_number"
encoding="encoding_declaration"
standalone="standalone_status"
?>
▪ If the XML declaration is present in the XML, it must be placed as the first line in the
XML document.
▪ If the XML declaration is included, it must contain the version attribute.
▪ The parameter names and values are case-sensitive.
▪ The parameter names are always in lower case.
▪ The order of placing the parameters is important. The correct order is: version,
encoding and standalone.
▪ Either single or double quotes may be used.
▪ The XML declaration has no closing tag, i.e. there is no </?xml>.
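Several of the rules above can be checked by handing small documents to a parser. The sketch below uses Python's xml.etree.ElementTree (which relies on the Expat parser); the helper name is_accepted and the sample documents are made up for this example.

```python
import xml.etree.ElementTree as ET

def is_accepted(document: bytes) -> bool:
    """Return True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

good = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><note/>'
single_quotes = b"<?xml version='1.0' encoding='UTF-8'?><note/>"
not_first_line = b'\n<?xml version="1.0"?><note/>'         # declaration after a blank line
wrong_order = b'<?xml encoding="UTF-8" version="1.0"?><note/>'
no_version = b'<?xml encoding="UTF-8"?><note/>'
uppercase_name = b'<?xml VERSION="1.0"?><note/>'

samples = [good, single_quotes, not_first_line, wrong_order, no_version, uppercase_name]
for sample in samples:
    print(sample, is_accepted(sample))
```

Only the first two samples follow all the rules, so only they are expected to be accepted.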
2) Document Element
1. Root: An XML document must have exactly one root element. A root element can have
child elements and sub-child elements. For example, in a document whose root element
is <message>, the elements <to>, <from>, <subject> and <text> are its child
elements.
2. Comments: Written as <!-- My Comment -->. A comment must come after the XML
declaration, never before it.
3. DocType: The Document Type Declaration can take two forms: a reference to an
external file that contains the DTD, or an inline DTD.
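As a small illustration of the root and child relationship, the sketch below parses a message document and lists the children of its root. The element values (Ram, Raj, and so on) are invented for the example.

```python
import xml.etree.ElementTree as ET

MESSAGE = """<?xml version="1.0"?>
<!-- a comment placed after the XML declaration -->
<message>
  <to>Ram</to>
  <from>Raj</from>
  <subject>Greeting</subject>
  <text>Hello!</text>
</message>"""

root = ET.fromstring(MESSAGE)
child_names = [child.tag for child in root]  # direct children, in document order

print(root.tag)
print(child_names)
```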
Note: A "well formed" XML document is not the same as a "valid" XML document. A
"valid" XML document must be well formed and, in addition, must conform to a
document type definition.
Example :
<!DOCTYPE book[
<!ELEMENT book (title,author,price)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT price (#PCDATA)> ]>
• !DOCTYPE book defines that the root element of the document is book.
• !ELEMENT book defines that the book element must contain the elements "title,
author, price", in that order.
• !ELEMENT title defines the title element to be of type "#PCDATA"
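The difference between "well formed" and "valid" can be sketched as follows. Python's xml.etree.ElementTree checks well-formedness only and does not validate against a DTD, so the function below hand-codes the three rules of the book DTD as an illustration; it is not a general DTD validator, and the name classify is hypothetical.

```python
import xml.etree.ElementTree as ET

# Mirrors <!ELEMENT book (title,author,price)> from the DTD example.
REQUIRED_CHILDREN = ["title", "author", "price"]

def classify(xml_text: str) -> str:
    """Classify a document as not well-formed, well-formed but invalid, or valid."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return "not well-formed"
    if root.tag != "book":
        return "well-formed but invalid"
    if [child.tag for child in root] != REQUIRED_CHILDREN:
        return "well-formed but invalid"
    # #PCDATA means text only: the children must not contain nested elements.
    if any(len(child) > 0 for child in root):
        return "well-formed but invalid"
    return "valid"

print(classify("<book><title>T</title><author>A</author><price>10</price></book>"))
print(classify("<book><title>T</title><price>10</price></book>"))
print(classify("<book><title>T</book>"))
```

The second document parses without errors but is missing the author element, which is exactly the case the note describes: well formed, yet not valid.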
<student>
<id>543</id>
<name>Ravi</name>
<age>21</age>
<addr>Guntur</addr>
<email>nsr@gmail.com</email>
<ph>9855555</ph>
<gender>male</gender>
</student>
2) External DTD
Save the DTD code as "student.dtd" and prepare "student.xml" as follows. The DOCTYPE
declaration names the document type; here, student is the name of the root element.
SYSTEM: This keyword indicates that the Document Type Definition (DTD) is stored in an
external file. It is followed by the location (a URI or a path) of that file.
<!DOCTYPE student SYSTEM "student.dtd">
<student>
<id>543</id>
<name>Ravi</name>
<age>21</age>
<addr>Guntur</addr>
<email>nsr@gmail.com</email>
</student>
In the above example we are using <!DOCTYPE student SYSTEM "student.dtd">, which
links the rules in "student.dtd" to our "student.xml" file.
If the above XML code follows the exact rules defined in the DTD, then we can conclude
that our XML document is a valid document. Otherwise it is an invalid document.
<root>
<x>...</x>
<y>...</y>
<y>...</y>
</root>
5. Attribute: The value must be quoted. Ex. <name id="a">, where id is an attribute of
the <name> element.
6. Comments: <!-- comment text -->
7. An element name can contain any alphanumeric characters. The only punctuation
marks allowed in names are the hyphen (-), underscore (_) and period (.); the colon (:)
is reserved for namespaces. A name must begin with a letter or an underscore.
8. An element, which is a container, can contain text or elements as seen in the above
example.
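The attribute and naming rules can be tried out with a short sketch. The regular expression below is a simplified, ASCII-only approximation of the naming rule (the full XML rule also permits many non-ASCII letters), and the helper names is_allowed_name and read_id are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Simplified, ASCII-only naming rule (an assumption of this sketch):
# start with a letter or underscore, then letters, digits, '_', '.' or '-'.
NAME_RULE = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")

def is_allowed_name(name: str) -> bool:
    """Check a candidate element name against the simplified rule."""
    return NAME_RULE.fullmatch(name) is not None

def read_id(xml_text: str):
    """Return the root's id attribute, or None if it is absent or parsing fails."""
    try:
        return ET.fromstring(xml_text).get("id")
    except ET.ParseError:
        return None

for candidate in ["first-name", "_id.v2", "2nd", "first name"]:
    print(candidate, is_allowed_name(candidate))

print(read_id('<name id="a">Ravi</name>'))  # quoted attribute value
print(read_id("<name id=a>Ravi</name>"))    # unquoted value is rejected by the parser
```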
Syntax Rules for Tags and Elements
Element Syntax − Each XML element must be closed, either with a matching end tag
as shown below −
<element>....</element>
or, for an empty element, with a self-closing tag −
<element/>
9. Nesting of Elements − An XML element can contain multiple XML elements as its
children, but the child elements must not overlap, i.e., an end tag of an element
must have the same name as that of the most recent unmatched start tag.
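The closing and nesting rules can be checked with a parser. In this sketch, which uses Python's xml.etree.ElementTree, the helper name parses and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

def parses(text: str) -> bool:
    """Return True if the text is accepted by the parser, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

nested = "<a><b>text</b></a>"       # properly nested
overlapping = "<a><b>text</a></b>"  # end tags in the wrong order
unclosed = "<a><b>text</b>"         # <a> is never closed
wrong_case = "<Name>x</name>"       # names are case sensitive

for sample in [nested, overlapping, unclosed, wrong_case]:
    print(sample, parses(sample))

# Both spellings of an empty element describe an element with no content.
empty_short = ET.fromstring("<element/>")
empty_long = ET.fromstring("<element></element>")
print(len(empty_short), len(empty_long))
```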