Thursday, November 5, 2009

XML Interview Questions

What are the different kinds of parsers used in XML?
There are two main kinds of parser:
1) DOM (Document Object Model): interprets the complete XML document and holds it in memory as a tree. Microsoft's tooling has mainly concentrated on the DOM parser.
2) SAX (Simple API for XML): interprets the XML document as a series of events, so it does not load the complete document at one time. Sun Microsystems' tooling has mainly concentrated on the SAX parser.

What is XPath?
XPath is a language used to navigate through elements and attributes in an XML document.
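
As a rough illustration of navigating elements and attributes, here is a minimal sketch using Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. The catalog document, its element names, and its attribute values are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
CATALOG = """
<catalog>
  <book id="b1" lang="en"><title>XML Basics</title><price>25</price></book>
  <book id="b2" lang="fr"><title>XPath en pratique</title><price>30</price></book>
</catalog>
"""

root = ET.fromstring(CATALOG)

# Path expression: every <title> that is a child of a <book> under the root.
titles = [t.text for t in root.findall("./book/title")]

# Predicate on an attribute: only books whose lang attribute equals 'en'.
english_ids = [b.get("id") for b in root.findall("./book[@lang='en']")]

print(titles)
print(english_ids)
```

A full XPath processor offers many more axes and functions than this subset; the sketch only shows the idea of addressing nodes by path and by attribute.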

What is the difference between XML and HTML?
1) XML is not a replacement for HTML.
2) XML and HTML were designed with different goals.
3) XML was designed to describe data and to focus on what data is.
4) HTML was designed to display data and to focus on how data looks.
5) HTML is about displaying information; XML is about describing information.
XML:
* User-definable tags
* Content driven
* End tags required for well-formed documents
* Quotes required around attribute values
* Slash required in empty tags
HTML:
* Defined set of tags designed for web display
* Format driven
* End tags not required
* Quotes not required
* Slash not required
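
The stricter XML rules listed above (end tags, quoted attribute values, slash in empty tags) can be checked with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the helper name is_well_formed and the sample strings are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Quoted attribute value and a slash in the empty tag: accepted.
print(is_well_formed('<img src="a.png"/>'))
# Missing end tag: rejected.
print(is_well_formed('<p>no end tag'))
# Unquoted attribute value: rejected.
print(is_well_formed('<img src=a.png/>'))
# Empty tag without a slash (fine in HTML): rejected as XML.
print(is_well_formed('<br>'))
```

This only tests well-formedness; it does not validate a document against a DTD or schema.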

What is XML and Binary Serialization?
XML Serialization serializes the object into an xml file. This file is human readable and can be shared with other applications.

Binary serialization is more efficient but the serialized file is in binary format. It may not make any sense for a human being to open this file and understand what it contains. It is a stream of bytes.
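
The contrast can be sketched in Python, assuming the standard xml.etree.ElementTree module for the XML form and the pickle module as a stand-in for binary serialization. The post does not name a platform, so this is an illustration of the idea rather than the API the question has in mind. The person record and the helper names to_xml and from_xml are hypothetical.

```python
import pickle
import xml.etree.ElementTree as ET

def to_xml(person: dict) -> bytes:
    """Serialize a flat dict of strings into a small XML document."""
    root = ET.Element("person")
    for key, value in person.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="utf-8")

def from_xml(data: bytes) -> dict:
    """Read the XML document back into a dict of strings."""
    root = ET.fromstring(data)
    return {child.tag: child.text for child in root}

person = {"name": "Ada", "city": "London"}  # hypothetical record

xml_bytes = to_xml(person)            # human-readable text
binary_bytes = pickle.dumps(person)   # opaque stream of bytes

print(xml_bytes.decode("utf-8"))
print(binary_bytes[:20])
```

The XML output can be read by any application with an XML parser, while the pickle output is only meaningful to Python and should never be loaded from an untrusted source.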

What is XSL?
XSL (Extensible Stylesheet Language) is a family of three languages:
* XSLT - a language for transforming XML documents. XSLT is used to transform an XML document into another XML document, or another type of document that is recognized by a browser, like HTML and XHTML. Normally XSLT does this by transforming each XML element into an (X)HTML element.
* XPath - a language for navigating in XML documents
* XSL-FO - a language for formatting XML documents
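
Python's standard library has no XSLT processor, so the sketch below is not XSLT. It only imitates, with xml.etree.ElementTree, the idea of transforming each XML element into an (X)HTML element. The TAG_MAP table, the element names, and the fallback to span are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source element names to HTML element names,
# playing the role that template rules play in an XSLT stylesheet.
TAG_MAP = {"article": "div", "headline": "h1", "para": "p"}

def transform(element: ET.Element) -> ET.Element:
    """Return a copy of the tree with each element renamed via TAG_MAP."""
    out = ET.Element(TAG_MAP.get(element.tag, "span"))
    out.text = element.text
    out.tail = element.tail
    for child in element:
        out.append(transform(child))
    return out

source = ET.fromstring(
    "<article><headline>Hello</headline><para>First paragraph.</para></article>"
)
html = ET.tostring(transform(source), encoding="unicode")
print(html)  # <div><h1>Hello</h1><p>First paragraph.</p></div>
```

A real stylesheet would express the same mapping declaratively as template rules and could also reorder, filter, or generate content.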

What are DTD and Schema in XML?

A DTD is:

The XML Document Type Declaration contains or points to markup declarations that provide a grammar for a class of documents. This grammar is known as a document type definition or DTD.

The DTD can point to an external subset containing markup declarations, or can contain the markup declarations directly in an internal subset, or can even do both.

A Schema is:

XML Schemas express shared vocabularies and allow machines to carry out rules made by people. They provide a means for defining the structure, content and semantics of XML documents.

In summary, schemas are a richer and more powerful way of describing information than is possible with DTDs.
What is XML?
XML is the Extensible Markup Language. It improves the functionality
of the Web by letting you identify your information in a more accurate,
flexible, and adaptable way. It is extensible because it is not
a fixed format like HTML: it lets you define your own markup. It can do
this because it’s written in SGML, the international standard metalanguage for
text document markup (ISO 8879).
What is a markup language?
A markup language is a set of words and symbols for describing
the identity of pieces of a document (for example ‘this is
a paragraph’, ‘this is a heading’, ‘this
is a list’, ‘this is the caption of this figure’,
etc). Programs can use this with a style sheet to create output
for screen, print, audio, video, Braille, etc.

Some markup languages (e.g. those used in word processors) only describe
appearances (‘this is italics’, ‘this is bold’),
but this method can only be used for display, and is not normally
re-usable for anything else.
Where should I use XML?
The XML specification describes its purpose as follows: ‘Its goal is
to enable generic SGML to be served, received, and
processed on the Web in the way that is now possible with HTML.
XML has been designed for ease of implementation and for interoperability
with both SGML and HTML.’
Despite early attempts, browsers never allowed other SGML, only
HTML (although there were plugins), and they allowed it (even encouraged
it) to be corrupted or broken, which held development back for over
a decade by making it impossible to program for it reliably. XML
fixes that by making it compulsory to stick to the rules, and by
making the rules much simpler than SGML.
But XML is not just for Web pages: in fact it’s very rarely used
for Web pages on its own because browsers still don’t provide reliable
support for formatting and transforming it. Common uses for XML:
* Information identification: because you can define your own markup,
you can define meaningful names for all your information items.
* Information storage: because XML is portable and non-proprietary,
it can be used to store textual information across any platform.
Because it is backed by an international standard, it will remain
accessible and processable as a data format.
* Information structure: XML can therefore be used to store and
identify any kind of (hierarchical) information structure, especially
for long, deep, or complex document sets or data sources, making it
ideal for an information-management back-end to serving the Web. This
is its most common Web application, with a transformation system to
serve it as HTML until such time as browsers are able to handle XML
consistently.
* Publishing: the original goal of XML as defined in the quotation at
the start of this section. Combining the three previous topics
(identity, storage, structure) means it is possible to get all the
benefits of robust document management and control (with XML) and
publish to the Web (as HTML) as well as to paper (as PDF) and to other
formats (e.g. Braille, audio, etc) from a single source document by
using the appropriate stylesheets.
* Messaging and data transfer: XML is also very heavily used for
enclosing or encapsulating information in order to pass it between
different computing systems which would otherwise be unable to
communicate. By providing a lingua franca for data identity and
structure, it provides a common envelope for inter-process
communication (messaging).
* Web services: building on all of these, as well as its use in
browsers, machine-processable data can be exchanged between consenting
systems, where before it was only comprehensible by humans (HTML).
Weather services, e-commerce sites, blog newsfeeds, AJAX sites, and
thousands of other data-exchange services use XML for data management
and transmission, and the web browser for display and interaction.
Why is XML such an important development?
It removes two constraints which were holding back Web developments:
1. dependence on a single, inflexible document type (HTML) which
was being much abused for tasks it was never designed for;
2. the complexity of full SGML, whose syntax allows many powerful
but hard-to-program options.
XML allows the flexible development of user-defined document types.
It provides a robust, non-proprietary, persistent, and verifiable
file format for the storage and transmission of text and data both
on and off the Web; and it removes the more complex options of SGML,
making it easier to program for.
What is SGML?
SGML is the Standard Generalized Markup Language (ISO 8879:1986),
the international standard for defining descriptions of the structure
of different types of electronic document. There is an SGML FAQ
from David Megginson, and Robin Cover maintains a set of SGML Web
pages. For a little light relief, try Joe English’s ‘Not the SGML
FAQ’.

SGML is very large, powerful, and complex. It has been in heavy
industrial and commercial use for nearly two decades, and there
is a significant body of expertise and software to go with it.
XML is a lightweight cut-down version of SGML which keeps enough
of its functionality to make it useful but removes all the optional
features which made SGML too complex to program for in a Web environment.
Aren’t XML, SGML, and HTML all the same thing?
Not quite; SGML is the mother tongue, and has been used for describing
thousands of different document types in many fields of human activity,
from transcriptions of ancient Irish manuscripts to the technical
documentation for stealth bombers, and from patients’ clinical records
to musical notation. SGML is very large and complex, however, and
probably overkill for most common office desktop applications.
XML is an abbreviated version of SGML, to make it easier to use
over the Web, easier for you to define your own document types,
and easier for programmers to write programs to handle them. It
omits all the complex and less-used options of SGML in return for
the benefits of being easier to write applications for, easier to
understand, and more suited to delivery and interoperability over
the Web. But it is still SGML, and XML files may still be processed
in the same way as any other SGML file.
HTML is just one of many SGML or XML applications: the one
most frequently used on the Web.
Technical readers may find it more useful to think of XML as being
SGML-- rather than HTML++.
Give a few examples of types of applications that can
benefit from using XML.
There are literally thousands of applications that can benefit
from XML technologies. The point of this question is not to have
the candidate rattle off a laundry list of projects that they have
worked on, but, rather, to allow the candidate to explain the rationale
for choosing XML by citing a few real world examples. For instance,
one appropriate answer is that XML allows content management systems
to store documents independently of their format, which thereby
reduces data redundancy. Another answer relates to B2B exchanges
or supply chain management systems. In these instances, XML provides
a mechanism for multiple companies to exchange data according to
an agreed upon set of rules. A third common response involves wireless
applications that require WML to render data on hand held devices.
What is DOM and how does it relate to XML?
The Document Object Model (DOM) is an interface specification maintained
by the W3C DOM Workgroup that defines an application independent
mechanism to access, parse, or update XML data. In simple terms
it is a hierarchical model that allows developers to manipulate
XML documents easily. Any developer that has worked extensively with
XML should be able to discuss the concept and use of DOM objects
freely. Additionally, it is not unreasonable to expect advanced
candidates to thoroughly understand its internal workings and be
able to explain how DOM differs from an event-based interface such as SAX.
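
To make the contrast between a tree-based and an event-based interface concrete, here is a minimal sketch using Python's standard xml.dom.minidom and xml.sax modules. The order document and the ItemHandler class are hypothetical and exist only for this example.

```python
import xml.dom.minidom
import xml.sax

DOC = "<order><item>pen</item><item>ink</item></order>"  # hypothetical data

# DOM: the whole document is parsed into a tree that can be queried.
dom = xml.dom.minidom.parseString(DOC)
dom_items = [node.firstChild.data for node in dom.getElementsByTagName("item")]

# Because the tree is held in memory, it can also be updated and re-serialized.
new_item = dom.createElement("item")
new_item.appendChild(dom.createTextNode("paper"))
dom.documentElement.appendChild(new_item)
updated = dom.documentElement.toxml()

class ItemHandler(xml.sax.ContentHandler):
    """Event-based: collect <item> text as the parser reports each event."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "item":
            self._in_item = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so it is buffered.
        if self._in_item:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "item":
            self.items.append("".join(self._buffer))
            self._in_item = False

handler = ItemHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)

print(dom_items, handler.items)
print(updated)
```

The DOM version keeps the complete document available for random access and modification, while the event-based version sees each element once and keeps only the state the handler chooses to store.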
What is SOAP and how does it relate to XML?
The Simple Object Access Protocol (SOAP) uses XML to define a protocol
for the exchange of information in distributed computing environments.
SOAP consists of three components: an envelope, a set of encoding
rules, and a convention for representing remote procedure calls.
Unless experience with SOAP is a direct requirement for the open
position, knowing the specifics of the protocol, or how it can be
used in conjunction with HTTP, is not as important as identifying
it as a natural application of XML.
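
As an illustration of SOAP being an application of XML, the sketch below builds and reads back a minimal envelope with Python's standard xml.etree.ElementTree. The envelope namespace is the SOAP 1.1 one; the application namespace, the GetPrice operation, and the Symbol parameter are hypothetical and belong to no real service.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/stock"  # hypothetical application namespace

def build_request(symbol: str) -> str:
    """Wrap a hypothetical GetPrice call in a SOAP envelope and body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}GetPrice")
    ET.SubElement(call, f"{{{APP_NS}}}Symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")

message = build_request("ACME")

# The receiver is just another XML consumer: parse and navigate by namespace.
parsed = ET.fromstring(message)
symbol = parsed.find(
    f"{{{SOAP_NS}}}Body/{{{APP_NS}}}GetPrice/{{{APP_NS}}}Symbol"
).text

print(message)
print(symbol)
```

In practice such a message would typically be sent as the body of an HTTP POST; the transport is left out here to keep the sketch self-contained.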
Why not just carry on extending HTML?
HTML was already overburdened with dozens of interesting but incompatible
inventions from different manufacturers, because it provides only
one way of describing your information.
XML allows groups of people or organizations to create
their own customized markup applications for exchanging information
in their domain (music, chemistry, electronics, hill-walking, finance,
surfing, petroleum geology, linguistics, cooking, knitting, stellar
cartography, history, engineering, rabbit-keeping,
mathematics, genealogy, etc).
HTML is now well beyond the limit of its usefulness as a way of
describing information, and while it will continue to play an important
role for the content it currently represents, many new applications
require a more robust and flexible infrastructure.
Why should I use XML?
Here are a few reasons for using XML (in no particular order).
Not all of these will apply to your own requirements, and you may
have additional reasons not mentioned here (if so, please let the
editor of the FAQ know!).
* XML can be used to describe and identify information accurately
and unambiguously, in a way that computers can be programmed to
‘understand’ (well, at least manipulate as if they could).
* XML allows documents which are all the same type to be created
consistently and without structural errors, because it provides
a standardized way of describing, controlling, or allowing/disallowing
particular types of document structure. [Note that this has absolutely
nothing whatever to do with formatting, appearance, or the actual
text content of your documents, only the structure of them.]
* XML provides a robust and durable format for information storage
and transmission. Robust because it is based on a proven standard,
and can thus be tested and verified; durable because it uses plain-text
file formats which will outlast proprietary binary ones.
* XML provides a common syntax for messaging systems for the exchange
of information between applications. Previously, each messaging
system had its own format and all were different, which made inter-system
messaging unnecessarily messy, complex, and expensive. If everyone
uses the same syntax it makes writing these systems much faster
and more reliable.
* XML is free. Not just free of charge (free as in beer) but free
of legal encumbrances (free as in speech). It doesn’t belong to
anyone, so it can’t be hijacked or pirated. And you don’t have to
pay a fee to use it (you can of course choose to use commercial
software to deal with it, for lots of good reasons, but you don’t
pay for XML itself).
* XML information can be manipulated programmatically (under machine
control), so XML documents can be pieced together from disparate
sources, or taken apart and re-used in different ways. They can
be converted into almost any other format with no loss of information.
* XML lets you separate form from content. Your XML file contains
your document information (text, data) and identifies its structure:
your formatting and other processing needs are identified separately
in a style sheet or processing system. The two are combined at output
time to apply the required formatting to the text or data identified
by its structure (location, position, rank, order, or whatever).
How would you build a search engine for large volumes
of XML data?
The way candidates answer this question may provide insight into
their view of XML data. For those who view XML primarily as a way
to denote structure for text files, a common answer is to build
a full-text search and handle the data similarly to the way Internet
portals handle HTML pages. Others consider XML as a standard way
of transferring structured data between disparate systems. These
candidates often describe some scheme of importing XML into a relational
or object database and relying on the database’s engine for searching.
Lastly, candidates that have worked with vendors specializing in
this area often say that the best way to handle this situation
is to use a third-party software package optimized for XML data.
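
The second approach, importing XML into a relational database and letting its engine do the searching, can be sketched with Python's standard sqlite3 and xml.etree.ElementTree modules. The product feed, the table layout, and the load helper are assumptions made for this example; a real system would also need indexing and incremental loading.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML feed, used only for illustration.
FEED = """
<products>
  <product sku="A1"><name>Blue kettle</name><price>19.50</price></product>
  <product sku="B2"><name>Red teapot</name><price>24.00</price></product>
  <product sku="C3"><name>Blue mug</name><price>6.25</price></product>
</products>
"""

def load(conn: sqlite3.Connection, xml_text: str) -> None:
    """Flatten each <product> element into one row of a relational table."""
    conn.execute(
        "CREATE TABLE product (sku TEXT PRIMARY KEY, name TEXT, price REAL)"
    )
    rows = [
        (p.get("sku"), p.findtext("name"), float(p.findtext("price")))
        for p in ET.fromstring(xml_text).iter("product")
    ]
    conn.executemany("INSERT INTO product VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(conn, FEED)

# The search itself is now an ordinary SQL query.
matches = conn.execute(
    "SELECT sku FROM product WHERE name LIKE ? AND price < ? ORDER BY sku",
    ("%Blue%", 10),
).fetchall()
print(matches)  # [('C3',)]
```

This works well when the XML is regular and record-like; deeply nested or mixed-content documents are harder to flatten into tables, which is one reason the other answers exist.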

Does XML replace HTML?
No. XML itself does not replace HTML. Instead, it provides an alternative
which allows you to define your own set of markup elements. HTML
is expected to remain in common use for some time to come, and
XHTML is a reformulation of HTML in XML syntax. XML is designed to make
the writing of DTDs much simpler than with full SGML.


  1. You might also want to look at vtd-xml, the next generation XML processing model that is far more powerful than DOM and SAX
