<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" /><link rel="SHORTCUT ICON" href="/favicon.ico" /><style type="text/css">
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
</style><title>Entities or no entities</title></head><body bgcolor="#8b7765" text="#000000" link="#a06060" vlink="#000000"><table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr><td width="120"><a href="http://swpat.ffii.org/"><img src="epatents.png" alt="Action against software patents" /></a></td><td width="180"><a href="http://www.gnome.org/"><img src="gnome2.png" alt="Gnome2 Logo" /></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo" /></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo" /></a><div align="left"><a href="http://xmlsoft.org/"><img src="Libxml2-Logo-180x168.gif" alt="Made with Libxml2 Logo" /></a></div></td><td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center"><h1>The XML C parser and toolkit of Gnome</h1><h2>Entities or no entities</h2></td></tr></table></td></tr></table></td></tr></table><table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr><td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Developer Menu</b></center></td></tr><tr><td bgcolor="#fffacd"><form action="search.php" enctype="application/x-www-form-urlencoded" method="get"><input name="query" type="text" size="20" value="" /><input name="submit" type="submit" value="Search ..." 
/></form><ul><li><a href="index.html" style="font-weight:bold">Main Menu</a></li><li><a href="html/index.html" style="font-weight:bold">Reference Manual</a></li><li><a href="examples/index.html" style="font-weight:bold">Code Examples</a></li><li><a href="guidelines.html">XML Guidelines</a></li><li><a href="tutorial/index.html">Tutorial</a></li><li><a href="xmlreader.html">The Reader Interface</a></li><li><a href="ChangeLog.html">ChangeLog</a></li><li><a href="XSLT.html">XSLT</a></li><li><a href="python.html">Python and bindings</a></li><li><a href="architecture.html">libxml2 architecture</a></li><li><a href="tree.html">The tree output</a></li><li><a href="interface.html">The SAX interface</a></li><li><a href="xmlmem.html">Memory Management</a></li><li><a href="xmlio.html">I/O Interfaces</a></li><li><a href="library.html">The parser interfaces</a></li><li><a href="entities.html">Entities or no entities</a></li><li><a href="namespaces.html">Namespaces</a></li><li><a href="upgrade.html">Upgrading 1.x code</a></li><li><a href="threads.html">Thread safety</a></li><li><a href="DOM.html">DOM Principles</a></li><li><a href="example.html">A real example</a></li><li><a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a></li></ul></td></tr></table><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>API Indexes</b></center></td></tr><tr><td bgcolor="#fffacd"><ul><li><a href="APIchunk0.html">Alphabetic</a></li><li><a href="APIconstructors.html">Constructors</a></li><li><a href="APIfunctions.html">Functions/Types</a></li><li><a href="APIfiles.html">Modules</a></li><li><a href="APIsymbols.html">Symbols</a></li></ul></td></tr></table><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr><tr><td bgcolor="#fffacd"><ul><li><a href="http://mail.gnome.org/archives/xml/">Mail 
archive</a></li><li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li><li><a href="http://phd.cs.unibo.it/gdome2/">DOM gdome2</a></li><li><a href="http://www.aleksey.com/xmlsec/">XML-DSig xmlsec</a></li><li><a href="ftp://xmlsoft.org/">FTP</a></li><li><a href="http://www.zlatkovic.com/projects/libxml/">Windows binaries</a></li><li><a href="http://opencsw.org/packages/libxml2">Solaris binaries</a></li><li><a href="http://www.explain.com.au/oss/libxml2xslt.html">MacOsX binaries</a></li><li><a href="http://codespeak.net/lxml/">lxml Python bindings</a></li><li><a href="http://cpan.uwinnipeg.ca/dist/XML-LibXML">Perl bindings</a></li><li><a href="http://libxmlplusplus.sourceforge.net/">C++ bindings</a></li><li><a href="http://www.zend.com/php5/articles/php5-xmlphp.php#Heading4">PHP bindings</a></li><li><a href="http://sourceforge.net/projects/libxml2-pas/">Pascal bindings</a></li><li><a href="http://libxml.rubyforge.org/">Ruby bindings</a></li><li><a href="http://tclxml.sourceforge.net/">Tcl bindings</a></li><li><a href="http://bugzilla.gnome.org/buglist.cgi?product=libxml2">Bug Tracker</a></li></ul></td></tr></table></td></tr></table></td><td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd"><p>Entities in principle are similar to simple C macros. An entity defines an
abbreviation for a given string that you can reuse many times throughout the
content of your document. Entities are especially useful when a given string
occurs frequently within a document, or when you want to confine future
changes to one place: the internal subset at the beginning of the document.
Example:</p><pre>1 &lt;?xml version="1.0"?&gt;
2 &lt;!DOCTYPE EXAMPLE SYSTEM "example.dtd" [
3 &lt;!ENTITY xml "Extensible Markup Language"&gt;
4 ]&gt;
5 &lt;EXAMPLE&gt;
6 &amp;xml;
7 &lt;/EXAMPLE&gt;</pre><p>Line 3 declares the xml entity. Line 6 references it by prefixing
its name with '&amp;' and following it with ';', with no spaces in between.
libxml2 recognizes the five entities predefined by XML, which let you escape
characters that have a special meaning in some parts of the XML document content:
<strong>&amp;lt;</strong> for the character '&lt;', <strong>&amp;gt;</strong>
for the character '&gt;', <strong>&amp;apos;</strong> for the character ''',
<strong>&amp;quot;</strong> for the character '"', and
<strong>&amp;amp;</strong> for the character '&amp;'.</p><p>One of the problems related to entities is that you may want the parser to
substitute an entity's content so that you can see the replacement text in
your application. Or you may prefer to keep entity references as such in the
content, so that you can save the document back without losing this usually
precious information (if the user went through the pain of explicitly
defining entities, they may have a rather negative attitude if you blindly
substitute them at save time). The <a href="html/libxml-parser.html#xmlSubstituteEntitiesDefault">xmlSubstituteEntitiesDefault()</a>
function lets you check and change this behaviour; by default, entities are
not substituted.</p><p>Here is the DOM tree built by libxml2 for the previous document in the
default case:</p><pre>/gnome/src/gnome-xml -&gt; ./xmllint --debug test/ent1
DOCUMENT
version=1.0
ELEMENT EXAMPLE
TEXT
content=
ENTITY_REF
INTERNAL_GENERAL_ENTITY xml
content=Extensible Markup Language
TEXT
content=</pre><p>And here is the result when substituting entities:</p><pre>/gnome/src/gnome-xml -&gt; ./tester --debug --noent test/ent1
DOCUMENT
version=1.0
ELEMENT EXAMPLE
TEXT
content= Extensible Markup Language</pre><p>So, entities or no entities? Basically, it depends on your use case. I
suggest that you keep the non-substituting default behaviour and avoid using
entities in your XML document or data if you are not willing to handle the
entity reference elements in the DOM tree.</p><p>Note that at save time libxml2 converts characters to the predefined
entities where necessary to prevent well-formedness problems, and at parse
time it transparently replaces predefined entity references with the
corresponding characters (i.e. it will not generate entity
reference elements in the DOM tree or call the reference() SAX callback when
finding them in the input).</p><p><span style="background-color: #FF0000">WARNING</span>: handling entities
on top of the libxml2 SAX interface is difficult!!! If you plan to use
non-predefined entities in your documents, then the learning curve to handle
them using the SAX API may be long. If you plan to use complex documents, I
strongly suggest you consider using the DOM interface instead and let libxml2
deal with the complexity rather than trying to do it yourself.</p><p><a href="bugs.html">Daniel Veillard</a></p></td></tr></table></td></tr></table></td></tr></table></td></tr></table></td></tr></table></body></html>