
XMLStarlet

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Cedar101 (talk | contribs) at 09:52, 7 March 2017 (Example usage: lang="xml"). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

XMLStarlet
Original author(s): Dagobert Michelsen, Noam Postavsky, Mikhail Grushinskiy
Initial release: 8 February 2005 (2005-02-08)
Stable release: 1.6.1 / 9 August 2014 (2014-08-09)
Written in: C
Operating system: Unix-like, Windows, Cygwin, Mac OS
Type: XML parser
License: MIT License
Website: xmlstar.sourceforge.net

XMLStarlet is a set of easy-to-use command-line utilities (a toolkit) for querying, transforming, validating, and editing XML documents and files. It uses a simple set of shell commands, much as plain text is processed with the UNIX grep, sed, awk, diff, patch, and join commands.

The toolkit is aimed at users who want to test XPath queries or run commands on the fly, process many XML documents at once, or automate XML processing with shell scripts.

To run XMLStarlet, download it from the official site, then type 'xml' on the command line followed by the command or query to execute (see the examples below).

The toolkit's feature set includes the following:

  • Check or validate XML files (simple well-formedness check, DTD, XSD, RelaxNG)
  • Calculate values of XPath expressions on XML files (such as running sums)
  • Search XML files for matches to given XPath expressions
  • Apply XSLT stylesheets to XML documents (including EXSLT support, and passing parameters to stylesheets)
  • Query XML documents (e.g. for the values of some elements or attributes, with sorting)
  • Modify or edit XML documents (e.g. delete some elements)
  • Format or "beautify" XML documents (e.g. change the indentation)
  • Fetch XML documents using http:// or ftp:// URLs
  • Browse the tree structure of XML documents (in a similar way to the 'ls' command for directories)
  • Include one XML document into another using XInclude
  • XML c14n canonicalization
  • Escape/unescape special XML characters in input text
  • Print a directory as an XML document
  • Convert XML into PYX format (based on ESIS - ISO 8879), and vice versa

The XMLStarlet command-line utility is written in C and uses libxml2 and libxslt from http://xmlsoft.org/. Its extensive choice of options was only possible because of the rich feature sets of these two libraries. XMLStarlet is statically linked to both libxml2 and libxslt, so generally a single executable file is all that is needed to process XML documents.

XMLStarlet is free, open-source software released under the MIT License, which allows free use and distribution in both commercial and non-commercial projects.

After downloading XMLStarlet, extract the zip file into a directory and run the utility from there. On MS Windows, open a command prompt, go to the installation directory, and type 'xml.exe' to see the list of available options, or run an XPath query on any XML file.

Example usage

Consider the following example XML document, 'xmlfile1.xml':

<?xml version="1.0" encoding="utf-8"?>
<wikimedia>
  <projects>
    <project name="Wikipedia" launch="2001-01-05">
      <editions>
        <edition language="English">en.wikipedia.org</edition>
        <edition language="German">de.wikipedia.org</edition>
        <edition language="French">fr.wikipedia.org</edition>
        <edition language="Polish">pl.wikipedia.org</edition>
        <edition language="Spanish">es.wikipedia.org</edition>
      </editions>
    </project>
    <project name="Wiktionary" launch="2002-12-12">
      <editions>
        <edition language="English">en.wiktionary.org</edition>
        <edition language="French">fr.wiktionary.org</edition>
        <edition language="Vietnamese">vi.wiktionary.org</edition>
        <edition language="Turkish">tr.wiktionary.org</edition>
        <edition language="Spanish">es.wiktionary.org</edition>
      </editions>
    </project>
    <project name="Wikileaks" launch="2006-10-04">
     <editions>
        <edition language="English">en.wikileaks.org</edition>
     </editions>
    </project>
  </projects>
</wikimedia>

At an MS Windows command prompt, the following five XPath queries are executed on the above XML file 'xmlfile1.xml'. Note: make sure to copy xmlfile1.xml to the directory where the XMLStarlet software was unzipped.

Example 1: An XPath expression that selects the name attributes of all projects.

cmnd> xml.exe sel -t -v "//wikimedia/projects/project/@name" xmlfile1.xml
out> Wikipedia
     Wiktionary
     Wikileaks
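For readers without the tool at hand, the same selection can be approximated in Python. This is only a sketch using the standard library's xml.etree.ElementTree, which is not part of XMLStarlet and supports a limited XPath subset; the abbreviated document and the variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of xmlfile1.xml: the editions are omitted because
# this query only reads attributes of the project elements.
XML = """<wikimedia>
  <projects>
    <project name="Wikipedia" launch="2001-01-05"/>
    <project name="Wiktionary" launch="2002-12-12"/>
    <project name="Wikileaks" launch="2006-10-04"/>
  </projects>
</wikimedia>"""

root = ET.fromstring(XML)  # root is the <wikimedia> element

# ElementTree paths are evaluated relative to the root element, so the
# leading wikimedia step of the XPath expression is dropped here.
names = [project.get("name") for project in root.findall("./projects/project")]

for name in names:
    print(name)
```

The list comprehension plays the role of the trailing @name step: ElementTree paths select elements only, so the attribute is read from each matched element afterwards.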

Example 2: An XPath expression that selects all attributes of the last Wikimedia project.

cmnd> xml.exe sel -t -v "/wikimedia/projects/project[last()]/@*" xmlfile1.xml
out> Wikileaks
     2006-10-04
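The last() predicate picks the final project element among its siblings, and @* then returns every attribute of that element. A rough Python equivalent, again assuming the standard library's xml.etree.ElementTree as a stand-in for XMLStarlet and an abbreviated copy of the document, might look like this:

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of xmlfile1.xml (editions omitted; assumption for brevity).
XML = """<wikimedia>
  <projects>
    <project name="Wikipedia" launch="2001-01-05"/>
    <project name="Wiktionary" launch="2002-12-12"/>
    <project name="Wikileaks" launch="2006-10-04"/>
  </projects>
</wikimedia>"""

root = ET.fromstring(XML)

# [last()] is one of the few positional predicates ElementTree understands.
last_project = root.find("./projects/project[last()]")

# attrib is a dict that keeps the attributes in document order,
# which mirrors what the @* step returns.
values = list(last_project.attrib.values())

for value in values:
    print(value)
```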

Example 3: An XPath expression that selects the addresses of all Wiktionary editions (the text of every edition element under the project element whose name attribute is Wiktionary).

cmnd> xml.exe sel -t -v "/wikimedia/projects/project[@name='Wiktionary']/editions/edition" xmlfile1.xml
out> en.wiktionary.org
     fr.wiktionary.org
     vi.wiktionary.org
     tr.wiktionary.org
     es.wiktionary.org

Example 4: An XPath expression that selects the addresses of all Wiktionary editions whose language is neither Turkish nor Spanish.

cmnd> xml.exe sel -t -v "/wikimedia/projects/project[@name='Wiktionary']/editions/edition[@language!='Turkish' and @language!='Spanish']" xmlfile1.xml
out> en.wiktionary.org
     fr.wiktionary.org
     vi.wiktionary.org
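The two predicates work in sequence: [@name='Wiktionary'] narrows the projects, and the second predicate then filters the editions of that project. The sketch below reproduces this in Python with the standard library's xml.etree.ElementTree (a stand-in, not XMLStarlet). The attribute inequality is written as ordinary Python because support for != inside ElementTree paths depends on the Python version; the trimmed document and all names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of xmlfile1.xml: only two projects are kept.
XML = """<wikimedia>
  <projects>
    <project name="Wiktionary" launch="2002-12-12">
      <editions>
        <edition language="English">en.wiktionary.org</edition>
        <edition language="French">fr.wiktionary.org</edition>
        <edition language="Vietnamese">vi.wiktionary.org</edition>
        <edition language="Turkish">tr.wiktionary.org</edition>
        <edition language="Spanish">es.wiktionary.org</edition>
      </editions>
    </project>
    <project name="Wikileaks" launch="2006-10-04">
      <editions>
        <edition language="English">en.wikileaks.org</edition>
      </editions>
    </project>
  </projects>
</wikimedia>"""

root = ET.fromstring(XML)

# First predicate: keep only the editions of the Wiktionary project.
editions = root.findall("./projects/project[@name='Wiktionary']/editions/edition")

# Second predicate: language is neither Turkish nor Spanish. In XPath,
# @language != 'x' is false when the attribute is absent, hence the
# explicit "is not None" check.
excluded = {"Turkish", "Spanish"}
addresses = [
    edition.text
    for edition in editions
    if edition.get("language") is not None
    and edition.get("language") not in excluded
]

for address in addresses:
    print(address)
```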

Example 5: An XPath expression that selects all attributes of the editions whose position within their own list of editions is greater than or equal to 3.

cmnd> xml.exe sel -t -v "/wikimedia/projects/project/editions/edition[position() >= 3]/@*" xmlfile1.xml
out> French
     Polish
     Spanish
     Vietnamese
     Turkish
     Spanish
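The output shows that position() is counted separately inside each editions element, not across the whole document: Wikipedia contributes its third to fifth editions, Wiktionary likewise, and Wikileaks, with a single edition, contributes nothing. A minimal Python sketch of that per-parent counting, assuming the standard library's xml.etree.ElementTree in place of XMLStarlet and illustrative variable names:

```python
import xml.etree.ElementTree as ET

# Copy of xmlfile1.xml with the whitespace compacted.
XML = """<wikimedia><projects>
  <project name="Wikipedia" launch="2001-01-05"><editions>
    <edition language="English">en.wikipedia.org</edition>
    <edition language="German">de.wikipedia.org</edition>
    <edition language="French">fr.wikipedia.org</edition>
    <edition language="Polish">pl.wikipedia.org</edition>
    <edition language="Spanish">es.wikipedia.org</edition>
  </editions></project>
  <project name="Wiktionary" launch="2002-12-12"><editions>
    <edition language="English">en.wiktionary.org</edition>
    <edition language="French">fr.wiktionary.org</edition>
    <edition language="Vietnamese">vi.wiktionary.org</edition>
    <edition language="Turkish">tr.wiktionary.org</edition>
    <edition language="Spanish">es.wiktionary.org</edition>
  </editions></project>
  <project name="Wikileaks" launch="2006-10-04"><editions>
    <edition language="English">en.wikileaks.org</edition>
  </editions></project>
</projects></wikimedia>"""

root = ET.fromstring(XML)

languages = []
# Visit each <editions> parent separately, because position() restarts
# at 1 for every parent element.
for editions in root.findall("./projects/project/editions"):
    children = editions.findall("edition")
    # position() is 1-based, so position() >= 3 is the slice [2:].
    for edition in children[2:]:
        languages.extend(edition.attrib.values())

for language in languages:
    print(language)
```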

An XML document can be validated against an XSD schema saved in the file 'xsdfile.xsd' as follows:

cmnd> xml.exe val -e -s xsdfile.xsd xmlfile1.xml
out> xmlfile1.xml - valid
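Schema validation needs an XSD-aware library, but the simpler well-formedness check mentioned in the feature list can be sketched with Python's standard library alone. The helper name is_well_formed and the two sample strings are hypothetical, and this is only an approximation of the idea, not how XMLStarlet implements it.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Hypothetical samples: the second one never closes <editions>.
good = "<project name='Wikileaks'><editions/></project>"
bad = "<project name='Wikileaks'><editions></project>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

A well-formed document is not necessarily valid: validity additionally requires that the document conform to a schema such as the XSD used above.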

See also

  • XML: Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
  • XPath (XML Path Language) is a query language for selecting nodes from an XML document.
  • XSLT: Extensible Stylesheet Language Transformations (XSLT) is a language for transforming XML documents into other XML documents or into other formats such as HTML for web pages or plain text.
  • Document type definition (DTD) defines the legal building blocks of an XML document.