
About the workshop
XML (eXtensible Markup Language) is a ubiquitous data format across the social sciences and humanities. It is closely related to the HTML that structures web pages, and it powers an array of digital resources such as library catalogs, vast scientific datasets, API responses, and digital textual editions. This workshop introduces XPath (XML Path Language), the query language designed specifically for traversing, analyzing, and parsing XML datasets. With its simple syntax, XPath offers a straightforward mechanism for interrogating XML data, allowing researchers to identify patterns, spot inconsistencies, and ask questions of their XML without any previous knowledge of programming languages or query syntaxes.
This workshop is aimed at anyone who works with XML data and will give participants hands-on experience using XPath. Using the Folger Shakespeare corpus as a sample dataset, the workshop will outline how to construct and execute XPath queries in the oXygen XML Editor and will demonstrate how participants can answer research questions of varying complexity about their data (for example, "What is the average length of Hamlet's soliloquies? To whom does he speak most often? Who speaks the highest number of verse lines across all of Shakespeare's plays?").
Researchers are encouraged to bring their own XML datasets and questions to the workshop to serve as real-life examples that, if time permits, can be addressed collaboratively.
Requirements
- No prior programming knowledge or experience with query languages is required
- Some familiarity with XML data is helpful, but not required