The books in the series, useful for joins to get title, etc…
This data is generated by extracting all TEI <text type="book"> nodes in the Digital Edition of Terra Ignota.
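As a rough illustration of the kind of join this table is meant for, the sketch below attaches book titles to a second table that refers to books by number only. The `chapters` table and its columns are hypothetical stand-ins, not part of this data file; the single `books` row reuses the first book's values shown on this page.

```r
library(dplyr)

# First row of the books table, using the values shown on this page.
books <- tibble(
  book  = 1L,
  id    = "TLtL",
  title = "TOO LIKE THE LIGHTNING"
)

# Hypothetical table that refers to books by number only.
chapters <- tibble(
  book    = c(1L, 1L),
  chapter = c(1L, 2L)
)

# Attach the book title to each chapter row via the shared `book` column.
chapters_with_titles <- chapters %>%
  left_join(select(books, book, title), by = "book")
```

Any table that carries the book number can be joined the same way; joining on `id` would work equally well for tables that store the xml:id instead.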
What Are <text> Nodes For?
These <text type="book"> nodes contain the text of the novels in the series.
Here’s a simplified view of what these nodes look like:
<text xml:id="TLtL" type="book" n="1">
  <front>
    ...
    <div>
      <title type="main">TOO LIKE THE LIGHTNING</title>
      <title type="desc">A Narrative of Events of the year 2454</title>
      <byline>Written by MYCROFT CANNER, at the request of certain parties.</byline>
    </div>
    ...
  </front>
  <body>
    <div n="1" type="chapter"></div>
    ...
  </body>
</text>
The data dictionary below maps the information above to a column in the data file.
The data extracted from these <text type="book"> nodes is available as a CSV file.
This file was last updated on 2022-10-14.
The raw data is first extracted from the <text type="book"> nodes using an XQuery script.
For easy ingestion with the XML package in R, the script’s output has a <records> root node and one <book> node per book in the original text.
xquery version "3.1";
declare namespace tei = "http://www.tei-c.org/ns/1.0";
declare variable $doc := doc("PATH_TO_TEI_FILE");

<records>
{
  for $book in $doc//tei:text[@type = "book"]
  return
    <book>
      ...
      A node per column in the output file, see below for details
      ...
    </book>
}
</records>
The XQuery script produces an XML file of the form:
<?xml version="1.0" encoding="UTF-8"?>
<records>
  <book>
    <book>1</book>
    <id>TLtL</id>
    <title>TOO LIKE THE LIGHTNING</title>
    <subtitle>A Narrative of Events of the year 2454</subtitle>
    <byline>Written by MYCROFT CANNER, at the request of certain parties.</byline>
  </book>
  etc...
</records>
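To see how a file of this shape turns into a data frame, here is a minimal sketch that parses an inline copy of the single record shown above with the XML package. The inline string and the names `xml_string`, `doc`, and `books_raw` are placeholders for illustration; the real workflow reads the script's output file instead.

```r
library(XML)

# Inline stand-in for the XQuery output, holding only the record shown above.
xml_string <- '<?xml version="1.0" encoding="UTF-8"?>
<records>
  <book>
    <book>1</book>
    <id>TLtL</id>
    <title>TOO LIKE THE LIGHTNING</title>
    <subtitle>A Narrative of Events of the year 2454</subtitle>
    <byline>Written by MYCROFT CANNER, at the request of certain parties.</byline>
  </book>
</records>'

# Parse the string, then flatten it: each child of the root becomes one row
# and each grandchild becomes one column.
doc <- xmlParse(xml_string, asText = TRUE)
books_raw <- xmlToDataFrame(doc, stringsAsFactors = FALSE)
```

At this point every column, including `book`, is still text, which is why a cleaning step is needed.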
This XML output file must then be cleaned before being saved to the CSV file provided above.
First, the correct data type must be set for each column, and missing values must be set to NA. This is straightforward with the XML and tidyverse packages.
library(XML)
library(tidyverse)

# xml_path is the path to the XML file produced by the XQuery script
books <- xmlToDataFrame(xml_path) %>%
  mutate(
    # Missing values must be _NA_
    book = na_if(book, "NA"),
    id = na_if(id, "NA"),
    title = na_if(title, "NA"),
    subtitle = na_if(subtitle, "NA"),
    byline = na_if(byline, "NA"),
    # Column data types must be correct
    book = as.integer(book)
  )
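The same cleaning can be written more compactly with `across()`. This is an alternative sketch, not the code used to build the published file. The `books_raw` tibble stands in for the output of `xmlToDataFrame()`, and its second row is invented purely to show how the "NA" placeholder is handled.

```r
library(dplyr)

# Stand-in for the parsed XML: every column arrives as text.
# The second row is made up to demonstrate missing values.
books_raw <- tibble(
  book     = c("1", "2"),
  id       = c("TLtL", "HYPOTHETICAL_ID"),
  title    = c("TOO LIKE THE LIGHTNING", "HYPOTHETICAL TITLE"),
  subtitle = c("A Narrative of Events of the year 2454", "NA"),
  byline   = c("Written by MYCROFT CANNER, at the request of certain parties.", "NA")
)

books <- books_raw %>%
  mutate(
    # Turn the "NA" placeholder into a real missing value in every column
    across(everything(), ~ na_if(.x, "NA")),
    # Only the book number needs a non-text type
    book = as.integer(book)
  )
```

Using `across(everything(), ...)` avoids listing each column by name, at the cost of being less explicit about which columns are expected.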
NB: Luckily for you, when you read this data in as a CSV file, the readr package is smart enough to guess all of this correctly.
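If you would rather not rely on guessing, the column types can be spelled out when reading the file. The sketch below reads an inline stand-in for the CSV (only the row shown on this page); with the real file, the first argument would be the path or URL of the CSV instead of `I(csv_text)`.

```r
library(readr)

# Inline stand-in for the CSV file, containing only the row shown on this page.
# The byline is quoted because it contains a comma.
csv_text <- 'book,id,title,subtitle,byline
1,TLtL,TOO LIKE THE LIGHTNING,A Narrative of Events of the year 2454,"Written by MYCROFT CANNER, at the request of certain parties."
'

books <- read_csv(
  I(csv_text),
  col_types = cols(
    book = col_integer(),        # book number as an integer
    .default = col_character()   # everything else as text
  ),
  na = "NA"                      # the placeholder used for missing values
)
```

Left to its own devices, readr would typically read `book` as a double rather than an integer, which rarely matters for joins but is worth knowing.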
There are currently 4 of 4 books present in the Digital Edition.
A list of the columns in the data file, explaining what each one means and how it was generated.
book
The number of the book within the series.
Required, numeric.
Derived from the n attribute of the <text type="book"> node itself.
<book>
{
  if ($book/@n)
  then data($book/@n)
  else "NA"
}
</book>
id
A unique identifier used within the Digital Edition.
Required, unique, must conform to xml:id requirements, e.g. can’t start with a number.
Derived from the xml:id attribute of the <text type="book"> node itself.
<id>
{
  if ($book/@xml:id)
  then data($book/@xml:id)
  else "NA"
}
</id>
title
The title of the book.
Alphanumeric.
Derived from the content of the <title type="main"> node inside the book’s <front> node.
<title>
{
  if ($book/tei:front//tei:title[@type = "main"])
  then data($book/tei:front//tei:title[@type = "main"])
  else "NA"
}
</title>
subtitle
The subtitle of the book.
Alphanumeric.
Derived from the content of the <title type="desc"> node inside the book’s <front> node.
<subtitle>
{
  if ($book/tei:front//tei:title[@type = "desc"])
  then data($book/tei:front//tei:title[@type = "desc"])
  else "NA"
}
</subtitle>
byline
The byline of the book.
Alphanumeric.
Derived from the content of the <byline> node inside the book’s <front> node.
<byline>
{
  if ($book/tei:front//tei:byline)
  then data($book/tei:front//tei:byline)
  else "NA"
}
</byline>
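The constraints stated in this data dictionary (book is required and numeric; id is required, unique, and a valid xml:id) can be checked after loading the data. The following is a minimal base R sketch on a one-row stand-in table. The regular expression is a simplified, ASCII-only approximation of the xml:id naming rules and is an assumption of this example, not part of the original workflow.

```r
# Stand-in for the loaded data file, using the row shown on this page.
books <- data.frame(
  book     = 1L,
  id       = "TLtL",
  title    = "TOO LIKE THE LIGHTNING",
  subtitle = "A Narrative of Events of the year 2454",
  byline   = "Written by MYCROFT CANNER, at the request of certain parties.",
  stringsAsFactors = FALSE
)

# Simplified ASCII-only approximation of a valid xml:id:
# starts with a letter or underscore, then letters, digits, ".", "_" or "-".
xml_id_pattern <- "^[A-Za-z_][A-Za-z0-9._-]*$"

checks <- c(
  book_present = !anyNA(books$book),
  book_integer = is.integer(books$book),
  id_present   = !anyNA(books$id),
  id_unique    = !any(duplicated(books$id)),
  id_valid     = all(grepl(xml_id_pattern, books$id))
)

# Stops with an error if any check is FALSE.
stopifnot(checks)
```

Inspecting `checks` directly shows which rule failed when the data does not conform.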
If you see mistakes or want to suggest changes, please create an issue on the source repository.
For attribution, please cite this work as
Glachant (2022, Oct. 14). Data Ignota: Books. Retrieved from https://syvwlch.github.io/Data-Ignota/tei/books/
BibTeX citation
@misc{glachant2022books,
  author = {Glachant, Mathieu},
  title = {Data Ignota: Books},
  url = {https://syvwlch.github.io/Data-Ignota/tei/books/},
  year = {2022}
}