The chapters, useful for joins to get titles, etc…
This data is generated by extracting all TEI <div type="chapter"> child nodes of <body> nodes in the Digital Edition of Terra Ignota.
What Are <div type="chapter"> Nodes For?
These <div type="chapter"> nodes contain the text of the chapters in each book.
Here’s a simplified view of what these nodes look like:
<text type="book" n="1">
  <body>
    <div n="1" type="chapter" resp="#MaG">
      <head type="chapterNumber">Chapter the FIRST</head>
      <head type="chapterTitle">A Prayer to the Reader</head>
      <p>You will criticize me, reader, ...</p>
      ...
    </div>
  </body>
</text>
The data dictionary below maps each piece of information above to a column in the data file.
The data extracted from these <div type="chapter">
nodes is available as a CSV file.
This file was last updated on 2022-11-03.
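As a rough illustration of the kind of join this file is meant for, the base R sketch below attaches chapter titles to another table keyed by book and chapter. The three chapters rows are copied from the preview on this page rather than read from the CSV file, and the word_counts table with its words values is entirely hypothetical.

```r
# A few rows in the shape of the chapters data (values taken from the preview table).
chapters <- data.frame(
  book    = c(1L, 1L, 1L),
  chapter = c(1L, 2L, 7L),
  title   = c("A Prayer to the Reader", "A Boy and His God", "Canis Domini"),
  editor  = c("#MaG", "#MaG", "#MaG"),
  stringsAsFactors = FALSE
)

# Hypothetical table from some other extract; the counts are made up.
word_counts <- data.frame(
  book    = c(1L, 1L),
  chapter = c(2L, 7L),
  words   = c(100L, 200L)
)

# Left join on the (book, chapter) key: every word_counts row is kept
# and gains the matching title and editor.
joined <- merge(word_counts, chapters, by = c("book", "chapter"), all.x = TRUE)
print(joined[, c("book", "chapter", "title", "words")])
```

Joining on both book and chapter matters because chapter numbers restart in each book, so chapter alone is not a unique key.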
The raw data is first extracted from the <div type="chapter"> nodes using an XQuery script.
For easy ingestion with the XML package in R, the script’s output has a <records> root node and one <chapter> node per chapter in the original text.
xquery version "3.1";
declare namespace tei = "http://www.tei-c.org/ns/1.0";
declare variable $doc := doc("PATH_TO_TEI_FILE");

<records>
{
  for $chapter in $doc//tei:body//tei:div[@type = "chapter"]
  return
    <chapter>
      ...
      A node per column in the output file, see below for details
      ...
    </chapter>
}
</records>
The XQuery script produces an XML file of the form:
<?xml version="1.0" encoding="UTF-8"?>
<records>
  <chapter>
    <book>1</book>
    <chapter>1</chapter>
    <number>Chapter the FIRST</number>
    <title>A Prayer to the Reader</title>
    <editor>#MaG</editor>
  </chapter>
  etc...
</records>
This XML output file must then be cleaned before being saved to the CSV file provided above.
First, the correct data type must be set for each column, and missing values must be set to NA. This is straightforward with the XML and tidyverse packages.
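library(XML)        # xmlToDataFrame()
library(tidyverse)  # mutate(), na_if(), %>%

# xml_path is the path to the XML file produced by the XQuery script
chapters <- xmlToDataFrame(xml_path) %>%
  mutate(
    # Missing values must be NA rather than the string "NA"
    book = na_if(book, "NA"),
    chapter = na_if(chapter, "NA"),
    number = na_if(number, "NA"),
    title = na_if(title, "NA"),
    editor = na_if(editor, "NA"),
    # Column data types must be correct
    book = as.integer(book),
    chapter = as.integer(chapter)
  )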
NB: Luckily, when you read this data back in from the CSV file, the readr package is smart enough to guess all of this correctly.
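If you would rather not depend on type guessing, the column types can be pinned explicitly when reading the file. The sketch below uses base R's read.csv on inline text; the header row is an assumption based on the column names in the XML output, the first data row comes from the example above, and the second data row is made up purely to show how missing values come through.

```r
# Two illustrative rows in the assumed CSV layout.
# The second row is invented to demonstrate NA handling.
csv_text <- "book,chapter,number,title,editor
1,1,Chapter the FIRST,A Prayer to the Reader,#MaG
1,99,NA,NA,NA"

chapters <- read.csv(
  text = csv_text,
  na.strings = "NA",          # the string "NA" becomes a real missing value
  colClasses = c(
    book    = "integer",
    chapter = "integer",
    number  = "character",
    title   = "character",
    editor  = "character"
  )
)

str(chapters)
```

To read the real file, replace the text argument with the path to wherever you saved the CSV.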
Chapter | Title | Edited by |
---|---|---|
1.01 | A Prayer to the Read | #MaG |
1.02 | A Boy and His God | #MaG |
1.03 | The Most Important P | #MaG |
1.04 | A Thing Long Thought | #MaG |
1.05 | Aristotle’s House | #MaG |
1.06 | Rome Was Not Built i | #MaG |
1.07 | Canis Domini | #MaG |
1.08 | A Place of Honor | #MaG |
1.09 | Every Soul That Ever | #MaG |
1.10 | The Sun Awaits His R | #MaG |
1.11 | Enter Sniper | #MaG |
1.12 | Neither Earth nor At | #MaG |
1.13 | … Perhaps the Stars | #MaG |
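The Chapter labels in the preview above appear to combine the book and chapter columns as book.chapter, with the chapter zero-padded to two digits. A minimal base R sketch that builds labels in that assumed format (chapter_label is a hypothetical helper, not part of the published data or scripts):

```r
# Build a "book.chapter" label with the chapter zero-padded to two digits.
chapter_label <- function(book, chapter) {
  sprintf("%d.%02d", as.integer(book), as.integer(chapter))
}

labels <- chapter_label(c(1, 1), c(1, 13))
print(labels)
```

Because the chapter part is padded, such labels sort correctly as plain strings within a book.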
List of the columns in the data file explaining what they mean and how they were generated.
book
The number of the book within the series.
Required, numeric.
Derived from the n attribute of the <text type="book"> ancestor node of the chapter’s node.
<book>
{
  data($chapter/ancestor::tei:text[@type = "book"]/@n)
}
</book>
chapter
The number of the chapter within its book.
Required, numeric.
Derived from the n attribute of the chapter node itself.
<chapter>
{
  if ($chapter/@n)
  then data($chapter/@n)
  else "NA"
}
</chapter>
number
The text giving the number of the chapter within its book. This changes over the course of the series.
Alphanumeric.
Derived from the content of the child <head type="chapterNumber">
node of the chapter’s node.
<number>
{
  if ($chapter/tei:head[@type = "chapterNumber"])
  then data($chapter/tei:head[@type = "chapterNumber"])
  else "NA"
}
</number>
title
The title of the chapter.
Alphanumeric.
Derived from the content of the child <head type="chapterTitle">
node of the chapter’s node.
<title>
{
  if ($chapter/tei:head[@type = "chapterTitle"])
  then data($chapter/tei:head[@type = "chapterTitle"])
  else "NA"
}
</title>
editor
The editor responsible for the TEI markup for the chapter.
Alphanumeric, required. NA indicates editing has not begun for that chapter.
Derived from the resp
attribute of the chapter’s node.
<editor>
{
  if ($chapter/@resp)
  then data($chapter/@resp)
  else "NA"
}
</editor>
If you see mistakes or want to suggest changes, please create an issue on the source repository.
For attribution, please cite this work as
Glachant (2022, Oct. 14). Data Ignota: Chapters. Retrieved from https://syvwlch.github.io/Data-Ignota/tei/chapters/
BibTeX citation
@misc{glachant2022chapters,
  author = {Glachant, Mathieu},
  title = {Data Ignota: Chapters},
  url = {https://syvwlch.github.io/Data-Ignota/tei/chapters/},
  year = {2022}
}