(1) Overview

Repository location: DOI: https://doi.org/10.7910/DVN/ZUVKQW

Morphology key: Bilby, M.G. (). Key to BibleWorks Greek Morphology (BGM) (v1.1). DOI: https://doi.org/10.5281/zenodo.4950243

Print sources:

Harnack, A. von (). Marcion: Das Evangelium vom fremden Gott (2nd ed.). Leipzig: J.C. Hinrichs. ARK: https://n2t.net/ark:/13960/t3611sv4v

Manen, W. C. van (). Marcion’s brief van Paulus aan de Galatiërs. Theologisch Tijdschrift 21, 382–404, 451–533. ARK: https://n2t.net/ark:/13960/s28vzsvnnq4

Zahn, T. (). Geschichte des neutestamentlichen Kanons (Vol. 2). Erlangen: Andreas Deichert. ARK: https://n2t.net/ark:/13960/t8cf9s958


Three major scholarly reconstructions in Greek of the corpus of Marcion’s Apostolos have appeared in print over the last century and a half, namely those by Theodor Zahn (), Adolf von Harnack (), and Ulrich Schmid (). Each of these editions sets off a thorough, readable Greek text, though one that is discontinuous at many points. An offset, largely continuous Greek text of Apostolos Galatians was first published by Willem Christiaan van Manen (). All of these works also include secondary analysis.

Other scholarly works have examined the corpus or portions of the Apostolos without presenting a clear, thorough, offset Greek text. Hilgenfeld undertook the first meticulous analysis of Apostolos Galatians (), followed by a later examination of the Apostolos corpus (), yet both are unwieldy hodgepodges of critical commentary, piecemeal Greek reconstructions, Latin attestations, and scholarly cross-references. Karl Theodor Schäfer () compiled lists of Old Latin readings for Galatians, including attestations to Marcion’s Apostolos by Tertullian. Hermann Raschke () arranged an annotated catena of Epiphanius’ Greek scholia on Apostolos Galatians, 1–2 Corinthians, and Romans. John J. Clabeaux () assembled a succession of well-organized entries, collating attestations of Apostolos wording against manuscript and patristic citation variants for the canonical texts.

The most recently published reconstruction of the Apostolos corpus is the English-only edition by Jason BeDuhn (), which notes patristic attestations and corresponding canonical manuscript variants in well-organized endnotes. Markus Vinzent is making final refinements to his fully continuous, philologically rigorous Greek reconstruction of the Apostolos corpus, together with an accompanying German translation, both slated for publication in late 2023 or early 2024. Vinzent’s reconstruction has been translated into English by Mark G. Bilby and edited by Jack Bull in a work already pre-released as an iterative open science book () and invited for submission to an academic press.

These reconstructions variously make use of an eclectic array of evidence, including Latin attestations by Tertullian and Rufinus (in a liberal translation of Origen), Greek attestations by Epiphanius and the Ps-Origen Adamantius Dialogue, Syriac attestations by Ephrem, as well as variants in Greek, Latin, and Syriac manuscripts of the canonical letters of Paul.

Four data papers and 12 accompanying datasets in this journal (; ; ; ) were the first peer-reviewed, normalized, digital, and enriched Greek texts of Marcion’s Evangelion to be published. The present data paper and its six accompanying datasets are the first reborn-digital texts of Marcion’s Apostolos to be published.

(2) Method

Challenges and Resolutions

The creation and normalization of the Evangelion datasets started with the print edition of Harnack () because it is in the public domain and has been the standard reconstruction of that text for most of the past century. Harnack’s deeply ambiguous typographical indications and editorial tendencies presented both a special challenge and an ideal opportunity to create data normalization/regularization rules, to define a few encompassing datatypes, and to utilize two simple typographical symbols, all in order to distill a tokenizable text amenable to scientific analysis and cross-comparison by humans and machines.

The resulting datatypes are: 1) plain text for clearly restored wording, no matter the relative directness or clarity of the attestation, so long as the wording is not indicated as unlikely or dubious; 2) text within parentheses for lower confidence wording, for the more likely of different variants, and for contextually necessary wording; 3) text within square brackets for implied or improvised content corresponding to the standard New Testament critical edition of the time, usually the Editio Octava Critica Maior of Tischendorf (); 4) empty parentheses () for lacunae, ellipses, unclear content, and highly dubious content; and 5) empty square brackets [] to indicate variant(s) for one or more preceding words.
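As a minimal sketch of how these five datatypes could be told apart by a machine, the Python function below splits one normalized line into labeled spans. The function name, the label names, and the sample line are all hypothetical, invented for illustration. They are not taken from the datasets or from any tooling the team used. The sketch also assumes that parentheses and square brackets are never nested.

```python
import re
from typing import List, Tuple

# Alternation order matters: the empty markers "()" and "[]" are tried
# before their non-empty counterparts. Label names are illustrative only.
SPAN_PATTERN = re.compile(
    r"(?P<lacuna>\(\s*\))"                # 4) empty parentheses
    r"|(?P<variant>\[\s*\])"              # 5) empty square brackets
    r"|\((?P<low_confidence>[^()]+)\)"    # 2) text within parentheses
    r"|\[(?P<implied>[^\[\]]+)\]"         # 3) text within square brackets
    r"|(?P<plain>[^()\[\]]+)"             # 1) plain text
)


def classify_spans(line: str) -> List[Tuple[str, str]]:
    """Return (datatype, text) pairs for one normalized line.

    Assumes brackets are not nested; stray unmatched brackets are skipped.
    """
    spans = []
    for match in SPAN_PATTERN.finditer(line):
        kind = match.lastgroup
        if kind in ("lacuna", "variant"):
            spans.append((kind, ""))
            continue
        text = match.group(kind).strip()
        if text:  # ignore whitespace-only gaps between markers
            spans.append((kind, text))
    return spans


# Invented sample line, not a reading from any of the datasets.
sample = "λόγος (θεοῦ) [καὶ] () ἦν []"
for kind, text in classify_spans(sample):
    print(kind, repr(text))
```

Labeling spans rather than individual words keeps the sketch independent of any particular tokenization. A downstream step could split each span on whitespace and carry the label over to every resulting token.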

The same basic normalization rules previously applied to Harnack’s Evangelion and detailed in a previous data paper () are here applied to his Apostolos. A few tendencies particular to Harnack’s reconstruction of the Apostolos did call for additional consideration. For example, Harnack added brief concluding paragraphs after most letters of the Apostolos, which typically detail particular omissions that he claims Marcion made. The previously developed rules called for most unattested wording to be omitted without any indication, and these concluding summaries proved more confirmatory than complicating. At several points in Apostolos Philippians and elsewhere, Harnack cross-references content from the apocryphal Epistle to the Laodiceans, for which he subsequently laid out Latin and Greek texts (). In keeping with the previously developed normalization rules, such cross-references were treated like footnotes and replaced with empty parentheses rather than treated as prompts to restore clear or even implicit wording. At some points, the main text of Harnack’s Apostolos includes unusual notes that verses “are unattested, but would not have been absent” / sind unbezeugt, aber werden nicht gefehlt haben (). Similar observations are strewn across the footnotes of Harnack’s Evangelion, but here the indications within the running text merit the restoration of corresponding wording drawn from Tischendorf’s edition, set within square brackets as improvised rather than explicitly restored.

The normalization rules previously developed for Zahn’s Evangelion and detailed in a previous data paper () held up well without any challenges particular to his Apostolos.

No prior normalization rules were developed for van Manen (), because he did not publish a reconstruction of the Evangelion. However, his reconstruction of the Apostolos is typographically simple. He uses ellipses for unknown portions of the text, and these lacunae are normalized as empty parentheses. Van Manen also places some words and some verse numbers in parentheses, but does not introduce or explain what these indications entail. A close reading suggests that parentheses can variously signify: notable differences between the Marcionite text (which van Manen often believed to reflect earlier traditions) and the canonical version; disagreement among witnesses to the Marcionite text; uncertainty about the wording; and/or an order of a verse or group of words that differs from that found in the canonical text. While van Manen’s order is preserved, normalized verse numbering is used and some verse subdivisions are imposed so as to conform to the other Apostolos editions. Parentheses around words are preserved as they stand.
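The ellipsis rule described above can be sketched in a few lines of Python. This is an illustrative approximation, not the procedure actually used, since the transcription was done manually. The function name and the sample string are hypothetical. The sketch also assumes that an ellipsis appears in a transcription either as the single character "…" or as three or more full stops, optionally separated by single spaces.

```python
import re

# Assumption: an ellipsis is either "…" or three or more dots,
# optionally separated by single whitespace characters (". . .").
ELLIPSIS = re.compile(r"…|\.(?:\s?\.){2,}")
# Runs of adjacent lacuna markers are merged into a single "()".
REPEATED_LACUNAE = re.compile(r"\(\)(?:\s*\(\))+")


def normalize_lacunae(text: str) -> str:
    """Replace ellipses with empty parentheses and tidy the spacing.

    Parentheses that enclose words are left untouched.
    """
    marked = ELLIPSIS.sub(" () ", text)
    marked = REPEATED_LACUNAE.sub("()", marked)
    return re.sub(r"\s+", " ", marked).strip()


# Invented sample, not a reading from van Manen's edition.
print(normalize_lacunae("καὶ … ἐν (Χριστῷ) . . . ἀμήν"))
```

A single sentence-final full stop is deliberately not matched, so ordinary punctuation survives the substitution.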

Given the public domain status of all three underlying print works, their close relationships and overlapping data, and the lack of any previously published, reborn digital versions, these texts together make a formidable basis for a first batch of Apostolos datasets.

Quality and Version Control

In keeping with the Evangelion data papers previously published in JOHD, here also the first dataset of each reconstruction consists of normalized, human-readable Greek, while the second lemmatizes each word and appends part-of-speech and morphological tags based on the BibleWorks Greek Morphology (BGM). The transcription and tagging involved about 80 hours of manual work by Bilby, using interlinear compilations of various Apostolos editions and morphologically tagged canonical datasets to facilitate pattern matching and rapid tag application. Lemma and word form lookups and confirmations were performed as needed using the BibleWorks software and then the Thesaurus Linguae Graecae. Quality control checks of the transcription and this data paper were subsequently made by Jack Bull. Quality control of the BGM-tagged data was performed by K. Lance Lotharp, who ran custom R code to check for comprehensiveness of coverage in the part-of-speech and morphological tagging, and then by Bilby, who used a blend of regex queries, Notepad++ plugins, WinMerge, and Excel sort and filter routines to find, review, and correct anomalous lemmata and tags.
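A coverage check of the kind described here might look like the following Python sketch. It is not the team’s R code, and the record layout is purely an assumption made for illustration: one token per line, with tab-separated form, lemma, and tag fields, and the bare markers () and [] exempt from tagging. The actual layout of the tagged datasets may differ. The function name and the tag string in the sample are likewise hypothetical.

```python
from typing import List, Tuple

# Assumption: bare lacuna and variant markers carry no lemma or tag.
MARKERS = {"()", "[]"}


def find_untagged(lines: List[str]) -> List[Tuple[int, str]]:
    """Return (line number, record) for records missing a lemma or tag.

    Hypothetical layout: form<TAB>lemma<TAB>tag, one token per line.
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        record = raw.rstrip("\n")
        if not record.strip():
            continue  # skip blank lines
        fields = record.split("\t")
        if fields[0] in MARKERS:
            continue  # markers are exempt from tagging
        if len(fields) < 3 or not all(f.strip() for f in fields[:3]):
            problems.append((number, record))
    return problems


# Invented records; "nnsm" is a placeholder, not a documented BGM tag.
records = ["λόγος\tλόγος\tnnsm", "()", "ἦν\tεἰμί\t", "καὶ"]
for number, record in find_untagged(records):
    print(number, repr(record))
```

Reporting line numbers alongside the offending records makes it straightforward to review each gap by hand, in keeping with the manual correction workflow described above.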

If any errors are discovered in the data after peer review and publication, they will be corrected with updated dataset versions and/or with data-wrangling code at the team’s open access repository on GitHub. Exploratory meetings have already taken place to build collaborations with larger Classical and Postclassical Greek corpus linguistics projects so as to transform these datasets into other formats, especially TEI-XML.

(3) Dataset description

Object name: Normalized Datasets of Zahn’s, van Manen’s, and Harnack’s Reconstructions of Marcion’s Apostolos

Format names and versions: UTF-8 encoded .txt

Creation dates: 2023-05-17/2023-07-22

Dataset Creators

Mark G. Bilby (PhD, University of Virginia) manually created all datasets. Jack Bull (PhD, King’s College London) provided quality control of the transcriptions. K. Lance Lotharp (Columbia University) and Bilby ran quality control checks of the lemmatized and morphologically tagged datasets.

Languages: Postclassical Greek, English

License: CC BY-NC-ND 4.0 International

Repository name: Journal of Open Humanities Data Dataverse

Publication date: 2023-08-10

(4) Reuse potential

These normalized versions of the three public domain Greek scholarly reconstructions of Marcion’s Apostolos build on the prior batch of open datasets of Marcion’s Evangelion published in JOHD and commence a new batch of datasets of past and forthcoming editions of the Apostolos. These datasets may prove pivotal to future scholarship, especially as a resource for quantitative, comparative analyses of the Marcionite scriptures and their canonical counterparts. Like the Evangelion, the Apostolos has been suppressed for eighteen centuries, but data science holds considerable potential to restore it to previously unattainable levels of completeness and confidence, and concurrently to clarify its historical and literary relationships with its canonical counterparts and with various other writings attributed to, or treating of, the Apostle Paul.

The team’s choice of a CC BY-NC-ND license is not intended to prevent reuse, but instead to build collaborative research networks and resources around these texts. Interested parties are welcome to contact the corresponding author to request permissions.