(1) Context and motivation

Roman settlement patterns in the northern and northeastern rural areas of the former Roman province of Noricum are relatively understudied, particularly in contrast to the more extensively researched northwestern part. Noricum was under Roman rule from 16/15 BC until 488 AD. It covered an area broadly corresponding to modern-day Austria and parts of Germany, Slovenia, and Italy. After administrative reforms in Late Antiquity (c. 284 to 488 AD), Noricum was split into the provinces of Noricum ripense to the north and Noricum mediterraneum to the south. The northern region, which is the focus of this study, was part of a vital military and cultural border area (known as “ripa Norica” or the Danube Limes), with the Danube River marking the northern frontier of the Roman Empire. The lack of research into the rural hinterlands of northeastern Noricum led to the initiation of the project “Roman Rural Landscapes in Noricum: Archaeological Investigations of Roman Settlement in the Hinterland of Northern Noricum” (RRLN). Conducted from 2018 to 2021, the project contributed to Landscape and Settlement Archeology by shedding light on these under-researched rural settlements. The project’s methodology was exhaustive, integrating all obtainable archeological data within a clearly defined Area of Interest (AoI), which spanned 1,161 km² in the Mostviertel region of the state of Lower Austria (Figure 1). The AoI was delineated using natural landscape features such as watersheds and geological-topographical boundaries: it is bordered by the Danube to the north, the Traisen river basin to the east, the Flysch zone’s northern edge and the southern fringes of the Erlauf, Pielach, and Traisen river valleys to the south, and the Erlauf river basin to the west.
This area included both civilian and military zones, encompassing the hinterland of the Noric “Danube Limes” with pivotal military sites like the auxiliary forts (castella) at Arelape-Pöchlarn, Favianis-Mautern an der Donau, and Augustianis-Traismauer, which developed into urban centers from Late Antiquity onward (“oppida;” “civitates”) and remain inhabited cities today. Additionally, the project extended to the environs of the urban municipium Aelium Cetium-St. Pölten. The RRLN project utilized open geodata and a wide range of unstructured archeological datasets. While its primary focus was the Roman period (16/15 BC to 488 AD), it also covered the late La Tène period (Late Lt D, c. 150 BC to 15 BC) and the Early Middle Ages (after 488 AD). This broad chronological scope provided deeper insight into the AoI’s spatial and historical evolution. Concluded in 2021, the RRLN project has contributed to the field of Digital Roman Archeology, filling knowledge gaps concerning Roman-period settlement patterns by examining previously neglected rural sites. By considering all available archeological evidence, it took a holistic view of Roman rural landscapes in Austria.

Figure 1 

The RRLN project concentrated on a region in the Danube Limes’ hinterland, spanning the area between the Erlauf and Traisen river valleys in Lower Austria’s Mostviertel region. Here, it investigated several Roman-period sites.

All data was organized in the “Roman Rural Landscapes Database (RRLN-DB),” a structured compilation of diverse topical spatial datasets, primarily concentrating on rural settlements in Northern Noricum. This geodatabase typically contained vector data representing geographic point features, raster data such as satellite imagery and elevation models, and attribute data providing qualitative and quantitative information about these features. The geodatabase further included metadata, e.g., on sources and spatial reference systems, as well as spatial indexes.

When investigating Roman Landscape and Settlement Archeology in contemporary Austria, various datasets collected through numerous archeological measures serve as the foundation of this desktop-based research endeavor. These datasets contain mostly qualitative-descriptive and quantitative-technical information about georeferenced finds and archeological sites, but also about objects without context, ranging from single coin finds to Roman camp gates. Nevertheless, combining these separate but complementary datasets proved well suited to the research question. Only information about legally unambiguous archeological objects was considered, avoiding undocumented discoveries from illegal trading and other activities. This collection process, potentially spanning months, included acquiring topical geodata from different authorities (e.g., EU Commission, Federal Austrian Geological Agency, State of Lower Austria, Federal Austrian Environment Agency) and further gathering archeological data from various sources. The latter was partly collected in the field but mainly provided by different research enterprises. These included data catalogs, registries, and gazetteers like Ubi Erat Lupa, Epigraphic Database Heidelberg, Pleiades, Digital Atlas of the Roman Empire (DARE), and Vici.org – Atlas zur Archäologie des Altertums, as well as data derived from various archeological service companies like ARDIG GesmbH, Asinoe GmbH, and Archaeo Perspectives GesbR. The essential data sources, however, are the primary research contributions published in the Fundberichte aus Österreich (FÖ), a periodical anthology of (mostly) field reports published since 1920, and the Austrian Federal Monuments Office’s Find Site Registry (“Fundstellendatenbank des Bundesdenkmalamtes [BDA]”/BDA-FSDB). These sources document nearly all publicly disclosed archeological measures taken in Austria, mainly involving on-site investigations to discover and examine archeological objects.
The qualitative knowledge framework for the project was established through an extensive literature review to ensure a comprehensive understanding of the subject matter.

The rural settlements’ geospatial and qualitative archeological data were consolidated into a GeoPackage database, forming the RRLN-DB. This open-source SQLite container integrates all datasets within a Geographic Information System (GIS) framework. The database’s design enhances data handling and supports sophisticated spatial archeological assessments. In keeping with an Open Spatial Archeology approach, the Long Term Release (LTR) versions of QGIS (developed in C++) were used. QGIS is Free and Open Source Software (FOSS) and was applied for nearly all archeological work steps due to its versatility. In most cases, the current regional Austrian survey system for the eastern parts of Austria – “Militärgeographisches Institut (MGI) Gauß-Krüger (GK) East” (European Petroleum Survey Group [EPSG]:31256 MGI / Austria GK East) – served as the coordinate reference system. For easier reuse on a global scale, the data have also been reprojected to the World Geodetic System (WGS 84) (EPSG:4326).

All archeological data, often received in Environmental Systems Research Institute (ESRI) Shapefile format, were imported into the RRLN database. The GeoPackage format allowed spatial and non-spatial information to be managed in a simple, platform-independent database that requires no server. Thus, it was entirely maintenance-free and stored in a single file, which could also be opened on mobile devices. QGIS was used to organize the data, taking on the role of a relational database management system.
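Because a GeoPackage is a single SQLite file, its layer registry can be inspected with any SQLite client. The following is a minimal sketch using Python's standard `sqlite3` module. It does not open the actual RRLN-DB: it builds a small in-memory stand-in that contains only a simplified `gpkg_contents` table (the registry table defined by the GeoPackage standard). The layer names `rrln_sites` and `rrln_features` are hypothetical. Note that the GeoPackage data type `features` denotes a vector layer and is unrelated to the archeological “feature” used in this paper.

```python
import sqlite3


def list_layers(conn):
    """Return (table_name, data_type, srs_id) for every entry in gpkg_contents."""
    rows = conn.execute(
        "SELECT table_name, data_type, srs_id FROM gpkg_contents ORDER BY table_name"
    )
    return [tuple(row) for row in rows]


def build_demo_geopackage():
    """Create an in-memory stand-in holding only a simplified gpkg_contents table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE gpkg_contents (
            table_name TEXT NOT NULL PRIMARY KEY,
            data_type  TEXT NOT NULL,
            identifier TEXT UNIQUE,
            srs_id     INTEGER
        )
        """
    )
    # Hypothetical layers: one vector layer in EPSG:31256, one non-spatial table.
    conn.executemany(
        "INSERT INTO gpkg_contents VALUES (?, ?, ?, ?)",
        [
            ("rrln_sites", "features", "Sites", 31256),
            ("rrln_features", "attributes", "Features", None),
        ],
    )
    return conn


demo = build_demo_geopackage()
layers = list_layers(demo)
for name, data_type, srs_id in layers:
    print(name, data_type, srs_id)
```

With a real file, `sqlite3.connect("path/to/file.gpkg")` would replace the in-memory stand-in; no server process is involved at any point.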

Before being added to the GeoPackage, tabular datasets were processed in Microsoft Excel to facilitate their import. Queries conducted within the GIS were then exported as tabular datasets for further analysis in spreadsheet software when required by the research process. The choice of Microsoft Excel, a proprietary software, over FOSS alternatives was guided by administrative and practical considerations. Notably, for long-term archiving, only dataset components offering unique insights into the area are preserved.

The project aligns with open science principles, ensuring that the data adhere to the FAIR (Findability, Accessibility, Interoperability, and Reusability) guidelines. These principles ensure that the corresponding archeological data are discoverable and usable. Data must have unique identifiers like DOIs, comprehensive metadata, standard retrieval protocols, and clear licenses. Interoperability requires standardized formats and semantic annotations, while reusability involves detailed documentation and adherence to community standards. The RRLN project integrates with the University of Vienna’s repository, PHAIDRA (Permanent Hosting, Archiving and Indexing of Digital Resources and Assets). PHAIDRA, listed in repository indices like the Open Directory of Open Access Repositories (OpenDOAR) or re3data.org, is open to all academic disciplines and offers a robust system based on the Fedora Commons framework for the storage and management of diverse file types, including texts, images, and audio files. The system employs an object-oriented data structure and uses a customized metadata schema from the University of Vienna (UWmetadata), inspired by the Dublin Core standard as initially defined by ISO and augmented by the Learning Object Metadata (LOM) scheme as defined by the Institute of Electrical and Electronics Engineers. This structure requires several mandatory metadata fields such as “object type,” “title,” “description,” “keywords,” and “topic terms,” utilizing controlled vocabularies like the Österreichische Systematik der Wissenschaftszweige (ÖFOS) or the Getty Art & Architecture Thesaurus (AAT). Additional mandatory metadata fields include “contributor” and “license.” Furthermore, any number of optional metadata fields can be added, allowing a comprehensive description of the data and enhancing its accessibility.
PHAIDRA offers interoperability through protocols such as the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) and supports differentiated access rights. Using PHAIDRA ensures that open access objects are easily discoverable through search engines like the Bielefeld Academic Search Engine (BASE) or the Open Access Infrastructure for Research in Europe (OpenAIRE) EXPLORE infrastructure, fostering greater visibility and accessibility.

(2) Dataset description

All project-related data is stored in a dedicated top collection on PHAIDRA, entitled “Roman Rural Landscapes in Noricum: Archaeological investigations of the Roman settlement in the hinterland of Northern Noricum.” It encompasses project-related data such as selected research publications (e.g., by Hagmann), supplementary files (e.g., maps), and RRLN-DB queries intended for long-term archiving (Table 1).

Table 1

Overview of the RRLN-DB collection’s inner structure, long-term archived on PHAIDRA.


| TOP-COLLECTION | SUB-COLLECTION | SUB-SUB-COLLECTION | OBJECT | DOI | ACCESS |
|---|---|---|---|---|---|
| Roman Rural Landscapes in Noricum | | | | 10.25365/phaidra.100 | open |
| | Roman Rural Landscapes in Noricum (RRLN) – Findspots and Sites | | | 10.25365/phaidra.386 | open |
| | | | A Controlled Vocabulary for a Simple and Basic Chronology for the Roman Province of Noricum | 10.25365/phaidra.390 | open |
| | | | A Controlled Vocabulary of Archaeological Features in Austria for the PhD Project Roman Rural Landscapes in Noricum (RRLN-CV) | 10.25365/phaidra.321 | open |
| | | | Roman Rural Landscapes in Noricum: Findspots | 10.25365/phaidra.387 | upon request |
| | | | Roman Rural Landscapes in Noricum: Roman Findspots | 10.25365/phaidra.388 | upon request |
| | | | Roman Rural Landscapes in Noricum – Sites | 10.25365/phaidra.389 | open |
| | | Roman Rural Landscapes in Noricum – Sites (CSV) | | 10.25365/phaidra.453 | open |
| | | | Roman Rural Landscapes in Noricum – Sites (CSV/MGI) | 10.25365/phaidra.451 | open |
| | | | Roman Rural Landscapes in Noricum – Sites (CSV/WGS84) | 10.25365/phaidra.452 | open |

To enhance clarity, the long-term archived data described in this paper has been organized into a dedicated sub-collection titled “Roman Rural Landscapes in Noricum (RRLN) – Findspots and Sites: Open archaeological data.” This collection comprises the original archeological data (spatial data tables locating archeological objects) and the controlled vocabularies employed. The data is stored in XLSX and CSV files. These files contain spatial 2D point coordinates and attributes describing the sites and their respective archeological objects (Table 2). The XLSX (Office Open XML) files follow the specifications of the European Computer Manufacturers Association and ISO/International Electrotechnical Commission. CSV files use a comma (,) as the field separator and double quotation marks (") as the text delimiter, following the Request for Comments specifications. The character encoding is the Unicode Transformation Format (UTF-8) without Byte Order Mark (BOM). Each digital object within this collection is assigned a Digital Object Identifier (DOI). This unique alphanumeric string is a persistent link to its online location, facilitating reliable and consistent access and citation of digital content.
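The CSV conventions described above (comma separator, double-quote text delimiter, UTF-8 without BOM) map directly onto Python's standard `csv` module. The sketch below parses a small invented excerpt; the column names are taken from Table 2, but the two rows, their values, and the file name in the closing comment are hypothetical and do not come from the archived dataset.

```python
import csv
import io

# Invented two-row excerpt; the archived tables carry the columns of Table 2.
SAMPLE = (
    '"Site-ID","Flurname","Mean_X","Mean_Y","Cluster_size"\n'
    '"1","Beispielfeld, Nord","-50000.5","340000.25","3"\n'
    '"2","Musterried","-48210.0","338950.0","1"\n'
)


def read_sites(text_stream):
    """Parse comma-separated, double-quoted rows into dicts with typed values."""
    reader = csv.DictReader(text_stream, delimiter=",", quotechar='"')
    sites = []
    for row in reader:
        row["Mean_X"] = float(row["Mean_X"])
        row["Mean_Y"] = float(row["Mean_Y"])
        row["Cluster_size"] = int(row["Cluster_size"])
        sites.append(row)
    return sites


sites = read_sites(io.StringIO(SAMPLE))
print(len(sites), sites[0]["Flurname"])

# A file on disk in plain UTF-8 (no BOM) would be opened like this:
# with open("sites_mgi.csv", encoding="utf-8", newline="") as handle:
#     sites = read_sites(handle)
```

The quoted field `"Beispielfeld, Nord"` shows why the text delimiter matters: the embedded comma stays part of the value instead of splitting the row.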

Table 2

List of all column header names used in the data tables.


| # | HEADER | DESCRIPTION |
|---|---|---|
| 1 | BDA_Bezeichnung | Designation of the Federal Monuments Office on the site |
| 2 | BDA_Datierung | Dating of the Federal Monuments Office of the site |
| 3 | BDA_FO_Nummer | Site number of the Federal Monuments Office of the site |
| 4 | BDA_Kategorie | Category of the Federal Monuments Office of the site |
| 5 | Bescheid | The existence of a decision by the Austrian Federal Monuments Office relating to monument protection laws |
| 6 | Cluster_size | Size of the site cluster (number of find spots in the cluster) |
| 7 | Datum_Start | Start of the period, represented as a year |
| 8 | Datum_Stop | Stop of the period, represented as a year |
| 9 | Flurname | Place name |
| 10 | Flurname_URI | Uniform resource identifier of the geoname |
| 11 | Fundplatz-ID | The numeric identifier of the site |
| 12 | Geoname_KG_URI | Uniform resource identifier of the geoname of the cadastral municipality |
| 13 | KG_Name | Name of the cadastral municipality |
| 14 | KG_Nummer | Unique identifier of the cadastral municipality |
| 15 | Level_01 | First-order site category |
| 16 | Level_01_URI | Uniform resource identifier of the first-order site category |
| 17 | Level_02 | Second-order site category |
| 18 | Level_02_URI | Uniform resource identifier of the second-order site category |
| 19 | Level_03 | Third-order site category |
| 20 | Level_03_URI | Uniform resource identifier of the third-order site category |
| 21 | Level_04 | Fourth-order site category |
| 22 | Level_04_URI | Uniform resource identifier of the fourth-order site category |
| 23 | Level_05 | Fifth-order site category |
| 24 | Level_05_URI | Uniform resource identifier of the fifth-order site category |
| 25 | Level_06 | Sixth-order site category |
| 26 | Level_06_URI | Uniform resource identifier of the sixth-order site category |
| 27 | Mean_X | Mean X-coordinate (EPSG:31256 MGI / Austria GK East; https://epsg.io/31256) |
| 28 | Mean_Y | Mean Y-coordinate (EPSG:31256 MGI / Austria GK East; https://epsg.io/31256) |
| 29 | Mean_Easting_Global | Mean X-coordinate (EPSG:4326 WGS 84; https://epsg.io/4326) |
| 30 | Mean_Northing_Global | Mean Y-coordinate (EPSG:4326 WGS 84; https://epsg.io/4326) |
| 31 | OSM-ID | Unique identifier of the OpenStreetMap data entry |
| 32 | OSM_Kategorie | Assigned category of the OpenStreetMap data entry |
| 33 | OSM_Name | Descriptive designation of the OpenStreetMap data entry |
| 34 | OSM_Typ | Classification type of the OpenStreetMap data entry |
| 35 | Parzellennummer | Parcel number(s) |
| 36 | Periode_Start | Start of the period, descriptive designation |
| 37 | Periode_Start_URI | Uniform resource identifier for the start of the period (descriptive designation) |
| 38 | Periode_Stop | End of the period, descriptive designation |
| 39 | Periode_Stop_URI | Uniform resource identifier for the end of the period (descriptive designation) |
| 40 | Phase | Phase designation |
| 41 | Plus-Code | Identifier for the plus-code |
| 42 | Posititonsgenauigkeit | Qualitative assessment of the position accuracy of the site |
| 43 | Posititonsgenauigkeit_Kommentar | Comment on the qualitative assessment of the position accuracy of the site |
| 44 | Site-ID | Unique numeric identifier of the feature |
| 45 | X | X-coordinate (EPSG:31256) |
| 46 | Y | Y-coordinate (EPSG:31256) |

At the core of the collection, data identifying Roman sites is stored in a freely accessible sub-sub-collection. This collection is presented as CSV tables for ease of access and reuse as well as sustainable availability, and is titled “Roman Rural Landscapes in Noricum – Sites: Roman settlement places – open dataset (CSV).” It comprises two objects: “Roman Rural Landscapes in Noricum – Sites (CSV/MGI)” provides the coordinates of the relevant sites in the local coordinate system MGI GK East, while “Roman Rural Landscapes in Noricum – Sites (CSV/WGS84)” uses WGS 84 for global-scale operations. Alternatively, an XLSX table employing MGI GK East can be accessed via the “Roman Rural Landscapes in Noricum – Sites” object.

Two datasets are accessible for personal scientific research only upon formal request due to administrative-technical limitations regarding possible copyright issues with the source material. One table links all find spots within the AoI to their corresponding features and forms a comprehensive query table (“Roman Rural Landscapes in Noricum: Findspots”). The other table comprises all Roman features linked to all Roman find spots (“Roman Rural Landscapes in Noricum: Roman Findspots”). Both use MGI GK East. The metadata for these datasets nevertheless remain openly and freely accessible.

“Fundstellen-ID” serves as the primary or foreign key, respectively, that links Roman sites with the Roman find spots and their joined features. The comprehensive tables for (Roman) find spots are designed to function as stand-alone datasets. The key metadata fields of the selected RRLN-DB queries can be described as follows.

Object name

“Roman Rural Landscapes in Noricum (RRLN): Findspots and Sites: Open archaeological data” or “Roman Rural Landscapes Database (RRLN-DB): Selected queries.”

Format names and versions

Tables: XLSX format (Office Open XML); CSV format

Creation dates

2018-03-14 to 2021-06-17

Dataset creators

Dominik Hagmann

Language

German

License

Creative Commons Attribution 4.0 International (where applicable)

Repository name

PHAIDRA

Publication date

2021-06-17

(3) Method

(3.1) Database model: sites – findspots – features

The RRLN database employs a data model that organizes archeological data using “features” as the primary unit. Inspired by the BDA-FSDB model, it adopts a three-tier system to classify and locate archeological items within the AoI, summarized as: “An archeological ‘feature’ is documented at a specific ‘find spot’ within a ‘site,’ a cluster of find spots” (Figure 2):

  • 1 “Site” groups n “Findspots” (clusters).
  • 1 “Findspot” contains n “Features” (includes).
Figure 2 

Entity-relationship (ER) model of the RRLN-DB.

The smallest unit, the archeological “feature,” represents a distinct entity, like a Samian fragment (i.e., a find) or a kiln (i.e., a structure), identified by the presence of any archeological object at a “findspot.” A “feature” is therefore an abstract archeological container capturing information using a controlled vocabulary and based on the (generalized) information provided by the BDA-FSDB, not representing detailed components like an ash fill within a kiln. It is the first level of qualitative value, while the find spot, with coordinates in the EPSG:31256 system, is the second level.

The find spot, linked to local information like plot name or political municipality, indicates the spatial location of an archeological object without pinpointing its exact position. A find spot can contain one or several features, establishing a 1:n relationship. The model thus does not describe the exact location of an object but rather the coordinates of the initial place of discovery. Feature assignment to a findspot is based on archeological activities recorded in the BDA-FSDB, with point coordinates marking the approximate center of the parcel(s) where the object was found. Therefore, both the BDA and RRLN databases employ pseudo-geomasking, avoiding pinpointing the exact location for preservation purposes. Despite its relative inaccuracy, this approach still allows the geographic determination of a find spot’s area of interest.

The third level combines one or more find spots into a “site,” a superordinate conceptual entity envisioned as a cluster of spatially connected find spots with a unique ID per cluster. Such sites are the “features of interest” stored in the respective “Roman Rural Landscapes in Noricum – Sites” objects on PHAIDRA.
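The three-tier model (site, find spot, feature) with its two 1:n relationships can be sketched as a small relational schema. The example below uses Python's `sqlite3` module with an in-memory database. The table and column names (`site`, `findspot`, `feature`, `category`) and all inserted rows are illustrative assumptions, not the actual RRLN-DB schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE site (
    site_id INTEGER PRIMARY KEY
);
CREATE TABLE findspot (
    findspot_id INTEGER PRIMARY KEY,
    site_id     INTEGER NOT NULL REFERENCES site(site_id),
    x           REAL,   -- EPSG:31256 X of the approximate parcel center
    y           REAL    -- EPSG:31256 Y of the approximate parcel center
);
CREATE TABLE feature (
    feature_id  INTEGER PRIMARY KEY,
    findspot_id INTEGER NOT NULL REFERENCES findspot(findspot_id),
    category    TEXT    -- term from a controlled vocabulary
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

# Invented demo rows: one site, two find spots, three features.
conn.execute("INSERT INTO site VALUES (1)")
conn.executemany(
    "INSERT INTO findspot VALUES (?, ?, ?, ?)",
    [(10, 1, -50000.0, 340000.0), (11, 1, -49950.0, 340030.0)],
)
conn.executemany(
    "INSERT INTO feature VALUES (?, ?, ?)",
    [(100, 10, "kiln"), (101, 10, "Samian ware"), (102, 11, "coin")],
)

# Walk the hierarchy upward: count find spots and features per site.
summary = conn.execute(
    """
    SELECT s.site_id,
           COUNT(DISTINCT fs.findspot_id) AS n_findspots,
           COUNT(f.feature_id)            AS n_features
    FROM site AS s
    JOIN findspot AS fs ON fs.site_id = s.site_id
    JOIN feature  AS f  ON f.findspot_id = fs.findspot_id
    GROUP BY s.site_id
    """
).fetchall()
print(summary)
```

Keeping coordinates on the find spot rather than on the feature mirrors the description above: a feature has no position of its own and inherits the approximate location of its find spot.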

(3.2) Base dataset

The data originates from 217 independent BDA-FSDB queries, distributed across 604 cadastral municipalities (Figure 3) included in 73 political municipalities (Figure 4) within seven political districts (Figure 5), representing the state of archeological knowledge in 2016.

Figure 3 

Cadastral municipalities across the AoI (n = 604; map: D. Hagmann 2023; data: Land Niederösterreich [Land NÖ]; Bundesamt für Eich- und Vermessungswesen [BEV]; Umweltbundesamt).

Figure 4 

Political municipalities across the AoI – the numbering of the municipalities is resolved in Figure 10 (n = 73; map: D. Hagmann 2023; data: Land NÖ; BEV; Umweltbundesamt).

Figure 5 

Political districts across the AoI (n = 7; map: D. Hagmann 2023; data: Land NÖ; BEV; Umweltbundesamt).

Each BDA-FSDB query, geographically based on current municipal boundaries, included all BDA-registered sites within that area. In general, three BDA-FSDB queries per political municipality were provided as at least two XLS files and one TXT file named after the respective municipality. Consequently, there are three distinct types of queries, each containing partially identical yet structurally different information sets for every site within each municipality. The queries within such a set thus complement one another. Copies of the queries were modified for GIS-based analysis in a spreadsheet program, leaving the original BDA-provided dataset unaltered to serve as a backup.

Data in the fields are typically integers for numeric data like coordinates or strings for text-based information like archeological object descriptions. The data covers administrative and archeological information from parcel locations to monument protection status. Importantly, it includes categorical descriptions of archeological features and their periods.

(3.3) Aggregation, normalization, and GIS integration

The 217 individual BDA-FSDB queries were aggregated into a single table with 5,010 BDA-FSDB entries, representing all archeological find spots from the BDA-FSDB within the AoI. Because the source data were heterogeneous, the aggregated data were then normalized, with emphasis on location, the qualitative classification of the archeological objects, and chronological characteristics. The resulting data tables were assigned 46 unique headers detailing various attributes, including coordinates, labels, categorizations, and related administrative aspects derived from the BDA-FSDB (Table 2).
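The aggregation and normalization step can be pictured as merging per-municipality query tables into one table while mapping raw attribute values onto standard terms. The sketch below is a simplified illustration under stated assumptions: the field names follow Table 2, but the input rows, the raw German values, the two vocabularies, and the fallback term are all invented for the example and do not reproduce the project's actual workflow or vocabularies.

```python
# Hypothetical per-municipality query results; field names follow Table 2.
queries = {
    "Mautern an der Donau": [
        {"BDA_FO_Nummer": "A-1", "BDA_Kategorie": "Siedlung ", "BDA_Datierung": "römisch"},
        {"BDA_FO_Nummer": "A-2", "BDA_Kategorie": "grab", "BDA_Datierung": "Römisch"},
    ],
    "Traismauer": [
        {"BDA_FO_Nummer": "B-1", "BDA_Kategorie": "SIEDLUNG", "BDA_Datierung": "unbekannt"},
    ],
}

# Hypothetical controlled vocabularies: cleaned raw value -> standard term.
CATEGORY_VOCAB = {"siedlung": "settlement", "grab": "burial"}
PERIOD_VOCAB = {"römisch": "Roman Antiquity", "unbekannt": "undated"}


def normalize(value, vocabulary):
    """Trim and lower-case a raw value, then look it up in the vocabulary."""
    key = value.strip().lower()
    return vocabulary.get(key, "unassignable")


def aggregate(query_tables):
    """Merge all query tables into one list of normalized entries."""
    merged = []
    for municipality, rows in sorted(query_tables.items()):
        for row in rows:
            merged.append(
                {
                    "municipality": municipality,
                    "BDA_FO_Nummer": row["BDA_FO_Nummer"],
                    "category": normalize(row["BDA_Kategorie"], CATEGORY_VOCAB),
                    "period": normalize(row["BDA_Datierung"], PERIOD_VOCAB),
                }
            )
    return merged


table = aggregate(queries)
for entry in table:
    print(entry)
```

Values that match no vocabulary term fall back to a single catch-all label in this sketch, so spelling variants never silently create new categories.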

Tailored controlled vocabularies were used for the standardized qualitative and chronological attribution of the archeological objects. After the BDA-FSDB data had been reworked, each entry in the revised data table no longer represents a “BDA-FSDB-entry” but instead corresponds to a newly defined “RRLN-feature.” After GIS verification, “RRLN-findspots” were identified using revised coordinates from the BDA-FSDB. These were then clustered in the GIS to create the “RRLN-sites” for the RRLN-DB. All data were stored in the GeoPackage geodatabase.
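The text does not specify how find spots were clustered into sites in the GIS. Purely as an illustration, the sketch below groups points by a distance threshold (single-linkage clustering implemented with a union-find structure) and derives per-cluster values named after the Table 2 headers `Mean_X`, `Mean_Y`, and `Cluster_size`. The 150 m threshold, the coordinates, and the reading of `Mean_X`/`Mean_Y` as arithmetic means of a cluster's find spots are assumptions.

```python
import math


def cluster_findspots(points, max_distance):
    """Group points so that points within max_distance are chained into one cluster."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= max_distance:
                parent[find(j)] = find(i)

    clusters = {}
    for index, point in enumerate(points):
        clusters.setdefault(find(index), []).append(point)
    return list(clusters.values())


def summarize(cluster):
    """Return the attributes a site row might carry for one cluster."""
    xs = [x for x, _ in cluster]
    ys = [y for _, y in cluster]
    return {
        "Mean_X": sum(xs) / len(xs),
        "Mean_Y": sum(ys) / len(ys),
        "Cluster_size": len(cluster),
    }


# Invented planar coordinates in meters (as in a projected system like EPSG:31256).
findspots = [(0.0, 0.0), (100.0, 0.0), (200.0, 0.0), (1000.0, 1000.0)]
sites = [summarize(c) for c in cluster_findspots(findspots, max_distance=150.0)]
print(sites)
```

Single linkage chains neighbors together, so the first and third points end up in the same cluster even though they lie 200 m apart; a different clustering rule would yield different sites.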

(4) Results and discussion

(4.1) Results

The RRLN-DB comprises 7,694 features, with 6,924 being spatially locatable. Most features within all municipalities come from the Middle Ages (n = 1,773), Roman Antiquity (n = 1,484), the Modern Age (n = 1,293), and the Stone Age (n = 1,181), with fewer from the Bronze Age (n = 722), and the least from the Iron Age (n = 624). Although smaller in number, features of unknown date (n = 494) and unassignable features (n = 123) also make up a notable portion. For the AoI, which contains 5,030 features, Roman Antiquity is most represented (n = 1,184), followed by the Middle Ages (n = 1,108), the Stone Age (n = 746), and the Modern Age (n = 716). The least represented are the Bronze Age (n = 458), Iron Age (n = 419), undated (n = 307), and unassignable features (n = 92) (Figures 6 and 7).
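The period counts reported above can be cross-checked against the stated totals, and relative shares can be derived from them. The short sketch below uses only the figures given in this section; the dictionary and function names are illustrative.

```python
all_municipalities = {
    "Middle Ages": 1773,
    "Roman Antiquity": 1484,
    "Modern Age": 1293,
    "Stone Age": 1181,
    "Bronze Age": 722,
    "Iron Age": 624,
    "undated": 494,
    "unassignable": 123,
}
area_of_interest = {
    "Roman Antiquity": 1184,
    "Middle Ages": 1108,
    "Stone Age": 746,
    "Modern Age": 716,
    "Bronze Age": 458,
    "Iron Age": 419,
    "undated": 307,
    "unassignable": 92,
}


def shares(counts):
    """Return each period's percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {period: round(100 * n / total, 1) for period, n in counts.items()}


total_all = sum(all_municipalities.values())
total_aoi = sum(area_of_interest.values())
print(total_all, total_aoi)
print(shares(area_of_interest)["Roman Antiquity"])
```

The per-period counts add up to the stated totals of 7,694 and 5,030. Roman Antiquity accounts for roughly 23.5% of the features in the AoI, compared with about 19.3% across all affected municipalities.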

Figure 6 

Number and temporal distribution of the features for all affected municipal areas (n = 7,694) and the AoI (n = 5,030).

Figure 7 

GIS-based visualization of the corresponding features per period across the AoI and municipal territories (map: D. Hagmann 2023; data: BDA; BEV; Land NÖ).

The data reveals that in the AoI and surrounding political municipalities, there are no administrative units without archeological objects (Figure 8). This is further confirmed at the smaller cadastral municipality level, where only a small part shows no findings (Figure 9). Despite some “background noise” seen almost everywhere in the AoI, features concentrate mainly on sections associated with intensive construction work.

Figure 8 

Localized features (n = 6,924) per political municipality, revealing that there are no administrative units in the AoI and surrounding municipalities without archeological objects (map: D. Hagmann 2023; data: BDA; BEV; Land NÖ).

Figure 9 

Localized features (n = 6,924) per cadastral municipality, indicating that despite some “background noise”, archeological objects are mainly concentrated in settlement centers and further sections associated with intensive construction activity, especially freeways (map: D. Hagmann 2023; data: BDA; BEV; Land NÖ).

(4.2) Discussion

(4.2.1) A history of research: understanding the dataset

The RRLN-DB’s crucial data source, the BDA-FSDB, merits detailed discussion. Its data trace back to the mid-19th century, as archeological objects in Austria have been systematically recorded since the 1850s. The BDA-FSDB originated from the analog Central Finds File (“Zentrale Fundstellenkartei”; BDA-ZFSK) created by H. Adler in 1965 as a card index system. In 1995, C. Mayer replaced the BDA-ZFSK with the BDA-FSDB, initially designed as a relational database. Despite early GIS considerations, GIS integration was delayed for various reasons, and a GIS client-server application was later added in parallel. The core of the BDA-FSDB, archeological knowledge, is captured through standardized categories and free text. This includes both entire structures and individual objects, requiring data aggregation for a uniform evaluation at the site level. Furthermore, non-archeological objects, such as geofacts, were also recorded. The BDA-FSDB provides structured information on archeological objects, including dating, detailed find history, literature collection, and current storage locations, elements initially derived from the BDA-ZFSK. C. Mayer’s categorization defines “findspots” (“Fundstellen”) as landscape segments of human use and “find locations” (“Fundplätze”) as specific areas within findspots evidencing past human activities, adding to the system’s complexity. By 2016, the BDA-FSDB recorded 18,860 findspots and 52,083 find locations across 85% of all Austrian cadastral municipalities. The data quality varies due to different collection methods and research standards over time. Notable increases in recorded findspots were influenced by the 1923 Monument Protection Law, post-WWII archeological research, and environmental impact assessments since the 1990s.

Chronology is crucial in the BDA-FSDB, with varying periods and epochs reflecting the lifespan of an archeological record rather than precise moments. These datings, often provisional and sometimes contributed by “citizen scientists,” indicate the state of research at the time of the last data edit.

Notably, C. Mayer published significant research on the BDA-FSDB during the 2000s, defining its general purpose: to record archeological objects. In recent years, publications have primarily addressed the further development of the BDA-FSDB, such as projects for the GIS-based cartographic recording of the BDA-FSDB’s contents and the online dissemination of archeological information within the framework of various focal point projects. There have also been focused efforts on the successor project, the Heritage Information System (HERIS), which has been gradually replacing the BDA-FSDB since 2020. While the BDA-FSDB was exclusively designed for archeological data, HERIS is intended to handle all cultural assets in Austria, including art-historical ones.

It was initially assumed that the 5,010 archeological sites from the BDA-FSDB queries corresponded to actual locations with coordinates. However, many sites either lacked coordinates or had erroneous ones. Attempts to geocode these sites were only partly successful, leading to the exclusion of 771 sites without coordinates from the geodatabase. An alternative method, assigning sites to the geometric centers of their respective cadastral municipalities, was considered but dismissed because it could introduce bias and distort spatial analyses. Instead, data without coordinates were selectively included for their qualitative information in a separate table in the RRLN-DB and used for manual assessments where necessary. In any case, entries in the BDA-FSDB that could not be spatially located often represented unclear, uncertain, or speculative data, owing more to imprecise source content than to database inaccuracies (Figure 10).

Figure 10 

BDA-FSDB findspots without coordinates per political municipality across the AoI (n = 771): (1) Krems an der Donau; (2) St. Pölten; (3) Aggsbach; (4) Bergern im Dunkelsteinerwald; (5) Dürnstein; (6) Furth bei Göttweig; (7) Gedersdorf; (8) Maria Laach am Jauerling; (9) Mautern an der Donau; (10) Paudorf; (11) Rossatz-Arnsdorf; (12) Spitz; (13) Weißenkirchen in der Wachau; (14) Bergland; (15) Bischofstetten; (16) Dunkelsteinerwald; (17) Erlauf; (18) Golling an der Erlauf; (19) Hürm; (20) Kilb; (21) Kirnberg an der Mank; (22) Klein-Pöchlarn; (23) Krummnußbaum; (24) Leiben; (25) Loosdorf; (26) Mank; (27) Marbach an der Donau; (28) Melk; (29) Persenbeug-Gottsdorf; (30) Petzenkirchen; (31) Pöchlarn; (32) Ruprechtshofen; (33) St. Leonhard am Forst; (34) Schönbühel-Aggsbach; (35) Schollach; (36) Ybbs an der Donau; (37) Zelking-Matzleinsdorf; (38) Texingtal; (39) Emmersdorf an der Donau; (40) Böheimkirchen; (41) Gerersdorf; (42) Hafnerbach; (43) Haunoldstein; (44) Herzogenburg; (45) Inzersdorf-Getzersdorf; (46) Karlstetten; (47) Markersdorf-Haindorf; (48) Neidling; (49) Nußdorf ob der Traisen; (50) Ober-Grafendorf; (51) Obritzberg-Rust; (52) Prinzersdorf; (53) Pyhra; (54) St. Margarethen an der Sierning; (55) Statzendorf; (56) Traismauer; (57) Weinburg; (58) Wilhelmsburg; (59) Wölbling; (60) Oberndorf an der Melk; (61) Purgstall an der Erlauf; (62) Steinakirchen am Forst; (63) Wieselburg; (64) Wieselburg-Land; (65) Wolfpassing; (66) Grafenwörth; (67) Kirchberg am Wagram; (68) Sitzenberg-Reidling; (69) Zwentendorf an der Donau; (70) Wang; (71) Kapelln; (72) Hofstetten-Grünau; (73) Scheibbs (map: D. Hagmann; data: BDA; BEV; basemap.at).

Upon examining the spatial locations and the thematic content of the 5,010 BDA-find spots provided in the BDA-FSDB dataset, challenges emerged with handling the assigned BDA-attributes. The BDA-attributes describe crucial archeological properties of each object. However, character limitations in the corresponding fields of the queries meant that attributes were displayed only incompletely. Consequently, issues arose with the classification of archeological objects. To address this, the partially included BDA-attributes were compared across all queries, completed, and mapped to the aforementioned controlled vocabulary to establish standardization. As a result, for the 5,010 BDA-find spots, 7,694 attribute entries, corresponding to the 7,694 RRLN-features, were extracted and finally consolidated into 187 attribute entries.

Besides the classification-based information, the BDA-FSDB also contains “cultural” data (e.g., “Roman” or “Germanic”), indicating the BDA-FSDB’s role as a social-archeological interpretation tool. To avoid the numerous challenges associated with the complex concept of “cultural affiliation,” particularly regarding controversial topics such as the ongoing debate on the “Roman Way of Life,” these categories were not considered further.

(4.2.2) Selecting the “right” repository: opportunities and challenges

Given the qualitative and quantitative framework of the data, the approach chosen for long-term archiving warrants discussion. Specialized repositories such as PHAIDRA are crucial in meeting regional scholarly needs by providing tailored services and fostering local academic engagement. They adeptly preserve cultural and scholarly output, offering personalized support. However, Austria has no dedicated archeological repository comparable to the United Kingdom’s Archaeology Data Service (ADS) for archiving specialized research data. Instead, several institutional repositories with a broad thematic scope are hosted in Austria alongside PHAIDRA. The most prominent of these is A Resource Centre for the HumanitiEs (ARCHE), run by the Austrian Academy of Sciences, which caters to the broad arena of Digital Humanities.

In archeology, datasets from projects like RRLN are often stored in local repositories, highlighting the strong intrinsic link between archeological data and their geographical context, a practice that differs from other scientific disciplines. Hence, “regional projects” that provide “regional data” often align with local digital infrastructures, since local storage can improve the findability of data, particularly for local research efforts. Broad dissemination beyond the region can additionally be facilitated through publication in international, peer-reviewed journals or the use of scientific social network sites. Yet local repositories encounter challenges in global discoverability and accessibility, contributing to a segmented information landscape. Smaller repositories, in particular, face difficulties with interoperability and sustainability. As a result, standardizing practices and fostering collaborations are crucial for integrating these repositories into the global research community.

The chosen repository, PHAIDRA, enables the archiving and dissemination of scholarly work across disciplines and formats. Its sustained operation for over a decade demonstrates long-term viability and stability. The technical framework of PHAIDRA enhances online discoverability, addressing the challenges of regionalized information. It exemplifies the benefits of local databases in international academic research, contributing significantly to wider scholarly endeavors. The RRLN project, funded like PHAIDRA by the University of Vienna, uses this infrastructure to enhance funding efficiency and data preservation, thereby improving research integrity and reliability. RRLN’s approach includes internal considerations that influence conceptual designs and execution, based on conditions arising from various factors. Furthermore, PHAIDRA’s influence extends beyond regional limits by supporting the FAIR principles, promoting collaboration locally and globally through open science.

(5) Implications and Applications

The project’s careful data selection guarantees the preservation and availability of new insights into rural settlement in Northern Noricum while avoiding redundancy with widely accessible data. The RRLN-DB, unlike a printed catalog, consists of a dataset that can be dynamically updated with new data. This dataset allows for controlled modifications, such as error corrections, with changes documented via version history in PHAIDRA. In the repository, each object is stored permanently, and new versions are added as separate items linked to the original, ensuring that no data is deleted or overwritten. This method ensures maximum transparency and traceability, assuming PHAIDRA operates flawlessly. Where legally permissible, the data is freely and openly reusable for long-term use. Thus, for the first time in the study area, the entire long-term archived dataset was made sustainably and freely available online under the CC BY 4.0 license, as far as possible. This robust and lightweight collection of data, archived “FAIRly” and consisting of interrelated tables, ensures the application of the principle “as little as possible, as much as necessary,” preserving only what is essential for further research while avoiding unnecessary redundancy. In addition, by encouraging meaningful follow-up work based on a reuse concept, it aims to maximize the value of this unique dataset and foster an environment of collaborative, progressive scholarship in the study of rural settlement in ancient Noricum. Furthermore, the implications of this data curation strategy may extend beyond the immediate research context. The approach used here could inform similar initiatives in other disciplines, highlighting the potential for improved efficiencies in data management and facilitating measured advances in historical and archeological research methods.
As a practical example of data reuse, a recently published bioarcheological study has already utilized the controlled temporal vocabulary from this dataset.
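The versioning behavior described in this section, in which new versions are stored as separate items linked to the original and nothing is overwritten, can be sketched conceptually in Python. This is only an illustration of the append-only principle: the class names, method names, and identifiers are hypothetical and do not represent PHAIDRA’s actual data model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: a deposited object cannot be modified
class ArchivedObject:
    identifier: str
    payload: str
    previous: Optional[str] = None  # link to the version this one supersedes


class AppendOnlyArchive:
    """Hypothetical archive in which objects are only ever added."""

    def __init__(self):
        self._objects = {}

    def deposit(self, identifier, payload, previous=None):
        if identifier in self._objects:
            raise ValueError("existing objects are never overwritten")
        if previous is not None and previous not in self._objects:
            raise KeyError("a new version must link to an archived object")
        self._objects[identifier] = ArchivedObject(identifier, payload, previous)

    def history(self, identifier):
        """Return identifiers from the given version back to the original."""
        chain = []
        current = identifier
        while current is not None:
            obj = self._objects[current]
            chain.append(obj.identifier)
            current = obj.previous
        return chain


archive = AppendOnlyArchive()
archive.deposit("obj-1", "table, first release")
archive.deposit("obj-2", "table, error corrected", previous="obj-1")
print(archive.history("obj-2"))
```

Because a correction is deposited as a new object rather than applied in place, the earlier state remains retrievable and citable, which is the basis for the transparency and traceability noted above.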