Digitization of Croatian Latin writers: a research project proposal
Neven Jovanović
(neven.jovanovic_AT_ffzg.hr)
Mar 21, 2006
Digitization of Croatian Latin writers here means transforming their texts into a
machine-readable format accessible over the internet, thereby creating a database,
corpus, and collection of research material and tools named Croatiae auctores
Latini (CAuLa). The CAuLa research space and digital collection aims to
provide one place to find and use much of the otherwise scattered textual and other
primary and secondary material by and about Croatian writers who used Latin as their
literary medium. The CAuLa corpus is intended to simplify and enhance access to materials
which are often rare, precious, and fragile (manuscripts, early printed books). At
the same time, by providing digital versions of the documents, the corpus will help
preserve these materials for future use and research. The globally accessible workspace
and forum eventually created by CAuLa will provide tools and a medium for
further insights into this segment of Croatian, and European, literature, culture,
and intellectual history. The experience gained from building CAuLa (amply
documented and shared) will remain available to other digitized
heritage collections - both Croatian and Latin, textual and otherwise.
2 The state of research
- Croatian Latin heritage. Research on the writings of Marko Marulić
shows what remarkable discoveries can still be made today concerning all aspects of
the life and work of a Croatian author. The Lexicon of Croatian Writers (Leksikon
hrvatskih pisaca, Školska knjiga, 2000) demonstrates both the share and the role that
Croatian writers in Latin have in the national cultural heritage and literary production.
Still, texts of these writers - even when reliably edited and adequately published -
remain too dispersed, since publishing is often left to (much appreciated) private
and local initiatives (cf. again the example of Marko Marulić, whose Opera omnia,
now in its 15th volume, is being published by the Splitski književni krug -
legally, a citizens' organization based in Split, Croatia).
- Digitizing textual heritage. Digital collections broaden the scope of
research; a computer-searchable and manipulable text may open new ways of reading
and study. Supported by fast networks, such collections become available globally
- everywhere and at all times - not only to research institutions, but also to
schools, public libraries, and private homes. A global information infrastructure may
enhance our own cultural identity - and help us learn about completely different
identities. But literary texts from the past require approaches different from
those now common on the internet: we have yet to explore how a stimulating and
useful workspace for prolonged and intensive textual study should look (and feel).
- Digitizing cultural heritage in Latin. The leader here is the Perseus
Project (http://www.perseus.tufts.edu/), "an evolving digital library,
engineering interactions through time, space, and language", freely available, and
comprising, among other material, classical Greek and Latin texts, both literary
and non-literary (papyri and ostraka), as well as images (Greek vases) and
secondary literature; the Perseus Project solved important problems with
representing the ancient Greek writing system on the internet, with connecting
texts and secondary literature, with presenting statistical data; it offers a
stimulating workspace for research and teaching.
- Digitizing neo-Latin heritage. Let us quote a minimal and a maximal
example. The Italian corpus "Poeti d'Italia in lingua latina"
(http://157.138.65.54:8080/poetiditalia/) offers textual searches on some 200
Italian authors (and some Croatian, too) from the Middle Ages and the early modern
period, with minimal bio-bibliographical information, with no possibility of adding
one's own materials, comments or translations, and with a somewhat unfriendly way of
defining subgroups of the corpus for searching purposes. On the other hand, the German
MATEO / CAMENA corpus (http://www.uni-mannheim.de/mateo/camenahtdocs/camena.html)
seems to be more a repository than a database, but it is dynamic: it grows steadily,
improving the markup and adding metadata and images, even though its search interface
is also somewhat unwieldy.
3 Aims of the CAuLa collection
- Establish a framework for collecting research and material on Croatian neo-Latin
writers; at the moment both research and editions are dispersed too widely.
- Publish on the internet a corpus of some 2 million Latin words, by 130 Croatian
authors writing in Latin from the Middle Ages onwards.
- Prepare the corpus by digitizing some 15,000 pages of already published text
(published in old, rare, or not easily accessible editions) and 1,000 pages from
manuscripts.
- Enhance this material by additional proofreading, metadata, and markup (following
the non-proprietary TEI XML standard), and by other types of media.
- Support and encourage new scholarship, both in-depth research (made possible by a
computer-searchable corpus) and wide syntheses and interpretations (which will rely on
more precise, more explicit and controllable data, and will therefore be much more
falsifiable).
- Remind people of the material dimension of these texts.
- Open the collection to other kinds of research, and to a community of researchers,
both international and Croatian, from various disciplines (history, art history,
anthropology, etc.).
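To make the aim of adding metadata and TEI XML markup more concrete, the following is a minimal sketch of how one digitized text might be wrapped in a TEI document. It is an illustration under stated assumptions, not the project's actual encoding scheme: the function name build_tei, the placeholder verse lines, and the choice of the current TEI P5 namespace are all assumptions (the proposal does not say which TEI version or which elements would be used). Only the Python standard library is needed.

```python
import xml.etree.ElementTree as ET

# Assumption: the current TEI P5 namespace; the proposal does not name a TEI version.
TEI_NS = "http://www.tei-c.org/ns/1.0"


def q(tag):
    """Return the namespace-qualified name of a TEI element."""
    return "{%s}%s" % (TEI_NS, tag)


def build_tei(title, author, source_note, verse_lines):
    """Build a small TEI document: a header with metadata, then numbered verse lines.

    Hypothetical helper for illustration only.
    """
    ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace
    tei = ET.Element(q("TEI"))

    # Metadata: teiHeader > fileDesc > titleStmt / publicationStmt / sourceDesc
    file_desc = ET.SubElement(ET.SubElement(tei, q("teiHeader")), q("fileDesc"))
    title_stmt = ET.SubElement(file_desc, q("titleStmt"))
    ET.SubElement(title_stmt, q("title")).text = title
    ET.SubElement(title_stmt, q("author")).text = author
    publication = ET.SubElement(file_desc, q("publicationStmt"))
    ET.SubElement(publication, q("p")).text = "Unpublished sample."
    source = ET.SubElement(file_desc, q("sourceDesc"))
    ET.SubElement(source, q("p")).text = source_note

    # Text: text > body > lg (line group) > l (verse line, numbered with @n)
    body = ET.SubElement(ET.SubElement(tei, q("text")), q("body"))
    line_group = ET.SubElement(body, q("lg"))
    for number, line in enumerate(verse_lines, start=1):
        ET.SubElement(line_group, q("l"), {"n": str(number)}).text = line
    return tei


# Placeholder content only; no real verses are quoted here.
sample = build_tei(
    title="Sample poem (placeholder)",
    author="Marko Marulić",
    source_note="Placeholder note on the printed edition or manuscript.",
    verse_lines=["[first verse line]", "[second verse line]"],
)
xml_text = ET.tostring(sample, encoding="unicode")
print(xml_text)
```

Keeping the metadata in the header and the numbered lines in the body is what would let a search result cite author, work, and line number, and trace each digital text back to its printed or manuscript source.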