The Israel Academy of Sciences and Humanities founded the Hebrew Palaeography Project in 1965 on the proposal of Malachi Beit-Arié, inspired by Colette Sirat of the Institut de Recherche et d’Histoire des Textes (IRHT), CNRS, Paris, and in cooperation with the IRHT, which adopted her initiative. Professor Gershom Scholem supported the establishment of the project and chaired its steering committee until the end of his life. The goal of the project was to locate all the medieval codices written in Hebrew script that contain explicit production dates or at least scribes’ names; to study and document all their visual and measurable material features and scribal practices in situ, i.e. in the libraries in which they are kept; and to classify these features and practices in order to reveal a historical typology of the hand-produced Hebrew book and to provide users of Hebrew manuscripts with a tool for identifying the production region and assessing the period of the manuscripts they study. Since the initiation of the project, almost all the dated manuscripts that were located have been studied and documented in some two hundred and fifty libraries and private collections.

The project was sponsored by the Academy in Jerusalem and headed by Malachi Beit-Arié. As early as the 1970s, at an early stage of computing, it began to convert the accumulated mass of bibliographical, textual, numerical and palaeographical data, together with images of selected pages of each manuscript, into a computerised database, initially using punch cards. Over the years the dozens of fields and hundreds of variables of the documentation expanded and brought about continuous changes in the database, while successive transformations of computing technologies, culminating in the internet revolution, imposed many platform conversions. The database was constructed by Eylon Meroz, who responded creatively to the project’s growing needs and goals and has accompanied the database inventively from its inception through all its stages. He and his team at Meroz Software Systems developed a sophisticated retrieval system to meet our requirements for complex queries. The system assists in identifying the provenance of the tens of thousands of undated manuscripts and in assessing their period, based on shared combinations of codicological features and scribal practices and on comparison of their script. Moreover, it enables statistical processing, which facilitates the establishment of a historical typology of the medieval Hebrew codex and contributes to historical research and textual criticism.

Many people participated in the documentation, its processing and its computerisation, and contributed to the achievement of its goals during the fifty years of the project’s activity. First and foremost among them were Mordechai Glatzer, who contributed greatly to the construction of the early database and to the corpuses of the dated manuscripts published by the project; the late Lea Shalem, who documented many collections; Edna Engel, who was in charge of the processing of the script; and Tamar Leiter, who contributed greatly to the development of the database and its consolidation and oversaw its conversion into the website. Others who contributed considerably were Jonathan Joel, Shimon Iakerson, Nurit Pasternak and Alexander Gordin. The French team, headed by Colette Sirat, took part in documenting some of the collections, and the two teams cooperated.

From its inception, and according to an agreement with the Israel Academy of Sciences and Humanities, the project was housed in the Jewish National and University Library, and it remains there to this day, the library having since become the National Library of Israel. Furthermore, the SfarData website is integrated into the website of the National Library, and the conversion of the database was achieved thanks to the Library’s assistance, mainly in downloading and managing the images.

Brief explanation
The codicological data and the interface are presented in Hebrew or English, as selected by the user, but the textual data – names of the producers of the codex, colophons and most of the remarks – are rendered in Hebrew alone. Similarly, the detailed field questionnaires of some twenty pages (partly abridged), which were scanned and integrated within the manuscript records, were filled in by hand in Hebrew, though the form itself is printed in Hebrew, French and English. These documenting questionnaires also contain data that are not encoded; such data are unavailable in the database and inaccessible to the retrieval system, yet they are valuable codicological and palaeographical attributes. Among them are text contents, notes, detailed descriptions, bibliographical remarks added over the years, and peri-textual and para-scriptural graphic marks such as the singling out of words and captions, various textual markers, decorations of catchwords, substitutes for the Tetragrammaton and line fillers. The images tab gives access to selected pages of each codicological unit (or of each scribe who participated in the copying) and to its questionnaire. In addition, detailed descriptions of manuscripts included in the corpus of dated manuscripts in the libraries of France and Israel and in the corpus of all the dated manuscripts until 1200 that were published by the project, as well as entries of manuscripts included in recent catalogues in which codicological and palaeographical identifications were based on the project, were scanned and made accessible. These publications are either in French and Hebrew, or in English.
* The database does not contain manuscripts, but codicological units that do not necessarily coincide with the entire manuscript. Many codices comprise several codicological units bound together: manuscripts which were produced in different locations and periods and were copied in different hands, and even in different types of script, from the dated part of the codex. Even when a whole codex is dated (or its scribe’s name is indicated in it) and was produced in a certain place and time, later owners may have added a variety of texts on leaves and pages that were left blank, completed its lacunae, or added a few quires. We isolated the copied part to which the colophon pertains and documented it alone. The delimitation of codicological units is indicated in only a small part of the notes, but can always be found on the first page of the questionnaire. The same applies to the delimitation of the sections copied by scribes who shared the copying of the same codex. In a manuscript that was copied by several hands, each scribe of a codicological unit is documented in a separate record.

* The database contains several corpuses. The user should select one (or more) of the following:
Documented dated manuscripts – The main corpus, which comprises the records of almost all the dated codices or parts of codices that were documented in the libraries that own them. Since each scribe of a multi-hand codicological unit has a separate record, it is essential, when searching for features that are shared by all the scribes of a codex, such as date and place, destination of the copying, text subject, writing material and dimensions, to neutralise the number of hands by selecting □ main scribe, which reduces the search to the record of the scribe who wrote the colophon. When searching for scribal practices such as the ordering system of the codex, ruling techniques, line management etc., there is no need to reduce the corpus.
Undocumented dated manuscripts – Manuscripts that were not studied in their libraries are recorded briefly and partially according to microfilm copies kept by the Institute of Microfilmed Hebrew Manuscripts at the National Library of Israel. Manuscripts of which no microfilms are available and which were not located are recorded according to catalogues or published studies.
Documented undated with identified scribe – The corpus contains manuscripts with undated colophons that include other data, such as the scribe’s or the patron’s name and sometimes the geographical locality. The dating of these manuscripts has been suggested by us and is based on dated manuscripts written by the same scribe. Similarly, the dating of manuscripts without colophons but written by a scribe we have identified is based on dated manuscripts written by that scribe. Many manuscripts without colophons are included in this corpus since the name of the scribe can be found highlighted in the copied text, mostly at the beginning or end of lines or by way of an acrostic. A selection of very early manuscripts was included, despite the absence of a colophon or an identified scribe, because of their importance.
Undocumented undated with identified scribe – The manuscripts were partially recorded based on microfilms.
All qualified records – All the above corpuses.
Disqualified – Manuscripts with colophons that were disqualified, primarily because of their dates.
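
The effect of the □ main scribe option described above can be illustrated with a minimal sketch. It is not the project's retrieval system: the record layout, the field names (unit_id, is_main_scribe, year, ruling) and the three sample records are all invented for illustration, on the assumption that each scribe of a multi-hand unit has one record and that exactly one of them is marked as the scribe of the colophon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScribeRecord:
    """Hypothetical record: one per scribe of a codicological unit."""
    unit_id: str          # invented identifier of the codicological unit
    scribe: str
    is_main_scribe: bool  # True for the scribe who wrote the colophon
    year: int             # shared feature: same for every hand of the unit
    ruling: str           # scribal practice: may differ from hand to hand


# Invented sample data: unit-A was copied by two hands, unit-B by one.
RECORDS = [
    ScribeRecord("unit-A", "scribe 1", True, 1280, "hard point"),
    ScribeRecord("unit-A", "scribe 2", False, 1280, "plummet"),
    ScribeRecord("unit-B", "scribe 3", True, 1280, "plummet"),
]


def search_shared(records, year, main_scribe_only):
    """Search by a feature shared by all hands of a unit (here the date)."""
    hits = [r for r in records if r.year == year]
    if main_scribe_only:
        # Keep one record per unit: that of the scribe of the colophon.
        hits = [r for r in hits if r.is_main_scribe]
    return hits


def search_practice(records, ruling):
    """Search by a scribal practice; every hand counts, so no reduction."""
    return [r for r in records if r.ruling == ruling]


all_hands = search_shared(RECORDS, 1280, main_scribe_only=False)
units = search_shared(RECORDS, 1280, main_scribe_only=True)
plummet = search_practice(RECORDS, "plummet")
print(len(all_hands), len(units), len(plummet))  # prints: 3 2 2
```

Without the reduction, the two-hand unit is counted twice in a search by date; with it, each unit is counted once. The search by ruling technique deliberately keeps every hand, since scribes of the same unit may differ in practice.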

* The manuscripts are classified according to six codicological-palaeographical geocultural types:
Orient – Egypt, Palestine, Syria and Lebanon, Eastern Asia Minor (East Turkey), Iraq, Iran and regions in Central Asia such as Uzbekistan and Bukhara.
Yemen – independent sub-type within the Orient.
Sefarad – The Iberian Peninsula, Provence and Bas Languedoc (Occitania), North Africa and Sicily.
Ashkenaz – German-speaking regions that were part of the Holy Roman Empire and its bordering regions such as Bohemia and Moravia, and later Poland; France (excluding the South) and England.
Italy – The Italian Peninsula.
Byzantium – Greece, the Balkans, Western Asia Minor and regions surrounding the Black Sea, territories which constituted part of the Byzantine Empire before its decline.

Map of the main division of the geocultural types of the Hebrew scribal practices and book script