The Vai writing system is a syllabary representing syllables and morphemes of Vai, a Mande language spoken by approximately 167,000 people in Liberia and neighbouring Sierra Leone [1]. What makes the Vai script especially interesting for scholars of both writing and cultural transmission is that it was invented by non-literates and has been continuously transmitted to the present day. It is unknown how many people are presently literate in the script, but for the period 1973–1978, Scribner and Cole [2] estimated that 20.3 percent of the adult male population in their fieldsite could read and write in Vai. In its present standard form, the Vai script comprises 205 individual graphemes.


Created in about 1833 by between six and eight non-literate Vai-speakers in Liberia, the Vai script represents the world’s best-documented emergent writing system. The surprising success of the script preceded, and to some extent inspired, the creation of multiple new indigenous writing systems across the West African region [3, 4, 5, 6, 7, 8], a phenomenon that continues into the twenty-first century [9].

Beyond Africa, scholars have long been intrigued by the Vai script, the circumstances of its invention, and its potential to illustrate cultural processes [notably 10, 11]. Later thinkers speculated that the evolutionary trajectory of the Vai script must have recapitulated the evolution of writing itself (for a summary see Kelly [12]).

A few thinkers have sought to substantiate these ideas by comparing the Vai syllabary across different time periods [13, 14, 15, 3]. However, their comparison charts selected only a handful of historical sources, or they compared a small subset of the syllabary. Svend Holsoe, late professor emeritus of anthropology at the University of Delaware and scholar of Liberia, began compiling a more expansive comparative chart but was unable to complete the work before he passed away [Charles Riley, pers. comm.]. To the best of our knowledge the present dataset includes every dateable manuscript source that is currently available in public archives.


One of the challenges in compiling the dataset has been in the interpretation of early sources produced by non-Vai visitors. These men did not always understand the phonology of the Vai language and thus assigned graphemes to incorrect syllables. Specific instances of this are noted in the data description below. As for the form of the graphemes, both P.E.H. Hair (cited in [3]) and Gail Stewart [16] expressed scepticism that the early sources produced by Europeans were faithful to the Vai script as it was used at the time. However, our chart shows that syllabaries elicited by outsiders are remarkably congruent with original text sources written by the Vai scribes themselves in the same time periods (see commentary below).

Our dataset was compiled by sourcing unedited digital images of Vai manuscripts, identifying individual graphemes and placing them in a table. The table contains the following fields to facilitate searching: manuscript name and date, Unicode number, sound value (transliteration into the IPA), Unicode transliteration (transliteration into Unicode orthography), IPA sound value, and syllable ending (to allow sorting via codas rather than onsets). The field ‘corpus frequency’ requires a little more explanation. The corpus is that provided by Rovenchak, Riley and Sherman [17], and we have simply provided a cumulative figure for the number of times each grapheme is attested in the four texts that they compiled, as a proxy for general frequency. The letter ‘L’ in our chart indicates that the frequency was too low to be included in their analysis.
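As a rough illustration of how these fields might be used, the following Python sketch models one row of the table and sorts rows by syllable ending. The field names (`manuscript`, `year`, `codepoint` and so on), the helper functions and the sample rows are all assumptions made for illustration; they are not the column headers or values of the published dataset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GraphemeRecord:
    """One row of the comparison table (field names are hypothetical)."""
    manuscript: str                  # manuscript name
    year: int                        # manuscript date
    codepoint: str                   # Unicode number, formatted as "U+XXXX"
    transliteration: str             # Unicode transliteration
    ipa: str                         # IPA sound value
    syllable_ending: str             # ending of the syllable, used for sorting
    corpus_frequency: Optional[int]  # None stands for the chart's 'L'


def parse_frequency(cell: str) -> Optional[int]:
    """Convert a frequency cell to an int, mapping 'L' to None."""
    cell = cell.strip()
    return None if cell == "L" else int(cell)


def sort_by_ending(rows):
    """Order rows by syllable ending first, then by transliteration."""
    return sorted(rows, key=lambda r: (r.syllable_ending, r.transliteration))


# Invented sample rows; names, IPA values and frequencies are placeholders.
rows = [
    GraphemeRecord("Example A", 1849, f"U+{ord('ꕭ'):04X}", "ga", "ga", "a",
                   parse_frequency("12")),
    GraphemeRecord("Example A", 1849, f"U+{ord('ꔷ'):04X}", "li", "li", "i",
                   parse_frequency("L")),
    GraphemeRecord("Example A", 1849, f"U+{ord('ꕞ'):04X}", "la", "la", "a",
                   parse_frequency("7")),
]

ordered = [r.transliteration for r in sort_by_ending(rows)]
```

Sorting on the ending before the transliteration groups all syllables that share an ending, which is the use the ‘syllable ending’ field is described as supporting.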

Dataset Description

Script samples that are presented in the table were retrieved from a wide variety of published manuscripts. All sources we have included are reliably dated, except for the Ndole manuscript from the Houghton Library, Harvard: the archival metadata gives an estimate of ca. 1845. We exclude sources of ambiguous origin as well as those copied or adapted from other existing sources. Very short texts are likewise excluded. For instance, the final dataset does not feature the seven-character house inscription copied by Edwin Norris in Cape Mount in 1848 and the accompanying ‘Specimen of Ms.’ [18], nor the brief letter of 26 graphemes collected by Oscar Baumann [19]. We have also excluded the original texts compiled in Ellis [20] since they cannot be reliably dated.

The manuscripts

In this section, we describe the circumstances in which the collected script samples were first documented. This is an essential step aimed at enabling researchers to assess the relevance of the data while addressing their own specific questions.

Gail Stewart has argued that some of the characters from early European sources were corrupted in the process of their documentation. She writes:

The appearance of the early Vai script was familiar to the Liberian Vais who taught me the modern script in the 1950s, but they found it largely illegible and even rather humorous. […] What amused the modern Vais was in reality a European interpretation of their script: that is, handwriting by Vais had been redrawn by Europeans so that it could appear in print. When the original manuscript of the ‘Book of Rora’ was turned up in 1967 in the Houghton Library of Harvard University, and, in the same year, the two-page Forbes manuscript, also in pre-1850 Vai, was ‘discovered’ in the British Museum, it became obvious that the foreign copyists, with all good intentions, had stylized and distorted the early Vai script to the point of absurdity, and sometimes beyond recognition.

Now we know that the difference between the old script and the modern is not as great as was supposed. [16, p. 1]

The two ‘corrupted’ texts that Stewart specifically mentions are Rora [21] and Koelle [22]. Neither is included in the table since they do not represent instances of primary script documentation. However, our dataset clearly shows that early documentation on the part of Europeans, including Koelle [22], is in fact highly consistent with samples provided by Vai scribes themselves within the same period. The difficulty that Stewart’s colleagues had in interpreting examples of the script is more likely due to the higher degree of synchronic variation witnessed in the 1850s; many allographic variants were to be eliminated, and no doubt forgotten, by the 1950s.

For a fuller historical context of the surviving nineteenth-century Vai manuscripts see Tuchscherer and Hair’s exhaustive commentary [23].

American Board of Commissioners for Foreign Missions (1834). “New invented native alphabet of Western Africa Recd. April 18, 1834 from Messrs Wilson and Wynkoop,” MS Vai 1, page 1. Reproduced by permission of the Houghton Library, Harvard University, and the United Church Board for World Ministries. Reproduced in Tuchscherer, K and Hair, P E H (2002). “Cherokee and West Africa: Examining the origins of the Vai script.” History in Africa 29: 427–486.

The earliest known sample of the Vai script is conserved at the Houghton Library at Harvard University. It was written by Fan Dawo Kelondo at the request of John Leighton Wilson. The text is only a page long and has not yet been translated, but due to its historical importance our table has excerpted every single character. Thus there are thirteen instantiations of ꕭ ‹ga›, while other relatively common characters (such as ꕌ ‹ha›, ꘈ ‹mɛ› and ꘊ ‹ɲɛ›) are not recorded at all. This maximal extraction from the source allows researchers the fullest possible scope for cross-character comparison in subsequent sources.

Stewart, G [ca. 1845] (1972). “The early Vai script found in the Book of Ndole.” In Conference on Manding Studies: Congrès d’Études Manding, 1–27. London: School of Oriental and African Studies.

Stewart’s paper is an analysis of an original hand-written copy of the Book of Ndole, also known as the Book of Rora, authored by Kaali Bala Ndole Wano. Here she helpfully extracts a full syllabary from this original text, which she dates as 1850 or earlier; we are using the Houghton Library date of ca. 1845.

Forbes, F E [1849] (1851). “Despatch communicating the discovery of a native written character at Bohmar, on the Western Coast of Africa, near Liberia, accompanied by a vocabulary of the Vahie or Vei tongue.” Journal of the Royal Geographical Society of London 20: 89–101.

This syllabary was collected by Lieutenant F. E. Forbes in 1849 and communicated in a dispatch to the Admiralty on 23 April of that year. The dispatch was later published in the Journal of the Royal Geographical Society of London. Forbes reports:

A lucky chance took me to a town called “Bohmar,” about 8 miles E. of Cape Mount, and there I met a man by the name of Mormorro Dualoo Wohgnae, a nephew of the King of Sugury, who possessed a manuscript and understood the language.

On this man consenting to live on board her Majesty’s ship, I undertook to arrange the inclosed vocabulary, having collected and classed all the characters his book contained. [24, pp. 90–91]

Koelle, S W (1849). Narrative of an expedition into the Vy country of West Africa and the discovery of a system of syllabic writing recently invented by the natives of the Vy tribe. London: Seeleys, Fleet Street; Hatchards, Piccadilly; J. Nisbet and Co., Berners Street.

This complete syllabary was produced in a village in Cape Mount by both Mɔmɔlu Duwalu Bukɛlɛ, one of the principal inventors of the script, and the German missionary S. W. Koelle, who wrote: ‘my landlord [Bukɛlɛ] began to copy his book. I, however, had to finish it, and Doalu Bukara [Bukɛlɛ] afterwards said to me, “White people can write better than black people; you must copy my book for me.” I gladly accepted the offer […]’ [25, p. 19]. Thus, the Vai characters in this source were produced under the supervision of a Vai scribe even if they were not all written by Bukɛlɛ himself.

Koelle’s account of the Vai script features occasional inaccuracies when it comes to assigning syllable values to graphemes. Thus, the graphical signs for the ‹l› syllable series are consistently misinterpreted as ‹d› sound values, i.e. ꕞ ‹la› is documented by Koelle as ‹da›, and ꔷ ‹li› entered his report as ‹di›. Similarly, Unicode ꕔ ‹kpa› and ꕕ ‹kpã› are both transcribed by Koelle as ‹gba›, and Unicode ꕭ ‹ga› as ‹ka›. The adjudication of such conflicting cases in our dataset was based on their graphical similarity to more reliable sources (e.g. the Ndole syllabary reconstructed in Stewart [16]; see above).
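The reassignments named in this paragraph can be recorded as a small lookup table, as in the Python sketch below. This only illustrates the outcome of the adjudication, not the graphical comparison by which it was reached. The names `KOELLE_CONFLICTS` and `standard_value` are hypothetical, and the table covers just the five graphemes mentioned here rather than the whole ‹l› series.

```python
# Graphemes whose sound values Koelle recorded differently from the
# standard reading, as listed in the paragraph above.
# Maps grapheme -> (value as documented by Koelle, standard value).
KOELLE_CONFLICTS = {
    "ꕞ": ("da", "la"),
    "ꔷ": ("di", "li"),
    "ꕔ": ("gba", "kpa"),
    "ꕕ": ("gba", "kpã"),
    "ꕭ": ("ka", "ga"),
}


def standard_value(grapheme: str, koelle_value: str) -> str:
    """Return the standard reading for a grapheme from Koelle's chart.

    Falls back to Koelle's own value when the grapheme is not one of
    the listed conflicting cases, or when the value passed in is not
    the one Koelle is reported to have given.
    """
    entry = KOELLE_CONFLICTS.get(grapheme)
    if entry is not None and entry[0] == koelle_value:
        return entry[1]
    return koelle_value
```

Keying the table on the grapheme rather than on Koelle's value matters because ‹gba› corresponds to two different graphemes, ꕔ and ꕕ, which cannot be told apart from the transcription alone.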

Payne, J S (1860). “Foreign missions of the Protestant Episcopal Church: Africa.” The Spirit of Missions 25: 365–383.

This source is a short note of 102 graphemes reproduced by J. S. Payne in the newsletter of the Protestant Episcopal Church. Payne writes in his letter of 6 June 1860: ‘Inclosed I send you a note received by [the missionary] Mr. [A. D.] Williams from a chief in the interior, which was translated for him by a Vey youth living near Mr. Williams’ residence’ [26, p. 369]. The translation is not provided in the text.

Creswick, H C (1868). “On the syllabic characters in use amongst the Vey negroes.” Transactions of the Ethnological Society of London 6: 260–263.

The circumstances and location in which Creswick collected his published syllabary are, unfortunately, quite unclear. Creswick wrote: ‘My remarks are confined to a mere recital of what I saw and heard whilst living amongst the Vey Negroes. […] Certain it is that old men now living, with whom I have conversed, remember the time of its creation; and it is from information received in this manner that I learnt the origin of this, the only native African orthography in existence’ [27, pp. 260–261]. Of interest, however, is the fact that Creswick documents the re-emergence of Vai schools: ‘There are, moreover, schools where the children are regularly taught [the Vai script] by means of small black boards and chalk’ [27, p. 261]. This suggests a possible pressure for standardisation of the script at this time. Koelle and Migeod report that Vai was taught in purpose-built schools from ca. 1835 but that these were destroyed in war eighteen months later [22, 28].

Delafosse, M [1889] (1899). Les Vaï: Leur langue et leur système d’écriture. Paris: Maison.

The first to comment on Vai as an evolving system, Delafosse makes important observations about Vai pedagogy and the high degree of individual variation in writing styles at the end of the nineteenth century, especially among competent scribes. He includes a table that compares the script as recorded by Forbes with the contemporary script. Only his ‘contemporary script’ is recorded in our dataset; his source is not explicitly stated but it may be Ghaīsama Sando, who provided a sample of Vai writing in the same book.

Massaquoi, M (1899). “The Vey language.” The Spirit of Missions 64: 577–579.

The chart produced by Momolu Massaquoi [29], a descendant of Bukɛlɛ, was published as part of a campaign to both reform and standardise the Vai script. To this end he invented new characters to fill what he took to be gaps, eliminated ‘ambiguous’ characters and introduced punctuation marks. This chart was to complement his efforts at introducing Vai into a local school (St. John’s in Robertsport). Interestingly, despite his reformatory zeal he retained a handful of acceptable allographs, presumably because they were already too well established in the scribal community.

Massaquoi, M [1899] (1911). “The Vai people and their syllabic writing.” Journal of the Royal African Society 10 (40): 459–466.

Of this chart Massaquoi writes: ‘I reproduce here the phonetic chart of this language compiled by the writer ten years ago, and published by the Board of Missions of the Protestant Episcopal Church of America in their monthly periodical, the Spirit of Missions’ [30]. However, this version is by no means a facsimile reproduction of the 1899 source: despite being penned by the same author there are numerous subtle variations. It includes a few interesting annotations on the suspected iconic origins of certain graphemes. These icons were discussed at some length by Klingenheben [31].

Johnston, H (1906). Liberia. Vol. II. London: Hutchinson & Co.

Like Delafosse before him, Johnston was conscious that the Vai script had evolved considerably since its invention, and that the modern forms were ‘simplified and better adapted for cursive writing’ [14, p. 1115]. To show this he compiled a chart comparing the forms documented by Forbes and Koelle with ‘modern types of the letters as accurately as [he] could obtain them from Vai scholars’ [14, p. 1115]. It is these ‘modern types’ that are excerpted in our dataset.

Migeod, F W H (1909). “The syllabic writing of the Vai people.” Journal of the Royal African Society 9 (33): 46–58.

Migeod’s consultants for his syllabary are unfortunately not recorded. He does, however, make a significant observation about the variability of the syllabary in the early twentieth century:

With the creation of new characters based on error it can be seen that the compilation of a standard syllabary is a matter of great difficulty. In fact, it can hardly be said that one exists. Accordingly I cannot claim for the accompanying syllabary that it is either complete or exact. I am frequently coming across new signs, without, unfortunately, always having the means of assigning to them a value. [28, p. 51]

Klingenheben, A (1933). “The Vai script.” Journal of the International African Institute 6 (2): 158–171.

Klingenheben’s essay comments on a number of the earlier sources. We have extracted graphemes from the appendix of characters cited in the article as well as the text of Psalm 23, provided by Zuke Kandakai.

Stewart, G (1958). [manuscript chart] reproduced in Dalby, D (1967). “A survey of the indigenous scripts of Liberia and Sierra Leone: Vai, Mende, Loma, Kpelle and Bassa.” In African Language Studies, edited by Malcolm Guthrie, 1–51. London: School of Oriental and African Studies.

Dalby’s Vai chart is a comparison of the Koelle manuscript of 1849 with the version provided by Gail Stewart in 1958, as well as the University of Liberia ‘standard’ syllabary of 1962 (see entry below). For our dataset we have extracted the graphemes provided by Stewart, which are described as having been derived from ‘a manuscript chart of the characters most widely accepted by modern users of the Vai script, compiled by [Stewart] in 1958’ [3, p. 6].

Kandakai, Z, Johnson, J S, Moore, B T and Massaquoi Fahnbulleh, F (1962). The standard Vai script. Liberia: The University of Liberia African Studies Program.

This document represents the first formal attempt to collaborate on a national standardisation project. It involved eleven consultants and a committee of four, headed by Zuke Kandakai [32]. The Standardization Committee also introduced new signs for ‘r’, ‘sh’, ‘sz’ and ‘th’. Contrary to Massaquoi’s 1899 standardisation attempt, no allographs are permitted: there is only one sign per syllable.

Scribner, S and Cole, M (1981). The psychology of literacy. Cambridge, Mass.: Harvard University Press.

Of their reproduced syllabary, Scribner and Cole write: ‘This is a modified version of the modern syllabary prepared at the University of Liberia by a group of indigenous Vai script experts and a foreign linguist’ [2, pp. 313–314]. However, it is clear that the chart is not a direct facsimile reproduction of the 1962 set from the University of Liberia. We assume that the discrepancies (or ‘modifications’) are in line with advice from Scribner and Cole’s informants. Interestingly, certain allographic variants are re-introduced here, suggesting that the University of Liberia standardisation campaign of twenty years earlier was not wholly successful.

Everson, M, Riley, C and Rivera, J (2005). “Proposal to add the Vai script to the BMP of the UCS.” Universal Multiple-Octet Coded Character Set International Organization for Standardization ISO/IEC JTC1/SC2/WG2 N2948R: L2/05-159R.

The most recent standardisation attempt is witnessed in the Vai Unicode proposal of 2005. The Vai script was added to the Unicode Standard in 2008. Everson et al. explain the sources for the Unicode version of Vai thus:

The primary sources for the Vai characters in the character set proposed are the 1962 Vai Standard Syllabary (which was a distillation of many sources specifying characters for modern use), modern primers and texts which use the Standard Syllabary (and a few glyph modifications reflecting modern preferences), the 1911 additions of Momolu Massaquoi, and the characters found in The Book of Ndole. Secondary sources, such as Johnston 1906 and Dalby 1967, are used as supplementary material and as checks for some of the archaic characters. [33, p. 2]

In summary, the 2005 set accepted by Unicode is derived from the 1962 standard with the addition of ‘a few glyph modifications’ and Massaquoi’s additions. The graphemes are rendered in Dukor, the first Vai font to emerge from the Unicode proposal.

Reuse Potential

The data compiled here have relevance to contemporary speakers and writers of Vai who wish to explore their cultural heritage and trace the history of their writing system. One of the most immediate practical advantages of the dataset is that it would allow manuscript historians to estimate the age of the many undated Vai manuscripts that are held in archives such as the Indiana University Liberian Collections and the Houghton Library, Harvard. Graphemes that have changed significantly, such as ꘂ ‹yɛ› and ꗞ ‹mɔ›, may well be diagnostic of specific time-spans in the history of the script. Beyond age estimates, the chart also provides an effective cypher for old manuscripts that may otherwise resist transliteration and translation on account of changes to the system. These changes include obvious alterations to graphic forms but also cases where graphemes disappeared from the system altogether and can no longer be interpreted by literate Vai.
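One way such an age estimate might be computed is sketched below in Python. This is not a method described in the present paper: it assumes that each grapheme variant has been assigned an earliest and a latest attested year, and it simply intersects those ranges for the variants found in an undated manuscript. The variant labels and year ranges are invented placeholders, not values taken from the dataset.

```python
from typing import Dict, Iterable, Optional, Tuple

YearRange = Tuple[int, int]

# Hypothetical attestation ranges (first year, last year) per variant.
# Labels and years are placeholders for illustration only.
ATTESTED: Dict[str, YearRange] = {
    "ye_variant_1": (1834, 1868),
    "ye_variant_2": (1860, 1962),
    "mo_variant_1": (1849, 1911),
}


def estimate_date_range(variants: Iterable[str],
                        attested: Dict[str, YearRange]) -> Optional[YearRange]:
    """Intersect the attested ranges of all recognised variants.

    Returns None when no variant is recognised or when the ranges do
    not overlap, in which case no estimate can be made this way.
    """
    ranges = [attested[v] for v in variants if v in attested]
    if not ranges:
        return None
    start = max(r[0] for r in ranges)
    end = min(r[1] for r in ranges)
    return (start, end) if start <= end else None


# An undated manuscript containing two of the placeholder variants.
estimate = estimate_date_range(["ye_variant_2", "mo_variant_1"], ATTESTED)
```

Because the surviving record is sparse, a range obtained this way would be at best a rough bracket: a variant may well have been in use before its first, or after its last, dated attestation.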

At present there is a growing interest in so-called emergent languages, such as the Nicaraguan and Bedouin Al-Sayyid sign languages [34, 35], and mixed languages like Light Warlpiri [36] and Gurindji Kriol [37]. Emergent sign languages have been developed ex nihilo by linguistic communities and are thus independent of any ‘parent’ languages and lineages. Mixed languages are also set apart because they involve a naturalistic re-engineering of existing linguistic structures to generate a new system. Since emergent languages (and to a lesser extent mixed languages) sit outside established language families, studies of these systems have the potential to reveal the spontaneous emergence of structure without the ‘noise’ of inheritance and contact. We contend that the Vai writing system has comparable value in tracing the evolution of graphic codes, a field of study that has so far been limited to laboratory settings.

Lastly, the independence of Vai from known script lineages may also help refine research into universals and variation in writing systems, e.g. [38, 39, 40, 41, 42]. We expect, for example, that documented changes to the Vai script will enrich discussions of how graphic codes make the transition from non-linguistic sign systems to full writing.

Limitations and Provisos

Although dated manuscripts may yet come to light in informal archives, our table is a compilation of dated Vai sources that have hitherto survived in the documentary record. It does not, and cannot, represent the complete history of the script. The full extent of variation and branching for certain graphemes, or sets of graphemes, may never be known. But while extinct grapheme lineages may be under-represented in our data, we can be confident that the graphemes attested from the 1960s onwards are those that survived selection pressures of earlier generations.