State-of-the-Art: Esperanto Linguistics

by Asya Pereltsvaig

Esperanto Linguistics, the linguistic study of Esperanto itself (its structure, internal history, and acquisition), is understandably at the core of Esperantic Studies. Esperanto is unusual in many ways: it was originally created artificially, in a highly multilingual environment. Moreover, it was designed with the express purpose of becoming a language of interlingual communication, a language easy to learn for people from the widest range of linguistic backgrounds. Although it never became a universal lingua franca, Esperanto now has up to 2 million users, according to the Ethnologue (see also Wandel 2015), and a sizeable number of native speakers (about 1,000, as reported in, e.g., Versteegh 1993, Corsetti 1996, Bergen 2001, Lindstedt 2006). Yet even for such native speakers, Esperanto is virtually never their only language. Its use is limited to certain domains, and for the great majority of the speakers, including native ones, Esperanto is not the dominant language. These facts may make Esperanto and Esperanto speakers useful in tests of the robustness of generalizations about linguistic typology, Universal Grammar, first and second language acquisition, language contact and creolization, variation and change.

However, relatively little linguistic (including psycholinguistic and sociolinguistic) work has been done to date on these topics. Currently existing descriptive grammars of Esperanto are mostly learner-oriented; for rare exceptions, see Wells (1989), in Esperanto; Gledhill (2000), a corpus-based descriptive grammar of Esperanto; and Jansen (1999), a popular scientific introduction to Esperanto. Only selected aspects of Esperanto grammar have received attention in the general linguistics literature, including its parts-of-speech system (Jansen 2013a), degree words (Dasgupta 1989) and adverbs (Dasgupta 1987), inflectional morphology (Li 1996), tense and temporal expressions (Dankova 1997, 2009, 2015), and reflexives (Jansen 2012, 2013b). The question of expressing aspectual distinctions using active and passive participles, the so-called -ata/-ita problem in Esperanto, is discussed in Fischer (2014). Another grammatical issue in Esperanto, with implications for linguistic politics, is that of gender: while grammatical gender concerns personal nouns, pronouns, and word-formation means, it has also provided fodder for debates on gender-fair language use, in Esperanto as in other languages spoken in Western countries. A number of reform proposals have been brought forward in response to objections from inside the Esperanto community and the feminist language movement in Western countries; yet, these proposals have not been adopted widely because the objections are not in agreement with the standard use of the language, as discussed in Fiedler (2014).

The phonology of Esperanto is even less well-studied than its morphology and syntax; as noted in van Oostendorp (1999: 52), the completely agglutinative nature of Esperanto, with “no allomorphy, no fusion, and … no assimilation or dissimilation rules” makes it uninteresting to many phonologists. However, van Oostendorp goes on to show that the phonological description of Esperanto is more illuminating than most phonologists have given it credit for. For example, the appearance of /v/ in onsets and rhymes indicates its dual status as an obstruent and sonorant. Moreover, van Oostendorp shows that Esperanto obeys the Sonority Sequencing Principle, the Principle of Maximal Differentiation, and the Principle of Prosodic Licensing, as well as other principles of syllabification found in other languages. Still, van Oostendorp (1999: 77) notes that although “the Esperanto syllable structure is of course very similar to that of Indo-European languages, more in particular to that of Romance and Germanic languages”, which “is not surprising, given the fact that most of the morphemes are borrowed from these languages”, it does not precisely match the syllable structure of any given Indo-European (or other) language (“the phonology of Italian comes close, but also this is still different”, writes van Oostendorp). It is, thus, interesting to consider where those patterns came from; as van Oostendorp (1999: 77) points out, “there are no reasons to assume that the details of phonological structure were of primary concern to [Zamenhof or other Esperanto pioneers]”.

(A treatment of Esperanto in the framework of Functional Discourse Grammar is found in Jansen 2015.)

While in some respects Esperanto can be treated linguistically as just any other language, a number of questions are raised by the unique nature of Esperanto as a constructed language for interlingual communication, particularly in relation to the typology of natural languages. Does Esperanto exhibit the linguistic properties and clusters of properties predicted by typological generalizations? Or does it have unexpected features, which may be a result of its peculiar history and use? For example, it has been pointed out (Sherwood 1982) that the accusative marker –n in Esperanto (see also Bergen 2001) may be seen as redundant because of the predominant nature of the fixed SVO word order (for a detailed discussion of word order in Esperanto, see Jansen 2006, 2007, 2008, 2009). Sherwood notes that “many fluent speakers drop the accusative marker in conversation”; yet, it has not disappeared from Esperanto entirely. Is this persistence of the seemingly redundant accusative marker due to language-internal or language-external factors? The explanation put forward by Sherwood (1982: 6) ascribes the persistence of the accusative marker to a language-external “social contract on the untouchability of the basic core of Esperanto” (this “social contract” is also discussed in Piron 1989, Pabst 2014). However, alternative, language-internal explanations remain to be explored (and ruled out, if the “social contract” theory is to be proven). For example, it is possible that the omission of the accusative marker in the speech of L2 Esperanto users is parallel to the omission of inflectional morphology by L2 speakers of other languages. Alternatively, it is possible that the presence of accusative case marking in a language need not correlate with free word order.
A quick examination of typological databases, such as the World Atlas of Language Structures, shows that numerous languages combine strict word order and case marking: Japanese, Tatar, and Chechen are merely some illustrative examples. While the coexistence of strict word order and case marking is particularly typical for Subject-Object-Verb (SOV) languages, such as the ones mentioned above, it also happens in some Subject-Verb-Object (SVO) languages, especially those with predominant but not fixed SVO order, like Russian. It therefore seems that a typological explanation for the persistence of the accusative marker in Esperanto is easily available and needs to be explored further. (Another study that examines Esperanto from the typological perspective is Comrie 1996.)

The unusual nature of Esperanto and its relation to the typology of natural languages raises another issue: since Esperanto was created with the intention of making it a means for interlingual communication, it was designed to be as simple as possible for speakers of (potentially all) other languages to learn. But is it? This issue is considered in detail in Charters (2015). It should be noted, however, that while there are no objective metrics for measuring the ease of learning a given foreign language (cf. Pereltsvaig 2011), many scholars agree that the closer a given language is to the learner’s native language, the easier it is to learn. (An alternative view is argued in Pool 1991: 83; with reference to Lenneberg 1957, he writes: “while an artificial language can be more similar to one group’s native language than to another and thereby easier for one group than for another group to learn, this effect appears to be minor; instead, regularity is the overwhelming determinant of learning effort”.)

Thus, theoretically, a language would prove to be the easiest to learn if it is the most like other languages, i.e. the most typologically neutral. However, Esperanto’s creator, Ludwik Zamenhof, did not have the benefit of knowing as much about linguistic typology as we know today. He did not have typological databases, such as the abovementioned World Atlas of Language Structures, at his fingertips to refer to, as did, for example, Paul Frommer and David J. Peterson, the creators of Na’vi (Avatar) and Dothraki (Game of Thrones), respectively. Because these inventors of sci-fi languages wanted to make their creations simultaneously human-like enough to be manageable for human actors and also alien enough to fit the genre, both Frommer and Peterson looked for sounds and structures that are within the bounds of what is possible in a human language, but are at the same time fairly exotic in languages of the world. Zamenhof’s goals were in a way exactly the opposite: to create a language as “un-exotic” as possible. As Zamenhof developed the basics of Esperanto without the benefit of contemporary typological knowledge, a fascinating issue is whether he succeeded in intuitively producing a language that is typologically neutral. Theoretically, typological neutrality can be defined as those properties or sets of properties that are most commonly instantiated by natural languages. One attempt at measuring typological neutrality (and hence “simplicity”) of Esperanto is found in Koutny (2015). However, measuring typological neutrality in practice turns out to be a tricky business, as can be seen from the discussion in Schnoebelen (2013) and Pereltsvaig (2013).

The question of typological neutrality can be narrowed to one continent: Esperanto is often “accused” of being “too European” (e.g. Allerton 2002: 150). Since Zamenhof was familiar mostly with languages of Europe and Southwest Asia, but not the languages indigenous to Australia, Oceania, sub-Saharan Africa, or the Americas, it is only natural to expect that his creation will be biased toward features typically found in European languages but perhaps “exotic” or “weird” from a wider typological perspective. This issue is examined in Parkvall (2010), where Esperanto is considered against the background of typological properties catalogued in the World Atlas of Language Structures, and it is concluded that “Esperanto is indeed somewhat European in character, but considerably less so than the European languages themselves” (p. 63).

Another important area of Esperantic linguistics concerns Esperanto as a native language. Although most people who speak Esperanto learn it as adults, around 1,000 native speakers have been reported (e.g., Versteegh 1993, Corsetti 1996, Bergen 2001, Lindstedt 2006). Documented cases of families using Esperanto with the children date back to 1919. In 1995, a coordinating body, Rondo Familia, was founded, and today approximately 350 families speaking Esperanto to children are documented (Corsetti 1996). Esperanto thus provides an excellent opportunity for linguists to examine what happens when a language “goes native”. Received wisdom among linguists holds that when a language acquires native speakers, it undergoes certain changes that bring it in line with Universal Grammar. One may thus ask whether such changes are observable with Esperanto-speaking children. However, Lindstedt (2006) argues that this is not so, showing that changes introduced by native speakers are typically due to one of three causes: transfers from the child’s other native language, differences between the spoken and written register of Esperanto, or incomplete acquisition. Lindstedt (2006: 47) concludes that “Esperanto has already been adjusted to the requirements of language universals or Universal Grammar in the process of being used by non-native speakers, which is why native speakers need not introduce immediate changes into it”.

First language acquisition of Esperanto, particularly of its phonology, is also studied in Versteegh (1993) and Bergen (2001). The latter, in particular, studied the naturally produced speech of eight children acquiring Esperanto as their L1 (i.e. they were being raised in households in which one parent addressed them primarily in Esperanto), ranging in age from six to fourteen years and speaking a variety of adstrate languages: Hebrew, Slovak, French, Swiss German, Russian, and Croatian. Bergen found that native Esperanto exhibits the following effects: loss of the tense/aspect system, phonological reduction, emerging predominance of the SVO order, loss or modification of the accusative case, pronominal cliticization, and changes in the placement of stress. As these effects are shared by children speaking different adstrate languages, they cannot be ascribed to transfer from the speaker’s other native language (except for the shifts in lexical stress, which seem to be affected by the stress system of the speaker’s other native language). In particular, Bergen (2001: 579-580) found that the loss of the aspectual system is observable in the speech of Russian- and Slovak-speaking native Esperanto speakers, despite those languages having a rich and complex system of aspectual marking. This loss of aspectual marking in native Esperanto cannot be ascribed to incomplete acquisition either since, as shown in Pereltsvaig (2004a, c), bilingual children undergoing incomplete acquisition (or “attrition”) of Russian retain some aspectual marking but use it to encode lexical rather than grammatical aspect. In contrast to the aspectual system, the loss of the accusative marking can be ascribed to interference from the speaker’s other native language, argues Bergen (2001: 583-584), as “while the Slovak … and Russian … speakers use the accusative most closely mirroring SE …, the French speaker … and the Hebrew speakers … are the least standard…”. 
The actual contexts in which different speakers use the accusative marking differ from speaker to speaker, ranging from the most idiomatic (i.e. saluton ‘hello’) to nearly flawless retention “show[ing] some minor ‘mistakes’, due most likely to inattention (or conversely, concentration on the task they were asked to perform)” (Bergen 2001: 585). (The influence of the speaker’s other native language on the acquisition of Esperanto is also studied in Nagata and Corsetti 2005 and Krägeloh 2009.)

The emergence of native Esperanto makes it a rare test case for theories of creolization. (Another widely discussed test case is Nicaraguan Sign Language, whose pidgin-to-creole development attracted much attention from linguists; see discussion in Pinker 1994, and elsewhere.) One theory, once universally accepted (Hall 1966, Bickerton 1983) but now questioned by some (Mufwene 2001, 2003, 2008, DeGraff 2009) yet still believed by most scholars (McWhorter 2014a, inter alia), is that creoles emerge when pidgins become the dominant language of a group and thence the mother tongue of a first generation of children. Comparing Esperanto to creoles, therefore, proves to be a fruitful avenue of research; see Parkvall (2008), which considers the place of the virtually exception-less Esperanto grammar in the continuum of language complexity (see also McWhorter 2009). Whether Esperanto grammar is indeed more regular than the grammars of natural languages, though assumed by Parkvall and many other researchers, is an empirical question that needs further research.

Esperanto is also compared to creoles in Bergen (2001), who notes that Esperanto differs from creoles in that “the greater part of [native Esperanto] speakers have parents of the same linguistic background (Corsetti 1996), thus not leading to the use of Esperanto as an emergency code … and that the Esperanto of parents is not very much like a pidgin language” (p. 587-588). (Parenthetically, I must note that Esperanto shares with many pidgins a curious property of overtly marking words for part-of-speech: all Esperanto nouns end in –o, all adjectives end in –a, all adverbs end in –e; similarly, in Russenorsk, for instance, nouns end in –a, whereas verbs end in –om; cf. Pereltsvaig 2012: 234.)

An explicit comparison of Esperanto and creoles is also to be found in Heil (1999), reviewed in Haitao (2001). In particular, Heil compares Esperanto (and two other planned languages) to three French-based creoles (those of Mauritius, Reunion, and Haiti) and shows that Esperanto exhibits a lesser degree of grammatical reduction than the creoles under consideration.

While issues related to its nativization and creole-like nature bring Esperanto into the realm of contact linguistics, the range of potential issues concerning the mixed-through-contact nature of Esperanto is considerably broader. Esperanto was created by a man who was himself a fluent speaker of several languages and was deeply immersed in the multilingual world of late 19th-century Eastern Europe, where Slavic languages (Russian, Belarusian, Ukrainian, and Polish) were spoken alongside Lithuanian, Yiddish, Romani, Karaim, and other languages. Virtually all speakers of Esperanto, including its native speakers (see above), are bilingual or multilingual, and speak a wide array of languages. It is therefore to be expected that interferences from these languages can be unwittingly brought into Esperanto, and indeed it has been shown to be the case (cf. Lindstedt 2006). This makes Esperanto an ideal test case for the study of language contact. This issue is further explored in Lindstedt (2009), who considers Esperanto from the perspective of contact linguistics and shows “that early Esperanto can be fruitfully discussed as a contact language which arose partly spontaneously, and which exhibits substratal traces of its Jewish and Slavonic background” (p. 1). (The connection between Esperanto and the Jewish world of Eastern Europe is discussed further in “State-of-the-Art: Esperanto History”.)

Another interesting issue, which only recently came into focus in contact linguistics, is the gender bias in language contact. Recent genetic studies (e.g. Forster and Renfrew 2011) examined genetic signatures in the Y-DNA and mtDNA (passed along the male and female lines, respectively) from individuals in communities that seem to have arisen from sex-specific migrations and observed a correlation between language and Y-DNA. For example, Icelanders trace their male descent to the Norsemen, but the contribution of Norse women to their mtDNA is comparatively modest; instead, a significant influx of mtDNA from Ireland has been documented (see Goodacre et al. 2005). The language of this mixed community, founded by male Vikings marrying Irish maidens, is a descendant of Old Norse, not of Irish Gaelic. The language of the men “won over” the idiom of the women. Forster and Renfrew (2011) found a similar pattern in coastal Papua New Guinea, where Austronesian languages co-exist with languages autochthonous to the region: “the Polynesian mtDNA level (40-50%) is similar in these areas regardless of language, whereas the Y chromosome correlates strongly with the presence of [Austronesian] languages”. The same patterns are found in India, the Americas, and Russia (see also Malyarchuk et al. 2004): the language of the men becomes the language of the entire mixed community. Yet, the women leave a trace in the resulting language as well, often in the form of deep grammatical patterns that are subconsciously introduced by female non-native learners and to which children are predominantly exposed. This source of contact-induced change has been explored in the case of English in McWhorter (2009) and for Yiddish in Pereltsvaig (2015). Esperanto fits into this paradigm in an interesting way: according to Corsetti (1996), in virtually all of the documented bilingual families that bring up children in Esperanto, it is the language spoken to the child by the father.
It would be interesting to examine what effect this gender-biased parallel acquisition has on native Esperanto.

The acquisition of Esperanto by adult speakers of other native languages or by children acquiring it alongside another language makes native Esperanto interesting from the perspective of bilingual language acquisition. Most research in that field is focused on the parallel acquisition of two “ethnic” languages (e.g. parallel acquisition of English and French in Canada) or acquisition of a sign language alongside an oral one. Acquisition of Esperanto alongside another “ethnic” language provides a different set of conditions to test theories of bilingual acquisition because it differs from both types of parallel acquisition mentioned above. Unlike in the parallel acquisition of two “ethnic” languages, Esperanto is typically acquired as a non-dominant language whose domain of application is rather limited. In this, it is similar to a sign language; however, unlike the latter, acquiring Esperanto does not involve a distinct medium (sign vs. oral). While one might expect that parents decide to speak Esperanto to their infants because that is the only language that the two parents have in common, this turns out to be a relatively rare situation. According to Corsetti (1996), only about a third of these families have parents from different ethno-linguistic backgrounds, while in the remaining two-thirds of the families the parents come from the same background and the bilingual acquisition involves Esperanto alongside only one other language, shared by both parents.

As Esperanto grammar is extremely regular, one may ask whether the normal patterns of regularization apply to (bilingual) child acquisition of Esperanto. Monolingual children who acquire natural “national” languages, such as English or Russian, have been noted to extend the application of grammatical rules or default grammatical morphemes. Thus, small children acquiring English know that the plural of wug is wugs; they also produce forms like “childs” and “mans”. Similarly, Russian children extend the use of “regular” (or at least more frequent) case endings such as –ov for the genitive plural (e.g. over-regularized jablokov for jablok ‘apples.GEN’). Do children acquiring Esperanto produce similar over-regularizations, and if so, when and why? These issues are addressed in Corsetti, Pinto & Tolomeo (2004), which reports the results of a study of diaries kept by Esperanto-speaking parents, tracing the development of five children, one to five years of age, who were brought up speaking Esperanto as one of their two or three (European) mother-tongues.

The emergence of Esperanto as a native language and its acquisition from parents who themselves do not speak it natively make Esperanto comparable to the so-called “revival” of Modern Hebrew. Both languages emerged as artificial creations, refashioned chiefly out of European languages (in the case of Esperanto) and Biblical Hebrew (in the case of Modern Hebrew). Both the creator of Esperanto, Ludwik Zamenhof, and the “reviver” of Hebrew, Eliezer Ben-Yehuda, came from the same linguistic and cultural milieu: the two of them were born only one year and 300 miles apart. (Zamenhof was born in 1859 in Białystok, and Ben-Yehuda was born in 1858 in Luzhky, Vitebsk Gubernia.) Both were native speakers of Yiddish, and both also spoke Russian, Hebrew, German, French, and Lithuanian to different degrees of fluency. Although their declared goals were different, they were both influenced by the same political and ideological developments that split Eastern European Jewry into the followers of international socialism and those of Zionism. The two movements and their leaders “regarded the problem of Jewish identity as inseparable from the question of language” (Berdichevsky 2014: 34). (Parallels between Zamenhof’s and Ben-Yehuda’s philosophies and work are explored in Tonkin 2015.) The acquisition of Esperanto can therefore be fruitfully compared to the acquisition of Modern Hebrew in its early years; cf. Versteegh (1993), Corsetti (1996).

Another area where the study of Esperanto might inform (and in turn be informed by) the study of Modern Hebrew involves direct and indirect borrowing of elements from Hebrew into Esperanto, touched upon by Berdichevsky (2014: 34-35; cf. also Berdichevsky 1986, 2007). For example, Berdichevsky suggests that the regular nature of Esperanto word formation might have been inspired by the non-concatenative nature of Hebrew morphology. One of his examples involves words for ‘king’, ‘queen’, ‘monarchy’, and ‘royal’, which in English all derive from different roots (‘queen’, for instance, derives from the PIE root meaning ‘woman’), whereas in both Hebrew and Esperanto the corresponding words are derived from the same root, using root-and-pattern morphology and suffixation, respectively: melekh, malka, malkhut, malkhuti in Hebrew; reĝo, reĝino, reĝeco, reĝa in Esperanto. Regular word formation based on a relatively small number of roots is particularly characteristic of Esperanto and Hebrew in such areas as masculine-feminine noun pairs (cf. the words for ‘king’ and ‘queen’ above) and causative and passive verb formation. Berdichevsky (2014: 37) claims it to be “likely that Zamenhof copied these features from Hebrew and incorporated them into Esperanto”, although in reality it might be that the “copying” was mostly subconscious. Another intriguing, yet hardly provable, idea suggested by Berdichevsky is that the regular nature of Esperanto was inspired by “Zamenhof’s ‘Litvak’-rationalist frame of mind” (p. 34) (see also “State-of-the-Art: Esperanto History”). (On the parallel borrowing of elements from Yiddish into both Modern Hebrew and Esperanto, see Biró 2004.)

But from the sociolinguistic perspective, Esperanto and Modern Hebrew differ in a crucial way: the latter but not the former has clearly become an “ethnic” language, so much so that some scholars (Zuckermann 2003, and his later work) refer to it as “Israeli”. Has Esperanto become an “ethnic” language as well, by creating a de facto ethnic group out of its speakers? This issue is considered in Fettes (1996). Instead of contrasting Esperanto and “ethnic languages”, as most scholars do (e.g. Duličenko 1989), Fettes examines the sociolinguistic aspects of the Esperanto-speaking community and hypothesizes that it can be considered a quasi-ethnic minority. Gledhill (2014: 323) takes the opposite view, pointing out that unlike in the case of ethnic languages, with Esperanto “there is … hardly any ‘instrumental’ or utilitarian reason to learn [it]: there is no community of native speakers to look to, or mass-media to follow; neither is there any legal code, territory, state authority or state-backed education system to defend or promote the language”. (This statement is not entirely accurate as there is some Esperanto media presence, as pointed out to me by Esther Schor.) According to Gledhill, “the main motivation for learning Esperanto is ‘integrational’, that is to say one actively seeks out to learn the language for cultural, ideological or psychological reasons, rather than financial, geographic, professional or other often coercive reasons”. A number of sociological and psychological studies (see Forster 1982, Edwards 1994, Stocker 1996, Fiedler 2006, Yaguello 2006, Okrent 2009) revealed the profile of a typical Esperanto speaker: “anti-conformist, well-educated, speaks several languages, often declares contrarian or left-leaning values such as ‘internationalism’, ‘humanitarianism’, ‘green politics’, and has sympathy for issues such as minority and regional language rights” (Gledhill 2014: 323). Yet, other studies (cf. Blanke 1985: 288; Jordan 1987: 111-114) have shown that “Esperanto speakers differ greatly among themselves in political ideology” (Pool and Grofman 1989: 147).

Unlike most ethnic groups, the global Esperanto community consists of native speakers of different languages. Yet it is not unique in this respect: Jews and Roma too are globally spread ethnic groups that do not necessarily share a common native language. Given that Esperanto is used by speakers of different mother tongues, one might wonder whether interferences from speakers’ first languages would splinter Esperanto into dialects and ultimately, if Esperanto is ever used on a large enough scale, into mutually unintelligible varieties (as has been the case with English; cf. Kachru 1985, Schneider and Kortmann 2004, Kachru, Kachru, and Nelson 2009, Ostler 2010: 31-62, inter alia). This issue is addressed in Sherwood (1982), who argued that Esperanto is not subject to the same principles and historical models as “ethnic” natural languages that serve primarily as native languages because Esperanto is “spoken mainly as an auxiliary second language”. Instead, Sherwood notes that “there has emerged an agreed-upon norm, despite the geographical dispersion of Esperanto speakers”.

The question of the standardization of Esperanto, that is, the development of an official, deliberate, prescriptive norm, has been considered in a number of works in general linguistics (Blanke 1985, Savatovsky 1989, Schubert 1989, Fiedler 2006, Burkina 2009) and among Esperantists (Duc Goninaz 1984, Piron 1986, 1989, Wells 1989). Several studies focus on the standardization of Esperanto grammar (Sherwood 1982, Duc Goninaz 1984, Dasgupta 1989, Schubert 1989, Dankova 1997, Gledhill 2000, Jansen 2012, 2013c); others on standardization in phonology and pronunciation (Burkina 2009). Another fruitful yet under-researched area of study is the standardization of Esperanto as reflected in its phraseology (Dasgupta 1993, Fiedler 1999, 2007, 2015, Gledhill 2008, 2010, 2014). The codified Esperanto norm is reflected in prescriptive grammars and dictionaries (e.g. Kalocsay and Waringhien 1985, Duc Goninaz 2002, Wennergren 2005); Blanke (1989) focuses on technical terminology, covering various areas of science and technology.

A related area of Esperanto language research concerns humor: a wealth of literature has been written on language- and culture-dependent humor based on ethnic languages, yet little work has been done on Esperanto humor. However, the linguistic and historical peculiarities of Esperanto make it an interesting case for linguists and specialists in Humor Studies alike. The highly regular and agglutinative nature of Esperanto morphology makes it particularly suitable for a certain type of pun: as discussed in Jordan (1988), many longer Esperanto words can be analyzed morphologically in alternative ways: as a longer root + ending, as a combination of two shorter roots + ending, or even as a shorter root + suffix + ending. The possibility of different morphological analyses of the same word creates room for imaginative puns and other types of language-dependent humor. Another (socio‑)linguistic property of Esperanto that makes it ripe for a certain type of humor is the fact that the majority of Esperanto speakers are non-native adult learners whose levels of mastery and fluency can be quite different. As a result, Jordan (1988: 145) notes that “many [Esperanto speakers] see a freshness in rather simple jests that would seem hackneyed in other languages, but […] many of them also fail to appreciate some attempts at humor because they do not understand them”. Moreover, the more fluent speakers may “enjoy involuted linguistic byplay simply because it is involuted and thus demonstrates mastery of the language”. Such language-dependent humor must be distinguished from culture-dependent humor, which depends not on the properties of a given language per se, but on the shared culture, allusions to known people and events, and the like. This topic is discussed further in “State-of-the-Art: Esperanto History”.
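
The ambiguity behind such puns can be made concrete with a small sketch. The Python fragment below enumerates every way of splitting a word into a root, optional further roots or suffixes, and an ending. The three-entry lexicon is a toy assumption made for illustration and is not taken from Jordan (1988); the example word, kolego, is a commonly cited case that can be read either as koleg-o ‘colleague’ or as kol-eg-o ‘big neck’ (kol- ‘neck’ plus the augmentative suffix -eg-). The function names are likewise hypothetical.

```python
# Toy lexicon (illustrative assumption, not data from Jordan 1988).
ROOTS = {"koleg": "colleague", "kol": "neck"}
SUFFIXES = {"eg": "augmentative"}
ENDINGS = {"o": "noun"}


def split_stem(stem, first=True):
    """Return all ways to cut `stem` into known morphemes.

    The first piece must be a root; later pieces may be roots or suffixes.
    """
    if not stem:
        # An empty remainder is a valid end only after at least one root.
        return [] if first else [[]]
    splits = []
    for i in range(1, len(stem) + 1):
        piece = stem[:i]
        if piece in ROOTS or (not first and piece in SUFFIXES):
            for rest in split_stem(stem[i:], first=False):
                splits.append([piece] + rest)
    return splits


def analyses(word):
    """Return every analysis of `word` as morphemes followed by an ending."""
    results = []
    for ending in ENDINGS:
        if word.endswith(ending):
            stem = word[: -len(ending)]
            for parts in split_stem(stem):
                results.append(parts + [ending])
    return results


for parts in analyses("kolego"):
    print("-".join(parts))
```

With this toy lexicon the word kolego receives two analyses, while kolo ‘neck’ receives only one; a realistic treatment would of course need a full root list and the complete inventory of Esperanto affixes and endings.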

Last but not least, Esperanto plays an important role in the study of Second Language Acquisition (L2 acquisition, or SLA). Although much of the L2 acquisition research focuses on adult learners and on naturalistic acquisition, L2 acquisition in other contexts—particularly by children and in a classroom setting—is vital both from a theoretical perspective and from the standpoint of real-world pedagogical implications. Recent research with children learning a second language in naturalistic settings shows that they are able to reach higher levels of proficiency than older learners “provided that intensive exposure to the L2 continues over a considerable number of years” (Roehr-Brackin 2013: 19; cf. Birdsong 2006, Hyltenstam and Abrahamsson 2003). Yet, when it comes to classroom learning, younger children do well in an immersion program, but not in a typical classroom, where learning is limited to an hour or two a week. Research shows that in such contexts, “later starters consistently outperform younger starters on measures of L2 achievement, although there are indications that children who start learning an L2 early tend to have more positive attitudes towards language and language learning than children who start later” (Roehr-Brackin 2013: 19-20; cf. Harley and Hart 1997, Cenoz 2003, García Mayo 2003, Larson-Hall 2008, Muñoz 2009). Roehr-Brackin (2013: 20) suggests that “the most likely explanation for this phenomenon is the more advanced cognitive development of older children and adolescents, and the full cognitive maturity of adults”, which “facilitates L2 learning in the typical language classroom, characterised by small amounts of input such as one or two hours a week distributed over a school year, because it allows for effective explicit learning”. 
Still, younger children are not entirely incapable of explicit learning: “children begin to display metalinguistic awareness from around age 4 onwards, with metalinguistic abilities developing most visibly from around age 6 or 7” (ibid.; cf. also Birdsong 1989, Karmiloff and Karmiloff-Smith 2002, Milton and Alexiou 2006). Thus, Roehr-Brackin (2013: 20) notes, “if young children’s budding metalinguistic awareness and their developing capacity to learn explicitly could be enhanced, their classroom-based L2 learning could potentially be made more successful”, even when it is limited to one or two hours a week over the school year. This is particularly relevant in a real-world school context, where “children often come from very varied linguistic and cultural backgrounds; moreover, it is far from clear which foreign language would be most advantageous for them to learn” (Greatrex 2013: 97). (Roehr-Brackin and Tellier’s project focuses on primary schools in the UK, and their main study is conducted in Manchester, with additional Springboard to Language programs in Germany and Hungary, but conditions similar to those described in the quote from Greatrex above also obtain at elementary schools in the U.S.)

This is where Esperanto comes in: as Tellier (2013a: 11) notes, “for young children, who delight in taking things apart and putting them back together, Esperanto is a de luxe construction kit for language learning”. Esperanto’s direct phoneme-grapheme correspondence, morphological “simplicity” (but see above) and regularity, and lexical similarity to many European languages potentially make it suitable as a “starter language” that can be learned by young children before, or in parallel with, learning another language in order to increase metalinguistic awareness, thus contributing to the development of the capacity for explicit learning, and to develop positive attitudes to language and language learning (for similar claims, see also Fonseca-Greber & Reagan 2008). Although the idea had some early proponents (cf. Picard 1911, Eaton 1928), surprisingly little research had been conducted until recently to investigate the propaedeutic value of Esperanto (i.e. the beneficial effect of using Esperanto as an introduction to foreign language study on the teaching of subsequent foreign languages). As discussed in Tellier (2013a: 12), early studies in this area (e.g. Symoens 1992, Fantini and Reagan 1992, Corsetti and La Torre 1995) suggest that “teaching and learning Esperanto as a starter language in the primary school gives advantages to primary-age children that a study of other languages at primary level does not”. Moreover, Tellier and Roehr-Brackin (2013b: 99) argue that although early exposure to Esperanto may not offer “any (long-term) advantages in terms of metalinguistic ability that could not be attained equally well via input in other L2s”, it may serve as “a ‘gentle’ introduction into metalinguistic thought and problem-solving, [through which] children’s willingness and ability to analyse language and treat it as an object of reflection may be honed, resulting in cognitive as well as affective benefits”.
They also suggest that early exposure to Esperanto “may have had a lasting levelling effect, making children of different abilities more equal in terms of metalinguistic awareness” (ibid.); in other words, pupils of lesser ability gained confidence and improved their literacy skills. More generally, the ongoing project by Karen Roehr-Brackin and Angela Tellier (e.g. Tellier 2013a‑c, Tellier and Roehr-Brackin 2013a, to appear; see also Chapter 8 of Tellier 2013b for a summary of their recent research and Greatrex 2013 for a review of Tellier 2013b) pursues the twin aims of conducting a detailed, long-term empirical study of the propaedeutic effect of Esperanto in the context of UK primary schools and of drawing the attention of primary school educators and policy-makers to the results of this research.


Several fruitful areas for future research emerge from the above discussion. One such area involves examining Esperanto from the typological perspective. Cross-linguistic typology, while by no means a new field of study, has received renewed interest in recent years among both descriptive and theoretical linguists. A new field of Formal Generative Typology has emerged, primarily from the work of Mark Baker (see Baker 2010), who advocates “combin[ing] a formal-generative perspective on language, including tolerance of abstract analyses, with a typological focus on comparing unrelated languages from around the world” and argues that “this can be a powerful combination for discovering linguistic universals and patterns in linguistic variation that are not detected by other means”. As linguists discover new universals, as well as new typological trends, the place of Esperanto, both in its non-native and native forms, in this range of “possible human languages” needs to be further investigated.

Another potentially fertile area of research concerns the nature of Esperanto as a native language and particularly its similarities to, and differences from, the acquisition of “ethnic languages” as home (or “heritage”) languages, especially in immigrant communities. The differences between such heritage speakers and their counterparts using the language as the dominant one have become the subject of much productive research in recent years (e.g. Kagan & Dillon 2001, Pereltsvaig 2004a-c, Polinsky 2008, Laleko 2008, 2010, inter alia on Heritage Russian) and have been ascribed alternatively to one of two processes: incomplete acquisition or attrition (i.e. “forgetting” the language). This research agenda is stimulated not only by the intrinsic interest of the problem, but also by its growing pedagogical application: as more children of immigrant parents go to college and are interested in further exploring their home language and its attendant culture, programs in teaching certain languages as neither native nor foreign are being developed at several universities, including Columbia University’s program in Heritage Russian. Investigating native speakers of Esperanto from the heritage language perspective has the potential not only to enrich Esperantic Studies as a scholarly discipline, but also to improve the teaching of both Esperanto and other heritage languages. Similarly, Esperanto can be further studied in light of the emergent study of sign languages.

Moreover, the emergence of a globally dispersed Esperanto-speaking community, unified by language and its associated culture, sheds new light on the sociolinguistics of the concept of “ethnicity”. As most of the world’s ethnic communities are held together by a common language, linguistic unity is often seen as a prerequisite for “ethnicity”. Some ethnic groups share nothing but language: such is the case of the Kurds, who share a common language (or rather a group of closely related languages), but not a religion or a genetic bond (e.g. genetic differences between Kurdish Jews and Kurdish Muslims have been shown to be significant; cf. Hammer et al. 2000, Nebel et al. 2001). Yet, as language endangerment and language shift (i.e. a process by which a group abandons its indigenous language in favor of another, typically more economically and politically influential, tongue) become more and more rampant, many ethnic groups previously defined by language assimilate linguistically into other groups. One example is the Itelmen of the Kamchatka Peninsula, who have virtually lost their indigenous Itelmen language and switched to Russian. Can Esperanto be considered a sufficient bond to hold together an ethnic group? Further sociolinguistic and anthropological research is clearly called for.

Another potentially interesting area of research concerns the issue of linguistic relativity, particularly the question of whether “languages influence how their speakers think” (Jonathan Pool, personal communication). This problem has long been a subject of scholarly debate and has recently become a subject of much controversy, both in academic circles and in the popular press. The so-called “neo-Whorfian studies”, exploring whether there is a connection between aspects of grammar and ways of thinking, and the critique of the neo-Whorfian conclusions are summarized accessibly in McWhorter (2014b). What makes Esperanto a particularly interesting test case is the claim, often made by its detractors, that, being a planned language, Esperanto “would limit the ability of its users to formulate and communicate ideas” (Pool and Grofman 1989: 145). Yet, Pool and Grofman find no evidence to substantiate this “poverty argument”. Still, it is difficult to conduct a “clean” experiment to address this issue because, as Pool and Grofman note (p. 147), “unilingual speakers of planned languages are nonexistent”, so whatever cognitive bias is postulated for the planned language, its effects can be argued to be “diluted” by the speakers’ other native languages. (Esperanto in light of the Whorfian hypothesis is discussed further in Benczik 2011.)

For an additional annotated bibliography on Esperanto linguistics, see Blanke (2015) and Blanke (2014, in German).

