I’ll be exploring pangrams, which are sentences that contain every character in a writing system at least once. The pangram most familiar to English-speakers is “the quick brown fox jumps over the lazy dog.” Although they’re primarily used in typography, pangrams can also be useful for learning new writing systems.

In this blog post I’ll be looking at pangrams in Korean, Japanese, Arabic, and Hindi. Although all languages are created equal, some writing systems are more user-friendly than others: in the process of exploring pangrams, I’ll have the chance to contrast the relative merits of different orthographies.

Introduction

People have invented hundreds of different writing systems: they differ hugely in appearance, writing direction, and their pragmatic applications. However, all of this diversity can be reduced to a few basic categories:

  • Alphabets have letters that correspond to one vowel or consonant phoneme.
  • Abjads have letters that correspond to one consonant phoneme, with optional diacritics to disambiguate vowels. For instance, the Arabic word شكرا (thanks) is pronounced shukran, even though it is written as shkra. Try reading the following sentence, and you’ll get a feel for the logic of written Arabic: Arbc hs n ntrstng wrtng systm.
  • Syllabaries have characters that correspond to entire syllables. For instance, the Japanese word おとこ (man) has the following syllables: お = o, と = to, and こ = ko.
  • Abugidas (or alphasyllabaries) have ‘base syllables’. The vowels of these base syllables can then be modified with diacritics. For instance, you can modify the Devanagari syllable ल (la) in several ways: ले = le, लु = lu, ली = lee, etc.
  • Logographies have characters that correspond to entire words or morphemes. For instance, the Chinese words 由于 (because of) and 鱿鱼 (squid) have the same pronunciation: yóuyú.

Pangrams are grammatical sentences in which each character in a writing system is used at least once. However, in many writing systems, characters have more than one form. In Latin-derived alphabets all letters have both MAJUSCULE and minuscule forms. The majuscule (A) and minuscule (a) forms are allographs of the same underlying letter. English has at least three allographs for each letter (a majuscule, a minuscule, and a cursive form). This is annoying, and means that even if we learn an English pangram like “the quick brown fox jumps over the lazy dog,” we still haven’t learned every form that the letters can take.
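To make the definition concrete, here is a minimal Python sketch of a pangram check for the 26-letter English alphabet. The function names are illustrative choices, not anything standard. Case-folding the sentence is what lets the majuscule and minuscule allographs count as the same letter.

```python
import string


def missing_letters(sentence, alphabet=string.ascii_lowercase):
    """Return the letters of `alphabet` that never occur in `sentence`."""
    # casefold() maps majuscule and minuscule allographs to one form
    seen = set(sentence.casefold())
    return sorted(set(alphabet) - seen)


def is_pangram(sentence, alphabet=string.ascii_lowercase):
    """A sentence is a pangram when no letter of the alphabet is missing."""
    return not missing_letters(sentence, alphabet)


print(is_pangram("The quick brown fox jumps over the lazy dog"))  # True
print(missing_letters("The quick brown fox"))
```

Passing a different `alphabet` string adapts the same check to other scripts, as long as each letter is a single code point.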

Korean Hangul (한글)

Things aren’t always so annoying. In Hangul, you don’t need to memorize tons of allographs, because there aren’t any. Like English, Korean Hangul has characters that correspond to individual consonants or vowels. Unlike other alphabets, Korean characters are squished together into blocks. This is an awesome advantage because it makes syllable boundaries easy to interpret. The word 김치 (kimchi) has two syllables, 김 and 치. Unlike Chinese characters, each of these blocky units can be broken down into different letters. The syllable 김 consists of the letters ㄱ = k (g), ㅣ = i, and ㅁ = m.

It’s easy to create a pangram that contains all the letters of the alphabet:

밤새 컴퓨터로 요약을 해치우면 좋겠다.

This pangram is pronounced “BamSae KumPyooTuhRo YoYakEul HaeChiWooMyun JotGetDa” and means “I’d love to blaze through the summary with the computer overnight.”
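The block structure is regular enough that a computer can take the blocks apart with plain arithmetic. The sketch below relies on how Unicode lays out precomposed Hangul syllables (19 initials × 21 medials × 28 finals, starting at U+AC00). The helper names and the choice of the 14 basic consonants and 10 basic vowels as "the alphabet" are assumptions made for illustration. Doubled and compound letters such as ㅆ or ㅐ are reported as they are rather than split further.

```python
HANGUL_BASE = 0xAC00  # code point of 가, the first precomposed block
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"      # 19 initial consonants
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"  # 21 vowels
# 28 finals; index 0 means "no final consonant"
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

# Assumption: "the alphabet" means the 14 basic consonants and 10 basic vowels
BASIC_LETTERS = set("ㄱㄴㄷㄹㅁㅂㅅㅇㅈㅊㅋㅌㅍㅎㅏㅑㅓㅕㅗㅛㅜㅠㅡㅣ")


def decompose(block):
    """Split one precomposed Hangul syllable block into its letters."""
    index = ord(block) - HANGUL_BASE
    if not 0 <= index < 19 * 21 * 28:
        raise ValueError(f"{block!r} is not a Hangul syllable block")
    initial, rest = divmod(index, 21 * 28)
    medial, final = divmod(rest, 28)
    letters = [INITIALS[initial], MEDIALS[medial]]
    if final:
        letters.append(FINALS[final])
    return letters


def letters_in(text):
    """Collect every letter used by the syllable blocks in `text`."""
    found = set()
    for ch in text:
        if "가" <= ch <= "힣":  # skip spaces and punctuation
            found.update(decompose(ch))
    return found


PANGRAM = "밤새 컴퓨터로 요약을 해치우면 좋겠다."
print(decompose("김"))                               # ['ㄱ', 'ㅣ', 'ㅁ']
print(sorted(BASIC_LETTERS - letters_in(PANGRAM)))  # letters still missing
```

An empty list on the last line means that every basic letter appears somewhere in the sentence.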

However, there’s a small hitch. Because Korean characters are all squeezed into super-convenient syllable blocks, they get warped slightly depending on their location in the block. Our letter ㄱ (g) is different when it’s in the syllables 길 (gil, road), 굴 (gool, oyster), and 궐 (gwol, palace).

For a typographer creating a new font, the flexibility of Korean letters might be a stumbling block. In fact, a typographer might even say that the different shapes of Korean letters constitute different allographs. But for a language learner using paper-and-pencil, this flexibility doesn’t cause much difficulty. In fact, Korean is one of the easiest writing systems to learn.

Japanese Hiragana (ひらがな)

Japanese is written with several different scripts, all used in combination. One of two kana syllabaries that Japanese uses is Hiragana. Hiragana characters usually follow root words written in Chinese characters, where they serve grammatical purposes such as inflecting verbs. But hiragana is powerful enough to represent all of spoken Japanese: small hiragana characters called furigana are sometimes even written next to obscure Chinese characters or in children’s books to aid comprehension.

Hiragana is almost a pure syllabary, except for a few hacks that are included in order to represent palatal sounds, nasal consonants, and some voiced consonants.

Hacks

While a word like おとこ is easily dissolved into its three syllables (o.to.ko), a word like ちゃ (tea, pron. cha) is hacked together with two characters. The first character, ち, makes the sound chi when it’s alone or followed by another normal-sized hiragana character. The second character, や, is pronounced as ya. However, when や is written in a smaller form (ゃ), the preceding character gets bundled up with it and forms a digraph (notice the difference between ちゃ = cha and ちや = chiya).

The nasal coda ん (n) can also be tacked on to any syllable. For instance, the syllable や (ya) becomes やん (yan) with the addition of the n character. In a very strict sense, the existence of ん means that hiragana is an alphasyllabary.1 However, humans have had a tough time creating pure syllabaries, so hiragana gets a pass.

The third hack that hiragana uses is diacritics. Some syllables, like ひ (hi) and ふ (fu), take on marks that change their pronunciation: ぴ = pi and ぷ = pu. In this way, hiragana behaves in a similar fashion to Devanagari, an abugida, where all syllables have base forms that are modified by diacritics. And this makes sense, because hiragana is probably influenced by Siddham, a Brahmic script (in the same family as Devanagari) that was brought to Japan from India by Buddhist priests.
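Unicode happens to encode this base-plus-mark relationship, so the diacritic hack can be observed directly. The sketch below uses Python's standard `unicodedata` module to split a kana into its base character and any combining sound marks. The function name `split_kana` is made up for this example.

```python
import unicodedata

DAKUTEN = "\u3099"     # combining voiced sound mark (the two small strokes)
HANDAKUTEN = "\u309A"  # combining semi-voiced sound mark (the small circle)


def split_kana(kana):
    """Return (base character, combining marks) for a single kana."""
    # NFD normalization separates a precomposed kana from its sound mark
    decomposed = unicodedata.normalize("NFD", kana)
    return decomposed[0], decomposed[1:]


for kana in "ひびぴふぷ":
    base, marks = split_kana(kana)
    names = [unicodedata.name(m) for m in marks]
    print(kana, "=", base, "+", names)
```

Characters without a mark, such as ひ, come back unchanged with an empty mark string.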

A Historical Aside

At the beginning of the 4th century CE, Indian culture was penetrating East Asia. Ruled by the Gupta Empire, Indian civilization was experiencing a golden age. This legacy includes the concept of zero, round earth theory, the base-10 numeral system, chess, the Kama Sutra, and an enormous body of Sanskrit literature. These creations were just too good to keep secret, and they quickly spread around the world.

The Silk Road

A snapshot of this period of cultural transmission is captured by the Dunhuang manuscripts.2 These manuscripts were discovered in Western China in the early 20th century and contain Buddhist, Nestorian Christian, Daoist, and Manichaean texts. Collected from the 4th to 11th centuries, these texts are written in Chinese, Sogdian, Hebrew, Old Uyghur, Khotanese, and Sanskrit.

Sanskrit texts written in the Siddham abugida made a large impact on Buddhism in China.3 But these texts influenced Japan to an even greater extent. One of the individuals influenced by the Silk Road transmission of Buddhism was Kukai, a Shingon priest who lived in the late 8th and early 9th centuries.

Kukai’s Dream

Kukai was born and educated in Japan. He pursued a Buddhist education, but felt estranged by the ritsuryo system that regulated the activity of priests and formalized religious doctrine. Deciding to live outside this system, he became an ascetic. After wandering the countryside and searching for meaning, Kukai had a dream. In his dream a man revealed to Kukai that the Mahavairocana Tantra contained the meaning that he was searching for. Although he was able to eventually obtain a copy of the text, it was written in Sanskrit using the Siddham abugida, and Kukai was unable to make sense of the document.

In order to understand the text, Kukai traveled to China to learn Sanskrit. On his return to Japan several years later, he brought with him the Siddham abugida. Kukai decided that Japanese should be written with a phonetic system instead of Chinese characters. Japanese folklore says that Kukai invented kana under the influence of Siddham to achieve this goal.

Hiragana’s Indian origins are apparent. Not only does the ordering of hiragana characters resemble the ordering of Devanagari characters (more on that later), but even today Siddham is still used by Japanese Shingon priests to write their Sanskrit mantras.

In fact, the Iroha, a well-known Shingon Buddhist poem (also attributed to Kukai), is a perfect4 hiragana pangram, one that doesn’t re-use any characters:

Iroha
いろ は にほへと Iro ha nihoheto Even the blossoming flowers
ちりぬる を Chirinuru wo Will eventually scatter
わ か よ たれ そ Wa ka yo tare so Who in our world
つね ならむ Tsune naramu Is unchanging?
うゐ の おくやま Uwi no okuyama The deep mountains of karma—
けふ こえて Kefu koete We cross them today
あさき ゆめ みし Asaki yume mishi And we shall not have superficial dreams
ゑひ も せす Wehi mo sesu Nor be deluded.

The English translation5 is worth a read:

Although its scent still lingers on
        the form of a flower has scattered away
For whom will the glory
        of this world remain unchanged?
Arriving today at the yonder side
        of the deep mountains of evanescent existence
We shall never allow ourselves to drift away
        intoxicated, in the world of shallow dreams.

Hiragana Pangram

While the Iroha is a lovely pangram, it’s a bit dated. Some of the characters, like ゐ and ゑ, have dropped out of use. Others have changed their pronunciation. A more modern pangram (though it still contains some obsolete characters) is below:

とりなくこゑす ゆめさませ みよあけわたる ひんかしを そらいろはえて おきつへに ほふねむれゐぬ もやのうち

This pangram is pronounced “torinakukowesu yumesamase miyoakewataru hinkashiwo sorairohaete okitsuheni hofunemurewinu moyanōchi” and means “Awaken from dreaming to the voice of the crying bird and see the coming daylight turning the east sky-blue; shrouded in mist is a flock of ships on the open sea.”6
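Both poems can be checked mechanically. The sketch below counts each kana and reports which ones are missing or repeated. The 48-character inventory (the 46 kana in modern use plus the obsolete ゐ and ゑ) is the assumption being tested against, and the variable and function names are illustrative only.

```python
from collections import Counter

# Assumed inventory: 46 modern kana plus the obsolete ゐ and ゑ (48 in total)
KANA = (
    "あいうえお" "かきくけこ" "さしすせそ" "たちつてと" "なにぬねの"
    "はひふへほ" "まみむめも" "やゆよ" "らりるれろ" "わゐゑを" "ん"
)

IROHA = (
    "いろはにほへと ちりぬるを わかよたれそ つねならむ "
    "うゐのおくやま けふこえて あさきゆめみし ゑひもせす"
)
MODERN = (
    "とりなくこゑす ゆめさませ みよあけわたる ひんかしを "
    "そらいろはえて おきつへに ほふねむれゐぬ もやのうち"
)


def report(text):
    """Return (kana missing from text, kana used more than once)."""
    counts = Counter(ch for ch in text if not ch.isspace())
    missing = [k for k in KANA if k not in counts]
    repeated = [k for k, n in counts.items() if n > 1]
    return missing, repeated


print(report(IROHA))   # (['ん'], [])
print(report(MODERN))  # ([], [])
```

A pangram is "perfect" in the sense used here when the second list is empty: no character is used twice.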

Kukai’s Second Dream

Kukai’s dream led him to learn the Siddham abugida. But Kukai had another dream – a dream about a simple phonetic script to write Japanese: one based on the holy alphabet of Buddhism. However, things didn’t work out the way Kukai intended. The creation of kana only added to the confusion. While Hiragana might be one of the easiest syllabaries to learn, this is just the tip of the iceberg when it comes to writing Japanese. Not only does Japanese have another parallel syllabary (katakana), but Japanese also uses thousands of Chinese logograms. These logograms have multiple written forms and come packaged with complex rules governing how their pronunciation changes based on context. Kukai’s first dream – to spread Buddhist teachings to Japan – has been largely successful. But his second dream – a simple script for Japanese – has turned into a nightmare.

Arabic (العربية)

As in Korean, Arabic letters change their form based on their location in a word. However, the changes that Arabic letters undergo are much more dramatic. Every Arabic letter has an isolated, initial, medial, and final form (although sometimes these forms are the same), and the letters that make up a word are connected together in a continuous line (though there are some ‘interrupting letters’ that break the flow). For instance, the letter k can appear four ways: ك = isolated, كـ = initial, ـكـ = medial, and ـك = final (reminder: Arabic is written from right to left).
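One way to enumerate these positional forms is to read them out of Unicode. Ordinary Arabic text stores only the base letter and leaves the shaping to the font, but the legacy Arabic Presentation Forms-B block (U+FE70 to U+FEFF) contains separate code points for each contextual shape, tagged by position. The sketch below groups them by base letter using Python's standard `unicodedata.decomposition`. The function name is a made-up helper, and this is a lookup trick rather than how text is actually rendered.

```python
import unicodedata
from collections import defaultdict

KAF = "\u0643"   # the letter k
ALEF = "\u0627"  # an 'interrupting letter' that never joins to the left


def positional_forms():
    """Map each base letter to its contextual presentation forms."""
    forms = defaultdict(dict)
    for code in range(0xFE70, 0xFF00):  # Arabic Presentation Forms-B
        parts = unicodedata.decomposition(chr(code)).split()
        # Keep entries such as "<initial> 0643": one tag, one base letter.
        # Ligatures and unassigned code points are skipped.
        if len(parts) == 2 and parts[0].startswith("<"):
            position = parts[0].strip("<>")
            base = chr(int(parts[1], 16))
            forms[base][position] = chr(code)
    return forms


forms = positional_forms()
for position in ("isolated", "initial", "medial", "final"):
    print(position, forms[KAF].get(position))
print(sorted(forms[ALEF]))  # ['final', 'isolated']
```

Letters like alef list only an isolated and a final form, which is what makes them interrupt the connected line of a word.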

The mutability of its characters makes finding satisfying pangrams in Arabic a bit difficult. Should we go the way of English, and simply ignore some letter forms and prioritize others? Or should we try to hack together a pangram that uses every form its letters take?

نص حكيم له سر قاطع وذو شأن عظيم مكتوب على ثوب أخضر ومغلف بجلد أزرق

This is pronounced “naṣun ḥakymun lahu syrun qāṭiʿun wa ḏu šānin ʿẓymin maktubun ʿala ṯubin aẖḍra wa muġalafun biǧildin azraq”7 and means “A wise text which has an absolute secret and great importance, written on a green tissue and covered with blue leather.”8

I haven’t found a pangram that will solve the multiple-forms problem. Things get even trickier though, because Arabic can also be written with diacritics to disambiguate vowels. Although adult readers of modern varieties of Arabic don’t use diacritics, they appear in Classical Arabic (the Arabic of the Koran) and in learning materials for children and foreigners. You can learn more about Arabic diacritics here.

Devanagari (देवनागरी)

Devanagari has a similar logic to Arabic. However, instead of writing consonants without vowels, you write consonants with obligatory vowels. Devanagari characters are written independently before being connected with a vertical line that runs along their tops. Most characters connect to their neighbors; however, like in Arabic, there are a few ‘interrupting characters.’ Because Devanagari is difficult to form pangrams in (more on that later!), I’ve provided an acrostic word-list to assist learning. Examples of the primary 61 Devanagari characters are shown below:

An Almost Acrostic9

Character | IPA | Example | Pronunciation | Meaning
क | k | काली | kaalee | black
ख | kʰ | खेल | khel | game
ग | g | गाय | gaay | cow
घ | ɡʱ | घास | ghaas | grass
ङ * | ŋ |  |  | 
च | tʃ | चावल | chaaval | rice
छ | tʃʰ | छह | chay | six
ज | dʒ | जूता | joota | shoe
झ | dʒʱ | झील | jheel | lake
ञ * | ɲ |  |  | 
ट | ʈ | टांग | taang | leg
ठ | ʈʰ | ठग | thag | cheat
ड | ɖ | डेस्क | desk | desk
ढ | ɖʱ | ढक्कन | dhakkan | lid
ण * | ɳ | बाण | baan | arrow
त | t̪ | तलवार | talavaar | sword
थ | t̪ʰ | थरमस | tharmas | thermos
द | d̪ | दूध | doodh | milk
ध | d̪ʱ | धारा | dhaara | stream
न | n | नीला | neela | blue
प | p | पनीर | paneer | cheese
फ | pʰ | फोजी | phojee | soldier
ब | b | बैंक | baink | bank
भ | bʱ | भूरा | bhoora | brown
म | m | महिला | mahila | woman
य | j | युवा | yuva | young
र | r | रोटी | rotee | bread
ल | l | लाल | laal | red
व | v, ʋ, w | वाइन | vain | wine
श | ʃ | शलजम | shalajam | turnip
ष | ʃ, ʂ | षट्भुज | shatbuj | hexagon
स | s | सफेद | saphed | white
ह | ɦ | हरा | hara | green
क़ ** | q | क़िला | qila | fortress
ख़ ** | x | ख़ान | khan | Khan
ग़ ** | ɣ | आग़ा | aga | Aga
ड़ ** | ɽ | खिड़की | khidakee | window
ढ़ ** | ɽʱ | पढ़ना | padhana | read
फ़ ** | f | फ़ोन | fon | phone
ज़ ** | z | ज़ेबरा | zebara | zebra
झ़ ** | ʒ | झ़ामबिल | jaambil | Jambyl
क्ष † | kʃ, kʂ | क्षर | kshar | Kashar
त्र † | t̪r | त्रिनिदाद | trinidaad | Trinidad
ज्ञ † | gj | ज्ञानपुर | gyaanpur | Gyanpur
श्र † | ʃr | श्रीनगर | shreenagar | Srinagar
अ | a | अनार | anaar | pomegranate
आ | aː | आदमी | aadamee | man
इ | ɪ | इंडिया | indiya | India
ई | iː | ईरान | eeraan | Iran
उ | ʊ | उदास | udaas | sad
ऊ | uː | ऊंट | oont | camel
ए | eː | एवोकाडो | evokaado | avocado
ऐ | ɛː, æ | ऐनक | ainak | spectacles
ओ | oː | ओंठ | onth | lip
औ | ɔː | औषधि | aushadee | medicine
अं | ŋ or m | अंगूर | angoor | grape
अः †† | h |  |  | 
ऋ |  | ऋषि | rishi | saint
†† |  |  |  | 
ऍ †† | ɛ |  |  | 
ऑ | ɒ | ऑस्ट्रेलिया | ostreliya | Australia

* Typically appears only in magic square.

** Nuqta, used in loanwords.

† Irregular ligature.

†† Does not appear in modern Hindi.

Diacritics

All Devanagari letters have an inherent vowel that can be modified or eliminated by a diacritic. For instance, क (ka) can become कि (ki), कु (ku), का (kaa), etc. Diacritics that change the vowel are called matras. There are nine of them:

Diacritic | Letter with Diacritic | Pronunciation
none | स | sa
ा | सा | saa
ि | सि | si
ी | सी | see
ु | सु | su
ू | सू | soo
े | से | se
ै | सै | sai
ो | सो | so
ौ | सौ | saw
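In Unicode text, this system is literal: a syllable is just the consonant's code point followed by the matra's code point, and the renderer draws the mark in the right place. The sketch below rebuilds the table above by concatenation. The romanizations follow the table, and the names `MATRAS` and `syllable` are chosen for this example.

```python
import unicodedata

SA = "\u0938"  # the consonant स, which carries the inherent vowel a

# The nine matras as (code point, romanized vowel), plus the bare consonant
MATRAS = [
    ("", "a"),
    ("\u093E", "aa"),
    ("\u093F", "i"),
    ("\u0940", "ee"),
    ("\u0941", "u"),
    ("\u0942", "oo"),
    ("\u0947", "e"),
    ("\u0948", "ai"),
    ("\u094B", "o"),
    ("\u094C", "aw"),
]


def syllable(consonant, matra=""):
    """Attach a vowel sign to a consonant by simple concatenation."""
    return consonant + matra


table = [(syllable(SA, matra), "s" + vowel) for matra, vowel in MATRAS]
for written, spoken in table:
    print(written, spoken)

# Each matra is a distinct Unicode character with its own name
for matra, _ in MATRAS[1:]:
    print(unicodedata.name(matra))
```

Note that ि is typed after the consonant even though it is drawn to its left, so the stored order follows pronunciation rather than appearance.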

A letter can also be given a final n by the anusvara diacritic. For instance, कुंजी (kunjee, key).

A virama, or killer stroke, can eliminate a vowel. For instance, the Hindi transliteration of the name Chris (क्रिस) has a virama under the ka, making it k. However, the virama is not always used. For instance, the word for purple (बैंगनी, bainganee) has a final letter न (na) that is modified by ी to become नी (nee). But the word for eggplant (बैंगन, baingan) should have a virama below the न, and it doesn’t: the virama is implied.

The chandrabindu diacritic (ँ) nasalizes the vowel that it sits on top of.

Devanagari also contains many ligatures. Ligatures are created when two letters sit next to each other in the same syllable. For instance, when the vowel au gets the chandrabindu diacritic, it becomes 🕉, the om symbol sacred in Hinduism, Buddhism, Sikhism, and Jainism. (A ligature that Westerners might be most familiar with is the German ß, which occurs when two s letters combine, such as in the word Straße (strasse, street).) Ligatures are not always easy to dissolve into their constituent sounds, and often need to be memorized.

Devanagari Pangram

Because of its many diacritics and ligatures, it’s difficult to write a pangram in Devanagari. But that’s okay: the point of creating pangrams is to assist us in learning new writing systems. Luckily, Devanagari has an organizational property that takes it far beyond other writing systems.

The Magic Square

Alphabets like English are organized linearly: there are the ABC’s, Alpha and Omega, the beginning and the end – they’re one-dimensional. The ancient Brahmi script and its descendants, like Devanagari, are organized differently: they’re two-dimensional.

The middle of the Devanagari abugida is arranged into a magic square. Letters are arranged vertically by place of articulation (the first row contains the gutturals, followed by the palatals, retroflexes, dentals, and labials) and horizontally by manner of articulation (alternating from unaspirated to aspirated). This is similar to the organization of the International Phonetic Alphabet, a system invented by linguists in order to accurately transcribe all human speech.
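The square is regular enough to build in a few lines. The sketch below lays out the 25 consonants from क to म as a 5 × 5 grid and looks up the row and column of any letter. The row names follow the paragraph above. The column labels (voiceless, voiceless aspirated, voiced, voiced aspirated, nasal) are the conventional description of the five columns and are added here as an assumption, since the text only mentions the unaspirated and aspirated alternation.

```python
PLACES = ["guttural", "palatal", "retroflex", "dental", "labial"]
# Assumed column labels: the conventional description of the five columns
MANNERS = ["voiceless", "voiceless aspirated", "voiced",
           "voiced aspirated", "nasal"]

# क (U+0915) through न (U+0928), then प (U+092A) through म (U+092E).
# U+0929 is skipped: it holds a letter that is not part of the square.
CONSONANTS = ([chr(c) for c in range(0x0915, 0x0929)]
              + [chr(c) for c in range(0x092A, 0x092F)])

# Five rows of five letters: one row per place of articulation
GRID = [CONSONANTS[i:i + 5] for i in range(0, 25, 5)]


def describe(letter):
    """Return (place, manner) for a consonant inside the square."""
    row, col = divmod(CONSONANTS.index(letter), 5)
    return PLACES[row], MANNERS[col]


for place, row in zip(PLACES, GRID):
    print(f"{place:10}", " ".join(row))
print(describe("\u0915"))  # क -> ('guttural', 'voiceless')
```

Because position encodes pronunciation, learning one row and one column is enough to predict how the rest of the square sounds.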

This clever organization is fundamental to shiksha (phonology), one of the six Vedic studies. The understanding that consonants are more than just series of individual sounds opens up linguistics to a deeper form of inquiry, where the specific relationship between sounds can be discovered. It’s no coincidence that Brahmic abugidas and Indian numerals spread across much of the world: India was far ahead of the rest of humanity in its understanding of linguistics. This organization was so influential that it was even used hundreds of years later as the basis of the constructed Cree Abugida.10

Chinese (漢字)

While writing pangrams in Devanagari might be tough, they’re impossible in Chinese. The Chinese language is written using Chinese characters – logograms that are only slightly more user-friendly than Sumerian Cuneiform or Egyptian Hieroglyphs.

The principle of Chinese logograms is simple: one character for one morpheme (or word). This eliminates all the complicated rules that other writing systems have: rules that govern diacritics, ligatures, allographs, and so on. But this is an example of something that is simple in principle but unwieldy in practice. While it might make sense to have an individual character for each word if your language only has a few hundred words, human languages have tens of thousands.11 A pangram in Chinese would need to be at least as long as a novel.

Weaving Everything Together

Pangrams give us some hints about which writing systems are easiest to learn. Korean is a simple alphabetic script that lacks annoying allographs (unlike English) and can be learned in just a few hours.

Writing systems like the Japanese kana are a bit more complicated, but have some simple hacks that allow them to represent all the sounds in the language. The Iroha is a pangram poem that captures all of Japanese Hiragana.

Arabic is more challenging to find a satisfying pangram in, as each letter has several allographs. However, these allographs resemble each other, and Arabic is not difficult to learn.

Devanagari, an abugida, has numerous ligatures and diacritics that make it difficult to create pangrams. However, Devanagari has an interesting organizational principle – the magic square – that can be used in language learning.

Chinese logograms are difficult to acquire and must be learned by hundreds of hours of rote memorization.

My Pangram

Here’s my attempt at an English pangram:

“Waxy fish pangrams blaze,” Jake equivocated.

I’ll end with that.


  1. In an even stricter sense, Japanese kana can be described as moraic systems, as characters correspond more closely to morae than to syllables.

  2. The Dunhuang manuscripts are a treasure-trove of religious literature. Although they’re mostly Buddhist texts written in Chinese, Manichaean, Daoist, and Nestorian Christian texts are also included. Some of the languages represented among the manuscripts include Khotanese, Old Turkic, Sanskrit, Hebrew, and Sogdian.

  3. A description of Siddham texts among the Dunhuang manuscripts. 

  4. The Iroha is almost perfect: it contains every syllable except ん (n).

  5. Ryuichi Abe, 1999. The Weaving of Mantra: Kukai and the Construction of Esoteric Buddhist Discourse

  6. From Clagnut’s List of Pangrams

  7. Romanized using ISO

  8. From Clagnut’s List of Pangrams

  9. I’ve used my own variation of the standard Hunterian transliteration. I’ve used double vowels (aa) instead of overlines (ā) to represent long vowels. 

  10. The Cree Abugida was based on Devanagari.

  11. The Oxford English Dictionary lists 171,476 words in English, a language that is not exceptionally word-dense.