Friday, November 28, 2014

Old Church Slavonic, Revisited

After a circuitous route of reading, I decided to revisit Old Church Slavonic. It started when I was looking at some of the contested inscriptions in the Basarabi Cave Complex, where I saw the following sequence:
With the bar over it, it looks like either an abbreviation or a nomen sacrum. It made me think of the Rohonc word I tentatively read as "Christ":

From there, I started reading about Glagolitic, and I realized that I should probably look at letter frequencies in Old Church Slavonic as it was written in both Glagolitic and Cyrillic, so I reanalyzed OCS using the Codex Marianus (Glagolitic) and Codex Suprasliensis (Cyrillic).

Codex Marianus
big jer    0.0%    23.4%    7.7%

Codex Suprasliensis
big jer    0.1%    15.4%    5.8%

One of the interesting things about Glagolitic is that some of the forms of these letters resemble the forms of the most common Rohonc letters. But I can only steal 10 minutes away today, so I'll have to get back to that in another post.

Wednesday, November 12, 2014

Comparison with Old Church Slavonic and Koine Greek

In this post, I'll complete my initial attempt to compare the initial and final frequencies of the three most common Rohonc glyphs with the three most common letters in some candidate languages.

First, Old Church Slavonic:


Old Church Slavonic differs from Rohoncian in that the most common letter is more frequent as a final than as an initial, and there is a wide disparity between the frequency of the second letter as initial and final.

Second, Koine Greek:


Koine Greek differs from Rohoncian in that the third most common letter is more common as an initial than as a final (though if the iota ended up in third place, it would fit well, with initial and final frequencies of 2.7% and 8.6%, respectively).

Suppose we give each of these languages a location in six-dimensional space, indicated by the relative frequencies of the top three symbols as initials and finals...what would their distances from each other be in this space? And which would be closest to Rohoncian?

Interestingly, the two languages that are closest to each other by this measurement are Rohoncian and Latin, with a distance of 0.115. Next closest to Rohoncian is Old Hungarian, with a distance of 0.130. The languages that are most distant from each other are Koine Greek and Old Albanian, with a distance of 0.293.
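For readers who want to try this kind of comparison themselves, here is a minimal sketch. It assumes plain Euclidean distance and frequencies expressed as fractions (0.129 rather than 12.9%). Neither choice is stated above, and the function names and the ordering of the six values are mine, chosen for illustration.

```python
import math
from itertools import combinations

def distance(a, b):
    """Euclidean distance between two equal-length frequency profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pairwise_distances(profiles):
    """Return (distance, name_a, name_b) for every pair, closest pair first.

    `profiles` maps a language name to a six-value tuple, assumed here to be
    (initial_1, final_1, initial_2, final_2, initial_3, final_3),
    where 1..3 are the three most common symbols of that language.
    """
    return sorted(
        (distance(profiles[a], profiles[b]), a, b)
        for a, b in combinations(profiles, 2)
    )
```

Given a dictionary with one six-value profile per language, the first entry of the returned list is the closest pair and the last entry is the most distant pair.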

Overall, I am weakly inclined to think that Rohoncian is some kind of Latin or Hungarian. Not only does this particular measurement favor these two languages, but there are graphical similarities between the Rohonc C, I and the Latin e, i.

(In case you are wondering, I also looked at Voynichese, just for the fun of it. It differs significantly from all of the other languages I have looked at, in that the second and third most common letters occur infrequently as initials or finals.)

Tuesday, November 11, 2014

Comparison with Old Hungarian

So far I've compared the relative frequencies of initials and finals for the three most common glyphs in Rohoncian with Latin and Old Albanian. Today I'll do Old Hungarian.

For Old Hungarian, I used the four gospels from the Hussite Bible, and I counted long and short vowels together. The top three letters break down as follows:

e, é    16.5%    6.1%    16.6%
a, á    10.1%    8.5%    10.7%

As in Latin and Rohoncian, the most common letter in Old Hungarian is more frequent as an initial than as a final, while the third most common letter is more frequent as a final than as an initial. As in Latin, these two letters are e and t, respectively.

You might wonder why, if there are around 1,000 glyphs in Rohonc, I am comparing it statistically to alphabets instead of syllabaries or ideographic systems. The reason is that the most frequent glyphs in Rohonc are roughly as frequent as letters ought to be. Among Latin syllables, for example, the most common syllable in the Vulgate version of Genesis is et, but it only accounts for 3.29% of syllables. The most frequent Rohonc glyph is C, and it accounts for 12.9% of glyphs, putting it in the same ballpark as the most frequent letters of alphabetic systems.
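The comparison in the previous paragraph comes down to one number per writing system: the share of the text taken up by its single most frequent symbol. A minimal sketch of that calculation follows. The function name is mine, and the input is assumed to be a text that has already been split into units (letters, syllables or glyph labels).

```python
from collections import Counter

def top_share(symbols):
    """Return (symbol, share) for the most frequent item in a sequence.

    `symbols` can hold letters, syllables or glyph labels; the share is the
    count of the top item divided by the total number of items.
    """
    counts = Counter(symbols)
    if not counts:
        raise ValueError("need at least one symbol")
    symbol, n = counts.most_common(1)[0]
    return symbol, n / sum(counts.values())
```

Run on the same text split first into letters and then into syllables, the share for syllables should come out much lower, because the same text is spread across many more distinct units.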

Monday, November 10, 2014

Comparison with Old Albanian

A couple of days ago I looked at the relative frequency of the three most common Rohonc symbols as initials and finals, and compared that to the relative frequency of the three most common Latin letters.

Today I'll do the same with Old Albanian. My sample text for Old Albanian is Gjon Buzuku's Meshari, the three most common letters of which are e, i and h:


In some respects, Old Albanian fits better than Latin. Latin initial i is far more common than Rohonc initial I (9% versus 5%), while Old Albanian i, like Rohonc I, is far more frequent as a final than as an initial. However, Albanian e occurs more frequently as a final than as an initial.

In order for this to work, the names of two of the evangelists would have to end in h. In fact, the names of two of the evangelists do end in h in the Meshari:

Maξeh: Matthew
March: Mark

Furthermore, Luke is also written with an h: Lucha.

Saturday, November 8, 2014

Relative frequency of initials and finals

In a previous post, I argued that we could use the presence or absence of hyphens at the end of a line to generate some basic statistics about word initials and word finals. At the time I was thinking of using this information to divide the text into words, but over the last few busy months I have been thinking about another use for this data.
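To make the idea concrete, here is a rough sketch of how line ends could be turned into counts. It rests on assumptions that are not spelled out above: that a transcription is a list of lines, that each line is a list of glyph labels in reading order, that a trailing "-" marks a word broken across the line end, and that a line without a hyphen ends on a complete word. The marker and the function name are hypothetical.

```python
from collections import Counter

HYPHEN = "-"  # hypothetical marker for a line-final hyphen in the transcription

def boundary_counts(lines):
    """Count glyphs seen as word initials and word finals at line boundaries.

    A line that does not end in HYPHEN is assumed to end with a complete word,
    so its last glyph is a word final and the next line's first glyph is a
    word initial. A hyphenated line contributes neither of those.
    """
    initials, finals = Counter(), Counter()
    word_continues = False  # True when the previous line ended with a hyphen
    for line in lines:
        hyphenated = bool(line) and line[-1] == HYPHEN
        glyphs = line[:-1] if hyphenated else line
        if not glyphs:
            continue  # skip empty lines without changing state
        if not word_continues:
            initials[glyphs[0]] += 1
        if not hyphenated:
            finals[glyphs[-1]] += 1
        word_continues = hyphenated
    return initials, finals
```

This only samples the glyphs that happen to fall at line edges, so the counts are a subset of the true initials and finals, but they do not require knowing where the word breaks fall inside a line.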

In most (or all?) languages, the frequency ranking of initials differs somewhat from the ranking of finals and medials. For example, in Latin, the letter t occurs nearly four times more often at the end of a word than at the beginning, whereas u occurs about 3.5 times more frequently as an initial than as a final.

Rohoncian is no different from known languages in that respect. For example, the glyph D occurs 8.5 times more frequently as a final than as an initial. If Rohoncian is a known real language, then the difference between frequency ranking in initial, medial and final positions could be used to help narrow it down.
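For a known language with spaces between words, the same statistics are easy to gather. The sketch below is illustrative only. It assumes that words are separated by whitespace, that "frequency as an initial" means the share of words beginning with a letter (and likewise for finals), and that case and punctuation can be ignored. The function name is mine.

```python
from collections import Counter

def positional_frequencies(text, top_n=3):
    """Return (letter, overall, initial, final) for the top_n commonest letters.

    overall = share of all letters in the text
    initial = share of words that begin with the letter
    final   = share of words that end with the letter
    """
    # Keep only alphabetic characters in each token and drop empty tokens.
    words = []
    for token in text.lower().split():
        word = "".join(ch for ch in token if ch.isalpha())
        if word:
            words.append(word)
    if not words:
        raise ValueError("text contains no words")

    overall = Counter(ch for word in words for ch in word)
    initials = Counter(word[0] for word in words)
    finals = Counter(word[-1] for word in words)
    total_letters = sum(overall.values())

    return [
        (letter, count / total_letters,
         initials[letter] / len(words), finals[letter] / len(words))
        for letter, count in overall.most_common(top_n)
    ]
```

Merging long and short vowels, or any other normalization, would have to be applied to the text before it is passed in.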

For example, using the three most common glyphs in Rohoncian, we could construct a kind of litmus test. The relative frequencies of those glyphs are:


If we wanted to test the theory that Rohoncian is Latin and those three glyphs are alphabetic, then we would match them up to the most common three Latin letters:


In broad terms, this correspondence seems to work out well. C and e are both ranked first in overall frequency, and both are somewhat more frequent as initials than as finals. Similarly, D and t share the third position and are significantly more frequent as finals than as initials.

The main problem with this, as far as earlier proposals go, is that two of the evangelists have names ending in D (i.e. CO IH D and XDC D). However, this is already a problem because it seems to work best to read those names as Luke (or Mark) and Matthew, and it is not clear what those names share in common that would lead them to be written with the same final.

On the positive side, I had previously proposed reading the word K O A D CX as "nights". If this is the word noctes, then the D falls in the right place, and CX could be read es. (The glyph CX looks like C, but with a dot).

Part of me wonders what we would get if we looked at initials, medials and finals in the Voynich manuscript. But that carcass has been picked over by smarter minds than mine, and yielded almost nothing.

Thursday, November 6, 2014

Quick (Tironian) Note

I've been very busy for a few months, and some of my projects have languished, including working on the Rohonc codex. (How do you prioritize a project that may not succeed?)

However, I happened to see a page from the Old Irish Book of Leinster that got me thinking about this again. I don't have much time (it's my lunch break) but I thought I could write a quick note about it.

The thing that caught my eye was the use of Tironian notes for their phonetic values. An example is the name "Conchobar", written (among other ways) as follows:

The first glyph in this name looks like a backwards C, but it is none other than the Tironian note for con:

Here, however, instead of representing the Latin morpheme con, it represents only the phonetic value. The same note is used in the name Conall. The Rohonc codex does not look like a text that is written fully in Tironian notation, but it would be interesting to try to transcribe a sample of it as though it were a subset of Tironian notation and see what it sounds like.

If you've ever wondered what a text written completely in Tironian notation looks like, here is a piece from the psalms, given at the end of a 9th-century work titled Commentarii notarum tironianarum: