
Talk:Marshallese language/Archives/2020/January

From Wikipedia, the free encyclopedia


Untitled section

Okay I know this isn't even close to the usual Wikipedia language article format but when I saw the article completely empty I had to add something. This is basically the text from an article I wrote on everything2 some years ago. Please edit this article into shape. I hope I've kick-started it now. — Hippietrail 23:30, 6 Sep 2004 (UTC)

Yokwe and io̧kwe are certainly the same word spelled differently—speculating, I'd guess the former may be an older or more Anglicized spelling; either that, or the language's spelling just isn't quite standardized yet. —Muke Tever 03:36, 13 Sep 2004 (UTC)
In fact, spelling varies greatly in Marshallese, as I observed while teaching high school in the Marshall Islands for a year. Another spelling for yokwe is iakwe. Yokwe is the most common spelling, but /y/ is gradually being replaced by /i/. mssever (Talk | Blog) 01:12, 30 May 2006 (UTC)
In 1998 the Marshall Islands issued a series of stamps, each with a letter of the alphabet and a picture of a word beginning with that letter. This suggests to me that they were celebrating a new official alphabet/orthography. Especially since the language manual and other sources are so different. I'm pretty sure the Marshallese dictionary also uses an older orthography but I haven't been able to find one to look at yet. Hehe - I've just realized I've already got a link to the alphabet stamps anyway! (-: — Hippietrail 06:41, 13 Sep 2004 (UTC)

More to be done

I'm grateful for all the help of User:Austronesier and User:Erutuon. I've copied wikt:Module:mh-pronunc over to Module:mh-pronunc here on Wikipedia, and I made edits to Template:IPAc2-mh. I have not updated the documentation, and I also may have broken its audio sample embedding (though no one ever used it, so I'm not certain). I have updated Help:IPA/Marshallese, though it still needs work (and still needed work even in the years before). In the long run, I think I can do a lot of the remaining work myself. But it would also help, at times, to have a clue what I'm doing with regard to keeping all these pages up to a state at least resembling Wikipedia's standards.

Also, the previous thread was getting far too long anyway, and I thought I could start a new one. - Kilkamej [kilʲ(i)ɡɑmʲɛtʲ] Gilgamesh (talk) 06:12, 9 December 2019 (UTC)


Oh, and some souvenirs:

- Gilgamesh (talk) 06:33, 9 December 2019 (UTC)


Some new thoughts I'd like to discuss with my fellow editors who've been following this topic.

Obviously in our IPA we've not been indicating syllable stress. And as Ng (2017) notes, stress isn't that contrastive in Marshallese, but it does exist. But where minimal pairs may arise is where a vowel's status as epenthetic can shift syllable stress patterns, as epenthetic vowels are essentially completely invisible to syllable stress considerations. However, while epenthetic vowels can never obtain syllable stress (the definition of asyllabicity), I don't get the impression they are necessarily audibly shorter in duration than ordinary vowels. So I have an idea for changing the notation as it exists now. Instead of indicating epenthetic vowels as short [◌̆] as the script does now, indicate them as non-syllabic between consonant glyphs [◌̯]. But in an epenthetic-vowel-glide-normal-vowel sequence /V̯GV/, where it's possible for a long vowel to form but only the second vowel may receive stress, I wonder if it might not be more appropriate to display both vowels as normal vowel glyphs, except that the second, non-epenthetic vowel can be indicated with the syllabic diacritic normally used for syllabic consonants, so Jālwōj is [tʲælʲo̯wɤtʲ ~ tʲælʲoo̩tʲ]. Why? Because of an ambiguity dilemma I noticed with the word eakeak {yakyak}: Its most organic pronunciation would seem to be [æɡæːk], while [ɛ̯ɑɡɛ̆ɑk] (as it is currently transcribed in the script) is a possible more enunciated articulation. The script used to transcribe both semi-vowels and epenthetic vowels with the same notation, but the transcription [ɛ̯ɑɡɛ̯ɑk] is ambiguous because a semi-vowel is shorter than an epenthetic vowel and an IPA transcription that uses the same diacritic for both does not clearly indicate that the second half of the word is audibly longer than the first half. 
Currently the script instead renders this word as [ɛ̯ɑɡɛ̆ɑk ~ æɡæːk], but this is also not necessarily ideal because, again, the epenthetic vowel is not necessarily shorter than a normal vowel—it's merely non-syllabic and can't affect stress patterns. So a possible third option is [ɛ̯ɑɡɛɑ̩k ~ æɡææ̩k]. I realize this transcription style is far more original, as in original research, as no other published reference uses transcription like this, but it at least makes it clear which of two vowels may accept syllable stress and which may not. There is linguistic precedent for a similar kind of distinction, as Classical Attic Greek long vowels could receive stress on either the first or second mora, which is why polytonic Greek writing supported separate diacritics for stress on a vowel's final mora ⟨ά⟩ /a͜á/ or on a long vowel's penultimate mora ⟨ᾶ⟩ /á͜a/. I also realize that I only have intuition and not hard evidence to back up the hypothesis that [æɡæːk] as {yakyak} is stressed any differently than [æɡæːk] as {yakayak} to begin with, and this entire point could be moot, and it is almost certainly moot if both words can only be stressed on their first syllable anyway. But at the same time, it doesn't seem safe to use phonetic transcription whose appearance precludes the possibility of minimal pairs between such sequences.

On another matter, I've been thinking about what Austronesier said earlier about audible off-glides after certain labialized consonants. And since Naan (2014) does seem to distinguish [Cʷ] vs. [Cw] in its IPA (though it transcribes the first without the rounding diacritic [C]), I've been experimenting in the script's sandbox at Wiktionary with transcriptions using full [w] as an off-glide, for example ikbwij as [iɡɯ̆bwitʲ] instead of [iɡɯ̆bˠitʲ], and io̧kwe as [i̯ɒɡwɛ] instead of [i̯ɒɡʷɛ]. This is also guided to some degree by patterns in the standard orthography, as there is no off-glide in m̧uļe [mˠulʷe], which the MOD explains may occur in spelling when a rounded consonant occurs after a rounded vowel. This seems to only apply to /nʷ, rʷ, lʷ/, though, as /kʷ, ŋʷ/ are virtually always spelt out as ⟨kw⟩, ⟨n̄w⟩ before an unrounded vowel, even after a rounded vowel, as in jukwa [tʲuɡwɑ].

Thoughts? I still care very much about peer review. - Gilgamesh (talk) 22:50, 12 December 2019 (UTC)


Okay, the more I read Naan (2014)'s pronunciations, the more skeptical I am of them. Some of them look like they completely naively sounded out the letters of a word without context, making the language suit the orthography rather than the orthography suit the language. For example:

  • doulul {dȩwilwil} [rʲoulʲŭwɯlʲ ~ rʲoulʲuːlʲ] is prescribed by Naan as [ˈɾ̪o.u.lul].
  • io̧kio̧kwe {yi'yakʷyi'yakʷey} [i̯ɒɡʷĭɒɡʷɛ] is prescribed as [iˈɒ.kiˌɒ.kwe].

Up to this point I've been using Naan as a supplemental reference, but its approach is so drastically different from Bender (1968)'s description. I know languages can change in 50 years, but there's such a seeming artificiality to this that it seems to contradict Bender, Choi (1992) and Willson (2003). It's not that I think this isn't probably a real teaching technique for foreigners, or that the result isn't intelligible with spoken native Marshallese. But it also seems to take the "ignore Bender" advice of Rudiak-Gould (2004) to its most logical extreme to promote a prescriptive (rather than descriptive) form that did not previously exist in anyone's speech. I know descriptivism vs. prescriptivism has always been a thing in standardized languages, but this, if so, would seem unusually extreme, and even go as far as dismantling the vertical vowel system of Marshallese. I had added the "more careful" and "less careful" pronunciation modes to the module to try to reconcile the differences I read in Naan, but the more I consult it, the more I think maybe there aren't that many differences to reconcile, and that Naan is even less reliable a reference than I ever thought. Maybe eakeak really is just [æɡæːk], and wajwaj really is just [wɑzʲɒːtʲ], and so forth. If these newer teaching conventions prove influential even to native speakers, then they certainly merit some sort of description, but for now their reliability is severely in question, and I'm prone to believe Bender (1968) and associated descriptive references until there's new comprehensively descriptive data to properly update them. - Gilgamesh (talk) 02:06, 13 December 2019 (UTC)

I dismantled careful mode, as it was based on assumptions supplied by the Naan that no longer hold. Full consonant assimilations have returned, but vowel assimilations are not exactly the same as before, as there is now a difference in words like Jālwōj [tʲælʲo̯wɤtʲ] vs. Jalooj [tʲælʲoːtʲ]. I consulted audio samples to confirm Jālwōj. These two words are phonologically identical except for the epenthetic vowel vs. full vowel, but that really can make a difference in how the vowels congeal. I feel satisfied with the reliability and verifiability of the result. - Gilgamesh (talk) 00:13, 16 December 2019 (UTC)

@Gilgamesh~enwiki: I'm still here, but will need some time to give you detailed input on these matters. Just a short comment now: I'd avoid writing epenthetic vowels as non-syllabic vowels between consonants. On the other hand, the second "ɛ̆" [ɛ̯ɑɡɛ̆ɑk] is admittedly odd; I'd expect something like [ɛ̯ɑɡʌ̆ɛ̯ɑk] in more enunciated speech. –Austronesier (talk) 14:58, 16 December 2019 (UTC)
Admittedly the surface realizations of epenthetic vowels are not phonemic, because epenthetic vowels themselves are not phonemic but merely a way of binding unstable consonant clusters. As such, the pronunciation you suggest is as good as any. But for the time being, I haven't written the F2 of epenthetic vowels neighboring consonants to congeal any differently than similarly-situated ordinary vowels, because I didn't have enough evidence to say differently, and I still don't. In fact, I only have evidence that /CwV/ sequences differ from /CVwV/ sequences, and the word could still very well be [æɡæːk], but I decided to give /CjV/ a similar benefit of the doubt, so to allow the last vowel to be [ɑ]. I could have all epenthetic vowels assume the left consonant's F2, but I think more evidence is needed to support it. Note that it would also change other words, like [tʲælʲŏwɤtʲ → tʲælʲĕwɤtʲ] and [i̯ɒɡʷĭɒɡʷɛ → i̯ɒɡʷŭi̯ɒɡʷɛ].
Currently, since I dismantled careful mode, eakeak is actually [æɡɛ̯ɑk]. In careful mode, I had the {a} in {yak} and {yag} sequences behave like a final vowel if the next vowel was epenthetic. But then I realized, using the orthography as a guide, if [ɛ̯ɑk → æɡV] is an automatic reflex even in grammatical inflections (I did a methodical search in the dictionary to be sure—compare intransitive eakto vs. transitive ākūtwe or eaktuwe), then it seems unjustifiable treating an epenthetic vowel after [ɡ] differently from a non-epenthetic suffix vowel, so the module now changes [ɛ̯ɑk → æɡV] in all candidate circumstances. It's telling that eakto has a standard alternate spelling of ākto anyway, and it looks as if the eakC spelling is for situations where the morphemes may be enunciated in isolation from one another.
Now, [æɡɛ̆ɑk] (if we mark epenthetic vowels with breve as you suggest) was a bit of a guess. It could also be [æɡæ̆ɑk] or [æɡæ̆ɛ̯ɑk]. /j/ congeals differently than /w/ (there is no [j] in the module's phonetic mode output), and I just wanted to keep that part relatively simple in appearance, acknowledging the epenthetic vowel and the semi-vowel at the same time without committing the epenthetic vowel to any particular separate F2. - Gilgamesh (talk) 18:44, 16 December 2019 (UTC)
And sorry for peppering this talk page with so many notes to catch up with. But revisiting an earlier note, I said I was experimenting with [w] off-glides after /pˠ, kʷ, mˠ, nʷ, ŋʷ, rʷ, lʷ/ in certain circumstances in the phonetic output, influenced by orthographic patterns. Those are:
  • [pˠ, bˠ, mˠ] → [pˠw, bˠw, mˠw] (or perhaps more simply [pw, bw, mw]) before front vowels. In this case, a velarized bilabial approximant [β̞ˠ] may technically be more accurate than the labialized velar approximant [w] I suggest for use after dorsal non-glides, but since they are identical or near-identical sounds, the orthography uses ⟨w⟩ for both candidate contexts, and they occur in complementary distribution, it seems to be a distinction without difference.
  • [kʷ, ɡʷ, ŋʷ] → [kw, ɡw, ŋw] before unrounded vowels.
  • [nʷ, rʷ, lʷ] → [nw, rw, lw] before unrounded vowels, but not after other rounded vowels. As it is, the orthographic spellings ⟨ņw⟩, ⟨rw⟩, ⟨ļw⟩ seem extremely rare in the MED, though /nʷ, rʷ, lʷ/ themselves either before or after rounded vowels (and thus usually spelt ⟨ņ⟩, ⟨r⟩, ⟨ļ⟩) appear to be more common. This may actually be evidence that the off-glide does not exist in this case, and ⟨w⟩ is merely an orthographic disambiguator.
Of course, technically, an off-glide in addition to the secondary articulation isn't contrastive for any of these consonants, and is never actually necessary for their articulation, and an inclusion in their phonetic transcription is essentially IPA sugar. It also risks adding unnecessary complexity to the phonetic profile by using separate transcriptions for the same phoneme in different contexts. However, it's worth noting that before Bender (1968) discovered Marshallese was a vertical vowel system, linguists tended to identify six labial non-glides—/p, pʲ, pʷ, m, mʲ, mʷ/—of which unmarked /p, m/ were eliminated by Bender when he identified them as complementary distributions of /pʲ, pʷ, mʲ, mʷ/ instead, and soon after the supposedly labialized phonemes /pʷ, mʷ/ were reanalyzed as velarized /pˠ, mˠ/. But the pre-Bender analysis was—and in a sense, still is—influential enough that the spellings ⟨bw⟩, ⟨m̧w⟩ were standardized as part of the new orthography in a process Bender was a core part of. So in my opinion, the use of sugary [w] in IPA seems to be a case of no real harm done. You may disagree. Thoughts? - Gilgamesh (talk) 08:32, 18 December 2019 (UTC)
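For the record, the first two off-glide rules listed above could be sketched roughly as follows. This is only an illustrative Python sketch, not the actual logic of Module:mh-pronunc (which is a Lua module); the function name `add_offglides` and the vowel classes are assumptions made for the example, and the coronal rule for [nʷ, rʷ, lʷ] is deliberately left out because the evidence for it is weakest.

```python
import re

# Assumed vowel classes for this sketch: the short vowel allophones
# mentioned in this thread, grouped by backness and rounding.
FRONT = "æɛei"
BACK_UNROUNDED = "ɑʌɤɯ"
UNROUNDED = FRONT + BACK_UNROUNDED


def add_offglides(ipa: str) -> str:
    """Replace a secondary articulation with a full [w] off-glide.

    Hypothetical helper covering only two of the rules discussed:
      [pˠ, bˠ, mˠ] -> [pw, bw, mw] before front vowels
      [kʷ, ɡʷ, ŋʷ] -> [kw, ɡw, ŋw] before unrounded vowels
    """
    # Velarized bilabials before a front vowel.
    ipa = re.sub(rf"([pbm])ˠ(?=[{FRONT}])", r"\1w", ipa)
    # Rounded dorsals before any unrounded vowel.
    # Both IPA ɡ (U+0261) and ASCII g are accepted.
    ipa = re.sub(rf"([kɡgŋ])ʷ(?=[{UNROUNDED}])", r"\1w", ipa)
    return ipa


print(add_offglides("iɡɯ̆bˠitʲ"))  # ikbwij
print(add_offglides("i̯ɒɡʷɛ"))     # io̧kwe
print(add_offglides("mˠulʷe"))     # m̧uļe, left unchanged
```

With these rules, ikbwij comes out as [iɡɯ̆bwitʲ] and io̧kwe as [i̯ɒɡwɛ], while m̧uļe stays [mˠulʷe], since [mˠ] is not before a front vowel and the coronals are not touched.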
@Austronesier and Erutuon: What do you think of my proposed sandbox changes in wikt:Module:mh-pronunc? Relevant to my most recent comments. - Gilgamesh (talk) 10:07, 21 December 2019 (UTC)

Lately I'm struggling with an orthographic ambiguity in Marshallese, as reflected in the MOD, which is confounding certain choices I have to work with in refining the phonetic algorithm in the module. In this case, the problem is with words like rej, "they (progressive)." In fact, the entire MOD page for Marshallese words starting with R seems to take a consistent position that {reJ} combinations (where r is velarized and J is any palatalized non-glide) all have the vowel ⟨e⟩, a reflex not reflected anywhere else in the MOD except for the word rej and its reflexes, which by more common rules would otherwise be spelt rōj. So I updated the sandboxed version of the module to produce this reflex, and then I started seeing the existing Wiktionary entries that disagree: Aujtōrōlia {hawijtereliyah}, Jarōj {jarej} and rōplen {replen}.

Now, I already know there are certain alternative reflexes where the MOD recognizes common alternative spellings: wūj vs. uj, wōtōm vs. otem, eo̧n̄wōd vs. eo̧n̄ōd, Kuwajleen vs. Kuajleen, etc. But in those cases, most MOD entries readily mention the existence of the alternative spellings (or at least readily use the alternative spellings in example sentences), and often also have entries for the other spellings that link back to the primary entry spelling. That is not the case with this rōJ vs. reJ division, where most of the MOD covers only rōJ spellings (except for the exceptional word rej) and the R word page covers only reJ spellings. I honestly wonder how much of this is an error by the dictionary's primary editors that failed to update the R page. But it doesn't really help to speculate about this if the resulting hypothesis boils down to unverifiable original research.

I'm tentatively inclined to favor [ʌ, ɤ] instead of [ɛ, e] as the primary phonetic reflex in these cases, since the overwhelming majority of the dictionary prescribes ō instead of e for these words. That leaves me with the one exceptional word rej and its inflections. I could add logic to make an exception just for that one word, and let the inflected forms use the default algorithm (a position somewhat analogous to [ɛ̯ɑk → æɡV], but covering only this one word). But I still wonder why this orthographic exception exists in the first place and is the only such {reJ} exception that seems to be reflected on other dictionary pages besides the R page.

There is another exceptional reflex involving R that is less problematic. Another word, roj, "ebb tide," is actually {rʷej}, which with any other first consonant would regularly be spelt rwōj, but the R page seems to prefer the ro spelling throughout before all palatalized non-glide consonants. The reason this isn't problematic is that I haven't, to my recollection, encountered words in other dictionary pages that actually use rwōJ (or rōJ after a rounded vowel) for this reflex, so I can tentatively make it a general phonetic rule to output [ɔ, o] as the vowel in all these circumstances.

Once again, I apologize for the verbose requests for peer review on this topic. But as you surely understand by now, Marshallese phonology and orthography are very strange creatures, demanding a certain higher quality of automated phonetic output from the Bender phonemes. It's no longer a rhetorical question of, "Why can't the orthography be simpler?" but also a matter of "Why can't we make the IPA transcriptions simpler?", because it's readily agreeable that the old approach, using IPA transcriptions like [ɛ̯ɛzʲ e̯e͡ɤdˠ ɑ̯ɑ͡æmʲ mʲe͡ou͡ɯrˠ], was an utter train wreck of an attempt to explain the subject to wiki readers. - Gilgamesh (talk) 00:51, 23 December 2019 (UTC)


In addition to the [pw, kw], etc. idea, I realized this and some other conventions could improve the readability of Marshallese IPA phonetic transcriptions.

Take Pajjipiik, "the Pacific" [pˠɑtʲːiːbʲiːk]. This is one of the best samples currently at Wiktionary of a Marshallese IPA transcription that suffers from modifier clutter. In addition to the secondary articulations, there's the [ː] symbols applied to both vowels and consonants, and the result is rather difficult for the eyes to scan. But I realized: While the orthography can spell long vowels with one or two letters (reflecting the difference between /CVGC/ vs. /CVGVC/ phonemic sequences), stable geminated consonants are always written with two letters. And borrowing an IPA notation convention from Italian phonology, the readability of Marshallese phonetic transcriptions can be improved by notating geminated consonants with two letters: [pˠɑttʲiːbʲiːk]. This ensures that consonants are only modified by secondary articulation symbols, and that vowels are the only symbols that can take [ː], reducing the complexity of mental logic required to skim IPA transcriptions. This also visibly associates geminated consonants as consonant clusters (two of the same consonant), which is what they are phonemically. So instead of having one set of symbols for geminates [pʲː, pˠː, tʲː, tˠː, kː, kʷː, mʲː, mˠː, nʲː, nˠː, nʷː, ŋː, ŋʷː, rʲː, rˠː, rʷː, lʲː, lˠː, lʷː] and a separate set of symbols for clusters [mbʲ, mbˠ, nzʲ, ndˠ, nrʲ, nrˠ, nrʷ, nlʲ, nlˠ, nlʷ, ŋɡ, ŋɡʷ, rlʲ, rlˠ, rlʷ, lʲtˠ, lˠtˠ, lrʲ, lrˠ, lrʷ], there can instead be a common combined set of symbols for both [ppʲ, ppˠ, ttʲ, ttˠ, kk, kkʷ, mbʲ, mbˠ, mmʲ, mmˠ, nzʲ, ndˠ, nnʲ, nnˠ, nnʷ, nrʲ, nrˠ, nrʷ, nlʲ, nlˠ, nlʷ, ŋɡ, ŋɡʷ, ŋŋ, ŋŋʷ, rrʲ, rrˠ, rrʷ, rlʲ, rlˠ, rlʷ, lʲtˠ, lˠtˠ, lrʲ, lrˠ, lrʷ, llʲ, llˠ, llʷ]. And with previously suggested [w] offglides, some of these can (situationally) become [ppw, kkw, mbw, mmw, ŋɡw, ŋŋw], and [kkw] is certainly easier to skim than [kʷː].
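As a rough illustration of the geminate convention described above, here is a minimal Python sketch. The function name `double_geminates` and the consonant inventory are assumptions for the example only; the real module may handle this quite differently.

```python
import re

# Assumed inventory of base consonant letters used in the phonetic output.
# Both IPA ɡ (U+0261) and ASCII g are accepted.
CONSONANTS = "pbtdkɡgmnŋrlz"
SECONDARY = "ʲˠʷ"


def double_geminates(ipa: str) -> str:
    """Rewrite consonant + [ː] as a doubled consonant letter.

    The secondary articulation, if any, is kept once, after the second
    letter, so [tʲː] becomes [ttʲ]. Long vowels such as [iː] are left
    alone because only consonant letters are matched.
    """
    pattern = rf"([{CONSONANTS}])([{SECONDARY}]?)ː"
    return re.sub(pattern, r"\1\1\2", ipa)


print(double_geminates("pˠɑtʲːiːbʲiːk"))  # Pajjipiik
```

Run on Pajjipiik, this turns [pˠɑtʲːiːbʲiːk] into [pˠɑttʲiːbʲiːk], leaving [ː] only on vowels.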

Another potential way of simplifying Marshallese phonetic IPA is by omitting secondary articulation symbols [ʲ, ˠ, ʷ] before their associated vowel allophones: No [ʲ] before [æ, ɛ, e, i], and no [ˠ] before [ɑ, ʌ, ɤ, ɯ], and no [ʷ] before [ɒ, ɔ, o, u]. The word jiljilimjuonn̄oul, "seventy (archaic)," could have its phonetic transcription simplified from [tʲilʲĭzʲilʲimʲĭzʲuɔnʲɤ̆ŋoulʲ] to [tilĭzilimĭzʲuɔnʲɤ̆ŋoulʲ]. A similar approach is used in Naan (2014), at least for some of the consonants. I only mention this out of thoroughness, though, because I'm not really convinced it's actually a good idea, because none of those secondary articulations are actually superfluous (they differentiate consonant phonemes), and excluding them forces the reader to have to learn the association between secondary articulations and vowel allophones to determine which consonant has which implied secondary articulation.
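Purely for thoroughness, the omission rule just described could be sketched like this in Python. The function name `drop_redundant_articulations` is hypothetical, and the three vowel groupings are taken directly from the paragraph above. The text is decomposed first so that a vowel carrying a breve is still recognized by its base letter.

```python
import re
import unicodedata

# Each secondary articulation mark paired with the vowel allophones
# before which it would be omitted under this proposal.
REDUNDANT_BEFORE = {
    "ʲ": "æɛei",
    "ˠ": "ɑʌɤɯ",
    "ʷ": "ɒɔou",
}


def drop_redundant_articulations(ipa: str) -> str:
    """Remove [ʲ, ˠ, ʷ] when the next vowel already implies them."""
    # NFD splits e.g. ĭ into i + combining breve, so the lookahead
    # sees the base vowel letter.
    text = unicodedata.normalize("NFD", ipa)
    for mark, vowels in REDUNDANT_BEFORE.items():
        text = re.sub(rf"{mark}(?=[{vowels}])", "", text)
    return unicodedata.normalize("NFC", text)


print(drop_redundant_articulations("tʲilʲĭzʲilʲimʲĭzʲuɔnʲɤ̆ŋoulʲ"))
```

For jiljilimjuonn̄oul this yields [tilĭzilimĭzʲuɔnʲɤ̆ŋoulʲ]. The marks before [u] and [ɤ̆] and at the end of the word survive, which shows how the reader would need to infer the missing ones from the vowel quality.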

I really wanted thoughtful feedback before going ahead with these kinds of changes, but I guess it just hasn't been the season for that recently. At least I'm describing them in detail for the record, in case there's a problem later. - Gilgamesh (talk) 10:26, 25 December 2019 (UTC)


With all due respect and gratitude for your assistance, @Austronesier:, I must arrive at the conclusion that I still don't think it's appropriate for epenthetic vowels to be marked with a breve [◌̆] as if they are extra-short. A coda consonant occupies one mora. A consonant and a full vowel phoneme occupy one mora. A consonant and an epenthetic vowel occupy one mora. Epenthetic vowels are not necessarily audibly shorter than full vowels—they are just passed over for syllable stress. I think Bender was correct back in 1968 when he analyzed epenthetic vowels as non-syllabic [◌̯]. After further work over at Wiktionary, I realized that ākūtwe and eaktuwe aren't just synonyms, but they have the same sounds: ākūtwe {yakitwey} is [æɡɯdˠ(u)wɛ], while eaktuwe {yaktiwey} is [æɡ(ɯ)dˠuwɛ]—which, besides epenthesis, is the same sequence of audible consonants and vowels. However, it has become increasingly clear, in line with Ng (2017), that this seems to be the primary means Marshallese handles variations in syllable stress. Given a certain sequence of phonemes, stress seems to be predictable, but the shifting of phonemes between these two words changes which vowel is epenthetic. Now, lacking an additional source, I admittedly cannot prove these two words aren't just straight-up homophones with identical stress patterns. But because of Ng, we do know stress patterns treat consonant-epenthetic-vowel-consonant sequences as equivalent to consonant-consonant clusters and can vary accordingly, making it highly possible that ākūtwe and eaktuwe are stressed differently. As a matter of personal opinion, I also find this form of stress variation fascinating, as the stress exists but is not explicitly reflected in the phonemic model with any kind of separately phonemic syllable stress as can be found in so many other languages with variable stress (like English, Greek, Spanish, etc.).

All this said, I still want to build consensus on which direction to take this. But a disagreement between just two users creates a 50-50 split, robbing us of an editorial consensus on this decision. I'd like to hear an additional person's reasoned take on this, perhaps @Erutuon:'s, if he's up for the task. I can't say that I don't care which direction consensus leans, but in the interests of respecting the Marshallese language and the way Wikipedia represents it to the rest of the world, it's important that whatever consensus is reached have a strongly reasoned foundation that can withstand further dispute on the merits of its reasoning, so I can live with being outvoted. - Gilgamesh (talk) 12:04, 30 December 2019 (UTC)

@Gilgamesh~enwiki: Sorry I haven't contributed much yet to the last parts of your discussion. We should be careful not to conflate phonetic and phonological arguments. Morae are a phonological concept, where IPA syllabicity is a phonetic concept. Sure, epenthetic vowels may not add to syllable weight in the currently favored phonemic analysis, but they are audibly there—and not even shorter than "full" vowels, as you say. So phonetically, they are syllabic. Consequently, I still disprefer transcriptions such as e.g. [æɡɤ̯dˠo], since non-syllabic vowels cannot appear between two consonants. I'd rather skip the breve, then (i.e. [æɡɤdˠo]). –Austronesier (talk) 15:58, 30 December 2019 (UTC)
@Austronesier: Thank you for your response. And you're right—I should try not to conflate those concepts.
This seems like a predicament the IPA wasn't clearly designed for, doesn't it? Rather than phonemic stress, Marshallese effectively has anti-stress, and there doesn't seem to be adequately suitable IPA notation to reflect that. At this point, even something ad hoc might be a better substitute than the variously suggested options, as long as it doesn't inappropriately imply something else. Hmmm...
Another option I've considered (instead of non-syllabic, breve or plain vowel), is to use superscript vowels. For kijdik "dog": [kizʲⁱrʲik] instead of [kizʲirʲik, kizʲi̯rʲik, kizʲĭrʲik]. As shown in a chart at the secondary articulation article, there are superscript symbols [ᵋᶺᵓᵉᵒⁱᵚᵘ] for [ɛʌɔeoiɯu], but there is no superscript equivalent for [ɤ]. There is [ᵊ], which can be paired [ˠᵊ] to reduce ambiguity. I'm still giving this thought. I haven't given up. If there is no consensus for this suggestion, I can keep thinking. - Gilgamesh (talk) 19:33, 30 December 2019 (UTC)
I've now considered additional options, as well, using jiljilimjuonn̄oul as an example.
  • HTML superscripts: [tʲilʲizʲilʲimʲizʲuɔnʲɤŋoulʲ]. Embedding HTML in IPA is not the most elegant solution, and these superscripts don't always match the metrics of the Unicode-based superscript characters, but at least they work for every vowel. Any HTML solution will also have the issue of the effect not necessarily translating through a copy-and-paste of the IPA text.
  • HTML subscripts: [tʲilʲizʲilʲimʲizʲuɔnʲɤŋoulʲ]. Visually distinct from the secondary articulations. Not that lovely, though.
  • HTML font size reduction: [tʲilʲizʲilʲimʲizʲuɔnʲɤŋoulʲ]. Visually works well with most of the vowels, but [i] tends to be hardest to notice (or easiest to confuse with [ː]) when shrunken down.
  • CSS translucency: [tʲilʲizʲilʲimʲizʲuɔnʲɤŋoulʲ]. Like HTML-reliant solutions, CSS-reliant solutions don't adapt well if at all in copy-and-paste, but it at least has the benefit of being a color-based solution that adapts well to any page color scheme, no matter the text color or the background color. Another drawback is that I had to change the {{IPA| markup to {{IPA|1= because of the = characters present in the inline CSS markup. This is not necessarily insurmountable on the template level, but makes things annoying on the manual level.
  • Parentheses: [tʲilʲ(i)zʲilʲimʲ(i)zʲuɔnʲ(ɤ)ŋoulʲ]. A non-HTML-reliant solution that simultaneously highlights the vowel's audibility and its ephemeral nature. My only major concern is how visually disruptive the parentheses could be by making the vowel appear significantly more prominent than non-epenthetic vowels.
- Gilgamesh (talk) 02:40, 31 December 2019 (UTC)
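For comparison, the parenthesis option in the list above is easy to derive mechanically from the breve notation. The following Python sketch assumes the input marks epenthetic vowels with a combining breve (U+0306), as the script's output currently does; the function name `parenthesize_epenthetic` is made up for this example.

```python
import re
import unicodedata

BREVE = "\u0306"  # combining breve, the current epenthetic-vowel marker


def parenthesize_epenthetic(ipa: str) -> str:
    """Turn breve-marked vowels into parenthesized plain vowels."""
    # Decompose so precomposed letters like ĭ become i + combining breve.
    text = unicodedata.normalize("NFD", ipa)
    # Wrap the base letter in parentheses and drop the breve.
    text = re.sub(rf"(.){BREVE}", r"(\1)", text)
    return unicodedata.normalize("NFC", text)


print(parenthesize_epenthetic("tʲilʲĭzʲilʲimʲĭzʲuɔnʲɤ̆ŋoulʲ"))
print(parenthesize_epenthetic("kizʲĭrʲik"))
```

This gives [tʲilʲ(i)zʲilʲimʲ(i)zʲuɔnʲ(ɤ)ŋoulʲ] and [kizʲ(i)rʲik]. Because the result is plain text, it survives copy-and-paste, unlike the HTML- and CSS-based options.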
And, once again, thank you for your time and patience in entertaining my own critical input. - Gilgamesh (talk) 03:33, 31 December 2019 (UTC)
I thought of yet another option.
  • Small parentheses: [tʲilʲ(i)zʲilʲimʲ(i)zʲuɔnʲ(ɤ)ŋoulʲ]. Less visually disruptive, and they just revert to ordinary parentheses during copy-and-paste without changing visual meaning.
- Gilgamesh (talk) 03:54, 31 December 2019 (UTC)
@Austronesier: I tried to reread Ng (2017)'s notes on syllable stress, but her phonological notation in that section is...rather confusing to read, though I was able to match some of her example words with words in the MOD. But I did read her reference that Bender (1968) had made comments about stress. So I reread 1968, and Bender mentions stress only once:

Excrescent vowels between full consonants are reduced in stress to such an extent that they contrast with inherent vowels in similar environments; excrescent vowels contiguous to a semiconsonant seemingly do not undergo such reduction.

What this tells me, that I didn't know before, is that, even in minimal pair circumstances where a word's stress patterns do not change for the other vowels, an epenthetic vowel is noticeably different from a normal vowel. Again, not necessarily shorter (though this is possible), but unstressed and indeed completely repellent and invisible to all considerations of syllable stress. I've been assuming up to this point that this means that epenthetic vowels influence differences in syllable stress patterns, but if it's possible for two otherwise identical words to have stress in the identical position and epenthetic vowels still being perceptibly different in nature, then this may be a game changer.
Now, you've been assuming, not without well-informed reason (as it is generally true in most languages the world over), that epenthetic vowels in any language still form syllables of their own. But in this case, what if they...really don't? At least, not in the way Marshallese speakers treat their own language. If it's possible for consonants in other languages (like in English or Czech) to become syllabic without neighboring a vowel, why isn't it possible for vowels, at least on the perceptive level, to be non-syllabic even when they occur between two non-glide consonants? To be not necessarily approximants or semivowels, but fully-audible vowels without syllabicity. In which case, Bender's non-syllabic notation for them is logical.
This is not to say that the ears of speakers of other languages may not perceive the existence of syllables, but phonotactics means that people are accustomed to hearing distinctions made in languages they are familiar with and have to learn (sometimes with difficulty) to perceive distinctions that are alien to them. And I know this to be true from experience: My mother was a fairly well-educated woman and a native speaker of English, but not only did she speak with the pen-pin merger, but for her entire life she could not hear a difference in the sounds when other people spoke these words differently—she would always hear both words as "pin," say them both back as "pin," and that never changed no matter how much anyone tried to teach her the difference between [pʰɛn] and [pʰɪn]. It was just one epiphany that was either too elusive or too personally unimportant for her to reach. And I had a similar experience with the word "orange," as I always spoke it and heard it as one syllable [~oɻndʒ] even when many other people spoke it with two [ˈɒɹɪndʒ ~ ˈɔɹɪndʒ], and I didn't finally learn to notice the difference until early adulthood. (And I had already begun studying linguistics as a teenager, so it took a little while.) Since then, I fully perceive the difference, but I've come to embrace the quirkiness of my accent's syllable deletion, and my other one-syllable words like "carrot", "cereal", "foreign", "middle", "mirror" and "sandwich", and my two-syllable words like "cooperate", "dangerous" and "deodorant".
Now, I don't fully understand how Marshallese ears can differentiate normal vowels from epenthetic vowels in otherwise similarly-articulated, similarly-stressed words—but if Bender is correct, then they just do. I can only speculate what cues may be involved, whether it involves subtle differences in tone or voice or whatever. But not being able to fully grasp how it works does not mean I'm not able to accept that it does work on some level. Know what I'm saying? Until we do understand it more comprehensively, it remains a sort of quantum mechanic of linguistics—something (as of yet) difficult to directly analyze, but predictable in its result.
And I agree once again: As far as Austronesian languages go, Marshallese phonology is a bizarre and fascinating creature, and the cause of no shortage of proverbial hair-pulling in study. And I've never had so much fun studying one language. - Gilgamesh (talk) 05:10, 1 January 2020 (UTC)
At least, I hope I read Bender's comment correctly. It's times like these that I feel like this article needs a permanent {{Expert needed}} notice, not in request or anticipation that an expert in Marshallese will arrive, but as an acknowledgment that there never is an expert in Marshallese participating. I wish I'd learnt more of the language conversationally when I was young. - Gilgamesh (talk) 22:13, 1 January 2020 (UTC)
I appreciate the comprehensive demonstration of possibilities. The parentheses seem most readable to me. The other transcription methods, such as subscripts and superscripts and smallness and transparency, are awfully innovative and not really proper IPA. As far as I know, adding the syllabic diacritic to vowels, or the nonsyllabic diacritic to a vowel not neighboring another vowel, is similar. (These conventions would probably eventually suffer criticism from phonological editors if they made it out into the cold bright light of article pages.) The breve, the "extra short" diacritic, would be acceptable, though, if it were accurate.
But it sounds like these transcriptions don't clarify to me what the phonetic distinction between epenthetic vowels and full phonemic short vowels actually is. If they have some effect on prosody (stress and such), it would be clearer to indicate that. However, I'm getting the impression that prosody hasn't been explored enough for it to be transcribed, in which case we're stuck. — Eru·tuon 07:13, 2 January 2020 (UTC)
@Erutuon: That is a very fair assessment, and thank you for chiming in.
The impression I'm getting from Bender (1968) and Ng (2017) is that syllable stress isn't as contrastive as in a lot of languages, though it does exist and there are reportedly minimal pairs that make a grammatical difference. And yet, it didn't exist on enough of a phonological level for Bender to indicate it in any way in his phonemic transcription. This led me to assume that the status of epenthetic vowels must have shifted the syllable stress around, but rereading that Bender (1968) comment, even this may not necessarily be a given, if even an identically-situated normal vowel and epenthetic vowel contrast with each other. What is a given is that epenthetic vowels are completely repellant to stress and play no factor in word stress patterns—an epenthetic vowel between two consonants is treated like a consonant cluster without a vowel between them, as they do not exist on a phonemic level.
So what directly sets apart these epenthetic vowels on a superficial phonetic level?
  • Are they shorter than normal vowels? Not necessarily.
  • Are they differentiated by a quality of voice or pitch? Not clear.
  • Maybe it's rhythm? They are completely repellant to syllable stress. But it seems like Bender was saying that, even when similarly situated, they still contrast.
A normal vowel is analyzed as a vowel phoneme, and an epenthetic vowel is analyzed as filler. So what verifiable precedent is there for differentiating them in IPA? Bender (1968) puts a non-syllabic inverted breve beneath epenthetic vowels [◌̯] and leaves normal vowels unmarked. Willson (2003) does not notate them differently at all. Normal vowels and epenthetic vowels also fail to contrast in many situations neighboring glide consonant phonemes, where they fuse with a neighboring normal vowel to create an audible long vowel: Compare [mˠɑːzʲɛlˠ] (not [mˠɑɑ̯zʲɛlˠ]) for M̧ajeļ, phonemically {m̧ahjeļ} or /mˠæɰtʲɛlˠ/. But I think you're already aware of this. So, Bender says they (conditionally) contrast, and Willson doesn't explore this at all beyond acknowledging some of the ways epenthesis can occur.
So that leaves the editorial disagreement on how we are to notate epenthetic vowels:
  • I wanted to use inverted breve [◌̯] like Bender (1968) uses. It reflects sourced precedent, and appears to have at least some reason behind it I cited in that excerpt a little earlier.
  • Austronesier wanted to use normal breve [◌̆], believing that they still form syllables. But because they can't be assumed to have a shorter quality than normal vowels, he changed his position and suggested not notating them differently from normal vowels [◌].
  • You are correct that many of my suggested solutions are not real IPA and would not withstand outside scrutiny, and it is my priority that the solution be able to withstand such scrutiny. You like the solution with parentheses, and find them readable. (Do you prefer normal-sized parentheses [tʲɛrˠ(ʌ)bˠɑlʲ] or small parentheses [tʲɛrˠ(ʌ)bˠɑlʲ]? Strictly where parentheses are concerned, I mean—my vote is still for inverted breves.)
  • This discussion needs experts in the Marshallese language itself. It reeeeeeeeeally needs those experts. Fluent, well-read, active and interactive online. All the published sources and anecdotal evidence in the world only help us so much if none of us are even conversational in the language.
To be honest, at this point I'm leaning just about equally towards either inverted breve [◌̯] or parentheses [()] (while preferring the former, all things being equal), and would support whichever of the two options receives majority support from other editors. If a majority of editors support a different approach (normal breve [◌̆] or unmarked [◌]), I'd still have reservations, and maybe occasionally bring them up again if I find another pertinent reference in the published sources, but unless consensus shifts in a different direction, I'd cooperate. - Gilgamesh (talk) 11:00, 2 January 2020 (UTC)
So, as it stands... One vote for inverted breve, one vote for no special marking, and one vote for parentheses. I suppose no one could be convinced to break the deadlock? ... No, I suppose not. The problem here seems to be that we still lack conclusive notes describing what makes epenthetic vowels unique, so different editors have different preferences on how to proceed. That's entirely fair—there's just still no consensus. I kinda feel bad I wasn't able to find that, but... It's not a failure. Consensus can always congeal in the future. In the meantime, I'll keep collecting and examining references, and sharing any new notes that I find on the topic. - Gilgamesh (talk) 11:14, 3 January 2020 (UTC)

It has just occurred to me that, as a remedy, we should also look at older, pre-Bender sources. Naturally, these sources are not influenced by a phonemic analysis, but at most by spelling conventions (remember the quote from Hernsheim (1880), where he deplores the "missionary spelling" precisely for not rendering epenthetic vowels?). Erdland (1906) writes on p. 197:

"Wird eine mit einem harten Konsonanten beginnende und geschlossene Silbe wiederholt, so muß meistens ein euphonisches e eingeschaltet und wenn der Endkonsonant ein k ist, dieser vor dem e in ein g abgerundet werden."
("If a syllable beginning and ending with a hard consonant is repeated, usually a euphonic e must be inserted, and if the final consonant is k, the latter must be softened to g before the e.")
Example: meloklok ("forget"), pronounced melógelok

Unfortunately, it is not clear whether "euphonic e" is [e/ɛ] or [ə], since German e is ambiguous here. From "melógelok" however, we can conclude that Erdland means [ə]. I'll try to look further in other sources. –Austronesier (talk) 12:41, 3 January 2020 (UTC)

All right. I suppose I have a lot more faith in Bender than some editors would readily subscribe to, but part of that is that he was responsible for some of the most transformative phases of Marshallese study. Discovering the vertical vowel system. Discovering that every consonant had a secondary articulation. Crafting a supplemental phonemic reference orthography. Helping refine the new standard orthography and the MED. There is no modern academic study of Marshallese without Bender's influence, whether welcome or unwelcome. But yes, it's possible for anyone to make mistakes. That's part of why I asked for help in the first place, because I know I've made mistakes crafting these articles, even when I wasn't entirely certain what those mistakes were.
Anyway, I thought of something else, as somewhat anecdotal as it may be. It's not just that epenthetic vowels may be truly non-syllabic, but that they can also conditionally disappear in certain modes, such as in song and chanting. Song can vary in some qualities (like plosive vs. fricative quality of ⟨j⟩), but either way, it seems more common to completely omit epenthetic vowels. And while I know there are dangers in conflating the fashions of verse and prose (even in English, people can speak and sing in what are effectively different accents), I do now find myself wondering whether it's not just that epenthetic vowels are either non-syllabic or ephemeral, but that they are actually both at once. So instead of just a diacritic [◌̯] or parentheses [(◌)], both at the same time [(◌̯)] may actually be more appropriate. But again, since this is kind of anecdotal, I can't say by myself how appropriate or supportable this idea actually is. - Gilgamesh (talk) 14:10, 3 January 2020 (UTC)
Oh, and if the word you're referring to is the same as the MED's meļo̧kļo̧k {meļakʷļakʷ}, then the epenthetic vowel is probably some variation (diphthongal or monophthongal) of [ɔ͡ʌ]: The module currently yields [mʲɛlˠɒɡʷ(ɔ)lˠɒkʷ]. When diphthong mode is enabled, it yields [mʲɛ͜ʌlˠɑ͜ɒɡʷ(ɔ͜ʌ)lˠɑ͜ɒkʷ]. It's not difficult to imagine this could be heard as [ɡʷə ~ ʷɡə]. - Gilgamesh (talk) 14:17, 3 January 2020 (UTC)
And, of course, allowing for over a century's worth of language drift. - Gilgamesh (talk) 14:22, 3 January 2020 (UTC)
@Gilgamesh~enwiki: When I suggest to consult sources pre-Bender, this is not motivated by distrust in Bender's analysis, but rather by trying to find sources which are not primarily concerned with phonology, which by nature entails a degree of abstraction that is actually an obstacle for our goal here, viz. a uniform and well-sourced phonetic transcription of Marshallese. The only source post-Bender which actually covers acoustic phonetics is Choi, who unfortunately relegated epenthetic vowels to further research (p.122).
The different pronunciation in song and prose is not surprising, and at least confirms that the abstract phonemic analysis (which omits epenthetic vowels) follows speaker intuition. If we had a quotable source which confirmed that epenthetic vowels are optional, or omitted in certain types of declaration (including singing), we could fully justify the parenthesis solution: [tʲɛrˠ(ʌ)bˠɑlʲ] then would essentially indicate that both [tʲɛrˠʌbˠɑlʲ] and [tʲɛrˠbˠɑlʲ] are possible, depending on speech style. –Austronesier (talk) 16:13, 3 January 2020 (UTC)
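To make the intended reading of the parenthesis notation concrete, here is a minimal Python sketch. It is only an illustration of the convention described above, not part of the module, and the function name expand_optional is hypothetical. It takes a transcription with parenthesized segments and returns the two readings the notation is meant to cover.

```python
import re


def expand_optional(transcription: str) -> tuple[str, str]:
    """Return (full, reduced) readings of a transcription.

    Segments written in parentheses are treated as optional:
    the full reading keeps them, the reduced reading drops them.
    """
    # Keep the parenthesized material, drop only the parentheses.
    full = re.sub(r"\(([^()]*)\)", r"\1", transcription)
    # Drop the parentheses together with their contents.
    reduced = re.sub(r"\([^()]*\)", "", transcription)
    return full, reduced


full, reduced = expand_optional("tʲɛrˠ(ʌ)bˠɑlʲ")
print(full)     # reading with the epenthetic vowel
print(reduced)  # reading without it
```

Whether both readings are really attested in the relevant speech styles is exactly what a quotable source would have to confirm; the sketch only shows what the notation would claim.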
Ahh, that's a very good point. Yes, we should find more of those pre-Bender phonetic sources.
Anyway, here are some YouTube song samples, with lyrics provided in the video for word-to-phonetic comparison.
  • A cover of a traditional song, Ij Io̧kwe Ļo̧k Aelōn̄ Eo Aō. J is plosive, and all obstruents seem more or less voiceless in all positions. It makes me think of how, at the time the MED was compiled, sibilant pronunciations of J were considered lisping. As an added note, the lyrics in the video obviously came from the Wikipedia article, because I assembled that article originally (updating an old orthography lyrical reference with MED-informed new orthography spellings), so that particular reference is circular, but it is still a useful textual frame of reference to go with the audio.
  • A pop song. J is sibilant fricative in all positions. Many epenthetic vowels are elided, though some remain. It makes me think of French singing, where [ə] is traditionally articulated syllabically in positions where it would be completely silent in typical speech, except in this case the opposite occurs where the epenthetic vowels of speech are partially or wholly omitted in song. Some of the consonant cluster assimilations found in speech also seem absent in song.
But because song can be so different from speech, the prose vs. verse conflation risk comes to mind. From what I recall, French dictionaries with IPA only give pronunciations for speech, not for song, which is why frère and Jacques together are [fʁɛʁ ʒɑk], not [fʁɛ.ʁə ʒɑ.kə] as in song. Singing, in whatever language, is something that tends to be learnt separately from speaking. It's worth studying in its own right, but prose pronunciation guides are about speech and not about song. - Gilgamesh (talk) 18:55, 3 January 2020 (UTC)
It seems to me that the greater absence of both epenthetic vowels and consonant cluster assimilations in Marshallese singing has to do with a more focused phonetic control inherent in artistically cultivated verse. Singing, in just about any language, tends to be more enunciated than in speech, and is usually slower than speech as well. In Marshallese prose, both epenthetic vowels and consonant cluster assimilations appear to be the general rule. After this additional consideration, I still feel inclined to notate epenthetic vowels with inverted breve, no parentheses [◌̯]. - Gilgamesh (talk) 15:50, 4 January 2020 (UTC)

@Austronesier: You wanted a pre-Bender phonetic analysis of Marshallese, right? Have you read Notes on Marshallese consonant phonemes by Denzel Carr (1945)? I realized I had a copy collecting proverbial dust on my hard drive after someone gave me a bunch of Marshallese-related references years ago. While this one mostly pertains to consonants, it does touch on epenthetic vowels. - Gilgamesh (talk) 05:38, 5 January 2020 (UTC)

@Gilgamesh~enwiki: Thank you for this one, no, I hadn't read it before. It's great, and almost all of what we need is here:
  1. "Aside from these three cases, a vowel is pronounced, but not written, between the consonants." (When they are there, they are there, and not just ephemerally)
  2. "This vowel disappears if the syllables are pronounced slowly and separately." (This confirms your observation about pronunciation in songs)
  3. "[-mt-, -nk-, -ñp-]... become [-m(ə)t-, -n(ə)k-, -ŋ(ə)p-]" (We have a source for using parentheses, so nobody can say we're doing OR)
  4. "The excrescent vowel takes on the coloring of the preceding and following sounds, particularly the vowel of the following syllable." But («bummer»): "This will be treated as a part of the general phenomenon of assimilation to be taken up with a study of the vowel phonemes."
Too sad the latter never saw the light, so when it comes to the quality of epenthetic vowels, we still need to put together all hints from various sources. –Austronesier (talk) 18:59, 5 January 2020 (UTC)
  1. So for a vowel that is pronounced and not just ephemerally...
  2. ...and yet disappears in careful pronunciation...
  3. ...maybe [◌̯] with non-syllabicity is appropriate after all, since the vowel is neither quite ephemeral nor does it occupy a counted syllable. In slow and separate speech, consonant assimilations, consonant voice patterns, etc., may not occur either. Again, the French IPA analogy may apply, as dictionary pronunciation guides are generally for speech rather than for song. Parentheses may not be necessary if the former assumption holds.
  4. Maybe the vowel's coloring was studied in further detail, going by Bender (1968), Choi (1992), Willson (2003).
The module's algorithm currently gives epenthetic vowels between two non-glides (and after non-glides and before a few exceptional glide conditions) a vowel height consistent with all of Bender's 1968 examples: max(F1[left vowel], F1[right vowel], F1[/ɛ/]). And its F2 in the same conditions is calculated the same as for every other vowel, so as to avoid making additional unsupported assumptions. Whether the F1 algorithm is also appropriate before surfaced bare [w] (as in [irˠu̯wɤtʲ] instead of [irˠo̯wɤtʲ] for irwōj) is...a judgment call on my part, but appears to fall squarely in the realm of "no harm done." There may never be a 100% accurate system for describing the surface realizations of epenthetic vowels, since they are defined more by their presence than by their height or quality.
It's also increasingly clear that word boundaries may transform significantly as a matter of rather complex sandhi in the same uninterrupted speech: Short vowels can become long (Aelōn̄in [ɑelʲɤŋinʲ] + Ae [ɑɛ] = [ɑelʲɤŋinʲɑːɛ]), vowel reflexes can change (eak [ɛ̯ɑk] + eak [ɛ̯ɑk] = eakeak [æɡɛ̯ɑk]), consonants pronounced voiceless become voiced and may even assibilate (Jalwōj [tʲælo̯wɤtʲ, tʲælo̯wɤzʲ-, -zʲælo̯wɤtʲ, -zʲælo̯wɤzʲ-]), and consonant clusters assimilate or epenthesize even across word boundaries (kajin [kɑzʲinʲ] + M̧ajeļ [mˠɑːzʲɛlˠ] = [kɑzʲinʲ(i)mˠɑːzʲɛlˠ]). These seem to be learnt as part of the wider nature of the language's phonotactics—a version of the algorithm internalized as a speech reflex. As the many different permutations of word boundary sandhi increase with different word combinations, a pronunciation guide becomes increasingly complex to write algorithmically if we were to describe every condition. And the considerations grow even more complicated when dealing with affixes with bare vowels like ri- {ri-} that fuse to morphemes, which interact with word boundaries to change words like Wōjjā {wejjay} [wʌttʲæ] into ri-Wōjjā {ri-wejjay} [rˠuɔttʲæ], and In̄lij {yiglij} [iŋ(i)lʲitʲ] into ri-In̄lij {ri-yiglij} [rˠiːŋ(i)lʲitʲ]. The details that are currently provided by the Marshallese phonemes and each word's isolated phonetic realization may have to suffice for a pronunciation guide, as there technically is enough data present to predict the various reflexes. At least we're not dealing with something like French liaison where entire fully-qualified obstruent phonemes appear or disappear outright in grammatically-sensitive conditions. - Gilgamesh (talk) 01:01, 6 January 2020 (UTC)
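For readers who want the height rule max(F1[left vowel], F1[right vowel], F1[/ɛ/]) spelled out, here is a minimal Python sketch. It is not the module's code. The numeric scale (1 for /æ/ up to 4 for /i/) and the names HEIGHT and epenthetic_height are assumptions made for illustration; only the ordering of the four vowel heights matters.

```python
# Hypothetical height scale, low to high. The numbers are an assumption;
# only their ordering matters for the rule.
HEIGHT = {"æ": 1, "ɛ": 2, "e": 3, "i": 4}
PHONEME = {rank: vowel for vowel, rank in HEIGHT.items()}


def epenthetic_height(left: str, right: str) -> str:
    """Height of an epenthetic vowel between two non-glide consonants.

    It is at least as high as /ɛ/, and otherwise as high as the
    higher of the two flanking vowel phonemes.
    """
    rank = max(HEIGHT[left], HEIGHT[right], HEIGHT["ɛ"])
    return PHONEME[rank]


print(epenthetic_height("æ", "æ"))  # floor of /ɛ/ applies
print(epenthetic_height("i", "e"))  # higher neighbor wins
```

With two flanking /æ/ vowels, as in {meļakʷļakʷ}, the rule gives /ɛ/ height, which seems consistent with the (ɔ) the module is reported to yield there. The backness and rounding (F2) of the result are decided separately and are not modeled in this sketch.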
Based on points 1–3, I now essentially agree with Erutuon in using parentheses. My point 4 specifically referred to Carr's research. But anyway: Bender (1968) says it's predictable, but only gives examples and no rules; so does Willson (2003), while Choi (1992) postponed the topic. "Coloring" refers to F1 and F2. We have agreed that F1 certainly follows the same rules for full and epenthetic vowels (backed by Willson, implicitly also by Choi), while for F2, Carr's statement "the excrescent vowel takes on the coloring of the preceding and following sounds, particularly the vowel of the following syllable" is the only rule-like formulation in the published literature that I am familiar with as of now. It might help to refine the existing algorithm max(F1[leftVowel], F1[rightVowel], 2).
I have done some reading about the topic of phonetic non-glide non-syllabic vowels. In textbook phonetics, non-syllabic vowels are equated with glides, which always occur next to a syllabic full vowel. This is the main reason why I have opposed the inverted breve in spite of its use in Bender (1968). Recent phonological research by Hall (2006) about "intrusive vowels" however gives a more differentiated picture. Hall writes that a syllable can be rigidly defined only as an abstract phonological unit, while there is no valid cross-linguistic characterization of a syllable as an acoustic [i.e. phonetic] object. So she does describe "intrusive vowels" as non-syllabic; however, she uses a sub-bar in her notation. Other authors describe such intrusive vowels as a "boundary phenomenon" between phonetics and phonology, e.g. Sebregts (2015). So far, I have found no example where an author writes non-syllabic intrusive vowels with the IPA inverted breve, maybe because the latter is universally associated with a glide. Marshallese epenthetic/excrescent vowels fit well in the cross-linguistic description of intrusive vowels. To equate them would be OR, of course. The main objective of my cross-linguistic search was to find examples which could help us to justify the notation with an inverted breve within the modern IPA framework (negative so far, but still searching). Btw, at the time of Bender (1968) the inverted breve was not the IPA standard for non-syllabic vowels, so his use of the inverted breve must be understood as an ad hoc notation. –Austronesier (talk) 12:52, 6 January 2020 (UTC)
All right, we have consensus on parentheses. And updated accordingly for all versions.
And no, I didn't realize that Bender's 1968 inverted breve notation was considered ad hoc at the time. I did recognize some of his phonetic notations as ad hoc, though, like superscript "y".
As for epenthetic F2... Without additional clarity, the description "particularly the vowel of the following syllable" seems too vague to glean anything new from it. As it is, the module's algorithm already often assigns the F2 of vowels to match the secondary articulation of the following consonant, depending on the F1 and the consonants involved. And because the secondary articulations of consonants influence vowels on both sides, it's not necessarily unusual for both of those vowels to share the same F2. It's worth noting that it wasn't until Bender (1968) that it was firmly established that Marshallese is a vertical vowel system, and back in 1945 their understanding of the language's phonological nature was still incomplete. So the 1945 paper's comment, in its vagueness, doesn't necessarily contradict the algorithm as it stands. - Gilgamesh (talk) 02:15, 7 January 2020 (UTC)

After rereading the Bender (1969) section on phonology (very early in the book), I realized that where /nʲ, nˠ, rʲ, rˠ, lʲ, lˠ/ are concerned, the palatalized phonemes place the tongue behind the front teeth, and the velarized phonemes place the tongue behind (not on) the alveolar ridge. That is, palatalized dental [n̪ʲ, r̪ʲ, l̪ʲ] and velarized postalveolar [n̠ˠ, r̠ˠ, l̠ˠ]. However, /tʲ, tˠ/ are different: /tʲ/ can be alveolar [tʲ] (especially in song) or postalveolar [t̠ʲ] (especially in speech) or in between, which is why I conservatively notate all its allophones with alveolar symbols [tʲ, zʲ] instead of postalveolar [t̠ʲ, ʒʲ ~ ʑ] or prepalatal symbols [c̟, ʝ̟] (which they can still be, in free variation). But /tˠ/ is specifically dental [t̪ˠ], behind the front teeth. I find it rather peculiar that /tˠ, nʲ, rʲ, lʲ/ are dental (one velarized, the other three palatalized), but [tʲ, nˠ, rˠ, lˠ] are postalveolar (one palatalized, the other three velarized). This gives /tʲ, tˠ/ a relationship with the alveolar ridge that is the polar opposite of the one /nʲ, nˠ, rʲ, rˠ, lʲ, lˠ/ have. (In any event, the rounded consonant phonemes have the same place of articulation as the velarized phonemes, but with added lip rounding. /nʷ, rʷ, lʷ/ exist as rounded postalveolar phonemes [n̠ʷ, r̠ʷ, l̠ʷ], but /tʷ/ does not exist at all.) Now, in terms of stable consonant clusters, any combination of two of /nʲ, nˠ, rʲ, rˠ, lʲ, lˠ/ can form a non-epenthetic cluster (though with regressive assimilation of secondary articulation, and the first of the pair assimilates to /n/ if the second of the pair is /n/), and /nʲtʲ, nʲtˠ, nˠtʲ, nˠtˠ/ can also form non-epenthetic clusters (again with regressive assimilation of the secondary articulation), and besides an exceptional allowance for /lʲtˠ, lˠtˠ/, any other combination of these four primary articulations is unstable /tn, tr, tl, rt, lʲtʲ, lˠtʲ/ and triggers epenthesis.
But it got me thinking that if the stable clusters trigger regressive assimilation of secondary articulations, does it also trigger regressive assimilation of primary place of articulation, too? Like...dental before dental, and postalveolar before postalveolar? If true, that would mean that even if [n̪ʲ] is palatalized dental in isolation, and [n̠ˠ] is velarized postalveolar in isolation, maybe /nˠtˠ/ is actually [n̪d̪ˠ], with a velarized dental [n̪ˠ] that otherwise cannot exist in isolation. Not that all this is a huge concern if we omit dental vs. postalveolar diacritics in simplified phonetic notation, but it makes me wonder whether this is why, in non-compound words (mostly loanwords), spellings with ⟨nt⟩ appear far more commonly in the MED than spellings with ⟨ņt⟩, even though ⟨nt⟩, as written, is a mismatch of secondary articulations /nʲtˠ/. Just an oddity I observed, that's all. - Gilgamesh (talk) 09:30, 5 January 2020 (UTC)
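The cluster conditions listed above for the four coronal primary articulations can be written down as a small Python sketch. This is only a restatement of that description under stated assumptions, not the module's logic. The function name needs_epenthesis and the (primary, secondary) tuple encoding are hypothetical, rounded consonants are left out, and /tt/ is deliberately not decided because the description does not mention it.

```python
SONORANTS = {"n", "r", "l"}
PRIMARIES = SONORANTS | {"t"}


def needs_epenthesis(first: tuple[str, str], second: tuple[str, str]) -> bool:
    """Return True if the two-consonant cluster is described as unstable.

    Each consonant is a (primary, secondary) pair such as ("n", "ʲ")
    or ("t", "ˠ"). Only the palatalized and velarized coronals from
    the description are covered.
    """
    p1, _ = first
    p2, s2 = second
    if p1 not in PRIMARIES or p2 not in PRIMARIES:
        raise ValueError("only /t, n, r, l/ are covered by this sketch")
    if p1 == "t" and p2 == "t":
        raise ValueError("/tt/ is not addressed in the description")
    if p1 in SONORANTS and p2 in SONORANTS:
        return False  # any two of /n, r, l/ form a stable cluster
    if p1 == "n" and p2 == "t":
        return False  # /nt/ is stable with any secondary articulations
    if p1 == "l" and p2 == "t" and s2 == "ˠ":
        return False  # exceptional allowance for /lʲtˠ, lˠtˠ/
    return True       # /tn, tr, tl, rt, lʲtʲ, lˠtʲ/


print(needs_epenthesis(("l", "ʲ"), ("t", "ʲ")))
print(needs_epenthesis(("l", "ˠ"), ("t", "ˠ")))
```

The regressive assimilation of secondary articulation in the stable clusters is a separate step and is not modeled here.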


There is another oddity I've observed. It's the wholesale differences in vowel reflexes neighboring phonemically rounded consonants /Cʷ/ vs. those neighboring velarized labial consonants /pˠ, mˠ/. All of these consonants have very pronounced phonetic rounding, and all of them seem to have conditional [w] off-glides, as reflected in orthographic spellings bw, kw, ļw, m̧w, ņw, n̄w, rw. And, from what I can tell, while these off-glides are not themselves phonemically contrastive (they are effectively euphonic), they do genuinely seem to occur, and even pre-Bender linguists believed there were separate /p, m, pʲ, mʲ, pʷ, mʷ/ phonemes. And, like Austronesier said much, much earlier in this talk page, it's at least relatively normal for there to be [w] off-glides in other languages with velarized labials, including Irish. And yet, other than the [w] off-glides, vowel reflexes behave no differently around b, bw, m̧, m̧w than they do around other velarized unrounded phonemes.

I have one (very OR but potentially supportable) theory to explain this: It may be that the rounded phoneme off-glides are actually labialized velar approximants [ɰʷ], and the velarized labial phoneme off-glides are actually velarized bilabial approximants [β̞ˠ]. Both are essentially labiovelar approximants [w], but the differences in vowel reflexes neighboring them effectively make them seem like different sounds altogether. Even if theoretically these could be realized as acoustically different sounds, I'm not aware of any language that contrasts them phonemically, and yet Marshallese seems to contrast them phonetically.

In the module algorithm as it stands, when a likely off-glide is detected, the phonetic symbols [pˠ, bˠ, kʷ, ɡʷ, mˠ, ŋʷ] are instead (rather experimentally) expressed as [pw, bw, kw, ɡw, mw, ŋw], and given that these are fairly similar to the orthography, they are rather easy to read. Below is the (cached) algorithmic feedback of how similarly-situated /pˠ/ and /kʷ/ behave in CVC sequences.

Pattern /pˠ/ /kʷ/
/tʲæC, Cætʲ, tʲæCætʲ/ [tʲɑpˠ, pˠɑtʲ, tʲɑbˠɑtʲ] [tʲɒkʷ, kwɑtʲ, tʲɒɡwɑtʲ]
/tʲeC, Cetʲ, tʲeCetʲ/ [tʲepˠ, pˠɤtʲ, tʲebˠɤtʲ] [tʲokʷ, kwɤtʲ, tʲoɡwɤtʲ]
/tʲiC, Citʲ, tʲiCitʲ/ [tʲipˠ, pwitʲ, tʲibwitʲ] [tʲukʷ, kʷutʲ, tʲuɡʷutʲ]
/jæC, Cæj, jæCæj/ [ɛ̯ɑpˠ, pwæ, ɛ̯ɑbwæ] [ɛ̯ɒkʷ, kwæ, ɛ̯ɒɡwæ]
/jeC, Cej, jeCej/ [epˠ, pwe, ebwe] [e̯okʷ, kwe, eɡwe]
/jiC, Cij, jiCij/ [ipˠ, pwi, ibwi] [i̯ukʷ, kwi, iɡwi]
/kˠæC, Cækˠ, kˠæCækˠ/ [kɑpˠ, pˠɑk, kɑbˠɑk] [kɒkʷ, kwɑk, kɒɡwɑk]
/ɰæC, Cæɰ, ɰæCæɰ/ [ɑpˠ, pˠɑ, ɑbˠɑ] [ɑkʷ, kwɑ, ɑɡwɑ]
/ɰeC, Ceɰ, ɰeCeɰ/ [ɤpˠ, pˠɤ, ɤbˠɤ] [ɤkʷ, kwɤ, ɤɡwɤ]
/ɰiC, Ciɰ, ɰiCiɰ/ [ɯpˠ, pˠɯ, ɯbˠɯ] [ɯkʷ, kwɯ, ɯɡwɯ]

The vowels behave very differently around the [w] in [pw] than they do around the [w] in [kw], and the circumstances which trigger the presence of the off-glides at all are also different. So if I'm right and these are actually [β̞ˠ] vs. [ɰʷ], what is the acoustic difference? Both are rounded, but perhaps [ɰʷ] has protruded lips and [β̞ˠ] does not? Maybe rounded phonemes involve not just rounding, but also said lip protrusion? At this point this is all speculation, and needless to say, both [pβ̞ˠ] and [kɰʷ] are visually uglier and less quickly readable in phonetic transcription than simple [pw, kw]. But a better (and preferably sourced) understanding of the existence and nature of these off-glides may affect how they are addressed in the article. - Gilgamesh (talk) 10:34, 8 January 2020 (UTC)


And by the way (yes, this is part of the same edit as above), isn't there a bot that archives old sections of talk pages that grow too big? This page could really use some automated archiving. - Gilgamesh (talk) 10:34, 8 January 2020 (UTC)

New sections

Maybe we should start new sections more often for topic shifts. I know it's generally more my style to shift amorphously from topic to topic, but it would help keep this talk page organized and easier to archive. My next comment will either be a response to an existing thread (if someone replies or there's something new), or a new section. And maybe some of the more monolithic sections could conceivably be split up into sections where feasible, I don't know. - Gilgamesh (talk) 15:03, 8 January 2020 (UTC)

Byron W. Bender

In other news, I just found out today that Byron W. Bender passed away on January 4 of this year. For all these decades there was practically no major linguistics publication about the Marshallese language that wasn't either written or co-written by him or mentioned, referenced or built upon his work. I know these talk pages aren't discussion forums, but I thought everyone should know. - Gilgamesh (talk) 23:41, 12 January 2020 (UTC)

Which IPA off-glides are better?

So, given the following orthographic and phonemic sequences, which phonetic sequence is a better presentation, balancing concerns of verifiability, orthographic resemblance and readability?

bū ku m̧ū n̄u bwi kwi m̧wi n̄wi Spellings using the standard orthography.
bih kʷiw m̧ih gʷiw biy kʷiy m̧iy gʷiy Phonemic spellings using Bender's orthography.
pˠiɰ kʷiw mˠiɰ ŋʷiw pˠij kʷij mˠij ŋʷij Phonemic IPA sequences.
Possible phonetic IPA transcriptions. The first four IPA columns have no phonetic off-glides and are identical for all rows.
#1 pˠɯ kʷu mˠɯ ŋʷu pˠi kʷi mˠi ŋʷi   Secondary articulations with no off-glides at all. Makes no difference in meaning, because off-glides, if and where they exist, are not phonemic. One could even make the plausible case that off-glides may not occur with any reliability at all and are purely orthographic.
#2 pˠɯ kʷu mˠɯ ŋʷu pwi kwi mwi ŋwi Graphically simple off-glides, with the secondary articulations omitted but implied by context, given [w] is both velarized and rounded, and all the consonants involved also have some sort of phonetic rounding. There is no ambiguity, as there are no separate /pʷ, mʷ/ phonemes to complement /pˠ, mˠ/—the phonetically rounded off-glide is a reflex of the velarized labial phonemes themselves before front vowels. This is the option the module is currently using, as I consider it the least visually cluttered and most readable option.
#3 pˠɯ kʷu mˠɯ ŋʷu pˠwi kwi mˠwi ŋwi Secondary articulations omitted and implied by off-glides after rounded phonemes, but explicitly included after velarized labial phonemes with off-glides.
#4 pˠɯ kʷu mˠɯ ŋʷu pˠwi kʷwi mˠwi ŋʷwi Explicit secondary articulations with off-glides in all candidate positions, especially if the off-glide is thought of as part of the following vowel reflex instead of the consonant itself.
#5 pˠɯ kʷu mˠɯ ŋʷu pβ̞ˠi kwi mβ̞ˠi ŋwi Use [β̞ˠ] for off-glides after velarized labials, emphasizing that it is phonetically but not phonemically rounded. The difference between [w] and [β̞ˠ] may not necessarily be graphical, either, if velarized labials and rounded phonemes have different manners of rounding, but this may be straying more into OR.
#6 pˠɯ kʷu mˠɯ ŋʷu pβ̞ˠi kɰʷi mβ̞ˠi ŋɰʷi Same as above, but even more pedantic and unquestionably OR, included here for completeness. The only language I can recall whose conventional IPA uses [ɰʷ] transcription (phonetically identical to [w]) is Guarani.
#7 pˠɯ kʷu mˠɯ ŋʷu pˠβ̞ˠi kʷwi mˠβ̞ˠi ŋʷwi I'll be honest: OCD is the only reason these two rows exist at all.
#8 pˠɯ kʷu mˠɯ ŋʷu pˠβ̞ˠi kʷɰʷi mˠβ̞ˠi ŋʷɰʷi

Where off-glides are used, their presence is algorithmically informed by regular patterns in the orthography. Up to this point, this has been a matter of judgment, but it would be preferable if this could meet consensus and withstand scrutiny.

@Austronesier: You've already touched on this briefly before, but I'd especially like to pick your mind on this, here and now. - Gilgamesh (talk) 08:20, 12 January 2020 (UTC)

@Gilgamesh~enwiki: I'm still following, but, among other things, too much involved in tiring discussions (e.g. Talk:Hindustani language) at the expense of our challenging, but very inspiring exchange here. Too sad to hear about Bender. He was a real great, not just in the field of Micronesian linguistics, but also as an academic heading the Linguistics department in Manoa, and as long-time editor of Oceanic Linguistics.
I opt for #1. Solution #4 is maybe closer to the acoustic facts, but redundant, given the fact that the naturally occurring off-glide after a labiovelarized consonant will at the same time be heard as an on-glide to the vocalic element that follows. –Austronesier (talk) 12:20, 13 January 2020 (UTC)
@Austronesier: I'm pleased you find our exchanges inspiring. Sometimes I fear they grow repetitive, but I enjoy the constructive feedback process. I've been in discussions before where users try to tear down each other's logic as if it's a zero-sum game, and it can get uncivil. What's the use (or fun) in that? It doesn't help improve everyone's collective understanding of a topic. As much as I enjoy the process here, I hope the process is similarly enjoyable for others.
Yes, I found out emailing Steve Trussel, webmaster of the MOD (first email I sent him since 2011), asking him questions about the purpose and intent behind the orthographic variants used by the MOD (ḶḷṂṃṆṇÑñỌọ). Since I've even seen at least one print book using these variants, I wondered if it represented not just a display workaround, but whether he and Dr. Bender intended them to completely supersede the standard orthography's letters (ĻļM̧m̧ŅņN̄n̄O̧o̧) in all kinds of media. But no, he told me they're purely a workaround, and that the MOD would probably be updated to use the standard letters as soon as all of them become available in Unicode as precomposed glyphs.
That's when he also told me that Dr. Bender had just passed on. Just hours before I sent him that email, I had finally decided to order a copy of Marshallese Reference Grammar (co-authored by Bender) to add to our family heirloom collection of Micronesian cultural books. I had no idea at the time I ordered it that he had just died, so it was unexpectedly startling to find out that he had.
Anyway, I informed Mr. Trussel of the known display solutions for Marshallese text using the standard orthography (ĻļM̧m̧ŅņN̄n̄O̧o̧), including some of the fonts that display these letters correctly (aside from the Latvian comma diacritic issue), and also that the Noto fonts have specifically addressed Marshallese display issues and now display all letters perfectly if the display language is set to a Marshallese language code (lang="mh", etc.), and he appreciated me telling him that.
He also said that the Unicode Consortium approval process for new precomposed glyphs has been slow. Personally (and I didn't think to tell him this at the time), I think it may be possible that approval for the precomposed letters will be rejected, if only for the reason that with today's digital devices and font rendering technology, it can now be considered purely a display issue with separate letters and combining diacritics being judged sufficient, and that even most of Unicode's established precomposed Latin letters with diacritics (ÀÁÂÃÄÅÇÈÉÊËÌÍÎÏ etc.) might not be approved if they had been submitted today for that same reason. I can sort of see the wisdom that it's not the Consortium's job as a standards agency to fix inadequate fonts and text rendering software—just to issue the standards describing how such text can be encoded in documents and rendered in display. And yet, considering places like the Marshall Islands where inadequate access to internet or the newest digital devices can be an issue, I can see why Bender, Trussel and the MOD (temporarily) "fixed" the alphabet to work on what is available. It's not always pretty, but it works.
Now as for this thread's discussion topic at hand... You're saying that it would be better to treat off-glides as context-specific vowel on-glides, like I noted for #4? I mean, I can see the wisdom, as you say, in only going with #1. But if the off-glides are treated as part of the vowel (which is already currently the case in how the module renders the phonetic IPA for some words like jo̧kleej {jakʷleyej} /tʲækʷlʲɛjɛtʲ/ [tʲɒɡʷ(wʌ)lʲɛːtʲ], "candy; chocolate"), then it's not really a consonant display issue anymore.
This makes me think of the module's logic rendering where /w/ surfaces into a [w] glide in words like Kuwajleen, also alternatively spelt Kuajleen, both of which are {kʷiwajleyen} /kʷiwætʲlʲɛjɛnʲ/ [kʷu(w)ɑzʲ(ɛ)lʲɛːnʲ]. The underlying glide already affects the vowels, and whether [w] surfaces or not actually doesn't add any distinction (and appears especially redundant directly after [u]), but the orthography evidences that the glide can and does surface anyway. So instead of treating [w] like a surfaced glide, it may be better to think of it as like the /w/'s off-glide which is actually the following vowel's on-glide, make it (much like the F2 of neighboring vowels) part of the consonant's echo rather than a direct representation of the consonant itself. ...If that makes sense. Marshallese can be strange.
I'm now increasingly seeing the wisdom in notating the non-glide consonants as primary+secondary in all cases as ([pˠ, bˠ, kʷ, ɡʷ, mˠ, ŋʷ]), etc., and not trying to pseudo-cluster them into [pw, bw, kw, ɡw, mw, ŋw], etc. If all instances of [w] are treated as part of the vowel, then the issue of how to pair them with consonants, goes away.
But that also leaves my lingering curiosity of just what makes the [w] in [pˠw] and [kʷw] so different from each other such that the former has zero direct F2 rounding influence on neighboring vowels apart from what is normally expected from unrounded velarized consonants. They almost have to be different flavors of rounding altogether (compression vs. protrusion?), which is why I toyed with using [β̞ˠ] for the former. But besides the obvious OR issues, I can't even necessarily prove whether or how they round differently, which makes it moot as a question of what will survive peer review.
I think, for the time being, I'll just go with option #1 as you say, give it a look over, and maybe try #4 instead if it hinders IPA readability at a quick glance, which something visually cluttered like [β̞ˠ] almost certainly would have done anyway. - Gilgamesh (talk) 14:03, 13 January 2020 (UTC)
There's no two ways about it—having a full [w] glyph absolutely aids IPA readability, especially where it agrees with the orthography, as in [kʷwɑlˠ(ɤ)mˠwe] instead of [kʷɑlˠ(ɤ)mˠe] for kwaļm̧we {kʷaļm̧ȩy} /kʷælˠmˠej/, "to fall prematurely (of coconuts)." The problem is that the eye is drawn to the larger letters more quickly than the superscript letters used for secondary articulations, and since every non-glide consonant or consonant cluster has a secondary articulation symbol, at first quick glance a [ʷ] might not immediately stand out compared to a [ˠ] especially if otherwise the neighboring primary vowel reflexes don't always differ. In these cases the [w] absolutely stands out and aids IPA reading with that first wave of visual information. And, again, it helps that the [w] in these circumstances is audibly present. All in all, solution #2 is still more readable than #3, which is more readable than #4, which is more readable than #1. But the graphically simpler solutions are mainly more readable from a position of established familiarity (the curse of knowledge), and solution #2 requires additional explaining that #1 and #4 don't really need quite so badly. And one of the reasons I recently started trying to overhaul IPA for these articles to begin with, was how much more difficult it was for readers unfamiliar with Marshallese to read the IPA transcriptions before. The interests of quick readability and informative accuracy need to be balanced.
I'd have more to say right now, and this certainly isn't a discussion that should be unilaterally decided when comment was specifically requested, but I'm decidedly too groggy to continue this instant. I'll give this another look later. - Gilgamesh (talk) 04:26, 14 January 2020 (UTC)
I was right. /pˠ/ and /mˠ/ do have compressed rounding. Not as a guess, but as part of the nature of the thing. /pˠ/ and /mˠ/ are themselves already described as phonetically rounded, and while /ʷ/ can be used to notate protruded rounding, /ᵝ/ (a superscript voiced bilabial fricative) can be used to notate compressed rounding, because bilabial consonants can be rounded (with velarization) but are not protruded. So it's not actually much of a leap in logic at all.
I also realized that there's another, simpler way of transcribing a compressed [w] besides [β̞ˠ]—it can also more straightforwardly be [wᵝ]: [kʷwɑlˠ(ɤ)mˠwᵝe]. I mean, [β̞ˠ] may be more accurate, given that its velarized bilabial notation harmonizes with [pˠ, bˠ, mˠ], but since the off-glides can be thought of more as a part of the following vowel reflex, [wᵝ] gets more to the heart of the matter of it being a "w"-like off-glide, and is fairly consistent with emerging conventions for notating specifically compressed-rounded vowels and semivowels in other languages, such as Japanese and Swedish. - Gilgamesh (talk) 11:50, 14 January 2020 (UTC)
Okay, I must admit. Even after all this reasoning, [kwɑlˠ(ɤ)mwe] is still very, very pleasing to the eye. But I know it's because [pw, bw, kw, ɡw, mw, ŋw] are essentially syntactic sugar, and the same notation doesn't exactly work before epenthetic vowels when parentheses are used: better [tʲɒɡʷ(ʌ)lʲɛːtʲ] or [tʲɒɡʷ(wʌ)lʲɛːtʲ] (harmonizing with [tʲɒkʷ | lʲɛːtʲ] if the syllables are enunciated in isolation), not [tʲɒɡw(ʌ)lʲɛːtʲ] or [tʲɒɡ(wʌ)lʲɛːtʲ] (because [tʲɒkw | lʲɛːtʲ] would look strange by comparison and [tʲɒk | lʲɛːtʲ] is just wrong). Let's responsibly review and discuss the options all the same. @Erutuon: Do you have any thoughts on this matter? (Starting with the beginning of this section.) - Gilgamesh (talk) 12:04, 14 January 2020 (UTC)

Enunciated mode

I've been experimenting with a new enunciated mode. Unlike previous ambitious yet misguided "careful mode" attempts, this does not attempt radical differences in vowel reflexes from the primary mode. Rather, it puts a short prosodic break at consonant clusters, shows VGV long vowel sequences as double vowels instead of geminated vowels, and expresses each prosodic fragment as if it were its own word, causing vowel reflexes to fall where they may. As such, it also does not show consonant assimilations or epenthetic vowels, but an enunciated pronunciation of Marshallese words, morpheme-by-morpheme, with vowel reflexes at least visible and thus more intuitive to the reader than a purely phonetic transcription.

Some examples, with the enunciated mode in bold:

  • Āne-jaōeōe {yanȩy-jahȩyhȩy} /jænʲej-tʲæɰejɰej/ [ænʲeːzʲɑɤe̯ɤːe̯] [ænʲe|tʲɑɤe̯|ɤe̯]
  • eakeak {yakyak} /jækjæk/ [æɡɛ̯ɑk] [ɛ̯ɑk|ɛ̯ɑk], "ghost; monster; hobgoblin"
  • iiaeae {'yiyahyahyey} /jijjæɰjæɰjɛj/ [iːɑːɛ̯ɑːɛ] [i|ɛ̯ɑ|ɛ̯ɑ|ɛ], "rainbow color"
  • jo̧uwi {jawwiy} /tʲæwwij/ [tʲɒuwi] [tʲɒ|wi], "to not be delicious, of fish"
  • Kuwajleen {kʷiwajleyen} /kʷiwætʲlʲɛjɛnʲ/ [kʷuwɑzʲ(ɛ)lʲɛːnʲ] [kʷuwɑtʲ|lʲɛɛnʲ], "Kwajalein Atoll"
  • o̧o̧jo̧j {wawajwaj} /wæwætʲwætʲ/ [ɒːzʲ(ɔ)wɑtʲ] [ɒɒtʲ|wɑtʲ], "to ride a horse"
  • ri-Ujae {ri-wijahyey} /rˠi-witʲæɰjɛj/ [rˠuːzʲɑːɛ] [rˠu‿utʲɑ|ɛ], "person from Ujae Atoll"
  • utut {witwit} /witˠwitˠ/ [wudˠ(u)wɯtˠ] [wutˠ|wutˠ], "to wear flowers; to wear a lei"

I'm still refining it. Bender and phonemic IPA modes are useful as an abstract description of its phonemes, and fully phonetic IPA shows all assimilations and epentheses in effect, but these enunciations are somewhere in between, intended more to resemble how Marshallese speakers enunciate words with care without making too many additional assumptions about how they do it. I got the idea from reading that 1945 document I recovered and showed Austronesier further up.
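
The cluster-splitting step of the enunciated mode can be sketched roughly as follows. This is an assumption-laden illustration rather than the module's code: the function name split_at_clusters and the pre-tokenized input are invented for the example, every non-vowel phoneme (glides included) is treated as a consonant, and the sketch stops at the phonemic level without computing vowel reflexes for each fragment.

```lua
-- Hypothetical sketch: break a phonemic transcription into prosodic
-- fragments wherever two non-vowel phonemes meet.
-- Phonemic vowels used in the examples above.
local VOWELS = { ["i"] = true, ["e"] = true, ["ɛ"] = true, ["æ"] = true }

-- phonemes: array of phoneme strings, e.g. { "w", "i", "tˠ", "w", "i", "tˠ" }
-- Returns an array of fragment strings. Hyphens are dropped.
local function split_at_clusters(phonemes)
	local fragments, current = {}, {}
	local prev_is_consonant = false
	for _, ph in ipairs(phonemes) do
		if ph ~= "-" then
			local is_consonant = not VOWELS[ph]
			-- Two consonants in a row: close the current fragment first.
			if is_consonant and prev_is_consonant then
				fragments[#fragments + 1] = table.concat(current)
				current = {}
			end
			current[#current + 1] = ph
			prev_is_consonant = is_consonant
		end
	end
	if #current > 0 then
		fragments[#fragments + 1] = table.concat(current)
	end
	return fragments
end
```

Fed the phonemes of /witˠwitˠ/, this yields the two fragments witˠ and witˠ, matching the break in [wutˠ|wutˠ]. A run of three or more consonants would produce a vowelless fragment, which the sketch does not try to handle.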

Also, I'm thinking that if this proves productive for Wiktionary templates, maybe the epenthetic vowel notation with parentheses can be changed to something...less visually disruptive, as standard parentheses have still proven to be something of a visual speed bump where it rises significantly above cap height when it really needn't rise far above x-height, seeing as the readability of IPA vowels is aided by the fact that none of them have lowercase ascenders whereas consonant letters may or may not have ascenders. I tried to use small parentheses on Wiktionary, but was disallowed from embedding HTML inside IPA templates. Given that IPA readability is a major concern for Marshallese, I'm strongly motivated to find the best possible option. Breves or inverted breves are still just discreet enough to be a visually viable option.

Also, @Austronesier: I recall what you said about non-syllabic vowels needing to neighbor vowels, but I also recall that this is not always the case in languages. In Ukrainian phonology, the phoneme represented by the Cyrillic letter ⟨в⟩ is pronounced [u̯] directly before another consonant, including, crucially, at the beginning of a word at the beginning of a prosodic unit when it does not neighbor any vowel at all, as in the three-syllable given name Всеволод (Vsevolod) [ˈu̯sˠɛ.ʋo.lˠodˠ]. As such, it is indeed possible and viable for semivowels to neighbor only consonants without forming a necessary extra syllable. Even what constitutes a syllable is a concept that can vary between each language's phonotactic rule set. I still can't say for absolutely certain whether the inverted breve [◌̯] is the most accurate or appropriate symbol for this, given that it was ad hoc during the time of Bender (1968), but it is not accurate to say that these sequences cannot realistically occur—just that in many languages they don't occur this way. Given other available options, I would still most prefer the inverted breve, but would still just prefer that whatever option is chosen reflect a sincere and well-reasoned consensus.

(Incidentally, in the interests of disclosure, in the past I helped expand many sections of the Ukrainian phonology article, relying on what at the time were assumed to be factually-accurate references. But Ukrainian IPA on Wikipedia may be in a future state of flux, given that papers more recently published in this scarcely-written-about linguistics topic cast doubt on some aspects of the current phonetic analysis, especially in regards to the sounds of the letters ⟨в⟩ and ⟨г⟩. That said, the first ⟨в⟩ in Всеволод being [u̯] is not one of these doubts, as the new references still support it.) - Gilgamesh (talk) 11:45, 17 January 2020 (UTC)

Additional notes: I know the prosodic dividers [◌|◌] may appear unusual, but from what I understand, this is how the IPA specifies prosodic units should be separated. I also considered using two syllable breaks in a row [◌..◌] (as in a longer syllable break), but this is rather ad hoc and I can't think of any precedent for it. - Gilgamesh (talk) 14:41, 17 January 2020 (UTC)

You know, maybe I'm overthinking the prosodic breaks. I could just use ordinary spaces:

  • Āne-jaōeōe {yanȩy-jahȩyhȩy} /jænʲej-tʲæɰejɰej/ [ænʲeːzʲɑɤe̯ɤːe̯] [ænʲe tʲɑɤe̯ ɤe̯]

- Gilgamesh (talk) 16:34, 17 January 2020 (UTC)

Marshallese Reference Grammar

This book I ordered just arrived in the mail. It is called Marshallese Reference Grammar (2016), co-written by the late Byron W. Bender, Alfred Capelle and Louise Pagotto. And from what little I've seen of this paperback thus far, it is...splendid. There is so much detail. It'll take me some time to get into most of it, and I hesitate to do anything that may damage the pages or the binding. And it feels like a shame that I can't immediately share entire references in this book with my fellow editors, nor do I have a searchable electronic edition I can read on an electronic screen. Still, I'm grateful to receive this text, and I hope it can help us refine these articles even further.

A few standout details I've gleaned so far, early in the book:

  • On page 24:

    Retroflex liquid {r}. For retroflex sounds, the oral cavity is rapidly closed and opened by curling back the tongue and trilling it against the gum ridge behind the upper teeth, as in the following words:

    rar {rar} 'dry leaves over fire'
    rōreo {rereyew} 'clean (E)'
    reeaar {rẹyyahar} 'east'

    The {r} is termed a heavy retroflex liquid—retroflex referring to the curling back of the tongue while performing the trill against the gum ridge.

    A retroflex consonant, and a trill, no less? Does this mean that, more narrowly, [ɽˠ] would be a more accurate phonetic transcription for {r}? I mean, not completely accurate, since [ɽ] is specified as a retroflex flap rather than a retroflex trill, but the digraph [ɽ͡r] is not only visually noisy, but Marshallese trills may allophonically be flaps anyway, in free variation.
  • On page 27:

    Light and heavy b and m were not generally distinguished in earlier Marshallese spelling practices, except indirectly through differences in preceding or following vowels, or by the insertion of a following w. Most writers used the letter b for both oral stops, but some used p for the light variety, especially at the end of words. The COSM decided to extend this latter practice consistently throughout words, and to make the heavier m. The committee also decided to continue inserting w after a heavy b and before the vowels i, e, and ā to maintain the usual shape of words such as bwebwe and ṃwe. It should be understood, however, that the letter w in such words does not stand for the rounded semiconsonantal phoneme {w} as in wa or awa; it simply emphasizes that the preceding labials are of the heavy variety.

    So, the ⟨w⟩ used here is not the same as the ⟨w⟩ used for /w/. I'd already suspected that. But what still isn't clear to me is whether this means ⟨bw⟩, ⟨m̧w⟩ have no off-glides at all, or whether whatever off-glides may exist are just a different kind of sound. This is an important distinction, because I had already suspected that the velarized labial phonemes, already known to be in some way rounded, have a different style of rounding than the true rounded phonemes (compressed vs. protruded, specifically). There is strong evidence this is probably true, given that using a superscript [β] accompanying symbols for rounded sounds is already a common emerging convention for compressed rounding in the phonetic transcription of Japanese and Swedish. Lacking more explicit evidence to support this for Marshallese (such as a crystal clear indication that the off-glides exist at all), I temporarily resolved to use [w] for both kinds of off-glides to make the phonetic transcriptions more readable in line with the orthography. I think I better understand what Austronesier was saying earlier about my transcription options: Option #4 is probably closer to the truth, but he preferred going with option #1. I sorta-kinda punted this decision by instead treating the off-glides as part of the vowel reflex. But this new information just reminds me that that decision is not cast in stone.

I also noticed that this printed text uses the same ḶḷṂṃṆṇÑñỌọ letters as the Marshallese-English Online Dictionary, instead of the official ĻļM̧m̧ŅņN̄n̄O̧o̧ letters. If I hadn't already specifically asked Mr. Trussel about this, this would make me further wonder whether ḶḷṂṃṆṇÑñỌọ are indeed officially replacing ĻļM̧m̧ŅņN̄n̄O̧o̧. But he clarified to me in no uncertain terms that ĻļM̧m̧ŅņN̄n̄O̧o̧ have official status and ḶḷṂṃṆṇÑñỌọ do not, and that ḶḷṂṃṆṇÑñỌọ serve as a graphical kludge until ĻļM̧m̧ŅņN̄n̄O̧o̧ become more practical to display. Given that this book seems to be printed in Times New Roman or something similar to it, the glyphs may have been selected for their existing support in the font. In the same discussion, I told Mr. Trussel about other existing fonts that currently display official Marshallese letters well, including Windows' Cambria and Google's Noto Serif, and he wasn't aware of these before I told him. I honestly don't know what to expect from the future of Marshallese in digital and print media before the official letters finally become easier to display across all devices.
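
Converting text from the workaround letters to the official ones is mechanical, since the two sets correspond one to one in the order listed above. A minimal sketch follows; the names MOD_TO_STANDARD and to_standard are made up for illustration. It assumes the input uses the precomposed dot-below and tilde letters, and it emits M̧, N̄ and O̧ as a base letter plus a combining mark (U+0327 cedilla or U+0304 macron), because no precomposed forms of those exist.

```lua
-- Hypothetical sketch: map the MOD workaround letters to the official
-- orthography. "\204\167" is U+0327 (combining cedilla) in UTF-8 and
-- "\204\132" is U+0304 (combining macron).
local MOD_TO_STANDARD = {
	{ "Ḷ", "Ļ" }, { "ḷ", "ļ" },
	{ "Ṃ", "M\204\167" }, { "ṃ", "m\204\167" },
	{ "Ṇ", "Ņ" }, { "ṇ", "ņ" },
	{ "Ñ", "N\204\132" }, { "ñ", "n\204\132" },
	{ "Ọ", "O\204\167" }, { "ọ", "o\204\167" },
}

local function to_standard(text)
	for _, pair in ipairs(MOD_TO_STANDARD) do
		-- The letters contain no Lua pattern magic characters, so a
		-- plain gsub on the UTF-8 byte sequences is sufficient.
		text = text:gsub(pair[1], pair[2])
	end
	return text
end
```

Text that stores the workaround letters in decomposed form (base letter plus combining dot below or tilde) would pass through unchanged and would need normalizing first.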

There's also a bit of sadness about this book. I ordered this book on January 12, the same day just hours before I'd heard that Dr. Bender had died eight days earlier. But I'm glad he managed to complete this text after what I'd heard were decades of delays. - Gilgamesh (talk) 23:31, 21 January 2020 (UTC)


Been reading even more of the book, and it clarified some more things for me. This time I won't transcribe sections, because it's very difficult to do with a new book without damaging it. (Now that I've bought this book, I really wish I also had an electronic copy.)

For one thing, the orthography is not necessarily the closest reflection of the pronunciation of vowels, but rather emphasizes continuity with older orthographies while trying to make spelling far more regular than in the past. The actual pronunciation of vowels is more like Choi and Willson's diphthongs when pronounced slowly, but sounds like the vowel in the middle when spoken at normal speed. Additionally, between palatalized and rounded consonant phonemes, these sounds are described as being more like triphthongs instead of diphthongs: iūu, uūi, etc. (OR alert.) I'm beginning to see why Bender always preferred to treat these back vowels as central, instead. But in reality it may be more complicated even than that:

        u
      / |
    ʉ̜   u̜
  /     |
i - ɨ - ɯ

That's six possible allophone nuclei, but it's easy to see how, as a matter of phonotactics, it's easier to reduce three of them to primary allophones: [ɨ i], [ʉ̜ ɯ] and [u̜ u]. (End OR.) What's additionally significant about this is that it goes in both directions, and it doesn't matter that much if one of the neighboring consonants is a glide. Though, as Choi notes, CVGVC sequences seem to have a different acceleration and deceleration of transition that spends more time in the nucleus of the long vowel.

I could create similar tables for the other vowel F1s, but that brings me to another issue. This book specifically describes the phoneme {e} as mid (not mid-open), and {ȩ} as high-mid. Furthermore, like Choi, it notes that {ȩ} is the odd vowel phoneme of the four, as it seems to be created through phonological processes from an interplay of {e} and {i}. In particular, the book advances the hypothesis that a variety of historical sound changes involving metathesis and F1 neutralization occurred to compensate when Oceanic morphemes lost their final vowels in Marshallese:

  • {CiCe → *CieC → CȩC}
  • {CeCi → *CeiC → CȩC}

When these morphemes take a suffix, the ancestral vowels reappear. This makes {ȩ} a form of cheshirization, and also reminds me of the Irish vowels ⟨ia⟩, ⟨ua⟩ which formed in a similar process but remained diphthongal. There are some other ways {ȩ} forms too:

  • {eGi → ȩGi}, because the former sequence is unstable and automatically assimilates. (I should program a rule for that.)
  • {CaCCaC → CaCCȩC}, which reminds me of the Biblical Hebrew law of attenuation (where [a] became [i] near another [a] in a neighboring syllable), though I'm not sure if this is quite the same principle or not.
  • {eCȩ, ȩCe → ȩCȩ}, because Marshallese words don't generally like {e} and {ȩ} coexisting in the same morpheme, and the book specifically describes this as a form of vowel harmony.
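
As a rough idea of what a rule for {eGi → ȩGi} might look like, here is a hypothetical sketch. It is not part of the module: the function name raise_e_before_glide_i is invented, the input is assumed to be already split into Bender-notation tokens, and the glides are taken to be {h}, {y} and {w}. The other changes listed above are not handled.

```lua
-- Hypothetical sketch of the {eGi → ȩGi} assimilation on a token array.
local GLIDES = { h = true, y = true, w = true }

-- tokens: array of Bender-notation segments, e.g. { "j", "e", "y", "i", "k" }
-- Returns a new array; the input is left unmodified.
local function raise_e_before_glide_i(tokens)
	local out = {}
	for i, t in ipairs(tokens) do
		-- {e} directly followed by a glide and then {i} becomes {ȩ}.
		if t == "e" and GLIDES[tokens[i + 1]] and tokens[i + 2] == "i" then
			out[i] = "ȩ"
		else
			out[i] = t
		end
	end
	return out
end
```

The made-up sequence { "j", "e", "y", "i", "k" } (not an attested word) would come back with its {e} raised to {ȩ}, while { "e", "y", "e" } is returned as is.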

Situations like that. Marshallese has also accumulated some doublets of similar but non-identical meanings, as words with {ȩ} sprout new suffixes with an additional {ȩ} as a backformation, so {CȩCȩ-} words coexist with separate {CiCe-} or {CeCi-} words.

The book explains further that the reason {e} and {ȩ} are spelt with the same spellings, is because their coexistence is unstable, and the two phonemes can no longer reliably be separated by speakers. Actual F1 of vowels can still vary from word to word, but no longer reliably more than it does from speaker to speaker and from dialect to dialect. Conservatively we can notate the phonemes differently, but the orthographic committee already judged their distinction too insignificant half a century ago, and Choi (1992) reported that his sample of native speakers already largely seemed to treat them as homophonous, and now their descent into free-variation homophony is in an advanced state of completion.

On allophonic voicing of obstruent consonants, the book does mention some being voiceless and some being voiced, but it's now not such a clear-cut distinction of voiceless initials, voiced medials and voiceless finals. Rather, the book treats initial and medial obstruents as both being voiced (but not 100%), and final consonants as voiceless.

And as for {j}, I had known from reading Choi (1992), Willson (2003), etc. that pronunciation of this phoneme could vary between speakers. What I didn't know until reading this book, is that this variation also often occurs between different words by the same speaker, or sometimes between different contextually equivalent utterances of the same word by the same speaker. And Marshallese who are bilingual in English are especially prone to pronouncing {j} differently on a word-by-word basis, especially if they have learnt to distinguish the pronunciation of the different kinds of fricative and affricate consonants found in English words. Basically, trying to come up with an algorithmically regular set of phonetic realizations for this consonant was always going to be a fool's errand. And it's complicated by another accent-sensitive concept in the language: weejej, which can best be translated as "lisping." Speakers who speak primarily with sibilant realizations of {j} are traditionally considered to be lisping, and this perception tends to increase after Marshallese learn and become proficient in English. There's lots of weejej in Marshallese society, especially among women, bilinguals and in pop music, and it's not necessarily a "bad thing" to have, but it was apparent when the MED was first compiled that there was (and perhaps still is) a certain sort of stigma attached to it. On this basis, I might change the notation of voiced {j} from [zʲ] to [dʲ], except that from what I've seen on sites like YouTube, it's not necessarily strange to teach foreigners learning Marshallese to pronounce this sound as [z].

All this and I still haven't gotten to reading about the real meat of the language's actual grammar. - Gilgamesh (talk) 04:09, 22 January 2020 (UTC)


I should probably be clear that that six-allophone triangle diagram is not from the book, nor does the book directly suggest that to be the case. It's just my own analysis (OR) of the mid-point between each of those three primary allophones. - Gilgamesh (talk) 08:02, 22 January 2020 (UTC)


Hmm...just experimenting a bit here with more OR.

        ɒ            ɔ            o            u
      / |          / |          / |          / |
    ɒ̜̈   ɒ̜        ɞ̜   ɔ̜        ɵ̜   o̜        ʉ̜   u̜
  /     |      /     |      /     |      /     |
æ - ɐ - ɑ    ɛ - ə - ʌ    e - ɘ - ɤ    i - ɨ - ɯ

Low, mid, mid-high and high, respectively. The front-unrounded-to-back-unrounded midpoint is supportable, as is the back-unrounded-to-back-rounded midpoint. The front-unrounded-to-back-rounded midpoint is less certain, given that both Bender (1968) and the Marshallese Reference Grammar (2016, hereafter abbreviated MRG) describe these as being smooth transitions of āao̧, o̧aā, eōo, oōe, iūu, uūi, with MRG being clear that this characterization holds in both directions. Given that Dr. Bender was consistent about this characterization over a span of 48 years, it's worth taking seriously. But what isn't clear is whether this does or doesn't rule out a smooth transition between the extremities, indicated as a diagonal path on the chart. I mean, I suppose the transition could occupy only the horizontal and vertical paths on the chart, but it doesn't seem entirely likely for front-back and unrounded-rounded transitions to only take place mutually exclusively from one another during a single greater transition. Of course, it's not unacceptable to use the back unrounded phonetic symbols for the midpoint of the diagonal transition, since these are, by definition, unstable midpoints between stable allophones rather than stable allophones themselves. (Would they be called "meta-allophones?" Continuing on.)

The MRG also clarifies that, between a palatalized non-glide and a velarized non-glide, and vice versa, the orthographic preference for ⟨a⟩ marking a low vowel in both combinations, and ⟨i⟩ marking a high vowel in both combinations, is an orthographic rather than phonetic distinction—as Austronesier suspected in words like dik, a surface realization of [i] isn't necessarily any more dominant than [ɯ]. Because, in truth, the midpoint is likely neither: dik is neither [rʲik] nor [rʲɯk], but really more like [rʲɨk]. Using this kind of approach could greatly simplify the currently convoluted vowel reflex lookup table, which is currently this orthographically-informed mess:

-- [f1]
local aEei = { "a", "E", "e", "i" }
local AEei = { "A", "E", "e", "i" }
local AV7i = { "A", "V", "7", "i" }
local AV7M = { "A", "V", "7", "M" }
local AV7u = { "A", "V", "7", "u" }
local AOou = { "A", "O", "o", "u" }
local QOou = { "Q", "O", "o", "u" }
-- [F2[secondaryR]][f1]
local _jv_X = { aEei, AEei, QOou }
local njv_X = { aEei, AV7i, QOou }
local hjvtX = { aEei, aEei, QOou }
local hjvkX = { AV7i, AV7i, QOou }
local _Gv_X = { AV7i, AV7M, QOou }
local rGv_X = { AEei, AV7M, QOou } -- not currently used
local hGv_X = { AV7M, AV7M, AV7M }
local _wv_X = { AV7u, AOou, QOou }
local rwv_X = { AOou, AOou, QOou }
local hwv_X = { AV7M, AOou, QOou }
local hwvtX = { AV7M, AV7M, QOou }
-- [F2[secondaryL]][F2[secondaryR]][f1]
local _Xv__ = { _jv_X, _Gv_X, _wv_X }
local nXv__ = { njv_X, _Gv_X, hwv_X }
local rXv__ = { _jv_X, _Gv_X, rwv_X }
local hXv__ = { _jv_X, hGv_X, hwv_X }
local hXvt_ = { hjvtX, hGv_X, hwvtX }
local hXvk_ = { hjvkX, hGv_X, _wv_X }
local hXvr_ = { hjvtX, hGv_X, hwv_X }
-- [primaryR][F2[secondaryL]][F2[secondaryR]][f1]
local __vX_ = {
	["p"] = _Xv__, ["t"] = _Xv__, ["k"] = _Xv__,
	["m"] = _Xv__, ["n"] = _Xv__, ["N"] = _Xv__,
	["r"] = _Xv__, ["l"] = _Xv__
}
local n_vX_ = {
	["p"] = nXv__, ["t"] = nXv__, ["k"] = nXv__,
	["m"] = nXv__, ["n"] = nXv__, ["N"] = nXv__,
	["r"] = nXv__, ["l"] = nXv__
}
local r_vX_ = {
	["p"] = rXv__, ["t"] = rXv__, ["k"] = rXv__,
	["m"] = rXv__, ["n"] = rXv__, ["N"] = rXv__,
	["r"] = rXv__, ["l"] = _Xv__
}
local h_vX_ = {
	["p"] = hXv__, ["t"] = hXvt_, ["k"] = hXvk_,
	["m"] = hXv__, ["n"] = hXv__, ["N"] = hXvk_,
	["r"] = hXvr_, ["l"] = hXv__
}
-- [primaryL][primaryR][F2[secondaryL]][F2[secondaryR]][f1]
local VOWEL_REFLEX = {
	["p"] = __vX_, ["t"] = __vX_, ["k"] = __vX_,
	["m"] = __vX_, ["n"] = n_vX_, ["N"] = n_vX_,
	["r"] = r_vX_, ["l"] = n_vX_, ["h"] = h_vX_
}

The code is fast, but hard both to read and to maintain, and this is already after attempts to make the code more readable. The algorithm could stand to be greatly simplified with the added tradeoff of using symbols for five degrees of F2 (front unrounded, central unrounded, back unrounded, back semirounded, back rounded) instead of just three, making it 20 allophone symbols in all instead of 12. With five degrees, each vowel midpoint F2 between two non-glides could be calculated quite trivially by averaging the F2 of the two consonants' secondary articulations:

-- assuming front/palatalized = 0, central = 1, back/velarized = 2, semirounded = 3, rounded = 4
(F2[secondaryL] + F2[secondaryR]) * 0.5
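
Spelled out as runnable Lua, the averaging idea might look like the sketch below. It is purely illustrative and carries the same OR caveat: the names F2, ALLOPHONES and vowel_midpoint are invented here, the 20 symbols are taken from the experimental chart and the five degrees listed above (with the back semirounded mid-high and high symbols assumed by analogy with the low and mid ones), and the height index 1 to 4 (low to high) is an assumption of this example.

```lua
-- Hypothetical sketch of the five-degree F2 averaging idea.
-- F2 degree of each consonant secondary articulation
-- (0 = front ... 4 = back rounded, as in the comment above).
local F2 = { palatalized = 0, velarized = 2, rounded = 4 }

-- ALLOPHONES[height][degree + 1]; heights run low, mid, mid-high, high.
local ALLOPHONES = {
	{ "æ", "ɐ", "ɑ", "ɒ̜", "ɒ" }, -- low
	{ "ɛ", "ə", "ʌ", "ɔ̜", "ɔ" }, -- mid
	{ "e", "ɘ", "ɤ", "o̜", "o" }, -- mid-high
	{ "i", "ɨ", "ɯ", "u̜", "u" }, -- high
}

-- left, right: "palatalized", "velarized" or "rounded"
-- height: 1 (low) to 4 (high)
local function vowel_midpoint(left, right, height)
	-- Degrees are all even, so the average is always a whole number.
	local degree = (F2[left] + F2[right]) / 2
	return ALLOPHONES[height][degree + 1]
end
```

With these assumptions, a high vowel between a palatalized and a velarized consonant comes out as ɨ, in line with the [rʲɨk] reading of dik suggested above, and the palatalized-to-rounded diagonal lands on the back unrounded column.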

But of course, all this is mainly just a nice idea. It's still OR, and I'm not sure I'd feel confident trying to defend it. - Gilgamesh (talk) 19:20, 22 January 2020 (UTC)


Is there any way my fellow collaborating editors could check out this book digitally so we could discuss it? I was thrilled to get this text, but it's difficult to review parts of it without others' eyes on it as well. - Gilgamesh (talk) 20:26, 22 January 2020 (UTC)


MRG pages 75 and 76 directly describe and thus confirm the epenthetic vowel F1 formula of max(F1[left vowel], F1[right vowel], F1[e]), at least between two non-glide consonants. Still reading as I go. - Gilgamesh (talk) 21:08, 22 January 2020 (UTC)
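
As a rough Lua sketch of that formula, for illustration only: the rank numbers below are placeholders I have assumed so that max() has something to compare, and "E" is just an ASCII stand-in for the vowel written ȩ. Neither the table nor the function name comes from the module or from the MRG.

```lua
-- Illustrative sketch only. The numeric ranks are an assumption, ordered so
-- that the larger value wins in max(), with "e" acting as the floor.
local F1 = { a = 1, e = 2, E = 3, i = 4 }  -- "E" stands in for the vowel written ȩ

-- F1 degree of an epenthetic vowel between two non-glide consonants:
-- max(F1[left vowel], F1[right vowel], F1[e]).
local function epentheticF1(leftVowel, rightVowel)
	return math.max(F1[leftVowel], F1[rightVowel], F1.e)
end
```

With these placeholder ranks, a cluster flanked by two {a} vowels still gets an epenthetic vowel at the {e} degree, because F1[e] is always one of the three arguments.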


Another interesting note, this time concerning consonant clusters and epenthetic vowels. (Can't remember the page number off-hand.) Stable consonant clusters CC are pronounced with the same rhythm as CVC sequences, with the difference that where there would be a vowel, the Marshallese cluster consonant holds and suppresses the vowel. But this pattern of pronunciation originates from an ancestral language where a vowel did exist between the two consonants. When the cluster is unstable, the vowel reliably epenthetically reappears in normal-paced speech, but is not perceived to meaningfully exist (much less as its own syllable), and disappears altogether when the word is enunciated. But as Marshallese is a mora-timed language, and not a syllable-timed language, whether or not an epenthetic syllable acoustically exists is a moot point, because speakers perceive CCV, C(V)CV and CVCV each as two morae in length. The text also makes it clear that epenthetic vowels, when present, are indeed no shorter than ordinary vowels, and are audible to foreign ears, but to the Marshallese mind hearing and analyzing speech for recognized words and meanings, the epenthetic vowels are completely filtered out and not perceived to exist, effectively making the unstable clusters...simply clusters.

Now, the MRG does not use IPA transcriptions, at least any that I've seen so far, so it can't be said that it prefers one IPA transcription over another. The book, like some of Bender's other writings, is not written for the expert linguist, but for the well-educated lay person, so it certainly succeeds at describing its concepts well, but does not do so with IPA. But the section on epenthetic vowels ("excrescent vowels") does have two separate conventions for marking epenthetic or lost ancestral vowels. For instance, the word m̧akm̧ōk, 'arrowroot,' has an ancestral form of *makĭmakĭ, where the breves indicate unstressed vowels which were syncopated out of existence. Some of these final ancestrally lost vowels do reappear as stem vowels when taking suffixes. But in the modern word, the epenthetic vowel is indicated with parentheses: m̧ak(ō)m̧ōk. (Interestingly, the MRG also has a convention of indicating individual glides, or "semiconsonants," with inserted brackets: The glide in M̧ajeļ {m̧ahjeļ} can be highlighted inline as M̧a{h}jeļ, and the epenthetic vowel inside the consonant cluster can further be highlighted in parentheses as M̧a{h}(a)jeļ. Rather elegant, I think.) But since none of these conventions are actually IPA or pretend to be IPA, it doesn't exactly inform a preferred use. Personally, I still prefer Bender's 1968 use of the inverted breve beneath a vowel, which then may have been ad hoc, but since has actually become conventional for Ukrainian, as I mentioned a few sections up in this talk page: Vsevolod is [ˈu̯sɛwolod]. While at one point it might have been considered unconventional to insert a non-syllabic vowel where there are no neighboring syllabic nuclei, this does not seem to be such a hard and fast rule anymore.

It also occurs to me, after having read about all this, that it is only really appropriate to notate epenthetic vowels parenthetically if they represent an optional presence or absence of the sound. However, their presence or absence can potentially trigger other phonetic assimilations and sandhi (such as compensatory vowel lengthening which thus far we have not been notating with parentheses: [mˠɑːzʲɛlˠ]), and it may actually be more appropriate to use separate pronunciations for each case: [mˠɑːzʲɛlˠ, mˠɑ tʲɛlˠ], instead of [mˠɑ(ɑ)zʲɛlˠ], so therefore also [æɡɤ̯dˠo, ɛ̯ɑk tˠo] instead of [æɡ(ɤ)dˠo] or [ɛ̯ɑk(ɤ)tˠo].

And, of course, that reminds me, as I just mentioned in a recent earlier comment, that we still may have to revisit initial consonant voicing ([tˠ] vs. [dˠ]), and a different system for short vowels neighboring non-glide consonants ([mˠɑːzʲɛlˠ] vs. [mˠɑːzʲəlˠ]). - Gilgamesh (talk) 22:20, 22 January 2020 (UTC)


So, some salient phonological points digested from my last few days of reading:

  • While the orthography is now phonologically more regular than it was before the new standardization in the 1960s and 1970s, many of its spelling choices explicitly reflect tradition more than phonetic reflex, and actual pronunciation does not always mesh with voice and semivowel spelling choices.
    • What this means, to me, for Wikipedia and Wiktionary, is that though in many ways the module's phonetic pronunciation algorithm is helpfully clear, it may be overly rigid in its adherence to orthographic vowel patterns.
  • The book emphasizes that many word pairs are phonologically equivalent to each other in reverse, despite their spelling patterns in reverse having significant orthographic differences, and have more or less mirrored pronunciations. Outstanding examples in section 2.3: Marshallese vowels, include jib {jib}–bwij {bij}, du {diw}–wūd {wid}, ļo̧k {ļakʷ}–kwaļ {kʷaļ} and eo̧ {yaw}–wā {way}.
    • For Wikipedia and Wiktionary, it might help if pronunciations appeared correspondingly more mirrored to reflect this.
  • All front-to-round and round-to-front vowels have a midpoint analyzed as closest to the back unrounded allophone, so that du–wūd are closer to [rʲɯw]–[wɯrʲ], and even eo̧–wā are closer to [æ̯ɑw]–[wɑæ̯]. This applies even to long diphthongs with non-identical pairings, so the number suffix -n̄oul {-gȩwil} 'ten' is more like [ŋo̜wɯlʲ] than [ŋoulʲ].
  • Phonetic reflexes that aren't reversible are fewer in number than we assumed, mostly limited to things such as consonant cluster assimilations and glide-glide consonant cluster epenthetic vowel heights.
  • The rhotic consonants are all specifically characterized and described (in terms of the tongue) as retroflex trills (equivalent to [ɽ͡r], though the book itself does not use IPA). The velarized non-obstruent coronal phonemes ⟨ļ⟩ and ⟨ņ⟩ are similarly at least postalveolar [l̠ˠ, n̠ˠ]. While these two phonemes' palatalized counterparts ⟨l⟩ and ⟨n⟩ are dental consonants [l̪ʲ, n̪ʲ], the palatalized rhotic counterpart ⟨d⟩ is both retroflex and dental as well, with both tongue curling and trilling against the front teeth. This is either an editorial oversight (the book isn't 100% free of errors such as typos), or this makes its narrow pronunciation closer to the rather painfully stretched [ɽ͡r̪ʲ].
    • In any event, it seems to me that /rʲ, rˠ, rʷ/ are still appropriate phonemic notations, as they span and are limited to the coronal consonant range. I'm inclined, for the sake of simplicity, to also choose [rʲ, rˠ, rʷ] as phonetic symbols, though if it were better to represent them all as retroflex, [ɽʲ, ɽˠ, ɽʷ] may suffice for the sake of smooth readability. Yet given that I can't even try pronouncing [ɽ͡r̪ʲ] without hurting my tongue and still failing at the articulation, [rʲ] may pose fewer question marks. (Such as, "are palatalized retroflex consonants even possible or practical anyway, let alone palatalized dually-articulated retroflex-dental trills?")
  • Marshallese has a quirk now (though not previously, to my knowledge) characterized as vowel harmony, with {e} and {ȩ} traditionally prohibited from coexisting within the same morpheme. This is enforced with an automatic reflex spanning glide consonants, where {eGi} regressively assimilates to {ȩGi}, and {eGȩ} regressively assimilates to {ȩGȩ}—this reflex also crosses affix boundaries: taktō {takteh} 'doctor' + -in {-in} (construct state suffix) = taktōin {taktȩhin} 'doctor of'. The combinations {ȩGe, ȩGa} are stated not to occur, though the book leaves it at that rather than saying that an assimilation rule enforces the more stable {eGe, eGa}. There's also an increasing newer trend of applying this vowel harmony across all consonant boundaries, not just glides, but this is not the prescribed rule.
    • For the most part these reflexes don't have to be algorithmically enforced in the module, as long as the vowel harmony is respected when inputting phoneme data.
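
If the glide-spanning harmony ever did need to be checked or applied mechanically, a minimal Lua sketch might look like the following. It is an illustration only, not part of Module:mh-pronunc: the function name is hypothetical, the glide symbols h, w and y are assumed from the bracketed phonemic spellings used above, and ȩ is assumed to be stored as the single precomposed character U+0229.

```lua
-- Illustrative sketch only; not part of Module:mh-pronunc.
local E_CEDILLA = "\200\169"  -- UTF-8 bytes of precomposed "ȩ" (U+0229)
local GLIDES = "[hwy]"        -- assumed glide symbols in the phonemic notation

-- Regressively assimilate {eGi} -> {ȩGi} and {eGȩ} -> {ȩGȩ}, repeating
-- until nothing changes so that chains of glides are handled too.
local function applyHarmony(phonemes)
	local previous
	repeat
		previous = phonemes
		phonemes = phonemes:gsub("e(" .. GLIDES .. "i)", E_CEDILLA .. "%1")
		phonemes = phonemes:gsub("e(" .. GLIDES .. E_CEDILLA .. ")", E_CEDILLA .. "%1")
	until phonemes == previous
	return phonemes
end
```

Given the concatenation {takteh} + {-in}, that is the string "taktehin", this returns {taktȩhin}. Sequences such as {eGe} and {eGa} are left untouched, and the newer trend of harmonizing across non-glide consonants is deliberately not modelled.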

That's all that springs to mind for now, as I have to leave now. And sorry for this looking more like a personal linguistics blog than a discussion page—I'm honestly still actively hoping my fellow editors will comment and this can be a community discussion. I've had a lot of time on my hands, and this has given me something to do lately. - Gilgamesh (talk) 12:29, 24 January 2020 (UTC)


Well, now we know what Marshallese stress patterns are. And though Marshallese has stress, the MRG says it's predictable, as it is largely inherited from an ancestral Micronesian language that had only CVCV syllables and always had stress on every other syllable, ending with the penultimate syllable: [CVˈCVCVˈCVCV], etc. The related nearby Gilbertese language still mostly has this syllable structure, though some consonant clusters are possible. Between the ancestral language and modern Marshallese, most of the unstressed syllables were deleted, causing true final consonants and consonant clusters to become possible: [CVˈCVCˈCVC], etc.
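
The ancestral pattern alone (stress on the penultimate syllable and on every second syllable before it) is simple enough to sketch in Lua. This is an illustration with a hypothetical function name, and it covers only the all-CV ancestral shape, not the later syllable deletions.

```lua
-- Illustrative sketch only; the function name is hypothetical.
local STRESS = "\203\136"  -- UTF-8 bytes of the IPA primary stress mark (U+02C8)

-- Build a CV template for an ancestral word of n syllables, stressing the
-- penultimate syllable and every second syllable before it.
local function ancestralStressTemplate(n)
	local parts = {}
	for i = 1, n do
		-- Syllable i is stressed when its distance from the last syllable is odd.
		if (n - i) % 2 == 1 then
			parts[#parts + 1] = STRESS
		end
		parts[#parts + 1] = "CV"
	end
	return table.concat(parts)
end

print(ancestralStressTemplate(5))  --> CVˈCVCVˈCVCV
```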

But Marshallese is still a mora-timed language, and speakers still intuitively divide articulations into morae rather than syllables, and just as epenthetic vowels are a subconscious reflex that are not necessarily perceived to exist as vowels, true consonant clusters are less a simple sequence of [C₁.C₂V], and more a sequence of [C₁C̩₁.C₂V], where the would-be (and ancestrally extant) vowel is suppressed, and yet held for the same standard duration as an unstressed vowel.

But it gets more complicated than this. In the ancestral language, if an odd number of syllables were suffixed to a word, the weak syllables became stressed and the stressed syllables became weak, reversing the stress profile of the word stem. The MRG says that this trend is still true in modern Marshallese, except that the CVCVC or CVCCVC structure of word stems has become fossilized. The book actually uses words we already discussed before here: eakto 'unload' and its suffixed transitive inflections eaktuwe and ākūtwe. eakto had an ancestral [ˈCVCVˈCVCV] structure that became [ˈCVCˈCVC] {yaktȩw}, with a stem form eaktu- {yaktiw-}. With the transitive suffix -e {-ey}, the simple inflection is eaktuwe {yaktiwey}, with the first and third vowels having stress and the second syllable being unstressed. However, the MRG explains that words like eaktuwe are considered relatively unstable because of the syncopated [ˈCVC.CVˈCVC] deviating from the ancestral stress pattern. So an alternative, more rhythmically stable form arose, ākūtwe {yakitwey}, with a stress pattern of [CVˈCVCˈCVC] made possible by metathesizing the third consonant and second vowel: {ti → it}. When epenthetic vowels are included, these two forms have identical vowel profiles, but differ in stress (something I had suspected but am now proven right about): eaktuwe [ˈæ͡ɑɡɯ̯.dɯ͡uˈwɔ͡ʌ͡ɛ] vs. ākūtwe [æ͡ɑˈɡɯdɯ̯͡u̯ˈwɔ͡ʌ͡ɛ]. (As you can see, I'm still trying to process the MRG's explanation of how vowels are pronounced. The tied short diphthongs are known to be visually confusing to Wikipedia readers.)

By this explanation, it can be predicted which vowels are stressed, as they can be counted from the end of the word: [ˈCVC, CVˈCVC, CVCˈCVC, ˈCVCVˈCVC], with the pattern syncopating and transferring to the previous syllable if a would-be stressed mora has no vowel: [ˈCVCCVˈCVC]. But this syncopated stress, as I mentioned, is considered unstable, and wants to shift to [CVˈCVCˈCVC] instead. I remember Ng (2017) mentioning something somewhere about long monophthongs of [VGV] wanting to be stressed as a unit, even if this could cause syncopation: [CVˈCVGVC], not [ˈCVCVˈGVC]. But I don't recall the MRG addressing this, so I'm not sure what to make of it. More reading and verification are needed. - Gilgamesh (talk) 06:45, 26 January 2020 (UTC)

@Gilgamesh~enwiki: I'll try to get a copy ASAP. Bender (1968) already made some observations about how phonemic deep-level /CeCi/ and /CiCe/ become {CȩC} in phonemic surface-level. Such deep level final vowels only take concrete shape if a suffix is added. A very similar analysis has been provided for Yapese where e.g. /Cæ:C/ represents deep-level /CaCi/. I'm currently working on a chapter on Palauan, but here the deep-level final vowels do not have a coloring effect on the main vowel. I'll definitely join you once I have a copy. –Austronesier (talk) 18:33, 26 January 2020 (UTC)
@Austronesier: If you can check out a copy from some library, or something similar to that, that would be fantastic. Though if you mean permanently obtaining a physical copy, it seems like a lot to ask a fellow user to buy a $30 book, unless it's something you actually really want. Regardless, any feedback and discussion you can contribute with information from the MRG would be fantastic. At the very least, the book could be seen as Dr. Bender's swan song, his decisive final achievement in informing Marshallese linguistics. I'll always hold onto my copy. - Gilgamesh (talk) 18:49, 26 January 2020 (UTC)

There really was a simpler way to approach those triangle diagrams. It's not as accurate, but it could suffice in the interests of keeping IPA simple.

        ɒ           ɔ           o           u
      / |         / |         / |         / |
    ɑ   ɑ       ʌ   ʌ       ɤ   ɤ       ɯ   ɯ
  /     |     /     |     /     |     /     |
æ - ɑ - ɑ   ɛ - ʌ - ʌ   e - ɤ - ɤ   i - ɯ - ɯ

- Gilgamesh (talk) 03:02, 27 January 2020 (UTC)