I have some hypotheses for English graphotactics:
1. 〈w〉 and 〈y〉 are optional positional variants (i.e. allographs) of 〈u〉 and 〈i〉, respectively, in digraphs that correspond to diphthongs or vowels: 〈aw〉 ≈ 〈au〉, 〈ew〉 ≈ 〈eu〉, 〈ow〉 ≈ 〈ou〉; 〈ay〉 ≈ 〈ai〉, 〈ey〉 ≈ 〈ei〉, 〈oy〉 ≈ 〈oi〉, 〈uy〉 ≈ 〈ui〉. They are the preferred allographs at the end of morphemes.
2. 〈y〉 is a required positional variant of 〈i〉 at the end of native words, although the digraph 〈ie〉 is a possible alternative.
3. Final 〈y〉 in a stem is replaced by 〈i〉 when an inflectional suffix follows, unless it is part of a digraph: fly > flies/*flys/*flis but boy > boys/*boies/*bois.
4. The apostrophe 〈’〉 is used to visually separate the possessive suffix 〈s〉 from proper names – i.e. words with an initial capital – to ensure that #3 does not apply, so names have a constant representation.
5. #4 is not necessary for pronouns, hence 〈its〉, 〈hers〉, 〈his〉 instead of *〈it’s〉, *〈her’s〉, *〈he’s〉. #4 is applied to other nouns as well, though.
6. In vowel digraphs, round-top letters 〈a〉, 〈e〉 and 〈o〉 are preferred for first/left position whereas flat-top letters 〈i〉/〈y〉 and 〈u〉/〈w〉 are preferred for second/right position.
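To make #3 concrete, here is a minimal Python sketch of the rule as I understand it. The function name `inflect_s` and the test "the letter before final 〈y〉 is a vowel letter" as a stand-in for "〈y〉 is part of a digraph" are my own simplifying assumptions, not something taken from a published analysis.

```python
# Hypothetical sketch of hypothesis #3: final <y> becomes <i> (here: <ie>
# before -s) unless the <y> is the second letter of a vowel digraph.
VOWEL_LETTERS = set("aeiou")


def inflect_s(stem: str) -> str:
    """Attach the inflectional suffix -s to a lowercase stem."""
    if stem.endswith("y"):
        # Assumption: a vowel letter directly before <y> means the <y>
        # belongs to a digraph such as <ay>, <ey>, <oy> or <uy>.
        in_digraph = len(stem) >= 2 and stem[-2] in VOWEL_LETTERS
        if not in_digraph:
            return stem[:-1] + "ies"
    return stem + "s"


for word in ("fly", "boy", "day", "key"):
    print(word, ">", inflect_s(word))
```

The vowel-letter test is deliberately crude: it would misfire on a word like soliloquy, where 〈u〉 belongs to the consonant digraph 〈qu〉 rather than to a vowel digraph with 〈y〉.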
Are there any graphemic analyses of English that support these observations, especially #4?
I’m only aware of a bachelor’s thesis in German by Marlene Franke from 2008, which isn’t available online. It’s likely based on theories and work by Fuhrhop/Buchmann (e.g. 2011: The length hierarchy and the graphematic syllable, DOI: 10.1075/wll.14.2.05fuh) and Primus (e.g. the foundational 2004 paper A featural analysis of the Modern Roman Alphabet), who support #6 at least.
Note that #4 is (usually) not extended to the only other possible suffix which is also an 〈s〉, i.e. the plural marker: all the Jennys and Billys.
The apostrophe is used to indicate something is missing.
So in don’t the letter o is missing.
The usage is still the same for possessives: James’ indicates that the -es is missing from the end of the word, Jameses being the correct pronunciation in this case. Similarly with Fred’s, except we no longer say Fredes. This happens because Old English used the Germanic suffix -es to indicate possession.
The situation gets more complex for plural entities. English uses the suffix -s to indicate plurality (often attributed to French influence, although it also continues the Old English plural ending -as). There is an obvious clash with the -es possessive suffix, made even more painful by the fact that the possessive suffix is now normally rendered as a plain -s.
- The boy’s home — the boy is home
- The boy’s home — the home belonging to this boy
- The boys’ home — the home belonging to those boys
- Or even The boys’ home account — the account belonging to an institution housing boys
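The placement of the apostrophe in these examples can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `possessive` is made up, and it follows the convention used above, where a word already ending in -s (James, boys) takes a bare apostrophe. Many style guides write James’s instead.

```python
# Hypothetical sketch: where the possessive apostrophe goes, following the
# convention in this answer (bare apostrophe after a final -s).
APOSTROPHE = "\u2019"  # the typographic apostrophe used in the examples


def possessive(noun: str) -> str:
    """Return the written possessive form of a singular or plural noun."""
    if noun.endswith("s"):
        # The word already ends in -s: mark the dropped suffix with the
        # apostrophe alone, as in James’ or boys’.
        return noun + APOSTROPHE
    return noun + APOSTROPHE + "s"


for noun in ("boy", "boys", "James", "Fred"):
    print(noun, ">", possessive(noun))
```

The sketch only covers spelling. It cannot tell the possessive boy’s from the contraction of "boy is", which is written identically.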