Three problems in Persian

The examples below are in the Tahoma font, which in earlier versions had a defective Persian Yeh in medial position.

Kaf (k)

 

The Arabic letter K (Arabic Letter Kaf U+0643):

stand-alone form:

 ك image:

and if you join 3 of them together, you can see the initial, medial & final forms:

 ككك image:

The Persian letter K (Arabic Letter Keheh U+06A9):

stand-alone form:

 ک image:

and if you join 3 of them together, you can see the initial, medial & final forms:

 ککک image:

(Note: Persian has another letter, گ (Arabic Letter Gaf U+06AF), which is G and looks like the Persian K with an extra bar on top.)
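Since these glyphs can be hard to tell apart on screen, it can help to check the code points directly. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is made up for illustration.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# Arabic Kaf, Persian Keheh and Persian Gaf, written as escapes so the
# source does not depend on how the editor renders the letters.
for pair in describe("\u0643\u06A9\u06AF"):
    print(pair)
```

Pasting a suspect word into `describe` shows at a glance whether it contains the Arabic or the Persian K.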

 

Yeh (y)

The Arabic letter Y (Arabic Letter Yeh U+064A):

stand-alone form:

 ي image:

and if you join 3 of them together, you can see the initial, medial & final forms:

 ييي image:

The Persian letter Y (Arabic Letter Farsi Yeh U+06CC):


stand-alone form:

 ی image:

and if you join 3 of them together, you can see the initial, medial & final forms:

 ییی image:

Note that the Persian Y does not have the 2 dots in stand-alone or final position!

However, Arabic has another letter which looks just like the Arabic Y but has no dots:

The other Arabic letter Y/A (Arabic Letter Alef Maksura U+0649):

ى image:

ککى image:

It is pictured here in final form only, after 2 K's, since it occurs ONLY in final position.
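One way to deal with these look-alikes in text that is known to be Persian is to fold the Arabic K and Y into their Persian counterparts. The sketch below is only an illustration under that assumption: the names `ARABIC_TO_PERSIAN` and `to_persian_letters` are hypothetical, and Alef Maksura (U+0649) is deliberately left untouched, because whether it should be treated as a Persian Y is a judgment call that depends on the source of the text.

```python
# Translation table from the Arabic code points to the Persian ones
# discussed above. U+0649 (Alef Maksura) is intentionally not mapped.
ARABIC_TO_PERSIAN = str.maketrans({
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
})

def to_persian_letters(text):
    """Replace Arabic Kaf and Yeh with Persian Keheh and Farsi Yeh."""
    return text.translate(ARABIC_TO_PERSIAN)

sample = "\u0643\u064A"  # Arabic Kaf followed by Arabic Yeh
print([f"U+{ord(ch):04X}" for ch in to_persian_letters(sample)])
```

Applying the same folding to both stored text and search queries keeps a word typed with Arabic letters from failing to match the same word typed with Persian ones.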

Heh+HamzaAbove (heh-ye)

The Arabic letter ae (Arabic Letter AE U+06D5):

isolated form:

 ە image:

and three in a row:

ەەە image: 

The Persian letter H (Arabic Letter Heh U+0647):

isolated form:

 ه image:

and three in a row:

ههه image:

The Persian diacritical Hamza Above (Arabic Hamza Above U+0654):

isolated form:

 ٔ   image:

and when it sits on the Heh:

هٔ image:

which is visually identical to the now "deprecated" precomposed letter:

The Persian Heh with Yeh above (Arabic Letter Heh With Yeh Above U+06C0):

ۀ image:

Quoting Roozbeh Pournader, who is in charge of the Farsiweb project, the official Iranian government-appointed body charged with submitting the draft of the Unicode Standard for Persian to the Unicode Consortium:

...That certain machine which contains some Unicode compliant software,
may decide to apply Unicode Normalization Forms converters to its input
data, which as Unicode says, is a completely Unicode-compliant thing to
do, and as W3C says, is *required* in some cases. These Normalization
Forms are specified in the Unicode Standard Annex UAX#15, at:

http://www.unicode.org/reports/tr15/

(All Unicode-compliant applications are asked then to treat any
normalization form of each certain string the same way.)

Now, if you have a string that contains the string U+06C0, and then one
converts it to the Normalization Form C, it will become the two-character
sequence <AE, HAMZA ABOVE>. That certain sequence is specified in the
Unicode data files, for example the one at:

http://www.unicode.org/Public/UNIDATA/UnicodeData.txt

which contains a line like:

06C0;ARABIC LETTER HEH WITH YEH ABOVE;Lo;0;AL;06D5 0654;;;;N;ARABIC LETTER HAMZAH ON HA;;;;

That line mentions "06D5 0654" which is of course <AE, HAMZA ABOVE>.

----

If I remember correctly, MS Typography's rebuttal to this was, "Ok, show me a situation where Heh+HamzaAbove will occur in anything other than word-final position where it would matter." This they could not do because, in fact, Heh+HamzaAbove only occurs naturally in word-final position.
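The decomposition described in the quoted message can be checked with Python's standard `unicodedata` module. This is a minimal sketch, not a statement about how any particular renderer behaves; the constant and helper names are made up for illustration. It treats two strings as canonically equivalent when their fully decomposed (NFD) forms match, and shows that U+06C0 is equivalent to AE + Hamza Above but not to Heh + Hamza Above. In this sketch the two-character sequence is obtained with the decomposed form NFD.

```python
import unicodedata

HEH_YEH_ABOVE = "\u06C0"    # ARABIC LETTER HEH WITH YEH ABOVE
AE_HAMZA = "\u06D5\u0654"   # ARABIC LETTER AE + ARABIC HAMZA ABOVE
HEH_HAMZA = "\u0647\u0654"  # ARABIC LETTER HEH + ARABIC HAMZA ABOVE

def code_points(text):
    """Format a string as space-separated hexadecimal code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)

def canonically_equivalent(a, b):
    """Compare two strings after full canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# Decomposition of the precomposed letter, as listed in UnicodeData.txt.
print(code_points(unicodedata.normalize("NFD", HEH_YEH_ABOVE)))

# The precomposed letter compared with each two-character sequence.
print(canonically_equivalent(HEH_YEH_ABOVE, AE_HAMZA))
print(canonically_equivalent(HEH_YEH_ABOVE, HEH_HAMZA))
```

The last comparison is the crux of the problem: Heh + Hamza Above looks the same as U+06C0 in final position, yet normalization never turns one into the other.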