Posts Tagged ‘dictionary’

VoxForge dictionary isn’t encoded in UTF-8

Saturday, June 21st, 2008

I just downloaded the VoxForge dictionary (2.6 MB) and opened it with Notepad++. It is encoded in ANSI, not in UTF-8. That’s OK because it seems to contain only standard characters. I am guessing that the file is plain ASCII, in which case it reads the same whether it is treated as ANSI or as UTF-8. Still, I would suggest that future versions be published in UTF-8.
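The guess above is easy to check. Here is a minimal sketch in Python; the helper name `is_pure_ascii` and the sample entry are my own illustrations, not part of VoxForge. It tests whether a chunk of bytes is pure ASCII, and shows that such bytes decode to the same text as ASCII, as UTF-8, and as Windows-1252 (which is what "ANSI" usually means on a Western European Windows system).

```python
def is_pure_ascii(data: bytes) -> bool:
    """Return True if every byte in `data` is a 7-bit ASCII character."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample line in the style of a pronunciation dictionary entry.
sample = b"HELLO  HH AH0 L OW1\n"

# Pure-ASCII bytes decode to identical text under all three encodings.
as_ascii = sample.decode("ascii")
as_utf8 = sample.decode("utf-8")
as_ansi = sample.decode("cp1252")  # assumption: "ANSI" means Windows-1252 here

# A non-ASCII character encoded as Windows-1252 is not pure ASCII.
non_ascii = "café".encode("cp1252")
```

To check the real file, one could read it in binary mode, for example `is_pure_ascii(open(path, "rb").read())`, where `path` points to the downloaded dictionary.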

Switching from Arpabet to IPA

Saturday, June 21st, 2008

The CMU Pronouncing Dictionary uses the Arpabet. The Arpabet has the advantage that it is possible

“to represent phonemes with ASCII characters.”

But today, UTF-8 is becoming more and more common. In my opinion, there should be a discussion about switching from Arpabet/ASCII to IPA/UTF-8. The IPA is easier to read than the Arpabet. And UTF-8 is backward compatible with ASCII: every valid ASCII file is also a valid UTF-8 file.
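To give an idea of what such a switch could look like, here is a rough sketch in Python. The table `ARPABET_TO_IPA` is partial and only illustrative, the function name `arpabet_to_ipa` is hypothetical, and mapping unstressed `AH0` to schwa is a common convention that I am assuming here, not something the CMU dictionary itself prescribes. Stress digits are simply dropped.

```python
import re

# Partial, illustrative Arpabet-to-IPA table (assumption: not the full symbol set).
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "EH": "ɛ", "IH": "ɪ", "IY": "i",
    "OW": "oʊ", "UW": "u", "HH": "h", "K": "k", "L": "l", "M": "m",
    "N": "n", "NG": "ŋ", "S": "s", "SH": "ʃ", "T": "t", "TH": "θ", "DH": "ð",
}


def arpabet_to_ipa(phones: str) -> str:
    """Convert a space-separated Arpabet string to an IPA string.

    Stress digits (0, 1, 2) are dropped, except that unstressed AH0
    is written as schwa. Symbols missing from the table raise KeyError.
    """
    out = []
    for token in phones.split():
        match = re.fullmatch(r"([A-Z]+)([0-2]?)", token)
        if match is None:
            raise ValueError(f"not an Arpabet token: {token!r}")
        symbol, stress = match.groups()
        if symbol == "AH" and stress == "0":
            out.append("ə")  # assumed convention for unstressed AH
        else:
            out.append(ARPABET_TO_IPA[symbol])
    return "".join(out)


ipa = arpabet_to_ipa("HH AH0 L OW1")

# The Arpabet input is byte-identical in ASCII and UTF-8; the IPA output needs UTF-8.
same_bytes = "HH AH0 L OW1".encode("ascii") == "HH AH0 L OW1".encode("utf-8")
ipa_bytes = ipa.encode("utf-8")
```

The last lines illustrate the compatibility point: the Arpabet string is the same bytes in ASCII and in UTF-8, while the IPA result contains characters that ASCII cannot represent.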

