<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>speech recognition software</title>
	<atom:link href="http://speech.blau.in/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://speech.blau.in</link>
	<description>- free and open source -</description>
	<lastBuildDate>Wed, 02 May 2012 20:35:04 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Schott&#8217;s General American dictionary 0.2</title>
		<link>http://speech.blau.in/?p=99</link>
		<comments>http://speech.blau.in/?p=99#comments</comments>
		<pubDate>Tue, 01 May 2012 16:46:50 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[speech recognition]]></category>
		<category><![CDATA[eSpeak]]></category>
		<category><![CDATA[Linux Mint]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=99</guid>
		<description><![CDATA[About two years ago, I published Ralf&#8217;s General American dictionary version 0.1.1. I decided to develop the next version 0.2 of this dictionary from scratch. The dictionary gets a new name: Schott's General American dictionary instead of Ralf's General American dictionary. This article explains the creation of the dictionary: 1. Get an American English spelling [...]]]></description>
			<content:encoded><![CDATA[<p>About two years ago, I published <a href="http://spirit.blau.in/simon/2010/05/02/convert-voxforgedict-into-plsipa-format/#comment-304">Ralf&#8217;s General American dictionary version 0.1.1</a>. I decided to develop the next version 0.2 of this dictionary from scratch. The dictionary gets a new name: <code>Schott's General American dictionary</code> instead of <code>Ralf's General American dictionary</code>. This article explains the creation of the dictionary:</p>
<p>1. <a href="http://extensions.libreoffice.org/extension-center/american-british-canadian-spelling-hyphen-thesaurus-dictionaries">Get</a> an American English spelling dictionary with <a href="http://libreoffice-na.us/English-3.4-installs/add-on-dictionaries-large-list/kpp-american-english-dictionary-390K-word-list.oxt">390,000 words</a>.</p>
<p>2. License is GPL version 2.<br />
3. Encoding of the files en_US.dic and en_US.aff is UTF-8.<br />
4. Linux Mint terminal:</p>
<blockquote><p>cd /home/ubuntu/Documents/american-english<br />
unmunch en_US.dic en_US.aff &gt; american-wordlist</p></blockquote>
<p>5. Add speak tags at the beginning and the end of american-wordlist.<br />
6. Linux Mint terminal:</p>
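<blockquote><p><code>saxonb-xslt -ext:on -s:american-wordlist -xsl:'http://spirit.blau.in/simon/files/2010/04/create-audio-elements.xsl' -o:american-speak-audio</code><br />
<code>espeak -f american-speak-audio -m -v en-us -q -x --phonout="american-espeak"</code></p></blockquote>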
<p>7. Add <code>&lt;lexicon&gt;</code> tags to the file <code>american-espeak</code> (<code>&lt;lexicon&gt;</code> at the beginning of the file; <code>&lt;/lexicon&gt;</code> at the end of the file).<br />
8. Linux Mint terminal:</p>
<blockquote><p><code>saxonb-xslt -ext:on -s:american-espeak -xsl:'http://spirit.blau.in/simon/files/2010/04/replace-newline-newline-space-by-phoneme-element.xsl' -o:american-phoneme-elements</code><br />
<code>mkdir espeak</code><br />
<code>paste american-speak-audio american-phoneme-elements > espeak/general-american-dictionary.xml</code></p></blockquote>
<p>9. <a href="http://script.blau.in/espeak/general-american-dictionary.xml.bz2" title="version 0.2">Download the dictionary (eSpeak edition)</a>.</p>
<p>10. I am planning to release an IPA version of this dictionary.</p>
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&#038;p=99</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Ralf&#8217;s Canadian English dictionary 0.1</title>
		<link>http://speech.blau.in/?p=87</link>
		<comments>http://speech.blau.in/?p=87#comments</comments>
		<pubDate>Sat, 28 Apr 2012 14:29:37 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[speech recognition]]></category>
		<category><![CDATA[canadian]]></category>
		<category><![CDATA[dictionary]]></category>
		<category><![CDATA[eSpeak]]></category>
		<category><![CDATA[Linux Mint]]></category>
		<category><![CDATA[PLS]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=87</guid>
		<description><![CDATA[This article explains the creation of Ralf&#8217;s Canadian English dictionary version 0.1. 1. Get a Canadian spelling dictionary with 390,000 words. 2. License is GPL. 3. Encoding of the files en_CA.dic and en_CA.aff is UTF-8. 4. Linux Mint terminal: cd /home/ubuntu/Documents/canadian-english unmunch en_CA.dic en_CA.aff &#62; canadian-wordlist 5. Add speak tags at the beginning and at the [...]]]></description>
			<content:encoded><![CDATA[<p>This article explains the creation of Ralf&#8217;s Canadian English dictionary version 0.1.</p>
<p>1. <a href="http://extensions.libreoffice.org/extension-center/american-british-canadian-spelling-hyphen-thesaurus-dictionaries">Get</a> a <a href="http://libreoffice-na.us/English-3.4-installs/add-on-dictionaries-large-list/kpp-canadian-english-dictionary-390k-word-list.oxt">Canadian spelling dictionary with 390,000 words</a>.<br />
2. License is GPL.<br />
3. Encoding of the files en_CA.dic and en_CA.aff is UTF-8.<br />
4. Linux Mint terminal:</p>
<blockquote><p>cd /home/ubuntu/Documents/canadian-english<br />
unmunch en_CA.dic en_CA.aff &gt; canadian-wordlist</p></blockquote>
<p>5. Add <code>&lt;speak&gt;</code> tags at the beginning and at the end of canadian-wordlist.<br />
6. Linux Mint terminal:</p>
<blockquote><p><code>saxonb-xslt -ext:on -s:canadian-wordlist -xsl:'http://spirit.blau.in/simon/files/2010/04/create-audio-elements.xsl' -o:canadian-speak-audio</code><br />
<code>espeak -f canadian-speak-audio -m -v en -q -x --phonout="canadian-espeak"</code></p></blockquote>
<p>7. Add <code>&lt;lexicon&gt;</code> tags to the file <code>canadian-espeak</code> (<code>&lt;lexicon&gt;</code> at the beginning of the file; <code>&lt;/lexicon&gt;</code> at the end of the file).</p>
<p>8. Create <code>&lt;phoneme&gt;</code> elements:</p>
<blockquote><p><code>saxonb-xslt -ext:on -s:canadian-espeak -xsl:'http://spirit.blau.in/simon/files/2010/04/replace-newline-newline-space-by-phoneme-element.xsl' -o:canadian-phoneme-elements</code></p></blockquote>
<p>9. Combine <code>&lt;grapheme&gt;</code> and <code>&lt;phoneme&gt;</code> elements:</p>
<blockquote><p>mkdir espeak<br />
paste canadian-speak-audio canadian-phoneme-elements > espeak/canadian-english-dictionary.xml</p></blockquote>
<p>10. <a href="http://script.blau.in/espeak/canadian-english-dictionary.xml.bz2" title="version 0.1">Download the dictionary (eSpeak edition)</a>.</p>
<p>I am planning to create an <a href="http://spirit.blau.in/simon/tag/ipa/">IPA</a> version of this dictionary.</p>
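<p>Step 9 relies on <code>paste</code>, which joins the two files line by line with a tab and treats a file that ends early as if it had empty lines. A minimal Python sketch of the same pairing, with an added length check, could look like this. The function name <code>pair_lines</code> and the sample values are made up for illustration; this is not the tool that produced the published dictionary.</p>

```python
def pair_lines(left_lines, right_lines):
    """Join corresponding lines with a tab, like paste, but refuse unequal lengths."""
    if len(left_lines) != len(right_lines):
        raise ValueError(
            f"line counts differ: {len(left_lines)} vs {len(right_lines)}"
        )
    return [f"{left}\t{right}" for left, right in zip(left_lines, right_lines)]


if __name__ == "__main__":
    # Placeholder values standing in for the lines of canadian-speak-audio
    # and canadian-phoneme-elements (hypothetical sample data).
    graphemes = ["first-grapheme", "second-grapheme"]
    phonemes = ["first-phoneme", "second-phoneme"]
    for line in pair_lines(graphemes, phonemes):
        print(line)
```

<p>If the two files ever get out of step, this version stops with an error instead of writing entries whose grapheme and phoneme do not belong together.</p>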
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&#038;p=87</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ralf&#8217;s British English dictionary 0.1</title>
		<link>http://speech.blau.in/?p=75</link>
		<comments>http://speech.blau.in/?p=75#comments</comments>
		<pubDate>Sat, 28 Apr 2012 13:00:45 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[speech recognition]]></category>
		<category><![CDATA[british]]></category>
		<category><![CDATA[dictionary]]></category>
		<category><![CDATA[english]]></category>
		<category><![CDATA[eSpeak]]></category>
		<category><![CDATA[Linux Mint]]></category>
		<category><![CDATA[PLS]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=75</guid>
		<description><![CDATA[This article explains the creation of a British English pronunciation dictionary. 1. Download the 390,000-word version of the dictionary. 2. License is GPL. 3. Linux Mint terminal: cd /home/ubuntu/Documents/british 4. Now I install Geany because I want to check the encoding of the files en_GB.aff and en_GB.dic: sudo apt-get install geany The encoding of [...]]]></description>
			<content:encoded><![CDATA[<p>This article explains the creation of a British English pronunciation dictionary.</p>
<p>1. <a href="http://extensions.libreoffice.org/extension-center/american-british-canadian-spelling-hyphen-thesaurus-dictionaries">Download</a> the <a href="http://libreoffice-na.us/English-3.4-installs/add-on-dictionaries-large-list/kpp-british-english-dictionary-390K-word-list.oxt">390,000-word version</a> of the dictionary.</p>
<p>2. License is GPL.<br />
3. Linux Mint terminal:</p>
<blockquote><p>cd /home/ubuntu/Documents/british</p></blockquote>
<p>4. Now I install Geany because I want to check the encoding of the files en_GB.aff and en_GB.dic: <code>sudo apt-get install geany</code><br />
The encoding of both files is UTF-8.</p>
<p>5. <a href="http://spirit.blau.in/simon/tag/linux-mint/">Linux Mint</a> terminal:</p>
<blockquote><p>sudo apt-get install hunspell-tools<br />
cd /home/ubuntu/Documents/british<br />
unmunch en_GB.dic en_GB.aff &gt; british-wordlist<br />
sudo apt-get install espeak</p></blockquote>
<p>I need to know which <a href="http://espeak.sourceforge.net/voices.html">voice</a> I should use.<br />
Linux Mint terminal:</p>
<blockquote><p><code>espeak --voices</code></p></blockquote>
<p>I will use en-uk. What is the proper command? I <a href="http://identi.ca/notice/14876900">generated US English phonemes</a> earlier with this command:</p>
<blockquote><p><code>espeak -f english-grapheme -m -v en-us -q -x --phonout="english-espeak"</code></p></blockquote>
<p>I will have to mark up the dictionary file with <a href="http://www.w3.org/TR/speech-synthesis/#S3.3.1">speak and audio tags</a>.</p>
<p>6. Now I install saxonb-xslt with the following command:</p>
<blockquote><p>sudo apt-get install libsaxonb-java</p></blockquote>
<p>7. Add speak tags at the beginning and at the end of the file british-wordlist.<br />
8. Linux Mint terminal:</p>
<blockquote><p><code>saxonb-xslt -ext:on -s:british-wordlist -xsl:'http://spirit.blau.in/simon/files/2010/04/create-audio-elements.xsl' -o:british-speak-audio</code><br />
<code>espeak -f british-speak-audio -m -v en-uk -q -x --phonout="british-espeak"</code></p></blockquote>
<p>9. Add <code>&lt;lexicon&gt;</code> tags to the file <code>british-espeak</code> (<code>&lt;lexicon&gt;</code> at the beginning of the file; <code>&lt;/lexicon&gt;</code> at the end of the file).</p>
<p>10. Create phoneme elements (compare with <a href="http://spirit.blau.in/simon/2010/04/30/ralfs-occitan-dictionary/">this</a> article):</p>
<blockquote><p><code>saxonb-xslt -ext:on -s:british-espeak -xsl:'http://spirit.blau.in/simon/files/2010/04/replace-newline-newline-space-by-phoneme-element.xsl' -o:british-phoneme-elements</code></p></blockquote>
<p>11. Combine grapheme elements with phoneme elements:</p>
<blockquote><p>paste british-speak-audio british-phoneme-elements > british-dictionary-espeak.xml</p></blockquote>
<p>12. <a href="http://script.blau.in/espeak/british-english-dictionary.xml.bz2" title="Version 0.1">Download the dictionary (eSpeak edition).</a></p>
<p>I am planning to create an IPA version of this PLS dictionary.</p>
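<p>The encoding check in step 4 can also be done without an editor. The following Python sketch only tests whether a sequence of bytes decodes as UTF-8. The function name <code>is_utf8</code> is an assumption for illustration. Note that a file containing only ASCII characters passes this test as well.</p>

```python
def is_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    # The same word encoded in two different ways.
    print(is_utf8("café".encode("utf-8")))    # valid UTF-8 bytes
    print(is_utf8("café".encode("latin-1")))  # a lone 0xE9 byte is not valid UTF-8
```

<p>To check a real file, read it in binary mode and pass the bytes to the function, for example the contents of en_GB.dic or en_GB.aff.</p>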
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&#038;p=75</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Compare graphemes of two dictionaries</title>
		<link>http://speech.blau.in/?p=71</link>
		<comments>http://speech.blau.in/?p=71#comments</comments>
		<pubDate>Mon, 05 Dec 2011 13:31:17 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[speech recognition]]></category>
		<category><![CDATA[php]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=71</guid>
		<description><![CDATA[Here is the concept of a script that can compare two dictionaries with each other. The first dictionary uses grapheme elements which are in upper case letters, the second dictionary distinguishes between upper case and lower case. If there is a corresponding entry in the second dictionary, the entry of the first dictionary will be [...]]]></description>
			<content:encoded><![CDATA[<p>Here is the concept of a script that can compare two dictionaries with each other. The first dictionary uses grapheme elements which are in upper case letters, the second dictionary distinguishes between upper case and lower case. If there is a corresponding entry in the second dictionary, the entry of the first dictionary will be set to lower case in the resulting output tree. Here is the script:</p>
<p>&lt;?php<br />
// Compare the grapheme elements of two dictionaries.<br />
// Both files must exist before they are loaded.</p>
<p>if (file_exists('general-american-dictionary.xml') &amp;&amp; file_exists('english-dictionary.xml')) {<br />
$xml = simplexml_load_file('general-american-dictionary.xml');<br />
$english = simplexml_load_file('english-dictionary.xml');</p>
<p>foreach ($xml-&gt;lexeme as $lexeme) {<br />
// Cast to string so that text is compared, not SimpleXMLElement objects.<br />
$grapheme = (string) $lexeme-&gt;grapheme;<br />
foreach ($english-&gt;lexeme as $lexemeenglish) {<br />
$graphemeenglish = (string) $lexemeenglish-&gt;grapheme;<br />
if ($grapheme === strtoupper($graphemeenglish)) {<br />
// Take over the spelling of the second dictionary and stop searching.<br />
$grapheme = $graphemeenglish;<br />
break;<br />
}<br />
}<br />
echo $grapheme, '    ', $lexeme-&gt;phoneme, PHP_EOL;<br />
}<br />
} else {<br />
exit('Failed to open the dictionary files.');<br />
}<br />
?&gt;</p>
<p>Of course, this script is not yet finished. It runs very slowly because it compares every entry of the first dictionary with every entry of the second, but never mind.</p>
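<p>One way to make the comparison faster is to build a lookup table from the second dictionary once and then look up each upper-case grapheme in it. The following Python sketch shows the idea on plain lists and assumes that the lexeme entries have already been read from the XML files. The function name <code>restore_case</code> and the sample entries are hypothetical.</p>

```python
def restore_case(upper_entries, cased_graphemes):
    """Replace upper-case graphemes with their cased spelling where one is known.

    upper_entries: list of (grapheme, phoneme) pairs with upper-case graphemes.
    cased_graphemes: graphemes from the second, case-sensitive dictionary.
    """
    # Build the lookup table once. setdefault keeps the first spelling
    # seen for each upper-case key, so the first match wins.
    by_upper = {}
    for grapheme in cased_graphemes:
        by_upper.setdefault(grapheme.upper(), grapheme)
    # Entries without a counterpart keep their upper-case grapheme.
    return [(by_upper.get(g, g), p) for g, p in upper_entries]


if __name__ == "__main__":
    # Hypothetical sample data; the phoneme strings are placeholders.
    first = [("HELLO", "phoneme-1"), ("PARIS", "phoneme-2"), ("ZZZ", "phoneme-3")]
    second = ["hello", "Paris", "paris"]
    for grapheme, phoneme in restore_case(first, second):
        print(grapheme, phoneme, sep="    ")
```

<p>With this approach each dictionary is traversed only once, so the running time grows with the sum of the two dictionary sizes instead of their product.</p>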
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&#038;p=71</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What are amplitude, frequency, wavelength?</title>
		<link>http://speech.blau.in/?p=67</link>
		<comments>http://speech.blau.in/?p=67#comments</comments>
		<pubDate>Sun, 30 Aug 2009 18:50:14 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[speech recognition]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=67</guid>
		<description><![CDATA[I think that 16 kHz / 16 bit recordings should be sufficient for the development of a speech model. But what does that mean? A good article explains the differences between amplitude (16 bit recordings are more precise than 8 bit recordings), and frequency, wavelength (the human ear can distinguish up to 20 kHz; you [...]]]></description>
			<content:encoded><![CDATA[<p>I think that 16 kHz / 16 bit recordings should be sufficient for the development of a speech model. But what does that mean? A <a href="http://www.makepages.com/membersonly/sound.html">good article</a> explains amplitude (16 bit recordings are more precise than 8 bit recordings), frequency, and wavelength. The human ear can distinguish frequencies up to about 20 kHz, and a recording needs a sampling rate of at least twice the highest frequency it should capture. A 16 kHz recording therefore captures frequencies up to 8 kHz, which should be sufficient for speech.</p>
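<p>The arithmetic behind these numbers is small enough to write down. The following Python sketch uses two helper functions whose names are made up for illustration: one halves the sampling rate to get the highest frequency that can be represented, the other counts the amplitude values that a given bit depth can distinguish.</p>

```python
def highest_frequency_hz(sample_rate_hz):
    """Highest frequency that a recording at this sampling rate can represent."""
    return sample_rate_hz / 2


def amplitude_levels(bit_depth):
    """Number of distinct amplitude values per sample."""
    return 2 ** bit_depth


if __name__ == "__main__":
    print(highest_frequency_hz(16000))  # 8000.0 Hz for a 16 kHz recording
    print(amplitude_levels(8))          # 256 levels for 8 bit
    print(amplitude_levels(16))         # 65536 levels for 16 bit
    print(2 * 20000)                    # sampling rate needed to cover 20 kHz
```

<p>So a 16 bit recording distinguishes 256 times as many amplitude values as an 8 bit recording, and covering the full 20 kHz range of human hearing would need a sampling rate of at least 40 kHz.</p>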
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&#038;p=67</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Replacing HTK by Sphinx?</title>
		<link>http://speech.blau.in/?p=58</link>
		<comments>http://speech.blau.in/?p=58#comments</comments>
		<pubDate>Fri, 28 Aug 2009 22:45:07 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[speech recognition]]></category>
		<category><![CDATA[modelcompilationmanager.cpp]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=58</guid>
		<description><![CDATA[You need to install HTK if you want to run simon with the whole functionality. HTK is not included; it has to be downloaded from a different source (registration required). From my point of view, this information could point people who are familiar with Sphinx and Qt in the right direction: &#8220;simon uses the NON-FREE [...]]]></description>
			<content:encoded><![CDATA[<p>You need to install HTK if you want to run simon with the whole functionality. HTK is not included; it has to be downloaded from a different source (registration required). From my point of view, <a href="http://dot.kde.org/2009/08/22/simon-speech-activated-user-interface-kde">this information</a> could point people who are familiar with Sphinx and Qt in the right direction:</p>
<blockquote><p>&#8220;simon uses the NON-FREE HTK for that. Only _one_ class in simon comes into contact with the HTK. The model compilation manager. This class: <a href="http://speech2text.svn.sourceforge.net/viewvc/speech2text/trunk/simonlib/speechmodelcompilation/modelcompilationmanager.cpp?revision=891&amp;view=markup">http://speech2text.svn.sourceforge.net/viewvc/speech2text/trunk/simonlib&#8230;.</a> Those 1200 lines (including other, julius related stuff) are everything that links simon to the HTK. The class could very, very easily be replaced with one that uses something else.&#8221;</p></blockquote>
<p>simon should continue to make use of HTK because there are <a href="http://www.joelonsoftware.com/articles/fog0000000069.html">things you should never do</a>: </p>
<blockquote><p>&#8220;They did it by making the single worst strategic mistake that any software company can make:</p>
<p>They decided to rewrite the code from scratch.&#8221;</p></blockquote>
<p>But maybe there is someone out there who would want to start a fork of simon and replace HTK with <a href="http://en.wikipedia.org/wiki/CMU_Sphinx">Sphinx</a>? Of course, this would be a completely different project.</p>
<p>I think that Sphinx could use a GUI.</p>
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&#038;p=58</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Sound frames &#8211; a, e, i, o, u, b(e)</title>
		<link>http://speech.blau.in/?p=46</link>
		<comments>http://speech.blau.in/?p=46#comments</comments>
		<pubDate>Sat, 18 Jul 2009 10:49:11 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[speech recognition]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=46</guid>
		<description><![CDATA[Let&#8217;s take a look at a few sound frames (click picture to enlarge): U+0061 (a), U+02D0 The sounds in this article correspond to the German pronunciation. U+0065 (e), U+02D0 U+0069 (i), U+02D0 U+006F (o), U+02D0 U+0075 (u), U+02D0 First part: U+0062 (b) &#8211; second part: U+0065 (e), U+02D0]]></description>
			<content:encoded><![CDATA[<p>Let&#8217;s take a look at a few sound frames (click picture to enlarge):</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/07/sound-a.png"><img class="alignnone size-medium wp-image-45" title="sound-a" src="http://speech.blau.in/wp-content/uploads/2009/07/sound-a-300x103.png" alt="sound-a" width="300" height="103" /></a></p>
<p>U+0061 (a), U+02D0<br />
The sounds in this article correspond to the <a href="http://de.wiktionary.org/wiki/Wiktionary:Lautschrift#Vokale">German pronunciation</a>.</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/07/sound-e.png"><img class="alignnone size-medium wp-image-49" title="sound-e" src="http://speech.blau.in/wp-content/uploads/2009/07/sound-e-300x131.png" alt="sound-e" width="300" height="131" /></a><br />
U+0065 (e), U+02D0</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/07/sound-i.png"><img class="alignnone size-medium wp-image-50" title="sound-i" src="http://speech.blau.in/wp-content/uploads/2009/07/sound-i-300x129.png" alt="sound-i" width="300" height="129" /></a><br />
U+0069 (i), U+02D0</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/07/sound-o.png"><img class="alignnone size-medium wp-image-51" title="sound-o" src="http://speech.blau.in/wp-content/uploads/2009/07/sound-o-300x131.png" alt="sound-o" width="300" height="131" /></a><br />
U+006F (o), U+02D0</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/07/sound-u.png"><img class="alignnone size-medium wp-image-52" title="sound-u" src="http://speech.blau.in/wp-content/uploads/2009/07/sound-u-300x118.png" alt="sound-u" width="300" height="118" /></a><br />
U+0075 (u), U+02D0</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/07/sound-be.png"><img class="alignnone size-medium wp-image-53" title="sound-be" src="http://speech.blau.in/wp-content/uploads/2009/07/sound-be-300x87.png" alt="sound-be" width="300" height="87" /></a><br />
First part: U+0062 (b) &#8211; second part: U+0065 (e), U+02D0</p>
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&#038;p=46</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Julius package for Ubuntu</title>
		<link>http://speech.blau.in/?p=43</link>
		<comments>http://speech.blau.in/?p=43#comments</comments>
		<pubDate>Thu, 18 Jun 2009 17:49:13 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[speech recognition]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=43</guid>
		<description><![CDATA[Soon, there should be an updated Julius package for Ubuntu (4.0.2 -&#62; 4.1.2).]]></description>
			<content:encoded><![CDATA[<p>Soon, there should be an <a href="http://www.voxforge.org/home/forums/message-boards/speech-recognition-engines/julius-4.1.2">updated Julius package for Ubuntu</a> (4.0.2 -&gt; 4.1.2).</p>
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&#038;p=43</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Characteristics of the sound &#8220;a&#8221;</title>
		<link>http://speech.blau.in/?p=34</link>
		<comments>http://speech.blau.in/?p=34#comments</comments>
		<pubDate>Wed, 20 May 2009 01:37:45 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[HTK]]></category>
		<category><![CDATA[sound]]></category>
		<category><![CDATA[Audacity]]></category>
		<category><![CDATA[signal processing]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=34</guid>
		<description><![CDATA[Let&#8217;s take a look at the characteristics of the sound &#8220;a&#8221; (spoken like in father). Here is a screenshot of Audacity which shows the repetitive pattern of the sound &#8220;a&#8221;: I have marked the different waves with numbers 1, 2, 3, 4, 5. The waves with the same number are slightly different one from another, [...]]]></description>
			<content:encoded><![CDATA[<p>Let&#8217;s take a look at the characteristics of the sound &#8220;a&#8221; (spoken like in f<strong>a</strong>ther). Here is a screenshot of <a href="http://spirit.blau.in/ubuntu/tag/audacity/">Audacity</a> which shows the repetitive pattern of the sound &#8220;a&#8221;:</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/05/waveform-a.png"><img class="alignnone size-medium wp-image-35" title="waveform-a" src="http://speech.blau.in/wp-content/uploads/2009/05/waveform-a-300x102.png" alt="waveform-a" width="300" height="102" /></a></p>
<p>I have marked the different waves with numbers 1, 2, 3, 4, 5. Waves with the same number differ slightly from one another, but they are similar. It is a repetitive pattern. Let&#8217;s extract a complete frame of the sound &#8220;a&#8221;:</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/05/waveform-a-small.png"><img class="alignnone size-full wp-image-36" title="waveform-a-small" src="http://speech.blau.in/wp-content/uploads/2009/05/waveform-a-small.png" alt="waveform-a-small" width="178" height="317" /></a></p>
<p>The above picture shows the first frame. Let&#8217;s compare the first frame with the second frame:</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/05/waveform-a-small2.png"><img class="alignnone size-full wp-image-38" title="waveform-a-small2" src="http://speech.blau.in/wp-content/uploads/2009/05/waveform-a-small2.png" alt="waveform-a-small2" width="182" height="311" /></a></p>
<p>Take a look at the area marked in yellow, and compare it with the corresponding area of the previous picture. It is slightly different.</p>
<p>This was a short introduction to <a href="http://en.wikipedia.org/wiki/Category:Signal_processing">signal processing</a>. These sound waves can be analysed by software like the HTK toolkit.</p>
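<p>The length of such a repeating frame can also be estimated by a program. One common approach is autocorrelation: the signal is compared with shifted copies of itself, and the shift that matches best is taken as the length of one frame. The following Python sketch is only an illustration of that idea. It uses a synthetic sine wave instead of the recording shown above, and the function name <code>estimate_period</code> is an assumption. It is not how HTK analyses sound.</p>

```python
import math


def estimate_period(samples, min_lag, max_lag):
    """Return the shift (in samples) at which the signal best matches itself."""
    best_lag = min_lag
    best_score = float("-inf")
    for lag in range(min_lag, max_lag + 1):
        if lag >= len(samples):
            break
        overlap = len(samples) - lag
        # Average product of the signal and its shifted copy.
        score = sum(samples[i] * samples[i + lag] for i in range(overlap)) / overlap
        if score > best_score:
            best_lag = lag
            best_score = score
    return best_lag


if __name__ == "__main__":
    # Synthetic test signal: a sine wave that repeats every 100 samples.
    wave = [math.sin(2 * math.pi * i / 100) for i in range(1000)]
    print(estimate_period(wave, 50, 150))
```

<p>For a real vowel the frames are only similar and not identical, so the best shift is an estimate and depends on the search range given by <code>min_lag</code> and <code>max_lag</code>.</p>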
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&#038;p=34</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Installing simon-juliusd-0.1-alpha2.exe</title>
		<link>http://speech.blau.in/?p=30</link>
		<comments>http://speech.blau.in/?p=30#comments</comments>
		<pubDate>Sun, 22 Jun 2008 07:19:33 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[speech recognition]]></category>
		<category><![CDATA[Simon]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=30</guid>
		<description><![CDATA[I just downloaded the obviously recently released program simon-juliusd-0.1-alpha2.exe. Before running the program, I checked it with ClamWin (I always do that before I install new software). It is OK, so I will install this program on my computer. The program is licensed under the GPL. On my computer, the program simon.exe was installed on [...]]]></description>
			<content:encoded><![CDATA[<p>I just downloaded the apparently <a href="http://speech.blau.in/?p=26#comment-37">recently released</a> program <a href="http://sourceforge.net/project/showfiles.php?group_id=190872&amp;package_id=224125&amp;release_id=608297" title="Simon is a free speech recognition software">simon-juliusd-0.1-alpha2.exe</a>.  Before running the program, I checked it with ClamWin (I always do that before I install new software).  It is OK, so I will install this program on my computer.  The program is licensed under the GPL.  On my computer, the program simon.exe was installed in &#8220;H:\Program Files\simon\simon-0.1-alpha-2&#8221;.</p>
<p>Here is a screenshot:</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2008/06/simon-konfiguration.png" title="Simon configuration"><img src="http://speech.blau.in/wp-content/uploads/2008/06/simon-konfiguration.png" alt="Simon configuration" /></a></p>
<p>But now, it is beginning to get complicated.  Take a look at the next screenshot:</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2008/06/simon-checklist.png" title="Simon checklist"><img src="http://speech.blau.in/wp-content/uploads/2008/06/simon-checklist.png" alt="Simon checklist" /></a></p>
<p>So several additional programs are needed to use this program successfully.  I need the <a href="http://mein-parteibuch.org/wiki/HTK">HTK toolkit</a> and <a href="http://mein-parteibuch.org/wiki/Julius_Speech_Recognition_Engine">Julius</a>.  Further components are necessary as well.  I think I will stop the installation now.  Or should I continue?  At the moment, I am not sure.  I think that I will hit the next button.</p>
<p>I won&#8217;t publish a screenshot of the next step.  It is about the HTK programs HDMan, HCopy, and several others.  I think (but I am not sure) that it is necessary to tell Simon the paths where those programs are installed.  A few months ago, I took some first steps with HTK and Julius, but everything was pretty complicated.  At the moment, I am reading a few pages in the HTK book; everything is very abstract.  And it takes a lot of time to get involved.  But it is possible!  You just have to stay focused.</p>
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&#038;p=30</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

