<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>

<channel>
	<title>speech recognition software</title>
	<atom:link href="http://speech.blau.in/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://speech.blau.in</link>
	<description>- free and open source -</description>
	<pubDate>Sun, 30 Aug 2009 18:50:14 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>What are amplitude, frequency, wavelength?</title>
		<link>http://speech.blau.in/?p=67</link>
		<comments>http://speech.blau.in/?p=67#comments</comments>
		<pubDate>Sun, 30 Aug 2009 18:50:14 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[speech recognition]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=67</guid>
		<description><![CDATA[I think that 16 kHz / 16 bit recordings should be sufficient for the development of a speech model. But what does that mean? A good article explains the differences between amplitude (16 bit recordings are more precise than 8 bit recordings), and frequency, wavelength (the human ear can distinguish up to 20 kHz; you [...]]]></description>
			<content:encoded><![CDATA[<p>I think that 16 kHz / 16-bit recordings should be sufficient for developing a speech model. But what does that mean? A <a href="http://www.makepages.com/membersonly/sound.html">good article</a> explains the difference between amplitude resolution (16-bit recordings are more precise than 8-bit recordings) and frequency or wavelength. The human ear can hear frequencies up to about 20 kHz, and the sampling rate has to be at least twice the highest frequency you want to capture. A 16 kHz sampling rate therefore covers frequencies up to 8 kHz, which should be sufficient for speech.</p>
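<p>To make the numbers concrete, here is a minimal Python sketch. The function names are my own and do not come from the linked article; the code only restates the arithmetic for sampling rate and bit depth, and the data rate assumes uncompressed mono PCM.</p>

```python
def nyquist_frequency_hz(sample_rate_hz):
    """Highest frequency that a given sampling rate can represent."""
    return sample_rate_hz / 2


def quantization_levels(bit_depth):
    """Number of distinct amplitude values for a given bit depth."""
    return 2 ** bit_depth


def bytes_per_second(sample_rate_hz, bit_depth, channels=1):
    """Data rate of uncompressed PCM audio."""
    return sample_rate_hz * (bit_depth // 8) * channels


print(nyquist_frequency_hz(16000))   # 8000.0
print(quantization_levels(8))        # 256
print(quantization_levels(16))       # 65536
print(bytes_per_second(16000, 16))   # 32000
```

<p>So a 16-bit recording distinguishes 256 times as many amplitude steps as an 8-bit recording, and one second of 16 kHz / 16-bit mono audio takes about 32 kB.</p>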
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&amp;p=67</wfw:commentRss>
		</item>
		<item>
		<title>Replacing HTK by Sphinx?</title>
		<link>http://speech.blau.in/?p=58</link>
		<comments>http://speech.blau.in/?p=58#comments</comments>
		<pubDate>Fri, 28 Aug 2009 22:45:07 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[speech recognition]]></category>

		<category><![CDATA[modelcompilationmanager.cpp]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=58</guid>
		<description><![CDATA[You need to install HTK if you want to run simon with full functionality. HTK is not included; it has to be downloaded from a different source (registration required). In my view, this information could point people who are familiar with Sphinx and Qt in the right direction:
&#8220;simon uses the NON-FREE HTK [...]]]></description>
			<content:encoded><![CDATA[<p>You need to install HTK if you want to run simon with full functionality. HTK is not included; it has to be downloaded from a different source (registration required). In my view, <a href="http://dot.kde.org/2009/08/22/simon-speech-activated-user-interface-kde">this information</a> could point people who are familiar with Sphinx and Qt in the right direction:</p>
<blockquote><p>&#8220;simon uses the NON-FREE HTK for that. Only _one_ class in simon comes into contact with the HTK. The model compilation manager. This class: <a href="http://speech2text.svn.sourceforge.net/viewvc/speech2text/trunk/simonlib/speechmodelcompilation/modelcompilationmanager.cpp?revision=891&amp;view=markup">http://speech2text.svn.sourceforge.net/viewvc/speech2text/trunk/simonlib&#8230;.</a> Those 1200 lines (including other, julius related stuff) are everything that links simon to the HTK. The class could very, very easily be replaced with one that uses something else.&#8221;</p></blockquote>
<p>simon should continue to use HTK, because there are <a href="http://www.joelonsoftware.com/articles/fog0000000069.html">things you should never do</a>:</p>
<blockquote><p>&#8220;They did it by making the single worst strategic mistake that any software company can make:</p>
<p>They decided to rewrite the code from scratch.&#8221;</p></blockquote>
<p>Still, maybe someone out there would like to start a fork of simon and replace HTK with <a href="http://en.wikipedia.org/wiki/CMU_Sphinx">Sphinx</a>? Of course, this would be a completely different project.</p>
<p>I think that Sphinx could use a GUI.</p>
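<p>The idea from the quote, that only one class touches the toolkit and could be swapped out, can be sketched in a few lines. This is a hypothetical Python illustration, not simon's actual C++ code: the class and method names are invented, and the two backends are placeholders that call no real HTK or Sphinx tools.</p>

```python
from abc import ABC, abstractmethod


class ModelCompiler(ABC):
    """Hypothetical interface: the only place that knows about a toolkit."""

    @abstractmethod
    def compile_model(self, training_dir):
        """Build a speech model from the data in training_dir."""


class HtkModelCompiler(ModelCompiler):
    def compile_model(self, training_dir):
        # Placeholder: a real backend would run the HTK tools here.
        return f"HTK model from {training_dir}"


class SphinxModelCompiler(ModelCompiler):
    def compile_model(self, training_dir):
        # Placeholder: a real backend would run the Sphinx trainer here.
        return f"Sphinx model from {training_dir}"


def build_model(compiler, training_dir):
    # The rest of the application depends only on the interface.
    return compiler.compile_model(training_dir)


print(build_model(HtkModelCompiler(), "training-data"))
print(build_model(SphinxModelCompiler(), "training-data"))
```

<p>As long as the rest of the program only talks to the interface, replacing the backend means writing one new class.</p>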
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&amp;p=58</wfw:commentRss>
		</item>
		<item>
		<title>Sound frames - a, e, i, o, u, b(e)</title>
		<link>http://speech.blau.in/?p=46</link>
		<comments>http://speech.blau.in/?p=46#comments</comments>
		<pubDate>Sat, 18 Jul 2009 10:49:11 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[speech recognition]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=46</guid>
		<description><![CDATA[Let&#8217;s take a look at a few sound frames (click picture to enlarge):

U+0061 (a), U+02D0
The sounds in this article correspond to the German pronunciation.

U+0065 (e), U+02D0

U+0069 (i), U+02D0

U+006F (o), U+02D0

U+0075 (u), U+02D0

First part: U+0062 (b) - second part: U+0065 (e), U+02D0
]]></description>
			<content:encoded><![CDATA[<p>Let&#8217;s take a look at a few sound frames (click picture to enlarge):</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/07/sound-a.png"><img class="alignnone size-medium wp-image-45" title="sound-a" src="http://speech.blau.in/wp-content/uploads/2009/07/sound-a-300x103.png" alt="sound-a" width="300" height="103" /></a></p>
<p>U+0061 (a), U+02D0<br />
The sounds in this article correspond to the <a href="http://de.wiktionary.org/wiki/Wiktionary:Lautschrift#Vokale">German pronunciation</a>.</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/07/sound-e.png"><img class="alignnone size-medium wp-image-49" title="sound-e" src="http://speech.blau.in/wp-content/uploads/2009/07/sound-e-300x131.png" alt="sound-e" width="300" height="131" /></a><br />
U+0065 (e), U+02D0</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/07/sound-i.png"><img class="alignnone size-medium wp-image-50" title="sound-i" src="http://speech.blau.in/wp-content/uploads/2009/07/sound-i-300x129.png" alt="sound-i" width="300" height="129" /></a><br />
U+0069 (i), U+02D0</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/07/sound-o.png"><img class="alignnone size-medium wp-image-51" title="sound-o" src="http://speech.blau.in/wp-content/uploads/2009/07/sound-o-300x131.png" alt="sound-o" width="300" height="131" /></a><br />
U+006F (o), U+02D0</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/07/sound-u.png"><img class="alignnone size-medium wp-image-52" title="sound-u" src="http://speech.blau.in/wp-content/uploads/2009/07/sound-u-300x118.png" alt="sound-u" width="300" height="118" /></a><br />
U+0075 (u), U+02D0</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/07/sound-be.png"><img class="alignnone size-medium wp-image-53" title="sound-be" src="http://speech.blau.in/wp-content/uploads/2009/07/sound-be-300x87.png" alt="sound-be" width="300" height="87" /></a><br />
First part: U+0062 (b) - second part: U+0065 (e), U+02D0</p>
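<p>The U+ notation used in the captions refers to Unicode code points. As a small sketch (the helper name <code>describe</code> is my own), Python can turn the code points into the actual characters and show their official names:</p>

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# U+0061 is the letter a, U+02D0 is the IPA length mark.
long_a = "\u0061\u02D0"

print(long_a)                  # aː
print(describe(long_a))
print(long_a.encode("utf-8"))  # b'a\xcb\x90'
```

<p>The letter itself takes one byte in UTF-8, the length mark takes two.</p>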
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&amp;p=46</wfw:commentRss>
		</item>
		<item>
		<title>Julius package for Ubuntu</title>
		<link>http://speech.blau.in/?p=43</link>
		<comments>http://speech.blau.in/?p=43#comments</comments>
		<pubDate>Thu, 18 Jun 2009 17:49:13 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[speech recognition]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=43</guid>
		<description><![CDATA[Soon, there should be an updated Julius package for Ubuntu (4.0.2 -&#62; 4.1.2).
]]></description>
			<content:encoded><![CDATA[<p>Soon, there should be an <a href="http://www.voxforge.org/home/forums/message-boards/speech-recognition-engines/julius-4.1.2">updated Julius package for Ubuntu</a> (4.0.2 -&gt; 4.1.2).</p>
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&amp;p=43</wfw:commentRss>
		</item>
		<item>
		<title>Characteristics of the sound &#8220;a&#8221;</title>
		<link>http://speech.blau.in/?p=34</link>
		<comments>http://speech.blau.in/?p=34#comments</comments>
		<pubDate>Wed, 20 May 2009 01:37:45 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[HTK]]></category>

		<category><![CDATA[sound]]></category>

		<category><![CDATA[Audacity]]></category>

		<category><![CDATA[signal processing]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=34</guid>
		<description><![CDATA[Let&#8217;s take a look at the characteristics of the sound &#8220;a&#8221; (pronounced as in father). Here is a screenshot of Audacity that shows the repetitive pattern of the sound &#8220;a&#8221;:

I have marked the different waves with the numbers 1, 2, 3, 4, 5. Waves with the same number differ slightly from one another, but [...]]]></description>
			<content:encoded><![CDATA[<p>Let&#8217;s take a look at the characteristics of the sound &#8220;a&#8221; (pronounced as in f<strong>a</strong>ther). Here is a screenshot of <a href="http://spirit.blau.in/ubuntu/tag/audacity/">Audacity</a> that shows the repetitive pattern of the sound &#8220;a&#8221;:</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/05/waveform-a.png"><img class="alignnone size-medium wp-image-35" title="waveform-a" src="http://speech.blau.in/wp-content/uploads/2009/05/waveform-a-300x102.png" alt="waveform-a" width="300" height="102" /></a></p>
<p>I have marked the different waves with the numbers 1, 2, 3, 4, 5. Waves with the same number differ slightly from one another, but they are similar: it is a repetitive pattern. Let&#8217;s extract a complete frame of the sound &#8220;a&#8221;:</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/05/waveform-a-small.png"><img class="alignnone size-full wp-image-36" title="waveform-a-small" src="http://speech.blau.in/wp-content/uploads/2009/05/waveform-a-small.png" alt="waveform-a-small" width="178" height="317" /></a></p>
<p>The above picture shows the first frame. Let&#8217;s compare the first frame with the second frame:</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2009/05/waveform-a-small2.png"><img class="alignnone size-full wp-image-38" title="waveform-a-small2" src="http://speech.blau.in/wp-content/uploads/2009/05/waveform-a-small2.png" alt="waveform-a-small2" width="182" height="311" /></a></p>
<p>Take a look at the area marked in yellow, and compare it with the corresponding area of the previous picture. It is slightly different.</p>
<p>This was a short introduction to <a href="http://en.wikipedia.org/wiki/Category:Signal_processing">signal processing</a>. Sound waves like these can be analysed by software such as the HTK toolkit.</p>
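<p>The repetitive pattern can also be found by a program. The following is a minimal Python sketch using autocorrelation, which is my choice of technique here and not necessarily what HTK does internally. It uses a synthetic signal instead of my recording, and the 16 kHz sampling rate, the 125 Hz fundamental, and the harmonic amplitudes are all assumptions for illustration.</p>

```python
import math

SAMPLE_RATE_HZ = 16000  # assumed sampling rate


def synthetic_vowel(fundamental_hz, n_samples):
    """Periodic test signal: a fundamental plus two weaker harmonics."""
    harmonics = [(1, 1.0), (2, 0.5), (3, 0.25)]
    return [
        sum(amp * math.sin(2 * math.pi * k * fundamental_hz * n / SAMPLE_RATE_HZ)
            for k, amp in harmonics)
        for n in range(n_samples)
    ]


def autocorrelation(signal, lag):
    """Average product of the signal with a copy shifted by lag samples."""
    pairs = len(signal) - lag
    return sum(signal[n] * signal[n + lag] for n in range(pairs)) / pairs


def estimate_period(signal, min_lag, max_lag):
    """Lag (in samples) at which the signal is most similar to itself."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorrelation(signal, lag))


signal = synthetic_vowel(125.0, 2048)
period = estimate_period(signal, 40, 200)  # search 80 Hz to 400 Hz
print(period, SAMPLE_RATE_HZ / period)     # 128 125.0
```

<p>The lag with the highest autocorrelation is the length of one frame in samples; dividing the sampling rate by it gives the pitch of the voice.</p>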
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&amp;p=34</wfw:commentRss>
		</item>
		<item>
		<title>Installing simon-juliusd-0.1-alpha2.exe</title>
		<link>http://speech.blau.in/?p=30</link>
		<comments>http://speech.blau.in/?p=30#comments</comments>
		<pubDate>Sun, 22 Jun 2008 07:19:33 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[speech recognition]]></category>

		<category><![CDATA[Simon]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=30</guid>
		<description><![CDATA[I just downloaded the apparently recently released program simon-juliusd-0.1-alpha2.exe. Before running the program, I checked it with ClamWin (I always do that before I install new software). It is OK, so I will install this program on my computer. The program is licensed under the GPL. On my computer, the [...]]]></description>
			<content:encoded><![CDATA[<p>I just downloaded the apparently <a href="http://speech.blau.in/?p=26#comment-37">recently released</a> program <a href="http://sourceforge.net/project/showfiles.php?group_id=190872&amp;package_id=224125&amp;release_id=608297" title="Simon is a free speech recognition software">simon-juliusd-0.1-alpha2.exe</a>. Before running the program, I checked it with ClamWin (I always do that before I install new software). It is OK, so I will install this program on my computer. The program is licensed under the GPL. On my computer, the program simon.exe was installed in &#8220;H:\Program Files\simon\simon-0.1-alpha-2&#8221;.</p>
<p>Here is a screenshot:</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2008/06/simon-konfiguration.png" title="Simon configuration"><img src="http://speech.blau.in/wp-content/uploads/2008/06/simon-konfiguration.png" alt="Simon configuration" /></a></p>
<p>But now things start to get complicated. Take a look at the next screenshot:</p>
<p><a href="http://speech.blau.in/wp-content/uploads/2008/06/simon-checklist.png" title="Simon checklist"><img src="http://speech.blau.in/wp-content/uploads/2008/06/simon-checklist.png" alt="Simon checklist" /></a></p>
<p>So to use this program successfully, several additional programs are needed. I need the <a href="http://mein-parteibuch.org/wiki/HTK">HTK toolkit</a> and <a href="http://mein-parteibuch.org/wiki/Julius_Speech_Recognition_Engine">Julius</a>, and further components are necessary as well. I think I will stop the installation now. Or should I continue? At the moment, I am not sure. I think I will hit the Next button.</p>
<p>I won&#8217;t publish a screenshot of the next step, but it is about the HTK programs HDMan, HCopy, and several others. I think (but I am not sure) that it is necessary to tell Simon the path where those programs are installed. A few months ago, I took some first steps with HTK and Julius, but everything was pretty complicated. At the moment, I am reading a few pages of the HTK book; everything is very abstract, and it takes a lot of time to get into it. But it is possible! You just have to stay focused.</p>
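<p>Before typing paths into a dialog, it can help to check whether the HTK programs are on the search path at all. Here is a small Python sketch for that; it is not part of Simon, the helper name <code>locate_tools</code> is made up, and the list contains only the two tools named above (a full setup needs more).</p>

```python
import shutil

# Tool names taken from the text; a real setup needs more HTK programs.
HTK_TOOLS = ["HDMan", "HCopy"]


def locate_tools(names, search_path=None):
    """Map each tool name to its full path, or None if it is not found."""
    return {name: shutil.which(name, path=search_path) for name in names}


for name, location in locate_tools(HTK_TOOLS).items():
    print(f"{name}: {location or 'not found'}")
```

<p>If a tool is reported as not found, its folder is either missing from the PATH variable or HTK is not installed yet.</p>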
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&amp;p=30</wfw:commentRss>
		</item>
		<item>
		<title>VoxForge dictionary isn&#8217;t encoded in UTF-8</title>
		<link>http://speech.blau.in/?p=28</link>
		<comments>http://speech.blau.in/?p=28#comments</comments>
		<pubDate>Sat, 21 Jun 2008 03:24:41 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[speech recognition]]></category>

		<category><![CDATA[ASCII]]></category>

		<category><![CDATA[dictionary]]></category>

		<category><![CDATA[UTF-8]]></category>

		<category><![CDATA[VoxForge]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=28</guid>
		<description><![CDATA[I just downloaded the VoxForge dictionary (2.6 MB) and opened it with Notepad++. Apparently, it is encoded in ANSI, not in UTF-8. That&#8217;s OK because it contains only standard characters. I am guessing that this dictionary is compatible with ASCII. But I would suggest that future versions should be published [...]]]></description>
			<content:encoded><![CDATA[<p>I just downloaded the <a href="http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Lexicon/VoxForge.tgz">VoxForge dictionary</a> (2.6 MB) and opened it with Notepad++. Apparently, it is encoded in ANSI, not in UTF-8. That&#8217;s OK because it contains only standard characters. I am guessing that this dictionary is compatible with ASCII. But I would suggest that future versions should be published in UTF-8.</p>
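<p>The guess about ASCII can be checked instead of eyeballed. Below is a minimal Python sketch; the function name is my own, and the sample bytes are invented examples, not lines from the VoxForge file. Note that a file containing only ASCII bytes is automatically valid UTF-8 as well.</p>

```python
def classify_encoding(data):
    """Classify raw bytes as 'ascii', 'utf-8', or 'other'."""
    if data.isascii():            # every byte is below 128
        return "ascii"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "other"            # e.g. a Windows code page ("ANSI")
    return "utf-8"


print(classify_encoding(b"plain text"))           # ascii
print(classify_encoding("ü".encode("utf-8")))     # utf-8
print(classify_encoding("ü".encode("cp1252")))    # other
```

<p>To test a real file, read it in binary mode and pass the bytes to the function.</p>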
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&amp;p=28</wfw:commentRss>
		</item>
		<item>
		<title>Switching from Arpabet to IPA</title>
		<link>http://speech.blau.in/?p=27</link>
		<comments>http://speech.blau.in/?p=27#comments</comments>
		<pubDate>Sat, 21 Jun 2008 00:27:54 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[speech recognition]]></category>

		<category><![CDATA[dictionary]]></category>

		<category><![CDATA[IPA]]></category>

		<category><![CDATA[UTF-8]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=27</guid>
		<description><![CDATA[Apparently, the CMU Pronouncing Dictionary uses the Arpabet. The Arpabet has the advantage that it is possible
&#8220;to represent phonemes with ASCII characters.&#8221;
But today, the UTF-8 standard is becoming more and more common. In my opinion, there should be a discussion about switching from Arpabet/ASCII to IPA/UTF-8. The IPA is easier to [...]]]></description>
			<content:encoded><![CDATA[<p>Apparently, the <a href="http://en.wikipedia.org/wiki/CMU_Pronouncing_Dictionary" title="CMU pronouncing dictionary">CMU Pronouncing Dictionary</a> uses the Arpabet. The <a href="http://en.wikipedia.org/wiki/Arpabet">Arpabet</a> has the advantage that it is possible</p>
<blockquote><p>&#8220;to represent phonemes with ASCII characters.&#8221;</p></blockquote>
<p>But today, the UTF-8 standard is becoming more and more common. In my opinion, there should be a discussion about switching from Arpabet/ASCII to IPA/UTF-8. The IPA is easier to read than the Arpabet. And UTF-8 is backwards compatible with ASCII: every ASCII file is also a valid UTF-8 file.</p>
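<p>A conversion would be mostly a table lookup. Here is a minimal Python sketch under a few assumptions: the table follows the commonly published Arpabet to IPA correspondences, stress digits are simply dropped, and unstressed AH is written as a schwa. The sample transcriptions are written in the style of the CMU dictionary.</p>

```python
ARPABET_TO_IPA = {
    # vowels
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ",
    "AY": "aɪ", "EH": "ɛ", "ER": "ɝ", "EY": "eɪ", "IH": "ɪ",
    "IY": "i", "OW": "oʊ", "OY": "ɔɪ", "UH": "ʊ", "UW": "u",
    # consonants
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "F": "f", "G": "ɡ",
    "HH": "h", "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n",
    "NG": "ŋ", "P": "p", "R": "ɹ", "S": "s", "SH": "ʃ", "T": "t",
    "TH": "θ", "V": "v", "W": "w", "Y": "j", "Z": "z", "ZH": "ʒ",
}


def arpabet_to_ipa(transcription):
    """Convert a space-separated Arpabet transcription to an IPA string."""
    ipa = []
    for symbol in transcription.split():
        if symbol == "AH0":
            ipa.append("ə")                   # unstressed AH as schwa
        else:
            base = symbol.rstrip("012")       # drop the stress digit, if any
            ipa.append(ARPABET_TO_IPA[base])
    return "".join(ipa)


print(arpabet_to_ipa("HH AH0 L OW1"))   # həloʊ
print(arpabet_to_ipa("S P IY1 CH"))     # spitʃ

# ASCII text has exactly the same bytes in UTF-8.
print("HH AH0 L OW1".encode("ascii") == "HH AH0 L OW1".encode("utf-8"))  # True
```

<p>The last line shows the backwards compatibility: an existing Arpabet file would not have to be re-encoded, only the IPA version needs characters outside ASCII.</p>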
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&amp;p=27</wfw:commentRss>
		</item>
		<item>
		<title>RFC 4267: VoiceXML, PLS, SSML, SRGS, CCXML</title>
		<link>http://speech.blau.in/?p=26</link>
		<comments>http://speech.blau.in/?p=26#comments</comments>
		<pubDate>Thu, 19 Jun 2008 04:30:06 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[speech recognition]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=26</guid>
		<description><![CDATA[Recently, I read the document RFC 4267. In my opinion, this framework is very interesting.
]]></description>
			<content:encoded><![CDATA[<p>Recently, I read the document <a href="http://www.ietf.org/rfc/rfc4267.txt" title="RFC 4267">RFC 4267</a>. In my opinion, this framework is very interesting.</p>
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&amp;p=26</wfw:commentRss>
		</item>
		<item>
		<title>learning sphinx automatic speech recognition</title>
		<link>http://speech.blau.in/?p=25</link>
		<comments>http://speech.blau.in/?p=25#comments</comments>
		<pubDate>Mon, 24 Mar 2008 06:17:44 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[speech recognition]]></category>

		<category><![CDATA[sphinx]]></category>

		<guid isPermaLink="false">http://speech.blau.in/?p=25</guid>
		<description><![CDATA[You can learn to use the CMU Sphinx automatic speech recognition system. I followed several steps of this tutorial, but I didn&#8217;t succeed. I used Ubuntu Linux. What was the problem? Well, several smaller problems occurred. I was able to solve a few of them, but not all. I will [...]]]></description>
			<content:encoded><![CDATA[<p>You can learn to use the CMU Sphinx automatic speech recognition system. I followed several steps of <a href="http://www.speech.cs.cmu.edu/sphinx/tutorial.html">this tutorial</a>, but I didn&#8217;t succeed. I used Ubuntu Linux. What was the problem? Well, several smaller problems occurred. I was able to solve a few of them, but not all. I will try again.</p>
<p>2008-04-01: <a href="http://voxforge.org/home/forums/message-boards/speech-recognition-engines/doubt-about-sphinx3-installation">Doubt about sphinx3 installation</a></p>
]]></content:encoded>
			<wfw:commentRss>http://speech.blau.in/?feed=rss2&amp;p=25</wfw:commentRss>
		</item>
	</channel>
</rss>
