Speech Processing System

This project team will extend Naresh Trilok's dissertation work as follows:
  1. become familiar with Naresh's Matlab programs and rerun the experiments to understand the system
  2. create a grey-scale plot of the 13 frequency bands as a function of time (this plot can be used to display the speech samples before and after segmentation, and also after dividing the segmented samples into the seven speech sounds)
  3. run a 24-feature experiment (means and variances of 12 frequency bands, that is, the 13 bands without the first cepstral component)
  4. add 40 speakers to the database of speech samples, increasing the number of speakers from 10 to 50, with 10 samples of "My name is ..." from each speaker (increasing the total number of utterances from 100 to 500). Each speech sample in the database should include the speaker's name, gender, age, and nationality.
  5. automate the segmentation of the "My name is" portion of the speech samples
  6. automate the segmentation of the "My name is" phrase into its 7 speech sounds
  7. rerun the 24-feature and 84-feature experiments with the increased database of speech samples
  8. run a 168-feature experiment (both means and variances of the 84 measures)
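
The feature counts in steps 3, 7, and 8 can be checked with a little arithmetic: 12 bands x 2 statistics = 24, 12 bands x 7 speech sounds = 84, and 84 x 2 = 168. The Java sketch below illustrates one way such features might be computed. It is not taken from Naresh's programs. The class name FeatureSketch, the frames[t][b] array layout, and the reading of the 84 measures as per-sound band means are all assumptions, and it uses the population variance (dividing by n), whereas Matlab's var divides by n-1 by default.

```java
/** Illustrative sketch only; names and array layout are assumptions, not part of the existing system. */
final class FeatureSketch {

    /** Bands kept after dropping the first cepstral component (13 - 1). */
    static final int KEPT_BANDS = 12;

    /**
     * Computes per-band means followed by per-band variances for one block of frames.
     * frames[t][b] holds band b (0..12) at time frame t; band 0 is skipped.
     * Returns 24 values: 12 means, then 12 population variances.
     */
    static double[] meansAndVariances(double[][] frames) {
        int n = frames.length;
        if (n == 0) {
            throw new IllegalArgumentException("at least one frame is required");
        }
        double[] out = new double[2 * KEPT_BANDS];
        for (int b = 0; b < KEPT_BANDS; b++) {
            double sum = 0.0;
            for (double[] frame : frames) {
                sum += frame[b + 1];
            }
            double mean = sum / n;
            double squares = 0.0;
            for (double[] frame : frames) {
                double d = frame[b + 1] - mean;
                squares += d * d;
            }
            out[b] = mean;
            out[KEPT_BANDS + b] = squares / n; // population variance (assumption)
        }
        return out;
    }

    /**
     * Concatenates the statistics of each speech-sound segment of one utterance.
     * With 7 segments this yields 84 values (means only) or 168 (means and variances).
     */
    static double[] utteranceFeatures(double[][][] segments, boolean includeVariances) {
        int perSegment = includeVariances ? 2 * KEPT_BANDS : KEPT_BANDS;
        double[] out = new double[segments.length * perSegment];
        for (int s = 0; s < segments.length; s++) {
            double[] stats = meansAndVariances(segments[s]);
            // stats stores the means first, so copying 12 values keeps the means only.
            System.arraycopy(stats, 0, out, s * perSegment, perSegment);
        }
        return out;
    }

    public static void main(String[] args) {
        double[][][] segments = new double[7][5][13]; // 7 sounds, 5 dummy frames each
        System.out.println(utteranceFeatures(segments, false).length);
        System.out.println(utteranceFeatures(segments, true).length);
    }
}
```

Calling meansAndVariances on the frames of a whole "My name is" segment would give the 24-feature vector under the same assumptions.
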
You should use Matlab to convert the new speech samples into the 13 frequency bands as a function of time (simply use the Matlab programs already provided by Naresh). You should be able to use either Java or Matlab for the grey-scale display, for segmenting "My name is", for further dividing the segmented samples into the 7 speech sounds, and for obtaining the features (means and variances). Matlab should be used again for the neural network classifier (once more using the programs provided by Naresh).
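
If Java is chosen for the grey-scale display, one possible starting point is sketched below. It linearly rescales the band values to grey levels 0 to 255 and writes them out as a plain-text PGM (P2) image, with time along the horizontal axis and the lowest band on the bottom row. The class name GreyScaleSketch, the frames[t][b] layout, the linear scaling, and the choice of PGM output are all assumptions made for illustration; nothing here comes from the existing Matlab programs.

```java
/** Illustrative sketch only; names, layout, and scaling are assumptions. */
final class GreyScaleSketch {

    /**
     * Maps frames[t][b] linearly onto grey levels: the smallest value becomes 0 (black)
     * and the largest becomes 255 (white). A constant input maps to 0 everywhere.
     */
    static int[][] toGreyLevels(double[][] frames) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double[] frame : frames) {
            for (double v : frame) {
                min = Math.min(min, v);
                max = Math.max(max, v);
            }
        }
        double range = max - min;
        int[][] levels = new int[frames.length][];
        for (int t = 0; t < frames.length; t++) {
            levels[t] = new int[frames[t].length];
            for (int b = 0; b < frames[t].length; b++) {
                levels[t][b] = range == 0.0
                        ? 0
                        : (int) Math.round(255.0 * (frames[t][b] - min) / range);
            }
        }
        return levels;
    }

    /**
     * Serialises levels[t][b] as a plain PGM image, one pixel value per line.
     * Image width is the number of frames; image height is the number of bands.
     * Rows are written top to bottom, so the highest band comes first.
     */
    static String toPgm(int[][] levels) {
        int width = levels.length;
        int height = levels[0].length;
        StringBuilder sb = new StringBuilder();
        sb.append("P2\n").append(width).append(' ').append(height).append("\n255\n");
        for (int b = height - 1; b >= 0; b--) {
            for (int t = 0; t < width; t++) {
                sb.append(levels[t][b]).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        double[][] frames = { {0, 10}, {4, 0}, {10, 4} }; // 3 frames, 2 bands (toy data)
        System.out.print(toPgm(toGreyLevels(frames)));
    }
}
```

The same grey levels could be computed separately for the full recording, the segmented phrase, and each of the 7 speech sounds to produce the before-and-after displays.
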

For a summary of Naresh's dissertation that we just submitted to a conference, see the paper that was submitted to ISI.