Year 2007 Final-Year Projects in
SIGNALS AND INFORMATION PROCESSING SYSTEMS

Dr. Roberto Togneri

We are currently in the first century of the information age and the new era of information systems engineering. The Signals and Information Processing (SIP) Group is offering exciting and challenging final-year projects to students who can demonstrate the required interest, motivation and passion. If you are interested in any of the projects on offer (or you have suggestions of your own), please email Roberto Togneri at <roberto[@]ee[.]uwa[.]edu[.]au> or drop by Room 4.10 for an obligation-free discussion, additional information and even reading material to help you make a truly informed choice.

Slides used for SIP Group Project Presentation on Tuesday, September 5

PLEASE NOTE: Most of the projects will require proficiency in the MATLAB programming language, shell scripting and/or UNIX command-line environments. Most students should be familiar with MATLAB, operate comfortably in a UNIX environment and be able to learn the basics of shell scripting if required. This should take you no more than 1-2 weeks of practice, depending on your current programming and computing skills. You may like to refer to my online documentation and tutorials on MATLAB and UNIX/Shell to help you with this. For projects emphasising real-time or embedded implementations, proficiency in C/C++ programming will be important.
STOP! Some important questions you should ask yourself:
  1. Do you have a course weighted average of 78 or better?
  2. Are you looking at getting first-class honours?
  3. Are you still considering all options available to you after you graduate?
If you answered yes to the above, then you should be seriously considering post-graduate studies after you graduate in 2007.
For more information please see:
Postgraduate Research in Spoken Language Systems


First ... most of the projects I offered to students this year are still available to interested students for next year; please check my 2006 FYP page for the details of these projects.
Now ... to the exciting Final-Year Projects being offered for students next year:


4.A. Secure Command and Control using Speech Recognition and Speaker Verification
This is a systems engineering project where you will build a prototype speech recognition system (see http://www.cslu.ogi.edu/HLTsurvey/ch1node4.html) and/or speaker verification system (see http://www.cslu.ogi.edu/HLTsurvey/ch1node9.html) using the Hidden Markov Model Toolkit (HTK) software (see http://htk.eng.cam.ac.uk). Your application domain is a futuristic automated home where voice and sound are used to respond to commands and authenticate occupants. Possible tasks include: sound-activated light switch, voice-controlled TV remote, security user authentication by voice, voice-dialling, etc. This project can range from as simple as you like (speaker-dependent, isolated word recognition in quiet environments) to as challenging and exciting as you like (continuous speech recognition in noisy environments), and can investigate different aspects (voice activity detection, keyword spotting, spoken language understanding, etc.). Note that more than one student can work on this project.

4.B. Reconstruction of Noise Corrupted Speech Spectrograms
Speech signals are usually transformed into the time-frequency domain and represented as spectrographic images (spectrograms, see http://en.wikipedia.org/wiki/Spectrogram) where the two axes of the image represent time and frequency respectively. The pixel value of each element in the image marks the energy of the signal in that time-frequency location. Different regions of this spectrographic picture may be corrupted to different degrees by noise. The purpose of this project is the investigation of methods to reconstruct the damaged regions of spectrograms prior to recognition from the information available in reliable regions and a priori knowledge about the structure of speech. Once the reconstruction process is completed standard speech recognition methods and feature extraction techniques can be applied. The HTK toolkit will be used for the speech recognition experiments and MATLAB for implementing the reconstruction algorithms. Note that this project will be co-supervised with a SIP Lab PhD student.
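To make the time-frequency representation concrete, here is a minimal sketch of how a spectrogram can be computed. It is written in plain Python purely for illustration (the project itself would use MATLAB), and the function name, frame length, hop size and Hann window are all assumptions of this sketch rather than anything prescribed by the project. Each entry spec[t][k] holds the energy of the signal in time frame t and frequency bin k.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return a list of frames; each frame lists the energy in every frequency bin.

    frame_len and hop are illustrative defaults, not values from the project.
    """
    # A Hann window reduces leakage between neighbouring frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        energies = []
        # Direct DFT of the windowed frame, keeping the non-negative frequencies.
        for k in range(frame_len // 2 + 1):
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            energies.append(abs(coeff) ** 2)
        frames.append(energies)
    return frames


# Toy input (hypothetical): a pure tone that sits exactly on frequency bin 8.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = spectrogram(tone)
peak_bins = [max(range(len(f)), key=lambda k: f[k]) for f in spec]
```

For this toy tone every frame has its largest energy in bin 8. In a noise-corrupted recording some of the spec[t][k] cells would be unreliable, and those are the regions the reconstruction methods aim to repair.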

4.C. Single Channel Blind Source Separation
The ability to separate different sources of sound is an important mechanism for speech enhancement and robust speech recognition. Consider an environment with the speaker and background music, or the speaker with background car noise, etc. With microphone array processing, multiple channels allow spatial filtering to help isolate the speaker. But there are many applications where only a single channel is available. In this challenging but satisfying project you will investigate and implement a technique based on independent component analysis (ICA) to identify and isolate the different sources of sound in a single channel recording. You will need to understand the basics of the theory, implement the algorithms in MATLAB, and carry out the investigations on synthetically designed audio samples (e.g. separating speech from music).

4.D. Performance Evaluation of Auditory Models
Computational auditory models replicate the process of human perception of speech and sound in the peripheral auditory system. They have been used as a tool for speech processing, speech and voice analysis as well as in the investigation of auditory phenomena. Two software tools are available for auditory modeling: Auditory Image Modeling (AIM) (see http://www.pdn.cam.ac.uk/groups/cnbh/research/aim.html) and Development System for Auditory Modeling (DSAM) (see http://www.essex.ac.uk/psychology/hearinglab/dsam). AIM is intended to simulate the processing performed by the auditory system to convert a sound into your first conscious awareness of that sound, that is, your auditory image of the sound. The processing steps are pre-cochlear processing, basilar membrane motion, the neural activity pattern and construction of the auditory image. DSAM brings together established models which simulate various stages in the auditory process. The DSAM library is a computational platform and set of coding conventions which supports a modular approach to auditory modeling. Available with DSAM is the Auditory Model Simulator (AMS) application, which is a fully-fledged, ready-to-use application with a graphical user interface. It comes as a finished product for Windows and Linux platforms with a ready-compiled installation package. Also available is the "AutoTest" for testing DSAM routines. Both packages are written in C and also have a MATLAB version (AIM-MAT and AFM).

The project involves implementation of computer models of auditory systems using the existing development tools for auditory modeling. The student will implement different established models including some of the models that have been developed here at the SIP Lab. The main task will be to evaluate the relative performance and efficiency of these models in identifying perceptual properties of speech. Note that this project will be co-supervised with a SIP Lab PhD student.

4.E. Music Classification and Summarization
With the increasing stock of available music files which are ripped or downloaded to portable music players, MP3 CDs, etc. it becomes increasingly important to be able to more efficiently search and index music in cases where the title or artist is unknown. In this project you will perform simple classification of the different music genres (rock, jazz, ambient, classic) and sub-genres (rock: pop, metal, r&b, dance, etc.) using standard pattern classification methods (see http://cnx.org/content/m11691/latest/). For a more challenging project you can also consider music segmentation and summarisation by attempting to detect the key segments of a music track and identify the "hook" (e.g. chorus) used to summarise the music piece.
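As a rough illustration of what "standard pattern classification" can look like, the sketch below implements a nearest-centroid classifier in plain Python. This is only one of many possible methods and is not necessarily the one you would end up using. The two features per track (a tempo-like value and a brightness-like value) and all of the numbers are hypothetical toy data invented for this example.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def train(labelled):
    """labelled maps genre -> list of feature vectors; returns genre -> centroid."""
    return {genre: centroid(vectors) for genre, vectors in labelled.items()}


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(model, features):
    """Return the genre whose centroid is closest to the given feature vector."""
    return min(model, key=lambda genre: squared_distance(model[genre], features))


# Hypothetical training data: [tempo-like feature, brightness-like feature].
training = {
    "rock":    [[1.3, 3.0], [1.4, 3.2]],
    "jazz":    [[1.0, 1.8], [0.9, 2.0]],
    "ambient": [[0.6, 1.0], [0.7, 0.8]],
}
model = train(training)
guess = classify(model, [1.35, 3.1])  # closest to the "rock" centroid
```

In a real investigation most of the effort goes into choosing features that actually separate the genres; the classifier itself can start out this simple and be replaced later.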

4.F. Nonlinear Function Mapping using the MLP and RBF Neural Networks
Function mapping is the ability to uniquely map data from one domain to another domain. With parametric mappings an analytic function, y=f(x), will map points x to y. Parametric, linear mappings are usually easy to estimate; nonlinear mappings, however, are much more complicated. And when the functional form of the mapping is unknown, a parametric mapping is not even possible. In this project you will examine nonlinear, data-driven mappings using neural networks as function approximators. Specifically you will use the MLP and RBF networks to determine a mapping between the speech vocal tract resonances (or formants) and the speech acoustic spectral features. You can either perform acoustic-to-formant mapping (a formant tracker) or the more difficult formant-to-acoustic mapping. You will carry out your investigations under MATLAB and generate the required formant and acoustic data using the various speech software tools and corpora. Note that your results will make an important contribution to the next generation dynamic acoustic models for speech.
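To give a feel for a data-driven nonlinear mapping, here is a minimal sketch of a Gaussian RBF network in plain Python. It is a simplification under stated assumptions: one centre is placed on every training point (exact interpolation), the kernel width of 0.4 is an arbitrary choice, and the data is a toy sine curve rather than formant or acoustic features. A practical RBF network would typically use fewer centres than data points and fit the weights by least squares.

```python
import math


def gaussian(r, width):
    """Gaussian radial basis function evaluated at distance r."""
    return math.exp(-(r * r) / (2.0 * width * width))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_rbf(xs, ys, width):
    """Place one centre on each training input and solve for the output weights."""
    phi = [[gaussian(xi - cj, width) for cj in xs] for xi in xs]
    return solve(phi, ys)


def predict(x, centres, weights, width):
    """Network output: weighted sum of the basis functions."""
    return sum(w * gaussian(x - c, width) for c, w in zip(centres, weights))


# Toy data (hypothetical): learn y = sin(x) from 8 samples on [0, pi].
WIDTH = 0.4
xs = [i * math.pi / 7 for i in range(8)]
ys = [math.sin(x) for x in xs]
weights = fit_rbf(xs, ys, WIDTH)
```

Calling predict(x, xs, weights, WIDTH) reproduces the training targets at the training inputs and gives a smooth estimate in between. The MLP alternative replaces the linear solve with iterative gradient-based training.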

4.G. Real-time EEG processing for interactive ERP and TMS
Interactive ERP/TMS is a new process in electrophysiology whereby stimuli are delivered in response to selected short term patterns of the electroencephalogram (EEG). For event related potential research, stimuli may be auditory, visual or somatosensory stimuli, while for TMS work, the stimuli are high-powered magnetic pulses. Interactive recordings are used in basic research of cognition, in epidemiological research (schizophrenia) and in clinical trials in the treatment of depression and schizophrenia. They are also being tested in brain-computer interface (BCI) applications whereby devices are controlled directly by "thought" activity. Highly motivated students would be expected to select and implement their own processing paradigm, algorithm, and software for evaluation. Pattern recognition methods that have previously been used in interactive recording include syntactic analysis, spectral analysis and phase or amplitude thresholds. The only requirement is to recognise a brainwave state in sufficient time to deliver a stimulus before the state changes. Training on the technical and clinical aspects of EEG/ERP will be provided at CCRN (http://www.ccrn.uwa.edu.au), along with a scored corpus of EEG data to allow (offline) retrospective development/training of such a system. CCRN then provides TMS and ERP stimuli delivery systems, subject assistance and clinical supervision to potentially implement the system (online) as a possible new treatment. Note that this is a collaborative project between the SIP Group and CCRN.
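The simplest of the methods mentioned above, an amplitude threshold, can be sketched in a few lines. The Python below is only an illustration under assumptions: the function name, the "hold" parameter (how many consecutive samples must exceed the threshold before a stimulus is triggered) and the sample values are all hypothetical, and a real system would work on streaming EEG with far more careful artefact handling.

```python
def first_trigger(samples, threshold, hold):
    """Return the index of the sample at which |x| has been at or above
    `threshold` for `hold` consecutive samples, or None if that never happens.

    Samples are examined strictly in arrival order, so the same logic could
    run on a live stream one sample at a time.
    """
    run = 0  # length of the current run of supra-threshold samples
    for i, x in enumerate(samples):
        run = run + 1 if abs(x) >= threshold else 0
        if run >= hold:
            return i
    return None


# Hypothetical amplitudes in arbitrary units.
eeg = [0.1, 0.2, 1.5, 0.3, 1.2, 1.4, 1.6, 0.2]
trigger_at = first_trigger(eeg, threshold=1.0, hold=3)  # index 6
```

Requiring a short run rather than a single sample trades a little latency for robustness against isolated spikes, which is exactly the timing trade-off the project asks you to evaluate.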

4.H. Etch Pit Density (EPD) of Semiconductor Wafers
The Etch Pit Density (EPD) of semiconductor wafers is a measure of the number of defects and its accurate calculation is important. The defects are visibly enhanced when the wafer is wet etched in a certain chemical solution. An image of the wafer surface reveals the pit defects as small spots. The defects shown by the image need to be detected, classified as defects and then counted. This process is not unlike particle counting used in microbiology. In this project a highly motivated student with a sound background in image processing, programming and signal processing will be required to investigate and implement strategies for enhancing the defects by appropriate image processing (e.g. thresholding), detecting and classifying the defects by recognising the defect characteristics, and counting the number of defects. Part of this investigation can include evaluating the ImageJ public-domain software (see http://rsb.info.nih.gov/ij) for this purpose. Note that this project will be jointly supervised with the Microelectronic Research Group (MRG).
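The detect-then-count idea can be sketched as thresholding followed by connected-component counting. The Python below is a minimal illustration, not a proposed solution: it assumes the pits appear darker than the surrounding wafer surface (the opposite may be true for your images), it uses 4-connectivity, and the tiny grey-level image and the min_area parameter are hypothetical.

```python
def count_pits(image, threshold, min_area=1):
    """Count 4-connected regions of pixels whose grey level is below `threshold`.

    Regions smaller than `min_area` pixels are ignored, which is a crude way
    of rejecting single-pixel noise.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] >= threshold:
                continue
            # Flood-fill the region containing (r, c) and measure its area.
            stack = [(r, c)]
            seen[r][c] = True
            area = 0
            while stack:
                y, x = stack.pop()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] < threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if area >= min_area:
                count += 1
    return count


# Hypothetical 5x6 grey-level image: bright background (200) with dark spots.
wafer = [
    [200, 200, 200, 200, 200, 200],
    [200,  40,  50, 200, 200, 200],
    [200,  45, 200, 200,  30, 200],
    [200, 200, 200, 200, 200, 200],
    [ 60, 200, 200, 200, 200,  20],
]
pits = count_pits(wafer, threshold=100)  # four separate dark regions
```

Dividing the count by the imaged area gives the density. The harder parts of the project are choosing the threshold robustly and telling genuine pits apart from dust or scratches.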

4.I. Performance Analysis of Cryptographic Algorithms
In the modern digital information age there is a widespread need for effective cryptography. The use of encryption is common for secure transactions, secure storage of sensitive data, user authentication and digital rights management. There are many available cryptographic algorithms that one can use for both private-key and public-key encryption (see http://www.ssh.fi/support/cryptography and http://www.eskimo.com/~weidai/algorithms.html). In this project a highly motivated student with the right Maths or CS background can choose to investigate different aspects of cryptographic algorithm performance, including: implementation of more efficient Elliptic Curve (EC) public-key encryption and comparison with RSA, or evaluating the seriousness of timing attacks based on cache, CPU and memory profiling and investigating possible solutions. Alternatively, for a more straightforward project you can benchmark the performance of various public-domain algorithms (see http://www.eskimo.com/~weidai/benchmarks.html). Note that this project is supported by the interests of Motorola Research, Australia.
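For the benchmarking option, the basic measurement loop is straightforward. The sketch below is a hedged illustration in Python: because the standard library ships hash functions but no RSA or EC implementation, it times SHA-256 and MD5 as stand-ins, and the function name, payload size and repeat count are assumptions of this example. The same harness shape would apply to whichever ciphers you actually benchmark.

```python
import hashlib
import time


def throughput_mb_per_s(algorithm, payload, repeats=20):
    """Hash `payload` `repeats` times and return throughput in megabytes per second."""
    start = time.perf_counter()
    for _ in range(repeats):
        hashlib.new(algorithm, payload).digest()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    elapsed = max(elapsed, 1e-9)
    return (len(payload) * repeats) / (elapsed * 1e6)


payload = bytes(1 << 20)  # 1 MiB of zero bytes as a hypothetical test message
results = {name: throughput_mb_per_s(name, payload) for name in ("sha256", "md5")}
```

Absolute numbers depend heavily on the CPU, the library build and the message size, so a fair comparison repeats the measurement several times and reports the conditions alongside the results.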

4.J. Evaluation of an Identity-Based Encryption Scheme
The biggest drawback of standard public key infrastructure (PKI), aside from simply getting people to use it, is the rigmarole of generating and distributing public keys. One potential solution was proposed nearly 20 years ago. It's called Identity-Based Encryption (see http://crypto.stanford.edu/ibe/). With IBE the sender can encrypt the text using a human-readable string as the key (e.g. the recipient's email address). At the other end the recipient then has to retrieve or generate the corresponding private key to decrypt the ciphertext. In this project you will investigate, implement and evaluate a simple IBE prototype scheme and compare its performance, usability and applicability to an equivalent PKI scheme. Students for this project should have either a CS or Maths major background. Note that this project is supported by the interests of Motorola Research, Australia.


Want more? Have a look at my 2008 FYP list for the projects which didn't make it this time round and which may be offered to students for 2008, but are available to students in 2007 if you are interested.

Last Updated: 5 September 2006
Dr. Roberto Togneri's Research Page
CRICOS Provider No: 00126G