Abstract:
Among the different forms of audio data, this research is limited to privacy protection of
speakers' voice content, because voice generally conveys information such as gender and
emotion and differs from speaker to speaker. De-identification of
voice may bring numerous advantages, such as preserving the privacy of speakers during
communication, maintaining the confidentiality of inquirers who conduct critical investigations, and
improving the clarity of voice signals used in airport/aviation communication by standardizing the
voices of pilots and air traffic controllers. Though advanced voice encryption methods are
available to conceal the content of speech, they do not directly address the issue of speaker
de-identification. This research project aims to de-identify voice signals while preserving
the intelligibility of the speech during communication.
[Figure: Designed GUI for mono LPC spectra of original and de-identified voice signals]
In this project, de-identification was carried out in three stages, of which the last two
are irreversible. First, in the frequency normalization stage, the pitch of the original signal is
changed, which slightly de-identifies the voice in the frequency domain. Second, the 12 LPC (Linear
Predictive Coding) coefficients of the subject's original voice signal are subtracted from the 12
coefficients of the reference sample voice signal, which moderates the speaker's features slightly.
Third, the features are disrupted further by shuffling the LPC coefficients randomly within
three categories. Therefore, the whole process is expected to preserve
a higher level of privacy. A test carried out using 15 male and 15 female voice samples
produced de-identification degrees of 10% and 20%, which could be accepted as
a very satisfactory result.
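
The second and third stages described above can be illustrated with a minimal sketch. This is not the project's implementation. It assumes that the 12 LPC coefficients are estimated per frame with the autocorrelation method (Levinson-Durbin recursion), and that the "three categories" are three contiguous groups of four coefficients; the abstract does not define either detail. The function names `lpc_coefficients`, `shuffle_within_groups` and `deidentify_lpc` are hypothetical, and the pitch-change stage is omitted.

```python
import numpy as np

def lpc_coefficients(frame, order=12):
    """Estimate LPC coefficients a[1..order] by the autocorrelation method.

    Uses the error-filter convention A(z) = 1 + sum_k a[k] z^-k.
    Assumes the frame is not silent (non-zero energy).
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for the order-i predictor
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a[1:]

def shuffle_within_groups(coeffs, n_groups=3, rng=None):
    """Randomly permute coefficients inside each contiguous group.

    Assumption: the three "categories" are contiguous index groups.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(coeffs, dtype=float)
    for idx in np.array_split(np.arange(len(out)), n_groups):
        out[idx] = out[rng.permutation(idx)]
    return out

def deidentify_lpc(subject_frame, reference_frame, order=12, rng=None):
    """Stage 2 (subtract subject from reference) followed by stage 3 (shuffle)."""
    subject = lpc_coefficients(subject_frame, order)
    reference = lpc_coefficients(reference_frame, order)
    moderated = reference - subject  # stage 2: coefficient subtraction
    return shuffle_within_groups(moderated, 3, rng)  # stage 3: grouped shuffle

if __name__ == "__main__":
    # Synthetic noise stands in for real voice frames in this sketch.
    rng = np.random.default_rng(0)
    subject_frame = rng.standard_normal(400)
    reference_frame = rng.standard_normal(400)
    print(deidentify_lpc(subject_frame, reference_frame, rng=rng))
```

Because the shuffle discards the original ordering within each group, the subject's coefficients cannot be recovered from the output alone without knowing the permutation, which is consistent with the statement that the later stages are irreversible.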