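To study the role of signal correlation in speech emotion recognition, a spectrogram feature extraction algorithm for speech emotion recognition is proposed. First, the spectrogram is processed to obtain a normalized grayscale spectrogram image. Then, Gabor maps at different scales and orientations are computed, and local binary patterns (LBP) are used to extract texture features from the Gabor maps. Finally, the LBP features extracted from the Gabor maps at different scales and orientations are concatenated and used as a new speech emotion feature for emotion recognition. Experimental results on the Berlin database (EMO-DB) and the FAU AiBo database show that, compared with existing prosodic, frequency-domain and voice-quality features, the proposed feature improves the recognition rate by more than 3%; when fused with acoustic features, its recognition rate is at least 5% higher than that of the earlier acoustic features. Therefore, this new speech emotion feature can effectively recognize different types of emotional speech.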
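The three steps above (Gabor maps, LBP texture coding, concatenation) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the kernel size, wavelengths, number of orientations, the use of the absolute real Gabor response, and the basic 3x3 LBP operator are all choices made here for brevity, and every function name is hypothetical.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real part of a Gabor kernel as a size x size nested list (size is odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier runs along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel

def filter2d(image, kernel):
    """Correlate image with kernel, replicating edge pixels at the border."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-half, half + 1):
                ii = min(max(i + di, 0), h - 1)
                for dj in range(-half, half + 1):
                    jj = min(max(j + dj, 0), w - 1)
                    acc += image[ii][jj] * kernel[di + half][dj + half]
            out[i][j] = acc
    return out

# Clockwise 8-neighbourhood used to build each LBP code.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_histogram(image):
    """Normalized 256-bin histogram of basic 3x3 LBP codes over interior pixels."""
    h, w = len(image), len(image[0])
    hist = [0] * 256
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            code = 0
            for bit, (di, dj) in enumerate(NEIGHBOURS):
                if image[i + di][j + dj] >= image[i][j]:
                    code |= 1 << bit
            hist[code] += 1
    total = (h - 2) * (w - 2)
    return [count / total for count in hist]

def gabor_lbp_features(gray, wavelengths=(4.0, 8.0), n_orientations=4):
    """Concatenate LBP histograms of Gabor maps over all scales and orientations."""
    features = []
    for wavelength in wavelengths:  # assumed scales
        for k in range(n_orientations):  # assumed, evenly spaced orientations
            theta = k * math.pi / n_orientations
            kernel = gabor_kernel(5, wavelength, theta, sigma=0.5 * wavelength)
            response = filter2d(gray, kernel)
            gabor_map = [[abs(v) for v in row] for row in response]
            features.extend(lbp_histogram(gabor_map))
    return features
```

With two scales and four orientations, a normalized grayscale spectrogram passed as a nested list yields a feature vector of 2 x 4 x 256 values. In practice a complex Gabor magnitude and a uniform-pattern LBP variant may be preferable; the abstract does not say which variants were used.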
To address feature mismatch between experimental databases, a key problem in cross-corpus speech emotion recognition, an auditory attention model based on the Chirplet is proposed for feature extraction. First, to extract spectral features, the auditory attention model is employed to detect variational emotion features. Then, a selective attention mechanism model is proposed to extract the salient gist features, which show their relation to the expected performance in cross-corpus testing. Furthermore, Chirplet time-frequency atoms are introduced into the model. By forming a complete atom database, the Chirplet improves spectral feature extraction, including the amount of information captured. Because samples from multiple databases have multi-component characteristics, the Chirplet expands the scale of the feature vector in the time-frequency domain. Experimental results show that, compared with the traditional feature model, the proposed feature extraction approach with the prototypical classifier achieves a significant improvement in cross-corpus speech emotion recognition. In addition, the proposed method is more robust to inconsistent sources of the training and testing sets.
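To make the notion of a Chirplet time-frequency atom concrete, the following sketch generates a Gaussian chirplet and projects a signal onto it. It assumes the common Gaussian-envelope form with a linear frequency sweep; the abstract does not specify the atom parameterization or how the atom database is built, so the function names and parameters here are hypothetical.

```python
import cmath
import math

def chirplet_atom(n_samples, fs, t_c, f_c, chirp_rate, sigma):
    """Sampled Gaussian chirplet with unit energy in continuous time.

    t_c: centre time (s), f_c: centre frequency (Hz),
    chirp_rate: frequency sweep (Hz/s), sigma: envelope width (s).
    The instantaneous frequency is f_c + chirp_rate * (t - t_c).
    """
    norm = (math.pi * sigma ** 2) ** -0.25
    atom = []
    for n in range(n_samples):
        tau = n / fs - t_c
        envelope = norm * math.exp(-tau ** 2 / (2 * sigma ** 2))
        phase = 2 * math.pi * (f_c * tau + 0.5 * chirp_rate * tau ** 2)
        atom.append(envelope * cmath.exp(1j * phase))
    return atom

def projection(signal, atom, fs):
    """Magnitude of the inner product <signal, atom>, approximated by a Riemann sum."""
    return abs(sum(s * a.conjugate() for s, a in zip(signal, atom))) / fs
```

Setting `chirp_rate` to zero reduces the atom to a Gabor atom, so a dictionary that also varies the chirp rate covers more time-frequency shapes than a Gabor dictionary alone. A signal whose frequency sweeps over time projects more strongly onto an atom with a matching chirp rate than onto a fixed-frequency atom.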
To accurately identify speech emotion information, the discriminant-cascading effect in dimensionality reduction for speech emotion recognition is investigated. Based on the existing locality preserving projections and the graph embedding framework, a novel discriminant-cascading dimensionality reduction method is proposed, named discriminant-cascading locality preserving projections (DCLPP). The proposed method utilizes supervised embedding graphs and preserves the inner products of samples in the original space, so as to retain enough information for speech emotion recognition. The kernel DCLPP (KDCLPP) is then proposed to extend the mapping form. Experiments on the EMO-DB and eNTERFACE'05 corpora show that, with different categories of classifiers, the proposed method clearly outperforms common dimensionality reduction methods such as principal component analysis (PCA), linear discriminant analysis (LDA), locality preserving projections (LPP), local discriminant embedding (LDE) and graph-based Fisher analysis (GbFA).
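For orientation, the standard (unsupervised) LPP that the proposed method builds on can be stated as follows. This is the textbook formulation, not the DCLPP objective itself, which the abstract does not give. Here $X = [x_1, \dots, x_n]$ holds the samples as columns, $S_{ij}$ is the affinity between $x_i$ and $x_j$ in the neighbourhood graph, $D$ is the diagonal degree matrix and $L$ is the graph Laplacian.

```latex
\begin{aligned}
&\min_{w}\ \frac{1}{2}\sum_{i,j}\left(w^{\top}x_i - w^{\top}x_j\right)^{2} S_{ij}
  \;=\; \min_{w}\ w^{\top} X L X^{\top} w,
  \qquad \text{s.t. } w^{\top} X D X^{\top} w = 1, \\
&D_{ii} = \sum_{j} S_{ij}, \qquad L = D - S, \\
&X L X^{\top} w = \lambda\, X D X^{\top} w .
\end{aligned}
```

The projection directions are the generalized eigenvectors with the smallest eigenvalues $\lambda$. Supervised variants within the graph embedding framework typically replace $S$ with graphs built from class labels.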
To effectively conduct emotion recognition from spontaneous, non-prototypical and unsegmented speech, and thereby create a more natural human-machine interaction, a novel speech emotion recognition algorithm based on the combination of the emotional data field (EDF) and the ant colony search (ACS) strategy, called the EDF-ACS algorithm, is proposed. More specifically, the interrelationship among the turn-based acoustic feature vectors of different labels is established by using the potential function in the EDF. To perform spontaneous speech emotion recognition, the artificial colony is used to mimic the turn-based acoustic feature vectors. Then, the canonical ACS strategy is used to investigate the movement direction of each artificial ant in the EDF, which is regarded as the emotional label of the corresponding turn-based acoustic feature vector. The proposed EDF-ACS algorithm is evaluated on the continuous audio/visual emotion challenge (AVEC) 2012 dataset, which contains spontaneous, non-prototypical and unsegmented speech emotion data. The experimental results show that the proposed EDF-ACS algorithm outperforms the existing state-of-the-art algorithm in turn-based speech emotion recognition.
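The idea of a potential function linking labelled feature vectors can be illustrated with a small sketch. It assumes the Gaussian-like potential commonly used in data field theory with unit mass per sample, and it replaces the ant colony search with a plain greedy choice of the label whose field is strongest at the query point. The abstract specifies neither the potential function nor the search details, so this is a deliberate simplification with hypothetical names, not the EDF-ACS algorithm.

```python
import math

def potential(x, sources, sigma=1.0):
    """Data-field potential at x produced by a set of source vectors.

    Each source contributes exp(-(distance / sigma) ** 2); unit mass is assumed.
    """
    total = 0.0
    for source in sources:
        distance = math.dist(x, source)
        total += math.exp(-(distance / sigma) ** 2)
    return total

def strongest_label(x, labelled, sigma=1.0):
    """Greedy stand-in for the search step: pick the label with the highest potential at x."""
    return max(labelled, key=lambda label: potential(x, labelled[label], sigma))
```

For example, with `labelled = {"low": [(0.0, 0.0), (0.2, 0.1)], "high": [(3.0, 3.0), (2.8, 3.1)]}`, a query vector near the origin is attracted to the `"low"` sources. The impact factor `sigma` controls how far each labelled vector's influence reaches.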
To address the narrow tunability and low frequency of microwave signals generated by optical methods, a novel approach to stabilizing the tunable photonic microwave generated by a multi-wavelength Brillouin fiber laser is proposed and experimentally demonstrated. A single-longitudinal-mode Brillouin fiber laser is designed, and by using this laser, a multi-wavelength Brillouin fiber laser with more than eleven orders of Stokes waves is observed. The wavelength spacing between adjacent Stokes waves is 0.085 nm. If the Brillouin pump power is increased, the number of Stokes waves at the output can be further increased. Tunable microwave signals at 10.8 and 21.6 GHz are obtained by heterodyning the Rayleigh wave and the Stokes waves of the multi-wavelength Brillouin fiber laser. In the experiment, microwave signals at different frequencies are generated by tuning the pump wavelength and the temperature of the gain fiber. The tunable frequency range can be further expanded by using a temperature controller with a wider adjustment range, and the generated microwave signal exhibits high frequency stability.
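The reported wavelength spacing and beat frequencies can be cross-checked with the usual conversion between a small wavelength interval and a frequency interval. The pump wavelength is not stated in the abstract; the value of 1550 nm below is an assumption made only for this estimate.

```latex
\begin{aligned}
\Delta\nu &\approx \frac{c\,\Delta\lambda}{\lambda^{2}}
  = \frac{(2.998\times10^{8}\ \mathrm{m/s})\,(0.085\times10^{-9}\ \mathrm{m})}
         {(1550\times10^{-9}\ \mathrm{m})^{2}}
  \approx 10.6\ \mathrm{GHz}, \\
2 \times 10.8\ \mathrm{GHz} &= 21.6\ \mathrm{GHz}.
\end{aligned}
```

Under that assumption the estimate is close to the reported 10.8 GHz, with the gap within what the two-digit precision of the quoted spacing and the unknown pump wavelength would allow. The second reported frequency is exactly twice the first, as expected for beating against a Stokes wave two orders away.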