IROS 2019 International Conference Proceedings, paper 0746
Identification of Rat Ultrasonic Vocalizations from Mix Sounds of a Robotic Rat in a Noisy Environment

Chang Li, Student Member, IEEE, Qing Shi, Member, IEEE, Zihang Gao, Hiroyuki Ishii, Atsuo Takanishi, Qiang Huang, Fellow, IEEE, Toshio Fukuda, Life Fellow, IEEE

Abstract: Social interaction between a robot and rats is important because the robot can generate reproducible social behaviors across trials. However, the lack of internal-state feedback from the rat keeps current robot-rat interaction at a very preliminary level compared with rat-rat interaction. Previous biological studies showed that ultrasonic vocalizations (USVs) emitted by a rat express its internal emotional states, so they can be used as part of the feedback in a robot-rat interaction. The challenge is to accurately identify rat USVs in real time from the mix sounds generated by the robot in a noisy environment. To address this challenge, we propose an SVM-based rat USV identification method. The method uses three types of features to represent the characteristics of the mix sound and uses these multidimensional features to identify rat USVs. Results show that our identification method has an accuracy of 84.29% with a false-positive rate of only 4.84%. Furthermore, we carefully design the filter window length with respect to the sound chunk length and use only one microphone to record the mix sound. Both choices reduce the calculation time so that identification can run in real time. As a result, the identification process can be executed within 3.5 ms, which meets the real-time demand. This research lays the foundation for feedback-based interaction between rat and robot, and also shows promise for the study of ethology and of the interaction between robots and animals.

I. INTRODUCTION

Recently, the robotic rat has been widely used to study the behavior of laboratory rats [1], [2], explore the interactions between rats and the robot [3]-[5], and even mimic their perceptions [6], [7].
It is natural to use robots to study the behavior of the corresponding animals, because these bio-inspired robots give biologists tools for studying animal behavior [8]. For example, S. Heath et al. [9] designed a robotic rat and used it to interact with eight different rats, and they found that the rats responded differently to the different behavioral patterns of the robot. R. Ortiz et al. [10] controlled an e-puck robot to perform interaction behavior with some rats, and they found that the behaviors between the rats and the robot were similar to rat-rat interaction behaviors. We have also previously used a multi-link robotic rat to study rat-like pitch and yaw actions [11], and evaluated the motion [...]

*This work is supported in part by the National Natural Science Foundation of China (NSFC) under grant No. 61773058 and 61627808, the National Key R [...]

[...]

2) Rat USV identification should be completed in real time so that it can be used directly in the interaction between the robot and the rat. To this end, we carefully design the filter window length with respect to the sound chunk length and use only one microphone to record the mix sound. The results show that the maximum total delay of our method is only 3.5 ms;

3) To the best of our knowledge, this is the first research on rat USV identification from mix sound, which shows promise for the study of ethology and for interaction studies between robots and animals.

Fig. 1. System setup: a) Sound acquisition system (camera and microphone above the open field, rat, WR-5M robot; labelled dimension 2 m); b) Avisoft UltraSoundGate 416Hb; c) Some key parameters of b):

Resolution: 16 bit or 8 bit
Available sample rate (kHz): 300, 250, 214, 187.5, 155.6, 150, 125, 100, 75, 62.5, 50
Frequency response: 20 Hz to 140 kHz
Input sensitivity (dBV): Max trim -43.2; Min trim -3.2; Max trim step gain -28.4; Min trim step gain 1.6

II. RAT USVS ACQUISITION AND ANALYSIS

Because the rat emits ultrasound, common devices can hardly record it, let alone analyze it.
So, we first set up a system that can record audible as well as ultrasonic sound in real time. Then, the spectrograms of the rat USVs and of the robot sound are obtained in order to analyze their characteristics.

A. Acquisition System Setup

The acquisition system mainly consists of an experimental open field, a computer, a camera, and an ultrasound acquisition device kit. As shown in Fig. 1(a), the ultrasonic microphone and the camera are placed above the open field in order to record vocalizations as well as videos during experiments. The microphone transmits its signal to the Avisoft UltraSoundGate 416Hb (shown in Fig. 1(b)), the device used to process the USVs and save them to the computer. Its maximal frequency response and sample rate are 140 kHz and 300 kHz, respectively [29], which fully cover the range of rat USVs. Other key parameters of this device are shown in Fig. 1(c). A computer gathers all data and is responsible for data processing. Throughout the experimental period, the open field is surrounded by blackout cloth in order to provide a dusky environment for the rat.

As experimental subjects, we use Long-Evans rats (36 weeks) as the source of rat USVs and a multi-link robotic rat as the source of the robot's sound. This robot, called WR-5M and introduced in [11], has 13 degrees of freedom (DOFs) in total and is driven by 3 different types of motor: 6 DC motors, 3 servo motors, and 4 stepper motors. In such a scenario, the sound emitted by the robotic rat comes not only from its mechanism and friction but also from these motors.

With the help of this system, we separately collect samples of rat USVs and of the robot's sound. To make the rats vocalize, we put each rat individually into the open field for 15 minutes and record the sound during its exploration. Although there are many other ways to make a rat vocalize, such as tactile stimulation by the breeder or juvenile play [20], our way introduces less noise from the experimenter or from another rat. As for the robotic rat, we control it to perform a series of actions, such as grooming, upright rearing, or simply running, so that we capture its sound as comprehensively as possible.

The device's sample rate in this paper is set to 125 kHz. We adopt 125 kHz because, during the experiments, we rarely observed rat USVs exceeding 60 kHz. A 125 kHz sample rate, with its Nyquist frequency of 62.5 kHz, is therefore enough to record the rat USVs, and the lower rate helps the real-time processing of the signals.

Fig. 2. Spectrogram of a) rat USVs, and b) the robot sound starting at 7 s and stopping at 11.5 s. (Axes: Time (s), Freq. (kHz) from 0 to 60; color scale: Power/Decade (dB) from -40 to -10.)

B. Sound spectrograms and their analysis

Fig. 2(a) shows the spectrogram of rat USVs. Because the rats vocalize at random times during the experiment, the rat USV data used in this figure is trimmed and assembled from the vocalizations of 4 rats. According to this figure, the frequency components of rat USVs are mostly in the range of 40 kHz to 60 kHz, and these USVs last 30 ms to 50 ms. There are also constant, continuous frequency components around 20 kHz. These components are environmental noise from the computer, the ventilation equipment of the animal room, etc.

Fig. 2(b) shows the spectrogram of the robot's sound, starting at 7 s and stopping at 11.5 s. The frequency components of the robot sound are complex and continuous, and most of the time the harmonics of the robot sound extend into the 50-kHz USV band of the rat. These harmonics are the reason why we cannot detect rat USVs simply by checking for the presence of energy between 40 kHz and 60 kHz.
Besides, there are two bursts around 10 s and 12 s, which are caused by the fast running of the servo motor. They also influence our detection of rat USVs.

III. SVM-BASED RAT USVS IDENTIFICATION

A. Preprocessing

Because of the noise around 20 kHz mentioned in II-B, the raw sound data needs to be filtered to suppress its influence. Here, we designed a Hamming-window FIR high-pass filter to do this work. The cutoff frequency of this filter is set to 37.5 kHz so that the noise can be fully suppressed without influencing the 50-kHz rat USVs. We further set the order of this filter to 128 in order to balance the accuracy of the cutoff frequency against the filter's group delay. At a 125 kHz sample rate, the group delay is only about 0.5 ms (half the filter order, i.e. 64 samples), which hardly influences the real-time performance of the identification process. The magnitude and impulse responses are shown in Fig. 3. Besides, a common short-time Fourier transform (STFT) is applied to the filtered sound data for the feature calculations in the frequency domain.

Fig. 3. Hamming window FIR filter with the cut-off frequency at 37.5 kHz and the order of 128: a) Magnitude response in dB; b) Impulse response.

B. Features

Sound features describe a sound from different points of view, such as the temporal, energy, and spectral views [30], and they can be used in a variety of applications [31] such as coding, automatic score following, or analysis-synthesis. The choice of features is important, since the sound features are used to train the SVM and the selected features have a large effect on the recognition accuracy [27]. The features should therefore be chosen carefully. In this paper, we selected 7 different features from 3 points of view to represent the characteristics of the mix sound.

1) Zero-Cross Rate (ZCR): The zero-cross rate, a temporal feature, describes the rate at which the sound wave crosses the zero axis. The definition of ZCR is shown in (1).
ZCR is very effective in distinguishing rat USVs when there is little robot noise. A sound chunk that contains rat USVs tends to have a higher ZCR. Filtering the sound data with the high-pass filter therefore contributes greatly to the ZCR, because low-frequency components would greatly decrease this value.

$$\mathrm{ZCR} = \frac{1}{N}\sum_{n=2}^{N} \mathbb{1}_{\mathbb{R}_{<0}}\left(S_n S_{n-1}\right) \tag{1}$$

where $S_n$ is the $n$th value of the sound data chunk, $N$ is the length of the sound chunk, and $\mathbb{1}_{\mathbb{R}_{<0}}$ is an indicator function that equals 1 when its argument is negative and 0 otherwise.

2) Energy Entropy (ENE): Energy entropy describes the sound from the energy point of view and reflects how dispersed the energy is. A filtered sound chunk that does not contain rat USVs has a large energy entropy, while a chunk that contains rat USVs has a small energy entropy. Energy entropy may therefore help us determine whether rat USVs are present in an environment where robot noise exists. Filtering the data also contributes to the ENE value, because it highlights the high-frequency part so that a rat USV results in a greater change of ENE. Its definition is shown in (2).

$$\mathrm{ENE} = -\sum_{n=1}^{N} p(S_n)\,\lg p(S_n) \tag{2}$$

where $p(S_n)$ is the probability defined as follows:

$$p(S_n) = \frac{S_n^2}{\sum_{n=1}^{N} S_n^2}$$

3) Spectral Shape Features: In the frequency domain, the spectrum carries much information about the components of the sound, so we can extract more features in this domain. In this paper, spectral centroid (SCE), spectral spread (SSP), spectral skewness (SSK), spectral kurtosis (SKU), and spectral roll-off (SRO) are used to represent the sound characteristics. Their definitions are shown in (3) to (7).

$$\mathrm{SCE} = \frac{\mu}{F} = \frac{1}{F}\sum_{f=0}^{F} A_f\,P(A_f) \tag{3}$$

$$\mathrm{SSP} = \frac{\sigma}{F} = \frac{1}{F}\sqrt{\sum_{f=0}^{F} (A_f - \mu)^2\,P(A_f)} \tag{4}$$

$$\mathrm{SSK} = \frac{1}{\sigma^3}\sum_{f=0}^{F} (A_f - \mu)^3\,P(A_f) \tag{5}$$

$$\mathrm{SKU} = \frac{1}{\sigma^4}\sum_{f=0}^{F} (A_f - \mu)^4\,P(A_f) \tag{6}$$

$$\mathrm{SRO} = \frac{F_{RO}}{F} \tag{7}$$

In these equations, $A_f$ is the amplitude at frequency $f$, $F$ is the Nyquist frequency, which is half of the sample rate, $P(A_f)$ is the probability defined in (8), and $F_{RO}$ is the roll-off frequency defined in (9).
$$P(A_f) = \frac{A_f}{\sum_{f=0}^{F} A_f} \tag{8}$$

$$\sum_{f=0}^{F_{RO}} A_f^2 = 0.95 \sum_{f=0}^{F} A_f^2 \tag{9}$$

These spectral features describe the shape, or in other words the distribution, of the spectrum of the sound. If rat USVs exist in the sound frame, these features change slightly, and the SVM can detect this change and make a correct identification.

Finally, all the above features compose the feature vector shown below:

$$\mathbf{x} = [\mathrm{ZCR}, \mathrm{ENE}, \mathrm{SCE}, \mathrm{SSP}, \mathrm{SSK}, \mathrm{SKU}, \mathrm{SRO}]^T$$

This vector, accompanied by its label $l$ (rat USVs or non-rat USV), can be used in the SVM training process. The feature vector of a mix sound can likewise be used in SVM identification.

C. SVM Training and Identification

Suppose that we have $M_r$ rat USV training samples and $M_{nr}$ non-rat USV ones. The feature vectors of these samples and their labels compose a feature matrix $X$ and a corresponding label vector $\mathbf{l}$:

$$X = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_{M_r+M_{nr}}]$$

$$\mathbf{l} = [l_1, l_2, \ldots, l_{M_r+M_{nr}}]$$

Then, SVM training can be conducted using $X$ and $\mathbf{l}$. We do not discuss the training process in depth because it is not the focus of this paper. After training, we get the support vector matrix $X_s$, the weight vector, and the bias $b$:

$$X_s = [\mathbf{x}^s_1, \mathbf{x}^s_2, \ldots, \mathbf{x}^s_{m_r+m_{nr}}] \subseteq X$$

where $m_r$ and $m_{nr}$ are the numbers of support vectors of rat USVs and non-rat USV, respectively (with $m_r$ ranging between 0 and $M_r$, [...]

Fig. 4. The flow chart of SVM-based rat USVs identification. The red dashed line represents the flow when using the SVM to identify new, unknown mix sound data. (Feature groups in the chart: Frequency: SCE, SSP, SSK, SKU, SRO; Energy: ENE; Temporal: ZCR.)

[...] get mix sound samples as comprehensive as possible. After that, we trimmed the sound with a 4096-sample window and separated the sound chunks into two categories: rat USVs and non-rat USVs. We select a 4096-sample window because this length, which lasts about 33 ms, is similar to the duration of rat USVs, and it also speeds up the calculation in the STFT process. Finally, a total of 82 rat USV samples and 84 non-rat USV samples are obtained.
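As an illustration of how one such chunk could be turned into the two time-domain features, the following is a minimal sketch of equations (1) and (2) in plain Python. It is not the authors' implementation (the paper uses Matlab); the function names and the `hop` parameter are hypothetical, the base-10 logarithm follows the `lg` notation of (2), and zero-energy samples are skipped in the entropy sum by assumption.

```python
import math

WINDOW = 4096             # chunk length in samples, as stated in the text
SAMPLE_RATE_HZ = 125_000  # sample rate used in the paper


def chunks(samples, window=WINDOW, hop=WINDOW):
    """Yield windows of `window` samples, advancing by `hop` samples (hypothetical helper)."""
    for start in range(0, len(samples) - window + 1, hop):
        yield samples[start:start + window]


def zero_cross_rate(chunk):
    """Eq. (1): count adjacent sample pairs with a negative product, divided by N."""
    crossings = sum(1 for a, b in zip(chunk, chunk[1:]) if a * b < 0)
    return crossings / len(chunk)


def energy_entropy(chunk):
    """Eq. (2): entropy of the per-sample energy distribution, using log base 10."""
    total = sum(s * s for s in chunk)
    if total == 0:
        return 0.0  # assumption: a silent chunk is given zero entropy
    entropy = 0.0
    for s in chunk:
        p = s * s / total
        if p > 0:  # assumption: terms with p = 0 contribute nothing
            entropy -= p * math.log10(p)
    return entropy


# Duration of one chunk in milliseconds (4096 samples at 125 kHz).
chunk_ms = 1000 * WINDOW / SAMPLE_RATE_HZ
```

For an alternating toy signal such as `[1, -1, 1, -1]`, every adjacent pair changes sign, so the sketch gives a ZCR of 3/4, and the energy is spread evenly over the four samples, which is the maximum-entropy case for that length.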
These samples cover most of the situations that occur in a real interaction process, so that the training result is suitable for most situations. After training the SVM model in Matlab (R2018b), we get 28 support vectors in total: 14 of them correspond to rat USVs and the other 14 to non-rat USV.

B. SVM Identification

To test the identification ability of the trained SVM model, we randomly mixed another 62 chunks of rat USVs into the robot sound and used the resulting mix sound for identification. Its spectrogram is shown in Fig. 5(a). The reason why we manually mix rat USVs with the robot sound is similar to that presented in IV-A. Then, we use a sliding window with a length of 4096 samples and half overlap to observe the data and calculate its features. This process is shown in Fig. 4, where the red dashed line indicates the data flow of SVM identification. Using equation (10), we get the identification result shown in Fig. 5(b). In this figure, the solid red line indicates that the distance between the feature vector of the detected sound chunk and the hyperplane determined by the support vectors is positive, while the blue dashed line indicates that the distance is negative. A positive distance means that rat USVs may exist in the sound chunk, and the larger the distance, the greater the likelihood that rat USVs are present in the chunk; the converse holds for a negative distance. After statistical analysis, we find that the accuracy of the identification is 84.29%, with a false-positive rate of 4.84% and a false-negative rate of 12.90%. Despite the errors, the result is still considered to meet the demand of rat USV detection.

Besides, we further analyzed the necessity of each feature in the identification, as shown in Table I. We identified the same mix sound with one feature left out at a time and counted the number of support vectors as well as the false-negative rate and the false-positive rate.
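The leave-one-feature-out comparison just described amounts to removing one component from every feature vector before retraining and retesting. A minimal sketch of that column removal is given here; the helper name `drop_feature` is hypothetical, the feature order follows the feature vector of Section III-B, and the retraining step itself is not shown.

```python
# Feature order as listed in the feature vector x of Section III-B.
FEATURE_NAMES = ["ZCR", "ENE", "SCE", "SSP", "SSK", "SKU", "SRO"]


def drop_feature(vectors, dropped, names=FEATURE_NAMES):
    """Return (remaining names, vectors without the `dropped` component).

    Hypothetical helper: `vectors` is a list of 7-element feature vectors.
    """
    index = names.index(dropped)  # raises ValueError for an unknown feature name
    kept_names = [name for name in names if name != dropped]
    reduced = [
        [value for i, value in enumerate(vector) if i != index]
        for vector in vectors
    ]
    return kept_names, reduced


# Made-up numbers, only to show the shape of the data (not measured values).
example_vectors = [[0.41, 2.9, 0.70, 0.10, -0.3, 2.8, 0.92]]
names_without_sro, vectors_without_sro = drop_feature(example_vectors, "SRO")
```

Each row of Table I would then correspond to one call with a different `dropped` name, followed by training and testing on the reduced vectors.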
According to the results, SRO is the most important feature in rat USV identification, because the false-positive rate increases greatly if SRO is not used. Also, if ENE or SKU is not taken into consideration, the false-negative rate increases a lot. The absence of the other features influences the result to various degrees as well. Moreover, the absence of any feature increases the number of support vectors, which may lower the computation speed during identification. So, all features are indispensable in the identification process.

Fig. 5. a) Filtered spectrogram of the mix sound of rat USVs and the robot sound. b) SVM identification result of a). In b), the solid red line indicates that C is positive, which means rat USVs are identified at that moment; the blue dashed line indicates that C is negative, which means there are no rat USVs at that moment.

TABLE I
IDENTIFICATION COMPARISON

Features        m_r   m_nr   False-positive   False-negative
All Features    14    14     4.85%            12.90%
Non-ZCR         20    20     16.13%           6.45%
Non-ENE         25    25     9.68%            19.35%
Non-SCE         21    17     6.45%            16.13%
Non-SSP         19    17     12.90%           14.52%
Non-SSK         18    20     14.52%           11.29%
Non-SKU         19    18     12.90%           19.35%
Non-SRO         25    24     25.81%           12.90%

C. Time Delay

As mentioned before, the SVM identification must be fast enough to meet the real-time requirement, so we calculate the time delay of the method. The time delay in this paper comes from two parts: the group delay of the filter and the calculation time needed by the method, as shown in (11). Note that we assume there is no delay in the ultrasound acquisition device, which means the sound is recorded immediately by the device.

$$T = T_G + T_C \tag{11}$$

where $T_G$ is the group delay and $T_C$ is the delay caused by calculations. In this paper, the group delay $T_G$ is about 0.5 ms as mentioned before, while the calculation time $T_C$ varies from 1 ms to 3 ms (on Windows 7 64-bit with 96 GB RAM). So the total time delay $T$ is about 3.5 ms at most.
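The arithmetic behind this bound can be checked with a short sketch. It assumes the Hamming-window FIR filter has symmetric taps and therefore linear phase, so that its group delay is half the filter order in samples; the function names are hypothetical, and the 1 ms and 3 ms calculation times are the range quoted in the text.

```python
SAMPLE_RATE_HZ = 125_000  # sample rate used in the paper
FILTER_ORDER = 128        # FIR filter order from Section III-A


def fir_group_delay_s(order, sample_rate_hz):
    """Group delay in seconds of a linear-phase FIR filter: order/2 samples."""
    return (order / 2) / sample_rate_hz


def total_delay_s(group_delay_s, calc_time_s):
    """Eq. (11): T = T_G + T_C."""
    return group_delay_s + calc_time_s


t_g = fir_group_delay_s(FILTER_ORDER, SAMPLE_RATE_HZ)
t_best = total_delay_s(t_g, 1e-3)   # fastest reported calculation time
t_worst = total_delay_s(t_g, 3e-3)  # slowest reported calculation time

print(f"T_G = {t_g * 1e3:.3f} ms")
print(f"T ranges from {t_best * 1e3:.3f} ms to {t_worst * 1e3:.3f} ms")
```

With an order of 128 the group delay is 64 samples, which at 125 kHz is 0.512 ms, in line with the "about 0.5 ms" quoted in the text.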
This delay is low enough to meet the real-time requirement.

V. CONCLUSION AND FUTURE WORK

In this paper, we
