




IEEE Robotics and Automation Letters (RAL) paper presented at the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, November 4-8, 2019

Decoding the Perceived Difficulty of Communicated Contents by Older People: Toward Conversational Robot-Assistive Elderly Care

Soheil Keshmiri¹, Hidenobu Sumioka¹, Ryuji Yamazaki², and Hiroshi Ishiguro¹,³

*This research was supported by JST CREST Grant Number JPMJCR18A1, JSPS KAKENHI Grant Number JP19K20746, and ImPACT Grant Number 2014-PM11-07-01.
¹Soheil Keshmiri and Hidenobu Sumioka are with Advanced Telecommunications Research Institute International (ATR), Kyoto, Japan. {soheil,sumioka}@atr.jp
²Ryuji Yamazaki is with School of Social Sciences, Waseda University, Japan. rys@aoni.waseda.jp
³Hiroshi Ishiguro is with Graduate School of Engineering Science, Osaka University, Japan, and is the Visiting Director of Hiroshi Ishiguro Laboratories (HIL) at ATR. ishiguro@sys.es.osaka-u.ac.jp
Copyright 2019 IEEE

Abstract: In this study, we propose a semi-supervised learning model for decoding the perceived difficulty of communicated content by older people. Our model is based on mapping older people's prefrontal cortex (PFC) activity during their verbal communication onto fine-grained cluster spaces of a working memory (WM) task that induces loads on the human PFC through modulation of its difficulty level. This allows for differential quantification of the observed changes in the pattern of PFC activation during verbal communication with respect to the difficulty level of the WM task. We show that such a quantification establishes a reliable basis for categorization and subsequent learning of the PFC responses to more naturalistic contents such as story comprehension. Our contribution is to present evidence on the effectiveness of our method for estimation of older people's perceived difficulty of the communicated contents during an online storytelling scenario.

I. INTRODUCTION

A distinct attribute of robots in comparison with other media is their physical embodiment, which allows for a sense of togetherness [1]. Research suggests that children who read with a learning-companion robot consider their reading companion to support their reading comprehension and that it motivates a deepening social connection [2]. Along the same direction, Mann et al. [3] find that people are more responsive to robots than to computer-based healthcare systems. Additionally, Keshmiri et al. [4] identify that telecommunicating through a humanoid results in older people's brains exhibiting an activation pattern similar to that of in-person communication.

These findings unanimously identify the potential of robots for improving the accessibility, consistency, and quality of our public and medical care services. At the same time, they also imply the necessity for increased social interaction ability of robots [5] if we are to harness their potential and positive impact on our social lives in earnest. After all, social interaction is a bidirectional communication channel, and interactive media that can comprehend their human companions' expectations and respond accordingly are the minimum requirement if such interactions and relationships are to last long.

However, enabling robots to interact with humans is a complex task, and even more so when it comes to humans' verbal communication: a conversation that resonates with one person may not sound the same to another, people lose their attention at different paces, and individuals perceive the difficulty of a topic in their own ways. Despite substantial advances in facial feature analysis [6], facial expressions may not be as informative in the case of verbal communication. For instance, a frowning face while listening to a conversation might signal attention or difficulty in following a statement rather than discomfort or anger. Such contextual effects during verbal communication are highly subjective (i.e., they vary from individual to individual) and internalized: they may not be immediately available through conventional responses such as facial expression.

The brain, as the basis for behavioural responses, can help alleviate some of these shortcomings. In particular, a brain-based approach to human-robot interaction is well-suited for verbal communication in which robotic media need to track the complexity of the conversational topic as perceived by their human companions in order to sustain their interaction through modulation of the communicated content. Such an ability can be especially helpful when these agents interact with individuals who struggle with expressing themselves (e.g., overstressed or shy persons and individuals with such diseases as selective mutism).

In this article, we aim at online estimation of older people's perceived difficulty of communicated contents during verbal communication based on the pattern of their prefrontal cortex (PFC) activation. We focus on storytelling as a first step toward decoding conversational communication, since stories' scripts can be kept intact and repeated to different individuals without any change in their contents, thereby allowing for the control of such confounders as subtle differences in conveyed information. In this context, the core issue is how to evaluate an individual's perceived difficulty of verbally communicated content, considering the lack of an objective quantification for such perceptions. Here, we hypothesize that the perceived difficulty of a verbal communication is reflected in the cognitive load that a person experiences. In cognitive psychology, cognitive load refers to the effort that is endured by the working memory (WM): the core component of human cognition that includes language comprehension [7]. Previous studies have formulated such simple WM tasks as mental arithmetic (MA) [8] and n-back [9] to quantitatively evaluate the level of cognitive load. Furthermore, functional imaging has provided considerable evidence that the neural correlates of the WM process reside in the PFC [8], [9], [10].

We propose to evaluate the perceived difficulty of communicated contents during verbal communication via cognitive loads that are estimated based on brain activities during simple WM tasks. Specifically, we first organize cluster spaces that are formed through application of the K-means algorithm [11] to the near-infrared spectroscopy (NIRS) time series of older people's PFC activity in response to cognitive load induced by an auditory n-back (n = 1, 2) task (referred to as
NBT hereafter). In this task, participants are required to recall the reoccurrences of sequential (i.e., n = 1) or every-other (i.e., n = 2) occurrence of numerical values (1 through 9). We use NBT since it forms a better basis for quantification of the verbally communicated contents, considering its effect on PFC [12] and its ability to identify the change in PFC activation in response to individuals' emotions and change in mood [10]. Next, we map older people's PFC activity during an easy/hard listening task (referred to as EHL hereafter) onto NBT clusters. This mapping serves as a refinement that allows for objective quantification of the brain activation during verbal communication based on well-defined clusters of the n-back, thereby including the PFC information during EHL that is not available in a pure WM task setting. EHL is designed to induce different levels of cognitive load on older people's PFC by modulating its communicated information. This mapping process results in quantification of the frontal activities during EHL according to their proximity to the NBT clusters' centroids (i.e., their centers): a process referred to as cross-labeling (e.g., label 1 if PFC activity is closer to the n = 1 cluster, or 2 otherwise). Last, we use these cross-labeled PFC activities to train a linear supervised classifier for decoding older people's PFC responses to online communicated topics.

We show that our method can capture the cognitive load of older people during a natural storytelling scenario and that its estimation is associated with the older people's self-assessment of the difficulty of the story. Our contribution to human-robot interaction is to form the first (to the best of our knowledge) preliminary step toward a conversation-based robot-assistive elderly
care via enabling these media to predict the difficulty of their verbally communicated content (e.g., while telling a story in an elderly care facility) as perceived by the older people.

II. METHODOLOGY

Figure 1 shows an overview of our method. It consists of five steps: A) feature extraction, i.e., calculating the information content of the brain activity; B) cluster formation using the participants' PFC activity in response to cognitive load induced by NBT; C) NBT-based cross-labeling of older people's PFC activities during EHL, which involves labeling them based on their proximity to the NBT clusters' centroids (e.g., label 1 if PFC activity is closer to the n = 1 cluster, or 2 otherwise); D) training a linear supervised model with cross-labeled EHL data; and E) online estimation of the perceived difficulty of stories. In the following sections, we explain each step in detail.

Fig. 1. Model's schematic diagram. (A) DE feature vectors for PFC activity of the participants in response to one- and two-back WM tasks were calculated using equation (1). (B) These feature vectors were used to form clusters C1 and C2 through application of the K-means algorithm [11]. (C) C1 and C2 clusters were utilized for labeling the DE feature vectors of EHL time series of PFC activity of human subjects by mapping these vectors onto C1 and C2 clusters based on their L2-norm (i.e., Euclidean) distance to the centroids of C1 and C2 (i.e., their respective centers), resulting in formation of EHL-based clusters L (short for lower cognitive load) and H (short for higher cognitive load). (D) This cross-labeled data was further used for training a linear supervised classifier [13] for online classification of PFC activity of older people in response to communicated contents. During training, 80.0% of EHL was used as training data. We used the remaining EHL data for cross-validation (CV). (E) The trained linear supervised model was used for online estimation of the perceived difficulty of communicated contents by older people during conversation. (F) Once the session was over, the model counted the number of DE feature vectors that were classified as members of the L or H clusters. Subsequently, it labeled the session as difficult/easy if the number of DE feature vectors assigned to H/L during the session was larger than those in L/H, returning this count along with the average of the L2-norms of the DE feature vectors of the selected cluster.

A. Choice of Feature Space

Previous results [13], [14] indicated that differential entropy (DE) (i.e., the average information content of a continuous random variable) significantly outperforms feature spaces that are predominantly used for classification of f/NIRS time series of human subjects' PFC activity. Due to these results, we used a linear estimate of DE for extracting features of the PFC activity.

B. Clustering of NIRS Time Series of n-back WM Task

Figure 1 (A) and (B) show this process. We formed our n-back WM cluster spaces through application of the K-means algorithm [11] with two centroids to DE feature vectors of every five-second-long NIRS time series of PFC activity during one- and two-back WM tasks. This resulted in formation of two clusters (i.e., C1 and C2 clusters, Figure 1 (B)). We computed a DE feature vector (i.e., V in Figure 1 (A)) for a given n-back NIRS time series of PFC activity of each participant as [15]:

$$H(X_j) = \frac{1}{2}\log\left(2\pi e\,\sigma_{X_j}^{2}\right) \tag{1}$$

where $\sigma_{X_j}^{2}$ is the variance of the $j$th non-overlapping segment $X_j$ of the entire time series $X$ of the participant's PFC activity. It is worth noting that the interpretation of C1 and C2 as representatives of PFC activation in response to easy/difficult communicated contents finds evidence in differential PFC activation in response to one- and two-back WM tasks [9]. In this study, we used data from [13] that pertained to twenty-eight adults' frontal activities (eleven males and seventeen females, M = 30.96, SD = 10.84) who performed one- and two-back tasks.

C. NBT-Based Cross-Labeling of EHL

Figure 1 (C) illustrates this step. We mapped DE feature vectors of participants' NIRS PFC activity during EHL onto C1 and C2 cluster spaces based on their L2-norm distances (i.e., Euclidean distance) to the centroids of these clusters. We labeled these vectors as easy (i.e., 1) if they were closer to C1's center or difficult (i.e., 2) if they were closer to C2's center. This resulted in formation of clusters L (short for lower cognitive load) and H (short for higher cognitive load) that excluded NBT and were solely based on EHL. As a result, the EHL's labeling with respect to NBT established a correspondence between PFC activity in response to verbal communication and clusters of NBT.

D. Training a Linear Supervised Model

Figure 1 (D) shows this process. We used 80.0% of EHL cross-labeled data for training while utilizing the remaining 20.0% for cross-validation (CV) to train our linear supervised classifier. We used the linear supervised classifier in [13] that is based on a modified canonical linear regression. We chose this model due to its significantly improved accuracy in comparison with classifiers predominantly adopted for NIRS-based n-back WM tasks in the literature.

During the training, we adopted a brute-force search that started with a single feature (i.e., feature vectors of length one) through ten (i.e., feature vectors of length ten). For each of these lengths, we also checked whether inclusion of polynomial degrees to
capture the interaction between the elements of a given feature vector can improve the performance. Therefore, we checked for polynomial degrees zero (i.e., no polynomial feature) through seven. We found that feature vectors of length four combined with a polynomial degree of two yielded the highest prediction accuracy. Therefore, we used the length-four feature vectors with a polynomial degree of two.

E. Online Estimation of the Perceived Difficulty

We used our trained linear model for online estimation of the perceived difficulty of communicated contents by older people during the storytelling experiment. At every prediction cycle (i.e., every 20 seconds in the current implementation), our model summarized the current PFC activity time series of the older people into its calculated feature vector. Next, it utilized the trained linear model to estimate the correspondence between the feature vector of the current PFC activity and the two clusters. It then returned the magnitude of the induced difficulty of the communicated topic at that prediction cycle (i.e., the L2-norm of its feature vector) along with its estimated label (i.e., whether closer to L's or H's centroid). The older people's perceived difficulty was estimated based on the total number of DE feature vectors that were classified as members of the L or H clusters.

III. EXPERIMENTS

We conducted two experiments to verify the ability of our model to capture older people's perceived difficulty of verbal communication. In the first experiment, we verified that the model trained with the data recorded during the EHL task had the ability to classify the NBT. In the second, we verified that the trained model was capable of estimating the perceived difficulty during storytelling (i.e., STE). Considering the two-class labeling in our approach (i.e., L = 1 and H = 2), the chance level accuracy was 50.0%.

All participants were free of neurological and psychiatric disorders and had no history of hearing impairment. Subjects were seated in an armchair with head support in a sound-attenuated testing chamber, with instructions to fully relax with their eyes closed. All experiments were carried out with written informed consent from all subjects.

We used a minimalist-design humanoid called Telenoid (Figure 2 (b)) in our experiments. Motion of Telenoid was generated based on the voice of the operator, using an online speech-driven head motion system [16]. We placed Telenoid on a stand at a distance of approximately 1.4 meters from the seat of the participant (Figure 2 (a)).

Near-infrared spectroscopy (NIRS) [17] was used to collect PFC activity of the participants. We chose NIRS due to its non-invasive operational setup, portability, and relative immunity to body movement [18]. In our experiments, we acquired NIRS time series data of the participants using a [...]

Fig. 2. (a) Experimenter demonstrates experimental setup. (b) Telenoid.

[...]ing a one-minute-long resting data that was then followed by its corresponding topic. We kept the communicated contents
intact in all the sessions. However, we randomized the order of easy/difficult contents among participants. Every subject participated in all of these settings.

For model verification, we used the original labeling of NBT data for one- and two-back WM tasks (i.e., prior to K-means clustering) from [13]. This enabled us to determine whether induced cognitive loads during NBT formed a proper basis for quantification of the cognitive demands on PFC during verbal communication. We considered our model's prediction a true positive (tp) if its estimation and the NBT's original label were both 2 (i.e., difficult, Section II-C). Similarly, we considered it a true negative (tn) if estimated and original labels were both 1 (i.e., easy, Section II-C). Otherwise, we considered the estimate a false positive (fp) (i.e., predicted label = 2 and original label = 1) or a false negative (fn) (i.e., predicted label = 1 and original label = 2).
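
The outcome bookkeeping described above can be sketched in a few lines. This is a minimal illustration, assuming labels take values in {1, 2} with 2 (difficult) treated as the positive class, as the text states. The function name `confusion_counts` and the example label lists are hypothetical, and the accuracy line is added only for illustration; it is not a metric or result quoted from the paper.

```python
POSITIVE = 2  # "difficult" (H); the text treats label 2 as the positive class
NEGATIVE = 1  # "easy" (L)


def confusion_counts(predicted, original):
    """Tally tp, tn, fp, fn for two equal-length sequences of labels in {1, 2}."""
    if len(predicted) != len(original):
        raise ValueError("predicted and original must have the same length")
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for p, o in zip(predicted, original):
        if p == POSITIVE and o == POSITIVE:
            counts["tp"] += 1  # both say difficult
        elif p == NEGATIVE and o == NEGATIVE:
            counts["tn"] += 1  # both say easy
        elif p == POSITIVE and o == NEGATIVE:
            counts["fp"] += 1  # predicted difficult, originally easy
        elif p == NEGATIVE and o == POSITIVE:
            counts["fn"] += 1  # predicted easy, originally difficult
        else:
            raise ValueError(f"labels must be 1 or 2, got {p!r} and {o!r}")
    return counts


# Hypothetical labels, for illustration only (not data from the study).
predicted = [2, 2, 1, 1, 2, 1]
original = [2, 1, 1, 2, 2, 1]
counts = confusion_counts(predicted, original)
accuracy = (counts["tp"] + counts["tn"]) / len(original)
print(counts, accuracy)
```

With two balanced classes, an accuracy computed this way would be compared against the 50.0% chance level stated earlier in this section.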