IEEE Robotics and Automation Letters (RAL) paper presented at the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, November 4-8, 2019

Decoding the Perceived Difficulty of Communicated Contents by Older People: Toward Conversational Robot-Assistive Elderly Care

Soheil Keshmiri1, Hidenobu Sumioka1, Ryuji Yamazaki2, and Hiroshi Ishiguro1,3

long.

However, enabling robots to interact with humans is a complex task, and even more so when it comes to human verbal communication: a conversation that resonates with one person may not sound the same to another,
people lose attention at different paces, and individuals perceive the difficulty of a topic in their own ways. Despite substantial advances in facial feature analysis [6], facial expressions may not be as informative in the case of verbal communication. For instance, a frowning face while listening to a conversation might signal attention or difficulty in following a statement rather than discomfort or anger. Such contextual effects during verbal communication are highly subjective (i.e., they vary from individual to individual) and internalized: they may not be immediately available through conventional responses such as facial expression.

The brain, as the basis for behavioural responses, can help alleviate some of these shortcomings. In particular, a brain-based approach to human-robot interaction is well-suited for verbal communication in which robotic media need to track the complexity of a conversational topic as perceived by their human companions in order to sustain the interaction through modulation of the communicated content. Such an ability can be especially helpful when these agents interact with individuals who struggle to express themselves (e.g., overstressed or shy persons and individuals with conditions such as selective mutism).

In this article, we aim at online estimation of older people's perceived difficulty of communicated contents during verbal communication based on the pattern of their prefrontal cortex (PFC) activation. We focus on storytelling as a first step toward
decoding of conversational communication

Abstract: In this study, we propose a semi-supervised learning model for decoding the perceived difficulty of communicated content by older people. Our model is based on mapping older people's prefrontal cortex (PFC) activity during verbal communication onto fine-grained cluster spaces of a working memory (WM) task that induces loads on the human PFC through modulation of its difficulty level. This allows for differential quantification of the observed changes in the pattern of PFC activation during verbal communication with respect to the
difficulty level of the WM task. We show that such a quantification establishes a reliable basis for categorizing and subsequently learning the PFC responses to more naturalistic contents such as story comprehension. Our contribution is to present evidence for the effectiveness of our method in estimating older people's perceived difficulty of the communicated contents during an online storytelling scenario.

I. INTRODUCTION

A distinct attribute of robots in comparison with other media is their physical embodiment, which allows for a sense of togetherness [1]. Research suggests that children
who read with a learning-companion robot consider their reading companion to support their reading comprehension and that it motivates a deepening social connection [2]. Along the same direction, Mann et al. [3] find that people are more responsive to robots than to computer-based healthcare systems. Additionally, Keshmiri et al. [4] identify that telecommunicating through a humanoid results in older people's brains exhibiting an activation pattern similar to that of in-person communication.

These findings consistently point to the potential of robots for improving the accessibility, consistency, and quality of our public and medical care services. At the same time, they also imply the necessity of increased social interaction ability in robots [5] if we are to harness their potential and positive impact on our social lives in earnest. After all, social interaction is a bidirectional communication channel, and interactive media that can comprehend their human companions' expectations and respond accordingly are the minimum requirement if such interactions and relationships are to last

stories' scripts can be kept intact and repeated to different individuals without any change in their contents,
thereby allowing for the control of such confounders as subtle differences in conveyed information. In this context, the core issue is how to evaluate an individual's perceived difficulty of a verbally communicated content, considering the lack of an objective quantification for such perceptions. Here, we hypothesize that the perceived difficulty of a verbal communication is reflected in the cognitive load that a person experiences. In cognitive psychology, cognitive load refers to the effort that is borne by working memory (WM): the core component of human cognition that includes language comprehension [7]. Previous studies have formulated such simple WM tasks as mental arithmetic (MA) [8] and n-back [9] to quantitatively evaluate the level of cognitive load. Furthermore, functional imaging has provided considerable evidence that the neural correlates of the WM process reside in the PFC [8, 9, 10].

*This research was supported by JST CREST Grant Number JPMJCR18A1, JSPS KAKENHI Grant Number JP19K20746, and ImPACT Grant Number 2014-PM11-07-01.
1 Soheil Keshmiri and Hidenobu Sumioka are with Advanced Telecommunications Research Institute International (ATR), Kyoto, Japan. Hiroshi Ishiguro is with the Graduate School of Engineering Science, Osaka University, Japan. {soheil,sumioka}@atr.jp
2 Ryuji Yamazaki is with the School of Social Sciences, Waseda University, Japan. rys@aoni.waseda.jp
3 Hiroshi Ishiguro is with the Graduate School of Engineering Science, Osaka University, Japan, and is the Visiting Director of Hiroshi Ishiguro Laboratories (HIL) at ATR. ishiguro@sys.es.osaka-u.ac.jp
Copyright 2019 IEEE

We propose to evaluate the perceived difficulty of communicated contents during verbal communication via cognitive loads that are estimated based on brain activity during simple WM tasks. Specifically, we first organize cluster spaces that are formed by applying the K-means algorithm [11] to the near-infrared spectroscopy (NIRS) time series of older people's PFC activity in response to the cognitive load induced by an auditory n-back (n = 1, 2) task (referred to as NBT hereafter). In this task, participants are required to recall the reoccurrences of sequential (i.e., n = 1) or every-other (i.e., n = 2) occurrence of numerical values (1 through 9). We use NBT since it forms a better basis for quantification of the verbally communicated contents, considering its effect
on PFC [12] and its ability to identify the change in PFC activation in response to individuals' emotions and changes in mood [10]. Next, we map older people's PFC activity during an easy/hard listening task (referred to as EHL hereafter) onto the NBT clusters. This mapping serves as a refinement that allows for objective quantification of brain activation during verbal communication based on the well-defined clusters of the n-back task, thereby including the PFC information during EHL that is not available in a pure WM task setting. EHL is designed to induce different levels of cognitive load on older people's PFC by modulating its communicated information. This mapping process results in quantification of the frontal activities during EHL according to their proximity to the NBT clusters' centroids (i.e., their centers): a process referred to as cross-labeling (e.g., label 1 if the PFC activity is closer to the n = 1 cluster, or 2 otherwise). Last, we use these cross-labeled PFC activities to train a linear supervised classifier for decoding older people's PFC responses to online communicated topics.

We show that our method can capture the cognitive load of older people during a natural storytelling scenario and that its estimation is associated with the older people's self-assessment of the difficulty of the story. Our contribution to human-robot interaction is to form the first (to the best of our knowledge) preliminary step toward conversation-based robot-assistive elderly care via
enabling these media to predict the difficulty of their verbally communicated content (e.g., while telling a story in elderly care) as perceived by the older people.

perceived difficulty of stories. In the following section, we explain each step in detail.

A. Choice of Feature Space

Previous results [13, 14] indicated that differential entropy (DE) (i.e., the average information content of a continuous random variable) significantly outperforms the feature spaces that are predominantly used for classification of f/NIRS time series of human subjects' PFC activity. Given these results, we used the linear estimate of DE to extract features of the PFC activity.

B. Clustering of NIRS Time Series of n-back WM Task

Figure 1 (A) and (B) show this process. We formed our n-back WM cluster spaces by applying the K-means algorithm [11] with two centroids to the DE feature vectors of every five-second-long
NIRS time series of PFC activity during one- and two-back WM tasks. This resulted in the formation of two clusters (i.e., the C1 and C2 clusters, Figure 1 (B)). We computed a DE feature vector (i.e., V in Figure 1 (A)) for a given n-back NIRS time series of PFC activity of each participant as [15]:

$$H(X_j) = \frac{1}{2}\log_2\left(2\pi e\,\sigma_{X_j}^{2}\right) \qquad (1)$$

where $\sigma_{X_j}^{2}$ is the variance of the $j$th non-overlapping segment $X_j$ of the entire time series $X$ of the participant's PFC activity. It is worth noting that the interpretation of C1 and C2 as representatives of PFC activation in response to easy/difficult communicated contents is supported by the differential PFC activation in response to one- and two-back WM tasks [9]. In this study, we used data from [13] that pertained to the frontal activity of twenty-eight adults (eleven males and seventeen females, M = 30.96, SD = 10.84) who performed one- and two-back tasks.

C. NBT-Based Cross-Labeling of EHL

Figure 1 (C) illustrates this step. We mapped the DE feature vectors of participants' NIRS PFC activity during EHL onto the C1 and C2 cluster spaces based on their L2-norm distances (i.e., Euclidean distances) to the centroids of these clusters. We labeled these vectors as easy (i.e., 1) if they were closer to C1's center or difficult (i.e., 2) if they were closer to C2's center. This resulted in the formation of clusters L (short for lower cognitive load) and H (short for higher cognitive load) that excluded NBT and were based solely on EHL. As a result, EHL's labeling with respect to NBT established a correspondence between PFC activity in response to verbal communication and the clusters of NBT.

D. Training a Linear Supervised Model

Figure 1 (D) shows this process. We used 80.0% of the EHL cross-labeled data for training and the remaining 20.0% for cross-validation (CV) to train our linear supervised classifier. We used the linear supervised classifier in [13], which is based on a modified canonical linear regression. We chose this model due to its significantly improved accuracy

II. METHODOLOGY

Figure 1 shows an overview of our method. It consists of five steps: A) feature extraction, i.e., calculating the information content of the brain activity, B) cluster formation using the participants' PFC activity in response to the cognitive load induced by NBT, C) NBT-based cross-labeling of older people's PFC activities during EHL, which involves labeling them based on their proximity to the NBT clusters' centroids (e.g., label 1 if the PFC activity is closer to the n = 1 cluster, or 2 otherwise), D) training a linear supervised model with the cross-labeled EHL data, and E) online estimation of the

Fig. 1. Model's schematic diagram. (A) DE feature vectors for PFC activity of the participants in response to one- and two-back WM tasks were calculated using equation (1). (B) These feature vectors were used to form clusters C1 and C2 through application of the K-means algorithm [11]. (C) The C1 and C2 clusters were used to label the DE feature vectors of the EHL time series of PFC activity of human subjects by mapping these vectors onto the C1 and C2 clusters based on their L2-norm (i.e., Euclidean) distance to the centroids of C1 and C2 (i.e., their respective centers), resulting in the formation of the EHL-based clusters L (short for lower cognitive load) and H (short for higher cognitive load). (D) This cross-labeled data was further used for training a linear supervised classifier [13] for online classification of PFC activity of older people in response to communicated contents. During training, 80.0% of EHL was used as training data. We used the remaining EHL data for cross-validation (CV). (E) The trained linear supervised
model was used for online estimation of the perceived difficulty of communicated contents by older people during conversation. (F) Once the session was over, the model counted the number of DE feature vectors that were classified as members of the L or H clusters. Subsequently, it labeled the session as difficult/easy if the number of DE feature vectors assigned to H/L during the session was larger than the number assigned to L/H, and returned this count along with the average of the L2-norms of the DE feature vectors of the selected cluster.

in comparison with the classifiers predominantly adopted for the NIRS-based n-back WM task in the literature.

During the training, we adopted a brute-force search that started with a single feature (i.e., a feature vector of length one) and proceeded through ten (i.e., feature vectors of length ten). For each

verbal communication. In the first experiment, we verified that the trained model with the
recorded data during the EHL task had the ability to classify the NBT. In the second, we verified that the trained model was capable of estimating the perceived difficulty during storytelling (i.e., STE). Considering the two-class labeling in our approach (i.e., L = 1 and H = 2), the chance-level accuracy was 50.0%.

All participants were free of neurological and psychiatric disorders and had no history of hearing impairment. Subjects were seated in an armchair with head support in a sound-attenuated testing chamber, with instructions to fully relax with their eyes closed. All experiments were carried out with written informed consent from all subjects.

We used a minimalist-design humanoid called Telenoid (Figure 2 (b)) in our experiments. The motion of Telenoid was generated based on the voice of the operator, using an online speech-driven head motion system [16]. We placed Telenoid on a stand at an
approximately 1.4-meter distance from the seat of the participant (Figure 2 (a)).

Near-infrared spectroscopy (NIRS) [17] was used to collect the PFC activity of the participants. We chose NIRS due to its non-invasive operational setup, portability, and relative immunity to body movement [18]. In our experiments, we acquired NIRS time series data of the participants using a

of these lengths 2, we also checked whether inclusion of polynomial degrees to capture the interaction between the elements of a given feature vector could improve the performance. Therefore, we checked for polynomial degrees zero (i.e.,
no polynomial feature) through seven. We found that feature vectors of length four combined with a polynomial degree of two yielded the highest prediction accuracy. Therefore, we used the length-four feature vectors with a polynomial degree of two.

E. Online Estimation of the Perceived Difficulty

We used our trained linear model for online estimation of the perceived difficulty of communicated contents by older people during the storytelling experiment. At every prediction cycle (i.e., every 20 seconds in the current implementation), our model summarized the current PFC activity time series of the older people into its calculated feature vector. Next, it used the trained linear model to estimate the correspondence between the feature vector of the current PFC activity and the two clusters. It then returned the magnitude of the induced difficulty of the communicated topic at that prediction cycle (i.e., the L2-norm of its feature vector) along with its estimated label (i.e., whether it was closer to L's or H's centroid). The older people's perceived difficulty was estimated based on the total number of DE feature vectors that were classified as members of the L or H clusters.

III. EXPERIMENTS

We conducted two experiments to verify the ability of our model to capture the older people's perceived difficulty of

Fig. 2. (a) Experimenter demonstrates experimental setup. (b) Telenoid.

ing a one-minute-long resting data that was then followed by its corresponding topic. We kept the communicated contents intact in all
the sessions. However, we randomized the order of easy/difficult contents among participants. Every subject participated in all of these settings.

For model verification, we used the original labeling of the NBT data for one- and two-back WM tasks (i.e., prior to K-means clustering) from [13]. This enabled us to determine whether the cognitive loads induced during NBT formed a proper basis for quantification of the cognitive demands on the PFC during verbal communication. We considered our model's prediction a true positive (tp) if its estimate and NBT's original label were both 2 (i.e., difficult, Section II-C). Similarly, we considered it a true negative (tn) if the estimated and original labels were both 1 (i.e., easy, Section II-C). Otherwise, we considered the estimate a false positive (fp) (i.e., predicted label = 2 and original label = 1) or a false negative (f
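As a rough illustration of the steps in Section II and of the true/false positive and negative convention used for verification, the following Python sketch computes DE feature vectors per equation (1), forms two clusters with a plain K-means, cross-labels new vectors by Euclidean distance to the centroids, votes on a session label, and tallies a confusion count with label 2 (difficult) as the positive class. It is not the authors' implementation. The function names, the deterministic K-means initialization, the convention that C1 is the centroid with the smaller L2-norm, and the rule that ties count as easy are all assumptions made for illustration. The linear classifier of [13] is not reproduced here; nearest-centroid labeling stands in for it.

```python
import numpy as np


def de_features(window, n_segments=4):
    """Equation (1): DE of each non-overlapping segment of one time window.

    A zero-variance segment yields -inf, so real data may need a guard.
    """
    segments = np.array_split(np.asarray(window, dtype=float), n_segments)
    variances = np.array([seg.var() for seg in segments])
    return 0.5 * np.log2(2.0 * np.pi * np.e * variances)


def two_means(vectors, n_iter=100):
    """Lloyd's K-means with two centroids; returns (c1, c2).

    Assumption: initial centroids are the vectors with the smallest and
    largest L2-norm, and C1 is the final centroid with the smaller norm.
    """
    vectors = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=1)
    centroids = np.stack([vectors[norms.argmin()], vectors[norms.argmax()]])
    for _ in range(n_iter):
        dists = np.linalg.norm(vectors[:, None, :] - centroids[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        updated = centroids.copy()
        for k in range(2):
            if np.any(assign == k):  # keep the old centroid if a cluster empties
                updated[k] = vectors[assign == k].mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    order = np.argsort(np.linalg.norm(centroids, axis=1))
    return centroids[order[0]], centroids[order[1]]


def cross_label(vectors, c1, c2):
    """Label 1 (L) if a vector is closer to C1 in Euclidean distance, else 2 (H)."""
    vectors = np.asarray(vectors, dtype=float)
    d1 = np.linalg.norm(vectors - c1, axis=1)
    d2 = np.linalg.norm(vectors - c2, axis=1)
    return np.where(d1 <= d2, 1, 2)


def session_label(labels):
    """Session is difficult (2) if H outnumbers L; ties count as easy (assumption)."""
    labels = np.asarray(labels)
    return 2 if np.sum(labels == 2) > np.sum(labels == 1) else 1


def confusion_counts(predicted, original):
    """Counts with label 2 (difficult) treated as the positive class."""
    p, o = np.asarray(predicted), np.asarray(original)
    return {
        "tp": int(np.sum((p == 2) & (o == 2))),
        "tn": int(np.sum((p == 1) & (o == 1))),
        "fp": int(np.sum((p == 2) & (o == 1))),
        "fn": int(np.sum((p == 1) & (o == 2))),
    }
```

In use, `de_features` would be applied to each window of the n-back recordings, `two_means` to the resulting vectors, and `cross_label` to the vectors extracted from the listening task; the default of four segments per window mirrors the length-four feature vectors reported in Section II-D.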