Scientific publications

On this page you will find the scientific publications related to the GuestXR project published by its partners.

Fostering empathy in social Virtual Reality through physiologically based affective haptic feedback

IEEE World Haptics Conference

by Jeanne Hecquard, Justine Saint-Aubert, Ferran Argelaguet, Claudio Pacchierotti, Anatole Lécuyer & Marc Macé


We study the promotion of positive social interactions in VR by fostering empathy with other users present in the virtual scene. For this purpose, we propose using affective haptic feedback to reinforce the connection with another user through the direct perception of their physiological state. We developed a virtual meeting scenario where a human user attends a presentation with several virtual agents. Throughout the meeting, the presenting virtual agent faces various difficulties that alter her stress level. The human user directly feels her stress via two physiologically based affective haptic interfaces: a compression belt and a vibrator, simulating the breathing and the heart rate of the presenter, respectively. We conducted a user study that compared the use of such a “sympathetic” haptic rendering vs an “indifferent” one that does not communicate the presenter’s stress status, remaining constant and relaxed at all times. Results are rather contrasted and user-dependent, but they show that sympathetic haptic feedback is globally preferred and can enhance empathy and perceived connection to the presenter. The results promote the use of affective haptics in social VR applications, in which fostering positive relationships plays an important role.

Activation of human visual area V6 during egocentric navigation with and without visual experience

Current Biology

by Aggius-Vella E., Chebat D.R., Maidenbaum S. & Amedi A.


V6 is a retinotopic area located in the dorsal visual stream that integrates eye movements with retinal and visuo-motor signals. Despite the known role of V6 in visual motion, it is unknown whether it is involved in navigation and how sensory experiences shape its functional properties. We explored the involvement of V6 in egocentric navigation in sighted and in congenitally blind (CB) participants navigating via an in-house distance-to-sound sensory substitution device (SSD), the EyeCane. We performed two fMRI experiments on two independent datasets. In the first experiment, CB and sighted participants navigated the same mazes. The sighted performed the mazes via vision, while the CB performed them via audition. The CB performed the mazes before and after a training session, using the EyeCane SSD. In the second experiment, a group of sighted participants performed a motor topography task. Our results show that right V6 (rhV6) is selectively involved in egocentric navigation independently of the sensory modality used. Indeed, after training, rhV6 of CB is selectively recruited for auditory navigation, similarly to rhV6 in the sighted. Moreover, we found activation for body movement in area V6, which can putatively contribute to its involvement in egocentric navigation. Taken together, our findings suggest that area rhV6 is a unique hub that transforms spatially relevant sensory information into an egocentric representation for navigation. While vision is clearly the dominant modality, rhV6 is in fact a supramodal area that can develop its selectivity for navigation in the absence of visual experience.

Shape detection beyond the visual field using a visual-to-auditory sensory augmentation device

Frontiers in Neuroscience

by Shira Shvadron, Adi Snir, Amber Maimon, Or Yizhar, Sapir Harel, Keinan Poradosu & Amir Amedi


Current advancements in both technology and science allow us to manipulate our sensory modalities in new and unexpected ways. In the present study, we explore the potential of expanding what we perceive through our natural senses by utilizing a visual-to-auditory sensory substitution device (SSD), the EyeMusic, an algorithm that converts images to sound. The EyeMusic was initially developed to allow blind individuals to create a spatial representation of information arriving from a video feed at a slow sampling rate. In this study, we aimed to use the EyeMusic for the blind areas of sighted individuals. We use it in this initial proof-of-concept study to test the ability of sighted subjects to combine visual information with surrounding auditory sonification representing visual information. Participants in this study were tasked with recognizing and adequately placing the stimuli, using sound to represent the areas outside the standard human visual field. As such, the participants were asked to report shapes’ identities as well as their spatial orientation (front/right/back/left), requiring combined visual (90° frontal) and auditory input (the remaining 270°) for the successful performance of the task (content in both vision and audition was presented in a sweeping clockwise motion around the participant). We found that participants were successful at a highly above chance level after a brief 1-h-long session of online training and one on-site training session of an average of 20 min. They could even draw a 2D representation of this image in some cases. Participants could also generalize, recognizing new shapes they were not explicitly trained on. Our findings provide an initial proof of concept indicating that sensory augmentation devices and techniques can potentially be used in combination with natural sensory information in order to expand the natural fields of sensory perception.

Testing geometry and 3D perception in children following vision restoring cataract-removal surgery

Frontiers in Neuroscience

by Amber Maimon, Ophir Netzer, Benedetta Heimler & Amir Amedi


As neuroscience and rehabilitative techniques advance, age-old questions concerning the visual experience of those who gain sight after blindness, once thought to be philosophical alone, take center stage and become the target for scientific inquiries. In this study, we employ a battery of visual perception tasks to study the unique experience of a small group of children who have undergone vision-restoring cataract removal surgery as part of the Himalayan Cataract Project. We tested their abilities to perceive in three dimensions (3D) using a binocular rivalry task and the Brock string task, perceive visual illusions, use cross-modal mappings between touch and vision, and spatially group based on geometric cues. Some of the children in this study gained a sense of sight for the first time in their lives, having been born with bilateral congenital cataracts, while others suffered late-onset blindness in one eye alone. This study simultaneously supports yet raises further questions concerning Hubel and Wiesel’s critical periods theory and provides additional insight into Molyneux’s problem, the ability to correlate vision with touch quickly. We suggest that our findings present a relatively unexplored intermediate stage of 3D vision development. Importantly, we spotlight some essential geometrical perception visual abilities that strengthen the idea that spontaneous geometry intuitions arise independently from visual experience (and education), thus replicating and extending previous studies. We incorporate a new model, not previously explored, of testing children with congenital cataract removal surgeries who perform the task via vision. In contrast, previous work has explored these abilities in the congenitally blind via touch. 
Taken together, our findings provide insight into the development of what is commonly known as the visual system in the visually deprived and highlight the need to further empirically explore an amodal, task-based interpretation of specializations in the development and structure of the brain. Moreover, we propose a novel objective method, based on a simple binocular rivalry task and the Brock string task, for determining congenital (early) vs. late blindness where medical history and records are partial or lacking (e.g., as is often the case in cataract removal cases).

Inceptor: An Open Source Tool for Automated Creation of 3D Social Scenarios

2023 IEEE Conference on Virtual Reality and 3D User Interfaces

by Dan Pollak, Jonathan Giron & Doron Friedman


Inceptor is a tool designed for non-expert users to develop social VR scenarios that include virtual humans. The tool uses a text-based interface and natural language processing models as input, and generates complete 3D/VR Unity scenarios as output. The tool is currently based on the Rocketbox asset library. We release the tool as an open source project in order to empower the extended reality research community.

Breathing based immersive interactions for enhanced agency and body awareness: a claustrophobia motivated study

ACM CHI Conference on Human Factors in Computing Systems

by Iddo Yehoshua Wald, Amber Maimon, Lucas Keniger de Andrade Gensas, Noémi Guiot, Meshi Ben Oz, Benjamin W. Corn MD & Amir Amedi


This work explores utilizing representations of one’s physiological breath (embreathment) in immersive experiences, for enhancing presence and body awareness. Particularly, embreathment is proposed for reducing claustrophobia and associated negative cognitions such as feelings of restriction, loss of agency, and sense of suffocation, by enhancing agency and interoception in circumstances where one’s ability to act is restricted. The informed design process of an experience designed for this purpose is presented, alongside an experiment employing the experience, evaluating embodiment, presence, and interoception. The results indicate that embreathment leads to significantly greater levels of embodiment and presence than either an entrainment or control condition. In addition, a modest trend was observed in a heartbeat detection task implying better interoception in the intervention conditions than the control. These findings support the initial assumptions regarding presence and body awareness, paving the way for further evaluation with individuals and situations related to the claustrophobia use case.

Persuasive Vibrations: Effects of Speech-Based Vibrations on Persuasion, Leadership, and Co-Presence During Verbal Communication in VR

IEEE Conference on Virtual Reality and 3D User Interfaces

by Justine Saint-Aubert, Ferran Argelaguet, Marc J.-M. Macé, Claudio Pacchierotti, Amir Amedi & Anatole Lécuyer


In Virtual Reality (VR), a growing number of applications involve verbal communications with avatars, such as for teleconference, entertainment, virtual training, social networks, etc. In this context, our paper aims to investigate how tactile feedback consisting in vibrations synchronized with speech could influence aspects related to VR social interactions such as persuasion, co-presence and leadership. We conducted two experiments where participants embody a first-person avatar attending a virtual meeting in immersive VR. In the first experiment, participants were listening to two speaking virtual agents and the speech of one agent was augmented with vibrotactile feedback. Interestingly, the results show that such vibrotactile feedback could significantly improve the perceived co-presence but also the persuasiveness and leadership of the haptically-augmented agent. In the second experiment, the participants were asked to speak to two agents, and their own speech was augmented or not with vibrotactile feedback. The results show that vibrotactile feedback had again a positive effect on co-presence, and that participants perceive their speech as more persuasive in presence of haptic feedback. Taken together, our results demonstrate the strong potential of haptic feedback for supporting social interactions in VR, and pave the way to novel usages of vibrations in a wide range of applications in which verbal communication plays a prominent role.

The Topo-Speech sensory substitution system as a method of conveying spatial information to the blind and vision impaired

Frontiers in Human Neuroscience

by Amber Maimon, Iddo Yehoshua Wald, Meshi Ben Oz, Sophie Codron, Ophir Netzer, Benedetta Heimler & Amir Amedi


Humans, like most animals, integrate sensory input in the brain from different sensory modalities. Yet humans are distinct in their ability to grasp symbolic input, which is interpreted into a cognitive mental representation of the world. This representation merges with external sensory input, providing modality integration of a different sort. This study evaluates the Topo-Speech algorithm in the blind and visually impaired. The system provides spatial information about the external world by applying sensory substitution alongside symbolic representations in a manner that corresponds with the unique way our brains acquire and process information. This is done by conveying spatial information, customarily acquired through vision, through the auditory channel, in a combination of sensory (auditory) features and symbolic language (named/spoken) features. The Topo-Speech sweeps the visual scene or image and represents objects’ identity by employing naming in a spoken word and simultaneously conveying the objects’ location by mapping the x-axis of the visual scene or image to the time it is announced and the y-axis by mapping the location to the pitch of the voice. This proof-of-concept study primarily explores the practical applicability of this approach in 22 visually impaired and blind individuals. The findings showed that individuals from both populations could effectively interpret and use the algorithm after a single training session. The blind showed an accuracy of 74.45%, while the visually impaired had an average accuracy of 72.74%. These results are comparable to those of the sighted, as shown in previous research, with all participants above chance level. As such, we demonstrate practically how aspects of spatial information can be transmitted through non-visual channels.
To complement the findings, we weigh in on debates concerning models of spatial knowledge (the persistent, cumulative, or convergent models) and the capacity for spatial representation in the blind. We suggest the present study’s findings support the convergence model and the scenario that posits the blind are capable of some aspects of spatial representation as depicted by the algorithm comparable to those of the sighted. Finally, we present possible future developments, implementations, and use cases for the system as an aid for the blind and visually impaired.

Multi-sensory display of self-avatar's physiological state: virtual breathing and heart beating can increase sensation of effort in VR

IEEE Transactions on Visualization and Computer Graphics

by Yann Moullec, Justine Saint-Aubert, Julien Manson, Melanie Cogne & Anatole Lécuyer


In this paper we explore the multi-sensory display of self-avatars’ physiological state in Virtual Reality (VR), as a means to enhance the connection between the users and their avatar. Our approach consists in designing and combining a coherent set of visual, auditory and haptic cues to represent the avatar’s cardiac and respiratory activity. These sensory cues are modulated depending on the avatar’s simulated physical exertion. We notably introduce a novel haptic technique to represent respiratory activity using a compression belt simulating abdominal movements that occur during a breathing cycle. A series of experiments was conducted to evaluate the influence of our multi-sensory rendering techniques on various aspects of the VR user experience, including the sense of virtual embodiment and the sensation of effort during a walking simulation. A first study (N=30) that focused on displaying cardiac activity showed that combining sensory modalities significantly enhances the sensation of effort. A second study (N=20) that focused on respiratory activity showed that combining sensory modalities significantly enhances the sensation of effort as well as two sub-components of the sense of embodiment. Interestingly, the user’s actual breathing tended to synchronize with the simulated breathing, especially with the multi-sensory and haptic displays. A third study (N=18) that focused on the combination of cardiac and respiratory activity showed that combining both rendering techniques significantly enhances the sensation of effort. Taken together, our results promote the use of our novel breathing display technique and multi-sensory rendering of physiological parameters in VR applications where effort sensations are prominent, such as for rehabilitation, sport training, or exergames.

A case study in phenomenology of visual experience with retinal prosthesis versus visual-to-auditory sensory substitution


by Amber Maimon, Or Yizhar, Galit Buchs, Benedetta Heimler & Amir Amedi


The phenomenology of the blind has provided an age-old, unparalleled means of exploring the enigmatic link between the brain and mind. This paper delves into the unique phenomenological experience of a man who became blind in adulthood. He subsequently underwent both an Argus II retinal prosthesis implant and training, and extensive training on the EyeMusic visual-to-auditory sensory substitution device (SSD), thereby becoming the first reported case to date of dual proficiency with both devices. He offers a firsthand account of what he considers the great potential of combining sensory substitution devices with visual prostheses as part of a complete visual restoration protocol. While the Argus II retinal prosthesis alone provided him with immediate visual percepts by way of electrically stimulated phosphenes elicited by the device, the EyeMusic SSD requires extensive training from the onset. Yet following the extensive training program with the EyeMusic sensory substitution device, our subject reports that the sensory substitution device allowed him to experience a richer, more complex perceptual experience, that felt more “second nature” to him, while the Argus II prosthesis (which also requires training) did not allow him to achieve the same levels of automaticity and transparency. Following long-term use of the EyeMusic SSD, our subject reported that visual percepts representing mainly, but not limited to, colors portrayed by the EyeMusic SSD are elicited in association with auditory stimuli, indicating the acquisition of a high level of automaticity. Finally, the case study indicates an additive benefit to the combination of both devices on the user’s subjective phenomenological visual experience.

Congenitally blind adults can learn to identify face-shapes via auditory sensory substitution and successfully generalize some of the learned features

Scientific Reports, 2022

by Roni Arbel, Benedetta Heimler & Amir Amedi


Unlike sighted individuals, congenitally blind individuals have little to no experience with face shapes. Instead, they rely on non-shape cues, such as voices, to perform character identification. The extent to which face-shape perception can be learned in adulthood via a different sensory modality (i.e., not vision) remains poorly explored. We used a visual-to-auditory Sensory Substitution Device (SSD) that enables conversion of visual images to the auditory modality while preserving their visual characteristics. Expert SSD users were systematically taught to identify cartoon faces via audition. Following a tailored training program lasting ~12 h, congenitally blind participants successfully identified six trained faces with high accuracy.

Furthermore, they effectively generalized their identification to the untrained, inverted orientation of the learned faces. Finally, after completing the extensive 12-h training program, participants learned six new faces within 2 additional hours of training, suggesting internalization of face-identification processes. Our results document for the first time that facial features can be processed through audition, even in the absence of visual experience across the lifespan. Overall, these findings have important implications for both non-visual object recognition and visual rehabilitation practices and prompt the study of the neural processes underlying auditory face perception in the absence of vision.

Effects of training and using an audio-tactile sensory substitution device on speech-in-noise understanding

Scientific Reports, 2022

by K. Cieśla, T. Wolak, A. Lorens, M. Mentzel, H. Skarżyński & Amir Amedi


Understanding speech in background noise is challenging. Wearing face masks, imposed by the COVID-19 pandemic, makes it even harder. We developed a multi-sensory setup, including a sensory substitution device (SSD) that can deliver speech simultaneously through audition and as vibrations on the fingertips. The vibrations correspond to low frequencies extracted from the speech input. We trained two groups of non-native English speakers in understanding distorted speech in noise. After a short session (30–45 min) of repeating sentences, with or without concurrent matching vibrations, we showed comparable mean group improvement of 14–16 dB in Speech Reception Threshold (SRT) in two test conditions, i.e., when the participants were asked to repeat sentences only from hearing and also when matching vibrations on fingertips were present. This is a very strong effect, if one considers that a 10 dB difference corresponds to doubling of the perceived loudness. The number of sentence repetitions needed for both types of training to complete the task was comparable. Meanwhile, the mean group SNR for the audio-tactile training (14.7 ± 8.7) was significantly lower (harder) than for the auditory training (23.9 ± 11.8), which indicates a potential facilitating effect of the added vibrations. In addition, both before and after training most of the participants (70–80%) showed better performance (by mean 4–6 dB) in speech-in-noise understanding when the audio sentences were accompanied with matching vibrations. This is the same magnitude of multisensory benefit that we reported, with no training at all, in our previous study using the same experimental procedures. After training, performance in this test condition was also best in both groups (SRT ~2 dB). The least significant effect of both training types was found in the third test condition, i.e., when participants were repeating sentences accompanied with non-matching tactile vibrations, and the performance in this condition was also poorest after training. The results indicate that both types of training may remove some level of difficulty in sound perception, which might enable a more proper use of speech inputs delivered via vibrotactile stimulation. We discuss the implications of these novel findings with respect to basic science. In particular, we show that even in adulthood, i.e., long after the classical “critical periods” of development have passed, a new pairing between a certain computation (here, speech processing) and an atypical sensory modality (here, touch) can be established and trained, and that this process can be rapid and intuitive. We further present possible applications of our training program and the SSD for auditory rehabilitation in patients with hearing (and sight) deficits, as well as healthy individuals in suboptimal acoustic situations.