FACULTY OF SCIENCE
DOCTORATE IN NEUROSCIENCES
of the Universities of Geneva and Lausanne
UNIVERSITÉ DE GENÈVE
FACULTY OF MEDICINE
Professor M. PELIZZONE, thesis director
THESIS TITLE
Minimum Requirements for a Retinal Prosthesis to Restore Useful Vision
THESIS
Presented to the
Faculty of Science
of the University of Geneva
to obtain the degree of
Doctor of Neurosciences
by
Angélica PÉREZ FORNOS
from
México D.F., Mexico
Thesis No. 3737
Geneva
Publisher: Université de Genève
2006
ANGÉLICA PÉREZ FORNOS
Chemin de Saule 39
CH-1233 Bernex
Geneva, Switzerland
+41(22)348-3959
+41(76)564-3959
OBJECTIVE
To occupy a position where challenging ideas and projects are proposed and developed,
thus demanding constant effort, creativity, research, and continuous learning.
OVERVIEW
• Strong interest in research and emerging technologies.
• Dedicated, organized, and detail-oriented person.
• Good problem-solving abilities through analysis and synthesis.
• Creative integrator of material and human resources.
• Leadership; excellent multidisciplinary team worker.
• Interested in continuous education and in both personal and team improvement.
• Self-demanding and perfectionist.
AREAS OF EFFECTIVENESS
• Project planning and supervision.
• Team management.
• Custom system design and development.
• Programming.
• Computer simulations.
• Real time and off-line signal processing.
• Design, configuration, and customization of data acquisition systems.
• Computer-aided dynamic system simulation (CACSD).
EXPERIENCE
Feb/1999 – Jul/1999
NEURAL REHABILITATION ENGINEERING LABORATORY, Brussels, Belgium.
Recording and Analysis of Visual Evoked Potentials Obtained by Electrical Stimulation
Design and development of a custom VEP (Visual Evoked Potentials) recording system.
This system was designed to record and analyze the electroencephalographic (EEG)
activity resulting from electrical stimulation of the visual pathways with surface electrodes on
healthy volunteers, as well as that resulting from direct stimulation of the optic nerve in a
blind volunteer.
Apr/2000 - Mar/2001
marchFIRST Switzerland – Consultant
Project management. Design and development of custom desktop and web applications.
Specialized group and personal tutoring.
Apr/2001 – to date
Geneva University - HCUGE, Ophthalmology Clinic, Switzerland - Research Assistant
Within the framework of the CMOS-retina project, design and development of computer
and analysis tools for research in psychophysics of vision. Participation in the conception
and execution of research experiments.
ADDITIONAL SKILLS
• General knowledge of natural sciences and mathematics (calculus, physics, anatomy,
physiology, and chemistry).
• Expertise in the use of several software tools (MS Word, MS Excel, MS Power Point,
MS Access, MS Works, MS SQL Server, AutoCad, PSpice).
• Programming languages and environments: Turbo Pascal, Turbo C, LabView, MatLab,
Assembler, Visual Basic, ASP, Visual Interdev, Visual Studio, VB Script, JavaScript,
HTML, XML, Visual C++.
• Good knowledge of other applied technology systems (microprocessors,
microcontrollers, classic control systems, power electronics).
EDUCATION
1994 - 1999
UNIVERSIDAD IBEROAMERICANA, A.C., México D.F., México
Biomedical Engineering
(equivalent to a Physics Bachelor from the University of Geneva)
LANGUAGES
• Spanish
• English
• French
PUBLICATIONS
• Pérez Fornos, A. (1999). Recording and Analysis of Visual Evoked Potentials Obtained by Electrical
Stimulation. Universidad Iberoamericana, Plantel Santa Fe, México City, Mexico.
• Pérez Fornos, A., Rappaz, B., Sommerhalder, J., Safran, A.B., Pelizzone, M. (2002). Simulation of Artificial
Vision: Effects of dynamic versus static spatial quantization of the stimulus on reading. Experimental Eye
Research; 72(S2): 127.
• Varsori, M., Pérez Fornos, A., Safran, A.B., Whatham, A. (2002). Changes in viewing strategy in normal
subjects during adaptation to artificial central scotomas. Experimental Eye Research; 72(S2): 129.
• Pérez Fornos, A., Sommerhalder, J., Chanderli, K., Pittard, A., Baumberger, B., Fluckiger, M., Safran, A.B., &
Pelizzone, M. (2004). Minimum requirements for mobility in known environments and perceptual learning of
this task in eccentric vision. ARVO Meeting Abstracts, 45, 5445 (abstract).
• Sommerhalder, J., Rappaz, B., de Haller, R., Pérez Fornos, A., Safran, A.B., & Pelizzone, M. (2004).
Simulation of artificial vision: II. Eccentric reading of full-page text and the learning of this task. Vision
Research, 44(14), 1693-1706.
• Varsori, M., Pérez Fornos, A., Safran, A.B., & Whatham, A. (2004). Development of a viewing strategy during
adaptation to an artificial central scotoma. Vision Research, 44(23), 2691-2705.
• Pérez Fornos, A., Sommerhalder, J., Rappaz, B., Safran, A.B., & Pelizzone, M. (2004). Changes in eye
movement strategy during the process of learning eccentric reading. Neuro-ophthalmology, 28(3), 99 (abstract).
• Pérez Fornos, A., Sommerhalder, J., Pittard, A., Safran, A.B., & Pelizzone, M. (2005). Minimum requirements
for visuomotor coordination and learning of this task in eccentric vision. ARVO Meeting Abstracts, 46, 1533
(abstract).
• Pérez Fornos, A., Sommerhalder, J., Rappaz, B., Safran, A.B., & Pelizzone, M. (2005). Simulation of Artificial
Vision, III: Do the spatial or temporal characteristics of stimulus pixelization really matter? Investigative
Ophthalmology & Visual Science, 46(10), 3906-3912.
• Pérez Fornos, A., Sommerhalder, J., Rappaz, B., Pelizzone, M., & Safran, A.B. (2006). Processes involved in
oculomotor adaptation to eccentric reading. Investigative Ophthalmology & Visual Science, 47(4), 1439-1447.
• Sommerhalder, J., Pérez Fornos, A., Chanderli, K., Colin, L., Schaer, X., Mauler, F., Safran, A.B., &
Pelizzone, M. (2006). Minimum requirements for mobility in unpredictable environments. ARVO Meeting
Abstracts, 47, 3204 (abstract).
Acknowledgements
This dissertation is the conclusion of 5 amazing years of work. It is also the fruit
of the joint effort of several people, and I would like to thank them all:
Dr. Jörg R. Sommerhalder for his enormous human and scientific support
through all these years of close collaboration.
Dr. Karim Chanderli for all the laughs and for his fundamental contribution to
this project.
Prof. Marco Pelizzone for his guidance and for always forcing me to give the
best of myself.
Prof. Avinoam B. Safran for his continuous interest in my work, and for his
unconditional trust and support.
Prof. Daniel Bertrand, Prof. Dominique Muller, and Prof. Nelson Y. Kiang
for making time in their busy agendas and agreeing to serve on the jury of this
thesis.
Benjamin Rappaz, Raoul de Haller, Alexandre Pittard, Flavien Mauler,
Lise Colin, and Xavier Schaer for their active collaboration in the different
experiments.
All the volunteers that participated in the experiments; I know that it was not
always easy...
And to all the people that I forgot to mention but that made this project a
reality… It is never easy to summarize 5 years of work in one “thank you” page…
Agradecimientos (Acknowledgements)
For Mateo, the other little brain hidden behind this thesis
One more stage completed... And looking back, I am still surprised to
realize the immense support I have behind me. There are many people without whom this
would not have been possible. Thank you:
To Roberto, the love of my life, for being part of my dreams...
To my Dad and my Mom, for teaching me that nothing is impossible and for making
all my dreams come true...
To Jingle, for making me follow my dreams and never stop dreaming...
To Pato, for teaching me every day that richness lies in diversity...
To Belita, my eternal unconditional supporter, the only one capable of turning my mistakes into
successes...
To my grandparents David and Lola, for always marveling at all my achievements,
big and small...
To Rocío, for her moral and linguistic support, and for teaching me to laugh at myself
and at my silliness...
To my aunt Lelia, the little cushion where I can always lean, recover, and
rest...
To my aunt Encarna, for always being present, even from far away...
To my cousins: Güera, Paquito, Javier, Ángel, Andrea, Robi, Tani, David,
Manuel V., Alex, Vanesa, Manuel P., Danny... for being the mirrors in which I learned
to look at myself...
To the rest of my family, whom I do not mention explicitly for lack of space, not
for lack of desire or recognition...
To my friends: Flais, La Maru, Pete, Anik, Christian, Stephan, Sandra,
María, Martha, Adris, Salustia... thanks to whom I can face life with
roaring laughter...
To those who are no longer here… I always carry them in my head and close to my
heart.
Table of Contents
Summary
Résumé
Resumen
1 Introduction
1.1 Anatomy and Physiology of Vision
1.1.1 The Eye
1.1.2 The Optic Nerve and Optic Tract
1.1.3 The Visual Cortex
1.2 Blindness and Low Vision
1.3 Visual prostheses as a means to rehabilitate blindness
1.3.1 Cortical Stimulation
1.3.2 Optic Nerve Stimulation
1.3.3 Retinal Stimulation
1.3.4 Alternative approaches
1.3.5 Comparison of the different approaches
1.4 Minimum requirements for useful artificial vision
1.5 Scope of this thesis
1.5.1 Significance
2 General Methods
2.1 Basic principles of the simulation methodologies
2.2 Image processing
2.2.1 Square pixelization
2.2.2 Gaussian pixelization
2.2.3 Off-line/Real-time pixelization
2.3 Experimental setup
2.3.1 Stationary system
2.3.2 Mobile system
2.4 Data analysis and statistics
2.4.1 Percentage scores
2.5 Ethical considerations
3 Experiments on Reading
3.1 Foreword
3.2 Introduction
3.2.1 Reading in the context of artificial vision
3.3 Specific methods for the reading experiments
3.3.1 Subjects
3.3.2 Experimental setup
3.4 Pilot experiment: Reading of isolated 4-letter words
3.4.1 Stimuli
3.4.2 Acute experiments with 4-letter words
3.4.3 Habituation to reading 4-letter words in eccentric vision
3.5 Full-page reading
3.5.1 Stimuli
3.5.2 Analysis methodology
3.5.3 Experimental protocol
3.5.4 Experiment 1: Full-page reading in central vision
3.5.5 Experiment 2: Full-page reading in eccentric vision
3.6 Discussion
3.6.1 Main outcome of these experiments
3.6.2 Analysis of the learning process
3.6.3 Additional considerations
3.7 Conclusions
3.8 Publications resulting from this research
4 Experiments on Visuomotor Coordination
4.1 Foreword
4.2 Introduction
4.2.1 Vision and visuomotor coordination
4.2.2 Visuomotor coordination in the context of artificial vision
4.3 Specific methods for the experiments on visuomotor coordination
4.3.1 Subjects
4.3.2 Visuomotor tasks
4.3.3 Effective field of view
4.3.4 Experimental setup
4.4 Acute experiments on visuomotor coordination
4.4.1 Experimental protocol
4.4.2 Experiment 3: Manipulation – The chips task
4.4.3 Experiment 4: Pointing – The LEDs task
4.4.4 Summary of the results of these experiments
4.5 Habituation experiments on visuomotor coordination
4.5.1 Experimental protocol
4.5.2 Preparatory experiment: Learning in central vision
4.5.3 Experiment 5: Learning in eccentric vision
4.6 Discussion
4.7 Conclusions
4.8 Publications resulting from this research
5 Experiments on Mobility
5.1 Foreword
5.2 Introduction
5.2.1 What have we learned from low vision patients?
5.2.2 Mobility in the context of artificial vision
5.3 Specific methods for the experiments on mobility
5.3.1 Subjects
5.3.2 Effective field of view
5.3.3 Experimental setup
5.4 Acute experiments on mobility
5.4.1 Experiment 6: Laboratory maze
5.4.2 Experiment 7: Random forest
5.4.3 Experiment 8: Real street crossing
5.4.4 Summary of the results of these experiments
5.5 Habituation experiments on mobility
5.5.1 Experimental protocol
5.5.2 Preparatory experiment: Learning in central vision
5.5.3 Experiment 9: Learning in eccentric vision
5.6 Discussion
5.7 Conclusion
5.8 Publications resulting from this research
6 Towards Better Simulations of Artificial Vision
6.1 Foreword
6.2 Introduction
6.3 Specific methods for these simulations
6.3.1 Subjects
6.3.2 Experimental Setup
6.4 Experiment 10: Real-time Square vs. Off-line Square Pixelization
6.4.1 Experimental protocol
6.4.2 Results
6.5 Experiment 11: Off-line Gaussian vs. Off-line Square Pixelization
6.5.1 Experimental protocol
6.5.2 Results
6.6 Experiment 12: Real-time Gaussian vs. Real-time Square Pixelization
6.6.1 Experimental protocol
6.6.2 Results
6.7 Discussion
6.7.1 Implications of these results for simulations of artificial vision
6.8 Conclusion
6.9 Publications resulting from this research
7 General Conclusions
7.1 Summary of the results
7.2 Implication of these results for the development of visual prostheses
7.3 Future work
7.4 Closing remarks
8 References
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Publications
Summary
Blindness is a severe handicap because vision is one of the most important
sensory modalities underlying human activity. Certain forms of retinal degeneration
can lead to complete blindness. Some, like retinitis pigmentosa (RP), can occur early in life, and the
resulting visual impairment usually increases with age. Other forms of blindness are
mostly age-related, with a higher incidence in elderly people. Thus, the impact of
blindness will become more and more important as life expectancy increases. Today,
technological advances have opened new perspectives and it is possible to envision
neural prostheses to restore some useful vision to totally blind patients. Such devices
aim to restore function by direct electrical stimulation of neural tissue. This kind of
approach has proven to be very successful with cochlear implants in the case of
deafness (NIH Consensus Statement, 1995). Several research groups have initiated
projects aiming at the development of various visual prostheses, and a considerable effort is
currently under way in this field. The implantation of the first basic prototypes of visual
prostheses demonstrates that a useful aid may not be far off (Dobelle,
2000; Chow et al., 2002; Veraart et al., 2003; Humayun et al., 2003).
There is increasing evidence that, some day, visual prostheses could bring benefits to
blind people similar to those provided by cochlear implants to deaf patients. To
date, most efforts in the field appear to be concentrated on the development of
technical solutions for visual prostheses (microelectronics, biocompatibility,
electrophysiology, etc.). One key issue, however, seems to attract very little
attention: What are the minimum requirements for useful artificial vision? In other
words, what is the minimal visual information necessary to perform common daily
tasks?
Rationale
The goal of the research presented here was to determine minimum requirements
to achieve useful artificial vision. To design visual prostheses, the knowledge of the
minimum information to be transmitted to the brain in order to restore useful
function is essential, theoretically and practically. The history of cochlear implant
development clearly illustrates the importance of modeling studies. Breakthroughs
like the advent of multi-channel cochlear implants, allowing for adequate speech
recognition, were only possible as a result of psychophysical studies (Tong et al.,
1983; Eddington et al., 1998a; Eddington et al., 1998b). Therefore, the research
presented in this dissertation aims to provide this type of information to the
artificial vision research community in time, in the hope of preventing large-scale use of
prototype retinal implants with insufficient numbers of stimulation contacts.
General methods
The experimental approach used in these studies was designed to mimic, as
realistically as possible, visual perceptions provided by retinal implants. Such devices
present certain features that lead to several major constraints about the visual
percepts that can be elicited. Retinal prostheses will consist of a finite number of
discrete stimulation contacts (limited resolution), will be implanted at a fixed location
in the eye, and will subtend only a fraction of the entire visual field. Furthermore,
highly eccentric implantation areas will probably have to be envisioned since the
anatomo-physiology of the retina does not favor a foveal location for a visual
prosthesis (Sjöstrand et al., 1999a; Sjöstrand et al., 1999b). The best sites,
potentially preserving retinotopic activation without major distortion, are located at
an eccentricity of 10° or more. This means that the vision of future users of retinal
prostheses will probably be restricted to small peripheral areas of their visual field.
An experimental setup designed to simulate conditions of artificial vision on
normal volunteers was developed. This setup allows for presentation of pixelized
images, stabilized on visual field areas of a given eccentricity. It is therefore capable
of mimicking the type of visual information transmitted by a retinal prosthesis and of
making parametric changes in the amount or nature of such information. The minimum
requirements for three classes of basic visual functions were investigated: the
identification of small objects as in reading and the localization of objects and body
in space for adequate visuomotor coordination and whole-body mobility.
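To make the idea of a pixelized image concrete, the following is a minimal sketch of square pixelization by block averaging. It is an illustration under stated assumptions, not the implementation used in this work: the function name `square_pixelize`, the tile size, and the choice of the tile mean as the displayed value are all hypothetical.

```python
# Hypothetical illustration: square pixelization by block averaging.
# Assumption: each square tile of the image is replaced by its mean gray level.

def square_pixelize(image, block):
    """Return a copy of a grayscale image (list of rows) in which every
    block x block tile is replaced by the mean value of that tile."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # copy so the input stays unchanged
    for top in range(0, height, block):
        for left in range(0, width, block):
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            tile = [image[r][c] for r in rows for c in cols]
            mean = sum(tile) / len(tile)
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out

# A 4 x 4 test image reduced to 2 x 2 coarse "pixels".
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 255, 0, 255],
    [255, 0, 255, 0],
]
coarse = square_pixelize(image, 2)
for row in coarse:
    print(row)
```

Reducing the number of tiles in the viewing window lowers the information content of the stimulus, which is the parameter varied in the experiments summarized below.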
Experiments on reading
Reading is an extremely important activity in our modern societies and represents
one of the main goals of low vision patients seeking rehabilitation. The thorough
analysis of this task is, thus, fundamental for the evaluation of the rehabilitation
prospects of visual prostheses for blind patients.
A first series of experiments using isolated 4-letter words showed that
performance drops abruptly when the information content of the target is reduced
below a certain threshold (expressed in number of pixels). For central reading, a
viewing window containing at least 250 pixels is necessary to code 4-letter words. At
eccentricities beyond 10°, reading performance decreases rapidly even if more than
250 pixels are used. A second study addressed the question of
whether eccentric reading under such specific conditions could be
improved by training. Two subjects, naïve to this task, were trained to read isolated
4-letter words under conditions of simulated artificial vision at 15° eccentricity (in the
lower visual field). Reading performance of both subjects increased impressively
throughout this experiment. Reading scores of 6% and 23% correct were observed
at the beginning of the experiment; by the end of the experiment (about one month
of daily training; 1 hour/day), reading scores improved to 64% and 85%. Control
tests demonstrated that the learning process consisted essentially of adapting
to the use of an eccentric area of the retina for the reading task.
A second set of experiments addressed the more realistic task of full-page
reading, which included the control of the subjects’ eye movements to achieve page
navigation in similar conditions of artificial vision, mimicking an eccentric retinal
implant. Three subjects, naïve to the task, were trained for almost two months
(about 1 hour/day) to read full-page texts. Subjects had to use their own eye
movements to displace a 10° x 7° viewing window stabilized at 15° eccentricity in
their lower visual field. Initial reading scores were very low for two subjects (about
13% correctly read words), and astonishingly high for the third subject (86%
correctly read words). However, all of them significantly improved their performance
with time, reaching close to perfect reading scores (ranging from 86% to 98%) at
the end of training. Initial reading rates were as low as 1 to 5 words/min and
increased significantly with time to 14 to 28 words/min. Qualitative text
understanding was also estimated. A score of at least 85% correct was necessary to
achieve ‘good’ text understanding. Gaze position recordings, made during the
experimental sessions, demonstrated that the control of eye movements, especially
the suppression of reflexive vertical saccades, constituted an important part of the
overall adaptive learning process.
Experiments on visuomotor coordination
The lack of resolution might affect visuomotor tasks requiring detailed vision, such
as those involving object identification. Furthermore, difficulties with visuomotor
coordination may also result from defects in the peripheral visual field and available
field of view, which affect localization/orientation abilities. Encoding spatial
information and using it to direct a particular motor response might, therefore,
impose various constraints (in terms of information requirements) on a visual
prosthesis.
Two tests were specifically developed to examine these tasks. In the first
configuration, the chips task, subjects had to recognize simple figures (drawn on
wooden chips) and place them in the correct position and orientation on
randomized templates. In the second test, the LEDs task, subjects had to point with
a finger, as precisely as possible, to spots marked by light points (LEDs) displayed
underneath a touch screen. As in the reading experiments, artificial vision was
simulated by projecting images of limited resolution (pixelization level) on a 10° x 7°
viewing window, stabilized at a fixed position in the visual field. For these
experiments, the size of the effective field of view projected in this 10° x 7° visual
area (portion of the environment visible at a glance) could also be varied.
In a first experiment, the minimum requirements needed to reach optimum
performance were established using central vision. Both the number of pixels
contained in the viewing window, as well as the effective field of view projected
inside it, selectively affected visuomotor performance; various combinations of these
parameters allowed good performance on these tasks. Nevertheless, the results
revealed a fundamental limit for visuomotor performance: a minimum effective
resolution of approximately 2 pixels/deg² of the environmental space was necessary
to achieve both tasks with reasonable accuracy and speed (i.e., approximately 100
pixels with an 8° x 6° field of view; about 400 pixels with a 16° x 12° field of view; or
around 1600 pixels with a 33° x 23° field of view). A field of view of approximately
16° x 12° represented the best resolution/performance compromise and subjects
spontaneously reported preferring it to the others.
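The three pixel/field-of-view combinations quoted above can be checked against the limit of about 2 pixels/deg² with a short calculation. The sketch below assumes that effective resolution is simply the number of pixels in the viewing window divided by the area of the effective field of view; the function name `effective_resolution` is hypothetical.

```python
# Assumption: effective resolution = pixels in the viewing window divided by
# the area (in square degrees) of the effective field of view.

def effective_resolution(n_pixels, fov_width_deg, fov_height_deg):
    """Return pixels per square degree of environmental space."""
    return n_pixels / (fov_width_deg * fov_height_deg)

# (pixels, field-of-view width, field-of-view height) as quoted in the text
combinations = [(100, 8, 6), (400, 16, 12), (1600, 33, 23)]

for n_pixels, width, height in combinations:
    res = effective_resolution(n_pixels, width, height)
    print(f"{n_pixels:>5} pixels over {width} x {height} deg: {res:.2f} pixels/deg^2")
```

Under this assumption the three combinations give about 2.08, 2.08, and 2.11 pixels/deg², consistent with the stated limit of approximately 2 pixels/deg².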
In a second experiment, 3 normal volunteers, naïve to eccentric viewing, were
trained to perform the visuomotor tasks using a viewing window stabilized at 15° of
eccentricity in the lower visual field. An effective field of view of 16° x 12° was
chosen for this second experiment. To be consistent with our previous experiments
on reading, a resolution of 498 pixels in the viewing window was judged to be the
most adequate for learning the task in eccentric vision (effective resolution of 2.6
pixels/deg²). For the chips task, one subject achieved excellent %-correct scores
immediately, while the other 2 subjects consistently obtained scores above 95% after
4 to 15 sessions. The average time required to correctly place a chip asymptoted at
around 9 s after 8, 13, and 38 sessions. For the LEDs task, pointing precision
converged to around 0.7 cm, but results were very variable. Pointing rates stabilized
within 8 sessions at 5.8 s/target.
Experiments on mobility
Mobility essentially requires the capacity to judge egocentric and exocentric
distances in order to solve problems such as localization of the body in space,
perception of movement, distance estimation, and speed estimation. These tasks might,
therefore, impose constraints on an artificial vision system that differ from those
found in the previous tasks. As in the previous experiments, artificial vision was
simulated by projecting images of limited resolution on a 10° x 7° viewing window,
subtending different fractions of the environment, and stabilized at a fixed position
in the visual field.
First, we determined the minimum requirements for useful mobility in central
vision. Since the minimum information required for achieving satisfactory
performance varies according to the type of environment in which the task is to be
performed, a series of tasks involving different realistic situations was evaluated.
The first configuration, the laboratory maze task, was conceived to assess mobility
performance in familiar, randomized indoor environments. This task consisted of
walking through an indoor course comprising 6 obstacles frequently encountered in
daily life. The second setting, the random forest task, was designed to assess
mobility performance in randomized, unfamiliar indoor environments including some
dynamic elements. Subjects had to walk through an ‘artificial forest’ composed of 52
randomly positioned obstacles or ‘trees’, from a random starting position to a random
end position. The last task, the real street crossing, was intended to assess the visual
requirements for mobility in a real-life, dynamic environment. In this case, the
ability to estimate the speed and distance of approaching objects (cars) was
investigated. Results of these experiments confirmed that minimum information
requirements were closely linked to the type of environment in which the tasks were
to be performed. Mobility in well-known indoor environments required relatively little
visual information: approximately 0.2 pixels/deg² (i.e. 150 pixels with a 33° x 23°
field of view or 600 pixels with a 66° x 46° field of view). Large fields of view did not
seem to be of particular advantage in these settings. Mobility in less predictable
environments incorporating some dynamic elements, such as that of the random
forest task, was more sensitive to the number of pixels available in the viewing
window, requiring approximately 500 pixels. A 33° x 23° field of view tended to yield
the best performance. Finally, approximately 1000 pixels seem to be required for
subjects to feel safe while performing mobility tasks in unknown, dynamic
environments such as that of the real street crossing task. As fewer pixels were
available in the viewing window, subjects needed to compensate with additional
information sources (i.e. hearing); smaller visual fields, providing more detailed
visual information, seem to be advantageous in these settings.
Second, we evaluated possible learning effects when performing mobility tasks in
eccentric vision (15° in the lower visual field). According to the first mobility
experiments, an effective field of view of 33° x 23° seems to be the best compromise
between a sufficiently large field of view and reasonable image resolution, and was
thus chosen for this second experiment. To be consistent with our previous
experiments on reading, a resolution of 498 pixels in the viewing window was judged
to be the most adequate for learning the task in eccentric vision (effective
resolution of 0.65 pixels/deg²). Mobility performance in eccentric vision was
explored using the laboratory maze task. Error counts asymptoted within the first 10
training sessions. The time to accomplish the mobility task stabilized after about 40
sessions. Interestingly, under similar experimental conditions, subjects could achieve
the task more rapidly in eccentric vision than in central vision after training.
Experiments exploring more realistic simulations of artificial vision
In our previously mentioned studies, we used simplified simulations of artificial
vision to determine the basic parameters for visual prostheses to restore useful
function. In a final series of experiments we explored the effect of such
simplifications on the most ‘information-demanding’ task: full page reading.
Normal volunteers had to read full pages of text using a 10° x 7° viewing window
stabilized in central vision. In a first study, we measured reading performance
comparing off-line and real-time square pixelizations at different resolutions. Results
showed that real-time square pixelization required about 30% less information
(pixels) than its off-line counterpart. In a second experiment, off-line square
pixelization was compared to off-line Gaussian pixelization with various degrees of
overlap (σ). Results from this experiment revealed a restricted range of Gaussian
widths (0.143 < σ < 0.571) for which performance was equivalent to or significantly
better than that obtained with square pixelization. Finally, in a third experiment,
real-time square pixelization was compared to real-time Gaussian pixelization. This
experiment demonstrated, however, that reading performance was similar in both
real-time pixelization conditions.
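The difference between square and Gaussian pixelization can be illustrated with a minimal Python sketch. It is not the simulator used in these experiments: the function names, the block-averaging step, the unnormalized sum of blobs, and the convention that σ is expressed as a fraction of the pixel width are all assumptions made for illustration.

```python
import math


def tile_means(image, block):
    """Mean grey level of each block x block tile.

    Assumes both image sides are exact multiples of `block`.
    """
    rows, cols = len(image), len(image[0])
    return [[sum(image[r][c]
                 for r in range(r0, r0 + block)
                 for c in range(c0, c0 + block)) / block ** 2
             for c0 in range(0, cols, block)]
            for r0 in range(0, rows, block)]


def square_pixelize(image, block):
    """Paint every tile uniformly with its mean grey level."""
    means = tile_means(image, block)
    return [[means[r // block][c // block] for c in range(len(image[0]))]
            for r in range(len(image))]


def gaussian_pixelize(image, block, sigma):
    """Render every tile as a Gaussian blob centred on the tile.

    `sigma` is a fraction of the tile width (assumed convention, sigma > 0).
    Blobs from neighbouring tiles are summed, so large sigma means overlap.
    """
    means = tile_means(image, block)
    spread = sigma * block
    out = []
    for r in range(len(image)):
        row = []
        for c in range(len(image[0])):
            value = 0.0
            for i, mean_row in enumerate(means):
                for j, mean in enumerate(mean_row):
                    centre_r = i * block + (block - 1) / 2
                    centre_c = j * block + (block - 1) / 2
                    dist2 = (r - centre_r) ** 2 + (c - centre_c) ** 2
                    value += mean * math.exp(-dist2 / (2 * spread * spread))
            row.append(value)
        out.append(row)
    return out


image = [[0, 0, 4, 4],
         [0, 0, 4, 4],
         [8, 8, 2, 2],
         [8, 8, 2, 2]]
print(square_pixelize(image, 2))
print(gaussian_pixelize(image, 2, 0.3))
```

With a small σ each blob stays inside its own tile and leaves dark gaps between pixels; with a large σ neighbouring blobs merge and blur the image. That trade-off is one plausible reading of why only a restricted range of widths performed well.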
This investigation revealed that real-time stimulus pixelization favors reading
performance. Performance gains were, however, relatively moderate and did not
allow for a significant (e.g. two-fold) reduction of the minimum resolution (400-500
pixels) needed to achieve useful reading abilities.
Conclusion
Taken together, these results suggest that about 500 phosphenes, retinotopically
arranged over a 10° x 7° retinal area (corresponding to an implant surface of 3 x 2
mm²), constitute the minimum visual information required to restore useful function.
If this minimum criterion is fulfilled, retinal implants might restore some full-page
reading abilities to blind patients. Visuomotor coordination and whole-body mobility
seem to be less demanding, in terms of information content, than the reading task. In
addition, the effective field of view represented by the active area of the implant will
have to be optimized for each task. A highly magnified effective field of view
simultaneously containing strings of 4-6 characters and about two lines of text
(about 2° x 1.4° for a typical newspaper) is required for efficient reading, an
effective field of view of about 16° x 12° seems to allow for efficient visuomotor
coordination, and an effective field of view of 33° x 23° appears to be necessary for
tasks involving whole-body mobility. A significant learning process will be required to
reach optimal performance with such devices, especially if the implant has to be
placed outside the fovea. Visual prostheses should aim to meet these criteria in order
to provide efficient functional rehabilitation to blind patients.
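As a rough illustration of what these figures imply for implant design, the sketch below derives two quantities from the numbers quoted in this conclusion. Both derivations rest on assumptions that are not stated in the text: phosphenes are taken to lie on a regular square grid, and magnification is taken as the ratio of the window width to the effective field-of-view width.

```python
import math

WINDOW_DEG = (10.0, 7.0)  # retinal area covered by the implant (from the text)
IMPLANT_MM = (3.0, 2.0)   # corresponding implant surface (from the text)
N_PHOSPHENES = 500        # minimum number of phosphenes (from the text)

# Mean centre-to-centre spacing, assuming a regular square grid (assumption).
area_mm2 = IMPLANT_MM[0] * IMPLANT_MM[1]
pitch_mm = math.sqrt(area_mm2 / N_PHOSPHENES)
print(f"approximate contact spacing: {pitch_mm:.2f} mm")


def magnification(fov_deg):
    """Horizontal zoom needed to map an effective field of view onto the window.

    Values above 1 magnify the scene; values below 1 compress it.
    """
    return WINDOW_DEG[0] / fov_deg[0]


tasks = {
    "reading": (2.0, 1.4),
    "visuomotor coordination": (16.0, 12.0),
    "mobility": (33.0, 23.0),
}
for task, fov in tasks.items():
    print(f"{task}: x{magnification(fov):.2f}")
```

Under these assumptions the spacing is on the order of 0.1 mm, and the required zoom ranges from fivefold magnification for reading to roughly a threefold compression for mobility, which illustrates why a single fixed setting is unlikely to suit every task.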
Résumé (Summary)
Vision is one of the most important sensory modalities for human activity, and its
loss is a severe handicap. Certain forms of retinal degeneration can lead to
blindness. In retinitis pigmentosa, for example, vision loss can occur at a relatively
early age. Other diseases are mainly due to aging and therefore have a higher
incidence in elderly people. Thus, as life expectancy increases, blindness is set to
become an increasingly important problem. Today, technological progress has opened
new perspectives, making it possible to envisage neural prostheses that restore useful
vision to blind patients. These devices attempt to restore visual function by direct
electrical stimulation of neural tissue. A similar approach has proved very successful
with the development of cochlear implants for deafness (NIH Consensus Statement,
1995). Several groups have begun research aimed at developing various visual
prostheses. The implantation of the first prototypes of these devices shows that the
hope of a useful aid is not far off (Dobelle, 2000; Chow et al., 2002; Veraart et al.,
2003; Humayun et al., 2003).
There is growing evidence that visual prostheses may one day bring blind patients
benefits similar to those that cochlear implants bring to deaf patients. To date, most
efforts in the field have concentrated on solving technical problems of visual
prostheses (microelectronics, biocompatibility, electrophysiology, etc.). However, one
question, although fundamental, seems to attract very little attention: what is the
minimum information required for useful artificial vision? In other words, what is the
minimum visual information needed to accomplish everyday tasks?
Aims
The aim of this project is to determine the minimum characteristics a retinal
implant must have to provide useful vision. Knowing the minimum information that
must be transmitted to the brain to restore useful function is essential, both
theoretically and practically, for the design of visual prostheses. The history of
cochlear implant development clearly illustrates the importance of such studies.
Advances such as the development of multichannel cochlear implants allowing speech
recognition were made possible by psychophysical studies (Tong et al., 1983;
Eddington et al., 1998a; Eddington et al., 1998b). The main objective of the research
conducted for this thesis is therefore to provide this type of information, in the
hope of avoiding the large-scale use of retinal implants with an insufficient number
of stimulation electrodes.
General methods
The experimental approach used in these studies was designed to reproduce, as
realistically as possible, the visual perception that retinal implants will produce.
These devices will have certain characteristics that impose fundamental constraints
on the visual perception that can be evoked. Retinal prostheses will consist of a
finite number of stimulation electrodes (limited resolution), implanted at a fixed
location in the eye and covering only a fraction of the visual field. In addition,
highly eccentric implantation sites will probably have to be considered, since the
anatomical and physiological characteristics of the retina do not favor implanting a
visual prosthesis close to the fovea. The best implantation sites, which could
potentially preserve good retinotopy without major image distortion, are located at
eccentricities greater than 10°. This means that the vision of future retinal
prosthesis users will most likely be limited to small peripheral areas of their visual
field.
An artificial vision simulator was developed to carry out the psychophysical
tests. This system presents limited-resolution (pixelized) images stabilized on
specific regions of the visual field (at a given eccentricity). It can mimic the visual
information transmitted by a retinal prosthesis and allows parametric changes in the
quantity or nature of this information. The minimum requirements for three
fundamental visual tasks were studied: identification of small objects, as in reading,
and localization of the body and of objects in space, which allow adequate visuomotor
coordination and mobility.
Experiments on reading
Reading is an extremely important activity in modern society and one of the main
goals of low-vision rehabilitation. An exhaustive analysis of this task is therefore
essential for evaluating the rehabilitation prospects of future visual prosthesis
wearers.
A first study on reading isolated 4-letter words showed that subjects' performance
dropped abruptly when the information representing the target was reduced below a
certain threshold (expressed in number of pixels). Reading in central vision thus
requires a stimulation area containing at least 250 pixels to encode 4-letter words.
At eccentricities greater than 10°, performance declined rapidly even at resolutions
above 250 pixels. Under the same experimental conditions, a second study evaluated
possible learning effects in reading 4-letter words. Two subjects, naïve to the task,
trained to read pixelized 4-letter words (250 pixels) under simulated artificial
vision conditions at an eccentricity of 15° (in the lower visual field). The reading
performance of both subjects increased considerably over the course of the
experiment. At the beginning of the study, the subjects correctly identified only 6%
and 23% of the words. At the end of the training period (about 1 h/day for one
month), the same subjects read 64% and 85% of the words correctly. Control
experiments showed that, for this task, learning consisted essentially of the subject
adapting to the use of an eccentric area of the retina for reading.
A second set of experiments addressed the more realistic task of full-page text
reading, which includes control of the subject's eye movements while navigating the
page, under similar conditions of eccentric artificial vision. Three subjects, naïve
to the task, trained for almost two months (about 1 h/day) to read full pages of
text. They had to use their own eye movements to move a 10° x 7° stimulation window
stabilized at 15° in the lower visual field. At the beginning of the experiments,
reading scores¹ were very low for two subjects (about 13%) and surprisingly high for
the third (86%). All 3 subjects nevertheless improved significantly during the
training period, reaching near-perfect scores by the end of the experiment (86% to
98%). Initial reading rates were very low, 1 to 5 words/min, and increased
substantially over time to reach 14 to 28 words/min. A qualitative analysis of text
comprehension was also carried out. A minimum of 85% of correctly read words proved
necessary for good text comprehension. Eye movement recordings obtained during the
experimental sessions clearly showed that control of these movements, in particular
suppression of reflexive vertical saccades, was a very important part of the overall
process of learning to read in eccentric vision.
Experiments on visuomotor coordination
The reduced resolution of the images generated by a visual prosthesis could
seriously affect visuomotor coordination tasks that demand detailed vision, such as
identifying objects or potential targets. Moreover, peripheral vision defects that
limit the available visual field impair localization and orientation in space and
could cause difficulties in visuomotor coordination. Consequently, encoding the
spatial information relevant to these tasks, and using that information to direct a
particular motor response, may impose different constraints on a visual prosthesis.
Two tests were developed to evaluate visuomotor coordination in normal subjects
under simulated artificial vision. In the first, the chips task, the subject had to
recognize objects (patterns drawn on wooden chips) and then place them on identical
patterns located on a randomly arranged board. The objects could only be recognized
visually (not by touch). In the second, the LEDs task, subjects had to point with a
finger, as precisely as possible, to locations on a board (touch screen) indicated by
randomly lit light points (LEDs). As in the reading studies, artificial vision was
simulated by projecting reduced-resolution (pixelized) images in a 10° x 7°
stimulation window stabilized at a fixed position in the visual field. In addition,
for these experiments the size of the real visual field projected in the 10° x 7°
window (the portion of the environment visible simultaneously in the stimulation
window) could also be varied, which was not the case in the reading experiments.
¹ Percentage of words read correctly.
In a first stage, the minimum information required for efficient visuomotor
performance in central vision was determined. Both the number of pixels in the
stimulation window and the size of the effective field of view projected in it appear
to affect visuomotor performance, although each parameter has a different influence.
Several combinations of these parameters yielded good performance on the chips task
and the LEDs task. Nevertheless, the results revealed a minimum resolution threshold
for visuomotor performance: a minimum effective resolution of 2 pixels/deg² (for
example, about 100 pixels with an effective field of view of 8° x 6°, around 400
pixels with 16° x 12°, or about 1600 pixels with 33° x 23°) proved necessary to
accomplish both tasks with adequate accuracy. An effective field of view of 16° x 12°
appears to be the best compromise between performance and resolution for achieving
reasonable speed. Subjects also spontaneously identified it as their preferred
condition.
In a second stage, 3 normal volunteers, naïve to eccentric viewing, trained to
perform both tasks using a stimulation window stabilized at 15° of eccentricity in
the lower visual field. Following the results of the first series of experiments, an
effective field of view of 16° x 12° was chosen. In line with our earlier reading
studies, a resolution of 498 pixels was judged the most adequate for learning in
eccentric vision (effective resolution of 2.6 pixels/deg²). For the chips task, one
subject immediately reached excellent scores², while the other 2 subjects obtained
scores above 95% after 4 and 15 sessions. Chip placement time³ stabilized, for all
subjects, at about 9 s/chip after 8, 13, and 38 sessions. For the LEDs task, pointing
error⁴ converged around 0.7 cm, although results for the 3 subjects were highly
variable. Pointing times⁵ stabilized around 5.8 s/LED for all subjects in fewer than
8 sessions.
² Percentage of chips correctly placed (correct position and orientation).
³ Average time needed to place a chip correctly.
⁴ Absolute distance between the actual position of the LED and the location pointed to by the subject.
⁵ Average time needed to locate and point to an LED.
Experiments on mobility
Mobility depends essentially on the ability to judge egocentric and exocentric
distances in order to localize one's body in space, perceive movement, and estimate
distances and speed. This type of task could therefore impose constraints on visual
prostheses that differ from those identified in the two previous studies. The
simulation methods used for this series of experiments were similar to those used
previously: artificial vision was simulated by projecting limited-resolution images,
covering different portions of the environment and stabilized at a fixed position in
the visual field.
First, the minimum requirements for efficient mobility were determined in central
vision. For mobility tasks, the minimum information needed for good performance is
known to vary with the environment in which the task is performed. The minimum
information needed for mobility was therefore evaluated on a series of tasks in
different realistic situations. The first test, the laboratory maze, was designed to
study mobility in familiar but randomly arranged indoor environments. It consisted
of walking through a course comprising 6 everyday obstacles (a table, a door, a table
with a chair, a passage between 2 poles, and a slalom of 3 poles). The second test,
the random forest, was designed to evaluate mobility in a random and unpredictable
indoor environment including some dynamic elements. Subjects had to cross an
‘artificial forest’ of 52 randomly positioned obstacles (‘trees’) from a random
starting point to a random end point. During the crossing, a variable number of
people (0, 1, or 2) could walk through the forest, and the subject had to avoid them.
The last test, the real street crossing, was designed to evaluate the requirements of
mobility in real, dynamic environments. The subjects' ability to estimate the speed
and distance of approaching objects (cars) was studied in particular. The task
consisted of judging whether a real street could be crossed as a function of the
‘quality/quantity’ of visual information provided by the artificial vision
simulator⁶. The results of this set of mobility experiments confirmed that
information requirements vary with the type of environment in which the task is
performed. Mobility in known environments requires relatively little visual
information: approximately 0.2 pixels/deg² (for example, 150 pixels with an effective
field of view of 33° x 23° or 600 pixels with 66° x 46°). In such environments no
advantage was found with larger fields of view. Mobility in less predictable
environments proved more sensitive to the number of pixels in the stimulation area;
about 500 pixels are necessary for such tasks. A trend towards better performance was
observed with an effective field of view of 33° x 23°. Finally, around 1000 pixels
were necessary for subjects to feel safe during mobility tasks in dynamic
environments such as the street crossing. In this last task, as the number of pixels
available in the stimulation area decreased, subjects had to compensate for the lack
of resolution with another source of information (hearing). Smaller fields of view,
offering more detailed visual information, also seemed to be advantageous in these
unknown, dynamic environments.
⁶ For obvious safety reasons, the subjects did not actually have to cross the street.
Second, we evaluated possible learning effects in mobility tasks performed in
eccentric vision (15° in the lower visual field). According to the first mobility
experiments, an effective field of view of 33° x 23° appears to be the best
compromise between a sufficiently large field of view and reasonable image
resolution, and was therefore chosen for this study. Following the results of the
reading studies, a resolution of 498 pixels was judged the most adequate for learning
in eccentric vision (effective resolution of 0.65 pixels/deg²). The task used for this
evaluation was the laboratory maze. The number of errors per run stabilized in fewer
than 10 training sessions. The time needed to complete the course stabilized after
about 40 sessions. At the end of the study, subjects could accomplish the task even
faster in eccentric vision than in central vision under the same experimental
conditions.
Experiments exploring more realistic simulations of artificial vision
In the previous studies, certain simplifications were made in the simulations of
artificial vision. First, pixelization was performed with an algorithm that decomposes
images into a matrix of square pixels of uniform luminance. This type of processing
certainly does not correspond to the physiological response a visual prosthesis might
evoke. Second, a static pixelization algorithm was applied in the reading studies. To
evaluate the possible advantages and disadvantages of more realistic stimuli, a final
series of experiments was therefore conducted in which the temporal and spatial
characteristics of the information reduction algorithm were modified. We explored the
effect of such simplifications on the most information-demanding task: full-page text
reading.
First, a static pixelization algorithm (pre-pixelized images) was compared with a
dynamic pixelization algorithm (real-time pixelization of the image in the
stimulation window) at different resolutions. Reading performance (reading score and
rate) measured in 5 normal subjects showed that real-time pixelization has an
advantage over equivalent static pixelization because, by moving their eyes, subjects
can integrate several slightly differently pixelized images and so recognize words
more easily. Second, reading performance obtained in simulations using square pixels
of uniform intensity was compared with that obtained in simulations using pixels with
a Gaussian intensity distribution of several widths (σ). The results revealed that,
for Gaussians of certain optimal widths (0.143 < σ < 0.571), performance was very
similar to, or even very slightly better than, that obtained with square
pixelization.
These experiments demonstrate that a more realistic simulation (real-time stimulus
pixelization) improves reading performance by about 30%. Nevertheless, the
improvement observed with dynamic algorithms is not sufficient to reduce
significantly (for example, by half) the number of pixels needed for reading.
Furthermore, to achieve good performance, it would be desirable to use stimulation
parameters that produce phosphenes⁷ with widths (spreads) within the range of our
optimal values. The results of future electrophysiological studies will indicate what
will actually be possible.
Conclusion
Taken together, the results suggest that 400-500 phosphenes, retinotopically
arranged over a retinal area of 10° x 7° (corresponding to an implant surface of 3 x
2 mm²), constitute the minimum information required to restore useful visual
function. If this minimum criterion is met, we can hope to restore some reading
abilities to future wearers of these retinal prostheses. Visuomotor coordination and
mobility in familiar environments are less demanding tasks than reading in terms of
information content. In addition, the effective field of view represented by the
active surface of the implant will probably have to be optimized for each task. A
highly magnified effective field of view, containing 4 to 6 letters and about two
lines of text in height (corresponding to a field of view of 2° x 1.4° for a typical
newspaper), is necessary for efficient reading. An effective field of view of about
16° x 12° allows efficient (slow but fairly accurate) performance on visuomotor
coordination tasks. Finally, mobility tasks require an effective field of view of
about 33° x 23°. A relatively long learning period will also be needed to reach
optimal performance if these implants have to be placed far from the fovea. All visual
prostheses should aim to satisfy these criteria in order to provide blind patients
with rehabilitation that can be considered ‘functional’.
⁷ Isolated visual percept evoked by a means of stimulation other than light (e.g. electrical currents).
Resumen (Summary)
Blindness is a major disability, since vision is one of the principal sensory
modalities and is fundamental to human activity. Certain forms of retinal
degeneration can lead to total blindness. Some, such as retinitis pigmentosa, can
occur at a relatively early age. Other causes of blindness are related to aging and
consequently have a higher incidence in elderly people. The global impact of
blindness therefore grows as life expectancy increases. Recent technological advances
have opened new perspectives, and today it is possible to imagine neural prostheses
that restore some form of useful vision to completely blind patients. These devices
attempt to restore visual sensation by direct electrical stimulation of nervous
tissue. This concept has been very successful in the case of cochlear implants for
the rehabilitation of deaf patients (NIH Consensus Statement, 1995). Several research
groups have launched ambitious projects to develop different types of visual
prostheses that interact with the nervous system at different levels (retina, optic
nerve, or visual cortex). The first prototypes of such prostheses have recently been
implanted (Dobelle, 2000; Chow et al., 2002; Veraart et al., 2003; Humayun et al.,
2003), which suggests that the dream of a useful visual aid could become reality in
the not-too-distant future.
There is growing evidence that, in the near future, visual prostheses could bring
blind patients benefits similar to those that cochlear implants already provide to
deaf patients. Most efforts seem to be concentrated on developing technical solutions
for visual prostheses (microelectronics, biocompatibility, electrophysiology, etc.).
However, one key aspect of this development seems to attract very little attention:
what are the basic requirements for useful artificial vision? In other words, what is
the minimum visual information that must be transmitted to the nervous system to
carry out the basic tasks of everyday life?
Rationale
The aim of the studies presented in this thesis is to outline the minimum visual
requirements for adequate functional rehabilitation with an artificial vision device.
Knowing the minimum information that must be transmitted to the brain to restore
‘useful’ visual function is essential, both theoretically and practically, for the
design of visual prostheses. The history of cochlear implant development underlines
the importance of this type of study. Advances such as the development of
multichannel cochlear implants, which allow adequate speech discrimination, are the
result of psychophysical studies (Tong et al., 1983; Eddington et al., 1998a;
Eddington et al., 1998b). This thesis therefore seeks to determine such parameters in
advance, to avoid the large-scale use of prostheses with an insufficient number of
stimulation contacts.
General methods
The experimental methods used in this study were designed to mimic, as realistically as
possible, the visual percepts that retinal prostheses will produce. These devices have
certain characteristics that will significantly restrict the type of visual percepts that
can be evoked. Retinal prostheses will consist of a finite number of stimulation contacts
(limited image resolution), will be implanted at a fixed retinal location, and will cover
only a fraction of the natural visual field (limited by the size of the implant's active
area). Moreover, highly eccentric implantation sites will probably have to be considered,
since the anatomy and physiology of the retina do not favor a foveal location for these
prostheses (Sjöstrand et al., 1999a; Sjöstrand et al., 1999b). The best locations for
potentially preserving retinotopic activation without major image distortion lie beyond 10°
of eccentricity. This means that the visual percepts of future retinal prosthesis wearers
will be confined to small peripheral areas of the visual field.
An experimental setup was designed to simulate these conditions of artificial vision in
volunteers with normal vision. The system presents limited-resolution (pixelized) images
stabilized on areas of the visual field at a given eccentricity. The simulator thus mimics
the kind of visual information transmitted by a retinal prosthesis while allowing parametric
changes in the amount and nature of that information. The requirements of basic visual
functions were evaluated: the identification of small objects/symbols, as required for
reading, and the localization of objects and of one's own body in space, as required for
visuomotor coordination and mobility tasks.
Reading experiments
Reading is an extremely important activity in modern societies and is the main goal of
rehabilitation for low vision patients. A systematic analysis of this task is fundamental
for evaluating the rehabilitation prospects that can be offered to future wearers of visual
prostheses.
The visual stimuli used in the first series of reading experiments were four-letter words
(for example, “alto”). The results revealed that reading performance drops abruptly when the
information representing the word is reduced beyond a certain limit (expressed in number of
pixels): a minimum of 250 pixels within a viewing area of 10° x 3.5° (3 x 1 mm² on the
retina) is required to correctly encode four-letter words. At eccentricities beyond 10°,
reading performance declines rapidly even at image resolutions above 250 pixels. A second
study investigated whether eccentric reading performance can be improved through training.
Two normally sighted volunteers, with no prior experience in tasks involving eccentric
vision, were trained to read four-letter words under simulated artificial vision conditions
at 15° of eccentricity (in the lower visual field). The reading performance of both subjects
improved markedly over the training period (approximately one month of daily training;
1 hour/day). During the first experimental sessions the subjects correctly read only 6% and
23% of the words presented; by the end of the experiment, their performance had improved to
64% and 85% of words correctly identified. Control experiments showed that the learning
process consisted mainly of adapting to the use of an eccentric area of the retina for the
reading task.
A second series of experiments was designed to investigate a more realistic task: reading
full pages of text, including the use of eye movements to navigate the page, under similar
conditions of artificial vision mimicking an eccentric retinal prosthesis. Three volunteers
with no prior experience in the task were trained for almost two months (approximately
1 hour/day) to read text in this way. The subjects had to use their own eye movements to
move a 10° x 7° viewing window containing 572 pixels⁸, stabilized at 15° of eccentricity in
their lower visual field. Initial reading rates⁹ were very low in two subjects and
remarkably high in the third (around 86%). Nevertheless, all three volunteers improved with
time, reaching almost perfect reading rates (between 86% and 98%) by the end of the
experiment. Initial reading speeds¹⁰ were very low, 1 to 5 words/min, and increased
significantly with training to 14-28 words/min. A qualitative analysis of overall
comprehension of the texts was also carried out. It revealed that reading rates of at least
85% of words correctly identified are required to reach “good” comprehension of the texts.
Analysis of eye position relative to the pages of text showed that an important part of the
learning process consisted of improved control of eccentric eye movements, especially the
suppression of reflexive vertical saccades (which attempt to center the viewing window on
the fovea).
8 Image resolution equivalent to the minimum obtained in the first experiments with
four-letter words. Note that for the full-page reading experiments the viewing window was
twice the size used in the four-letter word experiments, so the number of pixels was doubled
as well.
9 Percentage of words correctly identified.
10 Number of words correctly identified per unit of time.
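The two reading measures defined in the footnotes above (reading rate as the percentage of words correctly identified, reading speed as correctly identified words per unit of time) can be illustrated with a short sketch. The function names and the session numbers below are hypothetical and are not taken from the thesis; only the 85% comprehension threshold comes from the text.

```python
def reading_rate(n_correct, n_presented):
    """Reading rate: percentage of presented words read correctly."""
    if n_presented <= 0:
        raise ValueError("n_presented must be positive")
    return 100.0 * n_correct / n_presented


def reading_speed(n_correct, duration_s):
    """Reading speed: correctly read words per minute."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return 60.0 * n_correct / duration_s


# Hypothetical session (illustrative numbers, not data from the thesis).
words_presented = 120
words_correct = 104
duration_s = 300  # 5 minutes

rate = reading_rate(words_correct, words_presented)   # about 86.7 %
speed = reading_speed(words_correct, duration_s)      # 20.8 words/min

# The summary reports that "good" comprehension needed a rate of at least 85 %.
good_comprehension_expected = rate >= 85.0

print(f"reading rate:  {rate:.1f} %")
print(f"reading speed: {speed:.1f} words/min")
print(f"meets 85 % threshold: {good_comprehension_expected}")
```

Keeping the two measures separate matters here because a subject can read almost every word correctly while still being very slow, which is the pattern the training results describe.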
Visuomotor coordination experiments
A lack of resolution can seriously affect visuomotor coordination, in particular those
activities that require object identification. In addition, peripheral vision defects are
known to affect basic localization and orientation abilities, which can also have an impact
on visuomotor coordination tasks. Encoding spatial information and using it to guide a
particular motor response may therefore impose certain conditions (in terms of information
requirements) on a visual prosthesis.
Two tests were designed to explore the main aspects of visuomotor coordination. The first,
the chips test, consisted of recognizing simple figures drawn on wooden chips and then
placing them in the correct position and orientation on random patterns. The second, the LED
test, consisted of pointing with the finger, as precisely as possible, at light spots (LEDs)
that lit up at random beneath a touch screen. As in the reading experiments, artificial
vision conditions were simulated by projecting limited-resolution images in a 10° x 7°
viewing window stabilized at given positions in the visual field. These experiments also
investigated the importance of the size of the effective visual field¹¹ represented by the
image projected in the viewing window.
A first experiment evaluated the minimum visual conditions needed to reach optimal
performance in central vision. Both the number of pixels in the viewing window and the size
of the effective visual field projected into it affected visuomotor performance; several
combinations of these two parameters allowed good performance in both tests. However, the
results revealed a fundamental limit for visuomotor performance: an effective resolution of
2 pixels/deg² (that is, 100 pixels with an effective visual field of 8° x 6°, 400 pixels
with an effective visual field of 16° x 12°, or 1600 pixels with an effective visual field
of 33° x 23°). An effective visual field of approximately 16° x 12° proved to be the best
compromise between sufficiently detailed resolution and a sufficiently large visual field
for these activities; moreover, the volunteers spontaneously expressed their preference for
this visual field.
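The effective resolution quoted above is simply the number of pixels divided by the area of the effective visual field in square degrees. The following minimal sketch (the function name is an illustrative choice, not taken from the thesis) checks that the three pixel/field combinations listed in the text all correspond to roughly 2 pixels/deg².

```python
def effective_resolution(n_pixels, field_w_deg, field_h_deg):
    """Pixels per square degree of the effective visual field."""
    return n_pixels / (field_w_deg * field_h_deg)


# (pixels, field width in deg, field height in deg), as listed in the text.
conditions = [(100, 8, 6), (400, 16, 12), (1600, 33, 23)]

for n_pixels, w, h in conditions:
    res = effective_resolution(n_pixels, w, h)
    print(f"{n_pixels:5d} px over {w} x {h} deg -> {res:.2f} px/deg^2")
# Prints 2.08, 2.08 and 2.11 px/deg^2 respectively.
```

The same function reproduces the other values given in this summary, for example 498 pixels over 16° x 12° (about 2.6 pixels/deg²) and 150 pixels over 33° x 23° (about 0.2 pixels/deg²).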
In a second experiment, three normally sighted volunteers with no prior experience of
eccentric vision were trained to perform both visuomotor coordination tests using a viewing
window stabilized at 15° of eccentricity in the lower visual field. An effective visual
field of 16° x 12° was used. Given the results of the reading experiments, an image
resolution of 498 pixels was chosen for learning to perform the tasks in eccentric vision
(effective resolution of 2.6 pixels/deg²). In the chips test, one of the subjects achieved
excellent placement rates¹² immediately. The other two subjects needed between 4 and 15
sessions to consistently obtain rates above 95%. The average time to place a chip correctly
stabilized at around 9 s/chip after 8, 13 and 38 sessions. In the LED test, localization
accuracy¹³ converged to around 0.7 cm, but results were highly variable. Pointing speed
stabilized within 8 sessions at around 5 s/LED.
11 Portion of space simultaneously visible in the image displayed in the viewing window.
Mobility experiments
In general, mobility requires the ability to judge egocentric and exocentric distances in
order to resolve issues such as locating one's own body in space, perceiving motion, and
estimating distances and speeds. These tasks may therefore have information requirements
different from those already outlined for the tasks studied previously. As in the previous
experiments, artificial vision conditions were simulated by projecting limited-resolution
images in a 10° x 7° viewing window that was stabilized at fixed positions in the visual
field and contained different fractions of the visual environment.
First, the minimum requirements for mobility in central vision were determined. These
requirements appear to depend on the conditions under which the activity must be carried
out, so several tests were conducted in different contexts. The first test, the laboratory
course, was designed to evaluate performance in indoor environments that were random but
familiar. It consisted of completing a laboratory course made up of 6 common obstacles
placed at random (a table with a chair, a series of marks on the floor, a door, three steps,
a passage between two posts, and a slalom around 3 columns). The second test, the random
forest, was aimed at evaluating mobility in indoor environments that were random and
unfamiliar and included some dynamic elements. Starting from a random initial position, the
subjects had to cross an “artificial forest” made up of 52 “trees” placed at random to reach
a random final position. During this crossing, a variable number of people (0, 1 or 2) could
walk through the forest perpendicular to the subjects' path, and any collision with them had
to be avoided. The last test, the real street crossing, was designed to study the visual
requirements of mobility in a real, dynamic environment. Here, the ability to estimate the
speed and distance of approaching objects (cars) was evaluated. The test consisted of
judging whether it was possible to cross a street with regular (one-way) traffic as a
function of the amount/quality of visual information provided by the artificial vision
simulator¹⁴. The results of this series of experiments confirmed that the minimum amount of
information required for mobility tasks varies with the context in which they must be
performed. Mobility in familiar indoor environments requires relatively little information:
approximately 0.2 pixels/deg² (that is, 150 pixels with an effective visual field of
33° x 23°, or 600 pixels with an effective visual field of 66° x 46°). Wide fields of view
do not seem to offer any advantage in these circumstances. Mobility tasks in less
predictable environments that include some dynamic elements, as in the random forest task,
appear to be more sensitive to the amount of information available in the viewing window,
requiring around 500 pixels. Fields of view of around 33° x 23° seem to favor performance in
this case. Finally, approximately 1000 pixels are needed to feel safe during mobility tasks
in real, unfamiliar and dynamic environments, as in the real street crossing test. As the
number of pixels available in the viewing window was reduced, the subjects had to find
alternatives to compensate for the lack of information (for example, relying more on
hearing). Smaller visual fields, which provide more detailed information about the
environment, seem to be an advantage in these contexts.
12 Percentage of chips correctly placed (correct position and orientation).
13 Calculated as the mean pointing error (absolute distance between the actual LED location
and the position indicated by the subjects).
14 For obvious safety reasons the subjects did not actually cross the street.
Second, possible learning effects were evaluated when mobility tasks must be performed in
eccentric vision (15° in the lower visual field). According to the results of the first
mobility experiments, an effective visual field of 33° x 23° appears to be the best
compromise for performance: it provides a sufficiently wide overall view of the environment
while maintaining a reasonable level of image resolution. This condition was therefore
chosen for the second study. For consistency with the results of the reading experiments, an
image resolution of 498 pixels was used (that is, an effective resolution of
0.65 pixels/deg²). Mobility was evaluated using the laboratory course test described above.
The number of errors per run decreased with time and reached an asymptote within 10
sessions. The time needed to complete the course decreased considerably with training and
reached stable values after about 40 sessions. Surprisingly, after training the subjects
performed better in eccentric vision than in central vision.
Experiments exploring more realistic simulations of
artificial vision
In the previous studies, certain simplifications were made in the artificial vision
simulations. First, pixelization (reduction of image resolution) was performed with an
algorithm that decomposes images into arrays of square pixels of uniform intensity (square
pixelization). Such an algorithm is adequate for simulating the reduced-resolution stimuli
transmitted by a retinal implant since, in theory, the shape of the pixels does not alter
the overall information content of the image. However, this type of processing does not
correspond to what the physiological response evoked by a visual prosthesis might be.
Second, a static image-processing algorithm was used in the reading experiments. A final
series of experiments was therefore carried out to investigate the possible advantages or
disadvantages of more realistic stimuli, by varying the temporal and spatial characteristics
of the information reduction algorithm. The effect of these simplifications was evaluated on
the most “demanding” task in terms of information: reading pages of text.
First, reading performance was compared for pages of text processed with static pixelization
algorithms (pre-processed images) and with dynamic pixelization algorithms (real-time
processing of the image presented in the viewing window). The comparison was made at
different levels of image resolution. Reading performance (reading rate and speed) measured
in 5 normally sighted volunteers showed that dynamic pixelization is an advantage over its
static equivalent, since eye movements allow subjects to integrate slightly different images
of the same word and recognize it more easily. Second, simulations using square pixels were
compared with simulations using pixels whose intensity varied according to Gaussian
distributions with different spreads (σ). This comparison showed that with certain optimal
Gaussian spreads (0.143 < σ < 0.571) reading performance is very similar or slightly
superior to that obtained with the equivalent square pixelization.
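To make the difference between the two spatial schemes concrete, here is a minimal sketch of square and Gaussian pixelization on a small grayscale image stored as nested lists. It is not the algorithm used in the thesis, whose details are not given in this summary. Two assumptions are made purely for illustration: σ is expressed as a fraction of the pixel (tile) width, since the summary does not state its unit, and each Gaussian spot is truncated at the border of its own tile, so overlap between neighboring spots is ignored. All function names are hypothetical.

```python
import math


def square_pixelize(image, block):
    """Replace each block x block tile by its mean intensity (uniform square pixels)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for top in range(0, h, block):
        for left in range(0, w, block):
            rows = range(top, min(top + block, h))
            cols = range(left, min(left + block, w))
            mean = sum(image[r][c] for r in rows for c in cols) / (len(rows) * len(cols))
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out


def gaussian_pixelize(image, block, sigma):
    """Render each tile as a Gaussian spot centered on the tile.

    ASSUMPTIONS (not from the thesis): sigma is a fraction of the tile width,
    and each spot is truncated at its own tile border (no overlap).
    Image dimensions are assumed to be multiples of `block`.
    """
    h, w = len(image), len(image[0])
    means = square_pixelize(image, block)
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Center of the tile that contains pixel (r, c).
            cy = (r // block) * block + (block - 1) / 2.0
            cx = (c // block) * block + (block - 1) / 2.0
            # Squared distance to the center, in units of tile width.
            d2 = ((r - cy) ** 2 + (c - cx) ** 2) / float(block ** 2)
            out[r][c] = means[r][c] * math.exp(-d2 / (2.0 * sigma ** 2))
    return out


# Tiny 4 x 4 test image reduced to 2 x 2 "phosphenes".
image = [
    [0, 2, 4, 4],
    [2, 4, 4, 4],
    [8, 8, 1, 3],
    [8, 8, 3, 1],
]
sq = square_pixelize(image, block=2)
g = gaussian_pixelize(image, block=2, sigma=0.5)
```

In the square version every position inside a tile carries the tile mean, whereas in the Gaussian version intensity falls off away from the tile center, which is closer to the spot-like percepts the text attributes to phosphenes.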
Taken together, these experiments show that more realistic simulations (real-time
pixelization of the stimulus) improve reading performance by approximately 30%. However,
this effect is not sufficient to significantly reduce (for example, by half) the number of
pixels needed for good reading performance. In addition, to achieve good performance it
would be advisable to use stimulation parameters that produce phosphenes¹⁵ with spreads
within the range of optimal values determined in our experiments. The results of future
electrophysiological studies will show what is actually possible.
Conclusion
These results suggest that about 500 phosphenes, distributed retinotopically over a retinal
area of 10° x 7° (corresponding to an implant of 3 x 2 mm²), constitute the minimum
information required to restore useful visual function. If this criterion is met, it should
be possible to restore some reading abilities to future users of visual prostheses.
Visuomotor coordination and mobility appear to be less demanding than reading in terms of
information content. However, the effective visual field represented on the active surface
of the implant will have to be optimized for each type of activity. Efficient reading
requires a highly magnified effective visual field containing 4 to 6 letters and 2 lines of
text (corresponding to a visual field of 2° x 1.4° for a normal newspaper; approx.
180 pixels/deg²). An effective visual field of about 16° x 12° seems to allow adequate (slow
but accurate) performance in visuomotor coordination tasks. Mobility requires an effective
visual field of approximately 33° x 23°. Finally, if these devices must be implanted far
from the fovea, a relatively long learning period will be required to reach optimal
performance. All visual prostheses should attempt to meet these criteria in order to provide
adequate functional rehabilitation to blind patients.
15 Visual percept (spot of light) generated by a stimulus other than light (for example,
electrical currents).
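The correspondence between visual angle and retinal size used in this summary can be checked with a short calculation. The sketch below assumes the approximate factor of 3.6° of visual angle per millimeter of retina that the thesis cites from Drasdo & Fowler (1974) in Chapter 1; treating it as a constant is a simplification, and the function and variable names are illustrative only.

```python
# Approximate conversion factor cited in the thesis (Drasdo & Fowler, 1974).
# Using a single constant for all eccentricities is an assumption of this sketch.
DEG_PER_MM = 3.6


def deg_to_mm(angle_deg):
    """Convert a visual angle in degrees to an approximate retinal distance in mm."""
    return angle_deg / DEG_PER_MM


# Viewing window of 10 x 7 degrees, as in the conclusion above.
width_mm = deg_to_mm(10)   # about 2.8 mm
height_mm = deg_to_mm(7)   # about 1.9 mm

# Reading condition: about 180 pixels/deg^2 over a 2 x 1.4 degree field.
reading_pixels = 180 * (2 * 1.4)   # about 504 pixels

print(f"implant size: {width_mm:.1f} x {height_mm:.1f} mm")
print(f"pixels needed for the reading field: {reading_pixels:.0f}")
```

Both results agree with the rounded figures in the text: an implant of roughly 3 x 2 mm² and about 500 phosphenes.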
1 Introduction
The eyes are not responsible when the mind does the seeing.
Publilius Syrus (~100 B.C.)
Blindness is generally considered one of the most serious handicaps a person can suffer.
Because of the complexity of the visual system, and despite numerous research efforts,
progress in rehabilitation has been slow. However, this pessimistic picture might soon
change, since technological advances offer a whole new range of possibilities.
Several projects aiming to develop visual prostheses have been launched recently. Such
devices would restore lost function by bypassing the damaged structures and electrically
stimulating the remaining visual pathway. Obviously, damage to other structures should be
avoided and the ‘artificial vision’ provided by the prosthesis has to be useful to the
patient. A number of important factors (electrical, surgical, biocompatibility,
psychophysical) must therefore be thoroughly investigated and considered.
This chapter provides the background needed to understand visual perception in the context
of artificial vision, and the importance of determining the minimum requirements for a
visual prosthesis to restore useful vision. First, the basic anatomy and physiology of the
visual system will be outlined and the basic issues of blindness and low vision will be
discussed. The concept of visual prostheses will then be explained in more detail. The
different design approaches will be described and compared, summarizing the current status
of research. Finally, alternative prosthetic approaches, which interface with the visual
system by means other than electrical stimulation, will be introduced.
1.1 Anatomy and Physiology of Vision
Many consider vision the most important human sense, and it is one of the best studied,
since its intricacy has fascinated many researchers.
The senses are the input channels through which we perceive, understand and make sense of
the surrounding world. Perception begins in receptor cells that are sensitive to a
particular type of stimulus. The information gathered is then transferred through the
different sensory pathways to the cerebral cortex.
The visual pathway is schematized in figure 1. In the case of the visual system, the
physical stimulus is light (electromagnetic waves in the visible spectrum). Light enters the
eye and is transformed into electrical signals by the retina. The resulting neural signals
leave the retina through fibers that constitute the optic nerve. The optic nerves coming
from both eyes are reorganized at the optic chiasm, forming the optic tract. Finally, the
optic tract projects to a number of neural structures including the visual cortex, where the
brain makes sense of it all, completing the retino-cortical pathway.
All along the visual pathway, the visual image is processed hierarchically through parallel
channels, beginning already at the retina (Livingstone & Hubel, 1988). The most important
are the magnocellular (M) and parvocellular (P) pathways, which originate in the retinal
ganglion cell layer. Briefly, the M pathway is mainly dedicated to the perception of depth
and motion, while the P pathway is mostly concerned with the perception of color and fine
detail.
In the following sections, the main structures of the visual pathway will be described in
more detail.
Figure 1. Schematic view of the visual pathway. Reproduced from Bartleby.com©¹⁶ (Gray, 1918).
1.1.1 The Eye
The eye can be considered as an
advanced optical device designed to
focus the visual stimulus on the receptors, avoiding all possible distortion. In
humans, each eye constitutes a separate optical system, each forming a single
image. The structure of the eye is schematized in figure 2a. It is composed of a set
of fluid-filled chambers: the anterior chamber (between the cornea and the iris; filled
with aqueous humor), the posterior chamber (between the iris, the zonule fibers and
the crystalline lens; filled with aqueous humor), and the vitreous chamber (between the
crystalline lens and the retina; filled with vitreous humor). Three layers of tissue
cover these chambers. The cornea (frontal transparent surface), the sclera (white
and opaque surface covering most of the eye), and the limbus (annular tissue
dividing the first two; see fig. 2b) form the outer layer. The middle coat, or uveal
tract, is made of three distinct but continuous structures: the iris (annular tissue just
behind the cornea), the ciliary body (a muscular ring), and the choroid (mainly blood
vessels and dense melanin pigment). The pupil is the hole in the center of the iris
through which light enters the eye cavity. The inner layer, the retina, constitutes the
neural part of the eye. It is located at the image plane of the eye’s optical system
and is juxtaposed to the pigment epithelium.
Light, focused by the cornea and crystalline lens, traverses the vitreous chamber to reach
the retina. As an advanced optical system, the eye is capable of adapting to environmental
conditions and/or task requirements. The iris controls pupil size so that more or less light
is allowed to enter the eye. Through a process called accommodation, the processes of the
ciliary muscle constantly modify the refractive power of the crystalline lens in order to
form a sharper image converging on the retinal visual axis, or fovea. Any light that is not
absorbed by the retina is absorbed by the pigment epithelium, thus preventing image
degradation that could result from reflections on the back of the eye.
Figure 2. a) Structure and optics of the eye. Adapted from Webvision©¹⁷ (Kolb et al., 2003).
b) Schematic representation of the limbus. Modified from the website of the Department of
Optometry and Visual Science of the City University of London¹⁸.
16 http://www.bartleby.com/107/illus763.html
1.1.1.1 The Retina
The retina is, by far, the most complex structure in the eye. It actually forms part
of the central nervous system, representing, therefore, an excellent model for
studying sensory transduction and for understanding information processing in
higher brain circuits. Processing at this stage deserves to be described separately
and in more detail.
The retina presents a laminar structure (fig. 3a) consisting of three layers of
neural cell bodies or nuclear layers, and two layers of synaptic connections or
plexiform layers. Additionally, an outer limiting membrane separates it from the
choroid and an inner limiting membrane separates it from the vitreous chamber. The
retinal pigment epithelium lies adjacent to the neural retina. Even though this
pigmented layer contains no neural tissue, its role is essential for optimal visual
perception. Its main functions are: preventing light reflection, providing metabolic
support for photoreceptors, and contributing to adaptation.
The outer nuclear layer (ONL) constitutes the photosensitive coat of the retina, formed by
photoreceptor cells of two types: rods and cones. Cones function in bright light and are
responsible for fine discrimination and color vision. Rods function in dim light and are
responsible for night vision. These photoreceptors are not equally distributed across the
retina (see fig. 3b; Osterberg, 1935; Curcio et al., 1987). The fovea contains exclusively
cones, providing the highest image resolution. By 10° eccentricity¹⁹, cone density has
fallen rapidly to about 5% of the local photoreceptor count. Conversely, rod density grows
rapidly up to 18° of eccentricity. The inner nuclear layer (INL) contains several classes of
cells: horizontal, bipolar, amacrine, and interplexiform. The ganglion cell layer (GCL)
consists of ganglion cell bodies and some displaced amacrine cells (Wässle et al., 1989).
Ganglion cells are the retinal neurons that ultimately transmit the visual output of the
retina to the remaining visual pathway. Like that of the photoreceptors, ganglion cell
distribution varies across the retina (Sjöstrand et al., 1999b). The fovea contains no
ganglion cells; ganglion cell density increases rapidly to a maximum at around 5° (foveal
border) and then decreases again towards the retinal border. The nerve fiber layer (NFL) is
formed by the unmyelinated axons of ganglion cells crossing the retina towards the optic
disc (optic nerve head). At the outer plexiform layer (OPL) synapses are formed between:
(1) different photoreceptors, (2) photoreceptors and bipolar cells, and (3) photoreceptors
and horizontal cells. The inner plexiform layer (IPL) contains synapses between: (1) bipolar
and ganglion cells, (2) different amacrine cells, and (3) amacrine and ganglion cells.
Photoreceptor, horizontal, and bipolar cells work with graded potentials, while amacrine and
ganglion cells generate neural action potentials.
Figure 3. a) Simplified structure of the retina. The retina is composed of three layers of
neural cells and two synaptic layers surrounded by limiting membranes and the pigment
epithelium. Modified after the original taken from Webvision©²⁰ (Kolb et al., 2003).
b) Density of rods and cones across the human retina. Image taken from Webvision©²¹
(Osterberg, 1935). Cone density peaks in the fovea and falls rapidly outside this region.
Maximum rod density is found at around 18° eccentricity. There are no photoreceptors in the
optic disc (blind spot).
17 http://www.webvision.med.utah.edu/imageswv/draweye.jpeg
18 http://www.city.ac.uk/optometry/Biolabs/Outer%20Coat%20Lab/Outer%20Coat.htm
19 1 mm in the retina corresponds approximately to 3.6° of visual angle (Drasdo & Fowler, 1974).
Information transfer in the retina follows a columnar model. Light must pass through all
retinal layers in order to activate the photoreceptors, which lie in the most distal part of
the retina. The visual pigment of the photoreceptors absorbs photons, and these cells
translate the information into an electrochemical message. Bipolar cells divide the dynamic
range by separating the information into ON/OFF pathways and relay the information to
ganglion cells. As soon as the stimulation signal reaches threshold, ganglion cells respond
with action potentials at a frequency proportional to the graded stimulus. In addition,
vertical modules composed of horizontal and amacrine cells integrate information from
neighboring regions to maximize information content and highlight important image features.
a)
b)
Figure 4. a) Schematization of subsequent connections of retinal cells, illustrating the complexity of
the retinal circuitry. A single ganglion cell might code the information of several photoreceptors, and,
at the same time, a single photoreceptor might interact with various neurons. Taken from the
website22 of the Virginia-Maryland Regional College of Veterinary Medicine. b) Section of the human
fovea illustrating the lateral displacement of photoreceptor-ganglion cell connections. Adapted from
Webvision©23 (Kolb et al., 2003).
Signals are transformed in many ways as they travel through the retina so that a
final message containing several basic image organization features is transmitted
towards the brain. The image processing circuitry of the visual system is already
launched at the early photoreceptor stage to avoid signal deterioration.
Photoreceptor sensitivity is optimized through a huge range of light intensities (from
dark night to bright sunlight) by an adaptive process. At a given background
illumination level, photoreceptors can only detect intensity differences of about three
orders of magnitude. As luminous conditions fluctuate, their operating range is
shifted up or down the intensity scale. Contrast detection is thus developed to the
20
http://www.webvision.med.utah.edu/imageswv/schem.jpeg
21
http://www.webvision.med.utah.edu/imageswv/Ostergr.jpeg
22
http://education.vetmed.vt.edu/Curriculum/VM8054/EYE/RETWIRES.JPG
23
http://www.webvision.med.utah.edu/imageswv/hufovea.jpeg
utmost: perception is possible over a large range of intensities without sacrificing
discrimination (Cotter, 1990). Further along, bipolar cells perform some form of data
compression by responding only at borders between dark and light areas
(center/surround receptive fields). These receptive fields also allow for relative
measurements of color and brightness independent of lighting conditions (Stetten,
2000). Another important consideration concerns retinal cell distribution, proportions,
and subsequent connections. There is a dramatic decrease from photoreceptor count
(about 125 million) to number of ganglion cells (about 1 million), and it has already
been mentioned that the different retinal cells are not uniformly distributed. As a
result, several photoreceptors contact a bipolar cell, and many bipolar cells contact a
single ganglion cell. At the same time, information is split, as one photoreceptor
may interact with various neurons (see fig. 4a). It has been estimated that around
the fovea (2°-3° of eccentricity) a maximum of 3 cones are connected to each retinal
ganglion cell (Sjöstrand et al., 1999a). This cone to ganglion cell connection ratio
decreases with eccentricity, reaching 1 at about 10° and 0.5 at eccentricities of 19°
and beyond. Due to retinal topography, these connections are laterally displaced:
ganglion cells connected to central photoreceptors are distributed over a larger area
and are located at greater eccentricities. This lateral displacement becomes less
important as eccentricity increases (see fig. 4b). Temporal coding of multiple inputs
and nonlinear properties of synaptic cells are used to prevent any information loss
(Slaughter, 1990).
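As a rough illustration of the convergence figures quoted above, the following sketch computes the average photoreceptor-to-ganglion-cell ratio and estimates the cone-to-ganglion-cell ratio at a given eccentricity. Only the anchor values come from the text. The linear interpolation between them, the choice of 2.5° as the perifoveal anchor, and the function names are assumptions made for illustration and are not taken from Sjöstrand et al. (1999a).

```python
# Illustrative sketch only: anchor values come from the text,
# the linear interpolation between them is an assumption.

PHOTORECEPTORS = 125e6   # approximate photoreceptor count
GANGLION_CELLS = 1e6     # approximate ganglion cell count

# (eccentricity in degrees, cones per ganglion cell);
# 2.5 deg is an assumed midpoint of the 2-3 deg range quoted in the text.
ANCHORS = [(2.5, 3.0), (10.0, 1.0), (19.0, 0.5)]


def overall_convergence():
    """Average number of photoreceptors per ganglion cell."""
    return PHOTORECEPTORS / GANGLION_CELLS


def cone_to_ganglion_ratio(eccentricity_deg):
    """Piecewise-linear estimate of cones per ganglion cell (hypothetical)."""
    # Clamp outside the range covered by the quoted values.
    if eccentricity_deg <= ANCHORS[0][0]:
        return ANCHORS[0][1]
    if eccentricity_deg >= ANCHORS[-1][0]:
        return ANCHORS[-1][1]
    # Interpolate linearly inside the segment that contains the eccentricity.
    for (e0, r0), (e1, r1) in zip(ANCHORS, ANCHORS[1:]):
        if e0 <= eccentricity_deg <= e1:
            t = (eccentricity_deg - e0) / (e1 - e0)
            return r0 + t * (r1 - r0)


if __name__ == "__main__":
    print(overall_convergence())
    print(cone_to_ganglion_ratio(14.5))
```

The average ratio of 125 photoreceptors per ganglion cell hides the strong dependence on eccentricity that the interpolated values make visible.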
1.1.2 The Optic Nerve and Optic Tract
The optic nerve and the optic tract are formed by the ensemble of axons of
retinal ganglion cells and constitute the wiring of the visual system. Ganglion cell
axons become myelinated at the optic disc, forming the optic nerve. The central
artery and vein of the retina pass through its center. It has been suggested that
there is a rough topographic representation within the optic disc and that visuotopic
organization varies across the nerve’s length (Fitzgibbon & Taylor, 1996).
Optic nerves from both eyes join at the optic chiasm, forming the optic tract.
Fibers get reorganized so that each optic tract contains axons from the opposite
visual hemifield (see fig. 1). Optic tracts project to several structures. Some fibers
reach the suprachiasmatic nucleus (SCN) in the hypothalamus, whose main function
is the regulation of circadian rhythms24. Input reaching the pretectum is used for
pupil control and accommodation reflexes. The superior colliculus uses the
information for saccadic eye movement control as well as for orienting and attention
movements to novel stimuli. Only the input reaching the lateral geniculate body is
transmitted to the cerebral cortex and ultimately generates visual perception. The
actual function of the geniculate nucleus is not clear yet, but it is believed to control
retinal information flow to the cortex through its feedback connections from other
neural regions.
24
Approximate 24-hour cycle modulating physiological processes.
1.1.3 The Visual Cortex
The visual cortex is the brain region devoted to the integration and discrimination
of the image. This interpretation results in the conscious perception of images as a
representation of the real world. As outlined in figure 5a, it is divided into several
hierarchical areas across which the stimulus is continuously segmented to ultimately
generate perception.
The first visual processing center in the brain is the primary visual cortex (striate
cortex, visual area 1, or V1). Each cerebral hemisphere receives topographically
organized information from the opposite half of the visual field (fig. 5b).
Approximately half of its surface receives input from high-resolution regions (fovea
and its surroundings) and the remainder represents larger peripheral areas. Visual
information is greatly segmented in V1. It is organized into vertical processing
columns of different types. Each column type is concerned with a particular image
feature, such as orientation, color perception, or spatial frequency. Columns of another
type, the ocular dominance columns, receive input from either the right or the left
eye and are thought to be the foundation for stereoscopic vision (ability to perceive
depth and relief).
Figure 5. The visual areas in the brain. a) Unfolded and flattened map of the right hemisphere
displaying the location of different visual areas in the cerebral cortex of the macaque monkey.
Reprinted with permission from Van Essen et al., SCIENCE 255:419-423 (1992) and Felleman et al.,
CEREBRAL CORTEX 1, 1 (1991). Copyright 2006 AAAS and Oxford University Press. b) Retinotopic
map of V1. Image adapted from Mason & Kandel (1991). Copyright 2006 McGraw-Hill Companies,
Inc.
From V1 a number of connections are sequentially made with other visual areas
(see fig. 5a). From V2 and beyond, visual information aspects are further segregated
to specialized areas and organization complexity increases. For example, V4 is
thought to participate in color perception while V5 may play a role in movement
detection and depth estimation.
1.2 Blindness and Low Vision
Simply speaking, blindness is the absence of perception of visual stimuli. The
international ICD-10 standard25 defines it as a visual acuity of less than 3/60 (0.05)
in the better eye with the best possible correction. In parallel, low vision corresponds
to a best-corrected visual acuity of less than 6/18 (0.3) but better than or equal to 3/60.
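The two thresholds above can be expressed as a small classification rule. The sketch below is only an illustration of the definitions as stated in the text. The function name, the labels returned, and the use of exact fractions for the Snellen values are assumptions.

```python
from fractions import Fraction

# Thresholds as quoted in the text (best-corrected acuity, better eye).
BLINDNESS_LIMIT = Fraction(3, 60)    # 0.05
LOW_VISION_LIMIT = Fraction(6, 18)   # about 0.3


def classify_acuity(numerator, denominator):
    """Classify a Snellen fraction such as 6/18 (hypothetical helper)."""
    acuity = Fraction(numerator, denominator)
    if acuity < BLINDNESS_LIMIT:
        return "blindness"
    if acuity < LOW_VISION_LIMIT:
        return "low vision"
    return "neither"


if __name__ == "__main__":
    for snellen in [(1, 60), (3, 60), (6, 60), (6, 18), (6, 6)]:
        print(snellen, classify_acuity(*snellen))
```

Exact fractions avoid rounding problems at the boundaries: an acuity of exactly 3/60 falls in the low vision category, and exactly 6/18 falls outside it.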
In 1990, the worldwide estimate of visually impaired people was 148 million
(Thylefors et al., 1995). Of these, approximately 38 million people were blind, and
110 million were low vision cases at risk of becoming blind. Alarmingly, these
numbers could double by the year 2020. To this date, however, the exact numbers
remain unknown due to an important lack of epidemiological information. What we
certainly know is that blindness represents a lot more than a mere health problem.
On one hand, blind people usually face serious social constraints leading to poorer
social lives, lower education levels, lower employment opportunities, and lower life
expectancies than sighted people (WHO, 1997). A 14-country study on disability
ranked blindness as the fifth most disabling condition behind quadriplegia, dementia,
active psychosis, and paraplegia (Ustun et al., 1999). On the other hand, the
financial burden of blindness is considerable. Based on 1993 figures, it has been
calculated as US$168 billion (Smith & Smith, 1996). The global economic productivity
loss is estimated to grow from US$19 billion in the year 2000 to US$50 billion by the year
2020 if no preventive measures are taken (Frick & Foster, 2003).
Now that the complexity of the visual system has been discussed, it is easy to
realize that blindness or low vision can result from lesion or malfunction at any level
of the visual pathway. The severity of the condition depends directly on the
structure affected. The World Health Organization (WHO) has dedicated a set of fact
sheets to blindness and its impact (WHO, 1997). More recently, some authors
(Margalit & Sadda, 2003; Congdon et al., 2003) have published comprehensive
reviews on visual impairment causes.
The major causes of blindness are shown in figure 6. These are cataract,
trachoma, glaucoma, and onchocerciasis (river blindness). The impact of other diseases is
also acknowledged to be important, but specific numbers are unavailable to this
date. Diabetic retinopathy is publicly documented as the primary cause of visual
impairment among working-age adults (Congdon et al., 2003). Age-related macular
degeneration (AMD), disabling approximately 8 million people worldwide (WHO,
1997), constitutes the most common non-avoidable cause of visual disability.
Retinitis pigmentosa (RP) is the principal cause of inherited blindness with a
prevalence estimated at 1 in 3000 (Humphries et al., 1992). Statistics from
Prevent Blindness America rank it as the 6th leading cause of blindness in the United
States (4.7%), affecting approximately 100,000 Americans (Leonard & Gordon, 2002)
and approximately 1.5 million people worldwide (Boughman et al., 1980; Haim et al.,
1992). In addition, trauma is estimated to be responsible for half a million blindness
cases (Thylefors, 1992).
25
International Statistical Classification of Diseases and Related Health Problems – 10th Revision
Figure 6. Major causes of blindness worldwide. Data taken from Thylefors et al. (1995).
Causes of blindness vary significantly across economic regions (fig. 6). In
developing countries most disabling conditions could be treated or prevented.
Cataract can be successfully corrected with surgery; trachoma can be treated
through hygiene, antibiotic administration, and corrective surgery; childhood
blindness could be reduced by means of immunization, better nutrition, prophylaxis,
and avoidance of harmful medicines (WHO, 2002). An ambitious initiative aiming to
eradicate avoidable blindness through prevention and eye care programs, VISION
2020, has been launched by the WHO (2000).
In developed countries, like Switzerland, non-avoidable diseases are most
common. The impact of these conditions, mainly age related, is expected to increase
due to current trends of population ageing. Medical intervention is not expected to
significantly reduce their impact in the near future since clinical treatments for the
major diseases in this category are currently unavailable. Laser photocoagulation is
practically the only therapy available for AMD and benefits only a limited number of
patients. Other experimental treatments for this disease like photodynamic therapy,
pharmacologic inhibition, surgical intervention, and radiation therapy are being
explored (Ciulla et al., 1998). Periodic screening and early laser treatment have
proven to be helpful tools for preventing blindness in patients suffering from diabetic
retinopathy, and alternative therapies are currently being studied (Harding, 2003).
Genetic therapy is expected to be the best alternative for retinitis pigmentosa (Hims
et al., 2003).
In this context, the role of rehabilitation becomes crucial. Low vision and
blindness aids for daily living have greatly evolved from guide dogs and canes to
complex technological devices, although traditional methods are still greatly
appreciated and frequently used. A number of studies have demonstrated the
efficacy of these systems in improving quality of life and reducing perceived disability
(Margrain, 1999; Margrain, 2000). New systems compensate for the visual deficit either
with augmentation devices (strong eyeglasses, telescopes, and video/computer
magnifiers) or by sensory substitution (replacing vision with another sense such as
hearing or touch). The variety of the available aids is enormous; a simple internet
search will return plenty of references and good overviews of such systems can be
easily found in the literature (Peli et al., 1991; Kaczmarek, 2000; Peli, 2001).
1.3 Visual prostheses as a means to rehabilitate blindness
Research and technology are also trying to make use of existing knowledge to
restore visual function when blindness is irreversible. The principle is quite simple:
the broken visual information path would be ‘repaired’ by substituting the defective
structure with some kind of artificial system. Recently, a new alternative has come
into focus: visual prosthetic devices.
The foundations of the visual prosthetic field were established as early as 1755,
when LeRoy discovered that electricity applied to a blind eye resulted in light
perception (Clausen, 1955). Through the years, the effects of electricity on the
human body continued to be explored. In particular, the revolutionary work of
Einthoven gave new insights into the therapeutic use of electricity (see e.g.
Einthoven & Jolly, 1908).
The relationship between electricity and vision was, however, not discussed
again until the 20th century when a group of researchers described phosphenes26
elicited by direct electrical stimulation of the cortex while performing surgery
(Löwenstein & Borchardt, 1918; Krause, 1924; Foerster, 1929; Urban, 1937; Penfield
& Jasper, 1954). These findings led Giles Brindley and his colleagues to the first
attempt at a “visual prosthetic implant” (Brindley & Lewin, 1968a; Brindley & Lewin,
1968b; Brindley, 1973). Two volunteers were implanted with arrays of platinum
electrodes over the occipital cortex and stimuli were delivered by radio transmission.
Some years later, Dobelle followed Brindley’s footsteps. Several experiments with
acute electrode configurations were performed before proceeding to the implantation
of permanent devices (Dobelle & Mladejowsky, 1974; Dobelle et al., 1974; Klomp et
al., 1977). Several volunteers participated in these experiments, and two have kept
the implant for more than 20 years (Dobelle, 2000). Results from both groups
26
The Concise Oxford English Dictionary (Oxford Reference Online, 2004) defines the word
phosphene as: “a sensation of a ring or spot of light produced by pressure on the eyeball or direct
stimulation of the visual system other than by light”. This term will appear frequently throughout this
dissertation but the reader must take it with caution since its definition is quite ambiguous. On one
hand, it makes no allusion to the exact site on the visual pathway where the perception originated
(e.g. visual cortex or retina). On the other hand, this designation does not take into account the
intensity of the perception nor its temporal profile. We use it despite its ambiguity because there is
not a more accurate one and because it is broadly used in the artificial vision community.
demonstrated that phosphenes could be successfully evoked, and Dobelle was even
able to obtain patterned perceptions (Dobelle et al., 1976). Yet, these first attempts
encountered a number of significant difficulties related to the use of surface
electrodes, like merging phosphenes, multiple phosphenes elicited by a single
electrode, fading perceptions, and high currents needed to reach stimulation
threshold. None of these implants proved to be useful. Nevertheless, these
pioneering efforts demonstrated the feasibility of the approach.
Since then, technological advances and success in other rehabilitation fields, like
cochlear implants (Rauschecker & Shannon, 2002), have boosted interest in the idea
of developing visual neuroprostheses. Such a device will be composed of two
modules, one external and one implanted (see fig. 7). In general, the external
module will contain an image capture unit (mini-camera), an image-processing unit,
and a wireless transmission unit communicating with the implanted module. The
internal module will contain all stimulator electronics and the neural interface unit
(electrode array). Some detailed descriptions of a system of this nature can be found
in the scientific literature (Warren & Normann, 2000; Liu et al., 2003).
Figure 7. Basic elements of a visual neuroprosthesis. The external module captures the visual
stimulus, processes and transmits the information to the implanted module. The implanted module
communicates directly with the target neural tissue.
In short, the task of the external module is to capture the visual scene,
transform it into an ‘electrical image’ that can be correctly interpreted by the brain,
and transmit the signals to the implanted module. The image capture unit consists of
a photodiode array, a CCD, or a miniaturized camera. Due to the limited number of
electrodes in the implant, spatial resolution will not impose any particular constraints
on this component. Temporal resolution is not expected to be a problem either since the
response of the human visual system is relatively slow (around 30 Hz). The role of
image processing is to modify the input image so that the stimulus that will
ultimately reach the brain contains enough topographic information to be effectively
identified and used. The complexity of this stage will therefore depend directly on the
implantation site (especially on its visuotopic organization) and on the degree of
plasticity of the remaining visual pathway. An encoder for spatial remapping of video
signals on retinal prostheses is currently being developed by Eckmiller (1997). The
description of several image processing techniques for information content
enhancement in artificial vision systems can be found in Boyle et al. (2002). Wireless
transmission will be used to pass on the information to the implanted module and to
deliver power to its components at the same time. This transfer can be achieved by
radio frequency or electromagnetic waves. The number of electrodes in the implant
and the data compression/encoding algorithms used by the image-processing unit
will determine the bandwidth needed for transmission. Since the distance between
the transmitter and the receiver will be rather short (approximately 1 to 3 cm),
power coupling can be achieved quite efficiently.
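To make the role of the image-processing unit and the bandwidth argument more concrete, the sketch below reduces a grayscale image to one value per electrode by block averaging and then estimates the raw data rate of the wireless link. This is a minimal illustration under stated assumptions, not the encoding used by any of the cited systems: block averaging, the function names, the 100-electrode array, and the 8 bits per electrode are all hypothetical choices. Only the 30 Hz frame rate is taken from the text.

```python
def downsample_to_electrodes(image, grid_rows, grid_cols):
    """Block-average a grayscale image (list of rows) onto an electrode grid.

    Assumes the image is at least as large as the grid; pixels left over
    when the dimensions are not exact multiples of the grid are ignored.
    """
    height, width = len(image), len(image[0])
    block_h, block_w = height // grid_rows, width // grid_cols
    grid = []
    for r in range(grid_rows):
        row = []
        for c in range(grid_cols):
            block = [
                image[y][x]
                for y in range(r * block_h, (r + 1) * block_h)
                for x in range(c * block_w, (c + 1) * block_w)
            ]
            row.append(sum(block) / len(block))
        grid.append(row)
    return grid


def required_bandwidth_bps(n_electrodes, bits_per_electrode, frame_rate_hz):
    """Raw (uncompressed) data rate needed to refresh every electrode."""
    return n_electrodes * bits_per_electrode * frame_rate_hz


if __name__ == "__main__":
    image = [
        [0, 0, 10, 10],
        [0, 0, 10, 10],
        [20, 20, 30, 30],
        [20, 20, 30, 30],
    ]
    print(downsample_to_electrodes(image, 2, 2))
    # Hypothetical 10 x 10 array, 8 bits per electrode, 30 Hz refresh.
    print(required_bandwidth_bps(100, 8, 30))
```

Under these assumptions the raw rate grows linearly with the number of electrodes, which is why the text ties the required bandwidth to the electrode count and to the compression applied by the image-processing unit.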
The implanted module will be in charge of communicating with the nervous
system, obviously imposing important physical and biological biocompatibility
considerations. The role of the stimulator is to ‘transduce’ the electrical signals into
neural signals that can correctly stimulate the neurons at the particular implant
location. The main design considerations for this element have been outlined in detail
by Jones & Normann (1997) and in more recent reviews (Maynard, 2001; Margalit et
al., 2002). The electrode array constitutes the actual interface with the neural tissue.
Oxidized iridium is the material most frequently used for stimulation electrodes in
implantable neuroprostheses as it has proven to be highly biocompatible and
effective (Blau et al., 1997; Weiland & Anderson, 2000). In order to avoid tissue
heating, electrolysis and electrode cross-talk, the size and density of this element will
be mainly limited by the intensity of the stimulating currents used. Stimulation
currents will in turn depend directly on electrode geometry, the type of cells being
stimulated, as well as the distance between target cells and the stimulating
electrodes (Palanker et al., 2004).
Currently several groups are working towards the development of a visual
prosthesis, each of them attempting to restore visual function at different levels of
the visual pathway. Several reviews on the subject have been published recently
(Greenberg, 2000; Warren & Normann, 2000; Maynard, 2001; Margalit et al., 2002;
Zrenner, 2002b; Lakhanpal et al., 2003; Weiland et al., 2005). The main
biocompatibility, electrical, and psychophysical design considerations common to all
design approaches have been outlined in a number of research papers (Warren &
Normann, 2000; Humayun, 2001; Maynard, 2001; Margalit et al., 2002). In the
following sections, the different approaches will be presented. Prosthesis designs
using alternatives to electrical stimulation will also be mentioned. Afterwards, the
advantages and disadvantages of each design will be discussed.
1.3.1 Cortical Stimulation
Cortical prostheses attempt to restore vision by direct electrical stimulation of the
primary visual cortex (fig. 8). This approach was the first to be seriously considered.
The pioneering efforts outlined earlier in this chapter (Brindley & Lewin, 1968a;
Brindley & Lewin, 1968b; Dobelle & Mladejowsky, 1974; Dobelle et al., 1974),
belong to this category. These experiments encountered a number of problems
related to the use of surface electrodes. Because of the large surface area of the
electrodes (1 mm²), high currents (from 1 mA to 3 mA) were needed to generate
phosphenes. Inter-electrode spacing had to be 3 mm in order to minimize
interactions between electrodes, and even then subjects reported seeing ‘halos’
surrounding and joining individual phosphenes. The perception of a spatially organized
set of multiple phosphenes could never be achieved. Therefore, these devices never
proved to be useful to implanted patients. However, some interesting results have
been published lately by Dobelle (2000). The vision of one of the volunteers
implanted in 1978 with an array of surface electrodes (Klomp et al., 1977), has
improved significantly due to more adequate processing of the visual information by a
5th generation stimulation system.
Figure 8. Illustration of the concept of a cortical visual prosthesis. Image taken from the John A.
Moran Eye Institute website27 at the University of Utah.
New advances in fabrication and microtechnology have led to the design and
development of more selective neural interfaces (Rutten, 2002). In the 1990’s, Schmidt
and his group conducted a series of experiments using a penetrating electrode array
implanted on the primary visual cortex of a human volunteer (Bak et al., 1990; Schmidt
et al., 1996). Stimulation thresholds were as low as 1.9 µA, and separate phosphenes
were detected with closely spaced stimulating electrodes (only 250 µm or 500 µm
apart). This study clearly demonstrated the advantage of using
penetrating electrode arrays instead of surface electrodes, setting the physiological
foundation of most cortically based artificial vision systems being developed
nowadays.
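The practical consequence of these figures can be estimated with simple arithmetic. The sketch below compares the electrode density achievable at the spacings quoted above and the ratio between the stimulation currents reported for surface and penetrating electrodes. It assumes an idealized flat square grid and ignores cortical folding. The function names are hypothetical, and the comparison uses the lowest reported values for each electrode type.

```python
def electrodes_per_cm2(spacing_mm):
    """Electrodes per square centimetre on an idealized square grid."""
    per_cm = 10.0 / spacing_mm          # electrodes along 1 cm
    return per_cm ** 2


def threshold_ratio(surface_current_a, penetrating_current_a):
    """How many times larger the surface-electrode current is."""
    return surface_current_a / penetrating_current_a


if __name__ == "__main__":
    # Spacings quoted in the text: 3 mm (surface), 500 um and 250 um (penetrating).
    for spacing in (3.0, 0.5, 0.25):
        print(spacing, "mm ->", round(electrodes_per_cm2(spacing), 1), "per cm^2")
    # Lowest currents quoted in the text: 1 mA (surface) and 1.9 uA (penetrating).
    print(round(threshold_ratio(1e-3, 1.9e-6)))
```

Under these assumptions, moving from a 3 mm to a 250 µm pitch increases the possible electrode density by a factor of 144, while the lowest reported threshold current drops by a factor of roughly 500.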
Recent efforts in this field have concentrated on the development of high
electrode count arrays (Jones et al., 1992; Hoogerwerf & Wise, 1994; Kewley et
al., 1997; Bai et al., 2000; Bai & Wise, 2001). A group at the University of Utah has
developed a silicon-based, 10x10 penetrating electrode matrix: the Utah Electrode
Array (Campbell et al., 1991; Jones et al., 1992), specifically conceived as an
interface for the cerebral cortex. This electrode array projects out of a 0.2 mm
substrate designed to rest on the cortical surface (fig. 9a). Electrodes are 1.5 mm
long ending in platinum coated conical tips with a radius of curvature of about 3 µm
(fig. 9b).
A pneumatic tool for adequately inserting the Utah Electrode Array into the cortex
and the surgical procedure for implantation have also been developed (Rousche &
Normann, 1992; Maynard et al., 2000). A series of behavioral experiments conducted
in the auditory cortex of the cat demonstrated that a response could be obtained
over periods of about 100 days with stable thresholds, indicating no damage to
neurons in the vicinity (Rousche & Normann, 1999). Chronic single and multi-unit
responses in the cat and monkey have been recorded over several years, providing
preliminary evidence of the functionality and biocompatibility of the implant (Maynard
et al., 1999). However, these studies have also highlighted a significant problem
concerning the mechanical stability of the implant on the brain. Formation of
adhesions between the dura mater28 and the electrode array has been observed as
a consequence of the immunological response of the implanted tissue (Rousche &
27
http://www.moraneyecenter.org/research/normann/normann.htm
28
Outermost and thickest of the three membranes (meninges) covering the brain and the spinal cord.
Figure 9. The Utah Electrode Array. a) Penetrating electrode array developed at the University of
Utah. b) Platinum coated tips of the Utah Electrode Array. Reprinted from VISION RES, 39(15),
Normann et al., A neural interface for a cortical vision prosthesis, 2577-2587, Copyright 1999, with
permission from Elsevier.
Normann, 1998b). This provoked constant relative movement between the brain and
the electrode array29. Histological analyses revealed that this mechanical coupling
resulted in constant displacement of the electrodes within the surrounding cortical
structures, causing local trauma and modifying electrode impedance (Rousche &
Normann, 1998a). A new technique to prevent these dural adhesions has been
proposed (Maynard et al., 2000). It consists of placing a Teflon® sheet between the
electrode array and the dura mater. This procedure appears to be effective as long
as the Teflon® cover remains in its initial position. The biocompatibility of the Utah
Electrode Array is still being assessed and validated. Once this assessment is completed,
the research group plans to proceed with psychophysical studies of phosphene
perception in human volunteers (Normann et al., 1999).
These encouraging results, together with the development of new and more
selective interfaces with the nervous system, have given birth to other visual
prosthesis designs, stimulating the visual system at more peripheral sites, such as
the retina or the optic nerve.
1.3.2 Optic Nerve Stimulation
A research group at the Université Catholique de Louvain in Brussels, Belgium,
has launched a particular visual prosthesis development initiative, interfacing with
29
Brain tissues are not attached together; instead, the brain “floats” on the cerebrospinal fluid, inside
the cranial cavity (Rowland et al., 1991). This configuration provides the necessary support to
maintain the brain’s 3D structure and constitutes a mechanical cushion protecting the nervous tissue
from harmful forces resulting from head movement and impact. Therefore, there is permanent relative
movement between the brain and surrounding tissue.
the visual system at the optic nerve: the Microsystems Based Visual Prosthesis
(MiViP; Veraart et al., 1998). At this level, the entire visual field is represented in a
relatively small area; phosphenes could, therefore, be evoked over a large portion
of the visual field using only a few contacts.
In February 1998, a female volunteer, totally blind from RP, was implanted with
a 4-contact self-sizing spiral cuff electrode placed around the optic nerve
(Veraart et al., 1998). This particular electrode design had proven effective for
selective activation in previous studies (Veraart et al., 1993). The four electrode
contacts were labeled 0°, 90°, 180°, and 270° according to their angular position
around the optic nerve. Stimulation currents were brought to the electrode
through a percutaneous connector. The first results showed that the electrode did
not induce any sensations other than visual. Phosphenes of different forms,
colors, and sizes could be successfully elicited. These phosphenes were
distributed in a disorderly manner over a visual field of approximately 60° vertically
and 85° horizontally (Veraart et al., 1998).
Nevertheless, a certain relationship between the stimulation quadrant30 and the
perception quadrant31 was respected near threshold (fig. 10).
Figure 10. Retinotopic distribution of 64 phosphenes according to the active contact in
the self-sizing spiral cuff electrode. Near threshold a certain relationship between the
stimulation quadrant and the perception quadrant was preserved. Reprinted from BRAIN
RESEARCH, 813(1), Veraart et al., Visual sensations produced by optic nerve stimulation
using an implanted self-sizing spiral cuff electrode, pp. 181-186, Copyright 1998, with
permission from Elsevier.
In August 2000, the percutaneous connector was replaced by an implanted
neurostimulator and an antenna used for telemetry (Delbeke et al., 2002). The
concept of the system as it is implemented nowadays is schematized in figure 11a.
The implanted components are visible in the x-ray of the volunteer presented in
figure 11b.
For prostheses to provide any form of useful vision, a precise correlation between
stimulation parameters and the character of the perceived phosphenes must
therefore be established. A set of equations intended to define this relationship has
been derived (Delbeke et al., 2003a; Delbeke et al., 2003b). In this case, a
correlation seems to exist between perception threshold and pulse duration, number,
and frequency. Conversely, phosphene characteristics such as luminosity, size, and
30
Electrode position around the optic nerve.
31
Phosphene position in the visual field.
Figure 11. Elements of the optic nerve visual prosthesis. a) Schema illustrating the concept of the
system. b) X-ray of the blind volunteer showing the implanted components. Reprinted from Veraart et
al., ARTIF ORGANS 27(11):996-1004 (2003). With permission of Blackwell Publishing.
position seem to be best predicted by stimulus intensity. A screening test for
identifying potential optic nerve prosthesis candidates has also been developed
(Delbeke et al., 2001).
Sets of psychophysical studies have been carried out to assess the potential
benefits of the optic nerve prosthesis. For these experiments, interleaved stimulation
was used to evoke patterns of 4 to 24 phosphenes and the volunteer was able to
scan the environment using a head-mounted camera. After several months of
training, the volunteer was able to accurately identify, localize, and grasp different
daily life objects located among others (Lambert et al., 2003). However, the time
needed to complete the task was considerable: about 60 s to grasp a particular
object surrounded by others. Performance also increased significantly with practice in
pattern recognition and orientation discrimination tasks (Delbeke et al., 2002;
Veraart et al., 2003). These tasks were evaluated using images consisting of bars
with different orientations, subtending a visual angle of 32° x 2.2°. For pattern
recognition, the volunteer reached a score of 63% correct in a processing time of 60
s. Concerning orientation discrimination, the volunteer reached a score of 100%
correct in 8 s.
Altogether, these results undoubtedly demonstrate the functionality and safety of
the system. The usefulness of this prosthesis remains, however, unclear.
Veraart’s group claims that certain basic tasks do not require much resolution, and
that with enough training the volunteer has benefited from the limited visual input
provided by the prosthesis. This is not obvious from the results of psychophysical
tests. For example, it took 60 s for the volunteer to correctly identify and grasp a
required object, under high contrast conditions and after training. This processing
time cannot be judged as acceptable from any point of view; one can easily
demonstrate that it takes considerably less time to achieve the same task using
touch. Warren & Normann (2000) suggest that a number of issues be investigated to
determine the feasibility of the approach: (1) the significance and viability of
functional optic nerve fiber populations; (2) the role of electrical stimulation in
preserving the optic nerve from continued degeneration; (3) the number of
phosphenes required to achieve given levels of task performance; and (4) whether
penetrating electrode array designs would produce more focalized stimulation and
better control of phosphene parameters.
1.3.3 Retinal Stimulation
Figure 12. Retinal implant design approaches: the epiretinal implant and the subretinal implant.
Reprinted with permission from Zrenner, SCIENCE 295:1022-25 (2002). Copyright 2006 AAAS.
The retina is of strong interest in the context of artificial vision. Due to its relatively
simple structural organization and surgical accessibility, it has been particularly targeted
as a feasible implantation site for visual prostheses. A number of studies have revealed
that when blindness is due to progressive destruction of rods and cones, inner retinal
cells are preserved (Santos et al., 1997; Medeiros & Curcio, 2001; Cursiefen et al., 2001;
Kim et al., 2002a;
Kim et al., 2002b). This occurs in AMD and RP, which, as already discussed, are
diseases that have a significant impact in developed countries like Switzerland. This
last approach thus consists of electrically stimulating the visual system through
retinal cell layers that are still functional. A research group at the Johns Hopkins
Hospital completed a series of acute experiments with temporary electrode arrays
placed over the surface of the retina of blind volunteers. Their results demonstrated
that electric stimulation at this level results in visual perception, and that simple
forms can be perceived in response to patterned electrical stimulation (Humayun et
al., 1996; Weiland et al., 1999; Humayun et al., 1999). On this basis, two different
retinal prosthesis designs have been envisioned (fig. 12). In the first approach, the
epiretinal implant, the stimulating electrode array is placed over the retinal surface,
between the vitreous humor and the inner limiting membrane. In the second
approach, the subretinal implant, the electrode array is to be positioned in the
subretinal space, substituting the degenerate photoreceptors.
1.3.3.1 Epiretinal Implant
The configuration of an epiretinal implant is similar to the prosthesis designs
described above. It consists of an implanted module including the electrode
array and the stimulator, and an external module comprising a camera that captures
the visual scene and an image processor that transforms the input into a pattern of
currents that can be correctly interpreted by the brain (fig. 13). This approach has been
adopted by several research groups all around the world (Wyatt & Rizzo, 1996; Humayun
et al., 1996; Rizzo & Wyatt, 1997; Humayun et al., 1999; Grumet et al., 2000; Suaning &
Lovell, 2001; Humayun, 2001; Liu et al., 2003).
Figure 13. Concept of an epiretinal implant. An external camera captures the image (A). The
signals are wirelessly transmitted (B) to the implanted module (C). The stimulating electrode
array (D) is implanted over the retinal surface. Reprinted from VISION RES, 43(24), Humayun et
al., Visual perception in a blind subject with a chronic microelectronic retinal prosthesis,
pp. 2573-2581, Copyright 2003, with permission from Elsevier.
Implantable electrode arrays have been developed and tested in animals. A research group in
Australia has developed a specific circuit containing 100 stimulation channels intended for
retinal neurostimulation (Suaning & Lovell, 2001), and the corresponding surgical
implantation technique has been developed and tested in the ovine eye (Kerdraon et al.,
2002). Preliminary results confirmed that there was no
evidence of macroscopic trauma. However, further experimentation is required to
extensively evaluate both the biocompatibility of the implant and the surgical
procedure. Another research group at the Doheny Retina Institute (Los Angeles,
California, U.S.A.) implanted 4 mixed-breed sighted dogs with 5x5 platinum disc-shaped
electrode arrays. The implants remained in place during a follow-up period of
several months (Majji et al., 1999). Electrophysiological and histological tests
revealed that the retinal tissue under the electrode array was preserved and
remained functional. The feasibility of the surgical procedure and the biocompatibility
of the implanted material were thus demonstrated.
Meanwhile, a first-generation retinal implant designed for use in humans has
been developed by the research group at the Doheny Retina Institute (Fujii et al.,
2003). Two blind volunteers were implanted with 4x4 platinum electrode arrays
attached to an external microelectronic circuit (Yanai et al., 2003). Phosphenes
evoked by a single electrode were described either as a round spot of light or as a
lighted center surrounded by a black ring, darker than the background (Humayun et
al., 2003). Statistical analysis of perception thresholds measured, on one subject,
over the first 10 weeks of testing showed no significant change for 10 electrodes, a
significant decrease for 3 electrodes, and a significant increase for the remaining 3
electrodes. When stimulation was computer-controlled, the subjects were able to
describe the direction of two sequentially activated electrodes as up, down, left, or
right in 70% of the cases. When scanning the environment using a head-mounted
camera, the subjects were able to detect lighting conditions in 100% of the cases
and to recognize simple forms (the orientation of an L) with 75% accuracy (Yanai
et al., 2003). Other tests conducted with the head-mounted camera on only one
subject showed that he was able to detect the movement of a flash of light in a dark
room in 100% of the cases, to detect the movement of an object in 80% of the
cases, and to discriminate movement direction in 70% of the cases (Humayun et al.,
2003).
Another research group at the Harvard Medical School performed in-vivo
experiments on acute epiretinal stimulation during surgical trials on humans (Rizzo et
al., 2003a; Rizzo et al., 2003b). The main purpose of these studies, in which five
subjects totally blind from RP and one with normal vision [32] participated, was to
determine perceptual thresholds and to explore the relationship between the pattern
of electrical stimulation and the perception induced. Electrical stimulation was
delivered with external current sources and through either needle electrodes (250 µm
in diameter), or an iridium oxide microelectrode array (electrode diameters of
400 µm, 100 µm, or 50 µm). The results showed that perception thresholds were
significantly higher in blind patients, exceeding safe charge density estimates.
Furthermore, thresholds increased with the severity of blindness. The lowest
thresholds detected in blind volunteers, with the 400 µm electrodes, were above 0.32
mC/cm², compared to 0.08 mC/cm² for the normal volunteer. These findings
question the feasibility of long-term epiretinal stimulation as the means to restore
vision. Single electrodes induced percepts that were considerably smaller than the
size of the active electrode. Blind subjects perceived forms that matched the
electrical stimulation patterns only in 48% and 32% of the cases for single-electrode
and multiple-electrode trials, respectively. Stimulation patterns and percepts matched
in 57% of the cases in the normal subject. The low matching score observed in the
normally sighted patient suggests that the results observed in blind patients were not
due to retinal pathology alone, but also to the lack of effective stimulation methods.
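The charge densities quoted above can be converted into absolute charge to give a sense of scale. The following minimal sketch assumes a flat disc electrode whose active area equals its geometric area; this simplification and the function names are illustrative assumptions and are not taken from Rizzo et al. (2003a; 2003b).

```python
import math


def electrode_area_cm2(diameter_um: float) -> float:
    """Geometric area of a flat disc electrode, in cm^2 (assumed geometry)."""
    radius_cm = (diameter_um / 2.0) * 1e-4  # 1 um = 1e-4 cm
    return math.pi * radius_cm ** 2


def charge_nC(density_mC_per_cm2: float, diameter_um: float) -> float:
    """Charge (in nC) that produces the given charge density on the disc."""
    charge_mC = density_mC_per_cm2 * electrode_area_cm2(diameter_um)
    return charge_mC * 1e6  # 1 mC = 1e6 nC


# Threshold densities reported in the text for the 400 um electrodes.
blind_nC = charge_nC(0.32, 400.0)
normal_nC = charge_nC(0.08, 400.0)
print(f"blind: {blind_nC:.1f} nC, normal: {normal_nC:.1f} nC")
```

Under these assumptions, a threshold of 0.32 mC/cm² on a 400 µm electrode corresponds to roughly 0.4 µC, four times the charge obtained for 0.08 mC/cm².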
[32] The volunteer with normal vision underwent enucleation of the eye because of orbital cancer.
1.3.3.2 Subretinal Implant
Figure 14. Concept of a subretinal implant. A microphotodiode array is used to transform the
light entering the eye into a pattern of electrical stimulation currents. Source: Alfred Stett,
NMI Reutlingen.
In the case of subretinal implants, the optics of the eye and an array of micro-photodiodes
replace the external image capture module of the general visual prosthesis model (fig. 14).
In other words, the photodiodes incorporated into the electrode array capture and transform
the incident light into electric stimulation currents in situ. This way, the micro-photodiodes
replace the function of the lost photoreceptors and the optics of the eye are used to project
the light onto these detectors. Neither an external capture module nor an image-processing
unit is therefore needed in this configuration. Two
research groups have adopted this approach (Chow & Chow, 1997; Zrenner et al.,
1997; Peyman et al., 1998; Zrenner et al., 1999; Zrenner, 2002b). It is also worth
mentioning that quite recently the research group from Harvard Medical School
decided to change the design of their prosthesis from the epiretinal approach (Wyatt
& Rizzo, 1996; Rizzo & Wyatt, 1997) to a subretinal design because of its inherent
biocompatibility and engineering advantages (Yamauchi et al., 2004; Rizzo et al.,
2004).
To date, two prototypes of subretinal implants have been developed. These
are silicon-based chips of about 2.5 mm to 3 mm in diameter containing several
hundred or even thousands of micro-photodiodes. Each photodiode is
connected to a microelectrode that can be made of gold, platinum, iridium oxide
or TiN (Zrenner et al., 1997; Peachey & Chow, 1999).
The surgical techniques for implanting such prototypes have been developed. The
classic procedure, ab-interno, is displayed in figure 15a. It consists of entering the
back of the eye through the vitreous chamber and placing the implant under the retina
through a small incision using a special implantation tool (Peyman et al., 1998;
Zrenner, 2002a). The German consortium, led by Zrenner, has adopted another
approach, ab-externo (fig. 15b). A small scleral flap is prepared and the implant is
pushed between the pigment epithelium and the retina with a custom-made plastic
foil. Both techniques have been tested and validated in animals.
Figure 15. Surgical procedures for implantation of subretinal visual prosthesis. a) Ab-interno
approach. b) Ab-externo approach. Images courtesy of Jan Monzer.
Biocompatibility tests performed in vitro and in vivo in different animal models
have returned very encouraging results. Rabbits and pigs were implanted for 14
months (Schwahn et al., 2001; Kohler et al., 2001). The implants remained in place
and the histology of the retina was not pathologically transformed. Similar tests have
been completed in cats (Pardue et al., 2001; Chow et al., 2001). Follow-ups up to 27
months after surgery revealed the stability of the implant in the subretinal space. The
comparison of the ERG response of normal and implanted eyes showed that the
latter presented normal waveforms, only slightly smaller in amplitude. Nonetheless,
the outer nuclear layer covering the implant disappeared almost completely,
suggesting that there was some kind of obstruction of the flow of nutrients to these
layers. Future implants will probably have to be perforated in order to allow for the
exchange of nutrients between the pigment epithelium and the surviving retinal
layers.
Concerning the long-term stability of the implants it has been reported that TiN
electrodes do not show any signs of corrosion after 18 months of implantation and
can be considered bio-stable. This was also the case for iridium oxide electrodes
(Chow et al., 2002), but not for electrodes made of gold (Chow et al., 2001).
Spatial resolution and operational range for multisite electric stimulation have
been evaluated on isolated chicken retina (Stett et al., 2000). Ganglion cell
sensitivity, related to locally applied electric charges, seems to be well delimited in
space. An array with an inter-electrode spacing of about 100 µm should allow for
retinotopic stimulation. It has also been demonstrated that light reaching the retina
in natural conditions does not contain enough energy for the photodiodes to produce
adequate neuronal stimulation (Zrenner et al., 1997). It appears therefore imperative
to provide additional energy (induction or IR) to the implant to cope with this deficit.
Recently, a new idea for a self-powered prosthesis has been proposed by Palanker et
al. (2003). In this design, ambient light falling in regions of the retina not covered by
the implant is gathered with additional photodiodes (placed, for example, in the
anterior chamber). The extra power supplied by these secondary photovoltaic cells
could provide enough energy for adequate stimulation.
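To relate electrode spacing on the retina to visual angle, a simple linear conversion can be used. The sketch below assumes about 290 µm of retina per degree of visual angle, a commonly used approximation for the central human retina. This factor is an assumption introduced here for illustration; it is not taken from the studies cited above and does not apply to the isolated chicken retina used by Stett et al. (2000).

```python
# Assumed conversion factor for the central human retina (illustrative only).
UM_PER_DEGREE = 290.0


def retinal_um_to_degrees(distance_um: float,
                          um_per_degree: float = UM_PER_DEGREE) -> float:
    """Convert a distance on the retina (um) into degrees of visual angle."""
    return distance_um / um_per_degree


# Inter-electrode spacing of about 100 um, as discussed in the text.
spacing_deg = retinal_um_to_degrees(100.0)
spacing_arcmin = spacing_deg * 60.0

# A chip of about 3 mm in diameter, as described for the prototypes.
chip_deg = retinal_um_to_degrees(3000.0)

print(f"spacing: {spacing_deg:.2f} deg ({spacing_arcmin:.1f} arcmin)")
print(f"chip extent: {chip_deg:.1f} deg")
```

With this assumed factor, a 100 µm spacing corresponds to roughly 0.35° (about 21 minutes of arc), and a 3 mm chip would span roughly 10° of visual field.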
The first implantations in human volunteers have been performed
recently (Chow et al., 2002; Chow et al., 2003). Six subjects received a circular
implant measuring 2 mm in diameter and containing approximately 5000 electrodes.
These researchers report that the patients did not have any problems with the
implant; there was no sign of significant infection, rejection, inflammation, migration,
erosion, or retinal detachment. Apparently, all of them presented an improvement in
visual perception (subjective and objective). These surprising results have sparked
an intense debate, especially because improvements have been observed outside of
the retinal area being stimulated by the implant and since there is no additional
supply of energy in this model. In this case, it is most probable that the improvement
of visual function experienced by subjects was related to some kind of neurotrophic
effect (Chow et al., 2003).
1.3.3.3 The “CMOS-retina”: a Swiss project
The work for the present dissertation was elaborated within the framework of the
CMOS-retina Swiss project. The ultimate goal of this effort is to develop a prosthesis
that will effectively restore useful vision to blind patients using subretinal
stimulation. To this purpose, the microtechnical, neurophysiological, psychophysical
and surgical aspects involved in prosthetic vision are being simultaneously
investigated. This way, all the necessary fundamental medical and technical
knowledge will be established.
This project involves the collaboration of interdisciplinary teams, each of which is
renowned in its field: the Ophthalmology Clinic of the Geneva University Hospitals
(HUG) [33], experienced in clinical ophthalmology, visual psychophysics, and eye
surgery; the Institute for Microsystems (IMS) and the Microelectronics Laboratory
(LEG) at the Lausanne Federal Polytechnic School (EPFL), experienced in
microfabrication techniques; and the Department of Physiology at the medical center
of the University of Geneva (CMU), experienced in retinal physiology. A diagram
illustrating the general organization of the project is presented in figure 16.
Figure 16. Institutional diagram of the CMOS-retina Swiss project. This project unifies synergistic
efforts of a highly interdisciplinary consortium composed of specialists in clinical ophthalmology and
human psychophysics (Geneva University Hospitals, HUG), in microelectronics (Lausanne Federal
Polytechnic School, EPFL) and in electrophysiology (Medical Center of the University of Geneva, CMU).
In recent years, two layouts for a subretinal implant have been developed,
integrated and tested (Ziegler, 2002; Ziegler et al., 2004). The mechanical
procedures for the fabrication of these microchips and for their packaging into a
polyimide protection film have been studied. Passive implants have been
manufactured for in vitro and in vivo testing and a mathematical model of the
electrode-retinal tissue contact has been established. A setup for electrophysiological
recordings in isolated retina has been prepared and the characteristics for electrical
pulses necessary for the stimulation of retinal neurons have been determined (Lecchi
et al., 2004; Lecchi et al., 2006). Psychophysical experiments that simulate artificial
vision in normal subjects have been performed to determine the fundamental design
guidelines of the new device (Sommerhalder et al., 2003; Pérez Fornos et al., 2004;
Sommerhalder et al., 2004; Pérez Fornos et al., 2005a). Several algorithms for the
pixelization of the stimuli have been tested and compared (Pérez Fornos et al.,
2005b). Recently, the first inactive implant dummies have been implanted in rat eyes to
test biocompatibility and to evaluate surgical techniques.
[33] The abbreviations stand for the French names of the institutions.
1.3.4 Alternative approaches
A Japanese research group is currently working on a hybrid retinal implant (Yagi
et al., 1999; Ito et al., 1999). In this model, neural cells are cultured over a
microelectrode array that is attached to a photoelectric device. This approach rests
on the supposition that axons can grow from cultured cells and establish functional
connections with the central nervous system. It seems, however, very difficult to give
a precise direction to axon growth. The latest results presented by a group at
Stanford University seem to offer an alternative solution to this problem. A
photovoltaic electrode array is superimposed on a membrane consisting of a set of low
cups with thin microtubes. In this case nerve cell processes can be effectively
directed to grow in these precise patterns (Leng et al., 2002; Huie et al., 2002), and
even individual electrode-neuron connections might be achieved (Wu et al., 2003),
thus minimizing stimulation thresholds and providing more specific stimulation. This
would result in higher resolution stimulation with lower power consumption in future
devices (Mehenti et al., 2003; Huie et al., 2003; Huie et al., 2004).
Another very recent approach using neurotransmitters to stimulate the surviving
retinal cells has been presented (Fishman et al., 2002; Iezzi et al., 2003; Safadi et
al., 2003; Fishman et al., 2004). In such a retinal prosthesis the electrode array is
replaced by a set of microfluidic pumps that deliver caged neurotransmitters
(glutamate or GABA), with a resolution down to 5 µm. Flexible polymer devices,
measuring 1.5x1.5 mm, have been successfully implanted into the subretinal space
of the rabbit (Fishman et al., 2003) and a technique for evaluating the efficacy of this
type of stimulation, compared to natural visual stimulation, has been presented lately
(Elfar et al., 2004). At first glance, neurochemical stimulation of retinal cells
appears to be more ‘physiologically natural’, yet there are important issues to solve
in this domain. The release of large amounts of glutamate, for example, is toxic to
neural cells (Walraven et al., 2002; Iezzi et al., 2002). Research has been undertaken
to determine the toxicological profiles of different neurotransmitter molecules (Kapi
et al., 2004; Iezzi et al., 2004) and to cope with this excitotoxic effect (Walraven et
al., 2003; Gasperini et al., 2003; Walraven et al., 2004).
1.3.5 Comparison of the different approaches
Certain design issues to be considered are more or less challenging depending on
the particular implantation site for the prosthesis. The particular problems and
advantages of each approach are condensed in table 1.
Cortical stimulation presents two main advantages. The first is that this approach
could benefit the largest number of blind patients, including those in whom retinal
and/or optic nerve stimulation is not possible. Second, the visual cortex is a
particularly robust implantation site. Protected by the skull, it constitutes a very
anatomically stable location compared to the eye [34]. On the other hand, several
significant disadvantages must be mentioned. The surgical complications at this level
would be the most serious. In addition, such a system would bypass all peripheral
visual processing. Since the visuotopic organization of the visual cortex is
non-conformal (Normann et al., 2001), the electronic treatment of the stimulus will
probably have to be extremely complex in order to evoke meaningful patterns at this
level.
[34] As long as there is no mechanical coupling between the electrode array and the cranium (originating
from the immunologic reaction of the implanted tissue), as already discussed in the section describing
cortical prostheses.
The only advantage of optic nerve stimulation compared to cortical stimulation is
that the consequences of surgical complications would be less important. However,
weighed against retinal stimulation, it does not present any fundamental advantage.
If blindness results from diseases where retinal ganglion cells are affected, the optic
nerve is also condemned to degeneration. Therefore, the patients that could be
treated with this approach, essentially those suffering from retinal degenerations,
would also be able to benefit from retinal implants. Furthermore, meaningful and
reproducible percepts appear to be very difficult to obtain due to the particular
visuotopy of the optic nerve.
Retinal stimulation presents the advantage that, compared to the others, the
consequences of surgical complications would be the least serious. Moreover, this
particular location would benefit from most of the natural peripheral processing of
the visual system. This is particularly significant in the case of subretinal stimulation,
where adequate retinotopic organization could be obtained without further
processing of stimulation patterns and where patients would be able to scan the
environment using normal eye movements. Furthermore, electrical stimulation at this
level could help restore ‘non-visual’ functions relying on luminous stimulation such as
the regulation of circadian rhythms [35]. On the other hand, both retinal approaches have the
drawback that at least the retinal ganglion cell layer and the optic nerve must
remain functional. Hence, only patients suffering from diseases such as RP or AMD
would benefit from these devices. In addition, epiretinal stimulation will face two
important issues related to the location of the implant (in front of the retina). First,
the implant will be subject to fast rotation eye movements (up to 700°/s), due to
which its attachment and stable positioning relative to the eye will be difficult.
Second, it is not clear yet if not only nearby ganglion cell bodies will be stimulated,
but also traveling axons of distant ganglion cells. This would greatly complicate the
issue of visuotopic stimulation. A computational model aiming to determine precisely
the neural target according to electrical stimulation parameters has been presented
(Greenberg et al., 1999). Subretinal stimulation faces a different constraint, already
mentioned in the previous section. Since light entering the eye does not provide
enough energy to achieve adequate retinal stimulation (Zrenner et al., 1997), an
external power source might be needed. Recently, a research group (Palanker et al.,
2003) has focused on solving this specific issue with novel designs of self-powered
subretinal chips.
[35] Recently, a new photopigment, melanopsin, has been identified as the main biological transducer
for the regulation of circadian rhythms (Provencio et al., 2000; Reppert & Weaver, 2002; Provencio et
al., 2002). This opsin is expressed in the dendrites of a special population of retinal ganglion cells,
most of which project directly to the SCN (Gooley et al., 2001). Therefore, direct electrical stimulation
of the inner retinal layers could have an effect on circadian regulation, provided that these particular
ganglion cells are preserved from degeneration and depending on their particular response to electric
stimulation.
Table 1. Advantages and disadvantages of the different approaches towards visual prosthesis
development according to implantation site.
Criterion | Cortical | Optic Nerve | Epiretinal | Subretinal
Population of blind patients concerned | largest | retinal degenerations (RP, AMD) | retinal degenerations (RP, AMD) | retinal degenerations (RP, AMD)
Surgical complications | serious | less serious | least serious | least serious
Attachment (mechanical stability) | easy | easy | difficult | easy
Visuotopic stimulation | difficult | difficult | easy | easy
Preserves peripheral visual processing | NO | NO | YES | YES
Follows eye movements | NO | NO | NO | YES
Fully implantable | NO | NO | NO | YES
Processing flexibility | YES | YES | YES | possible
Large scale integration | difficult | difficult | difficult | easy
In conclusion, retinal implants seem to be the most elegant and promising way to
approach artificial vision. They could profit from natural processing in the still intact
peripheral structures of the visual system. Surgery is less invasive than for other
stimulation sites, which is an important clinical advantage.
It is important to point out that visual prostheses are not the only alternatives
proposed to treat blindness nowadays. Scientific progress has also led to the
exploration of novel genetic and cellular therapies (see e.g. McFarland et al., 2004;
Leal et al., 2005). We acknowledge such approaches, but they will not be considered
since they are clearly out of the scope of this thesis. Within the current dissertation,
subretinal implants are considered as the most probable model of visual prostheses,
but most of the results presented will be beneficial to all artificial vision projects.
1.4 Minimum requirements for useful artificial vision
Despite the enormous progress that has been made in the field, there are several
fundamental questions that should still be systematically addressed. The status of
research described in the previous section clearly shows that most efforts have
concentrated on developing technological solutions for visual prostheses. In
particular, psychophysical aspects related to artificial vision seem to have received
little attention. Yet, the ultimate goal of these devices, rehabilitating blind patients,
should not be forgotten. The minimum information that should be transmitted to the
brain in order to restore ‘useful’ visual function should be, therefore, thoroughly
investigated.
Figure 17. Information path of a visual prosthesis, from the stimulus to the brain.
Senses function in the same way any information system does: they capture
physical data, convert it into biochemical and electrical signals, process and transmit
the information to higher integration centers, to finally use the input data. For
replacement within the system, one must determine where it has failed and whether
a stage of the system can be artificially bypassed. As described in previous sections,
most of the visual prostheses being developed nowadays use electrical stimulation as
the means of interaction with the visual system. All designs work in the same way.
First, a sensor captures the stimulus, transforming the visual input into electrical
signals. Afterwards, a processor transforms this ‘electrical image’ into patterns that
should be correctly interpreted by the nervous system. A stimulator then transmits
the information to the target neural tissue through a microelectrode array. Finally,
the brain attempts to make sense of it all.
How can we determine the minimum requirements for such a system to work?
The brain will be capable of achieving useful function only if it is provided with
sufficient information (see fig. 17). The first major site where information might be
lost is the processor/stimulator interface. At this level visual information is split into a
finite number of processing channels corresponding to the number of electrodes
available in the implant. Therefore, enough information should be transmitted via
this first processing stage in order to achieve useful visual function. The second
major site of possible information loss is the electrode-nerve interface. The detailed
characteristics of neural activation at this boundary are largely unknown at present,
and will depend on the exact nature and site of activation.
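The bottleneck at the processor/stimulator interface can be illustrated with a crude upper bound on the information carried by one stimulation frame. The sketch below assumes that every electrode is an independent channel able to convey a fixed number of distinguishable levels; both the independence assumption and the example values are illustrative choices introduced here, not measurements, and real devices are expected to fall well below this bound.

```python
import math


def max_bits_per_frame(n_channels: int, n_levels: int) -> float:
    """Upper bound on bits per frame for independent channels.

    Assumes each channel can take one of `n_levels` distinguishable
    values, independently of all other channels.
    """
    if n_channels < 0 or n_levels < 1:
        raise ValueError("need n_channels >= 0 and n_levels >= 1")
    return n_channels * math.log2(n_levels)


# Hypothetical examples: a 4 x 4 array with on/off stimulation,
# and a 25 x 25 array with 6 distinguishable levels per channel.
small_array_bits = max_bits_per_frame(4 * 4, 2)
large_array_bits = max_bits_per_frame(25 * 25, 6)
print(small_array_bits, round(large_array_bits, 1))
```

The comparison is only meant to show how strongly the channel count and the number of usable levels constrain what can reach the brain, before any loss at the electrode-nerve interface is considered.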
The handicap of low vision patients in everyday life is extremely severe (Weih et
al., 2000). It can result in problems with small object recognition, specifically
reading, and with spatial orientation, including whole-body mobility and visuomotor
coordination. Difficulties with reading are mainly associated with disorders of the
central visual field, whereas difficulties with mobility and visuomotor coordination
may also result from defects in the peripheral visual field. Research in all these areas,
both in normal subjects and in low vision patients, is extensive. These particular
studies will be detailed in the introduction of the corresponding chapters. The
present section will focus on research specifically addressing issues related to
prosthetic vision.
There are very few studies related to reading using a visual prosthesis. Cha and
coworkers used a pixelized vision system to simulate artificial vision in normal
subjects (Cha et al., 1992b). Their head-mounted experimental setup consisted of a
video camera sending images to a monochrome monitor that projected to the
subject’s right eye (maximum viewing angle of 1.7°). Pixelization was achieved by
overlaying the monitor with opaque masks containing a variable number of square
perforations (pixels). Their results show that a 25 x 25 array (625 pixels),
representing four letters of text and projected on a foveal visual field of 1.7°, is
sufficient to provide reading rates near 170 words/min (WPM) using scrolled text [36],
and near 100 WPM using fixed text [37]. Another research group measured reading
speeds and facial recognition rates with simulated prosthetic vision in the central
visual field using a head mounted video display (Dagnelie et al., 2000; Thompson et
al., 2000; Humayun, 2001). Subjects used eye movements to scan the stimuli
through a pixelizing grid. Several grid parameters were explored. Subjects achieved
reading speeds up to 100 WPM and close to perfect face recognition. In reading,
performance decreased significantly when the grid size covered less than 4 letters,
when a grid density of less than 4 pixels per letter width was used, when contrast
was less than 10%, or when more than 50% of the pixels were randomly turned off.
The limits for facial recognition were when the face was 3 times wider than the grid,
when grid density was less than 8 pixels per face width, when contrast was less than
20%, when less than 4 gray levels were used, and with more than 50% random pixel
drop-out. More recently, another study on facial recognition has been presented by
the same group (Thompson et al., 2003). Simulation methods were similar to their
first studies, and again, the effect of different grid parameters on performance was
evaluated at high (99%) and low (12.5%) contrast conditions. In both conditions,
the threshold parameters allowing good performance were: a minimum grid size of
25 x 25, a maximum random pixel dropout of 30%, a minimum of 6 gray levels, and
a maximum pixel spacing of 4.5 minutes of arc. The optimum dot (pixel)
size ranged from 0.5° to 1°. Face recognition accuracy was higher in high contrast
conditions. A learning effect, particularly evident in low-contrast conditions, was also
revealed.
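The grid parameters explored in these simulations (grid size, number of gray levels, random pixel drop-out) can be made concrete with a minimal pixelization sketch. It assumes block averaging followed by uniform quantization, with dropped pixels set to black. This is one plausible way to implement such a simulation and is not the procedure used in the cited studies (Cha et al. used perforated masks; the other groups used their own pixelizing software). All names are hypothetical.

```python
import random


def pixelize(image, grid, levels, dropout=0.0, seed=0):
    """Reduce a grayscale image to grid x grid pixels.

    image   -- list of rows, each a list of floats in [0, 1]
    grid    -- number of output pixels per side
    levels  -- number of gray levels (>= 2)
    dropout -- probability that an output pixel is turned off (set to 0)
    """
    rows, cols = len(image), len(image[0])
    if grid < 1 or grid > rows or grid > cols or levels < 2:
        raise ValueError("invalid grid or levels")
    rng = random.Random(seed)  # seeded for reproducible drop-out patterns
    out = []
    for gy in range(grid):
        y0, y1 = gy * rows // grid, (gy + 1) * rows // grid
        out_row = []
        for gx in range(grid):
            x0, x1 = gx * cols // grid, (gx + 1) * cols // grid
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            mean = sum(block) / len(block)
            # Quantize the block mean to equally spaced gray levels.
            value = round(mean * (levels - 1)) / (levels - 1)
            if rng.random() < dropout:
                value = 0.0
            out_row.append(value)
        out.append(out_row)
    return out


# Angular pixel pitch for a 25 x 25 grid spanning 1.7 degrees.
pitch_arcmin = 1.7 * 60 / 25

# Tiny example: left half black, right half white, reduced to 2 x 2.
example = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
reduced = pixelize(example, grid=2, levels=2)
blanked = pixelize(example, grid=2, levels=2, dropout=1.0)
```

For a 25 x 25 grid covering 1.7°, the pitch works out to about 4.1 minutes of arc per pixel, which gives a sense of the angular scale involved in the reading experiments of Cha et al. (1992b).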
Cha et al. (1992a) were the only ones to directly address whole-body mobility
under conditions simulating artificial vision. Normal human volunteers had to walk
through a maze including a series of obstacles, while their visual input was restricted
by a pixelized vision simulator similar to the one used for their reading experiments
(Cha et al., 1992b). Walking speed and number of obstacle contacts were measured
as a function of pixelization, object reduction and field of view. Performance
improved with large head movements, which however led to balance problems
associated with an abnormal vestibulo-ocular reflex. Their results suggested that,
similar to reading performance, an array of 25 x 25 pixels, projected on a foveal
visual field of 1.7° but encompassing a field of view of about 30°, could provide
useful mobility performance in environments not requiring a high degree of pattern
recognition.
[36] The line of text scrolled automatically across a 10-character-wide horizontal window; therefore, no
eye movements were needed for reading.
[37] The text was static and subjects used their eye movements for navigation.
Only a few qualitative experiments have been carried out to explore visuomotor
coordination tasks in conditions mimicking artificial vision (Humayun, 2001; Hayes et
al., 2003). A head mounted video display and pixelizing software were used for their
simulations. Almost all subjects were able to pour candies from one cup to another
using a grid of 16 x 16 pixels and about 50% of the subjects were able to cut a sheet
of paper under the same conditions. Similar to the psychophysical experiments on
reading and mobility mentioned previously (Cha et al., 1992a; Cha et al., 1992b;
Dagnelie et al., 2000; Thompson et al., 2000; Thompson et al., 2003), all
experiments were carried out using central vision.
In summary, the information contained in 625 pixels appears to be sufficient to
reach close to normal reading performance and useful mobility. However, these
experiments were conducted using oversimplified experimental conditions. Neither
visuomotor coordination, nor mobility in large-scale environments simulating natural
settings, were really studied in conditions relevant for artificial vision. None of the
studies presented really mimicked artificial vision as provided by retinal
implants placed at well-defined and fixed retinal positions38. Furthermore, possible
eccentric locations for the implant were not explored in any of the previously
mentioned studies. This is particularly relevant in the context of retinal prostheses,
since the anatomo-physiology of the retina does not favor a foveal implant location.
Retinal prostheses are meant to treat cases involving photoreceptor loss (e.g. RP). In
these cases, the target neurons for electrical stimulation are surviving cells at the
inner retinal layers: bipolar and ganglion cells. These neurons are not present in the
central retina. In the parafovea, these inner retinal cells are arranged in several
superimposed layers that make it difficult to activate them in predictable patterns.
The best sites for retinotopic activation without major distortion would therefore be
located at an eccentricity of 10° and more. This mapping issue is especially important
when considering retinal implants transforming incident light into stimulation currents
in situ (Zrenner, 2002a; Chow et al., 2003). When an external camera is used to
capture the stimuli, as done in other retinal prosthesis prototypes (Rizzo & Wyatt,
1997; Humayun et al., 2003), an image processing module including remapping
routines adapted to the position of the retinal stimulator can be incorporated. In this
case, a more central location of the implant could be envisioned, but only at the
cost of very sophisticated hardware and/or software.
1.5 Scope of this thesis
It is clear that more realistic simulations of artificial vision had to be done in order
to determine adequately the minimum requirements for useful artificial vision. The
work presented here intends to be a methodical assessment of the minimum
requirements to obtain useful artificial vision, concentrating essentially on the first
site of information loss: the processing stage. Performance on a set of tasks was
measured while systematically varying the amount and type of information
transmitted through the processing stage. Then, we made the ‘ideal’ assumption that
38
The impact of this issue on performance and, more particularly, on the rehabilitation expectations
of visual prosthesis wearers has lately been acknowledged by one of the research groups mentioned
above (Dagnelie et al., 2004; Kelley et al., 2004).
no information was lost at the electrode-nerve interface. The minimum visual
information critical to perform a specific function was identified through the analysis
of the experimental results. The supposition that the brain can use all the
information transmitted via the processing stage while evaluating the needs for basic
vision allowed us, therefore, to determine the minimum requirements for useful
artificial vision. This approach has already been used to study speech perception
in cochlear implant users (Shannon et al., 1995; Dorman & Loizou, 1997; de Balthasar
et al., 1999; Loizou et al., 1999).
First, the general methods, common to all experiments, will be detailed. In the
following chapters, the specific issues related to each basic function will be outlined
and the corresponding results will be presented. In order to complete the global
assessment of minimum visual requirements for useful artificial vision, the last
chapter will present an evaluation of the effects on performance of more realistic
simulations considering some characteristics of the electrode-nerve interface.
At this point, it is important to detail my specific contribution to this investigation.
I joined the visual psychophysics research group at the Ophthalmology clinic of the
HUG in April 2001. At that date, the studies aiming to determine the minimum
requirements for useful reading were already under way. More specifically, some
pilot experiments on the reading task had already been completed (reading isolated
letters and 4-letter words; Bagnoud et al., 2001; Sommerhalder et al., 2003).
Therefore, my actual participation started with the experiments exploring full-page
reading (section 3.5). The pilot experiments with 4-letter words (section 3.4) are,
however, also included in this dissertation for the sake of completeness.
1.5.1 Significance
The goal of this project is to determine minimum requirements to achieve useful
artificial vision. What can we consider ‘useful’ vision? The research effort
presented in this dissertation obviously relies on this definition. We believe that, for a
retinal prosthesis to be useful to future implant wearers, it has to satisfy their major
rehabilitation expectations. Based on our own clinical experience, and after extensive
discussions with several associations for the blind in Switzerland, we observed that
what blind patients mainly expect from these devices is to recover some kind of
reading abilities. Conversely, they generally believe that they already manage quite
well in other everyday tasks.
The knowledge of the minimum information that has to be transmitted to the
brain to restore useful function is essential, theoretically and practically, to design
visual prostheses. Some authors may argue that psychophysical studies will have
only limited value until actual devices are implanted and tested (Margalit et al.,
2002). The history of cochlear implant development demonstrates that this is not
true, and clearly illustrates the importance of modeling studies. The first cochlear
implants developed in the 60s were single-channel devices that provided modest
rehabilitation to the deaf (Doyle et al., 1963). Some years later, on the basis of
modeling studies, Kiang and coworkers demonstrated that multi-channel stimulation
was required for high-level speech recognition (Kiang et al., 1979). Multi-channel
cochlear implants became commercially available at about the same time. However,
single-channel cochlear implants remained in common use for more than 15 years,
until it became clear that their clinical results could not match those obtained with
multi-channel implants. Simulations of artificial hearing on normal subjects (Shannon
et al., 1995; Dorman & Loizou, 1997; Loizou et al., 1999; Hamzavi et al., 2000)
clearly demonstrated the fundamental reasons underlying these differences in
performance, but came too late to prevent the unnecessarily extended use of
single-channel implants. The research effort presented in this dissertation is an attempt to
learn from past history and to provide this type of information to the artificial vision
research community at an adequate time, with the hope of preventing large-scale
use of prototype retinal implants with insufficient numbers of stimulation contacts. In
the particular case of visual prostheses, suppose for example, that one can
demonstrate that the perception of about N spatially distinct phosphenes, distributed
in a certain way throughout the visual field, is necessary to code the information
needed to perform a given visual function. The value of systems that do not provide
this capability will be limited. It appears therefore imperative to have such
knowledge before proceeding to human implantation trials.
The experimental approach that will be presented here is designed to mimic
visual perceptions provided by retinal implants. While these prostheses represent one
of the most elegant and promising designs to approach artificial vision, the results
from the proposed studies will be of general interest to all research groups working
on artificial vision since they focus on the minimum visual information necessary to
achieve a particular task (which is fundamentally limited by the task itself, not by the
means of stimulation). Furthermore, this investigation focuses on conditions
mimicking problems encountered in every-day life. Using simulations of artificial
vision on normal subjects is meaningful for addressing pertinent questions. On one
hand, blind patients using visual prostheses are not really available at this time. On
the other hand, by using normal observers with a simulated impairment one can look
at the effect of a single parameter without biasing its effect with that of others. The
use of simulations makes it easy to repeat experiments within a single subject and a
given parameter can be varied over a full range. These advantages of simulations
have been recognized by others and used to address specific questions (Pelli, 1987;
Cornelissen & Van den Dobbelsteen, 1999).
The results presented in this dissertation will provide essential indications for the
design of future visual prostheses and will help judge the level of visual rehabilitation
that could be provided with such devices. Hopefully, this work will benefit academic
institutions, hospitals and industry in Switzerland, and particularly blind patients all
over the world.
2 General Methods
Too much sanity may be madness. And maddest of all, to see life as it
is and not as it should be!
Miguel de Cervantes Saavedra (1547 - 1616)
2.1 Basic principles of the simulation methodologies
Our simulations attempt to mimic percepts elicited by a subretinal implant
transforming incident light into stimulation currents in situ:
• Retinal prostheses will be implanted at a fixed and most probably eccentric
location of the retina. They will not cover the whole surface of the retina. Visual
perception will therefore be restricted to this part of the visual field.
• The information conveyed in the ‘images’ perceived by implant wearers will be of
reduced (low) resolution, due to the limited number of stimulation contacts.
Figure 18. Illustration of the simulation procedure. Due to the limitations imposed by a subretinal
implant, the environment will only be visible through a small restricted viewing window (equivalent to
implant size), stabilized at a particular region of the visual field (implant location), and of reduced
resolution (due to the limited contacts in the implant).
For simulation purposes, this means that the environment should only be visible
through a low resolution, small, and fixed-size viewing window that will always
appear at the same position relative to the subject’s center of fixation, thus following
his/her eye movements. This process is illustrated in figure 18.
This chapter will describe in detail how these simulations were achieved. First, the
image processing techniques and algorithms used will be explained. Later on, the
experimental setup will be described. The particular issues related to the different
tasks will be presented in the specific methods section of the corresponding chapter.
2.2 Image processing
All the images used as stimuli were 8-bit bitmap (BMP) grayscale images.
Information content reduction was performed through basic image processing techniques
that decomposed the original image into a given number of pixels (pixelization). The
pixelization methods will be described hereafter. The particular algorithm used in
each experiment will be specified in the corresponding chapter. The stimulus pixel
geometry depended on the algorithm used: square or gaussian pixelization. Either of
these pixelization methods could be presented to the subject in their off-line or
real-time versions.
Figure 19. Pixelization algorithms used: a) square pixelization using a block-averaging
algorithm; b) gaussian pixelization using a 2D gaussian function.
2.2.1 Square pixelization
Square pixelization was performed using a simple block-averaging algorithm. This
technique consists in merging N x N pixel arrays of the original image into single
pixels with uniform luminance values corresponding to the mean grayscale levels of
the original N x N matrices (see fig. 19a), as defined in equation 1:
(1) $$A(\mu_x, \mu_y) = \frac{1}{N^2} \sum_{j=\mu_y - N/2}^{\mu_y + N/2} \; \sum_{i=\mu_x - N/2}^{\mu_x + N/2} i(x_i, y_j)$$
where A(µx,µy) denotes the luminance value (grayscale level) of the stimulus pixel
with center coordinates µx and µy. i(xi,yj) represents the luminance value of the
corresponding pixels in the original image. N stands for the number of vertical and
horizontal pixels in the original image that are merged together in the pixelized
image. Figure 20 displays sample stimuli processed at different pixelization levels (N
values).
Figure 20. Text windows processed using square pixelization containing: a) 28000 pixels (N = 1); b)
572 pixels (N = 7); and c) 231 pixels (N= 11).
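As an illustration, the block-averaging step of equation 1 can be sketched in a few lines of Python. This is a minimal sketch and not the code used for the experiments (off-line square pixelization relied on a commercial mosaic filter, and the remaining programs were written in C++). The function name `block_average`, the list-of-rows image representation, the rounding of the mean, and the handling of incomplete edge blocks are all assumptions made for this example.

```python
def block_average(image, n):
    """Square pixelization: replace each n x n block by its mean gray level.

    `image` is a list of rows of 8-bit gray levels (0-255). Blocks at the
    right and bottom edges may be smaller than n x n (an assumption; the
    text does not say how image edges were handled).
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for top in range(0, height, n):
        for left in range(0, width, n):
            rows = range(top, min(top + n, height))
            cols = range(left, min(left + n, width))
            block = [image[r][c] for r in rows for c in cols]
            mean = round(sum(block) / len(block))  # uniform luminance of the block
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out


# Hypothetical 4 x 4 test image, pixelized with N = 2.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [10, 30, 100, 100],
    [50, 70, 100, 100],
]
pixelized = block_average(image, 2)
```

With N = 1 the image is returned unchanged, which corresponds to the unprocessed condition of figure 20a.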
2.2.2 Gaussian pixelization
Figure 21. Gaussian pixelization. A 2D gaussian function is applied to each pixel. Block averaging was
used to determine the peak of the gaussian function. σ represents the standard deviation used in the
gaussian function; µx and µy are the center coordinates of the stimulus pixel to which the function is
applied.
Gaussian pixelization consisted in applying a 2D gaussian function to each pixel of
the pixelized image (see fig. 19b), as determined in equation 2:
(2) $$I(x, y) = A(\mu_x, \mu_y) \cdot G(x, y)$$
where I(x,y) denotes the luminance value (grayscale level) at coordinate (x,y) in the
pixelized image. A(µx,µy) stands for the mean grayscale level of the original N x N
matrix constituting the stimulus image pixel with center coordinates µx and µy (see
square pixelization). G(x,y) symbolizes the 2D gaussian function defined in equation
3:
(3) $$G(x, y) = \frac{1}{2\pi\sigma^2} \, e^{-\frac{(x - \mu_x)^2 + (y - \mu_y)^2}{2\sigma^2}}$$
where σ represents the standard deviation of the particular gaussian function around
its horizontal and vertical center coordinates µx and µy. In our case, σ (the gaussian
width) determines the amount of overlap of each pixel onto its neighbors and the
center coordinates for each pixel correspond to its horizontal and vertical means (see
fig. 21). Figure 22 shows gaussian pixelized stimuli for several gaussian widths σ.
Figure 22. Pixelization with various gaussian widths σ (pixel overlapping). Gaussian pixelizations
with: (a) σ = 0.071 pixels (no overlap), (b) σ = 0.286 pixels (medium overlap), and (c) σ = 1.143
pixels (large overlap).
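The gaussian pixelization of equations 2 and 3 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the program used in the experiments: the function name `gaussian_pixelize`, the `pitch` parameter (spacing between stimulus pixel centers, in output pixels), the summation of overlapping contributions from neighboring stimulus pixels, and the absence of any rescaling to the display range are all hypothetical choices.

```python
import math


def gaussian_pixelize(levels, pitch, sigma):
    """Render each stimulus pixel as a 2D gaussian blob (equations 2 and 3).

    levels: 2D list of block-averaged gray levels A(mu_x, mu_y).
    pitch:  spacing between stimulus pixel centers, in output pixels (assumed).
    sigma:  standard deviation of the gaussian, in output pixels.
    Overlapping blobs are summed, which is an assumption of this sketch.
    """
    rows, cols = len(levels), len(levels[0])
    height, width = rows * pitch, cols * pitch
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for r in range(rows):
                for c in range(cols):
                    mu_x = c * pitch + (pitch - 1) / 2.0  # blob center, horizontal
                    mu_y = r * pitch + (pitch - 1) / 2.0  # blob center, vertical
                    d2 = (x - mu_x) ** 2 + (y - mu_y) ** 2
                    total += levels[r][c] * norm * math.exp(-d2 / (2.0 * sigma ** 2))
            out[y][x] = total
    return out


# A single stimulus pixel of gray level 100 rendered on a 5 x 5 patch.
blob = gaussian_pixelize([[100]], pitch=5, sigma=1.0)
```

Increasing `sigma` relative to `pitch` spreads each blob further onto its neighbors, which is the overlap effect illustrated in figure 22.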
2.2.3 Off-line/Real-time pixelization
Image stimuli could be processed either off-line or online (real-time pixelization).
In the first case, the original image was pixelized in the experiment preparation
phase. Then, during the experiments, this pre-pixelized image was presented on the
screen and masked by a gray overlay. The image could be explored through a
transparent window that moved according to the subject’s gaze position. Hence, the
luminosity of all pixels in the stimulus image was fixed, and the subject scanned
portions of this ‘frozen’ image. For real-time pixelization, only a small part of the
original image was pixelized online, during the experiment. The part of the image
that was pixelized corresponded to the content of the viewing window, which
changed as the viewing window moved across the stimulation screen. Therefore, the
luminosity level of each pixel in the stimulus image changed according to
instantaneous gaze position.
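The difference between the two modes can be illustrated on a single image row. The sketch below is a deliberately simplified, hypothetical example: the function names, the use of the window's left edge as the gaze position, and the one-dimensional block averaging are assumptions made for brevity and do not reproduce the experiment software.

```python
def block_means(values, n):
    """Replace each run of n samples by its (rounded) mean."""
    out = []
    for start in range(0, len(values), n):
        chunk = values[start:start + n]
        out.extend([round(sum(chunk) / len(chunk))] * len(chunk))
    return out


def offline_view(prepixelized, gaze, width):
    """Off-line mode: the window slides over an already pixelized image."""
    return prepixelized[gaze:gaze + width]


def realtime_view(original, gaze, width, n):
    """Real-time mode: only the window content is pixelized, at each gaze sample."""
    return block_means(original[gaze:gaze + width], n)


row = [0, 0, 100, 100, 200, 200, 0, 0]  # one row of a hypothetical image
frozen = block_means(row, 2)             # computed once, before the experiment

offline_a = offline_view(frozen, 0, 4)
offline_b = offline_view(frozen, 1, 4)   # same 'frozen' values, shifted window
realtime_a = realtime_view(row, 0, 4, 2)
realtime_b = realtime_view(row, 1, 4, 2)  # values recomputed for the new gaze
```

In this toy case both modes agree when the window is aligned with the pixelization grid, whereas a one-sample gaze shift leaves the off-line values untouched and changes the real-time ones.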
2.3 Experimental setup
The experimental setup was based on a commercial high-speed video based gaze
tracking system, the SMI EyeLink I (SensoMotoric Instruments GmbH, Teltow/Berlin,
Germany). The specifications of this system, as provided by the manufacturer, are
presented in appendix A. Briefly, this system consists of a lightweight headband and
2 personal computers: the operator PC and the subject PC.
A schematic view of the EyeLink headband is presented in figure 23a. It is fixed
by adjusting two clamps (top and rear). Two miniature infrared (IR) cameras track
both eyes simultaneously at a 250Hz frame rate. Binocular or monocular eye tracking
is possible since each eye is monitored independently. A third camera that tracks 4
IR markers attached to the stimulation screen is used for head movement
compensation. A picture of a subject wearing the headband is shown in figure 23b.
Figure 23. The EyeLink headband. a) Schematic view of the helmet highlighting its main
components. Image downloaded from the EyeLink product website39 (SR Research©). b) Photograph
of one of the subjects wearing the headband.
A dedicated image-processing card (full-length ISA card) is installed in the
operator PC. This card uses the information transmitted by the three headband
cameras for online gaze position computation. The pupil is detected as the darkest
surface in the eye. Its center is tracked and calculated using a centroid calculation
algorithm (Van der Geest & Frens, 2002; Cornelissen et al., 2002) with a theoretical
noise-limited resolution of 0.01° and less than 3°/s velocity noise (manufacturer’s
specifications). Heuristic filtering is available for single-sample artifact removal
(Stampe, 1993). Gaze position coordinates on the stimulation screen are obtained by
combining eye and head position measurements (gaze-dependent display). The
system’s 9-point calibration procedure has been described by Stampe (1993).
Mapping of eye coordinates into a head-referenced coordinate system is performed
with a quadratic (monocular eye-tracking mode) or biquadratic (binocular tracking
mode) function (Stampe, 1993; Cornelissen et al., 2002). A MS-DOS program
(EyeLink Operator PC Software version 2.01) running on the operator PC
communicates with the ISA card. Eye position data may be recorded to a file and/or
transmitted in real-time to the subject PC via an Ethernet link. If required, data may
also be output as analog voltage signals.
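The principle of locating the pupil as the darkest region of the eye image and taking its centroid can be illustrated with a toy example. This is only a sketch of the general idea and should not be read as the algorithm implemented on the EyeLink image-processing card; the function name `pupil_centroid`, the fixed gray-level `threshold`, and the tiny test frame are hypothetical.

```python
def pupil_centroid(frame, threshold):
    """Center of mass of all pixels darker than `threshold`.

    Returns (x, y) in pixel coordinates, or None if no pixel qualifies.
    """
    count = sum_x = sum_y = 0
    for y, row in enumerate(frame):
        for x, level in enumerate(row):
            if level < threshold:  # dark pixel, assumed to belong to the pupil
                count += 1
                sum_x += x
                sum_y += y
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


# Hypothetical camera frame: a dark 2 x 2 'pupil' on a bright background.
frame = [
    [200, 200, 200, 200, 200],
    [200, 20, 20, 200, 200],
    [200, 20, 20, 200, 200],
    [200, 200, 200, 200, 200],
]
center = pupil_centroid(frame, 50)
```

Because the centroid averages over many pixels, its resolution can be finer than one camera pixel, which is consistent with the sub-degree resolution quoted by the manufacturer.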
The subject PC is used for experiment flow and control and is directly connected
to the stimulation screen. This computer runs the main experiment program,
executing all stimuli capture, processing, and display algorithms. Hence, the
computing power needed for this piece of equipment varies according to the
complexity of the experiment. According to specifications, gaze position data
transferred via Ethernet from the operator PC arrives with a delay (after the actual
physical movement) of 6 ms when heuristic filtering is disabled and of 10 ms when
heuristic filtering is enabled. Interfacing with the eye tracking system is achieved
through the EyeLink Windows API library (version 24.06.98), provided by the
manufacturer.
Figure 24 displays a schematic view of the typical EyeLink I system configuration.
This system was chosen because its high frame rate (250Hz) allowed for the
detection of very rapid eye movements (i.e. reflexive saccades). In a recent study by
39
http://www.eyelinkinfo.com/mount_system_config.php
Figure 24. Typical configuration of the SMI EyeLink I system.
Van der Geest & Frens (2002), the EyeLink system (video-based) was compared to
another eye movement measurement technique: scleral search coils. Results
demonstrated that the outputs of both systems were highly correlated, and that the
only disadvantage of video-oculography was that its relatively low sample rate led to
noisy estimates of small eye movements. Other research groups have already
reported using the EyeLink I system in different experimental paradigms (Tant et al.,
2002; Huk et al., 2002; Li et al., 2002).
In our simulations, the subjects used their own eye movements to scan the
stimulation screen, which was visible only through a restricted viewing window of a
determined size (fig. 25).
Gaze position (calculated by the eye tracking system) was used to move the target
stimuli (viewing window containing the processed images) on the stimulation screen.
As a result, images could be steadily projected onto a defined (central or eccentric)
area of the retina. A pilot study conducted in our laboratory demonstrated that this
experimental setup was adequate for accurately stabilizing targets in the visual field
by online compensation of the gaze position (Bagnoud et al., 2001). The remaining area
of the screen was filled with a gray value corresponding to approximately the mean
illumination of the original image.
Figure 25. The stimulation screen as viewed by the subject. The environment was only
visible through a defined viewing window moving on the screen according to the
direction of gaze.
Depending on the experiment, the viewing window could be stabilized either in
central vision or at determined eccentricities40. Eccentric stimuli were always
presented in the lower visual field since it has been demonstrated that human
subjects generally perform tasks better when stimuli are presented in this area
(Previc, 1990; Chen et al., 2004). Furthermore, low vision patients with central visual
field defects usually tend to place the field defect above the new area used for
fixation (Fletcher & Schuchard, 1997). In addition, for the reading task this choice
offered an additional practical advantage: the retinal eccentricity of the target varies
less when it is projected to the lower or upper visual field, than when projected to
the left or the right visual field.
Off-line square pixelization was carried out with the mosaic-pixelizing filter
included in Adobe Photoshop® 5.5 (Adobe Systems Incorporated, San Jose, CA,
USA). All the remaining algorithms and experiment programs were implemented in
Microsoft Visual C++ 6.0 SP5 (Microsoft, Redmond, WA, USA) with the latest Platform
SDK library (Windows API, GDI, Direct X) available at the time the experiment
began.
The eye cameras were positioned so that the pupil was clearly visible and well defined
at any gaze position. At the beginning of each experimental session a standard 9-point
calibration of the eye-tracker was performed. The calibration was checked regularly for
possible drifts or artefactual movements, and if necessary, slightly corrected to
ensure an exact control of the viewing window position during each experimental
session. For the experiments, subjects wore the head mounted SMI eye tracking system in
one of two configurations: a stationary or a mobile system.
Figure 26. The stationary system (typical configuration of the SMI EyeLink I system).
In the picture shown, a subject is wearing the eye tracking system during one of the
Reading experiments (Chapter 3).
40
The point of reference for the stimulus eccentricity was the center of the viewing window.
2.3.1 Stationary system
The stationary system was used to explore the reading task. In this setup, the
SMI EyeLink I eye tracker was used in its standard configuration (see fig. 24). The
operator PC was a Compaq Deskpro EP (Celeron-400). The subjects were
comfortably seated facing the stimulation screen, a 22” high refresh rate monitor
(Elsa ECOMO 22H99; see fig. 26) connected to the subject PC (P3-450 equipped with
a Matrox G200 graphics card). Subjects were requested not to move during the
experimental sessions. The stimulation screen was set to a resolution of 800 x 600
pixels and a refresh rate of 120 Hz. Eye-to-screen distance was 57 cm41; at this
distance, the 40 cm x 30 cm surface of the screen subtends a visual field of 40° x
30°, 1° corresponding to 20 screen pixels (at the screen resolution chosen).
2.3.2 Mobile system
Since it is impossible to simulate mobility and visuomotor coordination tasks with
our stationary setup (while sitting in front of a computer screen), a similar, but
mobile setup based on the same EyeLink system was developed (fig. 27).
Figure 27. The mobile system. The stationary setup was modified in order to allow for mobility. A
rapid LCD display was fixed in front of the subject’s eyes and a webcam mounted: a) on the side (for
visuomotor coordination tasks) or b) on the top of the system (for mobility tasks) was used to capture
images from the environment.
A small and rapid LCD display (NEC NL6448BC26-01), measuring 170.09 mm x
128.2 mm, was fixed in front of the subject’s eyes, at an eye-to-screen distance of
23 cm. The LCD display subtended, thus, the same visual field of 40° x 30°. All the
electronics needed for the LCD display were attached as counterweight in the back of
the headband. A cover of black cloth prevented the subject from seeing anything
else than the screen, and a bite-bar guaranteed the necessary rigidity for reliable eye
tracking and stimulus stabilization. A small webcam (Philips ToUCam Pro) was used
to capture environment images at a 30 Hz frame rate and was connected directly to
the subject PC. The stimulation screen was set to a resolution of 640 x 480 pixels
(largest image frame size provided by the webcam) and a refresh rate of 75 Hz
41
Because the eye position was aligned with the table border, the examiner could easily check that
the eye-to-screen distance remained stable.
(maximum allowed by the LCD display). In this configuration, 1° of visual angle
corresponded, thus, to 16 screen pixels.
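The figures quoted for both setups (a 40° x 30° field, with 20 and 16 screen pixels per degree) can be checked with a short calculation. The sketch below uses the exact visual angle formula; the function names are hypothetical, and the inputs are the screen widths, viewing distances and horizontal resolutions stated in the text.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Angle subtended by an extent centered on the line of sight, in degrees."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


def pixels_per_degree(resolution_px, field_deg):
    """Screen pixels per degree, using the nominal field size."""
    return resolution_px / field_deg


# Stationary system: 40 cm wide screen viewed at 57 cm, 800 pixels across.
stationary_exact = visual_angle_deg(40.0, 57.0)
stationary_ppd = pixels_per_degree(800, 40.0)

# Mobile system: 170.09 mm wide LCD viewed at 23 cm, 640 pixels across.
mobile_exact = visual_angle_deg(17.009, 23.0)
mobile_ppd = pixels_per_degree(640, 40.0)
```

The exact horizontal angles come out slightly below 40° for the stationary screen and slightly above 40° for the LCD display, so the 40° x 30° field is best read as a rounded value based on the approximation that 1 cm subtends about 1° at 57 cm.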
The standard image capture angle of the webcam was 33° horizontally and
24.75° vertically. For visuomotor coordination experiments, the webcam was
mounted on the side of the system, at eye-height (fig. 27a). For mobility
experiments, a custom objective allowing a larger image capture angle (66° x 49.5°)
was adapted to the webcam, which was mounted on the top of the system (fig.
27b). Most of the material used for these modifications was aluminum so that the
system was kept as light as possible. A Dell Latitude C840 (P4-M, 2.2 GHz) notebook
equipped with a nVidia GeForce4 440 Go 64-bit graphics card, running Windows XP
SP1, was used as the subject PC in order to increase mobility and computing power.
Both the operator and the subject PCs were installed on a mobile rack and
connected with an approximately 10 m long cable to the headband-mounted items.
2.4 Data analysis and statistics
The majority of the data analysis methods used will be the same throughout this
document, irrespective of the task being explored. Possible learning effects were
explored by fitting the data to a non-linear exponential function (whenever the data
allowed it), in order to average out session-to-session variability. For example, learning
curves were established on the basis of performance evolution versus time. Curve
stabilization was determined on the basis of the time constant42 of the corresponding
exponential function used (3τ).
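To make the 3τ criterion concrete, consider a learning curve that approaches an asymptote exponentially. The functional form and the symbols $P_0$ (initial performance) and $P_\infty$ (asymptotic performance) are assumptions introduced for illustration; the exact parameterization of the fitted function is not specified here.

```latex
% Assumed learning curve: performance P as a function of time t
P(t) = P_\infty - (P_\infty - P_0)\, e^{-t/\tau}

% Remaining distance to the asymptote after three time constants
P_\infty - P(3\tau) = (P_\infty - P_0)\, e^{-3} \approx 0.05\,(P_\infty - P_0)
```

Under this assumption, about 95% of the total improvement has been achieved at $t = 3\tau$, which is why this point is a reasonable estimate of curve stabilization.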
Statistical effects on performance were determined using linear correlation
(Pearson’s correlation) or standard (paired) t tests with a significance level of 0.05.
2.4.1 Percentage scores
When analyzing performance data from the different tasks, results will be often
expressed as percentage (%) correct scores. Nonetheless, such proportional scales
are not adequate for statistical analysis since they follow a binomial probability
distribution, where variance depends on the mean. In other words, data is
not normally distributed around the mean and scale values are not linear in relation
to test variability. An adequate scale transformation must be, thus, applied to the
data in order to obtain statistically valid results. The arcsine or angular
transformation (AT), defined in equation 6 (Thornton & Raffin, 1978), is generally
known to be the most appropriate for proportional values as it spreads data along
both ends of the scale (0% and 100%) while it compresses the middle. In this
equation, X stands for the number of samples being positive or negative (proportion
or percentage score) and N designates the total number of samples (number of
samples equivalent to 100% in the case of percentage scores).
42
The time constant (τ) of an exponential function corresponds to the time required for the function
to decay to 1/e (approximately 0.368) of its initial value. The stabilization time of an exponential
function is generally estimated as 3τ.
(6) $$AT = \arcsin\sqrt{\frac{X}{N + 1}} + \arcsin\sqrt{\frac{X + 1}{N + 1}}$$
However, such a transformation has the significant drawback of generating values
that are very different from the original scale and are, hence, difficult to interpret
intuitively. For example, for a sample size of 100, the arcsine-transformed equivalent
of a 0% score is approximately 0.10, of 50% it is about 1.57, and of 100% it is around
3.04. This shortcoming can be overcome with the rationalized arcsine transform (RAT)
defined in equation 7 (Studebaker, 1985). The values generated by this transform, the
so-called “rationalized arcsine units” (RAU), have the advantage of being numerically
close to the original percentage range, while they retain all of the desirable
properties of the angular transform. For example, for a sample size of 50, a score of
0% corresponds to -16.5 RAU, 50% to 50 RAU and 100% to 116.5 RAU.
(7)
RAU = 46.47324337 AT − 23
All percentage results will be therefore transformed to RAU units prior to any
statistical analysis. For better clarity, however, an approximate %-correct scale will
always be shown on the right ordinates of the graphs and will also be used in the
text.
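The two transforms can be written as a short Python sketch. It reads equation 6 as the sum of two arcsines of square roots, a form that reproduces the numerical examples quoted above; the function names `arcsine_transform` and `rau` are hypothetical, and the code is an illustration rather than the analysis software used for this work.

```python
import math


def arcsine_transform(x, n):
    """Angular transform (equation 6) for x positive samples out of n, in radians."""
    return (math.asin(math.sqrt(x / (n + 1)))
            + math.asin(math.sqrt((x + 1) / (n + 1))))


def rau(x, n):
    """Rationalized arcsine units (equation 7)."""
    return 46.47324337 * arcsine_transform(x, n) - 23.0


# Sample size of 50, as in the example given in the text.
rau_0 = rau(0, 50)      # score of 0%
rau_50 = rau(25, 50)    # score of 50%
rau_100 = rau(50, 50)   # score of 100%
```

The scale is symmetric around 50 RAU and extends slightly below 0 and above 100 at the extremes, which is where the transform stretches the original percentage scale.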
2.5 Ethical considerations
All experiments were conducted according to the ethical recommendations of the
Declaration of Helsinki, and were approved by local ethical authorities43.
43
The «Comité d’Ethique de la Recherche sur l’Etre Humain» (CEREH) of the HUG.
3 Experiments on Reading
I took a speed-reading course and read ‘War and Peace’ in twenty
minutes. It involves Russia.
Woody Allen (1935- )
3.1 Foreword
Reading is an extremely important activity in our modern societies. It is strongly
associated with vision-related estimates of quality of life, and represents one of the
main goals of low vision patients seeking rehabilitation (Elliott et al., 1997; Wolffsohn
& Cochrane, 1998; McClure et al., 2000; Hazel et al., 2000; Margrain, 2000). The
thorough analysis of this task is, thus, fundamental for the evaluation of the
rehabilitation prospects of visual prostheses to blind patients.
3.2 Introduction
Reading is a complex task that requires the conjunction of several oculomotor,
cognitive, and visual processes (Reichle et al., 2003). Understanding its fundamentals
has received plenty of attention. The low vision research group at the University of
Minnesota has systematically studied various aspects of reading in normal subjects
and low vision patients. For normal subjects (Legge et al., 1985a), they reported that
maximum reading rates are achieved for characters subtending 0.3° to 2° of visual
angle; that reading rate increases with field size, but only up to 4 characters,
independently of character size; and that, when the text is pixelized, reading rates
increase with pixel density, but only up to a critical density that depends on character
size. Reading was also found to be very tolerant to either luminance or color contrast
reductions (Legge & Rubin, 1986; Legge et al., 1987; Legge et al., 1990). At very
low (< 10%) luminance contrast however, reading speed drops due to prolonged
fixation times and to an increased number of saccades, presumably related to a
reduced visual span (Legge et al., 1997). When testing the effect of print size on
reading speed in eccentric vision, they found that the use of larger characters
improved peripheral reading to some extent, up to a critical print size (Chung et al.,
1998). However, maximum reading speed decreased from about 808 WPM
(words/min) for foveal vision to about 135 WPM for eccentric vision at 20°
eccentricity44. Thus, print size was not the only factor limiting maximum reading
speed in normal eccentric vision. These findings contradict the scaling hypothesis
(Toet & Levi, 1992; Latham & Whitaker, 1996), which states that eccentric reading
can match foveal reading simply by increasing print size.
44 Such high reading rates were achieved by using rapid serial visual presentation (RSVP).
In low-vision patients, reading is similar to normal reading in several aspects
(Legge et al., 1985b; Rubin & Legge, 1989; Legge et al., 1990; Legge et al., 1997),
but difficult to predict on the basis of routine clinical evaluation (Legge et al., 1992).
As a rule however, it can be stated that low-vision patients with central field defects
achieve lower reading rates than those with preserved central fields (Legge et al.,
1985b; Rubin & Legge, 1989).
Reading in every-day life often requires certain page navigation abilities. Page
navigation not only implies the stabilization of gaze on a particular point of interest,
but also rather accurate oculomotor control. Small successive saccades should be
performed towards each word in the lines of text, and larger saccades are required
to jump from the end of one line to the beginning of the next. In previous literature,
page navigation has essentially been studied in connection with the use of special
field of view magnifiers, intended as reading aids for low vision patients. Beckmann &
Legge (1996) measured reading rates of normal and low vision subjects in two
conditions: with horizontally drifting text requiring no page navigation and with a
closed-circuit television magnifier (CCTV) requiring manual page navigation. Manual
page navigation resulted in significantly lower reading rates. This effect was more
pronounced on normal subjects than on low vision patients, suggesting that overall
reading performance was reduced in these patients essentially because of other
visual factors, and not because of navigational factors. A second comparative study
of the same research group, including RSVP text presentation and manual (mouse-controlled) page navigation, confirmed their previous findings (Harland et al., 1998).
The use of RSVP and drifting text presentation resulted in better reading
performance than the use of CCTV or manual navigation. Interestingly, they did not
observe significant differences in reading rates across the four methods of text
presentation in a group of patients with central field loss (i.e., subjects who were
forced to use eccentric fixation for reading).
On the one hand, full-page reading might be expected to be easier than deciphering
isolated words, because subjects can make use of context information to facilitate
reading. We are better at reading meaningful sentences than random words (Latham
& Whitaker, 1996; Fine & Peli, 1996). The benefits of context are however
controversial when it comes to peripheral vision. It has been suggested that readers
with central field loss would be less efficient in using context to facilitate reading
(Baldasare & Watson, 1986), but this hypothesis is contradicted by other studies. For
example, Fine & Peli (1996) compared reading rates for meaningful sentences to
reading rates for random words for normally sighted subjects and for subjects with
central field loss, and found that speed gains due to context were present and
equivalent for both groups of subjects when using RSVP and scrolled text
presentation.
On the other hand, full-page reading can also be expected to be more difficult
than deciphering isolated words, because successful reading of several lines of text
requires page navigation abilities (accurate oculomotor control). This can be difficult
to achieve with a restricted and stabilized viewing window, especially with eccentric
retinal locations. In humans, selective attention is mainly focused around the fovea,
the retinal area providing the highest spatial resolution. The oculomotor system is
constructed to essentially subserve foveal function by directing and stabilizing images
of interest to that retinal location. When the fovea is lost as a result of disease,
affected subjects strive to use optimally spared retinal areas as a replacement.
Adaptation to this viewing condition might involve several processes. Spared retinal
areas with best visual acuity and/or appropriate visual field (‘visual span’) should be
identified. Such eccentric retinal locations are commonly known as “preferred retinal
loci” or PRL (Von Noorden & Mackensen, 1962; Cummings et al., 1985). Selective
attention must be transferred to these eccentrically located PRL (Altpeter et al.,
2000). In addition, oculomotor control mechanisms should be reorganized to allow
shifting images of interest directly to the PRL (Heinen & Skavenski, 1992). The
development of eccentric fixation seems to appear prior to the ability to perform
saccades shifting the image of interest onto that new fixation area. An experimental
study where bilateral foveal lesions were performed in three adult monkeys showed
that while both fixation and saccadic mechanisms may adapt to foveal loss, saccadic
adaptation requires a much lengthier process (Heinen & Skavenski, 1992). In that
experiment, eccentric fixation occurred already one day following the lesion, and new
PRL stabilized within two days. In contrast, numerous reflexive saccades
inappropriately projecting visual stimuli onto the damaged fovea were still observed
the first days after lesion. Saccades gradually adapted to reference the newly
developed PRL over a period that lasted several weeks. Two months following
lesions, two of the three animals were able to generate saccades bringing the PRL
directly or close to the target image. This distinction between the development of
eccentric fixation and the adaptation of eccentric (non-foveating) saccades suggests
that oculomotor adaptation to peripheral viewing relies on multiple mechanisms.
3.2.1 Reading in the context of artificial vision
The studies cited above (as well as many others) have led to the identification of
a series of important parameters that are critical for reading in normal and low vision
subjects. To our knowledge, however, only a limited number of studies
specifically address visual prosthesis development issues. These have already
been introduced. Briefly, Cha et al. (1992b) used a pixelized vision system to
simulate artificial vision in normal subjects. Their results showed that a 25 x 25 array
of pixels representing four letters of text projected on a foveal visual field of 1.7° is
sufficient to provide reading rates near 170 WPM using scrolled text, and near 100
WPM using fixed text. Another group at the Johns Hopkins University in Baltimore
conducted experiments on the properties of pixelized vision (Dagnelie et al., 2000;
Thompson et al., 2000; Thompson et al., 2003). Several parameters were explored.
Tested individuals achieved reading speeds up to 100 WPM, which dropped off when
the pixelizing grid was smaller than 4 letters, with grid densities lower than 4
pixels/letter width, or when more than 50% of the pixels were randomly shut off. An
eccentric implantation site and the fact that a retinal implant would stimulate a fixed
area of the retina have not been fully taken into consideration yet45.
It is important to investigate the effect of stimulus eccentricity on performance,
because, as already explained in Chapter 1, the anatomo-physiology of the retina does
not favor a foveal location for retinal prostheses (see e.g. Sjöstrand et al., 1999a).
The best sites, potentially preserving retinotopic activation without major distortion,
are located beyond 10° eccentricity. Consequently, the vision of future users of
retinal prosthesis will probably be restricted to small peripheral areas of their visual
field. However, due to the decreasing gradient of visual acuity throughout the retina
and to the cortical magnification factor, our ability to identify objects in the periphery
is poor. Reading words of several letters is especially difficult due to contour
interaction. This phenomenon is generally known as the “crowding effect” (Toet &
Levi, 1992).
The first aim of the work presented in this chapter was to assess reading
performance with a system projecting stimuli onto defined, stabilized areas of the
visual field. Four letters of text visible at a glance is about the minimum letter sequence allowing normal or close to normal reading speeds46 (Legge et al., 1985a).
Therefore, in a set of pilot experiments, we explored the basic information
requirements for the reading task using isolated 4-letter words. First, we studied
the influence of stimulus information content (pixelization level), stimulus eccentricity
and stimulus size on reading performance for isolated 4-letter words. Second, we
explored whether low eccentric reading performances can be significantly improved
by training.
Obviously, these first experiments did not require page navigation during reading.
It is difficult to predict how future retinal prosthesis wearers would cope with the
page navigation problem. Neither horizontally drifting text nor RSVP realistically
mimics text reading using retinal implants, since both methods are expressly
intended to minimize eye movements. Mouse-controlled navigation and CCTV reading
are both quite unnatural conditions, because they rely on manual page navigation.
The main objective of this investigation was, therefore, to address the issue of full-page reading in conditions mimicking artificial vision as provided by a retinal
implant, taking the page navigation problem into account. We conducted two
successive experiments. In experiment 1, subjects were asked to read pixelized full-page texts using a viewing window stabilized on the fovea. In experiment 2, subjects
were asked to perform the same task, but using a viewing window stabilized at 15°
eccentricity.
45 The research group at the Johns Hopkins University has recently begun using image
stabilization on the retina and acknowledged that it has a significant effect on reading
performance (Kelley et al., 2004).
46 Maximal reading speeds would be favored with viewing windows containing a higher number of
characters. However, working in the field of low vision, we considered 4 letters an
adequate minimum value to obtain reading speeds that may not be maximal but are close to
normal. This issue will be revisited in the discussion section of this chapter.
The results presented in this chapter also offer a unique opportunity to study the
overall process through which subjects adapt to eccentric viewing. Eye movements
recorded during the experiments were also analyzed with the purpose of better
defining the processes of oculomotor adaptation to eccentric reading.
3.3 Specific methods for the reading experiments
3.3.1 Subjects
Subjects were recruited from the staff of the Ophthalmology Clinic of the Geneva
University Hospitals. Their age ranged from 25 to 47 years. All had normal or
best-corrected visual acuity of 20/20 in the tested eye and a normal ophthalmologic
status. All of them were fluent French speakers and were familiar with the purpose of
this study.
3.3.2 Experimental setup
To simulate visual percepts produced by a retinal implant, images were projected
on a defined and stabilized retinal area. The details of the simulation procedures
have already been described in the General Methods Chapter. For all reading
experiments, the apparatus used corresponded to the stationary setup. The image-processing algorithm used was off-line square pixelization.
Subjects were comfortably seated at 57 cm from the screen (see fig. 26). At the
beginning of each run, eye-to-screen distance was checked and a standard 9-point
calibration was performed. Subjects were requested not to move during the session.
The actual experimental sequence started afterwards. Gaze position was used to move
the viewing window containing the target stimuli (BMP images) on the stimulation
screen (see fig. 28). Images could thus be steadily projected onto a defined (central
or eccentric) area of the retina, the reference point for eccentricity being the center
of the viewing window. Eccentric stimuli were always presented in the lower visual
field (please refer to Chapter 2 for more details).
Figure 28. The stimulation screen as viewed by the subject. The viewing window,
containing fragments of pixelized text, moves on the screen according to the direction
of gaze and with a certain offset (in this case, 15° of eccentricity). The background
of the remaining screen area was kept in a gray color corresponding to the mean
luminosity of the target stimuli (4-letter words or full-page texts).
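The relation between sizes on the screen and visual angles at the 57 cm viewing distance can be sketched as follows. This is a minimal illustration assuming a flat screen viewed perpendicularly; the thesis does not state why 57 cm was chosen, but at this distance 1 cm on the screen subtends very nearly 1° of visual angle. The function names are hypothetical.

```python
import math

VIEWING_DISTANCE_CM = 57.0  # eye-to-screen distance stated in the text


def visual_angle_deg(size_cm: float, distance_cm: float = VIEWING_DISTANCE_CM) -> float:
    """Visual angle subtended by an object of `size_cm` centered on the line of sight."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


def size_for_angle_cm(angle_deg: float, distance_cm: float = VIEWING_DISTANCE_CM) -> float:
    """On-screen size needed to subtend `angle_deg` (inverse of visual_angle_deg)."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def eccentric_offset_cm(eccentricity_deg: float, distance_cm: float = VIEWING_DISTANCE_CM) -> float:
    """Distance on a flat screen between the gaze point and a point at the given eccentricity."""
    return distance_cm * math.tan(math.radians(eccentricity_deg))


if __name__ == "__main__":
    print("1 cm subtends", round(visual_angle_deg(1.0), 3), "deg")
    print("10 deg window width:", round(size_for_angle_cm(10.0), 2), "cm")
    print("offset for 15 deg eccentricity:", round(eccentric_offset_cm(15.0), 2), "cm")
```

The small-angle shortcut (1 cm per degree) becomes less accurate for large windows and large eccentric offsets, which is why the exact tangent relation is used here.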
3.4 Pilot experiment: Reading of isolated 4-letter words
As mentioned in section 1.5, these pilot experiments had already been
completed when I joined the project. Nevertheless, they have been
included in this dissertation to give a complete picture of the minimum requirements
for useful reading. In addition, these results are required to fully understand some of
the main theoretical foundations for the principal experiments presented in this
chapter: those exploring full-page reading.
3.4.1 Stimuli
Stimuli were presented in rectangular white areas (viewing windows) filled with
black 4-letter words of common French language (including accented characters and
capital letters for proper names). The largest possible Arial (Helvetica-style) font was
chosen because it is commonly used and has been shown to enable good reading in low
vision subjects (Buultjens et al., 1999). The stimuli used for the experiments were
pre-processed BMP images. Figure 29 shows an example of one of the stimuli words
processed at different pixelizations.
[Figure 29 panel labels: maximum screen resolution, 875 pixels, 286 pixels, 140 pixels, 83 pixels]
Figure 29. The 4-letter word “rang” illustrating the five degrees of pixelization used in the
experiment.
For each run, a block of 50 words, randomly chosen among a library of 500
common French 4-letter words, was presented. The subject had to say the word
he/she recognized for each item of the run. The response (right or wrong47) was
entered by the examiner into the operator PC and stored for further analysis. After
each word presentation, the calibration was checked for possible drifts and,
if necessary, slightly corrected to ensure exact control of the position of the target
image during the entire experiment.
47 We were strict when attributing these scores. Words had to be perfectly recognized (gender and
number mistakes were considered as complete errors) to be considered as correctly read.
3.4.2 Acute experiments with 4-letter words
Reading performance was assessed versus a number of variables, each being an
important parameter of prosthetic vision:
1) Stimulus size. Two viewing windows were investigated. First, a viewing
window that subtended a visual field of 20° x 7°, which allowed the use of a
print size greater than the critical print size needed for optimal reading
performance at an eccentricity of 20° (Chung et al., 1998). The height of a
small letter ‘x’ of the Arial font size used for this large viewing window
corresponded to a visual angle of 3.6°. Second, a viewing window subtending
10° x 3.5°, which corresponded to a surgically manageable, realistic retinal
prosthesis (3 x 1 mm² on the retina). The height of a small letter ‘x’ of the
Arial font size used for this small viewing window corresponded to a visual
angle of 1.8°.
2) Stimulus pixelization (i.e. number of pixels in the viewing window). Five
degrees of pixelization were used (see fig. 29): maximum screen resolution48,
875, 286, 140, and 83 pixels. At maximum screen resolution, the large viewing
window enclosed 4 times more pixels than the small viewing window.
Otherwise, tests were performed at equal pixel resolution for both viewing
window sizes. Note that using the same number of pixels on both viewing
windows implied that pixels were 4 times larger in area (twice as large in each
linear dimension) on the larger (20° x 7°) viewing window.
3) Stimulus eccentricity. Five different eccentricities were tested: 0°, 5°, 10°,
15°, and 20° in the lower visual field.
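The geometric relation between the two viewing windows can be made explicit with a short sketch. It assumes, as a simplification, that the window is tiled exactly by identical square pixels, so that the angular side of one pixel is the square root of the window area divided by the number of pixels; the actual pixel grids used in the experiments are not given in this section. The maximum-resolution values come from the footnote on screen resolution (400 x 140 and 200 x 70 pixels). All names are hypothetical.

```python
import math

# Window sizes in degrees of visual angle (width, height), as given in the text.
WINDOWS_DEG = {"large": (20.0, 7.0), "small": (10.0, 3.5)}
# Maximum screen resolution per window (columns, rows), from the footnote.
MAX_RESOLUTION = {"large": (400, 140), "small": (200, 70)}
# Reduced pixel counts used for both windows.
PIXEL_COUNTS = (875, 286, 140, 83)


def pixel_side_deg(window: str, n_pixels: int) -> float:
    """Approximate angular side of one square pixel (assumes exact square tiling)."""
    width, height = WINDOWS_DEG[window]
    return math.sqrt(width * height / n_pixels)


def max_pixel_count(window: str) -> int:
    """Total number of screen pixels in the window at maximum screen resolution."""
    cols, rows = MAX_RESOLUTION[window]
    return cols * rows


if __name__ == "__main__":
    for n in PIXEL_COUNTS:
        large = pixel_side_deg("large", n)
        small = pixel_side_deg("small", n)
        print(f"{n:4d} pixels: {large:.2f} deg (large) vs {small:.2f} deg (small)")
    print("max resolution ratio:", max_pixel_count("large") / max_pixel_count("small"))
```

Under this assumption a pixel in the large window is always twice as wide (4 times the area) as in the small window at the same pixel count, and the large window holds 4 times as many screen pixels at maximum resolution.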
3.4.2.1 Experimental protocol
Five subjects participated in this experiment, and all tests were conducted
monocularly. Each subject performed one run in each condition. Testing always
started at the lowest eccentricity using maximum screen resolution first, then 875
pixels, 286 pixels, 140 pixels, and finally 83 pixels. The same procedure was
repeated using the next eccentricity. Possible global learning effects would therefore
favor performance at low pixelization levels and at high eccentricities.
Each word was presented for 3 s. When stimuli were presented in eccentric
vision, a fixation aid (a red filament crossing the screen) was installed to make it
easier for the inexperienced subject to keep the target on screen.
3.4.2.2 Performance with the 20° x 7° viewing window
Figure 30 presents mean reading performance versus pixelization level for the
large viewing window, at various eccentricities. In central vision, reading
performance of 4-letter words was close to perfect (over 90% correct) for
pixelizations down to 286 pixels. For pixelizations below that value, reading
performance dropped abruptly. This result indicates that approximately 300 pixels
are necessary to transmit the relevant information for reading 4-letter words. In
peripheral vision, maximum reading performance decreased with growing eccentricity.
At an eccentricity of 10°, almost perfect reading (> 90% correct) was still possible
at high pixel resolutions. At 15° and 20° eccentricities, perfect reading was never
achieved (not even with high pixel resolutions). Maximum reading performance was
limited to values of 88% and 63% correctly read words, respectively, in these cases.
Figure 30. 4-letter word reading performance versus number of pixels in a 20° x 7°
stabilized viewing window. Mean reading scores in RAU ± SEM (left scale) and in %
(right scale) for 5 normal subjects at 5 eccentricities in the lower visual field.
48 Expressed in x by y terms these pixelizations result in: maximum screen resolution = 400 x 140
pixels for the larger viewing window and 200 x 70 pixels for the smaller viewing window.
3.4.2.3 Performance with the 10° x 3.5° viewing window
Figure 31 presents mean reading performance versus pixelization level for the
small viewing window, at various eccentricities. In central vision, results observed
for the small viewing window were very similar to those obtained with the large
window. Reading performance of 4-letter words was also close to perfect (> 90%
correct) for pixelizations down to 286 pixels and dropped dramatically afterwards.
The same limiting criterion of about 300 pixels was, thus, found to transmit the
relevant information. In eccentric vision, the decrease in maximum reading
performance with growing eccentricity was more pronounced than that observed with
the 20° x 7° viewing window: at eccentricities of 10°, 15° and 20°, maximum reading
performance was limited to values of 89%, 57%, and 30% correct, respectively.
Figure 31. 4-letter word reading performance versus number of pixels in a 10° x 3.5°
stabilized viewing window. Mean reading scores in RAU ± SEM (left scale) and in %
(right scale) for 5 normal subjects at 5 eccentricities in the lower visual field.
3.4.2.4 Normalized data in both viewing windows
The raw observations presented in the two preceding figures demonstrate that
both pixel resolution and eccentricity of the stimuli affected reading performance of
4-letter words. The data were normalized to the values obtained at maximum screen
resolution in order to better compare the effect of pixelization at different
eccentricities (fig. 32). These normalized data demonstrate that pixelization affected
reading performance quite similarly at all eccentricities and on both viewing windows.
This result is consistent with the fact that pixelization directly influences the
information content of the source image. Conversely, stimulus eccentricity seemed to
affect the way this source information was processed by the visual system, limiting
maximum reading performance.
Figure 32. Normalized reading performances for 4-letter words versus number of pixels in a
stabilized viewing window of a) 20° x 7° and b) 10° x 3.5°. Mean normalized reading scores ± SEM for
5 normal subjects at 5 eccentricities in the lower visual field. Data were normalized to mean reading
performance values at maximum screen resolution.
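The normalization step can be sketched as a simple ratio to the score obtained at maximum screen resolution for the same eccentricity. This is a hedged illustration only: the section does not state whether the ratio was taken on % correct or on RAU values, and the scores below are made-up numbers chosen for demonstration, not data from the thesis. The function name and dictionary layout are hypothetical.

```python
def normalize_to_max_resolution(scores: dict, reference_key: str = "max") -> dict:
    """Divide every score by the score measured at maximum screen resolution.

    `scores` maps a pixelization level to a mean reading score for one
    eccentricity. The reference score must be positive for the ratio to be
    meaningful (RAU values can be negative, % correct cannot).
    """
    reference = scores[reference_key]
    if reference <= 0:
        raise ValueError("reference score must be positive")
    return {level: value / reference for level, value in scores.items()}


if __name__ == "__main__":
    # Illustrative (invented) mean scores in % correct at one eccentricity.
    example_scores = {"max": 60.0, 875: 62.0, 286: 55.0, 140: 30.0, 83: 12.0}
    for level, ratio in normalize_to_max_resolution(example_scores).items():
        print(level, round(ratio, 2))
```

After this step the curve for every eccentricity passes through 1.0 at maximum resolution, so that only the shape of the dependence on pixel number is compared across eccentricities.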
3.4.2.5 Single letter recognition versus 4-letter word reading
At eccentricities beyond 10°, most subjects reported having problems recognizing
letters in the middle of the words. One of the underlying reasons for poor reading
performances might thus be the “crowding effect” (Tychsen, 1992). An additional
experiment, using isolated letter stimuli instead of 4-letter words, was designed to
verify this hypothesis. Briefly, isolated single letters of the same font type and size as
used for 4-letter words were created for the small viewing window. The letter was
presented in the center of the viewing window, which contained the same number
of pixels as in the word experiments.
Blocks of 50 letters were chosen among the French alphabet according to their
frequency of use in our pool of 500 words. These letters were randomly presented to
five new subjects. The results of this additional experiment are summarized in
figure 33. Up to an eccentricity of 15°, isolated letter recognition was almost
independent of eccentricity. At an eccentricity of 20°, maximum letter recognition
was still about 90% correct for high pixelizations.
Figure 33. Isolated letter recognition performance versus number of pixels in the
10° x 3.5° stabilized viewing window. Mean letter recognition scores in RAU ± SEM
(left scale) and in % (right scale) for 5 normal subjects at 5 eccentricities in the
lower visual field.
Figure 34 compares reading performance of isolated letters to that of 4-letter
words at 286-pixel resolution for the small viewing window. It is clear that isolated
letter recognition was much less affected by eccentricity than 4-letter word reading.
To compare both results, we computed the intrinsic probability to correctly identify
4 isolated letters in successive sequence, on the basis of the probability to recognize
single letters49 ($p_4 = p_1^{4}$). This rough estimation still failed to account for the
very low scores observed in the word-reading task. Therefore, even if both tasks are
difficult to compare quantitatively, this observation suggests that 4-letter word
reading was significantly impaired at high eccentricities by the fact that the letters
to be recognized were flanked by others. The crowding effect may be the underlying
mechanism.
Figure 34. Reading performance versus eccentricity for 286-pixel resolution stimuli
in a 10° x 3.5° stabilized viewing window. Isolated letter recognition (red plot) is
compared to 4-letter word reading (blue plot). Mean reading scores in RAU ± SEM
(left scale) and in % (right scale) for 5 normal subjects. A probabilistic estimate to
recognize 4 letters in sequence is also plotted for comparison (green plot).
49 Obviously, this estimate is a very simplified lowest limit since many words can be correctly
identified when fewer than all 4 letters are recognized.
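The probabilistic estimate used for comparison can be illustrated numerically. The sketch assumes that the four letters are recognized independently with the same single-letter probability, which is exactly what the relation p4 = p1^4 implies; the function name is hypothetical and the input values are illustrative rather than read off the figures.

```python
def four_letter_estimate(p_single: float, n_letters: int = 4) -> float:
    """Probability of identifying all letters, assuming independent recognition."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability between 0 and 1")
    return p_single ** n_letters


if __name__ == "__main__":
    # Illustrative single-letter probabilities (not values taken from fig. 34).
    for p1 in (0.95, 0.90, 0.80):
        print(f"p1 = {p1:.2f} -> p4 = {four_letter_estimate(p1):.3f}")
```

For instance, a single-letter probability of 0.90 (roughly the maximum letter recognition reported at 20° for high pixelizations) gives p4 of about 0.66, well above the 30% maximum word score reported at 20° with the small viewing window. The two figures come from slightly different conditions, so the comparison is indicative only, but it shows the size of the gap that the independence estimate leaves unexplained.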
Finally, it is also interesting to note that at high eccentricities, isolated letter
recognition performance was significantly better at a target resolution of 875 pixels
than at maximum screen resolution (fig. 34; p = 0.016 at 15° eccentricity, p = 0.018
at 20° eccentricity). The data for 4-letter word reading on the small viewing window
(fig. 32) showed the same trend. This finding might indicate that at high
eccentricities a certain blur of the target (due to pixelization) leads to better reading
performance, when letter size is below the critical print size. A study by Li, Nugent, &
Peli (2001) compared letter recognition of pixelized and smoothed anti-aliased letters
on a CRT display. They found no significant difference between both conditions in
peripheral vision up to 12.5° eccentricity. The tendency we observed appeared only
at higher eccentricities (15° and beyond). However, the stimuli used in that
study are difficult to compare with ours.
3.4.3 Habituation to reading 4-letter words in eccentric vision
This experiment investigated whether eccentric reading performance, in
conditions mimicking vision with a retinal implant, could improve with training. We
investigated the effects of training on eccentric reading by presenting isolated words
in a 10° x 3.5° viewing window (corresponding to a surgically manageable retinal
implant surface of 3 x 1 mm²), located at 15° of eccentricity in the lower visual field
(corresponding to a physiologically favorable location on the retina) and containing
286 pixels (corresponding to a pixelization level that allowed close to perfect word
recognition in central vision). In this same condition, subjects could identify correctly
only between 20% and 48% of the words in the previous experiment. Two additional
control conditions with the same viewing window (10° x 3.5°) were used for
comparison: stimuli containing 286 pixels, presented at 0° eccentricity (central
reading); and stimuli containing 14’000 pixels (maximum screen resolution), but
presented at 15° eccentricity.
3.4.3.1 Experimental protocol
Two subjects, AR and EO, participated in this experiment. They were naïve to the
task (i.e. they did not participate in any of the previous studies). Subject AR performed all
tests in monocular condition since retinal implants will certainly be used monocularly
(at least for first-generation implants). However, it was interesting to compare
monocular to binocular learning. Subject EO therefore performed all tests
in binocular condition.
Three experimental sessions were conducted each working day of the week (5
days per week). Each session included one run consisting of a 50-word block, in each
of the 3 different conditions, in the following order:
1) Control condition 1: 286 pixels at 0° eccentricity.
2) Control condition 2: 14’000 pixels at 15° eccentricity.
3) Main condition: 286 pixels at 15° eccentricity.
The easiest condition was thus tested first and the most difficult last, so that
within-session learning would favor results in the most difficult condition.
Stimulus presentation time was limited to a maximum of 10 s, and subjects were
instructed to press a key as soon as they recognized the projected word. Response
time was recorded with the word recognition score. At the end of each run, reading
score (number of correctly recognized words) and mean response time (s) were
automatically computed by the program.
Each experimental session lasted about 20 minutes. Consequently, the three daily
sessions represented about 1 hour of training. A total of sixty-nine sessions were
conducted on each subject. The regular daily flow of sessions was interrupted only
for weekends and exceptionally once (AR) or twice (EO), for 3-day vacations.
Learning curves were computed for reading scores and mean response times
using the exponential functions presented in Chapter 2. Significant learning effects
were determined using simple linear correlation (Pearson’s correlation).
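The significance test on learning can be sketched as a Pearson correlation between session number and score. The code below is a minimal pure-Python illustration: the exponential functions of Chapter 2 are not reproduced in this section, so `saturating_exponential` is a generic stand-in with hypothetical parameters, and the synthetic learning curve is generated for demonstration only. The p-value computation is omitted.

```python
import math


def pearson_r(xs, ys):
    """Pearson's linear correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def saturating_exponential(session, start, plateau, tau):
    """Generic learning curve rising from `start` towards `plateau` (assumed form)."""
    return plateau - (plateau - start) * math.exp(-session / tau)


if __name__ == "__main__":
    sessions = list(range(1, 70))  # 69 sessions, as in the protocol
    # Synthetic scores: hypothetical parameters, not fitted to the thesis data.
    scores = [saturating_exponential(s, start=20.0, plateau=85.0, tau=15.0) for s in sessions]
    print("r =", round(pearson_r(sessions, scores), 2))
```

A positive r for scores and a negative r for response times would both be consistent with learning; because the underlying curve saturates, the linear correlation understates the strength of the relation compared with the exponential fit.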
3.4.3.2 Learning effects on eccentric reading of pixelized words
Figure 35a presents reading score results versus session number for each subject
tested in the main condition (viewing window containing 286 pixels, 15° eccentricity).
Impressive learning effects can be observed. Both subjects started the experiment
with low reading scores. With training, their scores improved by about 60 percentage points. These
improvements were highly significant for both subjects (Pearson’s correlation: r =
0.80, p < 0.0001 for EO and r = 0.86, p < 0.0001 for AR). The exponential fit to the
data revealed some noticeable individual differences. At the beginning of the learning
period, subject EO was able to identify correctly about 23% of the words, whereas
subject AR identified correctly only 6% of them. Both progressed over time. During
final sessions, subject EO achieved scores of about 85% correct, and subject AR of
about 64% correct. EO’s scores asymptoted for the last 15 sessions, while AR’s
scores never stabilized. Note that EO was tested in binocular condition and AR in
monocular condition. Although overall improvements were similar for both of them,
these performance differences might reflect a binocular advantage.
A second experimental observation consistent with a learning process appears in
the analysis of the mean response time (fig. 35b). During the first sessions, typical
response times ranged from 2 to 5.5 s. For both subjects, mean response time
decreased significantly as session number increased (Pearson’s correlation: r = -0.79,
p < 0.0001 for EO and r = -0.38, p = 0.001 for AR). The reduction in response time
is more pronounced for subject EO tested in binocular mode (2.4 s), than for subject
AR tested in monocular mode (0.6 s). Interestingly, the longer initial response times
of subject EO are also associated with better initial reading scores (compare with fig.
35a), suggesting differences between subjects in individual strategies to accomplish
this difficult task.
Figure 35. Eccentric reading performance of pixelized 4-letter words versus training session number
for 2 normal subjects, expressed as: a) reading scores in RAU (left scale) and in % (right scale), and
b) mean response time [s]. The viewing window contained 286 pixels and was stabilized at 15° in the
lower visual field. The solid lines indicate the best fit to the data.
Taken together, these data demonstrate that important improvements in
eccentric reading of pixelized isolated words can be obtained with training. They
demonstrate that subjects can improve reading accuracy and response speed over time
as they become familiar with the task. It is interesting to explore in more detail some of
the parameters influencing this learning effect.
3.4.3.3 Influence of eccentricity and of pixelization on the learning process
Figure 36 presents reading scores for both subjects in the 2 control conditions.
The influence of eccentricity can be observed by comparing data obtained with a 10°
x 3.5° viewing window, containing 286 pixels, but presented at 0° eccentricity
(central reading; red plots in fig. 36) to that obtained in the main condition (same
viewing window size, same number of pixels; black plots in fig. 36). As expected on
the basis of the acute experiments presented in section 3.4.2, central reading
performance was already close to perfect during the first sessions for both subjects.
Both subjects improved their eccentric reading performance with training; however,
they never reached the same performance as in central vision. Results for subject
EO (binocular mode) asymptoted after about 15 sessions, while subject AR
(monocular mode) showed a slower improvement.
The influence of pixel number is demonstrated by comparing data collected with a
10° x 3.5° viewing window, presented at 15° of eccentricity, but at maximum screen
resolution (14000 pixels; blue plots in fig. 36) to those obtained in the main condition
(same viewing window size, same eccentricity). The learning process for eccentric
reading of words presented at high resolution and for eccentric reading of words
presented at 286-pixel resolution was strikingly similar for both subjects. Throughout
the experiments, scores obtained in the high-resolution condition were
about 10% to 20% higher than those collected at 286-pixel resolution.
Figure 36. Reading scores for 4-letter word deciphering versus training session number in central
vision using a viewing window at 286-pixel resolution (red plot) and at 15° eccentricity in the lower
visual field using a viewing window at maximum screen resolution (blue plot). The solid lines indicate
the best fit to the data in these two conditions. The best fit to the data in the main condition (black
solid line) is also shown for comparison.
Altogether, these results reveal that the improvements measured in the main
condition can be mainly attributed to adaptation to eccentric word
reading. Conversely, habituation to deciphering pixelized words plays only a minor
role. It can therefore be concluded that habituation to eccentricity is the dominant
component of the learning process. In addition, the visual system does not seem to
be able to use all the information presented at 15° eccentricity: scores collected at
15° eccentricity asymptote markedly below those collected with central reading.
Providing more resolution can improve performance to some extent, but does not
totally compensate for the loss due to eccentricity.
3.4.3.4 Effect of familiarization with the word-set
For each run, 50 words randomly chosen out of a 500-word pool were presented
to the subjects. Performance might have improved simply because subjects, who were
always confronted with the same finite set of words, remembered the set of possible
correct answers better and better. We therefore decided to investigate
how much this potential bias influenced our results.
A new pool of 200 words was generated. Blocks of 50 words were randomly
extracted amongst this new set and presented to the subjects in a new set of
experimental sessions, occurring once the main experiment ended. The remaining
experimental conditions were identical. Figure 37a compares mean reading scores
obtained with the new word set to those obtained using the original 500-word pool.
For both subjects, reading scores using the new word set were significantly higher (p
= 0.002 for EO; p < 0.001 for AR) than those achieved at the beginning of the main
experiment. Conversely, reading scores for unpracticed words were only slightly lower
than those obtained at the end of the main experiment with the original 500-word
pool. This difference was significant for subject AR (p = 0.03) but not for
subject EO (p = 0.2). These findings demonstrate that subjects could identify
unpracticed words with an accuracy similar to that achieved with the old set of
words after learning.
Figure 37. Comparison of mean reading performance between the original word set used during the
main learning experiment and a new pool of unpracticed 4-letter words. Bars indicate mean values ±
SD calculated on the basis of three runs: the first and the last 3 runs of the main experiment (red and
blue bars, respectively), and 3 additional runs with unpracticed words (green bars). Experimental
conditions: 15° eccentricity using a viewing window of 10° x 3.5° containing 286 pixels. Results
expressed as: a) mean reading scores in RAU ±SD (left scale) and in % (right scale), and b) mean
response time ±SD [s].
The analysis of mean response times completes this evaluation (fig. 37b). Both
subjects required more time to identify the new words than when using the old word
set at the end of the main study (non-significant for EO at p = 0.07; significant for
AR at p = 0.006). This suggests that the increased difficulty in reading unpracticed
words might be better reflected by response time measurements than by reading
accuracy.
One can conclude from this additional test that the repeated use of the same pool
of words did not significantly bias our results. Therefore, the impressive
improvements observed were real improvements in accomplishing the task.
Interestingly, familiarization with the word pool seems to be important for reading
speed. Pixelized words that are familiar to the reader are recognized more rapidly
than unpracticed words presented in the same conditions.
3.4.3.5 Binocular versus monocular performance
Subject EO performed all tests using binocular vision while subject AR did all tests
in monocular vision. The final scores reached by subject EO were about 20% higher
than those of subject AR (see fig. 35a). Two important questions can be raised: Does this
difference reflect an advantage of binocular vision? How would subjects perform
in the viewing condition they had not used previously?
At the end of the main experiment, we measured reading scores in inverted
viewing conditions for each subject (i.e. in monocular mode for EO, and in binocular
mode for AR). The remaining experimental conditions were identical to our main
experiment (viewing window containing 286 pixels, presented at 15° eccentricity).
No significant difference in reading performance was found for either subject in such
inverted viewing conditions (fig. 38). Individual differences in reading scores
between subjects were preserved. This finding indicates that learning is independent of
the condition in which training was conducted. Binocular learning benefits monocular
eccentric reading and monocular learning benefits binocular eccentric reading. There
was however a slight, though non-significant, within-subject trend for better scores
with binocular vision. This small advantage is possibly due to binocular summation or
to inter-ocular suppression effects.
Figure 38. Effect of using reversed viewing conditions
(untrained versus trained) on reading scores. The bars
indicate mean values ± SD calculated on the basis of three
runs. Subject EO: three additional runs in monocular
condition (untrained – red bar) versus the last three runs of
the main experiment in binocular condition (trained – blue
bar). Subject AR: three additional runs in binocular condition
(untrained – red bar) versus the last three runs of the main
experiment in monocular condition (trained – blue bar).
Experimental conditions: 15° eccentricity using a viewing
window of 10° x 3.5° containing 286 pixels.
We also investigated whether learning acquired monocularly transferred to the
non-trained eye. Subject AR, who did all the tests monocularly with her dominant right
eye, was also tested using her non-dominant left eye. Figure 39 clearly shows that
there is no significant difference in monocular reading performance between the two eyes,
in all three experimental conditions.
Figure 39. Comparison of reading scores between the
trained and the untrained eye at the end of the training
process for subject AR. The bars indicate mean values ± SD
calculated on the basis of three runs: last three runs of the
main experiment and three additional runs conducted on the
untrained eye. All three experimental conditions are
compared.
3.4.3.6 Persistence of learning
We finally decided to investigate whether the benefits of learning could persist after a
prolonged period of non-practice. Subject EO did not participate in any testing for
2 months after the end of the experiments. Her reading performance was then
re-measured using the main experimental condition. Figure 40 compares her
performance: (1) at the beginning of the main experiment (day 1); (2) at the end of
the training period (day 36); and (3) after 2 months of non-practice (day 99).
Performance was significantly better, both in terms of reading scores (p = 0.0004)
and in terms of mean response time (p = 0.02), when comparing the 1st training
session to the results obtained after the 2-month break. Conversely, no significant
change in performance (reading scores or mean response time) can be observed
between the last training session and the measures taken 2 months later. These
findings indicate that learning of eccentric reading persists after extended periods of
non-practice.
Figure 40. Reading performance 2 months after training compared to the reading performance at the
beginning and at the end of training for subject EO. Mean values (±SD) calculated on the basis of 3
runs: first 3 runs of the main experiment (day 1), last 3 runs of the main experiment (day 36), and 3
additional runs conducted two months after completion of the experiment (day 99). Experimental
conditions: 15° eccentricity using a viewing window of 10° x 3.5° containing 286 pixels in binocular
vision. Results expressed as: a) reading scores in RAU (left scale) and in % (right scale); b) mean
response time [s].
3.5 Full-page reading
The second objective pursued in this chapter was to address the issue of full-page
reading in conditions mimicking artificial vision as provided by a retinal implant. We
conducted 2 successive experiments. First, subjects read pixelized full-page texts
using a viewing window stabilized on the fovea. Second, subjects performed the
same task, but using a viewing window stabilized at 15° eccentricity. Experimental
sessions were repeated daily, until reading scores stabilized. Using central vision,
reading scores asymptoted within a few sessions. A more important and progressive
learning process was observed when using eccentric vision: almost two months were
necessary for reading scores to asymptote.
From the previous experiments using 4-letter words, we knew that a sampling
density of 286 pixels, distributed over a viewing window subtending 10° x 3.5° of
visual field, would allow for close to perfect word recognition. We noticed, however,
that the 3.5° vertical visual span limited page navigation, because it did not allow
visualization of the adjacent lines of text. Pilot experiments in central vision were
therefore conducted to determine a more adequate viewing window height. The
visualization of two lines of text at once was found to be very helpful to orient page
navigation. In our experiments, this was achieved by doubling the height of the
viewing window (7°). Larger vertical visual spans (≥ 10°) did not result in further
improvement. A viewing window subtending 10° x 7° of visual field was thus chosen
for the full-page reading experiments. We used the same pixel density as determined
in the isolated 4-letter word reading experiments presented previously, giving 572 pixels
(see footnote 50). Such a viewing window corresponds to a still surgically manageable
implant size of 3 x 2 mm².
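The viewing-window arithmetic above can be checked with a short sketch. It assumes a uniform pixel density over the window and a retinal magnification of roughly 0.29 mm per degree of visual angle. That conversion factor is a commonly quoted approximation and is an assumption here, not a value given in this section. All function and constant names are hypothetical.

```python
# Sketch: pixel density of the viewing window and its approximate retinal footprint.
# MM_PER_DEG is an assumed, approximate conversion factor (not stated in the text).
MM_PER_DEG = 0.29

def pixel_density(n_pixels, width_deg, height_deg):
    """Return the number of pixels per square degree of visual field."""
    return n_pixels / (width_deg * height_deg)

def retinal_size_mm(width_deg, height_deg, mm_per_deg=MM_PER_DEG):
    """Return the approximate (width, height) of the stimulated retinal area in mm."""
    return width_deg * mm_per_deg, height_deg * mm_per_deg

word_density = pixel_density(286, 10.0, 3.5)   # isolated-word experiments
page_density = pixel_density(572, 10.0, 7.0)   # full-page experiments
width_mm, height_mm = retinal_size_mm(10.0, 7.0)

print(f"{word_density:.2f} vs {page_density:.2f} pixels/deg^2")
print(f"retinal footprint: about {width_mm:.1f} x {height_mm:.1f} mm")
```

Under these assumptions, doubling the window height while doubling the pixel count leaves the density unchanged, and the 10° x 7° window maps to roughly the 3 x 2 mm² area quoted above.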
3.5.1 Stimuli
A pool of 100 articles of diverse content (culture, politics, economics, sports,
etc.) was downloaded from the website (see footnote 51) of a popular Swiss
general-information newspaper (neither too elementary nor too sophisticated) and
converted to BMP text images. We used the same font as in the previous experiments
(the height of the lowercase letter ‘x’ corresponded to a visual angle of 1.8° at a 57 cm
viewing distance). In these conditions, a segment of 7 successive lines of text could
be displayed on the screen, and about 6 successive characters could be visualized in
the viewing window at once. Hyphenation was used to maximize word presentation,
resulting in an average of 25 words per text segment. Each article was divided into
10 successive segments. Figure 41 shows an example of such a stimulus.
Figure 41. A segment of pixelized full-page text as presented on the computer screen. The three
dots, at the beginning and at the end, indicate a text segment situated somewhere in the body of
an article. Texts were not justified.
3.5.2 Analysis methodology
3.5.2.1 Reading performance
Reading performance was measured in terms of reading scores (expressed as %
of correctly read words per session) and in terms of reading rates (expressed as
average number of correctly read words per minute during each session). Reading
scores were statistically analyzed using scores expressed in RAU (please refer
to Chapter 2 for more details on this issue), but an approximate %-correct scale
(see footnote 52) is indicated on the right ordinates of the graphs for clarity.
50 Double the window height = double the number of pixels.
51 http://www.letemps.ch
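The two performance measures can be sketched as follows. The RAU transform itself is defined in Chapter 2, which is not reproduced here; this sketch assumes the standard rationalized arcsine transform (Studebaker's formula), and the function names are hypothetical.

```python
import math

def rau(n_correct, n_total):
    """Rationalized arcsine units for n_correct items out of n_total.

    Assumed formula (standard rationalized arcsine transform):
    theta = asin(sqrt(X / (N + 1))) + asin(sqrt((X + 1) / (N + 1)))
    RAU   = (146 / pi) * theta - 23
    """
    theta = (math.asin(math.sqrt(n_correct / (n_total + 1)))
             + math.asin(math.sqrt((n_correct + 1) / (n_total + 1))))
    return (146.0 / math.pi) * theta - 23.0

def reading_rate_wpm(n_correct_words, duration_s):
    """Correctly read words per minute over a reading period of duration_s seconds."""
    return 60.0 * n_correct_words / duration_s

# The same 100%-correct score maps to different RAU values for different sample sizes,
# which is why only an approximate %-correct scale can be shown on the graphs.
print(rau(50, 50), rau(100, 100))
```

With this formula a score of 50 out of 100 maps to 50 RAU, while perfect scores map to slightly different values depending on the number of words presented.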
3.5.2.2 Text comprehension
Qualitative comprehension of the text was judged by two examiners using an
arbitrary 4-grade scale: ‘none’ meaning that the text was not understood at all;
‘insufficient’ meaning very partial comprehension, insufficient to understand the issue
reported in the text; ‘good’ meaning that the main issue was grasped but not all
details; ‘excellent’ meaning a perfect and detailed comprehension of the text (see
footnote 53). After each reading session, subjects had to describe what they had read
and were then questioned by the two examiners, who had no difficulty assigning one of
the four comprehension levels. Subjects spontaneously reported being satisfied when they
reached ‘excellent’ or ‘good’ levels of comprehension, associated with reading rates
of about 20 words per minute. From a clinical point of view, these two latter levels
might be considered as gratifying and useful full-page reading performance.
3.5.2.3 Eye movements
Eye movements were recorded throughout the experiment. Saccades detected
on-line by the automatic parser of the eye tracking system were analyzed in order to
define the various stages of the oculomotor adaptation process for eccentric reading.
For a saccade to be detected by the system, several criteria had to be fulfilled: a
minimum eye displacement of 0.1°, a minimum velocity of 30°/s, and a
minimum acceleration of 8000°/s².
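The three detection criteria can be restated as a simple check. This is only a sketch of the thresholds listed above: the internal algorithm of the eye tracker's parser is not described in this section, and treating all three criteria as a joint condition on peak values is an assumption. Names are hypothetical.

```python
# Thresholds quoted in the text.
MIN_DISPLACEMENT_DEG = 0.1      # minimum eye displacement [deg]
MIN_VELOCITY_DEG_S = 30.0       # minimum velocity [deg/s]
MIN_ACCELERATION_DEG_S2 = 8000.0  # minimum acceleration [deg/s^2]

def meets_saccade_criteria(displacement_deg, peak_velocity_deg_s, peak_acceleration_deg_s2):
    """Return True if an eye movement satisfies all three thresholds (assumed joint test)."""
    return (displacement_deg >= MIN_DISPLACEMENT_DEG
            and peak_velocity_deg_s >= MIN_VELOCITY_DEG_S
            and peak_acceleration_deg_s2 >= MIN_ACCELERATION_DEG_S2)

print(meets_saccade_criteria(2.0, 150.0, 12000.0))  # a typical reading saccade
print(meets_saccade_criteria(0.05, 150.0, 12000.0)) # displacement too small
```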
Detected saccades were categorized into 3 main groups according to their
orientation (fig. 42a): horizontal saccades (those oriented within ±20° of the
horizontal axis, and directed either rightwards or leftwards), vertical saccades (those
oriented between 70° and 110° from the horizontal axis, i.e. within ±20° of the vertical
axis, and directed either upwards or downwards), and oblique saccades (those not fitting
into any of the preceding categories). Horizontal saccades were further subcategorized
(fig. 42b) into: progressions (horizontal saccades directed rightwards and less than 10° in
amplitude), regressions (horizontal saccades directed leftwards and less than 10° in
amplitude), and line-jumps (horizontal saccades directed leftwards and larger than
20° in amplitude).
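The categorization rules above can be sketched as a small function. The sign convention (positive dx rightwards, positive dy upwards), the inclusion of the boundary angles in the horizontal and vertical groups, and the function name are assumptions made for illustration. Horizontal saccades that match none of the three subcategories (for example, leftward saccades between 10° and 20°) are left without a subtype.

```python
import math

def classify_saccade(dx_deg, dy_deg):
    """Classify one saccade from its horizontal (dx) and vertical (dy) displacement.

    Assumed sign convention: dx > 0 is rightwards, dy > 0 is upwards.
    Returns (orientation, subtype); subtype is None unless the saccade is a
    progression, a regression, or a line-jump.
    """
    amplitude = math.hypot(dx_deg, dy_deg)
    # Angle from the horizontal axis, folded into the range 0..90 degrees.
    angle = abs(math.degrees(math.atan2(dy_deg, dx_deg)))
    if angle > 90.0:
        angle = 180.0 - angle

    if angle <= 20.0:
        orientation = "horizontal"
    elif angle >= 70.0:
        orientation = "vertical"
    else:
        orientation = "oblique"

    subtype = None
    if orientation == "horizontal":
        if dx_deg > 0 and amplitude < 10.0:
            subtype = "progression"
        elif dx_deg < 0 and amplitude < 10.0:
            subtype = "regression"
        elif dx_deg < 0 and amplitude > 20.0:
            subtype = "line-jump"
    return orientation, subtype

print(classify_saccade(3.0, 0.0))    # short rightward saccade
print(classify_saccade(-25.0, 1.0))  # long leftward saccade
```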
Saccade frequency was calculated as the total number of saccades performed
during an experimental session (i.e. 4 full pages of text). Saccade amplitude was
computed as the total absolute eye displacement (length) between the eye position
at the beginning of the saccade and its end position. Average saccade amplitude for
a given experimental session was calculated on the basis of the absolute amplitude
of all saccades performed during the session.
Figure 42. Saccade categorization. (a) Saccadic eye movements were categorized as horizontal
(oriented between –20° and +20° around the horizontal axis, and directed either rightwards or
leftwards), vertical (oriented between 70° and 110° around the vertical axis, and directed either
upwards or downwards), and oblique. (b) According to their direction and amplitude, horizontal
saccades were further subcategorized into progressions, regressions, and line-jumps.
52 Note that the %-correct to RAU transformation depends on sample size (in our case the total
number of words used in one session). Approximate %-correct scales are therefore based on the
average number of words computed across all sessions presented on the graphs.
53 Such an uncommon 4-level scale was used because it is much easier to judge first whether a
subject understood the main issue of a text and then to ask some questions to determine whether
the subject grasped only some parts of the text or really had a detailed understanding.
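The two saccade measures defined above can be sketched as follows, assuming each saccade is stored as a pair of (x, y) start and end positions in degrees. The data layout and function names are assumptions for illustration.

```python
import math

def saccade_amplitude(start, end):
    """Absolute eye displacement between the start and end positions of a saccade."""
    return math.hypot(end[0] - start[0], end[1] - start[1])

def session_summary(saccades):
    """Return (saccade count, mean amplitude) for one experimental session.

    `saccades` is a list of (start, end) pairs, each position being an (x, y) tuple.
    """
    amplitudes = [saccade_amplitude(start, end) for start, end in saccades]
    count = len(amplitudes)
    mean_amplitude = sum(amplitudes) / count if count else 0.0
    return count, mean_amplitude

# Two illustrative saccades (made-up coordinates, not thesis data).
example = [((0.0, 0.0), (3.0, 4.0)), ((3.0, 4.0), (3.0, -6.0))]
print(session_summary(example))
```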
3.5.2.4 Learning process
Whenever the results allowed it, learning curves were computed using the
exponential functions presented in Chapter 2. Significant learning effects were
determined using simple linear correlation (Pearson’s correlation).
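As an illustration of the correlation test, the sketch below computes Pearson's r between session number and a performance measure, together with the associated t statistic (n - 2 degrees of freedom) from which a p-value would be obtained. The exponential learning functions of Chapter 2 are not reproduced here, so no curve fit is shown. The function names are hypothetical and the sample values are made up for illustration; they are not thesis data.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient between two sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Made-up illustrative values: session number versus reading rate.
sessions = [1, 2, 3, 4, 5, 6]
rates = [5, 9, 12, 18, 21, 26]
r = pearson_r(sessions, rates)
print(f"r = {r:.3f}, t = {t_statistic(r, len(sessions)):.2f}")
```

A positive r indicates a measure that increases with session number (e.g. reading rate), a negative r one that decreases (e.g. response time).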
3.5.3 Experimental protocol
Three subjects (AD, female, 24 years old; DV, female, 23 years old; DS, male, 30
years old) participated in these experiments. All of them were tested monocularly
using their dominant eye. During each session, they had to read aloud the first 4 text
segments (about 100 words) of an article and their voice was recorded for further
analysis. After each text segment presentation, calibration was checked for possible
drifts and, if necessary, slightly corrected to ensure exact control of the viewing
window position during the entire experimental session. In very rare cases, these
controls revealed a significant drift, meaning that the position of the viewing window
was not stable during the presentation of the text segment. The results from the
corresponding segment were discarded and an additional text segment was read. A
qualitative comprehension test was performed at the end of each session (i.e., after
reading 4 successive text segments), by questioning the subject on the content of
the article. A different article was used in each session (none of the subjects read the
same article twice).
In experiment 1, several sessions were conducted using a viewing window
stabilized in central vision. This experiment lasted until subjects became familiar with
the task (reading pixelized text using a small viewing window for page navigation).
Experiment 2, testing eccentric reading, began once the subjects had adapted to
central reading. Possible learning effects were investigated by repeatedly performing
experimental sessions for a period of almost 2 months. Two sessions were conducted
each working day of the week (5 days/week). The duration of each experimental
session was variable throughout the experiment, but never exceeded 30 minutes.
Two sessions therefore represented less than 1 hour of daily training. This
experiment was stopped once reading scores asymptoted.
3.5.4 Experiment 1: Full-page reading in central vision
This experiment was designed to familiarize normal subjects with the unusual
task of reading pixelized, full-page texts using a small viewing window for page
navigation. For this experiment, subjects read 6 text segments per session instead of
the 4 segments per session used in the more difficult experiment 2.
Figure 43 presents reading performance in central vision versus session number
for each subject. All three subjects achieved perfect or close to perfect reading
scores (> 95% correct) already in the first sessions. No significant learning effect
was observed in the analysis of reading scores versus time. Reading rates improved
with time for all three subjects. Analysis of the experimental data revealed that the
average reading rate almost doubled from 71 to 122 WPM for AD. It improved from
65 to 89 WPM for DV, and from 60 to 72 WPM for DS. This improvement was,
however, statistically significant only for subject AD (Pearson’s correlation: r = 0.78,
p = 0.003). Interestingly, subject AD also achieved reading rates that were clearly
superior to those obtained by the remaining 2 subjects at the end of this experiment.
Reading rates measured for the same 3 subjects, in normal viewing conditions (for
articles read directly from the same newspaper), were significantly higher, ranging
between 160 and 180 WPM.
Figure 43. Reading performance during experiment 1 for three normal subjects. Full-page texts were
read using central vision (10° x 7° viewing window containing 572 pixels). Results expressed as: a)
reading scores expressed in RAU (left scale) and in approximate % (right scale) versus
experimental session number; and b) reading rates expressed in WPM (words/min) versus
experimental session number. The solid lines indicate the best fits to the data.
These data clearly demonstrate that full-page reading can be achieved under
conditions mimicking artificial vision in the central visual field. This means that
relevant information for reading could be transmitted and captured by the visual
system. Almost all words could be correctly deciphered. However, reading rates were
significantly lower than normal due to the increased difficulty of page navigation
using a restricted viewing window and to the fact that this viewing window contained
pixelized stimuli.
3.5.5 Experiment 2: Full-page reading in eccentric vision
Experiment 2 started when subjects had adapted to performing the task in central
vision. Based on the previous results on eccentric reading of isolated 4-letter words,
we expected eccentric full-page reading to require significant adaptation to reach
best performance. Between 55 and 68 sessions per subject were necessary to fulfill
this criterion.
Figure 44a presents individual reading scores versus session number for full-page
reading at 15° eccentricity. Experimental data were fitted with exponential regression
functions. Reading performance for 2 subjects improved enormously throughout the
experiment. During the first sessions, subjects DV and DS were able to identify only
about 13% of the words in the text, while at the end they achieved scores of 86%
and 98% correct, respectively. In contrast, subject AD already performed well in the
initial sessions (~ 85% correct), and ended up with almost perfect scores (~ 98%
correct). Her learning curve was therefore less spectacular. Reading score
improvements were highly statistically significant for all 3 subjects (Pearson’s
correlation: r = 0.57, p < 0.0001 for AD; r = 0.81, p < 0.0001 for DV; and r = 0.77,
p < 0.0001 for DS).
Figure 44. Reading performance during experiment 2 for 3 normal subjects. Full-page texts were
read using eccentric vision (with a 10° x 7° viewing window stabilized at 15° eccentricity in the lower
visual field, and containing 572 pixels). Results expressed as: a) reading scores expressed in RAU
(left scale) and in approximate % (right scale) versus experimental session number; and b)
reading rates expressed in WPM versus experimental session number. The solid lines indicate the best
fits to the data.
Figure 44b presents individual reading rates achieved during experiment 2. Large
inter-session variability can be observed for all 3 subjects (particularly in the case of
AD). Nevertheless, improvements were significant for all 3 subjects: AD from 5 WPM
to 26 WPM (Pearson’s correlation: r = 0.74, p < 0.0001), DV from 3 WPM to 14 WPM
(Pearson’s correlation: r = 0.81, p < 0.0001), and DS from 1 WPM to 28 WPM
(Pearson’s correlation: r = 0.90, p < 0.0001). At the end of experiment 2, reading
rates for eccentric reading were still significantly below values obtained in similar
conditions for central reading, and of course below normal reading rates. Yet, they
were remarkable when compared to what subjects achieved during the first sessions.
It is also important to note that reading rates continued to improve after almost two
months of training, suggesting that higher reading rates could still have been
achieved with more practice.
Even if word recognition scores and reading rates constitute helpful experimental
values to demonstrate changes in performance, they do not reflect the degree to
which text content was understood. Text comprehension is not easy to quantify, but
we tried to assess this parameter using the qualitative four-level scale described in
section 3.5.2.2. Figure 45 presents the evolution of text comprehension throughout
experiment 2, for the 3 subjects. In the beginning, subjects DV and DS experienced
major problems understanding the texts they read. ‘Good’ understanding could only
be achieved after 16 sessions or more. In contrast, subject AD achieved ‘good’ to
‘excellent’ text comprehension from the beginning. At the end of experiment 2, subjects
AD and DS systematically achieved ‘excellent’ text comprehension, but not
subject DV. These results fit well with the performance curves in figure 44, where
subject AD achieved high reading scores from the beginning and subject DV
finished with the lowest performance.
Figure 45. Text comprehension estimates during experiment 2 versus session number for the 3 subjects.
It is interesting to plot the results of text comprehension versus reading scores and
reading rates (fig. 46). Reading scores over 85% correct were required to reach ‘good’
to ‘excellent’ text comprehension levels. Text understanding seemed to be impossible
for scores below 60% correct. The distribution of comprehension levels against reading
rates was more variable. ‘Excellent’ or ‘good’ comprehension levels, for example, were
reached over a large range of reading rates, and even occasionally at reading rates
below 10 WPM. Text comprehension thus appeared to be more closely associated with
high reading scores than with high reading rates.
Figure 46. Text comprehension estimates versus reading scores and reading rates. Data collected
on the 3 subjects were merged. Box plots indicate median values, 25th and 75th percentile values
(colored box) as well as 10th and 90th percentile values (vertical bars). Circles indicate outliers.
Taken together, results from experiment 2 demonstrate that an important
learning process occurred for eccentric full-page reading. Subjects achieved
functionally useful eccentric full-page reading after almost two months of daily
training. The evolution was however expressed quite differently across subjects.
Subject DS, for example, improved impressively in each of the 3 measured
parameters, all through the experiment. In contrast, subject AD began the
experiment with relatively high reading scores and good text comprehension. In her
case, the learning process was best expressed by major reading rate improvements.
Subject DV, while showing significant improvements in all aspects, did not attain the
same level of performance as the other 2 subjects in the same period of time.
Figure 47. Gaze position recorded for each subject (one panel per subject: AD, DV, DS) while
performing the reading task in central vision (last session of experiment 1). The solid line
represents the trajectory of the center of the viewing window relative to the text (see fig. 28).
The panels on the right represent frequency histograms of the vertical coordinates of gaze
recorded every 4 ms. Gray bars indicate the position of the lines of text.
3.5.5.1 Analysis of eye movements
Illustrative samples of gaze position recordings are shown in figures 47 (central
reading) and 48 (eccentric reading). During the first training sessions for
eccentric reading, oculomotor behavior appeared quite inappropriate for the reading
task: large vertical saccades predominated. Subjects seemed unable to fixate words
or to roughly follow a line of text. Oculomotor behavior evolved gradually. Eye
movements intended to decipher single words were visible as early as the
5th session, especially for subject DV. At the end of the training period, all subjects
had developed a structured page navigation strategy.
Figure 48. Gaze position recorded for each subject (AD, DV, DS) during various sessions (1st, 5th,
15th, and last) in experiment 2. The solid line represents the trajectory of the center of the
viewing window relative to the text (see fig. 28). The panels on the right represent frequency
histograms of the vertical coordinates of gaze recorded every 4 ms. Gray bars indicate the
position of the lines of text.
It is interesting to compare final central and eccentric reading strategies. After
training, both eye movement patterns share several similarities. The viewing window
focused on consecutive words and across successive lines of text. Forward-directed
saccades shifting fixation from one word to the next (progressions) and saccades
shifting fixation from the end of one line to the beginning of the next (line-jumps)
could be clearly distinguished. Occasionally, subjects traced back on the same line
(regressions), to visualize again specific words. However, differences could also be
noted between central and eccentric reading. In eccentric vision, regressions
occurred more frequently. Moreover, horizontal saccades seemed less precise;
therefore more small corrective saccades were required.
Gaze stability was quantified by computing histograms of the vertical position of
the viewing window during each experimental session (plotted on the right side of
figs. 47 and 48). During the first sessions of eccentric reading, the histograms were
broad and roughly centered on the screen, with no evidence of successful
focusing on single lines of the text. The histograms for the last session were
completely different: a series of small peaks at regular vertical intervals
(corresponding to the spacing of the lines of text) could be observed. This analysis
also revealed that subjects tended to place the center of the viewing window slightly
below the lines, probably to minimize the eccentricity of the relevant part of the target
image.
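The gaze-stability histogram described above can be sketched as follows, assuming the vertical gaze coordinate is available as a list of samples taken every 4 ms. The bin width of 0.5° and the function names are assumptions for illustration; the binning actually used for figures 47 and 48 is not stated.

```python
import math
from collections import Counter

SAMPLE_INTERVAL_MS = 4  # gaze sampled every 4 ms, as stated in the figure captions

def vertical_histogram(y_samples_deg, bin_width_deg=0.5):
    """Count gaze samples per vertical bin; keys are the lower edges of the bins."""
    counts = Counter(math.floor(y / bin_width_deg) for y in y_samples_deg)
    return {index * bin_width_deg: counts[index] for index in sorted(counts)}

def dwell_time_ms(histogram):
    """Convert sample counts into time spent in each vertical bin."""
    return {edge: n * SAMPLE_INTERVAL_MS for edge, n in histogram.items()}

# Made-up vertical positions (degrees), not thesis data.
hist = vertical_histogram([0.1, 0.2, 0.6, 0.7, 0.8, -0.1])
print(hist)
print(dwell_time_ms(hist))
```

Narrow peaks in such a histogram correspond to gaze dwelling on individual lines of text, whereas a broad distribution reflects unstable vertical positioning.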
From the figures above, it is clear that the overall control of eye movements
improved impressively during the experiment. Gaze position recordings were used to
a)
b)
Figure 49. Mean cumulative length of the total trajectory described by each subject’s eye movements
versus session number. Distances along each one of the 2D axes were calculated separately for: a)
the vertical coordinate, and b) the horizontal coordinate of gaze position data. The solid lines indicate
the best fits to the data.
68
EXPERIMENTS ON READING
compute the mean cumulative length of the vertical (fig. 49a) and horizontal (fig.
49b) components of eye movements on the screen, for each experimental session.
Best fits to these data are also presented for each subject. The mean cumulative
length of vertical eye movements decreased dramatically for all subjects. Initial
values ranged between 35-48 m per text segment, while final values significantly
dropped to 5-9 m per text segment (Pearson’s correlation: r = 0.69, p < 0.0001 for
AD; r = 0.39, p = 0.04 for DV; and r = 0.84, p < 0.0001 for DS); a five-fold
decrease. Total vertical trajectories asymptoted within 3 to 10 sessions for subjects
DV and AD. Values for subject DS stabilized after about 36 sessions. Horizontal
trajectories decreased significantly only in subjects AD and DS (Pearson’s correlation:
r = 0.77, p < 0.0001; and r = 0.87, p < 0.0001, respectively). Compared to their
vertical counterparts, this decline was less impressive (from initial values ~ 21-27 m
per text segment, to ~ 7-14 m at the end of training) and more progressive (values
in AD stabilized after ~ 43 sessions, while in DS these were still decreasing at the
end of the experiment). Total horizontal trajectories for subject DV remained stable
throughout the experiment.
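The cumulative lengths reported above can be illustrated with a minimal sketch, assuming that gaze data are available as a list of (x, y) screen positions expressed in metres. The function name and the sample trace are hypothetical and are not taken from the original analysis software; the sketch only sums the absolute sample-to-sample displacements along each axis.

```python
from typing import Sequence, Tuple


def cumulative_axis_lengths(gaze: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (horizontal, vertical) cumulative path length of a gaze trace.

    Each sample is an (x, y) screen position. The result is in the same
    unit as the samples (metres is assumed here, to match the text).
    """
    horizontal = 0.0
    vertical = 0.0
    # Walk through consecutive pairs of samples and accumulate the
    # absolute displacement along each axis separately.
    for (x0, y0), (x1, y1) in zip(gaze, gaze[1:]):
        horizontal += abs(x1 - x0)
        vertical += abs(y1 - y0)
    return horizontal, vertical


# Hypothetical trace (invented for illustration only).
trace = [(0.0, 0.0), (0.25, 0.0), (0.25, -0.5), (0.5, -0.5), (0.5, 0.0)]
horizontal_m, vertical_m = cumulative_axis_lengths(trace)
print(horizontal_m, vertical_m)
```

Averaging such values over the text segments of a session would give one data point per session and per axis, comparable in kind to those plotted in figure 49.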
The distribution of saccades performed during the 1st, 5th, 15th and last eccentric
reading sessions is plotted in figure 50. During the first training session, bundles of
large vertical saccades were observed. Many of these eye movements were between
10° and 20° in amplitude, probably reflecting recurring (reflexive) attempts to bring
the stimulus image onto the fovea (foveating saccades), followed by an equivalent
saccade of opposite direction attempting to bring the viewing window back on the
stimulation screen. In the 5th session, these movements were no longer visible in
subjects AD and DV, and only a few of them were still observed in subject DS. The
amplitude of the remaining vertical saccades decreased gradually, to become hardly
visible at the end of training. In contrast, structured patterns of horizontal eye
movements developed in the 5th session in 2 subjects (AD and DV). From the 15th
session on, horizontal saccades predominated over the initially prevailing vertical
pattern. In the last training session, eye movements essentially consisted of
progressions, regressions, line-jumps, and other small corrective saccades.
Changes in saccade counts (frequencies) by category are plotted in figure 51.
The total number of vertical saccades decreased significantly over time in all subjects
(Pearson’s correlation: r = 0.58, p < 0.0001 for AD; r = 0.39, p < 0.01 for DV; and r
= 0.72, p < 0.0001 for DS). An approximately 15-fold drop was observed after 3, 20,
and 25 sessions in subjects DV, AD, and DS, respectively. Slighter (about 5-fold) but
significant (Pearson’s correlation: r = 0.72, p < 0.0001 for AD; r = 0.73, p < 0.0001
for DV; and r = 0.82, p < 0.0001 for DS) frequency decays were observed for
oblique saccades. In subjects AD and DS, the process was slower (33 and 38
sessions, respectively) than for vertical saccades. In subject DV, values were still
decreasing when the experiment ended. Evolution of horizontal saccade counts was
more complex and data could not be fitted with an exponential curve. In AD and DV,
these increased significantly during the first 15 sessions (respectively, Pearson’s
correlation: r = 0.60, p < 0.05 and r = 0.72, p < 0.01) and then significantly
decreased (respectively, Pearson’s correlation: r = 0.48, p < 0.001 and r = 0.31, p <
0.05). In subject DS, horizontal saccade counts increased significantly during the first
7 sessions (Pearson’s correlation: r = 0.91, p < 0.01) and then decreased
significantly (Pearson’s correlation: r = 0.82, p < 0.0001).
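As an illustration of how saccades might be sorted into the three categories counted here, the following sketch labels each saccade from its horizontal and vertical displacement (in degrees of visual angle). The 22.5° angular tolerance, the function names, and the sample data are assumptions made for this example; the criteria actually used in this study may differ.

```python
import math
from collections import Counter


def classify_saccade(dx: float, dy: float, tolerance_deg: float = 22.5) -> str:
    """Label a saccade 'horizontal', 'vertical', or 'oblique'.

    dx, dy: displacement of the saccade in degrees of visual angle.
    tolerance_deg: assumed angular tolerance around each cardinal axis.
    """
    # Fold the direction into the first quadrant:
    # 0 degrees is purely horizontal, 90 degrees is purely vertical.
    angle = math.degrees(math.atan2(abs(dy), abs(dx)))
    if angle <= tolerance_deg:
        return "horizontal"
    if angle >= 90.0 - tolerance_deg:
        return "vertical"
    return "oblique"


def saccade_amplitude(dx: float, dy: float) -> float:
    """Euclidean amplitude of the saccade, in degrees of visual angle."""
    return math.hypot(dx, dy)


# Hypothetical saccades (dx, dy), invented for illustration only.
saccades = [(5.0, 0.0), (0.0, -12.0), (0.0, 12.0), (3.0, 3.0), (-4.0, 0.5)]
counts = Counter(classify_saccade(dx, dy) for dx, dy in saccades)
print(counts["horizontal"], counts["vertical"], counts["oblique"])
```

Counting the labels per session gives the kind of frequency data discussed above, and averaging `saccade_amplitude` per category gives the corresponding mean amplitudes.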
Figure 50. Angular distribution of the saccades performed by each subject (AD, DV, and DS) during training for eccentric
reading at different times of the learning process (1st, 5th, 15th, and last training sessions).
Additional results were obtained following horizontal saccade subcategorization
(see fig. 52). The proportion of progressions increased significantly in all 3 subjects,
from values ranging between 45% and 65% in the first sessions up to about 65% by
the end of training (Pearson’s correlation: r = 0.74, p < 0.0001 for AD; r = 0.43, p <
0.001 for DV; and r = 0.74, p < 0.0001 for DS). Only subject AD reached an
asymptote (after 50 sessions).
Figure 51. Changes in saccade frequency versus session number for each subject during training of
eccentric reading, by saccade category: a) vertical saccades, b) horizontal saccades, and c) oblique
saccades. Average values in central vision (black dashed lines) are also shown for comparison.
Regressions behaved inversely. In the beginning of
training, they represented about 41%, 34%, and 43% of the total number of
horizontal saccades in AD, DV, and DS, respectively. These proportions significantly
decreased to 17%, 26%, and 27%, respectively (Pearson’s correlation: r = 0.70, p <
0.0001 for AD; r = 0.65, p < 0.0001 for DV; and r = 0.81, p < 0.001 for DS). At the
end of the experiment, the proportion of regressions was still decreasing in subjects
DV and DS, while in subject AD values stabilized after about 30 sessions. The total
number of line-jumps increased significantly with training in DV and DS (Pearson’s
correlation: r = 0.64, p < 0.0001 and r = 0.61, p < 0.0001, respectively). Line-jump
counts in AD were more variable, but also tended to increase over time (Pearson’s
correlation: r = 0.20, p = 0.1). Values in subjects AD and DV stabilized after around
16 and 27 sessions. In the case of subject DS, line-jump counts had not asymptoted
when the experiment ended.
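The subcategorization of horizontal saccades can be sketched as follows, under loudly stated assumptions: rightward saccades are taken as progressions, large leftward saccades as line-jumps, and the remaining leftward saccades as regressions. The 8° line-jump threshold, the function names, and the sample data are hypothetical and serve only to show how the proportions are computed relative to the total number of horizontal saccades.

```python
from collections import Counter
from typing import Dict, Iterable


def label_horizontal(dx: float, line_jump_min_deg: float = 8.0) -> str:
    """Label a horizontal saccade from its signed displacement dx (degrees).

    line_jump_min_deg is an assumed threshold, not a value from the study.
    """
    if dx > 0:
        return "progression"   # rightward, along the reading direction
    if -dx >= line_jump_min_deg:
        return "line_jump"     # large leftward return sweep
    return "regression"        # small leftward saccade


def horizontal_proportions(labels: Iterable[str]) -> Dict[str, float]:
    """Percentage of each subcategory among all horizontal saccades."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical session: 13 progressions, 5 regressions, 2 line-jumps.
displacements = [3.0] * 13 + [-2.0] * 5 + [-15.0] * 2
proportions = horizontal_proportions(label_horizontal(dx) for dx in displacements)
print(proportions)
```

With these invented numbers, progressions account for 65% of horizontal saccades and regressions for 25%, values of the same order as those reported at the end of training.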
Figure 52. Evolution of the different horizontal saccade subcategories versus session number for
each subject during training of eccentric reading: a) proportion (%) of progressions; b) proportion (%)
of regressions; and c) total number of line jumps. The proportions of progressions and regressions
were calculated on the basis of the total number of horizontal saccades. Average values for central
vision (black dashed lines) are also shown for comparison.
Average amplitude of the different saccade categories was also modulated
throughout the training period (fig. 53). For vertical saccades, amplitudes dropped
significantly from initial values of 5°-8° down to final values of around 3° (Pearson’s
correlation: r = 0.56, p <0.0001 for AD; r = 0.69, p < 0.0001 for DV; and r = 0.72, p
< 0.0001 for DS). Asymptotes were reached after 13, 20, and 27 sessions in subjects
DV, AD, and DS, respectively. Average amplitude of oblique saccades remained
stable in subject DS, and decreased very slightly but significantly in subjects AD and
DV (respectively, Pearson’s correlation: r = 0.34, p < 0.01 and r = 0.47, p <
0.0001). In contrast, average amplitude of horizontal saccades significantly increased
from about 5°, 4°, and 2.5° up to 7°, 6°, and 4° in subjects AD,
DV, and DS, respectively (Pearson’s correlation: r = 0.42, p < 0.001; r = 0.51, p
< 0.001; and r = 0.80, p < 0.0001). In subject DS, amplitudes never stabilized, while
in subjects AD and DV curves asymptoted after 20 and 23 sessions.
Figure 53. Changes in saccade amplitude (°) versus session number for each subject during training
of eccentric reading, by saccade category: (a) vertical saccades, (b) horizontal saccades, and (c)
oblique saccades. Average values for central vision (black dashed lines) are also shown for
comparison.
3.6 Discussion
3.6.1 Main outcome of these experiments
In central vision, the first set of reading experiments clearly demonstrates that
about 300 distinctly perceived points (pixels), distributed around a 10° x 3.5° visual
field, are necessary to read 4-letter words. For the second set of experiments,
subjects had to move a small viewing window on a computer screen to navigate
across full pages of pixelized text using their own eye movements. We kept the same
pixel density and used a viewing window with double vertical visual span (10° x 7°).
About 600 pixels were sufficient for useful full-page reading. These data extend the
work of Cha et al. (1992b), but reach a similar conclusion: 300-600 pixels are
needed to read small strings of characters. This pixel density appears therefore to be
an intrinsic criterion: more related to the type of stimuli to be deciphered than to the
presentation protocol.
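The statement that both viewing windows share the same pixel density can be checked with simple arithmetic. The sketch below uses the approximate pixel counts quoted above (about 300 and about 600), so the resulting density is itself approximate; the function name is hypothetical.

```python
import math


def pixel_density(n_pixels: float, width_deg: float, height_deg: float) -> float:
    """Pixels per square degree of visual angle for a rectangular window."""
    return n_pixels / (width_deg * height_deg)


# Approximate values quoted in the text.
isolated_words = pixel_density(300, 10.0, 3.5)  # 10 x 3.5 degree window
full_page = pixel_density(600, 10.0, 7.0)       # 10 x 7 degree window

print(round(isolated_words, 2), round(full_page, 2))
# Doubling the vertical span while doubling the pixel count leaves
# the density unchanged.
print(math.isclose(isolated_words, full_page))
```

Both windows therefore correspond to roughly 8.6 pixels per square degree, which is why the 300-pixel and 600-pixel conditions are described as having the same pixel density.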
In eccentric vision (≥ 10°) initial reading performance was poor, even at high
pixelization levels (see figs. 30 and 31). Therefore, the main factor limiting reading
performance at high eccentricities is not pixel resolution but rather the fact that only
part of the information is grasped by the visual system. The “crowding effect” seems
to play an important role, because isolated letters were perfectly recognized at
eccentricities where 4-letter words were impossible to read (see fig. 34). The use of
large fonts (and accordingly of large viewing windows) can attenuate the negative
effect of eccentricity on reading performance, but cannot totally compensate for it.
Strictly speaking, these results suggest that eccentric implant locations (≥ 10°)
would strongly impair reading performance. It might, however, be required to place
retinal prostheses at such high eccentricities to keep image distortion minimal. These
considerations raise the question of whether subjects can adapt to eccentric reading
and, as a consequence, achieve higher levels of performance with training. This is
particularly important since our experiments were conducted with untrained
observers, who were not used to such viewing conditions. Complementary
experiments were designed to investigate if eccentric reading, under conditions
simulating retinal implants, could be improved by learning or if it was limited by
fundamental properties of the visual system.
The studies presented in this chapter demonstrate that subjects are able to adapt
to the unnatural task of eccentric reading. Remarkable improvements could be
observed during the course of both studies, for all the subjects that participated in
the investigations. Therefore, these results demonstrate that useful reading can be
achieved in conditions mimicking a retinal implant. This outcome is very promising
for the future of retinal prostheses. However, it is not surprising, as several
experimental observations suggest that eccentric reading can be improved with
training.
Westheimer (2001) demonstrated that learning in peripheral vision is task-specific. Improvements were observed for stereoscopic, orientation, vernier acuity,
bisection, and time discrimination tasks, but not for resolution or Landolt C acuities.
Westheimer’s results confirmed those by other authors (Beard et al., 1995; Schoups
et al., 1995; Crist et al., 1997), indicating that spatial visual functions, which rely on
important processing in higher cortical areas, can be improved by training in the
visual periphery.
There is also extensive evidence in the low vision literature that educational
training is an important factor for successful eccentric reading by patients with
macular scotoma (see e.g. Peli, 1986). Nilsson (1990), for example, trained patients
with advanced age-related macular degeneration to use optical aids and
demonstrated that they greatly improved their reading capacities with only about 5
hours training. Half of the evaluated subjects had to use eccentric viewing due to an
absolute central scotoma. The situation encountered when educating low vision
patients to use optical aids for reading is however not identical to simulating an
eccentric retinal implant on normal subjects. Low vision patients are generally able to
use large parts of their retina situated relatively close to the fovea. The stimuli used
in this study were restricted to small viewing windows stabilized at high eccentricities
in the lower visual field. Moreover, the information content of the stimuli was
reduced (pixelized), which is not the case for the stimuli presented to low vision
patients through optical aids. Some improvement of eccentric reading capacities
could be therefore expected on the basis of clinical experience with low vision
patients, but it was still crucial to demonstrate that practical effects are significant for
the specific conditions of retinal implants.
Since our experimental setting was specifically designed to simulate vision as it would
be produced by a retinal implant, it obviously did not illustrate the functional
constraints and remaining retinal capacities found in conditions associated with
central scotomas and eccentric reading. Despite these methodological limitations, the
present results also offer remarkable indications of how mechanisms for eccentric
reading are constructed, at least in certain circumstances.
3.6.2 Analysis of the learning process
Subjects had to cope with several difficulties during their adaptation to reading
using an eccentric area of the visual field. For example, they had to suppress
unwanted reflexive eye-movements, focus attention to this peripheral region of the
visual field, and extract a maximum of information out of low resolution (pixelized)
stimuli. In addition, for full-page reading, subjects had to get accustomed to
scanning several lines of text using an eccentric and restricted viewing window and
reconstructing meaningful sentences out of words and phrase fragments. This list of
difficulties is surely not exhaustive, and all had to be surmounted to achieve the task.
Although it is impossible to analyze these factors in an isolated way, it is interesting
to discuss in more detail how some potentially contribute to the learning process,
while others limit reading performance.
3.6.2.1 The “crowding effect”
In the first eccentric reading study presented in the chapter, using pixelized
isolated 4-letter words, significant learning effects could be observed. These
improvements took about 70 experimental learning sessions. In these experiments,
page navigation was not required, suggesting that one important factor of the overall
learning process is independent of the accurate control of eye movements. Such
component is likely to be associated with performance improvements in deciphering
eccentric low-resolution stimuli.
One may wonder to what extent the lower reading rates observed in eccentric
vision could be attributed to the decreased spatial resolution in peripheral regions of
the retina. Visual acuity at an eccentricity of 15° is expected to be about 20/125
(Daniel & Whitteridge, 1961; Cowey & Rolls, 1974). Whittaker & Lovie-Kitchin
(1993) suggest the use of font sizes several times bigger than the acuity threshold,
to reach optimal reading rates. Bowers & Reid (1997) recommended print sizes of at
least four times the acuity threshold. The character size used in our experiments
corresponds to a visual acuity of about 20/400. This size was thus just adequate, and
did not significantly limit reading rates for eccentric reading in this study. Hence, the
low reading performance observed cannot be attributed solely to decreased
resolution in the periphery.
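The relation between character size and acuity threshold can be made explicit with a short calculation. The sketch below assumes that letter size scales linearly with the Snellen denominator, and takes the two values quoted above (20/125 for the expected acuity at 15° and 20/400 for the character size) at face value; the function name is hypothetical.

```python
def size_relative_to_threshold(character_denominator: float,
                               threshold_denominator: float) -> float:
    """Ratio of character size to acuity-threshold size.

    Both arguments are Snellen denominators (the X in 20/X). Letter size
    is assumed to be proportional to the denominator.
    """
    return character_denominator / threshold_denominator


# 20/400 characters viewed where acuity is expected to be about 20/125.
ratio = size_relative_to_threshold(400.0, 125.0)
print(ratio)
```

Under these assumptions the characters were about 3.2 times larger than the expected acuity threshold at 15°, that is, within the "several times" range suggested by Whittaker & Lovie-Kitchin and slightly below the four-fold value recommended by Bowers & Reid, which is consistent with describing the size as just adequate.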
Our measurements confirmed that eccentric recognition of single letters is much
easier than eccentric reading of entire words. These results fit well with findings
showing that the fovea and periphery have different center-surround interactions
that cannot be completely explained by the cortical magnification factor (Xing &
Heeger, 2000). The phenomenon of reduced discrimination in presence of
surrounding stimuli, known as the “crowding effect”, seems to be of cortical and not
of retinal origin (see e.g. Levi et al., 1985). Electrophysiological experiments in the
monkey, testing the modifications of the functional properties of the primary visual
cortex V1 accompanying perceptual learning, suggest that such learning is possibly due to a
concomitant decrease in the “crowding effect” (Crist et al., 2001). Moreover, the
decrease of crowding has been found to be related to attention (Leat et al., 1999), a
component that could potentially be improved by training, as already indicated by
experiments on visual search (Sireteanu & Rettenbach, 1995; 2000). A significant
decrease of the “crowding effect” is thus very likely to be an important part of the
overall learning process.
3.6.2.2 Text comprehension and the influence of context information
Interestingly, subjects had to achieve relatively high reading scores (> 85%
correct) in order to achieve useful text comprehension (i.e. ‘good’ or ‘excellent’ text
comprehension levels – see fig. 46). Lower reading scores were almost always paired
with insufficient text comprehension. It appears, thus, that reading scores must be
rather high to allow for useful reading. This finding, however, must be considered
with care, since it is based on a very simple qualitative comprehension test.
The influence of context information on reading performance remains a difficult
issue to assess. The comparison of the data on full-page reading to that on isolated
words is interesting in this respect. After training, we observed mean reading scores
of about 75% correct for isolated words whereas mean reading scores of about 94%
correct were obtained for full-page reading. While this difference is not statistically
significant for such a small number of subjects, it is in agreement with the hypothesis
that the use of context information helps reading performance, even in eccentric
vision (Fine & Peli, 1996; Fine et al., 1999).
3.6.2.3 Influence of the restricted viewing window
It is also interesting to examine the impact the restricted viewing window has on
reading performance. The width of the viewing window used in this study limited the
maximum visual span to about 6 letters, consequently reducing maximum reading
rates for central vision. Legge et al. (2001) suggested that the average visual span
shrinks from at least 10 letters in central vision to about 1.7 letters at 15°
eccentricity. This low value could be however increased upon prolonged observation
times. Therefore, at 15° eccentricity, the viewing window restriction was even more
pronounced, not because of our experimental limitation, but because of the ‘natural’
visual span reduction at high eccentricities. Subjects had therefore either to increase
the number of saccades to decipher a given word, or to increase fixation time to
extend the visual span; both strategies leading to lower reading rates, which is
consistent with our experimental observations.
3.6.2.4 Central versus eccentric reading
Even if subjects adapted well to the eccentric reading tasks, reading rates
appeared to be significantly limited by target eccentricity. Average reading rates
using eccentric vision were considerably lower (by a factor of 2.5 to 5.8) than those
achieved with central vision. Other authors have already reported low reading rates
for eccentric vision. Wensveen et al. (1995), for example, found that simulated
central scotomas resulted in dramatic decrements of reading rates. In their younger
subjects, a 3-fold reduction of reading rates was found for 8° central scotomas. It is,
however, difficult to compare these results quantitatively with ours, because of marked
differences in the experimental conditions. In this context it should also be noted
that, when the experiments were terminated, reading rates had not really
asymptoted. This means that subjects could still have improved reading rates with
prolonged training (see figs. 35b and 44b).
3.6.2.5 Oculomotor adaptation to eccentric viewing
Eccentric vision requires adaptation of oculomotor control to such specific viewing
conditions. Reflexive foveating mechanisms must be suppressed and saccadic eye
movements must be redirected to the new fixation locus.
Our data demonstrate that the pattern of eye movements changed impressively
throughout the learning process. Certain oculomotor adaptation stages appeared
consistently in all tested subjects. Two essential adaptation processes could be
distinguished: a faster ‘vertical’ phase aimed at suppressing reflexive foveation, and a
slower ‘horizontal’ phase dedicated to the restructuring of the horizontal eye
movement pattern.
During the first sessions, numerous vertical foveating saccades could be
observed. Interestingly, the first rapid ‘vertical’ adaptation process appears to include
two relatively distinct, parallel phases: one consisting in the reduction of vertical
saccade count, the second in the reduction of both oblique saccade count and
vertical saccade amplitude. According to our results, the former occurred promptly,
and the latter, although rapid, was more progressive. It is reasonable to presume
that both aim at reducing reflexive foveation, but each relies on distinct mechanisms,
as suggested by their different time-course.
The second, slower adaptation phase concerned the restructuring of the
horizontal eye movement pattern. In the initial sessions, no structured reading
sequence could be distinguished. The frequency of horizontal saccades increased
during the first 7 to 15 sessions, and then slowly decreased, while their average
amplitude increased all through the learning process. The proportion of progressions
increased gradually. It has been demonstrated that, in eccentric vision, the visual
span can increase with training (Chung et al., 2004). This should result in fewer but
longer saccades, as observed in our data. A significant reduction in the proportion of
regressive saccades was also observed in all subjects. As a rule, when reading
difficulty decreases, saccade length increases and the frequency of regressions
diminishes (Pirozzolo, 1983; Rayner, 1998). Subjects spontaneously reported that the
task became easier with training, resulting in better word recognition during
eccentric fixation. Thus, fewer regressions were necessary for deciphering. Line-jumps developed gradually and better calibration of progressive saccades could be
achieved with training. Hence, as better eccentric oculomotor control was developed,
fewer corrective saccades were needed. Two parallel, presumably related
phenomena might therefore be distinguished during the development of horizontal
saccade control. The first one corresponded to the adaptation of the amplitude of
horizontal saccades (mainly progressions) to the text presented. The second one
consisted in the reduction in number of regressions.
Our results showed that even when optimal eccentric reading performance had
been attained, oculomotor behavior was not optimal compared to that observed in
central vision (compare results for central and eccentric reading in figs. 51-53).
Although subjects adapted to the eccentric reading task, vertical saccades did not
disappear completely. More horizontal and oblique saccades were required in
eccentric vision than in central vision. In 2 out of the 3 subjects, more line jumps
were performed and horizontal saccades were smaller during eccentric reading. In
general, oblique saccades were smaller for eccentric than central viewing conditions.
These results clearly demonstrate that, even after extensive training, the
characteristics of saccades performed during eccentric and central reading differed. A
previous investigation in patients with central scotoma described a similar behavior
(Whittaker et al., 1991). Even when these patients had adapted to consistently direct
images onto the PRL, characteristics of eccentric saccades still differed from those of
foveating saccades. Typically, foveating saccades have shorter latencies and are
more accurate than eccentric, non-foveating saccades (Hallett, 1978; Zeevi & Peli,
1979; Whittaker & Cummings, 1990). Taken together, these findings confirm that
subjects suppress foveating saccades and then adapt non-foveating saccades to
reference the new fixation locus, in accordance with previous reports (Whittaker et
al., 1991).
3.6.3 Additional considerations
These investigations were designed to explore reading performance in conditions
mimicking artificial vision. Several questions concerning the choice of some of the
stimulation parameters are worth discussing.
Why use a font size such that only approximately 4 to 6 characters could be
viewed inside the restricted viewing window? Legge et al. (1985a) determined that
efficient reading requires grasping sequences of at least 4 to 8 characters at a
glance. Several other authors demonstrated that a viewing window containing more
letters could favor optimal reading performances, especially for low vision observers.
Fine & Peli (1996b) report that, on a scrolling display, while normal observers needed
4 to 5 characters to reach maximal reading rates, visually impaired observers
required 6 to 7 characters. The same authors (Fine et al., 1996) found that reading
rates increased with character sequences as large as 13 when using a fiber optic
stand magnifier. Beckmann & Legge (1996) studied reading speeds for normal and
low vision subjects on magnified text. They compared a condition requiring page
navigation with another one that did not. When page navigation was required, critical
viewing window sizes of 14 and 10 characters were required for normal and low
vision observers, respectively, to reach 85% of their maximum reading speed. These
values were much lower when only 50% of maximum reading speed was required:
4.7 characters for normal subjects and 3.5 characters for low vision subjects. Without
page navigation they obtained values in the order of 4.7 (normal) and 5.2 (low
vision) to reach 85% of the maximum reading speed while 1.2 (normal) and 2.0 (low
vision) characters were needed for 50% of the maximum reading speed. A high
number of visible characters seems thus to be favorable essentially for page
navigation. Nevertheless, there is another argument that favors using a small
number of characters in the viewing window, especially when eccentric vision is
concerned. Using fewer letters permits the use of larger fonts, letter size being an
important limiting factor for eccentric reading. The font size was thus chosen
because it represented a reasonable compromise between a low limit value
concerning the reading speed and the advantage of a small number of big letters for
eccentric reading.
Why use a proportionally spaced font and not an equally spaced font (e.g.
Courier) with enlarged letter spacing to minimize the “crowding effect” in eccentric
reading? Enlarged letter spacing has been found to be significantly beneficial for
eccentric reading (Latham & Whitaker, 1996; Chung, 2004), especially for
attenuating the “crowding effect” (Arditi et al., 1990; Toet & Levi, 1992).
Proportionally spaced letters are closer together than equally spaced ones, thus
favoring the adverse “crowding effect”. However, printed matter (books, journals,
etc.) is almost exclusively set in such proportional fonts. Therefore, although
the use of a proportionally spaced font increased the difficulty of the eccentric
reading task, it was essential to adapt our experiments to the reality of common
printed materials.
Finally, why not use RSVP or scrolled text presentation? Because they do not
represent realistic simulation conditions for retinal implants. Nevertheless, we
certainly admit that these would have been interesting examination conditions to
study eccentric reading in a general way.
3.7 Conclusions
With the first set of experiments of this study we demonstrated that the amount
of information conveyed by about 300 separate points (pixels), distributed around a
visual area of 10° x 3.5°, is sufficient to allow reading of isolated words in the central
visual field. We also demonstrated that reading of isolated words at high
eccentricities could be significantly improved with practice at the same, relatively
low, pixel resolution. With practice, one can learn to use a highly eccentric part of
the retina for reading, an abnormal task for normal subjects.
With the second set of experiments we confirmed that about 600 stimulation
points, distributed around a 10° x 7° visual field (the same image resolution as
above), allowed useful full-page reading abilities. A significant learning process was
however required to reach optimal performance with eccentric vision. One of the
main issues involved in the learning process was the adjustment of oculomotor
control in order to reference, as accurately as possible, the eccentric viewing window
used to navigate across text pages. Adaptation of eye movements seems to include
at least two parallel processes, a ‘fast’ vertical adaptation phase essentially involved
in suppressing reflexive foveation and a more progressive restructuring of the
horizontal eye movement pattern. Even after systematic training, eccentric reading
remains a difficult task resulting in low reading rates.
The evaluation of the reading task is particularly important if one is to assess the
rehabilitation prospects of a visual prosthesis since it is strongly associated with
vision-related estimates of quality of life and represents one of the main goals of low
vision patients seeking rehabilitation (Elliott et al., 1997; Wolffsohn & Cochrane,
1998; Margrain, 2000; McClure et al., 2000; Hazel et al., 2000).
3.8 Publications resulting from this research
Sommerhalder, J., Oueghlani, E., Bagnoud, M., Leonards, U., Safran, A.B., & Pelizzone, M. (2003).
Simulation of artificial vision: I. Eccentric reading of isolated words, and perceptual learning. Vision
Research, 43(3), 269-283.
Sommerhalder, J., Rappaz, B., de Haller, R., Pérez Fornos, A., Safran, A.B., & Pelizzone, M. (2004).
Simulation of artificial vision: II. Eccentric reading of full-page text and the learning of this task.
Vision Research, 44(14), 1693-1706.
Pérez Fornos, A., Sommerhalder, J., Rappaz, B., Pelizzone, M., & Safran, A.B. (2006). Processes
involved in oculomotor adaptation to eccentric reading. Invest Ophthalmol Vis Sci, 47(4), 1439-1447.
4 Experiments on Visuomotor Coordination
When it is obvious that the goals cannot be reached, don't adjust the
goals, adjust the action steps.
Confucius (551 – 479 BC)
4.1 Foreword
From everyday experience, we intuitively know that most of our motor actions are
visually guided. A variety of daily life and leisure activities involve gathering
information from the environment and using it to visually guide hand movements
towards a certain target (e.g. operating a telephone, locating and taking items from
a crowded shelf, locating and using items on a dinner table, etc…). Although these
actions seem simple enough, the processes involved are complex. Visuomotor tasks
require the combination of various abilities, like object identification/recognition but
also certain motor skills such as reaching, grasping, or pointing. Encoding spatial
information and using it to direct a particular motor response might impose various
constraints on a visual prosthesis.
4.2 Introduction
To reach towards an object, the nervous system must solve several problems.
The target towards which the motor action is to be directed must be identified and
localized in space. In addition, the initial status of the actuator (limb) has to be
determined. The appropriate motor command can then be planned and generated.
Finally, throughout the movement, the motor command has to be closely monitored
and corrected if necessary. A considerable amount of work has been carried out to
understand the processes involved in each of these stages (for a comprehensive
review, see Desmurget et al., 1998).
The first step in reach-to-grasp tasks consists in identifying the target and in
determining its position in space as accurately as possible. Several systems interact
during this process. Simply speaking, an initial ‘eye-centered’ representation of the
target is constructed by combining visual input encoding the topographic features of
the stimulus within the retina with extra-retinal signals monitoring eye position.
Among these extra-retinal signals, it has been demonstrated that non-proprioceptive
signals monitoring gaze displacement play the most important role in target
localization (Sparks & Mays, 1983; Guthrie et al., 1983). A slighter, but significant
contribution of ocular proprioception has also been demonstrated (Gauthier et al.,
1990; Blouin et al., 1996), especially in the absence of a structured visual
background (Blouin et al., 1993). Then, all this information is translated to a ‘body-centered’ reference frame, which is compatible with the action to be performed. This
is achieved by integrating the initial ‘eye-centered’ representation of the target with
proprioceptive signals encoding the position of the body in extrapersonal space. This
proprioceptive information is gathered mainly from the vestibular system as well as
from the muscles of the neck and the body (Dichgans et al., 1974; Jeannerod, 1981;
Jeannerod, 1984; Roll et al., 1986).
Vision and proprioception are the main sources of information used by the
sensorimotor system for planning and executing motor commands (Prablanc et al.,
1986; Pelisson et al., 1986). The influence of each of these information sources on
visuomotor performance has been thoroughly investigated. Movement accuracy is
obviously maximized when both sensory inputs are available (Rossetti et al., 1994;
Rossetti et al., 1995). Early deafferentation studies revealed that without
proprioception and vision of the actuating limb, monoarticular pointing tasks can still
be performed with relative accuracy (Kelso & Holt, 1980; Rothwell et al., 1982; Bizzi
et al., 1984), but multi-joint movements are severely impaired (Bossom, 1974; Taub
et al., 1975; Rothwell et al., 1982). If vision of the arm is prevented during
movement execution, pointing accuracy decreases (Merton, 1961; Held & Freedman,
1963; Foley & Held, 1972). Allowing vision of the hand only before movement
initiation already resulted in a significant improvement of movement accuracy both in
normal subjects (Prablanc et al., 1979; Rossetti et al., 1994; Desmurget et al., 1997)
and in deafferented patients (Ghez et al., 1995). Rossetti et al. (1995) explored the
effect of introducing a sensory conflict between visual and proprioceptive information
sources in a pointing task. The initial position of the hand was optically displaced
with prisms while the vision of targets was kept undistorted. Vision of the limb was
not available during the movement. Results indicated a systematic pointing error
directed opposite to the optical visual displacement of the hand. The perturbation
remained, however, undetected by most subjects. A more recent study used virtual
reality to isolate both information sources (Lateiner & Sainburg, 2003). Results
revealed that visual (virtual) input predominated over proprioceptive (real) input
when adjusting the direction of movement.
Altogether, these studies demonstrate that both visual and non-visual information
are used in conjunction during visuomotor tasks. Continuous monitoring of the motor
apparatus significantly contributes to the accuracy of goal-directed movements. In
addition, when visual and non-visual sensory signals diverge, visual input appears to
be privileged.
4.2.1 Vision and visuomotor coordination
The findings summarized previously substantiate the fact that visual information is
continuously used by the nervous system to solve the different issues it faces when
performing visuomotor tasks. Yet, an important question remains open: What kind of
visual information is used and in which context?
Based on the different visuomotor deficits observed in animals and patients as a
consequence of localized brain damage, Schneider (1969) postulated that two
distinct neural mechanisms are involved when reaching towards objects: one
responsible for target identification and another responsible for target localization.
Ungerleider & Mishkin (1982) further specified that visual information was selectively
processed to encode intrinsic (size, shape, weight, etc.) and extrinsic (distance,
location, etc.) target cues in the inferior temporal and posterior parietal cortex,
respectively (see also Jeannerod & Biguer, 1982). Following this line of thought,
Paillard (1982) subdivided reach-to-grasp tasks into the distinct behavioral elements
illustrated in figure 54.
Figure 54. Illustration of visuomotor behavior during a reach-to-grasp task. Reaching behavior is
functionally separated into two parallel visual channels, one dedicated to target identification and the
other to encoding target location. Modified from Paillard (1982). With permission of MIT Press.
Goodale & Milner (1992) reformulated this idea of separate processing pathways
focusing more on output task requirements than on target characteristics. They
suggested that both the ventral (V1 → inferior temporal cortex) and dorsal (V1 →
posterior parietal cortex) streams process all visual information (structure and
location), but each selectively transforms it with different purposes: perception
(object recognition/identification) or action (intervening in visuomotor actions
directed at such objects). Figure 55 illustrates this functional segregation along the
entire visual pathway.
According to the previous view, if all visual information is processed through the
parallel streams, central and peripheral vision should contribute in a particular
manner to each visuomotor mechanism. There is substantial evidence from
psychophysical experiments outlining the differential roles of foveal and peripheral
vision during visual search for potential targets. On one hand, a number of studies
suggest that detailed object information is primarily coded in the fovea and its
surroundings. Parker (1978) explored eye movement behavior during a picture
recognition task. In his experiments, subjects had to detect changes in a visual scene
consisting of a matrix of 6 objects separated from each other by 10°. Results showed
that almost all objects had to be fixated in order to detect a change in the scene.
Later, Nelson & Loftus (1980) demonstrated that a particular feature of a scene is
more likely to be detected when it has been directly or closely (within 2.6°) fixated.
Other studies have confirmed these findings, indicating that foveal processing is
required for encoding detailed object features (see e.g. Nodine et al., 1979; De Graef
et al., 1990; Hollingworth et al., 2001; Henderson & Hollingworth, 2003). On the
other hand, it has been suggested that the perceptual span in scene perception is
larger than it is for reading (Rayner & Pollatsek, 1992; Henderson et al., 1997). In
other words, during visual search, meaningful information can apparently be
extracted from relatively broad visual areas within a single fixation. This
assumption arises from a number of experimental observations. In the
previously mentioned study by Parker (1978), subjects were able to detect scene
changes without directly fixating the object concerned in 85% of the trials. In
addition, changed objects were fixated sooner than unchanged objects. This
reveals that useful information about changes of the visual environment can be
gathered in the peripheral visual field (as far as 10° eccentricity), and that such
information can be used either to elicit a perceptual response (such as reaching
towards an object) or to redirect successive fixations.
Other studies confirm that subjects tend to fixate areas of the visual scene containing
meaningful information based on information gathered in the periphery of the visual
field (Antes, 1974; Loftus & Mackworth, 1978). A later experiment using artificial
scotomas in normal subjects demonstrated that good object identification accuracy
could be achieved with eccentric vision (Henderson et al., 1997). However, eye
movement behavior was disrupted, probably due to the additional processing
required for identifying objects in the periphery. All this evidence indicates that,
while visual information extracted from the fovea and its surroundings is beneficial
for object identification/recognition, peripheral areas of the visual field play an
important role in identifying ‘informative’ regions of the visual field and subsequently
redirecting eye movements towards these areas.
Figure 55. Simplified diagram of the information flow along the visual pathway. Visual input
stimulating the retina reaches V1 through the lateral geniculate nucleus (LGN). From V1, visual
information projects through the ventral stream to the posterotemporal (occipito-temporal) cortex,
and through the dorsal stream to the posteroparietal cortex. The dorsal stream also receives retinal
information projected by the superior colliculus through the pulvinar. Reprinted from CURR OPIN
NEUROBIOL, 14(2), Goodale & Westwood, An evolving view of duplex vision: separate but interacting
cortical pathways for perception and action, pp. 203-211, Copyright 2004, with permission from Elsevier.
All these findings fit well with the anatomical and physiological characteristics of
the central and peripheral areas of the visual field (refer to Chapter 1 for details). It
is thus not surprising that central vision is functionally specialized in high-resolution
sampling of spatial information, while peripheral vision mainly contributes to
encoding dynamic (movement) and relative distance cues. It can therefore be
concluded that central vision plays an important role in target identification and in
fine limb trajectory adjustments, while peripheral vision is mainly responsible for
determining and controlling eye and hand movements towards the target’s location
(Paillard, 1982; Sivak & Mackenzie, 1992; Hooge & Erkelens, 1999; Cornelissen et
al., 2005).
4.2.2 Visuomotor coordination in the context of artificial vision
Our previous experiments exploring the reading task do not allow us to directly
extrapolate to tasks involving visuomotor coordination. Clinical and epidemiological
findings have revealed that several visual factors have different effects on vision-related daily activities (see e.g. Owsley et al., 2001; West et al., 2002; Nelson et al.,
2003). Visual acuity deficits (i.e. disorders of the central part of the visual field)
affect visuomotor tasks requiring detailed vision, such as those involving object
identification. In addition, defects of the peripheral visual field affect localization and
orientation abilities, critical for visuomotor coordination.
Only a few qualitative experiments have been carried out to explore visuomotor
coordination tasks in conditions mimicking artificial vision (Humayun, 2001; Hayes et
al., 2003; refer to Chapter 1 for details). However, fundamental aspects of prosthetic
vision (stabilized retinal projection and probable eccentric implant location) have
been neglected. More recently, the same group presented results of psychophysical
testing involving simple tasks on a blind subject wearing an epiretinal prosthesis
containing 4 x 4 stimulating electrodes (Humayun et al., 2003). The subject was able
to perform the tasks⁵⁴ with 75% - 100% accuracy. An important learning effect was
also observed. This study was designed to describe quantitatively the outcome of
one of the first chronic implantation trials on humans. It also contributed to the
validation of the epiretinal prosthesis concept and gave important information on the
nature and type of percepts that could be evoked with electrical stimulation.
Nevertheless, this study did not give much information regarding rehabilitation
prospects of such devices. The tasks evaluated were very simple, and did not mimic
realistic, everyday situations. Furthermore, the different aspects of visual function
were not entirely considered.
The main objective of the investigation presented in this chapter was to
systematically assess visuomotor coordination performance in conditions mimicking
artificial vision. For this purpose, we developed a portable system capturing images
representing different portions of the environment, and projecting pixelized stimuli onto
defined, stabilized areas of the visual field. In a first set of experiments we
determined the minimum requirements for useful visuomotor coordination,
by studying the influence of stimulus information content (pixelization level and
available field of view) on visuomotor performance with a viewing window stabilized
in central vision. A second set of experiments was dedicated to exploring whether
naïve subjects could learn to perform visuomotor coordination tasks in
eccentric vision, under similar experimental conditions.
⁵⁴ Performance was evaluated on tasks using either computer-controlled electrical stimulation or
camera-controlled electrical stimulation. In the first case, the subject had to identify the order and
location of phosphenes evoked with different electrodes activated sequentially. In the second case,
the subject used a head-mounted camera to control the electrical stimulation pattern. The tasks
evaluated were the detection of the presence or absence of ambient light, motion detection, target
localization, and simple shape recognition.
4.3 Specific methods for the experiments on visuomotor
coordination
4.3.1 Subjects
Subjects were recruited either from the staff of the Ophthalmology Clinic of the
Geneva University Hospitals or from the staff of the University of Geneva. They were
familiar with the purpose of the study. All had visual acuity better than 16/20 in the
tested eye, normal ophthalmologic status, and normal haptic⁵⁵ perception.
4.3.2 Visuomotor tasks
Two tests were specially developed for our studies based on common clinical
tests and on previous studies of visuomotor coordination in natural tasks and settings
(Pelz, 1995; Purdy et al., 1999; Land et al., 1999; Pelz & Canosa, 2001; Pelz et al.,
2001; Humayun, 2001).
Figure 56. Tasks used to assess visuomotor coordination performance: a) The chips task consisted in
placing wooden chips following a randomized model; b) The LEDs task consisted in pointing at
random targets as accurately as possible.
4.3.2.1 The chips task
First, we explored a simple manipulation task (fig. 56a). Subjects had to recognize
simple figures and place them in the correct position and orientation on
randomized templates. The template was a 5 x 4 array of square wooden chips, each
representing one of 20 different black figures drawn on a white background. The
chips measured 8 x 8 cm² and were covered with a smooth and transparent plastic
sheet to remove tactile cues. The total working surface of the model was, therefore,
40 x 32 cm². The figures appearing on the chips measured 3 - 6 cm along each axis.
⁵⁵ Haptic perception involves both tactile perception through the skin and proprioceptive perception of
the position and movement of joints and muscles.
For each experimental run, a custom program randomly determined the position
of each chip on the template (none of the subjects was presented with the same
chips configuration twice). At the beginning of each run, the randomized template
was placed in front of the subject, who received a box containing a copy of each
chip. The score (1 = correct position and orientation; 0.5 = correct position but
wrong orientation; 0 = wrong position) for each chip was noted once the subject
released it at a certain position on the template. Then, the examiner removed the
placed chip to avoid the use of structural and tactile cues for identifying and
positioning the remaining chips. The experiment ended once subjects placed all chips
and the total time needed to complete the task was recorded.
Performance was determined on the basis of the %-score of correctly placed
chips and of the mean chip placement time (calculated as the total time required for
placing all chips divided by the number of correctly placed chips, in s). Similar to the
reading experiments, %-correct scores were transformed to RAU for all
statistical analyses, but equivalent %-scales are shown on the right axes of the
graphs for clarity.
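As an illustration, the two performance measures of the chips task and the RAU transformation can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the analysis code used in the study: the %-score is assumed to be the sum of the per-chip scores divided by the number of chips, a chip is assumed to count as "correctly placed" only when its score is 1, and RAU is assumed to be the rationalized arcsine transform of Studebaker (1985). The function names and the example run are hypothetical.

```python
import math


def chips_performance(chip_scores, total_time_s):
    """Return (%-score, mean chip placement time in s) for one run."""
    # Assumption: %-score = sum of per-chip scores (1, 0.5 or 0) over the chip count.
    percent = 100.0 * sum(chip_scores) / len(chip_scores)
    # Assumption: only chips scored 1 count as "correctly placed".
    n_correct = sum(1 for score in chip_scores if score == 1)
    mean_time = total_time_s / n_correct if n_correct else float("inf")
    return percent, mean_time


def rau(n_correct, n_items):
    """Rationalized arcsine transform of n_correct out of n_items (assumed formula)."""
    theta = (math.asin(math.sqrt(n_correct / (n_items + 1)))
             + math.asin(math.sqrt((n_correct + 1) / (n_items + 1))))
    return (146.0 / math.pi) * theta - 23.0


# Hypothetical run: 17 chips correct, 2 with wrong orientation, 1 misplaced, 170 s in total.
scores = [1] * 17 + [0.5] * 2 + [0]
percent, mean_time = chips_performance(scores, total_time_s=170.0)
print(percent, mean_time, rau(10, 20))
```

The transform stretches the scale near 0% and 100%, which is why scores close to the ceiling are better suited to statistical comparison once expressed in RAU.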
4.3.2.2 The LEDs task
Second, we examined a simple pointing task (fig. 56b). Subjects had to point with
the finger, as precisely as possible, at bright targets lighting up randomly on a
rectangular panel in front of them. The panel was a 6 x 4 array of red light-emitting
diodes (LEDs) surrounded by aluminum reflectors. The array of LEDs was covered
with a smooth red plastic filter so that potential targets could not be seen when
not lit. A transparent 19.7” touch screen (3M Touch Systems, Massachusetts, USA;
refer to Appendix B for detailed specifications) was placed over the filtered LEDs
panel. The viewable area of the touch screen measured 39 x 31.5 cm². The
center-to-center distance between LEDs was 6 cm, and the distance between the touch
screen borders and the outermost LEDs was 4 cm. When lit, the circular bright spot
of each LED had a diameter of approximately 1 cm.
Subjects were presented with a different random target order for each
experimental run (all 24 targets were lit and the same target was never lit twice).
Random target order per subject and per run was pre-established by a custom
program. The touch screen registered the position where the subjects pointed in
screen coordinates (pixels), and these values were transformed to cm⁵⁶. Pointing
position and time for each target were recorded.
⁵⁶ A screen resolution of 640 x 480 pixels (as used in these experiments) resulted in horizontal and
vertical touch screen resolutions of 16.46 and 15.21 pixels/cm, respectively.
Performance was measured as mean pointing error (calculated as the cumulative
pointing error⁵⁷ for all targets divided by the number of targets, in cm) and mean
pointing time per target (calculated as the total time required for pointing at all
targets divided by the number of targets, in s).
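The conversion from touch-screen pixels to cm and the two performance measures of the LEDs task can be sketched as follows. This is a hedged illustration only: the "absolute distance" between target and pointing position is assumed to be the Euclidean distance, target and touch coordinates are assumed to share the same origin, and all function names and example coordinates are hypothetical.

```python
import math

# Touch-screen resolutions reported for the 640 x 480 setting (pixels/cm).
PX_PER_CM_X = 16.46
PX_PER_CM_Y = 15.21


def pixels_to_cm(x_px, y_px):
    """Convert a touch position from screen pixels to cm."""
    return x_px / PX_PER_CM_X, y_px / PX_PER_CM_Y


def pointing_error_cm(target_cm, touch_px):
    """Distance (cm) between a target and the touched position (assumed Euclidean)."""
    touch_x, touch_y = pixels_to_cm(*touch_px)
    return math.hypot(touch_x - target_cm[0], touch_y - target_cm[1])


def leds_performance(targets_cm, touches_px, total_time_s):
    """Return (mean pointing error in cm, mean pointing time per target in s)."""
    errors = [pointing_error_cm(t, p) for t, p in zip(targets_cm, touches_px)]
    return sum(errors) / len(errors), total_time_s / len(errors)


# Hypothetical run with two targets: one exact hit, one touch 3 cm right and 4 cm below.
targets = [(4.0, 4.0), (10.0, 4.0)]
touches = [(4.0 * PX_PER_CM_X, 4.0 * PX_PER_CM_Y),
           (13.0 * PX_PER_CM_X, 8.0 * PX_PER_CM_Y)]
mean_error, mean_time = leds_performance(targets, touches, total_time_s=12.0)
print(mean_error, mean_time)
```

Because the horizontal and vertical resolutions differ, each axis has to be converted separately before the distance is computed.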
4.3.3 Effective field of view
For the reading experiments, 5 to 7 characters were visible at a glance, and this
visual information was projected on a retinal surface of 3 x 2 mm². This corresponds
to a quite narrow tunnel vision with high resolution.
Figure 57. Effect of varying simultaneously the size of the effective field of view and the resolution of
the image projected in the viewing window for the chips task. Columns: visual fields of 8.25°x5.8°
(small), 16.5°x11.6° (medium), and 33°x23.1° (large). Rows: resolutions of 17920, 498, and 124 pixels.
Previous research has demonstrated that the size of the effective field of view has
a significant influence on performance for tasks involving visual search and
orientation, such as visuomotor and mobility tasks (see e.g. Kerkhoff, 1999; Zihl,
2000; Szlyk et al., 2001; Rubin et al., 2001; Nelson et al., 2003). Compared to
reading, these tasks require much larger portions of the visual scene to be visible at
a glance. Figure 57 shows several examples of the visual scene for the chips task
subtending different effective fields of view. In this case, the pertinent visual
information is typically ‘what is within hand’s reach while sitting at a table’. Seeing the
whole scene at once (large field of view) requires an image resolution close to
normal vision, which is clearly out of reach of envisioned prostheses. The smallest
field of view provides useful information at lower resolutions, but implies time-consuming ‘tunnel’ scanning of the scene. The various compromises between the
effective field of view available and resolution (pixelization level) must, therefore, be
thoroughly investigated when considering visuomotor tasks.
⁵⁷ Pointing error for each target was calculated as the absolute distance between the actual target
location and the position where the subject pointed.
In our experiments, we examined this issue by projecting different portions of the
environment (subtending different effective fields of view) onto a stabilized viewing
window. This was achieved by modifying the frame size of the image captured by the
webcam. In our mobile setup the stimulation screen subtended 40° x 30° of visual
field, and was set to a resolution of 640 x 480 pixels. Using its standard optics, the
webcam captured images of the environment corresponding to 33° x 24.75°.
Therefore, when the webcam’s image frame size was set to the same resolution as
the screen (640 x 480), 1° on the screen corresponded to 0.83° of the environment.
Consequently, the 10° x 7° viewing window represented a portion of the
environment subtending 8.25° x 5.8°. Similarly, setting the image frame size of the
webcam to 320 x 240 pixels⁵⁸ or to 160 x 120 pixels⁵⁹ resulted in portions of the
environment of 16.5° x 11.6° or 33° x 23.1° being projected onto the 10° x 7°
viewing window.
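The arithmetic behind these effective fields of view can be reproduced with a few lines of Python. The sketch below assumes that every webcam frame size covers the full 33° horizontal field of the camera and is displayed pixel for pixel on the 640 x 480 screen; the constant and function names are hypothetical.

```python
SCREEN_DEG_X = 40.0    # horizontal visual field subtended by the stimulation screen
SCREEN_PX_X = 640      # horizontal screen resolution
WEBCAM_DEG_X = 33.0    # horizontal field of the environment captured by the webcam
WINDOW_DEG = (10.0, 7.0)  # viewing window size on the screen


def environment_deg_per_screen_deg(frame_width_px):
    """Degrees of environment shown per degree of screen for a given frame width."""
    # Assumption: the frame is shown pixel for pixel, so it spans this many screen degrees.
    screen_deg_covered = frame_width_px * SCREEN_DEG_X / SCREEN_PX_X
    return WEBCAM_DEG_X / screen_deg_covered


def effective_field_of_view(frame_width_px):
    """Portion of the environment (deg) projected onto the 10 x 7 deg viewing window."""
    scale = environment_deg_per_screen_deg(frame_width_px)
    return WINDOW_DEG[0] * scale, WINDOW_DEG[1] * scale


for width in (640, 320, 160):
    print(width, environment_deg_per_screen_deg(width), effective_field_of_view(width))
```

The vertical values obtained this way (5.775°, 11.55° and 23.1°) correspond to the rounded figures quoted in the text.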
4.3.4 Experimental setup
The details of the simulation procedures have already been described in the
General Methods Chapter. The apparatus used corresponded to the mobile
setup. The image-processing algorithm used was real-time square pixelization.
The experimental procedure was very similar to that used for the reading
experiments. Briefly, subjects were seated wearing the mobile setup (see fig. 58). At
the beginning of each run, a standard 9-point calibration was performed and the
experimental sequence started afterwards. The viewing window (10° x 7°)
contained pixelized images extracted from the frames captured by the webcam. Gaze
position compensation was used to project this viewing window onto defined (central
or eccentric) areas of the retina. The background of the remaining screen surface
was gray (corresponding to the mean luminosity of the visual scene). Tests were
performed monocularly (using the dominant eye). Test sessions frequently included
several runs, but they never lasted longer than 30 minutes to avoid subjects’ fatigue.
Eye movement data for each experimental session were recorded and stored for
further analysis.
Figure 58. One of the subjects wearing the mobile setup during the chips task.
⁵⁸ 1° on the screen = 1.65° of the environment.
⁵⁹ 1° on the screen = 3.3° of the environment.
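For readers unfamiliar with square pixelization, the principle can be sketched as follows. This is not the real-time implementation described in the General Methods Chapter: it is a plain-Python illustration that assumes each square tile is filled with the mean gray level of the pixels it covers, and the function name and test image are hypothetical.

```python
def pixelize(image, block):
    """Replace every block x block tile of a grayscale image by its mean value.

    `image` is a list of rows of gray levels; tiles at the right and bottom
    edges may be smaller when the image size is not a multiple of `block`.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, height, block):
        for left in range(0, width, block):
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            mean = sum(image[r][c] for r in rows for c in cols) / (len(rows) * len(cols))
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out


# Hypothetical 4 x 4 image reduced to 2 x 2 square "pixels".
image = [[0, 2, 4, 6],
         [2, 4, 6, 8],
         [1, 1, 3, 3],
         [1, 1, 3, 3]]
pixelized = pixelize(image, block=2)
print(pixelized)
```

With the screen set to 640 pixels over 40° (16 pixels per degree), the 10° x 7° viewing window spans 160 x 112 = 17920 screen pixels. Tiles of 3, 6, 9 or 12 screen pixels per side would leave about 1991, 498, 221 or 124 tiles in the window, which is consistent with the pixelization levels used in the experiments, although the tile sizes themselves are an inference and not stated in this chapter.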
4.4 Acute experiments on visuomotor coordination
Visuomotor performance in central vision was assessed versus 2 variables of
prosthetic vision: the pixelization level and the effective field of view.
With the mobile setup, subjects could explore the environment (modify the
stimulus image displayed in the viewing window) in two ways: with eye movements
and/or by moving their head and trunk. It is interesting to examine how subjects
used these movements to cope with the different viewing conditions. Head
movement data for each experimental session were also recorded. This analysis,
however, does not directly concern the main purpose of this dissertation. Therefore,
the detailed analysis of eye and head movements is presented in Appendix D.
4.4.1 Experimental protocol
Three subjects (CR, male, 24 years old; AP, female, 27 years old; JS, male, 42
years old) participated in the experiments. Before starting the actual experimental
sequence, all subjects performed 3 control sessions for each task. These control
sessions were conducted in normal viewing conditions (subjects were not wearing
the mobile setup). These performance results were used as baseline measures for
‘normal’ visuomotor performance.
Table 2. Testing sequences (Latin square permutation) for the effective fields of view during
experiments 5 and 6.
Order  Chips                                          LEDs
       AP             CR             JS               AP             CR             JS
1      33° x 23.1°    16.5° x 11.6°  8.25° x 5.8°     16.5° x 11.6°  8.25° x 5.8°   33° x 23.1°
2      8.25° x 5.8°   33° x 23.1°    16.5° x 11.6°    8.25° x 5.8°   33° x 23.1°    16.5° x 11.6°
3      16.5° x 11.6°  8.25° x 5.8°   33° x 23.1°      33° x 23.1°    16.5° x 11.6°  8.25° x 5.8°
Subjects first performed all tests for the chips task, and then for the LEDs task.
Performance was measured with viewing windows presented at 5 pixelization levels⁶⁰
(17920, 1991, 498, 221, and 124 pixels) and 3 effective fields of view (8.25° x 5.8°,
16.5° x 11.6°, and 33° x 23.1°; see fig. 57). Three successive runs were performed
per experimental condition (a given effective field of view and pixelization level). The
order in which each effective field of view was presented to each subject was
permuted using a Latin square⁶¹ (see table 2). With each field of view, subjects
started with the easiest (highest) pixelization level and progressed towards the most
difficult (lowest) one. Once all pixelization levels for a given field of view were
completed, subjects progressed to the next one. Possible global learning effects
favoring a particular effective field of view would therefore be minimized, but would
still favor performance at low pixelization levels.
⁶⁰ Equivalent to those used in our previous study on reading of isolated 4-letter words.
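A Latin square of testing orders such as the one in table 2 can be generated and checked with a short script. The sketch below uses a simple cyclic construction, which is an assumption: the thesis does not state how its permutations were produced. Function names are hypothetical.

```python
def cyclic_latin_square(items):
    """Build an N x N Latin square by cyclically shifting the list of items."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def is_latin_square(square):
    """True if every row and every column contains each symbol exactly once."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok


fields = ["8.25 x 5.8", "16.5 x 11.6", "33 x 23.1"]
generated = cyclic_latin_square(fields)

# Orders used for the chips task in table 2 (rows: order 1-3; columns: AP, CR, JS).
table2_chips = [["33 x 23.1", "16.5 x 11.6", "8.25 x 5.8"],
                ["8.25 x 5.8", "33 x 23.1", "16.5 x 11.6"],
                ["16.5 x 11.6", "8.25 x 5.8", "33 x 23.1"]]
print(is_latin_square(generated), is_latin_square(table2_chips))
```

Each subject therefore meets every field of view once, and each field of view appears once at every position in the testing order.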
Results were calculated as the mean of the cumulative data of each subject ±
standard error of the mean (SEM). Statistically significant differences were
determined using standard (paired) t tests with a significance level of 0.05.
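The summary statistics named above can be sketched with the Python standard library. This is an illustration under assumptions, not the original analysis: the values are hypothetical, the pairing is assumed to be per subject across two conditions, and the significance check compares the t statistic with the tabulated two-tailed critical value for 2 degrees of freedom (three subjects) instead of computing a p value.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) of a list of values."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def paired_t(condition_a, condition_b):
    """Return (t statistic, degrees of freedom) of a paired t test."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    mean_d, sem_d = mean_sem(diffs)
    # Note: sem_d is zero when all differences are identical; the test is undefined then.
    return mean_d / sem_d, len(diffs) - 1


# Two-tailed critical t value for alpha = 0.05 and 2 degrees of freedom.
T_CRIT_DF2 = 4.303

# Hypothetical placement times (s) of three subjects in two viewing conditions.
low_resolution = [10.0, 12.0, 14.0]
high_resolution = [8.0, 9.0, 10.0]
t_value, df = paired_t(low_resolution, high_resolution)
print(mean_sem(low_resolution), t_value, df, abs(t_value) > T_CRIT_DF2)
```

With only three subjects the critical value is large, so a difference has to be very consistent across subjects to reach significance.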
4.4.2 Experiment 3: Manipulation – The chips task
Figure 59. Visuomotor performance versus number of pixels in the 10°x7° viewing window for 3
normal subjects performing the chips task. Three effective fields of view projected in the 10°x7°
viewing window are compared in central vision: 8.25°x5.8° (red plot), 16.5°x11.6° (blue plot), and
33°x23.1° (green plot). (a) Mean correct scores expressed in RAU ±SEM (left scale) and in % (right
scale). (b) Mean chip placement time (mean time required to identify and correctly place a chip)
expressed in s ±SEM. The solid black lines indicate mean performance results (±SEM) during control
sessions (normal viewing conditions).
Figure 59 compares mean visuomotor performance for the chips task, with each
effective field of view, versus number of pixels in the viewing window. Individual
performances in each experimental condition were established on the basis of 3
sessions. Chip placement scores were close to perfect (> 95%) in most viewing
conditions. Accuracy dropped below 95% only at 124 pixels with the 16.5° x 11.6°
field of view (blue plots in fig. 59), and at 221 and 124 pixels with the 33° x 23.1°
field of view (green plots in fig. 59).
⁶¹ A Latin square is an array of N x N elements arranged so that no row or column
contains the same element twice. These permutations are used to avoid learning effects favoring
performance in a certain experimental condition.
Mean chip placement time appeared to be more sensitive to pixelization level.
With the 8.25° x 5.8° field of view (red plots in fig. 59), values were statistically
equivalent across all pixelization levels. However, time tended to increase at 221 and
124 pixels. In addition, a non-significant learning effect was observed at 1991 and
498 pixels. With the 16.5° x 11.6° field of view, a significant slow-down was
observed at 498 (p = 0.04) and 221 pixels (p = 0.02), but did not persist at 124
pixels (p = 0.09) due to the high variability of the results. With the 33° x 23.1° field
of view, time increased significantly at 498 pixels (p = 0.02). This effect was also
visible at 221 pixels, but it was not significant (p = 0.1). The significant slow-down
manifested again at 124 pixels (p = 0.004). Even at the highest pixelization levels,
chip placement time was approximately 2 - 6 times slower with the 3 effective fields
of view than in normal viewing conditions (~ 2 s; p < 0.05; black solid lines in fig.
59).
Figure 60. Normalized visuomotor performance versus effective resolution of the environmental
space in the 10°x7° viewing window for 3 normal subjects performing the chips task. Three effective
fields of view projected in the 10°x7° viewing window are compared in central vision: 8.25°x5.8° (red
plot), 16.5°x11.6° (blue plot), and 33°x23.1° (green plot). (a) Normalized scores ±SEM. (b)
Normalized chip placement time ±SEM. The solid black lines indicate the best fit to the data.
Performance data for the chips task were normalized to values achieved at the
highest pixelization (17920 pixels), for each of the effective fields of view. Figure 60
displays mean results ± SEM plotted versus the effective resolution of the
environmental space⁶². Data obtained with the different fields of view were fitted
together to single exponential functions⁶³ (solid black lines in fig. 60). Best fit for
normalized scores reveals that maximum scores can be achieved with effective
environmental resolutions down to approximately 1.10 pixels/deg². Mean scores
decrease markedly below this effective resolution level. A similar trend can be
observed for normalized time results. Best fit to the data shows that the mean time
needed to correctly identify and place a chip increases notably at effective
environmental resolutions below 1.40 pixels/deg².
⁶² Calculated as the number of pixels needed to represent a field of view of 1° x 1° in each condition,
expressed in pixels/deg².
⁶³ $y = y_0 + a e^{-bx}$
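The effective resolution of the environmental space can be computed for every experimental condition as the number of pixels in the viewing window divided by the area of the environment it represents. The sketch below is an illustration using the rounded field sizes quoted in the text; the names are hypothetical and the resulting values are not taken from the thesis figures.

```python
# Effective fields of view (deg) and pixelization levels used in the experiments.
FIELDS_OF_VIEW_DEG = {"small": (8.25, 5.8), "medium": (16.5, 11.6), "large": (33.0, 23.1)}
PIXEL_LEVELS = (17920, 1991, 498, 221, 124)


def effective_resolution(n_pixels, fov_deg):
    """Pixels available per square degree of environment (pixels/deg^2)."""
    width_deg, height_deg = fov_deg
    return n_pixels / (width_deg * height_deg)


resolutions = {
    (name, n_pixels): effective_resolution(n_pixels, fov)
    for name, fov in FIELDS_OF_VIEW_DEG.items()
    for n_pixels in PIXEL_LEVELS
}

for (name, n_pixels), value in sorted(resolutions.items(), key=lambda item: item[1]):
    print(f"{name:6s} {n_pixels:6d} pixels -> {value:8.2f} pixels/deg^2")
```

Expressed this way, the same number of pixels gives a 16 times lower effective resolution with the large field of view than with the small one, which is what allows data from the three fields of view to be plotted on a common axis.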
4.4.3 Experiment 4: Pointing – The LEDs task
Figure 61. Visuomotor performance versus number of pixels in the 10°x7° viewing window for 3
normal subjects performing the LEDs task. Three effective fields of view projected in the 10°x7°
viewing window are compared in central vision: 8.25°x5.8° (red plot), 16.5°x11.6° (blue plot), and
33°x23.1° (green plot). (a) Mean pointing error expressed in cm ±SEM. (b) Mean pointing time
(average time required for finding and pointing on a target), expressed in s ±SEM. The solid black
lines indicate mean performance results (±SEM) during control sessions (normal viewing conditions).
Figure 61 compares mean visuomotor performance for the LEDs task, with each
effective field of view, versus number of pixels in the viewing window. Individual
performances in each experimental condition were established on the basis of 3
sessions. At 17920 and 1991 pixels, the 3 effective visual fields yielded pointing
errors of approximately 1 cm. With the 8.25° x 5.8° field of view (red plots in fig.
61), errors remained at the same level even at the lowest pixelization. Nonetheless, a
slight tendency towards larger errors was observed at 124 pixels. With the 16.5° x
11.6° field of view (blue plots in fig. 61), errors tended to increase (non-significantly)
at 498 pixels and below. With the 33° x 23.1° field of view (green plots in fig. 61),
pointing errors increased at 498 pixels and below. This loss of pointing precision was
significant only at 221 (p = 0.03) and 124 (p = 0.003) pixels.
Mean pointing time was influenced by the size of the effective field of view
projected in the viewing window. However, no significant effect of the pixelization
level was observed. The slowest pointing times were obtained with the 8.25° x 5.8°
field of view (~ 10 s). With the 16.5° x 11.6° field of view, mean pointing times were
approximately 6 s. The fastest pointing times were obtained with the 33° x 23.1°
field of view, which yielded values of about 5 s at the highest pixelizations. Mean
pointing times with this largest field of view slightly increased to approximately 6 s at
221 and 124 pixels.
Comparison of these results with performance obtained in normal viewing (black
solid lines in fig. 61) reveals that pointing errors were 2 - 3 times larger in our
particular experimental conditions. Pointing times were 3 – 7 times slower with the 3
effective fields of view than in normal viewing conditions.
Performance data for the LEDs task were also normalized to values achieved with
the highest pixelization (17920 pixels), for each effective field of view. These results
(mean ± SEM) are plotted versus the effective resolution of the environmental space
in figure 62. Data obtained with the different fields of view were fitted together to
single exponential functions (solid black lines in fig. 62). Best fit for normalized
pointing errors reveals that accuracy decreases at effective environmental resolutions
below 2.2 pixels/deg². Normalized pointing time appears to be less sensitive to the
effective resolution of the environmental space. In this case, best fit for the data
indicates that temporal performance is stable down to effective resolutions of about
0.8 pixels/deg². Time needed to localize and point at a target increases below this
value.
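The normalization step described above can be illustrated with a minimal sketch. It assumes that "normalized" means dividing each value by the value obtained at the highest pixelization (17920 pixels) for the same field of view; the function name and the numbers are hypothetical and are not data from these experiments.

```python
# Hypothetical mean pointing errors (cm) per pixelization level for one
# effective field of view. Illustrative values only, not experimental data.
pointing_error_cm = {17920: 1.0, 498: 1.1, 221: 1.3, 124: 1.5}


def normalize_to_reference(values, reference_level=17920):
    """Divide every value by the value measured at the reference pixelization.

    Assumption: normalization is a simple ratio to the reference condition.
    """
    reference = values[reference_level]
    return {level: value / reference for level, value in values.items()}


normalized = normalize_to_reference(pointing_error_cm)
for level in sorted(normalized, reverse=True):
    print(f"{level:>6} pixels: normalized error = {normalized[level]:.2f}")
```

With this convention the reference condition always maps to 1, so curves obtained with different fields of view can be plotted on a common scale.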
Figure 62. Normalized visuomotor performance versus effective resolution of the environmental
space in the 10°x7° viewing window for 3 normal subjects performing the LEDs task. Three effective
fields of view projected in the 10°x7° viewing window are compared in central vision: 8.25°x5.8° (red
plot), 16.5°x11.6° (blue plot), and 33°x23.1° (green plot). (a) Normalized pointing errors ±SEM. (b)
Normalized pointing time ±SEM. The solid black lines indicate the best fit to the data.
4.4.4 Summary of the results of these experiments
Altogether, results from experiments 5 and 6 demonstrate that both the number
of pixels contained in the viewing window and the effective visual field
projected inside it selectively affect visuomotor performance. Various combinations of
resolution and field of view allowed good performance on these tasks. Nevertheless,
these data reveal a fundamental limit for visuomotor performance: a minimum
effective resolution of about 2 pixels/deg² was necessary to perform both tasks with
reasonable accuracy and speed. In addition, a 16.5° x 11.6° effective field of view
seems to be the best compromise for visuomotor performance, providing a
reasonably large visual span while still maintaining reasonable image resolution.
4.5 Habituation experiments on visuomotor coordination
We conducted 2 successive experiments. First, subjects were asked to perform
the visuomotor tasks using a viewing window stabilized on the fovea. In a second
experiment, subjects were asked to perform the same tasks, but using a viewing
window stabilized at 15° of eccentricity.
To be consistent with our previous experiments on reading, a resolution of 498
pixels in the viewing window was judged to be the most adequate for the habituation
experiments on visuomotor coordination. An effective field of view of 16.5° x 11.6°
was chosen on the basis of experiments 5 and 6. This combination results in an
effective resolution of the environmental space of approximately 3 pixels/deg².
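The arithmetic behind these figures can be checked with a short sketch. It assumes that the effective resolution of the environmental space is the number of pixels in the viewing window divided by the area of the effective field of view in square degrees; this definition is an assumption made for illustration, and the function names are hypothetical.

```python
def field_area_deg2(width_deg, height_deg):
    """Area of a rectangular field of view, in square degrees."""
    return width_deg * height_deg


def effective_resolution(n_pixels, width_deg, height_deg):
    """Assumed definition: pixels per square degree of environmental space."""
    return n_pixels / field_area_deg2(width_deg, height_deg)


def pixels_needed(target_resolution, width_deg, height_deg):
    """Pixel count required to reach a target effective resolution."""
    return target_resolution * field_area_deg2(width_deg, height_deg)


# The three effective fields of view explored in the experiments.
FIELDS_OF_VIEW = {
    "8.25 x 5.8": (8.25, 5.8),
    "16.5 x 11.6": (16.5, 11.6),
    "33 x 23.1": (33.0, 23.1),
}

for label, (width, height) in FIELDS_OF_VIEW.items():
    resolution = effective_resolution(498, width, height)
    minimum = pixels_needed(2.0, width, height)
    print(f"{label} deg: 498 pixels -> {resolution:.2f} pixels/deg2; "
          f"2 pixels/deg2 needs about {minimum:.0f} pixels")
```

Under this assumption, 498 pixels over the 16.5° x 11.6° field of view give about 2.6 pixels/deg², and a 2 pixels/deg² threshold corresponds to roughly 383 pixels for that field of view.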
4.5.1 Experimental protocol
Three subjects (AP, female, 27 years old; AW, male, 34 years old; MV, male, 28
years old) participated in the experiments. All of them were naïve to eccentric
viewing and were tested monocularly using their dominant eye. One of the subjects
(AP) was already familiar with the tasks since she also participated in the first set of
experiments exploring the minimum requirements for visuomotor coordination.
For each experimental session, the visuomotor tasks were alternated: subjects
first performed one run of the chips task, then one run of the LEDs task. The
remaining aspects of the experimental procedure were identical to those described
for the acute experiments on visuomotor coordination. Eye movements were
recorded throughout the experiment and stored for further analysis. The detailed
analysis of eye movements is presented in Appendix D.
Each run started with a calibration of the eye tracker, and at the end the
calibration was checked for possible drifts. Whenever the average error obtained in
the calibration check was ≥1°, the results for the corresponding session were
removed from the analysis. This was never the case in the preparatory experiment
(central vision). In experiment 5 (eccentric vision), 4 sessions of the chips task
(12.5%) were discarded for subject AP. In the case of subject AW, 8 sessions for the
chips task (25%) and 7 sessions for the LEDs task (21%) were discarded. Finally, for
subject MV, 14 sessions for the chips task (41%) and 4 sessions for the LEDs task
(12.5%) had to be discarded.
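The exclusion rule described above can be sketched as follows. The `Session` record, its field names, and the example values are hypothetical; only the 1° criterion comes from the text.

```python
from dataclasses import dataclass

# Sessions whose calibration check shows an average error >= 1 degree are discarded.
DRIFT_LIMIT_DEG = 1.0


@dataclass
class Session:
    number: int
    calibration_error_deg: float  # hypothetical field: mean error in the final check


def split_sessions(sessions, limit=DRIFT_LIMIT_DEG):
    """Return (kept, discarded) lists according to the drift criterion."""
    kept = [s for s in sessions if s.calibration_error_deg < limit]
    discarded = [s for s in sessions if s.calibration_error_deg >= limit]
    return kept, discarded


# Illustrative values only, not data from the experiments.
sessions = [Session(1, 0.4), Session(2, 1.2), Session(3, 0.9), Session(4, 1.0)]
kept, discarded = split_sessions(sessions)
discarded_pct = 100 * len(discarded) / len(sessions)
print(f"kept {len(kept)} sessions, discarded {len(discarded)} ({discarded_pct:.1f}%)")
```

Note that a session with an error of exactly 1° is discarded, in line with the "≥1°" criterion.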
Possible learning effects were investigated by repeatedly performing experimental
sessions for more than 1 month. The criterion used to stop the experiments was the
stabilization of temporal performance (chip placement time and pointing time). In
general, 2 - 3 experimental sessions were conducted each working day of the week
(5 days per week). The duration of each experimental session varied
throughout the experiment, but never exceeded 30 minutes of testing. Two
periods of testing therefore represented less than 1 hour of daily training. In a
preparatory experiment, several sessions were conducted using a viewing window
stabilized in central vision. This experiment lasted until subjects became familiar with
the tasks (performing both visuomotor tasks while exploring the environment using a
small viewing window). Experiment 5, testing eccentric visuomotor performance,
began once the subjects had adapted to the central viewing condition.
Learning curves were computed using exponential functions as described in
Chapter 2. Significant learning effects were determined using simple linear
correlation (Pearson’s correlation).
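As an illustration of the correlation analysis mentioned above, the following minimal sketch computes Pearson's correlation coefficient between session number and a performance measure. The data are hypothetical, the helper name is an assumption, and the sketch computes neither the associated p-value nor the exponential fit described in Chapter 2.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance_x = sum((a - mean_x) ** 2 for a in x)
    variance_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(variance_x * variance_y)


# Hypothetical chip placement times (s) over ten sessions; not experimental data.
session_numbers = list(range(1, 11))
placement_time_s = [20.0, 16.5, 14.0, 12.0, 10.5, 9.5, 8.8, 8.2, 7.9, 7.6]

r = pearson_r(session_numbers, placement_time_s)
print(f"r = {r:.2f}")
```

In this example the measure falls with session number, so the coefficient is negative; its magnitude indicates how strongly performance changes with training.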
4.5.2 Preparatory experiment: Learning in central vision
This experiment was intended to familiarize subjects with the unusual activity of
performing visuomotor tasks using a small viewing window containing pixelized
segments of the environment. Approximately 18 to 27 sessions were necessary for
temporal performance to reach an asymptote.
Figure 63 presents performance in central vision versus session number for the
chips task. All three subjects achieved good chip placement scores (> 85% correct)
already in the first sessions. Significant learning effects were observed in the analysis
of chips scores versus time for 2 subjects (Pearson’s correlation: r = 0.48, p < 0.05
for AW; and r = 0.69, p < 0.001 for MV). In the case of subject AP, no significant
learning effect was observed since she achieved perfect scores (100% correct)
already in the first sessions. Chip placement time improved significantly with training
for all three subjects (Pearson’s correlation: r = 0.73, p < 0.001 for AP; r = 0.63, p <
0.001 for AW; and r = 0.77, p < 0.0001 for MV). Analysis of the experimental data
revealed that the average time required for correctly identifying and placing a chip
diminished from 7 to 4.5 s for AP, from 20 to 7 s for AW, and from 16 to 7 s for MV.
Figure 63. Performance versus session number obtained for 3 normal subjects performing the chips
task in central vision (10°x7° viewing window containing 498 pixels and subtending a 16.5°x11.6°
field of view). Results expressed as: (a) Chip placement scores expressed in RAU (left scale) and in %
(right scale). (b) Chip placement time expressed in s. The solid lines indicate the best fits to the data.
Figure 64 shows performance in central vision versus session number for the
LEDs task. Subjects achieved average pointing errors between 1 and 2 cm in the first
sessions. A significant learning effect was observed in the analysis of pointing errors
versus time only for AW (Pearson’s correlation: r = 0.65, p < 0.001), who improved
his results from about 1.5 cm down to approximately 1 cm. Mean pointing errors for
subject AP remained stable around 1.1 cm. Pointing errors for subject MV slightly
declined with training from approximately 2 cm down to about 1.4 cm. Mean pointing
time significantly improved with training for all three subjects (Pearson’s correlation:
r = 0.54, p < 0.05 for AP; r = 0.71, p < 0.0001 for AW; and r = 0.54, p < 0.05 for
MV). Results improved from 4.7 to 4 s for AP, from 5.5 to 3 s for AW, and from 7 to
4.8 s for MV.
Figure 64. Performance versus session number obtained for 3 normal subjects performing the LEDs
task in central vision (10°x7° viewing window containing 498 pixels and subtending a 16.5°x11.6°
field of view). Results expressed as: (a) Mean pointing error expressed in cm. (b) Mean pointing time
expressed in s. The solid lines indicate the best fits to the data.
These data clearly demonstrate that accurate but relatively slow visuomotor
performance can be obtained under conditions mimicking artificial vision in the
central visual field. This means that all necessary information could be transmitted
and captured by the visual system. Almost all chips could be correctly identified and
placed. LEDs were localized with reasonable accuracy. However, both tasks
took noticeably longer than under the normal viewing conditions reported
in experiments 5 and 6 (compare figs. 63b and 64b with figs. 59b and 61b). This
outcome is not surprising given the increased difficulty of exploring the environment
using a restricted viewing window.
4.5.3 Experiment 5: Learning in eccentric vision
This experiment was designed to explore whether subjects could adapt to the
unusual activity of performing visuomotor tasks using a small viewing window
containing pixelized segments of the environment, and stabilized at 15° eccentricity
in the lower visual field. In this case, temporal performance for both tasks
asymptoted within 30 sessions.
Figure 65 shows performance in eccentric vision (15° in the lower visual field)
versus session number for the chips task. Significant learning effects were observed
in the analysis of chip placement scores versus time for the 3 subjects (Pearson’s
correlation: r = 0.47, p < 0.05 for AP; r = 0.47, p < 0.05 for AW; and r = 0.73, p <
0.001 for MV). Scores for subject AP improved impressively: from initial scores below
10% correct, up to final scores above 97%. Surprisingly, subject AW achieved scores
above 90% correct already in the initial sessions, and achieved perfect (100%
correct) scores at the end of the experiment. Subject MV also started the experiment
with relatively high scores, around 90% correct. He consistently achieved perfect
scores after about 20 sessions.
Improvements in chip placement time were more gradual and impressive.
Significant learning effects were observed for all subjects (Pearson’s correlation: r =
0.80, p < 0.0001 for AP; r = 0.97, p < 0.0001 for AW; and r = 0.81, p < 0.0001 for
MV). Subject AP showed an approximately 4-fold improvement: from above 30 s in
the first sessions, to reaching an asymptote around 7 s in approximately 13 sessions.
In subject AW, chip placement time decreased from around 26 s in the initial
sessions, down to 9 s after about 7 sessions. Subject MV started the experiment with
values around 25 s, and reached an asymptote around 9 s after approximately 30
sessions.
Figure 65. Performance versus session number obtained for 3 normal subjects performing the chips
task in eccentric vision (15° in the lower visual field). Experimental conditions: 10°x7° viewing window
containing 498 pixels and subtending a 16.5°x11.6° field of view. Results expressed as: (a) Correct
scores expressed in rationalized arcsine units [RAU] (left scale) and in % (right scale). (b) Chip
placement time expressed in s. The solid lines indicate the best fits to the data.
Figure 66 shows performance in eccentric vision versus session number for the
LEDs task. Pointing error results showed high variability. Pointing errors for
subject AP remained roughly stable throughout the experiment, around a value of
1.2 cm. Subject AW initially achieved pointing errors around 1 cm, and with time
values significantly increased to approximately 1.4 cm (Pearson’s correlation: r =
0.61, p < 0.01). A significant learning effect was observed in the analysis of pointing
errors versus time for MV (Pearson’s correlation: r = 0.58, p < 0.01), who improved
his results from around 1.6 cm to 1.2 cm. Results for this subject were still
decreasing when the experiment ended.
Pointing time improved significantly with training for subjects AP and MV
(Pearson’s correlation: r = 0.68, p < 0.0001 for AP; and r = 0.63, p < 0.001 for MV).
Values improved from 21 to 6 s for AP (in ~ 7 sessions) and from 22 to 6.5 s for MV
(in ~ 9 sessions). Subject AW started the experiment with relatively fast pointing
times (~ 7 s). With training, values for this subject slightly (non-significantly)
declined to approximately 5 s.
Figure 66. Performance versus session number obtained for 3 normal subjects performing the LEDs
task in eccentric vision (15° in the lower visual field). Experimental conditions: 10°x7° viewing window
containing 498 pixels and subtending a 16.5°x11.6° field of view. Results expressed as: (a) Mean
pointing error expressed in cm. (b) Pointing time expressed in s. The solid lines indicate the best fits
to the data.
Taken together, results from experiment 5 demonstrate that an important
learning process occurred for both visuomotor tasks. The evolution was expressed
quite differently across subjects. Subject AP, for example, improved impressively
throughout the experiment in 3 out of the 4 measured parameters (chip placement
score, chip placement time, and target pointing time). In contrast, subject AW began
the experiment with near-perfect chip placement scores (above 90% correct), while
his pointing errors (LEDs task) appeared to worsen with time. In his case, the
learning process was best expressed in terms of chip placement time and target
pointing time. At the end of experiment 5, all subjects achieved similar levels of
visuomotor performance, reaching the same level as after the preparatory
experiments in central vision. We can thus consider that, after the 1-month training
period, the 3 subjects who participated in the experiments attained functionally
useful visuomotor performance.
4.6 Discussion
The first goal of the experiments presented in this chapter was to explore the
minimum requirements for useful visuomotor coordination in central vision. We
assessed the influence of the particular visual conditions that will most probably
result from the use of retinal prosthetic devices. Simple visuomotor tasks could be
achieved with reasonable speed and accuracy at effective resolutions of the
environmental space above 2 pixels/deg² (e.g. ~ 400 pixels with the 16.5° x 11.6°
field of view). Below this fundamental value, visuomotor performance was
significantly impaired. Still, each of the visual constraints imposed in our artificial
vision simulations had an impact on visuomotor performance. It is, thus, interesting
to discuss the influence of these factors in more detail.
Visuomotor performance was poorer in conditions simulating artificial vision than
in normal viewing conditions, even at the highest resolution levels. At the highest
pixelization level investigated, chip placement time was already 2 - 8 times slower
than in normal viewing. Pointing time for the LEDs task displays a similar picture,
showing a 2-fold to 5-fold slow-down. This confirms that the simple fact of having to
explore the environment with a small visual area significantly affects visuomotor
performance by itself. Restricting the size of the effective field of view further limited
visuomotor performance. Both visuomotor tasks were performed faster as larger
portions of the environment were available at a glance: a 2 to 3-fold time difference
was observed between the 33° x 23.1° field of view and its 8.25° x 5.8° counterpart.
Substantial increases in visual search times have been obtained in other studies
investigating the influence of artificial scotomas on normal subjects (Henderson et
al., 1997; Cornelissen et al., 2005). The main explanation for this finding is probably
that, since input from the visual periphery is limited, there is no information for
redirecting eye/head movements towards ‘informative’ regions of the environment
(Antes, 1974; Loftus & Mackworth, 1978) and to plan efficient scanning strategies
(Cornelissen et al., 2005). Other studies demonstrated that visual degradations such
as visual field restrictions result in slower limb movements (Servos & Goodale, 1994;
Servos, 2000; Loftus et al., 2004). Authors hypothesized that such slow-down is
probably required by the nervous system to gather enough feedback to correct initial
movement errors and avoid unwanted collisions. Therefore, the increase in the time
required to perform our visuomotor tasks is probably due both to lengthier scanning
of the environment and to a reduced speed of arm movements towards the targets.
Pointing precision was also fundamentally limited by our artificial ‘tunnel vision’
conditions. Irrespective of the size of the effective field of view projected in the
restricted 10° x 7° viewing window and at maximum target resolution, pointing
errors were almost double the average pointing errors obtained in normal
viewing conditions. Pointing errors were approximately double with the smallest field
of view when compared to its larger counterparts. This result fits well with findings
showing that visual field restrictions and monocular viewing have an adverse effect on
the accuracy of distance estimations. Coello & Grealy (1997) reported that when
targets were displayed in narrow fields of view, aiming accuracy was poorer than in
normal viewing conditions. A research group at Indiana University studied the
effect of different perturbations of the visual information available to subjects during
reaching tasks (Bingham & Pagano, 1998). The task consisted of inserting a stylus
into a target hole located at a variable distance in front of the subject. Their findings
confirmed that visual field restrictions in monocular viewing resulted in
underestimations of target distance. In addition, their results demonstrated that the
accuracy of distance estimations decreased with increasing target distance. Others
explored whether estimates of object distance and size were affected by peripheral
field restrictions per se, or whether these effects varied with visual field size (Watt et
al., 2000). Subjects were requested to reach towards white paper rectangles in 5
binocular visual field conditions (4°, 8°, 16°, 32°, and 64°). Experiments were
performed in the dark (no visual feedback) and tactile information was removed by
covering the table where the targets were positioned with a transparent acetate
sheet. Subjects systematically underestimated the distance towards the target to be
reached, and this underestimation increased linearly as the size of the field of view
decreased. There was no evidence of misjudgment of object size across the different
visual conditions tested. A more recent study (Loftus et al., 2004) demonstrated that
such distance estimation errors could be reduced or even eliminated in visually rich
and structured environments, as previously suggested by others (Coello & Grealy,
1997; Bingham & Pagano, 1998; Magne & Coello, 2002), and when haptic feedback
was available. However, results indicated that when visual field restrictions were present,
the variability of the results increased. It appears, thus, that the fact of having to
explore the environment with a restricted viewing window presented monocularly (in
the poorly structured visual environment provided for the LEDs task), already limited
the subjects’ capacity to estimate the target (LED) location. Furthermore, no tactile
feedback about target location was provided. This probably resulted in the larger
pointing errors observed even when images were presented at maximum screen
resolution.
As expected, the number of pixels in the restricted viewing window affected
visuomotor performance. With the small and medium fields of view, almost perfect
chip placement scores could be achieved even at low pixelization levels. With the
largest field of view, chip placement scores seemed to be limited at target resolutions
below 498 pixels. Pointing precision for the LEDs task worsened around 498 to 221
pixels, depending on the field of view explored. These results are in accordance with
those reported for the reading task in the previous chapter, and agree with
consistent observations in low vision patients. Target resolution determines the
capacity of discriminating detail in a visual image. The effect in reading is obviously
more pronounced, since text characters are complex ‘objects’ requiring very fine and
detailed spatial discrimination capabilities. Correctly identifying a chip required some
image discrimination abilities, which were notably limited when only a few pixels
were available to represent large portions of the environment (as was the case for
the largest field of view at low pixelization levels). This issue was more visible in chip
placement time. As fewer pixels became available, image discrimination became more
difficult and it took subjects longer to accurately recognize and locate the
position of the chip in the template. The number of pixels in the viewing window also
affected pointing precision for the LEDs task. This was particularly important in the
case of the largest field of view. The rapid loss of image detail resulted in less precise
estimates of target location. Head dispersion results for the chips task confirm these
observations (refer to the analysis of head movements in Appendix D). Head
movements for this task were noticeably influenced by the number of pixels in the
viewing window, especially about the transversal axis. As the image projected in the
viewing window lost resolution, subjects approached the chips template to
compensate for the loss of detail.
Results were quite variable across the subjects that participated in the
experiments. In addition, only a small number of subjects (3)
participated in the experiments. However, the tendencies mentioned are clear, and
they outline rather well how the probable constraints of artificial vision devices
impact visuomotor performance. In addition, our results also illustrate how the
impact of these constraints can be minimized (for example, optimizing the effective
field of view available for the task) so that some visuomotor coordination abilities can
be restored to blind patients wearing such devices.
The second goal of this investigation was to determine whether subjects could
adapt to the unnatural conditions of performing visuomotor tasks using a small
restricted viewing window containing pixelized fractions of the environment and
stabilized at 15° eccentricity. Our results clearly indicate that subjects are able to
adapt quite well to the tasks investigated, and that performance can improve
significantly with time. About 1 month of daily training was necessary in this case to
achieve optimum visuomotor performance.
A couple of issues still deserve to be mentioned. On the one hand, our experiments
do not address the importance of eye movements for visuomotor coordination tasks.
Since there is evidence that both efferent and afferent eye position signals are used
by the saccadic system to locate and encode targets, visual prostheses that allow the
user to explore the environment using eye movements (as in subretinal implants
transforming light into stimulation currents in-situ) could allow for better
performance in these tasks than systems where the environment has to be explored
with head movements. This issue should still be examined. On the other hand, our
results might underestimate actual visuomotor performance under severely degraded
visual conditions since we deliberately took out all possible tactile cues and
experiments were performed in poorly structured visual backgrounds. Previous
research has demonstrated that, for example, the simple fact of providing a
structured visual environment significantly reduced the adverse effects of visual field
restrictions (see e.g. Magne & Coello, 2002; Loftus et al., 2004). Therefore, we could
expect future visual prosthesis wearers to face less dramatic limitations in most real
life situations than those depicted in our experiments. Yet, in our methodology we
always try to use the worst-case scenario, in order to provide the most
realistic picture of the minimum visual requirements for each task category explored.
4.7 Conclusions
The data reported in this chapter clearly demonstrate that useful (accurate but
slow) visuomotor performance can be obtained under conditions mimicking artificial
vision in the central visual field. This means that all necessary information could be
transmitted and captured by the visual system. Almost all chips could be correctly
identified and placed. LEDs were localized with reasonable accuracy. However,
visuomotor performance (especially the time required to complete the tasks) was
poorer than in normal viewing conditions. This outcome is not surprising due to the
increased difficulty of exploring the environment using a restricted viewing window.
The results presented in this chapter also demonstrate that, after a relatively short
adaptation process, useful visuomotor performance can be achieved even if retinal
implants have to be placed in the periphery of the visual field.
Compared to the reading task, visuomotor coordination is less demanding in
terms of visual information content. Thus, at this point, we estimate that about
500-600 distinctly perceived phosphenes, distributed on a 3 x 2 mm² implant, remain the
minimum criterion to achieve useful reading and visuomotor coordination.
4.8 Publications resulting from this research
Pérez Fornos, A., Sommerhalder, J., Pittard, A., Safran, A.B., & Pelizzone, M. (2005). Minimum
requirements for visuomotor coordination and learning of this task in eccentric vision. ARVO
Meeting Abstracts, 46, 1533 (abstract).
Pérez Fornos, A., Sommerhalder, J., Pittard, A., Safran, A.B., & Pelizzone, M. (2006). Minimum
requirements for useful visuomotor coordination and learning of such tasks in eccentric vision. In
preparation.
5 Experiments on Mobility
What is the difference between exploring and being lost?
Dan Eldon (1970 - 1993)
5.1 Foreword
In the preceding chapters, we studied reading and visuomotor performance in
conditions simulating artificial vision. An essential category of tasks should still be
investigated: those involving whole-body mobility and orientation. Strelow (1985)
characterizes mobility as “the skill of traveling through the spatial environment,
avoiding obstacles, and traveling directly or indirectly toward goals”. Foulke (1971)
more specifically defines efficient orientation and mobility as the abilities to
accurately localize one’s own body in space and to travel “safely, comfortably and
independently”. Visually driven motor performance is essential for various daily
activities (walking, heading, way-finding, avoiding obstacles, etc…) and has different
visual requirements than the previously studied tasks. For example, for stepping over
an obstacle, pertinent information includes encoding the obstacle’s width, height,
location and eventually speed/direction (if the obstacle is moving). Yet, other
information such as color (required, for instance, for identifying targets in
reaching/grasping tasks) would not be relevant. Whole-body mobility might therefore
impose different constraints on an artificial vision system. The determination of
minimum visual requirements for mobility is, however, not straightforward since
performance on these tasks is difficult to quantify: experimental variables are not
easy to manipulate and responses are difficult to measure; there is no single, most
representative task; and, finally, demands of the mobility task vary according to each
particular environment (Strelow, 1985).
Evaluating whole-body mobility is important since, together with reading, it is
strongly associated with vision-related estimates of quality of life and represents one
of the main goals of low vision patients seeking rehabilitation (Pelli, 1987; Wolffsohn
& Cochrane, 1998).
5.2 Introduction
Essentially, whole-body mobility requires the capacity to judge egocentric64 and
exocentric65 distances for solving issues such as localization of body in space,
perception of movement, distance estimation, and speed estimation (Stoffregen,
1985; Warren, 1995; Cutting & Vishton, 1995; Apfelbaum et al., 2006). Among the
sources of information used for this purpose, vision is one of the most valuable since
it simultaneously (and almost instantaneously) supplies static and dynamic
information regarding the near and far environment (Patla, 1997). In addition,
different aspects of visual information such as visual field, acuity, and contrast
sensitivity, selectively influence the way we perceive the environment. For example,
the size of the available visual field fundamentally limits the area in which different
features of the environment (e.g. obstacles, objects of interest…) can be detected.
Conversely, acuity and contrast thresholds determine how much image detail can be
perceived.
64 Distances measured from the observer to particular locations in the environment.
65 Distances measured between two points in the environment.
Among all these visual information variables, which are fundamental
requirements for mobility, and to what extent can this input be degraded? Pelli (1987)
studied this issue by artificially restricting vision (available field of view, contrast
sensitivity, and visual acuity) in normally sighted subjects. Mobility tasks were
performed in two environments: a laboratory maze (long corridor cluttered with
randomly positioned vertical foam rubber columns) and a shopping mall (L-shaped
trajectory, 250 m walking distance). In both settings, performance was nearly
unimpaired down to very restricted vision, suggesting that very little information is
required to walk through indoor environments with reasonable accuracy and speed.
The critical thresholds he found for whole-body mobility in the laboratory maze were
10° of visual field, visual acuity of 20/2000, and 4% of normal contrast. The
corresponding critical values for the shopping mall were 4° of visual field, visual
acuity of 20/2000, and 2% of normal contrast. The author stated, however, that low
vision patients who should, according to these criteria, have enough vision to travel
with reasonable accuracy and speed, still complained of mobility problems.
5.2.1 What have we learned from low vision patients?
Low vision patients generally complain of severe problems while performing
mobility tasks. The nature and impact of the handicap obviously depends on the type
of visual deficit, but also appears to be influenced by a number of environmental
variables, such as light level, contrast, and type of obstacles involved in the task.
These patients represent, thus, an ideal model to highlight the fundamental visual
requirements for mobility performance. At the same time, data collected on these
patients also provide valuable indications of how the different visual constraints
imposed by visual prostheses will affect performance on mobility tasks.
Several authors have studied mobility in low vision patients. Marron & Bailey
(1982) investigated the influence of visual field, visual acuity, and spatial contrast
sensitivity in mobility performance. Nineteen low vision patients participated in the
experiments. Mobility and orientation were evaluated using indoor (12.2 x 2.4 m
corridor cluttered with cylindrical obstacles of variable diameter and length hanging
from the ceiling) and outdoor (rectangular city block including different every-day
obstacles) courses. Their results showed that both visual field and contrast sensitivity
have a significant effect on mobility performance, but not visual acuity. A later study
examined mobility performance in 88 visually impaired veterans, divided in groups
according to the type of vision loss (Kuyk et al., 1996). Travel time and total number
of object contacts were measured in an indoor course. In general, mobility was
influenced by light level, object contrast and object type. Performance also varied
according to the type of visual deficit: subjects with peripheral field restriction had
greater difficulty than those with acuity loss. Later, the same authors (Kuyk et al.,
1998; Kuyk & Elliott, 1999) presented similar studies but using real world settings
(indoor hallway and outdoor course in a residential area). Their results confirmed
that reduced illumination notably impaired mobility. Similar to previously mentioned
findings by Marron & Bailey (1982), contrast sensitivity and visual field extent were
found to be the most important predictors of performance. Laboratory results
correlated well to those obtained on real world situations, but inter-subject variability
was considerable, probably due to differences in vision characteristics between
diseases and to task complexity. Geruschat et al. (1998) compared mobility
performance of patients suffering from retinitis pigmentosa (RP) with that of normal
subjects. The task consisted in walking through a predefined course as quickly and
safely as possible, avoiding all obstacles in their path. Travel time, number of
mobility incidents and self-perceived estimates of mobility performance were
measured under normal and reduced illumination conditions. In general, RP patients
traveled more slowly than normally sighted subjects. All subjects (normal and RP)
were affected by reduced illumination, but mobility incidents were 5 times more
frequent in RP patients. Walking speed was significantly correlated with 3 visual
variables: visual acuity, contrast sensitivity and visual field extent. Turano & Wang
(1992) measured spatial motion thresholds (footnote 66) in both normal and RP subjects. In
general, these motion thresholds were higher in RP patients than in normal subjects.
They were also significantly higher when simulating random photoreceptor (pixel)
dropout above 25% in normal subjects. These findings confirm the hypothesis that a
reduction in spatial photoreceptor density contributes to motion-threshold elevation.
The same research group tried to categorize a series of 35 mobility situations for
patients at various stages of RP (Turano et al., 1999). They used a questionnaire
scaling from 1 (no difficulty) to 5 (extreme difficulty). The mobility situations
requiring the least and most visual ability were, respectively, “moving about in the
home” and “walking at night”.
A research group at the Arlene R. Gordon Research Institute (Lighthouse
International) proposed new methodologies to determine the minimal visual
requirements for driving a car (Higgins et al., 1996; Higgins & Wood, 1998; Higgins
& Wood, 2005). Indicators of driving performance were abilities such as steering,
reading road signs, and recognizing road hazards. Acuity degradation produced
selective losses in some aspects of driving performance (e.g. decreased ability to
recognize high contrast signs and to avoid large, low contrast road hazards; slower
driving). Other aspects of driving performance (perception of lateral clearance,
maneuvering or ‘slaloming’ through a series of traffic cones) were largely unaffected
by low visual acuity.
Altogether, these findings reveal that several aspects of vision are
particularly important for whole-body mobility. Two of these factors deserve
particular attention due to the probable constraints of future visual prosthetic
devices: the number of pixels available in the viewing window (i.e. visual acuity) and
the available field of view.
Footnote 66. Minimum displacement required for correctly detecting the direction of motion.
5.2.2 Mobility in the context of artificial vision
Cha et al. (1992a) were the only ones to directly address whole-body mobility
under conditions simulating artificial vision. Briefly, normal subjects had to walk
through an indoor maze while wearing the same pixelized vision simulator used in
their previous reading experiments (Cha et al., 1992b). Their results suggested that,
similar to reading, an array of 25 x 25 pixels, projected on a foveal visual field of
1.7°, but encompassing a field of view of about 30°, could provide useful mobility
performance in environments not requiring a high degree of pattern recognition.
This previous study neglected, however, fundamental aspects of vision with a
retinal implant (images were not stabilized at a particular retinal location, nor were
they coupled to the subjects’ eye movements; eccentric implant locations were not
explored). Minimum requirements for whole-body mobility must, therefore, still be
systematically studied if one expects visual prostheses to restore these abilities, at
least to a certain point.
The main objective of the investigation presented in this chapter was, thus, to
systematically assess mobility performance in conditions mimicking artificial vision as
provided by a retinal implant transforming incident light into stimulation currents ‘in situ’. In a first series of experiments we determined the minimum requirements
for useful mobility, by studying the influence of stimulus information content
(pixelization level and available field of view) on mobility performance with a visual
area stabilized in central vision, in a variety of situations. A second experiment was
dedicated to exploring whether naïve subjects could learn to perform whole-body
mobility tasks in eccentric vision, under similar experimental conditions.
5.3 Specific methods for the experiments on mobility
5.3.1 Subjects
Subjects were normal volunteers, familiar with the purpose of the study, and
recruited either from the staff of the Ophthalmology Clinic of the Geneva University
Hospitals, or from the staff of the University of Geneva. Their age ranged from 22 to
49 years. All had visual acuity better than 16/20 in the tested eye, normal
ophthalmologic status, and normal haptic perception.
5.3.2 Effective field of view
Similar to the Visuomotor Coordination experiments, we examined this issue by
projecting different portions of the environment (subtending different effective fields
of view) into the 10° x 7° stabilized viewing window (see fig. 57). This was achieved
by modifying the frame size of the image captured by the webcam. Using the
custom-made objective, the webcam captured images of the environment
corresponding to 66° x 49.5°. Therefore, setting the image frame size of the webcam
to 640 x 480, 320 x 240, and 160 x 120 pixels resulted, respectively, in 16.5° x 11.6°,
33° x 23.1°, and 66° x 46.2° portions of the environment represented inside the
same 10° x 7° viewing window. Please refer to section 4.3.3 for more details.
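The relation between webcam frame size and effective field of view can be checked numerically. The sketch below is only illustrative: it assumes that a fixed 160 x 112 pixel region of each frame fills the viewing window (an inference from the highest pixelization level, 17920 = 160 x 112 pixels, not something stated in this section) and that angle scales linearly with pixel count. The function and constant names are hypothetical.

```python
# Hypothetical sketch: effective field of view seen through the viewing window.
# Assumption (not stated in the text): a fixed 160 x 112 pixel region of each
# webcam frame is displayed, and angle scales linearly with pixel count.

CAMERA_FOV_DEG = (66.0, 49.5)   # full field captured by the webcam (from the text)
WINDOW_PX = (160, 112)          # assumed region shown in the 10 x 7 deg window


def effective_fov(frame_px):
    """Return (horizontal, vertical) degrees of environment shown in the window."""
    frame_w, frame_h = frame_px
    fov_w = CAMERA_FOV_DEG[0] * WINDOW_PX[0] / frame_w
    fov_h = CAMERA_FOV_DEG[1] * WINDOW_PX[1] / frame_h
    return fov_w, fov_h


if __name__ == "__main__":
    for frame in [(640, 480), (320, 240), (160, 120)]:
        w, h = effective_fov(frame)
        print(f"{frame[0]} x {frame[1]} px -> {w:.2f} x {h:.2f} deg")
```

Under these assumptions the computed values agree with the fields of view quoted above (the vertical extent of 11.55° for the 640 x 480 frame is reported as 11.6° in the text).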
5.3.3 Experimental setup
The details of the simulation procedures have already been described in
Chapter 2. The apparatus used corresponded to the mobile setup with the webcam
(with the custom-made objective) mounted on the top of the system (see fig. 27b).
The image-processing algorithm used was real-time square pixelization.
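As a reminder of what square pixelization does, the following minimal sketch replaces every square block of a grayscale image by its mean value. It is an illustration under stated assumptions, not the real-time implementation described in Chapter 2: the image is a plain list of rows, and the helper names and the 160 x 112 pixel source region are hypothetical.

```python
# Hypothetical sketch of square pixelization by block averaging.
# Assumption: grayscale image stored as a list of rows of numbers; the
# real-time implementation described in Chapter 2 may differ in detail.


def pixelize(image, block):
    """Replace every block x block square of the image by its mean value."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, height, block):
        for left in range(0, width, block):
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            mean = sum(image[r][c] for r in rows for c in cols) / (len(rows) * len(cols))
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out


def pixel_count(width, height, block):
    """Number of square 'pixels' left after pixelization (may be fractional)."""
    return (width / block) * (height / block)


if __name__ == "__main__":
    tiny = [[0, 2, 10, 10],
            [2, 4, 10, 10]]
    print(pixelize(tiny, 2))
    print(round(pixel_count(160, 112, 6)))
```

If the source region is indeed 160 x 112 pixels, block sides of 3, 6, 9, and 12 pixels give counts close to the pixelization levels of 1991, 498, 221, and 124 pixels used in the experiments below; this correspondence is an observation, not a statement from the text.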
The experimental procedure was very similar to that used for our previous
experiments. At the beginning of each run, a standard 9-point calibration was
performed and the actual experimental sequence started afterwards. The viewing
window (10° x 7°) contained fragments of pixelized images, extracted from the
frames captured by the webcam. Gaze position compensation was used to project
this viewing window onto defined (central or eccentric) areas of the retina (see fig.
25). The background of the remaining screen surface was gray (corresponding to the
mean luminosity of the visual scene). Tests were performed monocularly (using the
dominant eye). Test sessions frequently included several runs, but they never lasted
longer than 30 minutes to avoid subjects’ fatigue. Eye movement data for each run
were recorded and stored for further analysis.
5.4 Acute experiments on mobility
The amount of visual information required for achieving satisfactory mobility
varies according to the type of environment in which the task is to be performed. In
order to present an adequate overall picture of the visual information requirements
for useful mobility, we developed a series of tasks involving different realistic
situations.
5.4.1 Experiment 6: Laboratory maze
This task was conceived to assess mobility performance in familiar, randomized
indoor environments. The task consisted in walking through an indoor maze
composed of 6 obstacles frequently encountered in daily life and positioned randomly
on the course. In the example shown in figure 67a, the subject had to, successively:
pass between 2 poles, open and pass through a door, climb over stairs, walk on
square marks placed on the floor while avoiding the circular mark, sit in front of a
table and put a pencil inside a plastic cup, and finally slalom around 3 poles. Figure
67b shows a picture of one of the subjects wearing the artificial vision simulator
while performing the task.
Figure 67. The ‘Laboratory maze’ task. (a) Scheme of the indoor course used for the experiments.
The task consisted in completing a circular course composed of 6 randomly positioned, familiar
obstacles. (b) One of the subjects wearing the mobile setup during the task.
5.4.1.1 Experimental protocol
Three subjects (AP, female, 28 years old; CF, male, 32 years old; JS, male, 43
years old) participated in the experiments. Before starting the actual experimental
sequence, all subjects performed 3 control runs for each task. These control sessions
were conducted in normal viewing conditions, and results were used as baseline
measures for ‘normal’ mobility performance.
Performance was measured with viewing windows presented at 5 pixelization
levels (17920, 1991, 498, 221, and 124 pixels) and subtending 3 effective fields of
view (16.5° x 11.6°, 33° x 23.1°, and 66° x 46.2°). Three successive runs were
performed per experimental condition (a given effective field of view and pixelization
level). The order in which each effective field of view was presented to each subject
was permuted using a Latin square (see table 3). With each field of view, subjects
started with the easiest (highest) pixelization level and progressed towards the most
difficult (lowest) one. Once all pixelization levels for a given field of view were
completed, subjects progressed to the next one. Possible global learning effects
favoring a particular effective field of view would therefore be minimized, but
learning would still favor performance at low pixelization levels.
Table 3. Testing sequences (Latin square permutation) for the effective fields of view during
experiment 6.
Sequence    AP               CF               JS
1           16.5° x 11.6°    66° x 46.2°      33° x 23.1°
2           66° x 46.2°      33° x 23.1°      16.5° x 11.6°
3           33° x 23.1°      16.5° x 11.6°    66° x 46.2°
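The permutation in table 3 has the structure of a cyclic Latin square: each field of view appears exactly once per subject and once per position in the testing sequence. A minimal sketch of how such a square can be built and checked is given below; the function names are hypothetical and the code is illustrative only, not the procedure actually used to draw up the table.

```python
# Hypothetical sketch: building and checking a cyclic Latin square of
# testing orders. Names are illustrative, not taken from the thesis.


def cyclic_latin_square(conditions):
    """Row i is the condition list rotated left by i positions."""
    n = len(conditions)
    return [[conditions[(i + j) % n] for j in range(n)] for i in range(n)]


def is_latin_square(square):
    """True if every row and every column contains each condition exactly once."""
    symbols = set(square[0])
    n = len(square)
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({row[j] for row in square} == symbols for j in range(n))
    return rows_ok and cols_ok


if __name__ == "__main__":
    fields = ["16.5 x 11.6", "66 x 46.2", "33 x 23.1"]
    square = cyclic_latin_square(fields)
    for position, row in enumerate(square, start=1):
        print(position, row)
    print(is_latin_square(square))
```

With the three fields of view listed in this order, the rows of the generated square correspond to the sequence positions 1 to 3 and the columns to subjects AP, CF, and JS.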
Mobility performance was measured as total time required for completing the
course and number of errors per course. We considered as errors: stepping out of
the marked path (green lines in fig. 67a), involuntarily touching any object, using the
sense of touch to achieve a particular sub-task (e.g. finding the doorknob when
trying to open the door), or failing to complete a sub-task at a particular object (e.g.
dropping or touching the plastic cup when trying to put the pencil inside it, while
sitting at the table).
Results were calculated as the mean of the cumulative data of each subject ±
SEM. Statistically significant differences were determined using standard (paired) t
tests with a significance level of 0.05.
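The summary statistics named above (mean ± SEM across subjects and a paired t test at the 0.05 level) can be sketched as follows. This is an illustration only: it assumes that each subject contributes one value per condition, the numbers are made-up placeholders rather than data from this study, and the function names are hypothetical. The critical values are the standard two-tailed t values at the 0.05 level for 2, 3, and 5 degrees of freedom.

```python
# Hypothetical sketch of the summary statistics named in the text.
# The numbers below are made-up placeholders, NOT data from the thesis.
import math
import statistics

# Two-tailed critical t values at alpha = 0.05, indexed by degrees of freedom.
T_CRIT_05 = {2: 4.303, 3: 3.182, 5: 2.571}


def mean_sem(values):
    """Return (mean, standard error of the mean) of per-subject values."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    mean_d, sem_d = mean_sem(diffs)
    return mean_d / sem_d, len(diffs) - 1


if __name__ == "__main__":
    errors_high_res = [1.0, 0.7, 1.3]   # placeholder per-subject means
    errors_low_res = [2.0, 1.8, 2.5]    # placeholder per-subject means
    t, df = paired_t(errors_low_res, errors_high_res)
    print(mean_sem(errors_low_res), t, df, abs(t) > T_CRIT_05[df])
```

With only 3 subjects (2 degrees of freedom) the critical value is large, which is one reason why sizeable differences can fail to reach significance when inter-subject variability is high. Note that the sketch does not handle the degenerate case in which all paired differences are identical (zero SEM).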
5.4.1.2 Results
Figure 68 compares mean mobility performance for the laboratory maze task with
each effective field of view, versus number of pixels in the viewing window.
Individual performances in each experimental condition were established on the
basis of 3 runs. With the 3 effective fields of view, mobility errors were
approximately 1 for pixelizations down to 498 pixels. With the 16.5° x 11.6° field of
view (red plots in fig. 68) error counts were statistically equivalent across all
pixelizations. With the 33° x 23.1° field of view (blue plots in fig. 68), error counts
remained around 1 down to 221 pixels. At 124 pixels values significantly increased to
2 (p = 0.04). With the 66° x 46.2° field of view (green plots in fig. 68), mobility
errors increased noticeably at 221 and 124 pixels. This increase was significant at
221 pixels (p = 0.04) but not at 124 pixels (p = 0.07) due to high inter-subject
variability. Interestingly, a learning effect could be observed at 1991 pixels with the 3
effective fields of view (significant at p = 0.01 for 33° x 23.1°), and at 498 pixels
with the 16.5° x 11.6° field of view.
Figure 68. Mobility performance versus number of pixels in the 10°x7° viewing window for 3 normal
subjects performing the ‘Laboratory maze’ task. Three effective fields of view projected in the 10°x7°
viewing window are compared in central vision: 16.5°x11.6° (red plot), 33°x23.1° (blue plot), and
66°x46.2° (green plot). (a) Mean number of errors per course ±SEM. (b) Mean time per course
expressed in s ±SEM. The solid black lines indicate mean performance results (±SEM) during control
sessions (normal viewing conditions).
Results for the mean time per course are similar (fig. 68b). With the 16.5° x 11.6° field of
view, average time was approximately 120 s and statistically equivalent across all
pixelization levels. With the 33° x 23.1° field of view, time per course remained
around 110 s down to 221 pixels. At 124 pixels, values slightly increased to 130 s (p
= 0.05). With the 66° x 46.2° field of view, mean time per course was approximately
110 s down to 1991 pixels. Values increased noticeably below this pixelization level
(p = 0.09 at 498 pixels; p < 0.05 at 221 and 124 pixels). A non-significant learning
effect could also be observed at 1991 pixels, for the 3 effective fields of view. This
tendency was, however, less pronounced than for mobility errors.
In normal viewing conditions (black solid lines in fig. 68), subjects made no
mobility errors and completed the course in approximately 30 s. Mobility errors were
only slightly higher in most of our experimental conditions (except for 221 and 124
pixels with the 66° x 46.2° field of view). In contrast, it took subjects 4 to 6 times
longer to complete the course in all experimental conditions than in normal viewing.
Performance data for this mobility task were normalized to values achieved at
17920 pixels, for each of the effective fields of view investigated. Mean results ±
SEM are plotted versus the effective resolution of the environmental space in figure
69. Data obtained with the different viewing angles were fitted together to single
exponential functions (solid black lines in fig. 69). Best fits for normalized errors and
for normalized time reveal that, for the ‘Laboratory maze’ task, best mobility
performance can be achieved with effective environmental resolutions down to
approximately 0.2 pixels/deg². Error counts and time required for completing the
course increased markedly below this level.
Figure 69. Normalized mobility performance versus effective resolution of the environmental space in
the 10°x7° viewing window for 3 normal subjects during the ‘Laboratory maze’ task. Three effective
fields of view projected in the 10°x7° viewing window are compared in central vision: 16.5°x11.6°
(red plot), 33°x23.1° (blue plot), and 66°x46.2° (green plot). (a) Mean normalized error counts
±SEM. (b) Mean normalized time ±SEM. The solid black lines indicate the best fit to the data.
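The effective resolution of the environmental space can be illustrated with a short calculation. The sketch below assumes that this quantity is simply the number of pixels in the viewing window divided by the area, in square degrees, of the effective field of view; this definition is an assumption made for illustration, and the function names are hypothetical.

```python
# Hypothetical sketch: effective resolution of the environmental space,
# assumed here to be pixels in the viewing window divided by the area
# (in square degrees) of the effective field of view.

FIELDS_OF_VIEW = [(16.5, 11.6), (33.0, 23.1), (66.0, 46.2)]   # degrees
PIXEL_LEVELS = [17920, 1991, 498, 221, 124]                    # pixels


def effective_resolution(n_pixels, fov_deg):
    """Pixels per square degree of environment."""
    width, height = fov_deg
    return n_pixels / (width * height)


def pixels_needed(resolution, fov_deg):
    """Pixels required to reach a given effective resolution."""
    width, height = fov_deg
    return resolution * width * height


if __name__ == "__main__":
    for fov in FIELDS_OF_VIEW:
        row = [f"{effective_resolution(n, fov):.2f}" for n in PIXEL_LEVELS]
        print(fov, row)
    print(round(pixels_needed(0.2, (33.0, 23.1))))
```

Under this assumption, an effective resolution of 0.2 pixels/deg² corresponds to roughly 150 pixels with the 33° x 23.1° field of view.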
5.4.2 Experiment 7: Random forest
This task was designed to assess mobility performance in randomized,
unpredictable indoor environments including some dynamic elements. The task
consisted in walking through an ‘artificial forest’, from a random starting position (A
in fig. 70a) to a random end position (B in fig. 70a). The forest measured 16 x 8 m,
and was composed of 52 randomly positioned obstacles or ‘trees’. Fifty of these trees
were white and the remaining 2 were black. Subjects had to avoid contacting the
white trees, but were requested to localize and touch the black trees as they passed
through the forest. In addition, a variable number of persons (2, 1, or none; marked
as arrows in fig. 70a) could cross the forest while subjects were performing the task,
and subjects had to avoid bumping into them. Figure 70b shows a picture of one of
the subjects wearing the experimental setup while performing the task.
Figure 70. The ‘Random forest’ task. (a) Scheme of one of the forest configurations used in the
experiments. The task consisted in passing through a random course made of 52 obstacles (trees)
from point A to point B. Please note that the gridlines are only schematic and were not visible to
subjects during the task. (b) One of the subjects wearing the mobile setup during the task.
5.4.2.1 Experimental protocol
Six subjects (LC, female, 22 years old; FM, male, 23 years old; CU, female, 25
years old; XS, male, 25 years old; JS, male, 45 years old; and FS, female, 49 years
old) participated in this experiment. Before starting the experiment, all subjects
performed 3 control runs in normal viewing conditions. These measures constituted
baseline measures for ‘normal’ mobility performance.
Performance was measured with viewing windows presented at 5 pixelization
levels (17920, 1991, 498, 221, and 124 pixels) and 3 effective fields of view (16.5° x
11.6°, 33° x 23.1°, and 66° x 46.2°). Three successive runs were performed per
experimental condition (a given effective field of view and pixelization level). The
order in which each effective field of view was presented to each subject was
permuted using a Latin square (see table 4). With each field of view, subjects started
with the easiest (highest) pixelization level and progressed towards the most difficult
(lowest) one. Once all pixelization levels for a given field of view were completed,
subjects progressed to the following one. Possible global learning effects favoring a
particular effective field of view would therefore be minimized, but learning would
still favor performance at low pixelization levels.
Table 4. Testing sequences (Latin square permutation) of the effective fields of view during
experiment 7.
Sequence    LC               FM               CU
1           66° x 46.2°      33° x 23.1°      16.5° x 11.6°
2           16.5° x 11.6°    66° x 46.2°      33° x 23.1°
3           33° x 23.1°      16.5° x 11.6°    66° x 46.2°
Sequence    XS               JS               FS
1           33° x 23.1°      16.5° x 11.6°    66° x 46.2°
2           16.5° x 11.6°    66° x 46.2°      33° x 23.1°
3           66° x 46.2°      33° x 23.1°      16.5° x 11.6°
Mobility performance was measured as total time required for crossing the forest
(going from point A to point B, passing by the black trees) and total number of errors
per course. We considered as errors: touching a white tree, missing one of the black
trees, bumping into a crosser, stepping out of the marked path (black perimeter in
fig. 70a), and missing the arrival point (B).
Results were calculated as the mean of the cumulative data of each subject ±
SEM. Statistically significant differences were determined using standard (paired) t
tests with a significance level of 0.05.
5.4.2.2 Results
Figure 71 shows mobility performance results for the ‘Random forest’ task,
analyzed against the number of pixels available in the viewing window, for each of
the 3 effective fields of view investigated. Down to 1991 pixels, approximately 1
error was made with all fields of view. With the 16.5° x 11.6° and the 33° x
23.1° fields of view (respectively red and blue plots in fig. 71), average errors
increased significantly to 2 at 498 pixels and below (p < 0.05). With the 66° x 46.2°
field of view (green plots in fig. 71) mobility errors increased to about 2 at 221 pixels
and to about 5 at 124 pixels. These important increases were, however, not
significant (p > 0.05) due to the high variability of the results. A slight learning effect
could be observed at 1991 pixels for the 16.5° x 11.6° (p > 0.05) and the 66° x
46.2° (p = 0.01) fields of view.
Analysis of the mean time per course (fig. 71b) reveals that all the effective fields
of view were similarly affected by the number of pixels in the viewing window.
Furthermore, the 33° x 23.1° field of view yielded the fastest times per course at all
pixelization levels. With the 16.5° x 11.6° field of view, significant slow-downs were
observed at 221 and 124 pixels (respectively, p = 0.02 and p = 0.001). With the 33°
x 23.1° field of view, time required to complete the course increased significantly at
498 pixels already (p = 0.006). Significant slow-downs were also observed at 221
and 124 pixels (respectively, p = 0.01 and p = 0.005). Results for the 66° x 46.2°
field of view increased significantly at 498, 221, and 124 pixels (respectively: p =
0.0002, p = 0.02, and p = 0.02).
Subjects made approximately 1 error with the 3 fields of view at 17920 and 1991
pixels, while no errors were made in normal viewing conditions (black dashed
lines in fig. 71). This difference increased to 2 - 5 errors at 498 pixels and below. At
the highest pixelization levels, mean time required to complete the course was
already 7 - 9 times longer than that required in normal viewing conditions (~ 20 s).
Figure 71. Mobility performance versus number of pixels in the 10°x7° viewing window for 6 normal
subjects performing the ‘Random forest’ task. Three effective fields of view projected in the 10°x7°
viewing window are compared in central vision: 16.5°x11.6° (red plot), 33°x23.1° (blue plot), and
66°x46.2° (green plot). (a) Mean number of errors per course ±SEM. (b) Mean time required to cross
the forest, expressed in s ±SEM. The solid black lines indicate mean performance results (±SEM)
during control sessions (normal viewing conditions).
Performance data for this mobility task were normalized to values achieved at the
highest resolution tested (17920 pixels), for each of the effective fields of view
investigated. Mean results ± SEM are plotted versus the effective resolution of the
environmental space in figure 72. Surprisingly, this figure reveals that mobility
performance for this task was very different from one condition to the other.
Therefore, results could not be fitted to an exponential function.
Figure 72. Normalized mobility performance versus effective resolution of the environmental space in
the 10°x7° viewing window for 6 normal subjects during the ‘Random forest’ task. Three effective
fields of view projected in the 10°x7° viewing window are compared in central vision: 16.5°x11.6°
(red plot), 33°x23.1° (blue plot), and 66°x46.2° (green plot). (a) Mean normalized error count ±SEM.
(b) Mean normalized time ±SEM.
5.4.3 Experiment 8: Real street crossing
This task was intended to assess the visual requirements for mobility in a real-life,
dynamic environment. We evaluated the subjects’ capacity to estimate the speed and
distance of approaching objects (cars). The task consisted in judging the possibility
of crossing a medium-traffic street (without actually crossing it, for obvious safety
reasons). Subjects stood on the street-side and, at the signal of the experimenter (footnote 67),
had to observe the traffic and estimate when they would be able to cross safely. An
experimental run consisted in one street-crossing estimate. After each run subjects
were requested to estimate, on a 0 to 5 scale, the difficulty of the task in the
particular experimental condition (0 easy; 5 impossible) and how safe they felt about
their estimation (0 completely safe; 5 not safe at all). In addition, we asked 2
subjects (FG and FL) to also estimate the extent to which they used hearing to
accomplish the task (0 no use; 5 only used hearing since not enough visual
information). Figure 73 shows a picture of one of the subjects during the task.
Footnote 67. The experimenter gave the “start” signal after a first vehicle passed in front of the subject.
We performed pilot experiments to roughly explore the visual requirements of the
task and adapt our experimental protocol accordingly. This evaluation revealed that
the lowest pixelization level tested in the previous experiments (124 pixels) was too
difficult for this task, even when small fields of view were used. We therefore decided
not to include it in this experiment. In addition, our pilot experiments suggested that
the minimum information requirements for this task could lie somewhere between
1991 and 498 pixels. Consequently, we decided to insert 2 additional pixelization
levels in this range. Finally, the visual span requirements for the task appeared to be
very variable. We therefore decided to investigate the effect of the 4 fields of view
that we are able to simulate with our mobile setup.
Figure 73. One of the subjects wearing the mobile setup during the ‘Real street crossing’ task.
5.4.3.1 Experimental protocol
Four subjects (LC, female, 22 years old; XS, male, 25 years old; FG, female, 23
years old; and FL, male, 25 years old) participated in the experiments. Performance
was measured with viewing windows presented at 6 pixelization levels (17920, 1991,
1120, 717, 498, and 221 pixels) and 4 effective fields of view (8.25° x 5.8°, 16.5° x 11.6°,
33° x 23.1°, and 66° x 46.2°). Three successive runs were performed per
experimental condition (a given effective field of view and pixelization level). The
order in which each effective field of view was presented to each subject was
permuted using a Latin square (see table 5). With each field of view, subjects started
with the easiest (highest) pixelization level and progressed towards the most difficult
(lowest) one. Once all pixelization levels for a given field of view were completed,
subjects progressed to the next one. Possible global learning effects favoring a
particular effective field of view would therefore be minimized, but learning would
still favor performance at low pixelization levels.
Mobility performance was measured in terms of the difficulty, safety, and hearing
indexes reported by subjects after each trial. Difficulty and safety results
were calculated as the mean of the cumulative data of each subject ± SEM.
Statistically significant differences were determined using standard (paired) t tests
with a significance level of 0.05. No statistics were performed on hearing index
results because these were only available for 2 subjects, a sample size too small
to allow valid statistical analyses.
Table 5. Testing sequences (Latin square permutation) of the effective fields of view during
experiment 8.
Sequence    LC               XS               FL               FG
1           8.25° x 5.8°     66° x 46.2°      33° x 23.1°      16.5° x 11.6°
2           16.5° x 11.6°    8.25° x 5.8°     66° x 46.2°      33° x 23.1°
3           33° x 23.1°      16.5° x 11.6°    8.25° x 5.8°     66° x 46.2°
4           66° x 46.2°      33° x 23.1°      16.5° x 11.6°    8.25° x 5.8°
5.4.3.2 Results
Results of the analysis of mobility performance with each effective visual field,
versus the number of pixels available in the viewing window are presented in figure
74. Difficulty estimates at the highest pixelization (17920 pixels) were below 1, and
statistically equivalent for the 4 effective fields of view. Perceived task difficulty
increased as fewer pixels were available in the viewing window. With the 8.25° x
5.8° field of view (red plots in fig. 74), values were statistically equivalent down to
498 pixels. At 221 pixels, difficulty estimates increased significantly (p = 0.005). With
the 16.5° x 11.6° and 33° x 23.1° fields of view (respectively, blue and green plots in
fig. 74), difficulty estimates increased significantly at 1120 pixels (p = 0.002). Values
remained roughly stable at 717 pixels, and then significantly increased at 498 and
221 pixels (p < 0.05). With the 66° x 46.2° field of view (yellow plots in fig. 74),
difficulty estimates increased significantly at 1991 pixels already (p = 0.01). This
significant increase persisted at the remaining lower pixelization levels (p = 0.006 at
1120 pixels, p = 0.03 at 717 pixels, p = 0.003 at 498 pixels, and p = 0.03 at 221
pixels).
Safety estimates (fig. 74b) yielded similar results. At maximum target resolution,
subjects judged their crossing estimates as safe (values below 1) with the 4 effective
fields of view tested. The feeling of safety decreased as fewer pixels were available
in the viewing window. With the 8.25° x 5.8° field of view, results were statistically
equivalent down to a target resolution of 498 pixels. At 221 pixels, a significant (p <
0.05) increment (a decrease in the feeling of safety) was observed. With the medium
fields of view (16.5° x 11.6° and 33° x 23.1°) safety indexes increased significantly
at 717, 498, and 221 pixels (p < 0.05). Safety estimates for the 66° x 46.2° field of
view increased significantly at 1120 pixels and below (p < 0.05).
Estimations of the use of hearing (fig. 74c) were also very sensitive to pixelization
level. With the 8.25° x 5.8° field of view, hearing was used to accomplish the task
(values ≥ 2) for pixelization levels of 717 and 221 pixels, but surprisingly not for 498
pixels. With the 16.5° x 11.6° and 33° x 23.1° fields of view, hearing seemed to be
required for target resolutions of 1120 pixels and below. Finally, with the 66° x 46.2°
field of view, hearing appeared to be necessary at 1991 pixels already. Values
successively increased as fewer pixels were available in the viewing window.
Figure 74. Mobility performance versus number of pixels in the 10°x7° viewing window for 4 normal
subjects performing the ‘Real street crossing’ task. Four effective fields of view projected in the
10°x7° viewing window are compared in central vision: 8.25° x 5.8° (red plot); 16.5°x11.6° (blue
plot), 33°x23.1° (green plot), and 66°x46.2° (yellow plot). (a) Mean task difficulty estimates ±SEM.
(b) Mean safety appraisal of crossing estimates ±SEM; (c) Mean hearing index (results for only 2
subjects).
Figure 75 presents the same results but plotted versus the effective resolution of
the environmental space. Interestingly, 2 clear tendencies can be observed. With the
2 smallest fields of view (8.25° x 5.8° and 16.5° x 11.6°; respectively red and blue
plots in fig. 75), difficulty and safety estimates increased significantly (p < 0.05) at
effective resolutions below 10 pixels/deg². With the 2 largest fields of view (33° x
23.1° and 66° x 46.2°; respectively green and yellow plots in fig. 75), significant (p <
0.05) increments could be observed for effective resolutions below 2 pixels/deg².
Estimates for the use of hearing appear to follow the same trend.
Figure 75. Normalized mobility performance versus effective resolution of the environmental space in
the 10°x7° viewing window for 4 normal subjects performing the ‘Real street crossing’ task. Four
effective fields of view projected in the 10°x7° viewing window are compared in central vision: 8.25° x
5.8° (red plot); 16.5°x11.6° (blue plot), 33°x23.1° (green plot), and 66°x46.2° (yellow plot). (a) Mean
task difficulty estimates ±SEM. (b) Mean safety appraisal of crossing estimates ±SEM; (c) Mean
hearing index (results for only 2 subjects).
5.4.4 Summary of the results of these experiments
Altogether, results from experiments 8, 9, and 10 confirm that minimum
information requirements appear to be closely linked to the type of environment on
which mobility tasks have to be performed. Mobility in well-known, indoor
environments required relatively little visual information: approximately 0.2
pixels/deg² (e.g. ~ 150 pixels with the 33° x 23.1° field of view) emerge as the
fundamental limit for mobility performance in such environments. Large fields of view
did not seem to be of particular advantage in these settings.
Mobility tasks in less predictable environments incorporating some dynamic
elements, such as that of the ‘Random Forest’ task, appear to be more sensitive to the
number of pixels available on the viewing window. In these settings, minimum
information requirements lay around 498 pixels. Similar performance could be
achieved with all fields of view; however, the 33° x 23.1° field of view tended to yield
the best mobility performance.
Finally, 498 pixels appear to be insufficient for subjects to feel safe while
performing mobility tasks in unknown, dynamic environments, such as that of the
‘Real street crossing’ task. As fewer pixels were available in the viewing window,
subjects needed to compensate with additional information sources (i.e. hearing).
Approximately 1000 pixels and smaller fields of view providing more detailed visual
information seem to be advantageous in these settings. However, it is important to
highlight that subjects almost never attempted to cross the street in dangerous
situations.
5.5 Habituation experiments on mobility
This set of experiments was dedicated to explore whether naïve subjects could
learn to perform whole-body mobility tasks in eccentric vision. According to the
previous experiments, a field of view of 33° x 23.1° seems to be the best
compromise between a large enough field of view while still maintaining reasonable
image resolution. Subjects also spontaneously reported preferring this particular field
of view to the others. This field of view was thus chosen for the second set of
experiments. According to the results obtained in the 1st set of mobility experiments,
and to be consistent with our previous experiments on reading, a resolution of 498
pixels in the viewing window was judged to be the most adequate for learning the
task in eccentric vision (effective resolution of 0.65 pixels/deg²).
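As a quick arithmetic check, the effective resolution values quoted in this chapter can be reproduced by dividing the number of pixels in the viewing window by the angular area of the effective field of view. The short Python sketch below assumes exactly this definition; the function name is ours and purely illustrative.

```python
def effective_resolution(n_pixels, fov_width_deg, fov_height_deg):
    """Pixels per square degree of environmental space.

    Assumption: effective resolution = pixels in the viewing window
    divided by the area (in deg^2) of the effective field of view.
    """
    return n_pixels / (fov_width_deg * fov_height_deg)


# Values quoted in the text for the 33 deg x 23.1 deg field of view
print(round(effective_resolution(498, 33.0, 23.1), 2))  # 0.65
print(round(effective_resolution(150, 33.0, 23.1), 2))  # 0.2
```

Under this assumption, 498 pixels spread over 33° x 23.1° give about 0.65 pixels/deg², and 150 pixels give about 0.2 pixels/deg², in agreement with the figures reported above.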
We used the ‘Laboratory Maze’ task described in the previous section (see fig.
79). Two successive experiments were conducted. First, in a preparatory experiment,
subjects were asked to perform the mobility task using a viewing window stabilized
on the fovea. Second, in experiment 9, the subjects’ ability to perform mobility tasks
with eccentric vision (using a viewing window stabilized at 15° eccentricity) was
investigated. Experimental sessions were repeated daily. Using central vision,
performance asymptoted within 10 sessions. A lengthier learning process was
observed when using eccentric vision: 35 to 45 sessions were necessary for mobility
performance to asymptote.
5.5.1 Experimental protocol
Three subjects (MS, female, 27 years old; HB, male, 31 years old; KC, male, 37
years old) participated in the experiments. All of them were naïve to eccentric
viewing and were tested monocularly using their dominant eye. Possible learning
effects were investigated by repeatedly performing experimental sessions for a
period of more than 1 month. In general, 2 to 3 periods of testing were conducted
each working day of the week (5 days per week). The duration of each experimental
session was variable throughout the experiment. Each experimental session
consisted, first, of a run in which mobility performance was measured on a random
indoor obstacle course. Then, subjects were allowed to practice by completing as
many successive courses (the obstacle order was not changed and performance was
not measured) as possible to complete 30 minutes of testing. This had to be done
since, after only a few sessions, subjects performed the task very rapidly (less than 2
minutes). Therefore, an experimental session consisting only of the measured run
would have represented very short periods of eccentric viewing. The criterion used to
stop the experiments was the stabilization of the time required to complete an entire
course.
Mobility performance results are presented as previously (total time required for
completing the course and total number of errors per course). Oculomotor
adaptation to eccentric viewing was assessed by calculating the cumulative distance
of the subjects’ eye movements, for each experimental session. This analysis of
oculomotor behavior is presented in Appendix E.
Learning curves were computed using the exponential functions presented in
Chapter 2. Significant learning effects were determined using simple linear
correlation (Pearson’s correlation).
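The analysis described above can be sketched in a few lines of Python. The exponential form used here (a start value decaying towards a plateau with a time constant expressed in sessions) is only an assumption for illustration; the functions actually used are those defined in Chapter 2. The data are synthetic, generated from the assumed curve, and are not the measured results. Note that with decreasing completion times the raw coefficient is negative; its magnitude is what indicates the strength of the learning trend.

```python
import math


def pearson_r(xs, ys):
    """Pearson's linear correlation coefficient between two sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def exponential_learning_curve(session, start, plateau, tau):
    """Hypothetical learning curve: decays from `start` to `plateau`."""
    return plateau + (start - plateau) * math.exp(-session / tau)


# Synthetic illustration only: 10 sessions, times in seconds
sessions = list(range(1, 11))
times = [exponential_learning_curve(s, 360.0, 110.0, 3.0) for s in sessions]
r = pearson_r(sessions, times)
print(r < 0)  # True: completion time decreases with session number
```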
5.5.2 Preparatory experiment: Learning in central vision
In this experiment, several sessions were conducted using a viewing window
stabilized in central vision. It lasted until subjects became familiar with the task
(performing the ‘Laboratory Maze’ task while exploring the environment using a small
viewing window). Time required to complete the course stabilized within 10 sessions.
Figure 76 presents performance in central vision versus session number.
Relatively high error counts were observed in the first sessions. However, values
decreased rapidly and stabilized after only 4 - 10 sessions. Significant learning effects
were observed in the analysis of error counts versus time only for subject MS
(Pearson’s correlation: r = 0.66, p < 0.05). In the case of subjects HB and KC the
reduction in the total number of errors per course manifested as a clear tendency
(respectively, Pearson’s correlation: r = 0.63, p = 0.05 and r = 0.55, p = 0.08). Time
required for completing a course significantly improved with time for all three
subjects (Pearson’s correlation: r = 0.67, p < 0.05 for HB; r = 0.91, p < 0.001 for
MS; and r = 0.77, p < 0.01 for KC). Analysis of the experimental data revealed that
the average time per course stabilized within 10 sessions, diminishing from 360 to
110 s for HB, from 420 to 140 s for MS, and from 470 to 110 s for KC.
Figure 76. Mobility performance versus session number obtained for 3 normal subjects performing
the ‘Laboratory maze’ task in central vision. Experimental conditions: 10° x 7° viewing window
containing 498 pixels and subtending a 33° x 23.1° field of view. Results expressed as: (a) total
number of errors per course; and (b) total time required to complete the course expressed in s. The
solid lines indicate the best fits to the data.
These data clearly demonstrate that adaptation to the mobility task in central
vision is very rapid. However, even after training, time required to complete the
course was still about 4 times slower than normal (approximately 30 s; see fig. 68b).
This is essentially due to the increased difficulty of scanning the environment using a
restricted viewing window.
5.5.3 Experiment 9: Learning in eccentric vision
This experiment was dedicated to explore whether normal subjects could adapt
to the unusual activity of performing mobility tasks using a small viewing window,
containing pixelized fragments of the environment, and stabilized at 15° eccentricity
in the lower field of view. In this case, average time per course stabilized after 35 - 45 sessions.
Figure 77 presents mobility performance in eccentric vision versus session
number. In the first sessions, 3 - 4 errors were observed for subjects MS and KC.
Subject HB made 10 - 14 errors. For all subjects, values decreased and
stabilized within 12 sessions. Significant learning effects were observed in the
analysis of error counts versus time for subjects HB and MS (respectively, Pearson’s
correlation: r = 0.44, p < 0.005 and r = 0.45, p < 0.005). Results for subject KC
were very variable all through the training period, therefore no significant learning
effect could be noted. Improvements in the time required to complete the course
were highly statistically significant for all 3 subjects (Pearson’s correlation: r = 0.81,
p < 0.0001 for HB; r = 0.85, p < 0.0001 for MS; and r = 0.81, p < 0.0001 for KC).
Figure 77. Mobility performance versus session number obtained for 3 normal subjects performing
the ‘Laboratory maze’ task in eccentric vision (15° in the lower visual field; 10° x 7° viewing window
containing 498 pixels and subtending a 33° x 23.1° field of view). Results expressed as: (a) total
number of errors per course; and (b) total time required to complete the course expressed in s. The
solid lines indicate the best fits to the data.
Analysis of the experimental data revealed that the average time per course
stabilized within 33 sessions, diminishing from approximately 400 to 75 s for HB,
from 300 to 70 s for MS, and from 180 to 60 s for KC. Interestingly, for all 3 subjects
temporal performance achieved at the end of this experiment was faster than that
reached at the end of the previous experiment in central vision. However, it was still
about 2 times slower than that obtained in normal viewing conditions (compare figs.
68b and 77b).
Taken together, results from experiment 9 demonstrate that an important
learning process occurred for mobility with eccentric vision. The evolution was similar
in all subjects. Error counts decreased and stabilized rather rapidly. Time
improvements were more gradual and impressive. After training, subjects could
achieve the task even more rapidly than in central vision.
5.6 Discussion
The first goal of the experiments presented in this chapter was to explore the
minimum requirements for useful mobility. We assessed the influence of the
particular visual conditions that will most probably result from the use of subretinal
prosthetic devices transforming light into electric currents ‘in-situ’. Essentially, retinal
implants might affect whole-body mobility abilities since such tasks are known to
require wide fields of view.
Relatively little visual information is needed for mobility in well-known
environments: an effective resolution above 0.2 pixels/deg² (i.e. 150 pixels with the
33° x 23.1° field of view) seems to be necessary for useful (accurate but slow)
performance in such environments. Subjects can cope with severely fragmented
information as soon as they recognize a familiar obstacle. Our results also
demonstrate that mobility in unpredictable environments including dynamic elements
is more demanding in terms of visual information requirements than mobility in
highly predictable static environments. At least 498 pixels seemed to be necessary in
this case, irrespective of the field of view used. Medium fields of view (around 33° x
23.1°) tended to yield the best performance. In unpredictable environments
including moving, eventually hazardous objects, higher image resolutions (~ 1000
pixels) are needed for subjects to feel safe. Furthermore, other sources of
information (such as hearing) seem to be useful for compensating the lack of visual
information. Blind subjects and low vision patients commonly use such compensation
strategies to enhance their mobility performance (Rieser et al., 1992; Hill et al.,
1993).
These results are in agreement with every-day clinical observations in low vision
patients. These patients (including those with reduced visual fields) behave
surprisingly well in known environments. At home they can do almost everything,
except for tasks demanding an important amount of visual information (like reading
or watching TV). In contrast, these same patients are greatly handicapped when
moving in unknown environments or if something unexpected happens. Indeed, both
RP and glaucoma patients report having much more problems moving in unfamiliar
than in familiar environments (Nelson et al., 1999; Turano et al., 1999).
Why do visual requirements for whole-body mobility differ in diverse settings?
When moving in well-known environments, subjects can compensate for the lack of
visual information with a series of visual and motor strategies to deal with difficulties
in egocentric and exocentric distance estimations (Turano et al., 2001; Apfelbaum et
al., 2006). In addition, tacit and explicit levels of knowledge, acquired progressively
by subjects with practice, lead to performance that relies less on visual information
(Berthoz, 1997). For example, after a certain period of practice, subjects only have to
apply the motor sequences they constructed for a certain obstacle, and associate this
strategy to their mental map of the environment. Furthermore, an unknown
environment might be scattered not only with static, but also dynamic elements
(people or vehicles moving). Anticipation of forthcoming events implies observation
and extraction of a great amount of central and peripheral visual information
(Warren & Kurtz, 1992; Chanderli, 2002), which can take a considerable amount of
time if the visual field is of 20° or less (Turano et al., 2001). Moreover, depending on
the situation, more or less visual anticipation and shorter or longer reaction times are
necessary.
These observations also contribute to explaining the learning effects (i.e.
tendency towards fewer errors, faster times per course) observed for the ‘Laboratory
maze’ task at high pixelization levels (see fig. 68). A similar but slighter effect was
visible for the ‘Random forest’ task (see fig. 70). This could reflect habituation to the
experimental conditions and increased familiarization with the obstacle courses.
However, learning effects were not observed for the ‘Real street crossing’ task. This
suggests that learning consisted mainly of familiarization with the indoor tasks,
as suggested by Berthoz (1997). This was more difficult for the less predictable
‘Random forest’ task than for the highly predictable ‘Laboratory maze’ task.
Familiarization with the task seemed to be impossible in the unpredictable and
dynamic environment of the ‘Real street crossing’ task.
Another interesting issue arises from the observation that mobility performance
for the ‘Random forest’ task was very different from one visual condition to the other
(see fig. 72). The strategy used by subjects to explore the environment was almost
the same regardless of the size of the effective field of view available. In this case, their
strategy seemed to be mainly influenced by the number of pixels available in the
viewing window (see also fig. 71). This striking difference with the other tasks is
probably due to the spatial configuration of the environment. The ‘Random forest’
was cluttered with obstacles that were quite close to each other (1 m of separation
between neighboring trees). This probably forced subjects to deal primarily with their
near-environment, not allowing them to fully exploit the advantage of larger fields of
view.
It is also worth mentioning that although subjects seem to require relatively high
amounts of information to feel safe in dynamic and unknown environments (such as
that of our ‘Real street crossing’ task), they rarely put themselves in dangerous
situations (e.g. attempt to cross the street when a car was too close or when a car
was approaching too fast). This was true even when very little visual information was
available in the viewing window. In general, subjects compensated for the lack of
visual information with audition: they waited until they could not hear any car
approaching to attempt the crossing. It seems thus that, at least in this particular
setting, auditory information was enough to complete the task safely most of the
time. Furthermore, as already pointed out by Pelli (1987), subjects seem to be quite
inaccurate when judging the danger of a particular situation. However, this issue
must be taken with caution, because, on one hand, our results only allow for a very
qualitative evaluation of the role of audition for achieving the task. Besides, the
experiments were carried out in young subjects (22 to 24 years old), who
presumably had very good auditory function. Obviously, auditory cues might be less
useful to older patients, who might present hearing deficits. On the other hand,
even one single misjudgement in a task like this one could have terrible
consequences. Therefore, it is very important that subjects feel safe and sure of their
estimations.
The second goal of the experiments presented in this chapter was to explore
whether mobility tasks could be efficiently performed with a restricted viewing
window stabilized at 15° eccentricity in the lower visual field. Error counts for an
indoor course (familiar environment) asymptoted rapidly, within 10 sessions.
Improvements in the time required to complete the course were more gradual and
better depicted the learning process: around 33 sessions were required in this case.
Surprisingly, after training, the mobility task was performed more rapidly in eccentric
vision than in central vision in similar experimental conditions (p < 0.05; compare
figs. 76b and 77b).
The fact that subjects could learn to perform the mobility task in such unnatural
viewing conditions is not surprising in light of the similar findings reported in the
previous chapters. In addition, several studies have systematically studied the effect
of mobility/orientation training on low vision patients, demonstrating significant
improvements with time (Geruschat & Del'Aune, 1989; Straw & Harley, 1991; Kuyk
et al., 2004). During training in central vision, mobility performance improved
because subjects adapted to performing the task in our particular experimental
conditions, but also because subjects became familiar with the obstacles and the course.
However, such a ‘familiarization’ effect did not influence the results of our eccentric
viewing experiments. These began only after mobility performance in central vision
asymptoted. We can consider, thus, that once subjects started the experiments in
eccentric vision, familiarization with the obstacle course was complete. Hence, we
can consider that in experiment 9 the learning process mainly consisted in the
adaptation to eccentric viewing.
5.7 Conclusion
These data clearly demonstrate that useful mobility performance can be obtained
under conditions mimicking artificial vision in the central visual field. This means that
all necessary information could be transmitted and captured by the visual system.
Similar to visuomotor tasks, mobility performance in our particular experimental
conditions (stabilized 10° x 7° viewing window subtending an effective field of view
of 33° x 23.1° and containing 498 pixels) is noticeably slower than in normal viewing
conditions. This outcome is mainly due to the increased difficulty of exploring the
environment using a restricted viewing window. Our results also indicate that useful
mobility performance can be achieved even if retinal implants have to be placed in
the periphery of the visual field.
At this point, we estimate that about 500-600 distinctly perceived phosphenes,
distributed on a 3 x 2 mm² implant, remain the minimum criterion to achieve useful
reading, visuomotor coordination, and whole-body mobility. An image resolution of
1000-2000 pixels would, however, greatly improve the feeling of security for mobility
in unknown, dynamic environments.
5.8 Publications resulting from this research
Pérez Fornos, A., Sommerhalder, J., Chanderli, K., Pittard, A., Baumberger, B., Fluckiger, M., Safran,
A.B., & Pelizzone, M. (2004). Minimum requirements for mobility in known environments and
perceptual learning of this task in eccentric vision. ARVO Meeting Abstracts, 45, 5445 (abstract).
Sommerhalder, J., Pérez Fornos, A., Chanderli, K., Colin, L., Schaer, X., Mauler, F., Safran, A.B., &
Pelizzone, M. (2006). Minimum requirements for mobility in unpredictable environments. ARVO
Meeting Abstracts, 47, 3204 (abstract).
Sommerhalder, J., Pérez Fornos, A., Chanderli, K., Pittard, A., Safran, A.B., & Pelizzone, M. (2006).
Minimum requirements for useful mobility and learning of such tasks in eccentric vision. In
preparation.
6 Towards Better Simulations of Artificial Vision
Imagination and fiction make up more than three quarters of our real
life.
Simone Weil (1909 - 1943)
6.1 Foreword
The psychophysical studies presented in the previous chapters used different
simplified techniques to simulate the limited number of discrete stimulation contacts
available in a prosthesis. Essentially, stimulation images were decomposed into a
finite number of pixels with a simple block-averaging algorithm (refer to Chapter 2
for a detailed description of this algorithm). This resulted in an image composed of a
mosaic of square pixels of various gray levels, the gray level within each pixel being
constant (square pixelization). However, according to electrophysiological research in
the field (Weiland et al., 1999; Stett et al., 2000; Rizzo et al., 2003b; Lecchi et al.,
2006):
• The patterns of neural activity elicited by electric stimulation of the retina depend
  on the strength of the stimulation current.
• Neural activation diminishes progressively as the distance between the electrode
  contact and the neural target increases.
• Spatially selective activity could be obtained for contact spacings of around 100
  µm, when using adequate stimulation parameters⁶⁸.
These facts imply that phosphenes elicited by electrical stimulation of the retina
should not be of constant luminosity, but brighter in the center, and of course, not of
‘square’ shape. Furthermore, depending on the strength of the stimulation current,
the percepts might develop from a collection of isolated phosphenes towards more
continuous patterns with different degrees of overlap across neighboring
phosphenes. More realistic simulations of artificial vision are, therefore, required to
explore the impact that image processing simplifications had in the results reported
previously. In order to validate our previous studies as well as to improve our
simulation methods for future studies, we decided to specifically investigate the
influence of the spatial and temporal characteristics of stimulus pixelization on
reading performance.
68 The fact that spatially selective activation can be achieved with inter-electrode spacings of about
100 µm is very encouraging. It is consistent with the image resolutions required for useful function,
according to our previous experiments. However, it should still be confirmed if such resolutions could
actually be realized in chronic implants for human use, while respecting safe charge density limits
(Brummer et al., 1983).
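To make the argument of this foreword concrete, the following Python sketch renders a single row of phosphenes as Gaussian luminance profiles that are brightest at their center and summed where they overlap. It is only an illustration under our own assumptions: the function names, the one-dimensional simplification, and the use of the Gaussian width `sigma` (in units of the contact spacing) as the overlap parameter are hypothetical and do not reproduce the algorithm detailed in Chapter 2.

```python
import math


def gaussian_profile(x, center, sigma, level):
    """Luminance of one phosphene at position x (peak `level` at center)."""
    return level * math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def render_row(levels, spacing, sigma, samples_per_pixel=10):
    """Sum Gaussian phosphenes placed `spacing` apart along one row.

    `levels` holds the gray level of each phosphene; the result is the
    luminance sampled at `samples_per_pixel` points per phosphene.
    """
    n_samples = len(levels) * samples_per_pixel
    row = []
    for i in range(n_samples):
        x = (i + 0.5) * spacing / samples_per_pixel
        row.append(sum(gaussian_profile(x, (k + 0.5) * spacing, sigma, lv)
                       for k, lv in enumerate(levels)))
    return row


# Two equally bright neighboring phosphenes, unit spacing
narrow = render_row([1.0, 1.0], spacing=1.0, sigma=0.15)  # isolated spots
wide = render_row([1.0, 1.0], spacing=1.0, sigma=0.5)     # strong overlap

# Index 4 lies near the first phosphene center, index 9 near the gap
print(narrow[9] < 0.1 * narrow[4])  # True: dark gap between phosphenes
print(wide[9] > 0.9 * wide[4])      # True: continuous pattern
```

With a narrow width the percept consists of isolated spots separated by dark gaps, whereas a larger width yields a more continuous pattern, which is the transition described in the text.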
IMPROVED SIMULATIONS OF ARTIFICIAL VISION
6.2 Introduction
Square pixelization could be considered an adequate method to simulate the
reduced information content of the stimuli transmitted by a retinal implant. In a
given condition, the detailed shape of each pixel does not alter the overall
information content of the image. However, a number of studies on face recognition
have demonstrated that detection is considerably hampered when images are
decomposed into uniform, square pixels. Harmon & Julesz (1973) suggested that the
oriented high frequency noise introduced at block borders masked image features
essential for recognition. Yet, this explanation does not completely account for the
significant performance decrease observed in a number of studies where this image
processing technique was used (Costen et al., 1994; Uttal et al., 1995a; Uttal et al.,
1995b; Bachmann & Kahusk, 1997). Gestalt psychologists (Bachmann, 1991; Uttal et
al., 1997) further proposed that square pixelization distorts the image to the point of
modifying its intrinsic gestalt properties⁶⁹. Bachmann & Kahusk (1997) also suggest a
complementary hypothesis: the ‘block’ constituents or pixels of the processed image
compete for attention with the particular features of the image, thus affecting
recognition. If one wants more accurate simulations of artificial vision, square
pixelization should be replaced by other types of image processing featuring softer
borders and allowing for variable amounts of overlap.
In addition, in our studies exploring the reading task (Chapter 3) another
potential flaw can be identified: the pixelization algorithm was applied off-line over
the entire original image (e.g. seven lines of full-page text), and this pre-processed
image was used for the experiments. Subjects were allowed to scan this image
through a viewing window containing a subset of pixels, the gray level of these
‘frozen’ pixels being independent of the point of gaze on the image. This will not be
the case in artificial vision systems, since stimulation intensity at each electrode
contact will depend on the exact point of gaze relative to the image observed. For
retinal implants transforming light falling on the retina into stimulation currents ‘in
situ’ (Zrenner, 2002b; Chow et al., 2003; Ziegler et al., 2004), this will happen due to
eye movements. Head movements will act similarly in systems using an external
head-mounted camera for stimulus generation (Rizzo & Wyatt, 1997; Normann et al.,
1999; Dobelle, 2000; Humayun et al., 2003; Veraart et al., 2003). In the case of
reading, when focusing on a string of a few characters, its appearance will change
upon small eye (or camera) movements. Temporal cues seem to play a significant
role in visual perception: the human visual system is optimized for detecting
structural changes in dynamic images. A dynamic sequence of slightly different
pixelized images might contain more information than one frozen pixelized image.
Therefore, dynamic (real-time) pixelization is likely to enhance information
transmission to the visual system. Major object identification features (such as shape
or location) are extracted from different spatial patterns (such as local contrast
changes or relative position changes) resulting from image motion. Improved
sensitivity for moving contrast changes, compared to their static equivalents, has
previously been demonstrated (Lappin et al., 2002). Moreover, it has already been
established that dynamic presentations lead to better performance in tasks like facial
recognition (Christie & Bruce, 1998; Lander et al., 1999; Thornton & Kourtzi, 2002).
Hence, if one wants more accurate simulations of artificial vision, pixelization should
be performed in real-time and the gray level of each pixel should vary dynamically,
according to gaze position.
69 Gestalt psychologists suggest that the visual system uses certain image features (proximity,
similarity, symmetry, contour closure, smoothness) as cues to extract and identify objects. Refer to
Leeuwenberg (2003) for more details on Gestalt features.
In the present chapter, a series of three paired comparisons of the effects of
different pixelization methods on full-page reading will be presented. Reading
performance was compared:
1) between off-line square pixelization and real-time square pixelization of the
image,
2) between off-line square pixelization and off-line gaussian pixelization of the
image, and
3) between real-time square pixelization and real-time gaussian pixelization of the
image.
6.3 Specific methods for these simulations
6.3.1 Subjects
Ten subjects aged between 23 and 41 years were recruited from the staff of the
Geneva University Ophthalmology Clinic. All of them had perfect knowledge of
French, corrected visual acuity of 20/20 or better, and normal ophthalmologic status.
They were familiar with the purpose of the study.
6.3.2 Experimental Setup
The experimental setup was the same as that used for the studies on full-page
reading described in Chapter 3. The apparatus corresponded to the stationary setup.
Please refer to Chapters 2 and 3 for a more detailed description of the experimental
setup.
Stimuli consisted of full-page texts generated following the same procedure as in
our previous study on full-page reading. The stimulus generation procedure has
already been described in Chapter 3. Briefly, articles were extracted from the
Internet website of the Swiss newspaper Le Temps and cut into 7-line text segments
of about 25 words. Arial font (Helvetica) was used. At a viewing distance of 57 cm,
the height of the small letter ‘x’ corresponded to a visual angle of 1.8°.
The information content of the stimuli was reduced using one of two pixelization
algorithms, square or gaussian, which differed in the resulting shape of the pixels.
These algorithms were applied either off-line, yielding images with ‘frozen’ pixels, or
in real-time, yielding ‘dynamic’ pixels that changed with gaze position. Please refer to
Chapter 2 for a detailed description of the image processing techniques used.
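The difference between ‘frozen’ and ‘dynamic’ pixels can be illustrated with a minimal one-dimensional Python sketch of block-averaging (square) pixelization. It rests on our own simplifying assumptions: a single row of luminance values, a gaze position given as a sample index, and hypothetical function names. In the off-line case the averaging grid is fixed to the image and the window merely selects pre-computed pixels; in the real-time case the grid is anchored to the gaze position, so the averages are recomputed whenever gaze moves.

```python
def block_average(values, block):
    """Average consecutive, non-overlapping blocks of `block` samples."""
    return [sum(values[i:i + block]) / block
            for i in range(0, len(values) - block + 1, block)]


def offline_window(image_row, block, gaze, n_pixels):
    """Pixelize the whole row once; the window shows 'frozen' pixels."""
    frozen = block_average(image_row, block)
    first = gaze // block  # grid is tied to the image, not to gaze
    return frozen[first:first + n_pixels]


def realtime_window(image_row, block, gaze, n_pixels):
    """Pixelize only the window content, with the grid anchored at gaze."""
    window = image_row[gaze:gaze + block * n_pixels]
    return block_average(window, block)


# Alternating dark and bright bars, 4 samples wide (illustrative stimulus)
row = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]

print(offline_window(row, 4, 0, 2))   # [0.0, 1.0]
print(offline_window(row, 4, 2, 2))   # [0.0, 1.0]  (unchanged by a small shift)
print(realtime_window(row, 4, 0, 2))  # [0.0, 1.0]
print(realtime_window(row, 4, 2, 2))  # [0.5, 0.5]  (changes with gaze)
```

In this toy case a gaze shift of half a pixel leaves the off-line window unchanged but alters every gray level in the real-time window, so a sequence of small eye movements delivers several different samplings of the same stimulus.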
The remaining aspects of the experimental procedure were exactly the same as
described in the preceding full-page reading study. Tests were performed
monocularly (using the dominant eye) and in central vision. For each run, subjects
had to read aloud several text segments of an article, randomly chosen out of a pool
of 50 (none of the subjects read an article twice). Test sessions frequently included
several runs, but they never lasted longer than 30 minutes to avoid subjects’ fatigue.
Similar to the previous experiments of full-page reading, performance was
determined on the basis of reading scores (in RAU units and in approximate %), and
reading rates (in WPM). More details can be found in Chapters 2 and 3.
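For readers unfamiliar with the unit, the sketch below shows how a proportion of correctly read words could be converted into RAU. It assumes that RAU denotes the standard rationalized arcsine transform as commonly used in speech and reading research; the exact procedure applied in this work is the one given in Chapter 2, and the function name is ours.

```python
import math


def rau(correct, total):
    """Rationalized arcsine units for `correct` items out of `total`.

    Assumption: the standard rationalized arcsine transform, which maps
    proportions onto a scale running from about -23 to about +123.
    """
    theta = (math.asin(math.sqrt(correct / (total + 1)))
             + math.asin(math.sqrt((correct + 1) / (total + 1))))
    return (146.0 / math.pi) * theta - 23.0


print(round(rau(50, 100), 1))  # 50.0: mid-range scores are nearly unchanged
print(rau(100, 100) > 100.0)   # True: the scale is stretched near 100%
print(rau(0, 100) < 0.0)       # True: and stretched near 0%
```

The transform leaves mid-range scores close to their percentage value while expanding the extremes, which is why scores are reported here both in RAU and in approximate %.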
6.4 Experiment 10: Real-time Square vs. Off-line Square Pixelization
This experiment was designed to explore the effects of using real-time versus off-line
square pixelization on full-page reading at various image resolutions.
6.4.1 Experimental protocol
Five volunteers (VR, female, 22 years old; MF, male, 23 years old; LP, male, 24
years old; AP, female, 26 years old; and RS, male, 28 years old) read full-page texts
using off-line and real-time square pixelization. Five pixelization levels were tested:
28000, 1750, 572, 280, and 166 pixels in the viewing window⁷⁰. All subjects started
with the easiest condition (highest pixelization) and progressed towards the most
difficult one (lowest pixelization). The first four text segments of an article
(approximately 100 words) had to be read in each run, and three runs were
performed for each pixelization condition. Off-line and real-time pixelization
conditions alternated⁷¹.
70 These pixelization levels were identical to those used in our previous study on reading of isolated
4-letter words.
71 Note that the first pixelization level (28000 pixels) corresponded to maximum screen resolution, so
no pixelization had to be actually performed. Off-line and real-time pixelization conditions were, thus,
identical in this particular case.
6.4.2 Results
Figure 78 compares mean reading performance versus number of pixels in the
viewing window for off-line and real-time pixelizations. Individual performances in
each experimental condition were established on the basis of 12 text segments and
data were fitted with exponential functions.
Figure 78. Reading performance versus number of pixels in the 10°x7° viewing window for 5 normal
subjects. Two stimuli generation procedures are compared in central vision: real-time pixelization
(dynamic stimuli – red plots) and off-line pixelization (static stimuli – blue plots). (a) Mean reading
scores expressed in RAU ±SEM (left scale) and in % (right scale). The dashed black line indicates
reading scores corresponding to good to excellent text comprehension. (b) Mean reading rates
expressed in WPM ±SEM.
Down to a target resolution of 572 pixels, average reading scores were close to
perfect (> 95% correct) and statistically equivalent for both conditions. At 280 pixels,
subjects achieved reading scores of 94.3% with real-time pixelization, but of only
76.4% with off-line pixelization. This difference was statistically significant (p =
0.0017), and persisted at the lowest pixelization (166 pixels; 56.1% versus 29.3%; p
= 0.013). It is interesting to estimate the critical pixelization for subjects to reach
useful reading performances. In the previous study on full-page reading, we found
that adequate (good to excellent) text comprehension was closely correlated to high
reading scores (see fig. 46). This criterion was fulfilled when median scores of at
least 96.8% of correctly read words were reached. In the present case, the fits to
the data indicate that this score is reached at 498 pixels in the case of off-line
pixelization and at 322 pixels for real-time pixelization (see fig. 78a).
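How a critical pixelization can be read off a fitted curve is sketched below. The text does not give the functional form of the exponential fits, so the saturating curve used here, score(n) = 100 (1 - exp(-(n - n0) / tau)), is an assumption, and it is simply passed through the two lowest-resolution mean scores reported above rather than fitted to all five levels. The resulting values (in the neighbourhood of 310 and 490 pixels) therefore only illustrate the procedure and do not reproduce the 322 and 498 pixels obtained from the actual fits.

```python
import math

# Mean reading scores (% correct) reported in the text for the two lowest resolutions.
REPORTED = {
    "real-time": [(166, 56.1), (280, 94.3)],
    "off-line": [(166, 29.3), (280, 76.4)],
}
CRITERION = 96.8  # % correctly read words associated with good comprehension


def fit_two_points(points):
    """Solve score(n) = 100 * (1 - exp(-(n - n0) / tau)) through two data points.

    The functional form is an assumption made for this illustration.
    """
    (n1, s1), (n2, s2) = points
    y1 = -math.log(1 - s1 / 100)
    y2 = -math.log(1 - s2 / 100)
    tau = (n2 - n1) / (y2 - y1)
    n0 = n1 - tau * y1
    return n0, tau


def critical_pixels(points, criterion=CRITERION):
    """Number of pixels at which the assumed curve reaches `criterion` %."""
    n0, tau = fit_two_points(points)
    return n0 - tau * math.log(1 - criterion / 100)


critical = {name: critical_pixels(pts) for name, pts in REPORTED.items()}
```

Even with this crude two-point model, the real-time condition reaches the comprehension criterion at a clearly lower pixel count than the off-line condition.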
Reading rates appeared to be even more sensitive to the number of pixels in the
viewing window. At the highest resolutions, subjects reached an average reading
rate of 93 WPM. At 572 pixels, mean reading rates already dropped significantly (p <
0.0001) to 80 WPM for real-time pixelization and to 64 WPM for off-line pixelization
(see fig. 78b). The difference between both pixelization conditions was also
statistically significant (p < 0.0001), and persisted at 280 pixels (34 WPM for real-time
pixelization versus 18 WPM for off-line pixelization; p = 0.002). The lowest
pixelization condition (166 pixels) was so difficult that reading rates were very low
(4 to 6 WPM) in both cases.
Taken together, these results indicate that the same functional rehabilitation could
be reached at a significantly lower resolution when real-time pixelization is used.
6.5 Experiment 11: Off-line Gaussian vs. Off-line Square
Pixelization
This experiment was designed to investigate the influence of pixel shape on
reading performance by comparing the use of gaussian pixels to the use of square
pixels. The effect of varying the gaussian width (i.e. different levels of overlap across
neighboring pixels) on full-page reading was also assessed.
6.5.1 Experimental protocol
Tests were performed on six subjects (AP, female, 26 years old; CB, female, 29
years old; EO, female, 29 years old; AC, male, 33 years old; MB, male, 34 years old;
and JS, male, 41 years old). Pixelizations with six different gaussian widths (σ values
of 0.036, 0.071, 0.143, 0.286, 0.571, and 1.143 pixels) were tested and compared to
square pixelization. In all conditions, the 10° x 7° viewing window contained 572
pixels (minimum information for useful text reading; see results from experiments 3
and 12). Each subject had to read an article of about 250 words (i.e. 10 consecutive
text segments), per condition. Three subjects started the experiment with gaussian
pixelization at the smallest σ value, progressed towards the larger gaussian widths,
to finish with square pixelization. The remaining three subjects conducted the
experiment in the inverse order.
6.5.2 Results
Mean reading performances versus gaussian function width (σ) are shown in
figure 79. Mean performances using square pixelization are also indicated for
comparison. Individual performance values in each experimental condition were
computed on the basis of all 10 consecutive text segments.
Four gaussian width values (σ = 0.071, 0.143, 0.286 and 0.571 pixels) resulted in
reading scores above 94% correctly read words. These scores were very close to
those obtained with square pixelization (see fig. 79a). Mean reading scores with σ =
0.143 and σ = 0.286 pixels were found to be significantly better than those obtained
with square pixelization (p = 0.04 and p = 0.009, respectively). Scores with σ =
0.071 and σ = 0.571 pixels were not statistically different from those obtained with
square pixelization. Reading scores dropped markedly for the two extreme gaussian
widths tested (σ = 0.036 and σ = 1.143 pixels). In these conditions, less than 80%
of the texts were read, because pixels were either reduced to small isolated points of
light (σ = 0.036 pixels) or overlapped too strongly (σ = 1.143 pixels).
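The two failure modes at the extreme widths can be illustrated numerically. The sketch below assumes an unnormalized gaussian profile exp(-d² / 2σ²), with d and σ expressed in units of the pixel pitch, and a one-dimensional row of alternating on/off pixels; the exact gaussian function used for the stimuli is defined in Chapter 2 and may differ. The function names are hypothetical.

```python
import math

SIGMAS = [0.036, 0.071, 0.143, 0.286, 0.571, 1.143]  # in units of pixel pitch


def gaussian(distance, sigma):
    """Unnormalized gaussian profile; distance and sigma in pixel-pitch units."""
    return math.exp(-distance ** 2 / (2 * sigma ** 2))


def border_amplitude(sigma):
    """Relative luminance of one pixel at the border with its neighbor (d = 0.5)."""
    return gaussian(0.5, sigma)


def alternating_contrast(sigma, reach=10):
    """Michelson contrast of a row of alternating on/off pixels.

    Luminance is sampled at the center of an 'on' pixel and of an 'off' pixel,
    summing the contributions of all 'on' pixels within `reach` pitches.
    """
    on_center = sum(gaussian(k, sigma) for k in range(-reach, reach + 1) if k % 2 == 0)
    off_center = sum(gaussian(k, sigma) for k in range(-reach, reach + 1) if k % 2 != 0)
    return (on_center - off_center) / (on_center + off_center)


for sigma in SIGMAS:
    print(f"sigma={sigma:5.3f}  border={border_amplitude(sigma):.3f}  "
          f"contrast={alternating_contrast(sigma):.3f}")
```

With the smallest width the luminance vanishes well before the pixel border, so each pixel is an isolated point; with the largest width neighboring pixels overlap so much that the contrast of the alternating pattern becomes very small.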
Mean reading rates display a similar picture (see fig. 79b). A maximum reading
rate of 70 WPM was achieved at σ = 0.286 pixels. This value is significantly higher (p
< 0.001) than the reading rate of 57 WPM achieved with square pixelization. Reading
rates with σ = 0.143 and σ = 0.571 pixels were not statistically different from those
obtained with square pixelization. For σ = 0.036, 0.071 and 1.143 pixels, reading
rates dropped markedly below 40 WPM.
Figure 79. Reading performance versus gaussian function width (σ) used for stimulus pixelization, for
6 normal subjects. Results are compared to reading performances obtained with square pixelized
stimuli (dashed line ±SEM). The resolution of the 10°x7° viewing window in central vision was kept
constant at 572 pixels. (a) Mean reading scores expressed in RAU ±SEM (left scale) and in % (right
scale). (b) Mean reading rates expressed in WPM ±SEM (left scale).
Taken together, these results reveal that gaussian pixelization can lead to slightly,
but significantly better reading performance compared to its square counterpart.
6.6 Experiment 12: Real-time Gaussian vs. Real-time Square
Pixelization
Results of experiment 11 demonstrated that off-line gaussian pixelization could
lead to significantly better reading performance than off-line square pixelization.
Another experiment was thus dedicated to extending this comparison to real-time
mode.
6.6.1 Experimental protocol
For this comparison we would have preferred to use the ‘optimal’ gaussian width
determined in experiment 11 (σ = 0.286 pixels). However, the total
processing time needed to simulate this condition turned out to be too long to
ensure adequate image stabilization on the retina. Using the second best condition (σ
= 0.143 pixels) allowed us to keep processing time below 10 ms (refer to Chapter 2
for more details on this issue). The same 6 volunteers who had participated in
experiment 11 were requested to read 10 text segments in each of two conditions:
real-time square pixelization and real-time gaussian pixelization at σ = 0.143 pixels.
In both conditions the 10° x 7° viewing window contained 572 pixels. Three subjects
started with real-time square pixelization, and then switched to real-time gaussian
pixelization. The remaining three subjects performed the experiment inversely.
6.6.2 Results
The results of this experiment are summarized in table 6. No significant difference
in performance could be found between both types of pixelization. However, reading
scores and reading rates tended to be slightly higher with square pixelization.
Comparing those ‘real-time’ scores with their ‘off-line’ counterparts gathered in
experiment 11 reveals that both real-time conditions yield better performance.
This performance gain was significant for square pixelization (reading scores: p =
0.003; reading rates: p = 0.008), but not for gaussian pixelization (reading scores: p
= 0.12; reading rates: p = 0.25).
Table 6. Mean reading performances for real-time pixelization, for 6 normal subjects. Gaussian
pixelization is compared to square pixelization using a 10°x7° viewing window containing 572 pixels.
Measure | Gaussian pixelization | Square pixelization | p
Mean reading scores (RAU ± SEM; %) | 115.8 ± 3.6; 99.6 | 117.2 ± 3.4; 99.8 | 0.22 (ns)
Mean reading rates (WPM ± SEM) | 69 ± 12 | 74 ± 15 | 0.35 (ns)
ns: non significant
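Reading scores are reported both in percent and in RAU. As a reminder of how the two scales relate, the sketch below implements the rationalized arcsine transform as published by Studebaker (1985). That this is the exact definition used here is an assumption; the authoritative definition is the one given in the methods of this work.

```python
import math


def rau(correct, total):
    """Rationalized arcsine units for `correct` items out of `total`.

    Formula after Studebaker (1985); assumed, not confirmed, to be the
    definition behind the RAU values reported in this chapter.
    """
    theta = (math.asin(math.sqrt(correct / (total + 1)))
             + math.asin(math.sqrt((correct + 1) / (total + 1))))
    return 146 / math.pi * theta - 23
```

The transform leaves mid-range scores almost unchanged (50 correct out of 100 maps to 50 RAU) and stretches the scale near 0% and 100%, which is why scores above 99% correspond to RAU values above 100.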
6.7 Discussion
Experiment 10 clearly shows that for stimulus resolutions below a critical value
(about 1000 pixels in a 10°x7° viewing window) real-time square pixelization yields
better reading performances than its off-line equivalent. The major reason for this
advantage probably lies in the capability of the visual system to integrate various
low-resolution images, enhancing stimulus contrast and resolution in order to
improve perception (Lappin et al., 2002). This effect is also used in standard video:
when several low-resolution images are presented in a rapid sequence, the resulting
perception is that of a continuous, higher-resolution motion picture. In our
experiments, at constant pixel resolution, the readability of pixelized text images
depends on the exact position of the pixelization grid relative to the original stimulus
image. Therefore, the image can be modified with minor eye movements to optimize
viewing conditions. Figure 80 illustrates this effect for a series of minor changes in
grid position. We observed that subjects quickly adopted this strategy: when
resolution decreased, they increased the number of small saccades around the word
they were trying to decipher.
Figure 80. Illustration of the effect of the initial position (reference point) of the pixelization grid on
the readability of the pixelized word. A single position does not provide enough information to identify
the word unambiguously, but by integrating all three of them, the French word ‘niveau’ can be easily
recognized.
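The dependence on grid position can be reproduced with a one-dimensional sketch. The luminance profile below is hypothetical, and plain averaging of the shifted views is only a crude stand-in for whatever integration the visual system performs; the point is that each grid position yields a different pixelized view, and that combining the views is, on average, closer to the original than a single view.

```python
def pixelize(profile, block, offset):
    """Block-average a 1D luminance profile on a grid shifted by `offset` samples."""
    n = len(profile)
    out = [0.0] * n
    pos = -((block - offset) % block)  # first block starts at or before sample 0
    while pos < n:
        lo, hi = max(pos, 0), min(pos + block, n)
        mean = sum(profile[lo:hi]) / (hi - lo)
        for i in range(lo, hi):
            out[i] = mean
        pos += block
    return out


def mse(a, b):
    """Mean squared error between two profiles of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical luminance profile across a few letter strokes (1 = ink, 0 = paper).
PROFILE = [0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0]
BLOCK = 4

# One pixelized view per grid position, then a simple average of all views.
views = [pixelize(PROFILE, BLOCK, off) for off in range(BLOCK)]
integrated = [sum(col) / len(col) for col in zip(*views)]

single_errors = [mse(view, PROFILE) for view in views]
integrated_error = mse(integrated, PROFILE)
```

Because the squared error is convex, the error of the averaged view can never exceed the mean error of the individual views, whatever the profile.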
Other effects are also likely to influence reading performance. Previous research
on face recognition revealed that ‘blocked’ images lead to poorer performance than
images filtered using other techniques (Harmon & Julesz, 1973; Bachmann, 1991;
Costen et al., 1994; Bachmann & Kahusk, 1997). Square pixelization adds some
artefactual high frequency components to the target image that may mask essential
features for identification (Uttal et al., 1997). Real-time pixelization does not suffer
from the same artefactual bias because pixel movement acts as a low-pass filter
subtracting some of these parasitic frequencies.
This is, most probably, one of the reasons why in experiment 11 off-line gaussian
pixelization yielded better reading performance than off-line square pixelization (for a
restricted range of gaussian widths around σ = 0.286 pixels). Another reason for this
finding could be the structural difference between both pixelizations. Blocks are
important distracters competing for attention with the overall image features hidden
within the structure of the pixelized image (Bachmann & Kahusk, 1997). An
important factor could also be the configurational distortion introduced by blocks,
modifying the underlying gestalt properties of the image (Bachmann, 1991; Uttal et
al., 1995a; Uttal et al., 1995b; Bachmann & Kahusk, 1997). Although the performance
differences measured in this work were not very large, additional research
would be necessary to thoroughly investigate these effects, especially at lower
resolutions. It should also be stressed that extreme gaussian widths noticeably
impaired performance. When very small gaussian widths were used, pixels appeared
as isolated small points of light, making it almost impossible to extract a cohesive
picture. With large gaussian widths, overlap was too pronounced leading to very low
contrast stimuli. Several authors have already demonstrated that such low contrast
images lead to poor reading performance (Legge & Rubin, 1986; Legge et al., 1987;
Thompson et al., 2003).
Results of experiment 12 might appear surprising in light of the findings of
experiment 11: when using real-time processing, the significant benefits of gaussian
pixelization vanished. In fact, this outcome is not astonishing. Real-time processing
already eliminates the major ‘handicap’ of square pixelization: the distracting high-frequency
noise introduced at pixel borders is low-pass filtered by pixel movement.
We believe that the use of the ‘optimal’ gaussian width σ = 0.286 pixels (instead of
0.143) would not change this result fundamentally.
6.7.1 Implications of these results for simulations of artificial vision
The exact characteristics of the electrophysiological response of the retina to
patterned electrical stimulation remain undetermined to this date. However, the use
of 2D gaussian functions for stimulus pixelization is certainly a more physiologically
pertinent approach than the use of square pixels, for at least two reasons: pixel
borders are smoother and it allows for overlapping between neighboring pixels in a
simple way. Subjects consistently described such stimuli as being more comfortable
than their square-pixelized counterparts. As soon as the results of
electrophysiological experiments on retinal tissue become available, the parameters
of such 2D gaussian (or more adequate) functions should be adapted. Our
experiments also demonstrated that gaussian width was an important factor for
readability. Electrophysiology has already revealed that electrical stimulation can
activate patches of retinal tissue, the size of which depends on stimulation current
(Weiland et al., 1999; Stett et al., 2000). Consequently, these results suggest that
stimulating current strength and electrode spacing might have to be further ‘tuned’
(within safe and comfortable limits) to achieve the most efficient image transmission
possible.
Real-time processing also yields more realistic simulations of the visual
information provided by retinal prostheses. This is mainly the case for devices
transforming visual stimuli into electric currents ‘in situ’, but it is also pertinent for
prostheses with externally controlled (i.e. head-mounted) cameras. In this series of
central reading experiments, real-time pixelization yielded significantly better
performance than off-line pixelization, but this benefit was relatively moderate. In
order to briefly assess the impact of this same effect on eccentric reading, we
performed control measurements on a subject trained to read pages of text
through a viewing window stabilized at 15° eccentricity in the lower visual field
(subject AD, see Chapter 3). Figure 81 compares her reading performance for off-line
pixelization at resolutions of 286 and 572 pixels with those obtained with real-time
pixelization at 286 pixels. Consistent with the results presented in this study, at
286 pixels real-time square pixelization improved reading scores significantly (p =
0.01). However, performance remained significantly lower than off-line pixelization at
572 pixels (mean reading scores: p = 0.032; mean reading rates: p = 0.005). Everything considered, real-time
pixelization better mimics the visual information provided by a retinal prosthesis and
enhances performance, but it does not allow for a substantial reduction (e.g. a factor
of two) of the number of stimulation points. Most probably, this advantage will be
even less important in visual prostheses with external head-mounted cameras, since
head movements are, in general, larger and less frequent than eye movements.
Moreover, recurring head movements could result in an abnormal vestibulo-ocular
reflex (Cha et al., 1992a).
Figure 81. Reading performance at 15° eccentricity in the lower visual field, for one trained subject (AD). Mean reading
scores [RAU] ±SEM and mean reading rates [WPM] ±SEM. Four runs of 4 text segments were performed for each
experimental condition: off-line pixelization at 572 pixels, off-line pixelization at 280 pixels, and real-time pixelization at
280 pixels in a 10°x7° viewing window. Only the 3 runs yielding the best reading scores were used for data analysis.
One run was excluded because the subject was in poor condition, which resulted in considerably lower reading performances.
6.8 Conclusion
These results demonstrate that the spatial and temporal characteristics of image
pixelization play a role in artificial vision simulations. Equivalent performance could
be reached with a resolution reduction of about 30%, if stimulation parameters were
adequate. Small eye movements will help to improve performance with retinal
prostheses transforming light ‘in situ’. This effect is, however, not strong enough to
fundamentally change the minimum requirements determined in our previous studies
on the basis of simplified processing: 400-500 contacts covering a 3 x 2 mm² retinal
area are necessary to transmit sufficient visual information for reading, visuomotor
coordination, and whole-body mobility.
6.9 Publications resulting from this research
Pérez Fornos, A., Sommerhalder, J., Rappaz, B., Safran, A.B., & Pelizzone, M. (2005). Simulation of
artificial vision: III. Do the spatial or temporal characteristics of stimulus pixelization really matter?
IOVS, 46: 3906-3912.
7 General Conclusions
Questions are never indiscreet, answers sometimes are.
Oscar Wilde (1854 - 1900)
7.1 Summary of the results
The studies presented here intend to be a systematic assessment of the minimum
requirements to obtain useful artificial vision. We assumed that all visual information
provided by the prosthesis can be transmitted to the nervous system (i.e. that there
is no loss of information at the electrode-nerve interface). Focus was given to the
amount and type of visual information critical to perform a set of basic, everyday
tasks: reading, visuomotor coordination, and mobility. This type of approach has
already been used to study speech perception in cochlear implant users (Shannon et
al., 1995; Dorman & Loizou, 1997; de Balthasar et al., 1999; Loizou et al., 1999).
The results can be summarized as follows:
• For images projected onto a 10° x 7° viewing window stabilized in the central
visual field, our results indicate that:
1) Between 400 and 500 pixels are necessary for full-page reading. Strings of 4-6
characters and about two lines of text have to be visualized simultaneously
for efficient reading. This corresponds to a (highly magnified) effective visual
field of about 2° x 1.4° for a typical newspaper (~ 200 pixels/deg²).
2) About 400 pixels encoding an effective visual field of 16° x 12° allow for
efficient visuomotor coordination (~ 2 pixels/deg²).
3) As few as about 200 pixels encoding an effective visual field of 33° x 23°
allow for mobility in familiar environments (~ 0.2 pixels/deg²). However,
much higher image resolutions (>1000 pixels, ~ 2 pixels/deg²) are needed
for subjects to feel safe in unpredictable environments including moving,
potentially hazardous objects.
• In eccentric vision (15° in the lower visual field), initial performance for all tasks
was relatively poor. Remarkable improvements were already observed after a few
training sessions. After a period of systematic adaptation of variable length (~ 60
sessions for reading; ~ 30 sessions for visuomotor coordination; and ~ 30
sessions for whole-body mobility), all tasks could be performed with reasonable
accuracy and speed in eccentric vision.
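The pixel densities quoted in the summary follow from dividing the number of pixels by the area of the effective visual field. The sketch below makes this arithmetic explicit; the choice of 500 pixels for reading (the upper end of the 400-500 range) is an assumption made for the illustration, and the results are only expected to agree with the rounded figures in the list to within their order of magnitude.

```python
# Pixel counts and effective visual fields quoted in the summary list.
# Assumption: 500 pixels (upper end of the 400-500 range) is used for reading.
TASKS = {
    "reading": {"pixels": 500, "field_deg": (2.0, 1.4)},
    "visuomotor coordination": {"pixels": 400, "field_deg": (16.0, 12.0)},
    "mobility (familiar environment)": {"pixels": 200, "field_deg": (33.0, 23.0)},
}


def pixel_density(pixels, field_deg):
    """Pixels per square degree of effective visual field."""
    width, height = field_deg
    return pixels / (width * height)


densities = {task: pixel_density(**spec) for task, spec in TASKS.items()}
```

The three tasks span roughly three orders of magnitude in pixel density, which is the quantitative counterpart of the contrast between resolution-demanding and field-demanding tasks.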
Our results are in accordance with current clinical observations in low vision
patients. Depending on their clinical condition, their handicap can result in problems
with small object recognition, specifically reading, and with spatial orientation,
including whole-body mobility and visuomotor coordination (Weih et al., 2000). Our
results clearly depict the particular visual requirements of each category of tasks: on
one hand, the highly resolution-demanding reading task and, on the other hand, the
more important visual field requirements of visuomotor coordination and mobility
tasks.
We also observed that a period of systematic adaptation was required in order to
achieve efficient performance when tasks had to be performed in eccentric vision.
Interestingly, however, adaptation processes for each task appear to be quite
different. The longest and most complex oculomotor adaptation process was
observed for the reading task. For this task, we observed a rapid suppression of the
vertical (presumably foveating) movements and then a more progressive
restructuring of the horizontal pattern. In contrast, for visuomotor and
mobility tasks, the adaptation of both horizontal and vertical components of the eye
movement trajectory was less pronounced and appeared to be simultaneous (see
Appendix D and Appendix E). This difference is most certainly due to the fact that
during the reading experiments, subjects were forced to explore (navigate) through
the pages of text with their eye movements. As a consequence, an important part of
the learning process for eccentric reading consisted in the development of relatively
precise eccentric oculomotor control. Conversely, in the visuomotor and mobility
tasks subjects were not forced to use eccentric eye movements to explore the
environment. Instead, they could use head/body movements. In this case, subjects
made less use of eccentric saccades, which are difficult to develop and control
(Hallett, 1978; Zeevi & Peli, 1979; Whittaker & Cummings, 1990; Whittaker et al.,
1991; Heinen & Skavenski, 1992).
The learning effects reported for all tasks might appear surprising given
that experiments were performed in normal subjects interleaving short eccentric
viewing sessions with much longer periods of normal foveal viewing. One possibility
is that learning effects acquired within a single experimental session are somehow
carried over to the next one. Another explanation could be that learning results from
some kind of perceptual assimilation occurring between sessions. Our experience
suggests that it could be a combination of both.
7.2 Implication of these results for the development of visual
prostheses
The results summarized above delimit two fundamental parameters of future
visual prostheses. On one hand, 400-500 stimulation contacts [72], equally distributed
on an implant surface of 3 x 2 mm² (equivalent to a pixel density of ~ 80
pixels/mm²) seem to be the minimum to restore useful function. On the other hand,
the optimal effective field of view represented by the active area of the implant
seems to depend on the task at hand. This does not imply that each task will require
implants of different sizes. Different effective fields of view could easily be obtained
by varying the optics of the system used to capture the visual stimuli. In subretinal
implants that transform light
into electric currents ‘in-situ’, this could be achieved with different lenses adapted for
each task. Such adapted ‘viewing glasses’ would modify the optics of the eye in order
to focus larger or smaller portions of the visual environment on the photodiodes.
Similarly, in visual prostheses that use an external head-mounted camera to capture
the visual stimulus, the different effective fields of view could be achieved by directly
modifying the objective of the camera. In addition, subjects will be able to
perform head/body movements to move toward or away from objects/targets
in order to dynamically modify the actual field of view according to the instantaneous
visual requirements of a particular situation. This is the strategy that we observed
during the visuomotor coordination experiments (see Appendix D). As less visual
information was available in the viewing window (fewer pixels and larger effective
fields of view), subjects had to approach the working area to recognize a given chip.
[72] Assuming that each stimulation electrode evokes a distinct phosphene.
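The change of optics needed for each task can be expressed as a linear magnification, i.e. the ratio between the 10° x 7° retinal area covered by the implant and the effective field of view to be projected onto it. The sketch below is a simple geometric illustration using the field sizes summarized in section 7.1; it ignores any practical constraint of the optics, and the function name is hypothetical.

```python
WINDOW_DEG = (10.0, 7.0)  # retinal area covered by the implant, in degrees

# Effective fields of view per task, as summarized in section 7.1.
EFFECTIVE_FIELDS_DEG = {
    "reading": (2.0, 1.4),
    "visuomotor coordination": (16.0, 12.0),
    "mobility": (33.0, 23.0),
}


def magnification(effective_field, window=WINDOW_DEG):
    """Linear magnification (horizontal, vertical) mapping the field onto the window."""
    return (window[0] / effective_field[0], window[1] / effective_field[1])


factors = {task: magnification(field) for task, field in EFFECTIVE_FIELDS_DEG.items()}
```

Reading thus calls for a magnification of about five, whereas visuomotor coordination and mobility call for a reduction to roughly 0.6 and 0.3 of natural size, respectively.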
In contrast to the subjects that participated in our experiments, future users of
retinal implants will wear their prosthesis permanently in daily life. Pelli (1987) points
out that visual impairment might be less disabling in low vision patients than in
normal subjects under conditions simulating restricted vision as a result of long-term
learning. This suggests that, since future visual prosthesis wearers will have much
more time to adapt to their new vision than the normal subjects who participated in
our simulation experiments, they might benefit even more from learning, as well as
from other possible brain plasticity mechanisms. In this respect, our results are very
encouraging for the future.
Since our ultimate goal is the rehabilitation of blind patients, our results should
help clinicians in advising their patients on the level of rehabilitation that could be
provided with a given device. Suppose, for example, that the first commercially
available prostheses have 256 contacts distributed over an active surface of 3 x 3 mm²
(equivalent to a pixel density of ~30 pixels/mm²). Clinicians would be able to tell
their patients that with such a device, they could expect to recover certain mobility
abilities (in indoor, familiar environments). However, restoration of reading abilities
would probably be out of reach with such a prosthesis. In particular, the reading task
is known to be one of the most demanding in terms of visual information. We are
convinced that future devices should aim to restore reading abilities since these are
strongly associated with vision-related estimates of quality of life, and represent one
of the main goals of low vision patients seeking rehabilitation (see e.g. Wolffsohn &
Cochrane, 1998; McClure et al., 2000; Hazel et al., 2000). Several alternatives
already provide blind patients with access to some written material (e.g. Braille, text-to-speech
conversion devices). Still, patients are not satisfied with these substitutes
since, for example, the range of texts available in Braille is very limited. Other
solutions such as text-to-speech conversion devices require bulky equipment that
might be difficult to transport and, consequently, might not be available when
needed. Another important behavioral task, judging by the complaints of people with
low vision, is whole-body mobility (Pelli, 1987; Wolffsohn & Cochrane, 1998). It is
thus important to be aware of such minimum conditions when developing visual
prostheses even if less sophisticated devices might already bring some clinical
benefits to patients.
7.3 Future work
Visual prostheses that allow users to explore the environment ‘naturally’, with eye
movements (as in subretinal implants transforming incident light into electric currents
‘in-situ’) could allow for better performance than systems where the environment has
to be explored with head movements (as in epiretinal implants that use external
head-mounted cameras). Furthermore, the use of large head movements to explore
the environment might lead to balance problems associated with an abnormal
vestibulo-ocular reflex (Cha et al., 1992a). The experiments presented in this
dissertation do not allow us to determine any potential advantage of using eye
movements during vision-related tasks. This would be a very interesting line of
investigation for future psychophysical experiments. For example, performance
on a given task (e.g. full-page reading) could be compared in two conditions: one
where the subjects are able to scan pages of text using their own eye movements,
and another one where subjects are allowed to scan the text using head movements.
Such information is crucial to determine the extent to which it would be desirable for
prostheses with head-mounted cameras to incorporate a sub-system that allows for
the coupling of eye movements.
Future research efforts could also explore the degree to which learning in
eccentric vision transfers from one task to another. Since the learning process for
each task was different, different factors might have influenced the adaptation to
performing each task in eccentric vision. This issue could be easily explored by
evaluating performance of subjects trained for one task on another one. For
example, one of the subjects trained for eccentric reading could perform a series of
sessions of mobility tasks, and vice versa. This investigation would provide
a better understanding of the different processes involved in the adaptation to each
task.
Another interesting consideration relates to the type of cells that will probably be
stimulated. We have pointed out that the best implantation sites for retinal
prostheses will be situated at eccentricities beyond 10°. At such high eccentricities,
the photoreceptor population is mainly constituted of rods. This family of
photoreceptors functions in dim lighting (scotopic and mesopic) conditions. However,
all the experiments described in this dissertation were carried out in photopic
conditions (> 10 cd/m²) and the stimuli used were grayscale images. In these
conditions, rods were saturated and the 3 families of cones were stimulated. It would
thus be interesting to explore in future experiments how stimulating
exclusively one family of photoreceptors (i.e. rods or one type of cones) would impact
performance on vision-related tasks. This could be carried out by examining
performance on a given task in various illumination conditions (e.g. scotopic,
photopic but limiting luminous stimuli to a given wavelength/color). Such an
investigation will not change the fundamental limits determined in our study (which
depend on the task at hand, but not on the characteristics of stimulation), but might give
valuable indications for efficient electrical stimulation strategies.
Additional simulations on normal subjects could still provide important information
on image features and additional processing that could help improve performance
(e.g. contrast enhancement, edge detection). Furthermore, particular isolated sub-tasks
(e.g. motion detection thresholds) could still be studied in more detail.
However, this knowledge is not essential to understanding visual function with an
artificial vision device, and would not fundamentally change the minimum
requirements determined in our studies.
Finally, at this point it is imperative that a substantial research effort is made to
better define the characteristics of transmission around the electrode-nerve interface
(spatial selectivity, stimulation thresholds, retinotopy of the perceived stimulus).
In particular, fundamental knowledge will depend on the results of
psychophysical experiments on blind subjects both in acute and chronic trials.
7.4 Closing remarks
Implantable microelectrode arrays consisting of about 500 active contacts seem
feasible using present technology. Some research groups have already manufactured
first retinal implant prototypes consisting of several hundreds of electrodes (Zrenner
et al., 1999; Palanker et al., 2003; Chow et al., 2003). Our own consortium has
already produced a retinal prosthesis containing 200 ‘smart’ pixels/mm² (photodiode
+ electronics + electrode; see Mazza et al., 2005). Five hundred contacts covering a
surface of 3 x 2 mm² correspond to an inter-electrode spacing of approximately 100
µm (equivalent to a visual acuity of 20/400; see Palanker et al., 2005). Multisite
stimulation measurements on chicken retinae have demonstrated that spatial
resolution in this order of magnitude should be possible (Stett et al., 2000; Lecchi et
al., 2006).
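The geometric figures quoted in this chapter (contact density, inter-electrode spacing, equivalent acuity) can be cross-checked with a few lines of arithmetic. The sketch below assumes a regular square grid of contacts, a retinal scale of 0.3 mm per degree (taken from the correspondence between the 10° x 7° window and the 3 x 2 mm² implant surface), and a minimum angle of resolution equal to one electrode pitch. These are simplifying assumptions made for the illustration, and the function names are hypothetical.

```python
import math

# Assumption: 10° of visual angle span 3 mm of retina, as implied by the
# correspondence between the 10° x 7° window and the 3 x 2 mm² implant.
MM_PER_DEG = 3.0 / 10.0


def contact_density(contacts, width_mm, height_mm):
    """Contacts per square millimeter of implant surface."""
    return contacts / (width_mm * height_mm)


def electrode_pitch_um(contacts, width_mm, height_mm):
    """Center-to-center spacing on a regular square grid, in micrometers."""
    return 1000.0 * math.sqrt(width_mm * height_mm / contacts)


def snellen_denominator(pitch_um):
    """Denominator x of a 20/x acuity, assuming one pitch = minimum angle of resolution."""
    pitch_arcmin = pitch_um / 1000.0 / MM_PER_DEG * 60.0
    return 20.0 * pitch_arcmin
```

For example, `electrode_pitch_um(500, 3, 2)` gives a spacing slightly above 100 µm, `snellen_denominator(100)` gives the 20/400 equivalent mentioned above, and `contact_density(256, 3, 3)` gives a value close to the ~30 pixels/mm² of the hypothetical 256-contact device discussed in section 7.2.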
The first visual prosthesis prototypes have been recently implanted in humans
with encouraging results (Humayun et al., 2003; Veraart et al., 2003; Chow et al.,
2004). Yet, several important challenges still need to be overcome before these
devices provide similar benefits to those of cochlear implants in cases of deafness
(NIH consensus 1995). The basic notion of patterned vision resulting from the
continuous stimulation of several electrodes has not been fully confirmed.
Furthermore, an appropriate method of selective stimulation eliciting the adequate
psychophysical response has not been developed yet. Multidisciplinary research is,
therefore, still needed to determine whether prototype chips can actually reach the
required spatial selectivity in neural excitation, and whether they can preserve
retinotopic mapping. One major problem is to achieve efficient electrical stimulation
within safe charge density limits (Brummer et al., 1983). The critical parameter is the
threshold current needed to achieve stimulation. To remain within safe limits,
electrode area has to be enlarged as threshold current increases, which
fundamentally limits inter-electrode spacing at the same time. While there is some
data on epiretinal stimulation in humans, the exact characteristics of the
electrophysiological response of the human retina to subretinal stimulation are still
largely unknown. Therefore, depending on the results of future human studies, the
use of relatively large stimulation electrodes might turn out to be mandatory,
preventing the 100 µm inter-electrode spacing used in our simulations. In this case,
we would rather suggest increasing the total area of the retinal array (within feasible
limits) than limiting the number of stimulation contacts. A substantial research effort
is therefore still required to solve these and other open issues before realizing the
level of electrode integration suggested by our studies.
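The trade-off described above between threshold current, electrode area, and inter-electrode spacing can be illustrated numerically. The sketch below assumes a rectangular current pulse (charge per phase = current x pulse duration) and a flat disc electrode with uniform charge density. The threshold current, pulse duration, and charge density limit in the example are placeholder values chosen only for illustration; they are not measurements or limits reported in this work.

```python
import math


def min_disc_diameter_um(threshold_current_ua: float,
                         pulse_duration_ms: float,
                         max_charge_density_mc_per_cm2: float) -> float:
    """Smallest disc electrode diameter (µm) that keeps charge density under a limit.

    Assumptions: rectangular pulse, flat disc electrode, uniform charge density.
    """
    charge_c = (threshold_current_ua * 1e-6) * (pulse_duration_ms * 1e-3)  # coulombs
    limit_c_per_m2 = max_charge_density_mc_per_cm2 * 10.0  # 1 mC/cm² = 10 C/m²
    min_area_m2 = charge_c / limit_c_per_m2
    diameter_m = 2.0 * math.sqrt(min_area_m2 / math.pi)
    return diameter_m * 1e6  # m -> µm


# Hypothetical example values (NOT from the thesis): 10 µA, 1 ms, 0.1 mC/cm²
d_low = min_disc_diameter_um(10.0, 1.0, 0.1)
# Quadrupling the threshold current doubles the required diameter
d_high = min_disc_diameter_um(40.0, 1.0, 0.1)
print(f"{d_low:.1f} µm -> {d_high:.1f} µm")
```

Because the required area grows linearly with the threshold charge, the required diameter grows with its square root. With these placeholder numbers the electrode alone would already be wider than a 100 µm pitch, which is the situation the paragraph above warns about.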
We consider that our methodical investigation of the minimum
requirements for useful artificial vision is now quite complete. On the basis
of the results reported here, we can conclude that retinal implants might
be able to restore some reading, visuomotor, and whole-body mobility
abilities to blind patients. Between 400 and 500 phosphenes, retinotopically
arranged over a 10° x 7° retinal area (corresponding to an implant surface
of 3 x 2 mm²), appear to be the minimum visual information required to
restore useful function. However, the effective field of view represented by
the active area of the implant will have to be optimized for each task. A
highly magnified effective visual field simultaneously containing strings of
4-6 characters and about two lines of text (about 2° x 1.4° for a typical
newspaper) is required for efficient reading; an effective visual field of
16.4° x 11.6° seems to allow for efficient visuomotor coordination; and an
effective field of view of 33° x 23° appears to be necessary for tasks
involving whole-body mobility. Adequate lenses/optics could be used to
adapt the available field of view to the task at hand. A significant learning
process will be required to reach optimal performance with such devices,
especially if the implant has to be placed outside the fovea. Visual
prostheses should aim to meet these criteria in order to provide efficient
functional rehabilitation to blind patients.
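The three effective fields of view quoted above imply different optical magnifications of the scene onto the same 10° x 7° stimulated retinal area. The sketch below computes that ratio for each task. Expressing it as a single linear factor taken along the horizontal axis is a simplification introduced here, and all names are illustrative.

```python
# Retinal area covered by the implant, in degrees of visual angle (width, height)
IMPLANT_FIELD_DEG = (10.0, 7.0)

# Effective fields of view reported in the conclusions, per task (width, height)
EFFECTIVE_FIELDS_DEG = {
    "reading": (2.0, 1.4),
    "visuomotor coordination": (16.4, 11.6),
    "whole-body mobility": (33.0, 23.0),
}


def magnification(effective_deg, implant_deg=IMPLANT_FIELD_DEG):
    """Linear magnification needed to map the effective field onto the implant area.

    Values above 1 magnify the scene; values below 1 minify it.
    """
    return implant_deg[0] / effective_deg[0]


for task, field in EFFECTIVE_FIELDS_DEG.items():
    print(f"{task}: x{magnification(field):.2f}")
```

Reading calls for roughly fivefold magnification, whereas the visuomotor and mobility tasks call for minification (factors below 1), which is why task-specific optics are suggested.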
8 References
Altpeter, E., Mackeben, M., & Trauzettel-Klosinski, S. (2000). The importance of sustained
attention for patients with maculopathies. Vision Res, 40(10-12), 1539-1547.
Antes, J.R. (1974). The time course of picture viewing. J Exp Psychol, 103(1), 62-70.
Apfelbaum, H., Pelah, A., & Peli, E. (2006). Collision detection by "tunnel vision" patients
walking in a virtual reality environment.
http://www.eri.harvard.edu/faculty/peli/papers/TAP_Collision_detection_051118.05.pdf
Arditi, A., Knoblauch, K., & Grunwald, I. (1990). Reading with fixed and variable character pitch.
J Opt Soc Am A, 7(10), 2011-2015.
Bachmann, T. (1991). Identification of spatially quantised tachistoscopic images of faces: How
many pixels does it take to carry identity? Eur J Cogn Psychol, 3, 87-107.
Bachmann, T. & Kahusk, N. (1997). The effects of coarseness of quantisation, exposure
duration, and selective spatial attention on the perception of spatially quantised ('blocked')
visual images. Perception, 26(9), 1181-1196.
Bagnoud, M., Sommerhalder, J., Pelizzone, M., & Safran, A.B. (2001). Information visuelle
necessaire a la restauration d'une lecture au moyen d'un implant retinien chez un aveugle
par degenerescence massive des photorecepteurs. Klin Monatsbl Augenheilkd, 218(5), 360-362.
Bai, Q. & Wise, K.D. (2001). Single-unit neural recording with active microelectrode arrays. IEEE
Trans Biomed Eng, 48(8), 911-920.
Bai, Q., Wise, K.D., & Anderson, D.J. (2000). A high-yield microassembly structure for
three-dimensional microelectrode arrays. IEEE Trans Biomed Eng, 47(3), 281-289.
Bak, M., Girvin, J.P., Hambrecht, F.T., Kufta, C.V., Loeb, G.E., & Schmidt, E.M. (1990). Visual
sensations produced by intracortical microstimulation of the human occipital cortex. Med Biol
Eng Comput, 28(3), 257-259.
Baldasare, J. & Watson, G. (1986). Observations from the psychology of reading relevant to low
vision research. In: G.C. Woo (Ed.), Low vision principles and applications (pp. 272-286).
New York: Springer Verlag.
Beard, B.L., Levi, D.M., & Reich, L.N. (1995). Perceptual learning in parafoveal vision. Vision
Res, 35(12), 1679-1690.
Beckmann, P.J. & Legge, G.E. (1996). Psychophysics of reading XIV. The page navigation
problem in using magnifiers. Vision Res, 36(22), 3723-3733.
Berthoz, A. (1997). Le sens du mouvement. Paris: Odile Jacob.
Bingham, G.P. & Pagano, C.C. (1998). The necessity of a perception-action approach to definite
distance perception: monocular distance perception to guide reaching. J Exp Psychol Hum
Percept Perform, 24(1), 145-168.
Bizzi, E., Accornero, N., Chapple, W., & Hogan, N. (1984). Posture control and trajectory
formation during arm movement. J Neurosci, 4(11), 2738-2744.
Blau, A., Ziegler, C., Heyer, M., Endres, F., Schwitzgebel, G., Matthies, T., Stieglitz, T., Meyer,
J.U., & Gopel, W. (1997). Characterization and optimization of microelectrode arrays for in
vivo nerve signal recording and stimulation. Biosens Bioelectron, 12(9-10), 883-892.
Blouin, J., Bard, C., Teasdale, N., Paillard, J., Fleury, M., Forget, R., & Lamarre, Y. (1993).
Reference systems for coding spatial information in normal subjects and a deafferented
patient. Exp Brain Res, 93(2), 324-331.
Blouin, J., Gauthier, G.M., Vercher, J.L., & Cole, J. (1996). The relative contribution of retinal
and extraretinal signals in determining the accuracy of reaching movements in normal
subjects and a deafferented patient. Exp Brain Res, 109(1), 148-153.
Bossom, J. (1974). Movement without proprioception. Brain Res, 71(2-3), 285-296.
Boughman, J.A., Conneally, P.M., & Nance, W.E. (1980). Population genetic studies of retinitis
pigmentosa. Am J Hum Genet, 32(2), 223-235.
Bowers, A.R. & Reid, V.M. (1997). Eye movements and reading with simulated visual
impairment. Ophthalmic Physiol Opt, 17(5), 392-402.
Boyle, J.R., Maeder, A.J., & Boles, W.W. (2002). Image enhancement for electronic visual
prostheses. Australas Phys Eng Sci Med, 25(2), 81-86.
Brindley, G.S. (1973). Sensory effects of electrical stimulation of the visual and paravisual cortex
in man. In: R. Jung (Ed.), Handbook of Sensory Physiology (pp. 583-594). New York:
Springer-Verlag.
Brindley, G.S. & Lewin, W.S. (1968a). The sensations produced by electrical stimulation of the
visual cortex. J Physiol, 196(2), 479-493.
Brindley, G.S. & Lewin, W.S. (1968b). The visual sensations produced by electrical stimulation of
the medial occipital cortex. J Physiol, 194(2), 54-55P.
Brummer, S.B., Robblee, L.S., & Hambrecht, F.T. (1983). Criteria for selecting electrodes for
electrical stimulation: theoretical and practical considerations. Ann N Y Acad Sci, 405, 159-171.
Buultjens, M., Aitken, S., Ravenscroft, J., & Carey, K. (1999). Size counts: The significance of
size, font and style of print for readers with low vision sitting examinations. Br J Vis Impair,
17, 5-10.
Campbell, P.K., Jones, K.E., Huber, R.J., Horch, K.W., & Normann, R.A. (1991). A silicon-based,
three-dimensional neural interface: manufacturing processes for an intracortical electrode
array. IEEE Trans Biomed Eng, 38(8), 758-768.
Cha, K., Horch, K.W., & Normann, R.A. (1992a). Mobility performance with a pixelized vision
system. Vision Res, 32(7), 1367-1372.
Cha, K., Horch, K.W., Normann, R.A., & Boman, D.K. (1992b). Reading speed with a pixelized
vision system. J Opt Soc Am A, 9(5), 673-677.
Chanderli, K. (2002). Stratégies de couplage action perception lors du pointage locomoteur à
l'approche d'une cible. PhD thesis No. 300, Université de Genève, Geneva, Switzerland.
Chen, H., Yao, D., & Liu, Z. (2004). A study on asymmetry of spatial visual field by analysis of
the fMRI BOLD response. Brain Topogr, 17(1), 39-46.
Chow, A.Y. & Chow, V.Y. (1997). Subretinal electrical stimulation of the rabbit retina. Neurosci
Lett, 225(1), 13-16.
Chow, A.Y., Chow, V.Y., Packo, K.H., Pollack, J.S., Peyman, G.A., & Schuchard, R. (2004). The
artificial silicon retina microchip for the treatment of vision loss from retinitis pigmentosa.
Arch Ophthalmol, 122(4), 460-469.
Chow, A.Y., Packo, K.H., Pollack, J.S., & Schuchard, R.A. (2003). Subretinal Artificial Silicon
Retina Microchip Implantation in Retinitis Pigmentosa Patients: Long Term Follow-Up. Invest
Ophthalmol Vis Sci, 44(5), 4205 (abstract).
Chow, A.Y., Pardue, M.T., Chow, V.Y., Peyman, G.A., Liang, C., Perlman, J.I., & Peachey, N.S.
(2001). Implantation of silicon chip microphotodiode arrays into the cat subretinal space.
IEEE Trans Neural Syst Rehabil Eng, 9(1), 86-95.
Chow, A.Y., Peyman, G.A., Packo, K.H., & Pollack, J.S. (2002). The artificial silicon retina (ASR)
prosthesis for the treatment of retinitis pigmentosa. Exp Eye Res, 27(2 (suppl.)), 95
(abstract).
Christie, F. & Bruce, V. (1998). The role of dynamic information in the recognition of unfamiliar
faces. Mem Cognit, 26(4), 780-790.
Chung, S.T. (2004). Reading speed benefits from increased vertical word spacing in normal
peripheral vision. Optom Vis Sci, 81(7), 525-535.
Chung, S.T., Legge, G.E., & Cheung, S.H. (2004). Letter-recognition and reading speed in
peripheral vision benefit from perceptual learning. Vision Res, 44(7), 695-709.
Chung, S.T., Mansfield, J.S., & Legge, G.E. (1998). Psychophysics of reading. XVIII. The effect
of print size on reading speed in normal peripheral vision. Vision Res, 38(19), 2949-2962.
Ciulla, T.A., Danis, R.P., & Harris, A. (1998). Age-related macular degeneration: a review of
experimental treatments. Surv Ophthalmol, 43(2), 134-146.
Clausen, J. (1955). Visual sensations (phosphenes) produced by AC sine wave stimulation. Acta
Psychiatr Neurol Scand, 94, 1-101.
Coello, Y. & Grealy, M.A. (1997). Effect of size and frame of visual field on the accuracy of an
aiming movement. Perception, 26(3), 287-300.
Congdon, N.G., Friedman, D.S., & Lietman, T. (2003). Important causes of visual impairment in
the world today. JAMA, 290(15), 2057-2060.
Cornelissen, F.W., Bruin, K.J., & Kooijman, A.C. (2005). The influence of artificial scotomas on
eye movements during visual search. Optom Vis Sci, 82(1), 27-35.
Cornelissen, F.W., Peters, E.M., & Palmer, J. (2002). The Eyelink Toolbox: eye tracking with
MATLAB and the Psychophysics Toolbox. Behav Res Methods Instrum Comput, 34(4), 613-617.
Cornelissen, F.W. & Van den Dobbelsteen, J.J. (1999). Heading detection with simulated visual
field defects. Vis Impairment Res, 1(3), 71-84.
Costen, N.P., Parker, D.M., & Craw, I. (1994). Spatial content and spatial quantisation effects in
face recognition. Perception, 23(2), 129-146.
Cotter, J.R. (1990). The Visual Pathway: An Introduction to Structure and Organization. In: K.N.
Leibovic (Ed.), Science of Vision (pp. 3-15). New York: Springer-Verlag.
Cowey, A. & Rolls, E.T. (1974). Human cortical magnification factor and its relation to visual
acuity. Exp Brain Res, 21(5), 447-454.
Crist, R.E., Kapadia, M.K., Westheimer, G., & Gilbert, C.D. (1997). Perceptual learning of spatial
localization: specificity for orientation, position, and context. J Neurophysiol, 78(6), 2889-2894.
Crist, R.E., Li, W., & Gilbert, C.D. (2001). Learning to see: experience and attention in primary
visual cortex. Nat Neurosci, 4(5), 519-525.
Cummings, R.W., Whittaker, S.G., Watson, G.R., & Budd, J.M. (1985). Scanning characters and
reading with a central scotoma. Am J Optom Physiol Opt, 62(12), 833-843.
Curcio, C.A., Sloan, K.R.J., Packer, O., Hendrickson, A.E., & Kalina, R.E. (1987). Distribution of
cones in human and monkey retina: individual variability and radial asymmetry. Science,
236(4801), 579-582.
Cursiefen, C., Holbach, L.M., Schlotzer-Schrehardt, U., & Naumann, G.O. (2001). Persisting
retinal ganglion cell axons in blind atrophic human eyes. Graefes Arch Clin Exp Ophthalmol,
239(2), 158-164.
Cutting, J.E. & Vishton, P.M. (1995). Perceiving layout and knowing distances: the integration,
relative potency, and contextual use of different information about depth. In: W. Epstein &
S.J. Rogers (Eds.), Perception of space and motion (pp. 69-117). London: Academic Press.
Dagnelie, G., Kelley, A.J., & Yang, L. (2004). Effects of image stabilization on face recognition
and virtual mobility using simulated prosthetic vision. Invest Ophthalmol Vis Sci, 45(5),
4223 (abstract).
Dagnelie, G., Thompson, R.W., Barnett, D., & Zhang, W.Q. (2000). Visual perception and
performance under conditions simulating artificial vision. Perception, 29(suppl.), 84
(abstract).
Daniel, P. & Whitteridge, D. (1961). The representation of the visual field on the cerebral cortex
in monkeys. J Physiol (Paris), 159, 203-221.
de Balthasar, C., Cosendai, G., & Pelizzone, M. (1999). Simulations of the effects of electrical
stimulation selectivity on speech reception with cochlear implants. Med Hyg, 2273, 1984-1988.
De Graef, P., Christiaens, D., & d'Ydewalle, G. (1990). Perceptual effects of scene context on
object identification. Psychol Res, 52(4), 317-329.
Delbeke, J., Gerard, B., Lambert, V., Laloyaux, C., Schmitt, C., & Veraart, C. (2003a). A First
Attempt to Translate Images in Optic Nerve Stimuli for a Visual Prosthesis. Invest
Ophthalmol Vis Sci, 44(5), 5074 (abstract).
Delbeke, J., Oozeer, M., & Veraart, C. (2003b). Position, size and luminosity of phosphenes
generated by direct optic nerve stimulation. Vision Res, 43(9), 1091-1102.
Delbeke, J., Pins, D., Michaux, G., Wanet-Defalque, M.C., Parrini, S., & Veraart, C. (2001).
Electrical stimulation of anterior visual pathways in retinitis pigmentosa. Invest Ophthalmol
Vis Sci, 42(1), 291-297.
Delbeke, J., Wanet-Defalque, M.C., Gerard, B., Troosters, M., Michaux, G., & Veraart, C. (2002).
The microsystems based visual prosthesis for optic nerve stimulation. Artif Organs, 26(3),
232-234.
Desmurget, M., Pelisson, D., Rossetti, Y., & Prablanc, C. (1998). From eye to hand: planning
goal-directed movements. Neurosci Biobehav Rev, 22(6), 761-788.
Desmurget, M., Rossetti, Y., Jordan, M., Meckler, C., & Prablanc, C. (1997). Viewing the hand
prior to movement improves accuracy of pointing performed toward the unseen contralateral
hand. Exp Brain Res, 115(1), 180-186.
Dichgans, J., Bizzi, E., Morasso, P., & Tagliasco, V. (1974). The role of vestibular and neck
afferents during eye-head coordination in the monkey. Brain Res, 71(2-3), 225-232.
Dobelle, W.H. (2000). Artificial vision for the blind by connecting a television camera to the
visual cortex. ASAIO J, 46(1), 3-9.
Dobelle, W.H. & Mladejowsky, M.G. (1974). Phosphenes produced by electrical stimulation of
human occipital cortex, and their application to the development of a prosthesis for the
blind. J Physiol, 243(2), 553-576.
Dobelle, W.H., Mladejowsky, M.G., Evans, J.R., Roberts, T.S., & Girvin, J.P. (1976). "Braille"
reading by a blind volunteer by visual cortex stimulation. Nature, 259(5539), 111-112.
Dobelle, W.H., Mladejowsky, M.G., & Girvin, J.P. (1974). Artificial vision for the blind: electrical
stimulation of visual cortex offers hope for a functional prosthesis. Science, 183(123), 440-444.
Dorman, M.F. & Loizou, P.C. (1997). Speech intelligibility as a function of the number of
channels of stimulation for normal-hearing listeners and patients with cochlear implants. Am
J Otol, 18(6 (suppl.)), S113-S114.
Doyle, J.B., Doyle, J.H., Turnbull, F.M., Abbey, J., & House, L. (1963). Electrical stimulation in
eighth nerve deafness. A preliminary report. Bull Los Angel Neuro Soc, 28, 148-150.
Drasdo, N. & Fowler, C.W. (1974). Non-linear projection of the retinal image in a wide-angle
schematic eye. Br J Ophthalmol, 58(8), 709-714.
Eckmiller, R. (1997). Learning retina implants with epiretinal contacts. Ophthalmic Res, 29(5),
281-289.
Eddington, D.K., Dobelle, W.H., Brackmann, D.E., Mladejowsky, M.G., & Parkin, J.L. (1998a).
Auditory prosthesis research with multiple channel intracochlear stimulation in man. Ann
Otol Rhinol Laryngol, 87(S53), 5-39.
Eddington, D.K., Dobelle, W.H., Brackmann, D.E., Mladejowsky, M.G., & Parkin, J.L. (1998b).
Place and periodicity pitch by stimulation of multiple scala tympani electrodes in deaf
volunteers. Trans Am Soc Artif Intern Organs, XXIV, 1.
Einthoven, W. & Jolly, W.A. (1908). The form and magnitude of the electric response of the eye
to stimulation by light at various intensities. Q J Exp Physiol, 1(4), 373-416.
Elfar, S.D., Cottaris, N.P., Iezzi, R., & Abrams, G.W. (2004). Rapid mapping of cortical
multi-electrode arrays and its application for the evaluation of retinal prostheses. Invest
Ophthalmol Vis Sci, 45(5), 3403 (abstract).
Elliott, D.B., Trukolo-Ilic, M., Strong, J.G., Pace, R., Plotkin, A., & Bevers, P. (1997).
Demographic characteristics of the vision-disabled elderly. Invest Ophthalmol Vis Sci,
38(12), 2566-2575.
Fine, E.M., Hazel, C.A., Petre, K.L., & Rubin, G.S. (1999). Are the benefits of sentence context
different in central and peripheral vision? Optom Vis Sci, 76(11), 764-769.
Fine, E.M., Kirschen, M.P., & Peli, E. (1996). The necessary field of view to read with an optimal
stand magnifier. J Am Optom Assoc, 67(7), 382-389.
Fine, E.M. & Peli, E. (1996a). The role of context in reading with central field loss. Optom Vis
Sci, 73(8), 533-539.
Fine, E.M. & Peli, E. (1996b). Visually impaired observers require a larger window than normally
sighted observers to read from a scroll display. J Am Optom Assoc, 67(7), 390-396.
Fishman, H.A., Palanker, D.V., Mehenti, N.Z., Marmor, M.F., Bent, S.F., & Blumenkranz, M.S.
(2004). Design of a Neurotransmitter-Based Retinal Prosthetic Chip Powered by the Ambient
Light. Invest Ophthalmol Vis Sci, 45(5), 3402 (abstract).
Fishman, H.A., Peterman, M.C., Leng, T., Huie, P., Lee, C.J., Bloom, D.M., Sanislo, S.R.,
Marmor, M.F., Bent, S.F., & Blumenkranz, M.S. (2002). The Artificial Synapse Chip: A Novel
Interface for a Retinal Prosthesis based on Neurotransmitter Stimulation and Nerve
Regeneration. Invest Ophthalmol Vis Sci, 43(12), 2846 (abstract).
Fishman, H.A., Peterman, M.C., Marmor, M.F., & Blumenkranz, M.S. (2003). The Artificial
Synapse Chip: Multi-Cellular Neurotransmitter-Based Retinal Stimulation. Invest Ophthalmol
Vis Sci, 44(5), 5080 (abstract).
Fitzgibbon, T. & Taylor, S.F. (1996). Retinotopy of the human retinal nerve fibre layer and optic
nerve head. J Comp Neurol, 375(2), 238-251.
Fletcher, D.C. & Schuchard, R.A. (1997). Preferred retinal loci relationship to macular scotomas
in a low-vision population. Ophthalmology, 104(4), 632-638.
Foerster, O. (1929). Beiträge zur pathophysiologie der sehsphäre. J Psychol Neurol, 39, 463-485.
Foley, J.M. & Held, R. (1972). Visually directed pointing as a function of target distance,
direction, and available cues. Percept Psychophys, 12, 263-268.
Foulke, E. (1971). The perceptual basis for mobility. In: Research Bulletin No. 23 (pp. 1-8).
New York: American Foundation for the Blind.
Frick, K.D. & Foster, A. (2003). The magnitude and cost of global blindness: an increasing
problem that can be alleviated. Am J Ophthalmol, 135(4), 471-476.
Fujii, G.Y., Humayun, M.S., Weiland, J., Greenberg, R., Mech, B., Little, J., Rossi, J.V., Yanai, D.,
Tameesh, M.K., & Panzan, C.Q. (2003). Intraocular Retinal Prosthesis: First Generation
Implant and its Surgical Technique. Invest Ophthalmol Vis Sci, 44(5), 5079 (abstract).
Gasperini, J.L., Walraven, T.L., McAllister, J.P., Auner, G., Abrams, G., Givens, R., & Iezzi, R.
(2003). The Neuroprotective Effects of Aspirin and MK-801 Against Un-Caged
Caged-Glutamate for Use in a Visual Prosthetic Device. Invest Ophthalmol Vis Sci, 44(5), 5078
(abstract).
Gauthier, G.M., Nommay, D., & Vercher, J.L. (1990). The role of ocular muscle proprioception in
visual localization of targets. Science, 249(4964), 58-61.
Geruschat, D.R. & Del'Aune, W. (1989). Reliability and validity of O&M instructor observations. J
Vis Impair Blind, 83, 457-460.
Geruschat, D.R., Turano, K.A., & Stahl, J.W. (1998). Traditional measures of mobility
performance and retinitis pigmentosa. Optom Vis Sci, 75(7), 525-537.
Ghez, C., Gordon, J., & Ghilardi, M.F. (1995). Impairments of reaching movements in patients
without proprioception. II. Effects of visual information on accuracy. J Neurophysiol, 73(1),
361-372.
Goodale, M.A. & Milner, A.D. (1992). Separate visual pathways for perception and action.
Trends Neurosci, 15(1), 20-25.
Goodale, M.A. & Westwood, D.A. (2004). An evolving view of duplex vision: separate but
interacting cortical pathways for perception and action. Curr Opin Neurobiol, 14(2), 203-211.
Gooley, J.J., Lu, J., Chou, T.C., Scammell, T.E., & Saper, C.B. (2001). Melanopsin in cells of
origin of the retinohypothalamic tract. Nat Neurosci, 4(12), 1165.
Gray, H. (1918). Anatomy of the Human Body. Philadelphia: Lea & Febiger; Bartleby.com,
accessed 19/03/2004. www.bartleby.com.
Greenberg, R.J. (2000). Visual Prostheses: A Review. Neuromodulation, 3(3), 161-165.
Greenberg, R.J., Velte, T.J., Humayun, M.S., Scarlatis, G.N., & de Juan Jr., E. (1999). A
computational model of electrical stimulation of the retinal ganglion cell. IEEE Trans Biomed
Eng, 46(5), 505-514.
Grumet, A.E., Wyatt, J.L., & Rizzo, J.F. (2000). Multi-electrode stimulation and recording in the
isolated retina. J Neurosci Methods, 101(1), 31-42.
Guthrie, B.L., Porter, J.D., & Sparks, D.L. (1983). Corollary discharge provides accurate eye
position information to the oculomotor system. Science, 221(4616), 1193-1195.
Haim, M., Holm, N.V., & Rosenberg, T. (1992). A population survey of retinitis pigmentosa and
allied disorders in Denmark. Completeness of registration and quality of data. Acta
Ophthalmologica (Copenh), 70(2), 165-177.
Hallett, P.E. (1978). Primary and secondary saccades to goals defined by instructions. Vision
Res, 18(10), 1279-1296.
Hamzavi, J.S., Baumgartner, W.D., Adunka, O., Franz, P., & Gstoettner, W. (2000). Audiological
performance with cochlear reimplantation from analogue single-channel implants to digital
multi-channel devices. Audiology, 39(6), 305-310.
Harding, S. (2003). Extracts from "concise clinical evidence". Diabetic retinopathy. BMJ,
326(7397), 1023-1025.
Harland, S., Legge, G.E., & Luebker, A. (1998). Psychophysics of reading. XVII. Low-vision
performance with four types of electronically magnified text. Optom Vis Sci, 75(3), 183-190.
Harmon, L.D. & Julesz, B. (1973). Masking in visual recognition: effects of two-dimensional
filtered noise. Science, 180(91), 1194-1197.
Hayes, J.S., Yin, V.T., Piyathaisere, D., Weiland, J.D., Humayun, M.S., & Dagnelie, G. (2003).
Visually guided performance of simple tasks using simulated prosthetic vision. Artif Organs,
27(11), 1016-1028.
Hazel, C.A., Petre, K.L., Armstrong, R.A., Benson, M.T., & Frost, N.A. (2000). Visual function and
subjective quality of life compared in subjects with acquired macular disease. Invest
Ophthalmol Vis Sci, 41(6), 1309-1315.
Heinen, S.J. & Skavenski, A.A. (1992). Adaptation of saccades and fixation to bilateral foveal
lesions in adult monkey. Vision Res, 32(2), 365-373.
Held, R. & Freedman, S.J. (1963). Plasticity in Human Sensorimotor Control. Science,
142(3591), 455-462.
Henderson, J.M. & Hollingworth, A. (2003). Eye movements and visual memory: detecting
changes to saccade targets in scenes. Percept Psychophys, 65(1), 58-71.
Henderson, J.M., McClure, K.K., Pierce, S., & Schrock, G. (1997). Object identification without
foveal vision: evidence from an artificial scotoma paradigm. Percept Psychophys, 59(3),
323-346.
Higgins, K.E. & Wood, J.M. (1998). Predicting closed road sign recognition performance from
vision tests. In: Vision science and its applications (pp. 42-45). Washington DC: Optical
Society of America.
Higgins, K.E. & Wood, J.M. (2005). Predicting components of closed road driving performance
from vision tests. Optom Vis Sci, 82(8), 647-656.
Higgins, K.E., Wood, J.M., & Tait, A. (1996). Closed road driving performance: effect of
degradation of visual acuity. In: Vision science and its applications (pp. 78-81).
Washington DC: Optical Society of America.
Hill, E., Rieser, J., Hill, M., Halpin, J., & Halpin, R. (1993). How persons with visual impairments
explore novel spaces: Strategies of good and poor performers. J Vis Impair Blind, 87, 295-301.
Hims, M.M., Daiger, S.P., & Inglehearn, C.F. (2003). Retinitis pigmentosa: genes, proteins and
prospects. Dev Ophthalmol, 37, 109-125.
Hollingworth, A., Schrock, G., & Henderson, J.M. (2001). Change detection in the flicker
paradigm: the role of fixation position within the scene. Mem Cognit, 29(2), 296-304.
Hooge, I.T. & Erkelens, C.J. (1999). Peripheral vision and oculomotor control during visual
search. Vision Res, 39(8), 1567-1575.
Hoogerwerf, A.C. & Wise, K.D. (1994). A three-dimensional microelectrode array for chronic
neural recording. IEEE Trans Biomed Eng, 41(12), 1136-1146.
Huie, P., Palanker, D.V., Vankov, A., Aramant, R.B., Seiler, M.J., Fishman, H.A., Marmor, M.F., &
Blumenkranz, M.S. (2004). Migration of neural retina through a perforated membrane
implanted in the subretinal space of RCS rats: implications for prosthetics. Invest
Ophthalmol Vis Sci, 45(5), 4202 (abstract).
Huie, P., Palanker, D.V., Vankov, A., Fishman, H.A., Marmor, M.F., & Blumenkranz, M.S. (2003).
Perforated Membrane as an Interface for Focal Electrical Stimulation of Retina. Invest
Ophthalmol Vis Sci, 44(5), 5055 (abstract).
Huie, P., Peterman, M.C., Leng, T., Lee, C.J., Marmor, M.F., Bloom, D.M., Blumenkranz, M.S., &
Fishman, H.A. (2002). Tissue-engineered Neurite Conduits to Connect Retinal Ganglion Cells
to an Electronic Retinal Prosthesis. Invest Ophthalmol Vis Sci, 43(12), 4475 (abstract).
Huk, A.C., Palmer, J., & Shadlen, M.N. (2002). Temporal integration of visual motion
information: Evidence from response times. J Vis, 2(7), 228a (abstract).
Humayun, M.S. (2001). Intraocular retinal prosthesis. Trans Am Ophthalmol Soc, 99, 271-300.
Humayun, M.S., de Juan Jr., E., Dagnelie, G., Greenberg, R.J., Propst, R.H., & Phillips, D.H.
(1996). Visual perception elicited by electrical stimulation of retina in blind humans. Arch
Ophthalmol, 114(1), 40-46.
Humayun, M.S., de Juan Jr., E., Weiland, J.D., Dagnelie, G., Katona, S., Greenberg, R., &
Suzuki, S. (1999). Pattern electrical stimulation of the human retina. Vision Res, 39(15),
2569-2576.
Humayun, M.S., Weiland, J.D., Fujii, G.Y., Greenberg, R., Williamson, R., Little, J., Mech, B.,
Cimmarusti, V., Van Boemel, G., Dagnelie, G., & de Juan, E. (2003). Visual perception in a
blind subject with a chronic microelectronic retinal prosthesis. Vision Res, 43(24), 2573-2581.
Humphries, P., Kenna, P., & Farrar, G.J. (1992). On the molecular genetics of retinitis
pigmentosa. Science, 256(5058), 804-808.
Iezzi, R., Cottaris, N.P., Elfar, S.D., Walraven, T.L., Raza, T.M., Moncrieff, R., McAllister, J.P.,
Auner, G.W., Johnson, R.R., & Abrams, G.W. (2003). Neurotransmitter-based retinal
prosthesis modulation of retinal ganglion cell responses in vivo. Invest Ophthalmol Vis Sci,
44(5), 5083 (abstract).
Iezzi, R., Walraven, T., & Abrams, G. (2004). Toxicological profiles of the phototriggerable
molecules MNI and MCNI glutamate for use in visual prostheses. Invest Ophthalmol Vis Sci,
45(5), 4221 (abstract).
Iezzi, R., Walraven, T.L., McAllister, J.P., Givens, R., Auner, G., & Abrams, G. (2002).
Biocompatibility of Caging Chromophores for Use in Retinal and Cortical Visual Prostheses.
Invest Ophthalmol Vis Sci, 43(12), 4478 (abstract).
Ito, Y., Yagi, T., Kanda, H., Tanaka, S., Watanabe, M., & Uchikawa, Y. (1999). Cultures of
neurons on micro-electrode array in hybrid retinal implant. Proc '99 Conf Syst Man Cybern,
4, 414-417.
Jeannerod, M. (1981). Intersegmental coordination during reaching at natural visual objects. In:
J. Long & A. Baddeley (Eds.), Attention and Performance IX (pp. 153-168). Hillsdale NJ:
Lawrence Erlbaum Associates.
Jeannerod, M. (1984). The timing of natural prehension movements. J Mot Behav, 16(3), 235-254.
Jeannerod, M. & Biguer, B. (1982). Visuomotor mechanisms in reaching within extrapersonal
space. In: D.J. Ingle, M.A. Goodale, & J.W. Mansfield (Eds.), Analysis of Visual Behavior
(pp. 387-409). Cambridge, MA: The MIT Press.
Jones, K.E., Campbell, P.K., & Normann, R.A. (1992). A glass/silicon composite intracortical
electrode array. Ann Biomed Eng, 20(4), 423-437.
Jones, K.E. & Normann, R.A. (1997). An advanced demultiplexing system for physiological
stimulation. IEEE Trans Biomed Eng, 44(12), 1210-1220.
Kaczmarek, K.A. (2000). Sensory augmentation and substitution. In: J.D. Bronzino (Ed.), The
Biomedical Engineering Handbook (pp. 143.1-143.10). Boca Raton: CRC Press.
Kapi, S.S., Walraven, T.L., Abrams, G.W., & Iezzi, R. (2004). Caged neurotransmitters for visual
prosthesis: Toxicological profiles for the phototriggerable cage NPEC. Invest Ophthalmol Vis
Sci, 45(5), 4205 (abstract).
Kelley, A.J., Yang, L., & Dagnelie, G. (2004). The effects of stabilization, font scaling and
practice on reading in simulated prosthetic vision. Invest Ophthalmol Vis Sci, 45(5), 5436
(abstract).
Kelso, J.A. & Holt, K.G. (1980). Exploring a vibratory systems analysis of human movement
production. J Neurophysiol, 43(5), 1183-1196.
Kerdraon, Y.A., Downie, J.A., Suaning, G.J., Capon, M.R., Coroneo, M.T., & Lovell, N.H. (2002).
Development and surgical implantation of a vision prosthesis model into the ovine eye. Clin
Experiment Ophthalmol, 30(1), 36-40.
Kerkhoff, G. (1999). Restorative and compensatory therapy approaches in cerebral blindness - a
review. Restor Neurol Neurosci, 15(2-3), 255-271.
Kewley, D.T., Hills, M.D., Borkholder, D.A., Opris, I.E., Maluf, N.I., Storment, C.W., Bower, J.M., &
Kovacs, G.T.A. (1997). Plasma-etched neural probes. Sens Actuators A Phys, 58(1), 27-35.
Kiang, N.Y., Eddington, D.K., & Delgutte, B. (1979). Fundamental considerations in designing
auditory implants. Acta Otolaryngol, 87(3-4), 204-218.
Kim, S.Y., Sadda, S., Humayun, M.S., de Juan Jr., E., Melia, B.M., & Green, W.R. (2002a).
Morphometric analysis of the macula in eyes with geographic atrophy due to age-related
macular degeneration. Retina, 22(4), 464-470.
Kim, S.Y., Sadda, S., Pearlman, J., Humayun, M.S., de Juan Jr., E., Melia, B.M., & Green, W.R.
(2002b). Morphometric analysis of the macula in eyes with disciform age-related macular
degeneration. Retina, 22(4), 471-477.
Klomp, G.F., Womack, M.V., & Dobelle, W.H. (1977). Fabrication of large arrays of cortical
electrodes for use in man. J Biomed Mater Res, 11(3), 347-364.
Kohler, K., Hartmann, J.A., Werts, D., & Zrenner, E. (2001). Histologische Untersuchungen zur
netzhautdegeneration und Gewebeverträglichkeit subretinaler Implantate. Ophthalmologe,
98(4), 364-368.
Kolb, H., Fernandez, E., & Nelson, R. (2003). Webvision. John Moran Eye Center,
University of Utah, accessed 27/11/2003. http://www.webvision.med.utah.edu.
Krause, F. (1924). Die sehbahnen in chirurgischer beziehung und die faradische reizung des
sehzentrums. Klin Wochenschr, 3, 1260-1265.
Kuyk, T. & Elliott, J.L. (1999). Visual factors and mobility in persons with age-related macular
degeneration. J Rehabil Res Dev, 36(4), 303-312.
Kuyk, T., Elliott, J.L., Biehl, J., & Fuhr, P.S. (1996). Environmental variables and mobility
performance in adults with low vision. J Am Optom Assoc, 67(7), 403-409.
Kuyk, T., Elliott, J.L., & Fuhr, P.S. (1998). Visual correlates of mobility in real world settings in
older adults with low vision. Optom Vis Sci, 75(7), 538-547.
Kuyk, T., Elliott, J.L., Wesley, J., Scilley, K., McIntosh, E., Mitchell, S., & Owsley, C. (2004).
Mobility function in older veterans improves after blind rehabilitation. J Rehabil Res Dev,
41(3A), 337-346.
Lakhanpal, R.R., Yanai, D., Weiland, J.D., Fujii, G.Y., Caffey, S., Greenberg, R.J., de Juan Jr., E.,
& Humayun, M.S. (2003). Advances in the development of visual prostheses. Curr Opin
Ophthalmol, 14(3), 122-127.
Lambert, V., Laloyaux, C., Schmitt, C., Gerard, B., Delbeke, J., & Veraart, C. (2003).
Localisation, Discrimination, and Grasping of Daily Life Objects with an Implanted Optic
Nerve Prosthesis. Invest Ophthalmol Vis Sci, 44(5), 4208 (abstract).
Land, M., Mennie, N., & Rusted, J. (1999). The roles of vision and eye movements in the control
of activities of daily living. Perception, 28(11), 1311-1328.
Lander, K., Christie, F., & Bruce, V. (1999). The role of movement in the recognition of famous
faces. Mem Cognit, 27(6), 974-985.
Lappin, J.S., Tadin, D., & Whittier, E.J. (2002). Visual coherence of moving and stationary image
changes. Vision Res, 42(12), 1523-1534.
Lateiner, J.E. & Sainburg, R.L. (2003). Differential contributions of vision and proprioception to
movement accuracy. Exp Brain Res, 151(4), 446-454.
Latham, K. & Whitaker, D. (1996). A comparison of word recognition and reading performance
in foveal and peripheral vision. Vision Res, 36(17), 2665-2674.
Leal, E.C., Santiago, A.R., & Ambrosio, A.F. (2005). Old and new drug targets in diabetic
retinopathy: from biochemical changes to inflammation and neurodegeneration. Curr Drug
Targets CNS Neurol Disord, 4(4), 421-434.
Leat, S.J., Li, W., & Epp, K. (1999). Crowding in central and eccentric vision: the effects of
contour interaction and attention. Invest Ophthalmol Vis Sci, 40(2), 504-512.
Lecchi, M., Linderholm, P., Pelizzone, M., Picaud, S., Renaud, P., Salzmann, J., Sommerhalder,
J., Safran, A.B., & Bertrand, D. (2006). What physiology tells us about electrical stimulation
in retinal implants. Invest Ophthalmol Vis Sci, 47, 3195 (abstract).
Lecchi, M., Marguerat, A., Ionescu, A., Pelizzone, M., Renaud, P., Sommerhalder, J., Safran,
A.B., Tribollet, E., & Bertrand, D. (2004). Ganglion cells from chick retina display multiple
functional nAChR subtypes. Neuroreport, 15(2), 307-311.
Leeuwenberg, E. (2003). Miracles of perception. Acta Psychol (Amst), 114(3), 379-396.
Legge, G.E., Ahn, S.J., Klitz, T.S., & Luebker, A. (1997). Psychophysics of reading--XVI. The
visual span in normal and low vision. Vision Res, 37(14), 1999-2010.
Legge, G.E., Mansfield, J.S., & Chung, S.T. (2001). Psychophysics of reading. XX. Linking letter
recognition to reading speed in central and peripheral vision. Vision Res, 41(6), 725-743.
Legge, G.E., Parish, D.H., Luebker, A., & Wurm, L.H. (1990). Psychophysics of reading. XI.
Comparing color contrast and luminance contrast. J Opt Soc Am A, 7(10), 2002-2010.
Legge, G.E., Pelli, D.G., Rubin, G.S., & Schleske, M.M. (1985a). Psychophysics of reading I.
Normal vision. Vision Res, 25(2), 239-252.
Legge, G.E., Ross, J.A., Isenberg, L.M., & LaMay, J.M. (1992). Psychophysics of reading. Clinical
predictors of low-vision reading speed. Invest Ophthalmol Vis Sci, 33(3), 677-687.
Legge, G.E. & Rubin, G.S. (1986). Psychophysics of reading. IV. Wavelength effects in normal
and low vision. J Opt Soc Am A, 3(1), 40-51.
Legge, G.E., Rubin, G.S., & Luebker, A. (1987). Psychophysics of reading V. The role of contrast
in normal vision. Vision Res, 27(7), 1165-1177.
Legge, G.E., Rubin, G.S., Pelli, D.G., & Schleske, M.M. (1985b). Psychophysics of reading II. Low
vision. Vision Res, 25(2), 253-265.
Leng, T., Huie, P., Mehenti, N.Z., Peterman, M.C., Lee, C.J., Marmor, M.F., Sanislo, S.R., Bent,
S.F., Blumenkranz, M.S., & Fishman, H.A. (2002). Directed Ganglion Cell Growth and
Stimulation with Microcontact Printing as a Prototype Visual Prosthesis Interface. Invest
Ophthalmol Vis Sci, 43(12), 4454 (abstract).
Leonard, R. & Gordon, A.R. (2002). Statistics on Vision Impairment: A Resource Manual.
Lighthouse International, accessed 03/12/2003.
http://www.lighthouse.org/research_statistics.htm.
Levi, D.M., Klein, S.A., & Aitsebaomo, A.P. (1985). Vernier acuity, crowding and cortical
magnification. Vision Res, 25(7), 963-977.
Li, H.C., Brenner, E., Cornelissen, F.W., & Kim, E.S. (2002). Systematic distortion of perceived
2D shape during smooth pursuit eye movements. Vision Res, 42(23), 2569-2575.
Li, L., Nugent, A.K., & Peli, E. (2001). Recognition of jagged (pixelated) letters in the periphery.
Vis Impairment Res, 2, 143-154.
Liu, W., Sivaprakasam, M., Singh, P.R., Bashirullah, R., & Wang, G. (2003). Electronic visual
prosthesis. Artif Organs, 27(11), 986-995.
Livingstone, M. & Hubel, D. (1988). Segregation of form, color, movement, and depth: anatomy,
physiology, and perception. Science, 240(4853), 740-749.
Loftus, A., Murphy, S., McKenna, I., & Mon-Williams, M. (2004). Reduced fields of view are
neither necessary nor sufficient for distance underestimation but reduce precision and may
cause calibration problems. Exp Brain Res, 158(3), 328-335.
Loftus, G.R. & Mackworth, N.H. (1978). Cognitive determinants of fixation location during
picture viewing. J Exp Psychol Hum Percept Perform, 4(4), 565-572.
Loizou, P.C., Dorman, M., & Tu, Z. (1999). On the number of channels needed to understand
speech. J Acoust Soc Am, 106(4 Pt 1), 2097-2103.
Löwenstein, K. & Borchardt, M. (1918). Dtsch. Dtsch Z Nervenheilkd, 58, 264-292.
Magne, P. & Coello, Y. (2002). Retinal and extra-retinal contribution to position coding. Behav
Brain Res, 136(1), 277-287.
Majji, A.B., Humayun, M.S., Weiland, J.D., Suzuki, S., D'Anna, S.A., & de Juan Jr., E. (1999).
Long-term histological and electrophysiological results of an inactive epiretinal electrode
array implantation in dogs. Invest Ophthalmol Vis Sci, 40(9), 2073-2081.
Margalit, E., Maia, M., Weiland, J., Greenberg, R., Fujii, G., Torres, G., Piyathaisere, D., O'Hearn,
T., Liu, W., Lazzi, G., Dagnelie, G., Scribner, D., de Juan, E., & Humayun, M. (2002). Retinal
prosthesis for the blind. Surv Ophthalmol, 47(4), 335-356.
Margalit, E. & Sadda, S.R. (2003). Retinal and optic nerve diseases. Artif Organs, 27(11), 963-974.
Margrain, T.H. (1999). Minimising the impact of visual impairment. Low vision aids are a simple
way of alleviating impairment. BMJ, 318(7197), 1504.
Margrain, T.H. (2000). Helping blind and partially sighted people to read: the effectiveness of
low vision aids. Br J Ophthalmol, 84(8), 919-921.
Marron, J.A. & Bailey, I.L. (1982). Visual factors and orientation-mobility performance. Am J
Optom Physiol Opt, 59(5), 413-426.
Mason, C. & Kandel, E.R. (1991). Central Visual Pathways. In: E.R. Kandel, J.H. Schwartz, &
T.M. Jessel (Eds.), Principles of Neural Science (pp. 420-439). Norwalk: Appleton & Lange.
Maynard, E.M. (2001). Visual prostheses. Annu Rev Biomed Eng, 3, 145-168.
Maynard, E.M., Fernandez, E., & Normann, R.A. (2000). A technique to prevent dural adhesions
to chronically implanted microelectrode arrays. J Neurosci Methods, 97(2), 93-101.
Maynard, E.M., Hatsopoulos, N.G., Ojakangas, C.L., Acuna, B.D., Sanes, J.N., Normann, R.A., &
Donoghue, J.P. (1999). Neuronal interactions improve cortical population coding of
movement direction. J Neurosci, 19(18), 8083-8093.
Mazza, M., Renaud, P., Bertrand, D., & Ionescu, A. (2005). CMOS pixels for subretinal implant
prosthesis. IEEE Sensors Journal, 5(1), 32-37.
McClure, M.E., Hart, P.M., Jackson, A.J., Stevenson, M.R., & Chakravarthy, U. (2000). Macular
degeneration: do conventional measurements of impaired visual function equate with visual
disability? Br J Ophthalmol, 84(3), 244-250.
McFarland, T.J., Zhang, Y., Appukuttan, B., & Stout, J.T. (2004). Gene therapy for proliferative
ocular diseases. Expert Opin Biol Ther, 4(7), 1053-1058.
Medeiros, N.E. & Curcio, C.A. (2001). Preservation of ganglion cell layer neurons in age-related
macular degeneration. Invest Ophthalmol Vis Sci, 42(3), 795-803.
Mehenti, N.Z., Peterman, M.C., Leng, T., Marmor, M.F., Blumenkranz, M.S., Bent, S.F., &
Fishman, H.A. (2003). A Retinal Interface Based on Neurite Micropatterning for Single Cell
Stimulation. Invest Ophthalmol Vis Sci, 44(5), 5069 (abstract).
Merton, P.A. (1961). The accuracy of directing the eyes and the hand in the dark. J Physiol
(Paris), 156, 557-577.
Nelson, P., Aspinall, P., & O'Brien, C. (1999). Patients' perception of visual impairment in
glaucoma: a pilot study. Br J Ophthalmol, 83(5), 546-552.
Nelson, P., Aspinall, P., Papasouliotis, O., Worton, B., & O'Brien, C. (2003). Quality of life in
glaucoma and its relationship with visual function. J Glaucoma, 12(2), 139-150.
Nelson, W.W. & Loftus, G.R. (1980). The functional visual field during picture viewing. J Exp
Psychol [Hum Learn], 6(4), 391-399.
Nilsson, U.L. (1990). Visual rehabilitation with and without educational training in the use of
optical aids and residual vision. A prospective study of patients with advanced age-related
macular degeneration. Clin Vis Sci, 6, 3-10.
Nodine, C.F., Carmody, D.P., & Herman, E. (1979). Eye movements during visual search for
artistically embedded targets. Bulletin of the Psychonomic Society, 13, 371-374.
Normann, R.A., Maynard, E.M., Rousche, P.J., & Warren, D.J. (1999). A neural interface for a
cortical vision prosthesis. Vision Res, 39(15), 2577-2587.
Normann, R.A., Warren, D.J., Ammermuller, J., Fernandez, E., & Guillory, S. (2001).
High-resolution spatio-temporal mapping of visual pathways using multi-electrode arrays. Vision
Res, 41(10-11), 1261-1275.
Osterberg, G. (1935). Topography of the layer of rods and cones in the human retina. Acta
Ophthalmologica (Copenh), 6(Suppl.), 1-103.
Owsley, C., McGwin, G., Jr., Sloane, M.E., Stalvey, B.T., & Wells, J. (2001). Timed instrumental
activities of daily living tasks: relationship to visual function in older adults. Optom Vis Sci,
78(5), 350-359.
Oxford Reference Online (2004). The Concise Oxford English Dictionary. Oxford University Press;
Université de Genève, accessed 07/02/2006.
<http://www.oxfordreference.com/views/ENTRY.html?subview=Main&entry=t23.e42365>.
Paillard, J. (1982). The contribution of peripheral and central vision to visually guided reaching.
In: D.J. Ingle, M.A. Goodale, & J.W. Mansfield (Eds.), Analysis of Visual Behavior (pp.
367-385). Cambridge, MA: The MIT Press.
Palanker, D., Vankov, A., Huie, P., & Baccus, S. (2005). Design of a high-resolution
optoelectronic retinal prosthesis. J Neural Eng, 2(1), S105-S120.
Palanker, D.V., Vankov, A., Fishman, H.A., Blumenkranz, M.S., & Marmor, M.F. (2004). Physical
constraints on resolution of the electronic retinal prosthesis. Invest Ophthalmol Vis Sci,
45(5), 4209.
Palanker, D.V., Vankov, A., Huie, P., Fishman, H.A., Marmor, M.F., & Blumenkranz, M.S. (2003).
Can a Self-powered Retinal Prosthesis Support 100,000 Pixels in the Macula? Invest
Ophthalmol Vis Sci, 44(5), 5067 (abstract).
Pardue, M.T., Ball, S.L., Hetling, J.R., Chow, V.Y., Chow, A.Y., & Peachey, N.S. (2001). Visual
evoked potentials to infrared stimulation in normal cats and rats. Doc Ophthalmol, 103(2),
155-162.
Parker, R.E. (1978). Picture processing during recognition. J Exp Psychol Hum Percept Perform,
4(2), 284-293.
Patla, A.E. (1997). Understanding the roles of vision in the control of human locomotion. Gait &
Posture, 5(1), 54-69.
Peachey, N.S. & Chow, A.Y. (1999). Subretinal implantation of semiconductor-based
photodiodes: progress and challenges. J Rehabil Res Dev, 36(4), 371-376.
Peli, E. (1986). Control of eye movement with peripheral vision: implications for training of
eccentric viewing. Am J Optom Physiol Opt, 63(2), 113-118.
Peli, E. (2001). Vision multiplexing: an engineering approach to vision rehabilitation device
development. Optom Vis Sci, 78(5), 304-315.
Peli, E., Goldstein, R.B., Young, G.M., Trempe, C.L., & Buzney, S.M. (1991). Image
enhancement for the visually impaired. Simulations and experimental results. Invest
Ophthalmol Vis Sci, 32(8), 2337-2350.
Pelisson, D., Prablanc, C., Goodale, M.A., & Jeannerod, M. (1986). Visual control of reaching
movements without vision of the limb. II. Evidence of fast unconscious processes correcting
the trajectory of the hand to the final position of a double-step stimulus. Exp Brain Res,
62(2), 303-311.
Pelli, D.G. (1987). The visual requirements of mobility. In: G.C. Woo (Eds.), Low vision:
Principles and applications (pp. 134-146). New York: Springer Verlag.
Pelz, J.B. (1995). Visual Representations in a Natural Visuo-motor Task. PhD thesis, University
of Rochester, New York.
Pelz, J.B. & Canosa, R. (2001). Oculomotor behavior and perceptual strategies in complex tasks.
Vision Res, 41(25-26), 3587-3596.
Pelz, J.B., Hayhoe, M., & Loeber, R. (2001). The coordination of eye, head, and hand
movements in a natural task. Exp Brain Res, 139(3), 266-277.
Penfield, W. & Jasper, H. (1954). Epilepsy and the functional anatomy of the human brain.
London: Churchill.
Pérez Fornos, A., Sommerhalder, J., Chanderli, K., Pittard, A., Baumberger, B., Fluckiger, M.,
Safran, A.B., & Pelizzone, M. (2004). Minimum requirements for mobility in known
environments and perceptual learning of this task in eccentric vision. Invest Ophthalmol Vis
Sci, 45, 5445 (abstract).
Pérez Fornos, A., Sommerhalder, J., Pittard, A., Safran, A.B., & Pelizzone, M. (2005a). Minimum
requirements for visuomotor coordination and perceptual learning of such tasks in eccentric
vision. Invest Ophthalmol Vis Sci, 46, 1533 (abstract).
Pérez Fornos, A., Sommerhalder, J., Rappaz, B., Safran, A.B., & Pelizzone, M. (2005b).
Simulation of artificial vision, III: do the spatial or temporal characteristics of stimulus
pixelization really matter? Invest Ophthalmol Vis Sci, 46(10), 3906-3912.
Peyman, G., Chow, A.Y., Liang, C., Chow, V.Y., Perlman, J.I., & Peachey, N.S. (1998). Subretinal
semiconductor microphotodiode array. Ophthalmic Surg Lasers, 29(3), 234-241.
Pirozzolo, F.J. (1983). Eye movements and reading disability. In: K. Rayner (Eds.), Eye
movements in reading. Perceptual and language processes. (pp. 499-509). New York:
Academic Press.
Prablanc, C., Echallier, J.F., Komilis, E., & Jeannerod, M. (1979). Optimal response of eye and
hand motor systems in pointing at a visual target. I. Spatio-temporal characteristics of eye
and hand movements and their relationships when varying the amount of visual information.
Biol Cybern, 35(2), 113-124.
Prablanc, C., Pelisson, D., & Goodale, M.A. (1986). Visual control of reaching movements
without vision of the limb. I. Role of retinal feedback of target position in guiding the hand.
Exp Brain Res, 62(2), 293-302.
Previc, F.H. (1990). Functional specialization in the lower and upper visual fields in humans: its
ecological origins and neurophysiological implications. Behav Brain Sci, 13(3), 519-575.
Provencio, I., Rodriguez, I.R., Jiang, G., Hayes, W.P., Moreira, E.F., & Rollag, M.D. (2000). A
novel human opsin in the inner retina. J Neurosci, 20(2), 600-605.
Provencio, I., Rollag, M.D., & Castrucci, A.M. (2002). Photoreceptive net in the mammalian
retina. This mesh of cells may explain how some blind mice can still tell day from night.
Nature, 415(6871), 493.
Purdy, K.A., Lederman, S.J., & Klatzky, R.L. (1999). Manipulation with no or partial vision. J Exp
Psychol Hum Percept Perform, 25(3), 755-774.
Rauschecker, J.P. & Shannon, R.V. (2002). Sending sound to the brain. Science, 295(5557),
1025-1029.
Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research.
Psychol Bull, 124(3), 372-422.
Rayner, K. & Pollatsek, A. (1992). Eye movements and scene perception. Can J Psychol, 46(3),
342-376.
Reichle, E.D., Rayner, K., & Pollatsek, A. (2003). The E-Z reader model of eye-movement control
in reading: comparisons to other models. Behav Brain Sci, 26(4), 445-476.
Reppert, S.M. & Weaver, D.R. (2002). Coordination of circadian timing in mammals. Nature,
418(6901), 935-941.
Rieser, J.J., Hill, E.W., Talor, C.R., Bradfield, A., & Rosen, S. (1992). Visual experience, visual
field size, and the development of nonvisual sensitivity to the spatial structure of outdoor
neighborhoods explored by walking. J Exp Psychol Gen, 121(2), 210-221.
Rizzo, J.F. & Wyatt, J. (1997). Prospects for a visual prosthesis. Neuroscientist, 3(4), 251-262.
Rizzo, J.F., Wyatt, J., Loewenstein, J., Kelly, S., & Shire, D. (2003a). Methods and perceptual
thresholds for short-term electrical stimulation of human retina with microelectrode arrays.
Invest Ophthalmol Vis Sci, 44(12), 5355-5361.
Rizzo, J.F., Wyatt, J., Loewenstein, J., Kelly, S., & Shire, D. (2003b). Perceptual efficacy of
electrical stimulation of human retina with a microelectrode array during short-term surgical
trials. Invest Ophthalmol Vis Sci, 44(12), 5362-5369.
Rizzo, J.F., Wyatt, J.L., Loewenstein, J., Montezuma, S., Shire, D.B., Theogarajan, L., & Kelly,
S.K. (2004). Development of a wireless, ab-externo retinal prosthesis. Invest Ophthalmol Vis
Sci, 45(5), 3399.
Roll, R., Bard, C., & Paillard, J. (1986). Head orienting contributes to the directional accuracy of
aiming at distant targets. Hum Mov Sci, 5(4), 359 (abstract).
Rossetti, Y., Desmurget, M., & Prablanc, C. (1995). Vectorial coding of movement: vision,
proprioception, or both? J Neurophysiol, 74(1), 457-463.
Rossetti, Y., Stelmach, G., Desmurget, M., Prablanc, C., & Jeannerod, M. (1994). The effect of
viewing the static hand prior to movement onset on pointing kinematics and variability. Exp
Brain Res, 101(2), 323-330.
Rothwell, J.C., Traub, M.M., Day, B.L., Obeso, J.A., Thomas, P.K., & Marsden, C.D. (1982).
Manual motor performance in a deafferented man. Brain, 105(3), 515-542.
Rousche, P.J. & Normann, R.A. (1992). A method for pneumatically inserting an array of
penetrating electrodes into cortical tissue. Ann Biomed Eng, 20(4), 413-422.
Rousche, P.J. & Normann, R.A. (1998a). Chronic recording capability of the Utah Intracortical
Electrode Array in cat sensory cortex. J Neurosci Methods, 82(1), 1-15.
Rousche, P.J. & Normann, R.A. (1998b). Chronic recording capability of the Utah Intracortical
Electrode Array in cat sensory cortex. J Neurosci Methods, 82(1), 1-15.
Rousche, P.J. & Normann, R.A. (1999). Chronic intracortical microstimulation (ICMS) of cat
sensory cortex using the Utah Intracortical Electrode Array. IEEE Trans Rehabil Eng, 7(1),
56-68.
Rowland, L.P., Fink, M.E., & Rubin, L. (1991). Cerebrospinal fluid: blood-brain barrier, brain
edema, and hydrocephalus. In: E.R. Kandel, J.H. Schwartz, & T.M. Jessel (Eds.), Principles
of Neural Science (pp. 1050-1060). Norwalk: Appleton & Lange.
Rubin, G.S., Bandeen-Roche, K., Huang, G.H., Munoz, B., Schein, O.D., Fried, L.P., & West, S.K.
(2001). The association of multiple visual impairments with self-reported visual disability:
SEE project. Invest Ophthalmol Vis Sci, 42(1), 64-72.
Rubin, G.S. & Legge, G.E. (1989). Psychophysics of reading. VI--The role of contrast in low
vision. Vision Res, 29(1), 79-91.
Rutten, W.L. (2002). Selective electrical interfaces with the nervous system. Annu Rev Biomed
Eng, 4, 407-452.
Safadi, M.R., Washko, F., Lagman, A., Jaboro, C., Auner, G.W., Iezzi, R., McAllister, J.P., &
Abrams, G. (2003). Development of a Microfluidic Drug Delivery Neural Stimulating Device
for Vision. Invest Ophthalmol Vis Sci, 44(5), 5082 (abstract).
Santos, A., Humayun, M.S., de Juan Jr., E., Greenberg, R.J., Marsh, M.J., Klock, I.B., & Milam,
A.H. (1997). Preservation of the inner retina in retinitis pigmentosa. A morphometric
analysis. Arch Ophthalmol, 115(4), 511-515.
Schmidt, E.M., Bak, M.J., Hambrecht, F.T., Kufta, C.V., O'Rourke, D.K., & Vallabhanath, P.
(1996). Feasibility of a visual prosthesis for the blind based on intracortical microstimulation
of the visual cortex. Brain, 119(Pt 2), 507-522.
Schneider, G.E. (1969). Two visual systems. Science, 163(870), 895-902.
Schoups, A.A., Vogels, R., & Orban, G.A. (1995). Human perceptual learning in identifying the
oblique orientation: retinotopy, orientation specificity and monocularity. J Physiol, 483 ( Pt
3), 797-810.
Schwahn, H.N., Gekeler, F., Kohler, K., Kobuch, K., Sachs, H.G., Schulmeyer, F., Jakob, W.,
Gabel, V.P., & Zrenner, E. (2001). Studies on the feasibility of a subretinal visual prosthesis:
data from Yucatan micropig and rabbit. Graefes Arch Clin Exp Ophthalmol, 239(12), 961-997.
Servos, P. (2000). Distance estimation in the visual and visuomotor systems. Exp Brain Res,
130(1), 35-47.
Servos, P. & Goodale, M.A. (1994). Binocular vision and the on-line control of human
prehension. Exp Brain Res, 98(1), 119-127.
Shannon, R.V., Zeng, F.G., Kamath, V., Wygonski, J., & Ekelid, M. (1995). Speech recognition
with primarily temporal cues. Science, 270(5234), 303-304.
Sireteanu, R. & Rettenbach, R. (1995). Perceptual learning in visual search: fast, enduring, but
non-specific. Vision Res, 35(14), 2037-2043.
Sireteanu, R. & Rettenbach, R. (2000). Perceptual learning in visual search generalizes over
tasks, locations, and eyes. Vision Res, 40(21), 2925-2949.
Sivak, B. & Mackenzie, C.L. (1992). The contributions of peripheral vision and central vision to
prehension. In: L. Proteau & D. Elliot (Eds.), Vision and Motor Control (pp. 233-259).
Amsterdam: Elsevier Science Publishers.
Sjöstrand, J., Olsson, V., Popovic, Z., & Conradi, N. (1999a). Quantitative estimations of foveal
and extra-foveal retinal circuitry in humans. Vision Res, 39(18), 2987-2998.
Sjöstrand, J., Popovic, Z., Conradi, N., & Marshall, J. (1999b). Morphometric study of the
displacement of retinal ganglion cells subserving cones within the human fovea. Graefes
Arch Clin Exp Ophthalmol, 237(12), 1014-1023.
Slaughter, M. (1990). The Vertebrate Retina. In: K.N. Leibovic (Eds.), Science of Vision (pp.
53-83). New York: Springer-Verlag.
Smith, A.F. & Smith, J.G. (1996). The economic burden of global blindness: a price too high! Br
J Ophthalmol, 80(4), 276-277.
Sommerhalder, J., Oueghlani, E., Bagnoud, M., Leonards, U., Safran, A.B., & Pelizzone, M.
(2003). Simulation of artificial vision: I. Eccentric reading of isolated words, and perceptual
learning. Vision Res, 43(3), 269-283.
Sommerhalder, J., Rappaz, B., de Haller, R., Perez Fornos, A., Safran, A.B., & Pelizzone, M.
(2004). Simulation of artificial vision: II. Eccentric reading of full-page text and the learning
of this task. Vision Res, 44(14), 1693-1706.
Sparks, D.L. & Mays, L.E. (1983). Spatial localization of saccade targets. I. Compensation for
stimulation-induced perturbations in eye position. J Neurophysiol, 49(1), 45-63.
Stampe, D.M. (1993). Heuristic filtering and reliable calibration methods for video-based
pupil-tracking systems. Behav Res Methods Instrum Comput, 25(2), 137-142.
Stett, A., Barth, W., Weiss, S., Haemmerle, H., & Zrenner, E. (2000). Electrical multisite
stimulation of the isolated chicken retina. Vision Res, 40(13), 1785-1795.
Stetten, G. (2000). Vision system. In: J.D. Bronzino (Eds.), The Biomedical Engineering
Handbook (pp. 34-42). Boca Raton: CRC Press.
Stoffregen, T.A. (1985). Flow structure versus retinal location in the optical control of stance. J
Exp Psychol Hum Percept Perform, 11(5), 554-565.
Straw, L.B. & Harley, R.K. (1991). Assessment and training in orientation and mobility for older
persons: program development and testing. J Vis Impair Blind, 85(7), 291-296.
Strelow, E.R. (1985). What is needed for a theory of mobility: direct perception and cognitive
maps -- lessons from the blind. Psychol Rev, 92(2), 226-248.
Studebaker, G.A. (1985). A "rationalized" arcsine transform. J Speech Hear Res, 28(3), 455-462.
Suaning, G.J. & Lovell, N.H. (2001). CMOS neurostimulation ASIC with 100 channels, scaleable
output, and bidirectional radio-frequency telemetry. IEEE Trans Biomed Eng, 48(2), 248-260.
Szlyk, J.P., Seiple, W., Fishman, G.A., Alexander, K.R., Grover, S., & Mahler, C.L. (2001).
Perceived and actual performance of daily tasks: relationship to visual function tests in
individuals with retinitis pigmentosa. Ophthalmology, 108(1), 65-75.
Tant, M.L.M., Cornelissen, F.W., Kooijman, A.C., & Brouwer, W.H. (2002). Hemianopic visual
field defects elicit hemianopic scanning. Vision Res, 42(10), 1339-1348.
Taub, E., Goldberg, I.A., & Taub, P. (1975). Deafferentation in monkeys: pointing at a target
without visual feedback. Exp Neurol, 46(1), 178-186.
Thompson, R.W., Barnett, G.D., Humayun, M., & Dagnelie, G. (2000). Reading speed and facial
recognition using simulated prosthetic vision. Invest Ophthalmol Vis Sci, 41(suppl.), S860
(abstract).
Thompson, R.W., Barnett, G.D., Humayun, M.S., & Dagnelie, G. (2003). Facial recognition using
simulated prosthetic pixelized vision. Invest Ophthalmol Vis Sci, 44(11), 5035-5042.
Thornton, A.R. & Raffin, M.J. (1978). Speech-discrimination scores modeled as a binomial
variable. J Speech Hear Res, 21(3), 507-518.
Thornton, I.M. & Kourtzi, Z. (2002). A matching advantage for dynamic human faces.
Perception, 31(1), 113-132.
Thylefors, B. (1992). Epidemiological patterns of ocular trauma. Aust N Z J Ophthalmol, 20(2),
95-98.
Thylefors, B., Négrel, A.D., Pararajasegaram, R., & Dadzie, K.Y. (1995). Global data on
blindness. Bull World Health Organ, 73(1), 115-121.
Toet, A. & Levi, D.M. (1992). The two-dimensional shape of spatial interaction zones in the
parafovea. Vision Res, 32(7), 1349-1357.
Tong, Y.C., Blamey, P.J., Dowell, R.C., & Clark, G.M. (1983). Psychophysical studies evaluating
the feasibility of a speech processing strategy for a multiple-channel cochlear implant. J
Acoust Soc Am, 74(1), 73-80.
Turano, K. & Wang, X. (1992). Motion thresholds in retinitis pigmentosa. Invest Ophthalmol Vis
Sci, 33(8), 2411-2422.
Turano, K.A., Geruschat, D.R., Baker, F.H., Stahl, J.W., & Shapiro, M.D. (2001). Direction of
gaze while walking a simple route: persons with normal vision and persons with retinitis
pigmentosa. Optom Vis Sci, 78(9), 667-675.
Turano, K.A., Geruschat, D.R., Stahl, J.W., & Massof, R.W. (1999). Perceived visual ability for
independent mobility in persons with retinitis pigmentosa. Invest Ophthalmol Vis Sci, 40(5),
865-877.
Tychsen, L. (1992). Binocular vision. In: W. Hart (Eds.), Adler's physiology of the eye. St
Louis: Mosby.
Ungerleider, L.G. & Mishkin, M. (1982). Two cortical visual systems. In: D.J. Ingle, M.A.
Goodale, & J.S. Mansfield (Eds.), Analysis of Visual Behavior (pp. 549-586). London: The
MIT Press.
Urban, H. (1937). Zur physiologie der occipitalregion des menschen. Z Ges Neurol Psychiat,
257-261.
Ustun, T.B., Rehm, J., Chatterji, S., Saxena, S., Trotter, R., Room, R., & Bickenbach, J. (1999).
Multiple-informant ranking of the disabling effects of different health conditions in 14
countries. WHO/NIH Joint Project CAR Study Group. Lancet, 354(9173), 111-115.
Uttal, W.R., Baruch, T., & Allen, L. (1995a). Combining image degradations in a recognition
task. Percept Psychophys, 57(5), 682-691.
Uttal, W.R., Baruch, T., & Allen, L. (1995b). The effect of combinations of image degradations in
a discrimination task. Percept Psychophys, 57(5), 668-681.
Uttal, W.R., Baruch, T., & Allen, L. (1997). A parametric study of face recognition when image
degradations are combined. Spat Vis, 11(2), 179-204.
Van der Geest, J.N. & Frens, M.A. (2002). Recording eye movements with video-oculography
and scleral search coils: a direct comparison of two methods. J Neurosci Methods, 114(2),
185-195.
Van Essen, D.C., Anderson, C.H., & Felleman, D.J. (1992). Information processing in the primate
visual system: an integrated systems perspective. Science, 255(5043), 419-423.
Veraart, C., Grill, W.M., & Mortimer, J.T. (1993). Selective control of muscle activation with a
multipolar nerve cuff electrode. IEEE Trans Biomed Eng, 40(7), 640-653.
Veraart, C., Raftopoulos, C., Mortimer, J.T., Delbeke, J., Pins, D., Michaux, G., Vanlierde, A.,
Parrini, S., & Wanet-Defalque, M.C. (1998). Visual sensations produced by optic nerve
stimulation using an implanted self-sizing spiral cuff electrode. Brain Res, 813(1), 181-186.
Veraart, C., Wanet-Defalque, M.C., Gerard, B., Vanlierde, A., & Delbeke, J. (2003). Pattern
recognition with the optic nerve visual prosthesis. Artif Organs, 27(11), 996-1004.
Von Noorden, G.K. & Mackensen, G. (1962). Phenomenology of eccentric fixation. Am J
Ophthalmol, 53, 642-660.
Walraven, T.L., Buddi, R., Kapi, S., Abrams, G.W., & Iezzi, R. (2004). Chronic intravitreal
infusion of phototriggerable neurotransmitters for retinal prosthesis, in vivo. Invest
Ophthalmol Vis Sci, 45(5), 4211 (abstract).
Walraven, T.L., Iezzi, R., McAllister, J.P., Auner, G., Abrams, G., & Givens, R. (2003). The
effects of Dextromethorphan against the toxicity of photoactivated caged glutamate in vitro.
Invest Ophthalmol Vis Sci, 44(5), 5066 (abstract).
Walraven, T.L., Iezzi, R., McAllister, J.P., Auner, G., Givens, R., & Abrams, G. (2002).
Biocompatibility of a Neurotransmitter Based Retinal and Cortical Visual Prosthesis. Invest
Ophthalmol Vis Sci, 43(12), 4453 (abstract).
Warren, D.J. & Normann, R.A. (2000). Visual neuroprostheses. In: W.E. Finn & P.G. LoPresti
(Eds.), Handbook of Neuroprosthetic Methods (pp. 11.1-11.45). Boca Raton: CRC Press.
Warren, W.H. (1995). Self-motion: Visual perception and visual control. In: W. Epstein & S.J.
Rogers (Eds.), Perception of space and motion (pp. 263-325). London: Academic Press.
Warren, W.H. & Kurtz, K.J. (1992). The role of central and peripheral vision in perceiving the
direction of self-motion. Percept Psychophys, 51(5), 443-454.
Wässle, H., Grünert, U., Röhrenbeck, J., & Boycott, B.B. (1989). Cortical magnification factor
and the ganglion cell density of the primate retina. Nature, 341(6243), 643-646.
Watt, S.J., Bradshaw, M.F., & Rushton, S.K. (2000). Field of view affects reaching, not grasping.
Exp Brain Res, 135(3), 411-416.
Weih, L., McCarty, C.A., & Taylor, H.R. (2000). Functional implications of vision impairment. Clin
Experiment Ophthalmol, 28(3), 153-155.
Weiland, J.D. & Anderson, D.J. (2000). Chronic neural stimulation with thin-film, iridium oxide
electrodes. IEEE Trans Biomed Eng, 47(7), 911-918.
Weiland, J.D., Humayun, M.S., Dagnelie, G., de Juan Jr., E., Greenberg, R.J., & Iliff, N.T.
(1999). Understanding the origin of visual percepts elicited by electrical stimulation of the
human retina. Graefes Arch Clin Exp Ophthalmol, 237(12), 1007-1013.
Weiland, J.D., Liu, W., & Humayun, M.S. (2005). Retinal prosthesis. Annu Rev Biomed Eng, 7,
361-401.
Wensveen, J.M., Bedell, H.E., & Loshin, D.S. (1995). Reading rates with artificial central
scotomata with and without spatial remapping of print. Optom Vis Sci, 72(2), 100-114.
West, S.K., Rubin, G.S., Broman, A.T., Munoz, B., Bandeen-Roche, K., & Turano, K. (2002). How
does visual impairment affect performance on tasks of everyday life? The SEE Project.
Salisbury Eye Evaluation. Arch Ophthalmol, 120(6), 774-780.
Westheimer, G. (2001). Is peripheral visual acuity susceptible to perceptual learning in the
adult? Vision Res, 41(1), 47-52.
Whittaker, S.G. & Cummings, R.W. (1990). Foveating saccades. Vision Res, 30(9), 1363-1366.
Whittaker, S.G., Cummings, R.W., & Swieson, L.R. (1991). Saccade control without a fovea.
Vision Res, 31(12), 2209-2218.
Whittaker, S.G. & Lovie-Kitchin, J. (1993). Visual requirements for reading. Optom Vis Sci, 70(1),
54-65.
WHO (1997). Blindness. World Health Organization, accessed 03/12/2003.
http://www.who.int/health_topics/blindness/en/.
WHO (2000). Global Initiative for the Elimination of Avoidable Blindness. World Health
Organization, accessed 03/12/2003. http://www.who.int/inf-fs/en/fact213.html.
WHO (2002). Childhood blindness prevention project launched. Bull World Health Organ, 80(8),
688-688.
Wolffsohn, J.S. & Cochrane, A.L. (1998). Low vision perspectives on glaucoma. Clin Exp Optom,
81(6), 280-289.
Wu, P., Mehenti, N.Z., Leng, T., Marmor, M.F., Blumenkranz, M.S., & Fishman, H.A. (2003). Cell
Demographics from Full Thickness Retinal Explant Growth on Micropatterned Surfaces.
Invest Ophthalmol Vis Sci, 44(5), 5008 (abstract).
Wyatt, J. & Rizzo, J. (1996). Ocular implants for the blind. IEEE Spectrum, 33(5), 47-53.
Xing, J. & Heeger, D.J. (2000). Center-surround interactions in foveal and peripheral vision.
Vision Res, 40(22), 3065-3072.
Yagi, T., Kameda, S., Hayashida, Y., & Li, L. (1999). An artificial retina with adaptive
mechanisms and its application to retinal prosthesis. Proc '99 Conf Syst Man Cybern, 4,
418-423.
Yamauchi, Y., Enzmann, V., Franco, L.M., Jackson, D., Naber, J., Rizzo, J.F., Ziv, O.R., & Kaplan,
H.J. (2004). The retinal prosthesis - the stimulation threshold is lower with a subretinal
microelectrode array. Invest Ophthalmol Vis Sci, 45(5), 4222.
Yanai, D., Weiland, J.D., Mahadevappa, M., Fujii, G.Y., de Juan Jr., E., Greenberg, R.J.,
Williamson, R., Cimmarusti, V., & Humayun, M.S. (2003). Visual Perception in Blind Subjects
with Microelectronic Retinal Prosthesis. Invest Ophthalmol Vis Sci, 44(5), 5056 (abstract).
Zeevi, Y.Y. & Peli, E. (1979). Latency of peripheral saccades. J Opt Soc Am, 69(9), 1274-1279.
Ziegler, D. (2002). Characterization and improvement of an oscillating-pixel-circuit prototype for
an artificial retina implant. EPFL, Lausanne.
Ziegler, D., Linderholm, P., Mazza, M., Ferazzutti, S., Bertrand, D., Ionescu, A.M., & Renaud, P.
(2004). An active microphotodiode array of oscillating pixels for retinal stimulation. Sens
Actuators A Phys, 110(1-3), 11-17.
Zihl, J. (2000). Rehabilitation of visual disorders after brain injury. Hove, East Sussex, UK:
Psychology Press Ltd.
Zrenner, E. (2002a). The subretinal implant: can microphotodiode arrays replace degenerated
retinal photoreceptors to restore vision? Ophthalmologica, 216(1 (suppl.)), 8-20.
Zrenner, E. (2002b). Will retinal implants restore vision? Science, 295(5557), 1022-1025.
Zrenner, E., Miliczek, K.D., Gabel, V.P., Graf, H.G., Guenther, E., Haemmerle, H., Hoefflinger, B.,
Kohler, K., Nisch, W., Schubert, M., Stett, A., & Weiss, S. (1997). The development of
subretinal microphotodiodes for replacement of degenerated photoreceptors. Ophthalmic
Res, 29(5), 269-280.
Zrenner, E., Stett, A., Weiss, S., Aramant, R.B., Guenther, E., Kohler, K., Miliczek, K.D., Seiler,
M.J., & Haemmerle, H. (1999). Can subretinal microphotodiodes successfully replace
degenerated photoreceptors? Vision Res, 39(15), 2555-2567.
Appendix A
EyeLink I® System Specifications
Tracking Mode
Mode: Pupil-only
Sample Rate: 250 Hz
Average Delay (filter off): 6 ms
Average Delay (filter on): 10 ms
Noise (RMS): < 0.01°
Stability: Affected by headband slip and vibration
Operational/Functional Specifications
Image Processing: Hybrid analog-digital
Pupil Tracking: Hyperacuity
Corneal Reflection Tracking: None
Sampling Rate: 250 Hz
Average Data Transit Delay: Filter off = 6 ms; filter on = 10 ms
Resolution (gaze): Noise limited to < 0.01°
Velocity Noise: < 3°/s
Gaze Position Accuracy: < 0.5° average
Pupil Size Resolution: 0.1% of diameter
Pupil Size Noise: < 0.01 mm
Heuristic Filtering: Nearest-neighbor heuristic filter
Eye Tracking Range: ±30° horizontal, ±20° vertical (pupil only)
Gaze Tracking Range: ±20° horizontal, ±18° vertical
Head Tracking Range: 40-140 cm (standard setup), ~300 cm (special markers)
Head Rotation Compensation Range: ±15° for best accuracy, ±30° conditional on display location
Built-in Calibration, Validation: Calibration and validation using Pupil-only
Operating Environment: Requires IR-free environment, physical stability
Subject Compatibility: Most eyeglasses and contact lenses
Data File: EDF, direct to disk
EDF File and Link Data Types: Eye position, HREF position, gaze position, pupil size, buttons, messages
On-line Eye Movement Analysis: Saccades, fixations, blinks, fixation updates
Real-Time: Gaze cursor during recording and validation, eye position cursor during calibration, camera images
Physical Specifications
Image Processing Card: Full-length ISA (13.5” / 343 mm)
Headband: Leather-padded, height and size adjustments
Headband Weight: ~420 g
Headband Cable Length: 5 m
Eye Camera Distance: 40 to 80 mm
Binocular Tracking: Standard
Eye Illumination: 925 nm IR, IEC-825 Class 1, < 1.2 mW/cm²
Display Markers: 925 nm IR, IEC-825 Class 1
Ethernet Link: TCP/IP or raw, 10BASE-2 or 10BASE-T, external card with packet driver
Response Box Support: Digital
Analog Output: Optional ISA or PCI card
Digital Control: Configurable
Display Operating System API: MS-DOS, Macintosh, Windows
Appendix B
3M MicroTouch™ Touch Screen Specifications
General
Part Number: 13-8051-03
Sensor Technology: ClearTek Capacitive Profile
Touch Screen Controller: USB EXII 5000UC (P/N: 14-205)
Electrical Specifications
Input Method: Finger. TouchPen available with qualified sensor, attachments and electronics
Accuracy and Precision Area: Reported touch coordinates are within 1.0% of true position (based on viewing window dimensions) when linearized and used in conjunction with 3M Touch Systems Electronics
Touch Screen Resolution•: 16k x 16k
Optical Specifications
Optical Clarity: Up to 88% light transmission at 550 nm; dependent on specific surface finish chosen. Equipment used: BYK Gardner Haze Gard Plus
Surface Finish: Industrial etch
Coating: ClearTek◊
Mechanical Specifications
Diagonal Length: 19.71” / 500.63 mm
Glass Thickness: 0.125” (±0.01”) / 3.18 mm (±0.25 mm) typical
Radius Curve: 0
Outline Dimensions: X: 15.85” / 402.59 mm; Y: 12.90” / 327.66 mm
• The maximum number of addressable coordinates generated by the controller.
◊ Protective glass overcoat that protects the sensor by resisting scratches and increasing durability.
Viewable (Active) Area: X: 15.31” / 388.87 mm; Y: 12.42” / 315.47 mm
Linearization: Factory linearization values are stored in the touch screen NOVRAM, attached controller or 2D bar-code
Touch Contact Requirement: 3 ms for finger input.
Surface and Scratch Hardness+: Cannot be scratched using any stylus with Mohs’ rating of less than 6.5. Exceeds severe abrasion test per MIL-C-675C. Withstands 10500 grams of force per Balance Beam Scrape Adhesion Mar Tester. MicroScratch tester with 10-µm radius tungsten carbide indenter takes a force of 1.8 Newtons
NEMA Rating: NEMA sealable
Gasketing: Complete water-resistant seal obtainable with polyethylene gasket
Cleaning: Water, isopropyl alcohol, and similar non-abrasive cleaners
Reliability
Endurance Test|: Over 225 million mechanical touches without noticeable degradation of the surface
Surface Obstructions: Operation unaffected by surface obstructions such as dirt, grease, dust, smoke, peanut butter, etc.
Chemical Resistance: ClearTek is highly resistant to corrosives, in accordance with ASTM-D-1308-87 (1993) and ASTMD-F-1598-95
Liquid Resistance: Liquids on screen do not impede touch screen performance
Liquid Repellence: Extremely water repellent (contact angle of 94° and greater measured using Sessile Drop Contact Angle Method)
Operating Temperature Range: -15°C to 70°C for touch screen
Storage Temperature: Between –50°C and 85°C (MIL-STD-810E)
+ Paul N. Gardner Co. model PA-2197 using a loop stylus (0.128” in O.D. Rockwell Hardness 55-61).
| Mechanical touch activation in single x,y location using a finger-like stylus of 45 durometer, shore “A” hardness, 0.5 inch diameter with a load of 0.46 pounds, ±0.01 pounds of force.
Appendix C
Logitech™ 3D Head Tracker Specifications
Functional Specifications
Available Modes: 2D and 3D
Tracking Speed: Up to 30 inch/s
Tracking Area: 1.52 m (5 ft), 100° cone
2D Mode Resolution: Position: 400 dpi
3D Mode Resolution: Position: 0.01 cm (0.004”); Orientation: 0.1°
Latency: 30 ms
Sampling Rate: 50 Hz
Accuracy: 2% of the distance between the transmitter and the receiver
Operational Specifications
Operating Temperature: 5°C – 35°C
Operating Relative Humidity: 10% - 90% (non-condensing)
Reliability: 20000 h (Electrical MTBF)
Audio Specifications
Ultrasonic Frequency: 23 kHz
Output Amplitude: 1 Vp-p, 600 Ω load
Bandwidth: 15 Hz – 5 kHz
Physical Specifications
Control Unit: Depth: 24 cm (9.5”); Height: 4.1 cm (1.625”); Width: 18.5 cm (7.25”)
Transmitter
Receiver
Appendix D
Analysis of Eye and Head Movements during the Experiments
on Visuomotor Coordination
Measurement of head movements
Head/trunk movements were recorded with a 3D head tracker (Logitech Inc.,
California, USA) attached to the back of the mobile setup. In simple terms, the 3D
head tracker consists of a stationary ultrasonic transmitter that tracks the position of
a mobile receiver (attached to the back of the mobile setup) as it moves within the
transmitter’s active area (1.52 m, 100° cone; see fig. D1a). The transmitter then
sends these signals to a control unit, which transforms this information into 3D
position (cm) and orientation (°) coordinates (see fig. D1b) and sends these data to
the subject PC at a 50 Hz sample rate. Refer to Appendix C for detailed specifications
of the 3D head tracker.
a)
b)
Figure D1. The head tracking system. (a) The 3D head tracker consists of a fixed ultrasonic
transmitter that tracks the position of a moving receiver within a 1.52 m active area (100°). The
transmitter sends the data to a control unit (CU), which in turn transforms these signals into 3D
coordinates of position and orientation and sends them to the Subject PC. Modified from the 3D
Mouse & Head Tracker technical reference manual by Logitech™. (b) Positive, three-dimensional
direction of receiver movement (X, Y, and Z axes) and positive rotation about these axes (Pitch, Yaw
and Roll). Negative movement is in the direction opposite each arrow.
Analysis of eye and head movements during the acute experiments on
visuomotor coordination
The total area explored with eye movements on the screen and with head
movements on the working space was estimated by computing separately the
dispersion (standard deviation) of eye and head position (cm), along each axis (2D
for eye movements and 3D for head movements), across trials. Eye movement
behavior during visual search was characterized as average fixation duration (ms).
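The dispersion measure described above can be sketched as follows. This is only an illustrative sketch: the function names and the sample data are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not state which estimator was used or how values were pooled across trials.

```python
import math
import statistics


def axis_dispersion(positions_cm):
    """Return the standard deviation of position along each axis.

    positions_cm is a list of equal-length tuples, e.g. (x, y) for eye
    position or (x, y, z) for head position, all in centimetres.
    Assumption: the sample (n - 1) standard deviation is used.
    """
    axes = zip(*positions_cm)  # regroup the samples axis by axis
    return [statistics.stdev(axis) for axis in axes]


def mean_and_sem(values):
    """Return the mean and the standard error of the mean of the values."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical 2D eye positions (cm), for illustration only.
eye_positions = [(0.0, 0.0), (2.0, 1.0), (4.0, 2.0)]
dispersion = axis_dispersion(eye_positions)
```

The same function applies unchanged to 3D head positions, because the number of axes is taken from the data. The helper `mean_and_sem` mirrors the mean ±SEM values plotted across subjects in the figures of this appendix.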
Results for the chips task
a)
b)
Figure D2. Dispersion of eye position coordinates versus number of pixels contained in the 10°x7°
viewing window for 3 normal subjects performing the chips task. Three effective fields of view
projected in the 10°x7° viewing window are compared in central vision: 8.25°x5.8° (red plot),
16.5°x11.6° (blue plot), and 33°x23.1° (green plot). (a) Horizontal dispersion of eye position (cm)
±SEM. (b) Vertical dispersion of eye position (cm) ±SEM.
Figure D2 displays the mean dispersion of eye movements versus number of
pixels in the viewing window, for the chips task. Both horizontal and vertical eye
dispersion results were statistically equivalent across all pixelization levels for the
8.25° x 5.8° and the 16.5° x 11.6° fields of view. With the 33° x 23.1° field of view,
horizontal eye dispersion also remained unperturbed even at the lowest target
resolutions. Vertical eye dispersion, however, significantly (p < 0.05) increased at
498 pixels. This difference did not persist at target resolutions of 221 and 124 pixels.
The largest eye dispersion values were obtained with the 8.25° x 5.8° field of view
(about 2.7 cm horizontally and 1.7 cm vertically, equivalent to 6.3° x 4° on the
screen), and the smallest with the 33° x 23.1° field of view (around 1 cm horizontally
and 0.8 cm vertically, equivalent to 2.3° x 1.8° on the screen). The difference
between each consecutive field of view was about 0.9 cm horizontally (~ 2°) and
0.45 cm vertically (~ 1°). This was expected because, with the smallest
field of view, the area of the screen that could be explored with eye movements was
the broadest. These data also reveal that the subjects explored relatively narrow
areas of the image available on the screen.
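The paired values quoted in this section (for example, 2.7 cm and 6.3°) relate a distance on the screen to a visual angle. A minimal sketch of such a conversion is given below. The viewing distance is an assumption: it is not stated in this appendix, and the value of 24.5 cm was chosen only because it roughly reproduces the quoted pairs. The function name is hypothetical.

```python
import math


def cm_to_degrees(extent_cm, viewing_distance_cm):
    """Convert an on-screen extent (cm) into a visual angle (degrees)."""
    return math.degrees(math.atan(extent_cm / viewing_distance_cm))


# Assumed viewing distance (cm); not taken from the text.
ASSUMED_DISTANCE_CM = 24.5

horizontal_deg = cm_to_degrees(2.7, ASSUMED_DISTANCE_CM)
vertical_deg = cm_to_degrees(1.7, ASSUMED_DISTANCE_CM)
```

Under this assumption, 2.7 cm and 1.7 cm map to angles close to the 6.3° x 4° reported for the 8.25° x 5.8° field of view. The conversion actually used in the study may differ.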
Figure D3 plots the mean dispersion of the 3D (horizontal, vertical, and
transversal) coordinates of head position versus number of pixels in the viewing
window, for the chips task. At 17920 and 1991 pixels, head dispersion along the 3
axes remained stable and was statistically equivalent with the 3 fields of view. At
498 pixels and below, head dispersion systematically increased. The most dramatic
increases were observed for the 33° x 23.1° field of view, especially along the
transversal axis. This clearly indicates that, as less information was available in
the viewing window, subjects had to approach the working model to compensate for
the lack of resolution. This strategy also leads to larger horizontal and vertical head
movements to explore the working area, as observed in the results.
a)
b)
c)
Figure D3. Dispersion of head position coordinates versus number of pixels contained in the 10°x7°
viewing window for 3 normal subjects performing the chips task. Three effective visual fields projected
in the 10°x7° viewing window are compared in central vision: 8.25°x5.8° (red plot), 16.5°x11.6° (blue
plot), and 33°x23.1° (green plot). (a) Horizontal dispersion of head position (cm) ±SEM. (b) Vertical
dispersion of head position (cm) ±SEM. (c) Transversal dispersion of head position (cm) ±SEM.
Results for the LEDs task
Figure D4 displays the mean dispersion of eye movements versus number of
pixels in the viewing window, for the LEDs task. Results were similar to those
observed for the chips task. Both horizontal and vertical dispersion remained roughly
stable even at the lowest pixelization. Values differed markedly between
the fields of view. The largest dispersion was observed with the
8.25° x 5.8° field of view (around 5.9 cm horizontally and 4.6 cm vertically; equivalent to
13.8° x 10.7° on the screen), while the smallest eye dispersion was obtained with the
33° x 23.1° field of view (about 2.1 cm horizontally and 1.6 cm vertically; equivalent to 4.8° x 3.8°
on the screen). These differences were more pronounced than for the chips task.
a)
b)
Figure D4. Dispersion of eye position coordinates versus number of pixels contained in the 10°x7°
viewing window for 3 normal subjects performing the LEDs task. Three effective fields of view
projected in the 10°x7° viewing window are compared in central vision: 8.25°x5.8° (red plot),
16.5°x11.6° (blue plot), and 33°x23.1° (green plot). (a) Horizontal dispersion of eye position (cm)
±SEM. (b) Vertical dispersion of eye position (cm) ±SEM.
Eye dispersion values were approximately 2.4 x 2 cm² (~ 5.5° x 4.5°) larger with the
8.25° x 5.8° field of view than with its 16.5° x 11.6° counterpart. Differences
between the 16.5° x 11.6° and the 33° x 23.1° fields of view were approximately 1.5
cm horizontally (~ 3.4°) and 1 cm vertically (~ 2.4°). In addition, when comparing
these results with those obtained for the chips task, it appears that eye dispersion
was about 2 times broader for the LEDs task, in all the conditions investigated.
a)
b)
c)
Figure D5. Dispersion of head position coordinates versus number of pixels contained in the 10°x7°
viewing window for 3 normal subjects performing the LEDs task. Three effective fields of view
projected in the 10°x7° viewing window are compared in central vision: 8.25°x5.8° (red plot),
16.5°x11.6° (blue plot), and 33°x23.1° (green plot). (a) Horizontal dispersion of head position (cm)
±SEM. (b) Vertical dispersion of head position (cm) ±SEM. (c) Transversal dispersion of head position
(cm) ±SEM.
Figure D5 plots the mean dispersion of the 3D coordinates of head position versus
number of pixels in the viewing window, for the LEDs task. Interestingly, in this case
the lack of visual information (i.e. fewer pixels, larger fields of view) did not seem to
have any effect on the head movement strategy used to accomplish the task: head
dispersion values remained at the same level across all pixelizations investigated and
similar results were obtained with the 3 fields of view.
Analysis of fixations
Figure D6a displays average fixation duration versus number of pixels for the chips
task. Results for the 8.25° x 5.8° field of view remained stable around 400 ms down
to 498 pixels. Fixation duration then increased significantly at 221 (p = 0.04) and
124 pixels (p = 0.02). For the 16.5° x 11.6° and 33° x 23.1° fields of view, average
fixation duration remained stable around 400 ms across all pixelization levels
investigated.
Figure D6b displays average fixation duration versus number of pixels for the
LEDs task. Results appeared to be mainly influenced by the effective field of view,
but not by pixelization level. The longest fixation durations were observed with the 33° x
23.1° field of view, while the shortest were observed with the 8.25° x 5.8° field of
view. For all fields of view, average fixation duration remained stable down to a
target resolution of 498 pixels. Below this pixelization level, a slight tendency towards
decreasing fixation durations was observed. Interestingly, mean duration of fixations
was longer for the LEDs task than for the chips task.
a)
b)
Figure D6. Average fixation duration (ms) ±SEM versus number of pixels contained in the 10°x7°
viewing window for 3 normal subjects performing: a) the chips task, and b) the LEDs task. Three
effective fields of view projected in the 10°x7° viewing window are compared in central vision:
8.25°x5.8° (red plot), 16.5°x11.6° (blue plot), and 33°x23.1° (green plot).
Summary of these results
Eye and head dispersion results demonstrate that the strategy for visual search
was significantly influenced by the viewing conditions used when performing both
visuomotor tasks. When small fields of view were available, subjects essentially
explored the environment using eye movements, and their search strategy remained
almost the same even when stimulus images were presented at low pixel resolutions.
When large fields of view were used, subjects made very few eye and head
movements at high pixelizations, since large portions of the environment were
available at a glance. Potential targets were, thus, identified more easily. However, as
fewer pixels became available to perform the tasks, subjects had to approach the
working area (i.e. with large transversal head movements) to compensate for the
lack of resolution (information). This was particularly visible for the chips task.
Interestingly, on average subjects presented larger eye dispersions for the LEDs task
than for the chips task. Conversely, dispersion of head movements was broader
during the chips task, especially along the transversal axis and at lower
pixelizations. In general, the LEDs task required lengthier fixations, probably due to
the precision requirements of the task. For this task, fixations appeared to be
sensitive to the size of the effective field of view projected inside of the viewing
window, but not to pixelization level.
Analysis of eye movements during the habituation experiments on
visuomotor coordination
Since oculomotor adaptation to eccentric viewing has already been described in
detail for the more complex reading task, adaptation of eye movements to the
eccentric viewing conditions in this case was assessed only by calculating the
cumulative distance of the subjects’ gaze on the screen, for each experimental
session.
In the reading experiments presented in Chapter 3, we observed impressive
oculomotor adaptation mechanisms. After almost 2 months of training, subjects were
able to suppress reflexive foveating saccades and to recalibrate eccentric non-foveating
saccades to achieve adequate page navigation. Since the visual
requirements of visuomotor coordination are quite different from those of the reading
task, we decided to roughly explore how subjects adapted their eye movements in
this case.
Samples of eye movement recordings obtained during the last experimental
session in central vision, for the chips task, are presented in figure D7. This figure
reveals that the 3 subjects used different strategies for exploring the environment.
Subject AP explored a relatively small area of the available image with eye
movements, especially in the vertical plane. Subject AW appeared to explore only the
right half of the screen, while using the whole vertical axis of the available image.
The area explored by subject MV was the broadest, including most of the available
screen surface.
AP
AW
MV
Figure D7. Eye movements recorded for the 3 normal subjects while performing the chips task in
central vision (last session). The solid line represents the trajectory of the center of the viewing
window relative to the presented image. The panels on the top and right represent frequency
histograms of the horizontal and vertical coordinates of eye position on the screen, recorded every 4
ms. Gray bars indicate the position of the area of the screen that could be explored with eye
movements. Note that subjects could also explore the environment with head/trunk movements thus
modifying the actual image being explored with gaze; therefore, the image of the chips panel shown
is only representative of the initial view of the working area.
Figure D8 displays recordings of eye movements during several successive
experimental sessions (1st, 5th, 15th, and last) for each subject while performing the
chips task in eccentric vision. In the beginning, the pattern of eye movements was
quite broadly distributed along the vertical axis due to the presence of numerous
vertical (foveating) saccades. Fewer vertical foveating saccades could be observed as
training progressed, and frequency histograms of the vertical coordinates became
narrower. In the last experimental session, vertical coordinates of eye movements
covered only about half of the available exploration area, with maximum frequency
peaks at approximately 330, 250, and 280 pixels in subjects AP, AW, and MV,
respectively. This reveals that subjects had the tendency to place the center of the
viewing window in the lower part of the stimulation image, probably minimizing this
way the eccentricity of the relevant part of the target image. A similar strategy was
observed for the reading task. In that case, subjects tended to place the center of
the viewing window on the lower part of the lines they were reading (see fig. 48).
Interestingly, in all 3 subjects, some vertical foveating saccades could still be
observed during the last experimental sessions.
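The frequency peaks mentioned above are read from histograms of the vertical eye position samples. A minimal sketch of locating such a peak is given below. The bin width, the function name, and the sample values are hypothetical, and the binning used for figure D8 is not specified in the text.

```python
import math
from collections import Counter


def histogram_peak(samples_px, bin_width_px=10):
    """Return the centre (pixels) of the most populated histogram bin.

    samples_px: eye position samples along one axis, in screen pixels.
    bin_width_px: assumed bin width; not taken from the text.
    """
    if not samples_px:
        raise ValueError("at least one sample is required")
    # Count how many samples fall into each bin.
    counts = Counter(math.floor(s / bin_width_px) for s in samples_px)
    # On ties, the bin encountered first wins.
    peak_bin = max(counts, key=counts.get)
    return (peak_bin + 0.5) * bin_width_px


# Hypothetical vertical coordinates (pixels), for illustration only.
vertical_px = [322, 325, 331, 334, 338, 250, 410]
peak = histogram_peak(vertical_px)
```

With real recordings, one sample would be available every 4 ms, so each histogram count is proportional to the time the gaze spent in that bin.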
When recordings of eye position on the screen during the first central viewing
experiment are compared to those obtained during the last experimental sessions in
eccentric vision, it can be noted that more eye movements were necessary in the
latter condition. Moreover, a smaller vertical portion of the screen was explored in the
eccentric viewing condition.
AP
AW
MV
1st
5th
15th
Last
Figure D8. Eye movements recorded for the 3 normal subjects while performing the chips task
during a selection of experimental sessions (1st, 5th, 15th, and last). The solid line represents the
trajectory of the center of the viewing window relative to the presented image. The panels on the
top and right represent frequency histograms of the horizontal and vertical coordinates of eye
position on the screen recorded every 4 ms. Gray bars indicate the position of the area of the screen
that could be explored with eye movements. Note that subjects could also explore the environment
with head/trunk movements thus modifying the actual image being explored with gaze; therefore,
the image of the chips panel shown is only representative of the initial view of the working area.
To quantify the changes observed in oculomotor behavior with training, we
calculated the length of the path described by each subject’s eye movements along
each one of the 2D (horizontal and vertical) axes, for each task. Results for the chips
task are presented in figure D9, normalized by the score of correctly placed chips.
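The path-length measure described above can be sketched as the sum of absolute sample-to-sample displacements along one axis, divided by the number of correctly placed chips. This is a sketch under assumptions: the function names and data are hypothetical, and any filtering or blink removal applied before summation in the actual analysis is not described here.

```python
def axis_path_length(coords_m):
    """Sum the absolute displacements between successive samples (metres)."""
    return sum(abs(b - a) for a, b in zip(coords_m, coords_m[1:]))


def path_per_item(coords_m, n_items):
    """Normalize the path length by a count, e.g. correctly placed chips."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return axis_path_length(coords_m) / n_items


# Hypothetical vertical eye coordinates (m) and chip score, for illustration.
vertical_m = [0.0, 0.5, 0.25, 1.0]
per_chip = path_per_item(vertical_m, n_items=3)
```

Each axis is processed separately, so the horizontal and vertical trajectories are obtained by calling the function once per coordinate series.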
Significant learning effects were observed in the analysis of vertical eye
trajectories versus time (Pearson’s correlation: r = 0.48, p < 0.05 for AP; r = 0.70, p
< 0.001 for AW; and r = 0.76, p < 0.001 for MV). In the first sessions, distances
along the vertical axis were of about 1.5 m/chip for subjects AP and MV, and
approximately 1 m/chip in the case of subject AW. During the final sessions, vertical
trajectories decreased to approximately 0.5 m/chip for all 3 subjects. Comparison of
these results with values obtained during the last sessions in central vision (0.3 to
0.65 m/chip; dashed lines in fig. D9a) indicates that vertical oculomotor adaptation
was different for each subject. For subject AP, vertical trajectories at the end of the
experiment in eccentric vision were about 2 times longer than those obtained in
central vision. In the case of subject AW, final eccentric vertical trajectories were
approximately 50% shorter than those obtained during the last sessions in central
vision. Final vertical trajectories in central and eccentric vision for subject MV were
equivalent.
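The learning effects above are reported as Pearson correlations between trajectory length and time. A minimal sketch of the coefficient is given below, using hypothetical session data. The sign convention and any transformation behind the reported r values are not stated in this appendix. In this sketch, a series that decreases with session number yields a negative r. The p values would require an additional significance test, which is not shown.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient between two sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical session numbers and vertical trajectories (m/chip).
sessions = [1, 2, 3, 4, 5]
trajectories = [1.5, 1.2, 1.0, 0.7, 0.5]
r = pearson_r(sessions, trajectories)
```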
a)
b)
Figure D9. Mean cumulative length of the total trajectory described by each subject’s eye
movements per correctly placed chip, versus session number. Distances along each one of the 2D
axes were calculated separately for: (a) the vertical coordinate, and (b) the horizontal coordinate of
eye position on the screen. The solid lines indicate the best fits to the data. The dashed lines indicate
average values for the last 3 sessions in central vision.
Significant learning effects were also observed in the analysis of horizontal
distances per chip for all 3 subjects (Pearson’s correlation: r = 0.44, p < 0.05 for AP;
r = 0.52, p < 0.01 for AW; and r = 0.67, p < 0.01 for MV). Initial values for subject
AP ranged from about 1.5 to 1 m/chip, and decreased to approximately 0.7 m/chip,
with results asymptoting after about 14 sessions. The decrease was less pronounced
in subject AW, who presented initial horizontal trajectories of 0.6 m/chip that
decreased to nearly 0.5 m/chip, and stabilized after around 30 sessions. For subject
MV, mean horizontal trajectories per chip were initially approximately 1 m/chip and
declined to about 0.5 m/chip in the last sessions. Horizontal trajectories for this
subject were still decreasing when the experiment ended. When comparing
horizontal trajectories described during the last sessions in eccentric vision to those
obtained during the last sessions in central vision (0.3 to 0.42 m/chip; dashed lines in
fig. D9b), it appears that, for subjects AW and MV, final horizontal trajectories were
of similar lengths in central and eccentric vision. For subject AP, horizontal
trajectories obtained during the last training sessions in central vision were
approximately 40% shorter than those observed at the end of the experiment in
eccentric vision.
a)
b)
Figure D10. Mean cumulative length of the total trajectory described by each subject’s eye
movements per pointed target, versus session number, for the LEDs task. Distances along each one of
the 2D axes were calculated separately for: (a) the vertical coordinate, and (b) the horizontal
coordinate of eye position on the screen. The solid lines indicate the best fits to the data. The dashed
lines indicate average values for the last 3 sessions in central vision.
Figure D10 presents the length of the path described by each subject’s eye
movements during the LEDs task, calculated separately for the horizontal (x) and
vertical (y) coordinate, and normalized by the total number of presented targets
(24). Results for both components of the eye movement trajectory were very similar.
Significant learning effects were observed for subjects AP (Pearson’s correlation: r =
0.53, p < 0.01 for the vertical trajectory and r = 0.5, p < 0.01 for the horizontal
trajectory) and MV (Pearson’s correlation: r = 0.53, p < 0.01 for the vertical
trajectory and r = 0.43, p < 0.05 for the horizontal trajectory). Results for these 2
subjects asymptoted within 4 to 6 sessions. Subject AP started the experiment with
vertical trajectories of about 2.2 m/target and horizontal trajectories around 1.4
m/target. With training, results for both coordinates declined to approximately 0.5
m/target. Initial results for subject MV were above 1 m/target and eventually
decreased to approximately 0.3 and 0.5 m/target for the vertical and horizontal
trajectories, respectively. Results for subject AW remained stable around 0.3
m/target throughout the experiment. For all 3 subjects, trajectories measured during
the first central vision experiments (dashed lines in fig. D10) and those obtained
during the last sessions in eccentric vision were similar.
Appendix E
Analysis of Eye Movements during the Experiments on Mobility
Analysis of eye movements during the habituation experiments on
mobility
We quantified the changes in the eye movement trajectory with training, by
calculating the length of the path described by each subject’s eye movements along
each one of the 2D (horizontal and vertical) axes. These results are presented in
figure E1. Highly statistically significant learning effects were observed in the analysis
of vertical eye trajectories versus time for subjects MS and KC (respectively,
Pearson’s correlation: r = 0.77, p < 0.0001 and r = 0.66, p < 0.0001). Results for
subject HB were very variable, and only showed a slight tendency to decrease with
time (Pearson’s correlation: r = 0.20, p = 0.21). For all subjects, vertical trajectories
were still decreasing when the experiment ended. The analysis of horizontal
trajectories revealed highly significant learning effects with time for all subjects
(Pearson’s correlation: r = 0.79, p < 0.0001 for HB; r = 0.81, p < 0.0001 for MS;
and r = 0.53, p < 0.001 for KC). Horizontal eye trajectories were still decreasing
when the experiment was terminated.
a)
b)
Figure E1. Mean cumulative length of the total trajectory described by each subject’s eye movements
during training for mobility, versus session number. Distances along each one of the 2D axes were
calculated separately for: a) the vertical coordinate, and b) the horizontal coordinate of eye position
on the screen. The solid lines indicate the best fits to the data. The dashed lines indicate average
values for the last 3 sessions in central vision.
Vertical eccentric trajectories for subject HB were very
variable throughout the experiment. However, during the last 10 training sessions his
vertical eye trajectories reached similar values to those observed during the last
sessions in central vision (dashed lines in fig. E1). For subject MS, final vertical eye
trajectories were similar in central and eccentric vision. For subject KC, final eccentric
vertical trajectories were shorter than in central vision. Interestingly, at the end of
training, horizontal eye trajectories were shorter in eccentric than in central vision for
all subjects.
Publications
1. Reprinted from VISION RES, 43(3), Sommerhalder, J., Oueghlani, E., Bagnoud,
M., Leonards, U., Safran, A.B., & Pelizzone, M., Simulation of artificial vision: I.
Eccentric reading of isolated words, and perceptual learning, pp. 269-283,
Copyright 2003, with permission from Elsevier.
2. Reprinted from VISION RES, 44(14), Sommerhalder, J., Rappaz, B., de Haller, R.,
Perez Fornos, A., Safran, A.B., & Pelizzone, M., Simulation of artificial vision: II.
Eccentric reading of full-page text and the learning of this task, pp. 1693-1706,
Copyright 2004, with permission from Elsevier.
3. Reprinted from INVEST OPHTHALMOL VIS SCI, 47(4), Perez Fornos, A.,
Sommerhalder, J., Rappaz, B., Pelizzone, M., & Safran, A.B., Processes involved in
oculomotor adaptation to eccentric reading, pp. 1439-1447, Copyright 2006, with
permission from the Association for Research in Vision and Ophthalmology.
4. Reprinted from INVEST OPHTHALMOL VIS SCI, 46(10), Pérez Fornos, A.,
Sommerhalder, J., Rappaz, B., Safran, A.B., & Pelizzone, M., Simulation of
artificial vision, III: do the spatial or temporal characteristics of stimulus
pixelization really matter?, pp. 3906-3912, Copyright 2005, with permission from
the Association for Research in Vision and Ophthalmology.
Vision Research 43 (2003) 269–283
www.elsevier.com/locate/visres
Simulation of artificial vision: I. Eccentric reading of isolated
words, and perceptual learning
Jörg Sommerhalder a,*, Evelyne Oueghlani a, Marc Bagnoud a, Ute Leonards b,
Avinoam B. Safran a, Marco Pelizzone a
a Ophthalmology Clinic, Geneva University Hospitals, 1211 Geneva 14, Switzerland
b Division of Neuropsychiatry, Geneva University Hospitals, 1211 Geneva 14, Switzerland
Received 19 December 2001; received in revised form 4 September 2002
Abstract
Simulations of artificial vision were performed to assess “minimum requirements for useful artificial vision”. Retinal prostheses
will be implanted at a fixed (and probably eccentric) location of the retina. To mimic this condition on normal observers, we
projected stimuli of various sizes and content on a defined stabilised area of the visual field. In experiment 1, we asked subjects to
read isolated 4-letter words presented at various degrees of pixelisation and at various eccentricities. Reading performance dropped
abruptly when the number of pixels was reduced below a certain threshold. For central reading, a viewing area containing about 300
pixels was necessary for close to perfect reading (>90% correctly read words). At eccentricities beyond 10°, close to perfect reading
was never achieved even if more than 300 pixels were used. A control experiment using isolated letter recognition in the same
conditions suggested that lower reading performance at high eccentricity was in part due to the “crowding effect”. In experiment 2,
we investigated whether the task of eccentric reading under such specific conditions could be improved by training. Two subjects,
naive to this task, were trained to read pixelised 4-letter words presented at 15° eccentricity. Reading performance of both subjects
increased impressively throughout the experiment. Low initial reading scores (range 6%–23% correct) improved impressively (range
64%–85% correct) after about one month of training (about 1 h/day). Control tests demonstrated that the learning process consisted
essentially in an adaptation to use an eccentric area of the retina for reading. These results indicate that functional retinal implants
consisting of more than 300 stimulation contacts will be needed. They might successfully restore some reading abilities in blind
patients, even if they have to be placed outside the foveal area. Reaching optimal performance may, however, require a significant
adaptation process.
© 2002 Elsevier Science Ltd. All rights reserved.
Keywords: Visual prosthesis; Simulation; Reading performance; Eccentric reading; Learning
1. Introduction
Over the last few years, several research groups have
initiated important projects aiming at the development
of visual prostheses for the blind (Chow & Chow, 1997;
Dobelle, 2000; Humayun & de Juan, 1998; Normann,
Maynard, Rousche, & Warren, 1999; Rizzo & Wyatt,
1997; Veraart et al., 1998; Zrenner et al., 1999). Increasing interest in this domain is essentially due to recent progress in micro-technology. One issue of major
importance, when considering the conception of a visual
prosthesis, is the determination of minimum requirements for useful artificial vision. We used simulations of
artificial vision with normal subjects to assess this issue.
Our simulations were designed to mimic artificial vision
produced by a retinal prosthesis, but some of the results
may also be of interest for prostheses located at other
levels of the visual pathways (e.g. stimulating the optic
nerve or the visual cortex).
* Corresponding author. Fax: +41-2238-28382. E-mail address: [email protected] (J. Sommerhalder).
In everyday life, common visual tasks can be divided
into three main classes: recognition of (small) shapes as
it is specifically required for reading, localisation of
objects in 3D familiar-scale environments and spatial
orientation including whole body mobility. All of them
have to be thoroughly studied to determine what is the
minimal visual information required to restore a useful
visual function. In this study, we focussed on reading.
0042-6989/03/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved.
PII: S0042-6989(02)00481-9
Understanding the fundamentals of reading has received a lot of attention. One of the main research
centres in this field is the laboratory for low vision research at the University of Minnesota. These authors
have systematically studied various aspects of reading
psychophysics in normal subjects and low vision patients. For normal subjects, Legge, Pelli, Rubin, and
Schleske (1985) reported that maximum reading rates
are achieved for characters subtending 0.3°–2° of visual
angle; that reading rate increases with field size, but only
up to 4 characters, independently of character size; that
reading rates increase with sample density, but only up
to a critical density which depends on character size,
when the text is matrix sampled or pixelised. Reading
was also found to be very tolerant to either luminance or
colour contrast reductions (Legge & Rubin, 1986;
Legge, Rubin, & Luebker, 1987; Legge, Parish, Luebker,
& Wurm, 1990). At very low (<10%) luminance contrast
however, reading speed drops due to prolonged fixation
times and to an increased number of saccades, presumably related to a reduced visual span (Legge, Ahn,
Klitz, & Luebker, 1997). When testing the effect of print
size on reading speed in normal peripheral vision, it was
found that the use of larger characters improves peripheral reading to some extent, up to a critical print size
(Chung, Mansfield, & Legge, 1998). But maximum
reading speed also decreased from about 808 words/min
for foveal vision to about 135 words/min for peripheral
vision at 20° eccentricity. 1 Thus print size was not
found to be the only factor limiting maximum reading speed in normal peripheral vision, contradicting
the scaling hypothesis (Latham & Whitaker, 1996; Toet
& Levi, 1992) which implies that peripheral word recognition can be made equal to that at the fovea by increasing print size.
In low-vision patients, reading is similar to normal
reading in several aspects (Legge, Rubin, Pelli, & Schleske, 1985; Legge et al., 1990; Legge et al., 1997; Rubin &
Legge, 1989), but difficult to predict on the basis of
routine clinical evaluations (Legge, Ross, Isenberg, & La
May, 1992). As a rule however, it can be stated that low-vision patients with central field defects achieve lower
reading rates than those with preserved central fields
(Legge et al., 1985; Rubin & Legge, 1989).
The studies quoted above (as well as many others)
have led to the identification of a series of important
parameters that are critical for reading in normal and
low vision subjects. To our knowledge, there is however
only a limited number of studies, which were specifically
oriented towards the development of visual prostheses. Cha, Horch, Normann, and Boman (1992) used a
pixelised vision system to simulate artificial vision in
normal subjects. Their results showed that a 25 × 25
array of pixels representing 4-letters of text projected on
a foveal visual field of 1.7° is sufficient to provide
reading rates near 170 words/min using scrolled text,
and near 100 words/min using fixed text. This investigation was conducted within the frame of a project,
which aimed at developing a cortical visual implant
(Normann et al., 1999). Another research group, developing a retinal implant to stimulate remaining retinal neurons in photoreceptor degenerative diseases
(Humayun et al., 1999), conducted experiments on the
properties of pixelised vision at the Johns Hopkins
University in Baltimore (Dagnelie, Thompson, Barnett,
& Zhang, 2000; Thompson, Barnett, Humayun, & Dagnelie, 2000). Reading speed and facial recognition were
measured by simulating prosthetic vision in the central
visual field using a head mounted video display. Subjects
used eye movements to scan the stimuli through a
pixelising grid. Several grid parameters were explored.
Results demonstrated reading speeds up to 100 words/
min, which dropped off (a) when the grid size covered
less than 4 letters, (b) when a grid density of less than 4
pixels per letter width was used, or (c) when more than
50% of the pixels were randomly turned off.
In all previous experiments attempting to mimic
conditions of artificial vision, eye movements could
be used to scan a target with the fovea. However, the
anatomo-physiology of the retina does not favour a
foveal location for such prostheses (see e.g. Sjöstrand,
Olsson, Popovic, & Conradi, 1999). Retinal implants are
primarily designed to stimulate neurones of the inner
retinal layers in cases of photoreceptor loss (e.g. retinitis
pigmentosa). Surviving bipolar and/or ganglion cells are
the targets for electrical stimulation. In the central
fovea, these neurons are not present. In the parafovea,
they are arranged in several superimposed layers, which
makes it difficult to activate them in predictable patterns. The best sites for retinotopic activation without
major distortion are located beyond the parafoveal region. Such eccentric locations as well as the fact that a
retinal implant will stimulate a fixed area of the retina
have apparently not yet been fully taken into consideration. The aim of the present research work was to
assess reading performance with a system projecting
stimuli onto defined, stabilised areas of the visual field
placed at various eccentricities. In experiment 1, we
studied the influence of stimulus pixelisation, stimulus
eccentricity and stimulus size on reading performance.
In experiment 2, we investigated whether the task of
eccentric reading under such specific conditions could be
improved by training.
1 Such high reading rates were achieved by using rapid serial visual
presentation (RSVP).
2. General methods
2.1. Subjects
All subjects were normal volunteers, recruited from
the staff of the Geneva University Eye Clinic. Age ranged from 25 to 47 years. All had normal or corrected-to-normal
visual acuity of 20/20 in the tested eye. All of
them were native French speakers or had perfect
knowledge of French.
All experiments were conducted according to the
ethical recommendations of the Declaration of Helsinki
and were approved by local ethical authorities.
Fig. 1. The SMI EyeLink system. Three cameras are attached to a headband. Two cameras are recording eye movements. One camera is recording
IR light points from each corner of the screen to monitor head position relative to the screen. On this basis, the EyeLink software calculates online
the gaze position in screen coordinates. In this example, a 4-letter word is projected on a stabilised retinal area located at 5° eccentricity in the lower
visual field. The horizontal line crossing the screen represents a fixation aid for the subject (used in experiment 1).
2.2. Experimental set-up
To simulate visual percepts produced by a retinal
implant, images were projected on a defined and stabilised area of the retina. Target image stabilisation in the
visual field was achieved by online compensation of the
gaze position on a fast computer display using a high
speed video based eye and head-tracking system, the
SMI EyeLink Gaze tracking system (SensoMotoric Instruments GmbH, Teltow/Berlin, Germany; see Fig. 1).
The experimental set-up consisted of two computers
and a headband mounted measuring unit. The “subject's
PC” (PIII-450 equipped with a Matrox G200 graphics
card) was used to generate the stimuli on a 22" ELSA
Ecomo 22H99 screen set to a resolution of 600 × 800
pixels at a refresh rate of 160 Hz. It was connected via
Ethernet to the “operator's PC”, a Compaq Deskpro EP
(Celeron-400), which contained the hardware and software to collect and compute the data from the three
head mounted cameras. Gaze position data in screen
coordinates were transmitted to the subject PC every 4
ms (250 Hz), and were available for further computing
with a time delay of less than 10 ms.
The system worked in the following way: gaze position was used to move the target stimuli (bitmap images)
on the stimulation screen according to the eye movements of the subject. Images could thus be steadily
projected onto a defined (central or eccentric) area of the
retina. A pilot study (Bagnoud, Sommerhalder, Pelizzone, & Safran, 2001) demonstrated that this experimental set-up made it possible to stabilise targets accurately in
the visual field by online compensation of the gaze position.
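The stabilisation principle described above can be sketched in a few lines. This is a minimal illustration, not the EyeLink software or the authors' code: the class and function names are hypothetical, the conversion of 20 screen pixels per degree is the value quoted for this set-up, and a screen coordinate system with y increasing downwards is assumed, so that a positive eccentricity places the viewing area in the lower visual field.

```python
from dataclasses import dataclass

# Assumption: 20 screen pixels per degree, as quoted for this set-up.
PIXELS_PER_DEGREE = 20.0


@dataclass
class ViewingArea:
    """Hypothetical description of a stabilised viewing area."""
    width_deg: float
    height_deg: float
    eccentricity_deg: float  # offset of the area's centre below the gaze point


def viewing_area_rect(gaze_x, gaze_y, area, ppd=PIXELS_PER_DEGREE):
    """Return (left, top, width, height) of the viewing area in screen pixels.

    gaze_x, gaze_y: current gaze position in screen coordinates (pixels),
    with y assumed to increase downwards.
    """
    centre_x = gaze_x
    centre_y = gaze_y + area.eccentricity_deg * ppd  # constant offset below gaze
    width = area.width_deg * ppd
    height = area.height_deg * ppd
    return (centre_x - width / 2, centre_y - height / 2, width, height)


# Called once per gaze sample, so the bitmap follows the eye.
small_area_at_15_deg = ViewingArea(width_deg=10, height_deg=3.5, eccentricity_deg=15)
rect = viewing_area_rect(400, 100, small_area_at_15_deg)
```

Redrawing the bitmap at this rectangle for every new gaze sample keeps the stimulus at a fixed position relative to the line of sight, which is what the stabilisation amounts to.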
2.3. Generation and presentation of the stimuli
Stimuli were presented in rectangular white areas
(viewing areas), which were filled with black 4-letter
words of common French language (including accented
characters and capital letters for proper names). We
used 4-letter words for our experiments because this was
considered to be the minimum letter-sequence allowing
close to normal reading speeds (Legge et al., 1985). 2 We
used the largest possible font size fitting within the
viewing area. We chose the proportionally spaced
Helvetica (i.e. Arial) font style because it is commonly
used for text printing, and has been shown to provide
good reading conditions to low vision subjects (Buultjens, Aitken, Ravenscroft, & Carey, 1999).
2 Although maximum reading speeds would be favored by viewing
areas containing a higher number of characters, as quoted by several
authors, the presentation of only 4 letters allowed the use of a large
character size favoring peripheral reading.
Fig. 2. Presentation of the 4-letter word “tape” illustrating the five
degrees of pixelisation used in experiment 1: (A) maximum screen resolution, (B) 875 pixels, (C) 286 pixels, (D) 140 pixels, (E) 83 pixels. The
pixel numbers indicate the total number of pixels in the viewing area.
The stimuli used for the experiments were pre-processed bitmap images, which had been pixelised 3 (mosaic pixelisation) to simulate the reduced information
content due to a limited number of parallel processing
channels in retinal prostheses. Fig. 2 shows an example
of one of our stimuli at different pixelisations. Images
were generated using Adobe Photoshop® 5.5 software.
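The effect of mosaic pixelisation can be illustrated with a short sketch. The stimuli in this study were prepared in Photoshop, whose exact filter implementation is not described here; the code below is a stand-in that replaces each square tile by its mean grey level, which is the property the text relies on. The function name and the tiny test image are hypothetical.

```python
def mosaic(image, block):
    """Replace each block x block tile of a grey-level image by its mean.

    image: list of rows, each a list of grey levels (0-255).
    block: tile edge length in screen pixels (integer factor).
    Tiles at the right and bottom edges may be smaller than block x block.
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, rows, block):
        for left in range(0, cols, block):
            row_range = range(top, min(top + block, rows))
            col_range = range(left, min(left + block, cols))
            tile = [image[r][c] for r in row_range for c in col_range]
            mean = sum(tile) / len(tile)
            for r in row_range:
                for c in col_range:
                    out[r][c] = mean
    return out


# Hypothetical 4 x 4 test image reduced to a 2 x 2 mosaic.
image = [
    [0, 255, 0, 0],
    [255, 0, 0, 0],
    [255, 255, 255, 255],
    [255, 255, 0, 0],
]
result = mosaic(image, 2)
```

Each output tile carries a single grey level, so the number of tiles is the number of independent values available to the reader.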
The subjects were comfortably seated facing the screen
at an eye-to-screen distance of 57 cm. At this distance,
the 30 cm × 40 cm surface of the screen subtends a visual
angle of 30° × 40°, 1° corresponding to 20 screen pixels at
the screen resolution of 600 × 800. The camera monitoring the eye was positioned so that the pupil was clearly
visible and well defined at any gaze position. At the beginning of each run the eye to screen distance was
checked, adjusted if needed, and a standard 9-point calibration of the eye-tracker was performed. Then, a block
of 50 words, randomly chosen among a library of 500
common French 4-letter words was presented. Subjects
were requested not to move during the run.
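The viewing geometry quoted above can be checked with a short calculation. This is a sketch under the stated set-up (57 cm viewing distance, 30 cm of screen height covered by 600 pixels); the function names are hypothetical. It shows that 1 cm on the screen subtends very nearly 1° near the line of sight, while the exact angle subtended by the full 30 cm is slightly below 30°, so that the figure of 20 pixels per degree is a small-angle approximation.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle subtended by an object centred on the line of sight."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def pixels_per_degree(screen_px, screen_cm, distance_cm):
    """Screen pixels covered by 1 degree around the line of sight."""
    px_per_cm = screen_px / screen_cm
    cm_per_degree = 2 * distance_cm * math.tan(math.radians(0.5))
    return px_per_cm * cm_per_degree


one_cm = visual_angle_deg(1, 57)          # close to 1 degree
full_height = visual_angle_deg(30, 57)    # about 29.5 degrees
ppd = pixels_per_degree(600, 30, 57)      # close to 20 pixels per degree
```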
Reference point for the stimulus eccentricity was the
centre of the viewing area. Eccentric stimuli were presented in the lower visual field (Fig. 3). This offered at
least two practical advantages: (1) the retinal eccentricity
of each letter varies less when a word is projected below
or above the fixation point than when projected to the
left or the right; and (2) the lower visual field is most
commonly used for eccentric reading (see e.g. Chung
et al., 1998). For each item of the run, the subject had to
say the word he/she recognised. The response (right or
wrong) was entered by the examiner into the operator
PC and stored for further analysis. After each single
word presentation, the calibration was checked for
possible drifts or artefactual movements, and slightly
corrected if needed, to ensure an exact control of the
target image position during the entire experiment.
3 Mosaic pixelisation (i.e. square pixels of uniform grey level) was
used. Such simple patterns were adequate to simulate the reduced
information content (e.g. finite quantisation) of the stimuli, but were
not intended to mimic the nature (e.g. profile, shape, colour, etc.) of
the perceptual pixels elicited by electrical activation of the retina.
Fig. 3. The stimulation screen seen by the subject. The viewing area, a
white surface with black text, was moving on the screen according to
the direction of gaze and with a constant offset (eccentricity). The
background of the remaining screen area was in a grey colour corresponding to the mean grey level of the target windows. The viewing
area subtended in this case a visual field of 20° × 7°.
[Figure 3 labels: window moving according to the direction of gaze; 22" monitor.]
2.4. Data analysis
Reading performance was determined as the percentage score of fully recognised words out of each 50-item block. Results expressed on such a proportional
percentage scale are, however, not suitable for statistical
analysis. It is well known that with proportional scales,
the variance depends on the mean (scores close to 0% or 100% vary less than scores close to 50%). In other words,
the data are not normally distributed around the mean
and scale values are not linear in relation to the test
variability. One can solve this problem by using an arcsine transformation. Studebaker (1985) proposed to use
so-called “rationalised arcsine units” (rau), producing
values that are numerically close to the original percentage range, while retaining all of the desirable properties
of the arcsine transform. For example, for a sample size
of 50 responses, 0% correct corresponds to −16.5 rau,
50% correct to 50 rau and 100% correct to 116.5 rau.
All data were statistically analysed using scores expressed in rau. On the right ordinates of the graphs,
however, and also for the description of the results,
values on the original percent-correct scale are indicated
for better clarity.
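The transform is not written out in the text. A minimal sketch, assuming the commonly cited form of Studebaker's rationalised arcsine transform, θ = arcsin√(X/(N+1)) + arcsin√((X+1)/(N+1)) and rau = (146/π)·θ − 23, where X is the number of correct responses out of N, reproduces the three example values given above for N = 50. The function name is hypothetical.

```python
import math


def rau(correct, n):
    """Rationalised arcsine units for `correct` right answers out of `n`.

    Assumed form of the transform (not printed in the text):
    theta = asin(sqrt(X / (N + 1))) + asin(sqrt((X + 1) / (N + 1)))
    rau   = (146 / pi) * theta - 23
    """
    theta = (math.asin(math.sqrt(correct / (n + 1)))
             + math.asin(math.sqrt((correct + 1) / (n + 1))))
    return (146.0 / math.pi) * theta - 23.0


# One 50-word block: 0, 25 and 50 correctly read words.
examples = [round(rau(x, 50), 1) for x in (0, 25, 50)]
```

Between roughly 20% and 80% correct the rau value stays close to the percentage itself, which is why the two scales can share one graph.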
3. Experiment 1
3.1. Specific methods
In experiment 1, reading performance was assessed as
a function of a series of variables, each being potentially
an important parameter of prosthetic vision. First, the
number of contacts in the prosthesis: five different degrees of pixelisation were tested, viewing areas of maximum screen resolution, 875, 286, 140 and 83 pixels (see
also Fig. 2). 4 Second, retinal placement of the prosthesis: five different eccentricities in the lower visual field
were tested, 0°, 5°, 10°, 15° and 20°. Third, size of the
prosthesis: two viewing areas were investigated. The
large area, subtending a visual field of 20° × 7°, allowed
the use of a print size greater than the critical print size
needed for optimal reading performance at 20° eccentricity (Chung et al., 1998). The height of a small letter 'x'
used on this large viewing area corresponded to a visual
angle of 3.6°. The small area, subtending 10° × 3.5°,
corresponded to a surface of approximately 3 mm × 1
mm on the retina, and was used to represent the surface
of a smaller, possibly more realistic retinal prosthesis that
would be manageable surgically. The height of a small
letter 'x' used on this small viewing area corresponded to
a visual angle of 1.8°. Note that using the same number of
pixels on both viewing areas implied that pixels were
larger (4 times in area) on the large viewing area.
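The pixel counts used in this experiment follow from the screen geometry, as the sketch below illustrates. It assumes 20 screen pixels per degree and mosaic block sizes of 8, 14, 20 and 26 screen pixels on the large area (4, 7, 10 and 13 on the small area). These block sizes are not stated in the text; they are inferred because they reproduce the reported totals of 875, 286, 140 and 83 pixels. The function name is hypothetical.

```python
PIXELS_PER_DEGREE = 20  # assumption taken from the set-up description


def mosaic_grid(width_deg, height_deg, block_px, ppd=PIXELS_PER_DEGREE):
    """Return (columns, rows, total) of mosaic pixels in a viewing area."""
    cols = width_deg * ppd / block_px
    rows = height_deg * ppd / block_px
    return cols, rows, cols * rows


# Inferred block sizes (hypothetical) for the two viewing areas.
large = [round(mosaic_grid(20, 7, b)[2]) for b in (1, 8, 14, 20, 26)]
small = [round(mosaic_grid(10, 3.5, b)[2]) for b in (1, 4, 7, 10, 13)]
```

With a block size of 1 the function gives the number of screen pixels at maximum resolution, which is four times larger on the large area than on the small one; at every other setting the two areas contain the same number of mosaic pixels, each twice as wide and twice as high on the large area.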
All tests were conducted monocularly on five normal
volunteers. Each subject performed one run (consisting
of a 50-word block) in each condition. Testing always
started at the lowest eccentricity, using maximum screen
resolution first, then successively coarser resolutions.
Then, the same procedure was repeated using the next
eccentricity. Possible global learning effects would
therefore favour greater performance at low pixel numbers and high eccentricities. Each word was presented
for 3 s. When eccentric stimulus presentation was
used, a fixation aid (consisting of a horizontal red filament crossing the screen) was installed to make it easier
for the inexperienced subject to keep the target on screen.
3.2. Reading performance on the larger viewing area (20° × 7°)
Mean reading performance versus pixel number for
various eccentricities on the larger viewing area is presented in Fig. 4a. In central vision, reading performance
of 4-letter words was almost perfect (i.e. higher than 90%
correct) for pixelisations down to 286 pixels. At lower
resolution, reading performance dropped abruptly. This
result indicates that approximately 300 pixels are necessary to transmit the relevant information under optimal conditions. In peripheral vision, maximum reading
performance decreased with increasing eccentricity. At
an eccentricity of 10°, almost perfect reading (better
than 90% correct) was still possible at high pixel resolutions. At eccentricities of 15° and 20°, almost perfect
reading was never achieved even at high resolutions.
Maximum reading performance was limited to values of
88% and 63% correctly read words, respectively.
4 Mosaic pixelisation in Adobe Photoshop® reduces screen resolution by an integer factor. Therefore in our experimental conditions 875
pixels can be viewed as an array of 50 × 17.5 pixels, 286 pixels as
28.6 × 10 pixels, 140 as 20 × 7 pixels and 83 pixels as 15.4 × 5.4 pixels.
Fig. 4. Performance for single 4-letter word reading versus number of
pixels in a stabilised viewing area of (a) 20° × 7° and (b) 10° × 3.5°.
Mean reading scores in rationalised arcsine units ± SEM (left scale)
and in percent (right scale) for five normal subjects at five eccentricities
in the lower visual field. At maximum screen resolution, the large
viewing area contained 4 times more pixels than the small viewing area.
Otherwise, tests were performed at equal pixel resolution for both
viewing areas.
3.3. Reading performance on the smaller viewing area (10° × 3.5°)
Mean reading performance versus pixel number for
various eccentricities on the smaller viewing area is
presented in Fig. 4b. In central vision, results with the
smaller viewing area were very similar to those with the
larger area. Reading performance of 4-letter words was
almost perfect (or higher than 90% correct) for pixelisations down to 286 pixels. Then, it dropped abruptly.
The same limiting criterion of about 300 pixels was
found to transmit the relevant information. In peripheral
vision, the decrease in maximum reading performance
with increasing eccentricity was more pronounced than
on the larger viewing area: at eccentricities of 10°, 15°
and 20°, maximum reading performance was limited to
values of 89%, 57% and 30% correct, respectively.
3.4. Normalised data on both viewing areas
The raw observations presented in Fig. 4 demonstrate
that both number of pixels and eccentricity of the
stimuli affected reading performance of 4-letter words in
our experiments. In order to compare the effect of
the pixel number at different eccentricities, we normalised the data to the values obtained at maximum screen
resolution (Fig. 5). These normalised data demonstrate
that the pixel number affected reading performance very
similarly at all eccentricities and on both viewing areas.
This result is consistent with the fact that the number of
pixels influences directly the information content of the
source image. The eccentricity of the stimulus, however,
seems to affect the way information is processed by the
visual system, and appears to limit maximum reading
performance.
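The normalisation used for Fig. 5 can be sketched as follows. The text does not say whether the division was applied to scores in rau or in percent, and the numbers below are hypothetical placeholders rather than measured data; only the principle (dividing each score by the score obtained at maximum screen resolution for the same eccentricity) comes from the text. The function name is hypothetical.

```python
def normalise(scores_by_pixels):
    """Divide each score by the score at the highest pixel count.

    scores_by_pixels: dict mapping number of pixels in the viewing area
    to the mean reading score at one eccentricity.
    """
    reference = scores_by_pixels[max(scores_by_pixels)]
    return {pixels: score / reference
            for pixels, score in scores_by_pixels.items()}


# Hypothetical scores at one eccentricity on the small viewing area.
hypothetical_scores = {14000: 60.0, 875: 63.0, 286: 45.0, 140: 15.0, 83: 3.0}
relative = normalise(hypothetical_scores)
```

After this step every curve starts at 1 for maximum resolution, so the curves for different eccentricities can be compared for shape alone.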
3.5. Single letter recognition versus 4-letter word reading
At eccentricities of 10° and more, most subjects
spontaneously reported that they had problems recognising letters occurring in the middle of the words.
This suggested that letters closely flanked by others
were more difficult to identify. To check this point, we
designed an additional experiment, using isolated letter
stimuli instead of 4-letter words. In brief, isolated single
letters of identical font type and size as for the word
experiments were presented on the small viewing area.
The overall surface of the viewing area did not change
and it contained the same total number of pixels as for
the word experiments. Blocks of 50 letters, chosen
among the French alphabet and according to their frequency of use in our pool of 500 words, were randomly
presented to five new subjects. The results of this additional experiment are presented in Fig. 6. Up to an
eccentricity of 15°, isolated letter recognition was almost independent of eccentricity. At 20° eccentricity,
maximum letter recognition was still about 90% correct
for high pixel resolutions. Fig. 7 compares reading
performance of isolated letters to that of 4-letter words
at 286 pixel resolution on the small viewing area. It
appeared that isolated letter recognition was much
less affected by eccentricity than word reading. In an
attempt to compare both results, we computed, as a first
approximation, the intrinsic probability to correctly
identify 4 isolated letters in a successive sequence. This
rough estimation still falls short of accounting for the very
low scores observed in the word-reading task. The
two tasks are difficult to compare quantitatively in an
accurate manner, but this observation suggests that
4-letter word reading was significantly reduced at high
eccentricities by the fact that the letters to be recognised
are flanked by others. The “crowding effect” 5 (Tychsen, 1992) may be the underlying fundamental mechanism.
Finally, it should also be noted that, at high eccentricities, isolated letter recognition was significantly
better at a target resolution of 875 pixels than at maximum screen resolution (Fig. 6; p = 0.016 at 15° eccentricity, p = 0.018 at 20° eccentricity). The data for 4-letter
word reading on the small viewing area (Fig. 4b)
show the same trend. This finding may indicate that, at
high eccentricities, a certain blur of the target (due to
pixelisation) leads to better performance if the letter size
is below the critical print size. A recent study by Li,
Nugent, and Peli (2001) compared letter recognition of
jagged (pixelised) and smoothed (anti-aliased) letters on
a CRT display. They found no significant difference
between the two conditions in peripheral vision up to
12.5° eccentricity. While the stimuli used by Li and co-workers are not exactly comparable with the stimuli we
used, the present result suggests that observable differences may appear only at higher eccentricities (15° and
more).
Fig. 5. Normalised reading performance for single 4-letter words
versus number of pixels in a stabilised viewing area of (a) 20° × 7° and
(b) 10° × 3.5°. Mean relative reading scores ± SEM for five normal
subjects at five eccentricities in the lower visual field. The data are
normalised to the mean reading performance values at maximum
screen resolution.
Fig. 6. Performance in isolated letter recognition versus number of
pixels in the stabilised viewing area of 10° × 3.5° at five eccentricities in
the lower visual field. Mean letter recognition scores in rau ± SEM (left
scale) and in percent (right scale) for five normal subjects.
Fig. 7. Reading performance versus eccentricity for stimuli presented
on the smaller viewing area (10° × 3.5°) containing 286 pixels. Results
of isolated letter recognition are compared to results of 4-letter word
reading. Mean performance in rau ± SEM (left scale) and in percent
(right scale) for five normal subjects. The dotted line indicates the
probabilistic prediction to recognise 4 letters in sequence on the basis
of the probability to recognise single isolated letters ($p_{\text{words}} = p_{\text{letters}}^{4}$).
Note that this simple estimate indicates a lower limit (e.g. some words
may be identified without requiring recognition of all 4 letters).
5 Increased difficulty in recognising words made up of closely spaced
letters, when presented in the peripheral visual field.
4. Experiment 2
Results collected in experiment 1 might underestimate possible performances, especially at high eccentricities, because normal subjects were not used to
eccentric reading. Blind patients, potential recipients of
retinal implants, will have time to fully adapt to the use
of their prosthesis. We therefore investigated in experiment 2 the effect of training on eccentric reading.
4.1. Specific methods
We attempted to choose a test condition mimicking
as closely as possible a realistic retinal prosthesis. According to Sjöstrand et al. (1999) a radial one-to-one
connection between photoreceptors, bipolar cells and
ganglion cells is not guaranteed for eccentricities smaller
than about 10°. Retinotopic activation without major
distortion is essential if one wants to avoid complex preprocessing of the light falling on the retina. We therefore
decided to investigate the effects of training on a viewing
area placed at 15° eccentricity in the lower visual field
(corresponding to a surgically as well as physiologically
favourable location on the retina). We chose the smaller
viewing area of 10° × 3.5° (corresponding to a surgically
manageable surface of 3 × 1 mm² on the retina) containing 286 pixels (corresponding to a number of pixels
allowing close to perfect word recognition in central
vision and representing a number of contacts which is
manageable with present technology). Under such conditions in experiment 1, subjects could correctly identify
between 20% and 48% of the words. This level of performance was clearly above chance level, but insufficient
to provide useful function. For comparison, two
additional experimental conditions on the same viewing
area were used: (a) stimuli containing the same number
of pixels (286 pixels), but presented at an eccentricity of
0° (central reading); (b) stimuli presented at the same
eccentricity (15°), but containing 14,000 pixels (maximum screen resolution).
Two young female subjects, both 27 years old, participated in this experiment. 6 Subject EO performed all
tests in binocular condition, whereas Subject AR performed all tests in monocular condition. 7 They had not
participated in any of the previous studies on eccentric
reading.
Three experimental sessions were conducted each
working day of the week (5 days a week). Each session
included one run (consisting of a 50-word block) in each
of the following three successive conditions: first, 286
pixels at 0° eccentricity; second, 14,000 pixels at 15°
eccentricity; and third, 286 pixels at 15° eccentricity.
Thus, the easiest condition was tested first, and the most
difficult last, so that possible within-session learning
effects would favour results in the most difficult condition. Each experimental session lasted about 20 min, the
three sessions representing about 1 h of daily training. A
total of 69 sessions were conducted with each subject.
Except for weekends, the regular daily flow of sessions
was interrupted once (AR), or twice (EO), for 3-day
vacations.
The presentation time of each stimulus was limited to
10 s, but subjects were instructed to press a key as soon
as they had recognised the projected word. The response
time was recorded together with the nature of the subject's response ('right' or 'wrong'). At the end of each
run, reading performance (expressed in number of correctly recognised words) and mean response time (expressed in seconds, on the basis of all 50 trials) were
automatically computed.
Learning curves were established for each experimental condition on the basis of reading scores and also
on the basis of mean response time. Data were fitted
using the non-linear regression function
$y = y_0 + a\left(1 - e^{-bx}\right).$
To determine if time (expressed as session number)
had a statistically significant effect on performance we
used a simple linear correlation (Pearson's correlation).
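Both analyses can be sketched with the standard library alone. The fitting software used in the study is not named, so the code below swaps in a simple substitute: it assumes the learning curve has the saturating form y = y0 + a(1 − e^(−bx)) with x the session number, scans a grid of candidate values for b, and solves y0 and a by ordinary linear least squares at each candidate. The function names, the grid and the synthetic data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_learning_curve(xs, ys, b_grid):
    """Fit y = y0 + a * (1 - exp(-b * x)); return (y0, a, b).

    For each candidate b the model is linear in y0 and a, so they are
    obtained by simple linear regression of y on u = 1 - exp(-b * x).
    The candidate with the smallest sum of squared errors is kept.
    """
    n = len(xs)
    mean_y = sum(ys) / n
    best = None
    for b in b_grid:
        us = [1.0 - math.exp(-b * x) for x in xs]
        mean_u = sum(us) / n
        suu = sum((u - mean_u) ** 2 for u in us)
        if suu == 0.0:
            continue  # degenerate candidate, the curve is flat
        a = sum((u - mean_u) * (y - mean_y) for u, y in zip(us, ys)) / suu
        y0 = mean_y - a * mean_u
        sse = sum((y0 + a * u - y) ** 2 for u, y in zip(us, ys))
        if best is None or sse < best[0]:
            best = (sse, y0, a, b)
    return best[1], best[2], best[3]


# Synthetic, noise-free example over 69 sessions (hypothetical parameters).
sessions = list(range(1, 70))
scores = [6.0 + 60.0 * (1.0 - math.exp(-0.05 * s)) for s in sessions]
y0, a, b = fit_learning_curve(sessions, scores, [k / 1000 for k in range(1, 501)])
r = pearson_r(sessions, scores)
```

In this parameterisation y0 is the score extrapolated to session 0, y0 + a is the asymptotic score and 1/b is the number of sessions needed to cover about 63% of the total improvement.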
4.2. Learning effects on eccentric reading of pixelised
words
Fig. 8 presents reading performance versus session
number in the main condition (286-pixel resolution at
15° eccentricity). We observed impressive learning effects in both subjects. Both began the experiment with
low reading scores and improved over time by about
60 percentage points. These improvements were highly statistically significant (Pearson's correlation: r = 0.80, p <
0.0001 for EO and r = 0.86, p < 0.0001 for AR).
Experimental data were fitted with the exponential
function presented in the methods section to average
session-to-session variability. The fits revealed some
noticeable individual differences. At the very beginning
of the learning period, subject EO was able to identify
about 23% of the words, whereas subject AR identified
only 6% of words. Both progressed over time. During
final sessions, subject EO achieved scores of about 85%
correctly read words, and subject AR scores of about
64% correctly read words. EO's scores asymptoted for
the last 15 sessions, while AR's scores never reached a
clear asymptote. It should however be noted that EO
was tested in binocular condition and AR in monocular
condition. Although overall improvements were similar
for both of them, these differences in absolute scores as
well as in the learning curve might thus reflect a binocular advantage.

Fig. 8. Performance in reading 4-letter words versus session number at
15° eccentricity and using a viewing area containing 286 pixels. The
solid line indicates the best fit to the data.
[Figure not reproduced: one panel per subject (recovered panel title: "AR - monocular viewing"); x-axis: Session # (0 to 70); vertical scales: -10 to 120 and 0 to 100.]

Footnote 6: Since the main goal of the experiment was to show in a general
way, that performance in eccentric reading can be improved by
training, 2 subjects were estimated to be sufficient to demonstrate such
an effect.
Footnote 7: Retinal implants will certainly essentially be used monocularly. It
was, however, interesting to compare monocular to binocular learning
in normal subjects who generally use binocular vision.

A second experimental observation consistent with a
learning process appears in the analysis of the mean
response time (Fig. 9). During initial sessions, typical
response times ranged between 3 and 5 s. During final
sessions, typical response times decreased to 2–3 s. Experimental data were fitted to average session-to-session
variability. The fits revealed that both subjects significantly reduced the mean response time as the session
number increased (Pearson's correlation: r = 0.79,
p < 0.0001 for EO and r = 0.38, p = 0.001 for AR).
The reduction in response time was more pronounced
for subject EO tested binocularly (2.4 s), than for subject
AR tested monocularly (0.6 s). Interestingly, the longer
initial response times of subject EO were also associated
with better initial reading performances, suggesting
inter-individual differences in the strategies used to
perform this difficult task.
This analysis, based on all responses (correct and incorrect), best reflects the effects of the global learning
process, but it obscures the time subjects took to read
the words they recognised correctly. An analysis based
on correct responses only (dashed lines in Fig. 9) revealed that mean response times for correctly read
words were generally shorter. In this case, only subject EO significantly reduced her correct response
time (Pearson's correlation: r = 0.68, p < 0.0001). AR
showed the same trend, but the effect was not significant in
her data.
Taken together, these data clearly indicate that important improvements in performance can be obtained
by training. Both subjects progressed from a poor to a
relatively useful visual function during the course of this
experiment. This suggests that future users of visual
prostheses will need time to extract best performances
from these devices, as is the case for deaf users of
cochlear implants (see e.g. Pelizzone, Cosendai, &
Tinembart, 1999). It is interesting to explore in more
detail some of the parameters influencing this learning
process.

Fig. 9. Mean presentation time in reading 4-letter words versus
training session number at 15° eccentricity in the lower visual field and
using a viewing area containing 286 pixels. The solid lines indicate the
best fits to the data. The dashed lines indicate the best fits to the data
when only correct responses are taken into consideration.
[Figure not reproduced: one panel per subject (recovered panel title: "AR - monocular viewing"); x-axis: session number (0 to 60); vertical scale: 0 to 6.]

4.3. Influence of eccentricity and pixel number on the
learning process
Fig. 10 shows the data collected in the two additional experimental conditions described in the methods
section.
The influence of adaptation to read pixelised stimuli
is demonstrated by analysing data collected using the
same number of pixels, but presented via central reading
(0° eccentricity). As expected from experiment 1, central
reading performance of pixelised words was close to
perfect for both subjects. Although both subjects slightly
improved their central reading performance with time,
this improvement was small relative to the overall improvements observed upon eccentric reading. Hence, the
adaptation process to decipher pixelised words had only
a weak influence on reading performance.
The influence of adaptation to eccentricity is demonstrated by analysing data collected at the same eccentricity (15°), but using stimuli containing 14,000 pixels
(maximum screen resolution). For both subjects, the
progression of performance over time was similar to that
observed using 286 pixel stimuli. Fits to the experimental data demonstrate that the performance at maximum screen resolution is about 10%–20% higher, the
effect of resolution being apparently slightly more pronounced at the end of the experiment. These results
confirm that adaptation to eccentric reading was the
principal component of the overall learning process.
One can conclude from these control conditions that
habituation to eccentricity is presumably a dominant
component in the learning process. It is interesting to
note that perfect performance was never achieved at 15°
eccentricity because scores collected at that eccentricity
asymptoted clearly below those collected with central
reading. This shows that providing more resolution can
improve performance to some extent, but does not entirely compensate for the loss due to eccentricity.

Fig. 10. Performance in reading 4-letter words versus training session
number in the two control conditions: (1) in central vision using a
viewing area containing 286 pixels (grey triangles), and (2) at 15° eccentricity in the lower visual field using a viewing area at maximum
screen resolution (white diamonds). The solid lines indicate the best fit
to the data in these two conditions. For comparison, the dashed line of
the best fit to the data in the main condition is also shown.
[Figure not reproduced: one panel per subject (recovered panel title: "AR - monocular viewing"); x-axis: Session # (0 to 70); vertical scales: -10 to 120 and 0 to 100.]

4.4. Influence of familiarisation with the word set
Subjects were confronted daily for more than one
month with the same finite set of 500 words. One might
wonder if they improved their identification performance simply because they were progressively learning
the set of possible correct answers.
We tested this issue by generating an additional pool
of 200 new 4-letter words. None of the words in the new
set had been used previously. Fifty-word blocks were
randomly extracted from this new set and presented to
the subjects in testing sessions that occurred after the
end of the main experiment. All other aspects of testing
were exactly identical to that of the main experiment.
Fig. 11 compares mean reading performances using the
new word set to the data collected using the old 500-word
set throughout the experiment. For both subjects,
reading performance using the new word set was significantly higher (EO: p = 0.002; AR: p < 0.001) than
the performance measured at the beginning of the main
experiment. Reading performance using unpractised
words was only slightly lower than that reached at the
end of the main experiment (EO: p = 0.2; AR: p = 0.03).
This demonstrates that the benefits derived from training with one set of words could be exploited to decipher
new, unpractised words.
One can conclude from this additional test that repeatedly using the same pool of words did not significantly bias our results. Thus, observed improvements
over time were actually real improvements in accomplishing the demanded task. Interestingly, however, familiarisation with the word pool was important for
reading speed. Indeed, pixelised words, familiar to the
reader, were recognised slightly more rapidly than unpractised words presented in the same conditions.
4.5. Binocular versus monocular perceptual learning
Subject EO performed all tests using binocular vision
while subject AR performed all tests in monocular vision. The final scores reached by subject EO were about
20% higher than those of subject AR (see Fig. 8). Two
important questions can be raised: Is this difference reflecting an advantage of binocular vision? What would
reading scores be if subjects would be asked to perform
the task in the condition they did not use previously?
At the end of experiment 2, we measured on subject
EO reading scores in monocular viewing condition, and
on subject AR reading scores in binocular viewing
condition, using the same testing conditions as in our
main experiment. No significant differences in reading
performance were found for either subject in such "reversed" conditions (Fig. 12). Inter-individual differences
in reading scores were preserved. This indicates that the
condition in which the perceptual learning of eccentric
reading was conducted is not relevant. Training with
binocular reading benefits subsequent monocular reading and, conversely, training with monocular reading
benefits subsequent binocular reading. There was, however, a slight, but non-significant, within-subject trend to
better scores with binocular vision. This small advantage
was possibly due to effects of binocular summation or
inter-ocular suppression.

Fig. 11. Mean reading performance with unpractised 4-letter words
compared to results at the beginning and the end of the main experiment. Bars indicate mean values ± SD in three conditions: (1) first
three runs and (2) last three runs of the main experiment compared to
(3) three additional runs with unpractised words. Experimental conditions: 15° eccentricity using a viewing area of 10° × 3.5° containing
286 pixels.
[Figure not reproduced: bar charts for subjects EO and AR; recovered legend entries: "At the beginning of the main experiment", "At the end of the main experiment"; vertical scales: -10 to 120 and 0 to 100.]

We were also interested to know if perceptual learning gathered with one eye transfers to the non-habituated eye. Subject AR, who did all the tests monocularly
with her dominant right eye, was therefore retested
using her non-dominant left eye in all three different
experimental conditions. Fig. 13 shows clearly that there
is no significant difference in monocular reading performance across both eyes.

Fig. 12. Effects of using reversed viewing conditions (untrained versus
trained). Bars indicate mean values ± SD. Subject EO: three additional
runs in monocular condition (untrained) versus the last three runs of
the main experiment in binocular condition (trained). Subject AR:
three additional runs in binocular condition (untrained) versus the last
three runs of the main experiment in monocular condition (trained).
Experimental conditions: 15° eccentricity using a viewing area of
10° × 3.5° containing 286 pixels.

Fig. 13. Comparison of reading performances between the trained and
the untrained eye at the end of the training process for subject AR.
Bars indicate mean values ± SD. For each condition, the last three runs
of the main experiment are compared to three additional runs conducted on the untrained eye.
[Figure not reproduced: bar chart for the conditions "0°/286 pixels", "15°/max. pixels" and "15°/286 pixels"; recovered legend entry: "Trained right eye"; vertical scales: -10 to 120 and 0 to 100.]

4.6. Persistence of perceptual learning
Finally, we were interested to investigate if the benefits of perceptual learning of eccentric reading could
persist after a significant period of non-practice.
For a 2-month period after the end of the experiments, subject EO did not participate in any testing. Her
reading performance was then re-tested using the main
experimental condition. Table 1 demonstrates that there
was no significant change in performance after 2 months
of non-practice. This indicates that perceptual learning
of eccentric reading is at least preserved for a certain
time.

Table 1
Reading performance after two months of rest compared to the
reading performance at the end of training for subject EO

Mean reading performance                                  4-Letter words read (EO)
                                                          Rau      SD       Percent
At the end of the experiment (day 36)                     87.2     9.8      88.0
Two months after completion of the experiment (day 99)    85.6     1.5      85.9

Mean values are calculated on the basis of three runs: last three runs of
the main experiment (day 36) and three additional runs conducted at
day 99, two months after completion of the experiment. Experimental
conditions: 15° eccentricity using a viewing area of 10° × 3.5° containing 286 pixels in binocular vision.

5. Discussion
This first study was designed to explore reading
performance in conditions mimicking artificial vision.
Several aspects of the experimental set-up deserve to be
discussed.
We used 4-letter word stimuli, because Legge et al.
(1985) initially demonstrated that this was the minimum
letter-sequence allowing close to normal reading speeds.
However, subsequent studies demonstrated that seeing
more than 4 letters at a glance could yield better reading
speeds (Beckmann & Legge, 1996; Fine, Kirschen, &
Peli, 1996; Fine & Peli, 1996). Since the aim of this study
was to simulate retinal implants of a relatively small
area and containing a finite number of stimulation
contacts, the use of a small number of letters was an
advantage: (1) A small number of letters made it possible to fill
the restricted viewing area with large letters. The letter
size is an important limiting factor if one wants to investigate eccentric reading (see also Chung et al., 1998).
(2) Recent work by Thompson et al. (2000) and Dagnelie et al. (2000) demonstrated that a grid density of
about 4 pixels per letter width is needed to allow for
accurate character definition. This limited the number of
characters that could be presented via a finite number of
stimulation contacts. (3) The visual span is another
important limiting factor in eccentric reading. In a recent study, Legge, Mansfield, and Chung (2001) estimated that the average visual span decreases from at
least 10 letters in central vision to about 1.7 letters at 15°
eccentricity, this figure however increasing somewhat
with prolonged observation time. For those reasons, 4-letter word stimuli represented an adequate compromise
for our experimental purpose.
We decided to conduct our experiments using a
proportionally spaced font. Proportionally spaced fonts,
which place letters closer together than equally spaced
ones, favour the crowding effect and would therefore be
less convenient for readers who are restricted to use
eccentric locations of their retina. However, books,
journals and most printed matter are almost exclusively
printed in such proportional fonts. It was mandatory to
adapt our simulations to this reality. Attempts to
modify letter spacing might improve performance, as
suggested by several authors (Arditi, Knoblauch, &
Grunwald, 1990; Latham & Whitaker, 1996; Toet &
Levi, 1992). This would however imply additional special hardware, which is too speculative to be considered
at this point. Furthermore, Chung (2002) concluded a
recent study, in which she used an equally spaced
Courier font, with the sentence: ‘‘Increased letter spacing beyond the standard size, which presumably decreases the adverse effect of crowding, does not lead to
an increase in reading speed in central or peripheral
vision’’. Hence, the effect of letter spacing on eccentric
reading is still controversial.
Finally, we used fixed text to present the stimuli. The
use of different presentation methods (e.g. scrolled text
or RSVP) might have increased reading speed. However,
none of these methods really mimics the use of a
retinal prosthesis. Fixed text was the simplest condition
to be tested and we acknowledge this limitation in our
present work. More realistic experiments using full-page
navigation, as well as other modes of pixelisation, are
underway and will be reported soon.
Under these experimental conditions, experiment 1
clearly showed that about 300 pixels were necessary to
appropriately code 4-letter words. These data replicate
in part the work of Cha et al. (1992), using 4-letter
words. They extend their findings, because their experiments were limited to a central visual field of 1.7°, and
subjects were allowed to scan the image with eye
movements. About 300 pixels appear therefore to be an
intrinsic limit that is related to the type of stimulus (4-letter words) more than to the presentation protocol.
Implantable microelectrode arrays consisting of about
300 active contacts seem feasible using present technology. Zrenner et al. (1997) as well as Peyman et al. (1998)
have already manufactured such first prototypes. Our
simulations attempted to mimic an implantable chip
covering a surface of about 3 × 1 mm² on the retina
with an electrode-to-electrode separation of approximately 100 µm. Multi-site stimulation measurements on
chicken retinae have demonstrated that such closely
spaced contacts can selectively activate retinal neurons
(Stett, Barth, Weiss, Haemmerle, & Zrenner, 2000).
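The link between the chip dimensions, the contact spacing and the figure of about 300 contacts can be checked with simple arithmetic. The sketch below assumes a regular square grid with one contact per pitch-sized cell; this layout and the function name are assumptions for illustration only.

```python
def contacts_on_chip(width_um, height_um, pitch_um):
    """Count contacts on a square grid, one contact per pitch-sized cell.

    All dimensions are integers in micrometres, which avoids
    floating-point rounding in the divisions.
    Returns (columns, rows, total).
    """
    if pitch_um <= 0:
        raise ValueError("pitch must be positive")
    columns = width_um // pitch_um
    rows = height_um // pitch_um
    return columns, rows, columns * rows


# Chip of about 3 x 1 mm with a contact-to-contact spacing of about 100 um.
columns, rows, total = contacts_on_chip(3000, 1000, 100)
```

Under these assumptions the grid has 30 × 10 positions, that is 300 contacts, which is close to the 286 pixels used in the simulations.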
The amount of information that can be transmitted
via about 300 stimulation contacts is however really
useful only if projected onto the central part of the
visual field. As our study demonstrates, reading performance drops severely at eccentricities of 10° and
beyond, even if more pixels are used. At high eccentricities, the main factor limiting reading performance is
not the pixel number, but the fact that only part of the
information content of the stimuli can be grasped by the
subject. The smallest character size we used corresponded to a visual acuity of less than 20/250. The visual
acuity at eccentricities of 15°–20° is expected to be much
better (Cowey & Rolls, 1974; Daniel & Whitteridge,
1961). Hence, the low performance observed at high
eccentricities could not be attributed to decreased resolution in the periphery. This was confirmed by the fact
that eccentric recognition of single letters was much
better than eccentric reading of entire words (experiment
1). Reduced discrimination in presence of surrounding
stimuli due to the so-called ‘‘crowding effect’’ seems to
be a better explanation for low reading performance at
high eccentricities.
Strictly speaking, on the basis of the results collected in
experiment 1, an eccentric position (>10°) of a retinal
implant would strongly impair reading performance.
There are however strong practical arguments suggesting that it might be required to place these implants at
such high eccentricities. As already mentioned, morphological studies of the neuronal architecture of the
retina, like those by Sjöstrand et al. (1999) and others,
show that a direct vertical connection between photoreceptor, bipolar cells and ganglion cells is best realised
in retinal areas beyond 10° eccentricity. Close to the
fovea, several layers of bipolar and ganglion cells are
superimposed. At up to about 10° of eccentricity, one
cone may be connected to several ganglion cells, and
ganglion cells are displaced radially from the photoreceptors they innervate. This distortion decreases with
eccentricity. Beyond 10° of eccentricity the distortion is
minimal. Such eccentric regions of the retina are therefore much better suited for retinotopic electrical stimulation. This mapping issue is of special importance for
retinal implants that are designed to use in situ light
falling on the retina. Such prostheses are presently being developed by a German (Zrenner et al., 1999) and a US
(Chow & Chow, 1997) consortium. This type of device
would be the most elegant approach, if successful, but it
does not really allow for pre-processing to prevent non-retinotopic mapping. If other systems using an external
camera to capture the stimuli are considered, such as
those envisioned in the projects of Humayun et al.
(1999) or Rizzo and Wyatt (1997), the transmitting
hardware could possibly include remapping routines.
Although this is technically conceivable, it might require
prohibitive amounts of perceptual tests for adjustment.
For these reasons, we are convinced that it would be
optimal to try to place a retinal implant beyond 10° of
eccentricity in a first attempt.
These considerations raised the question as to whether subjects could adapt to eccentric reading. Improvements in the accomplishment of tasks involving
stimuli presented in peripheral vision, have already been
reported by several authors to be task-specific. For example, learning has been observed for vernier acuity and
bisection, for stereoscopic orientation and time discrimination tasks, but not for resolution tasks or Landolt C acuities (e.g. Beard, Levi, & Reich, 1995; Crist,
Kapadia, Westheimer, & Gilbert, 1997; Schoups, Vogels, & Orban, 1995; Westheimer, 2001). Taken together
these findings imply that spatial visual functions, which
rely on important processing in higher cortical areas,
can be improved by training in the visual periphery. In
particular the ‘‘crowding effect’’ seems to be of cortical and not of retinal origin (e.g. Levi, Klein, &
Aitsebaomo, 1985). Electrophysiological experiments in
the monkey, monitoring the functional properties of the
primary visual cortex area V1, suggest that perceptual
learning is accompanied by a decrease of the ‘‘crowding
effect’’ (Crist, Li, & Gilbert, 2001). Moreover, a paper by
Leat, Li, and Epp (1999) states that the "crowding effect" also includes an important component of attention (see footnote 8); this component being potentially improved by
training, as indicated by experiments on contour interaction (Manny, Fern, Loshin, & Marinez, 1988) or visual search (e.g. Sireteanu & Rettenbach, 1995, 2000).
There is also extensive evidence in the low vision literature that educational training (e.g. in the use of optical
aids) is an important factor for successful eccentric
reading by patients with macular scotoma (see e.g. Peli,
1986). In some cases, greatly improved reading capacities were already observed with as little as about 5 h of
training (Nilsson, 1990). However, the conditions encountered by low vision patients are markedly different
from those expected from users of retinal implants. Low
vision patients are generally able to use large parts of
their retina, situated relatively close to the fovea, while
the stimuli used in this study were restricted to a small
area, stabilised at a high eccentricity in the lower visual
field and pixelised. It was therefore important to test
if training could improve performance in conditions
mimicking retinal implants.
Experiment 2 was especially designed to investigate if
eccentric reading, under conditions simulating a retinal
implant, could be improved by learning or if it would be
limited by fundamental properties of the visual system.
The two subjects tested in this study, demonstrated
clearly that they were able to adapt, and their performance improved impressively over time. While more
subjects would be needed to better quantify the average
amount of improvements that can be expected, two
subjects were sufficient to demonstrate the existence of
learning. Control measurements revealed that this type
of learning: (1) was not an artefact due to the progressive memorisation of the set of possible answers, and (2)
it was not specific to the trained eye and could be
transferred to the untrained eye. This latter finding is in
contrast to observations for which learning was restricted to the trained condition, with little or no
transfer to the non-trained aspects of the stimulus or to
the other eye (Karni & Sagi, 1991; Poggio, Fahle, &
Edelman, 1992). Complete interocular transfer, as observed here, favours perceptual learning mechanisms occurring in higher-order, binocular areas, as for
example suggested for motion direction discrimination
(e.g. Ball & Sekuler, 1987; Schoups et al., 1995; Schoups
& Orban, 1996). An alternative explanation might be the
involvement of high-level attentional or other cognitive
mechanisms, which modulate the specific levels of early
visual processing, as suggested by Ahissar and Hochstein (1993, 1996) or Beard et al. (1995). We also found
that perceptual learning of eccentric reading was at least
maintained for a period of two months after completion
of the training. Persistence of perceptual learning over
periods of several months has been found, for example
by Fiorentini and Berardi (1981) in grating waveform
discrimination, by Ball, Beard, Roenker, Miller, and
Griggs (1988) or Sireteanu and Rettenbach (2000) for
visual search tasks, or by Beard et al. (1995) for vernier
and resolution acuity.

Footnote 8: Attention, when directed towards the eccentric retinal locus,
reduces attentional effects of crowding.

On the basis of the present study, it is not possible to
determine the importance of the different factors influencing the learning process. Are better performances
mainly due to a better control of undesirable reflexive
eye movements during the experiments, or are they due
to a decrease in the ‘‘crowding effect’’? Both effects are
probably closely related. Preliminary results on full-page text reading under conditions simulating an eccentric retinal implant indicate that the suppression of
undesirable reflexive eye movements plays a dominant
role in the learning process.
6. Conclusion
Based on these results, it appears that functional
retinal implants with a few hundred stimulation contacts
might successfully restore some reading abilities to blind
patients, even if placed outside the fovea. Optimal performance with such devices will however require a significant adaptation process. As future users of retinal
implants will have to wear their prosthesis permanently,
we expect them to benefit even more from adaptation
than the normal subjects in our simulation experiment.
Our present results are in this respect very encouraging
for the future.
Additional research on eccentric reading of whole
page texts in similar conditions is required, as well as
studies focusing on other important visual tasks such as
spatial orientation (mobility) and spatial localisation
(visuo-motor coordination), to get a more complete
picture of the potential benefits that could be derived
from retinal prostheses.
Acknowledgements
The presented work was supported by the Swiss
National Foundation for Scientific Research (grant
3100-61956.00) and the ProVisu Foundation. The authors thank Dr. Andrew R. Whatham for his critical
reviewing of this manuscript. Results of experiment 1
have already partially been presented at ARVO (IOVS
2000; 41/4: S436).
References
Ahissar, M., & Hochstein, S. (1993). Attentional control of early
perceptual learning. Proceedings of the National Academy of
Sciences, 90, 5718–5722.
Ahissar, M., & Hochstein, S. (1996). Learning pop-out detection:
specificities to stimulus characteristics. Vision Research, 36, 3487–
3500.
Arditi, A., Knoblauch, K., & Grunwald, I. (1990). Reading with fixed
and variable character pitch. Journal of the Optical Society of
America A, 7, 2011–2015.
Bagnoud, M., Sommerhalder, J., Pelizzone, M., & Safran, A. B.
(2001). Information visuelle nécessaire à la restauration d'une
lecture au moyen d'un implant rétinien chez un aveugle par
dégénérescence massive des photorécepteurs. Klinische Monatsblätter für Augenheilkunde, 218, 360–362.
Ball, K., & Sekuler, R. (1987). Direction-specific improvement in
motion discrimination. Vision Research, 27, 953–965.
Ball, K. K., Beard, B. L., Roenker, D. L., Miller, R. L., & Griggs, D.
S. (1988). Age and visual search: expanding the useful field of view.
Journal of the Optical Society of America A, 5, 2210–2219.
Beard, B. L., Levi, D. M., & Reich, L. N. (1995). Perceptual learning in
parafoveal vision. Vision Research, 35, 1679–1690.
Beckmann, P. J., & Legge, G. E. (1996). Psychophysics of reading.
XIV. The page navigation problem in using magnifiers. Vision
Research, 36, 3723–3733.
Buultjens, M., Aitken, S., Ravenscroft, J., & Carey, K. (1999). Size
counts: The significance of size, font and style of print for readers
with low vision sitting examinations. British Journal of Visual
Impairment, 17, 5–10.
Cha, K., Horch, K. W., Normann, R. A., & Boman, D. K. (1992).
Reading speed with a pixelised vision system. Journal of the Optical
Society of America A, 9, 673–677.
Chow, A. Y., & Chow, V. Y. (1997). Subretinal electrical stimulation
of the rabbit retina. Neuroscience Letters, 225, 13–16.
Cowey, A., & Rolls, E. T. (1974). Human cortical magnification factor
and its relation to visual acuity. Experimental Brain Research, 21,
447–454.
Crist, R. E., Kapadia, M. K., Westheimer, G., & Gilbert, C. D. (1997).
Perceptual learning of spatial localization: specificity for orientation,
position and context. Journal of Neurophysiology, 78, 2889–2894.
Crist, R. E., Li, W., & Gilbert, C. D. (2001). Learning to see:
experience and attention in primary visual cortex. Nature Neuroscience, 4, 519–525.
Chung, S. T. L. (2002). The effect of letter spacing on reading speed in
central and peripheral vision. Investigative Ophthalmology and
Visual Science, 43, 1270–1276.
Chung, S. T. L., Mansfield, J. S., & Legge, G. E. (1998). Psychophysics
of reading. XVIII. The effect of print size on reading speed in
normal peripheral vision. Vision Research, 38, 2949–2962.
Dagnelie, G., Thompson, R. W., Barnett, G. D., & Zhang, W. Q.
(2000). Visual perception and performance under conditions
simulating prosthetic vision. Perception, 29, 84 (Abstract).
Daniel, P. M., & Whitteridge, D. (1961). The representation of the
visual field on the cerebral cortex in monkey. Journal of Physiology,
159, 203–221.
Dobelle, W. H. (2000). Artificial vision for the blind by connecting a
television camera to the visual cortex. ASAIO Journal, 46, 3–9.
Fine, E. M., & Peli, E. (1996). Visually impaired observers require a
larger window than normally sighted observers to read from a
scroll display. Journal of the American Optometric Association, 67,
390–396.
Fine, E. M., Kirschen, M. P., & Peli, E. (1996). The necessary field of
view to read with an optimal stand magnifier. Journal of the
American Optometric Association, 67, 382–389.
Fiorentini, A., & Berardi, N. (1981). Learning in grating waveform
discrimination: specificity for orientation and spatial frequency.
Vision Research, 21, 1149–1158.
Humayun, M. S., & de Juan, E., Jr. (1998). Artificial vision. Eye, 12,
605–607.
Humayun, M. S., de Juan, E., Jr., Weiland, J. D., Dagnelie, G.,
Katona, S., Greenberg, R., & Suzuki, S. (1999). Pattern electrical stimulation of the human retina. Vision Research, 39, 2569–2576.
Karni, A., & Sagi, D. (1991). Where practice makes perfect in texture
discrimination: evidence for primary visual cortex plasticity.
Proceedings of the National Academy of Science, 88, 4966–4970.
Latham, K., & Whitaker, D. (1996). A comparison of word recognition and reading performance in foveal and peripheral vision.
Vision Research, 36, 2665–2674.
Leat, S. J., Li, W., & Epp, K. (1999). Crowding in central and eccentric
vision: the effects of contour interaction and attention. Investigative
Ophthalmology and Visual Science, 40, 504–512.
Legge, G. E., Ahn, S. J., Klitz, T. S., & Luebker, A. (1997).
Psychophysics of reading. XVI. The visual span in normal and low
vision. Vision Research, 37, 1999–2010.
Legge, G. E., Mansfield, J. S., & Chung, S. T. L. (2001). Psychophysics
of reading XX. Linking letter recognition to reading speed in
central and peripheral vision. Vision Research, 41, 725–743.
Legge, G. E., Parish, D. H., Luebker, A., & Wurm, L. H. (1990).
Psychophysics of reading. XI. Comparing color contrast and
luminance contrast. Journal of the Optical Society of America A, 7,
2002–2010.
Legge, G. E., Pelli, D. G., Rubin, G. S., & Schleske, M. M. (1985).
Psychophysics of reading. I. Normal vision. Vision Research, 25,
239–252.
Legge, G. E., Ross, J. A., Isenberg, L. M., & La May, J. M. (1992).
Psychophysics of reading. XII: Clinical predictors of low-vision
reading speed. Investigative Ophthalmology and Visual Science, 33,
677–687.
Legge, G. E., & Rubin, G. S. (1986). Psychophysics of reading. IV.
Wavelength effects in normal and low vision. Journal of the Optical
Society of America A, 3, 40–51.
Legge, G. E., Rubin, G. S., & Luebker, A. (1987). Psychophysics of
reading. V. The role of contrast in normal vision. Vision Research,
27, 1165–1177.
Legge, G. E., Rubin, G. S., Pelli, D. G., & Schleske, M. M. (1985).
Psychophysics of reading. II. Low vision. Vision Research, 25, 253–265.
Levi, D. M., Klein, S. A., & Aitsebaomo, A. P. (1985). Vernier
Acuity, crowding and cortical magnification. Vision Research, 25,
963–977.
Li, L., Nugent, A. K., & Peli, E. (2001). Recognition of jagged
(pixelated) letters in the periphery. Visual Impairment Research, 2,
143–154.
Manny, R. E., Fern, K. D., Loshin, D. S., & Marinez, A. T. (1988).
The effects of practice on contour interaction. Clinical Visual
Science, 3, 59–67.
Nilsson, U. L. (1990). Visual rehabilitation with and without educational training in the use of optical aids and residual vision. A
prospective study of patients with advanced age-related macular
degeneration. Clinical Vision Sciences, 6, 3–10.
Normann, R. A., Maynard, E. M., Rousche, P. J., & Warren, D. J.
(1999). A neural interface for a cortical vision prosthesis. Vision
Research, 39, 2577–2587.
Peli, E. (1986). Control of eye movements with peripheral vision:
implications for training of eccentric viewing. American Journal of
Optometry and Physiological Optics, 63, 113–118.
Pelizzone, M., Cosendai, G., & Tinembart, J. (1999). Within-patient
longitudinal speech reception measures with continuous interleaved
sampling processors for Ineraid implanted subjects. Ear and
Hearing, 20, 228–237.
Peyman, G., Chow, A. Y., Liang, C., Chow, V. C., Perlman, J. I., &
Peachey, N. S. (1998). Subretinal semiconductor microelectrode
array. Ophthalmic Surgery and Lasers, 29, 234–241.
Poggio, T., Fahle, M., & Edelman, S. (1992). Fast perceptual learning
in visual hyperacuity. Science, 256, 1018–1021.
Rizzo, J. F., & Wyatt, J. (1997). Prospects for a visual prosthesis. The
Neuroscientist, 3, 251–262.
Schoups, A. A., Vogels, R., & Orban, G. A. (1995). Human perceptual
learning in identifying the oblique orientation: retinotopy, orientation specificity and monocularity. Journal of Physiology (London), 483, 797–810.
Rubin, G. S., & Legge, G. E. (1989). Psychophysics of reading. VI. The
role of contrast in low vision. Vision Research, 29, 79–91.
Schoups, A. A., & Orban, G. A. (1996). Interocular transfer in
perceptual learning of a pop-out discrimination task. Proceedings
of the National Academy of Science USA, 93, 7358–7362.
Sireteanu, R., & Rettenbach, R. (1995). Perceptual learning in visual
search: fast, enduring, but not specific. Vision Research, 35, 2037–2043.
Sireteanu, R., & Rettenbach, R. (2000). Perceptual learning in visual
search generalizes over tasks, locations, and eyes. Vision Research,
40, 2925–2949.
Sjöstrand, J., Olsson, V., Popovic, Z., & Conradi, N. (1999).
Quantitative estimations of foveal and extra-foveal retinal circuitry
in humans. Vision Research, 39, 2987–2998.
Stett, A., Barth, W., Weiss, S., Haemmerle, H., & Zrenner, E. (2000).
Electrical multisite stimulation of the isolated chicken retina. Vision
Research, 40, 1785–1795.
Studebaker, G. A. (1985). A “rationalized” arcsine transform. Journal
of Speech and Hearing Research, 28, 455–462.
Thompson, R. W., Barnett, D., Humayun, M., & Dagnelie, G. (2000).
Reading speed and facial recognition using simulated prosthetic
vision. Investigative Ophthalmology and Visual Science, 41, S860
(Abstract).
Toet, A., & Levi, D. M. (1992). The two-dimensional shape of spatial
interaction zones in the parafovea. Vision Research, 32, 1349–1357.
Tychsen, L. (1992). Binocular vision. In W. Hart (Ed.), Adler’s
physiology of the eye (p. 832). St. Louis: Mosby.
Veraart, C., Raftopoulos, C., Mortimer, J. T., Delbeke, J., Pins, D.,
Michaux, G., Vanlierde, A., Parrini, S., & Wanet-Defalque, M. C.
(1998). Visual sensations produced by optic nerve stimulation using
an implanted self-sizing spiral cuff electrode. Brain Research, 813,
181–186.
Westheimer, G. (2001). Is peripheral visual acuity susceptible to
perceptual learning in the adult? Vision Research, 41, 47–52.
Zrenner, E., Miliczek, K. D., Gabel, V. P., Graf, H. G., Guenther, E.,
Haemmerle, H., Hoefflinger, B., Kohler, K., Nisch, W., Schubert,
M., Stett, A., & Weiss, S. (1997). The development of subretinal
microphotodiodes for replacement of degenerated photoreceptors.
Ophthalmic Research, 29, 269–280.
Zrenner, E., Stett, A., Weiss, S., Aramant, R. B., Guenther, E.,
Kohler, K., Miliczek, K. D., Seiler, M. J., & Haemmerle, H. (1999).
Can subretinal microphotodiodes successfully replace degenerated
photoreceptors? Vision Research, 39, 2555–2567.
Vision Research 44 (2004) 1693–1706
www.elsevier.com/locate/visres
Simulation of artificial vision: II. Eccentric reading of full-page
text and the learning of this task
Jörg Sommerhalder *, Benjamin Rappaz, Raoul de Haller, Angelica Perez Fornos,
Avinoam B. Safran, Marco Pelizzone
Ophthalmology Clinic, Geneva University Hospitals, 1211 Geneva 14, Switzerland
Received 3 June 2003; received in revised form 13 January 2004
Abstract
Reading of isolated words in conditions mimicking artificial vision has been found to be a difficult but feasible task. In particular
at relatively high eccentricities, a significant adaptation process was required to reach optimal performances [Vision Res. 43 (2003)
269]. The present study addressed the task of full-page reading, including page navigation under control of subject’s own eye
movements. Conditions of artificial vision mimicking a retinal implant were simulated by projecting stimuli with reduced information content (lines of pixelised text) onto a restricted and eccentric area of the retina. Three subjects, naïve to the task, were
trained for almost two months (about 1 h/day) to read full-page texts. Subjects had to use their own eye movements to displace a
10° × 7° viewing window, stabilised at 15° eccentricity in their lower visual field. Initial reading scores were very low for two subjects
(about 13% correctly read words), and astonishingly high for the third subject (86% correctly read words). However, all of them
significantly improved their performance with time, reaching close to perfect reading scores (ranging from 86% to 98% correct) at the
end of the training process. Reading rates were as low as 1–5 words/min at the beginning of the experiment and increased significantly with time to 14–28 words/min. Qualitative text understanding was also estimated. We observed that reading scores of at least
85% correct were necessary to achieve ‘good’ text understanding. Gaze position recordings, made during the experimental sessions,
demonstrated that the control of eye movements, especially the suppression of reflexive vertical saccades, constituted an important
part of the overall adaptive learning process. Taken together, these results suggest that retinal implants might restore full-page text
reading abilities to blind patients. About 600 stimulation contacts, distributed on an implant surface of 3 × 2 mm², appear to be a
minimum to allow for useful reading performance. A significant learning process will however be required to reach optimal performance with such devices, especially if they have to be placed outside the foveal area.
© 2004 Elsevier Ltd. All rights reserved.
Keywords: Visual prosthesis; Simulation; Reading performance; Eccentric reading; Learning
1. Introduction
Visual prostheses for the blind are currently being
developed by several research groups (Chow & Chow,
1997; Dobelle, 2000; Humayun, 2001; Normann, Maynard, Rousche, & Warren, 1999; Rizzo & Wyatt, 1997;
Veraart et al., 1998; Zrenner, 2002). Technological advances are impressive, but several fundamental constraints have to be seriously considered before one can
hope to restore useful vision. For example, retinal
prostheses will be implanted at a fixed location in the eye
and they will, in all likelihood, subtend only a fraction
of the entire visual field. Furthermore, all envisioned
prostheses will consist of a finite number of discrete
stimulation contacts. The implications of these constraints need to be investigated, in order to establish
minimum requirements for useful artificial vision before
using such devices on patients.
* Corresponding author. Tel.: +41-2238-28420; fax: +41-223828382.
E-mail address: [email protected] (J. Sommerhalder).
doi:10.1016/j.visres.2004.01.017
We use simulations of artificial vision on normal
subjects to investigate this important issue. Pixelised
images are projected in a restricted viewing area, positioned at a fixed location in the visual field. In a previous
study (Sommerhalder et al., 2003), we demonstrated
that the amount of information conveyed by about 250–
300 pixels (or distinct stimulation spots in a retinal
prosthesis) is sufficient to allow close to perfect reading
of isolated four-letter words, if the stimulus is projected
in the central visual field. When the same images were
projected in eccentric areas, reading performance decreased dramatically with increasing eccentricity. Even
when using high pixel resolutions, reading was almost
impossible for eccentricities beyond 10°.
It is important to investigate the effect of stimulus
eccentricity on performance, because the anatomophysiology of the retina does not favour a foveal location for retinal prostheses (see e.g. Sjöstrand, Olsson,
Popovic, & Conradi, 1999). Retinal implants are primarily designed to stimulate neurons of the inner retinal
layers (bipolar and/or ganglion cells) in cases of blindness due to photoreceptor loss (e.g. retinitis pigmentosa). In the central fovea, these neurons are not present.
In the parafovea, they are arranged in several superimposed layers that make it difficult to activate them in
predictable patterns. The best sites, potentially preserving retinotopic activation without major distortion, are
located at an eccentricity of 10° or more. This means
that the vision of future users of retinal prostheses will
probably be restricted to small peripheral areas of their
visual field. Our ability to identify objects in the
periphery is, however, poor. Reading words of several
letters is especially difficult because of contour interaction,
the so-called ‘crowding effect’ (see e.g. Toet & Levi,
1992).
Although our acute experiments at 15° eccentricity
had shown very poor reading of isolated four-letter
words, performance could be improved impressively
upon systematic training (about 1 h/day for about 1
month; Sommerhalder et al., 2003). Whether this
promising result was due to the better control of undesirable reflexive eye movements or to a decrease of the
‘crowding effect’ 1 was unclear, but it suggests that retinal prostheses might successfully restore some reading
abilities in blind patients, even if the implant has to be
placed outside the fovea.
So far, our experiments were conducted using isolated
four-letter words and not full pages of text. They did not
require page navigation during reading. This means that
subjects did not have to move their gaze from one word
to the other and from the end of one line to the beginning of the next one. Page navigation implies not only
the stabilisation of the gaze on a particular point of
interest, but also micro-saccades to read lines of text as
well as larger saccades to jump from the end of one line
to the beginning of the next one. In previous literature,
page navigation has essentially been studied in connection with the use of special field of view magnifiers, intended as reading aids for low vision patients.
Beckmann and Legge (1996) measured reading rates of
normal and of low vision subjects in two conditions:
with horizontally drifting text 2 requiring no page navigation and with a closed-circuit television magnifier
(CCTV) 3 requiring ‘manual’ page navigation. Manual
page navigation resulted in significantly lower reading
rates. This effect was more pronounced on normal
subjects than on low vision patients, suggesting that
overall reading performance was reduced in these patients because of limitations due to other visual factors.
A second comparative study of the same research group
(Harland, Legge, & Luebker, 1998), including RSVP 4
text presentation and ‘mouse’-controlled page navigation, confirmed their previous findings. The use of RSVP
and drifting text presentation resulted in better reading
performance than the use of CCTV or ‘mouse’ navigation. Interestingly, they did not observe significant differences in reading rates across the four methods of text
presentation in a group of patients with central field
loss, i.e. for subjects who were forced to use eccentric
fixation for reading.
1 For a more detailed discussion about ‘crowding’, attention and
perceptual learning, see also Sommerhalder et al. (2003) and citations
therein.
It is difficult to predict from these data how patients
using retinal prostheses would cope with the page navigation problem. The most advanced retinal prostheses
envisioned at present are devices that will transform,
in situ, light falling on the retina into electrical stimulation currents. Such an implant design has the fundamental advantage that the user can scan his/her
environment using eye movements, even if the implant
might need to be placed extra-foveally. An additional
benefit is that these devices do not require
an extra-ocular camera, which avoids the delicate
issue of wired connections to the rapidly moving eye.
Neither horizontally drifting text nor RSVP realistically mimics text reading with such retinal implants,
since both methods are expressly intended to minimise eye movements. 5 ‘Mouse’-controlled and CCTV
reading are both quite unnatural conditions, because
they rely on manual page navigation.
In this study we used simulations to address the issue
of full-page text reading in conditions mimicking artificial vision provided by a retinal implant.
2 Text is drifting as a single line horizontally over a visualisation
screen. Thus the reader does not have to jump from one line to another
and can keep his gaze position quite stable.
3 A CCTV consists of a video camera equipped with a magnifying
lens and connected to a TV monitor. The reader can thus only see a
few characters at a time and has to move the video camera over the
lines of text.
4 Rapid serial visual presentation, involving no page navigation and
very few eye movements.
5 For other types of retinal implants, which are conceived to use an
external camera to capture the stimuli, eye movements will have to be
replaced by head movements.
On one hand, full-page text reading can be expected
to be more difficult than deciphering isolated words,
because successful reading of several lines of text requires page navigation abilities, i.e. well-controlled eye
movements, which can be difficult to achieve with a
restricted viewing area, located at a fixed position in the
visual field, especially if eccentric retinal locations are
used. Future users of retinal implants will have to reference reading eye movements to a non-foveal retinal
locus. Patients with central field loss develop one or
several preferred retinal loci (PRLs), in an attempt to
compensate for the missing fovea. Fletcher and Schuchard (1997) studied the locations of PRLs relative to
the macular scotoma, and how their patients managed
to use these PRLs (fixation, pursuit and saccadic ability). Of these patients, 54% showed some pursuit ability, 57%
were able to stabilise a stationary target within a
discrete retinal area, and 77% were able to move
their PRL to separate targets. Most of them, however, made
several saccades when moving from one target
to another, and also had difficulty maintaining stable target
fixation. Foveating saccades have shorter initiation
latencies and are faster than non-foveating saccades.
Whittaker, Cummings, and Swieson (1991) compared
the saccadic eye movements of patients with macular
degeneration with those of normal subjects. They found
that even if patients were able to consistently
direct images to their PRL, their saccades still kept the
characteristics of non-foveating saccades, suggesting
that patients with macular scotoma suppress rather
than adapt the foveating saccade mechanism. There is
very little literature concerning the time course of
saccadic adaptation to a non-foveal location. Heinen
and Skavenski (1992) studied this issue in monkeys.
They introduced bilateral foveal lesions in three adult
animals and found that new PRLs were stable within
two days, while the saccadic system did not stabilise for
at least two weeks. Two of the three animals were not
able to bring the target directly to the new fixation
locus.
On the other hand, full-page text reading might be
expected to be easier than deciphering isolated words,
because subjects can make use of context information to
facilitate reading. We are better at reading meaningful
sentences than random words (Fine & Peli, 1996; Latham & Whitaker, 1996). The benefits of context are
however controversial when it comes to peripheral vision. It has been suggested that readers with central field
loss would be less efficient in using context to facilitate
reading (see Baldasare & Watson, 1986 or Latham &
Whitaker, 1996). But this hypothesis is contradicted by
other studies. For example, Fine and Peli (1996) compared reading rates for meaningful sentences to reading
rates for random words for normally sighted subjects
and for subjects with central field loss. They found that
speed gains due to context were present and equivalent
for both groups of subjects when using RSVP and
scrolled text presentation.
2. Methods
We conducted two successive experiments. In the first
experiment, subjects were asked to read pixelised full-page texts using a viewing area, stabilised on the fovea.
In the second experiment, subjects were asked to perform the same task, but using a viewing area stabilised
at 15° eccentricity. In both experiments, we expected
that the subjects might adapt progressively to the task
and perform better with time. Thus, experimental sessions were repeated daily, until we were sure that scores
were stable. In the first experiment, reading scores asymptoted within a few sessions. In the second experiment, an important learning process was observed. A
period of almost two months was necessary until
eccentric reading scores asymptoted.
2.1. Subjects
Three young subjects (AD, female, 24 years old; DV,
female, 23 years old; DS, male, 30 years old), working at
the Geneva Eye Clinic, participated in both experiments.
All of them had normal or corrected to normal vision as
well as a normal ophthalmologic status. They were native French speakers and they were familiar with the
purpose of the study. All of them knew that two daily
sessions over a time period of several weeks would be
necessary to complete the experiment. They did not
participate in any of the previous studies on eccentric
reading.
All experiments were conducted according to the
ethical recommendations of the Declaration of Helsinki,
and were approved by local ethical authorities. 6
2.2. Experimental procedure
To simulate the visual percepts produced by a retinal
prosthesis (in our case a retinal implant, which transforms in situ incident light into electrical stimulation
signals), images were projected on a defined and stabilised area of the retina. Briefly, the position and content
of the stimulus were generated on a fast computer display used in association with a SMI EyeLink gaze
tracking system (Senso Motoric Instruments GmbH,
Teltow/Berlin, Germany) for online monitoring of the
gaze position. Gaze position data were used to move a
small viewing window over a screen displaying full-page
text (Fig. 1). The position of the viewing window relative
to the gaze position could be offset arbitrarily. The
viewing window was thus projected either onto the
central retina (no offset between viewing window and
gaze position) or onto a defined eccentric area of retina
(constant non-zero offset). The experimental set-up was
described in detail in a previous paper (Sommerhalder
et al., 2003).
Fig. 1. The experimental set-up used to simulate full-page text reading
in conditions mimicking vision with a retinal prosthesis. The subject
was asked to read a page of pixelised text, using her own eye movements to move a restricted viewing window on the computer screen.
6 Comité d’Ethique de la Recherche sur l’Etre Humain (CEREH) des
Hôpitaux Universitaires de Genève.
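The gaze-contingent placement of the viewing window can be illustrated with a minimal sketch. It is not the software used in the study: the function name, the screen calibration `px_per_deg`, the convention that screen y increases downward, and the choice to measure eccentricity to the window centre are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Window:
    left: float
    top: float
    right: float
    bottom: float


def viewing_window(gaze_x, gaze_y, px_per_deg,
                   width_deg=10.0, height_deg=7.0,
                   offset_x_deg=0.0, offset_y_deg=0.0):
    """Return the on-screen rectangle (in pixels) of the viewing window.

    The window centre is the current gaze position plus a constant offset,
    so the window always falls on the same retinal area.
    """
    # Hypothetical calibration: px_per_deg converts visual angle to pixels.
    cx = gaze_x + offset_x_deg * px_per_deg
    cy = gaze_y + offset_y_deg * px_per_deg
    half_w = 0.5 * width_deg * px_per_deg
    half_h = 0.5 * height_deg * px_per_deg
    return Window(cx - half_w, cy - half_h, cx + half_w, cy + half_h)


# Central condition: no offset between window and gaze position.
central = viewing_window(400, 300, px_per_deg=20)
# Eccentric condition: window centred 15 degrees below the gaze point
# (lower visual field, assuming screen y grows downward).
eccentric = viewing_window(400, 300, px_per_deg=20, offset_y_deg=15)
```

Calling the function once per gaze sample keeps the window locked to the chosen retinal location, which is the property the simulation relies on.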
From previous experiments, we knew that a sampling
density of 286 pixels, distributed over an area corresponding to 10° × 3.5° of the visual field, would allow for
close to perfect recognition of isolated words. We observed, however, that the height of such a viewing window limited page navigation, because it did not
allow the subject to see the preceding or the following line of the
text. Therefore, we conducted pilot experiments in central vision to determine a more adequate height of the
viewing window. We found that doubling the height of
the viewing window to 7° made two lines
of text visible at once, which greatly
helped to orient page navigation. When the
viewing area height was increased to 10°, more than two lines were
visible simultaneously, but the subjects did not experience this as a
further improvement. We therefore
decided to use a 10° × 7° viewing window for the present
experiments. Using the same pixel density as in our
previous studies, this resulted in an area containing 572
pixels. 7 Such a viewing area would correspond to a
surgically manageable implant size of 3 × 2 mm² as well
as to a technically feasible contact-to-contact spacing of
about 100 µm.
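The arithmetic behind these figures can be checked with a short sketch. It assumes square pixels at a uniform density and a 3 mm implant width covering the 10° window width; the function and variable names are illustrative only.

```python
import math


def scale_window(ref_pixels, ref_w_deg, ref_h_deg, new_w_deg, new_h_deg):
    """Scale a pixel count to a new window size at constant pixel density."""
    density = ref_pixels / (ref_w_deg * ref_h_deg)  # pixels per square degree
    per_deg = math.sqrt(density)  # pixels per degree, assuming square pixels
    total = density * new_w_deg * new_h_deg
    return total, per_deg * new_w_deg, per_deg * new_h_deg


# Reference: 286 pixels in a 10 x 3.5 degree window; new window: 10 x 7 degrees.
total, cols, rows = scale_window(286, 10.0, 3.5, 10.0, 7.0)

# Contact spacing if the 10 degree width maps onto a 3 mm wide implant.
spacing_um = 3.0 / cols * 1000.0
```

Under these assumptions the sketch gives 572 pixels arranged as roughly 28.6 × 20, and a spacing of about 105 µm, in line with the values quoted in the text.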
2.3. Generation and presentation of the stimuli
Stimuli were pre-pixelised bitmap images. The texts
used to generate these images were extracted from the
Swiss newspaper “Le Temps”. This newspaper is written
in everyday French. It is a good representative
of a common information newspaper, being neither too
elementary, nor too sophisticated. One hundred small
articles of diverse contents (culture, politics, economics,
sports, etc.) were downloaded from the website of the
newspaper. 8
7 This value represents an array of 28.6 × 20 pixels, using the same
pixel density as used for our previous study (Sommerhalder et al.,
2003).
The texts of these articles were presented on the
screen using a Helvetica (i.e. Arial) font size as in our
previous experiments on single word recognition. At a
viewing distance of 57 cm, the height of the small letter
‘x’ corresponded to a visual angle of 1.8°. In these
conditions, a segment of seven successive lines of text
could be displayed on the screen, and about six successive letters could be visualised at once in the 10° width of
the viewing window. Hyphenation was used to allow for
the presentation of a maximum of words, resulting in an
average of about 25 words per text segment. Each article
was divided into such successive segments and pixelised 9 using commercial software (Adobe Photoshop
5.5). Fig. 2a shows an example of such a stimulus.
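The relation between visual angle and physical size on the screen at the 57 cm viewing distance can be sketched as follows. The formula is standard trigonometry; the function names are illustrative and not taken from the study's software.

```python
import math


def size_from_angle(angle_deg, distance_cm):
    """Physical size (cm) subtending a given visual angle at a given distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def angle_from_size(size_cm, distance_cm):
    """Visual angle (degrees) subtended by a given size at a given distance."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# x-height of 1.8 degrees and window width of 10 degrees, both at 57 cm.
x_height_cm = size_from_angle(1.8, 57.0)
window_width_cm = size_from_angle(10.0, 57.0)
```

At 57 cm, 1 cm on the screen subtends very nearly 1°, which is why this viewing distance is convenient: the x-height of 1.8° corresponds to about 1.8 cm on the display.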
Subjects were tested monocularly using their dominant eye. During the experiment, they were comfortably
seated facing the stimulation screen, and wearing the
headband mounted SMI eye tracking system. At the
beginning of each session the eye to screen distance was
checked, and a standard nine-point calibration of the
eye-tracker was performed. Subjects were requested not
to move during the session. Then, the first segment of an
article, randomly chosen out of the 100-article library,
was presented on the stimulation screen. The subjects
could use their own eye movements to scan the stimulation screen, but only a small part of the entire text
segment was visible through the 10° × 7° window (see
Fig. 2b). During each session, subjects had to read aloud
the first four text segments (about 100 words) of
an article. Their voice and gaze position were recorded
for further analysis. After the presentation of each text
segment, the calibration was checked for possible drifts
or artefactual movements and, if necessary, slightly
corrected to ensure exact control of the viewing
window position during the entire experimental session.
In very rare cases, these controls revealed a significant
drift, meaning that the position of the viewing window
was not stable during the presentation of the text segment. The results from this segment were discarded and
an additional text segment was read. At the end of each
session (i.e., after reading four successive text segments),
a qualitative comprehension test was performed by
questioning the subject on the content of the article.
A different article was used in each session. None of the
subjects read an article twice.
In experiment 1, several sessions were conducted
using a centrally located viewing window. This experiment lasted until subjects became familiar with the task
of reading pixelised text, using a small viewing window
for page navigation. A stable percentage of correctly
read words was used as criterion to stop the experiment.
Experiment 2, testing eccentric reading, began only
when the subjects had adapted to central reading. To
investigate possible learning effects, experimental
sessions were performed repeatedly for a period of almost two months. In general, two sessions were conducted each working day of the week (5 days per week).
The regular daily flow of sessions was interrupted for
weekends, and exceptionally for brief vacations. The
duration of each experimental session was variable
throughout the experiment, but it never exceeded 30
min. Two sessions therefore represented less than 1 h of
daily training. This experiment was stopped when
reading scores asymptoted. Between 55 and 68 sessions
per subject were necessary to achieve this criterion.
8 http://www.letemps.ch/
9 Mosaic pixelisation (i.e., square pixels of uniform grey level) was
used. Such simple patterns were adequate to simulate the reduced
information content (e.g. finite quantisation) of the stimuli, but are not
intended to mimic exactly the percepts elicited by electrical activation of the
retina.
Fig. 2. (a) A segment of pixelised full-page text as presented on the
computer screen. The three dots, at the beginning and at the end,
indicate a text segment situated somewhere in the body of an article.
Texts were not justified. (b) The screen viewed by the subject. Only a
small part of the text could be seen through a 10° × 7° viewing window.
The rest of the screen was blanked by a uniform grey foreground. The
gaze position was constantly monitored by the system and the viewing
window moved accordingly on the screen. The content of the viewing
window was thus permanently projected on the same retinal area at an
eccentricity of 0° for experiment 1 and, as illustrated, at an eccentricity of
15° for experiment 2. For illustration purposes only, the whole text
segment was made slightly visible in this figure. This was not the case
during the experiments.
2.4. Data analysis and statistics
Reading performance was measured in terms of
reading scores (expressed in percentage of correctly read
words per session) and in terms of reading rate (expressed in number of correctly read words per minute
during each session).
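These two measures reduce to simple ratios, sketched below. The function names and the example figures (86 correct words out of 100, read in 300 s) are hypothetical and serve only to show the units.

```python
def reading_score(correct_words, total_words):
    """Percentage of correctly read words in a session."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 100.0 * correct_words / total_words


def reading_rate(correct_words, duration_s):
    """Correctly read words per minute over a session lasting duration_s seconds."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return 60.0 * correct_words / duration_s


# Hypothetical session: 86 of 100 words read correctly in 300 seconds.
score = reading_score(86, 100)
rate = reading_rate(86, 300)
```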
Reading scores expressed on a proportional percentage scale are, however, not suitable for statistical
analysis. It is well known that with proportional scales,
variance depends on the mean. In other words,
the data are not normally distributed around the mean
and scale values are not linear in relation to the test
variability. This problem can be solved by using an
arcsine transformation. Studebaker (1985) proposed
using so-called “rationalised arcsine units” (rau), values
that are numerically close to the original percentage
range, while retaining all of the desirable properties of
the arcsine transform. Therefore, reading scores were
statistically analysed using scores expressed in rau. For
better clarity however, an approximate %-correct
scale 10 is indicated on the right ordinates of the graphs.
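For reference, the rationalised arcsine transform can be written as a short function. This is a sketch of the formula as published by Studebaker (1985), taking the number of correctly read words and the total number of words in a session; the function name is illustrative and the code is not taken from the study's analysis scripts.

```python
import math


def rau(correct, total):
    """Rationalised arcsine units for `correct` items out of `total`."""
    # Arcsine transform of the score, in radians (range 0 to pi).
    theta = (math.asin(math.sqrt(correct / (total + 1)))
             + math.asin(math.sqrt((correct + 1) / (total + 1))))
    # Linear rescaling so that values are close to percentages mid-range.
    return (146.0 / math.pi) * theta - 23.0


mid_score = rau(50, 100)     # 50% correct
top_score = rau(100, 100)    # 100% correct
```

A score of 50% maps to 50 rau, while perfect scores map to values above 100 rau, which is why ordinates expressed in rau can extend beyond 100. The dependence on `total` is the sample-size effect mentioned in the footnote.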
Qualitative comprehension of the text was judged by
two examiners using an arbitrary four-grade scale:
‘None’ meaning that the text was not understood at all;
‘insufficient’ meaning very partial comprehension,
insufficient to understand the issue reported in the text;
‘good’ meaning that the main issue was grasped but not
all details; ‘excellent’ meaning a perfect and detailed
comprehension of the text. 11 After each reading session,
the subjects had to describe what they had read and were
then questioned by the two examiners, who had no
difficulty assigning one of the four comprehension
levels. Subjects spontaneously reported being satisfied
when they reached ‘excellent’ or ‘good’ levels of
10 Note that the %-correct to rau transformation depends on
sample size (in our case the total number of words used in one session).
Therefore, our approximate %-correct scales are based on the average
number of words computed across all sessions presented on the graphs.
11 We used such an uncommon four-level scale, because it is much
easier to judge first if a subject understood the main issue of a text and
then to ask some questions to determine whether the subject grasped
only some parts of the text or really had a detailed
understanding.
$y = y_0 + a\,(1 - e^{-bx}).$
We used a simple linear correlation (Pearson’s correlation) to determine if training (expressed as the
number of sessions) had a statistically significant effect
on performance.
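The two analysis tools named here can be sketched in a few lines. The sketch assumes a rising exponential of the form y = y0 + a(1 − exp(−bx)), with x the session number, and computes Pearson's r directly from its definition. The parameter values are hypothetical: y0 and a are chosen so that the curve runs from 71 to 122 words/min, and b is arbitrary. No fitting routine is shown.

```python
import math


def exp_rise(x, y0, a, b):
    """Rising exponential: starts at y0 and approaches y0 + a for large x."""
    return y0 + a * (1.0 - math.exp(-b * x))


def pearson_r(xs, ys):
    """Pearson's linear correlation coefficient between two sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical learning curve over 12 sessions (b = 0.3 is an assumption).
sessions = list(range(1, 13))
rates = [exp_rise(s, y0=71.0, a=51.0, b=0.3) for s in sessions]
r = pearson_r(sessions, rates)
```

In this parameterisation y0 is the initial performance and y0 + a the asymptote, so the pair of values gives the "from ... to ..." figures for a learning curve.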
[Figure residue: panels for subjects AD, DV and DS; panel label (a); ordinate label “Reading rate [words/min]”.]
3. Experiment 1: central reading of full-page text
Experiment 1 was dedicated to familiarise normal
subjects with the unusual task of reading pixelised full-page text, using a small viewing window for page navigation. For this easier experiment, subjects read six text
segments per session instead of the four segments per
session used in the more difficult experiment 2.
Fig. 3 presents reading performance in central vision
versus session number for each subject. All three subjects performed immediately very well. Perfect, or close
to perfect reading scores (i.e. >95% correct) were already
achieved in the first sessions. No significant learning
effect was observed in the analysis of reading scores
versus time.
Reading rates improved with time for all three subjects. Analysis of the experimental data, using the
exponential function presented in Section 2, revealed
that the average reading rate almost doubled from 71 to
122 words per minute for AD. It improved from 65 to 89
words per minute for DV, and from 60 to 72 words per
minute for DS. This improvement was however statistically significant only for subject AD (Pearson’s correlation: r ¼ 0:78, p ¼ 0:003). Interestingly, subject AD
achieved at the end of this experiment reading rates,
which were quite superior to those of the two other
subjects. These reading rates can be compared to those
achieved by the same subjects in ‘normal’ conditions.
Reading rates were measured to be significantly higher,
ranging between 160 and 180 words per minute, for
articles directly read from the same journal.
In conclusion, experiment 1 clearly demonstrates that
useful full-page text reading can be obtained under
conditions mimicking artificial vision in the central visual field. The relevant information for reading could be
transmitted and captured by the visual system. The increased difficulty of page navigation using a restricted
viewing window, and the fact, that this viewing window
contained pixelised stimuli, resulted in reading rates,
significantly below normal values, but almost all words
were correctly deciphered.
comprehension, associated with a reading rate of about
20 words per minute. From a clinical point of view,
these two latter levels might be considered a gratifying
and useful performance on full-page text reading.
‘Learning curves’ were established on the basis of the
evolution of reading performance versus time. Data
were fitted using the exponential non-linear regression function given above.
[Fig. 3 plots: (a) words correctly read [rau], with an approximate %-correct scale, and (b) reading rate [words/min], each versus session number, one panel per subject (AD, DV, DS).]
Fig. 3. Reading performance during experiment 1 for three subjects
(AD, DV and DS). Full-page texts were read using central vision
(10° × 7° viewing window containing 572 pixels). (a) Reading scores as
well as (b) reading rates versus session number. The solid lines indicate
the best fits to the data.
4. Experiment 2: eccentric reading of full-page text
Experiment 2 was started once subjects had adapted
to performing the task in central vision. Based on our
previous results on eccentric reading of isolated words
(Sommerhalder et al., 2003), we expected that eccentric
reading of full-page text might also require significant
adaptation to reach maximum performance.
Fig. 4 presents individual reading scores versus session
number for full-page text reading at 15° eccentricity.
Experimental data were fitted with the exponential
function presented in Section 2, to average out
session-to-session variability. The performances of two out of the
three subjects tested showed massive improvements
during the course of the experiment. At the beginning of
the experiment, subjects DV and DS were able to
identify only about 13% of the words, and they ended up
with scores of 86% and 98% correct, respectively. In
contrast, subject AD already performed very well in the
initial sessions (about 85% correct), and ended up with
almost perfect scores (98% correct). Her learning curve
was therefore less spectacular. However, reading score
improvements were highly statistically significant for all
three subjects (Pearson’s correlation: r = 0.57,
p < 0.0001 for AD; r = 0.81, p < 0.0001 for DV; and
r = 0.77, p < 0.0001 for DS).
[Fig. 4 plots: words correctly read [rau], with an approximate %-correct scale, versus session number, one panel per subject (AD, DV, DS).]
Fig. 4. Reading scores during experiment 2 versus session number for
three subjects (AD, DV and DS). Full-page texts were read in eccentric
vision (15° eccentricity in the lower visual field), using a viewing
window of 10° × 7° containing 572 pixels. The solid lines indicate the best
fits to the data.
The second important parameter indicating the
presence of learning is the reading rate. Fig. 5 presents
reading rates achieved during experiment 2. Reading
rates improved for all three subjects. Subject AD improved
from 5 to 26 words per minute (w/m), subject
DV from 3 to 14 w/m, and subject DS from 1 to 28 w/m.
In spite of a large session-to-session variability (especially
for subject AD), reading rate improvements were
statistically significant for all three subjects (Pearson’s
correlation: r = 0.74, p < 0.0001 for AD; r = 0.81,
p < 0.0001 for DV; and r = 0.90, p < 0.0001 for DS). At
the end of experiment 2, reading rates for eccentric
reading were still significantly below values obtained in
similar conditions for central reading, and of course
below normal reading rates, but they were remarkable
when compared to what subjects achieved at the
beginning of experiment 2. It is also important to note
that reading rates were still improving after almost two
months of training (i.e. at the time we terminated the
experiment because reading scores had reached an asymptote).
This suggests that higher reading rates could still have
been achieved with further practice.
[Fig. 5 plots: reading rate [words/min] versus session number, one panel per subject (AD, DV, DS).]
Fig. 5. Reading rates during experiment 2 versus session number for
three subjects (AD, DV and DS). Full-page texts were read in eccentric
vision (15° eccentricity in the lower visual field), using a viewing
window of 10° × 7° containing 572 pixels. The solid lines indicate the best
fits to the data.
Word recognition scores and reading rates can be
measured accurately, and are therefore helpful experimental
values to demonstrate changes in performance,
but they do not reflect the degree to which the content of the
text was understood. Text comprehension is not easy to
quantify, but we tried to assess this parameter using a
qualitative four-level scale (see Section 2). Fig. 6 presents
the evolution of text comprehension for the three
subjects throughout experiment 2. During initial sessions,
subjects DV and DS experienced major
problems in understanding the texts they read. ‘Good’
understanding could only be achieved after 16 sessions
or more. In contrast, subject AD achieved ‘good’ to
‘excellent’ text comprehension from the beginning. At
the end of experiment 2, subjects AD and DS systematically
achieved ‘excellent’ text comprehension, but
subject DV did not. These results fit well with the performance
curves in Figs. 4 and 5, where subject AD achieved high
reading scores from the beginning and subject DV finished
with the lowest performances.
[Fig. 6 plots: text comprehension [arbitrary units] on the four-level scale (none, insufficient, good, excellent) versus session number, one panel per subject (AD, DV, DS).]
Fig. 6. Text comprehension estimates during experiment 2 versus
session number for the three subjects (AD, DV and DS).
[Fig. 7 plots (all subjects): text comprehension [arbitrary units] (none, insufficient, good, excellent) versus words correctly read [rau], with an approximate %-correct scale, and versus reading rate [words/min].]
Fig. 7. Text comprehension estimates versus reading scores and
reading rate. All the data collected on the three subjects were merged.
Box plots indicate median values, 25th and 75th percentile values (grey
box) as well as 10th and 90th percentile values (vertical bars). Circles
indicate outliers.
It is interesting to plot the results of text comprehension
versus reading scores and reading rates (Fig.
7). Reading scores of 85% correct or more were required
to reach ‘excellent’ or ‘good’ levels of text
comprehension. Below 60% correct, text understanding
seemed to be impossible. The distribution of comprehension
levels versus reading rates was more variable.
For example, ‘excellent’ or ‘good’ comprehension levels
were reached over a large range of reading rates, and
even occasionally at reading rates below 10 w/m. Good
text comprehension thus appeared to be more closely
associated with high reading scores than with high reading
rates.
Taken together, results from experiment 2 demonstrate
that an important learning process occurred for
eccentric reading of full-page text. This process was,
however, expressed differently across subjects. Subject
DS, for example, improved impressively in each of the
three measured parameters throughout the experiment.
In contrast, subject AD began the experiment with relatively
high reading scores and good text comprehension. In her
case, the learning process was best expressed
by a major improvement of reading rates. Both subjects
AD and DS clearly achieved functionally useful eccentric
full-page reading after almost two months of daily
training. Subject DV, while showing significant
improvements in all aspects, did not achieve the same
level of performance in the same period of time.
4.1. Analysis of eye movements
During initial sessions, full-page text reading using an
eccentric part of the retina was reported by the subjects
to be “very difficult”. The same task, however, became
“much easier” with time. As the gaze position on the
stimulation screen was recorded every 4 ms throughout
all experiments, we were able to analyse how subjects
used their eye movements to perform the task.
Fig. 8 illustrates such data with an example, comparing,
for the same subject, gaze position recordings
made during the first and the last experimental session.
During the first session, the subject experienced major
difficulties in controlling her eye movements. The viewing
window wandered across the whole screen, irrespective
of the positions of the lines of text. During the
last session, the subject had learned to control her eye
movements, and the viewing window focussed quite
accurately on the successive lines of text. Occasionally
the subject traced back along the same line to view
specific words again.
We quantified gaze stability by computing histograms
of the vertical position of the viewing area during the
test (plotted on the right side of Fig. 8). For the first
session, this histogram is broad and roughly centred on the
screen. There is no evidence of successful focusing on
single lines of the text. Numerous uncontrolled reflexive
vertical saccades (i.e., automatic foveation of the
stimuli) could not be prevented. For the last session, the
histogram is completely different. A series of small
peaks, with a vertical spacing corresponding to that of
the lines of text, can be observed in the histogram. This
analysis also revealed that the subject tended
to place the centre of the viewing area slightly below the
lines being read, probably minimising in this way the
eccentricity of the relevant part of the target image.
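The histogram analysis described above can be sketched in a few lines. This is only an illustration under stated assumptions: the bin height, the sample values and the function name are hypothetical, and the paper does not specify how its histograms were binned.

```python
from collections import Counter

def vertical_histogram(y_positions, bin_height):
    """Count gaze samples per vertical bin; keys are the lower bin edges."""
    if bin_height <= 0:
        raise ValueError("bin_height must be positive")
    counts = Counter()
    for y in y_positions:
        counts[(y // bin_height) * bin_height] += 1
    return dict(sorted(counts.items()))

# Hypothetical vertical gaze samples (screen units): the gaze dwells near
# two lines of text and crosses the gap between them once.
samples = [104, 108, 111, 115, 150, 204, 207, 211, 213, 216]
hist = vertical_histogram(samples, bin_height=20)
print(hist)
```

Well-controlled reading shows up as narrow peaks separated by the line spacing, as in the last session; a broad distribution without peaks corresponds to the wandering gaze of the first session.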
How did the overall control of eye movements improve
during the experiment? On-line recordings of the
gaze position were used to compute the mean cumulative
length of vertical eye movements for each experimental
session. Fig. 9 presents fits to these data for each
subject. The mean cumulative length of vertical eye
movements decreased dramatically during the course of
the experiment for all subjects. Initial values ranged
between 35 and 48 m per text segment, while final values
dropped to 5–9 m per text segment, a roughly fivefold decrease.
For subjects DV and AD, these values reached an asymptote
within the first 10–20 sessions. For subject DS, they were still
decreasing when experiment 2 terminated. This quantitative
analysis clearly demonstrates that the control of
eye movements under such unusual reading conditions
was one prominent factor in the entire learning process.
However, this was only one factor affecting performance,
since reading scores and reading rates continued
to progress after vertical eye movements had stabilised
in two out of the three subjects.
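A minimal sketch of the cumulative-length measure follows, assuming it is the sum of absolute sample-to-sample vertical displacements averaged over the text segments of a session. The function names and the sample traces are hypothetical; the result is in the same length unit as the input.

```python
def cumulative_vertical_path(y_positions):
    """Sum of absolute vertical displacements between successive samples."""
    return sum(abs(b - a) for a, b in zip(y_positions, y_positions[1:]))

def mean_path_per_segment(segments):
    """Average cumulative vertical path over the text segments of a session."""
    if not segments:
        raise ValueError("need at least one text segment")
    return sum(cumulative_vertical_path(s) for s in segments) / len(segments)

# Hypothetical vertical traces: one steady, one wandering across the screen.
steady = [100, 101, 100, 100, 101]
wandering = [100, 160, 90, 170, 80]
print(cumulative_vertical_path(steady))      # 3
print(cumulative_vertical_path(wandering))   # 300
print(mean_path_per_segment([steady, wandering]))
```

Because every displacement is summed regardless of direction, small oscillations and large saccades both add to the total, in line with the note in the caption of Fig. 9 that all vertical eye movements are included.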
Fig. 8. Trajectory of the centre of the viewing window (solid line)
relative to the text during (a) the first, and (b) the last experimental
session of experiment 2 for subject AD. The panels on the right represent frequency histograms of the vertical coordinates of the trajectory recorded every 4 ms. Grey bars indicate the position of the lines of
text.
4.2. Control experiments
Two of the three subjects (AD and DS) participated
in additional control experiments. Immediately after
completion of experiment 2, eight successive experimental
sessions were dedicated to tests using the untrained
(non-dominant) contra-lateral eye, and eight
additional sessions were conducted using stimuli presented
at maximum screen resolution. After a rest
period of three months for subject DS, and six months
for subject AD, both subjects were tested again.
Fig. 10 summarises the results of these control
experiments, which were conducted to address three
different issues: (1) Could the benefits of learning gathered
with one eye be transferred to the other eye? No
significant difference in performance was found for full-page
text reading using the untrained contra-lateral eye.
Thus, learning accumulated during experiment 2 appeared
to be fully transferred to the contra-lateral eye.
(2) Could increased image resolution increase performance?
The use of full-screen resolution did not improve
reading accuracy (expressed in % of correctly read
words), but it improved reading rates. For subject DS,
this improvement was highly statistically significant
(p < 0.0001); for subject AD it was at the limit of
significance (p = 0.013). (3) Would the benefits of learning
persist after a significant period of non-practice? No
significant change in reading scores was found after
several months of non-practice. We observed a tendency
towards lower reading rates (about 20% reduction), but
this trend was not statistically significant. This indicates
that learning of eccentric full-page text reading was
generally preserved, at least for a period of several
months.
Fig. 9. Mean cumulative length of vertical eye movements per text
segment versus session number in experiment 2. Lines represent
best fits to individual data for subjects AD, DV and DS. Note that
absolutely all vertical eye movements, including saccades and
microsaccades, are summed up in this computation.
[Fig. 10 plots: reading rate [words/min] and words correctly read [rau] for subjects AD and DS. Legend: (1) trained eye at the end of experiment 2; (2) non-trained contralateral eye; (3) trained eye using stimuli at full screen resolution; (4) trained eye 6 months (AD) or 3 months (DS) after completion of experiment 2.]
Fig. 10. Mean performances (reading scores and reading rates) observed
in the control experiments conducted on subjects AD and DS.
Mean reading scores and reading rates were computed on the basis of
eight experimental sessions for conditions 1–3. For condition 4,
four experimental sessions were conducted on AD, and two on DS.
Error bars indicate SDs.
5. Discussion
In the present study, subjects had to move, under the
control of their own eye movements, a small viewing
window on a computer screen to read full pages of
pixelised text. When the window was placed in the centre
of the visual field, reading performances were (almost
immediately) close to perfect. In contrast, when the
window was placed at 15° eccentricity in the lower visual
field, reading performances dropped markedly and were
much poorer in two of the three subjects. However, all
subjects improved spectacularly after almost two
months of daily training. At the end of the study, two
out of the three subjects reached very high percentages
of correctly read words, reading rates of 25–30 w/m (footnote 12)
and repeatedly “excellent” or “good” levels of text
comprehension. The third subject improved impressively
during the course of the study, but terminated the
experiments at a lower performance level, perhaps because
her learning process was not yet completed.
Control measurements demonstrated inter-ocular
transfer as well as persistence of learning for this difficult
task. Taken together, these results indicate that useful
full-page text reading can be achieved in conditions
mimicking a retinal implant.
During their adaptation to full-page reading using an
eccentric area of the visual field, subjects had to cope
with several difficulties: suppression of unwanted
reflexive eye movements; scanning several lines of text
using an eccentric and restricted viewing window;
focussing attention on this peripheral region of the visual
field; extracting a maximum of information out of
low-resolution (pixelised) stimuli; reconstructing meaningful
sentences out of words and phrase fragments; and this
list is surely not exhaustive. Although all these difficulties
had to be surmounted to achieve the task, it is
impossible to analyse them in isolation. It is,
however, interesting to discuss in more detail some
factors potentially contributing to the learning process
as well as others limiting reading performance.
[Footnote 12] Reading rates of 25 w/m seem to be very low compared to
‘normal’ values of more than 100 w/m. Very few people would want to
read a novel at 25 w/m, but greatly reduced reading rates are still useful
for daily-living reading tasks, such as reading price tags or
correspondence (see e.g. Whittaker & Lovie-Kitchin, 1993, or Rumney, 1995,
who estimate that 40 w/m would be a reasonable value). In our clinical
experience, low vision patients (e.g. with central scotoma) appreciate
being able to read even at the lowest reading rates.
5.1. Analysis of the learning process
Monitoring of eye movements throughout the learning
process indicated clearly that an important part of
the overall learning process could be attributed to the
progressive suppression of uncontrolled reflexive eye
movements. Subjects learned to reference their eye
movements for reading to a non-foveal and even highly
eccentric retinal locus. The temporal evolution of vertical
saccades (Fig. 9) was, however, not perfectly correlated
with the evolution of other measures, such as
reading scores (Fig. 4), reading rate (Fig. 5), or text
comprehension (Fig. 6). This is most evident for subject
DV. While she could already control her vertical eye
movements after about five sessions, her reading scores
required more than 30 sessions to reach an asymptote, and both
her reading rate and her text comprehension were still
progressing slightly after 60 sessions. This mismatch, also
observed for the two other subjects, suggests that other
factors were influencing reading performance during the
overall learning process.
In a previous study, we observed significant learning
effects for eccentric reading of pixelised isolated words
(Sommerhalder et al., 2003). As in the present study,
these improvements took a time period of about
70 experimental learning sessions. These experiments
did not require page navigation, which implies that one
important component of the overall learning process
seems to be independent of the accurate control of eye
movements and that it is likely to be associated with
performance improvements in deciphering eccentric
low-resolution stimuli. Crist, Li, and Gilbert (2001)
suggested that this kind of perceptual learning is
accompanied by a concomitant decrease of the
“crowding effect”. The decrease of crowding has been
found to be related to attention (Leat, Li, & Epp, 1999),
which in turn can be improved by learning (see e.g.
Sireteanu & Rettenbach, 2000). This would mean that a
significant decrease of the “crowding effect” is very
likely to be an important part of the overall learning
process.
5.2. Text comprehension and influence of context
It is worth noting that subjects had to achieve
relatively high reading scores, above 85% of correctly
read words, to reach “good” or “excellent” text
comprehension levels, i.e. to achieve useful text
comprehension. Lower reading scores were almost always paired
with insufficient text comprehension. This finding, based
on a very simple qualitative comprehension test, has to
be interpreted with caution, but it indicates that
reading scores, which can be measured in a quantitative
way, have to be at high levels to allow for useful reading.
The influence of context information on performance
remains a difficult issue to assess. The comparison of the
present data on full-page text reading to those of our
previous study using isolated words is interesting in this
respect. After training, we observed mean reading scores
of about 75% correct for isolated words (two subjects in
Sommerhalder et al., 2003) and mean reading scores of
about 94% correct for full-page text reading (three
subjects in this study). While this difference is not
statistically significant for such a small number of subjects,
it is in agreement with the results reported by Fine and
Peli (1996) and Fine, Hazel, Latham, and Rubin (1999),
supporting the hypothesis that the use of context
information helps reading performance, even in eccentric
reading.
5.3. Eccentric versus central full-page text reading
At the end of experiment 2, all subjects had reached
relatively efficient full-page text reading. It was therefore
interesting to compare, quantitatively and qualitatively,
their performances to those achieved with central
reading in similar experimental conditions.
Table 1 compares ‘final’ reading performances,
averaged across the last four sessions of experiment 1
(central reading), to those averaged across the last four
sessions of experiment 2 (eccentric reading). Final
reading scores across the two conditions were not
statistically different for two of the three subjects. Only
subject DV presented significantly lower scores for
eccentric reading (p = 0.0004). The pattern of results
was, however, quite different for reading rates. Average
reading rates using eccentric vision were considerably
lower (by a factor of 2.5–5.8) than those achieved with
central vision, confirming that target eccentricity was
one major factor limiting the reading rate. Other authors
have already reported low reading rates for eccentric
vision. Wensveen, Bedell, and Loshin (1995), for
example, found that simulated central scotomata produced
dramatic decrements of reading rates. Simulated
8° central scotomata resulted in a threefold reduction of
the reading rates for their younger subjects. The experimental
conditions in their study were, however, markedly
different from those used in the present study. It is
therefore difficult to compare results quantitatively. In
this context, it should also be noted that, when experiment
2 was terminated, reading rates had not yet reached an
asymptote. We believe, however, that only subject DS would
have been able to further increase his reading rate with
prolonged training (see Fig. 5).
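The size of this slowdown can be checked directly from the mean reading rates listed in Table 1. The short sketch below simply divides the central rate by the eccentric rate for each subject; the variable names are illustrative.

```python
# Mean reading rates (words/min) taken from Table 1.
central = {"AD": 117.7, "DV": 87.4, "DS": 71.8}
eccentric = {"AD": 27.6, "DV": 14.9, "DS": 29.0}

# Ratio of central to eccentric reading rate for each subject.
ratios = {subject: central[subject] / eccentric[subject] for subject in central}

for subject, ratio in ratios.items():
    print(f"{subject}: central reading is {ratio:.2f} times faster")
```

The ratios come out at about 4.3 (AD), 5.9 (DV) and 2.5 (DS), which agrees with the quoted range of 2.5–5.8 to within rounding of the tabulated means.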
It is also interesting to compare, for the same subject,
eye movements when scanning full-page text using either
a central or an eccentric viewing window. Fig. 11 presents
the trajectory of the centre of the viewing area
relative to the lines of text during the last session of
experiment 1 (central reading) for subject AD. Equivalent
data on the same subject, but collected for eccentric
reading at the end of experiment 2, were presented in Fig.
8b. While the similarities are striking, there are noticeable
differences. The subject traced backwards much less
frequently when using a central viewing area, the vertical
gaze position was better controlled, and finally, the
viewing window appeared to be more accurately centred
on the lines, not slightly below.

Table 1
Mean reading scores and reading rates at the end of experiment 1 (central vision) compared to those at the end of experiment 2 (eccentric vision)

Mean reading scores
Subject   Central vision       Eccentric vision (15°)   p
          rau       %          rau       %
AD        116.5     99.9       113.0     99.3           ns
DV        113.6     99.4       89.8      88.2           0.0004
DS        117.8     100        109.1     98.2           ns

Mean reading rates
Subject   Central vision       Eccentric vision (15°)   p
          words/min            words/min
AD        117.7                27.6                     <0.0001
DV        87.4                 14.9                     0.0002
DS        71.8                 29.0                     <0.0001

Means are calculated on the basis of the four last experimental sessions in each condition. p values indicate the statistical significance for the
difference between means.
ns: non-significant.

Fig. 11. Trajectory of the centre of the viewing window (solid line)
relative to the text during the last experimental session of experiment 1
(central reading) for subject AD. The panel on the right represents the
frequency histogram of the vertical coordinate of the gaze position
recorded every 4 ms. Grey bars indicate the position of the lines of text.
Central reading obviously remained more efficient
than eccentric reading, even after a prolonged two-month
period of adaptation to the task. This difference
might have several causes.
The width of the viewing window used in this study
limited the maximum visual span to about 6 letters.
Legge, Mansfield, and Chung (2001) proposed that the
average visual span shrinks from at least 10 letters in
central vision to about 1.7 letters at 15° eccentricity, this
low value being, however, increased upon prolonged
observation times. As a consequence, the visual span
experienced by the subjects was artificially limited to 6
letters in central vision, thus reducing reading rates for
central vision. At 15° eccentricity, this effect was even
more pronounced, not because of this experimental
limitation, but because of the ‘natural’ reduction of the
visual span at high eccentricity. Subjects therefore had
either to increase the number of saccades to decipher a
given word, or to increase fixation time to extend the
visual span; both strategies lead to lower reading
rates, which is consistent with our experimental observations.
Finally, one may wonder to what extent lower reading
rates, observed with eccentric vision, could be
attributed to the decreased spatial resolution in peripheral
regions of the retina. The visual acuity at an
eccentricity of 15° is expected to be about 20/125 (Cowey
& Rolls, 1974; Daniel & Whitteridge, 1961). Whittaker
and Lovie-Kitchin (1993) proposed using font
sizes several times larger than the acuity threshold to
reach optimal reading rates. Bowers and Reid (1997)
suggested print sizes of at least four times the acuity
threshold. The character size we used corresponds to a
visual acuity of about 20/400. This size was thus just
adequate, and did not, in our view, significantly limit
reading rates for eccentric reading in this study.
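The margin between character size and acuity threshold implied by these two Snellen values can be made explicit. This is only a worked ratio under the assumption that both figures are Snellen fractions, so that letter size scales with the Snellen denominator:

```latex
\[
\frac{\text{character size used}}{\text{acuity threshold at } 15^{\circ}}
\approx \frac{400}{125} = 3.2
\]
```

A factor of about 3.2 falls within the "several times" proposed by Whittaker and Lovie-Kitchin (1993) but somewhat below the factor of four suggested by Bowers and Reid (1997), which matches the description of the size as just adequate.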
5.4. Other factors limiting reading performance
At the end of experiment 1 (central reading), subjects
achieved reading rates ranging between 72 and 118 w/m,
which was considerably lower than the 160–180 w/m
they achieved when reading the same newspaper under
‘normal’ reading conditions. This speed reduction by
about a factor of two must be associated with constraints
of the experimental set-up.
One probable reason for this reduction of reading
rates is the spatial limitation of the viewing window.
Several authors, such as Fine, Kirschen, and Peli (1996)
or Beckmann and Legge (1996), have reported that the
reading rate in central vision can increase for visual
spans up to 13–14 characters. As stated above, the visual
span was artificially limited to a maximum of about six
letters in this study. This is considerably smaller than the
optimal visual span required to achieve maximum
reading rates with central vision. Additionally, the use of
a restricted viewing window certainly also affected
page navigation, since peripheral vision was not available
to orient saccades, which probably contributed to a
further decrease of reading rates.
Another experimental factor reducing reading rates
was certainly the use of target pixelisation. The pixel
resolution we used for our experiments (572 pixels in a
viewing window of 10° × 7°) was high enough to transmit
the necessary information for full-page text reading.
This was demonstrated by close to perfect reading scores
in central vision. However, one of our control measurements,
conducted at 15° eccentricity, revealed that
the use of non-pixelised text could significantly improve
the reading rate (see Fig. 10). This implies that performing
this task using pixelised stimuli, containing a
close-to-threshold information content (which is pertinent
in view of the development of retinal prostheses),
does put some significant load on the visual system in
order to extract the relevant information. Using stimuli
largely exceeding this lowest limit does facilitate reading,
probably through the use of redundancy. One could
relate this issue to findings obtained by Legge, Ahn,
Klitz, and Luebker (1997) when using low contrast text,
or to those of Whittaker and Lovie-Kitchin (1993) or
Bowers and Reid (1997), who suggested that print size
and contrast should be at least several times the
threshold values to achieve optimal reading rates.
Finally, it is important to recall that, on the 22″ PC
monitor we used in this study, the pages of text consisted
of seven lines of two to five words each.
Scanning such a text is certainly less optimal than scanning
normal text in a newspaper. However, such optimal
conditions are unfortunately difficult to conceive at the
present stage of development of retinal implants.
6. Conclusion
On the basis of the results reported in the present
work, we can conclude that retinal implants might be
able to restore full-page text reading abilities to blind
patients. About 600 electrodes, equally distributed on an
implant surface of 3 × 2 mm², appear to be a minimum
to restore useful function (footnote 13). A significant learning
process will, however, be required to reach optimal performance
with such devices, especially if the implant has to
be placed outside the fovea. Future users of retinal implants
will wear their prosthesis permanently in daily
life. They will have much more time to adapt to their
new vision than the normal subjects who participated in
these simulation experiments. One might therefore expect
them to benefit even more from learning, as well as
from other possible brain plasticity mechanisms. Our
present results are in this respect very encouraging for
the future.
[Footnote 13] Prototypes of highly integrated retinal prostheses, reaching this
level of contact density, have already been realised by Zrenner et al.
(1997) and Peyman et al. (1998).
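The contact density implied by these figures can be worked out in a few lines. The sketch assumes a regular square grid of electrodes, which the text does not state, and the function name is illustrative; the last line repeats the calculation for the simulated viewing window (572 pixels in 10° × 7°).

```python
import math

def grid_density_and_pitch(n_elements, width, height):
    """Return (elements per unit area, centre-to-centre pitch).

    Assumption: elements lie on a regular square grid, so the pitch is
    the square root of the area available to each element.
    """
    if n_elements <= 0 or width <= 0 or height <= 0:
        raise ValueError("all arguments must be positive")
    area = width * height
    return n_elements / area, math.sqrt(area / n_elements)

# About 600 electrodes on a 3 x 2 mm implant surface.
density_mm2, pitch_mm = grid_density_and_pitch(600, 3.0, 2.0)
print(f"{density_mm2:.0f} electrodes/mm^2, pitch {pitch_mm * 1000:.0f} micrometres")

# 572 pixels in the simulated 10 x 7 degree viewing window.
density_deg2, pitch_deg = grid_density_and_pitch(572, 10.0, 7.0)
print(f"{density_deg2:.2f} pixels/deg^2, {1 / pitch_deg:.2f} pixels per degree")
```

Under the square-grid assumption this gives 100 electrodes per mm², i.e. a centre-to-centre spacing of about 100 micrometres, and slightly under 3 pixels per degree of visual angle in the simulation.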
Additional research using similar simulations to assess
the potential feasibility of other important visual
tasks, such as spatial orientation (mobility) and spatial
localisation (visuo-motor coordination), is required to
get a more complete picture of the potential benefits that
could be derived from retinal prostheses. Multidisciplinary
research is also needed to determine whether prototype
chips can actually reach the required spatial selectivity in
neural excitation, and whether they can preserve
retinotopic mapping to some extent.
Acknowledgements
This work was supported by the Swiss National
Foundation for Scientific Research (grants 3100-61956.00 and 3152-063915.00) and the ProVisu
Foundation.
References
Baldasare, J., & Watson, G. (1986). Observations from the psychology
of reading relevant to low vision research. In G. C. Woo (Ed.), Low
vision principles and applications (pp. 272–286). New York:
Springer Verlag.
Beckmann, P. J., & Legge, G. E. (1996). Psychophysics of reading.
XIV. The page navigation problem in using magnifiers. Vision
Research, 36, 3723–3733.
Bowers, A. R., & Reid, V. M. (1997). Eye movements and reading with
simulated visual impairment. Ophthalmic and Physiological Optics,
17, 392–402.
Chow, A. Y., & Chow, V. Y. (1997). Subretinal electrical stimulation
of the rabbit retina. Neuroscience Letters, 225, 13–16.
Cowey, A., & Rolls, E. T. (1974). Human cortical magnification factor
and its relation to visual acuity. Experimental Brain Research, 21,
447–454.
Crist, R. E., Li, W., & Gilbert, C. D. (2001). Learning to see:
experience and attention in primary visual cortex. Nature Neuroscience, 4, 519–525.
Daniel, P. M., & Whitteridge, D. (1961). The representation of the
visual field on the cerebral cortex in monkey. Journal of Physiology,
159, 203–221.
Dobelle, W. H. (2000). Artificial vision for the blind by connecting a
television camera to the visual cortex. ASAIO Journal, 46, 3–9.
Fine, E. M., & Peli, E. (1996). The role of context in reading with
central field loss. Optometry and Vision Science, 73, 533–539.
Fine, E. M., Kirschen, M. P., & Peli, E. (1996). The necessary field of
view to read with an optimal stand magnifier. Journal of the
American Optometric Association, 67, 382–389.
Fine, E. M., Hazel, C. A., Latham, K., & Rubin, G. S. (1999). Are
benefits of sentence context different in central and peripheral
vision? Optometry and Vision Science, 76, 764–769.
J. Sommerhalder et al. / Vision Research 44 (2004) 1693–1706
Fletcher, D. C., & Schuchard, R. A. (1997). Preferred retinal loci.
Relationship to macular scotomas in a low-vision population.
Ophthalmology, 104, 632–638.
Harland, S., Legge, G. E., & Luebker, A. (1998). Psychophysics of
reading. XVII. Low-vision performance with four types of electronically magnified text. Optometry and Vision Science, 75, 183–190.
Heinen, S. J., & Skavenski, A. A. (1992). Adaptation of saccades and
fixation to bilateral foveal lesions in adult monkey. Vision
Research, 32, 365–373.
Humayun, M. S. (2001). Intraocular retinal prosthesis. Transactions of
the American Ophthalmological Society, 99, 271–300.
Latham, K., & Whitaker, D. (1996). A comparison of word recognition and reading performance in foveal and peripheral vision.
Vision Research, 36, 2665–2674.
Leat, S. J., Li, W., & Epp, K. (1999). Crowding in central and eccentric
vision: the effects of contour interaction and attention. Investigative
Ophthalmology and Visual Science, 40, 504–512.
Legge, G. E., Ahn, S. J., Klitz, T. S., & Luebker, A. (1997).
Psychophysics of reading. XVI. The visual span in normal and low
vision. Vision Research, 37, 1999–2010.
Legge, G. E., Mansfield, J. S., & Chung, S. T. L. (2001). Psychophysics
of reading XX. Linking letter recognition to reading speed in
central and peripheral vision. Vision Research, 41, 725–743.
Normann, R. A., Maynard, E. M., Rousche, P. J., & Warren, D. J.
(1999). A neural interface for a cortical vision prosthesis. Vision
Research, 39, 2577–2587.
Peyman, G., Chow, A. Y., Liang, C., Chow, V. C., Perlman, J. I., &
Peachey, N. S. (1998). Subretinal semiconductor microelectrode
array. Ophthalmic Surgery and Lasers, 29, 234–241.
Rizzo, J. F., & Wyatt, J. (1997). Prospects for a visual prosthesis. The
Neuroscientist, 3, 251–262.
Rumney, N. J. (1995). Using visual thresholds to establish vision
performance. Ophthalmic and Physiological Optics, 15, S18–S24.
Sireteanu, R., & Rettenbach, R. (2000). Perceptual learning in visual
search generalizes over tasks, locations, and eyes. Vision Research,
40, 2925–2949.
Sjöstrand, J., Olsson, V., Popovic, Z., & Conradi, N. (1999).
Quantitative estimations of foveal and extra-foveal retinal circuitry
in humans. Vision Research, 39, 2987–2998.
Sommerhalder, J., Oueghlani, E., Bagnoud, M., Leonards, U., Safran,
A. B., & Pelizzone, M. (2003). Simulation of artificial vision: I.
Eccentric reading of isolated words, and perceptual learning. Vision
Research, 43, 269–283.
Studebaker, G. A. (1985). A rationalized arcsine transform. Journal of
Speech and Hearing Research, 28, 455–462.
Toet, A., & Levi, D. M. (1992). The two-dimensional shape of spatial
interaction zones in the parafovea. Vision Research, 32, 1349–1357.
Veraart, C., Raftopoulos, C., Mortimer, J. T., Delbeke, J., Pins, D.,
Michaux, G., Vanlierde, A., Parrini, S., & Wanet-Defalque, M. C.
(1998). Visual sensations produced by optic nerve stimulation using
an implanted self-sizing spiral cuff electrode. Brain Research, 813,
181–186.
Wensveen, J. M., Bedell, H. E., & Loshin, D. S. (1995). Reading rates
with artificial central scotoma with and without special remapping
of print. Optometry and Vision Science, 72, 100–114.
Whittaker, S. G., Cummings, R. W., & Swieson, L. R. (1991). Saccade
control without a fovea. Vision Research, 31, 2209–2218.
Whittaker, S. G., & Lovie-Kitchin, J. (1993). Visual requirements for
reading. Optometry and Vision Science, 70, 54–65.
Zrenner, E., Miliczek, K. D., Gabel, V. P., Graf, H. G., Guenther, E.,
Haemmerle, H., Hoefflinger, B., Kohler, K., Nisch, W., Schubert,
M., Stett, A., & Weiss, S. (1997). The development of subretinal
microphotodiodes for replacement of degenerated photoreceptors.
Ophthalmic Research, 29, 269–280.
Zrenner, E. (2002). Will retinal implants restore vision? Science, 295,
1022–1025.
Processes Involved in Oculomotor Adaptation to
Eccentric Reading
Angélica Pérez Fornos, Jörg Sommerhalder, Benjamin Rappaz, Marco Pelizzone, and
Avinoam B. Safran
PURPOSE. Adaptation to eccentric viewing in subjects with a
central scotoma remains poorly understood. The purpose of
this study was to analyze the adaptation stages of oculomotor
control to forced eccentric reading in normal subjects.
METHODS. Three normal adults (25.7 ± 3.8 years of age) were
trained to read full-page texts using a restricted 10° × 7°
viewing window stabilized at 15° eccentricity (lower visual
field). Gaze position was recorded throughout the training
period (1 hour per day for approximately 6 weeks).
RESULTS. In the first sessions, eye movements appeared inappropriate for reading, mainly consisting of reflexive vertical
(foveating) saccades. In early adaptation phases, both vertical
saccade count and amplitude dramatically decreased. Horizontal saccade frequency increased in the first experimental sessions, then slowly decreased after 7 to 15 sessions. Amplitude
of horizontal saccades increased with training. Gradually, accurate line jumps appeared, the proportion of progressive
saccades increased, and the proportion of regressive saccades
decreased. At the end of the learning process, eye movements
mainly consisted of horizontal progressions, line jumps, and a
few horizontal regressions.
CONCLUSIONS. Two main adaptation phases were distinguished:
a “faster” vertical process aimed at suppressing reflexive foveation and a “slower” restructuring of the horizontal eye movement pattern. The vertical phase consisted of a rapid reduction
in the number of vertical saccades and a rapid but more
progressive adjustment of remaining vertical saccades. The
horizontal phase involved the amplitude adjustment of horizontal saccades (mainly progressions) to the text presented
and the reduction of regressions required. (Invest Ophthalmol
Vis Sci. 2006;47:1439–1447) DOI:10.1167/iovs.05-0973
From the Ophthalmology Clinic, Department of Clinical Neurosciences, Geneva University Hospitals, Geneva, Switzerland.
Supported by the Swiss National Foundation for Scientific Research Grants 3100-61956.00 and 3152-063915.00, by the ProVisu
Foundation, and by the Fondation en Faveur des Aveugles, Geneva,
Switzerland.
Submitted for publication July 26, 2005; revised November 23,
2005; accepted February 14, 2006.
Disclosure: A. Pérez Fornos, None; J. Sommerhalder, None; B.
Rappaz, None; M. Pelizzone, None; A.B. Safran, None
The publication costs of this article were defrayed in part by page
charge payment. This article must therefore be marked “advertisement” in accordance with 18 U.S.C. §1734 solely to indicate this fact.
Corresponding author: Jörg Sommerhalder, Ophthalmology
Clinic, Geneva University Hospitals, 24 rue Micheli-du-Crest, 1211
Geneva 14, Switzerland; [email protected].
Investigative Ophthalmology & Visual Science, April 2006, Vol. 47, No. 4
Copyright © Association for Research in Vision and Ophthalmology
In humans, selective attention is mainly focused around the
fovea, the retinal area providing the highest spatial resolution. The oculomotor system is constructed essentially to subserve foveal function by directing and stabilizing images of
interest to that retinal location. When the fovea is lost as a
result of disease, affected subjects strive to use optimally
spared retinal areas as a replacement. Adaptation to this viewing condition may involve several processes. Spared retinal
areas with best visual acuity and/or appropriate visual field
(“visual span”) should be identified. Such eccentric retinal
locations are commonly known as preferred retinal loci
(PRL).1,2 Selective attention must be transferred to these eccentrically located PRL.3 In addition, oculomotor control
mechanisms should be reorganized to allow shifting images of
interest directly to the PRL.4 There is no simple rule by which
patients select a particular PRL,5,6 but it appears that PRL
location can be influenced by several factors (e.g., attentional3,7). Multiple PRL with specific complementary functions
can be used in combination.8–14
Eccentric fixation seems to develop
before the ability to perform saccades that shift the image of
interest onto the new fixation area. An experimental study
in which bilateral foveal lesions were made in three adult monkeys showed that, while both fixation and saccadic mechanisms
may adapt to foveal loss, saccadic adaptation requires a much
lengthier process.4 In that experiment, eccentric fixation occurred as early as 1 day after the lesion, and new PRL stabilized
within 2 days. In contrast, numerous reflexive saccades inappropriately projecting visual stimuli onto the damaged fovea
were still observed the first days after the lesion. Saccades
gradually adapted to reference the newly developed PRL over
a period that lasted several weeks. Two months after the
lesions were induced, two of the three animals were able to
generate saccades, bringing the PRL directly or close to the
target image. A clinical report of a patient with a
suddenly developed large central scotoma analyzed the first stages
of oculomotor adaptation to the field defect.15 Impressively, target fixation attempts evolved from
large, apparently poorly controlled saccades to small, mainly
horizontal saccades centered on defined retinal areas in a rapid
(<20 seconds) and structured process. In another clinical
study involving patients with central field defects, various refixation strategies were identified.16 One of these patients had
a central monocular scotoma and therefore presumably had
not used his affected eye for fixation. With that eye, he showed
a striking foveating–defoveating strategy: to fixate a stimulus
appearing in the peripheral visual field, he first performed a
foveating saccade (inappropriately projecting the target image
onto the central scotoma) and subsequently generated an additional saccade projecting the image onto the PRL. This distinction between the development of eccentric fixation and
the adaptation of eccentric (non-foveating) saccades suggests
that oculomotor adaptation to peripheral viewing relies on
multiple mechanisms.
In our laboratory, we are conducting a series of psychophysical experiments to determine, for a variety of tasks, the
minimum requirements for a retinal prosthesis to restore useful
artificial vision to blind patients. In this context, we recently
published a study17 focusing on full-page text reading. Tested
subjects were requested to read entire pages of text presented
on a computer screen, using a restricted viewing area that was
stabilized at a defined, first central and then eccentric visual
field location. After systematic training, useful reading was
achieved with a viewing area stabilized at high eccentricity.
Our paper focused on issues essential to the development of
visual prostheses, and most of the data collected on the oculomotor adaptation process were not reported. These results,
however, also offer a unique opportunity to study the overall
process through which subjects adapt to eccentric viewing. In
the present study we further analyze these data, to better
define the processes of oculomotor adaptation to eccentric
reading. To our knowledge, this is the first attempt at describing such processes in human subjects.
The restricted and stabilized viewing area used in our experiments can be considered an artificial, suddenly imposed
PRL. In this setting, we found that one of the main issues
involved in the learning process was the adjustment of oculomotor control, to reference as accurately as possible the eccentric viewing area used to navigate across text pages. We
observed that the adaptation of eye movements includes several distinct processes. This analysis is presented herein.
FIGURE 1. Example of a pixelized full-page text. Hyphenation was
used to maximize the number of words displayed, and texts were not justified. The image
covered the whole screen, subtending a visual angle of 40° × 30°.
METHODS
Subjects
Three normal subjects (AD, DV, and DS; respective ages: 23, 24, and 30
years) participated in the study. All of them were native French speakers, naïve to the task but familiar with the purpose of the experiment.
They had normal ophthalmic status and normal or corrected-to-normal
visual acuity. The experiments were designed according to the guidelines of the Declaration of Helsinki and were approved by the local
ethical authorities.
Experimental Procedure
Stimuli consisted of full pages of text presented to the subjects in
bitmap image format. A pool of 100 articles was downloaded from the
Web site of the popular Swiss newspaper Le Temps (http://www.letemps.ch).
Each article was divided into 10 text segments, each
containing seven lines of text (~25 words). The Arial font was used,
and the height of the lowercase letter x was 1.8° at a 57-cm viewing
distance. These text segments were transformed to bitmap images and
processed (pixelized) with commercial software (Photoshop ver. 5.5;
Adobe, Mountain View, CA). Figure 1 displays an example of such a
text.
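The relation between the angular sizes quoted here and physical sizes on the screen can be illustrated with a short calculation. This is a minimal sketch using the standard visual-angle formula; the function name `size_cm` is hypothetical and is not taken from the study's software. Only the 1.8° x-height and the 57-cm viewing distance come from the text.

```python
import math


def size_cm(angle_deg: float, distance_cm: float) -> float:
    """Physical extent (cm) of an object subtending angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


VIEWING_DISTANCE_CM = 57.0  # viewing distance stated in the text

# Height of the lowercase letter x (1.8 degrees of visual angle).
x_height_cm = size_cm(1.8, VIEWING_DISTANCE_CM)

# At 57 cm, one degree of visual angle spans very nearly one centimetre.
one_degree_cm = size_cm(1.0, VIEWING_DISTANCE_CM)

print(f"x-height: {x_height_cm:.2f} cm; 1 degree: {one_degree_cm:.3f} cm")
```

At this viewing distance degrees and centimetres are almost interchangeable, so the x-height corresponds to a little under 1.8 cm on the screen.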
Subjects sat in front of a 22-in. screen, at a viewing distance of 57
cm. In each experimental session, subjects were requested to read the
first four pages of a newspaper article displayed on the screen. The text
page was visible only through a 10° × 7° rectangular viewing area,
stabilized at a defined location of the visual field. Stabilization of
the viewing window on the retina was achieved by online gaze position compensation. Eye movements were monitored with a fast video-based eye- and head-tracking system (SMI EyeLink; SensoMotoric Instruments GmbH, Teltow/Berlin, Germany). A photograph of one of the
subjects sitting in front of the stimulation screen and wearing the
eye-tracking system is shown in Figure 2a. Gaze position data captured
by the eye-tracking system were used to move the viewing window
with respect to the page of text and to update its contents accordingly
(see Fig. 2b). The maximum delay between the actual eye movement
and the subsequent update of the position and content of the viewing
window was 14 ms. The viewing window contained 572 pixels (minimum information content required for useful reading18), and at least
four characters were visible inside it at a glance (minimum required for
efficient reading19).
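The principle of gaze-contingent stabilization described above can be sketched as follows. This is an illustrative sketch only, not the study's implementation: the names `Rect` and `viewing_window`, the use of degrees as screen coordinates, and the upward-positive y axis are all assumptions. The window size and the 15° lower-field offset come from the text; the pixelization and the update delay are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned region of the text page, in degrees (y axis pointing up)."""
    left: float
    right: float
    bottom: float
    top: float


WINDOW_W_DEG = 10.0  # window width stated in the text
WINDOW_H_DEG = 7.0   # window height stated in the text


def viewing_window(gaze_x: float, gaze_y: float,
                   eccentricity_deg: float = 15.0) -> Rect:
    """Region of the page shown for the current gaze position.

    The window centre is kept at a fixed offset below the gaze point, so the
    window always falls on the same retinal location (lower visual field).
    An eccentricity of 0 reproduces the central-vision condition.
    """
    cx = gaze_x
    cy = gaze_y - eccentricity_deg
    return Rect(cx - WINDOW_W_DEG / 2, cx + WINDOW_W_DEG / 2,
                cy - WINDOW_H_DEG / 2, cy + WINDOW_H_DEG / 2)


# To read a word located at the page origin, gaze must be held 15 deg above it.
eccentric = viewing_window(0.0, 15.0)
central = viewing_window(2.0, 1.0, eccentricity_deg=0.0)
print(eccentric, central)
```

The example makes explicit why reflexive foveating saccades are counterproductive in this setting: moving the gaze onto a word shifts the window 15° below it.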
In the first phase of the experiment, the center of the viewing
window was stabilized on the fovea. Once subjects became accustomed
to the experimental setup and reached stable reading performance
(i.e., after 8 to 16 experimental sessions), the training period for
eccentric reading began. For this second phase, the center of the
viewing window was stabilized at 15° eccentricity in the lower visual
field. Two experimental sessions were conducted each working day of
the week. To avoid fatiguing the subject, an experimental session never
lasted more than 30 minutes, resulting in approximately 1 h/day
of training. Experiments stopped once reading scores became asymptotic (between 55 and 68 sessions). Tests were performed monocularly, with the dominant eye. Eye movements were recorded throughout the experiment and stored for further analysis.
Please refer to our previous papers17,18 for more details on the
experimental setup and procedure.
FIGURE 2. Experimental setup. (a) A tested subject is wearing the head-mounted eye tracker. (b) Illustration of the screen as viewed by the subject
during an experimental session. The page of text was only visible through a 10° × 7° viewing window that shifted to follow the subject’s eye
movements (arrows). This window was stabilized either in central vision or at 15° eccentricity in the lower visual field (as shown in the
illustration). Please note that the cross marking the foveal fixation point is only schematic (there was no foveal fixation point present during the
task).
FIGURE 3. Saccade categorization. (a) Saccadic eye movements were categorized as horizontal (oriented between −20° and +20° around the
horizontal axis and directed either forward or backward), vertical (oriented between 70° and 110° around the vertical axis and directed either up
or down), and oblique. (b) According to their direction and amplitude, horizontal saccades were further subcategorized into progressions,
regressions, and line jumps.
Data Analysis and Statistics
Saccades detected online by the automatic parser of the eye-tracking
system were analyzed to define the various stages of the oculomotor
adaptation process for eccentric reading. For a saccade to be detected
by the system, several criteria had to be fulfilled: a minimum eye
displacement of 0.1°, a velocity threshold of at least 30 deg/s, and a
minimum acceleration threshold of 8000 deg/s².
Detected saccades were categorized into three main groups according to their orientation (Fig. 3a): horizontal saccades (those with an
angle of ±20° around the horizontal axis, and directed either right or
left), vertical saccades (those with angles between 70° and 110°
around the vertical axis, and directed either up or down), and oblique
saccades (those not fitting into any of the preceding categories).
Horizontal saccades were further subcategorized (Fig. 3b) into progressions (horizontal saccades directed right and <10° in amplitude),
regressions (horizontal saccades directed left and <10° in amplitude),
and line jumps (horizontal saccades directed left and >20° in amplitude).
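The categorization rules above can be expressed compactly in code. The following is a minimal sketch under stated assumptions, not the analysis software used in the study: the function name `classify_saccade` is hypothetical, the boundary angles are treated as inclusive (the text does not say), and horizontal saccades that match none of the three subcategories are returned with no subtype.

```python
import math


def classify_saccade(dx_deg: float, dy_deg: float):
    """Return (orientation, subtype) for a saccade displacement in degrees.

    dx_deg > 0 means rightward (forward in the text); dy_deg > 0 means upward.
    subtype is None for vertical, oblique, and unclassified horizontal saccades.
    """
    amplitude = math.hypot(dx_deg, dy_deg)
    if amplitude == 0.0:
        raise ValueError("zero-length displacement is not a saccade")

    # Angle between the saccade and the horizontal axis, folded into [0, 90].
    angle = math.degrees(math.atan2(abs(dy_deg), abs(dx_deg)))

    if angle >= 70.0:   # within 20 degrees of the vertical axis
        return "vertical", None
    if angle > 20.0:    # neither horizontal nor vertical
        return "oblique", None

    # Horizontal saccades: subcategorize by direction and amplitude.
    rightward = dx_deg > 0
    if rightward and amplitude < 10.0:
        return "horizontal", "progression"
    if not rightward and amplitude < 10.0:
        return "horizontal", "regression"
    if not rightward and amplitude > 20.0:
        return "horizontal", "line_jump"
    return "horizontal", None  # e.g., leftward saccades of 10 to 20 degrees


print(classify_saccade(4.0, 0.5))    # short rightward saccade
print(classify_saccade(-25.0, 2.0))  # long leftward saccade
```

Note that, as defined in the text, the three horizontal subcategories are not exhaustive, which is why the sketch keeps an explicit unclassified case.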
Saccade frequency was calculated as the total number of saccades
performed during an experimental session (i.e., four full pages of text).
Saccade amplitude was computed as the total absolute eye displacement (length) between the eye position at the beginning of the saccade
and its end position. Average saccade amplitude for a given experimental session was calculated on the basis of the absolute amplitude of
all saccades performed during the session.
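The two session-level measures defined above can be illustrated as follows. This is a hedged sketch: the names `saccade_amplitude` and `session_summary`, the representation of a saccade as a pair of (x, y) positions in degrees, and the sample values are assumptions made for illustration and are not data from the study.

```python
import math
from statistics import fmean


def saccade_amplitude(start, end) -> float:
    """Absolute eye displacement (degrees) between start and end positions."""
    return math.hypot(end[0] - start[0], end[1] - start[1])


def session_summary(saccades) -> dict:
    """Saccade count and mean absolute amplitude for one session.

    saccades is a list of (start, end) pairs, each an (x, y) tuple in degrees.
    """
    amplitudes = [saccade_amplitude(start, end) for start, end in saccades]
    mean_amp = fmean(amplitudes) if amplitudes else float("nan")
    return {"count": len(amplitudes), "mean_amplitude": mean_amp}


# Made-up example: three saccades with amplitudes of 5, 3, and 6 degrees.
example_session = [((0, 0), (3, 4)), ((3, 4), (3, 1)), ((3, 1), (-3, 1))]
summary = session_summary(example_session)
print(summary)
```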
Significant changes in oculomotor behavior throughout the learning process were determined with Pearson’s correlation (linear regression). In addition, whenever the results allowed it, we computed
learning curves to average intersession variability and better highlight
the time course of the learning process. These learning curves were
obtained by fitting the data to an exponential function. Stabilization
times were determined based on the exponential time constant (τ),
which corresponds to the time required for the function to vary by a
factor of 1/e (approximately 0.368). The stabilization time of an exponential function is generally estimated as 3τ.
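The learning-curve procedure can be sketched as follows. The text does not specify the exact functional form or fitting algorithm, so everything here is an assumption: the model y = c + a·exp(−n/τ), the grid search over τ with a closed-form least-squares solution for a and c, and the names `fit_exponential` and `stabilization_time`. The data in the example are synthetic and are not results from the study.

```python
import math


def fit_exponential(sessions, values, taus=None):
    """Fit y = c + a * exp(-n / tau) by grid search over tau.

    For each candidate tau the model is linear in (a, c), so these are
    obtained in closed form by ordinary least squares. Returns (a, tau, c).
    """
    if taus is None:
        taus = [0.5 + 0.1 * k for k in range(1000)]  # 0.5 to about 100 sessions
    n = len(sessions)
    mean_y = sum(values) / n
    best = None
    for tau in taus:
        x = [math.exp(-s / tau) for s in sessions]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0.0:
            continue  # degenerate regressor, skip this tau
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, values))
        a = sxy / sxx
        c = mean_y - a * mean_x
        sse = sum((c + a * xi - yi) ** 2 for xi, yi in zip(x, values))
        if best is None or sse < best[0]:
            best = (sse, a, tau, c)
    if best is None:
        raise ValueError("no valid time constant in the search grid")
    _, a, tau, c = best
    return a, tau, c


def stabilization_time(tau: float) -> float:
    """Stabilization time estimated as three time constants."""
    return 3.0 * tau


# Synthetic learning curve (not study data): decay from 320 towards 20 counts.
sessions = list(range(1, 61))
values = [20.0 + 300.0 * math.exp(-s / 8.0) for s in sessions]
a, tau, c = fit_exponential(sessions, values)
print(f"tau = {tau:.1f} sessions, stabilization after {stabilization_time(tau):.0f} sessions")
```

After 3τ the remaining distance to the asymptote is e⁻³, about 5% of the initial difference, which is why this value is commonly used as the stabilization time.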
RESULTS
When using central vision, performance increased and stabilized after 8 to 16 experimental sessions. Detailed reading
performance results have already been reported in our previous paper.17 Briefly, when using central vision, initial reading
scores were already higher than 95%. Reading rates increased
from initial values of 60 to 70 words per minute and stabilized at approximately
72 to 122 words per minute. For eccentric reading, two subjects (DV and DS) started the experiment with reading scores
that were nearly 13% correct and reached final scores between
86% and 98% correct. Subject AD, who had already attained good
scores (>85% correct) in the first sessions, achieved final
scores higher than 98% correct. Reading rates improved impressively: from 5 to 26 words per minute for subject AD, from
3 to 14 words per minute for subject DV, and from 1 to 28
words per minute for subject DS.
Samples of successive gaze position recordings obtained
during selected experimental sessions, superimposed on the
corresponding text page presented, are displayed in Figures 4
(central reading) and 5 (eccentric reading).
During the first training sessions for eccentric reading, oculomotor behavior appeared quite inappropriate for the reading task: large vertical saccades predominated. Subjects
seemed unable to fixate presented words or to follow a line of
text. Oculomotor behavior evolved gradually. Eye movements
intended to decipher single words were already visible as early
as in the 5th session, especially for subject DV. At the end of
the training period, all subjects developed a structured page
navigation strategy.
When comparing final eccentric reading strategies with
those observed by the end of the previous central vision reading tasks (compare Figs. 4, 5), it appears that, after training,
both eye movement patterns were roughly similar. The viewing window focused on consecutive words and across successive lines of text. Forward-directed saccades shifting fixation
from one word to the next (progressions) and saccades shifting
fixation from the end of one line to the beginning of the next
(line jumps) were clearly distinguishable. Occasionally, subjects traced back on the same line (regressions), to visualize
specific words again. However, differences could also be noted
between central and eccentric reading. In eccentric vision,
regressions occurred more frequently. Moreover, horizontal
saccades seemed less precise; therefore, more small corrective
saccades were required.
FIGURE 4. Gaze position recorded for the three normal subjects while they performed the reading task in central vision (last session). Solid line:
trajectory of the center of the viewing window relative to the text (see Fig. 2b).
These considerations were based on a qualitative assessment of the oculomotor adaptation process observed in our
subjects. To provide an objective evaluation of the changes
that occurred in the reading strategy, a quantified analysis of
our data was conducted. The characteristics of the recorded
saccades will be described in the following section.
Saccadic Adaptation
The distribution of saccades performed during the 1st, 5th,
15th, and last eccentric reading sessions is plotted in Figure 6.
During the first training session, bundles of large vertical saccades were observed. Many of these eye movements were
between 10° and 20° in amplitude, probably reflecting recurring (reflexive) attempts to bring the stimulus image onto the
fovea (foveating saccades), followed by an equivalent saccade
of opposite direction attempting to bring the viewing window
back on the stimulation screen. In the fifth session, these
movements were no longer visible in subjects AD and DV, and
only a few of them were still observed in subject DS. The
remaining vertical saccades gradually decreased in amplitude,
to become hardly visible at the end of training. In contrast,
structured patterns of horizontal eye movements developed in
the 5th session in two subjects (AD and DV). From the 15th
session on, horizontal saccades predominated over the initially
prevailing vertical pattern. In the last training session, eye
movements essentially consisted of progressions, regressions,
line jumps, and other small corrective saccades.
Changes in saccade counts (frequencies) by category are
plotted in Figure 7. The total number of vertical saccades
decreased significantly over time in all subjects (Pearson’s
correlation: r = 0.58, P < 0.0001 for AD; r = 0.39, P < 0.01
for DV; and r = 0.72, P < 0.0001 for DS). An approximate
15-fold drop was observed after 3, 20, and 25 sessions in
subjects DV, AD, and DS, respectively. Slighter (approximately
5-fold) but significant (Pearson’s correlation: r = 0.72, P <
0.0001 for AD; r = 0.73, P < 0.0001 for DV; and r = 0.82, P <
0.0001 for DS) frequency decays were observed for oblique
saccades. In subjects AD and DS, the process was slower (33
and 38 sessions, respectively) than for vertical saccades. In
subject DV, values were still decreasing when the experiment
ended. Evolution of horizontal saccade counts was more complex, and data could not be fitted with an exponential curve. In
AD and DV, these increased significantly during the first 15
sessions (respectively, Pearson’s correlation: r = 0.60, P <
0.05 and r = 0.72, P < 0.01) and then significantly decreased
(respectively, Pearson’s correlation: r = 0.48, P < 0.001 and
r = 0.31, P < 0.05). In subject DS, horizontal saccade counts
increased significantly during the first seven sessions (Pearson’s
correlation: r = 0.91, P < 0.01) and then decreased
significantly (Pearson’s correlation: r = 0.82, P < 0.0001).
Additional results were obtained after horizontal saccade
subcategorization (Fig. 8). The proportion of progressions increased significantly in all three subjects, from average values
ranging between 45% and 60% in the first sessions up to
approximately 65% by the end of training (Pearson’s correlation: r = 0.74, P < 0.0001 for AD; r = 0.43, P < 0.001 for DV;
and r = 0.74, P < 0.0001 for DS). Only subject AD reached an
asymptote (after 50 sessions). Regressions behaved inversely.
In the beginning of training, they represented approximately
41%, 34%, and 43% of the total number of horizontal saccades
in AD, DV, and DS, respectively. These proportions significantly decreased to 17%, 26%, and 27%, respectively (Pearson’s
correlation: r = 0.70, P < 0.0001 for AD; r = 0.65, P < 0.0001
for DV; and r = 0.81, P < 0.001 for DS). At the end of the
experiment, the proportion of regressions was still decreasing
in subjects DV and DS, whereas in subject AD, values stabilized
after approximately 30 sessions. The total number of line
jumps increased significantly with training in DV and DS (Pearson’s correlation: r = 0.64, P < 0.0001 and r = 0.61, P <
0.0001, respectively). Line jump counts in AD were more
variable, but also tended to increase over time (Pearson’s
correlation: r = 0.20, P = 0.1). Values in subjects AD and DV
stabilized after approximately 16 and 27 sessions. In the case of
subject DS, line jump counts had not reached an asymptote
when the experiment ended.
Average amplitude of the different saccade categories was
also modulated throughout the training period (Fig. 9). For
vertical saccades, amplitudes dropped significantly from initial
values of 5° to 8° down to final values of around 3° (Pearson’s
correlation: r = 0.56, P < 0.0001 for AD; r = 0.69, P < 0.0001
for DV; and r = 0.72, P < 0.0001 for DS). Asymptotes were
reached after 13, 20, and 27 sessions in subjects DV, AD, and
DS, respectively. Average amplitude of oblique saccades remained stable in subject DS, and decreased very slightly but
significantly in subjects AD and DV (respectively, Pearson’s
correlation: r = 0.34, P < 0.01 and r = 0.47, P < 0.0001). In
contrast, average amplitude of horizontal saccades significantly
increased from initial values of approximately 5°, 4°, and 2.5° up to
7°, 6°, and 4° in subjects AD, DV, and DS, respectively
(Pearson’s correlation: r = 0.42, P < 0.001; r = 0.51, P < 0.001;
and r = 0.80, P < 0.0001). In subject DS, amplitudes did not
stabilize, whereas in subjects AD and DV, curves reached
asymptote after 20 and 23 sessions, respectively.
DISCUSSION
Eccentric vision requires adaptation of oculomotor control to
such specific viewing conditions. Reflexive foveating mechanisms
must be suppressed and saccadic eye movements must
be redirected to the new fixation locus.
FIGURE 5. Gaze position recorded for the three normal subjects during the 1st, 5th, 15th, and last eccentric reading sessions. Solid line: trajectory
of the center of the viewing window relative to the text (see Fig. 2b).
Our data demonstrate that the pattern of eye movements
changed impressively throughout the learning process. Certain
oculomotor adaptation stages appeared consistently in all
tested subjects. Two essential adaptation processes were distinguishable: a faster, vertical phase aimed at suppressing reflexive foveation, and a slower, horizontal phase dedicated to
the restructuring of the horizontal eye-movement pattern.
During the first sessions, numerous vertical foveating saccades were observed. Interestingly, the first rapid, vertical
adaptation process appeared to include two relatively distinct,
parallel phases: one consisting of the reduction of the vertical
saccade count, the second of the reduction of both the oblique
saccade count and vertical saccade amplitude. According to
our results, the former occurred promptly, and the latter,
although rapid, was more progressive. It is reasonable to presume that both aim at reducing reflexive foveation, but each
relies on distinct mechanisms, as suggested by their different
time course.
FIGURE 6. Angular and amplitude (polar) distribution of the saccades performed during the 1st, 5th, 15th, and last eccentric reading sessions, for
each of the three subjects.
The second, slower adaptation phase concerned the restructuring of the horizontal eye movement pattern. In the
initial sessions, no structured reading sequence was distinguishable. Frequency of horizontal saccades increased during
the first 7 to 15 sessions and then slowly decreased, whereas
their average amplitude increased all through the learning
process. The proportion of progressions increased gradually. It
has been demonstrated that, in eccentric vision, the visual span
can increase with training.20 This should result in fewer but
longer saccades, as observed in our data. A significant reduction in the proportion of regressive saccades was also observed
in all subjects. As a rule, when reading difficulty decreases,
saccade length increases, and the frequency of regressions
diminishes.21,22 Subjects spontaneously reported that the task
became easier with training, resulting in better word recognition during eccentric fixation (see also our previous publications17,18). Thus, fewer regressions were necessary for deciphering. Line jumps developed gradually, and better
calibration of progressive saccades was achieved with training.
Hence, as better eccentric oculomotor control was developed,
fewer corrective saccades were needed. Two parallel, presumably related phenomena may therefore be distinguished during
the development of horizontal saccade control. The first one
corresponded to the adaptation of the amplitude of horizontal
saccades (mainly progressions) to the text presented. The
second one consisted of the reduction in number of regressions.
Our results showed that, even when optimal eccentric reading performance had been attained, oculomotor behavior was
not optimal compared with that observed in central vision
(compare results for central and eccentric reading in Figs. 7, 8,
9). Although subjects adapted to the eccentric reading task,
Oculomotor Adaptation to Eccentric Reading
FIGURE 7. Changes in saccade frequency versus session number for the three subjects during training of eccentric reading, by saccade category.
Average values in central vision (black dashed lines) are also shown for comparison.
vertical saccades did not disappear completely. More horizontal and oblique saccades were necessary in eccentric vision
than in central vision. In two of the three subjects, more line
jumps were performed and horizontal saccades were smaller
during eccentric reading. In general, oblique saccades were
smaller for eccentric than central viewing conditions. These
results clearly demonstrate that, even after extensive training,
the characteristics of saccades performed during eccentric and
central reading differed. A previous investigation23 in patients
with central scotoma described similar behavior. Even when
these patients had adapted to direct images consistently onto
the PRL, characteristics of eccentric saccades differed from
those of foveating saccades. Typically, foveating saccades have
shorter latencies and are more accurate than eccentric, nonfoveating saccades.24–26 Taken together, these findings confirm that subjects suppress foveating saccades and then adapt
nonfoveating saccades to reference the new fixation locus, in
accordance with previous reports.23
Limitations and Implications of the Present Study
Our experimental setting obviously does not fully simulate the
functional constraints and remaining retinal capacities found in
conditions associated with central scotomas and eccentric
reading. Furthermore, the results presented herein were certainly
influenced by the artificial constraints imposed by our
experimental setting.
Unlike most patients with macular disease, our subjects
were also confronted with artificial tunnel-vision conditions.
As a consequence, page navigation was limited not only by
eccentric viewing but also by the lack of peripheral
information (restricted by the size of the viewing window). It
is therefore possible that subjects learned to make stereotyped
patterns of horizontal forward and return saccades to move the
viewing window along the text. Another possibility is that
subjects achieved horizontal page navigation by performing
saccades to “meaningful” portions of the text that were already
visible in the viewing window. Both strategies limit the amplitude and timing of the resultant eye movements, therefore
restricting the maximum reading speed that the subjects attained. Furthermore, patients with central scotoma can develop multiple PRL that may be used in combination to improve reading performance,12 whereas, in our experimental
setting, subjects had to cope with a single and fixed PRL.
The retinal location of our artificial PRL and the choice of
the task to be performed (reading) also influenced the oculomotor adaptation pattern. The orientation of foveating saccades obviously depends on the absolute position of the restricted viewing window relative to the fovea. In our case,
FIGURE 8. Evolution of the different horizontal saccade subcategories versus session number for the three subjects during training of eccentric
reading. The proportions (%) of progressions and regressions were calculated on the basis of the total number of horizontal saccades. Average
values for central vision (black dashed lines) are also shown for comparison.
FIGURE 9. Changes in average saccade amplitude (in degrees) versus session number for the three subjects during training of eccentric reading,
by saccade category. Average values for central vision (black dashed lines) are also shown for comparison.
foveating saccades were vertical because the restricted viewing window was stabilized at a given position along the vertical
meridian. Furthermore, for this reading task, the page navigation strategy essentially consisted of horizontal saccades (saccades performed from one word to the next, and from the end
of one line to the beginning of another). Therefore, because of
the nature and requirements of the task, subjects essentially
optimized the horizontal oculomotor pattern once foveating
saccades were controlled. Moreover, Peli27 suggested that an
orthogonal paradigm (where eccentricity direction is perpendicular to direction of gaze or target movement), as used in our
experiments, might favor eccentric oculomotor adaptation.
Despite these methodological limitations, our experimental
setting simulated the fundamental constraints faced by patients
with central scotoma. In these patients, at least one new
fixation locus must be developed to compensate for the missing fovea, and eye movements should be recalibrated accordingly. We therefore believe the present results offer useful
indications of how mechanisms for eccentric reading are constructed, at least in certain circumstances.
Additional Considerations
Even after extensive training, eccentric reading remained a
difficult task resulting in low reading rates, as consistently
observed in clinical practice.6,19,20 As already discussed, suboptimal control of eccentric, non-foveating saccades might
limit the maximum reading performance that can be achieved
in eccentric viewing conditions. However, optimal peripheral
reading rates were approximately five times slower than those
obtained by the same subjects in the initial central viewing
experiments. Oculomotor deficits observed at the end of training can hardly account for such a slowdown. Previous research18,28–30 demonstrated that peripheral reading remains
slow, even when no eye movements are necessary (i.e., the
RSVP paradigm). Similar to these investigations, in our experiments, maximum reading rates were essentially limited because of the spatial constraint of the viewing area. Such visual
span restrictions are known to be even more important in
peripheral than in central vision.17,31,32 Other possible factors
have already been discussed in our companion publication.17
When considering the analysis reported herein, together
with the reading performance results reported in our previous
paper,17 no consistent correlation could be established between reading performance and the course of oculomotor
adaptation to eccentric reading. Furthermore, significant learning effects have been demonstrated in previous eccentric reading studies18,20 where page navigation (i.e., the development
of precise eccentric oculomotor control) was not required.
This suggests the existence of additional adaptation mechanisms that were difficult to analyze in this study, such as
learning to shift attention from the foveal region toward the
eccentric retinal area stimulated.3,7
The learning effects reported in this article, together with
those described in our companion publication,17 might appear
surprising because experiments were performed in normal
subjects interleaving short eccentric viewing sessions with
much longer periods of normal foveal viewing. Similar learning
effects, however, have also been reported elsewhere.18,33 One
possibility is that learning effects acquired within a single
experimental session are somehow retained in the next one.
Another explanation could be that learning results from some
kind of perceptual assimilation occurring between sessions.
Our experience suggests that it could be a combination of
both. This issue would be an interesting line of investigation
for future research efforts.
In conclusion, our results demonstrate that oculomotor
adaptation to eccentric reading involves at least two parallel
processes: a faster suppression of the mechanisms generating
reflexive foveating saccades and a lengthier process aimed at
optimizing the remaining saccades (especially in the horizontal
plane in the case of reading). Even after systematic training,
eccentric reading remains a difficult task resulting in low reading rates.
References
1. Von Noorden GK, Mackensen G. Phenomenology of eccentric fixation. Am J Ophthalmol. 1962;53:642–660.
2. Cummings RW, Whittaker SG, Watson GR, Budd JM. Scanning characters and reading with a central scotoma. Am J Optom Physiol Opt. 1985;62:833–843.
3. Altpeter E, Mackeben M, Trauzettel-Klosinski S. The importance of sustained attention for patients with maculopathies. Vision Res. 2000;40:1539–1547.
4. Heinen SJ, Skavenski AA. Adaptation of saccades and fixation to bilateral foveal lesions in adult monkey. Vision Res. 1992;32:365–373.
5. Timberlake GT, Peli E, Essock EA, Augliere RA. Reading with a macular scotoma. II. Retinal locus for scanning text. Invest Ophthalmol Vis Sci. 1987;28:1268–1274.
6. Fletcher DC, Schuchard RA, Watson G. Relative locations of macular scotomas near the PRL: effect on low vision reading. J Rehabil Res Dev. 1999;36:356–364.
7. Mackeben M. Sustained focal attention and peripheral letter recognition. Spat Vis. 1999;12:51–72.
8. Whittaker SG, Budd J, Cummings RW. Eccentric fixation with macular scotoma. Invest Ophthalmol Vis Sci. 1988;29:268–278.
9. Lei H, Schuchard RA. Using two preferred retinal loci for different lighting conditions in patients with central scotomas. Invest Ophthalmol Vis Sci. 1997;38:1812–1818.
10. Duret F, Issenhuth M, Safran AB. Combined use of several preferred retinal loci in patients with macular disorders when reading single words. Vision Res. 1999;39:873–879.
11. Safran AB, Duret F, Issenhuth M, Mermoud C. Full text reading with a central scotoma: pseudo regressions and pseudo line losses. Br J Ophthalmol. 1999;83:1341–1347.
12. Deruaz A, Whatham AR, Mermoud C, Safran AB. Reading with multiple preferred retinal loci: implications for training a more efficient reading strategy. Vision Res. 2002;42:2947–2957.
13. Deruaz A, Matter M, Whatham AR, et al. Can fixation instability improve text perception during eccentric fixation in patients with central scotomas? Br J Ophthalmol. 2004;88:461–463.
14. Crossland MD, Kabanarou SA, Rubin GS. An unusual strategy for fixation in a patient with bilateral advanced age related macular disease. Br J Ophthalmol. 2004;88:1479–1480.
15. Safran AB, Landis T. Plasticity in the adult visual cortex: implications for the diagnosis of visual field defects and visual rehabilitation. Curr Opin Ophthalmol. 1996;7:53–64.
16. Duret F, Buquet C, Charlier J, Mermoud C, Viviani P, Safran AB. Refixation strategies in four patients with macular disorders. Neuroophthalmology. 1999;22:209–220.
17. Sommerhalder J, Rappaz B, de Haller R, Pérez Fornos A, Safran AB, Pelizzone M. Simulation of artificial vision: II. Eccentric reading of full-page text and the learning of this task. Vision Res. 2004;44:1693–1706.
18. Sommerhalder J, Oueghlani E, Bagnoud M, Leonards U, Safran AB, Pelizzone M. Simulation of artificial vision: I. Eccentric reading of isolated words, and perceptual learning. Vision Res. 2003;43:269–283.
19. Legge GE, Rubin GS, Pelli DG, Schleske MM. Psychophysics of reading II. Low vision. Vision Res. 1985;25:253–265.
20. Chung ST, Legge GE, Cheung SH. Letter-recognition and reading speed in peripheral vision benefit from perceptual learning. Vision Res. 2004;44:695–709.
21. Pirozzolo FJ. Eye movements and reading disability. In: Rayner K, ed. Eye Movements in Reading. Perceptual and Language Processes. New York: Academic Press; 1983:499–509.
22. Rayner K. Eye movements in reading and information processing: 20 years of research. Psychol Bull. 1998;124:372–422.
23. Whittaker SG, Cummings RW, Swieson LR. Saccade control without a fovea. Vision Res. 1991;31:2209–2218.
24. Hallett PE. Primary and secondary saccades to goals defined by instructions. Vision Res. 1978;18:1279–1296.
25. Zeevi YY, Peli E. Latency of peripheral saccades. J Opt Soc Am. 1979;69:1274–1279.
26. Whittaker SG, Cummings RW. Foveating saccades. Vision Res. 1990;30:1363–1366.
27. Peli E. Control of eye movement with peripheral vision: implications for training of eccentric viewing. Am J Optom Physiol Opt. 1986;63:113–118.
28. Latham K, Whitaker D. A comparison of word recognition and reading performance in foveal and peripheral vision. Vision Res. 1996;36:2665–2674.
29. Rubin GS, Turano K. Low vision reading with sequential word presentation. Vision Res. 1994;34:1723–1733.
30. Chung ST, Mansfield JS, Legge GE. Psychophysics of reading. XVIII. The effect of print size on reading speed in normal peripheral vision. Vision Res. 1998;38:2949–2962.
31. Beckmann PJ, Legge GE. Psychophysics of reading: XIV. The page navigation problem in using magnifiers. Vision Res. 1996;36:3723–3733.
32. Fine EM, Kirschen MP, Peli E. The necessary field of view to read with an optimal stand magnifier. J Am Optom Assoc. 1996;67:382–389.
33. Beard BL, Levi DM, Reich LN. Perceptual learning in parafoveal vision. Vision Res. 1995;35:1679–1690.
Simulation of Artificial Vision, III: Do the Spatial or
Temporal Characteristics of Stimulus Pixelization
Really Matter?
Angélica Pérez Fornos, Jörg Sommerhalder, Benjamin Rappaz, Avinoam B. Safran, and
Marco Pelizzone
PURPOSE. In preceding studies, simulations of artificial vision
were used to determine the basic parameters for visual prostheses to restore useful reading abilities. These simulations
were based on a simplified procedure to reduce stimuli information content by preprocessing images with a block-averaging algorithm (square pixelization). In the present study, how
such a simplified algorithm affects reading performance was
examined.
METHODS. Five to six volunteers with normal vision were asked
to read full pages of text with a 10° × 7° viewing window
stabilized in central vision. In a first experiment, reading performance with off-line and real-time square pixelizations was
compared at different resolutions. In a second experiment,
off-line square pixelization was compared with off-line Gaussian pixelization with various degrees of overlap. In a third
experiment, real-time square pixelization was compared with
real-time Gaussian pixelization.
RESULTS. Results from the first experiment showed that real-time square pixelization required approximately 30% less information (pixels) than its off-line counterpart. Results from
the second experiment, using off-line processing, revealed a
restricted range of Gaussian widths for which performances
were equivalent or significantly better than that obtained with
square pixelization. The third experiment demonstrated, however, that reading performances were similar in both real-time
pixelization conditions.
CONCLUSIONS. This study reveals that real-time stimulus pixelization favors reading performance. Performance gains were
moderate, however, and did not allow for a significant (e.g.,
twofold) reduction of the minimum resolution (400–500 pixels) needed to achieve useful reading abilities. (Invest Ophthalmol Vis Sci. 2005;46:3906–3912) DOI:10.1167/iovs.04-1173
Currently, several research groups are working toward the
development of visual prostheses for the blind.1–7 Despite
fundamental design differences (implantation site, image acquisition, and processing techniques), these approaches share
common features that lead to several major constraints on the
From the Ophthalmology Clinic, Department of Clinical Neurosciences, Geneva University Hospitals, Geneva, Switzerland.
Supported by Swiss National Foundation for Scientific Research
Grants 3100-61956.00 and 3152-063915.00 and by the ProVisu Foundation.
Submitted for publication October 4, 2004; revised February 14,
May 24, and June 2, 2005; accepted August 1, 2005.
Disclosure: A. Pérez Fornos, None; J. Sommerhalder, None; B.
Rappaz, None; A.B. Safran, None; M. Pelizzone, None
The publication costs of this article were defrayed in part by page
charge payment. This article must therefore be marked “advertisement” in accordance with 18 U.S.C. §1734 solely to indicate this fact.
Corresponding author: Jörg Sommerhalder, Ophthalmology
Clinic, Geneva University Hospitals, 24 rue Micheli-du-Crest, 1211
Geneva 14, Switzerland; [email protected].
visual percepts that can be elicited. Envisioned devices consist
of a finite number of discrete stimulation contacts, will be
implanted at a fixed location in the eye, and will subtend only
a fraction of the entire visual field. If one expects to restore
useful vision to blind patients, these constraints have to be
thoroughly considered.
Our research group is part of a larger multidisciplinary
research effort aiming to develop a subretinal implant. Our
CMOS-Retina8–10 is built to transform incident light on the
retina into electric stimulation currents “in situ.” In this context, we have developed special experimental conditions (simulations) to explore the minimum requirements to restore
useful artificial vision.
Our simulations use low-resolution (pixelized) images that
are projected in a “small” viewing area, stabilized at a fixed
location in the visual field. We attempt to mimic the type of
visual information provided by a retinal implant, using photodiode technology to transform incident light into an electric
signal. With this methodological approach we explored, in a
first study,11 the reading of isolated four-letter words. In central
vision, accurate recognition was possible with pixelizations
down to 286 pixels, distributed over a 10° × 3.5° viewing
window. After a period of systematic training, comparable
results were achieved with the same viewing window stabilized at 15° eccentricity in the lower visual field. In a second
study,12 we explored full-page text reading under similar conditions. Tests were performed with a larger viewing window of
10° × 7° containing 572 pixels that moved across the page of
text under control of the subject’s eye movements. Performance was close to perfect with central vision. With eccentric
vision, subjects achieved reading scores between 86% and 98%
after a period of methodical training.
In earlier studies, we used a simplified technique to simulate the limited number of stimulation contacts available in a
visual prosthesis. Stimulus images were decomposed into a
finite number of pixels with a simple block-averaging algorithm. This resulted in a mosaic of square pixels of various gray
levels, the gray level within each pixel being constant (square
pixelization). However, electrophysiological research13–15 revealed that the patterns of neural activity elicited by electric
stimulation of the retina depend on the strength of the stimulation current and that neural activation diminishes progressively with increasing electrode-to-neural target distance.
These findings imply that phosphenes elicited by electrical
stimulation of the retina should not be of constant luminosity
and not of square shape. Furthermore, depending on the
strength of the stimulation current, the percepts may develop
from a collection of isolated phosphenes toward more continuous patterns with different degrees of overlap across neighboring phosphenes.
One could argue that square pixelization is adequate to
simulate the reduced information content of the stimuli transmitted by a retinal implant. In a given condition, the detailed
shape of each pixel does not alter the overall information
content of the image. However, studies on face recognition
Investigative Ophthalmology & Visual Science, October 2005, Vol. 46, No. 10
Copyright © Association for Research in Vision and Ophthalmology
have demonstrated that detection is considerably hampered
when images are decomposed into uniform square pixels.
Harmon and Julesz16 suggested that the oriented high-frequency noise introduced at block borders masks certain image
features essential for recognition. Gestalt psychologists17,18
further proposed that square pixelization distorts the image to
the point of modifying its intrinsic gestalt properties.19 Bachmann and Kahusk20 also suggest that the “block” constituents
or pixels of the processed image compete for attention with
the particular features of the image, thus affecting recognition.
If one wants to avoid these drawbacks, square pixelization
should be replaced by other types of image quantization featuring softer borders and allowing for variable amounts of
overlap.
Another shortcoming of our previous studies is that the
pixelization algorithm was applied off-line over the entire original image (e.g., seven lines of full-page text). Subjects were
allowed to scan this preprocessed image through a viewing
window containing a subset of 572 pixels, the gray level of
these “frozen” pixels being independent of the point of gaze
on the image. This would not be the case in artificial vision
systems, since stimulation intensity at each electrode contact
would depend on the exact point of gaze relative to the image
observed. For retinal implants transforming light falling on the
retina into stimulation currents “in situ,”4,7,10 this would happen due to eye movements. Head movements would act similarly in systems using an external head-mounted camera for
stimulus generation.1–3,5,6 In the case of reading, when focusing on a string of a few characters, its appearance would
change with small eye (or camera) movements. Temporal cues
seem to play a significant role in visual perception: the human
visual system is optimized for detecting structural changes in
dynamic images. A dynamic sequence of slightly different pixelized images may contain more information than one frozen
pixelized image; therefore, dynamic (real-time) pixelization is
likely to enhance information transmission to the visual system.
Major object identification features (such as shape or location)
are extracted from different spatial patterns (such as local
contrast changes or relative position changes) resulting from
image motion. Improved sensitivity for moving contrast
changes, compared to their static equivalents, has previously
been demonstrated.21 Moreover, it has already been established that dynamic presentations lead to better performance
in tasks like facial recognition.22–24 Hence, if one wants more
accurate simulations of artificial vision, pixelization should be
performed in real-time and the intensity of each pixel should
vary dynamically, according to gaze position.
To our knowledge, psychophysical research using simulations of prosthetic vision has not been extensive so far. Reading and mobility were first studied by a group at the University
of Utah.25,26 Their head-mounted experimental setup consisted of a video camera sending images to a monochrome
monitor that projected to the subject’s right eye (maximum
viewing angle of 1.7°). Pixelization was achieved by overlaying
the monitor with opaque masks containing a variable number
of square perforations (pixels). Recently, another group at The
Johns Hopkins University presented a series of experiments
that used simulations specifically designed to mimic percepts
evoked by retinal implants.27–29 Different pixelization algorithms were used: a square pixelizing filter similar to the one
presented in this article, a constant luminosity circular pixelizing filter, and a nonoverlapping Gaussian filter. Unfortunately, no direct comparison of the different pixelizing algorithms has been reported. Moreover, all these experiments
neglected a fundamental aspect of artificial vision with a retinal
implant: Viewing areas were not stabilized at fixed (eccentric)
retinal positions. In more recent studies, the latter authors
acknowledged that the stabilization of the viewing area on the
retina can significantly affect performance (Dagnelie G, et al.
IOVS 2004;45:ARVO E-Abstract 4223; Kelley AJ, et al. IOVS
2004;45:ARVO E-Abstract 5436), especially in visually demanding tasks such as reading.
To validate our previous studies as well as to improve our
simulation methods for future studies, we decided to investigate specifically the influence of the spatial and temporal characteristics of stimulus pixelization on reading performance. In
the present study, we report a series of three paired comparisons of the effects of different pixelization methods on fullpage reading. We compared reading performance: (1) between
off-line square pixelization and real-time square pixelization of
the image, (2) between off-line square pixelization and off-line
Gaussian pixelization of the image, and (3) between real-time
square pixelization and real-time Gaussian pixelization of the
image.
METHODS
Subjects
Ten subjects aged between 23 and 41 years were recruited from the
staff of the Geneva University Ophthalmology Clinic. All of them had
perfect command of French, corrected visual acuity of 20/20 or better,
and normal ophthalmic status. They were familiar with the purpose of
the study and signed appropriate consent forms. All experiments were
conducted according to the ethical recommendations of the Declaration of Helsinki and were approved by local ethics authorities.
Experimental Setup
The stabilized projection of a 10° × 7° viewing window on the retina
was achieved with a high-speed video-based eye and head-tracking
system (EyeLink; SensoMotor Instruments GmbH, Berlin, Germany)
and a high-refresh-rate monitor (Fig. 1). Please refer to our preceding
publications11,12 for a more detailed description of the experimental
setup.
Generation and Presentation of the Stimuli
Stimuli consisted of full-page texts generated by the same procedure as
was used in our previous study on full-page text reading.12 Articles
were extracted from the Internet Web site of the Swiss newspaper Le
Temps (http://www.letemps.ch) and cut into seven-line text segments
of approximately 25 words. Arial font (Helvetica) was used. At a
viewing distance of 57 cm, the height of the lowercase letter x
corresponded to a visual angle of 1.8°. The information content of the
stimuli was reduced using one of two pixelization algorithms, square
or Gaussian, which differed in the resultant shape of the pixels. These
FIGURE 1. Experimental setup used for prosthetic vision simulations.
Subjects were asked to read full-page texts by using their eye movements to move a stabilized, restricted viewing window on a computer
screen.
algorithms were applied either off-line, yielding images with “frozen”
pixels, or in real-time, yielding “dynamic” pixels that changed with
gaze position.
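As a rough check of the viewing geometry quoted above, the physical size s that subtends a visual angle θ at distance d follows from s = 2d·tan(θ/2). The sketch below is illustrative only: the function names are hypothetical, and it assumes a flat screen viewed head-on at the stated 57 cm.

```python
import math


def size_on_screen_cm(visual_angle_deg, viewing_distance_cm):
    """Physical extent (cm) that subtends the given visual angle at the given distance."""
    half_angle = math.radians(visual_angle_deg) / 2.0
    return 2.0 * viewing_distance_cm * math.tan(half_angle)


def visual_angle_deg(size_cm, viewing_distance_cm):
    """Visual angle (degrees) subtended by an object of the given size."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * viewing_distance_cm)))


# Values taken from the text: 57 cm viewing distance, 1.8 deg x-height,
# 10 deg x 7 deg viewing window.
x_height_cm = size_on_screen_cm(1.8, 57.0)
window_cm = (size_on_screen_cm(10.0, 57.0), size_on_screen_cm(7.0, 57.0))
print(f"x-height: {x_height_cm:.2f} cm, window: {window_cm[0]:.2f} x {window_cm[1]:.2f} cm")
```

At 57 cm, 1 cm on the screen subtends almost exactly 1°, so the 1.8° x-height corresponds to roughly 1.8 cm and the viewing window to roughly 10 cm × 7 cm.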
Square pixelization was performed with a simple block-averaging
algorithm, in which matrices of n × n pixels of the original image are
fused into single uniform pixels with luminance values corresponding
to the mean gray scale levels of the original n × n matrices (Fig. 2a).
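A minimal sketch of the block-averaging step described above, assuming a grayscale image stored as a list of rows whose dimensions are exact multiples of n. The function name and the pure-Python representation are illustrative choices, not the software used in the study.

```python
def square_pixelize(image, n):
    """Replace every n x n block of a grayscale image by its mean gray level.

    `image` is a list of equally long rows of numbers; the output has the
    same dimensions, with each block filled by a single uniform value.
    """
    height, width = len(image), len(image[0])
    if height % n or width % n:
        raise ValueError("image dimensions must be multiples of n")
    out = [[0.0] * width for _ in range(height)]
    for top in range(0, height, n):
        for left in range(0, width, n):
            rows = range(top, top + n)
            cols = range(left, left + n)
            block_mean = sum(image[r][c] for r in rows for c in cols) / (n * n)
            for r in rows:
                for c in cols:
                    out[r][c] = block_mean
    return out


# Small illustrative image (arbitrary gray levels), pixelized with 2 x 2 blocks.
demo = [
    [0, 0, 255, 255],
    [0, 255, 255, 255],
    [10, 10, 20, 20],
    [10, 10, 20, 20],
]
pixelized = square_pixelize(demo, 2)
```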
Gaussian pixelization was performed by applying a two-dimensional (2-D) Gaussian function to each pixel of the stimulus image (Fig.
2b):
$$I(x,y) = A(\mu_x,\mu_y) \cdot G(x,y).$$
I(x,y) represents the light intensity (gray scale level) at the coordinates
(x,y) of the stimulus image. A(μx,μy) is the mean gray scale level of the
original n × n pixel matrix with center coordinates (μx,μy). G(x,y)
stands for the 2-D Gaussian function calculated as:
$$G(x,y) = \frac{1}{2\pi\sigma^{2}}\, e^{-\frac{(x-\mu_x)^{2} + (y-\mu_y)^{2}}{2\sigma^{2}}},$$
where σ denotes the SD of the particular Gaussian function around its
horizontal (μx) and vertical (μy) means. In our case, σ determines the
amount of overlap of each pixel onto its neighbors (Gaussian width),
whereas μx and μy correspond to the center coordinates for each pixel
(Fig. 3).
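The two expressions above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: σ is taken to be expressed in units of the stimulus-pixel (block) width, contributions of overlapping pixels are assumed to add, and the function name is hypothetical.

```python
import math


def gaussian_pixelize(image, n, sigma):
    """Render each n x n block as a 2-D Gaussian blob, I = A * G.

    A is the block mean; G is the normalized 2-D Gaussian centered on the
    block. ASSUMPTIONS: sigma is given in block widths, and overlapping
    blobs are summed.
    """
    height, width = len(image), len(image[0])
    if height % n or width % n:
        raise ValueError("image dimensions must be multiples of n")
    sigma_px = sigma * n                      # SD in source-image pixels
    norm = 1.0 / (2.0 * math.pi * sigma_px ** 2)
    out = [[0.0] * width for _ in range(height)]
    for top in range(0, height, n):
        for left in range(0, width, n):
            a = sum(image[r][c]
                    for r in range(top, top + n)
                    for c in range(left, left + n)) / (n * n)
            mu_x = left + (n - 1) / 2.0       # block center, x
            mu_y = top + (n - 1) / 2.0        # block center, y
            for y in range(height):
                for x in range(width):
                    d2 = (x - mu_x) ** 2 + (y - mu_y) ** 2
                    out[y][x] += a * norm * math.exp(-d2 / (2.0 * sigma_px ** 2))
    return out
```

With the 1/(2πσ²) prefactor as written, the value at a blob center is A/(2πσ²) rather than A, so any rescaling to the display's gray-level range would be a separate step that this sketch leaves out.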
Off-Line Pixelization. All text segment images (seven lines of
full-page text) used for static presentations were processed off-line,
during the preparation phase of the experiment. Subjects could scan
these prepixelized images through the 10° × 7° viewing window,
under control of their gaze position on the screen.
Real-Time Pixelization. In this condition, only the small portion of the entire text segment image displayed in the 10° × 7° viewing
window (determined by the subject’s gaze position on the screen) was
pixelized in real-time. Gaze position data were used to reposition the
viewing window and to display its newly pixelized content on the
screen. To achieve adequate image stabilization on the retina, the
maximum image-processing time (stimulus pixelization and display)
was kept below 10 ms. To fulfill this condition, enormous processing
power is needed when large Gaussian widths are used, due to significant amounts of overlap across neighboring pixels. For real-time
pixelization, the processing power of our equipment limited us to
Gaussian widths up to 0.14 pixels.
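The difference between the two presentation modes can be illustrated on a single image row. In this sketch (hypothetical function names, with one-dimensional block averaging standing in for the 2-D case), off-line processing freezes the pixel grid relative to the text, whereas real-time processing anchors the grid to the viewing window, so gray levels change when gaze moves by a fraction of a block.

```python
def block_average_1d(row, n):
    """Mean of each consecutive group of n samples (incomplete tail dropped)."""
    return [sum(row[i:i + n]) / n for i in range(0, len(row) - n + 1, n)]


def expand(values, n):
    """Repeat each block value n times to return to source resolution."""
    return [v for v in values for _ in range(n)]


def offline_view(row, n, gaze, window):
    """Pixelize the whole row once, then look at it through the window."""
    frozen = expand(block_average_1d(row, n), n)
    return frozen[gaze:gaze + window]


def realtime_view(row, n, gaze, window):
    """Cut out the window at the current gaze position, then pixelize it."""
    content = row[gaze:gaze + window]
    return expand(block_average_1d(content, n), n)


# Illustrative row: a bright bar on a dark background (arbitrary gray levels).
row = [0, 0, 0, 0, 100, 100, 100, 100, 0, 0, 0, 0]
for gaze in (0, 2):
    print(gaze, offline_view(row, 4, gaze, 8), realtime_view(row, 4, gaze, 8))
```

With gaze shifted by two samples, the off-line view shows the same frozen levels (0 and 100) merely displaced, while the real-time view yields a new level (50) in every block, which is the kind of gaze-dependent variation the real-time condition is meant to reproduce.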
Testing Procedure
The remaining aspects of the experimental procedure were exactly the
same as described in our preceding study on full-page text reading.12
Briefly, tests were performed monocularly (using the dominant eye)
FIGURE 3. Gaussian pixelization. A 2-D Gaussian function was applied
to each pixel. Block averaging was used to determine the peak of the
Gaussian function. σ represents the SD used in the Gaussian function
(Gaussian width); μx and μy are the center coordinates of the stimulus
pixel to which the function is applied.
and in central vision. For each run, subjects had to read aloud several
text segments of an article, randomly chosen out of a pool of 50 (none
of the subjects read an article twice). Test sessions frequently included
several runs, but they never lasted longer than 30 minutes, to avoid
fatiguing the subjects.
The programs and algorithms used for image processing and experiment control were developed in commercial software (Visual
C++ 6.0 SP5; Microsoft, Redmond, WA) and the latest Platform SDK
libraries available at the time of the experiment. Some functions of the
EyeLink Windows API library (v. 1.0; SensoMotor Instruments, GmbH)
were also used.
Data Analysis and Statistics
Two variables were measured to assess reading performance: reading
scores, expressed in percentage of correctly read words (gender and
conjugation mistakes were considered as errors), and reading rates,
expressed in the number of correctly read words per minute. Since
percentage scales are not adequate for statistical analysis,30 reading
scores were transformed to rationalized arcsine units (rau). Nevertheless, for better clarity, an approximate percentage scale is shown on
the right axes of the figures and is also used in the text.
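For readers unfamiliar with the transform, the sketch below assumes that the rationalized arcsine units referred to above follow the commonly used formulation attributed to Studebaker: θ = arcsin√(X/(N+1)) + arcsin√((X+1)/(N+1)) and rau = (146/π)·θ − 23, where X is the number of correctly read words out of N. The function name is hypothetical.

```python
import math


def rationalized_arcsine_units(correct, total):
    """Convert a score of `correct` out of `total` items to rau.

    ASSUMPTION: this is the widely used rationalized arcsine formulation;
    the exact variant applied in the study is not spelled out in the text.
    """
    theta = (math.asin(math.sqrt(correct / (total + 1)))
             + math.asin(math.sqrt((correct + 1) / (total + 1))))
    return (146.0 / math.pi) * theta - 23.0


# Over most of the range rau values track percentages closely, but the
# scale extends below 0 and above 100 at the extremes.
for words_correct in (0, 50, 100):
    print(words_correct, round(rationalized_arcsine_units(words_correct, 100), 2))
```

The transform stabilizes the variance near 0% and 100%, which is why scores close to ceiling can be compared with t-tests more safely than raw percentages.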
Results were calculated as the mean of the cumulative performance
of each subject ± SEM. Statistically significant differences in reading
performance were determined by standard (paired) t-tests with a significance level of 0.05.
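The summary statistics named above can be sketched with the standard library alone. The scores below are hypothetical values invented for illustration, not data from the study, and the helper names are likewise assumptions. The paired t statistic is the mean of the per-subject differences divided by their standard error.

```python
import math
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / math.sqrt(len(values))


def paired_t(cond_a, cond_b):
    """Paired t statistic and degrees of freedom for matched samples."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    return mean(diffs) / sem(diffs), len(diffs) - 1


# HYPOTHETICAL per-subject scores (rau) for five subjects; not study data.
real_time = [94.0, 96.0, 92.0, 95.0, 93.0]
off_line = [78.0, 75.0, 80.0, 74.0, 76.0]

t_stat, df = paired_t(real_time, off_line)
T_CRIT_DF4 = 2.776  # two-tailed critical t for alpha = 0.05 with 4 degrees of freedom
print(f"real-time: {mean(real_time):.1f} +/- {sem(real_time):.2f} (mean +/- SEM)")
print(f"t({df}) = {t_stat:.2f}; significant at 0.05: {abs(t_stat) > T_CRIT_DF4}")
```

Comparing against a tabulated critical value avoids the need for a t-distribution routine; a statistics package would normally report the exact P value instead.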
RESULTS
Real-Time Square Pixelization Versus Off-Line
Square Pixelization
FIGURE 2. Pixelization methods: (a) square pixelization (block averaging); (b) Gaussian pixelization.
Five normal volunteers (22, 23, 24, 26, and 28 years of age)
were requested to read full-page texts using off-line and real-time square pixelization. Five resolution levels were tested:
28,000, 1,750, 572, 280, and 166 pixels in the viewing window. These resolution levels were identical with those used in
our previous study on reading of isolated four-letter words.11
All subjects started with the easiest (highest) resolution and
progressed toward the most difficult (lowest) one. The first
four text segments of an article (approximately 100 words) had
to be read in each run. Three runs were performed for each
pixelization condition. Off-line and real-time pixelization conditions alternated. It is important to note that the first resolution level (28,000 pixels) corresponded to maximum screen
resolution (no pixelization had to be performed). Off-line and
real-time pixelization conditions were thus identical in this
particular case.
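Square pixelization, described as block averaging in the caption of Figure 2, can be sketched as follows. This is a minimal illustration and not the authors' Visual C++ implementation: the function name, the list-of-rows image format, and the `dx`/`dy` grid-shift parameters are assumptions introduced here to show how the position of the pixelization grid changes the result.

```python
def block_average(image, block, dx=0, dy=0):
    """Square pixelization: replace each block x block cell by its mean.

    `image` is a list of equal-length rows of gray levels. `dx` and `dy`
    shift the pixelization grid relative to the image (hypothetical
    parameters standing in for a gaze-dependent grid position); cells cut
    off by the image border are averaged over the samples they contain.
    """
    if block < 1:
        raise ValueError("block must be >= 1")

    def cell_of(y, x):
        return ((y - dy) // block, (x - dx) // block)

    height, width = len(image), len(image[0])
    sums, counts = {}, {}
    for y in range(height):
        for x in range(width):
            cell = cell_of(y, x)
            sums[cell] = sums.get(cell, 0.0) + image[y][x]
            counts[cell] = counts.get(cell, 0) + 1
    return [[sums[cell_of(y, x)] / counts[cell_of(y, x)]
             for x in range(width)]
            for y in range(height)]

# A 4 x 4 test image with a vertical edge between columns 1 and 2.
img = [[0, 0, 8, 8] for _ in range(4)]
aligned = block_average(img, 2)          # grid aligned with the edge
shifted = block_average(img, 2, dx=1)    # grid shifted by one sample
```

With the grid aligned to the edge the edge survives intact, whereas a one-sample shift smears it across a whole block, so the same source image yields different pixelized stimuli depending on grid position.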
IOVS, October 2005, Vol. 46, No. 10
Figure 4 compares mean reading performances versus
number of pixels in the viewing window for off-line and
real-time pixelizations. Individual performances in each experimental condition were established on the basis of 12
text segments and data were fitted with psychometric functions. Down to a target resolution of 572 pixels, average
reading scores were close to perfect (above 95% correct)
and statistically equivalent for both conditions. At 280 pixels, subjects achieved reading scores of 94.3% with real-time
pixelization, but of only 76.4% with off-line pixelization.
This difference was statistically significant (P = 0.0017), and
persisted at the lowest resolution (166 pixels; 56.1% versus
29.3%; P = 0.013). It is interesting to estimate the critical
target resolution for subjects to reach useful reading performances. In our previous study on full-page reading,12 we
found that adequate (good to excellent) text comprehension
correlated closely with high reading scores. This criterion
was fulfilled at median scores of 96.8%. In the present case,
the fits to the data indicate that this score is reached at 498
pixels in the case of off-line pixelization and at 322 pixels for
real-time pixelization (Fig. 4a).
FIGURE 5. Pixelization with various Gaussian widths σ (pixel overlapping). Gaussian pixelizations with: (a) σ = 0.071 pixels (little overlap),
(b) σ = 0.286 pixels (medium overlap), and (c) σ = 1.143 pixels (large
overlap).
Reading rates appeared to be even more sensitive to the
number of pixels in the viewing window (Fig. 4b). At the
highest resolutions, subjects reached an average reading rate
of 93 words/min. At 572 pixels, mean reading rates had
significantly (P < 0.0001) decreased to 80 words/min for
real-time and to 64 words/min for off-line pixelization. The
difference between both pixelization conditions was also
statistically significant (P < 0.0001) and persisted at 280
pixels (34 words/min for real-time pixelization versus 18
words/min for off-line pixelization; P = 0.002). The lowest
pixelization condition (166 pixels) was so difficult that reading rates were very low (four to six words/min) in both
cases.
Taken together, these results indicate that equivalent reading performances could be reached at a significantly lower
resolution with real-time pixelization.
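The size of this gain can be checked with simple arithmetic on the two fitted resolutions quoted above (498 pixels off-line, 322 pixels real-time). The snippet is only a worked calculation; the variable names are illustrative.

```python
# Resolutions (pixels in the viewing window) at which the fitted curves
# reach the 96.8% criterion score, as quoted in the text.
offline_pixels = 498
realtime_pixels = 322

reduction = 1.0 - realtime_pixels / offline_pixels   # fraction of pixels saved
ratio = offline_pixels / realtime_pixels             # off-line / real-time

print(f"reduction: {reduction:.1%}, ratio: {ratio:.2f}")
```

The saving is roughly one third of the pixels, which is well short of a factor of two.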
Off-Line Gaussian Pixelization Versus Off-Line
Square Pixelization
FIGURE 4. Reading performance versus number of pixels in the 10° ×
7° viewing window for five normal subjects. Two stimulus generation
procedures are compared in central vision: real-time pixelization and
off-line pixelization. (a) Mean reading scores expressed in rau ± SEM
(left scale) and in % (right scale). Dashed line: reading scores
corresponding to good-to-excellent text comprehension. (b) Mean
reading rates expressed in words per minute ± SEM.
Six normal subjects (26, 29, 29, 33, 34, and 41 years of age)
participated in the second experiment. Pixelizations with six
different Gaussian widths (σ of 0.036, 0.071, 0.143, 0.286,
0.571, and 1.143 pixels) were tested and compared with
square pixelization. The effect of varying the Gaussian width σ
for image pixelization is illustrated in Figure 5. In all conditions, the 10° × 7° viewing window contained 572 pixels
(a resolution shown to provide enough information for useful
full-page text reading12). Each subject had to read an article of
approximately 250 words (i.e., 10 consecutive text segments)
per condition. Three subjects started the experiment with
Gaussian pixelization at the smallest σ value and progressed toward the larger Gaussian widths, finishing with square pixelization. The remaining three subjects performed the experiment in the reverse order.
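To give a feel for what these widths mean, the sketch below renders a row of pixels as a sum of Gaussian profiles and reports how much one pixel still contributes at the midpoint between itself and its nearest neighbor. It assumes that σ is expressed in units of the pixel spacing and that each pixel is an unnormalized Gaussian scaled by its gray level. The exact rendering used in the experiments is not specified in this section, so both points are assumptions, and all function names are hypothetical.

```python
import math

SIGMAS = [0.036, 0.071, 0.143, 0.286, 0.571, 1.143]  # widths tested, in pixels

def gaussian(distance, sigma):
    """Unnormalized Gaussian profile with peak value 1."""
    return math.exp(-0.5 * (distance / sigma) ** 2)

def midpoint_overlap(sigma):
    """Relative intensity of one pixel halfway to its nearest neighbor."""
    return gaussian(0.5, sigma)

def render_row(levels, sigma, samples_per_pixel=8):
    """Render a row of gray levels as a sum of Gaussian-shaped pixels."""
    n = len(levels) * samples_per_pixel
    row = []
    for i in range(n):
        pos = (i + 0.5) / samples_per_pixel          # position in pixel units
        row.append(sum(level * gaussian(pos - (k + 0.5), sigma)
                       for k, level in enumerate(levels)))
    return row

for s in SIGMAS:
    print(f"sigma = {s:5.3f}  midpoint overlap = {midpoint_overlap(s):.3f}")
```

Under these assumptions the smallest widths leave essentially no light between neighboring pixels, while the largest width lets neighbors merge almost completely.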
Mean reading performances versus Gaussian function width
(σ) are shown in Figure 6 and compared to results obtained
with square pixelization. Four Gaussian widths (σ = 0.071,
0.143, 0.286, and 0.571 pixels) resulted in reading scores
above 94% correctly read words. These scores were very close
to those obtained with square pixelization (Fig. 6a). Mean
reading scores with σ = 0.143 and 0.286 pixels were found to
be significantly better than those obtained with square pixelization (P = 0.04 and 0.009, respectively). Reading scores
declined markedly below 80% for the two extreme Gaussian
widths tested (σ = 0.036 and 1.143 pixels).
Mean reading rates displayed a similar picture. A maximum
reading rate of 70 words/min was achieved at σ = 0.286 pixels.
This value is significantly higher (P < 0.001) than the reading
rate of 57 words/min achieved with square pixelization. Reading rates with σ = 0.143 and 0.571 pixels were not significantly different from those obtained with square pixelization.
For σ = 0.036, 0.071, and 1.143 pixels, reading rates declined
markedly (below 40 words/min).
Taken together, these data reveal that Gaussian pixelization
can lead to slightly, but significantly, better reading performance than its square counterpart. This suggests that some
degree of image smoothing resulting from overlap between neighboring pixels can be beneficial for reading. This
benefit is, however, only observed for a restricted range of
overlap.
FIGURE 6. Reading performance versus Gaussian function width (σ)
used for stimulus pixelization in six normal subjects. Results are compared with reading performances obtained with square-pixelized stimuli (dashed line, ± SEM). The resolution of the 10° × 7° viewing
window in central vision was kept constant at 572 pixels. (a) Mean
reading scores expressed in rau ± SEM (left scale) and in % (right
scale). (b) Mean reading rates expressed in words per minute ± SEM
(left scale).
Real-Time Gaussian Pixelization Versus Real-Time
Square Pixelization
Results of the second experiment demonstrated that off-line
Gaussian pixelization could lead to significantly better reading
performance than off-line square pixelization. A third experiment was thus dedicated to extending this comparison to real-time mode.
For this evaluation we would have preferred to use the “optimal”
Gaussian width (σ = 0.286 pixels) determined in the second
experiment. However, the total processing time needed to
simulate this condition turned out to be too long to
ensure adequate image stabilization on the retina. Using the
second best condition (σ = 0.143 pixels) allowed us to keep
processing time below 10 ms. The same six normal volunteers
who had participated in the second experiment were asked to read 10 text segments in each of two conditions: (1)
real-time Gaussian pixelization at σ = 0.143 pixels and (2)
real-time square pixelization. In both conditions, the 10° × 7°
viewing window contained 572 pixels. Three subjects started
with real-time square pixelization and then switched to real-time Gaussian pixelization. The remaining three subjects performed the experiment in the reverse order.
The results of this experiment are summarized in Table 1.
No significant difference in performance was recorded between the two types of pixelization. However, reading scores and
reading rates tended to be slightly higher with square pixelization. Comparing those real-time scores with their off-line counterparts gathered in the second experiment reveals that both
real-time conditions yielded better performance. This performance gain was significant for square pixelization (reading
scores: P = 0.003; reading rates: P = 0.008), but not for
Gaussian pixelization (reading scores: P = 0.12; reading rates:
P = 0.25).
DISCUSSION
The first experiment clearly shows that at low stimulus resolutions (below approximately 1000 pixels in a 10° × 7° viewing area) real-time square pixelization yields better reading
performances than its off-line equivalent. The major reason for
this performance improvement probably lies in the capability
of the visual system to integrate various low-resolution images,
enhancing stimulus contrast and resolution21 to improve perception. This effect is also used in standard video: when several
low-resolution images are presented in a rapid sequence, the
resultant perception is that of a continuous, higher-resolution
motion picture. In our experiments, at constant pixel resolution, the readability of pixelized text images depends on the
exact position of the pixelization grid relative to the original
stimulus image. Therefore, the image can be modified with
minor eye movements to optimize viewing conditions. Figure
7 illustrates this effect for a series of minor changes in grid
position. We observed that subjects quickly adopted this strategy: When resolution decreased, they increased the number of
small saccades around the word they were trying to decipher.
TABLE 1. Mean Reading Performances with Real-Time Stimulus Pixelization in Six Normal Subjects
                                        Gaussian Pixelization   Square Pixelization   P
Mean Reading Scores (rau ± SEM)         115.8 ± 3.6 (99.6%)     117.2 ± 3.4 (99.8%)   0.22 (ns)
Mean Reading Rates (words/min ± SEM)    69 ± 12                 74 ± 15               0.35 (ns)
Gaussian pixelization compared to square pixelization using a 10° × 7° viewing area containing 572 pixels.
Other effects are also likely to influence reading performance. Previous research on face recognition16–18,20,31 revealed that blocked images lead to poorer performance than
images filtered using other techniques, mainly because blocking
adds artifactual high-frequency components to the target image
that may mask essential features for identification. Real-time
pixelization does not have the same artifactual bias because
pixel movement acts as a low-pass filter that removes some of
these parasitic frequencies. This could also explain why in the
second experiment off-line Gaussian pixelization yielded better
reading performance than off-line square pixelization (for a
restricted range of Gaussian widths around σ =
0.286 pixels). Additional research, especially at lower resolutions, would be necessary to investigate other factors. It should
also be stressed that extreme Gaussian widths noticeably impaired performance. When very small Gaussian widths were
used, pixels appeared as isolated small points of light, making
it almost impossible to extract a cohesive picture. With large
Gaussian widths, overlap was too pronounced, leading to very-low-contrast stimuli.
Results of experiment 3 might appear surprising in light of
the findings of experiment 2: When using real-time processing,
the benefits of Gaussian pixelization vanished. In fact, this
outcome is not astonishing. Real-time processing had already
eliminated the major handicap of square pixelization. The
distracting high-frequency noise introduced at pixel borders is
low-pass filtered by pixel movement. We believe that the use of
the optimal Gaussian width σ = 0.286 pixels (instead of 0.143)
would not change this result fundamentally.
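The low-pass argument can be illustrated with a toy one-dimensional model. The sketch below block-averages a signal with the grid at several offsets and then averages the results, as a crude stand-in for the visual system integrating successive frames. The uniform averaging over offsets, the function names, and the ramp signal are assumptions made for illustration; they are not the authors' analysis.

```python
def block_average_1d(signal, block, offset=0):
    """Square pixelization of a 1-D signal with the grid shifted by `offset`."""
    sums, counts = {}, {}
    for i, value in enumerate(signal):
        cell = (i - offset) // block
        sums[cell] = sums.get(cell, 0.0) + value
        counts[cell] = counts.get(cell, 0) + 1
    return [sums[(i - offset) // block] / counts[(i - offset) // block]
            for i in range(len(signal))]

def integrate_offsets(signal, block):
    """Average the pixelized signal over every grid offset in [0, block)."""
    frames = [block_average_1d(signal, block, off) for off in range(block)]
    return [sum(frame[i] for frame in frames) / block
            for i in range(len(signal))]

def max_step(signal):
    """Largest jump between adjacent samples (a proxy for edge artifacts)."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))

ramp = [float(i) for i in range(12)]
single = block_average_1d(ramp, 4)       # one static grid position
moving = integrate_offsets(ramp, 4)      # grid positions averaged together
print(max_step(single), max_step(moving))
```

In this toy case the static grid produces large steps at block borders, while the average over shifted grids follows the original ramp much more smoothly, which is the kind of border-noise attenuation the paragraph above attributes to pixel movement.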
Implications of the Results for Simulations of
Artificial Vision
The exact characteristics of the electrophysiological response
of the retina to patterned electrical stimulation remain undetermined to this date. However, the use of 2-D Gaussian functions for stimulus pixelization is certainly a more physiologically pertinent approach than the use of square pixels (pixel
borders are smoother and it allows for overlapping between
neighboring pixels). As soon as the results of electrophysiological experiments on retinal tissue become available, the parameters of such 2-D Gaussian (or more adequate) functions should
be adapted. Our experiments also revealed that Gaussian width
is an important factor for readability, suggesting that stimulating current strength and electrode spacing might have to be
further “tuned” (within safe and comfortable limits) to achieve
the most efficient image transmission possible.
Real-time processing also allows for more realistic simulations of the visual information provided by retinal prostheses.
Our results demonstrated that it yields significantly better performance than its off-line counterpart. However, this benefit
was relatively moderate, not allowing for a significant reduction (e.g., by a factor of two) of the number of stimulation points.
Most probably, this advantage will be even less important in
visual prostheses with external head-mounted cameras, since
head movements are larger and less frequent than eye movements. Recurring head movements could also result in an
abnormal vestibulo-ocular reflex.
FIGURE 7. Illustration of the effect of the initial position of the pixelization grid on the readability of the pixelized word. A single position
does not provide enough information to identify the word unambiguously, but by integrating all three of them, the French word “niveau”
can be easily recognized.
The first visual prosthesis prototypes have been recently
implanted in humans with encouraging results.5–7 Yet, several
important challenges still need to be overcome before these
devices can provide benefits similar to those of cochlear implants in cases of deafness. The basic notion of patterned vision
resulting from the continuous stimulation of several electrodes
has not been fully confirmed. An appropriate method of selective stimulation eliciting the adequate psychophysical response
has not been developed yet. Another major problem is to
achieve efficient electrical stimulation within safe charge density limits.32 To reduce the total electrical charge injected on
the retina, the use of relatively large stimulation electrodes
(fundamentally limiting interelectrode spacing) as well as alternate solutions (such as inverted polarity, interleaved stimulation, and/or increasing the total area of the retinal array within
feasible limits) may be mandatory. A substantial research effort
is therefore still needed to solve these and other open issues
before realizing the level of electrode integration suggested by
our studies.
In conclusion, these results demonstrate that the spatial and
temporal characteristics of image pixelization play a role in
artificial vision simulations. Equivalent performance could be
reached with a resolution reduction of approximately 30%, if
stimulation parameters were adequate. This effect is not strong
enough, however, to change fundamentally the minimum requirements determined in our previous studies on the basis of
simplified processing:11,12 Four to five hundred contacts covering a 2 × 3-mm² retinal area are necessary to transmit
sufficient visual information for full-page text reading. Reading
is particularly important because it is strongly associated with
vision-related estimates of quality of life and represents one of
the main goals of low vision patients seeking rehabilitation.33–35 It is thus important to be aware of such minimal
conditions when developing visual prostheses, even if less
sophisticated devices might already bring some clinical benefits to patients.
Acknowledgments
The authors thank Andrew Whatham, PhD, for insightful contributions
and a critical review of the manuscript.
References
1. Rizzo JF, Wyatt J. Prospects for a visual prosthesis. Neuroscientist.
1997;3:251–262.
2. Normann RA, Maynard EM, Rousche PJ, Warren DJ. A neural
interface for a cortical vision prosthesis. Vision Res. 1999;39:
2577–2587.
3. Dobelle WH. Artificial vision for the blind by connecting a television camera to the visual cortex. ASAIO J. 2000;46:3–9.
4. Zrenner E. Will retinal implants restore vision? Science. 2002;295:
1022–1025.
5. Humayun MS, Weiland JD, Fujii GY, et al. Visual perception in a
blind subject with a chronic microelectronic retinal prosthesis.
Vision Res. 2003;43:2573–2581.
6. Veraart C, Wanet-Defalque MC, Gerard B, Vanlierde A, Delbeke J.
Pattern recognition with the optic nerve visual prosthesis. Artif
Organs. 2003;27:996–1004.
7. Chow AY, Chow VY, Packo KH, Pollack JS, Peyman GA, Schuchard
R. The artificial silicon retina microchip for the treatment of vision
loss from retinitis pigmentosa. Arch Ophthalmol. 2004;122:460–469.
8. Lecchi M, Marguerat A, Ionescu A, et al. Ganglion cells from chick
retina display multiple functional nAChR subtypes. Neuroreport.
2004;15:307–311.
9. Linderholm P, Bertsch A, Renaud P. Resistivity probing of multilayered tissue phantoms using microelectrodes. Physiol Meas.
2004;25:645–658.
10. Ziegler D, Linderholm P, Mazza M, et al. An active microphotodiode array of oscillating pixels for retinal stimulation. Sensors and
Actuators A: Physical. 2004;110:11–17.
11. Sommerhalder J, Oueghlani E, Bagnoud M, Leonards U, Safran AB,
Pelizzone M. Simulation of artificial vision: I. Eccentric reading of
isolated words, and perceptual learning. Vision Res. 2003;43:269–283.
12. Sommerhalder J, Rappaz B, de Haller R, Pérez Fornos A, Safran AB,
Pelizzone M. Simulation of artificial vision: II. Eccentric reading of
full-page text and the learning of this task. Vision Res. 2004;44:
1693–1706.
13. Weiland JD, Humayun MS, Dagnelie G, De Juan E, Greenberg RJ,
Iliff NT. Understanding the origin of visual percepts elicited by
electrical stimulation of the human retina. Graefes Arch Clin Exp
Ophthalmol. 1999;237:1007–1013.
14. Stett A, Barth W, Weiss S, Haemmerle H, Zrenner E. Electrical
multisite stimulation of the isolated chicken retina. Vision Res.
2000;40:1785–1795.
15. Rizzo JF, Wyatt J, Loewenstein J, Kelly S, Shire D. Perceptual
efficacy of electrical stimulation of human retina with a microelectrode array during short-term surgical trials. Invest Ophthalmol Vis
Sci. 2003;44:5362–5369.
16. Harmon LD, Julesz B. Masking in visual recognition: effects of
two-dimensional filtered noise. Science. 1973;180:1194–1197.
17. Bachmann T. Identification of spatially quantised tachistoscopic
images of faces: how many pixels does it take to carry identity? Eur
J Cogn Psychol. 1991;3:87–107.
18. Uttal WR, Baruch T, Allen L. A parametric study of face recognition
when image degradations are combined. Spat Vis. 1997;11:179–204.
19. Leeuwenberg E. Miracles of perception. Acta Psychol (Amst).
2003;114:379–396.
20. Bachmann T, Kahusk N. The effects of coarseness of quantisation,
exposure duration, and selective spatial attention on the perception of spatially quantised (‘blocked’) visual images. Perception.
1997;26:1181–1196.
21. Lappin JS, Tadin D, Whittier EJ. Visual coherence of moving and
stationary image changes. Vision Res. 2002;42:1523–1534.
22. Christie F, Bruce V. The role of dynamic information in the recognition of unfamiliar faces. Mem Cognit. 1998;26:780–790.
23. Lander K, Christie F, Bruce V. The role of movement in the
recognition of famous faces. Mem Cognit. 1999;27:974–985.
24. Thornton IM, Kourtzi Z. A matching advantage for dynamic human
faces. Perception. 2002;31:113–132.
25. Cha K, Horch KW, Normann RA. Mobility performance with a
pixelized vision system. Vision Res. 1992;32:1367–1372.
26. Cha K, Horch KW, Normann RA, Boman DK. Reading speed with
a pixelized vision system. J Opt Soc Am A. 1992;9:673–677.
27. Humayun MS. Intraocular retinal prosthesis. Trans Am Ophthalmol Soc. 2001;99:271–300.
28. Hayes JS, Yin VT, Piyathaisere D, Weiland JD, Humayun MS, Dagnelie G. Visually guided performance of simple tasks using simulated prosthetic vision. Artif Organs. 2003;27:1016–1028.
29. Thompson RW, Barnett GD, Humayun MS, Dagnelie G. Facial
recognition using simulated prosthetic pixelized vision. Invest
Ophthalmol Vis Sci. 2003;44:5035–5042.
30. Studebaker GA. A “rationalized” arcsine transform. J Speech Hear
Res. 1985;28:455–462.
31. Costen NP, Parker DM, Craw I. Spatial content and spatial quantisation effects in face recognition. Perception. 1994;23:129–146.
32. Brummer SB, Robblee LS, Hambrecht FT. Criteria for selecting
electrodes for electrical stimulation: theoretical and practical considerations. Ann N Y Acad Sci. 1983;405:159–171.
33. Wolffsohn JS, Cochrane AL. The changing face of the visually
impaired: the Kooyong low vision clinic’s past, present, and future. Optom Vis Sci. 1999;76:747–754.
34. Hazel CA, Petre KL, Armstrong RA, Benson MT, Frost NA. Visual
function and subjective quality of life compared in subjects with
acquired macular disease. Invest Ophthalmol Vis Sci. 2000;41:
1309–1315.
35. McClure ME, Hart PM, Jackson AJ, Stevenson MR, Chakravarthy U.
Macular degeneration: do conventional measurements of impaired
visual function equate with visual disability? Br J Ophthalmol.
2000;84:244–250.
