
Gaze Interaction on Rainy Days
Eye tracking is finally becoming a thing. Well, it’s been a thing for accessibility for a while, but now we have mixed reality headsets with eye trackers. Which is also not that new, but now they are built and sold by the big guys. Like it or not, that makes people pay attention to it.
The exciting thing about eye tracking in mixed reality is that we can now explore possibilities for gaze interaction in 3D. Ok, that’s also not that new, head-mounted eye trackers have also been around for a while. But let me be excited about it, ok?
So I’ve been thinking about ways to use gaze in mixed reality. A common use case is to have a layer of interactive objects that the user can interact with using their gaze. Apple Vision Pro has something like that: objects are selected through a multimodal interaction that combines the user’s gaze and hand movements (you look at a target and pinch to select it).
My colleagues at UIUC published a paper presenting a very interesting technique that lets the user perform selections by changing their gaze depth. After seeing that, I had an idea that I’m not going to work on anytime soon, so I’m writing it down here.
Gaze-based interaction
Interacting with the computer using our eyes may sound all fun at first (or maybe you just find it creepy, invasive and useless; I respect that). But as we start to actually design these systems, we realize that most of the time we move our eyes unconsciously. We quickly learn that our eyes are great at reacting to things, but deliberately controlling them is not as easy as it seems.
I’ll show you an example. You will find two circles below. One is static and the other is dynamic. Try “drawing” each of them with your eyes. You can enable the webcam to record yourself if you want (don’t worry, these videos WILL NOT leave your computer, they are stored in memory only).
If you don’t want to enable the webcam, you can just try to draw the circles and pay attention to how your eyes are moving. I recorded myself doing it (sorry, I know it’s a bit weird looking at other people’s eyes, but I just want to show you something cool). I have slowed it down a bit so it’s easier to see what is going on. Can you guess which circle I was looking at in each video?
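(If you’re curious how a demo like this can be built: the dynamic circle can be as simple as a dot orbiting a circular path. Here’s a minimal sketch assuming an HTML canvas with id "stimulus", which is a made-up id; the actual demo may well be implemented differently.)

```ts
// Minimal smooth-pursuit stimulus: a dot orbiting a circular path.
// Assumes an HTML canvas with id "stimulus" (an illustrative id for this sketch).
const canvas = document.getElementById("stimulus") as HTMLCanvasElement;
const ctx = canvas.getContext("2d")!;

const center = { x: canvas.width / 2, y: canvas.height / 2 };
const RADIUS = 100;     // px: radius of the circular path
const REV_PER_S = 0.25; // slow enough for the eyes to lock onto the dot

function draw(timeMs: number): void {
  const angle = 2 * Math.PI * REV_PER_S * (timeMs / 1000);
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  ctx.beginPath();
  ctx.arc(center.x + RADIUS * Math.cos(angle),
          center.y + RADIUS * Math.sin(angle),
          8, 0, 2 * Math.PI); // the moving dot to follow
  ctx.fill();
  requestAnimationFrame(draw);
}
requestAnimationFrame(draw);
```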

Instructions to use the demo below:
- Click/tap on the circle you want to draw;
- Move the mouse/finger to the “Start” button (and keep it there, because the same button will be used to stop the recording);
- Position your face in the circle;
- Click/tap on the “Start” button;
- “Draw” the contour of the circle with your eyes once or twice;
- Click/tap on the “Stop” button;
- You can delete the video and record again if you want: just click/tap on the button with the trash icon and repeat the process.
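By the way, if you’re wondering how the recordings can stay in memory only: the browser’s getUserMedia and MediaRecorder APIs make this straightforward. Here’s a rough sketch of the idea; I’m not claiming this is the demo’s actual code, and the function name is just illustrative.

```ts
// Sketch of in-memory webcam recording with getUserMedia + MediaRecorder.
// Nothing is uploaded: the chunks live in a local array and are played back
// through a blob: URL.
async function recordInMemory(videoEl: HTMLVideoElement): Promise<() => void> {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  const recorder = new MediaRecorder(stream);
  const chunks: Blob[] = [];

  recorder.ondataavailable = (e) => chunks.push(e.data);
  recorder.onstop = () => {
    stream.getTracks().forEach((track) => track.stop()); // release the camera
    const blob = new Blob(chunks, { type: recorder.mimeType });
    videoEl.src = URL.createObjectURL(blob); // playback never leaves the page
  };

  recorder.start();
  return () => recorder.stop(); // call this when "Stop" is clicked
}
```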
What happened here?
In case you are curious, I was looking at the dynamic circle in the first video and the static circle in the second. How can you tell? You may notice that in the second video, my eyes are moving in a more “jumpy” way. These movements are called saccades. They are the kind of movement we make most of the time.
In the first video, the movements are smoother. These are called pursuit movements, and they require a moving target: we can’t trigger this kind of movement without something to follow. Did you just try it? I know, I also tried after learning this fact.
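This difference is big enough that you can tell the two movements apart programmatically. A common approach is velocity thresholding (in the spirit of I-VT-style algorithms): saccades are very fast, often hundreds of degrees per second, while pursuit roughly matches the target’s speed, usually a few tens of degrees per second at most. Here’s a toy sketch with illustrative thresholds; real classifiers are more careful about noise and sampling rate.

```ts
// A gaze sample: position in degrees of visual angle, timestamp in seconds.
type Sample = { x: number; y: number; t: number };
type Movement = "fixation" | "pursuit" | "saccade";

// Toy velocity-threshold classifier. The thresholds are illustrative only;
// real implementations tune them and smooth the velocity signal first.
const SACCADE_THRESHOLD = 100; // deg/s: saccades usually far exceed this
const FIXATION_THRESHOLD = 5;  // deg/s: below this the eye is basically still

function classify(samples: Sample[]): Movement[] {
  const labels: Movement[] = [];
  for (let i = 1; i < samples.length; i++) {
    const dt = samples[i].t - samples[i - 1].t;
    if (dt <= 0) { labels.push("fixation"); continue; } // skip bad timestamps
    const speed =
      Math.hypot(samples[i].x - samples[i - 1].x,
                 samples[i].y - samples[i - 1].y) / dt; // angular speed, deg/s
    if (speed > SACCADE_THRESHOLD) labels.push("saccade");
    else if (speed > FIXATION_THRESHOLD) labels.push("pursuit"); // smooth, moderate speed
    else labels.push("fixation");
  }
  return labels;
}
```

In practice, separating pursuit from noisy fixations is the hard part; this is just to make the velocity intuition concrete.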
Ok, so what’s the big idea?
The whole point of this discussion is to argue that visual cues are important for gaze interaction. So the problem I was focused on is: how can we provide a visual cue for the user that a layer of interactive objects is available without disrupting their field of view?
That’s when it occurred to me that our brains are very good at ignoring things that are right in front of our eyes. For example: suppose you go to the aquarium and see some beautiful fish swimming around. You decide to take a picture of them, but when you look at the picture later, this is what you find:

What happened here? The glass of the aquarium reflected the light, and the camera captured the reflection instead of the fish. But when you were there, you didn’t see the reflection, did you? That’s because your brain ignored it.
Want another example? Have you noticed your nose lately? Made you look? We learned to ignore our nose because it’s always there and it’s not very useful visual information most of the time. But now that I mentioned it, you can’t unsee it, can you?
This same effect happens with raindrops on the window. Assuming it’s not raining too hard, we can see through the window even with the raindrops on it. We can even ignore that the raindrops are there. But if we choose to, we can focus on them and see their details clearly.
So, with all that said, my idea is simply this: what if we used raindrops on a virtual window right in front of the user as a visual cue that a layer of interactive objects is available? The user can choose to ignore the raindrops and focus on the content behind them, or they can focus on the raindrops and interact with the objects.
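The interesting technical question is how to decide which layer the user is focusing on. With a binocular eye tracker, one option is to estimate gaze depth from vergence: the two gaze rays converge near the point the user is fixating. Here’s a crude sketch of that idea, with made-up names and a hypothetical window distance of 0.5 m; the gaze-depth paper I mentioned does something much more robust.

```ts
type Vec3 = { x: number; y: number; z: number };

const sub = (a: Vec3, b: Vec3): Vec3 => ({ x: a.x - b.x, y: a.y - b.y, z: a.z - b.z });
const dot = (a: Vec3, b: Vec3): number => a.x * b.x + a.y * b.y + a.z * b.z;

// Estimate gaze depth as the distance (along the left gaze ray) to the point
// where the two eye rays pass closest to each other. Directions must be
// unit vectors for the result to be in meters.
function gazeDepth(leftEye: Vec3, leftDir: Vec3, rightEye: Vec3, rightDir: Vec3): number {
  // Standard closest-point-between-two-rays computation.
  const w = sub(leftEye, rightEye);
  const a = dot(leftDir, leftDir);
  const b = dot(leftDir, rightDir);
  const c = dot(rightDir, rightDir);
  const d = dot(leftDir, w);
  const e = dot(rightDir, w);
  const denom = a * c - b * b; // ~0 when the rays are parallel (gaze at infinity)
  if (Math.abs(denom) < 1e-9) return Infinity;
  return (b * e - c * d) / denom;
}

// Hypothetical layer test: a raindrop "window" rendered 0.5 m away, with a
// generous tolerance because vergence-based depth estimates are noisy.
const WINDOW_DEPTH_M = 0.5;
const TOLERANCE_M = 0.15;

function isFocusingOnRaindrops(depthM: number): boolean {
  return Math.abs(depthM - WINDOW_DEPTH_M) < TOLERANCE_M;
}
```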
Would it be uncomfortable? Would it be useful? I don’t know. But I think it’s a cool idea. Maybe I’ll try it someday. We’ll see.