Vision Is a Series of “Snapshots” – Movie Analogy

October 13, 2019; revised April 4, 2022; May 16, 2023; August 19, 2023 (#3)

It is essential to understand that the mind can capture only one sensory input at a time. This helps one understand that no “person” is involved in Paṭicca Samuppāda in an ULTIMATE SENSE.

Vision – How Do We See an Object?

1. Vision or “seeing” appears to us as continuous. We see people moving around, vehicles moving, animals running around, etc. However, in reality, “seeing” happens due to a series of “snapshots” our physical eyes take. Please bear with me as I set the stage with the following Pāli terms. Knowing these Pāli terms in detail is unnecessary; try to get the basic idea. If you have not read the post, “Seeing Is a Series of ‘Snapshots,’” it could be helpful to read that first.

A key idea behind Buddha Dhamma is that we experience only one citta (loosely translated as a thought) at a time and that citta focuses on ONE ārammana. In other words, while the mind registers a visual event, it cannot hear, smell, taste, or feel a touch. The keyword ārammana was introduced in the post, “Vipāka Vēdanā and “Samphassa jā Vēdanā” in a Sensory Event.”
“Seeing” does not happen continuously since the mind can process only one cakkhudvāra citta vithi (with 17 cittā) at a time. The mind processes that cakkhudvāra citta vithi with three more manōdvāra citta vithi. At the end of those citta vithi, the mind has captured a “snapshot” of the object and recognized it. Those four citta vithi define one “snapshot” of a moving object.
Our “seeing of a moving external object” involves many “snapshots” within seconds. Our perception of a moving object results from many such “snapshots.” We do not see the individual “snapshots.”

Movie Analogy – Series of Snapshots

2. We can simplify and understand the above process using an analogy. What I stated above is — in principle — what happens when we watch a movie.

To make a movie, a video camera captures many static pictures (snapshots) of a scene. Those snapshots are then projected onto a screen at a specific rate. If the playback speed is too slow, we can see individual pictures, but above a certain “projection rate,” it looks like natural motion. Here is a video that illustrates this well:

A movie projector projects static pictures to the screen at a rate of about 30 frames a second, and we see the movie as a continuous progression of events. If the projection rate is low, we can see it frame by frame or as individual “snapshots.” We do not perceive those static pictures when projected at 30 frames a second; instead, we perceive a continuous progression without any gaps.
More details in the post, “Citta and Cetasika – How Viññāṇa (Consciousness) Arises.”
That is why the Buddha said that the mind (or viññāṇa) is a magician. We perceive a streamlined world, even though the reality is that our sensory faculties detect only a series of “snapshots.” It is the mind that conceals the reality and gives us a perception of a continuous progression of events.
It is critical to understand this point. It helps get rid of sakkāya diṭṭhi; see “Chachakka Sutta – Six Types of Vipāka Viññāna.”

Mind and the Brain – Two Different Entities

3. In an early post in this series, I pointed out that cakkhāyatana and cakkhu pasāda rūpa are not the physical eyes. Visual signals received by the eyes are processed by the brain and detected by the cakkhu pasāda rūpa; see #12 of “Buddhist Worldview – Introduction.”

That cakkhu pasāda rūpa (or simply cakkhu) is part of the gandhabba, our “mental body.” The gandhabba has the seat of the mind (hadaya vatthu), surrounded by the five pasāda rūpa corresponding to vision, hearing, taste, smell, and touch.
When our physical eyes capture an image of an external object, that image goes to the visual cortex in the brain. The signal is processed there and then transmitted to the cakkhu pasāda rūpa, making contact with the hadaya vatthu. That contact (phassa) leads to the arising of cakkhu viññāṇa at the hadaya vatthu. More details in “Brain – Interface between Mind and Body.”
By the way, that is the step, “cakkhuñca paṭicca rūpe ca uppajjati cakkhuviññāṇaṃ” discussed in #7 in the post, “Contact Between Āyatana Leads to Vipāka Viññāṇa.”

4. Thus, the brain works like a computer. It converts the image from the eyes to a form “processable” by the hadaya vatthu (seat of the mind). Vision, therefore, involves a somewhat complex process.

Similar processes take place for the other four sensory events. For example, when the physical ears capture a sound, that signal goes to the auditory cortex in the brain for processing. The processed signal then goes to the sōta pasāda rūpa, which makes contact with the hadaya vatthu to transfer that information. That gives rise to sōta viññāṇa via, “sōtañca paṭicca sadde ca uppajjati sotaviññāṇaṃ.”

Reviewing the Whole Series Could Be Helpful

5. It may take some effort to understand this sequence of events. But it is necessary to comprehend the overall process before we get to the next post.

It is good to print all the posts in the “Worldview of the Buddha” subsection and review them carefully.
It is unnecessary to understand the DETAILS of #6 and #7 below. But it is good to get the general ideas involved. I am providing this information to illustrate the following. New findings in science are not only compatible with Buddha Dhamma but also help explain critical concepts in Buddha Dhamma.

The Brain Processes Visual Signals at About 30 Frames per Second

6. A recent study has reported that the minimum time to recognize a static picture is about 13 milliseconds (Ref. 1). That means we should be able to see such snapshots projected at up to about 77 frames per second (1000/13). However, that is probably “pushing it” and uncomfortable for the brain to handle. That is why movies use a projection rate of about 30 frames per second, as mentioned in #2 above.

Interestingly, neural information takes about 15 to 30 milliseconds to reach the brain (References 49, 50 in Ref. 2). Therefore, a projection rate of 30 to 50 frames per second is compatible with that measurement too.
A millisecond is a thousandth of a second.
Also, note that the eyes do not capture an image “in one shot.” An image is built up from many frames captured through automatic “saccadic movements of the eyes.” See the Wikipedia article “Saccade.”
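The arithmetic in #6 can be checked with a short Python sketch. It only converts the per-frame times quoted above into frame rates; the function name `max_frame_rate` is made up for this illustration and does not come from the cited studies.

```python
def max_frame_rate(ms_per_frame: float) -> float:
    """Return frames per second if each frame takes ms_per_frame milliseconds."""
    return 1000.0 / ms_per_frame


# 13 ms per picture is the recognition time quoted from Ref. 1.
recognition_fps = max_frame_rate(13)

# 15 to 30 ms is the neural transit time quoted from Ref. 2.
transit_fps_fast = max_frame_rate(15)
transit_fps_slow = max_frame_rate(30)

# Duration of one frame at the 30 frames per second mentioned in #2.
frame_ms_at_30_fps = 1000.0 / 30

print(f"Recognition limit: about {recognition_fps:.0f} frames per second")
print(f"Transit-time range: about {transit_fps_slow:.0f} to {transit_fps_fast:.0f} frames per second")
print(f"One frame at 30 fps lasts about {frame_ms_at_30_fps:.1f} ms")
```

Under these figures, a 13-millisecond picture corresponds to roughly 77 frames per second, and a transit time of 15 to 30 milliseconds corresponds to roughly 33 to 67 frames per second, which overlaps the range mentioned above.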

The Same Analysis Holds for the Other Four Physical Senses

7. A similar set of rules is valid for hearing as well. Another recent study (Ref. 2) found that sounds could be recognized at rates up to 30 sounds per second. That corresponds to a “sound packet” of about 33 milliseconds that can be detected and recognized.

However, people speak at a much slower rate of about 150 words per minute. That is about 2.5 words per second, much less than the 30 sounds per second that would be possible according to the above study. So, there is no problem with hearing what other people say, even if someone talks faster than the average rate.
Currently, no studies are available from science for the other three sensory events (taste, smell, and body touches). But the same process holds for those as well.
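The comparison in #7 can be sketched the same way. This is only an illustration under the simplifying assumption made in the text that one spoken word counts as one recognizable “sound”; the variable names are hypothetical.

```python
WORDS_PER_MINUTE = 150        # average speaking rate quoted above
MAX_SOUNDS_PER_SECOND = 30    # recognition limit quoted from Ref. 2

# Convert the speaking rate to words per second.
words_per_second = WORDS_PER_MINUTE / 60

# Length of one "sound packet" at the recognition limit, in milliseconds.
packet_ms = 1000 / MAX_SOUNDS_PER_SECOND

# How many times faster than normal speech the limit is
# (assumes one word equals one "sound").
headroom = MAX_SOUNDS_PER_SECOND / words_per_second

print(f"Speech rate: {words_per_second} words per second")
print(f"Sound packet: about {packet_ms:.0f} ms")
print(f"Headroom: {headroom:.0f} times the average speech rate")
```

With these numbers, average speech uses only about one-twelfth of the rate at which sounds can be recognized.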

Aside – Cognition (Saññā) Requires More Than Detection

The following points (#8, #9) are “asides.” They are not essential, but they could help those familiar with Abhidhamma.

8. We must remember that “experiencing a sensory input” is much more complex than just receiving that sensory input. For example, the mind needs to see an object or hear a sound, recognize what it is, and generate a vēdanā.

For example, upon hearing the sound “apple,” the mind needs to know what an “apple” is. Someone who does not speak English would not know what is meant by the word “apple.” But those who speak English AND have had the experience of eating apples would have MEMORIES of those. Therefore, the mind must compare the received sensory input with memories to recognize it!
The mind does that very fast with the help of the manasikāra cētasika. As you may know, manasikāra is one of the seven universal cētasika that arises with each citta. Thus, the mind can recognize a sensory input instantaneously as soon as it receives a “data packet.”
More details in “Citta and Cetasika – How Viññāṇa (Consciousness) Arises.”

Aside – Process In Abhidhamma Language

9. Actual “seeing” or vision takes place at the hadaya vatthu. The same is true for the other four types of sensory events. For example, consider a “packet of data” sent from the physical eye to the brain. The brain processes that information and transmits it to the cakkhu pasāda. As you may remember, the five pasāda rūpa (cakkhu, sōta, ghāna, jivhā, kāya) surround the hadaya vatthu. Now the cakkhu pasāda makes contact with the hadaya vatthu by hitting it. That causes the hadaya vatthu to vibrate 17 times, much as a gong struck by an iron rod vibrates a fixed number of times.

The 17 vibrations of the hadaya vatthu correspond to the 17 cittā in a citta vithi. Such a citta vithi is a pañcadvāra citta vithi because one of the five physical senses or pañcadvāra (“pañca” or five + “dvāra” or “door”) initiates it.
Imagine a blade clamped at one edge and hit on the un-clamped side. The blade will vibrate. It vibrates a certain FIXED number of times. For a given material, that is a fixed number.
The same happens when a pasāda rūpa strikes the hadaya vatthu. The hadaya vatthu vibrates 17 times, with each vibration leading to the arising of a citta. That is the origin of a citta vithi with 17 cittā. Those 17 vibrations are a form of energy called a hadaya rūpa.

10. The misconception that any rūpa has a lifetime of 17 thought moments arose because of not understanding the difference between a rūpa (the image of an external object) and a hadaya rūpa (which is just the 17 vibrations of the hadaya vatthu).

In other words, this information packet is received and processed by the hadaya vatthu within the 17 cittā. The information is complete by the fourth citta (fourth vibration of the hadaya vatthu), and then the rest of the cittā in that citta vithi deal with this information. Three more citta vithi run by the hadaya vatthu itself complete the process. The additional citta vithi, initiated by the mind, are manōdvāra citta vithi. Here, manōdvāra means the “mind-door.”
Details of #9 and #10 at “Does any Object (Rupa) Last only 17 Thought Moments?”

The Mind Is Fast, and the Brain Is Slow

11. Thus, we can see a vast difference in time between the two processes involved. The physical body takes on the order of 10 milliseconds to acquire data. The mind processes that information within a billionth of a second (using one pañcadvāra citta vithi and three manōdvāra citta vithi).

Even if the five senses keep sending data continuously, the mind is “just sitting there” most of the time. Let us examine this in more detail. Suppose the brain keeps sending data from the eye non-stop. Since each “packet” takes, say, ten milliseconds, there will be 100 “data packets” of vision coming in each second. If the brain is going at full speed, it can send at most 500 (=100×5) “data packets” from all five physical senses in a second. The mind will then spend less than a millionth of a second processing all that data. Thus, if we add up the actual “active times of the mind” for a movie that lasts two hours, the total is probably less than a second.
In other words, the brain spends a lot of energy processing the data streams during a two-hour movie. But the “seat of the mind” or the hadaya vatthu absorbs that information at an unimaginable speed. That is why we might only get a headache watching too many movies.
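The estimate in #11 can be worked through in a few lines of Python. This is a rough sketch using only the figures given above: ten milliseconds per packet, five senses, and a billionth of a second (treated here as an upper bound) for the mind to process one packet. The constant names are hypothetical.

```python
PACKET_MS = 10                    # time for the body and brain to deliver one packet
SENSES = 5                        # five physical senses
MIND_SECONDS_PER_PACKET = 1e-9    # assumed upper bound: a billionth of a second
MOVIE_SECONDS = 2 * 60 * 60       # a two-hour movie

# Packets arriving per second from one sense, then from all five.
packets_per_sense = 1000 // PACKET_MS
packets_per_second = packets_per_sense * SENSES

# Time the mind is active during one second of input.
mind_active_per_second = packets_per_second * MIND_SECONDS_PER_PACKET

# Total active time of the mind over the whole movie.
mind_active_for_movie = mind_active_per_second * MOVIE_SECONDS

print(f"Packets per second (all senses): {packets_per_second}")
print(f"Mind active per second of input: {mind_active_per_second:.1e} s")
print(f"Mind active over two hours: {mind_active_for_movie * 1000:.1f} ms")
```

Under these assumptions, the mind is active for about half a millionth of a second per second of input, and for only a few milliseconds over the whole two-hour movie, well within the “less than a second” stated above.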

12. During those “gaps,” the hadaya vatthu also interacts (both ways) with the mana indriya in the brain. In particular, it gives instructions to the brain (via mana indriya) on how to control the physical body in response to sensory inputs.

Thus, for the most part, the mind (or, more precisely, the hadaya vatthu) is sitting there idly. That “idle state” of the mind is the “bhavaṅga” state.
A key point here is that the mind spends only a VERY SHORT TIME experiencing the SENSORY INPUTS. There is no “self” watching a movie. The mind gives the illusion that a “self” is watching the movie. Details are in the next post, “Chachakka Sutta – Six Types of Vipāka Viññāṇa.”
The above is a very brief discussion. Of course, there are more details, but one can hopefully get the basic idea. Please ask questions if something is not clear. It is critical to understand this post.

Summary

13. The critical point embedded in the Chachakka Sutta (MN 148) is that there is no “self” experiencing the external world. It is just a series of events, and the mind MAKES IT APPEAR that a “person” is watching a movie. We discussed the initial steps in sensory events addressed by that sutta.

The key message in the sutta is that the mind DOES NOT experience the external world CONTINUOUSLY. Instead, the mind is active only for brief periods when receiving inputs from the five pasāda rūpa. As mentioned above, the brain is “on” much longer than the mind. Once the brain processes information packets, the mind absorbs that information within a “blink of an eye.”
On the other hand, the brain has a heavy workload while watching a movie. It has to process audio and video inputs rapidly for the movie’s duration. One could get a headache if one watches two movies at a stretch. But even during that time, the mind is mainly in the bhavaṅga state. There is no “self” watching the movie. It is just a series of events taking place. The mind is putting all those “events” together and giving the appearance of a continuous progression of events. Thus, one perceives “I am watching a movie” and NOT “watching a series of static pictures.”
Details are in the next post, “Chachakka Sutta – Six Types of Vipāka Viññāṇa.”
Later, we will discuss why it is also incorrect to say there is “no-self.” As long as Paṭicca Samuppāda is not understood, there is a “self” or a “person” going through the rebirth process and experiencing much suffering!

REFERENCES

1. M. C. Potter et al., “Detecting Meaning in RSVP at 13 ms per Picture,” Attention, Perception, & Psychophysics, vol. 76, pp. 270-279 (2014).
2. V. Isnard et al., “The time course of auditory recognition measured with rapid sequences of short natural sounds,” Scientific Reports, vol. 9, pp. 1-10 (2019).
