VIRTUAL REALITY

LESSON 4 - REALITY OF A VIRTUAL WORLD

What is it that makes a world "real"? There are a number of factors which contribute to the feeling of "realness" in any artificial experience. All of these revolve around the information imparted through the senses; however, only some of them deal with how closely the received information approximates what would be experienced in the real world. For example, background music in a movie can impart a mood and create a realism within the viewer’s mind greater than the same scene with only the actual sounds.

Because in most virtual worlds we are creating the sense inputs rather than recording actual ones, we will examine how our senses perceive and interpret information.

VISUAL

The majority of our information is received as visual information through our eyes. To understand how this takes place we need to look at the structure and operation of our optic systems. Light enters the eye through a lens which focuses it upon the back of the eyeball, where light sensitive nerve cells convert it into electrical signals which are transmitted to the brain. These nerve cells are of two types: rods and cones. Cones are sensitive to colors. They are concentrated in the central viewing field and require a higher level of light than the rods, which can only distinguish light levels rather than colors. Exactly how our color vision system functions is not fully understood, but it is well established that there are three sets of sensors for three primary colors. Rod cells are more sensitive to low light levels and are spread across the rest of the back of the eye. Note that at the edge of your field of vision you can notice movement and changes of light level, but to see the color of an object you look at it. In low light levels you can see objects, detect their shapes and outlines, and see movement, but color information is lacking. This gave rise to the expression, "All cats are gray in the dark."

Another interesting fact is that you have a blind spot in your field of vision. This is the point at which the optic nerve bundle leaves the rear of the eye, so there can be no sensor cells there. The spot lies somewhat to the side of the center of the field of view. If you look with one eye at point sources of light, such as the stars, you may note that one falling on this spot will disappear. If you shift your gaze slightly it will reappear.

The eye can accommodate an extremely wide range of light intensities. This is accomplished to some extent by the adjustment of the size of the iris which admits light to the lens. As you enter a darkened room, the iris dilates, admitting more light. When you enter into bright sunlight, the iris contracts, restricting the light. This is aided by changes in sensitivity of the sensors. When emerging from a dark room into bright light, it takes some time for the eye to change its sensitivity. When measured, this change in sensitivity follows an exponential curve like that of the voltage on a charging capacitor. In an exponential curve the rate of change is characterized by a time constant. After one time constant the curve has reached 63% of its final value; after two, 86%; after three, 95%; after four, 98%; and after five, 99%. Although the curve never reaches its final value, after three to five time constants it is effectively there. For moving from a dark room into bright light, the eye has a time constant of about 10 seconds. When moving in the other direction - from bright light into a darkened room - the time constant is about 10 minutes. Thus when you come from sunlight into a darkened theater it requires a half hour or more for your eyes to fully adapt.
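The percentages quoted above can be checked numerically. The following is a minimal sketch, assuming adaptation follows the ideal curve 1 - e^(-t/tau) exactly; the function name fraction_adapted is introduced here for illustration, and the time constants are the approximate figures given in the text.

```python
import math

def fraction_adapted(elapsed, time_constant):
    """Fraction of the final value reached after `elapsed` time units,
    assuming a pure exponential approach: 1 - exp(-t / tau)."""
    return 1.0 - math.exp(-elapsed / time_constant)

# Percent of the final value after one to five time constants.
percent_by_constant = [round(100 * fraction_adapted(n, 1.0)) for n in range(1, 6)]

# Dark to bright: tau of about 10 seconds, so 30 seconds is three time constants.
bright_after_30_s = fraction_adapted(30.0, 10.0)

# Bright to dark: tau of about 10 minutes, so 30 minutes is also three.
dark_after_30_min = fraction_adapted(30.0, 10.0)

print(percent_by_constant)
print(round(bright_after_30_s, 3), round(dark_after_30_min, 3))
```

Both cases reach the same fraction after three time constants; only the unit of time differs, which is why thirty seconds suffices in one direction and thirty minutes is needed in the other.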

When you see a scene the image is not recorded and stored in complete detail. In fact when you look quickly at an image you do not "see" all of the detail at once. Images are stored in some form of "iconic" storage, transferred to short term memory, and then to long term memory. At each stage, information is processed and converted from one form into another. Often what you think you see is not the same as what is actually there. We will examine some of the influences and effects which can alter what you "see".

The mind relies upon matching to its own knowledge base to fill in details of images. There is an experiment run by a psychology professor in which, during his lecture, someone runs on stage and aims a banana at him; there is the sound of a shot and he falls. Afterwards each observer is asked to write a description of what happened. Almost without exception no one notices the banana. Instead they firmly believe the man held a handgun. This is largely because the mind has an established association between these actions and sounds and someone firing a gun. In any situation there is a tendency to try to fit what we are seeing into a pattern compatible with our previous experiences. This effect is widely used when building a virtual world: we don’t have to show a detailed image of an object if we can make it a sufficient representation. The viewer’s mind supplies the details. A door, for example, can be depicted as a shaded rectangle set at an angle to another outline drawing of a rectangle. We rely on the viewer’s experience with real doors to supply the details of both appearance and function.

If an image is unclear, we tend to match it to some form with which we are familiar. The human ability to recognize patterns is extraordinary. In the 1970s some work was done at Utah with blind patients undergoing brain surgery. When electrodes were connected directly to the surface of the brain and driven with an electrical pulse, a point of light was perceived by the patient. An array of 64 such electrodes (8 x 8) was attached and driven by a computer. Using only this limited number of points it was possible for the patient to recognize an individual face.

Pattern recognition is a useful ability when viewing in poor conditions such as fog, rain, or low light levels. The ability to distinguish between an antelope and a sabre tooth tiger at dusk could mean survival to early man. It is logical how such a talent could evolve. Even in contemporary jobs it is useful. For example, a lawyer skimming hundreds of pages of material can spot a reference to his particular subject without reading each word. Or a security guard can spot something out of place without detailed examination of his entire surroundings. In designing a virtual world we can use this ability by using less detailed and incomplete representations of objects. If the objects are familiar to the viewer he can interpret them as what they represent.

Another interesting insight into how the mind interprets images can be obtained from examining optical illusions. For example, if two straight lines of equal length are placed horizontally one above the other and arrowheads are added, one line with the arrowheads pointing out from the ends of the line and the other with the arrowheads pointing in toward the center of the line, the viewer will see the lines as different lengths. The line with the arrowheads pointing out appears shorter because the trailing angles of the arrows tend to pull the observer’s vision in toward the center of the line, thus shortening its apparent length. Another example can be made with two circles of equal size. If, in one case, the circle is surrounded by several others which are larger, and in a second case the circle is surrounded by several smaller circles, the original two will appear unequal. The circle surrounded by larger items appears to be smaller than the other because we tend to compare what we see with nearby objects.

In a similar fashion to the optical illusions, the mind sees clues which indicate distance. If we wish to present a three dimensional world using two dimensional drawings we can take advantage of these clues to distance. James Gibson in his book THE PERCEPTION OF THE VISUAL WORLD defined thirteen varieties of perspective.

1. Texture. The density of the texture of a surface increases as it recedes into the distance.

2. Size. Objects of equal size appear smaller at a distance.

3. Linear. Parallel lines appear to meet at a great distance.

4. Binocular. Because of the separation of the two eyes, each receives a different image. The difference between these two images is much greater at close distances.

5. Motion. A moving object appears to move more slowly at a greater distance.

6. Aerial. When looking over a long distance the haze of the atmosphere tends to fuzz the outlines of objects. Likewise there are distortions of color when viewing objects at far distances.

7. Blur. If the focus of the eye is on a close object, objects at greater distances appear blurred. The greater the distance removed from the focus point, the greater the blur.

8. Relative upward location. When looking at a scene we tend to associate objects high in the vision field with distance. One looks up to see the horizon and down to see the path in front of him.

9. Shift of texture or spacing. When looking down at a valley from the edge of a cliff there is a sharp break in the gradually increasing texture density observed when looking over a flat plane. Blades of grass may be individually seen near the observer, but in the distant valley an entire lawn is no wider than the single blades in the foreground.

10. Change in the amount of double imagery. When looking at a distant point everything closer will be doubled. The farther from the viewer, the less this doubling.

11. Shift in the rate of motion. When the viewer moves and two overlapping objects do not appear to move relative to each other it must be because they are so far away that the shift is not perceived.

12. Continuity of outline. It is assumed that a near object will block the outline of a distant object. If the complete or continuous outline of an object is not visible, the object must be at a distance. This has been exploited to a high degree in the art of camouflage.

13. Transitions between light and shade. An abrupt change in brightness is interpreted as an edge. As a person looks across a field at a cliff, he will note a sharp change in the brightness of the scene at the cliff edge.

These cues can be used by the virtual world designer to impose a three dimensional interpretation to a two dimensional image. Even without the 3-D head mounted display an illusion of depth can be given. If unequal sized objects are placed on a plain background it will appear that they are of different size but at the same distance from the viewer. If, however, a patterned floor is added with a change in linear spacing and texture density, it will appear that the objects are all of equal size but at different distances, the largest being the nearest.
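The size and texture cues can be illustrated with a simple pinhole projection. This is a minimal sketch, assuming a viewer looking horizontally over a flat floor; the function names, the focal length of 1.0, and the eye height of 1.5 are hypothetical values chosen for illustration and do not come from Gibson's book.

```python
def projected_size(true_size, distance, focal_length=1.0):
    """Size of an object on the image plane under pinhole projection."""
    return focal_length * true_size / distance

def floor_line_offsets(eye_height, distances, focal_length=1.0):
    """How far below the horizon each floor line appears on the image plane."""
    return [focal_length * eye_height / d for d in distances]

# Size cue: three objects of equal true size at increasing distances.
sizes = [projected_size(2.0, d) for d in (2.0, 4.0, 8.0)]

# Texture cue: floor lines equally spaced in depth crowd together on the
# image as they approach the horizon.
offsets = floor_line_offsets(1.5, [1.0, 2.0, 3.0, 4.0])
gaps = [near - far for near, far in zip(offsets, offsets[1:])]

print(sizes)  # [1.0, 0.5, 0.25]
print(gaps)   # [0.75, 0.25, 0.125]
```

Drawing objects at the sizes in the first list on a floor whose line spacing shrinks as in the second list is what lets the viewer read unequal drawn sizes as equal objects at different distances.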

Another technique for a "hidden" 3D effect is the stereogram, or magiceye.  By combining the two different views for the two eyes a 3D image is produced which may be viewed by about 80% of the population.   Approximately one in five people cannot see the 3D effect at all.

AUDIO

In the same manner as the two eyes provide two images which the brain interprets as a view of three dimensional space, the two ears provide two audio signals. The differences in these are used to provide a three dimensional source for the sound.

Sound is a pressure wave traveling through some medium. Usually we think of sound as traveling through air; however, sound also travels through other media, and it does so at different speeds. In air the speed of sound is approximately 1100 feet per second at sea level. In water its speed is approximately four times this value, and in steel it is higher still. If a swimmer, under water, hears a distant sound, such as an explosion, and then surfaces, he may hear the same sound again. This is because the sound traveled through water at a faster speed than it did through the air and thus did not reach the listener above water until some time later.
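The swimmer's double hearing can be worked out with simple arithmetic. The following is a minimal sketch; the speed in air is the figure quoted above, while the speed in water (4900 feet per second) and the one-mile distance are assumed round figures introduced only for illustration.

```python
SPEED_IN_AIR_FT_S = 1100.0     # figure quoted in the text
SPEED_IN_WATER_FT_S = 4900.0   # assumed round figure, not from the text

def arrival_delay(distance_ft, slow_speed=SPEED_IN_AIR_FT_S,
                  fast_speed=SPEED_IN_WATER_FT_S):
    """Seconds between the arrival through the fast medium and the slow one."""
    return distance_ft / slow_speed - distance_ft / fast_speed

# An explosion one mile (5280 feet) away.
delay_s = arrival_delay(5280.0)
print(round(delay_s, 1))
```

Under these assumptions the swimmer has several seconds to surface before the airborne sound arrives.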

We hear a sound when these pressure waves strike the ear drum, a thin membrane. They are then transmitted through a series of bones to the inner ear, a fluid filled spiral structure rather like a snail shell. Along the length of and inside the spiral are located a large number of hair-like nerve cells. As the pressure wave travels through the fluid it excites some of these cells, which produce an electrical signal which travels to the brain. Which cells are excited is determined by the frequency of the vibration; the higher frequencies excite cells in the first part of the spiral, nearest the entrance, while lower frequencies excite those farther toward the center. The normal human ear can hear frequencies between about 20 and 16,000 Hertz (Hz), or cycles per second. Some people can hear up to about 20,000 Hz. Exposure to loud noises can kill the nerve cells in the inner ear, those responding to the higher frequencies first. Today, one third of students finishing high school have lost a large part of their high frequency hearing from loud music. Earphones are particularly implicated because they provide a source very close to the ear and thus do not need as loud a sound to cause damage. If you can hear the high pitched whine near a TV set you probably have not lost much hearing, as this is at 15,750 Hz.

When a sound is produced and the waves strike the two ears, there is a difference in the two signals. First there is the time difference in the arrival of the sounds. Although this is not large, the ear can distinguish time intervals as short as 70 microseconds. Secondly, as the sound travels around the head to the second ear, the waveform is changed. Most sounds are not made of a single frequency but rather a combination of many. As the sound moves around the head, the high frequencies are distorted more than the low. Finally, the sounds are shaped and altered by their travel through the outer ear. Sounds coming from different angles are altered in different manners.
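The size of the arrival time difference can be estimated with a straight-line model. This is a minimal sketch, assuming the two ears are points 0.6 feet apart and ignoring the extra path around the head; the ear separation and the function names are assumptions made for illustration, while the speed of sound and the 70 microsecond threshold are the figures quoted in the text.

```python
import math

SPEED_OF_SOUND_FT_S = 1100.0   # figure quoted earlier in the text
EAR_SEPARATION_FT = 0.6        # assumed, roughly 18 cm

def arrival_time_difference(azimuth_deg):
    """Seconds between arrivals at the two ears.
    0 degrees is straight ahead, 90 degrees is directly to one side."""
    return (EAR_SEPARATION_FT * math.sin(math.radians(azimuth_deg))
            / SPEED_OF_SOUND_FT_S)

def smallest_detectable_angle(threshold_s=70e-6):
    """Angle off center at which the difference reaches the threshold."""
    ratio = threshold_s * SPEED_OF_SOUND_FT_S / EAR_SEPARATION_FT
    return math.degrees(math.asin(ratio))

max_difference_us = arrival_time_difference(90.0) * 1e6
angle_deg = smallest_detectable_angle()

print(round(max_difference_us), round(angle_deg, 1))
```

Under these assumptions the largest possible difference is roughly half a millisecond, and a 70 microsecond difference corresponds to a source only a few degrees off center.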

These three effects are combined by the brain to produce a point of origin for the sound. A person sitting blindfolded in a room can pinpoint with great accuracy the location of a dropped set of keys without any visual cues. The goal of the virtual world designer is to be able to place a sound at any location relative to the listener. To do this it is necessary to duplicate the three effects described above and reproduce the two different sounds in the two different earphones used by the listener. Some measurements of outer ear distortion and the shaping caused by travel around the head have been made. An accurate model of the head and ears is made and outfitted with miniature pinhead microphones. The model is then placed in a chamber to block all other sounds and recordings made as a sound is produced at various locations within the room. Equations can then be determined which will yield the two altered sounds for the two earphones.
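A much cruder approach than the measured equations gives a feel for what "two different sounds in the two earphones" means. The following is a minimal sketch that delays and attenuates the signal for the far ear; it is not the measurement-based method described above, it omits the outer ear shaping entirely, and the function name, the 0.6 foot ear separation, and the far-ear gain of 0.7 are placeholder assumptions.

```python
import math

def place_sound(mono, azimuth_deg, sample_rate=44100,
                ear_separation_ft=0.6, speed_ft_s=1100.0, far_gain=0.7):
    """Return (left, right) sample lists for a mono signal.
    Positive azimuth places the source to the listener's right."""
    side = math.sin(math.radians(azimuth_deg))
    # Arrival delay at the far ear, rounded to whole samples.
    delay = round(abs(side) * ear_separation_ft / speed_ft_s * sample_rate)
    # Far ear is quieter the further the source is to one side.
    gain = 1.0 - (1.0 - far_gain) * abs(side)
    near = list(mono) + [0.0] * delay
    far = [0.0] * delay + [gain * sample for sample in mono]
    if side >= 0:
        return far, near   # source on the right: left ear is the far ear
    return near, far

left, right = place_sound([1.0, 0.5], 90.0)
center_left, center_right = place_sound([1.0, 0.5], 0.0)
print(len(left), right[0])
```

For a source straight ahead both channels are identical; as the source moves to one side, the far channel starts later and is quieter. A full system would replace the single gain with frequency-dependent filtering derived from the head and ear measurements.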

Although the above procedure sounds easy enough, there are a number of problems associated with it. A large number of measurements must be made. A wide variety of sounds must be mapped. The equations are extremely complex and require extensive computing power even if they can be accurately determined. Even if we can accomplish all of this there is still another limitation. The chamber used makes no information available about how the sound would be perceived in different rooms or locations. Soft drapes would give the same response as hard marble walls and as an outdoor theater. The computation required to produce a faithful copy of actual 3-D sounds is enormous and beyond the capability of current computers. For such calculations it is estimated that an operation speed of about 1200 MFLOPS (million floating point operations per second) would be required. Current PCs have a capability of 0.5 to 4 MFLOPS. Full reproduction of 3-D sound is not yet available, although some significant progress has been made. One firm in Madison has produced a system which is expected to be available in chip form which will require only a sound, an elevation and an azimuth to reproduce a reasonably good 3-D effect.
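The gap between the required and available computing speed can be stated as a simple ratio. A minimal sketch using only the figures quoted above; the names are introduced here for illustration.

```python
REQUIRED_MFLOPS = 1200.0       # estimate quoted in the text
PC_MFLOPS_RANGE = (0.5, 4.0)   # range quoted in the text for PCs of the time

def shortfall_factor(required, available):
    """How many times faster the machine would need to be."""
    return required / available

factors = [shortfall_factor(REQUIRED_MFLOPS, pc) for pc in PC_MFLOPS_RANGE]
print(factors)  # [2400.0, 300.0]
```

By these figures a PC would need to be between 300 and 2400 times faster to perform the full calculation.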

Another problem with creating sounds in a virtual world has to do with the regularity of the event cycle. The event cycle is the series of operations which must be repeated over and over to produce the virtual world. These include such things as reading sensors, creating and displaying the visual images, and producing any other outputs. Usually the majority of each such cycle is spent in generating the visual images. To give the illusion of true motion we need to produce approximately 20 to 30 images per second. However, if one image is displayed for 1/20 second, the next for 1/40 second and the third for 1/30 second the overall effect is not changed. Therefore, event cycles are usually controlled by the time necessary to generate a visual image, which in turn can vary widely depending upon the complexity, the degree of change from the previous image and other factors. Thus the event cycle is not of constant duration. Sounds, on the other hand, require regularly spaced generation cycles. If a separate computer were used to produce the sounds, a regular cycle could be maintained without constraining the image generating process. But in this arrangement the synchronization of the two would become far more complex.
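The mismatch between an irregular event cycle and regular sound generation can be made concrete. The following is a minimal sketch of one possible scheme, in which each event cycle produces however many sound samples its duration calls for; the function name and the 8000 samples per second rate are assumptions for illustration, and the cycle durations are the ones used as an example above.

```python
from fractions import Fraction

def samples_per_cycle(cycle_durations, sample_rate):
    """Number of sound samples each event cycle must supply so that
    output stays at a constant rate even though cycle lengths vary."""
    produced = 0
    elapsed = Fraction(0)
    counts = []
    for duration in cycle_durations:
        elapsed += duration
        due = int(elapsed * sample_rate)   # total samples owed so far
        counts.append(due - produced)
        produced = due
    return counts

# Three event cycles of 1/20, 1/40 and 1/30 second.
cycles = [Fraction(1, 20), Fraction(1, 40), Fraction(1, 30)]
counts = samples_per_cycle(cycles, 8000)
print(counts)  # [400, 200, 266]
```

The eye tolerates the uneven image timing, but the sound generator must be handed a different amount of work on every cycle, and a cycle's length is not known until its image is finished. Tracking the running total, rather than rounding each cycle separately, keeps rounding errors from accumulating.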

Most of these limitations can be overcome with advances in computing speed. It would not be surprising if realistic sound is available to the virtual world designer within a few years.

TOUCH

Our sense of touch actually has several components. We sense the amount of force applied (or reflected) when, for example, we lift a fifty pound weight. However, we can also identify the texture of a surface by the light application of force and sense the difference between rough and smooth. The skin contains on the order of 100 sensors per square cm. In our attempts to duplicate such sensors on robot hands we cannot come within two orders of magnitude using current techniques. Likewise, when we attempt to use stimulators to provide touch feedback to human operators we can get little more than one per square cm.

In addition to the direct senses of force and touch we can also sense changes in accelerations and positions. The use of motion platforms in flight simulators is an attempt to provide a simulated input for these senses.

The skin is also a very good sensor of infrared radiation and temperature. Very small changes in temperature or in radiated energy can be easily detected. For example, if you wish to keep a running faucet at a constant temperature, holding your fingers in the stream of water is probably as accurate as using a thermometer. Changes of a fraction of a degree are easily detected. There is, however, a problem with absolute measurements. Our sense of temperature - like our other senses - adjusts for current conditions. A simple experiment can show this. Place one hand in a pan of warm water and the other in a pan of cold water. After several minutes place both hands in a pan of neutral temperature water. The hand which has been in the cold water will sense the neutral water as hot; the other hand will sense the same water as cold because it has become used to a warm environment.

The use of temperature has not been widely employed in virtual worlds. Some systems have used heated or cooled air, but the problem of providing localized temperature stimulation has not been explored. It might be considered for future systems. For example, if, wearing a data glove, you reach out and pick up a virtual cup of steaming coffee, a heat stimulus applied through the glove could enhance the realism of the experience.

SMELL

Our sense of smell is very sensitive and often does more to evoke an emotional response than any of the other senses. A unique odor can often cause the recall of details of a past event far better than pictures or sounds. Still our sense of smell is far more limited than that of many other animals. A silk moth can detect the smell emitted by a mate over a half mile distant. A cockroach can react to as few as 30 molecules of an odor. Many other animals depend to a great extent on their sense of smell. But because we are probably not creating a virtual world for a cockroach, we should look more closely at our human perception of smell.

In the United States our society has greatly reduced the importance of this sense. In many other cultures smell is considered an important characteristic. For example, in Arab countries in arranging a marriage the brokers will often ask to smell the prospective bride. This is not done for fear of an offensive odor but rather to check for a compatibility with the groom.

Because of the complexity of producing particular odors on demand this aspect is not usually incorporated into a virtual world. The Sensorama described earlier was one exception. While exact odors would require a very complex system, some general smells could be added to help establish a location or mood. Pine scent could enhance the realism of a forest visual; damp, salt air could help with an ocean. Control of even these general scents would be difficult in an interactive environment and could probably only be used in general background capacities which are not likely to change rapidly in response to the user. The use of odor is a largely unexplored area in virtual reality systems.

OTHER SENSES

Taste provides less information than our other senses. Most of what we think of as taste is actually smell. Our sense of taste can only distinguish four different inputs: salt, sweet, sour, and bitter. Blindfolded and with his nose blocked, a person can’t tell if he is biting into an apple or an onion. This and the complexity of providing any kind of taste feedback lower the importance of taste in virtual reality.

There are indications that some animals and even humans have other senses. Some animals appear to sense the earth’s magnetic field and use it for navigation during migrations. The lack of detailed evidence of any such additional senses in humans removes them from consideration for virtual worlds at this time.