Introduction to Learning Theory and Behavioral Psychology
Learning can be defined as the process leading to relatively permanent behavioral change or potential behavioral change. In other words, as we learn, we alter the way we perceive our environment, the way we interpret incoming stimuli, and therefore the way we interact, or behave. John B. Watson (1878-1958) was the first to study how the process of learning affects our behavior, and he formed the school of thought known as Behaviorism. The central idea behind behaviorism is that only observable behaviors are worthy of research, since other abstractions such as a person’s mood or thoughts are too subjective. This belief was dominant in psychological research in the United States for a good 50 years.
Perhaps the best-known Behaviorist is B. F. Skinner (1904-1990). Skinner followed much of Watson’s research and findings, but believed that internal states could influence behavior just as external stimuli do. He is considered to be a Radical Behaviorist because of this belief, although nowadays it is widely accepted that both internal and external stimuli influence our behavior.
Behavioral Psychology is basically interested in how our behavior results from the stimuli both in the environment and within ourselves. Behavioral psychologists study, often in minute detail, the behaviors we exhibit while controlling for as many other variables as possible. This is often a grueling process, but the results have helped us learn a great deal about our behaviors, the effect our environment has on us, how we learn new behaviors, and what motivates us to change or remain the same.
Classical and Operant Conditioning
Classical Conditioning. One important type of learning, Classical Conditioning, was actually discovered accidentally by Ivan Pavlov (1849-1936). Pavlov was a Russian physiologist who discovered this phenomenon while doing research on digestion. His research was aimed at better understanding the digestive patterns in dogs.
During his experiments, he would put meat powder in the mouths of dogs who had tubes inserted into various organs to measure bodily responses. What he discovered was that the dogs began to salivate before the meat powder was presented to them. Then, the dogs began to salivate as soon as the person feeding them would enter the room. He soon began to gain interest in this phenomenon and abandoned his digestion research in favor of his now famous Classical Conditioning study.
Basically, the findings support the idea that we develop responses to certain stimuli that are not naturally occurring. When we touch a hot stove, our reflex pulls our hand back. It does this instinctually, with no learning involved. It is merely a survival instinct. But why do some people, after getting burned, pull their hands back even when the stove is not turned on? Pavlov discovered that we make associations which cause us to generalize our response to one stimulus onto a neutral stimulus it is paired with. In other words, hot burner = ouch, stove = burner, therefore, stove = ouch.
Pavlov began pairing a bell sound with the meat powder and found that even when the meat powder was not presented, the dog would eventually begin to salivate after hearing the bell. Since the meat powder naturally results in salivation, these two variables are called the unconditioned stimulus (UCS) and the unconditioned response (UCR), respectively. The link between the bell and salivation is not naturally occurring; the dog was conditioned to respond to the bell. Therefore, the bell is considered the conditioned stimulus (CS), and the salivation to the bell, the conditioned response (CR).
Many of our behaviors today are shaped by the pairing of stimuli. Have you ever noticed that certain stimuli, such as the smell of a cologne or perfume, a certain song, or a specific day of the year, result in fairly intense emotions? It's not that the smell or the song is the cause of the emotion, but rather what that smell or song has been paired with: perhaps an ex-boyfriend or ex-girlfriend, the death of a loved one, or maybe the day you met your current husband or wife. We make these associations all the time and often don’t realize the power that these connections or pairings have over us. But, in fact, we have been classically conditioned.
Operant Conditioning. Another type of learning, very similar to that discussed above, is called Operant Conditioning. The term "Operant" refers to how an organism operates on the environment, and hence, operant conditioning comes from how we respond to what is presented to us in our environment. It can be thought of as learning due to the natural consequences of our actions.
Let's explain that a little further. The classic study of Operant Conditioning involved a cat that was placed in a box with only one way out: a specific area of the box had to be pressed in order for the door to open. The cat initially tries to get out of the box because freedom is reinforcing. In its attempt to escape, the area of the box is triggered and the door opens. The cat is now free. Once placed in the box again, the cat will naturally try to remember what it did to escape the previous time and will once again find the area to press. The more often the cat is placed back in the box, the quicker it will press that area for its freedom. It has learned, through natural consequences, how to gain the reinforcing freedom.
We learn this way every day in our lives. Imagine the last time you made a mistake; you most likely remember that mistake and do things differently when the situation comes up again. In that sense, you’ve learned to act differently based on the natural consequences of your previous actions. The same holds true for positive actions. If something you did results in a positive outcome, you are likely to do that same activity again.
Reinforcement
The term reinforce means to strengthen, and is used in psychology to refer to any stimulus which strengthens or increases the probability of a specific response. For example, if you want your dog to sit on command, you may give him a treat every time he sits for you. The dog will eventually come to understand that sitting when told to will result in a treat. The treat is reinforcing because he likes it, and it makes him more likely to sit when instructed to do so.
This is a simple description of a reinforcer (Skinner, 1938), the treat, which increases the response, sitting. We all apply reinforcers every day, most of the time without even realizing we are doing it. You may tell your child "good job" after he or she cleans their room; perhaps you tell your partner how good he or she looks when they dress up; or maybe you got a raise at work after doing a great job on a project. All of these things increase the probability that the same response will be repeated.
There are four types of reinforcement: positive, negative, punishment, and extinction. We’ll discuss each of these and give examples.
Positive Reinforcement. The examples above describe what is referred to as positive reinforcement. Think of it as adding something in order to increase a response. For example, adding a treat will increase the response of sitting; adding praise will increase the chances of your child cleaning his or her room. The most common types of positive reinforcement are praise and rewards, and most of us have experienced this as both the giver and receiver.
Negative Reinforcement. Think of negative reinforcement as taking something negative away in order to increase a response. Imagine a teenager who is nagged by his mother to take out the garbage week after week. After complaining to his friends about the nagging, he finally one day performs the task and to his amazement, the nagging stops. The elimination of this negative stimulus is reinforcing and will likely increase the chances that he will take out the garbage next week.
Punishment. Punishment refers to adding something aversive in order to decrease a behavior. The most common example of this is disciplining (e.g., spanking) a child for misbehaving. We do this because the child begins to associate being punished with the negative behavior. The punishment is not liked, and therefore, to avoid it, he or she will stop behaving in that manner.
Extinction. When you remove something in order to decrease a behavior, this is called extinction. You are taking something away so that a response is decreased.
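The four types described above can be read as a two-by-two grid: whether something is added or removed, and whether the response is meant to increase or decrease. As a rough sketch in Python, following this article's definitions only, the grid might look like the code below. The function name `classify` and the string labels are made up for illustration.

```python
def classify(stimulus_change: str, behavior_effect: str) -> str:
    """Name the type of reinforcement for a given combination.

    stimulus_change: "add" or "remove" (is something added or taken away?)
    behavior_effect: "increase" or "decrease" (the intended effect on the response)
    The labels follow the four types as this article defines them.
    """
    grid = {
        ("add", "increase"): "positive reinforcement",
        ("remove", "increase"): "negative reinforcement",
        ("add", "decrease"): "punishment",
        ("remove", "decrease"): "extinction",
    }
    return grid[(stimulus_change, behavior_effect)]


if __name__ == "__main__":
    # A treat is added so that sitting increases.
    print(classify("add", "increase"))
    # Nagging is removed so that taking out the garbage increases.
    print(classify("remove", "increase"))
```

Reading the examples this way, the dog treat is an "add / increase" case and the end of the nagging is a "remove / increase" case.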
Research has found that positive reinforcement is the most powerful of these four. Adding a positive to increase a response not only works better, but allows both parties to focus on the positive aspects of the situation. Punishment, when applied immediately following the negative behavior, can be effective, but results in extinction when it is not applied consistently. Punishment can also invoke other negative responses such as anger and resentment.
Reinforcement Schedules
Now that we understand the four types of reinforcement, we need to understand how and when these are applied (Ferster & Skinner, 1957). For example, do we apply the positive reinforcement every time a child does something positive? Do we punish a child every time he does something negative? To answer these questions, you need to understand the schedules of reinforcement.
Applying one of the four types of reinforcement every time the behavior occurs (getting a raise after every successful project or getting spanked after every negative behavior) is called a Continuous Schedule. It's continuous because the application occurs after every project, behavior, etc. This is the best approach when using punishment. Inconsistencies in the punishment of children often result in confusion and resentment. A problem with this schedule is that we are not always present when a behavior occurs or may not be able to apply the punishment.
There are two types of fixed schedules, in which reinforcement follows a set count or a set period:
Fixed Ratio. A fixed ratio schedule refers to applying the reinforcement after a specific number of behaviors. Spanking a child if you have to ask him three times to clean his room is an example. The problem is that the child (or anyone for that matter) will begin to realize that he can get away with two requests before he has to act. Therefore, the behavior does not tend to change until right before the preset number.
Fixed Interval. Applying the reinforcer after a specific amount of time is referred to as a fixed interval schedule. An example might be getting a raise every year and not in between. A major problem with this schedule is that people tend to improve their performance right before the time period expires so as to "look good" when the review comes around.
When reinforcement is applied on an irregular basis, the schedules are called variable schedules.
Variable Ratio. This refers to applying a reinforcer after a variable number of responses. Variable ratio schedules have been found to work best under many circumstances and knowing an example will explain why. Imagine walking into a casino and heading for the slot machines. After the third coin you put in, you get two back. Two more and you get three back. Another five coins and you receive two more back. How difficult is it to stop playing?
Variable Interval. Reinforcing someone after a variable amount of time is the final schedule. If you have a boss who checks your work periodically, you understand the power of this schedule. Because you don’t know when the next ‘check-up’ might come, you have to be working hard at all times in order to be ready.
In this sense, the variable schedules are more powerful and result in more consistent behaviors. This may not be as true for punishment since consistency in the application is so important, but for all other types of reinforcement they tend to result in stronger responses.
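To make the difference between the schedules concrete, here is a minimal sketch in Python of when a reinforcer would be applied under each one. It follows the simplified descriptions in this article rather than any formal laboratory procedure. The function names, the parameters `mean_ratio` and `mean_interval`, and the choice to draw each variable gap uniformly between 1 and twice the mean minus 1 are all assumptions made for illustration.

```python
import random


def fixed_ratio(n_responses, ratio):
    """Reinforce every `ratio`-th response; ratio=1 is a continuous schedule.

    Returns the 1-based numbers of the responses that get reinforced.
    """
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


def fixed_interval(duration, interval):
    """Reinforce once every `interval` time units, up to `duration`."""
    return list(range(interval, duration + 1, interval))


def _variable_points(limit, mean_gap, rng):
    """Points up to `limit` separated by random gaps that average `mean_gap`.

    Assumption: each gap is drawn uniformly from 1 .. 2 * mean_gap - 1.
    """
    points = []
    current = rng.randint(1, 2 * mean_gap - 1)
    while current <= limit:
        points.append(current)
        current += rng.randint(1, 2 * mean_gap - 1)
    return points


def variable_ratio(n_responses, mean_ratio, rng):
    """Reinforce after an unpredictable number of responses (slot machine)."""
    return _variable_points(n_responses, mean_ratio, rng)


def variable_interval(duration, mean_interval, rng):
    """Reinforce after an unpredictable amount of time (surprise check-ups)."""
    return _variable_points(duration, mean_interval, rng)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs give the same sequence
    print("continuous:       ", fixed_ratio(10, 1))
    print("fixed ratio 3:    ", fixed_ratio(10, 3))
    print("fixed interval 4: ", fixed_interval(12, 4))
    print("variable ratio:   ", variable_ratio(20, 3, rng))
    print("variable interval:", variable_interval(20, 4, rng))
```

In the fixed cases the next reinforcer can be predicted exactly, which is why behavior tends to change only right before it is due. In the variable cases the gaps differ from one reinforcer to the next, so there is no safe moment to slack off.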