CONDITIONING AND LEARNING
Conditioning was discussed in the first lesson of this course. Turn back to that part of the course and revise the notes on conditioning.
- Classical conditioning can be defined as “learning to recognize”.
- Operant conditioning can be defined as “learning to respond”.
In the early part of the 20th century, J.B. Watson developed a new school of psychology called “Behaviourism”. He thought that all learned responses were chains of unconditioned responses (i.e. an animal would learn that its performance of a particular unconditioned response was in fact the conditioned response that would result in another unconditioned response).
Later in the 20th century, B.F. Skinner developed an alternative explanation (i.e. operant conditioning): that the response is a novel behaviour rather than a reflex chain, and that an animal adopts that behaviour through shaping. Shaping could be defined as a type of self-conditioning by the animal.
For shaping to occur, an animal needs some type of goal, and the motivation to attain that goal. Giving a reward, such as feed, might be a way of providing motivation for the animal to achieve a goal. During a process of self-shaping, any animal (wild or domestic) will experiment with different behaviours that might bring it closer to a goal that it is motivated to attain. Successful behaviours will be performed with increasing frequency, and behaviours that are irrelevant will be discarded. In due course, an effective behaviour emerges: this is a learned motor program.
Because the animal repeats its attempts in a trial-and-error process, this suggests that it is using thought and memory (i.e. cognition) and learning to respond (i.e. operant conditioning is taking place).
Extinction and Habituation
Habituation is the waning of a response that results from repeatedly presenting a stimulus without any immediate consequences. Extinction is the waning of a learned response that results from repeatedly presenting the conditioned stimulus without reinforcement.
Instrumental (operant) learning appears different from classical conditioning because the animal needs to perform some action before receiving any reinforcement. Instrumental learning cannot happen, though, without also establishing the conditions for classical conditioning.
Thorndike’s Law of Effect (1913)
A response followed by a rewarding or satisfying state of affairs would increase in probability of occurrence, while a response followed by an aversive or annoying consequence would decrease in probability of occurrence. Thus, the success of instrumental learning is attributed to the fact that learned behaviour can be modified directly by its consequences. Thorndike assumed that a reinforcer increases the probability of the response upon which it is contingent because it strengthens the learned connection between the response and the prevailing stimulus situation.
This became known as the “stimulus–response theory of learning”.
(Ref: McFarland, D., Animal Behaviour, 3rd edition, Prentice Hall, 1999)
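The Law of Effect can be made concrete with a toy numerical model. The sketch below is only an illustration: the linear update rule, the learning rate of 0.2 and the function names are assumptions made for this example and do not come from Thorndike or McFarland.

```python
def update_probability(p, outcome, rate=0.2):
    """Return the response probability after one consequence.

    Assumed rule (illustrative only): a reward moves p a fraction
    `rate` of the way toward 1; an aversive outcome moves it the
    same fraction of the way toward 0.
    """
    if outcome == "reward":
        return p + rate * (1.0 - p)
    if outcome == "aversive":
        return p - rate * p
    raise ValueError("outcome must be 'reward' or 'aversive'")


def simulate(outcomes, p=0.5, rate=0.2):
    """Apply a sequence of consequences and return every probability seen."""
    history = [p]
    for outcome in outcomes:
        p = update_probability(p, outcome, rate)
        history.append(p)
    return history


if __name__ == "__main__":
    # Five rewarded responses in a row, then five aversive ones.
    for value in simulate(["reward"] * 5 + ["aversive"] * 5):
        print(round(value, 3))
```

In this model the probability of the response climbs while it is rewarded and falls once the consequences become aversive, which is the pattern the law describes.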
Operant and Respondent Behaviour
Operant behaviour is “spontaneous action without any obvious stimulus”.
Respondent behaviour is “behaviour performed in response to an identifiable stimulus”.
Example: The Skinner Box
This is a device used in a classic experiment. A hungry bird is placed in a small cage containing a device that will supply it with feed, and a key at head height. Whenever feed is delivered, a small light illuminates it. The bird quickly learns to associate the light with feeding, and approaches the feed tray whenever the light goes on.
The next stage of shaping involves making food delivery depend on some aspect of the animal’s behaviour, usually requiring the bird to peck the key before food is supplied. Key pecking can be shaped by supplying feed only after movements that approximate key pecking. Eventually feed is supplied only when the bird stands upright and pecks the key. NB: Pecking might be encouraged by gluing a grain of feed to the key. When the key is pecked, it activates a switch that releases food.
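The shaping procedure described above, in which feed is given for closer and closer approximations to a key peck, can be sketched as a small simulation. Everything numerical here is an assumption for illustration: behaviour is placed on a hypothetical scale from 0 (unrelated to pecking) to 1 (a full upright key peck), and the spread, learning rate and function name are invented for this example.

```python
import random


def shape(criteria, trials_per_stage=200, spread=0.15, rate=0.3, seed=0):
    """Simulate shaping by successive approximation (toy model).

    criteria: the minimum behaviour score that earns feed at each stage.
    Returns one (criterion, rewards, typical_behaviour) tuple per stage.
    """
    rng = random.Random(seed)
    typical = 0.0  # what the bird usually does at the start
    log = []
    for criterion in criteria:
        rewards = 0
        for _ in range(trials_per_stage):
            # The bird varies its behaviour around what it typically does.
            attempt = min(1.0, max(0.0, rng.gauss(typical, spread)))
            if attempt >= criterion:
                # Feed is delivered, so typical behaviour shifts
                # part of the way toward the rewarded attempt.
                typical += rate * (attempt - typical)
                rewards += 1
        log.append((criterion, rewards, round(typical, 3)))
    return log


if __name__ == "__main__":
    # Raise the requirement in small steps, as in the Skinner box.
    for stage in shape([0.1, 0.3, 0.5, 0.7, 0.9]):
        print(stage)
```

In this toy model, demanding the full response from the outset (for example `shape([0.9])`) would almost never produce a reward, because the bird's early behaviour rarely comes close to a key peck. Raising the criterion gradually gives it something to be rewarded for at every stage.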
From this point, the animal’s learning can be refined, for example by giving it two keys to peck, one that supplies feed and one that does not, or by rewarding it only after a certain number of pecks rather than after every peck. The bird will adapt and alter its behaviour accordingly.
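The refinements mentioned above (one key that supplies feed and one that does not, and feed delivered only after a certain number of pecks) can be written as a simple rule. Rewarding every nth response is often called a fixed-ratio schedule. The key names "left" and "right" and the function name are hypothetical, chosen only for this sketch.

```python
def run_session(pecks, ratio, active_key="left"):
    """Return, for each peck, whether feed was delivered.

    pecks: sequence of key names in the order they were pecked.
    ratio: number of pecks on the active key needed for one reward.
    Pecks on any other key are never rewarded and are not counted.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    count = 0
    delivered = []
    for key in pecks:
        if key == active_key:
            count += 1
            if count == ratio:
                delivered.append(True)
                count = 0  # start counting toward the next reward
                continue
        delivered.append(False)
    return delivered


if __name__ == "__main__":
    session = ["left", "right", "left", "left", "left"]
    print(run_session(session, ratio=2))
    # → [False, False, True, False, True]
```

With `ratio=1` every peck on the active key is rewarded, which corresponds to the earlier stage of training before the refinement is introduced.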