McGreevy on Operant Conditioning

This post is part of the McGreevy seminar series. Click here for the index.

Please note: This article assumes some prior knowledge of operant (or instrumental) conditioning, as it mostly focuses on McGreevy’s comments on the subject rather than on explaining the terms themselves. If you lack a solid understanding of operant conditioning, then I suggest this page from Crystal at the Reactive Champion blog. If you already have some idea of operant conditioning, come on in. This may be confusing, but we can only hope it adds to your understanding.

Operant conditioning, also called instrumental conditioning, is when the animal’s voluntary response is instrumental (i.e. important) in establishing the consequence (i.e. reinforcement or punishment). (By voluntary, we mean responses that the animal has control over. Involuntary would be things like salivating or growing hair.)

McGreevy used the diagram below to consider operant conditioning.

Here, the ‘x’ marks the spot of a neutral stimulus that does not modify behaviour. That is, a neutral experience. From here, stimuli can either be reinforcing, increasing the probability of the response in question, or punishing, decreasing it. The purple arrows indicate negative punishment (-P) and negative reinforcement (-R). Negative punishment uses the removal of attractive stimuli to make a response less probable. Negative reinforcement uses the removal of aversive stimuli to make a response more probable.

When a response becomes less likely, negative or positive punishment was used.
When a response becomes more likely, negative or positive reinforcement was used.
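
For readers who think in code, the four quadrants can be sketched as a simple lookup. This is only an illustration, not something McGreevy presented: the names `Quadrant` and `classify` are made up for this example, and the +R and +P labels are my assumption by analogy with the -R and -P on the diagram. It takes ‘positive’ to mean a stimulus is added and ‘negative’ to mean one is removed.

```python
from enum import Enum


class Quadrant(Enum):
    """The four operant conditioning quadrants (labels partly assumed, see above)."""
    POSITIVE_REINFORCEMENT = "+R"
    NEGATIVE_REINFORCEMENT = "-R"
    POSITIVE_PUNISHMENT = "+P"
    NEGATIVE_PUNISHMENT = "-P"


def classify(stimulus_added: bool, response_more_likely: bool) -> Quadrant:
    """Name the quadrant from what was done and how the response changed.

    stimulus_added: True if a stimulus was added, False if one was removed.
    response_more_likely: True if the response became more probable afterwards.
    """
    if response_more_likely:
        # Anything that makes the response more probable is reinforcement.
        if stimulus_added:
            return Quadrant.POSITIVE_REINFORCEMENT
        return Quadrant.NEGATIVE_REINFORCEMENT
    # Anything that makes the response less probable is punishment.
    if stimulus_added:
        return Quadrant.POSITIVE_PUNISHMENT
    return Quadrant.NEGATIVE_PUNISHMENT


# An aversive stimulus is removed and the response becomes more likely.
print(classify(stimulus_added=False, response_more_likely=True).value)
# An attractive stimulus is removed and the response becomes less likely.
print(classify(stimulus_added=False, response_more_likely=False).value)
```

Note that the sketch classifies by outcome: the quadrant is decided by what the response did afterwards, not by what the trainer intended.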

The merit or success of a reward or punishment is measured by the degree to which it makes the behaviour more likely (for a reward) or less likely (for a punishment) in the future. If an animal changes its behaviour, then we know that something aversive or attractive has occurred. However, animals can habituate to aversive stimuli (for example, horses can develop a ‘hard mouth’). The quadrants often do not occur individually, but in unison.

McGreevy finds working within the positive reinforcement spectrum most appealing, as it is hard to make mistakes.

McGreevy said that when a behaviour has ceased to occur, the animal has been punished for exhibiting the behaviour, though not necessarily by positive punishment. (However, I would say that animals may also cease to exhibit a behaviour because the rewards for engaging in a different behaviour are greater than the rewards offered by the existing one. For example, a dog may cease to bark at the postman when he learns that the postman’s bike is actually a cue that a biscuit will be served up inside.)

McGreevy presented a diagram to the seminar. I tried to recreate it for you, but alas, I failed. Basically, it shows a goose and a person. The person has food. The goose has to balance its desire to seek food against its desire to avoid people. Where the goose ends up standing is the ‘neutral zone’.

“The world is a puzzle box”

McGreevy introduced the concept of calling the whole world a “puzzle box” for the dog. That is, the world has many opportunities for rewards and punishments, and a number of levers (i.e. behaviours in different contexts) that need to be pushed to obtain them. Owners have the role of getting dogs excited about finding (or being shown) levers in life, and showing them the consequences that result.

These are just some of the many insights McGreevy provided into training. More to come!

Further reading: The #@*$ing Four Quadrants with Dr Ian Dunbar

