Rate of reinforcement and reinforcement schedules
Reinforcement rates and schedules, when handled incorrectly, probably account for 90% of the reasons that dog owners (and less experienced trainers) miss the results they are after or cannot reach higher levels of training.
A prime example of this is what is known as the yo-yo effect when heeling. Most people have trouble with this exercise because, if they don’t maintain an appropriate reward ratio at the beginning, their dog will take his treat and then disengage (pull, sniff, etc.), only to return in time for his next reward.
What is the reinforcement rate?
In plain English, it is the number of times per minute that we deliver the primary reinforcement (reward) to our dog. This is the crucial part. Without it, the whole reward training system (including clicker training) would be virtually impossible. For an animal to learn properly, we use a high reinforcement rate in dog training.
High Reinforcement Rate
This is the most confusing part for many dog owners, and often the most controversial one when it comes to reward-based training in general. How many times do you reward your dog? If only I had a dime for every time I’ve seen puzzlement or shock on the faces of my students and clients when I’ve told them that a good rate of reinforcement for simple behaviors is 18-24 clicks (treats) per minute! Their next comments are usually:
- “But my dog will gain weight”
- “I can’t give my dogs treats 24/7”
My answers to these are as follows:
First of all, when training new exercises and behaviors, you are only training for a few minutes at a time, so you won’t spend an hour stuffing your dog with treats. It is recommended that you cut down on the dog’s regular food, if necessary, to balance his daily portion. Some dog trainers even choose to deliver all of the dog’s daily meal through training sessions.
Regarding rewarding your dog 24/7, keep in mind that we use a high reinforcement rate only in the beginning, until the desired behavior is formed. Then in most cases we switch to a random and variable reinforcement schedule (you can find out more about this below).
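To put numbers on the rate discussed above, here is a minimal sketch of the arithmetic. The 18-24 rewards per minute range comes from the text; the function names and the 3-minute session length are assumptions chosen only for illustration.

```python
def seconds_between_rewards(rewards_per_minute: float) -> float:
    """Average gap between rewards, in seconds, at a steady rate."""
    if rewards_per_minute <= 0:
        raise ValueError("rewards_per_minute must be positive")
    return 60.0 / rewards_per_minute


def treats_per_session(rewards_per_minute: float, session_minutes: float) -> int:
    """Approximate number of treats used in one session at a steady rate."""
    return round(rewards_per_minute * session_minutes)


# 18 and 24 per minute are the figures from the text.
# The 3-minute session is an assumed example value, not a recommendation.
for rate in (18, 24):
    gap = seconds_between_rewards(rate)
    treats = treats_per_session(rate, session_minutes=3)
    print(f"{rate}/min: one reward about every {gap:.1f}s, roughly {treats} treats in 3 min")
```

Seen this way, a short session uses a few dozen small treats, which is the amount you would subtract from the dog’s daily portion.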
The myth that a dog needs to work hard to earn
It is a common belief that “a dog has to work hard in order to get paid”. This will get you nowhere. It is important to have a high number of successful repetitions of a behavior, in order for the behavior to form correctly and reliably.
If you are asking too much for one single treat at the beginning of training, you will face the following issues:
- Mother Nature: no animal will invest more energy into a performance than they will get out of it. It is through training that they learn to break this “rule” and work for extended periods of time for a “minimal” reward.
- Frustration: if you are asking too much from your dog in the beginning, he will get frustrated, lose interest and actually start avoiding the work in general.
- Confusion: this happens especially in the beginning when we try cutting back on reinforcement too soon, or if the dog hasn’t properly clued into what he is getting reinforced for.
- Dogs learn to switch off: maybe the best example of this is heeling. If you are delivering a treat every 5 steps, for example, your dog will learn to take the treat and then start looking around, sniffing, etc., and once you hit the fifth step he will automatically turn towards you, take the next treat and then switch off (disengage) again. Or even worse, you will create a yo-yo effect where your dog returns (often you will have to call him back somehow) and, as soon as he gets the treat, darts forward and continues what he was doing (sniffing, pulling, etc.).
Another example is when people start training their dogs and then do heeling exercises by walking straight 30-50ft while holding the treat over the dog’s head. This only teaches the dog two things:
- Only work in the presence of a treat (bribing).
- Eventually give up, because it isn’t worth the effort (frustration).
At the end of the day, many people and even dog trainers end up using unnecessary aversive measures in order to “fix” a problem that wasn’t a problem at all. Most of these issues can be avoided by applying the appropriate reinforcement rate for the exercise you are planning.
In the case of heeling, if you start with a high reinforcement rate (approximately 18-20 rewards a minute) and start in an environment with no distractions, within a few exercises, you should be able to start lowering your reinforcement rate, as well as gradually introducing distractions.
TIP: If you run into problems while clicker/marker training, remember these key pointers:
- Keep a high reinforcement rate
- Lower the criteria, if necessary. Always keep the level of the exercise at your dog’s pace so that he can learn, then move forward and increase the criteria.
- Pay attention to where you position your reward. This way you can get more repetitions from your dog and help him clue in faster. For example, if your goal is for the dog to sit on a bench in the middle of the room, it would be a waste of time, and misleading for the dog, to toss the treats to the opposite side of the room.
- Arrange the environment, if necessary, to speed up the process and increase the level of success.
- Make sure that your clicker/marker training timing is as perfect as possible so that your dog can clearly understand what he is getting rewarded for.
- Last but not least, make sure that your reward is indeed rewarding to your dog.
Back to kindergarten
Always keep in mind that when changing to new environments and situations, we often have to take a few steps back in our training, to help ensure success. For example, if you managed a great heeling performance for 30ft on a single reward in one environment, in a new environment you may need to go back to rewarding every 15 ft or even 10ft until you work your way back to the 30ft mark. This is normal. As soon as your dog gets familiar with the new surroundings (and this is normally relatively fast, only needing a few repetitions) you can go back to the normal training.
This happens to all creatures and species in the world, including humans. If you were to sit at a table to learn a poem by heart and repeat it over and over again, you would be able to recite it relatively quickly. Now, if you were to try reciting it while climbing the stairs in a strange building, or while pushing a shopping cart at the store, you would find that you tend to pause, lose the words, or even forget entire parts of the poem. The reason for this is that our brain has to adjust to the new environment.
This is a natural response, so we need to factor an adjustment period into our training to accommodate it. Keep in mind that no matter how simple an exercise appears to us, the dog is also processing a new environment, just as we would. Also remember that none of these exercises come naturally to dogs; if they did, we wouldn’t need to train them.
Will this happen forever? That depends on the level of communication between a dog and his handler, as well as how good their training plan is. Dogs do reach a generalization point and any good dog trainer knows this, therefore they tend to work in different environments until they reach this point. This process is time consuming and requires patience, but the benefits are huge.
Can you avoid all of this? Probably not; if your dog doesn’t reach the generalization point, you will always face some level of issues whenever you move into new environments.
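The heeling example from this section (dropping from 30ft per reward to 15ft or 10ft in a new environment, then working back up) can be sketched as a simple progression. This is only an illustration: the function name and the 5ft step size are assumptions, and in practice you would repeat each stage until your dog is comfortable before moving on.

```python
def rebuild_distances(target_ft: int, restart_ft: int, step_ft: int) -> list[int]:
    """Distances walked per reward, from the restart distance back up to the target."""
    if not 0 < restart_ft <= target_ft:
        raise ValueError("restart_ft must be positive and no larger than target_ft")
    if step_ft <= 0:
        raise ValueError("step_ft must be positive")
    distances = []
    current = restart_ft
    while current < target_ft:
        distances.append(current)
        current += step_ft
    distances.append(target_ft)  # always finish at the original target
    return distances


# 30ft target and 10ft restart are taken from the text; the 5ft step is assumed.
print(rebuild_distances(target_ft=30, restart_ft=10, step_ft=5))
# [10, 15, 20, 25, 30]
```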
What are reinforcement schedules?
This section covers reward schedules, which in animal training are called reinforcement schedules. There are many different reinforcement schedules, but three are generally used in dog training (the last two are sometimes grouped together; I separate them here to make them easier to understand).
- Continuous reinforcement schedule
- Fixed ratio schedule
- Variable reinforcement schedule
Continuous reinforcement schedule
This is the first level that we use in our training, and unfortunately most dog owners never successfully transition from this level. The continuous reinforcement schedule simply means that we reward every repetition. It is a simple formula: command → behavior → click (marker) → reward.
As many dog owners have experienced, this is a quick results approach. Dogs love it, the learning is fast, dogs are happy, etc. However, there is a problem with this type of system. If you were to keep this system for too long, it would actually start working against you.
For example: if you were to reward your dog for every “sit” that you ask of him over a period of perhaps one year (the timeframe would vary from dog to dog) and then try to change the reinforcement schedule, the behavior would crumble and disappear relatively fast. The reason for this is that the dog will simply stop offering behaviors, because the reinforcement (reward) is missing.
This is also part of the issue that many dog owners face: as soon as they start phasing out the reward, the behaviors go down the drain.
In some cases, dog trainers keep their dogs on continuous reinforcement schedules for certain exercises. However, for normal day to day life many people agree that it is better to switch dogs to a variable and random reinforcement schedule to help strengthen the behavior.
Fixed ratio schedule
For most dog owners, this is the “in-between” level between the continuous reinforcement schedule and the ultimate one, the variable reinforcement schedule. In a nutshell, it works on a very simple principle: instead of every repetition being reinforced (rewarded), it is every second, every third or an even later one (depending upon your plan, exercises, etc.). In our human world we use this concept on a daily basis; for example, buy three get one free, or meeting your daily quota in order to get paid.
In a dog’s world, this means that we reward every second or every third “sit” instead of every single one of them, for example.
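As a rough illustration of the principle, the sketch below marks which repetitions earn a reward under a fixed ratio. The function name is hypothetical; a ratio of 1 reproduces the continuous schedule described earlier, where every repetition is rewarded.

```python
def fixed_ratio_rewards(repetitions: int, ratio: int) -> list[bool]:
    """True for each repetition that is rewarded when every `ratio`-th one pays."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return [rep % ratio == 0 for rep in range(1, repetitions + 1)]


# Every third "sit" is rewarded: sits 3 and 6 out of 6.
print(fixed_ratio_rewards(repetitions=6, ratio=3))
# [False, False, True, False, False, True]
```

Because the pattern is perfectly regular, a dog can learn to predict which repetition pays.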
This is not as easy as many people think, as there are many factors that may go wrong. Here is a list of some of them:
- If you spent too much time in the continuous reinforcement schedule or for some other reason your dog believes that he gets rewarded for each repetition, he may stop offering behaviors.
- Dogs often get frustrated (especially in the beginning), and this may turn them away from work completely if you keep pushing.
- It often happens that animals start “slacking off” on the non-rewarded behaviors. A great example was given by Karen Pryor once when talking about dolphins; if you keep the dolphin on a fixed ratio of every third jump gets rewarded, he will start slacking off on his first two jumps and only invest all of his energy and strength into the third one as he knows that that is the one that brings the reward. In order to avoid this, every now and then you will need to reinforce the in-between behaviors to keep them strong and fluent.
Another issue is the actual value of the reward itself. Although many people think that their dogs are crazy about treats, as soon as things get tougher, their “food drive” disappears. In order to get reliable responses you need to meet a few criteria:
- Your dog needs to be driven by the reward. This means that the reward needs to be a high enough value for him not to quit easily. There are ways to build drive in your dog, but this is a whole other training aspect which is not the focus here.
- This is when building a good relationship and good working habits starts to pay off. People, including inexperienced dog trainers, who rest all of their training efforts on the value of the reward will soon realize that it is not all about rewards. Dogs need to enjoy working and hanging out with you. This is the ultimate and strongest reward. The whole working concept has to be extremely rewarding to your dog. If you make it that way, then the actual reward is just the cherry on top of the whipped cream. It sounds easy but it is not; it is not impossible either.
As mentioned above, the majority of dog owners don’t get past the first barrier of continuous reinforcement, even though getting past it is crucial for successful dog training and for being able to keep asking your dog for more.
Variable reinforcement schedule
The variable reinforcement schedule is the most challenging one. It is a system that is constantly changing and constantly increasing, and therefore always slightly pushing the limits. The key is that our dog must not be able to predict which repetition will bring the reward. The closest comparison is the casino slot machine. Not every coin is the winning one, but it drives you forward because you know that eventually you will win. This leads into the next question...
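One way to picture an unpredictable schedule is a variable-ratio sketch, where the number of repetitions until the next reward is drawn at random around a chosen average. This is an assumption made for illustration, not the article’s prescribed method: the function name, the uniform draw between 1 and 2 × average − 1, and the seed are all hypothetical choices.

```python
import random


def variable_ratio_rewards(repetitions: int, mean_ratio: int, seed=None) -> list[bool]:
    """True for each rewarded repetition; gaps between rewards vary at random.

    Each gap is drawn uniformly from 1 .. 2 * mean_ratio - 1, so gaps
    average mean_ratio but the dog cannot tell which repetition pays.
    """
    if mean_ratio < 1:
        raise ValueError("mean_ratio must be at least 1")
    rng = random.Random(seed)
    longest_gap = 2 * mean_ratio - 1
    schedule = []
    until_next = rng.randint(1, longest_gap)
    for _ in range(repetitions):
        until_next -= 1
        if until_next == 0:
            schedule.append(True)
            until_next = rng.randint(1, longest_gap)
        else:
            schedule.append(False)
    return schedule


# "R" marks a rewarded repetition, "-" an unrewarded one.
pattern = variable_ratio_rewards(repetitions=20, mean_ratio=3, seed=7)
print("".join("R" if rewarded else "-" for rewarded in pattern))
```

Raising `mean_ratio` gradually over many sessions corresponds to slowly pushing the limits, while the random draw keeps the dog from predicting the paying repetition.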
How far can we push the limits?
Theoretically, by using the variable reinforcement schedule technique, we can gradually postpone the reinforcement indefinitely. However, in reality, it really depends on many factors. I’ve seen dogs quit after 10-15 non-rewarded repetitions, but I have also seen dogs that went 70+ repetitions and would still keep going without being rewarded. So what’s the difference?
Not every dog is the same and not every dog has the same level of drive and motivation. In general, the more you invest in building a solid relationship, and the more motivating the training is for your dog, the more the exercises themselves become self-rewarding activities. However, in the end, the results always depend on the dog himself.
The name of the game: Behavior extinction bursts – A tricky plan
Even though this is a way to increase the intensity of behaviors and to postpone the reward, it can be so difficult for some that many dog trainers (not to mention dog owners) don’t bother experimenting with it. I will describe the basics of it for you here, so that you can understand how this concept works.
I will use the mirage effect in the desert as an example. Imagine a thirsty person walking through the desert and suddenly spotting an oasis on the horizon. He will walk towards it (beginning of the behavior) in order to reach the water (reinforcement). At some point along the way he will start second-guessing whether or not he can reach it, but will decide to push on and walk an extra mile to see if maybe he can get there (behavior extinction burst). Eventually he will realize that there is no way to reach it, so he will stop walking towards it (quitting; the behavior falls apart and disappears).
If by any chance, the person was able to reach the water after that extra push, then the next time, if in a similar situation, that person will invest even more energy and walk even further, therefore postponing the quitting sequence.
We often do this unconsciously with our dogs’ unwanted behaviors; barking and jumping are two examples.
If using this approach for training wanted behaviors, we need to capture these moments and make sure to reward them appropriately. It is a risky job, as you need to really have a lot of experience and be able to read the behavior extinction burst signals in order to capture them. Why is it considered risky?
If you miss the behavior extinction burst, the next step is the crumbling and disappearance of the behavior. As mentioned, this is the tricky part and many dog trainers and dog owners don’t end up using it, or at least not intentionally.
Why is this so important?
At some point in time, you will need to ask your dog for more: more speed, more reliability, more repetitions for fewer rewards. Doing this properly is important because it is the deciding factor between success and failure; it’s that simple.
If you follow the steps above, you will be a few steps closer to your goals, the behaviors that you are working on will strengthen, and you will be closer to the final product: a strong, reliable behavior that is controllable on command (behaviors put on cue). On the other hand, if you miss a step or don’t do the job right, as soon as you try switching your dog from a continuous reinforcement schedule to anything else, you will most likely face failure.