Constructing an effective mystery shopping program
Written by Donna Guido
Mystery shopping programs have become an important tool in the researcher’s kit of information gathering techniques. As with all tools, however, it’s important to recognize what they can and can’t do well.
Properly designed and implemented mystery shopper programs can provide an early warning system for any business that relies on extensive public contact. Executional problems can then be corrected before they result in sagging customer perceptions and, eventually, falling sales.
Mystery shopper programs can also be an excellent barometer of how changes in products, systems, people, marketing, weather or even dayparts and weekparts affect the execution of customer service and product quality compared to company standards.
Additionally, mystery shoppers can provide objective data about employee performance on specific, observable behavioral measures for use in training, compensation and motivation of both hourly and management employees. In some instances, this information is invaluable in gaining, improving or proving compliance with government regulations.
In short, when properly designed and implemented, mystery shopper programs can provide valuable information about the way businesses and people actually operate at the customer level. What they can’t do is determine how businesses ought to operate.
They can’t say with any reliability what a target market wants from a business or product or, conversely, whether a business or product is delivering what customers want. Nor can they reliably make the attitudinal trade-offs that consumers routinely make throughout a sales or service transaction, e.g., "The service person was slow reaching me, but was really friendly and helpful so, on balance, the experience was positive."
Not a substitute
Mystery shoppers can offer a great deal of insight into what is feasible in terms of real-world deliverables. What they are not is a valid substitute for traditional, quantifiable consumer market research.
The first reason is fairly simple: traditional consumer research is based on perceptions. Mystery shoppers, at their best, deal with observable, measurable behaviors and consumer deliverables. Consumer research may say that sales people are perceived as rude; mystery shoppers can report on what behaviors are present or missing that help shape that perception. Is it lack of smiles and eye contact? Is it sales personnel talking to each other instead of to customers? Is it sneering? Swearing? Talking on the phone? Refusing to respond to a customer’s request for help?
The second reason shoppers are not interchangeable with more traditional market research is more subtle, but it has far-reaching implications for managing mystery shoppers. Research validity is based on the aggregate perceptions of a statistically sound sample, while a mystery shopper report should be capable of standing alone. While it’s true that a mystery shopper report is only a snapshot of a particular visit to a particular location at a specific date and time, the report should exhibit the accuracy of a snapshot, not the blurred edges of a memory. And as with photos, the more there are, the greater the validity: an album of mystery shopper reports taken over time is better than a single report. The important point here is that the report must have the integrity to stand alone, and the entire process that yields mystery shopper reports must be designed to support that integrity, one shopper at a time.
Individual shopper reports simply do not enjoy the luxury of tempering that takes place in market research when a few strong perceptions are rolled up into the total sample. But while each shopper report must be accurate in terms of objective observations and measured results, one individual shopper’s perception should not be treated any differently than any other single perception in any research.
This is not to say that shopper perceptions do not add interesting footnotes on an individual basis or that they can’t be rolled up to provide a statistically valid sample; both of these can be true, but neither makes the best, most economically efficient use of what distinguishes shoppers from more typical research respondents. Shoppers can be directed to observe and measure specific behaviors and deliverables.
Why does this difference between aggregate perceptions and individually accurate observations and measurements matter so much? After all, perceptions drive customer satisfaction and sales.
The answer is this: Observations and measurements provide calibration that can be used to change behavior much more efficiently than perceptions.
Let’s look at an example. A fast-food restaurant company knows that speed of service is a critical factor in its customers’ selection process. Additionally, its scores have fallen on this attribute within the target market in the company’s biannual attitude, trial and usage tracking studies. The company’s management has numerous choices for how it responds to this information and what steps it takes to improve its speed of service perceptions:
-It can tell its operations people to improve "or else," then wait for the results of the next tracking study to see what happens. Of course, it could lose a lot of customers and a lot of potential sales during this process.
-It can make changes in its menu and systems to make it easier for employees to fill orders faster. McDonald’s did this when it stopped toasting burger buns.
-It can decide that the issue is perception, not reality, and that it can change the perception by training service people to be friendlier so customers don’t mind waiting. Or it might decide the issue is too much reality and take down all the large clocks installed recently to create awareness of time and speed of service.
If management is lucky, however, and if it has been wise, it will have several sources of empirical data - from point-of-sale readings to ongoing mystery shopper timings - to help determine what is driving the drop in perceptions. Is speed of service really slower? Is it slower everywhere at all dayparts and weekparts? Is it only slower during dayparts and weekparts in which responsibility has recently been shifted from managers to shift leaders? Is the time from order-taking to food delivery slower or is the line longer? Is actual delivery time the same, but perceptions down because the competition is now faster than it used to be?
Mystery shopping can not only help define the real problem, but after a solution is introduced, it can help provide the necessary ongoing specific feedback to help the employees at each location deliver whatever is required to ensure the improved perceptions that build sales and show up on the next wave of the tracking study.
In short, perceptions help drive sales. One of the measures market research can provide is the strength of and changes in perceptions. A well-designed and properly implemented mystery shopper program can measure the specific components of employee behavior, product deliverables and customer experience that drive those perceptions. How can a company and its mystery shopping company develop a "well-designed and properly implemented" mystery shopper program?
Measure what matters and what the employees at the location can control. First, a company must determine which employee behaviors and which product deliverables help drive its business at the point of customer contact, then set standards for the smallest measurable components. Mystery shoppers can’t help build vacuum cleaner sales over the long term if the vacuums don’t clean well. What they can do is help ensure that customers who walk into the vacuum cleaner store are waited on promptly and politely and that the sales pitch is presented consistently and completely. But before addressing mystery shoppers, any company needs to know that it has the right product and what is required on the part of employees to sell, deliver and service the product to gain maximum long-term benefits.
Create an objective mystery shopper evaluation form. Although perceptions can be included as interesting footnotes, the observations and measurements should be as objective as possible. Generally, that means questions with yes or no answers and specific measurements of time and, where appropriate, temperatures, weights and distances. Ratings of 1 through 10 are fine for aggregate perceptions, but it is extremely difficult to teach shoppers to use them reliably. If ratings must be used, it is better to stick with 1 through 3: the difference between "great," "acceptable" and "poor" is much easier to validate than the difference between 7 and 8.
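As a rough illustration only, an evaluation form built along these lines might be captured as a simple record with yes/no fields, measured values and one three-point rating. The field names, the example store and the checks below are assumptions made for this sketch, not part of any particular program.

```python
from dataclasses import dataclass
from typing import List

# Assumed three-point scale, following the "great / acceptable / poor" suggestion.
RATING_LABELS = {1: "poor", 2: "acceptable", 3: "great"}


@dataclass
class VisitReport:
    """One shopper visit; every field name here is hypothetical."""
    location_id: str
    greeted_on_arrival: bool           # yes/no observation
    order_accurate: bool               # yes/no observation
    seconds_order_to_delivery: float   # timed measurement
    food_temp_f: float                 # measured temperature
    cleanliness_rating: int            # 1, 2 or 3 only

    def problems(self) -> List[str]:
        """Return validation messages; an empty list means the report is usable."""
        issues = []
        if self.cleanliness_rating not in RATING_LABELS:
            issues.append("cleanliness_rating must be 1, 2 or 3")
        if self.seconds_order_to_delivery < 0:
            issues.append("seconds_order_to_delivery cannot be negative")
        return issues


# Example visit with made-up values.
report = VisitReport("store-001", True, False, 185.0, 152.5, 2)
print(report.problems())
print(RATING_LABELS[report.cleanliness_rating])
```

Keeping the rating to three labelled values means a report can be rejected automatically when a shopper enters anything else, which is much harder to do with a 1 through 10 scale.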
Set shopper requirements. For businesses that serve a broad target market, such as fast-food restaurants, requirements may be very basic, e.g., driver’s license, reasonable intelligence, ability to follow directions and use measuring tools, availability at specified times and places, reliability. For other businesses, requirements can vary widely. Shoppers for packaged alcohol sales may be required to be 21, but look younger. Shoppers for financial institutions may need to be employed and live within a certain area. Shoppers for apartment leasing agents may need to fit specific socio-demographic profiles. Knowing who will use the training materials will make it easier to target them effectively.
Create training materials. It’s best to be brief and to the point, but assume little. Point out that "dirty" and "old" are different when rating the cleanliness of an older establishment. Recognize, however, that unless that’s part of the shopper specifications, shoppers are not and should not pretend to be experts in the field being evaluated. Unless a company specifically wants experts, most shoppers should maintain a customer’s point of view. Training is not to teach them the business, but to ensure that they understand the questions, their role, the standards that should be applied, the use of any equipment required and any specifications for making the visit and reporting on it.
Set up a visit schedule. Although some companies prefer totally random visits, the best shopper programs are based on creating daypart and weekpart comparability by geographical regions. This is important because it allows companies to look at trends and changes on a comparable basis. Most businesses not only experience peaks and valleys in customer traffic on a daily, weekly or monthly basis, they also staff differently at different times. While most businesses want all customers well-served, most make their money during busy periods, so comparable data for those periods makes analysis and decisions on corrective action easier. Where scheduling is concerned, more frequent is always better, but especially when a business or location is new.
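One possible way to build that comparability into a schedule is to plan the same set of daypart and weekpart "cells" for every location in every region. The sketch below assumes three dayparts and two weekparts and uses hypothetical region and location names; a real program would define its own.

```python
from itertools import product

# Assumed time blocks; each business would substitute its own.
DAYPARTS = ["breakfast", "lunch", "dinner"]
WEEKPARTS = ["weekday", "weekend"]


def build_schedule(locations_by_region, visits_per_cell=1):
    """Plan visits so every location is covered in every weekpart/daypart cell."""
    schedule = []
    for region, locations in sorted(locations_by_region.items()):
        for location in locations:
            for weekpart, daypart in product(WEEKPARTS, DAYPARTS):
                for _ in range(visits_per_cell):
                    schedule.append({
                        "region": region,
                        "location": location,
                        "weekpart": weekpart,
                        "daypart": daypart,
                    })
    return schedule


# Hypothetical regions and stores.
plan = build_schedule({"north": ["store-001", "store-002"], "south": ["store-003"]})
for visit in plan[:3]:
    print(visit)
```

Raising `visits_per_cell` for new locations is one simple way to reflect the point that more frequent visits matter most when a business or location is new.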
Make sure the shopper reporting process not only validates the data, but also turns it around quickly to unit-level management. Although time requirements for processing reports vary based on complexity, "quickly" in this case means time should be measured in days, not weeks and never months. At the unit level, the manager will need to distinguish between personnel and systemic issues. The longer the time between shopper visit and report, the harder this is to do.
Create a roll-up reporting and distribution system that gets the right information to the right people in a user-friendly format on a timely basis. The reports should help answer these questions:
-How am I (a unit, district, region, area, company) doing versus company standards?
-Was that good or bad?
-What changed? Specifically, when and where did it change?
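A roll-up that addresses these questions could be sketched as follows, assuming each shopper report has already been reduced to a count of standards checked and standards met. The field names, period labels and sample figures are invented for illustration.

```python
from collections import defaultdict


def rollup(reports, level):
    """Share of standards met, keyed by (group at the given level, period)."""
    met = defaultdict(int)
    checked = defaultdict(int)
    for r in reports:
        key = (r[level], r["period"])
        met[key] += r["standards_met"]
        checked[key] += r["standards_checked"]
    # Skip groups with nothing checked to avoid dividing by zero.
    return {key: met[key] / checked[key] for key in checked if checked[key]}


def changes(scores, earlier, later):
    """Score change between two periods for every group present in both."""
    result = {}
    for (group, period), score in scores.items():
        if period == later and (group, earlier) in scores:
            result[group] = score - scores[(group, earlier)]
    return result


# Made-up reports: two units in one district, two reporting periods.
reports = [
    {"district": "D1", "unit": "U1", "period": "P1", "standards_met": 6, "standards_checked": 10},
    {"district": "D1", "unit": "U2", "period": "P1", "standards_met": 4, "standards_checked": 10},
    {"district": "D1", "unit": "U1", "period": "P2", "standards_met": 9, "standards_checked": 10},
    {"district": "D1", "unit": "U2", "period": "P2", "standards_met": 6, "standards_checked": 10},
]

district_scores = rollup(reports, "district")
unit_scores = rollup(reports, "unit")
print(district_scores)
print(changes(unit_scores, "P1", "P2"))
```

The same two functions serve a unit manager and a regional manager; only the `level` argument changes, which keeps the figures consistent as they are rolled up.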
The bottom line is that in most businesses, the customer’s experience at every point of contact directly affects future sales trends. A well-designed and properly implemented mystery shopper program can be one of management’s most powerful tools to help make each customer contact memorably positive.