Dear Media Guru, I have a two-part question, both parts dealing with the same subject: TV sampling error. Suppose ER gets a 20% rating and Seinfeld gets an 18%, both off a sample of 1000 respondents. What are the odds of there being absolutely no difference between these two ratings? This is not as simple as looking up the standard error of each rating. I remember that it has something to do with the standard error of the difference, but I just can't recall how to calculate it. Could you please explain? Then, to complicate matters, I'm looking at the same phenomenon on a grander scale. Suppose the estimated delivery in rating points for a TV schedule is 1000 GRPs and it underdelivers by 10%, i.e., 900 GRPs. What is the likelihood that the difference is due to pure chance (sampling error), and how do I calculate that? I know this is more difficult, since you have to account for buying many programs in both the estimate and the actual. Naturally, we are assuming that the differences are all due to sampling, and not to the idiosyncrasies of the marketplace or the impurity of the sample. In this case I know the answer is going to be technical, but that is what I need. Thanks
The Guru loves this kind of stuff. The answer is technical but, hopefully, put in simple terms.
First, if ER has a 20 rating and Seinfeld has an 18, with a sample of 1000 (for that demographic), then the standard error on ER's 20 rating is +/- 1.265, while on Seinfeld's 18 it is +/- 1.215 (see formulas in the Jan 25 18:23 Guru Q&A below).
Note that the error on the 20 is larger in absolute terms but smaller relative to the rating (about 6.3% of the 20 versus about 6.7% of the 18). Also note that the range of these errors is such that they can make the two programs' ratings equal: 20 - 1.265 = 18.735, which falls below 18 + 1.215 = 19.215, so the two ranges overlap.
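The figures above can be reproduced with a short sketch. It assumes the usual simple-random-sample formula for the standard error of a percentage, sqrt(rating x (100 - rating) / sample); the Jan 25 formulas are not shown here, so treat this as an assumption that happens to match the quoted numbers. The function name is illustrative only.

```python
import math

def rating_standard_error(rating, sample_size):
    """Standard error, in rating points, of a rating given as a percentage (0-100).

    Assumes a simple random sample; real ratings services may apply
    additional design factors.
    """
    return math.sqrt(rating * (100 - rating) / sample_size)

for show, rating in (("ER", 20), ("Seinfeld", 18)):
    se = rating_standard_error(rating, 1000)
    # Absolute error in rating points, then the error relative to the rating itself.
    print(f"{show}: +/- {se:.3f} points, {se / rating:.1%} of the rating")
```

The second figure on each line is the relative error, which is what shrinks as the rating grows.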
There is a 68% probability that each of these ratings falls within its +/- range. But the swing could go either way on either number, and the result could fall anywhere within the +/- range specified.
There is a 90% probability that these two ratings fall within +/- 1.999 on the 18 and +/- 2.081 on the 20. The probability is 95% that they fall within +/- 2.381 on the 18 and +/- 2.479 on the 20.
These odds actually relate to reliability. That is, if you repeat the same rating study 100 times with the same actual facts existing (a true ER rating of 20), about 68 of those studies will give ER a rating between 18.735 and 21.265.
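As a rough check on these ranges, the sketch below multiplies the one-standard-error figure by the standard normal multipliers commonly used for each confidence level (1.0 for 68%, 1.645 for 90%, 1.96 for 95%). The multipliers and the function name are assumptions, not taken from the Guru's tables, but they reproduce the numbers quoted above.

```python
import math

# Standard normal multipliers for each confidence level (68% is one standard error).
Z_MULTIPLIER = {68: 1.0, 90: 1.645, 95: 1.96}

def rating_margin(rating, sample_size, confidence):
    """+/- range, in rating points, around a rating at the given confidence level."""
    se = math.sqrt(rating * (100 - rating) / sample_size)
    return Z_MULTIPLIER[confidence] * se

for confidence in (68, 90, 95):
    m18 = rating_margin(18, 1000, confidence)
    m20 = rating_margin(20, 1000, confidence)
    print(f"{confidence}%: 18 +/- {m18:.3f}, 20 +/- {m20:.3f}")
```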
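The question also mentions the standard error of the difference. A minimal sketch of that calculation follows, under an assumption the text above does not state: the two ratings are treated as independent estimates, so their squared standard errors simply add. Both shows are measured in the same sample, so there is really a covariance term that this ignores. The function names are illustrative.

```python
import math

def standard_error(rating, sample_size):
    """Standard error of a percentage rating, simple random sample assumed."""
    return math.sqrt(rating * (100 - rating) / sample_size)

def difference_test(rating_a, rating_b, sample_size):
    """Return (standard error of the difference, z score, two-sided probability).

    Assumes the two ratings are independent estimates (no covariance term).
    """
    se_diff = math.sqrt(
        standard_error(rating_a, sample_size) ** 2
        + standard_error(rating_b, sample_size) ** 2
    )
    z = abs(rating_a - rating_b) / se_diff
    # Chance of seeing a gap at least this large if the true ratings were equal.
    p_two_sided = math.erfc(z / math.sqrt(2))
    return se_diff, z, p_two_sided

se_diff, z, p = difference_test(20, 18, 1000)
print(f"SE of difference: {se_diff:.3f}, z = {z:.2f}, two-sided probability = {p:.2f}")
```

Under these assumptions the standard error of the difference is roughly 1.75 rating points, so the 2-point gap is only a little more than one standard error wide.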
Now, the 1000 GRPs underdelivered by 10% is a different matter. As the beginning of the explanation showed, while there is a swing around any rating (a 5 would be +/- 0.689 in the same study), the odds equally favor underestimates and overestimates. This is the same reason why small samples don't necessarily underestimate ratings. So in 1000 GRPs made up of 500 spots with an average 2 rating, the sampling error on the individual ratings somewhat cancels out.
To calculate this in an Arbitron-measured radio buy using a single survey and one station, for example, the formula is
square root of (GRP x ((100 x #spots) - GRP) / (sample x Factor))
"Factor" is from a table provided, specific to the demographic and the number of quarter hours in the daypart of the buy.
So, if your 1000 GRPs were based on Adults 18-49 (with a 1000 A18-49 sample) and a Mon-Fri, 6a-7p schedule, the calculation would be:
square root of (1000 x ((100 x 500) - 1000) / (1000 x 2.42))
or +/- 143 GRPs at the 68% confidence level. Obviously, if the average rating were higher (hence fewer spots), or if the sample were larger, the swing would be smaller. With an average 20 rating (50 spots), the swing is about +/- 40 GRPs.
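The schedule calculation can be sketched as follows. It reads the formula as a square root, which is an assumption, but it is the reading that reproduces the quoted swings. The factor of 2.42 is the one used in the example above; the function and parameter names are hypothetical.

```python
import math

def schedule_grp_standard_error(grp, spots, sample_size, factor):
    """One-standard-error swing, in GRPs, for a schedule of `spots` spots.

    Reads the formula as sqrt(GRP x ((100 x #spots) - GRP) / (sample x Factor)).
    `factor` comes from the ratings service's table for the demographic and daypart.
    """
    return math.sqrt(grp * (100 * spots - grp) / (sample_size * factor))

# 500 spots averaging a 2 rating, then 50 spots averaging a 20 rating.
for spots in (500, 50):
    swing = schedule_grp_standard_error(1000, spots, 1000, 2.42)
    print(f"{spots} spots: +/- {swing:.1f} GRPs")
```

Run as written, the 500-spot case comes out at a little over 142 GRPs, close to the 143 quoted, and the 50-spot case at just under 41.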
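So, depending on average ratings and sample sizes, the 10% underdelivery could be within the range of standard error. In the 500-spot example, a 100 GRP shortfall is inside the +/- 143 GRP swing; with an average 20 rating, it is about two and a half times the +/- 40 GRP swing.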
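To put a rough likelihood on the shortfall, one option is a normal approximation: divide the shortfall by the standard error and look up the tail probability. This is a sketch under assumptions not stated above: the 1000 GRP estimate is treated as a fixed target with no sampling error of its own, and the actual delivery is assumed to be normally distributed around it. The function name is illustrative.

```python
import math

def shortfall_probability(shortfall_grp, standard_error_grp):
    """Chance that sampling error alone produces a shortfall at least this large.

    One-sided tail of a normal distribution; the estimate is treated as exact.
    """
    z = shortfall_grp / standard_error_grp
    return 0.5 * math.erfc(z / math.sqrt(2))

# 100 GRP shortfall against the two swings quoted in the text.
for label, se in (("average 2 rating", 143), ("average 20 rating", 40)):
    p = shortfall_probability(100, se)
    print(f"{label}: {p:.1%} chance of a shortfall this large from sampling alone")
```

If the estimate is itself drawn from an earlier survey, its own sampling error would widen the swing and make a chance shortfall more likely than this sketch suggests.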