Wine scores: Better the devil you know

Mattsays at the Royal Queensland Wine Awards

A WINE WRITER on Substack caught my eye with his piece about wine scores. His thesis is that they are inherently untrustworthy and commercially damaging and that consumers should back their own judgement. It is a very good piece and is worth reading:

The 100-Point Score, Explained By Someone Who Stopped Trusting It

But, on the first point, I disagree.

The research is extensive, long-standing and conclusive on the author’s remarks about repeatability and agreement. I see it all the time.

I collate the 30,000-45,000 scores that Australian and New Zealand wine shows produce each year. These scores are added to a database that now numbers 42,000 individual labels of wines, mostly from these countries. To this database are then added prices, trophy information and provenance data down to the subregional level. I have done this since 2014. The data points number in the hundreds of thousands. So it’s probably the biggest longitudinal study of prices and scores in Australasia. And it allows for some pretty robust conclusions.

Firstly, statistical noise is a fact of life when it comes to subjective organoleptic assessment. We see wines that are sometimes exhibited up to 14 times in a single year. Most labels average between two and three exhibits per year. Score dispersion for individual labels regularly ranges from gold medal (95+ points) to no score (usually below 85 points, depending on whether a show publishes non-medal scores). We have also done extensive comparisons between the scores awarded by shows and those awarded by critics to the same wine. There is little correlation. If anything, the data shows that critics typically score higher than shows.

Not perfect

No system is perfect and even participating judges acknowledge wine shows’ shortcomings. What shows do have, however, is systematic regularity that legitimises statistical comparison. They all follow the same rules and use two common systems of scoring – the 20-point and 100-point scales. The two are easily equated mathematically.
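As a rough illustration of how the two scales can be equated, the sketch below assumes the medal thresholds commonly quoted for Australian shows: bronze 15.5/20 = 85/100, silver 17 = 90, gold 18.5 = 95. Those anchor points are an assumption made for this example, and individual shows may map their scales differently. Under that assumption the anchors all sit on one straight line, so a single formula converts in either direction. The function names are illustrative only.

```python
def to_100_point(score_20: float) -> float:
    """Convert a 20-point show score to the 100-point scale.

    Assumed anchors: 15.5 -> 85, 17 -> 90, 18.5 -> 95, 20 -> 100.
    Each point lost on the 20-point scale costs 10/3 points out of 100.
    """
    if not 0 <= score_20 <= 20:
        raise ValueError("20-point scores must lie between 0 and 20")
    return 100 - (20 - score_20) * 10 / 3


def to_20_point(score_100: float) -> float:
    """Inverse of to_100_point under the same assumed anchors."""
    return 20 - (100 - score_100) * 3 / 10


# Check the assumed medal thresholds in both directions.
for twenty, hundred in [(15.5, 85), (17, 90), (18.5, 95), (20, 100)]:
    assert abs(to_100_point(twenty) - hundred) < 1e-9
    assert abs(to_20_point(hundred) - twenty) < 1e-9
```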

The same cannot be said of critics, whose decisions about whether to taste blind, swallow or spit, or taste a wine over several days are left to themselves and usually not declared.

We take the view that the dispersion of show scores does not invalidate them so much as point us in the direction of their true value: a set of data that collectively guides us to a confident view of quality.

It is a wisdom-of-the-crowds approach. In other words, a wine that attracts high scores more often is more likely to be better than one that does not. Thus we employ a system that observes not only the scores but also their dispersion and frequency, the size of the shows the wines are entered in and the performance of the whole cohort – ie, the same class of wine. We then use this system to generate a composite score which we use to rank the wines in a class.
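To make the idea concrete, here is a minimal sketch of how such a composite might be built. It is not the actual formula, which is not spelled out here. The weights, the shrinkage toward the cohort mean, the names (Result, composite, prior_weight, spread_penalty) and the sample data are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Result:
    score: float    # 100-point score awarded at one show
    show_size: int  # number of entries judged at that show


def composite(results, cohort_mean, prior_weight=2.0, spread_penalty=0.5):
    """Hypothetical composite: shrunk, size-weighted mean minus a spread penalty."""
    # Within one wine's results, scores from larger shows count for more.
    avg_size = mean([r.show_size for r in results])
    weights = [r.show_size / avg_size for r in results]
    weighted_sum = sum(w * r.score for w, r in zip(weights, results))
    # Shrink toward the cohort mean: a wine with few results stays close to
    # the class average, so frequent high scores are needed to pull clear.
    shrunk = (weighted_sum + prior_weight * cohort_mean) / (sum(weights) + prior_weight)
    # Penalise dispersion: widely scattered scores lower the composite.
    spread = pstdev([r.score for r in results])
    return shrunk - spread_penalty * spread


# Invented sample data for one class of wine.
cohort = {
    "Wine A": [Result(96, 1000), Result(95, 800), Result(96, 1200), Result(95, 600)],
    "Wine B": [Result(96, 300)],
    "Wine C": [Result(95, 900), Result(84, 700), Result(90, 500)],
}
cohort_mean = mean([r.score for rs in cohort.values() for r in rs])
ranked = sorted(cohort, key=lambda name: composite(cohort[name], cohort_mean), reverse=True)
print(ranked)
```

In this invented example, Wine A's four consistent high scores rank it above Wine B's single 96, and Wine C's scattered results put it last. Different weights would shift the balance between frequency and dispersion.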

Poor alternatives

Is this a useful tool? Well, consider the alternative. Consumer research about what drives wine purchase decisions is unequivocal: it is price. Secondary factors such as prior experience, recommendation by shop staff or friends, shelf position, food matching, occasion and label design all play a part. But the first hurdle any wine must clear is price.

The problem is that price is an extremely poor guide to quality. Just as research casts doubt on the reliability of objective quality assessment, so it is equally emphatic that price has no reliable correlation with wine quality.

Again, we see it all the time in the wine shows. Our two most recent surveys of pinot noir and chardonnay in New Zealand and Australia are instructive.

Not-so Wild West: Australia’s Top Chardonnays of 2025
Butter up: New Zealand’s top Chardonnays of 2025

Both varieties are well understood. Both offer distinctive style differences within their respective classes but judges should know what a good example of each variety looks like. On that basis you’d be reasonably confident of good levels of agreement about the better wines in each class. And so it is. The top examples of each variety show more high scores than other wines and have relatively low dispersion.

But what they also show is that cheaper wines are just as likely to win as expensive ones. In fact, the top-ranked wines for both varieties in Australia and New Zealand come from their lowest price buckets in three out of four cases. That is not to say that all good wines are likely to be cheap or that all expensive wines are likely to be bad. Rather, it is simply that price is not a reliable indicator either way.

Unfortunately, that observation is in direct conflict with consumers’ perceptions of the material world. The maxim that you get what you pay for is hard-wired and we apply it to nearly everything we buy. It is also the basis on which most wine shop staff guide prospective wine purchasers. “How much do you want to pay?” is a common opening question from shop staff and immediately implies that paying more will buy a better wine.

Again, it might. But it just as well might not.

Even winemakers will tacitly admit this when it suits them. Suggesting that their wine is much better than the equivalently priced (usually French) version, or much cheaper than its European counterpart at the same quality level, is just another expression of the disconnect between quality and price.

So if scores are not a perfect tool to help make wine purchasing decisions and neither are prices – what is? I would argue the intersection of the two – otherwise known as “value” – is the single best tool. But that’s hard to calculate in a shop that features dozens if not hundreds of labels, when your wine purchase consideration period is measured in seconds. So it’s usually a choice between the two simplest vectors: price and score. I would choose score.