Facets add value to the search user experience by helping users refine the usual ranked, best-first list of documents on a search page. The quality of faceted search, however, is at risk when search result precision is traded for recall. Users should ideally be able to find all documents relevant for a given query (high recall), and nothing other than these relevant documents (high precision). Balancing precision and recall is a constant challenge for enterprise search practitioners, since it’s notoriously difficult to achieve enough of both at the same time. When a trade-off has to be made, it may seem safer to err on the side of high recall. Then nothing of potential interest is left out, at least. But we shall see that high recall plays tricks with faceted search, forcing us to reconsider this assumption.
By design, facets and facet values summarize the entire search result set, assigning a hit count to each unique facet value. A Brand facet may tell you that the search result contains 15 Canon, 14 Nikon and 7 Sony cameras. If you want to refine your search to target a particular brand, you simply choose the corresponding facet value.
It was while reading a recent book on faceted search by Daniel Tunkelang that I came across the answer to a question that has troubled me for some time. What do facet values and their hit counts really say about the corresponding documents and their relevance to your query? How can you know up-front whether you’re making a good refinement choice or not? It defeats their purpose if facets and facet values are poor indicators of relevance, and I would like to know how facet ranking and presentation affect the user experience. Are facet values with high counts always more relevant than those with low counts?
With ranked retrieval (as opposed to set retrieval) documents are scored according to how well they match the user’s query, and the documents in the search result are ranked on this score, sorted from highest to lowest. A query that favors recall may cast a wide net, allowing the user to search freely through all parts of the document, perhaps with linguistic processing and synonym expansion applied to the query. Such a query will hopefully retrieve most of the interesting documents, but it may also dig up a lot of documents that are not particularly relevant.
It’s usually safe to assume that ordinary relevance ranking will banish the less relevant documents to the dark depths of the search result, well hidden from all but the most insistent users. Faceted search plays by a different set of rules, however, and low-ranking documents may very well contribute significantly to the facets seen by the user. Facet values are usually sorted on hit count (if not alphabetically or hierarchically), but this ranking does not necessarily reflect the relevance ranking of the documents themselves. On the contrary, a facet value with a high count is not more relevant if it represents many irrelevant documents, as may be the case with a query that favors recall.
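To make the mismatch concrete, here is a minimal sketch in Python. The result list, the scores and the brand "Acme" are all made up for illustration, and the helper names are hypothetical rather than taken from any search engine. It compares the plain hit count of each facet value with the average relevance score of the documents behind it.

```python
from collections import Counter, defaultdict

# Hypothetical ranked result list: (relevance score, Brand facet value).
# All values are invented for illustration only.
results = [
    (0.95, "Canon"), (0.91, "Canon"), (0.88, "Nikon"),
    (0.20, "Acme"), (0.18, "Acme"), (0.15, "Acme"), (0.12, "Acme"),
]

def facet_counts(hits):
    """Count hits per facet value, as a frequency-sorted facet would."""
    return Counter(value for _score, value in hits)

def mean_scores(hits):
    """Average relevance score of the documents behind each facet value."""
    totals = defaultdict(float)
    counts = Counter()
    for score, value in hits:
        totals[value] += score
        counts[value] += 1
    return {value: totals[value] / counts[value] for value in counts}

# Facet values ordered by hit count, highest first.
by_count = [value for value, _n in facet_counts(results).most_common()]
averages = mean_scores(results)

print(by_count)
print(averages)
```

In this toy data the value with the most hits is listed first in the facet, even though every document behind it sits at the bottom of the ranked list.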
Knowing about the hidden menace of mindless recall, we can take some measures to ensure a satisfactory faceted search user experience:
- Increase precision by restricting queries to specific document fields, and limit the use of linguistic processing. It may not be necessary to search everywhere with full-blown synonym expansion and phonetic normalization of the query. In short, err on the side of high precision instead of recall if possible.
- Restrict facets to summarize only the most relevant part of the search result. Computing facets from e.g. just the 4000 highest ranked documents may dampen the noise introduced by less relevant documents. In FAST ESP, this technique is referred to as shallow navigators. Others may call it hedging.
- Rank facet values according to a relevance score based on document relevancy. I imagine it would be possible to compute a utility measure similar to tf-idf suitable for ranking facets, and that this measure would favor facet values originating mainly from high ranking documents. I admit this is speculation on my part, and mostly off the top of my head, and some readers may even tell me that this feature is out-of-the-box in their favorite enterprise search software.
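The last two measures can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the data is invented, the function names are hypothetical, and the weighting is a plain sum of relevance scores rather than the tf-idf-like measure speculated about above. It is not how FAST ESP or any other product implements shallow navigators.

```python
from collections import Counter, defaultdict

# Hypothetical ranked result list: (relevance score, Brand facet value).
hits = [
    (0.95, "Canon"), (0.91, "Canon"), (0.88, "Nikon"),
    (0.20, "Acme"), (0.18, "Acme"), (0.15, "Acme"), (0.12, "Acme"),
]

def shallow_facet_counts(hits, depth):
    """Count facet values over only the `depth` highest-scored hits."""
    top = sorted(hits, key=lambda hit: hit[0], reverse=True)[:depth]
    return Counter(value for _score, value in top)

def score_weighted_facets(hits):
    """Rank facet values by the summed relevance score of their documents.

    Assumption: a simple sum of scores stands in for a real utility measure.
    """
    weights = defaultdict(float)
    for score, value in hits:
        weights[value] += score
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)

shallow = shallow_facet_counts(hits, depth=3)
weighted_order = [value for value, _w in score_weighted_facets(hits)]

print(shallow)
print(weighted_order)
```

With a depth of 3, the noisy value drops out of the facet entirely. With score weighting it is still shown, but below the values that come from high-ranking documents. Which behaviour is preferable probably depends on the application.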
Finally, Faceted Search: The Book is in my opinion a must-read for all enterprise search practitioners. I give it my warmest recommendations.
















Wow, thanks! Not only am I grateful for your kind words about my book, but I’m delighted that you picked up on this particular point. It’s a topic that I’ve thought about a lot over the past years.
This was just one of several good points in your book that caught my attention. Another one was set-based browsing, with reference to Parallax and Endeca. Perhaps I’ll write another blog post about that.
You briefly mention that different utility measures can be used for facet ranking, and for all I know, ranking by frequency seems to be the most common approach. But you know more about this than I do. Given your extensive experience on the topic, which other utility measures have you seen implemented, and how well did they work?
Here’s a simple but powerful variation: consider frequency in the result set as a positive signal, but consider frequency outside the result set as a negative signal. I’m also a fan of separating out good summary values from good refinement values–more on this in the HCIR 2008 proceedings (“Summarization and Refinement Tags in Folksonomies”).
http://research.microsoft.com/en-us/um/people/ryenw/hcir2008/doc/HCIR08-Proceedings.pdf
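One possible reading of that variation, sketched in Python. The comment does not give a formula, so the scoring below (share of the result set minus share of the rest of the corpus) is my own assumption, and the function name and data are hypothetical.

```python
from collections import Counter

def contrast_scores(result_values, corpus_values):
    """Score facet values by in-result share minus out-of-result share.

    Assumptions: each list holds one facet value per document, and the
    result set is a subset of the corpus.
    """
    in_counts = Counter(result_values)
    all_counts = Counter(corpus_values)
    n_in = len(result_values)
    n_out = len(corpus_values) - n_in
    scores = {}
    for value, in_count in in_counts.items():
        in_share = in_count / n_in
        # Share of this value among the documents that did NOT match.
        out_share = (all_counts[value] - in_count) / n_out if n_out else 0.0
        scores[value] = in_share - out_share
    return scores

# Invented corpus of 10 documents, 6 of which match the query.
corpus = ["Accessories"] * 6 + ["Phones"] * 4
matched = ["Accessories"] * 3 + ["Phones"] * 3

scores = contrast_scores(matched, corpus)
print(scores)
```

Both values have the same hit count in the result set here, but "Phones" is rare outside it and so scores higher, while "Accessories" is common everywhere and is pushed down.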
Thank you for sharing that insight with us, Daniel!
Do you have any idea if users are more likely to choose the top navigator in a navigator list? If so, then sorting them by relevance could prove useful; if not, it’s still an interesting concept.
If I search for “sneakers” on buzzilions.com, several of the top 10 brands are not in the Brand navigator, and shoes from the top brands are not on the first page. A good example of relevance not being used in the navigator. It would perhaps have been better to list the top-rated brands in the navigator instead of the ones with the most products/reviews.
I liked this article (and the Things On Top site) very much. We recently attempted an ambitious faceted search project utilizing a metadata catalog transformed into RDF with a faceted search engine on top of it. The search technology was a form of proprietary SPARQL from Siderean in an app called Seamark, and I will say that the customers who tried the beta version loved the functionality.
However, what made it really useful was that it incorporated a set of controlled vocabularies based on the engineering development life cycle that our mission teams used to populate the metadata values. The fact that the metadata was populated with terms derived from processes that the internal customers used every day in normal task completion made a big difference in the inherent usability of the application.
I guess there is really no substitute for understanding the business problem and defining the requirements as clearly as possible.
Cheers,
Jayne
PS – Did you see that Marti Hearst is also coming out with a book on faceted navigation? She was one of the early pioneers in the field with a project at Stanford University called Fandango.
Jayne, I didn’t realize that Siderean was still around–their founder / CTO has been at Symantec for almost a year–but I’m glad to hear about your experience! And indeed, picking the facets and facet values is just as important as picking the technology for implementing faceted search. You can’t separate technology from process.
Re Marti’s book: you can check it out here: http://searchuserinterfaces.com/ (with a link to my review from http://searchuserinterfaces.com/reviews.html). It’s a great book, but it isn’t so much about faceted navigation as about search user interfaces in general. But yes, her work on Flamenco was seminal–I cover it in my own book too.
Hi,
Sorry to say that Siderean is no longer around. You’re right that Brad Allen, their CTO, is now at Symantec. But Seamark was a great search tool and it’s too bad that the VCs couldn’t stick it out.
BTW, I was recently laid off from my position at JPL, so if you know of any companies on the US West Coast that need search specialists, keep me in mind!
Cheers,
Jayne
Great point, the number of hits is a very limited way to choose top facets. At least it’s clear to users what the sorting scheme is. I’ve also sorted aspects chronologically (backwards), by price (upwards), alphabetically for authors, and by distance. Definitely worth stopping and thinking about, and even some usability testing.
That said, I think there’s a holistic aspect: the number of items in each facet item says a lot about the nature of the corpus as related to the search terms.
And one of my favorite aspects of faceted search results is that they allow people to drill down by more mundane aspects, such as “price” and “availability”.
Avi
Avi, determinism is nice, whether for sorting or filtering. The problem for faceted search comes when the initial text search (typically a full-text search) dramatically favors recall over precision–especially when the facets available for refinement have lots of values. In that case, the facet values with the most hits in the search results may be quite unrepresentative of the relevant results. I’ve seen this in real life, and it ain’t pretty.
In fact, a great example of the “mindless recall” problem is Google Product Search. Try any search for an expensive item (e.g., iphone 3gs), and then sort by price from low to high. Lots of screen protectors and cases. Of course, it’s deterministic–those results do contain the search terms. But that misses the point of helping users actually find what they are looking for.
Mixing set retrieval and ranked retrieval metaphors can get you the best of both worlds–or the worst.
Hi Vegard
I think this is a fascinating post. At the risk of over generalising, I’d say the judgement that is most salient in users’ minds when navigating faceted menus is not so much the absolute values on the hit counts (although clearly there are cases when this is crucial), but the choice between the facet values themselves, and the extent to which they provide a summary of options in that information space.
For example, many sites will sort the facet values by bin count on first exposure, usually of the top 5 values or so, but when expanded they will typically be presented as an alphabetically sorted list (to aid scanning), in which case the bin counts play a much reduced part in the ensuing navigation experience.
So the issue of knowing whether you’re making a good refinement choice is then based on the context (relative to other refinements), rather than the absolute hit count itself. Also I think we need to differentiate between structured data applications (e.g. eCommerce) where the facet values lend themselves well to set-based retrieval, versus document-centric applications where the issue of relevance is much more salient (and subjective).
Cheers,
Tony
Hi Tony!
I have seen a few sites that re-sort facet values when expanded, but not often. Right now I can only think of Buzzillions.com, Oodle.com and Google Shopping. Buzzillions do a really nifty hierarchical re-sorting that isn’t alphabetical either. Re-sorting can perhaps create slight confusion because of geographical memory, but I think scannability makes up for that. I’m quite fond of the concept, and I try to work it into my own designs.
Another issue is whether to show bin counts/frequencies or not. How does the user interpret a high count versus a low count? Does it add significant value to the user experience, or is it just visual noise? It depends, I guess. I have a preference for leaving them out, but I would like to see or perform a clarifying user study. Do you know of any?
I work mainly with FAST/Microsoft technology, and it breaks my heart to see how uncritically many search professionals apply the default facet configuration (sorted descending on frequency) to their designs (or lack of design). And web designers are usually not any better, since they generally lack experience with search interfaces. Daniel’s comparison between faceted search and chess is a good one: it takes just a few minutes to grasp the rules, but a lifetime to master the subtleties of the game.
I’m still learning.
Hey Vegard, while putting a link to this into delicious, I realized where you started with the recall. I rarely see search engines that default to retrieving on any word in the query; I usually see the ones that default to matching all words.
So you are absolutely right: in very broad recall situations, like legal discovery or ecommerce, getting a huge result set could easily generate misleading facets. The faceting is orthogonal to the relevance ranking, and this is the place where it could be really bad. Limiting the results to those with a reasonable confidence level or relevance score would make the interface much less random!
We’re all still learning.
Well put, Avi!
Hi Vegard
I think you are asking all the right questions … and prompting me to review some of the assumptions underlying some of our design patterns, which can’t be a bad thing. Trouble is, there is never enough time to research these things properly (at least, there rarely is outside of academia), and as a scientist I am always uncomfortable basing my opinions / rationale on an evidence base that is largely anecdotal.
In your Buzzillions example, are you referring to the pop-up that appears when you select “more”? If so, I wonder how this would work with a long list of facet values (most of the examples I found were just a handful of values or fewer). And re your point about showing facet values (or not), all my instincts as a UI designer say you should by default, but again I can’t point to any conclusive evidence for this.
Cheers,
Tony
You’re absolutely right, Tony. Hard evidence is a precious commodity. I’ll do my best to keep my feet on the ground as I’m constructing these arguments.
I have a hunch that facet value counts are more essential to exhaustive research (and perhaps also re-finding) scenarios, where the counts can aid the user in systematic exploration of the information space. Knowing how many documents are present in each category may help the user organize her search for something specific or unusual.
Could it be that bin counts are less effective for more conventional exploratory search scenarios like shopping? If the user is just looking for a few good items to choose from, does it really help knowing the exact number of TVs, cell phones or dishwashers found? My gut feeling says it’s not that important, and that the facet value label itself is enough to reassure the user that something of interest is waiting to be found behind it. Showing counts or not is a small detail, I admit that, but unnecessary detail can create confusion.
I would default to not showing counts for facet values, unless I knew for sure that they convey something meaningful and help the user complete a search task. And I feel that counts are more likely to be useful for legal discovery and intelligence gathering, and not so much for e-commerce.
At Buzzillions, did you look at the facets under the heading “Find Product Reviews”? They expand and re-sort themselves when you click on the “more” link. Really neat how they morph from a flat list to a hierarchy.
Oodle.com is a better example of pop-up dialogs with many facet values. Do a search for bmw and click on any of the “more choices” links to expand the facets. It’s a design that scales well.
Hi Vegard
Interesting point about not showing facet values by default – that would be heresy to some! Perhaps showing them but with a light design treatment is the most conservative compromise? It should work for most use cases on most sites (not that we should ever design for the lowest common denominator, but you know what I mean).
I like the Buzzillions example – you’re right, the morphing is neat. And the lightbox / popup at Oodle is nice, although (unsurprisingly) it breaks when you get a multi-select AND dimension (try checking the first half dozen options under ‘Features’; without a real-time refresh, it returns zero results).
Cheers,
Tony
Tony,
just to avoid possible confusion: we are both speaking of facet value *counts*, right – those numbers accompanying each facet value? I’m just suggesting to hide the counts, not the value labels.
Guess that’s all I’ve got to say about this topic for now. Thank you for taking the time to discuss it with me!