19 Responses

  1. Daniel Tunkelang
    Daniel Tunkelang August 24, 2009 at 13:31 |

    Wow, thanks! Not only am I grateful for your kind words about my book, but I’m delighted that you picked up on this particular point. It’s a topic that I’ve thought about a lot over the past years.

  2. Daniel Tunkelang
    Daniel Tunkelang August 24, 2009 at 15:47 |

    Here’s a simple but powerful variation: treat frequency in the result set as a positive signal, but frequency outside the result set as a negative signal. I’m also a fan of separating out good summary values from good refinement values–more on this in the HCIR 2008 proceedings (“Summarization and Refinement Tags in Folksonomies”).

    http://research.microsoft.com/en-us/um/people/ryenw/hcir2008/doc/HCIR08-Proceedings.pdf
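A minimal sketch of the variation described above, not the method from the cited paper: each facet value is scored by its share of the result set minus its share of the documents outside the result set. The function name `score_facet_values`, the `id` and `brand` fields, and the sample corpus are all assumptions made up for illustration.

```python
from collections import Counter


def score_facet_values(corpus, result_ids, facet):
    """Rank facet values: frequency inside the result set counts for a value,
    frequency outside the result set counts against it."""
    inside = Counter()
    outside = Counter()
    for doc in corpus:
        value = doc.get(facet)
        if value is None:
            continue
        if doc["id"] in result_ids:
            inside[value] += 1
        else:
            outside[value] += 1

    # Normalise to shares so that a large corpus does not swamp a small result set.
    n_in = max(sum(inside.values()), 1)
    n_out = max(sum(outside.values()), 1)
    scores = {
        value: inside[value] / n_in - outside[value] / n_out
        for value in inside  # only values that occur in the results are candidates
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical corpus: "Acme" is common everywhere, "Zoom" only in the results.
corpus = [
    {"id": 1, "brand": "Acme"}, {"id": 2, "brand": "Acme"},
    {"id": 3, "brand": "Zoom"}, {"id": 4, "brand": "Acme"},
    {"id": 5, "brand": "Acme"}, {"id": 6, "brand": "Zoom"},
    {"id": 7, "brand": "Bolt"}, {"id": 8, "brand": "Acme"},
]
ranked = score_facet_values(corpus, result_ids={1, 2, 3, 6}, facet="brand")
print(ranked)
```

In this made-up data both brands have two hits in the result set, so a plain hit count cannot separate them; the outside-the-results penalty pushes the ubiquitous "Acme" below "Zoom".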

  3. Mikael Svenson
    Mikael Svenson August 24, 2009 at 21:42 |

    Do you have any idea if users are more likely to choose the top navigator in a navigator list? If so, then sorting them by relevance could prove useful; if not, it’s still an interesting concept.

    If I search for “sneakers” on buzzilions.com then several of the top 10 brands are not in the Brand navigator, and shoes from the top brands are not on the first page. A good example of relevance not being used in the navigator. It would perhaps have been better to list the top rated brands in the navigator instead of the ones with the most products/reviews.

  4. Jayne Dutra
    Jayne Dutra August 25, 2009 at 03:58 |

    I liked this article (and the Things On Top site) very much. We recently attempted an ambitious faceted search project utilizing a metadata catalog transformed into RDF with a faceted search engine on top of it. The search technology was a form of proprietary SPARQL from Siderean in an app called Seamark, and I will say that the customers who tried the beta version loved the functionality.

    However, what made it really useful was that it incorporated a set of controlled vocabularies based on the engineering development life cycle that our mission teams used to populate the metadata values. The fact that the metadata was populated with terms that were derived from processes that the internal customers were used to using everyday in normal task completion, made a big difference in the inherent usability of the application.

    I guess there is really no substitute for understanding the business problem and defining the requirements as clearly as possible.

    Cheers,

    Jayne

    PS – Did you see that Marti Hearst is also coming out with a book on faceted navigation? She was one of the early pioneers in the field with a project at Stanford University called Fandango.

  5. Daniel Tunkelang
    Daniel Tunkelang August 25, 2009 at 10:47 |

    Jayne, I didn’t realize that Siderean was still around–their founder / CTO has been at Symantec for almost a year–but I’m glad to hear about your experience! And indeed, picking the facets and facet values is just as important as picking the technology for implementing faceted search. You can’t separate technology from process.

    Re Marti’s book: you can check it out here: http://searchuserinterfaces.com/ (with a link to my review from http://searchuserinterfaces.com/reviews.html). It’s a great book, but it isn’t so much about faceted navigation as about search user interfaces in general. But yes, her work on Flamenco was seminal–I cover it in my own book too.

  6. Jayne Dutra
    Jayne Dutra August 25, 2009 at 17:31 |

    Hi,

    Sorry to say that Siderean is no longer around. You’re right that Brad Allen, their CTO, is now at Symantec. But Seamark was a great search tool and it’s too bad that the VCs couldn’t stick it out.

    BTW, I was recently laid off from my position at JPL, so if you know of any companies on the US West Coast that need search specialists, keep me in mind!

    Cheers,

    Jayne

  7. Avi Rappoport
    Avi Rappoport September 1, 2009 at 23:04 |

    Great point, the number of hits is a very limited way to choose top facets. At least it’s clear to users what the sorting scheme is. I’ve also sorted aspects chronologically (backwards), by price (upwards), alphabetically for authors, and by distance. Definitely worth stopping and thinking about, and even some usability testing.

    That said, I think there’s a holistic aspect: the number of items in each facet item says a lot about the nature of the corpus as related to the search terms.

    And one of my favorite aspects of faceted search results is that they allow people to drill down by more mundane aspects, such as “price” and “availability”.

    Avi

  8. Daniel Tunkelang
    Daniel Tunkelang September 1, 2009 at 23:29 |

    Avi, determinism is nice, whether for sorting or filtering. The problem for faceted search comes when the initial text search (typically a full-text search) dramatically favors recall over precision–especially when the facets available for refinement have lots of values. In that case, the facet values with the most hits in the search results may be quite unrepresentative of the relevant results. I’ve seen this in real life, and it ain’t pretty.

    In fact, a great example of the “mindless recall” problem is Google Product Search. Try any search for an expensive item (e.g., iphone 3gs), and then sort by price from low to high. Lots of screen protectors and cases. Of course, it’s deterministic–those results do contain the search terms. But that misses the point of helping users actually find what they are looking for.

    Mixing set retrieval and ranked retrieval metaphors can get you the best of both worlds–or the worst.

  9. Tony Russell-Rose
    Tony Russell-Rose September 2, 2009 at 11:39 |

    Hi Vegard

    I think this is a fascinating post. At the risk of over generalising, I’d say the judgement that is most salient in users’ minds when navigating faceted menus is not so much the absolute values on the hit counts (although clearly there are cases when this is crucial), but the choice between the facet values themselves, and the extent to which they provide a summary of options in that information space.

    For example, many sites will sort the facet values by bin count on first exposure, usually of the top 5 values or so, but when expanded they will typically be presented as an alphabetically sorted list (to aid scanning), in which case the bin counts play a much reduced part in the ensuing navigation experience.
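The pattern described above can be sketched roughly as follows. This is an illustration under assumptions, not any particular site's implementation: `facet_display_order`, its parameters, and the brand counts are invented for the example.

```python
def facet_display_order(counts, expanded=False, top_n=5):
    """Return (value, count) pairs in the order a facet menu might show them.

    Collapsed: the top_n values by bin count (ties broken alphabetically).
    Expanded: every value, sorted alphabetically to aid scanning.
    """
    if expanded:
        return sorted(counts.items(), key=lambda kv: kv[0].lower())
    by_count = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0].lower()))
    return by_count[:top_n]


# Hypothetical bin counts for a "Brand" facet.
brand_counts = {"Nike": 120, "adidas": 95, "Puma": 40,
                "Asics": 40, "Reebok": 12, "Vans": 7}

print(facet_display_order(brand_counts, top_n=3))       # first exposure
print(facet_display_order(brand_counts, expanded=True))  # after "more"
```

Once the list is expanded the counts no longer influence the order at all, which is the sense in which they play a much reduced part in navigation.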

    So the issue of knowing whether you’re making a good refinement choice is then based on the context (relative to other refinements), rather than the absolute hit count itself. Also I think we need to differentiate between structured data applications (e.g. eCommerce) where the facet values lend themselves well to set-based retrieval, versus document-centric applications where the issue of relevance is much more salient (and subjective).

    Cheers,
    Tony

  10. Avi Rappoport
    Avi Rappoport September 5, 2009 at 00:52 |

    Hey Vegard, while putting a link to this into delicious, I realized where you started with the recall. I rarely see search engines which default to retrieving on any word in the query; I usually see the ones that default to matching all words.

    So you are absolutely right: in very broad recall situations, like legal discovery or ecommerce, getting a huge result set could easily generate misleading facets. The faceting is orthogonal to the relevance ranking, and this is the place where it could be really bad. Limiting the results to those with a reasonable confidence level or relevance score would make the interface much less random!

    We’re all still learning :-)
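A rough sketch of the idea of limiting facet counts to results above a relevance cutoff. The hit structure, the `score` field, the 0.5 threshold, and the sample hits are assumptions invented for illustration; real engines expose relevance scores in their own ways.

```python
from collections import Counter


def facet_counts(results, facet, min_score=0.0):
    """Count facet values over hits whose relevance score is at least min_score."""
    return Counter(
        hit[facet]
        for hit in results
        if facet in hit and hit["score"] >= min_score
    )


# Hypothetical high-recall result set: every hit contains the query terms,
# but accessories outnumber the product the user is probably after.
results = [
    {"title": "Phone 16GB", "category": "Phones", "score": 0.92},
    {"title": "Phone 32GB", "category": "Phones", "score": 0.88},
    {"title": "Case for phone", "category": "Cases", "score": 0.35},
    {"title": "Slim case for phone", "category": "Cases", "score": 0.31},
    {"title": "Leather case for phone", "category": "Cases", "score": 0.30},
    {"title": "Screen protector for phone", "category": "Screen protectors", "score": 0.28},
]

print(facet_counts(results, "category").most_common())
print(facet_counts(results, "category", min_score=0.5).most_common())
```

Counting over the whole result set puts "Cases" on top; counting only the confident hits leaves "Phones". Choosing the cutoff is the hard part, and a fixed threshold like the one here is only a placeholder.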

  11. Tony Russell-Rose
    Tony Russell-Rose September 7, 2009 at 09:16 |

    Hi Vegard

    I think you are asking all the right questions … and prompting me to review some of the assumptions underlying some of our design patterns :) Which can’t be a bad thing. Trouble is, there is never enough time to research these things properly (at least, there rarely is outside of academia), and as a scientist I am always uncomfortable basing my opinions / rationale on an evidence base that is largely anecdotal.

    In the Buzzilions example, are you referring to the pop up that appears when you select “more”? If so, I wonder how this would work with a long list of facet values (most of the examples I found were just a handful of values or fewer). And re your point about showing facet values (or not), all my instincts as a UI designer say you should by default, but again I can’t point to any conclusive evidence for this.

    Cheers,
    Tony

  12. Tony Russell-Rose
    Tony Russell-Rose September 11, 2009 at 10:46 |

    Hi Vegard

    Interesting point about not showing facet values by default – that would be heresy to some! Perhaps showing them but with a light design treatment is the most conservative compromise? It should work for most use cases on most sites (not that we should ever design for the lowest common denominator, but you know what I mean :)

    I like the Buzzilions example – you’re right, the morphing is neat. And the lightbox / popup at Oodle is nice, although (unsurprisingly) it breaks when you get a multi-select AND dimension (try checking the first half dozen options under ‘Features’. Without a real-time refresh, it returns zero results)

    Cheers,
    Tony

  13. Vegard Sandvold
    Vegard Sandvold September 11, 2009 at 15:29 |

    Tony,

    just to avoid any possible confusion: we are both speaking of facet value *counts*, right – those numbers accompanying each facet value? I’m just suggesting hiding the counts, not the value labels.

    Guess that’s all I’ve got to say about this topic for now. Thank you for taking the time to discuss this with me!

  14. Searching for Faceted Search « The Alter Egozi

    [...] for faceted search, being a set-retrieval oriented task, and a pingback on his blog led me to a fascinating elaboration on this pain in another fine search blog (recommended [...]