Some of the early leaders in this space, like Summon, are even announcing a "2.0" version, which may or may not be marketing hype but is, I guess, symbolic in signalling that products in this class have reached a certain maturity.
Today in 2013, Summon alone has over 500 libraries using it, and many more are using WorldCat Local, Primo Central, EBSCO Discovery Service etc. As usual, this has led to the rise of professional literature on the topic (see list curated by me here as well as Flipboard custom magazine), covering a host of areas including:
- impact on resource usage (full-text downloads, print catalogue items, A&I usage, articles not indexed in discovery)
- impact on workflow for management of eresources
- proper marketing and positioning of discovery products for users
- impact on teaching of information literacy by librarians
- surveys on attitudes of librarians, undergraduates, graduate students and faculty towards discovery services vs databases
- usability testing & integration of discovery services into library websites
- Relevancy ranking results can be inconsistent if not awful (opinions vary on how bad this issue is, possibly depending on expectations, implementation and discipline).
- Lack of advanced search features
- Worry that some important material is missing from the index, or that coverage in some disciplines is totally inadequate. Related is the view that a subject-specific database is almost always better, e.g. PubMed.
- Worry that users are unaware that they are missing out on material not found in the index, and that they may settle for good enough instead of the best available
- Worry that discovery services are damaging information literacy skills by misleading users into thinking research is easy
- Technical issues relating to instability of linking to full-text, clarity of labels in the interface etc
- Uncertainty on how to position discovery systems next to databases and how to teach them
- Worry that libraries are handing over too much power to discovery services due to lock-in by discovery service providers who are simultaneously content providers (example of recent dispute).
As expected, experienced researchers and faculty staff generally mirror the opinions of librarians and they are a lot less enthusiastic than undergraduates in general because they are familiar with what databases offer and are more demanding on what they should get.
That said, the Ithaka Faculty Survey 2012 speculates that libraries' heavy investment in discovery services is paying off, leading to more faculty starting their search from the library catalogue in 2012, the first time this figure has increased since the survey started in 2003.
As Barbara Fister points out, faculty staff are often searching "for known items, something discovery systems seem to handle rather badly", so this seems off.
5. Relevancy ranking can still be improved
This differs from service to service with some services claiming superiority in this area.
Head-to-head tests give mixed results: e.g. this one gives victory to EDS over Summon, this one to Summon, this simple one gave A&I > Summon > Google Scholar, but this one gave it to Google Scholar over Summon, etc.
But I doubt most librarians would say Summon or any other discovery service is as good as it can be; most would yearn for better relevancy.
I am personally more sympathetic towards discovery systems in this area, though having spent countless hours studying and duplicating thousands of user searches since June 2012, I am well aware of how poor the relevancy ranking of Summon can be on some searches (I have also done limited testing on other systems).
Lest I be accused of not giving examples, here's one: Singapore "national service", where currently the first 9 results are totally irrelevant. Though one example hardly proves a pattern, I am sure any librarian familiar with discovery services can give dozens of examples similar to this one. But of course, relevancy isn't an easy problem to solve, and to be fair, in this case doing the same search without quotes actually gives you better, but still poor, results.
Compared to the early days, when discovery services raced to sign up content providers and boasted of the size of their index (they still do, I guess), there is an increasing realization by all parties, librarians and discovery providers alike, that all this content can be counter-productive if the relevancy ranking isn't capable enough to surface the right, or at least decent, content.
Also, as mentioned before, there were doubts in the early days about how good such systems are for known-item searching, particularly for catalogue items, and these doubts continue to this day despite improvements.
6. Adding Federated search does not add much to web scale discovery (currently)
This is somewhat more controversial. But I believe the current consensus is moving towards the idea that tacking federated search on to web scale discovery is not that useful, at least with current implementations. An early debate was sparked in 2009 on the Federated Search Blog by the post Beyond Federated Search and its followups, which critiqued Summon for lacking federated search and claimed that a hybrid solution, indexing what you can and doing a broadcast search (federated search) over what you can't, should be the way to go.
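For readers who have not seen the hybrid idea spelled out, here is a minimal sketch of "index what you can, broadcast over what you can't". It is not how any vendor actually implements it. The names `hybrid_search`, `central_index` and `remote_sources` are all made up for illustration, and each source is assumed to be a plain Python callable that takes a query and returns a list of titles.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def hybrid_search(query, central_index, remote_sources, timeout=2.0):
    """Search the central index first, then broadcast to unindexed sources.

    `central_index` and each entry of `remote_sources` are hypothetical
    callables: they take a query string and return a list of titles.
    """
    # Step 1: the pre-harvested index answers immediately.
    results = list(central_index(query))
    if not remote_sources:
        return results

    # Step 2: broadcast the query to every source that is not indexed.
    pool = ThreadPoolExecutor(max_workers=len(remote_sources))
    futures = [pool.submit(source, query) for source in remote_sources]
    done, _not_done = wait(futures, timeout=timeout)
    # Do not block on sources that missed the deadline.
    pool.shutdown(wait=False)

    # Step 3: merge whatever came back in time, skipping failed sources
    # and titles we already have.
    for future in futures:  # iterate in submission order for stable output
        if future in done and future.exception() is None:
            for title in future.result():
                if title not in results:
                    results.append(title)
    return results
```

The `timeout` parameter is where the trade-off shows up: set it low and the slow sources contribute nothing, set it high and the user sits waiting for results they may not need.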
I could be wrong, but my impression is that many libraries that implemented EBSCO Discovery Service, which does have federated search, have chosen to turn off the federated search portion, basically because it wasn't used and/or was counterproductive.
Federated Search is Dead -- and Good Riddance!, a piece explaining why James Madison University (JMU) turned off the EBSCO Integrated Search federated search add-on included in EBSCO Discovery Service, is perhaps a typical reaction.
Essentially, the sheer size of the index of discovery services like EBSCO Discovery Service means that students have no incentive to wait 30 seconds for more results; the problem they typically face is too many results, not too few. Scholars will already be using traditional databases (e.g. Scopus) as their primary search tool, and may just use web scale discovery tools as a final round-up of what they have missed, so they don't have a pressing need to see results from such traditional databases anyway.
I would say even EBSCO is downplaying the significance of the federated search option in EDS: their pages on EDS do not mention federated search at all (though to be fair it's a separate product, EHIS), and there is even a page on platform blending (which I frankly don't quite understand, despite a vendor explaining it to me) where they go out of their way to state it is "not federation".
Of course, an argument could be made (correctly, I think) that the idea of a hybrid system is sound but the implementation needs a lot of work to make it worthwhile. Currently, though, none of the 4 major players in the market seems to have cracked this issue, and they may not do so in the foreseeable future as it is perhaps not a priority.
I would also add that many of the issues brought up back then, about the dangers of ceding control to your discovery provider over what content can be found if you don't do federation, may still retain their teeth (again see recent spat). But at the very least on the practical front, the inclusion of a federated search option in a web scale discovery system generally isn't considered critical by most librarians now.
7. Content providers are generally eager to cooperate with discovery vendors to have their content indexed.
One of the reasons why the need for federated search seems to have diminished is that more and more content is getting indexed. In 2009, there was still uncertainty about how content providers would react (would they want to be included?) and discovery vendors had to work hard to get content included. If most did, then federated search would be of limited value except for reasons related to currency of results. If most couldn't be indexed, then federation would be crucially important to get at those resources.
As of 2013, the situation has clarified. Over the years, as more libraries started to release data showing that usage tends to fall for anything not in discovery services and, conversely, that anything indexed in them sees increased usage, content providers have become more and more eager to be indexed or risk being cut out of the game.
The earlier mentioned James Madison University paper is perhaps instructive. Of the sources he mentioned as being accessible via federated search back in 2010, many, like JSTOR, Sage, ScienceDirect etc., are by now indexed in Summon and probably other discovery services.
More interestingly, even A&I services like Scopus, Web of Science, MLA and ERIC are often included in many discovery services now, though with appropriate safeguards to ensure their records are shown only to authenticated users.
That said, there are still hold-outs: the well-known PsycINFO, EconLit and other A&I databases that work with EBSCO Discovery Service only are perhaps the most gaping hole currently existing.
And of course the above refers only to publishers; aggregator databases have in general been less willing (Gale seems to be an exception here, being included in Summon since 2009 and recently added to EBSCO Discovery Service as well as others). In particular, those owned by ProQuest and EBSCO are typically out of bounds to competing discovery services, barring some special agreement.
8. Broken links are still an issue, though the problem is less serious now and likely to remain so in future
One of the greatest issues with discovery services is that they typically rely heavily on OpenURL to get to the full text. As is well known, OpenURL linking is not 100% reliable, so discovery services have put in place alternate routes to full text.
For example, Summon implemented "Index-Enhanced Direct Linking" and EDS has its smart links (if content is in the EBSCOhost databases) or custom links (I believe equivalent to Summon's index-enhanced direct linking in most cases).
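To make the "alternate route" idea concrete, here is a rough sketch of the general pattern: use a direct link when the index record carries one, otherwise fall back to an OpenURL built from the citation metadata. This is not how Summon or EDS actually work internally. The resolver address, the `direct_link` and `citation` field names, and both function names are assumptions made up for illustration; only the `url_ver`, `rft_val_fmt` and `rft.*` keys come from the OpenURL 1.0 (Z39.88-2004) key/value format.

```python
from urllib.parse import urlencode

# Hypothetical link resolver address; every library has its own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(citation: dict) -> str:
    """Build an OpenURL 1.0 (Z39.88-2004) key/value query for an article."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
    }
    # Copy over only the citation fields that are actually present.
    for key in ("atitle", "jtitle", "issn", "volume", "issue", "spage", "date"):
        if citation.get(key):
            params[f"rft.{key}"] = citation[key]
    return f"{RESOLVER_BASE}?{urlencode(params)}"


def full_text_link(record: dict) -> str:
    """Prefer a direct link stored with the record; else fall back to OpenURL."""
    direct = record.get("direct_link")  # hypothetical field name
    if direct:
        return direct
    return build_openurl(record["citation"])
```

The fewer citation fields a record has (as is often the case for newspaper articles and other non-journal items), the less the resolver has to match on, which is one reason the OpenURL route can fail.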
That said, linking to newspaper articles, non-journal items and free content can still be iffy.
Still, new efforts like KBART and Improving OpenURLs Through Analytics (IOTA) are underway, so hopefully in time this issue will be reduced.
Some other less important findings:
- most libraries are deploying discovery services as the main search option (either as the only box or as the default tab)
- almost nobody is using Boolean searches in discovery services - this paper found 3% and this blog post reported 2%
- users are using facets to a moderate degree, about 30% according to this blog post
I confess it took quite a bit of effort and courage to get this piece written and posted. Sometimes I wondered if I was getting the general consensus totally wrong, and yet other times I thought what I wrote was so trite and obvious that people knew it right at the start in 2009.
I suspect the latter is more likely to be correct, because I decided to err on the side of caution and list the statements I thought were definitely agreed upon, and bump the ones I was unsure about to a follow-up blog post, "X things we still are unsure about web scale discovery systems in 2013".
But what do you think? What else do we now know about discovery services that was in doubt in 2009?
BTW, if you want to keep up with articles, blog posts, videos etc. on web scale discovery, do consider subscribing to my custom magazine on Flipboard or looking at the bibliography on web scale discovery services.