Tuesday, August 19, 2008

Law Aleph Users Group Meeting Notes

Law Aleph Users Group
July 15, 2008 Meeting
AALL Conference in Portland, Oregon

Mila Rush (Minnesota), chairing.

Agenda is transcribed from Mila’s handout, with notes taken by Ellen McGrath during the meeting following each item.

Housekeeping:
*All expressed gratitude to Ex Libris for supplying us with delicious box lunches, which were also provided for the Law Voyager users meeting next door.
*Mila explained that we would switch rooms with the Voyager users after 45 minutes so we could hear/view the presentation from the Ex Libris staff member in attendance.

Participants’ self-introductions.

Scope of Group’s attention: some shift to other products (both from Ex Libris and other ILS vendors) that directly/intimately affect Aleph.
*There was general agreement that discussions should include other products too.

Caitlin Robinson (Iowa): Aleph and the reorganization at Iowa. To what extent and how did it influence the decision. Planning, process, experience, lessons learned, satisfaction.
*Unfortunately, Caitlin had to cancel her trip to Portland at the last minute. All were interested in her topic, though, and Mila said she would follow up with Caitlin to see if there was some way it could be shared with the group.

On demand discussions. Possible action plans? Spontaneous contributions from the floor.
*Buffalo is upgrading to Aleph version 19 about 3 weeks after the AALL conference. We would like to hear if anyone else is on version 19 or plans to be and what changes have been experienced on that version — Minnesota will be upgrading to version 19 fairly soon, but Mila was not sure of the exact date. Nobody else is on version 19.
*Any libraries using another discovery tool on top of their Aleph OPAC, either Ex Libris’ own product (Primo) or an open source product, such as Endeca? -- Minnesota is bringing up Primo as the primary OPAC interface very soon. Florida State has not had a good experience with Endeca and even tried to create its own system (Mango), but without much luck. It was suggested that questions about Endeca be directed to Jon Lutz, who was not present at this meeting.
*Fantasy serials module – We did not get to this agenda item.
*Changes in electronic counting as asked by ABA. How can we generate these figures from our Aleph data? Or Verde? – We didn’t really discuss this, though Baltimore had to cancel their Verde contract since the product was delayed so long.
*Other items from the floor -- There was a discussion about handling of budgets and interaction with separate business systems. Specific topics included how many budgets are optimal and how they are set up. The suggestion of transferring acquisitions data to a spreadsheet so that it can be re-sorted in many different ways was offered. (I confess that much of this was over my head, as I am a cataloger.) Minnesota is looking into use of the booking module, but nobody else uses it. Buffalo also looked into it, but their AV department did not like the system and decided to stick with their own scheduling system.

Jenny Forbes (Ex Libris): Development report on, and plans for, Aleph and related products.
*The group swapped rooms with the Voyager users so Jenny could keep her projector setup in one place.
*Aleph version 19 was released in January 2008 and included improvements to course reserves, batch job management, and staff privileges. If there are specific questions about these changes, perhaps Jenny would be willing to share her slides, which included many screen shots. There are some acquisitions options to update Aleph with data from the university accounting system, as well as enhancements to the generic vendor records loader.
*Ex Libris is always working to implement evolving standards. Specifically mentioned were the SRU/SRW protocol that enhances Z39.50 and MARC-XML as a record output option.
*Aleph version 20 is currently being developed. Ex Libris is also working on a next generation product, of which there will be only one, not the separate Aleph and Voyager products. It was mentioned that at present, even with a discovery tool added to the catalog, it is necessary to have a federated searching tool in place too. Jenny mentioned that there is a message from the President of Ex Libris online about the next gen system, but I was unable to find it quickly on their website.

Miscellaneous (we did not get to discuss any of these items before adjourning)
*New leadership.
*Subscribe to ELUNA-LAW-IG-L@LISTSERV.ND.EDU
*Updates to list of Aleph law libraries: http://www.bc.edu/schools/law/library/aleph.html
*New project: Current state of installations, including near-term plans. Cf. June/July 2004 list.
*Any other business.

Notes taken by Ellen McGrath

Thursday, August 07, 2008

Demystifying Batch-Load Analysis: What You Need to Know About Vendor-Supplied Bibliographic Records

When: Sunday, July 13, 2008, 4:15-5:15 PM

*Coordinator: Ellen McGrath, University at Buffalo
*Moderator: Kevin Butterfield, College of William and Mary
*Speaker: Yael Mandelstam, Fordham University

This program was standing room only -- well, actually a number of people were sitting on the floor, but you get the idea, it was popular!

There are a number of vendor-supplied record sets of interest to law libraries, including: Making of Modern Law (MOML), LLMC-Digital, BNA, CALI, HeinOnline Legal Classics, HeinOnline World Trials, and LexisNexis/Westlaw Cassidy collections.

Yael Mandelstam got right down to the nitty-gritty and showed us how she analyzes batches of vendor-supplied bibliographic records before she loads them into Fordham’s catalog. The importance of the “before” part became evident when Yael described the situation with the original batch of MOML records. Many law libraries loaded them, only to discover that the bibliographic records for the electronic versions overlaid the records for the microfiche versions by mistake. Oops … there were a number of nodding heads in the room, which I took to mean some of those present had been burned in that manner. But never again, as Yael gave us valuable advice about how to keep that from happening.

Before getting down to specifics, Yael cautioned that “this technique is not meant to replace proper authority control, use of URL checkers, etc.” She makes use of two readily-available tools in her analysis: MarcEdit (a free editing utility available for download at http://oregonstate.edu/~reeset/marcedit/html/) and Microsoft Excel (spreadsheet software). She emphasized repeatedly how essential it is that you save a copy of your original file of records before you start rearranging it and that you save each iteration of a file.

The PowerPoint handout Yael prepared is excellent, so I am not going to spend time here on details you can more easily see there. It is available at: http://tsvbr.pbwiki.com/Batchload+Analysis

The approach to record set analysis was presented in three steps:
* step 1: Examine several individual records
* step 2: Count fields in file
* step 3: View isolated fields

The first step is important and should almost go without saying. Step 2 is a quick way to verify the number of occurrences of certain fields. For example, if you have 100 records in your batch, there must be 100 each of required fields, such as the 245 (title) and 856 (URL). If there are fewer, that is a big red flag! The “What’s wrong with this picture?” examples on the slides are very revealing.
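To make the step 2 field count concrete, here is a minimal Python sketch. It is not taken from Yael's slides, which rely on MarcEdit itself. It assumes the batch has been converted to MarcEdit's mnemonic text format (one field per line beginning with "=TAG", records separated by a blank line); the sample records and the function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical three-record batch in MarcEdit mnemonic text format.
SAMPLE_BATCH = """\
=LDR  00000nam a2200000 a 4500
=245  10$aFirst title.
=856  40$uhttp://example.org/1

=LDR  00000nam a2200000 a 4500
=245  10$aSecond title.

=LDR  00000nam a2200000 a 4500
=245  10$aThird title.
=856  40$uhttp://example.org/3
"""


def split_records(mnemonic_text):
    """Split mnemonic text into records, each a list of field lines."""
    records = []
    for block in re.split(r"\n\s*\n", mnemonic_text.strip()):
        fields = [line for line in block.splitlines() if line.startswith("=")]
        if fields:
            records.append(fields)
    return records


def count_fields(records):
    """Count how many times each tag occurs across the whole batch."""
    occurrences = Counter()
    for fields in records:
        occurrences.update(line[1:4] for line in fields)
    return occurrences


def missing_required(records, required_tags=("245", "856")):
    """Map each required tag to the 1-based positions of records lacking it."""
    missing = {}
    for tag in required_tags:
        positions = [
            number
            for number, fields in enumerate(records, start=1)
            if not any(line[1:4] == tag for line in fields)
        ]
        if positions:
            missing[tag] = positions
    return missing


records = split_records(SAMPLE_BATCH)
occurrences = count_fields(records)
print("records:", len(records))
print("245 fields:", occurrences["245"])
print("856 fields:", occurrences["856"])
print("missing:", missing_required(records))
```

In the sample, the second record has no 856, so the 856 count falls short of the record count and that record is flagged. Because a repeatable field such as the 856 can occur more than once per record, a total that matches the record count can still hide a record with none, which is why the sketch also checks record by record.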

I especially like the subtitle on the slides for step 3: The power of eyeballing. The value of isolating fields for analysis became clear immediately when each individual field was removed from its record and grouped together with its counterparts. When all the same fields are sorted together, the errors and inconsistencies truly do just jump out at you—amazing!
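The same idea of isolating fields can be sketched in a few lines of Python. Again, this is only an illustration under assumptions, not the MarcEdit and Excel workflow shown in the session: it assumes a batch in MarcEdit's mnemonic text format, and the sample data and the function name isolate_field are hypothetical.

```python
import re

# Hypothetical batch: the second record's 856 differs from its counterparts.
SAMPLE_BATCH = """\
=LDR  00000nam a2200000 a 4500
=245  10$aLaw of torts.
=856  40$uhttp://example.org/a

=LDR  00000nam a2200000 a 4500
=245  10$aContracts.
=856  41$zConnect$uexample.org/b

=LDR  00000nam a2200000 a 4500
=245  10$aEvidence.
=856  40$uhttp://example.org/c
"""


def isolate_field(mnemonic_text, tag):
    """Pull every occurrence of one tag out of its record and sort them.

    Returns (field content, record number) pairs so that an odd-looking
    field can be traced back to the record it came from.
    """
    isolated = []
    blocks = re.split(r"\n\s*\n", mnemonic_text.strip())
    for number, block in enumerate(blocks, start=1):
        for line in block.splitlines():
            if line.startswith("=" + tag):
                # Drop the "=TAG" prefix; keep indicators and subfields.
                isolated.append((line[4:].strip(), number))
    return sorted(isolated)


for content, number in isolate_field(SAMPLE_BATCH, "856"):
    print(f"record {number}: {content}")
```

Sorting groups the fields by indicators and then by subfield content, so the one 856 with different indicators and a URL lacking "http://" ends up apart from the others, where it is easy to spot.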

Yael shared helpful tips on how to clean up those errors and inconsistencies using the global update capabilities of MarcEdit. Unfortunately it is not possible to view the changes in MarcEdit before you apply them, so she recommended doing that in your ILS instead. She concluded by giving a general overview of the work of the TS-SIS Task Group on Vendor-Supplied Bibliographic Records (http://www.aallnet.org/sis/tssis/committees/cataloging/vendorbibrecords/) which has set up a wiki (http://tsvbr.pbwiki.com/) in order to share the results of such batch-load analysis.

There wasn’t much time for questions. One was whether a batch should be analyzed every time you are ready to load it; the answer was yes. There were also a few comments, one of which was that MarcEdit cannot be used with some ILSs unless the whole database is extracted. The session closed with a comment about the fact that these batches are creating many duplicates for the same content in our catalogs. The aggregator-neutral record approach for e-resources (both serials and monographs) was mentioned, but naturally that raises other complexities for which there is no easy solution at present. Many thanks to OBS and TS for sponsoring this excellent program!

Wednesday, July 23, 2008

Session A5: Encore, Enterprise, Primo and WorldCat Local: Explore the Evolving Discovery Tools for Your Catalog

Sunday, July 13, 2008 1:30 p.m. - 2:45 p.m.
  • Richard M. Jost, University of Washington, Gallagher Law Library
  • Mary Jane Kelsey, Yale Law School, Lillian Goldman Library
  • Julie Loder, Vanderbilt University
  • Debra Moore, Cerritos College

This session was a great way to get a taste of four of the major "discovery" products in a vendor-neutral setting. An excellent comparison of the features of all four platforms was compiled by the moderator, Keiko Okuhara of the University of Hawaii, and is available from the AALL handouts page at http://programmaterials.aallnet.org.

Mary Jane Kelsey spoke first about her experiences with Encore, Innovative Interfaces' (III) entry into the faceted browser market. They were development partners for Encore, and so were involved with its design from the beginning of the project.
  • Their goals for implementing Encore included improving the user experience, increasing their return on their investment in the ILS and cataloging, maintaining a credible Web presence, and bridging the gap between the Law and University OPACs (the main library is currently on Voyager).
  • As a current III customer, setup was pretty much plug-and-play, though increasing customization is available. Encore will also work with non-III systems, but the data is not displayed in real time.
  • Encore displays a Google-like single search box. (A link to perform an index search is present on subsequent pages after the main search page.) Results are relevancy-ranked, with journal titles and electronic resources "pushed" to the top of the list; the latter are Yale's customizations to the ranking algorithm.
  • Book jackets are shown at both the results and individual record level. Version 2.0 of Encore also features images from Yahoo "Safe Search" – though Mary Jane found some of those images not so "safe!"
  • A built-in spell-check validates against the database to prevent dead-ends (though this sometimes gives odd results; for example, a search on "voter fraud" responds "do you mean violin fraud?").
  • She finds the subject facets very useful; the ability to refine by tag includes a tag cloud displaying relative font sizes.

Debra Moore of Cerritos College presented their experiences with SirsiDynix's Enterprise Portal Solution (EPS) interface. They had not intended originally to implement a discovery platform, but they needed to migrate systems anyway and found funds available to make the switch, so they went ahead with EPS as their new interface for the new system.
  • EPS is intended to be a total portal to the library, not just to its resources. It is built on a proprietary software system that allows the library to create content modules (called "rooms"). The library can customize the design of each "room" within limited parameters and can select which resources appear in each "room" (for example, you could have a "psychology room" with links to your local or other OPACs, journal databases, Web searches, etc.).
  • Her favorite things about the system are the links to reviews, book jackets, and RSS feeds, and how the built-in federated search results display. (However, it is necessary to purchase a third-party product, such as Serials Solutions' 360 Search, to implement federated search.)
  • Her not-so-favorite things are the lack of authority control (e.g., a search on "movies" instead of "motion pictures" will retrieve nothing), the clunkiness of the "room builder" software, the lack of faceted search results (though this is under development by SirsiDynix as a separate product), and the speed of the system.

Julie Loder of Vanderbilt University presented an overview of Ex Libris' Primo. As Yale was with III, Vanderbilt was a development partner with Ex Libris for Primo, and did a soft rollout of their "DiscoverLibrary" in Spring 2008.
  • Their underlying OPAC is actually Unicorn, with Primo sitting "on top." At present the use of the native catalog is still required for "My Account" functions and requesting.
  • Primo works by building "pipes" to various collections (local OPAC, digital repositories, other databases) from which it extracts metadata into a standard form and "normalizes" it into an XML or PNX record.
  • Primo features include faceted search to refine results, built-in federated searching, "did you mean" spell-check based on the library's catalog metadata, user tagging, and user-customizable views. A new feature, called "third node searching," converts remote compatible databases to a Primo-type index for faster searching.
  • Initial feedback from their students has been positive, and the Primo interface is increasing the visibility of their TVNews Archives.

Richard Jost spoke last about the University of Washington's experience with WorldCat Local. (Note: the law library is not yet implementing WorldCat Local; these experiences are based on the main library.) Their implementation searches three catalogs and four article databases through one interface, including links to full text via an OpenURL resolver. WorldCat Local is still under development and not officially in release; UW is a beta partner on the project. They are still doing testing on design, the results algorithm, and request methods.
  • Results can be sorted for the user: in UW's case they display UW libraries first, consortial libraries second, and WorldCat results last. If all UW/consortial copies are checked out, the system automatically brings up an ILL form. The ease with which items can now be requested from non-UW libraries has increased their borrowing significantly.
  • Record results are displayed in a tabbed format so users can easily navigate to information about subjects, editions, libraries holding the item, item details, and reviews. WorldCat Local also offers faceted search, advanced search options and a built-in SFX-like link for articles.
  • One drawback is that WorldCat Local does not contain any 3rd-party resources for which OCLC does not have a license to put in WorldCat, such as Early English Books Online and some microform sets. For those materials users must return to the current catalog. Richard mentioned he expected these materials would eventually be able to be included, but he was dependent on OCLC obtaining the relevant licenses.

The Q&A session afterwards revealed some interesting similarities and differences among the systems. Most systems highlight the user's search terms in results (though this wasn't always apparent from the PowerPoint screenshots). Most also allow libraries some control over how results display and are sorted. Interestingly, none of the libraries had done a multi-vendor evaluation before choosing their now-implemented platform: three of the four were development partners or beta test sites, and Cerritos College chose EPS because it came with their new SirsiDynix system. In response to a comment from the audience that faceting is breaking apart pre-coordinated data, all four librarians said they did not think anything was being lost by this. Instead, users are now finding things they never would have before, and faceting is an improvement over plain keyword searching.

This was a truly interesting look at the state of things to come in library interfaces, and the 75 minutes really flew by.

Tuesday, July 22, 2008

Session D-2: The Good, the Bad, the Ugly: Rethinking Bibliographic Services in the 21st Century

Monday, July 14 8:45 a.m. – 9:30 a.m.

Richard Amelung, Saint Louis University, Omer Poos Law Library
Diane I. Hillmann, Director of Metadata Initiatives, the Information Institute of Syracuse

This session began with Richard Amelung, the AALL representative to the Working Group on the Future of Bibliographic Control, presenting an overview of the Group's members and of its final report, which was disseminated in November 2007. Along the way he clarified several aspects of the report, the role of the Working Group, and the involvement of non-library parties that have evidently been causing misconceptions among libraries and librarians:

  • Working Group members have been careful to make sure they are all sending out the same message to various library communities when individual members of the Group are invited to speak. However, they are not speaking on behalf of the Library of Congress, only the Working Group. Only LC speaks for LC.
  • The Working Group is not responding to responses to the report. (As Richard noted, "the report's done, folks!") However, Working Group members are staying on in an advisory capacity to LC.
  • The report was structured to identify what would happen if nothing was done, then provide several recommendations for action and the anticipated outcomes if the recommendations were implemented.
  • The report was intentionally "long on ideas, but short on 'how'" – that is, it was focused on outcomes, not how they might be achieved. In this way, all involved parties would be given the maximum flexibility to develop their own ideas as to how best to achieve a given outcome. Planning for how to achieve these outcomes is what needs to happen next – the detailed steps behind how we will get from Point A to Point B.
  • LC is involved with a vast number of projects, but communication between various areas within LC – and to the outside world in general – about what is going on has been an ongoing problem. Many of these projects are in various stages of development, from fully implemented to pilot stages only, but they all seem to get mentioned as if they are all at the same stage – which they're not.
  • The Working Group made recommendations to many entities outside of LC and the library/cataloging community, but many of these groups (publishers, vendors, etc.) need to be convinced of the importance of bibliographic control to their community, and to get involved.
  • Standards creation currently takes so long that by the time a standard is completed the community has moved beyond it. Multiple groups are usually involved in standards development, and unfortunately they aren't necessarily working in a coordinated fashion.

After Richard's presentation, Diane Hillmann took the podium to present "After the Report," where she described what she agreed and didn't agree with in the final version of the report. Her detailed PowerPoint slides are available at http://hdl.handle.net/1813/11115 (with even more detailed comments at http://docs.google.com/View?docid=dn8z3gs_51dsqc77), but here are some of the highlights:

  • In the age of mashups, is there any need to continue seeking the "holy grail" of a "unified philosophy of bibliographic control"? Libraries are no longer the only players in the information universe – yes, we may "do it better" than others, but maybe "good enough" is in fact good enough. Can we survive and thrive if we continue to insist on "unified" anything? Do we all have to agree on everything before any progress can be made?
  • While the report recommended increasing efficiencies, it's possible we may have wrung all of the efficiencies possible out of our current systems. It's time to explore alternate distribution systems beyond our current networks. "Sharing materials" does not necessarily have to take place only within bibliographic utilities. OCLC doesn't handle images, single items, and media well on an item-level basis. Sharing data about these types of materials may be better handled through other distribution methods.
  • Two big "YES"es from Diane: internationalizing authority files & transforming LCSH.
  • Traditional cataloging has focused too long on the secondary products of research. The new mindset needs to be more like that of archivists and Metadata Librarians, focusing on primary materials.
  • Without improved abilities for machines to manipulate our data, we are going to be locked out of participating in a world where the Web is everyone's platform. While we need a more flexible and extensible metadata carrier than MARC, LC is not necessarily well prepared to create it.
  • The idea of "return on investment" is a non-starter, because right now we really don't know what that means, or how to achieve it.
  • Suspending work on RDA made vendors breathe a sigh of relief (one less thing to do!) but it isn't realistic.
  • For a good explanation of why user data will be important in relevance ranking and selection, Diane referred to LibraryThing founder Tim Spalding's ALA presentation on his Thingology blog, http://www.librarything.com/thingology/2008/07/future-of-cataloging.php. (BTW, in the interests of full disclosure, I recently joined LibraryThing and think it's a blast.)

This was a very stimulating discussion, and for folks (like me, sorry) who just couldn't quite motivate themselves to read the final report in its entirety when it came out, a good overview of the highlights. It looks like the report will be generating discussion for some time to come, and stimulating the library profession to do some serious thinking about what we will want and need to do with "our" data (if we can even call it "ours" any more!) in the years ahead.

Monday, July 21, 2008

Task Group on Vendor Supplied Bibliographic Records

Good morning!

The Task Group on Vendor-Supplied Bibliographic Records has been working on a wiki, where we can publish evaluations of MARC record sets provided by various vendors, as well as any related information. We are happy to announce that the wiki is ready to go “live”, and can be found at http://tsvbr.pbwiki.com/.

Currently we have reviews and other information posted for BNA, CALI, HeinOnline, World Trials, MOML, and the Westlaw/Lexis records from Cassidy. Please note that some of these reviews are works in progress and may not yet be complete. The reviews will also be edited or updated as new information is received.

In addition, the VBR wiki contains a link to Yael Mandelstam’s excellent AALL presentation, “Demystifying Batchload Analysis.” It contains very useful information for analyzing large MARC record sets.

Angela
________________________________________________________________________________________

Angela Jones
Senior Technical Services Librarian
Underwood Law Library
Southern Methodist University

The Future of Subject Access in the 21st Century

Monday, July 14, 2008
4:00 - 5:15pm

Dr. Barbara Tillett
Chief, Cataloging Policy and Support Office, Library of Congress

Dr. Lois Mai Chan
Professor, School of Library and Information Science, University of Kentucky

Coordinated and moderated by Rhonda K. Lawrence

It was a great pleasure to hear these two accomplished women address the mystifying issue of how cataloging may be brought in line with current trends while at the same time retaining the positive aspects.

In her presentation, Dr. Tillett reviewed the LC assessment of the pros and cons of pre- versus post-coordination of Library of Congress Subject Headings (LCSH) and resulting changes that are being implemented at LC. The entire report and appendices can be found on the web at http://www.loc.gov/catdir/cpso/pre_vs_post.html

Definitions:
  • Pre-coordination combines elements into one heading in anticipation of a search of that compound heading
  • Post-coordination combines headings or keywords at the time of search by user
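The two definitions can be illustrated with a toy search in Python. This is a sketch for illustration only and is not drawn from the LC report; the three-record catalog, the headings, and the function names are all hypothetical.

```python
# Hypothetical catalog: record id -> subject heading strings,
# with " -- " separating the elements of a pre-coordinated string.
CATALOG = {
    1: ["Law -- Study and teaching -- France"],
    2: ["Law -- France", "Cooking -- Study and teaching"],
    3: ["Law -- Study and teaching -- United States"],
}


def precoordinated_search(catalog, heading):
    """Match the whole compound heading exactly as the cataloger built it."""
    return sorted(
        record_id
        for record_id, headings in catalog.items()
        if heading in headings
    )


def postcoordinated_search(catalog, terms):
    """Combine individual terms with AND at the time of the search."""
    wanted = {term.lower() for term in terms}
    hits = []
    for record_id, headings in catalog.items():
        # Break every heading into its elements and pool them per record.
        elements = {
            part.strip().lower()
            for heading in headings
            for part in heading.split("--")
        }
        if wanted <= elements:
            hits.append(record_id)
    return sorted(hits)


print(precoordinated_search(CATALOG, "Law -- Study and teaching -- France"))
print(postcoordinated_search(CATALOG, ["Law", "Study and teaching", "France"]))
```

The pre-coordinated search retrieves only record 1. The post-coordinated search also retrieves record 2, where "Study and teaching" belongs to a different heading, which is one way the loss of the string's context can reduce precision.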

Pros of pre-coordination include context, precision, and suggestibility.

Cons of pre-coordination include training time, syntax not understood by end users, headings too specific and many used only once, syndetic or connecting structure not rigorous enough, and slow incorporation of new terms.

Cons of post-coordination include limitations for recall and precision, poor relevance ranking capabilities, and less understanding (without the context of the string).

The recommendations of the LC report, some of which have been implemented, included:

  • Continue pre-coordinated strings
  • Assist end-users through improved "front-end" to OPAC including social tagging experiments
  • More automation of LCSH with software to suggest and verify subject headings
  • Simplification by removing the language subdivision which is already in fixed and variable fields
  • Utilize some form subdivisions in separate 655 fields (e.g., the genre/form test currently being done with motion picture/TV headings)
  • Offer LCSH in a Simple Knowledge Organization System (SKOS)

For the future, Dr. Tillett predicted that the global use of LCSH will expand and that the process will be improved to reduce costs and at the same time maintain the system.

In her presentation, Dr. Chan offered her suggestions for changes to LCSH which would be theoretically sound but at the same time be more useful to OPAC and web users. These ideas can also be found in her paper in the appendices to the LC report at http://www.loc.gov/catdir/cpso/pre_vs_post.html

Dr. Chan contrasted keyword - the predominant way to retrieve information on the web - with controlled vocabulary - the primary way for subject retrieval in library catalogs and many online databases. She outlined the many advantages of LCSH including richness, comprehensiveness, standardization, and translation into many languages. Her question is -- how can LCSH be made more flexible?

Challenges for the continued viability of LCSH included improving compatibility with other controlled vocabularies, simplifying rules for heading construction, improving tools for automatic indexing, and striving for interoperability with other retrieval languages.

In order to do this Dr. Chan recommended that the terminology be separated from the application by reconfiguring LCSH into two separate files:

  • The source vocabulary consisting of building blocks for subject cataloging and removing subject heading strings
  • A validation file which has a list of validated subject heading strings, a keyword-searchable file created from 6xx fields, a browsing tool, and is maintained by updating subject headings in the source terms

Advantages of a source vocabulary would include easier maintenance, compatibility with other controlled vocabularies, amenability to different applications in organizations, easier translation to other languages, facilitating automatic assignment of subjects, and more compatibility with user-assigned terms and social tagging.

Advantages of a validation file would include increased productivity by easily accessible ready-made subject headings, higher rate of correctly constructed subject heading strings, and facilitating end-user browsing and searching by showing search terms in their proper context in the string.

Dr. Chan looked at the advantages and disadvantages of social tagging. The positives of user participation and empowerment are tempered by the inability to represent complexities which will become more of a problem as content scale increases (example of Flickr with 6000 pictures tagged as "summer vacation"). She suggested improving social tagging by suggesting LCSH subject headings to users during tagging and by mapping user-assigned tags to LCSH.
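As a rough illustration of the second suggestion, mapping user-assigned tags to LCSH, consider the Python sketch below. It is not from Dr. Chan's paper; the mapping table and function names are hypothetical, and a real mapping would have to be built and maintained against the actual LCSH files.

```python
# Hypothetical lookup table from normalized user tags to LCSH headings.
TAG_TO_LCSH = {
    "movies": "Motion pictures",
    "films": "Motion pictures",
}


def normalize(tag):
    """Lowercase a tag and collapse stray whitespace."""
    return " ".join(tag.lower().split())


def map_tags(tags, mapping=TAG_TO_LCSH):
    """Group user tags under their LCSH heading; collect the rest."""
    mapped = {}
    unmapped = []
    for tag in tags:
        heading = mapping.get(normalize(tag))
        if heading is not None:
            mapped.setdefault(heading, []).append(tag)
        else:
            unmapped.append(tag)
    return mapped, unmapped


mapped, unmapped = map_tags(["Movies", " films ", "summer vacation"])
print(mapped)
print(unmapped)
```

Here "Movies" and " films " are both gathered under the heading "Motion pictures", while "summer vacation" has no entry in the table and is left for review. The same lookup could be used to suggest a heading to a user while they are typing a tag.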

The lively discussion at the Roundtable following the presentation suggested that many of us are realizing changes are necessary, and are becoming engaged in the conversation. It was encouraging to see that leaders in the field are doing some creative thinking about options and stimulating discussion of some potentially reasonable, but not simple to implement, options for the future.

Friday, July 18, 2008

The Good, the Bad, the Ugly

Diane Hillmann's presentation from Monday, The Good, The Bad, The Ugly, can be found at: http://hdl.handle.net/1813/11115
Please check it out if you are interested!

Title: After the Report: Reactions to “On The Record”
Authors: Hillmann, Diane I.
Keywords: cataloging / metadata
Issue Date: 16-Jul-2008
Abstract: Presentation at the American Association of Law Libraries annual meeting in Portland, OR, July 14, 2008.

Thanks,
Andrea


-posted by Andrea Rabbia