Web-scale library systems #IL2011

Andrew Pace, OCLC

We sorta fixed the OPAC, sorta built an ERM, didn’t know what to do next.

Web-based cooperative library management tool.

It’s built on WorldCat: not downloading records to our local system.

Staff systems are just as pleasurable as those for the public.

Everything colorful and illustrated with book covers.

Managing electronic resources (databases) in the same system as books.

Make libraries a gravitational hub (like Google, Amazon, Wikipedia)

Worldwide libraries are doing about 5,000 transactions / second — which is small compared to some big web sites.


“App gallery” coming in November. Examples:
Pulling the NY Times bestseller list and comparing it to your holdings.
Checking your acquisition budget as you shop in Amazon.
Adding library data to a book shopping app.
Making a map to show patrons how to go straight to a book on the shelves.
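
The first example, comparing a bestseller list to your holdings, comes down to a set comparison. A minimal sketch, assuming both lists are already available as ISBNs; the ISBN values and the function name are made up for illustration, and fetching the real list or querying a real catalog is not shown.

```python
def compare_to_holdings(bestseller_isbns, holdings_isbns):
    """Split a bestseller list into titles the library holds and titles it lacks."""
    held_set = set(holdings_isbns)
    held = [isbn for isbn in bestseller_isbns if isbn in held_set]
    missing = [isbn for isbn in bestseller_isbns if isbn not in held_set]
    return held, missing


# Hypothetical ISBNs, for illustration only.
bestsellers = ["9780000000011", "9780000000028", "9780000000035"]
holdings = ["9780000000028", "9780000000042"]

held, missing = compare_to_holdings(bestsellers, holdings)
print("Held:", held)
print("Not held:", missing)
```

The "not held" list is the part a selector would probably care about.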

User support center: interact with other users, watch short videos.

Robin Hartman, Hope International Univ.

Small library

Using Serials Solutions, LibGuides, etc.

About six months between signing agreement and going live.

They didn’t have a lot of staff expertise, trusted OCLC’s reputation, believed in resource sharing.

Would rather go with Web 2.0 than try to keep an old Unix system running.

Search is a little weird, because it searches all of WorldCat, unless you limit to local holdings.

Larry Haight, Simpson Univ., Redding

Visionary, independent (from local network), affordable (server died, live with WMS an hour later, don’t need to buy a new one), holistic (pieces like EZ proxy, archives, repository all work together), efficient (why don’t we share vendor records and license records, etc.), full-featured, collaborative, supported, reliable, durable.


Future of the integrated library system #IL2011

Walter Nelson, RAND

Predicts that if we continue as we are now, the ILS has no future.

If we free it from its constraints, it maybe has a future.

OPAC: a destination, customers must go there. It’s like the card catalog room, library as a place. Good at books. Not so good at journals, since people want articles, not journals. Not so good with other digital content.

ILS integrates with itself. Real integration is integrating with other stuff: hr, accounting.

A system that tries to do everything well will do nothing well. Example: Sharepoint.

Do your customers prefer your OPAC or Google? Is it the first place your customers look? Is it the first place you look?

Does your OPAC look like your web site? Does content in your ILS display on your web site or do you have to enter data multiple times?

Is the ILS increasing or decreasing in relevance? Is it the best use of the money?

What catalogs are good at: clean data, consistent standards.

RAND needed data about its own publications. Publications Dept. data was messy and inconsistent. The library had been cleaning up their data for many years. Consistent data about authors, taxonomy, standards. Extracted it from the OPAC with great difficulty.

Your OPAC is a database-driven web site, like all major web sites these days. Data is separated from display.

List of demands:

Set my data free. Exquisitely crafted data trapped in this obscure corner of the web. Present data in multiple places in multiple formats.

Set my interface free. Have to wait for vendors to upgrade. Open up the interface to let people tinker with it.

Set my search free. Allow search in various ways. Real time and dynamic, not done with periodic data dumps.

Discovery, content management systems, mashups will replace ILS systems.

Drupal can handle MARC records, XML, full text searching; not circulation, journal management, though.

ILS becomes a CMS. It could run the whole library web site. If not, hitch our cart to a better mule.

Why is the ILS the way it is? The vendors give us what we ask for.

Vendors need to provide flexible, easy-to-use CMS systems out of the box, so even non-programmers can hack into it.

Librarians need to get more techie.

Andrew Pace, OCLC

[Bear in mind that OCLC is selling a “web-scale” system — i.e., “in the cloud.” I don’t doubt that Pace honestly believes that’s the way of the future. However, it’s sometimes the case that when you have a hammer, every problem looks like a nail.]

– power: trying to power multiple functions at multiple libraries with one “generator.”
– metadata: stuff is out on the cloud; why do we have our metadata on the home generator?
– “hosted” and “cloud”: separate systems not talking to each other
– unintended consequences: vendors gave us what we asked for

Irrelevance: indexing journal articles done on free services like Google Scholar and Pubmed. Need to aggregate and syndicate library data so it gets in search engines. Core business of libraries is delivery.

Innovation: Black box. We’re reduced to hacking into OPACs. On a good day, we look like masters. On a bad day, we blame the environment, beg for standards, etc. The big guys (Google, Amazon, Facebook, eBay) put together data, infrastructure, and community. Data is valuable, but the price people pay for it is declining. Applications are disposable.

Identity: The big guys know who their customers are. We know who our customers are, but we pretend that we don’t. Libraries value privacy. We need to think about managing our customers like a CRM. Privacy, cloud, and services that patrons expect are not mutually exclusive.

  • Data will live in the cloud.
  • Re-integration will occur
  • The future requires new technology
  • The traditional ILS will last as long as the traditional library
  • The cult of MARC must be subverted
  • Collection development will be at collection level, not record by record
  • Metadata might be saved by linked data
  • Big switches will drive traffic to libraries
  • Scaled innovation will occur only through open development platforms
  • Done well, it will scale library business intelligence
  • Re-assess the role of libraries in assisting customers
  • Guard privacy and anonymity while providing services customers expect
  • Shake off ethic of not caring how info is used
  • Think about patrons’ entire timeline of interaction with the library

Marshall Breeding, Vanderbilt

Library catalog in 2015

telnet, graphic, web, next-gen, discovery systems. 2015: The Library. We want the whole thing; we don’t want it chopped up, same as when you go to Amazon.

Undergraduates are satisfied. Librarians want the “right” results. Know the collection and the catalog inside and out. For serious researchers (medical, Ph.D.), missing something could have serious consequences.

Optimistic that we’ll get there.

“Web-scale” search: all potential objects that the library cares about. Discovery systems are large, but incomplete, and their relevancy is not so good. Key articles and books don’t necessarily rise to the top.

2015: Indexes will be comprehensive; publishers will provide data. Less comprehensive sources get marginalized.

Transparency: Librarians will be able to understand what’s included, the depth of indexing

Relevancy: Good, but not great. In big indexes, you can’t just match keywords.

Re-integration between Discovery and back end.

Today: discovery interface is a separate silo from the rest of the library web site. There are some new systems that are integrating the whole library web site.

2015: Discovery will not be a separate activity. All customer activity as part of an organic and comprehensive web presence. Presentation layer connects to ILS, library web site, subject guide, article databases and e-books.

Future Ready 365 #IL2011 #FR365

Steven Abram: “How you can bring people along with your awesomeness.” [The two organizers of SLA’s year-long Future Ready 365 project talk about how they did it.]

Cindy Romaine: Technology: how does it work and how do I get it implemented? Social and financial and other implications.

Adopt an attitude of being more flexible, adaptable, and confident.

A post a day on the web site.

Logistics: people, process, tools, posts

Malcolm Gladwell (from his book): mavens, connectors, salespeople

Meryl Cole: thought they could just send an e-mail asking for posts.

Use cloud stuff to do everything: Google Docs to track everything.

Personal communication works better to encourage people to write a post.

Cindy: Very good reading because it’s honest and sincere.

Guy Kawasaki says finding knowledge locked away in databases is where research librarians still hold the key.

Juliane Schneider: “Data fusion” is what tech services needs to add to its repertoire.

Promotion: Info outlook, targeting specific units within SLA, presentations to groups (you never know who will show up)

Lessons learned:
define higher purpose (focus on purpose, not methods)
build team
set goals and objectives (coordinated action)
get buy-in
get budget and resources
build a tribe
lather, rinse, repeat

Digital Content Tools: Thesaurus and Folksonomies #IL2011

Aubrey Madler, Rural Assistance Center

40% of search failures have to do with vocabulary failures

Types of controlled vocabularies:

  • List
  • Synonym ring
  • Hierarchy
  • Thesaurus

Simple list

Synonym ring: dog, canine, mutt, pooch

Hierarchy: UF, BT, NT (used for, broader term, narrower term)

Thesaurus: Medical Subject Headings (MeSH) from NLM, for example
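
A rough sketch of how these vocabulary types might look as data. The synonym ring reuses the dog example from the talk; the thesaurus entries, the UF/BT/NT sample values, and the function names are all made up for illustration and do not come from MeSH or any real thesaurus.

```python
# Synonym ring: every term in the ring is treated as equivalent.
SYNONYM_RING = {"dog", "canine", "mutt", "pooch"}

# Thesaurus entries keyed by preferred term (hypothetical sample data).
# UF = used for, BT = broader term, NT = narrower term.
THESAURUS = {
    "dogs": {"UF": ["canines", "pooches"], "BT": ["pets"], "NT": ["puppies"]},
    "pets": {"UF": [], "BT": [], "NT": ["dogs"]},
    "puppies": {"UF": [], "BT": ["dogs"], "NT": []},
}


def expand_with_ring(term, ring):
    """Return every term in the ring if the term belongs to it, else just the term."""
    return sorted(ring) if term in ring else [term]


def preferred_term(term, thesaurus):
    """Map a term to its preferred form by checking the UF (used for) lists."""
    if term in thesaurus:
        return term
    for preferred, entry in thesaurus.items():
        if term in entry["UF"]:
            return preferred
    return None


print(expand_with_ring("mutt", SYNONYM_RING))
print(preferred_term("canines", THESAURUS))
print(THESAURUS["dogs"]["BT"])
```

The ring only widens a search; the thesaurus also picks one preferred term per concept and records where it sits in the hierarchy.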

Creating a thesaurus

1. Generate wordstock – subject experts, users, publications, your org.

Top-down: start with existing thesauri
Bottom-up: start with users

2. Decide on format, choose preferred terms and identify synonyms. Consistent about capitalization. Usually use plurals. One term per concept.

3. Choose hierarchies and facets. Facets are ways to refine search. In shopping site, for example, color, brand, price range.

4. Add associative terms: related term, see also. Both terms should be in the thesaurus, but you want to help the user find both.

5. Select thesaurus design and display. Electronic or print? Staff only or visible to public? Facet, hierarchy, or both? Topical order or alphabetical? Visual or strictly text?

It’s never finished. New terms come up all the time. You’ll notice gaps, which could show in search logs. The collection changes. Society, literature, fashion change. Are you going to reindex the collection?
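
One way to spot gaps in search logs is to count queries that match nothing in the vocabulary. A minimal sketch, assuming the log is a plain list of query strings and matching is exact after lowercasing; the sample log, the vocabulary, and the min_count threshold are invented for illustration.

```python
from collections import Counter


def vocabulary_gaps(search_log, vocabulary, min_count=2):
    """Count searched terms that are missing from the controlled vocabulary."""
    known = {term.lower() for term in vocabulary}
    counts = Counter(
        query.strip().lower()
        for query in search_log
        if query.strip() and query.strip().lower() not in known
    )
    # Keep only terms searched often enough to be worth reviewing.
    return [(term, n) for term, n in counts.most_common() if n >= min_count]


# Hypothetical log and vocabulary.
log = ["telehealth", "Telehealth", "rural hospitals", "broadband", "telehealth", "broadband"]
vocab = ["Rural hospitals", "Telemedicine"]
print(vocabulary_gaps(log, vocab))
```

Here "telehealth" would surface as a candidate synonym for the existing term "Telemedicine", which is a decision for a person, not the script.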

Planning and maintenance. Document what you did and why.

Melissa Rosales, TBWA/Chiat Day, and Andrew Carlos, The Harker School

Folksonomies: quick and dirty

Allow anyone to add a tag. An impromptu bridge. It’s a cost-effective way of dealing with a large amount of information.

Trust the power user who is more knowledgeable.

“Ugly tags”: people don’t know the “rules of the house.”

Tags identify who it’s about, what it is, who owns it, refining categories, qualities and characteristics, self-reference, task organization (e.g., toread, towatch).

Serendipity: “I found it!”

Tools like Google Reader, Google Alerts: have to preload them.

Browsing tags is somewhat similar to shelf-browsing. Tag browsing, though, is more proactive.

The Wikipedia Game: find connections from one thing to another.

Social bookmarking: Delicious and others. (Zootool, Pinterest, The Fancy)

Delicious is more visual than it used to be. Zootool allows sorting by media.

Social media more than just text and sharing, heavy interest in visual media.

Issues and problems:

  • Polysemy (multiple meanings)
  • Synonymy
  • Basic level variation

Users like to tag in the singular, because they are thinking of the one item they are tagging.

Some users are very granular, specific (“young adult paranormal romance”)
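
Since users tend to tag in the singular while thesauri usually use plurals, one simple bridge is to try a few plural forms of each tag against the preferred terms. A naive sketch under that assumption; the pluralization rules are deliberately crude, and the preferred terms and function name are invented for illustration.

```python
def reconcile_tag(tag, preferred_terms):
    """Map a user tag to a preferred term, or return None to keep it as a free tag."""
    cleaned = " ".join(tag.lower().split())
    # Try the tag as typed, then a few crude English plural forms.
    candidates = [cleaned, cleaned + "s", cleaned + "es"]
    if cleaned.endswith("y"):
        candidates.append(cleaned[:-1] + "ies")
    for candidate in candidates:
        if candidate in preferred_terms:
            return candidate
    return None


# Hypothetical preferred terms.
PREFERRED = {"dogs", "libraries", "churches"}
for tag in ["Dog", "library", "church", "toread"]:
    print(tag, "->", reconcile_tag(tag, PREFERRED))
```

Tags that do not match, such as "toread", stay as free tags rather than being forced into the vocabulary.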

Moderator: what about combination of thesaurus and folksonomy?

Melissa: One case found 92% original content from users.

Aubrey: LC did that with their photo collection.

Audience: RLG study on user-generated content. www.oclc.org/research/activities/aggregating/

What systems?

Aubrey: Home-grown content management system.

Andrew: Has tags enabled in his system (catalog?)

Audience: Uses medical subject headings, but borrows from LC, nursing terminology. Has trouble tagging personal items. Some days in a more general mood, some days more specific.

Moderator: LC subject terms don’t always match subject matter experts (curators at his museums)

School librarian: Students take notes. Tagging makes more sense after you’ve thought about the subject, looked at subheadings, etc.

Keynote: James Werle, Steven Abram, Liz Lawley #IL2011

James Werle, Internet 2 director:

1996: 28% of U.S. people on Internet and they spent a half-hour per month online

2015: Internet use expected to quadruple. 2 devices per person worldwide. IP traffic dominated by video. 5 years’ worth of video every second.

Video conferencing, cloud computing, real-time gaming, immersive/3D environments

All require high bandwidth.

50% of public libraries report slow access at least part of the time. 75% were not able to increase their bandwidth in the past year.

Internet 2: national fabric of state and regional not-for-profit educational and research networks: four-year universities, museums, etc.

bit.ly/pFcfRp: Gates Foundation report on “Connections, Capacity, Community”

What keeps you awake at night?

Steven Abram: polarization of opinions. Everyone’s against something, nobody’s for anything. Apple fanboys will defend censorship of books on iPhone app. They’ll ban the SI swimsuit issue, political apps. Advertising is coming to books; that’s the purpose of the big digitization projects. You are the product. Content spam (e.g., medical works that don’t tell you about side effects, Demand Media).

Liz Lawley: Bandwidth is an issue. We’ve been complaining about that for years. This is basic, like plumbing. We’re not spending enough time talking about network neutrality. Are you going to have a meaningful piece of this bandwidth? Lots of video, but most of it’s crap. Fears about cloud computing. I have a lot of stuff on Google Docs, and it keeps going down. Content doesn’t have to be immersive in a technical sense to be immersive in an emotional sense.

James Werle: Agree bandwidth is coming, but is it coming to a library near you and can you afford it?

Liz: We need to pay attention to “freemium services.” What do we pay for and why? Why do I pay for some of my apps? I pay for Evernote because I like the flexibility of getting to it from any device. People are willing to pay for a good experience. Context often trumps content. The way you present something really matters. How they feel about that experience, do they feel successful? They should come out of every experience saying “I’m awesome!” Interface, experience, interaction on a much higher level.

Steven: When it’s freemium, you are the product. It’s about segregating the power users from other users. Google Plus’ demand for real identity has to do with tying names to credit cards to market to them. Amazon’s public library deal: we served up our customers to them. I’m appalled at the lack of discussion. I’m not against it; I’m just appalled at the lack of discussion. We probably need regulation for an information economy. What is ALA doing about this?

James: Is it possible to have a unified voice?

Steven: I’m not suggesting a unified voice. I just want the discussion to be informed by our voices.

Single disruptive technology trend?

Liz: I’m watching “gamification.” Can be a disruptive trend. It’s difficult to do these things well. People are still interested in tangible things. Paper is not going away. 3D printing. (I didn’t know about this. See Youtube video, for example)

Steven: Printing human body parts. Subscription models for content at the consumer level. Frictionless commerce.

Liz: QR codes not nearly as interesting as RFID. Phones will all have RFID. Scan your wallet to pay for things. Feels magic. Risks? Sure. But the potential …

Steven: It will change user behavior. We need to ask who’s going to pay to move that content to the top? Are the drug companies going to write medical info? The car companies consumer info?

Liz: Video “Rendering synthetic objects into legacy photographs” Inserted dragon into picture of dining room, even getting the lighting right. Harder for us to know what’s real.

James: Videoconferencing. Tele-immersion: like being in the same room with someone who’s far away.

Steven: We’re close to “Save me, Obi-wan.” Being somewhere, scanning body parts, etc.

Liz: Google Hangouts is group video chat that just works. Skype is still clunky. Tech needs to work. You have to make it accessible and make it work.

Steven: It changes things when you can see the person and their emotions.

Liz: It makes a difference to be in the same room with people. I don’t do distance teaching. When I can see the whites of your eyes, I can tell if I have you and I can tell if I’ve lost you.

Anything you want the audience to leave here knowing?

James: Internet 2, video conferencing.

Steven: Be more radical, find our voice. Spend more time understanding the other point of view, rather than demonizing. Remember our principles and be radical about them.

Liz: I want you to remember what it’s like to be a kid, what’s playful and magical. Think about ways technology blends into the background. Some people turn on Google Hangouts and just let it run.

Social Media Storytelling #IL2011

Melissa Rosales, TBWA/Chiat Day, and Andrew Carlos, The Harker School:

  • Ambience: curating to stand out
  • Purpose
  • Participation in conversation

Story is a familiar format, creates engagement, loyalty, interest.

New media technologies are useless without a compelling brand story.

Gatorade campaign with 30-somethings training for and replaying high school football games.

Companies with a compelling story: Chipotle (“food with integrity”), Tom’s of Maine, Apple

Traditional hard sell doesn’t work.

Answer the question: why do I care about your organization?

Jeremy Snell and Matthew Montgomery, Mechanics Institute, San Francisco:

Services for writers and literary orgs in SF

Decided to start a web portal.

Two guys, no money, 3 months.

Decided to use Drupal.

Did hand-sketched wireframes to show people.

Drupal 6, no custom modules

Tested site with two people he knew with differing degrees of Internet experience.

Digital natives don’t necessarily read instructions (such as confirmation e-mails).

http://kuler.adobe.com – color palettes

Integrating content for creative products & services #IL2011

Elena Maslyukova, World Bank, D.C.

6 libraries, 50 professional staff in D.C. (3 more in Kenya and India, so they can have 24-hour service)

Skype librarian

News products

55K+ e-journals

120+ databases

Popular: FT, Economist, WSJ, Factiva, EIU Country reports

Knowledge workers: 4.4 hours/day on smartphones, 2.9 hours/day other handhelds, 1.9 hours/day e-book readers

Apps: World Bank’s own published books

Mobile access: library catalog, library web site, licensed databases

Blackberry: custom feed of news from library databases

Licensing: some publishers wanted more money for mobile access

Special web pages for supported devices to tell people which databases have mobile versions

Posted messages on internal social web sites to inform staff


Ideal solution: Database A-Z list would show different types of access. Currently testing top 20 databases.

Demand from users for mobile access

Visible and invisible intermediation: example of visible: Youtube videos about how to use mobile databases.

Ask vendors for mobile access when buying new products.

Big learning curve for library staff.

Christopher Connell, Institute for Defense Analyses (think tank for Pentagon)

Citation analysis

Use subscription databases and free A&I services

Goals: library as hub of information about publishing at your institution

  • Google Scholar Citation
  • Microsoft Academic Search
  • Scopus
  • Web of Science

Institutional publication list

Alerts from databases:
Tell staff their papers have been indexed
Tell staff their papers have been cited

Integrate with staff expertise directory

Integrate with local repository

Institutional metrics (see how you compare with similar institutions)

New publications list: RSS pulls from Scopus and puts it on web site. Link resolution allows going to full text.
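
A minimal sketch of the RSS-to-web-page step, assuming a standard RSS 2.0 feed. The sample feed, its URLs, and the function name are invented for illustration; this is not Scopus’ actual feed format, and link resolution is not shown.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical feed content standing in for a database alert feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>New publications (sample)</title>
    <item>
      <title>Example paper A &amp; B</title>
      <link>http://example.org/paper-a</link>
    </item>
    <item>
      <title>Example paper C</title>
      <link>http://example.org/paper-c</link>
    </item>
  </channel>
</rss>"""


def feed_to_html(feed_xml):
    """Turn RSS <item> elements into an HTML list of links."""
    root = ET.fromstring(feed_xml)
    rows = []
    for item in root.iter("item"):
        title = item.findtext("title", default="(untitled)")
        link = item.findtext("link", default="")
        rows.append(
            '<li><a href="{}">{}</a></li>'.format(escape(link, quote=True), escape(title))
        )
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


print(feed_to_html(SAMPLE_FEED))
```

In practice the feed would be fetched on a schedule and the resulting list dropped into a page template.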

U. of Hong Kong uses Scopus’ API to integrate with its DSpace repository.

Knowledge Management (KM) and Libraries #IL2011

Jaye Lapachet and Camille Reynolds, law firm librarians

Problem with ordering clothes. Make sure synonyms are included, even slang

What is KM? Capture, search, collaboration, analysis, process (incl. process improvement)

What is it not? Magic, tech alone, easy

KM got a bad reputation, but people knew and trusted librarians

KM is about the content, not the container: managing the flow of the content

The focus needs to be on organizing and disseminating the flow of information.

KM is change mgt.

It’s not about technology, not a new toy, not a new task, but changing the way people think. “We want to tell the story of your work.”

Define KM for your org.
Identify a champion
Solidify support from management; bring it up at meetings
Get buy-in from affected staff: it looks like more work
Start small, don’t try to org. everything at the beginning

Decide what problem you want to solve. Remember that problem we had? Decisionmakers love that.

Don’t kid yourself that you can do this with existing resources. Make a plan, consider staff you have, but also temps, consultants. Don’t overpromise. Are there low-value tasks you can stop doing?

For a pilot, we don’t advocate buying a lot of new tools. No extra money and little extra training. Maybe there are consumer products you can adapt. Maybe IT has some extra licenses for software that they aren’t using.

Team with records and risk mgt.

One firm never had an intranet until 2007. Started “eLibrary” with electronic law books.

Put filings into document mgt. system.

Lawyer wanted one place to search. Started pilot with pbworks. Page on a legal topic: links go to Lexis and Westlaw treatises. They were happy to help, because it exposed their content.

Working on linking into catalog, too.


  • Change mgt
  • Money
  • Staff resources
  • Time

“Do or do not, there is no try.”


Eric Bryan, Angela Gillis, Robert McAllister, librarians from Boeing:

Modified III catalog and turned it into an institutional repository.

Focus on internal Boeing documents.


1. Training (if it wasn’t easy, they weren’t going to use it)
2. As little effect on library staff as possible
3. different levels of security for different kinds of documents.

Lib. catalog had a robust search engine, stored documents, and had different levels of security.

Added pages to allow users to upload docs.

Added gateway pages as interfaces.

Catalog modifications:

A 690 field for name of collection. 856 for links and security restrictions.
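
A simplified sketch of how a collection name in 690 can scope a search, with the link coming from 856. Records are plain dicts here, with 245 used as the title field; real MARC fields carry indicators and subfields, which are left out. The sample records, URLs, and function name are invented, and this is not the III system’s actual mechanism.

```python
# Simplified stand-ins for catalog records: tag -> list of field values.
# Real MARC fields have indicators and subfields, which are omitted here.
RECORDS = [
    {"245": ["Wing design notes"], "690": ["Engineering Reports"],
     "856": ["http://intranet.example/docs/1"]},
    {"245": ["Supplier handbook"], "690": ["Procurement"],
     "856": ["http://intranet.example/docs/2"]},
]


def search_collection(records, collection, keyword):
    """Return (title, link) pairs for records in one collection whose title matches."""
    hits = []
    for record in records:
        if collection not in record.get("690", []):
            continue  # record belongs to a different collection
        links = record.get("856") or [""]
        for title in record.get("245", []):
            if keyword.lower() in title.lower():
                hits.append((title, links[0]))
    return hits


print(search_collection(RECORDS, "Engineering Reports", "wing"))
```

A gateway page for one workgroup would fix the collection argument and let users supply only the keyword.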

Search widget: HTML and JS that can go on any page. Search form that pulls from catalog.

Upload page:

Their login causes the system to add custom (hidden) fields to the form behind the scenes. In effect, letting end users do some of the cataloging.
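
A rough sketch of the hidden-field idea: values already known from the login are written into the upload form so the user never has to type them. The profile keys, field names, and function name are hypothetical, not the ones Boeing used.

```python
from html import escape


def hidden_fields(user_profile):
    """Build hidden <input> elements from values known about the logged-in user."""
    return "\n".join(
        '<input type="hidden" name="{}" value="{}">'.format(
            escape(name, quote=True), escape(str(value), quote=True)
        )
        for name, value in sorted(user_profile.items())
    )


# Hypothetical profile for a logged-in user.
profile = {"collection": "Engineering Reports", "security_level": "internal"}
print(hidden_fields(profile))
```

Escaping the values matters because they end up inside HTML attributes.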

Gateway pages: Different pages based on a common template. Pages customized for different workgroups. “Bump it” connected to internal social network, so people can recommend things. Catalog search only searches the collection relevant to them. Feed link, so they can have updated content (news, etc.).

Integration with enterprise search: library’s bibliographic data gets pushed into the search engine. So people find it there as well as by coming directly to the library catalog. About 35-40% of their searches originate there!

Over past 3 years, 25% increase in catalog.

Lessons learned:

  • find champion
  • marketing with website and social media
  • integration with search
  • go where your users are
  • gather specific requirements
  • match your services to specific needs
  • use your expertise in information science
  • users like the high-tech, high-touch level of service
  • Be clear, set boundaries as to what this service can and cannot do

Drupal to the next level #IL2011

Ruth Kneale, National Solar Observatory

Cary Gordon, Cherry Hill Company

Ruth: Having to do every single page gets monotonous. Compared CMSes and went with Drupal. Showed it off to whole org. and everyone was excited.

Goals: Others can update content without having to go through her, expandable, little training required.

Cary: Built web sites for California State Library and other libraries. HTML -> ColdFusion -> Drupal (cheaper for libraries).

Build web sites as an iterative process. Make an example, show it to people so they can see if that’s what they really want.

Cherry Hill acted as consultant for NSO.

Used project management system: Unfuddle.

Base theme that can be customized.

Review every single page, ask people for new content. Nag them if necessary.

Review -> Nag -> Review -> Nag -> Migrate

Limit number of people who have accounts to make changes to site.

Module called Article shows articles on main news page and in a box on other pages.

Consistent navigation on every page.

NSF-required footer on every page shows up automatically as part of the template. No one needs to remember to put it there.

Round of reviews: differing ideas about where the search box should be, for example.

Next steps: finish tweaking, flip live web sites, train content managers, start work on second phase.

Easy to train people on editing pages: it looks like Word to them.

I asked whether Ruth had a catalog or other database. Ruth: They have a database of their data with its own GUI; looking forward to putting that on Drupal.

Cary: Drupal is very good at sucking in data. You can import everything or do it dynamically. ILSes tend to be proprietary and the vendors aren’t very helpful. You can do scraping and a kind of “back-door API.”

Ruth: They don’t actually have a catalog of their collection; will do that in third phase.

Question about Drupal 7: Theme will have to be modified when moving from Drupal 6 to Drupal 7.

E-books Issues #IL2011

Bobbi Newman: Digital divide. You need an e-reader and high-speed access at home.

2,000 books is not a big enough collection for a public library.

Hallelujah, you can now read Overdrive books from the library on an Amazon Kindle. Amazon got a lot of good marketing stats, and libraries got zero.

Sarah Houghton: We don’t read the fine print. We take whatever books we can get. The average librarian thinks you own the books; you lease them. Terms of service override copyright law. Companies can put whatever they want in the terms of service. When patrons borrow through Amazon, it’s being kept on a corporate web site, and they can’t take it off as far as we know. Violates our principles. Also they try to sell you stuff when your book is due.

Amy Affelt, corporate librarian: Would like to buy portions of e-books and usually can’t. Usually have to click a box that says “read on this PC only.” Amazon doesn’t let you download to somebody else’s Kindle. Want to be able to pay to read it across all platforms. Ends up paying to read on any device they want to read on.

Faith Ward, elem. school libn.: 1st graders made more mistakes with their reading on the e-book. OTOH, when kids are interested in the e-book, she can get them to do more. Standards are requiring the ability to read non-print texts.

Question re. Adobe digital editions that you can’t pass along. Infotrieve has a problem explaining what they do to publishers. Bobbi: DRM does not work with print material.

Question re. cost-benefit of e-book circulation: Sarah: It depends on what you’re buying. We don’t get the discount we traditionally got on print books.

AIIP negotiated an agreement with some publishers for the right to pass along e-documents. Response: No unified voice for all libraries.

Container and content go together. If you want to use Kindle, you’re buying from Amazon and taking its terms.

Amazon gives inconsistent answers about what libraries can do with their products.

Question re. what law libraries can do: you want no DRM, you want reasonable pricing comparable to what we’ve had for print, you want to be able to take it to different platforms.

We don’t sign contracts with our paper book vendors; why are we signing contracts with e-book vendors?

Colorado libraries are doing some innovative things in getting contracts from publishers.

Libraryrenewal.org – org. trying to negotiate these issues.