Tuesday, March 25, 2008

Define Yourself, Sir

When I took early modern philosophy in college, one of the main problems that we ran into between any two philosophers was their difference of definition. Kant might have meant one thing by his use of “mind” or “knowledge”, and Leibniz would have a very different meaning. And then if you threw Berkeley into the mix, well, you had a rumble on your hands.

Just kidding. Philosophers do not have fistfights.

Anyway, this is a major problem in most fields, because if you don’t define your terms, no one is ever going to understand you. The beauty of FRBR, for example, is that they defined the shit out of everything. You might have to read it five times to get it, but they DO define their terms.
The problem that I’m seeing more and more and more in blogs and in listservs is that librarians are not defining their terms and therefore make themselves completely unintelligible to anyone who wants to understand them. They also simultaneously make themselves completely dismissible by anyone who doesn’t care to listen to them.

A good example is this listserv I'm on. I subscribe to it, mostly just to read all the smart people’s contributions. I am, unfortunately, a listserv-lurker. Anyway, there has been this discussion about non-literal vs. literal strings. Don’t ask me to explain what those are, because I can’t. But someone decided to try to explain the basic difference, finally, after about 5-6 emails had already bounced around that used the terms without definition, and what happens? An email immediately comes through from someone saying “Thank you!” for explaining the terms. It took 5-6 emails for that one person just to understand the terms...not the argument about the terms, just the terms themselves.

What is it with people? Is it that hard to understand that you might not always make sense? Especially when you’re talking about very difficult concepts that only use words as placeholders and not as describers? I mean, if the person were using “literal” in the literal sense…well, I have no idea what that could mean in relation to “non-literal.” This is one of those times when you absolutely have to define yourself, or everyone’s eyes will just glaze over and you’ll never get anywhere.

To extrapolate this further, one of the biggest issues I see in blogs is that people won’t define their terms, and let’s face it, the library world is really not that well-defined. Yes, we have standards and we have codes of ethics and we have conferences, but we still insist on using whatever term our local database has contrived for a “work” or a “bib” or a “title” (all the same thing, by the way—just different terms, all dependent on your ILS). FRBR tried to help us out by changing some of the ways that we think about conceptual objects in the library world, but I don’t think that most librarians are well-versed in that FRBR world, and don't use those terms on a regular basis. And since RDA apparently isn’t even using all of FRBR's concepts to write their manual…well, I don’t exactly see a light at the end of the tunnel.

Monday, March 24, 2008

Thomas Mann's Response to the Working Group

This is big right now...Thomas Mann wrote a paper about the Working Group's paper on RDA that they presented back in November. There's another response to his work from the autocat listserv here.

I’m just going to be taking the things I highlighted from his work and talking about them. Not very systematic, but hopefully it acts as a good guide for me personally when I go back through the work. This is a long post, although I think it's worth it for anyone who hasn't read the thing yet.

Please keep in mind that I just love anecdotal evidence, unlike many people who do not believe a thing unless it has a graph. I think that a reference librarian with 30+ years of experience has the right to make observations about users, without doing a study on it first. But I'm very "unscientific" that way. So here goes!

Pg. 11: “the goal of cataloging is not merely to provide researchers with ‘something quickly’…its purpose, first and foremost, is to show ‘what the library has’—i.e. in its own local collections, onsite.”
Amen, Thomas. I don’t understand when or how the idea came about, that libraries are not just responsible for their own holdings, but for the entire scope of human knowledge everywhere. If that was the case, we wouldn’t keep physical books at all; we’d be….um, OCLC? Google? A wish list?

Pg. 16: “a major weakness of word clouds is that they cannot show cross-references, scope notes, or further subdivisions of their own terms…we must remain clear about the differences between catalog search environments and Web search environments.”
I think that this is important…we get so excited, as librarians, to see word clouds, that we forget that we are not the user. The user will see a word cloud and think “oh, more words like what I just typed.” A librarian sees a word cloud and thinks “oh, they took the subdivisions from LCSH that relate to my broad term and made those into a word cloud.” Users don’t see relationships in the way that we do.

Pg. 17: paraphrasing here: the Working Group is not being academically rigorous in its research. They are not using the scholarship that already exists, and are reinventing the wheel when it comes to thinking about subject access. We should probably all read the reports that Mann has put out there in these pages.
Also, he makes the point that OCLC has been funding a lot of the research that results in the support of facetization, “whose own WorldCat cannot display either cross-references or browse-menus of precoordinated terms. Why…should the rest of us naively accept OCLC’s oversimplified software to begin with?” Why indeed.

Pg. 18: “the first responsibility of LC is to catalog its own—and the nation’s—unique copyright-deposit collection.” This is like page 11, but it bears repeating.

Pg. 20: “contrary to the widely touted mantra, facetization does not ‘make the data work harder’; it makes the user work harder…it is a stunning violation of the Principle of Least Effort in information-seeking behavior. ‘Least effort’ is supposed to refer to the level of work done by the user, not the catalogers.”
Wow. Just….wow. He’s right, he really is. Yes, he ignores the basic funding issues that all libraries have (although he does talk more about the cost of cataloging in other places, and makes good arguments against downsizing at LC), but he’s still right. Our job is not to make ourselves as lazy as possible about cataloging and foist all the effort onto the user. That’s actually supposed to be the opposite of what we do.

Pg. 21: “Anyone who has ever done a Google search knows that Google’s search mechanism exacerbate rather than solve…problems of information overload that are now created and aggravated by computer and web-environment retrievals.”
His argument is that LCSH avoids those problems. I agree, at least a little. LCSH is certainly better than Google, with the caveat that you have to learn about LCSH to use it, and with Google...you can get away with never learning about it at all.

Pg. 24: Accuse me of soundbites, I don’t care. This is gold: “it is undeniably true that the LCSH system is complex—but so is the literature of the entire world, on all subjects and in all languages and from all time periods, that it has to categorize, standardize, and inter-relate….the complexity of the world’s book literature is a rock-bottom reality that will not vanish simply because neither the Working Group nor LC management wishes to pay for professional catalogers.”

Pg. 34: He moves on to talk about reference work and the user, to great effect, I think: “Most researchers, when left to their own devices, are quite unsophisticated in doing computer searches…what [the user] prefers [keyword searching]…is based on a serious misunderstanding of what their “preferred” search technique is actually capable of delivering.” I think this is another case of librarians not being users, but some librarians not understanding that. The average user does not understand that a keyword search does not bring up everything. They don’t even understand the difference between a browse search and keyword search. You may think I’m kidding, but I’ve talked to enough college students to know that.

And the last sentence: “If the Library of Congress succeeds in dumbing down its own subject cataloging operations through this reorganization, there will be serious negative consequences for all American scholars who wish to pursue their topics comprehensively and at in-depth research levels, and for libraries in every Congressional District whose financial constraints make them more dependent than ever on the continued supply of quality subject cataloging from the Library of Congress.”

Friday, March 21, 2008

Creating meaning

I've been reading up on RDA, FRBR, and metadata more generally over the past week (I have such a cool job). Anyway, as I was reading, and reading, and reading, I saw some things that grabbed my attention.

A lot of people who talk about RDA (and FRBR) talk about how these new concepts and new shifts in understanding are going to help us create meaning for our users. Instead of cataloging in a vacuum, treating each piece as a separate island, we're going to be creating the connections between ideas and users and creators.

Now, shift over to TWO weeks ago, when I was trying to learn about our new big metadata project. I was talking to the project manager, and we were discussing how our group would assign subject headings and geographical placenames. The more we talked, the more I realized that the focus of this project does not lend itself to "traditional" ideas about assigning metadata.

In my other job as a cataloger, I might catalog a book about Nabokov, and then a book about English Victorians. These two things will have no relation to one another, and my job is not to try to find a connection (although in this particular example, what a great challenge!).

The thing is, in this metadata project, that is EXACTLY what they need. They need this map to be applicable to this book, or this book to remind a user about that manuscript. We're actively trying to create meaning for the user. Now, this is easy for us in this case, because everything pulled for the project is swirling around a central research topic. So it's not as if we're going to be using the entire LCSH in order to do this project. Instead, we're using just a small, interrelated fraction of that. So when I tell the other catalogers that we need to keep connections in mind, they totally get it, and it's easy.

RDA and FRBR have a great ideal in place, and I love it, but I think that RDA is missing something really central in their thought processes. Even if you use machines to pull a lot of this data, and we use publisher information, and we stop caring about grammar and punctuation, it is still a ridiculously high expectation to put on catalogers to "create meaning" for the entire scope of human knowledge. I think it's daunting for us to be doing this for researchers in a relatively narrow application, because we're never going to understand what those researchers really want. Maybe it's time for librarians to adopt the archival perspective: We can't know what the user wants, so we give them the best we can give and they just have to figure out the rest. In that light, the ideals don't look quite so daunting.

Thursday, March 20, 2008

A Quote for the Day

Dort, wo man Bücher verbrennt, verbrennt man am Ende auch Menschen.--Heinrich Heine

("Where they burn books, they will ultimately also burn people.")

Historical note!
Heine was a Jewish cum Protestant Romantic poet living in Germany in the first half of the 19th century. The quote was actually in reference to the Spanish Inquisition, and the burning of the Qur'an.

Wednesday, March 19, 2008

RDA, FRBR, and other acronyms

I am SO GLAD that Karen Coyle gave her talk at Code4Lib on RDA. I myself have been asked to do a powerpoint presentation on RDA/FRBR for the cataloging department, and the points she’s raising are insanely useful (although also scary). She makes a good stab at talking about weaknesses and strengths of RDA without coming down on one side or the other.

Of course, in this blog, I don’t really feel like being unbiased. I will be for the powerpoint presentation, but not here! One of the useful things about being a nobody.
The thing that really gets me (and I commented on the FRBR blog about this), is that the RDA creators seem to be getting farther and farther away from what they said they would be doing, and that may force librarians to dislike RDA.

Some of the professed goals of RDA:

1. Create a more streamlined standard.
800 pages later, I’m questioning that one.

2. Hold true to the FRBR ideal.
They don’t use the attributes in FRBR to describe things in RDA. Why, I don’t know.

3. Make things easier for the user to find what they need, in the context of all knowledge.
RDA doesn’t address subject headings. And no one has ever heard of FRAD except the people on the RDA/FRBR/FRAD groups. And FRAD doesn’t do anything, anyway. It’s conceptual, just like FRBR.

4. Make the focus the content of the record, not the display of the record.
This is all well and good, but telling a cataloger not to standardize their records is like asking a fish not to swim. We’re trained this way! Taking the display rules out won’t automatically make us stop thinking about it.

5. Create a standard that archives, libraries, museums, and creators of digital materials can use.
No one except librarians is talking about RDA. I don’t see a lot of discussion (well, ANY discussion) from archivists or curators about RDA. Is this one of those “it’ll be for their own good” kind of initiatives? I think we all know how well MARC for archives turned out.

Monday, March 17, 2008

ALA elections

I'm pretty young to the ALA listservs, so I've never been around for the Presidential elections. But I have to say, the email blasts with the (clearly?) professional graphic design work....is this normal? Or is this new? Alire's is the funniest, to me: the one that's trying too hard. The one that came in today, Williams, is much more down to earth, although I do like how we still have the faded out "vote" behind her name, and the "campaign" color palette.
Is this really necessary (no)? Do I care that much (no)? Do the pretty colors make me want to vote for one person or another (okay, maybe...Alire's orange is pretty!)?

The thing is, when I look at their credentials, both of the candidates look very capable to me, although Alire does have an advantage to me, since she's college and research library focused, and Williams is from the school library front. But that's just personal preference, not anything else.

When I look at their advertisements, though....it makes me like neither of them. Am I a Luddite, or the anti-advertisement equivalent thereof?

Friday, March 14, 2008


I went on a field trip the other day. To the library's offsite storage facility. Now, I had seen pictures of these places before--robotic arms that gather materials that are so closely spaced no human being could ever get in.

Our storage facility is not that crazy, although it does have 40 ft. ceilings and can store about 1.3 million volumes. But the gathering is still done by actual, live people in a cherry picker. The staff there called it an order picker, but I've seen cherry pickers (my dad liked to switch out the engines between his pickup trucks) and that is what they use at the storage facility. Except that it's SUPER tall and instead of an engine attached, it's a person. And a booktruck that's 6 ft tall (it has a forklift attachment on the front to hold the booktruck and a platform to stand; it's not like you're just dangling off the thing like that Batman ride at Universal Studios).

Anyway, let me describe this place. The facility is nice, with landscaping outside, birch trees (which are not native to this area, but there they are) and bamboo plants. You can't get in the outer gate without a code, and you can't get into the building without a code. It's like they're storing gold, not old, underused books.

They duplicate the barcodes that are on the book, slap the new barcode on the outside, then sort the books by size, put them in little cardboard racks, and shelve them. Oh, and they vacuum the books first with this big industrial vacuum. It's pretty neat.

The room where they're stored is, like I said, 40 ft high and really long and big. Two big air conditioners and a desiccator run constantly to keep it at 50 degrees F, and 30% RH. Dry and cold, just like the books like it. The shelves go up 35 ft, and the order picker will also go that high, obviously.

They're at about 40% capacity, and keep both archival and library materials there. They do runs twice a day back to the library, to pick up books headed to the facility (250 per day or so) and drop off requests (30 per day or so). It's nice, because if you order something at the right time of day, you can literally just hang out for an hour and it will come to you. Automatic email notifications are sent out when the book is placed on hold at the circulation desk.

All in all, an extremely efficient system.

Thursday, March 13, 2008


I'm just going to lay this out there--sometimes I get bored cataloging. It's not always puppies and sunshine at my desk, especially right now, when I'm learning how to do the cataloging the way they do here. I mean, it's good practice, but dear me, I get tired of checking for correct spacing.

I'm much more interested in talking to people about metadata than actually creating it. My husband has suggested I actually have the soul of a reference librarian, in that I like being around people, talking to them and brainstorming. But I like systems so much! I protest. I could never be a reference librarian. Making subject guides just sounds like torture.
There is something comforting about cataloging, sometimes. Knowing that there is a set way to do things, that the semi-colon always comes before the 300$c field. It's like a warm blanket.

Metadata, on the other hand....is like a crazy game where the rules change all the time. I love it, though. I have a big, big meeting next week just to hash out a bunch of questions that I have come up with about this newest digital project. I have a feeling that the consistency required by catalogers will always be at loggerheads with the need for quick, fluid change that is often the rule in digitization projects. I think I'm supposed to be a translator. I'm like a daywalker, maybe.

And in the spirit of that, I'm apparently going to "teach" a "class" on how catalogers catalog, to scholars and techies and even archivists. Because none of the other people involved in this digitization project have a clue about what we do down here in the basement. So I'm writing down all these questions that I think need answered, like "what do people not know about cataloging?" Answer: EVERYTHING. Or, "Explain the difference between regular cataloging and this metadata process?" Answer: explain we're not really anal retentive--we're trying to standardize data inputting as much as possible. LCSH is just a big, big set of block letters on my notepad right now (as if I would forget about it...). It might also have a fairy castle growing out of the H. Maybe.

At any rate, being a translator has brought me a lot more joy in my job than semicolons ever did. Although the mighty semicolon certainly has its place.

Tuesday, March 11, 2008

Episode IV: A new hope

I've been trying to find some information on what kind of workflows are out there for metadata creation. Let me tell you, it is harder than you think. But I did stumble across a new piece of software that will hopefully be coming out soon: Rutgers Libraries' Workflow Management System (inventive title, no?). Code4Lib did an article about it a while ago, so it's not like super-new news, but I've noticed that not everyone and their mom reads Code4Lib. No offense to the C4L guys--you all seem quite awesome.

Anyway, Grace Agnew and Yang Yu wrote an article about it, and it's pretty interesting stuff. WMS is really just like Archon or Archivists' Toolkit, except that it's for anyone that's creating metadata, not just archives. I like that aspect very much, since here at our institution, most of the digitization projects are actually hybrid projects that use staff from archives, the library, and the digital people. I imagine it's probably at least a little more user-friendly than the archives software, mostly because of who created it. Archivists can be....not so user-centered sometimes. Again, no offense (I'm offending lots of people today!).

Thursday, March 06, 2008


The Interwebs is already starting to seethe with mentions of GLIMIR.

But first, you may ask, what is GLIMIR? Well, as I read it, it's a terribly horrible acronym for Global Library Manifestation Identifier (yeah, I don't know where that other I and the R come from, either....maybe we could throw some stuff in?....Global Library Irate Manifestation Identifier Roadshow?)

Basically, it takes the problem of manifestations (FRBR alert!), and addresses the issue that ISBNs are not manifestation identifiers. A good example of what that means was given by Mr. Stuart Weibel--there are lots of records in OCLC that have the same ISBN. But many of those records are not duplicate, redundant records. They're foreign language records for a work in English. So....in this case we're talking about the same work, but a different manifestation of that work (I may be using "work" in an improper form. Sorry in advance). I guess that in the beginning, a lot of people thought that ISBN would be a manifestation-identifier. Which would be very nice and comforting, since it helps to ground FRBR-thinking into current-cataloger-thinking, but it's not a 1-to-1.
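For the code-minded, here is a minimal Python sketch of that not-a-1-to-1 problem. Everything in it is made up for illustration: the record IDs, the ISBNs, the field names, and the two helper functions are hypothetical, and none of it reflects real OCLC data or any real API. All it shows is that grouping by ISBN can hand you back several distinct records, which is why an ISBN alone can't stand in as an identifier.

```python
from collections import defaultdict

# Hypothetical bibliographic records, invented for illustration only.
# Three of them carry the same ISBN but differ in cataloging language.
records = [
    {"record_id": "rec-001", "isbn": "9780000000001", "language": "eng"},
    {"record_id": "rec-002", "isbn": "9780000000001", "language": "fre"},
    {"record_id": "rec-003", "isbn": "9780000000001", "language": "ger"},
    {"record_id": "rec-004", "isbn": "9780000000002", "language": "eng"},
]


def group_by_isbn(records):
    """Map each ISBN to the list of record IDs that carry it."""
    groups = defaultdict(list)
    for record in records:
        groups[record["isbn"]].append(record["record_id"])
    return dict(groups)


def shared_isbns(records):
    """Return only the ISBNs that appear on more than one record."""
    return {
        isbn: ids
        for isbn, ids in group_by_isbn(records).items()
        if len(ids) > 1
    }


if __name__ == "__main__":
    for isbn, ids in shared_isbns(records).items():
        print(isbn, "appears on", len(ids), "records:", ", ".join(ids))
```

In this toy data, one ISBN fans out to three records, so anything that wants to point at exactly one of them needs some other identifier.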

So OCLC (in all their infinite wisdom), has graciously decided to solve this problem for us. Whether or not these GLIMIRs will be "business-neutral" is still up for debate. Honestly, I don't see why they wouldn't be....OCLC numbers (and ISBNs) are "free"--once one catalog outside OCLC has one in their record, you're perfectly welcome to use that number for whatever you like.

So, with that (really, really bad) introduction to GLIMIR, I give you a link list:

Stuart Weibel's GLIMIR Of the Future (good stuff, read the comments, too!)

The FRBR blog's Open Library developers’ meeting (just a mention)

FRBR definitions (why not? Manifestation!)

That's all I have for now....OCLC is not yet admitting publicly that it's launching a pilot project. But I do think it's fascinating that FRBR is basically infiltrating our organizational lives already--RDA is not more than a mere glimmer in our eyes (pun!), yet we're already ramping up for a FRBR-based approach to cataloging. In fact, it's kind of like the current recycling theory: it's easier to recycle when you don't have to think about it. It's easier to FRBR when you don't have to catalog it.

Wednesday, March 05, 2008

LCSH v. techies

There's a big digital project in the works here at The New Job. They're digitizing something like 400 works or pieces, and then some of us in the cataloging department are charged with creating the metadata. Not from scratch or anything, of course--the works are originally out of the archives here, so there's some basic metadata available. I've been meeting with people about this project a lot in the past few days, since I am the Metadata Librarian.

And I finally think I have a grasp on what it means to be the Metadata Librarian. My job is to make sure that the catalogers don't feel like they're selling their souls, and that the digital people don't feel like they're being nickel and dimed by the catalogers. Case in point: LCSH.

This new digital project is going to be pretty cool--two institutions working together to create a federated search portal that other libraries/archives will be able to use, as well, in the future, all under one umbrella. It's not the most groundbreaking piece of technology I've seen, but still. It's neat that they're doing it.

They did a pilot metadata creation thing a few weeks back, as I understand it. One cataloger told me that they were given 6 days (really four, since two of the days were a weekend) to create metadata on 35 records. No big deal, right? Wrong. Apparently the metadata includes LCSH. And let's not forget, all the catalogers here have their "real" jobs, where they do all the other cataloging that needs to be done.
So the catalogers are all in a tizzy because they think (perhaps rightly) that the digitization people just don't get how long it takes to do subject analysis, not to mention filling in the other blanks in the metadata record. Oh, and did I mention that the catalogers didn't have anything to look at while they cataloged? The digitization people didn't think that the catalogers needed to see any of the pieces in order to catalog. How does subject analysis get done when all you have is a title?

Now, on the other side of this, the digitization people (this includes the project manager) think that the catalogers are exaggerating how long it takes to do things, and that their time table is going to get screwed up if the catalogers keep insisting on needing more things and more time. I think that the digitization folks believed that the metadata and digitization would be done concurrently, or even that the metadata could be done BEFORE the items were digitized. This is of course possible...but only if, as one cataloger said to me, "we go upstairs with a notepad and catalog it by hand in front of the original."

I've already come up with several solutions in my head for this, and I think this is why they hired me. I like creating compromise. But that's not the "biggest" problem.
The biggest problem is that the digitization folks have now started messing with the LCSH field. They have started asking for non-LCSH terms to be used in that field. The catalogers are horrified, of course. I'm kind of horrified, too, but not because LCSH is so inviolate. More because the non-LCSH term they want is not a "subject" at all. It's a type of material. But I think I have an answer for that, too, if I can phrase it correctly. Then everybody wins. We have a meeting today; we'll see how it goes. Considering that I'm totally new, they might not even want me to speak at all. :)

Monday, March 03, 2008

There Ain't No School Like the Old School

So I've started using old-school Unicorn (i.e., Workflows). I mentioned before that it feels outdated. It's really easy to use, once someone has explained how to use it and what the words mean. It's kind of Windows-based...it reminds me of databases that we used in junior high, which makes sense, since Unicorn is a pretty old system. I'm currently learning how they copy catalog here, so the work is not terribly challenging (although it does give me a good chance to relearn my leader/directory/008 fields). Their system here is so streamlined...the vendor has a relationship with OCLC, so the copy catalogers' job is really to just check the cataloging that's already in the Sirsi system--there's no uploading by our library, unless OCLC has a poor record or no record to use at all.
So I've been spending time today using a light wand (I know!), and using the "public access" part of Unicorn, which is laughably old. The face that actual users today see is fine--it looks like any regular ILS front end. But the public access module in Workflows that the librarians use has those picture buttons, like an Athena system or something. The "reserve desk" has a picture of an apple, "search catalog" has a picture of a girl in 1993-era clothing studiously looking at books in a library. "Browsing" has a pair of binoculars floating free above the Earth (I assume that these are some kind of super spy satellite binoculars). For some reason, the "subject" search has a picture of the Space Shuttle launching. Don't ask, because I don't know. I could go on, but you get the idea.
Workflows is fairly customizable, even though I would never have imagined that to be the case when I first saw it. You can put in whatever menus you want. I also learned today that if you want to do an import, though, Workflows goes really, really 1993 on you. The first step is to tell it you want to import a certain file, and then you have to schedule the upload. I imagine that back in the day this was necessary so you could upload everything at 3am when no one was using the system. Of course now it's just silly, and makes the catalogers sigh.
I'm looking forward to using Java Workflows a little more...just to see what kind of changes they made to the system. It has to be better than old Workflows, if even just in the feel of it.
"Wicked people never have time for reading. It's one of the reasons for their wickedness." —Lemony Snicket, The Penultimate Peril.