Wednesday, April 30, 2008

Devil's Advocate

Ranganathan once wrote, of Charles Ammi Cutter's Rules for a Dictionary Catalog: "Rdc is indeed a classic. It is immortal. Its influence has been overpowering. It inhibits free-thinking even today." (Headings and Canons)

Ranganathan saw something in Cutter's rules that I think most people don't even see today: that we're still going in the same directions we've been going since Cutter wrote his rules, and we're not somehow coming to a new dawn of cataloging when we talk about FRBR or RDA. Now, lots of cataloging scholars also see this (William Denton's chapter in Understanding FRBR is a great example of this--his whole chapter is about how FRBR developed out of the past).

But I feel as though a lot of OTHER people just aren't seeing this relationship to the past. Many librarians are very worried about RDA, just as FRBR worried them 10 years ago. Why are they worried about RDA? Because it's being touted as this new, groundbreaking initiative that will change everything about cataloging as we know it.

Except that it won't.

Ranganathan's words still ring true. Cutter's rules, lo those many years ago, set us up for using card catalogs, author and subject indexing, and helping the user to find what they need. LC based most of their standards on Cutter's work, and the ALA based most of its work on LC standards, and let's face it, Cutter wouldn't find very much to be shocked about in the AACRII.

RDA likely won't be THAT shocking, either. I mean, to look at it from an outsider's perspective (say, the serials cataloging community), it's just more of the same, wrapped up in new terminology. Will the new terminology help us to catalog various formats better? I have no idea, although from looking at the vocabularies list that came out recently, I suspect it will not. They have a separate term for music publisher numbers. No term for other kinds of publisher numbers, though. Where is the overarching idealism in something like that?

I frequently worry that the library community has been stuck in a classification rut for over a hundred years, but that the reason no one is truly willing to step out on a limb and create something totally new is because it's just easier to do everything like we've been doing it**. As the amount of information increases, and we become more and more invested in one system, it gets harder and harder to scrap the concepts. When I even think about conceiving a new classification system, I usually end up (mentally) drawing away from the idea. Dr. Miksa once bemoaned the lack of true theoretical learning in the cataloging world, telling us that one big reason we would never change our systems is because we're no longer training anyone to do it. We're just training everyone in the same traditions, and letting everyone go along thinking that this is the only way to organize information. He mused that when someday a new format comes along that everyone wants and it doesn't fit into any of our organizational schemas, there will be no one to make the leap and conceive of a new system. We've been very resourceful so far, fitting the new formats into our old ways of doing things, by renaming or expanding or just tweaking the same concepts over and over. But it may not always be enough. And I feel as if I (and others, of course) have been led to believe that RDA is supposed to lead to some new, more open era of cataloging. But the more I think about it, the less I believe it.




**(this is not a new idea, by the way...somebody very recently was saying that the reason people don't want RDA is because it will lead to a change in our system requirements, and that change will cost money that people don't want to spend)

Tuesday, April 29, 2008

In Which I Ramble A Lot

I've been really busy these past few days....you know, cataloging everything. I've been cataloging a lot of German-language material, which you might find laughable once you hear that I don't actually speak German. Not very well, anyway. It's gotten to the point where I sigh in relief when something crosses my desk that is in French. FRENCH. So you know how desperate my situation has become.

Really, though, this is a great exercise in cataloging. It's very hard, but it forces me to understand the book in my hand thoroughly before attempting to do analysis of it. Luckily for these books, I've been cataloging in German for a while, ever since my first professional library job, in fact. Again, let's recap: I don't speak German except in cases of ordering Turkish kebabs at the Christmas market in Aachen. Since that went well, though, I figure I'm good.

So besides the cataloging, I've been simultaneously working on creating html versions of our cataloging manual, and continuing to sit in on the class that's teaching TEI, and working on that metadata project that at one point ate my entire professional life but has since calmed down a little.

I used to think library work was boring, can you imagine? Funny story: when I started college, "they" wanted me to work in the archives of the college, because I had worked in a state historical archives in high school. What "they" didn't know was that I spent my entire volunteer time alphabetizing request forms by patron name, and shelving microfilm. I actually said to my mother, "I would rather do dishes than work in a library. Libraries are like math--they make my head hurt."

But since I am a conflict-avoider, I went ahead and did what "they" told me to do, which was to work in the archives. And sooner or later I learned that only student workers shelve microfilm, and by my sophomore year I was actually the "senior" worker in the archives, because the archivist moved away. And then I realized that I really liked organizing things. The End.

Thursday, April 24, 2008

Conversation with a Serials Librarian

It went like this: I sent our serials librarian the link to the "FRBR for serials" paper (pdf!), which is apparently quite the rage at the CONSER operational meeting. She said, "oh, yeah, it's nice to see CONSER taking this on, since ever since FRBR came out, it hasn't addressed serials."
Um....?
And THEN she says "It took them about 40 years just to address serials and continuing resource questions in AACR! That's the whole reason CONSER exists."

The more we talked, the more we both came to the conclusion that FRBR is, as the serials librarian said, "a fancying up of old ideas." When I think of FRBR and I think of all the examples that FRBR has been used to describe, it's still based off the basic idea of one work that can be expressed in multiple expressions/manifestations/items. But it's still ONE WORK that was created by ONE person/group/corporation. Serials are just not like that, so it falls to CONSER to (yet again) make their own rules, and tweak concepts, just like they had to do when the library community ignored their needs back in the mid-century with AACR. I don't have a problem with FRBR being tied to traditional ideas of bibliographic control, but they should probably acknowledge that straight off.

This goes back to something I wrote about a while ago--that archives and museums will have no interest in RDA. And it's dawning on me as to why (besides the obvious one that RDA is kind of imperialistic). If RDA is based off the conceptual framework of FRBR, and FRBR starts everything off with the term "work," we're automatically shutting out everyone who is not part of the one-book-one-author universe. Archivists, curators, and serials librarians just don't respond to that kind of terminology. It doesn't really matter if you have "addressed" their needs--you're starting from the wrong place.

I think that our serials librarian is right, in a way. FRBR, even though it seems to be trying to philosophically embrace all kinds of information, just doesn't do that. Where do archival collections fit into FRBR? Serials? Pith helmets? They don't fit, and will never fit. And that's ok. Because I'm not saying that FRBR isn't an excellent way of conceptualizing the creation of creative works. It is! But if the concept doesn't even fit one of the largest "anomalies" in the library world (serials), then why are we using it as a base to build a new set of rules about cataloging?

Tuesday, April 22, 2008

Searchability

Interoperability is a buzzword. A really, really important buzzword (unlike, say, "paradigm"). And I feel like I'm banging my head against it. I know the old saying: "There's the right way, and then there's our way." I think that this applies to this institution's approach to digitization projects.
Now, don't get me wrong: this place is the most awesomely together place I've ever worked with regards to digitization projects. They have very clear projects and expectations, if perhaps not quite enough staff to go around. But that's a common problem everywhere, and we all know it.

But the more we talk about this new project, the less happy I am with the way we're communicating. The TEI initiative isn't meshing with the metadata initiative, and I feel like, while they're not exactly working at cross-purposes, they're certainly duplicating work and ultimately making things harder for a user. The TEI people have no concept of controlled vocabularies, and the metadata folks are certainly not going to give in to the natural language camp, and the more I think about it, the less I like the idea of one side doing their thing and the other side doing their thing, isolated.

So how do we get ourselves out of this predicament?

I'm teaching a class soon on the basics of cataloging for non-librarians. I'm hoping that this helps to clarify, for these natural-language people, just where we metadata folks are coming from in our need for controlling everything, and also how beneficial it can be to control terms and names and places. I think that many users never really understand how much controlled headings help them. Someone the other day asked me why we "bother" with controlling names or subjects, "when Google is right there and you can just let the software do that stuff for you." I think this person didn't really know what he was suggesting. The beauty of the controlled heading is that I can put in something like "Dostoevsky" and get all the OTHER versions of Dostoevsky's name as well. Or that I can type in "New Amsterdam" and get the references to New York City. These are things that people think "software" can do, but in reality, it can't. Someone still has to map these things out in order for the references to exist.
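For the non-librarians, here is roughly what "someone still has to map these things out" looks like in practice. This is only a minimal sketch in Python, not how any real catalog is built: the `AUTHORITY_FILE` entries, the function names, and the forms of the headings are all made up for illustration. The point is that the software only follows cross-references that a person has already written down.

```python
# Hypothetical, hand-built authority file: each authorized heading
# lists the variant forms that a cataloger has mapped to it.
AUTHORITY_FILE = {
    "Dostoyevsky, Fyodor, 1821-1881": [
        "Dostoevsky", "Dostoevskii", "Dostojewski",
    ],
    "New York (N.Y.)": [
        "New Amsterdam", "Nieuw Amsterdam", "New York City",
    ],
}


def normalize(term):
    """Lowercase the term and collapse runs of whitespace."""
    return " ".join(term.casefold().split())


def build_index(authority_file):
    """Map every known form (authorized or variant) to its heading."""
    index = {}
    for heading, variants in authority_file.items():
        for form in [heading, *variants]:
            index[normalize(form)] = heading
    return index


def expand_query(query, authority_file, index):
    """Return every form to search for; unknown terms pass through as-is."""
    heading = index.get(normalize(query))
    if heading is None:
        return [query]
    return [heading, *authority_file[heading]]


INDEX = build_index(AUTHORITY_FILE)
print(expand_query("new amsterdam", AUTHORITY_FILE, INDEX))
```

A term that nobody has mapped (try "Gotham") comes back unchanged, which is exactly the problem with leaving it all to "the software."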

So when the TEI people say "well, can't we just put 'Emperor Maximilian' and everyone will know what they're looking at?" I can honestly say "No--because what about the people who just write Maximillian, or the people who are looking for the emperor of the Holy Roman Empire, or the people who are looking for the 'emperor' of Mexico? Or the prince of Baden? Or Maximilian the saint?"

If there's an easy way to solve the problem of searchability...I can't wait to learn about it. But for now, we're going to have to settle for interoperability, and making our metadata and TEI mesh in very concrete ways. And we're not at that point yet, unfortunately.

Wednesday, April 16, 2008

Future trends, past traditions

The new paper put out by Richard Gartner this month in the JISC is....well, it's not saying anything that we don't already know. I've noticed that much of academic writing is just common sense stuff put down into words. If I could ever figure out how to do that, I would be a great and accomplished academic writer.
Anyway.
The paper is interesting; you can find it here (warning:pdf).
Basically, to use a metaphor (simile?), metadata schemas are like the parts of a car (simile!). METS is the frame, MODS is the engine, DC is the transmission, MIX is the mirrors....this simile is breaking down, a little, but you get the point. Gartner's idea is that once we figure out a way to bolt all the pieces together in a systemized way, we'll have a car and then everyone will be driving. So, once we finally, as a library community, decide to systematize all these different schemas and link them together and decide upon common access points, we'll have something akin to MARC and AACRII, where all the records can transfer to any library system, and everyone uses the same rules, and we can trade records and federate searches and everyone will eat ice cream every day and there will be puppies at every computer terminal.

The thing is, he's not that far from reality. I figure it really is just a matter of time before we come up with a standard for digital object description that uses pieces of MODS, or DC, or PREMIS, all within a METS wrapper. I mean, we're there NOW, we just haven't codified it yet. Who doesn't use those metadata schemas? It's the most organic kind of creation, without rules, yet we all follow a kind of Pirates' Code where we try to take into account all the traditions of library cataloging, use LCSH when possible, etc. (Also, if we could call this new code that will someday be created the Pirate Librarians' Code, I would be cool with that). There is a ton of tradition behind new metadata creation.
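To make "pieces of MODS within a METS wrapper" a little more concrete, here is a minimal sketch in Python using only the standard library. The element names and namespaces come from METS and MODS, but the function name, the `DMD1` identifier, and the sample values are my own placeholders, and the result is nowhere near a complete METS document (no file section, no PREMIS, no validation against the schema).

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("mods", MODS_NS)


def build_mets(title, creator):
    """Wrap a tiny MODS description inside a METS descriptive metadata section."""
    mets = ET.Element(f"{{{METS_NS}}}mets")

    # The "frame": a dmdSec whose mdWrap declares that MODS lives inside it.
    dmd = ET.SubElement(mets, f"{{{METS_NS}}}dmdSec", ID="DMD1")
    wrap = ET.SubElement(dmd, f"{{{METS_NS}}}mdWrap", MDTYPE="MODS")
    xml_data = ET.SubElement(wrap, f"{{{METS_NS}}}xmlData")

    # The "engine": the MODS description itself.
    mods = ET.SubElement(xml_data, f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title
    name = ET.SubElement(mods, f"{{{MODS_NS}}}name")
    ET.SubElement(name, f"{{{MODS_NS}}}namePart").text = creator

    # A bare structMap whose single div points back at the description.
    struct_map = ET.SubElement(mets, f"{{{METS_NS}}}structMap")
    ET.SubElement(struct_map, f"{{{METS_NS}}}div", DMDID="DMD1")
    return mets


record = build_mets("Sample digitized letter", "Doe, Jane")
print(ET.tostring(record, encoding="unicode"))
```

Swapping the MODS block for Dublin Core would mean changing `MDTYPE` and the contents of `xmlData`; the wrapper stays the same, which is more or less the whole car simile.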

RDA is certainly our first step towards a system of creating metadata content that is standardized, and will work with both paper and digital, and is not based on the idea of catalog cards. At least I hope that they ditch all those crazy rules that only come from the idea of having a finite amount of space to write. I'm sure they will, the RDA group seems pretty smart. Much like AACRII and MARC, though, someone is probably going to have to come along and write one of those books like "Cataloging with AACRII and MARC21", because the RDA people will not want to tie their content standard to anything concrete, and all the librarians will just be sitting there wondering how in the hell they translate their rules from physical to metadata to digital. And I imagine that in five years, my reference shelf will have a copy of RDA, and a copy of "Cataloging with RDA and MARC" and a copy of "Cataloging with RDA and MODS" or something.

Ooh, maybe I can write one. Then I will be a Great and Accomplished Academic Writer.

Tuesday, April 08, 2008

Where the MARC meets the XML

TEI (Text Encoding Initiative) is not new. At all. It was born in 1987, although it wasn't put into XML format until this century, I believe. Mostly, English nerds love it. It allows them to find patterns in literature that's been encoded, and differences across editions and versions. It warms their nerdy little hearts, and at the same time allows for the writing of more literary criticism than ever before possible. This also fuels the academic cataloging departments, of course, so I'm not complaining. Much.

Anyway, for this big project we're working on, we're taking scholars and having them do TEI markup on printed works and handwritten manuscripts of all kinds, and then everything will be searchable by keyword and etc etc. I think that tomorrow I'm going to write about TEI, and put in links and things, because I think that not enough librarians know a lot about TEI. But today, I'm going to talk about the relationship of TEI to MARC.

Yes. They have a relationship.

I noticed it right away while we were talking about the capabilities of TEI. The thing about it is--if encoded correctly, there's no need for a cataloging record. The TEI will have captured title, author, format, genre, extent, publisher information, year published, translators, as well as chapter and section titles. The search mechanisms then pull that information out directly from the digital document.
The caveat of course is that the document has to be digital. But think about it--you could easily have a catalog that pulls not only MARC records, but also TEI document information, and have both types of things in one catalog. There's not even really a need for a search portal--you could write a fairly simple program to pull information out of a TEI document and automatically generate a MARC record with it, and then import that record into your catalog. You could even do an 856 and link the whole thing together, and you could do it all with minimal effort on the part of the cataloger.
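Since I'm not a programmer, take the following as a rough sketch of the idea rather than a working converter. It's Python, standard library only, and it assumes a TEI header with the usual `titleStmt`, `extent`, and `publicationStmt` elements. The function names, the sample document, the example URL, and the plain-text output layout (tag, indicators with `#` for blank, `$`-prefixed subfields) are all my own assumptions. It skips ISBD punctuation, leader and fixed fields, and anything resembling subject analysis, and the field mapping itself is a simplification.

```python
import xml.etree.ElementTree as ET

TEI = {"tei": "http://www.tei-c.org/ns/1.0"}
FILE_DESC = "tei:teiHeader/tei:fileDesc/"

# Hypothetical TEI document, reduced to the header.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Sample Letters</title>
        <author>Doe, Jane</author>
      </titleStmt>
      <extent>12 leaves</extent>
      <publicationStmt>
        <publisher>Example Library</publisher>
        <date>2008</date>
      </publicationStmt>
      <sourceDesc><p>Hypothetical manuscript</p></sourceDesc>
    </fileDesc>
  </teiHeader>
</TEI>"""


def text_of(root, path):
    """Return the whitespace-normalized text at path, or None if absent."""
    node = root.find(path, TEI)
    if node is None or node.text is None:
        return None
    return " ".join(node.text.split())


def marc_line(tag, indicators, subfields):
    """Render one field as 'TAG II $aValue$bValue' ('#' means blank)."""
    body = "".join(f"${code}{value}" for code, value in subfields)
    return f"{tag} {indicators} {body}"


def tei_to_marc_fields(tei_xml, url=None):
    """Pull basic description out of a TEI header as MARC-like text lines."""
    root = ET.fromstring(tei_xml)
    title = text_of(root, FILE_DESC + "tei:titleStmt/tei:title")
    author = text_of(root, FILE_DESC + "tei:titleStmt/tei:author")
    publisher = text_of(root, FILE_DESC + "tei:publicationStmt/tei:publisher")
    date = text_of(root, FILE_DESC + "tei:publicationStmt/tei:date")
    extent = text_of(root, FILE_DESC + "tei:extent")

    lines = []
    if author:
        lines.append(marc_line("100", "1#", [("a", author)]))
    if title:
        # First indicator 1 only when there is a 100; assumes no leading article.
        lines.append(marc_line("245", "10" if author else "00", [("a", title)]))
    imprint = [(c, v) for c, v in (("b", publisher), ("c", date)) if v]
    if imprint:
        lines.append(marc_line("260", "##", imprint))
    if extent:
        lines.append(marc_line("300", "##", [("a", extent)]))
    if url:
        lines.append(marc_line("856", "40", [("u", url)]))
    return lines


for line in tei_to_marc_fields(SAMPLE_TEI, url="https://example.org/tei/sample.xml"):
    print(line)
```

A cataloger would still need to review what comes out (the TEI publisher may be the publisher of the electronic file rather than the original, for one thing), but the transcription is already done.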

If there were other librarians at my desk with me right now, they'd all be screaming about subject headings, and yeah, this model doesn't do a thing for subject headings, but we don't create subject headings for manuscripts, anyway, really. It's too hard. And for printed books--subject analysis is a heck of a lot less of a time commitment than doing an entire record from scratch.

And when I think about it, MODS and EAD are the same way. Terry Reese at Oregon State has written a conversion program for EAD to MARC21, and I know that MarcEdit is the perfect platform for such things...but it's also not terribly intuitive all the time, and it only deals with EAD. We're fast approaching a time when having programs for conversion of other XML formats will be really, really useful...Why hasn't a little program been written yet? It's times like these that I wish I were a programmer. Unfortunately for me (but fortunately for humanity), I am not.

I just feel like we've given up on MARC, as a profession, when in reality, it's still pretty useful for parsing information and making it searchable. And these other metadata schemas all still take the same stuff out of the original and put it into machine-readable format...why not use the structures we have in place (like our ILSes) and put them to work?

Textbooks and conne(x)ions

Ever since I had Kinkos print out a copy of the PREMIS data dictionary and bind it for me (for just $16!), I have been longing for a comparable piece of literature to come out for MODS and METS. I mean, didn't PREMIS win an award or something for putting their information in such a wonderful and useable format?

YES. THEY DID.

I know that all computer geeks think print is just sooo 1993, but seriously, it's very comforting to have a paper version of the PREMIS tags within arm's reach. And I don't even use PREMIS very much (for the record, I also have a bound copy of FRBR, but it kind of doesn't seem as cool as PREMIS). Imagine if we had neat, easy to print-off-and-bind copies of the METS terms, instead of the interminable webpages, and mouse clicking (by the way, if there is such a thing as a printable METS dictionary and I'm just too dumb to find it, please tell me so that I can go out and get it. Don't let me stay a fool).

Oh my God, am I getting old? Did I just say that a webpage was clunky?

Yes, of course I did! Traditional webpages ARE clunky. There's a reason that everyone's so excited about web 2.0 (and now 3.0), and it isn't because webpages are staying static and obtuse. It's because we have this new ability to make something GREAT with our technology. Not just a list of items, but something that's more dynamic and can grow and is intuitive to use.

Speaking of which, have you seen Connexions? It's a Rice University project that has turned into a huge success. Open source, online textbooks. Apparently some of the textbooks that have been created are being used as the national curriculum of Mongolia. The quick and dirty version of Connexions is that it takes too long to write a traditional textbook, and the sciences especially can't keep up with the new information by creating textbooks in the old way. So this guy from Rice (electrical engineer, maybe?) decided: why not create something like Wikipedia, but for full textbooks? You start by creating modules, smaller snippets of information (like, say, for a physics book, a module on Acceleration and then a module on Torque, and then a module on angular momentum, etc etc), and then you pick and choose which snippets you need for your textbook, select them, have the website generate a textbook for you, and then you ORDER IT PRINTED by an overnight printing house, which creates a real, honest-to-God hardbound set of textbooks for you for about $20 each.
Not $350 like you might expect for a physics textbook or a bio textbook. No. $20.

It's pretty cool.

Hmm....maybe we should start writing a metadata/cataloging textbook...with all the tags that we use, and all the rules we follow, and maybe entries on FRBR and AACRII....a library "textbook"? Intriguing.

Thursday, April 03, 2008

The Computer as a Communication Device

My predecessor left me a bunch of articles about all kinds of technology/library issues. That is cool, but since I don't know what he left me, I decided to make a spreadsheet of the articles. A catalog, if you will. Hee.

So, I'm going through the folders and what do I find, but J.C.R. Licklider's "The Computer as a Communication Device." (warning: it's a pdf) A veritable classic in our field, akin to Vannevar Bush's Memex machine.
So of course I read it (again). And was struck by the ending paragraphs (again). At the end of the article, Licklider paints this utopian computer world for us, where "life will be happier...communication will be more effective and productive...communication and interaction will be with programs and programmed models...and...there will be plenty of opportunity for everyone (who can afford a console) to find his calling, for the whole world of information...will be open to him."

Aside: I love that he puts the caveat of being wealthy in there, in order to benefit from this greatness.

Licklider was a little....eccentric. And, from the looks of it, a utopian. I find it very amusing that he assumes all information and all computing will always be for the higher ideal of creating and supporting learning. I find this especially amusing considering how much of the internet is useful only for wasting time.

But consider his ideal--it's beautiful, in its own way. Everyone learning, everyone making connections. Of course he's a little off in a lot of places...such as assuming that unemployment will disappear because there will be so much work in adapting network software to the new generations of computers (he never imagines that businesspeople will find a more efficient way of handling this problem). But still, the ideal of making information freely available, AND FINDABLE, is a really nice thought. Unfortunately for us, it still hasn't happened yet. Maybe it never will?

Then I look at the actual title of Licklider's article--The Computer as a Communication Device. He was right about that part.
"Wicked people never have time for reading. It's one of the reasons for their wickedness." —Lemony Snicket, The Penultimate Peril.