rolanni | When databases walk the earth

Alert readers will recall that I live in Maine. In fact, that I live in that geographically squishy area known as Central Maine, about 3 hours from the Massachusetts state line and 2.5 hours from the Canadian border, if you’re feeling sanguine with regard to Coburn Gore.

Now, not only do I live in Maine, but I am a veritable giantess among Maine women, six foot tall in my striped sockfeet. This means that I need to either (1) make my own clothes, which I used to do when I was young and ambitious, but have not done for more than twenty-five years, (2) wear men's clothes, which I do pretty often, or (3) buy girl clothes from tall shops on the internet, which is what I do somewhat less often than (2).

One of my favorite vendors of tall women’s clothes is Long Tall Sally, a British chain with stores/distribution points in Massachusetts and in Canada. Mind you, I order from the internet, and make no secret of my US address.

Which is of course why two out of three orders that I make with Sally are fulfilled by the Canadian distributor. I wouldn’t mind this so much, except that the Canadian facility runs the credit card and I get whacked with a currency conversion fee.

And, also, on the rare occasions when I need to return something, the Sally folk in charge of issuing RMA numbers pretty much always insist that my original order of course did not come from Canada; that would be. . .silly.

Ahem.

I suppose there’s a database somewhere in Sally’s kingdom that figures out which shipping facility is closest to what customer and shuffles the orders that way, with a fine — let us even allow, a joyous — disregard for such human vanities as the borders between countries.

* * *

Speaking of databases — this post is about databases — I wonder if someone who is more savvy than I am can explain why it is that bookstores can’t seem to handle co-authors in their databases. It seems universal, from the Big River on down to Ma & Pa’s Bookstore and Pizza Emporium.

Steve, as the second author listed on our collaborative work, is constantly dropped from the database record. This is not only unfair and untrue, but it means that readers who may only recall that “Steve Miller” is one of the authors of those space opera books they like so much can’t find what they want to read.

Which seems a disservice to readers, authors, bookstores, and publishers.

In other words — it’s lose-lose.

And yet the error persists.

Is it Just Too Hard to build a database that will accommodate the reality of co-authors? Or do the builders of databases, being perhaps techies more than readers, not understand (and therefore not care about) the issue?

* * *

I have met folks who haven’t believed that co-authors do equal work on the projects that bear both of their names.

My co-workers on the newspaper many years ago, for instance, widely believed that my husband “let me put my name” on his books. To keep peace in the family, one assumes.

More recently, I submitted a novel by Sharon Lee and Steve Miller to the Maine Arts Commission, in application for a grant. After they had the application in hand, and despite my having asked beforehand and been told to go ahead, I was told that co-authored works were not acceptable. The official’s suggested solution was that I simply remove my co-author’s name from the application.

I don’t believe that, to this day, she understands why this was wrong, though, be assured, I did my Very Best to tell her.

So, it’s not at all outside the realm of possibility that those who are charged with building databases may be. . .misinformed regarding the necessity of listing all the authors of a particular work.

What I’d like to know, I guess, is how to get to these folks in order to educate them.

Ideas?

Originally published at Sharon Lee, Writer. You can comment here or there.

Current Mood: curious


From: (Anonymous)

You made me paranoid.
But I *do* know what you mean about the database thing. Having
been a test engineer for databases, I find it annoying that
the uploads often ditch the multiple authors. (Having come from
a science research background, I found multiple authors quite
common.)

Short of pitching a hissy fit, I don't know what you can do.

Love, love, LOVE the Snippets of George.
Lauretta@ConstellationBooks

PS Your blog-anniversary request: I've found the entries under
the writing life, writing life, and writing to be very interesting.

PPS Hope you're cooler than we are.

From:

martianmooncrab.livejournal.com

the joys of buying on the Internet... since Ireland doesn't have postal codes, it doesn't exist and the company won't ship to it, and certain companies don't ship to Scotland... I end up every now and then either getting a package I need to remail or I have to go on a shopping expedition and then mail the results.

Then there is the VAT that Customs likes to slap on any package coming from a business vs from an individual.

From:

sctechsorceress.livejournal.com

As a computer programmer for more than 30 years, I can answer that question. The reason that things like this happen is that no one ever thought of it. As simple as that. Now, in a deeper sense, the problem is about how software in general, and databases in particular are designed. There is a well-known, amply documented, technique for making sure that problems like this do not occur. However, it requires a lot of time, the services of highly trained technical analysts, and 'users' who are willing to explain the nature of their business to those analysts. In short, it is expensive.

Probably less expensive in the long term than writing up some sketchy specifications and giving them to a database designer who may or may not share a language with you (by which I don't necessarily mean overseas; some of us technical types are not what you'd call fluent in English!). But the money has to be spent all up front. So it doesn't happen.

From:

redpimpernel.livejournal.com

The database thing is all about fields, linking, and laziness or some other excuse to allow the design flaw.
In its simplest form, a single book is linked to a single author. You have a table with book attributes (name, ISBN, etc.), then you have an author table (name, address), and finally you have a table that connects authors to books (Book ID, Author ID).

When a book has multiple authors you have two choices. Either you have one record per book, with multiple author fields (which means you have to decide up front how many authors you want to accommodate, and 90% of the time the extra author fields are empty, wasting very valuable space and memory when you are talking about millions of books), or you allow multiple records for each book, one record per author (book 1, author 1; book 1, author 2; etc.), which allows for a flexible number of authors. This is obviously the better way to accommodate multiple authors. But it causes all kinds of headaches in other areas of the design, because now when you do a search query, instead of getting back one result per book, you might get many, so there is a lot more complex bookkeeping that needs to be done.

Mind you, this is not a hard task to overcome, and is in fact one of the common examples of how to handle multiple attributes for a single item in database classes.
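For the curious, here is a minimal sketch of the three-table layout described above, using Python's built-in sqlite3 module. The table and column names, the `position` column, and the placeholder titles are all assumptions made for illustration, not anything a real bookstore is known to use. The point is the bookkeeping step: the join returns one row per author, and a few lines of grouping turn that back into one entry per book.

```python
import sqlite3

def build_catalog():
    """Create an in-memory catalog with book, author, and linking tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE book   (book_id   INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE author (author_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
        -- One row per (book, author) pair; position keeps title-page order.
        CREATE TABLE book_author (
            book_id   INTEGER NOT NULL REFERENCES book(book_id),
            author_id INTEGER NOT NULL REFERENCES author(author_id),
            position  INTEGER NOT NULL,
            PRIMARY KEY (book_id, author_id)
        );
    """)
    # Placeholder titles; only the author names come from the post.
    conn.executemany("INSERT INTO book VALUES (?, ?)",
                     [(1, "Co-written Novel"), (2, "Solo Novel")])
    conn.executemany("INSERT INTO author VALUES (?, ?)",
                     [(1, "Sharon Lee"), (2, "Steve Miller"), (3, "Jane Austen")])
    conn.executemany("INSERT INTO book_author VALUES (?, ?, ?)",
                     [(1, 1, 1), (1, 2, 2), (2, 3, 1)])
    return conn

def books_with_authors(conn):
    """Return one (title, [authors]) entry per book, however many authors it has."""
    rows = conn.execute("""
        SELECT b.book_id, b.title, a.name
        FROM book b
        JOIN book_author ba ON ba.book_id = b.book_id
        JOIN author a       ON a.author_id = ba.author_id
        ORDER BY b.book_id, ba.position
    """)
    grouped = {}  # book_id -> (title, [author names]); dicts keep insertion order
    for book_id, title, name in rows:
        grouped.setdefault(book_id, (title, []))[1].append(name)
    return list(grouped.values())

catalog = build_catalog()
for title, authors in books_with_authors(catalog):
    print(title, "by", " and ".join(authors))
```

Grouping in application code rather than in SQL is just one option; it is used here because it keeps the author order explicit.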

That establishments the size of B&N or Amazon would let this be a problem is truly unacceptable. Lauretta is correct: in the scientific community, databases that handle scientific papers must be able to handle multiple authors, because that is the norm rather than the exception. For display purposes (which is actually the biggest nightmare with multiple authors) they usually cap at 2-3 authors before using et al. And display purposes may be the real reason that the online bookstores only show 1 author: space is at a premium. (I'm guessing.)
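The display cap mentioned above can be sketched in a few lines of Python. The function name `format_authors`, the default cap of 2, and the exact punctuation are assumptions for illustration; real catalogs follow their own style rules.

```python
def format_authors(names, cap=2):
    """Join author names for display, abbreviating once there are more than `cap`."""
    if not names:
        return ""
    if len(names) > cap:
        # Too many to show: keep the first `cap` names and abbreviate the rest.
        return ", ".join(names[:cap]) + ", et al."
    if len(names) == 1:
        return names[0]
    return ", ".join(names[:-1]) + " and " + names[-1]

print(format_authors(["Sharon Lee", "Steve Miller"]))
print(format_authors(["Author A", "Author B", "Author C", "Author D"]))
```

Even with a cap of 2, a two-author book still shows both names, so limited screen space alone would not explain dropping a second author.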

From:

adina-atl.livejournal.com

A many-to-many relationship between author and book (one author can have multiple books, one book can have multiple authors) is a common, doable, but non-trivial programming task. Instead of having a book table with an author field linking it to the author table, you now have to have a book table, an author table, and a book-author linking table. Then every display and every search query needs to take the linking table into account. And the truth is, many--even most--computer applications are built without the input of a database designer. Most databases are designed by programmers who have picked up the absolute basics of database usage on the fly, with no real theory or depth to it, and they use rote solutions even if they don't quite fit the problem. (I'm primarily a database programmer myself, and most database designs I see are enough to make me cry.) Mind you, Amazon in particular should do better, but I suspect that their original software was cobbled together by a bunch of enthusiastic amateurs--most dot coms were back then--and it's too much trouble to redesign from scratch now.
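To make the search side of this concrete, here is a small sketch comparing the two designs, again with Python's sqlite3 module. Every table name, column name, and title is invented for the example. The flat table has room for one author, so a search for the second author finds nothing; the linked design finds the book through the linking table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Flat design: one author column, so the second author has nowhere to go.
    CREATE TABLE flat_book (title TEXT, author TEXT);
    INSERT INTO flat_book VALUES ('Co-written Novel', 'Sharon Lee');

    -- Linked design: authors live in their own table, joined through book_author.
    CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book_author (book_id INTEGER, author_id INTEGER);
    INSERT INTO book VALUES (1, 'Co-written Novel');
    INSERT INTO author VALUES (1, 'Sharon Lee'), (2, 'Steve Miller');
    INSERT INTO book_author VALUES (1, 1), (1, 2);
""")

def titles_flat(name):
    """Search the single-author table by exact author name."""
    rows = conn.execute("SELECT title FROM flat_book WHERE author = ?", (name,))
    return [title for (title,) in rows]

def titles_linked(name):
    """Search through the linking table, so any listed author matches."""
    rows = conn.execute("""
        SELECT DISTINCT b.title
        FROM book b
        JOIN book_author ba ON ba.book_id = b.book_id
        JOIN author a       ON a.author_id = ba.author_id
        WHERE a.name = ?
        ORDER BY b.title
    """, (name,))
    return [title for (title,) in rows]

print(titles_flat("Steve Miller"))    # the flat table never stored him
print(titles_linked("Steve Miller"))  # the linking table finds the book
```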

The real mystery to me is the MOBI ebook format, which apparently has a single field for "Author name," resulting in the author field being completely useless for sorting purposes on my Kindle. When "Jane Austen" sorts after "James Tiptree" and nowhere near "Austen, Jane", you have a problem.
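The sorting complaint is easy to reproduce. The sketch below is plain Python and says nothing about how MOBI or the Kindle actually store names; `surname_first` is a hypothetical helper that naively treats the last word as the surname, which breaks on names like "Ursula K. Le Guin". That weakness is the usual argument for storing a separate sort name instead of deriving one.

```python
def surname_first(display_name):
    """Naive sort key: move the final word to the front as the surname."""
    given, _, surname = display_name.rpartition(" ")
    return f"{surname}, {given}" if given else surname

names = ["Jane Austen", "James Tiptree"]

# Sorting on the display string compares first names, so Austen lands second.
by_display = sorted(names)

# Sorting on a surname-first key puts Austen ahead of Tiptree.
by_surname = sorted(names, key=surname_first)

print(by_display)
print(by_surname)
```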

From:

schulman.livejournal.com

Amazon's system does handle multiple authors (although I note that on some, but not all books, their search results page only hyperlinks the first author through to an author listing -- I don't know why).

All that data comes in from somewhere, though, and if the data is bad at the source (the distributor, for instance, although I don't know where Amazon gets their data), it will be bad in all the automated systems that consume it.

In my experience, it's much, much easier to fix databases than it is to fix data.

From:

furballtiger.livejournal.com

IMHO this ought to have been hammered out in the emergence of an XML schema used by the publishing industry to exchange data (similar to catalog design; cf. the direction set by the LoC here in the US, which handles this issue fine). This is still emerging in some industries. That would then (eventually) drive the db designs that follow. Publishing is not my field, but a quick search revealed a flawed "best practice" example by a consultant that could only handle a single author (http://www.xfront.com/ExtensibleContentModels.html) and an example led by NIH for describing works that has both a single-author path and a group attribution option (http://dtd.nlm.nih.gov/book/tag-library/). This may be a better starting point: http://www.webxsystems.com/Journals_and_Books.htm . But this is the kind of thing industry groups are supposed to hammer out (and thus why everything takes longer than it seems it should at times). You'd be amazed how long it takes electrical engineers to agree on a simple connector! It only seems simple from the outside; in reality there are all sorts of vested interests, strategies to play, etc. You might ask Amazon or a big publisher about data interchange stds for books.

From: (Anonymous)

This is the difference between databases designed for businesses and databases designed by/for librarians (full disclosure: I am a systems librarian). Libraries are accustomed to multiple authors and the MARC format in use in most libraries handles that, although one author is considered the "main" author and others the "added" authors. Not perhaps the best, since both authors may be "main", but all the linking is there. And displays in online catalogs can give the main author and any added authors equal time. [The main author is determined by which one appears first on the title page]. Libraries are accustomed to many-to-many relationships. Plus there is no wastage of space, since if a field has no data, it is not there (with the exception of a fixed field). Business databases tend to have a fixed number of fixed-length fields. Everyone has seen long last names cut off because the field was not long enough, or complex addresses cut off on mailing labels.

Library databases deal with variable data all the time. One of my gripes about some of the newer metadata schema is that they do not deal as well with variable amounts of variable data and tend to oversimplify things. - From S. Card
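A very simplified sketch of the library approach, in Python. It borrows three real MARC 21 field tags (245 for the title statement, 100 for the main personal-name entry, 700 for an added personal-name entry) but ignores indicators, subfields, and everything else a real record carries. The list-of-pairs layout and the `all_authors` helper are assumptions for illustration only.

```python
# A record is a list of (tag, value) fields; a field with no data is simply absent.
co_written = [
    ("245", "Co-written Novel"),   # title statement (placeholder title)
    ("100", "Lee, Sharon"),        # main entry: first name on the title page
    ("700", "Miller, Steve"),      # added entry: repeat once per further author
]
solo = [
    ("245", "Solo Novel"),
    ("100", "Austen, Jane"),
]

def all_authors(record):
    """Return the main author followed by any added authors."""
    main = [value for tag, value in record if tag == "100"]
    added = [value for tag, value in record if tag == "700"]
    return main + added

print(all_authors(co_written))
print(all_authors(solo))
```

Because the added-entry field repeats, a record can carry any number of authors without reserving empty slots, and a display can list main and added authors side by side.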
