[personal profile] rolanni

Edited to Add:  I'm seeing people volunteering (thank you!) and asking questions.  I also see that we may have a list already to work from.  I'm going to be in and out for the rest of the day; I'll try to keep up with the question-answering part and get back to the larger discussion tomorrow morning. So -- not ignoring you if I don't answer right away.  Thanks again!

OK.  It seems to me, from perusing the previous discussions, that we have lots of volunteers for Harvesting, but not so many folks want to Wrangle.  No blame to any of us — you see me saying I got no time to Wrangle Weird Words, right? — it’s a massive job.

Which means that we will need to go, with Great Trepidation on the part of the Luddite Writer, to a software sort.  We had several volunteers for this — if whoever is still interested in playing would shout out again, in response to this message, we can all put our heads together to see how best to split up the books.

I will still need at least one, and preferably three Wranglers, because, let’s face it, that’s a BIG pile o’novels over there, and some — I’m looking at you, Crystal books — are harder going than others.

The Wish List now looks like this:

1.  Automagicians

2.  Wranglers

3.  Cabana boy

The goal, one more time:  A list of Weird Words (including all “foreign” words, be they Liaden, Terran, Delgadan, Vandese, or etc.), and  Names (including ship names, planet names, city names, personal names) for each book.  One book = One list. In the order the words appear.

Someone had asked if I also wanted odd combos, such as “brother-cousin” or “close-kin” or “Silain-luthia”.  Of the examples given, the only pairing I would want would be “Silain-luthia” because Silain has the possibility of becoming I-dare-not-guess and luthia — is an invented word.

Why do I want this?  You guys have been so good about putting up with my fidgeting and fussing over this, you deserve the straight dope.

I want these lists, and in this particular format, for two reasons.

Reason One:  The Liaden Pronunciation Guide Steve and I have been talking about Forever.

Reason Two:   A bunch of Liaden books are going to be produced as audiobooks, RSN.  We have promised — actually, given what happened with poor Mr. Shanks — we have insisted that we will provide pronunciation assistance.  It is Reason Two that produces the deadline.

Worries:  I am particularly concerned that names stay together — Val Con yos’Phelium, Shan yos’Galan and etc.  This is the major need for the Word Wranglers.  I’m not so worried about bizarre English words sneaking into the list, because, if they’re that bizarre, they belong on the list.

. . .I think that’s it.

OK — who’s in?




Originally published at Sharon Lee, Writer. You can comment here or there.

Date: 2012-04-12 06:04 pm (UTC)
From: [identity profile] jessie-c.livejournal.com
May I volunteer to wrangle? I presume volunteers would send their lists of words to the wranglers, who would then weed out the duplicates and forward them on to you?

Date: 2012-04-12 06:12 pm (UTC)
From: [identity profile] sb-moof.livejournal.com
I am in. I would perhaps be willing to take on Wrangler duties, depending on timing. Was there an answer to the "list just the page of first occurrence vs. list every page of occurrence" question that was raised in the previous discussion? IMHO, the latter would make the process significantly more difficult, unless PDFs of the books as printed were available for electronic searching.

Date: 2012-04-12 07:38 pm (UTC)
From: [identity profile] rolanni.livejournal.com
The words ought to be in the order of first occurrence. The every page provision was to make sure we covered all the pages -- not an issue, I think (and someone will correct me if I'm wrong) -- in the case of a software sort.
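For anyone sketching the software sort, here is one minimal way to get "each word once, in order of first occurrence" out of a plain-text book. This is only an illustration, not a description of any volunteer's actual tool. The function name, the word pattern, and the choice to ignore capitalization when spotting repeats are all assumptions.

```python
import re

# Assumption: a "word" is a run of letters that may contain internal
# apostrophes or hyphens, so yos'Phelium and Silain-luthia stay whole.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’\-][^\W\d_]+)*")


def words_in_first_occurrence_order(text):
    """Return each distinct word once, in the order it first appears."""
    seen = set()
    ordered = []
    for match in WORD_RE.finditer(text):
        word = match.group(0)
        key = word.lower()  # assumption: case differences are the same entry
        if key not in seen:
            seen.add(key)
            ordered.append(word)  # keep the spelling of the first occurrence
    return ordered
```

Because the whole text is scanned front to back, every page is covered automatically, which is why the "list every page" safeguard should not be needed here.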

Automagician

Date: 2012-04-12 06:16 pm (UTC)
From: [identity profile] ebartley.livejournal.com
I have a list which is currently one big file, divided by book. (It's trivial to split it into separate files for separate books.)

I also have a list of all words, sorted alphabetically, which I've used to derive a list of words to be suppressed -- things that are utterly typical in-genre (e.g. hyperdrive, empath), clear typos (e.g. imaginatiion), apparent dialect (e.g. checkin', cutesey, damnfool), apparent omissions from my wordlist (e.g. fatcats, flowerbed), sound-representations (e.g. ahhh, fwuummps), or proper nouns not within the story proper (e.g. baen, bujold.) Given what you say here, the list probably isn't inclusive enough; I should probably have included all compound English words (e.g. betold, bloodprice, blueglow.)

Do you want me to make a second go-round adding to the words to be considered English? Or email you the list of additional English words? Or email you the list of all words and you can make your own selections from there? Or, heck, email you the whole shebang so that if a bus hits me tomorrow you have what I've done to date?
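The suppression-list idea described above can be sketched roughly as follows. This is a guess at the general shape, not ebartley's actual code: the function name and all three inputs (the ordered word list, the English wordlist, and the suppression list) are hypothetical.

```python
def candidate_weird_words(ordered_words, dictionary_words, suppressed_words):
    """Keep words that are in neither the dictionary nor the suppression list.

    The input order is preserved, so a list already in first-occurrence
    order stays in first-occurrence order.
    """
    # Assumption: matching ignores capitalization.
    dictionary = {w.lower() for w in dictionary_words}
    suppressed = {w.lower() for w in suppressed_words}
    return [
        w for w in ordered_words
        if w.lower() not in dictionary and w.lower() not in suppressed
    ]
```

Adding more compound English words to the suppression list only shrinks the output, so a second go-round can reuse the same harvested list without re-reading the books.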

Date: 2012-04-12 06:18 pm (UTC)
From: [identity profile] silverdragonma.livejournal.com
I also can be a wrangler. I have tons of experience with databases and could set you up with a nice one.

In!

Date: 2012-04-12 08:45 pm (UTC)
From: [identity profile] kat ayers mannix (from livejournal.com)
I am in to WordHarvest- have Local Custom, Balance of Trade and (arrived as of yesterday) the Partners in Necessity omnibus (encompassing Agent of Change/Conflict of Honors/Carpe Diem) in hardcopy. Just let me know what I can do. I would also be available to Wrangle, I am an excellent proofreader. Oh and a sigh of dismay- found a spelling typo TWICE (desparation not desperation) in the description on the back cover of the Partners in Necessity tpb/qpb. Thought you might like to know. Also would like to know- will these words be set up in a database with cells to denote the word's occurrence in the printed book/page/paragraph location?

Date: 2012-04-12 09:03 pm (UTC)
From: [identity profile] eoma-p.livejournal.com
Oops! Didn't mean to NOT volunteer to wrangle. Please put me to work wherever I'm most useful. It sounds like ebartley has already made a huge start.

Automagician

Date: 2012-04-13 12:31 am (UTC)
From: [identity profile] johnhawkinson.livejournal.com
"Which means that we will need to go, with Great Trepidation on the part of the Luddite Writer, to a software sort. We had several volunteers for this — if whoever is still interested in playing would shout out again."

Again, I'm happy to do this. I can trivially produce a list of non-dictionary words for all books that I have electronic versions of, and the number of books does not appreciably increase my time to do so, which is minutes. (I sent Sharon a list of books that I have handy in text form via email).

It looks like we've added a new requirement, for special words that appear in close proximity to each other, e.g. proper names like Val Con yos'Phelium. That's doable, but a bit trickier. I think I'd just generate the first list of such words and then find all instances where they appear adjacent to each other, and spit them out for manual review.
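One possible shape for the adjacency pass described above, offered as a sketch under assumptions rather than the commenter's actual script: given the text and an already-harvested set of special words, collect every run of two or more special words that sit next to each other, so a Wrangler can review them as candidate full names. The tokenizer pattern and the function name are hypothetical.

```python
import re

# Assumption: words may contain internal apostrophes or hyphens; any other
# non-space character becomes its own token, so punctuation breaks a run.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:['’\-][^\W\d_]+)*|\S")


def adjacent_special_runs(text, special_words):
    """Return distinct runs of 2+ adjacent special words, in first-seen order."""
    special = {w.lower() for w in special_words}
    runs, seen, current = [], set(), []

    def flush():
        # Record the run collected so far if it has at least two words.
        if len(current) >= 2:
            phrase = " ".join(current)
            if phrase not in seen:
                seen.add(phrase)
                runs.append(phrase)
        current.clear()

    for token in TOKEN_RE.findall(text):
        if token.lower() in special:
            current.append(token)
        else:
            flush()
    flush()  # catch a run that ends at the end of the text
    return runs
```

The output is meant for manual review, not as a final answer: two unrelated special words can land side by side, and a name split by an ordinary word will not be joined.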

Date: 2012-04-13 02:27 am (UTC)
From: [identity profile] redpimpernel.livejournal.com
I'll throw my hat in the ring, as a standby wrangler or automagician. I say standby, because enough people may have already volunteered. Also, I'm not retired, so I'm not available 24/7.

I've got database design experience (Access). Assuming ebartley hasn't already finished the whole kaboodle. =)

Date: 2012-04-13 03:02 am (UTC)
From: [identity profile] ebartley.livejournal.com
I haven't been working on anything more complicated than tab-delimited text files ... but I was more thinking wiki than database, if we do anything.

Date: 2012-04-13 04:05 am (UTC)
From: [identity profile] zola.livejournal.com
I can still get you set up as described in my original post :) The words would NOT get broken up, they would go in as the wrangler added them.

Pronunciation Guide

Date: 2012-04-13 07:03 am (UTC)
From: [identity profile] catherine ives (from livejournal.com)
I assume from the mention of poor Mr. Shanks that he messed up the pronunciation of the Liaden words on the audio book. But...how would anyone know that but you and Steve? Wouldn't you have to provide somewhere a recording of the correct pronunciation of all the Liaden names and words? Unless you write the words down on a list in the phonetic system that linguists use. A recording would imo be better.
Anyway, the whole discussion is way over my head.

As always I am in awe of your fabulous commenters and their incredible expertise. I obviously had a very misspent youth and the rest of my life too where technology is concerned.

Re: Pronunciation Guide

Date: 2012-04-13 11:47 pm (UTC)
From: [identity profile] rolanni.livejournal.com
But...how would anyone know that but you and Steve.

That Steve and I know...is enough.

Volunteering

Date: 2012-04-13 05:23 pm (UTC)
From: [identity profile] capricchio.livejournal.com
Did you decide you needed the words pulled from the hard copy version or just the order they first appear? If it can be done in the electronic copy, that's the fastest way. I say this because I don't mind gathering the words from the Crystal Variations (Golly, an excuse to read them again!) but I can see losing track of the words and starting to read the story and forgetting the words. :)
