First Pass for the Food Words!

Word Fans, I have done it!  All the way from “cellar” to “tobacco-jar”, I have scanned for all the food words, common and uncommon, and entered them into the concordance.  I’m certain to have missed some, and I am humbly ready to call this my First Pass.  Alert Readers who put me wise to food words I have missed will have a verse written in their honor in the style of the Tra-la-la-lalley Elves.

Let it be noted that I have already had a good argument with myself over “supplies”, and have decided that it’s not a food word.  It is used in “food-supplies”, which is counted separately, and in all other instances it can just as well mean “bandages” as “food”.

Next I will make some lovely graphs of food words.  I’m interested in their frequency and location in the text; I also have an idea in the back of my mind to do a deeper analysis including a negative valence for those times that food words indicate a lack of food.

As I made this first pass, I also took the chance to improve my file of the text.  I’ve eliminated many of the phrase-breaks which left only one-word phrases, fussed with punctuation breaks, and started keeping an eye out for use or non-use of a marked subjunctive.

Concordance-reversed!

Great news!  On the About page, you can always find listed the tools which have been created for this project.  As of yesterday, Tech Support added Concordance-Reversed.py to the Digital Humanities Toolkit.

Concordance.py will take your text and strip out a set of Stop Words.  It’s what I used to strip out the Ten Thousand most common words from The Hobbit.

Concordance-reversed.py will strip out everything but a set of Go Words!  It’s what I’m using now to find specific lemmas, such as Fish, Fishes, Fish’s, Fishes’, Fished, Fishing.  Put all the different forms of your lemma into a text file of Go Words, and you’re on your way!
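For readers who like to see the idea in code, here is a minimal sketch of the two kinds of filtering described above.  It is not the actual Concordance.py or Concordance-reversed.py; the function names, the tokenizing rule, and the sample sentence are all assumptions made up for illustration.

```python
import re

# Hypothetical sketch only: these names are NOT taken from the project's scripts.
# A word is letters, optionally joined by hyphens or apostrophes,
# with an optional trailing apostrophe (for plural possessives).
WORD = re.compile(r"[A-Za-z]+(?:[-'][A-Za-z]+)*'?")


def normalize(text):
    """Lower-case the text and turn curly apostrophes into straight ones."""
    return text.replace("\u2019", "'").lower()


def tokenize(text):
    """Split the text into normalized word tokens."""
    return WORD.findall(normalize(text))


def strip_stop_words(text, stop_words):
    """Stop Words approach: keep every token that is NOT in the list."""
    stop = {normalize(w) for w in stop_words}
    return [t for t in tokenize(text) if t not in stop]


def keep_go_words(text, go_words):
    """Go Words approach: keep ONLY listed tokens, with their word positions."""
    go = {normalize(w) for w in go_words}
    return [(i, t) for i, t in enumerate(tokenize(text)) if t in go]


# Invented sample sentence, not a quotation from the book.
sample = "Gollum fished in the lake; fish was his food, and the fishes' eyes were pale."
go_words = ["Fish", "Fishes", "Fish's", "Fishes'", "Fished", "Fishing"]
stop_words = ["in", "the", "was", "his", "and", "were"]

print(keep_go_words(sample, go_words))
print(strip_stop_words(sample, stop_words))
```

In this sketch the Go Words would normally be read from a text file, one form per line.  Keeping the word position alongside each hit is one possible design choice; it would make it easier to look at where in the text a lemma turns up, as well as how often.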

I love you, Tech Support, from your grateful Mama.

Rescuing lovely uncommon forms of common words

In the beginning of this project, I needed to simplify our list and I bid farewell to such beauties as “shod” and “unbeknown”.  My current occupation, now that I am free to expand our concordance in any manner that is useful to us, is to find those delicious words and record them properly in our project.  I have begun with those which I noted in this blog as I had to wave them goodbye.

10K tag complete

It took a while, but all the Concordance entries so far have been tagged “10K” so that we can make some new entries of common words this weekend.  A good handful of the entries to date will also get the new “common” tag, of course, as they are common words spelled in a gollumesque way.

Lately

What’s happening lately in the project is a boatload of behind-the-scenes work.  I am not surprised (but still chagrined) to learn that repetitive tasks I could have completed in a day when I was 100% focused on this project take many, many weeks when I am returned to the workaday world of family and profession.

I have been proofreading, correcting entries in my spreadsheet, double-checking for words which fell through the cracks, searching for more onomatopoeia, and making judgement calls on a few more food words (like “dining-room”).  Today I am still marking the uncommon words with the 10K tag so that I’ll be ready to add more common words very soon.  Already the post of Feminine Pronouns is crying out to become the official entry on “She”.

Ever on!

Well, it’s been a week!  After the exhilarating presentation Monday night I had a couple of days of curling up in my hobbit-hole to do domestic chores and cuddle with my little one who is no longer small.  There were some difficulties with the audio of the recording, so we may re-record it in the near future.  If it’s possible for guests to listen in on that second edition, I’ll let you know.  Then the better recording will be made freely available, because that’s how Signum University rolls!

One of my goals is to find funding to continue this project at a professional pace – any advice is appreciated!

Meanwhile, I will proceed with small tasks, such as tagging the uncommon words with “10K” before I add any common words; I really would like to add “Luck” and “Chance” and “Fortune” and similar, which are within The Ten Thousand.  I’ll keep you apprised here on the Home page of what I’m up to!