
Trigrams 'n' Tags

Date: 14-Oct-2008/9:49+3:00


Characters: teacher, me

I was sitting at a desk in the corner of a classroom, listening to music on headphones. For some reason I didn't think I had the same obligations as other people there.
Note: This assumption could be, of course, because I don't have academic responsibilities right now and haven't for many years...so it's just off my radar.
Someone who I assumed was teaching the class came over to me, and handed me what looked like a dollar bill.
teacher: "I'm giving you a raise."
me: "Uhh... a dollar? Gee. Thanks."
I couldn't read it and the paper looked discolored.
me: "Are you joking? I mean, did you run this through the washer and find it in your pants pocket or something?"
teacher: "No, it is colored with blood and filth."
Not really believing her, I looked closer and realized it wasn't a dollar bill but rather a check for approximately a dollar. The printing on the background was wavy and colored, perhaps causing my earlier impression that it was a dollar run through a washer. Looking at the date, it said it was good from some month in the year 1000 to some month in the year 1010. The issuer was some company with an old-timey name that started with a P.
me: (excited) "Hey, this expired in the year 1010...maybe it's worth something historically after all!"
Many things about it started not adding up. The first doubt I had was why there'd be a 10-year window for cashing a check. I also became suspicious that modern English spellings, paper, and other such things would not have been able to survive from that era through 2008. This line of questioning led me into lucidity, and I wanted to read more of what was around me.
I reached into my pocket and produced a report, which analyzed some essay. It seemed to be printed on something like a pay stub you'd receive from an employer. Blue with a lot of courier fonts and sections.
It had statistical information about the number of words, and a lot of metric data. There was also a section analyzing what plotlines it was similar to, which listed shows like Buffy the Vampire Slayer. Most of it was numbers and lists, but there was a paragraph analyzing the writing which started something like:
Funny. Very funny...but at the same time, not mind-bendingly funny.
Scanning around for any other name, I saw that the company that produced the report seemed to be called TNTrealTNT.text.TNTrealTNT, and I thought about how screwed up the domain name system must be if we end up using things like that.
I'm going to call this one a hit, because there is an NLP text tagging tool called "TnT", which stands for "Trigrams'n'Tags":
TnT, the short form of Trigrams'n'Tags, is a very efficient statistical part-of-speech tagger that is trainable on different languages and virtually any tagset. The component for parameter generation trains on tagged corpora. The system incorporates several methods of smoothing and of handling unknown words.
It's an unusually topical match, considering I'd never heard of it!
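For the curious, here is a rough sketch of the general idea behind a trigram-based tagger: count tag sequences in a tagged corpus, then smooth the trigram estimate by mixing in bigram and unigram estimates. This is only an illustration and not TnT's actual code. The function names, the toy corpus, and the fixed interpolation weights are all my own assumptions (a real tagger would estimate the weights from data and would also model the words themselves).

```python
from collections import Counter

START = "<s>"  # hypothetical padding tag marking the start of a sentence


def train_tag_ngrams(tagged_sentences):
    """Count tag unigrams, bigrams, trigrams, and their contexts."""
    counts = {name: Counter() for name in ("uni", "bi", "tri", "ctx1", "ctx2")}
    for sentence in tagged_sentences:
        tags = [START, START] + [tag for _word, tag in sentence]
        for t1, t2, t3 in zip(tags, tags[1:], tags[2:]):
            counts["uni"][t3] += 1
            counts["bi"][(t2, t3)] += 1
            counts["tri"][(t1, t2, t3)] += 1
            counts["ctx1"][t2] += 1        # how often t2 was the previous tag
            counts["ctx2"][(t1, t2)] += 1  # how often (t1, t2) preceded a tag
    return counts


def smoothed_tag_prob(counts, t1, t2, t3, weights=(0.2, 0.3, 0.5)):
    """Estimate P(t3 | t1, t2) by linear interpolation.

    The weights are fixed, assumed values chosen for illustration only.
    """
    w_uni, w_bi, w_tri = weights
    total = sum(counts["uni"].values())
    p_uni = counts["uni"][t3] / total if total else 0.0
    ctx1 = counts["ctx1"][t2]
    p_bi = counts["bi"][(t2, t3)] / ctx1 if ctx1 else 0.0
    ctx2 = counts["ctx2"][(t1, t2)]
    p_tri = counts["tri"][(t1, t2, t3)] / ctx2 if ctx2 else 0.0
    return w_uni * p_uni + w_bi * p_bi + w_tri * p_tri


# Toy corpus of (word, tag) pairs, made up for this example.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]
counts = train_tag_ngrams(corpus)
print(smoothed_tag_prob(counts, "DET", "NOUN", "VERB"))
print(smoothed_tag_prob(counts, "DET", "NOUN", "DET"))
```

The point of the mixing is that a tag sequence never seen in training (like the second call) still gets a small nonzero probability from the unigram term, instead of being ruled out entirely.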
copy write %C:/0304-1020 {Met^(00C6)ducation}

The accounts written here are as true as I can manage. While the words are my own, they are not independent creative works of fiction —in any intentional way. Thus I do not consider the material to be protected by anything, other than that you'd have to be crazy to want to try and use it for genuine purposes (much less disingenuous ones!) But who's to say?