Custom word lists with text analysis

Post by Doranwen » Sat May 26, 2018 3:03 am

I've been pondering the best way to do something for a while and it occurred to me that eliciting the input of those of you on here might be a good idea. (And if this is in the wrong place, mods, I apologize in advance - I struggled to figure out where I should post this and finally decided this was the best option, since I'm open to a solution that's not limited to Mint.)

I use Mint as my main system at both home and work (though I do have some Windows VMs and keep around a Windows box or two for various programs and games that simply won't work via Wine), so I can handle a solution to this that involves software running on either system, though of course my preference is to use Mint. :)

I'm a teacher who's been developing a reading curriculum (the main part of it will be given away as it's designed to be used in areas with extremely limited resources, such as developing countries). I won't go into all of the principles of this one (as opposed to the majority that are out there - just note that there are elements of this one that make it different from anything else, hence the "reinvention of the wheel", so to speak), but the relevant details are that it involves many lessons, each of which has a wordlist associated with it. The texts for each lesson include only the words associated with that lesson, as well as the words from previous lessons. The most difficult part of all of it has been creating sentences and longer texts for each lesson. I have cumulative wordlists to work from (ones that include all the words up through the current lesson), though those only go through the lesson I'm currently working on. I do have wordlists for future lessons (they're subject to change but they're mostly good to work with as-is).

What's really frustrating me is that I can create texts for decoding practice (sounding out the words), but I have absolutely no way of easily analyzing an existing text to place it in the sequence. For instance, if I wanted to find out at what point a specific Emily Dickinson poem could be used, or a verse from the Bible, or a passage from The Adventures of Tom Sawyer, I currently would have to look at each word one by one and search the document of wordlists to figure out where it fits in - then note the latest lesson number of all the words used, and mark that passage for that lesson. I can do that with a short sentence, tedious as it is - but a poem or a passage would take ages! I am certain that there is a better method out there (it seems like something computers were born to do), but I don't know how to implement it.
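To make the manual procedure above concrete, here is a minimal sketch in Python of the same lookup: find each word's lesson, then take the latest lesson number among them. Everything in it is an assumption made for illustration, not part of any existing tool: the sample `LESSON_WORDS` data, the function names, and the simple tokenizing rule (lowercase letters, with internal apostrophes kept so that "don't" stays one word).

```python
import re

# Hypothetical sample data: lesson number -> words first introduced in that lesson.
LESSON_WORDS = {
    1: ["a", "cat", "sat"],
    2: ["on", "the", "mat"],
    3: ["and", "dog", "ran"],
}

def build_index(lesson_words):
    """Map each word to the earliest lesson that introduces it."""
    index = {}
    for lesson in sorted(lesson_words):
        for word in lesson_words[lesson]:
            # setdefault keeps the first (earliest) lesson if a word repeats.
            index.setdefault(word.lower(), lesson)
    return index

def tokenize(text):
    """Lowercase the text and extract words, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def earliest_usable_lesson(text, index):
    """Return the latest lesson number among the text's known words.

    Words missing from the index are ignored here; returns None if no
    word in the text is known at all.
    """
    lessons = [index[word] for word in tokenize(text) if word in index]
    return max(lessons) if lessons else None

index = build_index(LESSON_WORDS)
print(earliest_usable_lesson("The cat sat on the mat.", index))  # prints 2
```

A real version would presumably read the wordlists from files rather than a hard-coded dictionary, and would need a decision about how to treat names, numbers, and hyphenated words.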

A couple of possible solutions have occurred to me:

a) While I've tried searching, it may be that I didn't find the right combination of words, or know how to read the description of software features, and there may be a piece of software designed to do something like this. I think it rather unlikely, but there's always that chance. (Though if it costs hundreds of $$, it won't be helpful anyway - I don't have that kind of money, lol.) The closest I could find was PrimerPro by SIL - but it's designed for someone creating primers to read completely phonetic scripts in local languages around the world, and meant to focus on new characters being introduced. English, being much more complex with its orthography, doesn't really work there - and it's not designed to analyze texts and place them in a sequenced curriculum anyway.

b) Perhaps there is a website that would do something like this. Again, I've done a fair bit of searching and turned up nothing, but this doesn't mean that it couldn't exist...

c) I would have to create my own. Sadly, however, I have no programming skills or experience, and my experience with scripting has been only reading and trying to figure out why someone else's failed (I did manage that, but it took ages, and my solution was such a rough hack that I wouldn't give myself any experience points with scripting!).

If anyone has any ideas, either fleshing out one of my possibilities above or something else, I'd love to know.

Note that I don't actually have the very last word lists complete - at this point I'm still finding words and going "oh, that one doesn't have a place, it doesn't fit in the earlier lessons, so it has to go in this level" and I'm dumping all of them into a level I haven't created specific lessons for yet - but that's OK, they can be just treated as one gigantic lesson. If I were to take any given passage, though, I'm sure there'd be words here and there I haven't placed, so it'd be good if whatever solution I used would actually tell me "this word's not in the lists!" so I could figure out where it belonged.
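The "this word's not in the lists!" check described above could look something like the following rough Python sketch. The data, the function name `unplaced_words`, and the use of lesson number 99 to stand in for the big catch-all level are all hypothetical choices made for the example.

```python
import re

# Hypothetical data: lesson number -> words. Lesson 99 stands in for the
# catch-all level whose words have not been split into lessons yet, so it
# is treated as one gigantic lesson.
LESSON_WORDS = {
    1: ["a", "cat", "sat"],
    2: ["on", "the", "mat"],
    99: ["because", "through"],
}

def unplaced_words(text, lesson_words):
    """Return the sorted, de-duplicated words in text that appear in no list."""
    known = {word.lower() for words in lesson_words.values() for word in words}
    # Same simple tokenizing rule: lowercase letters, internal apostrophes kept.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return sorted(set(tokens) - known)

sample = "The cat sat on the mat because it was tired."
for word in unplaced_words(sample, LESSON_WORDS):
    print(f"this word's not in the lists: {word}")
```

With the sample data, the words reported are "it", "tired", and "was"; each could then be assigned to a lesson by hand and the check run again.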

Edit: So, apparently the best way to find a solution is to post asking for help, and then it'll turn up. I tried one more search (and I'd tried several before!) and suddenly this time I've run across a piece of software that even works in Linux (as well as Windows and Mac)--and it seems to have a "vocabulary level" thing going on, where one can specify the levels. I might actually be able to make this one work! The software in question is AntWordProfiler, found here: ... dprofiler/

While I haven't had a chance to fully explore it yet, it does look like it'll do what I was wanting. What a relief!
