This post is not so much about Legend of the Galactic Heroes as about my questionable approach to tackling Japanese reading material. It assumes familiarity with the flashcard software Anki.
A couple of years back, when I was still fairly new to this, I would very often hear that you should read, read, and read some more to really push your Japanese ability further. This is still commonly given advice in Japanese learning circles. I think anyone who’s seriously tried to do this has at some point faced the frustration of having to look up too many words. So what I eventually started to do was pre-learn the vocabulary contained in things I was interested in, to reduce the dictionary look-up burden.
I’ll be the first to admit that at this point in my Japanese study, my time would be much better spent actually reading things instead of preparing to read things. But it’s hard for me to resist spending a few days grinding through my Anki card creation process so I have a steady flow of new vocabulary to learn and reinforce through Anki. I still think there is some value to this, especially if that vocabulary directly helps me read books I am interested in. Currently I go through just 7 new cards a day, down from the 20 I used to do (otherwise I run out of new cards too fast, and my Anki load grows larger than I’d prefer).
I find it somewhat interesting to observe the new vocabulary counts between subsequent volumes in a novel series. Before I get to those, I have to briefly explain my card creation process (which perhaps deserves its own post some other time). It basically goes like this:
1. Get a digitized version of some novel.
2. Run it through my scripts, which do the following:
   - Parse through my existing Anki database to determine the set of already known words.
   - Parse through a separate text file containing a list of words to ignore, and add those to the set of already known words.
   - Parse through the digitized version of the novel, and spit out sentences with unknown words highlighted. Sentences containing only known words are ignored.
3. Take the output of the script and import the automatically generated flashcards into Anki.
4. Go through all the generated cards and adjust the highlighting on the sentence to include more context if necessary. If I deem a word unnecessary to review through Anki, I add the word to my list of words to ignore and delete the card. Typically these are loan words or words whose meaning I feel can be easily determined from their component kanji.
An example card that tests for 諧調 from Vol. 6 of Legend of the Galactic Heroes
Much of this process used to be manual. Over the years more and more of it became automated by the scripts I developed for personal use. However, there is no reasonable way for me to automate step 4. After step 4 I typically end up with about 40%-45% of the cards that the scripts originally generated. Note that as a consequence of this process, words flagged as unknown in earlier volumes will not be flagged again in subsequent volumes, provided the volumes are processed in order.
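For the curious, here is a minimal sketch of what the script side of this (step 2) could look like. It is not my actual code: the names `split_sentences`, `make_cards`, and `toy_tokenizer` are made up for illustration, the known-word set is a plain Python set rather than something read from an Anki database, the `<b>` highlighting is an assumption, and the tokenizer is a toy longest-match stand-in where a real script would use a proper morphological analyzer such as MeCab. The sample sentences are invented too.

```python
import re
from typing import Callable, Iterator


def split_sentences(text: str) -> list[str]:
    """Split Japanese text into sentences, keeping the closing punctuation."""
    parts = re.findall(r"[^。！？\n]+[。！？]?", text)
    return [p.strip() for p in parts if p.strip()]


def toy_tokenizer(lexicon: set[str]) -> Callable[[str], list[str]]:
    """Greedy longest-match tokenizer over a tiny lexicon (illustration only)."""
    longest = max(map(len, lexicon), default=0)

    def tokenize(sentence: str) -> list[str]:
        words, i = [], 0
        while i < len(sentence):
            for size in range(min(longest, len(sentence) - i), 0, -1):
                chunk = sentence[i:i + size]
                if chunk in lexicon:
                    words.append(chunk)
                    i += size
                    break
            else:
                i += 1  # character is not part of any lexicon word; skip it
        return words

    return tokenize


def make_cards(
    text: str, known: set[str], tokenize: Callable[[str], list[str]]
) -> Iterator[tuple[str, str]]:
    """Yield (word, highlighted sentence) for each unknown word.

    `known` is updated in place, so a word flagged once is never flagged again.
    Sentences containing only known words produce no cards.
    """
    for sentence in split_sentences(text):
        for word in tokenize(sentence):
            if word in known:
                continue
            known.add(word)
            yield word, sentence.replace(word, f"<b>{word}</b>")


# Hypothetical inputs: words already in Anki plus the ignore list.
anki_words = {"銀河"}
ignore_list = {"艦隊"}
known = anki_words | ignore_list

tokenize = toy_tokenizer({"艦隊", "提督", "諧調", "銀河"})

# Process "volumes" in order, sharing the same known set.
cards1 = list(make_cards("提督は艦隊を見た。銀河は広い。", known, tokenize))
cards2 = list(make_cards("提督は諧調を好む。", known, tokenize))

for word, sentence in cards1 + cards2:
    print(word, sentence, sep="\t")
```

Because the same `known` set is carried from one volume to the next, 提督 only produces a card the first time it shows up, which is exactly why the card counts shrink for later volumes.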
Before I started this process I already had 14957 cards in my personal deck, along with a Core 2k/6k vocabulary deck (i.e. 6000 pre-made vocabulary cards). My list of words to ignore contained approximately 9500 items. This process generated a total of 2166 cards. Here are the numbers after going through this process with the 10 volumes of the main series and the 4 longer volumes of the side series. These numbers are presented in the order the volumes were processed.
| Name | Number of cards |
| --- | --- |
| 銀河英雄伝説 外伝1 星を砕く者 | 92 |
| 銀河英雄伝説 外伝2 ユリアンのイゼルローン日記 | 42 |
| 銀河英雄伝説 外伝3 千億の星、千億の光 | 61 |
| 銀河英雄伝説 外伝4 螺旋迷宮（スパイラル・ラビリンス） | 67 |
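As a rough back-of-the-envelope check on these figures, the snippet below splits the 2166-card total into side-series and main-series cards, and estimates how long the new cards would last at 7 versus 20 new cards a day. It assumes the 2166 total covers all 14 volumes and that new cards come only from this batch at a constant daily rate; the variable names are just for illustration.

```python
import math

# Card counts for the four side-series volumes, as listed in the table.
side_series_cards = [92, 42, 61, 67]
total_cards = 2166  # total generated by the whole process

side_total = sum(side_series_cards)
main_total = total_cards - side_total  # cards from the 10 main-series volumes

# Days needed to introduce every card at a constant daily rate.
days_at_7 = math.ceil(total_cards / 7)
days_at_20 = math.ceil(total_cards / 20)

print(f"side series: {side_total}, main series: {main_total}")
print(f"days at 7/day: {days_at_7}, days at 20/day: {days_at_20}")
```

So under those assumptions the side series accounts for 262 cards, the main series for the remaining 1904, and the whole batch lasts roughly 310 days at 7 new cards a day versus about 109 days at 20.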
Even between volumes 1 and 2 of the main series you can see a precipitous drop in the number of cards that come out of the process. By volume 5 the descending trend has mostly stabilized. While this isn’t a true quantitative measure of overlapping vocab, due to the subjective nature of step 4 of my process, it still provides some reassurance that an initial vocabulary barrier goes away after a couple of volumes. A ton of vocabulary comes from military ranks or titles referring to the various levels in a hierarchical empire. That should not be a surprise for anyone familiar with Legend of the Galactic Heroes. Besides that, there were also pockets of science terminology and annoying alternate kanji forms.
By the way, my list of words to ignore grew by about 1500 words after this (I didn’t bother tracking the growth between volumes). As expected, the parser caught a bunch of loan words and foreign names that weren’t worth my time studying in Anki. Overall, though, there are more Japanese words in there than not.
So far I’ve finished studying (i.e. I have at least one review for) all the cards from volume 1 of the main series, and I’m in the middle of getting through the cards from volume 2. I don’t plan on actually starting this series for a while. I want to get some more reading practice with easier material first, and I also have a pretty big backlog of other books and series for which I’ve already pre-learned vocabulary (Kino’s Journey, Spice and Wolf, Biblia Used Bookstore Casebook, and other books that are more difficult than these). Despite what I said at the beginning, Legend of the Galactic Heroes will probably be the last longish series for which I intend to pre-learn vocabulary. I don’t think I’m going to bother pre-learning vocabulary for single-volume titles either.
Reading Legend of the Galactic Heroes is still somewhat of a distant goal at this point, but I’d really love to immerse myself in the source material of one of the greatest anime series ever made. Hats off to the late director Noboru Ishiguro and the rest of the staff for doing a fantastic job (Takeshi Shudo did some screenwriting for this franchise as well!).
Update (04/05/2018): As of today I’ve finished “learning” the last card I created through this process… I have yet to actually take a stab at reading any of the novels. Too much backlog of things to read and not enough actually reading.