Sunday, July 5, 2009

Measuring Progress

An interactive test that could be used to periodically measure and track progress would be extremely valuable. I haven't found a good, comprehensive one yet.

Clavis Sinica offers a pretty good character recognition test. It generates an estimate of the number of characters the user knows. The test is located here.

The test seems to be well-designed. Characters are categorized by frequency of use. The "Beginner" level focuses on the 300 most frequently used characters. The higher levels progressively throw additional characters into the mix. The estimated number of characters known is calculated from the percentage of characters correctly identified in each grouping. Since the Beginner level only samples from the 300 most frequently used characters, the highest possible score at this level is 300. The Low Intermediate level adds in the next 500 most frequently used characters, so the maximum score achievable at this level is 800. And so on.
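As a rough sketch of the arithmetic described above (this is one plausible reading, not Clavis Sinica's actual formula), the estimate can be built by scaling the fraction answered correctly in each frequency band up to the size of that band. The band labels, the `estimate_known` function, and the sample counts are made up for illustration, and the 800-character third band is an assumption.

```python
# Frequency bands as (label, number of characters in the band).
# The 300 and 500 come from the description above; the 800 is an assumption.
BANDS = [("1-300", 300), ("301-800", 500), ("801-1600", 800)]


def estimate_known(results):
    """Estimate characters known from per-band (correct, sampled) counts.

    `results` lists one (correct, sampled) pair per band, most frequent band
    first. Taking only the Beginner level means passing a single pair.
    """
    total = 0.0
    for (_label, size), (correct, sampled) in zip(BANDS, results):
        if sampled > 0:
            # Scale the fraction answered correctly up to the whole band.
            total += size * correct / sampled
    return round(total)


print(estimate_known([(10, 10)]))                   # 300: perfect Beginner score
print(estimate_known([(9, 10), (5, 10), (2, 10)]))  # 680: 270 + 250 + 160
```

This also shows why a level caps the score: a band that is never sampled contributes nothing to the total.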

I took all four levels of the test and achieved the following scores:

I am not sure exactly how many characters I know, so I can't quantitatively gauge the test's accuracy in that regard. But the High Intermediate result of 656 seems plausible. Since I currently am studying in a "top down" manner (from Chinesepod lessons) instead of "bottom up", the order in which I am learning characters is only loosely correlated with their usage frequency. That is why I already know a fair number of characters in the second 800 even though I still need to learn a large number in the first 800. I was 0-for-18 on characters beyond the first 1600, so there currently is no point in taking the Advanced level test. Doing so just reduces the sample size available for the more frequent characters.

I'll take the test again in a couple months and see where things stand then ...

Saturday, June 20, 2009

Return to Chinesepod Podcasts and Transcripts

I've been using Chinesepod transcripts and podcasts again over the last 6 months. My new emphasis has been on reading, writing, and building my vocabulary in an integrated manner. When I previously used Chinesepod, I had focused only on listening, speaking, and pinyin. I had completely avoided trying to learn the written script.

I am using Chinesepod material to move forward from where the Integrated Chinese Level I textbook left me. I had learned about 350 characters and 550 words through that effort.

I terminated my original subscription to Chinesepod a couple years ago, due to frustrations I experienced with their approach. I currently am re-visiting old Chinesepod lessons that I had downloaded and worked through previously. I have not yet re-subscribed to the service, although I plan to try that again soon.

Below is a timeline of the primary tools I've used so far in this Long March through the Chinese language. I've had to switch between systems over time, because I haven't yet found a really good, complete system for self-study of the Chinese language. There is lots of good content and many good tools available, but I don't believe a really good system exists yet.

I use the Chinesepod Newbie and Elementary level transcripts primarily as practice reading material. I don't generally listen to the podcasts associated with these lessons, since a majority of the content is in English. I am using the Intermediate lessons in a more integrated manner, though (i.e. reading, writing, listening, and speaking), because they are mostly in Chinese and full of rich content. Each Intermediate lesson typically introduces roughly 15 new characters and 20 new words into my vocabulary.

Sunday, May 24, 2009

Deconstructing the Chinese Language

I’ve found there are a number of different levels at which I need to study the Chinese language. A sentence, for example, can be broken down into the following hierarchy:

  1. Sentence
  2. Lexical chunks
  3. Words
  4. Characters
  5. Character Components

The diagram below breaks down the simple sentence, “I like to eat lunch,” in this manner:

Lexical chunks are word patterns that commonly show up in the language. Ken Carroll of Chinesepod is a strong advocate of the lexical approach to learning languages. In the simple example above, “I like to eat…” is a lexical chunk. It shows up frequently in the language, paired with a variety of different objects. For example:

  • I like to eat lunch
  • I like to eat vegetables
  • I like to eat fruit
  • ...

Words can be formed from a single character or multiple characters. 午饭 (lunch), for example, is formed from the two characters 午 (noon) and 饭 (meal).

Characters are more basic elements than words. To successfully learn and remember characters, it generally is necessary to break them down into their components. In some cases I assign components their true etymological meaning, while in other cases I assign them a meaning that I find to be more conducive to building mnemonics. In the example above, “hand” is the actual meaning of the component it labels. But “chair” is a made-up name I have assigned to the 又 component, simply because it looks like a director’s chair to me.
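The upper levels of this hierarchy map neatly onto nested lists. Here is a minimal sketch, assuming my own Chinese renderings of the example sentences (我喜欢吃… for “I like to eat…”, plus 蔬菜 and 水果 for vegetables and fruit). The `break_down` helper is hypothetical. The component level is left out, since splitting characters into components needs a lookup table rather than simple string handling.

```python
# A recurring lexical chunk and a few objects it can pair with.
# The Chinese renderings are supplied for illustration.
CHUNK = ["我", "喜欢", "吃"]  # "I like to eat ..."
OBJECTS = {"午饭": "lunch", "蔬菜": "vegetables", "水果": "fruit"}


def break_down(obj):
    """Split 'I like to eat <obj>' into the levels of the hierarchy."""
    words = CHUNK + [obj]
    return {
        "sentence": "".join(words),
        "chunk": "".join(CHUNK) + "…",
        "words": words,
        # Each word is a string of one or more characters.
        "characters": [ch for word in words for ch in word],
    }


for obj in OBJECTS:
    levels = break_down(obj)
    print(levels["sentence"], levels["words"], levels["characters"])
```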

Sunday, May 3, 2009

Using Mnemonics to Learn and Remember Multi-Character Words

My recent posts have focused on using mnemonics to help learn and remember characters and the components used to construct them. Mnemonics can also play a role up one level, in learning multi-character words. A significant fraction of Chinese words are composed of two or more characters. In some cases, the meaning of a word is closely related to the characters used to construct it, and therefore easy to remember. In other cases, the relationship is not so clear.

I have found there are four general classes of words in this respect. From easiest to most difficult they are:

  1. Literals
  2. Doubles
  3. Literal + Other
  4. Non-Literals


These are my own, layman’s classifications of how these words appear to me as a language learner; they are not based on the actual etymology of the words.

Literals

Most words that are relatively new to the language seem to have a fairly clean, literal, and intuitive association with the characters used to construct them. The names for electrically powered devices, for example, typically are easy to break down and remember according to their functional nature. Most begin with the character 电 (diàn) = electric, and the literal translation often is somewhat entertaining. Some of my favorites include:

  • 电脑 (diànnǎo) = Electric + Brain = Computer
  • 电梯 (diàntī) = Electric + Ladder = Elevator

The literal translation makes these words seem as if they are from a decades-old science fiction book. These words were easy to learn, and I’m sure I will never forget them.

Some other words are less functionally “correct” in terms of literal translation, but still provide very strong visual images, and therefore also are easy to learn and remember (once their root characters have been salted into memory). A great example is popcorn:

  • 爆米花 (bàomǐhuā) = Explode + Rice + Flower = Popcorn

“Exploding Rice Flower!” How perfect of an image is that?

Some more conceptual words also fall into this class. An example is:

  • 开心 (kāixīn) = Open + Heart = Happy

Little effort is required to develop a mnemonic story for remembering this word.
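The literal readings above can be generated mechanically once each character has a one-word gloss. Here is a small sketch, assuming a hand-built gloss table limited to the characters in these examples. `GLOSS` and `literal` are names I made up, and a real table would need one entry per character studied.

```python
# One-word glosses for the characters used in the examples above.
GLOSS = {
    "电": "Electric", "脑": "Brain", "梯": "Ladder",
    "爆": "Explode", "米": "Rice", "花": "Flower",
    "开": "Open", "心": "Heart",
}


def literal(word):
    """Join the gloss of each character; unknown characters show as '?'."""
    return " + ".join(GLOSS.get(ch, "?") for ch in word)


for word in ["电脑", "电梯", "爆米花", "开心"]:
    print(word, "=", literal(word))
```

The first line printed is `电脑 = Electric + Brain`. The literal reading is only a memory hook; the actual meaning of the word still has to be learned alongside it.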

Doubles

Some words are made up of two characters that have a similar meaning to the word they form. The most straightforward are those in which a character is doubled. For example:

  • 看看 (kànkan) = See + See = To examine

More frequently, a word may consist of two different characters that both have a similar meaning to the word they form:

  • 告诉 (gàosu) = Tell + Inform = To tell, to inform
  • 休息 (xiūxi) = Rest + Rest = Rest

These types of words are fairly easy to learn and remember once the underlying characters have been absorbed into memory.

Literal + Other

Some words have a similar meaning to one of their characters, but not the other. I would guess these combinations had a more literal, intuitive relationship at some point in the distant past, but have since evolved over time. An example is:

  • 如果 (rúguǒ) = If + Fruit = If

I’m not sure how the fruit got in there.

Non-Literals

Most words have a less visibly literal relationship with the characters they are made from, probably due to evolution over time. They vary in how easy they are to develop mnemonics for. An example of an easy one is:

  • 拿手 (náshǒu) = Hold + Hand = Expert

Of course an expert can help by offering to hold hands through the process.

If you are an entrepreneur and one of your life focuses is a business idea, then here is another easy one:

  • 生意 (shēngyi) = Life + Idea = Business

And, if you are a journalist, then when a big news event occurs, your rest probably vanishes:

  • 消息 (xiāoxi) = Vanish + Rest = News

But, most words are more difficult than this. They require more effort to build a story for, and therefore generally are more difficult to remember. A couple arbitrary examples:

  • 从来 (cónglái) = From + Come = Always
  • 照顾 (zhàogu) = Shine + Look Back = To take care of

Sunday, April 26, 2009

Another Character Mnemonics Book

"Learning Chinese Characters", by Alison and Laurence Matthews applies mnemonics to learning and remembering Chinese characters. Similar to "Remembering Simplified Hanzi," it uses a bottom-up approach, starting at the component level and working up to build characters and words. The authors extend the use of mnemonics to address pronunciation, as well. I haven't tried doing this. A good preview of the book is available here.

I won't use the book because I have chosen not to adopt a bottom-up approach to studying characters. But, regardless of your plan of attack, the book provides a good reference for learning how to develop and apply mnemonics.


Thursday, January 1, 2009

Learning and Remembering Chinese Characters

The challenge of learning and remembering the large number of characters required to gain proficiency in the Chinese language (~3000) is enormous. Also enormous is the pedagogic topic of how best to go about tackling this challenge.

There are multiple aspects of characters that must be learned: not just what the character looks like and means, but also how to pronounce it, the correct stroke order for drawing it, and which compounds (words) it is used in.

Children in China generally build their written vocabulary through copious amounts of repetition – writing characters over and over and over. But most laowai who pursue learning Chinese as a second language do not have the luxury of time or the patience required to follow this path.

Some characters are easy to remember, because they are simple, distinct, and pictorially associated with their meaning. Here is an example of a character that is very easy to remember:

  • 飞 - fēi – to fly


It only requires a few strokes to draw, it looks like a hummingbird, and it means to fly. Perfect.

Unfortunately, only a small percentage of characters are pictographs. And, many characters which technically are classified as pictographs have evolved over the centuries to the degree that they no longer look like the meaning they represent. For example:

  • 月 – yuè - moon


This character doesn’t look like a moon to me.

Mnemonics

A variety of different strategies are used for learning and remembering characters. A common theme that most methods share is the use of mnemonics. This typically involves breaking each character down into components, labeling each component, and building a story or a picture that binds those components together to represent the meaning of the character. How well this works varies from character to character. Here is a classic example of the mnemonic approach:

  • Character: 休 = to rest
  • Component 1: 亻 (人) = man
  • Component 2: 木 = tree
  • Mnemonic: to rest, a man leans against a tree


Both of the components that make up this specific character are very common, and show up in many other characters. In this case, they also happen to be radicals (more on this later), and the meanings I supplied for them are the standard definitions assigned by the Chinese language powers that be. It is not necessary to assign components a name which is derived from their etymological roots – you can name them whatever you would like. But, I’ve found that for most radicals, it's easiest to use their official assigned name. Examples of a couple that I have given alternative names to include:

  • 又 – yòu – again. Alternative mnemonic name: chair
  • 夂 – zhǐ – go. Alternative mnemonic name: armchair


I created these alternative names because they provide a more tangible visual representation of what the components look like (to me at least), and because they were easier to fit into mnemonic stories.

OK, so what exactly are radicals anyway? I don’t really know, and haven’t found a good explanation of them. Theoretically, they provide the meaning basis for each character. But often the radical is only peripherally related to the character’s overall meaning, if at all. They are useful to know primarily because many of them show up as components in a large number of different characters. There are 214 official radicals. A good list is available at Yellowbridge. This list is helpful because it includes the Unicode representation of each radical and its common variants. This makes it ideal for incorporation into electronic documents, for the purpose of isolating and manipulating individual components.

Overall Method: Bottom-up versus Top-down

Most people use mnemonics in some manner, but there are many different ways they can be incorporated into a broader strategy for learning and remembering characters.

A somewhat famous scholar named James Heisig promotes a purely bottom-up approach that focuses on building upward from components. It also focuses primarily on character design and meaning and leaves pronunciation and contextual usage for the student to pursue on their own. Heisig first developed this strategy when studying Japanese many years ago and recently collaborated with Timothy Richardson to apply it to Chinese. An extended excerpt from their book, Remembering Simplified Hanzi, can be found near the bottom of the page here in pdf format. It’s worth taking a look at to see if this method might work for you.

A top-down approach, on the other hand, focuses on studying characters simply as another element of learning new words and phrases.

Both approaches have pluses and minuses. I find Heisig’s approach to be too abstract and detached for my liking. I’ve determined that I need to learn characters within a broader context to stay interested and focused, so I embarked upon the top-down approach. The downside of the top-down approach is that it takes a while to figure out the best way to break characters into components and how to build mnemonics from those components. Heisig has already tackled much of this heavy lifting for you with his approach.

Through trial and error, I’ve ended up using a hybrid of both the top-down and the bottom-up strategies. I find that it is best to build up clumps of super-components so that any character can be represented using no more than a few components or super-components. In most cases, super-components are characters in and of themselves, but characters which have not yet shown up in any of my studies to date. Which super-components are worth creating and remembering depends on how frequently they show up in the different characters you come across. An example of a common super-component is:

  • 合 – hé – combine


Originally, I mnemonically referred to this combination by the names for each of the three radicals that form it:

  • 人 - man
  • 一 - one
  • 口 - mouth


Man, one, mouth. After I started running into this combination of components in a variety of different characters, I made an effort to memorize the meaning of the aggregate super-component (combine), and now I use that to build mnemonics for the characters that incorporate it.

I find YellowBridge’s etymological dictionary to be an indispensable tool for breaking characters down into their components and super-components. A Unicode representation of most components is available within the breakdown they provide. This makes it easy to identify and then electronically copy and paste components into a document or spreadsheet for the purpose of building a vocabulary list that includes components and mnemonics in addition to the character itself. MDBG also recently added a somewhat similar capability to their site.
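A vocabulary list of this kind is easy to keep as plain data and export to a spreadsheet. Below is a minimal sketch, assuming a hypothetical record layout (character, pinyin, meaning, components, mnemonic) and using the “rest” and “combine” examples from this post. The helper names are my own, and nothing here reflects how YellowBridge or MDBG store their data.

```python
import csv
import io

# Each entry pairs a character with its components and a mnemonic.
ENTRIES = [
    {"character": "休", "pinyin": "xiū", "meaning": "to rest",
     "components": ["人", "木"],
     "mnemonic": "to rest, a man leans against a tree"},
    {"character": "合", "pinyin": "hé", "meaning": "combine",
     "components": ["人", "一", "口"],
     "mnemonic": "man, one, mouth"},
]

FIELDS = ["character", "pinyin", "meaning", "components", "mnemonic"]


def to_csv(entries):
    """Render the entries as CSV text that a spreadsheet can open."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(FIELDS)
    for e in entries:
        writer.writerow([e["character"], e["pinyin"], e["meaning"],
                         " ".join(e["components"]), e["mnemonic"]])
    return buf.getvalue()


def uses_component(entries, component):
    """List the characters whose breakdown includes the given component."""
    return [e["character"] for e in entries if component in e["components"]]


print(to_csv(ENTRIES))
print(uses_component(ENTRIES, "人"))
```

Being able to ask which characters share a component is what makes it practical to spot candidates for super-components as the list grows.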

Other Items

Stroke Order. Is it important to learn the correct stroke order? I’ve found that it is helpful, because it is easier to remember how to draw a character or component if you always do it in the same way. And conforming with the standard method gives you a reference source to go back to for this. Using proper stroke order also improves the ability of handwriting recognition software, such as Plecodict, to recognize what you write.

Reading versus Writing. Since the typing of characters is accomplished using pinyin, is it necessary to be able to write characters, or is visual recognition enough? In most cases recognition is enough, and most people are able to recognize many more characters than they are able to write correctly, especially when seeing them in the context of written text. A key question is: does gaining the ability to write characters help with long-term visual recall? I don’t know the answer to this, but it seems like it should. When using flashcards, I drill myself as much or more on writing the characters as on recognizing them.

Sunday, December 21, 2008

Who's Mastering Chinese in Australia?

Australian Prime Minister Kevin Rudd is fluent in Chinese, and has an ambition for Australia "to be the most Asian-literate nation in the Western world".

This article in a publication named “The Age” summarizes the results of an interesting study into the effectiveness of Chinese language instruction in Australia’s schools.

The report states that the study of Chinese in Australian high schools "is overwhelmingly a matter of Chinese teaching Chinese to Chinese":

  • Students studying Chinese as a second language are "overwhelmed" in assessments by "strong numbers" of students who have Chinese as a first language.
  • 94% of students who learn Chinese at some stage during their education drop out before year 12.
  • Of those still studying the language at year 12, 94% are "first language" speakers — Chinese-born or of Chinese descent.
  • Students learning Chinese as a second language at year 12 are required to master about 500 Chinese script characters — the same number reached by five-year-olds in grade one in China.

Jason & Sherry - Chinese Speaking Aussies

Friday, December 19, 2008

Learning to Read and Write Chinese - Part I

After completing Pimsleur, I decided that I needed to focus on expanding my vocabulary as the next step towards growing my Chinese language skills. And I concluded that in order to do so, I needed to start becoming literate – to learn to read and write. How did I come to this conclusion? I found that the plethora of homophones in the Chinese language makes it very difficult to imprint new words into one’s memory just from the sounds and the pinyin representation of those sounds. All the shi’s and xi’s and xiao’s and shu’s start to blur together over time.

The reasons I waited 2.5 years before starting to study the script are twofold: 1) learning the script seemed of secondary importance and value to learning the spoken language, and 2) I find the Chinese script to be unappealing, both in terms of aesthetics and function. To me, characters look like lots of random scribbles, and there doesn’t seem to be much rhyme or reason between the design of characters and their meaning and pronunciation.

Most laowai who eventually develop proficiency in the language seem to have originally been attracted to the script. I wish that were my case, but it isn’t. The Chinese script seems like a pointless puzzle, like Rubik’s Cube or Sudoku, and I have no interest in those kinds of games.

I had only dabbled a bit with the script prior to embarking upon this new initiative. Since the script is not phonetic in nature, it is not intuitively clear what the best way to learn it is.

So, how best to proceed?

I decided that first and foremost, I needed to tackle reading and writing in a contextual manner, integrated with my limited speaking capabilities. While revisiting the characters I had poked at previously, it instantly became clear that I only remembered ones associated with words that I was already very comfortable using verbally (e.g. 好, hǎo - good). The rest seemed completely foreign to me. They had failed to stick in my memory.

So, maybe a textbook would be a good tool? It seemed likely that an introductory textbook would focus on words I was already familiar with, and on applying them in context, rather than just introducing discrete characters (as “What’s in a Chinese Character” does).

I remembered seeing mention of a popular college textbook named “Integrated Chinese” on the Chinesepod blog, so I decided to give the first semester of this series (Level I, Part I) a shot. As the title implies, the series is designed for learning all four aspects of the language (listening, speaking, reading, and writing) simultaneously. I can’t imagine taking on such a challenge, but I’m sure there are benefits to this kind of approach.




The first book introduces about 350 characters and 550 words. I was familiar with most of the words, so was able to focus most of my attention on the written aspects of the language. The book also addresses many basic grammar points that had previously puzzled me, such as when to insert a “de” between adjectives and nouns, and when not to.

It took me a few months to work through all 11 chapters of the textbook. I used Plecodict as a vocabulary list and flashcard tool to aid the process. I learned early on that it was important to work with full words in addition to the individual characters. For example, I had difficulty remembering one of the most basic characters, 生 (sheng1 – to be born), until I repeatedly saw it as a component in the compound 先生 (xian1sheng1 – Mr.). The reason is the power of context, of course: I rarely come across the verb “to be born” in everyday language, but “Mr.” is very common.

The outcome (so far)

Studying the written script has turned out to be very valuable – I wish I had started much earlier. I recommend starting to study the script within 6 months of initiating study of the spoken language, rather than waiting 2.5 years as I did.

Having a visual representation associated with each word makes it easier to identify the relationship between words, and to imprint them in memory. As an example, the written script makes it clear that 已经 (yi3jing1 – already) and 经常 (jing1chang2 – frequently) share a common character, 经 (jing1 – pass through). This is helpful because both words are temporal in nature, so the shared character helps reinforce the relationship between the words. This relationship would not be readily identifiable from the pinyin alone, since “jing” is such a common sound in the language.
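Spotting shared characters like this can be automated once a word list exists in electronic form. Here is a small sketch, assuming a hand-picked list of words mentioned on this blog. The function names are hypothetical.

```python
from collections import defaultdict

# A few words from these posts; any vocabulary list would do.
WORDS = ["已经", "经常", "先生", "休息", "消息"]


def index_by_character(words):
    """Map each character to the words that contain it."""
    index = defaultdict(list)
    for word in words:
        # dict.fromkeys drops repeats so doubled words are listed once.
        for ch in dict.fromkeys(word):
            index[ch].append(word)
    return index


def shared(index):
    """Keep only the characters that appear in more than one word."""
    return {ch: ws for ch, ws in index.items() if len(ws) > 1}


print(shared(index_by_character(WORDS)))
# {'经': ['已经', '经常'], '息': ['休息', '消息']}
```

The same lookup keyed on toneless pinyin would be far noisier, since many unrelated characters share a syllable.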

Although I still don’t care much for the written script, I feel that I am becoming more comfortable with the language as a whole from studying it. The language is starting to feel less opaque than pinyin leaves it. A tangible example of this surfaced while looking at a map of the country. The names of the provinces never meant much to me before. Sichuan, Shandong, Shanxi, etc. used to seem like arbitrary names. But when written in the script, these names suddenly come to life. Sichuan is written as 四川. This is 四 (si4 – four) plus 川 (chuan1 – river). Of course! Four Rivers! Likewise, Shandong is written as 山东: 山 (shan1 - mountain) plus 东 (dong1 - east). Of course! East Mountain!

Upcoming Posts
  • Learning and Remembering Characters
  • Australians who speak Chinese
  • Lexical Chunks
  • American Celebrities who Speak Chinese

Friday, November 28, 2008

Will Chinese Become the Dominant Global Language?

When I started considering this question a few years ago, it seemed like a complex question. The eventual outcome would be determined by a tangled stew of global sociological dynamics, trade and investment flows, migration patterns, economic growth rates, popular culture development and marketing, thought and opinion leadership, technology development, etc.

I since have come to believe the answer is fairly simple: No, because Chinese is too difficult to learn and master as a second language.

Certainly Chinese is an important language today, if for no other reason than because more than 1 billion people use it. And, it will remain important for many decades and probably centuries into the future.

But, will large numbers of non-native speakers someday use it as a neutral, common language in order to converse with each other? Will corporate executives from France and Germany shift to Chinese when they meet to discuss business? How about students from Ghana and Thailand studying together in Canada? Or, government ministers from India and Australia discussing trade issues?

I don’t believe so. And, those are the attributes that define a “global language”.

Sunday, September 14, 2008

Typing Chinese Characters

Now that I’ve embarked upon learning to read and write the Chinese script, the question of, “How the heck do people type Chinese characters on a computer?” has become more than just a curious afterthought.

Do they have a 6000-key keyboard, one key per character?

Apparently not.

Most Chinese speakers (all?) use a software tool that allows them to type a pinyin syllable, and then select the actual character they wish to enter from a list of characters that match the pinyin.

Windows supports this through a tool called the Input Method Editor (IME). The easiest way I’ve come across to set it up is described here by a Shanghai blogger.
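The basic idea of pinyin input (type a syllable, then choose from the matching characters) can be shown with a toy lookup table. This is only a sketch of the concept, not how the Windows IME or any real input method is implemented. The `CANDIDATES` table and both functions are made up for illustration, and real tools use much larger dictionaries and rank candidates by frequency and context.

```python
# Toy candidate table keyed by toneless pinyin.
CANDIDATES = {
    "shi": ["是", "十", "时", "事"],
    "jing": ["经", "京", "精"],
    "hao": ["好", "号"],
}


def candidates(syllable):
    """Return the characters offered for a typed pinyin syllable."""
    return CANDIDATES.get(syllable.lower(), [])


def pick(syllable, number):
    """Return the candidate chosen by its 1-based position in the list."""
    options = candidates(syllable)
    if not 1 <= number <= len(options):
        raise ValueError(f"no candidate {number} for {syllable!r}")
    return options[number - 1]


print(candidates("shi"))  # the list the user chooses from
print(pick("jing", 1))    # 经
```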