Vague vagaries: April 2010

Wednesday, April 28, 2010

Opera and the CPU

Opera 10.51 is running on my work PC under Windows Vista, with 2 GB RAM and a 2.66 GHz Core 2 Duo. There are about 50 open tabs, and one of them is playing a YouTube video. CPU usage is listed in the Task Manager as around 2-3%, with 300 MB of physical RAM used.

The ~~most~~ recent Opera (10.10 I think... actually 10.52 was apparently released today, but 10.10 still reports that there are no new updates) on my 2.33 GHz C2D MacBook with 2 GB RAM, in the same configuration except without any YouTube videos playing, uses 10% CPU according to top, and 20-30% with a video playing. And that's with the Flashblock userjs - before it was much higher.
Why? Is it because my PC has a half-decent graphics card and the Macbook doesn't?

Tuesday, April 20, 2010

US government finally admits most piracy estimates are bogus

(link)

A nice, measured treatment of the flaws (especially the common citations of bogus or non-existent work) in pro-IP surveys and studies, usually commissioned by "content industries" (organisations like the MPAA, RIAA, BSA etc) which often contain "specific and alarmist rhetoric".

It's US-centric but has some very sane and generally applicable points, like: "For instance, these studies ignore the obvious points that pirating goods leaves consumers with more disposable income, which is likely spent elsewhere in the economy. Effects on the economy as a whole, then, are terribly speculative and seem more likely to be simply redistributive".

Thursday, April 15, 2010

Distinguishing between "choking" and "panicking"

If you've ever "choked" in any kind of performance (e.g. when you're far ahead in a snooker game and just need this one, simple shot to win, you can do this, just keep your shoulder down and your elbow straight, follow through with the cue and WHAT THE-), then this utterly superb article will have you nodding your head in acknowledgement, understanding and compassion.

Not only does it explain and separate the notions of panicking (reversion to instinct) and choking (loss of instinct) under pressure with dramatic examples, it introduces the interesting form of choke that is "stereotype threat" (which seems to correspond with something I wrote a while ago):

"Steele and others have found stereotype threat at work in any situation where groups are depicted in negative ways. Give a group of qualified women a math test and tell them it will measure their quantitative ability and they'll do much worse than equally skilled men will; present the same test simply as a research tool and they'll do just as well as the men."

Ultimately, we're faced with a Schrödinger-type paradox, whereby external, theoretically irrelevant variables (audience, expectations, prize, etc) influence performance on a task:

"We have to learn that sometimes a poor performance reflects not the innate ability of the performer but the complexion of the audience; and that sometimes a poor test score is the sign not of a poor student but of a good one."

Tuesday, April 13, 2010

Redundant T&Cs

(http://www.meteor.ie/terms_and_conditions/billpaymax/)

These fair use conditions are such that a Customer's usage of this tariff plan shall not exceed 45,000 minutes of calls and/or 5,000 texts per month.

1 month = (max) 31 days = 31*24 hours = 31*24*60 = 44,640 minutes.

Since it's physically impossible to exceed 45,000 minutes of calls in any 44,640 minute period... why have that condition in the contract at all?
It's like having a "friends and family" discount with a condition that you can only apply the discount to a maximum of 7 billion people.
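
The arithmetic above is easy to check with a few lines of Python. This is only a sketch: the constant names are my own, and it assumes a single line making one call at a time (no conference calls or parallel calls counted separately).

```python
# Upper bound on call minutes in the longest possible calendar month,
# assuming one call at a time on a single line (an assumption, not in the T&Cs).
MAX_DAYS_IN_MONTH = 31
MINUTES_PER_DAY = 24 * 60

# Figure quoted from the fair use conditions above.
FAIR_USE_CAP_MINUTES = 45_000

max_minutes_in_month = MAX_DAYS_IN_MONTH * MINUTES_PER_DAY
headroom = FAIR_USE_CAP_MINUTES - max_minutes_in_month

print(max_minutes_in_month)  # 44640
print(headroom)              # 360
```

So even someone on the phone for every minute of a 31-day month would still be 360 minutes (six hours) short of the cap.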

Saturday, April 10, 2010

Üter

Was walking to the train station with the childe on the way home from town, going by Leinster House. To prompt the dawdling girl into hurrying up I told her the Garda stationed at the gate would catch her if she was bold, which naturally caused her to sprint away at full pelt. As we passed, the young cop called out: "Ah don't make me run, I'm full of chocolate!"

Good to see a Garda with a sense of humour (and a decent knowledge of Simpsons episodes) :D

Friday, April 09, 2010

Subclipse: "An existing connection was forcibly closed by the remote host"

Subclipse "suddenly" stopped working, so I couldn't commit or synchronise to an SVN repository any more:

RA layer request failed
svn: Commit failed (details follow):
svn: OPTIONS of 'http://big-long-svn-path': Could not read status line: An existing connection was forcibly closed by the remote host.

Maybe a recent update of TortoiseSVN bolloxed it up, who knows... anyway, I worked around it by going to Team->SVN in Eclipse's preferences dialogue and changing the client in the "SVN Interface" section from JavaHL (JNI) to SVNKit (Pure Java). Works so far, although I had to re-enter the username/password which had been stored before.

Wednesday, April 07, 2010

Interregnum

Saw a nicely vandalised sign in the DCU car park today which reminded me (pleasantly) that this is Ireland. Someone had crossed out three letters so it read:

NO ~~SMO~~KING BEYOND THIS POINT

Chinese character frequencies

After a long time of somewhat naïvely trying to learn Chinese by adding production flashcards for new words (where the front side is an English term, with hints to avoid guessing an answer that is correct but not the one on the back side, and the back side is Chinese characters and phonetic pinyin), I realised the task was far too difficult and time-consuming. For each of those cards, I'd write the characters on a graphics tablet and speak them, then flip the card and fail it if I made any mistakes in either the writing or the speech. This was needlessly laborious, since there was so much redundancy and opportunity to make small mistakes even if most of the answer was correct (writing out 印制电路板 (printed circuit board) many times was extremely tedious and unproductive).

So some reading on Glowing Face Man's blog led me to switch things around a bit, changing my deck so that the only characters I would write (production) were single characters, of which there are still very very many (over 20,000!) but the most common 3,000 account for over 99% of what you'll see in actual modern Chinese. All the other cards changed to recognition, where the front side is the Chinese characters and the back side (what I speak out loud before flipping the card) is phonetic pinyin and a (sometimes rough) English translation. Rather than mess about with Anki's deck format or exporting/modifying/importing, I wrote a dodgy AppleScript program to automate moving through the deck interface and sending keystrokes to cut, paste and rearrange the text... even crappy automation can be better than changing 2,500 cards manually. In fact, it would still be better even if it took the same amount of time, because of the sense of reward that it spurs.

This has helped immensely, reducing the pain and greatly increasing throughput and efficiency. However, learning the characters still takes time - my current plan is to go through the 3,000 most common ones and learn them as production cards before carrying on with sentence recognition cards.

But why 3,000 characters? Why not half or twice that? And which ones?

That's answered here - a computer program can quickly go through a huge corpus of text and produce a sorted listing of characters by frequency. Predictably, the first couple of hundred characters account for a huge fraction of written Chinese: 200 characters will get you 55% understanding (that's "most" Chinese already, heh), 400 will get you 70%, and so on. (Of course, when I say "understanding", I'm ignoring the fact that you need to learn the grammar, idioms and so on, and which of many possible meanings a character will take on in different contexts.)

A quick plot of the numbers provided produces a roughly logarithmic shape, showing diminishing returns (given the roughly constant time required to learn characters):

So it looks like the payoff is small by the time you're hitting around 2,500 characters (98.5%), but it would be nice to be able to say that less than 1% of written Chinese is unknown to you once you hit 3,000 characters (99.2%), and from then on to add unfamiliar characters to the deck only as you encounter them during reading, less and less often.
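
For the curious, the kind of frequency listing described above can be sketched in a few lines of Python. This is not the program behind the linked figures, just a minimal illustration: the function name `cumulative_coverage` and the sample sentence are made up, and keeping only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) is a simplifying assumption.

```python
from collections import Counter


def cumulative_coverage(text):
    """Return (char, count, cumulative_fraction) rows, most frequent first."""
    # Keep only characters in the basic CJK Unified Ideographs block.
    # This range is a simplifying assumption; it drops punctuation and Latin text.
    chars = [c for c in text if "\u4e00" <= c <= "\u9fff"]
    counts = Counter(chars)
    total = len(chars)

    rows = []
    running = 0
    for char, count in counts.most_common():
        running += count
        rows.append((char, count, running / total))
    return rows


# Tiny made-up sample; a real run would read a large corpus instead.
sample = "我是学生，我的朋友也是学生。"
rows = cumulative_coverage(sample)
for char, count, coverage in rows:
    print(char, count, f"{coverage:.0%}")
```

Reading down the last column tells you how many characters you need to know to cover a given fraction of the text, which is where figures like "200 characters for 55%" come from.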

Saturday, April 03, 2010

Tip of the Tongue learning is bad!

This article came as a surprise - my default assumption was that "working through" this tip of the tongue state until I came upon the answer was the right thing to do. The research demonstrates that the time you spend agonising and searching for the answer causes the same thing to happen next time - you're "practising" the stuck condition.

So the best thing to do is to have a short timeout (10 seconds was better for future remembering than 30 seconds, in the study) whereupon you give up and look up the answer, or make a note to check later or something. Anything but keep struggling until you remember the answer the hard way, since it only facilitates the same wrong mental paths in future.

Two more suggestions...
1. When you struggle with a tip of the tongue thought, whichever way you manage to resolve it, make an entry for it in an SRS program like Mnemosyne or Anki.
For example, the researcher who carried out the study said that she often struggled to remember the word "obsidian". So when you notice that you tend to struggle with this word, you add a flashcard to your SRS with "glassy lava rock" on the front, and "obsidian" on the back. Then when reviewing the cards, if you can't remember the answer after 5-10 seconds, you give it a fail mark. If you remembered it quickly, give it a passing mark. The SRS program will take care of the rest, managing the transition of the properly-learned knowledge into your long-term memory.

2. When you see a friend (or a child!) struggle for a word and you can guess what word it is, put them out of their misery ASAP, and if they say "ahh, I would have got it, why did you tell me?" then explain why!
