Unintended Consequence
MILWAUKEE – Cities around the country that have installed energy-efficient traffic lights are discovering a hazardous downside: The bulbs don’t burn hot enough to melt snow and can become crusted over in a storm — a problem blamed for dozens of accidents and at least one death.

I’d try to attack this problem with eaves and surface coatings rather than heat.
Criticism Is Easy
I presented three computational models for how attention could lead to consciousness. Based on the papers we assigned, one of them had strong support from the data (Taylor’s CODAM), one of them had medium data-based support (Wang’s LEABRA, small amount of data given via Figure 3 here), and one had little to no support from data (Cavanna’s figure 1 here, the “ascending reticular activating system”). We asked the thirteen other students to rank the models at the beginning of the class, with 1 being the best. Taylor’s model was ranked last, with an average rank of 2.21, Wang’s was second best, with an average rank of 2.07, and Cavanna’s was considered the best, with an average rank of 1.71. So, there was an inverse relationship between the amount of data presented for a model and how much people preferred it.

I’m sure the students preferred the storytelling of the least-supported theory. But I agree they should prefer the story that is a proven model for actual data, even if it’s less pleasing aesthetically.
It’s probably what everyone has told them to do their whole lives: be critical of everything you read, don’t trust statistics, etc. The better advice would be to simply try to gauge the veracity and utility of any individual claim based on the data you are given and your prior beliefs about the possible bias of the data source.

Interpreting data is hard. Most students probably didn’t try to do so.
If people are critical of every new idea they encounter, all that will do is bias them towards favoring the status quo or the null hypothesis. I blame the market failure on our social norms: being critical is too often viewed as automatically synonymous with correct.

Lazy skepticism is a necessary filter; too much “research” is being produced and publicized. However, if a trusted expert recommends a paper, I’d be happy to learn about the evidence in it. I’d prefer that they summarize for me the data and the inferences they think should be drawn from it, because I’ll rarely bother to sift through a paper aimed at the clique of academics who specialize in the topic.
Explosive Exercise
EXPLOSIVE EXERCISES IN SPORTS TRAINING: A CRITICAL REVIEW

This paper reviews evidence relating to the effectiveness and safety of explosive exercises, such as Olympic style weight lifting, other weight training exercises performed at a fast cadence, and plyometric exercises, that are commonly used in the strength and conditioning training of athletes. Contrary to popular belief and the practices of many athletes, the peer-reviewed evidence does not support the view that such exercises are more effective than traditional, slow and heavy weight training in enhancing muscle power and athletic performance. In fact, such exercises do not appear to be any more effective in this regard than weight training at a relatively slow cadence, and some evidence suggests they are less so. Also, such explosive exercises do not transfer well (if at all) to athletic performance on the sports field, and present a significant injury risk.

Nonetheless, I expect that the stereotypical uncoordinated, awkward, muscle-bound bodybuilder really exists to the extent that he has practiced no additional physical skills. Presumably the sport is practiced explosively to the extent that there’s any performance benefit in doing so. People love to win.
Endurance Exercise
which is quite long. Adaptation to both varieties of anaerobic intervals below seems to reach diminishing returns after just 2-3 weeks (6 sessions).
Training anaerobic power is pretty much what you’d imagine. Generally work bouts lasting 30-60 seconds are used with fairly long rest intervals. For a one minute maximum effort, a rest interval of 4-5 minutes of easy recovery might be taken. The idea, essentially, is to generate a lot of power (and a lot of waste products; some have even called this lactate production training) during the work bout and then give the body a chance to clear it during recovery. That’s then repeated a number of times (perhaps 8-10 in total, depending).
Anaerobic capacity training, as noted above, is more along the lines of longer repeated efforts (60-90 seconds) with rest intervals of perhaps 1:1 (so 60 seconds rest for a 60 second interval, 90 seconds for a 90 second interval). Intensity has to be lower here and the goal is to maintain power output against increasing fatigue (some used to call this lactate tolerance training).
this energy system is not massively trainable. Certainly buffering capacity can be improved (and science continues to understand the adaptations involved) but overall improving the power or capacity of this pathway is relatively limited; you put in a lot of very hard, very draining work for not a lot of payback.
the adaptations to this type of training occur quickly but then stop just as rapidly. Studies find that the improvements to this type of work typically stop after about 3 weeks. In one study of cyclists, 12 HIIT workouts over 6 weeks generated no better improvements in performance than was seen after the first 6 workouts over 3 weeks.
in Effects of Moderate-Intensity Endurance and High-Intensity Intermittent Training on Anaerobic Capacity and VO2 Max a similar pattern was seen: the major effects occurred in the first 3 weeks with far smaller effects occurring in the last three.
the 2000 German track cycling team, preparing for the 4km (an event lasting about 4 minutes); they performed minuscule amounts of anaerobic training and did so for about 8 days prior to the event. They had found that that was all that was required to max out the anaerobic pathways.
the high acid production caused by these types of anaerobic intervals may damage mitochondria and overall endurance especially if it’s not balanced by sufficient low intensity work (Lydiard, for example, was adamant that anaerobic work never be performed while building the aerobic engine). This was an older idea that has seen recent scientific validation: in a recent study, individuals who consumed bicarbonate prior to training obtained better aerobic adaptations, presumably by buffering acidosis.

Aerobic training for VO2 max also reaches diminishing returns in just 3-8 weeks (8 weeks starting from a very weak base).
Interval training, specifically the kind that drives athletes to achieve VO2 max, primarily affects heart function. Certainly there are going to be peripheral effects in muscle but the main effect is in the heart.
VO2 max can be sustained for 5-8 minutes at absolute maximum; generally VO2 max intervals are performed for anywhere from 2-5 minutes (with 3 minutes being a fairly standard duration). It will generally take about one minute for the body to achieve VO2 max, such that anywhere from 1-4 minutes is performed at VO2 max. With multiple sets, a rather large amount of training at VO2 max levels can be achieved without having to go to complete exhaustion.

Rest between aerobic intervals should be 50-100% of the duration of an interval. Steady state aerobic endurance requires and develops an endurance adaptation in the muscles:
interval training improved VO2 max far more than long-duration training, it had little to no impact on citrate synthase levels [evidence of aerobic endurance adaptation in muscle]. In contrast, steady state training primarily improved citrate synthase levels with little impact on VO2 max.

I’m sure that starting from very low fitness, you could see improvements to both with some intermediate duration. But in already fit people, you really mostly only build one thing at a time (and sometimes that work can actually actively degrade other areas). Steady state means, of course, that you could keep going for a long time (an hour). If you want to adapt faster, you’d work near the threshold of the most intensity you could maintain for that time. It’s probably good to do more sets (throughout the week or back to back) of shorter duration (10-20 min) rather than actually working that hard for an hour. I don’t even want to determine what my limit is for an hour; I’d just work really hard for 10-20 min and then reduce the intensity by a little bit from that benchmark as an estimate of what I could do for an hour.
Goal | Intensity | Work Interval | Rest Interval |
Neuromuscular Power | Maximum | 6-15 seconds | Complete (2-5 minutes) |
Anaerobic Power | Just Below Maximum | 30-45 seconds | 2-5 minutes |
Anaerobic Capacity | High | 60-90 seconds | 60-90 seconds |
Aerobic Power | Above maximum steady state | 2-5 minutes | 2-5 minutes |
Aerobic Capacity | At/near maximum steady state | 8-20 minutes | 4-10 minutes |
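If it helps to treat the table as data, here is a small sketch (my own encoding, with times converted to seconds; only the numbers above come from the quoted guidelines, and all names are mine):

```scala
object IntervalTable {
  // One row of the table above; ranges are in seconds.
  case class Prescription(goal: String, intensity: String, workSec: Range, restSec: Range)

  val prescriptions = Seq(
    Prescription("Neuromuscular Power", "Maximum",                      6 to 15,     120 to 300),
    Prescription("Anaerobic Power",     "Just below maximum",           30 to 45,    120 to 300),
    Prescription("Anaerobic Capacity",  "High",                         60 to 90,    60 to 90),
    Prescription("Aerobic Power",       "Above maximum steady state",   120 to 300,  120 to 300),
    Prescription("Aerobic Capacity",    "At/near maximum steady state", 480 to 1200, 240 to 600)
  )

  // The rule quoted above for aerobic intervals: rest for 50-100% of the work interval.
  def aerobicRest(workSec: Int): Range = (workSec / 2) to workSec
}
```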
Transition
The Quiet War
Air
Pride and Prejudice (2005)
The Beat That My Heart Skipped
Oldboy
Nabokov Revised Too
Shakespeare’s revisions (and Nabokov’s) matter for two reasons. Revision indicated that even these writers shouldn’t be considered godlike figures from whom the muse poured forth perfection on the first try, but writers who are—in some ways—like other writers, in at least this respect: They were subject to second thoughts.
This popped out at me while skimming a fawning piece on Nabokov, who must have died recently.
I thought it was well-known that any artistic, musical, literary, mathematical, or engineering genius both used writing to enhance their mental working space, and improved substantially on their work by revision. I’d say whatever exceptions there are (who did it all in their head and submitted only first drafts), could have been better had they edited. If they didn’t edit or redraft, all I’d say is that they didn’t find the activity rewarding, not that it wouldn’t have helped. I’m not saying that the most-revised work is always the best (though Madame Bovary may be excellent; I haven’t read it). It was seeing this edit of Kafka’s work by Nabokov in my RSS reader that started me wondering if Nabokov was dead:

On the whole, the edits improve it. I’ve never read Nabokov - but I think I will, soon.
Risk Taking While (No Longer) Drunk
Scientists at the University of Washington (U.W.) in Seattle fed alcohol to a group of rats and found that their ability to make good decisions was impaired even long after they stopped consuming booze.
Researchers found that the tendency to make imprudent decisions did not fade for up to three months after the rats stopped consuming alcohol.

(from Scientific American’s report) In many ways, rats are nothing like humans, but I expect that whatever mechanism explains this in rats will also exist in us. The rats weren’t actually visibly drunk or addicted (displaying withdrawal) to the alcohol; they were just always slightly intoxicated for 20 days straight. After ending alcohol consumption, the rats got to play with two reward-dispensing levers; one with a high expected reward but no extremely large “jackpot”, and the other with worse expected payoff, but larger payoff size (in both, I assume there’s some probability of receiving no reward). I wouldn’t expect rats to correctly compute expected reward per trial, but nonetheless the rats who never got drunk preferred the higher-average lower-peak reward, while the drunk rats liked the bigger-jackpot less-rewarding lever, for up to 3 months of soberness. We’ll assume the scientists randomized the positions of the jackpot and consistent levers between trials; otherwise the persistence of preference would be explained by habit.
Dignified With a Response
Ruth Mayo … found that for a substantial chunk of people, the “negation tag” of a denial falls off with time. … explaining why people who are accused of something but are later proved innocent find their reputations remain tarnished. … So is silence the best way to deal with myths? Unfortunately, the answer to that question also seems to be no. Another recent study found that when accusations or assertions are met with silence, they are more likely to feel true.
In that vein, Robin Hanson asks what conclusions we should draw when a heavily promoted fringe belief (not accepted by official authorities) isn’t met with any substantial rebuttal. Not only should we not believe fringe advocates just because they’re loud, fervent, and apparently unopposed; we should take the official silence as an implied rebuke. I often wonder: when is it worth my effort to investigate the merits of a fringe argument myself, in enough depth that I could justify overcoming the strong prior against it that its fringe status creates? In most cases, either the fringe claim is so incredible, or the advocates so flawed, that there’s no temptation. In some cases, the payoff for correctly adopting a fringe belief seems low (if it’s true that excessive milk drinking is harming me, it probably isn’t by very much), even if the cost of gaining the information needed to decide the question is modest. Prof. Hanson also suggests that when there is serious opposition to an idea, the fact that the opponents are diverse in their reasons and premises is not evidence against them. For example, people sometimes advocate racial (among other types of) profiling on the grounds that it’s rational to avoid considering certain groups when all you want is to find a qualified person as easily as possible, while Robin and others argue that it’s unfair and breeds crime, for not enough benefit. The argument for profiling is simpler and clearer, but since most of us don’t want to allow profiling, we should be consistent and not generally treat “confusion in the opposition” as evidence.
A Foolish Consistency
LessWrong discussion
Nerds are nuts
Crash (a Kind of Movie Review)
crash: a pretty, fun, full-of-crap movie, with some powerfully acted moments (spoilers follow)
everywhere: tarantino-esque supposed to be funny-because-it’s-shocking racist repartee. the absolute low: “how did those two cultures (el salvadorian and puerto rican?) get together and decide to park their cars on the grass?”, said to his girlfriend in a fit of blue-balled pique.
coincidence #1: opening crash - cut to dazed detective getting out and finding his dead brother’s shoe on the side of the road
coincidence #2: molester cop randomly encounters molested woman next day and pulls her from her car just before it explodes (i was really pulling for him! it was a tasteful molestation, so he turned out ok)
coincidence #3: molester cop’s ex-partner also randomly encounters molested woman’s partner the same day and saves his life after he flips out and fights back against carjackers, then, continuing to run amok, attempts suicide-by-cop in a delayed reaction to his doormat-like acquiescence to her molestation-by-cop.
coincidence #4: one of the carjackers who failed in #3, later that night while hitchhiking, ends up shot by the #3 lifesaving cop - while not emotionally significant, it’s still improbable that they would both have an independent relationship to the same car
obvious: evil HMO refuses to check molester’s dad for prostate cancer
obvious: imaginary bullet-proof cloaks and blank ammunition
obvious: ludacris ironically railing against mumbling rappers
annoying: DA assistant’s sleazy/glib blackmail presentation
annoying: if you don’t want to rock the LAPD boat in order to leave your racist partner, you have to publicly claim uncontrollable flatulence
annoying: knowing the leaking gasoline would eventually reach the flame. (a person or a blanket would have easily been able to stop that flow)
retarded: molested woman’s partner brawling with two armed carjackers (we’re to understand that said carjackers, who earlier told us they would never steal from black people, won’t shoot)
retarded: detective’s brother / ludacris’ carjacking partner getting himself shot and tragically having an idol of St. Somebody in his hand rather than a gun. i would have shot him too.
retarded: giving ludacris back his gun and telling him he embarrasses you
true and sad: producer telling above partner (also a TV director) to direct a black actor to “talk more black”
horrible person: hmo bureaucrat who admits she’s making prostate dad suffer because his molester son is racist
horrible person: persian father who was enough of a dick that i was happy his store was vandalized in possible retaliation
horrible person: detective’s mother. gross. her son is a saint for not raging against her.
horrible person: DA’s wife. disgusting in every way. her quasi-redemption did nothing for me - when she realizes she is angry all the time for no reason, when she realizes that her maid (who should hate her but is too saintly) is the only friend she has, i think “yes, you are horrible, and your husband treats you better than you deserve”
possibly horrible person: DA, because presumably he knows what his blackmailing subordinate did. “i need to pin a medal on a black person so i still get votes from people after i was carjacked by black people” was just stupid, not evil. it’s part of the morality of the film that as a politician, he must be tainted by evil, but i do enjoy his calm command. he seems reasonable to me.
horrible person: ludacris’ carjacking partner.
horrible person: ludacris, until he’s conveniently redeemed by setting free the smuggled vietnamese locked in the back of the van he stole, because he remembered it from his hit-and-nearly-run the night prior in the car he stole from the DA
horrible person: molester cop until he’s conveniently redeemed by heroically pulling the woman he molested from a car just prior to its explosion
nice person: locksmith (and his family), whether or not he vandalized the persian father who abused him and cheated him. horrible if in handing down invisible bulletproof cloak he scarred his daughter for life after he was shot (thanks, blanks) - which i was waiting for the entire movie.
nice person: persian daughter who bought the blanks
nice person: detective, provided he reneges on his deal to withhold exculpatory evidence now that his brother is dead
nice person: director who let his wife be molested.
nice person: not-racist cop, who rightfully shot his crazy hitchhiker (but he could have done MORE! we ALL could do MORE!)
overall: about the 300th best movie i’ve watched. pretty and well-acted. if you liked crash, magnolia does it better (equally believable and with less moralizing)
Primitive Scala
A lower level (Scala) solution to this puzzle:
There’s a path from the car to the exit on the right following the lines at corners, and emerging only at arrowheads. This is just a directed graph. The two directions on a street segment are independent nodes. I stole from Daniel Sobral the text representation of the graph, and implemented breadth first (shortest path) search just like he did. The difference is that I explicitly represent the directed graph, and I do so in typical programming contest style. He has a nice ASCII art printout; I just used simple debug prints to verify that I built the graph right.
final case class Roadmap(rows: Int,cols: Int) {
type Onto=Array[Int] // adjacency list
type Roads=Array[Onto] // vertices (unidirectional road segments)
val RC=rows*cols
val ndir=4
val goal=RC*ndir
val DRC=RC*ndir
val roads=new Roads(DRC) // one dimensional array allows for simpler adjacencies at the cost of uglier addressing
/* separate road segment for each direction right left down up (0 1 2 3)
so that (row,col) is the location of the upper/left corner of that segment (be consistent!)
*/
def index(rldu: Int,row: Int, col: Int) = RC*rldu+row*cols+col
def coord(i: Int) = { val rldu=i/RC;val r=i%RC;(rldu,r/cols,r%cols) }
val R=0
val L=1
val D=2
val U=3
val NONE=4
def c2dir(a: Char) = a match {
case 'R' => R
case 'L' => L
case 'D' => D
case 'U' => U
case _ => NONE
}
def dir2c(d: Int) = "RLDUN"(d)
def coordname(t: (Int,Int,Int))=t match {case (rldu,row,col) => "%c%d%d".format(dir2c(rldu),row,col)}
def i2s(i: Int)=coordname(coord(i))
/* set the transitions that meet in the intersection on row r, col c.
exits(dir) if there’s an arrowhead; undir is a list of pairs of linked dirs */
def setCorner(r: Int, c: Int, exits: Array[Boolean], undirs: Array[(Int,Int)], goaldir: Int) {
// remember, r,c gives the upper/left part of road
val froms=Array(index(L,r,c),index(R,r,c-1),index(U,r,c),index(D,r-1,c))
// to approach from the right, you head left. froms and tos are identical except opposite directions
val tos= Array(index(R,r,c),index(L,r,c-1),index(D,r,c),index(U,r-1,c))
val ways= Array.fill(ndir)(new collection.mutable.ArrayBuffer[Int](ndir))
def checkdir(from: Int,to: Int) {
if (goaldir==to) ways(from)+=goal
else if (exits(to)) ways(from)+=tos(to)
}
for ((from,to) <- undirs) {
checkdir(from,to)
checkdir(to,from)
}
for ((f,w) <- froms zip ways)
if (w.size>0) roads(f)=w.toArray
// since we didn’t pad the border, we’ve considered indices that are off the array.
// user input must not ask us to move from somewhere off the array
// a padded border would prevent this, and allow a single start and goal state
}
def s2dirs(s: String) = (c2dir(s(0)),c2dir(s(1)))
def readCorner(r: Int, c: Int, desc: String) {
val a=desc.trim split “:”
val exits=new Array[Boolean](ndir)
var lastdir=NONE
var goaldir=lastdir
for (ch <- a(0)) {
if (ch=='!') goaldir=lastdir
else {
val d=c2dir(ch)
if (d!=NONE) {
exits(d)=true
lastdir=d
}
}
}
setCorner(r,c,exits,a(1).trim split " " map s2dirs toArray,goaldir)
}
def read(s: String) = {
for ((line,row) <- (s split "\n" filter (_.trim.nonEmpty)).zipWithIndex) {
for ((corner,col) <- (line split ",").zipWithIndex)
readCorner(row,col,corner)
}
}
def path2goal(starts: Int*) = {
type Path=List[Int]
val q=new collection.mutable.Queue[Path]
val seen=new Array[Boolean](DRC)
starts foreach (seen(_)=true)
val s=starts map (x=>List(x))
q.enqueue(s: _*)
var p=q.head
while (p.head != goal) {
val h=p.head
seen(h)=true
val n=roads(h) filterNot (i => i != goal && seen(i)) map (_::p) // skip segments already expanded; the goal index has no entry in seen
q enqueue (n: _*)
p=q dequeue
}
p
}
def prettypath(p: List[Int]) = p reverseMap i2s
}

object car {
/* from http://dcsobral.blogspot.com/2009/09/puzzle.html */
val graphString = """|
|DR: DR, RLD: RD LD, L: DL LR, LR: DR LR, DL: DL
|UR: LU RU DU LR, LRD: LR UD LU RU LD RD, UDLR: UD LR UR LD, UDLR: UD LR RD, UL: UD UL
|UDR: UD RU RD, UDLR: LR LD DR RU, UD: UD UL DL, UDR: UD UR, UD: UD LD
|UDR: UD RU RD, UDLR: UR UL LR RD, UR: UD LR LU RU DR, ULR: LR UR UL DR DL, UDLR!: UD UL DR
|UR: UR, ULR: UL UR LR, ULR: UL UR RL, ULR: UL UR RL, UL: UL
|""".stripMargin
def main(args: Array[String]) {
val r=Roadmap(5,5)
r.read(graphString)
println(r.prettypath(r.path2goal(r.index(r.R,1,0),r.index(r.U,0,0))))
}
}
The format is explained on Daniel’s blog, but I found it odd enough that I’ll mention that it’s a matrix of intersections with EXITS: UNDIRECTED_ADJACENCIES. EXITS is the list of directions that have a departing arrowhead. The adjacencies are undirected to save typing (UR means both UR and RU). I added a ! following the arrowhead if it leads to the goal, since I didn’t want to hard code the exit location.
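As a quick illustration of the format (a standalone check of my own, not using the Roadmap class above), here is how one corner description from graphString decomposes:

```scala
object CornerFormat {
  def main(args: Array[String]): Unit = {
    val desc  = "UDR: UD RU RD"           // a corner description taken from graphString above
    val parts = desc.split(":").map(_.trim)
    val exits = parts(0).toSet            // Set(U, D, R): directions with a departing arrowhead
    val pairs = parts(1).split(" ").map(p => (p(0), p(1)))
    println(exits)
    println(pairs.toList)                 // List((U,D), (R,U), (R,D)); each pair is linked in both directions
  }
}
```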
(SPOILER: R10, R11, D12, D22, R32, U23, U13, U03, R03, D04, L13, D13, R23, D24, D34, L43, L42, L41, L40, U30, U20, R20, D21, L30, D30, R40, R41, R42, R43, U34, where 32 means the road segment with its upper left at row 3 and column 2)
Tree Pattern Matching
Fast Translation Rule Matching for Syntax-based Statistical Machine Translation
in EMNLP09 (Zhang et al.) gives a practical method for top-down enumeration of tree pattern matches against the root of a forest (compact regular set of trees). This problem arises in MT when you first context-free-parse your source material, and then transform its syntax tree into another language - since parsers aren’t perfect, you translate all the likely parses rather than just the most likely, and pick the most likely and fluent output. Apparently the baseline pattern matching methods used are really terrible:
- EITHER: Up to the largest rule size/shape iterate over the (exponential in that size) many forest-topping label sequences, checking each against a rule label sequence hash.
- OR: Iterate over all the rules, checking each one independently against the forest.
They give an improvement to the latter approach: if you’ve already matched the first 3 levels of a 4-height rule before, don’t repeat the work. (The diagrams are pretty good; refer to them if you want to make sense of what follows). This is 20 times faster, and makes their overall MT system twice as fast. If you’re actually using one of the above horrible baseline approaches, you would do well to emulate this paper. I suspect that people are in practice using a hybrid of the two: exhaustively unpack the forest for a set of (small) top-of-tree-shapes e.g. x(x(x,x),x), retrieving potential rule sets by labels-hash, and then iterating over the resulting subset. Unfortunately, they only compare their performance against the two (stupid) extremes. I’ll also note that it always makes sense to (reversibly) binarize both the forest and the tree patterns in the same way, if you haven’t already, e.g. A(B,C,D) binarized would be: A(B,C_D(C,D)) where C_D is a virtual nonterminal that only appears in the context C_D(C(…),D(…)).
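As a small aside on that last point, here is a minimal sketch of the reversible binarization A(B,C,D) => A(B,C_D(C,D)). The Tree type, the function names, and the underscore-joined virtual labels are my own conventions, not anything from the paper:

```scala
// Right-binarization of an n-ary tree; reversible as long as real labels never contain '_'.
case class Tree(label: String, children: List[Tree] = Nil)

def binarize(t: Tree): Tree = t.children match {
  case Nil | _ :: Nil | _ :: _ :: Nil =>              // 0, 1 or 2 children: nothing to fold at this node
    Tree(t.label, t.children map binarize)
  case first :: rest =>                               // 3+ children: fold the tail into a virtual node
    Tree(t.label, List(binarize(first), binarize(Tree(rest.map(_.label).mkString("_"), rest))))
}

def unbinarize(t: Tree): Tree = {
  // splice the children of virtual (underscore-named) nodes back into their parent
  val kids = t.children map unbinarize flatMap (c => if (c.label.contains("_")) c.children else List(c))
  Tree(t.label, kids)
}

// binarize(Tree("A", List(Tree("B"), Tree("C"), Tree("D"))))
//   == Tree("A", List(Tree("B"), Tree("C_D", List(Tree("C"), Tree("D")))))
```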
The avoidance of (some) repeated work is accomplished by 1) organizing tree patterns (rule left-hand sides) to reflect these shared upper levels and 2) performing exhaustive top-down breadth-first expansion of pairs of (pattern state, forest states - one for each leaf in the pattern state), where the forest-states tuple may grow exponentially with depth. Since we’re interested in the subforests aligned to the leaves of our tree pattern, we would take the forest states for pattern states corresponding to complete patterns, and perform some computation (MT in this case) on the subforests + pattern - we don’t just want the set of matching tree patterns.

This paper could have been clearer and shorter. The pseudocode has quirks. It also seems to cite without reading: Decoding Complexity in Word-Replacement Translation Models (Knight 1999) does not show that MT in general is NP-hard, only that IBM Model 1 (one of the simplest word-to-word translation models, with very free word reordering) is. MT based on parsing, without global features, is actually polynomial (albeit of high degree if you have e.g. target 5gram probabilities).

Even though the forests arising from a parser will have the same root label for all trees in a state (e.g. an NP covering words 1-2 has only NPs), I prefer the regular tree grammar formalism for a set of trees, where a grammar state can have productions with different tree roots. This paper handles the case where each state has a single unique tree root, since that’s what their parser gives them. A general RTG production of rank k would need to be unpacked into N^k productions to achieve this, where N is the size of the tree label alphabet. It’s better to modify your algorithm to use (root-label, forest-state) tuples instead of just (forest-state) whenever you want to look at only same-root trees. I’m using “state” instead of the usual “nonterminal” to avoid confusion with the labels of the parse tree (in parsing, NP is called a “nonterminal” of the tree NP(NN(dog)), since it’s not a “terminal”, or leaf).

Everything in the paper is called “hyper-X”. There’s even a section called “hyper-node, hyper-path and hyper-tree”. I appreciate the enthusiasm, but these are not great names, especially since they’re (misleadingly) suggestive of vertex/node, hyperpath, and hypertree in hypergraphs. The forest is also given as a set of “hyper-edges” (without definition, but obviously they mean directed B-hyperedges with ordered tails). Obviously if the parse forest has probabilities attached to productions, you will have accumulated those in the process of finding a particular match.

The authors could have cited, for the related problem of matching a set of tree patterns against a single tree, Pattern Matching in Trees (Hoffman and O’Donnell 1982). That work deals with ranked tree labels (it doesn’t provide a way to say “subtree rooted in an NP of any arity”; you’d have to have NP(x), NP(x,x), NP(x,x,x), …), but that’s always a trivial change. As far as I can tell, it’s impossible to use the top-down matching scheme in that paper efficiently on forests.

Hoffman and O’Donnell provide a helpful framework: the (immediate) subsumption graph of tree patterns. This is a directed acyclic graph with edges from a to b if a>b, where a>b (“subsumes”) iff \forall f: a(f) => b(f), and I’ve used a and b as predicates over trees or tree-sets f. Immediate just means that only the necessary edges are kept (subsumption is transitive). In the immediate subsumption graph pictured below, v means a variable, i.e. its label can be anything.

Top-down matching procedures that report each assignment of subforests to tree pattern variables exactly once will need to turn the subsumption graph into a tree. Most likely you’ll want to do this in advance. I’ve done this in a depth-first leftmost-expansion fashion by turning each tree pattern into a string, and combining the results into a trie with a branching choice of “where to expand next”, implicitly giving a tree subgraph of the immediate subsumption graph (unpublished work). Since I allow tree labels to have variable arity, I separate the choice of a label and its arity (the latter may never be specified). My algorithm for matching this subsumption subgraph against a forest is the same as Zhang et al.’s, except that they’re breadth-first, and they (unnecessarily) expand the whole level at once. That is, their subsumption subgraph is not immediate. They lose some sharing as a result, although it would be easy to fix this. You can devise pattern sets that make a particular subsumption graph->tree mapping perform badly, and I don’t have any intuition as to whether theirs or mine will perform better on particular sets (e.g. MT rules containing most naturally occurring tree fragments up to a fixed size or depth).

The lost sharing from top-down matching any particular subsumption tree can be avoided entirely with bottom-up matching. Hoffman+O’Donnell give a method for building a bottom-up deterministic tree recognizer, which is certainly optimal with respect to sharing, and probably time, if you can afford to precompute and store the (possibly exponential) monster. Bottom-up tree recognizers can match against forests beautifully, since regular tree grammars are closed under intersection. Unlike top-down matching, the result of bottom-up matching would just be a set of rule (fragments) for each forest node. To get the subforests, you’d have to traverse top-down (this should be pretty fast, since you’re constrained by a specific tree pattern). In a future post, I’ll describe a bottom-up patterns-against-forest (or tree) matching that doesn’t explicitly determinize beforehand. I’ll also note that the type of sharing you get in bottom-up matching is avoiding the work of repeatedly matching subtrees that are contained in different rules. On the other hand, subsumption avoids repeated work in matching shared tops of trees. That is, a pattern that subsumes another is always larger (turning some leaves into full subtrees).

Also, although I haven’t yet read it, A Simple Tree Pattern-Matching Algorithm claims to improve on Hoffman+O’Donnell. It’s unlikely that this would be useful for forest matching.
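For concreteness, here is the basic operation all of the above accelerates, on a single tree and a single pattern (my own sketch, reusing the Tree case class from the binarization snippet earlier; Pattern, Var, Lab and matchAt are my names, and none of the sharing or forest machinery discussed above appears here):

```scala
sealed trait Pattern
case object Var extends Pattern                               // "v" above: label and subtree can be anything
case class Lab(label: String, kids: List[Pattern]) extends Pattern

// Returns the subtrees aligned to the pattern's variables (left to right), or None on mismatch.
def matchAt(p: Pattern, t: Tree): Option[List[Tree]] = (p, t) match {
  case (Var, _) => Some(List(t))
  case (Lab(l, kids), Tree(tl, tkids)) if l == tl && kids.length == tkids.length =>
    val sub = kids.zip(tkids).map { case (kp, kt) => matchAt(kp, kt) }
    if (sub contains None) None else Some(sub.flatten.flatten)
  case _ => None
}

// matchAt(Lab("NP", List(Var, Var)), Tree("NP", List(Tree("DT"), Tree("NN"))))
//   == Some(List(Tree("DT"), Tree("NN")))
```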
EMNLP MT System Combination Papers
System combination can give large improvements in MT quality, but in practice you have to ask people to translate material for you on their system.
Duan et al. (Feature Subspace Method for SMT System Combination) take a single system and create a dozen variants of it (each leaving out a single feature, or using a lower-order language model), and gain about 1 BLEU by doing sentence-level rescoring using ngram-agreement and internal model features (weights trained on a dev set). This is more than the modest 0.25 BLEU I’ve obtained by doing the same thing for a single system’s nbests; they didn’t report their single-system reranking gain but it’s probably similar. When combining two actual systems (Hiero-style and phrase-based), they have to leave out the internal features (which are difficult or impossible to evaluate on an unrelated system output), but still get an additive ~1 BLEU when using 2*12 rather than 2*1 systems. They also found that 20-50 nbests were best (this agrees with my experience in sentence reranking, but obviously that’s dependent on the search errors, duplicate derivations, and other properties of the actual system and search procedure; sometimes I’ve found a small benefit from up to 1000-bests). This is an excellent result and easy (if tedious) to apply to your own system(s). I don’t find their theoretical motivation particularly compelling. It seems to me that all the benefits are explained in terms of Bayesian averaging over multiple feature-weight vectors.
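For what it’s worth, the ngram-agreement part of that rescoring is simple to sketch (my own simplification; the 4-gram cap, the equal weighting across orders, and the single lambda weight are assumptions - the papers tune feature weights on a dev set):

```scala
object NgramAgreement {
  def ngrams(words: Seq[String], n: Int): Seq[Seq[String]] =
    if (words.length < n) Seq.empty else words.sliding(n).toSeq

  // Fraction of this hypothesis's ngrams (orders 1..maxN, averaged) that appear in any other hypothesis.
  def agreement(hyp: Seq[String], others: Seq[Seq[String]], maxN: Int = 4): Double =
    (1 to maxN).map { n =>
      val mine = ngrams(hyp, n)
      if (mine.isEmpty) 0.0
      else mine.count(g => others.exists(o => ngrams(o, n).contains(g))).toDouble / mine.size
    }.sum / maxN

  // hyps are (model score, tokenized translation) pairs pooled from all systems/variants.
  def rerank(hyps: Seq[(Double, Seq[String])], lambda: Double = 1.0): Seq[String] =
    hyps.zipWithIndex.map { case ((score, words), i) =>
      val others = hyps.zipWithIndex.collect { case ((_, w), j) if j != i => w }
      (score + lambda * agreement(words, others), words)
    }.sortBy(-_._1).map(_._2.mkString(" "))
}
```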
Feng et al. (Lattice-based System Combination for SMT) take the nice word-level (“sausage graph”) confusion network IHMM system combination method of Xiaodong He, and take the obvious step of generating lattices. They get another 0.5-1 BLEU over his result. They build the lattice from word to word alignments against a backbone translation, as in the CN case, so it’s impressive that they get a meaningful lattice. My own thought would be to have the MT systems output alignment information directly, but their way makes integration easier. It’s also possible to get much of the benefit from a lattice’s phrases by using system preferences for ngrams as a feature on top of the unigram CN; Zhao & He get almost identical improvement by doing so (they also add the usual ngram-agreement reranking feature, where you treat your weighted nbest as an English LM, into their CN decoder, which is smart). If you have an existing CN system, it’s probably easier to do this than switch to lattices (there’s no reason not to try to combine the methods, but they do very similar things already). He & Toutanova (Joint Optimization for MT System Combination) also amend He’s original approach, and get a larger improvement (1.5 BLEU). The original CN IHMM approach was a pipeline of several greedy/hard decisions; this work fixes that by jointly searching possible reorderings/alignments/selections of phrases from the nbest translations. There’s probably not any way to “add lattices” on top of this; there’s no explicit backbone, and they already use bigram system voting. I’d try new features before considering lattices (e.g. nbest ngram LM, 3gram voting). They show that the integrated alignment search is important; they lose 0.3 BLEU if they restrict the search to Viterbi union alignments.
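To make the confusion-network idea concrete, here is just the final voting step, assuming the hard part (aligning every system’s words to a backbone translation) has already been done; the column data is purely illustrative:

```scala
object ConfusionNetworkVote {
  // Each column holds the competing words (or "" for an empty arc) at one backbone position.
  def decode(columns: Seq[Seq[String]]): String =
    columns
      .map(col => col.groupBy(identity).maxBy(_._2.size)._1) // majority vote per position (ties arbitrary)
      .filter(_.nonEmpty)                                    // drop positions where the empty arc wins
      .mkString(" ")

  def main(args: Array[String]): Unit = {
    val columns = Seq(                                       // made-up outputs from three systems
      Seq("the", "the", "a"),
      Seq("tax", "tax", "tax"),
      Seq("cut", "cuts", "cut"),
      Seq("", "now", ""))
    println(decode(columns))                                 // prints: the tax cut
  }
}
```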
A Wrong “Alternative” 0/1 Knapsack Solver
// Initial guess: the knapsack for wt-1.
bestVal[wt] = bestVal[wt-1];

That already causes some sub-optimal solutions. I fixed that, but that only reduced the frequency of wrong solutions. For the input (the knapsack capacity, 204, then the number of items and one weight-value pair per line):

204
9
2 2
81 81
82 82
182 182
142 142
11 11
31 31
61 61
101 101
The “alternative solution” given is:

(142, 142)
(61, 61)
Total weight: 203
Total value: 203

But this is better: 11 + 31 + 61 + 101 = 204 (total weight 204, total value 204).

The problem: the “alternative” solution has already used 31 in summing to (204-31), 61 in summing to (204-61), and 101 in summing to (204-101). If the input is ordered differently, this may not happen - all 3 of those coincidences need to occur for the algorithm to fail. The algorithm wrongly commits to just one way of reaching a given sum, unlike the standard (correct) solution.
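For contrast, here is a minimal sketch of the standard (correct) solution alluded to above - the usual 0/1 knapsack DP that iterates over items rather than over ways of reaching a sum, so it never commits to a single representative subset per weight:

```scala
object Knapsack {
  // best(w) = best achievable value using total weight <= w.
  // Items are (weight, value); the inner loop runs weights downward so each item is used at most once.
  def solve(capacity: Int, items: Seq[(Int, Int)]): Int = {
    val best = Array.fill(capacity + 1)(0)
    for ((wt, v) <- items; w <- capacity to wt by -1)
      best(w) = best(w) max (best(w - wt) + v)
    best(capacity)
  }

  def main(args: Array[String]): Unit = {
    val items = Seq((2,2), (81,81), (82,82), (182,182), (142,142), (11,11), (31,31), (61,61), (101,101))
    println(solve(204, items)) // prints 204 (= 11+31+61+101), not the 203 above
  }
}
```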