Nonlinear underactuated systems represent an important and general class of problems in robotics that has proven largely intractable for analytical and numerical control design paradigms. Machine learning approaches to underactuated control, which employ approximations to make numerical optimal control techniques tractable, will have broad applications from walking robots to the control of aerial vehicles and fluid systems. Here we pursue a careful analysis of these algorithms as applied to linear time-invariant (LTI) systems. This analysis will contribute fundamental results on the convergence rates of different learning algorithms, and on the design of robot mechanisms and input-output "features" that maximize the rate of convergence. Theoretical results are coupled with experiments on a strongly nonlinear control problem: a two-link bipedal robot walking over rough terrain. Acquiring a near-optimal feedback policy for this robot would produce a result that is surprising and compelling (because a simple robot will be traversing more complicated terrain than has been demonstrated by any humanoid), but also clear and revealing (because the simplicity of the robot exposes the fundamental problems in walking and nothing more). Both theory and experiment allow us to perform a careful comparison between machine learning control and more mature approaches from modern control theory.
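To make the LTI setting concrete, the following is a minimal sketch of one standard learning-for-control pipeline on an LTI system: fit the system matrices from a random-input trajectory by least squares, then compute a feedback gain from the estimated model via the discrete-time Riccati recursion. The double-integrator matrices, cost weights, and function names here are illustrative assumptions, not taken from the text, and this is only one of several algorithm families whose convergence one might analyze.

```python
import numpy as np

def collect_data(A, B, T=200, seed=0):
    """Simulate x[t+1] = A x[t] + B u[t] under random excitation."""
    rng = np.random.default_rng(seed)
    n, m = B.shape
    X = np.zeros((T + 1, n))
    U = rng.normal(size=(T, m))
    X[0] = rng.normal(size=n)
    for t in range(T):
        X[t + 1] = A @ X[t] + B @ U[t]
    return X, U

def fit_lti(X, U):
    """Least-squares fit of [A B] from one trajectory."""
    Z = np.hstack([X[:-1], U])                      # regressors (x_t, u_t)
    Theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
    n = X.shape[1]
    AB = Theta.T
    return AB[:, :n], AB[:, n:]

def dlqr(A, B, Q, R, iters=500):
    """LQR gain via iteration of the discrete-time Riccati recursion."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Hypothetical example: a discretized double integrator.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.eye(1)

X, U = collect_data(A, B)
A_hat, B_hat = fit_lti(X, U)
K_true = dlqr(A, B, Q, R)        # gain from the true model
K_hat = dlqr(A_hat, B_hat, Q, R) # gain from the learned model
```

With noise-free data and persistently exciting inputs, the least-squares estimate recovers the true matrices and the two gains coincide; questions of how fast such estimates (and the resulting controllers) converge as data accumulates are exactly the kind of rate results the proposed LTI analysis targets.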