Monday, February 28, 2011

Dr. Martinez shops bazaars

In 2003, Dr. Tony Martinez co-authored The General Inefficiency of Batch Training for Gradient Descent Learning. The ideas of online learning are analogous to bazaar-style development. Batch training is comparable to cathedral-style development, where changes are released only after an epoch of improvement. Online learning, on the other hand, is like the bazaar: every fix is integrated immediately, and the many small changes converge on the goal more efficiently. The paper also shows that on large training sets, batch training takes orders of magnitude longer to converge than online learning. So it is with development styles.

Here lies the analogy to Brooks's Law: on a large team, complexity rises with the square of the team size, since every pair of people is a potential communication path, while the work done rises only linearly. In machine learning, the analogous complexity lives in the size of the training set.

These analogies are not perfect. The reasoning behind cathedral-style development is that users should see as few bugs as possible, while batch training withholds updates until the end of an epoch so that each step follows the true gradient of the total error. In addition, vanilla online learning is not parallel, though it can be parallelized. Yet whether or not the analogy is complete, results arrive more slowly under both the cathedral and the batch approach. In software development and machine learning alike, this scaffolding slows progress and adds overhead.
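To make the contrast concrete, here is a minimal sketch of the two update schemes on a toy linear regression problem. The function names, learning rate, and synthetic data are illustrative assumptions of mine, not taken from the paper:

import numpy as np

def batch_epoch(w, X, y, lr):
    # Batch (cathedral): accumulate the gradient over the whole
    # training set, then apply a single update at epoch's end.
    grad = X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def online_epoch(w, X, y, lr):
    # Online (bazaar): update the weights after every example,
    # so each "fix" is integrated immediately.
    for xi, yi in zip(X, y):
        grad = xi * (xi @ w - yi)
        w = w - lr * grad
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=1000)

w_batch = np.zeros(3)
w_online = np.zeros(3)
for epoch in range(5):
    w_batch = batch_epoch(w_batch, X, y, lr=0.1)
    w_online = online_epoch(w_online, X, y, lr=0.1)

# After the same number of epochs, the online weights are typically
# much closer to true_w: many small corrections converge faster than
# one large, averaged step per epoch.
print("batch :", w_batch)
print("online:", w_online)

Running this, the online learner is usually near the true weights within a few epochs, while the batch learner is still closing the gap, which is exactly the inefficiency the paper describes and the bazaar analogy captures.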
