Explanation for "saddle" pattern
Submitted by Kathy Maffei on Thu, 2006-03-23 14:30
If you'll recall, we had a discussion yesterday in class about the "saddle" pattern seen when testing a range of input values (x & y each from 0 to 1) for the xor problem. There was some question as to why the saddle always ran like "/" rather than "\". Basically, the center range of x & y insisted on returning high values, even when Doug added training data for (0.5,0.5) to return 0.

My intuition was that it had something to do with the calculations involved in adjusting the network's weights during back-propagation. Math isn't my forte (unfortunately, for a comp sci major!), but I'm pretty sure I've confirmed that the backprop algorithm is biased toward answers of 1 over 0. Let me know if there's a hole in my logic here.

I've written out a few examples and posted them online in case anyone would like to see them. For each of the 4 examples, I compared two cases where the error was the same distance from the goal (desiredOutput) but in opposite directions. Basic logic (at least mine!) would suggest that regardless of which direction (positive or negative) you are from the goal, you would want to adjust by the same amount (negative or positive) for comparable distances. But in all but one case, the weightAdjustment was very different for a negative error than for a positive error of the same absolute value (distance from the desiredOutput). The only case where the weightAdjustments had the same absolute value was Example 2, where the actualOutput was 0.5 for each and the goals were 1 and 0 - same distance (pos & neg) and same actualOutput.

Why? Because for some reason the actualOutput itself, and not just the error, is factored into the final weightAdjustment. This is what causes the bias. I'm sure Doug will be able to explain why the algorithm is configured this way - there must be a good reason. And like I said, maybe there's a hole in my logic. Any thoughts?
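
For anyone who wants to try the comparison themselves, here is a minimal sketch. It assumes the network uses the standard delta rule for a sigmoid output unit, where the adjustment is the error times actualOutput * (1 - actualOutput). That formula isn't written out in the post, so treat it as an assumption; the function name, the learning rate, the input activation, and the sample values are all made up for illustration and are not the posted examples.

```python
def weight_adjustment(desired_output, actual_output,
                      learning_rate=1.0, input_activation=1.0):
    """Adjustment for one weight into a sigmoid output unit.

    Assumes the standard delta rule; the class's network code may differ.
    """
    error = desired_output - actual_output
    # Slope of the logistic sigmoid, expressed through the unit's own output.
    # This is the term that brings actualOutput into the adjustment.
    slope = actual_output * (1.0 - actual_output)
    return learning_rate * error * slope * input_activation


# Hypothetical case: same goal, errors of +0.2 and -0.2.
below = weight_adjustment(desired_output=0.6, actual_output=0.4)
above = weight_adjustment(desired_output=0.6, actual_output=0.8)
print("goal 0.6, output 0.4:", below)
print("goal 0.6, output 0.8:", above)

# Case like Example 2: output 0.5, goals of 1 and 0.
toward_one = weight_adjustment(desired_output=1.0, actual_output=0.5)
toward_zero = weight_adjustment(desired_output=0.0, actual_output=0.5)
print("goal 1.0, output 0.5:", toward_one)
print("goal 0.0, output 0.5:", toward_zero)
```

Under that assumption, the first pair comes out at roughly 0.048 and -0.032, so the sizes differ even though both errors are 0.2 away from the goal. The second pair has equal sizes (0.125 and -0.125) because the actualOutput, and so the slope term, is the same in both cases. Note that the slope term is largest at an output of 0.5 and shrinks toward both 0 and 1, which may be worth checking against the posted examples.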