Corresponding author:
Department of Computer Science, Columbia University, 450 Computer
Science Building, Mail Code 0401, 1214 Amsterdam Avenue, New York, NY
10027
The scaled dot product
kernel gives better performance using a threshold that is optimized on
the training set, so we report results for this threshold, rather than
a threshold of 0.
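
As a rough illustration of choosing a threshold on the training set, the sketch below picks the cut-off that minimizes the number of training misclassifications. The function name `best_training_threshold` and the error criterion are assumptions made for this example; the text does not state which criterion was actually optimized.

```python
import numpy as np

def best_training_threshold(scores, labels):
    """Return the threshold on `scores` with the fewest training errors.

    Hypothetical helper: labels are +1/-1, and an example is predicted
    positive when its score is strictly greater than the threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    distinct = np.unique(scores)  # sorted distinct score values
    # Candidate thresholds: below all scores, between neighbours, above all.
    candidates = np.concatenate((
        [distinct[0] - 1.0],
        (distinct[:-1] + distinct[1:]) / 2.0,
        [distinct[-1] + 1.0],
    ))
    errors = [np.sum(np.where(scores > t, 1, -1) != labels) for t in candidates]
    return float(candidates[int(np.argmin(errors))])

# Toy scores that are all positive, so a threshold of 0 separates nothing.
toy_scores = [0.2, 0.4, 0.6, 0.8]
toy_labels = [-1, -1, 1, 1]
chosen = best_training_threshold(toy_scores, toy_labels)
```

On this toy data a threshold of 0 labels every example positive, while the selected threshold falls between the two classes.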
An alternative formulation of support vector machines does not use an
explicit bias b but makes the bias implicit by adding 1 to the
kernel function. In this case, the hyperplane goes through the
origin of the augmented feature space, and the optimization does not
require the constraint $\sum_i \alpha_i y_i = 0$. We use this implicit
bias method in our experiments.
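
The effect of adding 1 to the kernel can be checked numerically. The sketch below is illustrative only: it uses a plain dot-product kernel and arbitrary placeholder weights `alpha` (not the result of any SVM training), and all function names are hypothetical. It shows that the augmented kernel yields the same decision values as the original kernel plus a constant bias equal to the sum of `alpha[i] * y[i]`, and that it corresponds to appending a constant feature of 1 to every input.

```python
import numpy as np

def linear_kernel(X, Z):
    """Plain dot-product kernel between the rows of X and the rows of Z."""
    return X @ Z.T

def implicit_bias_kernel(X, Z):
    """Same kernel with 1 added, which folds the bias into the kernel."""
    return linear_kernel(X, Z) + 1.0

def decision_values(alpha, y, K_train_test):
    """f(x) = sum_i alpha_i * y_i * K(x_i, x), with no explicit bias term."""
    return (alpha * y) @ K_train_test

rng = np.random.default_rng(0)
X_train = rng.normal(size=(6, 3))
X_test = rng.normal(size=(4, 3))
y = np.array([1.0, -1.0, 1.0, -1.0, 1.0, 1.0])
alpha = rng.uniform(0.0, 1.0, size=6)  # placeholder weights, not trained

# Decision values with the augmented kernel and no explicit bias.
f_implicit = decision_values(alpha, y, implicit_bias_kernel(X_train, X_test))

# Equivalent form: original kernel plus the bias implied by the weights.
implied_bias = np.sum(alpha * y)
f_explicit = decision_values(alpha, y, linear_kernel(X_train, X_test)) + implied_bias

# Adding 1 to a dot-product kernel equals appending a constant feature of 1.
X_train_aug = np.hstack([X_train, np.ones((X_train.shape[0], 1))])
X_test_aug = np.hstack([X_test, np.ones((X_test.shape[0], 1))])
K_aug = linear_kernel(X_train_aug, X_test_aug)
```

Because the bias is absorbed into the weights this way, the sum of `alpha[i] * y[i]` is free to take any value, which is why the equality constraint is not needed in this formulation.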