Small sample is an acute problem in many application domains, which may be partially addressed by feature selection or dimensionality reduction. For the purpose of distance learning, we describe a method for feature selection using equivalence constraints between pairs of datapoints. The method is based on L1 regularization and optimization. Feature selection is then incorporated into an existing non-parametric method for distance learning, which is based on the boosting of constrained generative models. Thus the final algorithm employs dynamical feature selection, where features are selected anew in each boosting iteration based on the weighted training data. We tested our algorithm on the classification of facial images, using two public domain databases. We show the results of extensive experiments where our method performed much better than a number of competing methods, including the original boosting-based distance learning method and two commonly used Mahalanobis metrics.