**RCB-SOCP** is a clustering-based algorithm for classification of large datasets [1,2]. The key features are:

The training time complexity is O(m), where m is the number of training examples.

The training data need not be stored in memory.

It uses moments up to second order, i.e. the mean and variance, of the clusters to build the optimum classifier. Hence it performs better than methods that use only mean information.

It is robust to moment estimation errors. Hence it can be used with any online clustering algorithm. The scalability of **RCB-SOCP** can be improved by choosing faster clustering algorithms.

It can be extended to non-linear classifiers [3].
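As a rough illustration of why a single pass over the data suffices and the training points need not be kept in memory, the sketch below keeps only a count, a linear sum and a sum of squares per cluster, from which the mean and variance follow. It mirrors the clustering-feature idea used by BIRCH but is not the birch.tgz code; the class name `ClusterSummary`, its methods, and the choice of a per-dimension variance are assumptions made for illustration.

```python
from typing import List, Sequence


class ClusterSummary:
    """Hypothetical running summary of one cluster: count, sum, sum of squares."""

    def __init__(self, dim: int) -> None:
        self.count = 0
        self.linear_sum: List[float] = [0.0] * dim
        self.square_sum: List[float] = [0.0] * dim

    def add(self, x: Sequence[float]) -> None:
        """Fold one training point into the summary; the point can then be discarded."""
        if len(x) != len(self.linear_sum):
            raise ValueError("point has the wrong dimension")
        self.count += 1
        for i, xi in enumerate(x):
            self.linear_sum[i] += xi
            self.square_sum[i] += xi * xi

    def mean(self) -> List[float]:
        """Per-dimension mean of all points seen so far."""
        if self.count == 0:
            raise ValueError("cluster is empty")
        return [s / self.count for s in self.linear_sum]

    def variance(self) -> List[float]:
        """Per-dimension population variance, E[x^2] - (E[x])^2 (an assumed choice)."""
        mu = self.mean()
        # Clamp at zero to absorb small negative values caused by rounding.
        return [max(0.0, sq / self.count - m * m)
                for sq, m in zip(self.square_sum, mu)]


# Usage: stream points through the summary one at a time.
summary = ClusterSummary(dim=2)
for point in [(1.0, 2.0), (3.0, 2.0), (5.0, 2.0)]:
    summary.add(point)
print(summary.mean(), summary.variance())
```

Each point is touched once and memory grows with the number of clusters rather than the number of training examples, which is consistent with the O(m) training time stated above.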

The steps involved are:

Cluster the positive and negative training data points efficiently using any online clustering algorithm, and estimate the moments of each cluster: mean and variance.

In our experiments [1] we used the following BIRCH program for clustering: birch.tgz. This is a slightly modified version of the original program by [4].

Solve the CB-SOCP/RCB1-SOCP/RCB2-SOCP formulations [1], which use both the mean and variance of the training data clusters to build the discriminating hyperplane.

We implemented the above formulations in SeDuMi [5] for Matlab. The code is available here. See the README file for details.

Once the discriminating hyperplane w'x - b = 0 is built, the label of a new test example x_t is sign(w'x_t - b).
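The labelling rule in the last step can be sketched in a few lines. This is only an illustration, not the released Matlab code: the function name `predict_label` is hypothetical, w and b are assumed to come from an already solved formulation, and mapping a score of exactly zero to +1 is an arbitrary convention chosen here.

```python
from typing import Sequence


def predict_label(w: Sequence[float], x_t: Sequence[float], b: float) -> int:
    """Return sign(w'x_t - b) as +1 or -1 (hypothetical helper)."""
    if len(w) != len(x_t):
        raise ValueError("w and x_t must have the same dimension")
    # Signed score of the test point relative to the hyperplane w'x - b = 0.
    score = sum(wi * xi for wi, xi in zip(w, x_t)) - b
    # Assumption: a point lying exactly on the hyperplane is labelled +1.
    return 1 if score >= 0 else -1


# Usage with made-up values for w, b and the test point.
print(predict_label([1.0, -1.0], [2.0, 0.5], 1.0))
```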

The synthetic dataset, D, used in our scalability experiments [1] is available here. The scripts used to generate synthetic data, D1, D2, can be downloaded from here.

