We present the design and implementation of the k-means clustering algorithm on widely available graphics processing units (GPUs). We also analyze how the proposed methods scale with the number and dimensionality of the data points and with the number of clusters, and we compare our results with the best currently available GPU implementations and with a 24-way threaded parallel CPU implementation.