¡Hola!

This is to announce that [kmcuda](https://github.com/src-d/kmcuda) has obtained native R bindings, and to ask for help with CRAN packaging.

kmcuda is my child: an efficient GPGPU (CUDA) library to do K-means and K-nn on as much data as fits into memory. It supports running on multiple GPUs simultaneously, angular distance metric, Yinyang refinement, float16 (well, not in R for sure), K-means++ and AFK-MC2 initialization. I am thinking about Minibatch in the near future.

Usage example:

    dyn.load("libKMCUDA.so")
    samples <- replicate(4, runif(16000))
    result = .External("kmeans_cuda", samples, 50, tolerance=0.01, seed=777, verbosity=1)
    print(result$centroids)
    print(result$assignments[1:10,])

This library only supports Linux and macOS at the moment. A Windows port is welcome.

I knew pretty much nothing about R a week ago, so I would be glad to hear your suggestions. Besides, I've never published anything to CRAN, and it will take some time for me to design a full package following the guidelines and rules. It would be awesome if somebody is willing to help! It seems to be a special kind of fun to package CUDA+OpenMP code for R, and this fun doubles on macOS, where you need a specific combination of two different clang compilers to make it work.

Besides, I have a question which prevents me from sleeping at night: how is R able to support matrices with dimensions larger than INT32_MAX if the only integer type in the C API is int (32-bit signed on Linux)? Even getting the dimensions with INTEGER() automatically leads to overflow.

-- 
Best regards,
Vadim Markovtsev
Lead Machine Learning Engineer || source{d} / sourced.tech / Madrid
StackOverflow: 69708/markhor | GitHub: vmarkovtsev | data.world: vmarkovtsev
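A note on the closing INT32_MAX question, offered as a hedged sketch rather than an authoritative answer. As far as I understand the R C API, the `dim` attribute is an ordinary integer vector, so each individual dimension is still limited to `int` range. What can exceed INT32_MAX is the total element count, which long-vector builds report through `xlength()` / `XLENGTH()` as `R_xlen_t` (a `ptrdiff_t` typedef on 64-bit builds) instead of `int`. The standalone C below does not include any R headers. `xlen_t`, `matrix_length` and `fits_in_int` are hypothetical names used only for illustration. It shows the arithmetic a binding would need: widen the dimensions before multiplying them.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for R's R_xlen_t. A fixed 64-bit type is used
   here so the sketch behaves the same on every platform. */
typedef int64_t xlen_t;

/* Element count of an nrow x ncol matrix.
   Both operands are widened BEFORE the multiplication; computing
   nrow * ncol in plain int would overflow for large matrices. */
static xlen_t matrix_length(int nrow, int ncol) {
    assert(nrow >= 0 && ncol >= 0);
    return (xlen_t)nrow * (xlen_t)ncol;
}

/* Returns 1 if the element count can still be stored in a plain int,
   0 if a wider length type is required. */
static int fits_in_int(xlen_t n) {
    return n <= (xlen_t)INT_MAX;
}
```

For example, `matrix_length(100000, 50000)` is 5000000000, which `fits_in_int` rejects even though both dimensions are well within `int` range. The 16000 x 4 matrix from the usage example gives 64000, which fits.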
Multi-GPU "Yinyang" K-means and K-nn for R
2 messages · Vadim Markovtsev, Charles Determan
Hi Vadim,

I would be happy to explore helping you out with this. I am quite active in development for GPU use in R. You can see my work on my GitHub (https://github.com/cdeterman) and the group I created for additional packages in development (https://github.com/gpuRcore).

I believe it would be best to take this conversation off list, though. If you would like to discuss this further, please email me separately.

Kind regards,
Charles

On Thu, Feb 23, 2017 at 4:37 AM, Vadim Markovtsev <vadim at sourced.tech> wrote:
> This is to announce that [kmcuda](https://github.com/src-d/kmcuda) has obtained native R bindings and ask for the help with CRAN packaging. [...]