Hello, I want to do regression or missing value imputation by knn. I searched the r-help mailing list. This question was asked in 2005, and ksmooth and loess were recommended. But my case is different. I have many predictors (p>20) and I really want to try knn with a given k. ksmooth and loess use a bandwidth to define the neighborhood size. This contrasts with knn, where fixing k gives a variable bandwidth. Are there any functions in R packages that I can use for this? Your help is highly appreciated. Shengqiao Li
How to do knn regression?
3 messages · Shengqiao Li, Yihui Xie, Hans W Borchers
Hi Shengqiao, I don't know of any direct solution to your question, but I don't think it is difficult to write a few lines of code to find the k nearest neighbours of an observation with a missing value. Typically you need dist() to compute distances, rank() or order() to find the k nearest neighbours, and finally mean(), median() or some other statistic to make the prediction. To reassure you that the programming is light work: all the code for this example (http://animation.yihui.name/dmml:k-nearest_neighbour_algorithm) is no more than 100 lines :-D Seriously, though, I don't think my method is efficient. C code would probably be much faster, which is what the knn() function in package 'class' calls. Regards, Yihui -- Yihui Xie <xieyihui at gmail.com> Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086 Mobile: +86-15810805877 Homepage: http://www.yihui.name School of Statistics, Room 1037, Mingde Main Building, Renmin University of China, Beijing, 100872, China
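The recipe in the reply above (compute distances, order them, average the k nearest responses) can be sketched in base R roughly as follows. This is only an illustration under stated assumptions, not the code from the linked example: knn_predict() and its arguments are hypothetical names, Euclidean distance is assumed, and the distances are computed directly from each new point to the training rows rather than with dist(), to avoid building the full pairwise matrix.

```r
# Hypothetical helper (not from any package): knn regression in base R.
# Assumes numeric predictors on comparable scales; standardize first
# (e.g. with scale()) if they are not.
knn_predict <- function(train_x, train_y, new_x, k = 5, stat = mean) {
  train_x <- as.matrix(train_x)
  # Coerce new_x to a matrix with one row per point to predict.
  new_x <- matrix(as.matrix(new_x), ncol = ncol(train_x))
  stopifnot(nrow(train_x) == length(train_y), k >= 1, k <= nrow(train_x))
  tx <- t(train_x)  # p x n, so a length-p vector is subtracted from every column
  vapply(seq_len(nrow(new_x)), function(i) {
    d <- sqrt(colSums((tx - new_x[i, ])^2))  # Euclidean distance to each training row
    nearest <- order(d)[seq_len(k)]          # indices of the k closest rows
    stat(train_y[nearest])                   # mean(), median(), or any other statistic
  }, numeric(1))
}

# Usage: impute missing responses from the complete cases (toy data).
set.seed(1)
x <- matrix(rnorm(200 * 3), ncol = 3)
y <- x[, 1] + 2 * x[, 2] + rnorm(200, sd = 0.1)
y[c(5, 17)] <- NA
miss <- is.na(y)
y[miss] <- knn_predict(x[!miss, ], y[!miss], x[miss, , drop = FALSE], k = 5)
```

Because k is fixed, the effective neighbourhood radius adapts to the local density of the data, which is the variable-bandwidth behaviour asked for in the question. The loop over new points is plain R, so for large data a compiled nearest-neighbour search would likely be faster.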
On Fri, Sep 19, 2008 at 10:17 AM, Shengqiao Li <shli at stat.wvu.edu> wrote:
Hello, I want to do regression or missing value imputation by knn. I searched the r-help mailing list. This question was asked in 2005, and ksmooth and loess were recommended. But my case is different. I have many predictors (p>20) and I really want to try knn with a given k. ksmooth and loess use a bandwidth to define the neighborhood size. This contrasts with knn, where fixing k gives a variable bandwidth. Are there any functions in R packages that I can use for this? Your help is highly appreciated. Shengqiao Li
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
The R package 'knnFinder' provides nearest neighbor search based on kd-tree data structures, so it is extremely fast even for very large data sets. It returns as many neighbors as you need and can also be used for other purposes, e.g., determining distance-based outliers. Hans Werner Borchers ABB Corporate Research