Hi everyone. I'd like to perform RIDIT scoring of a column that consists of ordinal values, but I don't have a comparison dataset to use against it as required by the Ridit::ridit function. As a question of...
Graham Williams has a book titled "Data Mining with Rattle and R", which has a chapter on association rules and the arules package. Williams' Rattle GUI package for R also lets you define an association rules model using a graphical...
Hi everyone. I have a dataframe that is a collection of Vendor IDs plus a bank account number for each vendor. I'm trying to find a way to count the number of duplicate bank accounts that occur in more...
Hi Brian. I assume you're interested in some kind of classification of the theme or the contents within each document? In which case I would direct you to natural language processing for multinomial classification of unstructured data. Basically an...
Hi Leslie and all. You may want to investigate using SparklyR on a cloud environment like AWS, where you have more packages that are designed to work on cluster computing environments and you have more control over those types of...
Hi everyone. I'm using the kernlab ksvm function with the rbfdot kernel for a binary classification problem and getting a strange result back. The predictions seem to be very accurate judging by the training results provided by the algorithm...
Hi. I am not a lawyer and I didn't stay at a Holiday Inn last night, but I don't think this is a GPL violation. The closest analogy for non-GPL release packages that comes to mind is...
I am using R with the nnet package to perform a multinomial logistic regression on a training dataset with ~5800 records and 45 predictor variables. Predictor variables were chosen as a subset of all...
Hi Paul. Have you considered just going onto Kaggle and GitHub and searching for some of the many freely available real datasets that are posted there? I'm seeing a lot of productivity these days with research focused on data...
Have you looked at the merge function in base R? https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/merge On 2022-03-19 21:15, Jeff Reichman wrote: > R-Help Community > > I'm trying to combine two...
Couldn't you convert the date columns to character type data in a data frame, and then convert those strings to factors in a 2nd step? The only downside I think to treating dates as factor levels is that you...
Some ideas: You could create a cluster model with k=3 for each of the 3 variables, to determine what constitutes high/medium/low centroid values for each of the 3 types of plant types. Centroid values could then be...
Hi Dr. Pedersen. I haven't used Cook's distance on an aov object but I do it all the time from an lm (general linear model) object, i.e.: mod <- lm(data=dataframe) cooksdistance <- cooks.distance(mod) I *think* you might...
forgot to mention, the training and testing dataframes are composed of 4 IVs (one double numeric IV and three factor IVs) and one DV (dichotomous factor, i.e. true or false). The training dataframe consists of 48819 rows and test...
I had a paper published about 2 weeks ago that cited R in the references section: Woolman, T. A., & Pickard, J. L. (2022). Gradient Descent Machine Learning with Equivalency Testing for Non-Subject Dependent Applications in Human Activity Recognition. EAI...
In Windows versions of R/RStudio, when referring to file paths you need to either use a doubled backslash "\\" instead of a single one, OR use the forward slash "/" as used in Linux/Unix. It's an unfortunate conflict between R and Windows...
You can also do "SQL-like" joins in the tidyverse with dplyr. On 2022-03-19 21:23, Jeff Reichman wrote: > Evening Tom > > Yest I've been playing with the merge function. But haven't been able > to > achieve what...
Hi everyone. I'm using a random forest in R to successfully perform a classification on a dichotomous DV in a dataset that has 29 IVs of type double and approximately 285,000 records. I ran my model on a...
Thanks, everyone! Quoting Jim Lemon <drjimlemon at gmail.com>: > Oops, I sent this to Tom earlier today and forgot to copy to the list: > > VendorID=rep(paste0("V",1:10),each=5) > AcctID=paste0("A",sample(1:5,50,TRUE...
I'm trying hard to take tonight off and avoid booting up the laptop and launching R... :) but you need to merge by the primary key(s), e.g. the common columns (common IVs) shared between the two dataframes. On...