cut ()
3 messages · Muhuri, Pradip (SAMHSA/CBHSQ), Neal H. Walfield, David L Carlson
At Mon, 31 Dec 2012 22:25:25 +0000,
Muhuri, Pradip (SAMHSA/CBHSQ) wrote:
The issue is that, for Utah, I am getting an <NA> instead of (42,48.7] in the ob_mrj_cat column.
The problem is likely due to comparisons of floating-point numbers. Try moving your lower and upper bounds out a tiny bit. When I add c(-1e-8, 0, 0, 0, 0, 1e8) to the result of quantile, I don't get any NAs.

Neal
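A minimal sketch of the bound-nudging workaround, using a made-up toy vector rather than the posted data (the names x, brks and nudged are illustrative assumptions). By default cut() builds intervals that are open on the left and closed on the right, so a value equal to the lowest break is left unclassified; shifting only that lowest break down by a small amount should be enough to capture it.

```r
# Toy data standing in for obt_mrj_p (illustrative values only)
x <- seq(42, 62, by = 2)
brks <- quantile(x, 0:5 / 5)

# Default cut(): intervals are (a, b], so the minimum, which equals the
# lowest break, falls outside every interval and becomes NA
default_cat <- cut(x, brks)
sum(is.na(default_cat))

# Workaround: move only the lowest break slightly below the minimum
nudged <- brks
nudged[1] <- nudged[1] - 1e-8
nudged_cat <- cut(x, nudged)
sum(is.na(nudged_cat))
```

The upper bound needs no adjustment in this sketch because the maximum already sits on the closed side of the top interval.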
A misplaced right parenthesis caused the problem:

p1_st_data$ob_mrj_cat <- cut (p1_st_data$obt_mrj_p, quantile (p1_st_data$obt_mrj_p, (0:5/5), include.lowest=TRUE))

Should be

p1_st_data$ob_mrj_cat <- cut (p1_st_data$obt_mrj_p, quantile (p1_st_data$obt_mrj_p, (0:5/5)), include.lowest=TRUE)

---------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
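To see the effect of the parenthesis placement in isolation, here is a small sketch on a made-up vector (x and p are illustrative, not the posted data). With the parenthesis in the wrong place, include.lowest is handed to quantile(), which accepts and ignores it without complaint, so cut() still runs with its default include.lowest=FALSE.

```r
# Toy data (illustrative values only)
x <- seq(42, 62, by = 2)
p <- 0:5 / 5

# Misplaced parenthesis: include.lowest is passed to quantile(), not cut()
wrong <- cut(x, quantile(x, p, include.lowest = TRUE))

# Corrected call: include.lowest is an argument of cut()
right <- cut(x, quantile(x, p), include.lowest = TRUE)

which(is.na(wrong))   # position(s) left unclassified by the wrong call
anyNA(right)          # whether the corrected call leaves any NA
levels(right)[1]      # label of the lowest interval in the corrected call
```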
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Muhuri, Pradip (SAMHSA/CBHSQ)
Sent: Monday, December 31, 2012 4:25 PM
To: R help
Subject: [R] cut ()

Hello List,

My goal is to create a 5 category variable (p1_st_data$ob_mrj_cat), based on the p1_st_data$obt_mrj_p variable, using the following code for 50 States and District of Columbia (N=51).

p1_st_data$ob_mrj_cat <- cut (p1_st_data$obt_mrj_p, quantile (p1_st_data$obt_mrj_p, (0:5/5), include.lowest=TRUE))

The issue is that, for Utah, I am getting an <NA> instead of (42,48.7] in the ob_mrj_cat column.

Is there a way to tweak the code (i.e., programmatically) to resolve the issue?

I would appreciate receiving your help.

Happy New Year and Best Wishes to R Expert-members, who have been so kind and helpful to beginner R users like me.

Thanks and regards,

Pradip Muhuri

########################## console followed the reproducible example #######
table(p1_st_data$ob_mrj_cat)

  (42,48.7] (48.7,50.9] (50.9,52.8] (52.8,54.2] (54.2,58.7]
         10          10          10          10          10

p1_st_data [p1_st_data$state =="Utah",] [, 1:4]
   state obt_mrj_p obt_mrj_se ob_mrj_cat
45  Utah        42       1.49       <NA>   # I expected this to be (42,48.7] instead of <NA>.
### The Reproducible Example (data and code) is shown below:
# read estimates of risk factors for substance use (ages 12-17) by State, obtained from SUDAAN output
p1_st_data <-read.table (text="
Alabama, 49.60, 1.37
Alaska, 55.00, 1.41
Arizona, 52.50, 1.56
Arkansas, 50.50, 1.22
California, 51.10, 0.65
Colorado, 55.10, 1.26
Connecticut, 56.30, 1.28
Delaware, 53.60, 1.30
District of Columbia, 53.50, 1.22
Florida, 52.70, 0.67
Georgia, 52.50, 1.15
Hawaii, 49.40, 1.33
Idaho, 48.30, 1.23
Illinois, 52.70, 0.63
Indiana, 49.60, 1.16
Iowa, 46.30, 1.37
Kansas, 44.30, 1.43
Kentucky, 52.90, 1.37
Louisiana, 49.70, 1.23
Maine, 55.60, 1.44
Maryland, 53.90, 1.46
Massachusetts, 55.40, 1.41
Michigan, 52.40, 0.62
Minnesota, 51.50, 1.20
Mississippi, 43.20, 1.14
Missouri, 48.70, 1.20
Montana, 56.40, 1.16
Nebraska, 45.70, 1.51
Nevada, 54.20, 1.17
New Hampshire, 56.10, 1.30
New Jersey, 53.20, 1.45
New Mexico, 57.60, 1.34
New York, 53.70, 0.67
North Carolina, 52.20, 1.26
North Dakota, 48.60, 1.34
Ohio, 50.90, 0.61
Oklahoma, 47.20, 1.42
Oregon, 54.00, 1.35
Pennsylvania, 53.00, 0.63
Rhode Island, 57.20, 1.20
South Carolina, 50.50, 1.21
South Dakota, 43.40, 1.30
Tennessee, 48.90, 1.35
Texas, 48.70, 0.62
Utah, 42.00, 1.49
Vermont, 58.70, 1.24
Virginia, 51.80, 1.18
Washington, 53.50, 1.39
West Virginia, 52.80, 1.07
Wisconsin, 49.90, 1.50
Wyoming, 49.20, 1.29",
sep= "," , col.names = c("state" , "Obt_mrj_p" , "Obt_mrj_se" ),
colClasses = c( "character" , "numeric" , "numeric" )
)
# change the names to lowercase
names(p1_st_data) <- tolower (names(p1_st_data))
# create five equal-sized groups for the perceived ease of obtaining marijuana variable
p1_st_data$ob_mrj_cat <- cut (p1_st_data$obt_mrj_p, quantile
(p1_st_data$obt_mrj_p, (0:5/5), include.lowest=TRUE))
p1_st_data
dim (p1_st_data)
table(p1_st_data$ob_mrj_cat)
p1_st_data [p1_st_data$state =="Utah",] [, 1:4]
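On the question of handling this programmatically, one option is to wrap the grouping in a small function that also checks whether any non-missing value ended up without a group. The sketch below is only an illustration: cut_quantile and its n_groups argument are hypothetical names, not part of base R or of the posted code, and the input vector is made up.

```r
# Hypothetical helper: quantile-based groups with a check for lost values
cut_quantile <- function(x, n_groups = 5) {
  brks <- quantile(x, probs = seq(0, 1, length.out = n_groups + 1),
                   na.rm = TRUE)
  # include.lowest = TRUE closes the lowest interval on the left,
  # so the minimum value is assigned to the first group
  out <- cut(x, breaks = brks, include.lowest = TRUE)
  # A non-missing input that became NA points to a problem with the breaks
  lost <- is.na(out) & !is.na(x)
  if (any(lost)) warning(sum(lost), " value(s) were not assigned to a group")
  out
}

# Made-up values for illustration
x <- c(42.0, 44.3, 43.2, 58.7, 49.6, 55.0, 52.5, 50.5, 51.1, 55.1)
groups <- cut_quantile(x)
table(groups, useNA = "ifany")
```

Note that cut() stops with an error when the breaks are not unique, which can happen with heavily tied data; this sketch does not handle that case.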
Pradip K. Muhuri, PhD
Statistician
Substance Abuse & Mental Health Services Administration
The Center for Behavioral Health Statistics and Quality
Division of Population Surveys
1 Choke Cherry Road, Room 2-1071
Rockville, MD 20857
Tel: 240-276-1070
Fax: 240-276-1260
e-mail: Pradip.Muhuri at samhsa.hhs.gov