temp<-readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
startRow=2, endRow= 11, startCol=2, endCol=5)
temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
temp
      Col1     Col2   Col3    Col4
 [1,] "647853" "1413" "57662" "27897"
 [2,] "491400" "1365" "40919" "20411"
 [3,] "38604"  "-"    "5505"  "985"
 [4,] "576"    "-"    "20"    "54"
 [5,] "80845"  "21"   "10211" "4494"
 [6,] "36428"  "27"   "1007"  "1953"
 [7,] "269915" "587"  "32988" "12779"
 [8,] "224494" "-"    "30554" "9184"
 [9,] "11858"  "587"  "-"     "686"
[10,] "3742"   "-"    "81"    "415"
temp <- sapply( temp , as.numeric )
Warning messages:
1: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
2: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
3: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
4: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
5: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
647853 491400  38604    576  80845  36428 269915
647853 491400  38604    576  80845  36428 269915
224494  11858   3742   1413   1365      -      -
224494  11858   3742   1413   1365     NA     NA
    21     27    587      -    587      -  57662
    21     27    587     NA    587     NA  57662
 40919   5505     20  10211   1007  32988  30554
 40919   5505     20  10211   1007  32988  30554
     -     81  27897  20411    985     54   4494
    NA     81  27897  20411    985     54   4494
  1953  12779   9184    686    415
  1953  12779   9184    686    415
temp[ is.na( temp ) ] <- 0
temp
647853 491400  38604    576  80845  36428 269915
647853 491400  38604    576  80845  36428 269915
224494  11858   3742   1413   1365      -      -
224494  11858   3742   1413   1365      0      0
    21     27    587      -    587      -  57662
    21     27    587      0    587      0  57662
 40919   5505     20  10211   1007  32988  30554
 40919   5505     20  10211   1007  32988  30554
     -     81  27897  20411    985     54   4494
     0     81  27897  20411    985     54   4494
  1953  12779   9184    686    415
  1953  12779   9184    686    415
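A note on the shape of the result above: sapply() applied to a character matrix visits every cell separately, which is why the 10 x 4 table came back as one long named vector (each original string became a name, printed above its numeric value). A minimal sketch of keeping the layout, using a 2 x 2 toy matrix built from four of the cells above; `m`, `flat` and `fixed` are illustrative names, not objects from the session:

```r
# Toy character matrix standing in for the comma-stripped `temp`
# (four cells copied from the output above, filled column by column).
m <- matrix(c("647853", "38604", "1413", "-"), nrow = 2,
            dimnames = list(NULL, c("Col1", "Col2")))

# sapply() visits each cell on its own, so the result is a flat named vector.
flat <- suppressWarnings(sapply(m, as.numeric))

# Converting the whole matrix in one call and restoring its shape
# keeps the rows and columns; "-" still becomes NA.
fixed <- matrix(suppressWarnings(as.numeric(m)), nrow = nrow(m),
                dimnames = dimnames(m))

dim(flat)   # NULL: the 2 x 2 layout is gone
dim(fixed)  # 2 2
```

suppressWarnings() is used here only to silence the expected "NAs introduced by coercion" message for the "-" cell.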
2013/5/2 Anthony Damico <ajdamico at gmail.com>
try adding colTypes = 'numeric' to your readWorksheetFromFile() call
if that doesn't work, try a few other steps
# view what data types your file is being read in as
sapply( temp , class )
# convert all fields to character if they're factor variables.. but i
# don't think you need this, readWorksheet defaults to `character`
temp <- sapply( temp , as.character )
# you can also convert a subset like this
temp[ , c( 1 , 3:4 ) ] <- sapply( temp[ , c( 1 , 3:4 ) ] , as.character )
# remove commas from character strings
temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
# convert all fields to numeric
temp <- sapply( temp , as.numeric )
# convert all NA fields to zeroes if you prefer
temp[ is.na( temp ) ] <- 0
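The steps above can also be sketched column by column, which keeps `temp` a data frame instead of collapsing it. This is only a sketch under assumptions: the small data frame stands in for what readWorksheetFromFile() returned, and `to_number` is a hypothetical helper, not an XLConnect function.

```r
# Stand-in for the sheet as read: character columns with commas and "-"
# (three rows copied from the posted data).
temp <- data.frame(Col1 = c("647,853", "38,604", "576"),
                   Col2 = c("1,413", "-", "-"),
                   stringsAsFactors = FALSE)

# Hypothetical helper: strip thousands separators, turn "-" into NA,
# then convert to numeric.
to_number <- function(x) {
  x <- gsub(",", "", x, fixed = TRUE)
  x[x %in% "-"] <- NA
  as.numeric(x)
}

# Assigning into temp[] replaces the columns but keeps the data frame.
temp[] <- lapply(temp, to_number)

str(temp)       # both columns are now numeric
temp[2, 1] + 3  # arithmetic works: 38607
```

Marking "-" as NA before the conversion also avoids the coercion warnings; the NAs can still be set to zero afterwards with temp[ is.na( temp ) ] <- 0 if that is preferred.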
On Wed, May 1, 2013 at 11:55 PM, jpm miao <miaojpm at gmail.com> wrote:
Hi,
Attached are two datasheets to be read.
My raw data "130502temp.xlsx" contains numbers with ' symbols, and they
can't be read as numbers. Even if I copy and paste as numbers to form a
new file "130502temp_number1.xlsx", they could not be read smoothly.
1. How can I read the datasheet as numbers?
2. How can I treat the notation "-" as (1) "NA" or (2) zero?
Thanks,
Miao
temp<-readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
startRow=2, endRow= 11, startCol=2, endCol=5)
      Col1  Col2   Col3   Col4
1  647,853 1,413 57,662 27,897
2  491,400 1,365 40,919 20,411
3   38,604     -  5,505    985
4      576     -     20     54
5   80,845    21 10,211  4,494
6   36,428    27  1,007  1,953
7  269,915   587 32,988 12,779
8  224,494     - 30,554  9,184
9   11,858   587      -    686
10   3,742     -     81    415
Error in temp[2, 2] + 3 : non-numeric argument to binary operator
temp_num<-readWorksheetFromFile("130502temp_number1.xlsx", sheet=1,
header=FALSE, startRow=2, endRow= 11, startCol=2, endCol=5)
Error in temp_num[2, 2] + 3 : non-numeric argument to binary operator
as.numeric(temp_num[2,2])+3
[1] NA
Warning message:
NAs introduced by coercion
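On question 2 in the message above, whether "-" should become NA or zero depends on what the dash means in the sheet, and the choice changes later results. A small sketch of the difference, using rows 3, 4, 9 and 10 of Col3 with the "-" cell already converted to NA (`col3` and `col3_zero` are illustrative names):

```r
# Rows 3, 4, 9 and 10 of Col3; NA marks the "-" cell.
col3 <- c(5505, 20, NA, 81)

# Treating "-" as missing: drop it explicitly when summarising.
sum(col3, na.rm = TRUE)    # 5606
mean(col3, na.rm = TRUE)   # averaged over the 3 known values

# Treating "-" as zero: the cell now counts as an observation.
col3_zero <- replace(col3, is.na(col3), 0)
sum(col3_zero)             # 5606
mean(col3_zero)            # 1401.5, averaged over all 4 cells
```

The sums agree either way, but the means do not, so NA is the safer default when the dash means "not reported" rather than a true zero.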