detecting null values in a CSV file
7 messages · Henrik Bengtsson, Hutchinson,David [PYR], Jason Thibodeau +1 more
What have you tried so far? Can't you parse them as missing values, i.e. NAs? See ?read.csv: its '...' arguments are passed on to read.table(), which takes the argument 'na.strings', a character *vector* of strings that you want to be interpreted as NAs. See ?read.table for more details. My $.02 Henrik
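A minimal sketch of the na.strings suggestion, assuming a tiny made-up CSV supplied inline through the text argument (the column names a, b, c and the object names csv_text, dat and freq_b are only illustrative). Empty fields in numeric columns are already read as NA by default; listing "" and "NULL" in na.strings makes that explicit and also covers a literal NULL string.

```r
# Hypothetical CSV: row 2 has an empty second field, row 3 an empty last field.
csv_text <- "a,b,c
0,1,2
1,,0
2,2,"

# na.strings is forwarded by read.csv() to read.table().
dat <- read.csv(text = csv_text, na.strings = c("", "NULL"))

# useNA = "ifany" adds an <NA> bucket to the frequency table.
freq_b <- table(dat$b, useNA = "ifany")
print(freq_b)

# Total number of missing cells in the whole data frame.
print(sum(is.na(dat)))
```

For a real file, the text argument would be replaced by the file name.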
On Thu, Sep 18, 2008 at 10:11 AM, Jason Thibodeau <jbloudg20 at gmail.com> wrote:
Hello all,
I have a CSV file that is 2411 columns wide. There are certain places in
the file where null values are located, that is, two commas together
with nothing in between. In a certain section, the only possible
values are NULL, 0, 1, and 2. I need to be able to detect these NULLs and
have them counted, for example in a frequency table. How can I
accomplish this?
Thanks in advance for the help.
--
Jason Thibodeau
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
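Try length(na.omit(<the particular data column>)), which counts the non-missing values in a column; sum(is.na(<the particular data column>)) counts the missing ones.
Here's an example:
data <- runif(100, 0, 10)
data[sample(100, 20)] <- NA   # set 20 distinct positions to NA
file.contents <- matrix(data, ncol = 5, byrow = TRUE)
for (i in 1:5) {
  # number of non-missing values in column i
  print(length(na.omit(file.contents[, i])))
}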
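The same per-column counts can probably be had without a loop. This is only a sketch on simulated data; the names values, m, na_per_column and valid_per_column are made up for illustration, and set.seed() is there solely to make the run repeatable.

```r
set.seed(1)                             # fixed seed, for reproducibility only
values <- runif(100, 0, 10)
values[sample(100, 20)] <- NA           # 20 distinct positions become NA
m <- matrix(values, ncol = 5, byrow = TRUE)

na_per_column    <- colSums(is.na(m))   # missing values in each column
valid_per_column <- colSums(!is.na(m))  # same as length(na.omit(m[, i]))

print(na_per_column)
print(valid_per_column)
```

is.na() returns a logical matrix of the same shape, so colSums() simply counts the TRUE entries column by column.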
You can always do this if they are single-valued vectors:
    if (!is.na(data_filter) && !is.na(trigger) && data_filter == trigger) ....
This catches the case where either one is NA. Because && stops at the first FALSE, the final comparison, which was giving your error, is then never evaluated.
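A small sketch of that guard on made-up scalar inputs (the vector values and the loop are illustrative, not the original poster's data):

```r
trigger <- 1L
values <- c(0L, 1L, NA, 1L)   # hypothetical single-valued inputs, one of them NA
trigger_count <- 0L

for (v in values) {
  # && short-circuits: when v is NA, v == trigger is never evaluated,
  # so if() always receives TRUE or FALSE.
  if (!is.na(v) && v == trigger) {
    trigger_count <- trigger_count + 1L
  }
}

print(trigger_count)
```

Here trigger_count ends up as 2. Wrapping the comparison as isTRUE(v == trigger) should behave the same way, since isTRUE(NA) is FALSE.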
On Fri, Sep 19, 2008 at 9:48 AM, Jason Thibodeau <jbloudg20 at gmail.com> wrote:
On a related note, I am trying to do some matching using conditional
statements. These NULL values are brought into my data frame as NA,
as expected, but in a conditional if() statement I cannot compare them to an
integer value; the program fails. Here is a small snippet showing where the
error occurs.
while(col_loop<1570)
{
data_filter <- data[c(col_loop)]
print(data_filter)
if(data_filter == trigger)
{
trigger_count <- trigger_count +1
}
col_loop <- col_loop +1
}
Here, trigger_count and trigger are both integers. The print statement was
added for debugging, to see why it was failing, and this is what it returned:
<snip>
V1415
1 0
V1416
1 0
V1417
1 1
V1418
1 1
V1419
1 1
V1420
1 NA
Error in if (data_filter == trigger) { :
missing value where TRUE/FALSE needed
Thanks for any help you can provide.
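For what it is worth, the loop above might be replaced by a vectorized count. The sketch below assumes a one-row data frame shaped like the printed output; dat, row_values and na_count are hypothetical names.

```r
# Hypothetical one-row data frame mirroring the columns printed above.
dat <- data.frame(V1415 = 0L, V1416 = 0L, V1417 = 1L,
                  V1418 = 1L, V1419 = 1L, V1420 = NA_integer_)
trigger <- 1L

# Flatten the row into a plain named vector so == works element by element.
row_values <- unlist(dat[1, ])

# Comparisons against NA yield NA; na.rm = TRUE leaves them out of the sum.
trigger_count <- sum(row_values == trigger, na.rm = TRUE)
na_count      <- sum(is.na(row_values))

print(trigger_count)
print(na_count)
```

With this input trigger_count is 3 and na_count is 1, and no if() ever sees a missing value.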
Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?