
mutually exclusive events

6 messages · Adrian Johnson, David Winsemius, Don McKenzie +2 more

#
On Aug 2, 2014, at 11:11 AM, Adrian Johnson wrote:

            
#-------------
 dat <- read.table(text="Cluster      Gene      Mutated    not_mutated
  1             G1             1              0
  1             G2             1              0
  1             G3             0              1
  1             G4             0              1
  1             G5             1              0
  2             G1             0              1
  2             G2             1              0
  2             G3             1              0
  2             G4             0              0
  2             G5             1              0", header=TRUE, stringsAsFactors=FALSE)

 with(dat, table(Cluster, Gene, Mutated)  )
#----------------
, , Mutated = 0

       Gene
Cluster G1 G2 G3 G4 G5
      1  0  0  1  1  0
      2  1  0  0  1  0

, , Mutated = 1

       Gene
Cluster G1 G2 G3 G4 G5
      1  1  1  0  0  1
      2  0  1  1  0  1
#--------------
Or:
xtabs(Mutated ~ Cluster+Gene, data=dat)
#----------------
       Gene
Cluster G1 G2 G3 G4 G5
      1  1  1  0  0  1
      2  0  1  1  0  1


I'm a bit unclear about your goals. Are you trying to identify the "Gene"s that have only one "Cluster" mutated as the "G1-G3" events and the "Gene"s that have either-Cluster but not both as the "G2-G5" events?

If so you can choose the columns that have a sum of 2 for the first and columns with sum of 1 for the second.
It's even less clear what sort of "test" you propose. `fisher.test` is a test of association. It doesn't identify combinations.
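
The column-sum selection described above could be sketched as follows. This is only an illustration under assumptions: `dat` is rebuilt here from the Cluster, Gene and Mutated columns of the example so the snippet runs on its own, and the names `tab` and `sums` are made up for the sketch.

```r
# Rebuild the example data (Cluster, Gene, Mutated) so this runs standalone.
dat <- data.frame(
  Cluster = rep(1:2, each = 5),
  Gene    = rep(paste0("G", 1:5), times = 2),
  Mutated = c(1, 1, 0, 0, 1,   # Cluster 1: G1..G5
              0, 1, 1, 0, 1),  # Cluster 2: G1..G5
  stringsAsFactors = FALSE
)

# Cluster-by-Gene table of mutation counts, as in the xtabs() call above.
tab <- xtabs(Mutated ~ Cluster + Gene, data = dat)

# Number of clusters in which each gene is mutated.
sums <- colSums(tab)

both_clusters <- names(sums)[sums == 2]  # genes mutated in both clusters
one_cluster   <- names(sums)[sums == 1]  # genes mutated in exactly one cluster
```

With the example data, the sum-of-2 columns are G2 and G5 and the sum-of-1 columns are G1 and G3. G4 has a sum of 0 and falls in neither set.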
This is a plain text mailing list.
David Winsemius
Alameda, CA, USA
#
Homework?

There is a no homework policy here.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll
On Sat, Aug 2, 2014 at 11:47 AM, Don McKenzie <dmck at u.washington.edu> wrote:
#
On Sat, Aug 2, 2014 at 1:11 PM, Adrian Johnson
<oriolebaltimore at gmail.com> wrote:
I am having trouble visualizing your data. How about a sample? The
easy way is to do something like:

temp <- head(realData,10);
dput(temp);

Then cut'n'paste the output from the dput() into another email here.

But, assuming I have a bit of a grasp, you have four columns (example
only shows 3). If you have a set of columns which are 0 & 1 or FALSE
and TRUE, then you can create a "temp" column which encodes them
simply by considering them to be binary digits in a number. I.e.
tempColumn = 1 * column1 + 2 * column2 + 4 * column3 + 8 * column4. You
can then "group" the data by this value. All rows with the same value
are in the same "group". But I don't know what you want your output to
look like. As an aside, any value other than 0, 1, 2, 4, or 8 could be
considered invalid because it means that more than one column is TRUE,
which violates your constraint.
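
A minimal sketch of that encoding, assuming four 0/1 indicator columns. The data frame `flags` and its values are made up for illustration. Only the names column1 to column4 and tempColumn come from the formula above.

```r
# Hypothetical indicator data; the last row deliberately has two columns set.
flags <- data.frame(
  column1 = c(1, 0, 0, 0, 1),
  column2 = c(0, 1, 0, 0, 1),
  column3 = c(0, 0, 1, 0, 0),
  column4 = c(0, 0, 0, 1, 0)
)

# Treat the four columns as binary digits of a single number.
flags$tempColumn <- with(flags,
  1 * column1 + 2 * column2 + 4 * column3 + 8 * column4)

# Group row indices by the encoded value: same value, same group.
groups <- split(seq_len(nrow(flags)), flags$tempColumn)

# Rows where at most one column is set encode to 0, 1, 2, 4 or 8.
valid <- flags$tempColumn %in% c(0, 1, 2, 4, 8)
```

Here the last row encodes to 3 (column1 and column2 both set), so it is flagged as invalid under the mutual-exclusivity constraint.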