Message-ID: <20081224121730.E0D3D595608@borg.st.net.au>
Date: 2008-12-24T12:16:55Z
From: Bob Green
Subject: selecting a subset of a matrix based on a value occurring in 5 records
In-Reply-To: <mailman.21.1230030005.18537.r-help@r-project.org>
Hello,
>I am hoping for some advice as to how I might create a subset of a
>matrix. The matrix is 176 x 3530. The rows are individual records
>and the columns words. I want to create a new matrix that only
>consists of words which occur in at least 5 records. For example,
>if column 7 is "charges" and this only appears in 4 records/rows
>this variable would not be included, whereas if column 109 is the
>word "monitor" and occurs in 95 records it would be saved into the
>new matrix. Values in the matrix are numbers, such that if a word
>does not occur in a record the cell contains a zero, whereas if it
>occurs 7 times there is a value of 7 for that record. It is the
>number of records rather than the column total that is the
>criterion for determining inclusion in the matrix.
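
The criterion above can be illustrated on a small made-up matrix. This is only a sketch in base R, not a tested solution for the real data: the object name `m`, the toy values, and the third column name are assumptions for illustration, and the real matrix is assumed to be numeric with no missing values.

```r
# Toy stand-in for the 176 x 3530 matrix: 6 records (rows) x 3 words (columns).
# The name `m` and the values are made up for illustration.
m <- cbind(
  charges = c(9, 9, 9, 9, 0, 0),  # large counts, but present in only 4 records
  monitor = c(1, 1, 1, 1, 1, 0),  # small counts, but present in 5 records
  absent  = c(0, 0, 0, 0, 0, 0)   # never occurs
)

# m > 0 marks the cells where a word occurs at all; colSums() then counts
# the records containing each word, ignoring how often it occurs in each.
records_per_word <- colSums(m > 0)

# Keep only the columns whose word occurs in at least 5 records.
# drop = FALSE keeps the result a matrix even if a single column survives.
min_records <- 5
m_sub <- m[, records_per_word >= min_records, drop = FALSE]
```

Here "charges" has the larger column total but appears in only 4 records, so it is dropped, while "monitor" is kept. Comparing with `m > 0` before summing is what makes the test count records rather than occurrences.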
Any suggestions on how I might reduce the size of this matrix so as
to include only those columns in which a word occurs in at least 5
records would be much appreciated,
regards
Bob