Hello all,
I have the following function call to create a matrix of POP_SIZE rows
and fill it with bit strings of size LEN:
pop=create_pop_2(POP_SIZE, LEN)
I have 3 questions:
(1) If I did
keep_pop[1:POP_SIZE] == pop[1:POP_SIZE]
to keep a copy of the original data structure before manipulating
'pop' potentially, would this make a deep copy or just shallow? Ie
if I change something in 'pop' would it be reflected in 'keep_pop'
too? (I don't think so, but just wanted to check). I would like
two independent copies.
(2) If I wanted to change the order of *rows* in my matrix 'pop', is there
an easy way to shuffle these? I don't want to change anything in the
columns, just the complete rows (E.g., in Python I could just say
something like shuffle(pop) assuming pop is a list of lists) - is there
an equivalent for R?
(3) I would like to compare the contents of 'keep_pop' with 'pop'. Though
the order of rows may be different it should not matter as long as
the same rows are present. Again, in Python this would be simply
    if sorted(keep_pop) == sorted(pop):
        print('they are equal')
    else:
        print('they are not equal')
Is there an equivalent R code segment?
Thanks,
Esmail
--------------- the code called above -------------
####################################################
# create a binary vector of size "len"
#
create_bin_Chromosome <- function(len)
{
  sample(0:1, len, replace=TRUE)
}
############## create_population ###################
# create population of chromosomes of length len
# the matrix contains twice as much space as popsize
#
create_pop_2 <- function(popsize, len)
{
  datasize=len*popsize
  print(datasize)
  # rows 1..popsize get random bits, the remaining rows stay all zero
  npop <- matrix(0, popsize*2, len, byrow=TRUE)
  for(i in 1:popsize)
    npop[i,] = create_bin_Chromosome(len)
  npop
}
3 questions regarding matrix copy/shuffle/compares
12 messages · Esmail, David Winsemius, Hadley Wickham
On Apr 26, 2009, at 12:28 AM, Esmail wrote:
Hello all, I have the following function call to create a matrix of POP_SIZE rows and fill it with bit strings of size LEN: pop=create_pop_2(POP_SIZE, LEN)
Are you constructing a vector or a matrix? What are the dimensions of your matrix?
I have 3 questions: (1) If I did keep_pop[1:POP_SIZE] == pop[1:POP_SIZE] to keep a copy of the original data structure before manipulating 'pop' potentially, would this make a deep copy or just shallow? Ie if I change something in 'pop' would it be reflected in 'keep_pop' too? (I don't think so, but just wanted to check). I would like two independent copies.
"==" is not an assignment operator in R, so the answer is that it would do neither. "<-" and "=" can do assignment. In neither case would it be a "deep copy".
(2) If I wanted to change the order of *rows* in my matrix 'pop', is there an easy way to shuffle these? I don't want to change anything in the columns, just the complete rows (E.g., in Python I could just say something like shuffle(pop) assuming pop is a list of lists) - is there an equivalent for R?
You can get a value from a matrix by using the indexing construction. But your terminology is confusing. Is pop a matrix or a list? ?"[" ?order ... and perhaps ?sample if you wanted a random permutation of the rows. I am going to refrain from posting speculation until you provide valid R code that will create an object that can be the subject of operations.
(3) I would like to compare the contents of 'keep_pop' with 'pop'. Though the order of rows may be different it should not matter as long as the same rows are present. Again, in Python this would be simply
    if sorted(keep_pop) == sorted(pop):
        print('they are equal')
    else:
        print('they are not equal')
Is there an equivalent R code segment?
Depends on what you want to do and what you are doing it on. You could look at: ?%in% ?merge
Thanks, Esmail --------------- the code called above -------------
The code below creates a "bit vector" but then only makes exact multiples of it in the first row and zeros in the second row. Was that what was desired?
####################################################
# create a binary vector of size "len"
#
create_bin_Chromosome <- function(len)
{
  sample(0:1, len, replace=TRUE)
}
############## create_population ###################
# create population of chromosomes of length len
# the matrix contains twice as much space as popsize
#
create_pop_2 <- function(popsize, len)
{
  datasize=len*popsize
  print(datasize)
  # rows 1..popsize get random bits, the remaining rows stay all zero
  npop <- matrix(0, popsize*2, len, byrow=TRUE)
  for(i in 1:popsize)
    npop[i,] = create_bin_Chromosome(len)
  npop
}
David Winsemius, MD Heritage Laboratories West Hartford, CT
Hello David,
Let me try again, I don't think this was the best post I've ever made :-)
Hopefully this is clearer, or otherwise I may break this up into
three separate simple queries as this may be too long.
> "==" is not an assignment operator in R, so the answer is that it
> would do neither. "<-" and "=" can do assignment. In neither case
> would it be a "deep copy".
It was late when I posted the code, I made a mistake with regard to
the assignment operator and used the boolean compare instead -- thanks
for catching that.
It should have been:
keep_pop[1:POP_SIZE] = pop[1:POP_SIZE]
-------- Here's an edited and clearer version I hope:
The basic idea is that I am trying to keep track of a number of bitstrings.
Therefore I am creating a matrix (named 'pop') whose rows are made up
of bit vectors (ie my bitstrings). I only initialize half of the rows
with my bitstrings of random 1s and 0s, the rest of the rows are set
to all zeros).
So I use following function call to create a matrix and fill it with
bit strings:
pop=create_pop_2(POP_SIZE, LEN)
where
POP_SIZE refers to the number of rows
LEN to the columns (length of my bitstrings)
This is the code I call:
####################################################
# create a random binary vector of size "len"
#
create_bin_Chromosome <- function(len)
{
  sample(0:1, len, replace=TRUE)
}
############## create_population ###################
# create population of chromosomes of length len
# the matrix contains twice as much space as popsize
#
create_pop_2 <- function(popsize, len)
{
  datasize=len*popsize
  print(datasize)
  # rows 1..popsize get random bits, the remaining rows stay all zero
  npop <- matrix(0, popsize*2, len, byrow=TRUE)
  for(i in 1:popsize)
    npop[i,] = create_bin_Chromosome(len)
  npop
}
My 3 questions:
(1) If I did
keep_pop[1:POP_SIZE] = pop[1:POP_SIZE]
to keep a copy of the original data structure before manipulating
'pop' potentially, would this make a deep copy or just shallow? Ie
if I change something in pop would keep_pop change too? I would
like two independent copies so that 'keep_pop' stays intact while
'pop' may change.
> "<-" and "=" can do assignment. In neither case would it be a
> "deep copy".
Is there a deepcopy operator, or would I have to have two nested
loops and iterate through them? Or is there a nice R-idiomatic way
to do this?
(2) If I wanted to change the order of rows in my matrix 'pop', is
there an easy way to shuffle these? I.e., I don't want to change
any of the bitstrings vectors/rows, just the order of the rows in the
matrix 'pop'. (E.g., in Python I could just say something like
shuffle(pop)) - is there an equivalent for R?
So if pop [ [0, 0, 0]
[1, 1, 1]
[1, 1, 0] ]
after the shuffle it may look like
[ [1, 1, 0] (originally at index 2)
[1, 1, 1] (originally at index 1)
[0, 0, 0] ] (originally at index 0)
the rows themselves remained intact, just their order changes.
This is a tiny example, in my case I may have 100 rows (POP_SIZE)
and rows of LEN 200.
(3) I would like to compare the contents of 'keep_pop' (a copy of the
original 'pop') with the current 'pop'. Though the order of rows
may be different between the two, it should not matter as long as
the same rows are present. So for the example given above, the
comparison should return True.
For instance, in Python this would be simply
    if sorted(keep_pop) == sorted(pop):
        print('they are equal')
    else:
        print('they are not equal')
Is there an equivalent R code segment?
I hope this post is clearer than my original one. Thank you David for
pointing out some of the shortcomings of my earlier post.
Thanks,
Esmail
On Apr 26, 2009, at 7:48 AM, Esmail wrote:
Hello David, Let me try again, I don't think this was the best post I've ever made :-) Hopefully this is clearer, or otherwise I may break this up into three separate simple queries as this may be too long.
"==" is not an assignment operator in R, so the answer is that it would do neither. "<-" and "=" can do assignment. In neither case would it be a "deep copy".
It was late when I posted the code, I made a mistake with regard to
the assignment operator and used the boolean compare instead -- thanks
for catching that.
It should have been:
keep_pop[1:POP_SIZE] = pop[1:POP_SIZE]
-------- Here's an edited and clearer version I hope:
The basic idea is that I am trying to keep track of a number of bitstrings.
Therefore I am creating a matrix (named 'pop') whose rows are made up
of bit vectors (ie my bitstrings). I only initialize half of the rows
with my bitstrings of random 1s and 0s, the rest of the rows are set
to all zeros).
So I use following function call to create a matrix and fill it with
bit strings:
pop=create_pop_2(POP_SIZE, LEN)
where
POP_SIZE refers to the number of rows
LEN to the columns (length of my bitstrings)
This is the code I call:
####################################################
# create a random binary vector of size "len"
#
create_bin_Chromosome <- function(len)
{
  sample(0:1, len, replace=TRUE)
}
############## create_population ###################
# create population of chromosomes of length len
# the matrix contains twice as much space as popsize
#
create_pop_2 <- function(popsize, len)
{
  datasize=len*popsize
  print(datasize)
  # rows 1..popsize get random bits, the remaining rows stay all zero
  npop <- matrix(0, popsize*2, len, byrow=TRUE)
  for(i in 1:popsize)
    npop[i,] = create_bin_Chromosome(len)
  npop
}
My 3 questions:
(1) If I did
keep_pop[1:POP_SIZE] = pop[1:POP_SIZE]
to keep a copy of the original data structure before manipulating
'pop' potentially, would this make a deep copy or just shallow? Ie
if I change something in pop would keep_pop change too? I would
like two independent copies so that 'keep_pop' stays intact while
'pop' may change.
> "<-" and "=" can do assignment. In neither case would it be a "deep copy".
Is there a deepcopy operator, or would I have to have two nested loops and iterate through them? Or is there a nice R-idiomatic way to do this?
Not that I know of, although my knowledge of R depth is not encyclopedic. You might get the desired sort of effect by creating a copy inside a function, working on it inside the function in the manner desired, and then comparing the output to the original. There might be other strategies to get certain effects by creating specific environments.
(2) If I wanted to change the order of rows in my matrix 'pop', is
there an easy way to shuffle these? I.e., I don't want to change
any of the bitstrings vectors/rows, just the order of the rows in the
matrix 'pop'. (E.g., in Python I could just say something like
shuffle(pop)) - is there an equivalent for R?
So if pop [ [0, 0, 0]
[1, 1, 1]
[1, 1, 0] ]
after the shuffle it may look like
[ [1, 1, 0] (originally at index 2)
[1, 1, 1] (originally at index 1)
[0, 0, 0] ] (originally at index 0)
the rows themselves remained intact, just their order changes.
This is a tiny example, in my case I may have 100 rows (POP_SIZE)
and rows of LEN 200.
Yes. As I said before "I am going to refrain from posting speculation until you provide valid R code that will create an object that can be the subject of operations."
(3) I would like to compare the contents of 'keep_pop' (a copy of the
original 'pop') with the current 'pop'. Though the order of rows
may be different between the two, it should not matter as long as
the same rows are present. So for the example given above, the
comparison should return True.
For instance, in Python this would be simply
    if sorted(keep_pop) == sorted(pop):
        print('they are equal')
    else:
        print('they are not equal')
Is there an equivalent R code segment?
If you created a random index vector that was used to sort the rows for display or computational purposes only, you could maintain the original ordering so that row wise comparisons could be done.
I hope this post is clearer than my original one. Thank you David for pointing out some of the shortcomings of my earlier post. Thanks, Esmail
David Winsemius, MD Heritage Laboratories West Hartford, CT
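The suggestion above about working on a copy inside a function can be sketched as follows. This is only a minimal illustration under stated assumptions: a small hand-made matrix stands in for 'pop', and flip_first is a hypothetical helper name that does not appear anywhere in the thread. It shows that a change made to a function argument stays inside the function unless the result is assigned back.

```r
# Hypothetical helper: it changes its argument, which is a local copy.
flip_first <- function(m) {
  m[1, 1] <- 1 - m[1, 1]   # modifies only the function's own copy
  m
}

# Small stand-in for 'pop' (assumption: any numeric matrix behaves the same).
pop <- matrix(c(0, 1, 1,
                1, 0, 0), nrow = 2, byrow = TRUE)

changed <- flip_first(pop)

print(pop[1, 1])       # the caller's matrix is untouched
print(changed[1, 1])   # the returned matrix carries the change
```

Because the original is never altered, the returned matrix can be compared against it afterwards.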
David Winsemius wrote:
Yes. As I said before "I am going to refrain from posting speculation until you provide valid R code that will create an object that can be the subject of operations."
The code I have provided works, here is a run that may prove helpful:
POP_SIZE = 6
LEN = 8
pop=create_pop_2(POP_SIZE, LEN)
print(pop)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 0 1 0 1 1 0 0 1
[2,] 0 0 0 0 0 0 0 0
[3,] 1 1 0 0 1 0 0 0
[4,] 0 0 0 0 0 0 0 1
[5,] 0 0 1 1 0 0 1 0
[6,] 1 0 0 0 0 0 1 0
[7,] 0 0 0 0 0 0 0 0
[8,] 0 0 0 0 0 0 0 0
[9,] 0 0 0 0 0 0 0 0
[10,] 0 0 0 0 0 0 0 0
[11,] 0 0 0 0 0 0 0 0
[12,] 0 0 0 0 0 0 0 0
I want to (1) create a deep copy of pop, (2) be able to shuffle
the rows only, and (3) be able to compare two copies of these objects
for equality and have it return True if only the rows have been shuffled.
On Apr 26, 2009, at 9:43 AM, Esmail wrote:
David Winsemius wrote:
Yes. As I said before "I am going to refrain from posting speculation until you provide valid R code that will create an object that can be the subject of operations."
The code I have provided works, here is a run that may prove helpful:
POP_SIZE = 6
LEN = 8
pop=create_pop_2(POP_SIZE, LEN)
print(pop)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 0 1 0 1 1 0 0 1
[2,] 0 0 0 0 0 0 0 0
[3,] 1 1 0 0 1 0 0 0
[4,] 0 0 0 0 0 0 0 1
[5,] 0 0 1 1 0 0 1 0
[6,] 1 0 0 0 0 0 1 0
[7,] 0 0 0 0 0 0 0 0
[8,] 0 0 0 0 0 0 0 0
[9,] 0 0 0 0 0 0 0 0
[10,] 0 0 0 0 0 0 0 0
[11,] 0 0 0 0 0 0 0 0
[12,] 0 0 0 0 0 0 0 0
I want to (1) create a deep copy of pop,
I have already said *I* do not know how to create a "deep copy" in R.
(2) be able to shuffle the rows only, and
I have suggested shuffling by way of a random permutation used as an
external index:
> pop=create_pop_2(POP_SIZE, LEN)
[1] 48
> pop
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 1 1 0 0 1 0 1 1
[2,] 1 0 1 0 0 0 1 0
[3,] 1 1 0 1 0 1 0 0
[4,] 0 0 0 0 1 0 0 0
[5,] 1 0 0 1 1 1 1 1
[6,] 1 1 0 0 0 0 0 0
[7,] 0 0 0 0 0 0 0 0
[8,] 0 0 0 0 0 0 0 0
[9,] 0 0 0 0 0 0 0 0
[10,] 0 0 0 0 0 0 0 0
[11,] 0 0 0 0 0 0 0 0
[12,] 0 0 0 0 0 0 0 0
> dx <- sample(1:nrow(pop), nrow(pop) )
> dx
[1] 12 10 8 9 3 1 6 11 5 7 4 2
> pop[dx,]
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 0 0 0 0 0 0 0 0
[2,] 0 0 0 0 0 0 0 0
[3,] 0 0 0 0 0 0 0 0
[4,] 0 0 0 0 0 0 0 0
[5,] 1 1 0 1 0 1 0 0
[6,] 1 1 0 0 1 0 1 1
[7,] 1 1 0 0 0 0 0 0
[8,] 0 0 0 0 0 0 0 0
[9,] 1 0 0 1 1 1 1 1
[10,] 0 0 0 0 0 0 0 0
[11,] 0 0 0 0 1 0 0 0
[12,] 1 0 1 0 0 0 1 0
(3) be able to compare two copies of these objects for equality and have it return True if only the rows have been shuffled.
I see two possible questions, the first easier (for me) than the second. Do you want to work on a copy with a known permutation of rows... or on a copy with an unknown ordering? In the first case I am unclear why you would not create an original and a copy, work on the copy, and compare with the original that is also sorted by the external index.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD Heritage Laboratories West Hartford, CT
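For the first case described above (a copy with a known permutation of rows), the comparison might be sketched like this. It assumes a small matrix in place of 'pop' and a fixed seed purely for repeatability; the names shuffled and restored are illustrative, not from the thread. The idea is that order(dx) gives the inverse of the permutation dx, so indexing the shuffled matrix with it puts the rows back in their original positions.

```r
set.seed(1)  # assumption: fixed seed only so that the run is repeatable

# Small stand-in for 'pop' with distinct rows.
pop <- matrix(c(0, 0, 0,
                1, 1, 1,
                1, 1, 0), nrow = 3, byrow = TRUE)

dx <- sample(nrow(pop))                  # random permutation of the row indices
shuffled <- pop[dx, , drop = FALSE]      # rows reordered, contents unchanged

# order(dx) is the inverse permutation: it undoes the shuffle.
restored <- shuffled[order(dx), , drop = FALSE]

print(identical(restored, pop))          # rows are back in the original order
print(identical(shuffled, pop[dx, ]))    # or compare against the re-indexed original
```

Either comparison works as long as dx is kept around; neither helps when the ordering of the copy is unknown.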
I want to (1) create a deep copy of pop,
I have already said *I* do not know how to create a "deep copy" in R.
Creating a deep copy is easy, because all copies are "deep" copies. You need to try very hard to create a reference in R. Hadley
My understanding of the OP's request was for some sort of copy which did change when entries in the original were changed; the sort of behavior that might be seen in a spreadsheet that had a copy "by reference".
On Apr 26, 2009, at 11:28 AM, hadley wickham wrote:
I want to (1) create a deep copy of pop,
I have already said *I* do not know how to create a "deep copy" in R.
Creating a deep copy is easy, because all copies are "deep" copies. You need to try very hard to create a reference in R. Hadley
David Winsemius, MD Heritage Laboratories West Hartford, CT
David,
Good news! It seems that R has deep copy by default. I ran this simplified
test and it seems I can change 'pop' without changing the saved version.
POP_SIZE = 4
LEN = 8
pop=create_pop_2(POP_SIZE, LEN)
cat('printing original pop\n')
print(pop)
keep_pop = pop
pop[1,1] = 99
cat('printing changed pop\n')
print(pop)
cat('printing keep_pop\n')
print(keep_pop)
-----------
> source('mat.R')
[1] 32
printing original pop
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 0 1 1 0 1 0 0 1
[2,] 1 0 1 0 0 0 1 1
[3,] 0 1 0 1 1 1 0 1
[4,] 0 0 0 1 0 1 0 0
[5,] 0 0 0 0 0 0 0 0
[6,] 0 0 0 0 0 0 0 0
[7,] 0 0 0 0 0 0 0 0
[8,] 0 0 0 0 0 0 0 0
printing changed pop
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 99 1 1 0 1 0 0 1
[2,] 1 0 1 0 0 0 1 1
[3,] 0 1 0 1 1 1 0 1
[4,] 0 0 0 1 0 1 0 0
[5,] 0 0 0 0 0 0 0 0
[6,] 0 0 0 0 0 0 0 0
[7,] 0 0 0 0 0 0 0 0
[8,] 0 0 0 0 0 0 0 0
printing keep_pop
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 0 1 1 0 1 0 0 1
[2,] 1 0 1 0 0 0 1 1
[3,] 0 1 0 1 1 1 0 1
[4,] 0 0 0 1 0 1 0 0
[5,] 0 0 0 0 0 0 0 0
[6,] 0 0 0 0 0 0 0 0
[7,] 0 0 0 0 0 0 0 0
[8,] 0 0 0 0 0 0 0 0
Re Shuffle
I tried using sample based on your earlier post, but your example
really helped, thanks! That solves the shuffling issue.
dx <- sample(1:POP_SIZE, POP_SIZE)
cat('shuffled index:')
print(dx)
print(pop[dx,])
cat('shuffled pop')
pop[1:POP_SIZE,] = pop[dx,]
print(pop)
re compare:
> I am unclear why you would not create an original and a copy,
Well .. that I wanted to do from the start (hence my question about
deep copy :-)
> work on the copy, and compare with the original that is also sorted
> by the external index.
That's a great idea, hadn't thought of keeping the index around for
this, I'll give this a try.
Final question, how do I compare these two structures so that I get
one result, true or false? Right now
keep == pop yields all these individual comparisons:
> pop==keep
[,1] [,2] [,3] [,4] [,5]
[1,] FALSE TRUE FALSE TRUE FALSE
[2,] FALSE TRUE FALSE TRUE FALSE
[3,] TRUE TRUE TRUE TRUE TRUE
[4,] TRUE TRUE TRUE TRUE TRUE
[5,] TRUE TRUE TRUE TRUE TRUE
[6,] TRUE TRUE TRUE TRUE TRUE
Thanks for the help, much appreciated.
Esmail
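One possible way to get a single TRUE or FALSE is sketched below. The helpers sort_rows and same_rows are hypothetical names introduced only for this illustration, and a small matrix stands in for the real data. all() collapses the element-wise result of == into one value, identical() tests the whole object at once, and sorting the rows of both matrices first gives roughly the effect of the Python sorted() comparison, where row order does not matter.

```r
# Hypothetical helper: sort the rows of a matrix lexicographically,
# using every column in turn as a sort key.
sort_rows <- function(m) {
  keys <- lapply(seq_len(ncol(m)), function(j) m[, j])
  m[do.call(order, keys), , drop = FALSE]
}

# Hypothetical helper: TRUE if both matrices hold the same rows in any order.
same_rows <- function(a, b) {
  identical(dim(a), dim(b)) && all(sort_rows(a) == sort_rows(b))
}

keep_pop <- matrix(c(0, 0, 0,
                     1, 1, 1,
                     1, 1, 0), nrow = 3, byrow = TRUE)
pop <- keep_pop[c(3, 2, 1), ]     # same rows, different order

print(all(pop == keep_pop))       # element-wise test, row order matters
print(identical(pop, keep_pop))   # whole-object test, row order matters
print(same_rows(pop, keep_pop))   # row order ignored
```

Sorting both sides keeps duplicate rows in the count, so two matrices only compare equal when each row appears the same number of times in both.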
In that case, you would want a shallow copy, and you'd need to jump through a lot of hoops to do that in R.
Hadley
On Sun, Apr 26, 2009 at 10:35 AM, David Winsemius <dwinsemius at comcast.net> wrote:
My understanding of the OP's request was for some sort of copy which did change when entries in the original were changed; the sort of behavior that might be seen in a spreadsheet that had a copy "by reference".
On Apr 26, 2009, at 11:28 AM, hadley wickham wrote:
I want to (1) create a deep copy of pop,
I have already said *I* do not know how to create a "deep copy" in R.
Creating a deep copy is easy, because all copies are "deep" copies. You need to try very hard to create a reference in R. Hadley
-- David Winsemius, MD Heritage Laboratories West Hartford, CT
hadley wickham wrote:
I want to (1) create a deep copy of pop,
I have already said *I* do not know how to create a "deep copy" in R.
Creating a deep copy is easy, because all copies are "deep" copies. You need to try very hard to create a reference in R.
Hi Hadley
Right you are .. I discovered this now too. It's really confusing to go back and forth between different languages. I have been programming in Python for the last 2 months and everything there is a reference .. so I have to worry about deep copy etc.
Thanks!
Esmail
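The contrast discussed here can be sketched in a few lines. This is an illustration only, with made-up variable names: plain assignment of a matrix behaves like an independent copy, while an environment is one of the few R objects that is shared by reference, which is roughly the behavior a Python programmer would expect from every object.

```r
# Plain assignment: 'b' behaves as an independent copy of 'a'.
a <- matrix(0, 2, 2)
b <- a
b[1, 1] <- 99
print(a[1, 1])        # 'a' is unchanged

# Environments have reference semantics.
e1 <- new.env()
e1$m <- matrix(0, 2, 2)
e2 <- e1              # e2 refers to the same environment, not to a copy
e2$m[1, 1] <- 99
print(e1$m[1, 1])     # the change is visible through e1 as well
```

For the use case in this thread, the plain assignment is all that is needed to keep 'keep_pop' intact while 'pop' changes.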
David Winsemius
<dwinsemius at comcast.net> wrote:
My understanding of the OP's request was for some sort of copy which did change when entries in the original were changed; the sort of behavior that might be seen in a spreadsheet that had a copy "by reference".
You misunderstood (my phrasing probably wasn't the best), but I was
clear about wanting two independent copies.
From my earlier post:
(1) If I did
keep_pop[1:POP_SIZE] = pop[1:POP_SIZE]
to keep a copy of the original data structure before manipulating
'pop' potentially, would this make a deep copy or just shallow? Ie
if I change something in 'pop' would it be reflected in 'keep_pop'
too? (I don't think so, but just wanted to check). I would like
two independent copies.
Regardless, the net outcome was new knowledge, so this is a good outcome.
Esmail