If cycle takes too much time...
On 25.01.2013 12:08, Berend Hasselman wrote:
On 25-01-2013, at 10:25, marcoguerzoni <marco.guerzoni at unito.it> wrote:
Dear all,
thank you for reading.
I have a dataset of artists, with where and when they had an exhibition.
I'd like to create an affiliation network in the form of a matrix, telling me which artists have been in the same place at the same time.
I managed to do it, but given that I have 96000 observations, the program takes 30 months to complete.
Here is what I have done.
The data look like this:
Artist <- c(1, 2, 3, 2, 4, 4, 5)
Begin <- as.Date(c('2006-08-23', '2006-03-21', '2006-03-06', '2006-01-13',
                   '2006-05-20', '2006-07-13', '2006-07-20'))
End <- as.Date(c('2006-10-23', '2006-11-30', '2006-05-06', '2006-12-13',
                 '2006-09-20', '2006-08-13', '2006-09-20'))
Istitution <- c(1, 2, 2, 1, 1, 2, 1)
Artist is the name of the artist, Begin and End are the when, and Istitution is the where.
My IF is working:
# number of unique artists
c <- unique(Artist)
d <- length(c)
a <- length(Artist)
B <- mat.or.vec(d, d)
for (i in 1:d) {
  for (j in 1:d) {
    if (Istitution[i] == Istitution[j]) {
      if (Begin[i] <= End[j]) {
        if (End[i] - Begin[j] >= 0) {
          B[i, j] <- B[i, j] + 1
          B[i, i] <- 0
        }
      } else {
        if (End[j] - Begin[i] >= 0) {
          B[i, j] <- B[i, j] + 1
          B[i, i] <- 0
        }
      }
    }
  }
  print(i)
}
Do you have a way to make the program simpler and faster?
It is not clear why you are only using the unique artists.
You shouldn't be using "c" as a variable name. It is a builtin function.
Since the result is symmetric, you can change the j-loop to for(j in (i+1):d). After the loop you can do B[lower.tri(B)] <- t(B)[lower.tri(B)] to fill the remainder of the matrix B. This would certainly be more efficient.
But I don't quite understand what you are trying to do. With your example you could compute the result you desire.
Gerrit's answer is concise.
Berend
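The upper-triangle suggestion above can be sketched as follows. This is only an illustration under assumptions that are not in the original post: it loops over all observations rather than over the unique artists, it treats two exhibitions as overlapping when each begins no later than the other ends, and it uses seq_len(n - 1) for the outer loop because (i+1):n counts backwards when i equals n.

```r
# Toy data from the original post
Artist <- c(1, 2, 3, 2, 4, 4, 5)
Begin <- as.Date(c("2006-08-23", "2006-03-21", "2006-03-06", "2006-01-13",
                   "2006-05-20", "2006-07-13", "2006-07-20"))
End <- as.Date(c("2006-10-23", "2006-11-30", "2006-05-06", "2006-12-13",
                 "2006-09-20", "2006-08-13", "2006-09-20"))
Istitution <- c(1, 2, 2, 1, 1, 2, 1)

n <- length(Artist)      # assumption: one row/column per observation
B <- matrix(0, n, n)

# Visit each unordered pair (i, j) with i < j exactly once.
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    same_place <- Istitution[i] == Istitution[j]
    # Two date intervals overlap when each starts before the other ends.
    overlap <- Begin[i] <= End[j] && Begin[j] <= End[i]
    if (same_place && overlap) B[i, j] <- 1
  }
}

# Mirror the upper triangle into the lower triangle.
B[lower.tri(B)] <- t(B)[lower.tri(B)]
```

This halves the number of comparisons but is still quadratic in the number of observations, so on its own it is unlikely to be enough for 96000 rows.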
Thank you Berend,
what I would like is a symmetric matrix where rows and columns are artists, and the value is 1 (or TRUE) if they had an exhibition at the same time and in the same place.
My inelegant code is working, but for 96000 observations it requires months and months. Gerrit's solution is very elegant, but I run out of memory... the problem is the size.
I am looking for a way to divide Gerrit's solution into smaller steps which can be handled.
Thanks,
Marco
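One possible way to break the work into smaller steps is sketched below. It is not Gerrit's code (which is not quoted in this message) but an alternative under stated assumptions: exhibitions are only compared within the same institution, and the result is first collected as a list of artist pairs rather than as a full matrix. The names obs, pairs_for_institution, edges, ids and A are hypothetical and chosen only for this illustration.

```r
# Toy data from the original post, gathered in one data frame
obs <- data.frame(
  Artist = c(1, 2, 3, 2, 4, 4, 5),
  Begin = as.Date(c("2006-08-23", "2006-03-21", "2006-03-06", "2006-01-13",
                    "2006-05-20", "2006-07-13", "2006-07-20")),
  End = as.Date(c("2006-10-23", "2006-11-30", "2006-05-06", "2006-12-13",
                  "2006-09-20", "2006-08-13", "2006-09-20")),
  Istitution = c(1, 2, 2, 1, 1, 2, 1)
)

# For one institution, return a two-column matrix of artist pairs whose
# exhibitions overlap in time (smaller artist id first), or NULL if none.
pairs_for_institution <- function(df) {
  m <- nrow(df)
  out <- vector("list", m)
  for (i in seq_len(m - 1)) {
    later <- (i + 1):m
    # Vectorised overlap test of row i against all later rows
    hit <- later[df$Begin[i] <= df$End[later] & df$Begin[later] <= df$End[i]]
    a <- df$Artist[i]
    b <- df$Artist[hit]
    keep <- a != b                      # ignore an artist paired with itself
    if (any(keep)) out[[i]] <- cbind(pmin(a, b[keep]), pmax(a, b[keep]))
  }
  do.call(rbind, out)
}

# Process one institution at a time, then drop duplicate pairs.
chunks <- split(obs, obs$Istitution)
edges <- unique(do.call(rbind, unname(lapply(chunks, pairs_for_institution))))

# For the toy data a dense artist-by-artist matrix is small enough to build.
ids <- sort(unique(obs$Artist))
A <- matrix(0, length(ids), length(ids), dimnames = list(ids, ids))
ia <- match(edges[, 1], ids)
ib <- match(edges[, 2], ids)
A[cbind(ia, ib)] <- 1
A[cbind(ib, ia)] <- 1
```

With many artists the dense matrix A may itself be too large; in that case the edge list could be kept as it is or, as one option, passed to a sparse matrix constructor such as sparseMatrix() from the Matrix package.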