Dear all,
I have two data sets, map and ref. A small part of each data set
is given below. map has more than 27 million rows and ref has about
560 rows. I need to run two different tasks on them. My R code for these
tasks is given below, but it does not work properly.
I would sincerely appreciate your help.
Regards,
Greg
Task 1)
For example, the first and second columns of row 1 in ref are 29220 and
63933. I need R code that takes the first row of ref (29220 and 63933),
selects the rows of map whose map$reg lies in that interval, sums
map$rate over those rows, and counts how many of those rows have
map$rate > 0.85. It should then do the same for the second, third, ...
rows of ref. At the end I would like a table like the one given below
(the results I need). Please note that all values specified in ref
exist in the map$reg column.
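To make Task 1 concrete, here is a minimal sketch on made-up toy data. The values are invented for illustration and are not from the real files; the ref column names reg1 and reg2 are assumed, and both interval endpoints are treated as inclusive. It sorts map once and uses running totals, so each interval needs two lookups rather than a scan over all 27 million rows.

```r
# Toy data, invented for illustration only (not the real map/ref files).
map <- data.frame(reg  = c(10, 20, 30, 40, 50, 60),
                  p    = c(0.50, 0.01, 0.20, 0.03, 0.90, 0.40),
                  rate = c(0.90, 0.50, 0.95, 0.86, 0.10, 0.99))
# Assumed column names for ref: reg1 = interval start, reg2 = interval end.
ref <- data.frame(reg1 = c(10, 30), reg2 = c(30, 60))

# Sort once by position, then build running totals.
map <- map[order(map$reg), ]
cum_rate <- c(0, cumsum(map$rate))          # running sum of rate
cum_high <- c(0, cumsum(map$rate > 0.85))   # running count of rate > 0.85

# Number of map rows with reg < reg1, and with reg <= reg2,
# so that both endpoints are included in the interval.
below <- findInterval(ref$reg1, map$reg, left.open = TRUE)
upto  <- findInterval(ref$reg2, map$reg)

task1 <- data.frame(reg1 = ref$reg1, reg2 = ref$reg2,
                    sum_rate = cum_rate[upto + 1] - cum_rate[below + 1],
                    n_high   = cum_high[upto + 1] - cum_high[below + 1])
print(task1)
```

On the toy data the first interval (10 to 30) covers three rows and the second (30 to 60) covers four. Because the sums are differences of running totals, very long inputs may show small floating-point rounding in sum_rate.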
Task 2)
Again, for example, the first and second columns of row 1 in ref are 29220
and 63933. I need R code that gives the minimum map$p within the
29220-63933 interval of map, and then does the same for the second,
third, ... rows of ref.
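A possible sketch of Task 2, again on made-up toy data (the values are invented, the ref column names reg1 and reg2 are assumed, and both endpoints are treated as inclusive). After sorting map by reg, each interval corresponds to a contiguous block of rows, so the minimum of p can be taken over that block only.

```r
# Toy data, invented for illustration only (not the real map/ref files).
map <- data.frame(reg  = c(10, 20, 30, 40, 50, 60),
                  p    = c(0.50, 0.01, 0.20, 0.03, 0.90, 0.40),
                  rate = c(0.90, 0.50, 0.95, 0.86, 0.10, 0.99))
# Assumed column names for ref: reg1 = interval start, reg2 = interval end.
ref <- data.frame(reg1 = c(10, 30), reg2 = c(30, 60))

map <- map[order(map$reg), ]

# Row range of each interval in the sorted map.
first <- findInterval(ref$reg1, map$reg, left.open = TRUE) + 1  # first row with reg >= reg1
last  <- findInterval(ref$reg2, map$reg)                        # last row with reg <= reg2

# Minimum p per interval; NA when no map row falls inside the interval.
min_p <- vapply(seq_len(nrow(ref)), function(i) {
  if (first[i] > last[i]) return(NA_real_)
  min(map$p[first[i]:last[i]])
}, numeric(1))

task2 <- data.frame(reg1 = ref$reg1, reg2 = ref$reg2, min_p = min_p)
print(task2)
```

With about 560 intervals, the vapply loop runs only 560 times, and each pass touches just the rows inside its own interval.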
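# my attempt for the first question
# (ref$reg1 and ref$reg2 hold the start and end of each interval)
result <- data.frame(reg1 = ref$reg1, reg2 = ref$reg2,
                     sum_rate = NA_real_, n_high = NA_integer_)
for (i in seq_len(nrow(ref))) {
  # rows of map whose reg falls inside the i-th interval (endpoints included)
  in_range <- map$reg >= ref$reg1[i] & map$reg <= ref$reg2[i]
  result$sum_rate[i] <- sum(map$rate[in_range])         # total rate in the interval
  result$n_high[i]   <- sum(map$rate[in_range] > 0.85)  # rows with rate > 0.85
}
result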