
How to adjust the start of a series to zero? (i.e. subtract the first value from the sequence)

5 messages · Kristiina Hurme, Phil Spector, Rui Barradas +2 more

#
Hello, 
I have a time series, where I am plotting the means and sd of a distance
for a variety of positions along a bird's bill. I'd like to set each line
(represented by "point") to start at zero, so that I can look at the
absolute change along the series. At the moment I only know how to do that
in Excel, by subtracting the value of time 1, point 1 from all other times
for point 1. My actual data set has many points (20 per bird, only 3 shown
here), so I would love to make this faster in R. Ideally, I would have
another column titled "adj_mean" for the adjusted means. 

Here is an example.
    point time      mean        sd
1       1    1 52.501000 1.5073927
3       1    2 54.501818 0.8510329
4       1    3 56.601739 1.5787222
5       1    4 57.200000 1.2292726
6       1    5 59.300000 2.2632327
7       1    6 57.800893 1.4745218
8       1    7 55.303508 2.2661855
9       1    8 51.100943 1.8540025
10      1    9 50.600000 1.7126977
2       1   10 52.904716 1.1010460
111     2    1 50.605963 1.2633969
113     2    2 52.203828 0.7890765
114     2    3 54.100909 1.1013344
115     2    4 55.000000 1.1547005
116     2    5 57.001725 1.6341500
117     2    6 55.003591 1.5652438
118     2    7 52.911089 1.7373914
119     2    8 49.204022 1.0350809
120     2    9 48.904103 0.8747568
112     2   10 50.915700 0.8765483
131     3    1 48.608228 0.8433913
133     3    2 49.307101 0.4827703
134     3    3 51.310824 0.9424023
135     3    4 52.413350 0.6997860
136     3    5 54.116723 1.1927297
137     3    6 52.618161 1.1686288
138     3    7 49.822764 1.6303473
139     3    8 47.107336 1.2013356
140     3    9 47.104214 1.1986148
132     3   10 48.719484 0.6765047

and I would like it to look like this (which I did in Excel). Each point's
series (times 1-10) starts with an adj_mean of 0.
	point	time	mean	sd	adj_mean
1	1	1	52.501	1.5073927	0
3	1	2	54.501818	0.8510329	2.000818
4	1	3	56.601739	1.5787222	4.100739
5	1	4	57.2	1.2292726	4.699
6	1	5	59.3	2.2632327	6.799
7	1	6	57.800893	1.4745218	5.299893
8	1	7	55.303508	2.2661855	2.802508
9	1	8	51.100943	1.8540025	-1.400057
10	1	9	50.6	1.7126977	-1.901
2	1	10	52.904716	1.101046	0.403716
111	2	1	50.605963	1.2633969	0
113	2	2	52.203828	0.7890765	1.597865
114	2	3	54.100909	1.1013344	3.494946
115	2	4	55	1.1547005	4.394037
116	2	5	57.001725	1.63415	6.395762
117	2	6	55.003591	1.5652438	4.397628
118	2	7	52.911089	1.7373914	2.305126
119	2	8	49.204022	1.0350809	-1.401941
120	2	9	48.904103	0.8747568	-1.70186
112	2	10	50.9157	0.8765483	0.309737
131	3	1	48.608228	0.8433913	0
133	3	2	49.307101	0.4827703	0.698873
134	3	3	51.310824	0.9424023	2.702596
135	3	4	52.41335	0.699786	3.805122
136	3	5	54.116723	1.1927297	5.508495
137	3	6	52.618161	1.1686288	4.009933
138	3	7	49.822764	1.6303473	1.214536
139	3	8	47.107336	1.2013356	-1.500892
140	3	9	47.104214	1.1986148	-1.504014
132	3	10	48.719484	0.6765047	0.111256

Thank you so much for your help. 
Kristiina

--
View this message in context: http://r.789695.n4.nabble.com/How-to-adjust-the-start-of-a-series-to-zero-i-e-subtract-the-first-value-from-the-sequence-tp4634999.html
Sent from the R help mailing list archive at Nabble.com.
#
Kristiina -
    If the data will always be sorted so that the first time 
for a point appears first in the data frame, you can use:

sort2v4$adj_mean = sort2v4$mean - ave(sort2v4$mean,sort2v4$point,FUN=function(x)x[1])

Otherwise, something like this should work:

firstmeans = subset(sort2v4,time==1,select=c(point,mean))
names(firstmeans)[2] = 'adj'
sort2v4 = merge(sort2v4,firstmeans)
sort2v4$adj_mean = with(sort2v4,mean-adj)
sort2v4$adj = NULL
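
As a quick check of the ave() approach, here is a minimal sketch on a made-up data frame. The name `toy` and its values are hypothetical, not the poster's data, and it assumes each point has exactly one row with time == 1. Sorting first means the result does not depend on the incoming row order.

```r
# Hypothetical toy data, deliberately NOT sorted by time within point
toy <- data.frame(
  point = c(1, 1, 1, 2, 2, 2),
  time  = c(2, 1, 3, 1, 3, 2),
  mean  = c(54.5, 52.5, 56.6, 50.6, 54.1, 52.2)
)

# Sort by point, then time, so the first row of each point is time 1
toy <- toy[order(toy$point, toy$time), ]

# ave() applies FUN within each point and returns a vector aligned with the rows
toy$adj_mean <- toy$mean - ave(toy$mean, toy$point, FUN = function(x) x[1])

print(toy)
```

Every row with time == 1 should come out with an adj_mean of 0, which is an easy sanity check on the real data as well.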

    In the future, you may want to learn about the dput function, which makes
it a little easier for others to reproduce your data.
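
For illustration, a minimal sketch of how dput might be used when posting data. The data frame `dat` below is a hypothetical stand-in, not the poster's data. dput() prints R code that recreates the object, and the second half checks that evaluating that text gives the data back.

```r
# Hypothetical small data frame standing in for the real data
dat <- data.frame(point = c(1, 1, 2, 2),
                  time  = c(1, 2, 1, 2),
                  mean  = c(52.5, 54.5, 50.6, 52.2))

# dput() prints R code that recreates the object; paste that output into a post
dput(dat)

# Round trip: capture the printed text and evaluate it to rebuild the object
txt  <- paste(capture.output(dput(dat)), collapse = "\n")
copy <- eval(parse(text = txt))
print(all.equal(dat, copy))
```

A reader of the post can then assign the pasted output to a name and start working with the same data straight away.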

 					- Phil Spector
 					 Statistical Computing Facility
 					 Department of Statistics
 					 UC Berkeley
 					 spector at stat.berkeley.edu
On Sat, 30 Jun 2012, Kristiina Hurme wrote:

            
#
Hello,

Try, where 'dat' is your dataset,

dd <- lapply(split(dat, dat$point), function(x) x$mean - x$mean[1])
dat$adj_mean <- NA
for(i in names(dd))
	dat$adj_mean[dat$point == i] <- dd[[i]]
rm(dd)  # clean-up

Now 'dat' has one extra column, with the adjusted mean values.
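
The same split-and-subtract idea can probably also be written without the explicit loop. This is a minimal sketch, assuming 'dat' has 'point' and 'mean' columns and that the first row of each point is its time 1 value; the toy 'dat' below is hypothetical. Base R's unsplit() reverses split() and puts the values back in the original row order.

```r
# Hypothetical data with the points interleaved, time 1 first for each point
dat <- data.frame(point = c(1, 2, 1, 2, 1, 2),
                  time  = c(1, 1, 2, 2, 3, 3),
                  mean  = c(52.5, 50.6, 54.5, 52.2, 56.6, 54.1))

# Split the means by point and subtract each group's first value
pieces <- lapply(split(dat$mean, dat$point), function(x) x - x[1])

# unsplit() reassembles the pieces in the original row order
dat$adj_mean <- unsplit(pieces, dat$point)

print(dat)
```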

Hope this helps,

Rui Barradas

On 30-06-2012 22:21, Kristiina Hurme wrote:
#
Hi,

Try this:
#dat1: data

dat2<-split(dat1,dat1$point)
adjmeanlist<-lapply(dat2,function(x)x[,3]-x[,3][1])
dat3<-data.frame(dat1,adjmean=unlist(adjmeanlist))
head(dat3)
  point time     mean        sd  adjmean
1     1    1 52.50100 1.5073927 0.000000
3     1    2 54.50182 0.8510329 2.000818
4     1    3 56.60174 1.5787222 4.100739
5     1    4 57.20000 1.2292726 4.699000
6     1    5 59.30000 2.2632327 6.799000
7     1    6 57.80089 1.4745218 5.299893
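
One thing to watch with the unlist() step: it concatenates the groups in order of point, so it lines up with the rows of dat1 only when dat1 is already sorted by point, as it is in the example data. A minimal sketch with hypothetical interleaved data (not the poster's), sorting first so the two orders agree:

```r
# Hypothetical data where the points are interleaved rather than grouped
dat1 <- data.frame(point = c(2, 1, 2, 1),
                   time  = c(1, 1, 2, 2),
                   mean  = c(50.6, 52.5, 52.2, 54.5))

# Sort by point and time so the row order matches the order unlist() produces
dat1 <- dat1[order(dat1$point, dat1$time), ]

# Subtract each point's first mean, then bind the result back as a new column
adjmeanlist <- lapply(split(dat1, dat1$point), function(x) x$mean - x$mean[1])
dat3 <- data.frame(dat1, adjmean = unlist(adjmeanlist))

print(dat3)
```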



A.K.



----- Original Message -----
From: Kristiina Hurme <kristiina.hurme at uconn.edu>
To: r-help at r-project.org
Cc: 
Sent: Saturday, June 30, 2012 5:21 PM
Subject: [R] How to adjust the start of a series to zero? (i.e. subtract the first value from the sequence)
