
Finding the highest and lowest rates of increase at specific x value across several time series in R

The only place you need to change names is where you put all of your dataframes into a list, because that step uses the names you gave them in your script when reading in the data. Inside the function, you don't need to change the input to "dataframe1", and naming it that way could be confusing, since you are applying the function to all 10 of your dataframes rather than just dataframe1. I named the argument df to indicate that you should supply a dataframe as the input to the function, but you could name it anything you want. For example, you could call it "mydata" and define the function this way:

ExtractFirstMin<- function(mydata){
  mydata$abs_diff<- abs(mydata$x-1)
  min_rate<- mydata$y[which.min(mydata$abs_diff)]
  return(min_rate)
}

#The function has its own environment of variables, separate from the global environment of variables you've defined in your script.
#When we supply one of your dataframes to the function, that dataframe is assigned to a variable in the function's environment called "mydata". Functions let you generalize your code so that you aren't required to name your variables a certain way. Note that we do still assume "mydata" has columns named "x" and "y".
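To make the point about separate environments concrete, here is a minimal sketch. The data frame `toy` and its values are made up for illustration and are not part of the original data; the function body is a condensed version of the one above.

```r
# Hypothetical toy data frame, invented for illustration only
toy <- data.frame(x = c(0.5, 0.9, 1.4), y = c(10, 20, 30))

ExtractFirstMin <- function(mydata) {
  # This column is added to the function's local copy, not to the caller's data
  mydata$abs_diff <- abs(mydata$x - 1)
  mydata$y[which.min(mydata$abs_diff)]
}

result <- ExtractFirstMin(toy)
result                       # y at the row whose x is closest to 1
"abs_diff" %in% names(toy)   # FALSE: the global data frame is unchanged
```

The `abs_diff` column exists only while the function runs, which is why the same function can be applied to any number of dataframes without leaving anything behind in your script.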

#Without generalizing the code using a function, we'd need to copy and paste the code over and over again and make sure to change the name of the dataframe each time. This is very time consuming and error prone. Here's an example for the first 3 dataframes.

min_rate<- rep(NA_real_, 10) #initialize empty vector
df1$abs_diff<- abs(df1$x-1)
min_rate[1]<- df1$y[which.min(df1$abs_diff)]

df2$abs_diff<- abs(df2$x-1)
min_rate[2]<- df2$y[which.min(df2$abs_diff)]

df3$abs_diff<- abs(df3$x-1)
min_rate[3]<- df3$y[which.min(df3$abs_diff)]

print(min_rate)
#>  [1] 29.40269 32.21546 30.75330       NA       NA       NA       NA       NA
#>  [9]       NA       NA

#With the function defined, we can run it on each individual dataframe, which is less error prone than copying and pasting but still fairly repetitive.
ExtractFirstMin(mydata = df1) # You can explicitly say "mydata ="
#> [1] 29.40269
ExtractFirstMin(df2) # Or, equivalently, rely on argument order: unnamed inputs are matched to arguments in the order they were defined. Since there is only one argument, whatever you supply is assigned to "mydata"
#> [1] 32.21546
ExtractFirstMin(df3)
#> [1] 30.7533

# Rather than manually typing out a call to the function for each dataframe and then combining the results, we can use sapply.
# sapply takes a list of inputs and a function as arguments. It applies the function to every element of the list and, when each result is a single value as it is here, returns a vector (i.e. it goes through each dataframe in your list, applies the function to each one individually, and records all of the results in a single variable).
sapply(df_list, ExtractFirstMin)
#>  [1] 29.40269 32.21546 30.75330 30.12109 30.38361 28.64928 30.45568 29.66190
#>  [9] 31.57229 31.33907
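
As a rough sketch of how the pieces fit together, the example below builds a named list and then picks out the highest and lowest values. The data are simulated stand-ins (the helper `make_series` is hypothetical and not from the original script), and it uses `vapply` rather than `sapply` only because `vapply` checks that each result is a single number; either should work here.

```r
set.seed(1)  # simulated stand-in data, not the original dataframes
make_series <- function() {
  x <- seq(0, 2, by = 0.25)
  data.frame(x = x, y = 30 + rnorm(length(x)))
}
df1 <- make_series()
df2 <- make_series()
df3 <- make_series()

# The one place where the dataframe names from your script appear
df_list <- list(df1 = df1, df2 = df2, df3 = df3)

ExtractFirstMin <- function(mydata) {
  mydata$y[which.min(abs(mydata$x - 1))]
}

# One value per dataframe, named after the list elements
rates <- vapply(df_list, ExtractFirstMin, numeric(1))
rates

names(rates)[which.max(rates)]  # dataframe with the highest value at x closest to 1
names(rates)[which.min(rates)]  # dataframe with the lowest value at x closest to 1
```

Naming the list elements means the result carries those names, so you can tell which dataframe each value came from without counting positions.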