[R studio] Plotting of line chart for each columns at 1 page

10 messages · Jim Lemon, Subhamitra Patra

#
Hi Subhamitra,
As before, I don't have your data, so I cannot run your code. Similarly,
when you say the plot is looking unclear, I have almost no idea what that
means in terms of a plot that I could see and possibly correct. Let's start
at the top anyway. You set up an array of 20 plots, then plot 38 series.
This is going to cycle through your array almost twice, as each time you
plot, you step forward one plot in the array. At the end of the first Excel
data sheet, you will be at plot 18 out of the 20. You then display another
10 plots, leaving you at plot 8 in the array. Each time you plot into the
same section of the array, you will wipe out the previous plot. Maybe this
is why you are not getting what you want. Finally, you display a further
two plots, leaving you at plot 10. If I am correct, you will have plots 31
to 38 from sheet 1 in the bottom two rows, with plots 1 and 2 from sheet
two in positions 19 and 20, then 3 to 10 from sheet 2 in the top rows,
finishing off with the two plots from sheet 3 in the 9th and 10th positions
in the second row of the plot array. Of course I can't see what you
actually have plotted, so this is but a desperate guess. I apologize for
the complicated answer which is probably no use whatever, but without data
and hopefully the output of your code, I am unable to read your mind.
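
The position arithmetic above can be sketched in a few lines of R. This is only an illustration: the counts of 38, 10 and 2 series are the ones described above, and the variable names are made up for the example.

```r
slots_per_page   <- 5 * 4            # par(mfrow=c(5,4)) gives 20 slots
series_per_sheet <- c(38, 10, 2)     # series per sheet, as described above
last_plot <- cumsum(series_per_sheet)

# slot in the 5x4 array where the last plot of each sheet lands
last_slot <- (last_plot - 1) %% slots_per_page + 1
last_slot
```

Each time the running count passes a multiple of 20, plotting starts again at slot 1.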

Jim


On Fri, Dec 14, 2018 at 4:26 PM Subhamitra Patra <subhamitra.patra at gmail.com>
wrote:

  
  
#
Hello Sir,

I am extremely sorry for the late reply.

Ok now, I am sending my data and output, and would like to discuss my
queries one by one.

This is my final code.

pdf("EMs.pdf",width=20,height=20)
par(mfrow=c(5,4))
# import your first sheet here (16 columns)
ncolumns<-ncol(EMs1.1)
for(i in 1:ncolumns)
  plot(EMs1.1[,i],type="l",col = "Red", xlab="Time",
       ylab="APEn", main=names(EMs1.1)[i])
#import your second sheet here, (1 column)
ncolumns<-ncol(EMs2.1)
for(i in 1:ncolumns)
  plot(EMs2.1[,i],type="l",col = "Red", xlab="Time",
       ylab="APEn", main=names(EMs2.1)[i])
# import your Third sheet here, (1 column)
ncolumns<-ncol(EMs3.1)
for(i in 1:ncolumns)
  plot(EMs3.1[,i],type="l",col = "Red", xlab="Time",
       ylab="APEn", main=names(EMs3.1)[i])
# import your fourth sheet here, (1 column)
ncolumns<-ncol(EMs4.1)
for(i in 1:ncolumns)
  plot(EMs4.1[,i],type="l",col = "Red", xlab="Time",
       ylab="APEn", main=names(EMs4.1)[i])
# finish plotting
dev.off()

With this code, I obtained the attached results. I have attached the data
files (a rar archive containing 4 Excel files) and the output of my code.

*My first query is:*
I have daily data, and I produced the plots without defining a date
column. As a result, the X-axis of the plots (in the attached result PDF
file) shows the observation number. I now need dates on the X-axis, matched
to the corresponding data. The data run from 03-01-1994 to 03-08-2017
(day-month-year), excluding the 2 non-trading days per week.

I know that the frequency for defining yearly data is 1. So I tried adding
the following code to the plotting code, but I am not sure whether it gives
the appropriate plot:

*library(zoo)*
*y=zoo(EMs1, seq(from = as.Date("1994-01-01"), to = as.Date("2017-08-03"),
by = 1))*

*Therefore, kindly suggest how to get dates on the X-axis instead of the
observation number.*




[image: Mailtrack]
<https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_campaign=signaturevirality5&>
Sender
notified by
Mailtrack
<https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_campaign=signaturevirality5&>
12/15/18,
10:39:06 AM
On Fri, Dec 14, 2018 at 2:48 PM Jim Lemon <drjimlemon at gmail.com> wrote:

            

  
    
#
Hi Subhamitra,
Thanks. Now I can provide some assistance instead of just complaining. Your
first problem is the temporal extent of the data. There are 8613 days and
6154 weekdays between the two dates you list, but only 5655 observations in
your data. Therefore it is unlikely that you have a complete data series,
or perhaps you have the wrong dates. For the moment I'll assume that there
are missing observations. What I am going to do is to match the 24 years
(1994-2017) to their approximate positions in the time series. This will
give you the x-axis labels that you want, close enough for this
illustration. I doubt that you will need anything more accurate. You have a
span of 24.58 years, which means that if your missing observations are
uniformly distributed, you will have almost exactly 226 observations per
year. When I tried this, I got too many intervals, so I increased the
increment to 229 and that worked. To get the positions for the middle of
each year in the indices of the data:

year_mids<-seq(182,5655,by=229)

Now I suppress the x-axis by adding xaxt="n" to each call to plot. Then I
add a command to display the years at the positions I have calculated:

axis(1,at=year_mids,labels=1994:2017)
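
A quick way to see why 229 works and 226 does not is to count the ticks each increment produces and compare that with the number of year labels. The sketch below uses only the numbers quoted above; the variable names are made up for the illustration.

```r
nobs  <- 5655
years <- 1994:2017

ticks_226 <- seq(182, nobs, by = 226)
ticks_229 <- seq(182, nobs, by = 229)

# axis() needs as many tick positions as labels
c(labels = length(years),
  by_226 = length(ticks_226),
  by_229 = length(ticks_229))
```

Only the increment that yields the same count as the labels can be passed to axis() without an error.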

Also note that I have added braces to the "for" loop. Putting it all
together:

year_mids<-seq(182,5655,by=229)
pdf("EMs.pdf",width=20,height=20)
par(mfrow=c(5,4))
# import your first sheet here (16 columns)
EMs1.1<-read.csv("EMs1.1.csv")
ncolumns<-ncol(EMs1.1)
for(i in 1:ncolumns) {
  plot(EMs1.1[,i],type="l",col = "Red", xlab="Time",
       ylab="APEn", main=names(EMs1.1)[i],xaxt="n")
 axis(1,at=year_mids,labels=1994:2017)
}
#import your second sheet here, (1 column)
EMs2.1<-read.csv("EMs2.1.csv")
ncolumns<-ncol(EMs2.1)
for(i in 1:ncolumns) {
  plot(EMs2.1[,i],type="l",col = "Red", xlab="Time",
       ylab="APEn", main=names(EMs2.1)[i],xaxt="n")
 axis(1,at=year_mids,labels=1994:2017)
}
# import your Third sheet here, (1 column)
EMs3.1<-read.csv("EMs3.1.csv")
ncolumns<-ncol(EMs3.1)
for(i in 1:ncolumns) {
  plot(EMs3.1[,i],type="l",col = "Red", xlab="Time",
       ylab="APEn", main=names(EMs3.1)[i],xaxt="n")
 axis(1,at=year_mids,labels=1994:2017)
}
# import your fourth sheet here, (1 column)
EMs4.1<-read.csv("EMs4.1.csv")
ncolumns<-ncol(EMs4.1)
for(i in 1:ncolumns) {
  plot(EMs4.1[,i],type="l",col = "Red", xlab="Time",
       ylab="APEn", main=names(EMs4.1)[i],xaxt="n")
 axis(1,at=year_mids,labels=1994:2017)
}
# finish plotting
dev.off()

With any luck, you are now okay. Remember, this is a hack to deal with data
that are not what you think they are.
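
Since the four loops differ only in the data frame, the repetition could be folded into one small function. This is a minimal sketch, not part of the code above: plot_sheet is a hypothetical helper, the data frame is simulated because the real CSV files are not reproduced here, and pdf(NULL) is used so that nothing is written to disk.

```r
# Hypothetical helper: draw every column of one data frame as a line chart
# with year labels at approximate positions
plot_sheet <- function(dat, tick_at, tick_labels) {
  stopifnot(length(tick_at) == length(tick_labels))
  for (i in seq_len(ncol(dat))) {
    plot(dat[, i], type = "l", col = "red", xlab = "Time",
         ylab = "APEn", main = names(dat)[i], xaxt = "n")
    axis(1, at = tick_at, labels = tick_labels)
  }
  invisible(ncol(dat))   # number of panels drawn
}

# Simulated stand-in for EMs1.1
set.seed(1)
demo <- data.frame(A = runif(5655), B = runif(5655))

pdf(NULL)                # null device; use pdf("EMs.pdf", ...) for a real file
par(mfrow = c(5, 4))
n_drawn <- plot_sheet(demo, seq(182, 5655, by = 229), 1994:2017)
dev.off()
```

The stopifnot() line catches a mismatch between tick positions and labels before axis() is called.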

Jim
#
Thank you very much, Sir. Actually, I excluded all the non-trading days.
Therefore, each year will have 226 observations, with 6154 observations in
total for each column. The data which I plotted are not the raw data. I
computed rolling estimates with a window of 500 from my original data, so
the number of observations in each resulting column is (6154-500)+1=5655.
So it does not match the number of calendar days in each year.

OK, Sir, I will go through your suggestion, obtain the results for each
column of my data, and would then like to discuss the results with you.
After solving this problem, I would like to discuss another 2 queries.

Thank you very much Sir for educating a new R learner.

On Sun, Dec 16, 2018 at 8:10 AM Jim Lemon <drjimlemon at gmail.com> wrote:

            

  
    
#
Hello Sir,

I have three queries regarding your suggested code.

*1.* In my last email, I explained why there are missing observations in my
data series. In the line *year_mids<-seq(182,5655,by=229)*:

*A. What does 182 indicate, and what is the logic behind an increment of
229 when there are 226 observations per year?*
*B. Each Excel file has a different number of observations because the
starting dates differ. So, is it necessary to put year_mids inside the
loop? I think I need to adjust the year_mids object each time after
importing an individual Excel file. If I am wrong, kindly correct me.*

*2.* Further, in the command *axis(1,at=year_mids,labels=1994:2017)*, the 1
indicates the increment between year labels, right?

Kindly clarify my queries, Sir, for which I shall always be grateful to you.

Thank you very much.


On Sun, Dec 16, 2018 at 12:24 PM Subhamitra Patra <
subhamitra.patra at gmail.com> wrote:

            

  
    
#
Hi Subhamitra,
As I said, the code I sent is an approximation to get your year labels in
about the correct places. You are welcome to improve the calculations.

182 days is about half a year, so that the first "tick" will fall around
the end of June (i.e. the middle of the year). If you specify the increment
as 226, you get one too many labels. 229 is what is known as a kludge (a
clumsy solution that works).

Yes, I mistakenly thought that the observations were the same throughout
the four files. As you know this (and I didn't) you can do a better job of
placing the year labels by changing the sequence for each of the CSV (not
Excel) files. The best method of all would be to have a date for each
observation. You could then discard all these approximations I have made to
get the plots to work.
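
As a rough sketch of what "a date for each observation" could look like, the code below rebuilds a date vector and plots against it. It rests on two assumptions that should be checked against the real data: that the non-trading days are exactly Saturdays and Sundays, and that each rolling value is dated by the last day of its 500-observation window. The data column is simulated.

```r
# Assumption: the non-trading days are exactly Saturdays and Sundays
all_days <- seq(as.Date("1994-01-03"), as.Date("2017-08-03"), by = "day")
wday <- as.POSIXlt(all_days)$wday            # 0 = Sunday, 6 = Saturday
trading_days <- all_days[!(wday %in% c(0, 6))]

# Assumption: each rolling value is dated by the last day of its window
window <- 500
obs_dates <- trading_days[window:length(trading_days)]

# Simulated stand-in for one data column
set.seed(42)
apen <- runif(length(obs_dates))

pdf(NULL)   # null device for the sketch; use pdf("EMs.pdf", ...) for a real file
plot(obs_dates, apen, type = "l", col = "red", xlab = "Time", ylab = "APEn")
dev.off()
```

When the x values are Date objects, plot() labels the axis with dates, so no hand-placed year ticks are needed.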

No, the arguments of the axis function are:

axis(<side of plot>, <position of ticks>, <labels for the ticks>)

The first argument is the side of the plot: 1=bottom, 2=left, 3=top, 4=right.
The next two arguments must be the same length. If not, you will get an
error. As you can see in the plots, only every other tick has a label, to
avoid crowding. There are ways to get more tick labels on an axis.

Jim


On Sun, Dec 16, 2018 at 7:03 PM Subhamitra Patra <subhamitra.patra at gmail.com>
wrote:

  
  
#
Hello Sir,

Thank you very much for your excellent guidance to a new R learner.

I tried your suggested code and got the expected results, but for 2 of the
CSV files (i.e. EMs2.1 and EMs3.1), the dates do not appear on the X-axis
(shown in the last row of the attached result PDF file). I think I need to
use an increment larger or smaller than 229 in year_mids, because the
starting dates are 03-01-2002 and 04-07-2001 (day-month-year) for EMs2.1
and EMs3.1 respectively. *Sir, I am therefore quite confused about the
logic behind setting year_mids.* For your convenience, I am attaching both
the code and the result file.

pdf("EMs1.pdf",width=20,height=20)
par(mfrow=c(5,4))
# import your first sheet here (16 columns)
EMs1.1<-read.csv("EMs1.1.csv")
ncolumns<-ncol(EMs1.1)
for(i in 1:ncolumns) {
  plot(EMs1.1[,i],type="l",col = "Red", xlab="Time",
       ylab="APEn", main=names(EMs1.1)[i],xaxt="n")
  year_mids<-seq(182,5655,by=229)
  axis(1,at=year_mids,labels=1994:2017)
}
#import your second sheet here, (1 column)
EMs2.1<-read.csv("EMs2.1.csv")
ncolumns<-ncol(EMs2.1)
for(i in 1:ncolumns) {
  plot(EMs2.1[,i],type="l",col = "Red", xlab="Time",
       ylab="APEn", main=names(EMs2.1)[i],xaxt="n")
  year_mids<-seq(182,3567,by=229)
  axis(1,at=year_mids,labels=2002:2017)
}
# import your Third sheet here, (1 column)
EMs3.1<-read.csv("EMs3.1.csv")
ncolumns<-ncol(EMs3.1)
for(i in 1:ncolumns) {
  plot(EMs3.1[,i],type="l",col = "Red", xlab="Time",
       ylab="APEn", main=names(EMs3.1)[i],xaxt="n")
  year_mids<-seq(182,3698,by=229)
  axis(1,at=year_mids,labels=2001:2017)
}
# import your fourth sheet here, (1 column)
EMs4.1<-read.csv("EMs4.1.csv")
ncolumns<-ncol(EMs4.1)
for(i in 1:ncolumns) {
  plot(EMs4.1[,i],type="l",col = "Red", xlab="Time",
       ylab="APEn", main=names(EMs4.1)[i],xaxt="n")
  year_mids<-seq(182,5265,by=229)
  axis(1,at=year_mids,labels=1995:2017)
}
# finish plotting
dev.off()


Sir, you suggested: *"you can do a better job of placing the year labels by
changing the sequence for each of the CSV (not Excel) files. The best method
of all would be to have a date for each observation. You could then discard
all these approximations I have made to get the plots to work."* When I add
the dates (i.e. day-month-year) as labels:

  year_mids<-seq(182,5655,by=229)
  axis(1,at=year_mids,labels=03-01-1994:03-08-2017)
}

*I get an error.*

Kindly suggest.

Thank you very much.





On Sun, Dec 16, 2018 at 2:58 PM Jim Lemon <drjimlemon at gmail.com> wrote:

            

  
    
#
Hi Subhamitra,
As for the error that you mention, it was probably:

Error in axis(1, at = year_mids, labels = 3 - 1 - 1994:3 - 8 - 2017) :
  'at' and 'labels' lengths differ, 24 != 1992

Anything more than a passing glance reveals that you didn't read the
explanation I sent about the arguments passed to the "axis" function.
Perhaps it will be rewarding to read the help page for the "axis" function
in the "graphics" package.
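
The 1992 in that message follows from operator precedence: in R, ":" binds more tightly than binary "-", so the labels expression is evaluated as 3 - 1 - (1994:3) - 8 - 2017. The short check below shows this, and also compares the number of ticks with the number of labels for the per-file sequences in the code quoted earlier in this thread. The vector names are made up for the illustration.

```r
# ":" has higher precedence than binary "-", so this is
# 3 - 1 - (1994:3) - 8 - 2017: one value per element of 1994:3
bad_labels <- 3 - 1 - 1994:3 - 8 - 2017
length(bad_labels)

# ticks versus labels for the sequences used for EMs2.1, EMs3.1 and EMs4.1
n_ticks  <- c(EMs2.1 = length(seq(182, 3567, by = 229)),
              EMs3.1 = length(seq(182, 3698, by = 229)),
              EMs4.1 = length(seq(182, 5265, by = 229)))
n_labels <- c(EMs2.1 = length(2002:2017),
              EMs3.1 = length(2001:2017),
              EMs4.1 = length(1995:2017))
rbind(n_ticks, n_labels)
```

Where the two rows differ, axis() fails the same length check, which would be consistent with the axes missing from those plots.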

Your confusion about the logic (really simple arithmetic) of assigning
positions for the year labels may be allayed by the following. Think back
to those grade school problems that read:

 "If I have m apples to give to n people, how many must I give each person
so that all will receive the same number and I will have the fewest apples
left?"

I'm sure that you remember that this can be solved in a number of ways. You
can divide m/n and drop the remainder. So, from 03-01-2002 to 03-08-2017 in
EMs2.1:

diff(as.Date(c("03-01-2002","03-08-2017"),"%d-%m-%Y"))
Time difference of 5691 days
# plus 1 for all of the days included
# calculate the number of years
5692/365.25
[1] 15.58385

So if there had been an observation each day, you would have the trivial
task of dividing the number of days by the number of years to get the tick
increments:

5692/15.58385
[1] 365.2499

Of course you don't have that many observations and you are trying to get
the number of observations, not days, in each year. By making the
assumption that the missing observations are spread evenly over the years,
you can simply replace the number of days with the number of observations.
At the moment I don't have that as I unrared your data at home. But you do
have it and I will call it nobs:

# this calculates the number of observations per year
nobs/15.58385
<obs_per_year>

will yield the number of observations in each year. So you have your tick
increments. Now for the offset. If you want the year ticks to appear at the
middle of each year, you will want to start at 182 minus the two days
missing in January or 180. So your new year_mids will be:

year_mids<-seq(180,nobs,obs_per_year)

Your years are 2002:2017 for EMs2.1, so:

axis(1,year_mids,2002:2017)

may well be what you want for axis ticks. As you can see, the "m apples to
n people" approach gives you the answer. The only missing part was the
offset, or where to start handing out apples. You might want to have
another look at the help pages for "axis" and "seq" (or ":") which will
show you why your axis command failed badly. Good luck.

Jim

On Mon, Dec 17, 2018 at 6:12 PM Subhamitra Patra <subhamitra.patra at gmail.com>
wrote:

  
  
#
Hi Subhamitra,
My apologies, I caught a mistake. To have the first tick in the middle of
the first year, you want half of the _observations_ in a year, not half of
the days. As I now have your data at my fingertips:

3567/15.58385
[1] 228.8908

Almost exactly what was calculated for the first series. Your increment
remains 229 and your offset is 114, so

year_mids<-seq(114,3567,229)
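
The same arithmetic can be wrapped up so it is repeated for each file. This is a sketch under the assumption already stated in this thread, that the missing observations are spread evenly over the years; year_ticks is a hypothetical helper, not an existing R function, and it keeps the unrounded increment rather than 229.

```r
# Hypothetical helper: approximate positions of mid-year ticks, assuming the
# observations are spread evenly over the calendar span
year_ticks <- function(nobs, first_date, last_date) {
  n_days  <- as.numeric(last_date - first_date) + 1   # both end dates included
  n_years <- n_days / 365.25
  obs_per_year <- nobs / n_years
  seq(obs_per_year / 2, nobs, by = obs_per_year)
}

# EMs2.1: 3567 observations from 03-01-2002 to 03-08-2017
ticks <- year_ticks(3567,
                    as.Date("03-01-2002", "%d-%m-%Y"),
                    as.Date("03-08-2017", "%d-%m-%Y"))

# compare with length(2002:2017) before calling axis(1, ticks, 2002:2017)
length(ticks)
```

If the number of ticks and the number of year labels disagree for a file, the first or last year label needs adjusting before the call to axis().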

Jim
#
Hello Sir,

It is really great learning for me while discussing with you.

As you suggested, I also read the help page for the axis function in the
graphics package, and I now completely understand your logic. I will apply
the same logic to my remaining variables and would like to discuss further
after all the plots have been generated successfully.

Thank you very much, Sir, for pointing me to the right path.



On Tue, Dec 18, 2018 at 3:31 PM Jim Lemon <drjimlemon at gmail.com> wrote: