Skip to content

Getting an unexpected extra row when merging two dataframes

2 messages · Paul Bernal, jim holtman

#
Hello everyone,

Hope you are all doing great. So I have two datasets:

-dataset1Frame: which contains the historical number of transits from
october 1st, 1985 up to march 1, 2017. It has two columns, one called
TransitDate and the other called Transits. dataset1Frame is a table comming
from an SQL Server Database.

-TransitDateFrame: a made up dataframe that goes from october 1st, 1985 up
to the last date available in dataset1Frame.

Note: The reason why I made up TransitDataFrame is because, since sometimes
dataset1Frame has missing observations (some dates do not exist), and I
just want to make sure I have all the dates available from october 1, 1985
up to the last available observation.
The idea is to leave the transits that do exist as they come, and add the
dates missing as aditional rows (observations) with a value of NA for the
transits.

That being said, here is the code:
verbose=TRUE)
as.Date(dataset1Frame[nrow(dataset1Frame),1]), "months")
#Output from resulting dataframes
Why is there an additional row(observation) with a value of NA if I
specified that the dataframe should only go to the last observation? There
should be 378 observations at the end and I get 379 observations instead.

The reason I am doing it this way is because this is how I got to fill in
the gaps in dates (whenever there are nonexistent observations/missing
data).

Any guidance will be greatly appreciated.

I am attaching a .txt file as a reference,

Best regards,

Paul
-------------- next part --------------
TransitDate	Transits
1-Oct-85	55
1-Nov-85	66
1-Dec-85	14
1-Jan-86	48
1-Feb-86	57
1-Mar-86	49
1-Apr-86	70
1-May-86	19
1-Jun-86	27
1-Jul-86	28
1-Aug-86	27
1-Sep-86	66
1-Oct-86	26
1-Nov-86	52
1-Dec-86	29
1-Jan-87	59
1-Feb-87	47
1-Mar-87	59
1-Apr-87	46
1-May-87	39
1-Jun-87	11
1-Jul-87	42
1-Aug-87	14
1-Sep-87	38
1-Oct-87	34
1-Nov-87	21
1-Dec-87	39
1-Jan-88	18
1-Feb-88	67
1-Mar-88	35
1-Apr-88	49
1-May-88	36
1-Jun-88	22
1-Jul-88	69
1-Aug-88	69
1-Sep-88	33
1-Oct-88	26
1-Nov-88	43
1-Dec-88	11
1-Jan-89	46
1-Feb-89	22
1-Mar-89	53
1-Apr-89	46
1-May-89	49
1-Jun-89	64
1-Jul-89	16
1-Aug-89	31
1-Sep-89	22
1-Oct-89	37
1-Nov-89	32
1-Dec-89	60
1-Jan-90	65
1-Feb-90	72
1-Mar-90	14
1-Apr-90	35
1-May-90	25
1-Jun-90	9
1-Jul-90	56
1-Aug-90	47
1-Sep-90	62
1-Oct-90	39
1-Nov-90	36
1-Dec-90	60
1-Jan-91	48
1-Feb-91	56
1-Mar-91	42
1-Apr-91	32
1-May-91	28
1-Jun-91	64
1-Jul-91	30
1-Aug-91	12
1-Sep-91	29
1-Oct-91	63
1-Nov-91	38
1-Dec-91	64
1-Jan-92	16
1-Feb-92	70
1-Mar-92	49
1-Apr-92	34
1-May-92	34
1-Jun-92	69
1-Jul-92	11
1-Aug-92	20
1-Sep-92	17
1-Oct-92	51
1-Nov-92	41
1-Dec-92	18
1-Jan-93	16
1-Feb-93	15
1-Mar-93	43
1-Apr-93	58
1-May-93	43
1-Jun-93	49
1-Jul-93	69
1-Aug-93	45
1-Sep-93	68
1-Oct-93	19
1-Nov-93	18
1-Dec-93	30
1-Jan-94	43
1-Feb-94	28
1-Mar-94	13
1-Apr-94	47
1-May-94	71
1-Jun-94	67
1-Jul-94	28
1-Aug-94	44
1-Sep-94	61
1-Oct-94	73
1-Nov-94	68
1-Dec-94	61
1-Jan-95	34
1-Feb-95	70
1-Mar-95	16
1-Apr-95	68
1-May-95	40
1-Jun-95	60
1-Jul-95	69
1-Aug-95	53
1-Sep-95	20
1-Oct-95	28
1-Nov-95	61
1-Dec-95	41
1-Jan-96	25
1-Feb-96	40
1-Mar-96	31
1-Apr-96	24
1-May-96	57
1-Jun-96	22
1-Jul-96	44
1-Aug-96	19
1-Sep-96	24
1-Oct-96	50
1-Nov-96	46
1-Dec-96	50
1-Jan-97	22
1-Feb-97	31
1-Mar-97	32
1-Apr-97	45
1-May-97	19
1-Jun-97	40
1-Jul-97	22
1-Aug-97	60
1-Sep-97	48
1-Oct-97	68
1-Nov-97	45
1-Dec-97	42
1-Jan-98	33
1-Feb-98	70
1-Mar-98	41
1-Apr-98	67
1-May-98	24
1-Jun-98	29
1-Jul-98	73
1-Aug-98	50
1-Sep-98	13
1-Oct-98	11
1-Nov-98	28
1-Dec-98	31
1-Jan-99	35
1-Feb-99	22
1-Mar-99	22
1-Apr-99	52
1-May-99	38
1-Jun-99	41
1-Jul-99	64
1-Aug-99	57
1-Sep-99	66
1-Oct-99	13
1-Nov-99	10
1-Dec-99	16
1-Jan-00	31
1-Feb-00	16
1-Mar-00	38
1-Apr-00	50
1-May-00	44
1-Jun-00	19
1-Jul-00	61
1-Aug-00	63
1-Sep-00	12
1-Oct-00	68
1-Nov-00	65
1-Dec-00	20
1-Jan-01	58
1-Feb-01	73
1-Mar-01	57
1-Apr-01	23
1-May-01	50
1-Jun-01	71
1-Jul-01	48
1-Aug-01	35
1-Sep-01	42
1-Oct-01	9
1-Nov-01	42
1-Dec-01	29
1-Jan-02	57
1-Feb-02	63
1-Mar-02	49
1-Apr-02	44
1-May-02	39
1-Jun-02	63
1-Jul-02	35
1-Aug-02	11
1-Sep-02	26
1-Oct-02	43
1-Nov-02	62
1-Dec-02	27
1-Jan-03	69
1-Feb-03	22
1-Mar-03	56
1-Apr-03	21
1-May-03	43
1-Jun-03	58
1-Jul-03	52
1-Aug-03	18
1-Sep-03	12
1-Oct-03	32
1-Nov-03	12
1-Dec-03	58
1-Jan-04	27
1-Feb-04	27
1-Mar-04	51
1-Apr-04	33
1-May-04	66
1-Jun-04	61
1-Jul-04	53
1-Aug-04	68
1-Sep-04	29
1-Oct-04	22
1-Nov-04	10
1-Dec-04	60
1-Jan-05	59
1-Feb-05	68
1-Mar-05	24
1-Apr-05	69
1-May-05	43
1-Jun-05	32
1-Jul-05	37
1-Aug-05	13
1-Sep-05	34
1-Oct-05	13
1-Nov-05	42
1-Dec-05	69
1-Jan-06	67
1-Feb-06	34
1-Mar-06	56
1-Apr-06	40
1-May-06	58
1-Jun-06	57
1-Jul-06	36
1-Aug-06	44
1-Sep-06	47
1-Oct-06	45
1-Nov-06	41
1-Dec-06	12
1-Jan-07	56
1-Feb-07	24
1-Mar-07	29
1-Apr-07	22
1-May-07	19
1-Jun-07	69
1-Jul-07	47
1-Aug-07	51
1-Sep-07	30
1-Oct-07	25
1-Nov-07	55
1-Dec-07	73
1-Jan-08	10
1-Feb-08	20
1-Mar-08	16
1-Apr-08	30
1-May-08	48
1-Jun-08	48
1-Jul-08	58
1-Aug-08	59
1-Sep-08	26
1-Oct-08	70
1-Nov-08	51
1-Dec-08	33
1-Jan-09	52
1-Feb-09	15
1-Mar-09	54
1-Apr-09	63
1-May-09	50
1-Jun-09	33
1-Jul-09	43
1-Aug-09	25
1-Sep-09	39
1-Oct-09	9
1-Nov-09	71
1-Dec-09	64
1-Jan-10	62
1-Feb-10	61
1-Mar-10	54
1-Apr-10	38
1-May-10	45
1-Jun-10	55
1-Jul-10	52
1-Aug-10	49
1-Sep-10	43
1-Oct-10	19
1-Nov-10	22
1-Dec-10	53
1-Jan-11	70
1-Feb-11	36
1-Mar-11	49
1-Apr-11	42
1-May-11	39
1-Jun-11	29
1-Jul-11	73
1-Aug-11	19
1-Sep-11	24
1-Oct-11	28
1-Nov-11	66
1-Dec-11	67
1-Jan-12	67
1-Feb-12	21
1-Mar-12	59
1-Apr-12	45
1-May-12	24
1-Jun-12	9
1-Jul-12	61
1-Aug-12	38
1-Sep-12	42
1-Oct-12	66
1-Nov-12	22
1-Dec-12	73
1-Jan-13	48
1-Feb-13	63
1-Mar-13	47
1-Apr-13	38
1-May-13	54
1-Jun-13	30
1-Jul-13	30
1-Aug-13	52
1-Sep-13	21
1-Oct-13	52
1-Nov-13	73
1-Dec-13	24
1-Jan-14	67
1-Feb-14	13
1-Mar-14	51
1-Apr-14	49
1-May-14	22
1-Jun-14	33
1-Jul-14	15
1-Aug-14	24
1-Sep-14	23
1-Oct-14	40
1-Nov-14	51
1-Dec-14	46
1-Jan-15	13
1-Feb-15	73
1-Mar-15	63
1-Apr-15	67
1-May-15	59
1-Jun-15	29
1-Jul-15	56
1-Aug-15	73
1-Sep-15	51
1-Oct-15	40
1-Nov-15	17
1-Dec-15	51
1-Jan-16	41
1-Feb-16	64
1-Mar-16	31
1-Apr-16	15
1-May-16	13
1-Jun-16	53
1-Jul-16	20
1-Aug-16	11
1-Sep-16	43
1-Oct-16	68
1-Nov-16	70
1-Dec-16	14
1-Jan-17	45
1-Feb-17	27
1-Mar-17	40
#
first of all when you read the data in you get 379 rows of data since
you did not say 'header = TRUE' in the read.table.  Here is what the
first 6 lines of you data are:
'data.frame':   379 obs. of  2 variables:
 $ V1: Factor w/ 379 levels "1-Apr-00","1-Apr-01",..: 379 333 301 80
145 113 239 18 270 207 ...
 $ V2: Factor w/ 66 levels "10","11","12",..: 66 46 57 5 39 48 40 61 10 18 ...
V1       V2
1 TransitDate Transits
2    1-Oct-85       55
3    1-Nov-85       66
4    1-Dec-85       14
5    1-Jan-86       48
6    1-Feb-86       57
You need to learn to use 'str' to look at the structure.  So when you
are converting the dates, you will get an NA because the first row has
"TransitDate".  Now if you had used 'header = TRUE', you data would
look like this:
+                     header = TRUE,
+                     as.is = TRUE  # prevent conversion to factors
+                     )
'data.frame':   378 obs. of  2 variables:
 $ TransitDate: chr  "1-Oct-85" "1-Nov-85" "1-Dec-85" "1-Jan-86" ...
 $ Transits   : int  55 66 14 48 57 49 70 19 27 28 ...
TransitDate Transits
1    1-Oct-85       55
2    1-Nov-85       66
3    1-Dec-85       14
4    1-Jan-86       48
5    1-Feb-86       57
6    1-Mar-86       49
So try again.

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
On Wed, Mar 29, 2017 at 11:02 AM, Paul Bernal <paulbernal07 at gmail.com> wrote: