On Jan 4, 2017, at 7:40 PM, David Wolfskill <r at catwhisker.org> wrote:
I have (time series) data extracted from a repository that is stored
such that each record is for an hour, but each record contains an
ordered set of values throughout the hour. In the following exmaple,
I'll show sets of 4, depicting 0, 15, 30, and 45 minutes after each
"start" point (respectively):
test_data <- structure(list(start = c(1482793200L, 1482793200L, 1482793200L,
1482793200L, 1482793200L, 1482793200L, 1482793200L, 1482793200L,
1482793200L, 1482793200L, 1482793200L, 1482793200L, 1482793200L,
1482793200L, 1482793200L), hostname = c("c001.example.net", "c001.example.net",
"c001.example.net", "c001.example.net", "c001.example.net", "c001.example.net",
"c001.example.net", "c001.example.net", "c001.example.net", "c001.example.net",
"c001.example.net", "c161.example.net", "c161.example.net", "c161.example.net",
"c161.example.net"), mtype = c("health", "health", "net", "net",
"net", "net", "net", "net", "net", "sys", "sys", "net", "sys",
"sys", "sys"), limit_type = c("fill", "serve", "", "", "", "",
"", "", "", "", "", "", "", "", ""), hw = c(1.16, 1.16, 1.16,
1.16, 1.16, 1.16, 1.16, 1.16, 1.16, 1.16, 1.16, 1.21, 1.21, 1.21,
1.21), fw = c("2017Q1.1.1", "2017Q1.1.1", "2017Q1.1.1", "2017Q1.1.1",
"2017Q1.1.1", "2017Q1.1.1", "2017Q1.1.1", "2017Q1.1.1", "2017Q1.1.1",
"2017Q1.1.1", "2017Q1.1.1", "2016Q4.2.13", "2016Q4.2.13", "2016Q4.2.13",
"2016Q4.2.13"), tcp_state = c("", "", "", "", "closed", "closing",
"fin_wait_2", "last_ack", "syn_rcvd", "", "", "", "", "", ""),
value_type = c("limit", "limit", "", "", "", "", "", "",
"", "", "", "", "", "", ""), nic = c("all", "all", "", "",
"", "", "", "", "", "", "", "mce0", "", "", ""), name = c("in_download_window",
"in_download_window", "tcpOutSegs", "tcpRetransSegs", "tcp_connection_count",
"tcp_connection_count", "tcp_connection_count", "tcp_connection_count",
"tcp_connection_count", "CpuSystem", "CpuUser", "HCOutOctets",
"CpuIdle", "CpuSystem", "CpuUser"), values = c("[0.0, 0.0, 0.0, 0.0]",
"[0.0, 0.0, 0.0, 0.0]", "[260410.94547698632, 258469.54433635762, 260579.2186617577, 258763.2815145043]",
"[18436.311524062934, 18248.952271420356, 18201.62259198662, 17818.39529178736]",
"[5.0, 3.0, 3.0, 3.0]", "[3.0, 3.0, 2.0, 2.0]", "[670.0, 677.0, 685.0, 729.0]",
"[1162.0, 1192.0, 1148.0, 1110.0]", "[25.0, 60.0, 71.0, 33.0]",
"[11.0, 10.0, 11.0, 10.0]", "[2.0, 2.0, 2.0, 2.0]", "[7.873191635959294E9, 7.7377184658927E9, 7.876630519328283E9, 7.714521544912713E9]",
"[70.0, 70.0, 70.0, 70.0]", "[27.0, 26.0, 27.0, 26.0]", "[4.0, 4.0, 4.0, 4.0]"
)), .Names = c("start", "hostname", "mtype", "limit_type",
"hw", "fw", "tcp_state", "value_type", "nic", "name", "values"
), class = "data.frame", row.names = c(NA, -15L))
So of the 15 rows in the above example, row 8 (which depicts the TCP
connection counts in the "last ACK" state) has the values:
* 1162.0
* 1192.0
* 1148.0
* 1110.0
It seems to me that what is wanted is for each of the existing rows to be
replaced by a set of 4 (in this case) rows, where the other columns are
the same (save for "start", which is a timestamp, and should be adjusted
for the respective times).