Hi R-Helpers,
I am taking it upon myself to delve into the world of lists for R. In no small part because I appear to have discovered a source of data for an exceptionally good price but that delivers much of that data in json format.
So over the last day or so I managed to fight the list processing tools to a draw and get a list that has only selected elements (actually it ends up in matrix form). But when I try to convert that to a data frame I can't get it to a form that is workable.
I have visited some pages about converting a matrix to a data frame but they result in highly redundant and inelegant data.
I am thinking that someone who works with lists and matrices knows how to do this quite easily and would be willing to provide a solution.
The reproducible example is shown below. Just to be explicit, what I am trying to get to is something along the lines of a data frame like this.
Date netIncome Gross Profit
2020-09-30 57411000000 104956000000
2019-09-30 55256000000 98392000000
.....
The closest I get is a matrix that looks like this
wanted
2020-09-30 2019-09-30 2018-09-30 2017-09-30 2016-09-30 2015-09-30 2014-09-30
date "2020-09-30" "2019-09-30" "2018-09-30" "2017-09-30" "2016-09-30" "2015-09-30" "2014-09-30"
netIncome "57411000000.00" "55256000000.00" "59531000000.00" "48351000000.00" "45687000000.00" "53394000000.00" "39510000000.00"
grossProfit "104956000000.00" "98392000000.00" "101839000000.00" "88186000000.00" "84263000000.00" "93626000000.00" "70537000000.00"
Code for example
library(jsonlite)
test <- jsonlite::fromJSON("https://eodhistoricaldata.com/api/fundamentals/AAPL.US?api_token=OeAFFmMliFG5orCUuwAKQ8l4WWFQ67YX")
hist<-test[[13]]
ISY<-hist$Income_Statement$yearly
wanted<-sapply(ISY, "[", j = c("date","netIncome","grossProfit"))
Your guidance would be much appreciated.
--John J. Sparks, Ph.D.
'data.frame': 36 obs. of 3 variables:
$ Date : Date, format: "2020-09-30" "2019-09-30" "2018-09-30"
"2017-09-30" ...
$ netIncome : num 5.74e+10 5.53e+10 5.95e+10 4.84e+10 4.57e+10 ...
$ Gross Profit: num 1.05e+11 9.84e+10 1.02e+11 8.82e+10 8.43e+10 ...
with(df, plot(Date, netIncome))
On Thu, Jul 1, 2021 at 5:35 PM Sparks, John <jspark4 at uic.edu> wrote:
Hi R-Helpers,
I am taking it upon myself to delve into the world of lists for R. In no
small part because I appear to have discovered a source of data for an
exceptionally good price but that delivers much of that data in json format.
So over the last day or so I managed to fight the list processing tools to
a draw and get a list that has only selected elements (actually it ends up
in matrix form). But when I try to convert that to a data frame I can't
get it to a form that is workable.
I have visited some pages about converting a matrix to a data frame but
they result in highly redundant and inelegant data.
I am thinking that someone who works with lists and matrices knows how to
do this quite easily and would be willing to provide a solution.
The reproducible example is shown below. Just to be explicit, what I am
trying to get to is something along the lines of a data frame like this.
Date netIncome Gross Profit
2020-09-30 57411000000 104956000000
2019-09-30 55256000000 98392000000
.....
The closest I get is a matrix that looks like this
wanted
2020-09-30 2019-09-30 2018-09-30
2017-09-30 2016-09-30 2015-09-30 2014-09-30
date "2020-09-30" "2019-09-30" "2018-09-30"
"2017-09-30" "2016-09-30" "2015-09-30" "2014-09-30"
netIncome "57411000000.00" "55256000000.00" "59531000000.00"
"48351000000.00" "45687000000.00" "53394000000.00" "39510000000.00"
grossProfit "104956000000.00" "98392000000.00" "101839000000.00"
"88186000000.00" "84263000000.00" "93626000000.00" "70537000000.00"
Code for example
library(jsonlite)
test <- jsonlite::fromJSON("
https://eodhistoricaldata.com/api/fundamentals/AAPL.US?api_token=OeAFFmMliFG5orCUuwAKQ8l4WWFQ67YX
")
hist<-test[[13]]
ISY<-hist$Income_Statement$yearly
wanted<-sapply(ISY, "[", j = c("date","netIncome","grossProfit"))
Your guidance would be much appreciated.
--John J. Sparks, Ph.D.
[[alternative HTML version deleted]]
Bill,
A Matrix can only contain one kind of data. I ran your code after modifying
it to be proper and took a transpose to get it risght-side up:
t(wanted)
date netIncome grossProfit
2020-09-30 "2020-09-30" "57411000000.00" "104956000000.00"
2019-09-30 "2019-09-30" "55256000000.00" "98392000000.00"
2018-09-30 "2018-09-30" "59531000000.00" "101839000000.00"
2017-09-30 "2017-09-30" "48351000000.00" "88186000000.00"
That looks better, I think.
So I did this:
wanted <- t(wanted)
It has rownames and colnames and you can make it a data frame easily enough:
mydf <- data.frame(wanted)
But the columns are all character strings, so CONVERT them as you wish:
mydf
date netIncome grossProfit
2020-09-30 2020-09-30 57411000000.00 104956000000.00
2019-09-30 2019-09-30 55256000000.00 98392000000.00
2018-09-30 2018-09-30 59531000000.00 101839000000.00
Your numbers are quite large and may or may not be meant to be integers:
date netIncome grossProfit
2020-09-30 2020-09-30 5.7411e+10 1.04956e+11
2019-09-30 2019-09-30 5.5256e+10 9.83920e+10
2018-09-30 2018-09-30 5.9531e+10 1.01839e+11
2017-09-30 2017-09-30 4.8351e+10 8.81860e+10
2016-09-30 2016-09-30 4.5687e+10 8.42630e+10
2015-09-30 2015-09-30 5.3394e+10 9.36260e+10
The first entries may have something wrong as they become NA when I make
them integers.
The date column is the same as the rownames and is not in a normal vector
format. It shows as a list and you may want to convert it to one of several
formats R supports for dates or a more normal character string.
So here is how I made it a character string:
date netIncome grossProfit
2020-09-30 2020-09-30 5.7411e+10 1.04956e+11
2019-09-30 2019-09-30 5.5256e+10 9.83920e+10
2018-09-30 2018-09-30 5.9531e+10 1.01839e+11
2017-09-30 2017-09-30 4.8351e+10 8.81860e+10
2016-09-30 2016-09-30 4.5687e+10 8.42630e+10
2015-09-30 2015-09-30 5.3394e+10 9.36260e+10
If you want a DATE, it can now be converted again using one of many methods.
Just FYI, numbers that big and rounded might work just as well measured in
millions as in what I did to grossProfit:
date netIncome grossProfit
2020-09-30 2020-09-30 57411 104956
2019-09-30 2019-09-30 55256 98392
2018-09-30 2018-09-30 59531 101839
2017-09-30 2017-09-30 48351 88186
2016-09-30 2016-09-30 45687 84263
2015-09-30 2015-09-30 53394 93626
Some of the numbers are negative though.
If rownames are not needed:
rownames(mydf) <- NULL
head(mydf)
date netIncome grossProfit
1 2020-09-30 57411 104956
2 2019-09-30 55256 98392
3 2018-09-30 59531 101839
4 2017-09-30 48351 88186
5 2016-09-30 45687 84263
6 2015-09-30 53394 93626
It may be easier to work with this, but again, if you need the dates to be
real dates, as in graphing.
Hope that helps.
-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Bill Dunlap
Sent: Thursday, July 1, 2021 9:01 PM
To: Sparks, John <jspark4 at uic.edu>
Cc: r-help at r-project.org
Subject: Re: [R] List / Matrix to Data Frame
Does this do what you want?
'data.frame': 36 obs. of 3 variables:
$ Date : Date, format: "2020-09-30" "2019-09-30" "2018-09-30"
"2017-09-30" ...
$ netIncome : num 5.74e+10 5.53e+10 5.95e+10 4.84e+10 4.57e+10 ...
$ Gross Profit: num 1.05e+11 9.84e+10 1.02e+11 8.82e+10 8.43e+10 ...
with(df, plot(Date, netIncome))
On Thu, Jul 1, 2021 at 5:35 PM Sparks, John <jspark4 at uic.edu> wrote:
Hi R-Helpers,
I am taking it upon myself to delve into the world of lists for R. In
no small part because I appear to have discovered a source of data for
an exceptionally good price but that delivers much of that data in json
format.
So over the last day or so I managed to fight the list processing
tools to a draw and get a list that has only selected elements
(actually it ends up in matrix form). But when I try to convert that
to a data frame I can't get it to a form that is workable.
I have visited some pages about converting a matrix to a data frame
but they result in highly redundant and inelegant data.
I am thinking that someone who works with lists and matrices knows how
to do this quite easily and would be willing to provide a solution.
The reproducible example is shown below. Just to be explicit, what I
am trying to get to is something along the lines of a data frame like
this.
Date netIncome Gross Profit
2020-09-30 57411000000 104956000000
2019-09-30 55256000000 98392000000
.....
The closest I get is a matrix that looks like this
wanted
2020-09-30 2019-09-30 2018-09-30
2017-09-30 2016-09-30 2015-09-30 2014-09-30
date "2020-09-30" "2019-09-30" "2018-09-30"
"2017-09-30" "2016-09-30" "2015-09-30" "2014-09-30"
netIncome "57411000000.00" "55256000000.00" "59531000000.00"
"48351000000.00" "45687000000.00" "53394000000.00" "39510000000.00"
grossProfit "104956000000.00" "98392000000.00" "101839000000.00"
"88186000000.00" "84263000000.00" "93626000000.00" "70537000000.00"
Code for example
library(jsonlite)
test <- jsonlite::fromJSON("
https://eodhistoricaldata.com/api/fundamentals/AAPL.US?api_token=OeAFF
mMliFG5orCUuwAKQ8l4WWFQ67YX
")
hist<-test[[13]]
ISY<-hist$Income_Statement$yearly
wanted<-sapply(ISY, "[", j = c("date","netIncome","grossProfit"))
Your guidance would be much appreciated.
--John J. Sparks, Ph.D.
[[alternative HTML version deleted]]
From: Bill Dunlap <williamwdunlap at gmail.com>
Sent: Thursday, July 1, 2021 8:01 PM
To: Sparks, John <jspark4 at uic.edu>
Cc: r-help at r-project.org <r-help at r-project.org>
Subject: Re: [R] List / Matrix to Data Frame
Does this do what you want?
> df <- data.frame(check.names=FALSE, lapply(c(Date="date",netIncome="netIncome",`Gross Profit`="grossProfit"), function(nm)vapply(ISY, "[[", nm, FUN.VALUE=NA_character_)))
> str(df)
'data.frame': 36 obs. of 3 variables:
$ Date : chr "2020-09-30" "2019-09-30" "2018-09-30" "2017-09-30" ...
$ netIncome : chr "57411000000.00" "55256000000.00" "59531000000.00" "48351000000.00" ...
$ Gross Profit: chr "104956000000.00" "98392000000.00" "101839000000.00" "88186000000.00" ...
> df$Date <- as.Date(df$Date)
> df$netIncome <- as.numeric(df$netIncome)
> df$`Gross Profit` <- as.numeric(df$`Gross Profit`)
> str(df)
'data.frame': 36 obs. of 3 variables:
$ Date : Date, format: "2020-09-30" "2019-09-30" "2018-09-30" "2017-09-30" ...
$ netIncome : num 5.74e+10 5.53e+10 5.95e+10 4.84e+10 4.57e+10 ...
$ Gross Profit: num 1.05e+11 9.84e+10 1.02e+11 8.82e+10 8.43e+10 ...
> with(df, plot(Date, netIncome))
On Thu, Jul 1, 2021 at 5:35 PM Sparks, John <jspark4 at uic.edu<mailto:jspark4 at uic.edu>> wrote:
Hi R-Helpers,
I am taking it upon myself to delve into the world of lists for R. In no small part because I appear to have discovered a source of data for an exceptionally good price but that delivers much of that data in json format.
So over the last day or so I managed to fight the list processing tools to a draw and get a list that has only selected elements (actually it ends up in matrix form). But when I try to convert that to a data frame I can't get it to a form that is workable.
I have visited some pages about converting a matrix to a data frame but they result in highly redundant and inelegant data.
I am thinking that someone who works with lists and matrices knows how to do this quite easily and would be willing to provide a solution.
The reproducible example is shown below. Just to be explicit, what I am trying to get to is something along the lines of a data frame like this.
Date netIncome Gross Profit
2020-09-30 57411000000 104956000000
2019-09-30 55256000000 98392000000
.....
The closest I get is a matrix that looks like this
> wanted
2020-09-30 2019-09-30 2018-09-30 2017-09-30 2016-09-30 2015-09-30 2014-09-30
date "2020-09-30" "2019-09-30" "2018-09-30" "2017-09-30" "2016-09-30" "2015-09-30" "2014-09-30"
netIncome "57411000000.00" "55256000000.00" "59531000000.00" "48351000000.00" "45687000000.00" "53394000000.00" "39510000000.00"
grossProfit "104956000000.00" "98392000000.00" "101839000000.00" "88186000000.00" "84263000000.00" "93626000000.00" "70537000000.00"
Code for example
library(jsonlite)
test <- jsonlite::fromJSON("https://eodhistoricaldata.com/api/fundamentals/AAPL.US?api_token=OeAFFmMliFG5orCUuwAKQ8l4WWFQ67YX<https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Feodhistoricaldata.com%2Fapi%2Ffundamentals%2FAAPL.US%3Fapi_token%3DOeAFFmMliFG5orCUuwAKQ8l4WWFQ67YX&data=04%7C01%7Cjspark4%40uic.edu%7Cec30a41cf9b14db3eb5208d93cf4f33d%7Ce202cd477a564baa99e3e3b71a7c77dd%7C0%7C0%7C637607847614944186%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=p8vaGeaCWVDbL9yCiOcHvJVRa%2Bb1OsxN3A0xRtD3uc8%3D&reserved=0>")
hist<-test[[13]]
ISY<-hist$Income_Statement$yearly
wanted<-sapply(ISY, "[", j = c("date","netIncome","grossProfit"))
Your guidance would be much appreciated.
--John J. Sparks, Ph.D.
[[alternative HTML version deleted]]
______________________________________________
R-help at r-project.org<mailto:R-help at r-project.org> mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help<https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-help&data=04%7C01%7Cjspark4%40uic.edu%7Cec30a41cf9b14db3eb5208d93cf4f33d%7Ce202cd477a564baa99e3e3b71a7c77dd%7C0%7C0%7C637607847614954146%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=fMhmQ4dY8zovN6bevG0NJ%2Fjw4rEFQzafDwBIehAtOpI%3D&reserved=0>
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html<https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.r-project.org%2Fposting-guide.html&data=04%7C01%7Cjspark4%40uic.edu%7Cec30a41cf9b14db3eb5208d93cf4f33d%7Ce202cd477a564baa99e3e3b71a7c77dd%7C0%7C0%7C637607847614954146%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=anMcffYjEXffR%2F0yw86LDcEQ%2BvRTHsIuGDc%2BCU6ISm8%3D&reserved=0>
and provide commented, minimal, self-contained, reproducible code.