General Insurance Data

Hi All,

Thank you all for your responses. What I have in mind is a package to 
help with the initial stage of getting claims data into R and ready to 
use (formatting etc.) before any modeling is done; effectively, it would 
provide a workflow for working with actuarial claims data.
In my experience, working with GI reserving projection programs has 
usually involved a stage to summarise the data in some format and then 
feed it into a program, be it a proprietary program or one of the 
R packages. The key stage I wish to contribute to here is where the data 
has to be fed into R. MRMR and ChainLadder do allow for reshaping of 
the data (and loading data), but I would like to build on the 
functionality in these packages without reinventing the wheel. I will 
take some time to digest the MRMR package as it does seem to have a 
lot of functionality that I could use for manipulating the data.

A package providing a workflow to get data into a basic tabular 
structure would act as a stepping stone for getting claims data 
available and cleaned for use in packages such as ChainLadder or MRMR. 
It would also make other, more basic analysis easier. As I see 
it at the moment when someone gets to the stage that the data is loaded 
and formatted in R they are basically ready to go and use MRMR or 
ChainLadder, *but* this is a very big hurdle for novice R users who 
typically have data stored elsewhere (SAS, SQL, Excel or other flat 
files). Even moving from Excel to R, where data can be stored in objects 
as opposed to tables, is a giant leap in its own right, and I fear this is 
the stage where a lot of novice R Actuarial users are lost or lose 
confidence in R. I am not a professional programmer and I am not very 
knowledgeable about different types of connections or porting files from 
one format to another so I was hoping to leave that up to packages such 
as RODBC or xlsx. My idea is to provide a workflow structure to load 
claims data into a data.table using a simple GUI. It's worth 
mentioning that I have come across RExcel, which allows you to call R 
from Excel, but what I am talking about is somewhat different in that 
you're using R as the backbone, rather than Excel, for storing and manipulating 
the data. This would at the same time leave the data in a familiar tabular 
format, but helper functions could pivot the data and allow novice R 
users to create triangles intuitively or extract useful representations 
such as latest diagonals, cumulative data, in-year data etc. It would 
still be easy to manipulate the basic underlying data structure if 
desired as it is just a data.table but helper functions would give it 
intuitive Actuarial functionality. The goal would be to reduce the time 
to close to zero for a novice R Actuarial user to go from using existing 
data (e.g. Excel or SAS) to a data object that could then be passed to 
ChainLadder, MRMR or ggplot2.
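
To make the helper-function idea above a bit more concrete, here is a 
minimal sketch of the reshaping involved. It is written in plain Python 
purely for illustration (the package itself would do this in R on a 
data.table), and everything in it is an assumption: the field names 
origin, dev and paid, the function names to_triangle, cumulate and 
latest_diagonal, and the figures, which are made up.

```python
from collections import defaultdict

# Hypothetical long-format claims records: one row per origin year and
# development period, holding incremental paid amounts (made-up figures).
claims = [
    {"origin": 2011, "dev": 1, "paid": 100.0},
    {"origin": 2011, "dev": 2, "paid": 50.0},
    {"origin": 2011, "dev": 3, "paid": 25.0},
    {"origin": 2012, "dev": 1, "paid": 110.0},
    {"origin": 2012, "dev": 2, "paid": 60.0},
    {"origin": 2013, "dev": 1, "paid": 120.0},
]


def to_triangle(rows, value="paid"):
    """Pivot long-format rows into {origin: {dev: incremental amount}}."""
    triangle = defaultdict(dict)
    for row in rows:
        cells = triangle[row["origin"]]
        # Sum rows that share the same origin/development cell.
        cells[row["dev"]] = cells.get(row["dev"], 0.0) + row[value]
    return {
        origin: dict(sorted(cells.items()))
        for origin, cells in sorted(triangle.items())
    }


def cumulate(triangle):
    """Turn an incremental triangle into cumulative amounts per origin."""
    result = {}
    for origin, cells in triangle.items():
        running = 0.0
        result[origin] = {}
        for dev in sorted(cells):
            running += cells[dev]
            result[origin][dev] = running
    return result


def latest_diagonal(triangle):
    """Return the most recent observed value for each origin period."""
    return {origin: cells[max(cells)] for origin, cells in triangle.items()}


incremental = to_triangle(claims)
cumulative = cumulate(incremental)
latest = latest_diagonal(cumulative)
```

The underlying table stays untouched throughout; the triangle, the 
cumulative view and the latest diagonal are all just derived views of it, 
which is the property I would like the helper functions to have.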
The Rattle package has a lovely GUI for loading data. I have 
something like this in mind to get people started. A GUI could 
be used to specify key fields and formats such as origin period etc. The 
data stored in R as a data.table would still be in a very familiar 
tabular format and look similar to what novice R users are used to when 
they have worked with data stored elsewhere in SQL, SAS or Excel files. 
This would provide a bit of reassurance to get working in R. It would 
also make it easier for novice R users to plot the data using ggplot2 or 
googleVis, reshape it and summarise it in tables for printing, and you 
also get the benefit that the data.table structure is quite efficient for 
working with big in-memory data.

My recent background is claims reserving in a GI company and claims data 
is what I have initially in mind to tackle. Other data such as pricing 
data, time series etc would also certainly lend themselves to more 
specialised Actuarial data structures and I think I could work on these 
down the road.

From my description above the package might just sound like it's to load 
and format the data. This is a non-trivial exercise for novice R users, 
and such a package would lower the barrier for people adopting R, 
especially at work. At the optimistic end of the scale I also think it 
could make development data analysis accessible to non-Actuaries and 
could even be useful in other fields. I would love help from anyone 
who is interested in contributing, as I am working on this solo at the 
moment. I wish to work a bit on a basic proof of concept first, and when 
I get it working I will upload it to GitHub and post some links to this 
mailing list. I wanted first to check that such a package does not exist, 
but it will take a bit of time to get something useful and working. I 
would also appreciate any comments, contributions or feedback.

Thanks again for the help

Regards
Edward Roche
On 29/09/14 02:03, Brian Fannin wrote: