
Attempting to confirm a program i wrote in C (normalize 2 datasets, transform into histogram, transform into CDF, perform KS test)

3 messages · Jeff Newmiller, Tarskin

#
I have written a program in C that takes two xy datasets, aligns them
based on shared features, transforms them into equal-sized histograms,
transforms the histograms into cumulative distribution functions (via GSL)
and finally performs a KS test.
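
For readers following along, the pipeline described above (weighted
histograms on shared bins, cumulative sums, maximum CDF distance) might be
sketched in R roughly as follows. This is only an illustration under stated
assumptions, not the poster's C/GSL code: the toy spectra and the names
weighted_hist, to_cdf and edges are hypothetical, and bins here are
left-closed, whereas hist() defaults to right-closed bins.

```r
# Hypothetical toy spectra: x = position, y = intensity used as weight.
spec_a <- data.frame(x = c(100.2, 100.7, 250.4, 900.1), y = c(5, 3, 10, 2))
spec_b <- data.frame(x = c(100.5, 250.9, 251.3, 900.6), y = c(4, 6, 6, 4))

# Shared bin edges, so both histograms have the same number of bins.
edges <- seq(0, 1000, by = 1)

# Weighted histogram: sum the weights w of all x falling into each bin.
# Assumes every x lies inside the range covered by edges.
weighted_hist <- function(x, w, edges) {
  bin <- findInterval(x, edges, rightmost.closed = TRUE)
  counts <- numeric(length(edges) - 1)
  sums <- tapply(w, bin, sum)
  counts[as.integer(names(sums))] <- as.vector(sums)
  counts
}

# Discrete CDF: running sum of the bin weights, normalised to end at 1.
to_cdf <- function(counts) cumsum(counts) / sum(counts)

cdf_a <- to_cdf(weighted_hist(spec_a$x, spec_a$y, edges))
cdf_b <- to_cdf(weighted_hist(spec_b$x, spec_b$y, edges))

# KS statistic on the binned data: largest absolute gap between the CDFs.
ks_d <- max(abs(cdf_a - cdf_b))
print(ks_d)
```

Computing the statistic directly from the binned CDFs keeps the comparison
close to what a histogram-based C implementation would do; running ks.test
on the unbinned values can give a different statistic.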

I want to validate my program's results and figured I would use R, but
I am stuck at the histograms (I have 2 histogram objects with the
same number of bins, weighted by the Y value) and am not sure how to
transform these into CDFs to perform a ks.test. I looked at the edf function
but got kind of lost.

I realize this is most likely a very basic question but I am just not that
familiar with R :(

Thanks in advance



--
View this message in context: http://r.789695.n4.nabble.com/Attempting-to-confirm-a-program-i-wrote-in-C-normalize-2-datasets-transform-into-histogram-transform-tp4656704.html
Sent from the R help mailing list archive at Nabble.com.
#
It is generally expected that the questioner pose a specific example so the respondent can have some assurance they are answering the actual question. In this case please provide a test data set, intermediate results, and final result generated by your C program. It would also show that you made a reasonable effort if you showed what steps you tried in R and where you think they went wrong.

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.
1 day later
#
The C program takes 2 mzML files, uncompresses/decodes the binary strings
holding the X data and Y data, examines spectral (xy data) similarity,
combines both datasets into a new one and finally, after all similar spectra
have been merged, writes everything back into 1 new mzML file. I am assuming
that people do not want to get several complete mzML files; I will, however,
include 2 spectra from different sources that were extracted from the total
mzML files that I have been using to test this with.

IgG2_G1F.data <http://r.789695.n4.nabble.com/file/n4656833/IgG2_G1F.data>  
IgG2_G1F.data <http://r.789695.n4.nabble.com/file/n4656833/IgG2_G1F.data>  

The steps I performed in R:

experimental<-read.table(<first data set>, header=T)
predicted<-read.table(<second data set>,header=T)
hist_e<-hist(rep(experimental[,1],experimental[,2]), breaks=seq(0,2500,1))
hist_p<-hist(rep(predicted[,1],predicted[,2]),breaks=seq(0,2500,1))
-- up to here it does what I expect; my own C program would now transform
the GSL hist object to a GSL cdf object --
ecdf(hist_p) seemed the logical choice to me; however, it complains about an
error in rank, and my attempts to figure out what this means haven't gotten
me anywhere so far.

I would appreciate any pointer saying why this doesn't do what I'm
expecting.
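
One likely reason for the complaint is that ecdf() expects a numeric vector
of observations, while hist() returns a list-like "histogram" object (with
components such as breaks, counts and mids). A minimal sketch of two ways
around this, assuming a two-column layout as in the steps above; the toy
data frame is a hypothetical stand-in for the attached files:

```r
# Hypothetical stand-in for one data set: x positions, integer y weights.
predicted <- data.frame(x = c(100.5, 250.9, 251.3, 900.6), y = c(4, 6, 6, 4))

# Expand each x by its weight, then bin without drawing a plot.
obs_p <- rep(predicted$x, predicted$y)
hist_p <- hist(obs_p, breaks = seq(0, 2500, 1), plot = FALSE)

# Option 1: build the CDF from the bin counts stored in the histogram object.
cdf_p <- cumsum(hist_p$counts) / sum(hist_p$counts)

# Option 2: call ecdf() on the observation vector (not on hist_p) and
# evaluate it at the upper edge of every bin.
Fn_p <- ecdf(obs_p)
cdf_p_check <- Fn_p(hist_p$breaks[-1])

# With hist()'s default right-closed bins, both routes should agree.
print(all.equal(cdf_p, cdf_p_check))
```

Option 1 mirrors the histogram-to-CDF step of the C program most directly,
since it works on the bin counts rather than on the raw values.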






