How to efficiently share data (a dataframe) between R and Java

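You can pass the entire data frame in a single call. For example, for a df with 6 columns of 150 rows each, the Java method below prints: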
df, 6 variables
[0]: double[150]
[1]: double[150]
[2]: double[150]
[3]: double[150]
[4]: int[150]
[5]: String[150]


Java code:

public class C {
    // Receives the data frame as an array of columns, one Java array per column,
    // and prints the number of columns followed by each column's type and length.
    static void df(Object[] df) {
        int n = df.length;
        System.out.println("df, " + n + " variables");
        for (int i = 0; i < n; i++) {
            if (df[i] instanceof double[]) {
                double[] d = (double[]) df[i];
                System.out.println("[" + i + "]: double[" + d.length + "]");
            } else if (df[i] instanceof int[]) {
                int[] d = (int[]) df[i];
                System.out.println("[" + i + "]: int[" + d.length + "]");
            } else if (df[i] instanceof String[]) {
                String[] s = (String[]) df[i];
                System.out.println("[" + i + "]: String[" + s.length + "]");
            } else {
                System.out.println("[" + i + "]: some other type...");
            }
        }
    }
}

Normally, you wouldn't pass the entire df. Instead, you would write the modeling function as methods that take only the types you care about. That's a more Java-like approach, but either is valid and there is no difference in efficiency.
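
To illustrate the typed-method approach, here is a minimal sketch. The class name Model, the method fit, and the choice of a least-squares line are all hypothetical and only stand in for whatever modeling function you actually have. The point is that the method declares the column types it needs (two double[] columns here) rather than taking Object[] and checking types at run time.

```java
// Hypothetical example: Model and fit are illustrative names,
// not part of rJava or of the code above.
class Model {
    // Takes the two numeric columns directly instead of the whole data frame.
    // Returns {intercept, slope} of the least-squares line y = a + b*x.
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("x and y must have equal length >= 2");
        }
        int n = x.length;

        // Column means.
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of cross-products and squares around the means.
        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("x must not be constant");
        }

        double slope = sxy / sxx;
        return new double[] { meanY - slope * meanX, slope };
    }
}
```

Assuming the class is on the rJava class path, a call from R would look something like .jcall("Model", "[D", "fit", df$x, df$y), where df$x and df$y are placeholder numeric columns.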

Cheers,
Simon