Making package "frontier" available for Mac: debugging Fortran code
Dear Berend and Simon,

Thanks a lot for making me aware of this bug and for sending me so many details. I probably would not have found the bug without your help. I have fixed it on R-Forge and I will upload the fixed version to CRAN.

Best regards,
Arne
On 3 December 2012 21:06, Berend Hasselman <bhh at xs4all.nl> wrote:
Arne,

I believe I have found the bug, thanks to Simon's traceback.

On 03-12-2012, at 20:05, Simon Urbanek wrote:
Arne, unfortunately I don't think I can help you with Fortran code, but this is the trace from the crash:
example(summary.frontier)
[...]
smmry.> # Efficiency Effects Frontier
smmry.> rice2 <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ) |
smmry.+ EDYRS + BANRAT, data = riceProdPhil )

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x01a55000
0x00ff8433 in info (nstartval=@0x260ce80, startval=0x260d540, nrowdata=@0x260cec0, ncoldata=@0x260cf20, datatable=0x1a4f018, nparamtotal=@0x260d0e0, ob=0x32d3080, gb=0x32d3118, fxs=@0xbfffac68, y=0x32d31b0, h=0x27b8618) at front41.f:818
818 xxd(i)=dataTable(k,2+i)
(gdb) p i
$1 = 7
Current language: auto; currently fortran
(gdb) p k
$2 = 318
(gdb) bt
#0 0x00ff8433 in info (nstartval=@0x260ce80, startval=0x260d540, nrowdata=@0x260cec0, ncoldata=@0x260cf20, datatable=0x1a4f018, nparamtotal=@0x260d0e0, ob=0x32d3080, gb=0x32d3118, fxs=@0xbfffac68, y=0x32d31b0, h=0x27b8618) at front41.f:818
#1 0x00ffcbe7 in front41 (imarg=@0x260fd20, ipcarg=@0x260fda0, nnarg=@0x260d340, ntarg=@0x260d360, nobarg=@0x260d380, nbarg=@0x260d3a0, nmuarg=@0x260fdc0, netaarg=@0x260d3c0, iprintarg=@0x260d3e0, indicarg=@0x260d400, tolarg=@0x260d420, tol2arg=@0x260d440, bignumarg=@0x260d460, step1arg=@0x260d480, igrid2arg=@0x260ce20, gridnoarg=@0x260d4a0, maxitarg=@0x260d4c0, bmuarg=@0x260d4e0, mrestartarg=@0x260d500, frestartarg=@0x260d520, nrestartarg=@0x260ce40, nstartval=@0x260ce80, startval=0x260d540, nrowdata=@0x260cec0, ncoldata=@0x260cf20, datatable=0x1a4f018, nparamtotal=@0x260d0e0, ob=0x32d3080, gb=0x32d3118, startlogl=@0x260d560, y=0x32d31b0, h=0x27b8618, fmlelogl=@0x260d580, niter=@0x260d260, icodearg=@0x260d280) at front41.f:74
#2 0x0046c5ab in do_dotCode (call=0x12c1c60, op=0x101b2d0, args=0x262b024, env=0x31e74b4) at ../../../../R-2.15-branch/src/main/dotcode.c:1901
#3 0x0049a681 in Rf_eval (e=0x12c1c60, rho=0x31e74b4) at ../../../../R-2.15-branch/src/main/eval.c:494
#4 0x0049d9c6 in do_set (call=0x12c1bd4, op=0x100e2a8, args=0x12c1bf0, rho=0x31e74b4) at ../../../../R-2.15-branch/src/main/eval.c:1717
#5 0x0049a482 in Rf_eval (e=0x12c1bd4, rho=0x31e74b4) at ../../../../R-2.15-branch/src/main/eval.c:468
#6 0x0049c362 in do_begin (call=0x1591c00, op=0x100e1ac, args=0x12c1640, rho=0x31e74b4) at ../../../../R-2.15-branch/src/main/eval.c:1415
#7 0x0049a482 in Rf_eval (e=0x1591c00, rho=0x31e74b4) at ../../../../R-2.15-branch/src/main/eval.c:468
#8 0x0049f8bd in Rf_applyClosure (call=0x21f3580, op=0x11b9d9c, arglist=0x31e673c, rho=0x102519c, suppliedenv=0x10251b8) at ../../../../R-2.15-branch/src/main/eval.c:861
#9 0x0049a382 in Rf_eval (e=0x21f3580, rho=0x102519c) at ../../../../R-2.15-branch/src/main/eval.c:512
#10 0x0049d9c6 in do_set (call=0x21f35d4, op=0x100e2a8, args=0x21f35b8, rho=0x102519c) at ../../../../R-2.15-branch/src/main/eval.c:1717
#11 0x0049a482 in Rf_eval (e=0x21f35d4, rho=0x102519c) at ../../../../R-2.15-branch/src/main/eval.c:468
#12 0x0049b1aa in do_eval (call=0x10dcad0, op=0x101d394, args=0x31e6800, rho=0x31e6838) at ../../../../R-2.15-branch/src/main/eval.c:2107

The code crashes at exactly the same line in 64-bit mode as well, so it's likely a real bug.
valgrind seems to confirm that it's an out-of-bounds (OOB) read in front41.f at line 818:

smmry.> # Efficiency Effects Frontier
smmry.> rice2 <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ) |
smmry.+ EDYRS + BANRAT, data = riceProdPhil )
==76290== Invalid read of size 8
==76290== at 0x867E433: info_ (front41.f:818)
==76290== by 0x8682BE6: front41_ (front41.f:74)
==76290== by 0x46C5AA: do_dotCode (dotcode.c:1901)
==76290== by 0x49A680: Rf_eval (eval.c:494)
==76290== by 0x49D9C5: do_set (eval.c:1717)
==76290== by 0x49A481: Rf_eval (eval.c:468)
==76290== by 0x49C361: do_begin (eval.c:1415)
==76290== by 0x49A481: Rf_eval (eval.c:468)
==76290== by 0x49F8BC: Rf_applyClosure (eval.c:861)
==76290== by 0x49A381: Rf_eval (eval.c:512)
==76290== by 0x49D9C5: do_set (eval.c:1717)
==76290== by 0x49A481: Rf_eval (eval.c:468)
==76290== Address 0x8b81288 is 0 bytes after a block of size 22,040 alloc'd
==76290== at 0x16483: malloc (vg_replace_malloc.c:236)
==76290== by 0x4DE762: Rf_allocVector (memory.c:2380)
==76290== by 0x406A48: Rf_allocMatrix (array.c:194)
==76290== by 0x407AC9: do_matrix (array.c:128)
==76290== by 0x4908D7: bcEval (eval.c:4449)
==76290== by 0x49A145: Rf_eval (eval.c:399)
==76290== by 0x49F8BC: Rf_applyClosure (eval.c:861)
==76290== by 0x49A381: Rf_eval (eval.c:512)
==76290== by 0x49B8CC: Rf_evalList (eval.c:1831)
==76290== by 0x49A54A: Rf_eval (eval.c:487)
==76290== by 0x49D9C5: do_set (eval.c:1717)
==76290== by 0x49A481: Rf_eval (eval.c:468)
==76290==
I'm sure it's a bug.
I have inserted these lines just before line 813 of front41.f (the start of the "do 134 k=1,nob" loop in subroutine info):
call intpr('(info) value of nob at loop 134 ', 31, nob,1)
call intpr('(info) value of nr at loop 134 ', 31, nr ,1)
call intpr('(info) value of nRowData at loop 134 ',36,nRowData,1)
call intpr('(info) value of nColData at loop 134 ',36,nColData,1)
The result in ....../frontier/frontier.Rcheck/frontier-Ex_x86_64.Rout is
# Efficiency Effects Frontier
rice2 <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ) |
+ EDYRS + BANRAT, data = riceProdPhil )
(info) value of nob at loop 134
[1] 344
(info) value of nr at loop 134
[1] 7
(info) value of nRowData at loop 134
[1] 344
(info) value of nColData at loop 134
[1] 8
summary( rice2 )
which implies that in the loop
do 143 i=2,nr
xxd(i)=dataTable(k,2+i)
143 continue
column 9 of dataTable is accessed when i is 7 (2+7=9), but dataTable has at most 8 (nColData) columns.
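
The addresses in the two traces appear consistent with this reading. As a rough cross-check, the sketch below computes the byte offset of dataTable(k,2+i) from the start of the array, assuming 8-byte double precision elements (in line with valgrind's "Invalid read of size 8") stored column by column with 344 rows. The module offsets, the function byte_offset and the program are illustrative only and not part of front41.f.

```fortran
module offsets
  implicit none
contains
  ! Byte offset of element (row, col) from the start of a column-major
  ! array with nrow rows. Assumption: 8-byte (double precision) elements.
  pure integer function byte_offset(row, col, nrow)
    integer, intent(in) :: row, col, nrow
    byte_offset = ((col - 1) * nrow + (row - 1)) * 8
  end function byte_offset
end module offsets

program check_addresses
  use offsets
  implicit none
  ! nRowData and nColData as reported by the intpr calls
  integer, parameter :: nrow = 344, ncol = 8
  ! datatable pointer and faulting address taken from the gdb trace
  integer, parameter :: base  = int(z'01A4F018')
  integer, parameter :: fault = int(z'01A55000')

  ! Size of the valid data: 344 * 8 * 8 = 22016 bytes
  print *, 'bytes of valid data      :', nrow * ncol * 8
  ! First element of the non-existent column 9 (k = 1, i = 7): 22016
  print *, 'offset of dataTable(1,9) :', byte_offset(1, 2 + 7, nrow)
  ! Element read at the gdb crash (k = 318, i = 7): 24552
  print *, 'offset of dataTable(318,9):', byte_offset(318, 2 + 7, nrow)
  ! Distance from the array start to the faulting address: 24552
  print *, 'fault address - base     :', fault - base
end program check_addresses
```

Under these assumptions the offset of dataTable(318,9) equals the distance between the datatable pointer and the faulting address in the gdb session, and dataTable(1,9) lies exactly at the end of the 22016 bytes of data. The 22,040-byte block reported by valgrind is 24 bytes larger, which is presumably R's vector header, but that part is a guess.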
I haven't checked why nr is set to 7 and not a smaller value.
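
Until the cause of nr = 7 is understood, a guard in front of the loop would at least make the mismatch visible instead of reading past the array. The following is only a sketch of such a check, not the fix that went into the package; the module guard and the function columns_ok are hypothetical names, and it is written in free-form Fortran rather than the fixed form of front41.f.

```fortran
module guard
  implicit none
contains
  ! True when the largest column touched by xxd(i)=dataTable(k,2+i),
  ! i = 2..nr, still lies inside a table with ncoldata columns.
  pure logical function columns_ok(nr, ncoldata)
    integer, intent(in) :: nr, ncoldata
    columns_ok = (2 + nr <= ncoldata)
  end function columns_ok
end module guard

program demo_guard
  use guard
  implicit none
  ! Values reported by the intpr calls for the rice2 example
  integer, parameter :: nr = 7, ncoldata = 8

  if (columns_ok(nr, ncoldata)) then
    print *, 'loop 143 stays within dataTable'
  else
    print *, 'loop 143 would read column', 2 + nr, 'of', ncoldata
  end if
end program demo_guard
```

With the values from this run the check fails, since 2 + 7 = 9 exceeds 8. Compilers can also insert this kind of test automatically, for example gfortran with its -fcheck=bounds option, which may be a quicker way to catch similar problems elsewhere in the code.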
regards,
Berend
Arne Henningsen http://www.arne-henningsen.name