
[R-pkg-devel] Package Load fails to find 3rd Party DLL

9 messages · Jeff Newmiller, Russell Almond, Ivan Krylov

#
I have an R package (RNetica available at 
https://ralmond.r-universe.dev/RNetica and 
https://github.com/ralmond/RNetica) which links to a 3rd party library 
Netica.dll, so RNetica.dll (built from my C code) calls the 3rd party code.

The config.win script downloads Netica.dll and moves it into the 
libs/x64 directory, where it should get loaded when RNetica.dll is 
loaded. However, this is not happening.

Here is the relevant portion of the build log (build is on R-universe, 
but I think it is the same script as CRAN):

```
cp "/d/a/ralmond/ralmond/RNETIC~1.RCH/00_PKG~1/RNetica/src/Netica/Netica_API_510/lib64/Netica.dll" "D:/a/ralmond/ralmond/RNetica.Rcheck/00LOCK-RNetica/00new/RNetica/libs/x64"
cp "/d/a/ralmond/ralmond/RNETIC~1.RCH/00_PKG~1/RNetica/src/Netica/Netica_API_510/lib64/Netica.lib" "D:/a/ralmond/ralmond/RNetica.Rcheck/00LOCK-RNetica/00new/RNetica/libs/x64"
C:\rtools43\x86_64-w64-mingw32.static.posix\bin\nm.exe: 'NeticaDLL': No such file
gcc -shared -s -static-libgcc -o RNetica.dll tmp.def Cases.o Continuous.o Edges.o Experience.o Inference.o Networks.o Node.o Random.o Registration.o Session.o -L. -LD:/a/ralmond/ralmond/RNetica.Rcheck/00LOCK-RNetica/00new/RNetica/libs/x64 -lNetica -LC:/rtools43/x86_64-w64-mingw32.static.posix/lib/x64 -LC:/rtools43/x86_64-w64-mingw32.static.posix/lib -LC:/R/bin/x64 -lR
C:\rtools43\x86_64-w64-mingw32.static.posix\bin/ld.exe: internal error: aborting at ../../binutils-2.40/ld/ldlang.c:527 in compare_section
C:\rtools43\x86_64-w64-mingw32.static.posix\bin/ld.exe: please report this bug
collect2.exe: error: ld returned 1 exit status
```

A little searching on the internet indicates that Windows sometimes 
reports DLL A as not found when DLL A needs DLL B and it is B that 
cannot be found.

This used to work under older versions of R and the tool chain and I 
don't think I've changed anything related to the C side of the code.

1) Have the paths changed, so I no longer should be moving the (64 bit 
version of the) 3rd party DLL to `libs/x64`?

2) Has something changed in the mingw tools (nm.exe and ld.exe) that 
would cause this?

3) Is there a change in how the win32 and win64 variants are handled? (I 
have both 32 and 64 bit copies of the 3rd party DLL; I just need to move 
them to the right places.)

Thanks for any enlightenment you can offer,

    --Russell Almond
#
Use of precompiled code is not allowed in CRAN. This looks like your package needs to be distributed elsewhere... e.g. via GitHub.
On July 12, 2023 6:41:11 AM PDT, Russell Almond <russell.g.almond at gmail.com> wrote:

  
    
2 days later
#
On Wed, 12 Jul 2023 09:41:11 -0400
Russell Almond <russell.g.almond at gmail.com> wrote:
This is where the problem starts. You can retrace the steps that R
takes when building and installing the package by running sh
configure.win manually and then running something like
R_PACKAGE_DIR="$(pwd)" R CMD SHLIB -n *.c in the src subdirectory of
the package. That will in turn tell you the exact command lines to be
run while building your package, including the following:

 nm Cases.o Continuous.o Edges.o Experience.o Inference.o Networks.o \
  Node.o Random.o Registration.o Session.o NeticaDLL \
 | sed -n 's/^.* [BCDRT] / /p' \
 | sed -e '/[.]refptr[.]/d' -e '/[.]weak[.]/d' \
 | sed 's/[^ ][^ ]*/"&"/g' \
 >> tmp.def;

That "NeticaDLL" at the end of the list of object files doesn't belong
there. I think it gets added because it's among the dependencies of the
$(SHLIB) Make target. It would be best to make that target a real
object file that nm.exe can process. Otherwise, you could also write
your own .def file and skip its automatic generation.
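
A minimal sketch of what that pipeline does, assuming made-up nm output (the symbol names below are hypothetical and not taken from RNetica). It feeds four simulated symbol lines through the same three sed filters and collects what would end up in tmp.def:

```shell
#!/bin/sh
# Simulated `nm` output. All symbol names here are invented for illustration.
nm_output='0000000000000000 T RN_example_call
0000000000000040 t static_helper
                 U external_symbol
0000000000000000 R .refptr.example'

# Same filters as the command quoted above: keep B/C/D/R/T symbols,
# drop .refptr. and .weak. entries, then wrap each name in double quotes.
make_def() {
    echo EXPORTS
    sed -n 's/^.* [BCDRT] / /p' \
      | sed -e '/[.]refptr[.]/d' -e '/[.]weak[.]/d' \
      | sed 's/[^ ][^ ]*/"&"/g'
}

def_contents=$(printf '%s\n' "$nm_output" | make_def)
printf '%s\n' "$def_contents"
```

Only the global text symbol survives, so the result is the EXPORTS header followed by one quoted name. The local symbol (t), the undefined symbol (U) and the .refptr. entry are all filtered out.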

After nm fails, you get a crash in the linker (while parsing the
resulting incomplete .def file?), which leaves your package without a
shared library to use:
There must be a way to streamline this process. Maybe put all the
library-downloading and extraction code into a portable
tools/configure.R (to be launched manually from the configure shell
script), leaving src/Makevars only to compile your own code, link with
Netica using PKG_LIBS, then copy the additional Netica DLL from a
custom install.libs.R file?
#
Thanks. I now know the problem is in Makevars.win; however, I'm 
still confused.

My `Makevars.win` had

| .PHONY: all NeticaDLL clean
|
| all: $(SHLIB)
| $(SHLIB): NeticaDLL
|
| NeticaDLL:
|     mkdir -p "$(INSTALL_LIB)"
|     cp "${NETICA_LIB}/Netica.dll" "${INSTALL_LIB}"
|     cp "${NETICA_LIB}/Netica.lib" "${INSTALL_LIB}"

[BTW, I tried changing this to

| all: NeticaDLL $(SHLIB)

and got the same problem.]

This looks very much like the `mylibs` example in 
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Using-Makevars

So I'm confused. Why is the Makevars -> Makefile conversion assuming 
that all prerequisites of $(SHLIB) (or all) are object files, ignoring 
the .PHONY declaration?

Also, does the example in the Writing R Extensions manual still work?

    --Russell
On 7/14/23 11:14 AM, Ivan Krylov wrote:
#
On Fri, 14 Jul 2023 13:29:32 -0400
Russell Almond <russell.g.almond at gmail.com> wrote:

            
The $(SHLIB) target is defined in ${R_HOME}/share/make/winshlib.mk. In
particular, it says:

 echo EXPORTS > tmp.def; \
 $(NM) $^ | $(SED) -n $(SYMPAT) $(NM_FILTER) | \
  $(SED) $(ADDQU) >> tmp.def; \

Unfortunately for our case, the $^ automatic variable contains the
names of all prerequisites of the $(SHLIB) target [*], including .PHONY
ones. So as long as $(SHLIB) is declared to depend on NeticaDLL, this
string will get mentioned here.
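
This behaviour can be reproduced outside of R with a throwaway Makefile. The sketch below assumes GNU make is on the PATH. The target names demo.dll and a.o are hypothetical stand-ins, and the recipe only echoes `$^` instead of calling nm:

```shell
#!/bin/sh
# Hypothetical demo: demo.dll and a.o are stand-ins, not files from RNetica.
demo_dir=$(mktemp -d)

# Write a tiny Makefile; printf turns \t into the tab that recipes require.
{
    printf '.PHONY: all NeticaDLL\n'
    printf 'all: demo.dll\n'
    printf 'demo.dll: a.o NeticaDLL\n'
    printf '\t@echo "prerequisites: $^"\n'
    printf 'a.o:\n'
    printf '\t@touch a.o\n'
    printf 'NeticaDLL:\n'
    printf '\t@true\n'
} > "$demo_dir/Makefile"

# Run make in a subshell so the caller's working directory is unchanged.
prereqs=$(cd "$demo_dir" && make -s all)
echo "$prereqs"
rm -rf "$demo_dir"
```

With GNU make, the echoed list contains NeticaDLL next to a.o even though NeticaDLL is declared .PHONY. That is the same list winshlib.mk hands to nm.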

Maybe it's a red herring. Maybe the message from nm about missing file
has always been harmless, and what we're seeing here is a bug in the
toolchain; perhaps ld.exe doesn't like something about Netica.lib so
much that it crashes. I think that's less likely. If you run the
commands manually but without mentioning NeticaDLL, do you get a DLL in
the end?
It should. At the very least, rgl uses the same approach without any
apparent problems:
https://github.com/dmurdoch/rgl/blob/master/src/Makevars.ucrt
(But then it also provides rgl-win.def, which sidesteps the automatic
.def generation, and then has $(OBJECTS), not $^, linked into $(SHLIB).
You may have found a bug in winshlib.mk.)

There's also a different idiom, where all: but not $(SHLIB): depends on
the non-file target:
https://github.com/eddelbuettel/rprotobuf/blob/master/src/Makevars.win
(Not sure how the dependency resolution order is supposed to work in
this case. If winshlib.mk declares that all: depends on $(SHLIB) and
later Makevars declares that all: also depends on winlibs, why doesn't
Make attempt to build $(SHLIB) first, with no regard for winlibs?)
2 days later
#
On Fri, 14 Jul 2023 22:25:51 +0300
Ivan Krylov <krylov.r00t at gmail.com> wrote:
Judging by your build logs, this could be a toolchain bug. If you set
the *.lib file aside and only give the *.dll to the linker (using
-l:Netica.dll if necessary), does it still fail? I know that GCC can
link directly to *.dll files, without relying on import libraries.
#
Thanks for the suggestion.  In previous versions of the build tools, I 
know I needed the .lib files.

I'm also not sure I'm copying the DLL files into the right directory, so 
that maybe the linker isn't seeing it.  This always confuses me as the 
location used to build and compile is (potentially) different from the 
test location and the final build location, and I'm not sure of the best 
way to refer to these directories in my script.  I'm thinking maybe I 
need to copy the DLL into the src directory.

Unfortunately, I don't have a windows box on which to easily test.  I'm 
trying to get a virtualBox setup working so I can test more quickly.

   --Russell
On 7/17/23 5:30 AM, Ivan Krylov wrote:
#
Okay,  I've changed my Makevars.win so that it has:

PKG_LIBS = -L. -L${NETICA_LIB} -lNetica

and

all: NeticaDLL

where ${NETICA_LIB} is set to the appropriate lib subdirectory of the 
unpacked sources.  I'm no longer getting the nm.exe error, so it may be 
a bug in ld.exe.


| gcc -shared -s -static-libgcc -o RNetica.dll tmp.def Cases.o Continuous.o Edges.o Experience.o Inference.o Networks.o Node.o Random.o Registration.o Session.o -L. -L/c/Users/ralmond/Projects/RNetica/src/Netica/Netica_API_510/lib64 -lNetica -LC:/rtools43/x86_64-w64-mingw32.static.posix/lib/x64 -LC:/rtools43/x86_64-w64-mingw32.static.posix/lib -LC:/PROGRA~1/R/R-43~1.1/bin/x64 -lR
| C:\rtools43\x86_64-w64-mingw32.static.posix\bin/ld.exe: internal error: aborting at ../../binutils-2.40/ld/ldlang.c:527 in compare_section
| C:\rtools43\x86_64-w64-mingw32.static.posix\bin/ld.exe: please report this bug
| collect2.exe: error: ld returned 1 exit status

However, when I manually moved Netica.dll to the source directory, and 
removed NETICA_LIB from the path, it compiled correctly.

So, I'm guessing this is a bug in ld.exe, and in fact it has already 
been reported here:
https://github.com/msys2/MINGW-packages/issues/15469

So I'll see if I can try to fix my script to only copy the .dll and not 
the .lib.
	--Russell
On 7/17/23 5:30 AM, Ivan Krylov wrote:
2 days later
#
Not sure if I reported the success here or not.

Copying the 3rd party DLL, but not the .lib file, to the src directory 
does work around the bug in the linker.
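
A rough sketch of that copy step, under assumed names: the helper stage_dll_only and the throwaway directory layout are hypothetical and are not the actual RNetica configure script. It copies Netica.dll into a destination directory and deliberately leaves Netica.lib behind:

```shell
#!/bin/sh
# Hypothetical helper: copy only the runtime DLL, never the import library.
stage_dll_only() {
    from_dir=$1
    to_dir=$2
    mkdir -p "$to_dir" || return 1
    cp "$from_dir/Netica.dll" "$to_dir/" || return 1
}

# Demonstration with throwaway directories and empty stand-in files.
work=$(mktemp -d)
mkdir -p "$work/lib64"
: > "$work/lib64/Netica.dll"
: > "$work/lib64/Netica.lib"

stage_dll_only "$work/lib64" "$work/src"
staged=$(ls "$work/src")
echo "$staged"
rm -rf "$work"
```

After the call, the destination holds only Netica.dll, so a linker searching that directory with -lNetica has no Netica.lib to pick up.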

The complete working solution can be seen at 
https://github.com/ralmond/RNetica.

Thanks for the help.
	--Russell
On 7/17/23 5:30 AM, Ivan Krylov wrote: