Comments (13)

Jojo-1000 avatar Jojo-1000 commented on June 1, 2024

@FanDjango

The only change: One additional LoadLibrary call, throwing away the returned pointer

I can kind of see why: LoadLibrary and FreeLibrary are reference counted, so by loading twice and freeing only once you never actually unload the library.

For now I think it is best to just keep it loaded, especially if it is slow to free and someone uses many sequential connections. Maybe someone will figure out why this is broken in the future, maybe not. I think we can live with the small RAM overhead.
From what I can tell, DllImport would also not unload it after use.
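
To illustrate the reference counting point, here is a minimal sketch that only models the bookkeeping. It does not call the Win32 loader at all, and `ModuleTableModel` with its `Load`, `Free` and `IsLoaded` members are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of per-process module reference counting.
// It only mimics the counting; no native library is touched.
public sealed class ModuleTableModel
{
    private readonly Dictionary<string, int> _refCounts =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

    // Mimics LoadLibrary: every call increments the count for that module.
    // Returns the new count.
    public int Load(string name)
    {
        _refCounts.TryGetValue(name, out int count);
        _refCounts[name] = count + 1;
        return count + 1;
    }

    // Mimics FreeLibrary: the module is only "unloaded" when the count hits zero.
    // Returns true when this call actually removed the module.
    public bool Free(string name)
    {
        if (!_refCounts.TryGetValue(name, out int count))
            throw new InvalidOperationException($"{name} is not loaded.");

        if (count == 1)
        {
            _refCounts.Remove(name);
            return true;
        }

        _refCounts[name] = count - 1;
        return false;
    }

    public bool IsLoaded(string name) => _refCounts.ContainsKey(name);
}
```

With this model, two `Load` calls followed by one `Free` leave the module loaded, which matches the loop with the extra LoadLibrary call: each iteration adds one more reference than it releases, so the real unload never runs.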

from fluentftp.gnutls.

FanDjango avatar FanDjango commented on June 1, 2024

@fredericDelaporte 533, once a minute is about 8 hours?

fredericDelaporte avatar fredericDelaporte commented on June 1, 2024

Yes, it is about 8 hours. Still, in my case the time between failures was not stable, so the count of init/deinit was not stable either. Maybe the crash behaves differently when the init/deinit is done in a tight loop doing nothing else, with no wait time, than when it is done in an app actually using the FTP connection (calling DownloadDirectory once before closing) and waiting one minute before doing it again.

(The machines are Windows 2022 Servers, running on AWS EC2 t3.medium (staging environments) or m6i.large; I noted no significant difference between these two configurations regarding the crash. The program is a .NET 6 background service running as a Windows service. The polled FTP server is a third party's old, obsolete, beta FileZilla Server, version 0.9.59, likely running under Windows. That FileZilla Server version dates back to 2016, and the third party does not want to update it. It seems to have some TLS peculiarity that causes .NET streams to fail with it.)

Anyway, that is great to have a good basis like your test, for investigating the trouble.

Exit code 3: this could be a C abort call according to this blog. So it could be GnuTLS explicitly killing the program, but in that case I would expect it to log something first.

FanDjango avatar FanDjango commented on June 1, 2024

Well I'll be darned:

To find out what is causing the abort, I decided to test the resiliency of LoadLibrary(...) and FreeLibrary(...) for loading libgnutls-30.dll. I haven't tested it yet for some other .dll to see if this is specific to the GNU library. EDIT: Yes, I have now tested this with other .dlls and there is no such problem with them; I can load/unload them millions of times.

The following code (.NET 6):

private const string dllNameWinUtil = @"Kernel32.dll";
[DllImport(dllNameWinUtil, CallingConvention = CallingConvention.StdCall, CharSet = CharSet.Ansi, SetLastError = true)]
private static extern IntPtr LoadLibrary([MarshalAs(UnmanagedType.LPStr)] string lpFileName);
[DllImport(dllNameWinUtil, CallingConvention = CallingConvention.StdCall, CharSet = CharSet.Ansi, SetLastError = true)]
[return: MarshalAs(UnmanagedType.Bool)]
private static extern bool FreeLibrary(IntPtr hModule);

static void my_Load_Unload_Test() {
	for (int i = 0; i < 6000; i++) {
		Console.WriteLine(i.ToString());

		// Load the .dll lib
		IntPtr dllP = LoadLibrary(@"libgnutls-30.dll");
		if (dllP == IntPtr.Zero) throw new Exception();

		// Unload the .dll lib
		bool result = FreeLibrary(dllP);
		if (!result) throw new Exception();
	}
}

Running my_Load_Unload_Test() as a console app results in this:

...(1057 lines omitted)
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069

C:\Users\M....\Test_FluentZOS_Net_6.exe (process 7960) exited with code 3.

FanDjango avatar FanDjango commented on June 1, 2024

This is moving in the direction of analyzing the GnuTLS .dll library as to what its DllMain (if present or called at all) is doing, as well as any internal constructor they have somehow contrived. The library is compiled with Mingw32 (the Windows port of gcc), and there are many concerns out there about DllMain-related problems.

Other available GnuTLS .dll libraries, like the one supplied with FileZilla, also exhibit this behaviour.

Need to test this under Linux.

Perhaps it is best to just not do any unloading of the libraries.

FanDjango avatar FanDjango commented on June 1, 2024

@Jojo-1000 Thanks to the problem described above, we are back to square one regarding your wish to have the libraries freed when no one is using them.

With the current code, there will be a GlobalDeInit(...) when use-count reaches zero, but freeing the libraries is currently commented out.
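
A rough sketch of what such a use-count scheme could look like, purely as an illustration and not the actual FluentFTP.GnuTLS code: `UseCountGuard`, `Acquire` and `Release` are hypothetical names, and the two `Action` callbacks stand in for the native global init and deinit calls.

```csharp
using System;

// Hypothetical use-count guard. The callbacks are placeholders for the
// native global init / deinit calls; nothing here loads a real library.
public sealed class UseCountGuard
{
    private readonly object _gate = new object();
    private readonly Action _initialize;
    private readonly Action _deinitialize;
    private int _useCount;

    public UseCountGuard(Action initialize, Action deinitialize)
    {
        _initialize = initialize ?? throw new ArgumentNullException(nameof(initialize));
        _deinitialize = deinitialize ?? throw new ArgumentNullException(nameof(deinitialize));
    }

    // First user triggers the global init; later users only bump the count.
    public void Acquire()
    {
        lock (_gate)
        {
            if (_useCount == 0) _initialize();
            _useCount++;
        }
    }

    // Last user triggers the global deinit.
    public void Release()
    {
        lock (_gate)
        {
            if (_useCount == 0)
                throw new InvalidOperationException("Release without matching Acquire.");

            _useCount--;
            if (_useCount == 0)
            {
                _deinitialize();
                // Deliberately no FreeLibrary here: the native library stays
                // loaded for the lifetime of the process.
            }
        }
    }

    public int UseCount
    {
        get { lock (_gate) return _useCount; }
    }
}
```

The lock keeps the count and the init/deinit calls consistent when several connections open and close concurrently. Leaving out the unload step is the design choice discussed in this thread.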

Jojo-1000 avatar Jojo-1000 commented on June 1, 2024

The interesting thing is that it is actually the FreeLibrary call that crashes.
When I run 1069 iterations of load/free dll, and then try to use FTPClient, the DLL loading fails with a proper error code:

ERROR_DLL_INIT_FAILED

1114 (0x45A)

A dynamic link library (DLL) initialization routine failed.

However, setting the same error modes and only using load/free does not produce that error, and neither does just using the FTPClient repeatedly.
So my guess is that loading fails, but that is not detected for some reason, so the call to free crashes (and probably other uses of the library). This is strange, but I guess leaving the library loaded is not too bad for now. At least now it is not initialized all the time.

FanDjango avatar FanDjango commented on June 1, 2024

A dynamic link library (DLL) initialization routine failed.

That would be the DllMain produced by the vanilla Mingw32 gcc Windows build, or whatever they use. It failed; they put too much stuff in there. If you google GnuTLS DllMain you will see old (2014) discussions about "doing away with Global Init" and instead "self initializing" the library. Who knows how far they went and whether it is Windows compatible, as they seemed to prefer the ELF side of these things at that time. Regardless, the question is whether the Windows implementation of the DLL's DllMain via Mingw32 is correct for this task.

Big thanks for also trying this out. Isn't it nice / or strange that you get the same iterations before failure even on a different system?

robinrodricks avatar robinrodricks commented on June 1, 2024

@FanDjango can we just keep the DLL libs loaded and not unload them ever? Will that solve your problem? Is there a reason to unload the libs in the first place? I don't know much here, so apologies, but usually simpler solutions work better, especially with native<->.NET interop.

FanDjango avatar FanDjango commented on June 1, 2024

Released to NuGet.

@robinrodricks

can we just keep the DLL libs loaded and not unload them ever

My response to that:

With the current code, there will be a GlobalDeInit(...) when use-count reaches zero, but freeing the libraries is currently commented out.

FanDjango avatar FanDjango commented on June 1, 2024

Please let me add a few insights I gained over the past weeks on these issues:

#40
To make the code viable for Linux without a lot of duplicated code, we stopped using DllImport and started loading the function delegates ourselves with the help of LoadLibrary and FreeLibrary. It seemed "polite" to free the library after use. Nobody noticed the inherent flaw.
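
For readers unfamiliar with the delegate approach, here is a hedged sketch of the general technique. It uses the .NET `NativeLibrary` class rather than the direct LoadLibrary/FreeLibrary imports described above, so it is a swapped-in variant and not the project's actual code; `NativeDelegateLoader` and its methods are hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: loads a native library and binds exports to delegates.
// Uses System.Runtime.InteropServices.NativeLibrary (.NET Core 3.0 and later),
// which works on Windows and Linux alike.
public static class NativeDelegateLoader
{
    // Loads the library and returns its handle. The handle is never freed
    // here, so the library stays loaded for the lifetime of the process.
    public static IntPtr LoadOrThrow(string libraryName)
    {
        if (!NativeLibrary.TryLoad(libraryName, out IntPtr handle))
            throw new DllNotFoundException($"Could not load {libraryName}.");
        return handle;
    }

    // Resolves an exported function and wraps it in a managed delegate.
    // NativeLibrary.GetExport throws EntryPointNotFoundException if the
    // export does not exist.
    public static TDelegate GetFunction<TDelegate>(IntPtr handle, string exportName)
        where TDelegate : Delegate
    {
        IntPtr address = NativeLibrary.GetExport(handle, exportName);
        return Marshal.GetDelegateForFunctionPointer<TDelegate>(address);
    }
}
```

A caller would declare a delegate type matching the native signature (marked with `UnmanagedFunctionPointer` and the right calling convention), call `LoadOrThrow` once, and then call `GetFunction` for each export it needs. Whether to ever call `NativeLibrary.Free` on the handle is exactly the question this issue is about.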

#88
Making the library work in a multi-threaded environment meant moving GlobalDeInit and FreeLibrary into a use-count-based scheme. This didn't change the inherent flaw introduced long ago for #40.

(this one) #100
Finally, the problem already reported in #88, the failure after 8 hours, or after 533 FreeLibrary calls, surfaced, although it should have been reported many weeks earlier as a separate topic. The user concerned only noticed in the course of the #88 resolution that it was helping him as well.

As usual, a number of things, mixed up together, have contributed to this evolution.

What remains though, is a numb feeling of despair:

libgnutls-30.dll, at least under Windows, incorporates a number of internal schemes to facilitate its own initialization, some of which were introduced quite long ago. At least for the Windows .dll, this design might be faulty and cause DllMain to do things that are not fully compatible: you cannot load/unload this .dll as often as you like.

Is it worth searching for the exact cause of this? Reporting it to the GnuTLS people? I am hesitant, as I feel they probably have bigger concerns. I myself do not currently have the resources to investigate both the GnuTLS source and the way the Mingw toolchain for compiling to Windows might be making this happen.

Does repeated loading/unloading of libgnutls also fail under Linux? If so, perhaps it would be worth reporting to them.

Another observation that makes disabling the FreeLibrary call a happy thing to do: the performance gain is actually quite noticeable. Dynamic library unloading (= Free) is quite resource intensive.

And with that, I will soon close this issue here before it gets too long. Any problems encountered with 1.0.22 should please be reported in a new issue.

FanDjango avatar FanDjango commented on June 1, 2024

@Jojo-1000

Another interesting thing: the following code does not crash; it runs more or less forever. The change is so abstruse that I don't consider this to be a viable solution, just a curiosity:

static void my_Load_Unload_Test() {
	for (int i = 0; i < 66000; i++) {
		Console.WriteLine(i.ToString());

		try {
			// Load the .dll lib
			IntPtr dllP = LoadLibrary(@"libgnutls-30.dll");
			if (dllP == IntPtr.Zero) throw new Exception();

			_ = LoadLibrary(@"libgnutls-30.dll");

			// Unload the .dll lib
			bool result = FreeLibrary(dllP);
			if (!result) throw new Exception();
		}
		catch (Exception e) {
			Console.WriteLine($"{e.Message}");
		}
	}
}

The only change: One additional LoadLibrary call, throwing away the returned pointer

FanDjango avatar FanDjango commented on June 1, 2024

Oops, so now you see what happens when you lose focus and just play around. In my mainframe world, the additional load by the same TCB (task control block; here it would be a thread) would not have counted. My bad.
