Unhandled stack overflow with infinite recursion

Rachael
Posts: 13591
Joined: Tue Jan 13, 2004 1:31 pm
Preferred Pronouns: She/Her

Post by Rachael »

This might be unfixable but I figured it couldn't hurt to report it.

This will cause a very fast CTD with a stack overflow from the scripting VM:

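	override bool CanCollideWith (Actor other, bool passive)
	{
		// Bug: this calls the override itself rather than Super.CanCollideWith,
		// so it recurses until the stack overflows.
		return CanCollideWith(other, passive);
	}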
The intent, of course, was to call Super.CanCollideWith, but I forgot, and that caused the crash.
Graf Zahl
Lead GZDoom+Raze Developer
Posts: 49087
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: Unhandled stack overflow with infinite recursion

Post by Graf Zahl »

Detecting this would need some thorough analysis in the compiler, which is beyond its scope.
Player701
Posts: 1640
Joined: Wed May 13, 2009 3:15 am
Graphics Processor: nVidia with Vulkan support

Re: Unhandled stack overflow with infinite recursion

Post by Player701 »

It is definitely not possible to do a 100% accurate analysis, as that would be equivalent to solving the halting problem. Control flow analysis may detect simple cases like the one in the example, where the offending recursive call happens without any prior condition check.

But wouldn't it be easier to do this at runtime by limiting the VM stack depth to some reasonable threshold value? If one more frame is added, a VM abort would happen. (Somewhat similar to how runaway ACS scripts are detected.)
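
To illustrate the "simple case" that control flow analysis could catch, here is a minimal sketch. The statement model (`Stmt`, `Function`) and the function `HasUnconditionalSelfCall` are hypothetical and heavily simplified. They are not taken from the ZScript compiler, and the check ignores virtual dispatch, loops and indirect recursion.

```cpp
#include <string>
#include <vector>

// Hypothetical, heavily simplified statement model (an assumption for
// illustration only; a real compiler AST is far richer).
enum class StmtKind { Call, Conditional, Return, Other };

struct Stmt {
    StmtKind kind;
    std::string callee;  // only meaningful when kind == StmtKind::Call
};

struct Function {
    std::string name;
    std::vector<Stmt> body;  // top-level statements in evaluation order
};

// Returns true when the function calls itself before reaching any statement
// that could prevent the call (a conditional or a return). Every other form
// of infinite recursion goes undetected.
bool HasUnconditionalSelfCall(const Function& fn)
{
    for (const Stmt& s : fn.body) {
        if (s.kind == StmtKind::Call && s.callee == fn.name) {
            return true;
        }
        if (s.kind == StmtKind::Conditional || s.kind == StmtKind::Return) {
            return false;  // the self-call, if any, is no longer unconditional
        }
    }
    return false;
}
```

With the reported override modeled as a call to `CanCollideWith` followed by a return, the check flags it. The same body calling `Super.CanCollideWith` passes.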
Professor Hastig
Posts: 232
Joined: Mon Jan 09, 2023 2:02 am
Graphics Processor: nVidia (Modern GZDoom)

Re: Unhandled stack overflow with infinite recursion

Post by Professor Hastig »

What VM stack? Aside from allocating the VM frames from the heap, it just recursively calls the VM's exec function.
Player701
Posts: 1640
Joined: Wed May 13, 2009 3:15 am
Graphics Processor: nVidia with Vulkan support

Re: Unhandled stack overflow with infinite recursion

Post by Player701 »

Well, I thought since the engine is capable of producing VM stack traces (e.g. when a VM abort happens), it should know how many VM stack frames there are at any given time.

If not, then here's another proposal: keep a running count of recursive VM calls, incrementing before each call and decrementing after the call has returned. If the count reaches a certain threshold, abort execution.

There certainly should be a way to keep track of this in one form or another. Of course it would not be 100% accurate, but the same can be said about control flow analysis, and that one is definitely much harder to implement. Another benefit of a runtime check is that it prevents a hard crash.
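
The running count proposed above could look roughly like the following sketch. The names `CallDepthGuard`, `VMAbort` and `kMaxCallDepth`, and the limit of 1000, are assumptions for illustration and are not GZDoom's actual code. Each call creates a guard that increments the counter on entry and decrements it on exit, and the guard throws before the limit is exceeded.

```cpp
#include <stdexcept>

// Hypothetical limit (an assumption); a real engine would have to tune it.
constexpr int kMaxCallDepth = 1000;

// Stand-in for a VM abort; the name is an assumption for this sketch.
struct VMAbort : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// RAII guard: increments the per-thread depth on entry and decrements it
// again when the call returns or unwinds.
class CallDepthGuard {
public:
    CallDepthGuard()
    {
        // Check before incrementing, so a throwing constructor leaves the
        // counter untouched (the destructor does not run in that case).
        if (depth_ >= kMaxCallDepth) {
            throw VMAbort("call depth limit exceeded");
        }
        ++depth_;
    }
    ~CallDepthGuard() { --depth_; }

    CallDepthGuard(const CallDepthGuard&) = delete;
    CallDepthGuard& operator=(const CallDepthGuard&) = delete;

    static int Depth() { return depth_; }

private:
    static thread_local int depth_;
};

thread_local int CallDepthGuard::depth_ = 0;

// Hypothetical script function that recurses forever, like the reported override.
int RunawayCall(int n)
{
    CallDepthGuard guard;
    return RunawayCall(n + 1) + 1;
}
```

Calling `RunawayCall(0)` inside a `try` block ends in a `VMAbort` instead of a stack overflow, and the counter is back at zero once the exception has unwound.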
phantombeta
Posts: 2093
Joined: Thu May 02, 2013 1:27 am
Operating System Version (Optional): Windows 10
Graphics Processor: nVidia with Vulkan support
Location: Brazil

Re: Unhandled stack overflow with infinite recursion

Post by phantombeta »

I don't think the potential performance costs are worth it. Different functions also use different amounts of stack space, so the limit would have to be very conservative, which carries the very real risk of triggering falsely for code that wouldn't have crashed otherwise.
I think it'd be better to somehow modify the crash dump code to be aware of VM/JIT functions, if possible.
Professor Hastig wrote: Mon Nov 13, 2023 1:54 am What VM stack? Aside from allocating the VM frames from the heap, it just recursively calls the VM's exec function.
Indeed. Not to mention that the JIT doesn't even do that; it uses the real stack. And there can be direct calls that bypass the "VM exec" function.
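
Because frame sizes differ, one alternative to counting calls is to measure roughly how many bytes of the real stack have been consumed. The sketch below does that by comparing the address of a local variable against a recorded base address. `StackBudget`, `VMAbort` and the 256 KiB budget are assumptions for illustration. Nothing here reflects how GZDoom works, and the check would still have to be paid on every call.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Stand-in for a VM abort; the name is an assumption for this sketch.
struct VMAbort : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical budget (an assumption), chosen well below typical thread
// stack sizes.
constexpr std::size_t kStackBudgetBytes = 256 * 1024;

class StackBudget {
public:
    // Record the approximate stack position at the script entry point.
    explicit StackBudget(const volatile void* base)
        : base_(reinterpret_cast<std::uintptr_t>(base)) {}

    // Approximate bytes between the base and `here`, whichever way the
    // stack grows on this platform.
    std::size_t Used(const volatile void* here) const
    {
        const auto h = reinterpret_cast<std::uintptr_t>(here);
        return h > base_ ? h - base_ : base_ - h;
    }

    // Throws before the next call is made once the budget is exceeded.
    void Check(const volatile void* here) const
    {
        if (Used(here) > kStackBudgetBytes) {
            throw VMAbort("stack budget exceeded");
        }
    }

private:
    std::uintptr_t base_;
};

// Hypothetical recursive function with a fairly large frame.
int DeepCall(const StackBudget& budget, int n)
{
    volatile char frame[512];          // stands in for the function's locals
    frame[0] = static_cast<char>(n);
    budget.Check(frame);               // measure bytes, not call count
    return DeepCall(budget, n + 1) + frame[0];
}
```

A function with large frames trips the budget after fewer calls than one with small frames, which is the property a plain call counter lacks.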
Player701
Posts: 1640
Joined: Wed May 13, 2009 3:15 am
Graphics Processor: nVidia with Vulkan support

Re: Unhandled stack overflow with infinite recursion

Post by Player701 »

I see.

Still, IMO it'd be best to avoid a hard crash if at all possible, since that would improve the user experience, even at the cost of potential false positives.

Again, compare with how runaway ACS scripts are handled - instead of a hard freeze, the engine produces a more-or-less informative error message that can actually be helpful to the end user. A hard freeze or crash is way more likely to cause the user to think there's something wrong with the engine itself, and even if VM functions get added to the crash log, I'm not sure most users actually bother reading them. A normal VM abort, on the other hand, usually makes it clear whether it's a mod or the engine that's at fault.
Professor Hastig
Posts: 232
Joined: Mon Jan 09, 2023 2:02 am
Graphics Processor: nVidia (Modern GZDoom)

Re: Unhandled stack overflow with infinite recursion

Post by Professor Hastig »

Player701 wrote: Mon Nov 13, 2023 3:27 am Again, compare with how runaway ACS scripts are handled - instead of a hard freeze, the engine produces a more-or-less informative error message that can actually be helpful to the end user.
ACS counts executed instructions. Not only can you not do this with a JIT, but a far bigger problem is that ZScript code can be a lot more complex: some of it runs orders of magnitude longer than a simple ACS script.
Player701 wrote: Mon Nov 13, 2023 3:27 am A hard freeze or crash is way more likely to cause the user to think there's something wrong with the engine itself, and even if VM functions get added to the crash log, I'm not sure most users actually bother reading them. A normal VM abort, on the other hand, usually makes it clear whether it's a mod or the engine that's at fault.
While you are certainly correct, trying to second-guess arbitrary code to avoid a stack overflow comes with a cost that's normally too high. This cannot be done without a severe performance or usability hit.
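
For readers unfamiliar with the instruction counting mentioned above, here is a toy sketch of the idea. The bytecode (`Op`, `Instr`), the `Run` function and the `RunawayScript` exception are all invented for this example and do not reflect the real ACS interpreter. It also shows why the approach depends on an interpreter loop: natively compiled code has no such loop in which to place the check.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Stand-in for a "runaway script" termination; the name is an assumption.
struct RunawayScript : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical toy bytecode, not the real ACS instruction set.
enum class Op : std::uint8_t { Inc, Jump, Halt };

struct Instr {
    Op op;
    int arg;
};

// Interprets `program` and returns the accumulator. Throws RunawayScript as
// soon as more than `budget` instructions have been executed.
int Run(const std::vector<Instr>& program, long budget)
{
    int acc = 0;
    std::size_t pc = 0;
    long executed = 0;
    while (pc < program.size()) {
        // The check sits in the interpreter loop, once per instruction.
        if (++executed > budget) {
            throw RunawayScript("instruction budget exceeded");
        }
        const Instr& in = program[pc];
        switch (in.op) {
        case Op::Inc:  acc += in.arg; ++pc; break;
        case Op::Jump: pc = static_cast<std::size_t>(in.arg); break;
        case Op::Halt: return acc;
        }
    }
    return acc;
}
```

A program that loops forever is stopped once the budget runs out. A long-running but legitimate program is stopped too, which is the usability cost raised in this post.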
Graf Zahl
Lead GZDoom+Raze Developer
Posts: 49087
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: Unhandled stack overflow with infinite recursion

Post by Graf Zahl »

It's indeed impractical. While we could count the recursions, let's not forget that the VM needs to act like a real programming environment and not like a limited game event scripting handler like ACS. The presence of the JIT will also massively complicate things, because what it creates is just native machine code.
Rachael
Posts: 13591
Joined: Tue Jan 13, 2004 1:31 pm
Preferred Pronouns: She/Her

Re: Unhandled stack overflow with infinite recursion

Post by Rachael »

I'm guessing there's no easy way to catch a stack overflow and throw a VM abort gracefully to the console instead of a CTD?
Graf Zahl
Lead GZDoom+Raze Developer
Posts: 49087
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: Unhandled stack overflow with infinite recursion

Post by Graf Zahl »

Under Windows you can catch system exceptions with structured exception handling and pass them down the same way as C++ exceptions, but stack overflows have the nasty habit of getting triggered recursively, which just results in an application abort.
phantombeta
Posts: 2093
Joined: Thu May 02, 2013 1:27 am
Operating System Version (Optional): Windows 10
Graphics Processor: nVidia with Vulkan support
Location: Brazil

Re: Unhandled stack overflow with infinite recursion

Post by phantombeta »

Not to mention there's no guarantee of GZDoom being in a stable state after the stack overflow. All the other VM aborts are based on checking *before* a potentially dangerous action, so they're all guaranteed to maintain stability, whereas trying to catch a stack overflow is very likely to end up with GZDoom in an unstable state.
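
The "check before the dangerous action" pattern described above can be sketched as follows, using a null check and a bounds check as examples. `CheckedRead` and `VMAbort` are hypothetical names for this illustration, not GZDoom functions. The abort is raised before anything has been read, so no state is left half-modified.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Stand-in for a VM abort; the name is an assumption for this sketch.
struct VMAbort : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Validates the access first and only then performs it. When the abort is
// thrown, nothing has been read or written yet.
int CheckedRead(const std::vector<int>* arr, std::size_t index)
{
    if (arr == nullptr) {
        throw VMAbort("null pointer dereference");
    }
    if (index >= arr->size()) {
        throw VMAbort("array index out of bounds");
    }
    return (*arr)[index];
}
```

A stack overflow is different in kind: by the time it is detected the failure has already happened, so there is no "before" at which to place such a check unless stack use is tracked explicitly.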
Player701
Posts: 1640
Joined: Wed May 13, 2009 3:15 am
Graphics Processor: nVidia with Vulkan support

Re: Unhandled stack overflow with infinite recursion

Post by Player701 »

phantombeta wrote: Mon Nov 13, 2023 3:04 am performance costs
Just curious, but would it really incur that much overhead compared to other safety checks that are already present in the VM? For example, GZDoom does check for null pointer dereference in ZScript instead of just letting itself crash, and it also makes sure arrays are not accessed out of bounds. All of those definitely cost some performance, but on modern CPUs the difference is likely not noticeable at all.
Graf Zahl wrote: Mon Nov 13, 2023 12:32 pm Under Windows you can catch system exceptions with structured exception handling and pass them down the same way as C++ exceptions, but stack overflows have the nasty habit of getting recursively triggered which just results in an application abort.
What about SetThreadStackGuarantee? It looks like it's provided specifically to help handle stack overflow exceptions, so perhaps it could be of use? (I don't know of any equivalent for other platforms, though.)
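
For reference, a rough, untested sketch of how SetThreadStackGuarantee is typically combined with structured exception handling under MSVC. `RunWithOverflowGuard`, `ScriptEntry` and `FilterStackOverflow` are hypothetical names, and the 64 KiB reserve is an arbitrary assumption. The sketch only shows the mechanism and does nothing about the concern raised earlier that the engine may be left in an unstable state. On other compilers it falls back to a plain call.

```cpp
#if defined(_MSC_VER)
#include <windows.h>
#include <malloc.h>  // _resetstkoflw
#endif

// Hypothetical entry point type standing in for "run this script function".
using ScriptEntry = int (*)(int);

#if defined(_MSC_VER)
// SEH filter: handle only stack overflows, let everything else propagate.
static int FilterStackOverflow(unsigned long code)
{
    return code == EXCEPTION_STACK_OVERFLOW ? EXCEPTION_EXECUTE_HANDLER
                                            : EXCEPTION_CONTINUE_SEARCH;
}
#endif

// Returns true if `entry` ran to completion and stored its result, false if
// a stack overflow was intercepted (MSVC builds only).
bool RunWithOverflowGuard(ScriptEntry entry, int arg, int* result)
{
#if defined(_MSC_VER)
    // Ask the OS to keep this many bytes of stack available for the handler
    // after an overflow. The value is overwritten with the previous size.
    ULONG reserveBytes = 64 * 1024;  // arbitrary assumption
    SetThreadStackGuarantee(&reserveBytes);

    __try {
        *result = entry(arg);
        return true;
    }
    __except (FilterStackOverflow(GetExceptionCode())) {
        // Restore the guard page so a later overflow can be detected again.
        _resetstkoflw();
        return false;
    }
#else
    // No portable equivalent is assumed here; just call through.
    *result = entry(arg);
    return true;
#endif
}
```

Even if this works as intended, the call chain is cut off at an arbitrary point, so any cleanup the interrupted functions would normally perform is skipped.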