@HydrogenC

@HydrogenC HydrogenC commented Nov 29, 2025

A rework of #110997 (see this and this). The implementation is now a full adaptation of XeGTAO, and its code is completely decoupled from, and independent of, ASSAO.

Showcase:
f47b2aa5856d6cdf4d156a30597dcb13

Implementation details

A full port of the XeGTAO gather and blur passes, with the following difference: Godot uses a de-interleaved depth buffer and half-resolution intermediate images, whereas XeGTAO uses full-resolution images. I therefore took the interleave pass from ASSAO to restore the half-resolution image to full resolution.
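
To illustrate the de-interleave/interleave idea, here is a minimal CPU-side sketch. It assumes a 2x2 pattern in which pixel `(x, y)` goes to slice `(y % 2) * 2 + (x % 2)`; the function names and the slice layout are assumptions made for illustration, and the actual shaders may lay the data out differently.

```python
def deinterleave(image):
    """Split a 2D image (even width and height) into four half-resolution slices."""
    height, width = len(image), len(image[0])
    slices = [[[None] * (width // 2) for _ in range(height // 2)] for _ in range(4)]
    for y in range(height):
        for x in range(width):
            # Assumed layout: slice index comes from the pixel's position in its 2x2 block.
            slices[(y % 2) * 2 + (x % 2)][y // 2][x // 2] = image[y][x]
    return slices


def interleave(slices):
    """Rebuild the full-resolution image from the four half-resolution slices."""
    half_h, half_w = len(slices[0]), len(slices[0][0])
    image = [[None] * (half_w * 2) for _ in range(half_h * 2)]
    for y in range(half_h * 2):
        for x in range(half_w * 2):
            image[y][x] = slices[(y % 2) * 2 + (x % 2)][y // 2][x // 2]
    return image


image = [[y * 4 + x for x in range(4)] for y in range(4)]
slices = deinterleave(image)
restored = interleave(slices)
```

Each slice holds every second pixel in both directions, so a pass that works on one slice touches a quarter of the pixels, and the interleave step puts the results back in their original positions.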

Possible questions

Q: Why still keep ASSAO as an option?

A: Keeping ASSAO and making the mode toggleable makes it easier to compare the different AO effects during the development phase. GTAO (or, specifically, XeGTAO) is recommended for use together with TAA, whereas ASSAO has no such requirement. Also, since the current GTAO implementation runs parallel to ASSAO, it would be easy to remove ASSAO and switch completely to GTAO if required.

Q: Why not GT-VBAO?

A: Here's another branch that implements GT-VBAO on top of the changes from this PR: https://github.com/HydrogenC/godot/tree/vbao-neo. It's possible that that branch, rather than this one, will be adopted in the end. Here's a preview:
[Image: WeChat image 20251129121834_154_49]

Future works

These are mostly chores that still have to be done, but I'd prefer to do them after the content of this PR gets approved:

  • Expose the thickness parameter as an editor parameter
  • Write docs

@Saul2022

Saul2022 commented Nov 29, 2025

It looks beautiful. On my end it does look like XeGTAO, so that's really good. However, I noticed that the cost becomes much higher when going from low to medium or high quality (it is far more expensive than ASSAO at those quality levels). For example, ASSAO gave me 95 fps, while XeGTAO went from a comfortable 105 fps on low down to less than 65 fps on high. It might be because of overheating or because I'm not using TAA, but the issue is that TAA by itself is a performance killer in Godot (since it's not optimized at all). So maybe you could apply the temporal supersampling effect to XeGTAO by default, as clay said before, so that it doesn't need TAA and might perform better.
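
For reference, the frame rates quoted above can be converted to frame times, which are easier to compare. This is plain arithmetic on the reported numbers; they are whole-frame times, not the cost of the AO pass alone, and since the high setting was reported as "less than 65 fps", the difference is a lower bound.

```python
def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


assao_ms = frame_time_ms(95)       # ASSAO
gtao_low_ms = frame_time_ms(105)   # XeGTAO, low quality
gtao_high_ms = frame_time_ms(65)   # XeGTAO, high quality (upper bound on fps)

# Extra time per frame when going from low to high quality.
extra_ms = gtao_high_ms - gtao_low_ms
print(f"{gtao_low_ms:.2f} ms -> {gtao_high_ms:.2f} ms (+{extra_ms:.2f} ms)")
```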

Also, here is a comparison of Intel's XeGTAO with this PR (they look the same to me now):
exterior-assao-medium-vs-gtao-high~2

editor_screenshot_2025-11-29T094923

I can also confirm that the low-quality blurring artifacts are gone! The bottom image uses the default AO intensity, which is why it looks more intense than in the Godot case (ASSAO and XeGTAO use different intensity values).

@mrjustaguy
Contributor

I'd hardly call just under 1 ms at 1440p on something like an RX 6600 a "performance killer" for TAA.

@Saul2022

Saul2022 commented Nov 29, 2025

I'd hardly call just under 1 ms at 1440p on something like an RX 6600 a "performance killer" for TAA.

I mean, lower-end GPUs exist (in my game, using TAA hurts performance a lot, just as if I used FSR2), and that's on an M3. Also, the framerate was captured at FHD while running the game; I don't use HiDPI in game. Plus, you can't see the framerate in the editor on Mac.

@mrjustaguy
Contributor

That may just be a Mac problem. From what I've been able to find, the weakest M3 should be at least in the 1050 Ti's ballpark GPU-wise, which at 1080p should mean roughly 2 ms for TAA to 5 ms for FSR2 native, or better.
Also, on such a device I'd avoid post-processing in general, as it'll quickly eat up half your frame time budget if you're targeting, say, 1080p60. I would even consider taking a look at the Mobile renderer when targeting such hardware, but that's just me.
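
To put those estimates in context, here is a small sketch of how much of a 60 fps frame budget a pass of a given cost would take. The 2 ms and 5 ms inputs are the rough estimates from the comment above, not measurements, and `budget_share` is just an illustrative helper.

```python
def budget_share(pass_ms, target_fps):
    """Fraction of the per-frame time budget consumed by a pass costing pass_ms."""
    budget_ms = 1000.0 / target_fps
    return pass_ms / budget_ms


taa_share = budget_share(2.0, 60)   # estimated TAA cost at 1080p
fsr2_share = budget_share(5.0, 60)  # estimated FSR2 native cost at 1080p
print(f"TAA: {taa_share:.0%} of the budget, FSR2: {fsr2_share:.0%}")
```

At 60 fps the budget is about 16.7 ms per frame, so a few post-processing passes in the 2 to 5 ms range add up quickly.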

@HydrogenC
Author

so maybe you could apply the temporal supersampling effect to XeGTAO by default, as clay said before, so that it doesn't need TAA and might perform better.

This is what I want to do, but unfortunately it would be difficult. Godot generates motion vectors during the opaque pass, but AO runs before the opaque pass, so the motion vectors that a temporal pass would need do not exist yet when AO runs. Reordering the core render pipeline would be out of scope.
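
The ordering problem can be sketched as a dependency check over a simplified pass list. The pass and resource names here are hypothetical and heavily simplified; they only mirror the ordering described above (AO before opaque, motion vectors written by opaque), not Godot's actual render graph.

```python
def find_unmet_inputs(passes):
    """Return (pass_name, resource) pairs read before any earlier pass wrote them."""
    available = set()
    unmet = []
    for name, reads, writes in passes:
        for resource in reads:
            if resource not in available:
                unmet.append((name, resource))
        available.update(writes)
    return unmet


# Hypothetical frame order: (pass name, resources read, resources written).
current_order = [
    ("depth_prepass", [], ["depth", "normals"]),
    ("ambient_occlusion", ["depth", "normals"], ["ao"]),
    ("opaque", ["depth", "ao"], ["color", "motion_vectors"]),
    ("taa", ["color", "motion_vectors"], ["resolved_color"]),
]

# Same order, but with AO doing its own temporal accumulation.
temporal_ao_order = [
    (name, reads + ["motion_vectors"], writes) if name == "ambient_occlusion"
    else (name, reads, writes)
    for name, reads, writes in current_order
]

print(find_unmet_inputs(current_order))      # []
print(find_unmet_inputs(temporal_ao_order))  # [('ambient_occlusion', 'motion_vectors')]
```

In this simplified model, TAA works because it runs after the opaque pass, while a temporal AO pass in the current position would read motion vectors that nothing has produced yet.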

@HydrogenC
Author

I can also confirm that the low-quality blurring artifacts are gone! The bottom image uses the default AO intensity, which is why it looks more intense than in the Godot case (ASSAO and XeGTAO use different intensity values).

Godot uses 2.0 as the default intensity for ASSAO. For GTAO, I would recommend 1.0 as the best intensity.

@HydrogenC
Author

I can also confirm that the low-quality blurring artifacts are gone! The bottom image uses the default AO intensity, which is why it looks more intense than in the Godot case (ASSAO and XeGTAO use different intensity values).

I added an intensity multiplier in my latest commit, so it now looks better at the default intensity value (the strength of GTAO looks similar to that of ASSAO at the same intensity value).

@Ansraer
Contributor

Ansraer commented Nov 29, 2025

Why no Visibility Masks?
A: SSILVB works better with colors. The difference is that AO is run before the color is rendered and SSIL is run after. So SSILVB would work better as an SSIL algorithm than as an AO algorithm. For anyone interested in implementing SSILVB, I have a working experimental Godot branch of SSILVB here. Feel free to take it and adapt it into an SSIL method.

Uhm, sorry, but based on my understanding of VBAO I have to disagree with that statement. AFAIK the idea of using bitmasks during AO is to improve the accuracy of occlusion detection, resulting in a more accurate AO effect with fewer artifacts. The way I interpret it, the author's use of it for indirect illumination is more of a welcome, optional side benefit that can be implemented on top.
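
A minimal sketch of the bitmask idea, assuming each slice of the hemisphere is divided into 32 equal sectors and every sample marks the sectors between its front and back horizon (both normalized to the range 0 to 1). The function names are hypothetical and the sector mapping is simplified compared to a real shader.

```python
import math

SECTOR_COUNT = 32


def mark_sectors(bitfield, min_horizon, max_horizon):
    """Set the bits for the sectors covered by one occluder.

    min_horizon and max_horizon are normalized positions in [0, 1] along the slice.
    """
    start = min(int(min_horizon * SECTOR_COUNT), SECTOR_COUNT)
    count = math.ceil((max_horizon - min_horizon) * SECTOR_COUNT)
    count = max(0, min(count, SECTOR_COUNT - start))  # keep the mask inside 32 bits
    return bitfield | (((1 << count) - 1) << start)


def visibility(bitfield):
    """Fraction of sectors in the slice that no sample has marked as occluded."""
    return 1.0 - bin(bitfield).count("1") / SECTOR_COUNT


# Two occluders with a gap between them: only the covered sectors are occluded.
bits = mark_sectors(0, 0.0, 0.25)
bits = mark_sectors(bits, 0.75, 1.0)
print(hex(bits), visibility(bits))  # 0xff0000ff 0.5
```

A single-horizon approach keeps only the highest horizon angle per direction, so it cannot represent the open gap between the two occluders; the bitmask keeps it, which is where the accuracy gain for thin or floating geometry comes from.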

While glancing at your code, I also noticed that your implementation still uses an acos approximation. If I'm not mistaken, it should be possible to implement GTAO without needing that at all. Maybe check out https://www.shadertoy.com/view/4cdfzf; AFAIK that's the current state of the art when it comes to AO.
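
For context, this is the kind of acos approximation being discussed: a commonly used first-degree polynomial multiplied by a square-root term. The constants are the ones I believe XeGTAO's fast acos helper uses; whether this PR's shader uses exactly this form is an assumption. The sketch compares it against `math.acos` over the whole input range.

```python
import math


def fast_acos(x):
    """Cheap approximation of acos(x) for x in [-1, 1]."""
    ax = abs(x)
    res = (-0.156583 * ax + math.pi / 2.0) * math.sqrt(1.0 - ax)
    # acos(-x) = pi - acos(x), so negative inputs reuse the positive branch.
    return res if x >= 0.0 else math.pi - res


# Largest absolute error on a uniform grid of 2001 points in [-1, 1].
max_error = max(
    abs(fast_acos(i / 1000.0) - math.acos(i / 1000.0)) for i in range(-1000, 1001)
)
print(f"max error: {max_error:.4f} rad")
```

The error stays on the order of 1e-2 radians, which is why this trade-off is usually considered acceptable for AO.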

@mrjustaguy
Contributor

The results of VBAO weren't good as per #110997 (comment)

@HydrogenC
Author

While glancing at your code, I also noticed that your implementation still uses an acos approximation. If I'm not mistaken, it should be possible to implement GTAO without needing that at all. Maybe check out https://www.shadertoy.com/view/4cdfzf; AFAIK that's the current state of the art when it comes to AO.

Small problem: even VBAO still uses fast acos. The difference is that VBAO no longer does the inner integration.
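
For readers unfamiliar with the term, the "inner integration" in GTAO is the closed-form, cosine-weighted integral over one slice, evaluated from the two horizon angles and the projected normal angle. The sketch below uses the formula as I recall it from the GTAO paper and omits the projected-normal-length weight; treat it as an illustration rather than as the PR's code. The horizon angles `h0` and `h1` are where acos comes in, because the gather pass tracks horizon cosines.

```python
import math


def slice_visibility(h0, h1, n):
    """Cosine-weighted visibility of one slice.

    h0 <= 0 <= h1 are the horizon angles and n is the projected normal angle,
    all in radians and measured from the view direction within the slice plane.
    """
    arc0 = (math.cos(n) + 2.0 * h0 * math.sin(n) - math.cos(2.0 * h0 - n)) / 4.0
    arc1 = (math.cos(n) + 2.0 * h1 * math.sin(n) - math.cos(2.0 * h1 - n)) / 4.0
    return arc0 + arc1


open_slice = slice_visibility(-math.pi / 2.0, math.pi / 2.0, 0.0)  # nothing occludes
half_open = slice_visibility(-math.pi / 4.0, math.pi / 4.0, 0.0)   # horizons at 45 degrees
closed_slice = slice_visibility(0.0, 0.0, 0.0)                     # fully occluded
```

With the normal facing the viewer, the three cases evaluate to 1, 0.5, and 0 respectively.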

@HydrogenC
Author

HydrogenC commented Nov 30, 2025

The results of VBAO weren't good as per #110997 (comment)

I figured it out. The rim artifact was caused by the ASSAO blurring pass; after replacing it with the XeGTAO blurring pass, it now looks nice.
I'm glad to share my latest experiments with VBAO:
image
image

For anyone who's interested in trying out VBAO, here's the branch: https://github.com/HydrogenC/godot/tree/vbao-neo.
It's up to Godot maintainers whether to use VBAO or vanilla GTAO.

@Ansraer
Contributor

Ansraer commented Nov 30, 2025

Great to see that you managed to get VBAO working!
Makes sense that the problem was either with the blurring pass or the interleaving pass.

On another note, have you already considered how this is going to behave when TAA is enabled? While Godot's TAA implementation is admittedly not the best, I would still try to leverage it (when enabled) for the best possible results. It has been a while since I last read the official paper, but IIRC GTAO was designed with some kind of temporal accumulation in mind, and the blur was only ever intended as a fallback option.

If we are lucky it might be possible to just skip the blurring entirely when TAA is on (Though we might still need it, since this is rendering at half the resolution the official version is using. Even then, we might be able to make it somewhat cheaper, reduce kernel size, ...).

And to come back to the acos discussion, have you checked the link I posted above? I am on my phone right now, so it is somewhat difficult for me to verify this myself, but I am fairly certain that this implementation works without needing any acos at all.

@HydrogenC
Author

HydrogenC commented Nov 30, 2025

On another note, have you already considered how this is going to behave when TAA is enabled? While Godot's TAA implementation is admittedly not the best, I would still try to leverage it (when enabled) for the best possible results. It has been a while since I last read the official paper, but IIRC GTAO was designed with some kind of temporal accumulation in mind, and the blur was only ever intended as a fallback option.

I implemented this from XeGTAO. XeGTAO omits the temporal denoising pass mentioned in the paper and relies on TAA for temporal denoising, and so does my implementation. Although XeGTAO recommends always using it with TAA, I tested without TAA and it still looks fine. The comparison screenshots above were all taken with TAA enabled, as that is the recommended way to use XeGTAO.
As for your link that optimizes acos out, I couldn't really understand the mathematics in it. Can it be proven to be mathematically identical to the version with acos? I referenced a few game-engine-ready implementations of VBAO while writing the code, such as the ones in Bevy and Spartan Engine, and they all use the version with acos.

@mrjustaguy
Contributor

mrjustaguy commented Nov 30, 2025

Do note that I have been working on TAA (#113043), so results may be subject to change should that be merged, though other TAA-improved effects (namely soft shadows) haven't really changed their behavior as far as I can tell.
