Thank you! The performance of the effect is one thing, but the performance of the whole scene / engine is yet another issue to optimize. I have just realized that I didn't follow one of the most important rules: keep the surface count low. For instance, 7000 grass meshes (2 crossed quads each) that are made with copyentity and placed and oriented around the camera via a contingent are still 7000 surfaces, which is a huge impact.
I just optimized it here from 20 fps to 30 fps on my little card, simply by kicking the contingent system out and creating 70'000 grass meshes plus 10x10 dummy meshes that are distributed evenly over the area. Then I addmesh each grass mesh to the dummy that covers its location. This way the grass is split up into 100 segments, each one containing only 3 surfaces, because there are 3 different brushes / grass types. AddMesh takes care of that: when you merge meshes, it reuses an already existing identical brush and adds the geometry to the corresponding surface, which keeps the surface count low. The camera range can then easily exclude a lot of the grass sectors. It really speeds things up.
I'll do the same with the trees and bushes and get rid of the LOD system. Funny, it all started with the LOD. I'm also trying to get better grass and testing some alternative ground; I'll add a screenie.
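To make the sector idea above a bit more concrete, here is a rough sketch in Python rather than in the engine's own language. It only models the bookkeeping (which blade goes to which dummy, and how many surfaces result), not the actual mesh merging. The area size of 1000 units, the camera position and range, and all function names are my own assumptions for illustration; only the 10x10 grid, the 3 brushes and the 70'000 blades come from the post.

```python
import random
from collections import defaultdict

GRID = 10        # 10x10 dummy meshes (from the post)
BRUSHES = 3      # three grass types / brushes (from the post)
AREA = 1000.0    # side length of the grass area (assumed value)

def sector_of(x, z, area=AREA, grid=GRID):
    """Return (column, row) of the dummy mesh covering point (x, z)."""
    cell = area / grid
    col = min(int(x / cell), grid - 1)  # clamp points lying exactly on the far edge
    row = min(int(z / cell), grid - 1)
    return col, row

def build_sectors(blades):
    """Group blades by sector, then by brush; each inner list stands for one surface."""
    sectors = defaultdict(lambda: defaultdict(list))
    for x, z, brush in blades:
        sectors[sector_of(x, z)][brush].append((x, z))
    return sectors

def surface_count(sectors):
    """One surface per distinct brush inside each sector."""
    return sum(len(by_brush) for by_brush in sectors.values())

def visible_sectors(sectors, cam_x, cam_z, cam_range, area=AREA, grid=GRID):
    """Keep sectors whose centre is within cam_range plus half a cell diagonal."""
    cell = area / grid
    slack = cell * 2 ** 0.5 / 2
    result = []
    for col, row in sectors:
        cx, cz = (col + 0.5) * cell, (row + 0.5) * cell
        dist = ((cx - cam_x) ** 2 + (cz - cam_z) ** 2) ** 0.5
        if dist <= cam_range + slack:
            result.append((col, row))
    return result

random.seed(1)
blades = [
    (random.uniform(0, AREA), random.uniform(0, AREA), random.randrange(BRUSHES))
    for _ in range(70_000)
]
sectors = build_sectors(blades)
in_range = visible_sectors(sectors, 500.0, 500.0, 150.0)

print(len(blades), "separate meshes ->", surface_count(sectors), "merged surfaces")
print(len(in_range), "of", len(sectors), "sectors within camera range")
```

The surface count can never exceed 10 x 10 x 3 = 300 here, no matter how many blades are scattered, and the range check shows why most sectors can be skipped when the camera range is small compared to the area.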
Whether render to texture is faster than copyrect, I don't know. In theory, when you render straight to the texture you can skip the copyrect step, but you have to render either way. So I guess yes, it might be faster, even though a 256x256 copyrect to a 256-flagged texture took only about 1 or 2 ms here. I guess optimizing the scene as mentioned has a bigger impact, especially since the scene is rendered twice.
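As a quick sanity check of that guess, the numbers from this post can be put side by side in frame time. This is only arithmetic on my own figures (20 fps, 30 fps, up to 2 ms for copyrect), written as a small Python sketch; it says nothing about how other cards or scenes behave.

```python
def frame_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps

before = frame_ms(20)   # frame time before merging the grass
after = frame_ms(30)    # frame time after merging the grass
saved = before - after
copyrect_ms = 2.0       # upper end of the "1 or 2 ms" measured for copyrect

print(f"{before:.1f} ms -> {after:.1f} ms per frame, saved {saved:.1f} ms")
print(f"that is about {saved / copyrect_ms:.0f}x the copyrect cost")
```

Skipping copyrect entirely could save at most those 1 or 2 ms per frame, while the surface reduction saved roughly 17 ms here.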