That's helpful, thanks! And yeah, application is the focus. Specifically, application to responsibility for work, not simply capability. It's not about being able to do the work; it's about being responsible for the work getting done in the context of the systems he's deployed in.
And the harness is being tuned. I started off with OpenClaw and have been mapping out its weak points. I'm currently exploring Hermes Agent as my next harness vs building a custom one. I also fine-tuned a model on the work we've done together over the last month+, and it shows a lot of promise in helping the model work better specifically within the harness it was based on. It ended up doing almost opus-4.6 level work for a fraction of the cost while in the OpenClaw harness, but a lot of that was lost when I tried running the fine-tune in Hermes.
This ties into the insight I had that intelligence is driven more by networks than by individual models.