So actually, there is another one that isn't mine or in the training data, because it was a friend's own personal theory (SKL), and theirs also came up higher on average, but still didn't come out on top. Someone at the hackathon also wrote down a sophisticated version of their own moral theory and ran it; it didn't break the top 7, and SFOM still came out on top.
So yes, I think you make a great point, and that's one of the things that should be added to a more sophisticated version of the benchmark ranking system! (In fact I wanted to automate it and create a site so people could add their own more easily, etc., to make the ranking and research loop continuous.) I think the other two friend theories ranking above the top 50% shows you might be right that the AIs have a sycophantic bias toward out-of-training-data theories... On the other hand, I thought they might have a bias toward theories in their training data, since they have a lot more information on those versus the personal summaries of out-of-training-data theories (like my own, the friends', and the hackathon submission), so it's not cut and dried which way the system would be biased.
I did talk about the limitations of this current rough version of the benchmark on the site, but I'd add all your concerns and more to it. So maybe instead of 2-3 non-training-data theories, we'll have many more to reduce sycophancy. But it is interesting to note, as a counter to your point, that among the 3 non-training-data theories, SFOM still ended up on top. Although imperfect, I think that's a non-trivial signal that this should happen consistently.
I do think the criteria are also not the strongest. They were LLM-generated at the time (to reduce my own bias) by asking the LLMs to come up with a thorough list of criteria to assess and rank moral theories. There are many criteria, however, that I think can be cut, merged, or refined, and I want to add quantitative criteria and more. With funding I'd also try to get expert philosophers to contribute and write their own criteria, etc. Plus, the automated website version would allow anyone to critique and submit their own criteria.
I think you're making a good point about the circularity of some of the criteria, but that's a whole other can of ouroboros to discuss in more detail!
Once again, as stated in the doc and site, the benchmark is really a rough proof-of-concept signal; all the theories are listed as summaries with an example. The SFOM summary there was a piece of my theory written in 2025, focusing on key components like the subjective-vs-objective morality resolution, but it was by no means exhaustive, and it is currently outdated, as things have advanced substantially in the last year. So I will keep you posted as I begin to release more refined theory!
I would say there is some overlap with the EA theory and other theories before it, like moral naturalism, natural law, utilitarianism, deontology, etc., because I do think all of those theories contain kernels of truth / are special cases or subparts of a larger unified one. A lot of the theory should be compatible with past correct theory, but more cohesive and general, and yes, it should feel obviously true if it is obviously true.
One of the things I wanted to add to the benchmarking is moral scenarios, as you mention, so maybe you could write down which ones you think are worth testing and I'll add them to the future benchmarking system! :)
I feel bad that you feel you need more information to rate me accurately, because I agree! This submission is not a good reflection of the work/theory in a complete or updated sense, but I rushed it to participate because Defender wanted me to and I thought it would be a fun, low-stakes way to push myself forward. So I apologize if you feel your time was wasted, and I'd recommend ignoring me / waiting until I publish somewhere more fully and thoroughly.
Viewing post in Formalizing Morality (Neodore) jam comments
I don't at all find it to have been a waste of time; it was an interesting read. I just don't feel like I understand what you're selling yet. I think the EAs are ridiculous and, while well-intentioned, absolutely rife with practical failings akin to those that have created many hells in recent history. I would want to be able to read your actual theory, even a several-sentence summary, to understand what is being posited.