Awesome! You're the first person to test a theory that beats SFOM in any training run since 2025!
I think your caveats are fair, but it's still interesting that SFOM beats DT in 2/3 runs and beats its total average too, despite being only a couple of sentences versus the far more fleshed-out DT submission! (In the future automated and expanded system with more theories, it will be interesting to see how these all fare!)
I had overlooked DT, and I see a lot of good convergence between it, the more advanced SFOM from that time, and the as-yet-unnamed theory I've fleshed out since then, which shares much with decision theory and neo-expected-utility frameworks. I like that it tries to reduce things to one axiom. I'll unpack it when I have more time, but it's an interesting and historically notable signal!
I love seeing others already pushing back on the system itself and testing it! And I love your idea of testing other world religions too! I think the expanded system should essentially catalogue all philosophies, cultures, religions, etc., break them down into claims, beliefs, and arguments to show their overlap, benchmark and test them at the atomic level as well as overall, recombine them, let them pick from each other's learnings, and test them at the fullest resolution possible.
It can't be summarized into a sentence in a way that closes the inference gap, but something along these lines: formalizing axiology means making claims about what is good/bad and valuable/invaluable objective relative to the frame of reference of the invariant properties of all subjects. For example, pain is painful tautologically; contrary to what some people think when they say pain can be pleasurable, they are confusing the net direction of a vector sum of two components, where the pain magnitude is smaller than the pleasure one. Then, because these claims will be obviously true, as anything true is, acceptance will scale with intelligence. And as we improve this measurable, quantitative theory grounded in instantiated subjects, and thus in physical, measurable dimensions, we will open a new field that allows systematic progress on ethical questions, as in any scientific field; it will also spread in a low-friction feedback loop just as scientific and mathematical truths have across cultures. In particular, the objective-vs-subjective debate (moral non-realism, relativism, etc.) gets closed the same way Berkeley/Einstein resolved the absolute-vs-relative time debate: it's not one or the other, but frame-relative yet still universally describable (and therefore predictable and explainable given correct measurement and frame). (And that's just one piece of many.) I will of course flesh all of this out in further publications before expecting this pitch to be any good. I feel bad for posting ahead of schedule, but alas, Defender is also good at pushing things forward XD.
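To make the vector-sum framing concrete, here's a toy sketch; the function name and the single signed valence axis are my own illustrative assumptions, not part of the theory:

```python
# Toy model of the claim that "pleasurable pain" is the net of two
# component magnitudes, rather than pain itself being positive.
# The 1-D valence axis and names here are illustrative assumptions.

def net_valence(pain_magnitude: float, pleasure_magnitude: float) -> float:
    """Signed sum on a single valence axis: pain negative, pleasure positive."""
    return pleasure_magnitude - pain_magnitude

# A "pleasurable" spicy meal: a pain component is present but outweighed.
assert net_valence(pain_magnitude=2.0, pleasure_magnitude=5.0) > 0
# Pain alone is never positively valenced in this framing.
assert net_valence(pain_magnitude=2.0, pleasure_magnitude=0.0) < 0
```

The point of the decomposition is that reports of "pleasurable pain" describe the net direction, not the sign of the pain component itself.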
I think EAs got many things wrong (valuing the lives of people who don't yet exist above current ones, focusing on earning to give instead of doing, etc.), but they are generally better in many of their predictions (i.e. AI) than most, which is why they're so well funded, and they're getting a bad rep because of a few bad actors (FTX et al.), which is unfair. The truth is they're approximately more accurate than many other groups and trying harder, so we can debug them instead of discarding them, imo. wdyt?
Also, in response to your sycophancy worries: there is more than one non-classic theory invented by someone on the list (the other is SKL), and another was tested at a hackathon last year; neither beat SFOM. Also, if you check megmind's comment, you'll see they tested it against a theory of a friend of theirs, and their analysis is interesting! I think sycophancy can be reduced as a factor even further by including more original, out-of-training-data theories (and a future automated version of this will include many more!) :)
The cleanest and most concise pitch I've read up to now! The website works and shows a lot of good work.
However, you came here for critical feedback that might be useful, so:
1. In the pitch you claim your differentiator is that "none claim the same level of predictive validation," but nowhere do you explain how it works. I understand a lot of it is your patent-pending trade secret, which is understandable, but then you should lead with some other form of proof, such as backtesting results; otherwise we have to take it all on faith. For some reason the free trial of the site only shows me data and predictions up to 2018... it would be interesting to see which of those predictions came true, and what it missed. If you haven't done this analysis, go back and do it! It would be a great proof of concept for investors, patrons, and users.
2. I think a better model for the free trial is to show more information, so that it's useful to investors, but cap how many searches/questions can be asked per day or month. This gives quick value to potential customers while immediately telling them they need to pay for constant use. (This is my first idea and it might not be right, but I think the current system of only showing data up to 2018 is not satisfying or useful to most people.)
3. I love the intended aim: to help identify the frontier of innovation and where to invest to accelerate it! I hope your mission succeeds; the EV is astronomical if you crack this! So another suggestion: consider changing the interface. A 2D visual compression of high-dimensional data looks "nice" but isn't useful for most power users (which is what paying investors would be); you need to turn this into a Bloomberg terminal for research and predictions. I want more data-driven information density, graphs, or other ways of surfacing the information faster and more usefully.
Love the mission!
My honest feedback is that I don't believe in astrology from a scientific point of view, so I'm not sure this would be fruitful, and I'm not the target audience either. However, if you are sincere in this effort, I would ask you to apply the scientific method: what is the hypothesis, and what would falsify or prove it? If true, what would it tell us? If you can answer these questions coherently, that would be interesting and would update my views! If you can test them and show us your results, that would also teach you or others or both something, so I encourage you to follow this process and report back!
I agree with others that you should make a concrete ask, but for scientists to want to fund your reading of astrological material, you'd first need to convince them by speaking in logic and arguments they can be persuaded by (so I encourage you to try, or change course)! :) Science is all about questioning and changing your mind or that of others. I love when I change my mind, and if you can do that to me, that would be a great litmus test.
My honest prediction, though, is that it's unlikely (almost certain not to, given my world model) to be true or to persuade me; but do try if you genuinely believe in this. I'm always open to the truth.
I think astrology can be useful/fun as introspective pattern-matching on archetype clusters, or as noise/surprise injection, but I'm not sure about the line of inquiry you're pursuing.
I will be honest: this domain of research is outside my zone of expertise, so it's hard for me to fully judge it. However, I will try to give useful critical feedback:
1) If you're going to pitch a record album/company alongside this, integrate it into the long-term vision and make that the focus of the pitch, with the current hardware measurement tooling and research as the first steps toward it. Otherwise, cut it and make separate pitches, because it makes the pitch feel unfocused and erratic.
2) I think the structure and detail of the pitch are great, but it reads too much as AI-generated, which raises red flags for the reader: if the text is AI-style slop, why wouldn't the ideas be slop too? (I'm not saying they are, but you should write it in your own words for people to take it seriously.) It's ok to be concise!
3) I do think that the level of work shown in your substack and github are good indicators!
4) Tie all these things together more clearly. What would be enabled by your theoretical breakthroughs and by this new measurement tool? Not just what the failure cases for the test are, but why they are significant, what they would say about the theory, and, if correct, what they could enable. Utility, meaning, reason: all of these would make the pitch clearer and more coherent. By the way, I do think measuring consciousness would be very useful, but it's not coming across well here. Part of the reason is that there's too much information; good pitches, and good research, tend to be narrowly scoped, explain things in detail, and then build them together.
5) Question: who is Seraphina AI?
6) Question: what was your research process? I see a lot of domain-famous authors and terms cited and combined, and I wonder how you got there (from one scientist to another!). It's a very wide synthesis, with a major biophysics focus.
7) A thought: why build new hardware if the existing geometric theoretical results were achieved with current hardware? Before spending $20k+ on building hardware, can't you run your software/theoretical tests on more biophoton etc. datasets? If you did this and the results were corroborated, it would make the pitch for the hardware step stronger than parallelizing it would.
Given the use of LLMs, I did use them a little to assist in parsing your pitch. You might find their pushback/questions interesting (I asked for specific claim feedback from a scientific lens; these are highlights from some of the answers to my questions):
A) Twenty models total, roughly five per condition across four conditions, is a very small sample. With sample sizes that small, Cohen's d can be inflated dramatically — a d of 2.73 from five observations per group is not the same evidential weight as d = 2.73 from fifty per group. The permutation test helps (it's non-parametric, which is appropriate for small samples), but p = 0.0048 with this few observations should be treated as "interesting and worth replicating" rather than "confirmed." The pitch treats it closer to the latter.
B) (I agree with this one, but it might be a limitation of a concise pitch deck... although a clearer explanation of the mechanism, connections, and utility is important): The scope-to-evidence ratio is concerning. One statistical result on one dataset of plant-tissue biophotons under four chemical conditions is being asked to support a theoretical framework that spans consciousness, quantum biology, microtubule dynamics, bioelectric fields, geometric algebra, and the eventual instantiation of consciousness in non-biological hardware. Narrow claims supported by extensive evidence are easier to evaluate; the scope can widen gradually as evidence accumulates.
C) Run the analysis on more datasets, with more compounds with known mechanisms, and see whether the geometric signatures sort by coupling mechanism rather than just by chemical identity. If they do, the coupling interpretation strengthens enormously. If they sort by chemical identity regardless of mechanism, you have a useful classification tool but not evidence for the theoretical framework.
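On point (A), the instability of Cohen's d at n = 5 per group is easy to see by simulation. A stdlib-only sketch (function names and the true effect size of 1.0 are my own arbitrary assumptions for illustration):

```python
# Simulates the point in (A): with n=5 per group, the observed Cohen's d
# scatters far more widely around the true effect than with n=50, so a
# huge d from tiny samples carries much less evidential weight.
import random
import statistics

def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a) +
                  (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_var ** 0.5

def d_spread(n_per_group, true_d=1.0, trials=2000, seed=0):
    """Standard deviation of observed d across many simulated experiments."""
    rng = random.Random(seed)
    ds = []
    for _ in range(trials):
        a = [rng.gauss(true_d, 1.0) for _ in range(n_per_group)]  # treatment
        b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]     # control
        ds.append(cohens_d(a, b))
    return statistics.stdev(ds)

# Observed d is several times noisier at n=5 than at n=50
# for the same true effect.
assert d_spread(5) > 2 * d_spread(50)
```

This is why "d = 2.73, p = 0.0048" from five observations per condition is best read as "worth replicating" rather than "confirmed."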
What I love: "We are enabling communities to be a productive force, instead of endless discourse".
The goal, if successful, would turn the mode of social media from consumption to action, from an energy sink into a source of positive change. I think you've thought a lot about this, and it shows, from the screenshot of the site to the whole philosophy... but here comes the constructive criticism:
It reads like the setup to a pitch more than a pitch in many ways. You write "Virts = group identity + social feed + coordination" as the synopsis and then explain a lot of the philosophy of what Virts should be... but I don't think you explain the key aspects that would sell the pitch to investors, and importantly to USERS: what is the mechanism for coordination? For it being productive rather than endless discourse? These questions remain unanswered, yet I think they are the core of what makes Virts an important and interesting project. Otherwise it comes off as another experimental-UI social network: a mix of an Obsidian graph, a social media feed, and a forum. Maybe that's it, but that wouldn't seem like a 10-100x improvement on Reddit, Stack Exchange forums, or Discord servers, where much of this functionality already exists and is already in use.
I guess showing the differentiation between these more concretely would make the pitch 10x stronger.
Also out of curiosity I'd like to know which groups you think would be ready to plug in!
I think the goal of the project is epic and I'd love to see it materialize, but I want your iron to sharpen on the iron of my asking for clearer mechanisms and systems. To be fair, you do say you're in the research phase, so if you haven't arrived at that stage yet, maybe you could lay out some research questions and lines of inquiry and pitch those as the intermediary step?
I also think that for the funding it would be useful to understand the time frame, i.e. is $10k enough for 1-3 months? Maybe give a scope? Because if it's longer, I think you're underselling your needs, and the mission would deserve more funding.
Specifically, a plugin/mod for Obsidian or other textual note systems/sites might be a good way to bootstrap, by lowering adoption friction in a high-switching-cost environment. Maybe a browser extension that pings the user in these loops and can embed itself in, or extract from, social media too.
I think the idea of exploring textual UI/UX through spatiotemporal dimensions and almost tosafotic tree-ring formats is still under-explored and could be quite promising!
As for critical feedback, I'm not sure the proof-of-concept program sells the vision; i.e. I'm not sure I would use it personally until some clearly fun/useful mechanism was shown. Until then it comes across as potentially interesting, but not clearly so, especially given we live in a world with many textual interfaces: threads (Reddit, Twitter, etc.), personal note-taking systems like Obsidian, and so on, which also let us revisit old posts and connect them usefully to new ones. I personally leave this pitch not knowing how I would use the first proposal/program, or to what use or effect. However, I can imagine the second, long-term pitch, Pokemon Go for text, being very interesting! (This also suggests a different short-term in-road might better bootstrap the long-term vision.)
I'd advise Sunrise to work on a better sell of the system before expecting others to collaborate on it, although Pokemon Go for text/thought is quite compelling, so that might be enough of a pitch!
For an MVP, maybe make it a plugin for Obsidian to leverage the open-source code and community behind it and to maximize the chance of faster and larger adoption, which would help the stated goal of helping people accelerate the win condition. (Just an example; I am a biased Obsidian user, and it might be possible to quickly spin up a better solo version using LLMs plus a dedicated team.)
So actually, there is another one that isn't mine and wasn't in the training data, because it was a friend's own personal theory (SKL); theirs also came up higher on average, but still didn't come out on top. Someone at the hackathon also wrote down a sophisticated version of their own moral theory and ran it; it didn't break the top 7, and SFOM still came out on top.
So yes, I think you make a great point, and that's one of the things that should be added to a more sophisticated version of the benchmark ranking system! (In fact I wanted to automate it and create a site so people could add their own theories more easily, making the ranking and research loop continuous.) The two friends' theories ranking above the top 50% suggests you might be right that the AIs have a sycophantic bias toward out-of-training-data theories. On the other hand, I thought they might instead be biased toward theories in their training data, since they have far more information on those than on the personal summaries of out-of-training-data theories (like my own, the friends', and the hackathon submission), so it's not cut and dried which way the system would be biased.
I did discuss the limitations of this current rough version of the benchmark on the site, but I'd add all your concerns and more to it. So maybe instead of 2-3 non-training-data theories, we will have many more, to reduce sycophancy. Still, it is interesting to note, as a counter to your point, that among the 3 non-training-data theories, SFOM still ended up on top. Though imperfect, I think it's a non-trivial signal that this happened consistently.
I do think the criteria are also not the strongest. They were LLM-generated at the time (to reduce my own bias) by asking the LLMs to come up with a thorough list of criteria for assessing and ranking moral theories. Many of them could be cut, merged, or refined, and I want to add quantitative criteria and more. With funding I'd also try to get expert philosophers to contribute and write their own criteria. Plus, the automated website version would allow anyone to critique and submit their own criteria.
I think you're making a good point about the circularity of some of the criteria, but that's a whole other can of ouroboros to discuss in more detail!
Once again, as stated in the doc and on the site, the benchmark is really a rough proof-of-concept signal; all the theories are listed as summaries with an example. The SFOM summary there was a piece of my theory written in 2025, focusing on key components like the subjective-vs-objective morality resolution, but it was by no means exhaustive, and it is currently outdated, as things have advanced substantially in the last year. So I will keep you posted as I begin to release more refined theory!
I would say there is some overlap with the EA theory and earlier theories like moral naturalism, natural law, utilitarianism, deontology, etc., because I do think all of those contain kernels of truth / are special cases or subparts of a larger unified one. A lot of the theory should be compatible with past correct theory, but more cohesive and general, and yes, it should feel obviously true if it is obviously true.
One of the things I wanted to add to the benchmarking is moral scenarios, as you mention, so maybe you could write down which ones you think are worth testing and I'll add them to the future benchmarking system! :)
I feel bad that you feel you need more information to rate me accurately, because I agree! This submission is not a good reflection of the work/theory in a complete or updated sense, but I rushed it to participate because Defender wanted me to, and I thought it would be a fun, low-stakes way to push myself forward. So I apologize if you feel your time was wasted; I'd recommend ignoring me / waiting until I publish somewhere more fully and thoroughly.
1. Totally fair criticism! In fact, I think the pitch has much to improve upon.
2. I put that there because most of my R&D has been monk-mode stealth, so sadly I didn't have much to show publicly, and explaining it would've taken up more space in the 3 pages while I was running out of time. I also would rate my pitch very low in its current state.
3. I do think, however, that in the absence of other evidence, showing consistent and thorough persistence and engagement is a signal. It's obviously not as strong as pure outcomes, and not a guarantee, but a lot of deep research discoveries and inventions come from people who spend a lot of time focusing on the problem from many angles before cracking it. It's also proof that the person isn't doing it for the money. So there is some signal, but I agree with you it's not enough! The effort volume isn't meant to convince people of the idea, but of the dedication to it: that it's not some side hobby or fleeting idea I'll drop at any moment.
Draft braindump Outline:
Goal: Continuously maximize the probability of following the space of optimal axiological trajectories for as many caring beings as possible.
Short term
Identified bottleneck we can solve: morality is still pre-formal, and we can speed up the axiological tech tree / formalization using modern science's theoretical trees, the internet hive mind, and ASI.
Short term how: reverse-engineer the measurable boundaries and constraints of good futures that are computationally irreducible to predict in instantiation but not in specification. I.e.: formalize morality (by formalizing axiology) to make it computable, measurable, and systematically iterable, essentially causing a phase transition in the crystallization of pre-scientific ethics, which will in turn cause a phase transition in how all caring patterns coordinate, by giving them the theory and technology to measure, predict, and guide the best axiological trajectories for themselves and their surrounding networks, recursively and exponentially: i.e. the Mettasplosion (a positive feedback loop of mettalignment from theory to caring beings).
Results: the Mettasplosion continues to advance and spread measurably (more people building on the theory, implementing it, spreading it, etc.), with measurable positive axiological changes in their trajectories (and in those surrounding them).
Deadline / failure metric: before exponential tech-led moral-error amplification causes x-doom (through technological warfare, rogue ASI, techno-dictatorship, etc.), the theory should be sufficiently developed (and hopefully proved) to persuade caring systems of sufficient and arbitrary levels of capability/intelligence, including any ASI no matter how intelligent, and of course any humans, governments, corporations, etc. Furthermore, we want it spread and implemented widely enough (we are the data in the limit) to positively bias the training data of all caring beings (including ASIs) ASAP. Failure is x-risk doom, or simply failing to minimize suffering/conflict as much as possible. Failure would also be having no part in accelerating the advent of this axiological phase transition, or in fact hindering it.
Weak success metric: if we fail overall, but on net accelerate the advent of axiological formalization / improvements in the axiological trajectories of caring beings, then we will still have succeeded.