Show HN: Prototype for internalized AI values using shame/pride mechanisms

2 months ago

I built a proof of concept that models emotional primitives (shame, pride, identity) as an alternative approach to AI alignment. Current approaches are mostly external (constitutional AI, RLHF). I kept wondering why humans don't do terrible things despite having the capacity, and the answer felt more like internalized values than external rules. Repo: https://github.com/renaissancebro/AGI-seed
