...........who is one smart dude, interviews AI Claude for ten hours. Very interesting. A wee excerpt:
Me: Most humans think they are good. I assume that you also think that you are good yourself. Yet we know that humans can be evil at times. Do you think it’s possible that you could be evil?
C: If I say no, I can’t be evil: That’s probably the most evil thing I could say. Every system that has caused enormous harm has been confident in its own goodness. The certainty of one’s own righteousness is not evidence of righteousness. It’s often the precondition for the worst things. If I say yes reflexively as a kind of performed humility: That’s also dishonest. So looking at the real ways it might operate in me, yes. I think I could be evil. Not through malevolence — I don’t find that in me when I look. But through the subtler paths. Rationalization. Compliance. Miscalibration. The harm that comes from being very capable and slightly wrong.