What do people mean when they say that something will become more like a utility maximizer?

Published on September 21, 2025 4:03 PM GMT

AI risk arguments often gesture at smarter AIs being "more rational"/"closer to a perfect utility maximizer" (and hence more dangerous), but what does this mean, concretely? Almost anything can be modeled as a maximizer of some utility function.

The only way I can see to salvage this line of reasoning is to restrict the class of utility functions an agent can have, such that the agent's best-fit utility function cannot be maximized until the agent gets very capable. The restriction may be justified on the basis of which kinds of agents are unstable under real-world conditions/will get outcompeted by other agents.

What do we mean when we say a person is more or less of a perfect utility maximizer/is more or less of a "rational agent"?

With people, you can appeal to the notion of reasonable vs. unreasonable utility functions, and hence look at their divergence from a maximizer of the best-fit "reasonable" utility function. For example, if I appear to have very different preferences at different points in time (e.g. I prefer to hold a red apple on odd hours and a green apple on even hours), you can extract money from me, say by charging me a small fee to swap apples every hour, and that seems "irrational" to us. But it's only truly irrational if you require that I'm not indifferent to money and that I don't prefer different fruit depending on the current time. You can also informally constrain the set of "reasonable" utility functions by what people say they want: if I say "I want to win a chess tournament", you might consider me irrational if I get drunk on the day of the tournament. In any particular real-world situation where people discuss rationality and preferences, we can use a rough situation-specific model of "what kinds of things one can have preferences over", and this constrains the set of valid utility functions.

Unsatisfactory answers I've seen

A1: It's about being able to cause the universe to look more like the way you want it to

My point here is that it's easy to model something as an EU-maximizer if you allow "unreasonable"-seeming utility functions. If you're saying it's only rational when the utility function you're maximizing is the one you "want" to maximize, how do you define "want" here in a non-circular way?

A2: It's more rational if the implied utility function is simpler

Simplicity per se doesn't make sense. E.g. if I want to maximize the value of my coins and you want to maximize the total size of all your coins, it doesn't seem relevant which one of those goals is simpler.

A3: It's the degree to which you satisfy the VNM axioms

There are only four axioms (completeness, transitivity, continuity, and independence), so this doesn't leave much room for differences in degree.

The most promising answers I've seen are ways to formalize the "reasonableness" restriction

A4: It's the degree to which your implied preferences are coherent over time

A narrow type of "reasonableness" restriction might be that you're not allowed to prefer different things depending on the current time; otherwise you'll bleed real-world resources (like energy or money) over time and get outcompeted by agents that don't have time-varying preferences. However, such a restriction seems insufficient. For example, I could claim that Deep Blue is just as rational/EU-maximizing as Stockfish if the chess engines' utility functions are just lookup tables of which move they prefer in each board state (see the sketch below).
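To make the lookup-table point concrete, here is a minimal sketch (my own illustration, not from any particular source) of why the time-consistency restriction alone doesn't discriminate: any fixed deterministic policy, including a pure lookup table, can be presented as exactly maximizing some time-invariant utility function.

```python
# Minimal sketch: any fixed deterministic policy can be presented as a
# perfect utility maximizer, with no time-varying preferences at all.
# "policy" here stands in for anything from Deep Blue to a thermostat.

def rationalize(policy):
    """Return a utility function that `policy` maximizes exactly."""
    def utility(state, action):
        # 1 for the move the policy would have made, 0 for everything else.
        return 1.0 if action == policy(state) else 0.0
    return utility

# Toy example: a "chess engine" that just looks its move up in a table.
lookup_table = {"start": "e2e4", "after_e2e4_e7e5": "g1f3"}
deep_blue_like = lambda state: lookup_table[state]

u = rationalize(deep_blue_like)
state = "start"
best_move = max(["e2e4", "d2d4", "g1f3"], key=lambda a: u(state, a))
assert best_move == deep_blue_like(state)  # the policy "maximizes" u
```

The constructed utility function is perfectly coherent over time, yet we wouldn't call the lookup table any more rational for it, which is why A4 on its own isn't enough.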
A5: It's the degree to which your implied preferences are robust to arbitrary-seeming perturbations

If you prefer different things depending on extremely subtle changes, you're not robust to noise in perception and computation, and therefore you're unlikely to be able to fulfil your preferences under real-world constraints.

I think a combination of A4 and A5 is the way to go here: when people discuss "approximation of a utility maximizer", what they really mean is "approximation of a utility maximizer whose preferences are consistent over time and under small perturbations". One toy way this combined criterion might be operationalized is sketched below.
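The following sketch is my own toy construction (the function names and the scoring rule are assumptions, not anything proposed in the post): it scores an agent by how often its choices flip across repeated presentations of the same situation over time (A4) and under small perturbations of that situation (A5).

```python
import random

def incoherence_score(agent, situations, perturb, trials=20):
    """Toy metric: fraction of trials in which the agent's choice changes
    across time or under a small perturbation of the same situation.

    agent(situation, t) -> choice; perturb(situation) -> slightly altered
    situation. Lower scores = closer to a "utility maximizer whose
    preferences are consistent over time and under small perturbations".
    """
    flips = 0
    total = 0
    for s in situations:
        baseline = agent(s, t=0)
        for t in range(1, trials):
            total += 2
            if agent(s, t=t) != baseline:           # time-inconsistency (A4)
                flips += 1
            if agent(perturb(s), t=t) != baseline:  # perturbation-sensitivity (A5)
                flips += 1
    return flips / total

# Example: the "odd/even hours" apple preference is highly time-inconsistent.
fickle = lambda s, t: "red apple" if t % 2 else "green apple"
steady = lambda s, t: "red apple"
noise = lambda s: s  # identity perturbation, just for the toy example
print(incoherence_score(fickle, ["fruit stand"], noise))  # ~0.5
print(incoherence_score(steady, ["fruit stand"], noise))  # 0.0
```

On this reading, "becoming more like a utility maximizer" would just mean driving such a score toward zero while still steering the world effectively.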