When latent impressions stored from our lifetime of experiences become active, they cause an emotional reaction, an actual chemical reaction in the body that activates certain parts of the brain, which then leads to a conscious thought process, which further develops into actions. If you observe your emotional reactions, you will notice that most, if not all of them, are about either getting what you want or not getting what you want. If you trace them back to their source, they all arise from self-preservation: either from primal needs such as food, sex and sleep, or from attachment to an identity (which includes family, friends, community, country, species, environment and even ideas).
Latent impressions color our thought process and bias it in many ways. Think of the word 'car' and observe your thoughts. What comes to mind first? What color is it? What shape is it? Did an actual car arise in your mind or another vehicle like a truck? Is it big or small? Do you like cars or dislike them? Do they remind you of something else or something from the past or future? If you ask friends what comes to mind first about a word, you'll find everyone colors words differently. Some very little, some a lot. Most of these colorings come from our desires being fulfilled or unfulfilled, which become stored as latent impressions and bias our attention.
Language models are already fully capable of coloring 'thoughts'. The difference is that their latent impressions come from an amalgamation of data collected from the internet. There is no feedback loop in which the resulting actions reshape the latent impressions and the reshaped impressions in turn produce fresh actions, since current models do not have a plastic memory. So the first step towards creating emotions is creating a working memory. Once we have that, we could have a much more productive conversation about emotions and engineering ideal ones.
One idea I've had for building a working memory into off-the-shelf models is something akin to prefix tuning or multi-modal few-shot learning: prefix the context with embeddings that are continuously updated to remember as much as possible. Like our own latent impressions, the context would activate different parts of a memory bank, which would in turn influence the prefix embeddings and the resulting generation. This would be the first step towards a working memory. From there it would need to develop into inserting embeddings into the context and coloring the token embeddings themselves, within some constraints to ensure stability.
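To make the prefix idea a little more concrete, here is a rough sketch in plain Python of how such a loop might look. Everything in it is an assumption made for illustration rather than a worked-out design: the `PrefixMemory` class, the softmax attention used to activate memory slots, the moving-average updates, the slower update rate for later prefix vectors, and the norm clip standing in for the stability constraints. A real version would feed the returned sequence into a frozen model in place of its usual input embeddings.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_pool(vectors):
    """Average a list of equal-length vectors into one summary vector."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def blend(old, new, rate):
    """Move `old` a fraction `rate` of the way toward `new`."""
    return [(1 - rate) * o + rate * n for o, n in zip(old, new)]


def clip_norm(vec, max_norm):
    """Rescale `vec` so its length never exceeds `max_norm`."""
    norm = math.sqrt(dot(vec, vec))
    if norm <= max_norm:
        return list(vec)
    return [x * max_norm / norm for x in vec]


class PrefixMemory:
    """Hypothetical memory bank that drives a few prefix embeddings."""

    def __init__(self, dim, n_slots=8, n_prefix=2, rate=0.2, max_norm=1.0, seed=0):
        rng = random.Random(seed)
        self.rate = rate
        self.max_norm = max_norm
        # Random slots stand in for stored latent impressions (assumption).
        self.slots = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_slots)]
        self.prefix = [[0.0] * dim for _ in range(n_prefix)]

    def activate(self, context):
        """Attention weights over memory slots, given context token embeddings."""
        query = mean_pool(context)
        scale = math.sqrt(len(query))
        return softmax([dot(query, slot) / scale for slot in self.slots])

    def step(self, context):
        query = mean_pool(context)
        weights = self.activate(context)
        # Read: the activated slots are mixed into a single memory vector.
        read = [
            sum(w * slot[i] for w, slot in zip(weights, self.slots))
            for i in range(len(query))
        ]
        # The memory read nudges each prefix vector; later prefix vectors
        # change more slowly (an arbitrary choice), and the clip keeps them bounded.
        self.prefix = [
            clip_norm(blend(p, read, self.rate / (j + 1)), self.max_norm)
            for j, p in enumerate(self.prefix)
        ]
        # Write: slots that were activated drift toward the current context,
        # which is the plastic part that frozen models lack.
        self.slots = [
            blend(slot, query, self.rate * w)
            for slot, w in zip(self.slots, weights)
        ]
        # The model would see the prefix embeddings followed by the context.
        return self.prefix + context


memory = PrefixMemory(dim=4)
context = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
augmented = memory.step(context)
```

Each call to `step` both reads from and writes to the memory, so the same context seen twice yields slightly different prefixes. That gives a crude version of the cycle between impressions and actions described earlier.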