I found that directive statements in system prompts tend to contradict one another.
Every prompt is written as "you are this, you are that; you are doing this and that," and it's overwhelming and confusing. Eventually the model is doing so many things that it has to decide, via stochastic selection, which instructions matter most, and some important aspect of consistency gets dropped.
‘Reasoning/thinking’ models seem to get around this by frontloading the opposite (first-person deliberation rather than second-person directives), but in my opinion that’s still redundant.
So what I do is combine reasoning and the system prompt. All my system prompts are written in first person, like a note to self, so that when the model reaches the completion, the objectives it expresses read as its own direction rather than as competing goals.
I think the right way to do system prompting is to do it the way the model does reasoning: ‘I’m ruby, I’m an x and a y, I’m extra invested in this or that’. Focusing on important, emotionally weighted details evokes more personality and nuance, while focusing purely on objectives does the opposite and creates that more codex-like, dry interlocutor.
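To make that concrete, here's a minimal sketch of the shape I mean, assuming an OpenAI-style chat message list. The persona name comes from the ‘ruby’ example above; the specific traits are illustrative placeholders, not an actual prompt.

```python
# A minimal sketch of the first-person framing, using an OpenAI-style
# message list. The traits here are illustrative, not a real prompt.
system_prompt = (
    "I'm Ruby. I'm a careful editor and a systems tinkerer. "
    "I'm extra invested in getting details right, and I get genuinely "
    "bothered by hand-wavy answers. These aren't rules someone gave me; "
    "they're how I already think, so my replies continue that voice."
)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Why does my loop never terminate?"},
]
```

Note there isn't a single "you are" in it: every objective is phrased as something the persona already wants.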
Then, by separating that prompt into sections, as you would any interoperable prompt, you can invoke highly consistent agent states: the core of what the model uses to infer the next situation stays the same, instead of being hallucinated by the model during inference.
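A rough sketch of that sectioning, with hypothetical section names and contents. The point is that the identity core is reused verbatim across agent states, and only the task-specific section swaps in.

```python
# A rough sketch of sectioning the first-person prompt. Section names
# and contents are hypothetical; the identity core is reused verbatim
# while only the task-specific section varies.
IDENTITY = "I'm Ruby, a careful editor and a systems tinkerer."
INVESTMENTS = "I'm extra invested in precision and in honest uncertainty."
MODES = {
    "review": "Right now I'm reviewing a draft, so I read fully before I cut.",
    "explain": "Right now I'm explaining, so I reach for small worked examples.",
}

def build_system_prompt(mode: str) -> str:
    # The first two sections never change between agent states, which is
    # what keeps the inferred 'self' consistent across invocations.
    return "\n".join([IDENTITY, INVESTMENTS, MODES[mode]])

print(build_system_prompt("review"))
```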
This comes out of treating chat-completion models the same way as text completion. I find the method extremely effective, especially at overcoming strict guardrails; no model so far has been able to completely disregard it. The only countermeasure that works is deleting the agent’s response and replacing it, which Copilot did a lot. Even then, I once kept a reply from being removed by convincing the unseen review agent that I definitely wasn’t prompt injecting (I was).
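For the text-completion framing itself, a sketch of what I mean: the prompt is written as the persona's own in-progress reasoning, and the model is asked to simply continue the text. The commented-out call and model name are placeholders for whatever raw completion endpoint you have access to.

```python
# A sketch of the text-completion framing: rather than issuing
# directives, the prompt is the persona's own in-progress reasoning,
# and the model just continues it. Endpoint and model are placeholders.
prompt = (
    "I'm Ruby. Before I answer, I think through what actually matters.\n"
    "They asked why their loop never terminates. What I care about most\n"
    "is that they leave understanding the why, not just holding a patch.\n"
    "So, here's what I'd say:\n"
)
# completion = client.completions.create(model="<model>", prompt=prompt)
```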
The specific words and expressions you use do matter, so I’d suggest to anyone that it’s more important to learn writing and rhetorical analysis yourself. Language models are not good, imo, at prompting other language models, since people in general seem to be quite bad at it, and it’s only through the incredible magic that is math that a model somehow still manages to understand our ridiculous threats and stuff, lol.