LLMs suffer from various limitations, such as hallucination, a lack of robust reasoning, and outputs that are difficult to control. Our work includes multi-perspective LLM self-reflection to enhance QA accuracy, addressing the order sensitivity of in-context learning, mitigating the length bias of the Direct Preference Optimisation (DPO) algorithm, task embedding learning, prompt optimisation, and encouraging monosemanticity of LLM neurons.