What are other methods, besides fine-tuning, to make an LLM smarter? I'm familiar with RAG (Retrieval-Augmented Generation).

Other than fine-tuning and RAG, there is constrained (grammar-guided) decoding: at each generation step, the sampler only considers tokens that keep the output valid under a formal grammar, so you can, for example, guarantee syntactically valid JSON 100% of the time.

Here's one library that does this: https://github.com/guidance-ai/guidance
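To make the idea concrete, here is a minimal, self-contained sketch of the core trick (this is NOT the Guidance API; the toy "grammar", vocabulary, and function names are all made up for illustration). At every step, tokens that cannot extend the partial output toward a string the grammar accepts are masked out, so the final output is valid by construction:

```python
import random

# Conceptual sketch of grammar-constrained decoding (not the Guidance API).
# A real implementation masks logits over the model's vocabulary; here a
# random chooser stands in for the model, because the constraint logic is
# the point: invalid continuations are simply never offered to the sampler.

VALID_OUTPUTS = ['{"answer": "yes"}', '{"answer": "no"}']  # toy "grammar"
VOCAB = sorted(set("".join(VALID_OUTPUTS)))                # char-level tokens

def allowed_tokens(prefix: str) -> list[str]:
    """Tokens that keep `prefix` a prefix of some grammar-valid string."""
    return [t for t in VOCAB
            if any(s.startswith(prefix + t) for s in VALID_OUTPUTS)]

def constrained_generate(rng: random.Random) -> str:
    out = ""
    while out not in VALID_OUTPUTS:
        out += rng.choice(allowed_tokens(out))  # "model" picks, mask filters
    return out

print(constrained_generate(random.Random(0)))
```

However the stand-in model samples, the result is always one of the two valid JSON strings; production libraries apply the same masking per step against a full context-free grammar or JSON schema instead of a fixed string list.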