LLM Settings

When working with prompts, you interact with the LLM directly or through an API. You can configure a few parameters to get different results from your prompts.

Temperature - In short, the lower the temperature, the more deterministic the results, in the sense that the most probable next token is always chosen. Increasing the temperature leads to more randomness, which encourages more diverse or creative outputs; you are essentially increasing the weight given to the other possible tokens. In terms of application, you might use a lower temperature value for tasks like fact-based question answering, to encourage more factual and concise responses. For generating poems or other creative tasks, it may be beneficial to increase the temperature value.
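To make the effect concrete, here is a minimal sketch (not how any particular LLM implements it internally) of how temperature rescales a set of hypothetical next-token logits before the softmax: low values concentrate probability on the top token, high values spread it across the alternatives.

```python
import numpy as np

def apply_temperature(logits, temperature):
    """Turn raw logits into a probability distribution, scaled by temperature."""
    scaled = np.array(logits) / temperature
    # Softmax: subtract the max for numerical stability before exponentiating.
    exp = np.exp(scaled - np.max(scaled))
    return exp / exp.sum()

# Hypothetical logits for four candidate next tokens.
logits = [4.0, 2.5, 1.0, 0.5]

print(apply_temperature(logits, temperature=0.2))  # sharply peaked: the top token dominates
print(apply_temperature(logits, temperature=1.5))  # flatter: other tokens gain weight
```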

Top_p - Similarly, top_p, a sampling technique used together with temperature called nucleus sampling, lets you control how deterministic the model is when generating a response. The model only samples from the smallest set of tokens whose combined probability reaches top_p. If you are looking for exact and factual responses, keep this value low. If you are looking for more diverse responses, increase it to a higher value.
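The sketch below, using made-up probabilities for five candidate tokens, shows the core idea of nucleus sampling: keep only the top tokens whose cumulative probability covers top_p, renormalize over that set, and sample from it.

```python
import numpy as np

def nucleus_sample(probs, top_p, rng):
    """Sample a token index from the smallest set of tokens covering top_p probability mass."""
    probs = np.array(probs)
    order = np.argsort(probs)[::-1]                   # token indices, most probable first
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, top_p) + 1   # keep just enough tokens to cover top_p
    kept = order[:cutoff]
    kept_probs = probs[kept] / probs[kept].sum()      # renormalize over the nucleus
    return rng.choice(kept, p=kept_probs)

rng = np.random.default_rng(0)
probs = [0.55, 0.25, 0.12, 0.05, 0.03]                # hypothetical next-token probabilities
print(nucleus_sample(probs, top_p=0.5, rng=rng))      # only the top token qualifies: deterministic
print(nucleus_sample(probs, top_p=0.95, rng=rng))     # larger nucleus: more diverse outputs
```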

The general recommendation is to alter temperature or top_p, but not both.
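As a sketch of how these settings are passed in practice, the snippet below assumes the OpenAI Python SDK; the model name and prompt are placeholders, and other providers expose similarly named parameters. Only temperature is set here, leaving top_p at its default, in line with the recommendation above.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Write a haiku about autumn."}],
    temperature=1.2,      # raised for a creative task; top_p left at its default
)
print(response.choices[0].message.content)
```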

Before getting started with some basic examples, remember that your results may vary depending on the version of the LLM you are using.