So far, running LLMs has required substantial computing resources, mainly GPUs. Run locally on an average Mac, a simple prompt with a typical LLM takes ...
Here is how all the files should be arranged to make the app work. Due to the nature of the models and their size, all the models are available on Kaggle. Here is the link to my profile . Please find ...