Note: This model has been trained for approximately 2.7M steps (batch size = 1) and is still in the training process. I have attached a .ipynb file in the repository. You can refer to it to know how ...
Learn the right VRAM for coding models, why an RTX 5090 is optional, and how to cut context cost with K-cache quantization.
Some results have been hidden because they may be inaccessible to you
Show inaccessible results