Survey on Enhancing Dialogue Agent Alignment through MiniLLM with Targeted Human Assessments

Swapnil B. Mahajan*, Chandu D. Vaidya**, Bhojraj Lalit Narware***, Divya Rameshwar Yemde****, Harshal Sanju Meshram*****, Harsh Anil Sukhdeve******, Harpreet Kaur Anoop Singh*******
*-******* Department of Computer Science and Engineering, S. B. Jain Institute of Technology, Management and Research, Nagpur, Maharashtra, India.
Periodicity: January - June 2025
DOI : https://doi.org/10.26634/jaim.3.1.21243

Abstract

This paper presents the development of a compact and effective language model inspired by the LLaMA architecture, whose fundamental principles guided both the architectural decisions and the training methods. The study explores what can be achieved with limited resources: by leveraging open-source datasets and advanced training techniques, the work makes significant progress without relying on extensive computational power or proprietary data. Because of resource constraints, however, the model remains a work in progress, and those with access to greater computational capacity could build on this foundation to improve its performance. The investigation aims to encourage further contributions toward more robust and accessible language models. Key training parameters include context window size, number of layers, batch size, and model dimensions. The model is evaluated in terms of epoch count, execution time, number of model parameters, and validation loss.
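As a rough illustration of how the training parameters named above relate to model size, the following minimal sketch estimates the parameter count of a LLaMA-style decoder. It is not the authors' configuration: the class `MiniConfig`, the helper functions, and every numeric value are hypothetical, and the layout (bias-free attention projections, a three-matrix gated feed-forward block, RMSNorm weights, rotary positions with no learned parameters, untied input and output embeddings) is an assumption made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniConfig:
    """Hypothetical hyperparameters; none of these values come from the paper."""
    vocab_size: int = 32_000
    context_window: int = 256   # tokens per training sequence
    n_layers: int = 4           # number of decoder blocks
    d_model: int = 128          # model (embedding) dimension
    ffn_hidden: int = 512       # hidden width of the gated feed-forward block
    batch_size: int = 32        # sequences per optimizer step


def count_parameters(cfg: MiniConfig) -> int:
    """Estimate trainable parameters under the assumed LLaMA-style layout."""
    embed = cfg.vocab_size * cfg.d_model        # token embedding table
    attn = 4 * cfg.d_model * cfg.d_model        # Q, K, V, and output projections
    mlp = 3 * cfg.d_model * cfg.ffn_hidden      # gate, up, and down projections
    norms = 2 * cfg.d_model                     # two RMSNorm weight vectors per block
    per_layer = attn + mlp + norms
    final_norm = cfg.d_model                    # RMSNorm before the output head
    lm_head = cfg.vocab_size * cfg.d_model      # untied output projection
    return embed + cfg.n_layers * per_layer + final_norm + lm_head


def tokens_per_step(cfg: MiniConfig) -> int:
    """Tokens processed per optimizer step: batch size times context window."""
    return cfg.batch_size * cfg.context_window


if __name__ == "__main__":
    cfg = MiniConfig()
    print(f"parameters:      {count_parameters(cfg):,}")
    print(f"tokens per step: {tokens_per_step(cfg):,}")
```

With these illustrative defaults, the two embedding matrices account for most of the parameters, which is typical of very small models. Under this assumed layout, the context window affects the tokens processed per step but not the parameter count.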

Keywords

BERT, Machine Learning, LLM, NLP, Deep Learning.

How to Cite this Article?

Mahajan, S. B., Vaidya, C. D., Narware, B. L., Yemde, D. R., Meshram, H. S., Sukhdeve, H. A., and Singh, H. K. A. (2025). Survey on Enhancing Dialogue Agent Alignment through MiniLLM with Targeted Human Assessments. i-manager’s Journal on Artificial Intelligence & Machine Learning, 3(1), 11-25. https://doi.org/10.26634/jaim.3.1.21243
