---
license: mit
language:
- en
metrics:
- perplexity
- accuracy
base_model:
- microsoft/Phi-3-mini-4k-instruct
library_name: transformers
tags:
- legal
---

# 🧑‍⚖️ vakil-phi3-mini-4k-instruct-finetuned

## 📌 Overview

This repository hosts a fine-tuned version of **microsoft/Phi-3-mini-4k-instruct (3.8B parameters)**, adapted specifically for **Indian legal knowledge tasks**. The model was instruction-tuned using **LoRA (Low-Rank Adaptation)** on curated datasets covering constitutional acts and statutory sections.

The objective of this fine-tuning was to enhance the model’s ability to deliver **accurate, contextual, and explainable outputs** for legal queries in the Indian domain.

---

## ⚙️ Training Details

- **Base Model:** microsoft/Phi-3-mini-4k-instruct
- **Fine-Tuning Method:** LoRA (parameter-efficient fine-tuning)
- **Domain Data:** Indian constitutional acts, statutory sections, and related legal texts
- **Training Infrastructure:** RunPod RTX A6000 GPU
- **Training Duration:** 18 hours, 2 epochs
- **Optimization Goal:** Reduce training loss and improve domain-specific accuracy

---

## 📊 Evaluation

- **Intrinsic Evaluation:**
  - Reduced perplexity compared to the base model
  - Improved accuracy on domain-specific test sets
- **Extrinsic Evaluation:**
  - Better parsing of statutes and structured legal outputs
  - Enhanced contextual reasoning in legal Q&A tasks
- **Qualitative Observations:**
  - More consistent responses when asked about constitutional provisions
  - Improved ability to generate structured JSON outputs for legal sections

---

## 🚀 Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ManjunathCode10x/vakil-phi3-mini-4k-instruct-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Explain Article 21 of the Indian Constitution:", return_tensors="pt")
outputs = model.generate(**inputs, max_length=512)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
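Because the base model is instruction-tuned, wrapping prompts in the chat template generally produces cleaner, better-structured answers than raw text completion. The snippet below is a minimal sketch that assumes the fine-tuned tokenizer retains the Phi-3 chat template; adjust the prompt and generation settings to your use case.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ManjunathCode10x/vakil-phi3-mini-4k-instruct-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Format the query with the chat template (assumes the Phi-3 template is present)
messages = [
    {"role": "user", "content": "Explain Article 21 of the Indian Constitution."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(input_ids, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```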
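As noted under Evaluation, the intrinsic metric is perplexity, which can be derived from the model's cross-entropy loss on held-out legal text. The sketch below illustrates that computation on a single placeholder sentence; the actual evaluation corpus used for this card is not published here.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ManjunathCode10x/vakil-phi3-mini-4k-instruct-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Placeholder evaluation text; replace with your own held-out legal corpus
text = "Article 21 guarantees the protection of life and personal liberty."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels makes the model return the mean token-level cross-entropy loss
    outputs = model(**inputs, labels=inputs["input_ids"])

# Perplexity is the exponential of the cross-entropy loss
perplexity = torch.exp(outputs.loss)
print(f"Perplexity: {perplexity.item():.2f}")
```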