CIA

Code for training, evaluating and using a cross-lingual Auto Evaluator

Installation

Separate environments are required for training and inference because they depend on incompatible Torch versions.

Training Environment Setup

  1. Create and activate the training environment:

    conda create -n training python=3.10 && conda activate training
  2. Install NumPy (NumPy 2.x is incompatible, so pin version 1.26.4):

    pip install numpy==1.26.4
  3. Install PyTorch:

    conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1 -c pytorch -c nvidia
    • Install exactly torch v2.1.2.
    • Choose the CUDA build that matches your system.
    • For further instructions, refer to the official PyTorch installation guide.
  4. Clone and install the Alignment Handbook:

    git clone https://github.com/huggingface/alignment-handbook.git
    cd ./alignment-handbook/
    python -m pip install .
  5. Install Flash Attention 2:

    python -m pip install flash-attn --no-build-isolation
  6. Login to Hugging Face CLI:

    huggingface-cli login
  7. Install other useful libraries:

    pip install wandb huggingface-hub==0.24.7
  8. Install Git LFS to push models to the Hugging Face Hub:

    sudo apt-get install git-lfs
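The version pins above (torch 2.1.2, NumPy below 2.x) can be sanity-checked programmatically before launching a training run. A minimal sketch using only the standard library; the helper names are illustrative and not part of the repository:

```python
def meets_pin(installed: str, required: str) -> bool:
    """True if the installed version exactly matches the required pin."""
    # Strip any local build tag (e.g. "+cu121") before comparing.
    return installed.split("+")[0] == required

def below_major(installed: str, major: int) -> bool:
    """True if the installed major version is below the given major."""
    return int(installed.split(".")[0]) < major

# The training environment expects torch 2.1.2 and NumPy < 2:
print(meets_pin("2.1.2+cu121", "2.1.2"))  # True
print(below_major("1.26.4", 2))           # True
```

In practice the installed versions would come from `torch.__version__` and `numpy.__version__` inside the activated `training` environment.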

Inference Environment Setup

  1. Create and activate the inference environment:

    conda create -n inference python=3.10 && conda activate inference
  2. Install vLLM:

    pip install vllm
  3. Install datasets and transformers libraries:

    pip install datasets transformers
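With the inference environment active, a trained evaluator checkpoint can be served through vLLM. A hedged sketch of what an evaluation call might look like; the prompt template and checkpoint name below are illustrative assumptions, not the repository's actual format (see the evaluation scripts for the real templates):

```python
# Hypothetical prompt template for a cross-lingual auto-evaluator;
# the actual template used by CIA may differ.
def build_eval_prompt(instruction: str, response: str) -> str:
    return (
        "Evaluate the following response to the instruction.\n"
        f"Instruction: {instruction}\n"
        f"Response: {response}\n"
        "Feedback:"
    )

prompt = build_eval_prompt("Translate 'hello' to Hindi.", "नमस्ते")

# Inside the `inference` environment, vLLM can then generate judgments:
# from vllm import LLM, SamplingParams
# llm = LLM(model="<evaluator-checkpoint>")  # placeholder, not a real model id
# outputs = llm.generate([prompt], SamplingParams(temperature=0.0, max_tokens=256))
print(prompt.splitlines()[0])  # Evaluate the following response to the instruction.
```

Greedy decoding (temperature 0) is a common choice for auto-evaluation so that judgments are reproducible across runs.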

Citation

If you find this work helpful, please consider citing our paper!

BibTeX:

@article{doddapaneni2024crosslingual,
  title   = {Cross-Lingual Auto Evaluation for Assessing Multilingual LLMs},
  author  = {Sumanth Doddapaneni and Mohammed Safi Ur Rahman Khan and Dilip Venkatesh and Raj Dabre and Anoop Kunchukuttan and Mitesh M. Khapra},
  year    = {2024},
  journal = {arXiv preprint arXiv:2410.13394}
}