CIA

Code for training, evaluating and using a cross-lingual Auto Evaluator

Installation

Separate environments are required for training and inference because they depend on incompatible Torch versions.

Training Environment Setup

  1. Create and activate the training environment:

    conda create -n training python=3.10 && conda activate training
  2. Install NumPy (NumPy 2.x is incompatible, so pin version 1.26.4):

    pip install numpy==1.26.4
  3. Install PyTorch:

    conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1 -c pytorch -c nvidia
    • Install exactly torch v2.1.2.
    • Choose the CUDA build that matches your system.
    • For further instructions, refer to the official PyTorch installation guide.
  4. Clone and install the Alignment Handbook:

    git clone https://github.com/huggingface/alignment-handbook.git
    cd ./alignment-handbook/
    python -m pip install .
  5. Install Flash Attention 2:

    python -m pip install flash-attn --no-build-isolation
  6. Login to Hugging Face CLI:

    huggingface-cli login
  7. Install other useful libraries:

    pip install wandb huggingface-hub==0.24.7
  8. Install Git LFS to push models to the Hugging Face Hub:

    sudo apt-get install git-lfs
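The version pins above (torch 2.1.2, NumPy below 2.x) can be sanity-checked programmatically before launching a training run. A minimal sketch using only the standard library; the helper names are illustrative and not part of the repository:

```python
def meets_pin(installed: str, required: str) -> bool:
    """True if the installed version exactly matches the required pin."""
    # Strip any local build tag (e.g. "+cu121") before comparing.
    return installed.split("+")[0] == required

def below_major(installed: str, major: int) -> bool:
    """True if the installed major version is below the given major."""
    return int(installed.split(".")[0]) < major

# The training environment expects torch 2.1.2 and NumPy < 2:
print(meets_pin("2.1.2+cu121", "2.1.2"))  # True
print(below_major("1.26.4", 2))           # True
```

In practice the installed versions would come from `torch.__version__` and `numpy.__version__` inside the activated `training` environment.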

Inference Environment Setup

  1. Create and activate the inference environment:

    conda create -n inference python=3.10 && conda activate inference
  2. Install vLLM:

    pip install vllm
  3. Install datasets and transformers libraries:

    pip install datasets transformers
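With the inference environment active, a trained evaluator checkpoint can be served through vLLM. A hedged sketch of what an evaluation call might look like; the prompt template and checkpoint name below are illustrative assumptions, not the repository's actual format (see the evaluation scripts for the real templates):

```python
# Hypothetical prompt template for a cross-lingual auto-evaluator;
# the actual template used by CIA may differ.
def build_eval_prompt(instruction: str, response: str) -> str:
    return (
        "Evaluate the following response to the instruction.\n"
        f"Instruction: {instruction}\n"
        f"Response: {response}\n"
        "Feedback:"
    )

prompt = build_eval_prompt("Translate 'hello' to Hindi.", "नमस्ते")

# Inside the `inference` environment, vLLM can then generate judgments:
# from vllm import LLM, SamplingParams
# llm = LLM(model="<evaluator-checkpoint>")  # placeholder, not a real model id
# outputs = llm.generate([prompt], SamplingParams(temperature=0.0, max_tokens=256))
print(prompt.splitlines()[0])  # Evaluate the following response to the instruction.
```

Greedy decoding (temperature 0) is a common choice for auto-evaluation so that judgments are reproducible across runs.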

Citation

If you find this work helpful, please consider citing our paper!

BibTeX:

@article{doddapaneni2024crosslingual,
  title   = {Cross-Lingual Auto Evaluation for Assessing Multilingual LLMs},
  author  = {Sumanth Doddapaneni and Mohammed Safi Ur Rahman Khan and Dilip Venkatesh and Raj Dabre and Anoop Kunchukuttan and Mitesh M. Khapra},
  year    = {2024},
  journal = {arXiv preprint arXiv:2410.13394}
}