Official implementation for the paper *Narrowing the Gap between Supervised and Unsupervised Sentence Representation Learning with Large Language Model*

BDBC-KG-NLP/NGCSE

Latest commit: 921ff50 · Dec 19, 2023

Environment

Run the following command to create the required conda environment:

conda env create -f environment.yml -n your_new_environment_name

Data

  • Training files:

    Run the following command in the data/ directory:

    bash download_hybrid.sh
  • Evaluation files:

    Run the following command in the SentEval/data/downstream/ directory:

    bash download_dataset.sh

tools.py shows how we hold out 10% of the training data and how some of the data augmentations are performed.
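For readers who want a feel for the hold-out step before opening tools.py, the following is a minimal sketch of a seeded 10% hold-out split. It is not taken from tools.py: the function name `hold_out_split`, its parameters, and the toy sentences are all hypothetical, and the repository's actual procedure may differ (for example in how it shuffles or which examples it selects).

```python
import random


def hold_out_split(examples, held_out_fraction=0.1, seed=0):
    """Shuffle a copy of `examples` and split off a held-out portion.

    Hypothetical helper for illustration only; not the code in tools.py.
    """
    rng = random.Random(seed)      # local RNG so the split is reproducible
    shuffled = list(examples)      # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_held_out = round(len(shuffled) * held_out_fraction)
    held_out = shuffled[:n_held_out]
    train = shuffled[n_held_out:]
    return train, held_out


if __name__ == "__main__":
    # Toy data standing in for the real training sentences.
    sentences = [f"sentence {i}" for i in range(100)]
    train, held_out = hold_out_split(sentences)
    print(len(train), len(held_out))
```

Fixing the seed keeps the held-out set identical across runs, so results from different training runs are compared on the same examples.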

Training

  • Train the final-performance model in the "Wiki.STS_HT" training setting:

    bash scripts/train_bert_wiki_sts.sh
  • Train the final-performance model in the "NLI.STS_HT" training setting:

    bash scripts/train_bert_nli_sts.sh

The scripts for the experiments that precede the Final Performance section are in scripts/data_domain.

Evaluation

bash scripts/evaluation.sh path_to_the_result

Plotting

plot.py shows how we plot the figures in the paper.
