jzbjyb/lm-calibration

LM Calibration

This repository contains code for the paper "How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering".

Install

Our code is based mainly on T5 and mesh-tensorflow and runs on TPUs. Please follow the original T5 repository to set up TPUs properly. To install the required packages, download T5 (version 0.6.4) and mesh-tensorflow (version 0.1.16) and copy their source files into the t5 and mesh_tensorflow folders. Do not replace files that already exist in these folders, because those are the files we modified for calibration purposes.

Fine-tune

Run the following command to fine-tune the UnifiedQA models with the softmax or margin objective function. $tpu specifies the name of the TPU, $model_output specifies the location where the fine-tuned model is saved, and $objective specifies the objective function to use.

./finetune.sh $tpu 3B $model_output $objective uq_clean_train_ol_mix train mc
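
To make the two objective names more concrete, here is a minimal sketch of generic softmax and margin losses computed over the log-probabilities of the candidate answers to one question. This is an illustration only, not the implementation in the modified t5 and mesh_tensorflow folders; the function names, the `margin` default, and the plain-list inputs are all assumptions made for the example.

```python
import math


def softmax_loss(candidate_log_probs, gold_index):
    """Negative log of the gold candidate's probability after
    normalizing over all candidates (hypothetical helper)."""
    m = max(candidate_log_probs)
    # Log-sum-exp, shifted by the max for numerical stability.
    log_z = m + math.log(sum(math.exp(s - m) for s in candidate_log_probs))
    return log_z - candidate_log_probs[gold_index]


def margin_loss(candidate_log_probs, gold_index, margin=1.0):
    """Hinge penalty for every non-gold candidate that scores within
    `margin` of the gold candidate (hypothetical helper; the margin
    value is an assumption, not taken from the repository)."""
    gold = candidate_log_probs[gold_index]
    return sum(
        max(0.0, margin + s - gold)
        for i, s in enumerate(candidate_log_probs)
        if i != gold_index
    )
```

Both functions take the same inputs, so they can be swapped for one another in the same way that `$objective` switches between the two objectives in the command above.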

Evaluate candidate answers

Run the following command to evaluate the probabilities of candidate answers. $score_output specifies the location where the output is saved, and 1103000 specifies the checkpoint to use.

./score.sh $tpu $score_output $model_output 1103000 uq_clean_test dev
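
As a rough illustration of what "probabilities of candidate answers" can mean, the sketch below normalizes per-candidate log-likelihood scores into a distribution over the candidates for one question. It assumes the scores are log-likelihoods held in a plain list; the actual output format of score.sh is not described here, and `normalize_candidate_scores` is a hypothetical helper, not a function from this repository.

```python
import math


def normalize_candidate_scores(log_scores):
    """Turn per-candidate log-likelihoods into probabilities that
    sum to 1 over the candidate set (hypothetical helper)."""
    if not log_scores:
        raise ValueError("need at least one candidate score")
    m = max(log_scores)
    # Subtract the max before exponentiating to avoid underflow.
    exps = [math.exp(s - m) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]


# Example: three candidates, the first is the most likely.
probs = normalize_candidate_scores([-1.0, -2.0, -3.0])
predicted_index = probs.index(max(probs))
```

Under this assumption, the model's confidence for a question is the probability of its top-ranked candidate.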

Compute ECE

Run the following command to compute the expected calibration error (ECE) given the probabilities of candidate answers.

python cal.py --mix uq_clean_test --split dev --score $score_output
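
For readers unfamiliar with the metric, here is a minimal sketch of the standard binned ECE computation: predictions are grouped into equal-width confidence bins, and ECE is the weighted average of the gap between accuracy and mean confidence in each bin. It is not taken from cal.py; the function name, the default of 10 bins, and the list-based inputs are assumptions for illustration.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """Binned ECE with equal-width bins over [0, 1] (hypothetical helper).

    confidences: probability of the predicted answer for each question.
    correct: whether each predicted answer was right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have equal length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    conf_sum = [0.0] * num_bins
    acc_sum = [0.0] * num_bins
    count = [0] * num_bins
    for p, c in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(p * num_bins), num_bins - 1)
        conf_sum[idx] += p
        acc_sum[idx] += 1.0 if c else 0.0
        count[idx] += 1

    ece = 0.0
    for b in range(num_bins):
        if count[b] == 0:
            continue  # Empty bins contribute nothing.
        avg_conf = conf_sum[b] / count[b]
        avg_acc = acc_sum[b] / count[b]
        ece += (count[b] / n) * abs(avg_acc - avg_conf)
    return ece
```

A lower value means the model's confidence tracks its accuracy more closely; a value of 0 means that in every non-empty bin the mean confidence equals the accuracy.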
