
BERT(S) for Relation Extraction

Overview

A PyTorch implementation of the models for the paper "Matching the Blanks: Distributional Similarity for Relation Learning" published in ACL 2019.
Note: This is not an official repo for the paper.
Additional models for relation extraction (ALBERT- and BioBERT-based variants) are implemented here based on the paper's methodology.

For more conceptual details on the implementation, please see https://towardsdatascience.com/bert-s-for-relation-extraction-in-nlp-2c7c3ab487c4

If you like my work, please consider sponsoring by clicking the sponsor button at the top.

Requirements

Python 3.8+. Install the dependencies with:

python3 -m pip install -r requirements.txt
python3 -m spacy download en_core_web_lg

Pre-trained BERT and ALBERT models courtesy of HuggingFace (https://huggingface.co)
Pre-trained BioBERT model courtesy of https://github.com/dmis-lab/biobert

To use BioBERT (biobert_v1.1_pubmed), download & unzip the model from here into the ./additional_models folder.

Training by matching the blanks (BERTEM + MTB)

Run main_pretraining.py with the arguments below. The pre-training data can be any continuous-text .txt file.
We use spaCy to extract pairwise entities (within a window of 40 tokens) from the text to form relation statements for pre-training. Entity recognition is based on NER and dependency-tree parsing of subjects/objects.
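
As a rough illustration of this step, here is a minimal sketch that pairs spaCy-detected entities falling within a 40-token window. It is illustrative only and simplified (the helper name pairwise_entities is hypothetical, and the repository's actual preprocessing also uses dependency parsing, which is omitted here):

# Illustrative sketch only -- not the repository's actual preprocessing code.
# Pair named entities that co-occur within a 40-token window to form
# candidate relation statements.
import spacy

nlp = spacy.load("en_core_web_lg")

def pairwise_entities(text, window=40):
    doc = nlp(text)
    ents = list(doc.ents)
    pairs = []
    for i, e1 in enumerate(ents):
        for e2 in ents[i + 1:]:
            # keep only pairs whose combined span fits inside the token window
            if e2.end - e1.start <= window:
                pairs.append((e1.text, e2.text, doc[e1.start:e2.end].text))
    return pairs

print(pairwise_entities("Barack Obama was born in Hawaii and later moved to Chicago."))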

The pre-training data I used, taken from the CNN dataset (cnn.txt), can be downloaded here.
Download and save it as ./data/cnn.txt.
Note, however, that the paper uses Wikipedia dump data for MTB pre-training, which is much larger than the CNN dataset.

Note: Pre-training can take a long time, depending on the available GPU. It is possible to fine-tune directly on the relation-extraction task and still get reasonable results, following the section below.

main_pretraining.py [-h] 
	[--pretrain_data TRAIN_PATH] 
	[--batch_size BATCH_SIZE]
	[--freeze FREEZE]  
	[--gradient_acc_steps GRADIENT_ACC_STEPS]
	[--max_norm MAX_NORM]
	[--fp16 FP_16]  
	[--num_epochs NUM_EPOCHS]
	[--lr LR]
	[--model_no MODEL_NO (0: BERT ; 1: ALBERT ; 2: BioBERT)]  
	[--model_size MODEL_SIZE (BERT: 'bert-base-uncased', 'bert-large-uncased';   
				ALBERT: 'albert-base-v2', 'albert-large-v2';   
				BioBERT: 'bert-base-uncased' (biobert_v1.1_pubmed))]
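
For example, a pre-training run on the CNN data with BERT-base might look like the following (the hyperparameter values are illustrative, not necessarily the repository defaults):

python3 main_pretraining.py --pretrain_data ./data/cnn.txt --model_no 0 \
	--model_size bert-base-uncased --batch_size 16 --num_epochs 5 --lr 0.0001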

Fine-tuning on SemEval2010 Task 8 (BERTEM/BERTEM + MTB)

Run main_task.py with the arguments below. This requires the SemEval2010 Task 8 dataset, available here. Download & unzip it to the ./data/ folder.

main_task.py [-h] 
	[--train_data TRAIN_DATA]
	[--test_data TEST_DATA]
	[--use_pretrained_blanks USE_PRETRAINED_BLANKS]
	[--num_classes NUM_CLASSES] 
	[--batch_size BATCH_SIZE]
	[--gradient_acc_steps GRADIENT_ACC_STEPS]
	[--max_norm MAX_NORM]
	[--fp16 FP_16]  
	[--num_epochs NUM_EPOCHS]
	[--lr LR]
	[--model_no MODEL_NO (0: BERT ; 1: ALBERT ; 2: BioBERT)]  
	[--model_size MODEL_SIZE (BERT: 'bert-base-uncased', 'bert-large-uncased';   
				ALBERT: 'albert-base-v2', 'albert-large-v2';   
				BioBERT: 'bert-base-uncased' (biobert_v1.1_pubmed))]    
	[--train TRAIN]
	[--infer INFER]
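
For example, to fine-tune BERT-base on SemEval2010 Task 8 without MTB pre-trained weights, something like the following can be used (the data paths are placeholders for wherever you unzipped the dataset, and the values are illustrative rather than repository defaults; SemEval2010 Task 8 has 19 relation classes including Other):

python3 main_task.py --train_data ./data/SemEval2010_task8_training/TRAIN_FILE.TXT \
	--test_data ./data/SemEval2010_task8_testing_keys/TEST_FILE_FULL.TXT \
	--use_pretrained_blanks 0 --num_classes 19 --model_no 0 \
	--model_size bert-base-uncased --train 1 --infer 0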

Inference (--infer=1)

To infer a sentence, annotate the two entities of interest within the sentence with their respective entity tags [E1] and [E2]. Example:

Type input sentence ('quit' or 'exit' to terminate):
The surprise [E1]visit[/E1] caused a [E2]frenzy[/E2] on the already chaotic trading floor.

Sentence:  The surprise [E1]visit[/E1] caused a [E2]frenzy[/E2] on the already chaotic trading floor.
Predicted:  Cause-Effect(e1,e2) 

Inference can also be run programmatically:

from src.tasks.infer import infer_from_trained

inferer = infer_from_trained(args, detect_entities=False)
test = "The surprise [E1]visit[/E1] caused a [E2]frenzy[/E2] on the already chaotic trading floor."
inferer.infer_sentence(test, detect_entities=False)
Sentence:  The surprise [E1]visit[/E1] caused a [E2]frenzy[/E2] on the already chaotic trading floor.
Predicted:  Cause-Effect(e1,e2) 

The script can also automatically detect potential entities in an input sentence, in which case all possible relation combinations are inferred:

inferer = infer_from_trained(args, detect_entities=True)
test2 = "After eating the chicken, he developed a sore throat the next morning."
inferer.infer_sentence(test2, detect_entities=True)
Sentence:  [E2]After eating the chicken[/E2] , [E1]he[/E1] developed a sore throat the next morning .
Predicted:  Other 

Sentence:  After eating the chicken , [E1]he[/E1] developed [E2]a sore throat[/E2] the next morning .
Predicted:  Other 

Sentence:  [E1]After eating the chicken[/E1] , [E2]he[/E2] developed a sore throat the next morning .
Predicted:  Other 

Sentence:  [E1]After eating the chicken[/E1] , he developed [E2]a sore throat[/E2] the next morning .
Predicted:  Other 

Sentence:  After eating the chicken , [E2]he[/E2] developed [E1]a sore throat[/E1] the next morning .
Predicted:  Other 

Sentence:  [E2]After eating the chicken[/E2] , he developed [E1]a sore throat[/E1] the next morning .
Predicted:  Cause-Effect(e2,e1) 

FewRel Task

Download the FewRel 1.0 dataset here and unzip it to the ./data/ folder.
Run main_task.py with the argument --task set to fewrel:

python main_task.py --task fewrel

Results (5-way 1-shot), BERTEM without MTB, not trained on any FewRel data:

Model size           Accuracy (41,646 samples)
bert-base-uncased    62.229 %
bert-large-uncased   72.766 %

Benchmark Results

SemEval2010 Task 8

  1. Base architecture: BERT base uncased (12-layer, 768-hidden, 12-heads, 110M parameters)

     Without MTB pre-training: F1 results when trained on 100% of the training data.

  2. Base architecture: ALBERT base uncased (12 repeating layers, 128 embedding, 768-hidden, 12-heads, 11M parameters)

     Without MTB pre-training: F1 results when trained on 100% of the training data.

To add

  • inference & results on benchmarks (SemEval2010 Task 8) with MTB pre-training
  • FewRel task
