29th June Presentation
Authors
Generation of Embeddings
Comparison of Techniques
Both models were initialized with key parameters: vector size of 100,
window size of 5, and a minimum word count of 5.
The models were trained for 10 epochs, meaning the dataset was iterated over ten times during training.
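As a minimal sketch of this setup, the snippet below initializes and trains two embedding models with the stated hyperparameters, assuming gensim's Word2Vec and Doc2Vec (only Doc2Vec is named later in the comparison, so Word2Vec as the second model, and the toy token corpus, are assumptions).

```python
# Minimal sketch, assuming gensim 4.x; the token corpus is a placeholder.
from gensim.models import Word2Vec
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Placeholder token sequences; in the study these come from project source code.
token_lists = [["public", "void", "method", "return", "value"]] * 20

# Word2Vec with the stated key parameters.
w2v = Word2Vec(
    sentences=token_lists,
    vector_size=100,  # embedding dimension
    window=5,         # context window size
    min_count=5,      # ignore tokens appearing fewer than 5 times
    epochs=10,        # 10 passes over the dataset
)

# Doc2Vec with the same key parameters; each token sequence gets a tag.
tagged = [TaggedDocument(tokens, [i]) for i, tokens in enumerate(token_lists)]
d2v = Doc2Vec(
    documents=tagged,
    vector_size=100,
    window=5,
    min_count=5,
    epochs=10,
)
```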
The trained models were then used to generate embeddings from the
tokens.
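One common way to obtain such embeddings is sketched below as a continuation of the previous snippet (it reuses the trained w2v and d2v models); averaging Word2Vec token vectors is an assumed choice, not a procedure confirmed by the slides.

```python
# Continues the sketch above (uses the trained w2v and d2v models).
import numpy as np

def word2vec_embedding(tokens, model):
    """Average the vectors of in-vocabulary tokens into one fixed-size vector."""
    vecs = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

sample_tokens = ["public", "void", "method", "return", "value"]
w2v_vec = word2vec_embedding(sample_tokens, w2v)   # 100-dimensional vector
d2v_vec = d2v.infer_vector(sample_tokens)          # 100-dimensional vector
```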
Generation of Embeddings
The pre-trained models were fine-tuned using the corpus.
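A hedged sketch of what this fine-tuning could look like is given below, assuming the pre-trained model was saved in gensim's native format; the file path and corpus are placeholders, and an analogous update for the Doc2Vec model depends on the installed gensim version supporting vocabulary updates for Doc2Vec.

```python
# Sketch only: the file path and corpus are placeholders, not from the study.
from gensim.models import Word2Vec

corpus = [["public", "void", "method", "return", "value"]] * 20  # placeholder

# Load a previously trained Word2Vec model and continue training on the corpus.
w2v = Word2Vec.load("pretrained_word2vec.model")  # hypothetical path
w2v.build_vocab(corpus, update=True)              # add the corpus vocabulary
w2v.train(corpus, total_examples=len(corpus), epochs=w2v.epochs)
```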
Sequence tokens from each project version were fed into these models.
The trained models output "1" when a bug is detected and "0" when the software is bug-free.
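The slides do not say which classifier produces these labels; as one possible illustration, a scikit-learn logistic regression over the embedding vectors is shown below (the data and the classifier choice are assumptions).

```python
# Illustrative only: placeholder embeddings and labels, assumed classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 100))     # one 100-dimensional embedding per file
y = rng.integers(0, 2, size=200)    # 1 = buggy, 0 = bug-free

clf = LogisticRegression(max_iter=1000).fit(X, y)
predictions = clf.predict(X)        # array of 1s (bug detected) and 0s (bug-free)
```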
Doc2Vec embeddings were associated with lower false negative rate (FNR) and false positive rate (FPR) values.
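For reference, FNR and FPR can be computed from the confusion matrix as below; the label arrays are placeholders, not results from the study.

```python
# FNR and FPR from a binary confusion matrix; labels are placeholders.
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
fnr = fn / (fn + tp)   # false negative rate: buggy versions missed
fpr = fp / (fp + tn)   # false positive rate: clean versions flagged as buggy
print(f"FNR={fnr:.2f}, FPR={fpr:.2f}")
```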