Image Captioning Models

Source files for the image captioning system built for my Bachelor's thesis, "Design and implementation of image caption generation system" (2022). The model uses an RNN for the sentence embedding and a pretrained network (such as VGG19) for the image embedding.

Paper Summary

The image captioning model merges the sentence and image representations and then predicts the most probable next token in the caption. The following merging methods were tested:

Simple Sum (A + B)
Multiplication (A * B)
Weighted Sum (A + (WA) * B)

Of these, the weighted sum performed best.
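The three merging methods can be sketched in plain Python as follows. This is an illustrative sketch, not the thesis code: it assumes A and B are vectors of equal length, that `*` is element-wise multiplication, and that `WA` is the product of a learned weight matrix W with A. The function names and the example values are made up for the illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def merge_sum(a: Vector, b: Vector) -> Vector:
    """Simple Sum: A + B, element by element."""
    return [x + y for x, y in zip(a, b)]


def merge_mul(a: Vector, b: Vector) -> Vector:
    """Multiplication: A * B, element by element."""
    return [x * y for x, y in zip(a, b)]


def matvec(w: Matrix, a: Vector) -> Vector:
    """Matrix-vector product WA (assumed meaning of 'WA')."""
    return [sum(w_ij * a_j for w_ij, a_j in zip(row, a)) for row in w]


def merge_weighted_sum(a: Vector, b: Vector, w: Matrix) -> Vector:
    """Weighted Sum: A + (WA) * B, with WA acting as an element-wise weight on B."""
    weights = matvec(w, a)
    return [x + g * y for x, g, y in zip(a, weights, b)]


# Hypothetical 2-dimensional embeddings and weight matrix.
a = [1.0, 2.0]
b = [3.0, 4.0]
w = [[0.5, 0.0], [0.0, 0.25]]

summed = merge_sum(a, b)
multiplied = merge_mul(a, b)
weighted = merge_weighted_sum(a, b, w)
```

In a trained model W would be learned together with the other parameters; here it is fixed only to make the arithmetic easy to follow.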

Captioning an image starts by feeding the model the image embedding together with a [START] token; generation ends when the model produces an [END] token.
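The generation loop described above can be sketched as a greedy decoder. This is a minimal sketch under assumptions: `next_token_probs` is a hypothetical stand-in for the trained model, `toy_model` is a scripted stub used only to make the example runnable, and the `max_len` guard is an addition so the loop always terminates.

```python
from typing import Callable, Dict, List, Sequence

START, END = "[START]", "[END]"

# Hypothetical model interface: (image embedding, tokens so far) -> token probabilities.
NextTokenFn = Callable[[Sequence[float], List[str]], Dict[str, float]]


def greedy_caption(
    image_embedding: Sequence[float],
    next_token_probs: NextTokenFn,
    max_len: int = 20,
) -> List[str]:
    """Generate a caption by repeatedly picking the most probable next token."""
    tokens = [START]
    while len(tokens) < max_len:
        probs = next_token_probs(image_embedding, tokens)
        best = max(probs, key=probs.get)  # most probable next token
        if best == END:
            break
        tokens.append(best)
    return tokens[1:]  # drop the [START] marker


def toy_model(image_embedding: Sequence[float], tokens: List[str]) -> Dict[str, float]:
    """Stub model that ignores the image and follows a fixed script."""
    script = ["a", "dog", "runs", END]
    target = script[min(len(tokens) - 1, len(script) - 1)]
    return {tok: (0.9 if tok == target else 0.1 / 3) for tok in set(script)}


caption = greedy_caption([0.0, 0.0], toy_model)
truncated = greedy_caption([0.0, 0.0], toy_model, max_len=3)
```

With the stub, `caption` contains the scripted words without the [START] and [END] markers, and `truncated` shows the length guard cutting generation short.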

