MASK leverages word embeddings as bridges to associate words with their corresponding prototypes, thereby enabling semantic knowledge alignment between the image and text modalities.
We test the performance of MASK on two standard benchmark datasets: Flickr30k and MSCOCO.
Image-text matching typically comprises two sub-tasks: 1) image annotation, retrieving related texts given an image, and 2) image retrieval, retrieving related images given a text.
The commonly used evaluation criteria are "R@1", "R@5", and "R@10", i.e., the recall rates at the top-1, 5, and 10 results. Following existing works, we also adopt an additional criterion, "Rs", which sums all the recall rates to evaluate the overall performance.
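These retrieval metrics can be sketched as follows. This is a minimal illustration, not the paper's evaluation code; the function name and the example rank values are ours, and we assume each query has a single ground-truth item whose 1-based rank in the retrieved list is known.

```python
import numpy as np

def recall_at_k(ranks, ks=(1, 5, 10)):
    """Compute R@k: the percentage of queries whose ground-truth item
    appears within the top-k retrieved results.

    ranks: 1-based rank of the ground-truth item for each query.
    """
    ranks = np.asarray(ranks)
    return {k: float(np.mean(ranks <= k) * 100.0) for k in ks}

# Hypothetical ranks for 5 queries (1-based position of the correct match).
ranks = [1, 3, 7, 2, 12]
recalls = recall_at_k(ranks)        # {1: 20.0, 5: 60.0, 10: 80.0}
rs = sum(recalls.values())          # the summed "Rs" criterion: 160.0
```

In practice Rs sums the recall rates over both sub-tasks (image annotation and image retrieval), so six recall values contribute to the final score.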
To build the multimodal aligned semantic knowledge, we collect all words from the VG dataset and filter out special characters and rare words, resulting in a total of