IJFANS International Journal of Food and Nutritional Sciences

Advancements in Image Captioning: Bridging Computer Vision and Natural Language Processing with Deep Learning

Keywords:

CNN, Bi-LSTM, BLEU, Captioning

S. Sagar Imambi¹, N.V. Nikhila²
doi: 10.48047/IJFANS/11/S6/036

Abstract

Over the past few years, image captioning has emerged as a complex and demanding task in artificial intelligence and has attracted many researchers. Image captioning automatically generates a textual description of the content observed in an image, and it combines two fields: computer vision and natural language processing. Computer vision is used to recognize the content of the image, and natural language processing is used to express that content as words in the correct order. Recently, deep learning methods have achieved better results on caption generation: they can define a single end-to-end model that predicts a caption for a given photograph, instead of requiring a pipeline of specifically designed models or sophisticated data preparation. Using deep learning techniques such as CNNs and RNNs, accurate descriptions can be predicted. A deep Convolutional Neural Network (CNN) is used to extract features from the image, and a Recurrent Neural Network (RNN) is used for sentence generation. The model is trained so that, given an image, it generates a textual description of what is observed in it. The recurrent neural network can be trained on a dataset of images and text descriptions and then used to generate text descriptions for new images.
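The encoder-decoder data flow described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the tiny vocabulary, the layer sizes, the 2x2 kernels, the random untrained weights, and the function names (`encode_image`, `generate_caption`, `init_params`) are all assumptions made for illustration. A toy convolution stands in for the CNN, a plain tanh recurrent cell stands in for the RNN, and decoding is greedy. Because the weights are untrained, the generated words are meaningless; only the flow from image to features to word sequence is shown.

```python
import math
import random

# Hypothetical toy vocabulary; a real model would learn one from the captions.
VOCAB = ["<start>", "<end>", "a", "dog", "runs", "on", "grass"]
FEATURE_DIM = 4   # number of toy convolution kernels (assumed)
HIDDEN_DIM = 4    # size of the recurrent state and word embeddings (assumed)


def rand_matrix(rows, cols, rng):
    """Random weights; stands in for parameters that training would learn."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode_image(image, kernels):
    """Stand-in for the CNN: 2x2 convolution, ReLU, global average pooling."""
    height, width = len(image), len(image[0])
    features = []
    for k in kernels:
        total, count = 0.0, 0
        for i in range(height - 1):
            for j in range(width - 1):
                act = (k[0][0] * image[i][j] + k[0][1] * image[i][j + 1]
                       + k[1][0] * image[i + 1][j] + k[1][1] * image[i + 1][j + 1])
                total += max(0.0, act)  # ReLU
                count += 1
        features.append(total / count)  # one pooled feature per kernel
    return features


def generate_caption(image, params, max_len=10):
    """Stand-in for the RNN decoder: greedy word-by-word generation."""
    features = encode_image(image, params["kernels"])
    # The image features initialise the recurrent hidden state.
    hidden = [math.tanh(x) for x in matvec(params["W_init"], features)]
    token = VOCAB.index("<start>")
    words = []
    for _ in range(max_len):
        embed = params["embed"][token]
        pre = [a + b for a, b in zip(matvec(params["W_xh"], embed),
                                     matvec(params["W_hh"], hidden))]
        hidden = [math.tanh(x) for x in pre]
        scores = matvec(params["W_out"], hidden)
        # Greedy choice: take the highest-scoring word at each step.
        token = max(range(len(VOCAB)), key=scores.__getitem__)
        if VOCAB[token] == "<end>":
            break
        words.append(VOCAB[token])
    return words


def init_params(seed=0):
    """Build a reproducible set of untrained parameters."""
    rng = random.Random(seed)
    return {
        "kernels": [rand_matrix(2, 2, rng) for _ in range(FEATURE_DIM)],
        "W_init": rand_matrix(HIDDEN_DIM, FEATURE_DIM, rng),
        "embed": rand_matrix(len(VOCAB), HIDDEN_DIM, rng),
        "W_xh": rand_matrix(HIDDEN_DIM, HIDDEN_DIM, rng),
        "W_hh": rand_matrix(HIDDEN_DIM, HIDDEN_DIM, rng),
        "W_out": rand_matrix(len(VOCAB), HIDDEN_DIM, rng),
    }


# Usage: a 4x4 grayscale "image" with made-up pixel values.
image = [[0.0, 0.2, 0.4, 0.6],
         [0.1, 0.3, 0.5, 0.7],
         [0.2, 0.4, 0.6, 0.8],
         [0.3, 0.5, 0.7, 0.9]]
params = init_params(seed=0)
caption = generate_caption(image, params)
print(" ".join(caption))
```

In a trained system the random weights would be replaced by parameters learned from paired images and captions, and the greedy step could be swapped for beam search; both are outside what the abstract specifies.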

Issue

Volume 11, Special Issue 6 (2022)