Volume 13 | Issue 4
Over the past few years, image captioning has emerged as a complex and demanding task within the field of artificial intelligence, and it has attracted many researchers. Image captioning automatically generates a textual description of the content observed in an image. It combines two fields: computer vision and natural language processing. Computer vision is used to recognize the content of the image, and natural language processing is used to express that content as words in the correct order. Recently, deep learning methods have achieved better results on caption generation. They can define a single end-to-end model that predicts a caption for a given photograph, instead of requiring a pipeline of specifically designed models or sophisticated data preparation. Using deep learning techniques such as the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN), accurate descriptions can be predicted: the CNN extracts features from the image, and the RNN generates the sentence. The model is trained on a dataset of images paired with text descriptions, so that when a new image is given it generates a textual description of what is observed in that image.
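The encoder-decoder idea described above can be illustrated with a minimal sketch. It is not the model used in this work: the vocabulary, the sizes, the random (untrained) weights, and every function name are assumptions chosen for illustration, and only the Python standard library is used. A small convolution with pooling stands in for the CNN, and a simple recurrent step with greedy word selection stands in for the RNN.

```python
import math
import random

# Toy vocabulary and sizes: assumptions for illustration only.
VOCAB = ["<start>", "<end>", "a", "dog", "runs", "on", "grass"]
HIDDEN = 8  # length of the image feature vector and of the RNN hidden state


def rand_matrix(rows, cols, rng):
    """Random weights standing in for parameters that training would learn."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Matrix-vector product on plain lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def conv_features(image, kernels):
    """CNN-style encoder: one 3x3 convolution per kernel, ReLU, global average pooling."""
    h, w = len(image), len(image[0])
    feats = []
    for k in kernels:
        total = 0.0
        for i in range(h - 2):
            for j in range(w - 2):
                s = sum(k[a][b] * image[i + a][j + b]
                        for a in range(3) for b in range(3))
                total += max(0.0, s)  # ReLU
        feats.append(total / ((h - 2) * (w - 2)))
    return feats


def rnn_step(h_prev, word_id, params):
    """One recurrent step: h = tanh(W_h h_prev + E[word]); returns h and vocabulary scores."""
    W_h, E, W_out = params
    pre = matvec(W_h, h_prev)
    h = [math.tanh(p + e) for p, e in zip(pre, E[word_id])]
    return h, matvec(W_out, h)


def generate_caption(image, kernels, params, max_len=10):
    """Greedy decoding: image features initialize the hidden state, then words are picked one by one."""
    h = conv_features(image, kernels)
    word_id = VOCAB.index("<start>")
    caption = []
    for _ in range(max_len):
        h, scores = rnn_step(h, word_id, params)
        word_id = max(range(len(VOCAB)), key=lambda i: scores[i])
        if VOCAB[word_id] == "<end>":
            break
        caption.append(VOCAB[word_id])
    return caption


rng = random.Random(0)
kernels = [rand_matrix(3, 3, rng) for _ in range(HIDDEN)]
params = (
    rand_matrix(HIDDEN, HIDDEN, rng),      # W_h: hidden-to-hidden weights
    rand_matrix(len(VOCAB), HIDDEN, rng),  # E: one embedding row per word
    rand_matrix(len(VOCAB), HIDDEN, rng),  # W_out: hidden-to-vocabulary scores
)
image = [[rng.random() for _ in range(8)] for _ in range(8)]  # fake 8x8 grayscale image
caption = generate_caption(image, kernels, params)
```

Because the weights are random, the resulting word sequence is meaningless; in a trained system these parameters would be learned from image-caption pairs, and the encoder would typically be a much deeper pretrained network.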