In this project, we systematically analyze a deep neural networks based image caption generation method. With an image as the input, the method can output an English sentence describing and speaking out the scene in the image. And also deployed this model on the AWS cloud using the flask farmework
Dataset used: https://www.kaggle.com/hsankesara/flickr-image-dataset Flickr 30K is a dataset for automatic image description and grounded language understanding. It contains 30k images collected from Flickr with 158k captions provided by human annotators.