Ebook: Deep Learning for Image Processing Applications
Deep learning and image processing are two areas of great interest to academics and industry professionals alike. The areas of application of these two disciplines range widely, encompassing fields such as medicine, robotics, and security and surveillance.
The aim of this book, ‘Deep Learning for Image Processing Applications’, is to bring concepts from these two areas together in a single volume, drawing on the shared ideas of professionals from academia and research about problems and solutions relating to the multifaceted aspects of the two disciplines. The first chapter provides an introduction to deep learning, and serves as the basis for much of what follows in the subsequent chapters, which cover subjects including: the application of deep neural networks for image classification; hand gesture recognition in robotics; deep learning techniques for image retrieval; disease detection using deep learning techniques; and the comparative analysis of deep data and big data.
The book will be of interest to all those whose work involves the use of deep learning and image processing techniques.
Deep learning and image processing are two areas that interest many academics and industry professionals. The main objective of this book is to present concepts from these two areas on a single platform. Professionals from academia and research labs have shared their ideas, problems and solutions relating to the multifaceted aspects of these areas.
The first chapter gives an introduction to deep learning, dealing with the relation between man, mind and intelligence, and provides an excellent foundation for the subsequent chapters. The second chapter demonstrates the application of deep neural networks for image classification; a wide range of images is used in this application, demonstrating the robustness of the proposed approach. Hand gesture recognition with deep neural networks is analyzed in the third chapter, an interesting aspect of which is how the recognized hand gestures are used to control a robotic arm.
Deep learning techniques for image retrieval are discussed in the fourth chapter, which also stresses the significance of the real-time growth of multimedia data and the need for efficient search processes. The fifth chapter concentrates on disease detection from images using deep learning techniques; the sample disease used in this approach is a form of diabetes commonly found in humans. The sixth chapter deals with the detection of tuberculosis in the human body through deep learning approaches, and the experimental results are promising for the proposed technique.
Object retrieval from images using deep convolutional features is discussed in the seventh chapter; convolutional neural networks are used for the experimental analysis in this work. The eighth chapter highlights the application of hierarchical object detection with deep reinforcement learning approaches using different variations of the images. A comparative analysis of deep data and big data is presented in the ninth chapter, which adds a different dimension to the preceding content.
Vehicle type recognition using sparse filtered convolutional neural networks is discussed in the tenth chapter; images from a publicly available database are used for the experimental analysis in this work. The application of deep learning approaches to surveillance and security is discussed in the eleventh chapter. The final chapter discusses the possibility of enhancing the quality of images captured from a long distance using deep learning approaches. The variety of content in these chapters provides an excellent platform for researchers working in these areas.
We would like to express our gratitude to all of the authors who submitted chapters for their contributions. We also acknowledge the great efforts of the reviewers who have spent their valuable time working on the contents of this book. We would also like to thank Prof. Gerhard Joubert, Editor-in-Chief of the Advances in Parallel Computing series, and IOS Press for their constant guidance throughout this book project.
D. Jude Hemanth
Vania Viera Estrela
Image processing (IP) and artificial intelligence (AI) are exciting research areas in cognitive and computer science. This chapter deals with image processing and attempts to illustrate the intertwining of mind, machine and image processing. Different subtle aspects pertaining to mind and intelligence are also presented. In fact, intelligence acts as a substrate of mind to engage it in consciousness. Indeed, mind is distinct from consciousness, and both are subtler than the body. The chapter also compares mind, machine and intelligence in image processing. Moreover, an attempt has been made to envision the core issues related to different aspects, namely: (1) intelligence is the means to engage the mind in consciousness; (2) the brain is analogous to a feed-forward hierarchical state machine, acts as a substrate for the mind, and possesses a vital relationship with intelligence. There are two important approaches to developing an artificially intelligent device: the conceptual understanding of mind and the computational framework of the brain. This chapter emphasizes that the agent in which embryonic artificial intelligence is to be created must have the capacity for embodied experience, and it must possess sensory components to establish relations between itself and the external world. Further, this chapter also considers the philosophical perspectives of some intuitive questions, such as: Can mind be explained in terms of machines? Can mind be replicated on machines? Can machines ever be intelligent? In fact, mind or intelligence cannot be reflected by physical means. In addition, this chapter notes that there are mainly two approaches pertinent to the philosophy of mind: dualism and functionalism. However, it is hard to explain the true nature of mind with the help of only these two approaches, because mind simultaneously displays characteristics both like and unlike a machine. Moreover, the concept of machine seems to be based on the dichotomy between these two approaches. Various difficulties pertaining to image processing have also been discussed.
This chapter gives an insight into deep learning neural networks and their application to image classification / pattern recognition. The principle of convolutional neural networks is described, and an in-depth study of the algorithms for image classification is made. Machine learning plays a key role in artificial intelligence: an algorithm learns when exposed to new data or a new environment. Object / pattern recognition is an integral part of machine learning, and image classification is an integral part of such algorithms. The human visual system efficiently classifies known objects and also learns easily when exposed to new objects. This capability is being developed in artificial neural networks, and there are several types of such networks with increasing capabilities in solving problems. Neural networks themselves have evolved from evolutionary computing techniques that try to simulate the behavior of the human brain in reasoning, recognition and learning. Deep neural networks have powerful architectures with the capability to learn, and there are training algorithms that make the networks adapt themselves in machine learning. The networks extract features from the object, and these are used for classification. The chapter concludes with a brief overview of some of the applications / case studies already published in the literature.
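To make the feature-extraction-and-classification pipeline described above concrete, the following is a minimal PyTorch sketch of a small convolutional classifier. The layer sizes, the 32x32 RGB input and the 10-class output are illustrative assumptions and are not taken from the chapter.

```python
# Minimal convolutional classifier sketch (illustrative assumptions only).
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Two convolution blocks extract feature maps from the image.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                              # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                              # 16x16 -> 8x8
        )
        # A fully connected head maps the extracted features to class scores.
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = SimpleCNN()
scores = model(torch.randn(4, 3, 32, 32))   # batch of four 32x32 RGB images
print(scores.shape)                          # torch.Size([4, 10])
```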
Hand gestures and deep learning strategies can be used to control a virtual robotic arm for real-time applications. The robotic arm is portable, can easily be programmed to perform any task of a hand, and is controlled using deep learning techniques. Deep hand is a combination of virtual reality and deep learning techniques. It estimates the active spatio-temporal feature and the corresponding pose parameter for various hand movements, in order to determine the unknown pose parameters of hand gestures using various deep learning algorithms. A novel framework estimates hand gestures using a deep convolutional neural network (CNN) and a deep belief network (DBN), and a comparison in terms of accuracy and recognition rate is drawn. This helps in analyzing the movement of a hand and its fingers, which can be used to control a robotic arm with a high recognition rate and a low error rate.
With the increase in the amount of multimedia content, there arises a need to retrieve it from databases effectively. Several techniques have been introduced to deal with this situation efficiently; such methods are known as image retrieval methods. This chapter gives a brief review of different content-based and sketch-based image retrieval systems. Along with existing techniques, it also covers what further can be achieved with these systems.
Almost all of us are tempted to eat sweets because of their pleasant taste, but when overused they affect the entire body. Diabetes is a disease that occurs when the blood glucose level is high. According to a study by the World Health Organization (WHO), the prevalence of diabetes has doubled in the last 10 years. Lifestyle, working environment, nature of work, food habits and heredity are a few causes of diabetes. Over time, diabetes leads to various health problems such as heart disease, stroke, kidney problems, nerve damage, and eye and dental problems. Stevia is a sugar substitute which is available all over the world and has been shown to be safer for diabetic patients. Stevia contains proteins, vitamins and minerals. The Stevia plant may be affected by various diseases such as root rot, charcoal rot, wilt, leaf spot disease and so on. This chapter demonstrates a deep learning approach to enable disease detection through image recognition. A deep convolutional neural network is trained to classify the disease-affected leaves, achieving an accuracy of over 99%.
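As a rough illustration of how such a leaf-disease classifier could be set up, the following sketch fine-tunes a pretrained CNN on a folder of labelled leaf images. The dataset path, class folders, network choice (ResNet-18) and hyperparameters are hypothetical; the chapter's own architecture and training details may differ.

```python
# Hypothetical sketch: fine-tune a pretrained CNN to classify leaf images
# into disease categories. Folder layout, class count and training settings
# are illustrative assumptions, not the chapter's actual setup.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
# Expects subfolders such as healthy/, root_rot/, leaf_spot/ under this path.
train_set = datasets.ImageFolder("stevia_leaves/train", transform=transform)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```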
In this work, an attempt has been made to demarcate Tuberculosis (TB) sputum smear positive and negative images using a statistical method based on the Gray Level Co-occurrence Matrix (GLCM). The sputum smear images (N=100), recorded under a standard image acquisition protocol, are considered for this work. Haralick descriptor-based statistical features are calculated from the sputum smear images. The most relevant features are ranked by principal component analysis. It is observed that the first five principal components contribute more than 96% of the variance for the chosen significant features. These features are further utilized to demarcate the positive from the negative smear images using Support Vector Machines (SVM) and Differential Evolution based Extreme Learning Machines (DE-ELM). Results demonstrate that DE-ELM performs better than SVM in terms of performance estimators such as sensitivity, specificity and accuracy. It is also observed that the generalization learning capacity of DE-ELM is better in terms of the number of hidden neurons utilized than the number of support vectors used by SVM. Thus it appears that this method could be useful for mass discrimination of positive and negative TB sputum smear images.
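A minimal sketch of this classical pipeline, assuming scikit-image and scikit-learn and using synthetic grayscale images in place of the sputum smear data, might look as follows; the GLCM distances, angles and selected Haralick-style properties are illustrative choices, and the DE-ELM stage is not shown.

```python
# Sketch of the pipeline: GLCM texture features -> PCA -> SVM classifier.
# Images and labels below are synthetic stand-ins for the smear data.
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.decomposition import PCA
from sklearn.svm import SVC

def glcm_features(gray_image: np.ndarray) -> np.ndarray:
    """Compute a few Haralick-style descriptors from one grayscale image."""
    glcm = graycomatrix(gray_image, distances=[1],
                        angles=[0, np.pi / 2], levels=256,
                        symmetric=True, normed=True)
    props = ["contrast", "correlation", "energy", "homogeneity", "ASM"]
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(64, 64), dtype=np.uint8) for _ in range(20)]
labels = rng.integers(0, 2, size=20)   # 1 = smear positive, 0 = negative (synthetic)

X = np.vstack([glcm_features(img) for img in images])
X_reduced = PCA(n_components=5).fit_transform(X)   # keep the leading components
clf = SVC(kernel="rbf").fit(X_reduced, labels)
```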
Image representations extracted from convolutional neural networks (CNNs) outperform hand-crafted features in several computer vision tasks, such as visual image retrieval. This chapter recommends a simple pipeline for encoding the local activations of a convolutional layer of a pretrained CNN using the well-known bag of words (BoW) aggregation scheme, called the bag of local convolutional features (BLCF). Matching each local array of activations in a convolutional layer to a visual word results in an assignment map, a compact representation that relates regions of an image to visual words. We use the assignment map for fast spatial reranking, finding object localizations that are used for query expansion. We show the suitability of the BoW representation based on local CNN features for image retrieval, attaining state-of-the-art performance on the Oxford and Paris buildings benchmarks. We demonstrate that the BLCF system outperforms the latest procedures using sum pooling for a subset of the challenging TRECVid INS benchmark according to the mean Average Precision (mAP) metric.
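The following sketch illustrates the bag-of-words aggregation idea on which BLCF is based, assuming NumPy and scikit-learn; the feature map is random rather than taken from a pretrained CNN, and the codebook size is an arbitrary choice.

```python
# Minimal illustration of BoW aggregation over local conv activations:
# each spatial location is assigned to a visual word, producing an
# assignment map, which is then pooled into a histogram.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
C, H, W = 64, 14, 14                         # channels and spatial size of the conv layer
feature_map = rng.normal(size=(C, H, W))     # stands in for real CNN activations

locals_ = feature_map.reshape(C, H * W).T    # one C-dim descriptor per spatial location
codebook = MiniBatchKMeans(n_clusters=16, n_init=3, random_state=0).fit(locals_)

assignment_map = codebook.predict(locals_).reshape(H, W)   # visual word per location
bow_vector = np.bincount(assignment_map.ravel(), minlength=16).astype(float)
bow_vector /= np.linalg.norm(bow_vector)     # L2-normalised image representation
```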
This work introduces a model for Hierarchical Object Detection with Deep Reinforcement Learning (HOD-DRL). The key idea is to focus on those parts of the image that contain richer information and zoom in on them. We train an intelligent agent that, given an image window, is capable of deciding where to focus its attention among five different predefined region candidates (smaller windows). This procedure is iterated, providing a hierarchical image analysis. We compare two different candidate proposal strategies to guide the object search: with and without overlap. Moreover, our work compares two different strategies to extract features from a convolutional neural network for each region proposal: a first one that computes new feature maps for each region proposal, and a second one that computes the feature maps for the whole image and later generates crops for each region proposal. Experiments indicate better results for the overlapping candidate proposal strategy and a loss of performance for the cropped image features due to the loss of spatial resolution. We argue that, while this loss seems unavoidable when working with a large number of object candidates, the much smaller number of region proposals generated by our reinforcement learning agent makes it feasible to extract features for each location without sharing convolutional computation among regions. Source code and models are available at https://imatge-upc.github.io/detection-2016-nipsws/.
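A toy sketch of the hierarchical zoom loop is given below, with a random placeholder standing in for the trained reinforcement learning agent; the five candidate sub-windows and the overlap factor are illustrative assumptions rather than the paper's exact region layout.

```python
# Toy sketch of hierarchical region search: at each step an "agent" picks
# one of five predefined sub-windows (four corners plus a centre crop) and
# the search zooms into it. random.choice is a placeholder for the policy.
import random

def candidate_regions(box, overlap=0.75):
    """Five smaller windows inside (x, y, w, h); overlap controls their size."""
    x, y, w, h = box
    sw, sh = int(w * overlap), int(h * overlap)
    return [
        (x, y, sw, sh),                                   # top-left
        (x + w - sw, y, sw, sh),                          # top-right
        (x, y + h - sh, sw, sh),                          # bottom-left
        (x + w - sw, y + h - sh, sw, sh),                 # bottom-right
        (x + (w - sw) // 2, y + (h - sh) // 2, sw, sh),   # centre
    ]

box = (0, 0, 224, 224)
for step in range(4):                      # a few hierarchical zoom steps
    candidates = candidate_regions(box)
    box = random.choice(candidates)        # placeholder for the learned agent
    print(f"step {step}: focus on {box}")
```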
Nowadays, big data is an important topic in industry as well as in academia. The root of big data is the ability to study and analyze large sections of information to search for patterns and find trends. The root of big data is analytics: applying analytics leads to many findings that were previously undiscovered. Big data simply takes existing data and looks at it in a different way. Deep data, on the other hand, gathers data on a daily basis and aligns it with industry expertise. The main role of deep data is to cut down the massive amount of data, measured in exabytes or petabytes, by excluding information that is duplicated or useless. However, there are many challenges in moving the current scenario from big data to deep data. There are many machine learning approaches that can be applied to big data, and deep learning is one of them, but many challenges remain to be addressed. The objective of this chapter is to discuss the various challenges in analyzing big data as well as deep data using deep learning.
Vehicle type recognition has become an important application in Intelligent Transportation Systems (ITSs) to provide a safe and efficient road and transportation infrastructure. There are some challenges in implementing this technology, including the complexity of the image, which degrades accuracy, and how to differentiate intra-class variations of vehicles, for instance taxi and car. In this chapter, we propose a deep learning framework that consists of a Sparse-Filtered Convolutional Neural Network with Layer Skipping (SF-CNNLS) strategy to recognize the vehicle type. We implemented 64 sparse filters in sparse filtering to extract discriminative features of the vehicle, and 2 hidden layers of CNNLS for further processing. The SF-CNNLS can recognize different types of vehicles owing to the combined advantages of each approach. We evaluated the SF-CNNLS using various classes of vehicle, namely car, taxi, and truck. The evaluation was conducted during daylight, under different weather conditions, with a frontal view of the vehicle. From that evaluation, we were able to correctly recognize the classes with almost 91% average accuracy and to successfully recognize the taxi as a class distinct from the car.
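For readers unfamiliar with the sparse filtering component, the following sketch shows the standard sparse filtering objective (soft-absolute features normalised per feature and per example, with an L1 sparsity penalty) on synthetic data; it is not the chapter's SF-CNNLS implementation, and the dimensions are arbitrary.

```python
# Sketch of the sparse filtering objective (Ngiam et al., 2011) on
# synthetic data; dimensions and data are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 200))               # 20-dim inputs, 200 examples
n_features = 16                              # cf. the 64 sparse filters in the chapter

def sparse_filtering_objective(w_flat):
    W = w_flat.reshape(n_features, X.shape[0])
    F = np.sqrt((W @ X) ** 2 + 1e-8)                         # soft absolute value
    F = F / np.linalg.norm(F, axis=1, keepdims=True)         # normalise each feature row
    F = F / np.linalg.norm(F, axis=0, keepdims=True)         # normalise each example column
    return F.sum()                                           # L1 sparsity of normalised features

w0 = rng.normal(size=n_features * X.shape[0])
result = minimize(sparse_filtering_objective, w0, method="L-BFGS-B",
                  options={"maxiter": 50})
W_learned = result.x.reshape(n_features, X.shape[0])         # learned sparse filters
```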
Convolutional neural networks have achieved great success in computer vision, significantly improving the state of the art in image classification, semantic segmentation, object detection and face recognition. In this chapter, we illustrate the advances made by the convolutional neural network (CNN) in surveillance and security applications using two examples. For the surveillance application, a novel military object detector called Deep Fusion Detector was proposed, which incorporates information fusion techniques and the CNN. Specifically, we fused multi-channel images within a CNN to enhance the significance of deep features, and adapted a state-of-the-art generic object detector to the military scenario. For the security application, inspired by recent advances in the deep learning community, we presented an effective face recognition system called Deep Residual Face, in which the Inception-ResNet CNN architecture was utilized to extract deep features and the center loss function was adopted for training the face verification network. Extensive experiments showed the effectiveness of the presented methods.
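A simplified PyTorch sketch of a center-loss term of the kind mentioned above is shown below; unlike the original formulation, it lets the optimizer update the class centers directly, and the embedding dimension and class count are illustrative assumptions.

```python
# Simplified centre-loss sketch: each class keeps a learnable centre and
# embeddings are pulled towards it. In practice it is combined with a
# lambda-weighted softmax cross-entropy on the identity logits.
import torch
import torch.nn as nn

class CenterLoss(nn.Module):
    def __init__(self, num_classes: int, feat_dim: int):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Squared distance between each embedding and the centre of its class.
        diff = features - self.centers[labels]
        return 0.5 * (diff ** 2).sum(dim=1).mean()

embeddings = torch.randn(8, 128, requires_grad=True)   # e.g. deep face features
labels = torch.randint(0, 10, (8,))
center_loss = CenterLoss(num_classes=10, feat_dim=128)
loss = center_loss(embeddings, labels)
loss.backward()
```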
This chapter proposes a deep convolutional neural network based super-resolution framework to super-resolve and recognize long-range captured iris image sequences. The proposed framework is tested on the CASIA V4 iris database by analyzing the peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM) and visual information fidelity in the pixel domain (VIFP) of the state-of-the-art algorithms. The performance of the proposed framework is analyzed for upsampling factors of 2 and 4, achieving PSNRs of 37.42 dB and 34.74 dB respectively. Using this framework, we have achieved an equal error rate (EER) of 0.14%. The results demonstrate that the proposed framework can super-resolve the iris images effectively and achieves better recognition performance.
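The reconstruction metrics mentioned above can be computed, for example, with scikit-image as in the following sketch; the images are synthetic stand-ins for a ground-truth iris image and its super-resolved estimate, and VIFP is omitted since scikit-image does not provide it.

```python
# Sketch: evaluating a super-resolved image against its reference with
# PSNR and SSIM, using synthetic images as placeholders.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
reference = rng.random((128, 128))                       # ground-truth iris image
reconstructed = reference + 0.01 * rng.normal(size=reference.shape)
reconstructed = np.clip(reconstructed, 0.0, 1.0)         # super-resolved estimate

psnr = peak_signal_noise_ratio(reference, reconstructed, data_range=1.0)
ssim = structural_similarity(reference, reconstructed, data_range=1.0)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.4f}")
```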