Computer Science - Neural and Evolutionary Computing Publications (50)


Computer Science - Neural and Evolutionary Computing Publications

In this paper, a new approach to solve the cubic B-spline curve fitting problem is presented based on a meta-heuristic algorithm called " dolphin echolocation ". The method minimizes the proximity error value of the selected nodes that measured using the least squares method and the Euclidean distance method of the new curve generated by the reverse engineering. The results of the proposed method are compared with the genetic algorithm. Read More

Credit assignment in traditional recurrent neural networks usually involves back-propagating through a long chain of tied weight matrices. The length of this chain scales linearly with the number of time-steps as the same network is run at each time-step. This creates many problems, such as vanishing gradients, that have been well studied. Read More

End-to-end (E2E) systems have achieved competitive results compared to conventional hybrid hidden Markov model (HMM)-deep neural network based automatic speech recognition (ASR) systems. Such E2E systems are attractive due to the lack of dependence on alignments between input acoustic and output grapheme or HMM state sequence during training. This paper explores the design of an ASR-free end-to-end system for text query-based keyword search (KWS) from speech trained with minimal supervision. Read More

The increase in the use of microblogging came along with the rapid growth on short linguistic data. On the other hand deep learning is considered to be the new frontier to extract meaningful information out of large amount of raw data in an automated manner. In this study, we engaged these two emerging fields to come up with a robust language identifier on demand, namely Language Identification Engine (LIDE). Read More

Recurrent neural networks with various types of hidden units have been used to solve a diverse range of problems involving sequence data. Two of the most recent proposals, gated recurrent units (GRU) and minimal gated units (MGU), have shown comparable promising results on example public datasets. In this paper, we introduce three model variants of the minimal gated unit (MGU) which further simplify that design by reducing the number of parameters in the forget-gate dynamic equation. Read More

The standard LSTM recurrent neural networks while very powerful in long-range dependency sequence applications have highly complex structure and relatively large (adaptive) parameters. In this work, we present empirical comparison between the standard LSTM recurrent neural network architecture and three new parameter-reduced variants obtained by eliminating combinations of the input signal, bias, and hidden unit signals from individual gating signals. The experiments on two sequence datasets show that the three new variants, called simply as LSTM1, LSTM2, and LSTM3, can achieve comparable performance to the standard LSTM model with less (adaptive) parameters. Read More

In this work we study the problem of network morphism, an effective learning scheme to morph a well-trained neural network to a new one with the network function completely preserved. Different from existing work where basic morphing types on the layer level were addressed, we target at the central problem of network morphism at a higher level, i.e. Read More

We describe an open-source toolkit for neural machine translation (NMT). The toolkit prioritizes efficiency, modularity, and extensibility with the goal of supporting NMT research into model architectures, feature representations, and source modalities, while maintaining competitive performance and reasonable training requirements. The toolkit consists of modeling and translation support, as well as detailed pedagogical documentation about the underlying techniques. Read More

We introduce a tree-structured attention neural network for sentences and small phrases and apply it to the problem of sentiment classification. Our model expands the current recursive models by incorporating structural information around a node of a syntactic tree using both bottom-up and top-down information propagation. Also, the model utilizes structural attention to identify the most salient representations during the construction of the syntactic tree. Read More

Brain inspired neuromorphic computing has demonstrated remarkable advantages over traditional von Neumann architecture for its high energy efficiency and parallel data processing. However, the limited resolution of synaptic weights degrades system accuracy and thus impedes the use of neuromorphic systems. In this work, we propose three orthogonal methods to learn synapses with one-level precision, namely, distribution-aware quantization, quantization regularization and bias tuning, to make image classification accuracy comparable to the state-of-the-art. Read More

Several learning rules for synaptic plasticity, that depend on either spike timing or internal state variables, have been proposed in the past imparting varying computational capabilities to Spiking Neural Networks. Due to design complications these learning rules are typically not implemented on neuromorphic devices leaving the devices to be only capable of inference. In this work we propose a unidirectional post-synaptic potential dependent learning rule that is only triggered by pre-synaptic spikes, and easy to implement on hardware. Read More

In de novo drug design, computational strategies are used to generate novel molecules with good affinity to the desired biological target. In this work, we show that recurrent neural networks can be trained as generative models for molecular structures, similar to statistical language models in natural language processing. We demonstrate that the properties of the generated molecules correlate very well with the properties of the molecules used to train the model. Read More

In this paper, we study learning generalized driving style representations from automobile GPS trip data. We propose a novel Autoencoder Regularized deep neural Network (ARNet) and a trip encoding framework trip2vec to learn drivers' driving styles directly from GPS records, by combining supervised and unsupervised feature learning in a unified architecture. Experiments on a challenging driver number estimation problem and the driver identification problem show that ARNet can learn a good generalized driving style representation: It significantly outperforms existing methods and alternative architectures by reaching the least estimation error on average (0. Read More

In distributed evolutionary algorithms, migration interval is used to decide migration moments. Nevertheless, migration moments predetermined by intervals cannot match the dynamic situation of evolution. In this paper, a scheme of setting the success rate of migration based on subpopulation diversity at each interval is proposed. Read More

This paper reviews the current status and challenges of Neural Networks (NNs) based machine learning approaches for modern power grid stability control including their design and implementation methodologies. NNs are widely accepted as Artificial Intelligence (AI) approaches offering an alternative way to control complex and ill-defined problems. In this paper various application of NNs for power system rotor angle stabilization and control problem is discussed. Read More

Neural Style Transfer has recently demonstrated very exciting results which catches eyes in both academia and industry. Despite the amazing results, the principle of neural style transfer, especially why the Gram matrices could represent style remains unclear. In this paper, we propose a novel interpretation of neural style transfer by treating it as a domain adaptation problem. Read More

Over the last three decades, a large number of evolutionary algorithms have been developed for solving multiobjective optimization problems. However, there lacks an up-to-date and comprehensive software platform for researchers to properly benchmark existing algorithms and for practitioners to apply selected algorithms to solve their real-world problems. The demand of such a common tool becomes even more urgent, when the source code of many proposed algorithms has not been made publicly available. Read More

We propose a swarm-based optimization algorithm inspired by air currents of a tornado. Two main air currents - spiral and updraft - are mimicked. Spiral motion is designed for exploration of new search areas and updraft movements is deployed for exploitation of a promising candidate solution. Read More

With rapid development of the Internet, the web contents become huge. Most of the websites are publicly available and anyone can access the contents everywhere such as workplace, home and even schools. Nev-ertheless, not all the web contents are appropriate for all users, especially children. Read More

Despite of the pain and limited accuracy of blood tests for early recognition of cardiovascular disease, they dominate risk screening and triage. On the other hand, heart rate variability is non-invasive and cheap, but not considered accurate enough for clinical practice. Here, we tackle heart beat interval based classification with deep learning. Read More

We present a model of a basic recurrent neural network (or bRNN) that includes a separate linear term with a slightly "stable" fixed matrix to guarantee bounded solutions and fast dynamic response. We formulate a state space viewpoint and adapt the constrained optimization Lagrange Multiplier (CLM) technique and the vector Calculus of Variations (CoV) to derive the (stochastic) gradient descent. In this process, one avoids the commonly used re-application of the circular chain-rule and identifies the error back-propagation with the co-state backward dynamic equations. Read More

Testing provides means pertaining to assuring software performance. The total aim of software industry is actually to make a certain start associated with high quality software for the end user. However, associated with software testing has quite a few underlying concerns, which are very important and need to pay attention on these issues. Read More

One of the key challenges of artificial intelligence is to learn models that are effective in the context of planning. In this document we introduce the predictron architecture. The predictron consists of a fully abstract model, represented by a Markov reward process, that can be rolled forward multiple "imagined" planning steps. Read More

Advances in machine learning have produced systems that attain human-level performance on certain visual tasks, e.g., object identification. Read More

Quantum inspired Evolutionary Algorithms were proposed more than a decade ago and have been employed for solving a wide range of difficult search and optimization problems. A number of changes have been proposed to improve performance of canonical QEA. However, canonical QEA is one of the few evolutionary algorithms, which uses a search operator with relatively large number of parameters. Read More

The computational properties of neural systems are often thought to be implemented in terms of their network dynamics. Hence, recovering the system dynamics from experimentally observed neuronal time series, like multiple single-unit (MSU) recordings or neuroimaging data, is an important step toward understanding its computations. Ideally, one would not only seek a state space representation of the dynamics, but would wish to have access to its governing equations for in-depth analysis. Read More

With recent progress in graphics, it has become more tractable to train models on synthetic images, potentially avoiding the need for expensive annotations. However, learning from synthetic images may not achieve the desired performance due to a gap between synthetic and real image distributions. To reduce this gap, we propose Simulated+Unsupervised (S+U) learning, where the task is to learn a model to improve the realism of a simulator's output using unlabeled real data, while preserving the annotation information from the simulator. Read More

The past year saw the introduction of new architectures such as Highway networks and Residual networks which, for the first time, enabled the training of feedforward networks with dozens to hundreds of layers using simple gradient descent. While depth of representation has been posited as a primary reason for their success, there are indications that these architectures defy a popular view of deep learning as a hierarchical computation of increasingly abstract features at each layer. In this report, we argue that this view is incomplete and does not adequately explain several recent findings. Read More

In order to better understand the advantages and disadvantages of a constrained multi-objective evolutionary algorithm (CMOEA), it is important to understand the nature of difficulty of a constrained multi-objective optimization problem (CMOP) that the CMOEA is going to deal with. In this paper, we first propose three primary types of difficulty to characterize the constraints in CMOPs, including feasibility-hardness, convergence-hardness and diversity-hardness. We then develop a general toolkit to construct difficulty adjustable CMOPs with three types of parameterized constraint functions according to the proposed three primary types of difficulty. Read More

The paper deals with using chaos to direct trajectories to targets and analyzes ruggedness and fractality of the resulting fitness landscapes. The targeting problem is formulated as a dynamic fitness landscape and four different chaotic maps generating such a landscape are studied. By using a computational approach, we analyze properties of the landscapes and quantify their fractal and rugged characteristics. Read More

This article analyzes the stochastic runtime of a Cross-Entropy Algorithm on two classes of traveling salesman problems. The algorithm shares main features of the famous Max-Min Ant System with iteration-best reinforcement. For simple instances that have a $\{1,n\}$-valued distance function and a unique optimal solution, we prove a stochastic runtime of $O(n^{6+\epsilon})$ with the vertex-based random solution generation, and a stochastic runtime of $O(n^{3+\epsilon}\ln n)$ with the edge-based random solution generation for an arbitrary $\epsilon\in (0,1)$. Read More

In recent years, the research community has discovered that deep neural networks (DNNs) and convolutional neural networks (CNNs) can yield higher accuracy than all previous solutions to a broad array of machine learning problems. To our knowledge, there is no single CNN/DNN architecture that solves all problems optimally. Instead, the "right" CNN/DNN architecture varies depending on the application at hand. Read More

This paper presents a novel yet intuitive approach to unsupervised feature learning. Inspired by the human visual system, we explore whether low-level motion-based grouping cues can be used to learn an effective visual representation. Specifically, we use unsupervised motion-based segmentation on videos to obtain segments, which we use as 'pseudo ground truth' to train a convolutional network to segment objects from a single frame. Read More

We introduce an exceptionally simple gated recurrent neural network (RNN) that achieves performance comparable to well-known gated architectures, such as LSTMs and GRUs, on the word-level language modeling task. We prove that our model has simple, predicable and non-chaotic dynamics. This stands in stark contrast to more standard gated architectures, whose underlying dynamical systems exhibit chaotic behavior. Read More

One of the major distinguishing features of the dynamic multiobjective optimization problems (DMOPs) is the optimization objectives will change over time, thus tracking the varying Pareto-optimal front becomes a challenge. One of the promising solutions is reusing the "experiences" to construct a prediction model via statistical machine learning approaches. However most of the existing methods ignore the non-independent and identically distributed nature of data used to construct the prediction model. Read More

Mirror neurons have been observed in the primary motor cortex of primate species, in particular in humans and monkeys. A mirror neuron fires when a person performs a certain action, and also when he observes the same action being performed by another person. A crucial step towards building fully autonomous intelligent systems with human-like learning abilities is the capability in modeling the mirror neuron. Read More

We investigate whether quantum annealers with select chip layouts can outperform classical computers in reinforcement learning tasks. We associate a transverse field Ising spin Hamiltonian with a layout of qubits similar to that of a deep Boltzmann machine (DBM) and use simulated quantum annealing (SQA) to numerically simulate quantum sampling from this system. We design a reinforcement learning algorithm in which the set of visible nodes representing the states and actions of an optimal policy are the first and last layers of the deep network. Read More

An ongoing challenge in neuromorphic computing is to devise general and computationally efficient models of inference and learning which are compatible with the spatial and temporal constraints of the brain. The gradient descent backpropagation rule is a powerful algorithm that is ubiquitous in deep learning, but it relies on the immediate availability of network-wide information stored with high-precision memory. However, recent work shows that exact backpropagated weights are not essential for learning deep representations. Read More

Many neural networks exhibit stability in their activation patterns over time in response to inputs from sensors operating under real-world conditions. By capitalizing on this property of natural signals, we propose a Recurrent Neural Network (RNN) architecture called a delta network in which each neuron transmits its value only when the change in its activation exceeds a threshold. The execution of RNNs as delta networks is attractive because their states must be stored and fetched at every timestep, unlike in convolutional neural networks (CNNs). Read More

Cell formation is a critical step in the design of cellular manufacturing systems. Recently, it was tackled using a cut-based-graph-partitioning model. This model meets real-life production systems requirements as it uses the actual amount of product flows, it looks for the suitable number of cells, and it takes into account the natural constraints such as operation sequences, maximum cell size, cohabitation and non-cohabitation constraints. Read More

Existing models based on artificial neural networks (ANNs) for sentence classification often do not incorporate the context in which sentences appear, and classify sentences individually. However, traditional sentence classification approaches have been shown to greatly benefit from jointly classifying subsequent sentences, such as with conditional random fields. In this work, we present an ANN architecture that combines the effectiveness of typical ANN models to classify sentences in isolation, with the strength of structured prediction. Read More

We present a method for implementing an Efficient Unitary Neural Network (EUNN) whose computational complexity is merely $\mathcal{O}(1)$ per parameter and has full tunability, from spanning part of unitary space to all of it. We apply the EUNN in Recurrent Neural Networks, and test its performance on the standard copying task and the MNIST digit recognition benchmark, finding that it significantly outperforms a non-unitary RNN, an LSTM network, an exclusively partial space URNN and a projective URNN with comparable parameter numbers. Read More

Many time series are generated by a set of entities that interact with one another over time. This paper introduces a broad, flexible framework to learn from multiple inter-dependent time series generated by such entities. Our framework explicitly models the entities and their interactions through time. Read More

In an attempt to solve the lengthy training times of neural networks, we proposed Parallel Circuits (PCs), a biologically inspired architecture. Previous work has shown that this approach fails to maintain generalization performance in spite of achieving sharp speed gains. To address this issue, and motivated by the way Dropout prevents node co-adaption, in this paper, we suggest an improvement by extending Dropout to the PC architecture. Read More

A dynamic Boltzmann machine (DyBM) has been proposed as a model of a spiking neural network, and its learning rule of maximizing the log-likelihood of given time-series has been shown to exhibit key properties of spike-timing dependent plasticity (STDP), which had been postulated and experimentally confirmed in the field of neuroscience as a learning rule that refines the Hebbian rule. Here, we relax some of the constraints in the DyBM in a way that it becomes more suitable for computation and learning. We show that learning the DyBM can be considered as logistic regression for binary-valued time-series. Read More

We introduce a method for imposing higher-level structure on generated, polyphonic music. A Convolutional Restricted Boltzmann Machine (C-RBM) as a generative model is combined with gradient descent constraint optimization to provide further control over the generation process. Among other things, this allows for the use of a "template" piece, from which some structural properties can be extracted, and transferred as constraints to newly generated material. Read More

It is believed that hippocampus functions as a memory allocator in brain, the mechanism of which remains unrevealed. In Valiant's neuroidal model, the hippocampus was described as a randomly connected graph, the computation on which maps input to a set of activated neuroids with stable size. Valiant proposed three requirements for the hippocampal circuit to become a stable memory allocator (SMA): stability, continuity and orthogonality. Read More

In this paper we aim to leverage the powerful bottom-up discriminative representations to guide a top-down generative model. We propose a novel generative model named Stacked Generative Adversarial Networks (SGAN), which is trained to invert the hierarchical representations of a discriminative bottom-up deep network. Our model consists of a top-down stack of GANs, each trained to generate "plausible" lower-level representations, conditioned on higher-level representations. Read More

We propose to use Digital Memcomputing Machines (DMMs), implemented with self-organizing logic gates (SOLGs), to solve the problem of numerical inversion. Starting from fixed-point scalar inversion we describe the generalization to solving linear systems and matrix inversion. This method, when realized in hardware, will output the result in only one computational step. Read More

Deep convolutional neural networks (CNNs) have shown great potential for numerous real-world machine learning applications, but performing inference in large CNNs in real-time remains a challenge. We have previously demonstrated that traditional CNNs can be converted into deep spiking neural networks (SNNs), which exhibit similar accuracy while reducing both latency and computational load as a consequence of their data-driven, event-based style of computing. Here we provide a novel theory that explains why this conversion is successful, and derive from it several new tools to convert a larger and more powerful class of deep networks into SNNs. Read More