# Computer Science - Neural and Evolutionary Computing Publications (50)

In machine learning, the use of an artificial neural network is the mainstream approach. Such a network consists of layers of neurons. These neurons are of the same type characterized by the two features: (1) an inner product of an input vector and a matching weighting vector of trainable parameters and (2) a nonlinear excitation function. Read More
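
As a minimal sketch of the conventional neuron described in this abstract, the two features can be written out directly. The function name `neuron`, the choice of `tanh` as the excitation function, and the example values are illustrative assumptions, not taken from the paper.

```python
import math

def neuron(x, w, activation=math.tanh):
    """Conventional artificial neuron: a nonlinearity applied to an inner product."""
    if len(x) != len(w):
        raise ValueError("input and weight vectors must have the same length")
    # (1) inner product of the input vector and the trainable weight vector
    z = sum(xi * wi for xi, wi in zip(x, w))
    # (2) nonlinear excitation (activation) function; tanh is an assumed choice
    return activation(z)

# Example: the inner product is 0.5 + 0.5 - 1.0 = 0.0, so the output is tanh(0.0)
y = neuron([1.0, 2.0, -1.0], [0.5, 0.25, 1.0])
```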

We introduce an attention-based Bi-LSTM for Chinese implicit discourse relations and demonstrate that modeling argument pairs as a joint sequence can outperform word order-agnostic approaches. Our model benefits from a partial sampling scheme and is conceptually simple, yet achieves state-of-the-art performance on the Chinese Discourse Treebank. We also visualize its attention activity to illustrate the model's ability to selectively focus on the relevant parts of an input sequence. Read More

While the optimization problem behind deep neural networks is highly non-convex, it is frequently observed in practice that training deep networks seems possible without getting stuck in suboptimal points. It has been argued that this is the case as all local minima are close to being globally optimal. We show that this is (almost) true: in fact, almost all local minima are globally optimal for a fully connected network with squared loss and analytic activation function, given that the number of hidden units of one layer of the network is larger than the number of training points and the network structure from this layer on is pyramidal. Read More

As part of a complete software stack for autonomous driving, NVIDIA has created a neural-network-based system, known as PilotNet, which outputs steering angles given images of the road ahead. PilotNet is trained using road images paired with the steering angles generated by a human driving a data-collection car. It derives the necessary domain knowledge by observing human drivers. Read More

We study unsupervised learning by developing introspective generative modeling (IGM) that attains a generator using progressively learned deep convolutional neural networks. The generator is itself a discriminator, capable of introspection: being able to self-evaluate the difference between its generated samples and the given training data. When followed by repeated discriminative learning, desirable properties of modern discriminative classifiers are directly inherited by the generator. Read More

In this paper we propose introspective classifier learning (ICL) that emphasizes the importance of having a discriminative classifier empowered with generative capabilities. We develop a reclassification-by-synthesis algorithm to perform training using a formulation stemming from Bayes theory. Our classifier is able to iteratively: (1) synthesize pseudo-negative samples in the synthesis step; and (2) enhance itself by improving the classification in the reclassification step. Read More

Computer programs written in one language are often required to be ported to other languages to support multiple devices and environments. When programs use language specific APIs (Application Programming Interfaces), it is very challenging to migrate these APIs to the corresponding APIs written in other languages. Existing approaches mine API mappings from projects that have corresponding versions in two languages. Read More

We propose sensorimotor tappings, a new graphical technique that explicitly represents relations between the time steps of an agent's sensorimotor loop and a single training step of an adaptive model that the agent is using internally. In the simplest case this is a relation linking two time steps. In realistic cases these relations can extend over several time steps and over different sensory channels. Read More

Symbolic regression is an important but challenging research topic in data mining. It can detect the underlying mathematical models. Genetic programming (GP) is one of the most popular methods for symbolic regression. Read More

Efficiency of single-objective optimization can be improved by introducing some auxiliary objectives. Ideally, auxiliary objectives should be helpful. However, in practice, objectives may be efficient on some optimization stages but obstructive on others. Read More

We propose a sequence labeling framework with a secondary training objective, learning to predict surrounding words for every word in the dataset. This language modeling objective incentivises the system to learn general-purpose patterns of semantic and syntactic composition, which are also useful for improving accuracy on different sequence labeling tasks. The architecture was evaluated on a range of datasets, covering the tasks of error detection in learner texts, named entity recognition, chunking and POS-tagging. Read More

Recurrent neural networks (RNNs) are being extensively used over feed-forward neural networks (FFNNs) because of their inherent capability to capture temporal relationships that exist in sequential data such as speech. This aspect of RNNs is advantageous especially when there is no a priori knowledge about the temporal correlations within the data. However, RNNs require large amounts of data to learn these temporal correlations, limiting their advantage in low-resource scenarios. Read More

We demonstrate that a continuous relaxation of the argmax operation can be used to create a differentiable approximation to greedy decoding for sequence-to-sequence (seq2seq) models. By incorporating this approximation into the scheduled sampling training procedure (Bengio et al., 2015), a well-known technique for correcting exposure bias, we introduce a new training objective that is continuous and differentiable everywhere and that can provide informative gradients near points where previous decoding decisions change their value. Read More
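
One common way to relax argmax, shown here as a sketch only, is to replace the hard selection of the best-scoring token's embedding with a softmax-weighted average whose sharpness is set by a temperature-like factor. The excerpt does not say which relaxation the paper uses, so the function name `soft_argmax_embedding`, the parameter `alpha`, and the toy embeddings are assumptions.

```python
import math

def soft_argmax_embedding(scores, embeddings, alpha=1.0):
    """Differentiable stand-in for embeddings[argmax(scores)].

    Returns the softmax(alpha * scores)-weighted average of the embeddings.
    As alpha grows, the result approaches the embedding of the top-scoring item.
    """
    m = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(alpha * (s - m)) for s in scores]
    total = sum(weights)
    probs = [w / total for w in weights]
    dim = len(embeddings[0])
    return [sum(p * e[d] for p, e in zip(probs, embeddings)) for d in range(dim)]

scores = [1.0, 3.0, 2.0]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
smooth = soft_argmax_embedding(scores, embeddings, alpha=0.0)   # uniform average
peaked = soft_argmax_embedding(scores, embeddings, alpha=50.0)  # close to embeddings[1]
```

With `alpha=0.0` every item gets equal weight; with a large `alpha` the output is nearly the greedy choice while remaining a smooth function of the scores.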

Several approaches have recently been proposed for learning decentralized deep multiagent policies that coordinate via a differentiable communication channel. While these policies are effective for many tasks, interpretation of their induced communication strategies has remained a challenge. Here we propose to interpret agents' messages by translating them. Read More

While Monte Carlo Tree Search and closely related methods have dominated General Video Game Playing, recent research has demonstrated the promise of Rolling Horizon Evolutionary Algorithms as an interesting alternative. However, there is little attention paid to population initialization techniques in the setting of general real-time video games. Therefore, this paper proposes the use of population seeding to improve the performance of Rolling Horizon Evolution and presents the results of two methods, One Step Look Ahead and Monte Carlo Tree Search, tested on 20 games of the General Video Game AI corpus with multiple evolution parameter values (population size and individual length). Read More

We investigate the Robust Multiperiod Network Design Problem, a generalization of the classical Capacitated Network Design Problem that additionally considers multiple design periods and provides solutions protected against traffic uncertainty. Given the intrinsic difficulty of the problem, which proves challenging even for state-of-the-art commercial solvers, we propose a hybrid primal heuristic based on the combination of ant colony optimization and an exact large neighborhood search. Computational experiments on a set of realistic instances from the SNDlib show that our heuristic can find solutions of extremely good quality with a low optimality gap. Read More

Today, with the continued growth in using information and communication technologies (ICT) for business purposes, business organizations become increasingly dependent on their information systems. Thus, they need to protect them from the different attacks exploiting their vulnerabilities. To do so, the organization has to use security technologies, which may be proactive or reactive ones. Read More

**Authors:** Fabio D'Andreagiovanni

Base station cooperation (BSC) has recently arisen as a promising way to increase the capacity of a wireless network. Implementing BSC adds a new design dimension to the classical wireless network design problem: how to define the subset of base stations (clusters) that coordinate to serve a user. Though the problem of forming clusters has been extensively discussed from a technical point of view, there is still a lack of effective optimization models for its representation and algorithms for its solution. Read More

Recurrent neural network architectures can have useful computational properties, with complex temporal dynamics. However, evaluation of recurrent dynamic architectures requires solution of systems of differential equations, and the number of evaluations required to determine their response to a given input can vary with the input, or can be indeterminate altogether in the case of oscillations or instability. In feed-forward networks, by contrast, only a single pass through the network is needed to determine the response to a given input. Read More

Empirically, neural networks that attempt to learn programs from data have exhibited poor generalizability. Moreover, it has traditionally been difficult to reason about the behavior of these models beyond a certain level of input complexity. In order to address these issues, we propose augmenting neural architectures with a key abstraction: recursion. Read More

We propose a computational model of a neuron, called firing cell (FC), properties of which cover such phenomena as attenuation of receptors for external stimuli, delay and decay of postsynaptic potentials, modification of internal weights due to propagation of postsynaptic potentials through the dendrite, modification of properties of the analog memory for each input due to a pattern of short-time synaptic potentiation or long-time synaptic potentiation (LTP), output-spike generation when the sum of all inputs exceeds a threshold, and refraction. The cell may take one of three forms: excitatory, inhibitory, and receptory. The computer simulations showed that, depending on the phase of input signals, the artificial neuron's output frequency may demonstrate various chaotic behaviors. Read More

Modeling attention in neural multi-source sequence-to-sequence learning remains a relatively unexplored area, despite its usefulness in tasks that incorporate multiple source languages or modalities. We propose two novel approaches to combine the outputs of attention mechanisms over each source sequence, flat and hierarchical. We compare the proposed methods with existing techniques and present results of systematic evaluation of those methods on the WMT16 Multimodal Translation and Automatic Post-editing tasks. Read More

This paper addresses the problem of online tracking and classification of multiple objects in an image sequence. Our proposed solution is to first track all objects in the scene without relying on object-specific prior knowledge, which in other systems can take the form of hand-crafted features or user-based track initialization. We then classify the tracked objects with a fast-learning image classifier that is based on a shallow convolutional neural network architecture and demonstrate that object recognition improves when this is combined with object state information from the tracking algorithm. Read More

Relation detection is a core component for many NLP applications including Knowledge Base Question Answering (KBQA). In this paper, we propose a hierarchical recurrent neural network enhanced by residual learning that detects KB relations given an input question. Our method uses deep residual bidirectional LSTMs to compare questions and relation names via different hierarchies of abstraction. Read More

Softmax GAN is a novel variant of Generative Adversarial Network (GAN). The key idea of Softmax GAN is to replace the classification loss in the original GAN with a softmax cross-entropy loss in the sample space of one single batch. In the adversarial learning of $N$ real training samples and $M$ generated samples, the target of discriminator training is to distribute all the probability mass to the real samples, each with probability $\frac{1}{M}$, and distribute zero probability to generated data. Read More

Genetic Algorithms are widely used in many different optimization problems including layout design. The layout of the shelves plays an important role in the total sales metrics for superstores since this affects the customers' shopping behaviour. This paper employed a genetic algorithm based approach to design the shelf layout of superstores. Read More

We propose a multi-view network for text classification. Our method automatically creates various views of its input text, each taking the form of soft attention weights that distribute the classifier's focus among a set of base features. For a bag-of-words representation, each view focuses on a different subset of the text's words. Read More

Efficient global optimization is a popular algorithm for the optimization of expensive multimodal black-box functions. One important reason for its popularity is its theoretical foundation of global convergence. However, as the budgets in expensive optimization are very small, the asymptotic properties only play a minor role and the algorithm sometimes comes off badly in experimental comparisons. Read More

While deep learning is remarkably successful on perceptual tasks, it was also shown to be vulnerable to adversarial perturbations of the input. These perturbations denote noise added to the input that was generated specifically to fool the system while being quasi-imperceptible for humans. More severely, there even exist universal perturbations that are input-agnostic but fool the network on the majority of inputs. Read More

Behavior domination is proposed as a tool for understanding and harnessing the power of evolutionary systems to discover and exploit useful stepping stones. Novelty search has shown promise in overcoming deception by collecting diverse stepping stones, and several algorithms have been proposed that combine novelty with a more traditional fitness measure to refocus search and help novelty search scale to more complex domains. However, combinations of novelty and fitness do not necessarily preserve the stepping stone discovery that novelty search affords. Read More

In this paper, we propose a new Recurrent Neural Network (RNN) architecture. The novelty is simple: We use diagonal recurrent matrices instead of full. This results in better test likelihood and faster convergence compared to regular full RNNs in most of our experiments. Read More
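
To illustrate what a diagonal recurrent matrix means in practice, the sketch below assumes a plain tanh RNN cell; the abstract does not say which cell types the paper covers, and the names `diag_rnn_step`, `u`, `W`, and `b` are hypothetical. The diagonal is stored as a vector, so the recurrent term costs one multiplication per hidden unit rather than a full matrix-vector product.

```python
import math

def diag_rnn_step(h, x, u, W, b):
    """One step of h_t = tanh(u * h_{t-1} + W x_t + b) with a diagonal recurrence.

    h: previous hidden state (length n)
    x: current input (length m)
    u: diagonal of the recurrent matrix, stored as a length-n vector
    W: input weights, n rows of length m
    b: bias (length n)
    """
    new_h = []
    for i in range(len(h)):
        recurrent = u[i] * h[i]  # each unit only sees its own previous value
        inp = sum(W[i][j] * x[j] for j in range(len(x)))
        new_h.append(math.tanh(recurrent + inp + b[i]))
    return new_h

h_next = diag_rnn_step(
    h=[0.5, -0.5], x=[1.0], u=[2.0, 0.0], W=[[0.0], [1.0]], b=[0.0, 0.0]
)
```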

For many types of integrated circuits, accepting larger failure rates in computations can be used to improve energy efficiency. We study the performance of faulty implementations of certain deep neural networks based on pessimistic and optimistic models of the effect of hardware faults. After identifying the impact of hyperparameters such as the number of layers on robustness, we study the ability of the network to compensate for computational failures through an increase of the network size. Read More

This paper outlines a methodological approach to generate adaptive agents driving themselves near points of criticality. Using a synthetic approach we construct a conceptual model that, instead of specifying mechanistic requirements to generate criticality, exploits the maintenance of an organizational structure capable of reproducing critical behavior. Our approach captures the well-known principle of universality that classifies critical phenomena inside a few universality classes of systems without relying on specific mechanisms or topologies. Read More

This paper addresses maximum likelihood (ML) estimation based model fitting in the context of extrasolar planet detection. This problem is characterized by the following properties: 1) the candidate models under consideration are highly nonlinear; 2) the likelihood surface has a huge number of peaks; 3) the parameter space ranges in size from a few to dozens of dimensions. These properties make the ML search a very challenging problem, as it lacks any analytical or gradient based searching solution to explore the parameter space. Read More

Optimization techniques play an important role in several scientific and real-world applications, thus becoming of great interest for the community. As a consequence, a number of open-source libraries are available in the literature, which ends up fostering the research and development of new techniques and applications. In this work, we present a new library for the implementation and fast prototyping of nature-inspired techniques called LibOPT. Read More

Natural evolution has produced a tremendous diversity of functional organisms. Many believe an essential component of this process was the evolution of evolvability, whereby evolution speeds up its ability to innovate by generating a more adaptive pool of offspring. One hypothesized mechanism for evolvability is developmental canalization, wherein certain dimensions of variation become more likely to be traversed and others are prevented from being explored (e. Read More

We propose a new type of leaf node for use in Symbolic Regression (SR) that performs linear combinations of feature variables (LCF). These nodes can be handled in three different modes -- an unsynchronized mode, where all LCFs are free to change on their own, a synchronized mode, where LCFs are sorted into groups in which they are forced to be identical throughout the whole individual, and a globally synchronized mode, which is similar to the previous mode but the grouping is done across the whole population. We also present two methods of evolving the weights of the LCFs -- a purely stochastic way via mutation and a gradient-based way based on the backpropagation algorithm known from neural networks -- and also a combination of both. Read More

In this paper, we study various parallelization schemes for the Variable Neighborhood Search (VNS) metaheuristic on a CPU-GPU system via OpenMP and OpenACC. A hybrid parallel VNS method is applied to recent benchmark problem instances for the multi-product dynamic lot sizing problem with product returns and recovery, which appears in reverse logistics and is known to be NP-hard. We report our findings regarding these parallelization approaches and present promising computational results. Read More

Symbolic regression via genetic programming is a flexible approach to machine learning that does not require up-front specification of model structure. However, traditional approaches to symbolic regression require the use of protected operators, which can lead to perverse model characteristics and poor generalisation. In this paper, we revisit interval arithmetic as one possible solution to allow genetic programming to perform regression using unprotected operators. Read More
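
The idea of using interval arithmetic in place of protected operators can be sketched as follows: propagate the known range of each input feature through a candidate expression and flag the expression as unsafe if any division could see a zero denominator. This is an illustrative sketch rather than the paper's implementation; the `Interval` class and the `is_safe` helper are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def __truediv__(self, other):
        # Unprotected division: refuse if the denominator range includes zero
        if other.lo <= 0.0 <= other.hi:
            raise ZeroDivisionError("denominator interval contains zero")
        return self * Interval(1.0 / other.hi, 1.0 / other.lo)

def is_safe(expr, *feature_ranges):
    """True if expr is defined for every input inside the given feature ranges."""
    try:
        expr(*feature_ranges)
        return True
    except ZeroDivisionError:
        return False

one, two = Interval(1.0, 1.0), Interval(2.0, 2.0)
x_range = Interval(1.0, 3.0)                         # feature x observed in [1, 3]
ok = is_safe(lambda x: one / x, x_range)             # 1/x is defined on [1, 3]
bad = is_safe(lambda x: one / (x - two), x_range)    # x - 2 spans [-1, 1]
```

A genetic programming system could use such a check to reject or penalise individuals like the second expression instead of silently redefining division.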

Adversarial attack has cast a shadow on the massive success of deep neural networks. Despite being almost visually identical to the clean data, the adversarial images can fool deep neural networks into wrong predictions with very high confidence. In this paper, however, we show that we can build a simple binary classifier separating the adversarial apart from the clean data with accuracy over 99%. Read More

The Traveling Tournament Problem (TTP) is a sports tournament scheduling problem. One interesting variant is the mirrored Traveling Tournament Problem (mTTP). The objective of the problem is to minimize the total number of trips, the total travel distance, or both. Read More

We present the optimisation of a neuromorphic adaptation of a spiking neural network model of the locust Lobula Giant Movement Detector (LGMD), which detects looming objects and can be used to facilitate obstacle avoidance in robotic applications. Our model is constrained by the parameters of a mixed signal analogue-digital neuromorphic device and is driven by the output of a neuromorphic vision sensor DVS. Due to the number of user-defined parameters and the difficulty of finding values that perform well, we investigate the use of Differential Evolution (DE) and self-adaptive DE (SADE) to find optimal values. Read More

In this paper, we propose a Hybrid Ant Colony Optimization algorithm (HACO) for the Next Release Problem (NRP). NRP, an NP-hard problem in requirements engineering, is to balance customer requests, resource constraints, and requirement dependencies by requirement selection. Inspired by the successes of Ant Colony Optimization algorithms (ACO) for solving NP-hard problems, we design our HACO to approximately solve NRP. Read More

**Authors:** Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, Rick Boyle, Pierre-luc Cantin, Clifford Chao, Chris Clark, Jeremy Coriell, Mike Daley, Matt Dau, Jeffrey Dean, Ben Gelb, Tara Vazir Ghaemmaghami, Rajendra Gottipati, William Gulland, Robert Hagmann, C. Richard Ho, Doug Hogberg, John Hu, Robert Hundt, Dan Hurt, Julian Ibarz, Aaron Jaffey, Alek Jaworski, Alexander Kaplan, Harshit Khaitan, Andy Koch, Naveen Kumar, Steve Lacy, James Laudon, James Law, Diemthu Le, Chris Leary, Zhuyuan Liu, Kyle Lucke, Alan Lundin, Gordon MacKean, Adriana Maggiore, Maire Mahony, Kieran Miller, Rahul Nagarajan, Ravi Narayanaswami, Ray Ni, Kathy Nix, Thomas Norrie, Mark Omernick, Narayana Penukonda, Andy Phelps, Jonathan Ross, Matt Ross, Amir Salek, Emad Samadiani, Chris Severn, Gregory Sizikov, Matthew Snelham, Jed Souter, Dan Steinberg, Andy Swing, Mercedes Tan, Gregory Thorson, Bo Tian, Horia Toma, Erick Tuttle, Vijay Vasudevan, Richard Walter, Walter Wang, Eric Wilcox, Doe Hyun Yoon

Many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. This paper evaluates a custom ASIC, called a Tensor Processing Unit (TPU), deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN). The heart of the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOps/second (TOPS) and a large (28 MiB) software-managed on-chip memory. Read More
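
The quoted peak throughput can be sanity-checked with a short calculation. It assumes a 700 MHz clock, a figure that does not appear in this excerpt, and counts each multiply-accumulate as two operations (one multiply, one add). Note that $65{,}536 = 256 \times 256$.

$$
65{,}536 \ \text{MACs} \times 2 \ \frac{\text{ops}}{\text{MAC} \cdot \text{cycle}} \times 700 \times 10^{6} \ \frac{\text{cycles}}{\text{s}} \approx 9.18 \times 10^{13} \ \frac{\text{ops}}{\text{s}} \approx 92 \ \text{TOPS}
$$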

Over the last decade, wireless networks have experienced an impressive growth and now play a main role in many telecommunications systems. As a consequence, scarce radio resources, such as frequencies, became congested and the need for effective and efficient assignment methods arose. In this work, we present a Genetic Algorithm for solving large instances of the Power, Frequency and Modulation Assignment Problem, arising in the design of wireless networks. Read More

We consider the problem of optimally designing a body wireless sensor network, while taking into account the uncertainty of data generation of biosensors. Since the related min-max robustness Integer Linear Programming (ILP) problem can be difficult to solve even for state-of-the-art commercial optimization solvers, we propose an original heuristic for its solution. The heuristic combines deterministic and probabilistic variable fixing strategies, guided by the information coming from strengthened linear relaxations of the ILP robust model, and includes a very large neighborhood search for reparation and improvement of generated solutions, formulated as an ILP problem solved exactly. Read More

A parallel genetic algorithm (GA) implemented on GPU clusters is proposed to solve the Uncapacitated Single Allocation p-Hub Median problem. The GA uses binary and integer encoding and genetic operators adapted to this problem. Our GA is improved by generating initial solutions with hubs located at middle nodes. Read More

The $(1+(\lambda,\lambda))$ genetic algorithm, first proposed at GECCO 2013, showed surprisingly good performance on some optimization problems. The theoretical analysis so far was restricted to the OneMax test function, where this GA profited from the perfect fitness-distance correlation. In this work, we conduct a rigorous runtime analysis of this GA on random 3-SAT instances in the planted solution model having at least logarithmic average degree, which are known to have a weaker fitness distance correlation. Read More
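
For readers unfamiliar with the algorithm, the following sketch shows the $(1+(\lambda,\lambda))$ GA in its commonly described form on OneMax, the test function mentioned above: a mutation phase with rate $\lambda/n$ followed by a biased crossover phase with bias $1/\lambda$. The fixed value of `lam`, the function names, the iteration cap, and the OneMax stopping rule are assumptions made for this illustration; the 3-SAT setting studied in the paper is not reproduced here.

```python
import random

def onemax(x):
    """Number of one-bits; the optimum is the all-ones string."""
    return sum(x)

def one_plus_lambda_lambda_ga(n, lam, fitness=onemax, max_iters=10_000, seed=0):
    rng = random.Random(seed)
    p = lam / n    # mutation rate
    c = 1.0 / lam  # probability of taking a bit from the mutant in crossover
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = fitness(x)
    for _ in range(max_iters):
        if fx == n:  # stopping rule valid for OneMax only
            break
        # Mutation phase: draw ell ~ Bin(n, p), then create lam mutants,
        # each flipping exactly ell distinct random positions of x.
        ell = sum(rng.random() < p for _ in range(n))
        mutants = []
        for _ in range(lam):
            y = x[:]
            for i in rng.sample(range(n), ell):
                y[i] ^= 1
            mutants.append(y)
        x_prime = max(mutants, key=fitness)
        # Crossover phase: create lam children, each taking every bit from
        # x_prime with probability c and from x otherwise.
        children = [
            [xp if rng.random() < c else xi for xi, xp in zip(x, x_prime)]
            for _ in range(lam)
        ]
        y = max(children, key=fitness)
        fy = fitness(y)
        # Elitist selection: keep the best crossover child if it is not worse
        if fy >= fx:
            x, fx = y, fy
    return x, fx

best, best_fitness = one_plus_lambda_lambda_ga(n=30, lam=4, seed=1)
```

The crossover phase acts as a repair step: mutation makes several changes at once, and crossover with the parent tends to keep the beneficial ones while undoing the harmful ones.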

Elucidating principles that underlie computation in neural networks is currently a major research topic of interest in neuroscience. Transfer Entropy (TE) is increasingly used as a tool to bridge the gap between network structure, function, and behavior in fMRI studies. Computational models allow us to bridge the gap even further by directly associating individual neuron activity with behavior. Read More

Experimental data suggest that neural circuits configure their synaptic connectivity for a given computational task. They also point to dopamine-gated stochastic spine dynamics as an important underlying mechanism, and they show that the stochastic component of synaptic plasticity is surprisingly strong. We propose a model that elucidates how task-dependent self-configuration of neural circuits can emerge through these mechanisms. Read More