Computer Science - Distributed; Parallel; and Cluster Computing Publications (50)


The asynchronous computability theorem (ACT) uses concepts from combinatorial topology to characterize which tasks have wait-free solutions in read-write memory. A task can be expressed as a relation between two chromatic simplicial complexes. The theorem states that a task has a protocol (algorithm) if and only if there is a certain chromatic simplicial map compatible with that relation. Read More

This paper reports on the state of the art of virtualization technology for both general-purpose and real-time domains. There exists no entirely instantaneous data transmission/transfer. There always exists a delay while transmitting data, either in the processing or in the medium itself. Read More

In this paper we consider a distributed optimization scenario in which a set of processors aims at minimizing the maximum of a collection of "separable convex functions" subject to local constraints. This set-up is motivated by peak-demand minimization problems in smart grids. Here, the goal is to minimize the peak value over a finite horizon with: (i) the demand at each time instant being the sum of contributions from different devices, and (ii) the local states at different time instants being coupled through local dynamics. Read More
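As a rough sketch of the problem class described in this abstract, the min-max objective over separable functions can be written as below. The symbols are assumptions introduced here for illustration and are not taken from the paper: $N$ processors, $K$ functions (for example, time instants in the horizon), $g_{ik}$ the convex contribution of processor $i$ to function $k$, and $X_i$ the local constraint set of processor $i$.

```latex
\min_{x_1,\dots,x_N} \; \max_{k \in \{1,\dots,K\}} \; \sum_{i=1}^{N} g_{ik}(x_i)
\qquad \text{subject to} \quad x_i \in X_i, \quad i = 1,\dots,N
```

In the peak-demand reading, the inner sum would be the total demand at time instant $k$ and the outer maximum the peak over the horizon.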

In this paper we consider a distributed optimization scenario in which the aggregate objective function to minimize is partitioned, big-data and possibly non-convex. Specifically, we focus on a set-up in which the dimension of the decision variable depends on the network size as well as the number of local functions, but each local function handled by a node depends only on a (small) portion of the entire optimization variable. This problem set-up has been shown to appear in many interesting network application scenarios. Read More

Distributed storage systems are known to be susceptible to long tails in response time. In modern online storage systems such as Bing, Facebook, and Amazon, the long tails of the service latency are of particular concern. with 99. Read More

Memory caches are being aggressively used in today's data-parallel systems such as Spark, Tez, and Piccolo. However, prevalent systems employ rather simple cache management policies--notably the Least Recently Used (LRU) policy--that are oblivious to the application semantics of data dependency, expressed as a directed acyclic graph (DAG). Without this knowledge, memory caching can at best be performed by "guessing" the future data access patterns based on historical information (e. Read More
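To make the LRU policy mentioned in this abstract concrete, here is a minimal Python sketch of a fixed-capacity LRU cache. It is a generic illustration, not code from Spark, Tez, or Piccolo; the class name `LRUCache` and its methods are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key.

    Hypothetical illustration only; not the cache manager of any real system.
    """

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest access first, newest last

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        """Insert or update a key, evicting the oldest entry if over capacity."""
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop least recently used

    def keys(self):
        """Keys ordered from least to most recently used."""
        return list(self._items)


cache = LRUCache(capacity=2)
cache.put("A", 1)
cache.put("B", 2)
cache.get("A")       # touching A makes B the eviction candidate
cache.put("C", 3)    # evicts B, whether or not a later stage needs it
print(cache.keys())
```

The eviction decision depends only on access history. If a downstream stage of the job's DAG were about to read block `B`, this policy would have no way of knowing, which is the limitation the abstract points at.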

The need for modern data analytics to combine relational, procedural, and map-reduce-style functional processing is widely recognized. State-of-the-art systems like Spark have added SQL front-ends and relational query optimization, which promise an increase in expressiveness and performance. But how good are these extensions at extracting high performance from modern hardware platforms? While Spark has made impressive progress, we show that for relational workloads, there is still a significant gap compared with best-of-breed query engines. Read More

Testing random number generators is a very important task that, in the recent past, has taken upwards of twelve hours when testing with the current flagship testing suite TestU01. Through this paper we will discuss the possible performance increases to the existing random number generator testing suite TestU01 that are available by offering the executable to an HTCondor pool to execute on. We will see that with a few modifications we are able to decrease the running time of a sample run of Big Crush from about five and a half hours to only five and a half minutes. Read More

We describe a high-performance implementation of the lattice-Boltzmann method (LBM) for sparse geometries on graphic processors. In our implementation we cover the whole geometry with a uniform mesh of small tiles and carry out calculations for each tile independently with a proper data synchronization at tile edges. For this method we provide both the theoretical analysis of complexity and the results for real implementations for 2D and 3D geometries. Read More

Snafu, or Snake Functions, is a modular system to host, execute and manage language-level functions offered as stateless (micro-)services to diverse external triggers. The system interfaces resemble those of commercial FaaS providers but its implementation provides distinct features which make it overall useful to research on FaaS and prototyping of FaaS-based applications. This paper argues about the system motivation in the presence of already existing alternatives, its design and architecture, the open source implementation and collected metrics which characterise the system. Read More

Graph spanners have been studied extensively, and have many applications in algorithms, distributed systems, and computer networks. For many of these applications, we want distributed constructions of spanners, i.e. Read More

This paper introduces PriMaL, a general PRIvacy-preserving MAchine-Learning method for reducing the privacy cost of information transmitted through a network. Distributed sensor networks are often used for automated classification and detection of abnormal events in high-stakes situations, e.g. Read More

In this paper we consider the problem of identifying intersections between two sets of d-dimensional axis-parallel rectangles. This is a common problem that arises in many agent-based simulation studies, and is of central importance in the context of High Level Architecture (HLA), where it is at the core of the Data Distribution Management (DDM) service. Several realizations of the DDM service have been proposed; however, many of them are either inefficient or inherently sequential. Read More
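The matching problem in this abstract can be illustrated with a brute-force baseline in Python. This is a sequential sketch, not any of the DDM realizations the paper refers to. The names `subscriptions` and `updates` for the two sets, the closed-interval convention, and the function names are assumptions made for illustration.

```python
from itertools import product


def overlaps(a, b):
    """True if two d-dimensional axis-parallel rectangles intersect.

    Each rectangle is a sequence of (low, high) closed intervals, one per
    dimension. Rectangles intersect only if they overlap in every dimension.
    """
    return all(lo_a <= hi_b and lo_b <= hi_a
               for (lo_a, hi_a), (lo_b, hi_b) in zip(a, b))


def brute_force_matches(subscriptions, updates):
    """Return all index pairs (i, j) with subscriptions[i] meeting updates[j].

    Checks every pair, so the cost is O(n * m * d).
    """
    return [(i, j)
            for (i, s), (j, u) in product(enumerate(subscriptions),
                                          enumerate(updates))
            if overlaps(s, u)]


subscriptions = [((0, 2), (0, 2)), ((5, 6), (5, 6))]
updates = [((1, 3), (1, 3)), ((2, 5), (0, 1))]
print(brute_force_matches(subscriptions, updates))
```

Each pair test is independent of the others, so this baseline parallelizes trivially, but its quadratic number of comparisons is what more refined approaches aim to avoid.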

The singular value decomposition (SVD) is a widely used matrix factorization tool which underlies many useful applications, e.g., recommendation systems, anomaly detection, and data compression. Read More

These are the proceedings of the 14th International Workshop on Formal Engineering approaches to Software Components and Architectures (FESCA). The workshop was held on April 22, 2017 in Uppsala (Sweden) as a satellite event to the European Joint Conference on Theory and Practice of Software (ETAPS'17). The aim of the FESCA workshop is to bring together junior researchers from formal methods, software engineering, and industry interested in the development and application of formal modelling approaches as well as associated analysis and reasoning techniques with practical benefits for software engineering. Read More

The multiway rendezvous introduced in Theoretical CSP is a powerful paradigm to achieve synchronization and communication among a group of (possibly more than two) processes. We illustrate the advantages of this paradigm on the production cell benchmark, a model of a real metal processing plant, for which we propose a compositional software controller, which is written in LNT and LOTOS, and makes intensive use of the multiway rendezvous. Read More

This paper focuses on a passivity-based distributed reference governor (RG) applied to a pre-stabilized mobile robotic network. The novelty of this paper lies in the method used to solve the RG problem, where a passivity-based distributed optimization scheme is proposed. In particular, the gradient descent method minimizes the global objective function while the dual ascent method maximizes the Hamiltonian. Read More

As technology advances and the number of smart devices continues to grow substantially, the need for ubiquitous context-aware platforms that support interconnected, heterogeneous, and distributed networks of devices has given rise to what is referred to today as the Internet of Things (IoT). However, achieving these objectives and making the IoT paradigm more tangible requires integration and convergence of different knowledge and research domains, covering aspects from identification and communication to resource discovery and service integration. Through this chapter, we aim to highlight research on topics including proposed architectures, security and privacy, and network communication means and protocols, and conclude by providing future directions and open challenges facing IoT development. Read More

We identify multirole logic as a new form of logic in which conjunction/disjunction is interpreted as an ultrafilter on the power set of some underlying set (of roles) and the notion of negation is generalized to endomorphisms on this underlying set. We formalize both multirole logic (MRL) and linear multirole logic (LMRL) as natural generalizations of classical logic (CL) and classical linear logic (CLL), respectively, and also present a filter-based interpretation for intuitionism in multirole logic. Among various meta-properties established for MRL and LMRL, we obtain one named multiparty cut-elimination stating that every cut involving one or more sequents (as a generalization of a (binary) cut involving exactly two sequents) can be eliminated, thus extending the celebrated result of cut-elimination by Gentzen. Read More

Computer vision is one of the most active research fields in information technology today. Giving machines and robots the ability to see and comprehend the surrounding world at the speed of sight creates endless potential applications and opportunities. Feature detection and description algorithms can be indeed considered as the retina of the eyes of such machines and robots. Read More

Cognitive radio networks are a new type of multi-channel wireless network in which different nodes can have access to different sets of channels. By providing multiple channels, they improve the efficiency and reliability of wireless communication. However, the heterogeneous nature of cognitive radio networks also brings new challenges to the design and analysis of distributed algorithms. Read More

A common problem in large-scale data analysis is to approximate a matrix using a combination of specifically sampled rows and columns, known as CUR decomposition. Unfortunately, in many real-world environments, the ability to sample specific individual rows or columns of the matrix is limited by either system constraints or cost. In this paper, we consider matrix approximation by sampling predefined blocks of columns (or rows) from the matrix. Read More

The (ultra-)dense deployment of small-cell base stations (SBSs) endowed with cloud-like computing functionalities paves the way for pervasive mobile edge computing (MEC), enabling ultra-low latency and location-awareness for a variety of emerging mobile applications and the Internet of Things. To handle spatially uneven computation workloads in the network, cooperation among SBSs via workload peer offloading is essential to avoid large computation latency at overloaded SBSs and provide high quality of service to end users. However, performing effective peer offloading faces many unique challenges in small cell networks due to limited energy resources committed by self-interested SBS owners, uncertainties in the system dynamics and co-provisioning of radio access and computing services. Read More

Online trust systems play an important role in today's world, and building them poses various challenges. Billions of dollars of products and services are traded through electronic commerce, files are shared among large peer-to-peer networks and smart contracts can potentially replace paper contracts with digital contracts. These systems rely on trust mechanisms in peer-to-peer networks like reputation systems or a trustless public ledger. Read More

Branch and bound searches are a common technique for solving global optimisation and decision problems, yet their irregularity, search order dependence, and the need to share bound information globally make it challenging to implement them in parallel, and to reason about their parallel performance. We identify three key parallel search properties for replicable branch and bound implementations: Sequential Lower Bound, Non-increasing Runtimes, and Repeatability. We define a formal model for parallel branch and bound search problems and show its generality by using it to define three benchmarks: finding a Maximum Clique in a graph, 0/1 Knapsack and Travelling Salesperson (TSP). Read More
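For readers unfamiliar with the technique, the following is a minimal sequential branch and bound sketch in Python for 0/1 Knapsack, one of the benchmarks named in this abstract. It is not the paper's formal model or implementation. The function name, the fractional-relaxation bound, and the assumption that all weights are positive are choices made here for illustration.

```python
def knapsack_branch_and_bound(values, weights, capacity):
    """Best total value of a 0/1 knapsack, found by depth-first branch and bound.

    Assumes all weights are positive. Illustrative sketch only.
    """
    # Consider items in order of decreasing value density so the bound is tight.
    items = sorted(zip(values, weights),
                   key=lambda vw: vw[0] / vw[1], reverse=True)
    best = 0  # incumbent: best feasible value found so far

    def upper_bound(index, value, room):
        """Optimistic bound: fill greedily, allowing one fractional item."""
        for v, w in items[index:]:
            if w <= room:
                room -= w
                value += v
            else:
                return value + v * room / w
        return value

    def search(index, value, room):
        nonlocal best
        best = max(best, value)  # every node reached is a feasible selection
        # Prune when no completion of this node can beat the incumbent.
        if index == len(items) or upper_bound(index, value, room) <= best:
            return
        v, w = items[index]
        if w <= room:
            search(index + 1, value + v, room - w)  # branch: take the item
        search(index + 1, value, room)              # branch: skip the item

    search(0, 0, capacity)
    return best


print(knapsack_branch_and_bound([60, 100, 120], [10, 20, 30], 50))
```

The incumbent `best` is the bound information that every branch reads and writes. In a parallel search it would have to be shared among workers, and the amount of pruning would then depend on when each worker sees an update, which is one source of the irregularity the abstract describes.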

This paper serves as a user guide to the mapping framework VieM (Vienna Mapping and Sparse Quadratic Assignment). We give a rough overview of the techniques used within the framework and describe the user interface as well as the file formats used. Read More

In the paper, we present designs for multiple blockchain consensus primitives and a novel blockchain system, all based on the use of trusted execution environments (TEEs), such as Intel SGX-enabled CPUs. First, we show how using TEEs for existing proof of work schemes can make mining equitably distributed by preventing the use of ASICs. Next, we extend the design with proof of time and proof of ownership consensus primitives to make mining energy- and time-efficient. Read More

Distributed shared memory systems maintain multiple replicas of the shared memory locations. Maintaining causal consistency in such systems has received significant attention in the past. However, much of the previous literature focuses on full replication wherein each replica stores a copy of all the locations in the shared memory. Read More

Data deduplication is able to effectively identify and eliminate redundant data and only maintain a single copy of files and chunks. Hence, it is widely used in cloud storage systems to save storage space and network bandwidth. However, the occurrence of deduplication can be easily identified by monitoring and analyzing network traffic, which leads to the risk of user privacy leakage. Read More

This paper studies problems on locally stopping distributed consensus algorithms over networks where each node updates its state by interacting with its neighbors and decides by itself whether a certain level of agreement has been achieved among nodes. Since an individual node is unable to access the states of those beyond its neighbors, this problem becomes challenging. In this work, we first define the stopping problem for generic distributed algorithms. Read More

We present a simple distributed algorithm that, given a regular graph consisting of two communities (or clusters), each inducing a good expander and such that the cut between them has sparsity $1/\mbox{polylog}(n)$, recovers the two communities. More precisely, upon running the protocol, every node assigns itself a binary label of $m = \Theta(\log n)$ bits, so that with high probability, for all but a small number of outliers, nodes within the same community are assigned labels with Hamming distance $o(m)$, while nodes belonging to different communities receive labels with Hamming distance at least $m/2 - o(m)$. We refer to such an outcome as a "community sensitive labeling" of the graph. Read More

With the explosion of data size and limited storage space at a single location, data are often distributed across different locations. We thus face the challenge of performing large-scale machine learning from these distributed data through communication networks. In this paper, we study how the network communication constraints will impact the convergence speed of distributed machine learning optimization algorithms. Read More

Robustness is a correctness notion for concurrent programs running under relaxed consistency models. The task is to check that the relaxed behavior coincides (up to traces) with sequential consistency (SC). Although computationally simple on paper (robustness has been shown to be PSPACE-complete for TSO, PGAS, and Power), building a practical robustness checker remains a challenge. Read More

High-performance computing systems are more and more often based on accelerators. Computing applications targeting those systems often follow a host-driven approach in which hosts offload almost all compute-intensive sections of the code onto accelerators; this approach only marginally exploits the computational resources available on the host CPUs, limiting performance and energy efficiency. The obvious step forward is to run compute-intensive kernels in a concurrent and balanced way on both hosts and accelerators. Read More

With the increase in electric vehicle (EV) adoption in recent years, the impact of EV charging activities on the power grid has become more and more significant. In this article, an optimal scheduling algorithm which combines smart EV charging and V2G grid service is developed to integrate EVs into the power grid as distributed energy resources, with improved system cost performance. Specifically, an optimization problem is formulated and solved at each EV charging station according to the control signal from the aggregated control center and user charging behavior prediction by mean estimation and linear regression. Read More

In this work, we study theoretical models of \emph{programmable matter} systems. The systems under consideration consist of spherical modules, kept together by magnetic forces and able to perform two minimal mechanical operations (or movements): \emph{rotate} around a neighbor and \emph{slide} over a line. In terms of modeling, there are $n$ nodes arranged in a 2-dimensional grid and forming some initial \emph{shape}. Read More

Population protocols are a well established model of computation by anonymous, identical finite state agents. A protocol is well-specified if from every initial configuration, all fair executions reach a common consensus. The central verification question for population protocols is the well-specification problem: deciding if a given protocol is well-specified. Read More

Automatic decision-making approaches, such as reinforcement learning (RL), have been applied to (partially) solve the resource allocation problem adaptively in the cloud computing system. However, a complete cloud resource allocation framework exhibits high dimensions in state and action spaces, which prohibit the usefulness of traditional RL techniques. In addition, high power consumption has become one of the critical concerns in design and control of cloud computing systems, which degrades system reliability and increases cooling cost. Read More

Cognitive inference of user demographics, such as gender and age, plays an important role in creating user profiles for adjusting marketing strategies and generating personalized recommendations because user demographic data is usually not available due to data privacy concerns. At present, users can readily express feedback regarding products or services that they have purchased. During this process, user demographics are concealed, but the data has never yet been successfully utilized to contribute to the cognitive inference of user demographics. Read More

Inference of user context information, including the user's gender, age, marital status, location and so on, has been proven to be valuable for building context-aware recommender systems. However, prevalent existing studies on user context inference have two shortcomings: 1. focusing on only a single data source (e. Read More

Affiliations: 1Fermi National Accelerator Laboratory, 2Fermi National Accelerator Laboratory, 3Princeton University, 4Fermi National Accelerator Laboratory, 5Fermi National Accelerator Laboratory, 6Princeton University, 7Fermi National Accelerator Laboratory, 8Fermi National Accelerator Laboratory now Johns Hopkins University, 9Princeton University, 10Fermi National Accelerator Laboratory

Experimental Particle Physics has been at the forefront of analyzing the world's largest datasets for decades. The HEP community was the first to develop suitable software and computing tools for this task. In recent times, new toolkits and systems collectively called Big Data technologies have emerged to support the analysis of Petabyte and Exabyte datasets in industry. Read More

Blockchain technologies are taking the world by storm. Public blockchains, such as Bitcoin and Ethereum, enable secure peer-to-peer applications like crypto-currency or smart contracts. Their security and performance are well studied. Read More

Machine learning applications are increasingly deployed not only to serve predictions using static models, but also as tightly-integrated components of feedback loops involving dynamic, real-time decision making. These applications pose a new set of requirements, none of which are difficult to achieve in isolation, but the combination of which creates a challenge for existing distributed execution frameworks: computation with millisecond latency at high throughput, adaptive construction of arbitrary task graphs, and execution of heterogeneous kernels over diverse sets of resources. We assert that a new distributed execution framework is needed for such ML applications and propose a candidate approach with a proof-of-concept architecture that achieves a 63x performance improvement over a state-of-the-art execution framework for a representative application. Read More

The DotGrid platform is a Grid infrastructure integrated with a set of open and standard protocols recently implemented on top of Microsoft .NET in Windows and MONO .NET in UNIX/Linux. Read More

Grid infrastructures that have provided wide integrated use of resources are becoming the de-facto computing platform for solving large-scale problems in science, engineering and commerce. In this evolution, desktop grid technologies allow the grid communities to exploit the idle cycles of pervasive desktop PC systems to increase the available computing power. In this paper we present DotGrid, a cross-platform grid software. Read More

Resource discovery is one of the most important services that significantly affects the efficiency of grid computing systems. The inherent dynamic and large-scale characteristics of grid environments make their resource discovery a challenging task. In recent years, different approaches have been proposed for resource discovery, attempting to tackle the challenges of grid environments and improve the efficiency. Read More

In this paper we introduce and describe the highly concurrent xDFS file transfer protocol and examine its cross-platform and cross-language implementation in native code for both Linux and Windows in 32 or 64-bit multi-core processor architectures. The implemented xDFS protocol based on xDotGrid.NET framework is fully compared with the Globus GridFTP protocol. Read More

We deal with the problem of maintaining a shortest-path tree rooted at some process r in a network that may be disconnected after topological changes. The goal is then to maintain a shortest-path tree rooted at r in its connected component, Vr, and make all processes of other components detect that r is not part of their connected component. We propose, in the composite atomicity model, a silent self-stabilizing algorithm for this problem working in semi-anonymous networks, where edges have strictly positive weights. Read More

In the era of big data and the Internet of Things, massive amounts of sensor data are gathered. The data captured by sensor networks are considered to contain highly useful and valuable information. However, for a variety of reasons, received sensor data often appear abnormal. Read More

We investigate a special case of hereditary property that we refer to as {\em robustness}. A property is {\em robust} in a given graph if it is inherited by all connected spanning subgraphs of this graph. We motivate this definition in different contexts, showing that it plays a central role in highly dynamic networks, although the problem is defined in terms of classical (static) graph theory. Read More