# Comparison-Based Choices

A broad range of online behaviors are mediated by interfaces in which people make choices among sets of options. A rich and growing line of work in the behavioral sciences indicates that human choices follow not only from the utility of alternatives, but also from the choice set in which alternatives are presented. In this work we study comparison-based choice functions, a simple but surprisingly rich class of functions capable of exhibiting so-called choice-set effects. Motivated by the challenge of predicting complex choices, we study the query complexity of these functions in a variety of settings. We consider settings that allow for active queries or passive observation of a stream of queries, and give analyses at the granularity of both individuals and populations that might exhibit heterogeneous choice behavior. Our main result is that any comparison-based choice function in one dimension can be inferred as efficiently as a basic maximum or minimum choice function across many query contexts, suggesting that choice-set effects need not entail any fundamental algorithmic barriers to inference. We also introduce a class of choice functions we call distance-comparison-based functions, and briefly discuss the analysis of such functions. The framework we outline provides intriguing connections between human choice behavior and a range of questions in the theory of sorting.

**Comments:** 20 pages, 3 figures
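
As a rough illustration of the choice-set effects mentioned in the abstract, the sketch below contrasts a maximum choice function with one that picks the middle-ranked option. Both depend only on the relative order of the alternatives. The function names and the numeric menus are hypothetical and are not taken from the paper; the "middle" rule is just one assumed example of a comparison-based function in one dimension.

```python
from typing import Sequence


def choose_max(options: Sequence[float]) -> float:
    """Baseline: always pick the largest alternative."""
    return max(options)


def choose_middle(options: Sequence[float]) -> float:
    """Hypothetical comparison-based rule: pick the middle-ranked alternative.

    Only the ordering of the options matters, never their magnitudes.
    For an even number of options this takes the upper of the two middle items.
    """
    ordered = sorted(options)
    return ordered[len(ordered) // 2]


# Two menus that both contain the alternatives 20 and 30.
menu_low = [10.0, 20.0, 30.0]
menu_high = [20.0, 30.0, 40.0]

# The maximum rule ranks 30 above 20 in every menu that contains both.
print(choose_max(menu_low), choose_max(menu_high))

# The middle rule picks 20 from one menu and 30 from the other, so the
# preference between 20 and 30 depends on what else is offered.
print(choose_middle(menu_low), choose_middle(menu_high))
```

Under these assumptions, the middle rule selects 20 when 30 is also available in the first menu, yet selects 30 over 20 in the second, which is the kind of dependence on the choice set that a fixed utility ranking cannot produce.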

## Similar Publications

In this paper we initiate the study of property testing in simultaneous and non-simultaneous multi-party communication complexity, focusing on testing triangle-freeness in graphs. We consider the $\textit{coordinator}$ model, where we have $k$ players receiving private inputs, and a coordinator who receives no input; the coordinator can communicate with all the players, but the players cannot communicate with each other. In this model, we ask: if an input graph is divided between the players, with each player receiving some of the edges, how many bits do the players and the coordinator need to exchange to determine if the graph is triangle-free, or $\textit{far}$ from triangle-free? For general communication protocols, we show that $\tilde{O}(k(nd)^{1/4}+k^2)$ bits are sufficient to test triangle-freeness in graphs of size $n$ with average degree $d$ (the degree need not be known in advance).

We develop a coalgebraic generalization of the classical Paige-Tarjan algorithm for efficient bisimilarity checking. Coalgebraic generality implies that our algorithm applies to systems beyond the standard relational setup, in particular various flavours of weighted systems. The specific requirements of the algorithm force rather strong assumptions on the coalgebraic type functors, but by using modularity principles in multi-sorted coalgebra and generalizing our methods beyond the category of sets, we nevertheless arrive at covering not just the known examples (transition systems and Markov chains) but also systems with mixed transition types, such as Segala-style probabilistic automata.

We analyze the caching overhead incurred by a class of multithreaded algorithms when scheduled by an arbitrary scheduler. We obtain bounds that match or improve upon the well-known $O(Q+S \cdot (M/B))$ caching cost for the randomized work stealing (RWS) scheduler, where $S$ is the number of steals, $Q$ is the sequential caching cost, and $M$ and $B$ are the cache size and block (or cache line) size respectively.

In a vertex-colored graph, an edge is happy if its endpoints have the same color. Similarly, a vertex is happy if all its incident edges are happy. Motivated by the computation of homophily in social networks, we consider the algorithmic aspects of the following Maximum Happy Edges (k-MHE) problem: given a partially k-colored graph G, find an extended full k-coloring of G maximizing the number of happy edges.

Finding groups of connected individuals in large graphs with tens of thousands or more nodes has received considerable attention in academic research. In this paper, we analyze three main issues with respect to the recent influx of papers on community detection in (large) graphs, highlight the specific problems with the current research avenues, and propose a first step towards a better approach. First, in spite of the strong interest in community detection, a strong conceptual and theoretical foundation of connectedness in large graphs is missing.

A common approach for designing scalable algorithms for massive data sets is to distribute the computation across, say $k$, machines and process the data using limited communication between them. A particularly appealing framework here is the simultaneous communication model whereby each machine constructs a small representative summary of its own data and one obtains an approximate/exact solution from the union of the representative summaries. If the representative summaries needed for a problem are small, then this results in a communication-efficient and round-optimal protocol.

The massive quantities of genomic data being made available through gene sequencing techniques are enabling breakthroughs in genomic science in many areas such as medical advances in the diagnosis and treatment of diseases. Analyzing this data, however, is a computational challenge insofar as the computational costs of the relevant algorithms can grow with quadratic, cubic or higher complexity--leading to the need for leadership scale computing. In this paper we describe a new approach to calculations of the Custom Correlation Coefficient (CCC) between Single Nucleotide Polymorphisms (SNPs) across a population, suitable for parallel systems equipped with graphics processing units (GPUs) or Intel Xeon Phi processors.

The surge in availability of genomic data holds promise for enabling determination of genetic causes of observed individual traits, with applications to problems such as discovery of the genetic roots of phenotypes, be they molecular phenotypes such as gene expression or metabolite concentrations, or complex phenotypes such as diseases. However, the growing sizes of these datasets and the quadratic, cubic or higher scaling characteristics of the relevant algorithms pose a serious computational challenge necessitating use of leadership scale computing. In this paper we describe a new approach to performing vector similarity metrics calculations, suitable for parallel systems equipped with graphics processing units (GPUs) or Intel Xeon Phi processors.

We study the problem of testing conductance in the distributed computing model and give a two-sided tester that takes $\mathcal{O}(\log n)$ rounds to decide if a graph has conductance at least $\Phi$ or is $\epsilon$-far from having conductance at least $\Theta(\Phi^2)$ in the distributed CONGEST model. We also show that $\Omega(\log n)$ rounds are necessary for testing conductance even in the LOCAL model. In the case of a connected graph, we show that we can perform the test even when the number of vertices in the graph is not known a priori.

Let $G$ be a graph such that each vertex has its list of available colors, and assume that each list is a subset of the common set consisting of $k$ colors. For two given list colorings of $G$, we study the problem of transforming one into the other by changing only one vertex color assignment at a time, while at all times maintaining a list coloring. This problem is known to be PSPACE-complete even for bounded bandwidth graphs and a fixed constant $k$.