Johan Ugander

Johan Ugander
Are you Johan Ugander?

Claim your profile, edit publications, add additional information:

Contact Details

Johan Ugander

Pubs By Year

Pub Categories

Physics - Physics and Society (9)
Statistics - Methodology (3)
Computer Science - Artificial Intelligence (2)
Statistics - Machine Learning (1)
Mathematics - Probability (1)
Physics - Data Analysis; Statistics and Probability (1)
Computer Science - Data Structures and Algorithms (1)
Quantitative Biology - Quantitative Methods (1)

Publications Authored By Johan Ugander

A broad range of on-line behaviors are mediated by interfaces in which people make choices among sets of options. A rich and growing line of work in the behavioral sciences indicate that human choices follow not only from the utility of alternatives, but also from the choice set in which alternatives are presented. In this work we study comparison-based choice functions, a simple but surprisingly rich class of functions capable of exhibiting so-called choice-set effects. Read More

The observation that individuals tend to be friends with people who are similar to themselves, commonly known as homophily, is a prominent and well-studied feature of social networks. Many machine learning methods exploit homophily to predict attributes of individuals based on the attributes of their friends. Meanwhile, recent work has shown that gender homophily can be weak or nonexistent in practice, making gender prediction particularly challenging. Read More

Random graph null models have found widespread application in diverse research communities analyzing network datasets. The most popular family of random graph null models, called configuration models, are defined as uniform distributions over a space of graphs with a fixed degree sequence. Commonly, properties of an empirical network are compared to properties of an ensemble of graphs from a configuration model in order to quantify whether empirical network properties are meaningful or whether they are instead a common consequence of the particular degree sequence. Read More

Methods for ranking the importance of nodes in a network have a rich history in machine learning and across domains that analyze structured data. Recent work has evaluated these methods though the seed set expansion problem: given a subset $S$ of nodes from a community of interest in an underlying graph, can we reliably identify the rest of the community? We start from the observation that the most widely used techniques for this problem, personalized PageRank and heat kernel methods, operate in the space of landing probabilities of a random walk rooted at the seed set, ranking nodes according to weighted sums of landing probabilities of different length walks. Both schemes, however, lack an a priori relationship to the seed set objective. Read More

As datasets capturing human choices grow in richness and scale---particularly in online domains---there is an increasing need for choice models that escape traditional choice-theoretic axioms such as regularity, stochastic transitivity, and Luce's choice axiom. In this work we introduce the Pairwise Choice Markov Chain (PCMC) model of discrete choice, an inferentially tractable model that does not assume any of the above axioms while still satisfying the foundational axiom of uniform expansion, a considerably weaker assumption than Luce's choice axiom. We show that the PCMC model significantly outperforms the Multinomial Logit (MNL) model in prediction tasks on both synthetic and empirical datasets known to exhibit violations of Luce's axiom. Read More

Online social networks represent a popular and diverse class of social media systems. Despite this variety, each of these systems undergoes a general process of online social network assembly, which represents the complicated and heterogeneous changes that transform newly born systems into mature platforms. However, little is known about this process. Read More

Estimating the effects of interventions in networks is complicated when the units are interacting, such that the outcomes for one unit may depend on the treatment assignment and behavior of many or all other units (i.e., there is interference). Read More

A/B testing is a standard approach for evaluating the effect of online experiments; the goal is to estimate the `average treatment effect' of a new feature or condition by exposing a sample of the overall population to it. A drawback with A/B testing is that it is poorly suited for experiments involving social interference, when the treatment of individuals spills over to neighboring individuals along an underlying social network. In this work, we propose a novel methodology using graph clustering to analyze average treatment effects under social interference. Read More

A growing set of on-line applications are generating data that can be viewed as very large collections of small, dense social graphs -- these range from sets of social groups, events, or collaboration projects to the vast collection of graph neighborhoods in large social networks. A natural question is how to usefully define a domain-independent coordinate system for such a collection of graphs, so that the set of possible structures can be compactly represented and understood within a common space. In this work, we draw on the theory of graph homomorphisms to formulate and analyze such a representation, based on computing the frequencies of small induced subgraphs within each graph. Read More

People's interests and people's social relationships are intuitively connected, but understanding their interplay and whether they can help predict each other has remained an open question. We examine the interface of two decisive structures forming the backbone of online social media: the graph structure of social networks - who connects with whom - and the set structure of topical affiliations - who is interested in what. In studying this interface, we identify key relationships whereby each of these structures can be understood in terms of the other. Read More

Frigyes Karinthy, in his 1929 short story "L\'aancszemek" ("Chains") suggested that any two persons are distanced by at most six friendship links. (The exact wording of the story is slightly ambiguous: "He bet us that, using no more than five individuals, one of whom is a personal acquaintance, he could contact the selected individual [.. Read More

We study the structure of the social graph of active Facebook users, the largest social network ever analyzed. We compute numerous features of the graph including the number of users and friendships, the degree distribution, path lengths, clustering, and mixing patterns. Our results center around three main observations. Read More