Beng Chin Ooi

Beng Chin Ooi
Are you Beng Chin Ooi?

Claim your profile, edit publications, add additional information:

Contact Details

Beng Chin Ooi

Pubs By Year

Pub Categories

Computer Science - Databases (13)
Computer Science - Distributed; Parallel; and Cluster Computing (4)
Computer Science - Learning (2)
Statistics - Machine Learning (1)
Computer Science - Networking and Internet Architecture (1)
Computer Science - Cryptography and Security (1)

Publications Authored By Beng Chin Ooi

Blockchain technologies are taking the world by storm. Public blockchains, such as Bitcoin and Ethereum, enable secure peer-to-peer applications like crypto-currency or smart contracts. Their security and performance are well studied. Read More

DGCC protocol has been shown to achieve good performance on multi-core in-memory system. However, distributed transactions complicate the dependency resolution, and therefore, an effective transaction partitioning strategy is essential to reduce expensive multi-node distributed transactions. During failure recovery, log must be examined from the last checkpoint onwards and the affected transactions are re-executed based on the way they are partitioned and executed. Read More

It is essential for the cellular network operators to provide cellular location services to meet the needs of their users and mobile applications. However, cellular locations, estimated by network-based methods at the server-side, bear with {\it high spatial errors} and {\it arbitrary missing locations}. Moreover, auxiliary sensor data at the client-side are not available to the operators. Read More

Today's storage systems expose abstractions which are either too low-level (e.g., key-value store, raw-block store) that they require developers to re-invent the wheels, or too high-level (e. Read More

Recently, deep learning techniques have enjoyed success in various multimedia applications, such as image classification and multi-modal data analysis. Large deep learning models are developed for learning rich representations of complex data. There are two challenges to overcome before deep learning can be widely adopted in multimedia and other applications. Read More

Due to the coarse granularity of data accesses and the heavy use of latches, indices in the B-tree family are not efficient for in-memory databases, especially in the context of today's multi-core architecture. In this paper, we present PI, a Parallel in-memory skip list based Index that lends itself naturally to the parallel and concurrent environment, particularly with non-uniform memory access. In PI, incoming queries are collected, and disjointly distributed among multiple threads for processing to avoid the use of latches. Read More

Modern Internet applications often produce a large volume of user activity records. Data analysts are interested in cohort analysis, or finding unusual user behavioral trends, in these large tables of activity records. In a traditional database system, cohort analysis queries are both painful to specify and expensive to evaluate. Read More

Recent years have witnessed amazing outcomes from "Big Models" trained by "Big Data". Most popular algorithms for model training are iterative. Due to the surging volumes of data, we can usually afford to process only a fraction of the training data in each iteration. Read More

Multicore CPUs and large memories are increasingly becoming the norm in modern computer systems. However, current database management systems (DBMSs) are generally ineffective in exploiting the parallelism of such systems. In particular, contention can lead to a dramatic fall in performance. Read More

A new type of logs, the command log, is being employed to replace the traditional data log (e.g., ARIES log) in the in-memory databases. Read More

Numerous applications such as financial transactions (e.g., stock trading) are write-heavy in nature. Read More

k nearest neighbor join (kNN join), designed to find k nearest neighbors from a dataset S for every object in another dataset R, is a primitive operation widely adopted by many data mining applications. As a combination of the k nearest neighbor query and the join operation, kNN join is an expensive operation. Given the increasing volume of data, it is difficult to perform a kNN join on a centralized machine efficiently. Read More

Some complex problems, such as image tagging and natural language processing, are very challenging for computers, where even state-of-the-art technology is yet able to provide satisfactory accuracy. Therefore, rather than relying solely on developing new and better algorithms to handle such tasks, we look to the crowdsourcing solution -- employing human participation -- to make good the shortfall in current technology. Crowdsourcing is a good supplement to many computer tasks. Read More

An increasing amount of trajectory data is being annotated with text descriptions to better capture the semantics associated with locations. The fusion of spatial locations and text descriptions in trajectories engenders a new type of top-$k$ queries that take into account both aspects. Each trajectory in consideration consists of a sequence of geo-spatial locations associated with text descriptions. Read More