Computer Science - Operating Systems Publications (50)


Computer Science - Operating Systems Publications

The rapid evolution of Internet-of-Things (IoT) technologies has led to an emerging need to make it smarter. A variety of applications now run simultaneously on an ARM-based processor. For example, devices on the edge of the Internet are provided with higher horsepower to be entrusted with storing, processing and analyzing data collected from IoT devices. Read More

Existing memory management mechanisms used in commodity computing machines typically adopt hardware based address interleaving and OS directed random memory allocation to service generic application requests. These conventional memory management mechanisms are challenged by contention at multiple memory levels, a daunting variety of workload behaviors, and an increasingly complicated memory hierarchy. Our ISCA-41 paper proposes vertical partitioning to eliminate shared resource contention at multiple levels in the memory hierarchy. Read More

The online feasibility problem (for a set of sporadic tasks) asks whether there is a scheduler that always prevents deadline misses (if any), whatever the sequence of job releases, which is a priori} unknown to the scheduler. In the multiprocessor setting, this problem is notoriously difficult. The only exact test for this problem has been proposed by Bonifaci and Marchetti-Spaccamela: it consists in modelling all the possible behaviours of the scheduler and of the tasks as a graph; and to interpret this graph as a game between the tasks and the scheduler, which are seen as antagonistic players. Read More

This paper reports on the state of the art of virtualization technology for both general purpose domains as well as real-time domains. There exits no entirely instantaneous data transmission/transfer. There always exist a delay while transmitting data, either in the processing or in the medium itself. Read More

The emerging hybrid DRAM-NVM architecture is challenging the existing memory management mechanism in operating system. In this paper, we introduce memos, which can schedule memory resources over the entire memory hierarchy including cache, channels, main memory comprising DRAM and NVM simultaneously. Powered by our newly designed kernel-level monitoring module and page migration engine, memos can dynamically optimize the data placement at the memory hierarchy in terms of the on-line memory patterns, current resource utilization and feature of memory medium. Read More

Affiliations: 1Systems Group, Department of Computer Science, ETH Zurich, 2Systems Group, Department of Computer Science, ETH Zurich, 3Systems Group, Department of Computer Science, ETH Zurich, 4Systems Group, Department of Computer Science, ETH Zurich

The hardware/software boundary in modern heterogeneous multicore computers is increasingly complex, and diverse across different platforms. A single memory access by a core or DMA engine traverses multiple hardware translation and caching steps, and the destination memory cell or register often appears at different physical addresses for different cores. Interrupts pass through a complex topology of interrupt controllers and remappers before delivery to one or more cores, each with specific constraints on their configurations. Read More

Code authorship is a key information in large-scale open source systems. Among others, it allows maintainers to assess division of work and identify key collaborators. Interestingly, open-source communities lack guidelines on how to manage authorship. Read More

Checkpoint-restart is now a mature technology. It allows a user to save and later restore the state of a running process. The new plugin model for the upcoming version 3. Read More

As its price per bit drops, SSD is increasingly becoming the default storage medium for cloud application databases. However, it has not become the preferred storage medium for key-value caches, even though SSD offers more than 10x lower price per bit and sufficient performance compared to DRAM. This is because key-value caches need to frequently insert, update and evict small objects. Read More

Timing and power consumption play an important role in the design of embedded systems. Furthermore, both properties are directly related to the safety requirements of many embedded systems. With regard to availability requirements, power considerations are of uttermost importance for battery operated systems. Read More

We introduce a user mode file system, CannyFS, that hides latency by assuming all I/O operations will succeed. The user mode process will in turn report errors, allowing proper cleanup and a repeated attempt to take place. We demonstrate benefits for the model tasks of extracting archives and removing directory trees in a real-life HPC environment, giving typical reductions in time use of over 80%. Read More

Memory and logic integration on the same chip is becoming increasingly cost effective, creating the opportunity to offload data-intensive functionality to processing units placed inside memory chips. The introduction of memory-side processing units (MPUs) into conventional systems faces virtual memory as the first big showstopper: without efficient hardware support for address translation MPUs have highly limited applicability. Unfortunately, conventional translation hardware (i. Read More

Applications written to run on conventional operating systems typically depend on OS abstractions like processes, pipes, signals, sockets, and a shared file system. Porting these applications to the web currently requires extensive rewriting or hosting significant portions of code server-side because browsers present a nontraditional runtime environment that lacks OS functionality. This paper presents Browsix, a framework that bridges the considerable gap between conventional operating systems and the browser, enabling unmodified programs expecting a Unix-like environment to run directly in the browser. Read More

Web application performance is heavily reliant on the hit rate of memory-based caches. Current DRAM-based web caches statically partition their memory across multiple applications sharing the cache. This causes under utilization of memory which negatively impacts cache hit rates. Read More

Read-Copy Update (RCU) is a scalable, high-performance Linux-kernel synchronization mechanism that runs low-overhead readers concurrently with updaters. Production-quality RCU implementations for multi-core systems are decidedly non-trivial. Giving the ubiquity of Linux, a rare "million-year" bug can occur several times per day across the installed base. Read More

The trade-off between coarse- and fine-grained locking is a well understood issue in operating systems. Coarse-grained locking provides lower overhead under low contention, fine-grained locking provides higher scalability under contention, though at the expense of implementation complexity and re- duced best-case performance. We revisit this trade-off in the context of microkernels and tightly-coupled cores with shared caches and low inter-core migration latencies. Read More

In this paper, the implementation of an operating system oriented RBAC model is discussed. Firstly, on the basis of RBAC96 model, a new RBAC model named OSR is presented. Secondly, the OSR model is enforced in RFSOS kernel by the way of integrating GFAC method and Capability mechanism together. Read More

OS-level virtualization incurs smaller start-up and run-time overhead than HAL-based virtualization and thus forms an important building block for developing fault-tolerant and intrusion-tolerant applications. A complete implementation of OS-level virtualization on the Windows platform requires virtualization of Windows services, such as system services like the Remote Procedure Call Server Service (RPCSS), because they are essentially extensions of the kernel. As Windows system services work very differently from their counterparts on UNIX-style OS, i. Read More

As OS-level virtualization technology usually imposes little overhead on virtual machine start-up and running, it provides an excellent choice for building intrusion/fault tolerant applications that require redundancy and frequent invocation. When developing Windows OS-level virtual machine, however, people will inevitably face the challenge of confining Windows Inter-Process Communications (IPC). As IPC on Windows platform is more complex than UNIX style OS and most of the programs on Windows are not open-source, it is difficult to discover all of the performed IPCs and confine them. Read More

In this thesis, we describe a new, practical approach to integrating hardware-based data compression within the memory hierarchy, including on-chip caches, main memory, and both on-chip and off-chip interconnects. This new approach is fast, simple, and effective in saving storage space. A key insight in our approach is that access time (including decompression latency) is critical in modern memory hierarchies. Read More

OS compromise is one of the most serious computer security problems today, but still not being resolved. Although people proposed different kinds of methods, they could not be accepted by most users who are non-expert due to the lack of compatibility and usability. In this paper, we introduce a kind of new mandatory access control model, named CUMAC, that aims to achieve good-enough security, high compatibility and usability. Read More

Today, security threats to operating systems largely come from network. Traditional discretionary access control mechanism alone can hardly defeat them. Although traditional mandatory access control models can effectively protect the security of OS, they have problems of being incompatible with application software and complex in administration. Read More

Modern mobile systems use a single input-to-display path to serve all applications. In meeting the visual goals of all applications, the path has a latency inadequate for many important interactions. To accommodate the different latency requirements and visual constraints by different interactions, we present POLYPATH, a system design in which application developers (and users) can choose from multiple path designs for their application at any time. Read More

In order to limit the damage of malware on Mac OS X and iOS, Apple uses sandboxing, a kernel-level security layer that provides tight constraints for system calls. Particularly used for Apple iOS, sandboxing prevents apps from executing potentially dangerous actions, by defining rules in a sandbox profile. Investigating Apple's built-in sandbox profiles is difficult as they are compiled and stored in binary format. Read More

OS-level virtualization techniques virtualize system resources at the system call interface, has the distinct advantage of smaller run-time resource requirements as compared to HAL-level virtualization techniques, and thus forms an important building block for virtualizing parallel and distributed applications such as a HPC clusters. Because the Windows operating system puts certain critical functionalities in privileged user-level system service processes, a complete OS-level virtualization solution for the Windows platform requires duplication of such Windows service as Remote Procedure Call Server Service (RPCSS). As many implementation details of the Windows system services are proprietary, duplicating Windows system services becomes the key technical challenge for virtualizing the Windows platform at the OS level. Read More

We have developed a task-parallel runtime system, called TREES, that is designed for high performance on CPU/GPU platforms. On platforms with multiple CPUs, Cilk's "work-first" principle underlies how task-parallel applications can achieve performance, but work-first is a poor fit for GPUs. We build upon work-first to create the "work-together" principle that addresses the specific strengths and weaknesses of GPUs. Read More

Fault tolerance for the upcoming exascale generation has long been an area of active research. One of the components of a fault tolerance strategy is checkpointing. Petascale-level checkpointing is demonstrated through a new mechanism for virtualization of the InfiniBand UD (unreliable datagram) mode, and for updating the remote address on each UD-based send, due to lack of a fixed peer. Read More

We propose three novel mathematical optimization formulations that solve the same two-type heterogeneous multiprocessor scheduling problem for a real-time taskset with hard constraints. Our formulations are based on a global scheduling scheme and a fluid model. The first formulation is a mixed-integer nonlinear program, since the scheduling problem is intuitively considered as an assignment problem. Read More

Virtual machines and virtualized hardware have been around for over half a century. The commoditization of the x86 platform and its rapidly growing hardware capabilities have led to recent exponential growth in the use of virtualization both in the enterprise and high performance computing (HPC). The startup time of a virtualized environment is a key performance metric for high performance computing in which the runtime of any individual task is typically much shorter than the lifetime of a virtualized service in an enterprise context. Read More

The period enforcer algorithm for self-suspending real-time tasks is a technique for suppressing the "back-to-back" scheduling penalty associated with deferred execution. Originally proposed in 1991, the algorithm has attracted renewed interest in recent years. This note revisits the algorithm in the light of recent developments in the analysis of self-suspending tasks, carefully re-examines and explains its underlying assumptions and limitations, and points out three observations that have not been made in the literature to date: (i) period enforcement is not strictly superior (compared to the base case without enforcement) as it can cause deadline misses in self-suspending task sets that are schedulable without enforcement; (ii) to match the assumptions underlying the analysis of the period enforcer, a schedulability analysis of self-suspending tasks subject to period enforcement requires a task set transformation for which no solution is known in the general case, and which is subject to exponential time complexity (with current techniques) in the limited case of a single self-suspending task; and (iii) the period enforcer algorithm is incompatible with all existing analyses of suspension-based locking protocols, and can in fact cause ever-increasing suspension times until a deadline is missed. Read More

Real-time scheduling algorithms proposed in the literature are often based on worst-case estimates of task parameters. The performance of an open-loop scheme can be degraded significantly if there are uncertainties in task parameters, such as the execution times of the tasks. Therefore, to cope with such a situation, a closed-loop scheme, where feedback is exploited to adjust the system parameters, can be applied. Read More

Mixed-criticality systems combine real-time components of different levels of criticality, i.e. severity of failure, on the same processor, in order to obtain good resource utilisation. Read More

This book describes the source code of the NetBSD Operating System Release 1.6 in SUN UltraSPARC 64-bit platform by annotating related excerpts from references and user manuals on the NetBSD Operating System. The goal of this book is to provide necessary information to understand the operation and the implementation of I/O subsystems in the kernel as well as to design and implement a new filesystem on the NetBSD platform. Read More

Virtualization is generally adopted in server and desktop environments to provide for fault tolerance, resource management, and energy efficiency. Virtualization enables parallel execution of multiple operating systems (OSs) while sharing the hardware resources. Virtualization was previously not deemed as feasible technology for mobile and embedded devices due to their limited processing and memory resource. Read More

CPU scheduling is one of the most crucial operations performed by operating system. Different algorithms are available for CPU scheduling amongst them RR (Round Robin) is considered as optimal in time shared environment. The effectiveness of Round Robin completely depends on the choice of time quantum. Read More

We present a scheduler that improves cluster utilization and job completion times by packing tasks having multi-resource requirements and inter-dependencies. While the problem is algorithmically very hard, we achieve near-optimality on the job DAGs that appear in production clusters at a large enterprise and in benchmarks such as TPC-DS. A key insight is that carefully handling the long-running tasks and those with tough-to-pack resource needs will produce good-enough schedules. Read More

Smartphones' cameras, microphones, and device displays enable users to capture and view memorable moments of their lives. However, adversaries can trick users into authorizing malicious apps that exploit weaknesses in current mobile platforms to misuse such on-board I/O devices to stealthily capture photos, videos, and screen content without the users' consent. Contemporary mobile operating systems fail to prevent such misuse of I/O devices by authorized apps due to lack of binding between users' interactions and accesses to I/O devices performed by these apps. Read More

Previous OS abstractions and structures are mainly proposed for the average performance. The shift toward server side computing calls for new OS structures for the worst-case performance. This paper presents the "isolate first, then share" OS architecture. Read More

Voice control is a popular way to operate mobile devices, enabling users to communicate requests to their devices. However, adversaries can leverage voice control to trick mobile devices into executing commands to leak secrets or to modify critical information. Contemporary mobile operating systems fail to prevent such attacks because they do not control access to the speaker at all and fail to control when untrusted apps may use the microphone, enabling authorized apps to create exploitable communication channels. Read More

Modern operating system kernels are written in lower-level languages such as C. Although the low-level functionalities of C are often useful within kernels, they also give rise to several classes of bugs. Kernels written in higher level languages avoid many of these potential problems, at the possible cost of decreased performance. Read More

The crucial importance of software platforms was highlighted by recent events both at the political level (e.g. renewed calls for digital data and operating system "sovereignty", following E. Read More

Effectively protecting the Windows OS is a challenging task, since most implementation details are not publicly known. Windows has always been the main target of malwares that have exploited numerous bugs and vulnerabilities. Recent trusted boot and additional integrity checks have rendered the Windows OS less vulnerable to kernel-level rootkits. Read More

This report gives an overview of secure element integration into Android devices. It focuses on the Open Mobile API as an open interface to access secure elements from Android applications. The overall architecture of the Open Mobile API is described and current Android devices are analyzed with regard to the availability of this API. Read More

This paper addresses the problem of scheduling tasks with different criticality levels in the presence of I/O requests. In mixed-criticality scheduling, higher criticality tasks are given precedence over those of lower criticality when it is impossible to guarantee the schedulability of all tasks. While mixed-criticality scheduling has gained attention in recent years, most approaches typically assume a periodic task model. Read More

Multi-core processors are becoming more and more popular in embedded and real-time systems. While fixed-priority scheduling with task-splitting in real-time systems are widely applied, current approaches have not taken into consideration energy-aware aspects such as dynamic voltage/frequency scheduling (DVS). In this paper, we propose two strategies to apply dynamic voltage scaling (DVS) to fixed-priority scheduling algorithms with task-splitting for periodic real-time tasks on multi-core processors. Read More

Large number of cores and hardware resource sharing are two characteristics on multicore processors, which bring new challenges for the design of operating systems. How to locate and analyze the speedup restrictive factors in operating systems, how to simulate and avoid the phenomenon that speedup decreases with the number of cores because of lock contention (i.e. Read More

When integrating hard, soft and non-real-time tasks in general purpose operating systems, it is necessary to provide temporal isolation so that the timing properties of one task do not depend on the behaviour of the others. However, strict budget enforcement can lead to inefficient use of the computational resources in the presence of tasks with variable workload. Many resource reclaiming algorithms have been proposed in the literature for single processor scheduling, but not enough work exists for global scheduling in multiprocessor systems. Read More

Real-time systems are traditionally classified into hard real-time and soft real-time: in the first category we have safety critical real-time systems where missing a deadline can have catastrophic consequences, whereas in the second class we find systems or which we need to optimise the Quality of service provided to the user. However, the frontier between these two classes is thinner than one may think, and many systems that were considered as hard real-time in the past should now be reconsidered under a different light. In this paper we shall first recall the fundamental notion of time-predictability and criticality, in order to understand where the real-time deadlines that we use in our theoretical models come from. Read More

This paper proposes to use a frequency based cache admission policy in order to boost the effectiveness of caches subject to skewed access distributions. Given a newly accessed item and an eviction candidate from the cache, our scheme decides, based on the recent access history, whether it is worth admitting the new item into the cache at the expense of the eviction candidate. Realizing this concept is enabled through a novel approximate LFU structure called TinyLFU, which maintains an approximate representation of the access frequency of a large sample of recently accessed items. Read More

We introduce a controlled concurrency framework, derived from the Owicki-Gries method, for describing a hardware interface in detail sufficient to support the modelling and verification of small, embedded operating systems (OS's) whose run-time responsiveness is paramount. Such real-time systems run with interrupts mostly enabled, including during scheduling. That differs from many other successfully modelled and verified OS's that typically reduce the complexity of concurrency by running on uniprocessor platforms and by switching interrupts off as much as possible. Read More