We present DPF (Dominant Private Block Fairness), a variant of the popular Dominant Resource Fairness (DRF) algorithm that is geared toward the non-replenishable privacy resource but enjoys theoretical properties similar to those of DRF. Finding the inductive invariant of the distributed protocol is a critical step in verifying the correctness of distributed systems, but takes a long time to do even for simple protocols. The 15th USENIX Symposium on Operating Systems Design and Implementation seeks to present innovative, exciting research in computer systems. Pollux is implemented and publicly available as part of an open-source project at https://github.com/petuum/adaptdl. For general conference information, see https://www.usenix.org/conference/osdi22. Welcome to the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) submissions site. You must not improperly identify a PC member as a conflict if none of these three circumstances applies, even if for some other reason you want to avoid them reviewing your paper. The abstractions we design for the privacy resource mirror those defined by Kubernetes for traditional resources, but there are also major differences. We demonstrate that the hardware thread scheduler is able to lower RPC tail response time by about 5× while enabling the system to sustain 20% higher load, relative to traditional thread scheduling techniques. Weak Links in Authentication Chains: A Large-scale Analysis of Email Sender Spoofing Attacks. Accepted papers will be allowed 14 pages in the proceedings, plus references. Alas, existing profiling techniques incur high overhead when used to identify data locality problems and cannot be deployed in production, where programs may exhibit previously-unseen performance problems. Of the 26 submitted artifacts: 26 artifacts received the Artifacts Available badge (100%). We built a functional NFSv3 server, called GoNFS, to use GoJournal. 
These limitations require state-of-the-art systems to distribute training across multiple machines. Dorylus is up to 3.8× faster and 10.7× cheaper compared to existing sampling-based systems. NrOS replicates kernel state on each NUMA node and uses operation logs to maintain strong consistency between replicas. A PC member is a conflict if any of the following three circumstances applies: Institution: You are currently employed at the same institution, have been previously employed at the same institution within the past two years (not counting concluded internships), or are going to begin employment at the same institution during the review period. A graph embedding is a fixed length vector representation for each node (and/or edge-type) in a graph and has emerged as the de-facto approach to apply modern machine learning on graphs. Kernel code requires manual memory management and type-unsafe code and must efficiently handle complex, asynchronous events. Paper Submission Information: All submissions must be received by 11:59 PM AoE (UTC-12) on the day of the corresponding deadline. They collectively make the backup fresh, columnar, and fault-tolerant, even facing millions of concurrent transactions per second. Writing a correct operating system kernel is notoriously hard. OSDI '21 accepted 31 papers, and 26 of them participated in the AE, a significant increase in the participation ratio: 84%, compared to OSDI '20 (70%) and SOSP '19 (61%). This budget is a scarce resource that must be carefully managed to maximize the number of successfully trained models. Distributed Trust: Is Blockchain the answer? Authors must make a good faith effort to anonymize their submissions, and they should not identify themselves or their institutions either explicitly or by implication (e.g., through the references or acknowledgments). 
Professor Veloso has been recognized with multiple honors, including being a Fellow of the ACM, IEEE, AAAS, and AAAI. Consensus bugs are bugs that make Ethereum clients transition to incorrect blockchain states and fail to reach consensus with other clients. Pollux promotes fairness among DL jobs competing for resources based on a more meaningful measure of useful job progress, and reveals a new opportunity for reducing DL cost in cloud environments. Cores can safely and concurrently read from their local kernel replica, eliminating remote NUMA accesses.
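The log-based replication idea behind the NrOS sentences above can be illustrated with a minimal sketch. It assumes a single shared append-only log and one replica per NUMA node; the class and method names (`SharedLog`, `Replica`, `write`, `read`) are hypothetical and are not taken from NrOS itself. Writes append to the log, and a replica replays any outstanding log entries before it answers a read from its local copy.

```python
import threading


class SharedLog:
    """Append-only operation log shared by all replicas (hypothetical name)."""

    def __init__(self):
        self._ops = []
        self._lock = threading.Lock()

    def append(self, op):
        # Record the operation and return the log position just past it.
        with self._lock:
            self._ops.append(op)
            return len(self._ops)

    def tail(self):
        with self._lock:
            return len(self._ops)

    def entries(self, start, end):
        with self._lock:
            return self._ops[start:end]


class Replica:
    """Per-node copy of a sequential data structure (here, a plain dict)."""

    def __init__(self, log):
        self._log = log
        self._state = {}
        self._applied = 0  # number of log entries already replayed locally
        self._lock = threading.Lock()

    def _catch_up(self, upto):
        # Replay log entries in order so every replica sees the same history.
        for key, value in self._log.entries(self._applied, upto):
            self._state[key] = value
        self._applied = max(self._applied, upto)

    def write(self, key, value):
        pos = self._log.append((key, value))
        with self._lock:
            self._catch_up(pos)

    def read(self, key):
        # Observe the log tail first, then serve the read from local state.
        tail = self._log.tail()
        with self._lock:
            self._catch_up(tail)
            return self._state.get(key)


log = SharedLog()
node0, node1 = Replica(log), Replica(log)
node0.write("pid_max", 32768)
print(node1.read("pid_max"))
```

A real kernel would use per-node memory, lock-free logs, and batching; this sketch only shows why reads can stay local once a replica has replayed the log.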
Taking place in Carlsbad, CA from 11-13 July, OSDI is a highly selective flagship conference in computer science, especially on the topic of computer systems. With an aim to improve time-to-accuracy performance in model training, Oort prioritizes the use of those clients who have both data that offers the greatest utility in improving model accuracy and the capability to run training quickly. Existing systems that hide voice call metadata either require trusted intermediaries in the network or scale to only tens of users. Federated Learning (FL) is an emerging direction in distributed machine learning (ML) that enables in-situ model training and testing on edge data. Session Chairs: Ryan Huang, Johns Hopkins University, and Manos Kapritsos, University of Michigan. Jianan Yao, Runzhou Tao, Ronghui Gu, Jason Nieh, Suman Jana, and Gabriel Ryan, Columbia University. Web pages today commonly include large amounts of JavaScript code in order to offer users a dynamic experience. Marius is open-sourced at www.marius-project.org.
PET discovers and applies program transformations that improve computation efficiency but only maintain partial functional equivalence. The chairs will review paper conflicts to ensure the integrity of the reviewing process, adding or removing conflicts if necessary. We evaluate PrivateKube and DPF on microbenchmarks and an ML workload on Amazon Reviews data. Authors may use this for content that may be of interest to some readers but is peripheral to the main technical contributions of the paper. We prove that DistAI is guaranteed to find the ∃-free inductive invariant that proves the desired safety properties in finite time, if one exists. Tao Luo, Mingen Pan, Pierre Tholoniat, Asaf Cidon, and Roxana Geambasu, Columbia University; Mathias Lécuyer, Microsoft Research. blk-switch uses this insight to adapt techniques from the computer networking literature (e.g., multiple egress queues, prioritized processing of individual requests, load balancing, and switch scheduling) to the Linux kernel storage stack. 64 papers accepted out of 341 submitted. Hence, kernel developers are constantly refining synchronization within OS kernels to improve scalability at the risk of introducing subtle bugs. OSDI brings together professionals from academic and industrial backgrounds in what has become a premier forum for discussing the design, implementation, and implications of systems software. We demonstrate that Marius achieves the same level of accuracy but is up to one order of magnitude faster. Zeph executes privacy-adhering data transformations in real-time and scales to thousands of data sources, allowing it to support large-scale low-latency data stream analytics. Distributed systems are notoriously hard to implement correctly due to non-determinism. Her robot soccer teams have been RoboCup world champions several times, and the CoBot mobile robots have autonomously navigated for more than 1,000 km in university buildings. 
This is unfortunate because good OS design has always been driven by the underlying hardware, and right now that hardware is almost unrecognizable from ten years ago, let alone from the 1960s when Unix was written. We present Nap, a black-box approach that converts concurrent persistent memory (PM) indexes into NUMA-aware counterparts. We built an FPGA prototype of the nanoPU fast path by modifying an open-source RISC-V CPU, and evaluated its performance using cycle-accurate simulations on AWS FPGAs. Secure Computation (SC) is a family of cryptographic primitives for computing on encrypted data in single-party and multi-party settings.
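To make the multi-party setting concrete, here is a minimal sketch of additive secret sharing, one common building block of secure computation. It is an illustration under stated assumptions, not the primitive used by any particular system mentioned here: the modulus and the function names (`share`, `reconstruct`, `secure_sum`) are hypothetical, and all parties are simulated in a single process.

```python
import secrets

MODULUS = 2**61 - 1  # illustrative modulus; any sufficiently large integer works


def share(value, n_parties):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recombine shares; fewer than all of them reveal nothing about the value."""
    return sum(shares) % MODULUS


def secure_sum(private_inputs):
    """Compute the sum of all inputs without any party revealing its own input."""
    n = len(private_inputs)
    # Party i splits its input into n shares and sends share j to party j.
    distributed = [share(x, n) for x in private_inputs]
    # Party j adds up the shares it received and publishes only that partial sum.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    return reconstruct(partial_sums)


print(secure_sum([3, 5, 10]))  # prints 18
```

Each published partial sum is uniformly random on its own, so only the final total is learned. The result is correct as long as the true sum is smaller than the modulus.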
One important reason for the high cost is, as we observe in this paper, that many sanitizer checks are redundant: the same safety property is repeatedly checked, leading to unnecessarily wasted computing resources. Penglai also reduces the latency of secure memory initialization by three orders of magnitude and gains 3.6x speedup for real-world applications (e.g., MapReduce). The NAL maintains 1) per-node partial views in PM for serving insert/update/delete operations with failure atomicity and 2) a global view in DRAM for serving lookup operations. Welcome to the SOSP 2021 Website. Under different configurations of TPC-C and TPC-E, Polyjuice can achieve throughput numbers higher than the best of existing algorithms by 15% to 56%.
Evaluations show that Vegito can perform 1.9 million TPC-C NewOrder transactions and 24 TPC-H-equivalent queries per second simultaneously, retaining the excellent performance of specialized OLTP and OLAP counterparts (e.g., DrTM+H and MonetDB). The papers will be available online to everyone beginning on the first day of the conference, July 14, 2021. Papers so short as to be considered extended abstracts will not receive full consideration. Author Response Period. While verifying GoJournal, we found one serious concurrency bug, even though GoJournal has many unit tests. We have made Fluffy publicly available at https://github.com/snuspl/fluffy to contribute to the security of Ethereum. (Visa applications can take at least 30 working days to process.) Main conference program: 5-8 April 2022. DistAI generates data by simulating the distributed protocol at different instance sizes and recording states as samples. If your accepted paper should not be published prior to the event, please notify production@usenix.org. We also welcome work that explores the interface to related areas such as computer architecture, networking, programming languages, analytics, and databases. The hybrid segment recycling chooses a proper block reclaiming policy between segment compaction and threaded logging based on their costs. The co-chairs may then share that paper with the workshop's organizers and discuss it with them. Our evaluation shows that NrOS scales to 96 cores with performance that nearly always dominates Linux at scale, in some cases by orders of magnitude, while retaining much of the simplicity of a sequential kernel. To enable FL developers to interpret their results in model testing, Oort enforces their requirements on the distribution of participant data while improving the duration of federated testing by cherry-picking clients. 
In experiments with real DL jobs and with trace-driven simulations, Pollux reduces average job completion times by 37-50% relative to state-of-the-art DL schedulers, even when they are provided with ideal resource and training configurations for every job. Despite their extensive use for debugging and vulnerability discovery, sanitizer checks often induce a high runtime cost. The biennial ACM Symposium on Operating Systems Principles is the world's premier forum for researchers, developers, programmers, and teachers of computer systems technology. DeSearch then introduces a witness mechanism to make sure the completed tasks can be reused across different pipelines, and to make the final search results verifiable by end users. Based on this observation, P3 proposes a new approach for distributed GNN training. As a result, data characteristics and device capabilities vary widely across clients. A scientific paper consists of a constellation of artifacts that extend beyond the document itself: software, hardware, evaluation data and documentation, raw survey results, mechanized proofs, models, test suites, benchmarks, and so on. Therefore, developers typically find data locality issues via dynamic profiling and repair them manually. Storm ensures security using a Security Typed ORM that refines the (type) abstractions of each layer of the MVC API with logical assertions that describe the data produced and consumed by the underlying operation and the users allowed access to that data. Abstract registrations that do not provide sufficient information to understand the topic and contribution (e.g., empty abstracts, placeholder abstracts, or trivial abstracts) will be rejected, thereby precluding paper submission. His work has included the Barrelfish multikernel research OS, as well as work on distributed stream processors, and using formal specifications to describe the hardware/software interfaces of modern computer systems. 
This is especially true for DPF over Rényi DP, a highly composable form of DP. Manuela M. Veloso is the Head of J.P. Morgan AI Research, which pursues fundamental research in areas of core relevance to financial services, including data mining and cryptography, machine learning, explainability, and human-AI interaction.
Our approach effectively eliminates high communication and partitioning overheads, and couples it with a new pipelined push-pull parallelism based execution strategy for fast model training. Fan Lai, Xiangfeng Zhu, Harsha V. Madhyastha, and Mosharaf Chowdhury, University of Michigan. Prior or concurrent workshop publication does not preclude publishing a related paper in OSDI. The main contribution of this paper is GoJournal, a verified, concurrent journaling system that provides atomicity for storage applications, together with Perennial 2.0, a framework for formally specifying and verifying concurrent crash-safe systems. Second, Fluffy uses multiple existing Ethereum clients that independently implement the specification as cross-referencing oracles. Academic and industrial participants present research and experience papers that cover the full range of theory and practice of computer . Furthermore, to enable automatic runtime optimization, GNNAdvisor incorporates a lightweight analytical model for an effective design parameter search. Sponsored by USENIX in cooperation with ACM SIGOPS. She has been recognized with many industry honors including induction into the National Academy of Engineering, the Inventor Hall of Fame, The Internet Hall of Fame, Washington State Academy of Science, and lifetime achievement awards from USENIX and SIGCOMM. Widely used log-search tools like Elasticsearch and Splunk Enterprise index the logs to provide fast search performance, yet the size of the index is within the same order of magnitude as the raw log size. Camera-ready submission (all accepted papers): 15 March 2022. The NVMe zoned namespace (ZNS) is emerging as a new storage interface, where the logical address space is divided into fixed-sized zones, and each zone must be written sequentially for flash-memory-friendly access. Log search and log archiving, despite being critical problems, are mutually exclusive. 
Our evaluation shows that, compared to existing participant selection mechanisms, Oort improves time-to-accuracy performance by 1.2X-14.1X and final model accuracy by 1.3%-9.8%, while efficiently enforcing developer-specified model testing criteria at the scale of millions of clients. will work with the steering committee to ensure that the symposium program will accommodate presentations for all accepted papers. We develop a prototype of Zeph on Apache Kafka to demonstrate that Zeph can perform large-scale privacy transformations with low overhead. If your paper is accepted and you need an invitation letter to apply for a visa to attend the conference, please contact conference@usenix.org as soon as possible. In 2023 I started another two-year term on the . Responses should be limited to clarifying the submitted work.
SOSP 2021 - Symposium on Operating Systems Principles. When uploading your OSDI 2021 reviews for your submission to SOSP, you can optionally append a note about how you addressed the reviews and comments.
Zeph enforces privacy policies cryptographically and ensures that data available to third-party applications complies with users' privacy policies. To remedy this, we introduce DeSearch, the first decentralized search engine that guarantees the integrity and privacy of search results for decentralized services and blockchain apps. And yet, they continue to rely on centralized search engines and indexers to help users access the content they seek and navigate the apps. We implemented the ZNS+ SSD on an SSD emulator and a real SSD. (Oct 2018) Awarded an Intel Faculty Grant for Research on automated performance optimization. (Sep. 2018) Our paper on Foreshadow is accepted to appear at USENIX Security. Submissions violating the detailed formatting and anonymization rules will not be considered for review. We conclude with a discussion of additional techniques for improving the allocator development process and potential optimization strategies for future memory allocators. Based on the observation that real-world workloads always feature skewed access patterns, Nap introduces a NUMA-aware layer (NAL) on top of existing concurrent PM indexes, and steers accesses to hot items to this layer.
For instance, the following are not sufficient grounds to specify a conflict with a PC member: they have reviewed the work before, they are employed by your competitor, they are your personal friend, they were your post-doc advisor or advisee, or they had the same advisor as you. Some recent schedulers choose job resources for users, but do so without awareness of how DL training can be re-optimized to better utilize the provided resources. Our approach outperforms existing file systems on a block SSD by a wide margin (6.2× on average for metadata-intensive benchmarks). In this paper, we present P3, a system that focuses on scaling GNN model training to large real-world graphs in a distributed setting. For conference information, . For example, talks may be shorter than in prior years, or some parts of the conference may be multi-tracked.
For example, optimistic concurrency control (OCC) is better than two-phase-locking (2PL) under low contention, while the converse is true under high contention. Existing decentralized systems like Steemit, OpenBazaar, and the growing number of blockchain apps provide alternatives to existing services. Graph Neural Networks (GNNs) have gained significant attention in the recent past, and have become one of the fastest growing subareas in deep learning. While several new GNN architectures have been proposed, the scale of real-world graphs (in many cases billions of nodes and edges) poses challenges during model training. Horcrux-compliant web servers perform offline analysis of all the JavaScript code on any frame they serve to conservatively identify, for every JavaScript function, the union of the page state that the function could access across all loads of that page. PC members are not required to read supplementary material when reviewing the paper, so each paper should stand alone without it. Our further evaluation on 38 CVEs from 10 commonly-used programs shows that SanRazor-reduced checks suffice to detect at least 33 out of the 38 CVEs.
J.P. Morgan AI Research partners with applied data analytics teams across the firm as well as with leading academic institutions globally. We propose Marius, a system for efficient training of graph embeddings that leverages partition caching and buffer-aware data orderings to minimize disk access and interleaves data movement with computation to maximize utilization. OSDI will provide an opportunity for authors to respond to reviews prior to final consideration of the papers at the program committee meeting. DistAI: Data-Driven Automated Invariant Learning for Distributed Protocols Jianan Yao, Runzhou Tao, Ronghui Gu, Jason Nieh . Mothy's current research centers on Enzian, a powerful hybrid CPU/FPGA machine designed for research into systems software.
We demonstrate the above using the design, implementation, and evaluation of blk-switch, a new Linux kernel storage stack architecture. We describe PrivateKube, an extension to the popular Kubernetes datacenter orchestrator that adds privacy as a new type of resource to be managed alongside other traditional compute resources, such as CPU, GPU, and memory. OSDI takes a broad view of the systems area and solicits contributions from many fields of systems practice, including, but not limited to, operating systems, file and storage systems, distributed systems, cloud computing, mobile systems, secure and reliable systems, systems aspects of big data, embedded systems, virtualization, networking as it relates to operating systems, and management and troubleshooting of complex systems. See the USENIX Conference Submissions Policy for details. Submitted papers must be no longer than 12 single-spaced 8.5-inch x 11-inch pages, including figures and tables, plus as many pages as needed for references, using 10-point type on 12-point (single-spaced) leading, two-column format, Times Roman or a similar font, within a text block 7 inches wide x 9 inches deep. GoJournal is implemented in Go, and Perennial is implemented in the Coq proof assistant. Petuum Awarded OSDI 2021 Best Paper for Goodput-Optimized Deep Learning Research: Petuum CASL research and engineering team's Pollux technical paper on adaptive scheduling for optimized. Session Chairs: Nadav Amit, VMware Research Group, and Ada Gavrilovska, Georgia Institute of Technology. Stephen Ibanez, Alex Mallery, Serhat Arslan, and Theo Jepsen, Stanford University; Muhammad Shahbaz, Purdue University; Changhoon Kim and Nick McKeown, Stanford University. P3 exposes a simple API that captures many different classes of GNN architectures for generality. Existing frameworks optimize tensor programs by applying fully equivalent transformations, which maintain equivalence on every element of output tensors. Questions? 
Third, GNNAdvisor capitalizes on the GPU memory hierarchy for acceleration by gracefully coordinating the execution of GNNs according to the characteristics of the GPU memory structure and GNN workloads. The full program will be available in May 2021. Advisor: You have a past or present association as thesis advisor or advisee. In contrast, CLP achieves a significantly higher compression ratio than all commonly used compressors, yet delivers fast search performance that is comparable to or even better than Elasticsearch and Splunk Enterprise. This fast path contains programmable hardware support for low latency transport and congestion control as well as hardware support for efficient load balancing of RPCs to cores. We focus on NVMe storage devices and show that it is natural to express these semantics in the kernel and the application and only requires a modest two-bit change to the device interface. Researchers from the Software Systems Laboratory bagged Best Paper Awards at the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2021) and the 2021 USENIX Annual Technical Conference (USENIX ATC 2021). Jay Lepreau Best Paper Award, OSDI '21. While compiler-based techniques have been proposed to improve data locality, they depend on heuristics, which can sometimes hurt performance. Perennial 2.0 makes this possible by introducing several techniques to formalize GoJournal's specification and to manage the complexity in the proof of GoJournal's implementation. Performance experiments show that GoNFS provides similar performance (e.g., at least 90% throughput across several benchmarks on an NVMe disk) to Linux's NFS server exporting an ext4 file system, suggesting that GoJournal is a competitive journaling system. Moreover, to handle dynamic workloads, Nap adopts a fast NAL switch mechanism. 
As a result, the design of a file system with respect to space management and crash consistency is simplified, requiring only 10.8K LOC for full functionality. Authors are required to register abstracts by 3:00 p.m. PST on December 3, 2020, and to submit full papers by 3:00 p.m. PST on December 10, 2020. Sam Kumar, David E. Culler, and Raluca Ada Popa, University of California, Berkeley. Welcome to the 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI '22) submissions site. If in doubt about whether your submission to OSDI 2021 and your upcoming submission to SOSP are the same paper or not, please contact the PC chairs by email. We also verified a simple NFS server using GoJournal's specs, which confirms that they are helpful for application verification: a significant part of the proof doesn't have to consider concurrency and crashes. We describe Fluffy, a multi-transaction differential fuzzer for finding consensus bugs in Ethereum. Metadata from voice calls, such as the knowledge of who is communicating with whom, contains rich information about people's lives.