Title: False Concurrency and strange-but-True Machines
Abstract: Concurrency theory and real-world multiprocessors have developed in parallel for the last 50 years, from their beginnings in the mid 1960s. Both have been very productive: concurrency theory has given us a host of models, calculi, and proof techniques, while engineered multiprocessors are now ubiquitous, from 2--8 core smartphones and laptops through to servers with 1024 or more hardware threads. But the fields have scarcely communicated, and the shared-memory interaction primitives offered by those mainstream multiprocessors are very different from the theoretical models that have been heavily studied.
My colleagues and I have been working at this interface: establishing rigorous and accurate concurrency semantics for multiprocessors (x86, IBM POWER, and ARM) and for the C11 and C++11 programming languages, and reasoning about them, developing the CompCertTSO verified compiler from a concurrent C-like language to x86, verified compilation schemes from C/C++11 to POWER/ARM, and verified concurrent algorithms. The models and reasoning principles are new, but we draw on the toolbox established in the theoretical world. In this talk I'll highlight a few examples of this work.
Bio: Peter Sewell is a Reader in Computer Science and an EPSRC Leadership Fellow at the Computer Laboratory, University of Cambridge, and a Fellow of Wolfson College. His website: http://www.cl.cam.ac.uk/~pes20/
Title: Network Time Synchronization: a Full Hardware Approach
Jorge Juan, Julián Viejo, Manuel J. Bellido
Abstract: Complex digital systems are typically built on top of several abstraction levels: digital, RTL, computer, operating system, and software application. Each abstraction level, together with design automation tools, greatly facilitates the design task, at the cost of performance (timing and power) and hardware resource usage.
Network time synchronization is a good example of a complex system spanning several abstraction levels, since the traditional solution used, e.g., in Internet servers and routers is a software application running on top of several software and hardware layers. In this talk we study the case where a standards-compliant network time synchronization solution is implemented entirely in hardware on an FPGA chip, without any software layer. This solution makes it possible to build very compact, inexpensive, and accurate synchronization systems to be used either stand-alone or as embedded cores. Some general aspects of the design experience are discussed, such as development costs and the available platforms and tools. As a conclusion, full hardware implementation of complex digital systems should be seen as a feasible design option from which great performance advantages can be expected, provided that a suitable set of tools can be found and the design costs kept under control.
Bio: Dr. Jorge Juan received the BSc degree (1994) and the PhD degree in Physics (2000) from the University of Seville, Spain. He is currently an Associate Professor in the Electronics Technology Department at the same university, where he leads the Research and Development Group. He was also with the Institute of Microelectronics of Seville, part of the National Center of Microelectronics in Spain (CNM-CSIC), from 1995 to 2007. Dr. Juan has done research in the areas of metastability, delay modelling, timing and power simulation, and digital embedded systems, where he has authored two complete books, one book chapter, numerous research papers in indexed journals, and more than 40 conference papers.
He has been a guest editor for Springer-Verlag's Lecture Notes in Computer Science and the IEE Proceedings on Computers and Digital Techniques. He has been a member of the steering and program committees of the IEEE Workshop on Power and Timing Modelling (PATMOS) since 2002. He has also been a scientific consultant for the University of Siena, the Spanish National Commission for Scientific Research Activity, several international journals such as IEEE Transactions on Computers and Elsevier's Integration, the VLSI Journal, and international conferences such as ISCAS and DATE. He has also participated in a number of European and national research projects funded by the Spanish Government.
Title: Radically improving IC power and performance through Relative Timing
Abstract: Time is ubiquitous - so much so that we rarely think about its true role.
In many ways the act of measuring timing, and the value that is returned, has permeated the way we think about this omnipresent property. It may be argued that we focus on the value of what is measured, converted to megahertz or gigahertz, rather than the function of time. The stopwatch, metronomic cadence, and ability to regularly break time into fixed quantities such as days, hours, minutes, or a fixed cycle time have become the de facto model for time and how we apply it to building, modeling, and reasoning about the systems we create. However, this infusion of how we measure time into how we model time and interact with it results in a very restricted view and application of timing to our circuits and systems. We can observe that properties of the physical world, as well as computation, rarely if ever occur on precise boundaries. While frequency-based responses are common in nature, they occur with significant inherent variation. Thus now is a good time to be reminded of the life of Alan Turing, who worked things out from first principles rather than following the conventional wisdom of the day. Taking a cue from his life, by evaluating time from first principles we will clearly observe that the key property of time is not its duration, or the value that we measure. Rather, the key property of time is the ordering or sequencing that it introduces into systems. This has resulted in a new theory for modeling timing called relative timing.
A simple logical representation of timing is presented that models the sequencing property of timing. The most interesting and challenging aspect of any design is concurrency and synchronization. The relative timing model elucidates these aspects of system design by making the key synchronization points in a system, and their relationships, explicit. The mathematical representation of timing is directly applied to model, synthesize, and verify the synchronization of concurrent processes. Relative timing also lowers the cost of a design by reducing power and area while concurrently increasing performance. All active and passive components employed to create a design introduce delay, so relative timing constraints can often be introduced into a system at no additional cost in logic. Indeed, adding knowledge of timing conditions and the sequencing they introduce often allows one to reduce the amount of hardware (and subsequently the power and area) in a design. The relative timing model can be applied to any system where timing is a factor, such as detecting a sequence of similarly spaced events in time. The modeling and application of relative timing to integrated circuit designs will be presented. It will be shown that such an approach can result in an average reduction in power to one-third that of traditional clocked system design over a very broad range of applications.
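To make the ordering-not-duration idea concrete, here is a minimal sketch (not from the talk; the event names and delay values are illustrative assumptions) of a relative timing constraint expressed purely as an ordering of two events that diverge from a common starting point, checked against delays that vary:

```python
import random

# A relative timing constraint orders two events rather than bounding
# their absolute delays: e.g. "data must arrive at a latch before the
# clock edge". Only the resulting ORDER matters, not the measured values.

def satisfies(constraint, times):
    a, b = constraint  # event a must occur before event b
    return times[a] < times[b]

def simulate(data_delay, clk_delay, jitter, rng):
    # Both paths diverge from a common event at t = 0; each path's
    # delay varies from run to run.
    return {
        "data": data_delay + rng.uniform(-jitter, jitter),
        "clk":  clk_delay  + rng.uniform(-jitter, jitter),
    }

rng = random.Random(1)
constraint = ("data", "clk")
# Nominal data delay 1.0, clock delay 2.0, jitter +/-0.4: the nominal
# margin (1.0) exceeds the worst-case combined jitter (0.8), so the
# ordering holds on every run despite the delay variation.
ok = all(satisfies(constraint, simulate(1.0, 2.0, 0.4, rng))
         for _ in range(10_000))
print(ok)  # True
```

The point of the sketch is that the constraint never mentions a clock period or a measured duration; it is a logical assertion about sequencing, which is what makes it usable for synthesis and verification of concurrent processes.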
Bio: Kenneth S. Stevens is an Associate Professor at the University of Utah. Prior to Utah, Ken worked at Intel's Strategic CAD Lab in Hillsboro, Oregon, where he developed timing technology for the double frequency ALU cores, multiple input switching validation, and the design of the front end of the Pentium Processor with asynchronous circuits that operated at 3.5 GHz when the lead processor's fastest clock speed was 450 MHz. Prior to Intel, Ken was an Assistant Professor at the Air Force Institute of Technology (AFIT) in Dayton, Ohio, where he developed asynchronous communication chips for space applications. Ken received his Ph.D. at the University of Calgary, where he researched the verification of sequential circuits and systems. Before that he worked at Fairchild Labs for AI Research and Hewlett Packard Labs, where he developed an asynchronous circuit synthesis methodology called "burst mode" and designed and fabricated an ultra high bandwidth communication chip for distributed memory multiprocessors. He received three degrees from the University of Utah, including a B.A. in Biology and B.S. and M.S. degrees in Computer Science. Ken has published in journals and conferences, has 10 patents, and is a Senior Member of the IEEE. He also created a successful software startup company, has developed software for the GNU project, and serves on program committees and as a conference chair. His current research focus includes novel timing verification technology that transforms circuit timing into logical expressions, network fabric and desynchronized pipeline designs, and asynchronous circuits and systems.
Title: Synthesizing Logical Computation on Stochastic Bit Streams
Abstract: Most digital systems operate on a positional representation of data, such as binary radix. A positional representation is a compact way to encode signal values: in binary radix, 2^n distinct values can be represented with n bits. However, operating on it requires complex logic: in each operation, such as addition or multiplication, the signal must be "decoded," with the higher-order bits weighted more than the lower-order bits. We advocate an alternative representation: random bit streams where the signal value is encoded by the probability of obtaining a one versus a zero. This representation is much less compact than binary radix. However, complex operations can be performed with very simple logic. For instance, multiplication can be performed with a single AND gate. Also, because the representation is uniform, with all bits weighted equally, it is highly tolerant of soft errors (i.e., bit flips). In this talk, we will discuss a general method for synthesizing digital circuitry that computes on such stochastic bit streams. Our method can be used to synthesize arbitrary polynomial functions. Through polynomial approximations, it can also be used to synthesize non-polynomial functions. Experiments on functions used in image processing show that our method produces circuits that are highly tolerant of input errors, with accuracy degrading gracefully as the error rate grows. For applications that mandate simple hardware producing relatively low-precision computation very reliably, our method is a winning proposition.
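The AND-gate multiplication mentioned above can be demonstrated in a few lines. The sketch below (a software simulation of the encoding, with stream length and seed chosen arbitrarily) encodes two values as independent random bit streams and ANDs them bitwise; since P(a_i AND b_i = 1) = P(a_i = 1) * P(b_i = 1) for independent bits, the fraction of ones in the output stream approximates the product:

```python
import random

def to_stream(p, n, rng):
    # Encode a value p in [0, 1] as a random bit stream of length n:
    # each bit is independently 1 with probability p.
    return [1 if rng.random() < p else 0 for _ in range(n)]

def from_stream(bits):
    # Decode: the encoded value is the fraction of ones in the stream.
    return sum(bits) / len(bits)

def stochastic_multiply(a, b, n=100_000, seed=0):
    rng = random.Random(seed)
    sa = to_stream(a, n, rng)
    sb = to_stream(b, n, rng)
    # One AND gate per bit position multiplies the encoded values,
    # because the two streams are statistically independent.
    product = [x & y for x, y in zip(sa, sb)]
    return from_stream(product)

print(stochastic_multiply(0.5, 0.4))  # close to 0.20
```

Note the trade-off the abstract describes: a flip of any single bit changes the decoded value by only 1/n, which is why the representation tolerates soft errors so well, but n bits encode only about n distinct values rather than 2^n.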
Bio: Marc Riedel is an Associate Professor of Electrical and Computer Engineering at the University of Minnesota, where he was an Assistant Professor from 2006 to 2011. He is also a member of the Graduate Faculty in Biomedical Informatics and Computational Biology. From 2004 to 2005, he was a lecturer in Computation and Neural Systems at Caltech. He has held positions at Marconi Canada, CAE Electronics, Toshiba, and Fujitsu Research Labs. He received his Ph.D. and M.Sc. in Electrical Engineering at Caltech and his B.Eng. in Electrical Engineering with a Minor in Mathematics at McGill University. His Ph.D. dissertation, "Cyclic Combinational Circuits," received the Charles H. Wilts Prize for the best doctoral research in Electrical Engineering at Caltech. His paper "The Synthesis of Cyclic Combinational Circuits" received the Best Paper Award at the Design Automation Conference. He is a recipient of the NSF CAREER Award.