Official notification with all details will be sent out by February 28, 2020.
Paper ID | Paper Title |
295-1548 | 3D CNN Acceleration on FPGA using Hardware-Aware Pruning |
295-1747 | A 90nm 103.14 TOPS/W Binary-Weight Spiking Neural Network CMOS ASIC for Real-Time Object Classification |
295-1493 | A Cross-Layer Power and Timing Evaluation Method for Wide Voltage Scaling |
295-1185 | A Device Non-Ideality Resilient Approach for Mapping Neural Networks to Crossbar Arrays |
295-1911 | A Formal Approach for Detecting Vulnerabilities to Transient Execution Attacks in Out-of-order Processors |
295-2218 | A History-based Auto-tuning Framework for Fast and High-performance DNN Design on GPU |
295-2229 | A Machine Learning Approach for Reliability-Aware Application Mapping for Heterogeneous Multicores |
295-2283 | A Novel GPU Overdrive Fault Attack |
295-1375 | A Pragmatic Approach to On-device Incremental Learning System with Selective Weight Updates |
295-1414 | A Provably Good Wavelength-Division-Multiplexing-Aware Clustering Algorithm for On-Chip Optical Routing |
295-1975 | A Robust Exponential Integrator Method for Generic Nonlinear Circuit Simulation |
295-2092 | A Simple Cache Coherence Scheme for Integrated CPU-GPU Systems |
295-1488 | A Two-way SRAM Array based Accelerator for Deep Neural Network On-chip Training |
295-2352 | A Versatile and Flexible Chiplet-based System Design for Heterogeneous Manycore Architectures |
295-1131 | Access Characteristic Guided Partition for Read Performance Improvement on Solid State Drives |
295-2255 | Accurate Inference with Inaccurate RRAM Devices: Statistical Data, Model Transfer, and On-line Adaptation |
295-1814 | Adaptive Layout Decomposition with Graph Embedding Neural Networks |
295-1768 | Adjoint Transient Sensitivity Analysis for Objective Functions Associated to Many Time Points |
295-2235 | AHEC: End-to-end Compiler Framework for Privacy-preserving Machine Learning Acceleration |
295-1416 | ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks |
295-1630 | Algorithm-Hardware Co-Design for In-Memory Neural Network Computing with Minimal Peripheral Circuit Overhead |
295-1594 | Algorithm-Hardware Co-Design of Adaptive Floating-Point Encodings for Resilient Deep Learning Inference |
295-1857 | ALSRAC: Approximate Logic Synthesis by Resubstitution with Approximate Care Set |
295-1764 | An Efficient and Robust Yield Optimization Method for High-dimensional SRAM Circuits |
295-1700 | An Efficient Asynchronous Batch Bayesian Optimization Approach for Analog Circuit Synthesis |
295-1625 | An Efficient Circuit Compilation Flow for Quantum Approximate Optimization Algorithm |
295-2312 | An Efficient Critical Path Generation Algorithm Considering Extensive Path Constraints |
295-1858 | An Efficient Deep Learning Accelerator for Compressed Video Analysis |
295-1156 | An Efficient EPIST Algorithm for Global Placement with Non-Integer Multiple-Height Cells |
295-1665 | Analysis and Optimization of the Implicit Broadcasts in FPGA HLS to Improve Maximum Frequency |
295-1970 | Anonymous: Detailed-Routability-Driven 3D Global Routing with Probabilistic Resource Model |
295-2012 | ApproxFPGAs: Embracing ASIC-based Approximate Arithmetic Components for FPGA-Based Systems |
295-2061 | A-QED Verification of Hardware Accelerators |
295-1521 | ATUNs: Modular and Scalable Support for Atomic Operations in a Shared Memory Multiprocessor |
295-2128 | AXI HyperConnect: A Predictable, Hypervisor-level AXI Interconnect for Hardware Accelerators in FPGA SoC |
295-1054 | Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration |
295-1165 | Best of Both Worlds: AutoML Codesign of a CNN and its Hardware Accelerator |
295-2081 | Bit parallel 6T SRAM In-memory Computing with Reconfigurable Bit-Precision |
295-1568 | Bit-Parallel Vector Composability for Neural Acceleration |
295-2004 | BitPruner: Network Pruning for Bit-Serial Accelerators |
295-1727 | BPNet: Branch-pruned Conditional Neural Network for Systematic Time-accuracy Tradeoff |
295-1595 | BPU: A Blockchain Processing Unit for Accelerated Smart Contract Execution |
295-1104 | BrezeFlow: Unified Debugger for Android CPU Power Governors and Schedulers on Edge Devices |
295-1928 | Camouflage: Hardware-assisted CFI for the ARM Linux kernel |
295-2272 | CAP'NN: Class-Aware Personalized Neural Network Inference |
295-1797 | CDRing: Reconfigurable Ring Architecture by Exploiting Cycle Decomposition of Torus Topology |
295-1192 | Centaur: Hybrid Processing in On/Off-chip Memory Architecture for Graph Analytics |
295-1540 | Characterization and Applications of Spatial Variation Models for Silicon Microring-Based Optical Transceivers |
295-1478 | Circuit Learning for Logic Regression on High Dimensional Boolean Space |
295-1429 | CL(R)Early: An Early-stage DSE Methodology for Cross-layer Reliability-aware Heterogeneous Embedded Systems |
295-2183 | Closing the Design Loop: Bayesian Optimization Assisted Hierarchical Analog Layout Synthesis |
295-1972 | Closing the RISC-V Compliance Gap: Looking from the Negative Testing Side |
295-1919 | Clustering approach for solving traveling salesman problems via Ising model based solver |
295-1437 | CODAR: A Contextual Duration-Aware Qubit Mapping for Various NISQ Devices |
295-1259 | COEXE: An Efficient Co-execution Architecture for Real-Time Neural Network Services |
295-1601 | Co-Exploration of Neural Architectures and Heterogeneous ASIC Accelerator Designs Targeting Multiple Tasks |
295-1366 | CoinPurse: A Device-Assisted File System with Dual Interfaces |
295-1776 | Compact domain-specific co-processor for accelerating module lattice-based key encapsulation mechanism |
295-1952 | Content Sifting Storage: Achieving Fast Read for Large-scale Image Dataset Analysis |
295-1238 | Convergence-aware Neural Network Training |
295-2205 | CRAFFT: High Resolution FFT Accelerator In Spintronic Computational RAM |
295-2311 | CryptoPIM: In-Memory Acceleration for RLWE Lattice-based Cryptography |
295-1444 | DDOT: Data Driven Online Tuning for energy efficient acceleration |
295-1734 | DECOY: DEflection-Driven HLS-Based Computation Partitioning for Obfuscating Intellectual PropertY |
295-1543 | Deep Learning Multi-Channel Fusion Attack Against Side-Channel Protected Hardware |
295-1696 | Deep Learning-Driven Simultaneous Layout Decomposition and Mask Optimization |
295-2354 | Defending Bit-Flip Attack through DNN Weight Reconstruction |
295-1258 | Don’t-Care-Based Node Minimization for Threshold Logic Networks |
295-1536 | DPCP-p: A Distributed Locking Protocol for Parallel Real-Time Tasks |
295-1850 | DRAMDig: A Knowledge-assisted Tool to Uncover DRAM Address Mapping |
295-2079 | DRMap: A Generic DRAM Data Mapping Policy for Energy-Efficient Processing of Convolutional Neural Networks |
295-1513 | DVFS-Based Scrubbing Scheduling for Reliability Maximization on Parallel Tasks in SRAM-based FPGAs |
295-1965 | Dynamic Information Flow Tracking for Embedded Binaries using SystemC-based Virtual Prototypes |
295-1765 | EANeM: Energy-Aware Network Stack Management for Mobile Devices |
295-2219 | EDD: Efficient Differentiable DNN architecture and implementation co-search for embedded AI solutions |
295-1495 | Efficient Multi-Grained Wear Leveling for Inodes of Persistent Memory File Systems |
295-1689 | Efficiently Exploiting Low Activity Factors to Accelerate RTL Simulation |
295-1855 | Eliminating Redundant Computation in Noisy Quantum Computing Simulation |
295-2007 | EMAP: A Cloud-Edge Hybrid Framework for EEG Monitoring and Cross-Correlation Based Real-time Anomaly Prediction |
295-1383 | Enabling a B+-tree-based Data Management Scheme for Key-value Store over SMR-based SSHD |
295-2144 | Enhancing Thread-Level Parallelism in Asymmetric Multicores using Transparent Instruction Offloading |
295-1610 | Exploiting Computation Reuse for Stencil Accelerators |
295-1705 | Exploiting Zero Data to Reduce Register File and Execution Unit Dynamic Power Consumption in GPGPUs |
295-2178 | Exploration of Design Space and Runtime Optimization for Affective Computing in Machine Learning Empowered Ultra-low Power SoC |
295-1299 | Exploring a Bayesian Optimization Framework Compatible with Digital Standard Flow for Soft-Error-Tolerant Circuit |
295-1291 | Exploring Inherent Sensor Redundancy for Automotive Anomaly Detection |
295-1806 | Extending the RISC-V ISA for Efficient RNN-based 5G Radio Resource Management |
295-1752 | Factored Radix-8 Systolic Array for Tensor Processing |
295-1308 | Fast and Accurate Wire Timing Estimation on Tree and Non-Tree Net Structures |
295-1562 | Fast and Efficient Processing-in-Memory Accelerator for Collision Detection |
295-1762 | FCNNLib: An Efficient and Flexible Convolution Algorithm Library on FPGAs |
295-2287 | Flashmark: Watermarking of NOR Flash Memories for Counterfeit Detection |
295-2359 | FlexReduce: Flexible All-reduce for Distributed Deep Learning on Asymmetric Network Topology |
295-1515 | FLOPS: Efficient On-Chip Learning for Optical Neural Networks Through Stochastic Zeroth-Order Optimization |
295-2017 | From Homogeneous to Heterogeneous: Leveraging Deep Learning based Power Analysis across Devices |
295-1894 | FTDL: A Tailored FPGA-Overlay for Deep Learning with High Scalability |
295-1626 | GENIEx: A Generalized Approach to Emulating Non-Idealities in Memristive X-bars using Neural Networks |
295-1243 | GPNPU: Enabling Efficient Hardware-Based Direct Convolution with Multi-Precision Support in GPU Tensor Cores |
295-1572 | GRANNITE: Graph Neural Network Inference for Transferable Power Estimation |
295-1467 | GUI-Enhanced Layout Generation of FFE SST TXs for Fast High-Speed Serial Link Design |
295-1385 | Hamiltonian Path Based Mixed-Cell-Height Legalization for Neighbor Diffusion Effect Mitigation |
295-1712 | Hardware Acceleration of Graph Neural Networks |
295-1197 | Hardware-Assisted Intellectual Property Protection of Deep Learning Models |
295-1087 | Hardware-assisted Service Live Migration in Resource-limited Edge Computing Systems |
295-2131 | Hawkware: Network Intrusion Detection based on Behavior Analysis with ANNs on an IoT Device |
295-1945 | High PE Utilization CNN Accelerator with Channel Fusion Supporting Pattern-Compressed Sparse Neural Networks |
295-1034 | HITTSFL: Design of a Cost-Effective HIS-Insensitive TNU-Tolerant and SET-Filtering Latch for Safety-Critical Applications |
295-1103 | How to Cut Out Expired Data with Nearly Zero Overhead for Solid-State Drives |
295-2189 | HybridDNN: A Framework for High-Performance Hybrid DNN Accelerator Design and Implementation |
295-1634 | ICS Protocol Fuzzing: Coverage Guided Packet Crack and Generation |
295-2025 | Impeccable Circuits II |
295-1362 | Imperceptible Misclassification Attack on Deep Learning Accelerator by Glitch Injection |
295-1244 | Improving the Concurrency Performance of Persistent Memory Transactions on Multicores |
295-1171 | INCA: INterruptible CNN Accelerator for Multi-tasking in Embedded Robots |
295-1494 | Input-Dependent Edge-Cloud Mapping of Recurrent Neural Networks Inference |
295-2133 | Intermittent Inference with Nonuniformly Compressed Multi-Exit Neural Network for Energy Harvesting Powered Devices |
295-2160 | Just Like the Real Thing: Fast Weak Simulation of Quantum Computation |
295-1462 | KFR: Optimal Cache Management with K-Framed Reclamation for Drive-Managed SMR Disks |
295-1186 | Kite: A Family of Heterogeneous Interposer Topologies Enabled via Accurate Interconnect Modeling |
295-1729 | Latch Clustering for Timing-Power Co-Optimization |
295-1358 | Lattice: An ADC-DAC-less ReRAM-based Processing-In-Memory Architecture for Accelerating Deep Convolution Neural Networks |
295-2268 | Layer RBER Variation Aware Read Performance Optimization for 3D Flash Memories |
295-2171 | Learning Concise Models from Long Execution Traces |
295-1956 | Learning From A Big Brother - Mimicking Neural Networks in Profiled Side-channel Analysis |
295-1379 | Learning to Predict IR Drop with Effective Training for ReRAM-based Neural Network Hardware |
295-2109 | Learning to Quantize Deep Neural Networks: A Competitive-Collaborative Approach |
295-1497 | LOFFS: A Low-Overhead File System for Large Flash Memory on Embedded Devices |
295-1966 | LoPher: SAT-Hardened Logic Embedding on Block Ciphers |
295-1672 | Low-Power Acceleration of Deep Neural Network Training Using Computational Storage Devices |
295-1286 | Machine Learning to Set Meta-Heuristic Specific Parameters for High-Level Synthesis Design Space Exploration |
295-2375 | Massively Parallel Approximate Simulation of Quantum Circuits |
295-1553 | MLParest: Machine Learning based Parasitic Estimation for Custom Circuit Design |
295-2347 | Monitoring the Health of Emerging Neural Network Accelerators with Cost-effective Concurrent Test |
295-1159 | Multiplicative Complexity of Autosymmetric Functions: Theory and Applications to Security |
295-1108 | NACU: A Non-Linear Arithmetic Unit for Neural Networks |
295-1325 | Navigator: Dynamic Multi-kernel scheduling to improve GPU performance |
295-2159 | Non-uniform DNN Structured Subnets Sampling for Dynamic Inference |
295-1453 | O-2A: Low Latency DNN Compression with Outlier-Aware Approximation |
295-1642 | On Computing Exact WCRT for DAG Task |
295-1224 | On Countermeasures against the Thermal Covert Channel Attacks Targeting Many-core Systems |
295-1390 | On the Security of Strong Memristor-based Physically Unclonable Functions |
295-2253 | Opportunistic Intermittent Control with Safety Guarantees for Autonomous Systems |
295-1862 | PAIR: Pin-aligned In-DRAM ECC architecture using expandability of Reed-Solomon code |
295-2056 | ParaGraph: Layout Parasitics and Device Parameter Prediction using Graph Neural Networks |
295-1697 | PattPIM: A Practical ReRAM-based DNN Accelerator by Reusing Weight Pattern Repetitions |
295-1040 | PCNN: Pattern-based Fine-Grained Regular Pruning towards Optimizing CNN Accelerators |
295-2069 | PEMACx: A Probabilistic Error Analysis Methodology for Adders with Cascaded Approximate Units |
295-1786 | Permutation-Write: Optimizing Write Performance and Energy for Skyrmion Racetrack Memory |
295-1431 | PETNet: Polycount and Energy Trade-off Deep Networks for Producing 3D Objects from Images |
295-2059 | PIM-Assembler: A Processing-in-Memory Platform for Genome Assembly |
295-1365 | PIM-Prune: Fine-Grain DCNN pruning for Crossbar-based Process-In-Memory architecture |
295-1095 | PISCES: Power-Aware Implementation of SLAM by Customizing Efficient Sparse Algebra |
295-2110 | Predictable Memory-CPU Co-Scheduling with Support for Latency-sensitive Tasks |
295-1937 | Prediction Confidence based Low Complexity Gradient Computation for Accelerating DNN Training |
295-2251 | Prive-HD: Privacy-Preserved Hyperdimensional Computing |
295-1376 | Proactive Aging Mitigation in CGRAs through Utilization-Aware Allocation |
295-2048 | Probabilistic Error Propagation through Approximated Boolean Networks |
295-2119 | Pythia: Intellectual Property Verification in Zero-Knowledge |
295-1278 | Q-CapsNets: A Specialized Framework for Quantizing Capsule Networks |
295-1129 | Q-PIM: A Genetic Algorithm based Flexible DNN Quantization Method and Application to Processing-In-Memory Platform |
295-2265 | R2D3: A Reliability Engine for 3D Parallel Systems |
295-2003 | RaQu: An automatic high-utilization CNN quantization and mapping framework for general-purpose RRAM Accelerator |
295-1898 | Realistic Fault Models and Fault Simulation for Quantum Dot Quantum Circuits |
295-1828 | Reduced DRAM Caching |
295-1394 | Reducing Bit Writes in Non-volatile Main Memory by Similarity-aware Compression |
295-2291 | Reducing DRAM Access Latency via Helper Rows |
295-1602 | RELIC-FUN: Logic Identification through Functional Signal Comparisons |
295-1042 | Remote Atomic Extension (RAE) for Scalable High Performance Computing |
295-1635 | ReSiPE: ReRAM-based Single-Spiking Processing-In-Memory Engine |
295-1372 | ReTriple: Reduction of Redundant Rendering on Android Devices for Performance and Energy Optimizations |
295-2293 | Reverse Engineering Deep Neural Networks Using Floating-point Timing Side-channel |
295-2021 | Robust Design of Large Area Flexible Electronics via Compressed Sensing |
295-2304 | Romeo: Conversion and Evaluation of HDL Designs in the Encrypted Domain |
295-1135 | ROPAD: A Fully Digital Highly Predictive Ring Oscillator Probing Attempt Detector |
295-1481 | Routing Topology and Time-Division Multiplexing Co-Optimization for Multi-FPGA Systems |
295-1574 | RTMobile: Beyond Real-Time Mobile Acceleration of RNNs for Speech Recognition |
295-1698 | Runtime Trust Evaluation and Hardware Trojan Detection Using On-Chip EM Sensors |
295-1699 | S-ADAPT: Adaptive Low-Power Sensing and Activity Recognition for Wearable Devices |
295-1608 | SAT-Sweeping Enhanced for Logic Synthesis |
295-2308 | SCA: A Secure CNN Accelerator for both Training and Inference |
295-1321 | Scalable Multi-FPGA Acceleration for Large RNNs with Full Parallelism Levels |
295-1890 | S-CDA: A Smart Cloud Disk Allocation Approach in Cloud Block Storage System |
295-1046 | Seesaw: End-to-end Machine Learning based Dynamic Sensing for IoT |
295-2187 | SFO: A Scalable Approach to Fanout-Bounded Logic Synthesis for Emerging Technologies |
295-2358 | SHIELDeNN: Online Accelerated Framework for Fault-Tolerant Deep Neural Network Architectures |
295-1269 | SIEVE: Speculative Inference on the Edge with Versatile Exportation |
295-1476 | SparseTrain: Exploiting Dataflow Sparsity for Efficient Convolutional Neural Networks Training |
295-1058 | Statistical Timing Analysis considering Multiple-Input Switching |
295-2026 | StatSAT: A Boolean Satisfiability based Attack on Logic-Locked Probabilistic Circuits |
295-1947 | STC: Significance-aware Transform-based Codec Framework for External Memory Access Reduction |
295-2202 | Stealing Your Data from Compressed Machine Learning Models |
295-1265 | Symbolic Computer Algebra and SAT Based Information Forwarding for Fully Automatic Divider Verification |
295-1713 | T2FSNN: Deep Spiking Neural Networks with Time-to-first-spike Coding |
295-1971 | TAEM: On-Chip Transfer-Aware Effective Loop Mapping for CGRAs |
295-1262 | Tail: An Automated and Lightweight Gradient Compression Framework for Distributed Deep Learning |
295-1624 | Taming Unstructured Sparsity on GPUs via Latency-Aware Optimization |
295-1411 | TCIM: Triangle Counting Acceleration With Processing-In-MRAM Architecture |
295-2121 | TDP-ADMM: A Timing Driven Placement Approach for Superconductive Electronic Circuits Using Alternating Direction Method of Multipliers |
295-1105 | Tensor Virtualization Technique to Support Efficient Data Reorganization for CNN Accelerators |
295-1742 | TEVoT: Timing Error Modeling of Functional Units under Dynamic Voltage and Temperature Variations |
295-2089 | The Best of Both Worlds: Combining CUDA Graph with an Image Processing DSL |
295-1110 | The Power of Simulation for Equivalence Checking in Quantum Computing |
295-1611 | The Tao of PAO: Anatomy of a Pin Access Oracle for Detailed Routing |
295-1728 | Tier-Scrubbing: An Adaptive and Tiered Disk Scrubbing Scheme with Improved MTTD and Reduced Cost |
295-1144 | Tight Compression: Compressing CNN Model Tightly Through Unstructured Pruning and Simulated Annealing Based Permutation |
295-1989 | Time Multiplexing via Circuit Folding |
295-1172 | Time-Division Multiplexing Based System-Level FPGA Routing for Logic Verification |
295-2182 | Timing-Accurate General-Purpose I/O for Multi- and Many-Core Systems: Scheduling and Hardware Support |
295-1417 | Topological Structure and Physical Layout Codesign for Wavelength-Routed Optical Networks-on-Chip |
295-1676 | Towards Memory-Efficient Streaming Processing with Counter-Cascading Sketching on FPGA |
295-1473 | Towards Purposeful Design Space Exploration of Heterogeneous CGRAs: Clock Frequency Estimation |
295-1710 | Towards State-Aware Computation in ReRAM Neural Networks |
295-2203 | TP-GNN: A Graph Neural Network Framework for Tier Partitioning in Monolithic 3D ICs |
295-1194 | Tracing Cache Side Channels via Reuse Distance Analysis |
295-1819 | Transfer Learning-Based Microfluidic Design System for Customized Concentration Generation |
295-2094 | TrojDRL: Evaluation of Backdoor Attacks on Deep Reinforcement Learning |
295-1950 | TSN-Builder: Enabling Rapid Customization of Resource-Efficient Switches for Time-Sensitive Networking |
295-1140 | TTS: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning |
295-1889 | TYMER: A Yield-based Performance Model for Timing-speculation SRAM |
295-1722 | UEFI Firmware Fuzzing with Simics Virtual Platform |
295-1796 | Utilizing Direct Photocurrent Computation and 2D Kernel Scheduling to Improve In-Sensor-Processing Efficiency |
295-1959 | VarSim: A Fast and Accurate Variability and Leakage Aware Thermal Simulator |
295-1968 | Verification for Field-coupled Nanocomputing Circuits |
295-1457 | Via-based Redistribution Layer Routing for InFO Packages with Irregular Pad Structures |
295-2254 | Wafer Map Defect Patterns Classification using Deep Selective Learning |
295-1980 | WarningNet: A Deep Learning Platform for Early Warning of Task Failures under Input Perturbation for Reliable Autonomous Platforms |
295-1953 | WET: Write Efficient Loop Tiling for Non-Volatile Main Memory |
295-2108 | ZENCO: Zero-bytes based ENCOding for Non-Volatile Buffers in On-Chip Interconnects |
DAC is the premier conference devoted to the design and automation of electronic systems (EDA), embedded systems and software (ESS), and intellectual property (IP).