1. Efficient Machine Learning and Model Compression
Deep learning performs well on many tasks but is very expensive in raw storage and computation, especially on edge platforms. In our group, we work on efficient deep learning compression from both algorithmic and hardware perspectives, including but not limited to neural architecture search (NAS), novel quantization/binarization schemes, pruning, knowledge distillation, low-rank factorization, model re-parameterization, and novel floating-point number formats. Automated design space exploration for resource-constrained target platforms through algorithm/compiler/hardware co-design is the main theme of our lab.
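As a minimal sketch of two of the techniques listed above, the snippet below applies global magnitude pruning followed by symmetric 8-bit fake quantization to a small model. It assumes PyTorch; the layer sizes, 50% sparsity target, and 8-bit width are illustrative choices, not our published configurations.

```python
# Minimal sketch (assumption: PyTorch) of magnitude pruning + symmetric 8-bit
# weight quantization; sizes and hyperparameters are illustrative only.
import torch
import torch.nn as nn

def quantize_symmetric(w: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Fake-quantize a weight tensor to signed `num_bits` integers and back to float."""
    qmax = 2 ** (num_bits - 1) - 1              # e.g. 127 for 8 bits
    scale = w.abs().max() / qmax                # per-tensor scale factor
    w_int = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return w_int * scale                        # dequantized weights used for inference

def magnitude_prune(w: torch.Tensor, sparsity: float = 0.5) -> torch.Tensor:
    """Zero out the smallest-magnitude weights until `sparsity` of them are removed."""
    k = int(w.numel() * sparsity)
    threshold = w.abs().flatten().kthvalue(k).values
    return w * (w.abs() > threshold)

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

with torch.no_grad():
    for module in model.modules():
        if isinstance(module, nn.Linear):
            module.weight.copy_(magnitude_prune(module.weight))
            module.weight.copy_(quantize_symmetric(module.weight))
```

In practice, pruning and quantization like this are followed by fine-tuning or quantization-aware training to recover accuracy.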
![](https://faculty.sites.uci.edu/bagherzadehlab/files/2022/06/1.jpg)
![](https://faculty.sites.uci.edu/bagherzadehlab/files/2022/06/2-1.jpg)
![](https://faculty.sites.uci.edu/bagherzadehlab/files/2022/06/3.jpg)
Related Publication:
Qiao, Ye, Mohammed Alnemari, and Nader Bagherzadeh. “A Two-Stage Efficient 3-D CNN Framework for EEG Based Emotion Recognition.” 23rd IEEE International Conference on Industrial Technology (ICIT). IEEE, 2022.
2. Process-in-Memory Architecture and Acceleration
Recently, memristive crossbar arrays have gained considerable attention for performing analog in-memory vector-matrix multiplications in machine learning accelerators with low power and constant computational time. The low power consumption and in-memory computation ability of crossbar arrays make them an attractive platform for analog AI acceleration. However, crossbar arrays have many non-ideal characteristics, such as memristor device imperfections, weight noise, device drift, input/output noise, and DAC/ADC overhead. Thus, our current research in this field explores novel and state-of-the-art machine learning models and their performance on crossbar array-based accelerators, as well as novel architectures for such accelerators. To measure the performance of ML models on analog AI accelerators, we use simulation tools such as the IBM Analog Hardware Acceleration Kit, DNN NeuroSim, and/or SPICE simulations.
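To illustrate the kind of non-idealities listed above, here is a simplified NumPy model of a crossbar vector-matrix multiply with programming noise, read noise, and ADC quantization. This is a toy sketch rather than the IBM Analog Hardware Acceleration Kit or NeuroSim; the noise magnitudes and ADC resolution are illustrative, not measured device values.

```python
# Toy model of a memristive-crossbar vector-matrix multiply with non-idealities.
# Noise levels and ADC resolution below are illustrative assumptions.
import numpy as np

def analog_vmm(x, w, prog_noise=0.02, read_noise=0.01, adc_bits=8):
    """Compute y = x @ w on an idealized crossbar, adding conductance programming
    error, output read noise, and ADC quantization."""
    rng = np.random.default_rng(0)
    # Conductance programming error: each stored weight deviates from its target.
    w_prog = w * (1.0 + prog_noise * rng.standard_normal(w.shape))
    # Analog accumulation along the bit lines (Ohm's law + Kirchhoff's current law).
    y = x @ w_prog
    # Read noise on the accumulated column currents.
    y = y + read_noise * np.abs(y).max() * rng.standard_normal(y.shape)
    # ADC quantization of the column outputs.
    levels = 2 ** adc_bits - 1
    scale = np.abs(y).max() / (levels / 2) + 1e-12
    return np.round(y / scale) * scale

x = np.random.randn(1, 128)
w = np.random.randn(128, 64) * 0.1
print(np.abs(analog_vmm(x, w) - x @ w).mean())  # mean error introduced by non-idealities
```

Sweeping parameters like these against model accuracy is one way to study which architectures tolerate analog noise best.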
![](https://faculty.sites.uci.edu/bagherzadehlab/files/2022/06/4.jpg)
![](https://faculty.sites.uci.edu/bagherzadehlab/files/2022/06/5.jpg)
Related Publication:
Ding, Andrew*, Ye Qiao*, and Nader Bagherzadeh. “BNN an Ideal Architecture for Acceleration With Resistive in Memory Computation.” IEEE Transactions on Emerging Topics in Computing (2023). (Co-first Author)
3. Neural Architecture Search for Low-Power MCUs
Neural architecture search (NAS) has proven to be an effective method for discovering new convolutional neural network (CNN) architectures, especially in scenarios with well-defined optimization targets or constraints. However, most previous works require time-consuming training of supernets or intensive architecture sampling and evaluation. This heavy consumption of computational resources, together with potentially biased search results caused by the use of proxies, limits their applicability and complicates deployment. Furthermore, the lack of hardware awareness makes them difficult to adapt to resource-constrained edge environments such as microcontroller units (MCUs). To address these issues, we propose MicroNAS, a novel hardware-aware zero-shot NAS framework specifically designed for MCUs.
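The sketch below conveys the general shape of a hardware-aware zero-shot search loop: candidates are filtered by an estimated MCU flash footprint and ranked by a training-free proxy. The gradient-norm proxy, the candidate generator, and the 256 KB budget are hypothetical stand-ins for illustration; they are not the actual MicroNAS proxy or constraints.

```python
# Hypothetical hardware-aware zero-shot NAS loop; the proxy and flash budget
# are illustrative assumptions, not the MicroNAS method.
import torch
import torch.nn as nn

def build_candidate(width: int, depth: int) -> nn.Module:
    layers, in_ch = [], 3
    for _ in range(depth):
        layers += [nn.Conv2d(in_ch, width, 3, padding=1), nn.ReLU()]
        in_ch = width
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(width, 10)]
    return nn.Sequential(*layers)

def flash_bytes(model: nn.Module) -> int:
    # Estimate MCU flash footprint assuming 8-bit (1 byte) weights.
    return sum(p.numel() for p in model.parameters())

def zero_cost_score(model: nn.Module) -> float:
    # Training-free proxy: gradient norm at initialization on random data.
    x = torch.randn(8, 3, 32, 32)
    model(x).sum().backward()
    return sum(p.grad.norm().item() for p in model.parameters() if p.grad is not None)

budget = 256 * 1024  # hypothetical 256 KB flash limit
candidates = [build_candidate(w, d) for w in (8, 16, 32) for d in (2, 4, 6)]
feasible = [m for m in candidates if flash_bytes(m) <= budget]
best = max(feasible, key=zero_cost_score)
print(best)
```

Because no candidate is ever trained, the search cost stays proportional to a handful of forward/backward passes, which is what makes the zero-shot setting attractive for MCU-scale design spaces.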
4. Efficient and Partially Reconfigurable Hardware Accelerator Engine for Transformers