| Mohammad Hajizadeh Saffar, a doctoral student of Dr. Adel Torkaman Rahmani, will defend his doctoral dissertation, titled "Real-Time Object Detection in Video on Embedded Devices Using Deep Neural Networks", on 1403/11/08 at 8:00 AM. |
Presenter:
Mohammad Hajizadeh Saffar
Supervisor:
Dr. Adel Torkaman Rahmani
Advisors:
Dr. Mahmood Fathi
Dr. Mohammad Sabokrou
Examining committee:
Dr. Reza Safabakhsh
Dr. Babak Nadjar Araabi
Dr. Mohsen Soryani
Dr. Mohammad Reza Mohammadi
Date: 8 Bahman 1403
Time: 8:00 AM
Location: Computer Faculty, third floor, doctoral defense meeting room
Thesis abstract:
Object detection determines the location and coordinates of the objects in a scene, along with the class of each one. It is a basic task underlying many high-level activities in machine vision, such as activity recognition, scene analysis, scene description, summarization, and semantic understanding. Object detection is divided into two sub-areas: object detection in images and object detection in videos.
Improving accuracy and speed has always been a focus of researchers, and a large part of that work targets GPUs and powerful server-based hardware. Solutions that rely on GPUs and high processing power have many diverse real-world applications, and current research offers commercial product developers many such solutions. Solutions for embedded devices, on the other hand, have received less attention. Limited processing power, the need for the model to fit in memory, and the power consumption of the hardware are among the challenges of this field.
This thesis presents an efficient method, based on deep neural networks, for detecting objects in video in real time (processing speed higher than 15 frames per second) with a computational cost suitable for embedded devices. Robust video object detection first requires robust object detection on images, which can then be extended to video. To improve object detection on images, a new backbone and neck were first designed and implemented, strengthening the base network. In addition, for weight sharing in the head, the idea of Half Share of the weights was implemented, which increased accuracy. Finally, the specifications of the prior boxes were adjusted slightly, improving both the detection of smaller objects and the overall accuracy. With these changes, the accuracy of object detection on images in this research improved compared to other articles. Furthermore, with the introduction of a new recurrent cell called GCRU for feature propagation over time, along with other changes including dual networks and increasing the interval of previous frames, an accuracy of 67.5% and a speed of 62 frames per second were achieved on the MobileDenseNet architecture. On the EfficientNet architecture, the method reached 68.7% accuracy at 52 frames per second, which is the best performance among similar solutions in this field.
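The abstract defines real time as a processing speed higher than 15 frames per second. The following is a minimal sketch of how such a check could be made for any per-frame detector; it is not the thesis's benchmarking code. The names `measure_fps`, `is_real_time`, and `process_frame` are hypothetical, and the detector is represented by an arbitrary callable.

```python
import time
from typing import Any, Callable, Iterable

# Threshold taken from the abstract: "higher than 15 frames per second".
REAL_TIME_FPS = 15.0


def measure_fps(
    process_frame: Callable[[Any], Any],
    frames: Iterable[Any],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Run process_frame on every frame and return the average frames per second.

    process_frame stands in for a detector's per-frame inference call
    (a hypothetical placeholder, not an API from the thesis).
    """
    start = clock()
    count = 0
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = clock() - start
    if count == 0 or elapsed <= 0:
        raise ValueError("need at least one frame and a positive elapsed time")
    return count / elapsed


def is_real_time(fps: float, threshold: float = REAL_TIME_FPS) -> bool:
    """Return True when fps is strictly higher than the threshold."""
    return fps > threshold
```

On an embedded device, a measurement like this would normally be taken over many frames after a warm-up period, since the first few inferences are often slower than the steady state.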