Iran University of Science and Technology - School of Computer Engineering

School of Computer Engineering - PhD Defense

Ali Mottaghi

Date posted: 1399/12/5

Mr. Ali Mottaghi, a PhD student of Dr. Mohsen Soryani, will defend his doctoral dissertation, titled "Event Recognition in Freestyle Wrestling", on Tuesday, 1399/12/05 at 18:00.

Presenter:
Ali Mottaghi
Supervisor:
Dr. Mohsen Soryani
Examining committee:

Dr. Shohreh Kasaei; Dr. Nasrollah Moghaddam Charkari; Dr. Mahmood Fathy

Date: 05 Esfand 1399

Time: 18:00

Venue: held online

Dissertation abstract:

Human action recognition has many applications in image processing, and researchers have worked on it for years. One research gap in this area is the recognition of two-person sports actions. In this research, recognition of athletes' actions in freestyle wrestling was chosen, and a dataset of wrestling techniques was prepared in two forms: original full-color videos and foreground videos rendered as silhouettes. In the first attempt, the skeleton of the athletes' silhouette is converted into a graph called the free graph. Unlike common methods that rely on a body model, this graph is independent of any body model, yet it still reflects the approximate structure of the body skeleton. We extracted features by computing the histogram of graph points and used the SVM and KNN algorithms for classification. The highest accuracy was 84.9% with KNN and 68.3% with SVM. To evaluate and compare the proposed method, we also ran the experiments on two public datasets, SBU and THETIS. The results are comparable to those of similar methods and appear satisfactory for work in this area.
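The histogram-then-classify step can be illustrated with a minimal sketch. The abstract does not define the histogram of graph points precisely, so the grid-occupancy histogram below, the function names `point_histogram` and `knn_predict`, and the bin count are assumptions made for illustration, not the dissertation's implementation.

```python
import math
from collections import Counter


def point_histogram(points, bins=4):
    """Normalised 2D occupancy histogram of graph node coordinates.

    Assumed feature: nodes are binned on a bins x bins grid laid over
    their bounding box, so the result does not depend on where the
    athletes stand in the frame or on their absolute size.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    # Guard against a zero-width or zero-height bounding box.
    span_x = (max(xs) - min_x) or 1.0
    span_y = (max(ys) - min_y) or 1.0
    hist = [0.0] * (bins * bins)
    for x, y in points:
        col = min(int((x - min_x) / span_x * bins), bins - 1)
        row = min(int((y - min_y) / span_y * bins), bins - 1)
        hist[row * bins + col] += 1.0
    return [h / len(points) for h in hist]


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to `query`.

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

In use, each labeled clip would contribute one or more `(point_histogram(nodes), label)` pairs to `train`, and a new clip would be classified with `knn_predict(train, point_histogram(new_nodes))`. An SVM could replace the voting step without changing the feature.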
Next, we used deep learning methods, including the LSTM network. For feature extraction we used a network trained for image recognition. Video frames are first passed to this network, and its output is fed to the LSTM network. In this method, in addition to ordinary RGB data, we followed an attention-based approach and also used the foreground of the videos as input, then repeated the experiments.
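To make the frame-features-into-LSTM flow concrete, here is a small pure-Python sketch of a standard LSTM cell that consumes one feature vector per frame. It is only an illustration: the per-frame features stand in for the outputs of the pretrained image network, the weights are random and untrained, and the names `lstm_step`, `encode_clip`, and `init_weights` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(feat_dim, hidden, rng):
    """Random (untrained) weights mapping [x, h] to the four gate pre-activations."""
    W = [[rng.uniform(-0.5, 0.5) for _ in range(feat_dim + hidden)]
         for _ in range(4 * hidden)]
    b = [0.0] * (4 * hidden)
    return W, b


def lstm_step(x, h, c, W, b):
    """One LSTM time step; gate order in W and b is input, forget, cell, output."""
    z = x + h  # concatenate the frame features with the previous hidden state
    n = len(h)
    pre = [sum(w * v for w, v in zip(W[r], z)) + b[r] for r in range(4 * n)]
    i = [sigmoid(p) for p in pre[0:n]]
    f = [sigmoid(p) for p in pre[n:2 * n]]
    g = [math.tanh(p) for p in pre[2 * n:3 * n]]
    o = [sigmoid(p) for p in pre[3 * n:4 * n]]
    c_new = [f[k] * c[k] + i[k] * g[k] for k in range(n)]
    h_new = [o[k] * math.tanh(c_new[k]) for k in range(n)]
    return h_new, c_new


def encode_clip(frame_features, W, b, hidden):
    """Run the cell over all frames and return the final hidden state."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in frame_features:
        h, c = lstm_step(x, h, c, W, b)
    return h
```

In a full system the final hidden state would feed a classification layer over the wrestling techniques, and the weights would be learned from the training clips rather than drawn at random.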
To compensate for the relatively small amount of data for deep learning, we used image and video data augmentation methods. In addition to augmenting video along the time dimension, we performed image augmentation with several methods and under three approaches: deterministic, random, and random with random parameters. The results show that data augmentation has a positive effect on system performance. Nevertheless, data augmentation should take other factors into account, including the dataset itself, and be carried out with due care.
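The three augmentation approaches and the time-axis variant can be sketched as follows. The abstract does not list the transforms actually used, so the horizontal flip, the pixel shift, the frame subsampling, and every function name here are assumptions chosen to show the difference between the approaches. Frames are plain lists of pixel rows.

```python
import random


def hflip(frame):
    """Mirror a frame (list of pixel rows) left to right."""
    return [row[::-1] for row in frame]


def shift_right(frame, dx, fill=0):
    """Shift a frame dx pixels to the right, padding the left edge with `fill`."""
    shifted = []
    for row in frame:
        d = max(0, min(dx, len(row)))  # clamp the shift to the row width
        shifted.append([fill] * d + row[:len(row) - d])
    return shifted


def augment_deterministic(video):
    """Deterministic: always apply the same fixed transform."""
    return [hflip(frame) for frame in video]


def augment_random(video, rng, p=0.5):
    """Random: apply a fixed transform to the whole clip with probability p."""
    if rng.random() < p:
        return [hflip(frame) for frame in video]
    return [[row[:] for row in frame] for frame in video]


def augment_random_params(video, rng, max_shift=2):
    """Random with random parameters: the transform's parameter is sampled too."""
    dx = rng.randint(0, max_shift)
    return [shift_right(frame, dx) for frame in video]


def temporal_subsample(video, step=2, offset=0):
    """Time-axis augmentation: keep every `step`-th frame starting at `offset`."""
    return video[offset::step]
```

Each random choice is drawn once per clip rather than once per frame, so all frames of a clip receive the same transform and the motion stays consistent. Passing a seeded `random.Random` instance as `rng` makes an experiment repeatable.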
Keywords: action recognition, freestyle wrestling, skeletal features, free graph, histogram of graph points, deep learning, support vector machine, k-nearest neighbors, deterministic data augmentation, random data augmentation

Abstract

Recognition of human actions and behavior has many applications in image processing, and researchers have been working in this field for many years. One of the research gaps in this field is the recognition of two-person sports. In this study, recognition of athletes' actions in freestyle wrestling was selected, and a dataset of wrestling techniques was prepared. In the first attempt, the skeleton of the human silhouette was converted to a graph named the "free graph", and features, called the histogram of graph points, were extracted by computing a histogram over the points of this graph. In this approach we used the SVM and KNN classification algorithms. The best result was 84.9% for KNN and 68.3% for SVM. To evaluate and compare the proposed method, we also performed experiments on two public datasets, SBU and THETIS. The results are comparable to those of similar methods and seem satisfactory as a starting point.
Next, as a second approach, we used deep learning with LSTM (long short-term memory) and C3D (three-dimensional convolution) networks. For the LSTM, the video frames were first passed to a trained object recognition network, and its outputs were fed to the LSTM as features. In this method, in addition to the conventional RGB data, and based on the idea of attention, we also used the foreground of the videos as input and repeated the experiments. The three-dimensional network, inspired by the use of 2D convolution for videos, has also been used with improvements. The advantage of this method over the previous one is its dependence on image-based networks and its relatively higher speed.
To compensate for the relatively small amount of data available for deep learning, we used image and video data augmentation methods. We implemented data-driven approaches and repeated the experiments with all three deep learning methods. The results show the positive effect of data augmentation on system performance.
These studies and results can be used to develop more accurate recognition techniques in sports such as wrestling, as well as in other action recognition (AR) areas. Given the high speed of movement in activities such as wrestling, imaging devices with higher precision and frame rates can be very helpful. In addition, the deep learning approach can also lead to good results.

Keywords: Action Recognition, Freestyle Wrestling, Free Graph, Histogram of Graph Nodes, Data Augmentation, Deterministic Data Augmentation, Random-Based Data Augmentation

School of Computer Engineering, Graduate Studies Office

Address of this item on the School of Computer Engineering website:
http://idea.iust.ac.ir/find-14.11063.62815.fa.html