Prediction & classification of websites by their structural & design components
Doron Cohen, Tel-Aviv University. Advisor: Prof. Irad Ben-Gal
Abstract:
In this study, we use an approach named 'Website DNA', which extracts and models granular structural and design components of a website, such as HTML elements, JavaScript, the Document Object Model (DOM), and images (including object recognition), and compiles them into an aggregated feature representation list. We then use this list to classify websites by type. We show that using 'Website DNA' for classification yields high accuracy and a high F1-score in three different applications: identifying crack websites, malicious websites, and successful websites, where success is judged by efficient traffic into the website. We demonstrate the methodology on large datasets and across these different classification tasks, each of which is a complex challenge in its own right. By applying explainable artificial intelligence (XAI) techniques, we also identify the most influential features, or 'DNA elements', that potentially determine how a website is classified.
Bio:
Doron Cohen is a seasoned manager with extensive experience in technical program management, data science, and process deployment in large and complex organizations. As a Manager of Technical Program Management and Data Science at Intel Corporation, he directed a diverse team across multiple geographies, planning and tracking large-scale, cutting-edge hardware projects while incorporating innovative data management and AI solutions. Previously, he held roles as a Technical Program Manager and Project Analyst at Intel, and as a Supplier Manager at Israel Aerospace Industries. Doron is currently pursuing a PhD in Industrial Engineering and AI at Tel Aviv University, where his research focuses on using AI to classify websites by their structural and design components. He holds an MSc in Industrial Engineering and a BSc in Industrial & Management Engineering. Doron also volunteers as a director of a non-profit organization that helps people dealing with anxiety and depression, and he actively mentors junior engineers inside and outside Intel Corporation.