Jun 13th, 2018: The Analytics for Urban Transportation & Operations Laboratory. For advanced transportation systems
May 24th, 2018: Postdoctoral fellowship winners for training at Northwestern University. The TAU-Northwestern post-doc program 2018
Sep 8th, 2016: Academic Ranking of World Universities in Engineering/Technology and Computer. We were again ranked in the 51-75 bracket, along with the Technion
Jul 28th, 2016: Ohad Eizenhendler's work reached the finals
Jul 21st, 2016: The Robot that will Teach us about Human Curiosity. The Curiosity Lab at Tel Aviv University wants to make robots behave like curious babies that explore their environment, in the hope of gaining new insights into human behavior and new ways to implement it in the robots of the future.
Jul 16th, 2015: New Cyber Security study aims to identify malicious websites through their
Web mining applies data mining techniques to discover and automatically learn information on the World Wide Web. One of its most challenging applications is classifying a website into the category to which it belongs, and in particular identifying malicious websites. Malicious websites are created with the intent of harming users, stealing information or conducting other undesirable activities, both while the user is visiting the website and afterwards.
In recent years, many methods have been developed for identifying website categories. They rely on analyzing the website's textual content, its users' navigation profiles, suspicious traits found in the website itself, IP forgeries and many other properties. At the same time, the already enormous amount of information on the Internet continues to grow exponentially, making the identification task ever more complex and often requiring a great deal of expensive resources.
A study carried out by Doron Cohen, under the supervision of Prof. Irad Ben-Gal and Prof. Shulamith Kreitler, tested a new method for identifying websites, based on a data mining analysis of their graphic design. To this end, an algorithm was constructed that receives a URL as input, retrieves and processes all of the page's graphic design features, and stores them as tables on the server. The researchers examined hundreds of home pages of websites taken from Google's top 1000 sites. For each of these, over 1000 graphic design features were examined, such as the size of the area covered by each color, font size, number of characters, standard deviations, and the quantity and type of elements.
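To make the feature-extraction step more concrete, here is a minimal sketch in Python. It is not the researchers' algorithm: the class name, function name and feature names are hypothetical, it works on an HTML string rather than fetching a URL, and it only reads inline styles. Features such as the area covered by each color would require rendering the page, which this sketch does not attempt.

```python
import re
import statistics
from collections import Counter
from html.parser import HTMLParser

# Hypothetical patterns: only pixel font sizes and 6-digit hex colors
# found in inline "style" attributes are recognized.
FONT_SIZE_RE = re.compile(r"font-size\s*:\s*(\d+(?:\.\d+)?)px", re.IGNORECASE)
COLOR_RE = re.compile(r"#[0-9a-fA-F]{6}\b")


class DesignFeatureParser(HTMLParser):
    """Collects simple design-related counts from tags and inline styles."""

    def __init__(self):
        super().__init__()
        self.tag_counts = Counter()
        self.font_sizes = []
        self.colors = Counter()
        self.char_count = 0

    def handle_starttag(self, tag, attrs):
        self.tag_counts[tag] += 1
        style = dict(attrs).get("style") or ""
        self.font_sizes += [float(m) for m in FONT_SIZE_RE.findall(style)]
        self.colors.update(c.lower() for c in COLOR_RE.findall(style))

    def handle_data(self, data):
        # Count visible characters, ignoring surrounding whitespace.
        self.char_count += len(data.strip())


def extract_design_features(html):
    """Return one table row (a dict) of illustrative design features."""
    parser = DesignFeatureParser()
    parser.feed(html)
    parser.close()
    sizes = parser.font_sizes
    return {
        "n_elements": sum(parser.tag_counts.values()),
        "n_images": parser.tag_counts["img"],
        "n_chars": parser.char_count,
        "n_colors": len(parser.colors),
        "font_size_mean": statistics.mean(sizes) if sizes else 0.0,
        "font_size_std": statistics.pstdev(sizes) if sizes else 0.0,
    }


# Made-up page used only to exercise the function.
SAMPLE_HTML = (
    '<html><body style="background-color: #FFFFFF">'
    '<h1 style="font-size: 32px; color: #cc0000">Sale</h1>'
    '<p style="font-size: 16px; color: #cc0000">Click now</p>'
    '<img src="banner.png">'
    '</body></html>'
)

row = extract_design_features(SAMPLE_HTML)
```

Each call yields one row of a feature table, so running it over a list of home pages would produce the kind of table a classifier can be trained on.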
After processing and analyzing the data, the researchers built a predictive model based on a decision tree, and then cross-validated the model. In the first experiment they discovered that classification based solely on graphic design achieves relatively high prediction accuracy for all five examined website categories (including that of malicious websites). A second experiment found that adding graphic design features to another objective prediction method can improve the precision of identification, specifically of malicious websites, by 95%, in a statistically significant manner, while using low-cost resources and short runtimes. A possible explanation for these findings is that malicious websites apparently try to conceal keywords to avoid detection by search engines, while a search based on so many graphic features can identify repeated patterns that are more difficult to conceal.
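The modelling step, a decision tree evaluated by cross-validation, can be sketched as follows. This is an illustration under stated assumptions, not the study's model: the tree is a small Gini-based tree written from scratch, the two features and two class labels are hypothetical, and the data set is made up and cleanly separable, so the resulting accuracy says nothing about real websites.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return (feature_index, threshold) that lowers weighted Gini, or None."""
    best, best_score = None, gini(y)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring values
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best, best_score = (j, t), score
    return best


def build_tree(X, y, depth=0, max_depth=3):
    """Internal nodes are (feature, threshold, left, right); leaves are labels."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return Counter(y).most_common(1)[0][0]  # leaf: majority label
    j, t = split
    left = [i for i, row in enumerate(X) if row[j] <= t]
    right = [i for i, row in enumerate(X) if row[j] > t]
    return (
        j,
        t,
        build_tree([X[i] for i in left], [y[i] for i in left], depth + 1, max_depth),
        build_tree([X[i] for i in right], [y[i] for i in right], depth + 1, max_depth),
    )


def predict(tree, row):
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree


def cross_validate(X, y, k=5, seed=0):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        tree = build_tree([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(tree, X[i]) == y[i] for i in fold)
        scores.append(hits / len(fold))
    return sum(scores) / k


# Made-up rows of [n_colors, font_size_std]; NOT data from the study.
benign = [[2 + i % 4, 1.0 + i % 3] for i in range(20)]
malicious = [[8 + i % 3, 2.0 + i % 5] for i in range(20)]
X = benign + malicious
y = ["benign"] * 20 + ["malicious"] * 20

accuracy = cross_validate(X, y, k=5)
```

Holding each fold out in turn means every page is scored by a tree that never saw it during training, which is the purpose of the cross-validation the researchers describe.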
The study revealed that graphic design, especially colors, plays an important part in the prediction of website categories. Adding graphic properties to other predictive systems can improve their accuracy, and is therefore highly recommended. The study will be presented at the national cyber conference scheduled to take place in September at Tel Aviv University.