Computational Model for Webpage Aesthetics using SVM

A computational model for predicting webpage aesthetics helps designers assess usability and improve it. It has been reported that the positional geometry of webpage objects is of primary importance for aesthetics computation. In this paper, we propose a computational model for predicting webpage aesthetics based on the positional geometry features of webpage objects. We considered the 13 best-known features that affect aesthetics. By varying these 13 features, we designed 52 interfaces and had them rated by 100 users on a 5-point Likert scale. Our one-way ANOVA study of the users' ratings shows that 9 of the 13 features are important for webpage aesthetics. Based on these 9 features, we created a computational model for webpage aesthetics prediction, built on the Support Vector Machine (SVM). To judge the efficacy of our model, we considered 10 popular webpages and had them rated by 80 users. Experimental results show that our computational model predicts webpage aesthetics with an accuracy of 90%.


INTRODUCTION
With the advancement of technology, electronic devices such as mobile phones, laptops and iPads have become an essential part of our daily life. As a result, the design of interfaces has become a vital issue for researchers in the Human Computer Interaction (HCI) community. Good interface design requires knowledge about people: how they see, understand and think [Galitz (1997)]. Another important aspect of good design is how information is visually presented to users, that is, the aesthetics of the interface. It is argued that aesthetically pleasing interfaces increase user efficiency and decrease perceived interface complexity, which in turn helps to increase usability, productivity and acceptability. The current requirement of interface design is therefore the design of aesthetically pleasing interfaces. In general, interface aesthetics plays an important role in determining usability.
This is particularly true of webpage design. A webpage is composed of many rectangular webpage objects. The content of these objects may be text, images, short videos or animations. The geometric positions of these objects in a layout, along with their contents, may be perceived as complex by the user. Aesthetics can play an important role here in reducing the perceived interface complexity and increasing acceptability and, therefore, the usability of the system.
Since aesthetics is important in webpage design, it is necessary to measure it. Aesthetic measurement of a webpage is considered to be subjective; hence, it is primarily done through empirical means. A parallel research effort has attempted to develop computational models to evaluate the aesthetics of the whole interface [Ngo et al. (2003)] as well as parts of it [Bansal et al. (2013); Maity et al. (2015)]. The advantage of a computational model is its ability to evaluate interface aesthetics automatically, which in turn makes it possible to automate the design process itself.
Computation of webpage aesthetics is a challenging task. A number of works report on the contents of webpage objects. Ngo et al. (2003) reported that the positions of rectangular webpage objects in a layout are of primary importance for aesthetics computation. On the other hand, the contents of webpage objects seem to have no impact on aesthetics. They proposed a computational model for webpage aesthetics based on 13 positional geometry features of webpage objects. The average of these feature values (termed order) was used to judge the aesthetics of a webpage. They also noted that their model may not be appropriate for real webpages. We therefore felt that there is a need to review the geometry-related features and to develop a computational model for webpage aesthetics.
In this work, we re-examined the 13 best-known features of webpage aesthetics, as reported in [Ngo et al. (2003)]. We created 52 webpages and had them rated by 100 participants. An ANOVA study of the users' ratings revealed that only 9 of the 13 features are important for aesthetics measurement. Based on those 9 features, we developed a computational model to predict webpage aesthetics. Our model is based on the linear kernel of the Support Vector Machine (SVM) [Cortes et al. (1995)]. To judge the efficacy of our model, we considered 10 real webpages and had them rated by 80 users. Experimental results show that our model can predict aesthetics with 90% accuracy. The details of the empirical data collection, the proposed model and the analysis are described in this paper.

EMPIRICAL STUDY FOR FEATURE IDENTIFICATION
To find the impact of the 13 independent features associated with aesthetics [Ngo et al. (2003)], we performed an empirical study. We created 4 webpages for each feature by systematically varying the feature values. These webpages were rated by 100 participants. One-way ANOVA was used to find the impact of each feature on webpage aesthetics.

Experimental Setup
For each feature, we created 4 webpages by varying the feature value systematically over 4 levels: significantly low (SL), low (LO), average (AV) and high (HI). All these features are independent of each other, as reported in [Ngo et al. (2003)]. While varying each feature (over its 4 levels), we did not observe any significant variation in the other features that might affect aesthetics. Altogether, we designed 52 webpages using Adobe Photoshop CS6™. The size of each webpage was 700×700 pixels. Figure 1 shows a set of 4 webpages in which the unity feature varies. All 52 webpages were shown to the participants on PCs with a 2.6 GHz AMD Phenom II X3 710 processor running Windows 8. Each PC had a 23-inch wide-viewing-angle colour display.

Participants
One hundred participants took part in our study. Of them, 20 were school students (average age 16 years), 60 were undergraduate and postgraduate students (average age 22 years), and the remaining 20 were teachers (average age 36 years). All had normal or corrected-to-normal vision and none was colour blind (self-reported). All were regular computer users; however, none was familiar with screen design concepts.

Procedure of Data Collection
We created webpage models of the 52 webpages without considering the contents of the webpage elements. The webpage models for the unity feature are shown in Figure 1. All 52 models were rated by 100 users on a five-point Likert scale (1-5), where five denoted the most aesthetically pleasing webpage and one denoted the least aesthetically pleasing webpage.

Figure 1: Webpages designed by varying the unity feature, and their models
A browser-based viewer was created for the users, with facilities to view the previous/next sample and to rate a webpage model. After viewing each webpage model, participants rated it according to its aesthetic appeal. Each participant rated the 52 models assigned to him/her in two sessions (26 each) in a day. They were allowed to take breaks in each session. These measures were taken to avoid any discomfort to the participants that might have arisen from the large number of webpages to be rated.
To avoid the learning effect, we randomly varied the sequence of the webpage models shown to the users. Before data collection, we conducted a short training session for the participants. In this session, participants were familiarized with the 5-point scale and the web interface through which they had to rate the webpage models.

Result and Analysis
We computed the feature values of the 52 webpages using the analytical expressions proposed by Ngo et al. (2003). Based on the empirical results, we performed 13 independent one-way ANOVAs (one for each feature) using the ANOVA1 command of MATLAB 2014.
The results of the study are shown in Table 1. From Table 1, we observe that the p values are higher than 0.05 for the features density (0.11), economy (0.108), rhythm (0.174) and simplicity (0.053). This implies that variations in these feature values do not have a statistically significant impact on webpage aesthetics. In contrast, the remaining 9 features (balance, cohesion, equilibrium, homogeneity, proportion, regularity, sequence, symmetry and unity) were found to be statistically significant for webpage aesthetics. We therefore conclude that only 9 of the 13 best-known geometry features are important for webpage aesthetics.
Accordingly, we propose a computational model based on these 9 features, as discussed in the next section.
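The one-way ANOVA applied to each feature compares the variation between the 4 feature levels with the variation within each level. The following is a minimal sketch of the F statistic only, written in plain Python rather than MATLAB; the function name and the small rating lists are hypothetical and are not data from this study. Converting F to a p value additionally requires the F distribution with (df_between, df_within) degrees of freedom, which for this study would be (3, 396) per feature.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of rating groups.

    Each group holds the ratings collected at one feature level.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [mean(g) for g in groups]
    grand_mean = mean(x for g in groups for x in g)

    # Between-group sum of squares: spread of level means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of ratings around their own level mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical ratings for the levels SL, LO, AV, HI (5 raters each, made up).
example_groups = [
    [1, 2, 2, 1, 2],
    [2, 3, 2, 3, 3],
    [3, 4, 3, 4, 3],
    [4, 5, 4, 5, 5],
]
f_value, df_b, df_w = one_way_anova_f(example_groups)
print(f_value, df_b, df_w)
```

A large F relative to the F distribution indicates that the mean rating differs across the feature levels, which is the sense in which a feature is called significant above.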

PROPOSED COMPUTATIONAL MODEL
Our model is based on the linear kernel of SVM [Cortes et al. (1995)]. SVM is widely used for solving binary classification problems. In the following section, we discuss the training procedure of our model.
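For reference, the soft-margin linear SVM can be written in its standard textbook form. The symbols below are the usual notation and are not taken from this paper: x_i is an input vector, y_i its label, w and b the separating hyperplane, ξ_i the slack variables and C the regularisation constant. The paper does not state the value of C or the exact composition of x_i, so both are left generic here.

```latex
\begin{aligned}
\min_{\mathbf{w},\,b,\,\boldsymbol{\xi}} \quad
  & \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_i \\
\text{subject to} \quad
  & y_i\,(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1 - \xi_i,
    \qquad \xi_i \ge 0, \qquad i = 1,\dots,n, \\
\text{prediction:} \quad
  & \hat{y}(\mathbf{x}) = \operatorname{sign}(\mathbf{w}^{\top}\mathbf{x} + b),
    \qquad y_i \in \{-1, +1\}.
\end{aligned}
```

With a linear kernel the decision boundary is a hyperplane in the input space, so each trained classifier reduces to a weight vector and a bias.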

Model Training
For model training, we used a subset of the data (9 of the 13 features) from our previous empirical study. Note that we have 5200 (100×4×13) data points, which are the ratings of 100 users on the 52 webpages (4 webpages for each of the 13 features). Of these 13 features, we considered 9 for training our model. For these 9 features we have 36 webpages (4 for each feature) and the corresponding ratings of 100 users. Altogether, we have 3600 (9×4×100) training data points. Our model uses 9 SVMs to predict the 9 features independently. Each of these 9 SVMs was trained on 400 data points (100×4), which are the ratings of 100 users for the 4 webpages of a particular feature. As SVM works on labelled data, we converted all 3600 unlabelled data points into labelled training data using the following logic.
A particular feature of a webpage was labelled as good (+1) only when a user gave a rating of more than 2 on the 5-point scale and the feature value (computed using the analytical expressions of Ngo et al. (2003)) was greater than or equal to 0.5 on a 0 to 1 scale. This was done so that roughly the upper half of each scale corresponds to an aesthetically pleasing (good) feature and the rest to an aesthetically unpleasing feature.
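The labelling rule above can be sketched as a small function. The function name is hypothetical, and the use of -1 for the "not good" class is an assumption made here to match the usual SVM label convention; the text only fixes +1 for good.

```python
def label_feature(rating, feature_value):
    """Label one (user rating, feature value) pair as good (+1) or bad (-1).

    rating: integer user rating on the 5-point scale (1..5).
    feature_value: feature value on the 0..1 scale.
    """
    if not 1 <= rating <= 5:
        raise ValueError("rating must lie on the 5-point scale (1..5)")
    if not 0.0 <= feature_value <= 1.0:
        raise ValueError("feature_value must lie in [0, 1]")
    # Good only when BOTH conditions hold: rating above 2 and value at least 0.5.
    return 1 if rating > 2 and feature_value >= 0.5 else -1


# Illustrative pairs (made up): only the first satisfies both conditions.
print(label_feature(3, 0.5), label_feature(2, 0.9), label_feature(5, 0.49))
```

Because both conditions must hold, a high rating on a webpage with a low feature value is still labelled as not good, and vice versa.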
Based on these labelled data, we trained our model using the SVMTRAIN function of MATLAB 2014.

Empirical Study for Model Validation
The main objective of our model is to predict aesthetics for real webpages. For this purpose, we performed another empirical study on 10 real webpages. These webpages represent some popular domains, such as education (CIT, Kokrajhar  [tcs.com]). For all these webpages, we created webpage models (constructed without considering the content of the webpage elements) using Adobe Photoshop CS6™. Figure 2b shows the model of the IIT Guwahati webpage (Figure 2a). These models were rated by the users on PCs with a 2.6 GHz AMD Phenom II X3 710 processor running Windows 8. Each PC had a 23-inch wide-viewing-angle colour display.

Participants
All 10 webpage models were rated by 80 new users, of whom 40 were male and the rest female. Fifty participants were undergraduate students (average age 21 years), 20 were postgraduate students (average age 27 years) and the remaining 10 were faculty members (average age 40 years). All participants were regular computer users, but none had any knowledge of website design principles. All had normal or corrected-to-normal vision and none was colour blind (self-reported).

Procedure of Data Collection
Each participant rated the 10 webpage models using the same browser-based viewer used in our previous study. After viewing each webpage model, they rated it according to its aesthetic appeal on the same 5-point Likert scale used in our previous study, where 5 denoted the most aesthetically pleasing webpage and 1 denoted the least aesthetically pleasing webpage. To avoid the learning effect, we randomly varied the sequence of the webpage models shown to the users. Each participant rated the 10 webpages assigned to him/her in one session in a day. Before data collection, we conducted a short training session for the participants. In this session, participants were familiarized with the 5-point scale and the web interface through which they had to rate the webpages.

Results
We performed an empirical study using the 10 webpages. For each webpage we took the mode (the rating given by the largest number of users) as the final rating. Based on the mode, we classified the webpages into two classes, good or bad, using the following logic: if mode >= 3, the webpage class is good (aesthetically pleasing, +1); otherwise the webpage class is bad (aesthetically unpleasing, -1). The 9 independently trained SVMs were used to predict the feature class (good or bad) of each of the 9 features for the 10 webpages. Feature class prediction was done using the SVMPREDICT function of MATLAB 2014. Finally, to predict the aesthetics of the whole webpage we used the following algorithm: if most of the predicted features (at least 5 of 9) are aesthetically pleasing (good), the webpage is predicted as good (aesthetically pleasing); otherwise it is treated as bad (aesthetically unpleasing). We compared the predictions of our model with the results obtained from the empirical study.
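The two decision rules in this paragraph, the mode-based class from user ratings and the majority vote over the 9 predicted feature classes, can be sketched as follows. All function names are hypothetical, the sample inputs are made up, and the tie-breaking behaviour of the mode (first rating encountered among equally frequent ones) is an assumption, since the text does not say how ties were resolved.

```python
from collections import Counter


def mode_rating(ratings):
    """Most frequent rating; ties go to the rating encountered first (assumption)."""
    return Counter(ratings).most_common(1)[0][0]


def page_class_from_ratings(ratings):
    """Empirical class of a webpage: +1 (good) if the mode rating is >= 3, else -1."""
    return 1 if mode_rating(ratings) >= 3 else -1


def page_class_from_features(feature_predictions, min_good=5):
    """Predicted class of a webpage from its 9 per-feature SVM outputs (+1 / -1)."""
    if len(feature_predictions) != 9:
        raise ValueError("expected one prediction for each of the 9 features")
    good_count = sum(1 for p in feature_predictions if p == 1)
    return 1 if good_count >= min_good else -1


def accuracy(predicted, observed):
    """Fraction of webpages whose predicted class matches the empirical class."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, o in zip(predicted, observed) if p == o)
    return hits / len(predicted)


# Made-up inputs for illustration only.
print(page_class_from_ratings([4, 4, 3, 2, 5]))
print(page_class_from_features([1, 1, 1, 1, 1, -1, -1, -1, -1]))
```

Comparing the outputs of page_class_from_features with those of page_class_from_ratings over all test webpages, via accuracy, gives the kind of agreement figure reported for the model.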
Using our model, we predicted the feature classes of the 10 webpages. Then, using our webpage prediction algorithm, we predicted each webpage as aesthetically pleasing (+1) or not (-1). Of the 10 webpages, our model correctly predicted 9; the exception was CIT, Kokrajhar. Thus, our model predicted webpage aesthetics with an accuracy of 90% (9 out of 10).
Ngo et al. (2003) claimed that the order value may be a measure for aesthetics computation. However, in our study we observe that order may not be relevant for real webpages. The order values of the 10 real webpages lay in the range 0.43 to 0.52. Facebook, with an order value of 0.45, was treated by users as an aesthetically pleasing webpage. In contrast, Flipkart, with a higher order value of 0.47, was treated as aesthetically unpleasing by the users. Based on these observations, we conclude that order is not a suitable metric for aesthetics computation of real webpages. Our model, by contrast, is based on SVM, which is capable of solving binary classification problems with high accuracy.
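The argument about order can be made concrete with a short check. If order were a usable measure, there would be some cut-off t such that webpages with order at least t are good and the rest are bad. The sketch below, with a hypothetical function name, tests whether such a cut-off exists for the two order values reported above; it assumes, following the order measure, that a higher order should mean a more pleasing webpage.

```python
def separable_by_threshold(samples):
    """True if some cut-off t gives 'good iff order >= t' for every sample.

    samples: list of (order_value, label) pairs with label +1 (good) or -1 (bad).
    """
    good = [order for order, label in samples if label == 1]
    bad = [order for order, label in samples if label == -1]
    if not good or not bad:
        return True  # a single class is trivially separable
    # A valid cut-off exists only if every good page outranks every bad page.
    return min(good) > max(bad)


# Order values and user verdicts as reported in the text.
reported = [
    (0.45, 1),   # Facebook: rated aesthetically pleasing
    (0.47, -1),  # Flipkart: rated aesthetically unpleasing
]
print(separable_by_threshold(reported))
```

Since the unpleasing webpage has the higher order value, no single cut-off on order classifies both correctly.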

DISCUSSION AND CONCLUSION
In this work, we reassessed the best-known features for webpage aesthetics. We performed an empirical study and found that 9 features are important for aesthetics measurement. Using these 9 features, we developed a computational model for aesthetics prediction, based on the linear kernel of SVM. To judge the efficacy of our model, we performed another empirical study on real webpages and found that our model can predict webpage aesthetics with an accuracy of 90%.

Table 1: Results of ANOVA study on 13 features