Travel time on a route varies substantially by time of day and from day to day. It is critical to understand the extent to which this variation is correlated with factors such as weather, incidents, events and travel demand level in the context of dynamic networks. Such understanding supports better decision making in infrastructure planning and real-time traffic operation. We propose a data-driven approach to understand and predict highway travel time using spatio-temporal features of those factors, all of which are acquired from multiple data sources. The prediction model holistically selects the most relevant features from a high-dimensional feature space by correlation analysis, principal component analysis and LASSO. We test and compare the performance of several regression models in predicting travel time 30 minutes in advance via two case studies: (1) a 6-mile highway corridor of I-270N in the D.C. region, and (2) a 2.3-mile corridor of I-376E in the Pittsburgh region. We find that some bottlenecks scattered across the network can indicate congestion on those corridors at least 30 minutes in advance, including bottlenecks on the alternative routes to the corridors under study. In addition, real-time travel time is statistically related to incidents at specific locations, morning/afternoon travel demand, visibility, precipitation, wind speed/gust and weather type. Together, this spatio-temporal information improves prediction accuracy compared with using speed data alone. In both case studies, random forest shows the most promise, reaching a root-mean-squared error of 16.6\% and 17.0\%, respectively, in afternoon peak hours for the entire year of 2014.
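
The correlation-analysis step of the feature selection can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 5-minute sampling interval, the 0.5 threshold and the toy data are all assumptions made for illustration. Each candidate feature observed at time t is paired with the corridor travel time 30 minutes later, and only features whose absolute Pearson correlation reaches the threshold are kept.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def screen_features(features, travel_time, lead_steps, threshold):
    """Keep features whose value at time t correlates with travel time
    at time t + lead_steps (hypothetical helper, not from the paper)."""
    target = travel_time[lead_steps:]
    selected = {}
    for name, series in features.items():
        lagged = series[:len(series) - lead_steps]
        r = pearson(lagged, target)
        if abs(r) >= threshold:
            selected[name] = r
    return selected


# Toy data, assumed to be sampled every 5 minutes, so 6 steps = 30 minutes.
LEAD_STEPS = 6
travel_time = [310, 305, 300, 320, 315, 300,   # seconds on the corridor
               300, 325, 400, 450, 425, 350]
features = {
    # Speed (mph) at a hypothetical upstream bottleneck; low speed now
    # precedes high corridor travel time 30 minutes later.
    "bottleneck_speed": [60, 55, 40, 30, 35, 50, 60, 60, 58, 45, 32, 40],
    # A hypothetical feature unrelated to the future travel time.
    "unrelated_sensor": [2, 8, 5, 2, 8, 5, 4, 6, 5, 3, 7, 5],
}

selected = screen_features(features, travel_time, LEAD_STEPS, threshold=0.5)
print(sorted(selected))  # ['bottleneck_speed']
```

In a fuller pipeline, the features that survive this screening would be passed on to dimension reduction, LASSO and the regression models; those stages are omitted here.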