The overall goal of this project is to investigate train travel data and identify the main factors that affect train travel time. We will also use machine learning algorithms to predict trains' arrival times at stations and to forecast when delays will occur. To identify the factors that affect travel time, we will examine several candidates suggested by prior empirical knowledge. The candidates that look useful during data exploration will then be subjected to statistical significance tests.
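As an illustration of the significance-testing step, the sketch below checks whether one candidate factor is associated with a difference in mean travel time. It is only a minimal example under stated assumptions: the project does not specify which test will be used, so a two-sided permutation test is swapped in here for concreteness, and the function name `permutation_test`, the peak/off-peak factor, and the travel times are all hypothetical.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns an estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign observations to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1

    # The add-one correction keeps the estimate strictly positive.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical travel times in minutes (made-up values, not project data).
peak = [42.0, 45.5, 47.0, 44.0, 46.5, 48.0]
off_peak = [38.0, 39.5, 37.0, 40.0, 38.5, 39.0]

p_value = permutation_test(peak, off_peak)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value would suggest that the factor deserves a place among the model inputs, while a large one would argue for dropping it. A permutation test is used in this sketch because it makes no distributional assumption about travel times; a t-test or ANOVA would fit the same role if their assumptions hold.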