Climate Science Hackathon Winner: Team 154 2017-10-31T12:59:29-07:00



Members:

 Xi Chen (ICS; Informatics):

I am a second-year master's student in Informatics in ICS. I have worked on several data science projects in which I was responsible for analyzing raw data and developing insights from it. My previous hands-on project experience taught me how to identify real-world questions, extract and analyze the data, and then visualize and present the results. The Bachelor's and Master of Science degrees in Electronic Engineering and Biomedical Engineering that I earned from Fudan University were instrumental to my knowledge of Internet technology during my time as a project manager at China Mobile. While earning my second master's degree, in Informatics at UCI, I also developed a strong background in statistics and machine learning that has served me well in my pursuit of advancing healthcare information systems.

 Reza Asadi (Computer Science):

I am a third-year PhD student in Computer Science at UC Irvine. I received my B.Sc. and M.Sc. in Computer Science from Amirkabir University of Technology, Iran, in 2011 and 2013, respectively. I am working with Professor Amelia Regan at UCI. My research area is distributed optimization algorithms, with applications in large-scale machine learning, transportation systems, and power flow networks. I work on distributed optimization as a core component of large-scale machine learning problems, where distributed methodologies increase the performance of the system. In transportation systems, I work on data science projects in academia and industry with the goal of designing efficient intelligent transportation systems. I also develop optimization theory to propose a distributed power flow system.

 Ahmad Razavi (Computer Science):

Ahmad Razavi received his bachelor's and master's degrees in computer engineering in 2007 and 2010, respectively. He joined the University of California, Irvine in 2014. He is currently a computer science PhD candidate, working on decentralizing data fusion algorithms in multi-robot systems. His research interests include embedded systems, robotics, and machine learning.

Description:

The California Drought dataset contains the storage levels of 83 reservoirs over the last 15 years. Apart from the storage level and the corresponding date, it also includes geographical information for each reservoir, such as longitude, latitude, and elevation.

After a quick overview and pre-processing of the data, the seasonal trend in some of the reservoirs caught our eye and led us to two hypotheses. First, there is a prevalent seasonal trend among most of the reservoirs. Second, geographical parameters are closely related to this trend. We then analyzed the dataset to determine how these trends are geographically distributed across California.

We applied temporal autocorrelation and seasonal analysis to each reservoir’s time series. We used the phase and the cycle length to describe each seasonal trend. The results show that most storage levels have seasonal patterns and change dramatically over the years. Some reservoirs have longer cycles than others, and some reach their peak earlier than others. Based on the temporal analysis, we split the reservoirs’ storage levels into two groups: those with seasonal trends and those with non-seasonal trends.
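The temporal step can be illustrated with a minimal sketch. It is not the team's actual code: the function names, the 0.3 autocorrelation threshold, the monthly sampling, and the synthetic series are all assumptions made for illustration. The cycle length is taken as the first clear local peak of the sample autocorrelation, the phase as the position within the cycle where the average storage is highest, and a series with no such peak is treated as non-seasonal.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0  # a constant series carries no seasonal signal
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return cov / var

def cycle_length(series, max_lag=None, threshold=0.3):
    """Lag of the first local autocorrelation peak above `threshold`.

    Returns None when no such peak exists, which this sketch
    interprets as a non-seasonal series. The threshold is an
    illustrative assumption, not a value from the original analysis.
    """
    if max_lag is None:
        max_lag = len(series) // 2
    acf = [autocorrelation(series, lag) for lag in range(max_lag + 1)]
    for lag in range(2, max_lag):
        if (acf[lag] > threshold
                and acf[lag] > acf[lag - 1]
                and acf[lag] >= acf[lag + 1]):
            return lag
    return None

def peak_phase(series, period):
    """Position within the cycle (0 .. period-1) with the highest mean value.

    Assumes period <= len(series).
    """
    sums = [0.0] * period
    counts = [0] * period
    for t, x in enumerate(series):
        sums[t % period] += x
        counts[t % period] += 1
    means = [s / c for s, c in zip(sums, counts)]
    return max(range(period), key=lambda p: means[p])

# Synthetic example: 15 years of hypothetical monthly storage levels
# with a yearly cycle that peaks at position 5 of each cycle.
monthly = [1000 + 300 * math.cos(2 * math.pi * (t - 5) / 12)
           for t in range(180)]
period = cycle_length(monthly)        # estimated cycle length in samples
phase = peak_phase(monthly, period)   # position of the seasonal peak
flat = cycle_length([500.0] * 180)    # constant series: no cycle found
```

On real reservoir data the autocorrelation is noisier, so the threshold and the peak test would likely need tuning, and a long-term trend may have to be removed before looking for the seasonal peak.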

Given the phase and cycle (temporal correlation) obtained from the temporal analysis, we clustered the reservoirs using K-means clustering to illustrate how the trends change geographically. The results show not only that the trends vary by geographical area, which could be the result of factors outside the given dataset, but also that elevation has a high impact on storage level trends: for example, high-elevation reservoirs are fed by melted snow (in summer), while others are fed by rain (in fall and spring).
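As a rough illustration of the clustering step, the sketch below runs a plain K-means (Lloyd's algorithm) on per-reservoir (phase, cycle length) pairs. The feature values, the choice of k = 2, and the function names are hypothetical and are not taken from the original analysis. The sketch also treats phase as an ordinary number, although phase is circular, so real data may need a different encoding.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from the data points
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members)
                                     for dim in zip(*members))
    return labels, centroids

# Hypothetical (peak phase in months, cycle length in months) features:
# the first three mimic late-peaking reservoirs, the last three
# early-peaking ones.
features = [(6.0, 12.0), (7.0, 12.0), (6.5, 12.0),
            (2.0, 12.0), (3.0, 12.0), (2.5, 12.0)]
labels, centroids = kmeans(features, k=2)
```

The resulting labels could then be plotted against each reservoir's longitude, latitude, and elevation to see whether the clusters line up with geography.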

In conclusion, temporal analysis explains the seasonal trend, and spatial analysis illustrates how reservoirs’ trends are related to geographical area. Together, the two analyses result in better prediction of water level trends in California.

To further explore the dataset, we could examine the causal relationship between the temporal and spatial trends. One could also add more spatial information to the dataset, such as rivers and mountain ranges, so that we might be able to find the factors that affect the trends. Finding more factors related to storage levels would result in better prediction and understanding of storage level changes.