Cornell's Data Science Hackathon kicks off on February 12th with networking. Cornell Tech is our meeting spot for a weekend of coding, camaraderie, and creativity. This interdisciplinary, experiential event is open-ended and meant to bring together engineers, business students, designers, entrepreneurs, and others. Using data science, your team will work to produce a viable solution in one of our verticals.
7:15pm Fri Feb 12
Keynote: Claudia Perlich, Chief Scientist, Dstillery
7:45-10pm Fri Feb 12
Opening Remarks
Rules & Logistics
Students have 90 seconds to pitch their ideas
Team formation & networking
9am Sat Feb 13
Breakfast @ Cornell Tech
Welcome
10am Sat Feb 13
Hacking Starts
Work space is assigned
11am Sat Feb 13
Capital One Tech Talk
12pm Sat Feb 13
Using data to redefine success in amateur baseball
By Spencer Wright, GameChanger
12pm Sat Feb 13
Synopsis must be completed
1pm Sat Feb 13
Mentors start arriving
2:00pm Sat Feb 13
PiNG: 1 member from each team reports
Progress, Needs, Goals (these continue every 4 hours for the duration of the event)
10am Sun Feb 14
Stop hacking, speed round pitches
11am Sun Feb 14
Top Ten Teams Demo
4-minute demo, 4-minute Q&A
12:00pm Sun Feb 14
Break for Lunch
1:00pm Sun Feb 14
Conclusion of Demos
2:30pm Sun Feb 14
Winners announced
Prizes awarded
Teams
Recommended team size is 6 (a minimum of 4 members is enforced).
Synopsis
By noon on Saturday, every team MUST complete the synopsis document that is posted on their conference room door. Synopsis form fields include: team name, members, name of product/service, overview of the problem your team is working on, and areas in which you seek advice/assistance from mentors. PiNGs are also listed on the synopsis and schedule. At 2pm, 6pm, 10pm, etc., 1 member of your team must meet in the common area to report your team's Progress, Needs, and Goals.
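For teams that want the PiNG check-ins on a calendar, the times can be listed with a short script. This is only a sketch: the year (2016) is an assumption inferred from the weekday and date pairing, the function name `ping_times` is made up for illustration, and it assumes the last PiNG falls before hacking stops at 10am on Sunday. Check with the organizers if in doubt.

```python
from datetime import datetime, timedelta


def ping_times(first, hacking_ends, every_hours=4):
    """Return PiNG check-in times from `first` up to, but not including, `hacking_ends`."""
    times = []
    t = first
    while t < hacking_ends:
        times.append(t)
        t += timedelta(hours=every_hours)
    return times


# Assumption: the event year is 2016 (Feb 12 falls on a Friday that year).
first_ping = datetime(2016, 2, 13, 14, 0)    # 2pm Saturday, per the schedule
hacking_ends = datetime(2016, 2, 14, 10, 0)  # 10am Sunday, "Stop hacking"

schedule = ping_times(first_ping, hacking_ends)
for t in schedule:
    print(t.strftime("%a %b %d %I:%M %p"))
```

Under these assumptions the script lists five check-ins: 2pm, 6pm, and 10pm on Saturday, then 2am and 6am on Sunday.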
Data
Datasets used for the Hackathon need to be public (all participants should have open access to the data). Data must either be in the public domain and legal to use, or be provided by participating companies. Teams must disclose any use of their own data set.
Code
All code used must be open sourced. It can be existing code, new code written for the hackathon project, or a combination, but all code must be open sourced and licensed appropriately.
Submitting projects
Final submission consists of emailing your demo, or a link to your demo, to ams345@cornell.edu, along with: title/name of project; a short description of the project (<25 words); name of team and team members; data source; code (linked through GitHub); a screencast or screenshots describing the project; and a team photograph. All projects must be submitted by the conclusion of the event.
Presentations/Demos
Presentations will be judged during a poster session/speed round on Sunday, where judges will go around and ask teams to present their projects. The top 10 teams (selected by the judges) will then present to all the judges and hackathon participants. Teams will present using their own computers. Time allotted for each presentation is 4 minutes (strictly enforced), plus up to 4 minutes of Q&A (from the judges only).
Introduction
In line with the core vision of Cornell's newly developed campus in NYC, this event is directly applicable to the new age of data and aims to help students understand the opportunities created by it. The event further challenges you to develop solutions in an intensive, limited time frame, while working with a team of students outside your own major. This highly in-demand skill set is applicable to companies in almost all domains of life. Throughout the event, mentors from industry and faculty will coach you. At the conclusion of the hackathon on Sunday, teams will demo their proposed solutions to an audience composed of students, faculty, staff, alumni, mentors, and judges. While the event is a competition, it is also a 'coopetition' and encourages collaboration among all teams.
Why Data Science?
Because when we ran the first one in 2015 (the first by a university on the East Coast), it was an overwhelming success. We could not accommodate all the students who wanted to participate. Data science hackathons are new for both working professionals and university students. Many major industries are riding a new wave of opportunity as data collection and computation become cheaper and more efficient. A decade ago, only major companies invested in data science. Now, almost all companies are collecting more and more data, but struggling to use it efficiently in decision making. There is a huge demand for qualified data scientists.
What is Data Science?
Data science means deriving meaning from data by understanding how it fits into the larger picture. Think of business analytics that draws on CS, modeling, statistics, analytics, and mathematics. By 2020 the world will generate 50x the amount of information compared to 2011 [EMC.com]. The U.S. could face a shortage of up to 190,000 professionals with data science skills by 2018 [McKinsey Global Institute]. Business, healthcare, and urban living will all benefit from problems analyzed using data science.
If you would like to participate in the hackathon, feel free to check out some resources and start hacking.
Three Verticals to Compete In
Visualization
Product
Analysis
 ~$4,000 in cash & prizes will be awarded
Following the final pitches on Sunday afternoon, the panel of judges will evaluate each project and pick winners.
Company/sponsor specific prizes will be selected by participating companies and their representatives.
Each project will be evaluated in the 3 categories below
Product
A data product, something used by a customer over time, will be judged upon its usefulness, interface and creativity.
Analysis
A good analysis project will apply sophisticated statistics or machine learning to derive interesting insights from the dataset. An analysis project will be judged upon technical strength, complexity, impact of the derived result, and innovation.
Visualization
A good visualization project will find interesting ways to look into a dataset or combine information from multiple data sources. A visualization project will be judged upon design aesthetics, value added, and complexity.
The first prize will go to the best overall score. Three subsequent prizes will be selected for each individual category.
It is highly recommended to consider all 3 categories while deciding the scope of your project. Reach out to the organizers or mentors if you have any questions.
General Criteria for Each Category
Creativity of idea
Creativity of approach and solution
Technical difficulty
Importance of question asked and impact on problem addressed
Degree of completion
CartoDB is an easy-to-use, open source, cloud-based geospatial data visualization tool built on top of a PostgreSQL + PostGIS database that easily integrates with other APIs.
Rhine is an API that allows you to build applications that understand relationships between people, places, and things.
D3: https://d3js.org/
Lots of Examples (Plug data into example for wonderful visualizations)
fnu Apoorva, Cornell University, PhD MAE '17
Kaitlyn Gayvert, Cornell University, PhD Comp Bio '17
Aleksandr Makarov, Columbia University, MS Data Science '16
Mayur Saxena, Columbia University, PhD BioMed '18
Harshit Saxena, Columbia University, MS CS '16
Tianyun Wu, Cornell University, MS Statistical Sci '16
A complete list of hackathons planned for this academic year: eship.cornell.edu/hackathons
Tech Events Manager Ami Stuart can be reached by clicking the 'contact organizer' link below on the right.