7pm Fri Mar 6
Keynotes: Hilary Mason
The Future of Recommendation Engines Michael Li
Students have 90 seconds to pitch their ideas
Team formation & networking
8am Sat Mar 7
Breakfast & kickoff @ Cornell Tech
9am Sat Mar 7
Keynotes: Antonio Castro, the slide deck
Claudia Perlich, the slide deck
10am Sat Mar 7
Hacking starts
Stake claim to a work space and begin
11am Sat Mar 7
Tech Talk: Create interactive maps using Excel data w/ CartoDB
1pm Sat Mar 7
Mentors start arriving
1pm Sat Mar 7
Tech Talk: 'Balancing Infrastructure w/Optimization & Problem Formulation' by SailThru Data Scientists
The slide deck
3pm Sat Mar 7
Tech Talk: 'Streaming Data Processing' by Moat Data Scientists
To access presentation via IPython Notebook
5pm Sat Mar 7
Tech Talk: 'Real Time Big Data Processing & Visualization Using Spark & d3.js' by Capital One Data Scientists
12pm Sun Mar 8
Stop hacking
Demos; judges select top 10
1:30pm Sun Mar 8
Panel of industry leaders: The Future of Data Science
2:30pm Sun Mar 8
Final 10 Pitches (4 min pitch, 4 min Q&A)
3:45pm Sun Mar 8
Winners announced
Prizes awarded
4pm Sun Mar 8
Networking
Rules & Guidelines
Teams
Recommended team size is 6 (a minimum of 4 members is enforced).
Data
Datasets used for the Hackathon must be public (all participants should have open access to the data). Data must either be in the public domain and legal to use, or be provided by participating companies. Teams must disclose any use of their own data set.
Code
All code used must be open sourced. It can be existing code, new code written for the hackathon project, or a combination, but all code must be open sourced and licensed appropriately.
Submitting projects
Final submission consists of: title/name of project; a short description of the project (~300 words); name of team & team members; data source; code (linked through GitHub); screencast or screenshots describing the project. A link will be provided for project submissions on Sunday.
Presentations
Presentations will be judged during a poster session on Sunday, where judges will go around and ask teams to present their projects. The top 10 teams (selected by the judges) will then present to all the judges and Hackathon participants. Teams will present using their own computers. Time allotted for each presentation is 4 minutes (strictly enforced), plus up to 4 minutes of Q&A.
Introduction
In line with the core vision of Cornell's newly developed campus in NYC, this event is directly applicable to the new age of data and aims to help students understand the career paths created by it. The event further challenges you to develop solutions in an intensive, limited time frame while working with a team of students outside your own major. This highly in-demand skill set is applicable to companies in almost all domains of life. Throughout the event, mentors from industry and faculty will coach you. At the conclusion of the hackathon on Sunday, teams will demo their proposed solutions to an audience of students, faculty, staff, alumni, mentors, and judges, both physically present and through live streaming. While the event is a competition, it is also a 'coopetition' and encourages collaboration among all teams.
Why Data Science?
Because it has not been done before (at least not on the east coast and not by a university)! Data science hackathons are new for both working professionals and university students. Many major industries are riding a new wave of opportunity as data collection and computation become cheaper and more efficient. A decade ago, only major companies invested in data science. Now, almost all companies are collecting more and more data but struggling to use it efficiently in decision making. There is a huge demand for qualified data scientists.
What is Data Science?
Data science means deriving meaning from data by understanding how it fits into the larger picture of an organization. Think of business analytics that draws on CS, modeling, statistics, analytics, and mathematics. By 2020 the world will generate 50x the amount of information it generated in 2011 [EMC.com]. The U.S. could face a shortage of up to 190,000 professionals with data science skills by 2018 [McKinsey Global Institute]. Business, healthcare, and urban living will all benefit from problems analyzed using data science.
If you would like to participate in the hackathon, feel free to check out some resources and start hacking.
Three Verticals to Compete In
Visualization
Product
Analysis
~$5K in cash & prizes will be awarded
Judges
Larry Solomon, N. America Operating Officer, Accenture
Drew Conway, Head of Data, Project Florida
Sarah Guido, Data Scientist, Bitly
Randy Carnevale, Director of Data Science, Capital One
Kevin Novak, Head Data Scientist, Uber
Jesse Beyroutey, IA Ventures
Lela Prashad, Chief Data Scientist & Co-Founder, NiJeL
Judging
Following the final pitches on Sunday afternoon, the panel of judges will evaluate each project and pick winners.
Company/sponsor specific prizes will be selected by participating companies and their representatives.
Each project will be evaluated in the 3 categories below.
Product
A data product, something used by a customer over time, will be judged upon its usefulness, interface and creativity.
Analysis
A good analysis project will apply sophisticated statistics or machine learning to derive interesting insights from the dataset. An analysis project will be judged upon technical strength, complexity, impact of derived result and innovation.
Visualization
A good visualization project will find interesting ways to look into a dataset or combine information from multiple data sources. A visualization project will be judged upon design aesthetics, value added, and complexity.
The first prize will go to the best overall score. Three subsequent prizes will be selected for each individual category.
It is highly recommended that you consider all 3 while deciding the scope of your project. Reach out to the organizers or mentors if you have any questions.
General Criteria for Each Category
Creativity of idea
Creativity of approach and solution
Technical difficulty
Importance of question asked and impact on problem addressed
Degree of completion
Swag & Prizes

Everyone gets six months of Student Prime free & $5 gift card

Everyone gets a t-shirt, blanket, water bottle & more

Networking & Dinner w/Chief Data Scientist & Engineers

Networking & Lunch with Accenture Tech Executives, Winner Luncheon & embroidered fleece overnight blankets

Dinner with Data Scientists & Engineering Panel

Networking & Drinks with Executives
Tech Talk by Jack Welde

Lunch & Networking with Data Science Team
Capital One Labs SIG bottles

Lunch & Networking with CTO & CSO
CartoDB T-shirts
We have developed a dynamic application that uses localized crime data to find the least risky way to get from point A to point B.
Andy Billingsley-Cornell, BS OR '15 Patrick Boueir-Columbia, MS MSE '16
Gaby Rojas-Cornell, BS CS '15 Seikun Kambashi-Waterloo, BEng SE '18
Marie Muir-Cornell, MS ILR '15 Shayan Masood-Waterloo, BMath CS '16
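One way to picture the "least risky route" idea is a shortest-path search where each street segment carries a risk score instead of a distance. The sketch below is only an illustration under that assumption: the team's actual method is not described here, and the `STREETS` graph, its node names, and its risk values are hypothetical.

```python
import heapq


def least_risky_route(graph, start, goal):
    """Dijkstra's algorithm with edge weights read as risk scores.

    graph maps each node to a dict of {neighbor: risk}, with risk >= 0.
    Returns (path, total_risk), or (None, inf) if goal is unreachable.
    """
    best = {start: 0.0}   # lowest total risk found so far per node
    prev = {}             # predecessor of each node on its best route
    heap = [(0.0, start)]
    while heap:
        risk, node = heapq.heappop(heap)
        if node == goal:
            break
        if risk > best.get(node, float("inf")):
            continue  # stale heap entry, a better route was already found
        for neighbor, edge_risk in graph.get(node, {}).items():
            new_risk = risk + edge_risk
            if new_risk < best.get(neighbor, float("inf")):
                best[neighbor] = new_risk
                prev[neighbor] = node
                heapq.heappush(heap, (new_risk, neighbor))
    if goal not in best:
        return None, float("inf")
    # Walk predecessors back from the goal to rebuild the route.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, best[goal]


# Hypothetical street graph: the direct route via X crosses a risky segment.
STREETS = {
    "A": {"X": 5.0, "Y": 1.0},
    "X": {"B": 1.0},
    "Y": {"Z": 1.0},
    "Z": {"B": 1.0},
    "B": {},
}

route, total = least_risky_route(STREETS, "A", "B")
print(route, total)
```

In this toy graph the longer route through Y and Z is preferred because its summed risk is lower than the two-hop route through X.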
clarity (video at 1:11:10)
We used tf-idf and t-SNE to show in 3D how topics on Twitter are connected. We aimed to color-code streams of tweets by real-time sentiment analysis, and to animate balls representing each topic according to how many tweets were coming in. We rendered this world in a Unity3D application for an Oculus Rift. We also introduced an intuitive means of interacting with the 3D data through simple hand motions, which we tracked with a Microsoft Kinect.
Anton Gilgur-Cornell, BS CS '16 Lance Legel-Columbia, MS CS '16
Baird Howland-Cornell, BS ECE '15 Leo Mebazaa-Columbia, BS CS '16
George Li-Cornell, BS AEM '16 Sagar Vadalia-Cornell, BS CS '18
Jeff Ho-Columbia, MA Stats '15
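For readers unfamiliar with tf-idf, the first step named in the clarity description, here is a minimal standard-library sketch. It is not the team's code: the sample tweets are made up, the whitespace tokenizer is a simplification, and the weighting used (term frequency times `log(N / document frequency)`) is one common variant among several. The t-SNE projection to 3D is not shown; it would normally come from a separate library.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    weight = (term count / document length) * log(N / document frequency)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up sample tweets, for illustration only.
TWEETS = [
    "data science hackathon",
    "data visualization hackathon",
    "oculus rift demo",
]

vectors = tf_idf(TWEETS)
for tweet, weights in zip(TWEETS, vectors):
    print(tweet, weights)
```

Terms shared by several tweets ("data", "hackathon") get lower weights than terms unique to one tweet ("science"), which is what lets the resulting vectors separate topics.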
Lyr[Assist] is a data-driven program, drawing on 110,000 songs across all genres, that helps songwriters and lyricists fill in blanks in their lyrics. It aims to produce more successful lyrics by generating the optimal word choice for the user's input.
Jiheng Lu-Cornell, MS CS '15 Siyi Fan-Cornell, MS CS '15
John Dunn-Cornell, BS AEM '15 Wen Li-Cornell, MEng CS '15
Rose Pember-Cornell, BA Comp Lit '16 Yanfeng Zhou-Cornell, MEng CS '15
Cary Chen-Cornell, MEng CS '15