Data Analytics Cumulative Capstone Projects

Data Analytics Cumulative Capstone Project Description:

Use historical Citi Bike trip data* to identify patterns in urban transportation across NYC. Explore how usage varies by time, location, and weather, and develop insights to inform operational or policy decisions.

*You may choose your own dataset; Citi Bike is only an example.

Deliverables: 

  1. Gather & Prepare Data
    • Download Citi Bike data for the most recent 12-month period. Join it with relevant external data sources (e.g., weather, borough population) using SQL or Python, then clean and normalize the dataset for analysis.
  2. Perform Exploratory Data Analysis 
    • Analyze trends in trip duration, popular start and end stations, usage by time of day/week, and user demographics. Use Excel, Python (Pandas, Seaborn), or SQL queries to generate summary statistics and initial visuals.
  3. Build Geospatial and Time-Based Visuals
    • Use Tableau or Python to map high-traffic areas, station-to-station flows, and time series of ride volume. Focus on identifying usage peaks, gaps, and imbalances across neighborhoods.
  4. Create a Predictive or Insight Model
    • Use Python (scikit-learn) to build a simple predictive model (e.g., regression or clustering) to estimate demand based on time, location, and weather. Explain your feature selection and report the model's accuracy.
  5. Develop a Final Presentation
    • Prepare a presentation (slides or dashboard) explaining your findings, visualizations, and methodology. Clearly communicate patterns, anomalies, and recommendations, and describe the tools you used at each step. You should discuss the strengths and limitations of your analysis, along with potential steps to enhance it further.
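
As a starting point for deliverables 1, 2, and 4, the following is a minimal sketch rather than a reference solution. It assumes pandas and scikit-learn are installed. The column names (`started_at`, `start_station`, `date`, `temp_c`, `rain_mm`), the helper functions, and the tiny in-memory tables are all hypothetical stand-ins invented for illustration; real trip and weather files will use different schemas and need far more cleaning.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up sample rows standing in for real trip records (hypothetical columns).
trips = pd.DataFrame({
    "started_at": pd.to_datetime([
        "2024-06-01 08:05", "2024-06-01 08:40", "2024-06-01 17:15",
        "2024-06-02 08:10", "2024-06-02 17:30", "2024-06-02 17:45",
        "2024-06-03 08:20", "2024-06-03 12:00",
    ]),
    "start_station": ["A", "B", "A", "A", "B", "B", "A", "B"],
})

# Made-up daily weather table (hypothetical columns), one row per date.
weather = pd.DataFrame({
    "date": pd.to_datetime(["2024-06-01", "2024-06-02", "2024-06-03"]),
    "temp_c": [24.0, 27.5, 18.0],
    "rain_mm": [0.0, 0.0, 6.2],
})


def prepare(trips: pd.DataFrame, weather: pd.DataFrame) -> pd.DataFrame:
    """Step 1: derive date/hour keys and left-join daily weather onto trips."""
    out = trips.copy()
    out["date"] = out["started_at"].dt.normalize()  # midnight of the trip day
    out["hour"] = out["started_at"].dt.hour
    # validate= raises an error if the weather table has duplicate dates.
    return out.merge(weather, on="date", how="left", validate="many_to_one")


def hourly_demand(prepared: pd.DataFrame) -> pd.DataFrame:
    """Step 2: count rides per (date, hour), keeping the weather columns."""
    # Rows with missing weather would be dropped by groupby; handle them first
    # when working with real data.
    keys = ["date", "hour", "temp_c", "rain_mm"]
    return prepared.groupby(keys, as_index=False).size().rename(
        columns={"size": "rides"}
    )


def fit_demand_model(demand: pd.DataFrame) -> LinearRegression:
    """Step 4: fit a plain linear regression of ride counts on hour and weather."""
    features = demand[["hour", "temp_c", "rain_mm"]]
    return LinearRegression().fit(features, demand["rides"])


prepared = prepare(trips, weather)
demand = hourly_demand(prepared)
model = fit_demand_model(demand)
print(demand)
print(dict(zip(["hour", "temp_c", "rain_mm"], model.coef_)))
```

With a handful of rows the fitted coefficients are meaningless; the sketch only shows the shape of the pipeline. On a real 12-month dataset, hold out part of the data (for example, the final month) to measure accuracy, and consider encoding hour as a categorical or cyclical feature, since demand is not linear in the hour of day.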