1. Background

In 2016, A Company launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geo-tracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime.

Company sets itself apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use the assistive options. Users are more likely to ride for leisure, but about 30% use them to commute to work each day.

Company offers three pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase single-ride or full-day passes are referred to as casual riders. Customers who purchase annual memberships are members.

2. Objective

Maximize the number of annual subscription members for future growth by converting casual riders into annual subscribers.

3. Business Task

Understand how annual members and casual riders use the bikes differently, to help the marketing analyst team achieve the objective.


4. Data Description

The dataset is collected and hosted by the company and consists of its customers' bike usage data from June 2020 to May 2021. It contains information about ride type, start and end station names with locations, and customer type (casual or member). It is appropriate, complete and current enough to address the business task at hand; however, 7.72% of rows are missing at least one value. The original dataset is available at http://tiny.cc/ridedata

 [1] "ride_id"             [2] "rideable_type"     
 [3] "started_at"          [4] "ended_at"          
 [5] "start_station_name"  [6] "start_station_id"  
 [7] "end_station_name"    [8] "end_station_id"    
 [9] "start_lat"          [10] "start_lng"         
[11] "end_lat"            [12] "end_lng"           
[13] "member_casual"    

Table 1 : Dataset Sample, First 10 Rows :-

ride_id           rideable_type  started_at               ended_at                 start_station_name          end_station_name                member_casual
02668AD35674B983  docked_bike    2020-05-27 10:03:52 UTC  2020-05-27 10:16:49 UTC  Franklin St & Jackson Blvd  Wabash Ave & Grand Ave          member
7A50CCAF1EDDB28F  docked_bike    2020-05-25 10:47:11 UTC  2020-05-25 11:05:40 UTC  Clark St & Wrightwood Ave   Clark St & Leland Ave           casual
2FFCDFDB91FE9A52  docked_bike    2020-05-02 14:11:03 UTC  2020-05-02 15:48:21 UTC  Kedzie Ave & Milwaukee Ave  Kedzie Ave & Milwaukee Ave      casual
58991CF1DB75BA84  docked_bike    2020-05-02 16:25:36 UTC  2020-05-02 16:39:28 UTC  Clarendon Ave & Leland Ave  Lake Shore Dr & Wellington Ave  casual
A79651EFECC268CD  docked_bike    2020-05-29 12:49:54 UTC  2020-05-29 13:27:11 UTC  Hermitage Ave & Polk St     Halsted St & Archer Ave         member
1466C5B39F68F746  docked_bike    2020-05-29 13:27:24 UTC  2020-05-29 14:14:45 UTC  Halsted St & Archer Ave     May St & Taylor St              member
2500D7957D4D0A34  docked_bike    2020-05-20 12:51:41 UTC  2020-05-20 13:46:47 UTC  Hermitage Ave & Polk St     Hermitage Ave & Polk St         member
ED42D3E06AFB2F26  docked_bike    2020-05-06 18:21:42 UTC  2020-05-06 19:07:07 UTC  Ritchie Ct & Banks St       Ritchie Ct & Banks St           casual
23AFBD962F9C8F14  docked_bike    2020-05-30 17:00:58 UTC  2020-05-30 17:19:52 UTC  Halsted St & Clybourn Ave   Broadway & Barry Ave            casual
52C0D13F6B81C5F8  docked_bike    2020-05-23 10:22:02 UTC  2020-05-23 10:52:02 UTC  Damen Ave & Cortland St     Western Ave & Division St       casual

5. Data Cleaning Log

  • Found that 314299 rows, or 7.72% of the total dataset, have at least one missing value.

  • Backed up the original dataset and removed those rows.

  • However, 7.72% is nearly one month's worth of data. To compensate for the loss, cleaned and added May 2020 data to the dataset.

  • Removed the columns ‘start_station_id’, ‘end_station_id’, ‘start_lat’, ‘start_lng’, ‘end_lat’ and ‘end_lng’.

  • After cleaning, the dataset contains approximately 4M records.

  • Generated “trip duration (in minutes)” for each ride from the trip start (started_at) and end (ended_at) times.

  • Generated the “day of week” on which each ride started from the ride start time (started_at).

 [1] "ride_id"             [2] "rideable_type"     
 [3] "started_at"          [4] "ended_at"          
 [5] "start_station_name"  [6] "end_station_name"
 [7] "member_casual"       [8] "ride_length_in_min"
 [9] "day_of_week"
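The cleaning steps above could be sketched roughly as follows. This is a minimal illustration in plain Python, not the procedure actually used for this report. The function name clean_rides, the timestamp format (without the UTC suffix shown in Table 1), the station id values and the incomplete third row are all assumptions made for the example.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed format, without the " UTC" suffix
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
DROPPED_COLUMNS = {"start_station_id", "end_station_id",
                   "start_lat", "start_lng", "end_lat", "end_lng"}


def clean_rides(rows):
    """Drop incomplete rows, remove unused columns, add ride length and day of week."""
    cleaned = []
    for row in rows:
        # A row counts as incomplete if any field is None or an empty string.
        if any(value is None or value == "" for value in row.values()):
            continue
        start = datetime.strptime(row["started_at"], TIME_FORMAT)
        end = datetime.strptime(row["ended_at"], TIME_FORMAT)
        ride = {k: v for k, v in row.items() if k not in DROPPED_COLUMNS}
        ride["ride_length_in_min"] = (end - start).total_seconds() / 60
        ride["day_of_week"] = DAY_NAMES[start.weekday()]  # weekday(): Monday is 0
        cleaned.append(ride)
    return cleaned


# ride_id and timestamps of the first two rows come from Table 1;
# the station ids and the third (incomplete) row are made up.
sample_rows = [
    {"ride_id": "02668AD35674B983", "started_at": "2020-05-27 10:03:52",
     "ended_at": "2020-05-27 10:16:49", "start_station_id": "36",
     "member_casual": "member"},
    {"ride_id": "52C0D13F6B81C5F8", "started_at": "2020-05-23 10:22:02",
     "ended_at": "2020-05-23 10:52:02", "start_station_id": "58",
     "member_casual": "casual"},
    {"ride_id": "HYPOTHETICAL0001", "started_at": "2020-06-01 08:00:00",
     "ended_at": "2020-06-01 08:10:00", "start_station_id": "",
     "member_casual": "casual"},
]

cleaned_rides = clean_rides(sample_rows)
for ride in cleaned_rides:
    print(ride["ride_id"], ride["day_of_week"], round(ride["ride_length_in_min"], 2))
```

Checking for missing values before dropping columns mirrors the order of the log: a row with a missing station id or coordinate is removed even though that column is discarded afterwards.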

6. Summary of Analysis


Table 2 : Distribution from May 2020 to May 2021 :-

The table below describes the bike usage patterns of the two customer types, Casual and Member, from May 2020 to May 2021. It shows the total number of rides taken by each type, their average ride length in minutes, and the durations of their longest and shortest rides.

Customer Type  Number of Rides  Average Ride Time  Max Ride Time  Min Ride Time
Casual         1664875          44 min             37.7 days      1 min
Member         2294132          18 min             28.6 days      1 min
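A summary of this shape can be computed by grouping ride lengths by customer type. The sketch below is one possible way to do it in plain Python; the function name summarize_by_customer_type and the sample rides are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def summarize_by_customer_type(rides):
    """Return ride count, mean, max and min ride length (minutes) per customer type."""
    lengths = defaultdict(list)
    for ride in rides:
        lengths[ride["member_casual"]].append(ride["ride_length_in_min"])
    return {
        customer: {
            "rides": len(values),
            "avg_min": sum(values) / len(values),
            "max_min": max(values),
            "min_min": min(values),
        }
        for customer, values in lengths.items()
    }


# Hypothetical rides, for illustration only.
sample_rides = [
    {"member_casual": "casual", "ride_length_in_min": 60.0},
    {"member_casual": "casual", "ride_length_in_min": 30.0},
    {"member_casual": "member", "ride_length_in_min": 12.0},
    {"member_casual": "member", "ride_length_in_min": 18.0},
    {"member_casual": "member", "ride_length_in_min": 15.0},
]

summary = summarize_by_customer_type(sample_rides)
for customer, stats in summary.items():
    print(customer, stats)
```

A maximum expressed in days, as in Table 2, is the maximum in minutes divided by 1440 (the number of minutes in a day).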

Table 3 : Distribution by Day of Week :-

The table below further groups the dataset by day of week to show the usage patterns of Casual customers and Members: the number of rides taken and the average ride length in minutes for each day.

             Casual                                  Member
Day of Week  No. of Rides  Average Ride Time (mins)  No. of Rides  Average Ride Time (mins)
Sunday       320876        49                        298406        17
Monday       182720        43                        304880        14
Tuesday      165858        48                        317286        37
Wednesday    176122        39                        338724        14
Thursday     183012        41                        330460        14
Friday       239281        41                        344404        15
Saturday     397031        45                        360155        17
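A two-level grouping like this (a time bucket, then customer type) might be sketched as below. The helper name rides_and_average_by and the sample rides are hypothetical. The same helper would work for a month column if one were derived from started_at.

```python
from collections import defaultdict


def rides_and_average_by(rides, key):
    """Map (key value, customer type) to (ride count, average ride length in minutes)."""
    groups = defaultdict(list)
    for ride in rides:
        groups[(ride[key], ride["member_casual"])].append(ride["ride_length_in_min"])
    return {
        group: (len(values), sum(values) / len(values))
        for group, values in groups.items()
    }


# Hypothetical rides, for illustration only.
sample_rides = [
    {"day_of_week": "Saturday", "member_casual": "casual", "ride_length_in_min": 40.0},
    {"day_of_week": "Saturday", "member_casual": "casual", "ride_length_in_min": 50.0},
    {"day_of_week": "Saturday", "member_casual": "member", "ride_length_in_min": 15.0},
    {"day_of_week": "Monday", "member_casual": "member", "ride_length_in_min": 10.0},
]

by_day = rides_and_average_by(sample_rides, "day_of_week")
for (day, customer), (count, average) in sorted(by_day.items()):
    print(day, customer, count, average)
```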

Table 4 : Distribution By Month :-

The table below further groups the dataset by month to show the usage patterns of Casual customers and Members: the number of rides taken and the average ride length in minutes for each month of the year.

           Casual                                  Member
Month      No. of Rides  Average Ride Time (mins)  No. of Rides  Average Ride Time (mins)
January    14690         26                        68819         12
February   8613          47                        34383         14
March      75642         38                        130049        13
April      120420        38                        177787        14
May        303592        43                        347355        16
June       154509        51                        188028        18
July       268733        59                        281692        17
August     283006        44                        325504        16
September  215300        38                        285090        15
October    122810        31                        216493        14
November   73033         33                        149756        13
December   24552         92                        89359         95

Table 5 : Distribution By Station By Number of Rides Taken :-

The table below lists the 10 most used stations for Casual customers and Members respectively, ranked by the number of rides started from each station.

Casual                                    Member
Start Station Name          No. of Rides  Start Station Name          No. of Rides
Streeter Dr & Grand Ave     36482         Clark St & Elm St           23399
Lake Shore Dr & Monroe St   28034         Wells St & Concord Ln       18154
Millennium Park             25185         Broadway & Barry Ave        17944
Theater on the Lake         18454         Dearborn St & Erie St       17689
Michigan Ave & Oak St       18346         St. Clair St & Erie St      17614
Lake Shore Dr & North Blvd  17193         Theater on the Lake         17459
Indiana Ave & Roosevelt Rd  16389         Kingsbury St & Kinzie St    17277
Michigan Ave & Lake St      14309         Wells St & Elm St           16937
Clark St & Elm St           13977         Wells St & Huron St         16524
Shedd Aquarium              13719         Lake Shore Dr & North Blvd  16013
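A ranking like this can be produced by counting ride starts per station for one customer type and keeping the largest counts. The sketch below uses Python's collections.Counter. The function name top_start_stations is hypothetical, and the sample rides reuse station names from the table with made-up counts.

```python
from collections import Counter


def top_start_stations(rides, customer_type, n=10):
    """Return the n most used start stations for one customer type as (station, rides) pairs."""
    counts = Counter(
        ride["start_station_name"]
        for ride in rides
        if ride["member_casual"] == customer_type
    )
    return counts.most_common(n)


# Hypothetical rides: station names are from Table 5, the counts are made up.
sample_rides = (
    [{"start_station_name": "Streeter Dr & Grand Ave", "member_casual": "casual"}] * 3
    + [{"start_station_name": "Millennium Park", "member_casual": "casual"}] * 2
    + [{"start_station_name": "Shedd Aquarium", "member_casual": "casual"}] * 1
    + [{"start_station_name": "Clark St & Elm St", "member_casual": "member"}] * 4
)

top_casual = top_start_stations(sample_rides, "casual", n=2)
print(top_casual)
```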

Table 6 : Distribution By Station By Average Ride Duration :-

The table below lists the top 10 stations for Casual customers and Members respectively, ranked by the average length in minutes of rides started from each station.

Casual                                               Member
Start Station Name         Average Ride Time (mins)  Start Station Name           Average Ride Time (mins)
Kostner Ave & Lake St      920                       W Oakdale Ave & N Broadway   204
Clyde Ave & 87th St        860                       Racine Ave & 61st St         150
Kenton Ave & Madison St    752                       Marshfield Ave & 59th St     102
Ashland Ave & 63rd St      741                       Kenton Ave & Madison St      100
Laramie Ave & Kinzie St    706                       California Ave & Lake St     94
Ellis Ave & 83rd St        576                       Cherry Ave & Blackhawk St    73
East End Ave & 87th St     520                       Kedzie Ave & Foster Ave      70
Calumet Ave & 71st St      460                       Warren Park West             66
Carpenter St & 63rd St     407                       Eberhart Ave & 131st St      63
Kedzie Ave & Roosevelt Rd  380                       Western Ave & Granville Ave  62

7. Results

7.1. Chart 1 and Chart 2 show the distribution of the total number of rides taken and the average ride time from May 2020 to May 2021 (13 months) for Casual customers and Members respectively. From these charts we can observe that Casual customers take fewer but longer rides than Members.

7.2. Chart 3 and Chart 4 show the distribution of the number of rides taken and the average ride time by day of week for Casual customers and Members respectively. From these charts we can observe that on Saturday and Sunday, unlike the rest of the week, Casual customers take both more and longer rides than Members.

7.3. Chart 5 and Chart 6 show the distribution of the number of rides taken and the average ride time by month for Casual customers and Members respectively. The same pattern as in Chart 1 and Chart 2 can be observed here as well, but the number of rides increases between May and August, and the average ride time rises sharply in December, for both customer types.

7.4. Chart 7 and Chart 8 show the number of rides started and the average ride duration for each station, in descending order, for Casual customers and Members respectively. From the distribution by station we can see that 75% of rides start from 20% of the stations, and that Casual customers use only 10% of the stations for longer rides.
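A concentration figure such as "75% of rides start from 20% of the stations" can be checked by sorting stations by ride count and summing the busiest ones. The sketch below shows one way to do this. The function name share_from_top_stations and the station counts are hypothetical, so the printed share does not reproduce the report's figure.

```python
def share_from_top_stations(station_counts, top_percent=20):
    """Return the fraction of all rides that start from the busiest top_percent of stations."""
    counts = sorted(station_counts.values(), reverse=True)
    # Integer arithmetic avoids floating-point rounding; always keep at least one station.
    top_n = max(1, len(counts) * top_percent // 100)
    return sum(counts[:top_n]) / sum(counts)


# Hypothetical ride starts per station, for illustration only.
station_counts = {
    "Station A": 60,
    "Station B": 15,
    "Station C": 10,
    "Station D": 10,
    "Station E": 5,
}

top_share = share_from_top_stations(station_counts, top_percent=20)
print(f"{top_share:.0%} of rides start from the busiest 20% of stations")
```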


8. Recommendations