Embedding Jupyter Notebooks

I picked up these tips for embedding Jupyter notebooks hosted on GitHub from the two posts below.

Master Post: https://www.andrewchallis.co.uk/portfolio/php-nbconvert-a-wordpress-plugin-for-jupyter-notebooks/

Supporting post: https://www.eg.bucknell.edu/~brk009/notebook-on-wp/

Installation Steps:

  1. Install WP Pusher
  2. Install the GitHub plugin through WP Pusher
  3. Plugin repository: https://github.com/ghandic/nbconvert
  4. Add the additional CSS code
  5. Insert a shortcode like the one below
  6. Celebrate

Personal Activity Tracking Data Analysis by Ivaylo Pavlov

In [1]:
#Generic Imports and Display Settings
import numpy as np, pandas as pd, matplotlib.pyplot as plt, matplotlib.patches as mpatches, warnings
import seaborn as sns, scipy.stats as ss, matplotlib.mlab as mlab
warnings.filterwarnings('ignore')

%pylab inline
plt.rc("savefig", dpi=200)
matplotlib.style.use('ggplot')

pd.set_option('display.max_colwidth',80)
Populating the interactive namespace from numpy and matplotlib

Import and clean the daily data, and transform the intraday heart rate data for use

In [2]:
#Import the CSV files with the daily and intraday data
raw_daily_data = pd.read_csv("Health Data-daily.csv")
raw_intraday = pd.read_csv('Health Data-intraday.csv')
raw_sleep_data = pd.read_csv('Sleep Analysis.csv')

#Extract only the heart rate from the intraday data and delete the rest, add a day-only column
raw_intraday_data = raw_intraday.copy()
raw_intraday_data.index = raw_intraday_data['Start']
raw_intraday_data = raw_intraday_data.loc[:,'Heart Rate (count/min)']
raw_intraday_data = pd.DataFrame(raw_intraday_data)
raw_intraday_data.index = pd.to_datetime(raw_intraday_data.index)
raw_intraday_data = raw_intraday_data.loc[raw_intraday_data.index>'11-Mar-2016 18:00']
raw_intraday_data = raw_intraday_data.replace(0, np.nan)
raw_intraday_data = raw_intraday_data.dropna(axis=0)
raw_intraday_data['Date'] = raw_intraday_data.index.date
hr_intraday = raw_intraday_data

#Extract only the steps from the intraday data and delete the rest, add date, hour and weekday columns
raw_intraday_data2 = raw_intraday.copy()
raw_intraday_data2 = raw_intraday_data2.loc[:,['Start','Finish','Steps (count)']]
raw_intraday_data2 = pd.DataFrame(raw_intraday_data2)
raw_intraday_data2.index = pd.to_datetime(raw_intraday_data2['Start'])
#raw_intraday_data2 = raw_intraday_data2.ix[raw_intraday_data2.index>'11-Mar-2016 18:00']
raw_intraday_data2 = raw_intraday_data2.replace(0, np.nan)
raw_intraday_data2 = raw_intraday_data2.dropna(axis=0)
raw_intraday_data2['Date'] = raw_intraday_data2.index.date
raw_intraday_data2['Hour'] = raw_intraday_data2.index.time
raw_intraday_data2['Weekday'] = pd.DatetimeIndex(raw_intraday_data2['Date']).dayofweek
days = {0:'0 Mon',1:'1 Tue',2:'2 Weds',3:'3 Thurs',4:'4 Fri',5:'5 Sat',6:'6 Sun'}
raw_intraday_data2['Weekday'] = raw_intraday_data2['Weekday'].apply(lambda x: days[x])
steps_intraday = raw_intraday_data2

#Extract only the heart rate from the intraday data and delete the rest, add date, hour and weekday columns
raw_intraday_data3 = raw_intraday.copy()
raw_intraday_data3 = raw_intraday_data3.loc[:,['Start','Finish','Heart Rate (count/min)']]
raw_intraday_data3 = pd.DataFrame(raw_intraday_data3)
raw_intraday_data3.index = pd.to_datetime(raw_intraday_data3['Start'])
raw_intraday_data3 = raw_intraday_data3.loc[raw_intraday_data3.index>'11-Mar-2016 18:00']
raw_intraday_data3 = raw_intraday_data3.replace(0, np.nan)
raw_intraday_data3 = raw_intraday_data3.dropna(axis=0)
raw_intraday_data3['Date'] = raw_intraday_data3.index.date
raw_intraday_data3['Hour'] = raw_intraday_data3.index.time
raw_intraday_data3['Weekday'] = pd.DatetimeIndex(raw_intraday_data3['Date']).dayofweek
days = {0:'0 Mon',1:'1 Tue',2:'2 Weds',3:'3 Thurs',4:'4 Fri',5:'5 Sat',6:'6 Sun'}
raw_intraday_data3['Weekday'] = raw_intraday_data3['Weekday'].apply(lambda x: days[x])
hr_intraday2 = raw_intraday_data3

#Clean up the sleep data and aggregate by day and add a weekday to the table
raw_sleep_data2 = raw_sleep_data.loc[:,["In bed Finish","Minutes in bed"]]
raw_sleep_data2.index = pd.to_datetime(raw_sleep_data2["In bed Finish"])
raw_sleep_data2.index = raw_sleep_data2.index.date
raw_sleep_data2 = raw_sleep_data2.drop("In bed Finish", axis=1)
raw_sleep_data2 = raw_sleep_data2.groupby(raw_sleep_data2.index).agg('sum')
raw_sleep_data2['Hours in bed'] = raw_sleep_data2['Minutes in bed']/60
raw_sleep_data2 = raw_sleep_data2.drop("Minutes in bed", axis=1)
raw_sleep_data2['Weekday'] = pd.DatetimeIndex(raw_sleep_data2.index).dayofweek
days = {0:'0 Mon',1:'1 Tue',2:'2 Weds',3:'3 Thurs',4:'4 Fri',5:'5 Sat',6:'6 Sun'}
raw_sleep_data2['Weekday'] = raw_sleep_data2['Weekday'].apply(lambda x: days[x])
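
The weekday-labelling step is repeated for each table in the cell above. As a sketch only, it could be factored into a single helper. The name `add_weekday_label` and the toy data are assumptions for illustration and are not part of the original notebook; the helper assumes the frame's index holds dates.

```python
import pandas as pd

# Same label mapping as the notebook: the numeric prefix keeps weekdays sorted Mon..Sun
DAYS = {0: '0 Mon', 1: '1 Tue', 2: '2 Weds', 3: '3 Thurs', 4: '4 Fri', 5: '5 Sat', 6: '6 Sun'}

def add_weekday_label(df):
    """Return a copy of df with a 'Weekday' column derived from its date index (hypothetical helper)."""
    out = df.copy()
    weekday_numbers = pd.DatetimeIndex(out.index).dayofweek  # Monday=0 ... Sunday=6
    out['Weekday'] = [DAYS[n] for n in weekday_numbers]
    return out

# Toy data standing in for the real CSV export
toy = pd.DataFrame({'Hours in bed': [7.5, 6.0]},
                   index=pd.to_datetime(['2017-09-04', '2017-09-10']))
labelled = add_weekday_label(toy)
print(labelled)
```

Returning a copy keeps the input frame unchanged, which makes the helper safe to call on any of the intermediate tables.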
In [3]:
#Daily amount of sleep
raw_sleep_data2.plot.bar(color='g',figsize=(10,2))
Out[3]:
<matplotlib.axes._subplots.AxesSubplot at 0x20c8d0422b0>
In [4]:
# Average sleep per weekday
raw_sleep_data2.groupby("Weekday").agg('mean').plot.bar(color='purple',figsize=(5,2))
Out[4]:
<matplotlib.axes._subplots.AxesSubplot at 0x20c8d5a2be0>
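
To make the sleep aggregation easier to follow, here is a minimal sketch on made-up numbers. It groups sleep sessions by the calendar date on which they ended and converts the summed minutes to hours, which is the same idea as the cell above. The values in `toy_sleep` are invented; only the column names come from the notebook.

```python
import pandas as pd

# Invented sessions: a night's sleep plus a nap ending on the same date, then another night
toy_sleep = pd.DataFrame({
    'In bed Finish': ['2017-09-04 07:00', '2017-09-04 15:30', '2017-09-05 06:45'],
    'Minutes in bed': [420, 60, 390],
})

# Calendar date on which each session ended
wake_dates = pd.to_datetime(toy_sleep['In bed Finish']).dt.date

# Sum the minutes per date, then convert to hours
hours_per_day = toy_sleep.groupby(wake_dates)['Minutes in bed'].sum() / 60
print(hours_per_day)
```

Grouping by the finish date means a nap is added to the same day as the night that ended that morning.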

Drop the all-zero columns, set the table index, drop the unneeded columns, and add Total Calories and Weekday columns to the dataset

In [5]:
data = raw_daily_data.loc[:, (raw_daily_data != 0).any(axis=0)] #Keep only columns with at least one non-zero value
data = data.set_index(pd.DatetimeIndex(data['Start']))
data = data.drop(['Start','Finish'],axis=1)

data['Total Calories (kcal)'] = data['Active Calories (kcal)'] + data['Resting Calories (kcal)']
data['Weekday'] = data.index.dayofweek
days = {0:'0 Mon',1:'1 Tue',2:'2 Weds',3:'3 Thurs',4:'4 Fri',5:'5 Sat',6:'6 Sun'}
data['Weekday'] = data['Weekday'].apply(lambda x: days[x])
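
The column filter in the first line of the cell above can be hard to read at a glance. The sketch below shows the same boolean-mask idea on a toy frame; the values are invented and only the column names echo the notebook.

```python
import pandas as pd

# Toy frame: one metric that was never recorded (all zeros) and two that were
toy = pd.DataFrame({
    'Steps (count)': [0, 7257, 7038],
    'Blood Glucose (mg/dL)': [0.0, 0.0, 0.0],
    'Weight (kg)': [0.0, 58.35, 0.0],
})

# True for every column that contains at least one non-zero value
keep = (toy != 0).any(axis=0)

# Select only those columns
trimmed = toy.loc[:, keep]
print(list(trimmed.columns))
```

A column survives as soon as it has a single non-zero entry, which is why sparsely recorded metrics can remain in the real table.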
In [6]:
#Get table stats - rows and columns
print ("ROWS OF DATA / DAILY: " + str(raw_daily_data.shape[0]) + " / INTRADAY: " + str(hr_intraday.shape[0]))
print ("")
data.tail(7)
ROWS OF DATA / DAILY: 767 / INTRADAY: 8149

Out[6]:
Active Calories (kcal) Blood Glucose (mg/dL) Blood Pressure (Diastolic) (mmHg) Blood Pressure (Systolic) (mmHg) Body Fat Percentage (%) Body Mass Index (count) Distance (km) Flights Climbed (count) Heart Rate (count/min) Lean Body Mass (kg) Oxygen Saturation (%) Resting Calories (kcal) Steps (count) Weight (kg) Total Calories (kcal) Weekday
Start
2017-09-04 349.592000 0.0 0.0 0.0 0.00 0.0 5.597074 18.0 104.0 0.000 0.0 1580.811218 7257.000000 0.00 1930.403218 0 Mon
2017-09-05 353.263329 0.0 0.0 0.0 0.00 0.0 5.420324 18.0 116.0 0.000 0.0 1584.984832 7038.004403 0.00 1938.248161 1 Tue
2017-09-06 416.007722 0.0 0.0 0.0 0.00 0.0 7.115341 21.0 132.0 0.000 0.0 1592.209780 9240.202021 0.00 2008.217502 2 Weds
2017-09-07 657.412282 0.0 0.0 0.0 0.12 18.6 7.555005 19.0 162.0 51.348 0.0 1601.301638 9711.000000 58.35 2258.713919 3 Thurs
2017-09-08 332.833916 0.0 0.0 0.0 0.00 0.0 5.087559 10.0 107.0 0.000 0.0 1579.810541 6691.846079 0.00 1912.644457 4 Fri
2017-09-09 429.742219 0.0 0.0 0.0 0.00 0.0 7.816389 7.0 105.0 0.000 0.0 1605.664488 10102.153921 0.00 2035.406707 5 Sat
2017-09-10 309.694532 0.0 0.0 0.0 0.00 0.0 2.485014 0.0 140.0 0.000 0.0 837.291716 3186.000000 0.00 1146.986248 6 Sun
In [7]:
data.loc[:,'Body Fat Percentage (%)'] *= 100 #Scale Body Fat Percentage (%) by 100
#Treat 0 as "no measurement" and carry the last recorded value forward
data.loc[:,'Weight (kg)'] = data.loc[:,'Weight (kg)'].replace(to_replace=0, method='ffill')
data.loc[:,'Body Fat Percentage (%)'] = data.loc[:,'Body Fat Percentage (%)'].replace(to_replace=0, method='ffill')
data.loc[:,'Body Mass Index (count)'] = data.loc[:,'Body Mass Index (count)'].replace(to_replace=0, method='ffill')
data.loc[:,'Lean Body Mass (kg)'] = data.loc[:,'Lean Body Mass (kg)'].replace(to_replace=0, method='ffill')
data_for_weight = data[np.isfinite(data['Weight (kg)'])]
data_for_weight = data_for_weight.loc[:,['Weight (kg)','Lean Body Mass (kg)','Body Fat Percentage (%)','Body Mass Index (count)']]
data_for_weight = data_for_weight.loc[data_for_weight.index>'2016-03-28']
print ("Rows remaining with Weight data: " + str(len(data_for_weight)))
Rows remaining with Weight data: 531
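
The forward-fill step treats a zero as "no measurement that day" and reuses the last recorded value. A minimal sketch of that idea, written as a two-step form that does not rely on the `method` argument of `replace`, might look like this. The dates and zeros are toy data; this is one possible equivalent, not the notebook's own code.

```python
import numpy as np
import pandas as pd

# Toy weight series: 0.0 marks days without a weigh-in
weight = pd.Series([59.03, 0.0, 0.0, 58.35, 0.0],
                   index=pd.date_range('2017-09-06', periods=5))

# Step 1: turn zeros into missing values. Step 2: carry the last real value forward.
filled = weight.replace(0, np.nan).ffill()
print(filled)
```

With this form, any zeros before the first real measurement stay as NaN, because there is nothing earlier to carry forward.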

Clean up the STEPS, FLIGHTS CLIMBED and DISTANCE data by setting all days with fewer than 550 steps or less than 0.400 km to NaN (days when the tracker was not worn or its battery died)

In [8]:
print ("Rows before clean up: " + str(len(data)))
data_for_steps = data #Alias, not a copy: the NaN assignments below also change data
data_for_steps.loc[data_for_steps.loc[:,'Steps (count)']<550,'Steps (count)'] = np.nan
data_for_steps.loc[data_for_steps.loc[:,'Distance (km)']<0.400,'Distance (km)'] = np.nan
data_for_steps = data_for_steps[np.isfinite(data_for_steps['Steps (count)'])]
data_for_steps = data_for_steps[np.isfinite(data_for_steps['Distance (km)'])]
print ("Rows lost after clean up: " + " " + str(len(data)-len(data_for_steps.index)))
print ("Rows after clean up: " + " " + str(len(data_for_steps.index)))
data_for_steps.tail()
Rows before clean up: 767
Rows lost after clean up:  103
Rows after clean up:  664
Out[8]:
Active Calories (kcal) Blood Glucose (mg/dL) Blood Pressure (Diastolic) (mmHg) Blood Pressure (Systolic) (mmHg) Body Fat Percentage (%) Body Mass Index (count) Distance (km) Flights Climbed (count) Heart Rate (count/min) Lean Body Mass (kg) Oxygen Saturation (%) Resting Calories (kcal) Steps (count) Weight (kg) Total Calories (kcal) Weekday
Start
2017-09-06 416.007722 0.0 0.0 0.0 12.7 18.8 7.115341 21.0 132.0 51.53319 0.0 1592.209780 9240.202021 59.03 2008.217502 2 Weds
2017-09-07 657.412282 0.0 0.0 0.0 12.0 18.6 7.555005 19.0 162.0 51.34800 0.0 1601.301638 9711.000000 58.35 2258.713919 3 Thurs
2017-09-08 332.833916 0.0 0.0 0.0 12.0 18.6 5.087559 10.0 107.0 51.34800 0.0 1579.810541 6691.846079 58.35 1912.644457 4 Fri
2017-09-09 429.742219 0.0 0.0 0.0 12.0 18.6 7.816389 7.0 105.0 51.34800 0.0 1605.664488 10102.153921 58.35 2035.406707 5 Sat
2017-09-10 309.694532 0.0 0.0 0.0 12.0 18.6 2.485014 0.0 140.0 51.34800 0.0 837.291716 3186.000000 58.35 1146.986248 6 Sun
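
The threshold clean-up can be sketched on a toy frame as follows. Unlike the cell above, where `data_for_steps = data` makes both names point at the same table, this sketch works on an explicit copy so the original stays untouched. The thresholds are the ones from the notebook and the values are invented.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'Steps (count)': [7257.0, 120.0, 9711.0],
    'Distance (km)': [5.6, 0.1, 0.3],
}, index=pd.date_range('2017-09-04', periods=3))

cleaned = toy.copy()  # explicit copy, so `toy` keeps its original values

# Mark implausibly low days as missing (tracker not worn or battery flat)
cleaned.loc[cleaned['Steps (count)'] < 550, 'Steps (count)'] = np.nan
cleaned.loc[cleaned['Distance (km)'] < 0.400, 'Distance (km)'] = np.nan

# Keep only days where both measurements survived
cleaned = cleaned.dropna(subset=['Steps (count)', 'Distance (km)'])
print(len(toy) - len(cleaned), "rows dropped")
```

Whether to copy or alias is a design choice: the notebook's later cells rely on the NaN values also being present in `data`, so switching to a copy there would change its results.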

Clean up CALORIES, WEEKDAY, HEART RATE (daily) data

In [9]:
data_for_cal = data #Alias, not a copy: the NaN assignment below also changes data
data_for_cal.loc[data_for_cal.loc[:,'Total Calories (kcal)']<0.1,['Total Calories (kcal)','Active Calories (kcal)','Resting Calories (kcal)']] = np.nan
data_for_cal = data[np.isfinite(data['Total Calories (kcal)'])]
data_for_cal = data_for_cal.loc[data_for_cal.index>'2016-03-11']
data_for_cal2 = data_for_cal.drop(['Body Fat Percentage (%)','Body Mass Index (count)','Lean Body Mass (kg)','Weight (kg)','Distance (km)','Flights Climbed (count)','Steps (count)','Weekday','Total Calories (kcal)','Heart Rate (count/min)'],axis=1)

data_for_weekday = data.drop(['Body Fat Percentage (%)','Body Mass Index (count)','Lean Body Mass (kg)','Weight (kg)'],axis=1)

data_for_hr = data[["Heart Rate (count/min)","Weekday"]].copy()
data_for_hr = data_for_hr.loc[data_for_hr["Heart Rate (count/min)"]>0]
In [10]:
plt.figure(1,figsize=(18,7))

plt.subplot(311)
plt.title('Frequency Charts')
plt.legend(handles=[mpatches.Patch(color='green', label='Steps')])
plt.hist(data_for_steps.loc[:,'Steps (count)'], bins=90, color='g')
plt.xlim(0,data_for_steps.loc[:,'Steps (count)'].max())

plt.subplot(312)
plt.legend(handles=[mpatches.Patch(color='orange', label='Kilometers')])
plt.hist(data_for_steps.loc[:,'Distance (km)'], bins=90, color='orange')
plt.xlim(0,data_for_steps.loc[:,'Distance (km)'].max())

plt.subplot(313)
plt.legend(handles=[mpatches.Patch(color='royalblue', label='Flights Climbed')])
plt.hist(data_for_steps.loc[:,'Flights Climbed (count)'], bins=60, color='royalblue')
plt.xlim(0,data_for_steps.loc[:,'Flights Climbed (count)'].max())

plt.figure(2,figsize=(16,3))
plt.subplot(121)
plt.legend(handles=[mpatches.Patch(color='red', label='Total Calories (kcal)')])
plt.hist(data_for_cal.loc[:,'Total Calories (kcal)'], bins=60, color='red')
plt.xlim(0,data_for_cal.loc[:,'Total Calories (kcal)'].max())

plt.subplot(122)
plt.legend(handles=[mpatches.Patch(color='salmon', label='Weight (kg)')])
plt.hist(data_for_weight.loc[:,'Weight (kg)'], bins=70, color='salmon')
plt.xlim(data_for_weight.loc[:,'Weight (kg)'].min(),data_for_weight.loc[:,'Weight (kg)'].max())

plt.tight_layout()
plt.show()
In [11]:
medianval = np.round(data_for_steps.loc[:,'Steps (count)'].median(),1)
avgval = np.round(data_for_steps.loc[:,'Steps (count)'].mean(),1)
maxval = np.round(data_for_steps.loc[:,'Steps (count)'].max(),1)

minor_ticks = np.arange(0, maxval+1, 1500, dtype=int)
minor_labels = minor_ticks

ax1 = data_for_steps.loc[:,'Steps (count)'].plot(color='black',figsize=(11, 3),linewidth=1.0)

ax1.set_ylim(0,maxval)
ax1.set_ylabel('Steps')
ax1.set_yticks(minor_ticks)
ax1.set_yticklabels(minor_labels)
ax1.yaxis.tick_right()

ax1.set_xticks(data_for_steps.index, minor=True)
#Add the horizontal and vertical lines
ax1.axhline(y=medianval, linewidth=1, color='y')
ax1.axhline(y=avgval, linewidth=1, color='g')
ax1.axvline(x='2016-03-11', linewidth=1, color='r')
ax1.axvline(x='2016-11-25', linewidth=1, color='r')
ax1.grid(False)

#Rolling 20 Day MA
ma = data_for_steps.loc[:,'Steps (count)'].rolling(20).mean()
ax1.plot(ma)

pd.DataFrame(data_for_steps["Steps (count)"].describe()).transpose()
Out[11]:
               count         mean          std    min     25%     50%          75%           max
Steps (count)  664.0  7458.360687  3533.766019  562.0  5510.0  7165.0  9231.050505  25540.233598
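
The 20-day moving average in the cell above uses `rolling(20).mean()`. The sketch below shows the same call with a window of 3 on invented step counts, so the arithmetic is easy to check by hand.

```python
import pandas as pd

# Invented daily step counts
steps = pd.Series([3000, 6000, 9000, 12000, 3000],
                  index=pd.date_range('2017-09-01', periods=5))

# Each value is the mean of the current day and the two days before it.
# The first two days have no full window, so they come out as NaN.
ma3 = steps.rolling(3).mean()
print(ma3)
```

The same warm-up effect applies to the real chart: the first 19 rows of the 20-day average are NaN, so the smoothed line starts later than the raw series.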
In [12]:
medianval2 = np.round(data_for_steps.loc[:,'Flights Climbed (count)'].median(),1)
avgval2 = np.round(data_for_steps.loc[:,'Flights Climbed (count)'].mean(),1)
maxval2 = np.round(data_for_steps.loc[:,'Flights Climbed (count)'].max(),1)

minor_ticks_stairs = np.arange(0, maxval2+2, 3, dtype=int)
minor_labels_stairs = minor_ticks_stairs

ax2 = data_for_steps.loc[:,'Flights Climbed (count)'].plot(color='royalblue',figsize=(11, 3),linewidth=1.0)

ax2.set_ylim(-1,maxval2)
ax2.set_ylabel('Flights of Stairs')
ax2.set_yticks(minor_ticks_stairs)
ax2.set_yticklabels(minor_labels_stairs)
ax2.yaxis.tick_right()

#Add the horizontal and vertical lines
ax2.axhline(y=medianval2, linewidth=1, color='y')
ax2.axhline(y=avgval2, linewidth=1, color='g')
ax2.axvline(x='2016-03-11', linewidth=1, color='r')
ax2.axvline(x='2016-11-25', linewidth=1, color='r')
ax2.grid(False)

#Rolling 20 Day MA
ma2 = data_for_steps.loc[:,'Flights Climbed (count)'].rolling(20).mean()
ax2.plot(ma2,linewidth=1.0)

pd.DataFrame(data_for_steps['Flights Climbed (count)'].describe()).transpose()
Out[12]:
                         count       mean       std  min       25%   50%   75%      max
Flights Climbed (count)  664.0  11.946164  8.222177  0.0  5.448819  11.0  18.0  40.4134
In [13]:
medianval3 = np.round(data_for_steps.loc[:,'Distance (km)'].median(),1)
avgval3 = np.round(data_for_steps.loc[:,'Distance (km)'].mean(),1)
maxval3 = np.round(data_for_steps.loc[:,'Distance (km)'].max(),1)

minor_ticks_km = np.arange(0, maxval3+1, 3, dtype=int)
minor_labels_km = minor_ticks_km

ax6 = data_for_steps.loc[:,'Distance (km)'].plot(color='orange',figsize=(11, 3),linewidth=1.0)

ax6.set_ylim(-1,data_for_steps.loc[:,'Distance (km)'].max())
ax6.set_ylabel('Distance (km)')
ax6.set_yticks(minor_ticks_km)
ax6.set_yticklabels(minor_labels_km)
ax6.yaxis.tick_right()

#Add the horizontal and vertical lines
ax6.axhline(y=medianval3, linewidth=1, color='y')
ax6.axhline(y=avgval3, linewidth=1, color='g')
ax6.axvline(x='2016-03-11', linewidth=1, color='r')
ax6.axvline(x='2016-11-25', linewidth=1, color='r')
ax6.grid(False)

#Rolling 20 Day MA
ma2 = data_for_steps.loc[:,'Distance (km)'].rolling(20).mean()
ax6.plot(ma2,linewidth=1.0)

pd.DataFrame(data_for_steps['Distance (km)'].describe()).transpose()
Out[13]:
               count      mean       std   min       25%       50%       75%        max
Distance (km)  664.0  5.943803  2.861489  0.45  4.360709  5.634852  7.345759  22.046162
In [14]:
#Scatter chart Steps vs Stairs
plt.figure(1,figsize=(15,7))

x = data_for_steps.loc[:,'Steps (count)']
y = data_for_steps.loc[:,'Flights Climbed (count)']
plt.xlabel('Steps (count)')
plt.ylabel('Flights of Stairs')
plt.xlim(-5,data_for_steps.loc[:,'Steps (count)'].max()+200)
plt.tick_params(axis='y', which='both', labelleft=False, labelright=True)
plt.ylim(-2,data_for_steps.loc[:,'Flights Climbed (count)'].max()+2)
plt.grid(False)
plt.axhline(y=avgval2, linewidth=1, color='y')
plt.axvline(x=avgval, linewidth=1, color='r')

#Marker colour encodes distance, marker size encodes total calories
plt.scatter(x, y, alpha=0.7,c=data_for_steps.loc[:,'Distance (km)'], s=data_for_steps.loc[:,'Total Calories (kcal)']/10, cmap=plt.cm.brg)
plt.show()

Most active day ever by flights of stairs climbed

In [15]:
data_for_steps.loc[data_for_steps['Flights Climbed (count)']==data_for_steps['Flights Climbed (count)'].max()]
Out[15]:
Active Calories (kcal) Blood Glucose (mg/dL) Blood Pressure (Diastolic) (mmHg) Blood Pressure (Systolic) (mmHg) Body Fat Percentage (%) Body Mass Index (count) Distance (km) Flights Climbed (count) Heart Rate (count/min) Lean Body Mass (kg) Oxygen Saturation (%) Resting Calories (kcal) Steps (count) Weight (kg) Total Calories (kcal) Weekday
Start
2016-05-06 513.0 0.0 0.0 0.0 12.3 18.0 12.07 40.4134 115.0 48.91029 0.0 1526.0 15216.0 55.77 2039.0 4 Fri

Most active day ever by number of steps

In [16]:
data_for_steps.loc[data_for_steps['Steps (count)']==data_for_steps['Steps (count)'].max()]
Out[16]:
Active Calories (kcal) Blood Glucose (mg/dL) Blood Pressure (Diastolic) (mmHg) Blood Pressure (Systolic) (mmHg) Body Fat Percentage (%) Body Mass Index (count) Distance (km) Flights Climbed (count) Heart Rate (count/min) Lean Body Mass (kg) Oxygen Saturation (%) Resting Calories (kcal) Steps (count) Weight (kg) Total Calories (kcal) Weekday
Start
2017-02-25 683.524 0.0 0.0 0.0 11.9 18.0 20.060258 25.0 102.0 49.080509 0.0 1548.050351 25540.233598 55.709999 2231.574351 5 Sat

Most active day ever by distance covered

In [17]:
data_for_steps.loc[data_for_steps['Distance (km)']==data_for_steps['Distance (km)'].max()]
Out[17]:
Active Calories (kcal) Blood Glucose (mg/dL) Blood Pressure (Diastolic) (mmHg) Blood Pressure (Systolic) (mmHg) Body Fat Percentage (%) Body Mass Index (count) Distance (km) Flights Climbed (count) Heart Rate (count/min) Lean Body Mass (kg) Oxygen Saturation (%) Resting Calories (kcal) Steps (count) Weight (kg) Total Calories (kcal) Weekday
Start
2017-05-28 199.462149 0.0 0.0 0.0 13.3 19.0 22.046162 0.0 107.0 51.63852 0.0 0.0 2940.0 59.56 199.462149 6 Sun
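
The three "most active day" lookups above compare a column against its own maximum. As an alternative sketch, `idxmax` returns the index label of the maximum directly. The toy values below are invented and only the column names come from the notebook.

```python
import pandas as pd

toy = pd.DataFrame({
    'Steps (count)': [7257.0, 25540.2, 3186.0],
    'Distance (km)': [5.6, 20.1, 2.5],
}, index=pd.to_datetime(['2017-02-24', '2017-02-25', '2017-02-26']))

# Index label (here a date) of the row with the most steps
best_day = toy['Steps (count)'].idxmax()

# Double brackets keep the result as a one-row DataFrame rather than a Series
best_row = toy.loc[[best_day]]
print(best_row)
```

One difference worth knowing: `idxmax` returns only the first maximum, while the boolean comparison used in the notebook returns every tied row.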

Weight, Lean Body Mass and Body Fat % Analysis

In [18]:
minor_ticks_weight = np.arange(data_for_weight['Lean Body Mass (kg)'].min()-1, data_for_weight['Weight (kg)'].max()+1, 1, dtype=int)
minor_labels_weight = minor_ticks_weight

ax9 = data_for_weight['Weight (kg)'].plot(secondary_y=True,figsize=(11, 3));
ax9 = data_for_weight['Lean Body Mass (kg)'].plot(secondary_y=True,figsize=(11, 3));
ax9.legend(loc=1, bbox_to_anchor=(0.5, 0.1), ncol=2)
ax9.grid(False)
ax9.yaxis.tick_right()
ax9.set_ylim(data_for_weight['Lean Body Mass (kg)'].min()-1,data_for_weight['Weight (kg)'].max()+1)
ax9.set_yticks(minor_ticks_weight)
ax9.set_yticklabels(minor_labels_weight)

data_for_weight.tail(1)
Out[18]:
            Weight (kg)  Lean Body Mass (kg)  Body Fat Percentage (%)  Body Mass Index (count)
Start
2017-09-10        58.35               51.348                     12.0                     18.6
In [19]:
#Averages, Medians, High, Low per Weekday
data_for_weekday.groupby('Weekday').agg(['mean','median','min','max','std']).transpose()
Out[19]:
Weekday 0 Mon 1 Tue 2 Weds 3 Thurs 4 Fri 5 Sat 6 Sun
Active Calories (kcal) mean 568.621934 511.073526 481.545341 495.200127 473.370946 491.246265 400.132270
median 395.739927 364.571000 360.000000 375.000000 388.569820 388.747598 304.201766
min 40.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
max 4100.000000 3946.000000 2946.000000 2177.000000 3059.000000 2957.000000 2764.000000
std 593.840029 504.718981 417.101937 405.182294 410.723212 467.037564 359.383978
Blood Glucose (mg/dL) mean 0.793347 0.000000 0.000000 0.720624 0.000000 0.000000 0.000000
median 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
min 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
max 86.474822 0.000000 0.000000 79.268587 0.000000 0.000000 0.000000
std 8.282786 0.000000 0.000000 7.557963 0.000000 0.000000 0.000000
Blood Pressure (Diastolic) (mmHg) mean 0.568807 0.000000 0.000000 1.263636 0.663636 1.981818 1.890909
median 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
min 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
max 62.000000 0.000000 0.000000 75.000000 73.000000 78.000000 80.000000
std 5.938523 0.000000 0.000000 9.357994 6.960277 11.925596 11.478217
Blood Pressure (Systolic) (mmHg) mean 0.880734 0.000000 0.000000 2.154545 1.054545 3.327273 3.163636
median 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
min 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
max 96.000000 0.000000 0.000000 125.000000 116.000000 125.000000 122.000000
std 9.195132 0.000000 0.000000 15.929424 11.060166 19.966076 19.005324
Distance (km) mean 5.790673 5.873293 6.175080 6.411459 6.597284 5.709851 4.931396
median 5.802497 5.632659 5.931503 6.383015 6.400032 4.970000 3.835238
min 0.450000 0.740000 0.484112 1.100000 1.460000 0.660000 0.580000
max 12.248473 15.466486 17.800000 11.625000 15.840000 20.060258 22.046162
std 2.305463 2.369301 2.477156 2.012367 2.305351 3.901227 3.972398
Flights Climbed (count) mean 12.250969 11.879532 13.064715 12.828676 11.873166 6.979461 4.654826
median 12.000000 12.000000 12.000000 13.500000 10.595144 5.000000 2.500000
min 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
max 33.000000 35.000000 36.000000 34.569647 40.413400 29.000000 30.000000
std 8.252108 7.577167 8.871316 8.866759 9.473809 7.626648 5.658815
Heart Rate (count/min) mean 88.376147 83.192661 82.532110 81.563636 82.381818 85.218182 86.418182
median 113.000000 107.000000 107.000000 107.500000 111.500000 111.500000 111.000000
min 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
max 175.000000 167.000000 172.000000 183.000000 177.000000 177.000000 169.000000
std 58.367893 58.280846 57.462640 61.832940 58.947961 59.401044 59.016773
Oxygen Saturation (%) mean 0.000000 0.000000 0.000000 0.000000 0.009073 0.000000 0.000000
median 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
min 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
max 0.000000 0.000000 0.000000 0.000000 0.998000 0.000000 0.000000
std 0.000000 0.000000 0.000000 0.000000 0.095156 0.000000 0.000000
Resting Calories (kcal) mean 1505.486257 1527.659352 1498.053099 1496.717078 1495.366365 1505.074339 1496.279053
median 1534.943595 1535.229885 1532.602856 1537.171610 1536.320811 1543.630610 1533.000000
min 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
max 1860.000000 1860.000000 1860.000000 1860.000000 1860.000000 1860.000000 1860.000000
std 421.566375 384.738955 431.300163 424.315087 430.689373 425.612962 424.815336
Steps (count) mean 7330.887368 7460.758010 7832.416694 8044.837209 8367.603705 7234.271643 5880.363648
median 7275.000000 7132.000000 7471.164552 7919.500000 7865.000000 6113.841621 4492.000000
min 562.000000 910.000000 575.000000 1370.000000 1822.000000 803.000000 659.000000
max 14533.000000 20337.000000 21082.000000 14565.000000 19937.000000 25540.233598 25487.000000
std 2862.562375 3006.891487 3122.420119 2506.645600 2926.318364 5001.655157 4401.832853
Total Calories (kcal) mean 2074.108191 2038.732877 1979.598440 1991.917205 1968.737311 1996.320604 1896.411324
median 1927.201609 1910.000000 1894.302661 1911.601566 1949.900322 1918.577737 1857.122187
min 200.250485 304.001291 240.059252 162.826617 186.400372 112.983628 185.183000
max 5960.000000 5806.000000 4806.000000 4037.000000 4919.000000 4817.000000 4624.000000
std 834.472538 706.472484 678.438587 665.370154 660.079853 705.191912 620.406226
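
The layout of the table above, with metrics and statistics as rows and weekdays as columns, comes from aggregating with a list of functions and then transposing. A minimal sketch on invented numbers:

```python
import pandas as pd

toy = pd.DataFrame({
    'Weekday': ['0 Mon', '0 Mon', '1 Tue'],
    'Steps (count)': [7000.0, 8000.0, 9000.0],
})

# One row per (metric, statistic) pair, one column per weekday
summary = toy.groupby('Weekday').agg(['mean', 'max']).transpose()
print(summary)

# Individual cells are addressed by the (metric, statistic) pair and the weekday label
monday_mean = summary.loc[('Steps (count)', 'mean'), '0 Mon']
```

Because zeros are included in these aggregates, metrics that are rarely recorded (blood pressure, for example) show means close to zero in the real table.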

Heart Rate Data Analysis

In [20]:
#Box Plot for Heart Rate Data 
plt.figure(figsize=(32,7))
ax5 = sns.boxplot(x="Date", y="Heart Rate (count/min)", data=hr_intraday, whis=[0,100])
ax5.set_ylim(hr_intraday['Heart Rate (count/min)'].min()-5,hr_intraday['Heart Rate (count/min)'].max()+5)
ax5.set_xticklabels(hr_intraday['Date'].unique())
ax5.grid(False)
ax5.yaxis.tick_right()
ax5.axes.get_xaxis().set_visible(False)
labels = ax5.get_xticklabels()
plt.setp(labels, rotation=45, fontsize=8)
ax5.plot()
Out[20]:
[]
In [21]:
#Cumulative Charts
#Cumulative steps done
#Cumulative km walked
#Cumulative dataset

cumulative = data[['Steps (count)','Distance (km)','Flights Climbed (count)']].copy()
cumulative = cumulative.cumsum(axis=0, skipna=True)
cumulative = cumulative.ffill()
cumulative['Steps (count)'].plot(figsize=(11,3))
cumulative['Distance (km)'].plot(secondary_y=True, style='g')
Out[21]:
<matplotlib.axes._subplots.AxesSubplot at 0x20c8f5b17b8>
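
The cumulative chart depends on how `cumsum` treats the days that were set to NaN during clean-up. The sketch below, on invented numbers, shows why the forward fill is needed afterwards.

```python
import numpy as np
import pandas as pd

# Invented daily steps with two missing days
steps = pd.Series([1000.0, np.nan, 2500.0, np.nan],
                  index=pd.date_range('2017-09-01', periods=4))

# With skipna=True the running total carries past missing days,
# but the missing days themselves remain NaN in the result
running = steps.cumsum(skipna=True)

# Forward-filling closes those gaps so the plotted line is continuous
running_filled = running.ffill()
print(running_filled)
```

Without the fill, the plotted line would break at every day on which the tracker was not worn.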
In [22]:
#Matrix of histograms: steps per day, faceted by year and month
matrixdata = data_for_steps[['Steps (count)','Distance (km)','Flights Climbed (count)','Weekday']].copy()
matrixdata['Month']=pd.to_datetime(matrixdata.index).month
matrixdata['Year']=pd.to_datetime(matrixdata.index).year
g = sns.FacetGrid(matrixdata, row="Year", col="Month", height=4, margin_titles=True)
g.map(plt.hist, "Steps (count)", bins=12)
Out[22]:
<seaborn.axisgrid.FacetGrid at 0x20c96b3def0>
In [23]:
g = sns.FacetGrid(matrixdata, row="Year", col="Month", height=4, margin_titles=True)
g.map(plt.hist, "Flights Climbed (count)", bins=12, color="purple")
Out[23]:
<seaborn.axisgrid.FacetGrid at 0x20c8efc7c88>
In [24]:
g = sns.FacetGrid(matrixdata, row="Year", col="Month", height=4, margin_titles=True)
g.map(sns.regplot, "Steps (count)", "Distance (km)", order=2)
Out[24]:
<seaborn.axisgrid.FacetGrid at 0x20c9d9587f0>
In [25]:
matrixdata
Out[25]:
Steps (count) Distance (km) Flights Climbed (count) Weekday Month Year
Start
2015-09-26 3430.000000 2.812128 1.0 5 Sat 9 2015
2015-09-27 11099.000000 8.199522 4.0 6 Sun 9 2015
2015-09-28 8494.000000 6.162432 21.0 0 Mon 9 2015
2015-09-29 7810.000000 5.719998 6.0 1 Tue 9 2015
2015-09-30 7339.000000 5.386713 11.0 2 Weds 9 2015
2015-10-01 7865.000000 5.838327 3.0 3 Thurs 10 2015
2015-10-02 6728.000000 5.077170 13.0 4 Fri 10 2015
2015-10-05 7172.000000 5.544170 6.0 0 Mon 10 2015
2015-10-06 5612.000000 4.284495 4.0 1 Tue 10 2015
2015-10-07 6199.000000 4.754377 9.0 2 Weds 10 2015
2015-10-08 7974.000000 5.968299 13.0 3 Thurs 10 2015
2015-10-09 10014.000000 7.554140 18.0 4 Fri 10 2015
2015-10-10 9171.000000 7.256715 4.0 5 Sat 10 2015
2015-10-11 6908.000000 5.484903 2.0 6 Sun 10 2015
2015-10-12 6177.000000 4.722327 14.0 0 Mon 10 2015
2015-10-13 6649.000000 4.842473 5.0 1 Tue 10 2015
2015-10-14 7260.000000 5.461417 5.0 2 Weds 10 2015
2015-10-15 6980.000000 5.099102 7.0 3 Thurs 10 2015
2015-10-16 10154.000000 7.335241 15.0 4 Fri 10 2015
2015-10-17 7185.000000 5.154552 9.0 5 Sat 10 2015
2015-10-19 6430.000000 4.855418 12.0 0 Mon 10 2015
2015-10-20 7891.000000 6.329292 10.0 1 Tue 10 2015
2015-10-21 10412.000000 8.208768 14.0 2 Weds 10 2015
2015-10-22 7222.000000 5.628465 8.0 3 Thurs 10 2015
2015-10-23 6699.000000 5.364775 12.0 4 Fri 10 2015
2015-10-24 8619.000000 7.072148 7.0 5 Sat 10 2015
2015-10-25 12973.000000 10.605173 23.0 6 Sun 10 2015
2015-10-26 8483.000000 6.798020 5.0 0 Mon 10 2015
2015-10-27 19284.000000 15.466486 8.0 1 Tue 10 2015
2015-10-28 13498.000000 10.897970 10.0 2 Weds 10 2015
... ... ... ... ... ... ...
2017-08-12 7975.000000 5.977028 7.0 5 Sat 8 2017
2017-08-13 5079.000000 3.987767 11.0 6 Sun 8 2017
2017-08-14 6140.000000 4.699782 14.0 0 Mon 8 2017
2017-08-15 5560.000000 4.254047 12.0 1 Tue 8 2017
2017-08-16 10343.000000 8.005815 23.0 2 Weds 8 2017
2017-08-17 7175.641697 5.499198 21.0 3 Thurs 8 2017
2017-08-18 12927.232888 9.679550 14.0 4 Fri 8 2017
2017-08-19 23223.000000 17.990575 5.0 5 Sat 8 2017
2017-08-20 25487.000000 19.649693 15.0 6 Sun 8 2017
2017-08-21 9022.789601 6.967410 9.0 0 Mon 8 2017
2017-08-22 8408.000000 6.456929 13.0 1 Tue 8 2017
2017-08-23 5572.000000 4.317050 10.0 2 Weds 8 2017
2017-08-24 7125.000000 5.512576 17.0 3 Thurs 8 2017
2017-08-25 9335.000000 7.255489 29.0 4 Fri 8 2017
2017-08-26 11031.000000 8.547414 14.0 5 Sat 8 2017
2017-08-27 8113.965777 6.114310 13.0 6 Sun 8 2017
2017-08-28 8430.448295 6.571817 9.0 0 Mon 8 2017
2017-08-29 7235.000000 5.647706 14.0 1 Tue 8 2017
2017-08-30 8137.000000 6.274538 20.0 2 Weds 8 2017
2017-08-31 12434.507106 9.715011 27.0 3 Thurs 8 2017
2017-09-01 8411.000000 6.469647 19.0 4 Fri 9 2017
2017-09-02 2435.000000 1.908866 0.0 5 Sat 9 2017
2017-09-03 3555.000000 2.748559 0.0 6 Sun 9 2017
2017-09-04 7257.000000 5.597074 18.0 0 Mon 9 2017
2017-09-05 7038.004403 5.420324 18.0 1 Tue 9 2017
2017-09-06 9240.202021 7.115341 21.0 2 Weds 9 2017
2017-09-07 9711.000000 7.555005 19.0 3 Thurs 9 2017
2017-09-08 6691.846079 5.087559 10.0 4 Fri 9 2017
2017-09-09 10102.153921 7.816389 7.0 5 Sat 9 2017
2017-09-10 3186.000000 2.485014 0.0 6 Sun 9 2017

664 rows × 6 columns

In [26]:
matrixdata = matrixdata.sort_values("Weekday")
g = sns.FacetGrid(matrixdata, col="Weekday", height=4, margin_titles=True)
g.map(sns.regplot, "Steps (count)", "Distance (km)", order=2)
Out[26]:
<seaborn.axisgrid.FacetGrid at 0x20ca1435e10>