StravaWeather

Every cyclist knows just how much of an effect the weather can have, especially wind. The wind speed and direction can mean the difference between flying along, effortlessly smashing KOMs, and realising that you have to keep pedalling to avoid going backwards!

Which makes it all the more surprising that Strava has no provision for weather information (and, as far as I know, no plans to add it).

So I thought I would have a crack, then left it for a year thinking about how complex getting, storing and retrieving all the weather data would be. By chance, I then found a great weather API at World Weather Online that would hugely speed up the project. So here goes….

Essentially, it is a project of two halves:

  • A server process to fetch Strava activities and weave weather data into them.
  • Client side code to map an activity and display the route with weather information overlaid onto it.

This blog post will outline the server side code whilst the next one will describe the client.

Server side code - StravaWeatherServlet

Essentially, the principal job of the StravaWeatherServlet is to sit between the browser client and the Strava servers, relaying requests for Strava activity data, parsing them, fetching the weather information from World Weather Online for the correct time and locations, and adding this weather info to the activity data.

Following an authentication step, where the servlet requests a Strava access token, the client asks the servlet to get a list of activities to choose from.

When the user selects an activity, the client sends the servlet an ‘activity’ request with a unique activity ID. This is then used in the function ‘getStravaActivityWithWeather’ which issues the activity request to the Strava API.

The information is returned as JSON which looks like this:

{
  "comment_count": 0,
  "segment_efforts": [
    {
      "distance": 1038.3,
      "start_date_local": "2016-03-02T09:33:04Z",
      ...
      "segment": {
        "distance": 1041.8,
        "start_latlng": [
          53.424473,
          -2.217701
        ],
        "end_latlng": [
          53.433642,
          -2.214634
        ],
        "name": "1km Time Trial (N) - Parkville to Parsonage",
        ...
      },
      "elapsed_time": 133,
      "moving_time": 133,
      "start_date": "2016-03-02T09:33:04Z"
    },
    {
      
      "distance": 702.9,
      "start_date_local": "2016-03-02T09:33:52Z",
      "segment": {
        ...
        "start_latlng": [
          53.427669,
          -2.216579
        ],
        ...
        "end_latlng": [
          53.433826,
          -2.214619
        ],
        ...
      },
      "name": "Money Saver to Parsonage Rd sprint",
      "elapsed_time": 88,
      "id": 12172204111,
      "pr_rank": null,
      "moving_time": 88,
      "start_date": "2016-03-02T09:33:52Z"
    },
    ...
  ],
  "type": "Ride",
  "end_latlng": [
    51.32,
    -1.23
  ],
  "kilojoules": 138.6,
  ... 
  "max_speed": 12.6,
  "start_latlng": [
    53.42,
    -2.22
  ],
  "name": "Morning Ride",
  ...
  "start_latitude": 53.42,
  "location_city": "Manchester",
  "elapsed_time": 1402,
  "average_speed": 6.203,
  "moving_time": 1291,
  "start_date": "2016-03-02T09:29:19Z",
  "calories": 154.6,
  ...
}

I have omitted some of the data here, but each activity consists of some general information (start and end co-ordinates and date-times) and a list of ‘segment efforts’, each of which contains a start date-time and co-ordinates. These provide a set of co-ordinates in time and space that we can use to fetch weather data.

On receiving the activity, we parse the JSON and look at each segment effort, extracting the “start_latlng” co-ordinates and the “start_date” date and time of that segment. We then use these to request weather information for a particular place and point in time.

final JSONArray segmentEfforts = activity.getJSONArray("segment_efforts");
for (int i=0; i<segmentEfforts.length(); i++)
{
	final JSONObject segmentEffort = segmentEfforts.getJSONObject(i);
	if (segmentEffort.has("segment") && segmentEffort.has("start_date"))
	{
		final long time = 
			stravaDateFormat.parse(
				segmentEffort.getString("start_date")
			).getTime();
			
		final JSONObject segment = 
			segmentEffort.getJSONObject("segment");
			
		if (segment.has("start_latlng"))
		{
			final JSONArray startLatLng = 
				segment.getJSONArray("start_latlng");

			WeatherInfo weather = 
				getWeatherInfoByTimeAndLocation(
					weatherAccessToken, 
					time, startLatLng.getDouble(0), 
					startLatLng.getDouble(1)
				);

			if (weather!=null)
			{
				segment.put("weather",new JSONObject(weather));
			}

			Thread.sleep(200); //sleep() is static; pause 200 ms before the next segment
		}
	}
}

The function ‘getWeatherInfoByTimeAndLocation’ fetches weather data from api.worldweatheronline.com for a specific date and lat/long. The WorldWeatherOnline API returns a set of hourly weather observations from the weather station closest to the location and for the date requested. To prevent repeated requests about similar locations and times, the responses are cached on disk against co-ordinates formatted to 3 decimal places and the datetime to nearest day. The appropriate hourly observation is then selected from the response with the following code:

JSONArray hourly = weather.getJSONArray("hourly");

JSONObject curWeather = null;
for (int i=0; i<hourly.length(); i++)
{
	curWeather = hourly.getJSONObject(i);
	long curWeatherTime = 
		dtFmt.parse(dateQuery+" "+curWeather.getString("time")+"00").getTime();
	if (i==hourly.length()-1) //curWeather is the last observation, use that
	{
		break;
	}

	final JSONObject nextWeather = hourly.getJSONObject(i+1);
	long nextWeatherTime = 
		dtFmt.parse(dateQuery+" "+nextWeather.getString("time")+"00").getTime();
	if (time < nextWeatherTime) //requested time is between 
	{				//curWeather and nextWeather
		//select the closest observation to the requested time
		//default is curWeather
		if ((time-curWeatherTime)>(nextWeatherTime-time))
		{
			curWeather=nextWeather;
		}
		break;
	}
}
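The disk cache mentioned above keys each response on the co-ordinates to 3 decimal places and on the day. A minimal sketch of such a key follows, written in JavaScript for brevity rather than the servlet's Java. The function name, the key layout and the reading of ‘nearest day’ as the UTC calendar date are all assumptions, not taken from the servlet source.

```javascript
// Hypothetical helper (not from the servlet source): build a cache key
// from a latitude, a longitude and a time in milliseconds since the epoch.
function weatherCacheKey(lat, lng, timeMs) {
  // Co-ordinates formatted to 3 decimal places, as described in the post.
  var latStr = lat.toFixed(3);
  var lngStr = lng.toFixed(3);
  // Assumption: "nearest day" is taken as the UTC calendar date (yyyy-MM-dd).
  var day = new Date(timeMs).toISOString().slice(0, 10);
  return latStr + "_" + lngStr + "_" + day;
}

// Two illustrative points a few metres and seconds apart share one key,
// so only the first would trigger a request to the weather API.
var keyA = weatherCacheKey(53.424473, -2.217701, Date.parse("2016-03-02T09:33:04Z"));
var keyB = weatherCacheKey(53.424101, -2.217950, Date.parse("2016-03-02T09:33:52Z"));
console.log(keyA, keyA === keyB);
```

Three decimal places of a degree is roughly 100 metres of latitude, so neighbouring segment starts tend to collapse onto the same cached response.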

This observation data is then added to the JSON for each of the segments (plus the start and end of the route). The result looks like this:

...
"segment": {
			"country": "United Kingdom",
			"distance": 237.7,
			"city": "Manchester",
			"end_longitude": -2.233929,
			"end_latitude": 53.473518,
			"start_latlng": [53.471899, -2.234528],
			"elevation_low": 41,
			"starred": false,
			"end_latlng": [53.473518, -2.233929],
			"name": "Campus Link Path (Reverse)",
			"weather": {
				"temp": 4,
				"windChill": -3,
				"description": "Light sleet showers",
				"windDirDeg": 294,
				"windDir": "WNW",
				"iconURL": "http://cdn.worldweathe...sleet_showers.png",
				"time": 0,
				"windSpeed": 43
			},
			"id": 5785697,
			"state": "England"
		},

This extra weather information is used by the AJAX front-end code to plot the weather information on a map along with some segment information. (More about this in the next blog post).

All source and webapp code for this project is here. You will need to edit the following files to make it work:

1) WEB-INF/web.xml

<servlet>
    <servlet-name>StravaWeatherServlet</servlet-name>
    <servlet-class>io.wiretrip.stravaweather.StravaWeatherServlet</servlet-class>
    <init-param>
        <param-name>webapp_path</param-name>
        <param-value>http://127.0.0.1:8082/stravaweather/</param-value>
    </init-param>
    <init-param>
        <param-name>strava_client_id</param-name>
        <param-value>xxxx</param-value>
    </init-param>
    <init-param>
        <param-name>strava_client_secret</param-name>
        <param-value>xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx</param-value>
    </init-param>
    <init-param>
        <param-name>weather_api_token</param-name>
        <param-value>xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx</param-value>
    </init-param>
</servlet>

You will need to set the webapp_path to that of your deployment. The strava_client_id and strava_client_secret are assigned to your application by Strava in the Manage Applications section at Strava Developers. Likewise, when you register for API access at World Weather Online you will be given an access token, which you need to enter for weather_api_token.

StreamGraph

A while ago at work we had a need to display changing ‘share of voice’ or ‘contribution of themes’ information over time. Essentially this would mean displaying the ‘Top 1 themes’ or ‘Top 1 authors’ for each epoch or timespan. The same authors and themes would typically exist in multiple epochs, but not necessarily contiguously; authors/themes could appear for a few epochs, disappear again and then reappear later on, and new themes could appear at any point. But for this fact, a stacked area graph or a classic stream graph would work perfectly. Some WWW research, however, soon revealed this lovely graph. It is able to show the proportional sizes of the entities in each epoch and their changes in ‘rank’, and it allows entities to ‘join’ and leave, starting and ending streams over their lifetimes. Unfortunately, no source was available and I was unable to find any implementations at all. The only option was to do it myself, the results of which are here.

To keep things simple, and instantly useful, there were three constraints:

  • Implement in generic javascript with minimal dependencies - only jquery for handling json data.
  • Use SVG for the actual drawing - this is now fairly standard across browsers and requires no 3rd party libraries.
  • Use the json data produced by the API of our in-house data processing software, DexterDiscovery.

The library code is here and there is a working example at the end of the post.

How it works

1. Organise the data

The first stage of processing is to fetch the graph’s data - this arrives as JSON in the following format:

{"itemCountsByDate":
	[
	{"score":7, "partition":"BIRMINGHAM", "name":"BIRMINGHAM", "date":1393632000000},
	{"score":6, "partition":"BRADFORD", "name":"BRADFORD", "date":1393632000000},
	{"score":8, "partition":"BRISTOL", "name":"BRISTOL", "date":1393632000000},
	{"score":6, "partition":"GLASGOW", "name":"GLASGOW", "date":1393632000000}...
	{"score":9, "partition":"LEEDS", "name":"LEEDS", "date":1395360000000}
	]
}

Each object represents a count of items of a particular category or series (‘partition’) and at a particular point in time (‘date’ : number of milliseconds since 01/01/1970).

We use jquery to fetch the data and then we ‘bin’ the items into particular epochs (we do this by converting the ‘date’ value into a sortable string representation, e.g. ‘2015/06/05’ and adding the items to a ‘StreamColumn’ object representing that date).

$.each( data.itemCountsByDate, 
	  function( key, val ) 
	  {
		var date = new Date(val.date);
		
		//create 'x' scale of dates in format 'yyyy/MM/dd'
		var dateStr = (date.getFullYear()+'/'+
						('0'+(date.getMonth()+1)).slice(-2)+'/'+ //getMonth() is 0-based
						('0'+date.getDate()).slice(-2));
		var curColumn = streamColumns[dateStr];
		if (curColumn == null)
		{
			//add a new column for this date
			curColumn = new StreamColumn();
			streamColumns[dateStr] = curColumn;
		}
		curColumn.addItem(val);
	  }
  );

Note that when we add an ‘item’ to a column, we add it as an object ‘keyed’ on the ‘partition id’ to allow us to access the items by partitionId.

StreamColumn.prototype.addItem = function(item)
{
	this.items[item.partition] = item;
	this.totalScore += item.score;
}

By now we have all the data placed into ‘date’ columns. Optionally, at this point we can fill in some of the blanks, either by insisting that each column ‘stack’ contains all of the partitions/series and adding 0 scored items where they are not already there, or by ‘bridging’ series across columns with missing values as below:

Columns with a missing 'BRADFORD' bridged.

We do this by looking at each series in a stack and checking whether it is in the previous column. If it is not, we check the column before that; if it is in that ‘previous-previous column’, we add a 0 scoring item of that series to the previous column.

//insert 'bridging' items
for (var key in curColumn.items) 
{
	var curItem = curColumn.items[key];
	var prevItem = prevColumn.items[key]; //look for an item 
						//from the same partition 
						//in the previous stack
	if (prevItem == null) //series is missing from the previous column
	{
		if (i>1) //check the previous-previous stack for occurrence
		{
			var prevPrevColumn = streamColumns[columnKeys[i-2]];
			var prevPrevItem = prevPrevColumn.items[key];
		
			if (prevPrevItem != null)
			{
				//add a 'bridge' intermediate item 
				//to the previous column
				var bridgeItem = {}; //must be an object before properties are set
				bridgeItem.score=0;
				bridgeItem.partition=curItem.partition;
				bridgeItem.name=curItem.name;
				
				prevColumn.addItem(bridgeItem);
			}
		}
	}
}
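The other gap-filling option mentioned earlier, insisting that every column contains every series and adding 0 scored items where they are missing, could look something like the sketch below. The StreamColumn constructor shown here is an assumed minimal stand-in (only addItem appears in the post), and padColumns is a hypothetical helper name.

```javascript
// Minimal stand-in for StreamColumn; the constructor shape is an assumption.
function StreamColumn() {
  this.items = {};
  this.totalScore = 0;
}
StreamColumn.prototype.addItem = function (item) {
  this.items[item.partition] = item;
  this.totalScore += item.score;
};

// Hypothetical helper: make sure every column holds every partition.
function padColumns(streamColumns) {
  // First pass: collect each partition seen anywhere, remembering its name.
  var names = {};
  Object.keys(streamColumns).forEach(function (dateStr) {
    var items = streamColumns[dateStr].items;
    Object.keys(items).forEach(function (partition) {
      names[partition] = items[partition].name;
    });
  });
  // Second pass: add a zero-scored item wherever a partition is missing.
  Object.keys(streamColumns).forEach(function (dateStr) {
    var column = streamColumns[dateStr];
    Object.keys(names).forEach(function (partition) {
      if (!(partition in column.items)) {
        column.addItem({ score: 0, partition: partition, name: names[partition] });
      }
    });
  });
  return streamColumns;
}
```

Because the added items score 0, each column's totalScore is unchanged; only the set of series present in each stack grows.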

2. Prepare the layout

First we sort the columns into date order (by sorting our ‘map’ of StreamColumns by the date/column label). Next we sort each stack into descending score order (we could optionally sort by series name, which would essentially give us a stacked area graph).
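As a rough sketch of those two sorts: because the column labels are ‘yyyy/MM/dd’ strings, a plain string sort of the keys also puts them in date order. The function names below are hypothetical and the columns are plain objects shaped like the StreamColumn described earlier.

```javascript
// Hypothetical helper: column labels in date order.
// 'yyyy/MM/dd' strings sort chronologically under the default string sort.
function sortedColumnKeys(streamColumns) {
  return Object.keys(streamColumns).sort();
}

// Hypothetical helper: the items of one column, highest score first.
function sortedStack(column) {
  return Object.keys(column.items)
    .map(function (key) { return column.items[key]; })
    .sort(function (a, b) { return b.score - a.score; });
}

var demoColumns = {
  "2015/06/06": { items: { LEEDS: { score: 9, partition: "LEEDS" } } },
  "2015/06/05": {
    items: {
      BRADFORD: { score: 6, partition: "BRADFORD" },
      BRISTOL: { score: 8, partition: "BRISTOL" }
    }
  }
};
console.log(sortedColumnKeys(demoColumns));
console.log(sortedStack(demoColumns["2015/06/05"]).map(function (item) {
  return item.partition;
}));
```

Swapping the comparator for one that compares item names would give the stacked-area ordering instead.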

Finally, we iterate through each column and mark out a stack of rectangles, where the width is defined in ‘columnWidth’ and the height is the score * yScale. The yScale can be decided in two ways:

  • (locally scaled/non proportional) The column height in pixels / The total score of that column (all item scores added). This will give a graph where every column is the same total height. Item heights are not comparable across columns.
  • (proportional/globally scaled) The column height in pixels / The column total score in the whole graph. This makes the item heights comparable across columns.
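The two scaling options can be sketched as below. The function names are hypothetical, and the global version assumes that ‘the column total score in the whole graph’ means the largest column total, so that the tallest column exactly fills the available height.

```javascript
// Locally scaled: each column is stretched to the full height on its own.
function localYScale(columnHeightPx, column) {
  return column.totalScore > 0 ? columnHeightPx / column.totalScore : 0;
}

// Globally scaled (assumption: divide by the largest column total).
function globalYScale(columnHeightPx, streamColumns) {
  var maxTotal = 0;
  Object.keys(streamColumns).forEach(function (key) {
    maxTotal = Math.max(maxTotal, streamColumns[key].totalScore);
  });
  return maxTotal > 0 ? columnHeightPx / maxTotal : 0;
}

var scaleColumns = {
  "2015/06/05": { totalScore: 10 },
  "2015/06/06": { totalScore: 20 }
};
// With a 200px column, an item scoring 5 in the first column is
// 5 * 20 = 100px tall locally, but 5 * 10 = 50px tall globally.
console.log(5 * localYScale(200, scaleColumns["2015/06/05"]));
console.log(5 * globalYScale(200, scaleColumns));
```

With the global scale an item of a given score is the same height in every column, which is what makes heights comparable across the graph.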

3. Draw the graph

Finally, we draw the graph using SVG tags. Essentially we draw a stack of rectangles, but with a twist: for each item, we check whether there is an item of the same series in the previous column. If there isn’t, we draw a simple rectangle. If there is, we draw a path that starts at the top-right corner of the previous rectangle, follows a Bezier curve to the top left of this item’s rectangle, runs round the rectangle, and then returns via another Bezier curve to the bottom right of the previous rectangle. This creates a nice join between the two columns.

if (prevItem==null) //just draw rectangle
{
	curItem.colour = chartItemColours[(colourIdx++) % chartItemColours.length];
	pathStr = "M"+curItem.left+","+curItem.top+" L"+curItem.right+","+curItem.top+
		" L"+curItem.right+","+curItem.bottom+
		" L"+curItem.left+","+curItem.bottom+" Z";
}
else //draw a rectangle with a Bezier 'join' to the rect of the item in the 
	 //same series in the previous column
{
	curItem.colour = prevItem.colour;
	var midX = prevItem.right + ((curItem.left - prevItem.right) / 2);
	pathStr = "M" + (prevItem.right-1)+","+prevItem.top+" C"+midX+","+
		prevItem.top+" "+midX+","+curItem.top+" "+curItem.left+","+curItem.top+
		" L"+curItem.right+","+curItem.top+
		" L"+curItem.right+","+curItem.bottom+
		" L"+curItem.left+","+curItem.bottom+
		" C"+midX+","+curItem.bottom+" "+midX+
		","+prevItem.bottom+" "+(prevItem.right-1)+","+prevItem.bottom+" Z";
}

Example - Temperatures in 2015 in major UK cities.

The HTML is here, the Javascript is here and the data JSON is here.

DeXtree

The hierarchical file system is a wonderful approach to organising files but it is also a great opportunity to ‘lose’ files in amongst all those directories. Most file explorers typically provide two ways of finding files. One is to search, whereby you have to give the filename (or some part of it) and the system will show matching files. The other is to look tediously in each folder until you see the file. The first option is quick if you know the filename, but sometimes you can’t remember it. The other is painful.

The answer to both these problems is the ‘flat view’: showing all the files in a selected folder (or whole drive) and all subfolders. Curiously this is not a feature readily found in most file managers (I remember XTree could do it). I decided to write such a file explorer and to throw in a few other features not found in any other file managers. The result was DeXtree, implemented in Delphi (Object Pascal) using the lovely Virtual Treeview control by Mike Lischke.

DeXtree application showing list view.

Features

  • Flat File View - Shows all the files in a selected folder, plus all those in subfolders. Optionally shows contents of system and hidden folders too.
  • List view sortable by all columns and with search as you type in each column. Instantly find all the biggest files or newest files on a disk, for example.
  • Can copy the list view as a table of file details to the clipboard for pasting into Excel, Word or a text editor. Very useful if you need to list the files in a directory, for example in instructions or manifests.
  • File content hashing (using MD5 at present). Used to find duplicate files, either by sorting on the hash column or by using the ‘Select Dupes’ routine, which highlights all except the first instance of a file with a particular hash. Sorting by date, you can highlight the oldest or newest versions of files.
  • Timeline view - as far as I know, this is the only file manager that offers it (although I remember a Microsoft Demo for WinFS called Life Journal that did something like it). Essentially, this shows all the files presented on a variable granularity timeline, allowing exploration of files temporally - sometimes you know a file by when you wrote it.

You can download it here. It is a little rough around the edges, so any bugs or suggestions are welcome.