Combining GPX and HRM files into TCX format

Polar RCX5
Three of the most common file formats for recording exercise data are HRM, GPX and TCX. HRM is an older proprietary, but open, standard created and maintained by Polar, which deals with heartrate information, as well as speed, cadence, altitude and power. GPX is another older standard which deals primarily with geolocation data from GPS receivers. TCX is a newer format that effectively supports all that HRM and GPX support combined, and then some.

My shiny new Polar RCX5 (which I really like) happens to export data (via the Polar WebSync application) as separate HRM and GPX files (for legacy reasons, no doubt), whereas Strava (which I also really like) supports GPX and TCX imports (among others). So of course, I can import my GPX files from the RCX5 to Strava pretty easily. However, that provides Strava with neither heartrate nor cadence data, since the basic GPX format does not support those.

So the question I faced was: how to combine the GPX and HRM files from my RCX5 to a single TCX file? Since I found no appropriate tools readily available, I wrote my own ;)

The Script

Now, the script I created in response to this question is not overly featureful - it certainly does not cover every facet of any of the HRM, GPX or TCX standards. However, it does cover all the data from an RCX5 that I want to use :)

So without any further ado, here's the script:

# gpx2tcx.awk by Paul Colby (http://colby.id.au), no rights reserved ;)
# $Id: gpx2tcx.awk 265 2012-02-11 05:39:47Z paul $

BEGIN {
  # Skip to the HR data in the HRM file.
  FS="="
  while ((!FOUND_HRDATA) && (getline <HRMFILE > 0)) {
    if ($1 == "Version") {
      HRM_VERSION=$2
    } else if ((HRM_VERSION <= 105) && ($1 == "Mode")) {
      FLAG=int(substr($2,1,1)) # First integer flag (0, 1 or 3).
      HAVE_ALTITUDE=(FLAG == 1) ? 1 : 0
      HAVE_CADENCE=(FLAG == 0) ? 1 : 0
      IMPERIAL_UNITS=int(substr($2,3,1)); # Third bit flag (0 or 1).
    } else if ((HRM_VERSION >= 106) && ($1 == "SMode")) {
      HAVE_ALTITUDE=int(substr($2,3,1)) # Third bit flag (0 or 1).
      HAVE_CADENCE=int(substr($2,2,1))  # Second bit flag (0 or 1).
      HAVE_SPEED=int(substr($2,1,1))    # First bit flag (0 or 1).
      IMPERIAL_UNITS=int(substr($2,8,1)); # Eighth bit flag (0 or 1).
    } else if ($1 == "Length") {
      DURATION=$2
    } else if (($1 == "[HRData]") || ($1 == "[HRData]\r")) {
      FOUND_HRDATA=1 # Flag only; the HR data itself is read later.
    }
  }
  FS="[<>= \"]+"

  printf "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\" ?>\n\
<TrainingCenterDatabase xmlns=\"http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2\"\
 xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"\
 xsi:schemaLocation=\"http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2\
 http://www.garmin.com/xmlschemas/TrainingCenterDatabasev2.xsd\">\n"

  printf "\n  <Activities>\n"
  if (!SPORT) SPORT=(HAVE_CADENCE) ? "Biking" : "Running";
  printf "    <Activity Sport=\"%s\">\n", SPORT
}

{
  if ($2 == "trkpt") {
    IN_TRKPT=1
    for (i=1;i<NF;i++) { # Fields are numbered from 1 ($0 is the whole line).
      if ($i == "lat") LATITUDE=$(i+1)
      if ($i == "lon") LONGITUDE=$(i+1)
    }
  } else if ($2 == "time") {
    if (IN_TRKPT) {
      printf "          <Trackpoint>\n"
      printf "            <Time>%s</Time>\n", $3
      printf "            <Position>\n"
      printf "              <LatitudeDegrees>%s</LatitudeDegrees>\n", LATITUDE
      printf "              <LongitudeDegrees>%s</LongitudeDegrees>\n", LONGITUDE
      printf "            </Position>\n"
      if ((HAVE_ALTITUDE == 0) && (ALTITUDE > 0)) {
          printf "            <AltitudeMeters>%f</AltitudeMeters>\n", ALTITUDE
          ALTITUDE=0
      }
      if (FOUND_HRDATA) {
        getline HRMDATA <HRMFILE ; split(HRMDATA, HRMFIELDS, "[\t\r]")
        if (HAVE_ALTITUDE > 0) {
          ALTITUDE=(HRM_VERSION <= 105) ? HRMFIELDS[3] : HRMFIELDS[2+HAVE_SPEED+HAVE_CADENCE];
          if (HRM_VERSION <= 102) ALTITUDE=(ALTITUDE*10);
          if (IMPERIAL_UNITS > 0) ALTITUDE=(ALTITUDE*0.3048); # feet to meters (1 ft = 0.3048 m).
          printf "            <AltitudeMeters>%f</AltitudeMeters>\n", ALTITUDE
        }
        printf "            <HeartRateBpm xsi:type=\"HeartRateInBeatsPerMinute_t\">\n"
        printf "              <Value>%s</Value>\n", HRMFIELDS[1]
        printf "            </HeartRateBpm>\n"
        if (HAVE_CADENCE)
          printf "            <Cadence>%s</Cadence>\n", HRMFIELDS[2+HAVE_SPEED]
      }
    } else {
      printf "      <Id>%s</Id>\n      <Lap StartTime=\"%s\">\n", $3, $3
      split(DURATION, DURATION_ARRAY, ":");
      DURATION_NUMBER=DURATION_ARRAY[1]*60*60 + DURATION_ARRAY[2]*60 + DURATION_ARRAY[3];
      printf "        <TotalTimeSeconds>%s</TotalTimeSeconds>\n", DURATION_NUMBER
      printf "        <DistanceMeters>0</DistanceMeters>\n        <Calories>0</Calories>\n"
      printf "        <Intensity>Active</Intensity>\n        <TriggerMethod>Manual</TriggerMethod>\n"
      printf "        <Track>\n"
    }

  } else if ($2 == "/trkpt") {
    printf "          </Trackpoint>\n"
    IN_TRKPT=0
  } else if ($2 == "/trk") {
    printf "        </Track>\n      </Lap>\n"
  }
}

END {
  printf "    </Activity>\n  </Activities>\n"

  split("$Revision: 265 $", REVISION, " ")
  split("$Date: 2012-02-11 16:39:47 +1100 (Sat, 11 Feb 2012) $", DATE, " ")
  printf "\n  <Author xsi:type=\"Application_t\"> \n\
    <Name>Paul Colby's GPX/HRM to TCX Converter</Name> \n\
    <Build> \n\
      <Version> \n\
        <VersionMajor>1</VersionMajor> \n\
        <VersionMinor>0</VersionMinor> \n\
        <BuildMajor>0</BuildMajor> \n\
        <BuildMinor>%d</BuildMinor> \n\
      </Version> \n\
      <Type>Internal</Type> \n\
      <Time>%sT%s%s</Time> \n\
      <Builder>PaulColby</Builder> \n\
    </Build> \n\
    <LangID>EN</LangID> \n\
    <PartNumber>636-F6C62-79</PartNumber> \n\
  </Author>\n", REVISION[2], DATE[2], DATE[3], DATE[4]

  printf "\n</TrainingCenterDatabase>\n"
}

(You can download it from this direct link, or from the files list at the end of this article).

Ok, so you hopefully already realise that this is an AWK script. AWK is certainly not as well known as a lot of other scripting languages, such as batch files or Bash, but it is very well suited to this task. In particular, the above script would be a lot longer, and a lot more complicated, if written in just about any other language (certainly the languages I'm skilled with anyway).

Usage

So, how to use it? It's pretty simple; usage is as follows:

gawk -f gpx2tcx.awk [-v ALTITUDE=1.0] -v HRMFILE=file.hrm file.gpx > file.tcx

You'll notice that I've called the script gpx2tcx.awk - at its most basic level, that's what it is - a GPX to TCX converter. In other words, you don't need a HRM file to use this script; without a HRM file it will still convert GPX files to TCX quite happily. However, the real benefit of the script (for me at least) comes when you specify an HRMFILE to process too, as shown in the usage text above.

In the usage example shown above, the gawk command will read in both file.hrm and file.gpx, and will output a valid TCX file to file.tcx. It doesn't get much simpler :)

Of course, there are a lot of things that can break the TCX output, but if using HRM and GPX files from a Polar RCX5 (and presumably other Polar devices too), then it should work correctly - if not, let me know in the comments section, and I'll take a look. As it is, it works for 100% of my Polar RCX5 GPX / HRM files (13 activities so far).

How it Works

The BEGIN section parses a number of statements at the head of the HRM file to determine things like whether or not the HRM file includes cadence information. The processing of HRM header data continues until the actual heartrate / cadence data is reached, as indicated by the [HRData] section header. Finally, the BEGIN section prints a basic TCX / XML header.
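
To make the flag handling a little more concrete, here is a minimal sketch of how an SMode line (from a v1.06+ HRM file) breaks down into the flags the script uses. The decode_smode helper and the sample value 111000000 are illustrative assumptions only, not taken from a real HRM file; the character positions are the same ones the substr() calls in the script above read.

```sh
# Sketch only: decode_smode and the sample SMode value are hypothetical.
# Each character of SMode is a 0/1 flag; the script reads positions 1, 2, 3 and 8.
decode_smode() {
  printf 'SMode=%s\n' "$1" | awk -F= '
    $1 == "SMode" {
      printf "speed=%d cadence=%d altitude=%d imperial=%d\n",
        substr($2, 1, 1), substr($2, 2, 1), substr($2, 3, 1), substr($2, 8, 1)
    }'
}

decode_smode 111000000
```

For that sample value, the sketch reports speed, cadence and altitude as present, with metric units.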

The main section converts individual GPS points from GPX to TCX format, including whatever HRM data is available. Notice that when the script first comes across the latitude and longitude values, it has to store them in variables and print them later, after the relevant timestamp. This is because the TCX schema uses sequences for everything, which means that the order of child elements is important... I've never understood why someone would want to enforce ordering of non-identical child elements... it just makes more work in situations like this </rant> ;)
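
The store-now, print-later approach described above can be sketched in isolation like this. It is a cut-down illustration rather than the full script: the reorder_trkpt helper name and the sample trackpoint are made up, and only the Time, latitude and longitude elements are handled.

```sh
# Sketch only: reorder_trkpt is a hypothetical helper, and the sample point is made up.
reorder_trkpt() {
  awk -F'[<>= "]+' '
    $2 == "trkpt" {
      # Remember the attributes; nothing is printed yet.
      for (i = 3; i < NF; i++) {
        if ($i == "lat") lat = $(i + 1)
        if ($i == "lon") lon = $(i + 1)
      }
    }
    $2 == "time" {
      # TCX wants Time first, so the stored position is printed after it.
      printf "<Time>%s</Time>\n", $3
      printf "<LatitudeDegrees>%s</LatitudeDegrees>\n", lat
      printf "<LongitudeDegrees>%s</LongitudeDegrees>\n", lon
    }'
}

reorder_trkpt <<'EOF'
<trkpt lat="-27.4698" lon="153.0251">
  <time>2012-02-11T05:39:47Z</time>
</trkpt>
EOF
```

Even though the GPX input gives the position before the time, the output lists the Time element first, which is the order the TCX schema expects.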

Finally, the END section prints a basic TCX footer, including details about which application (gpx2tcx in this case) created the TCX file.

Polar RCX5 and Strava?

So, getting back to my introductory dilemma, now when I go for a run or ride, I export the Polar RCX5 data as HRM and GPX files and then use this script to combine those into a single TCX file. I then upload that TCX file to Strava, providing the Strava activity with GPS, heartrate and cadence information.

Note that Strava has a number of existing issues relating to uploaded TCX files, such as uploaded runs always being matched against rides, and not runs. I don't believe these are specific to TCX files generated by my script above (after all, my script does generate correctly validating TCX files).

What's Next

In the short term, I already have another two scripts (one a Windows batch file, the other a Bash script) that make the above gpx2tcx.awk script much easier to use, by automatically calling it for all GPX files in a directory that do not already have matching TCX files - very handy!! Those two scripts will be the subject of my next two blog posts, which should be done very soon :)
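
The basic idea behind such a wrapper can be sketched in a few lines of shell. This is only an illustration, not the actual wrapper script: the convert_all function name, the default script path, and the assumption that each NAME.gpx has its HRM data alongside it as NAME.hrm are all made up for the sake of the example.

```sh
# Sketch only: convert_all is a hypothetical helper, not the actual wrapper script.
# Assumes each NAME.gpx has its heart rate data (if any) in NAME.hrm in the same directory.
convert_all() {
  dir=${1:-.}
  script=${2:-gpx2tcx.awk}
  for gpx in "$dir"/*.gpx; do
    [ -e "$gpx" ] || continue          # no GPX files at all
    tcx="${gpx%.gpx}.tcx"
    [ -e "$tcx" ] && continue          # already converted, skip it
    hrm="${gpx%.gpx}.hrm"
    if [ -e "$hrm" ]; then
      gawk -f "$script" -v HRMFILE="$hrm" "$gpx" > "$tcx"
    else
      gawk -f "$script" "$gpx" > "$tcx"   # GPX-only conversion
    fi
  done
}
```

Calling convert_all with a directory name would then convert every GPX file in that directory that has no TCX file yet, and leave existing TCX files untouched.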

In the medium to long term, I intend to replace this script entirely, with a simple Qt (ie cross-platform) native C++ application. The main reason for this, is I'd really like to add one special feature that would be quite difficult to do well in AWK. That feature is probably best explained by considering a common usage case.

Consider the situation, as I often do right now, where you've recorded a ride using both the RCX5 and some other application (such as Strava's Android or iPhone app). I do this currently since the RCX5 has heartrate and cadence information, and a much more accurate GPS (than my Android phone), but the Android app records altitude information, which the RCX5 does not! So, by using both, I get all the information possible.

However, if I do the same ride (or run) in this way many times, then after a while, the Android (or iPhone) app becomes somewhat redundant, since it's recording altitude data for the same track again and again. And presumably the altitude data would not change each time (once you've averaged out the GPS errors of course).

So, what I'd like this new app to do is provide a simple way to build and maintain an altitude database that you can populate automatically by feeding other GPX / TCX data files (ones with altitude data) into it. Then, when converting from GPX+HRM to TCX, the app could include any known altitude information too. With such an application, I could use both the RCX5 and Strava Android (or iPhone) app the first few times I run / ride a given track, then use just the RCX5 and still get elevation data! :)

Update

I've also written a couple of wrapper scripts (one for Windows, and one for Bash) which make using the above AWK script a little bit easier. They also implement some minor extra features, such as fixing up Polar's misleading UTC timestamps. You can read about those scripts in the following blog posts:

Update 2

I've just updated the above AWK script to include the following changes:

  • corrected the handling of altitude data from HRM files (previously the script was calculating the altitude, but not actually printing it... oops!)
  • corrected the detection of imperial units for v1.06+ HRM files (was looking at the wrong SMode byte)
  • added a new option - via the command line, you can now optionally specify a starting altitude like so:
    gawk ... -v ALTITUDE=1.0 ...

    First off, if you specify a HRM file to include, and that HRM file includes altitude data, then the above ALTITUDE argument will be ignored anyway. But, if your HRM file does not include altitude data (eg RCX5 HRM files), or you are not using HRM files at all, then the ALTITUDE argument (if set) will be used to specify the altitude (in meters) of the first point in the resulting TCX data.

    This feature is provided to work around the way Strava fails to match segments for runs with no altitude data. If we add just one point of (fake) altitude data, then Strava's site will go ahead and replace all of the altitude data with its own database values; without it, a number of things won't work in Strava.

    You can read more about this at http://support.strava.com/discussions/problems/4372-gpx-tcx-uploads-wont-match-run-segments.
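
The ALTITUDE rule described above can be tried out on its own with a small sketch. The first_point_altitude helper and the point names are hypothetical; the logic simply mimics the script's check, so that only the first trackpoint receives the starting altitude, and only when the HRM file has no altitude data of its own.

```sh
# Sketch only: first_point_altitude is a hypothetical helper that mimics the ALTITUDE rule.
# $1 = HAVE_ALTITUDE (1 if the HRM file has altitude data), $2 = ALTITUDE as given via -v.
first_point_altitude() {
  awk -v HAVE_ALTITUDE="$1" -v ALTITUDE="$2" '
    {
      if (HAVE_ALTITUDE > 0) {
        print $1 ": altitude comes from the HRM file"
      } else if (ALTITUDE > 0) {
        printf "%s: <AltitudeMeters>%f</AltitudeMeters>\n", $1, ALTITUDE
        ALTITUDE = 0   # only the first point gets the starting altitude
      } else {
        print $1 ": no altitude element"
      }
    }'
}

printf 'point1\npoint2\npoint3\n' | first_point_altitude 0 1.0
```

With those sample arguments, only point1 gets an AltitudeMeters element; the other two points get none.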

Enjoy! ;)

Attachment     Size
gpx2tcx.awk    4.44 KB

gawk for Windows

Quick tip: if you are looking for a version of gawk for Windows, you might try either of the following:

For what it's worth, I use the latter (ie UnxUtils).

pc.

Nicely done

This is a great post.
I don't have a need (yet) for the awk script, but the last half of the post includes some pretty exciting allusions.

Funnily enough, I've started work on a post with some related (but completely different) goals with regards to recording data for running. Hopefully I'll finish it off soon. ;-)

I'm excited by the idea of an elevation database. Would it be possible to make this an online database, open for contributions of data?

I also noticed Amazon are selling a low cost HRM watch which includes GPS, compass and elevation data as well. I'd be interested in finding out more about how it records data, but nothing I can find is clear about the formats.
(It's a Pyle PGSPW1). The thing that most interested me about it was the altitude data.

Online DB

Thanks for the feedback.

Yeah, the extension in my head would be a free open online altitude database. There'd be some issues making sure people don't simply extract data from other paid-for sources and import it, since that would be a violation of someone else's IP no doubt. But if people just contribute their own data from their own GPS devices... that would be very useful, and very cool :)

When I write the Qt app I have in mind, I'd probably add an opt-in option to share the altitude data if the user wants to.

That Pyle device looks interesting :)

pc.

Forgot to log in

So that post was from me. I forgot to log in. heh.

I really like the idea of the open DB. I'd probably use that Qt app as well. :-)

The biggest problem I'm having with the Pyle device is that it's really unclear what format the data might be. I've downloaded the officially available manual. Unfortunately it's a PDF with scanned images of the pages - no searching. It's also difficult to read (compression artifacts).

The other thing is the price. For ~US$75 I could buy a reputable product like the Polar WearLink+ Bluetooth HRM, and connect it to my Android device. Then I can record GPS + HR in one output file.
The Pyle is ~US$100, and records more data - but I don't know the format.

Hard decision.

Polar WearLink with Bluetooth

That Polar chest strap looks quite interesting. I see on the product page (http://www.polar.fi/en/products/accessories/Polar_WearLink_transmitter_with_Bluetooth) that it's compatible with a number of other apps.

One thing you could do, is use one of the other apps, such as RunKeeper, then export the data from there. A friend at work does this with his RunKeeper iPhone app... not sure what format RunKeeper exports as (ie GPX vs TCX?), but if it's TCX, then it might do all you want :)

I wonder how good those apps are at heartrate monitoring compared to say, the RCX5? Obviously the apps could record BPM easily, but Polar do a lot more analysis based on subtleties of the heart rhythm... do they provide that info to third party apps? I doubt it, but maybe?

Other Apps to record with

Yeah, I like the Polar Bluetooth. I've had a look at MyTracks which works with the chest strap, (I used to use it before using the Strava App).
https://market.android.com/details?id=com.google.android.maps.mytracks

MyTracks will output to tcx format, (among others).

I have no idea how much information you can get from the data on the Polar WearLink+ Bluetooth chest strap.
One way to find out would be to just buy it and test it out. I might do that. I can always upgrade to something else later if necessary.

Great script!

Thanks Paul for putting together this script, it works great. Now I can upload my HR and cadence to Strava. The only thing missing is the elevation, did you run into issues as well? I have a CS600X (I think it has a barometric altimeter). I can see the elevation just fine in Polar trainer. Any ideas?

Thanks again!
Alice (from Strava forum)

Thanks Alice

Thanks for the feedback :)

I don't actually have any HRM files with elevation data... the altitude support in the script is theoretical only, based on the HRM specs (which are hideously incomplete). But if you send me one of your HRM files (optionally, with matching GPX file), then I'll check it out :)

You can post the HRM data in a reply to this post (don't worry, all posts are moderated, and I won't let your HRM data get published), or email them to "blog at colby dot id dot au".

Cheers :)

PS - Just out of curiosity, what OS / platform are you using? And are you using the relevant *.cmd or *.sh script from my other posts?

Excellent, well done

Thanks Paul. Like you, I find my Polar (an RS800CX) provides great information. Your little programme works a treat. As identified above, it would be great if altitude could be included. The other issue for me is the way that Polar records the local time in the GPX file, yet it appears Strava, as well as others, work off GMT (I understand this is the standard). So it would be great to have a variable of some sort to adjust the time. In my case I am +10, so the script would subtract 10hrs from the time stamps in the file: 17:00hrs local becomes 07:00 in the file. Not sure how this works when I start a ride at 04:30 and the time and date would need to change? Again great work. Paul (yes another Paul)

Thanks Paul

Thanks for the feedback Paul! :)

As I replied to Alice above, I don't have any HRM files with altitude at the moment, so if you'd like to post one here, I'm happy to take a look!

As for the timestamp issues, yeah, my RCX5 actually lies... it records timestamps in my local time, but uses a 'Z' suffix, which indicates UTC. So Strava, et al, are doing the right thing by assuming that they're UTC timestamps (which they're not, for most people).

Anyway, I actually have wrapper scripts around the above AWK script that fixes up the timestamps for me. You can read about them in the following posts:

I really should have linked those at the end of the above post... I'll go do that now :)

pc.

Thanks Paul. Yes found the

Thanks Paul. Yes, found the new version and it took about 10sec to convert the year so far with the timestamp corrected. Brilliant! This is probably a silly question...but..how do I get you the files to look at? Happy to supply as many as you need. Cheers Paul

Post it here :)

Perhaps the easiest way is to just post the data... you don't need to post the whole thing, just maybe the first couple of hundred lines of the HRM file.

If you open the HRM file and take a look, you'll see a line that just contains the text: "[HRData]"

Copy everything from the beginning of the HRM file up to maybe around 30 lines after that line. Post that text as a reply here (I won't let that reply get through the moderation queue).

Cheers,

pc.

HI Paul I tried to email but

HI Paul

I tried to email but will post here as well. Here is a file with Power data and altitude

[HRData]

... <snip/> ...

Here is the same data laid out in Polar ProTrainer5

... <snip/> ...

Here is another example without power

... <snip/> ...

As viewed in Polar ProTrainer5

... <snip/> ...

Thanks heaps for looking at this.
Regards

Paul

Thanks Paul

Thanks, I'll check it out now :)

BTW, can you confirm the "SMode" value from the HRM files? Or better yet, just post the entire "[Params]" section (shouldn't be more than around 25 lines).

Now to go check out the AWK script...

PS Good job on posting the Polar ProTrainer5 version too! That will definitely help with my sanity checking :)

pc.

doh!

Doh! It was a really obviously silly mistake! :)

Very easily fixed... I'm just adding one other quick feature, and will update the above script very soon :)

Thanks again for the data!! :)

pc.

gpx2tcx.awk updated

Ok, I've updated the AWK script above... it should now work for altitude :)

I've noted the changes at the end of the post above.

I hope it works for you! :)

pc.

Tested with a couple of files

Tested with a couple of files and works fantastic, you are brilliant :-) ! Thanks for this piece of work, much appreciated. Cheers Paul

I'm not sure if I have missed

I'm not sure if I have missed something but this example of negative altitude gives zero values in the tcx file. Positive altitude hrm to tcx look to work great.

So original looks like this

... <snip/> ...

Output to tcx looks like this

... <snip/> ...

Did I miss something?

Thanks

Paul

Will have a look...

I hadn't thought to test negative altitudes - there aren't many places you can run / ride under sea level, but it is possible ;)

(actually, your device might be including barometer-based relative altitude, and not absolute altitude, which would make sense)

Let me have a look...

pc.

SMode

Hey Paul, can you tell me the SMode value for that HRM file?

Thanks :)

That should do it..

Hey Paul, I've done another update to the script above.

The problem was not the negatives, but the fact that one of the columns before the altitude column was missing (either speed or cadence).

This is actually an area where the Polar HRM file format doc is ambiguous, but now that I've seen your data, I can infer how the HRM data files handle that scenario.
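
As a rough illustration of that inference: the heart rate is always the first column, and the altitude column simply shifts left whenever speed or cadence is absent. The altitude_column helper and the sample rows below are made up for the example, and the rule shown is the one the script now uses, not something confirmed by the HRM format doc.

```sh
# Sketch only: altitude_column is a hypothetical helper; the sample rows are invented.
# $1 = HAVE_SPEED, $2 = HAVE_CADENCE (each 0 or 1, as decoded from SMode).
altitude_column() {
  awk -F'\t' -v HAVE_SPEED="$1" -v HAVE_CADENCE="$2" '
    { print $(2 + HAVE_SPEED + HAVE_CADENCE) }'
}

# HR, speed, cadence, altitude: the altitude is the 4th column.
printf '142\t253\t87\t64\n' | altitude_column 1 1
# HR, speed, altitude (no cadence): the altitude moves to the 3rd column.
printf '142\t253\t64\n' | altitude_column 1 0
```

Both sample rows give the same altitude value (64), even though it sits in a different column each time.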

Anyway, give the latest version a shot, and let me know how it goes :)

pc.

Of course. Out and about.

Of course. Out and about. Will try again when I get home. Thanks again

Hope I'm not wasting your

Hope I'm not wasting your time as it is working for your RCX5. I am not getting any altitude values at all now in any file. Run as a command line or batch.

... <snip/> ...

No problem! ;)

Hi, no, you're not wasting my time ;)

However, it does look like that last example TCX output you posted was from an old version of the AWK script - one released before today's updates (I'm guessing that, because the indentation of the Altitude element is wrong, and I fixed that this morning).

To check, if you look at the very top of the AWK script, you'll see a line that starts with "# $Id: gpx2tcx.awk 265" - there, the "265" is the code revision number. If the number you see there is less than 265, or (more likely) that line is not even there, then you've got an old version of the script.

Also, the resulting TCX files will end with a Version element, which should indicate version 1.0.0.265 (distributed among its child elements).

If both of those do indeed indicate "265", then you'll have to let me know the SMode parameter from your HRM file.

Cheers :)

pc.

Thanks PC. Also an

Thanks PC. Also an interesting diversion. I just tested loading the file to www.trainingpeaks.com which I use to deliver coaching programmes and monitor progress. I use Strava as a motivational tool as the small group I look after do the majority of their training solo. Strava provides a virtual partner/group. The file loads perfectly with the exception of distance/speed. I am assuming Strava gets this from the GPS data. TrainingPeaks, like others, have problems with these Polar gpx files. Anyway have to go out again. Will review my version when I get back. Cheers Paul

You are right and it works

You are right and it works when you use the correct version, my bad with a cut and paste. Again brilliant work! :-) I've got gpx files dating back to mid 2009 so will batch all these and do some random checks. Thanks a million. Paul

ur v welcome!

You're very welcome!! :)

Thanks for persisting with posting your data, and testing!! ;)

pc.

RCX5

Hi PC. I was wondering why you were doing this when you owned a RCX5. My understanding was it only downloaded to polarpersonaltrainer and that only exported as an xml file. Having a look at the product features no mention of hrm or gpx. However checking the Polar WebSync release notes it shows that a hrm and gpx can be produced. Actually owning the data is where the value lies at the price point of the RCX5 and the RS800CX. Learnt something new about the use of the RCX5. Thanks Paul

RCX5 exports

Yeah, the Polar WebSync app has two modes - one that syncs with polarpersonaltrainer, and the other for managing the device itself (setting up activity types, setting the clock etc). It's that second mode that allows exporting the RCX5's data as GPX and HRM files.

The XML export feature from polarpersonaltrainer is pointlessly sparse - containing nothing but the summary fields :|

pc.

Another suggestion

Just me again :-) I reckon it would be great if the batch file could attach the new tcx files to an email to the strava upload email address for automatic posting. Cheers Paul

I like it...

That's a good idea :)

Though personally, I'd prefer to use some web-based API to upload the files (less latency that way), but email might be practical to implement (definitely would be on *nix machines, but on Windows will probably depend on which mail app you use).

I'll look into it sometime... ;)

pc.
