Race Stats and Plots

As someone who is... let's generously say "enthralled"... by numbers, lists, plots, and statistics, I've spent an embarrassing number of hours poring over old surfski race results.  I'm pretty sure this will make me a better paddler, although apparently there's some kind of fermentation period required before this becomes apparent.  Like a squirrel stockpiling nuts for winter, it seemed natural for me to collect all of the race data I could find into one place.  As a programmer by trade, I then instinctively wrote some software to help me manage this information.  And finally, as a wannabe web developer, how could I not try to put together some web tools to help visualize all of this information?

All of this has led to the awkwardly named Surfski New England Race Tracker (SNERT) project.  If you'd rather not read any more details before diving in, just select one of the two links below and start clicking on whatever catches your fancy...
RaceTrack A RaceTrack B

I've attempted to collect, collate, and clean all of the available data from surfski races that have occurred in New England over the past half-dozen years or so.  The source data has come from the two great surfski sites in the area - SurfskiRacing.com and New England Surfski - as well as from race sites, online photos, personal attendance, conversations with participants, and, in one memorable instance, a federal subpoena.  I've tried to include all races that were either primarily for surfskis or had a class that was primarily composed of skis (usually under the heading "HPK" or "Unlimited Kayak").

Where possible, for every paddler in a race I've noted time, place within class, overall place, boat brand and model, color scheme, and number.  In many cases, especially for races from a few years back, much of this information was unavailable.  I'm hoping that individual paddlers will contact me (at lesher@pobox.com) if they see missing or incorrect information about themselves.

As a novice web programmer, I've used a handful of web tools to present race data in ways other than simple tables of race results.  RaceTrack A utilizes a suite of plotting tools provided by Google to show plots and tables, allowing you to overlay race results across various years and to compare individual paddler performances.  It's a no-nonsense type of analysis tool.  RaceTrack B, on the other hand, presents the race data in a more visually engaging manner that provides a better overview of the New England surfski racing scene.  Because of the graphics, RaceTrack B can get really bogged down in some browsers.  I recommend Google Chrome for the best experience.

RaceTrack A: The default presentation of RaceTrack A is a plot which superimposes multiple race years, showing the times and finish position of each paddler in those races.  You can choose which races to display and how to display the data using scrolling list controls at the top of the page.  You can use the leftmost of these controls to change the presentation from a plot to a table or bar chart.  Here's a screenshot of RaceTrack A showing just the top 20 finishers in the last 4 Blackburn Challenges, with Joe Glickman's finishes highlighted.


You'll find instructions for each type of data presentation at the bottom of the RaceTrack A web page.  Here are some additional tips:
  • If you get into a situation where you can't figure out why you're not seeing anything (or seeing something you don't expect), reload the page.
  • Hovering your mouse over a data point will show additional info about the paddler represented by that point (full name, time, finish position, and boat).
  • If you hold down the control key while clicking in certain list controls at the top of the page, you can select multiple entries.  For example, that's how I selected years 2009 through 2012.

RaceTrack B: The default (and only) presentation of RaceTrack B is a display in which multiple races are depicted side-by-side, with little circular icons for each paddler in the races.  You choose which races to display using scrolling list controls at the top of the page, much as you do for RaceTrack A.  By default, the icon for each paddler is a randomly chosen combination of two colors (like the red-and-blue pairing used for Borys Markin).  You can change the color coding so that each paddler icon instead represents the brand of boat, paddler home state, boat type, or a range of other info.  Here's a screenshot showing the 2012 races from the SurfskiRacing.com point series, with each paddler icon color coded by the boat make and model he or she was paddling.


Once you have the races you want displayed, you can hover over the icons to see additional information about how the paddlers did in the race.  Click on a paddler to highlight all occurrences of the paddler on the screen.  If you hold the control key down while clicking, you can highlight multiple paddlers at once.  Just click anywhere on the black background to remove the highlights.  If you click on the yellow race name rather than on a paddler, that race will expand to show a plot of the race depicting the times and finish position of each paddler.  Click on the background to collapse the race plot.

There are some brief instructions shown in the upper right corner of the RaceTrack B web page.  Here are some additional tips:
  • As with RaceTrack A, if you're confused by what you're seeing or not seeing, try reloading the page.
  • In all of the lists except for "Color Coding", you can hold down the control key while clicking to select multiple entries.
  • Depending on your computer, your screen, and your browser, you may find that if you try to view too many races at once (for example, by selecting "ALL RACES" and "ALL YEARS"), things get really ugly, crowded, or slow.  If so, don't do that.
  • Use the Google Chrome browser for the best results.
  • If you resize your browser window, you'll need to reload the RaceTrack B web page to let it know that it needs to revise its layout.
  • In general, a light gray paddler icon for a particular color coding scheme means "no data available" (because I don't know the boat brand, home state, boat color, etc.).
  • For the "Boat Brand & Model" color coding methods, not all colors will show up in the color key (because there are too many possibilities).  For unknown color combinations, hover over the paddler to see boat information.

That about covers it.  I expect to keep RaceTrack A and B current with the latest info, updating no later than a day or two after race results are available.  If you have any corrections, suggestions, or any other feedback, please drop me an email.

For sticklers, here are some additional notes about the race data:
  • In all cases in which surfskis and non-surfski HPKs (racing kayaks) were in the same race class, I included the non-surfski HPKs in the database.  You'll therefore see the occasional racing kayak and K-1 in the results.  
  • In some rare instances for older races, I've combined two classes into one to mirror the modern "HPK" class.  For example, in the 2005 and 2006 Essex River Race and Blackburn Challenge, I combined the separate surfski and racing kayak categories.
  • Although information for the Essex River Race and Blackburn Challenge goes back further than 2005, I chose (at least for the time being) to exclude those earlier years because (for the most part) there were more racing kayaks than surfskis.
  • Where possible, I've collected information on paddler home town/state and birth year from publicly available sources.  This data is used only indirectly in SNERT (to group paddlers by state and five-year age clumps), but if you'd prefer me not to use your information, please let me know and I'll remove it.
  • Although boat brand and model information is provided in the results for the Essex River Race and Blackburn Challenge, this data comes from registration rather than the actual race.  In a few instances, I've corrected this information using photographic evidence.
  • The decision of whether or not to include a mixed-class flatwater race is sometimes not obvious.  I chose to include the Great Stone Dam Classic because a majority of the participants were paddling skis, while I elected to exclude the recent Holyoke Cup because skis were in the minority.
  • For now, I've omitted the NYC Mayor's Cup (2007, 2009, and 2010) because (a) the race wasn't in New England and (b) many of the participants weren't locals.  The second rationale is pretty weak given that to a lesser extent, this is also true of the Blackburn.
  • A few of the races have kept the same name while changing courses over the years.  The Nahant Bay race, for example, has had (I believe) 3 different courses over the past five years.  The Lighthouse-to-Lighthouse race changed courses in 2012.  The Essex River Race moved its start/finish downriver a hundred meters or so a few years back.  When an established race course has changed drastically (like the 2006 Blackburn), I've treated that year as its own separate race.
  • I've included all of the Salem Sound League races.  Despite what I just said in the last point, whenever one of these races has been changed (due to weather, low tide, or darkness), I've given it a new name (like "Salem League 3 All Water").
