thorobase

The Education of a Horse Player

The thorobase API

The thorobase API goes open-source! Right now, the (JavaScript) API consists of 2 projects:

thoroData, to encapsulate horse racing data, and
thoroMotion, to visualize this data.

The projects are open-source and are hosted on github.com under my robinhowlett account. thoroData and thoroMotion are submodules of the thorobase project.

Read more about the details in the thread on PaceAdvantage.com.

Written by Robin Howlett (Admin)

April 19th, 2010 at 12:14 am

Posted in Introduction

thoroMotion: more features, an introductory video plus a bonus!

In this post, I detail using line charts to get a full race-shape snapshot in one view, zooming in and out to focus on a subset of horses, and using logarithmic scales for specialized handicapping analysis. I’ve provided a video demonstration of thoroMotion that explains its basic features, and for a limited time I’ve provided a bonus of thoroMotions of all final-day races at the 2009 Breeders’ Cup.

Thank you!

I’ve been absolutely delighted with the initial reception to thoroMotion. From the comments to the post, to the feedback on Twitter, emails, and the responses at the excellent PaceAdvantage.com forum, I’m very pleased that there are so many horse racing fans open to new ideas and technologies.

For those anxious few waiting to get their hands on the code, it will definitely be released within the next day or two.

More thoroMotion features

I wanted to detail some more features that are included in thoroMotion that weren’t outlined in the introductory post. Some readers discovered them from testing the example on the original post and had some questions about them, so I want to describe them in detail here. I have also created a screencast video (embedded at the bottom of this post) where I run through a demonstration and outline its major features.

At the top-right, you can choose which chart type to use

Chart types

The first feature to highlight is thoroMotion’s ability to display multiple types of visualizations.

You can even change what the axes are displaying by clicking on either the x- or the y-axis label. “Lengths Behind” and “Wide” are the respective defaults, but for other visualizations you can click a label and pick the values that suit your needs from the drop-down box that appears.

Apart from the default dynamic bubble-chart type display, there are also bar charts (for simple statistical comparison), and line charts, which are especially useful for quickly identifying race shapes.

Change the y-axis to "Lengths Behind" and use the Line Chart view to see the entire race shape in one view

The example above demonstrates the use of the line chart view to get a detailed race shape of the entire 10 furlongs of the 2009 Breeders’ Cup Classic won by the great Zenyatta. You can clearly see she was 15 lengths behind the leader after 2 furlongs, and closed rapidly from then on, especially in the final quarter-mile.

The second feature I want to mention is that the race area of the chart supports both zooming and logarithmic scales.

Click-and-drag to zoom-in

Occasionally, there may be quite a lot of horses in the race and you wish to get a closer look at a subset of them.

As pictured on the left, by clicking in the race area and dragging the rectangle around the horses you wish to focus on, you can zoom in and the thoroMotion will dynamically adjust to that view.

Once you have zoomed-in, a “Zoom Area” (pictured right) will appear in the bottom-right of the screen.

The small white rectangular area highlights what part of the race area is currently zoomed-in, and you can drag this around to keep zoomed-in but focus on other sections.

To zoom out and return to the default view, simply click the “Zoom out” link.

Logarithmic scale

Log scale

A more complex and specialized method exists for adjusting the viewing experience. By default, thoroMotion’s axes are on a linear scale (e.g. 1, 2, 3, 4 etc.); every step along both the x- and y-axes is worth the same.

So, if a horse is 5 lengths behind a leader, a horse 10 lengths behind will be exactly double the distance away from the leader (the right-side of the race area).

A logarithmic scale (e.g. 1, 10, 100, 1000) will, in effect, focus attention on the lower values of the axis. There are a limited number of use cases for this feature, but one example would be if a track is playing much quicker by the rail. By using the log scale for the “Wide” axis, you can highlight the performances that took advantage of this track bias.
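To make the effect concrete, here is a minimal sketch of how a linear and a logarithmic axis place the same lane numbers. The helper `axisFraction` is hypothetical and is not part of thoroMotion; it simply returns how far along the axis (from 0 to 1) a value would sit.

```javascript
// Hypothetical helper (not thoroMotion code): maps a value to a fraction
// between 0 and 1 along an axis that runs from min to max.
function axisFraction(value, min, max, mode) {
  if (mode === "log") {
    // A log scale needs positive values, so the axis must start above zero.
    return (Math.log(value) - Math.log(min)) / (Math.log(max) - Math.log(min));
  }
  return (value - min) / (max - min);
}

// A "Wide" axis running from lane 1 (next to the rail) to lane 8.
for (const lane of [1, 2, 4, 8]) {
  const linear = axisFraction(lane, 1, 8, "linear").toFixed(2);
  const log = axisFraction(lane, 1, 8, "log").toFixed(2);
  console.log(`lane ${lane}: linear ${linear}, log ${log}`);
}
```

On the linear axis, lanes 1 to 2 take up one seventh of the height; on the log axis they take up one third, which is why horses racing near the rail are easier to tell apart.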

thoroMotion demonstration video

For those who have found thoroMotion a little complex or the original blog post a little too long to read, I recorded a screencast of myself demonstrating the basic features of thoroMotion.

I’m inexperienced with recording these types of screencasts, so the picture is not as clear as I want it to be; it works best if you maximize the YouTube video screen with the “4-arrows button” near the bottom right.

The video is also captioned for the hearing-impaired and for those struggling with my Irish accent.

Bonus! thoroMotions of all final-day 2009 Breeders’ Cup races for a limited time

To reward all you readers who have helped thorobase.com get off to a flying start, I’ve put together this little package of thoroMotions of all races run on the second Breeders’ Cup day at Santa Anita in November 2009. This will be available for a couple of days.

This is just the raw chart data – I haven’t included any user-generated data for width from the rail, so the horses run in lanes according to their post positions. It’s still pretty fun to track the various race-winning strategies in some of the biggest races in the world.

Due to a suggestion by a thorobase.com reader, I’ve reversed the y-axis so that the rail is now represented at the top of the race area (meaning PP #1 is at the top now, not the bottom), so that it’s more like the horses are running anti-clockwise.

Hope you enjoy it!

Written by Robin Howlett (Admin)

February 25th, 2010 at 8:26 pm

Posted in Introduction

Introducing thoroMotion

thoroMotion is a multi-dimensional horse racing visualization tool. It allows you to visualize how a race was run using just the official chart data. Here I describe why thoroMotion was built, how it works and how you will be able to use it.

Why is thoroMotion needed?

Thoroughbred horse racing has always been a very traditional sport; things tend to stay the same. I created thorobase to challenge this notion, to change the way handicappers, horse players and racing fans, of all ages and experience levels, interact with this great sport so that they may hopefully enjoy it even more.

I’d like to introduce to you a tool I developed that brings a completely new approach to understanding and analyzing a horse race – I call it thoroMotion.

This is the traditional racing result chart. It hasn’t changed for a very long time and every horse player who has ever wanted to look at horses’ previous racing performances in detail has encountered it at some stage.

Racing result chart - 2000 Kentucky Jockey Club Stakes

Chart of 2000 Kentucky Jockey Club Stakes (copyright Equibase, BRIS)

The chart outlines the race details and conditions, the finishing positions of each horse plus the number of lengths either ahead or behind their nearest competitor at each point of call, along with some trip comments.

There are only 6 horses in this race, yet the horse player is still presented with a considerable amount of data to decipher. To emphasize this point, see how easily you can answer the following three questions:

  • How did the race develop down the back stretch, between the quarter pole (2 furlongs) and the three-quarter pole (6 furlongs)?
  • Who made the biggest move in the race?
  • Dollar Bill “saved ground” for most of the race and rallied entering the stretch; who was he trying to catch?

I think you’ll see that, without being able to view a video of the race, it requires significant mental gymnastics and computation to truly get a good idea of how this race was run. thoroMotion solves this problem.

What exactly is thoroMotion?

The above chart is not an image, it is a live example of a thoroMotion of the 2000 Kentucky Jockey Club Stakes won by Dollar Bill. Press the “Play” button in the bottom-left to see it come to life and, just maybe, you’ll be able to understand it. If not, no problem, just keep reading!

thoroMotion is software based on the Motion Chart visualization tool from Google. It uses the information from an official chart data supplier and transforms it into a multi-dimensional dynamic animation. It can combine chart data with custom user-generated data, such as trip notes, to deliver an even richer handicapping analysis experience.
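As a rough sketch of what that transformation has to produce: the Motion Chart expects a table whose first column names the entity, whose second column is the time value, and whose remaining columns hold the measures to plot. The record shape, the column names, the `toRows` helper and all the numbers below are illustrative assumptions, not thoroMotion's actual code or real race data.

```javascript
// Assumed column layout: entity name, then time, then the measures.
const columns = ["Horse", "Furlongs", "Lengths Behind", "Wide", "Odds", "PP"];

// Hypothetical input shape: one record per horse per point of call.
function toRows(calls) {
  return calls.map((c) => [c.horse, c.furlongs, c.lengthsBehind, c.wide, c.odds, c.pp]);
}

// Made-up values for two placeholder horses at the start and after 2 furlongs.
const rows = toRows([
  { horse: "Horse A", furlongs: 0, lengthsBehind: 0, wide: 1, odds: 2.3, pp: 1 },
  { horse: "Horse B", furlongs: 0, lengthsBehind: 0, wide: 2, odds: 10.3, pp: 2 },
  { horse: "Horse A", furlongs: 2, lengthsBehind: 1.5, wide: 1, odds: 2.3, pp: 1 },
  { horse: "Horse B", furlongs: 2, lengthsBehind: 0, wide: 2, odds: 10.3, pp: 2 },
]);

console.log(columns.join(" | "));
for (const row of rows) {
  console.log(row.join(" | "));
}
```

Each horse appears once per point of call, and the animation moves each bubble from one row to the next as the time value advances.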

Let’s understand the structure of the thoroMotion chart.

The area of the chart representing a top-down view of the track.

The race area

This is the “race area” representing a top-down view of the track with the runners represented by colored bubbles.

The x-axis (“Lengths Behind”) is the number of lengths each horse is behind the leader.

So, the right edge of the area corresponds to the leader of the race, as the “Lengths Behind” figure for the leader is zero.

The y-axis (“Wide”) is the width from the rail – think of it like the running lane on an athletics track. This also means the x-axis doubles as the running rail because it’s lane 0 (zero) – isn’t that cool!

Right now all the runners are in a line on the lead (“Lengths Behind” is zero) and are spread out evenly across the lanes – that’s because they are in the starting gate! The “Wide” figure here corresponds to the post position.
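The result chart gives each horse's margin over the horse immediately behind it, while the x-axis needs every horse's distance from the leader. One way to get from one to the other is a running total, sketched below. The function name and the sample margins are made up for illustration and are not taken from the thoroMotion source.

```javascript
// margins[i] is the gap, in lengths, between the horse in position i + 1
// and the horse immediately behind it at a single point of call.
function lengthsBehindLeader(margins) {
  const behind = [0]; // the leader is always zero lengths behind
  for (const gap of margins) {
    behind.push(behind[behind.length - 1] + gap);
  }
  return behind;
}

// Four horses: the leader is 1.5 lengths clear of second, second is half a
// length ahead of third, and third is 2 lengths ahead of fourth.
console.log(lengthsBehindLeader([1.5, 0.5, 2])); // 0, 1.5, 2 and 4 lengths back
```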

Each bubble is a horse in the race - hover over it to see who is who!

Each bubble is a horse in the race

Each of the bubbles represents a horse in the race. They are initially ordered by post position.

You can see which bubble corresponds to which horse by hovering the mouse cursor over the bubble – the horse’s name will be displayed beside the bubble.

If you want to find a particular horse, hover the mouse over the “Select” list of horses’ names on the right of the chart, and the bubble will be highlighted on the chart.

When a horse/bubble has been highlighted, it will also highlight the relevant values on the x- and y-axes; in this example Dollar Bill is 0 (zero) lengths behind the leader.

Clicking on a bubble, or on one or more horse names in the “Select” list, will keep those horses highlighted and make the others transparent – this is great if you want to keep track of certain runners.

The playback bar represents "Time" and its value corresponds to the number of furlongs traveled in the race.

The right-most digit is the number of furlongs traveled; 10000 = 0 furlongs

At the bottom of the chart, we have the “playback” controls.

The horizontal bar represents “time/distance”. There is an outstanding bug that does not allow starting from 0 (zero), so we have to start at 10000.

All you need to focus on is the right-most digit, that represents the number of furlongs traveled – right now it’s 0 (zero), meaning the start of the race.
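In other words, the playback value is the furlong count with 10000 added on. A tiny sketch of that conversion, assuming the offset is a plain addition; the constant and helper names are made up for illustration:

```javascript
// Workaround offset described above: playback cannot start at zero.
const PLAYBACK_OFFSET = 10000;

// Hypothetical helpers, not thoroMotion functions.
function toPlaybackValue(furlongs) {
  return PLAYBACK_OFFSET + furlongs;
}

function toFurlongs(playbackValue) {
  return playbackValue - PLAYBACK_OFFSET;
}

console.log(toPlaybackValue(0)); // 10000, the start of the race
console.log(toFurlongs(10005)); // 5 furlongs traveled
```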

The “Play/Pause” button on the left starts, pauses and resumes the thoroMotion visualization. To the right of that is the “Speed Control” which determines the playback speed. You can drag the small white triangle up the horizontal lines to go faster (fastest playback speed is 10 seconds) and down to go slower (slowest is 40 seconds).

Change the size and color of the bubbles to incorporate even more data.

There are even more dimensions we can use to represent extra relevant race data.

The user can change the size and color of the bubbles to incorporate information like:

* odds (for tracking favorites versus longshots), and

* post positions (for checking potential track biases).

By now you should have a good idea of what a thoroMotion visualization represents.

At any point of the race you can take in a massive amount of relevant information with just one look. Plus it’s animated so you can see how the race shape has evolved from start to finish.

I think this is an incredibly exciting tool to have for any horse player!

Let’s have one final example image. Before I explain what information is being displayed, try to understand it yourself by looking at the snapshot below and asking yourself questions like:

  1. How many furlongs have been traveled? Who is just taking the lead?
  2. How far behind the favorite is the 2nd favorite? (hint: the size of the bubbles equals odds)
  3. If the color of the bubbles represents Post Position, and blue equals drawn closer to the rail, where are the horses that started from gates 1 and 2 racing on the track?
  4. How far off the rail is Dollar Bill currently positioned? If it’s not a whole number, what do you think that means?
  5. How far back from the leader is the last horse? Can you guess his odds? (hint: look towards “Odds” and consider the bubble size)

See if you can figure out how the race is shaping up!

With one look you can instantly take in a huge amount of information!

Let’s answer these questions:

  1. The field have traveled 5 of the 8.5 furlongs so far; you can tell because the “Playback Bar” is at 10005 – remember you just have to use the right-most digit.

    Holiday Thunder is just taking the lead because he’s the one closest to the 0 (zero) “Lengths Behind” line to the right of the “race area”!

  2. We can see that the “Size” parameter on the right has been set to “Odds”, so the smallest bubble will equal the favorite; that’s Holiday Thunder again. The next-smallest bubble belongs to Dollar Bill, so that must mean he’s the second (2nd) favorite!

    Both Dollar Bill and Holiday Thunder have been ‘checked’ in the “Select” list box on the right – that means we are just focusing on them, so the other bubbles are a little more transparent. Also, Dollar Bill’s bubble has a yellow circle around it – that means we are currently hovering over it with the mouse. This highlights the appropriate values on the x- and y-axes.

    Look down to the x-axis, which represents the number of lengths behind the leader. So now we can answer the second part of the question! The x-axis value highlighted is “-2.3”, which means Dollar Bill is 2.3 lengths behind Holiday Thunder right now!

  3. Look to the top-right and the “Color” parameter; it’s set to “PP” for Post Position (the gate/starting-stall that the horse began the race from). If we know that blue means a lower PP number (i.e. began closer to the track’s running rail) and red means drawn widest of all, we can instantly identify where the horses that started from post positions 1 and 2 are now!

    The deepest blue bubble must be PP #1 and he’s got a y-axis (“Wide”) value of 1, so he must still be in lane #1, right up against the running rail. The lighter-colored blue bubble is right beside him in lane #2.

    You can infer from that that the rest of the field is going to have to come around them if they are going to win. Just one quick glance at thoroMotion, even when it’s paused, and you already have an amazing read on the race!

  4. Let’s go back to Dollar Bill’s bubble. Before, we looked down to the x-axis to find out how many lengths he was behind the leader. This time look all the way to the left, to the y-axis, to figure out how far off the rail he is – he’s in lane #1.5! But what does lane one-and-a-half mean?

    This shows the power of thoroMotion. The data driving this visualization is the official chart data – the same data used to populate the chart at the top of the page. You can see that there wasn’t a point-of-call at 5 furlongs, but there was one at 4 furlongs (1/2 mile) and 6 furlongs (3/4 mile).

    In this example I combined the official chart data with data from trip notes that detail how wide each horse was at each point-of-call. A lane of 1.5 means that Dollar Bill was making a move away from the rail between the half-mile and three-quarters pole (as you can see from the live thoroMotion above).

    Why? Because there were 2 longshots (larger bubbles) that had raced on the rail (blue bubbles) all the way up to that point, remember?!? You can imagine that as they weakened, Dollar Bill had to be maneuvered off the rail to get around them to challenge.

  5. By now you should be easily able to see that the horse in last position, the big pink bubble, is about 5 lengths from the lead.

    We can guess his odds by looking over to the “Odds” parameter again; see the way it shows what the scale is underneath? If the highlighted Dollar Bill has odds of 2.3 (to a $1 bet), and the odds scale goes up to 10.3, we can figure out that the last horse has pretty much the biggest bubble and therefore must be around the 10.3/1 odds mark!
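The lane 1.5 reading in the answers above is what you would get from a straight-line blend between two points of call. The sketch below assumes the in-between frames are filled by linear interpolation, and it assumes Dollar Bill was in lane 1 at 4 furlongs and lane 2 at 6 furlongs. Those lane values and the `interpolate` helper are illustrative guesses, not figures from the chart or the trip notes.

```javascript
// Linear interpolation between two points of call.
// Each call is { furlongs, value }; the lane numbers below are assumed.
function interpolate(before, after, furlongs) {
  const t = (furlongs - before.furlongs) / (after.furlongs - before.furlongs);
  return before.value + t * (after.value - before.value);
}

const laneAtFour = { furlongs: 4, value: 1 }; // on the rail at the half-mile
const laneAtSix = { furlongs: 6, value: 2 }; // one lane out at three-quarters

console.log(interpolate(laneAtFour, laneAtSix, 5)); // 1.5
```

The same blend applies to any numeric measure, so “Lengths Behind” at 5 furlongs can be read as halfway between its 4-furlong and 6-furlong values.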

If you have made it this far, well done! You should now have a decent grasp on how thoroMotion works and how powerful it can be.

Here’s the actual race video of the 2000 Kentucky Jockey Club Stakes so you can compare it to the visualization:

Why did I use the 2000 Kentucky Jockey Club Stakes as the example race? Because it is one of the races in the free sample Import Chart Result Data File downloaded from BRIS. In later posts I will show you how you can use results data purchased from data suppliers like brisnet.com with thoroMotion.

Right now, thoroMotion just supports the Import Charts data file structure. However, I’ve built an API that will allow developers to convert racing data from any data supplier to the format thoroMotion needs. This API will also be used for future projects here at thorobase.

What’s even better is that both the thorobase API and the thoroMotion code will be open-source software, so absolutely anyone can use this technology. My next few posts will show how you can build thoroMotions yourself, include them on any webpage and use other data sources, so stay tuned!

For now, I’d love to hear what people think and what they would like to see with thoroMotion next. Please leave a comment below, or on the thorobase Facebook page, or tweet me on Twitter, or even just email me. Thanks!

Written by Robin Howlett (Admin)

February 22nd, 2010 at 5:52 pm

Crowdsourcing Race Commentary Transcription

I describe how crowdsourcing the transcription of commentary from racing videos on YouTube enables deaf and hard-of-hearing racing fans to experience a great race call. Also, I show how to automatically translate the commentary to another language.

Play the video below to see it in action. If closed-captioning is not already turned on, click the upwards-facing arrow in the bottom right of the YouTube video frame, and click the “CC” button. The left-facing arrow to the left of the “CC” button allows you to select between the English and Spanish subtitles, or even auto-translate into a different language (although the horses’ names may then be translated as well).

How it was done

This week thorobase successfully organized a community-driven archiving of the Partymanners YouTube channel. The Partymanners channel contains over 2000 videos of stakes and notable races from years past, including some of the greatest performances by thoroughbred champions through the years.

As many of you already know the story, I will spare you the details; for the uninitiated, please direct yourselves to the excellent Colin’s Ghost blog and “Race Fans Rally to Save Historic Race Archive, 2010”.

After the archiving was complete, I posted the information to two of the more popular racing forums: Pace Advantage and Thoroughbred Champions. Included in those posts was an idea, made off the cuff, to use this new archive for improving the experience of racing fans with disabilities. That idea did not go away.

I decided to experiment today. I chose the video mentioned in those posts above, Tom Durkin’s classic call to Cigar’s imperious victory in the 1995 Breeders’ Cup Classic. As opposed to the Partymanners archiving task, I used a specialist website, Amazon’s Mechanical Turk, to employ people to do a specific HIT (human intelligence task).

I created the following HIT:

Amazon Mechanical Turk - Human Intelligence Task

The details of the HIT I submitted

I selected the $5 reward amount as I had designed it as a 5-hour task, making it equivalent to $1 an hour. I requested that the task be performed three times by three separate users. Along with the task, I provided a link to the original YouTube video of the race and a link to the Racing Post’s result page to aid with horse and jockey names.

Setting up the HIT took about 10 minutes, and I returned to my normal activities. Later that day, I was alerted to the fact the task had been completed and the transcriptions were ready. My inexperience with this technology showed, as there was evidence that perhaps I had overpaid: the workers had only taken about an hour each to transcribe the video.

The quality of all three transcriptions was very good, but Worker #1 had used a format especially well suited for YouTube transcription.

Using the same technique as for the Partymanners archiving, I downloaded the Cigar video from YouTube and uploaded it to my own account. YouTube has recently introduced automatic caption timing:

“With auto-timing, you no longer need to have special expertise to create your own captions in YouTube. All you need to do is create a simple text file with all the words in the video and we’ll use Google’s ASR [Automatic Speech Recognition] technology to figure out when the words are spoken and create captions for your video.”

That meant I could now supply the transcript file to the video, and YouTube’s algorithm would figure out what works.

This seemed perfect and I uploaded the transcript to be auto-timed. The video sync’d with the subtitles perfectly until about 1:25 into the video, when Durkin shouts “Cigar!” – the video then tried playing the rest of the captioning all within a few seconds. Knowing how poorly ASR technology copes with drastic changes in voice rhythm, I looked for a quick solution.

I found YouTube Subtitler, which allowed me to use my existing transcript and, while watching the video, let me very easily synchronize the text with the video. That task took no more than 10 minutes also.

Using YouTube Subtitler to synchronize the transcript with the video

While watching the video, you use simple controls to synchronize the transcript with the video

I left the transcript of Worker #1 largely intact as I wanted to demonstrate the quality of the transcription, despite the occasional error. I altered the first line, and I broke up the line containing Durkin’s famous race-ending exclamations purely for dramatic purposes. Other than that, the transcription is how I received it.

With the video now complete with appropriate captioning, I wanted to see where else I could take this. Another group of race fans that would not be able to fully enjoy the commentary of this race, even with subtitles, was non-English speakers. I opened Google’s translator toolkit and uploaded the caption script I had downloaded from YouTube Subtitler.

As I don’t speak a word of Spanish, I simply did a side-by-side comparison and fixed the horses’ names because some had been translated. Interestingly, the second time I tried this, Google had learned and no longer tried to translate Cigar into ‘de cigarro’.

Google Translator Toolkit

Translating the commentary from English to Spanish using Google translator toolkit
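The name fix-up described above was done by hand, but it could also be scripted. The sketch below is one hypothetical way to do it, using a hand-maintained list of known mistranslations; `restoreNames` and the sample caption line are made up for illustration, and only the ‘de cigarro’ example comes from the translation described above.

```javascript
// Hypothetical clean-up step: put back proper names that a machine
// translation has rendered literally. The mapping is maintained by hand.
function restoreNames(text, fixes) {
  let result = text;
  for (const [translated, original] of Object.entries(fixes)) {
    // split/join replaces every occurrence without needing a regular expression.
    result = result.split(translated).join(original);
  }
  return result;
}

// Made-up caption line containing the mistranslated name.
const fixedLine = restoreNames("y aquí viene de cigarro", { "de cigarro": "Cigar" });
console.log(fixedLine); // y aquí viene Cigar
```

A plain text replacement like this can also hit a legitimate phrase that happens to match, so a side-by-side check of the result is still worthwhile.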

I would love to hear your feedback about this idea and demo, especially those of you who could show it to someone who hasn’t been able to experience this great commentary before.

Written by Robin Howlett (Admin)

February 12th, 2010 at 12:09 am

Welcome!

thorobase is a brand new approach to horse racing from Robin Howlett.

This website will show you how to take a completely new approach to thoroughbred horse racing.

For the experienced horse player, thorobase will give you tools to improve your handicapping skills, and it will teach you skills that let you discover and manipulate racing data for profitable betting strategies.

For the racing beginner, it will open the world of thoroughbred racing to you. It will allow you to understand our fantastic sport in greater depth using simple visualizations, summaries and explanations.

For everyone in between, you’ll find something that will enhance your enjoyment of the Sport of Kings.

thorobase will focus strongly on using social media products, platforms and websites – like Google, Twitter, Facebook and Yahoo – to build fun and useful community-based applications. It will show you how to connect with fellow racing fans to share tips, stories and strategies. There are already some cool tools in the pipeline that you will be able to play with in the coming days and weeks.

Finally, the website will be a learning resource. All of the code used for the applications and demos will be open-source, meaning anyone will be free to contribute or use the code for their own purposes. Detailed instructions will show how to build horse racing software and services, so you can create customized solutions.

It’s a challenging time for horse racing today; thorobase wants to get you excited about this great sport of ours again!

Written by Robin Howlett (Admin)

February 10th, 2010 at 10:59 pm

Posted in Introduction