The project outline below was originally drafted in September 2016. Much has changed since then. Maybe I’ll get around to updating the description at some point. Here are a few of the changes:
- the model
- new features
- feature reduction of highly correlated features
- stepwise feature selection according to AIC
- set of scripts to streamline model testing and evaluation
- Monte Carlo simulations to explore the most profitable betting strategies (developed by a friend and my brother)
Below are real-world predictions made prior to each event. You’ll notice that the model and Vegas perform about the same. A few things to remember about the model and the results listed below:
- the model doesn’t make predictions when one of the fighters is making a UFC debut
- these are all model predictions at the minimum confidence threshold of 50%
- in a betting situation, these predictions are filtered to a higher confidence threshold, as well as by other means.
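
For readers unfamiliar with moneylines, the sketch below shows how an American moneyline converts to a break-even (implied) probability and to a payout, which is what makes the model's probability comparable to the Vegas line. The function names are hypothetical, and the `thresh_bin` rule (probability rounded down to the nearest 5%, capped at 75) is only inferred from the Thresh column in the table, not taken from the project code.

```r
# Implied (break-even) probability of an American moneyline, ignoring the bookmaker's margin.
implied_prob <- function(moneyline) {
  ifelse(moneyline < 0,
         -moneyline / (-moneyline + 100),   # favorite: risk |line| to win 100
         100 / (moneyline + 100))           # underdog: risk 100 to win the line
}

# Profit per 1 unit staked if the bet wins (a losing bet loses the 1 unit).
profit_if_win <- function(moneyline) {
  ifelse(moneyline < 0, 100 / -moneyline, moneyline / 100)
}

# ASSUMPTION: Thresh looks like the probability floored to a 5% step and capped at 75.
# Integer arithmetic on prob * 10000 avoids floating-point surprises at the bin edges.
thresh_bin <- function(prob) {
  pmin(75, floor(round(prob * 10000) / 500) * 5)
}

moneylines <- c(-120, 153, -334)        # first rows of the table
probs <- c(0.6285, 0.6032, 0.6366)
round(implied_prob(moneylines), 4)
thresh_bin(probs)
```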
Date | Bets | Moneyline | Probability | Model.Win | Vegas.Win | Thresh |
---|---|---|---|---|---|---|
2017-03-18 | makwan amirkhani | -120 | 0.6285 | 0 | 0 | 60 |
2017-03-18 | brad pickett | -138 | 0.5749 | 0 | 0 | 55 |
2017-03-18 | scott askham | -142 | 0.5857 | 0 | 0 | 55 |
2017-03-18 | daniel omielanczuk | 153 | 0.6032 | 0 | 1 | 60 |
2017-03-18 | darren stewart | -176 | 0.8005 | 0 | 0 | 75 |
2017-03-18 | gunnar nelson | -334 | 0.6366 | 1 | 1 | 60 |
2017-03-18 | corey anderson | 125 | 0.5971 | 0 | 1 | 55 |
2017-03-18 | joe duffy | -591 | 0.7547 | 1 | 1 | 75 |
2017-03-18 | vicente luque | -128 | 0.5163 | 0 | 0 | 50 |
2017-03-18 | marc diakiese | -202 | 0.6297 | 1 | 1 | 60 |
2017-03-11 | beneil dariush | 135 | 0.5341 | 0 | 1 | 50 |
2017-03-11 | francisco trinaldo | 160 | 0.5582 | 0 | 1 | 55 |
2017-03-11 | ray borg | -124 | 0.5128 | 1 | 1 | 50 |
2017-03-11 | mauricio rua | -150 | 0.7802 | 1 | 1 | 75 |
2017-03-11 | michel prazeres | -241 | 0.7483 | 1 | 1 | 70 |
2017-03-11 | rani yahya | -211 | 0.7123 | 0 | 0 | 70 |
2017-03-11 | rony jason | 105 | 0.5065 | 0 | 1 | 50 |
2017-03-11 | tim means | -181 | 0.6000 | 0 | 0 | 60 |
2017-03-11 | kelvin gastelum | -340 | 0.7107 | 1 | 1 | 70 |
2017-03-04 | alistair overeem | -131 | 0.7104 | 1 | 1 | 70 |
2017-03-04 | mirsad bektic | -675 | 0.6382 | 0 | 0 | 60 |
2017-03-04 | luke sanders | -135 | 0.6983 | 0 | 0 | 65 |
2017-03-04 | david teymur | 305 | 0.5320 | 1 | 0 | 50 |
2017-03-04 | marcin tybura | -156 | 0.5285 | 1 | 1 | 50 |
2017-03-04 | paul craig | 115 | 0.5811 | 0 | 1 | 55 |
2017-03-04 | rashad evans | -220 | 0.5955 | 0 | 0 | 55 |
2017-03-04 | stephen thompson | -134 | 0.6237 | 0 | 0 | 60 |
2017-02-19 | elias theodorou | 106 | 0.5708 | 1 | 0 | 55 |
2017-02-19 | gerald meerschaert | -279 | 0.6066 | 1 | 1 | 60 |
2017-02-19 | johny hendricks | 120 | 0.5623 | 1 | 0 | 55 |
2017-02-19 | paul felder | -346 | 0.7766 | 1 | 1 | 75 |
2017-02-19 | santiago ponzinibbio | -330 | 0.6707 | 1 | 1 | 65 |
2017-02-19 | thiago santos | -182 | 0.5703 | 1 | 1 | 55 |
2017-02-19 | travis browne | 101 | 0.5026 | 0 | 1 | 50 |
2017-02-11 | randy brown | -143 | 0.7754 | 0 | 0 | 75 |
2017-02-11 | derek brunson | -120 | 0.6034 | 0 | 0 | 60 |
2017-02-11 | glover teixeira | -199 | 0.5439 | 1 | 1 | 50 |
2017-02-11 | jacare souza | -555 | 0.8839 | 1 | 1 | 75 |
2017-02-11 | dustin poirier | -431 | 0.6404 | 1 | 1 | 60 |
2017-02-11 | islam makhachev | -327 | 0.5128 | 1 | 1 | 50 |
2017-02-11 | rick glenn | -176 | 0.5366 | 1 | 1 | 50 |
2017-02-11 | ryan laflare | -296 | 0.6748 | 1 | 1 | 65 |
2017-02-11 | wilson reis | -591 | 0.8891 | 1 | 1 | 75 |
2017-02-04 | abel trujillo | -110 | 0.5531 | 0 | 0 | 55 |
2017-02-04 | curtis blaydes | -300 | 0.5549 | 1 | 1 | 55 |
2017-02-04 | alex morono | -105 | 0.5609 | 0 | 0 | 55 |
2017-02-04 | chris gruetzemacher | 230 | 0.5427 | 0 | 1 | 50 |
2017-02-04 | khalil rountree jr | -165 | 0.6463 | 1 | 1 | 60 |
2017-02-04 | dennis bermudez | -200 | 0.7212 | 0 | 0 | 70 |
2017-01-28 | raphael assuncao | -155 | 0.5353 | 1 | 1 | 50 |
2017-01-28 | francis ngannou | -375 | 0.7663 | 1 | 1 | 75 |
2017-01-28 | donald cerrone | -155 | 0.7404 | 0 | 0 | 70 |
2017-01-28 | alessio di chirico | -110 | 0.7764 | 0 | 0 | 75 |
2017-01-28 | jc cottrell | -145 | 0.5350 | 0 | 0 | 50 |
2017-01-28 | jason knight | -150 | 0.6837 | 1 | 1 | 65 |
2017-01-28 | sam alvey | -155 | 0.5182 | 1 | 1 | 50 |
2017-01-15 | aleksei oleinik | -145 | 0.5814 | 1 | 1 | 55 |
2017-01-15 | court mcgee | -140 | 0.5932 | 0 | 0 | 55 |
2017-01-15 | joachim christensen | -225 | 0.6326 | 1 | 1 | 60 |
2017-01-15 | dmitrii smoliakov | -110 | 0.6602 | 0 | 0 | 65 |
2017-01-15 | frankie saenz | -165 | 0.5557 | 0 | 0 | 55 |
2017-01-15 | marcin held | 100 | 0.5154 | 0 | 1 | 50 |
2017-01-15 | sergio pettis | -150 | 0.7080 | 1 | 1 | 70 |
2017-01-15 | tony martin | -175 | 0.5132 | 1 | 1 | 50 |
2017-01-15 | walt harris | -145 | 0.6143 | 1 | 1 | 60 |
2017-01-15 | yair rodriguez | -445 | 0.7820 | 1 | 1 | 75 |
2016-12-30 | antonio carlos junior | -140 | 0.5263 | 1 | 1 | 50 |
2016-12-30 | cody garbrandt | 160 | 0.5184 | 1 | 0 | 50 |
2016-12-30 | tarec saffiedine | 120 | 0.5112 | 0 | 1 | 50 |
2016-12-30 | neil magny | -165 | 0.5515 | 1 | 1 | 55 |
2016-12-30 | louis smolka | 115 | 0.6123 | 0 | 1 | 60 |
2016-12-30 | alex garcia | -190 | 0.5155 | 1 | 1 | 50 |
2016-12-30 | tj dillashaw | -205 | 0.5100 | 1 | 1 | 50 |
2016-12-17 | alex morono | -115 | 0.6271 | 1 | 1 | 60 |
2016-12-17 | bojan velickovic | -155 | 0.7095 | 0 | 0 | 70 |
2016-12-17 | colby covington | -360 | 0.5998 | 1 | 1 | 55 |
2016-12-17 | cole miller | -110 | 0.5299 | 0 | 0 | 50 |
2016-12-17 | eddie wineland | -225 | 0.5080 | 1 | 1 | 50 |
2016-12-17 | hector sandoval | -130 | 0.6126 | 1 | 1 | 60 |
2016-12-17 | scott holtzman | 150 | 0.5767 | 0 | 1 | 55 |
2016-12-17 | mike perry | -135 | 0.5634 | 0 | 0 | 55 |
2016-12-17 | mickey gall | -120 | 0.5004 | 1 | 1 | 50 |
2016-12-17 | urijah faber | -450 | 0.6846 | 1 | 1 | 65 |
2016-12-10 | max holloway | -195 | 0.5800 | 1 | 1 | 55 |
2016-12-10 | cub swanson | 188 | 0.5041 | 1 | 0 | 50 |
2016-12-10 | donald cerrone | -275 | 0.7071 | 1 | 1 | 70 |
2016-12-10 | lando vannata | -170 | 0.5478 | 1 | 1 | 50 |
2016-12-10 | kelvin gastelum | 110 | 0.7367 | 1 | 0 | 70 |
2016-12-10 | mitch gagnon | -156 | 0.5630 | 0 | 0 | 55 |
2016-12-10 | nikita krylov | 100 | 0.6733 | 0 | 1 | 65 |
2016-12-10 | olivier aubin mercier | -165 | 0.5002 | 1 | 1 | 50 |
2016-12-10 | rustam khabilov | -210 | 0.7076 | 1 | 1 | 70 |
2016-12-10 | zach makovsky | -160 | 0.5500 | 0 | 0 | 55 |
2016-12-09 | andrew sanchez | -205 | 0.7311 | 1 | 1 | 70 |
2016-12-09 | francis ngannou | -550 | 0.5372 | 1 | 1 | 50 |
2016-12-09 | corey anderson | -400 | 0.7424 | 1 | 1 | 70 |
2016-12-09 | derrick lewis | -205 | 0.7983 | 1 | 1 | 75 |
2016-12-09 | marc diakiese | -325 | 0.6540 | 1 | 1 | 65 |
2016-12-03 | anthony smith | -115 | 0.6544 | 1 | 1 | 65 |
2016-12-03 | brandon moreno | -125 | 0.7252 | 1 | 1 | 70 |
2016-12-03 | brendan oreilly | 100 | 0.5464 | 0 | 1 | 50 |
2016-12-03 | ryan hall | -110 | 0.5732 | 1 | 1 | 55 |
2016-12-03 | ion cutelaba | -215 | 0.6528 | 0 | 0 | 65 |
2016-12-03 | jorge masvidal | -250 | 0.5396 | 1 | 1 | 50 |
2016-12-03 | joseph benavidez | -200 | 0.5385 | 1 | 1 | 50 |
2016-12-03 | devin clark | -105 | 0.5631 | 1 | 1 | 55 |
2016-11-26 | jake matthews | -300 | 0.7287 | 0 | 0 | 70 |
2016-11-26 | omari akhmedov | -180 | 0.5074 | 1 | 1 | 50 |
2016-11-26 | damien brown | 115 | 0.5549 | 1 | 0 | 55 |
2016-11-26 | richard walsh | 112 | 0.6033 | 0 | 1 | 60 |
2016-11-26 | ben nguyen | 110 | 0.5989 | 1 | 0 | 55 |
2016-11-26 | marlon vera | 125 | 0.5636 | 1 | 0 | 55 |
2016-11-26 | derek brunson | -135 | 0.5036 | 0 | 0 | 50 |
2016-11-26 | chris camozzi | -245 | 0.7258 | 0 | 0 | 70 |
2016-11-26 | jason knight | 131 | 0.6187 | 1 | 0 | 60 |
2016-11-19 | gegard mousasi | -550 | 0.7373 | 1 | 1 | 70 |
2016-11-19 | stevie ray | -105 | 0.5643 | 1 | 1 | 55 |
2016-11-19 | teruto ishihara | -275 | 0.6556 | 0 | 0 | 65 |
2016-11-19 | kyoji horiguchi | -199 | 0.7209 | 1 | 1 | 70 |
2016-11-19 | kevin lee | -102 | 0.7378 | 1 | 1 | 70 |
2016-11-19 | zak cummings | -170 | 0.5942 | 1 | 1 | 55 |
2016-11-19 | ryan bader | -370 | 0.7399 | 1 | 1 | 70 |
2016-11-19 | albert morales | 235 | 0.5279 | 0 | 1 | 50 |
2016-11-19 | krzysztof jotko | 140 | 0.5699 | 1 | 0 | 55 |
2016-11-19 | zak ottow | 142 | 0.5038 | 0 | 1 | 50 |
2016-11-19 | manvel gamburyan | 140 | 0.6020 | 0 | 1 | 60 |
2016-11-19 | luis henrique | -275 | 0.6747 | 1 | 1 | 65 |
2016-11-19 | kamaru usman | -205 | 0.7164 | 1 | 1 | 70 |
2016-11-19 | jack hermansson | -210 | 0.5305 | 0 | 0 | 50 |
2016-11-19 | justin scoggins | -168 | 0.7722 | 0 | 0 | 75 |
2016-11-12 | conor mcgregor | -137 | 0.5500 | 1 | 1 | 55 |
2016-11-12 | stephen thompson | -191 | 0.5142 | 0 | 0 | 50 |
2016-11-12 | chris weidman | -190 | 0.7122 | 0 | 0 | 70 |
2016-11-12 | frankie edgar | -340 | 0.5929 | 1 | 1 | 55 |
2016-11-12 | rafael natal | -165 | 0.6559 | 0 | 0 | 65 |
2016-11-12 | vicente luque | -105 | 0.6086 | 1 | 1 | 60 |
2016-11-12 | jim miller | 130 | 0.7162 | 1 | 0 | 70 |
2016-11-12 | khabib nurmagomedov | -305 | 0.6494 | 1 | 1 | 60 |
2016-11-05 | rafael dos anjos | -145 | 0.5486 | 0 | 0 | 50 |
2016-11-05 | charles oliveira | -105 | 0.5347 | 0 | 0 | 50 |
2016-11-05 | beneil dariush | -115 | 0.6033 | 1 | 1 | 60 |
2016-11-05 | marco beltran | -125 | 0.5642 | 0 | 0 | 55 |
2016-11-05 | erick montano | -105 | 0.6659 | 0 | 0 | 65 |
2016-11-05 | douglas silva de andrade | -125 | 0.5554 | 1 | 1 | 55 |
2016-11-05 | alex nicholson | 175 | 0.5269 | 0 | 1 | 50 |
2016-11-05 | enrique barzola | -549 | 0.6046 | 1 | 1 | 60 |
2016-11-05 | felipe arantes | 180 | 0.5206 | 0 | 1 | 50 |
2016-10-08 | Michael Bisping | -225 | 0.8037 | 1 | 1 | 75 |
2016-10-08 | Gegard Mousasi | -300 | 0.9063 | 1 | 1 | 75 |
2016-10-08 | Ovince Saint Preux | -155 | 0.7126 | 0 | 0 | 70 |
2016-10-08 | Stefan Struve | -185 | 0.7629 | 1 | 1 | 75 |
2016-10-08 | Mirsad Bektic | -700 | 0.5751 | 1 | 1 | 55 |
2016-10-08 | Iuri Alcantara | -165 | 0.7111 | 1 | 1 | 70 |
2016-10-08 | Damian Stasiak | 145 | 0.6871 | 1 | 0 | 65 |
2016-10-08 | Albert Tumenov | -260 | 0.5388 | 0 | 0 | 50 |
2016-10-08 | Mike Perry | 120 | 0.6247 | 1 | 0 | 60 |
2016-10-08 | Leonardo Santos | 175 | 0.5480 | 1 | 0 | 50 |
2016-10-01 | John Lineker | 105 | 0.7123 | 1 | 0 | 70 |
2016-10-01 | Will Brooks | -270 | 0.5858 | 0 | 0 | 55 |
2016-10-01 | Andre Fili | 150 | 0.5666 | 1 | 0 | 55 |
2016-10-01 | Shamil Abdurakhimov | -120 | 0.5583 | 1 | 1 | 55 |
2016-10-01 | Elizeu Zaleski dos Santos | 130 | 0.5261 | 1 | 0 | 50 |
2016-10-01 | Nate Marquardt | 180 | 0.5480 | 1 | 0 | 50 |
2016-10-01 | Ion Cutelaba | -165 | 0.6151 | 1 | 1 | 60 |
2016-10-01 | Curtis Blaydes | -210 | 0.5855 | 1 | 1 | 55 |
2016-09-24 | Renan Barao | -500 | 0.7085 | 1 | 1 | 70 |
2016-09-24 | Antonio Silva | 350 | 0.5854 | 0 | 1 | 55 |
2016-09-24 | Francisco Trinaldo | -130 | 0.7309 | 1 | 1 | 70 |
2016-09-24 | Thiago Santos | -620 | 0.6075 | 0 | 0 | 60 |
2016-09-24 | Mike de la Torre | 115 | 0.5159 | 0 | 1 | 50 |
2016-09-24 | Gilbert Burns | -175 | 0.5683 | 0 | 0 | 55 |
2016-09-24 | Rani Yahya | -140 | 0.6103 | 1 | 1 | 60 |
2016-09-24 | Jussier Formiga | -200 | 0.5980 | 1 | 1 | 55 |
2016-09-24 | Stevie Ray | -160 | 0.6967 | 0 | 0 | 65 |
2016-09-24 | Vicente Luque | -500 | 0.6840 | 1 | 1 | 65 |
2016-09-17 | Dustin Poirier | -170 | 0.6995 | 0 | 0 | 65 |
2016-09-17 | Derek Brunson | -190 | 0.5975 | 1 | 1 | 55 |
2016-09-17 | Roan Carneiro | 105 | 0.5722 | 1 | 0 | 55 |
2016-09-17 | Islam Makhachev | -135 | 0.5811 | 1 | 1 | 55 |
2016-09-17 | Chas Skelly | -175 | 0.5081 | 1 | 1 | 50 |
2016-09-17 | Gabriel Benitez | 105 | 0.5951 | 1 | 0 | 55 |
2016-09-17 | Belal Muhammad | -450 | 0.5708 | 1 | 1 | 55 |
2016-09-17 | Antonio Carlos Junior | -255 | 0.6906 | 1 | 1 | 65 |
2016-09-17 | Jose Quinonez | 140 | 0.5547 | 1 | 0 | 55 |
2016-09-17 | Randy Brown | -450 | 0.6144 | 1 | 1 | 60 |
2016-09-10 | Jimmie Rivera | -170 | 0.5714 | 1 | 1 | 55 |
2016-09-10 | Stipe Miocic | -110 | 0.6259 | 1 | 1 | 60 |
2016-09-10 | Fabricio Werdum | -225 | 0.8458 | 1 | 1 | 75 |
2016-09-10 | Brad Tavares | -200 | 0.6333 | 1 | 1 | 60 |
2016-09-10 | Sean Spencer | 105 | 0.5688 | 0 | 1 | 55 |
2016-09-03 | Andrei Arlovski | 120 | 0.6398 | 0 | 1 | 60 |
2016-09-03 | Alexander Gustafsson | -550 | 0.7330 | 1 | 1 | 70 |
2016-09-03 | Ryan Bader | -170 | 0.7937 | 1 | 1 | 75 |
2016-09-03 | Nick Hein | -340 | 0.7009 | 1 | 1 | 70 |
2016-09-03 | Nick Dalby | -190 | 0.5350 | 0 | 0 | 50 |
2016-09-03 | Taylor Lapilus | -139 | 0.7598 | 1 | 1 | 75 |
2016-09-03 | Rustam Khabilov | -560 | 0.7424 | 1 | 1 | 70 |
***
Ultimate Fighting Championship (UFC) is the world’s premier professional mixed martial arts organization. Fighters engage in one-on-one matches consisting of three or five rounds of five minutes each. Fighters are allowed to strike with punches, kicks, knees, and elbows, or to employ submission moves such as chokes or joint locks. Fighting occurs standing up, on the ground, or in a clinch position. As in other professional sports, a variety of structured data is gathered on individual fighters and on each matchup. This project gathers much of that data, learns from it, and then makes predictions about future matchups. Specifically, this project seeks to answer the question: given a specific matchup, what is each fighter’s probability of winning?
The dataset developed by and for this project is not only an aggregation of data scraped from multiple sources; it also includes new data derived from the scraped data. Namely, the derivative data consists of historical career statistics for each fighter at the time of each fight.
This project initially gathered data from four sources. In the end, however, only FightMetric and FightMatrix were used in the final dataset. The scraping process generally followed the pattern below, using the rvest package.
1. Collect and store relevant URLs in a list.
# Collect individual fighter URLs.
library(rvest)  # provides read_html(), html_nodes(), html_attr(), and the %>% pipe
fightMetric_fighter_links <- c()  # must exist before the loop appends to it
for (letter in letters) {
  fightMetric_fighter_list_url <- paste("http://www.fightmetric.com/statistics/fighters?char=",
                                        letter, "&page=all", sep = "")
  fightMetric_fighter_list_page <- read_html(fightMetric_fighter_list_url)
  # Get individual FightMetric fighter links for this letter of the alphabet from the Fighter List page.
  fighter_links_per_letter <- fightMetric_fighter_list_page %>%
    html_nodes(".b-statistics__table-col:nth-child(1) .b-link_style_black") %>%
    html_attr("href")
  fightMetric_fighter_links <- c(fightMetric_fighter_links, fighter_links_per_letter)
}
2. Make copies of HTML from the URL list and store in a list.
#### Make copies of each FightMetric fighter page HTML to later scrape data we need. ####
fightMetric_fighter_html <- list()
for (i in 1:length(fightMetric_fighter_links)) {
fightMetric_fighter_html[[i]] <- read_html(fightMetric_fighter_links[[i]])
}
3. Parse the HTML and put what we want in a dataframe.
#### Populate REVISED FIGHTER DATA ####
for (i in 1:length(fightMetric_fighter_html)) {
# Get fighter name
name <- fightMetric_fighter_html[[i]] %>%
html_nodes(".b-content__title-highlight") %>%
html_text(trim = TRUE)
fightMetric_fighters_df[i, 1] <- name[[1]]
# Get fighter pro record
record <- fightMetric_fighter_html[[i]] %>%
html_nodes(".b-content__title-record") %>%
html_text(trim = TRUE) %>%
strsplit(" ")
record <- record[[1]][2] %>%
strsplit("-")
fightMetric_fighters_df[i, 2] <- record[[1]][1]
fightMetric_fighters_df[i, 3] <- record[[1]][2]
fightMetric_fighters_df[i, 4] <- record[[1]][3]
# Get fighter stats
stats <- fightMetric_fighter_html[[i]] %>%
html_nodes(".b-list__box-list-item_type_block") %>%
html_text(trim = TRUE) %>%
gsub("^.*:(\\n *)*", "", .)
stats <- stats[-10]
for (s in seq_along(stats)) {
# Compare the scraped value (not the loop index) to the empty string.
if (stats[s] == "") {
fightMetric_fighters_df[i, (4+s)] <- NA
} else {
fightMetric_fighters_df[i, (4+s)] <- stats[s]
}
}
}
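
The record-parsing step above can also be written as a small vectorized helper. This is only a sketch in base R, assuming the scraped text looks like `Record: 20-5-0` (optionally followed by a no-contest note); `parse_record` and `blank_to_na` are hypothetical names, not functions from the project.

```r
# Split "Record: W-L-D" strings into numeric wins, losses, and draws.
# Strings without a W-L-D pattern yield NA in all three columns.
parse_record <- function(record_text) {
  pattern <- "^.*?([0-9]+)-([0-9]+)-([0-9]+).*$"
  has_record <- grepl(pattern, record_text, perl = TRUE)
  pick <- function(group) {
    value <- sub(pattern, paste0("\\", group), record_text, perl = TRUE)
    as.numeric(ifelse(has_record, value, NA))
  }
  data.frame(Wins = pick(1), Losses = pick(2), Draws = pick(3))
}

# Turn empty strings into NA, leaving existing NA values untouched.
blank_to_na <- function(x) {
  x[!is.na(x) & x == ""] <- NA
  x
}

parse_record(c("Record: 20-5-0", "Record: 11-3-1 (1 NC)"))
blank_to_na(c("5.2", "", "0.8"))
```

Working on whole vectors like this avoids filling the dataframe one cell at a time, at the cost of needing all pages parsed into character vectors first.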
This scraping pattern was adapted and repeated to pull data for each of the three primary dataframes:
- `ufc_fight_details_df`: fight-by-fight statistics
- `fightMetric_fighters_df`: individual fighter statistics
- `fighter_ranking_df`: current and historical fighter rankings by weight class

Much of the scraped data had to be cleaned to some extent. At the very least, values were converted from character to numeric; in other cases, fixes such as removing ‘%’ signs, converting a fighter’s height into inches, or converting dates of birth into a machine-readable Date format were necessary. Here are a few examples:
Removing ‘%’ sign.
# Remove % sign in several of the columns and convert to numeric.
for (i in c(11,13,15,16)) {
fightMetric_fighters_df[,i] <- fightMetric_fighters_df[,i] %>%
gsub("%", "", .) %>%
as.numeric()
fightMetric_fighters_df[,i] <- fightMetric_fighters_df[,i]/100
}
Converting height into number of inches
# Convert Height to numeric inches.
fightMetric_fighters_df$Height <- fightMetric_fighters_df$Height %>%
gsub("'|\"", "", .) %>%
strsplit(split = " ")
for (h in 1:length(fightMetric_fighters_df$Height)) {
if (fightMetric_fighters_df[h, 5][[1]] == "--") {
fightMetric_fighters_df[h, 5][[1]] <- NA
} else {
fightMetric_fighters_df[h, 5][[1]] <- (as.numeric(fightMetric_fighters_df[h, 5][[1]][1])*12) +
as.numeric(fightMetric_fighters_df[h, 5][[1]][2])
}
}
fightMetric_fighters_df$Height <- unlist(fightMetric_fighters_df$Height)
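
Both cleaning steps can be expressed as vectorized helpers, which avoids the row-by-row loop. The sketch below is an illustration in base R, not the project's code; it assumes heights are formatted like `5' 11"` with `--` for missing values, and the function names are made up.

```r
# Convert heights like 5' 11" to inches; anything else (e.g. "--") becomes NA.
height_to_inches <- function(height_text) {
  pattern <- "^\\s*([0-9]+)'\\s*([0-9]+)\"\\s*$"
  ok <- grepl(pattern, height_text)
  feet <- as.numeric(ifelse(ok, sub(pattern, "\\1", height_text), NA))
  inches <- as.numeric(ifelse(ok, sub(pattern, "\\2", height_text), NA))
  feet * 12 + inches
}

# Convert percentage strings like "45%" to proportions.
percent_to_prop <- function(percent_text) {
  as.numeric(sub("%", "", percent_text, fixed = TRUE)) / 100
}

height_to_inches(c("5' 11\"", "6' 0\"", "--"))
percent_to_prop(c("45%", "100%"))
```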
There were also instances of UFC fighters with the same name, which was particularly troublesome when merging this fighter data with other dataframes on the fighter’s name. The result looked like an accurate merge except for a few inexplicable errors. After digging further into the datasets and cross-checking against the original webpages, I handled the issue by identifying these fighters by name plus a physical attribute and then renaming one of them. Here’s an example.
# Handle instances of same name in FIGHTMETRIC_FIGHTERS_DF
fightMetric_fighters_df[fightMetric_fighters_df$Name=="Dong Hyun Kim" & fightMetric_fighters_df$Reach==70, 1] <- "Dong Hyun Kim (Maestro)"
fightMetric_fighters_df[fightMetric_fighters_df$Name=="Michael McDonald" & fightMetric_fighters_df$Weight==205, 1] <- "Michael McDonald (The Black Sniper)"
fightMetric_fighters_df[fightMetric_fighters_df$Name=="Tony Johnson" & fightMetric_fighters_df$Weight==185, 1] <- "Tony Johnson (185lbs.)"
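
Shared names can also be detected programmatically before any merge, rather than discovered through odd results afterwards. The following is a hedged sketch on a made-up table (the fighter names and weights are placeholders); it flags names that occur more than once and disambiguates them with an attribute, similar in spirit to the manual renames above.

```r
# Toy fighter table; the names and numbers are made up for illustration.
fighters <- data.frame(
  Name = c("Fighter A", "Fighter B", "Fighter A", "Fighter C"),
  Weight = c(170, 155, 155, 205),
  stringsAsFactors = FALSE
)

# Names that occur more than once would fan out into extra rows in a join on Name.
shared_names <- unique(fighters$Name[duplicated(fighters$Name)])
shared_names

# One way to make the key unique: append a distinguishing attribute to shared names only.
is_shared <- fighters$Name %in% shared_names
fighters$Name[is_shared] <- paste0(fighters$Name[is_shared],
                                   " (", fighters$Weight[is_shared], "lbs.)")

# The join key is now unique.
stopifnot(!anyDuplicated(fighters$Name))
```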
After all the data had been collected and stored in dataframes, some reshaping was required to end up with a single dataframe suitable to subset from and pass to a logistic regression or random forest function for modeling. At this point there were three dataframes - `fight_details`, `fightMetric_fighters`, and `fighter_ranking` - and the desired end state was a single `fight_details` that contained not only detailed fighter stats for each fight, but also each fighter’s attributes and ranking. Furthermore, `fight_details` was in a form where each row represented one fighter’s stats for one fight, which meant there were two rows for each fight, one per fighter. So, four things needed to happen…
1. Reshape `fight_details` to contain both fighters’ statistics for a single fight in a single row.
#### Reshape fight_details_df to accommodate logistic regression or random forest ####
colnames(ufc_fight_details_df)[2:3] <- c("Fighter1", "Fighter2")
# Grab even number rows from fight_details to later bind with odd number rows.
fighter2_df <- ufc_fight_details_df %>%
slice(seq(2,nrow(ufc_fight_details_df),2))
# Rename columns in fighter2_df and original fight_details_df.
colnames(fighter2_df)[17:160] <- paste("F2", colnames(fighter2_df)[17:160], sep = "")
colnames(ufc_fight_details_df)[17:160] <- paste("F1", colnames(ufc_fight_details_df)[17:160], sep = "")
# Delete even number rows from fight_details.
ufc_fight_details_df <- ufc_fight_details_df[-seq(2,nrow(ufc_fight_details_df),2),]
# Combine columns from fight_details and fighter_2, so each row holds each fighter's data for
# a given fight.
ufc_fight_details_df <- bind_cols(ufc_fight_details_df, fighter2_df[17:160])
# Delete fighter_2 data frame.
rm(fighter2_df)
# Then, REMOVE rows with NA for FIGHTER.
# A logical mask is used instead of negative indices: indexing with -c() on an empty
# index vector would drop every row when no NA rows are found.
na_fighter_rows <- is.na(ufc_fight_details_df[[2]]) | is.na(ufc_fight_details_df[[3]])
ufc_fight_details_df <- ufc_fight_details_df[!na_fighter_rows,]
###################################
2. Include a new variable to indicate the winner - simply a 0 or 1 variable indicating if Fighter 1 wins or not.
# Mutate FIGHT_DETAILS to give a numeric form of the result (1 = Fighter1 wins, 0 = Fighter1
# does not win), ready for logistic regression or random forest. NA results stay NA.
ufc_fight_details_df <- ufc_fight_details_df %>%
  mutate(F1Wins = as.numeric(as.character(Result) == as.character(Fighter1)))
3. Merge `fightMetric_fighters` with `fight_details` on each fighter’s name.
##### Merge FIGHTMETRIC_FIGHTERS_DF with FIGHT_DETAILS_DF ####
#### Merge FIGHTERS_DF to FIGHTER1 in FIGHT_DETAILS ####
fightMetric_fighters_df <- fightMetric_fighters_df %>%
rename(Fighter1=Name)
ufc_fight_details_df <- ufc_fight_details_df %>%
left_join(fightMetric_fighters_df, by = "Fighter1")
left_copy_fight_details <- ufc_fight_details_df[,1:306]
right_copy_fighter1_details <- ufc_fight_details_df[,307:323]
# Drop rows duplicated by the join. A logical mask is safe even when there are no duplicates,
# whereas negative indexing with an empty index vector would drop every row.
left_copy_double_rows <- duplicated(left_copy_fight_details)
left_copy_fight_details <- left_copy_fight_details[!left_copy_double_rows,]
right_copy_fighter1_details <- right_copy_fighter1_details[!left_copy_double_rows,]
ufc_fight_details_df <- left_copy_fight_details
########################
#### Merge FIGHTERS_DF to FIGHTER2 in FIGHT_DETAILS ####
fightMetric_fighters_df <- fightMetric_fighters_df %>%
rename(Fighter2=Fighter1)
ufc_fight_details_df <- ufc_fight_details_df %>%
left_join(fightMetric_fighters_df, by = "Fighter2")
left_copy_fight_details <- ufc_fight_details_df[,1:306]
right_copy_fighter2_details <- ufc_fight_details_df[,307:323]
# Drop rows duplicated by the join, again with a logical mask so that "no duplicates" is safe.
left_copy_double_rows <- duplicated(left_copy_fight_details)
left_copy_fight_details <- left_copy_fight_details[!left_copy_double_rows,]
right_copy_fighter2_details <- right_copy_fighter2_details[!left_copy_double_rows,]
ufc_fight_details_df <- left_copy_fight_details %>%
bind_cols(right_copy_fighter1_details) %>%
bind_cols(right_copy_fighter2_details)
colnames(ufc_fight_details_df)[307:323] <- paste("F1", colnames(fightMetric_fighters_df)[2:18], sep = "")
colnames(ufc_fight_details_df)[324:340] <- paste("F2", colnames(fightMetric_fighters_df)[2:18], sep = "")
fightMetric_fighters_df <- fightMetric_fighters_df %>%
rename(Name=Fighter2)
###################
4. Merge `fighter_ranking` with `fight_details` on each fighter’s name.
################### Assign FIGHTER_RANKING to fighters in NEW_FIGHTS #################
# Reset clean unique FightNumbers
new_fights_df$FightNumber <- c(1:nrow(new_fights_df))
new_fights_df <- new_fights_df %>%
mutate(F1Rank = 251, F2Rank = 251)
F1RankCol <- grep("F1Rank", colnames(new_fights_df))
F2RankCol <- grep("F2Rank", colnames(new_fights_df))
quarterly_date <- as.Date("2006-01-01", "%Y-%m-%d")
for (q in 1:(length(fighter_ranking_df)-9)) {
temp_fighter_ranking <- fighter_ranking_df %>%
select(c(q,353))
quarterly_fight_details <- new_fights_df %>%
filter(quarter(Date, with_year = TRUE) == quarter(quarterly_date, with_year = TRUE)) %>%
select(FightNumber, Date, Fighter1, Fighter2, F1Rank, F2Rank)
# seq_len() skips the loop entirely for quarters with no fights (1:0 would iterate twice).
for (F1 in seq_len(nrow(quarterly_fight_details))) {
# Find each fighter's row in this quarter's ranking column (case-insensitive).
# which() ignores NA names, so positions stay aligned with the ranking rows.
ranked_names <- tolower(temp_fighter_ranking[[1]])
f1_row <- which(ranked_names == tolower(quarterly_fight_details[[3]][F1]))[1]
f2_row <- which(ranked_names == tolower(quarterly_fight_details[[4]][F1]))[1]
if (!is.na(f1_row)) {
quarterly_fight_details[F1,5] <- temp_fighter_ranking[[2]][f1_row]
}
if (!is.na(f2_row)) {
quarterly_fight_details[F1,6] <- temp_fighter_ranking[[2]][f2_row]
}
}
for (i in seq_len(nrow(quarterly_fight_details))) {
new_fights_df[new_fights_df$FightNumber == quarterly_fight_details[i,1], c(F1RankCol, F2RankCol)] <- quarterly_fight_details[i, 5:6]
}
quarterly_date <- quarterly_date + months(3)
if (quarter(quarterly_date, with_year = TRUE) > quarter(today(), with_year = TRUE)) {
quarterly_date <- as.Date("2006-01-01", "%Y-%m-%d")
}
}
new_fights_df$FightNumber <- c(1:nrow(new_fights_df))
new_fights_df$F1Rank <- as.numeric(new_fights_df$F1Rank)
new_fights_df$F2Rank <- as.numeric(new_fights_df$F2Rank)
#######################################################################################
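
The four steps above are easier to follow on a tiny example. The sketch below uses made-up data and base R only; it is not the project's code. The column names are simplified, and `lookup_rank` is a hypothetical helper. The default rank of 251 for unranked fighters is the one used in the code above.

```r
# Toy long-format data: two consecutive rows per fight, one per fighter (values are made up).
long <- data.frame(
  FightNumber = c(1, 1, 2, 2),
  Fighter = c("A", "B", "C", "A"),
  SigStr = c(30, 22, 15, 41),
  Result = c("A", "A", "A", "A"),   # name of the winner of that fight
  stringsAsFactors = FALSE
)
ranks <- data.frame(Fighter = c("A", "C"), Rank = c(3, 12), stringsAsFactors = FALSE)

# Step 1: odd rows become Fighter1, even rows become Fighter2, side by side.
odd <- long[seq(1, nrow(long), by = 2), ]
even <- long[seq(2, nrow(long), by = 2), ]
wide <- data.frame(
  FightNumber = odd$FightNumber,
  Fighter1 = odd$Fighter, Fighter2 = even$Fighter,
  F1SigStr = odd$SigStr, F2SigStr = even$SigStr,
  Result = odd$Result,
  stringsAsFactors = FALSE
)

# Step 2: 1 if Fighter1 won, 0 otherwise.
wide$F1Wins <- as.numeric(wide$Result == wide$Fighter1)

# Steps 3 and 4: look up an attribute (here, rank) by fighter name.
# match() returns NA for fighters missing from the lookup table; they get the default 251.
lookup_rank <- function(fighter_names) {
  found <- ranks$Rank[match(fighter_names, ranks$Fighter)]
  ifelse(is.na(found), 251, found)
}
wide$F1Rank <- lookup_rank(wide$Fighter1)
wide$F2Rank <- lookup_rank(wide$Fighter2)
wide
```

Because `match()` returns exactly one position per fighter, this kind of lookup cannot add rows the way a join on a non-unique name can.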
As mentioned earlier, the dataset for this project is a combination of data scraped from several sources and data derived (or newly created) from the scraped data. Specifically, the derived data includes career statistics for each fighter at the time of each fight. This data was not available anywhere online (to my knowledge), but it was critical for developing an accurate description of each fighter at the time of each fight, on which to build a model that predicts future outcomes. So, the basic approach to generating this new data was to go through the entire fight dataset (`ufc_fight_details_df`) one fighter at a time and chronologically tally up their career stats. Below is an example of some of the code, which can almost certainly be refactored into something more economical, but for the time being is made up of about 200 lines of nested `for` loops and `if`/`else` statements.
Example block of code to generate derivative data:
# Add columns to be populated to FIGHT_DETAILS
ufc_fight_details_df <- ufc_fight_details_df %>%
mutate(F1CarFightTime = NA, F2CarFightTime = NA, F1UFCWins = NA, F2UFCWins = NA, F1UFCLosses = NA,
F2UFCLosses = NA, F1UFCNC = NA, F2UFCNC = NA, F1CarSigStr = NA, F2CarSigStr = NA,
F1CarSigStrAtt = NA, F2CarSigStrAtt = NA, F1CarSigStrAbs = NA, F2CarSigStrAbs = NA,
F1CarTD = NA, F2CarTD = NA, F1CarTDAtt = NA, F2CarTDAtt = NA, F1CarTDAbs = NA, F2CarTDAbs = NA,
F1CarOppSigStrAtt = NA, F2CarOppSigStrAtt = NA, F1CarOppTDAtt = NA, F2CarOppTDAtt = NA)
#### Begin loop to generate additional fighter data ####
for (fighter in unique_fighters$.) {
fighter_career_data <- ufc_fight_details_df %>%
filter((Fighter1 == as.character(fighter)) | (Fighter2 == as.character(fighter))) %>%
arrange(Date)
for (i in 1:nrow(fighter_career_data)) {
#### Adding up total career time in ring for each fighter going into each fight - Fighter1 spot. ####
if (fighter == fighter_career_data[i,2]) {
if (i == 1) {
# F1CarFightTime when F1
fighter_career_data[i, 342] <- 0
# F1CarSigStr when F1
fighter_career_data[i, 350] <- 0
# F1CarSigStrAtt when F1
fighter_career_data[i, 352] <- 0
# F1CarSigStrAbs when F1
fighter_career_data[i, 354] <- 0
# F1CarTD when F1
fighter_career_data[i, 356] <- 0
# F1CarTDAtt when F1
fighter_career_data[i, 358] <- 0
# F1CarTDAbs when F1
fighter_career_data[i, 360] <- 0
# F1CarOppSigStrAtt when F1
fighter_career_data[i, 362] <- 0
# F1CarOppTDAtt when F1
fighter_career_data[i, 364] <- 0
} else {
# F1 was F1 last fight
if (fighter == fighter_career_data[(i-1),2]) {
# F1CarFightTime when F1 this fight was F1 last fight
fighter_career_data[i, 342] <- fighter_career_data[(i-1), 342] + fighter_career_data[(i-1), 341]
# Get fight data from last fight when F1 this fight was F1 last fight.
temp_fight_data <- c(fighter_career_data[(i-1), 18], fighter_career_data[(i-1), 19],
fighter_career_data[(i-1), 162], fighter_career_data[(i-1), 23],
fighter_career_data[(i-1), 24], fighter_career_data[(i-1), 167],
fighter_career_data[(i-1), 163], fighter_career_data[(i-1), 168])
# F1CarSigStr through F1CarOppTDAtt when F1 this fight was F1 last fight
for (n in 1:8) {
if (is.na(temp_fight_data[n])) {
fighter_career_data[i, ((174+n)*2)] <- 0
} else {
fighter_career_data[i, ((174+n)*2)] <- fighter_career_data[(i-1), ((174+n)*2)] + temp_fight_data[n]
}
}
}
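
The same idea of career totals going into each fight can be expressed without nested loops. The following is a minimal sketch in base R, not the project's code: the toy `fight_log` table, its values, and the `career_before` helper are all made up for illustration, and it assumes a missing stat simply adds nothing to the running total.

```r
# Toy per-fighter fight log (made-up values): one row per fighter per fight.
fight_log <- data.frame(
  Fighter = c("A", "B", "A", "B", "A"),
  Date = as.Date(c("2015-01-10", "2015-01-10", "2015-06-20", "2016-02-06", "2016-03-05")),
  SigStr = c(10, 7, 20, NA, 5),
  stringsAsFactors = FALSE
)

# Sort chronologically within fighter so running totals follow career order.
fight_log <- fight_log[order(fight_log$Fighter, fight_log$Date), ]

# Career total going INTO each fight = running sum minus the current fight's own value.
# ASSUMPTION: a missing value contributes 0 to the total.
career_before <- function(x) {
  x[is.na(x)] <- 0
  cumsum(x) - x
}

# ave() applies the helper separately within each fighter's rows.
fight_log$CarSigStr <- ave(fight_log$SigStr, fight_log$Fighter, FUN = career_before)
fight_log
```

A fighter's first fight gets a career total of 0, matching the `i == 1` branch in the loop above.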
Some whittling down of the features occurred in going from the 382 features of `ufc_fight_details_df` to the subset of 48 features ultimately used in the logistic regression model. This process was part logic, part domain expertise. More on the logic side were features of physical description, like age, height, weight, reach, weight class, and stance. Career statistics features are the product of both logic and domain expertise. You don’t need to know anything about mixed martial arts or the UFC to think it’s a good idea to consider each fighter’s career statistics when making a prediction. Round-by-round statistics, on the other hand, were not included because not all fights have the same number of rounds, and a considerable amount of additional data wrangling would have been needed to present this data to the model in a useful way. The decision to include each fighter’s rank might also seem natural, but an excellent data scientist without as much domain expertise might think that statistical descriptions of each fighter’s performance should be enough, or even superior to a somewhat subjective or circumstantial indicator like rank. The ranking variable, however, takes into consideration factors such as the quality of a fighter’s past opponents and the chronology of past performance (e.g. a fighter with a long, successful career who is now fighting past his prime).
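
Feature reduction can also be partly automated. As a rough illustration of dropping highly correlated features, here is a greedy correlation filter in base R. It is a sketch under assumptions, not the procedure used in this project: `drop_correlated` is a hypothetical helper, the 0.9 cutoff is arbitrary, and the toy feature values are made up.

```r
# Drop the later feature from every pair whose absolute correlation exceeds a cutoff.
drop_correlated <- function(features, cutoff = 0.9) {
  corr <- abs(cor(features, use = "pairwise.complete.obs"))
  corr[upper.tri(corr, diag = TRUE)] <- 0   # keep the lower triangle so each pair is seen once
  to_drop <- character(0)
  for (name in colnames(corr)) {
    if (name %in% to_drop) next
    # Later columns that are highly correlated with this kept column get dropped.
    partners <- rownames(corr)[corr[, name] > cutoff]
    to_drop <- union(to_drop, partners)
  }
  features[, setdiff(colnames(features), to_drop), drop = FALSE]
}

reach <- c(70, 72, 68, 75, 71, 74)
features <- data.frame(
  Reach = reach,
  Height = reach - 1,                 # perfectly correlated with Reach
  Age = c(25, 34, 29, 31, 27, 38)
)
kept <- drop_correlated(features)
colnames(kept)
```

Which feature of a correlated pair to keep is a judgment call; this sketch simply keeps whichever column comes first.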
After collecting and wrangling the data, we have a dataset of over 5,000 MMA fights ready to train a statistical model to make predictions. Given that we’re trying to predict a binary outcome - win or lose - for each fighter in a given fight, a few methods would be suitable. For this project I tested two, logistic regression and random forest, and ended up going with logistic regression because it performed better on test data.
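# Packages used below: lubridate (today, years), dplyr (filter, select), caTools (sample.split),
# caret (confusionMatrix), pROC (roc), and randomForest (randomForest).
library(lubridate)
library(dplyr)
library(caTools)
library(caret)
library(pROC)
library(randomForest)
# Variable for the date ten years ago today.
ten_years_ago <- today() - years(10)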
# This gives us the subset of fights (rows) we'll use for the model.
row_subset_fights <- ufc_fight_details_df %>%
filter(Date > ten_years_ago) %>%
filter(Gender == "Men") %>%
filter(WeightClass != "Strawweight") %>%
filter(WeightClass != "Atomweight") %>%
filter(WeightClass != "Catch Weight") %>%
filter(WeightClass != "Open Weight") %>%
filter(WeightClass != "Other") %>%
filter(WeightClass != "Super Heavyweight") %>%
filter((F1UFCWins>0 | F1UFCLosses>0 | F1UFCNC>0) & (F2UFCWins>0 | F2UFCLosses>0 | F2UFCNC>0)) %>%
arrange(Date)
# Subset of columns to be used in the model subset.
model_dataset_columns <- read.csv(paste(getwd(), "/FreshModelColumns.csv", sep = ""))
model_dataset_columns <- as.character(model_dataset_columns$ColumnNames)
# Get subset fight_details dataset - all rows; only columns from selected_features
fresh_model_dataset <- row_subset_fights %>%
subset(select = model_dataset_columns)
# Factor Weightclass.
fresh_model_dataset$WeightClass <- as.factor(fresh_model_dataset$WeightClass)
# Factor F1Stance
fresh_model_dataset$F1Stance <- fresh_model_dataset$F1Stance %>%
as.factor()
# Factor F2Stance
fresh_model_dataset$F2Stance <- fresh_model_dataset$F2Stance %>%
as.factor()
##########!!!!!!!!! SUBSETTING !!!!!!!!!!###########
# Split fight_details_df into training and testing sets.
set.seed(109)
split_ufc_data <- fresh_model_dataset$F1Wins %>%
sample.split(SplitRatio = 0.70)
train_ufc_subset <- fresh_model_dataset %>%
subset(split_ufc_data == TRUE)
test_ufc_subset <- fresh_model_dataset %>%
subset(split_ufc_data == FALSE)
# Remove and save columns we don't need for modeling, but may need later... just in case.
saved_column_names <- c("FightNumber","Fighter1","Fighter2","Date")
# Match names exactly; grep() would also pick up any column whose name merely contains one of these strings.
column_pos <- which(colnames(fresh_model_dataset) %in% saved_column_names)
saved_columns <- fresh_model_dataset %>%
select(column_pos)
train_ufc_subset <- train_ufc_subset %>% select(-column_pos)
test_ufc_subset <- test_ufc_subset %>% select(-column_pos)
####################################
######### LOGIT MODEL ##############
ufc_fresh_logit_model <- glm(F1Wins ~ ., data = na.omit(train_ufc_subset), family = binomial)
# Prediction - probability that F1 wins
test_ufc_subset$F1WinsProb <- round(predict(ufc_fresh_logit_model, newdata = test_ufc_subset, type = "response"), 4)
#########################################
########## THRESHOLD MODELS #############
WinThreshold <- 0.5
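# Keep only fights where F1WinsProb > WinThreshold or F1WinsProb < (1 - WinThreshold).
# At WinThreshold = 0.5 this keeps every fight except those predicted at exactly 0.5.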
test_threshold_subset <- test_ufc_subset %>%
filter(F1WinsProb > WinThreshold | F1WinsProb < (1-WinThreshold))
test_threshold_subset$PredF1Wins <- as.numeric(test_threshold_subset$F1WinsProb > WinThreshold)
#######################################
######### LOGIT CONFUSION MATRIX #######
logitPerformance <- confusionMatrix(as.factor(test_threshold_subset$PredF1Wins), as.factor(test_threshold_subset$F1Wins),
dnn = c("PredF1Wins", "F1Wins"))
logitPerformance
Confusion Matrix and Statistics
F1Wins
PredF1Wins 0 1
0 119 99
1 237 479
Accuracy : 0.6403
95% CI : (0.6085, 0.6711)
No Information Rate : 0.6188
P-Value [Acc > NIR] : 0.09412
Kappa : 0.1761
Mcnemar's Test P-Value : 7.782e-14
Sensitivity : 0.3343
Specificity : 0.8287
Pos Pred Value : 0.5459
Neg Pred Value : 0.6690
Prevalence : 0.3812
Detection Rate : 0.1274
Detection Prevalence : 0.2334
Balanced Accuracy : 0.5815
'Positive' Class : 0
roc(F1Wins ~ F1WinsProb, test_threshold_subset, auc = TRUE, plot = TRUE)
Call:
roc.formula(formula = F1Wins ~ F1WinsProb, data = test_threshold_subset, auc = TRUE, plot = TRUE)
Data: F1WinsProb in 356 controls (F1Wins 0) < 578 cases (F1Wins 1).
Area under the curve: 0.6701
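As a sanity check, the headline statistics in the logistic regression output above can be recomputed by hand from the four cell counts of the confusion matrix. The sketch below is only worked arithmetic on those counts; the variable names are illustrative and are not part of the project's scripts.

```r
# Counts copied from the logistic regression confusion matrix above.
# Rows are predictions (PredF1Wins), columns are actual outcomes (F1Wins).
cm <- matrix(c(119, 237, 99, 479), nrow = 2,
             dimnames = list(PredF1Wins = c("0", "1"), F1Wins = c("0", "1")))

total <- sum(cm)
accuracy <- sum(diag(cm)) / total

# caret reports class 0 (Fighter 1 loses) as the 'positive' class here,
# so sensitivity is the share of actual losses that were predicted as losses.
sensitivity <- cm["0", "0"] / sum(cm[, "0"])
specificity <- cm["1", "1"] / sum(cm[, "1"])

# No-information rate: always predict the most common actual outcome.
no_information_rate <- max(colSums(cm)) / total

round(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity, nir = no_information_rate), 4)
```

The values agree with the reported Accuracy (0.6403), Sensitivity (0.3343), Specificity (0.8287) and No Information Rate (0.6188). The gap between accuracy and the no-information rate is small, which is consistent with the reported P-Value [Acc > NIR] of 0.09412.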
############################ RANDOM FOREST MODEL ###################################
#set.seed(388)
ufc_fresh_randomForest_model <- randomForest(as.factor(F1Wins) ~ ., data = na.omit(train_ufc_subset),
importance=TRUE)
test_ufc_subset$F1WinsProb <- NULL
test_ufc_subset$RFF1WinsProb <- predict(ufc_fresh_randomForest_model, test_ufc_subset, type = "prob")[,2]
test_ufc_subset$RFPredF1Wins <- as.numeric(test_ufc_subset$RFF1WinsProb > 0.5)
randomForestPerformance <- confusionMatrix(as.factor(test_ufc_subset$RFPredF1Wins), as.factor(test_ufc_subset$F1Wins), dnn = c("PredF1Wins", "F1Wins"))
randomForestPerformance
Confusion Matrix and Statistics
F1Wins
PredF1Wins 0 1
0 91 77
1 265 501
Accuracy : 0.6338
95% CI : (0.602, 0.6648)
No Information Rate : 0.6188
P-Value [Acc > NIR] : 0.1817
Kappa : 0.1362
Mcnemar's Test P-Value : <2e-16
Sensitivity : 0.25562
Specificity : 0.86678
Pos Pred Value : 0.54167
Neg Pred Value : 0.65405
Prevalence : 0.38116
Detection Rate : 0.09743
Detection Prevalence : 0.17987
Balanced Accuracy : 0.56120
'Positive' Class : 0
roc(F1Wins ~ RFF1WinsProb, test_ufc_subset, auc = TRUE, plot = TRUE)
Call:
roc.formula(formula = F1Wins ~ RFF1WinsProb, data = test_ufc_subset, auc = TRUE, plot = TRUE)
Data: RFF1WinsProb in 356 controls (F1Wins 0) < 578 cases (F1Wins 1).
Area under the curve: 0.6282
varImpPlot(ufc_fresh_randomForest_model, n.var = 10)
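Both models above are evaluated at a cutoff of 0.5, while actual bets are filtered to a higher confidence threshold. The sketch below illustrates how the WinThreshold filter from the logistic regression code trades the number of fights kept against accuracy. It runs on simulated probabilities and outcomes, not the project's data, and the helper evaluate_threshold is a hypothetical name introduced here for illustration.

```r
set.seed(1)

# Simulated stand-ins for the real test set: predicted probabilities and outcomes.
n <- 1000
f1_wins_prob <- runif(n)
f1_wins <- rbinom(n, size = 1, prob = f1_wins_prob)

# Hypothetical helper: apply one confidence threshold and summarise the result.
evaluate_threshold <- function(prob, actual, win_threshold) {
  # Keep only fights where the model is at least this confident either way.
  keep <- prob > win_threshold | prob < (1 - win_threshold)
  # On the kept fights, "prob > 0.5" and "prob > win_threshold" give the same prediction.
  predicted <- as.numeric(prob[keep] > 0.5)
  data.frame(threshold = win_threshold,
             fights_kept = sum(keep),
             accuracy = mean(predicted == actual[keep]))
}

thresholds <- c(0.50, 0.60, 0.70)
results <- do.call(rbind, lapply(thresholds, evaluate_threshold,
                                 prob = f1_wins_prob, actual = f1_wins))
results
```

Raising the threshold can only shrink the set of fights kept, so any gain in accuracy has to be weighed against having fewer fights to bet on.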
The practical application of this project is to generate predictions for UFC fights on a weekly basis in order to wager on the fights. There are four main steps in this workflow.
1. Update FightMetric_fighters_df: After each fight event, existing fighter data changes and sometimes new fighters are introduced to the UFC. So, we want to pull this new information into our dataset of fighters (FightMetric_fighters_df). Specifically, we do this by running the two blocks of code labeled for FightMetric_fighters_df, which can be found in the UFC.R script beginning at lines ~250 and ~1022, respectively.
2. Update fighter_ranking_df: After each event fighters win and lose, so the fighter rankings change. There are two components to fighter_ranking_df: historical rankings and current rankings. After each event we need to update the current rankings of each fighter, but the historical rankings do not change. They are recorded in the dataset on a quarterly basis; in other words, we have historical rankings for each weight class for each quarter going back to Q1 of 2006. So the historical rankings are updated only once a quarter, when they are released. Theoretically this means our model could have a few cases of rankings that are slightly off, but as a practical matter quarterly rankings are probably pretty accurate, as the vast majority of fighters do not fight more than once per quarter. We update the current rankings by running the block of code beginning at line ~1079 of UFCUpdater.R, and we update the historical rankings once a quarter by running the block of code at line ~1030 of UFCUpdater.R.
3. Update ufc_fight_details_df: After each event there are new fights and detailed statistics to be added to our ufc_fight_details_df. This dataset was originally built using the UFC.R script. Scraping nearly 10,000 web pages is a lengthy and somewhat hands-on process even once, and doing so every week would be overly burdensome. So, there is another script, UFCUpdater.R, that scrapes only the recently completed fights and data, computes their derived data, and then merges them back into ufc_fight_details_df. We simply run the UFCUpdater.R script to accomplish this.
4. Generate predictions: Once we have updated our dataset, we are ready to generate predictions for the set of upcoming fights. To accomplish this we run the UFCPredictor.R script. This script does a few things. First, it scrapes the upcoming fight matchups. Then it computes and gathers the relevant variables for each fighter from our dataset. Then it refits the model with the most recent dataset. Finally, it applies our model to the set of data for the upcoming fights to produce a probability that Fighter 1 will win.
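The last two parts of the prediction step, refitting the model and scoring the upcoming fights, can be sketched roughly as follows. This is not the contents of UFCPredictor.R: the data are simulated, only four predictors are used, and the objects past_fights and upcoming_fights are hypothetical stand-ins for the updated dataset and the scraped matchups.

```r
set.seed(42)

# Simulated historical fights; column names borrow from ufc_fight_details_df,
# but the values are made up for illustration.
n <- 500
past_fights <- data.frame(
  F1Rank  = sample(1:100, n, replace = TRUE),
  F2Rank  = sample(1:100, n, replace = TRUE),
  F1Reach = round(rnorm(n, mean = 72, sd = 3)),
  F2Reach = round(rnorm(n, mean = 72, sd = 3))
)
# Assumption for the simulation: the better-ranked (lower number) fighter wins more often.
win_prob <- plogis(0.03 * (past_fights$F2Rank - past_fights$F1Rank))
past_fights$F1Wins <- rbinom(n, size = 1, prob = win_prob)

# Refit the model on the most recent data.
model <- glm(F1Wins ~ ., data = past_fights, family = binomial)

# Hypothetical upcoming matchups, described with the same predictor columns.
upcoming_fights <- data.frame(
  F1Rank  = c(5, 40),  F2Rank  = c(30, 12),
  F1Reach = c(74, 70), F2Reach = c(72, 73)
)

# Probability that Fighter 1 wins each upcoming fight.
upcoming_fights$F1WinsProb <- round(
  predict(model, newdata = upcoming_fights, type = "response"), 4)
upcoming_fights
```

The key requirement is that the upcoming fights carry exactly the same predictor columns as the data the model was refit on.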
The project is currently functional and practical, and it accomplishes what it set out to do. Nonetheless, there is room for improvement. The following is a working list of items to explore for improving the performance of the model.
Feature Engineering
1. A feature that incorporates how a fight ends: For example, does a fighter win by KO/TKO, Submission, Decision, etc…
2. A feature that incorporates whether or not this is a fighter’s first fight in UFC.
3. A feature to incorporate a fighter’s current winning or losing streak.
4. Further testing of different variable combinations (removing variables that appear insignificant, for example)
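As an illustration of the streak idea in item 3, here is a minimal sketch that derives a fighter's current streak from a vector of past results. The function name current_streak and the 1/0 encoding of wins and losses are assumptions made for this example; the project's dataset would need the results ordered by date for each fighter first.

```r
# Current streak from a fighter's past results, ordered oldest to newest.
# 1 = win, 0 = loss. Positive values are winning streaks, negative values losing streaks.
current_streak <- function(results) {
  if (length(results) == 0) {
    return(0L)  # a debuting fighter has no streak yet
  }
  runs <- rle(results)  # run-length encoding of consecutive identical results
  last_run_length <- runs$lengths[length(runs$lengths)]
  last_run_value <- runs$values[length(runs$values)]
  if (last_run_value == 1) last_run_length else -last_run_length
}

current_streak(c(1, 0, 1, 1, 1))  # three straight wins
current_streak(c(1, 1, 0, 0))     # two straight losses
```

To avoid leaking the outcome into the predictor, the streak for a given fight should be computed only from results dated before that fight.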
Predictive Modeling
1. Explore implementing an ensemble model that combines Logistic Regression and Random Forest.
2. Explore clustering on sets of fights predicted incorrectly to see if we can develop rules to improve overall predictive accuracy with a multi-stage predictive approach… or a multi-faceted model (i.e. generate multiple fits for subsets of fights that meet certain criteria).
3. Explore Specificity, Precision, and other model performance measures as a way to weight or hone model predictions.
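One simple form the ensemble in item 1 could take is a weighted average of the two models' predicted probabilities. The sketch below uses made-up probabilities rather than output from the fitted models, and the equal weighting is an arbitrary starting point, not a tuned value.

```r
# Hypothetical predicted probabilities that Fighter 1 wins, one entry per fight.
logit_prob <- c(0.62, 0.48, 0.81, 0.35)
rf_prob    <- c(0.58, 0.55, 0.70, 0.41)

# Simple ensemble: a weighted average of the two models' probabilities.
logit_weight <- 0.5
ensemble_prob <- logit_weight * logit_prob + (1 - logit_weight) * rf_prob
ensemble_pred <- as.numeric(ensemble_prob > 0.5)

data.frame(logit_prob, rf_prob, ensemble_prob, ensemble_pred)
```

In the second fight the two models disagree (0.48 versus 0.55), and the average of 0.515 tips the ensemble toward Fighter 1. The weight could instead be chosen by comparing performance on held-out fights.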
Usability
1. Automate dataset updates and prediction generation.
2. Build a user-friendly dashboard or interactive interface with Flexdashboard or Shiny, respectively.
Code Base
1. Implement a testing suite to help avoid “breaking” an increasingly complex code base during future code maintenance and feature additions.
2. Plenty of room for refactoring, which would improve readability, make collaboration easier, and may yield some gains in execution speed.
# Creating table to explain the data.
data_table <- data.frame(colnames(ufc_fight_details_df),
rep(NA,length(ufc_fight_details_df)),
as.character(ufc_fight_details_df[3449,]),
rep("FightMetric",length(ufc_fight_details_df)),
stringsAsFactors = FALSE)
# Assign column names.
colnames(data_table) <- c("Variables", "Unabreviated", "Example", "Source")
# Generate unabreviated variable names.
data_table$Unabreviated <- data_table$Variables %>%
gsub("F1", "Fighter1 ", .) %>%
gsub("F2", "Fighter2 ", .) %>%
gsub("Str", "Strike ", .) %>%
gsub("Att", "Attempt ", .) %>%
gsub("Car", "Career ", .) %>%
gsub("Sig", "Significant ", .) %>%
gsub("Opp", "Opponent ", .) %>%
gsub("TD", "Take Down ", .) %>%
gsub("Abs", "Absorbed ", .) %>%
gsub("pM", "per Minute ", .) %>%
gsub("Acc", "Accuracy ", .) %>%
gsub("Def", "Defense ", .) %>%
gsub("Avg", "Average ", .) %>%
gsub("Dist", "Distance ", .) %>%
gsub("KD", "Knock Down ", .) %>%
gsub("Pass", "Guard Pass ", .) %>%
gsub("NC", "No Contest ", .) %>%
gsub("R1", "Round 1 ", .) %>%
gsub("R2", "Round 2 ", .) %>%
gsub("R3", "Round 3 ", .) %>%
gsub("R4", "Round 4 ", .) %>%
gsub("R5", "Round 5 ", .) %>%
gsub("Tot", "Total ", .) %>%
gsub("Rev", "Reversal", .) %>%
gsub("SA", "Strikes Absorbed ", .) %>%
gsub("Sub", "Submission ", .) %>%
gsub("Ref", "Referee ", .) %>%
gsub("SL", "Strikes Landed ", .)
# Insert data sources
self_derived <- grep(".*Car.*", data_table$Variables)
fightmatrix <- grep("F1Rank|F2Rank", data_table$Variables)
data_table$Source[self_derived] <- "Derived"
data_table$Source[fightmatrix] <- "FightMatrix"
kable(data_table, caption = "Composition of the primary dataset, [ufc_fight_details_df]", align = 'l')
Variables | Unabreviated | Example | Source |
---|---|---|---|
Date | Date | 2009-03-07 | FightMetric |
Detail | Detail | Kick to Head At Distance | FightMetric |
F1Age | Fighter1 Age | 32 | FightMetric |
F1BodyStr | Fighter1 BodyStrike | 0 | FightMetric |
F1BodyStrAtt | Fighter1 BodyStrike Attempt | 0 | FightMetric |
F1CarFightTime | Fighter1 Career FightTime | 3790 | Derived |
F1CarOppSigStrAtt | Fighter1 Career Opponent Significant Strike Attempt | 540 | Derived |
F1CarOppTDAtt | Fighter1 Career Opponent Take Down Attempt | 5 | Derived |
F1CarSigStr | Fighter1 Career Significant Strike | 216 | Derived |
F1CarSigStrAbs | Fighter1 Career Significant Strike Absorbed | 215 | Derived |
F1CarSigStrAtt | Fighter1 Career Significant Strike Attempt | 505 | Derived |
F1CarStrAbspM | Fighter1 Career Strike Absorbed per Minute | 3.4 | Derived |
F1CarStrAcc | Fighter1 Career Strike Accuracy | 0.43 | Derived |
F1CarStrDef | Fighter1 Career Strike Defense | 0.4 | Derived |
F1CarStrLpM | Fighter1 Career Strike Lper Minute | 3.42 | Derived |
F1CarTD | Fighter1 Career Take Down | 17 | Derived |
F1CarTDAbs | Fighter1 Career Take Down Absorbed | 0 | Derived |
F1CarTDAcc | Fighter1 Career Take Down Accuracy | 0.52 | Derived |
F1CarTDAtt | Fighter1 Career Take Down Attempt | 33 | Derived |
F1CarTDAvg | Fighter1 Career Take Down Average | 2.43 | Derived |
F1CarTDDef | Fighter1 Career Take Down Defense | 0 | Derived |
F1ClinchStr | Fighter1 ClinchStrike | 2 | FightMetric |
F1ClinchStrAtt | Fighter1 ClinchStrike Attempt | 4 | FightMetric |
F1DistStr | Fighter1 Distance Strike | 20 | FightMetric |
F1DistStrAtt | Fighter1 Distance Strike Attempt | 42 | FightMetric |
F1DOB | Fighter1 DOB | 1976-10-05 | FightMetric |
F1GroundStr | Fighter1 GroundStrike | 1 | FightMetric |
F1GroundStrAtt | Fighter1 GroundStrike Attempt | 1 | FightMetric |
F1HeadStr | Fighter1 HeadStrike | 18 | FightMetric |
F1HeadStrAtt | Fighter1 HeadStrike Attempt | 42 | FightMetric |
F1Height | Fighter1 Height | 73 | FightMetric |
F1KD | Fighter1 Knock Down | 1 | FightMetric |
F1LegStr | Fighter1 LegStrike | 5 | FightMetric |
F1LegStrAtt | Fighter1 LegStrike Attempt | 5 | FightMetric |
F1Pass | Fighter1 Guard Pass | 0 | FightMetric |
F1ProLosses | Fighter1 ProLosses | 5 | FightMetric |
F1ProNC | Fighter1 ProNo Contest | 0 | FightMetric |
F1ProWins | Fighter1 ProWins | 12 | FightMetric |
F1R1BodyStr | Fighter1 Round 1 BodyStrike | 0 | FightMetric |
F1R1BodyStrAtt | Fighter1 Round 1 BodyStrike Attempt | 0 | FightMetric |
F1R1ClinchStr | Fighter1 Round 1 ClinchStrike | 2 | FightMetric |
F1R1ClinchStrAtt | Fighter1 Round 1 ClinchStrike Attempt | 4 | FightMetric |
F1R1DistStr | Fighter1 Round 1 Distance Strike | 20 | FightMetric |
F1R1DistStrAtt | Fighter1 Round 1 Distance Strike Attempt | 42 | FightMetric |
F1R1GroundStr | Fighter1 Round 1 GroundStrike | 1 | FightMetric |
F1R1GroundStrAtt | Fighter1 Round 1 GroundStrike Attempt | 1 | FightMetric |
F1R1HeadStr | Fighter1 Round 1 HeadStrike | 18 | FightMetric |
F1R1HeadStrAtt | Fighter1 Round 1 HeadStrike Attempt | 42 | FightMetric |
F1R1KD | Fighter1 Round 1 Knock Down | 1 | FightMetric |
F1R1LegStr | Fighter1 Round 1 LegStrike | 5 | FightMetric |
F1R1LegStrAtt | Fighter1 Round 1 LegStrike Attempt | 5 | FightMetric |
F1R1Pass | Fighter1 Round 1 Guard Pass | 0 | FightMetric |
F1R1Rev | Fighter1 Round 1 Reversal | 0 | FightMetric |
F1R1SigStr | Fighter1 Round 1 Significant Strike | 23 | FightMetric |
F1R1SigStrAtt | Fighter1 Round 1 Significant Strike Attempt | 47 | FightMetric |
F1R1SigStrPercent | Fighter1 Round 1 Significant Strike Percent | 0.48 | FightMetric |
F1R1SubAtt | Fighter1 Round 1 Submission Attempt | 0 | FightMetric |
F1R1TD | Fighter1 Round 1 Take Down | 0 | FightMetric |
F1R1TDAtt | Fighter1 Round 1 Take Down Attempt | 0 | FightMetric |
F1R1TDPercent | Fighter1 Round 1 Take Down Percent | 0 | FightMetric |
F1R1TotStr | Fighter1 Round 1 Total Strike | 40 | FightMetric |
F1R1TotStrAtt | Fighter1 Round 1 Total Strike Attempt | 64 | FightMetric |
F1R2BodyStr | Fighter1 Round 2 BodyStrike | NA | FightMetric |
F1R2BodyStrAtt | Fighter1 Round 2 BodyStrike Attempt | NA | FightMetric |
F1R2ClinchStr | Fighter1 Round 2 ClinchStrike | NA | FightMetric |
F1R2ClinchStrAtt | Fighter1 Round 2 ClinchStrike Attempt | NA | FightMetric |
F1R2DistStr | Fighter1 Round 2 Distance Strike | NA | FightMetric |
F1R2DistStrAtt | Fighter1 Round 2 Distance Strike Attempt | NA | FightMetric |
F1R2GroundStr | Fighter1 Round 2 GroundStrike | NA | FightMetric |
F1R2GroundStrAtt | Fighter1 Round 2 GroundStrike Attempt | NA | FightMetric |
F1R2HeadStr | Fighter1 Round 2 HeadStrike | NA | FightMetric |
F1R2HeadStrAtt | Fighter1 Round 2 HeadStrike Attempt | NA | FightMetric |
F1R2KD | Fighter1 Round 2 Knock Down | NA | FightMetric |
F1R2LegStr | Fighter1 Round 2 LegStrike | NA | FightMetric |
F1R2LegStrAtt | Fighter1 Round 2 LegStrike Attempt | NA | FightMetric |
F1R2Pass | Fighter1 Round 2 Guard Pass | NA | FightMetric |
F1R2Rev | Fighter1 Round 2 Reversal | NA | FightMetric |
F1R2SigStr | Fighter1 Round 2 Significant Strike | NA | FightMetric |
F1R2SigStrAtt | Fighter1 Round 2 Significant Strike Attempt | NA | FightMetric |
F1R2SigStrPercent | Fighter1 Round 2 Significant Strike Percent | NA | FightMetric |
F1R2SubAtt | Fighter1 Round 2 Submission Attempt | NA | FightMetric |
F1R2TD | Fighter1 Round 2 Take Down | NA | FightMetric |
F1R2TDAtt | Fighter1 Round 2 Take Down Attempt | NA | FightMetric |
F1R2TDPercent | Fighter1 Round 2 Take Down Percent | NA | FightMetric |
F1R2TotStr | Fighter1 Round 2 Total Strike | NA | FightMetric |
F1R2TotStrAtt | Fighter1 Round 2 Total Strike Attempt | NA | FightMetric |
F1R3BodyStr | Fighter1 Round 3 BodyStrike | NA | FightMetric |
F1R3BodyStrAtt | Fighter1 Round 3 BodyStrike Attempt | NA | FightMetric |
F1R3ClinchStr | Fighter1 Round 3 ClinchStrike | NA | FightMetric |
F1R3ClinchStrAtt | Fighter1 Round 3 ClinchStrike Attempt | NA | FightMetric |
F1R3DistStr | Fighter1 Round 3 Distance Strike | NA | FightMetric |
F1R3DistStrAtt | Fighter1 Round 3 Distance Strike Attempt | NA | FightMetric |
F1R3GroundStr | Fighter1 Round 3 GroundStrike | NA | FightMetric |
F1R3GroundStrAtt | Fighter1 Round 3 GroundStrike Attempt | NA | FightMetric |
F1R3HeadStr | Fighter1 Round 3 HeadStrike | NA | FightMetric |
F1R3HeadStrAtt | Fighter1 Round 3 HeadStrike Attempt | NA | FightMetric |
F1R3KD | Fighter1 Round 3 Knock Down | NA | FightMetric |
F1R3LegStr | Fighter1 Round 3 LegStrike | NA | FightMetric |
F1R3LegStrAtt | Fighter1 Round 3 LegStrike Attempt | NA | FightMetric |
F1R3Pass | Fighter1 Round 3 Guard Pass | NA | FightMetric |
F1R3Rev | Fighter1 Round 3 Reversal | NA | FightMetric |
F1R3SigStr | Fighter1 Round 3 Significant Strike | NA | FightMetric |
F1R3SigStrAtt | Fighter1 Round 3 Significant Strike Attempt | NA | FightMetric |
F1R3SigStrPercent | Fighter1 Round 3 Significant Strike Percent | NA | FightMetric |
F1R3SubAtt | Fighter1 Round 3 Submission Attempt | NA | FightMetric |
F1R3TD | Fighter1 Round 3 Take Down | NA | FightMetric |
F1R3TDAtt | Fighter1 Round 3 Take Down Attempt | NA | FightMetric |
F1R3TDPercent | Fighter1 Round 3 Take Down Percent | NA | FightMetric |
F1R3TotStr | Fighter1 Round 3 Total Strike | NA | FightMetric |
F1R3TotStrAtt | Fighter1 Round 3 Total Strike Attempt | NA | FightMetric |
F1R4BodyStr | Fighter1 Round 4 BodyStrike | NA | FightMetric |
F1R4BodyStrAtt | Fighter1 Round 4 BodyStrike Attempt | NA | FightMetric |
F1R4ClinchStr | Fighter1 Round 4 ClinchStrike | NA | FightMetric |
F1R4ClinchStrAtt | Fighter1 Round 4 ClinchStrike Attempt | NA | FightMetric |
F1R4DistStr | Fighter1 Round 4 Distance Strike | NA | FightMetric |
F1R4DistStrAtt | Fighter1 Round 4 Distance Strike Attempt | NA | FightMetric |
F1R4GroundStr | Fighter1 Round 4 GroundStrike | NA | FightMetric |
F1R4GroundStrAtt | Fighter1 Round 4 GroundStrike Attempt | NA | FightMetric |
F1R4HeadStr | Fighter1 Round 4 HeadStrike | NA | FightMetric |
F1R4HeadStrAtt | Fighter1 Round 4 HeadStrike Attempt | NA | FightMetric |
F1R4KD | Fighter1 Round 4 Knock Down | NA | FightMetric |
F1R4LegStr | Fighter1 Round 4 LegStrike | NA | FightMetric |
F1R4LegStrAtt | Fighter1 Round 4 LegStrike Attempt | NA | FightMetric |
F1R4Pass | Fighter1 Round 4 Guard Pass | NA | FightMetric |
F1R4Rev | Fighter1 Round 4 Reversal | NA | FightMetric |
F1R4SigStr | Fighter1 Round 4 Significant Strike | NA | FightMetric |
F1R4SigStrAtt | Fighter1 Round 4 Significant Strike Attempt | NA | FightMetric |
F1R4SigStrPercent | Fighter1 Round 4 Significant Strike Percent | NA | FightMetric |
F1R4SubAtt | Fighter1 Round 4 Submission Attempt | NA | FightMetric |
F1R4TD | Fighter1 Round 4 Take Down | NA | FightMetric |
F1R4TDAtt | Fighter1 Round 4 Take Down Attempt | NA | FightMetric |
F1R4TDPercent | Fighter1 Round 4 Take Down Percent | NA | FightMetric |
F1R4TotStr | Fighter1 Round 4 Total Strike | NA | FightMetric |
F1R4TotStrAtt | Fighter1 Round 4 Total Strike Attempt | NA | FightMetric |
F1R5BodyStr | Fighter1 Round 5 BodyStrike | NA | FightMetric |
F1R5BodyStrAtt | Fighter1 Round 5 BodyStrike Attempt | NA | FightMetric |
F1R5ClinchStr | Fighter1 Round 5 ClinchStrike | NA | FightMetric |
F1R5ClinchStrAtt | Fighter1 Round 5 ClinchStrike Attempt | NA | FightMetric |
F1R5DistStr | Fighter1 Round 5 Distance Strike | NA | FightMetric |
F1R5DistStrAtt | Fighter1 Round 5 Distance Strike Attempt | NA | FightMetric |
F1R5GroundStr | Fighter1 Round 5 GroundStrike | NA | FightMetric |
F1R5GroundStrAtt | Fighter1 Round 5 GroundStrike Attempt | NA | FightMetric |
F1R5HeadStr | Fighter1 Round 5 HeadStrike | NA | FightMetric |
F1R5HeadStrAtt | Fighter1 Round 5 HeadStrike Attempt | NA | FightMetric |
F1R5KD | Fighter1 Round 5 Knock Down | NA | FightMetric |
F1R5LegStr | Fighter1 Round 5 LegStrike | NA | FightMetric |
F1R5LegStrAtt | Fighter1 Round 5 LegStrike Attempt | NA | FightMetric |
F1R5Pass | Fighter1 Round 5 Guard Pass | NA | FightMetric |
F1R5Rev | Fighter1 Round 5 Reversal | NA | FightMetric |
F1R5SigStr | Fighter1 Round 5 Significant Strike | NA | FightMetric |
F1R5SigStrAtt | Fighter1 Round 5 Significant Strike Attempt | NA | FightMetric |
F1R5SigStrPercent | Fighter1 Round 5 Significant Strike Percent | NA | FightMetric |
F1R5SubAtt | Fighter1 Round 5 Submission Attempt | NA | FightMetric |
F1R5TD | Fighter1 Round 5 Take Down | NA | FightMetric |
F1R5TDAtt | Fighter1 Round 5 Take Down Attempt | NA | FightMetric |
F1R5TDPercent | Fighter1 Round 5 Take Down Percent | NA | FightMetric |
F1R5TotStr | Fighter1 Round 5 Total Strike | NA | FightMetric |
F1R5TotStrAtt | Fighter1 Round 5 Total Strike Attempt | NA | FightMetric |
F1Rank | Fighter1 Rank | 20 | FightMatrix |
F1Reach | Fighter1 Reach | 76 | FightMetric |
F1Rev | Fighter1 Reversal | 0 | FightMetric |
F1SApM | Fighter1 Strikes Absorbed per Minute | 3.8 | FightMetric |
F1SigStr | Fighter1 Significant Strike | 23 | FightMetric |
F1SigStrAtt | Fighter1 Significant Strike Attempt | 47 | FightMetric |
F1SigStrPercent | Fighter1 Significant Strike Percent | 0.48 | FightMetric |
F1SLpM | Fighter1 Strikes Landed per Minute | 3.69 | FightMetric |
F1Stance | Fighter1 Stance | Orthodox | FightMetric |
F1StrAcc | Fighter1 Strike Accuracy | 0.42 | FightMetric |
F1StrDef | Fighter1 Strike Defense | 0.61 | FightMetric |
F1SubAtt | Fighter1 Submission Attempt | 0 | FightMetric |
F1SubAttAvg | Fighter1 Submission Attempt Average | 0 | FightMetric |
F1TD | Fighter1 Take Down | 0 | FightMetric |
F1TDAcc | Fighter1 Take Down Accuracy | 0.34 | FightMetric |
F1TDAtt | Fighter1 Take Down Attempt | 0 | FightMetric |
F1TDAvg | Fighter1 Take Down Average | 3 | FightMetric |
F1TDDef | Fighter1 Take Down Defense | 0.91 | FightMetric |
F1TDPercent | Fighter1 Take Down Percent | 0 | FightMetric |
F1TotStr | Fighter1 Total Strike | 40 | FightMetric |
F1TotStrAtt | Fighter1 Total Strike Attempt | 64 | FightMetric |
F1UFCLosses | Fighter1 UFCLosses | 2 | FightMetric |
F1UFCNC | Fighter1 UFCNo Contest | 0 | FightMetric |
F1UFCWins | Fighter1 UFCWins | 5 | FightMetric |
F1Weight | Fighter1 Weight | 205 | FightMetric |
F1Wins | Fighter1 Wins | 1 | FightMetric |
F2Age | Fighter2 Age | 31 | FightMetric |
F2BodyStr | Fighter2 BodyStrike | 3 | FightMetric |
F2BodyStrAtt | Fighter2 BodyStrike Attempt | 5 | FightMetric |
F2CarFightTime | Fighter2 Career FightTime | 401 | Derived |
F2CarOppSigStrAtt | Fighter2 Career Opponent Significant Strike Attempt | 20 | Derived |
F2CarOppTDAtt | Fighter2 Career Opponent Take Down Attempt | 1 | Derived |
F2CarSigStr | Fighter2 Career Significant Strike | 34 | Derived |
F2CarSigStrAbs | Fighter2 Career Significant Strike Absorbed | 3 | Derived |
F2CarSigStrAtt | Fighter2 Career Significant Strike Attempt | 55 | Derived |
F2CarStrAbspM | Fighter2 Career Strike Absorbed per Minute | 0.45 | Derived |
F2CarStrAcc | Fighter2 Career Strike Accuracy | 0.62 | Derived |
F2CarStrDef | Fighter2 Career Strike Defense | 0.15 | Derived |
F2CarStrLpM | Fighter2 Career Strike Lper Minute | 5.09 | Derived |
F2CarTD | Fighter2 Career Take Down | 1 | Derived |
F2CarTDAbs | Fighter2 Career Take Down Absorbed | 0 | Derived |
F2CarTDAcc | Fighter2 Career Take Down Accuracy | 0.2 | Derived |
F2CarTDAtt | Fighter2 Career Take Down Attempt | 5 | Derived |
F2CarTDAvg | Fighter2 Career Take Down Average | 0.5 | Derived |
F2CarTDDef | Fighter2 Career Take Down Defense | 0 | Derived |
F2ClinchStr | Fighter2 ClinchStrike | 4 | FightMetric |
F2ClinchStrAtt | Fighter2 ClinchStrike Attempt | 6 | FightMetric |
F2DistStr | Fighter2 Distance Strike | 10 | FightMetric |
F2DistStrAtt | Fighter2 Distance Strike Attempt | 31 | FightMetric |
F2DOB | Fighter2 DOB | 1978-02-09 | FightMetric |
F2GroundStr | Fighter2 GroundStrike | 0 | FightMetric |
F2GroundStrAtt | Fighter2 GroundStrike Attempt | 0 | FightMetric |
F2HeadStr | Fighter2 HeadStrike | 11 | FightMetric |
F2HeadStrAtt | Fighter2 HeadStrike Attempt | 32 | FightMetric |
F2Height | Fighter2 Height | 72 | FightMetric |
F2KD | Fighter2 Knock Down | 0 | FightMetric |
F2LegStr | Fighter2 LegStrike | 0 | FightMetric |
F2LegStrAtt | Fighter2 LegStrike Attempt | 0 | FightMetric |
F2Pass | Fighter2 Guard Pass | 0 | FightMetric |
F2ProLosses | Fighter2 ProLosses | 6 | FightMetric |
F2ProNC | Fighter2 ProNo Contest | 0 | FightMetric |
F2ProWins | Fighter2 ProWins | 14 | FightMetric |
F2R1BodyStr | Fighter2 Round 1 BodyStrike | 3 | FightMetric |
F2R1BodyStrAtt | Fighter2 Round 1 BodyStrike Attempt | 5 | FightMetric |
F2R1ClinchStr | Fighter2 Round 1 ClinchStrike | 4 | FightMetric |
F2R1ClinchStrAtt | Fighter2 Round 1 ClinchStrike Attempt | 6 | FightMetric |
F2R1DistStr | Fighter2 Round 1 Distance Strike | 10 | FightMetric |
F2R1DistStrAtt | Fighter2 Round 1 Distance Strike Attempt | 31 | FightMetric |
F2R1GroundStr | Fighter2 Round 1 GroundStrike | 0 | FightMetric |
F2R1GroundStrAtt | Fighter2 Round 1 GroundStrike Attempt | 0 | FightMetric |
F2R1HeadStr | Fighter2 Round 1 HeadStrike | 11 | FightMetric |
F2R1HeadStrAtt | Fighter2 Round 1 HeadStrike Attempt | 32 | FightMetric |
F2R1KD | Fighter2 Round 1 Knock Down | 0 | FightMetric |
F2R1LegStr | Fighter2 Round 1 LegStrike | 0 | FightMetric |
F2R1LegStrAtt | Fighter2 Round 1 LegStrike Attempt | 0 | FightMetric |
F2R1Pass | Fighter2 Round 1 Guard Pass | 0 | FightMetric |
F2R1Rev | Fighter2 Round 1 Reversal | 0 | FightMetric |
F2R1SigStr | Fighter2 Round 1 Significant Strike | 14 | FightMetric |
F2R1SigStrAtt | Fighter2 Round 1 Significant Strike Attempt | 37 | FightMetric |
F2R1SigStrPercent | Fighter2 Round 1 Significant Strike Percent | 0.37 | FightMetric |
F2R1SubAtt | Fighter2 Round 1 Submission Attempt | 0 | FightMetric |
F2R1TD | Fighter2 Round 1 Take Down | 0 | FightMetric |
F2R1TDAtt | Fighter2 Round 1 Take Down Attempt | 4 | FightMetric |
F2R1TDPercent | Fighter2 Round 1 Take Down Percent | 0 | FightMetric |
F2R1TotStr | Fighter2 Round 1 Total Strike | 14 | FightMetric |
F2R1TotStrAtt | Fighter2 Round 1 Total Strike Attempt | 37 | FightMetric |
F2R2BodyStr | Fighter2 Round 2 BodyStrike | NA | FightMetric |
F2R2BodyStrAtt | Fighter2 Round 2 BodyStrike Attempt | NA | FightMetric |
F2R2ClinchStr | Fighter2 Round 2 ClinchStrike | NA | FightMetric |
F2R2ClinchStrAtt | Fighter2 Round 2 ClinchStrike Attempt | NA | FightMetric |
F2R2DistStr | Fighter2 Round 2 Distance Strike | NA | FightMetric |
F2R2DistStrAtt | Fighter2 Round 2 Distance Strike Attempt | NA | FightMetric |
F2R2GroundStr | Fighter2 Round 2 GroundStrike | NA | FightMetric |
F2R2GroundStrAtt | Fighter2 Round 2 GroundStrike Attempt | NA | FightMetric |
F2R2HeadStr | Fighter2 Round 2 HeadStrike | NA | FightMetric |
F2R2HeadStrAtt | Fighter2 Round 2 HeadStrike Attempt | NA | FightMetric |
F2R2KD | Fighter2 Round 2 Knock Down | NA | FightMetric |
F2R2LegStr | Fighter2 Round 2 LegStrike | NA | FightMetric |
F2R2LegStrAtt | Fighter2 Round 2 LegStrike Attempt | NA | FightMetric |
F2R2Pass | Fighter2 Round 2 Guard Pass | NA | FightMetric |
F2R2Rev | Fighter2 Round 2 Reversal | NA | FightMetric |
F2R2SigStr | Fighter2 Round 2 Significant Strike | NA | FightMetric |
F2R2SigStrAtt | Fighter2 Round 2 Significant Strike Attempt | NA | FightMetric |
F2R2SigStrPercent | Fighter2 Round 2 Significant Strike Percent | NA | FightMetric |
F2R2SubAtt | Fighter2 Round 2 Submission Attempt | NA | FightMetric |
F2R2TD | Fighter2 Round 2 Take Down | NA | FightMetric |
F2R2TDAtt | Fighter2 Round 2 Take Down Attempt | NA | FightMetric |
F2R2TDPercent | Fighter2 Round 2 Take Down Percent | NA | FightMetric |
F2R2TotStr | Fighter2 Round 2 Total Strike | NA | FightMetric |
F2R2TotStrAtt | Fighter2 Round 2 Total Strike Attempt | NA | FightMetric |
F2R3BodyStr | Fighter2 Round 3 BodyStrike | NA | FightMetric |
F2R3BodyStrAtt | Fighter2 Round 3 BodyStrike Attempt | NA | FightMetric |
F2R3ClinchStr | Fighter2 Round 3 ClinchStrike | NA | FightMetric |
F2R3ClinchStrAtt | Fighter2 Round 3 ClinchStrike Attempt | NA | FightMetric |
F2R3DistStr | Fighter2 Round 3 Distance Strike | NA | FightMetric |
F2R3DistStrAtt | Fighter2 Round 3 Distance Strike Attempt | NA | FightMetric |
F2R3GroundStr | Fighter2 Round 3 GroundStrike | NA | FightMetric |
F2R3GroundStrAtt | Fighter2 Round 3 GroundStrike Attempt | NA | FightMetric |
F2R3HeadStr | Fighter2 Round 3 HeadStrike | NA | FightMetric |
F2R3HeadStrAtt | Fighter2 Round 3 HeadStrike Attempt | NA | FightMetric |
F2R3KD | Fighter2 Round 3 Knock Down | NA | FightMetric |
F2R3LegStr | Fighter2 Round 3 LegStrike | NA | FightMetric |
F2R3LegStrAtt | Fighter2 Round 3 LegStrike Attempt | NA | FightMetric |
F2R3Pass | Fighter2 Round 3 Guard Pass | NA | FightMetric |
F2R3Rev | Fighter2 Round 3 Reversal | NA | FightMetric |
F2R3SigStr | Fighter2 Round 3 Significant Strike | NA | FightMetric |
F2R3SigStrAtt | Fighter2 Round 3 Significant Strike Attempt | NA | FightMetric |
F2R3SigStrPercent | Fighter2 Round 3 Significant Strike Percent | NA | FightMetric |
F2R3SubAtt | Fighter2 Round 3 Submission Attempt | NA | FightMetric |
F2R3TD | Fighter2 Round 3 Take Down | NA | FightMetric |
F2R3TDAtt | Fighter2 Round 3 Take Down Attempt | NA | FightMetric |
F2R3TDPercent | Fighter2 Round 3 Take Down Percent | NA | FightMetric |
F2R3TotStr | Fighter2 Round 3 Total Strike | NA | FightMetric |
F2R3TotStrAtt | Fighter2 Round 3 Total Strike Attempt | NA | FightMetric |
F2R4BodyStr | Fighter2 Round 4 BodyStrike | NA | FightMetric |
F2R4BodyStrAtt | Fighter2 Round 4 BodyStrike Attempt | NA | FightMetric |
F2R4ClinchStr | Fighter2 Round 4 ClinchStrike | NA | FightMetric |
F2R4ClinchStrAtt | Fighter2 Round 4 ClinchStrike Attempt | NA | FightMetric |
F2R4DistStr | Fighter2 Round 4 Distance Strike | NA | FightMetric |
F2R4DistStrAtt | Fighter2 Round 4 Distance Strike Attempt | NA | FightMetric |
F2R4GroundStr | Fighter2 Round 4 GroundStrike | NA | FightMetric |
F2R4GroundStrAtt | Fighter2 Round 4 GroundStrike Attempt | NA | FightMetric |
F2R4HeadStr | Fighter2 Round 4 HeadStrike | NA | FightMetric |
F2R4HeadStrAtt | Fighter2 Round 4 HeadStrike Attempt | NA | FightMetric |
F2R4KD | Fighter2 Round 4 Knock Down | NA | FightMetric |
F2R4LegStr | Fighter2 Round 4 LegStrike | NA | FightMetric |
F2R4LegStrAtt | Fighter2 Round 4 LegStrike Attempt | NA | FightMetric |
F2R4Pass | Fighter2 Round 4 Guard Pass | NA | FightMetric |
F2R4Rev | Fighter2 Round 4 Reversal | NA | FightMetric |
F2R4SigStr | Fighter2 Round 4 Significant Strike | NA | FightMetric |
F2R4SigStrAtt | Fighter2 Round 4 Significant Strike Attempt | NA | FightMetric |
F2R4SigStrPercent | Fighter2 Round 4 Significant Strike Percent | NA | FightMetric |
F2R4SubAtt | Fighter2 Round 4 Submission Attempt | NA | FightMetric |
F2R4TD | Fighter2 Round 4 Take Down | NA | FightMetric |
F2R4TDAtt | Fighter2 Round 4 Take Down Attempt | NA | FightMetric |
F2R4TDPercent | Fighter2 Round 4 Take Down Percent | NA | FightMetric |
F2R4TotStr | Fighter2 Round 4 Total Strike | NA | FightMetric |
F2R4TotStrAtt | Fighter2 Round 4 Total Strike Attempt | NA | FightMetric |
F2R5BodyStr | Fighter2 Round 5 BodyStrike | NA | FightMetric |
F2R5BodyStrAtt | Fighter2 Round 5 BodyStrike Attempt | NA | FightMetric |
F2R5ClinchStr | Fighter2 Round 5 ClinchStrike | NA | FightMetric |
F2R5ClinchStrAtt | Fighter2 Round 5 ClinchStrike Attempt | NA | FightMetric |
F2R5DistStr | Fighter2 Round 5 Distance Strike | NA | FightMetric |
F2R5DistStrAtt | Fighter2 Round 5 Distance Strike Attempt | NA | FightMetric |
F2R5GroundStr | Fighter2 Round 5 GroundStrike | NA | FightMetric |
F2R5GroundStrAtt | Fighter2 Round 5 GroundStrike Attempt | NA | FightMetric |
F2R5HeadStr | Fighter2 Round 5 HeadStrike | NA | FightMetric |
F2R5HeadStrAtt | Fighter2 Round 5 HeadStrike Attempt | NA | FightMetric |
F2R5KD | Fighter2 Round 5 Knock Down | NA | FightMetric |
F2R5LegStr | Fighter2 Round 5 LegStrike | NA | FightMetric |
F2R5LegStrAtt | Fighter2 Round 5 LegStrike Attempt | NA | FightMetric |
F2R5Pass | Fighter2 Round 5 Guard Pass | NA | FightMetric |
F2R5Rev | Fighter2 Round 5 Reversal | NA | FightMetric |
F2R5SigStr | Fighter2 Round 5 Significant Strike | NA | FightMetric |
F2R5SigStrAtt | Fighter2 Round 5 Significant Strike Attempt | NA | FightMetric |
F2R5SigStrPercent | Fighter2 Round 5 Significant Strike Percent | NA | FightMetric |
F2R5SubAtt | Fighter2 Round 5 Submission Attempt | NA | FightMetric |
F2R5TD | Fighter2 Round 5 Take Down | NA | FightMetric |
F2R5TDAtt | Fighter2 Round 5 Take Down Attempt | NA | FightMetric |
F2R5TDPercent | Fighter2 Round 5 Take Down Percent | NA | FightMetric |
F2R5TotStr | Fighter2 Round 5 Total Strike | NA | FightMetric |
F2R5TotStrAtt | Fighter2 Round 5 Total Strike Attempt | NA | FightMetric |
F2Rank | Fighter2 Rank | 51 | FightMatrix |
F2Reach | Fighter2 Reach | 72 | FightMetric |
F2Rev | Fighter2 Reversal | 0 | FightMetric |
F2SApM | Fighter2 Strikes Absorbed per Minute | 2.26 | FightMetric |
F2SigStr | Fighter2 Significant Strike | 14 | FightMetric |
F2SigStrAtt | Fighter2 Significant Strike Attempt | 37 | FightMetric |
F2SigStrPercent | Fighter2 Significant Strike Percent | 0.37 | FightMetric |
F2SLpM | Fighter2 Strikes Landed per Minute | 3.12 | FightMetric |
F2Stance | Fighter2 Stance | Orthodox | FightMetric |
F2StrAcc | Fighter2 Strike Accuracy | 0.51 | FightMetric |
F2StrDef | Fighter2 Strike Defense | 0.57 | FightMetric |
F2SubAtt | Fighter2 Submission Attempt | 0 | FightMetric |
F2SubAttAvg | Fighter2 Submission Attempt Average | 0.6 | FightMetric |
F2TD | Fighter2 Take Down | 0 | FightMetric |
F2TDAcc | Fighter2 Take Down Accuracy | 0.29 | FightMetric |
F2TDAtt | Fighter2 Take Down Attempt | 4 | FightMetric |
F2TDAvg | Fighter2 Take Down Average | 3.17 | FightMetric |
F2TDDef | Fighter2 Take Down Defense | 0.57 | FightMetric |
F2TDPercent | Fighter2 Take Down Percent | 0 | FightMetric |
F2TotStr | Fighter2 Total Strike | 14 | FightMetric |
F2TotStrAtt | Fighter2 Total Strike Attempt | 37 | FightMetric |
F2UFCLosses | Fighter2 UFCLosses | 0 | FightMetric |
F2UFCNC | Fighter2 UFCNo Contest | 0 | FightMetric |
F2UFCWins | Fighter2 UFCWins | 2 | FightMetric |
F2Weight | Fighter2 Weight | 185 | FightMetric |
Fighter1 | Fighter1 | Matt Hamill | FightMetric |
Fighter2 | Fighter2 | Mark Munoz | FightMetric |
FightNumber | FightNumber | 3449 | FightMetric |
Gender | Gender | Men | FightMetric |
Judge1 | Judge1 | NA | FightMetric |
Judge1Score | Judge1Score | NA | FightMetric |
Judge2 | Judge2 | NA | FightMetric |
Judge2Score | Judge2Score | NA | FightMetric |
Judge3 | Judge3 | NA | FightMetric |
Judge3Score | Judge3Score | NA | FightMetric |
Method | Method | KO/TKO | FightMetric |
Ref | Referee | Dan Miragliotta | FightMetric |
Result | Result | Matt Hamill | FightMetric |
Round | Round | 1 | FightMetric |
Time | Time | 3:53 | FightMetric |
TotFightTime | Total FightTime | 233 | FightMetric |
WeightClass | WeightClass | Light Heavyweight | FightMetric |
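
The example row above suggests how `TotFightTime` relates to `Round` and `Time`: a fight that ends at 3:53 of round 1 has a total fight time of 233 seconds. The sketch below reproduces that arithmetic. The helper `total_fight_seconds` and its `round_length` parameter are hypothetical names, not part of the original scripts, and it assumes every completed round lasted five minutes; the scripts that built the dataset may derive the value differently.

```r
# Hypothetical helper (not from the original scripts): convert the round in
# which a fight ended, plus the clock time within that round, to total seconds.
# Assumes every completed round lasted `round_length` seconds (5 minutes).
total_fight_seconds <- function(round, time, round_length = 300) {
  # Split "M:SS" into c(minutes, seconds).
  parts <- as.integer(strsplit(time, ":", fixed = TRUE)[[1]])
  (round - 1) * round_length + parts[1] * 60 + parts[2]
}

# Example row from the table: Round = 1, Time = "3:53".
total_fight_seconds(1, "3:53")  # 233
```

For the example fight this matches the `TotFightTime` value of 233 shown in the table.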
# Create a table that explains the fighter data: one row per column of
# fightMetric_fighters_df, with Gegard Mousasi's record as the example values.
data_table_fighters <- data.frame(colnames(fightMetric_fighters_df),
                                  rep(NA, ncol(fightMetric_fighters_df)),
                                  as.character(fightMetric_fighters_df[fightMetric_fighters_df$Name == "Gegard Mousasi", ]),
                                  rep("FightMetric", ncol(fightMetric_fighters_df)),
                                  stringsAsFactors = FALSE)
# Assign column names.
colnames(data_table_fighters) <- c("Variables", "Unabreviated", "Example", "Source")
# Generate unabbreviated variable names by expanding each abbreviation in turn.
data_table_fighters$Unabreviated <- data_table_fighters$Variables %>%
  gsub("Str", "Strike ", .) %>%
  gsub("SL", "Strikes Landed ", .) %>%
  gsub("pM", "per Minute ", .) %>%
  gsub("Acc", "Accuracy ", .) %>%
  gsub("Def", "Defense ", .) %>%
  gsub("Avg", "Average ", .) %>%
  # Trailing space so that "SApM" expands to "Strikes Absorbed per Minute".
  gsub("SA", "Strikes Absorbed ", .) %>%
  gsub("Sub", "Submission ", .) %>%
  gsub("Att", "Attempt ", .) %>%
  gsub("TD", "Take Down ", .) %>%
  # Drop the trailing space left by the last expansion.
  trimws()
kable(data_table_fighters, caption = "Composition of the fighter dataset, [fightMetric_fighters_df]", align = 'l')
Variables | Unabreviated | Example | Source |
---|---|---|---|
Name | Name | Gegard Mousasi | FightMetric |
ProWins | ProWins | 39 | FightMetric |
ProLosses | ProLosses | 6 | FightMetric |
ProNC | ProNC | 2 | FightMetric |
Height | Height | 74 | FightMetric |
Weight | Weight | 185 | FightMetric |
Reach | Reach | 76 | FightMetric |
Stance | Stance | Orthodox | FightMetric |
DOB | DOB | 1985-08-01 | FightMetric |
SLpM | Strikes Landed per Minute | 3.56 | FightMetric |
StrAcc | Strike Accuracy | 0.5 | FightMetric |
SApM | Strikes Absorbed per Minute | 1.18 | FightMetric |
StrDef | Strike Defense | 0.69 | FightMetric |
TDAvg | Take Down Average | 1.6 | FightMetric |
TDAcc | Take Down Accuracy | 0.65 | FightMetric |
TDDef | Take Down Defense | 0.61 | FightMetric |
SubAttAvg | Submission Attempt Average | 1.2 | FightMetric |
Age | Age | 31 | FightMetric |