Friday, October 5, 2018

GOAT Talk - The Joys of Scraping

Well, after finally deciding on the subject of my Capstone, I had to start diving into data collection.



Yes, my feelings fully encapsulated the feelings of Kip, Jonah, and the Channel 4 News Team.  No judgment on this page.

First, I had to pick the players who I thought best represented the Greatest of All Time as far as the NBA goes.  I didn't pick any international players simply because this is America!!!!!


I also wanted to include only players from the 1970s on, simply because most defensive stats like blocks and steals were not recorded before then (the NBA started tracking them in the 1973-74 season).  Also, the 3-point shot was not implemented until the 1979-80 season.  That eliminated such greats as Bill Russell (11-time NBA Champion) and Wilt "The Stilt" Chamberlain (2-time NBA Champion, but better known for the only 100-point game in NBA history as well as the 2nd greatest movie appearance of all time....Kareem Abdul-Jabbar having the #1 of course....and #3).



From that point, I had it narrowed down to 11.  Hard to pick, honestly.

  • Kareem Abdul-Jabbar
  • Magic Johnson
  • Larry Bird
  • Isiah Thomas (The Detroit Pistons Point GOD!!!!!!)
  • Michael Jordan
  • Hakeem Olajuwon
  • Shaquille O'Neal
  • Tim Duncan
  • Kobe Bryant (The Black Mamba.....my personal GOAT)
  • LeBron James
  • Kevin Durant
LeBron and KD were difficult choices only because they are still playing in the league.  As accomplished as they already are, it leads me to believe that if this project were done 4-5 years from now, it would be much more robust.

Next, I had to decide what data I wanted exactly.  Basketball Reference (https://www.basketball-reference.com/) is pretty much a cornucopia of knowledge, but how much did I want to take?
I figured the Per Game data was where I wanted to start:  Age, Games Played, Minutes Per G, Field Goals Per G, FG Attempts Per G, FG %, 3PT Per G, 3PT Attempts Per G, 3PT %, 2PT Per G, 2PT Attempts Per G, 2PT %, Effective FG% (like FG%, but adjusted for the fact that a 3PT is worth more, so those shots are weighted more heavily), Free Throws Per G, FT %, Offensive Rebounds Per G, Defensive Rebounds Per G, Total Rebounds Per G, Assists Per G, Steals Per G, Blocks Per G, Turnovers Per G, Fouls Per G, and Points Per G.
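To make the Effective FG% idea concrete, here is a minimal sketch of the standard formula, eFG% = (FG + 0.5 * 3P) / FGA. The function name and the sample numbers are made up for illustration; they are not any real player's stat line.

```python
def effective_fg_pct(fg: float, fg3: float, fga: float) -> float:
    """eFG% = (FG + 0.5 * 3P) / FGA, so a made three counts as 1.5 made shots."""
    if fga == 0:
        return 0.0  # no attempts, nothing to measure
    return (fg + 0.5 * fg3) / fga


# Made-up line: 10 makes on 20 attempts, 2 of those makes being threes.
plain_fg_pct = 10 / 20                # regular FG% ignores shot value
efg_pct = effective_fg_pct(10, 2, 20)  # (10 + 0.5 * 2) / 20
print(plain_fg_pct, efg_pct)
```

The gap between the two numbers is the extra credit the player gets for the threes.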

With the basic stats, I wanted something that would be more in depth as far as their skills.  Perhaps, something more.........Advanced????

So, under the Advanced tab, I wanted the following stats:  Age (to match up on the other table), PER (Player Efficiency Rating - measure of per-minute production standardized such that the league average is 15), True Shooting % (a measure of shooting efficiency that includes 2-pointers, 3-pointers and Free Throws), Total Rebound % (percentage of available rebounds a player grabbed while he was on the floor), Assist %, Steal %, Block %, Turnover %, Usage %, Win Shares (estimate of the number of wins contributed by a player), Offensive and Defensive Box +/- (box score estimate of the offensive points per 100 possessions and defensive points per 100 possessions a player contributed above a league-average player, translated to an average team), Box +/-, and VORP (Value Over Replacement Player - a box score estimate of the points per 100 TEAM possessions that a player contributed above a replacement-level (-2.0) player, translated to an average team and prorated to an 82-game season; can be multiplied by 2.70 to convert to Wins Over Replacement Player).
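Two of these are easy to show with a little arithmetic. Below is a rough sketch of True Shooting % using the commonly published formula, TS% = PTS / (2 * (FGA + 0.44 * FTA)), plus the VORP to Wins Over Replacement conversion mentioned above. The function names and the sample inputs are my own placeholders, not real player numbers.

```python
def true_shooting_pct(pts: float, fga: float, fta: float) -> float:
    """TS% = PTS / (2 * (FGA + 0.44 * FTA)); folds free throws into shooting efficiency."""
    attempts = fga + 0.44 * fta  # 0.44 approximates how many FTA use up a possession
    if attempts == 0:
        return 0.0
    return pts / (2 * attempts)


def vorp_to_worp(vorp: float) -> float:
    """Convert VORP to Wins Over Replacement Player with the 2.70 multiplier."""
    return vorp * 2.70


# Made-up inputs purely for illustration.
print(round(true_shooting_pct(pts=30, fga=20, fta=10), 3))
print(round(vorp_to_worp(5.0), 2))
```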

I also wanted to find and notate their accomplishments.  So time to find all awards given....



All-Star Games.  MVPs.  MVP Voting Shares (fun fact - the league MVP was voted on by the players through the 1979-80 season.  Since the 1980-81 season, the award has been decided by a panel of sportswriters and broadcasters throughout the United States and Canada).  All-League Teams (All-Rookie, All-NBA, All-Defensive).



Now that I have the list of players, and the data I think would be important, it's time to actually start the data scrape.  Here is where my headache began.
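For a feel of what the scrape boils down to, here is a minimal sketch that pulls the rows out of one HTML table using only Python's standard library. It is not necessarily how I ended up doing it: the table id "per_game", the `TableRows` class, and the sample HTML with its placeholder values are all assumptions for illustration, and a real page would have to be downloaded first.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the cell text of every <tr> inside the table with a given id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.in_table = False
        self.in_cell = False
        self.rows = []  # one list of cell strings per <tr>

    def handle_starttag(self, tag, attrs):
        if tag == "table" and dict(attrs).get("id") == self.table_id:
            self.in_table = True
        elif self.in_table and tag == "tr":
            self.rows.append([])
        elif self.in_table and tag in ("th", "td") and self.rows:
            self.in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "table":
            self.in_table = False
        elif tag in ("th", "td"):
            self.in_cell = False

    def handle_data(self, data):
        if self.in_table and self.in_cell:
            self.rows[-1][-1] += data.strip()


# Placeholder HTML standing in for a downloaded player page (values are made up).
SAMPLE = """
<table id="per_game">
  <tr><th>Season</th><th>Age</th><th>PTS</th></tr>
  <tr><th>Year 1</th><td>21</td><td>20.0</td></tr>
  <tr><th>Year 2</th><td>22</td><td>25.0</td></tr>
</table>
<table id="other"><tr><td>ignored</td></tr></table>
"""

parser = TableRows("per_game")
parser.feed(SAMPLE)
header, *body = parser.rows
records = [dict(zip(header, row)) for row in body]  # one dict per season
print(records)
```

Every value comes back as a string, so numeric columns like Age and PTS would still need converting before any analysis.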

