First things first: Kareem's data. The good thing was that there was plenty of it. He played for 20 years, with solid contributions in most of them. For data science purposes, though, the NBA statisticians did him a disservice very early on.
Now, that leaves me with 10 players, but that doesn't satisfy the scientist in me. So, we need more data.
I decided to pick 4 new players to at least add some variety. All of these players have won at least one MVP, have been to numerous playoffs, and have won an NBA Championship (except for Iverson, with the awesome MVP year he had in 2000-01, but I digress).
I did the same as before by scraping their Per Game statistics,
and then some of the Advanced Statistics,
then adding Seasons, All-Star appearances, shares of the MVP voting, MVP Placing, and MVP trophies won.
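The combining step can be sketched roughly like this in pandas. This is only a sketch under assumptions: the player names, column names (`PTS`, `PER`, `All_Star`, `MVP_Trophies`) and all numbers are made-up placeholders, not the actual scraped values, and the real notebook may have assembled the table differently.

```python
import pandas as pd

# Placeholder per-game rows; names and values are invented for illustration.
per_game = pd.DataFrame({
    "Player": ["Player A", "Player A", "Player B"],
    "Season": ["2000-01", "2001-02", "2000-01"],
    "PTS": [20.0, 22.5, 18.3],
})

# Placeholder advanced-stat rows for the same player-seasons.
advanced = pd.DataFrame({
    "Player": ["Player A", "Player A", "Player B"],
    "Season": ["2000-01", "2001-02", "2000-01"],
    "PER": [21.0, 23.1, 19.4],
})

# Join the two stat tables on player and season; validate guards against
# accidentally duplicated rows on either side.
stats = per_game.merge(
    advanced, on=["Player", "Season"], how="left", validate="one_to_one"
)

# Career-level columns (hypothetical names and values), one row per player.
career = pd.DataFrame({
    "Player": ["Player A", "Player B"],
    "All_Star": [5, 2],
    "MVP_Trophies": [1, 0],
})
stats = stats.merge(career, on="Player", how="left", validate="many_to_one")

# Number of seasons on record for each player, repeated on every row.
stats["Seasons"] = stats.groupby("Player")["Season"].transform("nunique")

print(stats)
```

Keeping the career totals in their own small table and merging them in means each award count is typed once per player rather than once per season.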
After getting the new players added, it was time to check for any null values. Note the disclaimer: the work had already been performed, but I saved the file to .csv, and once it was loaded back into Jupyter Notebook, I would get errors when running the command.
Once I saw that there were no more null values for any of my players, I felt comfortable enough to call my data cleaned and ready for some EDA.
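For anyone following along, a null check that is safe to re-run after a save and reload might look something like the sketch below. The helper `null_report`, the column names, and the values are hypothetical, and filling gaps with 0 is just one possible choice, not necessarily what was done here.

```python
import io

import pandas as pd


def null_report(df):
    """Return the missing-value count for each column that has any."""
    counts = df.isnull().sum()
    return counts[counts > 0]


# Placeholder data with two gaps; values are invented for illustration.
raw = pd.DataFrame({
    "Player": ["Player A", "Player B"],
    "3P%": [0.35, None],
    "GS": [None, 70.0],
})
print(null_report(raw))

# Assumption: treat a missing stat as 0. Other strategies may suit better.
cleaned = raw.fillna(0)

# Round-trip through CSV (an in-memory buffer stands in for a file on disk)
# and confirm that the reloaded table still has no nulls.
buffer = io.StringIO()
cleaned.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

print(null_report(reloaded).empty)
```

Because `null_report` only reads the DataFrame, it can be run again on a reloaded .csv without depending on any earlier cleaning cell having been executed.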








