Archive for the ‘sql’ tag
With all of the discussions (hype?) regarding applied statistics, machine learning, and data science, I have been looking for a go-to source of work-unrelated data. I loved baseball as a kid. I love baseball now. I love baseball stats. Why not do a grown-up version of what I used to do when I spent hours staring at and memorizing baseball stats on the back of a few pieces of cardboard on which I spent my allowance?
To get started, I purchased a copy of Baseball Hacks. The author suggests the usage of MySQL, so I will oblige. First, I downloaded some baseball data in MySQL format on my web server (Ubuntu 10.04) and decompressed it; when I downloaded the data, it was timestamped as 28 March 2011, so double-check if there is an updated version.
mkdir baseball cd baseball wget http://www.baseball-databank.org/files/BDB-sql-2011-03-28.sql.zip unzip BDB-sql-2011-03-28.sql.zip
Next, in MySQL I created a user named “baseball”, a database entitled “bbdatabank” and granted all privileges on this database to the user “baseball.” To do this, first open MySQL as root (mysql -u root -p)
CREATE USER 'baseball'@'localhost' IDENTIFIED BY 'YourPassword'; CREATE databas bbdatabank; GRANT ALL PRIVILEGES ON `bbdatabank`.* TO 'baseball'@'localhost'; FLUSH PRIVILEGES; quit
Note the tick marks (`) around bbdatabank when privileges are granted. Also, notice the deliberate misspelling when I constructed the db. WordPress freaks out on me because mod_security steps in and says, “Umm, no.” For more info about this, go here and here (see the comments as well).
Finally, we read the data into the database we just created by:
mysql -u baseball -p -s bbdatabank < BDB-sql-2011-03-28.sql
That’s it! Most of this code has been adapted from the Baseball Hacks book, although I’ve tweaked a couple of things. As I progress through the book, I will continue to add interesting finds and code as posts. Eventually, I will move away from the book’s code as it focuses too much on the “Intro to Data Exploration” reader with constant mentions of MS Access/Excel. The author means well though as he urges the reader to use *nix/Mac OS X.