New statistics course takes a swing at baseball analytics

Stat 430: Baseball Analytics is a new course that familiarizes students with the practice of breaking down complex data sets into more digestible information.
Date
12/19/22
Photo by Mike Bowman on Unsplash

There is a fascinating, intricate science behind America’s favorite pastime. Statistics and data science have quickly become vital assets to MLB organizations. If this interests you, you may enjoy the new Stat 430: Baseball Analytics with Professor Daniel Eck.

Even if you aren’t particularly a sports fan, this course offers a multitude of skills that can be applied to other careers in data analytics. You will be taught to scrape, merge, and manipulate data sets efficiently. Eck shared that the primary goal of the course is to familiarize students with the practice of breaking down complex data sets into more digestible information. Another large part of the course is working with ongoing research, which is bound to evolve and change. Professor Eck asserts that all of these skills can be applied to industries other than sports analytics.

However, if you are picturing yourself in the field of sports analytics, this course will give you a strong foundation as you begin your career. The course features multiple guest speakers throughout the semester. Recent speakers include Cubs assistant general manager Ehsan Bokhari, course textbook author Jim Albert, and physics professor Alan Nathan. Guest speakers represent all facets of the industry, from analysts who work with specific teams to researchers who study baseball as a whole. This emphasis on guest speakers gives students the unique opportunity to see the industry from multiple angles. Students are encouraged to talk with these speakers and ask questions.

Prof. Daniel Eck - Instructor of Stat 430: Baseball Analytics

MLB’s use of analytics has grown rapidly in recent years. The use of statistics in professional sports has become so important that it is now non-negotiable. In Professor Eck’s words, “I can’t describe how important it is. If you were to not do it, you’d be lost. All the way from player evaluation to player strategy -- you’d be at a massive disadvantage. [...] Statistics is everything in baseball.”

Professor Eck hopes that as time goes on, his course will evolve to include more complex statistical models. One such model the department has been refining started as a research collaboration between Professor Eck, Professor David Dalpiaz, Julia Wapner, and Charlie Young (the latter two have since graduated and now work for MLB teams). This model, SEAM (Synthetic Estimated Average Matchups), illustrates where a ball might go based on a given batter-pitcher matchup. This information is valuable because an individual batter doesn’t face a particular pitcher very many times during a season, so matchup-specific data is sparse. It is estimated that, over a regular season, this tool can be used to gain an additional 40 outs compared with conventional spray charts, which is quite significant. Professor Eck says, “Over the course of a season, you could win one more baseball game just by doing this. [...] An additional win is worth about 2 million dollars on the open market.”

As the Department of Statistics here at Illinois works to meet the growing need for baseball analysts, Professor Eck highlights the parallels between this course and his own work. He explains that parts of the course are based on his current research, so enrolled students get a closer look at how the research process develops. I spoke with the two aforementioned Illinois alumni, Charlie Young and Julia Wapner, to get a better idea of how quickly this industry is expanding.

Charlie Young

I first spoke with Charlie Young, a software developer with the Houston Astros. During his time at Illinois, Charlie majored in computer science and astronomy with a minor in statistics. He explained that a large part of his work with the Astros involves data cleaning and ingestion, similar to what one would learn in Professor Eck’s new course. Thanks to the centralized data that Charlie and other developers put together, the entire organization surrounding the Astros can run more smoothly. This information helps coaches, trainers, and scouts understand the science behind their team’s roster.

When asked about the importance of baseball analytics, he said: “There are only so many players that get to go out on the field for Major Leagues, so it’s important that we use as much data as we can to identify the players we want on the field. Every [MLB organization] at this point uses analytics in pretty much every facet of the game.” Charlie explained that before analytics became a part of MLB, scouting was a very different process from what it is today. Using video and data, analysts now work alongside scouts to find players more efficiently. Analysts have become integral to the scouting process, especially for players in the Minor Leagues. In Charlie’s words, “We know how fast they’re throwing, how much they’re lifting, how fast they’re running. We have that data coming in every day for all of our players, so it helps a lot with player development and eventually reaching the Major League.” I quickly understood that data analytics is now the preferred method for player acquisition within MLB. Charlie echoed Professor Eck’s earlier statement: “I think if a team isn’t using analytics to its full capacity they’re going to start falling behind. It’s so important, so valuable for all these reasons.”

Next, I spoke with Julia Wapner, an analyst for the Baltimore Orioles. She was excited to hear about the increasing attention Illinois is paying to the field of analytics. I asked her to share her experiences working alongside Professors Eck and Dalpiaz to develop the SEAM model. Wapner said, “It showed me some of the applications of analytics and statistics while I was still an undergrad, which is something that we didn’t get to see as much in our classes. It’s cool that they’re introducing new classes that are showing [those applications].” She shared that this research endeavor was one of the things that best prepared her for her career.

Julia went on to explain that baseball is currently the sport that depends most on analytics. A team of analysts backs every MLB organization. Julia describes her job as giving meaning to the data collected by other members of her team. Thanks to this synthesis, the team is able to decide which players to assign to which roles. Julia said, “I don’t think there’s an aspect of baseball that isn’t getting touched by analytics in some way.” I was curious whether there was a noticeable correlation between the size of a team’s analytics department and its overall success. When I posed this question to Julia, she said she had been wondering the same thing lately. Based on her experiences, she said, “I think something that’s really clear now is that analytics is translating into on-field success. The Astros just won the World Series and they have one of the most well-known analytics departments. And so it’s definitely pretty clear that it’s making a difference.”

After understanding the sheer impact analytics has on modern-day baseball, I sought out one of Professor Eck’s current students to get a student’s perspective on this industry. I spoke with Jack Banks, a senior who is currently enrolled in Eck’s Stat 430 and pursuing a career in baseball analytics. Going into this course, Jack expected it to be more project-based than other statistics courses. “I’ve learned so much about baseball data, how to find it, and how to use it.” Jack calls this hands-on approach his favorite part of the class. He shared that this class, more than others, focuses on group work, which encourages collaboration between students. I was reminded of the same sort of collaboration that led to both Charlie Young and Julia Wapner gaining a foothold in MLB. Jack Banks seemingly read my mind. He said, “It’s helped me to not only be more familiar with the industry, but it’s also helped me to build up my portfolio. The projects in this class are stuff I could show to these professional teams and say, ‘I’ve done this in a class before, now I can do this for you.’”

As analytics continues to grow in importance across MLB, students and faculty at Illinois are well prepared. Enrolling in Professor Eck’s Baseball Analytics course will give students robust insight into this industry. Having learned from current research, renowned guest speakers, and one another, students who complete this course are likely to be attractive to sports analytics departments that are hiring. Even if you aren’t planning to work in sports, the class is structured so that you are bound to leave with valuable skills. You will learn to manipulate data to make it more accessible, collaborate with peers on research, and make connections with prominent analysts and professors. If any of these skills sound appealing, it could be worth your time to register for Stat 430 with Professor Eck. If you’d like additional information, check out Professor Eck’s baseball research website: https://ecklab.github.io/

 

Elizabeth McNutt
2022-12-19

Elizabeth McNutt is the staff writer for the Department of Statistics. If you have news to share, please contact the Statistics news group at stat-office@illinois.edu.

 

 

 
