Driveline Baseball

Data Engineering Intern - Driveline Baseball (Kent · Open to Remote)

Technical Services: IT Database Management/Services
Location: Kent, WA or Remote
Seasonal, $2500/mo.
No additional benefits.

(Start and end date are flexible.)

Candidates will be expected to work ~40 hours per week.

No financial relocation assistance is offered to candidates who decide to relocate to Kent, WA. However, once you accept an offer, we will put you in contact with a number of affordable short-term housing options. Fully remote (or hybrid remote) work is totally fine and accepted!

You will spend your time helping collect, organize, and integrate the various data sources under the hood at Driveline Baseball, working with our stakeholders (Baseball Operations Analysts, the Data Science Team, our Software Engineers, etc.) to provide optimal data solutions for a range of processes, models, and dashboards. These solutions will directly impact training floor operations and data collection systems, support special research projects that expand the sphere of baseball knowledge, and aid in consulting with professional players and organizations.

This position will plug straight into production-level projects and platforms (including our API suite of software tools and endpoints, EDGE, and our enterprise software platform, TRAQ). You will be expected not only to help shore up data pipelines and databases, but also to identify and pursue new data sources and automation opportunities as they arise. We're a growing start-up constantly looking to push the boundaries, so taking initiative and thinking critically about data problems is a must for this position.

- Maintain and improve existing data pipelines for various third-party API/data streams and internal databases
- Oversee current data collection, cleaning, and analysis for existing and new research projects, ranging from logistical methods planning for peer-reviewed studies to internal database cleaning and restructuring of ETL-loading scripts
- Design, build and maintain *new* paired baseball technology data warehouse solutions
- Improve the process and system for tracking, analyzing, and visualizing our important business information
- Identify and propose solutions for useful data streams not currently being tracked consistently
- Collaborate with software, data, and baseball operations teams in code reviews, as well as planning and design specs
- Debug, troubleshoot, and document relevant data processes as needed

This isn't a "hidden" front-office position where you crank out data and no one is counting on you. You will be interfacing daily with professional athletes and coaches.

- Quantitative skill and experience in major programming languages, like R and/or Python (specific experience with data analysis, data visualization, and ETL packages)
- Experience with building, maintaining, and exploring databases (MySQL and/or NoSQL experience preferred), as well as optimizing database queries and database design
- Experience with constructing and deploying machine learning models in production-level code
- Experience interfacing with, extracting, and manipulating various data structures, including JSON and XML
- Experience with software patches, source control, branch versioning, etc. on platforms like GitHub, Jira, or similar
- Willingness and ability to learn quickly in a technical environment
- Open source projects with thorough documentation on GitHub
- Excellent verbal communication skills
- Excellent "feel": that mix of empathy, common sense, and likability that makes people trust you
- A mindset for growth and learning, with an ability to stay organized and manage multiple projects at the same time
- Proactive, creative solutions to wide-ranging problems
- Familiarity with Driveline's mission, previous research, and blog posts

- Not required: any form of college education. We care more about the skills you have now than about where you learned them.

Full-time roles are not guaranteed at the end of the internship. Candidates will be evaluated on an individual basis for full-time roles or internship extensions.

Driveline is proud to be an equal opportunity employer. All qualified applicants will receive consideration without regard to race, creed, gender, marital status, sexual orientation, citizenship status, color, religion, national origin, age, disability, veteran status, or any other status protected under local, state, or federal laws. For employees and applicants for employment who have disabilities, Driveline provides reasonable accommodation.

Job Questions:

  1. Please rate your familiarity with each of the following programming languages from 1 to 5, where 1 is vaguely familiar and 5 is most comfortable: R, Python, SQL, NoSQL, JavaScript, PHP.

  2. Why do you want to work at Driveline Baseball?

  3. What area of baseball operations are you most familiar with and/or have the most experience in? (e.g. scouting, sabermetrics, player development, etc.)

  4. You mess around with a new database and the existing pipelines seem slow. Subjectively, what are the first three possible areas of improvement you would investigate from a database design POV? (This question is meant to be open-ended and to present limited initial information. Bullet points and brevity are encouraged.)

  5. Name some independent projects you've worked on and explain why you are proud of them. (Links to GitHub, Bitbucket, blogs, tweets, media, etc. are welcome and encouraged.)

  6. What would you like to investigate as a first project if you were given access to Driveline's resources, which include, but are not limited to, pitch tracking data, bat sensor data, and biomechanical data?