
A Quick "Who Am I?"

I’m an experienced data scientist with almost a decade of employment successfully applying scientific methods to different disciplines in both industry and academia. I hold three advanced degrees, two in Geophysics (Ph.D. and BSc.) and one in Risk and Uncertainty Management (MRes.), awarded by the University of Liverpool (Liverpool, UK). Python is my language of choice, given its wide support and integration within the machine learning community, but I also frequently use shell (Bash/ZSH/PowerShell), SQL, HTML/CSS, and Rust, and I have some experience with JavaScript, Ruby, C++, C#, Perl, Fortran, and MATLAB.

More About Me

My passion for data science stems from my curiosity about the natural phenomena that shape our planet and the data that can help us comprehend them better. That’s why I obtained a Ph.D. in geophysics, where I investigated seismic waves and their interactions with the Earth’s structure. I utilized Python and MATLAB to process and examine large datasets of seismic recordings (digital time series) from around the world. As a researcher, I authored three peer-reviewed publications in reputable journals.

After completing my degree, I joined the University of Utah Seismograph Stations, a leading seismological observatory, as a researcher, where I continued to work on seismic data analysis and earthquake magnitude modeling. I collaborated with scientists from the USGS, AFRL, LLNL, and other institutions to enhance our understanding of earthquake hazards and risks.

I then decided to switch gears and explore other domains where data science can make an impact. I joined Bayer Crop Science as a data scientist, where I worked on models that forecast corn seed yield given historic growth and other agronomic factors. I used Python, SQL, and Domino to manipulate and explore large datasets of crop measurements, weather data, soil data, and satellite imagery. I constructed the predictive models using scikit-learn.

My most recent position was as a Data Scientist at Coyote Logistics, a 3PL (third-party logistics) service provider, where I worked on sophisticated predictive models that estimate potential distributions of market cost for a given load of freight. I used Python, SQL (and NoSQL), and Databricks to ingest and transform data from various sources such as carriers, shippers, brokers, and third-party APIs. I also used LightGBM, scikit-learn, and PyTorch to build and fit models that capture the uncertainty and variability of the freight market.

Data Science Projects

Here are some of the projects I've worked on or I'm currently working on:

  • Freight Market Cost Distribution: I was the team lead of the spot pricing team, which developed a machine learning system that combined a custom Boosted Gradient Random Forest algorithm with LightGBM / H2O.ai Gradient Boosting models, integrated through a live API model service. The model API produced a predicted distribution showing the most probable range of costs to transport different types of freight equipment and cargo to and from distribution centers across the USA, Canada, and Mexico. This model helped us generate millions of dollars of extra revenue, as it enabled faster and more intelligent buying and brokering negotiations at scale.

  • Corn Yield Prediction: Created new random forest models and improved existing ones to predict corn seed yield across the world, leveraging multi-year historic yields and agronomic features (e.g., weather conditions, soil conditions, and field clusters defined using remote sensing data).

  • Earthquake Magnitude Estimation: Created physics-informed empirical models of amplitude decay over distance via non-parametric inversions, pre-conditioned using events with known earthquake magnitudes. This method was applied to both Local (Richter) and Moment Magnitude estimation, with the latter focusing on small-magnitude estimation, which is difficult to obtain via conventional physical modeling. This work resulted in two peer-reviewed publications, in the Bulletin of the Seismological Society of America and Seismological Research Letters.

Personal Projects

Skills

Here are some of the skills I've learned or improved during my data science journey:

  • Programming Languages: Python (advanced), SQL (intermediate), MATLAB (intermediate)
  • Data Analysis Tools: Pandas (advanced), NumPy (advanced), SciPy (advanced), PySpark (intermediate), Polars (intermediate)
  • Data Visualization Tools: Matplotlib (advanced), Seaborn (advanced), Plotly (advanced), Streamlit (intermediate)
  • Machine Learning Tools: scikit-learn (advanced), statsmodels (intermediate), PyTorch (basic)
  • Cloud Computing Platforms: Azure (advanced), AWS (beginner)
  • Data Engineering Tools: Databricks, Airflow (beginner), Docker (beginner), Kubernetes (beginner)
  • Version Control Systems: Git, GitHub, Azure DevOps

Contact

LinkedIn

James Holt's Projects

data-science-interview-resources

A repository listing out the potential sources which will help you in preparing for a Data Science/Machine Learning interview. New resources added frequently.

eqstochsim

A library to explore the stochastic method following Boore (2003).

mlmags

A small data science project that explores the use of some basic machine learning tools to predict moment magnitude for small- to moderate-sized earthquakes in Utah, USA.

mopy

A Python package to calculate seismic source parameters.

pygmt

A Python interface for the Generic Mapping Tools

pyqtgraph

Fast data visualization and GUI tools for scientific / engineering applications

rustlings

:crab: Small exercises to get you used to reading and writing Rust code!

sgjholt.github.io

A personal 'portfolio' website created using the Jekyll framework, hosted as a GitHub Page.

site_analysis

Site analysis project. Contains classes and functions for 1D site analysis.

siteresponsetool

The Site Characterisation and Seismic Response Analysis Toolkit for Python

specmod

SpecMod - A Toolbox for Processing and Modeling Seismic Spectra

ynp_local_magnitude_recalibration

A companion repository of the code used to recalibrate Local Magnitude (ML), then compare and contrast past and present ML assignments for earthquakes recorded in Yellowstone National Park, USA.
