Applied Data Science · Syracuse University

Hi, I'm Zach Brand

Data Science student at Syracuse University, with a background in defense finance and a focus on machine learning, deep learning, and NLP. I bring the full data science lifecycle to every problem.

About me

"The ultimate purpose of analytics is to communicate findings to the concerned who might use these insights to formulate policy or strategy."

-- Murtaza Haider, Getting Started with Data Science

I'm a Program Cost Analyst at HII (Huntington Ingalls Industries) in Syracuse, NY, managing financial performance across a ~$90M defense program portfolio. I'm completing my Master of Science in Applied Data Science at Syracuse University (expected May 2026), where my projects have spanned RF signal intelligence, financial deep learning, and natural language processing.

My background is in corporate finance -- pricing analysis at Carrier and Leidos -- and my goal is to combine that expertise with data science to support better-informed decisions in the defense industry. My coursework projects deliberately ranged across fields: I explored several directions during the program before deciding to stay in defense and apply data science methods in my current role.

I approach every project with transparency and ethical rigor: documenting limitations, acknowledging AI assistance in code reviews, and understanding the real-world stakes of data misuse. Data science is a practice of iteration, self-critique, and curiosity -- and I plan to keep growing with it.

Skills & Tools

Programming & ML

Python · R · SQL · Machine Learning · Deep Learning · NLP · Regression Modeling

Libraries & Frameworks

TensorFlow / Keras · Scikit-learn · Pandas / NumPy · NLTK / spaCy · CatBoost · SciPy · Matplotlib · SMOTE

Data & Analytics Tools

Excel (Advanced) · Power BI · Tableau · Google Analytics · OneStream · SAP · Salesforce · Costpoint

Projects

RF Signal Classification

Machine Learning · Inspired by work in defense / RF hardware

Investigated the challenges of classifying radio frequency (RF) signals from a complex, real-world dataset containing I/Q (in-phase and quadrature) signal data. The dataset required extensive preprocessing -- parsing multi-valued string cells into usable complex arrays using a custom parse_iq_cell() function with ast.literal_eval(), normalizing signals to unit power, and crafting features from both the time domain and frequency domain using Welch Power Spectral Density estimation. Signal classes were organized by ITU frequency band designations and refined into VHF sub-bands (FM Broadcast, Marine VHF, Airband Communications).
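The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the cell format (a string holding a list of (I, Q) pairs), the body of parse_iq_cell(), and the helper names normalize_unit_power() and welch_features() are all assumptions made for the example, as is the choice of summary features.

```python
import ast

import numpy as np
from scipy.signal import welch


def parse_iq_cell(cell):
    """Parse a string cell into a 1-D complex array.

    Assumed (hypothetical) cell format: a list of (I, Q) pairs,
    e.g. "[(0.1, 0.2), (0.3, -0.4)]".
    """
    pairs = np.asarray(ast.literal_eval(cell), dtype=float)
    return pairs[:, 0] + 1j * pairs[:, 1]


def normalize_unit_power(iq):
    """Scale the signal so its mean power, mean(|x|^2), equals 1."""
    power = np.mean(np.abs(iq) ** 2)
    return iq / np.sqrt(power) if power > 0 else iq


def welch_features(iq, fs=1.0, nperseg=64):
    """Summarize the Welch PSD with a few illustrative features."""
    # Complex I/Q data has a two-sided spectrum, so request it explicitly.
    freqs, psd = welch(iq, fs=fs, nperseg=nperseg, return_onesided=False)
    return {
        "peak_freq": freqs[np.argmax(psd)],
        "spectral_centroid": np.sum(freqs * psd) / np.sum(psd),
        "total_power": np.sum(psd),
    }


# Synthetic complex tone at 0.25 cycles/sample (not real RF data).
tone = np.exp(2j * np.pi * 0.25 * np.arange(256))
features = welch_features(normalize_unit_power(tone))
print(features["peak_freq"])
```

Using ast.literal_eval() rather than eval() keeps parsing restricted to Python literals, which matters when cell contents come from an external spreadsheet.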

The model used a two-stage Random Forest pipeline: a binary "gate" classifier to first separate FM Broadcast signals from all others, followed by a multi-class classifier for finer sub-band classification. Stratified K-Fold cross-validation (5 folds), calibrated probability outputs, and class-weighted training addressed dataset imbalance. Feature importance scores provided interpretability into which signal attributes drove classification decisions.
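One way such a two-stage pipeline could be wired together in scikit-learn is sketched below. It runs on synthetic data from make_classification, not the RF dataset. The label coding (0 = FM Broadcast), the 0.5 gate threshold, the choice to calibrate only the gate, and the function names are assumptions for illustration.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold

FM = 0  # hypothetical coding: 0 = FM Broadcast, 1 and 2 = other sub-bands


def fit_two_stage(X, y, seed=0):
    """Fit a calibrated binary gate (FM vs. rest) and a sub-band classifier."""
    gate_rf = RandomForestClassifier(
        n_estimators=50, class_weight="balanced", random_state=seed)
    gate = CalibratedClassifierCV(gate_rf, method="sigmoid", cv=3)
    gate.fit(X, (y == FM).astype(int))

    # The second stage only ever sees the non-FM training samples.
    rest = y != FM
    stage2 = RandomForestClassifier(
        n_estimators=50, class_weight="balanced", random_state=seed)
    stage2.fit(X[rest], y[rest])
    return gate, stage2


def predict_two_stage(gate, stage2, X, threshold=0.5):
    """Label a sample FM when the gate probability clears the threshold;
    otherwise keep the second-stage prediction."""
    p_fm = gate.predict_proba(X)[:, 1]
    pred = stage2.predict(X)
    pred[p_fm >= threshold] = FM
    return pred


def cross_validate_two_stage(X, y, n_splits=5, seed=0):
    """Return one accuracy score per stratified fold."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in skf.split(X, y):
        gate, stage2 = fit_two_stage(X[train_idx], y[train_idx], seed)
        pred = predict_two_stage(gate, stage2, X[test_idx])
        scores.append(accuracy_score(y[test_idx], pred))
    return scores


# Synthetic, imbalanced stand-in data.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4, n_classes=3,
    n_clusters_per_class=1, weights=[0.6, 0.25, 0.15], random_state=0)
scores = cross_validate_two_stage(X, y)
print("fold accuracies:", np.round(scores, 3))
```

Fitting both stages inside each fold keeps the test fold unseen by either classifier, so the per-fold scores reflect the whole pipeline rather than one stage in isolation.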

Random Forest · Welch PSD · Stratified K-Fold · I/Q Signal Processing · ITU Band Classification · Feature Engineering · Scikit-learn

LSTM Stock Price Prediction

Deep Learning · AAPL historical price forecasting

Applied Long Short-Term Memory (LSTM) neural networks to predict Apple Inc. (AAPL) stock closing prices using historical time-series data sourced from Kaggle. Data was preprocessed with MinMaxScaler normalization and structured into sequential 60-day lookback windows to capture temporal dependencies -- requiring careful datetime parsing and chronological sorting so that every window holds observations in their true time order.
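The windowing step can be illustrated with a short sketch on synthetic prices. The helper name make_windows(), the 80/20 split point, and the decision to fit the scaler on the training span only are assumptions of this example rather than details taken from the project.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler


def make_windows(scaled, lookback=60):
    """Turn a 1-D scaled series into (samples, lookback, 1) inputs
    and next-step targets."""
    X, y = [], []
    for end in range(lookback, len(scaled)):
        X.append(scaled[end - lookback:end])  # the previous `lookback` values
        y.append(scaled[end])                 # the value to predict
    return np.asarray(X).reshape(-1, lookback, 1), np.asarray(y)


# Synthetic closing prices, already sorted chronologically.
prices = np.linspace(100.0, 199.0, 100)

# Fit the scaler on the training span only, so the scaling range is not
# informed by test-period prices.
split = 80
scaler = MinMaxScaler()
scaler.fit(prices[:split].reshape(-1, 1))
scaled = scaler.transform(prices.reshape(-1, 1)).ravel()

X, y = make_windows(scaled, lookback=60)
print(X.shape, y.shape)  # (40, 60, 1) (40,)
```

The trailing dimension of 1 is the feature count that Keras LSTM layers expect for a univariate series.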

A two-layer LSTM architecture with Dropout regularization was implemented in TensorFlow/Keras, trained with early stopping and learning rate reduction callbacks to prevent overfitting. The model was evaluated on held-out test data using RMSE, MAE, R-squared, and MAPE. Visualizations included training/validation loss curves, actual-vs-predicted price overlays, error distribution histograms, and scatter plots -- translating a complex sequence model into charts accessible to any stakeholder.
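For reference, the four evaluation metrics named above follow their standard definitions, sketched here in NumPy. The function name regression_metrics() and the two-point example are illustrative only.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Compute RMSE, MAE, R-squared, and MAPE (in percent)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred

    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    ss_res = np.sum(err ** 2)                       # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    mape = 100.0 * np.mean(np.abs(err / y_true))    # assumes no zero targets
    return {"rmse": rmse, "mae": mae, "r2": r2, "mape": mape}


metrics = regression_metrics([100.0, 200.0], [110.0, 190.0])
print(metrics)  # rmse 10.0, mae 10.0, r2 0.96, mape 7.5
```

If the targets were scaled with MinMaxScaler, applying scaler.inverse_transform() to both arrays first would report RMSE and MAE in price units rather than in the 0 to 1 range.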

TensorFlow / Keras · LSTM · MinMaxScaler · Dropout · Early Stopping · RMSE / MAE / R2 · Time Series

Movie Review Sentiment Analysis

Natural Language Processing · Rotten Tomatoes / Kaggle corpus

Built and compared text classification models for five-level sentiment analysis using the Kaggle Movie Reviews dataset (a subset of the Rotten Tomatoes corpus). Preprocessing involved NLTK tokenization, stopword removal, and spaCy lemmatization consolidated into a single reusable pipeline. Feature engineering progressed through several configurations: a 150-word unigram bag-of-words baseline, a 1,000-word expansion, and a combined feature set adding bigrams, POS tag counts, and VADER sentiment scores.
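The bag-of-words configurations differ mainly in vocabulary size and in which feature groups are merged. The sketch below shows that idea using only the standard library; the function names and the "contains(...)" key style are assumptions. str.split stands in for NLTK tokenization, and POS tag counts and VADER scores are left out because they require NLTK data downloads.

```python
from collections import Counter


def build_vocab(token_docs, size):
    """Return the `size` most frequent tokens across all documents."""
    counts = Counter(tok for doc in token_docs for tok in doc)
    return [word for word, _ in counts.most_common(size)]


def unigram_features(tokens, vocab):
    """Boolean presence features, one per vocabulary word."""
    present = set(tokens)
    return {f"contains({w})": (w in present) for w in vocab}


def bigram_features(tokens, bigram_vocab):
    """Boolean presence features for a fixed list of (word, word) pairs."""
    present = set(zip(tokens, tokens[1:]))
    return {f"bigram({a} {b})": ((a, b) in present) for a, b in bigram_vocab}


# Toy, pre-tokenized reviews (not the Kaggle corpus).
docs = [review.split() for review in ["a good movie", "a bad movie", "not good"]]
vocab = build_vocab(docs, size=3)  # size=150 or size=1000 in the text above

tokens = "not a good film".split()
features = unigram_features(tokens, vocab)
# A combined feature set is just the union of the individual feature dicts.
features.update(bigram_features(tokens, [("a", "good"), ("not", "good")]))
print(features)
```

Feature dictionaries of this shape can be passed to NLTK classifiers, which accept (feature dict, label) pairs as training data.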

Three experimental conditions were evaluated using 5-fold cross-validation with an NLTK Naive Bayes classifier, plus an advanced condition using Logistic Regression with the combined feature set. Evaluation metrics included precision, recall, and F-measure across all folds -- reinforcing the trade-offs between vocabulary size, feature richness, and classifier complexity.
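The evaluation loop described above comes down to two pieces: splitting the data into five folds and scoring each fold with per-label precision, recall, and F-measure. A library-free sketch follows. The function names are hypothetical, and the contiguous (unshuffled) folds are an assumption of this example.

```python
def kfold_splits(n_items, k=5):
    """Yield (train_idx, test_idx) index lists for k contiguous folds."""
    fold_size = n_items // k
    for i in range(k):
        start = i * fold_size
        # The last fold absorbs any remainder.
        stop = n_items if i == k - 1 else start + fold_size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_items))
        yield train_idx, test_idx


def precision_recall_f1(gold, predicted, label):
    """Per-label precision, recall, and F1 from parallel label lists."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


folds = list(kfold_splits(10, k=5))
print(folds[0])  # ([2, 3, 4, 5, 6, 7, 8, 9], [0, 1])

print(precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0], label=1))  # (0.5, 0.5, 0.5)
```

For a five-level task, these per-label scores would typically be averaged across labels and then across folds to give one figure per experimental condition.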

NLTK · spaCy · Naive Bayes · Logistic Regression · VADER · Bag-of-Words · 5-Fold Cross-Validation

Program Learning Outcomes

My projects and coursework demonstrate achievement across all six learning outcomes of the Syracuse Applied Data Science program.

Data Collection, Storage & Access

Parsed complex I/Q arrays from string-encoded Excel cells, implemented datetime-aware sequential loading for stock time-series, and leveraged Kaggle datasets -- learning firsthand that this step makes or breaks a project.

Actionable Insights

Delivered domain-relevant insights across defense spectrum monitoring, financial forecasting, and media sentiment analysis -- each grounded in the full data science lifecycle from raw data to interpretable output.

Visualization & Predictive Models

Built training/validation loss curves, actual-vs-predicted overlays, feature importance plots, confusion matrices, and error histograms -- translating complex model behavior into charts any stakeholder can understand at a glance.

Python & Programming

All three projects built entirely in Python -- pandas, numpy, scikit-learn, TensorFlow/Keras, NLTK, spaCy, matplotlib, scipy, CatBoost. Grew from zero Python knowledge at program start to functional end-to-end ML pipelines.

Communication

Documented model decisions, limitations, and assumptions throughout. Core belief: effective visualizations should allow a stakeholder to capture key insights without ever looking at the underlying code.

Ethics & Responsibility

Committed to transparency -- acknowledging class-count limitations, citing AI assistance in code reviews, and understanding the real stakes of data misuse in defense and healthcare contexts.

Resume

Jan 2024 - Present

Program Cost Analyst

HII (Huntington Ingalls Industries) · Syracuse, NY

Manage financial performance for ~10 programs totaling ~$90M, identifying cost variances and improving forecast accuracy for executive reporting
Built automated Excel reporting to track weekly program spend vs. forecast across full portfolio
Prepare MSRs, PMRs, and QPRs for internal and external stakeholders (CDRLs)
Update forecasting tool (OneStream) regularly for accurate AOP vs. IF reporting
Led a financial deep dive into an underperforming program, reviewing costs, pricing, and workflow
Monitor program KPIs while analyzing variances between actuals and forecast

Feb 2022 - Jan 2024

Lead Pricing Analyst

Leidos · Tewksbury, MA (Remote)

Consolidated BOEs and cost information into pricing models with director-level presentation summaries
Presented financial KPIs to determine profitability: GM, ROS, fee, escalation rates, cost validations
Forecasted cash flow based on payment and delivery schedules
Worked with multiple stakeholders on process and cost improvements

Dec 2018 - Feb 2022

Pricing Analyst

Carrier · Syracuse, NY (Remote)

Analyzed 50-60 pricing requests per day through SAP by job size, product quantity, margin, and time frame
Maintained pricing programs using Power BI sales data to align pricing to market trends
Analyzed regional and equipment-based sales data for targeted pricing strategy

Exp. May 2026

M.S. Applied Data Science

Syracuse University

Machine Learning, Deep Learning, NLP, Data Visualization, Python Scripting, Financial Analytics

Dec 2018

B.S. Economics

Le Moyne College

Skills

Programming & ML

Python · R · SQL · Machine Learning · Deep Learning · NLP · Regression Modeling

Data & Visualization

Excel (Pivot Tables, Power Query, Macros, XLOOKUP) · Power BI · Tableau · Google Analytics

Enterprise Systems

OneStream · SAP · Salesforce · Costpoint

Domain Expertise

Financial Analysis · Variance Analysis · Pricing Analysis · Defense Program Finance · Statistical Inference