Cookbook
Personal reference scripts for commonly used code
- Machine Learning: A folder containing scripts for commonly used machine learning code
  - Preprocessing.py: Preparing data for machine learning tasks, primarily using pandas and sklearn
  - scikit-learn: Also includes LightGBM and XGBoost
    - ModelTraining.py: Cross validation, hyperparameter tuning, feature selection, etc.
    - Evaluation.py: Evaluation plots, collecting eval metrics, learning curves, feature importance, etc.
    - LightGBM.py: Early stopping and other code that’s convenient to copy/paste
  - TensorFlow
    - Keras.py: Commonly used code for Keras
    - KerasMNIST.py: Training a convolutional net on the MNIST data with Keras
    - TensorFlowMNIST.py: Training a convolutional net on the MNIST data with TensorFlow
  - PyTorch
    - PyTorch.py: Commonly used code for PyTorch
    - PyTorchMNIST.py: Training a convolutional net on the MNIST data with PyTorch
  - SparkML
    - SparkML.py: Commonly used code for SparkML. Includes preprocessing, hyperparameter tuning, cross validation, and so on.
- Plotting: Code snippets for common plots
- Misc: For scripts that don’t fit within any other folders
  - EDA.py: EDA reports, missing values, and outliers
  - NLP.py: NLTK natural language processing tasks
  - PySpark.py: Missing values, datatype conversions, encoding categorical columns, and prepping data for models
- DevOps: A folder containing scripts for operationalizing machine learning models
  - Flask: Operationalizing a trained machine learning model as a RESTful API
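
As a rough illustration of what serving a model as a RESTful API with Flask can look like, here is a minimal sketch. It is not taken from the scripts in this repo: the `/predict` route, the `instances` JSON key, and the `MeanModel` stand-in are all hypothetical, and a real script would load a trained model instead.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


class MeanModel:
    """Hypothetical stand-in for a trained model: predicts the mean of each row."""

    def predict(self, rows):
        return [sum(row) / len(row) for row in rows]


# Assumption: in practice this would be a trained model loaded from disk
model = MeanModel()


@app.route("/predict", methods=["POST"])
def predict():
    # Parse the JSON body; silent=True returns None instead of raising on bad input
    payload = request.get_json(silent=True)
    if not payload or "instances" not in payload:
        return jsonify(error="expected a JSON body with an 'instances' list"), 400

    # Each instance is a list of numeric feature values
    predictions = model.predict(payload["instances"])
    return jsonify(predictions=predictions)


if __name__ == "__main__":
    # Development server only; a production setup would sit behind a WSGI server
    app.run(port=5000)
```

With the server running, posting `{"instances": [[1, 2, 3]]}` to `/predict` returns the predictions as JSON. Keeping the model behind a single `predict` method means the stand-in can be swapped for a real estimator without touching the route.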