
Deep Neural Frameworks


PYTORCH

  1. Deep learning with pytorch - the book

  2. Pytorch DL course, git - by Yann LeCun

  3. Pytorch Official Tutorials

    1. Learning with examples || Learn the Basics || Quickstart || Tensors || Datasets & DataLoaders || Transforms || Build Model || Autograd || Optimization || Save & Load Model (a minimal end-to-end sketch of this workflow follows the list)

    2. 60 minute blitz (good) - youtube series
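To make the basics workflow above concrete, here is a minimal end-to-end sketch (written for this summary, not taken from the linked tutorials) covering tensors, a DataLoader, building a model, autograd-driven optimization, and saving/loading weights; the toy data and hyperparameters are arbitrary.

```python
# Minimal illustrative PyTorch workflow: data -> DataLoader -> model ->
# autograd/optimization -> save & load. Toy data, arbitrary hyperparameters.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Tensors / toy dataset: 1000 samples, 20 features, binary labels.
X = torch.randn(1000, 20)
y = (X.sum(dim=1) > 0).long()
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

# Build the model.
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Optimization loop: forward pass, loss, backward (autograd), parameter step.
for epoch in range(3):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: last batch loss = {loss.item():.4f}")

# Save & load the model's weights.
torch.save(model.state_dict(), "model.pt")
model.load_state_dict(torch.load("model.pt"))
```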

FAST.AI

  1. git

KERAS

A make sense introduction into keras - it has several videos on the topic, going through many network types, creating custom activation functions, and going through examples.

+ Two extra videos from the same author: examples and examples-2.

Didn’t read:

  1. Keras cheatsheet

  2. Seq2Seq RNN

  3. Stateful LSTM - example script showing how to use stateful RNNs to model long sequences efficiently.

  4. CONV LSTM - this script demonstrates the use of a conv LSTM network, used to predict the next frame of an artificially generated movie which contains moving squares.

  5. How to force keras to use tensorflow and not theano (set the .bat file)

  6. Callbacks - how to create an AUC ROC score callback with keras - with code example (see the sketch after this list).

  7. Batch size vs. Iterations in NN Keras.

  8. Keras metrics - classification, regression and custom metrics.

  9. Keras Metrics 2 - accuracy, ROC, AUC, classification, regression r^2.

  10. Introduction to regression models in Keras, using MSE, comparing baseline vs wide vs deep networks.

  11. How does Keras calculate accuracy? Formula and explanation.

  12. Custom metrics (precision recall) in keras, which are taken from here, including entropy and f1.
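As a rough idea of what items 6 and 12 describe, here is a hedged sketch (written for this summary, not the linked posts' code) of an AUC-ROC callback: it scores held-out data with sklearn's roc_auc_score at the end of every epoch; x_val and y_val are assumed validation arrays.

```python
# Hedged sketch of an AUC-ROC callback for a binary classifier.
import tensorflow as tf
from sklearn.metrics import roc_auc_score

class AucCallback(tf.keras.callbacks.Callback):
    def __init__(self, x_val, y_val):
        super().__init__()
        self.x_val = x_val
        self.y_val = y_val

    def on_epoch_end(self, epoch, logs=None):
        # Predicted probabilities for the positive class.
        y_pred = self.model.predict(self.x_val, verbose=0).ravel()
        auc = roc_auc_score(self.y_val, y_pred)
        print(f"epoch {epoch + 1}: validation AUC = {auc:.4f}")
        if logs is not None:
            logs["val_auc"] = auc

# Usage (assuming a compiled binary classifier `model` and validation arrays):
# model.fit(x_train, y_train, epochs=5, callbacks=[AucCallback(x_val, y_val)])
```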

KERAS MULTI GPU

  1. When using SGD, only batch sizes between 32 and 512 are adequate; larger batches can lead to lower performance and smaller ones will lead to slow training times. Note: this probably doesn't reflect on Adam - is there a reference?

  2. Parallel GPU code for keras. It's a one-liner, but remember to scale the batch size by the number of GPUs used in order to see a (non-linear) scalability in training time.

  3. Pitfalls in GPU training in keras-tensorflow - this is a very important post; be aware that you can corrupt your weights using the wrong combination of batch size and input size. When you do multi-GPU training, it is important to feed all the GPUs with data. It can happen that the very last batch of your epoch has less data than defined (because the size of your dataset cannot be divided exactly by the size of your batch). This might cause some GPUs not to receive any data during the last step. Unfortunately some Keras layers, most notably the Batch Normalization layer, can't cope with that, leading to NaN values appearing in the weights (the running mean and variance in the BN layer). See the sketch after this list.

  4. 5 things to be aware of for multi-GPU training with keras, crucial to look at before doing anything.
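A hedged sketch of the two practical points above: scale the batch size by the number of replicas, and avoid a short final batch so every GPU receives data. It uses tf.distribute.MirroredStrategy, the current tf.keras route; the linked posts may rely on the older, now-deprecated multi_gpu_model utility. The dataset and model are toy placeholders.

```python
# Hedged multi-GPU sketch with tf.distribute.MirroredStrategy.
import numpy as np
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
n_replicas = strategy.num_replicas_in_sync

# Scale the global batch size by the number of GPUs/replicas.
per_replica_batch = 64
global_batch = per_replica_batch * max(n_replicas, 1)

x = np.random.rand(10_000, 32).astype("float32")
y = np.random.randint(0, 2, size=(10_000,)).astype("float32")
ds = (tf.data.Dataset.from_tensor_slices((x, y))
      .shuffle(10_000)
      .batch(global_batch, drop_remainder=True))  # no short last batch -> every replica gets data

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.BatchNormalization(),  # the layer named in the pitfall above
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

model.fit(ds, epochs=2)
```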

KERAS FUNCTIONAL API

What is and how to use? A flexible way to declare layers in parallel, i.e., parallel ways to deal with input, feature extraction, models and outputs, as seen in the following figures.

Figure: Neural Network Graph With Shared Feature Extraction Layer

Figure: Neural Network Graph With Multiple Inputs

Figure: Neural Network Graph With Multiple Outputs
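A small illustrative sketch (written for this summary, not taken from the linked article) of the three patterns named in the figures: parallel inputs, a shared feature-extraction layer, and multiple outputs; layer sizes and names are arbitrary.

```python
# Keras functional API: two inputs, one shared extractor, two outputs.
import tensorflow as tf
from tensorflow.keras import layers

# Two parallel inputs of the same width so one extractor can be shared.
input_a = tf.keras.Input(shape=(64,), name="input_a")
input_b = tf.keras.Input(shape=(64,), name="input_b")

# Shared feature-extraction layer (the same weights applied to both inputs).
extractor = layers.Dense(32, activation="relu", name="shared_extractor")
feat_a = extractor(input_a)
feat_b = extractor(input_b)

# Merge the parallel branches.
merged = layers.concatenate([feat_a, feat_b])
hidden = layers.Dense(16, activation="relu")(merged)

# Two parallel outputs: a classification head and a regression head.
class_out = layers.Dense(1, activation="sigmoid", name="class_out")(hidden)
reg_out = layers.Dense(1, name="reg_out")(hidden)

model = tf.keras.Model(inputs=[input_a, input_b], outputs=[class_out, reg_out])
model.compile(optimizer="adam",
              loss={"class_out": "binary_crossentropy", "reg_out": "mse"})
model.summary()
```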

KERAS EMBEDDING LAYER

  1. Injecting GloVe into the keras embedding layer and using it for classification, plus what the embedding layer in keras is and how to use it (see the sketch after this list).

  2. Keras blog - using GloVe for pretrained embedding layers.

  3. Word embedding using keras, continuous BOW - CBOW, SKIPGRAM, word2vec - really good.

  4. Fasttext - comparison of key features against word2vec.

  5. Multiclass classification using word2vec/glove + code.

  6. word2vec/doc2vec/tfidf code in python for text classification.

  7. LDA & word2vec.

  8. Text classification with word2vec.

  9. Gensim word2vec, and another one.

  10. Fasttext paper.
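The core of items 1-2 above is building an embedding matrix from a pretrained-vectors file and handing it to the Embedding layer. A hedged sketch follows; the file name (glove.6B.100d.txt), the toy vocabulary, and the dimensions are assumptions for illustration.

```python
# Hedged sketch: inject pretrained GloVe vectors into a Keras Embedding layer.
import numpy as np
import tensorflow as tf

vocab = {"the": 1, "cat": 2, "sat": 3}  # toy word -> index map (0 reserved for padding)
embedding_dim = 100

# Load pretrained vectors (each line: "word v1 v2 ... v100").
embeddings_index = {}
with open("glove.6B.100d.txt", encoding="utf-8") as f:
    for line in f:
        values = line.split()
        embeddings_index[values[0]] = np.asarray(values[1:], dtype="float32")

# Build the matrix: row i holds the vector of the word whose index is i.
embedding_matrix = np.zeros((len(vocab) + 1, embedding_dim))
for word, i in vocab.items():
    vector = embeddings_index.get(word)
    if vector is not None:
        embedding_matrix[i] = vector

# Embedding layer initialized with the GloVe weights and frozen during training.
embedding_layer = tf.keras.layers.Embedding(
    input_dim=len(vocab) + 1,
    output_dim=embedding_dim,
    embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
    trainable=False,
)
```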

Keras: Predict vs Evaluate

here:

.predict() generates output predictions based on the input you pass it (for example, the predicted characters in the MNIST example).

.evaluate() computes the loss based on the input you pass it, along with any other metrics that you requested in the metrics param when you compiled your model (such as accuracy in the MNIST example).
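A tiny hedged illustration of the difference (toy model and data, written for this summary): predict() returns the model's outputs, while evaluate() returns the loss plus any compiled metrics against known labels.

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(3, activation="softmax")])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

x = np.random.rand(8, 4).astype("float32")
y = np.random.randint(0, 3, size=(8,))

probs = model.predict(x, verbose=0)          # shape (8, 3): per-class probabilities
loss, acc = model.evaluate(x, y, verbose=0)  # scalar loss + the compiled accuracy metric
print(probs.shape, loss, acc)
```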

Keras metrics

For classification methods - how does keras calculate accuracy, all functions:

For binary accuracy it compares the label with the rounded predicted float, i.e., bigger than 0.5 = 1, smaller than 0.5 = 0.

For categorical accuracy we take the argmax of the label and of the prediction and compare their locations.

In both cases, we average the results.
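The two rules above, written out with NumPy so the round/argmax-then-average behaviour is explicit (an illustrative re-implementation, not Keras' own code):

```python
import numpy as np

def binary_accuracy(y_true, y_pred):
    # Round each predicted probability (> 0.5 -> 1) and compare with the label.
    return np.mean(y_true == np.round(y_pred))

def categorical_accuracy(y_true, y_pred):
    # Compare the argmax position of the one-hot label and of the prediction.
    return np.mean(np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1))

print(binary_accuracy(np.array([1, 0, 1]), np.array([0.7, 0.4, 0.2])))   # 0.666...
print(categorical_accuracy(np.array([[0, 1], [1, 0]]),
                           np.array([[0.2, 0.8], [0.3, 0.7]])))          # 0.5
```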

LOSS IN KERAS

Why is the training loss much higher than the testing loss? A Keras model has two modes: training and testing. Regularization mechanisms, such as Dropout and L1/L2 weight regularization, are turned off at testing time. Besides, the training loss is the average of the losses over each batch of training data. Because your model is changing over time, the loss over the first batches of an epoch is generally higher than over the last batches. On the other hand, the testing loss for an epoch is computed using the model as it is at the end of the epoch, resulting in a lower loss.
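A small hedged demonstration of the point (toy data, written for this summary): the epoch loss Keras logs is the running batch average while the weights keep changing and dropout is active, whereas evaluating the finished model on the same data uses the final weights in inference mode.

```python
import numpy as np
import tensorflow as tf

x = np.random.rand(2000, 16).astype("float32")
y = (x.mean(axis=1) > 0.5).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dropout(0.5),          # active during fit(), off during evaluate()
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

history = model.fit(x, y, epochs=1, batch_size=32, verbose=0)
print("epoch training loss (batch average):", history.history["loss"][0])
print("loss of the finished model on the same data:", model.evaluate(x, y, verbose=0))
```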

