Machine & Deep Learning Compendium
Probability & Statistics

Last updated 3 years ago

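Coursera course on probabilities for data science: actually quite good at explaining a lot of the basic tools (probability, conditional probability, distributions, sampling, confidence intervals, hypothesis testing, etc.).

  • A great resource for proba/bayes/b-networks/etc (adam bali)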

Difference between probability and statistics

  • Probability deals with predicting the likelihood of future events, while statistics involves the analysis of the frequency of past events.

  • The problems considered by probability and statistics are inverse to each other.

  • In probability theory we consider some underlying process that has some randomness or uncertainty modeled by random variables, and we figure out what happens.

=> underlying process + randomness and random variables -> what happens next?

  • In statistics we observe something that has happened and try to figure out what underlying process would explain those observations.

=> observe what happened -> what is the underlying process?

  • Finally, probability theory is mainly concerned with the deductive part, and statistics with the inductive part, of modeling processes with uncertainty.
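The two directions described above can be sketched with a coin. This is a minimal illustration, not taken from the linked material: the coin, the helper `binomial_pmf`, and the hidden bias `true_p` are all assumptions made for the example.

```python
import math
import random


def binomial_pmf(k, n, p):
    """Probability of exactly k heads in n independent flips with P(heads) = p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability (deductive): the process is known (a fair coin),
# and we ask what is likely to happen.
p_known = 0.5
prob_7_of_10 = binomial_pmf(7, 10, p_known)

# Statistics (inductive): we only see the outcomes and try to recover
# the process that produced them. true_p is hidden from the "analyst".
random.seed(0)
true_p = 0.7
flips = [random.random() < true_p for _ in range(10_000)]
p_hat = sum(flips) / len(flips)  # estimate of the unknown bias

print(f"P(7 heads in 10 flips | p=0.5) = {prob_7_of_10:.4f}")
print(f"estimated p from observed flips = {p_hat:.3f}")
```

The first half goes from a known process to the chance of an outcome; the second goes from observed outcomes back to an estimate of the process.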

Introduction to statistics

  1. Table of contents

  2. Mode - the most frequent value

  3. ,

  4. - a private case

  5. Standard deviation - std is in the same units as the mean and is the square root of the variance. Squaring the deviations in the formula lets outliers influence the result and keeps positive and negative deviations from cancelling each other out.
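A small sketch of the summary statistics listed above, using Python's standard `statistics` module. The data set is made up for illustration, and the population versions (`pvariance`, `pstdev`) are chosen here only to keep the arithmetic easy to check by hand.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, not from the source

mean = statistics.mean(data)      # 5
median = statistics.median(data)  # 4.5
mode = statistics.mode(data)      # 4, the most frequent value

# Raw deviations from the mean cancel each other out...
sum_dev = sum(x - mean for x in data)  # 0

# ...which is why the deviations are squared before averaging.
variance = statistics.pvariance(data)  # 4
std = statistics.pstdev(data)          # 2.0, back in the units of the mean

# Squaring also gives a single outlier a large influence on the spread.
std_with_outlier = statistics.pstdev(data + [100])

print(mean, median, mode, sum_dev, variance, std)
print(f"std with one outlier added: {std_with_outlier:.2f}")
```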

Introduction to Probability

More on Statistics

Wiki

Recommended Courses

  • A population consists of every individual we are interested in studying, and a sample consists of the individuals that are selected from the population.

  • In probability: we would start by knowing everything about the composition of a population, and then ask, “What is the likelihood that a selection, or sample, from the population has certain characteristics?”

  • In statistics: we have no knowledge about the composition of the population (for example, the types of socks in a drawer), so we infer properties of the population on the basis of a random sample.

  • Finding out the probability of an event

  • Of two consecutive events (multiplication)

  • Of several events (sum)

  • Etc.
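The calculations listed above can be sketched with the sock drawer mentioned earlier. The drawer contents (4 red, 6 blue) are a hypothetical example, and the draws are assumed to be made without replacement.

```python
from fractions import Fraction

# Hypothetical drawer whose composition is fully known (the "probability" setting).
drawer = {"red": 4, "blue": 6}
total = sum(drawer.values())

# Probability of a single event: the first sock drawn is red.
p_red = Fraction(drawer["red"], total)

# Two consecutive events: multiply, conditioning the second draw on the first.
p_red_then_red = Fraction(drawer["red"], total) * Fraction(drawer["red"] - 1, total - 1)
p_blue_then_blue = Fraction(drawer["blue"], total) * Fraction(drawer["blue"] - 1, total - 1)

# Several mutually exclusive events: add. A matching pair is either
# "red then red" or "blue then blue", and both cannot happen at once.
p_matching_pair = p_red_then_red + p_blue_then_blue

print(p_red, p_red_then_red, p_blue_then_blue, p_matching_pair)
```

Exact fractions are used instead of floats so each step can be checked by hand.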

STATISTICAL SAMPLING AND RESAMPLING

Least squares regression: it works by making the total of the squared errors as small as possible (that is why it is called "least squares").
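As a minimal sketch of that idea for a single predictor, the closed-form fit below picks the slope and intercept that minimise the sum of squared errors. The function names `fit_line` and `sum_squared_errors` and the data points are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Closed-form simple linear regression: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_errors(xs, ys, slope, intercept):
    """Total of the squared vertical distances between points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]  # made-up points, roughly y = 2x

slope, intercept = fit_line(xs, ys)
best = sum_squared_errors(xs, ys, slope, intercept)

# Any other line gives a larger total squared error.
worse = sum_squared_errors(xs, ys, slope + 0.1, intercept)

print(f"slope={slope:.2f}, intercept={intercept:.2f}, SSE={best:.3f} (nudged line: {worse:.3f})")
```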

Topics covered in the introduction to statistics:

  • Median
  • Mode
  • Weighted mean
  • Geometric mean
  • Harmonic mean
  • Percentiles
  • Mean deviation
  • Correlation
  • Standard deviation
  • Standard normal distribution
  • Skewness of distribution
  • Confidence intervals (using std)
  • Accuracy vs precision (accurate vs hitting closely together, i.e. density)
  • Probability
  • Probability complement
  • Chi-square test, p-value, independent, dependent, significance
  • Variation vs variance
  • Std vs variance
  • Types of events
  • Independent events
  • Conditional probability
  • Probability tree diagrams
  • Mutually exclusive events
  • Combinations and permutations
  • Bayes
  • Least squares regression
  • Random variables
  • Continuous random variables
  • Random variables: mean, std, variance

25 concepts (part 2), 29 more concepts (part 1) & part 3 in statistics.

Probability concepts:

  • Marginal probability
  • Joint probability
  • Conditional probability
  • Chain rule - derivatives using the chain rule, on Khan Academy

Another great course on probability, distribution types, conditional, joint, chain, etc.

Khan Academy

A really good intro to probability, conditional, joint, etc.

What are confidence intervals? (another angle)

The main difference between probability and statistics has to do with knowledge: what are the known facts? Inherent in both probability and statistics are a population and a sample.

Some calculations to get you into probability.

What is? Method for sampling/resampling, and sampling errors explained (cross validation etc).
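One common resampling method is the bootstrap. The sketch below is an illustration under stated assumptions, not the method of the linked article: the population is synthetic, and the sample size (100) and number of resamples (2000) are arbitrary choices.

```python
import random
import statistics

random.seed(42)

# Synthetic population; in practice it is usually unobservable.
population = [random.gauss(50, 10) for _ in range(10_000)]

# Sampling: draw 100 individuals without replacement.
sample = random.sample(population, 100)
sample_mean = statistics.mean(sample)

# Sampling error: the gap between the sample statistic and the population value.
sampling_error = sample_mean - statistics.mean(population)


def bootstrap_means(data, n_resamples=2000):
    """Resample the data with replacement and collect the mean of each resample."""
    return [
        statistics.mean(random.choices(data, k=len(data)))
        for _ in range(n_resamples)
    ]


# Percentile bootstrap: the middle 95% of resampled means
# gives a rough interval for the population mean.
boot = sorted(bootstrap_means(sample))
lower = boot[int(0.025 * len(boot))]
upper = boot[int(0.975 * len(boot))]

print(f"sample mean = {sample_mean:.2f}, sampling error = {sampling_error:+.2f}")
print(f"95% bootstrap interval for the mean: [{lower:.2f}, {upper:.2f}]")
```

The interval is built from the sample alone, which is the point of resampling: it estimates how much the statistic would vary across samples without drawing new ones from the population.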