Data Science Notes

Machine Learning

$ \mathbf{Independent\ variables}$ (features, inputs, or $x$): These are the inputs to the model $f$; their values are not determined by any other variable in the problem. Here $x$ is a vector.
$ \mathbf{Dependent\ variable}$ (target or $y$): The value of this variable depends on the independent variables.
\begin{equation} f(x) \approx y, \end{equation} where $f$ is the function that we want to learn from observed pairs of $x$ and $y$.
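
As a rough illustration of finding an $f$ with $f(x) \approx y$, the sketch below fits a straight line by least squares. It assumes a single feature rather than a full vector, and the data values and the name `fit_line` are made up for this example.

```python
def fit_line(xs, ys):
    """Least-squares fit of f(x) = a * x + b to paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return lambda x: a * x + b


# Made-up data, roughly y = 2x + 1 with a little noise
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

f = fit_line(xs, ys)
# f(x) only approximates y, so the residuals are small but not zero
residuals = [f(x) - y for x, y in zip(xs, ys)]
```

The fitted $f$ does not reproduce the targets exactly; it only keeps the residuals small, which is what the $\approx$ in the equation expresses.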

$ \mathbf{Supervised\ Learning:}$ Using known values of $x$ and the corresponding $y$ values, we construct $f$. We then use $f$ to predict $y$ for new values of $x$. There are two categories:

  1. Regression - for continuous target values
  2. Classification - for discrete target values
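
The difference between the two categories can be sketched with a nearest-neighbour predictor, which is one possible choice of $f$ and not something these notes prescribe. The same neighbours are used in both cases; only the way their targets are combined changes. The data and the function names below are hypothetical.

```python
from collections import Counter


def nearest(train_x, query, k):
    """Indices of the k training points closest to query (squared Euclidean distance)."""
    def dist(p):
        return sum((a - b) ** 2 for a, b in zip(p, query))
    return sorted(range(len(train_x)), key=lambda i: dist(train_x[i]))[:k]


def knn_regress(train_x, train_y, query, k=2):
    """Regression: average the continuous targets of the k nearest neighbours."""
    idx = nearest(train_x, query, k)
    return sum(train_y[i] for i in idx) / k


def knn_classify(train_x, train_y, query, k=3):
    """Classification: majority vote over the discrete targets of the k nearest neighbours."""
    idx = nearest(train_x, query, k)
    return Counter(train_y[i] for i in idx).most_common(1)[0][0]


# Made-up data: each x is a feature vector (size, rooms)
X = [(50, 1), (60, 2), (80, 3), (120, 4), (150, 5)]
price = [100.0, 120.0, 170.0, 260.0, 330.0]            # continuous target
label = ["small", "small", "small", "large", "large"]  # discrete target

predicted_price = knn_regress(X, price, (70, 2))
predicted_label = knn_classify(X, label, (70, 2))
```

Here `predicted_price` is a real number, while `predicted_label` is always one of the labels seen in the training data.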
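
$ \mathbf{Unsupervised\ Learning:}$ We want to derive structure from data for which we have no target values, so we do not know in advance what the output should look like. Examples: clustering (e.g. grouping the stocks of various companies), non-clustering (the Cocktail Party Algorithm, which separates individual sound sources from mixed recordings).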
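
Clustering can be sketched with k-means, a common clustering algorithm that these notes do not name; it is chosen here only for illustration. The points are made-up, unlabeled feature vectors, and the starting centroids are passed in explicitly so the run is deterministic.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[d.index(min(d))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid)
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Made-up, unlabeled data: no y values are given anywhere
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
```

No target values are used at any point; the grouping comes only from the distances between the inputs.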