---
## Linear Regression Explanation: Weights Vector and Feature Matrix
### 1. How the Weights Vector Multiplies with the Feature Matrix
In linear regression, we model the relationship between input features and the target output as a linear combination of the input features. The formula for linear regression is:
$$
\mathbf{y} = \mathbf{X} \mathbf{w} + \mathbf{b}
$$
where:
- **\(\mathbf{y}\)** is the vector of predicted outputs.
- **\(\mathbf{X}\)** is the feature matrix, where each row represents a data point and each column represents a feature. If there are \(n\) data points and \(p\) features, then \(\mathbf{X}\) has a shape of \(n \times p\).
- **\(\mathbf{w}\)** is the weights vector (also called the coefficient vector), which has a size of \(p \times 1\). Each weight corresponds to the importance or contribution of a specific feature.
- **\(\mathbf{b}\)** is the bias term, which shifts the output up or down.
#### Explanation of \(\mathbf{X} \mathbf{w}\)
1. **Matrix-vector multiplication**: In \(\mathbf{X} \mathbf{w}\), each row of \(\mathbf{X}\) (a single data point) is multiplied by the weights vector \(\mathbf{w}\) to produce a prediction for that data point.
2. **Output**: The result \(\mathbf{X} \mathbf{w}\) yields an \(n \times 1\) vector of predictions for all \(n\) data points. The bias term \(\mathbf{b}\) is added to each element in this vector to produce the final prediction vector.
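As a concrete illustration, here is a minimal NumPy sketch (the numbers are made up purely for illustration) of computing \( \mathbf{X}\mathbf{w} + \mathbf{b} \) for \( n = 3 \) data points and \( p = 2 \) features:

```python
import numpy as np

# Feature matrix X: n = 3 data points (rows), p = 2 features (columns)
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Weights vector w: one coefficient per feature (shape (p,))
w = np.array([0.5, -1.0])

# Bias term b: a single scalar added to every prediction
b = 0.25

# Matrix-vector multiplication X @ w gives one prediction per row,
# then the bias is broadcast and added to each element.
y = X @ w + b
print(y)        # [-1.25 -2.25 -3.25]
print(y.shape)  # (3,), i.e. one prediction per data point
```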
---
### 2. The Weights Vector \(\mathbf{w}\)
The weights vector, typically denoted as **\(\mathbf{w}\)**, is a column vector that holds the coefficients assigned to each feature in linear regression. It defines the influence or contribution of each feature to the output prediction.
For a model with \(p\) features, the weights vector **\(\mathbf{w}\)** is of size \(p \times 1\) and can be represented as:
$$
\mathbf{w} = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_p \end{bmatrix}
$$
where each element \(w_i\) in the weights vector corresponds to the coefficient for feature \(x_i\):
- \(w_1\) is the weight for the first feature.
- \(w_2\) is the weight for the second feature.
- ...
- \(w_p\) is the weight for the \(p\)-th feature.
For example, if the model has three features, then the weights vector would be:
$$
\mathbf{w} = \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}
$$
In practice, these weights are determined during training by minimizing the error between the predictions and the actual target values, adjusting the weights accordingly.
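As a rough sketch of that training step, the weights and bias can be recovered with ordinary least squares via `np.linalg.lstsq` (the data below is synthetic, generated from known weights just so the result can be checked):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 100 points, 3 features, generated from known weights plus noise.
true_w = np.array([2.0, -1.0, 0.5])
true_b = 3.0
X = rng.normal(size=(100, 3))
y = X @ true_w + true_b + rng.normal(scale=0.1, size=100)

# Append a column of ones so the bias is learned as an extra weight.
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])

# Least squares: minimizes ||X_aug @ params - y||^2 over params.
params, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
w_hat, b_hat = params[:-1], params[-1]
print(w_hat)  # close to [2.0, -1.0, 0.5]
print(b_hat)  # close to 3.0
```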
---
### 3. The Feature Matrix \(\mathbf{X}\)
The feature matrix, denoted as **\(\mathbf{X}\)**, contains all input data points for a linear regression model. In this matrix:
- Each row represents a single data point (observation).
- Each column represents a feature (variable) of that data point.
If there are \(n\) data points and \(p\) features, the feature matrix \(\mathbf{X}\) is of size \(n \times p\) and has the following structure:
$$
\mathbf{X} = \begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np} \\
\end{bmatrix}
$$
where:
- \(x_{ij}\) represents the value of the \(j\)-th feature for the \(i\)-th data point.
#### Example of a Feature Matrix
For three data points and two features, the feature matrix would look like this:
$$
\mathbf{X} = \begin{bmatrix}
x_{11} & x_{12} \\
x_{21} & x_{22} \\
x_{31} & x_{32} \\
\end{bmatrix}
$$
In this example:
- The first row \([x_{11}, x_{12}]\) represents the features for the first data point.
- The second row \([x_{21}, x_{22}]\) represents the features for the second data point.
- The third row \([x_{31}, x_{32}]\) represents the features for the third data point.
During the linear regression calculation, each row of **\(\mathbf{X}\)** is multiplied by the weights vector **\(\mathbf{w}\)** to generate a prediction for that specific data point.
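To see that row-by-row view in code, here is a small sketch (with arbitrary values) confirming that dotting each row of \( \mathbf{X} \) with \( \mathbf{w} \) gives the same result as the single matrix-vector product:

```python
import numpy as np

# A 3 x 2 feature matrix as in the example above, with arbitrary values.
X = np.array([[1.0, 4.0],
              [2.0, 5.0],
              [3.0, 6.0]])
w = np.array([10.0, 0.1])

# Prediction for each data point: the dot product of its row with w.
row_by_row = np.array([X[i] @ w for i in range(X.shape[0])])

# The same computation as one matrix-vector multiplication.
all_at_once = X @ w

print(row_by_row)                            # [10.4 20.5 30.6]
print(np.allclose(row_by_row, all_at_once))  # True
```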
# Linear Algebra
In linear algebra, a **singularity** typically refers to a **singular matrix**, a matrix that does not have an inverse. Here's what that means:
## Singularity
https://community.deeplearning.ai/t/singular-vs-non-singular-naming/274873
1. **Singular Matrix**:
- A matrix \( \mathbf{A} \) is singular if its **determinant** is zero: \( \det(\mathbf{A}) = 0 \).
- A singular matrix cannot be inverted, which implies that there is no matrix \( \mathbf{A}^{-1} \) such that \( \mathbf{A} \mathbf{A}^{-1} = \mathbf{I} \) (where \( \mathbf{I} \) is the identity matrix).
2. **Implications of Singularity**:
- **Linearly Dependent Rows or Columns**: A matrix is singular if its rows or columns are linearly dependent, meaning that at least one row or column can be written as a linear combination of the others.
- **No Unique Solutions**: When using a singular matrix in a system of linear equations \( \mathbf{A}\mathbf{x} = \mathbf{b} \), the system either has **no solutions** or **infinitely many solutions**, but never a unique solution.
- **Non-Invertible Transformations**: In transformations represented by matrices, a singular matrix corresponds to a transformation that squashes some or all of the space, meaning the transformation is not fully reversible.
3. **Geometric Interpretation**:
- A singular matrix, when representing a transformation in vector space, will map some vectors onto a lower-dimensional space (e.g., a 2D plane in 3D space), causing a loss of dimensionality. This often manifests as "flattening" or "collapsing" parts of the space onto each other.
In summary, in linear algebra, singularity indicates a loss of invertibility due to linearly dependent vectors, resulting in a matrix that cannot map uniquely back to its original space.
Suppose the linear system we have is
$$
\mathbf{A}\mathbf{x} = \mathbf{b}
$$
where \( \mathbf{A} \in \mathbf{R}^{n \times n} \) and \( \mathbf{x}, \mathbf{b} \in \mathbf{R}^n \). You need to be a bit more precise when relating the number (or existence) of solutions to the singularity of \( \mathbf{A} \).
The following statements are correct:
1. A linear system has a unique solution if and only if the matrix is non-singular.
2. A linear system has either no solution or infinite number of solutions if and only if the matrix is singular.
3. A linear system has a solution if and only if \( \mathbf{b} \) is in the range of \( \mathbf{A} \).
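To make statements 1 and 2 concrete, here is a small NumPy sketch with made-up \( 2 \times 2 \) matrices: the non-singular system has exactly one solution, while `np.linalg.solve` rejects the singular one:

```python
import numpy as np

# Non-singular matrix: rows are linearly independent, det != 0.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
print(np.linalg.det(A))       # 5.0 (nonzero)
print(np.linalg.solve(A, b))  # the unique solution [0.8, 1.4]

# Singular matrix: the second row is 2x the first, so det == 0.
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])
print(np.linalg.det(S))       # 0.0

try:
    np.linalg.solve(S, b)     # no unique solution exists
except np.linalg.LinAlgError as e:
    print("LinAlgError:", e)  # "Singular matrix"
```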
Now by definition,
1. The matrix is non-singular if and only if the determinant is nonzero.
However, as your professor mentioned, you do not need to evaluate the determinant to see whether a matrix is singular or not (though most such methods evaluate the determinant as a by-product).
For example, you can use [Gaussian elimination](https://en.wikipedia.org/wiki/Gaussian_elimination) to tell whether a matrix is singular. This has the following advantages.
1. The time complexity of Gaussian elimination is \( O(n^3) \) (whereas brute-force evaluation of the determinant from the original definition takes \( O(n!) \)).
2. Gaussian elimination evaluates the determinant as a by-product (i.e., with no additional cost).
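As a rough sketch of that idea (not a production routine): forward elimination with partial pivoting reduces the matrix to upper-triangular form in \( O(n^3) \), and the determinant falls out as the product of the pivots, with a sign flip for each row swap. A zero pivot signals a singular matrix.

```python
import numpy as np

def det_by_elimination(A):
    """Return det(A) via Gaussian elimination with partial pivoting.

    A zero pivot (and hence a zero determinant) indicates a singular matrix.
    """
    U = np.array(A, dtype=float)   # work on a copy
    n = U.shape[0]
    det = 1.0
    for k in range(n):
        # Partial pivoting: bring the largest entry in column k to the diagonal.
        p = k + np.argmax(np.abs(U[k:, k]))
        if np.isclose(U[p, k], 0.0):
            return 0.0             # zero pivot -> singular
        if p != k:
            U[[k, p]] = U[[p, k]]  # a row swap flips the determinant's sign
            det = -det
        det *= U[k, k]
        # Eliminate the entries below the pivot.
        U[k + 1:] -= np.outer(U[k + 1:, k] / U[k, k], U[k])
    return det

A = np.array([[2.0, 1.0], [1.0, 3.0]])
S = np.array([[1.0, 2.0], [2.0, 4.0]])
print(det_by_elimination(A))  # 5.0 -> non-singular
print(det_by_elimination(S))  # 0.0 -> singular
```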