Add notes and change file structure

2025-12-27 14:54:05 -06:00 · 2024-10-10 10:39:11 +08:00 · 6f6d21292c
commit 6f6d21292c
parent 0049b139a3
24 changed files with 155 additions and 11 deletions
--- a/content/.trash/algorithm/Untitled.md
+++ b/content/.trash/algorithm/Untitled.md
--- a/content/.trash/tmp.md
+++ b/content/.trash/tmp.md
@ -0,0 +1,12 @@
+---
+title: tmp_note
+tags:
+  - tmp_note
+date:
+---
+
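+1. Angle or distance: which one is it, exactly?
+2. Horizontal and vertical: what do they refer to?
+3. Per-person error, and differences between people
+4. Group by features, splitting into separate correction curves
+5. Left and right eyes are consistent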
--- a/content/computer_sci/MOC.md
+++ b/content/computer_sci/MOC.md
@ -17,4 +17,6 @@ date: 2024-05-21

 * [Multi-Processing - MOC](computer_sci/multiProcessing/MOC.md)

-* [Computational Geometry - MOC](computer_sci/computational_geometry/MOC.md)
+* [Computational Geometry - MOC](computer_sci/computational_geometry/MOC.md)
+
+* [Interview MOC](computer_sci/interview/interview_MOC.md)
--- a/content/computer_sci/deep_learning_and_machine_learning/LLM/basic/precision/LLM_precision.md
+++ b/content/computer_sci/deep_learning_and_machine_learning/LLM/basic/precision/LLM_precision.md
@ -0,0 +1,21 @@
+---
+title: LLM Precision About
+tags:
+  - LLM
+date: 2024-09-26
+---
+# Default Precision
+
+In conventional scientific computing, we typically use 64-bit floats for higher precision. When training deep neural networks on a GPU, we typically use a lower-than-maximum precision, namely 32-bit floating-point operations. PyTorch uses 32-bit floats by default.
+
+Reasons deep learning uses 32-bit precision:
+
+* 64-bit precision is unnecessary and computationally expensive
+* GPUs are not optimized for 64-bit precision
+
+**32-bit floating point operations have become the standard for training deep neural networks on GPUs.**
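To make the precision gap concrete, here is a minimal sketch that rounds a 64-bit value to 32 bits and back. It uses only Python's standard `struct` module rather than PyTorch, and the helper name `to_float32` is an illustrative assumption, not something from the note.

```python
import struct

def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]

x64 = 0.1               # Python floats are 64-bit (IEEE 754 double)
x32 = to_float32(x64)   # the same value after a round trip through 32 bits

print(f"64-bit: {x64:.20f}")
print(f"32-bit: {x32:.20f}")
print("bytes per value:", struct.calcsize("d"), "vs", struct.calcsize("f"))

# Relative rounding error introduced by storing the value in 32 bits.
rel_err = abs(x32 - x64) / x64
print(f"relative rounding error: {rel_err:.2e}")
```

A 32-bit float keeps about 7 significant decimal digits, against about 16 for a 64-bit float, and takes half the memory per value.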
+
+
+# Reference
+
+[1] Raschka, Sebastian. “Accelerating Large Language Models with Mixed-Precision Techniques.” _Sebastian Raschka, PhD_, 11 May 2023, https://sebastianraschka.com/blog/2023/llm-mixed-precision-copy.html.
--- a/content/computer_sci/deep_learning_and_machine_learning/Trick/quantile_loss.md
+++ b/content/computer_sci/deep_learning_and_machine_learning/Trick/quantile_loss.md
@ -13,7 +13,7 @ Quantile loss measures the difference between the predicted and target distributions, especially suited

 # What is quantile

-[quantile_concept](math/Statistics/basic_concepot/quantile_concept.md)
+[quantile_concept](math/statistic/basic_concepot/quantile_concept.md)

 # What is a prediction interval

--- a/content/computer_sci/interview/interview_MOC.md
+++ b/content/computer_sci/interview/interview_MOC.md
@ -0,0 +1,8 @@
+---
+title: CS interview MOC
+tags:
+  - MOC
+  - cs
+date: 2024-09-29
+---
+* [machine learning interview](computer_sci/interview/machine_learning_interview.md)
--- a/content/computer_sci/interview/machine_learning_interview.md
+++ b/content/computer_sci/interview/machine_learning_interview.md
@ -0,0 +1,12 @@
+---
+title: Machine Learning Interview
+tags:
+  - machine-learning
+  - cs
+  - interview
+date: 2024-09-29
+---
+# Transformer
+
+## Attention formula
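The note leaves this section empty. For reference, the standard scaled dot-product attention used in the Transformer is $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(QK^\top / \sqrt{d_k}\right)V$. Below is a minimal pure-Python sketch of that formula for a single head with no masking. The function names and the list-of-lists matrix representation are assumptions made for illustration.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention; Q, K, V are lists of row vectors."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# Hypothetical toy input: one query, two keys, two values.
print(attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]]))
```

Each output row is a convex combination of the rows of `V`, weighted more heavily toward the values whose keys best match the query.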
+
--- a/content/math/MOC.md
+++ b/content/math/MOC.md
@ -10,11 +10,16 @@ date: 2023-12-03

 ## Basic Concept

-* [quantile_concept](math/Statistics/basic_concepot/quantile_concept.md)
+* [quantile_concept](math/statistic/basic_concepot/quantile_concept.md)

 ## Significance Test

-* [Basic about significance test](math/Statistics/significance_test/whats_the_significance_test.md)
+* [Basic about significance test](math/statistic/significance_test/whats_the_significance_test.md)
+
+## Anomaly Detection
+
+* [Z-Score](math/statistic/anomaly_detection/z_score.md)
+* [IQR](math/statistic/anomaly_detection/IQR.md)

 # Discrete mathematics

--- a/content/math/statistic/anomaly_detection/IQR.md
+++ b/content/math/statistic/anomaly_detection/IQR.md
@ -0,0 +1,54 @@
+---
+title: Interquartile Range
+tags:
+  - math
+  - statistics
+  - anomaly
+date: 2024-10-08
+---
+# What is IQR
+
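+**Interquartile Range**, abbreviated IQR.
+IQR-based anomaly detection is commonly used to detect outliers in data that are not normally distributed. It uses the quartiles of the data (Q1 and Q3) to identify and remove outliers, and it is better suited than the [Z-score](math/statistic/anomaly_detection/z_score.md) method to skewed or non-normal data.
+
+- **First quartile (Q1)**: the lower quartile, the value below which the lowest 25% of the data fall.
+- **Third quartile (Q3)**: the upper quartile, the value above which the highest 25% of the data fall.
+- **Interquartile range (IQR)**: the difference between Q3 and Q1, computed as:  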
+$$
+IQR = Q3 - Q1
+$$
+
+# Algorithm Detail
+
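+1. **Sort the data**:
+    
+    - Sort the data in ascending order.
+2. **Compute the quartiles**:
+    
+    - **Q1**: the value at the 25% position of the sorted data.
+    - **Q3**: the value at the 75% position of the sorted data.
+3. **Compute the interquartile range**:
+    
+    - IQR = Q3 - Q1, which measures the spread of the middle 50% of the data.
+4. **Set the lower and upper bounds**:
+    
+    - Define a **lower bound** and an **upper bound** for identifying outliers.
+    - **Lower bound** = Q1 - 1.5 × IQR
+    - **Upper bound** = Q3 + 1.5 × IQR
+    - The 1.5 × IQR multiplier is a common rule of thumb. It can be changed to another multiple (such as 2 or 3) depending on the application.
+5. **Detect outliers**:
+    
+    - Any data point below the lower bound or above the upper bound is considered an outlier.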
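The five steps above can be sketched with Python's standard library. This is a minimal illustration, not the note author's implementation: the function name `iqr_outliers`, the sample data, and the choice of `method="inclusive"` for the quartiles are assumptions. Different quantile conventions give slightly different Q1 and Q3.

```python
from statistics import quantiles

def iqr_outliers(data, k=1.5):
    """Return (lower, upper, outliers) using bounds at k * IQR."""
    # quantiles() sorts the data internally and returns [Q1, Q2, Q3] for n=4.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    outliers = [x for x in data if x < lower or x > upper]
    return lower, upper, outliers

# Hypothetical data: ordinary readings plus one extreme value.
sample = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
lower, upper, outliers = iqr_outliers(sample)
print(lower, upper, outliers)  # 9.375 16.375 [102]
```

Passing a larger `k` (for example `k=3`) widens the bounds and flags only more extreme points.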
+
+
+# Pros and Cons
+
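+### Pros:
+
+- **Distribution-free**: the IQR method does not assume the data are normally distributed, so it suits skewed or asymmetric data.
+- **Robust to extreme values**: unlike Z-score, IQR is largely unaffected by extreme values, because it relies on quartiles rather than on the mean and standard deviation.
+
+### Cons:
+
+- **Less efficient on large datasets**: computing the quartiles and IQR on a large dataset can be time-consuming.
+- **Sensitivity near the bounds**: although IQR effectively identifies extreme outliers, it may over-flag edge data points that lie close to the lower and upper bounds.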
--- a/content/math/statistic/anomaly_detection/z_score.md
+++ b/content/math/statistic/anomaly_detection/z_score.md
@ -0,0 +1,30 @@
+---
+title: Z-score
+tags:
+  - math
+  - statistics
+date: 2024-10-08
+---
+# What is Z-score
+
+$$
+z = \frac{X-\mu}{\sigma}
+$$
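+* $X$: a single data point
+* $\mu$: the population mean
+* $\sigma$: the population standard deviation
+
+With this formula, the Z-score expresses how many standard deviations a data point lies from the mean. Specifically:
+
+- A Z-score of 0 means the data point equals the mean.
+- A Z-score between -1 and +1 means the data point lies within one standard deviation of the mean.
+- A data point with a Z-score beyond ±3 (that is, $|z| > 3$) is usually treated as an outlier.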
+
+
+# Pros and Cons
+
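+The Z-score concept is straightforward and quick to deploy.
+
+It is called the Z-score because **the symbol Z comes from the normal distribution**. In statistics, the standard normal distribution is the special normal distribution with mean 0 and standard deviation 1, and it is usually denoted by the letter **Z**.
+
+For the same reason, Z-score is usually applied to data that are approximately normally distributed, and it depends on that assumption. It is also sensitive to extreme values: the mean and standard deviation are easily distorted by them, which can lead to misjudgments.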
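A minimal sketch of Z-score outlier detection, using only Python's standard library. The function names and both datasets are hypothetical. The first dataset shows the sensitivity described above: a single extreme value inflates the standard deviation enough to keep its own Z-score below 3.

```python
from statistics import mean, pstdev

def z_scores(data):
    """Z-score of every point; assumes the data are not all identical."""
    mu = mean(data)
    sigma = pstdev(data)  # population standard deviation, as in the formula
    return [(x - mu) / sigma for x in data]

def z_outliers(data, threshold=3.0):
    """Points whose |z| exceeds the threshold (3 is the usual choice)."""
    return [x for x, z in zip(data, z_scores(data)) if abs(z) > threshold]

# Hypothetical data: ordinary readings plus one extreme value (102).
small = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
large = [10, 12, 12, 13, 12, 11, 14, 13, 15, 12] * 2 + [102]

print(z_outliers(small))  # []    : 102 inflates sigma, so its |z| stays below 3
print(z_outliers(large))  # [102] : with more ordinary points, 102 stands out
```

With only 10 points and a population standard deviation, no Z-score can exceed $\sqrt{n-1} = 3$, so the `small` case can never flag anything at the usual threshold.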
--- a/content/math/Statistics/basic_concepot/distribution/attachments/2bbb645362366906ace3296d35612625_720.jpg
+++ b/content/math/Statistics/basic_concepot/distribution/attachments/2bbb645362366906ace3296d35612625_720.jpg
--- a/content/math/Statistics/basic_concepot/distribution/attachments/df15541df80b6065fb8296d80ffceac5_720.jpg
+++ b/content/math/Statistics/basic_concepot/distribution/attachments/df15541df80b6065fb8296d80ffceac5_720.jpg
--- a/content/math/Statistics/basic_concepot/distribution/attachments/prove.jpg
+++ b/content/math/Statistics/basic_concepot/distribution/attachments/prove.jpg
--- a/content/math/Statistics/basic_concepot/distribution/beta_binomial.md
+++ b/content/math/Statistics/basic_concepot/distribution/beta_binomial.md
--- a/content/math/Statistics/basic_concepot/distribution/exponential_distribution_and_poisson_distribution.md
+++ b/content/math/Statistics/basic_concepot/distribution/exponential_distribution_and_poisson_distribution.md
@ -63,7 +63,7 @@ $$

 # Deduction

-![](math/Statistics/basic_concepot/distribution/attachments/2bbb645362366906ace3296d35612625_720.jpg)
+![](math/statistic/basic_concepot/distribution/attachments/2bbb645362366906ace3296d35612625_720.jpg)

 # Reference

--- a/content/math/Statistics/basic_concepot/distribution/gamma_distribution.md
+++ b/content/math/Statistics/basic_concepot/distribution/gamma_distribution.md
@ -35,7 +35,7 @@ $$
 $$
 The proof is as follows:

-![](math/Statistics/basic_concepot/distribution/attachments/prove.jpg)
+![](math/statistic/basic_concepot/distribution/attachments/prove.jpg)

 Also, at integer points, the Gamma function coincides with the factorial, namely:

@ -45,7 +45,7 @@ $$

 The proof is as follows:

-![](math/Statistics/basic_concepot/distribution/attachments/df15541df80b6065fb8296d80ffceac5_720.jpg)
+![](math/statistic/basic_concepot/distribution/attachments/df15541df80b6065fb8296d80ffceac5_720.jpg)



@ -53,7 +53,7 @@ $$

 The Exponential Distribution describes the probability distribution of the waiting time between events in a Poisson process.

-Here's an explanation of the exponential distribution: [Exponential Distribution](math/Statistics/basic_concepot/distribution/exponential_distribution_and_poisson_distribution.md)
+Here's an explanation of the exponential distribution: [Exponential Distribution](math/statistic/basic_concepot/distribution/exponential_distribution_and_poisson_distribution.md)


 # Introduction
--- a/content/math/Statistics/basic_concepot/distribution/students_t_distribution.md
+++ b/content/math/Statistics/basic_concepot/distribution/students_t_distribution.md
--- a/content/math/Statistics/basic_concepot/quantile_concept.md
+++ b/content/math/Statistics/basic_concepot/quantile_concept.md
--- a/content/math/Statistics/game_theory/counterfactual_regret_minimization.md
+++ b/content/math/Statistics/game_theory/counterfactual_regret_minimization.md
--- a/content/math/Statistics/significance_test/attachments/Pasted
+++ b/content/math/Statistics/significance_test/attachments/Pasted
--- a/content/math/Statistics/significance_test/whats_the_significance_test.md
+++ b/content/math/Statistics/significance_test/whats_the_significance_test.md
@ -80,7 +80,7 @ The magnitude of the T value does not directly affect the reproducibility of the correlation. However, if we

 ### P-value

-![](math/Statistics/significance_test/attachments/Pasted%20image%2020240415174359.png)
+![](math/statistic/significance_test/attachments/Pasted%20image%2020240415174359.png)


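 The P-value, short for probability value, is an important concept in statistical hypothesis testing. **It helps us decide whether to reject the null hypothesis.** **The P-value measures the probability, assuming the null hypothesis is true, of observing a statistic (such as a T value or Z value) at least as extreme as the one actually observed.**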
--- a/content/math/Statistics/stochastic_process/attachments/6fd1795d98c9031bc791909a8d098e25.jpg
+++ b/content/math/Statistics/stochastic_process/attachments/6fd1795d98c9031bc791909a8d098e25.jpg
--- a/content/math/Statistics/stochastic_process/markov_chain.md
+++ b/content/math/Statistics/stochastic_process/markov_chain.md
@ -35,7 +35,7 @@ $$

 Consider an RPG with a 33% blitz rate, where if the first two attempts don't blitz, the third attempt is guaranteed to blitz. What is the actual blitz rate?

-![](math/Statistics/stochastic_process/attachments/6fd1795d98c9031bc791909a8d098e25.jpg)
+![](math/statistic/stochastic_process/attachments/6fd1795d98c9031bc791909a8d098e25.jpg)

 Simulation Code:

--- a/content/signal/signal_processing/algorithm/advanced_statistic/autocorrelation/autocorrelation.md
+++ b/content/signal/signal_processing/algorithm/advanced_statistic/autocorrelation/autocorrelation.md
@ -70,7 +70,7 @@ For correlation, we usually use **p-value** to **quantify the confidence** of th
 ![](signal/signal_processing/algorithm/advanced_statistic/autocorrelation/attachments/Pasted%20image%2020240415171855.png)


-About the P-value, you had better know what a [significance test](math/Statistics/significance_test/whats_the_significance_test.md) is
+About the P-value, you had better know what a [significance test](math/statistic/significance_test/whats_the_significance_test.md) is


 ## Random Signal