Add notes and change file structure
0
content/.trash/algorithm/Untitled.md
Normal file
12
content/.trash/tmp.md
Normal file
@ -0,0 +1,12 @@
|
||||
---
|
||||
title: tmp_note
|
||||
tags:
|
||||
- tmp_note
|
||||
date:
|
||||
---
|
||||
|
||||
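1. Angle or distance: which one is it actually?
2. Horizontal and vertical: what do they refer to?
3. Per-person error, and the differences between individuals
4. Group by feature, and fit a separate correction curve per group
5. Consistency between the left and right eyes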
|
||||
@ -18,3 +18,5 @@ date: 2024-05-21
|
||||
* [Multi-Processing - MOC](computer_sci/multiProcessing/MOC.md)
|
||||
|
||||
* [Computational Geometry - MOC](computer_sci/computational_geometry/MOC.md)
|
||||
|
||||
* [Interview MOC](computer_sci/interview/interview_MOC.md)
|
||||
@ -0,0 +1,21 @@
|
||||
---
|
||||
title: LLM Precision About
|
||||
tags:
|
||||
- LLM
|
||||
date: 2024-09-26
|
||||
---
|
||||
# Default Precision
|
||||
|
||||
In conventional scientific computing, we typically use 64-bit floats for higher precision. When training deep neural networks on a GPU, we typically use a lower-than-maximum precision, namely 32-bit floating-point operations. PyTorch uses 32-bit floats by default.
|
||||
|
||||
Reasons deep learning uses 32-bit precision:

* 64-bit precision is unnecessary and computationally expensive
* GPUs are not optimized for 64-bit precision
|
||||
|
||||
**32-bit floating point operations have become the standard for training deep neural networks on GPUs.**
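To make the precision gap concrete, here is a minimal sketch that rounds a 64-bit Python float to 32 bits and back. It uses only the standard library `struct` module rather than PyTorch, and the helper name `to_float32` is made up for illustration. In PyTorch itself, the default can be inspected with `torch.get_default_dtype()`.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


x = 0.1
x32 = to_float32(x)

# Python floats are 64-bit, so this difference is the precision lost
# by storing the same value in 32 bits.
err32 = abs(x32 - x)

print(f"float64 value: {x!r}")
print(f"float32 value: {x32!r}")
print(f"error introduced by 32-bit storage: {err32:.3e}")

# Storage cost per value in bytes: 32-bit halves the memory footprint.
size32 = struct.calcsize("<f")
size64 = struct.calcsize("<d")
print(f"bytes per value: float32={size32}, float64={size64}")
```

The error is tiny relative to the noise in gradient-based training, which is consistent with the note's point that 64-bit precision is usually unnecessary there.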
|
||||
|
||||
|
||||
# Reference
|
||||
|
||||
[1] Raschka, Sebastian. “Accelerating Large Language Models with Mixed-Precision Techniques.” _Sebastian Raschka, PhD_, 11 May 2023, https://sebastianraschka.com/blog/2023/llm-mixed-precision-copy.html.
|
||||
@ -13,7 +13,7 @@ Quantile loss measures the difference between the predicted distribution and the target distribution, and is especially suited
|
||||
|
||||
# What is quantile
|
||||
|
||||
[quantile_concept](math/Statistics/basic_concepot/quantile_concept.md)
|
||||
[quantile_concept](math/statistic/basic_concepot/quantile_concept.md)
|
||||
|
||||
# What is a prediction interval
|
||||
|
||||
|
||||
8
content/computer_sci/interview/interview_MOC.md
Normal file
@ -0,0 +1,8 @@
|
||||
---
|
||||
title: CS interview MOC
|
||||
tags:
|
||||
- MOC
|
||||
- cs
|
||||
date: 2024-09-29
|
||||
---
|
||||
* [machine learning interview](computer_sci/interview/machine_learning_interview.md)
|
||||
12
content/computer_sci/interview/machine_learning_interview.md
Normal file
@ -0,0 +1,12 @@
|
||||
---
|
||||
title: Machine Learning Interview
|
||||
tags:
|
||||
- machine-learning
|
||||
- cs
|
||||
- interview
|
||||
date: 2024-09-29
|
||||
---
|
||||
# Transformer
|
||||
|
||||
## Attention Formula
|
||||
|
||||
@ -10,11 +10,16 @@ date: 2023-12-03
|
||||
|
||||
## Basic Concept
|
||||
|
||||
* [quantile_concept](math/Statistics/basic_concepot/quantile_concept.md)
|
||||
* [quantile_concept](math/statistic/basic_concepot/quantile_concept.md)
|
||||
|
||||
## Significance Test
|
||||
|
||||
* [Basic about significance test](math/Statistics/significance_test/whats_the_significance_test.md)
|
||||
* [Basic about significance test](math/statistic/significance_test/whats_the_significance_test.md)
|
||||
|
||||
## Anomaly Detection
|
||||
|
||||
* [Z-Score](math/statistic/anomaly_detection/z_score.md)
|
||||
* [IQR](math/statistic/anomaly_detection/IQR.md)
|
||||
|
||||
# Discrete mathematics
|
||||
|
||||
|
||||
54
content/math/statistic/anomaly_detection/IQR.md
Normal file
@ -0,0 +1,54 @@
|
||||
---
|
||||
title: Interquartile Range
|
||||
tags:
|
||||
- math
|
||||
- statistics
|
||||
- anomaly
|
||||
date: 2024-10-08
|
||||
---
|
||||
# What is IQR
|
||||
|
||||
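**Interquartile Range** (IQR) measures the spread of the middle 50% of the data.

IQR-based anomaly detection is commonly used to find outliers in data that is not normally distributed. It uses the quartiles (Q1 and Q3) to identify and remove outliers, and it is better suited than the [Z-score](math/statistic/anomaly_detection/z_score.md) method to skewed or non-normal data.

- **First quartile (Q1)**: the lower quartile, the value below which the smallest 25% of the data lie.
- **Third quartile (Q3)**: the upper quartile, the value above which the largest 25% of the data lie.
- **Interquartile range (IQR)**: the difference between Q3 and Q1:

$$
IQR = Q3 - Q1
$$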
|
||||
|
||||
# Algorithm Detail
|
||||
|
||||
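1. **Sort the data**:

    - Sort the data in ascending order.
2. **Compute the quartiles**:

    - **Q1**: the value at the 25% position of the sorted data.
    - **Q3**: the value at the 75% position of the sorted data.
3. **Compute the interquartile range**:

    - IQR = Q3 - Q1, the spread of the middle part of the data.
4. **Set the lower and upper bounds**:

    - Define a **lower bound** and an **upper bound** used to decide whether a point is an outlier.
    - **Lower bound** = Q1 - 1.5 × IQR
    - **Upper bound** = Q3 + 1.5 × IQR
    - The factor 1.5 is a common rule of thumb. It can be changed to another multiple (such as 2 or 3) depending on the application.
5. **Detect outliers**:

    - Any data point below the lower bound or above the upper bound is considered an outlier.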
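The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `iqr_outliers` and the sample data are made up, and the quartiles come from `statistics.quantiles` with `method="inclusive"`, which is only one of several common quartile conventions (other conventions give slightly different bounds).

```python
import statistics


def iqr_outliers(data, k=1.5):
    """Return (lower, upper, outliers) using the Q1 - k*IQR / Q3 + k*IQR rule."""
    # Steps 1-2: statistics.quantiles sorts internally; n=4 yields Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    # Step 3: interquartile range.
    iqr = q3 - q1
    # Step 4: bounds, with k = 1.5 as the usual rule of thumb.
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    # Step 5: anything outside [lower, upper] is flagged.
    outliers = [x for x in data if x < lower or x > upper]
    return lower, upper, outliers


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # made-up data with one extreme value
lower, upper, outliers = iqr_outliers(sample)
print(f"bounds: [{lower}, {upper}], outliers: {outliers}")
```

Raising `k` (for example to 3) widens the bounds and flags fewer points, which is the adjustment step 4 describes.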
|
||||
|
||||
|
||||
# Pros and Cons
|
||||
|
||||
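### Pros

- **No distributional assumption**: the IQR method does not assume the data is normally distributed, so it suits skewed or asymmetric data.
- **Robust to extreme values**: unlike the Z-score, the IQR is barely affected by extreme values, because it relies on quartiles rather than on the mean and standard deviation.

### Cons

- **Less efficient on large datasets**: computing quartiles and the IQR on a large dataset can be time-consuming.
- **Sensitivity near the bounds**: the IQR rule identifies extreme outliers well, but it may over-flag borderline points that sit close to the lower or upper bound.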
|
||||
30
content/math/statistic/anomaly_detection/z_score.md
Normal file
@ -0,0 +1,30 @@
|
||||
---
|
||||
title: Z-score
|
||||
tags:
|
||||
- math
|
||||
- statistics
|
||||
date: 2024-10-08
|
||||
---
|
||||
# What is Z-score
|
||||
|
||||
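$$
z = \frac{X-\mu}{\sigma}
$$

* $X$: a single data point
* $\mu$: the population mean
* $\sigma$: the population standard deviation

With this formula, the Z-score expresses how many standard deviations a data point lies from the mean. Specifically:

- A Z-score of 0 means the data point equals the mean.
- A Z-score between -1 and +1 means the data point lies within one standard deviation of the mean.
- A Z-score beyond ±3 (that is, $|z| > 3$) is usually treated as an outlier.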
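The rule above can be sketched as follows. This is a minimal illustration: the function name `z_score_outliers` and the sample data are made up, and the population standard deviation (`statistics.pstdev`) is used to match the definition of $\sigma$ above.

```python
import statistics


def z_score_outliers(data, threshold=3.0):
    """Return the points whose |z| exceeds `threshold`."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)  # population standard deviation
    if sigma == 0:
        # All values are identical, so no point deviates from the mean.
        return []
    return [x for x in data if abs((x - mu) / sigma) > threshold]


# Made-up data: 19 values near 10 and one extreme value.
sample = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10,
          11, 9, 10, 12, 10, 9, 11, 10, 10, 100]
z_outliers = z_score_outliers(sample)
print(z_outliers)
```

Note that with very few data points no value can reach $|z| > 3$, because a single extreme point also inflates $\sigma$; the threshold only becomes meaningful with a reasonably large sample.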
|
||||
|
||||
|
||||
# Pros and Cons
|
||||
|
||||
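The Z-score concept is straightforward and quick to deploy.

The name Z-score comes from the normal distribution: **the symbol Z originates from the standard normal distribution**. In statistics, the standard normal distribution is the special normal distribution with mean 0 and standard deviation 1, and it is usually denoted by the letter **Z**.

For the same reason, the Z-score is typically applied to data that is approximately normally distributed, and it depends on that assumption. It is also sensitive to extreme values: the mean and standard deviation are themselves easily distorted by extremes, which can lead to misjudgments.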
|
||||
@ -63,7 +63,7 @@ $$
|
||||
|
||||
# Deduction
|
||||
|
||||

|
||||

|
||||
|
||||
# Reference
|
||||
|
||||
@ -35,7 +35,7 @@ $$
|
||||
$$
|
||||
The proof is as follows:
|
||||
|
||||

|
||||

|
||||
|
||||
Also, at integer points, the Gamma function corresponds to the factorial, that is:
|
||||
|
||||
@ -45,7 +45,7 @@ $$
|
||||
|
||||
The proof is as follows:
|
||||
|
||||

|
||||

|
||||
|
||||
|
||||
|
||||
@ -53,7 +53,7 @@ $$
|
||||
|
||||
The Exponential Distribution describes the probability distribution of the waiting time between events in a Poisson process.
|
||||
|
||||
Here's an explanation of the exponential distribution: [Exponential Distribution](math/Statistics/basic_concepot/distribution/exponential_distribution_and_poisson_distribution.md)
|
||||
Here's an explanation of the exponential distribution: [Exponential Distribution](math/statistic/basic_concepot/distribution/exponential_distribution_and_poisson_distribution.md)
|
||||
|
||||
|
||||
# Introduction
|
||||
@ -80,7 +80,7 @@ The magnitude of the T value does not directly affect the reproducibility of the correlation. However, if we
|
||||
|
||||
### P-value
|
||||
|
||||

|
||||

|
||||
|
||||
|
||||
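The P-value, short for probability value, is an important concept in statistical hypothesis testing. **It helps us decide whether to reject the null hypothesis.** **The P-value measures the probability, assuming the null hypothesis is true, of observing a test statistic (such as a T value or Z value) at least as extreme as the one actually observed.**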
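As a small worked example of this definition, the sketch below computes a two-sided P-value for a Z statistic under the standard normal distribution. It assumes a Z-test setting (the note also mentions T values, which would need the t-distribution instead), and the function name `two_sided_p_value` is made up for illustration.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """P(|Z| >= |z|) when Z follows the standard normal distribution."""
    # 1 - cdf(|z|) is the upper-tail probability; doubling it covers both tails.
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


p = two_sided_p_value(1.96)
print(f"two-sided p-value for z = 1.96: {p:.4f}")
```

A statistic of about 1.96 sits right at the conventional 0.05 threshold, which is why that value appears so often in two-sided tests.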
|
||||
@ -35,7 +35,7 @@ $$
|
||||
|
||||
Consider an RPG with a 33% blitz rate and one extra rule: if two attempts in a row fail to blitz, the third attempt is guaranteed to blitz. What is the actual long-run blitz rate?
|
||||
|
||||

|
||||

|
||||
|
||||
Simulation Code:
|
||||
|
||||
@ -70,7 +70,7 @@ For correlation, we usually use **p-value** to **quantify the confidence** of th
|
||||

|
||||
|
||||
|
||||
Before working with p-values, you had better understand what a [significance test](math/Statistics/significance_test/whats_the_significance_test.md) is.
|
||||
Before working with p-values, you had better understand what a [significance test](math/statistic/significance_test/whats_the_significance_test.md) is.
|
||||
|
||||
|
||||
## Random Signal
|
||||
|
||||