Finally reconstruct content to repo

This commit is contained in:
PinkR1ver 2024-02-28 16:54:19 +08:00
parent ab3940c95f
commit 01b98bc210
756 changed files with 11128 additions and 1 deletions

@ -0,0 +1,8 @@
---
title: Homework 1, student's Problem
tags:
- work-about
---
# 0780

@ -0,0 +1,7 @@
---
title: A log for test
tags:
- log
date: 2024-02-28
---
A log for test

@ -0,0 +1,8 @@
---
title: Problem
tags:
- research-about
---
* UWB signal emission
* S-parameters -> frequency spectrum
* Understand why the VNA doesn't provide a time-domain view (circuit perspective)

content/.trash/UWB.md Normal file
@ -0,0 +1 @@
{}

@ -0,0 +1,8 @@
---
title: Intermediate Frequency Bandwidth
tags:
- equipment
- VNA
- research-about
---


@ -0,0 +1,6 @@
---
title: Equipment Research MOC
tags:
- MOC
---

@ -0,0 +1,78 @@
---
title: UWB signal characterization experiment by VNA demo
tags:
- experiment
---
# Experiment Graph Overview
![](research_career/attachments/Untitled-1.png)
In this experiment, we use VNA port 1 to transmit the signal and port 2 to receive the signal reflected by the reflecting medium, such as burned tissue.
The VNA gives us the scattering parameters, which we can analyze in the frequency spectrum, including both amplitude information and phase information.
# Experiment Explanations
## What is VNA
![](research_career/attachments/Pasted%20image%2020231016082202.png)
A Vector Network Analyzer (VNA) is a sophisticated electronic instrument used in the field of radio frequency (RF) and microwave engineering. Its primary function is to measure and characterize the electrical behavior of high-frequency components, such as antennas, cables, passive RF devices like filters, and active devices like amplifiers. VNAs are essential tools in the design, testing, and maintenance of RF and microwave systems.
## Ejecting Signal
The VNA generates an "ejecting signal," also known as the "incident signal" or "test signal," which is usually a continuous-wave (CW) or narrowband signal. In our VNA, it is a CW signal stepped through discrete frequencies across a specified range.
I will call this signal, swept across discrete frequencies in a specified range, the **frequency-sweeping signal**.
**So we cannot transmit a UWB signal directly from the VNA.**
Here are two possible solutions:
1. Modulate our sweep signal into a UWB signal.
* I have already found a way to modulate a chirp signal into a UWB signal, here:
[Chirp BOK BPSK.pdf](https://pinktalk.online/research_career/attachments/CN101267424A.pdf)
* Not sure whether our frequency-sweeping signal can be modulated into a UWB signal.
2. Use our frequency-sweeping signal directly.
* Although we do not transmit a UWB signal directly, our transmitted signal still contains discrete frequencies, which are the components of a UWB signal. This way, we can analyze the different frequency components of UWB separately.
* Speculatively, we guess that the phase information of the high-frequency part can provide range detection, while the amplitude information of the low-frequency part relates to the reflecting medium.
In this experiment demo, we use solution 2.
## Data we get
In this experiment, we can only get the scattering parameters S11 and S12.
![](research_career/attachments/Pasted%20image%2020231016091540.png)
Using these parameters, and treating the incident signal as constant, we can obtain frequency-spectrum information, even though the stimulus is not a UWB signal.
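Although the VNA reports S-parameters only at discrete swept frequencies, a time-domain view can still be synthesized offline with an inverse FFT of the sweep (roughly what commercial VNA "time-domain" options do). Below is a minimal sketch with toy data; the 201-point grid and the 1 ns delay are hypothetical values, not measurements from our E5063A:

```python
import numpy as np

# Toy swept measurement (hypothetical values): S21 of a pure 1 ns delay line,
# sampled at 201 discrete frequencies from 0 to 6.5 GHz.
n_pts, f_max, tau = 201, 6.5e9, 1e-9
freqs = np.linspace(0.0, f_max, n_pts)
s21 = np.exp(-2j * np.pi * freqs * tau)

# Inverse real FFT turns the one-sided frequency sweep into a time response.
m = 4096                         # zero-pad for a smoother trace
h = np.fft.irfft(s21, n=m)
df = freqs[1] - freqs[0]
t = np.arange(m) / (m * df)      # time axis; the trace spans 1/df seconds

peak_delay = t[np.argmax(np.abs(h))]
print(peak_delay)                # peak near 1 ns: the delay is recovered
```

With real data, `s21` would be replaced by the exported trace; windowing the sweep before the inverse FFT reduces ringing from the band edges.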
# Experiment Step
## 1. Set up Experiment equipment
* Prepare the experimental apparatus
* VNA, Keysight E5063A 100kHz - 6.5GHz
* UWB antennas
* N, male - SMA male
* VNA calibration
* Connect the UWB antennas to VNA ports 1 and 2
## 2. Set reference data
* In the antenna's near field and far field, e.g. at 20 cm and 40 cm, place a specified medium and record the S11 and S12 trace data.
* These two measurements will be our reference. Later, other S11 and S12 traces can be compared against them to infer range and material.
## 3. Data collection
Collect S11 and S12 traces for different reflecting mediums and different distances between the antennas and the medium.
## 4. Data analysis
We want to relate our data to the distance and the reflecting medium, to show that UWB has the potential to detect the degree of tissue burn.

@ -0,0 +1,8 @@
---
title: Log 2023.07.06 - 路过人间,谁有意见
tags:
- log
- music
---
![](文学/log/2023/7/attachments/7JEC(63A65[8JFI[G6O`IIK_tmb.jpg)

@ -0,0 +1,7 @@
---
title: Log 2023.09.11 - Get some interesting blog here
tags:
- log
- front-end
---
* [_Building a Frontend Framework; Reactivity and Composability With Zero Dependencies_. https://18alan.space/posts/how-hard-is-it-to-build-a-frontend-framework.html. Accessed 11 Sept. 2023.](https://18alan.space/posts/how-hard-is-it-to-build-a-frontend-framework.html)

@ -0,0 +1,8 @@
---
title: Log 2023.09.18 - A Normal Learning Day
tags:
- log
---
* learned about the Khmer Rouge (红色高棉) on Wikipedia
* tried [TXYZ](https://txyz.ai/) for reading papers

@ -0,0 +1,26 @@
---
title: Spectrum Analyzer
tags:
- equipment
- research-about
---
# What is spectrum analyzer?
A spectrum analyzer measures the power spectrum of an input signal.
# Spectrum Analyzer for UWB
## Papers
### Measurements of UWB through-the-wall propagation using spectrum analyzer and the Hilbert transform
[*pdf* - Measurements of UWB through-the-wall propagation using spectrum analyzer and the Hilbert transform](https://pinktalk.online/equipment_research/attachments/mop.23107.pdf)
![Architect](signal_processing/equipment/attachments/Pasted%20image%2020230918104114.png)
# Reference
* [_Understanding Basic Spectrum Analyzer Operation_. _www.youtube.com_, https://www.youtube.com/watch?v=P5gxNGckjLc. Accessed 13 Sept. 2023.](https://pinktalk.online/%E6%96%87%E5%AD%A6/%E5%8F%A5%E5%AD%90/Feeling/)
* https://www.bilibili.com/video/BV1kG4y1q72V/?spm_id_from=333.337.search-card.all.click&vd_source=c47136abc78922800b17d6ce79d6e19f

content/_index.md Normal file

content/atlas.md Normal file

@ -0,0 +1,62 @@
---
title: Atlas - Map of Maps
tags:
- MOC
---
🚧 These are notebooks about my research career:
* [Deep Learning & Machine Learning](computer_sci/deep_learning_and_machine_learning/deep_learning_MOC.md)
* [[synthetic_aperture_radar_imaging/SAR_MOC| Synthetic Aperture Radar(SAR) Imaging]]
💻 Also, my research needs some basic science to support it:
* [Data Structure and Algorithm MOC](computer_sci/data_structure_and_algorithm/MOC.md)
* [Hardware](computer_sci/hardware/hardware_MOC.md)
* [Physics](physics/physics_MOC.md)
* [Signal Processing](signal_processing/signal_processing_MOC.md)
* [Data Science](data_sci/data_sci_MOC.md)
* [About coding language design detail](computer_sci/coding_knowledge/coding_lang_MOC.md)
* [Math](math/MOC.md)
* [Computational Geometry](computer_sci/computational_geometry/MOC.md)
* [Code Framework Learn](computer_sci/code_frame_learn/MOC.md)
🦺 I also need some tools to help me:
* [Git](toolkit/git/git_MOC.md)
💻 Code Practice:
* [💽Programming Problem Solution Record](https://github.com/PinkR1ver/JudeW-Problemset)
🛶 Also, I keep notes on my hobbies:
* [📷 Photography](photography/photography_MOC.md)
* [📮文学](文学/文学_MOC.md)
* [🥐Food](food/MOC.md)
* [🎬Watching List](https://pinkr1ver.notion.site/5e136466f3664ff1aaaa75b85446e5b4?v=a41efbce52a84f7aa89d8f649f4620f6&pvs=4)
⭐ Find my recent study here:
* [Recent notes (this function cannot be used on web)](recent.md)
* [Papers Recently Read](research_career/papers_read.md)
🎏 I also have some plans in mind:
* [Life List🚀](plan/life.md)
☁️ I also have some daily thoughts:
* [Logs](log/log_MOC.md)

@ -0,0 +1,10 @@
---
title: Code Framework Learn
tags:
- web
- code_tool
---
# Web Framework
* [Flask](computer_sci/code_frame_learn/flask/MOC.md)

@ -0,0 +1,3 @@
---
title: Flask - MOC
---

@ -0,0 +1,18 @@
---
title: About coding language design detail
tags:
- basic
- coding-language
- MOC
---
# Python
[Why python doesn't need pointer?](computer_sci/coding_knowledge/python/python_doesnt_need_pointer.md)
# C
# MATLAB
# JavaScript

@ -0,0 +1,8 @@
---
title: Matplotlib Backend Review
tags:
- python
- code
- matplotlib
---

@ -0,0 +1,116 @@
---
title: Why python doesn't need pointer?
tags:
- python
- coding-language
- basic
---
Python doesn't require the explicit use of pointers like C because of its **underlying memory management** and **object model**.
# Design Concept
## Underlying memory management
In Python, variables are *references to objects rather than memory addresses* like pointers in C. When you assign a value to a variable in Python, you are actually creating a reference to an object in memory. This reference allows you to access and manipulate the object, but you don't need to manage memory explicitly.
Python uses automatic memory management through a mechanism called **garbage collection**. It keeps track of objects in memory and automatically deallocates memory for objects that are no longer referenced or used. **This automatic memory management frees developers from the responsibility of explicitly allocating and deallocating memory using pointers.**
## Object model
In Python, everything is an object, and every object contains at least three pieces of data:
* Reference count
* Type
* Value
The reference count is an interesting Python concept, designed for memory management.
> [!help]
> Reference count refers to the number of references to an object. Each object in Python contains a reference count, which is a count of how many references or variables are currently pointing to that object.
>
> The reference count mechanism is part of Python's memory management system. When an object is created or assigned to a variable, the reference count of that object is incremented. When a reference to an object is deleted or reassigned, the reference count is decremented. When the reference count of an object reaches zero, it means that there are no more references to that object, and the memory occupied by the object can be deallocated.
>
> The reference-count mechanism helps Python build its **garbage collection** memory management.
Python's object model also provides additional benefits. For example, *objects in Python can have different types, and variables can be dynamically reassigned to different objects of different types without any explicit type declarations or memory management.*
# Example
Generated by [ChatGPT](https://chat.openai.com/):
**Example about swap**
---
Code:
```python
def swap_variables(a, b):
return b, a
# Example usage
x = 10
y = 20
x, y = swap_variables(x, y)
print("x =", x)
print("y =", y)
```
Let's go through the example step by step to understand how memory management and the reference system work in Python when swapping two variables. (The reference counts below are idealized; in CPython, small integers like `10` and `20` are cached and shared, so their real counts are much larger.)
Step 1: Variable assignment
```python
x = 10
y = 20
```
In this step, two variables `x` and `y` are created and assigned the values `10` and `20`, respectively. At this point, **the reference count for the objects `10` and `20` is 1** each because both `x` and `y` reference these objects.
Step 2: Calling the `swap_variables` function
```python
x, y = swap_variables(x, y)
```
Here, the `swap_variables` function is called with the current values of `x` and `y` as arguments. The function returns a tuple containing the swapped values, which are then assigned to `x` and `y`. At this stage, the memory and reference counts change as follows:
- The `swap_variables` function receives references to the objects `10` and `20` as arguments. *Argument passing temporarily adds references, but the function creates no new integer objects.*
- The `swap_variables` function returns the tuple `(20, 10)`. *This creates a new tuple object in memory, and the tuple holds one reference to each of `20` and `10`*, in addition to the references soon held by `x` and `y`.
Step 3: Unpacking the returned tuple
```python
x, y = swap_variables(x, y)  # unpacks the returned tuple (20, 10)
```
The parallel assignment unpacks the returned tuple: `x` now refers to the object `20`, and `y` refers to the object `10`. While the tuple is alive, each of `10` and `20` is referenced both by the tuple and by a variable.
Step 4: Printing the swapped values
```python
print("x =", x)
print("y =", y)
```
This step simply prints the values of `x` and `y`. The output will be:
```
x = 20
y = 10
```
Step 5: Memory deallocation
As soon as the unpacking in Step 3 completes, the tuple returned by `swap_variables` is no longer referenced anywhere. Its reference count drops to zero, and CPython frees it immediately through reference counting (the cyclic garbage collector is only needed for reference cycles). Freeing the tuple decrements the reference counts of `10` and `20`, which live on because `x` and `y` still refer to them.
In summary, Python's memory management and reference system handles the creation, reference counting, and deallocation of objects automatically. The variables `x` and `y` act as references to the objects `10` and `20`, and swapping their values with parallel assignment involves no explicit memory management or pointer manipulation.
---
## Trick
> [!quote]
>You can use `getrefcount` to get reference count of the object. *The count returned is generally one higher than you might expect, because it includes the (temporary) reference as an argument to* `getrefcount()`.
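A quick illustration (a minimal sketch; the exact counts are CPython-specific):

```python
import sys

x = []                       # a fresh list with one named reference
print(sys.getrefcount(x))    # typically 2: `x` plus the temporary argument

y = x                        # a second reference to the same list
print(sys.getrefcount(x))    # one higher than before
```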
# Conclusion
Overall, Python's design philosophy aims to prioritize simplicity and readability while abstracting away low-level memory management concerns, making it easier and more convenient to work with compared to languages like C that require explicit pointer manipulation.
# Reference
* ChatGPT - Prompt: Why python don't need pointer

@ -0,0 +1,11 @@
---
title: Computational Geometry MOC
tags:
- math
- MOC
- geometry
---
# 3D Geometry Algorithm
* [Delaunay Triangulation](computer_sci/computational_geometry/delaunay_triangulation.md)

@ -0,0 +1,14 @@
---
title: Delaunay Triangulation
tags:
- math
- geometry
---
# What is Delaunay Triangulation?
# Reference
* [_Delaunay Triangulation (1/5) | Computational Geometry - Lecture 08_. _www.youtube.com_, https://www.youtube.com/watch?v=6UsdvbiJx54. Accessed 4 Sept. 2023.](https://www.youtube.com/watch?v=6UsdvbiJx54)
* [_Delaunay Triangulation_. _www.youtube.com_, https://www.youtube.com/watch?v=GctAunEuHt4. Accessed 4 Sept. 2023.](https://www.youtube.com/watch?v=GctAunEuHt4)

@ -0,0 +1,24 @@
---
title: Data Structure and Algorithm MOC
tags:
- MOC
- algorithm
- data-structure
---
# Tree-like Structure
* [Fenwick Tree](computer_sci/data_structure_and_algorithm/tree/fenwick_tree.md)
* [Segment Tree](computer_sci/data_structure_and_algorithm/tree/segment_tree.md)
# Graph
## Algorithm
* [BFS](computer_sci/data_structure_and_algorithm/graph/BFS.md)
* [Topological Sorting](computer_sci/data_structure_and_algorithm/graph/topological_sorting.md)
* [Minimum Spanning Tree](computer_sci/data_structure_and_algorithm/graph/MST.md)
## Type of graph
* [Spanning Tree](computer_sci/data_structure_and_algorithm/graph/spanning_tree.md)

@ -0,0 +1,19 @@
---
title: Breadth First Search in Python
tags:
- data-structure
- basic
- algorithm
---
# Basic Concept
# Code Implementation
# Reference
* [_Breadth First Search Algorithm Explained (With Example and Code)_. _www.youtube.com_, https://www.youtube.com/watch?v=YtD2KGRdn3s. Accessed 19 July 2023.](https://www.youtube.com/watch?v=YtD2KGRdn3s&t=2s)

@ -0,0 +1,7 @@
---
title: Minimum Spanning Tree
tags:
- data-structure
- graph
---
Not now...

@ -0,0 +1,56 @@
---
title: Spanning Tree
tags:
- graph
- data-structure
---
# What is Spanning Tree?
Adding one more edge to a tree, so that it contains a cycle, gives what is called a **pseudotree (基环树)**.
Example:
![](computer_sci/data_structure_and_algorithm/graph/attachments/Pasted%20image%2020230915111826.png)
# Why do we need Spanning Tree
* **Network design**: Spanning trees are used to create efficient and redundant networks, such as in Ethernet networks or telecommunications.
* **Routing protocols**: Spanning trees are employed in protocols like Spanning Tree Protocol (STP) and Rapid Spanning Tree Protocol (RSTP) for loop prevention and redundancy in network switches.
* [Minimum Spanning Tree (MST)](computer_sci/data_structure_and_algorithm/graph/MST.md): Spanning trees can be used to find the minimum-weighted spanning tree in a weighted graph. This is particularly useful in **optimizing costs in transportation networks or electrical power distribution grids**.
* **Broadcast algorithms**: Spanning trees are used in broadcasting messages or data packets efficiently within a network, ensuring that each node receives the message exactly once.
> [!summary]
> Spanning trees provide **a simplified view of the graph**, which **eliminates unnecessary edges** while **preserving connectivity**. This simplification helps in various graph-related algorithms, network design, and optimization problems.
# More about Spanning Tree
## Inward Spanning Tree, 内向基环树
This is an extension of the pseudotree concept. I could not find much English material on it, but Chinese sources and ChatGPT agree on what the term means.
![](computer_sci/data_structure_and_algorithm/graph/attachments/Pasted%20image%2020230915114049.png)
An inward pseudotree has a structure similar to a pseudotree, but in a directed graph in which every node has exactly one outgoing edge, i.e. **every node's out-degree = 1**; *this is the definition of "inward"* (all edges point inward toward the cycle).
A key property of an inward pseudotree is that you can run a BFS from all nodes with in-degree = 0, peeling inward until only the cycle remains; this lets you find the longest chain in the pseudotree.
For concrete code, see:
[*Leet Code* - 2127 Maximum Employees to Be invited to a Meeting](https://github.com/PinkR1ver/JudeW-Problemset/blob/master/Leetcode/2127.%20Maximum%20Employees%20to%20Be%20Invited%20to%20a%20Meeting/main_bfs.py)
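As a minimal illustration of this BFS peeling (my own sketch, not the linked solution; `nxt[i]` is node i's unique out-neighbor in the functional graph):

```python
from collections import deque

def longest_chain_per_node(nxt):
    """Peel in-degree-0 nodes with BFS. depth[i] = number of nodes on the
    longest chain ending at i; nodes never enqueued lie on cycles."""
    n = len(nxt)
    indeg = [0] * n
    for v in nxt:
        indeg[v] += 1
    depth = [1] * n
    q = deque(i for i in range(n) if indeg[i] == 0)
    while q:
        u = q.popleft()
        v = nxt[u]                      # the unique out-edge of u
        depth[v] = max(depth[v], depth[u] + 1)
        indeg[v] -= 1
        if indeg[v] == 0:               # v's chains are all resolved
            q.append(v)
    return depth
```

For example, `nxt = [1, 2, 0, 0]` is a 3-cycle 0→1→2→0 with node 3 feeding into node 0, so the longest chain ending at node 0 has 2 nodes.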
## Outward Spanning Tree外向基环树
Symmetrically, when every node's **in-degree = 1**, the edges give an "outward" feel, as shown:
![](computer_sci/data_structure_and_algorithm/graph/attachments/Pasted%20image%2020230919101645.png)
Concrete applications remain to be added.
# Reference
* [_$Note$-内向基环树 - AcWing_. https://www.acwing.com/blog/content/23513/. Accessed 15 Sept. 2023.](https://www.acwing.com/blog/content/23513/)
* [_浅谈基环树环套树 - Seaway-Fu - 博客园_. https://www.cnblogs.com/fusiwei/p/13815549.html. Accessed 19 Sept. 2023.](https://www.cnblogs.com/fusiwei/p/13815549.html)

@ -0,0 +1,83 @@
---
title: Topological Sorting
tags:
- data-structure
- graph
---
# What is Topological Sorting
**Topological Sorting** (拓扑排序) is defined for Directed Acyclic Graphs (**DAG, 有向无环图**). A topological sort is a linear ordering of vertices such that for every directed edge (u, v), vertex u comes before v in the ordering.
## Example
![](computer_sci/data_structure_and_algorithm/graph/attachments/Pasted%20image%2020230914104155.png)
A graph can have more than one valid topological ordering. For this graph, "5 4 2 3 1 0" is one of them. **The first vertex in a topological ordering is always a vertex with an *in-degree of 0*.**
# Algorithm to do Topological Sorting
## DFS
Depth-first search loops over the nodes of the graph in an arbitrary order, recursing from each unmarked node. A branch stops when it reaches a leaf or an already-finished node; if the search re-enters a node that is still being processed, the graph contains a cycle and the algorithm aborts.
```fake
L ← Empty list that will contain the sorted nodes
while exists nodes without a permanent mark do
select an unmarked node n
visit(n)
function visit(node n)
if n has a permanent mark then
return
if n has a temporary mark then
return error (graph not DAG)
mark n with a temporary mark
for each node m with a edge from n to m do
visit(m)
remove temporary mark from n
mark n with a permanent mark
add n to head of L
```
## Kahn's Algorithm
First, find the "start nodes", which have no incoming edges, and insert them into a set S; *at least one such node must exist in a non-empty acyclic graph*.
```fake
L ← Empty list that will contain the sorted elements
S ← Set of all nodes with no incoming edge
while S is not empty do
remove a node n from S
add n to L
for each node m with an edge e from n to m do
remove edge e from graph
if m has no other incoming edge then
insert m into S
if graph has edges then
        return error (graph not DAG)
else
return L
```
![](computer_sci/data_structure_and_algorithm/graph/attachments/algo.gif)
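The pseudocode above translates almost directly into Python; here is a minimal sketch (function and variable names are my own):

```python
from collections import deque

def kahn_topo_sort(n, edges):
    """Kahn's algorithm: repeatedly remove nodes with in-degree 0.
    Returns a topological order of nodes 0..n-1, or None if a cycle exists."""
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    q = deque(i for i in range(n) if indeg[i] == 0)   # the set S of start nodes
    order = []                                        # the list L
    while q:
        u = q.popleft()
        order.append(u)
        for v in adj[u]:
            indeg[v] -= 1            # "remove edge e from graph"
            if indeg[v] == 0:
                q.append(v)
    return order if len(order) == n else None         # leftover edges => cycle
```

For the example graph above (edges 5→2, 5→0, 4→0, 4→1, 2→3, 3→1), the function returns one valid ordering such as `[4, 5, 2, 0, 3, 1]`.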
# Topological Sorting Application
* Task Priorities
# Reference
* [“Topological Sorting.” _GeeksforGeeks_, 12 May 2013, https://www.geeksforgeeks.org/topological-sorting/.](https://www.geeksforgeeks.org/topological-sorting/)
* [“拓撲排序.” 维基百科,自由的百科全书, 22 May 2022. _Wikipedia_, https://zh.wikipedia.org/w/index.php?title=%E6%8B%93%E6%92%B2%E6%8E%92%E5%BA%8F&oldid=71758255.](https://zh.wikipedia.org/wiki/%E6%8B%93%E6%92%B2%E6%8E%92%E5%BA%8F)
* [“算法 - 拓扑排序.” _Earth Guardian_, 22 Aug. 2018, http://redspider110.github.io/2018/08/22/0092-algorithms-topological-sorting/index.html.](https://redspider110.github.io/2018/08/22/0092-algorithms-topological-sorting/)

@ -0,0 +1,95 @@
---
title: Knuth-Morris-Pratt algorithm
tags:
- algorithm
- string
- string-search
---
# Abstract
* Class —— String Search
* Data Structure —— String
* Worst-case performance —— $\Theta(m)$ preprocessing + $\Theta(n)$ matching
* Worst-case space complexity —— $\Theta(m)$
# Details
## What's KMP do
KMP is one of the most commonly used algorithms for **string matching**.
> [!abstract]
> What is string matching?
>
> For example, given the string "BBC ABCDAB ABCDABCDABDE", I want to know whether it contains another string, "ABCDABD".
The Knuth-Morris-Pratt algorithm is named after its three inventors; the leading K stands for the famous computer scientist Donald Knuth.
## Core
> [!abstract]
> The core of the KMP algorithm is to build a **Partial Match Table** from the matches already made, and use it to speed up the search.
The essence of a "partial match" is that a string's prefix and suffix sometimes repeat. For example, "ABCDAB" contains "AB" twice, so its "partial match value" is 2 (the length of "AB"). **When the search word shifts, the first "AB" can move forward 4 positions (length of the matched string minus the partial match value) to land on the second "AB".**
> [!tip]
> Take "ABCDABD" as an example:
>
> "A" has an empty prefix set and an empty suffix set; the common-element length is 0.
>
> "AB" has prefixes [A] and suffixes [B]; the common-element length is 0.
>
> "ABC" has prefixes [A, AB] and suffixes [BC, C]; the common-element length is 0.
>
> "ABCD" has prefixes [A, AB, ABC] and suffixes [BCD, CD, D]; the common-element length is 0.
>
> "ABCDA" has prefixes [A, AB, ABC, ABCD] and suffixes [BCDA, CDA, DA, A]; the common element is "A", of length 1.
>
> "ABCDAB" has prefixes [A, AB, ABC, ABCD, ABCDA] and suffixes [BCDAB, CDAB, DAB, AB, B]; the common element is "AB", of length 2.
>
> "ABCDABD" has prefixes [A, AB, ABC, ABCD, ABCDA, ABCDAB] and suffixes [BCDABD, CDABD, DABD, ABD, BD, D]; the common-element length is 0.
>
After a mismatch, the number of positions KMP shifts is determined by the **number of characters already matched** and the **corresponding partial match value**:
$$
\text{shift} = \text{number of characters matched} - \text{corresponding partial match value}
$$
# Code
## Partial Match Table
```python
def partialMatchTable(pattern: str) -> list[int]:
table = [0] * len(pattern)
i = 1
j = 0
while i < len(pattern):
if pattern[i] == pattern[j]:
table[i] = j + 1
i += 1
j += 1
elif j > 0:
j = table[j - 1]
else:
i += 1
return table
```
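To complete the picture, here is a self-contained sketch of the search routine driven by the table (names are my own; the iterative table builder below is an equivalent variant of the function above):

```python
def partial_match_table(pattern):
    # table[i] = length of the longest proper prefix of pattern[:i+1]
    # that is also a suffix of it
    table = [0] * len(pattern)
    j = 0
    for i in range(1, len(pattern)):
        while j > 0 and pattern[i] != pattern[j]:
            j = table[j - 1]
        if pattern[i] == pattern[j]:
            j += 1
        table[i] = j
    return table

def kmp_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    if not pattern:
        return 0
    table = partial_match_table(pattern)
    j = 0                             # number of pattern chars matched so far
    for i, ch in enumerate(text):
        while j > 0 and ch != pattern[j]:
            j = table[j - 1]          # shift = matched chars - partial match value
        if ch == pattern[j]:
            j += 1
        if j == len(pattern):
            return i - j + 1
    return -1

print(kmp_search("BBC ABCDAB ABCDABCDABDE", "ABCDABD"))
```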
# Reference
* [阮一峰. “字符串匹配的KMP算法.” _字符串匹配的KMP算法_, 23 Jan. 2024, https://www.ruanyifeng.com/blog/2013/05/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm.html. 👈 ⭐⭐⭐!](https://www.ruanyifeng.com/blog/2013/05/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm.html)
* [_The Knuth-Morris-Pratt Algorithm in My Own Words - jBoxer_. http://jakeboxer.com/blog/2009/12/13/the-knuth-morris-pratt-algorithm-in-my-own-words/. Accessed 23 Jan. 2024.](http://jakeboxer.com/blog/2009/12/13/the-knuth-morris-pratt-algorithm-in-my-own-words/)

@ -0,0 +1,66 @@
---
title: Fenwick Tree (Binary Indexed Tree)
tags:
- data-structure
- basic
- algorithm
---
![](computer_sci/data_structure_and_algorithm/tree/attachments/Pasted%20image%2020230710160348.png)
The **Fenwick tree (树状数组)**, also known as the **Binary Indexed Tree (BIT)**, was originally devised to compute cumulative frequencies in data compression; today it is mostly used to *efficiently compute [prefix sums](tmp_script/prefix_sum.md) and range sums of a sequence*. It returns any prefix sum in $O(\log{n})$ time, supports dynamic single-point updates in $O(\log{n})$ time, and uses $O(n)$ space.
We want the BIT to support two operations:
1. Change the value stored at index $i$ (a **point update**).
2. Find the sum of the prefix of length $k$ (a **prefix-sum query**).
# Origin
In Peter M. Fenwick's words, just as every integer can be represented as a sum of powers of two, a sequence can be represented as a sum of sub-sequences. Following this idea, we can *split a prefix sum into the sums of several sub-sequences*, partitioned in close analogy with the binary decomposition of the index. *On one hand, the number of sub-sequences equals the number of 1-bits in the index's binary representation; on the other hand, the number of elements $f[i]$ each sub-sequence covers is also a power of two.*
# Step by Step
## `lowbit(x: int) -> int`
This function returns the value represented by the lowest set bit in the binary form of its argument, for example:
```
lowbit(34) -> 2
lowbit(12) -> 4
lowbit(8) -> 8
```
In code, the bit trick `(~i + 1) & i` computes the value of the lowest 1-bit. It works because the lowest set bit is the only position where both `~i + 1` and `i` have a 1, while no higher position has a 1 in both, so `and`-ing the two yields exactly that bit.
A further trick: in practice, `lowbit` is written as
```python
def lowbit(x):
    return x & (-x)
```
No explicit +1 is needed, because when we negate an integer `i` to get `-i`, only the rightmost 1 in its binary representation stays unchanged while all higher bits are flipped; `i & -i` therefore keeps the rightmost 1 and clears everything else. This rests on two's-complement representation, in which *a non-negative integer is represented as itself, while a negative integer is represented as the bitwise complement of its absolute value plus 1*; that is why in most languages `-i` is exactly `~i + 1`.
## Build Array `BIT` (**Binary Indexed Tree**)
A binary indexed tree is normally implemented as an array.
In the Fenwick tree structure, an array `BIT` maintains partial sums of the array $A$:
$$
{BIT}_i = \sum_{j=i-lowbit(i)+1}^{i} A_j
$$
A code sketch of the $O(n)$ build (each node pushes its partial sum up to its parent in the update direction):
```python
def build(A):
    # BIT is 1-indexed; BIT[i] = sum of A[i - lowbit(i) + 1 .. i] (1-based)
    n = len(A)
    BIT = [0] * (n + 1)
    for i in range(1, n + 1):
        BIT[i] += A[i - 1]
        j = i + (i & -i)      # parent in the update direction
        if j <= n:
            BIT[j] += BIT[i]
    return BIT
```
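Point update and prefix-sum query can then be sketched as follows, walking index `i` up or down by `lowbit(i)` on a 1-indexed `BIT` array as defined by the formula above (a minimal sketch):

```python
def lowbit(x):
    return x & (-x)

def update(BIT, i, delta):
    # add `delta` to A[i] (1-based) and propagate upward
    while i < len(BIT):
        BIT[i] += delta
        i += lowbit(i)

def query(BIT, i):
    # prefix sum A[1] + ... + A[i]
    s = 0
    while i > 0:
        s += BIT[i]
        i -= lowbit(i)
    return s
```

A range sum A[l..r] is then `query(BIT, r) - query(BIT, l - 1)`.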
# Reference
* [二叉索引树 | 三点水. https://lotabout.me/2018/binary-indexed-tree/. Accessed 11 July 2023.](https://lotabout.me/2018/binary-indexed-tree/)

@ -0,0 +1,163 @@
---
title: Segment Tree
tags:
- data-structure
- tree
---
# Overview
Segment Tree**线段树**)是一种用于解决区间查询问题的数据结构。它可以**有效地处理包含大量区间操作的问题**,如*查询*区间最大值、最小值、*求和*、*更新*等。
Segment Tree将给定的区间划分为若干个较小的子区间并使用树进行表示。**每个节点表示一个子区间,树的根节点表示整个区间**。每个节点记录了对应子区间的一些统计信息,如该区间的最大值、最小值、总和等。
构建Segment Tree的过程中首先将问题规模不断缩小将大的区间划分为两个较小的子区间并依次递归构建每个子区间的节点。当区间缩小到长度为1时即叶子节点将问题的原始数据作为叶子节点的值。
Segment Tree的构建完成后可以高效地进行查询和更新操作。查询操作通过递归遍历树的节点在给定的区间范围内查找所需的统计信息。更新操作通过递归更新树的节点更新目标区间内的值并更新父节点的统计信息。
**由于Segment Tree的每个节点代表的区间是互不重叠的因此在进行统计信息的查询和更新时可以利用区间的性质进行剪枝操作从而提高效率**。
# Detail
## Basic
![](computer_sci/data_structure_and_algorithm/tree/attachments/Pasted%20image%2020230907145346.png)
A *segment tree* is basically a binary tree, and we can represent it in a simple linear array. We can learn segment trees by knowing some key points. Consider an array $A$ of size $N$ and a corresponding segment tree $T$:
1. The root of $T$ represents the whole array $A[0:N-1]$.
2. At each step, the segment is divided in half, and the two children represent the two halves: $A[0:N-1]$ is divided into $A[0, (N-1)/2]$ & $A[(N-1)/2 + 1, N-1]$.
3. The **height** of the segment tree is $\log_2{N}$. There are $N-1$ **internal nodes** and $N$ **leaves**, so the **total number of nodes** is $2 \times N - 1$.
## Operations
Once the Segment Tree is built, its structure cannot be changed. We can update the values of nodes but we cannot change its structure. Segment tree provides two operations:
1. **Update**: To update the element of the array $A$ and reflect the corresponding change in the Segment tree.
2. **Query**: In this operation we can **query on an interval or segment and return the answer to the problem** (say minimum/maximum/summation in the particular segment).
## Time Complexity and Code Implementation Demo
### Build
![](computer_sci/data_structure_and_algorithm/tree/attachments/Pasted%20image%2020230907170533.png)
```c
void build(int node, int start, int end)
{
if(start == end)
{
// Leaf node will have a single element
tree[node] = A[start];
}
else
{
int mid = (start + end) / 2;
// Recurse on the left child
build(2*node, start, mid);
// Recurse on the right child
build(2*node+1, mid+1, end);
// Internal node will have the sum of both of its children
tree[node] = tree[2*node] + tree[2*node+1];
}
}
```
```python
import numpy as np

def segment_tree_build(nums):
    # iterative build: leaves live in tree[n:2n]; node i sums its two children
    n = len(nums)
    tree = np.zeros(2 * n)
    tree[n:2 * n] = nums
    for i in range(n - 1, 0, -1):
        tree[i] = tree[2 * i] + tree[2 * i + 1]
    return tree
```
**Every node holds the sum of an interval.** Build complexity is $O(N)$.
### Update
```c
void update(int node, int start, int end, int idx, int val)
{
if(start == end)
{
// Leaf node
A[idx] += val;
tree[node] += val;
}
else
{
int mid = (start + end) / 2;
if(start <= idx and idx <= mid)
{
// If idx is in the left child, recurse on the left child
update(2*node, start, mid, idx, val);
}
else
{
// if idx is in the right child, recurse on the right child
update(2*node+1, mid+1, end, idx, val);
}
// Internal node will have the sum of both of its children
tree[node] = tree[2*node] + tree[2*node+1];
}
}
```
To update an element, **look at the interval in which the element is present and recurse accordingly on the left or the right child**.
The complexity of an update is $O(\log{N})$.
### Query
```c
int query(int node, int start, int end, int l, int r)
{
if(r < start or end < l)
{
// range represented by a node is completely outside the given range
return 0;
}
if(l <= start and end <= r)
{
// range represented by a node is completely inside the given range
return tree[node];
}
// range represented by a node is partially inside and partially outside the given range
int mid = (start + end) / 2;
int p1 = query(2*node, start, mid, l, r);
int p2 = query(2*node+1, mid+1, end, l, r);
return (p1 + p2);
}
```
A query splits the requested range into several node intervals, looks them up at different nodes, and merges the partial results.
## LazyTag Trick
The lazy tag is designed for cases such as "add k to every number in the range [l, r]": performing many single-point updates wastes too much time, and the lazy tag brings the time complexity back down.
The principle: a segment-tree node carrying a lazy tag is itself already fully updated, while the nodes below it are not. Only when a node beneath a lazy tag needs to be visited is the pending update pushed down, which saves work.
### Lazy Tag Propagation
Lazy propagation is an optimization technique for segment trees that **minimizes** huge numbers of operations.
Lazy propagation is hard to explain in prose, so watching a tutorial video is the best way to learn and review it.
Please watch the video in reference 3: [_Lazy Propagation Segment Tree_. _www.youtube.com_, https://www.youtube.com/watch?v=xuoQdt5pHj0. Accessed 12 Sept. 2023.](https://www.youtube.com/watch?v=xuoQdt5pHj0)
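To make the idea concrete, here is a minimal recursive sketch of a lazy segment tree supporting range add and range sum (class and method names are my own, not from the references):

```python
class LazySegTree:
    """Range add + range sum with lazy propagation, over 0-based indices."""

    def __init__(self, nums):
        self.n = len(nums)
        self.sum = [0] * (4 * self.n)
        self.lazy = [0] * (4 * self.n)   # pending "add k" for the subtree
        self._build(1, 0, self.n - 1, nums)

    def _build(self, node, lo, hi, nums):
        if lo == hi:
            self.sum[node] = nums[lo]
            return
        mid = (lo + hi) // 2
        self._build(2 * node, lo, mid, nums)
        self._build(2 * node + 1, mid + 1, hi, nums)
        self.sum[node] = self.sum[2 * node] + self.sum[2 * node + 1]

    def _push(self, node, lo, mid, hi):
        # push the pending tag down one level, only when children are visited
        if self.lazy[node]:
            k = self.lazy[node]
            for child, length in ((2 * node, mid - lo + 1), (2 * node + 1, hi - mid)):
                self.sum[child] += k * length
                self.lazy[child] += k
            self.lazy[node] = 0

    def add(self, l, r, k, node=1, lo=0, hi=None):
        if hi is None:
            hi = self.n - 1
        if r < lo or hi < l:
            return
        if l <= lo and hi <= r:          # fully covered: tag it, stop here
            self.sum[node] += k * (hi - lo + 1)
            self.lazy[node] += k
            return
        mid = (lo + hi) // 2
        self._push(node, lo, mid, hi)
        self.add(l, r, k, 2 * node, lo, mid)
        self.add(l, r, k, 2 * node + 1, mid + 1, hi)
        self.sum[node] = self.sum[2 * node] + self.sum[2 * node + 1]

    def query(self, l, r, node=1, lo=0, hi=None):
        if hi is None:
            hi = self.n - 1
        if r < lo or hi < l:
            return 0
        if l <= lo and hi <= r:
            return self.sum[node]
        mid = (lo + hi) // 2
        self._push(node, lo, mid, hi)
        return (self.query(l, r, 2 * node, lo, mid)
                + self.query(l, r, 2 * node + 1, mid + 1, hi))
```

Both `add` and `query` touch $O(\log{N})$ nodes, because a fully covered node absorbs the operation without recursing into its subtree.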
# Reference
* [“Segment Trees Tutorials & Notes | Data Structures.” _HackerEarth_, https://www.hackerearth.com/practice/data-structures/advanced-data-structures/segment-trees/tutorial/. Accessed 7 Sept. 2023.](https://www.hackerearth.com/practice/data-structures/advanced-data-structures/segment-trees/tutorial/)
* [“力扣LeetCode官网 - 全球极客挚爱的技术成长平台.” _力扣 LeetCode_, https://leetcode.cn/problems/handling-sum-queries-after-update/solutions/2356392/geng-xin-shu-zu-hou-chu-li-qiu-he-cha-xu-kv6u/. Accessed 11 Sept. 2023.](https://leetcode.cn/problems/handling-sum-queries-after-update/solutions/2356392/geng-xin-shu-zu-hou-chu-li-qiu-he-cha-xu-kv6u/)
* [_Lazy Propagation Segment Tree_. _www.youtube.com_, https://www.youtube.com/watch?v=xuoQdt5pHj0. Accessed 12 Sept. 2023.](https://www.youtube.com/watch?v=xuoQdt5pHj0)

@ -0,0 +1,7 @@
---
title: Two Pointers
tags:
- algorithm
- pointer
---


@ -0,0 +1,8 @@
---
title: Model Evaluation - MOC
tags:
- deep-learning
- evaluation
---
* [Model Evaluation in Time Series Forecasting](computer_sci/deep_learning_and_machine_learning/Evaluation/time_series_forecasting.md)

@ -0,0 +1,121 @@
---
title: Model Evaluation in Time Series Forecasting
tags:
- deep-learning
- evaluation
- time-series-dealing
---
![](computer_sci/deep_learning_and_machine_learning/Evaluation/attachments/Pasted%20image%2020230526162839.png)
# Some famous time-series scoring techniques
1. **MAE, RMSE and AIC**
2. **Mean Forecast Accuracy**
3. **Warning: The time series model EVALUATION TRAP!**
4. **RdR Score Benchmark**
## MAE, RMSE, AIC
MAE stands for **Mean Absolute Error** and RMSE for **Root Mean Squared Error**.
These are two well-known metrics for the accuracy of continuous variables. MAE used to appear frequently in earlier articles, but observations since 2016 show RMSE (or some version of R-squared) gradually taking over.
*We need to understand when each metric is the better choice.*
### MAE
$$
\text{MAE} = \frac{1}{n}\sum_{j=1}^n |y_j - \hat{y}_j|
$$
A key property of MAE is that all individual differences carry equal weight.
If the absolute value is dropped, MAE becomes the **Mean Bias Error (MBE)**; when using MBE, note that positive and negative biases cancel each other out.
### RMSE
$$
\text{RMSE} = \sqrt{\frac{1}{n} \sum_{j=1}^n (y_j - \hat{y}_j)^2}
$$
Root Mean Squared Error (RMSE) is a quadratic scoring rule that also measures the average magnitude of the error. It is the square root of the average of the squared differences between predicted values and actual observations.
### AIC
$$
\text{AIC} = 2k - 2\ln{(\hat{L})}
$$
$k$ is the number of estimated model parameters, and $\hat{L}$ is the maximized value of the model's likelihood function.
The **Akaike Information Criterion** (AIC) is a metric that helps compare models, because it accounts for both how well a model fits the data and how complex the model is.
AIC measures the loss of information and **penalizes model complexity**. It is the *negative log-likelihood penalized by the number of parameters*. The core idea of AIC is that fewer model parameters are better. **AIC lets you test how well a model fits the data without overfitting the dataset.**
### Comparison
#### Similarities between MAE and RMSE
Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) both express the average model prediction error in the units of the variable of interest. Both metrics range from 0 to ∞ and are indifferent to the direction of the error. They are negatively-oriented scores, meaning lower values are better.
#### Differences between MAE and RMSE
*Since the errors are squared before being averaged, RMSE gives relatively high weight to large errors.* This means RMSE should be more useful when large errors are particularly undesirable, whereas in MAE's average those large errors get diluted.
![](computer_sci/deep_learning_and_machine_learning/Evaluation/attachments/Pasted%20image%2020230526161422.png)
For AIC, lower is better, but there is no perfect score; it can only be used to compare the performance of different models on the same dataset.
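A quick NumPy sanity check of the three metrics on toy data (the AIC helper simply evaluates $2k - 2\ln\hat{L}$ for a log-likelihood you supply):

```python
import numpy as np

def mae(y, y_hat):
    """Mean Absolute Error: every individual difference gets equal weight."""
    return float(np.mean(np.abs(y - y_hat)))

def rmse(y, y_hat):
    """Root Mean Squared Error: squaring gives large errors extra weight."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def aic(log_likelihood, k):
    """Akaike Information Criterion: 2k - 2 ln(L_hat); lower is better."""
    return 2 * k - 2 * log_likelihood

y = np.array([10.0, 12.0, 14.0, 16.0])
y_hat = np.array([11.0, 11.0, 15.0, 20.0])

print(mae(y, y_hat))   # 1.75  -> the single large error (4) is diluted
print(rmse(y, y_hat))  # ~2.18 -> the same large error dominates
```

Comparing the two outputs on the same residuals makes the difference concrete: one residual of 4 moves RMSE much more than it moves MAE.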
## Mean Forecast Accuracy
![](computer_sci/deep_learning_and_machine_learning/Evaluation/attachments/Pasted%20image%2020230526162035.png)
Compute the Forecast Accuracy at each point, then average them to get the Mean Forecast Accuracy.
A major flaw of Mean Forecast Accuracy is that large outliers cause a huge negative impact, e.g. $1 - \frac{|\hat{y}_j - y_j|}{y_j} = 1 - \frac{250-25}{25} = -800\%$
A workaround is to clip the Forecast Accuracy at a minimum of 0%, and to use the Median instead of the Mean.
In general, **when your error distribution is skewed, you should use the Median rather than the Mean**. In some cases, Mean Forecast Accuracy can also be meaningless. If you remember your statistics, the **coefficient of variation** (CV) is the ratio of the standard deviation to the mean ($\text{CV} = (\text{Standard Deviation}/\text{Mean} * 100)$). A large CV means large variability, which also means a greater degree of dispersion around the mean. **For example, anything with a CV above 0.7 can be regarded as highly variable and not truly forecastable. It also means your forecasting model's predictive power is very unstable!**
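A minimal sketch of the two fixes (clipping at 0% and using the Median), plus the CV check; the series values and the 0.7 threshold are illustrative:

```python
import numpy as np

def forecast_accuracy(y, y_hat):
    """Per-point accuracy 1 - |ŷ - y| / y, clipped at a minimum of 0%."""
    return np.clip(1 - np.abs(y_hat - y) / y, 0.0, None)

y = np.array([25.0, 100.0, 100.0])
y_hat = np.array([250.0, 95.0, 110.0])

acc = forecast_accuracy(y, y_hat)    # the 25 -> 250 outlier becomes 0% instead of -800%
print(np.mean(acc), np.median(acc))  # the median is the more robust summary here

cv = np.std(y) / np.mean(y)  # coefficient of variation of the series itself
print(cv)  # above ~0.7 would flag the series as highly variable / hard to forecast
```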
## RdR Score Benchmark (an experimental metric; the blogger notes it has not appeared in any research paper)
RdR metric stands for:
* *R*: **Naïve Random Walk**
* *d*: **Dynamic Time Warping**
* *R*: **Root Mean Squared Error**
### DTW to deal with shape similarity
![](computer_sci/deep_learning_and_machine_learning/Evaluation/attachments/Pasted%20image%2020230526163614.png)
Metrics such as RMSE and MAE fail to take one important criterion into account: **THE SHAPE SIMILARITY**
The RdR Score Benchmark uses [**Dynamic Time Warping (DTW)**](computer_sci/deep_learning_and_machine_learning/Trick/DTW.md) as its shape-similarity metric
![](computer_sci/deep_learning_and_machine_learning/Evaluation/attachments/Pasted%20image%2020230526164106.png)
Euclidean distance can be a poor choice between time series, because of warping along the time axis.
* DTW finds the optimal (minimum-distance) warping path between two time series by "synchronizing"/"aligning" the different signals along the time axis
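The point is easy to demonstrate with the textbook dynamic-programming form of DTW (an O(nm) sketch, not an optimized implementation): two identical shapes shifted in time have a positive Euclidean distance but a DTW distance of zero.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic Time Warping distance via the classic DP recurrence."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible warping moves
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

a = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0])  # same shape, shifted by one step

d = dtw_distance(a, b)
euclid = float(np.sqrt(np.sum((a - b) ** 2)))
print(d, euclid)  # 0.0 2.0 -> DTW "sees" the shapes as identical, Euclidean does not
```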
### RdR score means
![](computer_sci/deep_learning_and_machine_learning/Evaluation/attachments/Pasted%20image%2020230529130501.png)
![](computer_sci/deep_learning_and_machine_learning/Evaluation/attachments/Pasted%20image%2020230529130509.png)
The *RdR score* is computed from RMSE and DTW distance; it measures how much better your model is than a Naïve Random Walk (*whose RdR score = 0*)
### RdR calculation details
The RdR score can be computed by plotting RMSE vs. DTW; the resulting plot looks like this:
![](computer_sci/deep_learning_and_machine_learning/Evaluation/attachments/Pasted%20image%2020230529130856.png)
The RdR score is computed from the rectangle area (the article does not fully describe the calculation; it appears to be in the [github code](https://github.com/CoteDave/blog/tree/master/RdR%20score), but this is not certain)
# Reference
* M.Sc, Dave Cote. “RdR Score Metric for Evaluating Time Series Forecasting Models.” _Medium_, 8 Feb. 2022, https://medium.com/@dave.cote.msc/rdr-score-metric-for-evaluating-time-series-forecasting-models-1c23f92f80e7.
* JJ. “MAE and RMSE — Which Metric Is Better?” _Human in a Machine World_, 23 Mar. 2016, https://medium.com/human-in-a-machine-world/mae-and-rmse-which-metric-is-better-e60ac3bde13d.
* _Accelerating Dynamic Time Warping Subsequence Search with GPU_. https://www.slideshare.net/DavideNardone/accelerating-dynamic-time-warping-subsequence-search-with-gpu. Accessed 29 May 2023.

View File

@ -0,0 +1,77 @@
---
title: DeepAR - Time Series Forecasting
tags:
- deep-learning
- model
- time-series-dealing
---
DeepAR, an autoregressive recurrent network developed by Amazon, is the first model that could natively work on multiple time-series. It is a milestone in the time-series community.
# What is DeepAR
> [!quote]
> DeepAR is the first successful model to combine Deep Learning with traditional Probabilistic Forecasting.
* **Multiple time-series support**
* **Extra covariates**: *DeepAR* allows extra features, covariates. It is very important for me when I learn *DeepAR*, because in my task, I have corresponding feature for each time series.
* **Probabilistic output**:  Instead of making a single prediction, the model leverages [**quantile loss**](computer_sci/deep_learning_and_machine_learning/Trick/quantile_loss.md) to output prediction intervals.
* **“Cold” forecasting:** By learning from thousands of time-series that potentially share a few similarities, _DeepAR_ can provide forecasts for time-series that have little or no history at all.
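The quantile (pinball) loss behind that probabilistic output can be sketched as follows; this is the standard formula with toy numbers, not DeepAR's exact code:

```python
import numpy as np

def quantile_loss(y, y_hat, q):
    """Pinball loss: under-forecasts cost q per unit, over-forecasts (1 - q)."""
    e = y - y_hat
    return float(np.mean(np.maximum(q * e, (q - 1) * e)))

y = np.array([10.0, 10.0, 10.0])
y_hat = np.array([8.0, 10.0, 12.0])

loss90 = quantile_loss(y, y_hat, q=0.9)
# at q=0.9, under-forecasting (ŷ=8) costs 0.9*2 while over-forecasting (ŷ=12)
# costs only 0.1*2, so minimizing it pushes predictions toward the 90th percentile
print(loss90)
```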
# Block used in DeepAR
* [LSTM](computer_sci/deep_learning_and_machine_learning/deep_learning/LSTM.md)
# *DeepAR* Architecture
Instead of using the LSTMs to compute the prediction directly, the DeepAR model estimates the parameters of a Gaussian likelihood function, $\theta=(\mu,\sigma)$, i.e., the mean and standard deviation of the Gaussian likelihood function.
## Training Step-by-Step
![](computer_sci/deep_learning_and_machine_learning/Famous_Model/attachments/Pasted%20image%2020230523134255.png)
Suppose we are at time $t$ of time-series $i$:
1. The LSTM cell takes as input the covariates $x_{i,t}$ (the value of $x_i$ at time $t$) and the previous target value $z_{i,t-1}$; the LSTM also takes the previous hidden state $h_{i,t-1}$
2. The LSTM then outputs the current hidden state $h_{i,t}$, which feeds into the next step
3. The parameters $\mu$ and $\sigma$ of the Gaussian likelihood function are computed indirectly from $h_{i,t}$; the details come later
> [!quote]
> In other words, the model's goal is to find the best $\mu$ and $\sigma$ to build a Gaussian distribution that brings the prediction closer to $z_{i,t}$; also, because *DeepAR* trains on and predicts a single data point at a time, it is called an autoregressive model
## Inference Step-by-Step
![](computer_sci/deep_learning_and_machine_learning/Famous_Model/attachments/Pasted%20image%2020230523141219.png)
When using the model for inference, the only change is that the predicted value $\hat{z}$ replaces the true value $z$; $\hat{z}_{i,t}$ is sampled from the Gaussian distribution the model has learned. But the parameters $\mu$ and $\sigma$ of this Gaussian distribution are not learned by the model directly. How does *DeepAR* accomplish this?
# Gaussian Likelihood
$$
\ell_G(z|\mu,\sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(z-\mu)^2}{2\sigma^2}\right)
$$
The task of estimating a Gaussian distribution is usually turned into the task of maximizing the Gaussian log-likelihood function, i.e., **MLE** (maximum likelihood estimation)
**Gaussian log-likelihood function**:
$$
\mathcal{L} = \sum_{i=1}^{N}\sum_{t=t_o}^{T} \log{\ell(z_{i,t}|\theta(h_{i,t}))}
$$
# Parameter estimation in *DeepAR*
In statistics, a Gaussian distribution is usually estimated with the MLE formulas, but *DeepAR* does not do this; instead it uses two dense layers for the estimation, as in the figure below:
![](computer_sci/deep_learning_and_machine_learning/Famous_Model/attachments/Pasted%20image%2020230523151201.png)
The reason for estimating the Gaussian distribution with dense layers is that backpropagation can then be used
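A minimal NumPy sketch of this scheme, under the common assumption that a softplus keeps $\sigma$ positive; the hidden states and weights below are random toy values, not trained parameters:

```python
import numpy as np

def softplus(x):
    """log(1 + exp(x)): maps any real number to a positive value."""
    return np.log1p(np.exp(x))

def gaussian_nll(z, mu, sigma):
    """Negative Gaussian log-likelihood, averaged over all data points."""
    return float(np.mean(0.5 * np.log(2 * np.pi * sigma ** 2)
                         + (z - mu) ** 2 / (2 * sigma ** 2)))

rng = np.random.default_rng(0)
h = rng.normal(size=(8, 16))        # hidden states h_{i,t} from the LSTM
W_mu = rng.normal(size=16)          # dense layer for the mean
W_sigma = rng.normal(size=16)       # dense layer for the std. deviation

mu = h @ W_mu
sigma = softplus(h @ W_sigma)       # softplus guarantees sigma > 0

z = rng.normal(size=8)              # observed targets z_{i,t}
loss = gaussian_nll(z, mu, sigma)   # the quantity minimized by backpropagation
print(loss)
```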
# Reference
* [https://towardsdatascience.com/deepar-mastering-time-series-forecasting-with-deep-learning-bc717771ce85](https://towardsdatascience.com/deepar-mastering-time-series-forecasting-with-deep-learning-bc717771ce85)

View File

@ -0,0 +1,11 @@
---
title: Famous Model MOC
tags:
- deep-learning
- MOC
---
# Time-series
* [DeepAR](computer_sci/deep_learning_and_machine_learning/Famous_Model/DeepAR.md)

View File

@ -0,0 +1,8 @@
---
title: Temporal Fusion Transformer
tags:
- deep-learning
- model
- time-series-dealing
---

Binary file not shown.

After

Width:  |  Height:  |  Size: 44 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 44 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 55 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 65 KiB

View File

@ -0,0 +1,25 @@
---
title: Large Language Model(LLM) - MOC
tags:
- deep-learning
- LLM
- NLP
---
# Training
* [Training Tech Outline](computer_sci/deep_learning_and_machine_learning/LLM/train/steps.md)
* [⭐⭐⭐Train LLM from scratch](computer_sci/deep_learning_and_machine_learning/LLM/train/train_LLM.md)
* [⭐⭐⭐Detailed explanation of RLHF technology](computer_sci/deep_learning_and_machine_learning/LLM/train/RLHF.md)
* [How to use fine-tuning to create your chatbot](computer_sci/deep_learning_and_machine_learning/LLM/train/finr_tune/how_to_fine_tune.md)
* [Learn finetune by Stanford Alpaca](computer_sci/deep_learning_and_machine_learning/LLM/train/finr_tune/learn_finetune_byStanfordAlpaca.md)
# Metrics
How do we evaluate an LLM's performance?
* [Tasks to evaluate BERT - may be applicable to other LMs](computer_sci/deep_learning_and_machine_learning/LLM/metircs/some_task.md)
# Basic
* [LLM Hyperparameter](computer_sci/deep_learning_and_machine_learning/LLM/basic/llm_hyperparameter.md)

Binary file not shown.

After

Width:  |  Height:  |  Size: 216 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 216 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 173 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 444 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 28 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 6.5 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.8 MiB

View File

@ -0,0 +1,56 @@
---
title: LLM hyperparameter
tags:
- hyperparameter
- LLM
- deep-learning
- basic
---
# LLM Temperature
The definition of temperature comes from its physical meaning: the higher the temperature, the faster the atoms move, meaning more randomness.
![](computer_sci/deep_learning_and_machine_learning/LLM/basic/attachments/physic_temp.gif)
LLM temperature is a hyperparameter that regulates **the randomness, or creativity.**
* The higher the LLM temperature, the more diverse and creative the output, increasing the likelihood of straying from context.
* The lower the LLM temperature, the more focused and deterministic the output, sticking closely to the most likely prediction.
![](computer_sci/deep_learning_and_machine_learning/LLM/basic/attachments/Pasted%20image%2020230627160125.png)
## More detail
The LLM model is to give a probability of next word, like this:
![](computer_sci/deep_learning_and_machine_learning/LLM/basic/attachments/Pasted%20image%2020230627162848.png)
"A cat is chasing a …": many words could fill that blank. Different words have different probabilities; in the model, we output ratings for the next word.
Sure, we could always pick the highest-rated word, but that would result in very standard, predictable, boring sentences, and the model wouldn't match human language, because we don't always use the most common word either.
So we want to design a mechanism that **allows every word with a decent rating to occur with a reasonable probability**; that's why we need temperature in the LLM.
As in the real physical world, we can sample to describe a distribution; *we use a SoftMax to describe the probability distribution of the next word*. The temperature is the element $T$ in the formula:
$$
p_i = \frac{\exp{(\frac{R_i}{T})}}{\sum_i \exp{(\frac{R_i}{T})}}
$$
![](computer_sci/deep_learning_and_machine_learning/LLM/basic/attachments/Pasted%20image%2020230627163514.png)
The lower the $T$, the closer the highest-rated word's probability gets to 100%; the higher the $T$, the smoother the probabilities become across all words.
*The gif below is important and intuitive.*
![](computer_sci/deep_learning_and_machine_learning/LLM/basic/attachments/rating_probabililty.gif)
So, with different settings of $T$, the next-word probabilities change; we output the next word by sampling from this distribution.
![](computer_sci/deep_learning_and_machine_learning/LLM/basic/attachments/Pasted%20image%2020230627165311.png)
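The formula above can be sketched directly; the candidate ratings here are hypothetical:

```python
import numpy as np

def temperature_softmax(ratings, T):
    """Turn next-word ratings R_i into probabilities p_i = exp(R_i/T) / sum."""
    r = np.array(ratings, dtype=float) / T
    r -= r.max()                 # subtract the max for numerical stability
    p = np.exp(r)
    return p / p.sum()

ratings = [5.0, 3.0, 1.0]        # toy ratings for three candidate next words

p_low = temperature_softmax(ratings, T=0.1)
p_high = temperature_softmax(ratings, T=100.0)
print(p_low)   # low T: the top-rated word takes essentially all the mass
print(p_high)  # high T: nearly uniform over all candidates

# the next word is then sampled from the distribution, not argmax'd
rng = np.random.default_rng(0)
next_word = rng.choice(len(ratings), p=temperature_softmax(ratings, T=1.0))
```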
# Reference
* [LLM Temperature, deepchecks](https://deepchecks.com/glossary/llm-parameters/#:~:text=One%20intriguing%20parameter%20within%20LLMs,of%20straying%20from%20the%20context.)
* [⭐⭐⭐https://www.youtube.com/watch?v=YjVuJjmgclU](https://www.youtube.com/watch?v=YjVuJjmgclU)

Binary file not shown.

After

Width:  |  Height:  |  Size: 272 KiB

View File

@ -0,0 +1,44 @@
---
title: LangChain Explained
tags:
- LLM
- basic
- langchain
---
# What is LangChain
LangChain is an open source framework that allows AI developers to combine LLMs like GPT-4 *with external sources of computation and data*.
# Why LangChain
LangChain can make an LLM answer questions based on your own documents. It helps you build all kinds of amazing apps.
You can use LangChain to make GPT analyze your own company data, book flights according to a schedule, summarize abstracts across batches of PDFs, and more.
# LangChain value propositions
## Components
* LLM Wrappers
* Prompt Templates
* Indexes for relevant information retrieval
## Chains
Assemble components to solve a specific task - finding info in a book...
## Agents
Agents allow LLMs to interact with their environment, for instance, making an API request to perform a specific action
# LangChain Framework
![](computer_sci/deep_learning_and_machine_learning/LLM/langchain/attachments/Pasted%20image%2020230627154149.png)
# Reference
* [https://www.youtube.com/watch?v=aywZrzNaKjs](https://www.youtube.com/watch?v=aywZrzNaKjs)

Binary file not shown.

After

Width:  |  Height:  |  Size: 88 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 88 KiB

View File

@ -0,0 +1,36 @@
---
title: Tasks to evaluate BERT - may be applicable to other LMs
tags:
- LLM
- metircs
- deep-learning
- benchmark
---
# Overview
![](computer_sci/deep_learning_and_machine_learning/LLM/metircs/attachments/Pasted%20image%2020230629140929.png)
# MNLI-m (Multi-Genre Natural Language Inference - Matched):
MNLI-m is a benchmark dataset and task for natural language inference (NLI). The goal of NLI is to determine the logical relationship between two given sentences: whether the relationship is "entailment," "contradiction," or "neutral." MNLI-m focuses on matched data, which means the sentences are drawn from the same genres as the sentences in the training set. It is part of the GLUE (General Language Understanding Evaluation) benchmark, which evaluates the performance of models on various natural language understanding tasks.
# QNLI (Question Natural Language Inference):
QNLI is another NLI task included in the GLUE benchmark. In this task, the model is given a sentence that is a premise and a sentence that is a question related to the premise. The goal is to determine whether the answer to the question can be inferred from the given premise. The dataset for QNLI is derived from the Stanford Question Answering Dataset (SQuAD).
# MRPC (Microsoft Research Paraphrase Corpus):
MRPC is a dataset used for paraphrase identification or semantic equivalence detection. It consists of sentence pairs from various sources that are labeled as either paraphrases or not. The task is to classify whether a given sentence pair expresses the same meaning (paraphrase) or not. MRPC is also part of the GLUE benchmark and helps evaluate models' ability to understand sentence similarity and equivalence.
# SST-2 (Stanford Sentiment Treebank - Binary Sentiment Classification):
SST-2 is a binary sentiment classification task based on the Stanford Sentiment Treebank dataset. The dataset contains sentences from movie reviews labeled as either positive or negative sentiment. The task is to classify a given sentence as expressing a positive or negative sentiment. SST-2 is often used to evaluate the ability of models to understand and classify sentiment in natural language.
# SQuAD (Stanford Question Answering Dataset):
SQuAD is a widely known dataset and task for machine reading comprehension. It consists of questions posed by humans on a set of Wikipedia articles, where the answers to the questions are spans of text from the corresponding articles. The goal is to build models that can accurately answer the questions based on the provided context. SQuAD has been instrumental in advancing the field of question answering and evaluating models' reading comprehension capabilities.
Overall, these tasks and datasets serve as benchmarks for evaluating natural language understanding and processing models. They cover a range of language understanding tasks, including natural language inference, paraphrase identification, sentiment analysis, and machine reading comprehension.

View File

@ -0,0 +1,65 @@
---
title: Reinforcement Learning from Human Feedback
tags:
- LLM
- deep-learning
- RLHF
- LLM-training-method
---
# Review: Reinforcement Learning Basics
![](computer_sci/deep_learning_and_machine_learning/LLM/train/attachments/Pasted%20image%2020230628145009.png)
Reinforcement learning is a mathematical framework.
To demystify it: a reinforcement learning model is an open-ended model that uses a reward function to optimize an agent to solve a complex task in a target environment.
<!---
# Origins of RLHF
## Pre Deep RL
![](Deep_Learning_And_Machine_Learning/LLM/train/attachments/Pasted%20image%2020230628160836.png)
Before, Deep RL don't use neural network to represent policy. What this system did was a machine learning system that created a policy by having humans label the actions that an agent took as being kind of correct or incorrect. This was just a simple decision rule where humans labeled every actions as good or bad. This was essentially a reward model and a policy put together.
## For Deep RL
![](Deep_Learning_And_Machine_Learning/LLM/train/attachments/Pasted%20image%2020230628161627.png)
--->
# Step by Step
For RLHF training method, here are three core steps:
1. Pretraining a language model
2. Gathering data (question-answer data) and training a reward model
3. Fine-tuning the LM with reinforcement learning
## Step 1. Pretraining Language Models
Read this to learn how to train a LM:
[Pretraining language models](computer_sci/deep_learning_and_machine_learning/LLM/train/train_LLM.md)
OpenAI used a smaller version of GPT-3 for its first popular RLHF model - InstructGPT.
Nowadays, RLHF is a new area; there is no answer yet to which model is the best starting point for RLHF, and fine-tuning with expensive augmented data is not strictly necessary.
## Step 2. Reward model training
In the reward model, we integrate human preferences into the system.
![](computer_sci/deep_learning_and_machine_learning/LLM/train/attachments/Pasted%20image%2020230629145231.png)
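Reward models of this kind are commonly trained on pairs of answers ranked by humans, using a Bradley-Terry style pairwise loss; a minimal sketch under that assumption (toy scores, not the actual InstructGPT code):

```python
import numpy as np

def preference_loss(r_chosen, r_rejected):
    """Pairwise loss -log sigmoid(r_chosen - r_rejected): low when the
    reward model scores the human-preferred answer higher."""
    return float(np.mean(np.log1p(np.exp(-(r_chosen - r_rejected)))))

# scalar reward-model scores for human-preferred vs. rejected answers (toy values)
r_chosen = np.array([2.0, 1.5, 0.3])
r_rejected = np.array([0.5, 1.0, 0.8])

loss = preference_loss(r_chosen, r_rejected)
print(loss)  # training pushes this down by widening the score gap
```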
# Reference
* [Reinforcement Learning from Human Feedback: From Zero to chatGPT, YouTube, HuggingFace](https://www.youtube.com/watch?v=2MBJOuVq380)
* [Hugging Face blog, ChatGPT 背后的“功臣”——RLHF 技术详解](https://huggingface.co/blog/zh/rlhf)

Binary file not shown.

After

Width:  |  Height:  |  Size: 62 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 70 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 47 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 90 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 86 KiB

View File

@ -0,0 +1,8 @@
---
title: How to make a custom dataset?
tags:
- dataset
- LLM
- deep-learning
---

View File

@ -0,0 +1,7 @@
---
title: How to use fine-tuning to create your chatbot
tags:
- deep-learning
- LLM
---

View File

@ -0,0 +1,19 @@
---
title: Learn finetune by Stanford Alpaca
tags:
- deep-learning
- LLM
- fine-tune
- LLaMA
---
![](computer_sci/deep_learning_and_machine_learning/LLM/train/finr_tune/attachments/Pasted%20image%2020230627145954.png)
# Reference
* [https://www.youtube.com/watch?v=pcszoCYw3vc](https://www.youtube.com/watch?v=pcszoCYw3vc)
* [https://crfm.stanford.edu/2023/03/13/alpaca.html](https://crfm.stanford.edu/2023/03/13/alpaca.html)

Some files were not shown because too many files have changed in this diff Show More