1 change: 1 addition & 0 deletions W10D2/menu.md
@@ -0,0 +1 @@
- [Liu YiYu's note](note_liu_yiyu.md)
57 changes: 57 additions & 0 deletions W10D2/note_liu_yiyu.md
@@ -0,0 +1,57 @@
# Note for W10D2

*Authored by Liu Yi-Yu*

Async vs. Sync
- Interrupts (I/O)
- Internet
- ATM/ISDN (Euro)
- TCP/IP (USA)

## Interrupts

External devices interrupt the CPU.

Using a daisy chain for access (see the sketch after the diagram):
```mermaid
graph LR
Core((Core))
A[Device]
B[Device]
C[Device]
Core --> A
A --> Core
A --> B
B --> A
B --> C
C --> B
```
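
A minimal sketch of daisy-chain interrupt acknowledgment (the devices and priority order are hypothetical): the CPU sends one acknowledge into the chain, and the first requesting device claims it, so priority is fixed by position in the chain.

```c++
#include <iostream>
#include <vector>

// Each device either requests service or passes the acknowledge downstream.
struct Device {
    const char* name;
    bool requesting;
};

// The device closest to the CPU that is requesting claims the acknowledge.
const Device* acknowledge(const std::vector<Device>& chain) {
    for (const Device& d : chain) {
        if (d.requesting) return &d;   // claim it, stop propagation
    }
    return nullptr;                    // nobody claimed it (spurious interrupt)
}

int main() {
    std::vector<Device> chain = {{"disk", false}, {"nic", true}, {"timer", true}};
    if (const Device* d = acknowledge(chain))
        std::cout << "servicing " << d->name << "\n";   // prints "nic"
}
```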

## Internet
Using a hypercube topology to reduce the distance between nodes (see the sketch below)
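
A minimal sketch, assuming a standard hypercube: in a $d$-dimensional hypercube with $2^d$ nodes, two nodes are linked iff their labels differ in exactly one bit, so the distance between any two nodes is the Hamming distance of their labels (at most $d$ hops).

```c++
#include <bitset>
#include <cstdint>
#include <iostream>

// Hop count between two hypercube nodes = Hamming distance of their labels.
int hops(std::uint32_t a, std::uint32_t b) {
    return static_cast<int>(std::bitset<32>(a ^ b).count());
}

int main() {
    // 4-dimensional hypercube: 16 nodes, diameter 4.
    std::cout << hops(0b0000, 0b1011) << "\n";  // 3 hops
}
```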

### ATM/ISDN (Euro)

ATM: Asynchronous Transfer Mode

- Virtual circuit (a switched path set up through the network)

- Once the connection is established, the sender has full use of the virtual circuit

- Slices data into small fixed-size cells to avoid data loss (see the sketch below)
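
A minimal sketch of slicing a payload into fixed-size cells (standard ATM uses 53-byte cells: a 5-byte header plus 48 bytes of payload; the cell layout here is simplified and hypothetical):

```c++
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kPayload = 48;   // ATM cell payload size in bytes

struct Cell {
    std::uint16_t vci;                 // virtual circuit identifier (simplified header)
    std::uint8_t  data[kPayload];      // payload, zero-padded in the last cell
};

// Slice a message into fixed-size cells for one virtual circuit.
std::vector<Cell> slice(std::uint16_t vci, const std::vector<std::uint8_t>& msg) {
    std::vector<Cell> cells;
    for (std::size_t off = 0; off < msg.size(); off += kPayload) {
        Cell c{};                      // zero-initialize (padding)
        c.vci = vci;
        std::size_t n = std::min(kPayload, msg.size() - off);
        std::memcpy(c.data, msg.data() + off, n);
        cells.push_back(c);
    }
    return cells;
}
```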

### TCP/IP (USA)

Uses an asynchronous network to implement a synchronous (reliable, connection-oriented) one.

### Internet Service
- SNMP (Simple Network Management Protocol: know the state of every node)
- OSPF (Open Shortest Path First: larger scale)
- DNS (Domain Name System: the largest scale)

Most Internet services are implemented in user space rather than in kernel space.

### CDMA (1989)
- Frequency-Hopping Spread Spectrum
- Direct-Sequence Spread Spectrum
- Orthogonal coding (see the sketch below)
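
A minimal sketch of orthogonal coding with length-4 Walsh codes (the codes and bit values are chosen for illustration): each sender spreads its bit with its own code, and because the codes are orthogonal, each receiver recovers its own bit by correlating the summed channel with its code.

```c++
#include <array>
#include <iostream>

using Code = std::array<int, 4>;                    // chips are +1 / -1

// Two rows of a 4x4 Walsh-Hadamard matrix (orthogonal: their dot product is 0).
constexpr Code kCodeA = {+1, +1, +1, +1};
constexpr Code kCodeB = {+1, -1, +1, -1};

// Spread one data bit (+1 or -1) over the code's chips.
std::array<int, 4> spread(int bit, const Code& c) {
    std::array<int, 4> out{};
    for (int i = 0; i < 4; ++i) out[i] = bit * c[i];
    return out;
}

// Despread: correlate the received chips with the code and take the sign.
int despread(const std::array<int, 4>& rx, const Code& c) {
    int acc = 0;
    for (int i = 0; i < 4; ++i) acc += rx[i] * c[i];
    return acc > 0 ? +1 : -1;
}

int main() {
    int bitA = +1, bitB = -1;
    std::array<int, 4> channel{};
    auto a = spread(bitA, kCodeA), b = spread(bitB, kCodeB);
    for (int i = 0; i < 4; ++i) channel[i] = a[i] + b[i];  // both transmit at once
    std::cout << despread(channel, kCodeA) << " "           // +1: A's bit recovered
              << despread(channel, kCodeB) << "\n";         // -1: B's bit recovered
}
```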
1 change: 1 addition & 0 deletions W14D1/menu.md
@@ -0,0 +1 @@
- [Liu YiYu's note](note_liu_yiyu.md)
93 changes: 93 additions & 0 deletions W14D1/note_liu_yiyu.md
@@ -0,0 +1,93 @@
# Review

```mermaid
graph LR
arch((Arch))
perf[Performance]
func[Functions]
exam[Examination]
vn(Von Neumann)
principles(Principles)
rl(Reducing Latency)
p1([Small = fast])
p2([Simple: RISC])
p3([Tradeoff/Compromise])
p4([Amdahl's Law])

arch --> func
arch --> perf
arch -.- exam
func --- vn
perf ---- principles
perf --- Parallelism
perf --- Locality
perf --- rl
principles --- p1
principles --- p2
principles --- p3
principles --- p4
```

## Functions
Von Neumann's architecture

## Performance
CPI (cycles per instruction)
### Reduce Latency
- higher frequency
- CLA (carry-lookahead adder)

### Principles
- Small (=fast)
- Simple (RISC)
- Tradeoff/Compromise
- Amdahl's Law: make the common case fast. ($\displaystyle s_p=\frac{1}{(1-n)+n/s}$, where $n$ is the fraction of execution that is enhanced and $s$ is that fraction's speedup; worked example below)
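
A quick worked example with assumed numbers: if the fraction $n = 0.9$ of execution is sped up by $s = 10$, the overall speedup is $\displaystyle s_p = \frac{1}{(1-0.9)+0.9/10} = \frac{1}{0.19} \approx 5.26$, far less than 10, because the remaining 10% limits the gain.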

## Pipeline

### Basic Principle
- Balance the stage delays
- Speedup (ideally close to the number of stages when balanced)

### Hazard
$\approx$ stalls

#### Structural
General remedy: duplicate the conflicting resource.

Example: a memory conflict between

- one instruction's 4th stage (MEM, a data access)
- a later instruction's 1st stage (IF, an instruction fetch)

Solution: I-Cache / D-Cache

(i.e., a Harvard architecture with split instruction and data memories)

#### Data
- True dependence (RAW)
  - small distance: forwarding
  - large distance: out-of-order execution (hardware) / code scheduling (software)
- Pseudo-dependence (WAR/WAW)

#### Control
Caused by jumps and branches.
- Early branch prediction (see the sketch after this list)
- reduce the target-calculation delay (e.g. with a BTB, though returns still cause an issue)
- delay slot
- Kill (flush the wrongly fetched instructions)
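
A minimal sketch of early branch prediction using one 2-bit saturating counter per branch (a common scheme, assumed here for illustration; the table size and indexing are hypothetical):

```c++
#include <array>
#include <cstdint>

// One 2-bit saturating counter per entry:
// 0,1 -> predict not taken; 2,3 -> predict taken.
class BimodalPredictor {
    std::array<std::uint8_t, 1024> table_{};   // all counters start at 0

    static std::size_t index(std::uint64_t pc) { return (pc >> 2) % 1024; }

public:
    bool predict(std::uint64_t pc) const { return table_[index(pc)] >= 2; }

    // After the branch resolves, train the counter toward the actual outcome.
    void update(std::uint64_t pc, bool taken) {
        std::uint8_t& c = table_[index(pc)];
        if (taken  && c < 3) ++c;
        if (!taken && c > 0) --c;
    }
};
```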

## Locality: Cache
$\displaystyle AMAT_{\mathrm{cache}} = T_{\mathrm{hit}} + \eta_{\mathrm{miss}}\times T_{\mathrm{penalty}}$

- $T_{\mathrm{hit}}$: (for the cache) keep it small (direct mapping is fast, and fully associative is slow!)
- $\eta_{\mathrm{miss}}$: higher associativity, smaller miss rate
- $T_{\mathrm{penalty}}$: (for memory / L2 cache) wider bus / multi-bank
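
A quick worked example with assumed numbers: with $T_{\mathrm{hit}} = 1$ cycle, $\eta_{\mathrm{miss}} = 5\%$, and $T_{\mathrm{penalty}} = 100$ cycles, $\displaystyle AMAT_{\mathrm{cache}} = 1 + 0.05 \times 100 = 6$ cycles, so even a 95% hit rate leaves the average access time dominated by the miss penalty.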

### Cache

1. use the index bits to find which line (block/set)
2. check whether the tag matches
3. use the block offset to find which part of the block holds the data (see the sketch below)
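
A minimal sketch of splitting an address into tag / index / block offset; the cache geometry here is an assumption (64-byte blocks, 256 sets), not a fixed design.

```c++
#include <cstdint>
#include <cstdio>

constexpr std::uint64_t kBlockBytes = 64;    // -> 6 offset bits
constexpr std::uint64_t kNumSets    = 256;   // -> 8 index bits

struct CacheAddr {
    std::uint64_t tag, index, offset;
};

// Step 1: index picks the set (line); step 2: tag is compared against the stored tag;
// step 3: offset picks the word inside the block.
CacheAddr split(std::uint64_t addr) {
    CacheAddr a;
    a.offset = addr % kBlockBytes;
    a.index  = (addr / kBlockBytes) % kNumSets;
    a.tag    = addr / (kBlockBytes * kNumSets);
    return a;
}

int main() {
    CacheAddr a = split(0x12345678);
    std::printf("tag=%#llx index=%llu offset=%llu\n",
                (unsigned long long)a.tag, (unsigned long long)a.index,
                (unsigned long long)a.offset);
}
```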
1 change: 1 addition & 0 deletions W7D1/menu.md
@@ -1 +1,2 @@
- [钟逸超 (Zhong Yichao)](Note_W7D1_Yichao_Zhong.md)
- [刘祎禹 (Liu Yiyu)](note_liu_yiyu.md)
119 changes: 119 additions & 0 deletions W7D1/note_liu_yiyu.md
@@ -0,0 +1,119 @@
# Note for W7D1

*Authored by Liu Yi-Yu*

Target: reduce $AMAT = HitTime + MissRate \times MissPenalty$

## 5. Reducing Misses by Hardware Prefetching Data

```mermaid
graph LR
Core((Core))
Cache[Cache]
Mem[Memory]
WB[Write Buffer / Stream Buffer]
Core --> |va| Cache
Cache --> Core
Cache --> |pa| Mem
Cache --- WB
WB --- Mem
```
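
A minimal sketch of a next-line hardware prefetcher feeding a one-entry stream buffer (the policy and sizes are assumptions for illustration): on a miss to block B, the memory side also fetches B+1 into the stream buffer, so a later sequential access can be served from the buffer instead of paying a full miss.

```c++
#include <cstdint>
#include <optional>

// One-entry stream buffer: holds the block predicted to be needed next.
struct StreamBuffer {
    std::optional<std::uint64_t> block;   // block number currently buffered
};

// Returns true if the access is satisfied without a full memory miss.
// On any miss, prefetch the next sequential block into the stream buffer.
bool access(std::uint64_t block, bool cache_hit, StreamBuffer& sb) {
    if (cache_hit) return true;
    if (sb.block == block) {              // hit in the stream buffer
        sb.block = block + 1;             // keep streaming ahead
        return true;
    }
    sb.block = block + 1;                 // normal miss: prefetch the next block
    return false;
}
```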

## 6. Reducing Misses by Software Prefetching Data

### Load to a register

### Touch a memory address (cache)

### Make the access order consistent with the data's order in memory
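
A minimal sketch of the first two ideas using the GCC/Clang builtin `__builtin_prefetch` (the look-ahead distance of 16 elements is an assumption, not a tuned value):

```c++
#include <cstddef>

// Sum an array while prefetching a few iterations ahead, so the data is
// already in the cache ("touch a memory address") by the time we load it.
long long sum(const int* a, std::size_t n) {
    long long s = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (i + 16 < n)
            __builtin_prefetch(&a[i + 16], /*rw=*/0, /*locality=*/3);
        s += a[i];                         // "load to a register"
    }
    return s;
}
```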

## Different Kinds of Memory

- SRAM
  - R-S latch
  - D-latch
- DRAM
  - stores each bit as charge on a capacitor
  - EDO / FPM (fast page mode)
  - SDRAM

## 7. Reducing Misses by Compiler Optimization

### Instruction
- Reorder procedures in memory so as to reduce conflict misses
- Profiling to look at conflicts (using tools they developed)

### Data
- Merging Arrays
- Reducing conflicts between val & key:
```c++
/* Before: 2 sequential arrays */
int val[SIZE];
int key[SIZE];
/* After: 1 array of structures */
struct merge {
    int val;
    int key;
};
struct merge merged_array[SIZE];
```

- Loop Interchange
- Sequential accesses instead of striding through memory every 100 words
- improved spatial locality
```c++
/* Before */
for (k = 0; k < 100; k = k + 1)
    for (j = 0; j < 100; j = j + 1)
        for (i = 0; i < 5000; i = i + 1)
            x[i][j] = 2 * x[i][j];
/* After */
for (k = 0; k < 100; k = k + 1)
    for (i = 0; i < 5000; i = i + 1)
        for (j = 0; j < 100; j = j + 1)
            x[i][j] = 2 * x[i][j];
```

- Loop Fusion

```c++
/* Before */
for (i = 0; i < N; i = i + 1)
    for (j = 0; j < N; j = j + 1)
        a[i][j] = 1 / b[i][j] * c[i][j];
for (i = 0; i < N; i = i + 1)
    for (j = 0; j < N; j = j + 1)
        d[i][j] = a[i][j] + c[i][j];
/* After */
for (i = 0; i < N; i = i + 1) {
    for (j = 0; j < N; j = j + 1) {
        a[i][j] = 1 / b[i][j] * c[i][j];
        d[i][j] = a[i][j] + c[i][j];
    }
}
```

- Blocking:

```c++
/* Before */
for (i = 0; i < N; i = i + 1)
    for (j = 0; j < N; j = j + 1) {
        r = 0;
        for (k = 0; k < N; k = k + 1) {
            r = r + y[i][k] * z[k][j];
        }
        x[i][j] = r;
    }
/* After */
for (jj = 0; jj < N; jj = jj + B)
    for (kk = 0; kk < N; kk = kk + B)
        for (i = 0; i < N; i = i + 1)
            for (j = jj; j < min(jj + B - 1, N); j = j + 1) {
                r = 0;
                for (k = kk; k < min(kk + B - 1, N); k = k + 1) {
                    r = r + y[i][k] * z[k][j];
                }
                x[i][j] = x[i][j] + r;
            }
```