file-path:: ../assets/ostep_1680491762166_0.pdf
- virtualization, concurrency, and persistence ls-type:: annotation hl-page:: 1 hl-color:: yellow id:: 642a452b-2dc9-4566-b8a8-95f0caa7b8e3
- didactic: intended to teach; preachy, moralizing hl-page:: 2 ls-type:: annotation id:: 642a7c33-ca04-447a-a357-88236d0b9360 hl-color:: green
- cramming: stuffing full; studying intensively at the last minute before an exam hl-page:: 3 ls-type:: annotation id:: 642a80b7-43a1-40ba-ad01-815cb171572f hl-color:: green
- fond: having affection for (someone or something liked for a long time) hl-page:: 4 ls-type:: annotation id:: 642a80ef-5f31-422a-8227-3842faa4eb8d hl-color:: green
- the operating system as a virtual machine, standard library and resource manager hl-page:: 25 ls-type:: annotation id:: 642bb88b-f481-4af6-afb6-7375d37654ce hl-color:: yellow
- illusion: a false perception or belief hl-page:: 27 ls-type:: annotation id:: 642bb978-1efe-418e-8894-8493bfd4f6bb hl-color:: green
- virtualizing the CPU: run many programs seemingly at the same time hl-page:: 27 ls-type:: annotation id:: 642bb9c2-efb8-4960-a7bd-36afe1cc0b1d hl-color:: yellow
- virtualizing memory: each process accesses its own private virtual address space hl-page:: 29 ls-type:: annotation id:: 642bbaae-582b-4b1c-bc60-e709931085e7 hl-color:: yellow
- intricate: very complicated ls-type:: annotation hl-page:: 33 hl-color:: green id:: 642bbca9-67e2-45d3-a131-968df46c0cef
- heyday: the period of greatest success or vigor ls-type:: annotation hl-page:: 39 hl-color:: green id:: 642bc0e8-ee33-46a6-bf83-407cc1e64737
- incredulous: unwilling or unable to believe; showing disbelief hl-page:: 45 ls-type:: annotation id:: 642bc1b3-459e-4e76-b36c-406daba96726 hl-color:: green
- The definition of a process, informally, is quite simple: it is a running program ls-type:: annotation hl-page:: 47 hl-color:: yellow id:: 642bc39c-f56a-488f-a751-1532e413d474
- To implement virtualization of the CPU, the OS needs both some low-level machinery and some high-level intelligence. hl-page:: 47 ls-type:: annotation id:: 642bc417-a3af-4132-b4b9-9fec9749fc9b hl-color:: yellow
- mechanisms are low-level methods or protocols that implement a needed piece of functionality hl-page:: 47 ls-type:: annotation id:: 642bc41d-8b68-46d9-908e-ab14b53a859b hl-color:: yellow
- Policies are algorithms for making some kind of decision within the OS ls-type:: annotation hl-page:: 48 hl-color:: yellow id:: 642bc579-a1cc-4b67-a8c5-5a33f274c634
- inventory: stock on hand; a detailed list of items hl-page:: 48 ls-type:: annotation id:: 642bc8a9-1a7d-4236-86ce-65b0498e9e01 hl-color:: green
- three states of a process: Running, Ready, Blocked hl-page:: 51 ls-type:: annotation id:: 642bd104-6b57-44f2-9f10-e5107a926079 hl-color:: yellow
- nitty-gritty: the essential, practical details; the heart of the matter hl-page:: 55 ls-type:: annotation id:: 642c279e-0187-4175-abfb-5fcf4e534ae8 hl-color:: green
- KEY PROCESS TERMS ls-type:: annotation hl-page:: 56 hl-color:: yellow id:: 642c27e3-d7d6-41d8-9832-73674615a246
- Interlude: Process API hl-page:: 60 ls-type:: annotation id:: 642cc7d2-cd33-4dd6-879f-72f1070ed96d hl-color:: yellow Syscalls: fork(), wait(), exec()
- Get it right.
Neither abstraction nor simplicity is a substitute for getting it right ls-type:: annotation hl-page:: 65 hl-color:: yellow id:: 642cc975-cf26-431c-8c38-617da01d89ee
- the separation of fork() and exec() is essential for building a shell, because it lets the shell run code after the call to fork() but before the call to exec(); this code can alter the environment of the about-to-be-run program hl-page:: 65 ls-type:: annotation id:: 642ccc92-baa5-4b54-8525-9a664698c669 hl-color:: yellow
- imperative: essential, commanding; a necessity hl-page:: 67 ls-type:: annotation id:: 642ccf62-0b5a-41db-899e-3e99a69c2eac hl-color:: green
- control processes through the signal subsystem hl-page:: 67 ls-type:: annotation id:: 642cd11a-eea5-48f8-b4d1-b225f37ccdb4 hl-color:: yellow
- Limited Direct Execution: direct execution means simply running the program directly on the CPU. hl-page:: 74 ls-type:: annotation id:: 642cd1eb-c058-484a-ac25-782e37082bc6 hl-color:: yellow
- aspiring: ambitious, striving toward a goal hl-page:: 75 ls-type:: annotation id:: 642cd2f2-c565-4309-951a-1f809da9beff hl-color:: green
- user mode and kernel mode hl-page:: 76 ls-type:: annotation id:: 642cd3e3-3fa6-43a6-a37a-cae62c634654 hl-color:: yellow
- In user mode, code is restricted in what it can do; otherwise the processor raises an exception
- User code performs a system call to do privileged operations.
- To execute a system call, the hardware provides the `trap` and `return-from-trap` instructions: jump into/out of the kernel while raising/lowering the privilege level.
- Limited Direct Execution Protocol: an interesting figure illustrating how a system call is carried out hl-page:: 78 ls-type:: annotation id:: 642cd6be-083b-4ecd-b220-aafef97a8b65 hl-color:: yellow
- wary: cautious; watchful hl-page:: 79 ls-type:: annotation id:: 642cd6e9-21fc-49c7-b0bb-0c9ba3b7a524 hl-color:: green
- Switching Between Processes: how the OS regains control of the CPU hl-page:: 80 ls-type:: annotation id:: 642cd90e-ab6b-4d4d-8312-cd8c69efdac8 hl-color:: yellow
- Cooperative approach: wait for a system call. 1. Processes are expected to give up the CPU periodically through the `yield` system call (which does nothing except transfer control to the OS). 2. The process does something that causes a trap.
- Non-cooperative approach: the timer interrupt.
- Context switch: save a few register values for the currently-executing process and restore those of the soon-to-run one.
Two types of register saves/restores: on a timer interrupt, the hardware implicitly saves the user registers of the running process onto its kernel stack; on a context switch, the OS explicitly saves the kernel registers into the process structure and restores those of the next process, so that the return-from-trap resumes a different process
- malfeasance: wrongdoing, misconduct (especially by an official) hl-page:: 81 ls-type:: annotation id:: 642cdc2e-329c-423e-a65a-0a53fb6eaa76 hl-color:: green
- scoff: to mock; an object of ridicule hl-page:: 83 ls-type:: annotation id:: 642ce319-8087-4252-b51d-42f749f7c283 hl-color:: green
- enact: to make into law; to pass (a bill) ls-type:: annotation hl-page:: 83 hl-color:: green id:: 642ce357-9914-417b-8036-35ae44ac7283
- whet: to stimulate (appetite, desire, interest) hl-page:: 84 ls-type:: annotation id:: 642cef47-0c2f-482f-8b61-9a715e5438e5 hl-color:: green
- analogous: similar, comparable ls-type:: annotation hl-page:: 86 hl-color:: green id:: 642cf0c1-59a2-410e-8f45-517f66ef47f9
- workload assumptions: each job runs for the same amount of time, arrives at the same time, runs to completion, uses only the CPU, and its run-time is known hl-page:: 90 ls-type:: annotation id:: 642cf292-4464-4c8f-8639-3a194484d4c0 hl-color:: yellow
- Quite unrealistic, but modern preemptive scheduling somewhat mimics parts of these assumptions
- scheduling metric: turnaround time, aka `completion - arrival` hl-page:: 91 ls-type:: annotation id:: 642cf48d-b312-4af1-a2ff-d55cf9f32e48 hl-color:: yellow
- conundrum: a riddle; a confusing or difficult problem ls-type:: annotation hl-page:: 91 hl-color:: green id:: 642cf4c4-d246-48f2-a6ef-f14c77684ad9
- First In, First Out (FIFO/FCFS) algorithm hl-page:: 91 ls-type:: annotation id:: 642cf4f9-4afd-4240-ac76-5522285fa1eb hl-color:: yellow
- Bad average turnaround time when a long job runs first (i.e., once jobs run for **different amounts of time**)
- Convoy effect: a number of relatively-short potential consumers of a resource get queued behind a heavyweight resource consumer
- Shortest Job First (SJF) hl-page:: 93 ls-type:: annotation id:: 642cf705-4a47-4daa-a542-43c4ae6f239e hl-color:: yellow
- Assuming that all jobs arrive at the same time, SJF can be proven to be an optimal scheduling algorithm
- Degrades into the same convoy-effect problem when jobs **don't arrive at the same time**, for example when short jobs arrive shortly after a long job.
- Shortest Time-to-Completion First (STCF) ls-type:: annotation hl-page:: 94 hl-color:: yellow id:: 642cfce4-67f5-4315-bf81-445922b8ae54
- Basically SJF (which, by our definition, is non-preemptive) with **preemption**: whenever a new job enters the system, STCF schedules the job with the least time left among all present jobs (including the new one).
- Metric: response time, defined as `resp = firstrun - arrival`. Compare with ((642cf48d-b312-4af1-a2ff-d55cf9f32e48)) hl-page:: 95 ls-type:: annotation id:: 642e41ac-3b8f-4fe3-a9ef-e2adeeadfe9d hl-color:: yellow
- For this metric, STCF is not that good.
- Round-Robin (RR) ls-type:: annotation hl-page:: 96 hl-color:: yellow id:: 642e435d-7116-4d2c-9ec3-889558ba2dca
- Does not run jobs to completion; instead, runs a job for a time slice and then switches to the next job in the run queue, repeating until all jobs are done
- Good for response time, bad for turnaround time.
- The length of the time slice is critical: in theory, the shorter the slice, the better the response time. If it is too short, however, the cost of context switching dominates overall performance; that cost comes not only from saving/restoring registers but also from losing state built up in caches, TLBs, and the like.
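- A tiny worked example of the two metrics above (turnaround = completion - arrival, response = firstrun - arrival), not taken from the book's code: three jobs with made-up lengths of 100, 10, and 10 all arrive at t = 0 and are run to completion in FIFO order versus SJF order. (With a 1-unit time slice, RR would cut the average response time to (0 + 1 + 2) / 3 at the cost of turnaround time.)
  ```c
  #include <stdio.h>

  /* Run the jobs back to back in the given order and report the average
   * turnaround and response times (all jobs arrive at time 0). */
  static void run_in_order(const int *len, int n, const char *name) {
      double turnaround = 0, response = 0;
      int t = 0;
      for (int i = 0; i < n; i++) {
          response += t;       /* job i first runs at time t */
          t += len[i];
          turnaround += t;     /* job i completes at time t  */
      }
      printf("%-4s avg turnaround %6.2f  avg response %6.2f\n",
             name, turnaround / n, response / n);
  }

  int main(void) {
      int fifo[] = {100, 10, 10};   /* long job first: the convoy effect */
      int sjf[]  = {10, 10, 100};   /* shortest jobs first               */
      run_in_order(fifo, 3, "FIFO");
      run_in_order(sjf,  3, "SJF");
      return 0;
  }
  ```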
- amortization: paying something off gradually; spreading a cost over many operations hl-page:: 97 ls-type:: annotation id:: 642e4473-4162-4320-91af-fba22e79be25 hl-color:: green
- pessimal: worst possible (the opposite of optimal) ls-type:: annotation hl-page:: 97 hl-color:: green id:: 642e4a5d-37c1-4484-b2ac-913e40d8a2dc
- Incorporating I/O: overlap. The basic idea is to treat each CPU burst (rather than the whole job) as a job, so the job is divided into parts; this lets the scheduler run another job while the first is doing I/O hl-page:: 98 ls-type:: annotation id:: 642e4ed2-7674-4a5f-bb65-67541b97db95 hl-color:: yellow
- Multi-level Feedback Queue (MLFQ) ls-type:: annotation hl-page:: 103 hl-color:: yellow id:: 642e5117-90c3-41db-9f15-45f3ba9edf91
- Workload: a mix of interactive jobs that are short-running, and some longer-running "CPU-bound" jobs
- Basic rules: each job is assigned a priority level, and MLFQ decides which job to run by priority: a higher-priority job runs first, and jobs with the same priority run in RR. ((642ecc9e-b28b-4951-aaf6-1191e867b34f))
- Change priority: set a job's priority based on its observed behavior, e.g., keep high priority for interactive jobs that frequently relinquish the CPU. collapsed:: true
- one of the major goals of the algorithm: it doesn't know whether a job will be short or long, so it first assumes it might be short, giving it high priority. If the job actually is short, it runs quickly and completes; if not, it slowly moves down the queues and thus soon proves itself to be a long-running, more batch-like process. hl-page:: 107 ls-type:: annotation id:: 642ece05-2fa8-4a24-88e2-f2550cfdd2ed hl-color:: yellow
- Approximates SJF
- Basic rules for MLFQ: hl-page:: 104 ls-type:: annotation id:: 642ecc9e-b28b-4951-aaf6-1191e867b34f hl-color:: yellow collapsed:: true
- Rule 1: If Priority(A) > Priority(B), A runs (B doesn't).
- Rule 2: If Priority(A) = Priority(B), A & B run in RR.
- Problematic priority adjustment algorithm hl-page:: 105 ls-type:: annotation id:: 642ecd09-c81b-4127-8bfa-e1fbb78ba583 hl-color:: yellow collapsed:: true
- Rule 3: When a job enters the system, it is placed at the highest priority (the topmost queue).
- Rule 4a: If a job uses up an entire time slice while running, its priority is reduced. id:: 642ecd25-c824-4dcd-9a6a-43a717dd5b1e Rule 4b: If a job gives up the CPU before the time slice is up, it stays at the same priority level.
- Problem 1: starvation. id:: 642ecd3f-c076-42f1-ba24-7f363eba9e14 If too many interactive jobs together occupy the CPU, the long-running jobs never get to run
- Problem 2: gaming the scheduler. For example, a CPU-bound job could intentionally issue a trivial I/O request just before its time slice is over, so that it is not moved to a lower queue even though it should be.
- Problem 3: program behavior change. id:: 642ed383-c27c-401c-b77f-66e7ec60ba5e
- The Priority Boost ls-type:: annotation hl-page:: 109 hl-color:: yellow id:: 642ed47c-9ba8-4451-b6b3-6ca6ee1dbdda collapsed:: true
- Rule 5: After some time period `S`, move all the jobs in the system to the topmost queue.
- This solves ((642ecd3f-c076-42f1-ba24-7f363eba9e14)) and ((642ed383-c27c-401c-b77f-66e7ec60ba5e)): since priorities get recalculated periodically, the scheduler re-learns each job's traits, which may have changed.
- However, how to choose such an `S` is a problem.
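- A minimal sketch of Rules 1-5 above, assuming a fixed pair of always-runnable jobs instead of real run queues; since no I/O is modeled, Rule 4b (a job that gives up the CPU early keeps its level) is never exercised. `NUM_LEVELS` and `BOOST_PERIOD` are made-up "voo-doo constants".
  ```c
  #include <stdio.h>

  #define NUM_LEVELS   3      /* level 0 is the highest priority          */
  #define BOOST_PERIOD 10     /* boost every S = 10 time slices (Rule 5)  */
  #define NUM_JOBS     2

  static int level[NUM_JOBS]; /* Rule 3: all jobs start at level 0        */
  static int rr_next;         /* rotates among equal-priority jobs        */

  /* Rules 1 + 2: run the highest-priority job; round-robin within a level. */
  static int pick_next(void) {
      for (int lvl = 0; lvl < NUM_LEVELS; lvl++)
          for (int k = 0; k < NUM_JOBS; k++) {
              int j = (rr_next + k) % NUM_JOBS;
              if (level[j] == lvl) { rr_next = j + 1; return j; }
          }
      return -1;              /* no runnable job (never happens here)     */
  }

  /* Rule 4a: the job used its entire time slice, so reduce its priority. */
  static void used_whole_slice(int j) {
      if (level[j] < NUM_LEVELS - 1)
          level[j]++;
  }

  /* Rule 5: periodically move every job back to the topmost queue so that
   * long-running jobs cannot starve and changed behavior is re-learned.  */
  static void priority_boost(void) {
      for (int j = 0; j < NUM_JOBS; j++)
          level[j] = 0;
  }

  int main(void) {
      for (int slice = 0; slice < 20; slice++) {
          if (slice > 0 && slice % BOOST_PERIOD == 0)
              priority_boost();
          int j = pick_next();
          printf("slice %2d: job %d runs at level %d\n", slice, j, level[j]);
          used_whole_slice(j);  /* this toy workload is purely CPU-bound  */
      }
      return 0;
  }
  ```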
- voo-doo constants ls-type:: annotation hl-page:: 109 hl-color:: yellow id:: 642ed799-d933-441a-a043-06e47877c0d9
- Better Accounting ls-type:: annotation hl-page:: 110 hl-color:: yellow id:: 642ed5f6-2a74-4a24-9f69-a472cf644fc9 collapsed:: true
- Rule 4: Once a job uses up its time allotment at a given level (regardless of how many times it has given up the CPU), its priority is reduced.
- This replaces ((642ecd25-c824-4dcd-9a6a-43a717dd5b1e))
- parameterized scheduler hl-page:: 110 ls-type:: annotation id:: 642ed6a2-c04b-4f03-a86f-1d70933c0d42 hl-color:: yellow
- relinquish: to hand over; to give up hl-page:: 105 ls-type:: annotation id:: 642ec9be-dd64-4a66-ab7a-3f0ee376e055 hl-color:: green
- culprit: the person or thing responsible for a problem; an accused offender ls-type:: annotation hl-page:: 110 hl-color:: green id:: 642ed5fe-3314-4020-a4fc-b1b75ea987b9
- Proportional-share (fair-share) scheduler hl-page:: 115 ls-type:: annotation id:: 642eda22-bd34-42e4-b7e4-1107636d1fbc hl-color:: yellow
- Instead of optimizing for turnaround or response time, the scheduler tries to guarantee that each job obtains a certain percentage of CPU time.
- tickets: represent the share of a resource that a process should receive hl-page:: 115 ls-type:: annotation id:: 642edaa2-78d2-4fde-8b14-584b7d39fa24 hl-color:: yellow
- ticket currency: a user-facing mechanism. Users allocate tickets freely among their own processes, and the system converts user tickets into global tickets according to an exchange rate, to keep fairness between users. hl-page:: 117 ls-type:: annotation id:: 642edc09-1329-4eca-95ad-7f62b48875e2 hl-color:: yellow
- ticket transfer: a form of cooperation between processes. A process temporarily hands its tickets off to another process. hl-page:: 117 ls-type:: annotation id:: 642edc0f-464d-4616-9313-92b640cecec5 hl-color:: yellow
- ticket inflation: another form of cooperation. A process can temporarily raise or lower the number of tickets it owns, to indicate that it needs more CPU. hl-page:: 117 ls-type:: annotation id:: 642edc14-74d6-4758-a21f-d615d2ee51c9 hl-color:: yellow
- Lottery scheduling hl-page:: 115 ls-type:: annotation id:: 642edb1d-7740-4459-bb42-0c6a84156475 hl-color:: yellow
- The scheduler **randomly** picks a winning ticket (i.e., number the tickets and draw one at random); the job that holds this ticket runs. The more tickets a job holds, the higher its chance of being chosen. Thus the CPU is shared proportionally, probabilistically.
- Lottery Fairness Study: When the job length is not very long, unfairness can be quite severe. Only as the jobs run for a significant number of time slices does the lottery scheduler approach the desired outcome. ls-type:: annotation hl-page:: 119 hl-color:: yellow id:: 642eded0-39d0-40aa-8b28-c273d39f90c2
- Stride scheduling: a **deterministic** fair-share scheduler. hl-page:: 120 ls-type:: annotation id:: 642edf4f-7c7f-477f-acae-0969da13731e hl-color:: yellow
- Each job has a *stride*, inversely proportional to the number of tickets it holds (conceptually a reciprocal: some large constant divided by the ticket count). Every time a process runs, its counter (called its *pass* value) increases by one stride. The scheduler always picks the process with the lowest pass value to run.
- Why still use lottery scheduling? No global state, so it is much easier to implement.
- Problem: how do you determine how many tickets to assign to processes with different purposes and traits? MLFQ does this automatically, but here nobody does.
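- A minimal sketch of the two proportional-share policies above, with made-up ticket counts of 100, 50, and 250 for three jobs: one lottery draw per time slice versus a deterministic stride scheduler. Over 400 slices the stride counts come out exactly proportional (100/50/250), while the lottery counts only approximate that split.
  ```c
  #include <stdio.h>
  #include <stdlib.h>

  #define NUM_JOBS 3
  static const int tickets[NUM_JOBS] = {100, 50, 250};

  /* Lottery: draw a winning ticket in [0, total) and walk the jobs until the
   * running count passes the winner. Proportional only in expectation.      */
  static int lottery_pick(void) {
      int total = 0;
      for (int j = 0; j < NUM_JOBS; j++) total += tickets[j];
      int winner = rand() % total, counter = 0;
      for (int j = 0; j < NUM_JOBS; j++) {
          counter += tickets[j];
          if (counter > winner) return j;
      }
      return NUM_JOBS - 1;     /* not reached */
  }

  /* Stride: stride = K / tickets; always run the job with the lowest pass
   * value, then advance its pass by its stride. Deterministically fair.     */
  static int stride_pick(double *pass) {
      int best = 0;
      for (int j = 1; j < NUM_JOBS; j++)
          if (pass[j] < pass[best]) best = j;
      pass[best] += 10000.0 / tickets[best];   /* K = 10000 */
      return best;
  }

  int main(void) {
      double pass[NUM_JOBS] = {0};
      int lot[NUM_JOBS] = {0}, str[NUM_JOBS] = {0};
      for (int slice = 0; slice < 400; slice++) {
          lot[lottery_pick()]++;
          str[stride_pick(pass)]++;
      }
      for (int j = 0; j < NUM_JOBS; j++)
          printf("job %d: %3d tickets  lottery %3d  stride %3d\n",
                 j, tickets[j], lot[j], str[j]);
      return 0;
  }
  ```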
- Completely Fair Scheduler (CFS) ls-type:: annotation hl-page:: 121 hl-color:: yellow id:: 642ee1b9-281d-4589-ab90-e776507dd04f
- Goal: to fairly divide a CPU evenly among all competing processes. hl-page:: 122 ls-type:: annotation id:: 642ee242-d382-4685-86b4-b3169fcc4fcf hl-color:: yellow
- virtual runtime: as each process runs, it accumulates `vruntime`, and the scheduler picks the process with the lowest `vruntime` to run next. hl-page:: 122 ls-type:: annotation id:: 642ee25b-1f3a-4b7c-a721-fa60f5fa5d2f hl-color:: yellow
- For blocked processes: the `vruntime` of a job must be adjusted when it wakes up; otherwise its `vruntime` would be far too small, breaking fairness. CFS sets it to the minimum `vruntime` found among the runnable processes. hl-page:: 125 ls-type:: annotation id:: 642ee7e4-20d2-44d5-8407-288a8a2e1769 hl-color:: yellow
- Parameters
- `sched_latency`: the scheduler divides this value by the number of running processes `n`; the result is used as the time slice for each process. This simple approach adapts to the changing number of running processes. hl-page:: 122 ls-type:: annotation id:: 642ee303-c289-4e55-a0c5-bc4f534fa882 hl-color:: yellow
- `min_granularity`: a lower bound on the time slice, so that too many processes cannot shrink the slice far enough to hurt performance hl-page:: 122 ls-type:: annotation id:: 642ee3d6-827b-4d80-b6c3-9cb8253a16d6 hl-color:: yellow
- Weighting (Niceness): every process is assigned a `nice` value ranging from -20 to 19; the lower the nice value, the higher the priority. A nice value is mapped to a `weight` through a carefully built table. hl-page:: 123 ls-type:: annotation id:: 642ee44f-5fca-4d7d-b688-ff4ac22be23a hl-color:: yellow
- Given the weights, the time slice can be calculated, and the `vruntime` calculation is adapted accordingly (weight_0 is the weight of the default nice value 0): $$ time\_slice_k = \frac{weight_k}{\sum weight_i}\cdot sched\_latency \\ vruntime_i = vruntime_i + \frac{weight_0}{weight_i} \cdot runtime_i$$
- hallmark: a distinguishing feature ls-type:: annotation hl-page:: 126 hl-color:: green id:: 642ee5dd-9183-4122-9dee-06ff7fb9be46
- panacea: a cure-all hl-page:: 126 ls-type:: annotation id:: 642ee8c2-89fe-4e7b-bf6d-bb0e379f8fe2 hl-color:: green
- remedy: a way of fixing a problem ls-type:: annotation hl-page:: 129 hl-color:: green id:: 642eeb3a-8803-4c73-84a7-cc48c903f10f
- proliferation: rapid increase; multiplication hl-page:: 129 ls-type:: annotation id:: 642eeb44-cf62-4cf6-81cb-f1f6423cb66d hl-color:: green
- Problems with multiple processors
- cache coherence: basically, the hardware handles this hl-page:: 132 ls-type:: annotation id:: 642eecc1-d07d-4e48-bcd5-b84db831b241 hl-color:: yellow
- Synchronization: locks ensure correctness, but they hurt performance hl-page:: 132 ls-type:: annotation id:: 642f878b-44c2-4485-ae98-448032b588da hl-color:: yellow
- Cache affinity: a CPU's caches may still hold some of a process's state, so the process runs faster if it is scheduled on the same CPU next time, since that state need not be reloaded from memory. hl-page:: 133 ls-type:: annotation id:: 642f87d0-2e9a-405c-8fa6-689dc492ef52 hl-color:: yellow
- Single-Queue Multiprocessor Scheduling (SQMS) hl-page:: 134 ls-type:: annotation id:: 642f88e3-3f8b-472b-8947-09531409a23b hl-color:: yellow
- Simply reuse the single-processor policy, picking the best several jobs to run (one per CPU).
- Problem 1: lack of scalability. With a single global queue there is heavy contention on the one lock protecting it, which greatly reduces performance.
- Problem 2: cache affinity. If the scheduler simply hands out jobs to CPUs in queue order, jobs bounce around from CPU to CPU.
A complex affinity mechanism is then needed to make it more likely that a process keeps running on the same CPU.
- Multi-Queue Multiprocessor Scheduling (MQMS). hl-page:: 135 ls-type:: annotation id:: 642f90ef-6bf0-42be-8056-188682da8901 hl-color:: yellow
- Consists of multiple independent queues, each following some particular policy; this avoids the problems of sharing and synchronization.
- More scalable: as the number of CPUs grows, add more queues.
- Better cache affinity: jobs in the same queue stay on the same CPU
- Problem: load imbalance. Jobs in a queue with fewer jobs get a larger CPU share than those in a queue with more jobs; even worse, some CPUs may sit idle (one core struggles while the other seven look on). hl-page:: 136 ls-type:: annotation id:: 642f934e-0f7b-43de-97a1-fc530b229098 hl-color:: yellow
- Migration: the obvious solution to load imbalance is to migrate jobs from one CPU to another. Sometimes the migration has to be continuous: if Q1 has 1 job and Q2 has 2 jobs, you may want to keep moving the third job back and forth between CPUs to balance the load. hl-page:: 137 ls-type:: annotation id:: 642f9564-38b9-4f22-a0af-f4c4f6c8fe76 hl-color:: yellow
- Work stealing: a source queue that is low on jobs occasionally peeks at other queues to see whether it should pull some jobs over to help balance the load.
- sinister: threatening, ominous hl-page:: 136 ls-type:: annotation id:: 642f914d-502e-4f9d-8878-cf331e7f3fc3 hl-color:: green
- insidious: causing harm in a gradual, stealthy way; treacherous hl-page:: 137 ls-type:: annotation id:: 642f9458-c07c-4ef3-a1ec-a14e76ea4b2b hl-color:: green
- dissertation: a thesis; a long academic treatise ls-type:: annotation hl-page:: 138 hl-color:: green id:: 642f96cb-066e-4ab9-975c-a746e3143062
- daunting: intimidating; discouraging ls-type:: annotation hl-page:: 138 hl-color:: green id:: 642f97ac-a4ab-4baf-b039-678c466ea588
- undertake: to take on; to commit oneself to doing ls-type:: annotation hl-page:: 138 hl-color:: green id:: 642f97b5-d203-4998-84f0-21d66f8424b7
- Linux Multiprocessor Schedulers: three different schedulers. CFS and O(1) are multi-queue (MQMS), while BFS is single-queue (SQMS) and is a proportional-share design based on EEVDF. hl-page:: 138 ls-type:: annotation id:: 642f981d-af29-4cc1-a9da-b445cb964674 hl-color:: yellow
- super-linear speedup: in parallel computing, a speedup of more than N is sometimes observed when using N processors. One possible reason is that the N CPUs together offer a larger combined cache; if the working set then fits in cache, many memory accesses can be eliminated, greatly improving performance. hl-page:: 141 ls-type:: annotation id:: 642faf44-5876-46b3-acb4-b786f390716f hl-color:: yellow
- paranoid: unreasonably suspicious; a person suffering from paranoia hl-page:: 142 ls-type:: annotation id:: 642fb2f2-b9a4-48f4-b847-2b30d632db32 hl-color:: green
- rage: violent anger; to spread rapidly (of a fire, epidemic, etc.) hl-page:: 143 ls-type:: annotation id:: 642fb3d2-e51f-4b9b-8a79-dea9d8e6a7b0 hl-color:: green
- inundate: to flood; to overwhelm hl-page:: 144 ls-type:: annotation id:: 642fb4c0-5c56-44e9-aac6-396212698309 hl-color:: green
- errant: wandering; straying from the proper course; erring ls-type:: annotation hl-page:: 145 hl-color:: green id:: 642fb51d-f82c-40b4-82ad-878ee13a2264
- darned ls-type:: annotation hl-page:: 146 hl-color:: green id:: 642fb54e-23c4-4667-955d-ad09fbbf6268
- pesky: annoying; troublesome ls-type:: annotation hl-page:: 148 hl-color:: green id:: 642fb9df-d4c2-4b75-9341-e9ea53d42dcc
- Address space: the process's view of memory, the abstraction that the OS provides to the running program. When the OS does this, we say it is **virtualizing memory**.
hl-page:: 148 ls-type:: annotation id:: 642fbad1-cda0-4ef4-97f7-38c3519042f4 hl-color:: yellow
- Goals: transparency, efficiency, and protection
- tandem ls-type:: annotation hl-page:: 150 hl-color:: green id:: 642fbb6e-9a9a-4ff6-92ea-b36903da1b88
- stipulate ls-type:: annotation hl-page:: 150 hl-color:: green id:: 642fbc3d-d70b-4adb-bb0b-6b7038480589
- scribble ls-type:: annotation hl-page:: 159 hl-color:: green id:: 642fbfcb-7327-44b8-9b25-308728264a81
- Memory API: this interlude chapter covers memory allocation interfaces like `malloc` and `free`. Quite trivial for proficient C programmers. hl-page:: 155 ls-type:: annotation id:: 642fc139-afc3-4ba4-ad01-4cc33d9e535c hl-color:: yellow
- hardware-based address translation ls-type:: annotation hl-page:: 167 hl-color:: yellow id:: 642fd36e-c0a7-4ead-a3a8-f085d6576229
- Assumptions ls-type:: annotation hl-page:: 167 hl-color:: yellow id:: 642fc48b-e5bf-4043-b042-218445c2b714
- Address space mapped to contiguous physical memory id:: 6430040b-bfaa-4148-a49a-1a22350e0c33
- Address space small enough to be held entirely in physical memory
- Each address space is the same size.
- Dynamic (Hardware-based) Relocation ls-type:: annotation hl-page:: 170 hl-color:: yellow id:: 642fc65f-1147-459c-8906-5ffca38b1e66
- Software-based (static) relocation: the program loader rewrites the to-be-loaded program's addresses according to its target offset in physical memory. The most important problem with this approach is that protection can hardly be enforced. hl-page:: 171 ls-type:: annotation id:: 642fc677-cbd4-47fd-bd90-3ed0cd56cc5b hl-color:: yellow
- base-and-bounds: two registers on *each* CPU, used to determine the physical location of the address space.
- The program is compiled as if it were loaded at address 0x00000000. At startup, the OS decides where to put the program in physical memory and sets the `base` register. While the program runs, the CPU translates each of the process's memory references (virtual address -> physical address) and issues the request to RAM using the physical address: `physical address = virtual address + base`
- The `bounds` register exists for protection: the hardware checks that the address does not exceed the bounds
- Dynamic Relocation: Hardware Requirements ls-type:: annotation hl-page:: 174 hl-color:: yellow id:: 642fd1ed-22a9-4443-b02f-bc35fe42e50b
- Dynamic Relocation: Operating System Responsibilities. In addition to the LDE introduced in CPU virtualization, a little more work is needed, such as saving/restoring the base/bounds registers on context switches and allocating/freeing memory for address spaces. hl-page:: 175 ls-type:: annotation id:: 642fd1f4-d573-4dbc-9e48-02adacc69e5e hl-color:: yellow
- This stuff is not difficult to figure out on your own, so why bother keeping notes on it?
- Problem: internal fragmentation. Since the address space has a fixed size under dynamic relocation, the unused memory between the stack and the heap is wasted. hl-page:: 178 ls-type:: annotation id:: 642fd2f3-e05f-453c-8286-15e7807e8e97 hl-color:: yellow
- Programs may also want a larger address space (even though it is not fully used)
- Memory Management Unit (MMU): the part of the processor that helps with address translation.
hl-page:: 172 ls-type:: annotation id:: 642fced7-c809-4d1e-bddb-da6474e851b8 hl-color:: yellow
- havoc ls-type:: annotation hl-page:: 173 hl-color:: green id:: 642fcf85-6cfb-4eb9-aef2-ddcef7027b70
- wreak ls-type:: annotation hl-page:: 173 hl-color:: green id:: 642fcf90-3133-4002-b986-4fd8157ab707
- ghastly ls-type:: annotation hl-page:: 174 hl-color:: green id:: 642fcfa9-c026-4885-9f50-029ca80ce148
- juncture ls-type:: annotation hl-page:: 174 hl-color:: green id:: 642fcff1-ae71-4756-9847-5ab85c41be06
- oblivious: unaware; not noticing; forgetful ls-type:: annotation hl-page:: 175 hl-color:: green id:: 642fd071-db9c-4f0e-886f-b6ba3e5e4f7d
- Segmentation: Generalized Base/Bounds ls-type:: annotation hl-page:: 181 hl-color:: yellow id:: 642fd5c3-30f9-4770-8ec5-09555d21c4ab
- Divide the address space into contiguous segments; **the address space as a whole is no longer contiguous in physical memory**. ((6430040b-bfaa-4148-a49a-1a22350e0c33))
- A base and bounds pair per logical segment of the address space. Place each segment in a different part of physical memory, and thus avoid filling physical memory with unused virtual address space. Accordingly, the MMU needs an extra set of base/bounds registers.
- Selecting a segment: which segment does a virtual address refer to?
- Explicit method: use the top few bits of the virtual address as the segment selector and the rest as the in-segment offset (a translation sketch follows at the end of this section).
- Problem 1: wasted bit patterns; for example, with only 3 segments we still need 2 bits, which can encode 4.
- Problem 2: it limits the use of the virtual address space. Because the top bits are taken to select the segment, the maximum size of each segment is reduced.
- Implicit approach: the hardware determines the segment by noticing how the address was formed, e.g., PC -> code segment, SP -> stack segment hl-page:: 185 ls-type:: annotation id:: 642fda54-4040-4d9b-ab00-13523aeb54c4 hl-color:: yellow
- Stack segment support: the stack grows backwards. First, add a hardware field indicating whether a segment grows in the positive direction or not. For a negative-growing segment, the physical address is calculated as `PA = VA[offset] - MAX_SEG_SIZE + base` (a signed operation) hl-page:: 186 ls-type:: annotation id:: 642fdca4-df4b-4515-8e96-ef4b05a7bb62 hl-color:: yellow
- A real-world example of this is the E (Expand-Down) bit in x86's segment descriptor.
- Why such a strange mechanism exists is explained at [osdev-expand_down](https://wiki.osdev.org/Expand_Down#Expand_Down): in short, a program may need a segment to grow when the initially allocated segment turns out to be too small.
- Support for Sharing: protection bits. hl-page:: 187 ls-type:: annotation id:: 642fe0dc-7d4b-41af-a136-69164ee77ab4 hl-color:: yellow
- Attach several protection bits to each segment register. For example, by marking the code segment read-only, it can safely be shared across processes, saving the memory that would otherwise hold a copy of the code for every process running the same program.
- Problem 1: variable-sized segments cause ((64302439-30a1-4724-ab5f-0d90ab8530e2)) id:: 6430040b-2a6e-4342-8db8-bc92893d0a3f
- Problem 2: not flexible enough. What if we want a large but sparsely allocated heap? The heap segment would have to be very large yet mostly wasted.
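- A minimal sketch of the explicit-selection translation described above, in the style of the book's small example: a 14-bit virtual address whose top two bits pick the segment and whose low 12 bits are the offset. The segment-table values and bit assignments here are made up for illustration; the stack uses the negative-offset rule `PA = offset - MAX_SEG_SIZE + base`.
  ```c
  #include <stdio.h>
  #include <stdint.h>

  #define SEG_SHIFT    12
  #define SEG_MASK     0x3000          /* top 2 bits of a 14-bit address  */
  #define OFFSET_MASK  0x0FFF
  #define MAX_SEG_SIZE 4096

  struct segment {
      uint32_t base;
      uint32_t bounds;                 /* bytes actually in use           */
      int grows_negative;              /* 1 for the stack                 */
  };

  /* 00 = code, 01 = heap, 11 = stack; pattern 10 is one of the wasted ones */
  static struct segment segtab[4] = {
      [0] = { .base = 32 * 1024, .bounds = 2048 },
      [1] = { .base = 34 * 1024, .bounds = 3072 },
      [3] = { .base = 28 * 1024, .bounds = 2048, .grows_negative = 1 },
  };

  static int translate(uint32_t va, uint32_t *pa) {
      uint32_t seg = (va & SEG_MASK) >> SEG_SHIFT;
      uint32_t off = va & OFFSET_MASK;
      const struct segment *s = &segtab[seg];
      if (!s->grows_negative) {
          if (off >= s->bounds) return -1;             /* segmentation fault */
          *pa = s->base + off;
      } else {
          int32_t neg = (int32_t)off - MAX_SEG_SIZE;   /* negative offset    */
          if ((uint32_t)(-neg) > s->bounds) return -1;
          *pa = (uint32_t)((int32_t)s->base + neg);
      }
      return 0;
  }

  int main(void) {
      uint32_t vas[] = {0x0064, 0x1068, 0x3F00};       /* code, heap, stack  */
      for (int i = 0; i < 3; i++) {
          uint32_t pa;
          if (translate(vas[i], &pa) == 0)
              printf("VA 0x%04x -> PA 0x%05x\n", vas[i], pa);
          else
              printf("VA 0x%04x -> segmentation fault\n", vas[i]);
      }
      return 0;
  }
  ```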
- external fragmentation hl-page:: 193 ls-type:: annotation id:: 64302439-30a1-4724-ab5f-0d90ab8530e2 hl-color:: yellow
- the free space gets chopped into little pieces of different sizes (fragmented); subsequent requests may fail because no single contiguous region can satisfy the request, even though the total amount of free space suffices.
- Assumptions on free space management hl-page:: 194 ls-type:: annotation id:: 6430258a-4890-4fea-97c3-2d6bef20b251 hl-color:: yellow
- Assume the interface pair `void* malloc(size_t size)` and `void free(void* ptr)`, plus a generic data structure called the `free list` (not necessarily an actual list)
- Ignore internal fragmentation
- No reallocation, no compaction
- Assume the allocator manages a contiguous, fixed-size region of bytes
- coalesce: to merge; to combine hl-page:: 195 ls-type:: annotation id:: 64302a62-c183-4dbf-94de-1861deb192f4 hl-color:: green
- intact: whole; untouched; undamaged ls-type:: annotation hl-page:: 196 hl-color:: green id:: 64302b20-f76e-4516-868f-0ecab376328f
- corollary: a natural consequence or conclusion hl-page:: 196 ls-type:: annotation id:: 64302b32-b01b-4cb2-b027-5ad59bd7ec11 hl-color:: green
- Low-level mechanisms (some discussion of practical details)
- A free list contains a set of elements that describe the free space still remaining in the heap, for example `{addr, len}` hl-page:: 195 ls-type:: annotation id:: 64302ac2-a341-467e-9469-baca1e8aa7f2 hl-color:: yellow
- Splitting and Coalescing ls-type:: annotation hl-page:: 195 hl-color:: yellow id:: 64302da4-d1d5-41df-8ec8-883712d948d5
- split: if the request is smaller than a free block, the allocator splits the block, hands out the requested part, and puts the remainder back on the free list
- coalesce: when a chunk is returned, the allocator looks for adjacent free chunks in the free list and merges them into a larger free chunk
- Tracking The Size Of Allocated Regions ls-type:: annotation hl-page:: 197 hl-color:: yellow id:: 64302eab-54b3-4e8c-a105-cc3e8747c0f3
- Store some extra info in a header, which generally contains a size field plus some checking or speedup fields. Commonly, the header is placed just before the handed-out chunk of memory.
- When allocating, we look for a free block that fits `N + sizeof(header)` instead of merely `N`. In other words, the header is allocated along with the requested chunk of memory, though the returned pointer points just past the header. See ((64303176-25c9-4d28-bd6e-b5b00bf74e3a)). The header's location can be recovered through pointer arithmetic: `ptr - sizeof(header)`.
- Figure 17.2: Specific Contents Of The Header hl-page:: 197 ls-type:: annotation id:: 64303176-25c9-4d28-bd6e-b5b00bf74e3a hl-color:: yellow
- Embedding A Free List ls-type:: annotation hl-page:: 198 hl-color:: yellow id:: 643032f8-4103-4d7c-b471-fae4364ce7b5
- The free list's nodes reside inside the very chunk of memory they manage, because the allocator cannot use something like `malloc` to build the list (a minimal sketch follows below).
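- A minimal sketch, not the book's exact code, of the two pieces of machinery above: a header stamped just in front of each handed-out region, and free-list nodes embedded in the free memory itself (one chunk obtained with `mmap`, first fit, no coalescing, edge cases ignored). The struct and constant names are made up.
  ```c
  #include <stdio.h>
  #include <stddef.h>
  #include <sys/mman.h>

  typedef struct header { size_t size; unsigned long magic; } header_t;
  typedef struct node   { size_t size; struct node *next;   } node_t;

  #define MAGIC     0x1234567UL
  #define HEAP_SIZE 4096

  static node_t *freelist;

  /* One free chunk spanning the whole region; its node lives inside it. */
  static void heap_init(void) {
      void *heap = mmap(NULL, HEAP_SIZE, PROT_READ | PROT_WRITE,
                        MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
      freelist = (node_t *)heap;
      freelist->size = HEAP_SIZE - sizeof(node_t);
      freelist->next = NULL;
  }

  /* Find a free chunk holding at least n + sizeof(header_t) bytes. The header
   * takes over the front of the chunk and the free-list node "moves" to the
   * start of the remaining free memory, shrunk accordingly.                  */
  static void *my_malloc(size_t n) {
      size_t need = n + sizeof(header_t);
      node_t **prev = &freelist;
      for (node_t *cur = freelist; cur; prev = &cur->next, cur = cur->next) {
          if (cur->size < need)
              continue;
          node_t saved = *cur;                    /* about to be overwritten  */
          header_t *h = (header_t *)cur;
          h->size  = n;
          h->magic = MAGIC;
          node_t *rest = (node_t *)((char *)cur + need);
          rest->size = saved.size - need;
          rest->next = saved.next;
          *prev = rest;
          return h + 1;                           /* pointer just past header */
      }
      return NULL;
  }

  /* free(): step back over the header to learn the size of the region; a real
   * allocator would then reinsert it into the free list and try to coalesce. */
  static void my_free(void *ptr) {
      header_t *h = (header_t *)ptr - 1;
      printf("freeing %zu bytes, magic %s\n", h->size,
             h->magic == MAGIC ? "ok" : "corrupt");
  }

  int main(void) {
      heap_init();
      void *p = my_malloc(100);
      printf("free node now at %p with %zu usable bytes\n",
             (void *)freelist, freelist->size);
      my_free(p);
      return 0;
  }
  ```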
- The implementation described in the textbook can be summarized as: node structure `{size, *next}`; each node resides at the start of a contiguous free chunk of memory; when memory is allocated from a chunk, the node moves to the start of the remaining free memory
- Basic Strategies: quite boring policies hl-page:: 203 ls-type:: annotation id:: 643034f6-efa7-4f59-80a9-3561c52a5218 hl-color:: yellow
- Best Fit: pick the smallest chunk of memory that is as big as or bigger than the requested size hl-page:: 203 ls-type:: annotation id:: 6430350b-81b6-44db-a10d-fb0b392e957c hl-color:: yellow
- Tries to reduce wasted space, though the exhaustive search is costly and many small free fragments are left behind
- Worst Fit: pick the largest chunk of memory that is as big as or bigger than the requested size hl-page:: 203 ls-type:: annotation id:: 6430350e-5d5f-4e5a-9342-7b57874ffda8 hl-color:: yellow
- Tries to leave big chunks free and avoid small fragments, though it is also costly and may not work as intended (in practice it can lead to excess fragmentation)
- First Fit: select the first block that is big enough hl-page:: 204 ls-type:: annotation id:: 64303512-4b59-4c95-b86d-62a2e1349490 hl-color:: yellow
- No exhaustive search, thus fast.
- Problem: small pieces tend to accumulate at the beginning of the free list (first fit keeps taking memory there), so performance worsens. One mitigation is address-based ordering: keeping the list sorted by the addresses of the free chunks, which also makes coalescing easier.
- Next Fit: a modified first fit. It remembers where the last search ended, so it does not always take memory from the beginning of the list. hl-page:: 204 ls-type:: annotation id:: 64303515-c336-4934-987d-4ea4a29a7dfd hl-color:: yellow
- envision: to imagine; to picture in the future ls-type:: annotation hl-page:: 204 hl-color:: green id:: 6430382d-9e37-412f-b983-e2a7f624954a
- splinter: a sharp fragment; a sliver hl-page:: 204 ls-type:: annotation id:: 64303831-b777-42c4-a349-d6e95ac91b25 hl-color:: green
- **Segregated Lists** hl-page:: 205 ls-type:: annotation id:: 64303b17-81c8-4d92-ab80-1e48c3598403 hl-color:: yellow
- Basic idea: if a particular application has one (or a few) popular request sizes, keep a separate list just to manage objects of that size; all other requests are forwarded to a more general memory allocator. hl-page:: 205 ls-type:: annotation id:: 64303b24-47e4-4016-8111-519c599e6cca hl-color:: yellow
- Advantage: reduced fragmentation and fast service IF requests are of the RIGHT size. Problem: how much memory should be dedicated to these specialized lists rather than to the general allocator?
- Real-world example: the `slab` allocator.
- At kernel bootstrap, allocate a number of object caches for kernel objects that are likely to be requested frequently hl-page:: 205 ls-type:: annotation id:: 64303d7c-6ac2-462c-98c9-1b61ea58a396 hl-color:: yellow
- When a given cache is running low on free space, it requests some slabs of memory from a more general memory allocator hl-page:: 205 ls-type:: annotation id:: 64303d8e-54a3-4b7f-a879-7c2533ad8b02 hl-color:: yellow
- When the reference counts of the objects within a given slab all go to 0, the general allocator can reclaim them from the specialized allocator hl-page:: 205 ls-type:: annotation id:: 64303da3-9def-447a-9cc8-08942db17f38 hl-color:: yellow
- Free objects on the segregated lists are kept in a **pre-initialized state**, since initialization and destruction of data structures are costly hl-page:: 206 ls-type:: annotation id:: 64303db7-987f-47ed-84de-c17e421159aa hl-color:: yellow
- **Binary Buddy Allocator**: simplifies coalescing hl-page:: 206 ls-type:: annotation id:: 64303dc5-8073-4659-9f39-ec6167683a7d hl-color:: yellow
- Allocation: recursively divide free space in half until a block is found that is just big enough to accommodate the request (a further split would make it too small). hl-page:: 206 ls-type:: annotation id:: 64303f3a-2b0f-4106-8ab1-48e4839fe971 hl-color:: yellow
- Deallocation: when a block is returned, check whether its buddy is free; if so, coalesce the two. Keep coalescing recursively up the tree, stopping when a buddy is found to be in use. hl-page:: 206 ls-type:: annotation id:: 64304015-d9ff-408b-bf88-cf37a33e832d hl-color:: yellow
- Nice property: the addresses of the two blocks in a buddy pair differ in only a single bit, determined by the block size, so the buddy is trivial to locate (see the small arithmetic sketch at the end of these notes). hl-page:: 207 ls-type:: annotation id:: 643040e1-7784-424c-9096-d3e22fddbf9e hl-color:: yellow
- Page: fixed-size memory unit in the address space hl-page:: 211 ls-type:: annotation id:: 643044ff-fd8a-45fe-b564-f93683425ab3 hl-color:: yellow
  Page frame: physical memory viewed as an array of fixed-size slots, each able to hold one page
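- Not a full allocator, just a minimal sketch of the two bits of arithmetic that the buddy scheme above relies on: rounding a request up to the block size that the chain of splits produces, and finding a block's buddy by flipping a single address bit. `MIN_BLOCK` and the sample numbers are made up.
  ```c
  #include <stdio.h>
  #include <stddef.h>

  #define MIN_BLOCK 64   /* smallest block a split may produce              */

  /* Halving a power-of-two region until a further split would be too small
   * is equivalent to rounding the request up to the next power of two.     */
  static size_t block_size_for(size_t request) {
      size_t size = MIN_BLOCK;
      while (size < request)
          size *= 2;
      return size;
  }

  /* Every block of size S starts at a multiple of S, so the buddy of the
   * block at `offset` differs from it only in bit log2(S).                 */
  static size_t buddy_of(size_t offset, size_t size) {
      return offset ^ size;
  }

  int main(void) {
      size_t req  = 200;                       /* e.g. a 200-byte request   */
      size_t size = block_size_for(req);       /* -> 256                    */
      printf("request %zu -> block size %zu\n", req, size);

      size_t off = 768;                        /* suppose the block sits here */
      printf("buddy of offset %zu (size %zu) is offset %zu\n",
             off, size, buddy_of(off, size));  /* -> 512: coalesce if free  */
      return 0;
  }
  ```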