Commit 61630d0

Merge pull request #631 from typelevel/fibers-fast-fix-comments

Convert HTML comments to alt text

2 parents: 69d2bb1 + a435e2f

File tree: 1 file changed, +6 -12 lines


src/blog/fibers-fast-mkay.md

Lines changed: 6 additions & 12 deletions
@@ -50,8 +50,7 @@ The most direct and naive way to approach this is to allocate one thread per con

### Unbounded Threads

-<!-- loads of threads diagram -->
-![](/img/media/fibers/many-threads.png)
+![loads of threads diagram](/img/media/fibers/many-threads.png)

Implementation-wise, this is very easy to reason about. Your code will all take on a highly imperative structure, with *A* followed by *B* followed by *C*, etc, and it will behave entirely reasonably at small scales! Unfortunately, the problem here is that threads are not particularly cheap. The reasons for this are relatively complex, but they manifest in two places: the OS kernel scheduler, and the JVM itself.
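As an illustration of the thread-per-request model described in the context above (not part of this commit), a minimal Scala sketch might look like the following; `handleRequest` and the counter are hypothetical stand-ins for real request processing:

```scala
import java.util.concurrent.atomic.AtomicInteger

object ThreadPerRequest {
  val completed = new AtomicInteger(0)

  // hypothetical stand-in for real request handling
  def handleRequest(id: Int): Unit = { completed.incrementAndGet(); () }

  // one OS thread per concurrent request: simple, but threads are not cheap
  def serve(requestIds: List[Int]): Unit = {
    val threads = requestIds.map { id =>
      val t = new Thread(() => handleRequest(id))
      t.start()
      t
    }
    threads.foreach(_.join()) // wait for every request to complete
  }
}
```

At small scales this works fine; the point of the surrounding text is that each `new Thread` costs real kernel and JVM resources, so it falls over as request counts grow.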

@@ -63,8 +62,7 @@ This is a huge problem, and we run face-first into it in architectures like the

### Bounded Threads

-<!-- thread pool diagram -->
-![](/img/media/fibers/few-threads.png)
+![thread pool diagram](/img/media/fibers/few-threads.png)

In this kind of architecture, incoming requests are handed off to a scheduler (usually a shared lock-free work queue) which then hands them off to a fixed set of worker threads for processing. This is the kind of thing you'll see in almost every JVM application written in the past decade or so.
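The bounded variant from the context above could be sketched as follows (again a hypothetical illustration, not from the commit); `Executors.newFixedThreadPool` supplies both the shared work queue and the fixed worker set:

```scala
import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object BoundedPool {
  val completed = new AtomicInteger(0)

  // hypothetical stand-in for real request handling
  def handleRequest(id: Int): Unit = { completed.incrementAndGet(); () }

  def serve(requestIds: List[Int]): Unit = {
    // fixed set of workers fed from one shared queue
    val pool = Executors.newFixedThreadPool(Runtime.getRuntime.availableProcessors())
    requestIds.foreach(id => pool.execute(() => handleRequest(id)))
    pool.shutdown()                          // accept no new tasks; queued work still runs
    pool.awaitTermination(1, TimeUnit.MINUTES)
    ()
  }
}
```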

@@ -74,8 +72,7 @@ This is extremely wasteful, because we have a scarce resource (threads) which *c

### Improved Thread Utilization

-<!-- async pool diagram -->
-![](/img/media/fibers/async.png)
+![async pool diagram](/img/media/fibers/async.png)

This is much more efficient! It's also incredibly confusing, and it gets exponentially worse the more complexity you have in your control flow. In practice most systems like this one have *multiple* downstreams that they need to talk to, often in parallel, which makes this whole thing get crazy in a hurry. It also doesn't get any easier when you add in the fact that just talking to a downstream (like a database) often involves some form of resource management which has to be correctly threaded across these asynchronous boundaries and carried between threads, not to mention problems like timeouts and fallback races and such. It's a mess.
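The callback shape that makes this style confusing can be sketched like so (hypothetical `queryDb` and `callDownstream` stand-ins; real ones would complete their callbacks later, on other threads, and errors/timeouts would make this far worse):

```scala
object CallbackStyle {
  type Callback[A] = Either[Throwable, A] => Unit

  // hypothetical async downstreams; here they complete synchronously for simplicity
  def queryDb(id: Int)(cb: Callback[String]): Unit = cb(Right(s"row-$id"))
  def callDownstream(row: String)(cb: Callback[String]): Unit = cb(Right(s"resp($row)"))

  // one nesting level per asynchronous step; all error plumbing is manual
  def handleRequest(id: Int)(cb: Callback[String]): Unit =
    queryDb(id) {
      case Left(err) => cb(Left(err))
      case Right(row) =>
        callDownstream(row) {
          case Left(err)  => cb(Left(err))
          case Right(res) => cb(Right(res))
        }
    }
}
```

Every extra downstream, parallel branch, or resource boundary adds more of this nesting, which is exactly the mess the paragraph above is describing.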

@@ -85,8 +82,7 @@ All in all, this is very bad, and it starts to hint at *why* it is that Cats Eff

### Many Fibers, Fewer Threads, One Scheduler

-<!-- fiber diagram -->
-![](/img/media/fibers/fibers.png)
+![fiber diagram](/img/media/fibers/fibers.png)

This diagram looks a lot like the first one! In here, we're just allocating a new fiber for each request that comes in, much like how we *tried* to allocate a new thread per request. Each fiber is a self-contained, sequential unit which *semantically* runs from start to finish and we don't really need to think about what's going on under the surface. Once the response has been produced to the client, the fiber goes away and we never have to think about it again.
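In Cats Effect terms, fiber-per-request can be sketched as below (this assumes a cats-effect 3 dependency; `handleRequest` is a hypothetical stand-in for real request processing):

```scala
import cats.effect.{IO, IOApp}
import cats.syntax.all._

object FiberPerRequest extends IOApp.Simple {
  // each request becomes a cheap fiber, not an OS thread
  def handleRequest(id: Int): IO[Unit] =
    IO(s"request $id").void

  // parTraverse_ runs one fiber per element and waits for all of them
  val run: IO[Unit] =
    (1 to 10000).toList.parTraverse_(handleRequest)
}
```

Ten thousand threads would be a problem; ten thousand fibers are routine, because the runtime multiplexes them onto a small, fixed set of worker threads.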

@@ -104,8 +100,7 @@ That is, until Cats Effect 3.

### Many Fibers, Fewer Threads, Many Schedulers

-<!-- work-stealing diagram -->
-![](/img/media/fibers/work-stealing.png)
+![work-stealing diagram](/img/media/fibers/work-stealing.png)

Cats Effect 3 has a *much* smarter and more efficient scheduler than any other asynchronous framework on the JVM. It was heavily inspired by the [Tokio](https://tokio.rs) Rust framework, which is fairly close to Cats Effect's problem space. As you might infer from the diagram, the scheduler is no longer a central clearing house for work, and instead is dispersed among the worker threads. This *immediately* results in some massive efficiency wins, but the real magic is still to come.

@@ -115,8 +110,7 @@ In a conventional implementation of the disruptor pattern (which is what a fixed

Work-stealing, for contrast, allows the individual worker threads to manage their own *local* queue, and when that queue runs out, they simply take work from each other on a one-to-one basis. Thus, the only contention that exists is between the stealer and the "stealee", entirely avoiding the quadratic growth problem. In fact, contention becomes *less* frequent as the number of workers and the load on the pool increases. You can conceptualize this with the following extremely silly plot (higher numbers are *bad* because they represent overhead):

-<!-- plot of work stealing overhead vs standard disruptor pattern -->
-![](/img/media/fibers/overhead.png)
+![plot of work stealing overhead vs standard disruptor pattern](/img/media/fibers/overhead.png)

Work-stealing is simply very, very, very good. But we can do even better.
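A deliberately simplistic toy model of the stealing idea (emphatically not Cats Effect's actual scheduler, which uses far more refined queues): each worker owns a local deque, pops its own newest work, and steals the oldest task from a random peer only when it runs dry:

```scala
import java.util.concurrent.ConcurrentLinkedDeque
import java.util.concurrent.atomic.AtomicInteger
import scala.util.Random

object ToyWorkStealing {
  def run(nWorkers: Int, nTasks: Int): Int = {
    val queues = Array.fill(nWorkers)(new ConcurrentLinkedDeque[Runnable]())
    val done   = new AtomicInteger(0)

    // seed every task onto one queue, so the other workers must steal
    for (_ <- 1 to nTasks)
      queues(0).addLast(() => { done.incrementAndGet(); () })

    val workers = (0 until nWorkers).map { me =>
      new Thread(() => {
        var misses = 0
        while (misses < 10000) { // give up after enough consecutive empty polls
          // owner pops its newest task; a steal takes the oldest from a random peer
          val task = Option(queues(me).pollLast())
            .orElse(Option(queues(Random.nextInt(nWorkers)).pollFirst()))
          task match {
            case Some(t) => t.run(); misses = 0
            case None    => misses += 1
          }
        }
      })
    }
    workers.foreach(_.start())
    workers.foreach(_.join())
    done.get() // every seeded task ran exactly once
  }
}
```

Note that any given steal involves exactly one stealer and one victim queue, no matter how many workers exist, which is why the contention curve in the plot above flattens rather than growing quadratically.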

0 commit comments
