Before Python 3.14, the C implementation of `asyncio` used a
[`WeakSet`](https://docs.python.org/3/library/weakref.html#weakref.WeakSet)
to store all the tasks created by the event loop. `WeakSet` was used
so that the event loop doesn't hold strong references to the tasks,
allowing them to be garbage collected when they are no longer needed.
The current task of the event loop was stored in a dict mapping the
event loop to the current task.

```c
/* Dictionary containing tasks that are currently active in
@@ -22,34 +27,87 @@ The current task of the event loop was stored in a dict mapping the event loop t
```

This implementation had a few drawbacks:

1. **Performance**: Using a `WeakSet` for storing tasks is
   inefficient, as it requires maintaining a full set of weak references
   to tasks along with a corresponding weakref callback to clean up each
   task when it is garbage collected. This increases the work done
   by the garbage collector, and in applications with a large number of
   tasks, this becomes a bottleneck, with increased memory usage and
   lower performance. Looking up the current task was also slow, as it
   required a dictionary lookup on the `current_tasks` dict.

2. **Thread safety**: Before Python 3.14, concurrent iteration over a
   `WeakSet` was not thread safe[^1]. This meant calling APIs like
   `asyncio.all_tasks()` could lead to inconsistent results or even
   `RuntimeError` if used in multiple threads[^2].

3. **Poor scaling in free-threading**: Using a global `WeakSet` for
   storing all tasks across all threads led to contention when adding
   and removing tasks from the set, which is a frequent operation. As
   such, it performed poorly in free-threading and did not scale well
   with the number of threads. Similarly, accessing the current task in
   multiple threads did not scale due to contention on the global
   `current_tasks` dictionary.

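The weak-reference bookkeeping described in the first drawback can be observed with a small Python sketch. `FakeTask` and `all_tasks_model` are hypothetical names used only for illustration; they are not part of `asyncio`, and the snippet relies on CPython's reference counting to reclaim the object promptly.

```python
import gc
import weakref


class FakeTask:
    """Hypothetical stand-in for a task object; this is not asyncio.Task."""


# Models the old global registry: it holds only weak references.
all_tasks_model = weakref.WeakSet()

task = FakeTask()
all_tasks_model.add(task)
size_while_alive = len(all_tasks_model)

# Dropping the last strong reference lets the object be reclaimed. The
# weakref callback that WeakSet installed for the entry then removes it,
# which is the per-task cleanup work the text refers to.
del task
gc.collect()
size_after_collection = len(all_tasks_model)
```

Every entry in the set costs one weak reference plus one callback invocation at collection time, which is the overhead that grows with the number of tasks.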
## Python 3.14 implementation

To address these issues, Python 3.14 implements several changes to
improve the performance and thread safety of task management.

- **Per-thread double linked list for tasks**: Python 3.14 introduces
  a per-thread circular doubly linked list implementation for
  storing tasks. This allows each thread to maintain its own list of
  tasks and allows for lock-free addition and removal of tasks. It
  is designed to be efficient and thread-safe, and it scales well with
  the number of threads in free-threading. This also allows external
  introspection tools such as `python -m asyncio pstree` to inspect
  tasks running in all threads and was implemented as part of [Audit
  asyncio thread safety](https://github.com/python/cpython/issues/128002).

- **Per-thread current task**: Python 3.14 stores the current task on
  the current thread state instead of a global dictionary. This
  allows for faster access to the current task without the need for
  a dictionary lookup. Each thread maintains its own current task,
  which is stored in the `PyThreadState` structure. This was
  implemented in https://github.com/python/cpython/issues/129898.

Storing the current task and the list of all tasks per-thread instead
of per-loop was chosen primarily to support external introspection
tools such as `python -m asyncio pstree`, as looking up arbitrary
attributes on the loop object is not possible externally. Storing data
per-thread also makes it easy to support third-party event loop
implementations such as `uvloop`, and is more efficient for the
single-threaded asyncio use-case as it avoids the overhead of
attribute lookups on the loop object and several other calls on the
performance-critical path of adding and removing tasks from the
per-loop task list.

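To make the per-thread circular doubly linked list more concrete, here is a minimal Python model of the idea. The real structure lives in C on the thread state; every name below (`Node`, `ModelTask`, `insert_tail`, `remove`, `iterate`, `tasks_head`) is an assumption made for this sketch and does not mirror CPython's source.

```python
import threading


class Node:
    """Links stored on the object itself, so no separate node is allocated."""

    def __init__(self):
        self.prev = self
        self.next = self


class ModelTask(Node):
    """Hypothetical task type that embeds its own list links."""

    def __init__(self, name):
        super().__init__()
        self.name = name


def insert_tail(head, node):
    # O(1): splice the node in just before the sentinel head.
    last = head.prev
    node.prev = last
    node.next = head
    last.next = node
    head.prev = node


def remove(node):
    # O(1): unlink using the node's own links; no search is needed.
    node.prev.next = node.next
    node.next.prev = node.prev
    node.prev = node.next = node


def iterate(head):
    cur = head.next
    while cur is not head:
        yield cur
        cur = cur.next


_local = threading.local()


def tasks_head():
    # One sentinel head per thread, created lazily.
    head = getattr(_local, "head", None)
    if head is None:
        head = _local.head = Node()
    return head


a, b, c = ModelTask("a"), ModelTask("b"), ModelTask("c")
for t in (a, b, c):
    insert_tail(tasks_head(), t)
remove(b)
names = [t.name for t in iterate(tasks_head())]

# A different thread sees its own, separate list.
seen_in_other_thread = []
worker = threading.Thread(
    target=lambda: seen_in_other_thread.extend(iterate(tasks_head()))
)
worker.start()
worker.join()
```

Because each thread only touches its own sentinel in the common case, insertion and removal need no shared lock in this model.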
## Per-thread double linked list for tasks

This implementation uses a circular doubly linked list to store tasks
on the thread states. This is used for all tasks which are instances
of `asyncio.Task` or subclasses of it; for third-party tasks, a
fallback `WeakSet` implementation is used. The linked list is
implemented using an embedded `llist_node` structure within each
`TaskObj`. By embedding the list node directly into the task object,
the implementation avoids additional memory allocations for linked
list nodes.

The `PyThreadState` structure gained a new field `asyncio_tasks_head`,
which serves as the head of the circular linked list of tasks. This
allows for lock-free addition and removal of tasks from the list.

It is possible that when a thread state is deallocated, there are
lingering tasks in its list; this can happen if another thread has
references to the tasks of this thread. Therefore, the
`PyInterpreterState` structure also gains a new `asyncio_tasks_head`
field to store any lingering tasks. When a thread state is
deallocated, any remaining lingering tasks are moved to the
interpreter state tasks list, and the thread state tasks list is
cleared. The `asyncio_tasks_lock` is used to protect the interpreter's
tasks list from concurrent modifications.

When a task is created, it is added to the current thread's list of
tasks by the `register_task` function. When the task is done, it is
removed from the list by the `unregister_task` function. In
free-threading, the thread id of the thread which created the task is
stored in the `task_tid` field of the `TaskObj`. This is used to check
whether the task is being removed from the correct thread's task list.
If the current thread is the same as the thread which created the
task, no locking is required. Otherwise, in free-threading, the
`stop-the-world` pause is used to pause all other threads and then
safely remove the task from the tasks list.

```mermaid
@@ -98,12 +165,21 @@ flowchart TD
one --> two
```
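The owner-thread check described above can be modelled roughly in Python. This is a sketch under loud assumptions: `model_register_task`, `model_unregister_task`, `owner_tid` and `owner_list` are hypothetical names, and a plain `threading.Lock` only marks where the slow path would be. It does not reproduce a real stop-the-world pause, which would also halt the owning thread.

```python
import threading

_local = threading.local()
# Placeholder for the stop-the-world pause; an ordinary lock is NOT equivalent.
_slow_path_guard = threading.Lock()


class ModelTask:
    """Hypothetical task type for this sketch; not asyncio.Task."""

    def __init__(self):
        self.owner_tid = None   # plays the role of task_tid
        self.owner_list = None  # the per-thread list the task was added to


def _thread_tasks():
    tasks = getattr(_local, "tasks", None)
    if tasks is None:
        tasks = _local.tasks = []
    return tasks


def model_register_task(task):
    # Record which thread owns the task and add it to that thread's list.
    task.owner_tid = threading.get_ident()
    task.owner_list = _thread_tasks()
    task.owner_list.append(task)


def model_unregister_task(task):
    if task.owner_tid == threading.get_ident():
        # Same thread that registered the task: no synchronization.
        task.owner_list.remove(task)
        return "fast"
    # Different thread: take the guarded path before touching the list.
    with _slow_path_guard:
        task.owner_list.remove(task)
    return "slow"


t1, t2 = ModelTask(), ModelTask()
model_register_task(t1)
model_register_task(t2)

paths = [model_unregister_task(t1)]  # removed by the owning thread
worker = threading.Thread(
    target=lambda: paths.append(model_unregister_task(t2))
)
worker.start()
worker.join()
remaining = len(_thread_tasks())
```

The point of the design is that the comparison of thread ids is cheap, so the common same-thread removal pays nothing for the rare cross-thread case.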

`asyncio.all_tasks` now iterates over the per-thread task lists of all
threads and the interpreter's task list to get all the tasks. In
free-threading, this is done by pausing all the threads using the
`stop-the-world` pause to ensure that no tasks are being added or
removed while iterating over the lists. This gives a consistent
view of all task lists across all threads and is thread safe.

This design allows for lock-free execution and scales well in
free-threading with multiple event loops running in different threads.

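From the Python side, the observable effect is that `asyncio.all_tasks(loop)` can be called from a thread other than the one running the loop. The sketch below uses only public APIs; the helper `run_loop` and the task names `worker-N` are made up for the example, and the loop is kept idle while the snapshot is taken so the example does not depend on the version-specific thread-safety guarantees discussed above.

```python
import asyncio
import threading


def run_loop(ready, stop, holder):
    async def main():
        loop = asyncio.get_running_loop()
        holder["loop"] = loop
        workers = [
            asyncio.create_task(asyncio.sleep(3600), name=f"worker-{i}")
            for i in range(3)
        ]
        ready.set()
        # Stay alive until the main thread has taken its snapshot.
        await loop.run_in_executor(None, stop.wait)
        for w in workers:
            w.cancel()
        await asyncio.gather(*workers, return_exceptions=True)

    asyncio.run(main())


ready, stop, holder = threading.Event(), threading.Event(), {}
thread = threading.Thread(target=run_loop, args=(ready, stop, holder))
thread.start()
ready.wait()

# Snapshot taken from the main thread while the loop runs elsewhere.
snapshot = asyncio.all_tasks(holder["loop"])
worker_names = sorted(
    t.get_name() for t in snapshot if t.get_name().startswith("worker-")
)

stop.set()
thread.join()
```

The snapshot also contains the task running `main()`, which is why the example filters on the `worker-` prefix.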
## Per-thread current task

This implementation stores the current task in the `PyThreadState`
structure, which allows for faster access to the current task without
the need for a dictionary lookup.

When a task is entered or left, the current task is updated in the
thread state using the `enter_task` and `leave_task` functions. When
`current_task(loop)` is called where `loop` is the running event loop
of the current thread, no locking is required, as the current task is
stored in the thread state and is returned directly (the general
case). Otherwise, if `loop` is not the running event loop of the
current thread, the `stop-the-world` pause is used to pause all
threads in free-threading; the thread states are then iterated over,
and the current task is taken from the one whose
`tstate->asyncio_current_loop` matches `loop`. If no matching thread
state is found, `None` is returned.

In free-threading, this avoids contention on a global dictionary, as
threads can access the current task of their running loop without any
locking.
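The two lookup cases can be seen through the public `asyncio.current_task` API. This is a small illustration rather than a view into the C implementation; `other_loop` is a loop created only for the example and is never run.

```python
import asyncio


async def main():
    # General case: the loop in question is the one running in this thread.
    task = asyncio.current_task()
    loop = asyncio.get_running_loop()
    same_task = asyncio.current_task(loop) is task

    # Other case: a loop that is not running in any thread has no current task.
    other_loop = asyncio.new_event_loop()
    try:
        idle_result = asyncio.current_task(other_loop)
    finally:
        other_loop.close()

    return task is not None, same_task, idle_result


result = asyncio.run(main())
```

Both calls behave the same way before and after Python 3.14; what changed is how the answer is found internally.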