Revision | c02c112a2ca66da9bf0843d428e27eac5107b365 (tree) |
---|---|
Time | 2020-03-06 19:05:12 |
Author | Peter Maydell <peter.maydell@lina...> |
Committer | Peter Maydell |
docs/system: Convert security.texi to rST format
security.texi is included from qemu-doc.texi but is not used
in the qemu.1 manpage. So we can do a straightforward conversion
of the contents, which go into the system manual.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200228153619.9906-17-peter.maydell@linaro.org
Message-id: 20200226113034.6741-16-pbonzini@redhat.com
@@ -14,4 +14,5 @@ Contents: | ||
14 | 14 | .. toctree:: |
15 | 15 | :maxdepth: 2 |
16 | 16 | |
17 | + security | |
17 | 18 | vfio-ap |
@@ -0,0 +1,173 @@ | ||
1 | +Security | |
2 | +======== | |
3 | + | |
4 | +Overview | |
5 | +-------- | |
6 | + | |
7 | +This chapter explains the security requirements that QEMU is designed to meet | |
8 | +and principles for securely deploying QEMU. | |
9 | + | |
10 | +Security Requirements | |
11 | +--------------------- | |
12 | + | |
13 | +QEMU supports many different use cases, some of which have stricter security | |
14 | +requirements than others. The community has agreed on the overall security | |
15 | +requirements that users may depend on. These requirements define what is | |
16 | +considered supported from a security perspective. | |
17 | + | |
18 | +Virtualization Use Case | |
19 | +''''''''''''''''''''''' | |
20 | + | |
21 | +The virtualization use case covers cloud and virtual private server (VPS) | |
22 | +hosting, as well as traditional data center and desktop virtualization. These | |
23 | +use cases rely on hardware virtualization extensions to execute guest code | |
24 | +safely on the physical CPU at close-to-native speed. | |
25 | + | |
26 | +The following entities are untrusted, meaning that they may be buggy or | |
27 | +malicious: | |
28 | + | |
29 | +- Guest | |
30 | +- User-facing interfaces (e.g. VNC, SPICE, WebSocket) | |
31 | +- Network protocols (e.g. NBD, live migration) | |
32 | +- User-supplied files (e.g. disk images, kernels, device trees) | |
33 | +- Passthrough devices (e.g. PCI, USB) | |
34 | + | |
35 | +Bugs affecting these entities are evaluated on whether they can cause damage in | |
36 | +real-world use cases and treated as security bugs if this is the case. | |
37 | + | |
38 | +Non-virtualization Use Case | |
39 | +''''''''''''''''''''''''''' | |
40 | + | |
41 | +The non-virtualization use case covers emulation using the Tiny Code Generator | |
42 | +(TCG). In principle the TCG and device emulation code used in conjunction with | |
43 | +the non-virtualization use case should meet the same security requirements as | |
44 | +the virtualization use case. However, for historical reasons much of the | |
45 | +non-virtualization use case code was not written with these security | |
46 | +requirements in mind. | |
47 | + | |
48 | +Bugs affecting the non-virtualization use case are not considered security | |
49 | +bugs at this time. Users with non-virtualization use cases must not rely on | |
50 | +QEMU to provide guest isolation or any security guarantees. | |
51 | + | |
52 | +Architecture | |
53 | +------------ | |
54 | + | |
55 | +This section describes the design principles that ensure the security | |
56 | +requirements are met. | |
57 | + | |
58 | +Guest Isolation | |
59 | +''''''''''''''' | |
60 | + | |
61 | +Guest isolation is the confinement of guest code to the virtual machine. When | |
62 | +guest code gains control of execution on the host, this is called escaping the | |
63 | +virtual machine. Isolation also includes resource limits such as throttling of | |
64 | +CPU, memory, disk, or network. Guests must be unable to exceed their resource | |
65 | +limits. | |
66 | + | |
67 | +QEMU presents an attack surface to the guest in the form of emulated devices. | |
68 | +The guest must not be able to gain control of QEMU. Bugs in emulated devices | |
69 | +could allow malicious guests to gain code execution in QEMU. At this point the | |
70 | +guest has escaped the virtual machine and is able to act in the context of the | |
71 | +QEMU process on the host. | |
72 | + | |
73 | +Guests often interact with other guests and share resources with them. A | |
74 | +malicious guest must not gain control of other guests or access their data. | |
75 | +Disk image files and network traffic must be protected from other guests unless | |
76 | +explicitly shared between them by the user. | |
77 | + | |
78 | +Principle of Least Privilege | |
79 | +'''''''''''''''''''''''''''' | |
80 | + | |
81 | +The principle of least privilege states that each component only has access to | |
82 | +the privileges necessary for its function. In the case of QEMU this means that | |
83 | +each process only has access to resources belonging to the guest. | |
84 | + | |
85 | +The QEMU process should not have access to any resources that are inaccessible | |
86 | +to the guest. This way the guest does not gain anything by escaping into the | |
87 | +QEMU process since it already has access to those same resources from within | |
88 | +the guest. | |
89 | + | |
90 | +Following the principle of least privilege immediately fulfills guest isolation | |
91 | +requirements. For example, guest A only has access to its own disk image file | |
92 | +``a.img`` and not guest B's disk image file ``b.img``. | |
93 | + | |
94 | +In reality, certain resources are inaccessible to the guest but must be | |
95 | +available to QEMU to perform its function. For example, host system calls are | |
96 | +necessary for QEMU but are not exposed to guests. A guest that escapes into | |
97 | +the QEMU process can then begin invoking host system calls. | |
98 | + | |
99 | +New features must be designed to follow the principle of least privilege. | |
100 | +Should this not be possible for technical reasons, the security risk must be | |
101 | +clearly documented so users are aware of the trade-off of enabling the feature. | |
102 | + | |
103 | +Isolation mechanisms | |
104 | +'''''''''''''''''''' | |
105 | + | |
106 | +Several isolation mechanisms are available to realize this architecture of | |
107 | +guest isolation and the principle of least privilege. With the exception of | |
108 | +Linux seccomp, these mechanisms are all deployed by management tools that | |
109 | +launch QEMU, such as libvirt. They are also platform-specific so they are only | |
110 | +described briefly for Linux here. | |
111 | + | |
112 | +The fundamental isolation mechanism is that QEMU processes must run as | |
113 | +unprivileged users. Sometimes it seems more convenient to launch QEMU as | |
114 | +root to give it access to host devices (e.g. ``/dev/net/tun``) but this poses a | |
115 | +huge security risk. File descriptor passing can be used to give an otherwise | |
116 | +unprivileged QEMU process access to host devices without running QEMU as root. | |
117 | +It is also possible to launch QEMU as a non-root user and configure UNIX groups | |
118 | +for access to ``/dev/kvm``, ``/dev/net/tun``, and other device nodes. | |
119 | +Some Linux distros already ship with UNIX groups for these devices by default. | |
120 | + | |
121 | +- SELinux and AppArmor make it possible to confine processes beyond the | |
122 | + traditional UNIX process and file permissions model. They restrict the QEMU | |
123 | + process from accessing processes and files on the host system that are not | |
124 | + needed by QEMU. | |
125 | + | |
126 | +- Resource limits and cgroup controllers provide throughput and utilization | |
127 | + limits on key resources such as CPU time, memory, and I/O bandwidth. | |
128 | + | |
129 | +- Linux namespaces can be used to make process, file system, and other system | |
130 | + resources unavailable to QEMU. A namespaced QEMU process is restricted to only | |
131 | + those resources that were granted to it. | |
132 | + | |
133 | +- Linux seccomp is available via the QEMU ``--sandbox`` option. It disables | |
134 | + system calls that are not needed by QEMU, thereby reducing the host kernel | |
135 | + attack surface. | |
136 | + | |
137 | +Sensitive configurations | |
138 | +------------------------ | |
139 | + | |
140 | +There are aspects of QEMU that can have security implications which users and | |
141 | +management applications must be aware of. | |
142 | + | |
143 | +Monitor console (QMP and HMP) | |
144 | +''''''''''''''''''''''''''''' | |
145 | + | |
146 | +The monitor console (whether used with QMP or HMP) provides an interface | |
147 | +to dynamically control many aspects of QEMU's runtime operation. Many of the | |
148 | +commands exposed will instruct QEMU to access content on the host file system | |
149 | +and/or trigger spawning of external processes. | |
150 | + | |
151 | +For example, the ``migrate`` command allows for the spawning of arbitrary | |
152 | +processes for the purpose of tunnelling the migration data stream. The | |
153 | +``blockdev-add`` command instructs QEMU to open arbitrary files, exposing | |
154 | +their content to the guest as a virtual disk. | |
155 | + | |
156 | +Unless QEMU is otherwise confined using technologies such as SELinux, AppArmor, | |
157 | +or Linux namespaces, the monitor console should be considered to have privileges | |
158 | +equivalent to those of the user account QEMU is running under. | |
159 | + | |
160 | +It is further important to consider the security of the character device backend | |
161 | +over which the monitor console is exposed. It needs to have protection against | |
162 | +malicious third parties which might try to make unauthorized connections, or | |
163 | +perform man-in-the-middle attacks. Many of the character device backends do not | |
164 | +satisfy this requirement and so must not be used for the monitor console. | |
165 | + | |
166 | +The general recommendation is that the monitor console should be exposed over | |
167 | +a UNIX domain socket backend to the local host only. Use of the TCP based | |
168 | +character device backend is inappropriate unless configured to use both TLS | |
169 | +encryption and authorization control policy on client connections. | |
170 | + | |
171 | +In summary, the monitor console is considered a privileged control interface to | |
172 | +QEMU and as such should only be made accessible to a trusted management | |
173 | +application or user. |