path: root/Documentation
diff options
authorJiri Kosina <>2010-04-23 02:08:44 +0200
committerJiri Kosina <>2010-04-23 02:08:44 +0200
commit6c9468e9eb1252eaefd94ce7f06e1be9b0b641b1 (patch)
tree797676a336b050bfa1ef879377c07e541b9075d6 /Documentation
parent4cb3ca7cd7e2cae8d1daf5345ec99a1e8502cf3f (diff)
parentc81eddb0e3728661d1585fbc564449c94165cc36 (diff)
Merge branch 'master' into for-next
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/DMA-API-HOWTO.txt (renamed from Documentation/PCI/PCI-DMA-mapping.txt)0
-rw-r--r--Documentation/fb/efifb.txt (renamed from Documentation/fb/imacfb.txt)14
32 files changed, 846 insertions, 122 deletions
diff --git a/Documentation/ABI/testing/sysfs-bus-usb b/Documentation/ABI/testing/sysfs-bus-usb
index a986e9bbba3d..bcebb9eaedce 100644
--- a/Documentation/ABI/testing/sysfs-bus-usb
+++ b/Documentation/ABI/testing/sysfs-bus-usb
@@ -160,7 +160,7 @@ Description:
match the driver to the device. For example:
# echo "046d c315" > /sys/bus/usb/drivers/foo/remove_id
-What: /sys/bus/usb/device/.../avoid_reset
+What: /sys/bus/usb/device/.../avoid_reset_quirk
Date: December 2009
Contact: Oliver Neukum <>
diff --git a/Documentation/PCI/PCI-DMA-mapping.txt b/Documentation/DMA-API-HOWTO.txt
index 52618ab069ad..52618ab069ad 100644
--- a/Documentation/PCI/PCI-DMA-mapping.txt
+++ b/Documentation/DMA-API-HOWTO.txt
diff --git a/Documentation/DocBook/tracepoint.tmpl b/Documentation/DocBook/tracepoint.tmpl
index 8bca1d5cec09..e8473eae2a20 100644
--- a/Documentation/DocBook/tracepoint.tmpl
+++ b/Documentation/DocBook/tracepoint.tmpl
@@ -16,6 +16,15 @@
+ <author>
+ <firstname>William</firstname>
+ <surname>Cohen</surname>
+ <affiliation>
+ <address>
+ <email></email>
+ </address>
+ </affiliation>
+ </author>
@@ -91,4 +100,8 @@
+ <chapter id="block">
+ <title>Block IO</title>
+ </chapter>
diff --git a/Documentation/RCU/NMI-RCU.txt b/Documentation/RCU/NMI-RCU.txt
index a6d32e65d222..a8536cb88091 100644
--- a/Documentation/RCU/NMI-RCU.txt
+++ b/Documentation/RCU/NMI-RCU.txt
@@ -34,7 +34,7 @@ NMI handler.
cpu = smp_processor_id();
- if (!rcu_dereference(nmi_callback)(regs, cpu))
+ if (!rcu_dereference_sched(nmi_callback)(regs, cpu))
@@ -47,12 +47,13 @@ function pointer. If this handler returns zero, do_nmi() invokes the
default_do_nmi() function to handle a machine-specific NMI. Finally,
preemption is restored.
-Strictly speaking, rcu_dereference() is not needed, since this code runs
-only on i386, which does not need rcu_dereference() anyway. However,
-it is a good documentation aid, particularly for anyone attempting to
-do something similar on Alpha.
+In theory, rcu_dereference_sched() is not needed, since this code runs
+only on i386, which in theory does not need rcu_dereference_sched()
+anyway. However, in practice it is a good documentation aid, particularly
+for anyone attempting to do something similar on Alpha or on systems
+with aggressive optimizing compilers.
-Quick Quiz: Why might the rcu_dereference() be necessary on Alpha,
+Quick Quiz: Why might the rcu_dereference_sched() be necessary on Alpha,
given that the code referenced by the pointer is read-only?
@@ -99,17 +100,21 @@ invoke irq_enter() and irq_exit() on NMI entry and exit, respectively.
Answer to Quick Quiz
- Why might the rcu_dereference() be necessary on Alpha, given
+ Why might the rcu_dereference_sched() be necessary on Alpha, given
that the code referenced by the pointer is read-only?
Answer: The caller to set_nmi_callback() might well have
- initialized some data that is to be used by the
- new NMI handler. In this case, the rcu_dereference()
- would be needed, because otherwise a CPU that received
- an NMI just after the new handler was set might see
- the pointer to the new NMI handler, but the old
- pre-initialized version of the handler's data.
- More important, the rcu_dereference() makes it clear
- to someone reading the code that the pointer is being
- protected by RCU.
+ initialized some data that is to be used by the new NMI
+ handler. In this case, the rcu_dereference_sched() would
+ be needed, because otherwise a CPU that received an NMI
+ just after the new handler was set might see the pointer
+ to the new NMI handler, but the old pre-initialized
+ version of the handler's data.
+ This same sad story can happen on other CPUs when using
+ a compiler with aggressive pointer-value speculation
+ optimizations.
+ More important, the rcu_dereference_sched() makes it
+ clear to someone reading the code that the pointer is
+ being protected by RCU-sched.
diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt
index cbc180f90194..790d1a812376 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -260,7 +260,8 @@ over a rather long period of time, but improvements are always welcome!
The reason that it is permissible to use RCU list-traversal
primitives when the update-side lock is held is that doing so
can be quite helpful in reducing code bloat when common code is
- shared between readers and updaters.
+ shared between readers and updaters. Additional primitives
+ are provided for this case, as discussed in lockdep.txt.
10. Conversely, if you are in an RCU read-side critical section,
and you don't hold the appropriate update-side lock, you -must-
@@ -344,8 +345,8 @@ over a rather long period of time, but improvements are always welcome!
requiring SRCU's read-side deadlock immunity or low read-side
realtime latency.
- Note that, rcu_assign_pointer() and rcu_dereference() relate to
- SRCU just as they do to other forms of RCU.
+ Note that, rcu_assign_pointer() relates to SRCU just as they do
+ to other forms of RCU.
15. The whole point of call_rcu(), synchronize_rcu(), and friends
is to wait until all pre-existing readers have finished before
diff --git a/Documentation/RCU/lockdep.txt b/Documentation/RCU/lockdep.txt
index fe24b58627bd..d7a49b2f6994 100644
--- a/Documentation/RCU/lockdep.txt
+++ b/Documentation/RCU/lockdep.txt
@@ -32,9 +32,20 @@ checking of rcu_dereference() primitives:
srcu_dereference(p, sp):
Check for SRCU read-side critical section.
rcu_dereference_check(p, c):
- Use explicit check expression "c".
+ Use explicit check expression "c". This is useful in
+ code that is invoked by both readers and updaters.
Don't check. (Use sparingly, if at all.)
+ rcu_dereference_protected(p, c):
+ Use explicit check expression "c", and omit all barriers
+ and compiler constraints. This is useful when the data
+ structure cannot change, for example, in code that is
+ invoked only by updaters.
+ rcu_access_pointer(p):
+ Return the value of the pointer and omit all barriers,
+ but retain the compiler constraints that prevent duplicating
+ or coalescsing. This is useful when when testing the
+ value of the pointer itself, for example, against NULL.
The rcu_dereference_check() check expression can be any boolean
expression, but would normally include one of the rcu_read_lock_held()
@@ -59,7 +70,20 @@ In case (1), the pointer is picked up in an RCU-safe manner for vanilla
RCU read-side critical sections, in case (2) the ->file_lock prevents
any change from taking place, and finally, in case (3) the current task
is the only task accessing the file_struct, again preventing any change
-from taking place.
+from taking place. If the above statement was invoked only from updater
+code, it could instead be written as follows:
+ file = rcu_dereference_protected(fdt->fd[fd],
+ lockdep_is_held(&files->file_lock) ||
+ atomic_read(&files->count) == 1);
+This would verify cases #2 and #3 above, and furthermore lockdep would
+complain if this was used in an RCU read-side critical section unless one
+of these two cases held. Because rcu_dereference_protected() omits all
+barriers and compiler constraints, it generates better code than do the
+other flavors of rcu_dereference(). On the other hand, it is illegal
+to use rcu_dereference_protected() if either the RCU-protected pointer
+or the RCU-protected data that it points to can change concurrently.
There are currently only "universal" versions of the rcu_assign_pointer()
and RCU list-/tree-traversal primitives, which do not (yet) check for
diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 1dc00ee97163..cfaac34c4557 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -840,6 +840,12 @@ SRCU: Initialization/cleanup
+All: lockdep-checked RCU-protected pointer access
+ rcu_dereference_check
+ rcu_dereference_protected
+ rcu_access_pointer
See the comment headers in the source code (or the docbook generated
from them) for more information.
diff --git a/Documentation/block/biodoc.txt b/Documentation/block/biodoc.txt
index 6fab97ea7e6b..508b5b2b0289 100644
--- a/Documentation/block/biodoc.txt
+++ b/Documentation/block/biodoc.txt
@@ -1162,8 +1162,8 @@ where a driver received a request ala this before:
As mentioned, there is no virtual mapping of a bio. For DMA, this is
not a problem as the driver probably never will need a virtual mapping.
-Instead it needs a bus mapping (pci_map_page for a single segment or
-use blk_rq_map_sg for scatter gather) to be able to ship it to the driver. For
+Instead it needs a bus mapping (dma_map_page for a single segment or
+use dma_map_sg for scatter gather) to be able to ship it to the driver. For
PIO drivers (or drivers that need to revert to PIO transfer once in a
while (IDE for example)), where the CPU is doing the actual data
transfer a virtual mapping is needed. If the driver supports highmem I/O,
diff --git a/Documentation/circular-buffers.txt b/Documentation/circular-buffers.txt
new file mode 100644
index 000000000000..8117e5bf6065
--- /dev/null
+++ b/Documentation/circular-buffers.txt
@@ -0,0 +1,234 @@
+ ================
+ ================
+By: David Howells <>
+ Paul E. McKenney <>
+Linux provides a number of features that can be used to implement circular
+buffering. There are two sets of such features:
+ (1) Convenience functions for determining information about power-of-2 sized
+ buffers.
+ (2) Memory barriers for when the producer and the consumer of objects in the
+ buffer don't want to share a lock.
+To use these facilities, as discussed below, there needs to be just one
+producer and just one consumer. It is possible to handle multiple producers by
+serialising them, and to handle multiple consumers by serialising them.
+ (*) What is a circular buffer?
+ (*) Measuring power-of-2 buffers.
+ (*) Using memory barriers with circular buffers.
+ - The producer.
+ - The consumer.
+First of all, what is a circular buffer? A circular buffer is a buffer of
+fixed, finite size into which there are two indices:
+ (1) A 'head' index - the point at which the producer inserts items into the
+ buffer.
+ (2) A 'tail' index - the point at which the consumer finds the next item in
+ the buffer.
+Typically when the tail pointer is equal to the head pointer, the buffer is
+empty; and the buffer is full when the head pointer is one less than the tail
+The head index is incremented when items are added, and the tail index when
+items are removed. The tail index should never jump the head index, and both
+indices should be wrapped to 0 when they reach the end of the buffer, thus
+allowing an infinite amount of data to flow through the buffer.
+Typically, items will all be of the same unit size, but this isn't strictly
+required to use the techniques below. The indices can be increased by more
+than 1 if multiple items or variable-sized items are to be included in the
+buffer, provided that neither index overtakes the other. The implementer must
+be careful, however, as a region more than one unit in size may wrap the end of
+the buffer and be broken into two segments.
+Calculation of the occupancy or the remaining capacity of an arbitrarily sized
+circular buffer would normally be a slow operation, requiring the use of a
+modulus (divide) instruction. However, if the buffer is of a power-of-2 size,
+then a much quicker bitwise-AND instruction can be used instead.
+Linux provides a set of macros for handling power-of-2 circular buffers. These
+can be made use of by:
+ #include <linux/circ_buf.h>
+The macros are:
+ (*) Measure the remaining capacity of a buffer:
+ CIRC_SPACE(head_index, tail_index, buffer_size);
+ This returns the amount of space left in the buffer[1] into which items
+ can be inserted.
+ (*) Measure the maximum consecutive immediate space in a buffer:
+ CIRC_SPACE_TO_END(head_index, tail_index, buffer_size);
+ This returns the amount of consecutive space left in the buffer[1] into
+ which items can be immediately inserted without having to wrap back to the
+ beginning of the buffer.
+ (*) Measure the occupancy of a buffer:
+ CIRC_CNT(head_index, tail_index, buffer_size);
+ This returns the number of items currently occupying a buffer[2].
+ (*) Measure the non-wrapping occupancy of a buffer:
+ CIRC_CNT_TO_END(head_index, tail_index, buffer_size);
+ This returns the number of consecutive items[2] that can be extracted from
+ the buffer without having to wrap back to the beginning of the buffer.
+Each of these macros will nominally return a value between 0 and buffer_size-1,
+ [1] CIRC_SPACE*() are intended to be used in the producer. To the producer
+ they will return a lower bound as the producer controls the head index,
+ but the consumer may still be depleting the buffer on another CPU and
+ moving the tail index.
+ To the consumer it will show an upper bound as the producer may be busy
+ depleting the space.
+ [2] CIRC_CNT*() are intended to be used in the consumer. To the consumer they
+ will return a lower bound as the consumer controls the tail index, but the
+ producer may still be filling the buffer on another CPU and moving the
+ head index.
+ To the producer it will show an upper bound as the consumer may be busy
+ emptying the buffer.
+ [3] To a third party, the order in which the writes to the indices by the
+ producer and consumer become visible cannot be guaranteed as they are
+ independent and may be made on different CPUs - so the result in such a
+ situation will merely be a guess, and may even be negative.
+By using memory barriers in conjunction with circular buffers, you can avoid
+the need to:
+ (1) use a single lock to govern access to both ends of the buffer, thus
+ allowing the buffer to be filled and emptied at the same time; and
+ (2) use atomic counter operations.
+There are two sides to this: the producer that fills the buffer, and the
+consumer that empties it. Only one thing should be filling a buffer at any one
+time, and only one thing should be emptying a buffer at any one time, but the
+two sides can operate simultaneously.
+The producer will look something like this:
+ spin_lock(&producer_lock);
+ unsigned long head = buffer->head;
+ unsigned long tail = ACCESS_ONCE(buffer->tail);
+ if (CIRC_SPACE(head, tail, buffer->size) >= 1) {
+ /* insert one item into the buffer */
+ struct item *item = buffer[head];
+ produce_item(item);
+ smp_wmb(); /* commit the item before incrementing the head */
+ buffer->head = (head + 1) & (buffer->size - 1);
+ /* wake_up() will make sure that the head is committed before
+ * waking anyone up */
+ wake_up(consumer);
+ }
+ spin_unlock(&producer_lock);
+This will instruct the CPU that the contents of the new item must be written
+before the head index makes it available to the consumer and then instructs the
+CPU that the revised head index must be written before the consumer is woken.
+Note that wake_up() doesn't have to be the exact mechanism used, but whatever
+is used must guarantee a (write) memory barrier between the update of the head
+index and the change of state of the consumer, if a change of state occurs.
+The consumer will look something like this:
+ spin_lock(&consumer_lock);
+ unsigned long head = ACCESS_ONCE(buffer->head);
+ unsigned long tail = buffer->tail;
+ if (CIRC_CNT(head, tail, buffer->size) >= 1) {
+ /* read index before reading contents at that index */
+ smp_read_barrier_depends();
+ /* extract one item from the buffer */
+ struct item *item = buffer[tail];
+ consume_item(item);
+ smp_mb(); /* finish reading descriptor before incrementing tail */
+ buffer->tail = (tail + 1) & (buffer->size - 1);
+ }
+ spin_unlock(&consumer_lock);
+This will instruct the CPU to make sure the index is up to date before reading
+the new item, and then it shall make sure the CPU has finished reading the item
+before it writes the new tail pointer, which will erase the item.
+Note the use of ACCESS_ONCE() in both algorithms to read the opposition index.
+This prevents the compiler from discarding and reloading its cached value -
+which some compilers will do across smp_read_barrier_depends(). This isn't
+strictly needed if you can be sure that the opposition index will _only_ be
+used the once.
+See also Documentation/memory-barriers.txt for a description of Linux's memory
+barrier facilities.
diff --git a/Documentation/connector/cn_test.c b/Documentation/connector/cn_test.c
index b07add3467f1..7764594778d4 100644
--- a/Documentation/connector/cn_test.c
+++ b/Documentation/connector/cn_test.c
@@ -25,6 +25,7 @@
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/skbuff.h>
+#include <linux/slab.h>
#include <linux/timer.h>
#include <linux/connector.h>
diff --git a/Documentation/fb/imacfb.txt b/Documentation/fb/efifb.txt
index 316ec9bb7deb..a59916c29b33 100644
--- a/Documentation/fb/imacfb.txt
+++ b/Documentation/fb/efifb.txt
@@ -1,9 +1,9 @@
-What is imacfb?
+What is efifb?
This is a generic EFI platform driver for Intel based Apple computers.
-Imacfb is only for EFI booted Intel Macs.
+efifb is only for EFI booted Intel Macs.
Supported Hardware
@@ -16,16 +16,16 @@ MacMini
How to use it?
-Imacfb does not have any kind of autodetection of your machine.
+efifb does not have any kind of autodetection of your machine.
You have to add the following kernel parameters in your elilo.conf:
Macbook :
- video=imacfb:macbook
+ video=efifb:macbook
MacMini :
- video=imacfb:mini
+ video=efifb:mini
Macbook Pro 15", iMac 17" :
- video=imacfb:i17
+ video=efifb:i17
Macbook Pro 17", iMac 20" :
- video=imacfb:i20
+ video=efifb:i20
Edgar Hucek <>
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index a5cc0db63d7a..ed511af0f79a 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -582,3 +582,10 @@ Why: The paravirt mmu host support is slower than non-paravirt mmu, both
Who: Avi Kivity <>
+What: "acpi=ht" boot option
+When: 2.6.35
+Why: Useful in 2003, implementation is a hack.
+ Generally invoked by accident today.
+ Seen as doing more harm than good.
+Who: Len Brown <>
diff --git a/Documentation/filesystems/00-INDEX b/Documentation/filesystems/00-INDEX
index 3bae418c6ad3..4303614b5add 100644
--- a/Documentation/filesystems/00-INDEX
+++ b/Documentation/filesystems/00-INDEX
@@ -16,6 +16,8 @@ befs.txt
- information about the BeOS filesystem for Linux.
- info for the SCO UnixWare Boot Filesystem (BFS).
+ - info for the Ceph Distributed File System
- description of the CIFS filesystem.
diff --git a/Documentation/filesystems/9p.txt b/Documentation/filesystems/9p.txt
index 57e0b80a5274..c0236e753bc8 100644
--- a/Documentation/filesystems/9p.txt
+++ b/Documentation/filesystems/9p.txt
@@ -37,6 +37,15 @@ For Plan 9 From User Space applications (
mount -t 9p `namespace`/acme /mnt/9 -o trans=unix,uname=$USER
+For server running on QEMU host with virtio transport:
+ mount -t 9p -o trans=virtio <mount_tag> /mnt/9
+where mount_tag is the tag associated by the server to each of the exported
+mount points. Each 9P export is seen by the client as a virtio device with an
+associated "mount_tag" property. Available mount tags can be
+seen by reading /sys/bus/virtio/drivers/9pnet_virtio/virtio<n>/mount_tag files.
@@ -47,7 +56,7 @@ OPTIONS
fd - used passed file descriptors for connection
(see rfdno and wfdno)
virtio - connect to the next virtio channel available
- (from lguest or KVM with trans_virtio module)
+ (from QEMU with trans_virtio module)
rdma - connect to a specified RDMA channel
uname=name user name to attempt mount as on the remote server. The
@@ -85,7 +94,12 @@ OPTIONS
port=n port to connect to on the remote server
- noextend force legacy mode (no 9p2000.u semantics)
+ noextend force legacy mode (no 9p2000.u or 9p2000.L semantics)
+ version=name Select 9P protocol version. Valid options are:
+ 9p2000 - Legacy mode (same as noextend)
+ 9p2000.u - Use 9P2000.u protocol
+ 9p2000.L - Use 9P2000.L protocol
dfltuid attempt to mount as a particular uid
diff --git a/Documentation/filesystems/ceph.txt b/Documentation/filesystems/ceph.txt
new file mode 100644
index 000000000000..0660c9f5deef
--- /dev/null
+++ b/Documentation/filesystems/ceph.txt
@@ -0,0 +1,140 @@
+Ceph Distributed File System
+Ceph is a distributed network file system designed to provide good
+performance, reliability, and scalability.
+Basic features include:
+ * POSIX semantics
+ * Seamless scaling from 1 to many thousands of nodes
+ * High availability and reliability. No single point of failure.
+ * N-way replication of data across storage nodes
+ * Fast recovery from node failures
+ * Automatic rebalancing of data on node addition/removal
+ * Easy deployment: most FS components are userspace daemons
+ * Flexible snapshots (on any directory)
+ * Recursive accounting (nested files, directories, bytes)
+In contrast to cluster filesystems like GFS, OCFS2, and GPFS that rely
+on symmetric access by all clients to shared block devices, Ceph
+separates data and metadata management into independent server
+clusters, similar to Lustre. Unlike Lustre, however, metadata and
+storage nodes run entirely as user space daemons. Storage nodes
+utilize btrfs to store data objects, leveraging its advanced features
+(checksumming, metadata replication, etc.). File data is striped
+across storage nodes in large chunks to distribute workload and
+facilitate high throughputs. When storage nodes fail, data is
+re-replicated in a distributed fashion by the storage nodes themselves
+(with some minimal coordination from a cluster monitor), making the
+system extremely efficient and scalable.
+Metadata servers effectively form a large, consistent, distributed
+in-memory cache above the file namespace that is extremely scalable,
+dynamically redistributes metadata in response to workload changes,
+and can tolerate arbitrary (well, non-Byzantine) node failures. The
+metadata server takes a somewhat unconventional approach to metadata
+storage to significantly improve performance for common workloads. In
+particular, inodes with only a single link are embedded in
+directories, allowing entire directories of dentries and inodes to be
+loaded into its cache with a single I/O operation. The contents of
+extremely large directories can be fragmented and managed by
+independent metadata servers, allowing scalable concurrent access.
+The system offers automatic data rebalancing/migration when scaling
+from a small cluster of just a few nodes to many hundreds, without
+requiring an administrator carve the data set into static volumes or
+go through the tedious process of migrating data between servers.
+When the file system approaches full, new nodes can be easily added
+and things will "just work."
+Ceph includes flexible snapshot mechanism that allows a user to create
+a snapshot on any subdirectory (and its nested contents) in the
+system. Snapshot creation and deletion are as simple as 'mkdir
+.snap/foo' and 'rmdir .snap/foo'.
+Ceph also provides some recursive accounting on directories for nested
+files and bytes. That is, a 'getfattr -d foo' on any directory in the
+system will reveal the total number of nested regular files and
+subdirectories, and a summation of all nested file sizes. This makes
+the identification of large disk space consumers relatively quick, as
+no 'du' or similar recursive scan of the file system is required.
+Mount Syntax
+The basic mount syntax is:
+ # mount -t ceph monip[:port][,monip2[:port]...]:/[subdir] mnt
+You only need to specify a single monitor, as the client will get the
+full list when it connects. (However, if the monitor you specify
+happens to be down, the mount won't succeed.) The port can be left
+off if the monitor is using the default. So if the monitor is at
+ # mount -t ceph /mnt/ceph
+is sufficient. If /sbin/mount.ceph is installed, a hostname can be
+used instead of an IP address.
+Mount Options
+ ip=A.B.C.D[:N]
+ Specify the IP and/or port the client should bind to locally.
+ There is normally not much reason to do this. If the IP is not
+ specified, the client's IP address is determined by looking at the
+ address it's connection to the monitor originates from.
+ wsize=X
+ Specify the maximum write size in bytes. By default there is no
+ maximum. Ceph will normally size writes based on the file stripe
+ size.
+ rsize=X
+ Specify the maximum readahead.
+ mount_timeout=X
+ Specify the timeout value for mount (in seconds), in the case
+ of a non-responsive Ceph file system. The default is 30
+ seconds.
+ rbytes
+ When stat() is called on a directory, set st_size to 'rbytes',
+ the summation of file sizes over all files nested beneath that
+ directory. This is the default.
+ norbytes
+ When stat() is called on a directory, set st_size to the
+ number of entries in that directory.
+ nocrc
+ Disable CRC32C calculation for data writes. If set, the storage node
+ must rely on TCP's error correction to detect data corruption
+ in the data payload.
+ noasyncreaddir
+ Disable client's use its local cache to satisfy readdir
+ requests. (This does not change correctness; the client uses
+ cached metadata only when a lease or capability ensures it is
+ valid.)
+More Information
+For more information on Ceph, see the home page at
+The Linux kernel client source tree is available at
+ git://
+ git://
+and the source for the full system is at
+ git://
diff --git a/Documentation/filesystems/tmpfs.txt b/Documentation/filesystems/tmpfs.txt
index 3015da0c6b2a..fe09a2cb1858 100644
--- a/Documentation/filesystems/tmpfs.txt
+++ b/Documentation/filesystems/tmpfs.txt
@@ -82,11 +82,13 @@ tmpfs has a mount option to set the NUMA memory allocation policy for
all files in that instance (if CONFIG_NUMA is enabled) - which can be
adjusted on the fly via 'mount -o remount ...'
-mpol=default prefers to allocate memory from the local node
+mpol=default use the process allocation policy
+ (see set_mempolicy(2))
mpol=prefer:Node prefers to allocate memory from the given Node
mpol=bind:NodeList allocates memory only from nodes in NodeList
mpol=interleave prefers to allocate from each node in turn
mpol=interleave:NodeList allocates from each node of NodeList in turn
+mpol=local prefers to allocate memory from the local node
NodeList format is a comma-separated list of decimal numbers and ranges,
a range being two hyphen-separated decimal numbers, the smallest and
@@ -134,3 +136,5 @@ Author:
Christoph Rohland <>, 1.12.01
Hugh Dickins, 4 June 2007
+ KOSAKI Motohiro, 16 Mar 2010
diff --git a/Documentation/input/multi-touch-protocol.txt b/Documentation/input/multi-touch-protocol.txt
index 8490480ce432..c0fc1c75fd88 100644
--- a/Documentation/input/multi-touch-protocol.txt
+++ b/Documentation/input/multi-touch-protocol.txt
@@ -68,6 +68,22 @@ like:
+Here is the sequence after lifting one of the fingers:
+And here is the sequence after lifting the remaining finger:
+If the driver reports one of BTN_TOUCH or ABS_PRESSURE in addition to the
+ABS_MT events, the last SYN_MT_REPORT event may be omitted. Otherwise, the
+last SYN_REPORT will be dropped by the input core, resulting in no
+zero-finger event reaching userland.
Event Semantics
@@ -217,11 +233,6 @@ where examples can be found.
difference between the contact position and the approaching tool position
could be used to derive tilt.
[2] The list can of course be extended.
-[3] The multi-touch X driver is currently in the prototyping stage. At the
-time of writing (April 2009), the MT protocol is not yet merged, and the
-prototype implements finger matching, basic mouse support and two-finger
-scrolling. The project aims at improving the quality of current multi-touch
-functionality available in the Synaptics X driver, and in addition
-implement more advanced gestures.
+[3] Multitouch X driver project:
[4] See the section on event computation.
[5] See the section on finger tracking.
diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index 35c9b51d20ea..dd5806f4fcc4 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -291,6 +291,7 @@ Code Seq#(hex) Include File Comments
0x92 00-0F drivers/usb/mon/mon_bin.c
0x93 60-7F linux/auto_fs.h
0x94 all fs/btrfs/ioctl.h
+0x97 00-7F fs/ceph/ioctl.h Ceph file system
0x99 00-0F 537-Addinboard driver
0xA0 all linux/sdp/sdp.h Industrial Device Project
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 3bc48b0bd3a9..e2202e93b148 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -200,10 +200,6 @@ and is between 256 and 4096 characters. It is defined in the file
See above.
- acpi_early_pdc_eval [HW,ACPI] Evaluate processor _PDC methods
- early. Needed on some platforms to properly
- initialize the EC.
acpi_irq_balance [HW,ACPI]
ACPI will balance active IRQs
default in APIC mode
@@ -324,11 +320,6 @@ and is between 256 and 4096 characters. It is defined in the file
amd_iommu= [HW,X86-84]
Pass parameters to the AMD IOMMU driver in the system.
Possible values are:
- isolate - enable device isolation (each device, as far
- as possible, will get its own protection
- domain) [default]
- share - put every device behind one IOMMU into the
- same protection domain
fullflush - enable flushing of IO/TLB entries when
they are unmapped. Otherwise they are
flushed before they will be reused, which
diff --git a/Documentation/kobject.txt b/Documentation/kobject.txt
index bdb13817e1e9..3ab2472509cb 100644
--- a/Documentation/kobject.txt
+++ b/Documentation/kobject.txt
@@ -59,37 +59,56 @@ nice to have in other objects. The C language does not allow for the
direct expression of inheritance, so other techniques - such as structure
embedding - must be used.
-So, for example, the UIO code has a structure that defines the memory
-region associated with a uio device:
+(As an aside, for those familiar with the kernel linked list implementation,
+this is analogous as to how "list_head" structs are rarely useful on
+their own, but are invariably found embedded in the larger objects of
-struct uio_mem {
+So, for example, the UIO code in drivers/uio/uio.c has a structure that
+defines the memory region associated with a uio device:
+ struct uio_map {
struct kobject kobj;
- unsigned long addr;
- unsigned long size;
- int memtype;
- void __iomem *internal_addr;
+ struct uio_mem *mem;
+ };
-If you have a struct uio_mem structure, finding its embedded kobject is
+If you have a struct uio_map structure, finding its embedded kobject is
just a matter of using the kobj member. Code that works with kobjects will
often have the opposite problem, however: given a struct kobject pointer,
what is the pointer to the containing structure? You must avoid tricks
(such as assuming that the kobject is at the beginning of the structure)
and, instead, use the container_of() macro, found in <linux/kernel.h>:
- container_of(pointer, type, member)
+ container_of(pointer, type, member)
+ * "pointer" is the pointer to the embedded kobject,
+ * "type" is the type of the containing structure, and
+ * "member" is the name of the structure field to which "pointer" points.
+The return value from container_of() is a pointer to the corresponding
+container type. So, for example, a pointer "kp" to a struct kobject
+embedded *within* a struct uio_map could be converted to a pointer to the
+*containing* uio_map structure with:
+ struct uio_map *u_map = container_of(kp, struct uio_map, kobj);
+For convenience, programmers often define a simple macro for "back-casting"
+kobject pointers to the containing type. Exactly this happens in the
+earlier drivers/uio/uio.c, as you can see here:
+ struct uio_map {
+ struct kobject kobj;
+ struct uio_mem *mem;
+ };
-where pointer is the pointer to the embedded kobject, type is the type of
-the containing structure, and member is the name of the structure field to
-which pointer points. The return value from container_of() is a pointer to
-the given type. So, for example, a pointer "kp" to a struct kobject
-embedded within a struct uio_mem could be converted to a pointer to the
-containing uio_mem structure with:
+ #define to_map(map) container_of(map, struct uio_map, kobj)
- struct uio_mem *u_mem = container_of(kp, struct uio_mem, kobj);
+where the macro argument "map" is a pointer to the struct kobject in
+question. That macro is subsequently invoked with:
-Programmers often define a simple macro for "back-casting" kobject pointers
-to the containing type.
+ struct uio_map *map = to_map(kobj);
Initialization of kobjects
@@ -387,4 +406,5 @@ called, and the objects in the former circle release each other.
Example code to copy from
For a more complete example of using ksets and kobjects properly, see the
-sample/kobject/kset-example.c code.
+example programs samples/kobject/{kobject-example.c,kset-example.c},
+which will be built as loadable modules if you select CONFIG_SAMPLE_KOBJECT.
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 7f5809eddee6..631ad2f1b229 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -3,6 +3,7 @@
By: David Howells <>
+ Paul E. McKenney <>
@@ -60,6 +61,10 @@ Contents:
- And then there's the Alpha.
+ (*) Example uses.
+ - Circular buffers.
(*) References.
@@ -2226,6 +2231,21 @@ The Alpha defines the Linux kernel's memory barrier model.
See the subsection on "Cache Coherency" above.
+Memory barriers can be used to implement circular buffering without the need
+of a lock to serialise the producer with the consumer. See:
+ Documentation/circular-buffers.txt
+for details.
diff --git a/Documentation/networking/Makefile b/Documentation/networking/Makefile
index 6d8af1ac56c4..5aba7a33aeeb 100644
--- a/Documentation/networking/Makefile
+++ b/Documentation/networking/Makefile
@@ -6,3 +6,5 @@ hostprogs-y := ifenslave
# Tell kbuild to always build the programs
always := $(hostprogs-y)
+obj-m := timestamping/
diff --git a/Documentation/networking/stmmac.txt b/Documentation/networking/stmmac.txt
new file mode 100644
index 000000000000..7ee770b5ef5f
--- /dev/null
+++ b/Documentation/networking/stmmac.txt
@@ -0,0 +1,143 @@
+ STMicroelectronics 10/100/1000 Synopsys Ethernet driver
+Copyright (C) 2007-2010 STMicroelectronics Ltd
+Author: Giuseppe Cavallaro <>
+This is the driver for the MAC 10/100/1000 on-chip Ethernet controllers
+(Synopsys IP blocks); it has been fully tested on STLinux platforms.
+Currently this network device driver is for all STM embedded MAC/GMAC
+(7xxx SoCs).
+DWC Ether MAC 10/100/1000 Universal version 3.41a and DWC Ether MAC 10/100
+Universal version 4.0 have been used for developing the first code
+Please, for more information also visit:
+1) Kernel Configuration
+The kernel configuration option is STMMAC_ETH:
+ Device Drivers ---> Network device support ---> Ethernet (1000 Mbit) --->
+ STMicroelectronics 10/100/1000 Ethernet driver (STMMAC_ETH)
+2) Driver parameters list:
+ debug: message level (0: no output, 16: all);
+ phyaddr: to manually provide the physical address to the PHY device;
+ dma_rxsize: DMA rx ring size;
+ dma_txsize: DMA tx ring size;
+ buf_sz: DMA buffer size;
+ tc: control the HW FIFO threshold;
+ tx_coe: Enable/Disable Tx Checksum Offload engine;
+ watchdog: transmit timeout (in milliseconds);
+ flow_ctrl: Flow control ability [on/off];
+ pause: Flow Control Pause Time;
+ tmrate: timer period (only if timer optimisation is configured).
+3) Command line options
+Driver parameters can be also passed in command line by using:
+ stmmaceth=dma_rxsize:128,dma_txsize:512
+4) Driver information and notes
+4.1) Transmit process
+The xmit method is invoked when the kernel needs to transmit a packet; it sets
+the descriptors in the ring and informs the DMA engine that there is a packet
+ready to be transmitted.
+Once the controller has finished transmitting the packet, an interrupt is
+triggered; So the driver will be able to release the socket buffers.
+By default, the driver sets the NETIF_F_SG bit in the features field of the
+net_device structure enabling the scatter/gather feature.
+4.2) Receive process
+When one or more packets are received, an interrupt happens. The interrupts
+are not queued so the driver has to scan all the descriptors in the ring during
+the receive process.
+This is based on NAPI so the interrupt handler signals only if there is work to be
+done, and it exits.
+Then the poll method will be scheduled at some future point.
+The incoming packets are stored, by the DMA, in a list of pre-allocated socket
+buffers in order to avoid the memcpy (Zero-copy).
+4.3) Timer-Driver Interrupt
+Instead of having the device that asynchronously notifies the frame receptions, the
+driver configures a timer to generate an interrupt at regular intervals.
+Based on the granularity of the timer, the frames that are received by the device
+will experience different levels of latency. Some NICs have dedicated timer
+device to perform this task. STMMAC can use either the RTC device or the TMU
+channel 2 on STLinux platforms.
+The timers frequency can be passed to the driver as parameter; when change it,
+take care of both hardware capability and network stability/performance impact.
+Several performance tests on STM platforms showed this optimisation allows to spare
+the CPU while having the maximum throughput.
+4.4) WOL
+Wake up on Lan feature through Magic Frame is only supported for the GMAC
+4.5) DMA descriptors
+Driver handles both normal and enhanced descriptors. The latter has been only
+tested on DWC Ether MAC 10/100/1000 Universal version 3.41a.
+4.6) Ethtool support
+Ethtool is supported. Driver statistics and internal errors can be taken using:
+ethtool -S ethX command. It is possible to dump registers etc.
+4.7) Jumbo and Segmentation Offloading
+Jumbo frames are supported and tested for the GMAC.
+The GSO has been also added but it's performed in software.
+LRO is not supported.
+4.8) Physical
+The driver is compatible with PAL to work with PHY and GPHY devices.
+4.9) Platform information
+Several information came from the platform; please refer to the
+driver's Header file in include/linux directory.
+struct plat_stmmacenet_data {
+ int bus_id;
+ int pbl;
+ int has_gmac;
+ void (*fix_mac_speed)(void *priv, unsigned int speed);
+ void (*bus_setup)(unsigned long ioaddr);
+ struct stm_pad_config *pad_config;
+ void *bsp_priv;
+- pbl (Programmable Burst Length) is maximum number of
+ beats to be transferred in one DMA transaction.
+ GMAC also enables the 4xPBL by default.
+- fix_mac_speed and bus_setup are used to configure internal target
+ registers (on STM platforms);
+- has_gmac: GMAC core is on board (get it at run-time in the next step);
+- bus_id: bus identifier.
+struct plat_stmmacphy_data {
+ int bus_id;
+ int phy_addr;
+ unsigned int phy_mask;
+ int interface;
+ int (*phy_reset)(void *priv);
+ void *priv;
+- bus_id: bus identifier;
+- phy_addr: physical address used for the attached phy device;
+ set it to -1 to get it at run-time;
+- interface: physical MII interface mode;
+- phy_reset: hook to reset HW function.
+- Continue to make the driver more generic and suitable for other Synopsys
+ Ethernet controllers used on other architectures (i.e. ARM).
+- 10G controllers are not supported.
+- MAC uses Normal descriptors and GMAC uses enhanced ones.
+ This is a limit that should be reviewed. MAC could want to
+ use the enhanced structure.
+- Checksumming: Rx/Tx csum is done in HW in case of GMAC only.
+- Review the timer optimisation code to use an embedded device that seems to be
+ available in new chip generations.
diff --git a/Documentation/networking/timestamping.txt b/Documentation/networking/timestamping.txt
index 0e58b4539176..e8c8f4f06c67 100644
--- a/Documentation/networking/timestamping.txt
+++ b/Documentation/networking/timestamping.txt
@@ -41,11 +41,12 @@ SOF_TIMESTAMPING_SOFTWARE: return system time stamp generated in
SOF_TIMESTAMPING_TX/RX determine how time stamps are generated.
SOF_TIMESTAMPING_RAW/SYS determine how they are reported in the
following control message:
- struct scm_timestamping {
- struct timespec systime;
- struct timespec hwtimetrans;
- struct timespec hwtimeraw;
- };
+struct scm_timestamping {
+ struct timespec systime;
+ struct timespec hwtimetrans;
+ struct timespec hwtimeraw;
recvmsg() can be used to get this control message for regular incoming
packets. For send time stamps the outgoing packet is looped back to
@@ -87,12 +88,13 @@ by the network device and will be empty without that support.
Hardware time stamping must also be initialized for each device driver
-that is expected to do hardware time stamping. The parameter is:
+that is expected to do hardware time stamping. The parameter is defined in
+/include/linux/net_tstamp.h as:
struct hwtstamp_config {
- int flags; /* no flags defined right now, must be zero */
- int tx_type; /* HWTSTAMP_TX_* */
- int rx_filter; /* HWTSTAMP_FILTER_* */
+ int flags; /* no flags defined right now, must be zero */
+ int tx_type; /* HWTSTAMP_TX_* */
+ int rx_filter; /* HWTSTAMP_FILTER_* */
Desired behavior is passed into the kernel and to a specific device by
@@ -139,42 +141,56 @@ enum {
/* time stamp any incoming packet */
- /* return value: time stamp all packets requested plus some others */
+ /* return value: time stamp all packets requested plus some others */
/* PTP v1, UDP, any kind of event packet */
- ...
+ /* for the complete list of values, please check
+ * the include file /include/linux/net_tstamp.h
+ */
A driver which supports hardware time stamping must support the
-SIOCSHWTSTAMP ioctl. Time stamps for received packets must be stored
-in the skb with skb_hwtstamp_set().
+SIOCSHWTSTAMP ioctl and update the supplied struct hwtstamp_config with
+the actual values as described in the section on SIOCSHWTSTAMP.
+Time stamps for received packets must be stored in the skb. To get a pointer
+to the shared time stamp structure of the skb call skb_hwtstamps(). Then
+set the time stamps in the structure:
+struct skb_shared_hwtstamps {
+ /* hardware time stamp transformed into duration
+ * since arbitrary point in time
+ */
+ ktime_t hwtstamp;
+ ktime_t syststamp; /* hwtstamp transformed to system time base */
Time stamps for outgoing packets are to be generated as follows:
-- In hard_start_xmit(), check if skb_hwtstamp_check_tx_hardware()
- returns non-zero. If yes, then the driver is expected
- to do hardware time stamping.
+- In hard_start_xmit(), check if skb_tx(skb)->hardware is set no-zero.
+ If yes, then the driver is expected to do hardware time stamping.
- If this is possible for the skb and requested, then declare
- that the driver is doing the time stamping by calling
- skb_hwtstamp_tx_in_progress(). A driver not supporting
- hardware time stamping doesn't do that. A driver must never
- touch sk_buff::tstamp! It is used to store how time stamping
- for an outgoing packets is to be done.
+ that the driver is doing the time stamping by setting the field
+ skb_tx(skb)->in_progress non-zero. You might want to keep a pointer
+ to the associated skb for the next step and not free the skb. A driver
+ not supporting hardware time stamping doesn't do that. A driver must
+ never touch sk_buff::tstamp! It is used to store software generated
+ time stamps by the network subsystem.
- As soon as the driver has sent the packet and/or obtained a
hardware time stamp for it, it passes the time stamp back by
calling skb_hwtstamp_tx() with the original skb, the raw
- hardware time stamp and a handle to the device (necessary
- to convert the hardware time stamp to system time). If obtaining
- the hardware time stamp somehow fails, then the driver should
- not fall back to software time stamping. The rationale is that
- this would occur at a later time in the processing pipeline
- than other software time stamping and therefore could lead
- to unexpected deltas between time stamps.
-- If the driver did not call skb_hwtstamp_tx_in_progress(), then
+ hardware time stamp. skb_hwtstamp_tx() clones the original skb and
+ adds the timestamps, therefore the original skb has to be freed now.
+ If obtaining the hardware time stamp somehow fails, then the driver
+ should not fall back to software time stamping. The rationale is that
+ this would occur at a later time in the processing pipeline than other
+ software time stamping and therefore could lead to unexpected deltas
+ between time stamps.
+- If the driver did not call set skb_tx(skb)->in_progress, then
dev_hard_start_xmit() checks whether software time stamping
is wanted as fallback and potentially generates the time stamp.
diff --git a/Documentation/networking/timestamping/Makefile b/Documentation/networking/timestamping/Makefile
index 2a1489fdc036..e79973443e9f 100644
--- a/Documentation/networking/timestamping/Makefile
+++ b/Documentation/networking/timestamping/Makefile
@@ -1,6 +1,13 @@
-CPPFLAGS = -I../../../include
+# kbuild trick to avoid linker error. Can be omitted if a module is built.
+obj- := dummy.o
-timestamping: timestamping.c
+# List of programs to build
+hostprogs-y := timestamping
+# Tell kbuild to always build the programs
+always := $(hostprogs-y)
+HOSTCFLAGS_timestamping.o += -I$(objtree)/usr/include
rm -f timestamping
diff --git a/Documentation/networking/timestamping/timestamping.c b/Documentation/networking/timestamping/timestamping.c
index bab619a48214..8ba82bfe6a33 100644
--- a/Documentation/networking/timestamping/timestamping.c
+++ b/Documentation/networking/timestamping/timestamping.c
@@ -41,9 +41,9 @@
#include <arpa/inet.h>
#include <net/if.h>
-#include "asm/types.h"
-#include "linux/net_tstamp.h"
-#include "linux/errqueue.h"
+#include <asm/types.h>
+#include <linux/net_tstamp.h>
+#include <linux/errqueue.h>
@@ -164,7 +164,7 @@ static void printpacket(struct msghdr *msg, int res,
gettimeofday(&now, 0);
- printf("%ld.%06ld: received %s data, %d bytes from %s, %d bytes control messages\n",
+ printf("%ld.%06ld: received %s data, %d bytes from %s, %zu bytes control messages\n",
(long)now.tv_sec, (long)now.tv_usec,
(recvmsg_flags & MSG_ERRQUEUE) ? "error" : "regular",
@@ -173,7 +173,7 @@ static void printpacket(struct msghdr *msg, int res,
for (cmsg = CMSG_FIRSTHDR(msg);
cmsg = CMSG_NXTHDR(msg, cmsg)) {
- printf(" cmsg len %d: ", cmsg->cmsg_len);
+ printf(" cmsg len %zu: ", cmsg->cmsg_len);
switch (cmsg->cmsg_level) {
printf("SOL_SOCKET ");
diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt
index 6e37be1eeb2d..4f8930263dd9 100644
--- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt
+++ b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt
@@ -21,6 +21,15 @@ Required properties:
- fsl,qe-num-snums: define how many serial number(SNUM) the QE can use for the
+Optional properties:
+- fsl,firmware-phandle:
+ Usage: required only if there is no fsl,qe-firmware child node
+ Value type: <phandle>
+ Definition: Points to a firmware node (see "QE Firmware Node" below)
+ that contains the firmware that should be uploaded for this QE.
+ The compatible property for the firmware node should say,
+ "fsl,qe-firmware".
Recommended properties
- brg-frequency : the internal clock source frequency for baud-rate
generators in Hz.
@@ -59,3 +68,48 @@ Example:
reg = <0 c000>;
+* QE Firmware Node
+This node defines a firmware binary that is embedded in the device tree, for
+the purpose of passing the firmware from bootloader to the kernel, or from
+the hypervisor to the guest.
+The firmware node itself contains the firmware binary contents, a compatible
+property, and any firmware-specific properties. The node should be placed
+inside a QE node that needs it. Doing so eliminates the need for a
+fsl,firmware-phandle property. Other QE nodes that need the same firmware
+should define an fsl,firmware-phandle property that points to the firmware node
+in the first QE node.
+The fsl,firmware property can be specified in the DTS (possibly using incbin)
+or can be inserted by the boot loader at boot time.
+Required properties:
+ - compatible
+ Usage: required
+ Value type: <string>
+ Definition: A standard property. Specify a string that indicates what
+ kind of firmware it is. For QE, this should be "fsl,qe-firmware".
+ - fsl,firmware
+ Usage: required
+ Value type: <prop-encoded-array>, encoded as an array of bytes
+ Definition: A standard property. This property contains the firmware
+ binary "blob".
+ qe1@e0080000 {
+ compatible = "fsl,qe";
+ qe_firmware:qe-firmware {
+ compatible = "fsl,qe-firmware";
+ fsl,firmware = [0x70 0xcd 0x00 0x00 0x01 0x46 0x45 ...];
+ };
+ ...
+ };
+ qe2@e0090000 {
+ compatible = "fsl,qe";
+ fsl,firmware-phandle = <&qe_firmware>;
+ ...
+ };
diff --git a/Documentation/sound/alsa/HD-Audio.txt b/Documentation/sound/alsa/HD-Audio.txt
index f4dd3bf99d12..98d14cb8a85d 100644
--- a/Documentation/sound/alsa/HD-Audio.txt
+++ b/Documentation/sound/alsa/HD-Audio.txt
@@ -119,10 +119,18 @@ the codec slots 0 and 1 no matter what the hardware reports.
Interrupt Handling
-In rare but some cases, the interrupt isn't properly handled as
-default. You would notice this by the DMA transfer error reported by
-ALSA PCM core, for example. Using MSI might help in such a case.
-Pass `enable_msi=1` option for enabling MSI.
+HD-audio driver uses MSI as default (if available) since 2.6.33
+kernel as MSI works better on some machines, and in general, it's
+better for performance. However, Nvidia controllers showed bad
+regressions with MSI (especially in a combination with AMD chipset),
+thus we disabled MSI for them.
+There seem also still other devices that don't work with MSI. If you
+see a regression wrt the sound quality (stuttering, etc) or a lock-up
+in the recent kernel, try to pass `enable_msi=0` option to disable
+MSI. If it works, you can add the known bad device to the blacklist
+defined in hda_intel.c. In such a case, please report and give the
+patch back to the upstream developer.
diff --git a/Documentation/volatile-considered-harmful.txt b/Documentation/volatile-considered-harmful.txt
index 991c26a6ef64..db0cb228d64a 100644
--- a/Documentation/volatile-considered-harmful.txt
+++ b/Documentation/volatile-considered-harmful.txt
@@ -63,9 +63,9 @@ way to perform a busy wait is:
The cpu_relax() call can lower CPU power consumption or yield to a
-hyperthreaded twin processor; it also happens to serve as a memory barrier,
-so, once again, volatile is unnecessary. Of course, busy-waiting is
-generally an anti-social act to begin with.
+hyperthreaded twin processor; it also happens to serve as a compiler
+barrier, so, once again, volatile is unnecessary. Of course, busy-
+waiting is generally an anti-social act to begin with.
There are still a few rare situations where volatile makes sense in the
diff --git a/Documentation/watchdog/src/watchdog-simple.c b/Documentation/watchdog/src/watchdog-simple.c
index 4cf72f3fa8e9..ba45803a2216 100644
--- a/Documentation/watchdog/src/watchdog-simple.c
+++ b/Documentation/watchdog/src/watchdog-simple.c
@@ -17,9 +17,6 @@ int main(void)
ret = -1;
- ret = fsync(fd);
- if (ret)
- break;
diff --git a/Documentation/watchdog/src/watchdog-test.c b/Documentation/watchdog/src/watchdog-test.c
index a750532ffcf8..63fdc34ceb98 100644
--- a/Documentation/watchdog/src/watchdog-test.c
+++ b/Documentation/watchdog/src/watchdog-test.c
@@ -31,6 +31,8 @@ static void keep_alive(void)
int main(int argc, char *argv[])
+ int flags;
fd = open("/dev/watchdog", O_WRONLY);
if (fd == -1) {
@@ -41,12 +43,14 @@ int main(int argc, char *argv[])
if (argc > 1) {
if (!strncasecmp(argv[1], "-d", 2)) {
+ ioctl(fd, WDIOC_SETOPTIONS, &flags);
fprintf(stderr, "Watchdog card disabled.\n");
} else if (!strncasecmp(argv[1], "-e", 2)) {
+ ioctl(fd, WDIOC_SETOPTIONS, &flags);
fprintf(stderr, "Watchdog card enabled.\n");
diff --git a/Documentation/watchdog/watchdog-api.txt b/Documentation/watchdog/watchdog-api.txt
index 4cc4ba9d7150..eb7132ed8bbc 100644
--- a/Documentation/watchdog/watchdog-api.txt
+++ b/Documentation/watchdog/watchdog-api.txt
@@ -222,11 +222,10 @@ returned value is the temperature in degrees fahrenheit.
ioctl(fd, WDIOC_GETTEMP, &temperature);
Finally the SETOPTIONS ioctl can be used to control some aspects of
-the cards operation; right now the pcwd driver is the only one
-supporting this ioctl.
+the cards operation.
int options = 0;
- ioctl(fd, WDIOC_SETOPTIONS, options);
+ ioctl(fd, WDIOC_SETOPTIONS, &options);
The following options are available: