[m-dev.] Lock Elision in eglibc 2.19

Paul Bone paul at bone.id.au
Sat Jun 21 11:14:13 AEST 2014


Hi Andi

I've been tracking down a bug that appeared when I upgraded from eglibc 2.18
to eglibc 2.19 (on Debian jessie, x86_64).

The Mercury programming language http://www.mercurylang.org uses the
Boehm-Demers-Weiser Garbage Collector http://hboehm.info/gc/.  Both use
pthreads on Linux.  The bug is tracked as
https://www.mercurylang.org/bugs/view.php?id=334 in Mercury's bug tracker.

I noticed the following error:
mercury_compile: ../nptl/pthread_mutex_lock.c:80: __pthread_mutex_cond_lock:
Assertion `mutex->__data.__owner == 0' failed.

This is raised (indirectly) from a call to pthread_cond_wait in
pthread_support.c line 2036 in Boehm GC 7.4.2.  I have the same problem with
Boehm GC 7.2.  There doesn't appear to be anything suspicious about the use
of the mutex or condition variable involved here.
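
For reference, the usage in question follows the ordinary mutex plus
condition variable pattern.  The sketch below is NOT the Boehm GC source; the
names mark_mutex, mark_cv, work_ready and run_handoff are hypothetical and
only illustrate the shape of the code.  pthread_cond_wait releases the mutex
while it sleeps and re-acquires it before returning, and the failing
assertion is in __pthread_mutex_cond_lock, which is reached through
pthread_cond_wait.

```c
#include <pthread.h>

/* Hypothetical names for illustration; this is not the Boehm GC code. */
static pthread_mutex_t mark_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  mark_cv    = PTHREAD_COND_INITIALIZER;
static int work_ready = 0;

/* Waiter: sleeps until work_ready is set.  pthread_cond_wait drops
 * mark_mutex while sleeping and holds it again when it returns. */
static void *helper(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mark_mutex);
    while (!work_ready)              /* loop guards against spurious wakeups */
        pthread_cond_wait(&mark_cv, &mark_mutex);
    pthread_mutex_unlock(&mark_mutex);
    return NULL;
}

/* Signaller: sets the flag under the mutex and wakes the waiter.
 * Returns 1 on success, -1 if a thread call fails. */
int run_handoff(void)
{
    pthread_t t;
    if (pthread_create(&t, NULL, helper, NULL) != 0)
        return -1;

    pthread_mutex_lock(&mark_mutex);
    work_ready = 1;
    pthread_cond_broadcast(&mark_cv);
    pthread_mutex_unlock(&mark_mutex);

    if (pthread_join(t, NULL) != 0)
        return -1;
    return work_ready;
}
```

Calling run_handoff() from main and linking with -pthread is enough to
exercise the lock, wait, and re-acquire sequence.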

A different bug affecting libtirpc and mount_nfs also started occurring when
I upgraded to eglibc 2.19.  When investigating this I found that eglibc 2.19
introduced lock elision using TSX extensions and found your article here:
http://lwn.net/Articles/534758/ I use an i7-4770 processor which supports
TSX.  (I chose this one because I wanted to experiment with some lock free
code myself.)

I've looked at the NPTL code and the Boehm code and I don't see anything
obvious - not that the NPTL assembler is easy to read.  Given that the
assertion refers to the __owner field, and that the elision paths don't
update this field, I wonder whether the two are related, that is, whether
not updating the __owner field causes other problems.

This mutex and condition variable refer to the Boehm collector's marking
phase, which will read and update a lot of memory.  Is the mutex code
falling back from lock elision to normal locks for this mutex and then
triggering the assertion because the owner field hasn't been updated?

As a work-around I'd like to explicitly disable elision for this mutex.
I've searched the glibc/eglibc sources and documentation and haven't found a
way to disable elision.  But some things I read (mailing list messages etc)
say that it should be possible either per mutex or completely (with an
environment variable).  Could you tell me how?  Thanks.


I have a second question that is less important, but I'd like to understand
nevertheless.  Your LWN article suggests that the entire critical section
(from pthread_mutex_lock to pthread_mutex_unlock) is a transactional memory
transaction.  Have I understood correctly?  If so, why not just start and
finish the transactional memory transaction within the pthread_mutex_lock
code?  That is, after acquiring the lock, finish the TM transaction so that
the processor doesn't need to handle all the memory use until the
pthread_mutex_unlock call specially.

Thanks.


-- 
Paul Bone


