
thread Module

Let's take a look at what the thread module has to offer. In addition to being able to spawn threads, the thread module also provides a basic synchronization data structure called a lock object (a.k.a. primitive lock, simple lock, mutual exclusion lock, mutex, binary semaphore). As we mentioned earlier, such synchronization primitives go hand-in-hand with thread management.

Listed in Table 17.1 are the more commonly used thread module functions and LockType lock object methods:

Table 17.1. thread Module and Lock Objects

Function/Method                                  Description

thread Module Functions
start_new_thread(function, args, kwargs=None)    spawns a new thread and executes function with the given args and optional kwargs
allocate_lock()                                  allocates a LockType lock object
exit()                                           raises SystemExit to instruct the calling thread to exit

LockType Lock Object Methods
acquire(wait=None)                               attempts to acquire the lock object
locked()                                         returns 1 if the lock is acquired, 0 otherwise
release()                                        releases the lock
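In Python 3 this module survives under the name _thread, but the lock-object interface in the table is unchanged. A minimal sketch of the acquire/locked/release life cycle, written for Python 3:

```python
import _thread  # Python 3 name for the old "thread" module

lock = _thread.allocate_lock()  # allocate_lock(): returns a new, unlocked LockType object
assert not lock.locked()        # nothing holds the lock yet

lock.acquire()                  # acquire(): takes the lock (would block if already held)
assert lock.locked()            # locked() now reports the lock is held

lock.release()                  # release(): frees the lock for other threads
assert not lock.locked()
print('lock semantics OK')
```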

The key function of the thread module is start_new_thread(). Its syntax is exactly that of the apply() built-in function, taking a function along with arguments and optional keyword arguments. The difference is that instead of the main thread executing the function, a new thread is spawned to invoke the function.

Let's take our onethr.py example and integrate threading into it. By slightly changing the call to the loop*() functions, we now present mtsleep1.py in Example 17.2.

Example 17.2. Using the thread Module (mtsleep1.py)

The same loops from onethr.py are executed, but this time using the simple multithreaded mechanism provided by the thread module. The two loops are executed concurrently (with the shorter one finishing first, obviously), and the total elapsed time is only as long as the slowest thread rather than the total time for each separately.

1  #!/usr/bin/env python
2
3  import thread
4  from time import sleep, time, ctime
5
6  def loop0():
7      print 'start loop 0 at:', ctime(time())
8      sleep(4)
9      print 'loop 0 done at:', ctime(time())
10
11 def loop1():
12     print 'start loop 1 at:', ctime(time())
13     sleep(2)
14     print 'loop 1 done at:', ctime(time())
15
16 def main():
17     print 'starting threads...'
18     thread.start_new_thread(loop0, ())
19     thread.start_new_thread(loop1, ())
20     sleep(6)
21     print 'all DONE at:', ctime(time())
22
23 if __name__ == '__main__':
24     main()

start_new_thread() requires its first two arguments, which is why we pass in an empty tuple even when the executed function takes no arguments.

Upon execution of this program, our output changes drastically. The two loops now finish in only 4 seconds, the length of our longest loop, rather than the 6 or 7 the sequential version took; the script as a whole, however, still runs for the full 6 seconds that main() sleeps.

% mtsleep1.py
starting threads...
start loop 0 at: Sun Aug 13 05:04:50 2000
start loop 1 at: Sun Aug 13 05:04:50 2000
loop 1 done at: Sun Aug 13 05:04:52 2000
loop 0 done at: Sun Aug 13 05:04:54 2000
all DONE at: Sun Aug 13 05:04:56 2000

The pieces of code that sleep for 4 and 2 seconds now occur concurrently, contributing to the lower overall runtime.

The only other major change to our application is the addition of the sleep(6) call. Why is it necessary? If we did not stop the main thread from continuing, it would proceed to the next statement, display 'all DONE,' and exit, killing both threads running loop0() and loop1().

We did not have any code which told the main thread to wait for the child threads to complete before continuing. This is what we mean by threads requiring some sort of synchronization. In our case, we used another sleep() call as our synchronization mechanism. We used a value of 6 seconds because we know that both threads (which take 4 and 2 seconds, as you know) should have completed by the time the main thread has counted to 6.

You are probably thinking that there should be a better way of managing threads than creating that extra delay of 6 seconds in the main thread. Because of this delay, the overall runtime is no better than in our single-threaded version. Using sleep() for thread synchronization as we did is not reliable. What if our loops had independent and varying execution times? We may be exiting the main thread too early or too late. This is where locks come in.

Making yet another update to our code to include locks as well as getting rid of the separate loop functions, we get mtsleep2.py, presented in Example 17.3. Running it, we see that the output is similar to mtsleep1.py's. The only difference is that we did not have to wait the extra time: by using locks, we exit as soon as both threads have completed execution.

% mtsleep2.py
starting threads...
start loop 0 at: Sun Aug 13 16:34:41 2000
start loop 1 at: Sun Aug 13 16:34:41 2000
loop 1 done at: Sun Aug 13 16:34:43 2000
loop 0 done at: Sun Aug 13 16:34:45 2000
all DONE at: Sun Aug 13 16:34:45 2000
Example 17.3. Using thread and Locks (mtsleep2.py)

Rather than using a call to sleep() to hold up the main thread as in mtsleep1.py, the use of locks makes more sense.

1  #!/usr/bin/env python
2
3  import thread
4  from time import sleep, time, ctime
5
6  loops = [ 4, 2 ]
7
8  def loop(nloop, nsec, lock):
9      print 'start loop', nloop, 'at:', ctime(time())
10     sleep(nsec)
11     print 'loop', nloop, 'done at:', ctime(time())
12     lock.release()
13
14 def main():
15     print 'starting threads...'
16     locks = []
17     nloops = range(len(loops))
18
19     for i in nloops:
20         lock = thread.allocate_lock()
21         lock.acquire()
22         locks.append(lock)
23
24     for i in nloops:
25         thread.start_new_thread(loop,
26             (i, loops[i], locks[i]))
27
28     for i in nloops:
29         while locks[i].locked(): pass
30
31     print 'all DONE at:', ctime(time())
32
33 if __name__ == '__main__':
34     main()

So how did we accomplish our task with locks? Let's take a look at the source code:

Line-by-line explanation

Lines 1–6

After the Unix start-up line, we import the thread module and a few familiar attributes of the time module. Rather than hardcoding separate functions to count to 4 and 2 seconds, we will use a single loop() function and place these constants in a list, loops.

Lines 8–12

The loop() function will proxy for the now-removed loop*() functions from our earlier examples. We had to make some cosmetic changes to loop() so that it can now perform its duties using locks. The obvious changes are that we need to be told which loop number we are as well as how long to sleep for. The last piece of new information is the lock itself. Each thread will be allocated an acquired lock. When the sleep() time has concluded, we will release the corresponding lock, indicating to the main thread that this thread has completed.

Lines 14–34

The bulk of the work is done here in main(), using three separate for loops. We first create a list of locks: each is obtained with the thread.allocate_lock() function and locked with its acquire() method. Acquiring a lock has the effect of "locking the lock." Once it is locked, we add it to the lock list, locks. The next loop actually spawns the threads, invoking the loop() function in each, and provides each thread with its loop number, the time to sleep, and the acquired lock for that thread. So why didn't we start the threads in the lock acquisition loop? There are two reasons: (1) we wanted to synchronize the threads, so that "all the horses started out the gate" at around the same time, and (2) locks take a little bit of time to acquire. If a thread executes "too fast," it could complete before its lock has even been acquired.

It is up to each thread to unlock its lock object when it has completed execution. The final loop just sits and spins (pausing the main thread) until both locks have been released before continuing execution. Because we check each lock sequentially, we may be at the mercy of the slower loops if they appear toward the beginning of the set. In such cases, the majority of the wait time is spent on the first loop(s); by the time that lock is released, the remaining locks may already be unlocked (meaning that their threads have completed), so the main thread flies through those checks without pause. Finally, you should be well aware that the final pair of lines executes main() only if we are invoking this script directly.
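One refinement worth noting: because thread-module locks are not owned by any particular thread, the main thread could simply block on acquire() for each lock instead of spinning on locked(), eliminating the busy-wait entirely. A minimal sketch of that idea, written for Python 3 under the renamed _thread module, with the sleep times shortened from 4 and 2 seconds to keep it quick (the done list is an addition for illustration, not part of the book's listing):

```python
import _thread
from time import sleep

loops = [0.4, 0.2]   # shortened from 4 and 2 seconds for brevity
done = []            # records which loops have finished

def loop(nloop, nsec, lock):
    sleep(nsec)
    done.append(nloop)
    lock.release()   # signal completion to the main thread

def main():
    locks = []
    for i in range(len(loops)):
        lock = _thread.allocate_lock()
        lock.acquire()               # each lock starts out held...
        _thread.start_new_thread(loop, (i, loops[i], lock))
        locks.append(lock)
    for lock in locks:
        lock.acquire()               # ...so this blocks until loop() releases it
    print('all DONE')

main()
```

The second acquire() sleeps inside the interpreter rather than burning CPU in a Python-level loop, but the overall logic is the same as mtsleep2.py's.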

As hinted in the earlier Core Note, we presented the thread module only to introduce the reader to threaded programming. Your MT application should use higher-level modules such as the threading module, which we will now discuss.
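As a preview of that discussion, here is a sketch of the same two loops using the higher-level threading module (the thread module became _thread in Python 3, but threading kept its name). Thread.join() replaces all of the manual lock bookkeeping; the sleep times are again shortened for brevity:

```python
import threading
from time import sleep

def loop(nloop, nsec):
    print('start loop', nloop)
    sleep(nsec)
    print('loop', nloop, 'done')

threads = []
for i, nsec in enumerate([0.4, 0.2]):   # shortened from 4 and 2 seconds
    t = threading.Thread(target=loop, args=(i, nsec))
    t.start()
    threads.append(t)

for t in threads:
    t.join()        # block until each thread finishes; no locks needed
print('all DONE')
```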


Last updated on 9/14/2001
Core Python Programming, © 2002 Prentice Hall PTR
