Multithreaded programs. Architectures of multithreaded applications. Parallel processing in VB6

This article is not for seasoned Python tamers, for whom unraveling this tangle of snakes is child's play; it is rather a superficial overview of multithreading capabilities for newly addicted Pythonists.

Unfortunately, there is not much material in Russian on multithreading in Python, and Pythonists who had never heard of, for example, the GIL kept coming to me with enviable regularity. In this article I will try to cover the most basic features of multithreaded Python, explain what the GIL is and how to live with it (or without it), and more.


Python is a charming programming language. It perfectly combines many programming paradigms. Most tasks a programmer may encounter are solved easily, elegantly and concisely. But for all these tasks a single-threaded solution is often enough, and single-threaded programs are usually predictable and easy to debug. The same cannot be said about multi-threaded and multi-process programs.

Multithreaded Applications


Python has a threading module, and it has everything you need for multi-threaded programming: there are various kinds of locks, a semaphore, and an event mechanism. In a word, everything needed for the vast majority of multi-threaded programs. Moreover, all these tools are quite simple to use. Consider an example program that starts two threads. One thread prints ten "0"s, the other ten "1"s, strictly in turn.

import threading

def writer(x, event_for_wait, event_for_set):
    for i in xrange(10):
        event_for_wait.wait()  # wait for event
        event_for_wait.clear()  # clean event for future
        print x
        event_for_set.set()  # set event for neighbor thread

# init events
e1 = threading.Event()
e2 = threading.Event()

# init threads
t1 = threading.Thread(target=writer, args=(0, e1, e2))
t2 = threading.Thread(target=writer, args=(1, e2, e1))

# start threads
t1.start()
t2.start()

e1.set()  # initiate the first event

# join threads to the main thread
t1.join()
t2.join()


No magic, no voodoo code. The code is clear and consistent. Moreover, as you can see, we created a thread from a function. For small tasks this is very convenient. The code is also quite flexible. Suppose we add a third thread that prints "2"; then the code will look like this:

import threading

def writer(x, event_for_wait, event_for_set):
    for i in xrange(10):
        event_for_wait.wait()  # wait for event
        event_for_wait.clear()  # clean event for future
        print x
        event_for_set.set()  # set event for neighbor thread

# init events
e1 = threading.Event()
e2 = threading.Event()
e3 = threading.Event()

# init threads
t1 = threading.Thread(target=writer, args=(0, e1, e2))
t2 = threading.Thread(target=writer, args=(1, e2, e3))
t3 = threading.Thread(target=writer, args=(2, e3, e1))

# start threads
t1.start()
t2.start()
t3.start()

e1.set()  # initiate the first event

# join threads to the main thread
t1.join()
t2.join()
t3.join()


We added a new event and a new thread, and slightly changed the parameters with which the threads start (of course, you could write a more general solution using, for example, MapReduce, but that is beyond the scope of this article).
As you can see, there is still no magic. Everything is simple and clear. Let's go further.

Global Interpreter Lock


There are two common reasons to use threads: first, to exploit the multi-core architecture of modern processors and thereby increase program performance;
second, to divide the program logic into parallel, fully or partially asynchronous sections (for example, to be able to ping several servers at the same time).

In the first case we run into a limitation of Python (or rather of its main implementation, CPython) called the Global Interpreter Lock, or GIL for short. The concept of the GIL is that only one thread at a time can be executed by the interpreter. It exists so that threads do not fight over individual variables: the executing thread gets access to the entire environment. This feature of the thread implementation in Python greatly simplifies working with threads and provides a certain degree of thread safety.

But there is a subtle point here: it may seem that a multi-threaded application will run in exactly the same time as a single-threaded one doing the same work, i.e. the sum of the CPU time of each thread. Here, however, an unpleasant effect awaits us. Consider the program:

with open("test1.txt", "w") as fout:
    for i in xrange(1000000):
        print >> fout, 1


This program simply writes a million "1" lines to a file and does it in ~0.35 seconds on my computer.

Consider another program:

from threading import Thread

def writer(filename, n):
    with open(filename, "w") as fout:
        for i in xrange(n):
            print >> fout, 1

t1 = Thread(target=writer, args=("test2.txt", 500000))
t2 = Thread(target=writer, args=("test3.txt", 500000))

t1.start()
t2.start()

t1.join()
t2.join()


This program creates two threads, each of which writes half a million "1" lines to its own file. The total amount of work is the same as in the previous program. But the timing shows an interesting effect: the program can run anywhere from 0.7 seconds to as long as 7 seconds. Why does this happen?

It happens because when a thread does not need the CPU, it releases the GIL, and at that moment both the other thread and the main thread may try to grab it. Meanwhile the operating system, knowing that there are many cores, can make things worse by trying to distribute the threads across cores.
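The effect is easiest to reproduce with purely CPU-bound code. A minimal sketch in modern Python 3 syntax (timings are machine-dependent, so the numbers are illustrative only):

```python
import threading
import time

def count_down(n):
    # pure bytecode loop: the thread holds the GIL almost continuously
    while n > 0:
        n -= 1

N = 2_000_000

start = time.perf_counter()
count_down(N)
single = time.perf_counter() - start

# same total work, split between two threads
t1 = threading.Thread(target=count_down, args=(N // 2,))
t2 = threading.Thread(target=count_down, args=(N // 2,))
start = time.perf_counter()
t1.start(); t2.start()
t1.join(); t2.join()
threaded = time.perf_counter() - start

# the threaded version is typically no faster, and often slower,
# because both threads take turns fighting over the same GIL
print("single: %.3fs, threaded: %.3fs" % (single, threaded))
```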

UPD: as of Python 3.2 there is an improved implementation of the GIL that partially solves this problem, in particular because each thread, after losing control, waits a short period of time before it may acquire the GIL again (there is a good English-language presentation on this topic).
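The Python 3 interpreter exposes this waiting period as the "switch interval", which can be inspected and tuned (Python 3 only):

```python
import sys

# the new GIL switches threads on a time basis rather than on a
# bytecode-instruction count; the default interval is 5 ms
print(sys.getswitchinterval())   # 0.005 by default

# a smaller interval means more frequent switching (and more overhead)
sys.setswitchinterval(0.001)
print(sys.getswitchinterval())
```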

"So you can't write efficient multi-threaded programs in Python at all?", you ask. No: there is a way out, and even more than one.

Multiprocess Applications


To partially solve the problem described in the previous section, Python has the subprocess module. We can write a program that we want to execute in a parallel thread (actually, in a separate process), and launch it from one or more threads of another program. This really does speed things up, because the threads created in the launching program do not compete for the GIL: they only wait for the spawned process to terminate. However, this method has many problems. The main one is that passing data between processes becomes difficult: you have to serialize objects somehow and communicate through pipes or other tools, all of which incurs overhead, and the code becomes hard to understand.
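A minimal sketch of the subprocess approach (Python 3; the inline child script is a made-up stand-in for real work): the child runs with its own interpreter and its own GIL, and the result comes back over a pipe.

```python
import subprocess
import sys

# run a CPU-heavy task in a separate interpreter process
child = subprocess.Popen(
    [sys.executable, "-c", "print(sum(range(1000000)))"],
    stdout=subprocess.PIPE,
)
out, _ = child.communicate()  # the data comes back through a PIPE as bytes
result = int(out)
print(result)  # 499999500000
```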

Here a different approach can help. Python has a multiprocessing module, which functionally resembles threading. For example, processes can be created in the same way, from ordinary functions, and the methods for working with processes are almost the same as for threads in the threading module. But for synchronizing processes and exchanging data, it is customary to use other tools: queues (Queue) and pipes (Pipe). That said, analogues of the locks, events and semaphores from threading are also available.

In addition, the multiprocessing module has a mechanism for working with shared memory. For this, the module provides the Value and Array classes, which can be shared between processes. For convenience there are also manager classes (Manager): they are more flexible and easier to handle, but slower. And it is worth mentioning the nice ability to share types from the ctypes module using the multiprocessing.sharedctypes module.

The multiprocessing module also has a mechanism for creating process pools. This mechanism is very useful for implementing the Master-Worker pattern, or for implementing a parallel Map (which is, in a sense, a special case of the Master-Worker).

Among the main problems of the multiprocessing module, it is worth noting its relative platform dependence. Since processes are handled differently by different operating systems, some restrictions are imposed on the code. For example, Windows has no fork mechanism, so the point where processes are spawned must be wrapped in:

if __name__ == "__main__" :


However, this construct is good form anyway.

What else...


There are other libraries and approaches for writing parallel applications in Python. For example, you can use Hadoop+Python or various MPI implementations for Python (pyMPI, mpi4py). You can even use wrappers over existing C++ or Fortran libraries. One could also mention frameworks and libraries such as Pyro, Twisted, Tornado and many others. But all that is beyond the scope of this article.

If you liked my style, then in the next article I will try to tell you how to write simple interpreters in PLY and what they can be used for.

Multithreaded programming is not fundamentally different from writing event-driven graphical user interfaces, or even from creating simple sequential applications. All the important rules about encapsulation, separation of concerns, loose coupling and so on apply here. But many developers find it hard to write multithreaded programs precisely because they ignore these rules, trying instead to apply the far less important knowledge about threads and synchronization primitives gleaned from beginner texts on multithreaded programming.

So what are these rules?

A programmer, faced with a problem, thinks: "Ah, right, I'll use regular expressions here." And now he has two problems. (Jamie Zawinski)

Another programmer, faced with a problem, thinks: "Ah, right, I'll use threads here." And now he has ten problems. (Bill Schindler)

Too many programmers who take up multi-threaded code get into trouble like the hero of Goethe's ballad "The Sorcerer's Apprentice". They learn to create a bunch of threads that work in principle, but sooner or later the threads get out of hand, and the programmer does not know what to do.

But unlike the half-trained wizard, the unfortunate programmer cannot hope for the arrival of a powerful sorcerer who will wave a magic wand and restore order. Instead, the programmer resorts to the ugliest tricks, trying to cope with problems that keep popping up. The result is always the same: an overly complicated, limited, fragile and unreliable application with a persistent threat of deadlock and the other dangers inherent in bad multi-threaded code. To say nothing of inexplicable crashes, poor performance, and incomplete or incorrect results.

You may wonder why this happens. There is a common misconception: "multi-threaded programming is very difficult." But it is not. If a multi-threaded program is unreliable, it usually fails for the same reasons as a poor-quality single-threaded program: the programmer does not follow fundamental, long-known and proven development methods. Multithreaded programs only seem more complicated because the more parallel threads misbehave, the bigger the mess they make, and much faster than a single thread could.

The misconception about the "difficulty of multithreaded programming" spread thanks to developers who had grown professionally writing single-threaded code, encountered multithreading for the first time, and failed to cope with it. But instead of reconsidering their prejudices and habitual working methods, they stubbornly patch what refuses to work. Justifying unreliable software and missed deadlines, these people repeat the same thing: "multi-threaded programming is very difficult."

Please note that above I am talking about typical programs that use multithreading. Indeed, there are complex multi-threaded scenarios - as well as complex single-threaded ones. But they are rare. As a rule, in practice, nothing supernatural is required from a programmer. We move the data, transform it, perform some calculations from time to time, and finally store the information in a database or display it on the screen.

There is nothing difficult in improving the average single-threaded program and turning it into a multi-threaded one. At least it shouldn't be. Difficulties arise for two reasons:

  • programmers do not know how to apply simple, well-known proven development methods;
  • most of the information presented in books on multithreaded programming is technically correct, but is completely inapplicable to solving applied problems.

The most important programming concepts are universal. They apply equally to single-threaded and multi-threaded programs. Programmers drowning in a whirlpool of threads simply didn't learn important lessons when they were learning single-threaded code. I can say this because such developers make the same fundamental mistakes in multi-threaded and single-threaded programs.

Perhaps the most important lesson of the sixty-year history of programming can be put like this: global mutable state is evil. Real evil. Programs that depend on global mutable state are comparatively difficult to reason about and generally unreliable, because there are too many ways for the state to change. There is a ton of research supporting this general principle, and countless design patterns whose main goal is to implement some form of data hiding. To make your programs more predictable, eliminate as much mutable state as you can.
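A toy illustration of the principle (the function names are mine): the first version can be corrupted by any caller, while the second passes all data through arguments and return values and is therefore trivially safe to call from any thread.

```python
# depends on global mutable state: every caller can corrupt `total`,
# and two threads calling it at once can lose updates
total = 0

def add_to_total(x):
    global total
    total += x

# state-free alternative: no global is read or written
def add(acc, x):
    return acc + x

acc = 0
for x in (1, 2, 3):
    acc = add(acc, x)
print(acc)  # 6
```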

In a single-threaded sequential program, the likelihood of data corruption is directly proportional to the number of components that can change that data.

As a rule, it is not possible to completely get rid of the global state, but the developer has very effective tools in the arsenal that allow you to strictly control which program components can change the state. In addition, we learned how to create restrictive API layers around primitive data structures. Therefore, we have good control over how these data structures change.

The problems of global mutable state gradually became apparent in the late 80s and early 90s, with the rise of event-driven programming. Programs no longer started "at the beginning" and followed a single predictable path of execution "to the end". Modern programs have an initial state, after which events occur in an unpredictable order with variable time intervals. The code remains single-threaded but becomes asynchronous. The probability of data corruption increases precisely because the order of events matters so much. Situations like this arise all the time: if event B occurs after event A, everything works fine; but if event A occurs after event B, and event C manages to wedge in between them, the data can be distorted beyond recognition.

If parallel threads are involved, the problem is aggravated further, since several methods can operate on the global state at the same time. It becomes impossible to judge exactly how the global state changes: not only can events occur in an unpredictable order, but the state can now be updated by several threads of execution simultaneously. With asynchronous programming you can at least guarantee that a certain event cannot occur until the handling of another event has finished, so you can say with certainty what the global state will be at the end of handling a particular event. In multithreaded code it is usually impossible to tell which events will occur in parallel, so it is impossible to describe the global state with certainty at any given moment.

A multi-threaded program with extensive global mutable state is one of the most telling examples of the Heisenberg Uncertainty Principle that I know of. It is impossible to check the state of a program without changing its behavior.

When I launch into yet another philippic about global mutable state (the essence is outlined in the previous few paragraphs), programmers roll their eyes and assure me that they have known all this for a long time. But if you know it, why can't it be seen from your code? Programs are stuffed with global mutable state, and programmers wonder why their code doesn't work.

Not surprisingly, the most important work in multi-threaded programming happens at the design stage. You must clearly define what the program should do, design independent modules to perform all the functions, describe in detail what data each module needs, and define how the modules exchange information (and yes, do not forget to prepare beautiful T-shirts for all project participants; first things first. Approx. ed. in the original). This process is not fundamentally different from designing a single-threaded program. As with single-threaded code, the key to success is to limit the interactions between modules. If we can get rid of shared mutable state, data sharing problems simply will not arise.

One might argue that sometimes there is no time for a design careful enough to avoid global state. I believe that this time can and should be spent. Nothing hurts multithreaded programs more than trying to cope with global mutable state. The more details you have to manage, the more likely your program is to go into a tailspin and crash.

In realistic application programs there must be some shared state that can change, and this is where most programmers run into problems. The programmer sees that shared state is required, reaches into the multithreaded arsenal and takes the simplest tool there: the universal lock (critical section, mutex, or whatever it is called). They seem to believe that mutual exclusion will solve all the problems of sharing data.

The number of problems that can arise with even a single lock is staggering. Race conditions, bottlenecks from overly coarse locking, and fairness issues are just a few examples. If you have several locks, especially nested ones, you also need to worry about deadlocks, livelocks, lock convoys, and other concurrency threats, in addition to the problems characteristic of a single lock.
When I write or review code, I follow a nearly ironclad rule: if you introduced a lock, you probably made a mistake somewhere.

This statement can be commented in two ways:

  1. If you need a lock, then you probably have global mutable state that must be protected from concurrent updates. Global mutable state is a flaw introduced at the design stage of the application. Review and redesign.
  2. Locks are not easy to use correctly, and lock-related bugs can be incredibly difficult to localize. It is very likely that you will use the lock incorrectly. If I see a lock and the program behaves unusually, the first thing I do is check the code that depends on that lock. And I usually find problems there.

Both of these interpretations are correct.

It is easy to write multithreaded code. But it is very, very hard to use synchronization primitives correctly. You may not be qualified to use even a single lock correctly. After all, locks and other synchronization primitives are constructs erected at the level of the whole system. People who understand parallel programming far better than the average programmer use these primitives to build concurrent data structures and high-level synchronization constructs, and we ordinary programmers simply take those constructs and use them in our code. An application programmer should use low-level synchronization primitives about as often as he makes direct calls to device drivers. That is, almost never.

Trying to solve data sharing problems with locks is like putting out a fire with liquid oxygen. Like a fire, these problems are easier to prevent than to fix. If you get rid of shared state, you will have no need to misuse synchronization primitives either.
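A sketch of that idea in Python (the names are mine): instead of locking one shared counter, each thread writes to its own slot of a results list, and the slots are combined only after the threads have finished, so no lock is needed at all.

```python
import threading

def count_evens(chunk, results, slot):
    # each worker owns exactly one slot: nothing is contended while running
    results[slot] = sum(1 for x in chunk if x % 2 == 0)

data = list(range(100))
chunks = [data[:50], data[50:]]
results = [0] * len(chunks)
threads = [threading.Thread(target=count_evens, args=(chunk, results, i))
           for i, chunk in enumerate(chunks)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(results))  # 50
```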

Most of what you know about multithreading is irrelevant

In beginner multithreading tutorials you learn what threads are. Then the author considers various ways to make those threads run in parallel: controlling access to shared data with locks and semaphores, what can happen when working with events, then condition variables, memory barriers, critical sections, mutexes, volatile fields and atomic operations in detail, with examples of how to use these low-level constructs for all sorts of system-level operations. Having read half of this material, the programmer decides that he already knows enough about these primitives and their application. After all, if I know how this thing works at the system level, I can apply it the same way at the application level. Right?

Imagine that you explained to a teenager how to build an internal combustion engine. Then, without any driving lessons, you put him behind the wheel of a car and say: "Drive!" The teenager understands how the car works, but has no idea how to drive it from point A to point B.

Understanding how threads work at the system level usually does not help in any way to use them at the application level. I'm not suggesting that programmers don't need to learn all these low-level details. Just don't expect to be able to apply this knowledge right off the bat when designing or developing a business application.

Introductory threading literature (and the related academic courses) should not teach such low-level constructs. It should focus on solving the most common classes of problems and show developers how those problems are solved using high-level facilities. After all, most business applications are essentially simple programs: they read data from one or more input devices, perform some complex processing on that data (for example, requesting some more data in the process), and then display the results.

Often such programs fit perfectly into the producer-consumer model, which requires only three threads:

  • the input thread reads data and places it in the input queue;
  • the worker thread takes entries from the input queue, processes them, and puts the results into the output queue;
  • the output thread takes entries from the output queue and saves them.

These three threads work independently; communication between them happens only at the queue level.
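The three-thread scheme above can be sketched with the standard queue.Queue (Python 3; the squaring step stands in for real processing, and None serves as an end-of-data sentinel):

```python
import queue
import threading

def producer(out_q):
    # the input thread: reads data (here, just a range) into the input queue
    for i in range(5):
        out_q.put(i)
    out_q.put(None)  # sentinel: no more input

def worker(in_q, out_q):
    # the worker thread: processes entries and forwards the results
    while True:
        item = in_q.get()
        if item is None:
            out_q.put(None)
            break
        out_q.put(item * item)

def consumer(in_q, results):
    # the output thread: saves the results
    while True:
        item = in_q.get()
        if item is None:
            break
        results.append(item)

input_q = queue.Queue()
output_q = queue.Queue()
results = []
threads = [
    threading.Thread(target=producer, args=(input_q,)),
    threading.Thread(target=worker, args=(input_q, output_q)),
    threading.Thread(target=consumer, args=(output_q, results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # [0, 1, 4, 9, 16]
```

Note that no lock appears anywhere: the queues carry all communication and do their own internal synchronization.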

Although these queues can technically be considered shared state, in practice they are just communication channels with their own internal synchronization. Queues support many producers and consumers at once; elements can be added and removed in parallel.

Since the input, processing, and output steps are isolated from each other, it is easy to change their implementation without affecting the rest of the program. As long as the type of data in the queue does not change, you can refactor individual program components as you wish. In addition, since an arbitrary number of producers and consumers participate in the queue, it will not be difficult to add other producers / consumers. We can have dozens of input threads writing information to the same queue, or dozens of worker threads taking information from the input queue and digesting the data. Within a single computer, such a model scales well.

But the most important thing is that modern programming languages and libraries make it very easy to create applications in the producer-consumer model. In .NET there are Parallel Collections and the TPL Dataflow library. Java has the Executor service, BlockingQueue, and other classes in the java.util.concurrent namespace. C++ has the Boost thread library and Intel's Threading Building Blocks. Microsoft's Visual Studio 2013 introduced asynchronous agents. Similar libraries are also available in Python, JavaScript, Ruby, PHP and, as far as I know, many other languages. You can create a producer-consumer application with any of these packages without ever resorting to locks, semaphores, condition variables, or any other synchronization primitives.
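In Python, for instance, the standard concurrent.futures module gives exactly this experience. A minimal sketch (fetch() is a made-up stand-in for an I/O-bound task):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(n):
    # stand-in for an I/O-bound call such as pinging a server
    return n * 2

# the executor hides thread creation, the work queue, and all
# synchronization; no locks or semaphores appear in user code
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(fetch, [1, 2, 3, 4]))
print(results)  # [2, 4, 6, 8]
```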

These libraries make free use of a wide variety of synchronization primitives, and that is fine: all of them are written by people who understand multithreading incomparably better than the average programmer. Working with such a library is almost the same as using the language's runtime library. It is like programming in a high-level language rather than in assembly.

The producer-consumer model is just one example of many. The libraries listed above contain classes with which you can implement many common multithreaded design patterns without descending into low-level details. You can create large-scale multi-threaded applications while hardly worrying about how exactly the threads are coordinated and synchronized.

Work with Libraries

So, creating multi-threaded programs is not fundamentally different from writing single-threaded synchronous programs. The important principles of encapsulation and data hiding are universal, and their importance only grows when multiple concurrent threads are involved. If you neglect them, even the most exhaustive knowledge of low-level thread handling will not save you.

Modern developers have to solve many problems at the application level; often there is simply no time to think about what happens at the system level. The more intricate applications become, the more complex details have to be hidden behind API layers. We have been doing this for more than a decade. One can argue that hiding the system's complexity from the programmer is the main reason we manage to write modern applications at all. For that matter, don't we already hide the system's complexity by implementing the UI message loop, building low-level communication protocols, and so on?

A similar situation holds for multithreading. Most of the multi-threaded scenarios an average business application programmer will encounter are already well known and well implemented in libraries. Library functions do a great job of hiding the staggering complexity of concurrency. Learn to use these libraries in the same way you use libraries of UI elements, communication protocols, and the numerous other tools that just work. Leave low-level multithreading to the specialists: the authors of the libraries used to build application programs.

An example of building a simple multi-threaded application.

It was born out of the large number of questions about building multi-threaded applications in Delphi.

The goal of this example is to demonstrate how to build a multi-threaded application properly, with long-running work moved into a separate thread, and how such an application can arrange interaction between the main thread and the worker thread to pass data from the form (visual components) to the thread and back.

The example does not claim to be complete; it only demonstrates the simplest ways for threads to interact, letting the user "quickly slap together" (you know how much I dislike this) a correctly working multi-threaded application.
Everything in it is commented in detail (in my opinion), but if you have questions, ask.
Once again I warn you: threads are not simple. If you have no idea how it all works, there is a huge danger that everything will usually work fine for you, but sometimes the program will behave more than strangely. The behavior of an incorrectly written multi-threaded program depends on a great many factors that sometimes cannot be reproduced during debugging.

So, the example. For convenience, I have placed the code here and also attached an archive with the code of the module and the form.

unit ExThreadForm;

interface

uses
Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
Dialogs, StdCtrls;

// constants used when passing data from the thread to the form by
// sending window messages
const
WM_USER_SendMessageMetod = WM_USER+10;
WM_USER_PostMessageMetod = WM_USER+11;

type
// description of the thread class, a descendant of tThread
tMyThread = class(tThread)
private
SyncDataN:Integer;
SyncDataS:String;
procedure SyncMethod1;
protected
procedure Execute; override;
public
Param1:String;
Param2:Integer;
Param3:Boolean;
Stopped:Boolean;
LastRandom:Integer;
IterationNo:Integer;
ResultList:tStringList;

Constructor Create(aParam1:String);
destructor Destroy; override;
end;

// description of the form class that uses the thread
TForm1 = class(TForm)
Label1: TLabel;
Memo1:TMemo;
btnStart: TButton;
btnStop: TButton;
Edit1: TEdit;
Edit2: TEdit;
CheckBox1: TCheckBox;
Label2: TLabel;
Label3: TLabel;
Label4: TLabel;
procedure btnStartClick(Sender: TObject);
procedure btnStopClick(Sender: TObject);
private
{ Private declarations }
MyThread:tMyThread;
procedure EventMyThreadOnTerminate(Sender:tObject);
procedure EventOnSendMessageMetod (var Msg: TMessage);message WM_USER_SendMessageMetod;
procedure EventOnPostMessageMetod(var Msg: TMessage); message WM_USER_PostMessageMetod;

public
{ Public declarations }
end;

var
Form1: TForm1;

{
Stopped - Demonstrates passing data from a form to a thread.
It does not require additional synchronization, since it is a simple
word-sized type, and it is written by only one thread.
}

procedure TForm1.btnStartClick(Sender: TObject);
begin
randomize(); // ensuring randomness in the sequence by Random() - has nothing to do with the thread

// Create an instance of the thread object, passing it an input parameter
{
ATTENTION!
The thread constructor is written in such a way that the thread is created
suspended because it allows:
1. Control the moment of its launch. This is almost always more convenient,
because it allows you to set up the thread before launch, pass it input
parameters, and so on.
2. A reference to the created object is stored in a form field, and
after the self-destruction of the thread (see below), which for a running
thread may occur at any moment, that reference becomes invalid.
}
MyThread:= tMyThread.Create(Form1.Edit1.Text);

// However, since the thread was created suspended, if any error occurs
// during its initialization (before launch), we must destroy it ourselves,
// which is why we use a try/except block
try

// Assign the thread's termination handler, in which we will receive
// the results of the thread's work and "overwrite" the reference to it
MyThread.OnTerminate:= EventMyThreadOnTerminate;

// Since we will collect the results in OnTerminate, i.e. before the thread
// self-destructs, we relieve ourselves of the worry of destroying it
MyThread.FreeOnTerminate:= True;

// An example of passing input parameters through the fields of the stream object, at the point
// create an instance when it's not running yet.
// Personally, I prefer to do this through the parameters of the overridden
// constructor (tMyThread.Create)
MyThread.Param2:= StrToInt(Form1.Edit2.Text);

MyThread.Stopped:= False; // also a parameter of sorts, but one that changes
// while the thread is running
except
// since the thread is not yet running and will not be able to self-destruct, we will destroy it "manually"
FreeAndNil(MyThread);
// and then let the exception be handled in the usual way
raise;
end;

// Since the thread object has been successfully created and configured, it's time to run it
MyThread.Resume;

ShowMessage('Thread started');
end;

procedure TForm1.btnStopClick(Sender: TObject);
begin
// If the thread instance still exists, then ask it to stop
// Note, we "ask". In principle we could also "force", but that would be
// strictly an emergency option, requiring a clear understanding of the whole
// threading kitchen. Therefore it is not considered here.
if Assigned(MyThread) then
MyThread.Stopped:= True
else
ShowMessage('Thread not running!');
end;

procedure TForm1.EventOnSendMessageMetod(var Msg: TMessage);
begin
// synchronous message processing method
// in WParam the address of the tMyThread object, in LParam the current value of the thread's LastRandom
with tMyThread(Msg.WParam) do begin
Form1.Label3.Caption:= Format("%d %d %d",);
end;
end;

procedure TForm1.EventOnPostMessageMetod(var Msg: TMessage);
begin
// asynchronous message processing method
// in WParam the current value of IterationNo, in LParam the current value of the thread's LastRandom
Form1.Label4.Caption:= Format("%d %d",);
end;

procedure TForm1.EventMyThreadOnTerminate(Sender:tObject);
begin
// IMPORTANT!
// OnTerminate event handling method is always called in the context of the main
// thread - this is guaranteed by the tThread implementation. Therefore, it is possible to freely
// use any properties and methods of any objects

// Just in case, make sure the object instance still exists
if not Assigned(MyThread) then Exit; // if it doesn't exist, then there is nothing to do

// retrieve the results of the thread's work from the thread object instance
Form1.Memo1.Lines.Add(Format("Thread terminated with result %d",));
Form1.Memo1.Lines.AddStrings((Sender as tMyThread).ResultList);

// Clear the reference to the thread object instance.
// Because the thread is self-destructing (FreeOnTerminate:= True),
// after the OnTerminate handler completes, the thread object instance will be
// destroyed (Free), and all references to it will become invalid.
// To avoid accidentally stumbling over such a stale reference, we clear MyThread.
// Note again - we do not destroy the object, we only clear the reference. The object
// will destroy itself!
MyThread:= Nil;
end;

constructor tMyThread.Create(aParam1:String);
begin
// Create an instance of the SUSPENDED thread (see comment when instantiating)
inherited Create(True);

// Create internal objects (if necessary)
ResultList:= tStringList.Create;

// Getting initial data.

// Copy the input data passed through the parameter
Param1:= aParam1;

// An example of receiving input data from VCL components in the constructor of a thread object.
// This is allowed here, since the constructor is called in the context
// of the main thread. Therefore, VCL components can be accessed here.
// But I don't like this, because I think it's bad when a thread knows something
// about some form. Still, it will do for a demonstration.
Param3:= Form1.CheckBox1.Checked;
end;

destructor tMyThread.Destroy;
begin
// destruction of internal objects
FreeAndNil(ResultList);
// destroy the underlying tThread
inherited;
end;

procedure tMyThread.Execute;
var
t:cardinal;
s:string;
begin
IterationNo:= 0; // result counter (loop number)

// In my example, the thread body is a loop that ends
// either on an external "request" to finish, passed through the Stopped variable parameter,
// or simply after 10 cycles
// I find it nicer to write this as an "eternal" loop.

While True do begin

Inc(IterationNo); // next cycle number

LastRandom:= Random(1000); // random number - to demonstrate passing parameters from the thread to the form

T:= Random(5)+1; // time, in seconds, for which we will sleep if we are not yet finished

// Dummy work (depending on the input parameter)
if not Param3 then
Inc(Param2)
else
Dec(Param2);

// Generate an intermediate result
s:= Format("%s %5d %s %d %d",
);

// Add an intermediate result to the list of results
ResultList.Add(s);

//// Examples of passing the intermediate result to the form

//// Passing via a synchronized method - the classic way
//// Flaws:
//// - the synchronized method is usually a method of the thread class (so as to
//// access the fields of the thread object), but to access the form's fields it must
//// "know" about the form and its fields (objects), which is usually not very good
//// from the point of view of program organization.
//// - the current thread will be suspended until execution of the
//// synchronized method completes.

//// Advantages:
//// - standard and universal
//// - in a synchronized method, you can use
//// all the fields of the thread object.
// first, if necessary, save the data being passed into
// special fields of the thread object.
SyncDataN:=IterationNo;
SyncDataS:="Sync"+s;
// and then ensure a synchronized method call
Synchronize(SyncMethod1);

//// Passing via synchronous message sending (SendMessage)
//// in this case, the data can be passed both through the message parameters (LastRandom),
//// and through the fields of the object, by passing the address of the thread
//// object instance in a message parameter - Integer(Self).
//// Flaws:
//// - the thread must know the handle of the form window
//// - as with Synchronize, the current thread will be suspended until
//// finish processing the message by the main thread
//// - requires significant CPU time for each call
//// (to switch threads) so a very frequent call is undesirable
//// Advantages:
//// - as with Synchronize, when processing the message, you can use
//// all the fields of the thread object (provided, of course, its address was passed)


SendMessage(Form1.Handle,WM_USER_SendMessageMetod,Integer(Self),LastRandom);

//// Passing via asynchronous message sending (PostMessage)
//// Since in this case, by the time the main thread receives the message,
//// the sending thread may have already terminated, passing the address of the
//// thread object instance is invalid!
//// Flaws:
//// - the thread must know the handle of the form window;
//// - due to asynchrony, data transfer is possible only through parameters
//// messages, which significantly complicates the transfer of data having the size
//// more than two machine words. It is convenient to use for passing Integer, etc.
//// Advantages:
//// - unlike the previous methods, the current thread will NOT
//// suspended, and will immediately continue its execution
//// - unlike a synchronized call, the message handler
//// is a form method, which must either have knowledge of the thread object,
//// or know nothing about the thread at all if the data is passed only
//// via message parameters. That is, the thread may know nothing about the form
//// at all - only its Handle, which can be passed as a parameter before
//// starting the thread.
PostMessage(Form1.Handle,WM_USER_PostMessageMetod,IterationNo,LastRandom);

//// Check for possible completion

// Check for completion by parameter
if Stopped then Break;

// Check completion by occasion
if IterationNo >= 10 then Break;

sleep(t*1000); // Sleep for t seconds
end;
end;

procedure tMyThread.SyncMethod1;
begin
// this method is called via the Synchronize method.
// That is, despite the fact that it is a method of the tMyThread thread,
// it runs in the context of the main application thread.
// Therefore, it can do everything, well, or almost everything :)
// But remember, you shouldn't "linger" here for long

// We can extract the passed parameters from the special fields where we
// saved them before the call.
Form1.Label1.Caption:= SyncDataS;

// or from other fields of the thread object, for example, reflecting its current state
Form1.Label2.Caption:= Format("%d %d",);
end;

In general, the example was preceded by the following reasoning of mine on the topic...

Firstly:
THE MOST IMPORTANT RULE of multithreaded programming in Delphi:
In the context of a non-main thread, you must not access properties and methods of forms, and indeed of any components that "grow" from tWinControl.

This means (slightly simplified) that neither in the Execute method inherited from TThread, nor in other methods/procedures/functions called from Execute, is it permissible to directly access any properties and methods of visual components.

How to do it right?
There are no universal recipes here. More precisely, there are so many different options that you have to choose depending on the specific case. That is why I refer you to the article. Having read and understood it, a programmer will be able to work out how best to act in each particular case.

In a nutshell:

Most often, an application becomes multi-threaded either when it is necessary to do some kind of long-term work, or when it is possible to simultaneously do several things that do not heavily load the processor.

In the first case, the implementation of work inside the main thread leads to "slowdown" of the user interface - while the work is being done, the message processing cycle is not executed. As a result, the program does not respond to user actions, and the form is not drawn, for example, after it has been moved by the user.

In the second case, when the work involves an active exchange with the outside world, there is forced "downtime" while waiting to receive or send data, during which you can do something else in parallel - for example, send or receive more data.

There are other cases, but less common. However, this is not important. Now it's not about that.

Now, about how this is written. Naturally, I consider a certain most frequent, somewhat generalized case. So.

The work carried out in a separate thread generally involves four entities (I don't even know what to call them more precisely):
1. Initial data
2. The actual work itself (it may depend on the source data)
3. Intermediate data (for example, information about the current state of the work being done)
4. Output (result)

Most often, visual components are used to read and display most of this data. But, as mentioned above, you cannot access visual components directly from a thread. What to do?
The Delphi developers suggest using the Synchronize method of the TThread class. I will not describe here how to apply it - the article mentioned above exists for that. Let me just say that using it, even correctly, is not always justified. There are two problems:

First, the body of a method called through Synchronize is always executed in the context of the main thread, and therefore, while it is executing, the window message processing loop is again not executed. Therefore, it must be fast, otherwise, we will get all the same problems as with a single-threaded implementation. Ideally, a method called via Synchronize should generally only be used to access properties and methods of visual objects.

Secondly, executing a method through Synchronize is an "expensive" pleasure, caused by the need for two switches between threads.

Moreover, both problems are interrelated and create a contradiction: on the one hand, to solve the first, the methods called through Synchronize have to be "shredded" into smaller pieces; on the other hand, they then have to be called more often, wasting precious processor resources.

Therefore, as always, you need to act wisely and, for different cases, use different ways for the thread to interact with the outside world:

Initial data
All data that is passed to the thread and does not change while it runs must be passed before the thread starts, i.e. when it is created. To use it in the body of the thread, you need to make a local copy of it (usually in the fields of the TThread descendant).
If there is initial data that can change while the thread is running, then access to such data must go either through synchronized methods (methods called through Synchronize), or through the fields of the thread object (the TThread descendant). The latter requires some caution.
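For comparison with the Python material elsewhere in this collection: the same rule - pass unchanging inputs at creation time and keep a local copy in the thread object - can be sketched with Python's threading module (all names here are illustrative):

```python
import threading

class Worker(threading.Thread):
    """Thread that copies its initial data at creation time."""

    def __init__(self, param1, param2):
        super().__init__()
        # Local copies, made before start(): the creator must not
        # rely on mutating these after the thread is running.
        self.param1 = str(param1)
        self.param2 = int(param2)
        self.result = None

    def run(self):
        # The thread body works only with its own copies.
        self.result = f"{self.param1}:{self.param2 * 2}"

w = Worker("job", 21)
w.start()
w.join()
print(w.result)  # job:42
```

The same caution applies as in the text above: data that may change while the thread runs needs synchronized access, which plain attributes do not provide.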

Intermediate and output data
Here, again, there are several ways (in order of my preference):
- A method for sending messages asynchronously to the main window of the application.
It is usually used to send messages about the progress of a process to the main application window, with the transfer of a small amount of data (for example, percentage of completion)
- Method of synchronously sending messages to the main window of the application.
It is usually used for the same purposes as asynchronous sending, but allows you to transfer a larger amount of data without creating a separate copy.
- Synchronized methods, where possible, combining the transfer of as much data as possible into one method.
It can also be used to get data from a form.
- Through the fields of the thread object, providing mutually exclusive access.
More details can be found in the article.
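Python has no window message queue, but the preference for asynchronous delivery of intermediate results can be approximated with the thread-safe queue.Queue: the worker posts progress values and never touches the "GUI" (here just the main thread) directly. A sketch, with illustrative names:

```python
import queue
import threading

progress = queue.Queue()   # worker -> main channel, loosely akin to PostMessage

def worker():
    for pct in (25, 50, 75, 100):
        progress.put(pct)  # asynchronous: the worker is never blocked
    progress.put(None)     # sentinel meaning "work finished"

t = threading.Thread(target=worker)
t.start()

received = []
while True:
    item = progress.get()  # the "main thread" drains the queue
    if item is None:
        break
    received.append(item)
t.join()
print(received)  # [25, 50, 75, 100]
```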

Eh. I failed to keep it brief again.

Chapter 10.

Multithreaded Applications

Multitasking in modern operating systems is taken for granted [Prior to the advent of Apple OS X, there were no modern multitasking operating systems on Macintosh computers. Properly designing an operating system with full multitasking is very difficult, so OS X had to be based on the Unix system.]. The user expects that when a text editor and a mail client are launched simultaneously, these programs will not conflict, and that the editor will not stop working when e-mail arrives. When several programs run at the same time, the operating system switches quickly between them, giving each the processor in turn (unless, of course, the computer has multiple processors installed). The result is the illusion of several programs running simultaneously, because even the best typist (and the fastest Internet connection) cannot keep up with a modern processor.

Multithreading, in a sense, can be seen as the next level of multitasking: instead of switching between different programs, the operating system switches between different parts of the same program. For example, a multithreaded mail client lets you receive new e-mail messages while you are reading or composing other messages. Nowadays, multithreading is also taken for granted by many users.

VB never had real support for multithreading. True, VB5 introduced one of its varieties - the apartment threading model. As you will see shortly, the apartment model gives the programmer some of the benefits of multithreading, but does not take full advantage of it. Sooner or later you have to move from a training machine to a real one, and VB .NET became the first version of VB to support the free-threaded model.

However, multithreading is not a feature that is easily implemented in programming languages or easily mastered by programmers. Why?

Because multi-threaded applications can have very tricky bugs that come and go unpredictably (and those bugs are the hardest to debug).

Fair warning: multithreading is one of the hardest areas of programming. The slightest inattention leads to the appearance of elusive errors, the correction of which takes astronomical sums. For this reason, this chapter contains many bad examples - we deliberately wrote them in such a way as to demonstrate characteristic errors. This is the safest approach to learning multithreaded programming: you should be able to see potential problems when everything seems to be working fine at first glance, and know how to solve them. If you want to use the techniques of multi-threaded programming, this is indispensable.

This chapter will lay a solid foundation for further independent work, but we will not be able to describe multi-threaded programming in all its subtleties - only the printed documentation on the classes of the Threading namespace takes more than 100 pages. If you want to master multi-threaded programming at a higher level, refer to specialized books.

But no matter how dangerous multithreaded programming is, it is indispensable for professionally solving certain problems. If your programs do not use multithreading where appropriate, users will be very disappointed and will prefer another product. For example, multithreaded capabilities appeared only in the fourth version of the popular e-mail program Eudora, and without them it is impossible to imagine any modern e-mail program. By the time Eudora introduced multithreading support, many users (including one of this book's authors) had switched to other products.

Finally, single-threaded programs simply do not exist in .NET. All .NET programs are multithreaded, because the garbage collector runs as a low-priority background process. As shown below, in serious graphical programming in .NET, proper communication between program threads helps keep the GUI from locking up while the program performs lengthy operations.

Introduction to multithreading

Each program works in a certain context, describing the distribution of code and data in memory. When the context is saved, the state of the program thread is actually saved, which allows you to restore it in the future and continue the execution of the program.

Saving the context is associated with a certain cost of time and memory. The operating system remembers the state of a program thread and transfers control to another thread. When the program wants to continue executing the suspended thread, the saved context has to be restored, which takes even more time. Therefore, multithreading should only be used when the benefits outweigh the costs. Some typical examples are listed below.

  • The functionality of the program is clearly and naturally divided into several heterogeneous operations, as in the example of receiving e-mail and preparing new messages.
  • The program performs long and complex calculations, and you don't want the graphical interface to be blocked during the calculations.
  • The program runs on a multiprocessor computer with an operating system that supports the use of multiple processors (as long as the number of active threads does not exceed the number of processors, parallel execution involves almost no thread-switching overhead).

Before moving on to the mechanics of how multithreaded programs work, it is necessary to point out one circumstance that often causes confusion among beginners in the field of multithreaded programming.

A program thread executes a procedure, not an object.

It is hard to say what the phrase "an object is being executed" might mean, but one of the authors often teaches seminars on multithreaded programming, and this question is asked more often than any other. One might think that a program thread starts by calling a class's New method, after which the thread processes all messages passed to the corresponding object. Such notions are absolutely wrong. One object can contain several threads executing different (and sometimes even the same) methods, while the object's messages are sent and received by several different threads (incidentally, this is one of the reasons why multithreaded programming is difficult: to debug a program, you need to know which thread is executing a given procedure at any given moment!).

Because threads are created from object methods, the object itself is usually created before the thread. After successfully creating the object, the program creates a thread, passing it the address of one of the object's methods, and only then instructs the thread to start executing. The procedure for which the thread was created can, like all procedures, create new objects, perform operations on existing objects, and call other procedures and functions in its scope.

Program threads can also execute shared class methods. Also keep in mind another important circumstance: a thread terminates when it exits the procedure for which it was created. Until the procedure exits, normal termination of the program thread is impossible.

Threads can terminate not only naturally, but also abnormally. This is usually not recommended. See Terminating and Interrupting Threads for more information.

The core .NET features related to the use of threads are located in the Threading namespace. Therefore, most multi-threaded programs should start with the following line:

Imports System.Threading

Importing a namespace simplifies program input and allows you to use IntelliSense technology.

The close connection of threads with procedures suggests that delegates (see Chapter 6) occupy an important place in this picture. In particular, the Threading namespace includes the ThreadStart delegate, commonly used when starting program threads. The syntax for using this delegate looks like this:

Public Delegate Sub ThreadStart()

Code called through the ThreadStart delegate must not have parameters or a return value, so threads cannot be created for functions (which return a value) or for procedures with parameters. To pass information out of a thread you also have to look for alternative means, since the executed methods do not return values and cannot use pass-by-reference. For example, if the ThreadMethod procedure lives in the WillUseThread class, then ThreadMethod can communicate information by changing properties of instances of the WillUseThread class.
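Python's threading.Thread does accept target arguments, but the pattern described here - a no-argument bound method reporting its results through instance attributes rather than a return value - looks exactly the same. A sketch with illustrative names:

```python
import threading

class WillUseThread:
    def __init__(self):
        self.answer = None

    def thread_method(self):    # a no-argument bound method...
        self.answer = 6 * 7     # ...communicates through an attribute

obj = WillUseThread()
t = threading.Thread(target=obj.thread_method)  # the thread runs a procedure
t.start()
t.join()   # the thread ends when thread_method returns
print(obj.answer)  # 42
```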

Application Domains

.NET program threads run in so-called application domains, defined in the documentation as "an isolated environment in which an application runs." You can think of an application domain as a lightweight version of Win32 processes; a single Win32 process can contain multiple application domains. The main difference between application domains and processes is that a Win32 process has its own address space (the documentation also compares application domains to logical processes running inside a physical process). In .NET, all memory management is handled by the runtime, so multiple application domains can run in the same Win32 process. One of the benefits of this scheme is to improve the scaling capabilities of applications. The tools for working with application domains are in the AppDomain class. We recommend that you read the documentation for this class. With it, you can get information about the environment in which your program is running. In particular, the AppDomain class is used when doing reflection on .NET system classes. The following program lists the loaded assemblies.

Imports System.Reflection

Module Module1

Sub Main()

Dim theDomain As AppDomain

theDomain = AppDomain.CurrentDomain

Dim Assemblies() As [Assembly]

Assemblies = theDomain.GetAssemblies

Dim anAssembly As [Assembly]

For Each anAssembly In Assemblies

Console.WriteLine(anAssembly.FullName)

Next

Console.ReadLine()

End Sub

End Module

Creating threads

Let's start with an elementary example. Let's say you want to run a procedure on a separate thread that decrements a counter in an endless loop. The procedure is defined as part of the class:

Public Class WillUseThreads

Public Sub SubtractFromCounter()

Dim count As Integer

Do While True

count -= 1

Console.WriteLine("Am in another thread and counter = " & count)

Loop

End Sub

End Class

Since the condition of the Do loop is always true, you might think that nothing will stop the SubtractFromCounter procedure from executing. However, this is not always the case in a multi-threaded application.

The following snippet shows the Sub Main procedure that starts the thread and the Imports command:

Option Strict On

Imports System.Threading

Module Module1

Sub Main()

1 Dim myTest As New WillUseThreads()

2 Dim bThreadStart As New ThreadStart(AddressOf _

myTest.SubtractFromCounter)

3 Dim bThread As New Thread(bThreadStart)

4 bThread.Start()

Dim i As Integer

5 Do While True

Console.WriteLine("In main thread and count is " & i)

i += 1

Loop

End Sub

End Module

Let's take a look at the most important points one by one. First of all, the Sub Main procedure always runs on the main thread. There are always at least two threads running in a .NET program: the main thread and the garbage collection thread. Line 1 creates a new instance of the test class. On line 2, we create a ThreadStart delegate and pass it the address of the SubtractFromCounter procedure of the test class instance created on line 1 (this procedure is called without parameters). Thanks to the import of the Threading namespace, the long fully qualified name can be omitted. The new thread object is created on line 3. Notice the passing of the ThreadStart delegate when calling the constructor of the Thread class. Some programmers prefer to combine these two lines into one logical line:

Dim bThread As New Thread(New ThreadStart(AddressOf _

myTest.SubtractFromCounter))

Finally, line 4 "starts" the thread by calling the Start method of the Thread instance created for the ThreadStart delegate. By calling this method, we tell the operating system that the SubtractFromCounter procedure should run on a separate thread.

The word “starts” in the previous paragraph is enclosed in quotation marks, because in this case one of the many oddities of multithreaded programming occurs: calling Start does not actually start the thread! It just says that the operating system should schedule the specified thread to run, but running it directly is out of the program's control. You will not be able to start threads on your own, because the operating system always manages the execution of threads. In a later section, you'll learn how to use priority to make the operating system run your thread faster.
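The same caveat applies to Python's threading module: start() merely asks the scheduler to run the thread, and the program cannot control when it actually begins. A minimal sketch (names illustrative) - the only reliable way to know the worker has run is to synchronize with it:

```python
import threading

started = threading.Event()

def worker():
    started.set()   # proof that the scheduler eventually ran us

t = threading.Thread(target=worker)
t.start()   # returns immediately; the thread may not be running yet

# Synchronize instead of guessing when the thread was scheduled:
ran = started.wait(timeout=5.0)
t.join()
print(ran)  # True
```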

Fig. 10.1 shows an example of what can happen after the program is started and then interrupted with the Ctrl+Break keys. In our case, the new thread started only after the counter in the main thread had increased to 341!

Fig. 10.1. A simple multithreaded program at run time

If the program runs for a longer period of time, the result will look something like Fig. 10.2. We see that the execution of the running thread is suspended and control is transferred back to the main thread. This is a manifestation of preemptive multithreading through time slicing. The meaning of this frightening term is explained below.

Fig. 10.2. Switching between threads in a simple multithreaded program

When interrupting threads and transferring control to other threads, the operating system uses the principle of preemptive multithreading through time slicing. Time slicing also solves one of the common problems of older multithreaded programs - one thread taking up all the CPU time and never yielding control to other threads (usually this happens in intensive loops like the one above). To avoid monopolizing the CPU, your threads should yield control to other threads from time to time. If a program turns out to be "inconsiderate", there is another, slightly less desirable solution: the operating system always preempts a running thread, regardless of its priority level, so that every thread in the system gets access to the processor.

Since the slicing schemes of all versions of Windows running .NET allocate a minimum quantum of time to each thread, CPU ownership problems are not as severe in .NET programming. On the other hand, if the .NET environment is ever adapted for other systems, the situation may change.

If we include the following line in our program before calling Start, then even threads with the lowest priority will get some CPU time:

bThread.Priority = ThreadPriority.Highest

Fig. 10.3. The highest-priority thread usually starts faster

Fig. 10.4. The processor is also given to lower-priority threads

The command assigns the maximum priority to the new thread and lowers the priority of the main thread. Fig. 10.3 shows that the new thread starts running sooner than before, but, as Fig. 10.4 shows, the main thread also receives control (albeit very briefly, and only after the subtracting thread has run continuously for a while). When you run the program on your own computers, you will get results similar to those shown in Fig. 10.3 and 10.4, but because of differences between our systems there will be no exact match.

The ThreadPriority enumerated type includes values for five priority levels:

ThreadPriority.Highest

ThreadPriority.AboveNormal

ThreadPriority.Normal

ThreadPriority.BelowNormal

ThreadPriority.Lowest

Join method

Sometimes a program thread needs to be suspended until another thread completes. Suppose you want to suspend thread 1 until thread 2 has finished its computation. To do this, thread 1 calls thread 2's Join method. In other words, the command

thread2.Join()

suspends the current thread and waits for thread 2 to complete. Thread 1 enters the blocked state.

If you join thread 1 to thread 2 with the Join method, the operating system will automatically start thread 1 after thread 2 ends. Note that the startup process is non-deterministic: you cannot tell exactly how long after thread 2 ends thread 1 will start running. There is another version of Join that returns a Boolean value:

thread2.Join(Integer)

This method either waits for thread 2 to finish, or unblocks thread 1 after the specified time interval elapses, at which point the operating system scheduler again allocates processor time to it. The method returns True if thread 2 terminates before the specified timeout expires, and False otherwise.

Don't forget the basic rule: whether thread 2 terminated or timed out, you can't control when thread 1 is activated.
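Python's analogue is Thread.join(timeout); unlike the .NET version it returns no Boolean, so code checks is_alive() afterwards. A small sketch (the timings are illustrative):

```python
import threading
import time

def slow_worker():
    time.sleep(0.5)          # stands in for a long computation

t = threading.Thread(target=slow_worker)
t.start()

t.join(timeout=0.05)         # wait at most 50 ms
timed_out = t.is_alive()     # True: the worker had not finished yet

t.join()                     # now wait without a timeout
finished = not t.is_alive()  # True
print(timed_out, finished)   # True True
```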

Thread names, CurrentThread and ThreadState

The Thread.CurrentThread property returns a reference to the currently executing thread object.

Although VB .NET has a great Threads window for debugging multi-threaded applications, which is described below, we have often been rescued by the command

MsgBox(Thread.CurrentThread.Name)

It often turned out that the code was executing in a completely different thread from the one in which it was supposed to execute.

Recall that the term "non-deterministic scheduling of program threads" means a very simple thing: the programmer has practically no means at his disposal to influence the work of the scheduler. For this reason, programs often use the ThreadState property, which returns information about the current state of the thread.
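For comparison, Python exposes the same debugging information through threading.current_thread() and Thread.is_alive(); giving threads names makes such checks far more readable. A sketch with an illustrative thread name:

```python
import threading

seen = {}

def worker():
    # Which thread is really executing this code?
    seen["name"] = threading.current_thread().name

t = threading.Thread(target=worker, name="Subtracting thread")
t.start()
t.join()
print(seen["name"])   # Subtracting thread
print(t.is_alive())   # False: the thread has terminated
```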

Threads window

The Threads window of Visual Studio .NET is an invaluable aid in debugging multithreaded programs. It is activated by the Debug > Windows submenu command while in break mode. Suppose you named the bThread thread with the following command:

bThread.Name = "Subtracting thread"

An approximate view of the thread window after interrupting the program with the Ctrl+Break key combination (or in another way) is shown in Fig. 10.5.

Fig. 10.5. Threads window

The arrow in the first column marks the active thread, returned by the Thread.CurrentThread property. The ID column contains numeric thread IDs. The next column lists thread names (if any have been assigned). The Location column indicates the procedure being executed (for example, the WriteLine procedure of the Console class in Fig. 10.5). The remaining columns contain information about priorities and suspended threads (see the next section).

The Threads window (not the operating system!) lets you control your program's threads using context menus. For example, you can stop the current thread by right-clicking the appropriate line and selecting the Freeze command (the stopped thread's work can be resumed later). Stopping threads is often used during debugging so that a misbehaving thread does not interfere with the application. In addition, the Threads window lets you activate another (non-stopped) thread: right-click the desired line and select the Switch To Thread command from the context menu (or simply double-click the thread's line). As will be shown later, this is very useful in diagnosing potential deadlocks.

Suspending a thread

Temporarily unused threads can be put into a passive state using the Sleep method. A passive thread is also considered blocked. Of course, when a thread goes into the passive state, the remaining threads get a larger share of processor resources. The standard syntax of the Sleep method is as follows:

Thread.Sleep(interval_in_milliseconds)

Calling Sleep makes the active thread passive for at least the specified number of milliseconds (however, waking immediately after the specified interval elapses is not guaranteed). Note that no reference to a specific thread is passed when calling the method - Sleep is called only for the active thread.

Another version of Sleep causes the current thread to yield the remainder of the allocated CPU time:

Thread.Sleep(0)

The following option puts the current thread in a passive state for an unlimited time (activation occurs only when Interrupt is called):

Thread.Sleep(Timeout.Infinite)

Because passive (or otherwise blocked) threads can be interrupted by the Interrupt method, which causes a ThreadInterruptedException to be thrown, the Sleep call is always wrapped in a Try-Catch block, as in the following snippet:

Try
    Thread.Sleep(200)
Catch tie As ThreadInterruptedException
    ' Passive thread state has been interrupted
Catch e As Exception
    ' Other exceptions
End Try

Every .NET program runs on a main program thread, so the Sleep method is also used to suspend programs (if the program does not import the Threading namespace, use the fully qualified name Threading.Thread.Sleep).

Termination or interruption of program threads

The thread is automatically terminated when execution leaves the method specified when the ThreadStart delegate was created, but sometimes it is desirable to terminate the method (and hence the thread) when certain conditions occur. In such cases, the thread usually checks a condition variable and, depending on its state, decides whether to exit early. As a rule, a Do-While loop is used for this:

Sub ThreadedMethod()
    ' The program must provide a means of polling
    ' the condition variable.
    ' For example, the condition variable can be exposed as a property.
    Do While conditionVariable = False And MoreWorkToDo
        ' Main code
    Loop
End Sub

Polling the condition variable takes some time, so constant polling in the loop condition should be used only if you actually expect the thread to terminate prematurely.

If the test of the condition variable must occur at a specific location, use the If-Then command in combination with Exit Sub inside an infinite loop.

Access to a condition variable must be synchronized so that other threads do not interfere with its normal use. This important topic is covered in the section "Problem Solving: Synchronization".
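The chapter's examples are in VB .NET, but the stop-flag idiom itself is language-neutral. For comparison, here is a minimal Python sketch (all names are illustrative, not from the chapter's code) in which a worker thread polls a thread-safe flag and exits early when asked to:

```python
import threading

stop_requested = threading.Event()  # thread-safe condition variable
results = []

def threaded_method():
    # Do-While analogue: keep working until asked to stop early
    for i in range(1000):
        if stop_requested.is_set():
            return  # early exit, like Exit Sub
        results.append(i)

worker = threading.Thread(target=threaded_method)
worker.start()
stop_requested.set()  # ask the worker to finish early
worker.join()
```

Using threading.Event instead of a bare boolean sidesteps the synchronization issue: its set/is_set methods are already thread-safe.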

Unfortunately, passive (or otherwise blocked) threads do not execute code, so polling a condition variable is not an option for them. In this case, you should call the Interrupt method on an object variable containing a reference to the desired thread.

The Interrupt method can only be called on threads that are in the Wait, Sleep, or Join state. If you call Interrupt on a thread in one of these states, then after a while the thread starts running again, and the runtime raises a ThreadInterruptedException in that thread. This happens even if the thread has been put to sleep indefinitely by calling Thread.Sleep(Timeout.Infinite). We say "after a while" because thread scheduling is non-deterministic by nature. The ThreadInterruptedException is caught by the Catch section containing the code for exiting the wait state. However, the Catch section is not required to terminate the thread just because Interrupt was called; the thread handles the exception as it sees fit.

In .NET, the Interrupt method can be called even on threads that are not blocked. In that case, the thread is interrupted the next time it blocks.

Pausing and Killing Threads

The Threading namespace contains other methods that interrupt the normal functioning of threads:

  • Suspend;
  • Abort.

It is hard to say why .NET includes support for these methods: calling Suspend or Abort is likely to make the program unstable. Neither method allows the thread to be deinitialized properly. Moreover, when Suspend or Abort is called, you cannot predict in what state the thread will leave the objects it was working with.

Calling Abort results in a ThreadAbortException being thrown. To help you understand why this strange exception should not be handled in programs, here is an excerpt from the .NET SDK documentation:

“...When a thread is destroyed by calling Abort, the runtime throws a ThreadAbortException. This is a special kind of exception that cannot be caught by the program. When this exception is thrown, the runtime executes all Finally blocks before killing the thread. Because Finally blocks can do anything, call Join to make sure the thread is killed.”

Moral: Abort and Suspend are not recommended (and if you still cannot do without Suspend, resume the suspended thread with the Resume method). The only safe ways to terminate a thread are polling a synchronized condition variable or calling the Interrupt method discussed above.

Background threads (daemons)

Some threads run in the background and terminate automatically when the other components of the program stop. In particular, the garbage collector runs on a background thread. Typically, background threads are created to receive data, but only when there is code running on other threads capable of processing that data. Syntax: threadName.IsBackground = True

If only background threads are left in the application, the application is automatically terminated.
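Background (daemon) threads exist in most runtimes, not just .NET. For comparison, a minimal Python sketch (illustrative names) of the same IsBackground idea:

```python
import threading
import time

def background_listener():
    # would loop forever, but dies automatically with the main thread
    while True:
        time.sleep(0.01)

listener = threading.Thread(target=background_listener)
listener.daemon = True  # same role as threadName.IsBackground = True
listener.start()
alive_while_main_runs = listener.is_alive()
# When only daemon threads remain, the process terminates automatically.
```

As in .NET, the daemon flag must be set before the thread is started.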

A more serious example: extracting data from HTML code

We recommend using threads only when the functionality of the program is clearly divided into several operations. A good example is the HTML extraction program from Chapter 9. Our class does two things: fetching data from the Amazon site and processing it. Here is a perfect example of a situation in which multi-threaded programming is really appropriate. We create classes for several different books and then parse the data in different threads. Creating a new thread for each book improves the efficiency of the program because while one thread is receiving data (which may require waiting on the Amazon server), another thread will be busy processing the data already received.

The multi-threaded version of this program works more efficiently than the single-threaded version only on a computer with several processors or if the reception of additional data can be effectively combined with their analysis.

As mentioned above, only procedures without parameters can run in threads, so the program needs small changes. The following is the main procedure, rewritten to eliminate the parameters:

Public Sub FindRank()
    m_Rank = ScrapeAmazon()
    Console.WriteLine("the rank of " & m_Name & " is " & GetRank)
End Sub

Since we cannot use a combo box for storing and retrieving the information (writing multithreaded GUI programs is discussed in the last section of this chapter), the program stores the data of four books in an array whose definition begins like this:

Dim theBook(3,1) As String
theBook(0,0) = "1893115992"
theBook(0,1) = "Programming VB .NET"
' Etc.

Four threads are created in the same cycle that creates the AmazonRanker objects:

For i = 0 To 3
    Try
        theRanker = New AmazonRanker(theBook(i,0), theBook(i,1))
        aThreadStart = New ThreadStart(AddressOf theRanker.FindRank)
        aThread = New Thread(aThreadStart)
        aThread.Name = theBook(i,1)
        aThread.Start()
    Catch e As Exception
        Console.WriteLine(e.Message)
    End Try
Next

Below is the full text of the program:

Option Strict On
Imports System.IO
Imports System.Net
Imports System.Threading

Module Module1

Sub Main()
    Dim theBook(3,1) As String
    theBook(0,0) = "1893115992"
    theBook(0,1) = "Programming VB .NET"
    theBook(1,0) = "1893115291"
    theBook(1,1) = "Database Programming VB .NET"
    theBook(2,0) = "1893115623"
    theBook(2,1) = "Programmer's Introduction to C#"
    theBook(3,0) = "1893115593"
    theBook(3,1) = "C# and the .NET Platform"
    Dim i As Integer
    Dim theRanker As AmazonRanker
    Dim aThreadStart As Threading.ThreadStart
    Dim aThread As Threading.Thread
    For i = 0 To 3
        Try
            theRanker = New AmazonRanker(theBook(i,0), theBook(i,1))
            aThreadStart = New ThreadStart(AddressOf theRanker.FindRank)
            aThread = New Thread(aThreadStart)
            aThread.Name = theBook(i,1)
            aThread.Start()
        Catch e As Exception
            Console.WriteLine(e.Message)
        End Try
    Next
    Console.ReadLine()
End Sub

End Module

Public Class AmazonRanker

    Private m_URL As String
    Private m_Rank As Integer
    Private m_Name As String

    Public Sub New(ByVal ISBN As String, ByVal theName As String)
        m_URL = "http://www.amazon.com/exec/obidos/ASIN/" & ISBN
        m_Name = theName
    End Sub

    Public Sub FindRank()
        m_Rank = ScrapeAmazon()
        Console.WriteLine("the rank of " & m_Name & " is " & GetRank)
    End Sub

    Public ReadOnly Property GetRank() As String
        Get
            If m_Rank <> 0 Then
                Return CStr(m_Rank)
            Else
                ' Problems
            End If
        End Get
    End Property

    Public ReadOnly Property GetName() As String
        Get
            Return m_Name
        End Get
    End Property

    Private Function ScrapeAmazon() As Integer
        Try
            Dim theURL As New Uri(m_URL)
            Dim theRequest As WebRequest
            theRequest = WebRequest.Create(theURL)
            Dim theResponse As WebResponse
            theResponse = theRequest.GetResponse
            Dim aReader As New StreamReader(theResponse.GetResponseStream())
            Dim theData As String
            theData = aReader.ReadToEnd
            Return Analyze(theData)
        Catch E As Exception
            Console.WriteLine(E.Message)
            Console.WriteLine(E.StackTrace)
            Console.ReadLine()
        End Try
    End Function

    Private Function Analyze(ByVal theData As String) As Integer
        Dim Location As Integer
        Location = theData.IndexOf("Amazon.com Sales Rank:") _
            + "Amazon.com Sales Rank:".Length
        Dim temp As String
        Do Until theData.Substring(Location, 1) = "<"
            temp = temp & theData.Substring(Location, 1)
            Location += 1
        Loop
        Return CInt(temp)
    End Function

End Class

Multithreaded operations are common in the .NET networking and I/O namespaces, so the .NET Framework class library provides dedicated asynchronous methods for them. For more information about using asynchronous methods when writing multithreaded programs, see the BeginGetResponse and EndGetResponse methods of the HttpWebRequest class.

Main danger (general data)

So far we have considered the only safe way to use threads: our threads did not change shared data. If you allow changes to shared data, potential bugs multiply exponentially and become much harder to get rid of. On the other hand, if you forbid different threads from modifying shared data, multithreaded programming in .NET will be hardly different from the limited facilities of VB6.

We offer a small program that demonstrates the problems that arise, without delving into unnecessary details. This program simulates a house with a thermostat in every room. If the temperature is 5 degrees Fahrenheit (about 2.77 degrees Celsius) or more below normal, the heating system is told to raise the temperature by 5 degrees; otherwise, the temperature rises by only 1 degree. If the current temperature is greater than or equal to the set temperature, no change is made. Temperature control in each room is handled by a separate thread with a 200-millisecond delay. The main work is done by the following snippet:

If mHouse.HouseTemp < mHouse.MAX_TEMP - 5 Then
    Try
        Thread.Sleep(200)
    Catch tie As ThreadInterruptedException
        ' Passive waiting has been interrupted
    Catch e As Exception
        ' Other exceptions
    End Try
    mHouse.HouseTemp += 5
' Etc.

Below is the complete source code of the program. The result is shown in Fig. 10.6: the temperature in the house reached 105 degrees Fahrenheit (40.5 degrees Celsius)!

1 Option Strict On
2 Imports System.Threading
3 Module Module1
4 Sub Main()
5 Dim myHouse As New House(10)
6 Console.ReadLine()
7 End Sub
8 End Module
9 Public Class House
10 Public Const MAX_TEMP As Integer = 75
11 Private mCurTemp As Integer = 55
12 Private mRooms() As Room
13 Public Sub New(ByVal numOfRooms As Integer)
14 ReDim mRooms(numOfRooms - 1)
15 Dim i As Integer
16 Dim aThreadStart As Threading.ThreadStart
17 Dim aThread As Thread
18 For i = 0 To numOfRooms - 1
19 Try
20 mRooms(i) = New Room(Me, mCurTemp, CStr(i) & "th room")
21 aThreadStart = New ThreadStart(AddressOf _
   mRooms(i).CheckTempInRoom)
22 aThread = New Thread(aThreadStart)
23 aThread.Start()
24 Catch E As Exception
25 Console.WriteLine(E.StackTrace)
26 End Try
27 Next
28 End Sub
29 Public Property HouseTemp() As Integer
30 Get
31 Return mCurTemp
32 End Get
33 Set(ByVal Value As Integer)
34 mCurTemp = Value
35 End Set
36 End Property
37 End Class
38 Public Class Room
39 Private mCurTemp As Integer
40 Private mName As String
41 Private mHouse As House
42 Public Sub New(ByVal theHouse As House, _
   ByVal temp As Integer, ByVal roomName As String)
43 mHouse = theHouse
44 mCurTemp = temp
45 mName = roomName
46 End Sub
47 Public Sub CheckTempInRoom()
48 ChangeTemperature()
49 End Sub
50 Private Sub ChangeTemperature()
51 Try
52 If mHouse.HouseTemp < mHouse.MAX_TEMP - 5 Then
53 Thread.Sleep(200)
54 mHouse.HouseTemp += 5
55 Console.WriteLine("Am in " & Me.mName & _
56 ". Current temperature is " & mHouse.HouseTemp)
57 ElseIf mHouse.HouseTemp < mHouse.MAX_TEMP Then
58 Thread.Sleep(200)
59 mHouse.HouseTemp += 1
60 Console.WriteLine("Am in " & Me.mName & _
61 ". Current temperature is " & mHouse.HouseTemp)
62 Else
63 Console.WriteLine("Am in " & Me.mName & _
64 ". Current temperature is " & mHouse.HouseTemp)
65 ' Do nothing, temperature is normal
66 End If
67 Catch tie As ThreadInterruptedException
68 ' Passive wait has been interrupted
69 Catch e As Exception
70 ' Other exceptions
71 End Try
72 End Sub
73 End Class

Fig. 10.6. Multithreading issues

In the Sub Main procedure (lines 4-7), a "house" with ten "rooms" is created. The House class sets the maximum temperature to 75 degrees Fahrenheit (about 24 degrees Celsius). Lines 13-28 define a fairly complex constructor for the house. Lines 18-27 are key to understanding the program. Line 20 creates another Room object, passing a reference to the House object to the constructor so that the Room object can refer to it when necessary. Lines 21-23 run ten threads to adjust the temperature in each room. The Room class is defined in lines 38-73. The reference to the House object is stored in the mHouse variable in the Room class constructor (line 43). The code for checking and adjusting the temperature (lines 50-66) looks simple and natural, but as you will soon see, this impression is deceptive! Note that this code is wrapped in a Try-Catch block because the program uses the Sleep method.

It is unlikely that anyone would agree to live at a temperature of 105 degrees Fahrenheit (40.5 degrees Celsius). What happened? The problem is in the following line:

If mHouse.HouseTemp < mHouse.MAX_TEMP - 5 Then

What happens is that thread 1 checks the temperature first. It sees that the temperature is too low and prepares to raise it by 5 degrees. Unfortunately, before the temperature rises, thread 1 is interrupted and control passes to thread 2. Thread 2 checks the same variable, which thread 1 has not yet changed. Thus thread 2 also prepares to raise the temperature by 5 degrees, but does not get to do so either and likewise goes into the waiting state. The process continues until thread 1 is reactivated and moves on to its next command: raising the temperature by 5 degrees. The increase is repeated as all 10 threads are activated, and the residents of the house are in for a bad time.

Problem Solving: Synchronization

In the previous program, there is a situation where the result of the program depends on the order in which the threads are executed. To get rid of it, you need to make sure that commands like

If mHouse.HouseTemp < mHouse.MAX_TEMP - 5 Then...

are fully processed by the active thread before it is interrupted. This property is called atomicity: a block of code must be executed by each thread without interruption, as an atomic unit. A group of commands combined into an atomic block cannot be interrupted by the thread scheduler until it completes. Every multithreaded programming language has its own ways of ensuring atomicity. In VB .NET, the easiest way is to use the SyncLock command, which is called with an object variable. Make minor changes to the ChangeTemperature procedure from the previous example, and the program will work just fine:

Private Sub ChangeTemperature()
    SyncLock (mHouse)
        Try
            If mHouse.HouseTemp < mHouse.MAX_TEMP - 5 Then
                Thread.Sleep(200)
                mHouse.HouseTemp += 5
                Console.WriteLine("Am in " & Me.mName & _
                    ". Current temperature is " & mHouse.HouseTemp)
            ElseIf mHouse.HouseTemp < mHouse.MAX_TEMP Then
                Thread.Sleep(200)
                mHouse.HouseTemp += 1
                Console.WriteLine("Am in " & Me.mName & _
                    ". Current temperature is " & mHouse.HouseTemp)
            Else
                Console.WriteLine("Am in " & Me.mName & _
                    ". Current temperature is " & mHouse.HouseTemp)
                ' Do nothing, the temperature is normal
            End If
        Catch tie As ThreadInterruptedException
            ' Passive wait was interrupted
        Catch e As Exception
            ' Other exceptions
        End Try
    End SyncLock
End Sub

The SyncLock block code is executed atomically. Access to it from all other threads will be closed until the first thread releases the lock with the End SyncLock command. If a thread in a synchronized block enters a passive wait state, the lock is held until the thread is interrupted or resumed.

Proper use of the SyncLock command makes your program thread-safe. Unfortunately, overuse of SyncLock hurts performance. Synchronizing code in a multithreaded program can slow it down considerably. Synchronize only the code that truly needs it, and release the lock as soon as possible.

The base collection classes are not thread-safe, but the .NET Framework includes thread-safe versions of most collection classes. In these classes, the code for potentially dangerous methods is enclosed in SyncLock blocks. Thread-safe versions of collection classes should be used in multithreaded programs wherever data integrity is compromised.

It remains to be mentioned that condition variables are easily implemented using the SyncLock command. To do this, you only need to synchronize the write to a common read-write boolean property, as is done in the following snippet:

Public Class ConditionVariable

    Private Shared locker As Object = New Object()
    Private Shared mOK As Boolean

    Shared Property TheConditionVariable() As Boolean
        Get
            Return mOK
        End Get
        Set(ByVal Value As Boolean)
            SyncLock (locker)
                mOK = Value
            End SyncLock
        End Set
    End Property

End Class

SyncLock command and Monitor class

The use of the SyncLock command involves some subtleties not shown in the simple examples above. Thus, the choice of the synchronization object plays a very important role. Try running the previous program with SyncLock(Me) instead of SyncLock(mHouse). The temperature rises above the threshold again!

Remember that the SyncLock command synchronizes on the object passed as a parameter, not on the code fragment. The SyncLock parameter acts as a door for access to the synchronized fragment from other threads. The SyncLock(Me) command actually opens several different doors, which is exactly what you were trying to avoid with synchronization. The moral:

To protect shared data in a multithreaded application, the SyncLock command must synchronize on a single object.

Since synchronization is object-specific, it is possible in some situations to inadvertently block other fragments. Let's say you have two synchronized methods first and second, both methods being synchronized on the bigLock object. When thread 1 enters the first method and grabs bigLock, no thread will be able to enter the second method, because access to it is already restricted to thread 1!
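The point that a single lock object can gate several unrelated methods is easy to demonstrate outside VB .NET as well. A minimal Python sketch (illustrative names), where one lock serializes two different functions:

```python
import threading

big_lock = threading.Lock()  # one object guards both functions
order = []

def first():
    with big_lock:
        order.append("first")

def second():
    with big_lock:  # blocked while any thread is inside first()
        order.append("second")

t1 = threading.Thread(target=first)
t2 = threading.Thread(target=second)
t1.start(); t2.start()
t1.join(); t2.join()
```

Whichever thread acquires big_lock first forces the other to wait, even though the two functions touch different code: the lock belongs to the object, not to the fragment.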

The functionality of the SyncLock command can be thought of as a subset of the functionality of the Monitor class. The Monitor class is highly customizable and can be used to solve non-trivial synchronization tasks. The SyncLock command is an approximate analogue of the Enter and Exit methods of the Monitor class:

Try
    Monitor.Enter(theObject)
    ' Synchronized code
Finally
    Monitor.Exit(theObject)
End Try

For some standard operations (incrementing/decrementing a variable, exchanging the contents of two variables), the .NET Framework provides the Interlocked class, whose methods perform these operations atomically. Using the Interlocked class, these operations are much faster than with the SyncLock command.
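Python has no direct Interlocked equivalent, but the guarantee it provides, an atomic read-modify-write, can be illustrated by wrapping the increment in a lock (illustrative names; without the lock, concurrent `counter += 1` operations could lose updates, exactly as in the thermostat example):

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        with lock:  # atomic read-modify-write, like Interlocked.Increment
            counter += 1

threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter is exactly 40000 because every increment ran under the lock
```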

Deadlock

During synchronization, locks are placed on objects, not on threads, so when different objects are used to lock different code fragments, programs can exhibit very non-trivial errors. Unfortunately, in many cases single-object synchronization is simply unacceptable, because it would cause threads to block too often.

Consider a deadlock in its simplest form. Imagine two programmers at a dinner table. Unfortunately, between the two of them they have only one knife and one fork. Assuming that both a knife and a fork are needed for eating, two situations are possible:

  • One programmer manages to grab both the knife and the fork and starts eating. When he has had enough, he puts them down, and the other programmer can then take them.
  • One programmer takes the knife and the other takes the fork. Neither will be able to start eating unless the other hands over their appliance.

In a multithreaded program, this situation is called a deadlock. Two methods are synchronized on different objects. Thread A grabs object 1 and enters the fragment protected by that object. Unfortunately, to do its work it needs access to code protected by another SyncLock block with a different synchronization object. But before it enters the fragment synchronized on the other object, thread B gets in and grabs that object. Now thread A cannot enter the second fragment, thread B cannot enter the first fragment, and both threads are doomed to wait forever. Neither thread can continue running, because the object needed to do so will never be released.

Diagnosing deadlocks is difficult because they can occur in relatively rare cases. It all depends on the order in which the scheduler allocates CPU time to them. It is entirely possible that, in most cases, synchronization objects will be acquired in a non-deadlock order.

Below is an implementation of the deadlock situation just described. After a brief discussion of the most fundamental points, we will show how to recognize a deadlock situation in the thread window:

1 Option Strict On
2 Imports System.Threading
3 Module Module1
4 Sub Main()
5 Dim Tom As New Programmer("Tom")
6 Dim Bob As New Programmer("Bob")
7 Dim aThreadStart As New ThreadStart(AddressOf Tom.Eat)
8 Dim aThread As New Thread(aThreadStart)
9 aThread.Name = "Tom"
10 Dim bThreadStart As New ThreadStart(AddressOf Bob.Eat)
11 Dim bThread As New Thread(bThreadStart)
12 bThread.Name = "Bob"
13 aThread.Start()
14 bThread.Start()
15 End Sub
16 End Module
17 Public Class Fork
18 Private Shared mForkAvailable As Boolean = True
19 Private Shared mOwner As String = "Nobody"
20 Private ReadOnly Property OwnsUtensil() As String
21 Get
22 Return mOwner
23 End Get
24 End Property
25 Public Sub GrabFork(ByVal a As Programmer)
26 Console.WriteLine(Thread.CurrentThread.Name & _
   " trying to grab the fork.")
27 Console.WriteLine(Me.OwnsUtensil & " has the fork.")
28 Monitor.Enter(Me) ' SyncLock(aFork)
29 If mForkAvailable Then
30 a.HasFork = True
31 mOwner = a.MyName
32 mForkAvailable = False
33 Console.WriteLine(a.MyName & " just got the fork. waiting")
34 Try
   Thread.Sleep(100)
   Catch e As Exception
   Console.WriteLine(e.StackTrace)
   End Try
35 End If
36 Monitor.Exit(Me) ' End SyncLock
37 End Sub
38 End Class
39 Public Class Knife
40 Private Shared mKnifeAvailable As Boolean = True
41 Private Shared mOwner As String = "Nobody"
42 Private ReadOnly Property OwnsUtensil() As String
43 Get
44 Return mOwner
45 End Get
46 End Property
47 Public Sub GrabKnife(ByVal a As Programmer)
48 Console.WriteLine(Thread.CurrentThread.Name & _
   " trying to grab the knife.")
49 Console.WriteLine(Me.OwnsUtensil & " has the knife.")
50 Monitor.Enter(Me) ' SyncLock(aKnife)
51 If mKnifeAvailable Then
52 mKnifeAvailable = False
53 a.HasKnife = True
54 mOwner = a.MyName
55 Console.WriteLine(a.MyName & " just got the knife. waiting")
56 Try
   Thread.Sleep(100)
   Catch e As Exception
   Console.WriteLine(e.StackTrace)
   End Try
57 End If
58 Monitor.Exit(Me)
59 End Sub
60 End Class
61 Public Class Programmer
62 Private mName As String
63 Private Shared mFork As Fork
64 Private Shared mKnife As Knife
65 Private mHasKnife As Boolean
66 Private mHasFork As Boolean
67 Shared Sub New()
68 mFork = New Fork()
69 mKnife = New Knife()
70 End Sub
71 Public Sub New(ByVal theName As String)
72 mName = theName
73 End Sub
74 Public ReadOnly Property MyName() As String
75 Get
76 Return mName
77 End Get
78 End Property
79 Public Property HasKnife() As Boolean
80 Get
81 Return mHasKnife
82 End Get
83 Set(ByVal Value As Boolean)
84 mHasKnife = Value
85 End Set
86 End Property
87 Public Property HasFork() As Boolean
88 Get
89 Return mHasFork
90 End Get
91 Set(ByVal Value As Boolean)
92 mHasFork = Value
93 End Set
94 End Property
95 Public Sub Eat()
96 Do Until Me.HasKnife And Me.HasFork
97 Console.WriteLine(Thread.CurrentThread.Name & " is in the thread.")
98 If Rnd() < 0.5 Then
99 mFork.GrabFork(Me)
100 Else
101 mKnife.GrabKnife(Me)
102 End If
103 Loop
104 MsgBox(Me.MyName & " can eat!")
105 mKnife = New Knife()
106 mFork = New Fork()
107 End Sub
108 End Class

61 Public Class Programmer

62 Private mName As String

63 Private Shared mFork As Fork

64 Private Shared mKnife As Knife

65 Private mHasKnife As Boolean

66 Private mHasFork As Boolean

67 Shared Sub New()

68 mFork = New Fork()

69 mKnife = New Knife()

70 End Sub

71 Public Sub New(ByVal theName As String)

72 mName = theName

73 End Sub

74 Public Readonly Property MyName() As String

75 Get

76 Return mName

77 End Get

78 End Property

79 Public Property HasKnife() As Boolean

80 Get

81 Return mHasKnife

82 End Get

83 Set(ByVal Value As Boolean)

84 mHasKnife = Value

85 End Set

86 End Property

87 Public Property HasFork() As Boolean

88 Get

89 Return mHasFork

90 End Get

91 Set(ByVal Value As Boolean)

92 mHasFork = Value

93 End Set

94 End Property

95 Public Sub Eat()

96 Do Until Me.HasKnife And Me.HasFork

97 Console.Writeline(Thread.CurrentThread.Name&"is in the thread.")

98 If Rnd()< 0.5 Then

99 mFork.GrabFork(Me)

100 else

101 mKnife.GrabKnife(Me)

102 End If

103 Loop

104 MsgBox(Me.MyName & "can eat!")

105 mKnife = New Knife()

106 mFork= New Fork()

107 End Sub

108 End Class

The main procedure Main (lines 4-16) creates two instances of the Programmer class and then starts two threads to execute the key Eat method of the Programmer class (lines 95-108), described below. The Main procedure sets the thread names and starts the threads; what happens there should be clear without further comment.

The code of the Fork class (lines 17-38) is more interesting (a similar Knife class is defined in lines 39-60). Lines 18 and 19 set the values of the Shared fields, which record whether the fork is currently available and, if not, who is using it. The ReadOnly property OwnsUtensil (lines 20-24) provides the simplest way of reporting that information. Central to the Fork class is the "grab the fork" method GrabFork, defined in lines 25-37.

Lines 26 and 27 simply write debug information to the console. In the main code of the method (lines 28-36), access to the fork is synchronized on the object Me. Because our program uses only one fork, synchronizing on Me ensures that two threads cannot grab it at the same time. The Sleep call (in the block starting on line 34) simulates the delay between grabbing the fork/knife and the start of eating. Note that the Sleep command does not release the locks held on objects, and so it only speeds up the deadlock!

However, the code of the Programmer class (lines 61-108) is of most interest. Lines 67-70 define a Shared constructor to ensure that there is only one fork and one knife in the program. The property code (lines 74-94) is simple and needs no comment. The most important things happen in the Eat method, which is executed by two separate threads. The process continues in a loop until one of the threads captures both the fork and the knife. In lines 98-102 the object grabs a fork or a knife at random, using the Rnd call, and this is what causes the deadlock. The following happens:

  1. The thread executing Tom's Eat method is activated and enters the loop. It grabs the knife and goes into a waiting state.
  2. The thread executing Bob's Eat method wakes up and enters the loop. It cannot grab the knife, but it grabs the fork and goes into a waiting state.
  3. The thread executing Tom's Eat method is activated and enters the loop. It tries to grab the fork, but the fork is already held by Bob; the thread goes into a waiting state.
  4. The thread executing Bob's Eat method wakes up and enters the loop. It tries to grab the knife, but the knife is already held by Tom; the thread goes into a waiting state.

All this goes on ad infinitum: a typical deadlock situation (run the program and you will see that neither programmer ever manages to eat).
You can also detect a deadlock in the Threads window. Run the program and interrupt it with Ctrl+Break. Add the Me variable to the watch window and open the Threads window. The result looks something like Fig. 10.7. The figure shows that Bob's thread has grabbed the knife but does not have the fork. Right-click the Tom line in the Threads window and select the Switch To Thread command from the context menu. The watch window shows that Tom's thread has the fork but no knife. Of course, this is not one hundred percent proof, but such behavior at least suggests that something is wrong.
If single-object synchronization (as in the house temperature program) is not possible, you can prevent deadlocks by numbering the synchronization objects and always acquiring them in a fixed order. To continue the analogy with the dining programmers: if each thread always takes the knife first and then the fork, there will be no deadlock. The first thread to grab the knife will be able to eat normally. Translated into the language of program threads, this means that object 2 may be captured only if object 1 has been captured first.

Fig. 10.7. Deadlock analysis in the Threads window

Therefore, if we remove the call to Rnd on line 98 and replace it with the fragment

mFork.GrabFork(Me)

mKnife.GrabKnife(Me)

the deadlock disappears!
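The same fix, acquiring locks in one global order, can be sketched in Python (illustrative names): both diners take the knife lock first and the fork lock second, so circular waiting becomes impossible:

```python
import threading

knife = threading.Lock()
fork = threading.Lock()
meals = []

def eat(name):
    # Always acquire in the same global order: knife first, then fork.
    with knife:
        with fork:
            meals.append(name)

diners = [threading.Thread(target=eat, args=(n,)) for n in ("Tom", "Bob")]
for t in diners:
    t.start()
for t in diners:
    t.join()
```

Whoever gets the knife lock first also gets the fork lock; the other thread simply waits at the first lock instead of holding half of the resources.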

Sharing data as it is created

A common situation in multithreaded applications is when threads not only work with shared data, but also wait for it to arrive (that is, thread 1 must create the data before thread 2 can use it). Since the data is shared, access to it must be synchronized. It is also necessary to provide means for notifying waiting threads when data is ready.

This situation is usually called the producer/consumer problem. A thread tries to access data that does not yet exist, so it must yield control to another thread that creates the required data. The problem is solved with the following scheme:

  • Thread 1 (the consumer) wakes up, enters the synchronized method, looks for the data, does not find it, and goes into a wait state. First, however, it must release the lock so as not to interfere with the work of the provider thread.
  • Thread 2 (the provider) enters the synchronized method released by thread 1, creates the data for thread 1, and somehow notifies thread 1 that the data is present. It then releases the lock so that thread 1 can process the new data.

Don't try to solve this problem by constantly activating thread 1 to check the state of a condition variable whose value is set by thread 2. This solution will seriously hurt the performance of your program: in most cases thread 1 will wake up for no reason, and thread 2 will have to wait so often that it will not have time to create the data.

Producer/consumer relationships are very common, so class libraries for multithreaded programming provide special primitives for such situations. In .NET, these primitives are called Wait, Pulse, and PulseAll, and they are part of the Monitor class. Figure 10.8 illustrates the situation we are about to program. There are three queues of threads in the program: the waiting queue, the blocked queue, and the execution queue. The thread scheduler does not allocate CPU time to threads in the waiting queue. For a thread to be allocated time, it must move to the execution queue. As a result, the application's work is organized much more efficiently than with simple polling of a condition variable.

In pseudocode, the data consumer idiom is formulated as follows:

' Enter a synchronized block of the following form
Do While there is no data
    Go to the waiting queue
Loop
If there is data, process it
' Leave the synchronized block

Immediately after the Wait command is executed, the thread is suspended, the lock is released, and the thread enters the waiting queue. With the lock released, a thread from the execution queue is allowed to run. In time, one or more running threads will create the data needed by the thread in the waiting queue. Since the data check is performed in a loop, the transition to using the data (after the loop) occurs only when the data is ready for processing.

In pseudocode, the data provider idiom looks like this:

"Entering a Synchronized View Block

While data is NOT needed

Go to waiting queue

Else Produce data

After the data is ready, call Pulse-PulseAll.

to move one or more threads from the blocking queue to the execution queue. Leave the synchronized block (and return to the run queue)

Suppose our program simulates a family with one parent who earns money and a child who spends that money. When the money runs out, the child has to wait for a new amount to arrive. The software implementation of this model looks like this:

 1  Option Strict On
 2  Imports System.Threading
 3  Module Module1
 4    Sub Main()
 5      Dim theFamily As New Family()
 6      theFamily.StartItsLife()
 7    End Sub
 8  End Module
 9
10  Public Class Family
11    Private mMoney As Integer
12    Private mWeek As Integer = 1
13    Public Sub StartItsLife()
14      Dim aThreadStart As New ThreadStart(AddressOf Me.Produce)
15      Dim bThreadStart As New ThreadStart(AddressOf Me.Consume)
16      Dim aThread As New Thread(aThreadStart)
17      Dim bThread As New Thread(bThreadStart)
18      aThread.Name = "Produce"
19      aThread.Start()
20      bThread.Name = "Consume"
21      bThread.Start()
22    End Sub
23    Public Property TheWeek() As Integer
24      Get
25        Return mWeek
26      End Get
27      Set(ByVal Value As Integer)
28        mWeek = Value
29      End Set
30    End Property
31    Public Property OurMoney() As Integer
32      Get
33        Return mMoney
34      End Get
35      Set(ByVal Value As Integer)
36        mMoney = Value
37      End Set
38    End Property
39    Public Sub Produce()
40      Thread.Sleep(500)
41      Do
42        Monitor.Enter(Me)
43        Do While Me.OurMoney > 0
44          Monitor.Wait(Me)
45        Loop
46        Me.OurMoney = 1000
47        Monitor.PulseAll(Me)
48        Monitor.Exit(Me)
49      Loop
50    End Sub
51    Public Sub Consume()
52      MsgBox("Am in consume thread")
53      Do
54        Monitor.Enter(Me)
55        Do While Me.OurMoney = 0
56          Monitor.Wait(Me)
57        Loop
58        Console.WriteLine("Dear parent I just spent all your " & _
          "money in week " & TheWeek)
59        TheWeek += 1
60        If TheWeek = 21 * 52 Then System.Environment.Exit(0)
61        Me.OurMoney = 0
62        Monitor.PulseAll(Me)
63        Monitor.Exit(Me)
64      Loop
65    End Sub
66  End Class

The StartItsLife method (lines 13-22) prepares and starts the Produce and Consume threads. The most important things happen in the Produce (lines 39-50) and Consume (lines 51-65) methods. The Sub Produce procedure checks whether money is still available, and while it is, it goes to the waiting queue. Otherwise, the parent generates money (line 46) and notifies the threads in the waiting queue about the change in the situation. Note that the Pulse/PulseAll call only takes effect when the lock is released with the Monitor.Exit command. Conversely, the Sub Consume procedure checks for the presence of money and waits while there is none; after spending the money, it notifies the waiting parent. Line 60 simply terminates the program after 21 conditional years; the System.Environment.Exit(0) call is the .NET counterpart of the End command (End is also supported, but unlike System.Environment.Exit, it does not return an exit code to the operating system).

Threads placed in the waiting queue must be released by other parts of your program. This is also the reason why we prefer PulseAll to Pulse: since it is not known in advance which thread will be activated when Pulse is called, with a relatively small number of threads in the queue PulseAll can be used with the same success.

Multithreading in graphics programs

Our discussion of multithreading in GUI applications begins with an example explaining why multithreading is needed in GUI applications. Create a form with two buttons, Start (btnStart) and Cancel (btnCancel), as shown in Fig. 10.9. When the Start button is clicked, a class is created that contains a random string of 10 million characters and a method to count occurrences of the letter "E" in that long string. Note the use of the StringBuilder class, which makes building long strings more efficient.

Step 1

Thread 1 notices that there is no data for it. It calls Wait, releases the lock, and enters the wait queue.



Step 2

When the lock is released, thread 2 or thread 3 exits the lock queue and enters the synchronized block, acquiring the lock

Step 3

Suppose thread 3 enters the synchronized block, creates data, and calls Pulse/PulseAll.

Immediately after it exits the block and releases the lock, thread 1 is moved to the execution queue. If thread 3 calls Pulse, only one thread goes to the execution queue; when PulseAll is called, all waiting threads go to the execution queue.



Fig. 10.8. The producer/consumer problem

Fig. 10.9. Multithreading in a simple GUI application

Imports System.Text

Public Class RandomCharacters
  Private m_Data As StringBuilder
  Private m_CountDone As Boolean
  Private m_length, m_count As Integer

  Public Sub New(ByVal n As Integer)
    m_length = n - 1
    m_Data = New StringBuilder(m_length)
    MakeString()
  End Sub

  Private Sub MakeString()
    Dim i As Integer
    Dim myRnd As New Random()
    For i = 0 To m_length
      ' Generate a random number from 65 to 90,
      ' convert it to an uppercase letter
      ' and append it to the StringBuilder object
      m_Data.Append(Chr(myRnd.Next(65, 90)))
    Next
  End Sub

  Public Sub StartCount()
    GetEes()
  End Sub

  Private Sub GetEes()
    Dim i As Integer
    For i = 0 To m_length
      If m_Data.Chars(i) = CChar("E") Then
        m_count += 1
      End If
    Next
    m_CountDone = True
  End Sub

  Public ReadOnly Property GetCount() As Integer
    Get
      If Not m_CountDone Then
        Throw New Exception("Count not yet done")
      Else
        Return m_count
      End If
    End Get
  End Property

  Public ReadOnly Property IsDone() As Boolean
    Get
      Return m_CountDone
    End Get
  End Property
End Class

The two buttons on the form have very simple code associated with them. The btn-Start_Click procedure instantiates the above RandomCharacters class, which encapsulates a string with 10 million characters:

Private Sub btnStart_Click(ByVal sender As System.Object, _
    ByVal e As System.EventArgs) Handles btnStart.Click
  Dim RC As New RandomCharacters(10000000)
  RC.StartCount()
  MsgBox("The number of Es is " & RC.GetCount)
End Sub

The Cancel button displays a message box:

Private Sub btnCancel_Click(ByVal sender As System.Object, _
    ByVal e As System.EventArgs) Handles btnCancel.Click
  MsgBox("Count Interrupted!")
End Sub

When you run the program and click the Start button, you find that the Cancel button does not respond to user input because the continuous loop prevents the button from handling the event it receives. In modern programs, this is unacceptable!

Two solutions are possible. The first option, well known from previous versions of VB, does without multithreading: the DoEvents call is included in the loop. In .NET, this command looks like this:

Application.DoEvents()

In our example, this is definitely undesirable - who wants to slow down the program with ten million DoEvents calls! If you instead separate the loop into a separate thread, the operating system will switch between threads and the Cancel button will still work. An implementation with a separate thread is shown below. To visually show that the Cancel button works, when it is pressed, we simply end the program.

Next Step: Show Count Button

Let's say you decide to get creative and give the form the look shown in Fig. 10.10. Please note: the Show Count button is not yet available.

Fig. 10.10. Form with disabled button

A separate thread is supposed to do the counting and enable the disabled button. Of course it can be done; moreover, such a problem arises quite often. Unfortunately, you won't be able to do it in the most obvious way: linking the secondary thread to the GUI thread by keeping a reference to the ShowCount button in the constructor, or even by using a standard delegate. In other words, never use the variant below (the erroneous lines are the stored button reference and the direct assignment to its Enabled property).

Public Class RandomCharacters
  Private m_Data As StringBuilder
  Private m_CountDone As Boolean
  Private m_length, m_count As Integer
  Private m_Button As Windows.Forms.Button

  Public Sub New(ByVal n As Integer, _
      ByVal b As Windows.Forms.Button)
    m_length = n - 1
    m_Data = New StringBuilder(m_length)
    m_Button = b
    MakeString()
  End Sub

  Private Sub MakeString()
    Dim i As Integer
    Dim myRnd As New Random()
    For i = 0 To m_length
      m_Data.Append(Chr(myRnd.Next(65, 90)))
    Next
  End Sub

  Public Sub StartCount()
    GetEes()
  End Sub

  Private Sub GetEes()
    Dim i As Integer
    For i = 0 To m_length
      If m_Data.Chars(i) = CChar("E") Then
        m_count += 1
      End If
    Next
    m_CountDone = True
    m_Button.Enabled = True   ' WRONG: touches the GUI from another thread
  End Sub

  Public ReadOnly Property GetCount() As Integer
    Get
      If Not m_CountDone Then
        Throw New Exception("Count not yet done")
      Else
        Return m_count
      End If
    End Get
  End Property

  Public ReadOnly Property IsDone() As Boolean
    Get
      Return m_CountDone
    End Get
  End Property
End Class

It is likely that in some cases this code will work. Nonetheless:

  • The interaction of a secondary thread with the thread that creates the GUI cannot be arranged by obvious means.
  • Never change GUI elements from other threads of the program. All changes should occur only on the thread that created the GUI.

If you break these rules, we guarantee that subtle, hard-to-find bugs will occur in your multithreaded graphics programs.

It will not be possible to organize the interaction of objects using events either. An event handler runs on the same thread that called RaiseEvent, so events won't help you.

Yet common sense dictates that graphics applications should have a means of modifying elements from another thread. The .NET Framework provides a thread-safe way to call GUI application methods from another thread. For this purpose, the special MethodInvoker delegate type from the System.Windows.Forms namespace is used. The following snippet shows the new version of the GetEes method (the changed lines are the Try block with the MethodInvoker and the UpDateButton procedure):

Private Sub GetEes()
  Dim i As Integer
  For i = 0 To m_length
    If m_Data.Chars(i) = CChar("E") Then
      m_count += 1
    End If
  Next
  m_CountDone = True
  Try
    Dim myInvoker As New MethodInvoker(AddressOf UpDateButton)
    myInvoker.Invoke()
  Catch e As ThreadInterruptedException
    ' Failure
  End Try
End Sub

Public Sub UpDateButton()
  m_Button.Enabled = True
End Sub

Inter-thread calls to the button are not made directly, but through the MethodInvoker. The .NET Framework guarantees that this approach is thread-safe.

Why does multi-threaded programming cause so many problems?

Now that you have some idea of multi-threaded programming and the potential problems associated with it, we thought it would be appropriate to answer the question posed in this subsection's heading.

One of the reasons is that multithreading is a non-linear process, and we are used to a linear programming model. At first, it is difficult to get used to the very idea that the execution of the program can be interrupted randomly, and control will be transferred to other code.

However, there is another, more fundamental reason: too few programmers these days program in assembly or even look at the disassembled output of a compiler. Otherwise, it would be much easier for them to get used to the idea that dozens of assembler instructions can correspond to one command of a high-level language (such as VB .NET). The thread can be interrupted after any of these instructions, and therefore also in the middle of a high-level instruction.

But that's not all: modern compilers optimize for speed, and the hardware itself may reorder memory operations. As a result, the compiler or hardware may, without your knowledge, change the order of operations specified in the source code of the program [Many compilers optimize array copy loops like for i = 0 To n : b(i) = a(i) : Next. A compiler (or even a specialized memory manager) can simply allocate the array and then fill it with a single block copy operation instead of copying individual elements one at a time!].

We hope these explanations will help you better understand why multithreaded programming causes so many problems - or at least be less surprised by the strange behavior of your multithreaded programs!

Earlier posts covered multithreading on Windows using CreateThread and other WinAPI functions, and multithreading on Linux and other *nix systems using pthreads. If you're writing in C++11 or later, you have access to std::thread and the other multithreading primitives introduced in that language standard. The following shows how to work with them. Unlike WinAPI and pthreads, code written with std::thread is cross-platform.

Note: the code below was tested on GCC 7.1 and Clang 4.0 under Arch Linux, GCC 5.4 and Clang 3.8 under Ubuntu 16.04 LTS, GCC 5.4 and Clang 3.8 under FreeBSD 11, and Visual Studio Community 2017 under Windows 10. CMake versions before 3.8 cannot tell the compiler to use the C++17 standard specified in the project properties; instructions for installing CMake 3.8 on Ubuntu 16.04 are available online. In order for the code to compile with Clang on *nix systems, the libc++ package must be installed. For Arch Linux, the package is available in the AUR. Ubuntu has the libc++-dev package, but you may run into an issue that prevents the code from building easily; a workaround is described on StackOverflow. On FreeBSD, you need to install the cmake-modules package to compile the project.

Mutexes

Below is a simple example of using threads and mutexes:

#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

std::mutex mtx;
static int counter = 0;
static const int MAX_COUNTER_VAL = 100;

void thread_proc(int tnum) {
    for (;;) {
        {
            std::lock_guard<std::mutex> lock(mtx);
            if (counter == MAX_COUNTER_VAL)
                break;
            int ctr_val = ++counter;
            std::cout << "Thread " << tnum << ": counter = " <<
                ctr_val << std::endl;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

int main() {
    std::vector<std::thread> threads;
    for (int i = 0; i < 10; i++) {
        std::thread thr(thread_proc, i);
        threads.emplace_back(std::move(thr));
    }

    // can't use const auto& here since .join() is not marked const
    for (auto& thr : threads) {
        thr.join();
    }

    std::cout << "Done!" << std::endl;
    return 0;
}

Note the wrapping of std::mutex in std::lock_guard, following the RAII idiom. This approach guarantees that the mutex will be released when the scope is exited in any case, including when exceptions occur. To acquire several mutexes at once while preventing deadlocks, there is the std::scoped_lock class. However, it only appeared in C++17 and therefore may not work everywhere. For earlier versions of C++, there is the std::lock template with similar functionality, although it requires additional code to release the locks correctly via RAII.

RWLock

Often a situation arises in which an object is read more often than it is written. In this case, instead of an ordinary mutex, it is more efficient to use a read-write lock, aka RWLock. An RWLock can be held by multiple reader threads at once, or by a single writer thread. In C++, RWLock corresponds to the std::shared_mutex and std::shared_timed_mutex classes:

#include <chrono>
#include <iostream>
#include <shared_mutex>
#include <thread>
#include <vector>

// std::shared_mutex mtx; // will not work with GCC 5.4
std::shared_timed_mutex mtx;

static int counter = 0;
static const int MAX_COUNTER_VAL = 100;

void thread_proc(int tnum) {
    for (;;) {
        {
            // see also std::shared_lock
            std::unique_lock<std::shared_timed_mutex> lock(mtx);
            if (counter == MAX_COUNTER_VAL)
                break;
            int ctr_val = ++counter;
            std::cout << "Thread " << tnum << ": counter = " <<
                ctr_val << std::endl;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

int main() {
    std::vector<std::thread> threads;
    for (int i = 0; i < 10; i++) {
        std::thread thr(thread_proc, i);
        threads.emplace_back(std::move(thr));
    }

    for (auto& thr : threads) {
        thr.join();
    }

    std::cout << "Done!" << std::endl;
    return 0;
}

By analogy with std::lock_guard, the std::unique_lock and std::shared_lock classes are used to acquire an RWLock, depending on how we want to hold it. The std::shared_timed_mutex class appeared in C++14 and works on all modern desktop platforms (I won't speak for mobile devices, game consoles, and so on). Unlike std::shared_mutex, it has the methods try_lock_for, try_lock_until, and others that try to acquire the mutex within a given time. I strongly suspect that std::shared_mutex should be cheaper than std::shared_timed_mutex. However, std::shared_mutex only appeared in C++17, which means it is not supported everywhere. In particular, the still widely used GCC 5.4 does not know about it.

Thread Local Storage

Sometimes you need a variable that looks like a global one, but whose value is visible only within a single thread. Other threads see the same variable name, but each has its own local value. For this purpose, Thread Local Storage, or TLS, was invented (it has nothing to do with Transport Layer Security!). Among other things, TLS can be used to significantly speed up the generation of pseudo-random numbers. An example of using TLS in C++:

#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

std::mutex io_mtx;
thread_local int counter = 0;
static const int MAX_COUNTER_VAL = 10;

void thread_proc(int tnum) {
    for (;;) {
        counter++;
        if (counter == MAX_COUNTER_VAL)
            break;
        {
            std::lock_guard<std::mutex> lock(io_mtx);
            std::cout << "Thread " << tnum << ": counter = " <<
                counter << std::endl;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

int main() {
    std::vector<std::thread> threads;
    for (int i = 0; i < 10; i++) {
        std::thread thr(thread_proc, i);
        threads.emplace_back(std::move(thr));
    }

    for (auto& thr : threads) {
        thr.join();
    }

    std::cout << "Done!" << std::endl;
    return 0;
}

The mutex here is used solely to synchronize output to the console. No synchronization is required to access thread_local variables.

Atomic Variables

Atomic variables are often used to perform simple operations without mutexes. For example, you may need to increment a counter from multiple threads. Instead of protecting an int with a std::mutex, it's more efficient to use std::atomic_int. C++ also offers std::atomic_char, std::atomic_bool, and many other types. Lock-free algorithms and data structures are also implemented on top of atomic variables. It is worth noting that they are very difficult to develop and debug, and on some systems they are no faster than equivalent algorithms and data structures with locks.

Code example:

#include <atomic>
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

static std::atomic_int atomic_counter(0);
static const int MAX_COUNTER_VAL = 100;

std::mutex io_mtx;

void thread_proc(int tnum) {
    for (;;) {
        int ctr_val = ++atomic_counter;
        if (ctr_val >= MAX_COUNTER_VAL)
            break;

        {
            std::lock_guard<std::mutex> lock(io_mtx);
            std::cout << "Thread " << tnum << ": counter = " <<
                ctr_val << std::endl;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

int main() {
    std::vector<std::thread> threads;

    int nthreads = std::thread::hardware_concurrency();
    if (nthreads == 0) nthreads = 2;

    for (int i = 0; i < nthreads; i++) {
        std::thread thr(thread_proc, i);
        threads.emplace_back(std::move(thr));
    }

    for (auto& thr : threads) {
        thr.join();
    }

    std::cout << "Done!" << std::endl;
    return 0;
}

Note the use of the hardware_concurrency function. It returns an estimate of the number of threads that can execute in parallel on the current system. For example, on a machine with a quad-core processor that supports hyper-threading, it returns 8. It can also return zero if the estimate fails or the function is simply not implemented.

For some information on how atomic variables work at the assembler level, see the x86/x64 assembler basic instructions cheat sheet.

Conclusion

As far as I can see, it all works really well. That is, when writing cross-platform applications in C++, you can safely forget about WinAPI and pthreads. Pure C has also had cross-platform threads since C11, but they are still not supported by Visual Studio (I checked) and are unlikely ever to be. It's no secret that Microsoft sees little interest in developing C support in its compiler, preferring to concentrate on C++.

Many primitives remain behind the scenes: std::condition_variable(_any), std::(shared_)future, std::promise, std::async, and others. To get acquainted with them, I recommend cppreference.com. It may also make sense to read the book C++ Concurrency in Action, though I must warn you that it is no longer new, contains a lot of filler, and essentially retells a dozen articles from cppreference.com.

The full version of the source code for this note, as usual, is on GitHub. How do you write multi-threaded applications in C++ now?