VB.NET, best practice for sorting concurrent results of threadpooling? - vb.net

In short, I'm trying to "sort" incoming results of using threadpooling as they finish. I have a functional solution, but there's no way in the world it's the best way to do it (it's prone to huge pauses). So here I am! I'll try to hit the bullet points of what's going on/what needs to happen and then post my current solution.
The intent of the code is to get information about files in a directory, and then write that to a text file.
I have a list (Counter.ListOfFiles) that is a list of the file paths sorted in a particular way. This is the guide that dictates the order I need to write to the text file.
I'm using a threadpool to collect the information about each file, create a stringbuilder with all of the text ready to write to the text file. I then call a procedure(SyncUpdate, inlcluded below), send the stringbuilder(strBld) from that thread along with the name of the path of the file that particular thread just wrote to the stringbuilder about(Xref).
The procedure includes a synclock to hold all the other threads until it finds a thread passing the correct information. That "correct" information being when the xref passed by the thread matches the first item in my list (FirstListItem). When that happens, I write to the text file, delete the first item in the list and do it again with the next thread.
The way I'm using the monitor is probably not great, in fact I have little doubt I'm using it in an offensively wanton manner. Basically while the xref (from the thread) <> the first item in my list, I'm doing a pulseall for the monitor. I originally was using monitor.wait, but it would eventually just give up trying to sort through the list, even when using a pulse elsewhere. I may have just been coding something awkwardly. Either way, I don't think it's going to change anything.
Basically the problem comes down to the fact that the monitor will pulse through all of the items it has in the queue, when there's a good chance the item I am looking for probably got passed to it somewhere earlier in the queue or whatever and it's now going to sort through all of the items again before looping back around to find a criteria that matches. The result of this is that my code will hit one of these and take a huge amount of time to complete.
I'm open to believing I'm just using the wrong tool for the job, or just not using tool I have correctly. I would strongly prefer some sort of threaded solution (unsurprisingly, it's much faster!). I've been messing around a bit with the Parallel Task functionality today, and a lot of the stuff looks promising, but I have even less experience with that vs. threadpool, and you can see how I'm abusing that! Maybe something with queue? You get the idea. I am directionless. Anything someone could suggest would be much appreciated. Thanks! Let me know if you need any additional information.
Private Sub SyncUpdateResource(strBld As Object, Xref As String)
SyncLock (CType(strBld, StringBuilder))
Dim FirstListitem As String = counter.ListOfFiles.First
Do While Xref <> FirstListitem
FirstListitem = Counter.ListOfFiles.First
'This makes the code much faster for reasons I can only guess at.
Thread.Sleep(5)
Monitor.PulseAll(CType(strBld, StringBuilder))
Loop
Dim strVol As String = Form1.Volname
Dim strLFPPath As String = Form1.txtPathDir
My.Computer.FileSystem.WriteAllText(strLFPPath & "\" & strVol & ".txt", strBld.ToString, True)
Counter.ListOfFiles.Remove(Xref)
End SyncLock
End Sub

This is a pretty typical multiple producer, single consumer application. The only wrinkle is that you have to order the results before they're written to the output. That's not difficult to do. So let's let that requirement slide for a moment.
The easiest way in .NET to implement a producer/consumer relationship is with BlockingCollection, which is a thread-safe FIFO queue. Basically, you do this:
In your case, the producer threads get items, do whatever processing they need to, and then put the item onto the queue. There's no need for any explicit synchronization--the BlockingCollection class implementation does that for you.
Your consumer pulls things from the queue and outputs them. You can see a really simple example of this in my article Simple Multithreading, Part 2. (Scroll down to the third example if you're just interested in the code.) That example just uses one producer and one consumer, but you can have N producers if you want.
Your requirements have a little wrinkle in that the consumer can't just write items to the file as it gets them. It has to make sure that it's writing them in the proper order. As I said, that's not too difficult to do.
What you want is a priority queue of some sort onto which you can place an item if it comes in out of order. Your priority queue can be a sorted list or even just a sequential list if the number of items you expect to get out of order isn't very large. That is, if you typically have only a half dozen items at a time that could be out of order, then a sequential list could work just fine.
I'd use a heap because it performs well. The .NET Framework doesn't supply a heap, but I have a simple one that works well for jobs like this. See A Generic BinaryHeap Class.
So here's how I'd write the consumer (the code is in pseudo-C#, but you can probably convert it easily enough).
The assumption here is that you have a BlockingCollection called sharedQueue that contains the items. The producers put items on that queue. Consumers do this:
var heap = new BinaryHeap<ItemType>();
foreach (var item in sharedQueue.GetConsumingEnumerable())
{
if (item.SequenceKey == expectedSequenceKey)
{
// output this item
// then check the heap to see if other items need to be output
expectedSequenceKey = expectedSequenceKey + 1;
while (heap.Count > 0 && heap.Peek().SequenceKey == expectedSequenceKey)
{
var heapItem = heap.RemoveRoot();
// output heapItem
expectedSequenceKey = expectedSequenceKey + 1;
}
}
else
{
// item is out of order
// put it on the heap
heap.Insert(item);
}
}
// if the heap contains items after everything is processed,
// then some error occurred.
One glaring problem with this approach as written is that the heap could grow without bound if one of your consumers crashes or goes into an infinite loop. But then, your other approach probably would suffer from that as well. If you think that's an issue, you'll have to add some way to skip an item that you think won't ever be forthcoming. Or kill the program. Or something.
If you don't have a binary heap or don't want to use one, you can do the same thing with a SortedList<ItemType>. SortedList will be faster than List, but slower than BinaryHeap if the number of items in the list is even moderately large (a couple dozen). Fewer than that and it's probably a wash.
I know that's a lot of info. I'm happy to answer any questions you might have.

Related

wxDataViewListCtrl is slow with 100k items from another thread

The requirements:
100k lines
One of the columns is not text - its custom painted with wxDC*.
The items addition is coming from another thread using wxThreadEvent.
Up until now I used wxDataViewListCtrl, but it takes too long to AppendItem 100 thousand time.
wxListCtrl (in virtual mode) does not have the ability to use wxDC* - please correct me if I am wrong.
The only thing I can think of is using wxDataViewCtrl + wxDataViewModel. But I can't understand how to add items.
I looked at the samples (https://github.com/wxWidgets/wxWidgets/tree/WX_3_0_BRANCH/samples/dataview), too complex for me.
I cant understand them.
I looked at the wiki (https://wiki.wxwidgets.org/WxDataViewCtrl), also too complex for me.
Can somebody please provide a very simple example of a wxDataViewCtrl + wxDataViewModel with one string column and one wxDC* column.
Thanks in advance.
P.S.
Per #HajoKirchhoff's request in the comments, I am posting some code:
// This is called from Rust 100k times.
extern "C" void Add_line_to_data_view_list_control(unsigned int index,
const char* date,
const char* sha1) {
wxThreadEvent evt(wxEVT_THREAD, 44);
evt.SetPayload(ViewListLine{index, std::string(date), std::string(sha1)});
wxQueueEvent(g_this, evt.Clone());
}
void TreeWidget::Add_line_to_data_view_list_control(wxThreadEvent& event) {
ViewListLine view_list_line = event.GetPayload<ViewListLine>();
wxVector<wxVariant> item;
item.push_back(wxVariant(static_cast<int>(view_list_line.index)));
item.push_back(wxVariant(view_list_line.date));
item.push_back(wxVariant(view_list_line.sha1));
AppendItem(item);
}
Appending 100k items to a control will always be slow. That's because it requires moving 100k items from your storage to the controls storage. A much better way for this amount of data is to have a "virtual" list control or wxGrid. In both cases the data is not actually transferred to the control. Instead when painting occurs, a callback function will transfer only the data required to paint. So for a 100k list you will only have "activity" for the 20-30 lines that are visible.
With wxListCtrl see https://docs.wxwidgets.org/3.0/classwx_list_ctrl.html, specify the wxLC_VIRTUAL flag, call SetItemCount and then provide/override
OnGetItemText
OnGetItemImage
OnGetItemColumnImage
Downside: You can only draw items contained in a wxImageList, since the OnGetItemImage return indizes into the list. So you cannot draw arbitrary items using a wxDC. Since the human eye will be overwhelmed with 100k different images anyway, this is usually acceptable. You may have to provide 20/30 different images beforehand, but you'll have a fast, flexible list.
That said, it is possible to override the OnPaint function and use that wxDC to draw anything in the list. But that'll get difficult pretty soon.
So an alternative would be to use wxGrid, create a wxGridTableBase derived class that acts as a bridge between the grid and your actual 100k data and create wxGridCellRenderer derived classes to render the actual data onscreen. The wxGridCellRenderer class will get a wxDC. This will give you more flexibility but is also much more complex than using a virtual wxListCtrl.
The full example of doing what you want will inevitably be relatively complex. But if you decompose in simple parts, it's really not that difficult: you do need to define a custom model, but if your list is flat, this basically just means returning the value of the item at the N-th position, as you can trivially implement all model methods related to the tree structure. An example of such a model, although with multiple columns can be found in the sample, so you just need to simplify it to a one (or two) column version.
Next, you are going to need a custom renderer too, but this is not difficult neither and, again, there is an example of this in the sample too.
If you have any concrete questions, you should ask them, but it's going to be difficult to do much better than what the sample shows and it does already show exactly what you want to do.
Thank you every one who replied!
#Vz.'s words "If you have any concrete questions, you should ask them" got me thinking and I took another look at the samples of wxWidgets. The full code can be found here. Look at the following classes:
TreeDataViewModel
TreeWidget
TreeCustomRenderer

Reactor - Stop source when first empty

I have a requirement like this.
Flux<Integer> s1 = .....;
s1.flatMap(value -> anotherSource.find(value));
I need a way to stop this s1 when anotherSource.find gives me first empty. how to do that?
Note:
One possible solution is to throw error then capture it to stop.
anotherSource.find(value).switchIfempty(Mono.error(..))
I am looking for better solution than this.
You won't find a specific operator for this, you'll have to combine operators to achieve it. (Note that doesn't make it a "hack" per-se, reactive frameworks are generally intended to be used in a way where you combine basic operators together to achieve your use-case.)
I would agree that using an error to achieve is far from ideal though as it potentially disrupts the flow of real errors in the reactive chain - so that should really be a last resort.
The approach I've generally taken in cases where I want the stream to stop based on an inner publisher is to materialise the inner stream, filter out the onComplete() signals and then re-add the onComplete() wherever appropriate (in this case, if it's empty.) You can then dematerialise the outer stream and it'll respond to the completed signal wherever you've injected it, stopping the stream:
s1.flatMap(
value ->
anotherSource
.find(value)
.materialize()
.filter(s -> !s.isOnComplete())
.defaultIfEmpty(Signal.complete()))
.dematerialize()
This has the advantage of preserving any error signals, while also not requiring another object or special value.

How can I use multi-threading (Parallel ForEach), and batch? Or should I handle it differently?

I'm creating an app to send out bulk emails, but I'm lacking understanding of batch, multi-threads, or the best way to handle it.
Say I do something like this:
Dim options as New ParallelOptions
options.MaxDegreeOfParallelism = Environment.ProcessorCount * 10
Parallel.ForEach(recipients.AsEnumerable(), options, _
Function(row)
Return SendEmail(args)
End Function)
What if there are a large amount of emails? I was thinking of adding a batch option, but not sure if it's beneficial. So if I have 500,000 subscribers, could this Parallel looping become unstable? Should there be a pause or sleep at some point? When thinking of emails I think of "Batch = #" but I'm unsure of how to take this concept into multi-threading like this or if I should be doing something entirely different.
I've heard of getting blacklisted if not handled correctly

Is foreach the only way to consume a BlockingCollection<T> in C#?

I'm starting to work with TPL right now. I have seen a simple version of the producer/consumer model utilizing TPL in this video.
Here is the problem:
The following code:
BlockingCollection<Double> bc = new BlockingCollection<Double>(100);
IEnumerable<Double> d = bc.GetConsumingEnumerable();
returns an IEnumerable<Double> which can be iterated (and automatically consumed) using a foreach:
foreach (var item in d)
{
// do anything with item
// in the end of this foreach,
// will there be any items left in d or bc? Why?
}
My questions are:
if I get the IEnumerator<Double> dEnum = d.GetEnumerator() from d (to iterate over d with a while loop, for instance) would the d.MoveNext() consume the list as well? (My answer: I don't think so, because the the dEnum is not linked with d, if you know what I mean. So it would consume dEnum, but not d, nor even bc)
May I loop through bc (or d) in a way other than the foreach loop, consuming the items? (the while cycles much faster than the foreach loop and I'm worried with performance issues for scientific computation problems)
What does exactly consume mean in the BlockingCollection<T> type?
E.g., code:
IEnumerator<Double> dEnum = d.GetEnumerator();
while (dEnum.MoveNext())
{
// do the same with dEnum.Current as
// I would with item in the foreach above...
}
Thank you all in advance!
If I get the IEnumerator<Double> dEnum = d.GetEnumerator() from d (to iterate over d with a while loop, for instance) would the d.MoveNext() consume the list as well?
Absolutely. That's all that the foreach loop will do anyway.
May I loop through bc (or d) in a way other than the foreach loop, consuming the items? (the while cycles much faster than the foreach loop and I'm worried with performance issues for scientific computation problems)
If your while loop is faster, that suggests you're doing something wrong. They should be exactly the same - except the foreach loop will dispose of the iterator too, which you should do...
If you can post a short but complete program demonstrating this discrepancy, we can look at it in more detail.
An alternative is to use Take (and similar methods).
What does exactly consume mean in the BlockingCollection type?
"Remove the next item from the collection" effectively.
There is no "performance issue" with foreach. Using the enumerator directly is not likely to give you any measurable improvement in performance compared to just using a foreach loop directly.
That being said, GetConsumingEnumerable() returns a standard IEnumerable<T>, so you can enumerate it any way you choose. Getting the IEnumerator<T> and enumerating through it directly will still work the same way.
Note that, if you don't want to use GetConsumingEnumerable(), you could just use ConcurrentQueue<T> directly. By default, BlockingCollection<T> wraps a ConcurrentQueue<T>, and really just provides a simpler API (GetConsumingEnumerable()) to make Producer/Consumer scenarios simpler to write. Using a ConcurrentQueue<T> directly would be closer to using BlockingCollection<T> without using the enumerable.

Is while (true) with break bad programming practice?

I often use this code pattern:
while(true) {
//do something
if(<some condition>) {
break;
}
}
Another programmer told me that this was bad practice and that I should replace it with the more standard:
while(!<some condition>) {
//do something
}
His reasoning was that you could "forget the break" too easily and have an endless loop. I told him that in the second example you could just as easily put in a condition which never returned true and so just as easily have an endless loop, so both are equally valid practices.
Further, I often prefer the former as it makes the code easier to read when you have multiple break points, i.e. multiple conditions which get out of the loop.
Can anyone enrichen this argument by adding evidence for one side or the other?
There is a discrepancy between the two examples. The first will execute the "do something" at least once every time even if the statement is never true. The second will only "do something" when the statement evaluates to true.
I think what you are looking for is a do-while loop. I 100% agree that while (true) is not a good idea because it makes it hard to maintain this code and the way you are escaping the loop is very goto esque which is considered bad practice.
Try:
do {
//do something
} while (!something);
Check your individual language documentation for the exact syntax. But look at this code, it basically does what is in the do, then checks the while portion to see if it should do it again.
To quote that noted developer of days gone by, Wordsworth:
...
In truth the prison, unto which we doom
Ourselves, no prison is; and hence for me,
In sundry moods, 'twas pastime to be bound
Within the Sonnet's scanty plot of ground;
Pleased if some souls (for such their needs must be)
Who have felt the weight of too much liberty,
Should find brief solace there, as I have found.
Wordsworth accepted the strict requirements of the sonnet as a liberating frame, rather than as a straightjacket. I'd suggest that the heart of "structured programming" is about giving up the freedom to build arbitrarily-complex flow graphs in favor of a liberating ease of understanding.
I freely agree that sometimes an early exit is the simplest way to express an action. However, my experience has been that when I force myself to use the simplest possible control structures (and really think about designing within those constraints), I most often find that the result is simpler, clearer code. The drawback with
while (true) {
action0;
if (test0) break;
action1;
}
is that it's easy to let action0 and action1 become larger and larger chunks of code, or to add "just one more" test-break-action sequence, until it becomes difficult to point to a specific line and answer the question, "What conditions do I know hold at this point?" So, without making rules for other programmers, I try to avoid the while (true) {...} idiom in my own code whenever possible.
When you can write your code in the form
while (condition) { ... }
or
while (!condition) { ... }
with no exits (break, continue, or goto) in the body, that form is preferred, because someone can read the code and understand the termination condition just by looking at the header. That's good.
But lots of loops don't fit this model, and the infinite loop with explicit exit(s) in the middle is an honorable model. (Loops with continue are usually harder to understand than loops with break.) If you want some evidence or authority to cite, look no further than Don Knuth's famous paper on Structured Programming with Goto Statements; you will find all the examples, arguments, and explanations you could want.
A minor point of idiom: writing while (true) { ... } brands you as an old Pascal programmer or perhaps these days a Java programmer. If you are writing in C or C++, the preferred idiom is
for (;;) { ... }
There's no good reason for this, but you should write it this way because this is the way C programmers expect to see it.
I prefer
while(!<some condition>) {
//do something
}
but I think it's more a matter of readability, rather than the potential to "forget the break." I think that forgetting the break is a rather weak argument, as that would be a bug and you'd find and fix it right away.
The argument I have against using a break to get out of an endless loop is that you're essentially using the break statement as a goto. I'm not religiously against using goto (if the language supports it, it's fair game), but I do try to replace it if there's a more readable alternative.
In the case of many break points I would replace them with
while( !<some condition> ||
!<some other condition> ||
!<something completely different> ) {
//do something
}
Consolidating all of the stop conditions this way makes it a lot easier to see what's going to end this loop. break statements could be sprinkled around, and that's anything but readable.
while (true) might make sense if you have many statements and you want to stop if any fail
while (true) {
if (!function1() ) return;
if (!function2() ) return;
if (!function3() ) return;
if (!function4() ) return;
}
is better than
while (!fail) {
if (!fail) {
fail = function1()
}
if (!fail) {
fail = function2()
}
........
}
Javier made an interesting comment on my earlier answer (the one quoting Wordsworth):
I think while(true){} is a more 'pure' construct than while(condition){}.
and I couldn't respond adequately in 300 characters (sorry!)
In my teaching and mentoring, I've informally defined "complexity" as "How much of the rest of the code I need to have in my head to be able to understand this single line or expression?" The more stuff I have to bear in mind, the more complex the code is. The more the code tells me explicitly, the less complex.
So, with the goal of reducing complexity, let me reply to Javier in terms of completeness and strength rather than purity.
I think of this code fragment:
while (c1) {
// p1
a1;
// p2
...
// pz
az;
}
as expressing two things simultaneously:
the (entire) body will be repeated as long as c1 remains true, and
at point 1, where a1 is performed, c1 is guaranteed to hold.
The difference is one of perspective; the first of these has to do with the outer, dynamic behavior of the entire loop in general, while the second is useful to understanding the inner, static guarantee which I can count on while thinking about a1 in particular. Of course the net effect of a1 may invalidate c1, requiring that I think harder about what I can count on at point 2, etc.
Let's put a specific (tiny) example in place to think about the condition and first action:
while (index < length(someString)) {
// p1
char c = someString.charAt(index++);
// p2
...
}
The "outer" issue is that the loop is clearly doing something within someString that can only be done as long as index is positioned in the someString. This sets up an expectation that we'll be modifying either index or someString within the body (at a location and manner not known until I examine the body) so that termination eventually occurs. That gives me both context and expectation for thinking about the body.
The "inner" issue is that we're guaranteed that the action following point 1 will be legal, so while reading the code at point 2 I can think about what is being done with a char value I know has been legally obtained. (We can't even evaluate the condition if someString is a null ref, but I'm also assuming we've guarded against that in the context around this example!)
In contrast, a loop of the form:
while (true) {
// p1
a1;
// p2
...
}
lets me down on both issues. At the outer level, I am left wondering whether this means that I really should expect this loop to cycle forever (e.g. the main event dispatch loop of an operating system), or whether there's something else going on. This gives me neither an explicit context for reading the body, nor an expectation of what constitutes progress toward (uncertain) termination.
At the inner level, I have absolutely no explicit guarantee about any circumstances that may hold at point 1. The condition true, which is of course true everywhere, is the weakest possible statement about what we can know at any point in the program. Understanding the preconditions of an action are very valuable information when trying to think about what the action accomplishes!
So, I suggest that the while (true) ... idiom is much more incomplete and weak, and therefore more complex, than while (c1) ... according to the logic I've described above.
The problem is that not every algorithm sticks to the "while(cond){action}" model.
The general loop model is like this :
loop_prepare
loop:
action_A
if(cond) exit_loop
action_B
goto loop
after_loop_code
When there is no action_A you can replace it by :
loop_prepare
while(cond)
action_B
after_loop_code
When there is no action_B you can replace it by :
loop_prepare
do action_A
while(cond)
after_loop_code
In the general case, action_A will be executed n times and action_B will be executed (n-1) times.
A real life example is : print all the elements of a table separated by commas.
We want all the n elements with (n-1) commas.
You always can do some tricks to stick to the while-loop model, but this will always repeat code or check twice the same condition (for every loops) or add a new variable. So you will always be less efficient and less readable than the while-true-break loop model.
Example of (bad) "trick" : add variable and condition
loop_prepare
b=true // one more local variable : more complex code
while(b): // one more condition on every loop : less efficient
action_A
if(cond) b=false // the real condition is here
else action_B
after_loop_code
Example of (bad) "trick" : repeat the code. The repeated code must not be forgotten while modifying one of the two sections.
loop_prepare
action_A
while(cond):
action_B
action_A
after_loop_code
Note : in the last example, the programmer can obfuscate (willingly or not) the code by mixing the "loop_prepare" with the first "action_A", and action_B with the second action_A. So he can have the feeling he is not doing this.
The first is OK if there are many ways to break from the loop, or if the break condition cannot be expressed easily at the top of the loop (for example, the content of the loop needs to run halfway but the other half must not run, on the last iteration).
But if you can avoid it, you should, because programming should be about writing very complex things in the most obvious way possible, while also implementing features correctly and performantly. That's why your friend is, in the general case, correct. Your friend's way of writing loop constructs is much more obvious (assuming the conditions described in the preceding paragraph do not obtain).
There's a substantially identical question already in SO at Is WHILE TRUE…BREAK…END WHILE a good design?. #Glomek answered (in an underrated post):
Sometimes it's very good design. See Structured Programing With Goto Statements by Donald Knuth for some examples. I use this basic idea often for loops that run "n and a half times," especially read/process loops. However, I generally try to have only one break statement. This makes it easier to reason about the state of the program after the loop terminates.
Somewhat later, I responded with the related, and also woefully underrated, comment (in part because I didn't notice Glomek's the first time round, I think):
One fascinating article is Knuth's "Structured Programming with go to Statements" from 1974 (available in his book 'Literate Programming', and probably elsewhere too). It discusses, amongst other things, controlled ways of breaking out of loops, and (not using the term) the loop-and-a-half statement.
Ada also provides looping constructs, including
loopname:
loop
...
exit loopname when ...condition...;
...
end loop loopname;
The original question's code is similar to this in intent.
One difference between the referenced SO item and this is the 'final break'; that is a single-shot loop which uses break to exit the loop early. There have been questions on whether that is a good style too - I don't have the cross-reference at hand.
Sometime you need infinite loop, for example listening on port or waiting for connection.
So while(true)... should not categorized as good or bad, let situation decide what to use
It depends on what you’re trying to do, but in general I prefer putting the conditional in the while.
It’s simpler, since you don't need another test in the code.
It’s easier to read, since you don’t have to go hunting for a break inside the loop.
You’re reinventing the wheel. The whole point of while is to do something as long as a test is true. Why subvert that by putting the break condition somewhere else?
I’d use a while(true) loop if I was writing a daemon or other process that should run until it gets killed.
If there's one (and only one) non-exceptional break condition, putting that condition directly into the control-flow construct (the while) is preferable. Seeing while(true) { ... } makes me as a code-reader think that there's no simple way to enumerate the break conditions and makes me think "look carefully at this and think about carefully about the break conditions (what is set before them in the current loop and what might have been set in the previous loop)"
In short, I'm with your colleague in the simplest case, but while(true){ ... } is not uncommon.
The perfect consultant's answer: it depends. Most cases, the right thing to do is either use a while loop
while (condition is true ) {
// do something
}
or a "repeat until" which is done in a C-like language with
do {
// do something
} while ( condition is true);
If either of these cases works, use them.
Sometimes, like in the inner loop of a server, you really mean that a program should keep going until something external interrupts it. (Consider, eg, an httpd daemon -- it isn't going to stop unless it crashes or it's stopped by a shutdown.)
THEN AND ONLY THEN use a while(1):
while(1) {
accept connection
fork child process
}
Final case is the rare occasion where you want to do some part of the function before terminating. In that case, use:
while(1) { // or for(;;)
// do some stuff
if (condition met) break;
// otherwise do more stuff.
}
I think the benefit of using "while(true)" is probably to let multiple exit condition easier to write especially if these exit condition has to appear in different location within the code block. However, for me, it could be chaotic when I have to dry-run the code to see how the code interacts.
Personally I will try to avoid while(true). The reason is that whenever I look back at the code written previously, I usually find that I need to figure out when it runs/terminates more than what it actually does. Therefore, having to locate the "breaks" first is a bit troublesome for me.
If there is a need for multiple exit condition, I tend to refactor the condition determining logic into a separate function so that the loop block looks clean and easier to understand.
No, that's not bad since you may not always know the exit condition when you setup the loop or may have multiple exit conditions. However it does require more care to prevent an infinite loop.
He is probably correct.
Functionally the two can be identical.
However, for readability and understanding program flow, the while(condition) is better. The break smacks more of a goto of sorts. The while (condition) is very clear on the conditions which continue the loop, etc. That doesn't mean break is wrong, just can be less readable.
A few advantages of using the latter construct that come to my mind:
it's easier to understand what the loop is doing without looking for breaks in the loop's code.
if you don't use other breaks in the loop code, there's only one exit point in your loop and that's the while() condition.
generally ends up being less code, which adds to readability.
I prefer the while(!) approach because it more clearly and immediately conveys the intent of the loop.
There has been much talk about readability here and its very well constructed but as with all loops that are not fixed in size (ie. do while and while) you run at a risk.
His reasoning was that you could "forget the break" too easily and have an endless loop.
Within a while loop you are in fact asking for a process that runs indefinitely unless something happens, and if that something does not happen within a certain parameter, you will get exactly what you wanted... an endless loop.
What your friend recommend is different from what you did. Your own code is more akin to
do{
// do something
}while(!<some condition>);
which always run the loop at least once, regardless of the condition.
But there are times breaks are perfectly okay, as mentioned by others. In response to your friend's worry of "forget the break", I often write in the following form:
while(true){
// do something
if(<some condition>) break;
// continue do something
}
By good indentation, the break point is clear to first time reader of the code, look as structural as codes which break at the beginning or bottom of a loop.
It's not so much the while(true) part that's bad, but the fact that you have to break or goto out of it that is the problem. break and goto are not really acceptable methods of flow control.
I also don't really see the point. Even in something that loops through the entire duration of a program, you can at least have like a boolean called Quit or something that you set to true to get out of the loop properly in a loop like while(!Quit)... Not just calling break at some arbitrary point and jumping out,
using loops like
while(1) { do stuff }
is necessary in some situations. If you do any embedded systems programming (think microcontrollers like PICs, MSP430, and DSP programming) then almost all your code will be in a while(1) loop. When coding for DSPs sometimes you just need a while(1){} and the rest of the code is an interrupt service routine (ISR).
If you loop over an external condition (not being changed inside the loop), you use while(t), where t is the condition. However, if the loop stops when the condition changes inside the loop, it's more convenient to have the exit point explicitly marked with break, instead of waiting for it to happen on the next iteration of the loop:
while (true) {
...
a := a + 1;
if (a > 10) break; // right here!
...
}
As was already mentioned in a few other answers, the less code you have to keep in your head while reading a particular line, the better.