New to threads - Advanced C++ Community

New to threads Dec. 20th, 2007 @ 05:41 pm
I've never written any threaded applications before, but I have one that could probably benefit from it. I'm looking at converting it over using Boost.

I have found the Dr. Dobbs article, which seems to have the best and most numerous examples of Boost threading, but I'm still unclear on how to actually get the different threads to execute in parallel. I'm really just starting out here, so maybe I'm just misinterpreting, but it looks like in the examples the threads are executed serially, which doesn't make any sense. I must misunderstand something.

I admit my impatience with reading the whole article in detail, because my particular application seems--to my naive self--to naturally avoid the major pitfalls of threading.

Basically, it creates a collection of Recipients, and for each recipient:
- queries for some information to email to the recipient
- emails the information to the Recipient
- deletes the database records that led to the email being sent (no two recipients will share a record, and this is MySQL, so there will be no table locks)
- logs the sending of the email in another database table
- updates a record of the last time the user received an email

Rather than processing each recipient serially in a loop, I'd like to be able to spawn a thread for each recipient. I'm thinking that I need to combine all of the per-recipient tasks into a single function and then spawn a new thread of that function for each recipient. Am I on the right track? Do I need to use a thread_group so I can dynamically create threads rather than instantiating specific named threads, as is done in all of the Dr. Dobbs examples?

Here is the gist of the application. The mailer is a functor.


  try {
    bool recipients_found = load_recipient_data();
    if (recipients_found) {
      Summary_Mailer mailer(get_url_domain(), get_bounce_address(), destination_template, mail_drop, verbose);

      for (summary_recipient_hash_map::iterator recipient_it = recipients.begin(); recipient_it != recipients.end(); ++recipient_it) {
        set_recipient_documents(&(recipient_it->second));

        if (!recipient_it->second.documents.empty()) {
          mailer(&(recipient_it->second));
          cleanup_research_summary_email(recipient_it->second);
          insert_email_sent_summary_research_activity(recipient_it->second);
        }

        update_user_research_summary_options(recipient_it->second);
      }
    }

    user_db.logoff();
    document_db.logoff();

  } 



Any applicable advice appreciated. Alliteration optional.
From:ataxi
Date:December 21st, 2007 03:36 am (UTC)

This sort of thing ...

You create a boost::thread with a boost::function0<void> (or equivalently, a boost::function<void ()>, or anything convertible to either).

In this case your recipient is a parameter of the threaded activity, so you could create a function that looks like
// YourRecipientClass is the map's value_type, so rec.second is the recipient.
void processRecipient(YourRecipientClass& rec)
{
  set_recipient_documents(&rec.second);
  if (!rec.second.documents.empty())
  {
    mailer(&rec.second);  // mailer must be visible here, e.g. passed in or global
    cleanup_research_summary_email(rec.second);
    insert_email_sent_summary_research_activity(rec.second);
  }
  update_user_research_summary_options(rec.second);
}
Then in order to spawn the concurrent tasks you'd do something like
typedef boost::shared_ptr<boost::thread> ThreadPtr;
std::vector<ThreadPtr> tasks;

for (summary_recipient_hash_map::iterator recipient_it = recipients.begin();
     ...; ...)
{
  // each boost::thread starts running processRecipient as soon as it is constructed
  tasks.push_back(ThreadPtr(new boost::thread(boost::bind(&processRecipient, *recipient_it))));
}

// tasks are running now, so join them all on completion
std::vector<ThreadPtr>::iterator iter = tasks.begin();
std::vector<ThreadPtr>::iterator fin = tasks.end();

for (; iter != fin; ++iter)
{
  (*iter)->join();
}
Boost threads begin executing their thread functions on construction, and the thread::join method causes the joining thread to wait for the thread being joined to complete.

Because Boost threads aren't copy-constructible, I think it's necessary to pass them around as pointers or shared_ptrs when creating collections of threads. I haven't experimented with thread_group myself, but perhaps it provides a simpler way of doing that.

Finally boost::bind is an incredibly useful toy.
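For comparison, here is a minimal sketch of the same spawn-then-join pattern. It swaps in std::thread from C++11 (which was modeled on boost::thread) so that it builds with only the standard library; that is a substitution, not what the code above uses. Because std::thread is movable, the threads can live directly in a std::vector without shared_ptr. The Recipient struct and the function names are hypothetical stand-ins, not the poster's real types.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the application's recipient type.
struct Recipient {
    std::string address;
    bool processed;
    explicit Recipient(const std::string& a) : address(a), processed(false) {}
};

// All per-recipient work lives in one function, as suggested above.
// The body is a placeholder for the query, mail, cleanup and logging steps.
void process_recipient(Recipient& rec) {
    rec.processed = true;
}

// Start one thread per recipient, then wait for every thread to finish.
void process_all(std::vector<Recipient>& recipients) {
    std::vector<std::thread> tasks;
    tasks.reserve(recipients.size());
    for (std::size_t i = 0; i < recipients.size(); ++i) {
        // std::thread copies its arguments by default;
        // std::ref makes it pass a reference to the original instead.
        tasks.emplace_back(process_recipient, std::ref(recipients[i]));
    }
    for (std::size_t i = 0; i < tasks.size(); ++i) {
        tasks[i].join();  // blocks until that thread's function returns
    }
}
```

Calling process_all on a vector of Recipient objects returns only after every recipient has been handled. The vector must not be resized while the threads run, since each thread holds a reference into it.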
From:ataxi
Date:December 21st, 2007 03:40 am (UTC)

Re: This sort of thing ...

This assumes that the various functions being carried out on each recipient don't share data, because then your concurrent tasks will probably conflict horribly.

And I missed some markup escapes at the top (boost::function0 should be boost::function0<void>).
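If the per-recipient functions did turn out to share something, the usual remedy is to guard the shared data with a mutex. A rough sketch follows, using std::mutex and std::thread from the C++11 standard library in place of the Boost equivalents so it is self-contained. SendLog, record_send and count_sends are made-up names for illustration, not part of the application discussed here.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared state: a running count of emails sent by all threads.
struct SendLog {
    std::mutex guard;
    int sent;
    SendLog() : sent(0) {}
};

// Only one thread at a time may hold the lock, so the increment is never interleaved.
void record_send(SendLog& log) {
    std::lock_guard<std::mutex> lock(log.guard);
    ++log.sent;
}

// Run thread_count threads that each record sends_per_thread sends.
int count_sends(std::size_t thread_count, int sends_per_thread) {
    SendLog log;
    std::vector<std::thread> tasks;
    for (std::size_t i = 0; i < thread_count; ++i) {
        tasks.emplace_back([&log, sends_per_thread] {
            for (int n = 0; n < sends_per_thread; ++n) {
                record_send(log);
            }
        });
    }
    for (std::size_t i = 0; i < tasks.size(); ++i) {
        tasks[i].join();
    }
    return log.sent;  // safe to read without the lock once every thread has been joined
}
```

Without the lock_guard, two threads could read and write the counter at the same time and some increments would be lost.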
From:ataxi
Date:December 21st, 2007 03:42 am (UTC)

Re: This sort of thing ...

Oh, and it should be
ThreadPtr(new boost::thread(...
not
new ThreadPtr(boost::thread...
Feh.
From:jedipussytricks
Date:December 21st, 2007 05:11 am (UTC)

Re: This sort of thing ...

Thanks! So very helpful. I'll give it a whirl when I get into work tomorrow.

The recipients don't share any data at all that I can think of, and I didn't write the app that long ago (maybe a year and a half ago, with recent maintenance) so I'm pretty sure it's all in the clear.

If I can get this one working nicely with threads, I've got another very related app (it even shares many base classes) that I might do as well. Wheee!!
From:ataxi
Date:December 21st, 2007 08:25 am (UTC)

Re: This sort of thing ...

No problem -- couple of other things ... it may not be particularly efficient to explicitly join the tasks immediately after you launch them.

Also, you need to be fairly sure that the task loops themselves are able to complete in a timely fashion in all cases, because boost::thread doesn't provide a cancellation mechanism. Thread cancellation is a bad thing anyway. Calls that could block indefinitely within the task loop are therefore a bad idea.

And there's a certain point where the overhead of creating yet another thread, and of context switching between all of them, is greater than the benefit of running everything concurrently. When you're in that situation you want to start using a thread pool and queueing your jobs.

I'm a fan of the boost threading API, it's far nicer to work with than pthreads, or the abstractions that most C++ people tend to write around them.
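To make the thread pool idea concrete, here is a bare-bones sketch: a fixed number of worker threads pull jobs off a shared queue. It is written against the C++11 standard library rather than Boost so it stands alone, and WorkerPool and run_jobs are hypothetical names. Treat it as an illustration of the idea under those assumptions, not a production pool.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class WorkerPool {
public:
    explicit WorkerPool(std::size_t worker_count) : stopping_(false) {
        for (std::size_t i = 0; i < worker_count; ++i) {
            workers_.emplace_back([this] { run(); });
        }
    }

    // Finishes every job still queued, then joins the workers.
    ~WorkerPool() {
        {
            std::lock_guard<std::mutex> lock(guard_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::size_t i = 0; i < workers_.size(); ++i) {
            workers_[i].join();
        }
    }

    // Queue a job; one idle worker is woken to pick it up.
    void submit(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(guard_);
            jobs_.push(std::move(job));
        }
        wake_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(guard_);
                wake_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) {
                    return;  // only reached when stopping and the queue is drained
                }
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run the job outside the lock so other workers can proceed
        }
    }

    std::mutex guard_;
    std::condition_variable wake_;
    std::queue<std::function<void()> > jobs_;
    bool stopping_;
    std::vector<std::thread> workers_;
};

// Hypothetical helper: push job_count trivial jobs through the pool and count completions.
int run_jobs(std::size_t worker_count, int job_count) {
    std::atomic<int> done(0);
    {
        WorkerPool pool(worker_count);
        for (int i = 0; i < job_count; ++i) {
            pool.submit([&done] { ++done; });
        }
    }  // the pool's destructor drains the queue and joins the workers here
    return done.load();
}
```

With this shape, the number of threads stays fixed no matter how many recipients there are; each recipient becomes one queued job instead of one new thread.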
From:lertulo
Date:December 21st, 2007 07:06 am (UTC)
I hate to be thoroughly non-helpful, but can you put the snippet behind a cut? It's too wide, and screwing up the formatting for all my other stuff.
From:jedipussytricks
Date:December 21st, 2007 05:54 pm (UTC)
Sure, no problem.
From:omnifarious
Date:December 21st, 2007 08:00 am (UTC)

Yes, I too would like you to put this behind a cut.

The snippet is too wide and messes up people's friends-list pages. Not sure if I'll delete your post over this, but I may.

From:jedipussytricks
Date:December 21st, 2007 05:54 pm (UTC)

Re: Yes, I too would like you to put this behind a cut.

Alright, Mr. Crankypants.
From:omnifarious
Date:December 21st, 2007 07:37 pm (UTC)

Re: Yes, I too would like you to put this behind a cut.

I don't think that putting the word 'crank' and the word 'pants' in the same sentence (much less the same word) is a place we really want to go today. ;-)


Thanks for putting the code behind a cut. :-)
