Interspire Emailmarketer poor performance: 5 emails per 2 seconds - PMTA/EXIM - exim

Can anyone tell me what is holding us back.
I tried every different php script in front end to send emails. Interspire, Oempro, PHP list, PHPknode. But we are only able to send 5 emails every 2 seconds.
We upgraded our server, Our H/W configuration is good. We have used EXIM, We even tried PMTA. Even though our sending speed does not improved.
Our requirement is to send 200,000 - 300,000 emails a day But we need to send this in peak hours i.e. between 9am to 1pm. We are only able to send 15000 emails in 6-7 hours.
I don't know what is the problem, Why are we not able to send emails quickly. Is it because of the PHP script, MTA or the server h/w configuration.
Can anyone please help me with this problem? Any help will be appreciated.

I can tell you directly that Interspire Emailmarketer is not especially high-performing. I had a similar situation as you do. We had a high-end server machine, with SAS disks, 16 CPU cores and lots of RAM. We had a highly fine-tuned Postfix MTA and MySQL server (spent a few days configuring those). The performance you get matches our experience. The load in our case was entirely in the PHP script, not the database and not in the MTA.
I suspect that the Interspire software is meant for very low-traffic newsletters (where receivers can be counted in the hundreds).

Interspire by default uses a single php process to process the email queue, and thus it's unable to use multi-core machines. There is a paid multi-processing script called MSH addon which takes IEM processing queue and distributes it across several processor cores for massive speed bonus. From the addons website:
MSH is built around "multi processing library", a multi platform
,shared nothing, multi processing framework inspired by Python
multiprocessing module(but very different from api level). It uses
"proc family functions" for process spawning and "soq" for IPC.
Disclaimer: I am one of the developers of MSH addon.

Related

ColdFusion 11 to 2018 Upgrade -- Server Locking Up, How to Test Better?

We are currently testing an upgrade from CF11 to CF2018 for my company's intranet. To give you an idea how long this site has been running, our first version of CF was 3.1! It is still using application.cfm, and there is code from 1998, when I started writing this thing. Yes, 21 years -- I'm astonished, too. It is a hodgepodge of all kinds of older frameworks, too, including Fusebox.
Anyway, we're running Win 2012 VM connected to a SQL 2016 farm. Everything looked OK initially, but in the Week I've been testing, the server has come to a slowdown once (a page took more than 5 seconds to run, something that usually takes 100ms, no DB involvement), and another time, the server came to a grinding halt. The only way I could restart CF App service was by connecting to the server with another server via Services, because doing it via Remote Desktop was so slow.
Now keep in mind -- it's just me testing. This is a site that doesn't have a ton of users, but still, having 5 concurrent connections is normal and there are upwards of 200-400 users hitting this thing every day.
I have FusionReactor running on this thing now, so the next time a lockup happens, I will be able to take a closer look, but what do you think is the best way I can test this? Our site is mostly transactional, users going and filling out forms to put internal orders through. We also connect to XML web services and REST services; we also provide REST services, too. Obviously there's no way to completely replicate a production server's requests onto a test server, but I need to do more thorough testing. Any advice would be hugely appreciated.
I realize your focus for now is trying to recreate the problem on test. That may not be as easy as hoped. Instead, you should be able to understand and resolve it in production. FusionReactor can help, but the answer may well be in the cf logs.
You don't mention assessing the logs at the time of the hangup. See especially the coldfusion-error log, for outofmemory conditions.
You mention raising the heap, but the problem may be with the metaspace instead. If so, consider simply removing the maxmetaspace setting in the jvm args. That may be the sole and likely cause of such new and unexpected outages.
Or if it's not, and there's nothing in the logs at the time, THEN do consider FR. Does IT show anything happening at the time?
If not then consider a need to tune the cf/web server connector. I assume you're using iis. How many sites do you have? And how many connectors (folders in the cf config/wsconfig folder)? What are the settings in their workers.properties file? Are they optimized for the number of sites using that connector?
Also, have you updated cf2018? Are there any errors in the update error log? Did you update the web server connector also?
Are you running the cf2018 pmt (performance monitoring tool set)? Have you updated it?
There could be still more to consider, but let's see how it goes with those. I have blog posts on these and many more topics that would elaborate on things, both at my site (carehart.org) and the Adobe cf portal (coldfusion.adobe.com).
But let's hear if any of this gets you going.

scalability of azure cloud queue

In current project we currently use 8 worker role machines side by side that actually work a little different than azure may expect it.
Short outline of the system:
each worker start up to 8 processes that actually connect to cloud queue and processes messages
each process accesses three different cloud queues for collecting messages for different purposes (delta recognition, backup, metadata)
each message leads to a WCF call to an ERP system to gather information and finally add retreived response in an ReDis cache
this approach has been chosen over many smaller machines due to costs and performance. While 24 one-core machines would perform by 400 calls/s to the ERP system, 8 four-core machines with 8 processes do over 800 calls/s.
Now to the question: when even increasing the count of machines to increase performance to 1200 calls/s, we experienced outages of Cloud Queue. In same moment of time, 80% of the machines' processes don't process messages anymore.
Here we have two problems:
Remote debugging is not possible for these processes, but it was possible to use dile to get some information out.
We use GetMessages method of Cloud Queue to get up to 4 messages from queue. Cloud Queue always answers with 0 messages. Reconnect the cloud queue does not help.
Restarting workers does help, but shortly lead to same problem.
Are we hitting the natural end of scalability of Cloud Queue and should switch to Service Bus?
Update:
I have not been able to fully understand the problem, I described it in the natual borders of Cloud Queue.
To summarize:
Count of TCP connections have been impressive. Actually too impressive (multiple hundreds)
Going back to original memory size let the system operate normally again
In my experience I have been able to get better raw performance out of Azure Cloud Queues than service bus, but Service Bus has better enterprise features (reliable, topics, etc). Azure Cloud Queue should process up to 2K/second per queue.
https://azure.microsoft.com/en-us/documentation/articles/storage-scalability-targets/
You can also try partitioning to multiple queues if there is some natural partition key.
Make sure that your process don't have some sort of thread deadlock that is the real culprit. You can test this by connecting to the queue when it appears hung and trying to pull messages from the queue. If that works it is your process, not the queue.
Also take a look at this to setup some other monitors:
https://azure.microsoft.com/en-us/documentation/articles/storage-monitor-storage-account/
It took some time to solve this issue:
First a summarization of the usage of the storage account:
We used the blob storage once a day pretty heavily.
The "normal" diagonistics that Azure provides out of the box also used the same storage account.
Some controlling processes used small tables to store and read information once an hour for ca. 20 minutes
There may be up to 800 calls/s that try to increase a number to count calls to an ERP system.
When recognizing that the storage account is put under heavy load we split it up.
Now there are three physical storage accounts heaving 2 queues.
The original one still keeps up to 800/s calls for increasing counters
Diagnositics are still on the original one
Controlling information has been also moved
The system runs now for 2 weeks, working like a charm. There are several things we learned from that:
No, the infrastructure is "not just there" and it doesn't scale endlessly.
Even if we thought we didn't use "that much" summarized we used quite heavily and uncontrolled.
There is no "best practices" anywhere in the net that tells the complete story. Esp. when start working with the storage account a guide from MS would be quite helpful
Exception handling in storage is quite bad. Even if the storage account is overused, I would expect some kind of exception and not just returning zero message without any surrounding information
Read complete story here: natural borders of cloud storage scalability
UPDATE:
The scalability has a lot of influences. You may are interested in Azure Service Bus: Massive count of listeners and senders to be aware of some more pitfalls.

Optimizing long polling on dedicated server

Right now, I am hosting a site on a dedicated server, 8 GB ram, Intel Xeon E3 1230 V3. I am using long polling techniques in order to display information which gets added into a database consistently.
The problem is: so far, after around let's say 20 users come onto the site, it starts lagging and slowing down dramatically. I'm pretty sure the server is strong enough to handle way more people than that. Thus, I am not sure what exactly is the problem. Can long polling using Apache handle that many users? If not, how should I implement real-time information being displayed. And if it can, how should I configure Apache or anything in order to handle around 500-1000 concurrent users.
Any help is appreciated.

Bigquery streaming inserts taking time

During load testing of our module we found that bigquery insert calls are taking time (3-4 s). I am not sure if this is ok. We are using java biguqery client libarary and on an average we push 500 records per api call. We are expecting a million records per second traffic to our module so bigquery inserts are bottleneck to handle this traffic. Currently it is taking hours to push data.
Let me know if we need more info regarding code or scenario or anything.
Thanks
Pankaj
Since streaming has a limited payload size, see Quota policy it's easier to talk about times, as the payload is limited in the same way to both of us, but I will mention other side effects too.
We measure between 1200-2500 ms for each streaming request, and this was consistent over the last month as you can see in the chart.
We seen several side effects although:
the request randomly fails with type 'Backend error'
the request randomly fails with type 'Connection error'
the request randomly fails with type 'timeout' (watch out here, as only some rows are failing and not the whole payload)
some other error messages are non descriptive, and they are so vague that they don't help you, just retry.
we see hundreds of such failures each day, so they are pretty much constant, and not related to Cloud health.
For all these we opened cases in paid Google Enterprise Support, but unfortunately they didn't resolved it. It seams the recommended option to take for these is an exponential-backoff with retry, even the support told to do so. Which personally doesn't make me happy.
The approach you've chosen if takes hours that means it does not scale, and won't scale. You need to rethink the approach with async processes. In order to finish sooner, you need to run in parallel multiple workers, the streaming performance will be the same. Just having 10 workers in parallel it means time will be 10 times less.
Processing in background IO bound or cpu bound tasks is now a common practice in most web applications. There's plenty of software to help build background jobs, some based on a messaging system like Beanstalkd.
Basically, you needed to distribute insert jobs across a closed network, to prioritize them, and consume(run) them. Well, that's exactly what Beanstalkd provides.
Beanstalkd gives the possibility to organize jobs in tubes, each tube corresponding to a job type.
You need an API/producer which can put jobs on a tube, let's say a json representation of the row. This was a killer feature for our use case. So we have an API which gets the rows, and places them on tube, this takes just a few milliseconds, so you could achieve fast response time.
On the other part, you have now a bunch of jobs on some tubes. You need an agent. An agent/consumer can reserve a job.
It helps you also with job management and retries: When a job is successfully processed, a consumer can delete the job from the tube. In the case of failure, the consumer can bury the job. This job will not be pushed back to the tube, but will be available for further inspection.
A consumer can release a job, Beanstalkd will push this job back in the tube, and make it available for another client.
Beanstalkd clients can be found in most common languages, a web interface can be useful for debugging.

Multi-threaded performance testing MS SQL server DB

Let's assume the following situation:
I have a database server that uses 4 core CPU;
My machine has 2 core CPU;
Assume they are of equal speed in terms of GHZ;
Systems are connected over a network (two lines 200mb/s each);
Test tool that I use provides # of threads parameter and will issue commands in parallel to the server.
QUESTIONS:
How would you test parallel reads/writes via stored procedure? Please brainstorm as any advice is appreciated;
How can I prove that many threads are executing the queries on the server (or should I not pay attention to this as this servers and DB's responsibility)?
What controls how many threads are executed at any time primarily in case of SQL server? I checked the "server properties" > processors > # of processors and threads section - waht more should I check?
How can I check that my application truly executes on all my machine cores - in other words - uses real threads instead of virtual ones? Or should I pay attention only to the virtual ones?
Should I pay attention to the network bandwidth? Can it be a bottleneck (I dont' send any big data, only commands with variables).
1.) not sure perhaps someone else can answer
2.) SQL Sentry allows you to monitor your SQL activity (use the free trial and buy if you like it)
3.) Max Dop controls the number of processors & also the cost threshold will affect parrallelism
4.) Same as 2 perhaps, i'm not sure i understand
5.) Depends on what you are doing are where you see aproblem SQL sentry will show wait stats that may help