High response time in WSO2 DSS - sql

I have created a simple data service using WSO2 DSS for the following simple query.
"SELECT * FROM EMP_VIEW"
"EMP_VIEW" has around 45 columns and 8,500 rows (tuples). My DB instance is Oracle 11g Enterprise Edition and I'm using ojdbc6.jar as the driver. For some reason the data service takes around 14 minutes to return the response when I try it in SoapUI.
But the same query takes 14 seconds or less to retrieve all the records in Oracle SQL Developer or the Eclipse database explorer.
Any idea why the response time is so high?

Not an answer, but a potential direction toward one.
There may be multiple factors at play here. You have shown that the Oracle side is working well (assuming the 14 s response time is acceptable).
You mention that SoapUI takes considerable time. This could be a SoapUI problem: it may wait for all results to be returned (time taken) and then build a full display (more time taken) before showing the result.
The Oracle dev tool could be faster at showing results because it may not wait for the full result set and/or may not spend much time building the display.
Keep in mind that DSS takes the SQL result and converts it to XML. That in itself may add some time, but I suspect SoapUI is spending a significant amount of time decoding the XML and rendering it on your screen.
To narrow down the problem further, I suggest you use another tool:
1. Possibly the TRYIT tool from DSS, to see what timing it gets for the same calls.
2. Write a small client (C#, Java, etc.) and measure the actual time between your request and the response. This will tell you how long DSS is taking versus how long the client takes to build a display.
Please do post your results as this type of information is definitely helpful to others.
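As a rough illustration of option 2, here is a minimal Python sketch of such a client. It separates the time until the first bytes arrive from the time until the whole body has been read, without rendering any XML. The function names, the chunk size, and the `url`, `soap_body` and `soap_action` parameters are placeholders of mine, not anything DSS prescribes.

```python
import time
import urllib.request


def time_stream(chunks, start=None, clock=time.perf_counter):
    """Consume an iterable of byte chunks and report when data arrived.

    Returns (seconds_to_first_chunk, seconds_total, total_bytes).
    """
    if start is None:
        start = clock()
    first = None
    total_bytes = 0
    for chunk in chunks:
        if first is None:
            first = clock() - start
        total_bytes += len(chunk)
    total = clock() - start
    # If nothing arrived, time-to-first-chunk equals the total wait.
    return (total if first is None else first), total, total_bytes


def read_in_chunks(response, size=64 * 1024):
    """Yield the response body piece by piece instead of buffering it all."""
    while True:
        chunk = response.read(size)
        if not chunk:
            break
        yield chunk


def time_soap_call(url, soap_body, soap_action=""):
    """POST a SOAP envelope and time the raw transfer (no XML rendering)."""
    request = urllib.request.Request(
        url,
        data=soap_body.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": soap_action,
        },
    )
    start = time.perf_counter()  # include connect and request time
    with urllib.request.urlopen(request) as response:
        return time_stream(read_in_chunks(response), start=start)
```

If the total time measured this way is close to what the database tools report, the extra minutes are most likely spent in the client's display rather than in DSS.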

As far as I understand and have observed, SoapUI waits until the whole message has been received, so that is where much of the time is spent. When you try curl, the response arrives much sooner.
I used curl to receive 2 MB messages from a DSS service with streaming enabled.
The response was generated in less than one second.

Related

Mule is taking a long time for the simple select for the first execution

I am just using an HTTP listener and a Select in a Mule flow. It is a GET method that takes an ID as input, and the same ID is passed to the Select as its input. The first execution through Mule takes 3 to 4 minutes, but in the DB the query took only milliseconds.
This delay only happens after adding the parameter to the Select.
Can someone help me understand why there is a delay the first time and how to resolve it?
A possible cause is how you created the metadata. For example, if you used a huge CSV file as the example for your data structure, Mule reads the whole file to get the headers, and that takes time.
Solution: if you create metadata by example, use small examples with a couple of rows of data.
Usually the main points that cause performance issues in first executions are:
JVM and Mule Runtime warmup
Time to establish connections
The first one cannot be avoided. For the second one, a connection pool is usually used to mitigate it somewhat. Having said that, 4 minutes is very excessive for either of those. You need to do some performance analysis: add logs before and after the operation in the flow, enable debug logs for the database connector, and even attach a Java profiler to the Mule JVM to understand what could be happening.
You also have to consider whether a high number of records needs to be processed. Even if the database can answer quickly, it might take some time to format the result.

BigQuery Retrieval Times Slow

BigQuery is fast at processing large sets of data, however retrieving large results from BigQuery is not fast at all.
For example, I ran a query that returned 211,136 rows over three HTTP requests, taking just over 12 seconds in total.
The query itself was returned from cache, so no time was spent executing the query. The host server is Amazon m4.xlarge running in US-East (Virginia).
In production I've seen this process take ~90 seconds when returning ~1M rows. Obviously some of this could be down to network traffic, but it seems too slow for that to be the only cause (those 211,136 rows were only ~1.7MB).
Has anyone else encountered such slow speeds when having results returned, and found a resolution?
Update: Reran the test on a VM inside Google Cloud with very similar results, which rules out network issues between Google and AWS.
Our SLO on this API is 32 seconds, and a call taking 12 seconds is normal. 90 seconds sounds too long; it must be hitting some of our system's tail latency.
I understand that it is embarrassingly slow. There are multiple reasons for it, and we are working on improving the latency of this API. By the end of Q1 next year, we should be able to roll out a change that would cut tabledata.list time in half (by upgrading the API frontend to our new One Platform technology). If we have more resources, we would also make jobs.getQueryResults faster.
Concurrent Requests using TableData.List
It's not great, but there is a resolution.
Make a query, and set the max rows to 1000. If there is no page token simply return the results.
If there is a page token then disregard the results*, and use the TableData.List API. However, rather than simply sending one request at a time, send a request for every 10,000 records* in the result. To do this one can use the 'MaxResults' and 'StartIndex' fields. (Note that even these smaller pages may be broken into multiple requests*, so paging logic is still needed.)
This concurrency (and smaller pages) leads to significant reductions in retrieval times. Not as good as BigQ simply streaming all results, but enough to start realizing the gains from using BigQ.
Potential Pitfalls: Keep an eye on the request count, as with larger result sets there could be 100 req/s throttling. It's also worth noting that there's no guarantee of ordering, so using the StartIndex field as pseudo-paging may not always return correct results*.
* Anything with a single asterisk is still an educated guess, but not confirmed as true/best practise.
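The concurrent paging idea can be sketched in Python roughly as follows. This is only an illustration under assumptions: `fetch_page` is a hypothetical callable that wraps one TableData.List request (including any follow-up page tokens) for a given start index and maximum row count, and the worker count of 8 is an arbitrary choice, not a recommended value.

```python
from concurrent.futures import ThreadPoolExecutor


def page_starts(total_rows, page_size):
    """Start index of every page needed to cover total_rows."""
    return list(range(0, total_rows, page_size))


def fetch_all(fetch_page, total_rows, page_size=10_000, workers=8):
    """Fetch all pages concurrently and return rows in start-index order.

    fetch_page(start_index, max_results) is a hypothetical callable that
    performs the actual request and returns a list of rows.
    """
    starts = page_starts(total_rows, page_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the order of `starts`, not completion order.
        pages = pool.map(
            lambda s: fetch_page(s, min(page_size, total_rows - s)), starts
        )
        return [row for page in pages for row in page]
```

Keeping the worker count small is one way to stay under the request throttling mentioned above. Reassembling pages by start index does not remove the ordering caveat, since it only helps if the table itself returns rows in a stable order.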

Google BigQuery: Slow streaming inserts performance

We are using BigQuery as event logging platform.
The problem we faced was very slow insertAll post requests (https://cloud.google.com/bigquery/docs/reference/v2/tabledata/insertAll).
It does not matter where they are fired - from server or client side.
Minimum is 900ms, average is 1500ms, where nearly 1000ms is connection time.
Even if there is 1 request per second (so no throttling here).
We use Google Analytics measurement protocol and timings from the same machines are 50-150ms.
The solution described in BigQuery streaming 'insertAll' performance with PHP suggested using queues, but that seems to be overkill because we send no more than 10 requests per second.
The question is whether 1500ms is normal for streaming inserts and, if not, how to make them faster.
Additional information:
If we send malformed JSON, response arrives in 50-100ms.
Since streaming has a limited payload size (see Quota policy), it's easier to talk about times, as the payload is limited in the same way for both of us, but I will mention other side effects too.
We measure between 1200-2500 ms for each streaming request, and this was consistent over the last month as you can see in the chart.
We have seen several side effects, though:
the request randomly fails with type 'Backend error'
the request randomly fails with type 'Connection error'
the request randomly fails with type 'timeout' (watch out here, as only some rows are failing and not the whole payload)
some other error messages are non descriptive, and they are so vague that they don't help you, just retry.
we see hundreds of such failures each day, so they are pretty much constant, and not related to Cloud health.
For all of these we opened cases with paid Google Enterprise Support, but unfortunately they didn't resolve them. It seems the recommended option for these is exponential backoff with retry; even support told us to do so. Personally, that doesn't make me happy.
Also, the failure rate fits the 99.9% uptime we have in the SLA, so there are no grounds for objection.
There's something to keep in mind regarding the SLA: it's a very strictly defined structure, and the details are here. The 99.9% is uptime and does not translate directly into a failure rate. What this means is that if BQ has 30 minutes of downtime in one month, and you do 10,000 inserts within that period but no inserts at other times of the month, the numbers will be skewed. This is why we suggest an exponential backoff algorithm. The SLA is explicitly based on uptime and not error rate, but logically the two correlate closely if you do streaming inserts throughout the month at different times with a backoff-retry setup. Technically, you should experience on average about 1 failed insert in 1,000 if you are doing inserts throughout the month and have set up a proper retry mechanism.
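For readers unfamiliar with the pattern, a minimal Python sketch of exponential backoff with retry might look like the following. The base delay, cap, retry count and use of full jitter are assumptions chosen for illustration, not values prescribed by BigQuery, and `operation` stands in for whatever function performs the insert.

```python
import random
import time


def backoff_delays(retries, base=0.5, cap=32.0):
    """Upper bound of the wait before each retry: base * 2**attempt, capped."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]


def call_with_backoff(operation, retries=5, base=0.5, cap=32.0,
                      sleep=time.sleep, jitter=random.random):
    """Run operation(); on an exception wait and retry, re-raising at the end.

    Catching Exception is deliberately broad for this sketch; real code
    would retry only on errors known to be transient.
    """
    for delay in backoff_delays(retries, base, cap):
        try:
            return operation()
        except Exception:
            # Full jitter: sleep a random fraction of the current bound so
            # that many clients do not retry in lockstep.
            sleep(delay * jitter())
    return operation()  # final attempt; any exception propagates
```

With the defaults this makes up to six attempts, waiting at most 0.5, 1, 2, 4 and 8 seconds between them.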
You can check out this chart about your project health:
https://console.developers.google.com/project/YOUR-APP-ID/apiui/apiview/bigquery?tabId=usage&duration=P1D
As it happens, the answer in the other linked article is mine. I proposed queues because they made our exponential backoff with retry very easy, and working with queues is very easy. We use Beanstalkd.
In my experience any request to BigQuery will take a long time. We've tried using it as a database for performance data but are eventually moving away from it due to slow response times. As far as I can see, BQ is built for handling big requests within a 1 - 10 second response time. These are the requests BQ categorizes as interactive. BQ doesn't get faster by doing less. We stream quite a lot of records to BQ but always make sure we batch them up (per table), and we run all requests asynchronously (or, if you have to, in another thread).
PS. I can confirm what Pentium10 says about failures in BQ. Make sure you retry the stuff that fails, and if it fails again, log it to a file so you can retry it another time.
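A minimal sketch of the per-table batching described above, in Python. The `send` callable is hypothetical (it would wrap one insertAll request for a table), and the 500-row threshold is an assumption for illustration only.

```python
from collections import defaultdict


class RowBatcher:
    """Collect rows per table and hand them to `send` in batches."""

    def __init__(self, send, max_rows=500):
        self._send = send          # hypothetical callable: send(table, rows)
        self._max_rows = max_rows  # assumed batch size, tune as needed
        self._pending = defaultdict(list)

    def add(self, table, row):
        """Queue one row; flush that table once its batch is full."""
        rows = self._pending[table]
        rows.append(row)
        if len(rows) >= self._max_rows:
            self.flush(table)

    def flush(self, table=None):
        """Send pending rows for one table, or for all tables if None."""
        tables = [table] if table is not None else list(self._pending)
        for name in tables:
            rows = self._pending.pop(name, [])
            if rows:
                self._send(name, rows)
```

In practice `flush()` would also be called on a timer and at shutdown so that partially filled batches are not left waiting, and `send` could be run on a worker thread to keep the caller asynchronous.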

Bigquery Stream Benchmark

BigQuery has officially become our device log data repository and our base for live monitoring, analysis, and diagnostics. As a next step, we need to measure and monitor data streaming performance. Are you using any relevant benchmarks for BigQuery live streaming? Which ones can I refer to?
Since streaming has a limited payload size (see Quota policy), it's easier to talk about times and other side effects.
We measure between 1200-2500 ms for each streaming request, and this was consistent over the last month as you can see in the chart.
We have seen several side effects, though:
the request randomly fails with type 'Backend error'
the request randomly fails with type 'Connection error'
the request randomly fails with type 'timeout'
some other error messages are non descriptive, and they are so vague that they don't help you, just retry.
we see hundreds of such failures each day, so they are pretty much constant, and not related to Cloud health.
For all of these we opened cases with paid Google Enterprise Support, but unfortunately they didn't resolve them. It seems the recommended option for these is exponential backoff with retry; even support told us to do so. Personally, that doesn't make me happy.
UPDATE
Someone requested new stats in the comments, so I posted the 2017 ones. It's still the same. There was some heavy data reorganization for us, which is the spike you see, but essentially it's the same: around 2 seconds if you use the maximum streaming insert payload.

Best Way to Transmit LARGE data packages via SOAP web service

We are working with a .NET 3.5 app which is fast approaching legacy status. We have an existing SOAP service which reads records from our database and saves them to a third party MS SQL database, sending all the data rows in a single batch.
This has always worked fine, but recently we've taken on a much larger client than any we've had before, and they are transmitting much larger batches, so much so that they have begun to fail. We've upped the time out and max memory sizes in IIS, and maxed out the maxRequestLength in the web.config, but we are still bumping up against size problems.
So, I understand that long term we should consider moving away from SOAP and into WCF, and plans for that are in the works. But in the meantime, we need a short-term fix for this new client. And of course, to make the business and sales people happy, we need it kinda quickly.
I'm wondering what the best-practice approach might be. Initially I'm thinking something like this, but I could be thinking inside the box too much:
Establish a benchmark for the number of records above which we don't want to attempt to sync all at once.
Before attempting to save the data, check the number of records against that benchmark.
If it's above it, then break the transmission down into segments which are each below that benchmark. SELECT TOP 10000 * FROM table WHERE sent = false, etc., if the benchmark is 10000. Then update sent to true for those records once submitted. Repeat.
Obviously, this will slow the process down, so to handle the user experience, we may want to toss in a status bar so they can see the progress.
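To make the segmenting idea concrete, here is a rough sketch, written in Python for brevity rather than in the app's .NET. `send_batch` and `on_progress` are hypothetical callables standing in for the SOAP call and the status bar update, and marking records as sent after each batch is left out.

```python
def chunked(records, size):
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def sync_in_batches(records, send_batch, batch_size=10_000, on_progress=None):
    """Send records in batches below the benchmark; return the count sent.

    send_batch(batch) is a hypothetical callable that performs one service
    call; on_progress(sent, total) could drive a status bar.
    """
    sent = 0
    for batch in chunked(records, batch_size):
        send_batch(batch)
        sent += len(batch)
        if on_progress is not None:
            on_progress(sent, len(records))
    return sent
```

For example, 25 records with a batch size of 10 would go out as three calls of 10, 10 and 5 records.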
Am I on the right track?
In addition to the comments from John, you should consider whether you are solving the problem in the best way.
It looks like you are triggering a one-way sync between two databases by calling a web service. This approach leads to the timeout and memory problems that you are experiencing.
If your goal is to do the one-way sync, you could use a free framework such as Microsoft's Sync Framework: http://msdn.microsoft.com/en-US/sync