Testing an HTTP client

This project uses HTTP client libraries to poll an HTTP server for an XML file containing data gathered from hardware. Polling happens relatively fast. The data changes over time. Only one XML file is ever polled.
Is there a testing method/tool that can act as the HTTP server and feed the client an XML file based on the time it is polled?
Basically, what I'm trying to do is send XML data that may change on each poll. Each version of the data is pre-determined for testing.
One idea I've had is a log-rotator script cron'ed at the polling frequency to check out and replace each version of the data in /var/log/www and let Apache handle the rest. However, this does not tightly control which version is served on each poll, as network delay may cause files to be replaced before the data is served. Every version of the data must be served, and no version may be skipped.
Any solutions/thoughts/methods/ideas will be appreciated.
Thanks

If you are attempting to perform unit tests of specific functionality, I would suggest mocking the HTTP response and going from there. It is relatively easy to set up and then very easy to modify.
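To illustrate, one way to make the server side deterministic in a test is a throwaway HTTP server that serves a pre-determined sequence of XML versions, one per poll, so no version can be skipped or overwritten mid-poll. This is a sketch using Python's standard library; the payloads and URL are placeholders, and the `urllib` polls stand in for your real client:

```python
import http.server
import threading
import urllib.request

# Pre-determined XML versions; each poll receives the next one in order,
# so no version can be skipped or replaced before it is served.
VERSIONS = [b"<data version='1'/>", b"<data version='2'/>", b"<data version='3'/>"]

class SequencedHandler(http.server.BaseHTTPRequestHandler):
    index = 0  # class-level counter shared across requests

    def do_GET(self):
        body = VERSIONS[min(SequencedHandler.index, len(VERSIONS) - 1)]
        SequencedHandler.index += 1
        self.send_response(200)
        self.send_header("Content-Type", "application/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep test output quiet
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), SequencedHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/data.xml" % server.server_address[1]

# Stand-in for the real client's polling loop.
polled = [urllib.request.urlopen(url).read() for _ in range(3)]
server.shutdown()
print(polled == VERSIONS)  # → True: every version served, in order
```

Because the server hands out the next version only when a poll actually arrives, network delay cannot cause a version to be replaced before it is served, which addresses the race in the log-rotator idea.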

Related

Apache NiFi use in data science

I'm trying to create a web service dataflow using Apache NiFi. I've set up the request and response HTTP processors; however, I can't seem to figure out how to update the flowfile from the request processor with data from, say, another connection. Can someone please let me know how I can achieve this behaviour?
What is the use of Apache NiFi? Is it used in data science, or is it just a tool for working on some kind of data? What exactly does Apache NiFi do?
NiFi is a data orchestration and ETL/ELT tool. It's not meant for data science in the sense that data science is primarily about analytics. It's used by data engineers to process data, move data and things like that. These are the tasks that tend to happen prior to data science work.
I can't seem to figure out how to update the flowfile from the request processor with data from say
Use InvokeHTTP. You can configure it to "always output response" and then you will have the response to work with. NiFi won't automagically merge the response with the data you send, so you would need the REST APIs you wrote to return output that you find satisfactory on the NiFi side. That's a common pattern for REST-based enrichment.

JMeter run headless with 5000 users: which is the better option to save results, CSV or XML (JTL)?

Is it good to save the data when JMeter is running under high load and distributed load testing?
We are running JMeter with 5000 users on an AWS server and need to report the following:
throughput over time and over active users,
response time over TPS,
aggregate report.
So which is the better option to save: CSV or XML (JTL)?
JTL is the common JMeter Test Log file extension; the currently supported formats are CSV and XML.
Generally it is better to use the CSV format where possible (it is, by the way, the default format), as generating XML is more resource-intensive. Good practice is to stick to CSV output and temporarily switch to XML when you need to inspect request/response details for debugging/troubleshooting purposes.
So make sure you have the following entry in user.properties file:
jmeter.save.saveservice.output_format=csv
For distributed execution it is also important to choose an appropriate sample sender to limit the data sent from the slaves to the master to the necessary minimum. The recommended value is
mode=StrippedBatch
however, this mode removes response data from successful samplers, so your PostProcessors and/or Assertions may stop working.
References:
JMeter Best Practices
9 Easy Solutions for a JMeter Load Test “Out of Memory” Failure

CMIS: cache data on server side

I'm writing a CMIS interface (server) for my application. The server needs to load data from a database to process requests. At the moment I'm loading the same data for every request.
Is there a common way to cache this data? Are cookies supported by every CMIS client? Is there another way to cache this data?
Thank you
You should not rely on cookies. Several clients and client libraries support them, but not all. Cookies can help and you should make use of them, but be prepared for simple clients without cookie support.
Since your data is usually bound to a user, you can build a cache keyed on the username. But what you can and should cache depends on your repository and your use case. Repository infos and type definitions are good candidates, but you should be careful with everything else.
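A username-keyed cache of that kind can be sketched generically. The names and TTL below are hypothetical, not CMIS API calls; in a real server the loader would be your database query for repository info and type definitions:

```python
import time

# Hypothetical sketch of a per-user cache: repository info and type
# definitions are cached per username with a TTL, so repeated requests
# skip the database load. Names are illustrative, not CMIS API calls.
class PerUserCache:
    def __init__(self, loader, ttl_seconds=60):
        self._loader = loader    # e.g. your database query
        self._ttl = ttl_seconds
        self._entries = {}       # username -> (expiry, data)

    def get(self, username):
        now = time.monotonic()
        entry = self._entries.get(username)
        if entry and entry[0] > now:
            return entry[1]      # cache hit: no database round-trip
        data = self._loader(username)
        self._entries[username] = (now + self._ttl, data)
        return data

calls = []
def load_from_db(user):          # stand-in for the real database load
    calls.append(user)
    return {"repository_info": "...", "user": user}

cache = PerUserCache(load_from_db, ttl_seconds=60)
cache.get("alice")
cache.get("alice")               # second call is served from the cache
print(len(calls))  # → 1
```

The TTL keeps stale data bounded; anything more volatile than repository infos and type definitions would need a shorter TTL or explicit invalidation.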

Measuring request processing time at the JBoss server using mod_jk

My web application architecture is: Apache 2 (as load balancer) + JBoss 3.2.3 + MySQL 5.0.19.
I want to measure the request processing time (for every individual request) spent on the JBoss server only (i.e., excluding time spent on the Web and database servers).
I've been researching how to log request processing time on the application tier only. I found mod_jk logging, Apache's mod_log_config, and Tomcat's AccessLogValve as possible methods.
Using mod_jk logging: my understanding is that mod_jk logging provides the request processing time for each request, calculated as the difference between the time a request leaves the Apache server and the time the corresponding response is received by the Apache server. Please correct me if this is not accurate/correct.
Using Apache's mod_log_config module (http://www.lifeenv.gov.sk/tomcat-docs/jk/config/apache.html): by adding "%{JK_REQUEST_DURATION}n" to the LogFormat (the JKLogFile) construct (see the link above). JK_REQUEST_DURATION captures the overall Tomcat processing time from Apache's perspective.
The times in the above cases include Tomcat/JBoss + MySQL processing time, which won't help in my case since it includes MySQL processing time; I want to record request processing time on JBoss only. Any suggestions/ideas are appreciated.
Using AccessLogValve: it can log the "time taken to process the request, in millis" by setting %D in the pattern attribute of the AccessLogValve XML construct. It is not very clear whether this time is:
(1) the time required by Tomcat/JBoss to serve a request (e.g., to allocate a worker thread to handle it);
(2) the time taken to process a request and send it to the database server (the overall time on the Tomcat/JBoss server); or
(3) the time taken by Tomcat/JBoss to process a request and send a response back to the Web server/client.
Any idea/clue?
This is the experience/research I want to share. I would appreciate it if anyone who has had a similar problem, or knows a way to do this, could share their experience/pointers/thoughts on where a better solution can be found.
Looking forward to your thoughts/suggestions.
Why do you want to exclude the database time? Time spent on the database is time your application is waiting, exactly as it could be waiting for other resources, e.g. Lucene indexing to finish, a remote HTTP request to complete, etc.
If you really want to exclude the DB access time, you need to instrument your application with timer start/stop instructions. This will definitely need to go inside your application (either "cleanly" via AOP or manually via start/stop statements at critical points in the app) and cannot simply be a configuration from the outside world (e.g. an Apache module).
So, you'll need to start the timer when you receive the request at the very start of the processing chain (a filter works well here), stop it every time you send a query, and start it again exactly after the query returns. This of course cannot be 100% complete, especially if you use a transparent ORM such as Hibernate, because you can sometimes execute queries indirectly, i.e. when traversing a collection or an association in plain Java.
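The start/stop bookkeeping described above can be sketched as a small accumulator. The names are hypothetical and the sleeps simulate work; in a servlet filter you would keep one instance per request (e.g. in a ThreadLocal):

```python
import time

# Hypothetical sketch of the start/stop instrumentation described above:
# accumulate wall-clock time per request, pausing the clock around each
# database query so only application time is counted.
class RequestTimer:
    def __init__(self):
        self._started = None
        self.app_time = 0.0  # seconds spent in the application only

    def start(self):   # at the very start of the processing chain (filter)
        self._started = time.monotonic()

    def pause(self):   # just before sending a query to the database
        self.app_time += time.monotonic() - self._started
        self._started = None

    def resume(self):  # exactly after the query returns
        self._started = time.monotonic()

    def stop(self):    # when the response is written
        if self._started is not None:
            self.pause()
        return self.app_time

timer = RequestTimer()
wall_start = time.monotonic()
timer.start()
time.sleep(0.05)   # application work
timer.pause()
time.sleep(0.05)   # simulated database query (excluded from app_time)
timer.resume()
time.sleep(0.05)   # more application work
app = timer.stop()
wall = time.monotonic() - wall_start
print(0 < app < wall)  # → True: the query time is excluded
```

The indirect-query caveat still applies: an ORM that lazily fires a query while you traverse an association will be counted as application time unless you also instrument that layer.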
I think you are looking for profiling tools, like Java VisualVM, JProfiler, or others.

OS and/or IIS Caching

Is there a way I can force caching of files at the OS level and/or the Web server level (IIS)?
The problem I am facing is that there are many static files (XSLTs, for example) that need to be loaded again and again, and I want to load all these files into memory so that no time is wasted on hard disk I/O.
(1) I want to cache at the OS level so that every program that runs on my OS and tries to read a file reads it from memory. I want no changes to program source code; it must happen transparently. For example, read("c:\abc.txt") must not cause disk I/O; it must read from memory.
(2) How can I achieve something similar in IIS? I've read a few things about output caching for database queries, but how can it be achieved for files?
All suggestions are welcome!
Thanks
You should look into some tricks used by SO itself. One was that they moved all their static content off to another domain for efficiency.
The problem with default setups for Apache (at a minimum) is that the web server will pass all requests through to an app server to see if the content is meant to be dynamic. That's a huge waste for content you know to be static.
It is far better to set up a separate domain for static content without an app server. That way, static requests are not sent unnecessarily to another layer, and the web server can run much faster.
Even in a setup where another layer isn't invoked on every request, there are other reasons for a separate domain, as you'll see from that link (specifically, removing cookies, which both reduces traffic and improves the chances of the Internet caching your data).
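As a side note on the question's point (1): modern operating systems already keep recently read files in the page cache, so repeated reads of hot files usually hit RAM with no configuration at all. If you still want explicit control at the application level, a read-through cache keyed by path and modification time is a common pattern. This is a generic sketch, not IIS-specific, and the names are hypothetical:

```python
import os
import tempfile

# Read-through cache keyed by path and modification time (generic sketch,
# not IIS-specific; names are hypothetical). A changed mtime invalidates
# the entry, so edits to a file are still picked up.
class StaticFileCache:
    def __init__(self):
        self._cache = {}  # path -> (mtime, contents)

    def read(self, path):
        mtime = os.path.getmtime(path)
        entry = self._cache.get(path)
        if entry and entry[0] == mtime:
            return entry[1]  # served from memory, no disk read
        with open(path, "rb") as f:
            data = f.read()
        self._cache[path] = (mtime, data)
        return data

# Demo with a temporary file standing in for an XSLT.
with tempfile.NamedTemporaryFile(delete=False, suffix=".xslt") as f:
    f.write(b"<xsl:stylesheet/>")
    path = f.name
cache = StaticFileCache()
first = cache.read(path)
second = cache.read(path)  # cache hit: same bytes without re-reading
os.unlink(path)
print(first == second == b"<xsl:stylesheet/>")  # → True
```

For XSLTs specifically, caching the compiled stylesheet object rather than the raw bytes saves the parse cost as well as the disk read.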