Is there any way to measure the time taken to fetch data from the database through a WCF Data Service, so that we can log it for analysis?
Any pointers or suggestions on this will be helpful.
Thanks in advance!
I hope you are doing well. I am a newbie in Pentaho and need help troubleshooting an issue.
Flow of the transformation:
Fetching 9000 Id numbers from the previous step, without any issue.
Requesting data from an API for the 9000 Ids via the "Rest Client" step.
Inserting the results into MongoDB.
Transformation Snapshot
I have attached a snapshot of the transformation (not the full one, only the main steps).
After fetching some amount of data, the REST Client step appears to stop sending the next request, which is why the transformation gets stuck and never finishes.
The steps I have taken to troubleshoot this issue, which did not work:
I broke the transformation down to retrieve 2000 rows at a time with the help of the "Block this step until steps finish" step.
I closely monitored the CPU and memory of the server: CPU is at most around 40%, occasionally touching 90% for a few seconds; memory stays below 80%.
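The batching idea in the first troubleshooting step can be sketched language-neutrally: split the 9000 ids into fixed-size chunks so each REST call handles a bounded amount of work. This Python sketch only illustrates the chunking logic, not the actual PDI API:

```python
def batches(ids, batch_size):
    """Split ids into fixed-size chunks so each REST call is bounded."""
    return [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]

ids = list(range(1, 9001))          # the 9000 ids from the previous step
chunks = batches(ids, 2000)
print(len(chunks))                  # 5 batches
print(len(chunks[-1]))              # the last batch holds the remaining 1000 ids
```

Separately from batching, one thing worth checking with a hang like this is whether the Rest Client step has connection/read timeouts configured; an HTTP request with no read timeout can wait forever on a dead connection, which would look exactly like a transformation that is stuck and never stops.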
I am not sure whether this is a cache issue, a PDI issue, or something else.
Please help me resolve this issue; any suggestion will be much appreciated.
Thanks and Regards
Aslam Shaikh
I have created an input using the ADLS Gen2 data stream option. I added a path pattern (up to the folder that receives continuous data from Event Hub). The test connection is successful, but when I try to run a query or sample data, it fails with this error:
Diagnostics: While sampling data, no data was received from '1' partitions.
Appreciate your help in advance.
Thank you Florian Eiden and Swati for your valuable discussions and suggestions. Posting this as an answer to help other community members.
Used Event Hub directly as the data streaming input instead of the ADLS Gen2 data streaming option (which receives continuous data from Event Hub). This is the more efficient option.
I would like to ask if anyone could tell me, or refer me to a web page that describes, all the options for storing data in an Apache Hadoop cluster.
What I would like to know is which type of data should be stored in which "system". By type of data I mean, for example:
Live data (realtime)
Historical data
Data which is regularly accessed from an application
...
The question is not limited to HBase or Hive ("systems") but covers everything available under HDP.
I hope someone can point me in a direction where I can find my answer. Thanks!
I can give you an overview, but the rest you will have to read on your own.
Let's begin with the types of data you want to store in HDFS:
Data in motion (which you referred to as real-time data)
So, can you fetch truly real-time data? Is it even possible? Strictly speaking, no: there will always be some delay. However, we can reduce the downtime and the processing time of the data, and that is what HDF (Hortonworks Data Flow) is for; it works with data in motion. There are many services for real-time data streaming and processing; take Kafka, NiFi and Storm as examples. You also need to store the data in such a way that you can fetch it in near real time (~2 s); for that we use HBase, which stores data in a columnar structure.
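One reason HBase can serve such fast lookups is that rows are kept sorted by row key, so a well-designed key turns "give me the latest reading" into a short ordered scan. A language-neutral Python sketch of a common row-key design (an entity id plus a reversed timestamp so the newest row sorts first; the names and widths here are invented for illustration, not an actual HBase API):

```python
MAX_TS = 10**13  # subtract from this so newer timestamps produce smaller keys

def row_key(sensor_id, ts_millis):
    # Fixed-width, lexicographically sortable key: <entity>#<reversed timestamp>
    return f"{sensor_id}#{MAX_TS - ts_millis:013d}"

# Simulate a sorted store (HBase keeps rows ordered by key)
rows = sorted((row_key("sensor-7", ts), ts) for ts in [1000, 5000, 3000])
print(rows[0][1])   # 5000: the newest reading sorts first
```

With this layout, fetching the latest value for one entity is just "read the first row with this prefix", which is why lookups stay fast regardless of table size.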
Data at rest (historic data, or data stored for future use)
Storing data at rest poses no such issues. HDP (Hortonworks Data Platform) provides the services to ingest, store and process the data, and HDF services can even be integrated into HDP (prior to version 2.6), which makes it easier to process data in motion as well. Here we need databases to store large amounts of data, and HDFS (Hadoop Distributed File System) can hold data of any kind. But we do not ONLY want to store our data; we want to fetch it quickly when it is required. How do we do that? By storing our data in a structured form, for which we are given Hive and HBase. To process such amounts of data, measured in TB, we need to run heavy jobs, which is where MapReduce, YARN, Spark and Kubernetes come into the picture.
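The MapReduce model mentioned above can be sketched without a cluster: map each record to key/value pairs, shuffle the pairs by key, then reduce each group. A toy word count in Python (real jobs would run distributed on YARN over data in HDFS; this only shows the shape of the computation):

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit (word, 1) for every word in every input line
    for line in lines:
        for word in line.split():
            yield (word, 1)

def shuffle(pairs):
    # Shuffle: group all values by key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each group to a single value
    return {key: sum(values) for key, values in groups.items()}

lines = ["big data", "big cluster"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["big"])   # 2
```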
This is the basic idea of storing and processing data in Hadoop.
The rest you can always read about on the internet.
Is there any way for us to measure the time taken by the WCF Data Service to fetch entities from the database?
For example, let's say we expose the Northwind DB through the data service and access the Orders entities through the URL below:
http://<domain>/Dataservice/Orders
Is there any way to measure the time taken to fetch the Orders table contents from the DB?
Thanks in advance
The Stopwatch class (System.Diagnostics.Stopwatch) was made exactly for that purpose: start it before the fetch, stop it afterwards, and log the elapsed time.
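The pattern is the same in any language: take a high-resolution timestamp before the call, another after, and log the difference. A Python sketch of it, with a hypothetical fetch_orders() standing in for the data-service call (in .NET you would use Stopwatch.StartNew() around the query instead):

```python
import time

def fetch_orders():
    # Stand-in for the real database / data-service call
    time.sleep(0.05)
    return ["order-1", "order-2"]

start = time.perf_counter()          # high-resolution timer, like Stopwatch
orders = fetch_orders()
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"fetched {len(orders)} orders in {elapsed_ms:.1f} ms")
```

Logging that elapsed value inside the service operation gives you the per-request fetch time for later analysis.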
I designed an application which pulls data from the DB once it is started. If all the clients pull data from the server at the same time, it consumes a lot of bandwidth; even worse, one client may close and reopen his application many times a day, so there is a lot of bandwidth consumption on the server...
Is there any database sync technique I can implement in an AIR desktop application?
If anybody knows, please let me know. Please don't suggest LCDS (it is a bit costly).
Thanks in advance,
Cheers,
vasu
Try GraniteDS.
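Whichever framework you choose, the usual way to cut this kind of bandwidth is delta sync: each client remembers the timestamp of its last sync and the server returns only rows changed since then, so reopening the app does not re-download everything. A language-neutral Python sketch of the server-side logic (the record layout here is invented for illustration):

```python
# Server-side table: each record carries a last-modified timestamp
records = [
    {"id": 1, "name": "alpha", "modified": 100},
    {"id": 2, "name": "beta",  "modified": 250},
    {"id": 3, "name": "gamma", "modified": 400},
]

def changes_since(last_sync):
    """Return only the rows the client has not seen yet."""
    return [r for r in records if r["modified"] > last_sync]

# A client reopens the app; it last synced at t=200
delta = changes_since(200)
print([r["id"] for r in delta])   # [2, 3]
```

A client that restarts ten times a day then transfers only the rows that actually changed in between, instead of the full table each time.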