SniffStep not working on Carte server - pentaho

I am executing a transformation on a Carte server. The transformation runs just fine: it ends with a finished status, and Carte shows the rows read and written per step along with the execution log.
However, when I perform a sniff-step like this: http://localhost:9080/kettle/sniffStep/?trans=Add+a+sequence+-+Basic+example&step=Generate+Rows&xml=Y&lines=10
It returns an empty XML document: <step-sniff></step-sniff>
I don't understand what is happening, since I can see in the table that Generate Rows has written 10 rows. Can someone help me?

Sniffing a step only works while the transformation is running and the step being sniffed has not finished yet. For transformations that finish quickly, you usually have to be very fast.
Add a 'Delay row' step after the step you are sniffing. This keeps the sniffed step running for longer, so you may be able to see some rows while sniffing.
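Because the sniff only returns rows while the step is still active, one option is to request it repeatedly while the transformation runs. Below is a minimal sketch in Python, assuming the host, port and parameters from the question. The `cluster`/`cluster` credentials are an assumption (adjust to your Carte configuration), and the helper names are made up for illustration:

```python
import base64
import time
import urllib.parse
import urllib.request

# The empty response shown in the question; treated here as "no rows yet".
EMPTY_SNIFF = "<step-sniff></step-sniff>"


def build_sniff_url(base, trans, step, lines=10):
    """Build a sniffStep URL with the same parameters as in the question."""
    query = urllib.parse.urlencode(
        {"trans": trans, "step": step, "xml": "Y", "lines": lines}
    )
    return f"{base}/kettle/sniffStep/?{query}"


def fetch_with_basic_auth(url, user="cluster", password="cluster"):
    """Fetch the URL with HTTP basic auth (credentials are an assumption)."""
    request = urllib.request.Request(url)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def sniff_until_rows(fetch, attempts=20, pause=0.5):
    """Call fetch() until it returns something other than the empty sniff."""
    for _ in range(attempts):
        body = fetch()
        # Ignore whitespace differences when comparing with the empty document.
        if "".join(body.split()) != EMPTY_SNIFF:
            return body
        time.sleep(pause)
    return None  # the step never produced rows while we were polling
```

Start the transformation first, then call `sniff_until_rows(lambda: fetch_with_basic_auth(url))` with `url` built by `build_sniff_url("http://localhost:9080", "Add a sequence - Basic example", "Generate Rows")`. If the step has already finished, this still returns nothing, which is why delaying the rows helps.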

Related

SqlException: Data modification failed on system-versioned table because transaction time was earlier than period start time for affected records

I'm getting the above error when running a Web Job in a multi-threaded environment. The job calls a stored procedure that inserts, updates, and deletes records in fairly large temporal tables (3-4M records; not sure if that is relevant here). Each run inserts or updates around 40K-80K records, depending on a condition. When a single thread is running, everything goes fine, but as soon as the number of parallel jobs is set to 2 or more, I get the error. From initial analysis, the issue seems to be with the auto-generated values for the SysStartTime and SysEndTime columns in the history table. I tried one solution from the internet, subtracting 1 second from the date saved in those columns, as below:
DEFAULT (dateadd(second,(-1),sysutcdatetime()))
But it's not working. I have read a few articles saying that temporal tables do not work properly in a multi-threaded environment. I'm not sure why the issue is happening or how to resolve it in a multi-threaded environment.
Can someone please help me understand the reason behind the error and how to fix it?
NOTE: I can't make my code run on a single thread. A minimum of three threads is required, so converting to a single thread is not a solution in this case.

Statement object has been closed in querying from Amazon Redshift

On attempting to execute a simple query on a table (dimensions: 1,131,714,069 rows by 22 columns), I am running into the error:
[Amazon][JDBC](12080) Statement object has been closed.
Research online has unfortunately not provided much insight into this error.
I do not encounter this error every time I execute a query; so far its occurrence seems unpredictable. The query that most recently caused this error was a very simple SELECT ... FROM ... WHERE with no subqueries and only one condition in the WHERE clause.
The query was busy for about 22 minutes before failing. However, after waiting a few minutes and running it again, it completed successfully in a matter of seconds. That being said, this kind of unpredictability and unreliability is exactly what I'm trying to prevent.
If it helps, the IDE that I am using to connect to my Redshift database is TeamSQL.
What could be causing this error, and what steps could I take to prevent it?

BQ PY Client Libraries :: client.run_async_query() vs client.run_sync_query()

I'm looking at the BigQuery Python client libraries.
There used to be two different operations to query a table:
client.run_async_query()
client.run_sync_query()
But in the latest version (v1.3) it seems there's only one operation to execute a query, Client.query(). Did I understand correctly?
And looking at the GitHub code, it looks like Client.query() just returns the query job, not the actual query results. That makes me conclude it works in a similar way to client.run_async_query(). Is there no longer a replacement for client.run_sync_query(), which returned query results (data) synchronously?
Thanks for the clarification!
Cheers!
Although .run_sync_query() has been removed, the Query reference says that short jobs may return results right away:
query POST /projects/projectId/queries
Runs a BigQuery SQL query and returns results if the query completes within a specified timeout.
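On the Python side, synchronous behaviour can be approximated by waiting on the job that Client.query() returns. Below is a minimal sketch, assuming the `google-cloud-bigquery` package, where `Client.query()` returns a `QueryJob` and `QueryJob.result()` blocks until the job finishes. The helper name `run_query_sync` is made up for illustration and is not part of the library:

```python
def run_query_sync(client, sql, timeout=None):
    """Start a query job and block until its rows are available.

    `client` is expected to be a google.cloud.bigquery.Client
    (or any object exposing the same query() method).
    """
    job = client.query(sql)              # returns a QueryJob immediately
    rows = job.result(timeout=timeout)   # blocks until the job completes
    return list(rows)                    # materialise the row iterator


# Usage (requires credentials and the google-cloud-bigquery package):
#   from google.cloud import bigquery
#   client = bigquery.Client()
#   rows = run_query_sync(client, "SELECT 1 AS x")
```

The helper takes the client as a parameter so it stays independent of how credentials and the project are configured.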

Is it possible to apply NOLOCK at DB/table level for all queries hitting the DB/table?

I am not even sure that this is a valid question, but I will explain my situation and hopefully get an answer from experts like you.
We have MS Dynamics installed on-premises, and we are observing very slow performance.
Looking at the app server log, we notice 4-5 warning messages per second about "Query Execution time exceeded 10.seconds threshold".
Here is an example of the warning and the related query.
Query execution time of 27.7 seconds exceeded the threshold of 10 seconds. Thread: 109; Database: Main_MSCRM;
select
top 5001 "systemuser0".QueueId as "queueid"
, "systemuser0".CreatedBy as "createdby"
, "systemuser0".Address1_Latitude as "address1_latitude"
, "systemuser0".Address2_StateOrProvince as "address2_stateorprovince"
, "systemuser0".Address1_County as "address1_county"
, "systemuser0".Address2_Country as "address2_country"
, "systemuser0".Address2_PostOfficeBox as "address2_postofficebox"
, "systemuser0".PreferredPhoneCode as "preferredphonecode"
, "systemuser0".new_RegistrationNumer as "new_registrationnumer"
, "systemuser0".YammerUserId as "yammeruserid"
, "systemuser0".Title as "title"
, "systemuser0".SetupUser as "setupuser"
, "systemuser0".FirstName as "firstname"
, "systemuser0".EmployeeId as "employeeid"
, "systemuser0".Address1_Line2 as "address1_line2"
, "systemuser0".Address1_City as "address1_city"
, "systemuser0".YomiFirstName as "yomifirstname"
, "systemuser0".ExchangeRate as "exchangerate"
, "systemuser0".Address1_ShippingMethodCode as "address1_shippingmethodcode"
, "systemuser0".YomiMiddleName as "yomimiddlename"
, "systemuser0".Address2_Line2 as "address2_line2"
, "systemuser0".DefaultFiltersPopulated as "defaultfilterspopulated"
, "systemuser0".ModifiedOnBehalfBy as "modifiedonbehalfby"
, "systemuser0".Address2_Line3 as "address2_line3"
, "systemuser0".DefaultMailboxName as "defaultmailboxname"
from
SystemUser as "systemuser0"
where
(("systemuser0".IsDisabled = 0))
order by
"systemuser0".SystemUserId asc
When I run this query at SQL level, the result comes back in less than 2 seconds. So my first confusion is why it takes so much longer on the CRM front end side.
Apart from the time it takes to render the data at the CRM front end level, I cannot think of anything else.
My second confusion is that when I ran this query, and others that produced warning messages, with NOLOCK in the query itself, they were even faster than 2 seconds.
What I am thinking of is writing logic that applies at DB level, so that whatever query hits the DB has NOLOCK in it by default.
Is that even possible? Please let me know how to get rid of these warning messages.
Thanks.
I don't know whether "running the query at SQL level" means running it directly from the SQL Server console, but I think I can offer some insight into why it appears to take longer on the CRM front end side. When you run the query directly from SQL Server, most of the time you already have an active connection to the database. This means you don't have to wait to establish a connection, and the two big holdups are waiting for the query to execute and waiting to receive the result set.
However, when you run the query from your CRM front end, you presumably have to establish a connection before you even begin the query. Setting up a connection can take longer than you might think.
So, a good test would be this: execute your query twice, back-to-back, from the CRM front end and clock the running time of each one. If the second query runs much faster, then you may have discovered that the cost lies in establishing a connection with your SQL Server.
This may be the well-known problem where a command times out in ADO.NET but works fine in Management Studio. There are several fixes out there, but since you don't have control over the query, try these two commands on the database in question:
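The back-to-back test can be sketched in a few lines of Python. Here `run_query` is a hypothetical callable that opens whatever connection the application would open and executes the query; it is not part of CRM or SQL Server:

```python
import time


def time_back_to_back(run_query, runs=2):
    """Call run_query() several times in a row and return each duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations
```

If the first duration is much larger than the second, the difference is a rough upper bound on the one-off costs (connection setup, cold caches) rather than the query itself.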
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
I would strongly recommend looking at the query execution plan, which will tell you whether any queries might be improved by adding indexes.
Also, why are you trying to pull 5k records at once? Is that generated from custom code or something? CRM views are far smaller than that, i.e. 50-200 records. By pulling so many records at once you're increasing the likelihood of database locks.

Loop SQL Query For Specific Time

I have a SQL query that needs to run against the system views sys.dm_exec_requests and sys.dm_exec_sessions every 60 seconds, pull specific information, and dump it into a separate table. After a specified time I would like the loop to stop. How should the loop be written?
This sounds like a SQL Agent job. If so, the short form of the answer is:
Create the job with one step that runs the query
Add a Schedule that runs it once a minute, starting whenever you want it to start
Set the schedule to stop running it when the cut-off time is reached
The long form, of course, is all the detail work behind creating a SQL Agent job. Best to read up on them in Books Online.
Don't do this in a loop. Do it with a job.
Write a sproc that runs the query and saves the results, and then call it from a job.
I think you should use a job as well. But in some work environments that is not practical. So you could have something like:
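DECLARE @StopTime datetime = DATEADD(HOUR, 1, GETDATE()); -- example cut-off (assumption): one hour from now
WHILE GETDATE() < @StopTime -- keep looping until the cut-off time is reached
BEGIN
EXEC LogCurrentData; -- procedure that queries the DMVs and saves the results
WAITFOR DELAY '00:01:00'; -- wait 1 minute
END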
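The same wait-and-repeat pattern can also be driven from a client script when neither a job nor a long-running T-SQL batch is an option. Below is a minimal sketch in Python. `action` is a hypothetical callable that would execute the logging procedure through whatever database driver you use, and the `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting:

```python
import time


def run_until(stop_time, action, interval=60.0, clock=time.time, sleep=time.sleep):
    """Call action() every `interval` seconds until clock() reaches stop_time.

    Returns the number of times action() was called.
    """
    runs = 0
    while clock() < stop_time:
        action()          # e.g. execute the logging stored procedure
        runs += 1
        sleep(interval)   # wait before taking the next snapshot
    return runs


# Usage: log once a minute for the next hour.
#   run_until(time.time() + 3600, log_current_data)  # log_current_data is hypothetical
```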
I think the best way is to create a job.
There is a post that explains how to create a job step by step (with images) in SQL Server.
You can visit the post here
If you prefer a video tutorial, you can visit this link