How to get query output in Hive?

I run a query from an Oozie workflow:
select * from hivetable limit 5;
The query executed successfully, but I can't see the output of the select query in the logs.

I figured out why the output of the SQL query is not showing.
Put a new line after the last command in your .sql file and remove all new lines within a query. That is, a query should not contain any line breaks; when a query finishes, add a new line and then add another query if you want to.
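For example, a .sql file laid out this way might look like the following (the second table name is a placeholder, not from the question); each statement sits on one line, and the file ends with a newline after the last statement:
select * from hivetable limit 5;
select count(*) from another_table;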

Related

Presto Trino - Execute from SQL file

Beginner here. Using the CLI/Presto/Trino; I'm not sure of the right term. It looks to be a command line, and we are using Hive.
I can run selects and create tables. I'm trying to run multiple queries at once, so I created a SQL file and uploaded it to the Hive folder structure. I think I should be able to execute all of them at once instead of going one by one.
How do I run the SQL queries from the file?
I tried --execute file user/hivefile.sql > result and am getting nowhere.
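Assuming the Hive CLI or Beeline is what is actually being used here, a file of statements is typically run with the -f flag (the Trino/Presto CLI has a similar --file option); the JDBC URL and file paths below are placeholders, not from the question:
hive -f /user/hivefile.sql
beeline -u jdbc:hive2://hiveserver:10000/default -f /user/hivefile.sql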

Run several scripts in sequence and output each one to a csv file

I have several queries that need to be run on a weekly basis in Microsoft SQL Server Management Studio. Each one is just a relatively simple select query, and the results need to be saved into a csv file. Right now someone spends an hour running each script in turn and saving the results.
I figured this could be somewhat automated but am struggling.
From reading previous questions here I've gotten as far as using SQLCMD mode, and by putting :output c:\filename.csv I get the output saved into a file, but I am having trouble getting separate files to be generated for each query.
For simplicity's sake, assume my query looks like this:
:OUT C:\File1.csv
SELECT * FROM table1;
:OUT C:\File2.csv
SELECT * FROM table2;
:OUT C:\File3.csv
SELECT * FROM table3;
Instead of getting three files with the output of each query, I end up with File1 and File2 filled with a couple of unreadable characters, and the results of all three queries in File3. I know in Oracle there is a spool off command; is there something similar for :OUT in SSMS?
I ran a somewhat modified query and was able to get three files with three different query results. I ran the following for a quick test:
:OUT C:\File1.csv
SELECT 'Hello'
GO
:OUT C:\File2.csv
SELECT 'My'
GO
:OUT C:\File3.csv
SELECT 'Friend'
This gave me three separate files with the results of each query in a separate file. All I did was take out the semicolons and add the keyword GO, which terminates a batch and moves on to the next one. I hope this helps.
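Applied to the tables from the original question, the same pattern would look something like this (run in SSMS with SQLCMD Mode enabled):
:OUT C:\File1.csv
SELECT * FROM table1
GO
:OUT C:\File2.csv
SELECT * FROM table2
GO
:OUT C:\File3.csv
SELECT * FROM table3
GO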

Add an Update SQL Query on a Pentaho Kettle Transformation

I have a scenario where I would like to run an update script after a table input and table output job; can anyone assist? I have tried these four but I can't seem to figure out how to make them work.
My Current Transformation
Here's the scenario...
Table Input: MySQL Database Table1 (Select * from Table1)
Table Output: Oracle Database (Create Table 1)
(this runs well to completion but then I have to execute the update script manually. I am looking for a way to automate this)
The update query I would like to run:
update odb.table1 set column1='New Value1' where column1='Old Value1'
update odb.table1 set column1='New Value2' where column1='Old Value2'
Thank you in advance.
I used the Execute SQL Script step. I just added the two update queries, separated by a semicolon (;).
I created two transformations: one for the table input and table output, and another for the Execute SQL Script step.
I then created a Kettle job and placed the transformation with the query after the table output transformation.
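As a sketch, the Execute SQL Script step would then just contain the two updates from the question, terminated with semicolons:
update odb.table1 set column1='New Value1' where column1='Old Value1';
update odb.table1 set column1='New Value2' where column1='Old Value2';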

SQL Server job to execute query from the output CSV file of first step

This is my first job creation task as a SQL DBA. The first step of the job runs a query and sends the output to a .CSV file. As the last step, I need the job to execute the query from that .CSV file (the output of the first step).
I have Googled all possible combinations but no luck.
Your question got lost somehow ...
Your last two comments make it a little clearer.
If I understand it correctly, you create a SQL script which restores all the logins, roles and users, their rights, etc. into a newly created database.
If this generated script is executable within a query window, you can easily execute it with EXECUTE (https://msdn.microsoft.com/de-de/library/ms188332(v=sql.120).aspx)
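A minimal T-SQL sketch of that EXECUTE approach, assuming the file really does contain a runnable script; the path is a placeholder, and OPENROWSET(BULK ...) is just one way to load the file text:
DECLARE @sql NVARCHAR(MAX);
SELECT @sql = BulkColumn
FROM OPENROWSET(BULK 'C:\output\first_step_output.csv', SINGLE_CLOB) AS script; -- placeholder path
EXEC (@sql); -- run the script text loaded from the file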
Another approach could be SQLCMD (http://blog.sqlauthority.com/2013/04/10/sql-server-enable-sqlcmd-mode-in-ssms-sql-in-sixty-seconds-048/)
If you need further help, please come back with more details: What does your "CSV" look like? What have you tried so far?
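And a sketch of the SQLCMD approach as a single command line (for example run from an Operating system/CmdExec job step); server, database, and path are placeholders:
sqlcmd -S YourServer -d YourDatabase -E -i "C:\output\first_step_output.csv"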

Pentaho Data Integration (PDI): After selecting records I need to update the field value in the table using a Pentaho transformation

I have a requirement to create a transformation where I have to run a select statement. After selecting the values, it should update the status so it doesn't process the same record again.
Select file_id, location, name, status
from files
OUTPUT:
1, c/user/, abc, PROCESS
Updated output should be:
1, c/user/, abc, INPROCESS
Is it possible to do a database select and cache the records in a single transformation in PDI, so the same record isn't reprocessed and I don't need to update the status in the database? Something similar to a dynamic lookup in Informatica. If not, what's the best possible way to update the database after doing the select?
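If the status does have to be written back to the database, a sketch of the update the question describes (column name and values taken from the sample output; the WHERE condition is an assumption):
UPDATE files SET status = 'INPROCESS' WHERE status = 'PROCESS';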
Thanks, that helps. You wouldn't do this in a single transformation, because of the multi-threaded execution model of PDI transformations: you can't count on a variable being set until the transformation ends.
The way to do it is to put two transformations in a job and create a variable in the job. The first transformation runs your select and flows the result into a Set Variables step; configure it to set the variable you created in your job. Then you run the second transformation, which contains your Excel Input step, and specify your job-level variable as the file name.
If the select gives more than one result, you can store the file names in the job's file results area. You do this with a Set files in result step. Then you can configure the job to run the second transformation once for each result file.