set variable value in hive from select statement


I have a query with subqueries in the BETWEEN clause, as below:
SELECT COUNT(*) FROM ABC WHERE DT BETWEEN (SELECT DT FROM FCT_DT) AND (SELECT MAX(DT) FROM FCT_DT)
Hive does not allow a SELECT inside a BETWEEN clause.
Can I capture the result of a SELECT statement in a variable and pass it to the BETWEEN clause?

Create your table:
CREATE TABLE ABC ( send_org string,rec_org string, participants int );
insert into ABC values ( "b", "z", 1);
insert into ABC values ( "b", "z", 2);
insert into ABC values ( "b", "z", 3);
insert into ABC values ( "b", "z", 4);
Use WITH to define named subqueries (common table expressions) that you can then reference like tables in your query:
with
theMax as (SELECT max(participants) as max_ FROM ABC),
theMin as (select min(participants) as min_ FROM ABC)
SELECT
COUNT(*) FROM ABC, theMax, theMin
WHERE
participants between theMin.min_ and theMax.max_
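To check the shape of the WITH-based query outside Hive, here is a minimal sketch that runs the same statement against an in-memory SQLite database from Python. SQLite is only a stand-in chosen for convenience (an assumption, not something from the question), and the comma joins are written as explicit CROSS JOINs. Hive's behaviour may differ in details.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Hive (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ABC (send_org TEXT, rec_org TEXT, participants INTEGER)")
conn.executemany(
    "INSERT INTO ABC VALUES (?, ?, ?)",
    [("b", "z", n) for n in (1, 2, 3, 4)],
)

# Each CTE yields exactly one row, so the cross joins simply attach
# the minimum and maximum to every row of ABC.
query = """
WITH theMax AS (SELECT MAX(participants) AS max_ FROM ABC),
     theMin AS (SELECT MIN(participants) AS min_ FROM ABC)
SELECT COUNT(*)
FROM ABC
CROSS JOIN theMax
CROSS JOIN theMin
WHERE participants BETWEEN theMin.min_ AND theMax.max_
"""
row_count = conn.execute(query).fetchone()[0]
print(row_count)
```

Because both bounds come from the same table, every row falls inside the range here; with bounds taken from a separate table such as FCT_DT the count would depend on the data.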

Related

Filter table based on data from json object

I'm trying to use a JSON value as an input parameter for a stored procedure to filter the returned data.
I have the following TableA:
CREATE TABLE TableA (
Id INT NULL,
Value1 VARCHAR(25) NULL,
Value2 VARCHAR(25) NULL
);
INSERT INTO TableA (Id, Value1, Value2) values (1, 'test1', 'new1')
INSERT INTO TableA (Id, Value1, Value2) values (1, 'test1', 'new2')
INSERT INTO TableA (Id, Value1, Value2) values (null, null, 'test3')
INSERT INTO TableA (Id, Value1, Value2) values (2, 'myvalue1', 'newvalue')
The JSON parameter is dynamic - representing one or more column name and value from the above table.
DECLARE @Filter NVARCHAR(MAX)
SET @Filter=N'{
"Id": 2,
"Value1": "myvalue1",
"Value2": "newvalue"
}'
And I extract the data from the json and inner join it with the TableA to get the output I need:
...
;WITH cte AS
(
SELECT * FROM OPENJSON(@Filter)
WITH(Id INT,
Value1 VARCHAR(25),
Value2 VARCHAR(25))
)
SELECT a.* FROM cte ct
INNER JOIN TableA a
ON ct.Id = a.Id
INNER JOIN TableA b
ON ct.Value1 = b.Value1
INNER JOIN TableA c
ON ct.Value2 = c.Value2
In this particular example, I get the desired output, since I'm specifying all the columns. However, the whole reason for using JSON as a parameter is to be able to pass different columns dynamically (the actual table has many more columns).
If I were to pass any of the following filters, I would no longer get the desired output because of the inner joins.
SET @Filter=N'{
"Id": 2
}'
...
SET @Filter=N'{
"Id": 2,
"Value1": "test1"
}'
...
SET @Filter=N'{
"Value1": "test1",
"Value2": "new2"
}'
...etc
Is there a way to dynamically select only the properties in the JSON that have values and skip the nulls? Or is there any other suggestion on how I can resolve this issue?
Is there a way to check whether the JSON has any properties in it? Based on the documentation, there's only one function, ISJSON, which validates that the input is properly formatted. However, if I pass SET @Filter=N'{}', it's a valid JSON object, but it's empty.

You can try this:
SELECT a.*
FROM TableA a
WHERE (a.Id IS NULL AND JSON_VALUE(@Filter,N'$.Id') IS NULL OR a.Id = ISNULL(CAST(JSON_VALUE(@Filter,N'$.Id') AS INT),a.Id))
AND (a.Value1 IS NULL AND JSON_VALUE(@Filter,N'$.Value1') IS NULL OR a.Value1 = ISNULL(JSON_VALUE(@Filter,N'$.Value1'),a.Value1))
AND (a.Value2 IS NULL AND JSON_VALUE(@Filter,N'$.Value2') IS NULL OR a.Value2 = ISNULL(JSON_VALUE(@Filter,N'$.Value2'),a.Value2));
The trick is to replace a NULL filter value with the column's own value, so that the comparison always succeeds for that column.

I know this is not really a satisfying answer, but the approach I've used in such situations is to combine each optional field with:
where (value1 is null or field1 = value1)
This removes the ability to explicitly filter for field1 IS NULL, and it adds two expressions per parameter to your SELECT statement:
--SELECT * FROM TableA
DECLARE @Filter NVARCHAR(MAX)
SET @Filter=N'{
"Id": 1,
"Value1": "test1",
"Value2": "new2"
}'
;WITH cte AS
(
SELECT * FROM OPENJSON(@Filter)
WITH(Id INT,
Value1 VARCHAR(25),
Value2 VARCHAR(25))
)
SELECT a.* FROM cte filter
INNER JOIN TableA a
ON filter.Id = a.Id
and (filter.Value1 is null or filter.Value1 = a.Value1)
and (filter.Value2 is null or filter.Value2 = a.Value2)
Query Result
Id Value1 Value2
1 test1 new2
Filtering with values not set
If you have a missing value in your JSON, this will not be used as a filter criterion:
--SELECT * FROM TableA
DECLARE @Filter NVARCHAR(MAX)
SET @Filter=N'{
"Id": 1,
"Value2": "new2"
}'
;WITH cte AS
(
SELECT * FROM OPENJSON(@Filter)
WITH(Id INT,
Value1 VARCHAR(25),
Value2 VARCHAR(25))
)
SELECT a.* FROM cte filter
INNER JOIN TableA a
ON filter.Id = a.Id
and (filter.Value1 is null or filter.Value1 = a.Value1)
and (filter.Value2 is null or filter.Value2 = a.Value2)
SQL Output
Id Value1 Value2
1 test1 new2
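The "filter is null or column = filter" pattern can be tried out without SQL Server. The sketch below is an assumption-laden stand-in: it uses Python's `json` module instead of OPENJSON and an in-memory SQLite table instead of TableA in SQL Server, and the helper name `filter_rows` is hypothetical. Unlike the query above, it makes every column optional, including Id, so an empty JSON object returns all rows.

```python
import json
import sqlite3

# In-memory SQLite copy of TableA (stand-in for the SQL Server table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (Id INTEGER, Value1 TEXT, Value2 TEXT)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?)",
    [(1, "test1", "new1"), (1, "test1", "new2"),
     (None, None, "test3"), (2, "myvalue1", "newvalue")],
)

FILTER_COLUMNS = ("Id", "Value1", "Value2")

def filter_rows(filter_json):
    """Return the rows matching every key present in the JSON filter."""
    criteria = json.loads(filter_json)
    # Keys missing from the JSON become NULL parameters,
    # which switch the corresponding condition off.
    params = {name: criteria.get(name) for name in FILTER_COLUMNS}
    sql = """
        SELECT Id, Value1, Value2
        FROM TableA
        WHERE (:Id IS NULL OR Id = :Id)
          AND (:Value1 IS NULL OR Value1 = :Value1)
          AND (:Value2 IS NULL OR Value2 = :Value2)
        ORDER BY Value2
    """
    return conn.execute(sql, params).fetchall()

partial = filter_rows('{"Id": 1, "Value2": "new2"}')
by_value1 = filter_rows('{"Value1": "test1"}')
everything = filter_rows("{}")
print(partial)
print(by_value1)
print(len(everything))
```

As noted above, this design cannot express "Value1 must be NULL", because a NULL parameter always means "no filter on this column".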

SQL logic for the Vlookup function in excel/ How to do a Vlookup in SQL

I have a table with two columns, OLD_VALUE and NEW_VALUE, and five rows: (A,B), (B,C), (C,D), (E,D) and (D,F). I want to update every old value with its newest value, the way a VLOOKUP in Excel would work. In this example the newest value is F: A points to B, B points to C, C and E point to D, and D points to F, which has no further successor. The required result is therefore (OLD_VALUE, NEW_VALUE) = (A,F), (B,F), (C,F), (D,F), (E,F), that is, five rows with NEW_VALUE 'F'. The number of succession levels can range from 1 to x.

This is the table I have used for the script:
declare @t as table(old_value char(1), new_value char(1));
insert into @t values('A','B')
insert into @t values('B','C')
insert into @t values('C','D')
insert into @t values('E','D')
insert into @t values('D','F')
This needs to be done with a recursive CTE. First, you will need to define an anchor for the CTE. The anchor in this case should be the record with the latest value. This is how I define the anchor:
select old_value, new_value, 1 as level
from @t
where new_value NOT IN (select old_value from @t)
And here is the recursive CTE I used to locate the latest value for each row:
;with a as(
select old_value, new_value, 1 as level
from @t
where new_value NOT IN (select old_value from @t)
union all
select b.old_value, a.new_value, a.level + 1
from a INNER JOIN @t b ON a.old_value = b.new_value
)
select * from a
Results:
old_value new_value level
--------- --------- -----------
D F 1
C F 2
E F 2
B F 3
A F 4
(5 row(s) affected)

I think a recursive CTE like the following is what you're looking for (where the parent is the row whose second value does not exist as a first value elsewhere). If there's no parent(s) to anchor to, this would fail (e.g. if you had A->B, B->C, C->A, you'd get no result), but it should work for your case:
DECLARE @T TABLE (val1 CHAR(1), val2 CHAR(2));
INSERT @T VALUES ('A', 'B'), ('B', 'C'), ('C', 'D'), ('E', 'D'), ('D', 'F');
WITH CTE AS
(
SELECT val1, val2
FROM @T AS T
WHERE NOT EXISTS (SELECT 1 FROM @T WHERE val1 = T.val2)
UNION ALL
SELECT T.val1, CTE.val2
FROM @T AS T
JOIN CTE
ON CTE.val1 = T.val2
)
SELECT *
FROM CTE;
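The recursive CTE from both answers can be run as-is in SQLite, which makes it easy to experiment with. The following is a minimal sketch under that assumption (SQLite via Python rather than SQL Server; the function name `resolve_latest` and the table name `t` are hypothetical). It also shows the caveat mentioned above: a pure cycle has no anchor row, so nothing is returned.

```python
import sqlite3

def resolve_latest(pairs):
    """Map each old_value to the last value in its succession chain."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (old_value TEXT, new_value TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", pairs)
    query = """
    WITH RECURSIVE a(old_value, new_value, level) AS (
        -- Anchor: rows whose new_value is never replaced again.
        SELECT old_value, new_value, 1
        FROM t
        WHERE new_value NOT IN (SELECT old_value FROM t)
        UNION ALL
        -- Step: walk backwards to the rows that point at an already resolved row.
        SELECT b.old_value, a.new_value, a.level + 1
        FROM a
        JOIN t AS b ON a.old_value = b.new_value
    )
    SELECT old_value, new_value, level FROM a ORDER BY old_value
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows

chain = resolve_latest([("A", "B"), ("B", "C"), ("C", "D"), ("E", "D"), ("D", "F")])
cycle = resolve_latest([("A", "B"), ("B", "C"), ("C", "A")])
print(chain)
print(cycle)
```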

SQL - Select the largest value within a row

I seem to be stuck on this and can't find a solution having had a look around.
I have an SQL table whose first row looks something like this:
Name Val1 Val2 Val3
John 1000 2000 3000
What I need to do is select the largest value within this row, i.e. 3000.
Obviously, if these values were in a column rather than a row, you could just use SELECT MAX(column) FROM table to get the largest value in the column. Is there an equivalent of this for finding the max value in a row?
I have also had a look at PIVOT and UNPIVOT, but I don't think they are useful to me here.
The only way I have been able to do it is to create a temp table and insert each value into a single column like so:
CREATE TABLE #temp (colvals float)
INSERT INTO #temp (colvals)
SELECT Val1 FROM table WHERE ID=1
UNION
SELECT Val2 FROM table WHERE ID=1
UNION
SELECT Val3 FROM table WHERE ID=1
--------------------------------------------
SELECT MAX(colvals) FROM #temp
--------------------------------------------
DROP TABLE #temp
However I feel this is rather slow especially as my table has a lot more columns than the snippet I have shown above.
Any ideas?
Thanks in advance.

You can build an inline table of the column values with CROSS APPLY and use the native MAX():
-- Sample Data
declare @data table (Name varchar(10), Val1 int, Val2 int, Val3 int, Val4 int, Val5 int, Val6 int)
insert @data values
('John', 1000, 2000, 3000, 4000, 5000, 6000),
('Mary', 1, 2, 3, 4, 5, 6)
select Name, MaxValue from
@data
cross apply
(
select max(value) as MaxValue
from
(values
(Val1),(Val2),(Val3),(Val4),(Val5),(Val6) -- Append here
) t(value)
) result

select MAX(case when c1 > c2 and c1 > c3 then c1
when c2 > c3 then c2
else c3
end)
from tablename

You need something like this:
SELECT *, Row_Number() OVER (ORDER BY GETDATE()) Rowid INTO #temp From yourtable
DECLARE @Columns AS Varchar(MAX)
SET @Columns =''
SELECT @Columns = @Columns + ',[' + name + ']' FROM tempdb..syscolumns
WHERE id=object_id('tempdb..#temp') AND name <> 'Rowid'
SELECT @Columns = Right(@Columns, len(@Columns)-1)
exec ('Select Rowid,Max(val) maxval from #temp t Unpivot(val For data in (' + @Columns + ')) as Upvt Group by Rowid')
Drop table #temp

Use math logic:
select
case
when val1 >= val2 and val1 >= val3 then val1
when val2 >= val1 and val2 >= val3 then val2
else val3
end maxVal
from mytable
where id = 1

I think you were on the right track when you looked at UNPIVOT as an option, because that's exactly what you want to do: you have a pivoted table, and you want the unpivoted values from it. Here's what I came up with:
declare @base table (Name char(4), Val1 int, Val2 int ,Val3 int);
insert into @base (Name, Val1 , Val2 , Val3) values ('John' , 1000 , 2000 , 3000);
select name, max(value) as max_value
from (
select name, valuetype, value
from @base b
unpivot ( value for valuetype in (Val1 , Val2 , Val3)) as u
) as up
group by name
To expand to your whole table, you can then just add more column names to the unpivot row:
unpivot ( value for valuetype in (Val1 , Val2 , Val3, ... more values here...)) as u
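The unpivot-then-aggregate idea does not depend on the UNPIVOT keyword. As a rough sketch, the snippet below emulates it with UNION ALL in an in-memory SQLite database driven from Python (SQLite is an assumption here, and the 'Mary' row is made-up sample data). For comparison it also uses SQLite's multi-argument scalar `MAX(a, b, c)`, which SQL Server's MAX does not accept in that form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE base (Name TEXT, Val1 INTEGER, Val2 INTEGER, Val3 INTEGER)")
conn.executemany(
    "INSERT INTO base VALUES (?, ?, ?, ?)",
    [("John", 1000, 2000, 3000), ("Mary", 3, 1, 2)],  # 'Mary' is hypothetical data
)

# Emulated UNPIVOT: turn each column into rows, then aggregate per name.
unpivoted = conn.execute("""
    SELECT Name, MAX(value) AS max_value
    FROM (
        SELECT Name, Val1 AS value FROM base
        UNION ALL SELECT Name, Val2 FROM base
        UNION ALL SELECT Name, Val3 FROM base
    ) AS up
    GROUP BY Name
    ORDER BY Name
""").fetchall()

# SQLite-specific shortcut: MAX with several arguments works per row.
scalar = conn.execute(
    "SELECT Name, MAX(Val1, Val2, Val3) FROM base ORDER BY Name"
).fetchall()

print(unpivoted)
print(scalar)
```

One difference to keep in mind: the aggregate MAX skips NULLs, while SQLite's multi-argument MAX returns NULL if any argument is NULL.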

You could always replicate the answer to the question "Is there a Max function in SQL Server that takes two values like Math.Max in .NET?":
-- Sample Data
declare @data table (Name varchar(10), Val1 int, Val2 int, Val3 int, Val4 int, Val5 int, Val6 int)
insert @data values
('John', 1000, 2000, 3000, 4000, 5000, 6000),
('Mary', 1, 2, 3, 4, 5, 6),
('Tony66', 1, 2, 3, 4, 5, 66),
('Tony55', 1, 2, 3, 4, 55, 6),
('Tony44', 1, 2, 3, 44, 5, 6),
('Tony33', 1, 2, 33, 4, 5, 6),
('Tony22', 1, 22, 3, 4, 5, 6),
('Tony11', 11, 2, 3, 4, 5, 6)
SELECT name,
(SELECT MAX(value)
FROM (VALUES (Val1),(Val2), (Val3), (Val4), (Val5), (Val6)) AS AllValues(value)) AS 'MaxValue'
FROM @data

Select only distinct values from two columns from a table

If I have a table such as
1 A
1 B
1 A
1 B
2 C
2 C
And I want to select distinct from the two columns so that I would get
1
2
A
B
C
How can I word my query? Is the only way to concatenate the columns and wrap them in a DISTINCT?

You could use a union to create a table of all values from both columns:
select col1 as BothColumns
from YourTable
union
select col2
from YourTable
Unlike union all, union removes duplicates, even if they come from the same side of the union.
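A quick way to see the difference between UNION and UNION ALL is to run both on the sample data. This is a minimal sketch using an in-memory SQLite table from Python; SQLite and the choice to store both columns as text are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [("1", "A"), ("1", "B"), ("1", "A"), ("1", "B"), ("2", "C"), ("2", "C")],
)

# UNION removes duplicates across and within both SELECTs.
union_values = [
    row[0]
    for row in conn.execute(
        "SELECT col1 AS BothColumns FROM YourTable "
        "UNION SELECT col2 FROM YourTable ORDER BY 1"
    )
]

# UNION ALL keeps every row from both SELECTs.
union_all_count = len(
    conn.execute(
        "SELECT col1 FROM YourTable UNION ALL SELECT col2 FROM YourTable"
    ).fetchall()
)

print(union_values)
print(union_all_count)
```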

There is no need for DISTINCT when you use UNION. Try this:
select cast(id as char(1)) from test
union
select val from test

Please try:
Select Col1 from YourTable
union
Select Col2 from YourTable
UNION removes duplicate records (where all columns in the results are the same), UNION ALL does not.
See also the question "What is the difference between UNION and UNION ALL".
For multiple columns, you can go for UNPIVOT.
SELECT distinct DistValues
FROM
(SELECT Col1, Col2, Col3
FROM YourTable) p
UNPIVOT
(DistValues FOR Dist IN
(Col1, Col2, Col3)
)AS unpvt;

Try this one -
DECLARE @temp TABLE
(
Col1 INT
, Col2 NVARCHAR(50)
)
INSERT INTO @temp (Col1, Col2)
VALUES (1, 'ab5defg'), (2, 'ae4eii')
SELECT disword = (
SELECT DISTINCT dt.ch
FROM (
SELECT ch = SUBSTRING(t.mtxt, n.number + 1, 1)
FROM [master].dbo.spt_values n
CROSS JOIN (
SELECT mtxt = (
SELECT CAST(Col1 AS VARCHAR(10)) + Col2
FROM @temp
FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'
)
) t
WHERE [type] = N'p'
AND number <= LEN(mtxt) - 1
) dt
FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'
)
Or try this -
DECLARE @temp TABLE
(
a CHAR(1), b CHAR(1)
)
INSERT INTO @temp (a, b)
VALUES
('1', 'A'), ('1', 'B'), ('1', 'A'),
('1', 'B'), ('2', 'C'), ('2', 'C')
SELECT a
FROM @temp
UNION
SELECT b
FROM @temp

Because the values you want to select are in different columns, you can use UNION as below:
select distinct tarCol from
(select distinct column1 as tarCol from table
union
select distinct column2 from table) as tarTab

You can use a query like this to get the distinct values of multiple columns, each tagged with the column it came from:
(SELECT DISTINCT `enodeb` as res,
"enodeb" as columnname
FROM `raw_metrics`)
UNION
(SELECT DISTINCT `interval` as res,
"interval" as columnname
FROM `raw_metrics`)

combining resultset of many select queries

I have four Select queries for four different tables, each extracting only one record. For example:
Select * from table where col1 = 'something'
gives one row having 3 columns.
The second select query also gives one record having two columns(fields). Same for third and fourth select query.
I want to combine all four result sets into one having one row. How is it possible?
I will write the queries for you.
1st one:
Select Top 1 column1, column2
from table 1
where column 1 = 'something'
and col1 = (Select max(col1) where column 1 = 'something')
2nd query:
Select Top 1 column1, column3
from table 2
where column 1 = 'something'
and column3 = (Select max(column3) where column 1 = 'something')
3rd query uses the result obtained from query 2:
Select column4
from table 3
where column3 = (obtained from 2nd query) (there is only one row)
4th:
Select column5
from table 4
where column3 = (obtained from 2nd query) (there is only one row)
This means I have to join the 2nd, 3rd and 4th queries, and then combine the resulting set with the 1st.
I can't use UNION since the columns are different.
So the only problem is joining the result sets.

You can use CROSS JOINs to accomplish this.
CREATE TABLE table1 (id int, column1 varchar(5), column2 varchar(15));
CREATE TABLE table2 (column3 varchar(5), column4 varchar(15));
CREATE TABLE table3 (id int, column5 varchar(5), column6 varchar(15));
INSERT INTO table1 VALUES (1, 'aaa', 'row1')
INSERT INTO table2 VALUES ('bbb', 'table2')
INSERT INTO table3 VALUES (1, 'ccc', 'table3')
INSERT INTO table1 VALUES (1, 'ddd', 'table1')
SELECT * FROM (SELECT * FROM table1) a
CROSS JOIN (SELECT * FROM table2) b
CROSS JOIN (SELECT * FROM table3) c
Result:
id  column1  column2  column3  column4  id  column5  column6
1   aaa      row1     bbb      table2   1   ccc      table3
1   ddd      table1   bbb      table2   1   ccc      table3
Update after clarification:
CREATE TABLE table1
(
id int IDENTITY(1,1)
, searchstring nvarchar(25)
);
CREATE TABLE table2
(
id2 int IDENTITY(10, 10)
, searchstring2 nvarchar(25)
, newsearchstring nvarchar(50)
);
CREATE TABLE table3
(
id3 int IDENTITY(100, 100)
, id2 int
, table3srow nvarchar(25)
)
INSERT INTO table1 VALUES ('something');
INSERT INTO table1 VALUES ('something else');
INSERT INTO table1 VALUES ('something'); -- ID = 3, this row will be selected by 1st query
INSERT INTO table2 VALUES ('something', 'newvalue1');
INSERT INTO table2 VALUES ('something else', 'this will not be shown');
INSERT INTO table2 VALUES ('something', 'this will be returned by query 2'); -- ID = 30, this row will be selected by 2nd query
INSERT INTO table3 VALUES (10, 'not relevant');
INSERT INTO table3 VALUES (20, 'not relevant');
INSERT INTO table3 VALUES (30, 'This is from table 3'); -- This row will be returned by 3rd query
SELECT * FROM
(SELECT TOP 1 id, searchstring FROM table1 WHERE searchstring = 'something' and id = (SELECT MAX(id) FROM table1 WHERE searchstring = 'something')) AS query1,
(SELECT TOP 1 id2, newsearchstring FROM table2 WHERE searchstring2 = 'something' and id2 = (SELECT MAX(id2) FROM table2 WHERE searchstring2 = 'something')) AS query2,
(SELECT id2, table3srow FROM table3) as query3
WHERE query3.id2 = query2.id2
Use the same approach for table4 as indicated for table3.
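To try out the idea of combining several one-row queries into a single row, here is a minimal sketch against an in-memory SQLite database from Python. It is an adaptation rather than the T-SQL above: SQLite has no TOP or IDENTITY, so the sketch assumes `ORDER BY ... DESC LIMIT 1` as the equivalent of TOP 1 with MAX, and inserts the ids explicitly to mimic the identity values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, searchstring TEXT);
    CREATE TABLE table2 (id2 INTEGER, searchstring2 TEXT, newsearchstring TEXT);
    CREATE TABLE table3 (id3 INTEGER, id2 INTEGER, table3srow TEXT);

    -- Ids are written out by hand to mirror IDENTITY(1,1), (10,10) and (100,100).
    INSERT INTO table1 VALUES (1, 'something'), (2, 'something else'), (3, 'something');
    INSERT INTO table2 VALUES
        (10, 'something', 'newvalue1'),
        (20, 'something else', 'this will not be shown'),
        (30, 'something', 'this will be returned by query 2');
    INSERT INTO table3 VALUES
        (100, 10, 'not relevant'),
        (200, 20, 'not relevant'),
        (300, 30, 'This is from table 3');
""")

# query1 and query2 each return one row, so the cross join yields one row;
# query3 is then attached through the id2 value found by query2.
combined = conn.execute("""
    SELECT query1.id, query1.searchstring,
           query2.id2, query2.newsearchstring,
           query3.table3srow
    FROM (SELECT id, searchstring FROM table1
          WHERE searchstring = 'something'
          ORDER BY id DESC LIMIT 1) AS query1
    CROSS JOIN (SELECT id2, newsearchstring FROM table2
                WHERE searchstring2 = 'something'
                ORDER BY id2 DESC LIMIT 1) AS query2
    JOIN table3 AS query3 ON query3.id2 = query2.id2
""").fetchall()

print(combined)
```

A fourth table would be attached the same way as table3, with another JOIN on `query2.id2`.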
