Hive Query with a large WHERE Condition - hive

I am writing a Hive query to pull about 2,000 unique keys from a table.
I keep getting this error: java.lang.StackOverflowError
My query is basic and looks like this:
SELECT * FROM table WHERE (Id = 1 or Id = 2 or Id = 3 or Id = 4)
My WHERE clause goes all the way up to 2,000 unique IDs, and I receive the error above. Does anyone know of a more efficient way to do this, or a way to get this query to work?
Thanks!

You can use SPLIT and EXPLODE to convert a comma-separated string into rows, and then use IN or EXISTS.
Using IN
SELECT * FROM yourtable t WHERE
t.ID IN
(
SELECT
explode(split('1,2,3,4,5,6,1998,1999,2000',',')) as id
) ;
Using EXISTS
SELECT * FROM yourtable t WHERE
EXISTS
(
SELECT 1 FROM (
SELECT
explode(split('1,2,3,4,5,6,1998,1999,2000',',')) as id
) s
WHERE s.id = t.id
);

If the IDs form a contiguous range, use the BETWEEN clause instead of listing every unique ID:
SELECT ID FROM table WHERE ID BETWEEN 1 AND 2000 GROUP BY ID;

You can create a table for these IDs and then use an EXISTS condition against the new table to get only your specific IDs.
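The lookup-table idea can be sketched as follows. This is only an illustration run against SQLite through Python's sqlite3 module, not against Hive; the table names yourtable and wanted_ids, the payload column, and the sample keys are all assumptions made up for the example.

```python
import sqlite3

# SQLite stands in for Hive here; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(i, f"row-{i}") for i in range(1, 5001)],
)

# Load the keys into their own table instead of writing a long OR chain.
wanted = list(range(1, 4001, 2))  # 2,000 example keys: 1, 3, 5, ..., 3999
conn.execute("CREATE TABLE wanted_ids (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO wanted_ids VALUES (?)", [(i,) for i in wanted])

# The WHERE clause stays the same size no matter how many keys there are.
rows = conn.execute(
    """
    SELECT t.*
    FROM yourtable t
    WHERE EXISTS (SELECT 1 FROM wanted_ids w WHERE w.id = t.id)
    ORDER BY t.id
    """
).fetchall()

print(len(rows))
```

The query text no longer grows with the number of keys, which is the property that matters for a parser that overflows on a very long OR chain.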

Related

Microsoft SQL Server - Convert column values to list for SELECT IN

I have this (3 int columns in one table)
Int1 Int2 Int3
---------------
1 2 3
I would like to run such query with another someTable:
SELECT * FROM someTable WHERE someInt NOT IN (1,2,3)
where 1,2,3 are the INT column values converted into a list that I can use in a NOT IN clause.
Any suggestions on how to achieve this without stored procedures in Microsoft SQL Server 2019?
If you want rows in some table that are not in one of three columns of another table, then use not exists:
select t.*
from sometable t
where not exists (select 1
from t t2
where t.someint in (t2.int1, t2.int2, t2.int3)
);
The subquery returns a row where there is a match. The outer query then rejects any rows with a match.
Seems like you actually want a NOT EXISTS?
SELECT {Your Columns}
FROM dbo.someTable sT
WHERE NOT EXISTS (SELECT 1
FROM dbo.oneTable oT
WHERE sT.someInt IN (oT.int1,oT.int2,oT.int3));
An alternative method would be to unpivot the data, and then use an equality operator:
SELECT {Your Columns}
FROM dbo.someTable sT
WHERE NOT EXISTS (SELECT 1
FROM dbo.oneTable oT
CROSS APPLY (VALUES(oT.int1),(oT.int2),(oT.int3))V(I)
WHERE V.I = sT.someInt);
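To check the logic on a small data set, here is a sketch using SQLite through Python's sqlite3 module rather than SQL Server. The sample rows are made up. It also shows that writing NOT IN inside the NOT EXISTS inverts the result, which is an easy mistake to make.

```python
import sqlite3

# SQLite stand-in for SQL Server; the sample data is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE oneTable (int1 INTEGER, int2 INTEGER, int3 INTEGER);
    INSERT INTO oneTable VALUES (1, 2, 3);
    CREATE TABLE someTable (someInt INTEGER);
    INSERT INTO someTable VALUES (1), (2), (3), (4), (5);
""")

# Rows of someTable whose value appears in none of the three columns.
kept = [r[0] for r in conn.execute("""
    SELECT sT.someInt
    FROM someTable sT
    WHERE NOT EXISTS (SELECT 1
                      FROM oneTable oT
                      WHERE sT.someInt IN (oT.int1, oT.int2, oT.int3))
    ORDER BY sT.someInt
""")]

# Double negative: NOT EXISTS combined with NOT IN keeps the matching rows.
inverted = [r[0] for r in conn.execute("""
    SELECT sT.someInt
    FROM someTable sT
    WHERE NOT EXISTS (SELECT 1
                      FROM oneTable oT
                      WHERE sT.someInt NOT IN (oT.int1, oT.int2, oT.int3))
    ORDER BY sT.someInt
""")]

print(kept, inverted)
```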

Need Sorting With External Array or Comma Separated data

I am working with PostgreSQL 8.0.2 and I have this table:
create table rate_date (id serial, rate_name text);
and its data is
id rate_name
--------------
1 startRate
2 MidRate
3 xlRate
4 xxlRate
After a select, the data is shown in the default order or in the order of an ORDER BY applied to a column of the same table. My requirement is that I get the data from a separate entity as (xlRate, MidRate, startRate, xxlRate), and I want to use this list to sort the select on table rate_date. I have tried a join against VALUES, but it is not working, and I cannot think of another solution that would work. If anyone has an idea, please share the details.
Output should be
xlRate
MidRate
startRate
xxlRate
My attempt/thinking:
select id, rate_name
from rate_date r
join (
VALUES (1, 'xlRate'),(2, 'MidRate')
) as x(a,b) on x.b = c.rate_name
I am not sure if this is helpful but in Oracle you could achieve that this way:
select *
from
(
select id, rate_name,
case rate_name
when 'xlRate' then 1
when 'MidRate' then 2
when 'startRate' then 3
when 'xxlRate' then 4
else 100
end my_order
from rate_date r
)
order by my_order
Maybe you can do something similar in PostgreSQL?
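If the ordered list comes from outside the database, the CASE expression can be generated from it instead of being typed by hand. The sketch below assumes the list arrives in application code and uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL or Oracle; the variable external_order is hypothetical.

```python
import sqlite3

# SQLite stand-in; table and data follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rate_date (id INTEGER PRIMARY KEY, rate_name TEXT)")
conn.executemany(
    "INSERT INTO rate_date (rate_name) VALUES (?)",
    [("startRate",), ("MidRate",), ("xlRate",), ("xxlRate",)],
)

# Hypothetical external list that dictates the sort order.
external_order = ["xlRate", "MidRate", "startRate", "xxlRate"]

# One WHEN branch per name; the names are bound as parameters, and only the
# integer positions are formatted into the SQL text.
whens = " ".join(
    f"WHEN ? THEN {pos}" for pos, _ in enumerate(external_order, start=1)
)
fallback = len(external_order) + 1  # names not in the list sort last
sql = (
    "SELECT rate_name FROM rate_date "
    f"ORDER BY CASE rate_name {whens} ELSE {fallback} END"
)

sorted_names = [r[0] for r in conn.execute(sql, external_order)]
print(sorted_names)
```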

union unusual behavior

I am trying to union two tables with the same fields into one master table, but for some reason I'm getting a weird result.
select count(*)
from staging.sandoval_parcels
where parcel_id = 0;
returns 0
select count(*)
from staging.bernalillo_parcels
where parcel_id = 0;
returns 0
but when I merge the tables using
CREATE TABLE staging.master_parcels
AS
SELECT * FROM bernalillo_parcels
UNION ALL
SELECT * FROM sandoval_parcels
;
then
select count(*)
from staging.master_parcels
where parcel_id = 0;
returns 85553
Both tables have the same fields with the same data types, and no field has missing values, so there are no NULLs. Why am I getting parcel_id = 0 in the merged table when neither source table has parcel_id = 0?
The order of the fields matters. Replace the * with explicit column names; otherwise the columns of the second query are matched to the columns of the first query by position, not by name, so a value can end up in a different field than the one you intended.
CREATE TABLE staging.master_parcels
AS
SELECT parcel_id, field1 ... FROM bernalillo_parcels
UNION ALL
SELECT parcel_id, field1 ... FROM sandoval_parcels
;
A union matches columns by position, so it will merge the tables even if the column order is not the same. If all of the columns match and are in the same order, a plain UNION returns distinct rows and does not create duplicates when the same row appears in both tables, whereas UNION ALL keeps every row. Having the column order and data types be the same is important.
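The positional matching is easy to reproduce on a toy example. The sketch below uses SQLite through Python's sqlite3 module, not the original database; the acres column and the sample values are assumptions added only to give the two tables a different column order.

```python
import sqlite3

# SQLite stand-in; "acres" is a hypothetical second column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bernalillo_parcels (parcel_id INTEGER, acres INTEGER);
    CREATE TABLE sandoval_parcels  (acres INTEGER, parcel_id INTEGER);
    INSERT INTO bernalillo_parcels VALUES (101, 0);
    INSERT INTO sandoval_parcels  VALUES (0, 202);
""")

# SELECT *: sandoval's first column (acres) lands under parcel_id.
star = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT * FROM bernalillo_parcels
        UNION ALL
        SELECT * FROM sandoval_parcels
    ) WHERE parcel_id = 0
""").fetchone()[0]

# Explicit column lists keep each value in the intended field.
explicit = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT parcel_id, acres FROM bernalillo_parcels
        UNION ALL
        SELECT parcel_id, acres FROM sandoval_parcels
    ) WHERE parcel_id = 0
""").fetchone()[0]

print(star, explicit)
```

Neither source table has a parcel_id of 0, yet the SELECT * version reports one, because the 0 stored in sandoval's acres column was read as a parcel_id.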

How to fill Joining date and id based on following requirement?

I want to fill in the joining date and ID by creating a new view; the output should look like the second image.
You might be looking for something like:
UPDATE mytable
SET tofill.ID = fillvalues.ID
,tofill.JOININGDATE = fillvalues.JOININGDATE
FROM mytable tofill
INNER JOIN
( SELECT DISTINCT ID, JOININGDATE, NAME
FROM mytable
WHERE ID IS NOT NULL
AND JOININGDATE IS NOT NULL
) fillvalues
ON tofill.NAME = fillvalues.NAME
WHERE tofill.ID IS NULL
OR tofill.JOININGDATE IS NULL
;
I am not familiar with Oracle, but the statement should be the same or similar.

Reuse of select query result oracle

I've got following query
SELECT ID FROM MARMELADES mrm
where not exists
(SELECT 1 FROM TOYS toys
WHERE mrm.ID = toys.ID
AND mrm.INGREDIENT = toys.INGREDIENT
AND mrm.BOX_TYPE = 2)
AND mrm.BOX_TYPE = 2
It returns 400+ IDs, for example [12, 33, 45, ... , 3405]
Now I want to remove every ID in that list everywhere in my database, not only from MARMELADES and TOYS. For example, I have 35+ tables that can contain these IDs.
I would be happy if this query could be extracted into something like a function, e.g. ALL_UNNEEDED_IDS, so I can use it like this:
DELETE FROM ANOTHER_TABLE_1 WHERE ID IN ( ALL_UNNEEDED_IDS )
DELETE FROM ANOTHER_TABLE_2 WHERE ID IN ( ALL_UNNEEDED_IDS )
DELETE FROM ANOTHER_TABLE_3 WHERE ID IN ( ALL_UNNEEDED_IDS )
DELETE FROM ANOTHER_TABLE_4 WHERE ID IN ( ALL_UNNEEDED_IDS )
...
DELETE FROM ANOTHER_TABLE_35 WHERE ID IN ( ALL_UNNEEDED_IDS )
Is it possible in Oracle to reuse such results?
Use the first query within your subsequent queries, i.e.:
DELETE FROM ANOTHER_TABLE_1 WHERE ID IN (
SELECT ID FROM MARMELADES mrm
where not exists
(SELECT 1 FROM TOYS toys
WHERE mrm.ID = toys.ID
AND mrm.INGREDIENT = toys.INGREDIENT
AND mrm.BOX_TYPE = 2)
AND mrm.BOX_TYPE = 2
);
When you get to the TOYS and MARMELADES tables, you'll need a temporary holder table, as @Gordon suggests.
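A holder table could look roughly like the sketch below. It runs on SQLite through Python's sqlite3 module rather than Oracle, so the temporary-table syntax differs from what Oracle would need; the sample rows and the short list of target tables are invented for the example. The point is only that the IDs are computed once, before any rows are deleted, so deleting from TOYS or MARMELADES cannot change the list.

```python
import sqlite3

# SQLite stand-in for Oracle; sample data is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MARMELADES (ID INTEGER, INGREDIENT TEXT, BOX_TYPE INTEGER);
    CREATE TABLE TOYS (ID INTEGER, INGREDIENT TEXT);
    CREATE TABLE ANOTHER_TABLE_1 (ID INTEGER);
    CREATE TABLE ANOTHER_TABLE_2 (ID INTEGER);
    INSERT INTO MARMELADES VALUES (12, 'sugar', 2), (33, 'plum', 2), (50, 'fig', 1);
    INSERT INTO TOYS VALUES (33, 'plum');
    INSERT INTO ANOTHER_TABLE_1 VALUES (12), (33), (50);
    INSERT INTO ANOTHER_TABLE_2 VALUES (12), (99);
""")

# Materialize the result of the original query once.
conn.execute("""
    CREATE TEMP TABLE ALL_UNNEEDED_IDS AS
    SELECT ID FROM MARMELADES mrm
    WHERE NOT EXISTS (SELECT 1 FROM TOYS toys
                      WHERE mrm.ID = toys.ID
                        AND mrm.INGREDIENT = toys.INGREDIENT
                        AND mrm.BOX_TYPE = 2)
      AND mrm.BOX_TYPE = 2
""")

# Fixed list of target tables; the source tables are handled like the rest.
tables = ["ANOTHER_TABLE_1", "ANOTHER_TABLE_2", "TOYS", "MARMELADES"]
for table in tables:
    conn.execute(
        f"DELETE FROM {table} WHERE ID IN (SELECT ID FROM ALL_UNNEEDED_IDS)"
    )

remaining = {
    t: [r[0] for r in conn.execute(f"SELECT ID FROM {t} ORDER BY ID")]
    for t in tables
}
print(remaining)
```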