Find whether id matches and substitute using Case Hive query

I have a table called "Scan" that holds customer transactions. An individual_id appears once for every transaction, and the table contains columns such as scan_id.
I have another table called ids, which contains random individual_ids sampled from the Scan table.
I would like to join ids with scan and get a single record per id, with a flag showing whether any of its scan_ids matches certain values.
Suppose the data looks like this:
Scan table
Ids scan_id
---- ------
1 100
1 111
1 1000
2 100
2 111
3 124
4 1000
4 111
Ids table
id
1
2
3
4
5
I want the output below, i.e. MT is 1 if any scan_id for the id matches either 100 or 1000, and 0 otherwise:
Id MT
------ ------
1 1
2 1
3 0
4 1
I executed the query below and got an error:
select MT, d.individual_id
from
(
select
CASE
when scan_id in (90069421,53971306,90068594,136739913,195308160) then 1
ELSE 0
END as MT
from scan cs join ids r
on cs.individual_id = r.individual_id
where
base_div_nbr =1
and
country_code ='US'
and
retail_channel_code=1
and visit_date between '2019-01-01' and '2019-12-31'
) as d
group by individual_id;
I would appreciate any suggestions or help with this Hive query. If there is a more efficient way of getting this done, please let me know.

Use a group by:
select s.individual_id,
max(case when s.scan_id in (100, 1000) then 1 else 0 end) as mt
from scan s
group by s.individual_id;
The ids table doesn't seem to be needed for this query.
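As a rough check of the query above, here is a minimal sketch that runs it against the sample rows from the question. It uses Python's built-in sqlite3 module instead of Hive, which is an assumption made only to keep the example self-contained; the column name individual_id follows the answer's query, and the ORDER BY is added just to make the output deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for the Hive table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scan (individual_id INTEGER, scan_id INTEGER)")
conn.executemany(
    "INSERT INTO scan VALUES (?, ?)",
    [(1, 100), (1, 111), (1, 1000), (2, 100), (2, 111),
     (3, 124), (4, 1000), (4, 111)],
)

# One row per individual_id; MAX over the 0/1 flag is 1 if any scan_id matched.
query = """
SELECT s.individual_id,
       MAX(CASE WHEN s.scan_id IN (100, 1000) THEN 1 ELSE 0 END) AS mt
FROM scan s
GROUP BY s.individual_id
ORDER BY s.individual_id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 1), (2, 1), (3, 0), (4, 1)]
```

If the result has to be limited to the sampled ids, an inner join from scan to ids before the GROUP BY should do that; with the sample data, id 5 has no scan rows and would not appear.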

Related

How to check how many times a record is repeated in different tables

I have two tables here:
Table 1:
process_id customer_id
16 1
21 1
22 1
Table 2:
process_id customer_id
16 1
16 1
22 1
I would like to check how many times each row in table 1 is repeated in table 2.
For example, row 1 in table 1 is repeated 2 times in table 2, row 2 repeated 0 times and row 3 repeated 1 time. I'm not sure how to loop through each row in table 1 and get this result.

As I understood, this is what you are asking for:
select table1.process_id, table1.customer_id, count(table2.process_id) as table2count
from table1 left outer join table2 on table1.process_id = table2.process_id and table1.customer_id = table2.customer_id
group by table1.process_id, table1.customer_id;
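A small sketch of that left join on the two sample tables, run through Python's sqlite3 module (an assumption, since the question does not name a database). The ORDER BY is added only for a stable result order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (process_id INTEGER, customer_id INTEGER)")
conn.execute("CREATE TABLE table2 (process_id INTEGER, customer_id INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(16, 1), (21, 1), (22, 1)])
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [(16, 1), (16, 1), (22, 1)])

# COUNT(table2.process_id) ignores the NULLs produced for unmatched rows,
# so a table1 row with no partner in table2 gets a count of 0.
query = """
SELECT t1.process_id, t1.customer_id, COUNT(t2.process_id) AS table2count
FROM table1 t1
LEFT OUTER JOIN table2 t2
  ON t1.process_id = t2.process_id AND t1.customer_id = t2.customer_id
GROUP BY t1.process_id, t1.customer_id
ORDER BY t1.process_id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(16, 1, 2), (21, 1, 0), (22, 1, 1)]
```

Counting a column from table2 rather than using COUNT(*) is what makes the unmatched row report 0 instead of 1.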

Left Join Display All Data From Table1 and Table2

I am trying to do a left join so that I get all of my rows from Table 1 even if there is no value corresponding to it in the second table.
My structures are:
Location Table:
ID LocName
1 Trk1
2 Trk2
3 Trk3
4 Unk
Quantity Table:
ID PartID Quantity LocationID
1 1 2 1
2 3 12 2
3 2 6 1
4 6 8 3
5 6 5 1
I am trying to join but also make a query on a specific PartID. My query is:
SELECT
INV_LOCATIONS.ID AS LocationID,
INV_LOCATIONS.NAME AS LocationName,
INV_QUANTITY.QUANTITY AS Quantity
FROM INV_LOCATIONS
LEFT JOIN INV_QUANTITY ON INV_LOCATIONS.ID = INV_QUANTITY.LOCATION_ID
WHERE INV_QUANTITY.PART_ID = 1;
My output right now would be:
ID LocName Quantity
1 Trk1 5
3 Trk3 8
The Desired output is:
ID LocName Quantity
1 Trk1 5
2 Trk2 NULL/0
3 Trk3 8
4 Unk NULL/0
I assume this is because the WHERE INV_QUANTITY.PART_ID = 1 condition forces a matching row to exist in the quantity table. I need to restrict the result to the right part, but how do I also include locations that have no row for that part? I know I have done something very similar before, but I cannot remember which project it was in, so I cannot find the code anywhere.

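A condition on the right-hand table in the WHERE clause discards the NULL-extended rows, which turns the LEFT JOIN into an inner join. You need to move the filtering logic to the ON clause: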
SELECT il.ID AS LocationID, il.NAME AS LocationName,
iq.QUANTITY AS Quantity
FROM INV_LOCATIONS il LEFT JOIN
INV_QUANTITY iq
ON il.ID = iq.LOCATION_ID AND iq.PART_ID = 1;
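To see the difference between the two placements of the filter, here is a minimal sketch using Python's sqlite3 module in place of the original database (an assumption). Table and column names follow the queries above, and the rows are the sample rows from the question. With those rows and PART_ID = 1, only location 1 has a matching quantity row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inv_locations (id INTEGER, name TEXT)")
conn.execute(
    "CREATE TABLE inv_quantity "
    "(id INTEGER, part_id INTEGER, quantity INTEGER, location_id INTEGER)"
)
conn.executemany(
    "INSERT INTO inv_locations VALUES (?, ?)",
    [(1, "Trk1"), (2, "Trk2"), (3, "Trk3"), (4, "Unk")],
)
conn.executemany(
    "INSERT INTO inv_quantity VALUES (?, ?, ?, ?)",
    [(1, 1, 2, 1), (2, 3, 12, 2), (3, 2, 6, 1), (4, 6, 8, 3), (5, 6, 5, 1)],
)

# Filter in WHERE: rows without a matching part are dropped after the join.
where_rows = conn.execute("""
    SELECT il.id, il.name, iq.quantity
    FROM inv_locations il
    LEFT JOIN inv_quantity iq ON il.id = iq.location_id
    WHERE iq.part_id = 1
    ORDER BY il.id
""").fetchall()

# Filter in ON: every location is kept, quantity is NULL where nothing matches.
on_rows = conn.execute("""
    SELECT il.id, il.name, iq.quantity
    FROM inv_locations il
    LEFT JOIN inv_quantity iq
      ON il.id = iq.location_id AND iq.part_id = 1
    ORDER BY il.id
""").fetchall()

print(where_rows)  # [(1, 'Trk1', 2)]
print(on_rows)     # [(1, 'Trk1', 2), (2, 'Trk2', None), (3, 'Trk3', None), (4, 'Unk', None)]
```

If 0 is preferred over NULL for the missing quantities, wrapping the column in COALESCE(iq.quantity, 0) is one common option.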

Count Values From Column

I am trying to create a script in SQL Server that will count values under a column but I want it to still report missing values not counted.
Currently, I have the following setup with a group by, but it cuts the results in half:
select count(ID) as Count, Building, ID
from table
group by Building, ID
I want my output to show the count per ID as well as null values if there was nothing to count per ID.
Building ID
1234 1
1234 2
4567 3
4567 4
8910 5
0 6
Want the Output To Be:
Building ID Count
1234 1 2
1234 2 2
4567 3 2
4567 4 2
8910 5 1
0 6 0
The total population is 200,000. I want to see 200,000 records with the total counts per name or null values. When I run the above script, I obtain 1's per record.
Example: If ID 1 has a count of 2 and ID 2 has a count of 2, I want both IDs to show up as separate counts per ID.

You need to get the count per building from a sub-query and then join that sub-query back to the table:
SELECT
CASE WHEN t1.building is null THEN 0
ELSE t1.building END AS Building,
t1.id,
CASE WHEN t1.building is null THEN 0
ELSE t2.count END AS Count
FROM table t1
JOIN (SELECT building, COUNT(*) as count
FROM table
GROUP BY building) AS t2 ON t2.building = t1.building OR (t2.building is null AND t1.building is null)
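A sketch of this sub-query join on the question's sample data, run with Python's sqlite3 module (an assumption; the question targets SQL Server). Two further assumptions: the table is named records here because table is a reserved word, and ID 6 is stored with a NULL building, which is the case the CASE expressions above turn into 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "records" is a hypothetical table name standing in for the question's "table".
conn.execute("CREATE TABLE records (building INTEGER, id INTEGER)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(1234, 1), (1234, 2), (4567, 3), (4567, 4), (8910, 5), (None, 6)],
)

# Count rows per building in a sub-query, then join the counts back so that
# every original row is returned. NULL buildings are matched explicitly
# because NULL = NULL is not true in SQL.
query = """
SELECT CASE WHEN t1.building IS NULL THEN 0 ELSE t1.building END AS building,
       t1.id,
       CASE WHEN t1.building IS NULL THEN 0 ELSE t2.cnt END AS cnt
FROM records t1
JOIN (SELECT building, COUNT(*) AS cnt
      FROM records
      GROUP BY building) AS t2
  ON t2.building = t1.building
  OR (t2.building IS NULL AND t1.building IS NULL)
ORDER BY t1.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each of the six input rows comes back once, with the count of rows sharing its building alongside it.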

Try:
select count(name) as Count, ID, Building
from table
group by ID, Building

Top 1 record of a grouped data in hive sql

I have a table with 3 different columns pid,org,amount as shown below.
pid org amount
---- ---- ------
1 1 5
1 1 6
2 1 2
2 1 4
I need the records grouped by pid and org with the maximum amount.
Since Hive does not support all of the rich functionality of SQL, I need a simple way of obtaining this.
The result table should be like
pid org amount
---- ---- ------
1 1 6
2 1 4

select pid,org,max(amount) from table1 group by pid,org;

Use the max function, which returns the maximum value of the column within each group:
select pid,org,max(amount) from data
group by pid,org;
If that does not work, cast amount to double:
select pid,org,max(CAST(amount as double)) from data
group by pid,org;
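For reference, a minimal sketch of the grouped max on the sample rows, using Python's sqlite3 module rather than Hive (an assumption made so the example runs on its own). The table name data follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (pid INTEGER, org INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [(1, 1, 5), (1, 1, 6), (2, 1, 2), (2, 1, 4)],
)

# One row per (pid, org) pair, carrying the largest amount in that group.
query = """
SELECT pid, org, MAX(amount) AS amount
FROM data
GROUP BY pid, org
ORDER BY pid, org
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 1, 6), (2, 1, 4)]
```

This only returns the grouping columns and the maximum itself; if other columns from the winning row were needed, a different approach (for example a ranking window function) would be required.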

Count number of not exist in child table

Essentially, what I'm trying to do is count the number of rows for which something doesn't exist in an audit/history table. I'd like the following query to return a count of one per detail. Currently it gives me one per row in the history table.
--Detail Table
ID DETAIL_GROUP
1 A
2 B
3 B
--Detail History Table
DETAIL_ID_FK VALUE1
1 NOT_MATCH
1 NOT_MATCH
2 MATCH
2 NOT_MATCH
3 MATCH
3 NOT_MATCH
SELECT D.DETAIL_GROUP, COUNT(*)
FROM DETAIL D
WHERE (NOT EXISTS(
SELECT NULL
FROM DETAIL_HISTORY HI
WHERE HI.D_ID_FK = D.ID
AND HI.VALUE1 = 'MATCH'))
GROUP BY D.DETAIL_GROUP;
I'd like to see the following result:
DETAIL_GROUP COUNT(*)
A 1
but I'm receiving the following result:
DETAIL_GROUP COUNT(*)
A 2
Thank you in advance for any assistance provided.

Assuming that your detail history table is as follows:
D_ID VALUE1
1 MATCH
1 NOT_MATCH
2 MATCH
2 NOT_MATCH
3 MATCH
3 NOT_MATCH
The below query:
SELECT d.detail_group, count(*)
FROM detail d
JOIN detail_history dh ON dh.d_id = d.id
WHERE dh.value1 = 'MATCH'
GROUP BY d.detail_group
Would produce:
DETAIL_GROUP COUNT(*)
A 1
B 2
The above query joins each detail row to its history rows, keeps only the rows where value1 is 'MATCH', and then counts the remaining rows per detail_group.
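For comparison, here is a sketch of the NOT EXISTS form from the question, run on the question's own sample rows with Python's sqlite3 module (an assumption, as the database is not named). It uses the column name DETAIL_ID_FK from the table listing; the question's query refers to D_ID_FK, so which name is correct is unknown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE detail (id INTEGER, detail_group TEXT)")
conn.execute("CREATE TABLE detail_history (detail_id_fk INTEGER, value1 TEXT)")
conn.executemany("INSERT INTO detail VALUES (?, ?)", [(1, "A"), (2, "B"), (3, "B")])
conn.executemany(
    "INSERT INTO detail_history VALUES (?, ?)",
    [(1, "NOT_MATCH"), (1, "NOT_MATCH"), (2, "MATCH"),
     (2, "NOT_MATCH"), (3, "MATCH"), (3, "NOT_MATCH")],
)

# NOT EXISTS tests each detail row once, so a detail is counted a single time
# no matter how many history rows it has.
query = """
SELECT d.detail_group, COUNT(*) AS n
FROM detail d
WHERE NOT EXISTS (
    SELECT 1
    FROM detail_history hi
    WHERE hi.detail_id_fk = d.id
      AND hi.value1 = 'MATCH'
)
GROUP BY d.detail_group
ORDER BY d.detail_group
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('A', 1)]
```

On these rows only detail 1 has no 'MATCH' entry, so group A is counted once and group B does not appear.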
