Query With Distinct Value - sql

I have two tables Tabel1 and Table2. My tables look something like this.
Table1 has three fields Cust_Number, sales_org and Orders.
Table 2 has fields by name Cust_Number, sales_org, BU and Dist_Channel.
Dist_Channel is missing in table1 hence I have to get the Distinct of Cust_Number and sales_org from TABLE 2 and then do a join with table1 to get the corresponding BU.
I was able to do it in MS access by creating one additional query to pull the distinct Numbers and then using that query in my final query.
Could anybody give some suggestions on this?

SELECT t1.cust_number, t2.bu, t2.dist_channel
FROM table1 t1, table2 t2
WHERE t1.cust_number = t2.cust_number
AND t1.sale_org = t2.sales_org
AND -- your actual criteria go here

Try this simple version :
SELECT t1.cust_number, t1.sales_org, t2.BU
FROM Tabel1 t1
INNER JOIN Table2 t2 ON t1.cust_number= t2.cust_number AND t1.sales_org = t2.sales_org

You can do it like this:-
SELECT tbl1.cust_number, tbl1.sales_org tbl2.bu, tbl2.dist_channel
FROM Table1 as tbl1, Table2 as tbl2
WHERE tbl1.cust_number = tbl2.cust_number
AND tbl1.sale_org = tbl2.sales_org
......

Cust_Number, sales_org, are common in both the table you can do this by left join
also first check distinct count of Cust_Number and sales_org in table 2 is same or not
SELECT Tabel1.Cust_Number, Tabel1.sales_org, Table2.BU FROM Tabel1 left JOIN Table2 ON Tabel1.Cust_Number = Table2.Cust_Number AND Tabel1.sales_org = Table2.sales_org

Santhosa, as I understand the problem, it's that in Table2, you can have multiple rows that have the same customer number and sales org (but different dist_channels) so that a simple join to the table will multiply the number of rows coming out of Table1, which is not what you want. Instead, you want to use Table2 simply as a look up for the BU for a customer number and sales org, and we are assuming that there is a functional dependency from cust_number,sale_org --> BU, i.e. that for any given cust_number, sale_org pair in Table2, the BU is always the same.
If that assumption is true, then you can do the following:
SELECT tb1.cust_number, tb1.sales_org, tb2.bu
FROM Table1 AS tb1
JOIN (
SELECT DISTINCT cust_number, sales_org, bu
FROM Table2) AS tb2
ON tb1.cust_number = tb2.cust_number
AND tb1.sales_org = tb2.sales_org
But keep in mind that if you have multiple BUs for a given cust_number, sales_org pair, then this will still result in multiple rows being returned for a given cust_number, sales_org pair in Table1.

Related

Comparing base table value with second table's sum of value with group by

I have two tables:
One is base table and second is transaction table. I want to compare base table value with second table's sum of value with group by.
Table1(T1Id,Amount1,...)
Tabe2(T2Id,T1ID,Amount2)
I want those rows from table 1 WHere SUM of Table2's SUM( Amount2) is greater or equal table1's Amount1.
*T1ID is in relation with both tables
* The SQL query have many joins with other table for data retriving.
One approach uses a join:
SELECT t1.T1Id, t1.Amount1
FROM Table1 t1
INNER JOIN Table2 t2
ON t1.T1Id = t2.T1ID
GROUP BY
t1.T1Id, t1.Amount1
HAVING
SUM(t2.Amount2) >= t1.Amount1;
We can also try doing this via a correlated subquery:
SELECT t1.T1Id, t1.Amount1
FROM Table1 t1
WHERE t1.Amount1 <= (SELECT SUM(t2.Amount2) FROM Table2 t2
WHERE t1.T1Id = t2.T1ID);
I would use something similar to the query below:
SELECT
a.T1Id, a.Amount1, SUM(b.Amount2)
FROM Table1 a
INNER JOIN Table2 b on b.T1Id = a.T1Id
GROUP BY a.T1Id, a.Amount1
HAVING SUM(b.Amount2) >= a.Amount1;
Basically what the query above does is give you the ID, Amount from table 1 and the summed amount from table 2. The HAVING clause at the end of query filters out those records where the summed amount from the second table is smaller than the amount from the first one.
If you want to add further table joins to the query, you can do so by adding as many joins as you wish. I would recommend having a referenced ID for each table you are joining in the Table1 table.

How to map each distinct value of a column in one table with each distinct value of a column in another table in Hive

I have two tables in Hive, Table1 and Table2. I want to get each distinct customerID in Table1 and map it to each distinct value in a column called category of Table2. However I am a bit lost on how to do this in hive. A better example of what I am trying to do is the following: Let's say Table1 contains 5 distinct customerID's and Table2 contains 3 distinct categories. I want my query result to look something like the following:
However Table1 and Table2 do not have any columns in common so I am a bit lost on how to perform a join on this two tables in hive. Is this task possible in hive? Any insights on this would be greatly appreciated!
You can do that with a cross join of distinct values from both tables.
select t1.customerid,t2.categories
from (select distinct customerid from tbl1) t1
cross join (select distinct categories from tbl2) t2

SQL to sum a total of a column where 2 columns match in a different table

SO I have 2 tables and would like to SUM the total of a column in one table where 2 other columns match in another table.
In table1 I have acc_ref and bill_no.
acc_ref is different but bill_no could be 1-10 (so 2 or more acc_ref could have the same bill_no)
In table2 I have acc_ref, bill_no and tran_amnt.
Εach acc_ref has multiple rows and I want to SUM the tran_amnt but only if acc_ref and bill_no both match in table1.
I tried this but I get an error
'The columns in the SELECT clause must be contained in the GROUP BY
clause'
select a.acc_ref, a.bill_no
from table1 a
where exists (select acc_ref, bill_no, SUM (tran_amount)
from table2 b
where a.acc_ref = b.acc_ref
and a.bill_no = b.bill_no
group by acc_ref)
Apologies if this is very basic and obvious but I'm struggling!!
In you description it seems like table1 does not contain any useful information. Because both columns you give also exist in table2. So if nothing else from table1 is needed, you could just remove table1 from the query). Still with your problem you should do a simple join
SELECT a.acc_ref, a.bill_no, SUM(b.tran_amount)
FROM table1 a
JOIN table2 b ON b.acc_ref = a.acc_ref AND b.bill_no=a.bill_no
GROUP BY a.acc_ref, a.bill_no
I believe you should use case:
Sample:
SELECT
TABEL1.Id,
CASE WHEN EXISTS (SELECT Id FROM TABLE2 WHERE TABLE2.ID = TABLE1.ID)
THEN 'TRUE'
ELSE 'FALSE'
END AS NewFiled
FROM TABLE1

Counts for distinct values in different tables where columns are common to separate tables

I have no idea if that title conveys what I want it to.
I have two tables containing phone records (one for each account) and I'd like to get call counts for the numbers that are common to each account. In other words:
Table 1
Number ...
8675309
8675309
8675310
8675310
8675312
Table 2
Number ...
8675309
8675309
8675309
8675310
8675311
Querying with something like:
SELECT DISTINCT table1.number, COUNT(table1.number), COUNT(table2.number) FROM table1, table2 WHERE table1.number = table2.number GROUP BY table1.number
would hopefully produce:
8675309|2|3
8675310|2|1
Instead, it currently produces something like:
8675309|6|6
8675310|2|2
It appears to be multiplying the count from each table. Presumably, this is because I'm not joining the tables the way I should for this goal. Or because by the time I ask for COUNT(table1.number) the tables have already been joined in some multiplicative way. Should I not be doing a JOIN and instead something that would read like: "where table2.number CONTAINS(table1.number)"?
Any tips?
One way is with subqueries:
SELECT t1.number, t1.table1Count, t2.table2Count
from (select number, count(*) table1Count
from table1
group by number) t1
inner join (select number, count(*) table2Count
from table2
group by number) t2
on t2.number = t1.number
This assumes that you only want to list numbers that appear in both tables. If you want to list all numbers that appear in one table and optionally the other, you'd use a left or right outer join; if you wanted all numbers that appeared in either or both tables, you'd use a full outer join.
Another and potentially more efficient way requires the presence of a single column that uniquely identifies each row in each table:
SELECT
t1.number
,count(distinct t1.PrimaryKeyValue) table1Count
,count(distinct t2.PrimaryKeyValue) table2Count
from table1 t1
inner join table2 t2
on t2.number = t1.number
group by t1.number
This makes the same assumptions as before, and can also be adjusted modified via outer joins.
One way is to use a couple of derived tables to compute your counts separately and then join them to produce your final summary:
select t1.number, t1.count1, t2.count2
from (select number, count(number) as count1 from table1 group by number) as t1
join (select number, count(number) as count2 from table2 group by number) as t2
on t1.number = t2.number
There are probably other ways but that should work and it is the first thing that came to mind.
You're getting your "multiplicative" effect pretty much for the reasons you suspect. If you have this:
table1(id,x) table2(id,x)
------------ ------------
1, a 4, a
2, a 5, a
3, b 6, b
Then joining them on x will give you this:
1,a, 4,a
1,a, 5,a
2,a, 4,a
2,a, 5,a
...
Usually you could use a GROUP BY to sort out the duplicates but you can't do that because it would mess up your per-table counts.
Try this:
select tab1.number,tab1.num1,tab2.num2
from
(SELECT number, COUNT(number) as num1 from table1 group by number) as tab1
left join
(SELECT number, COUNT(number) as num2 from table2 group by number) as tab2
on tab1.number = tab2.number

SQL SELECT across two tables

I am a little confused as to how to approach this SQL query.
I have two tables (equal number of records), and I would like to return a column with which is the division between the two.
In other words, here is my not-working-correctly query:
SELECT( (SELECT v FROM Table1) / (SELECT DotProduct FROM Table2) );
How would I do this? All I want it a column where each row equals the same row in Table1 divided by the same row in Table2. The resulting table should have the same number of rows, but I am getting something with a lot more rows than the original two tables.
I am at a complete loss. Any advice?
It sounds like you have some kind of key between the two tables. You need an Inner Join:
select t1.v / t2.DotProduct
from Table1 as t1
inner join Table2 as t2
on t1.ForeignKey = t2.PrimaryKey
Should work. Just make sure you watch out for division by zero errors.
You didn't specify the full table structure so I will assume a common ID column to link rows in the tables.
SELECT table1.v/table2.DotProduct
FROM Table1 INNER JOIN Table2
ON (Table1.ID=Table2.ID)
You need to do a JOIN on the tables and divide the columns you want.
SELECT (Table1.v / Table2.DotProduct) FROM Table1 JOIN Table2 ON something
You need to substitue something to tell SQL how to match up the rows:
Something like: Table1.id = Table2.id
In case your fileds are both integers you need to do this to avoid integer math:
select t1.v / (t2.DotProduct*1.00)
from Table1 as t1
inner join Table2 as t2
on t1.ForeignKey = t2.PrimaryKey
If you have multiple values in table2 relating to values in table1 you need to specify which to use -here I chose the largest one.
select t1.v / (max(t2.DotProduct)*1.00)
from Table1 as t1
inner join Table2 as t2
on t1.ForeignKey = t2.PrimaryKey
Group By t1.v