SQL Server - Join tables so multiple rows become one row - sql

I'm trying to make a Join statement that will combine two tables, one being for employee information and the other for job role info. The structure of the tables is as follows:
Table 1
Table 2
I would like to join these two tables in such a way that JobKey and JobValue would be the same row as the associated employee, and not create duplicate rows. Normally, a join statement would create this:
Instead, what I would like is something like this:
Is there an effective way to do this?
Edit:
Here is the query I'm currently using to join them:
select * from testTable1 as a left join testTable2 as b on a.EmployeeName = b.EmployeeName

You can use conditional aggregation:
select t1.*, t2.jobkey_1, t2.jobvalue_1, t2.jobkey_2, t2.jobvalue_2
from table1 t1 left join
(select t2.employeename,
max(case when seqnum = 1 then jobkey end) as jobkey_1,
max(case when seqnum = 1 then jobvalue end) as jobvalue_1,
max(case when seqnum = 2 then jobkey end) as jobkey_2,
max(case when seqnum = 2 then jobvalue end) as jobvalue_2
from (select t2.*, row_number() over (partition by employeename order by rowid) as seqnum
from table2 t2
) t2
group by t2.employeename
) t2
on t1.employeename = t2.employeename;
Note: You appear to be using the employee name as the join key between the tables. You should really be using the employee id.
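To make the conditional aggregation above concrete, here is a minimal sketch that runs the same pattern against an in-memory SQLite database from Python. SQLite stands in for SQL Server here, and the table names (`employees`, `jobs`), the `JobRowId` ordering column, and the sample rows are all assumptions made up for illustration; the original tables were not shown.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (EmployeeName TEXT, Department TEXT);
    CREATE TABLE jobs (
        JobRowId INTEGER PRIMARY KEY,  -- hypothetical ordering column
        EmployeeName TEXT,
        JobKey TEXT,
        JobValue TEXT
    );
    INSERT INTO employees VALUES ('Ann', 'IT'), ('Bob', 'HR');
    INSERT INTO jobs (EmployeeName, JobKey, JobValue) VALUES
        ('Ann', 'Role', 'Developer'),
        ('Ann', 'Level', 'Senior'),
        ('Bob', 'Role', 'Recruiter');
""")

query = """
SELECT e.EmployeeName, e.Department,
       j.JobKey_1, j.JobValue_1, j.JobKey_2, j.JobValue_2
FROM employees e
LEFT JOIN (
    -- Number each employee's job rows, then spread rows 1 and 2 into columns.
    SELECT EmployeeName,
           MAX(CASE WHEN seqnum = 1 THEN JobKey END)   AS JobKey_1,
           MAX(CASE WHEN seqnum = 1 THEN JobValue END) AS JobValue_1,
           MAX(CASE WHEN seqnum = 2 THEN JobKey END)   AS JobKey_2,
           MAX(CASE WHEN seqnum = 2 THEN JobValue END) AS JobValue_2
    FROM (
        SELECT EmployeeName, JobKey, JobValue,
               ROW_NUMBER() OVER (PARTITION BY EmployeeName
                                  ORDER BY JobRowId) AS seqnum
        FROM jobs
    ) numbered
    GROUP BY EmployeeName
) j ON e.EmployeeName = j.EmployeeName
ORDER BY e.EmployeeName
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each employee comes back as a single row. An employee with only one job row gets NULL (`None` in Python) in the second pair of columns.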

Related

MS Access 2010 - FIRST function in Access what is it in SQL

I am running a script in SQL server from MS Access and not getting the correct results due to FIRST function in MS Access
Select ClientID, ClientREF, FIRST(AgentID) AS FirstOFAgentID, FIRST(AgentREF) AS FirstOFAgentREF
From table 1
Right join table2 ON table1 = table2
Left join table 4 ON table1 = table4
Group by ClientID, ClienREF, AgentID, AgentREF
Your query doesn't make sense. You have FIRST(AgentID), but are including AgentID in the GROUP BY. Your query is equivalent to simply SELECT DISTINCT with your columns.
Presuming you really want a first value, I can readily think of two ways to get it in SQL Server, both using window functions. I prefer conditional aggregation:
select ClientID, ClientREF,
max(case when seqnum = 1 then AgentID end) AS FirstOFAgentID,
max(case when seqnum = 1 then AgentREF end) AS FirstOFAgentREF
from (select ClientID, ClientREF, AgentID, AgentREF,
row_number() over (partition by ClientID, ClientREF order by ?) as seqnum
from table 1 Right join
table2
ON table1 = table2 Left join
table 4
ON table1 = table4
) cc
group by ClientID, ClientREF;
The second uses SELECT DISTINCT with FIRST_VALUE(), which is provided as a window function but not an aggregation function.
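The second approach has no query attached, so here is a rough sketch of what SELECT DISTINCT with FIRST_VALUE() might look like. It runs on in-memory SQLite from Python as a stand-in for SQL Server. The single `client_agents` table, its sample rows, and the choice to order by AgentID are assumptions: the original leaves the ordering column unspecified, and "first" only has a meaning once you pick one.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical flattened result of the joins in the question.
    CREATE TABLE client_agents (
        ClientID INTEGER, ClientREF TEXT, AgentID INTEGER, AgentREF TEXT
    );
    INSERT INTO client_agents VALUES
        (1, 'C-1', 20, 'A-20'),
        (1, 'C-1', 10, 'A-10'),
        (2, 'C-2', 30, 'A-30');
""")

query = """
SELECT DISTINCT
       ClientID, ClientREF,
       -- FIRST_VALUE is a window function, so every row of a client carries
       -- the same "first" values; DISTINCT then collapses the duplicates.
       FIRST_VALUE(AgentID)  OVER (PARTITION BY ClientID, ClientREF
                                   ORDER BY AgentID) AS FirstOfAgentID,
       FIRST_VALUE(AgentREF) OVER (PARTITION BY ClientID, ClientREF
                                   ORDER BY AgentID) AS FirstOfAgentREF
FROM client_agents
ORDER BY ClientID
"""

first_rows = conn.execute(query).fetchall()
for row in first_rows:
    print(row)
```

Both window clauses must use the same ordering, otherwise FirstOfAgentID and FirstOfAgentREF could come from different rows.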

How to compare two tables in Hive based on counts

I have the Hive tables below:
Table_1
ID
1
1
2
Table_2
ID
1
2
2
I am comparing the two tables based on the count of each ID in both tables. I need output like the following:
ID
1 - 2 records in table 1 and 1 record in table 2
2 - 1 record in table 1 and 2 records in table 2
Table_1 is the parent table.
I am using the queries below:
select count(*),ID from Table_1 group by ID;
select count(*),ID from Table_2 group by ID;
Just do a full outer join on your queries with the on condition as X.id = Y.id, and then select * from the resultant table checking for nulls on either side.
select coalesce(X.id, Y.id) as id,
concat(coalesce(X.cnt1, 0), ' entries in table 1, ', coalesce(Y.cnt2, 0), ' entries in table 2')
from (select count(*) as cnt1, id from Table_1 group by id) X
full outer join (select count(*) as cnt2, id from Table_2 group by id) Y
on X.id = Y.id;
Try this. You can use a CASE expression to choose between "record" and "records".
SELECT m.id,
CONCAT (COALESCE(a.ct, 0), ' record in table 1, ', COALESCE(b.ct, 0),
' record in table 2')
FROM (SELECT id
FROM table_1
UNION
SELECT id
FROM table_2) m
LEFT JOIN (SELECT Count(*) AS ct,
id
FROM table_1
GROUP BY id) a
ON m.id = a.id
LEFT JOIN (SELECT Count(*) AS ct,
id
FROM table_2
GROUP BY id) b
ON m.id = b.id;
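As a quick check of the UNION plus LEFT JOIN approach above, the following sketch runs it on in-memory SQLite from Python with the sample IDs from the question. SQLite is only a stand-in for Hive, and the singular/plural wording is handled in a small Python helper (`describe`, a hypothetical name) instead of a SQL CASE expression, purely to keep the query short.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Hive (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_1 (ID INTEGER);
    CREATE TABLE Table_2 (ID INTEGER);
    INSERT INTO Table_1 VALUES (1), (1), (2);
    INSERT INTO Table_2 VALUES (1), (2), (2);
""")

query = """
SELECT m.ID,
       COALESCE(a.ct, 0) AS ct1,   -- 0 when the ID is missing from Table_1
       COALESCE(b.ct, 0) AS ct2    -- 0 when the ID is missing from Table_2
FROM (SELECT ID FROM Table_1 UNION SELECT ID FROM Table_2) m
LEFT JOIN (SELECT ID, COUNT(*) AS ct FROM Table_1 GROUP BY ID) a
       ON m.ID = a.ID
LEFT JOIN (SELECT ID, COUNT(*) AS ct FROM Table_2 GROUP BY ID) b
       ON m.ID = b.ID
ORDER BY m.ID
"""

counts = conn.execute(query).fetchall()


def describe(n, table_label):
    """Format a count with the right singular or plural noun."""
    noun = "record" if n == 1 else "records"
    return f"{n} {noun} in {table_label}"


lines = [
    f"{id_} - {describe(c1, 'table 1')} and {describe(c2, 'table 2')}"
    for id_, c1, c2 in counts
]
for line in lines:
    print(line)
```

Because the list of IDs comes from the UNION of both tables, an ID that exists in only one table still appears, with a count of 0 for the other.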
You could use this Python program to do a full comparison of 2 Hive tables:
https://github.com/bolcom/hive_compared_bq
If you want a quick comparison just based on counts, then pass the "--just-count" option (you can also specify the group by column with "--group-by-column").
The script also allows you to visually see all the differences on all rows and all columns if you want a complete validation.

PIVOT combined tables with criteria in sql

I have 2 tables
I want a result that returns all rows that have both a work and a home number
RESULT
I have written this SQL, but it shows all rows. How do I show only those that have both a home and a work number, leaving out the rows with null values? I have tried adding WHERE PHONE_NUM IS NOT NULL, but it did not work. I would appreciate any help. Thanks.
WITH TABLE1 AS (
SELECT
P.ID,
P.NAMES,
P.DIGIT,
Q.NUM_TYP,
Q.PHONE_NUM
FROM
dbo.TABLE1 P
INNER JOIN dbo.TABLE2 Q
ON P.ID = Q.ID
)
SELECT *
FROM
TABLE1
PIVOT (Max(PHONE_NUM) FOR NUM_TYP IN (HOME, WORK)) R
;
You can get the results from just table 2 using conditional aggregation:
select t2.id,
max(case when t2.num_typ = 'HOME' then t2.phone_num end) as home,
max(case when t2.num_typ = 'WORK' then t2.phone_num end) as work
from dbo.TABLE2 t2
group by t2.id
having max(case when t2.num_typ = 'HOME' then t2.phone_num end) is not null and
max(case when t2.num_typ = 'WORK' then t2.phone_num end) is not null;
You can join table 1 to get other fields if you like.
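Joining table 1 back in could look roughly like the sketch below, run on in-memory SQLite from Python as a stand-in for SQL Server. The column names follow the question (ID, NAMES, DIGIT, NUM_TYP, PHONE_NUM); the sample rows and the output aliases `home_num` and `work_num` are invented for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ID INTEGER, NAMES TEXT, DIGIT INTEGER);
    CREATE TABLE table2 (ID INTEGER, NUM_TYP TEXT, PHONE_NUM TEXT);
    INSERT INTO table1 VALUES (1, 'Ann', 7), (2, 'Bob', 8), (3, 'Cy', 9);
    INSERT INTO table2 VALUES
        (1, 'HOME', '555-0101'),
        (1, 'WORK', '555-0102'),
        (2, 'HOME', '555-0103'),   -- Bob has no work number
        (3, 'WORK', '555-0104');   -- Cy has no home number
""")

query = """
SELECT p.ID, p.NAMES, p.DIGIT, q.home_num, q.work_num
FROM table1 p
JOIN (
    -- One row per ID, kept only when both number types are present.
    SELECT ID,
           MAX(CASE WHEN NUM_TYP = 'HOME' THEN PHONE_NUM END) AS home_num,
           MAX(CASE WHEN NUM_TYP = 'WORK' THEN PHONE_NUM END) AS work_num
    FROM table2
    GROUP BY ID
    HAVING MAX(CASE WHEN NUM_TYP = 'HOME' THEN PHONE_NUM END) IS NOT NULL
       AND MAX(CASE WHEN NUM_TYP = 'WORK' THEN PHONE_NUM END) IS NOT NULL
) q ON p.ID = q.ID
ORDER BY p.ID
"""

both_numbers = conn.execute(query).fetchall()
for row in both_numbers:
    print(row)
```

Only the person with both a HOME and a WORK row survives the HAVING filter, so the inner join returns a single row here.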

SQL: Inner Join return one row based on criteria

This is probably simple, but I'm looking for the raw SQL to perform an INNER JOIN but only return one of the matches on the second table based on criteria.
Given two tables:
**TableOne**
ID Name
1 abc
2 def
**TableTwo**
ID Date
1 12/1/2014
1 12/2/2014
2 12/3/2014
2 12/4/2014
2 12/5/2014
I want to join but only return the latest date from the second table:
Expected Result:
1 abc 12/2/2014
2 def 12/5/2014
I can easily accomplish this in LINQ like so:
TableOne.Select(x=> new { x.ID, x.Name, Date = x.TableTwo.Max(y=>y.Date) });
So in other words, what does the above LINQ statement translate into in raw SQL?
There are two ways to do this:
Using GROUP BY and MAX():
SELECT one.ID,
one.Name,
MAX(two.Date)
FROM TableOne one
INNER JOIN TableTwo two on one.ID = two.ID
GROUP BY one.ID, one.Name
Using ROW_NUMBER() with a CTE:
; WITH cte AS (
SELECT one.ID,
one.Name,
two.Date,
ROW_NUMBER() OVER (PARTITION BY one.ID ORDER BY two.Date DESC) as rn
FROM TableOne one
INNER JOIN TableTwo two ON one.ID = two.ID
)
SELECT ID, Name, Date FROM cte WHERE rn = 1
You could join the first table with an aggregate query:
SELECT t1.id, d
FROM TableOne t1
JOIN (SELECT id, MAX([Date]) AS d
FROM TableTwo
GROUP BY id) t2 ON t1.id = t2.id
Something like:
SELECT TableOne.id, TableOne.name, MAX(TableTwo.Date)
FROM TableOne
LEFT JOIN TableTwo ON TableOne.id = TableTwo.id
GROUP BY TableOne.id, TableOne.name;
The join produces one row for every matching row in TableTwo (plus one row for each TableOne row that has no match, because it is a LEFT JOIN), and the GROUP BY then collapses that to one row per TableOne row.
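To see how the INNER JOIN and LEFT JOIN variants of the GROUP BY query differ, here is a small sketch on in-memory SQLite from Python, used as a stand-in for SQL Server. It uses the sample rows from the question plus one extra, made-up TableOne row (ID 3, 'ghi') that has no dates. The dates are stored as ISO-8601 text, which is an assumption needed so that MAX() orders them correctly in SQLite.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableOne (ID INTEGER, Name TEXT);
    CREATE TABLE TableTwo (ID INTEGER, Date TEXT);  -- ISO-8601 text dates
    INSERT INTO TableOne VALUES (1, 'abc'), (2, 'def'), (3, 'ghi');
    INSERT INTO TableTwo VALUES
        (1, '2014-12-01'), (1, '2014-12-02'),
        (2, '2014-12-03'), (2, '2014-12-04'), (2, '2014-12-05');
""")

# Same query text for both variants; only the join type changes.
template = """
SELECT one.ID, one.Name, MAX(two.Date) AS LatestDate
FROM TableOne one
{join_type} JOIN TableTwo two ON one.ID = two.ID
GROUP BY one.ID, one.Name
ORDER BY one.ID
"""

inner_rows = conn.execute(template.format(join_type="INNER")).fetchall()
left_rows = conn.execute(template.format(join_type="LEFT")).fetchall()

print("INNER:", inner_rows)
print("LEFT: ", left_rows)
```

The INNER JOIN drops the row with no dates, while the LEFT JOIN keeps it with a NULL latest date. Which one you want depends on whether rows without a match should appear at all.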
Since nobody else has covered a Common Table Expression (CTE) that will perform the task you want, I'll throw it in here:
with maxDates as (
select Id, max(Date) as Date
from TableTwo
group by Id
)
select x.Id, x.Name, y.Date
from TableOne x
inner join maxDates y
on x.Id = y.id

SUM function in SQL

There are three tables, and we have to select data from them using one primary key and the matching foreign keys. The third table holds a lot of rows, so we have to sum its data on the basis of the primary key.
BAL = Balance, met = Method, amo = amount, cst_id, cut_id, cut_i = customer_id
Now we have to sum the data on the basis of method, and sum it for 10 customer IDs, in the same query. Can anyone help me with this?
;WITH cte
AS
(
SELECT
*,
ROW_NUMBER() OVER (ORDER BY t1.cst_id) RowNum
FROM Table1 t1
INNER JOIN Table2 t2
ON t1.cst_id = t2.cut_id
INNER JOIN Table3 t3
ON t2.cut_id = t3.customer_id
AND t2.BAL = t3.Balance
AND t2.amo = t3.amount
)
SELECT SUM(amount) -- SUM(*) is not valid; name the column to sum (t3.amount here)
FROM cte
WHERE RowNum Between 1 AND 10
-- You can add a GROUP BY here
If you give some sample data, it will be easier to write queries to help you.
But if your MET field is numerical and you want to sum it, then you need:
select
t1.cst_n, t2.bal,
sum(t3.met) as met,
sum(t3.amo) as amo
from table1 as t1
inner join table2 as t2 on t2.cut_id = t1.cst_id
inner join table3 as t3 on t3.cut_i = t1.cst_id
group by t1.cst_n, t2.bal
If you want to sum the data for all 10 customers into one number, maybe you just need:
select
sum(t3.met) as met,
sum(t3.amo) as amo
from table3 as t3
where t3.cut_i in (select t.customerid from #<your variable table with cust. ids> as t)
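If the goal is instead one total per method for a chosen set of customers, a GROUP BY on the method column may be enough. The sketch below runs on in-memory SQLite from Python as a stand-in. It assumes `met` holds a method label (as the question's legend suggests), that `wanted_customers` is a hypothetical table holding the customer IDs of interest, and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table3 (cut_i INTEGER, met TEXT, amo INTEGER);
    -- Hypothetical table listing the customer IDs to include.
    CREATE TABLE wanted_customers (customerid INTEGER);
    INSERT INTO table3 VALUES
        (1, 'cash', 10),
        (1, 'card', 5),
        (2, 'cash', 20),
        (3, 'cash', 100);   -- customer 3 is not in the wanted list
    INSERT INTO wanted_customers VALUES (1), (2);
""")

query = """
SELECT t3.met, SUM(t3.amo) AS amo
FROM table3 t3
WHERE t3.cut_i IN (SELECT customerid FROM wanted_customers)
GROUP BY t3.met
ORDER BY t3.met
"""

totals = conn.execute(query).fetchall()
for method, amount in totals:
    print(method, amount)
```

Adding `t3.cut_i` to both the SELECT list and the GROUP BY would give one total per customer and method instead of one per method.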