I have a query that looks for records that don't have a matching account number and tries to match those accounts by address.
I am getting the results I want, but I want to include columns from the table2 below. How can I do this?
Select DISTINCT
account_num
,product
,accountName
,address_1
,address_2
,city
,state
,zip
,short_address
INTO #Matching_Address
From #Non_Matching_Accounts t
Where EXISTS
(SELECT * FROM (SELECT
left(ADDRESS_LINE1_TXT,20) AS matching_add
,CITY
,STATE
,ZIP
,ACCOUNT_OWNER
From [database].[dbo].[table2]) v (matching_add, CITY, STATE,ZIP,ACCOUNT_OWNER)
WHERE
t.short_address= v.matching_add
AND t.city= v.NAME
AND t.state = v.STATE
AND t.zip = v.ZIP
AND t.accountName LIKE '%'+v.ACCOUNT_OWNER+'%')
I've tried:
Select DISTINCT
account_num
,product
,accountName
,address_1
,address_2
,city
,state
,zip
,short_address
,matching_add
,CITY
,STATE
,ZIP
,ACCOUNT_OWNER
INTO #Matching_Address
From #Non_Matching_Accounts t
Where EXISTS
(SELECT * FROM (SELECT
left(ADDRESS_LINE1_TXT,20) AS matching_add
,CITY
,STATE
,ZIP
,ACCOUNT_OWNER
From [database].[dbo].[table2]) v (matching_add, CITY, STATE,ZIP,ACCOUNT_OWNER)
WHERE
t.short_address= v.matching_add
AND t.city= v.NAME
AND t.state = v.STATE
AND t.zip = v.ZIP
AND t.accountName LIKE '%'+v.ACCOUNT_OWNER+'%')
Expected Results:
acct_num|prd|actName|add1|add2|city|state|zip|act_num2|prd2|actName|add1|add2|city2|state2|zip2|
----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
a | a | a | a | a | a | a | a | a | a | a | a | a | a a| a
b | b | b | b | b | b | b | b | b | b | b | b | b | b | b
c | c | c | c | c | c | c | c | c | c | c | c | c | c | c |
d | d | d | d | d | d | d | d | d | d | d | d | d | d | d |
You're using 'exists' when an 'inner join' is advised. Restructure as follows:
select
distinct t.account_num,
t.product,
t.accountName,
t.address_1,
t.address_2,
t.city,
t.state,
t.zip,
t.short_address,
matching_add = left(v.address_line1_txt,20),
vCity = v.city,
vState = v.state,
vZip = v.zip,
v.account_owner
into #Matching_Address
from #Non_Matching_Accounts t
join [database].[dbo].[table2] v
on t.short_address = left(v.address_line1_txt, 20)
and t.city = v.city
and t.state = v.state
and t.zip = v.zip
and t.accountName like '%' + v.account_owner + '%'
An inner join (or just 'join' for short) will only return matches, so it works like 'exists' in that sense. But it also makes the columns from the right-hand table available to you.
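To see the difference on a small scale, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The tables `t` and `v`, their rows, and the 10-character prefix are made up for illustration; `substr(..., 1, 10)` and `||` stand in for T-SQL's `left(..., 20)` and `+`.

```python
import sqlite3

# Hypothetical miniature versions of #Non_Matching_Accounts (t) and table2 (v).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (account_num TEXT, accountName TEXT, short_address TEXT);
    CREATE TABLE v (address_line1_txt TEXT, account_owner TEXT);
    INSERT INTO t VALUES ('A1', 'John Smith LLC', '12 Main St'),
                         ('A2', 'Jane Doe Inc',   '99 Oak Ave');
    INSERT INTO v VALUES ('12 Main St Suite 400', 'Smith'),
                         ('5 Elm Rd',             'Doe');
""")

# EXISTS only filters t: columns of v cannot appear in the select list.
exists_rows = conn.execute("""
    SELECT t.account_num
    FROM t
    WHERE EXISTS (SELECT 1
                  FROM v
                  WHERE t.short_address = substr(v.address_line1_txt, 1, 10)
                    AND t.accountName LIKE '%' || v.account_owner || '%')
""").fetchall()

# JOIN keeps the same matching rows and also exposes v's columns.
join_rows = conn.execute("""
    SELECT t.account_num, v.address_line1_txt, v.account_owner
    FROM t
    JOIN v
      ON t.short_address = substr(v.address_line1_txt, 1, 10)
     AND t.accountName LIKE '%' || v.account_owner || '%'
""").fetchall()

print(exists_rows)  # only columns from t
print(join_rows)    # columns from both t and v
```

Both queries keep only account A1 here; the join version simply carries the matched address and owner along with it.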
My hunch is that you may have tried this. I see a 'distinct' in your query, which probably would not have been necessary with just 'exists'. Did you abandon 'inner join' because it was duplicating your rows? If so, 'exists' is still not the answer. Maybe a cross apply can help you:
select ... (same as above)
into #Matching_Address
from #Non_Matching_Accounts t
cross apply (
select
top 1 *
from [database].[dbo].[table2] v
where t.short_address = left(v.address_line1_txt, 20)
and t.city = v.city
and t.state = v.state
and t.zip = v.zip
and t.accountName like '%' + v.account_owner + '%'
order by v.address_line1_txt -- or whatever puts the better one on top
) v
With 'top 1', the 'v' result will produce no more than one record per row in 't'. With 'cross apply', if 'v' returns no records, then 't' will not return a row either (similar to 'exists' or 'inner join').
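The row duplication and the "at most one match per row" fix can be reproduced in a small sketch. SQLite has no CROSS APPLY, so this uses a correlated subquery with LIMIT 1 as a stand-in for the same idea; it is not the T-SQL syntax. The tables, the `id` column, and the sample rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (account_num TEXT, short_address TEXT);
    CREATE TABLE v (id INTEGER PRIMARY KEY, address_line1_txt TEXT, account_owner TEXT);
    INSERT INTO t VALUES ('A1', '12 Main St');
    -- Two candidate matches for the same account.
    INSERT INTO v (address_line1_txt, account_owner) VALUES
        ('12 Main St Suite 400', 'Smith'),
        ('12 Main St Suite 100', 'Smith');
""")

# A plain join returns one output row per match, so A1 appears twice.
plain_join = conn.execute("""
    SELECT t.account_num, v.address_line1_txt
    FROM t
    JOIN v ON t.short_address = substr(v.address_line1_txt, 1, 10)
""").fetchall()

# Stand-in for CROSS APPLY (SELECT TOP 1 ...): pick a single v row per t row.
one_per_row = conn.execute("""
    SELECT t.account_num, v.address_line1_txt
    FROM t
    JOIN v ON v.id = (SELECT v2.id
                      FROM v AS v2
                      WHERE t.short_address = substr(v2.address_line1_txt, 1, 10)
                      ORDER BY v2.address_line1_txt  -- decides which match wins
                      LIMIT 1)
""").fetchall()

print(len(plain_join))  # number of rows produced by the plain join
print(one_per_row)
```

The ORDER BY inside the subquery plays the role of the 'order by' in the cross apply: it is the only thing that decides which of several matches is kept.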
I have a DesignGroup table as:
+--------------------------------------+----------+
| DesignGroupId | Name |
+--------------------------------------+----------+
| 3A81C1FF-442F-4291-B8E2-7079D80920CF | Design 1 |
| 3238F4C6-7BA7-4B3F-9383-17702B0D1CC3 | Design 2 |
+--------------------------------------+----------+
Each DesignGroup can have multiple customers, so I have a table DesignGroupCustomers as:
+--------------------------------------+--------------------------------------+-------------+
| DesignGroupCustomerId | DesignGroupId (FK) | CustomerKey |
+--------------------------------------+--------------------------------------+-------------+
| D0828677-F295-46F7-BB85-65888D5A48B7 | 3A81C1FF-442F-4291-B8E2-7079D80920CF | 10 |
| 10C01BB9-1DDB-4DB4-BEC4-9539E030BF68 | 3A81C1FF-442F-4291-B8E2-7079D80920CF | 20 |
| F88C9F66-C0D9-EB11-8481-5CF9DDF6DC87 | 3238F4C6-7BA7-4B3F-9383-17702B0D1CC3 | 10 |
+--------------------------------------+--------------------------------------+-------------+
Each customer has a customer type, stored in customerTable:
+-------------+-------------+
| CustomerKey | CustTypeKey |
+-------------+-------------+
| 10 | 2 |
| 20 | 1 |
+-------------+-------------+
What I want to achieve is this:
return only the DesignGroups that do not have a customer with custTypeKey = 1
In this case it should return Design 2, because it has no customer with custTypeKey = 1.
I was thinking about using a CTE, but I have no idea how to get the desired result:
;WITH CTE
AS (SELECT
[DG].[DesignGroupId]
, ROW_NUMBER() OVER(PARTITION BY [DesignGroupCustomer]) AS [RN]
FROM [DesignGroup] AS [DG]
INNER JOIN [DesignGroupCustomer] AS [DGC] ON [DG].[DesignGroupId] = [DGC].[DesignGroupId]
INNER JOIN [Customer] AS [C] ON [DGC].[CustomerKey] = [C].[CustomerKey]
INNER JOIN [CustomerType] AS [CT] ON [C].[CustTypeKey] = [CT].[CustTypeKey])
SELECT
[DesignGroupId]
FROM [CTE] -- WHERE CustomerType NOT CONTAINS (1)
WITH temp AS (
SELECT DISTINCT
dgc.DesignGroupId AS DesignGroupId
FROM DesignGroupCustomers dgc
INNER JOIN customerTable ct
ON dgc.CustomerKey = ct.CustomerKey
WHERE ct.CustTypeKey = 1
)
SELECT
DesignGroupId
FROM DesignGroup
WHERE DesignGroupId NOT IN (
SELECT
DesignGroupId
FROM temp
)
The query above first collects all design groups that have a customer with CustTypeKey = 1, then uses NOT IN to return every other design group. Please let me know if you face any issues.
You can use a subquery to return the design groups that have a customer with type key 1, LEFT JOIN that subquery to the design group table, and keep only the rows where the subquery's DesignGroupId is NULL (that is, the design groups that do not appear in the subquery's result):
SELECT d.[DesignGroupId]
FROM [DesignGroup] AS d
LEFT JOIN
(
SELECT dgc.[DesignGroupId]
FROM [DesignGroupCustomer] AS dgc
INNER JOIN [Customer] AS c
ON c.[CustomerKey] = dgc.[CustomerKey]
WHERE c.[CustTypeKey] = 1
GROUP BY dgc.[DesignGroupId]
) x
ON x.[DesignGroupId] = d.[DesignGroupId]
WHERE x.[DesignGroupId] IS NULL
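As a quick check of the anti-join idea, here is a sketch that loads the question's sample data into an in-memory SQLite database and runs both the LEFT JOIN ... IS NULL form and an equivalent NOT EXISTS form. The short ids 'G1' and 'G2' are stand-ins for the GUIDs, and the table names follow the question's DesignGroupCustomers and customerTable spelling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DesignGroup (DesignGroupId TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE DesignGroupCustomers (DesignGroupId TEXT, CustomerKey INTEGER);
    CREATE TABLE customerTable (CustomerKey INTEGER PRIMARY KEY, CustTypeKey INTEGER);
    -- 'G1' and 'G2' are shortened stand-ins for the GUIDs in the question.
    INSERT INTO DesignGroup VALUES ('G1', 'Design 1'), ('G2', 'Design 2');
    INSERT INTO DesignGroupCustomers VALUES ('G1', 10), ('G1', 20), ('G2', 10);
    INSERT INTO customerTable VALUES (10, 2), (20, 1);
""")

# Anti-join: groups with a type-1 customer are listed in x, then filtered out.
left_join_rows = conn.execute("""
    SELECT d.Name
    FROM DesignGroup AS d
    LEFT JOIN (SELECT dgc.DesignGroupId
               FROM DesignGroupCustomers AS dgc
               JOIN customerTable AS c ON c.CustomerKey = dgc.CustomerKey
               WHERE c.CustTypeKey = 1
               GROUP BY dgc.DesignGroupId) AS x
           ON x.DesignGroupId = d.DesignGroupId
    WHERE x.DesignGroupId IS NULL
""").fetchall()

# Same result expressed with NOT EXISTS.
not_exists_rows = conn.execute("""
    SELECT d.Name
    FROM DesignGroup AS d
    WHERE NOT EXISTS (SELECT 1
                      FROM DesignGroupCustomers AS dgc
                      JOIN customerTable AS c ON c.CustomerKey = dgc.CustomerKey
                      WHERE dgc.DesignGroupId = d.DesignGroupId
                        AND c.CustTypeKey = 1)
""").fetchall()

print(left_join_rows, not_exists_rows)
```

Both forms return only Design 2 on this data. NOT EXISTS is also unaffected by NULLs in the inner column, which is the usual reason to prefer it over NOT IN.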
I have following table in Postgres
| phone | group | spec |
| 1 | 1 | 'Lock' |
| 1 | 2 | 'Full' |
| 1 | 3 | 'Face' |
| 2 | 1 | 'Lock' |
| 2 | 3 | 'Face' |
| 3 | 2 | 'Scan' |
Tried this
SELECT phone, string_agg(spec, ', ')
FROM mytable
GROUP BY phone;
I need this output for each phone, with an empty string for each missing group.
| phone | spec
| 1 | Lock, Full, Face
| 2 | Lock, '' , Face
| 3 | '', Scan ,''
You need a CTE which returns all possible combinations of phone and group and a left join to the table so you can group by phone:
with cte as (
select *
from (
select distinct phone from mytable
) m cross join (
select distinct "group" from mytable
) g
)
select c.phone, string_agg(coalesce(t.spec, ''''''), ',' order by c."group") spec
from cte c left join mytable t
on t.phone = c.phone and t."group" = c."group"
group by c.phone
See the demo.
Results:
| phone | spec |
| ----- | -------------- |
| 1 | Lock,Full,Face |
| 2 | Lock,'',Face |
| 3 | '',Scan,'' |
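The "all combinations, then left join" step can be checked with a small sketch on the question's data. It uses SQLite through Python, and because the ordering behaviour of SQLite's own string aggregate differs from Postgres's string_agg, the final concatenation is done in Python after an explicit ORDER BY. That split is a choice made for this example, not part of the answer.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (phone INTEGER, "group" INTEGER, spec TEXT);
    INSERT INTO mytable VALUES (1, 1, 'Lock'), (1, 2, 'Full'), (1, 3, 'Face'),
                               (2, 1, 'Lock'), (2, 3, 'Face'), (3, 2, 'Scan');
""")

# Every phone paired with every group; missing pairs get the two-quote placeholder.
grid = conn.execute("""
    SELECT p.phone, g."group", COALESCE(t.spec, '''''') AS spec
    FROM (SELECT DISTINCT phone FROM mytable) AS p
    CROSS JOIN (SELECT DISTINCT "group" FROM mytable) AS g
    LEFT JOIN mytable AS t
           ON t.phone = p.phone AND t."group" = g."group"
    ORDER BY p.phone, g."group"
""").fetchall()

# Concatenate per phone, keeping the group order established by ORDER BY.
result = {phone: ", ".join(spec for _, _, spec in rows)
          for phone, rows in groupby(grid, key=lambda r: r[0])}

print(result)
```

The grid always has (number of phones) x (number of groups) rows, which is what guarantees a slot for every missing group.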
You can use conditional aggregation:
select phone,
(coalesce(max(case when "group" = 1 then spec end), '''''') || ', ' ||
coalesce(max(case when "group" = 2 then spec end), '''''') || ', ' ||
coalesce(max(case when "group" = 3 then spec end), '''''')
) as specs
from mytable t
group by phone;
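Here is a small runnable sketch of conditional aggregation on the question's data, using SQLite through Python. It assumes the three group numbers are known in advance, quotes the reserved word "group", and wraps each MAX(CASE ...) in COALESCE so the placeholder never has to compete with a real value inside MAX.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (phone INTEGER, "group" INTEGER, spec TEXT);
    INSERT INTO mytable VALUES (1, 1, 'Lock'), (1, 2, 'Full'), (1, 3, 'Face'),
                               (2, 1, 'Lock'), (2, 3, 'Face'), (3, 2, 'Scan');
""")

# One MAX(CASE ...) per known group; a missing group yields NULL, which
# COALESCE turns into the two-quote placeholder.
rows = conn.execute("""
    SELECT phone,
           COALESCE(MAX(CASE WHEN "group" = 1 THEN spec END), '''''') || ', ' ||
           COALESCE(MAX(CASE WHEN "group" = 2 THEN spec END), '''''') || ', ' ||
           COALESCE(MAX(CASE WHEN "group" = 3 THEN spec END), '''''') AS specs
    FROM mytable
    GROUP BY phone
    ORDER BY phone
""").fetchall()

print(rows)
```

This form fixes the column order by construction, at the cost of hard-coding one expression per group.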
Alternatively, you can generate all the groups using generate_series() and then aggregate:
select p.phone,
string_agg(coalesce(t.spec, ''''''), ', ' order by gs.grp) as specs
from (select distinct phone from mytable) p cross join
generate_series(1, 3, 1) gs(grp) left join
mytable t
on t.phone = p.phone and t."group" = gs.grp
group by p.phone
You can consider a self RIGHT JOIN against the three distinct groups (the subquery right after the RIGHT JOIN keywords), combined with a correlated subquery on your table:
WITH mytable1 AS
(
SELECT distinct t1.phone, t2."group",
( SELECT spec FROM mytable WHERE phone = t1.phone AND "group"=t2."group" )
FROM mytable t1
RIGHT JOIN ( SELECT distinct "group" FROM mytable ) t2
ON t2."group" = coalesce(t2."group",t1."group")
)
SELECT phone, string_agg(coalesce(spec,''''''), ', ' ORDER BY "group") as spec
FROM mytable1
GROUP BY phone;
Demo
Source Table
Assuming I have a table called MyTable with the content:
+----------+------+
| Category | Code |
+----------+------+
| A | A123 |
| A | B123 |
| A | C123 |
| B | A123 |
| B | B123 |
| B | D123 |
| C | A123 |
| C | E123 |
| C | F123 |
+----------+------+
I'm trying to count the number of Code values which are unique to each category.
Desired Result
For the above example, the result would be:
+----------+-------------+
| Category | UniqueCodes |
+----------+-------------+
| A | 1 |
| B | 1 |
| C | 2 |
+----------+-------------+
Since C123 is unique to A, D123 is unique to B, and E123 & F123 are unique to C.
What I've Tried
I'm able to obtain the result for a single category (e.g. C) using a query such as:
SELECT COUNT(a.Code) AS UniqueCodes
FROM
(
SELECT MyTable.Code
FROM MyTable
WHERE MyTable.Category = "C"
) a
LEFT JOIN
(
SELECT MyTable.Code
FROM MyTable
WHERE MyTable.Category <> "C"
) b
ON a.Code = b.Code
WHERE b.Code IS NULL
However, whilst I can hard-code a query for each category, I cannot seem to construct a single query to calculate this for every possible Category value.
Here is what I've tried:
SELECT c.Category,
(
SELECT COUNT(a.Code)
FROM
(
SELECT MyTable.Code
FROM MyTable
WHERE MyTable.Category = c.Category
) a
LEFT JOIN
(
SELECT MyTable.Code
FROM MyTable
WHERE MyTable.Category <> c.Category
) b
ON a.Code = b.Code
WHERE b.Code IS NULL
) AS UniqueCodes
FROM
(
SELECT MyTable.Category
FROM MyTable
GROUP BY MyTable.Category
) c
Though, the c.Category is not defined within the scope of the nested SELECT query.
Could anyone advise how I could obtain the desired result?
I would use NOT EXISTS with aggregation:
select category, count(*)
from MyTable t
where not exists (select 1 from MyTable t1 where t1.code = t.code and t1.category <> t.category)
group by category;
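A quick way to convince yourself this works is to run it against the question's sample rows. The sketch below does that with an in-memory SQLite database through Python; the aliases `t` and `o` are only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (Category TEXT, Code TEXT);
    INSERT INTO MyTable VALUES
        ('A', 'A123'), ('A', 'B123'), ('A', 'C123'),
        ('B', 'A123'), ('B', 'B123'), ('B', 'D123'),
        ('C', 'A123'), ('C', 'E123'), ('C', 'F123');
""")

# Keep a row only if no other category uses the same code, then count per category.
rows = conn.execute("""
    SELECT t.Category, COUNT(*) AS UniqueCodes
    FROM MyTable AS t
    WHERE NOT EXISTS (SELECT 1
                      FROM MyTable AS o
                      WHERE o.Code = t.Code
                        AND o.Category <> t.Category)
    GROUP BY t.Category
    ORDER BY t.Category
""").fetchall()

print(rows)
```

One thing to be aware of: a category with no unique codes at all produces no row here, rather than a row with a count of 0.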
You can use two levels of aggregation:
select minc as category, count(*)
from (select code, min(category) as minc, max(category) as maxc
from MyTable
group by code
) as c
where minc = maxc
group by minc;
This would also work:
select category, count(*) from(
select a.category, b.count from mytable a join (
select code, count(category) as count
from mytable
group by code
having count(category) = 1
) b on b.code = a.code
) c group by category
Learning from @isaace's answer, I also came up with this:
SELECT MyTable.Category, COUNT(*)
FROM
MyTable INNER JOIN
(SELECT Code FROM MyTable GROUP BY Code HAVING COUNT(Category) = 1) a
ON MyTable.Code = a.Code
GROUP BY MyTable.Category
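The HAVING COUNT(Category) = 1 approach relies on each (Category, Code) pair appearing only once. The sketch below, run in SQLite through Python, adds one hypothetical duplicate row (not in the question's data) to show the effect, and a variant using COUNT(DISTINCT ...) that tolerates it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (Category TEXT, Code TEXT);
    INSERT INTO MyTable VALUES
        ('A', 'A123'), ('A', 'B123'), ('A', 'C123'),
        ('B', 'A123'), ('B', 'B123'), ('B', 'D123'),
        ('C', 'A123'), ('C', 'E123'), ('C', 'F123'),
        ('A', 'C123');  -- hypothetical duplicate row, added for this example
""")

# Counting rows per code: the duplicated C123 no longer looks unique.
by_row_count = conn.execute("""
    SELECT m.Category, COUNT(*)
    FROM MyTable AS m
    JOIN (SELECT Code FROM MyTable
          GROUP BY Code HAVING COUNT(Category) = 1) AS a
      ON m.Code = a.Code
    GROUP BY m.Category
    ORDER BY m.Category
""").fetchall()

# Counting distinct categories per code, and distinct codes per category.
by_distinct = conn.execute("""
    SELECT m.Category, COUNT(DISTINCT m.Code)
    FROM MyTable AS m
    JOIN (SELECT Code FROM MyTable
          GROUP BY Code HAVING COUNT(DISTINCT Category) = 1) AS a
      ON m.Code = a.Code
    GROUP BY m.Category
    ORDER BY m.Category
""").fetchall()

print(by_row_count)
print(by_distinct)
```

If the table is guaranteed to have no duplicate pairs, the two queries agree and the simpler one is fine.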
I am trying to mark duplicate records, but a few of them get the wrong assignment and I don't know why.
Data:
FirstName | LastName | Company | Group | Status | ID
x | x | x | NULL | NULL | 1
x | x | x | NULL | NULL | 2
Then I run this query to find matches on FirstName, LastName, Company
and join it back to the main table to mark the records:
with d as (
select ID, FirstName, LAstName, Company, row_number() over (partition by FirstName,LastName, Company order by FirstName,LastName, Company) as nr
from [dbo].xx)
Update b
set Status = 'S'
, Group = d.DQ_ID
from xx as b inner join d on
b.FirstName = d.FirstName and
b.LastNAme = d.LastName and
b.Company = d.Company
where d.nr = 1
And then Update the Main Record with P
Update b
set Status = 'P'
from xx as b
where b.ID = b.Group
GO
What I expect:
FirstName | LastName | Company | Group | Status | ID
x | x | x | 1 | P | 1
x | x | x | 1 | S | 2
What I get:
FirstName | LastName | Company | Group | Status | ID
x | x | x | 2 | S | 1
x | x | x | 1 | S | 2
I am working on about 1M records, and it only happens to some of them!
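Try this. Ordering by the same columns you partition by leaves every row in a group tied, so row_number() is free to number them in any order; ordering by ID makes the first row predictable: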
;with d as (
select
ID,
FirstName,
LAstName,
Company,
row_number() over (
partition by FirstName,LastName, Company
order by Id asc -- this was done to keep ordering as per ID
) as nr
from [dbo].xx
) ,
e as
(select * from d where nr=1)
-- e was created to only take the nr=1 rows which will be joined to all similar records
Update b
set Status = case when e.ID = b.ID then 'P' else 'S' end
-- the set case logic ensures that matching ids get P else S
, [Group] = e.ID
from xx as b
inner join e on
b.FirstName = e.FirstName and
b.LastNAme = e.LastName and
b.Company = e.Company
You can try the following:
;WITH RankedData AS
(
SELECT
T.ID,
T.[Group],
T.Status,
T.FirstName,
T.LastName,
T.Company,
GroupRanking = ROW_NUMBER() OVER (PARTITION BY T.FirstName, T.LastName, T.Company ORDER BY T.ID ASC)
FROM
dbo.xx AS T
)
UPDATE T SET
[Group] = N.ID,
Status = CASE WHEN T.GroupRanking = 1 THEN 'P' ELSE 'S' END
FROM
RankedData AS T
INNER JOIN RankedData AS N ON
T.FirstName = N.FirstName AND
T.LastName = N.LastName AND
T.Company = N.Company AND
N.GroupRanking = 1
Keep in mind that the INNER JOIN only matches rows whose names and companies are not NULL, so rows with NULLs in those columns will not be updated.
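The "lowest ID in each group is the primary" rule and the NULL caveat can both be tried out in a small sketch. It runs in SQLite through Python and swaps in a different technique: a correlated MIN(ID) subquery instead of the ROW_NUMBER() CTE update, plus SQLite's `IS` operator as a NULL-safe comparison. The rows with a NULL company are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE xx (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT,
                     Company TEXT, "Group" INTEGER, Status TEXT);
    -- Rows 3 and 4 are hypothetical duplicates with a NULL company.
    INSERT INTO xx (ID, FirstName, LastName, Company) VALUES
        (1, 'x', 'x', 'x'), (2, 'x', 'x', 'x'),
        (3, 'y', 'y', NULL), (4, 'y', 'y', NULL);
""")

# Step 1: point every row at the lowest ID sharing its name and company.
# "IS" treats NULL as equal to NULL, unlike "=".
conn.execute("""
    UPDATE xx
    SET "Group" = (SELECT MIN(m.ID)
                   FROM xx AS m
                   WHERE m.FirstName IS xx.FirstName
                     AND m.LastName  IS xx.LastName
                     AND m.Company   IS xx.Company)
""")

# Step 2: the row that points at itself is the primary, the rest are secondary.
conn.execute("""
    UPDATE xx
    SET Status = CASE WHEN ID = "Group" THEN 'P' ELSE 'S' END
""")

rows = conn.execute('SELECT ID, "Group", Status FROM xx ORDER BY ID').fetchall()
print(rows)
```

Because MIN(ID) is deterministic, every run picks the same primary row, which is the property the ORDER BY ID in the ROW_NUMBER() versions is meant to provide.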
I have the following tables:
event_tbl
| event_id (PK) | event_date | event_location |
|---------------|------------|----------------|
| 1 | 01/01/2018 | Miami |
| 2 | 02/04/2018 | Tampa |
performer_tbl
| performer_id (PK) | event_id (FK) | genre |
|-------------------|---------------|-------|
| 1 | 1 | A |
| 2 | 1 | B |
| 3 | 2 | A |
| 4 | 2 | C |
I want to find events that have both genre A and genre B (should just return event 1), and I'm lost on writing the query. Maybe I just haven't had enough coffee, but all I can come up with is doing two derived columns with a case statement that count either genre and group by the event_id, then filtering both to >0. It just doesn't seem very elegant.
This should do the job (in MySQL, for other DBMS the syntax can be varied easily):
SELECT
e.event_id
FROM
event_tbl e
JOIN performer_tbl p USING(event_id)
GROUP BY e.event_id
HAVING SUM(IF(p.genre = 'A', 1, 0)) >= 1 AND SUM(IF(p.genre = 'B', 1, 0)) >= 1;
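For a quick check on the question's sample data, here is a sketch in SQLite through Python. It uses the portable CASE form in place of MySQL's IF(), and for contrast also runs a single EXISTS with genre IN ('A', 'B'), which answers the different question "has either genre".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event_tbl (event_id INTEGER PRIMARY KEY,
                            event_date TEXT, event_location TEXT);
    CREATE TABLE performer_tbl (performer_id INTEGER PRIMARY KEY,
                                event_id INTEGER, genre TEXT);
    INSERT INTO event_tbl VALUES (1, '01/01/2018', 'Miami'),
                                 (2, '02/04/2018', 'Tampa');
    INSERT INTO performer_tbl VALUES (1, 1, 'A'), (2, 1, 'B'),
                                     (3, 2, 'A'), (4, 2, 'C');
""")

# Both genres required: each conditional sum must find at least one performer.
both = conn.execute("""
    SELECT e.event_id
    FROM event_tbl AS e
    JOIN performer_tbl AS p ON p.event_id = e.event_id
    GROUP BY e.event_id
    HAVING SUM(CASE WHEN p.genre = 'A' THEN 1 ELSE 0 END) >= 1
       AND SUM(CASE WHEN p.genre = 'B' THEN 1 ELSE 0 END) >= 1
""").fetchall()

# Either genre is enough here, so event 2 (genres A and C) also qualifies.
either = conn.execute("""
    SELECT e.event_id
    FROM event_tbl AS e
    WHERE EXISTS (SELECT 1
                  FROM performer_tbl AS p
                  WHERE p.event_id = e.event_id
                    AND p.genre IN ('A', 'B'))
    ORDER BY e.event_id
""").fetchall()

print(both)
print(either)
```

The difference between the two results is the thing to watch for when adapting any of the answers: a single IN list tests "any of", while the HAVING conditions test "all of".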
If you are using SQL Server, check below:
Select * From
event_tbl
where event_id
IN
(
select event_id
from performer_tbl as A
where exists (select 1
from performer_tbl as B
where B.event_id = A.event_id and B.genre = 'A')
and
exists (select 1
from performer_tbl as B
where B.event_id = A.event_id and B.genre = 'B')
)
This should work in any SQL database (at least in MySQL, SQL Server, Postgres, or Oracle):
select event_tbl.* FROM (
select event_id
from performer_tbl
where genre = 'A'
GROUP BY event_id) a_t
INNER JOIN (select event_id
from performer_tbl
where genre = 'B'
GROUP BY event_id) b_t
ON a_t.event_id = b_t.event_id
INNER JOIN event_tbl
ON event_tbl.event_id = a_t.event_id
This also works using left joins. Since there are no function calls or sub-selects, it should perform reasonably well, and it is usable in most SQL engines:
SELECT DISTINCT
p1.event_id
,e.event_date
,e.event_location
FROM
performer_tbl as p1
inner join event_tbl as e on
p1.event_id = e.event_id
left outer join performer_tbl as p2 on
p1.event_id = p2.event_id
AND p2.genre = 'A'
left outer join performer_tbl as p3 on
p1.event_id = p3.event_id
AND p3.genre = 'B'
WHERE
p2.genre IS NOT NULL
AND p3.genre IS NOT NULL;
If I correctly understand what you need, you can try this:
Select *
from event_tbl e
where exists (select *
from performer_tbl p
where p.event_id = e.event_id
and p.genre = 'A')
and exists (select *
from performer_tbl p
where p.event_id = e.event_id
and p.genre = 'B')