SQL Combine duplicate rows while concatenating one column - sql

I have a table (example) of orders shown below. The orders come in as multiple rows that are duplicated in every column except the product name. We want to combine the product names into a comma-delimited string, with each name in double quotes. I would like to write a SELECT query that returns the output format shown below.
INPUT
Name address city zip product name
-----------------------------------------------------------------
John Smith 123 e Test Drive Phoenix 85045 Eureka Copper Canyon, LX 4-Person Tent
John Smith 123 e Test Drive Phoenix 85045 The North Face Sequoia 4 Tent with Footprint
Tom Test 567 n desert lane Tempe 86081 Cannondale Trail 5 Bike - 2021
OUTPUT
Name address city zip product name
------------------------------------------------------------------
John Smith 123 e Test Drive Phoenix 85045 "Eureka Copper Canyon, LX 4-Person Tent", "The North Face Sequoia 4 Tent with Footprint"
Tom Test 567 n desert lane Tempe 86081 Cannondale Trail 5 Bike - 2021

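You can use LISTAGG() or GROUP_CONCAT() to build the product list per person and then join the result back to the original table. You can then remove the duplicates with ROW_NUMBER(): rows that share the same name, address, city and zip are numbered within their group, so keeping only row number 1 leaves one row per group.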
WITH ALL_DATA AS (
    SELECT * FROM ORDERS  -- ORDERS is a placeholder; use your own table name
),
LIST_OF_ITEMS_PER_PRODUCT AS (
    SELECT
        ALL_DATA.NAME,
        -- Wrap each product name in double quotes, as the question asks.
        -- If your SQL dialect doesn't support LISTAGG(), use GROUP_CONCAT() instead.
        LISTAGG('"' || ALL_DATA.PRODUCT_NAME || '"', ', ') AS ALL_PRODUCTS_PER_PERSON
    FROM ALL_DATA
    GROUP BY 1
),
LIST_ADDED AS (
    SELECT
        ALL_DATA.*,
        LIST_OF_ITEMS_PER_PRODUCT.ALL_PRODUCTS_PER_PERSON
    FROM ALL_DATA
    LEFT JOIN LIST_OF_ITEMS_PER_PRODUCT
        ON ALL_DATA.NAME = LIST_OF_ITEMS_PER_PRODUCT.NAME
),
ADDING_ROW_NUMBER AS (
    SELECT
        *,
        ROW_NUMBER() OVER (PARTITION BY NAME, ADDRESS, CITY, ZIP ORDER BY NAME) AS ROW_NUMBER_
    FROM LIST_ADDED
)
SELECT *
FROM ADDING_ROW_NUMBER
WHERE ROW_NUMBER_ = 1
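The same idea can be tried locally with a minimal sketch that uses Python's built-in sqlite3 module. It assumes a table called orders (a hypothetical name) holding the sample rows from the question, and it groups by all the duplicated columns directly instead of joining back and filtering on a row number. GROUP_CONCAT stands in for LISTAGG here, and every product is quoted, including single ones, which differs slightly from the requested output.

```python
import sqlite3

# Hypothetical in-memory table mirroring the sample orders from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (name TEXT, address TEXT, city TEXT, zip TEXT, product_name TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        ("John Smith", "123 e Test Drive", "Phoenix", "85045",
         "Eureka Copper Canyon, LX 4-Person Tent"),
        ("John Smith", "123 e Test Drive", "Phoenix", "85045",
         "The North Face Sequoia 4 Tent with Footprint"),
        ("Tom Test", "567 n desert lane", "Tempe", "86081",
         "Cannondale Trail 5 Bike - 2021"),
    ],
)

# Wrap every product in double quotes, then concatenate per customer/address.
# The order of the concatenated items is not guaranteed by this query.
combined = conn.execute(
    """
    SELECT name, address, city, zip,
           GROUP_CONCAT('"' || product_name || '"', ', ') AS product_names
    FROM orders
    GROUP BY name, address, city, zip
    ORDER BY name
    """
).fetchall()

for row in combined:
    print(row)
```

Grouping by every non-product column gives one row per order directly, so the join back and the ROW_NUMBER() filter are only needed when the other columns must be carried along unchanged.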

Related

Dynamic Pivoting in SQL - Snowflake

I have an input file
Customer PhoneNum Location Brand
John 1234 ABC Oppo
John 1234 DEF MI
John 1234 KLM RealMe
John 1234 LKM 1+
Joe 9934 ABC Apple
Joe 9934 DEF Samsung
The same phone number can be listed against multiple phone brands, and the number of brands per phone number is dynamic, i.e. some have 2 brands, some 4, some 8, etc. I can pass the list of unique brands to the pivot query, but that would create columns that might not have values.
The result I want is:
Customer PhoneNum Brand1 Brand1Location Brand2 Brand2Location Brand3 Brand3Location Brand4 Brand4Location
John 1234 Oppo ABC MI DEF RealMe KLM 1+ LKM
Joe 9934 Apple ABC Samsung DEF
Here I don't need the list of brands, but if I know that the maximum number of records per phone number is, say, 4, I can have the output in the above format, which I believe is a good way to read the result.
Is there any way in SQL to get the above result? This is what I have tried:
select * from phone_multiple_make
pivot(max(location),max(brand) for brand in
('MI','Oppo','RealMe') )as p;
If you're OK with not using the PIVOT function, you can achieve your result like this:
WITH CTE AS (
    SELECT 'John' CUSTOMER, 1234 PHONENUM, 'ABC' LOCATION, 'Oppo' BRAND UNION
    SELECT 'John' CUSTOMER, 1234 PHONENUM, 'DEF' LOCATION, 'MI' BRAND UNION
    SELECT 'John' CUSTOMER, 1234 PHONENUM, 'KLM' LOCATION, 'RealMe' BRAND UNION
    SELECT 'John' CUSTOMER, 1234 PHONENUM, 'LKM' LOCATION, '1+' BRAND UNION
    SELECT 'Joe' CUSTOMER, 9934 PHONENUM, 'ABC' LOCATION, 'Apple' BRAND UNION
    SELECT 'Joe' CUSTOMER, 9934 PHONENUM, 'DEF' LOCATION, 'Samsung' BRAND
)
SELECT CUSTOMER, PHONENUM
    ,J:BRAND1:BRAND::STRING BRAND1, J:BRAND1:LOCATION::STRING LOCATION1
    ,J:BRAND2:BRAND::STRING BRAND2, J:BRAND2:LOCATION::STRING LOCATION2
    ,J:BRAND3:BRAND::STRING BRAND3, J:BRAND3:LOCATION::STRING LOCATION3
    ,J:BRAND4:BRAND::STRING BRAND4, J:BRAND4:LOCATION::STRING LOCATION4
FROM (
    SELECT CUSTOMER, PHONENUM, OBJECT_AGG(KEY, OBJ) J
    FROM (
        SELECT CUSTOMER, PHONENUM
            -- Number the brands per customer/phone to build the keys BRAND1, BRAND2, ...
            ,'BRAND' || ROW_NUMBER() OVER (PARTITION BY CUSTOMER, PHONENUM ORDER BY BRAND)::STRING KEY
            ,OBJECT_CONSTRUCT('LOCATION', LOCATION, 'BRAND', BRAND) OBJ
        FROM CTE
    )
    GROUP BY 1, 2
)
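For engines without OBJECT_AGG, a similar layout can be produced with conditional aggregation over a row number. This is a different technique from the Snowflake query above, sketched here with Python's sqlite3 module (window functions need SQLite 3.25 or newer). The MAX_BRANDS constant is an assumption standing in for the known maximum number of brands per phone number.

```python
import sqlite3

MAX_BRANDS = 4  # assumed upper bound on brands per phone number

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phone_multiple_make "
    "(customer TEXT, phonenum INTEGER, location TEXT, brand TEXT)"
)
conn.executemany(
    "INSERT INTO phone_multiple_make VALUES (?, ?, ?, ?)",
    [
        ("John", 1234, "ABC", "Oppo"),
        ("John", 1234, "DEF", "MI"),
        ("John", 1234, "KLM", "RealMe"),
        ("John", 1234, "LKM", "1+"),
        ("Joe", 9934, "ABC", "Apple"),
        ("Joe", 9934, "DEF", "Samsung"),
    ],
)

# One brand/location column pair per slot; slots beyond a customer's
# brand count come back as NULL.
pivot_columns = ",\n".join(
    f"MAX(CASE WHEN rn = {i} THEN brand END) AS brand{i}, "
    f"MAX(CASE WHEN rn = {i} THEN location END) AS brand{i}location"
    for i in range(1, MAX_BRANDS + 1)
)

query = f"""
WITH numbered AS (
    SELECT customer, phonenum, location, brand,
           ROW_NUMBER() OVER (
               PARTITION BY customer, phonenum ORDER BY brand
           ) AS rn
    FROM phone_multiple_make
)
SELECT customer, phonenum,
{pivot_columns}
FROM numbered
GROUP BY customer, phonenum
ORDER BY customer
"""

pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
```

Because the slots are assigned by ORDER BY brand, the brands appear alphabetically rather than in the order of the input file; a different ordering column would be needed to reproduce the input order.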

SQL query combine rows based on common id

I have a table with the below structure:
MID | FromCountry | FromState | FromCity | FromAddress | FromNumber | FromApartment | ToCountry | ToCity | ToAddress | ToNumber | ToApartment
123 | USA | Texas | Houston | Well Street | 1 | | Japan | Tokyo | | 6 | ET3
123 | Germany | Bremen | Bremen | Nice Street | 4 | | Poland | Warsaw | | 9 | ET67
456 | France | Corsica | Corsica | Amz Street | 3 | | Italy | Milan | | 8 | AEC784
456 | UK | UK | London | G Street | 2 | | Portugal | Lisbon | | 1 | LP400
The desired outcome is:
MID | FromCountry | FromState | FromCity | FromAddress | FromNumber | FromApartment | ToCountry | ToCity | ToAddress | ToNumber | ToApartment | FromCountry1 | FromState1 | FromCity1 | FromAddress1 | FromNumber1 | FromApartment1 | ToCountry1 | ToCity1 | ToAddress1 | ToNumber1 | ToApartment1
123 | USA | Texas | Houston | Well Street | 1 | | Japan | Tokyo | | 6 | ET3 | Germany | Bremen | Bremen | Nice Street | 4 | | Poland | Warsaw | | 9 | ET67
456 | France | Corsica | Corsica | Amz Street | 3 | | Italy | Milan | | 8 | AEC784 | UK | UK | London | G Street | 2 | | Portugal | Lisbon | | 1 | LP400
What I am trying to achieve is to bring multiple rows of one table that share the same MID into a single row, regardless of whether some columns have empty values.
I think I overcomplicated the solution, as I was trying something like this (and of course the outcome is not the desired one):
select [MID],
STUFF(
(select concat('', [FromCountry])
FROM test i
where i.[MID] = o.[MID]
for xml path ('')),1,1,'') as FromCountry
,stuff (
(select concat('', [FromState])
FROM test i
where i.[MID] = o.[MID]
for xml path ('')),1,1,'') as FromState
,stuff (
(select concat('', [FromCity])
FROM test i
where i.[MID] = o.[MID]
for xml path ('')),1,1,'') as FromCity
,stuff (
(select concat('', [FromAddress])
FROM test i
where i.[MID] = o.[MID]
for xml path ('')),1,1,'') as FromAddress
FROM test o
group by [MID]
...
Is there any way to achieve this?
Assuming there are no more than 2 rows per MID, you can implement a simple row_number() solution.
You need to join one row for each MID to the other, so assign a unique value to each using row_number. There's nothing I can immediately see that indicates which row should be the "second" row, so this assigns row numbers based on FromCountry; amend as necessary.
I'm not reproducing all the columns here, but you get the idea: rinse and repeat for each column.
with m as (
select *, Row_Number() over(partition by Mid order by FromCountry) seq
from t
)
select m.Mid,
m.fromcountry, m.fromstate,
m2.fromcountry FromCountry1, m2.fromstate FromState1
from m
join m m2 on m.mid = m2.mid and m2.seq = 2
where m.seq = 1;
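The self-join above can be tried out with a minimal sketch using Python's sqlite3 module. It uses only a few of the columns, and it swaps the inner join for a LEFT JOIN so that a MID with a single row is still returned; the MID 789 row is a hypothetical addition to show that case and is not part of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (mid INTEGER, fromcountry TEXT, fromstate TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (123, "USA", "Texas"),
        (123, "Germany", "Bremen"),
        (456, "France", "Corsica"),
        (456, "UK", "UK"),
        (789, "Spain", "Madrid"),  # hypothetical MID with only one row
    ],
)

# Number the rows per MID, then attach row 2 (if any) to row 1.
merged = conn.execute(
    """
    WITH m AS (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY mid ORDER BY fromcountry) AS seq
        FROM t
    )
    SELECT m.mid, m.fromcountry, m.fromstate,
           m2.fromcountry AS fromcountry1, m2.fromstate AS fromstate1
    FROM m
    LEFT JOIN m AS m2 ON m.mid = m2.mid AND m2.seq = 2
    WHERE m.seq = 1
    ORDER BY m.mid
    """
).fetchall()

for row in merged:
    print(row)
```

With the inner join from the answer, the single-row MID would be dropped from the result; which behavior is right depends on whether such MIDs can occur.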

Need Distinct address, ID etc with the different amount in one table using Plsql

I need help with the scenario below, please.
I want distinct address, ID, etc. with the different amounts in one table using plsql or
For example, below is the current table:
Address aRea zipcode ID Amount amount2 qua number
123 Howe's drive AL 1234 1234567 100 20 1 666666
123 Howe's drive AL 1234 1234567 5 05 2 abcccc
123 east drive AZ 456 8910112 200 11 1 777777
123 east drive AZ 456 8910112 5 5 2 SDN133
116 WOOD Ave NL 1234 2325890 3.23 1.25 1 10483210
116 WOOD Ave NL 1234 2325890 3.24 1.26 2 10483211
I need the output as below.
Address aRea zipcode ID Amount amount2 qua number
123 Howe's drive AL 1234 1234567 100 20 1 666666
5 05 2 abcccc
123 east drive AZ 456 8910112 200 11 1 777777
5 5 2 SDN133
116 WOOD Ave NL 1234 2325890 3.23 1.25 1 10483210
3.24 1.26 2 10483211
The following is for BigQuery Standard SQL.
I would recommend the approach below:
#standardSQL
SELECT address, area, zipcode, id,
ARRAY_AGG(STRUCT(amount, amount2, qua, number)) info
FROM `project.dataset.table`
GROUP BY address, area, zipcode, id
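Applied to the sample data from your question, this returns one row per address, with the amount, amount2, qua and number values collected into the info array.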
This type of task is usually better done on application side.
You can do this with SQL - but you need a column to order the record consistently:
select
case when rn = 1 then address end as address,
case when rn = 1 then area end as area,
case when rn = 1 then zipcode end as zipcode,
case when rn = 1 then id end as id,
amount,
amount2,
qua,
number
from (
select
t.*,
row_number() over(
partition by address, area, zipcode, id
order by ??
) rn
from mytable t
) t
order by address, area, zipcode, id, ??
The partition by clause of the window function lists the columns that you want to "group" together; you can modify it as you really need.
The order by clause of row_number() indicates how rows should be sorted within the partition: you need to decide which column (or set of columns) you want to use. For the output to make sense, you also need an order by clause in the outer query, where the partition and ordering columns are repeated.
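As a minimal sketch of the row_number() approach, the following uses Python's sqlite3 module with a reduced set of columns and assumes that qua is the column that orders rows within each address (the answer leaves that choice open). The output aliases end in _shown, a naming choice made here so that the outer ORDER BY keeps sorting on the original columns rather than on the blanked-out ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (address TEXT, id INTEGER, amount REAL, qua INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("123 Howe's drive", 1234567, 100.0, 1),
        ("123 Howe's drive", 1234567, 5.0, 2),
        ("123 east drive", 8910112, 200.0, 1),
        ("123 east drive", 8910112, 5.0, 2),
    ],
)

# Show the grouping columns only on the first row of each partition.
report = conn.execute(
    """
    SELECT CASE WHEN rn = 1 THEN address END AS address_shown,
           CASE WHEN rn = 1 THEN id END AS id_shown,
           amount,
           qua
    FROM (
        SELECT m.*,
               ROW_NUMBER() OVER (PARTITION BY address, id ORDER BY qua) AS rn
        FROM mytable m
    ) AS numbered
    ORDER BY address, id, qua
    """
).fetchall()

for row in report:
    print(row)
```

The blanked cells come back as NULL (None in Python); turning them into empty strings is a presentation step left to the caller.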

how to SQL query with conditioned distinct

Simple Database:
street | age
1st st | 2
2nd st | 3
3rd st | 4
3rd st | 2
I'd like to build a query that'll return the DISTINCT street names, but only for those households where no one is over 3.
so that result would be:
street | age
1st st | 2
2nd st | 3
How do I do that? I know of DISTINCT, but not how to make it conditional on all the records that match the DISTINCT value.
Suppose the name of the table is 'tab'. You can then try:
select distinct street from tab where street not in (select street from tab where age>3);
I have created a sql fiddle where you can view the result:
http://sqlfiddle.com/#!9/2c513d/2
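Distinct street names for households where no one is over 3:
SELECT street
FROM tab
GROUP BY street
HAVING MAX(age) <= 3
Note that filtering individual rows is not enough: the query below still returns 3rd st, because one of its rows has age 2.
SELECT DISTINCT street
FROM tab
WHERE NOT(age>3)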
Use GROUP BY:
Select Street
from yourtable
group by street
Having max(age)<=3
Another way this could be achieved is with NOT EXISTS:
SELECT *
FROM yourtable a
WHERE NOT EXISTS
(SELECT street
FROM yourtable b
WHERE age > 3
AND a.street = b.street)
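The approaches can be compared on the sample data with a small sketch using Python's sqlite3 module. The table name tab follows the first answer; the GROUP BY variant shown here uses MAX(age), since a street qualifies only when its oldest resident is 3 or younger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (street TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?)",
    [("1st st", 2), ("2nd st", 3), ("3rd st", 4), ("3rd st", 2)],
)

# Exclude every street that has at least one resident over 3.
not_in = [r[0] for r in conn.execute(
    """
    SELECT DISTINCT street FROM tab
    WHERE street NOT IN (SELECT street FROM tab WHERE age > 3)
    ORDER BY street
    """
)]

# Keep streets whose oldest resident is at most 3.
having_max = [r[0] for r in conn.execute(
    """
    SELECT street FROM tab
    GROUP BY street
    HAVING MAX(age) <= 3
    ORDER BY street
    """
)]

# Row-level filtering alone lets '3rd st' through via its age-2 row.
row_filter = [r[0] for r in conn.execute(
    "SELECT DISTINCT street FROM tab WHERE NOT (age > 3) ORDER BY street"
)]

print(not_in, having_max, row_filter)
```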

Select query which returns the exact number of rows as compared to the values in the subquery

I have a table named student. I have written this query:
select * From student where sname in ('rajesh','rohit','rajesh')
The above query returns two records: one matching 'rajesh' and another matching 'rohit'.
But I want 3 records: 2 for 'rajesh' and 1 for 'rohit'.
Please suggest a solution or tell me what I am missing.
NOTE: the number of values returned by the subquery is not fixed; there can be many words, some distinct and some occurring multiple times.
Thanks
Your requirements are not clear, and I'll try to explain why.
Let's define table students
ID FirstName LastName
1 John Smith
2 Mike Smith
3 Ben Bray
4 John Bray
5 John Smith
6 Bill Lynch
7 Bill Smith
Query with WHERE clause:
FirstName in ('Mike', 'Ben', 'Mike')
will return 2 rows only, because it could be rewritten as:
FirstName='Mike' or FirstName='Ben' or FirstName='Mike'
WHERE is a filtering clause that just says whether an existing row satisfies the given conditions or not (for each of the rows produced by the FROM clause).
Let's say we have a subquery (SQ) that returns any number of non-distinct FirstNames.
If SQ contains 'Mike', 'Ben', 'Mike', you can get those 3 rows with an inner join without a problem:
Select ST.* from Students ST
Inner Join (Select name from …. <your subquery>) SQ
On ST.FirstName=SQ.name
Result will be:
ID FirstName LastName
2 Mike Smith
2 Mike Smith
3 Ben Bray
Note that the data are not ordered by the order of the names returned by SQ. If you want that, SQ should return some ordering number, e.g.:
Ord Name
1. Mike
2. Ben
3. Mike
In that case query should be:
Select ST.* from Students ST
Inner Join (Select ord, name from …. <your subquery>) SQ
On ST.FirstName=SQ.name
Order By SQ.ord
And result:
ID FirstName LastName
2 Mike Smith (1)
3 Ben Bray (2)
2 Mike Smith (3)
Now, let's see what will happen if the subquery returns
Ord Name
1. Mike
2. Bill
3. Mike
You will end up with
ID FirstName LastName
2 Mike Smith (1)
6 Bill Lynch (2)
7 Bill Smith (2)
2 Mike Smith (3)
Even worse, if you have something like:
Ord Name
1. John
2. Bill
3. John
Result is:
ID FirstName LastName
1 John Smith (1)
4 John Bray (1)
5 John Smith (1)
6 Bill Lynch (2)
7 Bill Smith (2)
1 John Smith (3)
4 John Bray (3)
5 John Smith (3)
This is a complex situation, and you have to clarify precisely what the requirement is.
If you need only one student with the same name for each of the rows in SQ, you can use something like this (SQL 2005+):
;With st1 as
(
Select Row_Number() over (Partition by SQ.ord Order By ID) as rowNum,
ST.ID,
ST.FirstName,
ST.LastName,
SQ.ord
from Students ST
Inner Join (Select ord, name from …. <your subquery>) SQ
On ST.FirstName=SQ.name
)
Select ID, FirstName, LastName
From st1
Where rowNum=1 -- keep only the first student for each ord
Order By ord
It will return (for SQ values John, Bill, John)
ID FirstName LastName
1 John Smith (1)
6 Bill Lynch (2)
1 John Smith (3)
Note that the numbers (1), (2), (3) are shown to display the value of ord, although they are not returned by the query.
If you can split the where clause in your calling code, you could perform a UNION ALL on each clause.
SELECT * FROM Student WHERE sname = 'rajesh'
UNION ALL SELECT * FROM Student WHERE sname = 'rohit'
UNION ALL SELECT * FROM Student WHERE sname = 'rajesh'
Try using a JOIN:
SELECT ...
FROM Student s
INNER JOIN (
SELECT 'rajesh' AS sname
UNION ALL
SELECT 'rohit'
UNION ALL
SELECT 'rajesh') t ON s.sname = t.sname
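The difference between IN and a join against a list with repeats can be checked with a short sketch using Python's sqlite3 module. The student rows here are hypothetical (the question does not show the table), with one row per name so that the effect of the repeated 'rajesh' is easy to see.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER, sname TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(1, "rajesh"), (2, "rohit"), (3, "amit")],  # hypothetical sample rows
)

# IN only tests membership, so the repeated 'rajesh' has no effect.
in_rows = conn.execute(
    "SELECT * FROM student WHERE sname IN ('rajesh', 'rohit', 'rajesh')"
).fetchall()

# Joining against a derived table keeps one output row per listed value.
join_rows = conn.execute(
    """
    SELECT s.id, s.sname
    FROM student s
    INNER JOIN (
        SELECT 'rajesh' AS sname
        UNION ALL
        SELECT 'rohit'
        UNION ALL
        SELECT 'rajesh'
    ) t ON s.sname = t.sname
    ORDER BY s.id
    """
).fetchall()

print(len(in_rows), len(join_rows))
```

If the table held several students named rajesh, the join would return each of them once per occurrence in the list, which is the multiplication effect described in the longer answer above.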
Just because you've got a criterion in there twice doesn't mean that it will return one result per criterion. SQL engines usually just use the unique criteria; thus, in your example, there are effectively 2 criteria in the IN clause: 'rajesh' and 'rohit'.
Why do you need to return 2 results? Are there two rajesh rows in your table? Then they should BOTH be returned. You don't need to ask for rajesh twice for that to happen. What does your data look like? What do you want to see returned?
Hi, I ran a query just like the one you gave above, and it returns all the data that matches the condition in the IN clause, just like in your post:
select * from person
where personid in (
'Carson','Kim','Carson'
)
order by FirstName
and it gives me all the records which fulfill these criteria.