SQL Query to concat duplicated rows

Let's say I have a table with 3 columns and the following values:
| A | B | C |
| A1| B1| C |
| A | B | C1|
I'd like to write a query that returns
| A | B | C, C1|
| A1| B1| C    |
i.e. distinct values of the first and second columns, with the values of the third column concatenated. Any help would be greatly appreciated.

You need to use LISTAGG:
SELECT
    cola,
    colb,
    LISTAGG(colc, ', ') WITHIN GROUP (ORDER BY colc) AS colc
FROM mytable
GROUP BY
    cola,
    colb
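For illustration, here is a minimal sketch (Oracle syntax, using the hypothetical mytable/cola/colb/colc names from the query above) showing what the aggregation produces for the sample data:
-- hypothetical sample data mirroring the question
CREATE TABLE mytable (cola VARCHAR2(10), colb VARCHAR2(10), colc VARCHAR2(10));
INSERT INTO mytable VALUES ('A',  'B',  'C');
INSERT INTO mytable VALUES ('A1', 'B1', 'C');
INSERT INTO mytable VALUES ('A',  'B',  'C1');

SELECT cola, colb,
       LISTAGG(colc, ', ') WITHIN GROUP (ORDER BY colc) AS colc
FROM mytable
GROUP BY cola, colb;

-- expected output:
-- A  | B  | C, C1
-- A1 | B1 | C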

Related

How to join 2 tables when they have different values in a column

I have 2 tables as follows.
I need to join these 2 tables to get the table below.
I am trying different joins but not getting the expected results. Could you please help me get the desired table?
Really appreciate your help.
Thanks...
Hope this solution can help you (I used SQL Server syntax):
SELECT ISNULL(date1, date2) AS Date3, ISNULL(RM, 0), ISNULL(KM, 0)
FROM table1
FULL JOIN table2 ON table1.Date1 = table2.Date2
ORDER BY Date3;
[EDIT]:
Live demo
create table Table1 (DATE1 date, RM int);
INSERT INTO Table1 VALUES ('1/4/2020' , 1);
INSERT INTO Table1 VALUES ('2/1/2020' , 4);
INSERT INTO Table1 VALUES ('2/10/2020' , 4);
GO
3 rows affected
create table Table2 (DATE2 date, KM int);
INSERT INTO Table2 VALUES ('2/2/2020' , 1);
INSERT INTO Table2 VALUES ('2/10/2020' , 3);
INSERT INTO Table2 VALUES ('3/5/2020' , 2);
GO
3 rows affected
select * from Table1;
GO
DATE1 | RM
:--------- | -:
2020-01-04 | 1
2020-02-01 | 4
2020-02-10 | 4
select * from Table2;
GO
DATE2 | KM
:--------- | -:
2020-02-02 | 1
2020-02-10 | 3
2020-03-05 | 2
SELECT ISNULL(date1, date2) AS Date3, ISNULL(RM, 0), ISNULL(KM, 0)
FROM table1
FULL JOIN table2 ON table1.Date1 = table2.Date2
ORDER BY Date3;
GO
Date3 | (No column name) | (No column name)
:--------- | ---------------: | ---------------:
2020-01-04 | 1 | 0
2020-02-01 | 4 | 0
2020-02-02 | 0 | 1
2020-02-10 | 4 | 3
2020-03-05 | 0 | 2
db<>fiddle here
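If you want named result columns instead of (No column name), you could alias the ISNULL expressions, e.g.:
SELECT ISNULL(date1, date2) AS Date3,
       ISNULL(RM, 0)        AS RM,
       ISNULL(KM, 0)        AS KM
FROM table1
FULL JOIN table2 ON table1.Date1 = table2.Date2
ORDER BY Date3;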
I don't know Scala, but in PySpark you can do the following:
df1.join(df2, 'DATE', 'full').fillna(0)
Essentially you do a full join and fill all the NULLs with 0.
For Hive SQL I guess it would be something like
SELECT COALESCE(table1.Date, table2.Date) AS Date,
       CASE WHEN table1.RM IS NOT NULL THEN table1.RM ELSE 0 END AS RM,
       CASE WHEN table2.KM IS NOT NULL THEN table2.KM ELSE 0 END AS KM
FROM table1
FULL JOIN table2
ON table1.Date = table2.Date
I have created two initial DataFrames named df_rm and df_km as a source for your data.
df_rm looks like this:
+---------+---+
| date| rm|
+---------+---+
| 1/4/2020| 1|
| 2/1/2020| 4|
|2/10/2020| 4|
+---------+---+
df_km:
+---------+---+
| date| km|
+---------+---+
| 2/2/2020| 1|
|2/10/2020| 3|
| 3/5/2020| 2|
+---------+---+
Now, first we can do an outer join and then replace the null values with some value, in this case 0.
// requires: import org.apache.spark.sql.functions.{col, when}
df_km.join(df_rm, Seq("date"), "outer")
  .withColumn("rm", when(col("rm").isNull, 0).otherwise(col("rm")))
  .withColumn("km", when(col("km").isNull, 0).otherwise(col("km")))
  .show()
Which outputs like this:
+---------+---+---+
| date| km| rm|
+---------+---+---+
| 3/5/2020| 2| 0|
| 2/2/2020| 1| 0|
| 2/1/2020| 0| 4|
| 1/4/2020| 0| 1|
|2/10/2020| 3| 4|
+---------+---+---+

Group Concatenate Strings (Rows) in BigQuery

I am working with Google BigQuery & I have a query that looks like the following:
SELECT
    prod.abc,
    uniqueid,
    variable2,
    cust.variable1,
    purch.variable2
FROM mydata.order
LEFT JOIN UNNEST(purchases) AS purch,
    UNNEST(codes_abs) AS cod,
    UNNEST(cod.try_products) AS prod
When I do this, this results in a table that looks like this:
|prod.abc| uniqueid | variable2 | ...|
|APP123 | customer1| value | ...|
|BLU155 | customer1| value | ...|
|TRI134 | customer1| value | ...|
|LO123 | customer2| value | ...|
|ZU9274 | customer2| value | ...|
|TO134 | customer3| value | ...|
What I would like to do is concatenate the values in column "prod.abc", group them by "uniqueid", and separate them by ",". I found numerous solutions online; however, since I have unnested other variables in my query, none of the solutions I found seem to work in my case. The values do not need to be ordered in any way. Basically, what I would like to end up with is:
|prod.abc | uniqueid | variable2 | ...|
|APP123, BLU155, TRI134 | customer1| value | ...|
|LO123, ZU9274 | customer2| value | ...|
|TO134 | customer3| value | ...|
It would also be okay to get a table like this where duplicates are kept, as I could remove them later on:
|prod.abc | uniqueid | variable2 | ...|
|APP123, BLU155, TRI134 | customer1| value | ...|
|APP123, BLU155, TRI134 | customer1| value | ...|
|APP123, BLU155, TRI134 | customer1| value | ...|
|LO123, ZU9274 | customer2| value | ...|
|LO123, ZU9274 | customer2| value | ...|
|TO134 | customer3| value | ...|
Any help is much appreciated. Thank you!
Does aggregation work? Do each UNNEST separately:
SELECT STRING_AGG(item.abc, ',') AS abc,
       uniqueid, variable2, cust.variable1, purch.variable2
FROM mydata.order LEFT JOIN
     UNNEST(purchases) AS purch
     ON true LEFT JOIN
     UNNEST(codes_abs) AS cod
     ON true LEFT JOIN
     UNNEST(cod.try_products) AS item
     ON true
GROUP BY uniqueid, variable2, cust.variable1, purch.variable2;
Below is for BigQuery Standard SQL
#standardSQL
SELECT
    STRING_AGG(prod.abc, ', ') AS abc,
    uniqueid,
    variable2,
    cust.variable1,
    purch.variable2
FROM mydata.order
LEFT JOIN UNNEST(purchases) AS purch
LEFT JOIN UNNEST(codes_abs) AS cod
LEFT JOIN UNNEST(cod.try_products) AS prod
GROUP BY uniqueid,
    variable2,
    cust.variable1,
    purch.variable2
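If you would rather collapse duplicate product codes inside each group as well, STRING_AGG also accepts DISTINCT; a small variation of the query above (untested against your schema):
#standardSQL
SELECT
    STRING_AGG(DISTINCT prod.abc, ', ') AS abc,
    uniqueid,
    variable2,
    cust.variable1,
    purch.variable2
FROM mydata.order
LEFT JOIN UNNEST(purchases) AS purch
LEFT JOIN UNNEST(codes_abs) AS cod
LEFT JOIN UNNEST(cod.try_products) AS prod
GROUP BY uniqueid, variable2, cust.variable1, purch.variable2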

SQL: Merge localized version of a table to the main one

Imagine I have a main table like:
Table guys
|id| name|profession|
|--|------|----------|
| 1| John| developer|
| 2| Mike| boss|
| 3| Roger| fireman|
| 4| Bob| policeman|
I also have a localized version which is not complete (the boss is missing):
Table guys_bg
|id| name | profession|
|--|------|-----------|
| 1| Джон|разработчик|
| 3|Роджър| пожарникар|
| 4| Боб| полицай|
I want to prioritize guys_bg results while still showing all the guys (The boss is still a guy, right?).
This is the desired result:
|id| name | profession|
|--|------|-----------|
| 1| Джон|разработчик|
| 2| Mike| boss|
| 3|Роджър| пожарникар|
| 4| Боб| полицай|
Take into consideration that both tables may have a lot of columns (100+), so joining the tables and using CASE for every column would be very tedious.
What are my options?
Here is one way using union all:
select gb.*
from guys_bg gb
union all
select g.*
from guys g
where not exists (select 1 from guys_bg gb where gb.id = g.id);
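Note that select * here relies on both tables having the same columns in the same order; if that is not guaranteed, it may be safer to list the columns explicitly, e.g.:
select gb.id, gb.name, gb.profession
from guys_bg gb
union all
select g.id, g.name, g.profession
from guys g
where not exists (select 1 from guys_bg gb where gb.id = g.id);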
You can also do it using a FULL JOIN.
SELECT
ISNULL(b.id,g.id) id
, ISNULL(b.name, g.name) name
, ISNULL(b.profession, g.profession) profession
FROM
guys g
FULL JOIN guys_bg b ON g.id = b.id
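ISNULL is SQL Server specific; if you need this to be portable, COALESCE does the same job in most databases (same sketch, just swapping the function):
SELECT
    COALESCE(b.id, g.id)                 AS id,
    COALESCE(b.name, g.name)             AS name,
    COALESCE(b.profession, g.profession) AS profession
FROM guys g
FULL JOIN guys_bg b ON g.id = b.id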

SQL Count occurrences of non-unique column

Suppose I have a SQL table looking something like this:
--------------------
| id| name|
--------------------
| 1| Alice|
| 2| Bob|
| 3| Alice|
| 4| Alice|
| 5| Jeff|
| ...| ...|
--------------------
Is it possible to formulate a query which returns a list of names and the number of times they occur? I've solved this by querying all the rows, removing duplicates, counting, and then ordering; it works, but it just looks messy. Can this be neatened up in a SQL query?
This is standard SQL and should deliver your expected result:
select name, count(*)
from tblName
group by name
order by name
If you want to order by the count in descending order, you can use:
select name, count(*)
from tblName
group by name
order by 2 DESC
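If you only care about the names that are actually non-unique (i.e. appear more than once), a HAVING clause can filter the groups, for example:
select name, count(*) as cnt
from tblName
group by name
having count(*) > 1
order by cnt desc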

Select with Nested table omitting duplicate values

I have two tables.
T1
--------------------------
|IDT1|DESCR | VALUE |
--------------------------
| 1|TEST 1 | 100|
| 2|TEST 2 | 80|
--------------------------
T2
-----------
|IDT2|IDT1|
-----------
| 1| 1|
| 2| 1|
| 3| 2|
-----------
The field T2.IDT1 is foreign key of T1.IDT1.
I need to omit the duplicate values from the T1 table (only), as in the second row of the result below.
----------------------------
|IDT1|DESCR |IDT2| VALUE|
----------------------------
| 1|TEST 1 | 1| 100|
| | | 2| |
| 2|TEST 2 | 3| 80|
----------------------------
I am using Firebird 2.5.
I'm not familiar with Firebird, but if this were an Oracle DB, you could try this:
select
    t1.idt1,
    t1.descr,
    t2.idt2,
    t1.value
from (
    select
        t2.idt2 idt2,
        case
            when lag(t2.idt1) over (order by t2.idt1, t2.idt2) = t2.idt1 then null
            else t2.idt1
        end idt1
    from t2
) t2
left outer join t1
    on t1.idt1 = t2.idt1
order by 3;
You can test that here: SQL Fiddle
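One caveat: Firebird 2.5 does not support window functions such as LAG (those arrived in Firebird 3.0), so on 2.5 you would need another approach. A rough, untested sketch using a correlated subquery to blank out the repeated T1 values:
select
    case when t2.idt2 = (select min(x.idt2) from t2 x where x.idt1 = t2.idt1)
         then t1.idt1 end as idt1,
    case when t2.idt2 = (select min(x.idt2) from t2 x where x.idt1 = t2.idt1)
         then t1.descr end as descr,
    t2.idt2,
    -- note: VALUE is a reserved word in Firebird and may need to be a quoted identifier
    case when t2.idt2 = (select min(x.idt2) from t2 x where x.idt1 = t2.idt1)
         then t1.value end as val
from t2
join t1 on t1.idt1 = t2.idt1
order by t1.idt1, t2.idt2;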