Create new column with all unique values from another column in SQL


I have the following table:
>>> id crop grower loc
0 11 maize Lulu Fiksi
1 13 maize Lulu Menter
2 05 maize Felix Hausbauch
3 04 apples Lulu Fiksi
4 02 apples Meni Linter
5 06 cotton Delina Marchi
6 12 cotton Lexi Tinta
7 16 cotton Lexi Ferta
...
I want to create a new table that shows the unique crop names, the number of times each crop appears, and a list of all the growers that grow the crop, so the result table should look like this:
>>> crop total_count growers
0 maize 3 Lulu, Felix
1 apples 2 Lulu,Meni
2 cotton 3 Delina, Lexi
I managed to create a table that shows the crops and the total count, but without the growers' names:
select "CROP",count(*) "totalCount"
from "table"
group by "CROP"
order by "totalCount" desc
My question is: how can I create a new table with a new column that contains the list of unique growers for each crop (as in the example)?

GROUP_CONCAT is for MySQL; Snowflake uses LISTAGG:
create or replace table test (
id int,
crop varchar,
grower varchar,
loc varchar
);
insert into test values
(11, 'maize', 'Lulu', 'Fiksi'),
(13, 'maize', 'Lulu', 'Menter'),
(5, 'maize', 'Felix', 'Hausbauch'),
(4, 'apples', 'Lulu', 'Fiksi'),
(2, 'apples', 'Meni', 'Linter'),
(6, 'cotton', 'Delina', 'Marchi'),
(12, 'cotton', 'Lexi', 'Tinta'),
(16, 'cotton', 'Lexi', 'Ferta');
select
crop,
count(1) as total_count,
listagg(distinct grower, ', ') as growers
from test
group by crop
;
+--------+-------------+--------------+
| CROP | TOTAL_COUNT | GROWERS |
|--------+-------------+--------------|
| maize | 3 | Lulu, Felix |
| apples | 2 | Lulu, Meni |
| cotton | 3 | Delina, Lexi |
+--------+-------------+--------------+

You can use GROUP_CONCAT() or the equivalent function for your database. Adding DISTINCT keeps each grower only once per crop:
select "CROP",count(*) "totalCount",GROUP_CONCAT(DISTINCT grower) as growers
from "table"
group by "CROP"
order by "totalCount" desc
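The aggregation described above can be tried locally without Snowflake or MySQL. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in engine; its group_concat plays the role of LISTAGG/GROUP_CONCAT, and the table name test is borrowed from the Snowflake answer.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in engine (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table test (id int, crop text, grower text, loc text)")
conn.executemany(
    "insert into test values (?, ?, ?, ?)",
    [
        (11, "maize", "Lulu", "Fiksi"),
        (13, "maize", "Lulu", "Menter"),
        (5, "maize", "Felix", "Hausbauch"),
        (4, "apples", "Lulu", "Fiksi"),
        (2, "apples", "Meni", "Linter"),
        (6, "cotton", "Delina", "Marchi"),
        (12, "cotton", "Lexi", "Tinta"),
        (16, "cotton", "Lexi", "Ferta"),
    ],
)

# DISTINCT inside the aggregate drops repeated growers within each crop.
# The one-argument form of group_concat is used, so the separator is a comma.
rows = conn.execute(
    """
    select crop, count(*) as total_count, group_concat(distinct grower) as growers
    from test
    group by crop
    order by total_count desc, crop
    """
).fetchall()

# The order of names inside the concatenated string is not guaranteed,
# so the growers are turned into sets before being inspected.
summary = {crop: (total, set(growers.split(","))) for crop, total, growers in rows}
print(summary)
```

Each crop maps to its row count plus the set of distinct growers, which is the shape the question asks for.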

Related

How to calculate percent change between two values in the same column

I want to calculate a percent change between two values in the same column in a specific form and I have no idea if what I’m trying to do is even possible.
I have a table with 3 fields (month, country, value):
order_month | country | value
2021-01 | UK | 10
2022-02 | UK | 20
2021-01 | France | 20
2022-02 | France | 18
2021-01 | Italy | 25
2021-02 | Italy | 35
What I struggle to get:
order_month | country | value
2021-01 | UK | 10
2022-02 | UK | 20
diff | UK | 10
2021-01 | France | 20
2022-01 | France | 18
diff | France | -2
2021-01 | Italy | 25
2022-02 | Italy | 35
diff | Italy | 10
I tried many things without success. Thanks a lot if you can help me on this.

You can use the LEAD/LAG window functions for this. I'd propose using this to create a new column for the difference, rather than hoping to add in a new row into the result to get the difference of the two rows above it.
Schema (MySQL v8.0)
CREATE TABLE data (
`order_month` date,
`country` VARCHAR(6),
`value` INTEGER
);
INSERT INTO data
(`order_month`, `country`, `value`)
VALUES
('2021-01-01', 'UK', '10'),
('2022-02-01', 'UK', '20'),
('2021-01-01', 'France', '20'),
('2022-02-01', 'France', '18'),
('2021-01-01', 'Italy', '25'),
('2022-02-01', 'Italy', '35');
Query #1
select *,
VALUE - Lead(VALUE) OVER (PARTITION BY COUNTRY ORDER BY ORDER_MONTH DESC) as Month_vs_Month
from data;
order_month | country | value | Month_vs_Month
2022-02-01 | France | 18 | -2
2021-01-01 | France | 20 |
2022-02-01 | Italy | 35 | 10
2021-01-01 | Italy | 25 |
2022-02-01 | UK | 20 | 10
2021-01-01 | UK | 10 |
View on DB Fiddle
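The LAG/LEAD approach above yields a difference; since the title asks for a percent change, the previous row's value can feed a percentage as well. This is a minimal sketch, assuming SQLite 3.25 or later (needed for window functions) driven from Python's sqlite3 module instead of MySQL; the prev_value, diff and pct_change column names are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table data (order_month text, country text, value integer)")
conn.executemany(
    "insert into data values (?, ?, ?)",
    [
        ("2021-01-01", "UK", 10),
        ("2022-02-01", "UK", 20),
        ("2021-01-01", "France", 20),
        ("2022-02-01", "France", 18),
        ("2021-01-01", "Italy", 25),
        ("2022-02-01", "Italy", 35),
    ],
)

# The inner query fetches the previous month's value per country with LAG;
# the outer query derives the absolute and the percent change from it.
# The first row of each country has no predecessor, so both columns are NULL.
rows = conn.execute(
    """
    select country, order_month, value,
           value - prev_value as diff,
           round(100.0 * (value - prev_value) / prev_value, 1) as pct_change
    from (
        select country, order_month, value,
               lag(value) over (partition by country order by order_month) as prev_value
        from data
    )
    order by country, order_month
    """
).fetchall()

for row in rows:
    print(row)
```

Multiplying by 100.0 rather than 100 forces floating-point division, so the percentage is not truncated to an integer.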

Demo: granted, I'm using SQL Server, but both engines support UNION and the FIRST_VALUE analytic function, so I think this will be OK.
Assumptions:
Your order_month is a string; otherwise the 'diff' label can't be stored in that column.
The collation in use sorts digits before letters; otherwise the sort could be off.
You're OK with the output being sorted by country.
WITH CTE AS (SELECT order_month, country, value
FROM data
UNION ALL
SELECT Distinct 'diff' order_month, country,
FIRST_VALUE(value) over (partition by country order by order_month DESC) -
FIRST_VALUE(value) over (partition by country order by order_month ASC) value
FROM data)
SELECT *
FROM CTE
ORDER BY country, order_month
Giving us:
+-------------+---------+-------+
| order_month | country | value |
+-------------+---------+-------+
| 2021-01-01 | France | 20 |
| 2022-02-01 | France | 18 |
| diff | France | -2 |
| 2021-01-01 | Italy | 25 |
| 2022-02-01 | Italy | 35 |
| diff | Italy | 10 |
| 2021-01-01 | UK | 10 |
| 2022-02-01 | UK | 20 |
| diff | UK | 10 |
+-------------+---------+-------+
What this does:
Generate a CTE that uses the FIRST_VALUE analytic twice, once ordered by month ascending and once descending.
We then subtract the older value from the newer.
We then need to group and order the data, which is why we select from a CTE.
I'm not a fan of overloading the order_month column with 'diff'.

You need to create two subqueries or CTEs that isolate the values to the months you're analyzing.
Subquery example
select
country,
value as jan_value
from {{table}}
where order_month = '2021-01'
Do the same for February, then join the two subqueries on country to create a new data set with country, jan_value, and feb_value. From this data set you can compute the difference between the values.
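The two-subquery join can be sketched end to end as follows. This assumes SQLite through Python's sqlite3 module, a table named data standing in for the {{table}} placeholder, and the months 2021-01 and 2022-02 taken from the question's sample rows; the diff column is my own addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table data (order_month text, country text, value integer)")
conn.executemany(
    "insert into data values (?, ?, ?)",
    [
        ("2021-01", "UK", 10),
        ("2022-02", "UK", 20),
        ("2021-01", "France", 20),
        ("2022-02", "France", 18),
        ("2021-01", "Italy", 25),
        ("2022-02", "Italy", 35),
    ],
)

# One subquery per month, joined on country so both values sit on the same row.
rows = conn.execute(
    """
    select j.country, j.jan_value, f.feb_value,
           f.feb_value - j.jan_value as diff
    from (select country, value as jan_value
          from data where order_month = '2021-01') as j
    join (select country, value as feb_value
          from data where order_month = '2022-02') as f
      on f.country = j.country
    order by j.country
    """
).fetchall()

for row in rows:
    print(row)
```

An inner join drops countries that are missing either month; a left join would keep them with a NULL difference.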

How get all value in same row without using multiple joins conditions in SQL?

I have below table structures
Student
Stud_ID | First_Name | Last_name | Contact
ID001 | AAA | AAA | 111
ID002 | BBB | BBB | 222
StudUser
Stud_ID | NUM | Value
ID001 | 10 | English
ID001 | 20 | Math
ID001 | 30 | Science
ID002 | 10 | English
ID002 | 20 | Math
Expected Output
Stud_id | First_name | 10 | 20 | 30
ID001 | AAA | English | Math | Science
ID002 | BBB | English | Math |
Current query I'm using
select
ST.stud_id,
ST.First_name,
EG.EGUV AS "10",
BU.BUUV AS "20",
A.AUV AS "30"
from
student ST,
(SELECT STUD_ID AS EGS,USER_VALUE AS EGUV FROM STUD_USER WHERE COL_NUM='10') EG,
(SELECT STUD_ID AS BUS,USER_VALUE AS BUUV FROM STUD_USER WHERE COL_NUM='20') BU,
(SELECT STUD_ID AS AUS,USER_VALUE AS AUV FROM STUD_USER WHERE COL_NUM='30') A
where
ST.STUD_ID=EG.EGS(+) and
ST.STUD_ID=BU.BUS(+) and
ST.STUD_ID=A.AUS(+)
Please let me know if there is a more optimized way to get all the user values.
Note: the table structure cannot be altered; I only have read permission.

You can use the Oracle PIVOT clause for this.
It will (sort of) dynamically create additional columns according to values of a selected column.
This is the query I used (using WITH clauses to simulate your input data):
WITH
Student
AS
(SELECT 'ID001' AS Stud_ID, 'AAA' AS First_Name, 'AAA' AS Last_name, 111 AS Contact FROM DUAL
UNION ALL
SELECT 'ID002', 'BBB', 'BBB', 222 FROM DUAL),
StudUser
AS
(SELECT 'ID001' AS Stud_ID, 10 AS NUM, 'English' AS VALUE FROM DUAL
UNION ALL
SELECT 'ID001', 20, 'Math' FROM DUAL
UNION ALL
SELECT 'ID001', 30, 'Science' FROM DUAL
UNION ALL
SELECT 'ID002', 10, 'English' FROM DUAL
UNION ALL
SELECT 'ID002', 20, 'Math' FROM DUAL),
Prepare
AS
(SELECT ST.STUD_ID,
ST.FIRST_NAME,
SU.NUM,
SU.VALUE
FROM STUDENT ST LEFT JOIN STUDUSER SU ON ST.STUD_ID = SU.STUD_ID)
SELECT *
FROM Prepare PIVOT (MIN (VALUE) FOR NUM IN (10, 20, 30))
The PIVOT clause expects an aggregate function, because it goes over all of the rows where e.g. STUD_ID = ID001 and NUM = 10 and returns a value according to the aggregate function that was used. Your example implies that the combination of STUD_ID and NUM is unique in the StudUser table, so most of Oracle's aggregate functions will do, as they only have to go over one value. If the combination weren't unique, you would have to think about how you want to aggregate the values.
The FOR section of the pivot clause lets you specify which values you would like to turn into columns. This, unfortunately, cannot be generated dynamically, e.g.:
SELECT *
FROM Prepare PIVOT (MIN (VALUE) FOR NUM IN ( SELECT DISTINCT NUM FROM StudUser ))
That would be invalid and wouldn't compile.
You can also specify desired column names for the pivoted columns, like so:
SELECT *
FROM Prepare PIVOT (MIN (VALUE) FOR NUM IN (10 Subject1, 20 Subject2, 30 Subject3))
I found the PIVOT clause merely by searching for "oracle turn values into columns", so I kinda guess you didn't find it because you weren't sure what to call this. Can't blame you.
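On engines without a PIVOT clause, the same result can be obtained with conditional aggregation, which is a different technique from Oracle's PIVOT but follows the same one-aggregate-per-output-column idea. This is a minimal sketch, assuming SQLite through Python's sqlite3 module and the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table student (stud_id text, first_name text)")
conn.execute("create table studuser (stud_id text, num int, value text)")
conn.executemany(
    "insert into student values (?, ?)",
    [("ID001", "AAA"), ("ID002", "BBB")],
)
conn.executemany(
    "insert into studuser values (?, ?, ?)",
    [
        ("ID001", 10, "English"),
        ("ID001", 20, "Math"),
        ("ID001", 30, "Science"),
        ("ID002", 10, "English"),
        ("ID002", 20, "Math"),
    ],
)

# One MIN(CASE ...) per target column: the CASE keeps only the rows whose NUM
# matches, and MIN collapses them to a single value per student.
# A student with no row for a given NUM gets NULL in that column.
rows = conn.execute(
    """
    select st.stud_id, st.first_name,
           min(case when su.num = 10 then su.value end) as "10",
           min(case when su.num = 20 then su.value end) as "20",
           min(case when su.num = 30 then su.value end) as "30"
    from student st
    left join studuser su on su.stud_id = st.stud_id
    group by st.stud_id, st.first_name
    order by st.stud_id
    """
).fetchall()

for row in rows:
    print(row)
```

As with PIVOT, the list of output columns is fixed in the query text; it only needs a single join instead of one per NUM value.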

How can I query a table for transitive matches?

Given the following tables
Units:
| id | singular | plural |
|----|----------|--------|
| 3 | onion | onions |
| 4 | bag | bags |
| 5 | gram | grams |
| 6 | ml | ml |
| 7 | mm | mm |
and
Conversions:
| id | convert_from | convert_to | factor |
|----|--------------|------------|--------|
| 3 | 4 | 3 | 5 |
| 4 | 3 | 5 | 125 |
How could I obtain all possible conversion factors from (for example) bag (unit 4)?
I would expect the answers to resemble the form
| convert_from | convert_to | factor |
|--------------|------------|--------|
| 4 | 3 | 5 |
| 4 | 5 | 625 |
Caveats:
There is no guarantee about which column of the conversions table (convert_from, convert_to) a unit might appear in.
Conversions that transit through units 5, 6, or 7 should be ignored.
That is to say,
1->2->4->5 is valid, 1->2->4->5->7 is not.
A SQL solution (or re-architecting of the database to facilitate a SQL solution) would be ideal, but a code solution that makes multiple SQL queries would also be appreciated.
There will be other units in the units table that should be ignored if they do not form part of the conversion graph (or if they form part of the branch through an invalid transition (5, 6, or 7)). This is a simplified view.
Illustrative example
Ignoring SQL and retrieving the data for a moment, here's what I'm trying to achieve:
I want to build a system where users can store household products. A product has a unit associated with it. The unit might be an SI unit, such as mm, ml, g.. or it might be a discrete unit such as onion, or can.
Units can have relationships amongst themselves, so for example 1 can -> 330 ml.
The complexity of my question comes from the fact that the conversions for a single unit might be spread across many products.
Considering the can example again, we can have a product called pepsi (crate of 24) with the unit being crate, and another product called pepsi (can) with the unit of can.
When the user creates the pepsi (can) product, they provide the following conversion:
1 can -> 330 ml
Later, the user creates the pepsi (crate of 24) product, and provides the following conversion:
1 crate -> 24 can
Finally, the user asks the question "how much pepsi do I have?"
I'd like to be able to answer:
25 cans
1.0417 crates
8250 ml.
However, I don't know how to convert crates to ml.
Here's another example in illustrated form:
Edit:
Changed mms and mls to mm and ml. Not sure what I was thinking...
Added diagram to help clarify what I'm looking for rather than the solution.

You can use a recursive CTE, assuming there are no cycles in the data.
I added an extra is_terminal column to identify the terminal units where you don't want to convert from anymore (5, 6, and 7). The query is:
with recursive
e (convert_from, convert_to, factor, is_terminal) as (
select id, id, 1, is_terminal from units where id = 4 -- bag
union all
select e.convert_from, c.convert_to, e.factor * c.factor, u.is_terminal
from e
join conversions c on c.convert_from = e.convert_to
join units u on u.id = c.convert_to
where not e.is_terminal
)
select * from e where convert_from <> convert_to
Result:
convert_from convert_to factor is_terminal
------------ ---------- ------ -----------
4 3 5 false
4 5 625 true
See the running example at DB Fiddle. Here's the data script I used to test:
create table units (
id int,
is_terminal boolean
);
insert into units (id, is_terminal) values
(3, false), (4, false),
(5, true), (6, true), (7, true);
create table conversions (
id int,
convert_from int,
convert_to int,
factor int
);
insert into conversions (id, convert_from, convert_to, factor) values
(3, 4, 3, 5),
(4, 3, 5, 125);
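The recursive CTE can also be run from application code with the starting unit passed as a parameter. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; is_terminal is stored as 0/1 because SQLite has no separate boolean type, and like the query above it only follows conversions in the convert_from to convert_to direction and assumes there are no cycles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table units (id int, is_terminal int)")
conn.executemany(
    "insert into units values (?, ?)",
    [(3, 0), (4, 0), (5, 1), (6, 1), (7, 1)],
)
conn.execute(
    "create table conversions (id int, convert_from int, convert_to int, factor int)"
)
conn.executemany(
    "insert into conversions values (?, ?, ?, ?)",
    [(3, 4, 3, 5), (4, 3, 5, 125)],
)

start_unit = 4  # bag

# The anchor row maps the start unit to itself with factor 1. Each recursive
# step follows one conversion and multiplies the factors along the path.
# Rows that reached a terminal unit are not expanded any further.
rows = conn.execute(
    """
    with recursive
    e (convert_from, convert_to, factor, is_terminal) as (
        select id, id, 1, is_terminal from units where id = ?
        union all
        select e.convert_from, c.convert_to, e.factor * c.factor, u.is_terminal
        from e
        join conversions c on c.convert_from = e.convert_to
        join units u on u.id = c.convert_to
        where not e.is_terminal
    )
    select convert_from, convert_to, factor
    from e
    where convert_from <> convert_to
    order by factor
    """,
    (start_unit,),
).fetchall()

for row in rows:
    print(row)
```

The bag to gram factor comes out as 5 * 125 = 625, matching the expected table in the question.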

T-SQL group by a key that is a parent in another table

I'm trying to write a query where I want to return the user that has the highest number of contributions to a certain table. To keep the case simple let's imagine many users can write many paragraphs that are eventually shown as a complete story. Below is the table structure (I'm given this structure, not my design but I have structured the column names at least ;))
paragraphs
-----
id (PK)
story_id (imaginary FK that I need to group the results by)
user_id (FK to userprop.id)
body
date
userprop
----
id (PK)
supervisor_id (FK to supervisors.id)
username (FK to users.name)
users
----
name
full_name
supervisors
----
id
full_name
I'm trying to write a query where I get each paragraph but with two extra columns showing the name of the supervisor that is used most often and the actual count of that supervisor's supervisions.
I've been fiddling with inner join subqueries, trying to subquery my way in the main select to the result and trying to group by a subquery. Some of the approaches are not even valid SQL and others didn't work out yet since I'm unsure how to group by supervisors.id through userprop.supervisor_id which is linked to paragraphs.user_id by userprop.id.
Can this be done in a single query, or do I have to incorporate some PHP and use multiple queries and some loops?
I have read only access to the database.
Sample data:
paragraphs
id story_id user_id body
----------------------------
1 1 1 Sample data
2 1 1 Sample data
3 2 1 Sample data
4 1 2 Sample data
5 1 3 Sample data
6 5 1 Sample data
userprop
id supervisor_id username
----------------------------
1 1 user_abc
2 1 user_def
3 2 user_ghi
users
name full_name
---------------------
user_abc Jack Jackson
user_def Bill Winters
user_ghi Sharon Staples
supervisors
id full_name
1 Steve Doppler
2 Frank Frampton
expected output
id story_id user_id body main_supervisor_count main_supervisor
---------------------------------------------------------------------------
1 1 1 Sample data 3 Steve Doppler
2 1 1 Sample data 3 Steve Doppler
3 2 1 Sample data 1 Steve Doppler
4 1 2 Sample data 3 Steve Doppler
5 1 3 Sample data 1 Frank Frampton
6 5 1 Sample data 1 Steve Doppler

It looks like a simple partitioned COUNT is all you need.
Sample data
DECLARE @paragraphs TABLE (id int, story_id int, user_id int, body nvarchar(max));
INSERT INTO @paragraphs(id, story_id, user_id, body) VALUES
(1, 1, 1, 'Sample data'),
(2, 1, 1, 'Sample data'),
(3, 2, 1, 'Sample data'),
(4, 1, 2, 'Sample data'),
(5, 1, 3, 'Sample data'),
(6, 5, 1, 'Sample data');
DECLARE @userprop TABLE (id int, supervisor_id int, username nvarchar(50));
INSERT INTO @userprop (id, supervisor_id, username) VALUES
(1, 1, 'user_abc'),
(2, 1, 'user_def'),
(3, 2, 'user_ghi');
DECLARE @supervisors TABLE (id int, full_name nvarchar(50));
INSERT INTO @supervisors (id, full_name) VALUES
(1, 'Steve Doppler'),
(2, 'Frank Frampton');
Query
SELECT
paragraphs.id
,paragraphs.story_id
,paragraphs.user_id
,paragraphs.body
,COUNT(*) OVER (PARTITION BY paragraphs.story_id, supervisors.id) AS main_supervisor_count
,supervisors.full_name AS main_supervisor
FROM
@paragraphs AS paragraphs
INNER JOIN @userprop AS userprop ON userprop.id = paragraphs.user_id
INNER JOIN @supervisors AS supervisors ON supervisors.id = userprop.supervisor_id
ORDER BY
paragraphs.id;
Result
+----+----------+---------+-------------+-----------------------+-----------------+
| id | story_id | user_id | body | main_supervisor_count | main_supervisor |
+----+----------+---------+-------------+-----------------------+-----------------+
| 1 | 1 | 1 | Sample data | 3 | Steve Doppler |
| 2 | 1 | 1 | Sample data | 3 | Steve Doppler |
| 3 | 2 | 1 | Sample data | 1 | Steve Doppler |
| 4 | 1 | 2 | Sample data | 3 | Steve Doppler |
| 5 | 1 | 3 | Sample data | 1 | Frank Frampton |
| 6 | 5 | 1 | Sample data | 1 | Steve Doppler |
+----+----------+---------+-------------+-----------------------+-----------------+
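The partitioned COUNT can be checked outside SQL Server as well. This is a minimal sketch, assuming SQLite 3.25 or later (for window functions) through Python's sqlite3 module, with ordinary tables in place of the table variables and the users table left out because the query does not need it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table paragraphs (id int, story_id int, user_id int, body text)")
conn.execute("create table userprop (id int, supervisor_id int, username text)")
conn.execute("create table supervisors (id int, full_name text)")
conn.executemany(
    "insert into paragraphs values (?, ?, ?, ?)",
    [
        (1, 1, 1, "Sample data"),
        (2, 1, 1, "Sample data"),
        (3, 2, 1, "Sample data"),
        (4, 1, 2, "Sample data"),
        (5, 1, 3, "Sample data"),
        (6, 5, 1, "Sample data"),
    ],
)
conn.executemany(
    "insert into userprop values (?, ?, ?)",
    [(1, 1, "user_abc"), (2, 1, "user_def"), (3, 2, "user_ghi")],
)
conn.executemany(
    "insert into supervisors values (?, ?)",
    [(1, "Steve Doppler"), (2, "Frank Frampton")],
)

# COUNT(*) OVER (PARTITION BY ...) keeps every paragraph row and attaches the
# number of paragraphs that share the same story and the same supervisor.
rows = conn.execute(
    """
    select p.id, p.story_id, p.user_id,
           count(*) over (partition by p.story_id, s.id) as main_supervisor_count,
           s.full_name as main_supervisor
    from paragraphs as p
    join userprop as up on up.id = p.user_id
    join supervisors as s on s.id = up.supervisor_id
    order by p.id
    """
).fetchall()

for row in rows:
    print(row)
```

Unlike GROUP BY, the window form does not collapse rows, which is why each paragraph still appears once in the output.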

sql insert from table to table

I have a table Farm with these columns
FarmID:(primary)
Kenizy:
BarBedo:
BarBodo:
MorKodo:
These columns are palm types in some language. Each of them contains a number that indicates how many palms of that type are in a farm.
Example:
FarmID | Kenizy | BarBedo | BarBodo | MorKodo
-----------------------------------------------
3 | 20 | 12 | 45 | 60
22 | 21 | 9 | 41 | 3
I want to insert that table into the following tables:
Table Palm_Farm
FarmID:(primary)
PalmID:(primary)
PalmTypeName:
Count:
That table connects each farm with each palm type.
Example:
FarmID | PalmID | PalmTypeName | Count
-----------------------------------------------
3 | 1 | Kenizy | 20
3 | 2 | BarBedo | 12
3 | 3 | BarBodo | 45
3 | 4 | MorKodo | 60
22 | 1 | Kenizy | 21
22 | 2 | BarBedo | 9
22 | 3 | BarBodo | 41
22 | 4 | MorKodo | 3
I have to use the following table Palms in order to take the PalmID column.
PalmID:(primary)
PalmTypeName:
...other not important columns
This table is to save information about each palm type.
Example:
PalmID | PalmTypeName
-------------------------
1 | Kenizy
2 | BarBedo
3 | BarBodo
4 | MorKodo
The PalmTypeName column has the same values as the COLUMN NAMES in the Farm table.
So my question is:
How do I insert the data from the Farm table into Palm_Farm, given that the PalmID exists in the Palms table?
I hope I have made my question clear. I tried to solve the problem myself, but because the column names in the Farm table must become column values in the Palm_Farm table, I couldn't work out how to do it.
I can't change the table structure because we are trying to help a customer with these already existing tables.
I am using SQL Server 2008, so MERGE is welcome.
Update
After the genius answer by @GarethD, I got this exception

You can use UNPIVOT to turn the columns into rows:
INSERT Palm_Farm (FarmID, PalmID, PalmTypeName, [Count])
SELECT upvt.FarmID,
p.PalmID,
p.PalmTypeName,
upvt.[Count]
FROM Farm AS f
UNPIVOT
( [Count]
FOR PalmTypeName IN ([Kenizy], [BarBedo], [BarBodo], [MorKodo])
) AS upvt
INNER JOIN Palms AS p
ON p.PalmTypeName = upvt.PalmTypeName;
Example on SQL Fiddle
The docs for UNPIVOT state:
UNPIVOT performs almost the reverse operation of PIVOT, by rotating columns into rows. Suppose the table produced in the previous example is stored in the database as pvt, and you want to rotate the column identifiers Emp1, Emp2, Emp3, Emp4, and Emp5 into row values that correspond to a particular vendor. This means that you must identify two additional columns. The column that will contain the column values that you are rotating (Emp1, Emp2,...) will be called Employee, and the column that will hold the values that currently reside under the columns being rotated will be called Orders. These columns correspond to the pivot_column and value_column, respectively, in the Transact-SQL definition.
To explain further how unpivot works, I will look at the first row original table:
FarmID | Kenizy | BarBedo | BarBodo | MorKodo
-----------------------------------------------
3 | 20 | 12 | 45 | 60
So what UNPIVOT will do is look for the columns specified in the UNPIVOT statement, and create a row for each column:
SELECT upvt.FarmID, upvt.PalmTypeName, upvt.[Count]
FROM Farm AS f
UNPIVOT
( [Count]
FOR PalmTypeName IN ([Kenizy], [BarBedo])
) AS upvt;
So here you are saying, for every row find the columns [Kenizy] and [BarBedo] and create a row for each, then for each of these rows create a new column called PalmTypeName that will contain the column name used, then put the value of that column into a new column called [Count]. Giving a result of:
FarmID | PalmTypeName | Count |
---------------------------
3 | Kenizy | 20 |
3 | BarBedo | 12 |
If you are running SQL Server 2000, or a later version with a lower compatibility level, then you may need to use a different query:
INSERT Palm_Farm (FarmID, PalmID, PalmTypeName, [Count])
SELECT f.FarmID,
p.PalmID,
p.PalmTypeName,
[Count] = CASE upvt.PalmTypeName
WHEN 'Kenizy' THEN f.Kenizy
WHEN 'BarBedo' THEN f.BarBedo
WHEN 'BarBodo' THEN f.BarBodo
WHEN 'MorKodo' THEN f.MorKodo
END
FROM Farm AS f
CROSS JOIN
( SELECT PalmTypeName = 'Kenizy' UNION ALL
SELECT PalmTypeName = 'BarBedo' UNION ALL
SELECT PalmTypeName = 'BarBodo' UNION ALL
SELECT PalmTypeName = 'MorKodo'
) AS upvt
INNER JOIN Palms AS p
ON p.PalmTypeName = upvt.PalmTypeName;
This is similar, but you have to create the additional rows yourself using UNION ALL inside the subquery upvt, then choose the value for [Count] using a case expression.
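The manual unpivot also works on engines that have no UNPIVOT at all. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; as a simplification of the query above, it cross joins Farm with the Palms table directly instead of a UNION ALL list of names, since Palms already holds one row per palm type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Farm (FarmID int primary key, Kenizy int, BarBedo int, "
    "BarBodo int, MorKodo int)"
)
conn.execute("create table Palms (PalmID int primary key, PalmTypeName text)")
conn.execute(
    'create table Palm_Farm (FarmID int, PalmID int, PalmTypeName text, '
    '"Count" int, primary key (FarmID, PalmID))'
)
conn.executemany(
    "insert into Farm values (?, ?, ?, ?, ?)",
    [(3, 20, 12, 45, 60), (22, 21, 9, 41, 3)],
)
conn.executemany(
    "insert into Palms values (?, ?)",
    [(1, "Kenizy"), (2, "BarBedo"), (3, "BarBodo"), (4, "MorKodo")],
)

# The cross join produces one row per (farm, palm type); the CASE expression
# then picks the Farm column whose name matches that palm type.
conn.execute(
    """
    insert into Palm_Farm (FarmID, PalmID, PalmTypeName, "Count")
    select f.FarmID, p.PalmID, p.PalmTypeName,
           case p.PalmTypeName
                when 'Kenizy'  then f.Kenizy
                when 'BarBedo' then f.BarBedo
                when 'BarBodo' then f.BarBodo
                when 'MorKodo' then f.MorKodo
           end
    from Farm as f
    cross join Palms as p
    where p.PalmTypeName in ('Kenizy', 'BarBedo', 'BarBodo', 'MorKodo')
    """
)

rows = conn.execute(
    'select FarmID, PalmID, PalmTypeName, "Count" from Palm_Farm '
    "order by FarmID, PalmID"
).fetchall()

for row in rows:
    print(row)
```

The WHERE clause keeps palm types that have no matching Farm column from being inserted with a NULL count.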
To update when the row exists, you can use MERGE:
WITH Data AS
( SELECT upvt.FarmID,
p.PalmID,
p.PalmTypeName,
upvt.[Count]
FROM Farm AS f
UNPIVOT
( [Count]
FOR PalmTypeName IN ([Kenizy], [BarBedo], [BarBodo], [MorKodo])
) AS upvt
INNER JOIN Palms AS p
ON p.PalmTypeName = upvt.PalmTypeName
)
MERGE Palm_Farm WITH (HOLDLOCK) AS pf
USING Data AS d
ON d.FarmID = pf.FarmID
AND d.PalmID = pf.PalmID
WHEN NOT MATCHED BY TARGET THEN
INSERT (FarmID, PalmID, PalmTypeName, [Count])
VALUES (d.FarmID, d.PalmID, d.PalmTypeName, d.[Count])
WHEN MATCHED THEN
UPDATE
SET [Count] = d.[Count],
PalmTypeName = d.PalmTypeName;
