Selecting distinct values for multiple columns - sql

I have a table where many pieces of data match to one in another column, similar to a tree, and then data at the 'leaf' about each specific leaf
eg
Food Group Name Caloric Value
Vegetables Broccoli 100
Vegetables Carrots 80
Fruits Apples 120
Fruits Bananas 120
Fruits Oranges 90
I would like to design a query that will return only the distinct values of each column, and then nulls to cover the overflow
eg
Food group Name Caloric Value
Vegetables Broccoli 100
Fruit Carrots 80
Apples 120
Bananas 90
Oranges
I'm not sure if this is possible, right now I've been trying to do it with cases, however I was hoping there would be a simpler way

Seems like you are simply trying to have all the distinct values at hand. Why? For displaying purposes? It's the application's job, not the server's. You could simply have three queries like this:
SELECT DISTINCT [Food Group] FROM atable;
SELECT DISTINCT Name FROM atable;
SELECT DISTINCT [Caloric Value] FROM atable;
and display their results accordingly.
But if you insist on having them all in one table, you might try this:
WITH atable ([Food Group], Name, [Caloric Value]) AS (
SELECT 'Vegetables', 'Broccoli', 100 UNION ALL
SELECT 'Vegetables', 'Carrots', 80 UNION ALL
SELECT 'Fruits', 'Apples', 120 UNION ALL
SELECT 'Fruits', 'Bananas', 120 UNION ALL
SELECT 'Fruits', 'Oranges', 90
),
atable_numbered AS (
SELECT
[Food Group], Name, [Caloric Value],
fg_rank = DENSE_RANK() OVER (ORDER BY [Food Group]),
n_rank = DENSE_RANK() OVER (ORDER BY Name),
cv_rank = DENSE_RANK() OVER (ORDER BY [Caloric Value])
FROM atable
)
SELECT
fg.[Food Group],
n.Name,
cv.[Caloric Value]
FROM (
SELECT fg_rank FROM atable_numbered UNION
SELECT n_rank FROM atable_numbered UNION
SELECT cv_rank FROM atable_numbered
) r (rank)
LEFT JOIN (
SELECT DISTINCT [Food Group], fg_rank
FROM atable_numbered) fg ON r.rank = fg.fg_rank
LEFT JOIN (
SELECT DISTINCT Name, n_rank
FROM atable_numbered) n ON r.rank = n.n_rank
LEFT JOIN (
SELECT DISTINCT [Caloric Value], cv_rank
FROM atable_numbered) cv ON r.rank = cv.cv_rank
ORDER BY r.rank

I guess what I would want to know is why you need this in one result set? What does the code look like that would consume this result? The attributes on each row have nothing to do with each other. If you want to, say, build the contents of a set of drop-down boxes, you're better off doing these one at a time. In your requested result set, you'd need to iterate through the dataset three times to do anything useful, and you would need to either check for NULL each time or needlessly iterate all the way to the end of the dataset.
If this is in a stored procedure, couldn't you run three separate SELECT DISTINCT and return the values as three results. Then you can consume them one at a time, which is what you would be doing anyway I would guess.
If there REALLY IS a connection between the values, you could add each of the results to an array or list, then access all three lists in parallel using the index.

Something like this maybe?
select *
from (
select case
when row_number() over (partition by fruit_group) = 1 then fruit_group
else null
end as fruit_group,
case
when row_number() over (partition by name) = 1 then name
else null
end as name,
case
when row_number() over (partition by caloric) = 1 then caloric
else null
end as caloric
from your_table
) t
where fruit_group is not null
or name is not null
or caloric is not null
But I fail to see any sense in this

Related

Use a CASE expression without typing matched conditions manually using PostgreSQL

I have a long and wide list, the following table is just an example. Table structure might look a bit horrible using SQL, but I was wondering whether there's a way to extract IDs' price using CASE expression without typing column names in order to match in the expression
IDs
A_Price
B_Price
C_Price
...
A
23
...
B
65
82
...
C
...
A
10
...
..
...
...
...
...
Table I want to achieve:
IDs
price
A
23;10
B
65
C
82
..
...
I tried:
SELECT IDs, string_agg(CASE IDs WHEN 'A' THEN A_Price
WHEN 'B' THEN B_Price
WHEN 'C' THEN C_Price
end::text, ';') as price
FROM table
GROUP BY IDs
ORDER BY IDs
To avoid typing A, B, A_Price, B_Price etc, I tried to format their names and call them from a subquery, but it seems that SQL cannot recognise them as columns and cannot call the corresponding values.
WITH CTE AS (
SELECT IDs, IDs||'_Price' as t FROM ID_list
)
SELECT IDs, string_agg(CASE IDs WHEN CTE.IDs THEN CTE.t
end::text, ';') as price
FROM table
LEFT JOIN CTE cte.IDs=table.IDs
GROUP BY IDs
ORDER BY IDs
You can use a document type like json or hstore as stepping stone:
Basic query:
SELECT t.ids
, to_json(t.*) ->> (t.ids || '_price') AS price
FROM tbl t;
to_json() converts the whole row to a JSON object, which you can then pick a (dynamically concatenated) key from.
Your aggregation:
SELECT t.ids
, string_agg(to_json(t.*) ->> (t.ids || '_price'), ';') AS prices
FROM tbl t
GROUP BY 1
ORDER BY 1;
Converting the whole (big?) row adds some overhead, but you have to read the whole table for your query anyway.
A union would be one approach here:
SELECT IDs, A_Price FROM yourTable WHERE A_Price IS NOT NULL
UNION ALL
SELECT IDs, B_Price FROM yourTable WHERE B_Price IS NOT NULL
UNION ALL
SELECT IDs, C_Price FROM yourTable WHERE C_Price IS NOT NULL;

convert data from multiple columns into single row sorting descending

I am trying to query the original source which contain totals from a category (in this case Vehicles) into the second table.
Motorcycle
Bicycle
Car
1
3
2
Desired Output:
Vehicle
Quantity
Bicycle
3
Car
2
Motorcycle
1
Additionally, I need that the Quantity is sorted in descending order like showing above.
So far I have tried to do an Unpivot, but there is a syntax error in the Unpivot function. Is there another way to reach out the same results?
My code so far:
SELECT Vehicle_Name
FROM
(
SELECT [Motorcycle], [Bycycle], [Car] from Data
) as Source
UNPIVOT
(
Vehicle FOR Vehicle_Name IN ([Motorcycle], [Bycycle], [Car])
) as Unpvt
Edit: Added sort requirement.
You can use CROSS APPLY here too
select vehicle, amnt
from test
cross apply(
VALUES('motorcycle', motorcycle)
,('bicycle', bicycle)
,('car', car)) x (vehicle, amnt)
order by amnt desc
Fiddle here
Try this
with data1 as
(
Select * from data)
Select * From
(
Select 'motorcycle' as "Vehicle", motorcycle as quantity from data1
union all
Select 'bicycle' , bicycle from data1
union all
Select 'car', car from data1
) order by quantity desc;
Since we don't know what DBMS, here's a way that'd work in the one I use the most.
SELECT *
FROM (SELECT map_from_entries(
ARRAY[('Motorcycle', Motorcycle),
('Bicycle', Bicycle),
('Car', Car)])
FROM Source) AS t1(type_quant)
CROSS JOIN UNNEST(type_quant) AS t2(Vehicle, Quantity)
ORDER BY Quantity DESC
-Trino

SQL unique combinations

I have a table with three columns with an ID, a therapeutic class, and then a generic name. A therapeutic class can be mapped to multiple generic names.
ID therapeutic_class generic_name
1 YG4 insulin
1 CJ6 maleate
1 MG9 glargine
2 C4C diaoxy
2 KR3 supplies
3 YG4 insuilin
3 CJ6 maleate
3 MG9 glargine
I need to first look at the individual combinations of therapeutic class and generic name and then want to count how many patients have the same combination. I want my output to have three columns: one being the combo of generic names, the combo of therapeutic classes and the count of the number of patients with the combination like this:
Count Combination_generic combination_therapeutic
2 insulin, maleate, glargine YG4, CJ6, MG9
1 supplies, diaoxy C4C, KR3
One way to match patients by the sets of pairs (therapeutic_class, generic_name) is to create the comma-separated strings in your desired output, and to group by them and count. To do this right, you need a way to identify the pairs. See my Comment under the original question and my Comments to Gordon's Answer to understand some of the issues.
I do this identification in some preliminary work in the solution below. As I mentioned in my Comment, it would be better if the pairs and unique ID's existed already in your data model; I create them on the fly.
Important note: This assumes the comma-separated lists don't become too long. If you exceed 4000 characters (or approx. 32000 characters in Oracle 12, with certain options turned on), you CAN aggregate the strings into CLOBs, but you CAN'T GROUP BY CLOBs (in general, not just in this case), so this approach will fail. A more robust approach is to match the sets of pairs, not some aggregation of them. The solution is more complicated, I will not cover it unless it is needed in your problem.
with
-- Begin simulated data (not part of the solution)
test_data ( id, therapeutic_class, generic_name ) as (
select 1, 'GY6', 'insulin' from dual union all
select 1, 'MH4', 'maleate' from dual union all
select 1, 'KJ*', 'glargine' from dual union all
select 2, 'GY6', 'supplies' from dual union all
select 2, 'C4C', 'diaoxy' from dual union all
select 3, 'GY6', 'insulin' from dual union all
select 3, 'MH4', 'maleate' from dual union all
select 3, 'KJ*', 'glargine' from dual
),
-- End of simulated data (for testing purposes only).
-- SQL query solution continues BELOW THIS LINE
valid_pairs ( pair_id, therapeutic_class, generic_name ) as (
select rownum, therapeutic_class, generic_name
from (
select distinct therapeutic_class, generic_name
from test_data
)
),
first_agg ( id, tc_list, gn_list ) as (
select t.id,
listagg(p.therapeutic_class, ',') within group (order by p.pair_id),
listagg(p.generic_name , ',') within group (order by p.pair_id)
from test_data t join valid_pairs p
on t.therapeutic_class = p.therapeutic_class
and t.generic_name = p.generic_name
group by t.id
)
select count(*) as cnt, tc_list, gn_list
from first_agg
group by tc_list, gn_list
;
Output:
CNT TC_LIST GN_LIST
--- ------------------ ------------------------------
1 GY6,C4C supplies,diaoxy
2 GY6,KJ*,MH4 insulin,glargine,maleate
You are looking for listagg() and then another aggregation. I think:
select therapeutics, generics, count(*)
from (select id, listagg(therapeutic_class, ', ') within group (order by therapeutic_class) as therapeutics,
listagg(generic_name, ', ') within group (order by generic_name) as generics
from t
group by id
) t
group by therapeutics, generics;

select different Max ID's for different customer

situation:
we have monthly files that get loaded into our data warehouse however instead of being replaced with old loads, these are just compiled on top of each other. the files are loaded in over a period of days.
so when running a SQL script, we would get duplicate records so to counteract this we run a union over 10-20 'customers' and selecting Max(loadID) e.g
SELECT
Customer
column 2
column 3
FROM
MyTable
WHERE
LOADID = (SELECT MAX (LOADID) FROM MyTable WHERE Customer= 'ASDA')
UNION
SELECT
Customer
column 2
column 3
FROM
MyTable
WHERE
LOADID = (SELECT MAX (LOADID) FROM MyTable WHERE Customer= 'TESCO'
The above union would have to be done for multiple customers so i was thinking surely there has to be a more efficient way.
we cant use a MAX (LoadID) in the SELECT statement as a possible scenario could entail the following;
Monday: Asda,Tesco,Waitrose loaded into DW (with LoadID as 124)
Tuesday: Sainsburys loaded in DW (with LoadID as 125)
Wednesday: New Tesco loaded in DW (with LoadID as 126)
so i would want LoadID 124 Asda & Waitrose, 125 Sainsburys, & 126 Tesco
Use window functions:
SELECT t.*
FROM (SELECT t.*, MAX(LOADID) OVER (PARTITION BY Customer) as maxLOADID
FROM MyTable t
) t
WHERE LOADID = maxLOADID;
Would a subquery to a derived table meet your needs?
select yourfields
from yourtables join
(select customer, max(loadID) maxLoadId
from yourtables
group by customer) derivedTable on derivedTable.customer = realTable.customer
and loadId = maxLoadId

MySQL intersection in subquery

I'm trying to create a filter for a list (of apartments), with a many-to-many relationship with apartment features through the apartsments_features table.
I would like to include only apartments that have all of some features (marked 'Yes' on a form) excluding all the ones that have any of another set features (marked 'No'). I realized too late that I couldn't use INTERSECT or MINUS in MySQL.
I have a query that looks something like:
SELECT `apartments`.* FROM `apartments` WHERE `apartments`.`id` IN (
SELECT `apartments`.`id` FROM `apartments` INTERSECT (
SELECT `apartment_id` FROM `apartments_features` WHERE `feature_id` = 103
INTERSECT SELECT `apartment_id` FROM `apartments_features` WHERE `feature_id` = 106
) MINUS (
SELECT `apartment_id` FROM `apartments_features` WHERE `feature_id` = 105 UNION
SELECT `apartment_id` FROM `apartments_features` WHERE `feature_id` = 107)
)
ORDER BY `apartments`.`name` ASC
I'm pretty sure there's a way to do this, but at the moment my knowledge is restricted to little more than simple left and right joins.
A slightly different way of doing it:
select a.*
from apartments a
join apartments_features f1
on a.apartment_id = f1.apartment_id and f1.feature_id in (103,106) -- applicable features
where not exists
(select null from apartments_features f2
where a.apartment_id = f2.apartment_id and f2.feature_id in (105,107) ) -- excluded features
group by f1.apartment_id
having count(*) = 2 -- number of applicable features
You could try something like this:
SELECT apartment_id
FROM
(
SELECT apartment_id
FROM apartments_features
WHERE feature_id IN (103, 106)
GROUP BY apartment_id
HAVING COUNT(*) = 2
) T1
LEFT JOIN
(
SELECT apartment_id
FROM apartments_features
WHERE feature_id IN (105, 107)
) T2
ON T1.apartment_id = T2.apartment_id
WHERE T2.apartment_id IS NULL
Join the result of this query to the apartments table to get the name, etc.