PostgreSQL problem with the "Group By" Clause - sql

I have a table that looks something like this when ordered by R_Value DESC.
Note that it is important to understand that the table below is the result of a SQL SELECT ordered by R_Value DESC:
Id  Letter  R_Value
1   A       1500
2   A       1400
4   B       800
9   B       700
10  B       600
11  A       400
12  A       200
I want my result set to look like this. The result set needs to be grouped 3 times: each time the letter changes from A to B or vice versa, a new group should be formed.
Letter  Max_RValue
A       1500
B       800
A       400
I have already experimented with various things - lead/lag functions, partitioning, etc. - and it seems impossible. A simple GROUP BY Letter obviously won't work, because all the 'A' rows would be put in one group, which is not what I want.
My next step would be to try this in a procedural language, dump it into an intermediate table and then read the results. Before doing that, is this even possible in ordinary SQL?

You can use the LAG() window function to check the previous value of Letter for each row, and a running SUM() window function to build the groups of consecutive occurrences.
Then aggregate within each group:
SELECT Letter, MAX(R_Value) Max_RValue
FROM (
  SELECT *, SUM(flag::int) OVER (ORDER BY R_Value DESC) grp
  FROM (
    SELECT *, Letter <> LAG(Letter, 1, '') OVER (ORDER BY R_Value DESC) flag
    FROM tablename
  ) t
) t
GROUP BY grp, Letter;
Or, simpler, just select the rows where the Letter changes:
SELECT Letter, R_Value
FROM (
  SELECT *, LAG(Letter, 1, '') OVER (ORDER BY R_Value DESC) prev_Letter
  FROM tablename
) t
WHERE Letter <> prev_Letter;
See the demo.

Forpas's second solution is okay, but it can be written more simply in Postgres as:
select letter, r_value
from (select t.*, lag(letter) over (order by r_value desc) as prev_letter
      from tablename t
     ) t
where prev_letter is distinct from letter;
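If you also need the per-group MAX (the result set asked for) rather than just the boundary rows, the same is distinct from trick carries over to the running-count approach. A minimal sketch only, assuming the tablename table and columns from the question, and PostgreSQL 9.4+ for the FILTER clause:
select letter, max(r_value) as max_rvalue
from (
  select t.*,
         -- running count of letter changes = group number
         count(*) filter (where is_change) over (order by r_value desc) as grp
  from (
    select t.*,
           lag(letter) over (order by r_value desc) is distinct from letter as is_change
    from tablename t
  ) t
) t
group by grp, letter
order by max(r_value) desc;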

Related

sql - Data with more than one record

I have the following temporary table.
The aim is to flag the data that has more than one record and put 'More than one records' in nom_etp.
In my example below, if a Siren appears more than once, I would have:
Siren ETS_RS Voie Ville nom_etp
348177155 POITOU-CHARENTES ENGRAIS P.C.E. (SNC) BOULEVARD WLADIMIR MORCH 17000 LA ROCHELLE More than one records
For records that appear only once, I will have the single name of the company (here nom_etp):
Siren ETS_RS Voie Ville nom_etp
344843347 PRESTIGE AUTO ROCHELAIS (SAS) 4 RUE JEAN DEMEOCQ 17000 LA ROCHELLE NIGER
I tried a few things based on the idea that if I can get a count greater than one, I could flag those rows easily and use that with a CASE.
First, I tried to do a count:
WITH cte_ssrep_moraux AS (...)
SELECT SIREN,ETS_RS,Voie,Ville
,Denomination AS nom_etp,COUNT(SIREN)
FROM cte_ssrep_moraux
GROUP BY ETS_RS,Voie,Ville,Denomination,SIREN
It hits a snag: all the counts were equal to one and I get the same dataset as in the picture...
Second:
WITH cte_ssrep_moraux AS (...)
SELECT ETS_RS,Voie,Ville
,Denomination AS nom_etp,SIREN,
RANK() OVER (PARTITION BY ETS_RS ORDER BY ETS_RS ASC) AS xx
FROM cte_ssrep_moraux
GROUP BY ETS_RS,Voie,Ville,Denomination,SIREN
Again it hits a snag: all the ranks were equal to one and I still get the same dataset as in the picture...
I'm a bit confused about what to do next. I have the feeling this will be an easy one and I'll facepalm myself.
Many thanks for reading my question.
If this is your criterion:
if Siren appears more than once,
then the GROUP BY clause should only contain Siren:
SELECT SIREN, COUNT(*)
FROM cte_ssrep_moraux
GROUP BY SIREN
HAVING COUNT(*) > 1;
I'm not sure what you want to do after that, but this will return the SIREN values that appear more than once.
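If the next step is to flag the original rows, one way is to join that aggregate back to the data. This is only a sketch, assuming the column names shown in the question (Siren, ETS_RS, Voie, Ville, Denomination):
SELECT m.Siren, m.ETS_RS, m.Voie, m.Ville,
       CASE WHEN d.Siren IS NOT NULL THEN 'More than one records'
            ELSE m.Denomination
       END AS nom_etp
FROM cte_ssrep_moraux m
LEFT JOIN (
    SELECT Siren
    FROM cte_ssrep_moraux
    GROUP BY Siren
    HAVING COUNT(*) > 1
) d ON d.Siren = m.Siren;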
If there is more than one row and you change every nom_etp to 'More than one records', you end up with identical rows. That's why I prepared a slightly tweaked query. See the following (table simplified for clarity):
CREATE TABLE Duplicates
(
    Id int,
    Name varchar(20),
    Item varchar(20)
);

INSERT Duplicates VALUES
(1, 'Name1', 'Item1'),
(2, 'Name2', 'Item2'),
(2, 'Name2', 'Item3'),
(3, 'Name3', 'Item4'),
(3, 'Name3', 'Item5'),
(3, 'Name3', 'Item6'),
(4, 'Name4', 'Item7');
If you need just a query:
WITH Numbered AS
(
    SELECT Id, Name, Item,
           ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Id) RowNum,
           COUNT(*) OVER (PARTITION BY Id) TotalInGroup
    FROM Duplicates
)
SELECT Id, Name,
       CASE WHEN RowNum = 1 AND TotalInGroup > 1 THEN 'More records' ELSE Item END Item
FROM Numbered
If you need to normalize:
WITH Numbered AS
(
    SELECT Id, Name, Item,
           ROW_NUMBER() OVER (ORDER BY Id) Number,
           ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Id) RowNum,
           COUNT(*) OVER (PARTITION BY Id) TotalInGroup
    FROM Duplicates
)
MERGE Numbered AS tgt
USING Numbered AS src
    ON src.Number = tgt.Number
WHEN MATCHED AND tgt.RowNum = 1 AND tgt.TotalInGroup > 1 THEN
    UPDATE SET tgt.Item = 'More'
WHEN MATCHED AND tgt.RowNum > 1 THEN
    DELETE;
Table will look like below:
Id Name Item
-- ---- ----
1 Name1 Item1
2 Name2 More
3 Name3 More
4 Name4 Item7
If there are multiple rows with the same Id, the first of them is updated with the 'More' constant and all the others in the group are deleted.
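If MERGE feels heavy, the same normalization can be done with two statements that update and delete through the CTE directly. A sketch only, assuming the same Duplicates table as above:
;WITH Numbered AS
(
    SELECT Id, Name, Item,
           ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Id) RowNum,
           COUNT(*) OVER (PARTITION BY Id) TotalInGroup
    FROM Duplicates
)
UPDATE Numbered SET Item = 'More' WHERE RowNum = 1 AND TotalInGroup > 1;

;WITH Numbered AS
(
    SELECT Id,
           ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Id) RowNum
    FROM Duplicates
)
DELETE FROM Numbered WHERE RowNum > 1;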
Use a CTE for this purpose:
;WITH CTE AS (
    SELECT ETS_RS, Voie, Ville, Denomination AS nom_etp, SIREN,
           ROW_NUMBER() OVER (PARTITION BY ETS_RS ORDER BY ETS_RS ASC) AS RN
    FROM cte_ssrep_moraux
    --GROUP BY ETS_RS,Voie,Ville,Denomination,SIREN
)
SELECT ETS_RS,
       Voie, Ville,
       CASE WHEN RN > 1 THEN 'More than one records'
            ELSE nom_etp
       END AS nom_etp,
       SIREN
FROM CTE
;with cte
as
(
    select siren, count(*) as cnt
    from yourtable
    group by siren
    having count(*) > 1
)
update t
set nom_etp = 'more than one records'
from yourtable t
where exists (select 1 from cte c where c.siren = t.siren)
Since you still want all the records, including the unique ones, you can use COUNT as a window function, with a CASE to choose what to display as nom_etp:
select Siren, ETS_RS, Voie, Ville,
       (case when count(*) over (partition by Siren) > 1
             then 'More than one records'
             else nom_etp
        end) as nom_etp
from cte_ssrep_moraux;
Here is what I did:
WITH cte_ssrep_moraux AS (
    SELECT SIREN, ETS_RS, Voie, Ville,
           Denomination AS nom_etp,
           ROW_NUMBER() OVER (PARTITION BY ETS_RS ORDER BY ETS_RS ASC) AS Counting
    FROM (my_initial_cte) AS tb
)
SELECT Siren, ETS_RS, Voie, Ville, nom_etp
FROM cte_ssrep_moraux
WHERE counting = 1
  AND Siren NOT IN (SELECT Siren FROM cte_ssrep_moraux WHERE counting > 1)
UNION ALL
SELECT DISTINCT Siren, ETS_RS, Voie, Ville, 'More than one records'
FROM cte_ssrep_moraux
WHERE counting > 1
Explanation: after the initial CTE, I tried many of the solutions mentioned above, especially the ones using a CASE.
The issue with the CASE was that it would produce something like this:
Siren ETS_RS Voie Ville nom_etp
xxxx xyxy xyzet Bordeaux More than one records
xxxx xyxy xyzet Bordeaux More than one records
xxxx xyxy xyzet Bordeaux More than one records
xxxy zzzy ssare Paris Firm ABC
So instead of putting everything under a CASE, I split it into 2 parts:
The first part takes everything with a Counting equal to 1.
The second part takes the rest, where Counting goes above 1, with a DISTINCT.
The two results are joined with a UNION ALL, as the two sets have the same number of columns.
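For reference, the duplicate rows can also be collapsed in a single pass by combining the windowed COUNT from the earlier answer with DISTINCT. A sketch only, assuming cte_ssrep_moraux exposes the columns used above (Siren, ETS_RS, Voie, Ville, nom_etp):
SELECT DISTINCT Siren, ETS_RS, Voie, Ville,
       CASE WHEN COUNT(*) OVER (PARTITION BY Siren) > 1
            THEN 'More than one records'
            ELSE nom_etp
       END AS nom_etp
FROM cte_ssrep_moraux;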

How to use ROWNUM for a maximum and another minimum ordering in ORACLE?

Currently I am trying to output the top row for 2 conditions: one is max and one is min.
Current code:
Select *
from (MY SELECT STATEMENT order by A desc)
where ROWNUM <= 1
UPDATE
I am now able to do it for both conditions. But I need A to be the highest and, if A is the same, then check for the lowest B.
E.g. let's say there are 2 rows; A is 100 in both, and B is 50 for one and 60 for the other.
In this case the 100:50 row should be chosen, because A is the same and that B is the lowest.
E.g. let's say there are 2 rows; A is 100 for one and 90 for the other. Since one is higher, there is no need to check B.
I tried using MAX and MIN, but this method seems to work better. Any suggestions?
Well, after your clarification, you are looking for one record: the one with the maximum A, and the smallest B in case there is more than one record with the maximum A. This is simply:
Select *
from (MY SELECT STATEMENT order by A desc, B)
where ROWNUM = 1;
This sorts by A descending first, so you get all maximal A records first. Then it sorts by B, so inside each A group you get the least B first. This gives you the desired A record first, no matter if the found A is unique or not.
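On Oracle 12c or later you can express the same thing without the ROWNUM wrapper by using the row-limiting clause. A sketch only, still built around the placeholder query from the question:
SELECT *
FROM (MY SELECT STATEMENT)
ORDER BY A DESC, B
FETCH FIRST 1 ROW ONLY;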
Or avoid the vagaries of ROWNUM and go for ROW_NUMBER() instead:
SELECT *
FROM (
    SELECT
        sq.*
        , ROW_NUMBER() OVER (ORDER BY A DESC) adesc
        , ROW_NUMBER() OVER (ORDER BY B ASC) basc
    FROM SomeQuery sq
)
WHERE adesc = 1
   OR basc = 1
Footnote: SELECT * is a convenience only; please replace it with the actual columns required, along with table names etc.
Try this and see if it works:
Select *
from (MY SELECT STATEMENT order by A desc)
where ROWNUM <= 1
union
Select *
from (MY SELECT STATEMENT order by A asc)
where ROWNUM <= 1
SELECT * FROM
(
  SELECT foo.*, 0 AS union_order
  FROM (MY SELECT STATEMENT ORDER BY A DESC) foo
  WHERE ROWNUM <= 1
  UNION
  SELECT foo.*, 1
  FROM (MY SELECT STATEMENT ORDER BY B ASC) foo
  WHERE ROWNUM <= 1
)
ORDER BY union_order

SQL - Find rows starting with the same characters

I have an Oracle database with a table containing data that may start with the same prefix, and I would like to find the rows whose 5-digit prefix is duplicated somewhere in the table.
For Example:
Table1
---------------
12345-brsd
12345-wbgb
12345-ydad
34573-diwe
75234-daie
72456-woei
72456-wdgq
I want to return only the rows where the first 5 digits are duplicated, so out of this sample:
12345-brsd
12345-wbgb
12345-ydad
72456-woei
72456-wdgq
You can do this using analytic functions:
select t.*
from (select t.*, count(*) over (partition by substr(column1, 1, 5)) as cnt
      from table1 t
     ) t
where cnt > 1
order by column1;
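If you prefer to avoid analytic functions, an equivalent is to collect the duplicated prefixes with GROUP BY/HAVING and filter on them. A sketch only, assuming the table is table1 and the value column is column1:
select t.*
from table1 t
where substr(t.column1, 1, 5) in (
      select substr(column1, 1, 5)
      from table1
      group by substr(column1, 1, 5)
      having count(*) > 1
)
order by t.column1;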

SQL Query to obtain the maximum value for each unique value in another column

ID Sum Name
a 10 Joe
a 8 Mary
b 21 Kate
b 110 Casey
b 67 Pierce
What would you recommend as the best way to obtain, for each ID, the name that corresponds to the largest Sum (grouping by ID)?
What I tried so far:
select ID, SUM(Sum) s, Name
from Table1
group by ID, Name
Order by SUM(Sum) DESC;
This will arrange the records into groups with the highest sums first. Then I have to somehow flag those records and keep only those. Any tips or pointers? Thanks a lot.
In the end I'd like to obtain:
a 10 Joe
b 110 Casey
You want the row_number() function:
select id, [sum], name
from (select t.*,
             row_number() over (partition by id order by [sum] desc) as seqnum
      from table1 t
     ) t
where seqnum = 1;
Your question is more confusing than it needs to be because you have a column called sum. You should avoid using SQL reserved words for identifiers.
The row_number() function assigns a sequential number to a group of rows, starting with 1. The group is defined by the partition by clause. In this case, all rows with the same id are in the same group. The ordering of the numbers is determined by the order by clause, so the one with the largest value of sum gets the value of 1.
If you might have duplicate maximum values and you want all of them, use the related function rank() or dense_rank().
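For example, a sketch of the ties-preserving variant, keeping the same bracketed identifiers as above:
select id, [sum], name
from (select t.*,
             rank() over (partition by id order by [sum] desc) as rnk
      from table1 t
     ) t
where rnk = 1;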
select *
from
(
    select *,
           rn = row_number() over (partition by Id order by [sum] desc)
    from table1
) x
where x.rn = 1
demo

SQL select segment

I'm using SQL Server 2008.
I have a table with x rows. I would like to always divide x by 5 and select the 3rd group of records.
Let's say there are 100 records in the table:
100 / 5 = 20
the 3rd segment will then be records 41 to 60.
How can I calculate and select only this 3rd segment in SQL?
Thanks.
You can use NTILE.
Distributes the rows in an ordered partition into a specified number of groups.
Example:
SELECT col1, col2, ..., coln
FROM
(
    SELECT
        col1, col2, ..., coln,
        NTILE(5) OVER (ORDER BY id) AS groupno
    FROM yourtable
) AS t
WHERE groupno = 3
That's a perfect use for the NTILE ranking function.
Basically, you define your query inside a CTE and add an NTILE to your rows - a number going from 1 to n (the argument to NTILE). You order your rows by some column, and then you get the n groups of rows you're looking for, and you can operate on any one of those "groups" of data.
So try something like this:
;WITH SegmentedData AS
(
SELECT
(list of your columns),
GroupNo = NTILE(5) OVER (ORDER BY SomeColumnOfYours)
FROM dbo.YourTable
)
SELECT *
FROM SegmentedData
WHERE GroupNo = 3
Of course, you can also use an UPDATE statement after the CTE to update those rows.
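For instance, a sketch of that UPDATE-through-the-CTE pattern; SomeFlagColumn is a hypothetical column standing in for whatever you actually want to change:
;WITH SegmentedData AS
(
    SELECT
        SomeColumnOfYours,
        SomeFlagColumn,    -- hypothetical column to update
        GroupNo = NTILE(5) OVER (ORDER BY SomeColumnOfYours)
    FROM dbo.YourTable
)
UPDATE SegmentedData
SET SomeFlagColumn = 1     -- only the rows in the 3rd segment are touched
WHERE GroupNo = 3;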