MariaDB - GROUP BY with an order - sql

So I have a dataset, where I would like to order it based on strings ORDER BY FIELD(field_name, ...) after the order I wan't it to group the dataset based on another column.
I have tried with a subquery, but it seems like it ignores by ORDER BY when it gets subqueried.
This is the query I would like to group with GROUP BY setting_id
SELECT *
FROM `setting_values`
WHERE ((`owned_by_type` = 'App\\Models\\Utecca\\User' AND `owned_by_id` = 1 OR ((`owned_by_type` = 'App\\Models\\Utecca\\Agreement' AND `owned_by_id` = 1006))) OR (`owned_by_type` = 'App\\Models\\Utecca\\Employee' AND `owned_by_id` = 1)) AND `setting_values`.`deleted_at` IS NULL
ORDER BY FIELD(owned_by_type, 'App\\Models\\Utecca\\Employee', 'App\\Models\\Utecca\\Agreement', 'App\\Models\\Utecca\\User')
The order by works just fine, but I cannot get it to group it based on my order, it always selects the one with lowest primary key (id).
Here is my attempt which did not work.
SELECT * FROM (
SELECT *
FROM `setting_values`
WHERE ((`owned_by_type` = 'App\\Models\\Utecca\\User' AND `owned_by_id` = 1 OR ((`owned_by_type` = 'App\\Models\\Utecca\\Agreement' AND `owned_by_id` = 1006))) OR (`owned_by_type` = 'App\\Models\\Utecca\\Employee' AND `owned_by_id` = 1)) AND `setting_values`.`deleted_at` IS NULL
ORDER BY FIELD(owned_by_type, 'App\\Models\\Utecca\\Employee', 'App\\Models\\Utecca\\Agreement', 'App\\Models\\Utecca\\User')
) AS t
GROUP BY setting_id;
Here is some sample data
What I am trying to accomplish with this sample data is 1 row with the id 3 as the row.
The desired result set from the query should obey these rules
1 row for each setting_id
owned_by_type together with owned_by_id is filtered the following way agreement = 1006, user = 1, employee = 1.
When limiting the 1 row for each setting_idit should be done with the following priority in owned_by_type column Employee, Agreement, User
Here is a SQLFiddle with it.
Running MariaDB version 10.2.6-MariaDB

First of all, the Optimizer is free to ignore the inner ORDER BY. So, please describe further what your intent is.
Getting past that, you can use a subquery:
SELECT ...
FROM ( SELECT
...
GROUP BY ...
ORDER BY ... -- This is lost unless followed by :
LIMIT 9999999999 -- something valid; or very high (for all)
) AS x
GROUP BY ...
Perhaps you are doing groupwise max ??

Related

Self join taking time

I am having below query which selects SUM of AAD_00TO30 columns depending upon some conditions.
The query executes in 1 sec when I remove below condition, but it takes more than a min when same condition is included.
Can someone please suggest me any alternative to modify the query for better performance.
AND A.AAD_DATE >= (SELECT MAX(B.AAD_DATE)
FROM MST_AR_AS_ON_DATE B
WHERE MONTH(B.AAD_DATE) = MONTH(A.AAD_DATE) AND YEAR(B.AAD_DATE) = YEAR(A.AAD_DATE))
Query:
SELECT '00-30 #66ff66',SUM(A.AAD_00TO30) FROM MST_AR_AS_ON_DATE A
WHERE MONTH(A.AAD_DATE) = MONTH(DATEADD(MM,-1,GETDATE()))
AND YEAR(A.AAD_DATE) = YEAR(DATEADD(MM,-1,GETDATE()))
AND A.AAD_RESP_NOW = 4
AND A.AAD_DATE >= (SELECT MAX(B.AAD_DATE)
FROM MST_AR_AS_ON_DATE B
WHERE MONTH(B.AAD_DATE) = MONTH(A.AAD_DATE) AND YEAR(B.AAD_DATE) = YEAR(A.AAD_DATE))
Try using RANK() to tag rows that meet the criteria of having the last date of the month. Then eliminate rows without a winning rank:
WITH
MST_AR_AS_ON_DATE_RANKED AS (
SELECT
*,
RANK() OVER (
PARTITION BY
YEAR(AAD_DATE),
MONTH(AAD_DATE)
ORDER BY
AAD_DATE DESC -- last day of month ranked highest
) AS AAD_DATE_RANK
FROM
MST_AR_AS_ON_DATE
)
SELECT
'00-30 #66ff66',
SUM(AAD_00TO30)
FROM
MST_AR_AS_ON_DATE_RANKED
WHERE
MONTH(AAD_DATE) = MONTH(DATEADD(MM,-1,GETDATE()))
AND YEAR(AAD_DATE) = YEAR(DATEADD(MM,-1,GETDATE()))
AND AAD_RESP_NOW = 4
AND AAD_DATE_RANK = 1
;

More than one row returned by a subquery used as an expression when UPDATE on multiple rows

I'm trying to update rows in a single table by splitting them into two "sets" of rows.
The top part of the set should have a status set to X and the bottom one should have a status set to status Y.
I've tried putting together a query that looks like this
WITH x_status AS (
SELECT id
FROM people
WHERE surname = 'foo'
ORDER BY date_registered DESC
LIMIT 5
), y_status AS (
SELECT id
FROM people
WHERE surname = 'foo'
ORDER BY date_registered DESC
OFFSET 5
)
UPDATE people
SET status = folks.status
FROM (values
((SELECT id from x_status), 'X'),
((SELECT id from y_status), 'Y')
) as folks (ids, status)
WHERE id IN (folks.ids);
When I run this query I get the following error:
pq: more than one row returned by a subquery used as an expression
This makes sense, folks.ids is expected to return a list of IDs, hence the IN clause in the UPDATE statement, but I suspect the problem is I can not return the list in the values statement in the FROM clause as it turns into something like this:
(1, 2, 3, 4, 5, 5)
(6, 7, 8, 9, 1)
Is there a way how this UPDATE can be done using a CTE query at all? I could split this into two separate UPDATE queries, but CTE query would be better and in theory faster.
I think I understand now... if I get your problem, you want to set the status to 'X' for the oldest five records and 'Y' for everything else?
In that case I think the row_number() analytic would work -- and it should do it in a single pass, two scans, and eliminating one order by. Let me know if something like this does what you seek.
with ranked as (
select
id, row_number() over (order by date_registered desc) as rn
from people
)
update people p
set
status = case when r.rn <= 5 then 'X' else 'Y' end
from ranked r
where
p.id = r.id
Any time you do an update from another data set, it's helpful to have a where clause that defines the relationship between the two datasets (the non-ANSI join syntax). This makes it iron-clad what you are updating.
Also I believe this code is pretty readable so it will be easier to build on if you need to make tweaks.
Let me know if I missed the boat.
So after more tinkering, I've come up with a solution.
The problem with why the previous query fails is we are not grouping the IDs in the subqueries into arrays so the result expands into a huge list as I suspected.
The solution is grouping the IDs in the subqueries into ARRAY -- that way they get returned as a single result (tuple) in ids value.
This is the query that does the job. Note that we must unnest the IDs in the WHERE clause:
WITH x_status AS (
SELECT id
FROM people
WHERE surname = 'foo'
ORDER BY date_registered DESC
LIMIT 5
), y_status AS (
SELECT id
FROM people
WHERE surname = 'foo'
ORDER BY date_registered DESC
OFFSET 5
)
UPDATE people
SET status = folks.status
FROM (values
(ARRAY(SELECT id from x_status), 'X'),
(ARRAY(SELECT id from y_status), 'Y')
) as folks (ids, status)
WHERE id IN (SELECT * from unnest(folks.ids));

Oracle: filter all rows before the ID

I have a big query that brings me a lot of rows, and based on each row I use this another query as a subselect.
This subselect brings me the following result rest on Oracle:
SELECT oc3.ID_ORGAO_INTELIGENCIA,
oc3.ord,
lag(oc3.ID_ORGAO_INTELIGENCIA, 1, NULL) OVER (
ORDER BY oc3.ord) ultimo
FROM
( SELECT DISTINCT oc2.*
FROM
( SELECT oc1.ID_ORGAO_INTELIGENCIA,
oc1.ID_ORGAO_INTELIGENCIA_PAI,
oc1.SG_ORGAO_INTELIGENCIA,
rownum AS ord
FROM TB_ORGAO_INTERNO oc1
WHERE oc1.DH_EXCLUSAO IS NULL START WITH oc1.ID_ORGAO_INTELIGENCIA =
-- this is a value that come from an outer select
-- If I put the value directly, like: S.ID_ORGAO_INTELIGENCIA, it does not work... I dont know why...
(SELECT sa.ID_ORGAO_INTELIGENCIA
FROM TB_SOLICITACAO sa
WHERE sa.ID_SOLICITACAO = 1077)-- s.ID_SOLICITACAO)
CONNECT BY
PRIOR oc1.ID_ORGAO_INTELIGENCIA_PAI = oc1.ID_ORGAO_INTELIGENCIA) oc2
INNER JOIN TB_PERMISSAO pe2_ ON pe2_.ID_ORGAO_INTELIGENCIA = oc2.ID_ORGAO_INTELIGENCIA
INNER JOIN TB_USUARIO u_ ON u_.ID_USUARIO = pe2_.ID_USUARIO
WHERE pe2_.ID_STATUS_PERMISSAO = 7
AND pe2_.ID_ATRIBUICAO IN :atribuicoes
ORDER BY oc2.ord) oc3
The result:
That important value from each row is the S.ID_SOLICITACAO, because based on that value that the subquery will be started.
I need to be able to filter the results by oc3.ID_ORGAO_INTELIGENCIA where it brings me all the rows before that number.
So, If I filter by 430, only the row with 311 will return.
If I filter by 329, it will bring me the: 311 and 430.
Is there a way to achieve this result?
One option might be to use your current query as a CTE, and then filter data it returns. Something like this:
with ycq as
-- your current query
(select ...
from ...
)
select *
from ycq a
where a.ord < (select b.ord
from ycq b
where b.id_orgao_inteligencia = :par_id_orgao_inteligencia
);

SQL Server CTE - SELECT after UPDATE using OUTPUT INSERTED

I've seen a few posts on using CTE (WITH) that I thought would address my issue but I can't seem to make it work for my specific use case. My use case is that I have a table with a series of records, and I need to pull some number of records AFTER a small update has been made to them.
i.e.
- retrieve records where a series of conditions are met
- update one or more columns in each of those records
- return the updated records
I know I can return the IDs of the records using the following:
WITH cte AS
( SELECT TOP 1 * FROM msg
WHERE guid = 'abcd'
AND active = 1
ORDER BY created DESC )
UPDATE cte SET active = 0
OUTPUT INSERTED.msg_id
WHERE guid = 'abcd'
That nicely returns the msg_id field. I tried wrapping all of that in a SELECT * FROM msg WHERE msg_id IN () query, but it fails.
Anyone have a suggestion? For reference, using SQL Server 2008 R2.
CREATE TABLE #t (msg_id int)
;
WITH cte AS
( SELECT TOP 1 * FROM msg
WHERE guid = 'abcd'
AND active = 1
ORDER BY created DESC )
UPDATE cte SET active = 0
OUTPUT INSERTED.msg_id INTO #t
WHERE guid = 'abcd'
SELECT *
FROM #t
You can select the data that you need by just adding all columns that you want. INSERTED contains all columns, not just the ones written to. You can also output columns from the cte alias. Example:
OUTPUT INSERTED.SomeOtherColumn, cte.SomeOtherColumn

Sql Server Query Selecting Top and grouping by

SpousesTable
SpouseID
SpousePreviousAddressesTable
PreviousAddressID, SpouseID, FromDate, AddressTypeID
What I have now is updating the most recent for the whole table and assigning the most recent regardless of SpouseID the AddressTypeID = 1
I want to assign the most recent SpousePreviousAddress.AddressTypeID = 1
for each unique SpouseID in the SpousePreviousAddresses table.
UPDATE spa
SET spa.AddressTypeID = 1
FROM SpousePreviousAddresses AS spa INNER JOIN Spouses ON spa.SpouseID = Spouses.SpouseID,
(SELECT TOP 1 SpousePreviousAddresses.* FROM SpousePreviousAddresses
INNER JOIN Spouses AS s ON SpousePreviousAddresses.SpouseID = s.SpouseID
WHERE SpousePreviousAddresses.CountryID = 181 ORDER BY SpousePreviousAddresses.FromDate DESC) as us
WHERE spa.PreviousAddressID = us.PreviousAddressID
I think I need a group by but my sql isn't all that hot. Thanks.
Update that is Working
I was wrong about having found a solution to this earlier. Below is the solution I am going with
WITH result AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY SpouseID ORDER BY FromDate DESC) AS rowNumber, *
FROM SpousePreviousAddresses
WHERE CountryID = 181
)
UPDATE result
SET AddressTypeID = 1
FROM result WHERE rowNumber = 1
Presuming you are using SQLServer 2005 (based on the error message you got from the previous attempt) probably the most straightforward way to do this would be to use the ROW_NUMBER() Function couple with a Common Table Expression, I think this might do what you are looking for:
WITH result AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY SpouseID ORDER BY FromDate DESC) as rowNumber,
*
FROM
SpousePreviousAddresses
)
UPDATE SpousePreviousAddresses
SET
AddressTypeID = 2
FROM
SpousePreviousAddresses spa
INNER JOIN result r ON spa.SpouseId = r.SpouseId
WHERE r.rowNumber = 1
AND spa.PreviousAddressID = r.PreviousAddressID
AND spa.CountryID = 181
In SQLServer2005 the ROW_NUMBER() function is one of the most powerful around. It is very usefull in lots of situations. The time spent learning about it will be re-paid many times over.
The CTE is used to simplyfy the code abit, as it removes the need for a temporary table of some kind to store the itermediate result.
The resulting query should be fast and efficient. I know the select in the CTE uses *, which is a bit of overkill as we dont need all the columns, but it may help to show what is happening if anyone want to see what is happening inside the query.
Here's one way to do it:
UPDATE spa1
SET spa1.AddressTypeID = 1
FROM SpousePreviousAddresses AS spa1
LEFT OUTER JOIN SpousePreviousAddresses AS spa2
ON (spa1.SpouseID = spa2.SpouseID AND spa1.FromDate < spa2.FromDate)
WHERE spa1.CountryID = 181 AND spa2.SpouseID IS NULL;
In other words, update the row spa1 for which no other row spa2 exists with the same spouse and a greater (more recent) date.
There's exactly one row for each value of SpouseID that has the greatest date compared to all other rows (if any) with the same SpouseID.
There's no need to use a GROUP BY, because there's kind of an implicit grouping done by the join.
update: I think you misunderstand the purpose of the OUTER JOIN. If there is no row spa2 that matches all the join conditions, then all columns of spa2.* are returned as NULL. That's how outer joins work. So you can search for the cases where spa1 has no matching row spa2 by testing that spa2.SpouseID IS NULL.
UPDATE spa SET spa.AddressTypeID = 1
WHERE spa.SpouseID IN (
SELECT DISTINCT s1.SpouseID FROM Spa S1, SpousePreviousAddresses S2
WHERE s1.SpouseID = s2.SpouseID
AND s2.CountryID = 181
AND s1.PreviousAddressId = s2.PreviousAddressId
ORDER BY S2.FromDate DESC)
Just a guess.