I'm trying to write an SQL query (SQL Server) that returns the latest value of a field from a history table.
The table structure is basically as below:
ISSUE TABLE:
issueid
10
20
30
CHANGEGROUP TABLE:
changegroupid | issueid | updated |
1 | 10 | 01/01/2020 |
2 | 10 | 02/01/2020 |
3 | 10 | 03/01/2020 |
4 | 20 | 05/01/2020 |
5 | 20 | 06/01/2020 |
6 | 20 | 07/01/2020 |
7 | 30 | 04/01/2020 |
8 | 30 | 05/01/2020 |
9 | 30 | 06/01/2020 |
CHANGEITEM TABLE:
changegroupid | field | newvalue |
1 | ONE | 1 |
1 | TWO | A |
1 | THREE | Z |
2 | ONE | J |
2 | ONE | K |
2 | ONE | L |
3 | THREE | K |
3 | ONE | 2 |
3 | ONE | 1 | <--
4 | ONE | 1A |
5 | ONE | 1B |
6 | ONE | 1C | <--
7 | ONE | 1D |
8 | ONE | 1E |
9 | ONE | 1F | <--
EXPECTED RESULT:
issueid | updated | newvalue
10 | 03/01/2020 | 1
20 | 07/01/2020 | 1C
30 | 06/01/2020 | 1F
So each change to an issue item creates 1 change group record with the date the change was made, which can then contain 1 or more change item records.
Each change item shows the field name that was changed and the new value.
I then need to link those tables together to get each issue, the latest value of the field name called 'ONE', and ideally the date of the latest change.
These tables are from Jira, for those familiar with that table structure.
I've been trying to get this to work for a while now. So far I've got this query:
SELECT IssueId, MIN(Created) AS updated FROM
(
SELECT ISSUE.IssueId, UpdGrp.Created AS Created, UpdItm.NEWVALUE
FROM ISSUE
JOIN ChangeGroup UpdGrp ON (UpdGrp.IssueID = ISSUE.IssueId)
JOIN CHANGEITEM UpdItm ON (UpdGrp.ID = UpdItm.groupid)
WHERE UPPER(UpdItm.FIELD) = UPPER('ONE')
) AS dummy
GROUP BY IssueId
ORDER BY IssueId
This returns the first 2 columns I'm looking for, but I'm struggling to work out how to return the final column. When I include it in the first line, I get an error saying "Column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."
I've done a search on here and can't find anything that exactly matches my requirements.
Use window functions:
SELECT i.*
FROM (SELECT i.IssueId, cg.Created AS Created, ci.NEWVALUE,
             ROW_NUMBER() OVER (PARTITION BY i.IssueId ORDER BY cg.Created DESC) AS seqnum
      FROM ISSUE i JOIN
           ChangeGroup cg
           ON cg.IssueID = i.IssueId JOIN
           CHANGEITEM ci
           ON cg.ID = ci.groupid
      WHERE UPPER(ci.FIELD) = UPPER('ONE')
     ) i
WHERE seqnum = 1
ORDER BY IssueId;
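To see the ROW_NUMBER() approach run end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or later). The lowercase table and column names, the ISO-formatted dates, and the `id` column on changeitem (used only to break the tie between the two 'ONE' rows in change group 3) are assumptions for illustration, not the real Jira schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE changegroup (id INTEGER PRIMARY KEY, issueid INTEGER, created TEXT);
-- id is an assumed surrogate key; it records insertion order within a change group
CREATE TABLE changeitem (id INTEGER PRIMARY KEY, groupid INTEGER, field TEXT, newvalue TEXT);
INSERT INTO changegroup VALUES
  (1, 10, '2020-01-01'), (2, 10, '2020-01-02'), (3, 10, '2020-01-03'),
  (4, 20, '2020-01-05'), (5, 20, '2020-01-06'), (6, 20, '2020-01-07'),
  (7, 30, '2020-01-04'), (8, 30, '2020-01-05'), (9, 30, '2020-01-06');
INSERT INTO changeitem (groupid, field, newvalue) VALUES
  (1, 'ONE', '1'), (1, 'TWO', 'A'), (1, 'THREE', 'Z'),
  (2, 'ONE', 'J'), (2, 'ONE', 'K'), (2, 'ONE', 'L'),
  (3, 'THREE', 'K'), (3, 'ONE', '2'), (3, 'ONE', '1'),
  (4, 'ONE', '1A'), (5, 'ONE', '1B'), (6, 'ONE', '1C'),
  (7, 'ONE', '1D'), (8, 'ONE', '1E'), (9, 'ONE', '1F');
""")

# Number the 'ONE' changes of each issue from newest to oldest, then keep row 1.
rows = conn.execute("""
SELECT issueid, created, newvalue
FROM (SELECT cg.issueid, cg.created, ci.newvalue,
             ROW_NUMBER() OVER (PARTITION BY cg.issueid
                                ORDER BY cg.created DESC, ci.id DESC) AS seqnum
      FROM changegroup cg
      JOIN changeitem ci ON ci.groupid = cg.id
      WHERE UPPER(ci.field) = 'ONE') t
WHERE seqnum = 1
ORDER BY issueid
""").fetchall()

for row in rows:
    print(row)
```

Without a tiebreaker in the ORDER BY, the row picked for issue 10 would be arbitrary, because change group 3 contains two 'ONE' items with the same timestamp.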
So I want to select all rows where a subset of rows in another table match the given values.
I have the following tables:
Main Profile:
+----+--------+---------------+---------+
| id | name | subprofile_id | version |
+----+--------+---------------+---------+
| 1 | Main 1 | 4 | 1 |
| 2 | Main 1 | 5 | 2 |
| 3 | Main 2 | ... | 1 |
+----+--------+---------------+---------+
Sub Profile:
+---------------+----------+
| subprofile_id | block_id |
+---------------+----------+
| 4 | 6 |
| 4 | 7 |
| 5 | 8 |
| 5 | 9 |
+---------------+----------+
Block:
+----------+-------------+
| block_id | property_id |
+----------+-------------+
| 7 | 10 |
| 7 | 11 |
| 7 | 12 |
| 7 | 13 |
| 8 | 14 |
| 8 | 15 |
| 8 | 16 |
| 8 | 17 |
| ... | ... |
+----------+-------------+
Property:
+----+--------------------+--------------------------+
| id | name | value |
+----+--------------------+--------------------------+
| 10 | Description | XY |
| 11 | Responsible person | Mr. Smith |
| 12 | ... | ... |
| 13 | ... | ... |
| 14 | Description | XY |
| 15 | Responsible person | Mrs. Brown |
| 16 | ... | ... |
| 17 | ... | ... |
+----+--------------------+--------------------------+
The user can define multiple conditions on the property table. For example:
Description = 'XY'
Responsible person = 'Mr. Smith'
I need all 'Main Profiles' with the highest version that have ALL of the matching properties. They may of course have additional properties that do not match.
It should be doable in JPA, because I will translate it into QueryDSL to build typesafe, dynamic queries from the user's input.
I already searched through all the questions about similar problems but couldn't apply the answers to my problem.
I've also already written a query that worked quite well but retrieved all rows with at least one matching condition. In addition, I need all properties in my result set, but it only fetched the matching ones (with a fetch join, which is missing from my code example).
from MainProfile as mainProfile
left join mainProfile.subProfile as subProfile
left join subProfile.blocks as block
left join block.properties as property
where mainProfile.version = (select max(mainProfile2.version) from MainProfile as mainProfile2 where mainProfile2.name = mainProfile.name)
and ((property.name = 'Description' and property.value = 'XY') or (property.name = 'Responsible person' and property.value = 'Mr. Smith'))
Running my query, I got two rows:
Main 1 with version 2
Main 2 with version 1
I would have expected to get only one row, due to the mismatch of 'Responsible person' in 'Main 2'.
EDIT 1:
So I found a solution which works but could be improved:
select distinct mainProfile
from MainProfile as mainProfile
left join mainProfile.subProfile as subProfile
left join subProfile.blocks as block
left join block.properties as property
where mainProfile.version = (select max(mainProfile2.version) from MainProfile mainProfile2 where mainProfile2.name = mainProfile.name)
and ((property.name = 'Description' and property.content = 'XY') or (property.name = 'Responsible person' and property.content = 'Mr. Smith'))
group by mainProfile.id
having count (distinct property) = 2
It actually retrieves the right 'Main Profiles'. The problem is that only the two matched properties are fetched, but I need all properties for further processing.
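One possible way to keep the "all conditions must match" check while still getting every property back is to move the GROUP BY / HAVING COUNT test into a subquery that only selects profile ids, and join the properties in the outer query. The sketch below shows the idea in plain SQL through Python's sqlite3 module. It assumes a flattened, hypothetical schema (`profile` and `property` tables with the sub-profile and block levels left out) and made-up sample rows, including the 'Location' property, so it is an illustration of the pattern rather than a drop-in JPQL or QueryDSL query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical flattened schema: properties hang directly off a profile.
CREATE TABLE profile (id INTEGER PRIMARY KEY, name TEXT, version INTEGER);
CREATE TABLE property (profile_id INTEGER, name TEXT, value TEXT);
INSERT INTO profile VALUES (1, 'Main 1', 1), (2, 'Main 1', 2), (3, 'Main 2', 1);
INSERT INTO property VALUES
  (1, 'Description', 'XY'), (1, 'Responsible person', 'Mr. Smith'),
  (2, 'Description', 'XY'), (2, 'Responsible person', 'Mrs. Brown'),
  (3, 'Description', 'XY'), (3, 'Responsible person', 'Mr. Smith'),
  (3, 'Location', 'Hall 2');
""")

rows = conn.execute("""
SELECT p.id, pr.name, pr.value
FROM profile p
JOIN property pr ON pr.profile_id = p.id        -- outer join sees ALL properties
WHERE p.version = (SELECT MAX(p2.version) FROM profile p2 WHERE p2.name = p.name)
  AND p.id IN (SELECT m.profile_id              -- subquery does the matching
               FROM property m
               WHERE (m.name = 'Description' AND m.value = 'XY')
                  OR (m.name = 'Responsible person' AND m.value = 'Mr. Smith')
               GROUP BY m.profile_id
               HAVING COUNT(DISTINCT m.name) = 2)  -- 2 = number of conditions
ORDER BY p.id, pr.name
""").fetchall()

for row in rows:
    print(row)
```

Because the filter on property names lives only inside the subquery, the outer join is not restricted to the matched rows, so the non-matching 'Location' property of the selected profile is returned as well.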
I am trying my hardest to get a list of the most recent rows by date in a DB2 file. The file has no unique id, so I am trying to get the entries by matching a set of columns. I need DESCGA most importantly as that changes often. When it does they keep another row for historical reasons.
SELECT B.COGA, B.COMSUBGA, B.ACCTGA, B.PRFXGA, B.DESCGA
FROM mylib.myfile B
WHERE
(
SELECT COUNT(*)
FROM
(
SELECT A.COGA,A.COMSUBGA,A.ACCTGA,A.PRFXGA,MAX(A.DATEGA) AS EDATE
FROM mylib.myfile A
GROUP BY A.COGA, A.COMSUBGA, A.ACCTGA, A.PRFXGA
) T
WHERE
(B.ACCTGA = T.ACCTGA AND
B.COGA = T.COGA AND
B.COMSUBGA = T.COMSUBGA AND
B.PRFXGA = T.PRFXGA AND
B.DATEGA = T.EDATE)
) > 1
This is what I am trying and so far I get 0 results.
If I remove
B.ACCTGA = T.ACCTGA AND
It will return results (of course wrong).
I am using ODBC in VS 2013 to structure this query.
I have a table with the following
| a | b | descri | date |
-----------------------------
| 1 | 0 | string | 20140102 |
| 2 | 1 | string | 20140103 |
| 1 | 1 | string | 20140101 |
| 1 | 1 | string | 20150101 |
| 1 | 0 | string | 20150102 |
| 2 | 1 | string | 20150103 |
| 1 | 1 | string | 20150103 |
and I need
| 1 | 0 | string | 20150102 |
| 2 | 1 | string | 20150103 |
| 1 | 1 | string | 20150103 |
You can use row_number():
select t.*
from (select t.*,
row_number() over (partition by a, b order by date desc) as seqnum
from mylib.myfile t
) t
where seqnum = 1;
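As a quick check of the row_number() query against the sample rows from the question, here is a small sketch that runs it in SQLite through Python's sqlite3 module. SQLite is only a stand-in for DB2 here, and the table name `myfile` without a library qualifier is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE myfile (a INTEGER, b INTEGER, descri TEXT, date INTEGER);
INSERT INTO myfile VALUES
  (1, 0, 'string', 20140102), (2, 1, 'string', 20140103),
  (1, 1, 'string', 20140101), (1, 1, 'string', 20150101),
  (1, 0, 'string', 20150102), (2, 1, 'string', 20150103),
  (1, 1, 'string', 20150103);
""")

# Within each (a, b) group, the newest date gets seqnum 1.
rows = conn.execute("""
SELECT a, b, descri, date
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY a, b ORDER BY date DESC) AS seqnum
      FROM myfile t) t
WHERE seqnum = 1
ORDER BY a, b
""").fetchall()

for row in rows:
    print(row)
```

The three rows that come back are the ones listed as the desired result in the question, one per distinct (a, b) pair.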
Assume I have this schema (tested on postgresql) where the 'Scorelines' relation contains results of sport matches. (kickoff is a TIMESTAMP but replaced by INT for readability)
SQLFiddle here: http://sqlfiddle.com/#!12/52475/3
CREATE TABLE Scorelines (
team TEXT,
kickoff INT,
scored INT,
conceded INT
);
Now I want to produce another column 'three_matches_scored' that contains the sum of the points scored over the 3 preceding games (determined by kickoff) of the same team. I have this:
SELECT team, kickoff, scored, conceded, SUM(scored) OVER three_matches AS three_matches_scored
FROM Scorelines
WINDOW three_matches AS
(PARTITION BY team ORDER BY kickoff
ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING)
ORDER BY kickoff;
This works beautifully so far, except that I get values starting from the second game. Example:
| TEAM | KICKOFF | SCORED | CONCEDED | THREE_MATCHES_SCORED |
|------|---------|--------|----------|----------------------|
| A | 1 | 1 | 0 | (null) |
| B | 2 | 1 | 1 | (null) |
| A | 3 | 1 | 1 | 1 |
| A | 4 | 3 | 0 | 2 |
| B | 4 | 1 | 4 | 1 |
| A | 6 | 0 | 2 | 5 |
| B | 6 | 4 | 2 | 2 |
| B | 8 | 1 | 2 | 6 |
| B | 10 | 1 | 1 | 6 |
| A | 11 | 2 | 1 | 4 |
I want the column 'three_matches_scored' to be (null) for the first 3 games because there are not yet 3 results to sum up. How can I achieve this?
I'd prefer a simple, understandable solution; performance is not critical for this particular case.
My only idea right now is to define a stored function SUM3 that returns (null) when there are fewer than 3 values to add up. But I have never defined a function in SQL and can't seem to figure it out.
You can use a CASE expression to return NULL for rows that have fewer than 3 preceding games:
SELECT team, kickoff, scored, conceded,
CASE WHEN COUNT(scored) OVER three_matches = 3
THEN SUM(scored) OVER three_matches
ELSE NULL
END AS three_matches_scored
FROM Scorelines
WINDOW three_matches AS
(PARTITION BY team ORDER BY kickoff
ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING)
ORDER BY kickoff;
Output:
team | kickoff | scored | conceded | three_matches_scored
------+---------+--------+----------+----------------------
A | 1 | 1 | 0 |
B | 2 | 1 | 1 |
A | 3 | 1 | 1 |
A | 4 | 3 | 0 |
B | 4 | 1 | 4 |
A | 6 | 0 | 2 | 5
B | 6 | 4 | 2 |
B | 8 | 1 | 2 | 6
B | 10 | 1 | 1 | 6
A | 11 | 2 | 1 | 4
(10 rows)
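The same CASE-over-window pattern can be tried without a PostgreSQL server. The sketch below runs it in SQLite (3.25 or later) through Python's sqlite3 module, using the rows from the question. The extra `team` sort key in the final ORDER BY is an addition of mine, only there to make the order of rows with equal kickoff deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Scorelines (team TEXT, kickoff INT, scored INT, conceded INT);
INSERT INTO Scorelines VALUES
  ('A', 1, 1, 0), ('B', 2, 1, 1), ('A', 3, 1, 1), ('A', 4, 3, 0),
  ('B', 4, 1, 4), ('A', 6, 0, 2), ('B', 6, 4, 2), ('B', 8, 1, 2),
  ('B', 10, 1, 1), ('A', 11, 2, 1);
""")

# COUNT over the same frame tells us whether all 3 preceding games exist.
rows = conn.execute("""
SELECT team, kickoff,
       CASE WHEN COUNT(scored) OVER three_matches = 3
            THEN SUM(scored) OVER three_matches
       END AS three_matches_scored
FROM Scorelines
WINDOW three_matches AS
  (PARTITION BY team ORDER BY kickoff
   ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING)
ORDER BY kickoff, team
""").fetchall()

for row in rows:
    print(row)
```

Note that COUNT(scored) skips NULLs, so a preceding game with a NULL score would also make the result NULL for that row.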
See harmic's answer above.
(my first solution, just for reference)
Solution with user defined aggregate:
CREATE TYPE intermediate_sum AS (
sum INT,
count INT
);
CREATE FUNCTION sum_sfunc(intermediate_sum, INTEGER) RETURNS intermediate_sum AS
$$ SELECT $2 + $1.sum AS sum, $1.count - 1 AS count $$ LANGUAGE SQL;
-- count reaches 0 only after exactly 3 values were added; anything else yields NULL
CREATE FUNCTION sum_ffunc(intermediate_sum) RETURNS INTEGER AS
$$ SELECT CASE WHEN $1.count = 0 THEN $1.sum
ELSE NULL
END
$$ LANGUAGE SQL;
CREATE AGGREGATE sum3(INTEGER) (
sfunc = sum_sfunc,
finalfunc = sum_ffunc,
stype = intermediate_sum,
initcond = '(0,3)'
);
The aggregate SUM3 needs exactly 3 values, otherwise it returns (null); with the window frame above it never receives more than 3. One can define other aggregates like SUM4 by changing the initcond, for example to '(0,4)'.
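To make the state handling of the aggregate easier to follow, here is a rough model of it in plain Python. The function names mirror the SQL ones, but this is only a sketch of the logic (a fold over a `(sum, count)` state starting at `(0, 3)`), not something PostgreSQL runs.

```python
from functools import reduce


def sum_sfunc(state, value):
    """State transition: add the value and count down the remaining slots."""
    total, remaining = state
    return (total + value, remaining - 1)


def sum_ffunc(state):
    """Final function: return the sum only if exactly 3 values were consumed."""
    total, remaining = state
    return total if remaining == 0 else None


def sum3(values):
    """Model of the SUM3 aggregate with initcond '(0,3)'."""
    return sum_ffunc(reduce(sum_sfunc, values, (0, 3)))


print(sum3([1, 1, 3]))  # three values: their sum
print(sum3([1, 1]))     # fewer than three values: None
```

Changing the starting state to `(0, 4)` gives the SUM4 variant mentioned above.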
We have a phone dialer that calls our stores to inform them about the gas price in their region.
We have 3 tables (WBDAPP00,WBDCIE00,WBDCIA00)
WBDAPP00 is where we store information about the call.
DANOID = ID
DA#INT,DA#IND,DA#TEL = phone number
DA#ENV = The group-call number; we send one message to several stores.
DASTAT = The status of the call (confirmed by store, canceled, running, confirmed by us, paused)
DADTHR = The timestamp of the last status modification
WBDCIE00 is where we store information about the group of stores
CIE#EN = ID
CIEDHC = The timestamp at which the price takes effect; we can call in the morning to say the price will change at 14h30
CIE$OR = The price for regular
CIE$PL = The price for plus
CIE$SP = The price for super
CIE$DI = The price for diesel
WBDCIA00 is complementary information about WBDAPP00
CIA#ST = The ID of the store
CIA#AP = The ID of the call
CIA#EN = The ID of the group call
CIABAN = The number of the store's company
This is a sample output of these 3 tables
SELECT * FROM PRDCM/WBDAPP00 WHERE DA#ENV = 17258 OR DA#ENV = 17257
+--------+--------+--------+---------+--------+--------+----------------------------+-----------+--------+
| DANOID | DA#INT | DA#IND | DA#TEL | DA#ENV | DASTAT | DADTHR | DAPARM | DAMUSR |
+--------+--------+--------+---------+--------+--------+----------------------------+-----------+--------+
| 100420 | 1 | 418 | 9600055 | 17257 | 4 | 2012-05-07-09.15.04.768228 |1;2;1;1;1;1| ISALAP |
| 100421 | 1 | 819 | 7346491 | 17258 | 0 | 2012-05-07-09.23.32.362971 |0;4;0;1;0;0| ISALAP |
| 100422 | 1 | 819 | 7624747 | 17258 | 1 | 2012-05-07-09.24.28.042330 |0;3;1;1;0;1| ISALAP |
| 100423 | 1 | 819 | 6377874 | 17258 | 0 | 2012-05-07-09.23.32.803073 |0;3;0;1;0;1| ISALAP |
| 100424 | 1 | 819 | 8742844 | 17258 | 1 | 2012-05-07-09.24.25.347116 |1;1;1;1;0;1| ISALAP |
| 100425 | 1 | 819 | 8255744 | 17258 | 0 | 2012-05-07-09.23.33.207688 |1;3;1;1;0;1| ISALAP |
+--------+--------+--------+---------+--------+--------+----------------------------+-----------+--------+
SELECT * FROM PRDCM/WBDCIE00 WHERE CIE#EN = 17258 OR CIE#EN = 17257
+--------+----------------------------+--------+--------+--------+--------+
| CIE#EN | CIEDHC | CIE$OR | CIE$PL | CIE$SP | CIE$DI |
+--------+----------------------------+--------+--------+--------+--------+
| 17257 | 2012-05-04-17.00.00.000000 | 0 | 0 | 0 | 1,359 |
| 17258 | 2012-05-07-09.30.00.000000 | 1,354 | 0 | 0 | 0 |
+--------+----------------------------+--------+--------+--------+--------+
SELECT * FROM PRDCM/WBDCIA00 WHERE CIA#EN = 17258 OR CIA#EN = 17257
+--------+--------+--------+--------+
| CIA#ST | CIA#AP | CIA#EN | CIABAN |
+--------+--------+--------+--------+
| 96 | 100420 | 17257 | 2 |
| 316 | 100421 | 17258 | 4 |
| 320 | 100422 | 17258 | 3 |
| 321 | 100423 | 17258 | 3 |
| 338 | 100424 | 17258 | 1 |
| 366 | 100425 | 17258 | 3 |
+--------+--------+--------+--------+
This is the relation between tables
CIA#AP = DANOID
CIA#EN = CIE#EN = DA#ENV
I want to extract the last CIE$OR (not 0) and the last CIE$DI (not 0) for each CIA#ST.
The last one is determined by CIEDHC (Desc order).
DASTAT needs to be 1 or 4.
This is an example of what I want to extract from the data above:
+--------+--------+--------+
| CIA#ST | CIE$OR | CIE$DI |
+--------+--------+--------+
| 96 | 0 | 1,359 |
| 316 | 1,354 | 0 |
| 320 | 1,354 | 0 |
| 321 | 1,354 | 0 |
| 338 | 1,354 | 0 |
| 366 | 1,354 | 0 |
+--------+--------+--------+
Or like this one, which is not ideal but which I will tolerate in this case
+--------+-------------+-------+
| CIA#ST | productType | price |
+--------+-------------+-------+
| 96 | 3 | 1,359 |
| 316 | 6 | 1,354 |
| 320 | 6 | 1,354 |
| 321 | 6 | 1,354 |
| 338 | 6 | 1,354 |
| 366 | 6 | 1,354 |
+--------+-------------+-------+
For those who don't know AS400, FETCH FIRST 1 ROWS ONLY is equivalent to TOP 1 or LIMIT 1.
LAST does not exist in AS400, so I need to replace
SELECT LAST(Column1) AS test FROM table1
by
SELECT Column1,Column2 FROM table1 ORDER BY Column2 DESC LIMIT 1
I have tried with a subselect, but you can't use ORDER BY and FETCH FIRST 1 ROWS ONLY in one.
We are on V5R1 without any PTF.
This is an example of extraction
SELECT CIA#ST,CIE$OR,CIE$DI,CIEDHC
FROM PRDCM/WBDAPP03
INNER JOIN PRDCM/WBDCIE01 ON CIE#EN = DA#ENV
INNER JOIN PRDCM/WBDCIA01 ON CIA#AP = DANOID
WHERE DASTAT IN (1,4)
ORDER BY CIEDHC,DA#ENV
FETCH FIRST 5 ROWS ONLY
+--------+--------+--------+----------------------------+
| CIA#ST | CIE$OR | CIE$DI | CIEDHC |
+--------+--------+--------+----------------------------+
| 88 | 1,014 | 1,039 | 2010-08-25-09.00.00.000000 |
| 89 | 1,014 | 1,039 | 2010-08-25-09.00.00.000000 |
| 90 | 1,014 | 1,039 | 2010-08-25-09.00.00.000000 |
| 91 | 1,014 | 1,039 | 2010-08-25-09.00.00.000000 |
| 119 | 1,084 | 0 | 2010-08-25-09.00.00.000000 |
| 522 | 1,014 | 1,039 | 2010-08-25-09.00.00.000000 |
+--------+--------+--------+----------------------------+
I'll try all your suggestions.
Frankly, I'm a little twitchy about your schema here - there's some denormalization I'm not happy with, among other things (a multi-value column, really?). But you probably have a limited ability to change it, so... If possible, you should consider upgrading to at least V6R1 (which is what we're on), as the database gets more goodies. Thankfully, you still have CTEs, which will help a bit.
I'm assuming that what you want is the latest price change for a store (given by CIEDHC) with a call for that store in DASTAT as 1 or 4, not given by the call-time (so, what happens if an earlier group-call is 'confirmed' after a later one?). In other words, this isn't the last 'confirmed' change, it's the last 'entered' change.
I'm also assuming you have a 'store' table, with all the actual store ids defined. However, since you didn't list it, I created a CTE to manufacture one. You can (and probably should) swap it out in the resulting statement.
WITH Store (storeId) as (
SELECT DISTINCT cia#st
FROM Wbdcia00),
Price_Change (callGroup, occurredAt, productType, newPrice) as (
SELECT cie#en, ciedhc, 1, cie$or
FROM Wbdcie00
WHERE cie$or > 0
UNION ALL
SELECT cie#en, ciedhc, 4, cie$di
FROM Wbdcie00
WHERE cie$di > 0),
Confirmed_Changes (storeId, occurredAt, productType, newPrice) as (
SELECT WarehouseCall.cia#st, Change.occurredAt,
Change.productType, Change.newPrice
FROM Wbdcia00 as WarehouseCall
JOIN Wbdapp00 as Call
ON Call.danoid = WarehouseCall.cia#ap
AND Call.dastat IN (1, 4)
JOIN Price_Change as Change
ON Change.callGroup = da#env),
Latest_Change (storeId, productType, newPrice) as (
SELECT Actual.storeId, Actual.productType, Actual.newPrice
FROM Confirmed_Changes as Actual
EXCEPTION JOIN Confirmed_Changes as Remove
ON Remove.storeId = Actual.storeId
AND Remove.productType = Actual.productType
AND Remove.occurredAt > Actual.occurredAt)
SELECT store.storeId, COALESCE(Regular.newPrice, 0) as regularPrice,
COALESCE(Diesel.newPrice, 0) as dieselPrice
FROM Store
LEFT JOIN Latest_Change as Regular
ON Regular.storeId = Store.storeId
AND Regular.productType = 1
LEFT JOIN Latest_Change as Diesel
ON Diesel.storeId = Store.storeId
AND Diesel.productType = 4
Some things to note -
I figured you weren't actually giving a product a price of 0. This means that you're not looking for the individual call that went out, with both prices listed - you're going for the last change that happened, for each product. Which is why I pivoted/unpivoted that table like I did.
Needless to say, this statement reports the last entered change that was 'confirmed'. This is not the last confirmation of a change (indicated by dadthr), however.
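The Latest_Change step relies on EXCEPTION JOIN, which is specific to DB2 for i. The sketch below shows the same "keep the row for which no later row exists" idea using the portable LEFT JOIN ... IS NULL form, run in SQLite through Python's sqlite3 module. The `confirmed_changes` table stands in for the Confirmed_Changes CTE, and its rows are made-up sample data (prices written with a decimal point), so treat this as an illustration of the anti-join, not as output from the real files.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the Confirmed_Changes CTE; all rows are hypothetical.
CREATE TABLE confirmed_changes
  (store_id INTEGER, occurred_at TEXT, product_type INTEGER, new_price REAL);
INSERT INTO confirmed_changes VALUES
  (96,  '2012-05-01 08:00', 1, 1.300),   -- older regular price, superseded
  (96,  '2012-05-07 09:30', 1, 1.354),
  (96,  '2012-05-04 17:00', 4, 1.359),
  (320, '2012-05-07 09:30', 1, 1.354);
""")

# Anti-join: keep a change only if no later change exists
# for the same store and product type.
rows = conn.execute("""
SELECT a.store_id, a.product_type, a.new_price
FROM confirmed_changes a
LEFT JOIN confirmed_changes r
       ON r.store_id = a.store_id
      AND r.product_type = a.product_type
      AND r.occurred_at > a.occurred_at
WHERE r.store_id IS NULL
ORDER BY a.store_id, a.product_type
""").fetchall()

for row in rows:
    print(row)
```

If two changes for the same store and product share the same timestamp, both survive the anti-join, so the final LEFT JOINs in the statement above would then return more than one row for that store.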