I have the below table
substring(area,6,3)  qty
101                  10
103                  15
102                  11
104                  30
105                  25
107                  17
108                  23
106                  48
And I am looking to get a result like the one below, without repeating IIF expressions (it is a cumulative total over each run of 4 areas):
new_area (substring(area,6,3))  sum_qty
101-104                         66
105-108                         117
I don't know how to create the new_area column so that I can get the summed qty.
Looking forward to your help.
Please also add an explanation so I will understand how the query is running.
I think this is what you are looking for.
We just use the window function row_number() to create the Grp column.
NOTE: If you have repeating values in AREA, use dense_rank() instead of row_number().
Example
Select new_area = concat(min(area),'-',max(area))
,qty = sum(qty)
From (
Select area=substring(area,6,3)
,qty
,Grp = (row_number() over (order by substring(area,6,3))-1) / 4  -- integer division: rows 1-4 get Grp 0, rows 5-8 get Grp 1
From YourTable
) A
Group By Grp
Results
new_area qty
101-104 66
105-108 113   -- note: this differs from the 117 in your expected output; 25+17+23+48 = 113
If you were to run the subquery, you would see the following.
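Worked out by hand from your sample rows (shown ordered by area):

area  qty  Grp
101   10   0
102   11   0
103   15   0
104   30   0
105   25   1
106   48   1
107   17   1
108   23   1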
Then it becomes a small matter to aggregate the data grouped by the created column GRP
I have a dataframe that looks like this:
id name datetime
44 once 2022-11-22T15:41:00
44 once 2022-11-22T15:42:00
44 once 2022-11-22T15:43:00
44 twice 2022-11-22T15:44:00
44 once 2022-11-22T16:41:00
55 thrice 2022-11-22T17:44:00
55 thrice 2022-11-22T17:46:00
55 once 2022-11-22T17:47:00
55 once 2022-11-22T17:51:00
55 twice 2022-11-22T18:41:00
55 thrice 2022-11-22T18:51:00
My desired output is
id name datetime cnt
44 once 2022-11-22T15:41:00 3
44 once 2022-11-22T15:42:00 3
44 once 2022-11-22T15:43:00 3
44 twice 2022-11-22T15:44:00 1
44 once 2022-11-22T16:41:00 1
55 thrice 2022-11-22T17:44:00 2
55 thrice 2022-11-22T17:46:00 2
55 once 2022-11-22T17:47:00 2
55 once 2022-11-22T17:51:00 2
55 twice 2022-11-22T18:41:00 1
55 thrice 2022-11-22T18:51:00 1
where the new column, cnt, is the number of rows in each block of consecutive identical name values within an id.
I attempted the problem by doing:
select
id,
name,
datetime,
row_number() over (partition by id order by datetime) rn1,
row_number() over (partition by id, name order by name, datetime) rn2
from table
but it is obviously not giving the desired output.
I also tried looking at the solutions in SQL count consecutive days, but could not figure it out from the answers given there.
As noted in the question you linked to, this is a typical gaps & islands problem.
The solution is provided in the answers to that question, but I've applied it to your sample data for you here:
with gp as (
select *,
Row_Number() over(partition by id order by [datetime])
- Row_Number() over(partition by id, name order by [datetime]) g
from t
)
select id, name, [datetime],
Count(*) over(partition by id, name, g) cnt
from gp;
See Demo DBFiddle
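To see why the difference of the two Row_Number() values identifies each run, here are the intermediate values for id 44, worked out by hand from the sample data (the column labels are just for illustration):

name   datetime             rn_per_id  rn_per_id_name  g
once   2022-11-22T15:41:00  1          1               0
once   2022-11-22T15:42:00  2          2               0
once   2022-11-22T15:43:00  3          3               0
twice  2022-11-22T15:44:00  4          1               3
once   2022-11-22T16:41:00  5          4               1

Within a run of the same name, both row numbers advance together, so their difference g stays constant; when the name changes, the difference jumps to a new value. The combination (id, name, g) therefore identifies each consecutive block, and Count(*) over that partition produces the cnt column.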
I have a table like below:
type_id  date        order
20       2021-06-23  123
20       2021-06-23  217
35       2021-06-23  121
35       2021-06-24  128
20       2021-06-24  55
35       2021-06-25  77
20       2021-06-26  72
20       2021-06-26  71
and want to create a query, only for rows where type_id = 20, that pivots the orders into one column per date, like this:
2021-06-23  2021-06-24  2021-06-25  2021-06-26
123         55                      72
217                                 71
Is it possible to do this with SQL, without VBA?
If VBA is needed, do I need to create an extra table and add/delete columns every time?
Thank you for any ideas.
You can use conditional aggregation. But this is a pain in MS Access because you need a sequential value. You can calculate one:
select max(iif(date = "2021-06-23", order, null)) as val_2021_06_23,
max(iif(date = "2021-06-24", order, null)) as val_2021_06_24,
max(iif(date = "2021-06-25", order, null)) as val_2021_06_25,
max(iif(date = "2021-06-26", order, null)) as val_2021_06_26
from (select t.*,
(select count(*)
from t as t2
where t2.type_id = t.type_id and t2.date = t.date and t2.order <= t.order
) as seqnum
from t
where type_id = 20
) t
group by seqnum;
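To see how this works: for each row, the correlated subquery counts how many orders with the same type_id and date have an order value less than or equal to the current one, which emulates a per-date row number (Access has no ROW_NUMBER()). Worked out by hand for the type_id = 20 sample rows, that gives roughly:

date        order  seqnum
2021-06-23  123    1
2021-06-23  217    2
2021-06-24  55     1
2021-06-26  71     1
2021-06-26  72     2

GROUP BY seqnum then turns each seqnum into one output row, and each MAX(IIF(...)) picks out that row's order value for one date column.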
Thank you, it works!
(Only one change in the code was needed: "2021-06-23" → #2021-06-23#.)
In the meantime I found another solution, but it requires adding a new field to the table. The field is numeric and contains a sequence number for each day, from 1 to n. In my project this is even helpful, because it lets me control the ordering within the columns.
Here is the code; maybe it will be helpful for someone in the future.
TRANSFORM
First ([tabela1].[order])
SELECT [tabela1].[sequence]
FROM [tabela1]
WHERE [tabela1].[type_id] = 20
GROUP BY [tabela1].[sequence]
PIVOT [tabela1].[date]
I'm trying to get a running total within a group but my current code just gives me an aggregate sum.
For example, my data looks like this
ID ShiftNum Status Type Rate HourlyWage Hours Total_Amount
12542 1 Full A 1 12.5 40 500
12542 1 Full A 1 12.5 35 420
12542 2 Full A 1 10 40 400
12542 2 Full B 1.2 10 40 480
17842 1 Full A 1 11 27 297
17842 1 Full B 1.3 11 30 429
And what I want is a running total within the same ID, Shift Number, and Status. For example, I want something like this as my final result
ID ShiftNum Status Type Rate HourlyWage Hours Total_Amount Running_Tot
12542 1 Full A 1 12.5 40 500 500
12542 1 Full A 1 12.5 35 420 920
12542 2 Full A 1 10 40 400 400
12542 2 Full B 1.2 10 40 480 880
17842 1 Full A 1 11 27 297 297
17842 1 Full B 1.3 11 30 429 726
However, my current code just gives me the total sum within each group, for example 920, 920 for rows 1 and 2. Here's my code.
Select a.*,
SUM(Total_Amount) OVER (PARTITION BY ID, ShiftNum, Status ORDER BY ID, ShiftNum, Status) as Running_Tot
from table a
How do I fix my code to get the final result I want?
You need an ordering column that uniquely defines each row. There is not an obvious one in your data, but you can use something like this:
SUM(Total_Amount) OVER (PARTITION BY ID, ShiftNum, Status ORDER BY hours) as Running_Tot
Or:
SUM(Total_Amount) OVER (PARTITION BY ID, ShiftNum, Status
ORDER BY (SELECT NULL)
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) as Running_Tot
The problem you are facing is that the ORDER BY keys have ties. The default window frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Note the RANGE: all rows that are tied on the ORDER BY keys are combined, which is why rows 1 and 2 both show the full group total of 920.
Also note that there is no point in including the PARTITION BY keys in the ORDER BY, because the ordering happens within a partition, where those keys are constant. (The one exception in SQL Server: if you don't care about the ordering, reusing a key is a handy shortcut.)
If your rows can have exact duplicates, I would first suggest that you add a primary key. But, in the meantime, you could use:
with a as (
select a.*,
row_number() over (order by id, shiftnum, status) as seqnum
from tablea a
)
Select a.*,
SUM(Total_Amount) OVER (PARTITION BY ID, ShiftNum, Status ORDER BY seqnum) as Running_Tot
from a;
The ordering will be arbitrary, but it will at least accumulate.
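If you do add a primary key as suggested above, an identity column is a simple choice; a sketch, assuming SQL Server and the table name used in the query above:

alter table tablea add Id int identity(1,1) primary key;

You could then write ORDER BY Id in the OVER clause, and the running total becomes stable and repeatable rather than arbitrary.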
I have a control table where the price of each item number is tracked by date.
id ItemNo Price Date
---------------------------
1 a001 100 1/1/2003
2 a001 105 1/2/2003
3 a001 110 1/3/2003
4 b100 50 1/1/2003
5 b100 55 1/2/2003
6 b100 60 1/3/2003
7 c501 35 1/1/2003
8 c501 38 1/2/2003
9 c501 42 1/3/2003
10 a001 95 1/1/2004
This is the query I am running.
SELECT pr.*
FROM prices pr
INNER JOIN
(
SELECT ItemNo, max(date) max_date
FROM prices
GROUP BY ItemNo
) p ON pr.ItemNo = p.ItemNo AND
pr.date = p.max_date
order by ItemNo ASC
I am getting below values
id ItemNo Price Date
------------------------------
10 a001 95 2004-01-01
6 b100 60 2003-01-03
9 c501 42 2003-01-03
My question is: is my query right or wrong, even though I am getting my desired result?
Your query does what you want, and is a valid approach to solve your problem.
An alternative option would be to use a correlated subquery for filtering:
select p.*
from prices p
where p.date = (select max(p1.date) from prices p1 where p1.itemno = p.itemno)
The upside of this query is that it can take advantage of an index on (itemno, date).
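For example (the index name here is just an illustration):

create index ix_prices_itemno_date on prices (itemno, date);

With that index, the correlated MAX(date) per itemno can be answered with an index seek instead of a scan.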
You can also use window functions:
select *
from (
select p.*, rank() over(partition by itemno order by date desc) rn
from prices p
) p
where rn = 1
I would recommend benchmarking the three options against your real data to assess which one performs better.
I have a Postgresql database, and I'm having trouble getting my query right, even though this seems like a common problem.
My table looks like this:
CREATE TABLE orders (
account_id INTEGER,
order_id INTEGER,
ts TIMESTAMP DEFAULT NOW()
)
Every time there is a new order, I use this table to link the account_id and order_id.
Now my problem is that I want to get a list that has the last order (by looking at ts) for each account.
For example, if my data is:
account_id order_id ts
5 178 July 1
5 129 July 6
4 190 July 1
4 181 July 9
3 348 July 1
3 578 July 4
3 198 July 1
3 270 July 12
Then I'd like the query to return only the last row for each account:
account_id order_id ts
5 129 July 6
4 181 July 9
3 270 July 12
I've tried GROUP BY account_id, and I can use that to get the MAX(ts) for each account, but then I have no way to get the associated order_id. I've also tried sub-queries, but I just can't seem to get it right.
Thanks!
select distinct on (account_id) *
from orders
order by account_id, ts desc
https://www.postgresql.org/docs/current/static/sql-select.html#SQL-DISTINCT:
SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.
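If the table is large, this query can be supported by an index that matches the DISTINCT ON / ORDER BY keys (the index name is just an illustration):

create index idx_orders_account_ts on orders (account_id, ts desc);

That lets Postgres read the rows already sorted by account_id, ts desc, which helps it pick the latest row per account without a full sort.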
The row_number() window function can help:
select account_id, order_id, ts
from (select account_id, order_id, ts,
row_number() over(partition by account_id order by ts desc) as rn
from orders) t
where rn = 1