I have a table that gets a value on only one day in each month. I want to carry that value across the whole month until a new value shows up. The result will be a table with data for each day of the month based on the last known value.
Can someone help me write this query?
This is untested, due to a lack of consumable sample data, but this looks like a gaps-and-islands problem. Here you can count the number of non-NULL values for Yield to assign the group "number" and then get the windowed MAX in the outer SELECT:
WITH CTE AS(
    SELECT Yield,
           [Date],
           COUNT(Yield) OVER (ORDER BY [Date]) AS Grp
    FROM dbo.YourTable)
SELECT MAX(Yield) OVER (PARTITION BY Grp) AS Yield,
       [Date],
       DATENAME(WEEKDAY,[Date]) AS [Day]
FROM CTE;
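To see how the grouping behaves, here is a minimal sketch against made-up data. The table variable @Sample and its values are assumptions for illustration only, and it presumes the table already holds one row per day, with Yield set to NULL on days without a reading.

```sql
-- Hypothetical sample: one row per day, Yield known only on some days.
DECLARE @Sample TABLE ([Date] date PRIMARY KEY, Yield decimal(5,2) NULL);
INSERT INTO @Sample ([Date], Yield)
VALUES ('20200101', 1.50), ('20200102', NULL), ('20200103', NULL),
       ('20200201', 1.75), ('20200202', NULL);

WITH CTE AS (
    -- COUNT ignores NULLs, so the running count only increases on rows that carry a value.
    SELECT [Date], Yield,
           COUNT(Yield) OVER (ORDER BY [Date]) AS Grp
    FROM @Sample
)
SELECT [Date],
       MAX(Yield) OVER (PARTITION BY Grp) AS Yield  -- the single non-NULL value in each group
FROM CTE
ORDER BY [Date];
-- Jan 1-3 return 1.50; Feb 1-2 return 1.75.
```

If the table only contains the days that have a value, the missing days would have to be generated first (for example from a calendar table) before this applies.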
You seem to have data on the first of the month. That suggests an alternative approach:
select t.*, t2.yield as imputed_yield
from t cross apply
(select t2.*
from t t2
where t2.date = datefromparts(year(t.date), month(t.date), 1)
) t2;
This should be able to take advantage of an index on (date, yield). It does assume that the value you want is on the first day of the month.
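As a rough sketch of what that could look like: the index name below is made up, and t, date and yield are just the placeholder names from the query above. The OUTER APPLY variant is an optional swap-in for months that have no row on the 1st.

```sql
-- Hypothetical index name; t, [date] and yield are the placeholders used above.
CREATE INDEX IX_t_date_yield ON t ([date], yield);

-- OUTER APPLY keeps rows from months with no entry on the 1st
-- (imputed_yield is NULL there); CROSS APPLY would drop those rows.
SELECT t.*, t2.yield AS imputed_yield
FROM t
OUTER APPLY (SELECT t2.yield
             FROM t AS t2
             WHERE t2.[date] = DATEFROMPARTS(YEAR(t.[date]), MONTH(t.[date]), 1)
            ) AS t2;
```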
I have a table like below:
I want the results to be like below, showing the start and end of each balance, but we can't simply use GROUP BY because a balance should be grouped only across consecutive rows. Can you please help me with this?
There is most certainly a duplicate of this question; however, it is easier to crank out an answer than to search. These types of problems, with data in the order inserted or shown and no order indicator, can be solved with two derived queries. The first uses LAG or LEAD to mark where the value changes, and the second sums up those changes, which are represented by a value of 1 as opposed to 0. The key here, using MS SQL Server, is SUM(x) OVER (ORDER BY Date ROWS UNBOUNDED PRECEDING).
DECLARE @T TABLE(balance INT, date DATETIME)
INSERT @T VALUES
(36,'1/1/2020'),
(36,'1/2/2020'),
(36,'1/3/2020'),
(24,'1/4/2020'),
(24,'1/5/2020'),
(36,'1/6/2020'),
(36,'1/7/2020'),
(37,'1/8/2020'),
(38,'1/9/2020')
;WITH GapsMarked AS
(
    --If the previous value by date (the natural order of the data above) does not equal this value, mark it as a boundary
    SELECT *,
        IsBoundary = CASE WHEN ISNULL(LAG(balance) OVER (ORDER BY date),balance) = balance THEN 0 ELSE 1 END
    FROM @T
)
,VirtualGroup AS
(
    SELECT
        *,
        --This serializes the marked groups into sequential clusters
        IslandsMarked = SUM(IsBoundary) OVER (ORDER BY date ROWS UNBOUNDED PRECEDING)
    FROM
        GapsMarked
)
SELECT
    MAX(balance) AS balance,
    MIN(date) AS start,
    MAX(date) AS [end]
FROM
    VirtualGroup
GROUP BY
    IslandsMarked
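To make the two steps concrete, the sketch below runs the same logic on a shortened copy of the sample data and returns the intermediate columns instead of aggregating. The inline VALUES list, the CTE names and the IslandId alias are only for illustration.

```sql
WITH Src AS (
    SELECT balance, CAST(d AS date) AS [date]
    FROM (VALUES (36, '20200101'), (36, '20200102'), (24, '20200103'),
                 (24, '20200104'), (36, '20200105')) AS v (balance, d)
),
Marked AS (
    -- 1 where the balance differs from the previous row, 0 otherwise
    -- (the first row has no previous row, so the comparison is unknown and falls to 0).
    SELECT *,
           CASE WHEN LAG(balance) OVER (ORDER BY [date]) <> balance THEN 1 ELSE 0 END AS IsBoundary
    FROM Src
)
SELECT balance, [date], IsBoundary,
       SUM(IsBoundary) OVER (ORDER BY [date] ROWS UNBOUNDED PRECEDING) AS IslandId
FROM Marked
ORDER BY [date];
-- balance  date        IsBoundary  IslandId
-- 36       2020-01-01  0           0
-- 36       2020-01-02  0           0
-- 24       2020-01-03  1           1
-- 24       2020-01-04  0           1
-- 36       2020-01-05  1           2
```

The second run of 36 gets its own IslandId, which is exactly what a plain GROUP BY balance cannot do.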
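-- Note: grouping on balance alone merges non-consecutive runs of the same balance
select balance, min(start), max([end]) from [table] where balance in (
select balance from [table]
group by balance)
group by balance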
Hope it will help you
What I need to do: if a customer makes more than one transaction in a day, I need to display the greatest value (and ignore any other values).
The query is pretty big, but the code I inserted below is the focus of the issue. I’m not getting the results I need. The subselect ideally should be reducing the number of rows the query generates since I don’t need all the transactions, just the greatest one, however my code isn’t cutting it. I’m getting the exact same number of rows with or without the subselect.
Note: I don't actually have a t.* in the actual query; there are just a dozen or so other fields being pulled in. I added the t.* just to simplify the code example.
SELECT
    t.*,
    (SELECT TOP (1)
        t1.Value
    FROM #temp t1
    WHERE t1.CustomerGUID = t.CustomerGUID
        AND t1.Date = t.Date
    ORDER BY t1.Value DESC) AS "Value"
FROM #temp t
Is there an obvious flaw in my code or is there a better way to achieve the result of getting the greatest value transaction per day per customer?
Thanks
You may want to do as follows:
SELECT
t1.CustomerGUID,
t1.Date,
MAX(t1.Value) AS Value
FROM #temp t1
GROUP BY
t1.CustomerGUID,
t1.Date
You can use row_number() as shown below.
SELECT
    *
FROM
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY CustomerGUID, CAST([Date] AS date) ORDER BY Value DESC) AS SrNo FROM <YourTable>
) AS ranked
WHERE
    SrNo = 1
Sample data will be more helpful.
Try this window function:
MAX(value) OVER(PARTITION BY date,customer ORDER BY value DESC)
It's faster and more efficient.
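Keep in mind that a window function on its own returns every input row, so it still needs something like DISTINCT to get down to one row per customer per day. A minimal sketch, assuming the column names from the question; the table variable @Tx, the int customer key and the sample rows are made up for illustration.

```sql
-- Hypothetical stand-in for the question's #temp table.
DECLARE @Tx TABLE (CustomerGUID int, [Date] datetime, Value money);
INSERT INTO @Tx VALUES
    (1, '20200101 09:00', 10), (1, '20200101 17:30', 25),
    (1, '20200102 08:15', 5),  (2, '20200101 12:00', 40);

SELECT DISTINCT
       CustomerGUID,
       CAST([Date] AS date) AS TransactionDay,   -- strip the time portion
       MAX(Value) OVER (PARTITION BY CustomerGUID, CAST([Date] AS date)) AS MaxValue
FROM @Tx;
-- Three rows, in no guaranteed order:
-- (1, 2020-01-01, 25), (1, 2020-01-02, 5), (2, 2020-01-01, 40)
```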
Probably many other ways to do it, but this one is simple and works
select t.*
from (
select
convert(varchar(8), r.date,112) one_day
,max(r.Value) max_sale
from #temp r
group by convert(varchar(8), r.date,112)
) e
inner join #temp t on t.value = e.max_sale and convert(varchar(8), t.date,112) = e.one_day
If you have 2 people who spend the exact same amount and that amount is also the max, you'll get 2 records for that day.
The convert(varchar(8), r.date,112) will perform as desired on date, datetime and datetime2 data types. If your date is a varchar, char, nchar or nvarchar, you'll want to examine the data to find out whether to use left(t.date,10) or left(t.date,8).
If I've understood your requirement correctly, you have stated "greatest value transaction per day per customer". That suggests to me you don't want 1 row per customer in the output but a row per day per customer.
To achieve this you can group on the day like this:
Select t.CustomerGUID, cast(t.date as date) as Daydate,
max(t.value) as value from #temp t group by
t.CustomerGUID, cast(t.date as date);
My table contains different house IDs (dataid), times of observation (readtime), and meter readings.
The query is as follows:
select *
from university.gas_ert
where readtime between '01/01/2014' and '01/02/2014'
I am trying to get only the first observation of each day for all the dataids within the time span. I have tried GROUP BY, but it doesn't seem to work.
DISTINCT ON could make your query much simpler. Read more in the documentation.
Definition :
Keeps only the first row of each set of rows where the given
expressions evaluate to equal. Note that the “first row” of each set
is unpredictable unless ORDER BY is used to ensure that the desired
row appears first.
SELECT
DISTINCT ON (meter_value) meter_value,
dataid,
readtime
FROM
university.gas_ert
WHERE
readtime between '2014-01-01' and '2014-01-02'
ORDER BY
meter_value,
readtime ASC;
If you want one row for each unique dataid within the time range, you should use the DISTINCT ON construction. The following query will give you a row for each dataid for each day in the range described in the WHERE clause and lets you extend the range if you want to return rows for each day x dataid combination.
select distinct on(dataid, date_trunc('day', readtime)) *
from university.gas_ert
where readtime between '2014-01-01' and '2014-01-02'
order by dataid, date_trunc('day', readtime), readtime asc
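One thing to watch if readtime is a timestamp: BETWEEN is inclusive on both ends, so a reading taken at exactly midnight on the 2nd would come back as its own "day". A minimal PostgreSQL sketch with a half-open range; the temp table gas_ert_demo, the meter_value column type and the sample rows are assumptions for illustration.

```sql
-- Hypothetical stand-in for university.gas_ert.
CREATE TEMP TABLE gas_ert_demo (dataid int, readtime timestamp, meter_value numeric);
INSERT INTO gas_ert_demo VALUES
    (35, '2014-01-01 00:05', 100), (35, '2014-01-01 06:00', 102),
    (35, '2014-01-02 00:00', 110), (77, '2014-01-01 03:00', 50);

SELECT DISTINCT ON (dataid, readtime::date) *
FROM gas_ert_demo
WHERE readtime >= '2014-01-01'
  AND readtime <  '2014-01-02'              -- half-open: excludes midnight of Jan 2
ORDER BY dataid, readtime::date, readtime;  -- earliest reading per dataid and day comes first
-- Returns (35, 2014-01-01 00:05, 100) and (77, 2014-01-01 03:00, 50).
```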
You can take a look at window functions, such as ROW_NUMBER, to help with this.
Partition the records by dataid and by day using date_trunc (i.e. without the time component), and then rank them by readtime ascending:
select *
from (
    select *
    ,row_number() over(partition by a.dataid, date_trunc('day',a.readtime) order by a.readtime asc ) as rnk
    from university.gas_ert a
)x
where x.rnk=1
I have a survey form with certain questions for a certain facility.
The facility can be monitored (data entry) more than once in a month.
Now I need the latest data (values) for each question,
but if there is no latest data for a question, I want to fall back to prior records (previous dates) of the same month.
I can get the latest record, but I don't know how to get the previous record of the same month if there is no latest data.
I am using PostgreSQL 10.
Table Structure is
Desired output is
You can try the ROW_NUMBER window function:
SELECT to_char(date, 'MON') month,
facility,
idquestion,
value
FROM (
SELECT *,ROW_NUMBER() OVER(PARTITION BY facility,idquestion ORDER BY DATE DESC) rn
FROM T
) t1
where rn = 1
demo:db<>fiddle
SELECT DISTINCT
to_char(qdate, 'MON'),
facility,
idquestion,
first_value(value) OVER (PARTITION BY facility, idquestion ORDER BY qdate DESC) as value
FROM questions
ORDER BY facility, idquestion
Using window functions:
first_value(value) OVER ... gives you the first value of a window frame. The frame is a group of facility and idquestion. Within this group the rows are ordered by date DESC, so the very last value comes first, no matter which date it is.
DISTINCT filtered the tied values (e.g. there are two values for facility == 1 and idquestion == 7)
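If a missing answer is stored as NULL in value (an assumption, since the table structure is not shown here), the fallback to an earlier entry of the same month could be sketched like this in PostgreSQL. The temp table questions_demo and its rows are made up; the column names follow the query above.

```sql
-- Hypothetical stand-in for the questions table.
CREATE TEMP TABLE questions_demo (facility int, idquestion int, qdate date, value int);
INSERT INTO questions_demo VALUES
    (1, 7, '2018-08-05', 3), (1, 7, '2018-08-20', NULL),
    (1, 8, '2018-08-05', 1), (1, 8, '2018-08-20', 4);

SELECT DISTINCT ON (facility, idquestion, date_trunc('month', qdate))
       to_char(qdate, 'MON') AS month, facility, idquestion, value
FROM questions_demo
WHERE value IS NOT NULL                      -- skip entries that carry no data
ORDER BY facility, idquestion, date_trunc('month', qdate), qdate DESC;
-- Question 7 falls back to 3 (Aug 5); question 8 uses 4 (Aug 20).
```

Partitioning by date_trunc('month', qdate) keeps the fallback inside the same month, so an empty month simply produces no row for that question.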
Please notice:
"date" is a reserved word in standard SQL and the name of a data type in Postgres. I strongly recommend renaming your column to avoid trouble. Furthermore, Postgres folds unquoted identifiers to lower case, so lower-case names are recommended.
I have a unique scenario for which I can't find a solution, so I thought I'd ask the experts :)
I have a query that returns a course syllabus, in which each row represents a day of training. You can see in the picture below that there are rest days in the middle of the training.
I can't find a way to group each run of consecutive training days.
Please see the screenshot below, which details the rows and what I want to achieve.
I am using MS-SQL 2014.
Here is a Fiddle with the data I have and the expected results:
SQL Fiddle
The simplest method is a difference of row_number(). The following identifies each consecutive group with a number:
select td.*,
dense_rank() over (order by dateadd(day, - seqnum, DayOfTraining)) as grpnum
from (select td.*,
row_number() over (order by DayOfTraining) as seqnum
from TrainingDays td
) td;
The key idea is that subtracting a sequence from consecutive days produces a constant for those days.
Here is the SQL Fiddle.
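The key idea above is easier to see with the intermediate values spelled out. A minimal sketch, where the table variable @TrainingDays, its dates and the grp_anchor alias are made up for illustration:

```sql
-- Hypothetical sample standing in for TrainingDays.
DECLARE @TrainingDays TABLE (DayOfTraining date PRIMARY KEY);
INSERT INTO @TrainingDays VALUES
    ('20170101'), ('20170102'), ('20170103'),   -- first run
    ('20170106'), ('20170107');                 -- second run, after two rest days

SELECT td.DayOfTraining,
       td.seqnum,
       DATEADD(DAY, -td.seqnum, td.DayOfTraining) AS grp_anchor  -- constant within a run
FROM (SELECT DayOfTraining,
             CAST(ROW_NUMBER() OVER (ORDER BY DayOfTraining) AS int) AS seqnum
      FROM @TrainingDays
     ) AS td
ORDER BY td.DayOfTraining;
-- Jan 1-3 all map to 2016-12-31; Jan 6-7 both map to 2017-01-02.
```

Each rest day shifts the date forward without advancing the sequence, so the anchor jumps to a new value at every gap.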
After a lot of trial and error, this is the closest I could come up with:
http://rextester.com/ECBQ88563
The problem here is that if the last row belongs to another group, it will still be placed in the previous group. So in your sample, if you change the last date from 19 to 20, the output will still be the same. Maybe it can be eliminated with another condition. Other than that, this should work.
SELECT DayOfTraining1,
       dense_rank() over (ORDER BY grp_dt) AS grp
FROM
  (SELECT DayOfTraining1,
          min(DayOfTraining) AS grp_dt
   FROM
     (SELECT trng.DayOfTraining AS DayOfTraining1,
             dd.DayOfTraining
      FROM trng
      CROSS JOIN
        (SELECT d.*
         FROM
           (SELECT trng.*,
                   lag (DayOfTraining,1) OVER (
                       ORDER BY DayOfTraining) AS prev_DayOfTraining,
                   lead (DayOfTraining,1) OVER (
                       ORDER BY DayOfTraining) AS nxt_DayOfTraining,
                   datediff(DAY, lag (DayOfTraining,1) OVER (
                       ORDER BY DayOfTraining), DayOfTraining) AS ddf
            FROM trng
           ) d
         WHERE d.ddf <> 1
            OR nxt_DayOfTraining IS NULL
        ) dd
      WHERE trng.DayOfTraining <= dd.DayOfTraining
     ) t
   GROUP BY DayOfTraining1
  ) t1;
Explanation: The inner query d uses the lag and lead functions to capture the previous and next rows' values. Then we take the difference in days and keep the dates where the difference is not 1. These are the dates where the group should switch. The derived table dd holds these dates.
Now cross join this with the main table and use an aggregate function to determine the continuous groups (this took a lot of trial and error).
Then use the dense_rank function on it to get the group number.
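One possible way around the last-row problem, sketched here as a different technique rather than a fix to the query above, is to flag each row that starts a new run with LAG and take a running sum of the flags. The table variable @trng and its dates are made up for illustration.

```sql
-- Hypothetical sample standing in for trng; the last day is isolated on purpose.
DECLARE @trng TABLE (DayOfTraining date PRIMARY KEY);
INSERT INTO @trng VALUES
    ('20170101'), ('20170102'), ('20170105'), ('20170106'), ('20170109');

SELECT DayOfTraining,
       SUM(is_start) OVER (ORDER BY DayOfTraining ROWS UNBOUNDED PRECEDING) AS grp
FROM (SELECT DayOfTraining,
             -- 0 when the previous training day was yesterday, 1 otherwise;
             -- the first row has no previous day (NULL difference) and also gets 1.
             CASE WHEN DATEDIFF(DAY, LAG(DayOfTraining) OVER (ORDER BY DayOfTraining), DayOfTraining) = 1
                  THEN 0 ELSE 1 END AS is_start
      FROM @trng
     ) AS d
ORDER BY DayOfTraining;
-- grp: 1, 1, 2, 2, 3 (the isolated last day gets its own group)
```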