Construct a grouping column in SQL Server 2012 - sql

I have created a table that looks something like this:
ID TSPPLY_DT NEXT_DT DAYS_BTWN TIME_TO_EVENT CENSORED ENDPOINT
-----------------------------------------------------------------------------
1 2014-01-01 2014-01-10 10 10 0 0
1 2014-01-10 2014-01-21 11 21 0 0
1 2014-01-21 NULL NULL 21 1 0
2 2015-04-01 2015-04-30 30 30 0 0
2 2015-04-30 2015-05-03 1 31 0 1
2 2015-05-03 2015-05-06 3 34 (should be 3) 0 0
2 2015-05-06 2015-05-16 10 44 (should be 13) 1 0
However, my code does not add up the TIME_TO_EVENT column correctly. The idea is to sum DAYS_BTWN until the ID changes, CENSORED = 1, or ENDPOINT = 1.
I think what I need is an additional column so that I can sum over an aggregate of ID and GROUPING, with output as follows:
ID TSPPLY_DT NEXT_DT DAYS_BTWN TIME_TO_EVENT CENSORED ENDPOINT GROUPING
----------------------------------------------------------------------------------------
1 2014-01-01 2014-01-10 10 10 0 0 A
1 2014-01-10 2014-01-21 11 21 0 0 A
1 2014-01-21 NULL NULL 21 1 0 A
2 2015-04-01 2015-04-30 30 30 0 0 A
2 2015-04-30 2015-05-03 1 31 0 1 A
2 2015-05-03 2015-05-06 3 3 0 0 B
2 2015-05-06 2015-05-16 10 13 1 0 B
So any ideas on how to create the GROUPING column? It would be something like: if the next row's ID is the same as the current row's, check CENSORED and ENDPOINT. If either = 1, change the grouping to a new value for the next row. Once a new ID is reached, reset the grouping to A (or whatever arbitrary value) and run the test again.

You need to use the DATEDIFF function, like this:
DATEDIFF(d, TSPPLY_DT, NEXT_DT) AS DAYS_BTWN
Now you don't need GROUP BY.
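DATEDIFF only recomputes DAYS_BTWN, so it does not by itself produce the grouping the question asks for. One possible way to build it, sketched below, is to count the event rows (CENSORED = 1 or ENDPOINT = 1) that come before the current row within the same ID, and then take a running sum per ID and group. The table variable @supply and the column name GRP are assumptions made for illustration (GROUPING is also the name of a built-in T-SQL function, so a different name avoids bracket quoting). The window frames used here need SQL Server 2012 or later.

```sql
-- Hypothetical table variable holding the sample rows from the question.
DECLARE @supply TABLE (ID INT, TSPPLY_DT DATE, NEXT_DT DATE,
                       DAYS_BTWN INT, CENSORED INT, ENDPOINT INT);
INSERT INTO @supply VALUES
    (1, '2014-01-01', '2014-01-10', 10,   0, 0),
    (1, '2014-01-10', '2014-01-21', 11,   0, 0),
    (1, '2014-01-21', NULL,         NULL, 1, 0),
    (2, '2015-04-01', '2015-04-30', 30,   0, 0),
    (2, '2015-04-30', '2015-05-03', 1,    0, 1),
    (2, '2015-05-03', '2015-05-06', 3,    0, 0),
    (2, '2015-05-06', '2015-05-16', 10,   1, 0);

WITH flagged AS (
    SELECT ID, TSPPLY_DT, NEXT_DT, DAYS_BTWN, CENSORED, ENDPOINT,
           -- Number of event rows strictly before this row for the same ID.
           -- The frame is empty on the first row, hence the COALESCE to 0.
           COALESCE(SUM(CASE WHEN CENSORED = 1 OR ENDPOINT = 1 THEN 1 ELSE 0 END)
                        OVER (PARTITION BY ID
                              ORDER BY TSPPLY_DT
                              ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),
                    0) AS GRP
    FROM @supply
)
SELECT ID, TSPPLY_DT, NEXT_DT, DAYS_BTWN,
       -- Running total of DAYS_BTWN that restarts with every new ID or group.
       SUM(DAYS_BTWN) OVER (PARTITION BY ID, GRP
                            ORDER BY TSPPLY_DT
                            ROWS UNBOUNDED PRECEDING) AS TIME_TO_EVENT,
       CENSORED, ENDPOINT, GRP
FROM flagged
ORDER BY ID, TSPPLY_DT;
```

On the sample rows this should number the groups 0, 0, 0 for ID 1 and 0, 0, 1, 1 for ID 2, which corresponds to the A/B labels in the desired output. The group number resets per ID because the window is partitioned by ID.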

Related

How to write the query to make report by month in sql

I have receiving and sending data for a whole year, and I want to build a monthly report from it under a first-in, first-out rule. That is, the first quantity received is sent out first ...
DECLARE @ReceivingTbl AS TABLE(Id INT,ProId int, RecQty INT,ReceivingDate DateTime)
INSERT INTO @ReceivingTbl
VALUES (1,1001,210,'2019-03-12'),
(2,1001,315,'2019-06-15'),
(3,2001,500,'2019-04-01'),
(4,2001,10,'2019-06-15'),
(5,1001,105,'2019-07-10')
DECLARE @SendTbl AS TABLE(Id INT,ProId int, SentQty INT,SendMonth int)
INSERT INTO @SendTbl
VALUES (1,1001,50,3),
(2,1001,100,4),
(3,1001,80,5),
(4,1001,80,6),
(5,2001,200,6)
SELECT * FROM @ReceivingTbl ORDER BY ProId,ReceivingDate
SELECT * FROM @SendTbl ORDER BY ProId,SendMonth
Id ProId RecQty ReceivingDate
1 1001 210 2019-03-12
2 1001 315 2019-06-15
5 1001 105 2019-07-10
3 2001 500 2019-04-01
4 2001 10 2019-06-15
Id ProId SentQty SendMonth
1 1001 50 3
2 1001 100 4
3 1001 80 5
4 1001 80 6
5 2001 200 6
--- And below is what I want:
Id ProId RecQty ReceivingDate ... Mar Apr May Jun
1 1001 210 2019-03-12 ... 50 100 60 0
2 1001 315 2019-06-15 ... 0 0 20 80
5 1001 105 2019-07-10 ... 0 0 0 0
3 2001 500 2019-04-01 ... 0 0 0 200
4 2001 10 2019-06-15 ... 0 0 0 0
Thanks!
Your question is not clear to me.
If you want a pure FIFO approach that ignores the rest of the data in the table, you necessarily have to order by Id, which in your example appears to follow insertion order.
The first row inserted should also be the first row returned by the SELECT (FIFO). To get that, use:
ORDER BY Id ASC
This places the lowest Id values first (1, 2, 3, ...).
To me, though, this doesn't make much sense. Pay attention to the meaning of the data you actually have: order by a date such as ReceivingDate, and perhaps filter by the month of that date. For example, for January data:
WHERE MONTH(ReceivingDate) = 1
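Ordering alone does not spread the sent quantities over the receipts. One way to sketch the FIFO allocation from the question is with running totals: each receipt covers a range of cumulative quantity per product, each monthly send covers another range, and the overlap of the two ranges is the quantity of that receipt consumed in that month. The CTE names rec, snd and alloc and the columns RecTo, SentTo and Qty are illustrative assumptions. The months are hard-coded to the four in the sample, and the sketch assumes SQL Server 2012 or later for the running-total window frame.

```sql
DECLARE @ReceivingTbl TABLE (Id INT, ProId INT, RecQty INT, ReceivingDate DATETIME);
INSERT INTO @ReceivingTbl VALUES
    (1, 1001, 210, '2019-03-12'), (2, 1001, 315, '2019-06-15'),
    (3, 2001, 500, '2019-04-01'), (4, 2001, 10,  '2019-06-15'),
    (5, 1001, 105, '2019-07-10');
DECLARE @SendTbl TABLE (Id INT, ProId INT, SentQty INT, SendMonth INT);
INSERT INTO @SendTbl VALUES
    (1, 1001, 50, 3), (2, 1001, 100, 4), (3, 1001, 80, 5),
    (4, 1001, 80, 6), (5, 2001, 200, 6);

WITH rec AS (   -- cumulative quantity received per product, oldest receipt first
    SELECT Id, ProId, RecQty,
           SUM(RecQty) OVER (PARTITION BY ProId ORDER BY ReceivingDate, Id
                             ROWS UNBOUNDED PRECEDING) AS RecTo
    FROM @ReceivingTbl
),
snd AS (        -- cumulative quantity sent per product, earliest month first
    SELECT ProId, SendMonth, SentQty,
           SUM(SentQty) OVER (PARTITION BY ProId ORDER BY SendMonth, Id
                              ROWS UNBOUNDED PRECEDING) AS SentTo
    FROM @SendTbl
),
alloc AS (      -- overlap of (RecTo - RecQty, RecTo] and (SentTo - SentQty, SentTo]
    SELECT r.Id, s.SendMonth,
           CASE WHEN s.SentTo < r.RecTo THEN s.SentTo ELSE r.RecTo END
         - CASE WHEN s.SentTo - s.SentQty > r.RecTo - r.RecQty
                THEN s.SentTo - s.SentQty ELSE r.RecTo - r.RecQty END AS Qty
    FROM rec AS r
    JOIN snd AS s
      ON s.ProId = r.ProId
     AND s.SentTo - s.SentQty < r.RecTo
     AND s.SentTo > r.RecTo - r.RecQty
)
SELECT r.Id, r.ProId, r.RecQty, r.ReceivingDate,
       SUM(CASE WHEN a.SendMonth = 3 THEN a.Qty ELSE 0 END) AS Mar,
       SUM(CASE WHEN a.SendMonth = 4 THEN a.Qty ELSE 0 END) AS Apr,
       SUM(CASE WHEN a.SendMonth = 5 THEN a.Qty ELSE 0 END) AS May,
       SUM(CASE WHEN a.SendMonth = 6 THEN a.Qty ELSE 0 END) AS Jun
FROM @ReceivingTbl AS r
LEFT JOIN alloc AS a ON a.Id = r.Id   -- keep receipts with nothing sent yet
GROUP BY r.Id, r.ProId, r.RecQty, r.ReceivingDate
ORDER BY r.ProId, r.ReceivingDate;
```

For receipt 1 (210 units of product 1001) the overlaps work out to 50, 100 and 60 for March to May, and the remaining 20 of May's send falls on receipt 2, in line with the table the question asks for. Note that this sketch, like the desired output, does not check whether a receipt had actually arrived by the month it is consumed in.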

When calculating using partition show 0

I am having trouble composing a particular query and would be grateful for help.
I wrote a query that looks like this:
select distinct
date, time,
case
when status_id = 5 and flag = 1 then 'autoprocessed'
when status_id = 5 and flag = 0 then 'manually processed'
else 'other state'
end as "state",
count (distinct doc_id) over (partition by date, time,
case when status_id = 5 and flag = 1 then 1
when status_id = 5 and flag = 0 then 2
else 0 end)
from
journal
Now my results look like this:
17.06.16 00:00:00 19 other state 2
17.06.16 00:00:00 19 autoprocessed 3
18.06.16 00:00:00 14 other state 1
20.06.16 00:00:00 14 other state 1
20.06.16 00:00:00 21 other state 2
21.06.16 00:00:00 15 other state 2
-----------------
but I would like to see them more like
17.06.16 00:00:00 19 other state 2
17.06.16 00:00:00 19 autoprocessed 3
17.06.16 00:00:00 19 manually processed 0
18.06.16 00:00:00 14 other state 1
18.06.16 00:00:00 14 autoprocessed 0
18.06.16 00:00:00 14 manually processed 0
----------------
Is it possible? I know that regular count can show zeros, even for groups. What about count with partition?
Edit: sample data
date flag status_id
17.06.16 19:32:45 0 5
17.06.16 19:33:39 0 5
17.06.16 19:34:31 0 23
17.06.16 19:37:25 0 5
17.06.16 19:42:19 0 1
18.06.16 14:33:19 0 1
20.06.16 14:35:55 0 10
20.06.16 21:24:22 0 2
20.06.16 21:24:47 0 2
-------------------
As you can see, on 17.06.16 at hour 19 there are 3 "autoprocessed" documents (flag = 0, status_id = 5), 2 documents in "other state" (status_id <> 5), and 0 "manually processed" documents (flag = 1, status_id = 5), which gives me the first 2 rows of the results:
17.06.16 00:00:00 19 other state 2
17.06.16 00:00:00 19 autoprocessed 3
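A window count can only return rows that exist, so a state with no documents never shows up. One possible way to get the zero rows is to build every combination of (date, time) slot and state first, then LEFT JOIN the labelled data onto it and use a plain grouped count. The sketch below assumes the journal table and column names from the query above; the CTE names and the columns state_name and doc_count are illustrative. It is written in the T-SQL style used elsewhere on this page, and other engines may need small changes (for example, Oracle before 23c requires FROM dual on the FROM-less SELECTs).

```sql
WITH labelled AS (      -- same CASE expression as in the question's query
    SELECT date, time, doc_id,
           CASE
               WHEN status_id = 5 AND flag = 1 THEN 'autoprocessed'
               WHEN status_id = 5 AND flag = 0 THEN 'manually processed'
               ELSE 'other state'
           END AS state_name
    FROM journal
),
slots AS (              -- every date/time slot that has any data at all
    SELECT DISTINCT date, time
    FROM journal
),
states AS (             -- the full list of states to report for each slot
    SELECT 'autoprocessed' AS state_name
    UNION ALL SELECT 'manually processed'
    UNION ALL SELECT 'other state'
)
SELECT s.date, s.time, st.state_name,
       -- COUNT ignores the NULL doc_id of unmatched rows, so it returns 0 there.
       COUNT(DISTINCT l.doc_id) AS doc_count
FROM slots AS s
CROSS JOIN states AS st
LEFT JOIN labelled AS l
       ON l.date = s.date
      AND l.time = s.time
      AND l.state_name = st.state_name
GROUP BY s.date, s.time, st.state_name
ORDER BY s.date, s.time, st.state_name;
```

Each slot then yields exactly three rows, one per state, with 0 where no document matched.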

sql - update statement from another table

Can anyone help with the following select statement?
I have 2 tables and I need to update table#1 with data from table#2
The field to update is 'amount'
Table#1
date amount
1 2015-05-01 0
2 2015-05-02 0
3 2015-05-03 0
4 2015-05-04 0
5 2015-05-05 0
6 2015-05-06 0
7 2015-05-07 0
8 2015-05-08 0
table#2
date amount
1 2015-05-01 12
2 2015-05-04 23
3 2015-05-07 30
The update statement should use table#2 to update table#1, and this is how the result should look:
table#1
date amount
1 2015-05-01 12
2 2015-05-02 12
3 2015-05-03 12
4 2015-05-04 23
5 2015-05-05 23
6 2015-05-06 23
7 2015-05-07 30
8 2015-05-08 30
A standard way to do this is using a correlated subquery:
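update t1
set amount = (select top 1 t2.amount
from t2
where t2.date <= t1.date
order by t2.date desc
);
The order by ... desc makes the subquery pick the most recent t2 row on or before each t1 date. The top 1 syntax is specific to SQL Server; other databases use limit or fetch first instead.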

return the count of row even if null sql server

I am trying to write a SQL query to get the shift count for each user.
I used this query:
SELECT
COUNT(s.id) AS count, s.user_id
FROM
sarcuser AS u
INNER JOIN
sarcshiftpointuser AS s ON s.user_id = u.id
INNER JOIN
sarcalllevel AS l ON l.id = u.levelid
INNER JOIN
sarcshiftpointtable AS t ON t.shift_id = s.shift_id AND s.table_id = t.table_id
WHERE
(s.shift_id + '' LIKE '2')
AND (CAST(s.xdate AS DATE) BETWEEN CAST(N'2014-01-01' AS DATE) AND CAST(N'2015-01-01' AS DATE))
AND (u.gender + '' LIKE N'%')
AND (u.levelid + '' LIKE N'%')
AND (s.point_id + '' LIKE '2')
GROUP BY
s.user_id
ORDER BY
count
It works very well, but there is a logic problem:
when a user does not appear in the shift, no count is returned for that user, and I need it to return 0.
For example :
user1 user2
shift1 2 2
shift2 5 0
shift3 6 10
but actually the code returns :
user1 user2
shift1 2 2
shift2 5 10
shift3 6
and that's wrong. How can I return the count even when it is zero, with this condition and these inner joins?
Sample for data in table :
sarcuser :
id firstname lastname gender levelid
52 samy sammour male 1
62 ibrahim jackob male 1
71 rebeca janson female 3
sarcalllevel :
id name
1 field leader
2 leader
3 paramdic
sarcshiftpointtable :
id shift_id table_id name_of_shift point_id
1 1 1 shift1 2
2 2 1 shift2 2
3 3 1 shift3 2
4 1 2 shift1 7
5 2 2 shift2 7
6 3 2 shift3 7
sarcshiftpointuser :
id point_id shift_id table_id user_id xdate
1 2 1 1 62 2014-01-05
2 2 1 1 0 2014-01-05
3 2 1 1 71 2014-01-05
4 2 2 1 0 2014-01-05
5 2 2 1 0 2014-01-05
6 2 2 1 52 2014-01-05
7 2 3 1 52 2014-01-05
8 2 3 1 62 2014-01-05
9 2 3 1 71 2014-01-05
10 2 1 1 71 2014-01-06
11 2 1 1 52 2014-01-06
12 2 1 1 0 2014-01-06
13 2 2 1 62 2014-01-06
14 2 2 1 0 2014-01-06
15 2 2 1 52 2014-01-06
16 2 3 1 62 2014-01-06
17 2 3 1 52 2014-01-06
18 2 3 1 71 2014-01-06
If I run this query 3 times, changing the shift each time, it should return:
52 62 71
shift1 1 2 2
shift2 2 1 0
shift3 2 2 2
User 71 does not appear in shift2 in sarcshiftpointuser,
so when I run the query it returns just two fields, not three. The count of 0 is not returned:
52 62 71
shift2 2 1
To be more specific:
I need to export this table to Excel, so when the 0 is not returned I get the wrong order and (logically) wrong values.
You will need to use a nested query with ISNULL (IFNULL is the MySQL name for the same idea; SQL Server uses ISNULL or COALESCE).
Take a look at this:
http://www.w3schools.com/sql/sql_isnull.asp
Something like,
ISNULL(user,0)
I think you are referring to a crosstab query. You can use PIVOT to return your result set. Please refer to the link below.
Sql Server 2008 Cross Tab Query.
If you give a few rows of sample data for the sarcuser, sarcshiftpointuser, sarcalllevel and sarcshiftpointtable tables, we can give you a better answer.
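The missing zeros most likely come from the INNER JOIN plus the WHERE filters on s, which drop every user who has no matching shift row. A minimal sketch of one way around that is to start from sarcuser, LEFT JOIN the shift rows, and move the shift filters into the ON clause so that unmatched users survive with a count of 0. The sketch uses the table and column names from the question; the alias shift_count is illustrative, and the joins to sarcalllevel and sarcshiftpointtable and the gender/level filters are left out for brevity, which is a simplifying assumption.

```sql
SELECT u.id AS user_id,
       -- COUNT(s.id) skips the NULLs produced for users without a matching row.
       COUNT(s.id) AS shift_count
FROM sarcuser AS u
LEFT JOIN sarcshiftpointuser AS s
       ON s.user_id = u.id
      -- Filters on s belong in ON; in WHERE they would turn this back
      -- into an inner join and drop the zero-count users again.
      AND s.shift_id = 2
      AND s.point_id = 2
      AND CAST(s.xdate AS DATE) BETWEEN CAST('2014-01-01' AS DATE)
                                    AND CAST('2015-01-01' AS DATE)
GROUP BY u.id
ORDER BY u.id;
```

With the sample rows above, shift 2 should give 2 for user 52, 1 for user 62 and 0 for user 71, matching the expected shift2 line. Filters on u (gender, levelid) can stay in the WHERE clause, since they restrict the users rather than the counted rows.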

Complex grouping - design / performance problem

WARNING : This is one BIG Question
I have a design problem that started simple, but in one step of growth has stumped me completely.
The simple version of reality has a nice flat fact table...
All names have been changed to protect the innocent
CREATE TABLE raw_data (
tier0_id INT, tier1_id INT, tier2_id INT, tier3_id INT,
metric0 INT, metric1 INT, metric2 INT, metric3 INT
)
The tierIDs relate to entities in a fixed depth tree. Such as a business hierarchy.
The metrics are just performance figures, such as number of frogs captured, or pigeons released.
In the reporting the kindly user would make selections to mean something like the following:
tier0_id's 34 and 55 - shown separately
all of tier1_id's - grouped together
all of tier2_id's - grouped together
all of tier3_id's - shown separately
metrics 2 and 3
This gives me the following type of query:
SELECT
CASE WHEN @t0_grouping = 1 THEN NULL ELSE tier0_id END AS tier0_id,
CASE WHEN @t1_grouping = 1 THEN NULL ELSE tier1_id END AS tier1_id,
CASE WHEN @t2_grouping = 1 THEN NULL ELSE tier2_id END AS tier2_id,
CASE WHEN @t3_grouping = 1 THEN NULL ELSE tier3_id END AS tier3_id,
SUM(metric2) AS metric2, SUM(metric3) AS metric3
FROM
raw_data
INNER JOIN tier0_values ON tier0_values.id = raw_data.tier0_id OR tier0_values.id IS NULL
INNER JOIN tier1_values ON tier1_values.id = raw_data.tier1_id OR tier1_values.id IS NULL
INNER JOIN tier2_values ON tier2_values.id = raw_data.tier2_id OR tier2_values.id IS NULL
INNER JOIN tier3_values ON tier3_values.id = raw_data.tier3_id OR tier3_values.id IS NULL
GROUP BY
CASE WHEN @t0_grouping = 1 THEN NULL ELSE tier0_id END,
CASE WHEN @t1_grouping = 1 THEN NULL ELSE tier1_id END,
CASE WHEN @t2_grouping = 1 THEN NULL ELSE tier2_id END,
CASE WHEN @t3_grouping = 1 THEN NULL ELSE tier3_id END
It's a nice hybrid of Dynamic SQL, and parametrised queries. And yes, I know, but SQL-CE makes people do strange things. Besides, that can be tidied up as and when the following change gets incorporated...
From now on, we need to be able to include NULLs in the different tiers. This will mean "applies to ALL entities in that tier".
For example, with the following very simplified data:
Activity WorkingTime ActiveTime BusyTime
1 0m 10m 0m
2 0m 15m 0m
3 0m 20m 0m
NULL 60m 0m 45m
WorkingTime never applies to an activity, so all the values go in with a NULL ID. ActiveTime, however, is about a specific activity, so it goes in with a legitimate ID. BusyTime is also against a NULL activity because it's the accumulation of all the ActiveTime.
If one were to report on this data, the NULL values -always- get included in every row, because the NULL -means- "applies to everything". The data would look like...
Activity WorkingTime ActiveTime BusyTime (BusyOnOtherActivities)
1 60m 10m 45m (45-10 = 35m)
2 60m 15m 45m (45-15 = 30m)
3 60m 20m 45m (45-20 = 25m)
1&2 60m 25m 45m (45-25 = 20m)
1&3 60m 30m 45m (45-30 = 15m)
2&3 60m 35m 45m (45-35 = 10m)
ALL 60m 45m 45m (45-45 = 0m)
Hopefully this example makes sense, because it's actually a multi-tiered hierarchy (as per the original example), and in every tier NULLs are allowed. So I'll try an example with 3 tiers...
t0_id | t1_id | t2_id | m1 | m2 | m3 | m4 | m5
1 3 10 | 0 10 0 0 0
1 4 10 | 0 15 0 0 0
1 5 10 | 0 20 0 0 0
1 NULL 10 | 60 0 45 0 0
2 3 10 | 0 5 0 0 0
2 5 10 | 0 10 0 0 0
2 6 10 | 0 15 0 0 0
2 NULL 10 | 50 0 30 0 0
1 3 11 | 0 7 0 0 0
1 4 11 | 0 8 0 0 0
1 5 11 | 0 9 0 0 0
1 NULL 11 | 30 0 24 0 0
2 3 11 | 0 8 0 0 0
2 5 11 | 0 10 0 0 0
2 6 11 | 0 12 0 0 0
2 NULL 11 | 40 0 30 0 0
NULL NULL 10 | 0 0 0 60 0
NULL NULL 11 | 0 0 0 60 0
NULL NULL NULL | 0 0 0 0 2
This would give many, many possible different output records in the reporting, but here are a few examples...
t0_id | t1_id | t2_id | m1 | m2 | m3 | m4 | m5
1 3 10 | 60 10 45 60 2
1 4 10 | 60 15 45 60 2
1 5 10 | 60 20 45 60 2
2 3 10 | 50 5 30 60 2
2 5 10 | 50 10 30 60 2
2 6 10 | 50 15 30 60 2
1 ALL 10 | 60 45 45 60 2
2 ALL 10 | 50 30 30 60 2
ALL 3 10 | 110 15 75 60 2
ALL 4 10 | 60 15 45 60 2
ALL 5 10 | 110 30 75 60 2
ALL 6 10 | 50 15 30 60 2
ALL 3 ALL | 180 30 129 120 2
ALL 4 ALL | 90 23 69 120 2
ALL 5 ALL | 180 49 129 120 2
ALL 6 ALL | 90 27 60 120 2
ALL ALL 10 | 110 129 129 60 2
ALL ALL 11 | 70 129 129 60 2
ALL ALL ALL | 180 129 129 120 2
1 3&4 ALL | 90 40 69 120 2
ALL 3&4 ALL | 180 53 129 120 2
As messy as this is to explain, it makes complete and logical sense in my head. I understand what is being asked, but for the life of me I can not seem to write a query for this that doesn't take excruciating amounts of time to execute.
So, how would you write such a query, and/or refactor the schema?
I appreciate that people will ask for examples of what I've done so far, but I'm eager to hear other people's uncorrupted ideas and advice first ;)
The problem looks more like a normalization activity. I would start by normalizing the table to something like this (you may need some more identity fields depending on your usage):
CREATE TABLE raw_data (
rawData_ID INT,
Activity_id INT,
metric0 INT)
I'd create a tiering table that looks something like this (tierplan allows for multiple groupings; if a tier_id has no parent to roll up under, then tierparent_id is NULL, which allows for recursion in the query):
CREATE TABLE tiers (
tierplan_id INT,
tier_id INT,
tierparent_id INT)
Finally, I'd create a table that relates tiers and Activities something like:
CREATE TABLE ActivTiers (
Activplan_id INT, --id on the table
tierplan_id INT, --tells what tierplan the raw_data falls under
rawdata_id INT) --this allows the ActivityId to be payload instead of identifier.
Queries off of this ought to be "not too difficult."
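Independently of the schema, the "NULL applies to everything" rule from the question can be sketched as a join in which a raw row matches an output key when each tier either equals the key's tier or is NULL. The sketch below covers only the case where every tier is shown separately, on a subset of the question's three-tier example. The table variable @raw_rows and the CTE name combos are illustrative assumptions, not part of the original schema.

```sql
-- Hypothetical subset of the three-tier sample rows from the question.
DECLARE @raw_rows TABLE (t0_id INT, t1_id INT, t2_id INT,
                         m1 INT, m2 INT, m3 INT, m4 INT, m5 INT);
INSERT INTO @raw_rows VALUES
    (1,    3,    10,   0, 10,  0,  0, 0),
    (1,    4,    10,   0, 15,  0,  0, 0),
    (1,    NULL, 10,  60,  0, 45,  0, 0),
    (NULL, NULL, 10,   0,  0,  0, 60, 0),
    (NULL, NULL, NULL, 0,  0,  0,  0, 2);

WITH combos AS (    -- fully specified tier combinations to report on
    SELECT DISTINCT t0_id, t1_id, t2_id
    FROM @raw_rows
    WHERE t0_id IS NOT NULL AND t1_id IS NOT NULL AND t2_id IS NOT NULL
)
SELECT c.t0_id, c.t1_id, c.t2_id,
       SUM(r.m1) AS m1, SUM(r.m2) AS m2, SUM(r.m3) AS m3,
       SUM(r.m4) AS m4, SUM(r.m5) AS m5
FROM combos AS c
JOIN @raw_rows AS r
  -- A NULL tier on the raw row acts as a wildcard for that tier.
  ON (r.t0_id = c.t0_id OR r.t0_id IS NULL)
 AND (r.t1_id = c.t1_id OR r.t1_id IS NULL)
 AND (r.t2_id = c.t2_id OR r.t2_id IS NULL)
GROUP BY c.t0_id, c.t1_id, c.t2_id
ORDER BY c.t0_id, c.t1_id, c.t2_id;
```

For the key (1, 3, 10) this picks up the specific row plus the three wildcard rows, giving 60, 10, 45, 60, 2 as in the question's expected output. Grouped tiers (ALL, 3&4) need more care, because each wildcard row must then be counted once per output group rather than once per matching key. OR conditions in a join also tend to defeat index seeks, which may be part of the slowness described.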