Very special kind of AVG statement - sql

Table example:
time a b c
-------------
12:00 1 0 1
12:00 2 3 1
13:00 3 2 1
13:00 3 3 3
14:00 1 1 1
How can I get AVG(a) over only the rows WHERE b != 0, together with AVG(c) over all rows, grouped by time? Is it possible to solve this with SQL only? I mean that the query should skip the 1st row when computing AVG(a), but still include it when computing AVG(c).

You can utilize CASE statements to get conditional aggregates:
SELECT AVG(CASE WHEN b != 0 THEN a END)
,AVG(c)
FROM YourTable
GROUP BY time
Demo: SQL Fiddle
This works because a value not captured by WHEN criteria in a CASE statement will default to NULL, and NULL values are ignored by aggregate functions.
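To see the NULL-skipping behaviour on the question's data, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. Using SQLite instead of the asker's (unstated) database is an assumption made only so the example is self-contained; the table name YourTable comes from the answer above.

```python
import sqlite3

# In-memory database holding the example table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (time TEXT, a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?)",
    [("12:00", 1, 0, 1), ("12:00", 2, 3, 1), ("13:00", 3, 2, 1),
     ("13:00", 3, 3, 3), ("14:00", 1, 1, 1)],
)

# CASE without ELSE yields NULL when b = 0, so AVG skips that row's a;
# AVG(c) still sees every row in the group.
rows = conn.execute(
    """
    SELECT time,
           AVG(CASE WHEN b != 0 THEN a END) AS avg_a,
           AVG(c) AS avg_c
    FROM YourTable
    GROUP BY time
    ORDER BY time
    """
).fetchall()

for row in rows:
    print(row)
```

For 12:00 the first row (b = 0) contributes to avg_c but not to avg_a, which is why avg_a comes out as 2 rather than 1.5.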

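SELECT AVG(a), AVG(c) FROM YourTable WHERE b != 0
GROUP BY time
Is this what you need? Note that the WHERE clause filters rows before aggregation, so the b = 0 rows are excluded from AVG(c) as well as from AVG(a).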

You might want to try something like
SELECT T.tTIME
, AVG(CASE WHEN T.B != 0 THEN T.A END)
, AVG(T.C)
FROM #T T
GROUP BY T.tTIME
The output is the following:
tTIME (No column name) (No column name)
12:00:00.0000000 2 1
13:00:00.0000000 3 2
14:00:00.0000000 1 1

Related

Select rows from a particular row to latest row if that particular row type exist

I want to achieve these two requirements using a single query. Currently I'm using two queries in the program and doing the processing in C#, something like this:
Pseudocode
select top 1 id from table where type=b
if result.row.count > 0 {var typeBid = row["id"]}
select * from table where id >= {typeBid}
else
select * from table
Req1: If records with type=b exist, the result should be the latest row with type=b and all rows added after it.
Table
--------------------
id type date
--------------------
1 b 2021-10-15
2 a 2021-11-16
3 b 2021-11-19
4 a 2021-12-02
5 c 2021-12-12
6 a 2021-12-16
Result
--------------------
id type date
--------------------
3 b 2021-11-19
4 a 2021-12-02
5 c 2021-12-12
6 a 2021-12-16
Req2: If NO record with type=b exists, the query should select all the records in the table.
Table
---------------------
id type date
---------------------
1 a 2021-10-15
2 a 2021-11-16
3 a 2021-11-19
4 a 2021-12-02
5 c 2021-12-12
6 a 2021-12-16
Result
--------------------
id type date
--------------------
1 a 2021-10-15
2 a 2021-11-16
3 a 2021-11-19
4 a 2021-12-02
5 c 2021-12-12
6 a 2021-12-16
with max_b_date as (select max(date) as date
from table1 where type = 'b')
select t1.*
from table1 t1
cross join max_b_date
where t1.date >= max_b_date.date
or max_b_date.date is null
(table is a SQL reserved word, https://en.wikipedia.org/wiki/SQL_reserved_words, so I used table1 as table name instead.)
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=bd05543a9712e27f01528708f10b209f
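As a quick check of the CTE approach against both requirements, here is a small sketch using Python's sqlite3 module. SQLite stands in for SQL Server here purely as an assumption for a self-contained run, and the helper name selected_ids is hypothetical.

```python
import sqlite3

QUERY = """
WITH max_b_date AS (SELECT MAX(date) AS date FROM table1 WHERE type = 'b')
SELECT t1.id
FROM table1 t1
CROSS JOIN max_b_date
WHERE t1.date >= max_b_date.date
   OR max_b_date.date IS NULL
ORDER BY t1.id
"""

def selected_ids(types):
    """Load the question's six rows with the given types and run the query."""
    dates = ["2021-10-15", "2021-11-16", "2021-11-19",
             "2021-12-02", "2021-12-12", "2021-12-16"]
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER, type TEXT, date TEXT)")
    conn.executemany(
        "INSERT INTO table1 VALUES (?, ?, ?)",
        [(i + 1, t, d) for i, (t, d) in enumerate(zip(types, dates))],
    )
    return [r[0] for r in conn.execute(QUERY)]

# Req1: rows with type b exist, so only the latest b row and later rows remain.
req1 = selected_ids(["b", "a", "b", "a", "c", "a"])
# Req2: no type b rows, so MAX(date) is NULL and every row is returned.
req2 = selected_ids(["a", "a", "a", "a", "c", "a"])
print(req1, req2)
```

The aggregate without GROUP BY always returns exactly one row, so the cross join never drops or duplicates rows; the IS NULL branch is what handles Req2.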
Please try this (it's somewhat involved, but it might be exactly what you are looking for):
select ab.* from
((select top 1 id, type, date from test where type = 'b' order by id desc)
union
select * from test where type != 'b') as ab
where ab.id >= (select COALESCE((select top 1 id from test where type = 'b' order by id desc), 0))
order by ab.id;
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=739eb6bfee787e5079e616bbf4e933b1
It looks like you can use an OR condition here:
SELECT
*
FROM
(
SELECT
*,
BCount = COUNT(CASE type WHEN 'b' THEN 1 END) OVER () -- count of records with type b across the whole table
FROM table1
)Q
WHERE
(
BCount > 0 AND id >= (SELECT MAX(id) FROM table1 WHERE type = 'b') -- if there are rows with type b, select per Req#1
)
OR
(
BCount = 0 -- if there are no rows with type b, select all
)

SQL Group by only correlative rows

Say I have the following table:
Code A B C Date ID
------------------------------
50 1 1 A 2018-01-08 150001
50 1 1 A 2018-01-15 165454
50 1 1 B 2018-02-01 184545
50 1 1 A 2018-02-02 195487
I need the sql query to output the following:
Code A B C Min(Date) Min(ID)
-------------------------------
50 1 1 A 2018-01-08 150001
50 1 1 B 2018-02-01 184545
50 1 1 A 2018-02-02 195487
If I use a standard GROUP BY, rows 1, 2 and 4 are grouped into one row, which is not what I want.
I want to select the row with MIN(Date) and MIN(ID) from each run of adjacent duplicate records, where duplicates are determined by the columns Code, A, B and C.
In this case the first two rows are duplicates, so I want the MIN() row,
while the 3rd and 4th rows are distinct.
Note that the database is Vertica 8.1, which is very similar to Oracle or PostgreSQL.
I think you would need the analytic function LAG(). Using this function, you can get the value of the previous row (or NULL if it's the first row itself). So you can check if the value on the previous row is different or not, and filter accordingly.
I'm not familiar with Vertica, but this should be the correct documentation for it: https://my.vertica.com/docs/7.0.x/HTML/Content/Authoring/SQLReferenceManual/Functions/Analytic/LAGAnalytic.htm
Please try the query below, it should do it:
SELECT l.Code, l.A, l.B, l.C, l.Date, l.ID
FROM (SELECT t.*,
LAG(t.C, 1) OVER (PARTITION BY t.Code, t.A ORDER BY t.Date) prev_val
FROM table_1 t) l
WHERE l.C != l.prev_val
OR l.prev_val IS NULL
ORDER BY l.Code, l.A, l.Date
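The LAG() filter can be tried on the question's four rows with the sketch below. It uses Python's sqlite3 module rather than Vertica, which is an assumption for the sake of a runnable example; it needs an SQLite build with window functions (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_1 "
    "(Code INTEGER, A INTEGER, B INTEGER, C TEXT, Date TEXT, ID INTEGER)"
)
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?, ?, ?, ?, ?)",
    [(50, 1, 1, "A", "2018-01-08", 150001),
     (50, 1, 1, "A", "2018-01-15", 165454),
     (50, 1, 1, "B", "2018-02-01", 184545),
     (50, 1, 1, "A", "2018-02-02", 195487)],
)

# Keep a row only when C differs from the previous row's C (by Date),
# or when there is no previous row in the partition.
rows = conn.execute(
    """
    SELECT l.C, l.Date, l.ID
    FROM (SELECT t.*,
                 LAG(t.C, 1) OVER (PARTITION BY t.Code, t.A ORDER BY t.Date) AS prev_val
          FROM table_1 t) l
    WHERE l.C != l.prev_val
       OR l.prev_val IS NULL
    ORDER BY l.Code, l.A, l.Date
    """
).fetchall()

for row in rows:
    print(row)
```

Because each run is ordered by Date, the first row of a run is also the one carrying the earliest Date of that run, so no separate MIN() is needed for this data.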

Conditional Row Deleting in SQL

I have a table that contains 4 columns. I need to remove some of the rows based on the Code and ID columns. A code of 1 initiates the process I'm trying to track and a code of 2 terminates it. I would like to remove all rows for a specific ID when a code of 2 comes after a code of 1 and there is not an additional code 1. For example, my current data set looks like this:
Code Deposit Date ID
1 $100 3/2/2016 5
2 $0 3/1/2016 5
1 $120 2/8/2016 5
1 $120 3/22/2016 4
2 $70 2/8/2016 3
1 $120 1/3/2016 3
2 $0 6/15/2015 2
1 $120 3/22/2016 2
1 $50 8/15/2015 1
2 $200 8/1/2015 1
After I run my script I would like it to look like this:
Code Deposit Date ID
1 $100 3/2/2016 5
2 $0 3/1/2016 5
1 $120 2/8/2016 5
1 $120 3/22/2016 4
1 $50 8/15/2015 1
2 $200 8/1/2015 1
In all I have about 150,000 IDs in my actual table, but this is the general idea.
You can get the ids using logic like this:
select t.id
from t
group by t.id
having max(case when code = 2 then date end) > min(case when code = 1 then date end) and -- code 2 after code 1
max(case when code = 2 then date end) > max(case when code = 1 then date end) -- no code 1 after code2
It is then easy enough to incorporate this into a query to get the rest of the details:
select t.*
from t
where t.id not in (select t.id
from t
group by t.id
having max(case when code = 2 then date end) > min(case when code = 1 then date end) and -- code 2 after code 1
max(case when code = 2 then date end) > max(case when code = 1 then date end)
);
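The HAVING logic can be exercised with the sketch below, again using Python's sqlite3 module as a stand-in database (an assumption). Dates are rewritten in ISO format so that text comparison orders them correctly, and only IDs 5, 4, 3 and 1 from the question are loaded; ID 2 is left out because, with the dates as posted, its code 2 row precedes its code 1 row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (code INTEGER, deposit INTEGER, date TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 100, "2016-03-02", 5), (2, 0, "2016-03-01", 5), (1, 120, "2016-02-08", 5),
     (1, 120, "2016-03-22", 4),
     (2, 70, "2016-02-08", 3), (1, 120, "2016-01-03", 3),
     (1, 50, "2015-08-15", 1), (2, 200, "2015-08-01", 1)],
)

# IDs whose latest code 2 comes after both the first and the last code 1.
REMOVE_IDS = """
    SELECT id FROM t GROUP BY id
    HAVING MAX(CASE WHEN code = 2 THEN date END) > MIN(CASE WHEN code = 1 THEN date END)
       AND MAX(CASE WHEN code = 2 THEN date END) > MAX(CASE WHEN code = 1 THEN date END)
"""
removed_ids = [r[0] for r in conn.execute(REMOVE_IDS)]

# The remaining IDs, using the same subquery inside NOT IN.
kept_ids = [r[0] for r in conn.execute(
    f"SELECT DISTINCT id FROM t WHERE id NOT IN ({REMOVE_IDS}) ORDER BY id DESC"
)]
print(removed_ids, kept_ids)
```

An ID with no code 2 row (ID 4 here) yields NULL on the left side of both comparisons, so the HAVING condition is not satisfied and the ID is kept.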
The approach I took was to add up the Code per each ID. If it equals 3 exactly, it should be removed.
;WITH keepID as (
Select
ID
,SUM(code) as 'sumCode'
From #testInit
Group by ID
HAVING SUM(code) <> 3
)
Select *
From #testInit
Where ID IN (Select ID from keepID)
Your post showed keeping ID = 1, which does not seem to fit the criteria. Are you sure you would be keeping ID = 1? It only has 2 records, one with a code of 1 and one with a code of 2, which add up to 3, so it would be removed.
I just showed the logic of the approach; let me know if you need help with the delete code.
delete from t
where t.id in
(select B.id
from
(select id, max(date) as date from t where code = 1 group by id) as A,
(select id, max(date) as date from t where code = 2 group by id) as B
where A.id = B.id and B.date > A.date)
Explanation: (select id, max(date) as date from t where code = 1 group by id) as A
fetches the latest date with code 1 for each id.
(select id, max(date) as date from t where code = 2 group by id) as B
fetches the latest date with code 2 for each id.
select B.id ... where A.id = B.id and B.date > A.date selects all the ids for which the latest code 2 date is later than the latest code 1 date.

Count of distinct values per day, excluding recurring until value changes

I'm really struggling with how to explain this, so I'll try to give you the format of the table below, and the desired outcome.
I have a table which contains a uniqueID, date, userID and result. I'm trying to count the number of results that are 'Correct' per day, but I only want to count unique occurrences based on the userID column. I then want to exclude any further occurrences of 'Correct' for that particular userID, until the result for the userID changes to 'Success'.
UID Date UserID Result
1 01/01/2014 5 Correct
2 01/01/2014 5 Correct
3 02/01/2014 4 Correct
4 03/01/2014 4 Correct
5 03/01/2014 5 Incorrect
6 03/01/2014 4 Incorrect
7 05/01/2014 5 Correct
8 07/01/2014 4 Correct
9 08/01/2014 5 Success
10 08/01/2014 4 Success
Based on the above data, I'd expect to see the below:
Date Correct Success
01/01/2014 1 0
02/01/2014 1 0
03/01/2014 0 0
05/01/2014 0 0
07/01/2014 0 0
08/01/2014 0 2
Can anyone help? I'm using SQL Server 2008
Use count(distinct) with case:
select date,
count(distinct case when result = 'Correct' then UserId end) as Correct,
count(distinct case when result = 'Success' then UserId end) as Success
from data d
group by date
order by date;
EDIT:
The above counts correct on all occurrences. If you only want the first one to be counted:
select date,
count(case when result = 'Correct' and seqnum = 1 then UserId end) as Correct,
count(case when result = 'Success' and seqnum = 1 then UserId end) as Success
from (select d.*,
row_number() over (partition by UserId, result order by Uid) as seqnum
from data d
) d
group by date
order by date;
In this case, the distinct is unnecessary.
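Here is a minimal sketch of the row_number() variant, grouped by date, run against the question's ten rows with Python's sqlite3 module. SQLite (3.25 or later for window functions) stands in for SQL Server 2008 as an assumption, and the dates are assumed to be dd/mm/yyyy and are rewritten in ISO format so they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (Uid INTEGER, date TEXT, UserId INTEGER, result TEXT)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", [
    (1, "2014-01-01", 5, "Correct"), (2, "2014-01-01", 5, "Correct"),
    (3, "2014-01-02", 4, "Correct"), (4, "2014-01-03", 4, "Correct"),
    (5, "2014-01-03", 5, "Incorrect"), (6, "2014-01-03", 4, "Incorrect"),
    (7, "2014-01-05", 5, "Correct"), (8, "2014-01-07", 4, "Correct"),
    (9, "2014-01-08", 5, "Success"), (10, "2014-01-08", 4, "Success"),
])

# seqnum = 1 marks the first row of each (UserId, result) pair,
# so only that first occurrence is counted on its date.
rows = conn.execute(
    """
    SELECT date,
           COUNT(CASE WHEN result = 'Correct' AND seqnum = 1 THEN UserId END) AS Correct,
           COUNT(CASE WHEN result = 'Success' AND seqnum = 1 THEN UserId END) AS Success
    FROM (SELECT d.*,
                 ROW_NUMBER() OVER (PARTITION BY UserId, result ORDER BY Uid) AS seqnum
          FROM data d) d
    GROUP BY date
    ORDER BY date
    """
).fetchall()

for row in rows:
    print(row)
```

This only checks the sample data. It does not exercise a user who returns to 'Correct' after a 'Success', since the numbering here never restarts within a (UserId, result) pair.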

Count number of occurrences for each unique value [duplicate]

Basically I have a table similar to this:
time.....activities.....length
13:00........3.............1
13:15........2.............2
13:00........3.............2
13:30........1.............1
13:45........2.............3
13:15........5.............1
13:45........1.............3
13:15........3.............1
13:45........3.............2
13:45........1.............1
13:15........3.............3
A couple of notes:
Activities can be between 1 and 5
Length can be between 1 and 3
The query should return:
time........count
13:00.........2
13:15.........2
13:30.........0
13:45.........1
Basically for each unique time I want a count of the number of rows where the activities value is 3.
So then I can say:
At 13:00 there were X amount of activity 3s.
At 13:45 there were Y amount of activity 3s.
Then I want a count for activity 1s, 2s, 4s and 5s, so I can plot the distribution for each unique time.
Yes, you can use GROUP BY:
SELECT time,
activities,
COUNT(*)
FROM MyTable
GROUP BY time, activities;
select time, coalesce(count(case when activities = 3 then 1 end), 0) as count
from MyTable
group by time
SQL Fiddle Example
Output:
| TIME | COUNT |
-----------------
| 13:00 | 2 |
| 13:15 | 2 |
| 13:30 | 0 |
| 13:45 | 1 |
If you want to count all the activities in one query, you can do:
select time,
coalesce(count(case when activities = 1 then 1 end), 0) as count1,
coalesce(count(case when activities = 2 then 1 end), 0) as count2,
coalesce(count(case when activities = 3 then 1 end), 0) as count3,
coalesce(count(case when activities = 4 then 1 end), 0) as count4,
coalesce(count(case when activities = 5 then 1 end), 0) as count5
from MyTable
group by time
The advantage of this over grouping by activities is that it will return a count of 0 even if there are no activities of that type for that time segment.
Of course, this will not return rows for time segments with no activities of any type. If you need that, you'll need to use a left join with table that lists all the possible time segments.
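The left join idea might look like the sketch below, run through Python's sqlite3 module. The TimeSlots table and its 14:00 entry are hypothetical, added only to show a time segment with no activity rows; MyTable holds the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (time TEXT, activities INTEGER, length INTEGER)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", [
    ("13:00", 3, 1), ("13:15", 2, 2), ("13:00", 3, 2), ("13:30", 1, 1),
    ("13:45", 2, 3), ("13:15", 5, 1), ("13:45", 1, 3), ("13:15", 3, 1),
    ("13:45", 3, 2), ("13:45", 1, 1), ("13:15", 3, 3),
])

# Hypothetical lookup table of every time segment; 14:00 has no activity rows.
conn.execute("CREATE TABLE TimeSlots (time TEXT)")
conn.executemany(
    "INSERT INTO TimeSlots VALUES (?)",
    [("13:00",), ("13:15",), ("13:30",), ("13:45",), ("14:00",)],
)

# Driving the query from TimeSlots keeps segments that have no matching rows;
# COUNT ignores the NULLs produced by the CASE and by the unmatched join.
rows = conn.execute(
    """
    SELECT s.time,
           COUNT(CASE WHEN m.activities = 3 THEN 1 END) AS count3
    FROM TimeSlots s
    LEFT JOIN MyTable m ON m.time = s.time
    GROUP BY s.time
    ORDER BY s.time
    """
).fetchall()

for row in rows:
    print(row)
```

Both 13:30 (rows exist, but none with activity 3) and 14:00 (no rows at all) come back with a count of 0.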
If I am understanding your question correctly, would this work? (You will have to replace these with your actual column and table names.)
SELECT time_col, COUNT(time_col) As Count
FROM time_table
GROUP BY time_col
WHERE activity_col = 3
You should change the query to:
SELECT time_col, COUNT(time_col) As Count
FROM time_table
WHERE activity_col = 3
GROUP BY time_col
This will work correctly.