Getting the min() of a count(*) column - sql

I have a table called Vehicle_Location containing the columns (and more):
ID NUMBER(10)
SEQUENCE_NUMBER NUMBER(10)
TIME DATE
and I'm trying to get the min/max/avg number of records per day per id.
So far, I have
select id, to_char(time), count(*) as c
from vehicle_location
group by id, to_char(time) having id = 16
which gives me:
ID TO_CHAR(TIME) COUNT(*)
---------------------- ------------- ----------------------
16 11-05-31 159
16 11-05-23 127
16 11-06-03 56
So I'd like to get the min/max/avg of the count(*) column. I am using Oracle as my RDBMS.

I don't have an Oracle installation to test on, but you should be able to wrap the aggregates around your SELECT by using it as a subquery (derived table / inline view).
So it would be (UNTESTED!):
SELECT
AVG(s.c)
, MIN(s.c)
, MAX(s.c)
, s.ID
FROM
--Note this is just your query (Oracle does not accept AS before a table alias)
(select id, to_char(time), count(*) as c from vehicle_location group by id, to_char(time) having id = 16) s
GROUP BY s.ID
Here's some reading on it:
http://www.devshed.com/c/a/Oracle/Inserting-SubQueries-in-SELECT-Statements-in-Oracle/3/
EDIT: Note that selecting both the MIN and the MAX of an indexed column in a single query can be slow.
EDIT2: The min/max issue relates to how some RDBMSs (including Oracle) handle aggregates on indexed columns. It may not affect this particular query, but the premise is that the optimizer can easily use an index to find either the MIN or the MAX on its own, but not both at once, so a query that asks for both may not use the index effectively.
Here's some reading on it:
http://momendba.blogspot.com/2008/07/min-and-max-functions-in-single-query.html
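The subquery approach above can be tried without an Oracle instance. The following is a minimal sketch using Python's built-in sqlite3 module: the row data is made up, SQLite's date() stands in for Oracle's to_char(time), and the id filter is written as WHERE rather than HAVING. It is an illustration of the technique, not a tested Oracle query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vehicle_location (id INTEGER, sequence_number INTEGER, time TEXT)"
)

# Hypothetical data: vehicle 16 reports 3, 1 and 2 times on three different days.
rows = (
    [(16, n, "2011-05-23") for n in range(3)]
    + [(16, n, "2011-05-31") for n in range(1)]
    + [(16, n, "2011-06-03") for n in range(2)]
    + [(17, n, "2011-05-23") for n in range(5)]  # another vehicle, filtered out
)
conn.executemany("INSERT INTO vehicle_location VALUES (?, ?, ?)", rows)

# Inner query: one row per (id, day) with its record count.
# Outer query: aggregate those per-day counts for each id.
stats = conn.execute(
    """
    SELECT s.id, MIN(s.c), MAX(s.c), AVG(s.c)
    FROM (
        SELECT id, date(time) AS day, COUNT(*) AS c
        FROM vehicle_location
        WHERE id = 16
        GROUP BY id, date(time)
    ) s
    GROUP BY s.id
    """
).fetchall()
print(stats)
```

Each tuple holds the id followed by the minimum, maximum and average number of records per day.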

Related

Group by question in SQL Server, migration from MySQL

I failed to find a solution to my problem and would love your help.
(Post has been edited to contain only one question.)
How do I group by one column while selecting multiple columns?
In MySQL you can simply group by whatever you want and it will still select all the columns. For example, say I want the newest 100 transactions, grouped by Email (only the last transaction of each email).
In MySQL I would do this:
SELECT * FROM db.transactionlog
group by Email
order by TransactionLogId desc
LIMIT 100;
In SQL Server that's not possible. Googling a bit suggested wrapping each column I want in an aggregate as a hack. Couldn't that cause a mix of values (columns taken from different rows of the group)?
For example:
SELECT TOP(100)
Email,
MAX(ResultCode) as 'ResultCode',
MAX(Amount) as 'Amount',
MAX(TransactionLogId) as 'TransactionLogId'
FROM [db].[dbo].[transactionlog]
group by Email
order by TransactionLogId desc
TransactionLogId is the primary key, which is an identity column; I order by it to get the last inserted row.
I just want to know whether the ResultCode and Amount I get from such a query will be those of the last inserted row, and not the highest values among the grouped rows.
~Edit~
Sample data -
row1:
Email : test#email.com
ResultCode : 100
Amount : 27
TransactionLogId : 1
row2:
Email: test#email.com
ResultCode:50
Amount: 10
TransactionLogId: 2
Using the sample data above, my goal is to get the row details of TransactionLogId = 2.
What actually happens is that I get mixed values from the two rows: I do get TransactionLogId = 2, but with the ResultCode and Amount of the first row.
How do I avoid that?
Thanks.
You should first find out which is the latest transaction log by each email, then join back against the same table to retrieve the full record:
;WITH MaxTransactionByEmail AS
(
SELECT
Email,
MAX(TransactionLogId) as LatestTransactionLogId
FROM
[db].[dbo].[transactionlog]
group by
Email
)
SELECT
T.*
FROM
[db].[dbo].[transactionlog] AS T
INNER JOIN MaxTransactionByEmail AS M ON T.TransactionLogId = M.LatestTransactionLogId
You are currently getting mixed results because each aggregate function, like MAX(), considers all rows that share a particular value of Email, independently of the other columns. So the MAX() of the Amount column over the values 10 and 27 is 27, even though that row has the lower transaction log id.
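To see the difference on the question's two sample rows, here is a small sketch using Python's built-in sqlite3 module (SQLite stands in for SQL Server here, and the schema prefix is dropped; both are assumptions made for the sake of a runnable example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactionlog ("
    "TransactionLogId INTEGER PRIMARY KEY, Email TEXT, ResultCode INTEGER, Amount INTEGER)"
)
# The two sample rows from the question.
conn.executemany(
    "INSERT INTO transactionlog VALUES (?, ?, ?, ?)",
    [(1, "test#email.com", 100, 27), (2, "test#email.com", 50, 10)],
)

# Independent MAX() per column: each value may come from a different row.
mixed = conn.execute(
    "SELECT Email, MAX(ResultCode), MAX(Amount), MAX(TransactionLogId) "
    "FROM transactionlog GROUP BY Email"
).fetchall()

# Find the latest id per Email first, then join back to fetch that whole row.
latest = conn.execute(
    """
    SELECT t.Email, t.ResultCode, t.Amount, t.TransactionLogId
    FROM transactionlog AS t
    JOIN (
        SELECT Email, MAX(TransactionLogId) AS LatestTransactionLogId
        FROM transactionlog
        GROUP BY Email
    ) AS m ON t.TransactionLogId = m.LatestTransactionLogId
    """
).fetchall()

print(mixed)   # ResultCode and Amount come from row 1, the id from row 2
print(latest)  # all values come from row 2
```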
Another solution is using a ROW_NUMBER() window function to get a row-ranking by each Email, then just picking the first row:
;WITH TransactionsRanking AS
(
SELECT
T.*,
MostRecentTransactionLogRanking = ROW_NUMBER() OVER (
PARTITION BY
T.Email -- Start a different ranking for each different value of Email
ORDER BY
T.TransactionLogId DESC) -- Order the rows by the TransactionLogID descending
FROM
[db].[dbo].[transactionlog] AS T
)
SELECT
T.*
FROM
TransactionsRanking AS T
WHERE
T.MostRecentTransactionLogRanking = 1
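The ROW_NUMBER() variant can be combined with the "newest 100" requirement from the question. A minimal sketch follows, again using Python's sqlite3 as a stand-in for SQL Server. It assumes SQLite 3.25 or later for window functions, the third row is hypothetical, and LIMIT 100 plays the role of SQL Server's TOP (100).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactionlog ("
    "TransactionLogId INTEGER PRIMARY KEY, Email TEXT, ResultCode INTEGER, Amount INTEGER)"
)
conn.executemany(
    "INSERT INTO transactionlog VALUES (?, ?, ?, ?)",
    [
        (1, "test#email.com", 100, 27),
        (2, "test#email.com", 50, 10),
        (3, "other#email.com", 200, 5),  # hypothetical extra row
    ],
)

# Rank rows within each Email by descending id, keep rank 1 (the latest row),
# then return the newest of those rows first, capped at 100.
newest = conn.execute(
    """
    WITH TransactionsRanking AS (
        SELECT t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY t.Email
                   ORDER BY t.TransactionLogId DESC
               ) AS rn
        FROM transactionlog AS t
    )
    SELECT TransactionLogId, Email, ResultCode, Amount
    FROM TransactionsRanking
    WHERE rn = 1
    ORDER BY TransactionLogId DESC
    LIMIT 100
    """
).fetchall()
print(newest)
```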

Group BY Statement error to get unique records

I am new to SQL Server (I used to work with MySQL) and am trying to get records from a table using GROUP BY.
The table structure is given below:
SELECT S1.ID,S1.Template_ID,S1.Assigned_By,S1.Assignees,S1.Active FROM "Schedule" AS S1;
Output:
ID Template_ID Assigned_By Assignees Active
2 25 1 3 1
3 25 5 6 1
6 26 5 6 1
I need to get the values of all columns using the Group By statement below
SELECT Template_ID FROM "Schedule" WHERE "Assignees" IN(6, 3) GROUP BY "Template_ID";
Output:
Template_ID
25
26
I tried the following code to fetch the table using Group By, but it's fetching all the rows.
SELECT S1.ID,S1.Template_ID,S1.Assigned_By,S1.Assignees,S1.Active FROM "Schedule" AS S1 INNER JOIN(SELECT Template_ID FROM "Schedule" WHERE "Assignees" IN(6, 3) GROUP BY "Template_ID") AS S2 ON S2.Template_ID=S1.Template_ID
My output should look like this:
ID Template_ID Assigned_By Assignees Active
2 25 1 3 1
6 26 5 6 1
I was wondering whether I can get the ID column as well? I use the ID for editing the records on the web.
The query doesn't work as expected in MySQL either, except by accident.
Selecting nonaggregated columns that don't appear in GROUP BY is a MySQL extension rather than standard SQL, and MySQL 5.7 and later rejects it by default because the ONLY_FULL_GROUP_BY mode is enabled.
In earlier versions the result is nondeterministic.
The server is free to choose any value from each group, so unless they are the same, the values chosen are nondeterministic. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause.
This means there was no way to know which rows would be returned by this query:
SELECT S1.ID,S1.Template_ID,S1.Assigned_By,S1.Assignees,S1.Active
FROM "Schedule" AS S1
GROUP BY Template_ID;
To get deterministic results you need a way to rank rows, using ranking functions such as ROW_NUMBER(). MySQL introduced them in version 8; SQL Server has had them since SQL Server 2005. The syntax is the same for both databases:
WITH ranked AS
(
SELECT
ID, Template_ID, Assigned_By, Assignees, Active,
ROW_NUMBER() OVER (PARTITION BY Template_ID ORDER BY ID) AS RN
FROM Schedule
WHERE Assignees IN (6, 3)
)
SELECT ID, Template_ID, Assigned_By, Assignees, Active
FROM ranked
WHERE RN = 1
PARTITION BY Template_ID splits the result rows into separate partitions based on their Template_ID value. Within each partition, the rows are ordered according to the ORDER BY clause. Finally, ROW_NUMBER numbers the rows of each partition in that order, so RN = 1 selects the first row per Template_ID.
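The ranking approach can be checked against the three sample rows from the question. This is a sketch using Python's sqlite3 module in place of SQL Server or MySQL 8, assuming SQLite 3.25 or later for window functions. The column types are guesses, since the question does not show the table definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Schedule ("
    "ID INTEGER PRIMARY KEY, Template_ID INTEGER, Assigned_By INTEGER, "
    "Assignees INTEGER, Active INTEGER)"
)
# The sample rows shown in the question.
conn.executemany(
    "INSERT INTO Schedule VALUES (?, ?, ?, ?, ?)",
    [(2, 25, 1, 3, 1), (3, 25, 5, 6, 1), (6, 26, 5, 6, 1)],
)

# Number the rows of each Template_ID by ascending ID and keep the first one.
first_per_template = conn.execute(
    """
    WITH ranked AS (
        SELECT ID, Template_ID, Assigned_By, Assignees, Active,
               ROW_NUMBER() OVER (PARTITION BY Template_ID ORDER BY ID) AS RN
        FROM Schedule
        WHERE Assignees IN (6, 3)
    )
    SELECT ID, Template_ID, Assigned_By, Assignees, Active
    FROM ranked
    WHERE RN = 1
    ORDER BY ID
    """
).fetchall()
print(first_per_template)
```

Changing ORDER BY ID to ORDER BY ID DESC inside the OVER clause would keep the highest ID per template instead.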

How to distinguish rows in a database table on the basis of two or more columns while returning all columns in sql server

I want to pick distinct rows based on the values of two or more columns of a table, while still returning all columns of that table.
Example: I have this table:
DB Table
I want the result to be filtered on the basis of type and Number only. In the table above, type and Number are the same for the first and second rows, so one of them should be suppressed in the result:
txn  item  Discrip  Category  type  Number            Mode
60   2     Loyalty  L         6174  XXXXXXX1390       0
60   4     Visa     C         1600  XXXXXXXXXXXX4108  1
I have tried a subquery, so far without success. Please suggest what to try.
Thanks
You can do what you want with row_number():
select t.*
from (select t.*,
row_number() over (partition by type, number order by item) as seqnum
from t
) t
where seqnum = 1;

Get unique records from table avoiding all duplicates based on two key columns

I have a table Trial_tb with columns p_id, t_number and rundate.
Sample values:
p_id|t_number|rundate
=====================
111 |333     |1/7/2016
111 |333     |1/1/2016
222 |888     |1/8/2016
222 |444     |1/2/2016
666 |888     |1/6/2016
555 |777     |1/5/2016
p_id and t_number are key columns. I need to fetch rows such that the result contains no record whose p_id/t_number combination is duplicated. For example, 111|333 is duplicated and hence not valid. The query should fetch all records other than the first two.
I wrote the script below, but it fetches only the last record. :(
select rundate,p_id,t_number from
(
select rundate,p_id,t_number,
count(p_id) over (partition by p_id) PCnt,
count(t_number) over (partition by t_number) TCnt
from trialtb
)a
where a.PCnt=1 and a.TCnt=1
The HAVING clause is ideal for this job. HAVING lets you filter on aggregated records.
-- Finding unique combinations.
SELECT
p_id,
t_number
FROM
trialtb
GROUP BY
p_id,
t_number
HAVING
COUNT(*) = 1
;
This query returns combinations of p_id and t_number that occur only once.
If you want to include rundate you could add MAX(rundate) AS rundate to the select clause. Because you are only looking at unique occurrences the max or min would always be the same.
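A quick way to compare the two approaches on the question's sample values is the sketch below, which uses Python's sqlite3 module (assuming SQLite 3.25 or later for the window functions in the original attempt). Dates are stored as plain text, which is an assumption made only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trialtb (p_id INTEGER, t_number INTEGER, rundate TEXT)")
# The sample values from the question.
conn.executemany(
    "INSERT INTO trialtb VALUES (?, ?, ?)",
    [
        (111, 333, "1/7/2016"),
        (111, 333, "1/1/2016"),
        (222, 888, "1/8/2016"),
        (222, 444, "1/2/2016"),
        (666, 888, "1/6/2016"),
        (555, 777, "1/5/2016"),
    ],
)

# The question's attempt counts p_id and t_number separately, so any row that
# shares either value with another row is dropped.
separate_counts = conn.execute(
    """
    SELECT p_id, t_number FROM (
        SELECT p_id, t_number,
               COUNT(p_id) OVER (PARTITION BY p_id) AS PCnt,
               COUNT(t_number) OVER (PARTITION BY t_number) AS TCnt
        FROM trialtb
    ) a
    WHERE a.PCnt = 1 AND a.TCnt = 1
    """
).fetchall()

# Grouping by the pair counts each combination as a whole.
unique_pairs = conn.execute(
    """
    SELECT p_id, t_number, MAX(rundate) AS rundate
    FROM trialtb
    GROUP BY p_id, t_number
    HAVING COUNT(*) = 1
    ORDER BY p_id, t_number
    """
).fetchall()

print(separate_counts)
print(unique_pairs)
```

The first query keeps only 555|777, which matches the "only the last record" symptom; the grouped query keeps the four combinations that occur exactly once.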
Do you mean:
select
p_id,t_number
from
trialtb
group by
p_id,t_number
having
count(*) = 1
or do you need the run date too?
select
p_id,t_number,max(rundate)
from
trialtb
group by
p_id,t_number
having
count(*) = 1
Since you are only looking at combinations with a single row, using max or min should work fine.

How do I check if all posts from a joined table has the same value in a column?

I'm building a BI report for a client where there is a 1-n related join involved.
The joined table has a field for employee ID (EmplId).
The query that I've built for this report is supposed to give a 1 in its field "OneEmployee" if all the related posts have the same employee in the EmplId field, and null if they have different employees, e.g.:
TaskTrans
TaskTransHours > EmplId: 'John'
TaskTransHours > EmplId: 'John'
This should give a 1 in the said field in the query
TaskTrans
TaskTransHours > EmplId: 'John'
TaskTransHours > EmplId: 'George'
This should leave the said field blank
The idea is to create a field where a case function checks this and returns the correct value. But my problem is whether there is a way to check for this in SQL.
select not count(*) from your_table
where employee_id = GIVEN_ID
and your_field not in ( select min(your_field)
from your_table
where employee_id = GIVEN_ID);
Note: my first idea was to use LIMIT 1 in the inner query, but MySQL didn't like it, so MIN it was. The point is to pick any value, but only one. MIN should work, but the field should be indexed; then this query will execute rather fast, as only indexes would be used (obviously employee_id should also be indexed).
Note 2: Don't be confused by the NOT in front of count(*). You want 1 when no row differs; I count the rows that differ and return NOT count(*), which is 1 if the count is 0 and 0 otherwise.
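This looks like a job for a window COUNT(), provided your database allows DISTINCT in window aggregates (Oracle does; SQL Server and PostgreSQL do not):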
SELECT
…,
CASE COUNT(DISTINCT TaskTransHours.EmplId) OVER () WHEN 1 THEN 1 END
AS OneEmployee
FROM …
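Where DISTINCT is not available inside a window aggregate, the same check can be written as an ordinary grouped aggregate. The sketch below uses Python's sqlite3 module. The TaskId column linking hours to a task is hypothetical, since the question does not show the join key, and the rows mirror the John/George example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# TaskId is a made-up join key; EmplId comes from the question.
conn.execute("CREATE TABLE TaskTransHours (TaskId INTEGER, EmplId TEXT)")
conn.executemany(
    "INSERT INTO TaskTransHours VALUES (?, ?)",
    [(1, "John"), (1, "John"), (2, "John"), (2, "George")],
)

# One row per task: 1 when every related row has the same employee, else NULL.
one_employee = conn.execute(
    """
    SELECT TaskId,
           CASE COUNT(DISTINCT EmplId) WHEN 1 THEN 1 END AS OneEmployee
    FROM TaskTransHours
    GROUP BY TaskId
    ORDER BY TaskId
    """
).fetchall()
print(one_employee)
```

The grouped result can then be joined back to the parent table to attach OneEmployee to each report row. NULL values of EmplId are ignored by COUNT(DISTINCT ...), which may or may not be the desired behaviour.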