Get two most frequent data from SQL tbl? - sql

i have a tbl call it tbl_test in which continously data are inserting and it has approx 10^6 records at a time it has colums
Acquire_Id (Value between 1 to 20 ),
Status_Msg(value between 'A' to 'Z'),
Status_Code(value between 1 to 26)
There is one to one mapping b/w Status_Msg and Status_Code
Now i want to get two most frequent status_msg and Staus_Code Count for each acquirer if they are present in table
Query should be Cost Saving

Most databases support the ANSI standard window functions. You can get what you want using row_number() (or rank() or dense_rank(), depending on how ties are returned) after aggregating the values.
The following returns exactly two rows for each acquirer (even if there are ties).
select t.*
from (select t.acquire_id, t.status_msg, t.status_code, count(*) as cnt,
row_number() over (partition by t.acquire_id order by count(*) desc) as seqnum
from tbl_test t
group by t.acquire_id, t.status_msg, t.status_code
) t
where seqnum <= 2;

Related

How to get nth record in a sql server table without changing the order?(sql server)

for example i have data like this(sql server)
id name
4 anu
3 lohi
1 pras
2 chand
i want 2nd record in a table (means 3 lohi)
if i use row_number() function its changes the order and i get (2 chand)
i want 2nd record from table data
can anyonr please give me the query fro above scenario
There is no such thing as the nth row in a table. And for a simple reason: SQL tables represent unordered sets (technically multi-sets because they allow duplicates).
You can do what you want use offset/fetch:
select t.*
from t
order by id desc
offset 1 fetch first 1 row only;
This assumes that the descending ordering on id is what you want, based on your example data.
You can also do this using row_number():
select t.*
from (select t.*,
row_number() over (order by id desc) as seqnum
from t
) t
where seqnum = 2;
I should note that that SQL Server allows you to assign row_number() without having an effective sort using something like this:
select t.*
from (select t.*,
row_number() over (order by (select NULL)) as seqnum
from t
) t
where seqnum = 2;
However, this returns an arbitrary row. There is no guarantee it returns the same row each time it runs, nor that the row is "second" in any meaningful use of the term.

How to get first and last record from same group in SQL Server?

I'm a new SQL user and need help.
Let's say I have a vehicle number 123 and I've traveled from Region 3 to final destination Region 4. In between, I've visited Region 1 and 5 as well but that's not my concern.
Simple example would be as follow.
Original Table
Desired Output
How can this be done in SQL query?
You have a sequence number so you can use some form of aggregation. One method is:
select records,
max(case when sequence = 1 then fromregion end) as fromregion,
max(case when sequence = maxsequence then toregion) as toregion
from (select t.*, max(sequence) over (partition by records) as max_sequence
from t
) t
group by records;
Unfortunately, SQL Server doesn't offer "first()" or "last()" as aggregation functions. But it does support first_value() as a window function. This allows you to do the logic without a subquery:
select distinct records,
first_value(fromRegion) over (partition by records order by sequence) as fromregion,
first_value(toRegion) over (partition by records order by sequence desc) as toregion
from t;

Producing n rows per group

It is known that GROUP BY produces one row per group. I want to produce multiple rows per group. The particular use case is, for example, selecting two cheapest offerings for each item.
It is trivial for two or three elements in the group:
select type, variety, price
from fruits
where price = (select min(price) from fruits as f where f.type = fruits.type)
or price = (select min(price) from fruits as f where f.type = fruits.type
and price > (select min(price) from fruits as f2 where f2.type = fruits.type));
(Select n rows per group in mysql)
But I am looking for a query that can show n rows per group, where n is arbitrarily large. In other words, a query that displays 5 rows per group should be convertible to a query that displays 7 rows per group by just replacing some constants in it.
I am not constrained to any DBMS, so I am interested in any solution that runs on any DBMS. It is fine if it uses some non-standard syntax.
For any database that supports analytic functions\ window functions, this is relatively easy
select *
from (select type,
variety,
price,
rank() over ([partition by something]
order by price) rnk
from fruits) rank_subquery
where rnk <= 3
If you omit the [partition by something], you'll get the top three overall rows. If you want the top three for each type, you'd partition by type in your rank() function.
Depending on how you want to handle ties, you may want to use dense_rank() or row_number() rather than rank(). If two rows tie for first, using rank, the next row would have a rnk of 3 while it would have a rnk of 2 with dense_rank. In both cases, both tied rows would have a rnk of 1. row_number would arbitrarily give one of the two tied rows a rnk of 1 and the other a rnk of 2.
To save anyone looking some time, at the time of this writing, apparently this won't work because https://dev.mysql.com/doc/refman/5.7/en/subquery-restrictions.html.
I've never been a fan of correlated subqueries as most uses I saw for them could usually be written more simply, but I think this has changed by mind... a little. (This is for MySQL.)
SELECT `type`, `variety`, `price`
FROM `fruits` AS f2
WHERE `price` IN (
SELECT DISTINCT `price`
FROM `fruits` AS f1
WHERE f1.type = f2.type
ORDER BY `price` ASC
LIMIT X
)
;
Where X is the "arbitrary" value you wanted.
If you know how you want to limit further in cases of duplicate prices, and the data permits such limiting ...
SELECT `type`, `variety`, `price`
FROM `fruits` AS f2
WHERE (`price`, `other_identifying_criteria`) IN (
SELECT DISTINCT `price`, `other_identifying_criteria`
FROM `fruits` AS f1
WHERE f1.type = f2.type
ORDER BY `price` ASC, `other_identifying_criteria` [ASC|DESC]
LIMIT X
)
;
"greatest N per group problems" can easily be solved using window functions:
select type, variety, price
from (
select type, variety, price,
dense_rank() over (partition by type) order by price as rnk
from fruits
) t
where rnk <= 5;
Windows functions only work on SQL Server 2012 and above. Try this out:
SQL Server 2005 and Above Solution
DECLARE #yourTable TABLE(Category VARCHAR(50), SubCategory VARCHAR(50), price INT)
INSERT INTO #yourTable
VALUES ('Meat','Steak',1),
('Meat','Chicken Wings',3),
('Meat','Lamb Chops',5);
DECLARE #n INT = 2;
SELECT DISTINCT Category,CA.SubCategory,CA.price
FROM #yourTable A
CROSS APPLY
(
SELECT TOP (#n) SubCategory,price
FROM #yourTable B
WHERE A.Category = B.Category
ORDER BY price DESC
) CA
Results in two highest priced subCategories per Category:
Category SubCategory price
------------------------- ------------------------- -----------
Meat Chicken Wings 3
Meat Lamb Chops 5

Selecting 5 Most Recent Records Of Each Group

The below statement retrieves the top 2 records within each group in SQL Server. It works correctly, however as you can see it doesn't scale at all. I mean that if I wanted to retrieve the top 5 or 10 records instead of just 2, you can see how this query statement would grow very quickly.
How can I convert this query into something that returns the same records, but that I can quickly change it to return the top 5 or 10 records within each group instead, rather than just 2? (i.e. I want to just tell it to return the top 5 within each group, rather than having 5 unions as the below format would require)
Thanks!
WITH tSub
as (SELECT CustomerID,
TransactionTypeID,
Max(EventDate) as EventDate,
Max(TransactionID) as TransactionID
FROM Transactions
WHERE ParentTransactionID is NULL
Group By CustomerID,
TransactionTypeID)
SELECT *
from tSub
UNION
SELECT t.CustomerID,
t.TransactionTypeID,
Max(t.EventDate) as EventDate,
Max(t.TransactionID) as TransactionID
FROM Transactions t
WHERE t.TransactionID NOT IN (SELECT tSub.TransactionID
FROM tSub)
and ParentTransactionID is NULL
Group By CustomerID,
TransactionTypeID
Use Partition by to solve this type problem
select values from
(select values ROW_NUMBER() over (PARTITION by <GroupColumn> order by <OrderColumn>)
as rownum from YourTable) ut where ut.rownum<=5
This will partitioned the result on the column you wanted order by EventDate Column then then select those entry having rownum<=5. Now you can change this value 5 to get the top n recent entry of each group.

How do I use ROW_NUMBER()?

I want to use the ROW_NUMBER() to get...
To get the max(ROW_NUMBER()) --> Or i guess this would also be the count of all rows
I tried doing:
SELECT max(ROW_NUMBER() OVER(ORDER BY UserId)) FROM Users
but it didn't seem to work...
To get ROW_NUMBER() using a given piece of information, ie. if I have a name and I want to know what row the name came from.
I assume it would be something similar to what I tried for #1
SELECT ROW_NUMBER() OVER(ORDER BY UserId) From Users WHERE UserName='Joe'
but this didn't work either...
Any Ideas?
For the first question, why not just use?
SELECT COUNT(*) FROM myTable
to get the count.
And for the second question, the primary key of the row is what should be used to identify a particular row. Don't try and use the row number for that.
If you returned Row_Number() in your main query,
SELECT ROW_NUMBER() OVER (Order by Id) AS RowNumber, Field1, Field2, Field3
FROM User
Then when you want to go 5 rows back then you can take the current row number and use the following query to determine the row with currentrow -5
SELECT us.Id
FROM (SELECT ROW_NUMBER() OVER (ORDER BY id) AS Row, Id
FROM User ) us
WHERE Row = CurrentRow - 5
Though I agree with others that you could use count() to get the total number of rows, here is how you can use the row_count():
To get the total no of rows:
with temp as (
select row_number() over (order by id) as rownum
from table_name
)
select max(rownum) from temp
To get the row numbers where name is Matt:
with temp as (
select name, row_number() over (order by id) as rownum
from table_name
)
select rownum from temp where name like 'Matt'
You can further use min(rownum) or max(rownum) to get the first or last row for Matt respectively.
These were very simple implementations of row_number(). You can use it for more complex grouping. Check out my response on Advanced grouping without using a sub query
If you need to return the table's total row count, you can use an alternative way to the SELECT COUNT(*) statement.
Because SELECT COUNT(*) makes a full table scan to return the row count, it can take very long time for a large table. You can use the sysindexes system table instead in this case. There is a ROWS column that contains the total row count for each table in your database. You can use the following select statement:
SELECT rows FROM sysindexes WHERE id = OBJECT_ID('table_name') AND indid < 2
This will drastically reduce the time your query takes.
You can use this for get first record where has clause
SELECT TOP(1) * , ROW_NUMBER() OVER(ORDER BY UserId) AS rownum
FROM Users
WHERE UserName = 'Joe'
ORDER BY rownum ASC
ROW_NUMBER() returns a unique number for each row starting with 1. You can easily use this by simply writing:
ROW_NUMBER() OVER (ORDER BY 'Column_Name' DESC) as ROW_NUMBER
May not be related to the question here. But I found it could be useful when using ROW_NUMBER -
SELECT *,
ROW_NUMBER() OVER (ORDER BY (SELECT 100)) AS Any_ID
FROM #Any_Table
select
Ml.Hid,
ml.blockid,
row_number() over (partition by ml.blockid order by Ml.Hid desc) as rownumber,
H.HNAME
from MIT_LeadBechmarkHamletwise ML
join [MT.HAMLE] h on ML.Hid=h.HID
SELECT num, UserName FROM
(SELECT UserName, ROW_NUMBER() OVER(ORDER BY UserId) AS num
From Users) AS numbered
WHERE UserName='Joe'
You can use Row_Number for limit query result.
Example:
SELECT * FROM (
select row_number() OVER (order by createtime desc) AS ROWINDEX,*
from TABLENAME ) TB
WHERE TB.ROWINDEX between 0 and 10
--
With above query, I will get PAGE 1 of results from TABLENAME.
If you absolutely want to use ROW_NUMBER for this (instead of count(*)) you can always use:
SELECT TOP 1 ROW_NUMBER() OVER (ORDER BY Id)
FROM USERS
ORDER BY ROW_NUMBER() OVER (ORDER BY Id) DESC
Need to create virtual table by using WITH table AS, which is mention in given Query.
By using this virtual table, you can perform CRUD operation w.r.t row_number.
QUERY:
WITH table AS
-
(SELECT row_number() OVER(ORDER BY UserId) rn, * FROM Users)
-
SELECT * FROM table WHERE UserName='Joe'
-
You can use INSERT, UPDATE or DELETE in last sentence by in spite of SELECT.
SQL Row_Number() function is to sort and assign an order number to data rows in related record set. So it is used to number rows, for example to identify the top 10 rows which have the highest order amount or identify the order of each customer which is the highest amount, etc.
If you want to sort the dataset and number each row by seperating them into categories we use Row_Number() with Partition By clause. For example, sorting orders of each customer within itself where the dataset contains all orders, etc.
SELECT
SalesOrderNumber,
CustomerId,
SubTotal,
ROW_NUMBER() OVER (PARTITION BY CustomerId ORDER BY SubTotal DESC) rn
FROM Sales.SalesOrderHeader
But as I understand you want to calculate the number of rows of grouped by a column. To visualize the requirement, if you want to see the count of all orders of the related customer as a seperate column besides order info, you can use COUNT() aggregation function with Partition By clause
For example,
SELECT
SalesOrderNumber,
CustomerId,
COUNT(*) OVER (PARTITION BY CustomerId) CustomerOrderCount
FROM Sales.SalesOrderHeader
This query:
SELECT ROW_NUMBER() OVER(ORDER BY UserId) From Users WHERE UserName='Joe'
will return all rows where the UserName is 'Joe' UNLESS you have no UserName='Joe'
They will be listed in order of UserID and the row_number field will start with 1 and increment however many rows contain UserName='Joe'
If it does not work for you then your WHERE command has an issue OR there is no UserID in the table. Check spelling for both fields UserID and UserName.