Other alternatives to achieve LIMIT in SQL - sql

I have created an SQL query to get certain data with LIMIT so I can use it in datatable. It has 76288 rows.
SELECT TransDate, AgentName, OfficeCode, year, ControlNumber,
ContainerNumber, BookingNumber, SealNumber, VesselName, ShippingLine, ShippingDate
FROM (
SELECT a.TransDate, a.AgentName, a.OfficeCode, DATEPART(YEAR, a.TransDate) AS year,
a.ControlNumber, b.ContainerNumber, b.BookingNumber,
b.SealNumber, b.VesselName, b.ShippingLine, b.ShippingDate,
ROW_NUMBER() OVER (ORDER BY a.TransDate) R
FROM Cargo_Transactions a
JOIN Cargo_Vessels b ON a.ControlNumber = b.ControlNumber
LEFT OUTER JOIN [Routes] c ON a.RouteID = c.RouteID
WHERE
a.TransDate IS NOT NULL
AND a.TransDate <= GETDATE()
AND DATEPART(YEAR, a.TransDate) = '2018'
) as f WHERE R BETWEEN 0 and 100
ORDER BY TransDate ASC;
0 and 100 is inside a variable that changes when the pagination is clicked.
If it's for the first hundred pages, it loads okay. But when I click the last page, it breaks saying timeout exceeded. Also, when I use the search function of the datatable, it's not working the way it should.
Example: I searched for dino in the datatable, it will say it has 95 records but will only show 1 record since the query is only between 0 and 10.
SELECT TransDate, AgentName, OfficeCode, year, ControlNumber, ContainerNumber,
BookingNumber, SealNumber, VesselName, ShippingLine, ShippingDate
FROM (
SELECT a.TransDate, a.AgentName, a.OfficeCode, DATEPART(YEAR, a.TransDate) AS year,
a.ControlNumber, b.ContainerNumber, b.BookingNumber, b.SealNumber,
b.VesselName, b.ShippingLine, b.ShippingDate,
ROW_NUMBER() OVER (ORDER BY a.TransDate) R
FROM Cargo_Transactions a
JOIN Cargo_Vessels b ON a.ControlNumber = b.ControlNumber
LEFT OUTER JOIN [Routes] c ON a.RouteID = c.RouteID
WHERE
a.TransDate IS NOT NULL
AND a.TransDate <= GETDATE()
AND DATEPART(YEAR, a.TransDate) = '2018'
) as f WHERE R BETWEEN 0 and 10 AND AgentName LIKE '%dino%'
ORDER BY TransDate ASC;
I also tried TOP and EXCEPT but when I search for SELECT TOP 0... EXCEPT SELECT TOP 100... but it's only showing 9 rows.
UPDATE:
I was able to make it work by including the WHERE clause in the subquery. My only problem now is the ORDER BY. It only works in the current page shown which is data 1 - 10 but not for all the data.
Any alternatives? Your help is highly appreciated. Thanks!

Related

SQL get top 3 values / bottom 3 values with group by and sum

I am working on a restaurant management system. There I have two tables
order_details(orderId,dishId,createdAt)
dishes(id,name,imageUrl)
My customer wants to see a report top 3 selling items / least selling 3 items by the month
For the moment I did something like this
SELECT
*
FROM
(SELECT
SUM(qty) AS qty,
order_details.dishId,
MONTHNAME(order_details.createdAt) AS mon,
dishes.name,
dishes.imageUrl
FROM
rms.order_details
INNER JOIN dishes ON order_details.dishId = dishes.id
GROUP BY order_details.dishId , MONTHNAME(order_details.createdAt)) t
ORDER BY t.qty
This gives me all the dishes sold count order by qty.
I have to manually filter max 3 records and reject the rest. There should be a SQL way of doing this. How do I do this in SQL?
You would use row_number() for this purpose. You don't specify the database you are using, so I am guessing at the appropriate date functions. I also assume that you mean a month within a year, so you need to take the year into account as well:
SELECT ym.*
FROM (SELECT YEAR(od.CreatedAt) as yyyy,
MONTH(od.createdAt) as mm,
SUM(qty) AS qty,
od.dishId, d.name, d.imageUrl,
ROW_NUMBER() OVER (PARTITION BY YEAR(od.CreatedAt), MONTH(od.createdAt) ORDER BY SUM(qty) DESC) as seqnum_desc,
ROW_NUMBER() OVER (PARTITION BY YEAR(od.CreatedAt), MONTH(od.createdAt) ORDER BY SUM(qty) DESC) as seqnum_asc
FROM rms.order_details od INNER JOIN
dishes d
ON od.dishId = d.id
GROUP BY YEAR(od.CreatedAt), MONTH(od.CreatedAt), od.dishId
) ym
WHERE seqnum_asc <= 3 OR
seqnum_desc <= 3;
Using the above info i used i combination of group by, order by and limit
as shown below. I hope this is what you are looking for
SELECT
t.qty,
t.dishId,
t.month,
d.name,
d.mageUrl
from
(
SELECT
od.dishId,
count(od.dishId) AS 'qty',
date_format(od.createdAt,'%Y-%m') as 'month'
FROM
rms.order_details od
group by date_format(od.createdAt,'%Y-%m'),od.dishId
order by qty desc
limit 3) t
join rms.dishes d on (t.dishId = d.id)

Delete the records repeated by date, and keep the oldest

I have this query, and it returns the following result, I need to delete the records repeated by date, and keep the oldest, how could I do this?
select
a.EMP_ID, a.EMP_DATE,
from
EMPLOYES a
inner join
TABLE2 b on a.table2ID = b.table2ID and b.ID_TYPE = 'E'
where
a.ID = 'VJAHAJHSJHDAJHSJDH'
and year(a.DATE) = 2021
and month(a.DATE) = 1
and a.ID <> 31
order by
a.DATE;
Additionally, I would like to fill in the missing days of the month ... and put them empty if I don't have that data, can this be done?
I would appreciate if you could guide me to solve this problem
Thank you!
The other answers miss some of the requirement..
Initial step - do this once only. Make a calendar table. This will come in handy for all sorts of things over the time:
DECLARE #Year INT = '2000';
DECLARE #YearCnt INT = 50 ;
DECLARE #StartDate DATE = DATEFROMPARTS(#Year, '01','01')
DECLARE #EndDate DATE = DATEADD(DAY, -1, DATEADD(YEAR, #YearCnt, #StartDate));
;WITH Cal(n) AS
(
SELECT 0 UNION ALL SELECT n + 1 FROM Cal
WHERE n < DATEDIFF(DAY, #StartDate, #EndDate)
),
FnlDt(d, n) AS
(
SELECT DATEADD(DAY, n, #StartDate), n FROM Cal
),
FinalCte AS
(
SELECT
[D] = CONVERT(DATE,d),
[Dy] = DATEPART(DAY, d),
[Mo] = DATENAME(MONTH, d),
[Yr] = DATEPART(YEAR, d),
[DN] = DATENAME(WEEKDAY, d),
[N] = n
FROM FnlDt
)
SELECT * INTO Cal FROM finalCte
ORDER BY [Date]
OPTION (MAXRECURSION 0);
credit: mostly this site
Now we can write some simple query to stick your data (with one small addition) onto it:
--your query, minus the date bits in the WHERE, and with a ROW_NUMBER
WITH yourQuery AS(
SELECT a.emp_id, a.emp_date,
ROW_NUMBER() OVER(PARTITION BY CAST(a.emp_date AS DATE) ORDER BY a.emp_date) rn
FROM EMPLOYES a
INNER JOIN TABLE2 b on a.table2ID = b.table2ID
WHERE a.emp_id = 'VJAHAJHSJHDAJHSJDH' AND a.id <> 31 AND b.id_type = 'E'
)
--your query, left joined onto the cal table so that you get a row for every day even if there is no emp data for that day
SELECT c.d, yq.*
FROM
Cal c
LEFT JOIN yourQuery yq
ON
c.d = CAST(yq.emp_date AS DATE) AND --cut the time off
yq.rn = 1 --keep only the earliest time per day
WHERE
c.d BETWEEN '2021-01-01' AND EOMONTH('2021-01-01')
We add a rownumbering to your table, it restarts every time the date changes and counts up in order of time. We make this into a CTE (or a subquery, CTE is cleaner) then we simply left join it to the calendar table. This means that for any date you don't have data, you still have the calendar date. For any days you do have data, the rownumber rn being a condition of the join means that only the first datetime from each day is present in the results
Note: something is wonky about your question . You said you SELECT a.emp_id and your results show 'VJAHAJHSJHDAJHSJDH' is the emp id, but your where clause says a.id twice, once as a string and once as a number - this can't be right, so I've guessed at fixing it but I suspect you have translated your query into something for SO, perhaps to hide real column names.. Also your SELECT has a dangling comma that is a syntax error.
If you have translated/obscured your real query, make absolutely sure you understand any answer here when translating it back. It's very frustrating when someone is coming back and saying "hi your query doesn't work" then it turns out that they damaged it trying to translate it back to their own db, because they hid the real column names in the question..
FInally, do not use functions on table data in a where clause; it generally kills indexing. Always try and find a way of leaving table data alone. Want all of january? Do like I did, and say table.datecolumn BETWEEN firstofjan AND endofjan etc - SQLserver at least stands a chance of using an index for this, rather than calling a function on every date in the table, every time the query is run
You can use ROW_NUMBER
WITH CTE AS
(
SELECT a.EMP_ID, a.EMP_DATE,
RN = ROW_NUMBER() OVER (PARTITION BY a.EMP_ID, CAST(a.DATE as Date) ORDER BY a.DATE ASC)
from EMPLOYES a INNER JOIN TABLE2 b
on a.table2ID = b.table2ID
and b.ID_TYPE = 'E'
where a.ID = 'VJAHAJHSJHDAJHSJDH'
and year(a.DATE) = 2021
and MONTH(a.DATE) = 1
and a.ID <> 31
)
SELECT * FROM CTE
WHERE RN = 1
Try with an aggregate function MAX or MIN
create table #tmp(dt datetime, val numeric(4,2))
insert into #tmp values ('2021-01-01 10:30:35', 1)
insert into #tmp values ('2021-01-02 10:30:35', 2)
insert into #tmp values ('2021-01-02 11:30:35', 3)
insert into #tmp values ('2021-01-03 10:35:35', 4)
select * from #tmp
select tmp.*
from #tmp tmp
inner join
(select max(dt) as dt, cast(dt as date) as dt_aux from #tmp group by cast(dt as date)) compressed_rows on
tmp.dt = compressed_rows.dt
drop table #tmp
results:

SQL with as expression shows multiple results

I am writing a SQL query using with as expression. I always get a result in the square of what I required.
This is my query:
DECLARE #MAX_DATE AS INT
SET #MAX_DATE = (SELECT DATEPART(MONTH,FECHA) FROM ALBVENTACAB WHERE NUMALBARAN IN (SELECT DISTINCT MAX(NUMALBARAN) FROM ALBVENTACAB));
;WITH TABLE_LAST AS (
SELECT CONCAT(DATEPART(MONTH,FECHA),'-',DATEPART(YEAR,FECHA)) as LAST_YEAR_MONTH
,SUM(TOTALNETO) AS LAST_YEAR_VALUE
FROM ALBVENTACAB
WHERE DATEPART(YEAR,CURRENT_TIMESTAMP) -1 = DATEPART(YEAR,FECHA) AND NUMSERIE LIKE 'A%'
AND DATEPART(MONTH,FECHA) <= #MAX_DATE
GROUP BY CONCAT(DATEPART(MONTH,FECHA),'-',DATEPART(YEAR,FECHA))
)
,TABLE_CURRENT AS(
SELECT CONCAT(DATEPART(MONTH,FECHA),'-',DATEPART(YEAR,FECHA)) as CURR_YEAR_MONTH
,SUM(TOTALNETO) AS CURR_YEAR_VALUE
FROM ALBVENTACAB
WHERE DATEPART(YEAR,CURRENT_TIMESTAMP) <= DATEPART(YEAR,FECHA) AND NUMSERIE LIKE 'A%'
GROUP BY CONCAT(DATEPART(MONTH,FECHA),'-',DATEPART(YEAR,FECHA))
)
SELECT *
FROM TABLE_CURRENT, TABLE_LAST
When I run the query I get exactly the square of the result.
I want to compare sale monthly with last year.
2-2020 814053.3 2-2019 840295.1
1-2020 1094993.65 2-2019 840295.1
3-2020 293927.3 2-2019 840295.1
2-2020 814053.3 1-2019 1050701.68
1-2020 1094993.65 1-2019 1050701.68
3-2020 293927.3 1-2019 1050701.68
2-2020 814053.3 3-2019 887776.1
1-2020 1094993.65 3-2019 887776.1
3-2020 293927.3 3-2019 887776.1
I should get only 3 rows instead of 9 rows.
You need to properly join your two CTE - the way you're doing it now, you're getting a Cartesian product of each row in either CTE together.
Do something like:
*;WITH TABLE_LAST AS
( ....
),
TABLE_CURRENT AS
( ....
)
SELECT *
FROM TABLE_CURRENT curr
INNER JOIN TABLE_LAST last ON (some join condition here)
What that join condition is going to be - I have no idea, and cannot tell from your question - but you have to define how these two sets of data "connect" ....
It could be something like:
SELECT *
FROM TABLE_CURRENT curr
INNER JOIN TABLE_LAST last ON curr.CURR_YEAR_MONTH = last.LAST_YEAR_MONT
or whatever else makes sense in your situation - but basically, you need to somehow "tie together" these two sets of data and get only those rows that make sense - not just every row from "last" combined with every row from "curr" ....
While you already got the answer on how to join the two results, I thought I'd tell you how to typically approach such problems.
From the same table, you want two sums on different conditions (different years that is). You solve this with conditional aggregation, which does just that: aggregate (sum) based on a condition (year).
select
datepart(month, fecha) as month,
sum(case when datepart(year, fecha) = datepart(year, getdate()) then totalneto end) as this_year,
sum(case when datepart(year, fecha) = datepart(year, getdate()) -1 then totalneto end) as last_year
from albventacab
where numserie like 'A%'
and fecha > dateadd(year, -2, getdate())
group by datepart(month, fecha)
order by datepart(month, fecha);

Taking most recent values in sum over date range

I have a table which has the following columns: DeskID *, ProductID *, Date *, Amount (where the columns marked with * make the primary key). The products in use vary over time, as represented in the image below.
Table format on the left, and a (hopefully) intuitive representation of the data on the right for one desk
The objective is to have the sum of the latest amounts of products by desk and date, including products which are no longer in use, over a date range.
e.g. using the data above the desired table is:
So on the 1st Jan, the sum is 1 of Product A
On the 2nd Jan, the sum is 2 of A and 5 of B, so 7
On the 4th Jan, the sum is 1 of A (out of use, so take the value from the 3rd), 5 of B, and 2 of C, so 8 in total
etc.
I have tried using a partition on the desk and product ordered by date to get the most recent value and turned the following code into a function (Function1 below) with #date Date parameter
select #date 'Date', t.DeskID, SUM(t.Amount) 'Sum' from (
select #date 'Date', t.DeskID, t.ProductID, t.Amount
, row_number() over (partition by t.DeskID, t.ProductID order by t.Date desc) as roworder
from Table1 t
where 1 = 1
and t.Date <= #date
) t
where t.roworder = 1
group by t.DeskID
And then using a utility calendar table and cross apply to get the required values over a time range, as below
select * from Calendar c
cross apply Function1(c.CalendarDate)
where c.CalendarDate >= '20190101' and c.CalendarDate <= '20191009'
This has the expected results, but is far too slow. Currently each desk uses around 50 products, and the products roll every month, so after just 5 years each desk has a history of ~3000 products, which causes the whole thing to grind to a halt. (Roughly 30 seconds for a range of a single month)
Is there a better approach?
Change your function to the following should be faster:
select #date 'Date', t.DeskID, SUM(t.Amount) 'Sum'
FROM (SELECT m.DeskID, m.ProductID, MAX(m.[Date) AS MaxDate
FROM Table1 m
where m.[Date] <= #date) d
INNER JOIN Table1 t
ON d.DeskID=t.DeskID
AND d.ProductID=t.ProductID
and t.[Date] = d.MaxDate
group by t.DeskID
The performance of TVF usually suffers. The following removes the TVF completely:
-- DROP TABLE Table1;
CREATE TABLE Table1 (DeskID int not null, ProductID nvarchar(32) not null, [Date] Date not null, Amount int not null, PRIMARY KEY ([Date],DeskID,ProductID));
INSERT Table1(DeskID,ProductID,[Date],Amount)
VALUES (1,'A','2019-01-01',1),(1,'A','2019-01-02',2),(1,'B','2019-01-02',5),(1,'A','2019-01-03',1)
,(1,'B','2019-01-03',4),(1,'C','2019-01-03',3),(1,'B','2019-01-04',5),(1,'C','2019-01-04',2),(1,'C','2019-01-05',2)
GO
DECLARE #StartDate date=N'2019-01-01';
DECLARE #EndDate date=N'2019-01-05';
;WITH cte_p
AS
(
SELECT DISTINCT DeskID,ProductID
FROM Table1
WHERE [Date] <= #EndDate
),
cte_a
AS
(
SELECT #StartDate AS [Date], p.DeskID, p.ProductID, ISNULL(a.Amount,0) AS Amount
FROM (
SELECT t.DeskID, t.ProductID
, MAX(t.Date) AS FirstDate
FROM Table1 t
WHERE t.Date <= #StartDate
GROUP BY t.DeskID, t.ProductID) f
INNER JOIN Table1 a
ON f.DeskID=a.DeskID
AND f.ProductID=a.ProductID
AND f.[FirstDate]=a.[Date]
RIGHT JOIN cte_p p
ON p.DeskID=a.DeskID
AND p.ProductID=a.ProductID
UNION ALL
SELECT DATEADD(DAY,1,a.[Date]) AS [Date], t.DeskID, t.ProductID, t.Amount
FROM Table1 t
INNER JOIN cte_a a
ON t.DeskID=a.DeskID
AND t.ProductID=a.ProductID
AND t.[Date] > a.[Date]
AND t.[Date] <= DATEADD(DAY,1,a.[Date])
WHERE a.[Date]<#EndDate
UNION ALL
SELECT DATEADD(DAY,1,a.[Date]) AS [Date], a.DeskID, a.ProductID, a.Amount
FROM cte_a a
WHERE NOT EXISTS(SELECT 1 FROM Table1 t
WHERE t.DeskID=a.DeskID
AND t.ProductID=a.ProductID
AND t.[Date] > a.[Date]
AND t.[Date] <= DATEADD(DAY,1,a.[Date]))
AND a.[Date]<#EndDate
)
SELECT [Date], DeskID, SUM(Amount)
FROM cte_a
GROUP BY [Date], DeskID;

Selecting only if at least one row matches condition

I have a select statement and want to return all values only if at least one of them has a date with 60 days of difference from today.
The problem is that i have an outer apply which returns the column i want to compare to, and they come from different tables (one belongs to cash items, and the other to card items).
Considering I have the following:
OUTER APPLY (
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1 --Cash
UNION
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2 --Card
) AS items
I want to return all the rows only when DATEDIFF(DAY, MIN(items.item_date), GETDATE()) >= 60, but I want them all even if only one matches this condition.
What would be the best approach to do this?
EDIT
To make it clearer, I'll explain the use case:
I need to show the items of every loan, only if the client is late for more than 60 days of the due date on any of it
I am also not sure, what do you expect, but how about that:
WITH items
AS (SELECT Count(*) AS quantity,
Min(date) AS item_date
FROM dbo.Get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1
UNION
SELECT Count(*) AS quantity,
Min(date) AS item_date
FROM dbo.Get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2)
SELECT a.*
FROM items AS a,
(SELECT TOP 1 *
FROM items AS b
WHERE Datediff(day, b.item_date, Getdate()) >= 60) AS c
It's a sort of CROSS JOIN, where table C will have one or zero rows depending on that if the condition is met - it will than join to every row in other table.
Have you tried something like this?
SELECT a.quantity, a.item_date
FROM
(SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1
UNION
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2) a
WHERE DATEDIFF(day, a.item_date, GETDATE()) >= 60
Typically I do this using a CTE to select the key for the records I want to select and then join on that. Below is an attempt at an example:
with LateClients as
(
SELECT LoadId FROM Payment Where /*payment date later than 60 days*/
)
SELECT p.LoanId,
p.UserId
FROM Payment as p
INNER JOIN LateClients as LC
ON p.LoanId = lc.LoanId
OrderBy p.LoanId, p.UserId
I know it's a bit different from the code you posted, but this is a simplified example that should explain the concept. Good luck!