PostgreSQL : Applying Pagination on UNION queries

I have two tables: Sales and Purchases.
I want to get a paginated ledger report over a date range.
Query:
SELECT X.*
FROM
(
SELECT * FROM Sales WHERE date >= <> AND date <= <>
UNION ALL
SELECT * FROM Purchases WHERE date >= <> AND date <= <>
) X
ORDER BY
X.date
LIMIT 50 OFFSET 100
This query first fetches all the transactions in the date range and only then applies LIMIT and OFFSET, so pagination does nothing to reduce the cost of the query.
If I apply LIMIT and OFFSET to the two inner SELECT queries instead, rows go missing from the result, because the combined set is only sorted by date after the UNION.
Is there a way to avoid fetching all the data, reading only the Sales and Purchases rows needed for the page, and still get a merged ledger report?
Thanks for any help!

Have you tried doing the UNION inside a CTE? (untested)
WITH x AS (
SELECT * FROM Sales WHERE ...
UNION ALL
SELECT * FROM Purchases WHERE ...
)
SELECT * FROM x
ORDER BY date
LIMIT 50 OFFSET 100
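Wrapping the UNION in a CTE does not by itself stop the database from sorting the full combined set. One possible approach, sketched below, is to let each branch return only its first LIMIT + OFFSET rows in the final sort order and then paginate the merged result. This is a minimal sketch, not a tested PostgreSQL solution: SQLite (through Python's sqlite3 module) stands in for PostgreSQL, and the columns id and amount, the src label, the sample data and the ledger_page helper are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema: both tables share (id, date, amount).
# SQLite is used here only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (id INTEGER, date TEXT, amount INTEGER);
    CREATE TABLE Purchases (id INTEGER, date TEXT, amount INTEGER);
""")
for i in range(200):
    day = f"2024-01-{i % 28 + 1:02d}"
    conn.execute("INSERT INTO Sales VALUES (?, ?, ?)", (i, day, i))
    conn.execute("INSERT INTO Purchases VALUES (?, ?, ?)", (i, day, -i))


def ledger_page(limit, offset, start="2024-01-01", end="2024-01-31"):
    """Return one page of the merged ledger, ordered by (date, src, id)."""
    sql = """
        SELECT src, id, date, amount FROM (
            SELECT * FROM (
                SELECT 'sale' AS src, id, date, amount FROM Sales
                WHERE date >= :start AND date <= :end
                ORDER BY date, id LIMIT :top_n)
            UNION ALL
            SELECT * FROM (
                SELECT 'purchase' AS src, id, date, amount FROM Purchases
                WHERE date >= :start AND date <= :end
                ORDER BY date, id LIMIT :top_n)
        )
        ORDER BY date, src, id          -- deterministic tiebreak across tables
        LIMIT :limit OFFSET :offset
    """
    params = {"start": start, "end": end, "top_n": limit + offset,
              "limit": limit, "offset": offset}
    return conn.execute(sql, params).fetchall()


page = ledger_page(limit=50, offset=100)
```

Each branch can contribute at most LIMIT + OFFSET rows to the requested page, so nothing is lost, but the per-branch work still grows with the offset. With an index on date in each table the inner sorts can become index scans; for deep pages, keyset pagination (filtering on the last date and id already seen) is a commonly used alternative.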

Related

How to write a SQL query to find the first time when sum greater than a number?

I have a postgresql table:
create table orders
(
id int,
cost int,
time timestamp
);
How can I write a PostgreSQL query to find the first time when sum(cost) is greater than 200?
For example:
id  cost  time
----------------------
1   120   2019-10-10
2   50    2019-11-11
3   80    2019-12-12
4   60    2019-12-16
The first time sum(cost) is greater than 200 is 2019-12-12.

This is a variation of Nick's answer (which would be correct with an ORDER BY). However, this version is more efficient:
select d.*
from (select d.*,
sum(d.cost) over (order by d.time) as running_cost
from orders d
) d
where running_cost - cost < 200 and
running_cost >= 200;
Note that this does not require an order by in the outer query to work correctly.
There is also almost a way to solve this without using a subquery:
select o.*
from orders o
order by (sum(cost) over (order by time) >= 200) desc,
time asc
limit 1;
The only issue is that this returns a row even if no row matches the condition. You could get around this by using a subquery in the limit:
limit (case when (select sum(cost) from orders) >= 200 then 1 else 0 end)
But then a subquery would be needed.
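The running-total filter from the first query in this answer can be checked against the sample data from the question. The sketch below runs it in SQLite through Python's sqlite3 module as a stand-in for PostgreSQL (window functions need SQLite 3.25 or later); the first_crossing helper and its threshold parameter are names introduced here for illustration.

```python
import sqlite3

# Sample data from the question, loaded into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, cost INTEGER, time TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 120, "2019-10-10"), (2, 50, "2019-11-11"),
     (3, 80, "2019-12-12"), (4, 60, "2019-12-16")],
)


def first_crossing(threshold):
    """Return the row at which the running total first reaches the threshold."""
    sql = """
        SELECT id, cost, time, running_cost
        FROM (SELECT o.*,
                     SUM(o.cost) OVER (ORDER BY o.time) AS running_cost
              FROM orders o) d
        WHERE running_cost - cost < :t   -- total before this row was below t
          AND running_cost >= :t         -- total including this row reaches t
    """
    return conn.execute(sql, {"t": threshold}).fetchall()


print(first_crossing(200))   # [(3, 80, '2019-12-12', 250)]
print(first_crossing(1000))  # [] because the total never reaches 1000
```

The two conditions together pick out exactly one row when all costs are positive, which is why no outer ORDER BY or LIMIT is needed.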

For PostgreSQL, you can get this result by using a CTE to calculate the SUM of cost for rows up to and including the current one, and then selecting the first row which has total cost >= 200:
WITH CTE AS (
SELECT time,
SUM(cost) OVER (ORDER BY time ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS total
FROM data
)
SELECT *
FROM CTE
WHERE total >= 200
ORDER BY total
LIMIT 1
Output:
time total
2019-12-12 250

cross join to get all dates and hours and avoid duplicate values

We have 2 tables:
sales
hourt (only 1 field (hourt) of numbers: 0 to 23)
The goal is to list all dates and all 24 hours for each day and group hours that have sales. For hours that do not have sales, zero will be shown.
This query cross joins the sales table with the hourt table and does list all dates and 24 hours. However, there are also many duplicate rows. How can we avoid the duplicates?
We're using Amazon Redshift (based on Postgres 8.0).
with h as (
SELECT
a.purchase_date,
CAST(DATE_PART("HOUR", AT_TIME_ZONE(AT_TIME_ZONE(CAST(a.purchase_date AS
DATETIME), "0:00"), "PST")) as INTEGER) AS Hour,
COUNT(a.quantity) AS QtyCount,
SUM(a.quantity) AS QtyTotal,
SUM(a.price) AS Price
FROM sales a
GROUP BY CAST(DATE_PART("HOUR",
AT_TIME_ZONE(AT_TIME_ZONE(CAST(a.purchase_date AS DATETIME), "0:00"),
"PST")) as INTEGER),
DATE_FORMAT(AT_TIME_ZONE(AT_TIME_ZONE(CAST(a.purchase_date AS DATETIME),
"0:00"), "PST"), "yyyy-MM-dd")
ORDER by a.purchase_date
),
hr as (
SELECT
CAST(hourt AS INTEGER) AS hourt
FROM hourt
),
joined as (
SELECT
purchase_date,
hourt,
QtyCount,
QtyTotal,
Price
FROM h
cross JOIN hr
)
SELECT *
FROM joined
Order by purchase_date,hourt
Before the cross join, the query returned the correct sales grouped by hour. (The sample tables and the desired-results table are not reproduced here.)

You need to create a series of all the hour values and left join your data back to it. The inline comments explain the logic.
WITH data AS (-- Do the basic aggregation first
SELECT DATE_TRUNC('hour',a.purchase_date) purchase_hour --Truncating the timestamp to the hour is simpler
,COUNT(a.quantity) AS qty_count --Aliases match the names used in the final SELECT
,SUM(a.quantity) AS qty_total
,SUM(a.price) AS price
FROM sales a
GROUP BY DATE_TRUNC('hour',a.purchase_date)
ORDER BY DATE_TRUNC('hour',a.purchase_date)
-- SELECT '2017-01-13 12:00:00'::TIMESTAMP purchase_hour, 1 qty_count, 1 qty_total, 119 price
-- UNION ALL SELECT '2017-01-13 15:00:00'::TIMESTAMP purchase_hour, 1 qty_count, 1 qty_total, 119 price
-- UNION ALL SELECT '2017-01-14 21:00:00'::TIMESTAMP purchase_hour, 1 qty_count, 1 qty_total, 119 price
)
,time_range AS (--Calculate the start and end **date** values
SELECT DATE_TRUNC('day',MIN(purchase_hour)) start_date
, DATE_TRUNC('day',MAX(purchase_hour))+1 end_date
FROM data
)
,hr AS (--Generate all hours between start and end
SELECT (SELECT start_date
FROM time_range
LIMIT 1) --Limit 1 so the optimizer knows it's not a correlated subquery
+ ((n-1) --Make the series start at zero so we don't miss the starting value
* INTERVAL '1 hour') AS "hour"
FROM (SELECT ROW_NUMBER() OVER () n
FROM stl_query --Can use any table here as long as it has enough rows
LIMIT 100) series
WHERE "hour" < (SELECT end_date FROM time_range LIMIT 1)
)
--Use NVL to replace missing values with zeroes
SELECT hr.hour AS purchase_hour --Timestamp like `2017-01-13 12:00:00`
, NVL(data.qty_count, 0) AS qty_count
, NVL(data.qty_total, 0) AS qty_total
, NVL(data.price, 0) AS price
FROM hr
LEFT JOIN data
ON hr.hour = data.purchase_hour
ORDER BY hr.hour
;

I achieved the desired results by using Left Join (table A with table B) instead of Cross Join of these two tables:
Table A has all the dates and hours
Table B is the first part of the original query
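The left-join approach described above can be sketched as follows: build a grid of every date and hour (Table A), aggregate the sales per date and hour (Table B), and left join B onto A so that empty hours show zero. This is an illustrative sketch only. SQLite (through Python's sqlite3 module) stands in for Redshift, the time zone conversion from the question is left out, the sample rows are made up, and the grid only covers dates that have at least one sale.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (purchase_date TEXT, quantity INTEGER, price INTEGER);
    CREATE TABLE hourt (hourt INTEGER);
""")
conn.executemany("INSERT INTO hourt VALUES (?)", [(h,) for h in range(24)])
# Made-up sample rows, for illustration only.
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("2017-01-13 12:10:00", 1, 119),
    ("2017-01-13 12:40:00", 2, 238),
    ("2017-01-13 15:05:00", 1, 119),
    ("2017-01-14 21:30:00", 1, 119),
])

SQL = """
WITH agg AS (            -- Table B: one row per (day, hour) that has sales
    SELECT date(purchase_date) AS day,
           CAST(strftime('%H', purchase_date) AS INTEGER) AS hr,
           COUNT(quantity) AS qty_count,
           SUM(quantity)   AS qty_total,
           SUM(price)      AS price
    FROM sales
    GROUP BY date(purchase_date),
             CAST(strftime('%H', purchase_date) AS INTEGER)
),
grid AS (                -- Table A: every day crossed with all 24 hours
    SELECT d.day, h.hourt AS hr
    FROM (SELECT DISTINCT date(purchase_date) AS day FROM sales) d
    CROSS JOIN hourt h
)
SELECT g.day, g.hr,
       COALESCE(a.qty_count, 0),
       COALESCE(a.qty_total, 0),
       COALESCE(a.price, 0)
FROM grid g
LEFT JOIN agg a ON a.day = g.day AND a.hr = g.hr
ORDER BY g.day, g.hr
"""
rows = conn.execute(SQL).fetchall()
```

Because the cross join is applied to the distinct dates rather than to the aggregated sales rows, each date and hour pair appears exactly once, which removes the duplicates seen in the original query.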

Sort Numbers in varchar value in SQL Server

My goal is to load a monthly-daily tabular presentation of sales data, with a sum total and other average computations at the bottom.
I have one result set with a column named 'Day', which corresponds to the day of the month and is automatically typed as int.
select datepart(day, a.date ) as 'Day'
My second result set holds the sum shown at the bottom. The word 'Sum' falls in the Day column, and I used UNION ALL to combine the two result sets. The expected result looks something like this:
day sales
1 10
2 20
3 30
4 10
5 20
6 30
.
.
.
31 10
Sum 130
What I did was convert the day value from int to varchar so that the columns could be combined, which worked. The new problem is the sort order of the numbers:
select * from #SalesDetailed
UNION ALL
select * from #SalesSum
order by location, day

Assuming your union query returns the correct results and only the order is wrong, you can use CASE with ISNUMERIC in the ORDER BY clause to control the sort:
SELECT *
FROM
(
SELECT *
FROM #SalesDetailed
UNION ALL
SELECT *
FROM #SalesSum
) u
ORDER BY location,
ISNUMERIC(day) DESC,
CASE WHEN ISNUMERIC(day) = 1 THEN cast(day as int) end
ISNUMERIC returns 1 when day is a number and 0 when it is not.
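The same ordering idea (numeric days first, sorted as numbers, then the 'Sum' row) can be tried outside SQL Server. The sketch below uses SQLite through Python's sqlite3 module, which has no ISNUMERIC, so a GLOB test for "contains a non-digit character" is swapped in for it. The report table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE report (location TEXT, day TEXT, sales INTEGER)")
# Made-up rows; day is text because the 'Sum' row shares the column.
conn.executemany("INSERT INTO report VALUES (?, ?, ?)", [
    ("A", "1", 10), ("A", "10", 30), ("A", "2", 20), ("A", "Sum", 60),
])

SQL = """
SELECT day, sales
FROM report
ORDER BY location,
         -- 0 for all-digit values, 1 for anything else, so 'Sum' sorts last
         CASE WHEN day GLOB '*[^0-9]*' THEN 1 ELSE 0 END,
         -- numeric comparison, so '2' sorts before '10'
         CAST(day AS INTEGER)
"""
days = [row[0] for row in conn.execute(SQL)]
print(days)  # ['1', '2', '10', 'Sum']
```

A plain ORDER BY day on the text column would give '1', '10', '2', 'Sum', which is the problem described in the question.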

Try this
select Day, Sum(Col) as Sales
from #SalesDetailed
Group by Day With Rollup
Edit (Working Sample) :
select
CASE WHEN Day IS NULL THEN 'SUM' ELSE STR(Day) END as Days,
Sum(Sales) from
(
Select 1 as Day , 10 as Sales UNION ALL
Select 2 as Day , 20 as Sales
) A
Group by Day With Rollup
EDIT 2:
select CASE WHEN Day IS NULL THEN 'SUM' ELSE STR(Day) END as Days,
Sum(Sales) as Sales
from #SalesDetailed
Group by Day With Rollup

Decrease Date for Average on this SQL

I want to query the average price over the 25 most recent rows, sorted by date, for a selected symbol.
This code works:
SELECT AVG(PRICE)
FROM (SELECT PRICE FROM ms_data where SYMBOL='$symbol'
ORDER BY DATE DESC LIMIT 25) var;
Now I want to leave out the latest date, or the two latest dates.
For example:
latest date = 2014-04-16
2nd latest date = 2014-04-15
3rd latest date = 2014-04-4
...
So I should skip the first result, right?
I tried this code, but it doesn't work:
SELECT AVG(PRICE)
FROM (SELECT PRICE FROM ms_data where SYMBOL='$symbol' AND
NOT EXISTS (SELECT PRICE FROM ms_data where SYMBOL='$symbol'
ORDER BY DATE DESC LIMIT 1) ORDER BY DATE DESC LIMIT 26) var;
Or, using a variable instead of a fixed limit:
SELECT AVG(PRICE)
FROM (SELECT PRICE FROM ms_data where SYMBOL='$symbol' AND
NOT EXISTS (SELECT PRICE FROM ms_data where SYMBOL='$symbol'
ORDER BY DATE DESC LIMIT $i) ORDER BY DATE DESC LIMIT 25+$i) var;
Any suggestions?
Thank you in advance.
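One possible way to skip the most recent rows is to add an OFFSET to the inner query instead of using NOT EXISTS, which tests whether the subquery returns any row at all rather than excluding specific rows. The sketch below is an assumption-laden illustration, not a tested answer for the asker's database: SQLite (through Python's sqlite3 module) stands in for it, the sample prices are made up, the avg_skipping helper is hypothetical, and the symbol is passed as a bound parameter rather than interpolated as '$symbol'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ms_data (SYMBOL TEXT, DATE TEXT, PRICE REAL)")
# Made-up data: one row per day in April 2014, price equal to the day number.
conn.executemany(
    "INSERT INTO ms_data VALUES (?, ?, ?)",
    [("ABC", f"2014-04-{d:02d}", float(d)) for d in range(1, 31)],
)


def avg_skipping(symbol, skip, window=25):
    """Average PRICE over `window` rows after skipping the `skip` latest rows."""
    sql = """
        SELECT AVG(PRICE)
        FROM (SELECT PRICE FROM ms_data
              WHERE SYMBOL = :symbol
              ORDER BY DATE DESC
              LIMIT :window OFFSET :skip) var
    """
    params = {"symbol": symbol, "window": window, "skip": skip}
    return conn.execute(sql, params).fetchone()[0]


print(avg_skipping("ABC", 0))  # 18.0, the average of prices 6..30
print(avg_skipping("ABC", 1))  # 17.0, the average of prices 5..29
```

This assumes one row per date for each symbol; with several rows per date, skipping rows is not the same as skipping dates.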

Select X Most Recent Non-Consecutive Days Worth of Data

Does anyone have any insight into how to select x non-consecutive days' worth of data? The dates are standard SQL datetime values. For example, I'd like to select the 5 most recent days' worth of data, but there could be gaps of many days between records, so simply selecting records from the last 5 calendar days will not do.

Following the approach Tony Andrews suggested, here is a way of doing it in T-SQL:
SELECT
Value,
ValueDate
FROM
Data
WHERE
ValueDate >=
(
SELECT
CONVERT(DATETIME, MIN(TruncatedDate))
FROM
(
SELECT DISTINCT TOP 5
CONVERT(VARCHAR, ValueDate, 102) TruncatedDate
FROM
Data
ORDER BY
TruncatedDate DESC
) d
)
ORDER BY
ValueDate DESC

I don't know the SQL Server syntax, but you need to:
1) Select the distinct dates (with the time component truncated) in descending order
2) Pick off the top 5
3) Obtain the 5th (earliest) of those values
4) Select data where the datetime >= that 5th value
Something like this "pseudo-SQL":
select *
from data
where datetime >=
( select min(date)
from
( select distinct top 5 truncated(datetime) as date
from data
order by date desc
)
)

This should do it and be reasonably good from a performance standpoint. You didn't mention how to handle ties, so you can add the WITH TIES clause if you need to do that.
SELECT TOP (@number_to_return)
* -- Write out your columns here
FROM
dbo.MyTable
ORDER BY
MyDateColumn DESC
