I have a sales table that shows the date and time of each sale.
For example:
saleid | saledate | saletime
1 | 20110327 | 101
2 | 20110327 | 102
3 | 20110328 | 201
(So sale 2 occurred on 20110327 at 102)
I need to construct a single SQL statement that:
Groups the sales by date (each row is a different date) and then
counts the sales for each time range. (With each time range being a separate column)
The table should look something like this:
saledate | 101-159 | 200-259 |
20110327 | 2 | 0 |
20110328 | 0 | 1 |
It needs to be a single statement and saledate and saletime need to remain in numeric format.
(I am pulling from a database table with several million rows)
I am using MS Access
Any advice is greatly appreciated.
Thank you so much!
SELECT saledate,
SUM(IIF(saletime >= 101 AND saletime <= 159, 1, 0)) AS [101To159],
SUM(IIF(saletime >= 200 AND saletime <= 259, 1, 0)) AS [200To259]
FROM myTable
GROUP BY saledate
Note: I haven't run this query. However, this is how it could be.
SELECT saledate,
SUM(case when saletime between 101 and 159 then 1 else 0 end ) as R101_159,
SUM(case when saletime between 200 and 259 then 1 else 0 end ) as R200_259
FROM myTable
GROUP BY saledate
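To check the conditional-aggregation idea against the sample rows, here is a minimal sketch that runs the CASE form in SQLite through Python's sqlite3 module. It is not Access: the table name sales and the column aliases are assumptions made for the demo, and in Access itself the IIF form is the one to use, since Access SQL has no CASE expression.

```python
import sqlite3


def sales_by_time_range(rows):
    """Count sales per date, with one column per time range, in a single pass."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table name; saledate and saletime stay numeric as in the question.
    conn.execute(
        "CREATE TABLE sales (saleid INTEGER, saledate INTEGER, saletime INTEGER)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    query = """
        SELECT saledate,
               SUM(CASE WHEN saletime BETWEEN 101 AND 159 THEN 1 ELSE 0 END) AS r101_159,
               SUM(CASE WHEN saletime BETWEEN 200 AND 259 THEN 1 ELSE 0 END) AS r200_259
        FROM sales
        GROUP BY saledate
        ORDER BY saledate
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Sample rows from the question.
sample = [(1, 20110327, 101), (2, 20110327, 102), (3, 20110328, 201)]
for row in sales_by_time_range(sample):
    print(row)
```

Each extra time range is just one more SUM column, so the table is still scanned once no matter how many ranges you add.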
Related
Let's assume that I have two tables in my database as below:
Table 1 (current_prices): contains some items and their prices, and is updated once per day:
# current_prices
| commodity | price |
____________________
| stuff1 | price1
| stuff2 | price2
| stuff3 | price3
|. |
|. |
|. |
| stuffN | priceN
Table 2 (stat_history): divides the items into price ranges and keeps the number of items in each range for every day, as below:
# stat_history
| date | range1_count | range2_count | range3_count
________________________________________________________
| 20200411 | 12 | 5 | 9
| 20200412 | 10 | 5 | 11
| 20200413 | 13 | 4 | 9
| 20200414 | 15 | 3 | 8
The contents of the stat_history table are generated from the contents of current_prices at the end of the day.
Currently I use multiple update-or-insert (upsert) queries to update my stat_history table, as below:
insert into stat_history (date, range1_count)
select now()::date , count(stuff) as range1_count from current_prices
where 0 < price and price < VAL1
on conflict (date)
do update set
range1_count = excluded.range1_count
insert into stat_history (date, range2_count)
select now()::date , count(stuff) as range2_count from current_prices
where VAL1 < price and price < VAL2
on conflict (date)
do update set
range2_count = excluded.range2_count
..... (blah blah)
The question is:
Is there any shorter, simpler or more efficient way to do this (In a single SQL query for example)?
You could do conditional counts in a single statement, using the standard filter clause, which Postgres supports:
insert into stat_history (date, range1_count, range2_count)
select
now()::date,
count(stuff) filter(where price >= 0 and price < VAL1) as range1_count,
count(stuff) filter(where price >= VAL1 and price < VAL2) as range2_count
from current_prices
where price >= 0 and price < VAL2
on conflict (date)
do update set
range1_count = excluded.range1_count,
range2_count = excluded.range2_count
Notes:
I adapted the logic that puts rows in the intervals to make them contiguous (in your original query, for example, a price that is equal to VAL1 would never be counted).
With this logic at hand, you might not even need the on conflict clause.
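The contiguous-interval point is easy to check on a small data set. Below is a rough sketch that runs the same single-pass counting in SQLite through Python's sqlite3 module. SUM(CASE ...) is used as a portable stand-in for the filter clause, and the function name and sample prices are assumptions for illustration only.

```python
import sqlite3


def count_price_ranges(prices, val1, val2):
    """Count rows in [0, val1) and [val1, val2) with one scan of the table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE current_prices (commodity TEXT, price REAL)")
    conn.executemany("INSERT INTO current_prices VALUES (?, ?)", prices)
    query = """
        SELECT COALESCE(SUM(CASE WHEN price >= 0 AND price < :val1
                                 THEN 1 ELSE 0 END), 0) AS range1_count,
               COALESCE(SUM(CASE WHEN price >= :val1 AND price < :val2
                                 THEN 1 ELSE 0 END), 0) AS range2_count
        FROM current_prices
    """
    # COALESCE turns the NULL that SUM returns for an empty table into 0.
    row = conn.execute(query, {"val1": val1, "val2": val2}).fetchone()
    conn.close()
    return row


# Hypothetical prices; "b" sits exactly on the VAL1 boundary.
prices = [("a", 5), ("b", 10), ("c", 15), ("d", 25)]
print(count_price_ranges(prices, 10, 20))
```

With half-open intervals the boundary price of 10 lands in the second range exactly once, whereas strict < on both sides would drop it from every range.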
Let's say I have a table which stores ItemID, Date and Total_shipped over a period of time:
ItemID | Date | Total_shipped
__________________________________
1 | 1/20/2000 | 2
2 | 1/20/2000 | 3
1 | 1/21/2000 | 5
2 | 1/21/2000 | 4
1 | 1/22/2000 | 1
2 | 1/22/2000 | 7
1 | 1/23/2000 | 5
2 | 1/23/2000 | 6
Now I want to aggregate over several periods of time. For example, I want to know how many of each item were shipped in each two-day period and in total. So the desired output should look something like:
ItemID | Jan20-Jan21 | Jan22-Jan23 | Jan20-Jan23
_____________________________________________
1 | 7 | 6 | 13
2 | 7 | 13 | 20
How do I do that in the most efficient way?
I know I can write three different subqueries, but I think there should be a better way. My real data is large and there are several different time periods to consider, i.e. in my real problem I want the shipped items for current_week, last_week, two_weeks_ago, three_weeks_ago, last_month, two_months_ago and three_months_ago, so I do not think writing 7 different subqueries would be a good idea.
Here is the general idea of what I can already run, but it is very expensive for the database:
WITH
sq1 as (
SELECT ItemID, sum(Total_shipped) sum1
FROM table
WHERE Date BETWEEN '1/20/2000' and '1/21/2000'
GROUP BY ItemID),
sq2 as (
SELECT ItemID, sum(Total_Shipped) sum2
FROM table
WHERE Date BETWEEN '1/22/2000' and '1/23/2000'
GROUP BY ItemID),
sq3 as(
SELECT ItemID, sum(Total_Shipped) sum3
FROM Table
GROUP BY ItemID)
SELECT ItemID, sq1.sum1, sq2.sum2, sq3.sum3
FROM Table
JOIN sq1 on Table.ItemID = sq1.ItemID
JOIN sq2 on Table.ItemID = sq2.ItemID
JOIN sq3 on Table.ItemID = sq3.ItemID
I don't know why you have tagged this question with multiple databases.
Anyway, you can use conditional aggregation, as follows in Oracle:
select
item_id,
sum(case when "date" between date'2000-01-20' and date'2000-01-21' then total_shipped end) as "Jan20-Jan21",
sum(case when "date" between date'2000-01-22' and date'2000-01-23' then total_shipped end) as "Jan22-Jan23",
sum(case when "date" between date'2000-01-20' and date'2000-01-23' then total_shipped end) as "Jan20-Jan23"
from my_table
group by item_id
Cheers!!
Use FILTER:
select
item_id,
sum(total_shipped) filter (where date between '2000-01-20' and '2000-01-21') as "Jan20-Jan21",
sum(total_shipped) filter (where date between '2000-01-22' and '2000-01-23') as "Jan22-Jan23",
sum(total_shipped) filter (where date between '2000-01-20' and '2000-01-23') as "Jan20-Jan23"
from my_table
group by 1
item_id | Jan20-Jan21 | Jan22-Jan23 | Jan20-Jan23
---------+-------------+-------------+-------------
1 | 7 | 6 | 13
2 | 7 | 13 | 20
(2 rows)
Db<>fiddle.
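Since the real problem has seven periods, it may help to generate the conditional columns from a list of periods instead of typing each one. The following is only a sketch under assumptions: it uses SQLite through Python's sqlite3 module, stores the dates as ISO text, and the names shipments, ship_date and PERIODS are made up for the demo. It uses SUM(CASE ...) rather than FILTER so it does not depend on a recent SQLite version.

```python
import sqlite3

# Rows from the question: (item_id, ship_date as ISO text, total_shipped).
SHIPMENTS = [
    (1, "2000-01-20", 2), (2, "2000-01-20", 3),
    (1, "2000-01-21", 5), (2, "2000-01-21", 4),
    (1, "2000-01-22", 1), (2, "2000-01-22", 7),
    (1, "2000-01-23", 5), (2, "2000-01-23", 6),
]

# Hypothetical period labels mapped to inclusive (start, end) ISO dates.
PERIODS = {
    "Jan20-Jan21": ("2000-01-20", "2000-01-21"),
    "Jan22-Jan23": ("2000-01-22", "2000-01-23"),
    "Jan20-Jan23": ("2000-01-20", "2000-01-23"),
}


def shipped_per_period(shipments, periods):
    """Return one row per item with one summed column per period."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shipments (item_id INTEGER, ship_date TEXT, total_shipped INTEGER)"
    )
    conn.executemany("INSERT INTO shipments VALUES (?, ?, ?)", shipments)
    # One SUM(CASE ...) column per period; the dates are bound as parameters.
    # Labels are interpolated as column aliases, so they must come from trusted code.
    columns = ", ".join(
        f'SUM(CASE WHEN ship_date BETWEEN ? AND ? THEN total_shipped ELSE 0 END) AS "{label}"'
        for label in periods
    )
    params = [bound for bounds in periods.values() for bound in bounds]
    query = f"SELECT item_id, {columns} FROM shipments GROUP BY item_id ORDER BY item_id"
    rows = conn.execute(query, params).fetchall()
    conn.close()
    return rows


for row in shipped_per_period(SHIPMENTS, PERIODS):
    print(row)
```

Adding current_week, last_week and so on then means adding entries to PERIODS; the query still reads the table once.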
I have a problem that I've spent way too much time trying to figure out, with close to no success at all. I'll try to describe the problem as well as I can, and use an example, which is the solution I use right now.
I have two different MS SQL tables.
Table 1:
itemNumber - 192031, 533853 etc.
date - the date the database post was added
quantity - the amount of items for each item number
Table 2:
MTITNO - also item number, contains many different item numbers (more than Table 1)
MTTRDT - the date the database post was added
MTTTYP - transaction type. I will be looking for MTTTYP = 11
MTTRQT - transaction quantity. I will be looking for MTTRQT < 0
So what I want to do is to get the DISTINCT itemNumber values between two dates from Table 1. Once I have those item numbers, I would like to join Table 2 on item number, restricted to the same dates that I use in the query for Table 1. I also need to only get the values from Table 2 where MTTTYP = 11 and MTTRQT < 0, and SUM MTTRQT.
I've solved this by using loops in Java code, which isn't that good, to be honest. What I do is this:
SELECT DISTINCT itemNumber "itemNumber"
FROM Table 1
WHERE date BETWEEN #fromDate AND #toDate;
Take the top value from this result (that is the first item number) and then:
SELECT Sum(MTTRQT) "SUM_MTTRQT_"
FROM Table 2
WHERE MTITNO = "the first item number from the result query from above"
AND MTTTYP = 11
AND MTTRDT BETWEEN #fromDate AND #toDate
AND MITTRA.MTTRQT < 0
Add the result to a new list. Remove the item number used
Loop through all the item numbers in the list and run step 3 and 4 for every single item number (this is the bad part).
Surely there must be a SQL query that produces the same result!?
Appreciate any help I can get!
Update:
This is the data I have.
Table 1
|Item number | Quantity | date
192031 | 1 | 20190521
192031 | 1 | 20190522
19192301 | 2 | 20190521
19189507 | 1 | 20190523
19189507 | 1 | 20190521
19189507 | 1 | 20190524
Table 2
|MTITNO | MTTRDT | MTTTYP | MTTRQT
192031 | 20190520 | 11 | -1
192031 | 20190520 | 11 | -1
192031 | 20190520 | 11 | -1
192031 | 20190520 | 11 | -1
19189507 | 20190520 | 11 | -1
19189507 | 20190520 | 11 | -1
19189507 | 20190520 | 11 | -1
19189507 | 20190520 | 11 | -1
19189507 | 20190521 | 11 | -1
19189507 | 20190521 | 11 | -1
19189507 | 20190521 | 11 | -1
Table 2 contains all sorts of item numbers (that is item numbers that you can find in Table 2, but not in Table 1), and many more posts. There can be posts in Table 1 and no posts in Table 2 for one or more item numbers.
I want to summarise the MTTRQT for all items where the item number is in both Table 1 and Table 2 and within the date span I have set. The "amount used" in the desired result below is MTTRQT added up for every single item number.
Desired result
So if I look for all the item numbers with date between 20190520 - 20190524, I should get the list below.
"Item number" is supposed to be DISTINCT item numbers from Table 1.
"Amount used" is the SUM function, that sums MTTRQT where all the conditions are met.
|Item Number | Amount used
192031 | -4
19189507 | -7
Reading between the lines a bit, but is this not what you're after?
SELECT SUM(T2.MTTRQT) AS [SUM_MTTRQT_]
FROM [Table 2] T2
LEFT JOIN (SELECT TOP (1)
T1.ItemNumber
FROM [Table 1] T1
WHERE T1.[date] BETWEEN #fromDate AND #toDate --Note, if [date] has a time portion, this is unlikely to work as you expect
ORDER BY T1.ItemNumber) T1 ON T2.MTITNO = T1.ItemNumber --Assumed ORDER BY clause
WHERE T2.MTTTYP = 11
AND T2.MTTRDT BETWEEN #fromDate AND #toDate --Note, if MTTRDT has a time portion, this is unlikely to work as you expect
AND T2.MTTRQT < 0;
If I am following your logic correctly:
select sum(t2.mttrqt)
from table2 t2
where t2.mtitno in (select t1.itemNumber
from table1 t1
where t1.date >= #date1 and t1.date <= #date2
) and
t2.mttrdt >= #date1 and
t2.mttrdt <= #date2 and
t2.mtttyp = 11 and
t2.mttrqt < 0;
Have you tried this:
SELECT Sum(MTTRQT) "SUM_MTTRQT_"
FROM [Table 2]
WHERE MTITNO IN (SELECT DISTINCT itemNumber
FROM [Table 1]
WHERE date BETWEEN #fromDate AND #toDate)
AND MTTTYP = 11
AND MTTRDT BETWEEN #fromDate AND #toDate
AND MTTRQT < 0
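The queries above return one grand total, while the desired result in the question has one row per item number. Adding a GROUP BY on the item number should give that. Here is a small sketch of it, run in SQLite through Python's sqlite3 module rather than SQL Server; the table names table1 and table2 and the function name are assumptions, and the rows are the sample data from the question.

```python
import sqlite3


def amount_used_per_item(table1_rows, table2_rows, from_date, to_date):
    """Sum MTTRQT per item number for items that also appear in table1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (itemNumber INTEGER, quantity INTEGER, date INTEGER)")
    conn.execute(
        "CREATE TABLE table2 (MTITNO INTEGER, MTTRDT INTEGER, MTTTYP INTEGER, MTTRQT INTEGER)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", table1_rows)
    conn.executemany("INSERT INTO table2 VALUES (?, ?, ?, ?)", table2_rows)
    query = """
        SELECT t2.MTITNO, SUM(t2.MTTRQT) AS amount_used
        FROM table2 AS t2
        WHERE t2.MTTTYP = 11
          AND t2.MTTRQT < 0
          AND t2.MTTRDT BETWEEN :from_date AND :to_date
          AND t2.MTITNO IN (SELECT t1.itemNumber
                            FROM table1 AS t1
                            WHERE t1.date BETWEEN :from_date AND :to_date)
        GROUP BY t2.MTITNO
        ORDER BY t2.MTITNO
    """
    rows = conn.execute(query, {"from_date": from_date, "to_date": to_date}).fetchall()
    conn.close()
    return rows


# Sample data from the question's update.
table1_rows = [
    (192031, 1, 20190521), (192031, 1, 20190522), (19192301, 2, 20190521),
    (19189507, 1, 20190523), (19189507, 1, 20190521), (19189507, 1, 20190524),
]
table2_rows = (
    [(192031, 20190520, 11, -1)] * 4
    + [(19189507, 20190520, 11, -1)] * 4
    + [(19189507, 20190521, 11, -1)] * 3
)
for row in amount_used_per_item(table1_rows, table2_rows, 20190520, 20190524):
    print(row)
```

Item 19192301 is in Table 1 but has no rows in Table 2, so it does not appear in the output, which matches the desired result. If such items should show up with a NULL or 0, a LEFT JOIN from the distinct Table 1 item numbers would be needed instead of IN.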
Hoping someone can help me with this query.
If I have a sample table:
orderId | itemId | quantity | created
123456 | 1 | 100 | 1478822402
123457 | 1 | 5 | 1478736001
123458 | 2 | 10 | 1478736001
123459 | 2 | 40 | 1478822402
I am trying to get a result set which gives me the sum of the quantities of items sold today and yesterday - grouped by item. For example:
item | numSoldToday | numSoldYesterday
1 | 100 | 5
2 | 40 | 10
Ignoring the hard coded timestamps for testing purposes, I can do this for just items sold today using:
SELECT item, sum(quantity) as numSoldToday FROM orders
WHERE created >= 1478822400 AND created <= 1478908799
GROUP by item
And I can do this for just ONE item sold today vs. yesterday using a subquery:
SELECT sum(quantity) as numSoldToday,
(SELECT sum(quantity) FROM orders WHERE created >= 1478736000 AND created <= 1478822399 AND item = 1) as numSoldYesterday
FROM orders
WHERE created >= 1478822400 AND created <= 1478908799 AND item = 1
But I can't seem to figure out how to combine the two - a grouped result set of each item with the number sold today vs. yesterday without using the client side programming language to get a list of items and loop through each one which seems a bit inefficient.
I tried putting a GROUP BY in the subquery but this just gave me an error and I can't think of the best way to proceed with doing this purely with SQL.
Any ideas welcome!
Thanks :-)
You can do this with an aggregated CASE expression rather than subqueries, something like this:
SELECT
o.item
,SUM(CASE WHEN o.created >= 1478822400 AND o.created <= 1478908799 THEN quantity ELSE 0 END) numSoldToday
,SUM(CASE WHEN o.created >= 1478736000 AND o.created <= 1478822399 THEN quantity ELSE 0 END) numSoldYesterday
FROM orders o
GROUP BY o.item
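Here is a runnable sketch of the same aggregated CASE approach using the sample rows, in SQLite through Python's sqlite3 module. Two things are assumptions on my part: the day boundaries are derived from a single today_start value using fixed 86400-second days (so no daylight-saving handling), and a WHERE clause is added so that only the two days of interest are scanned.

```python
import sqlite3

SECONDS_PER_DAY = 86400


def sold_today_and_yesterday(orders, today_start):
    """Sum quantities per item for today and yesterday in one grouped query."""
    bounds = {
        "yesterday": today_start - SECONDS_PER_DAY,
        "today": today_start,
        "tomorrow": today_start + SECONDS_PER_DAY,
    }
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (orderId INTEGER, itemId INTEGER, quantity INTEGER, created INTEGER)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", orders)
    query = """
        SELECT itemId,
               SUM(CASE WHEN created >= :today AND created < :tomorrow
                        THEN quantity ELSE 0 END) AS numSoldToday,
               SUM(CASE WHEN created >= :yesterday AND created < :today
                        THEN quantity ELSE 0 END) AS numSoldYesterday
        FROM orders
        WHERE created >= :yesterday AND created < :tomorrow
        GROUP BY itemId
        ORDER BY itemId
    """
    rows = conn.execute(query, bounds).fetchall()
    conn.close()
    return rows


# Sample rows from the question; 1478822400 is the start of "today" there.
orders = [
    (123456, 1, 100, 1478822402),
    (123457, 1, 5, 1478736001),
    (123458, 2, 10, 1478736001),
    (123459, 2, 40, 1478822402),
]
for row in sold_today_and_yesterday(orders, 1478822400):
    print(row)
```

The half-open ranges (>= start, < next start) avoid having to spell out the 23:59:59 end timestamps, and the WHERE clause lets an index on created limit the scan.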
I am fairly new to Hive.
I have a table called stats as shown below
from_date | to_date | customer_name | callcount
-------------------------------------------------------
2016_01_01 | 2016_01_02 | ABC | 25
2016_01_02 | 2016_01_03 | ABC | 53
2016_01_03 | 2016_01_04 | ABC | 44
2016_01_04 | 2016_01_05 | ABC | 55
I want to build a Hive query that will accept:
a) current time range (from and to time)
b) previous time range (from and to time)
c) customer name
For example, the inputs will be:
current time range can be from time(2016_01_03) and to time(2016_01_05)
previous time range can be from time(2016_01_01) and to time(2016_01_02)
customer name can be ABC
And the result I want to display is:
current_call_count(sum of call counts for current time range), previous_call_count(sum of call counts for previous time range)
and difference between current_call_count & previous_call_count
like this:
customer | current_call_count | previous_call_count | Diff
---------------------------------------------------------
ABC | 99 | 25 | 74
The query which I built was:
select * from
(
select sum(callCount) as current_count from stats
where customer_name='ABC' and from_date>='2016-04-03' and to_date<='2016-04-05'
UNION ALL
select sum(callCount) as current_count from stats
where customer_name='ABC' and from_date>='2016-04-01' and to_date<='2016-04-02'
) FINAL
I am not able to get the calculation, and I am also not able to display the result as columns. Please help.
Try conditional aggregation:
select
sum(case when from_date >= '2016_01_03' and to_date <= '2016_01_05' then callcount else 0 end)
as current_call_count,
sum(case when from_date >= '2016_01_01' and to_date <= '2016_01_02' then callcount else 0 end)
as previous_call_count,
sum(case when from_date >= '2016_01_03' and to_date <= '2016_01_05' then callcount else 0 end)
- sum(case when from_date >= '2016_01_01' and to_date <= '2016_01_02' then callcount else 0 end)
as difference
from stats
where customer_name = 'ABC'
Note that your example data uses _ instead of - (which is used in your query) as a date separator.
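To see the numbers from the question come out (99, 25 and 74), here is a rough sketch of the conditional aggregation run in SQLite through Python's sqlite3 module, not in Hive. It wraps the two sums in a subquery so the difference does not repeat the CASE expressions, and it adds customer_name to the output; the function name and the totals alias are assumptions. The dates are compared as text, which works here because the yyyy_mm_dd format sorts chronologically.

```python
import sqlite3


def compare_call_counts(rows, customer, current, previous):
    """Return (customer, current count, previous count, difference) or None."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stats (from_date TEXT, to_date TEXT, customer_name TEXT, callcount INTEGER)"
    )
    conn.executemany("INSERT INTO stats VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT customer_name,
               current_call_count,
               previous_call_count,
               current_call_count - previous_call_count AS diff
        FROM (
            SELECT customer_name,
                   SUM(CASE WHEN from_date >= :cur_from AND to_date <= :cur_to
                            THEN callcount ELSE 0 END) AS current_call_count,
                   SUM(CASE WHEN from_date >= :prev_from AND to_date <= :prev_to
                            THEN callcount ELSE 0 END) AS previous_call_count
            FROM stats
            WHERE customer_name = :customer
            GROUP BY customer_name
        ) AS totals
    """
    params = {
        "customer": customer,
        "cur_from": current[0], "cur_to": current[1],
        "prev_from": previous[0], "prev_to": previous[1],
    }
    row = conn.execute(query, params).fetchone()
    conn.close()
    return row


# Sample rows from the question.
stats_rows = [
    ("2016_01_01", "2016_01_02", "ABC", 25),
    ("2016_01_02", "2016_01_03", "ABC", 53),
    ("2016_01_03", "2016_01_04", "ABC", 44),
    ("2016_01_04", "2016_01_05", "ABC", 55),
]
print(compare_call_counts(
    stats_rows, "ABC", ("2016_01_03", "2016_01_05"), ("2016_01_01", "2016_01_02")
))
```

The subquery form should carry over to Hive largely unchanged, since Hive also requires the derived table to have an alias, but I have only run it in SQLite.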