SQL query between two times on every day - sql

I'm currently trying to write a query that will allow me to find all records that occur between two times every day. As an example, say you had five records each with their own unique timestamps that represent when the record was created. They look something like this:
|--|------|-------------------|
|id|letter| created_at |
|--|------|-------------------|
|1 |a |2013-10-30 10:00:00|
|2 |b |2013-10-31 18:00:00|
|3 |c |2013-11-01 14:00:00|
|4 |d |2013-11-03 23:00:00|
|5 |e |2013-11-04 05:00:00|
|--|------|-------------------|
I'm trying to write a query that would return all records created between 08:00:00 and 15:00:00. The expected result would be:
|--|------|-------------------|
|id|letter| created_at |
|--|------|-------------------|
|1 |a |2013-10-30 10:00:00|
|3 |c |2013-11-01 14:00:00|
|--|------|-------------------|
What would a query look like to achieve this result? I'm familiar with how to use BETWEEN to get dates but not how to focus on times specifically. Thanks.

Alternatively, extract a native TIME value from your datetime field and compare time values directly:
SELECT *
FROM yourtable
WHERE TIME(created_at) BETWEEN '08:00:00' AND '15:00:00'
MySQL has a very comprehensive set of date/time manipulation functions available here: http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html
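As a quick way to check the time-of-day filter against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite's time() function as a stand-in for MySQL's TIME(), and the table name records is made up for the example.

```python
import sqlite3

# In-memory database holding the five sample rows from the question.
# "records" is a hypothetical table name chosen for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, letter TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        (1, "a", "2013-10-30 10:00:00"),
        (2, "b", "2013-10-31 18:00:00"),
        (3, "c", "2013-11-01 14:00:00"),
        (4, "d", "2013-11-03 23:00:00"),
        (5, "e", "2013-11-04 05:00:00"),
    ],
)

# time() drops the date part, so the comparison applies to every day alike.
rows = conn.execute(
    "SELECT id, letter, created_at FROM records "
    "WHERE time(created_at) BETWEEN '08:00:00' AND '15:00:00' "
    "ORDER BY id"
).fetchall()
matching_ids = [row[0] for row in rows]
print(matching_ids)
```

Because the time part is compared as a zero-padded HH:MM:SS string, both bounds are inclusive, just as with BETWEEN on the MySQL TIME value.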

EDIT Forgot this was MySQL
You can use the EXTRACT function to pull out the hour. Note that comparing whole hours also matches times from 15:00:01 through 15:59:59:
SELECT id, letter, created_at
FROM yourtable
WHERE EXTRACT(HOUR FROM created_at) BETWEEN 8 AND 15
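Hour-level filtering and time-level filtering can disagree near the upper bound. The sketch below illustrates this with Python's sqlite3 module, assuming strftime('%H', ...) as SQLite's rough equivalent of extracting the hour. The three rows are invented for the illustration and are not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, created_at TEXT)")
# Hypothetical rows: one well inside the window, one exactly on the
# upper bound, and one half an hour past it.
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [
        (1, "2013-10-30 10:00:00"),
        (2, "2013-10-30 15:00:00"),
        (3, "2013-10-30 15:30:00"),
    ],
)

# Filter on the hour number only: anything in the 15:xx hour qualifies.
by_hour = [r[0] for r in conn.execute(
    "SELECT id FROM records "
    "WHERE CAST(strftime('%H', created_at) AS INTEGER) BETWEEN 8 AND 15 "
    "ORDER BY id"
)]

# Filter on the full time of day: 15:30:00 falls outside the window.
by_time = [r[0] for r in conn.execute(
    "SELECT id FROM records "
    "WHERE time(created_at) BETWEEN '08:00:00' AND '15:00:00' "
    "ORDER BY id"
)]
print(by_hour, by_time)
```

If the window has to end exactly at 15:00:00, comparing the full time value is the safer choice.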

Aggregate query in MS-Access that calculates current and cumulative amounts by account is exceptionally slow

I'm having trouble with an Access query. Although it runs, it is exceptionally slow, and I fear I may be overlooking a simpler, more elegant solution in my query design.
For context, I work in an accounts receivable office. We have thousands of customers, and each customer can have one or more accounts. Every month, transactions post to the various accounts, and I am preparing the invoices for the customers. In my particular case, a customer's first invoice is always 001, then 002, and so on. We bill monthly.
To describe a simplified example, in the month of January 2020, customer A may have the following transactions in the Transaction table:
+-----------------------------+
|TransID|Account|Amount|InvNum|
+-----------------------------+
|1 |1 |$10.00|001 |
|2 |2 |$5.00 |001 |
|3 |3 |$2.00 |001 |
+-----------------------------+
So, in the above example, I would want to issue invoice 001 to customer A for a total of $17.00, broken out by account. The invoice would look something like this:
+-----------------------+
|Account|Current|ToDate |
|1 |$10.00 |$10.00 |
|2 |$5.00 |$5.00 |
|3 |$2.00 |$2.00 |
+-----------------------+
$17.00 $17.00
Now, suppose that in February 2020, additional transactions post. A simplified version of the Transaction table would look like this:
+-----------------------------+
|TransID|Account|Amount|InvNum|
+-----------------------------+
|1 |1 |$10.00|001 |
|2 |2 |$5.00 |001 |
|3 |3 |$2.00 |001 |
|4 |1 |$3.00 |002 |
|5 |3 |$4.00 |002 |
+-----------------------------+
Invoice #002 issued to customer A would need to look something like this:
+-----------------------+
|Account|Current|ToDate |
|1 |$3.00 |$13.00 |
|2 |$0.00 |$5.00 |
|3 |$4.00 |$6.00 |
+-----------------------+
$7.00 $24.00
The query I'm having trouble with is specifically designed to capture the month's activity by account and to calculate a cumulative total for the "ToDate" column on the invoice. The challenge is that not every account will have transactions in a given month. Note that account 2 did not post any transactions in February. So invoice 002 has to show a current amount of $0.00 for account 2, but it also needs to know the cumulative amount ($5.00 + $0.00 = $5.00) for account 2.
The problematic query is itself made up of a few subqueries:
BillNumByAccountQ: an aggregate query that selects and groups all accounts by invoice number.
CurrentQ: Also an aggregate query that selects and sums all the transaction amounts (from the transaction table), which is left-joined to BillNumByAccountQ. The left-join is necessary to ensure that there is a row for every bill number. The "Current" field in this query is given by the expression Sum(Nz(Amount,0)). The result set of this query contains over 20K rows.
Finally, the problematic query is defined by the following SQL statement:
SELECT
Q1.Account
,Q1.InvNum
,Q1.CURRENT
,(
SELECT SUM(CURRENT)
FROM CurrentQ
WHERE Q1.Account = Account
AND Q1.InvNum >= InvNum
) AS ToDate
FROM CurrentQ AS Q1;
This query runs and runs and runs, and it eventually causes Access to stop responding. I do not even know how many rows it has because it never finishes running. I fear that I'm overlooking a way simpler solution.
Apologies for so much information, and I appreciate any advice on simplifying this.
Generally, doing a sub-query inside a select statement is slow, since it often needs to run the sub-query for every single row of the main query.
Doing the aggregation all at once is likely going to be faster:
SELECT
Q1.Account
,Q1.InvNum
,Q1.CURRENT
,SUM(i.CURRENT) AS ToDate
FROM CurrentQ AS Q1
INNER JOIN CurrentQ AS i
ON (i.Account = Q1.Account)
AND (i.InvNum <= Q1.InvNum)
GROUP BY Q1.Account, Q1.InvNum, Q1.Current;
In addition, if you're able to edit the database, you'd probably want to add indexes for the Account and InvNum columns.
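For a cumulative total, the joined rows need an invoice number at or below the current row's, matching the Q1.InvNum >= InvNum condition in the original subquery. The following is a minimal sketch of that self-join, run in SQLite through Python rather than in Access. The table name Transactions, the column name CurrentAmt, and the way the CurrentQ stand-in is built are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Transactions (TransID INTEGER, Account INTEGER, Amount REAL, InvNum TEXT);
INSERT INTO Transactions VALUES
  (1, 1, 10.0, '001'), (2, 2, 5.0, '001'), (3, 3, 2.0, '001'),
  (4, 1, 3.0, '002'), (5, 3, 4.0, '002');

-- Stand-in for CurrentQ: one row per (Account, InvNum), 0 when nothing posted.
CREATE VIEW CurrentQ AS
SELECT a.Account, n.InvNum, COALESCE(SUM(t.Amount), 0) AS CurrentAmt
FROM (SELECT DISTINCT Account FROM Transactions) AS a
CROSS JOIN (SELECT DISTINCT InvNum FROM Transactions) AS n
LEFT JOIN Transactions AS t
  ON t.Account = a.Account AND t.InvNum = n.InvNum
GROUP BY a.Account, n.InvNum;
""")

# Self-join: each row is paired with every row of the same account whose
# invoice number is at or below its own, and those amounts are summed.
invoice_002 = conn.execute("""
SELECT q1.Account, q1.InvNum, q1.CurrentAmt, SUM(i.CurrentAmt) AS ToDate
FROM CurrentQ AS q1
JOIN CurrentQ AS i
  ON i.Account = q1.Account AND i.InvNum <= q1.InvNum
WHERE q1.InvNum = '002'
GROUP BY q1.Account, q1.InvNum, q1.CurrentAmt
ORDER BY q1.Account
""").fetchall()

for row in invoice_002:
    print(row)
```

The comparison on InvNum works here because the invoice numbers are zero-padded strings of equal length. With unpadded values, a numeric column would be needed.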

Counting rows for a particular name

I've a table temp(name int,count int). It stores:-
a|count
1|10
1|8
1|4
1|2
2|10
2|6
2|1
I want its rows to be numbered within each name (note that count has to be in decreasing order), i.e.:
a|count|row
1|10 |1
1|8 |2
1|4 |3
1|2 |4
2|10 |1
2|6 |2
2|1 |3
I tried the post "How to show row numbers in PostgreSQL query?", but it just numbers the rows from 1 to 7 rather than per name. Can someone please help me with this? Thanks!
Use the row_number() window function with a PARTITION BY clause:
select a, count, row_number() over(partition by a order by count desc) as rn
from tablename
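To see the per-group numbering on the sample data, here is a small sketch run against SQLite through Python. It assumes an SQLite build with window-function support (3.25 or later), and the table name temp_counts is made up for the example. The same row_number() expression is what the PostgreSQL query above uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "temp_counts" is a hypothetical table name for this sketch.
conn.execute("CREATE TABLE temp_counts (a INTEGER, count INTEGER)")
# Inserted out of order on purpose, to show that ORDER BY inside OVER()
# decides the numbering, not the insertion order.
conn.executemany(
    "INSERT INTO temp_counts VALUES (?, ?)",
    [(2, 1), (1, 4), (1, 10), (2, 10), (1, 2), (2, 6), (1, 8)],
)

numbered = conn.execute(
    "SELECT a, count, "
    "row_number() OVER (PARTITION BY a ORDER BY count DESC) AS rn "
    "FROM temp_counts ORDER BY a, rn"
).fetchall()

for row in numbered:
    print(row)
```

PARTITION BY restarts the counter for each value of a, which is the piece missing from a plain row_number() OVER (ORDER BY ...).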

Data Summarization in Apache Pig/Apache Hive For Given Date Range

I have a requirement where I need to summarize data over a date range provided as input. To be more specific, suppose my data looks like:
Input:
Id|amount|date
1 |10 |2016-01-01
2 |20 |2016-01-02
3 |20 |2016-01-03
4 |20 |2016-09-25
5 |20 |2016-09-26
6 |20 |2016-09-28
If I want the summarization for the month of September, I need to count records over 4 ranges:
Current Date: each day in September.
Week Start Date (the Sunday of the current date's week) to Current Date. For example, if Current Date is 2016-09-28, the week start date is 2016-09-25, and the count covers 2016-09-25 to 2016-09-28.
Month Start Date to Current Date: from 2016-09-01 to Current Date.
Year Start Date to Current Date: from 2016-01-01 to Current Date.
So my output should have one record with 4 count columns for each day of the month (in this case, September), something like:
Output:
Current_Date|Current_date_count|Week_To_Date_Count|Month_to_date_Count|Year_to_date_count
2016-09-25 |1 |1 |1 |4
2016-09-26 |1 |2 |2 |5
2016-09-28 |1 |3 |3 |6
Important: I can pass only 2 variables, the range start date and the range end date. The rest of the calculation needs to be dynamic.
Thanks in advance
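You can join on year, then test each condition separately (using sum(if())). Note that weekofyear() counts ISO weeks, which start on Monday, so the week-to-date column will not line up with a Sunday week start: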
select a.date, sum(if(a.date=b.date,1,0)),
sum(if(month(a.date)=month(b.date) and weekofyear(a.date)=weekofyear(b.date),1,0)),
sum(if(month(a.date)=month(b.date),1,0)),
count(*) from
(select * from input_table where date >= ${hiveconf:start} and date <${hiveconf:end}) a,
(select * from input_table where date <${hiveconf:end}) b
where year(a.date)=year(b.date) and b.date <= a.date group by a.date;
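To make the four ranges concrete outside of Hive, here is a minimal pure-Python sketch of the same counting logic, using the sample dates from the question. The helper names week_start_sunday and summarize are made up for the example, and it assumes an inclusive end date and that only days that have records produce an output row.

```python
from datetime import date, timedelta

# Sample record dates from the question (ids and amounts are not needed
# for counting).
records = [
    date(2016, 1, 1), date(2016, 1, 2), date(2016, 1, 3),
    date(2016, 9, 25), date(2016, 9, 26), date(2016, 9, 28),
]

def week_start_sunday(d):
    # date.weekday() gives Monday=0 ... Sunday=6, so (weekday + 1) % 7
    # is the number of days elapsed since the most recent Sunday.
    return d - timedelta(days=(d.weekday() + 1) % 7)

def summarize(records, range_start, range_end):
    rows = []
    days = sorted({d for d in records if range_start <= d <= range_end})
    for current in days:
        def count_from(start):
            # Records falling in [start, current], both ends inclusive.
            return sum(1 for d in records if start <= d <= current)
        rows.append((
            current.isoformat(),
            count_from(current),                          # current date
            count_from(week_start_sunday(current)),       # week to date
            count_from(current.replace(day=1)),           # month to date
            count_from(current.replace(month=1, day=1)),  # year to date
        ))
    return rows

summary = summarize(records, date(2016, 9, 1), date(2016, 9, 30))
for row in summary:
    print(row)
```

Only the range start and end are passed in. The week, month, and year start dates are derived from each current date, which is the dynamic part the question asks for.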

vertica sql delta

I want to calculate the delta between consecutive records. My table has two columns, id and timestamp, and I want to calculate the time delta between each record and the previous one:
id |timestamp |delta
----------------------------------
1 |100 |0
2 |101 |1 (101-100)
3 |106 |5 (106-101)
4 |107 |1 (107-106)
I work with a Vertica database and I want to create a view/projection of this table in my DB.
Is it possible to do this calculation without using a UDF?
You can use lag() for this purpose:
select id, timestamp,
coalesce(timestamp - lag(timestamp) over (order by id), 0) as delta
from t;
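The lag() approach can be tried on the sample rows with a short sketch in SQLite through Python. This is not Vertica itself: it assumes an SQLite build with window functions (3.25 or later), and the column is named ts here purely for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "ts" is a hypothetical column name standing in for the timestamp column.
conn.execute("CREATE TABLE t (id INTEGER, ts INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 100), (2, 101), (3, 106), (4, 107)],
)

# lag(ts) returns the previous row's ts in id order and NULL for the
# first row, which coalesce() turns into a delta of 0.
deltas = conn.execute(
    "SELECT id, ts, "
    "coalesce(ts - lag(ts) OVER (ORDER BY id), 0) AS delta "
    "FROM t ORDER BY id"
).fetchall()

for row in deltas:
    print(row)
```

Since this is a plain SELECT, it can be wrapped in a view without any user-defined function.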

sum rows from one table and move it to another table

How can I sum rows from one table (based on selected criteria) and move the outcome to another table?
I have a table related to costs within a project:
Table "costs":
id| CostName |ID_CostCategory| PlanValue|DoneValue
-------------------------------------------------------
1 | books |1 |100 |120
2 | flowers |1 |90 |90
3 | car |2 |150 |130
4 | gas |2 |50 |45
and I want to put the sum of "DoneValue" of each ID_CostCategory into table "CostCategories"
Table "CostCategories":
id|name |planned|done
------------------------
1 |other|190 |takes the sum from above table
2 |car |200 |takes the sum from above table
Many thanks
I would not store this, because as soon as anything changes in Costs, CostCategories will be out of date. Instead, I would create a view, e.g.:
CREATE VIEW CostCategoriesSum
AS
SELECT CostCategories.ID,
CostCategories.Name,
SUM(COALESCE(Costs.PlanValue, 0)) AS Planned,
SUM(COALESCE(Costs.DoneValue, 0)) AS Done
FROM CostCategories
LEFT JOIN Costs
ON Costs.ID_CostCategory = CostCategories.ID
GROUP BY CostCategories.ID, CostCategories.Name;
Now instead of referring to the table, you can refer to the view and the Planned and Done totals will always be up to date.
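The point about the view never going stale can be demonstrated with a small sketch in SQLite through Python, using the sample rows from the question. The schema below is an assumption reconstructed from the question's tables, and the update to the gas row is invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CostCategories (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Costs (
    ID INTEGER PRIMARY KEY, CostName TEXT, ID_CostCategory INTEGER,
    PlanValue REAL, DoneValue REAL
);
INSERT INTO CostCategories VALUES (1, 'other'), (2, 'car');
INSERT INTO Costs VALUES
  (1, 'books', 1, 100, 120), (2, 'flowers', 1, 90, 90),
  (3, 'car', 2, 150, 130), (4, 'gas', 2, 50, 45);

-- Totals are computed on read, so nothing has to be kept in sync.
CREATE VIEW CostCategoriesSum AS
SELECT CostCategories.ID, CostCategories.Name,
       SUM(COALESCE(Costs.PlanValue, 0)) AS Planned,
       SUM(COALESCE(Costs.DoneValue, 0)) AS Done
FROM CostCategories
LEFT JOIN Costs ON Costs.ID_CostCategory = CostCategories.ID
GROUP BY CostCategories.ID, CostCategories.Name;
""")

query = "SELECT ID, Name, Planned, Done FROM CostCategoriesSum ORDER BY ID"
before = conn.execute(query).fetchall()

# Hypothetical change: the actual spend on gas is corrected to 60.
conn.execute("UPDATE Costs SET DoneValue = 60 WHERE CostName = 'gas'")
after = conn.execute(query).fetchall()

print(before)
print(after)
```

A stored total would need a trigger or a scheduled refresh to follow the same change.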

INSERT INTO CostCategories (id, name, planned, done)
SELECT ID_CostCategory, CostName, SUM(PlanValue), SUM(DoneValue)
FROM costs
GROUP BY ID_CostCategory, CostName;