Grouping by "on the fly" calculation in Postgres - sql

Very simple question here, but a quick Google search didn't seem to be definitive (and I do not have access to a DB to test right now). I would like to check whether you can do "on the fly" grouping in Postgres (as is possible in SQL Server). The best way to clarify is with an example, i.e. can I do this to group by weekly periods:
select ...
from ...
group by cast((current_date - transaction_date)/7 as int)
or is it necessary to first define a week column in a subquery (as per the calculation above) and then do the grouping?
Thanks in advance for your help.

You can include most expressions in the GROUP BY, so your code is fine. This is true in Postgres and in almost any database.
It is unusual to have an aggregation query where the grouping expressions are not part of the SELECT. But if you have data on every day, then this is a sensible query:
select min(transaction_date) as week_start, count(*)
from ...
group by cast((current_date - transaction_date)/7 as int)

You surely can. I would slightly modify your example like
select cast((current_date - transaction_date)/7 as int) as wp, ...
from ...
group by wp;
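To try the idea without a Postgres instance, here is a minimal sketch that runs the same kind of query in SQLite through Python's standard sqlite3 module. The table name, the sample rows, and the fixed reference date (standing in for current_date so the result is reproducible) are assumptions made for illustration. SQLite has no date subtraction operator, so julianday() is used where Postgres would use current_date - transaction_date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; dates are stored as ISO-8601 text, which SQLite understands.
conn.execute("CREATE TABLE transactions (transaction_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        ("2024-01-08", 10.0),
        ("2024-01-10", 5.0),
        ("2024-01-14", 7.5),
        ("2024-01-16", 1.0),
        ("2024-01-21", 2.0),
    ],
)

# The grouping key is an expression computed per row (whole weeks before :today),
# written directly in GROUP BY with no week column defined in a subquery.
sql = """
    SELECT CAST((julianday(:today) - julianday(transaction_date)) / 7 AS INTEGER) AS weeks_ago,
           MIN(transaction_date) AS first_day,
           COUNT(*) AS n
    FROM transactions
    GROUP BY CAST((julianday(:today) - julianday(transaction_date)) / 7 AS INTEGER)
    ORDER BY weeks_ago
"""
rows = conn.execute(sql, {"today": "2024-01-22"}).fetchall()

for row in rows:
    print(row)
# (0, '2024-01-16', 2)
# (1, '2024-01-10', 2)
# (2, '2024-01-08', 1)
```

The rows dated 6 and 1 days before the reference date fall into week 0, those 12 and 8 days before into week 1, and the one 14 days before into week 2.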

Related

Query does not include the specified expression as part of an aggregate function in UNION query

I am doing a Union Query to add together the results of two separate queries that give me data from two different fiscal periods, to get a rolling 12 months number.
I get the message "Your query does not include the specified expression "Report_Header" as part of an aggregate function". I have read that the field needs to be included in a GROUP BY statement at the end, but when I add the field from either query or with both queries as shown below I still get the message. Help? I'm not a programmer, I'm an Access user, so I need to simple please :).
SELECT [JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].Report_Header,
Sum([JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].SumOfCASES) AS CASES,
Sum([JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].SumOfPurchases) AS PURCHASES
FROM [JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB]
UNION ALL
SELECT [JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].Report_Header,
Sum([JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].SumOfCASES) AS CASES,
Sum([JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].SumOfPurchases) AS PURCHASES
FROM [JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2]
GROUP BY [JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].Report_Header,
[JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].Report_Header
Thanks!
You can aggregate both subqueries:
SELECT [JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].Report_Header,
Sum([JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].SumOfCASES) AS CASES,
Sum([JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].SumOfPurchases) AS PURCHASES
FROM [JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB]
GROUP BY [JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB].Report_Header
UNION ALL
SELECT [JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].Report_Header,
Sum([JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].SumOfCASES) AS CASES,
Sum([JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].SumOfPurchases) AS PURCHASES
FROM [JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2]
GROUP BY [JOIN_IB_FREIGHT&PURCHASES_Rolling12_SUB2].Report_Header;
This may be what you want. But, it will not combine information under the same header from both tables. For that, the simplest method is probably a view.
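Place GROUP BY [JOIN_IB_FREIGHT&PURCHASES_ROLLING12_SUB].Report_Header under the first query as well. Each SELECT in a UNION needs its own GROUP BY, and each GROUP BY can only reference the table in its own FROM clause.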
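If the goal is one row per header across both periods, one possible approach is to union the raw rows first and aggregate once over the combined set. The sketch below uses a derived table rather than a saved view or query, runs in SQLite via Python's sqlite3 module rather than Access, and uses hypothetical table names (period_a, period_b) and made-up rows, so treat it as an illustration of the shape of the query only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two hypothetical source tables standing in for the two fiscal-period queries.
conn.executescript(
    """
    CREATE TABLE period_a (Report_Header TEXT, SumOfCASES INTEGER, SumOfPurchases REAL);
    CREATE TABLE period_b (Report_Header TEXT, SumOfCASES INTEGER, SumOfPurchases REAL);
    INSERT INTO period_a VALUES ('Dairy', 10, 100.0), ('Produce', 4, 40.0);
    INSERT INTO period_b VALUES ('Dairy', 5, 50.0), ('Frozen', 2, 20.0);
    """
)

# Union the detail rows first, then group once, so that a header present in
# both periods ends up in a single output row.
rows = conn.execute(
    """
    SELECT Report_Header,
           SUM(SumOfCASES) AS CASES,
           SUM(SumOfPurchases) AS PURCHASES
    FROM (
        SELECT Report_Header, SumOfCASES, SumOfPurchases FROM period_a
        UNION ALL
        SELECT Report_Header, SumOfCASES, SumOfPurchases FROM period_b
    ) AS combined
    GROUP BY Report_Header
    ORDER BY Report_Header
    """
).fetchall()

for row in rows:
    print(row)
# ('Dairy', 15, 150.0)
# ('Frozen', 2, 20.0)
# ('Produce', 4, 40.0)
```

In Access the inner UNION ALL could equally be saved as its own query and the outer SELECT written against that saved query.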

Cumulative sum in report

Hello All,
I am working on a report where I am doing calculations:
Let's take the first line as an example. In the remaining prior column we have 15 and in the taken column we have 0.5, so in the remaining column, we have 14.5.
Now the issue is to use the result in the remaining field and transfer it to the next line in the remaining prior column. So instead of having 14 we should be having 14.5.
Has anyone worked on something similar who can guide me on how to approach this? I really want to learn how to solve such an issue.
The ANSI standard lag() function does exactly what you want. SQL tables represent unordered sets, so I need to assume that you have some column -- which I will call id -- that identifies the ordering of the rows.
The syntax for lag() is:
select t.*, lag(Remaining) over (order by id) as prevRemaining
from table t;
If you have a database that does not support the ANSI standard window functions, you can get the same effect with a subquery. However, the syntax for that might vary slightly among databases.
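A small runnable sketch of lag(), using SQLite through Python's sqlite3 module. The table name leave_rows, the id ordering column, and the sample values are assumptions for illustration. Window functions need SQLite 3.25 or later, which recent Python builds normally bundle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: id defines the row order, remaining is the computed balance.
conn.execute(
    "CREATE TABLE leave_rows (id INTEGER PRIMARY KEY, taken REAL, remaining REAL)"
)
conn.executemany(
    "INSERT INTO leave_rows VALUES (?, ?, ?)",
    [(1, 0.5, 14.5), (2, 1.0, 13.5), (3, 2.0, 11.5)],
)

# LAG(remaining) returns the value of remaining from the previous row in id order;
# the first row has no predecessor, so it gets NULL (None in Python).
rows = conn.execute(
    """
    SELECT id, taken, remaining,
           LAG(remaining) OVER (ORDER BY id) AS prev_remaining
    FROM leave_rows
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
# (1, 0.5, 14.5, None)
# (2, 1.0, 13.5, 14.5)
# (3, 2.0, 11.5, 13.5)
```

The prev_remaining column is what the report would show as "remaining prior" on each line.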

How to get all rows from a table inserted in a particular date.

I am trying to write a query that gets all the rows of a table for a particular date.
SELECT * FROM MY_TABLE WHERE COLUMN_CONTAINING_DATE='2013-05-07'
However that does not work, because in the table the COLUMN_CONTAINING_DATE contains data like '2013-05-07 00:00:01' etc. So, this would work
SELECT * FROM MY_TABLE WHERE COLUMN_CONTAINING_DATE>='2013-05-07' AND COLUMN_CONTAINING_DATE<'2013-05-08'
However, I don't want to go for option 2 because it feels hacky. I would rather write a query that says "get me all the rows for a given date" and not bother about the hours and minutes in COLUMN_CONTAINING_DATE.
I am trying to have this query run on both H2 and DB2.
Any suggestions?
You can do:
select *
from MY_Table
where trunc(COLUMN_CONTAINING_DATE) = '2013-05-07';
However, the version that you describe as a "hack" is actually better. When a function is wrapped around the column, many SQL optimizers will not use an index on that column. With direct comparisons against the bare column, an index on it can be used.
Use something like this
SELECT * FROM MY_TABLE WHERE COLUMN_CONTAINING_DATE=DATE('2013-05-07')
You can ease this if you use the Temporal data management capability from DB2 10.1.
For more information:
http://www.ibm.com/developerworks/data/library/techarticle/dm-1204db2temporaldata/
If your concerns are related to the different data types (timestamp in the column, and a string containing a date), you can do this:
SELECT * FROM MY_TABLE
WHERE
COLUMN_CONTAINING_DATE >= '2013-05-07 00:00:00'
and COLUMN_CONTAINING_DATE < '2013-05-08 00:00:00'
and I'd pay attention to the formatting of the where clause, because this will improve readability a lot, if you have to look at your queries two months later. Just pick a style you prefer for ranges like "a <= x < b". Unfortunately SQL's between does not support this.
One could argue that the milliseconds are still missing, so perfectionists may append another ".0" in the timestamp ...
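The half-open range "start <= x < next day" can be wrapped in a small helper so the calling code still reads as "all rows for a given date". This is a sketch in Python with SQLite standing in for H2 or DB2. The helper name rows_for_day and the sample rows are hypothetical, and timestamps are stored as ISO-8601 text because SQLite has no native timestamp type.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MY_TABLE (id INTEGER, COLUMN_CONTAINING_DATE TEXT)")
conn.executemany(
    "INSERT INTO MY_TABLE VALUES (?, ?)",
    [
        (1, "2013-05-06 23:59:59"),
        (2, "2013-05-07 00:00:01"),
        (3, "2013-05-07 18:30:00"),
        (4, "2013-05-08 00:00:00"),
    ],
)


def rows_for_day(conn, day):
    """Return ids of rows whose timestamp falls on `day` (an ISO date string)."""
    start = date.fromisoformat(day)
    end = start + timedelta(days=1)
    # Half-open interval: the start is included, midnight of the next day is not.
    cur = conn.execute(
        "SELECT id FROM MY_TABLE "
        "WHERE COLUMN_CONTAINING_DATE >= ? AND COLUMN_CONTAINING_DATE < ? "
        "ORDER BY id",
        (start.isoformat(), end.isoformat()),
    )
    return [r[0] for r in cur]


# Plain equality against the date matches nothing, as described in the question.
equal_ids = [
    r[0]
    for r in conn.execute(
        "SELECT id FROM MY_TABLE WHERE COLUMN_CONTAINING_DATE = ?", ("2013-05-07",)
    )
]
range_ids = rows_for_day(conn, "2013-05-07")

print(equal_ids)  # []
print(range_ids)  # [2, 3]
```

Computing the upper bound from the date keeps the range logic in one place and leaves the column itself untouched in the WHERE clause.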

Need Alternative to SQL AVG() for Obvient BI tool

I am reading 2 fields from 1 table.
StartKey and Mins
Image below shows my current output result on left and what I need on right.
Here is my Query
Select
StartKey,
Duration as Mins
From TableA
Where Flag = 0
Order by StartKey
I know I can use avg(duration), but if I use that, Obvient, the software I am using to write and display the query, won't let me take the average of column Mins Avg itself.
I get this error after I manually inserted the averaging code for the column in the CS file and then tried to edit the column properties.
First, let me make sure I understand your problem.
You are using the SQL from your post while building something in Obvient which appears to be a Business Intelligence platform. The problem you are having is that you are unable to perform an average function in Obvient on the column of averages in your SQL query.
If that is correct, you should use your SQL query to create a view in the database which should appear to Obvient as a table and may allow you to perform the averaging function. I can't say for certain that this will solve your issue having never used Obvient, but give that a try and let us know how that works for you.
Seems like I'm missing something, but to get your desired results, this should work:
Select
StartKey,
AVG(Duration) as Mins
From TableA
Where Flag = 0
Group By StartKey
Order by StartKey
And the SQL Fiddle.
If your goal is to get the AVG(Mins) from the above query, you could use a subquery to return that:
Select AVG(Mins)
FROM (
SELECT
StartKey,
AVG(Duration) as Mins
From TableA
Group By StartKey
) t
Here is the Fiddle:
Good luck.
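Both queries can be checked against a tiny data set. The sketch below runs them in SQLite via Python's sqlite3 module. The sample rows are made up, and the Flag = 0 filter is kept in the inner query as an assumption so that both steps work on the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (StartKey INTEGER, Duration REAL, Flag INTEGER)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?)",
    [(1, 10, 0), (1, 20, 0), (2, 30, 0), (2, 50, 0), (3, 99, 1)],
)

# Step 1: one average per StartKey (the Flag = 1 row is filtered out).
per_key = conn.execute(
    """
    SELECT StartKey, AVG(Duration) AS Mins
    FROM TableA
    WHERE Flag = 0
    GROUP BY StartKey
    ORDER BY StartKey
    """
).fetchall()

# Step 2: the average of those per-key averages, via a derived table.
overall = conn.execute(
    """
    SELECT AVG(Mins)
    FROM (
        SELECT StartKey, AVG(Duration) AS Mins
        FROM TableA
        WHERE Flag = 0
        GROUP BY StartKey
    ) AS t
    """
).fetchone()[0]

print(per_key)  # [(1, 15.0), (2, 40.0)]
print(overall)  # 27.5
```

Note that an average of averages weights every StartKey equally, which differs from AVG(Duration) over all rows whenever the keys have different row counts.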

How can I optimise this Query?

How can I optimize this query if given the following query returns either all entries in the table or entries that match only up to current date ?
btw: The Query is targeted to a Oracle Linked Server on MS Sql 2005 as an Inline function.. Do not want this to be a table value function..
ALTER function [dbo].[ftsls031nnnHades](@withExpiredEntries bit =0)
returns table as return
select *
from openQuery(Hades ,"select '010' comno,
trim(t$cuno) t$cuno,
trim(t$cpgs) t$cpgs,
t$dile,
t$qanp,
to_char(t$stdt,'dd Mon yy') t$stdt,
to_char(t$tdat,'dd Mon yy') t$tdat,
to_char(t$disc,'999.99') t$disc,
t$damt,
t$cdis,
t$gnpr,
t$refcntd,
t$refcntu
from baan.ttdsls031010
where (to_char(t$Tdat,'yyyy-mm-dd') >= To_char(current_date,'yyyy-mm-dd'))
and (to_char(t$stdt,'yyyy-mm-dd') <= To_char(current_date,'yyyy-mm-dd'))
union all
select '020' comno,
trim(t$cuno) t$cuno,
trim(t$cpgs) t$cpgs,
t$dile,t$qanp,
to_char(t$stdt,'dd Mon yy') t$stdt,
to_char(t$tdat,'dd Mon yy') t$tdat,
to_char(t$disc,'999.99') t$disc,
t$damt,
t$cdis,
t$gnpr,
t$refcntd,
t$refcntu
from baan.ttdsls031020
where (to_char(t$tdAt,'yyyy-mm-dd') >= To_char(current_date,'yyyy-mm-dd'))
and (to_char(t$stdt,'yyyy-mm-dd') <= To_char(current_date,'yyyy-mm-dd')) ")
p.s: Column naming conventions may be alien to those who are not BaaN users. Please excuse me for bringing 'BaaN' conventions into StackOverflow.
Never apply a function to your date columns (t$Tdat and t$stdt are of this type, aren't they?) unless you have the corresponding function-based index. Wrapping the columns prevents the use of ordinary indexes on t$stdt and t$Tdat and can reduce performance dramatically.
Instead, I would rewrite the where clause in the following way:
where t$Tdat >= current_date and t$stdt <= current_date
if current_date is of date type. If it's not, then you can use, for example, to_date(current_date, 'DD-MM-YYYY') instead of it.
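The effect of wrapping an indexed column in a function can be observed in a query plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as a stand-in for Oracle. The table, index, and column names are hypothetical, and the exact plan wording varies between SQLite versions, so the code only checks whether the index name appears in the plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with an ordinary index on the timestamp column.
conn.execute(
    "CREATE TABLE discounts (id INTEGER PRIMARY KEY, start_ts TEXT, payload TEXT)"
)
conn.execute("CREATE INDEX idx_start ON discounts (start_ts)")


def plan(sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN joined into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


# Function applied to the column: the ordinary index on start_ts does not apply.
wrapped = plan(
    "SELECT * FROM discounts WHERE date(start_ts) = ?",
    ("2013-05-07",),
)

# Bare column compared against a range: the index on start_ts is usable.
direct = plan(
    "SELECT * FROM discounts WHERE start_ts >= ? AND start_ts < ?",
    ("2013-05-07", "2013-05-08"),
)

print("wrapped:", wrapped)
print("direct: ", direct)
print("index used when wrapped:", "idx_start" in wrapped)
print("index used when direct: ", "idx_start" in direct)
```

Oracle's optimizer is a different piece of software, so its plans should be checked with its own tools, but the general principle of keeping the indexed column bare in the predicate carries over.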
Just in case be here now's tip - which is a good one - doesn't work:
you'll need to collect some data to know where time is being spent. Please read this OTN-thread to see how to do this for Oracle: http://forums.oracle.com/forums/thread.jspa?messageID=1812597. For SQL Server, the same principles apply: use their tools to find out where this query is spending time on.
Some general information you can share is:
How many rows are in those two tables
How many rows are returned by that query
Which indexes are present on those two tables
How long does the query currently take
What response time is acceptable, i.e. when are we done tuning
Regards,
Rob.
Not sure how much this will improve performance, but the first thing I'd do is replace the date to string conversion with just date functions. That is, use trunc() instead of to_char().
You can optimize the Baan query in the following ways:
In the Where condition, use indexed fields and combined fields if possible.
In the Where condition, use "Between/Inrange" when both an upper and a lower limit are specified.
Use "Refers To" if a reference is available in the data dictionary.
Use as few overlapping "Or" conditions as possible.
Select only the fields of the table that are actually required.
Use "Order by" to get records in the required sort order.
If possible, avoid the negated operators (NOT INRANGE, NOT BETWEEN, NOT IN), because they can force a full table scan.
Use commit.transaction() to prevent a line from being printed twice.