mysql query on Group fails - sql

In a part of my sql query at the end of the query I have this
GROUP BY
`Record`.`RecordID`
ORDER BY
`Record`.`RecordID`
It works fine until RecordID is NULL, and then the MySQL query fails. Is there a way around that, e.g. so that if it is NULL I don't use GROUP BY and ORDER BY?
Thanks

You can try:
GROUP BY IFNULL(`Record`.`RecordID`,0)
You can skip the ORDER BY in MySQL 5.7 and earlier, where GROUP BY sorts implicitly. MySQL 8.0 removed that implicit sorting, so keep the ORDER BY there if the order matters.

When you say fail, what do you mean?
If I have the table:
Value
a
b
{null}
c
c
and I run the query:
select value from table
group by value
Your result is:
{null}
a
b
c
To get rid of the nulls:
select value from table
group by value
having value is not null
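A minimal sketch of the behaviour described above, using Python's built-in sqlite3 module in place of MySQL (an assumption made so the example is self-contained; the table name t is hypothetical). It shows that NULL simply forms its own group, and that filtering it out removes that group.

```python
import sqlite3

# Hypothetical in-memory table mirroring the sample data above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (value TEXT)")
conn.executemany(
    "INSERT INTO t (value) VALUES (?)",
    [("a",), ("b",), (None,), ("c",), ("c",)],
)

# NULL becomes one group of its own; the query does not fail.
grouped = conn.execute(
    "SELECT value FROM t GROUP BY value ORDER BY value"
).fetchall()

# Filtering with WHERE before grouping drops the NULL group.
non_null = conn.execute(
    "SELECT value FROM t WHERE value IS NOT NULL "
    "GROUP BY value ORDER BY value"
).fetchall()

print(grouped)
print(non_null)
```

The explicit ORDER BY is there because the order of grouped rows should not be relied on without it.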

I don't see how the GROUP BY and ORDER BY clauses in and of themselves can cause anything to fail. Please don't show just the part that you think is broken; if you knew which part was broken, you wouldn't need to ask here, right?
Add an IS NOT NULL filter to remove them entirely:
WHERE `Record`.`RecordID` is not null
GROUP BY
`Record`.`RecordID`
ORDER BY
`Record`.`RecordID`

Related

SQL GROUP BY 1 2 3 and SQL Order of Execution

This may be a dumb question but I am really confused. According to the SQL query order of execution, the GROUP BY clause is executed before the SELECT clause. However, it allows you to do something like:
SELECT field_1, SUM(field_2) FROM myTable GROUP BY 1
My confusion is that if GROUP BY clause happens before SELECT, in this scenario I provided, how does SQL know what 1 is? It works with ORDER BY clause and it makes sense to me because ORDER BY clause happens after SELECT.
Can someone help me out? Thanks in advance!
https://www.periscopedata.com/blog/sql-query-order-of-operations
My understanding is because it's ordinal notation and for the SELECT statement to pass syntax validation you have to have at least selected a column. So the 1 is stating the first column in the select statement since it knows you have a column selected.
EDIT:
I see people saying you can't use ordinal notation and they are right if you're using SQL Server. You can use it in MySQL though.
Take select a,b,c from emp group by 1,2,3. It groups by column a first, then b, then c. The numbers refer to the positions of the columns in the select list.
Each GROUP BY expression must contain at least one column that is not an outer reference. You cannot group by 1 if it is not a column in your table.
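Here is a small sketch of the ordinal notation discussed above. It assumes SQLite (through Python's sqlite3 module), which, like MySQL, accepts a position number in GROUP BY; the table my_table and its rows are hypothetical. The engine sees the whole statement before running any of it, so it can resolve 1 to the first item of the select list even though grouping logically happens before the select list is evaluated.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (field_1 TEXT, field_2 INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("x", 1), ("x", 2), ("y", 5)],
)

# GROUP BY 1 refers to the first expression in the select list (field_1).
by_ordinal = conn.execute(
    "SELECT field_1, SUM(field_2) FROM my_table GROUP BY 1 ORDER BY 1"
).fetchall()

# The same query with the column spelled out.
by_name = conn.execute(
    "SELECT field_1, SUM(field_2) FROM my_table "
    "GROUP BY field_1 ORDER BY field_1"
).fetchall()

print(by_ordinal)
print(by_name)
```

Both queries should return the same rows; the named form is usually easier to read and does not break when the select list is reordered.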

getting same top 1 result in sql server

I have this query:
SELECT
IT_approvaldate
FROM
t_item
WHERE
IT_certID_fk_ind = (SELECT DISTINCT TOP 1 IT_certID_fk_ind
FROM t_item
WHERE IT_rfileID_fk = '4876')
ORDER BY
IT_typesort
Result when running this query:
I need to get the top 1 result (2013-04-27 00:00:00). The problem is that when I select TOP 1, I get the second result instead.
I believe the reason is that the ORDER BY column has the same value in those two rows.
However, I need only the top 1 value of the IT_approvaldate column as the result of my query.
How can I do this? Can anyone help me solve this?
Hi, use the query below and check:
SELECT IT_approvaldate FROM t_item WHERE IT_certID_fk_ind =(SELECT DISTINCT top 1 IT_certID_fk_ind FROM t_item WHERE IT_rfileID_fk ='4876' ) and IT_approvaldate is not null ORDER BY IT_typesort
This will remove null values from the result
If you want NULL to be the last value in the sorted list you can use ISNULL in ORDER BY clause to replace NULL by MAX value of DATETIME
Below code might help:
SELECT TOP 1 IT_approvaldate
FROM t_item
WHERE IT_certID_fk_ind = (SELECT DISTINCT top 1 IT_certID_fk_ind FROM t_item WHERE IT_rfileID_fk ='4876' )
ORDER BY IT_typesort ASC, ISNULL(IT_approvaldate,'12-31-9999 23:59:59') ASC;
T-SQL SELECT queries are not inherently deterministic. When the ORDER BY column has ties, you must add a tie-breaker, i.e. also order by another column whose values are not tied.
The theory is that SQL Server will not presume that the NULL value is greater or lesser than your row, and because your SELECT list is not logically processed until after the HAVING clause, the order depends on how the database is set up.
Understand that SQL Server may not necessarily choose the same path twice unless it thinks it is absolutely better. This is the reason for the ORDER BY clause, which will treat NULLs consistently (assuming there is a unique grouping).
UPDATE:
It seemed a good idea to add a link to MSDN's documentation on the ORDER BY. Truly, it is good practice to start from the Standard/MSDN. ORDER BY Clause - MSDN
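The tie-breaker idea can be sketched as follows. This is only an illustration under assumptions: it uses SQLite through Python's sqlite3 module, so LIMIT 1 stands in for SQL Server's TOP 1, the table is reduced to two columns, and the rows are made up. The second and third sort keys break the tie on IT_typesort and push NULL dates last.

```python
import sqlite3

# Reduced, hypothetical version of t_item with two tied IT_typesort values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_item (IT_typesort INTEGER, IT_approvaldate TEXT)"
)
conn.executemany(
    "INSERT INTO t_item VALUES (?, ?)",
    [
        (1, None),
        (1, "2013-04-27 00:00:00"),
        (2, "2014-01-01 00:00:00"),
    ],
)

# Without the extra keys, either IT_typesort = 1 row could come first.
# "IT_approvaldate IS NULL" evaluates to 0 or 1, so non-NULL dates sort first.
top_row = conn.execute(
    "SELECT IT_approvaldate FROM t_item "
    "ORDER BY IT_typesort ASC, IT_approvaldate IS NULL ASC, "
    "IT_approvaldate ASC "
    "LIMIT 1"
).fetchone()

print(top_row)
```

In SQL Server the same effect would come from extra ORDER BY keys such as the ISNULL expression shown earlier.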

To Remove Duplicates from Netezza Table

I have a scenario for a type 2 table where I have to remove duplicates at the whole-row level.
Let's consider the example below as the data in the table.
A|B|C|D|E
100|12-01-2016|2|3|4
100|13-01-2016|3|4|5
100|14-01-2016|2|3|4
100|15-01-2016|5|6|7
100|16-01-2016|5|6|7
If you consider A as key column, you know that last 2 rows are duplicates.
Generally to find duplicates, we use group by function.
select A,C,D,E,count(1)
from table
group by A,C,D,E
having count(*)>1
For this, the output would report 100|2|3|4 as a duplicate, and also 100|5|6|7.
However, only 100|5|6|7 is a duplicate as per type 2. 100|2|3|4 is not, because this value came back in the 3rd run and not immediately after the 1st load.
If I add the date field to the GROUP BY, 100|5|6|7 will not be considered a duplicate, but in reality it is.
I am trying to figure out the duplicates as explained above.
The duplicates should only be 100|5|6|7 and not 100|2|3|4.
Can someone please help out with the SQL for this?
Regards
Raghav
Use row_number analytical function to get rid of duplicates.
delete from
(
select a,b,c,d,e,row_number() over (partition by a,b,c,d,e) as rownumb
from table
) as a
where rownumb > 1
If you want to see all duplicated rows, you need to join the table with your GROUP BY query, or filter the table using the GROUP BY query as a subquery.
WITH cte AS (SELECT a, b, c, d, e, count(*)
FROM table
GROUP BY 1,2,3,4,5
HAVING count(*)>1)
SELECT * FROM cte
WHERE b <> b + 1
Try this query and see if it works. If you get any errors, let me know.
I am assuming that your column B is in the date format; if not, cast it to date.
If you can see the duplicates, then just replace SELECT * with DELETE.
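One possible way to express the "duplicate only if it repeats the immediately preceding row" rule from the question is to compare each row with the previous row for the same key. The sketch below is not from any of the answers; it assumes SQLite through Python's sqlite3 module rather than Netezza, a hypothetical table name type2_table, and ISO-formatted dates so that they sort correctly as text.

```python
import sqlite3

# Hypothetical copy of the sample data, with dates rewritten as YYYY-MM-DD.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE type2_table "
    "(a INTEGER, b TEXT, c INTEGER, d INTEGER, e INTEGER)"
)
conn.executemany(
    "INSERT INTO type2_table VALUES (?, ?, ?, ?, ?)",
    [
        (100, "2016-01-12", 2, 3, 4),
        (100, "2016-01-13", 3, 4, 5),
        (100, "2016-01-14", 2, 3, 4),
        (100, "2016-01-15", 5, 6, 7),
        (100, "2016-01-16", 5, 6, 7),
    ],
)

# For each row, find the previous row of the same key (latest earlier date)
# and keep the row only if c, d and e are unchanged from that previous row.
consecutive_dupes = conn.execute(
    """
    SELECT cur.a, cur.b, cur.c, cur.d, cur.e
    FROM type2_table AS cur
    JOIN type2_table AS prev
      ON prev.a = cur.a
     AND prev.b = (SELECT MAX(b) FROM type2_table
                   WHERE a = cur.a AND b < cur.b)
    WHERE prev.c = cur.c AND prev.d = cur.d AND prev.e = cur.e
    ORDER BY cur.b
    """
).fetchall()

print(consecutive_dupes)
```

Where window functions are available, LAG(...) OVER (PARTITION BY a ORDER BY b) could express the same previous-row comparison more directly.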

Difference between "HAVING ... GROUP BY" and "GROUP BY ... HAVING"

I have got the table MYTABLE with 2 columns: A and B
I have got the following pieces of the code:
SELECT MYTABLE.A FROM MYTABLE
HAVING SUM(MYTABLE.B) > 100
GROUP BY MYTABLE.A
and
SELECT MYTABLE.A FROM MYTABLE
GROUP BY MYTABLE.A
HAVING SUM(MYTABLE.B) > 100
Is it the same? Is it possible that these two queries will return different sets of results?
Thank you in advance
As documented, there is no difference. People are just used to seeing HAVING after GROUP BY.
http://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_10002.htm#SQLRF20040
Specify GROUP BY and HAVING after the where_clause and hierarchical_query_clause. If you specify both GROUP BY and HAVING, then they can appear in either order.
http://sqlfiddle.com/#!4/66e33/1
I originally wrote:
I am not sure your 1st query is valid. As far as I know, HAVING should always come after GROUP BY.
I was corrected by David Aldridge, the Oracle docs state that the order does not matter. Although I don't recommend using HAVING before GROUP for readability reasons (and to prevent confusion with a WHERE clause), it is technically correct. So that makes the answer to your question 'yes, it's the same'.
You can't have a HAVING before a GROUP BY, the HAVING is like the "WHERE" but for the GROUP BY condition.
The clauses are evaluated in order. You can have a HAVING clause following immediately the FROM clause. In this case, the HAVING clause will apply to the entire rows of the result set. The select list may only contain, in this case, one/more/all of the aggregation functions contained in the HAVING clause.
So, your first query is not valid because of the above. A valid query would be
SELECT SUM(MYTABLE.B) AS s FROM MYTABLE
HAVING SUM(MYTABLE.B) > 100
The above query will return one or no row, depending on whether the condition SUM(MYTABLE.B) > 100 is verified or not.
Still, there is one more reason for which your first query is not valid. The GROUP BY clause may refer only to columns in the data set to which it applies. So going on with my valid query above, you can write the following valid query (though it will be useless and nonsense, as it is applied to either one or no rows):
SELECT SUM(s)
FROM
(
SELECT SUM(MYTABLE.B) s
FROM MYTABLE
HAVING SUM(MYTABLE.B) > 100
) q
GROUP BY s
So, just to answer: no, they're not the same. One of them is not even valid.
Both WHERE and HAVING allow you to impose conditions in the query. The difference:
We use WHERE to filter the individual rows selected from the table.
We use HAVING to filter the groups produced by GROUP BY.

I am getting: "You tried to execute a query that does not include the specified expression 'OrdID' as part of an aggregate function." How do I bypass?

My code is as follows:
SELECT Last, OrderLine.OrdID, OrdDate, SUM(Price*Qty) AS total_price
FROM ((Cus INNER JOIN Orders ON Cus.CID=Orders.CID)
INNER JOIN OrderLine
ON Orders.OrdID=OrderLine.OrdID)
INNER JOIN ProdFabric
ON OrderLine.PrID=ProdFabric.PrID
AND OrderLine.Fabric=ProdFabric.Fabric
GROUP BY Last
ORDER BY Last DESC, OrderLine.OrdID DESC;
This code has been answered before, but vaguely. I was wondering where I am going wrong.
You tried to execute a query that does not include the specified expression 'OrdID' as part of an aggregate function.
is the error message I keep getting; no matter what I change, it gives me this error. Yes, I know it is written as SQL-92, but how do I make this a legal query?
For almost every DBMS (MySQL is the only exception I'm aware of, but there could be others), every column in a SELECT that is not aggregated needs to be in the GROUP BY clause. In the case of your query, that would be everything but the columns in the SUM():
SELECT Last, OrderLine.OrdID, OrdDate, SUM(Price*Qty) AS total_price
...
GROUP BY Last, OrderLine.OrdID, OrdDate
ORDER BY Last DESC, OrderLine.OrdID DESC;
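A rough sketch of the rule, under stated assumptions: it uses SQLite through Python's sqlite3 module, and a single flattened table order_lines with made-up columns and rows instead of the original four-table join. SQLite itself tolerates bare non-aggregated columns, so it will not reproduce the error; the point is only to show the two forms that stricter engines accept, listing every non-aggregated column in GROUP BY, or aggregating the ones you leave out.

```python
import sqlite3

# Hypothetical flattened data standing in for the joined tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_lines "
    "(last_name TEXT, ord_id INTEGER, ord_date TEXT, "
    "price INTEGER, qty INTEGER)"
)
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?, ?, ?)",
    [
        ("Smith", 1, "2020-01-01", 10, 2),
        ("Smith", 1, "2020-01-01", 5, 1),
        ("Smith", 2, "2020-02-01", 3, 4),
        ("Jones", 3, "2020-03-01", 7, 1),
    ],
)

# Form 1: every non-aggregated column is listed in GROUP BY (one row per order).
per_order = conn.execute(
    "SELECT last_name, ord_id, ord_date, SUM(price * qty) AS total_price "
    "FROM order_lines "
    "GROUP BY last_name, ord_id, ord_date "
    "ORDER BY last_name DESC, ord_id DESC"
).fetchall()

# Form 2: group by name only and aggregate the rest (one row per customer).
per_customer = conn.execute(
    "SELECT last_name, MAX(ord_id), MAX(ord_date), "
    "SUM(price * qty) AS total_price "
    "FROM order_lines "
    "GROUP BY last_name "
    "ORDER BY last_name DESC"
).fetchall()

print(per_order)
print(per_customer)
```

Which form is right depends on whether you want one total per order or one total per customer.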
If you have to keep your GROUP BY intact (and not add non-aggregated fields to the list), then you need to decide which values you want for OrderLine.OrdID and OrdDate. For example, you may choose to take the MAX or MIN of these values.
So it's either as bernie suggested GROUP BY Last, OrderLine.OrdID, OrdDate or something like this (if it makes sense for your business logic):
SELECT Last, MAX(OrderLine.OrdID), MAX(OrdDate), SUM(Price*Qty) AS total_price