Perform Union/OR Operation between where clause and having Clause - sql

I am working on a SQL query that should return the union of the rows matched by a WHERE clause and the rows matched by a HAVING clause.
For example,
Select * from table where col1= 'get' group by col2 (OR/UNION) having avg(col3) >30 . This is not valid SQL, but it illustrates the use case.
The purpose of the SQL statement is to return a result set that covers both the WHERE and the HAVING conditions.
Let's say I have table1 with columns col1, col2, col3, col4 and a large amount of data. There is a use case in which the user selects filters with specific criteria, for example col1 ='Y', avg(col2) >10, avg(col3*col4) =30. I have to build a query that returns all results satisfying col1 ='Y' OR avg(col2) >10 OR avg(col3*col4) =30, as we would with the OR operator in a WHERE clause, except that here both a WHERE clause and a HAVING clause are involved.
Something like the following:
resultset1 <= select * from table1 where col1= 'get';
resultset2 <= select * from table1 group by col2 having avg(col3) >30
final results = resultset1+ resultset2
Does anyone have a better approach or ideas for implementing such a scenario?
Let's say I have a combination of filters as below:
col1 =23
OR
avg(col2) >30
AND
avg(col3) =10
OR
avg(col1) <10
AND
col2 =10
I need to display the results that satisfy these criteria in SQL.

It's not clear what you want from this quasi-SQL. I guess you need to select records using two conditions, col1= 'get' and avg(col3) >30, combined with either AND or OR. Here is a solution:
Select * from table
where (col1= 'get')
OR
col2 in (SELECT col2 FROM table GROUP BY col2 HAVING avg(col3) >30)
If you need both conditions to be true, replace OR with AND.
If the AVG should be computed only over the rows with col1 = 'get', add that condition to the subquery:
Select * from table
where (col1= 'get')
OR
col2 in (SELECT col2 FROM table WHERE (col1= 'get')
GROUP BY col2
HAVING avg(col3) >30)
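As a rough illustration of the subquery approach above, here is a minimal sketch that runs the same shape of query with Python's built-in sqlite3 module. The table name t, the id column and the sample rows are assumptions made up for the demo (the question's table is only a placeholder, and "table" itself is a reserved word).

```python
import sqlite3

# Hypothetical sample data; "t" stands in for the question's placeholder table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, col1 TEXT, col2 TEXT, col3 REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "get", "a", 10),  # matches the WHERE condition only
        (2, "put", "a", 20),  # matches neither (group a averages 15)
        (3, "put", "b", 40),  # group b averages 40, so the HAVING part matches
        (4, "get", "b", 40),  # matches both conditions
    ],
)

# {op} is filled with OR or AND to compare the two variants from the answer.
template = """
    SELECT id FROM t
    WHERE col1 = 'get'
       {op} col2 IN (SELECT col2 FROM t GROUP BY col2 HAVING AVG(col3) > 30)
    ORDER BY id
"""
or_ids = [row[0] for row in conn.execute(template.format(op="OR"))]
and_ids = [row[0] for row in conn.execute(template.format(op="AND"))]
print(or_ids, and_ids)
```

With OR, a row qualifies if it passes the row-level filter or belongs to a group that passes the aggregate filter; with AND it has to do both.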

SELECT <resultset1> --resultset based on a WHERE clause
UNION
SELECT <resultset2> --resultset based on HAVING
In general, if you want a union of resultsets, use ... UNION.
Using OR in a condition is equivalent to UNION (because the UNION operator is the relational algebra equivalent of logical disjunction), but it requires the scope of the involved conditions to be identical.
In this case, this is impossible because a HAVING condition applies not to the table mentioned in the SELECT, but to an intermediate table that is "silently" created by the GROUP BY clause. This is inevitable, because things like AVG, SUM, ... only make sense once it has been determined which set of rows the AVG, SUM, ... is computed over, and that is what the GROUP BY specification does.
EDIT
In SQL, UNION comes in distinct flavours, UNION DISTINCT and UNION ALL. One eliminates duplicates, the other won't. If you want the exact same behaviour as OR, you'll obviously need the one that eliminates duplicates from its result set.
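The difference between the two flavours can be checked with a small sketch, again using Python's sqlite3 module. The table t, its id column and the rows are hypothetical. SQLite spells the duplicate-eliminating flavour as plain UNION, so that is what the sketch uses.

```python
import sqlite3

# Hypothetical sample data; row 4 satisfies both the WHERE and the HAVING part.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, col1 TEXT, col2 TEXT, col3 REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, "get", "a", 10), (2, "put", "a", 20), (3, "put", "b", 40), (4, "get", "b", 40)],
)

# Result set based on the WHERE condition.
where_part = "SELECT id FROM t WHERE col1 = 'get'"
# Result set based on the HAVING condition, mapped back to individual rows.
having_part = (
    "SELECT id FROM t WHERE col2 IN "
    "(SELECT col2 FROM t GROUP BY col2 HAVING AVG(col3) > 30)"
)

union_ids = sorted(r[0] for r in conn.execute(f"{where_part} UNION {having_part}"))
union_all_ids = sorted(r[0] for r in conn.execute(f"{where_part} UNION ALL {having_part}"))
print(union_ids, union_all_ids)
```

Only the duplicate-eliminating UNION returns each qualifying row once, which is what an OR condition would do; UNION ALL repeats the row that satisfies both parts.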

Related

SQL Server Query - How to append row showing total record count?

What is the best approach to append a row to a SQL Server query showing the total count of rows resulting from the query? UNION is one way, but seems very inefficient:
SELECT col1, col2 FROM tbl1
UNION ALL
SELECT STR(COUNT(col1)), NULL FROM tbl1
ROLLUP isn't an option because it requires GROUP BY, which we're not using for the queries in question.
You can use GROUPING SETS for this
SELECT
CASE WHEN GROUPING(col1) = 0 THEN col1 ELSE CAST(COUNT(*) AS varchar(30)) END AS col1,
col2
FROM tbl1
GROUP BY GROUPING SETS (
(col1, col2),
()
);
The GROUPING function will tell you whether the row is the Total row or not.
This does have the effect of grouping the columns which could be a different result and possibly less efficient. But if you include a unique/primary key as the first column in the grouping list then this shouldn't make a difference, and should be almost as performant as the original query.
You can also use a window function, which will return the total on each row as another column
SELECT
col1,
col2,
COUNT(*) OVER ()
FROM tbl1;
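A minimal sketch of the window-function variant, run through Python's sqlite3 module. The sample rows and the total alias are assumptions, and the sketch needs an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Hypothetical sample rows for tbl1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl1 (col1 TEXT, col2 INTEGER)")
conn.executemany("INSERT INTO tbl1 VALUES (?, ?)", [("a", 1), ("b", 2), ("c", 3)])

# COUNT(*) OVER () counts over the whole result set and repeats the total on every row.
windowed = conn.execute(
    "SELECT col1, col2, COUNT(*) OVER () AS total FROM tbl1 ORDER BY col1"
).fetchall()
print(windowed)
```

The total arrives as an extra column rather than an extra row, so the client has to read it from any one of the rows.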

Counting matching rows of two same tables and counting rows of the table

I have the same table structure called "table1" under two different schemas "schema1" and "schema2". "table1" contains columns "col1, col2, col3". Initially I wanted to see whether there are records with the same col1 and col2 values in schema1.table1 and schema2.table1. But I mistyped schema2.table1 as schema1.table1, and now I am confused by the query result.
SELECT COUNT(*) FROM schema1.table1 AS s1t, schema1.table1 AS s2t
WHERE s1t.col1 = s2t.col1 AND s1t.col2 = s2t.col2;
I got
count
-------
530
(1 row)
However, SELECT COUNT(*) FROM schema1.table1; shows that there are 17815 rows.
Why would the first query show there are only 530 satisfied records? Shouldn't it be 17815 as well?
You can use a FULL OUTER JOIN to see the mismatched rows as well, including rows with NULL in col1 or col2. That way, at least 17815 rows are returned:
SELECT COUNT(*)
FROM schema1.table1 AS s1t
FULL OUTER JOIN schema1.table1 AS s2t
ON s1t.col1 = s2t.col1 AND s1t.col2 = s2t.col2
In your case, only the rows that match on those columns (col1 and col2) are returned.
You are joining the table to itself. That is really strange.
In any case, your join is going to filter out any rows where col1 or col2 are NULL.
In addition, the self-join might multiply the number of rows if there are duplicates (with respect to the two columns) in the table.
It is really unclear why you would be doing this, but the above explains the results you are seeing.
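Both effects described above (NULLs dropping out, duplicates multiplying) can be seen in a small sketch using Python's sqlite3 module. The rows are made up, and the table lives in SQLite's default schema rather than in schema1.

```python
import sqlite3

# Hypothetical rows: one duplicated pair, one unique pair, three rows with a NULL key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "x", "r1"), (1, "x", "r2"),  # duplicates: 2 x 2 = 4 joined rows
        (2, "y", "r3"),                  # unique pair: 1 joined row
        (None, "z", "r4"),               # NULL = NULL is not true: 0 joined rows
        (3, None, "r5"),
        (None, None, "r6"),
    ],
)

total_rows = conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]
# Same shape as the query in the question: the table joined to itself.
self_join_rows = conn.execute(
    "SELECT COUNT(*) FROM table1 AS s1t, table1 AS s2t "
    "WHERE s1t.col1 = s2t.col1 AND s1t.col2 = s2t.col2"
).fetchone()[0]
print(total_rows, self_join_rows)
```

Depending on how many NULLs and duplicates the table holds, the self-join count can end up below or above the plain row count.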
If you want to compare the results in the two schemas allowing for duplicates and missing values, I recommend union all/group by:
select col1, col2, sum(cnt1) as cnt1, sum(cnt2) as cnt2
from ((select col1, col2, count(*) as cnt1, 0 as cnt2
from schema1.table1
group by col1, col2
) union all
(select col1, col2, 0 as cnt1, count(*) as cnt2
from schema2.table1
group by col1, col2
)
) t12
group by col1, col2
having sum(cnt1) <> sum(cnt2);
This returns pairs where the counts are not the same in the two tables. It even works for NULL values. If you ran this on the same table, no rows would be returned.

Adding a constant value column in the group by clause

Netezza SQL gives an error on this query. Cause: Invalid column name 'dummy'.
select col1,col2, '' as dummy, max(col3) from table1 group by col1,col2,dummy
If I remove dummy from the GROUP BY clause, it works fine. But as per SQL syntax, I am supposed to include all non-aggregate columns in the GROUP BY.
Why do you need it in your GROUP BY? You can use an aggregate function instead; its result is always right because the value is constant. For example:
select col1,col2, min(' ') as dummy, max(col3) from table1 group by col1,col2
"dummy" is a static column (not in the table), so it does not need to be in the group by because it is an external column.
SELECT col1,
col2,
cast(5 as int) AS [dummy],
max(col3)
FROM test_1
GROUP BY col1,
col2,
col3,
'dummy'
The code produces an outer reference error # 164.
Take a look at these links
http://www.sql-server-helper.com/error-messages/msg-164.aspx
http://www.sql-server-helper.com/error-messages/msg-1-500.aspx
It's due to the order of operations...
FROM
JOIN
WHERE
GROUP BY
...
SELECT
When using group by, only the fields remaining from the previous step are available. Since you are not declaring your "Dummy" column until the Select statement the group by doesn't know it exists and therefore doesn't need to account for it.
Going back to basics: GROUP BY is executed after the underlying JOIN operations (the file I/O), and only after that is the SELECTed result set available.
You declared Dummy in the SELECT, so the database does not know about it yet, because the alias is not available at the table level while grouping.
Try your query with GROUP BY your_column, ' ' and it should work, because the constant is given directly instead of through an alias.
Finally, when GROUP BY is used, you can specify constants in the SELECT or the GROUP BY, because they end up in the SELECTed result without any table operation being involved, so the database lets them through.
To resolve the issue, group it at an outer layer:
SELECT X.col1, X.col2, X.dummy, max(X.col3)
FROM (
SELECT col1,
col2,
cast(5 as int) AS [dummy],
col3
FROM test_1
) X
GROUP BY X.col1,
X.col2,
X.dummy

How do I concatenate two similar tables on a result

I have two tables with similar columns. I would simply like to select both tables, one after another, so that if I have 'x' rows on table1 and 'y' rows on table2, I'd get 'x + y' rows.
You would use UNION [ALL] for this. The tables don't need to have the same column names but you do need to select the same number of columns from each and the corresponding columns need to be of compatible datatypes
SELECT col1,col2,col3 FROM table1
UNION ALL
SELECT col1,col2,col3 FROM table2
UNION ALL is preferable to UNION where there is a choice, as it can avoid a sort operation to get rid of duplicates.
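A minimal sketch of the 'x + y' behaviour, using Python's sqlite3 module with made-up rows. One row is deliberately present in both tables to show where UNION and UNION ALL differ.

```python
import sqlite3

# Hypothetical data: table1 has x = 2 rows, table2 has y = 3 rows, one row is shared.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 TEXT, col3 TEXT)")
conn.execute("CREATE TABLE table2 (col1 INTEGER, col2 TEXT, col3 TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", [(1, "a", "x"), (2, "b", "y")])
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, ?)",
    [(2, "b", "y"), (3, "c", "z"), (4, "d", "w")],
)

def count_rows(operator):
    """Count the rows produced by combining both tables with the given set operator."""
    sql = (
        "SELECT COUNT(*) FROM ("
        "SELECT col1, col2, col3 FROM table1 "
        f"{operator} "
        "SELECT col1, col2, col3 FROM table2)"
    )
    return conn.execute(sql).fetchone()[0]

union_all_count = count_rows("UNION ALL")  # keeps every row from both tables
union_count = count_rows("UNION")          # drops the duplicated row
print(union_all_count, union_count)
```

Only UNION ALL is guaranteed to give exactly x + y rows; UNION gives fewer as soon as any row repeats.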
Just to add to what they were saying, you might want to add an Order By. Depends on the version of SQL you're using.
SELECT Col1, Col2, Col3
FROM Table1
UNION
SELECT Col1, Col2, Col3
FROM Table2
ORDER BY Col1
Note that the ORDER BY has to go after the last SELECT in the UNION, where it sorts the combined result. A GROUP BY in that position applies only to the last SELECT.
select col1,col2,col3 from table1
union
select col1,col2,col3 from table2
Look at the Union operator.

Combining several query results into one table, how is the results order determined?

I am returning table results for different queries, but each table will be in the same format and they will all end up in one final table. If I want the results for query 1 to be listed first, query 2 second, etc., what is the easiest way to do it?
Does UNION append the tables, or is the combination random?
The SQL standard does not guarantee an order unless explicitly called for in an order by clause. In practice, this usually comes back chronologically, but I would not rely on it if the order is important.
Across a union you can control the order like this...
select
this,
that
from
(
select
this,
that
from
table1
union
select
this,
that
from
table2
)
order by
that,
this;
UNION ALL typically appends the second query to the first query, so you often get all the first rows first, but that order is not guaranteed.
To enforce it, you can tag each query with a constant and sort on it:
SELECT Col1, Col2,...
FROM (
SELECT Col1, Col2,..., 1 AS intUnionOrder
FROM ...
UNION ALL
SELECT Col1, Col2,..., 2 AS intUnionOrder
FROM ...
) AS T
ORDER BY intUnionOrder, ...
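The tagging idea above can be tried out with a short sketch using Python's sqlite3 module. The two tables, their single column and the values are hypothetical, chosen so that a plain sort on col1 would interleave the two queries.

```python
import sqlite3

# Hypothetical tables standing in for the two queries being combined.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 TEXT)")
conn.execute("CREATE TABLE table2 (col1 TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?)", [("c",), ("b",)])
conn.executemany("INSERT INTO table2 VALUES (?)", [("d",), ("a",)])

# Each branch carries a constant tag; sorting on the tag first keeps query 1 ahead of query 2.
ordered = [
    row[0]
    for row in conn.execute(
        """
        SELECT col1
        FROM (
            SELECT col1, 1 AS intUnionOrder FROM table1
            UNION ALL
            SELECT col1, 2 AS intUnionOrder FROM table2
        ) AS T
        ORDER BY intUnionOrder, col1
        """
    )
]
print(ordered)
```

The tag column is only used for sorting and is left out of the outer SELECT list, so the final result keeps the original column layout.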