Grouping data in the select statement - sql

I have a large amount of data that needs to be classified into different groups while retrieving it. Each group has a different condition. I don't want to retrieve the groups separately; I want to know the number of items in each group using a single SQL statement.
For example, the pseudo code will be like this:
Select count(IssueID) as Issue1_Count if(condition1),
count(IssueID) as Issue2_Count if(condition2),
count(IssueID) as Issue3_Count if(condition3)
From table1, table2, tabl3
where common_condition1 and common_Condition2;
Can somebody help me write an Oracle query for this?

Put it like this:
SELECT
SUM(CASE WHEN condition1 THEN 1 ELSE 0 END) as Issue1_Count,
SUM(CASE WHEN condition2 THEN 1 ELSE 0 END) as Issue2_Count,
SUM(CASE WHEN condition3 THEN 1 ELSE 0 END) as Issue3_Count
FROM
table1, table2, tabl3
WHERE
common_condition1 and common_Condition2;
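To see the pattern work end to end, here is a minimal sketch on a toy table. It uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption; the CASE expression itself is standard SQL). The table issues and its columns severity and status are made up for illustration and are not from the question.

```python
import sqlite3

# Hypothetical toy data; table and column names are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (issue_id INTEGER, severity TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO issues VALUES (?, ?, ?)",
    [
        (1, "high", "open"),
        (2, "low", "open"),
        (3, "high", "closed"),
        (4, "medium", "open"),
        (5, "high", "open"),
    ],
)

query = """
SELECT
    SUM(CASE WHEN severity = 'high'   THEN 1 ELSE 0 END) AS issue1_count,
    SUM(CASE WHEN severity = 'medium' THEN 1 ELSE 0 END) AS issue2_count,
    SUM(CASE WHEN severity = 'low'    THEN 1 ELSE 0 END) AS issue3_count
FROM issues
WHERE status = 'open'   -- the common condition applies to every count
"""
counts = conn.execute(query).fetchone()
print(counts)
```

The closed issue is filtered out by the WHERE clause before any of the three counts sees it, which is why the common conditions belong there and the per-group conditions belong inside the CASE expressions.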

Oracle's CASE statement should help you here. Have a look at this: http://www.dba-oracle.com/t_case_sql_clause.htm
There are limits though, so I'm not 100% positive you can do exactly what you have here using them.

Related

SQL Ignore non matching columns

I'm trying to develop a stored procedure which does a select on a table. The stored procedure has 4 inputs: Input1, Input2, Input3, Input4.
Table has 4 columns: Col1,Col2,Col3, Col4.
The requirement is that if no row matches on all four columns, we need to ignore the non-matching column and try the next combination:
Use case:
Select *
from table
where Col1=Input1
and Col2=Input2
and Col3=Input3
and Col4=Input4;
If there are no returns for the condition due to Col2 not equal to Input2, select needs to ignore it and try to match on others like:
Select *
from table
where Col1=Input1
and Col3=Input3
and Col4=Input4;
It should go like that till last possible option for the response:
Select *
from table
where Col1=Input1;
Please assist if there is a way and thanks in advance.
As it's mentioned in the comments, the question is a bit vague so I'm afraid any answer would be a bit vague too. You could do something like this:
select top 1 *,
case when Col1=input1 then 1 else 0 end as match1,
case when Col2=input2 then 1 else 0 end as match2,
case when Col3=input3 then 1 else 0 end as match3,
case when Col4=input4 then 1 else 0 end as match4
from table
order by match1+match2+match3+match4 desc
This assumes SQL Server as the DBMS (on Oracle you would need ROWNUM or FETCH FIRST 1 ROWS ONLY instead of TOP, while MySQL and PostgreSQL use LIMIT), and that all the columns carry the same weight when matching.
Also, I'm assuming that you want only one of the best matches; if not, you may need to change the TOP and/or add a where clause.
Finally, you may want to use isnull() if your columns and/or inputs are nullable.
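The scoring idea can be tried out with a small sketch. It runs on SQLite through Python's sqlite3 module, so it uses LIMIT 1 where SQL Server would use TOP 1. The table t, its id column, and the sample values are assumptions made for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (1, "a", "x", "c", "d"),  # matches col1, col3 and col4
        (2, "a", "b", "z", "z"),  # matches col1 and col2
        (3, "q", "q", "q", "q"),  # matches nothing
    ],
)

# Hypothetical input values standing in for Input1..Input4.
inputs = {"i1": "a", "i2": "b", "i3": "c", "i4": "d"}

query = """
SELECT id,
       (CASE WHEN col1 = :i1 THEN 1 ELSE 0 END)
     + (CASE WHEN col2 = :i2 THEN 1 ELSE 0 END)
     + (CASE WHEN col3 = :i3 THEN 1 ELSE 0 END)
     + (CASE WHEN col4 = :i4 THEN 1 ELSE 0 END) AS score
FROM t
ORDER BY score DESC   -- rows matching the most columns come first
LIMIT 1               -- SQLite spelling of SQL Server's TOP 1
"""
best = conn.execute(query, inputs).fetchone()
print(best)
```

Note that a row with a score of 0 would still be returned if nothing matched at all, so a filter on the score may be needed if "no match" should yield no rows.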
Hmmm . . . One method is:
select t.*
from (select t.*,
sum(case when condition1 then 1 else 0 end) over () as c_1,
sum(case when condition1 and condition2 then 1 else 0 end) over () as c_12,
sum(case when condition1 and condition2 and condition3 then 1 else 0 end) over () as c_123,
sum(case when condition1 and condition2 and condition3 and condition4 then 1 else 0 end) over () as c_1234
from t
) t
where (c_1234 > 0 and condition1 and condition2 and condition3 and condition4) or
(c_1234 = 0 and c_123 > 0 and condition1 and condition2 and condition3) or
(c_123 = 0 and c_12 > 0 and condition1 and condition2 ) or
(c_12 = 0 and c_1 > 0 and condition1 ) ;
Depending on the conditions and other considerations -- such as whether you only expect one row -- there might be simpler methods.

sql group by satisfying multiple conditions within the group

I have a table like below:
I want to select the group that has a row with RELB_CD = 9093 and a row with INFO_SRC_CD = 7784. Both conditions should be present in the group. In the table below my output should be the group with id=139993690.
You can use aggregation with having:
select id
from t
group by id
having sum(case when relb_cd = 9093 then 1 else 0 end) > 0 and
sum(case when info_src_cd = 7784 then 1 else 0 end) > 0
Use this code; I hope it helps.
You have to leave out the date column, because it prevents the rows from grouping:
select id,fisc_ind, sum(sls_amt),relb_cd,info_scop,info_src_cd from yourtable group by id,fisc_ind,relb_cd,info_scop,info_src_cd
Here is another working answer. If your data set is large, you could compare GL's query with this one and see which runs faster for you. I honestly don't know which is faster in general; this one was slightly faster on a very small data set.
select id
from table1
where relb_cd = 9093
intersect
select id
from table1
where info_src_cd = 7784
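As a quick check that the HAVING version and the INTERSECT version agree, here is a small sketch on SQLite via Python's sqlite3 module. Only the id 139993690 and the two code values come from the question; the other rows and ids are invented, and the table is reduced to the three columns that matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, relb_cd INTEGER, info_src_cd INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (139993690, 9093, 1111),  # first condition met on this row
        (139993690, 2222, 7784),  # second condition met on another row, same id
        (200000001, 9093, 1111),  # only relb_cd matches for this id
        (200000002, 3333, 7784),  # only info_src_cd matches for this id
    ],
)

having_sql = """
SELECT id
FROM table1
GROUP BY id
HAVING SUM(CASE WHEN relb_cd = 9093 THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN info_src_cd = 7784 THEN 1 ELSE 0 END) > 0
"""
intersect_sql = """
SELECT id FROM table1 WHERE relb_cd = 9093
INTERSECT
SELECT id FROM table1 WHERE info_src_cd = 7784
"""
having_ids = [row[0] for row in conn.execute(having_sql)]
intersect_ids = [row[0] for row in conn.execute(intersect_sql)]
print(having_ids, intersect_ids)
```

Both queries accept a group whose two conditions are met on different rows. Which one is faster depends on the data and the available indexes, so it is worth timing both on the real table.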

Alternative to executing Netezza SQL subquery multiple times in Case When?

SELECT DISTINCT lr.id,
lr.dept,
lr.name,
Case When lr.id IN (SELECT id FROM RESULTS WHERE PANEL_FLAG LIKE '%value1%') AND lr.id IN (SELECT id FROM RESULTS WHERE PANEL_FLAG LIKE '%value2%') Then 1
Else 0
End As both_panels,
Case When lr.id IN (SELECT id FROM RESULTS WHERE PANEL_FLAG LIKE '%value1%') AND lr.id NOT IN (SELECT id FROM RESULTS WHERE PANEL_FLAG LIKE '%value2%') Then 1
Else 0
End As only_value1_panel
FROM RESULTS lr
I have simplified this. In reality I need many more Case When expressions, and it's a performance nightmare because the subquery executes each time. Is there a more performant way to do this?
I tried creating Common Table Expressions and Temp Tables before the query, but the way I was doing it (replacing the subquery statements with a SELECT from the CTE or the Temp Table) doesn't seem to make any performance difference as it is still executing a query each time.
This would usually be handled with conditional aggregation. I think this captures your logic:
SELECT lr.id, lr.dept, lr.name,
LEAST(MAX(Case When PANEL_FLAG LIKE '%value1%' THEN 1 ELSE 0 END),
MAX(Case When PANEL_FLAG LIKE '%value2%' THEN 1 ELSE 0 END)
) As both_panels,
LEAST(MAX(Case When PANEL_FLAG LIKE '%value1%' THEN 1 ELSE 0 END),
MIN(Case When PANEL_FLAG LIKE '%value2%' THEN 0 ELSE 1 END)
) as only_value1_panel
FROM RESULTS lr
GROUP BY lr.id, lr.dept, lr.name
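Here is a small sketch of the same conditional-aggregation idea, run on SQLite through Python's sqlite3 module rather than Netezza (an assumption made so the example is self-contained). Because SQLite has no LEAST, this variant computes one 0/1 flag per panel in a grouped subquery and combines the flags with CASE; the names has_v1, has_v2 and flags, and the sample rows, are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER, dept TEXT, name TEXT, panel_flag TEXT)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [
        (1, "A", "n1", "value1"),
        (1, "A", "n1", "value2"),  # id 1 appears on both panels
        (2, "A", "n2", "value1"),
        (2, "A", "n2", "other"),   # id 2 appears on value1 but never on value2
        (3, "B", "n3", "value2"),  # id 3 appears on value2 only
    ],
)

query = """
SELECT id, dept, name,
       CASE WHEN has_v1 = 1 AND has_v2 = 1 THEN 1 ELSE 0 END AS both_panels,
       CASE WHEN has_v1 = 1 AND has_v2 = 0 THEN 1 ELSE 0 END AS only_value1_panel
FROM (
    -- One pass over the table: one 0/1 flag per panel for every id.
    SELECT id, dept, name,
           MAX(CASE WHEN panel_flag LIKE '%value1%' THEN 1 ELSE 0 END) AS has_v1,
           MAX(CASE WHEN panel_flag LIKE '%value2%' THEN 1 ELSE 0 END) AS has_v2
    FROM results
    GROUP BY id, dept, name
) AS flags
ORDER BY id
"""
panel_rows = conn.execute(query).fetchall()
print(panel_rows)
```

Adding another panel means adding one more MAX(CASE ...) flag to the inner query, so the table is still read only once however many Case When columns the outer query needs.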
If the subqueries for the IN lists have a constant WHERE clause, I would consider generating the lists in a script (commas and all), injecting them into a SQL template file, and running that.
That will run very fast.
Of course the resulting lists should be fairly small (less than 60KB for all lists in total) otherwise the sql statement will become too large.

Joining two datasets with subqueries

I am attempting to join two large datasets using BigQuery. They have a common field; however, the common field has a different name in each dataset.
I want to count number of rows and sum the results of my case logic for both table1 and table2.
I believe that I have errors resulting from subquery (subselect?) and syntax errors. I have tried to apply precedent from similar posts but I still seem to be missing something. Any assistance in getting this sorted is greatly appreciated.
SELECT
table1.field1,
table1.field2,
(
SELECT COUNT (*)
FROM table1) AS table1_total,
sum(case when table1.mutually_exclusive_metric1 = "Y" then 1 else 0 end) AS t1_pass_1,
sum(case when table1.mutually_exclusive_metric1 = "Y" AND table1.mutually_exclusive_metric2 IS null OR table1.mutually_exclusive_metric3 = 'Y' then 1 else 0 end) AS t1_pass_2,
sum(case when table1.mutually_exclusive_metric3 ="Y" AND table1.mutually_exclusive_metric2 ="Y" AND table1.mutually_exclusive_metric3 ="Y" then 1 else 0 end) AS t1_pass_3,
(
SELECT COUNT (*)
FROM table2) AS table2_total,
sum(case when table2.metric1 IS true then 1 else 0 end) AS t2_pass_1,
sum(case when table2.metric2 IS true then 1 else 0 end) AS t2_pass_2,
(
SELECT COUNT (*)
FROM dataset1.table1 JOIN EACH dataset2.table2 ON common_field_table1 = common_field_table2) AS overlap
FROM
dataset1.table1,
dataset2.table2
WHERE
XYZ
Thanks in advance!
So, let's take this one step at a time:
1) Using * is not explicit, and being explicit is good. Additionally, stating explicit selects and * will duplicate selects with autorenames. table1.field will become table1_field. Unless you are just playing around, don't use *.
2) You never joined. A query with a join looks like this (note order of WHERE and GROUP statements, note naming of each):
SELECT
t1.field1 AS field1,
t2.field2 AS field2
FROM dataset1.table1 AS t1
JOIN dataset2.table2 AS t2
ON t1.field1 = t2.field1
WHERE t1.field1 = "some value"
GROUP BY field1, field2
Here t1.field1 and t2.field1 contain corresponding values, so you wouldn't repeat both of them in the select.
3) Use whitespace to make your code easier to read. It helps everyone involved, including you.
4) Your subselects are pretty useless. A subselect is used instead of creating a new table. For example, you would use a subselect to group or filter out data from an existing table. For example:
SELECT
subselect.field1 AS ssf1,
subselect.max_f1 AS ss_max_f1
FROM (
SELECT
t1.field1 AS field1,
MAX(t1.field1) AS max_f1,
FROM dataset1.table1 AS t1
GROUP BY field1
) AS subselect
The subselect is practically a new table that you select from. Treat it logically like it happens first, and you take the results from that and use it in your main select.
5) This was a terrible question. It didn't even look like you tried to figure things out one step at a time.
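Putting the join and subselect points together, one possible shape for the counts the question asks for is to aggregate each table in its own subselect and only then combine the one-row results. The sketch below runs on SQLite through Python's sqlite3 module, not on BigQuery, so the exact syntax there may differ. The column names are taken from the question; the aliases t1_stats, t2_stats and ov and the sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (common_field_table1 INTEGER, mutually_exclusive_metric1 TEXT);
CREATE TABLE table2 (common_field_table2 INTEGER, metric1 INTEGER);
INSERT INTO table1 VALUES (1, 'Y'), (2, 'N'), (3, 'Y');
INSERT INTO table2 VALUES (2, 1), (3, 0), (4, 1), (5, 1);
""")

query = """
SELECT t1_stats.table1_total,
       t1_stats.t1_pass_1,
       t2_stats.table2_total,
       t2_stats.t2_pass_1,
       ov.overlap
FROM (
    -- Aggregate table1 on its own, before any join can multiply its rows.
    SELECT COUNT(*) AS table1_total,
           SUM(CASE WHEN mutually_exclusive_metric1 = 'Y' THEN 1 ELSE 0 END) AS t1_pass_1
    FROM table1
) AS t1_stats
CROSS JOIN (
    SELECT COUNT(*) AS table2_total,
           SUM(CASE WHEN metric1 = 1 THEN 1 ELSE 0 END) AS t2_pass_1
    FROM table2
) AS t2_stats
CROSS JOIN (
    -- The overlap is the only part that needs the real join on the common field.
    SELECT COUNT(*) AS overlap
    FROM table1 AS t1
    JOIN table2 AS t2
      ON t1.common_field_table1 = t2.common_field_table2
) AS ov
"""
summary = conn.execute(query).fetchone()
print(summary)
```

Each subselect returns exactly one row, so the cross joins produce a single summary row. Listing both tables in one FROM clause without a join condition, as in the question, would instead pair every row of one table with every row of the other and inflate every SUM.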

SQL 2 counts with different filter

I have a table and I need to calculate two aggregate functions with different conditions in one statement. How can I do this?
Pseudocode below:
SELECT count(CoumntA) *< 0*, count(CoumntA) * > 0*
FROM dbo.TableA
This is the same idea as tombom's answer, but with SQL Server syntax:
SELECT
SUM(CASE WHEN CoumntA < 0 THEN 1 ELSE 0 END) AS LessThanZero,
SUM(CASE WHEN CoumntA > 0 THEN 1 ELSE 0 END) AS GreaterThanZero
FROM TableA
As @tombom demonstrated, this can be done as a single query. But it doesn't mean that it should be.
SELECT
SUM(CASE WHEN CoumntA < 0 THEN 1 ELSE 0 END) AS less_than_zero,
SUM(CASE WHEN CoumntA > 0 THEN 1 ELSE 0 END) AS greater_than_zero
FROM
TableA
The time when this is not so good is...
- There is an index on CoumntA
- Most values (50% or more feels about right) are exactly zero
In that case, two queries will be faster. This is because each query can use the index to quickly home in on the section to be counted, so in the end it only counts the relevant records.
The example I gave, however, scans the whole table every time. Only once, but always the whole table. This is worth it when you're counting most of the records. In your case it looks like you're counting most or all of them, and so this is probably a good way of doing it.
It is possible to do this in one select statement.
The way I've done it before is like this:
SELECT SUM(CASE WHEN ColumnA < 0 THEN 1 END) AS LessThanZero,
SUM(CASE WHEN ColumnA > 0 THEN 1 END) AS GreaterThanZero
FROM dbo.TableA
This is the correct MS SQL syntax and I believe this is a very efficient way of doing it.
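One detail worth knowing about the form without ELSE: when no row satisfies a condition, SUM has only NULLs to add up and returns NULL rather than 0, whereas the ELSE 0 form and COUNT both return 0. The sketch below shows this on SQLite through Python's sqlite3 module; this is standard aggregate behaviour, so SQL Server should behave the same way, but the demonstration itself only covers SQLite. The sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (ColumnA INTEGER)")
# Deliberately no negative values, so the "< 0" group is empty.
conn.executemany("INSERT INTO TableA VALUES (?)", [(5,), (0,), (7,)])

query = """
SELECT
    SUM(CASE WHEN ColumnA < 0 THEN 1 END)          AS sum_no_else,
    SUM(CASE WHEN ColumnA < 0 THEN 1 ELSE 0 END)   AS sum_with_else,
    COUNT(CASE WHEN ColumnA < 0 THEN 1 END)        AS count_no_else,
    SUM(CASE WHEN ColumnA > 0 THEN 1 END)          AS greater_than_zero
FROM TableA
"""
row = conn.execute(query).fetchone()
print(row)  # the first value comes back as None (SQL NULL)
```

If the caller expects a number in every column, either keep the ELSE 0 branch or wrap the SUM in COALESCE(..., 0).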
Don't forget you are not covering the case when ColumnA = 0!
select '< 0' as filter, COUNT(0) as cnt from TableA where [condition 1]
union
select '> 0' as filter, COUNT(0) as cnt from TableA where [condition 2]
Be sure that condition 1 and condition 2 partition the original set of records; otherwise the same records could be counted in both groups.
For SQL Server, one way would be;
SELECT COUNT(CASE WHEN CoumntA<0 THEN 1 ELSE NULL END),
COUNT(CASE WHEN CoumntA>0 THEN 1 ELSE NULL END)
FROM dbo.TableA
SELECT
SUM(IF(CoumntA < 0, 1, 0)) AS lowerThanZero,
SUM(IF(CoumntA > 0, 1, 0)) AS greaterThanZero
FROM
TableA
Is it clear what's happening? Ask, if you have any more questions.
A shorter form would be
SELECT
SUM(CoumntA < 0) AS lowerThanZero,
SUM(CoumntA > 0) AS greaterThanZero
FROM
TableA
This is possible because in MySQL a true condition evaluates to 1 and a false condition evaluates to 0.
EDIT: okay, okay, sorry, don't know why I thought it's about MySQL here.
See the other answers about correct syntax.
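For anyone who wants to try the boolean-sum shortcut, SQLite also evaluates comparisons to 0 or 1, so it can be run there through Python's sqlite3 module. SQL Server does not accept a comparison as an argument to SUM, which is why the CASE form is needed for the original question. The sample values below are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (CoumntA INTEGER)")
conn.executemany("INSERT INTO TableA VALUES (?)", [(-3,), (-1,), (0,), (4,)])

query = """
SELECT
    SUM(CoumntA < 0) AS lowerThanZero,    -- each comparison yields 0 or 1
    SUM(CoumntA > 0) AS greaterThanZero
FROM TableA
"""
shortcut = conn.execute(query).fetchone()
print(shortcut)
```

The row holding 0 contributes to neither sum, which matches the earlier remark that the case CoumntA = 0 is not covered by either count.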