SUMIF and COUNTIF for SQL

I have some data that looks like this:
col1, col2, col3
A, 1.2, A|X|Y|Z
B, 0.3, B|X|Y|Z
X, 1.0, X|Y|Z
Y, 0.2, Y|Z
Z, 1.0, Z
I want to select the rows where the item in col1 appears in the col3 pipe-delimited list of some row other than its own. For each of those items, I want to count the number of other rows where that item appears in col3, and also sum col2 over those rows. So the results should be something like this:
col1, foo, bar
X, 2, 1.5
Y, 3, 2.5
Z, 4, 2.7
I've been trying to use CASE WHEN and LIKE to do this (see below), but it's not working. For foo I get all zeroes, and for bar I get null. Maybe I need some kind of subquery that I don't understand how to use?
SELECT
col1,
COUNT(CASE WHEN col3 LIKE col1 THEN 1 END) as foo,
SUM(CASE WHEN col3 LIKE col1 THEN col2 END) as bar
FROM table
GROUP BY
col1

You can do this with a self-join. In standard SQL, it would look like:
select t.col1, count(*) as foo, sum(t2.col2) as bar
from t join
t t2
on t2.col1 <> t.col1 and
'|' || t2.col3 || '|' like '%|' || t.col1 || '|%'
group by t.col1;
The syntax might vary in the database you are actually using, but the idea should work with whatever string concatenation mechanism your database uses.
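To make the idea concrete, here is a minimal sketch that runs the self-join against the sample rows from the question, using Python's built-in sqlite3 module. SQLite is only an assumption for the demo (it happens to support the || operator), and the function name sumif_countif is made up for illustration. The join excludes each row's own list with t2.col1 <> t.col1, and wrapping both sides in '|' keeps an item from matching part of a longer item.

```python
import sqlite3


def sumif_countif():
    """Run the delimiter-wrapped self-join on the question's sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (col1 TEXT, col2 REAL, col3 TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?)",
        [
            ("A", 1.2, "A|X|Y|Z"),
            ("B", 0.3, "B|X|Y|Z"),
            ("X", 1.0, "X|Y|Z"),
            ("Y", 0.2, "Y|Z"),
            ("Z", 1.0, "Z"),
        ],
    )
    rows = conn.execute(
        """
        SELECT t.col1, COUNT(*) AS foo, SUM(t2.col2) AS bar
        FROM t
        JOIN t AS t2
          ON t2.col1 <> t.col1                                   -- skip the row's own list
         AND '|' || t2.col3 || '|' LIKE '%|' || t.col1 || '|%'   -- whole-item match
        GROUP BY t.col1
        ORDER BY t.col1
        """
    ).fetchall()
    conn.close()
    # Round the float sums only to keep the printed values tidy.
    return [(col1, foo, round(bar, 2)) for col1, foo, bar in rows]


if __name__ == "__main__":
    for row in sumif_countif():
        print(row)
```

A and B drop out of the result because no other row's list contains them, which is what the inner join gives you for free.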

With a self join and aggregation:
SELECT t1.col1, COUNT(*) foo, SUM(t2.col2) bar
FROM tablename t1 INNER JOIN tablename t2
ON t2.col1 <> t1.col1 AND CONCAT('|', t2.col3, '|') LIKE CONCAT('%|', t1.col1, '|%')
GROUP BY t1.col1
The above code will work in MySQL.
For other databases, use their concatenation operator, such as || or +.
See the demo.
Results:
> col1 | foo | bar
> :--- | --: | --:
> X | 2 | 1.5
> Y | 3 | 2.5
> Z | 4 | 2.7

Related

How to get Distinct value for a column on the basis of other column in Oracle

I want to get the distinct values from COL1 and its COL3 value also, but the condition is: if COL1 = COL2 then it should pick the matching COL3 value, otherwise pick the COL1 value if they are not the same. I'm stuck on the logic, any help will be appreciated!
Please see the below image for more detail:
select DISTINCT COL1,
CASE WHEN COL1 = COL2 THEN COL3 END COL3 from TABLE1
WHERE COL1 IS NOT NULL;
Do a GROUP BY to get distinct COL1 values.
Use COALESCE() to return the COL3 value if there exists a COL1 = COL2 row, otherwise return the max COL3 value for the COL1. (Could use MIN() too, if that's better.)
select COL1,
COALESCE( MAX(CASE WHEN COL1 = COL2 THEN COL3 END), MAX(COL3) )
FROM table1
WHERE COL1 IS NOT NULL
GROUP BY COL1
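Here is a small sketch of how the COALESCE() version behaves, run through Python's sqlite3 module. The sample rows are made up for illustration (the original data was only shown in an image), and SQLite stands in for Oracle here; COALESCE, MAX and CASE behave the same way for this query as far as I know.

```python
import sqlite3


def distinct_with_preference():
    """GROUP BY col1, preferring col3 from the row where col1 = col2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 INTEGER, col3 TEXT)")
    conn.executemany(
        "INSERT INTO table1 VALUES (?, ?, ?)",
        [
            (11, 11, "ABC"),    # col1 = col2, so ABC should win for 11
            (11, 12, "XYZ"),    # would win if we only took MAX(col3)
            (12, 13, "DEF"),    # no col1 = col2 row for 12: fall back to MAX
            (12, 14, "BCD"),
            (None, 11, "ZZZ"),  # filtered out by the WHERE clause
        ],
    )
    rows = conn.execute(
        """
        SELECT col1,
               COALESCE(MAX(CASE WHEN col1 = col2 THEN col3 END), MAX(col3)) AS col3
        FROM table1
        WHERE col1 IS NOT NULL
        GROUP BY col1
        ORDER BY col1
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(distinct_with_preference())  # [(11, 'ABC'), (12, 'DEF')]
```

The CASE expression yields NULL for every non-matching row, so the inner MAX is NULL only when the group has no COL1 = COL2 row at all, and that is the only case where the fallback MAX(col3) is used.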
Use a correlated subquery:
select col1,col3
from TABLE1 a
where col2 in (select min(col2) from table1 b where a.col1=b.col1)
select distinct COL1, if(COL1 = COL2, COL3, COL1) as result
from table1
I think you can join the table with itself, use the join condition to filter, and then decide in the SELECT whether there was a COL2 = COL1 row and choose the appropriate COL3:
SELECT DISTINCT a.COL1, CASE WHEN b.COL1 IS NULL THEN a.COL3 ELSE b.COL3 END as COL3
FROM TABLE1 a
LEFT JOIN TABLE1 b
on a.COL1 = b.COL2
and a.COL1 = b.COL1
This way table a holds all the data, and table b holds data if and only if COL1 matches COL2. Then you select whichever COL3 is not null, preferably the one from table b. Oracle's COALESCE function does just that.
With a self join:
select distinct
t.col1,
case
when tt.col1 is null then t.col3
else tt.col3
end col3
from tablename t left join tablename tt
on tt.col1 = t.col1 and tt.col2 = t.col1
See the demo.
Results:
> COL1 | COL3
> ---: | :---
> 11 | ABC
> 12 | ABC
> 13 | BDG
> 14 | DEF
> 15 | CEG

How to get previous row data in sql server

I would like to get the data from previous row. I have used LAG function but did not get the expected result.
Table:-
col1 col2 col3
ABCD 1 Y
ABCD 2 N
ABCD 3 N
EFGH 4 N
EFGH 5 Y
EFGH 6 N
XXXX 7 Y
Expected result
col1 col2 col3 col4
ABCD 1 A NULL
ABCD 2 B A
ABCD 3 C B
EFGH 4 A NULL
EFGH 5 B A
EFGH 6 E B
XXXX 7 F NULL
Col4 should hold the data from previous row grouping by the value in Col1.
Please let me know how can this be achieved.
Use the lag() function:
select *, lag(col3) over (partition by col1 order by col2) as col4
from table t;
However, you can also use a subquery if your SQL dialect doesn't have LAG():
select *,
(select top 1 col3
from table
where col1 = t.col1 and col2 < t.col2
order by col2 desc
) as col4
from table t;
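As a quick check that the two approaches agree, here is a sketch that runs both against the table from the question using Python's sqlite3 module. This is SQLite rather than SQL Server, so the subquery uses LIMIT 1 in place of TOP 1, and LAG() needs SQLite 3.25 or newer; the table name t and the function name previous_row are just placeholders.

```python
import sqlite3

# Window-function version: previous col3 within each col1 group.
LAG_SQL = """
SELECT col1, col2, col3,
       LAG(col3) OVER (PARTITION BY col1 ORDER BY col2) AS col4
FROM t
ORDER BY col2
"""

# Correlated-subquery version (LIMIT 1 is SQLite's equivalent of TOP 1).
SUBQUERY_SQL = """
SELECT col1, col2, col3,
       (SELECT p.col3
        FROM t AS p
        WHERE p.col1 = t.col1 AND p.col2 < t.col2
        ORDER BY p.col2 DESC
        LIMIT 1) AS col4
FROM t
ORDER BY col2
"""


def previous_row(sql):
    """Load the question's rows and return the result of the given query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (col1 TEXT, col2 INTEGER, col3 TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?)",
        [
            ("ABCD", 1, "Y"), ("ABCD", 2, "N"), ("ABCD", 3, "N"),
            ("EFGH", 4, "N"), ("EFGH", 5, "Y"), ("EFGH", 6, "N"),
            ("XXXX", 7, "Y"),
        ],
    )
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in previous_row(LAG_SQL):
        print(row)
```

The first row of each col1 group gets NULL (None in Python) for col4, because the partition restarts there. Leaving out PARTITION BY would instead carry the last value of one group into the first row of the next.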
Assuming SQL Server 2012 or newer...
SELECT
*,
LAG(col3) OVER (PARTITION BY col1 ORDER BY col2) AS col4
FROM
yourTable
If you're on SQL Server 2008 or older...
SELECT
*,
(
SELECT TOP(1) previous.col3
FROM yourTable AS previous
WHERE previous.col1 = yourTable.col1
AND previous.col2 < yourTable.col2
ORDER BY previous.col2 DESC
)
AS col4
FROM
yourTable
If you are on 2008 or earlier, try this (it assumes the col2 values are consecutive within each col1 group):
select t1.col1, t1.col2, t1.col3, t2.col3 as col4
from table1 t1
left join table1 t2 on t1.col1 = t2.col1 and t1.col2 - 1 = t2.col2
The lag() function is the bee's knees, though. Use that if you can.
Thank you all for the replies. By using the lag function with partition I got the expected result. I had left out the partition previously, and that is why I was getting wrong results.

Print value in SQL depending on its presence in another column

I have a table of the form
Col1 | Col2
-------------
A | C
B | A
C | X
D | A
E | NULL
If an element of Col1 is present in Col2, it should be printed as
Element, YES.
If it is not present in Col2, it should be printed as Element, NO, and if the corresponding Col2 value is NULL, it should be printed as Element, NULL.
So final output should look like
A YES
B NO
C YES
D NO
E NULL
I was able to write three individual queries for this, but at the moment I am struggling with how to put them inside CASE expressions in SQL.
SELECT Col1 FROM table WHERE col1 IN (SELECT col2 FROM table)
Select col1 FROM table where Col2 is NULL
SELECT Col1 FROM table WHERE col1 NOT IN (SELECT col2 FROM table)
I tried putting them inside case statements
Select col1, Case
when (SELECT Col1 FROM table WHERE col1 IN (SELECT col2 FROM table))
then "YES"
when (Select col1 FROM table where Col2 is NULL)
then "NULL"
else
"NO"
But I was getting an error. How should I fix this?
I would expect the query to look like this:
select col1,
(case when col2 is null then NULL
when col1 in (select t2.col2 from t t2)
then 'YES'
else 'NO'
end)
from t;
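Here is a minimal sketch that runs this CASE expression over the table from the question, using Python's sqlite3 module (SQLite is an assumption for the demo; the result column name and the function name presence_flags are made up). Note that the IS NULL branch comes first, so a row with a NULL Col2 yields NULL even if its Col1 value shows up elsewhere in Col2.

```python
import sqlite3


def presence_flags():
    """Return (col1, flag) for each row: 'YES', 'NO', or NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)",
        [("A", "C"), ("B", "A"), ("C", "X"), ("D", "A"), ("E", None)],
    )
    rows = conn.execute(
        """
        SELECT col1,
               CASE WHEN col2 IS NULL THEN NULL
                    WHEN col1 IN (SELECT t2.col2 FROM t AS t2) THEN 'YES'
                    ELSE 'NO'
               END AS present
        FROM t
        ORDER BY col1
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in presence_flags():
        print(row)
```

The NULL stored in Col2 makes `col1 IN (...)` evaluate to unknown rather than false for B and D, but a CASE branch only fires on true, so those rows still fall through to 'NO'.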

pivot data using SQL

I'm new to SQL and I'm wondering how to pivot a table like:
Col1 Col2 Col3
1 a w
2 a x
1 b y
2 b z
Into
Col1 a b
1 w y
2 x z
I was playing with GROUP BY but I can't seem to be able to turn unique rows into columns
This can be done using an aggregate function with a CASE expression:
select col1,
max(case when col2 = 'a' then col3 end) a,
max(case when col2 = 'b' then col3 end) b
from yourtable
group by col1
See SQL Fiddle with Demo
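In case the fiddle is unavailable, the conditional-aggregation version can be tried locally with a sketch like the following, which uses Python's sqlite3 module and the four rows from the question. SQLite and the function name pivot_by_case are assumptions for the demo.

```python
import sqlite3


def pivot_by_case():
    """Pivot col2 values 'a' and 'b' into columns with MAX(CASE ...)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourtable (col1 INTEGER, col2 TEXT, col3 TEXT)")
    conn.executemany(
        "INSERT INTO yourtable VALUES (?, ?, ?)",
        [(1, "a", "w"), (2, "a", "x"), (1, "b", "y"), (2, "b", "z")],
    )
    rows = conn.execute(
        """
        SELECT col1,
               MAX(CASE WHEN col2 = 'a' THEN col3 END) AS a,
               MAX(CASE WHEN col2 = 'b' THEN col3 END) AS b
        FROM yourtable
        GROUP BY col1
        ORDER BY col1
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(pivot_by_case())  # [(1, 'w', 'y'), (2, 'x', 'z')]
```

Each CASE returns NULL for rows that belong to the other column, and MAX ignores NULLs, so every group collapses to the one value that matched. The catch is that every target column has to be spelled out in the query.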
If you are using an RDBMS with a PIVOT function (SQL Server 2005+ / Oracle 11g+), then your query would be similar to this (Note: Oracle syntax below):
select *
from
(
select col1, col2, col3
from yourtable
)
pivot
(
max(col3)
for col2 in ('a', 'b')
)
See SQL Fiddle with Demo
The last way that you can do this is by using multiple joins on the same table:
select t1.col1,
t1.col3 a,
t2.col3 b
from yourtable t1
left join yourtable t2
on t1.col1 = t2.col1
and t2.col2 = 'b'
where t1.col2 = 'a'
See SQL Fiddle with Demo
All give the result:
| COL1 | 'A' | 'B' |
--------------------
| 1 | w | y |
| 2 | x | z |
If you require that the distinct values in Col2 can change without forcing changes on your query definition, you may be looking for an OLAP structure like SQL Server Analysis Services.
You should try something like
select * from
(select Col1, Col2, Col3 from TableName)
pivot xml (max(Col3)
for Col2 in (any) )
I'm on a mobile so I can't test if it's working right now.

Find duplicate values in oracle

I'm using this query to find duplicate values in a table:
select col1,
count(col1)
from table1
group by col1
having count (col1) > 1
order by 2 desc;
But also I want to add another column from the same table, like this:
select col1,
col2,
count(col1)
from table1
group by col1
having count (col1) > 1
order by 2 desc;
I get an ORA-00979 error with that second query.
How can I add another column in my search?
Your query should be
SELECT * FROM (
select col1,
col2,
count(col1) over (partition by col1) col1_cnt
from table1
)
WHERE col1_cnt > 1
order by 2 desc;
Presumably you want to get col2 for each duplicate of col1 that turns up. You can't really do that in a single query^. Instead, what you need to do is get your list of duplicates, then use that to retrieve any other associated values:
select col1, col2
from table1
where col1 in (select col1
from table1
group by col1
having count (col1) > 1)
order by col2 desc
^ Okay, you can, by using analytic functions, as @rs. demonstrated. For this scenario, I suspect that the nested query will be more efficient, but both should give you the same results.
Based on comments, it seems like you're not clear on why you can't just add the second column. Assume you have sample data that looks like this:
Col1 | Col2
-----+-----
1 | A
1 | B
2 | C
2 | D
3 | E
If you run
select Col1, count(*) as cnt
from table1
group by Col1
having count(*) > 1
then your results will be:
Col1 | Cnt
-----+-----
1 | 2
2 | 2
You can't just add Col2 to this query without adding it to the group by clause because the database will have no way of knowing which value you actually want (i.e. for Col1=1 should the DB return 'A' or 'B'?). If you add Col2 to the group by clause, you get the following:
select Col1, Col2, count(*) as cnt
from table1
group by Col1, Col2
having count(*) > 1
Col1 | Col2 | Cnt
-----+------+----
[no results]
This is because the count is for each combination of Col1 and Col2 (each of which is unique).
Finally, by using either a nested query (as in my answer) or an analytic function (as in @rs.'s answer), you'll get the following result (query changed slightly to return the count):
select t1.col1, t1.col2, cnt
from table1 t1
join (select col1, count(*) as cnt
from table1
group by col1
having count (col1) > 1) t2
on t1.col1 = t2.col1
Col1 | Col2 | Cnt
-----+------+----
1 | A | 2
1 | B | 2
2 | C | 2
2 | D | 2
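The three cases walked through above can be reproduced with a short sketch using Python's sqlite3 module and the same five sample rows. SQLite stands in for Oracle here, and the function name duplicate_queries is made up. SQLite will not raise ORA-00979 for an ungrouped column, so only the grouped queries are shown.

```python
import sqlite3


def duplicate_queries():
    """Return results of grouping by col1, by (col1, col2), and of the join."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 TEXT)")
    conn.executemany(
        "INSERT INTO table1 VALUES (?, ?)",
        [(1, "A"), (1, "B"), (2, "C"), (2, "D"), (3, "E")],
    )

    # Duplicates of col1 only.
    by_col1 = conn.execute(
        "SELECT col1, COUNT(*) AS cnt FROM table1 "
        "GROUP BY col1 HAVING COUNT(*) > 1 ORDER BY col1"
    ).fetchall()

    # Grouping by both columns: every (col1, col2) pair is unique, so no rows.
    by_both = conn.execute(
        "SELECT col1, col2, COUNT(*) AS cnt FROM table1 "
        "GROUP BY col1, col2 HAVING COUNT(*) > 1"
    ).fetchall()

    # Join back to the duplicate list to pick up col2 for each duplicate.
    joined = conn.execute(
        """
        SELECT t1.col1, t1.col2, t2.cnt
        FROM table1 AS t1
        JOIN (SELECT col1, COUNT(*) AS cnt
              FROM table1
              GROUP BY col1
              HAVING COUNT(col1) > 1) AS t2
          ON t1.col1 = t2.col1
        ORDER BY t1.col1, t1.col2
        """
    ).fetchall()
    conn.close()
    return by_col1, by_both, joined


if __name__ == "__main__":
    for result in duplicate_queries():
        print(result)
```

Note that the join condition has to use the alias t1 rather than the table name, since the alias replaces the table name for the rest of the query.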
You should list all selected columns in the group by clause as well.
select col1,
col2,
count(col1)
from table1
group by col1, col2
having count (col1) > 1
order by 2 desc;
Cause of Error
You tried to execute an SQL SELECT statement that included a GROUP BY
function (ie: SQL MIN Function, SQL MAX Function, SQL SUM Function,
SQL COUNT Function) and an expression in the SELECT list that was not
in the SQL GROUP BY clause.
select col1,
col2,
count(col1)
from table1
group by col1,col2
having count (col1) > 1
order by 2 desc;