Looking for an alternative SQL statement - sql

Given the following table with 2 columns:
c1 | c2
------------
a1 | b1
a1 | b1
a2 | b2
a2 | b3
a3 | b3
I want to return the values from column c2 that appear multiple times for the same c1 value. I am using the following SQL query to get the required result:
SELECT DISTINCT c2 AS c
FROM ( SELECT c1, c2, COUNT(*) AS rowcount
FROM table
GROUP BY c1, c2 HAVING COUNT(*) > 1 ) AS dup
Result:
c
---
b1
Is there an alternative to the above query?

Based on your description, you can use:
select distinct c1
from (select t.*, count(*) over (partition by c2) as cnt
from t
) t
where cnt >= 2;
Based on your sample results:
select c1
from t
group by c1
having count(*) >= 2;
And based on the revised question:
select c2
from t
group by c2
having count(*) >= 2;

Use COUNT in the HAVING clause instead of a subquery:
select c1
from table
group by c1
having count(c2) > 1

Most answers above will work if you want all the values in c1 that appear more than once in the table (even with the same value in c2).
If you want only the values of c1 that have multiple DISTINCT values in c2, you can use:
SELECT c1
FROM table
GROUP BY c1
HAVING COUNT(DISTINCT c2) > 1
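To check how these variants behave on the sample data from the question, here is a small sketch using Python's built-in sqlite3 module. SQLite, the table name t, and the variable names are assumptions made for illustration; the original posts do not name a database. It compares grouping by both columns with grouping by c2 alone:

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c1 TEXT, c2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a1", "b1"), ("a1", "b1"), ("a2", "b2"), ("a2", "b3"), ("a3", "b3")],
)

# c2 values that repeat within the same c1 (no subquery needed).
same_pair = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT c2 FROM t GROUP BY c1, c2 HAVING COUNT(*) > 1 ORDER BY c2"
    )
]

# c2 values that repeat anywhere in the table, regardless of c1.
any_c1 = [
    row[0]
    for row in conn.execute(
        "SELECT c2 FROM t GROUP BY c2 HAVING COUNT(*) >= 2 ORDER BY c2"
    )
]

print(same_pair)  # ['b1']
print(any_c1)     # ['b1', 'b3']
```

On this data only the query grouped by both c1 and c2 reproduces the single row b1 shown in the question; grouping by c2 alone also returns b3, because b3 occurs under both a2 and a3.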

Related

hive - how to select top N elements for each match

Please consider a Hive table, TableA, as shown below.
A basic SELECT ... WHERE query works fine when we want all the rows that match the condition in the WHERE clause. I want to limit the returned rows to a number, say N, for each value matched by the WHERE clause.
Let me explain with an example:
(1)
Consider this table:
TableA
c1 c2
1 a
1 b
1 c
2 d
2 e
2 f
(2) Consider this query:
SELECT c1, c2
FROM TableA
WHERE c1 in (1,2)
(3) As you can imagine, it would produce this result:
Actual Results:
c1 c2
1 a
1 b
1 c
2 d
2 e
2 f
(4)
Desired Result:
c1 c2
1 a
1 b
2 d
2 e
Question: How do I modify the query in (2) to get the desired output shown in (4)?
You can use the row_number() function to do this:
select c1,c2
from (SELECT c1, c2, row_number() over(partition by c1 order by c2) as rnum
FROM TableA
--add a where clause as needed
) t
where rnum <= 2
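If there are only 2 values for c1, you can combine two limited queries. Each branch is wrapped in a subquery so that ORDER BY and LIMIT apply to that branch only:
SELECT c1, c2 FROM (SELECT c1, c2 FROM TableA WHERE c1 = 1 ORDER BY c2 LIMIT 2) a
UNION ALL
SELECT c1, c2 FROM (SELECT c1, c2 FROM TableA WHERE c1 = 2 ORDER BY c2 LIMIT 2) b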
For more than 2 values, use rank() (the alias is rnk because rank is itself a function name):
select c1,c2 from
(
select c1,c2,rank() over (partition by c1 order by c2) as rnk
from TableA
) t
where rnk < 3;
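The two window-function answers above give the same rows on the sample data but differ once c2 has ties within a c1 group. The following sketch illustrates this with Python's built-in sqlite3 module instead of Hive (an assumption made so the example is self-contained; it needs SQLite 3.25 or later for window functions). The helper top_two and the extra duplicate row are hypothetical additions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (c1 INTEGER, c2 TEXT)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "c"), (2, "d"), (2, "e"), (2, "f")],
)

# Same query shape as in the answers; only the ranking function varies.
TOP_N_SQL = """
SELECT c1, c2
FROM (SELECT c1, c2,
             {fn}() OVER (PARTITION BY c1 ORDER BY c2) AS rnk
      FROM TableA) t
WHERE rnk <= 2
ORDER BY c1, c2
"""

def top_two(fn):
    """Return the top 2 rows per c1 using the given ranking function name."""
    return conn.execute(TOP_N_SQL.format(fn=fn)).fetchall()

top_rows = top_two("row_number")
print(top_rows)  # [(1, 'a'), (1, 'b'), (2, 'd'), (2, 'e')]

# Add a duplicate (1, 'b') row so that c2 has a tie inside group c1 = 1.
conn.execute("INSERT INTO TableA VALUES (1, 'b')")
with_ties_row_number = top_two("row_number")
with_ties_rank = top_two("rank")
print(len(with_ties_row_number))  # 4: still exactly 2 rows per group
print(len(with_ties_rank))        # 5: both tied 'b' rows get rank 2
```

If exactly N rows per group are required, row_number() is the safer choice; rank() keeps every row that ties for a place within the top N.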

SQL - need to print the same number of records and same product for customer in same table

I have a table with customer_id and product_id:
customer_id product_id
c1 1
c1 2
c1 3
c2 1
c2 2
c2 3
c3 5
c4 5
c5 3
I need to keep only the customers who bought exactly the same products, and the same number of products, as at least one other customer.
In addition, customer c5 (product 3) is not valid: he shares a product_id with other customers, but his number of records does not match theirs.
This is the query I have tried:
SELECT T1.ORDER_ID FROM #ORDER T1
WHERE EXISTS (SELECT * FROM #ORDER T2
WHERE T2.PRODUCT_ID = T1.PRODUCT_ID
AND T2.ORDER_ID != T1.ORDER_ID
GROUP BY T2.ORDER_ID)
The output should be like this
customer_id product_id
c1 1
c1 2
c1 3
c2 1
c2 2
c2 3
c3 5
c4 5
I tried the below approach, see if it applies in your case too. It doesn't seem like the best solution, but it works.
TEST_DROP1:
cust_id product_id
C1 1
C1 2
C1 3
C2 1
C2 2
C2 3
C3 5
C4 5
C5 3
Solution:
Step 1:
CREATE TABLE TEST_DROP2 AS
SELECT CUST_ID,
LISTAGG(PRODUCT_ID, ',') WITHIN GROUP (
ORDER BY PRODUCT_ID) prods
FROM TEST_DROP1
GROUP BY CUST_ID;
TEST_DROP2:
cust_id prods
C1 1,2,3
C2 1,2,3
C3 5
C4 5
C5 3
Step 2: run the query below:
SELECT *
FROM TEST_DROP1
WHERE cust_id IN
(SELECT CUST_ID
FROM TEST_DROP2
WHERE PRODS IN
( SELECT PRODS FROM TEST_DROP2 GROUP BY PRODS HAVING COUNT(1)>1
)
)
ORDER BY CUST_ID,
product_id;
Result:
C1 1
C1 2
C1 3
C2 1
C2 2
C2 3
C3 5
C4 5
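The two steps above boil down to building one "signature" per customer (the sorted product list) and keeping customers whose signature is shared. As a rough illustration of that idea outside the database, here is the same logic in plain Python; the variable names are hypothetical and the data is the sample from the question:

```python
from collections import defaultdict

# Sample rows from TEST_DROP1: (cust_id, product_id).
rows = [
    ("C1", 1), ("C1", 2), ("C1", 3),
    ("C2", 1), ("C2", 2), ("C2", 3),
    ("C3", 5), ("C4", 5), ("C5", 3),
]

# Step 1: collapse each customer's products into one sorted signature,
# playing the role of LISTAGG(...) WITHIN GROUP (ORDER BY ...).
products = defaultdict(list)
for cust, prod in rows:
    products[cust].append(prod)
signature = {cust: tuple(sorted(prods)) for cust, prods in products.items()}

# Step 2: count how many customers share each signature
# (the GROUP BY prods HAVING COUNT(1) > 1 part).
sig_count = defaultdict(int)
for sig in signature.values():
    sig_count[sig] += 1

# Keep only the rows of customers whose signature occurs more than once.
result = sorted(row for row in rows if sig_count[signature[row[0]]] > 1)
for cust, prod in result:
    print(cust, prod)
```

C5 is dropped because its signature (3,) is not shared by any other customer, even though product 3 also appears under C1 and C2.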

Get specific items from partition which should not be in other partition

I have the table below:
ID type group_name creation_date
1 A G1 C1
2 B G2 C2
3 C G2 C3
4 B G1 C4
I want to extract the older type items in each group, but if an item's type is the latest item in another partition, I don't want to extract it.
So, for G1, I will have 2 items, A and B, where C1 > C4.
For G2, I will have 2 items, B and C, where C2 > C3.
So B is the older item for group G1 and C is the older item for group G2.
But I don't want to extract B for G1, since it is the latest for G2. Hence the output should be C only.
Could anyone help me achieve this?
Query:
SELECT DISTINCT
type
FROM (
SELECT type,
rnk,
COUNT( CASE rnk WHEN 1 THEN 1 END ) OVER ( PARTITION BY type ) AS ct
FROM (
SELECT type,
RANK() OVER ( PARTITION BY group_name ORDER BY creation_date DESC ) AS rnk
FROM table_name
)
)
WHERE rnk > 1 AND ct = 0;
Output:
TYPE
----
C
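To see the query above run end to end, here is a sketch using Python's built-in sqlite3 module (an assumption; it needs SQLite 3.25 or later for window functions). The table name items and the concrete dates are hypothetical; the dates were chosen only to satisfy C1 > C4 and C2 > C3 from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER, type TEXT, group_name TEXT, creation_date TEXT)"
)
# Hypothetical ISO dates standing in for C1..C4, with C1 > C4 and C2 > C3.
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        (1, "A", "G1", "2020-01-04"),  # C1: latest in G1
        (2, "B", "G2", "2020-01-03"),  # C2: latest in G2
        (3, "C", "G2", "2020-01-02"),  # C3: older in G2
        (4, "B", "G1", "2020-01-01"),  # C4: older in G1
    ],
)

query = """
SELECT DISTINCT type
FROM (
    SELECT type, rnk,
           -- number of groups in which this type is the latest item
           COUNT(CASE rnk WHEN 1 THEN 1 END) OVER (PARTITION BY type) AS ct
    FROM (
        SELECT type,
               RANK() OVER (PARTITION BY group_name
                            ORDER BY creation_date DESC) AS rnk
        FROM items
    )
)
WHERE rnk > 1 AND ct = 0
"""
old_types = [row[0] for row in conn.execute(query)]
print(old_types)  # ['C']
```

B is filtered out because its count of rank-1 rows is 1 (it is the latest item in G2), while C is never the latest item in any group.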

How to select entry with greater value in postgresql

I have two or more rows like:
c1|c2 |c3 |c4
--+---+---+---
1 | Z | B | 29
2 | Z | B | 19
and I want to have the entry with the larger c4 value:
1 | Z | B | 29
I tried to query the max value from c4, after a group by of c2 and c3, but this doesn't work.
Postgres specific solution:
select distinct on (c2,c3) c1, c2, c3, c4
from the_table
order by c2,c3,c4 desc
ANSI SQL solution:
select c1,c2,c3,c4
from (
select c1,c2,c3,c4,
row_number() over (partition by c2,c3 order by c4 desc) as rn
from the_table
) t
where rn = 1;
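If you only need the single row with the largest c4 in the whole table (rather than one row per c2, c3 group), you can order the results in descending order by c4 and output only one row (see the LIMIT clause):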
SELECT *
FROM table_name
ORDER BY c4 DESC
LIMIT 1
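The difference between the per-group answers and the LIMIT 1 answer only shows up when the table contains more than one (c2, c3) group. A small sketch, using Python's built-in sqlite3 module rather than PostgreSQL (an assumption for self-containment; SQLite 3.25 or later is needed for row_number()), with a hypothetical third row added to the sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (c1 INTEGER, c2 TEXT, c3 TEXT, c4 INTEGER)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?, ?)",
    [
        (1, "Z", "B", 29),
        (2, "Z", "B", 19),
        (3, "Y", "A", 5),  # hypothetical extra row forming a second group
    ],
)

# One row per (c2, c3) group: the row with the largest c4.
per_group = conn.execute("""
    SELECT c1, c2, c3, c4
    FROM (SELECT c1, c2, c3, c4,
                 row_number() OVER (PARTITION BY c2, c3
                                    ORDER BY c4 DESC) AS rn
          FROM the_table) t
    WHERE rn = 1
    ORDER BY c1
""").fetchall()

# One row for the whole table: the global maximum of c4.
overall = conn.execute(
    "SELECT c1, c2, c3, c4 FROM the_table ORDER BY c4 DESC LIMIT 1"
).fetchall()

print(per_group)  # [(1, 'Z', 'B', 29), (3, 'Y', 'A', 5)]
print(overall)    # [(1, 'Z', 'B', 29)]
```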

SQL Return Null if One Column is Null (Opposite of COALESCE())

In advance, I would like to say thanks for the help. This is a great community and I've found many programming answers here.
I have a table with multiple columns, 5 of which contain dates or null.
I would like to write an sql query that essentially coalesces the 5 columns into 1 column, with the condition that if 1 of the five columns contains a "NULL" value, the returned value is null. Essentially the opposite of the coalesce condition of returning the first non-null, I want to return the first null. If none are null, returning the greatest of the 5 dates would be optimal, however I can settle with returning any one of the 5 dates.
C1 C2 C3 C4 C5
-- -- -- -- --
1/1/1991 1/1/1991 1/1/1991 1/1/1991 2/2/1992
NULL 1/1/1991 1/1/1991 1/1/1991 1/1/1991
Query Returns:
C1
--
2/2/1992
NULL
Thank you very much.
(Server is MSSQL2008)
select greatest(c1, c2, c3, c4, c5)
from table;
Life can be so easy :-)
(edit: works on Oracle)
Without overthinking it:
SELECT
CASE WHEN c1 is null or c2 is null or c3 is null or c4 is null or c5 is null
THEN null
ELSE c1
END
FROM mytable
My edit is as follows:
CASE
WHEN (c1 >= c2 AND c1 >= c3) THEN c1
WHEN (c2 >= c1 AND c2 >= c3) THEN c2
WHEN (c3 >= c1 AND c3 >= c2) THEN c3
END
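The NULL check and the "greatest date" step can be combined into one CASE expression. The sketch below tries this on the sample rows using Python's built-in sqlite3 module, which is an assumption: the question targets MSSQL 2008, and the multi-argument max() used in the ELSE branch is a SQLite scalar function, so on another server that branch would need a different expression (for example a longer CASE). Dates are stored as ISO text so that they compare correctly as strings:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (c1 TEXT, c2 TEXT, c3 TEXT, c4 TEXT, c5 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        ("1991-01-01", "1991-01-01", "1991-01-01", "1991-01-01", "1992-02-02"),
        (None, "1991-01-01", "1991-01-01", "1991-01-01", "1991-01-01"),
    ],
)

query = """
SELECT CASE
         -- any NULL among the five columns makes the whole result NULL
         WHEN c1 IS NULL OR c2 IS NULL OR c3 IS NULL
              OR c4 IS NULL OR c5 IS NULL THEN NULL
         -- otherwise return the latest of the five dates (SQLite-specific max)
         ELSE max(c1, c2, c3, c4, c5)
       END AS result
FROM mytable
ORDER BY rowid
"""
result = [row[0] for row in conn.execute(query)]
print(result)  # ['1992-02-02', None]
```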
Try this:
SELECT
CASE WHEN t1.SomeDate IS NULL THEN NULL ELSE MAX(t1.SomeDate) END AS TheVal
FROM
(
SELECT C1 AS SomeDate FROM Table_1
UNION ALL
SELECT C2 AS SomeDate FROM Table_1
UNION ALL
SELECT C3 AS SomeDate FROM Table_1
UNION ALL
SELECT C4 AS SomeDate FROM Table_1
UNION ALL
SELECT C5 AS SomeDate FROM Table_1
) t1
GROUP BY
t1.SomeDate
Perhaps a variation on COALESCE (replace -1 with an invalid value)?
SELECT CASE WHEN COALESCE(C1,C2,C3,C4,C5,-1) = -1 THEN NULL ELSE COALESCE(C1,C2,C3,C4,C5) END
Maybe with LEAST?
I don't know how this works with NULL.
SELECT
-- Check the most specific condition first; otherwise WHEN C1 IS NULL hides the later branches.
CASE WHEN C1 IS NULL AND C2 IS NULL AND C3 IS NULL AND C4 IS NULL THEN C5
WHEN C1 IS NULL AND C2 IS NULL AND C3 IS NULL THEN C4
WHEN C1 IS NULL AND C2 IS NULL THEN C3
WHEN C1 IS NULL THEN C2
ELSE C1 END AS REQUIREDNOTNULLVALUE
FROM TABLE1