Compare one column value between row number 1 and row number 2

For example, I have the following table, baskets:
basket fruit quantity
1 mango 2
1 apple 2
2 banana 2
2 banana 3
2 banana 3
Now I have to find the baskets that have more than one row and whose rows contain different fruit types. So basket number 1 should come out.
I have written the following SQL:
select count(*),c.basket from baskets c group by c.basket having count(*)>1;
But after this how can I get the baskets where the fruit types are different to each other among the rows? It should be basket number 1 in this case.

Just add to the HAVING clause:
select count(*), c.basket
from baskets c
group by c.basket
having count(*)>1
AND COUNT(DISTINCT fruit)>1;
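To check this against the sample data, here is a minimal sketch that loads the question's rows into an in-memory SQLite database and runs the same HAVING clause. SQLite and the Python wrapper are assumptions made only for the demonstration; the column types are guessed from the sample.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE baskets (basket INTEGER, fruit TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO baskets VALUES (?, ?, ?)",
    [(1, "mango", 2), (1, "apple", 2),
     (2, "banana", 2), (2, "banana", 3), (2, "banana", 3)],
)

# Keep baskets with more than one row AND more than one distinct fruit.
query = """
SELECT c.basket, COUNT(*) AS row_count
FROM baskets c
GROUP BY c.basket
HAVING COUNT(*) > 1
   AND COUNT(DISTINCT c.fruit) > 1
"""
baskets_with_mixed_fruit = conn.execute(query).fetchall()
print(baskets_with_mixed_fruit)  # [(1, 2)]
```

Note that a basket with more than one distinct fruit necessarily has more than one row, so the COUNT(*) > 1 condition is implied by the second one.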

I would use min() and max():
select b.basket
from baskets b
group by b.basket
having min(b.fruit) <> max(b.fruit);
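A small sketch of the min/max comparison on the question's sample rows, run through an in-memory SQLite database (the SQLite setup is an assumption for the demonstration). The aggregate comparison sits in HAVING because aggregates are filtered after grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE baskets (basket INTEGER, fruit TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO baskets VALUES (?, ?, ?)",
    [(1, "mango", 2), (1, "apple", 2),
     (2, "banana", 2), (2, "banana", 3), (2, "banana", 3)],
)

# If the smallest and largest fruit names in a basket differ,
# the basket holds at least two different fruits.
query = """
SELECT b.basket
FROM baskets b
GROUP BY b.basket
HAVING MIN(b.fruit) <> MAX(b.fruit)
"""
mixed_baskets = [row[0] for row in conn.execute(query)]
print(mixed_baskets)  # [1]
```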

I would use exists, which returns the matching rows rather than just the basket numbers:
select b.*
from baskets b
where exists (select 1
from baskets b1
where b1.basket = b.basket and
b1.fruit <> b.fruit
);

Related

Merge row values based on other column value

I'm trying to merge the Property ID values of rows that share the same value in another column (Customer ID). Below is my base table:
Customer ID  Property ID  Bookings per customer  Cancellations per customer
A            1            0                      1
B            2            10                     1
C            3            100                    1
C            4            100                    1
D            5            20                     1
Here is the SQL query I used
select customer_id, property_id, bookings_per_customer, cancellations_per_customer
from table
And this is what I want to see. Any idea what the query to get this would be? We use Presto SQL.
Thanks!
Customer ID  Property ID  Bookings per customer  Cancellations per customer
A            1            0                      1
B            2            10                     1
C            3 , 4        100                    1
D            5            20                     1

We can try:
SELECT
customer_id,
ARRAY_JOIN(ARRAY_AGG(property_id), ',') AS properties,
bookings_per_customer,
cancellations_per_customer
FROM yourTable
GROUP BY
customer_id,
bookings_per_customer,
cancellations_per_customer;
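The grouping logic can be tried locally with a rough sketch like the one below. It runs on SQLite rather than Presto, so GROUP_CONCAT stands in for ARRAY_JOIN(ARRAY_AGG(...)); that substitution and the column types are assumptions for the demonstration only. The table name yourTable is the placeholder from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (customer_id TEXT, property_id INTEGER, "
    "bookings_per_customer INTEGER, cancellations_per_customer INTEGER)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [("A", 1, 0, 1), ("B", 2, 10, 1), ("C", 3, 100, 1),
     ("C", 4, 100, 1), ("D", 5, 20, 1)],
)

# GROUP_CONCAT is SQLite's string aggregate; it plays the role that
# ARRAY_JOIN(ARRAY_AGG(property_id), ',') plays in the Presto query.
query = """
SELECT customer_id,
       GROUP_CONCAT(property_id, ',') AS properties,
       bookings_per_customer,
       cancellations_per_customer
FROM yourTable
GROUP BY customer_id, bookings_per_customer, cancellations_per_customer
ORDER BY customer_id
"""
merged_rows = conn.execute(query).fetchall()
for row in merged_rows:
    print(row)
```

Customer C collapses into a single row whose properties value holds both 3 and 4. Neither aggregate guarantees the order of the concatenated values unless an ordering is requested explicitly.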

Select rows where the combination of two columns is unique and we only display rows where the first column is not unique

I have an order line table that looks like this:
ID  Order ID  Product Reference  Variant
1   1         Banana             Green
2   1         Banana             Yellow
3   2         Apple              Green
4   2         Banana             Brown
5   3         Apple              Red
6   3         Apple              Yellow
7   4         Apple              Yellow
8   4         Banana             Green
9   4         Banana             Yellow
10  4         Pear               Green
11  4         Pear               Green
12  4         Pear               Green
I want to know how often people place an order with a combination of different fruit products. I want to know the orderId for that situation and which productReference was combined in the orders.
I only care about the product, not the variant.
I would imagine the desired output looking like this, a simple table that gives insight into which product combos are ordered:
Order ID  Product
2         Banana
2         Apple
4         Banana
4         Apple
4         Pear
I just need data output of the combination Banana+Apple and Banana+Apple+Pear happening so I can get more insight in the frequency of how often this happens. We expect most of our customers to only order Apple, Banana or Pear products, but that assumption needs to be verified.
Problem
I get stuck after the first step:
select orderId, productReference, count(*) as amount
from OrderLines
group by orderId, productReference
This outputs:
Order ID  Product Reference  amount
1         Banana             2
2         Apple              1
2         Banana             1
3         Apple              2
4         Apple              1
4         Banana             2
4         Pear               3
I just don't know how to move on from this step to get the data I want.

You can use a window count() over() to count the distinct products per order, then keep only the orders with more than one:
select *
from
(
select orderId, productReference, count(*) as amount
, count(productReference) over(partition by orderId) np
from OrderLines
group by orderId, productReference
) t
where np > 1
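The window-count idea can be checked with a small sketch on the question's sample rows. It assumes an in-memory SQLite database (version 3.25 or later, which added window functions), and it nests the GROUP BY one level deeper than the query above so that the window step runs over the already-aggregated rows; both choices are assumptions made for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderLines (id INTEGER, orderId INTEGER, "
    "productReference TEXT, variant TEXT)"
)
conn.executemany(
    "INSERT INTO OrderLines VALUES (?, ?, ?, ?)",
    [(1, 1, "Banana", "Green"), (2, 1, "Banana", "Yellow"),
     (3, 2, "Apple", "Green"), (4, 2, "Banana", "Brown"),
     (5, 3, "Apple", "Red"), (6, 3, "Apple", "Yellow"),
     (7, 4, "Apple", "Yellow"), (8, 4, "Banana", "Green"),
     (9, 4, "Banana", "Yellow"), (10, 4, "Pear", "Green"),
     (11, 4, "Pear", "Green"), (12, 4, "Pear", "Green")],
)

# Inner query: one row per (order, product) with its line count.
# Middle query: np = number of distinct products in the order.
# Outer query: keep only orders that combine several products.
query = """
SELECT orderId, productReference, amount
FROM (
    SELECT g.orderId, g.productReference, g.amount,
           COUNT(*) OVER (PARTITION BY g.orderId) AS np
    FROM (
        SELECT orderId, productReference, COUNT(*) AS amount
        FROM OrderLines
        GROUP BY orderId, productReference
    ) g
) t
WHERE np > 1
ORDER BY orderId, productReference
"""
combo_rows = conn.execute(query).fetchall()
for row in combo_rows:
    print(row)
```

Only orders 2 and 4 remain, which matches the desired output in the question.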

You need only the rows where an Order_Id has different products; you can do this many ways.
One way is to aggregate and filter to only rows where the min product <> the max product, then use a correlation to find matching orders:
select distinct t.Order_ID, t.Product_Reference
from t
where exists (
select *
from t t2
where t2.Order_ID = t.Order_ID
group by order_id
having Min(Product_Reference) != Max(Product_Reference)
);

You could use STRING_AGG:
https://learn.microsoft.com/en-us/sql/t-sql/functions/string-agg-transact-sql?view=sql-server-ver16
Here's an example:
SELECT orderID, STRING_AGG(productReference, ' ') products
FROM
(
SELECT DISTINCT orderID, productReference
FROM orderLines
) order_products
GROUP BY orderID
For each order ID, this pulls out the distinct products, then the STRING_AGG combines them into one field.
Output
orderID  products
1        Banana
2        Apple Banana
3        Apple
4        Apple Banana Pear
SQL fiddle example: http://sqlfiddle.com/#!18/8a677/6

Filter rows to return the exact relationship

I have two tables, expenses and categories, which have a many-to-many relationship through the table expenses_categories. I'm trying to implement a filter by categories. Let's say I provide the ids for categories A and B; I want to return the expenses that have exactly A and B and nothing else. For example:
Expense X has Categories A, B, and C
Expense Y has Categories A and B
Expense Z has Category B
I want to return only Expense Y.
I'm using PostgreSQL, by the way. I really need to learn how to do this kind of thing.
Categories
ID  NAME
1   TV
2   CC
3   NET
ExpensesCategories
expense_id  category_id
1           1
1           2
2           1
2           2
2           3
3           1
4           2
I want to get all the Expenses that have ONLY the Categories 1 and 2.
In that case, I expect to get only Expense 1:
expense_id  category_id
1           1
1           2

You can group by expense_id and use STRING_AGG() in the HAVING clause to collect all the category_ids of each expense_id and compare it to a string like '1,2' which contains the category_ids that you want in ascending order as a comma separated list:
SELECT expense_id
FROM ExpensesCategories
GROUP BY expense_id
HAVING STRING_AGG(category_id::text, ',' ORDER BY category_id) = '1,2';
If you want all the rows of these expense_ids in ExpensesCategories, use the above query as a CTE:
WITH cte AS (
SELECT expense_id
FROM ExpensesCategories
GROUP BY expense_id
HAVING STRING_AGG(category_id::text, ',' ORDER BY category_id) = '1,2'
)
SELECT *
FROM ExpensesCategories
WHERE expense_id IN (SELECT expense_id FROM cte);
See the demo.
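As an alternative that avoids string comparison, the same "exactly these categories" filter can be sketched with counts. This is a different technique from the STRING_AGG answer: it assumes each (expense_id, category_id) pair appears at most once, and it is run here on an in-memory SQLite database purely for demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ExpensesCategories (expense_id INTEGER, category_id INTEGER)")
conn.executemany(
    "INSERT INTO ExpensesCategories VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 1), (4, 2)],
)

# An expense matches exactly {1, 2} when it has two rows in total
# and both of those rows belong to the wanted categories.
query = """
SELECT expense_id
FROM ExpensesCategories
GROUP BY expense_id
HAVING COUNT(*) = 2
   AND SUM(CASE WHEN category_id IN (1, 2) THEN 1 ELSE 0 END) = 2
"""
exact_matches = [row[0] for row in conn.execute(query)]
print(exact_matches)  # [1]
```

Expense 2 is rejected because it has a third category, and expenses 3 and 4 are rejected because they have only one of the two.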

Count Distinct Combinations in SQL

Is there a way in SQL to count the number of occurrences of a distinct combination of two fields in a table, e.g.
categorynum itemnum
1 3
2 1
1 3
1 2
3 1
1 3
and return 3 when counting occurrences of (1;3)?

Sure, just use a regular GROUP BY / COUNT(*):
SELECT categorynum, itemnum, COUNT(*) occurrences
FROM {table}
GROUP BY categorynum, itemnum
If you want a particular combination just add a WHERE clause (before the GROUP BY):
WHERE categorynum = 1 AND itemnum = 3
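A minimal sketch of both variants on the question's sample rows, using an in-memory SQLite database. The table name combos is hypothetical, standing in for the {table} placeholder above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "combos" is a made-up table name used only for this illustration.
conn.execute("CREATE TABLE combos (categorynum INTEGER, itemnum INTEGER)")
conn.executemany(
    "INSERT INTO combos VALUES (?, ?)",
    [(1, 3), (2, 1), (1, 3), (1, 2), (3, 1), (1, 3)],
)

# Count every distinct (categorynum, itemnum) combination.
all_counts = conn.execute("""
    SELECT categorynum, itemnum, COUNT(*) AS occurrences
    FROM combos
    GROUP BY categorynum, itemnum
    ORDER BY categorynum, itemnum
""").fetchall()
print(all_counts)  # [(1, 2, 1), (1, 3, 3), (2, 1, 1), (3, 1, 1)]

# Count one particular combination by filtering before grouping.
one_count = conn.execute("""
    SELECT COUNT(*)
    FROM combos
    WHERE categorynum = 1 AND itemnum = 3
""").fetchone()[0]
print(one_count)  # 3
```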

Identify duplicate records and update them with the ID of first occurrence

I have a table like this.
ID Name Source ID
1 Orange 0
2 Pear 0
3 Apple 0
4 Orange 0
5 Apple 0
6 Banana 0
7 Orange 0
What I want to do is:
For the records with the FIRST occurrence of "Name", I want to update the "Source Id" with the "Id" value
For the records with the SECOND and SUBSEQUENT occurrences of "Name", I want to update the "Source Id" with the "Id" value of the FIRST occurrence
So, the table should be updated as follows:
ID Name Source ID
1 Orange 1
2 Pear 2
3 Apple 3
4 Orange 1
5 Apple 3
6 Banana 6
7 Orange 1
How can I do this in SQL (Oracle in particular, but general SQL is fine as well)?
Thanks!

Something like this should get you what you want:
update table a
set source_id = (
select min(id)
from table b
where b.name = a.name
);
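The correlated-subquery update can be tried on the question's sample data with a short sketch. It runs on an in-memory SQLite database rather than Oracle, and the table name fruits and the snake_case column names are hypothetical choices for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "fruits" and its column names are illustrative, not from the question.
conn.execute("CREATE TABLE fruits (id INTEGER, name TEXT, source_id INTEGER)")
conn.executemany(
    "INSERT INTO fruits VALUES (?, ?, 0)",
    [(1, "Orange"), (2, "Pear"), (3, "Apple"), (4, "Orange"),
     (5, "Apple"), (6, "Banana"), (7, "Orange")],
)

# For every row, look up the smallest id that shares its name.
conn.execute("""
    UPDATE fruits
    SET source_id = (
        SELECT MIN(b.id)
        FROM fruits b
        WHERE b.name = fruits.name
    )
""")

updated_rows = conn.execute(
    "SELECT id, name, source_id FROM fruits ORDER BY id"
).fetchall()
for row in updated_rows:
    print(row)
```

This relies on the first occurrence being the row with the lowest id, which holds for the sample table.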

UPDATE MyTable
SET SourceID = Sub.ID
FROM MyTable
INNER JOIN (SELECT MIN(ID) as ID, Name FROM MyTable GROUP BY Name) Sub
ON Sub.Name = MyTable.Name
Just use a subquery that lists the min id per name.

Since ID is an auto-incrementing value (right?), the first ID can be calculated as MIN(ID):
UPDATE fruits
SET SourceID = ag.ID
FROM fruits f
INNER JOIN
(
SELECT MIN(ID) as ID, Name FROM fruits
GROUP BY Name
) ag
ON ag.Name = f.Name
