SQL query to show customers who bought apples, but not potatoes

Not sure how to explain this.
I have a similar table, but I have simplified it to the following:
I have a table of goods shipped to different customers. Some have bought only apples; others have bought apples and potatoes.
I want an SQL query to return only the rows where To_be_billed = 'Yes' AND the customer hasn't bought any vegetables.
So, for example, if the table looks like this:
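Item | Name   | Group     | To_be_billed | CustomerNo.
-------------------------------------------------------
2000 | Apple  | Fruit     | Yes          | 1
2000 | Apple  | Fruit     | No           | 2
2000 | Apple  | Fruit     | No           | 3
2000 | Apple  | Fruit     | Yes          | 4
2000 | Apple  | Fruit     | Yes          | 5
4000 | Potato | Vegetable | No           | 2
4000 | Potato | Vegetable | No           | 4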
I want the query to return:
Item | Name   | Group     | To_be_billed | CustomerNo.
-------------------------------------------------------
2000 | Apple  | Fruit     | Yes          | 1
2000 | Apple  | Fruit     | Yes          | 5
Customer 4 has bought apples and is to be billed, but that customer also bought potatoes, so it is to be ignored.

You can create a CTE to collect the CustomerNo. values that you need to ignore, and then use NOT EXISTS:
with bought_veg as
(
    select "CustomerNo."
    from tbl
    where tbl."Group" like 'Vegetable'
)
select tbl.*
from tbl
where not exists (select 1 from bought_veg where tbl."CustomerNo." = bought_veg."CustomerNo.")
  and tbl.To_be_billed = 'Yes'
Example without a CTE:
select tbl.*
from tbl
where not exists (select 1 from tbl t2 where tbl."CustomerNo." = t2."CustomerNo." and t2."Group" like 'Vegetable')
  and tbl.To_be_billed = 'Yes'
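To check the NOT EXISTS idea against the sample rows from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. SQLite is only an assumption for the sake of a runnable example (the question does not name a database), and the aliases t and v are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names follow the question; "Group" and "CustomerNo." need quoting.
conn.execute(
    'CREATE TABLE tbl (Item INTEGER, Name TEXT, "Group" TEXT, '
    'To_be_billed TEXT, "CustomerNo." INTEGER)'
)
rows = [
    (2000, "Apple", "Fruit", "Yes", 1),
    (2000, "Apple", "Fruit", "No", 2),
    (2000, "Apple", "Fruit", "No", 3),
    (2000, "Apple", "Fruit", "Yes", 4),
    (2000, "Apple", "Fruit", "Yes", 5),
    (4000, "Potato", "Vegetable", "No", 2),
    (4000, "Potato", "Vegetable", "No", 4),
]
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?, ?)", rows)

# Keep billable rows whose customer has no 'Vegetable' row anywhere in tbl.
query = '''
SELECT t.*
FROM tbl AS t
WHERE t.To_be_billed = 'Yes'
  AND NOT EXISTS (
        SELECT 1
        FROM tbl AS v
        WHERE v."CustomerNo." = t."CustomerNo."
          AND v."Group" = 'Vegetable'
      )
ORDER BY t."CustomerNo."
'''
billable = conn.execute(query).fetchall()
for row in billable:
    print(row)
```

With the sample data this leaves only the apple rows for customers 1 and 5; customer 4 is billable but is dropped because of the potato row.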

How to group by condition and average only if column value is not null in bigquery sql

Hi, I have a table that shows the category of each product and another table with the daily price of each product. I would like to get the average price per category, where the average does not count null values. How do I achieve this? Example of the product table:
product | category
--------------------
apple   | fruit
pear    | fruit
grape   | fruit
celery  | vegetables
cabbage | vegetables
chicken | meat
turkey  | meat
beef    | meat
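Another table has the date and one column per product (productid), with the daily prices in the rows. Some prices are null:
date | apple | pear | grape | celery | cabbage | chicken | turkey | beef
2022-01-01: 2, 4, 1, 2, 3, 4, 3 (one cell is null)
2022-01-02: 2, 2, 2, 4, 3 (three cells are null)
2022-01-03: 2, 2, 2, 3 (four cells are null)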
I want to turn this into:
date       | fruit | vegetables | meat
----------------------------------------
2022-01-01 | 3     | 1.5        | 3.3
2022-01-02 | 2     | 2          | 3.5
2022-01-03 | 2     | 2          | 3
The average should be taken only over the columns that are not null, and it would be better not to do it manually.
Consider the query below, using UNPIVOT and PIVOT:
SELECT * FROM (
  SELECT date, category, price
  FROM prices UNPIVOT (price FOR productid IN (apple, pear, grape, celery, cabbage, chicken, turkey, beef)) p
  JOIN category c ON c.product = p.productid
) PIVOT (AVG(price) FOR category IN ('fruit', 'vegetables', 'meat'))
ORDER BY date;
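The reason this works is that AVG skips NULL values, so once the wide price row is turned into (date, product, price) rows, a plain average per category does the right thing. Below is a rough sketch of the same idea outside BigQuery, using SQLite through Python's sqlite3. SQLite has no UNPIVOT or PIVOT, so the unpivot is emulated with UNION ALL and the pivot with conditional AVG; that substitution, the table name products, and the position of the NULL (grape) in the single sample row are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product TEXT, category TEXT);
INSERT INTO products VALUES
  ('apple', 'fruit'), ('pear', 'fruit'), ('grape', 'fruit'),
  ('celery', 'vegetables'), ('cabbage', 'vegetables'),
  ('chicken', 'meat'), ('turkey', 'meat'), ('beef', 'meat');
CREATE TABLE prices (date TEXT, apple REAL, pear REAL, grape REAL,
                     celery REAL, cabbage REAL,
                     chicken REAL, turkey REAL, beef REAL);
-- Assumed sample row: the grape price is missing (NULL).
INSERT INTO prices VALUES ('2022-01-01', 2, 4, NULL, 1, 2, 3, 4, 3);
""")

# Emulated UNPIVOT: one SELECT per product column, glued with UNION ALL.
# The column names come from our own products table, not from user input.
product_names = [r[0] for r in conn.execute("SELECT product FROM products")]
unpivot = " UNION ALL ".join(
    f"SELECT date, '{p}' AS product, {p} AS price FROM prices"
    for p in product_names
)

# Emulated PIVOT: one conditional AVG per category; AVG ignores NULLs.
query = f"""
SELECT u.date,
       AVG(CASE WHEN c.category = 'fruit' THEN u.price END) AS fruit,
       AVG(CASE WHEN c.category = 'vegetables' THEN u.price END) AS vegetables,
       AVG(CASE WHEN c.category = 'meat' THEN u.price END) AS meat
FROM ({unpivot}) AS u
JOIN products AS c ON c.product = u.product
GROUP BY u.date
ORDER BY u.date
"""
averages = conn.execute(query).fetchall()
print(averages)
```

For the assumed row, fruit averages to 3 (only apple and pear count), vegetables to 1.5, and meat to about 3.33.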
Consider also the approach below:
create temp function keys(input string) returns array<string> language js as """
  return Object.keys(JSON.parse(input));
""";
create temp function values(input string) returns array<string> language js as """
  return Object.values(JSON.parse(input));
""";
select *
from (
  select date, category, round(avg(safe_cast(price as float64)), 2) avg_price
  from prices t, unnest([struct(to_json_string(t) as json)]),
  unnest(keys(json)) product with offset
  join unnest(values(json)) price with offset using(offset)
  left join products using(product)
  where product != 'date'
  group by date, category
)
pivot (any_value(avg_price) for category IN ('fruit', 'vegetables', 'meat'))
If applied to the sample data in your question, the output is: [output image not preserved]
The potential benefit of the approach above is that it eliminates the need to list all the column names from the products table. There are 8 in your example, but in reality there are most likely many more. Obviously, another way to address this is to build a dynamic query and run it using EXECUTE IMMEDIATE, for which you can find quite a number of examples here on SO.
But, assuming that the number of categories is significantly lower than the number of products (just a few, as in your example), I would use this approach, as EXECUTE IMMEDIATE has its own drawbacks.

First Best Match Not Already Used

I'm not sure how to express this in SQL, if it's even possible, or what to even call it.
I want, for each record in Table A, the first best matching record in Table B that wasn't already picked as a best match. For example, suppose I have a Generic Shopping List and a Food Menu:
Table A
Generic Shopping List
---------------------
Food Type
---------------------
Meat
Meat
Vegetable
Vegetable
Vegetable
Vegetable
Fruit
Fruit
Dairy

Table B
Food Menu
----------------------
Food       Food Type
----------------------
Tomatoes   Vegetable
Lettuce    Vegetable
Bacon      Vegetable
Bacon      Meat
Beef       Meat
Apple      Fruit
Orange     Fruit
Bacon      Fruit
Milk       Dairy
Cheese     Dairy
Yogurt     Dairy
With a query or join, it's easy to get the Top 1 match:
Table/Query C
Automagic Shopping
------------------
Food
------------------
Bacon
Bacon
Tomatoes
Tomatoes
Tomatoes
Tomatoes
Apple
Apple
Milk
I know how to do that, and because I like bacon, I could live with this. Unfortunately, I really need the full breadth of the available food options, such that I have slots available for it.
Table/Query C
Better Magic Shopping
---------------------
Food
---------------------
Bacon
Beef
Tomatoes
Lettuce
Bacon
<NULL - No More Available Matches - Don't Care>
Apple
Orange
Milk
If this can be done in Access, great. If it can't be done in Access, but it can be done in another product, it isn't ideal, but it's workable.
Thanks.
This is a way to do it in SQL Server:
SELECT t1.FoodType, t2.Food
FROM (
   SELECT *, ROW_NUMBER() OVER (PARTITION BY FoodType ORDER BY FoodType) AS rn
   FROM #tableA ) AS t1
LEFT JOIN (
   SELECT *, ROW_NUMBER() OVER (PARTITION BY FoodType ORDER BY FoodType) AS rn
   FROM #tableB ) AS t2 ON t1.FoodType = t2.FoodType AND t1.rn = t2.rn
Below are the table expressions computed by the two subqueries, t1 and t2:
Results for t1:
FoodType    rn
---------------
Dairy       1
Fruit       1
Fruit       2
Meat        1
Meat        2
Vegetable   1
Vegetable   2
Vegetable   3
Vegetable   4

Results for t2:
Food       FoodType    rn
--------------------------
Milk       Dairy       1
Cheese     Dairy       2
Yogurt     Dairy       3
Apple      Fruit       1
Orange     Fruit       2
Bacon      Fruit       3
Bacon      Meat        1
Beef       Meat        2
Tomatoes   Vegetable   1
Lettuce    Vegetable   2
Bacon      Vegetable   3
Doing a LEFT JOIN on FoodType and rn gets you what you want.
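As a quick way to try the row-number pairing on the question's data, here is a sketch using SQLite (window functions need SQLite 3.25 or later) via Python's sqlite3. The id columns and the ORDER BY id inside the window are assumptions added so that "first" has a defined meaning; the SQL Server query above orders by FoodType within each partition, which leaves the order among equal rows up to the engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableA (id INTEGER PRIMARY KEY, FoodType TEXT);
CREATE TABLE tableB (id INTEGER PRIMARY KEY, Food TEXT, FoodType TEXT);
INSERT INTO tableA (FoodType) VALUES
  ('Meat'), ('Meat'), ('Vegetable'), ('Vegetable'), ('Vegetable'),
  ('Vegetable'), ('Fruit'), ('Fruit'), ('Dairy');
INSERT INTO tableB (Food, FoodType) VALUES
  ('Tomatoes', 'Vegetable'), ('Lettuce', 'Vegetable'), ('Bacon', 'Vegetable'),
  ('Bacon', 'Meat'), ('Beef', 'Meat'),
  ('Apple', 'Fruit'), ('Orange', 'Fruit'), ('Bacon', 'Fruit'),
  ('Milk', 'Dairy'), ('Cheese', 'Dairy'), ('Yogurt', 'Dairy');
""")

# Number the rows of each table within its FoodType, then pair the n-th
# shopping-list slot with the n-th menu item of the same type.
query = """
SELECT a.FoodType, b.Food
FROM (SELECT id, FoodType,
             ROW_NUMBER() OVER (PARTITION BY FoodType ORDER BY id) AS rn
      FROM tableA) AS a
LEFT JOIN (SELECT Food, FoodType,
                  ROW_NUMBER() OVER (PARTITION BY FoodType ORDER BY id) AS rn
           FROM tableB) AS b
       ON a.FoodType = b.FoodType AND a.rn = b.rn
ORDER BY a.id
"""
matches = conn.execute(query).fetchall()
for food_type, food in matches:
    print(food_type, food)
```

The fourth Vegetable slot comes back with Food = None (NULL), because the menu only has three vegetables; that is the "no more available matches" case from the question.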
Access allows a NOT IN clause. Just write the query to get the TOP 1 match, then include that query as a subquery inside the NOT IN clause.
Select * From Table_A A, Table_B B Where A.Food_Type = B.Food_Type And B.Food Not In (Select Top 1 C.Food From Table_A D, Table_B C Where D.Food_Type = C.Food_Type And C.Food_Type = 'give your criterion value here')
Note that you might want to add a suitable Order By clause to describe how you determine what is a best match and what isn't.

Using a query to return the most frequent value and the count within a group using SQL in MS Access

Say I have a table showing the type of fruit consumed by an individual over a 24 hour period that looks like this:
Name Fruit
Tim Apple
Tim Orange
Tim Orange
Tim Orange
Lisa Peach
Lisa Apple
Lisa Peach
Eric Plum
Eric Orange
Eric Plum
How would I get a table that shows only the most consumed fruit for each person, as well as the number of times it was consumed? In other words, a table that looks like this:
Name Fruit Number
Tim Orange 3
Lisa Peach 2
Eric Plum 2
I tried
SELECT Name, Fruit, Count(Fruit)
FROM table
GROUP BY Name
But that returns an error because Fruit needs to be in the GROUP BY clause as well. Every other method I've tried returns the counts for ALL values rather than just the maximum values. MAX(COUNT()) doesn't appear to be a valid expression, so I'm not sure what else to do.
This is a pain, but you can do it. Start with your query (grouped by both Name and Fruit) and then join it to the maximum count per name:
SELECT t.Name, t.Fruit, t.cnt
FROM (SELECT Name, Fruit, Count(Fruit) as cnt
      FROM table
      GROUP BY Name, Fruit
     ) as t INNER JOIN
     (SELECT Name, max(cnt) as maxcnt
      FROM (SELECT Name, Fruit, Count(Fruit) as cnt
            FROM table
            GROUP BY Name, Fruit
           ) as t
      GROUP BY Name
     ) as n
     ON t.name = n.name and t.cnt = n.maxcnt;
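To see the count-then-join-on-max pattern produce the table from the question, here is a small sketch run in SQLite through Python's sqlite3 rather than in Access (an assumption, purely so it can be executed anywhere). The table name eaten and the aliases c, m and x are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eaten (Name TEXT, Fruit TEXT)")
conn.executemany(
    "INSERT INTO eaten VALUES (?, ?)",
    [
        ("Tim", "Apple"), ("Tim", "Orange"), ("Tim", "Orange"), ("Tim", "Orange"),
        ("Lisa", "Peach"), ("Lisa", "Apple"), ("Lisa", "Peach"),
        ("Eric", "Plum"), ("Eric", "Orange"), ("Eric", "Plum"),
    ],
)

# c: count per (Name, Fruit); m: highest of those counts per Name.
# Joining on Name and count keeps only each person's top fruit(s).
query = """
SELECT c.Name, c.Fruit, c.cnt
FROM (SELECT Name, Fruit, COUNT(*) AS cnt
      FROM eaten
      GROUP BY Name, Fruit) AS c
JOIN (SELECT Name, MAX(cnt) AS maxcnt
      FROM (SELECT Name, Fruit, COUNT(*) AS cnt
            FROM eaten
            GROUP BY Name, Fruit) AS x
      GROUP BY Name) AS m
  ON c.Name = m.Name AND c.cnt = m.maxcnt
ORDER BY c.Name
"""
top_fruit = conn.execute(query).fetchall()
for row in top_fruit:
    print(row)
```

Note that if a person has two fruits tied for the highest count, this pattern returns both rows for that person.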

In SQL, how find the total of a row over time?

I have a table and I want to know the total of my rows over time. For example, here's my table:
Date Fruit Sold
Mon apple 4
Mon pear 5
Mon orange 2
Tues apple 3
Tues pear 2
Tues orange 1
The table I want back is:
Fruit Sold
apple 7
pear 7
orange 3
What query can I use to do this? In my real situation I have hundreds of types of fruit, so how do I query without specifying each type of fruit each time?
That would be along the lines of:
select fruit, sum(sold) as sold
from fruitsales
group by fruit
-- adding something like <<where date = 'Mon'>> if you want to limit it.
This will aggregate the individual sold columns (by summing) for each fruit type.
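A minimal sketch of that GROUP BY, run against the question's six rows in SQLite via Python's sqlite3 (the database choice and the table name fruitsales are assumptions; the question names neither a product nor a table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruitsales (Date TEXT, Fruit TEXT, Sold INTEGER)")
conn.executemany(
    "INSERT INTO fruitsales VALUES (?, ?, ?)",
    [
        ("Mon", "apple", 4), ("Mon", "pear", 5), ("Mon", "orange", 2),
        ("Tues", "apple", 3), ("Tues", "pear", 2), ("Tues", "orange", 1),
    ],
)

# One output row per distinct fruit; no fruit names appear in the query,
# so it works unchanged with hundreds of fruit types.
totals = conn.execute(
    "SELECT Fruit, SUM(Sold) AS Sold FROM fruitsales GROUP BY Fruit ORDER BY Fruit"
).fetchall()
print(totals)
```

For the sample data this gives apple 7, orange 3 and pear 7.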
here is how to do it:
select fruit, sum(sold)
from table
group by fruit
cheers...
Group by time:
select fruit, sum(sold), substring(saletime,1,3)
from table
group by fruit, substring(saletime,1,3)

Group by with count

Say I have a table like this on my SQL Server 2005 server:
Apples
+ Id
+ Brand
+ HasWorms
Now I want an overview of the number of apples that have worms in them per brand.
Actually even better would be a list of all the apple brands with a flag if they are unspoiled or not.
So if I had the data
ID| Brand | HasWorms
---------------------------
1 | Granny Smith | 1
2 | Granny Smith | 0
3 | Granny Smith | 1
4 | Jonagold | 0
5 | Jonagold | 0
6 | Gala | 1
7 | Gala | 1
I want to end up with
Brand | IsUnspoiled
--------------------------
Granny Smith | 0
Jonagold | 1
Gala | 0
I figure I should first
select brand, numberOfSpoiles =
case
when count([someMagic]) > 0 then 1
else 0
end
from apples
group by brand
I can't use a HAVING clause, because then brands without valid entries would disappear from my list (I wouldn't see the entry Gala).
Then I thought a subquery of some kind should do it, but then I can't link the apple id of the outer (grouped) query to the inner (count) query...
Any ideas?
select brand, case when sum(hasworms)>0 then 0 else 1 end IsUnSpoiled
from apples
group by brand
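Here is a small sketch of that CASE-over-SUM query on the question's data, run in SQLite through Python's sqlite3. One assumption to flag: HasWorms is stored as an INTEGER here. If the real column is a SQL Server bit, it would need converting to int before summing, as other answers do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Apples (Id INTEGER, Brand TEXT, HasWorms INTEGER)")
conn.executemany(
    "INSERT INTO Apples VALUES (?, ?, ?)",
    [
        (1, "Granny Smith", 1), (2, "Granny Smith", 0), (3, "Granny Smith", 1),
        (4, "Jonagold", 0), (5, "Jonagold", 0),
        (6, "Gala", 1), (7, "Gala", 1),
    ],
)

# Every brand keeps its row; the flag is 1 only when no apple has worms.
query = """
SELECT Brand,
       CASE WHEN SUM(HasWorms) > 0 THEN 0 ELSE 1 END AS IsUnspoiled
FROM Apples
GROUP BY Brand
ORDER BY Brand
"""
flags = conn.execute(query).fetchall()
print(flags)
```

Because the condition sits in the SELECT list rather than in HAVING, Gala and Granny Smith still appear, just with a flag of 0.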
SQL Server version. I computed spoiled instead of unspoiled; this way I could use the SIGN function and make the code shorter.
Table + data (DDL + DML):
create table Apples(id int,brand varchar(20),HasWorms bit)
insert Apples values(1,'Granny Smith',1)
insert Apples values(2,'Granny Smith',0)
insert Apples values(3,'Granny Smith',1)
insert Apples values(4,'Jonagold',0)
insert Apples values(5,'Jonagold',0)
insert Apples values(6,'Gala',1)
insert Apples values(7,'Gala',1)
Query
select brand, IsSpoiled = sign(sum(convert(int,hasworms)))
from apples
group by brand
Output
brand IsSpoiled
----------------------
Gala 1
Granny Smith 1
Jonagold 0
SELECT Brand,
1-MAX(HasWorms) AS IsUnspoiled
FROM apples
GROUP BY Brand
SELECT  brand,
        COALESCE(
        (
        SELECT  TOP 1 0
        FROM    apples ai
        WHERE   ai.brand = ao.brand
                AND hasWorms = 1
        ), 1) AS isUnspoiled
FROM    (
        SELECT  DISTINCT brand
        FROM    apples
        ) ao
If you have an index on (brand, hasWorms), this query will be super fast, since it does not compute aggregates, but instead searches for the first spoiled apple within each brand and stops.
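For anyone who wants to try the "find the first spoiled apple and stop" idea without SQL Server, here is a sketch in SQLite via Python's sqlite3. SQLite has no TOP, so LIMIT 1 stands in for TOP 1; that swap, the index name idx_brand_worms, and the INTEGER type for HasWorms are assumptions. The sketch only checks the result and makes no claim about speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Apples (Id INTEGER, Brand TEXT, HasWorms INTEGER)")
conn.executemany(
    "INSERT INTO Apples VALUES (?, ?, ?)",
    [
        (1, "Granny Smith", 1), (2, "Granny Smith", 0), (3, "Granny Smith", 1),
        (4, "Jonagold", 0), (5, "Jonagold", 0),
        (6, "Gala", 1), (7, "Gala", 1),
    ],
)
# Composite index so the inner lookup can seek on (Brand, HasWorms).
conn.execute("CREATE INDEX idx_brand_worms ON Apples (Brand, HasWorms)")

# The scalar subquery yields 0 if any spoiled apple exists for the brand,
# otherwise NULL, which COALESCE turns into 1.
query = """
SELECT ao.Brand,
       COALESCE((SELECT 0
                 FROM Apples AS ai
                 WHERE ai.Brand = ao.Brand
                   AND ai.HasWorms = 1
                 LIMIT 1), 1) AS IsUnspoiled
FROM (SELECT DISTINCT Brand FROM Apples) AS ao
ORDER BY ao.Brand
"""
unspoiled = conn.execute(query).fetchall()
print(unspoiled)
```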
I haven't tested this, and maybe I'm missing something. But wouldn't this work?
SELECT Brand, SUM(CONVERT(int, HasWorms)) AS SpoiledCount
FROM Apples
GROUP BY Brand
ORDER BY SpoiledCount DESC
I assume HasWorms is a bit field, hence the CONVERT statement. This should return a list of brands with the count of spoiled apples per brand. You should see the worst (most spoiled) at the top and the best at the bottom.
There are many ways to skin this cat. Depending on your RDBMS, different queries will give you the best results. On our Oracle box, this query performs faster than all the others listed, assuming that you have an index on Brand in the Apples table (an index on Brand, HasWorms is even faster, but that may not be likely; depending on your data distribution, an index on just HasWorms may be the fastest of all). It also assumes you have a table "BrandTable", which just has the brands:
SELECT Brand
     , 1 IsSpoiled
  FROM BrandTable b
 WHERE EXISTS
       ( SELECT 1
           FROM Apples a
          WHERE a.brand = b.brand
            AND a.HasWorms = 1
       )
UNION
SELECT Brand
     , 0
  FROM BrandTable b
 WHERE NOT EXISTS
       ( SELECT 1
           FROM Apples a
          WHERE a.brand = b.brand
            AND a.HasWorms = 1
       )
ORDER BY 1;
SELECT CASE WHEN SUM(HasWorms) > 0 THEN 0 ELSE 1 END AS IsUnspoiled, Brand
FROM apples
GROUP BY Brand