How to filter on a column in SQL? - sql

The table has 3 fields. Sample data:
customerid  ordertype  countoforders
1           APP        10
1           WEB        20
2           APP        10
3           WEB        10
4           APP        30
5           APP        40
5           WEB        10
I want to retrieve only the customers who ordered exclusively through APP, along with their counts. How can I write a query for this?
For example, in the table above the APP-only customers are 2 and 4.

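Try this. The EXCEPT compares customerid only, so a customer with any non-APP row is removed whatever its counts are:
select customerid, countoforders from MY_TABLE
where ordertype = 'APP'
and customerid in (
    select customerid from MY_TABLE where ordertype = 'APP'
    except
    select customerid from MY_TABLE where ordertype <> 'APP'
)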

You haven't specified which DBMS you're using, so I'll avoid product-specific recommendations.
If this is really urgent, the ugliest way to get what you need is to create a lookup table holding every 'WEB' order in your table (customerid, ordertype) and run a NOT EXISTS or NOT IN against it, for example:
SELECT customerid, ordertype, countoforders FROM TABLE_1 WHERE TABLE_1.customerid NOT IN (SELECT customerid FROM #lookup)
It may not be optimal performance-wise, but it gets the job done.
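The same idea also works without a separate lookup table. A minimal sketch, assuming the table is called MY_TABLE as in the first answer and loading the sample rows from the question; the column types are guesses:

```sql
-- Sample data from the question; column types are assumed.
CREATE TABLE MY_TABLE (
    customerid    INT,
    ordertype     VARCHAR(10),
    countoforders INT
);

INSERT INTO MY_TABLE (customerid, ordertype, countoforders) VALUES
    (1, 'APP', 10),
    (1, 'WEB', 20),
    (2, 'APP', 10),
    (3, 'WEB', 10),
    (4, 'APP', 30),
    (5, 'APP', 40),
    (5, 'WEB', 10);

-- Keep APP rows whose customer has no row of any other order type.
SELECT t.customerid, t.countoforders
FROM MY_TABLE t
WHERE t.ordertype = 'APP'
  AND NOT EXISTS (
      SELECT 1
      FROM MY_TABLE o
      WHERE o.customerid = t.customerid
        AND o.ordertype <> 'APP'
  );
-- Returns (2, 10) and (4, 30).
```

NOT EXISTS is used here rather than NOT IN because NOT IN returns no rows at all if the subquery produces a NULL customerid.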

Related

SQL interview question: select, join, grouping

I got this question at an interview:
Which products did clients buy before the first order of brand "Brand 1"? Select the top 5 by number of orders.
Tables:
Items:
RezonItemID;BrandName
5555613;Brand 1
2315946;Brand 2
9132648;Brand 3
3125847;Brand 1
3126548;Brand 5
Orders:
ClientID;ClientOrderID;RezonItemID;FactMoment
00611847;4562145;5555613;2021-01-09
00798451;7987465;1321321;2021-08-10
00914751;3154844;9132648;2021-07-01
00975418;9797451;1312125;2021-09-09
00978461;9413235;9754512;2021-10-29
My solution:
WITH first_order AS (
    SELECT ClientID, MIN(FactMoment) AS o_date
    FROM orders
    JOIN items
    USING(RezonItemID)
    WHERE BrandName = 'Brand 1'
    GROUP BY ClientID
)
SELECT RezonItemID, COUNT(*) AS n_orders
FROM orders
JOIN items
USING(RezonItemID)
JOIN first_order
USING(ClientID)
WHERE FactMoment < o_date
GROUP BY RezonItemID
ORDER BY n_orders DESC
LIMIT 5
Is it possible to solve this with window functions? Is there a better solution?
Given the following tables
Items:
RezonItemID;BrandName
5555613;Brand 1
2315946;Brand 2
9132648;Brand 3
3125847;Brand 1
3126548;Brand 5
Orders:
ClientID;ClientOrderID;RezonItemID;FactMoment
00611847;4562145;5555613;2021-01-09
00798451;7987465;1321321;2021-08-10
00914751;3154844;9132648;2021-07-01
00975418;9797451;1312125;2021-09-09
00978461;9413235;9754512;2021-10-29
It seems that if the question is "Which products did the clients buy before the first order for an item of Brand 1," a SQL query may not be necessary. Assuming that FactMoment is the timestamp of the order, the earliest order (2021-01-09) is for RezonItemID 5555613, and that item has the brand "Brand 1".
So the answer would be that no items were purchased before the first purchase of an item with BrandName = 'Brand 1'.
This is a puzzling question; the sample data yields no testable results, so it is not much use for checking a query.
If you want to use window functions then that's certainly possible.
The following yields no rows on the sample data and should work, but without proper test data it's hard to be sure.
Note that using() is not supported by some databases (SQL Server, for example), so a join with an explicit ON clause is more portable.
select top(5) RezonItemID, count(*) as n_orders
from (
    select o.ClientID, o.RezonItemID, o.FactMoment,
        -- each client's first order of a 'Brand 1' item (NULL if the client never ordered one)
        min(case when i.BrandName = 'Brand 1' then o.FactMoment end) over (partition by o.ClientID) as earliest
    from Orders o left join Items i on i.RezonItemID = o.RezonItemID
) o
where FactMoment < earliest
group by RezonItemID
order by n_orders desc
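Since the sample rows in the question produce no output, here is a sketch with invented rows that do. The item IDs, client IDs, dates and column types below are all hypothetical, and it reads the question as "before each client's own first Brand 1 order", the way the asker's query does. It is meant to be run in an empty scratch database:

```sql
-- Hypothetical test data (not from the question); column types are assumed.
CREATE TABLE Items (RezonItemID INT, BrandName VARCHAR(20));
CREATE TABLE Orders (
    ClientID      VARCHAR(10),
    ClientOrderID INT,
    RezonItemID   INT,
    FactMoment    DATE
);

INSERT INTO Items (RezonItemID, BrandName) VALUES
    (1, 'Brand 1'), (2, 'Brand 2'), (3, 'Brand 3');

INSERT INTO Orders (ClientID, ClientOrderID, RezonItemID, FactMoment) VALUES
    ('A', 101, 2, '2021-01-01'),
    ('A', 102, 3, '2021-01-05'),
    ('A', 103, 1, '2021-02-01'),  -- client A's first Brand 1 order
    ('A', 104, 2, '2021-03-01'),  -- after it, so not counted
    ('B', 105, 2, '2021-01-10'),
    ('B', 106, 1, '2021-01-20'),  -- client B's first Brand 1 order
    ('C', 107, 3, '2021-01-15');  -- client C never ordered Brand 1

-- Count, per item, the orders placed before that client's first Brand 1 order.
SELECT RezonItemID, COUNT(*) AS n_orders
FROM (
    SELECT o.ClientID, o.RezonItemID, o.FactMoment,
           MIN(CASE WHEN i.BrandName = 'Brand 1' THEN o.FactMoment END)
               OVER (PARTITION BY o.ClientID) AS first_brand1
    FROM Orders o
    LEFT JOIN Items i ON i.RezonItemID = o.RezonItemID
) t
WHERE FactMoment < first_brand1
GROUP BY RezonItemID
ORDER BY n_orders DESC;
-- Item 2 has 2 earlier orders (clients A and B); item 3 has 1 (client A).
```

Client C drops out because its first_brand1 is NULL, and a comparison with NULL is never true. Add TOP (5) or LIMIT 5, whichever your database uses, to keep only the five most ordered items.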

How do I count how many emails each customer has received when there are multiple emails to count?

I want to count the number of emails each customer has received. The table contains more than one customer, so a simple WHERE clause isn't enough.
Here is an example of the data:
CustomerID  EmailName
1           EmailA
1           EmailB
2           EmailA
2           EmailB
2           EmailC
3           EmailA
3           EmailB
I am able to count for a specific customer by using a WHERE clause:
WHERE CustomerID = "1"
Which will return:
CustomerID  NumberOfEmailsSent
1           2
The issue is that I would like to get the following result:
CustomerID  NumberOfEmailsSent
1           2
2           3
3           2
The data set I am working with has thousands of email addresses, so querying each one separately is not a realistic solution.
That is what GROUP BY is for.
SELECT CustomerID, COUNT(EmailName)
FROM YourTable
GROUP BY CustomerID
I think that you just need a GROUP BY clause:
SELECT CustomerID, COUNT(EmailName) AS NumberOfEmailsSent
FROM tbl
GROUP BY CustomerID
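To try the GROUP BY query against the rows from the question, a small setup script along these lines should do. The table name YourTable is taken from the first answer and the column types are assumptions:

```sql
-- Sample data from the question; column types are assumed.
CREATE TABLE YourTable (CustomerID INT, EmailName VARCHAR(20));

INSERT INTO YourTable (CustomerID, EmailName) VALUES
    (1, 'EmailA'), (1, 'EmailB'),
    (2, 'EmailA'), (2, 'EmailB'), (2, 'EmailC'),
    (3, 'EmailA'), (3, 'EmailB');

-- One output row per customer, with the number of emails for that customer.
SELECT CustomerID, COUNT(EmailName) AS NumberOfEmailsSent
FROM YourTable
GROUP BY CustomerID
ORDER BY CustomerID;
-- Returns (1, 2), (2, 3), (3, 2).
```

COUNT(EmailName) skips rows where EmailName is NULL; use COUNT(*) if every row should be counted.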

Case Statement for multiple criteria

I would like to ignore some of the results of my query because, for all intents and purposes, they are duplicates. Because of the way the request was made we need to use this hierarchy, and although the rows show different Company_Name values, we need to ignore one of them.
Query:
SELECT
    COUNT(DISTINCT A12.Company_name) AS Customer_Name_Count,
    Company_Name,
    SUM(Total_Sales) AS Total_Sales
FROM
    some_table AS A12
GROUP BY
    2
ORDER BY
    3 ASC, 2 ASC
This code omits half a dozen joins and WHERE clauses that are not germane to this question.
Results:
   Customer_Name_Count  Company_Name        Total_Sales
   ----------------------------------------------------
1  3                    Blockbuster         1,000
2  6                    Jimmy's Bar         1,500
3  6                    Jimmy's Restaurant  1,500
4  9                    Impala Hotel        2,000
5  12                   Sports Drink        2,500
In the above set, rows 2 and 3 have the same count, the same Total_Sales and similar company names. Is there a way to write a CASE expression that takes these three factors into account and drops one of the two Jimmy's rows? It also has to be general, because there are other instances where this happens, and it should only apply when the count and the sales figure match and the company names are similar.
Desired result:
   Customer_Name_Count  Company_Name        Total_Sales
   ----------------------------------------------------
1  3                    Blockbuster         1,000
2  6                    Jimmy's Bar         1,500
3  9                    Impala Hotel        2,000
4  12                   Sports Drink        2,500
It looks like the other answers are accurate, assuming the Company_IDs are the same for both.
If the Company_IDs differ between Jimmy's Bar and Jimmy's Restaurant, you can use something like this. I suggest getting the functional users involved and doing some data clean-up, or else you'll be maintaining this every time the issue arises:
SELECT
    COUNT(DISTINCT CASE
        WHEN A12.Company_Name = 'Name2' THEN 'Name1'
        ELSE A12.Company_Name
    END) AS Customer_Name_Count
    ,CASE
        WHEN A12.Company_Name = 'Name2' THEN 'Name1'
        ELSE A12.Company_Name
    END AS Company_Name
    ,SUM(A12.Total_Sales) AS Total_Sales
FROM some_table AS A12
GROUP BY CASE
        WHEN A12.Company_Name = 'Name2' THEN 'Name1'
        ELSE A12.Company_Name
    END
Your problem is that the joins you are using are multiplying the number of rows. Somewhere along the way, multiple names are associated with exactly the same entity (which is why the numbers are the same). You can fix this by aggregating by the right id:
SELECT COUNT(DISTINCT A12.Company_name) AS Customer_Name_Count,
MAX(Company_Name) as Company_Name,
SUM(Total_Sales) AS Total_Sales
FROM some_table AS A12
GROUP BY Company_id -- I'm guessing the column is something like this
ORDER BY 3 ASC, 2 ASC;
This might actually overstate the sales (I don't know). It would be better to fix the join so that it returns only one name. One possibility is that it is a type-2 dimension, meaning that there is a time component for values that change over time. You may need to restrict the join to a single time period.
You need a function that returns a common name for the companies (dbo.GetCommonName below is one you would have to write yourself), and then use DISTINCT:
SELECT DISTINCT
Customer_Name_Count,
dbo.GetCommonName(Company_Name) as Company_Name,
Total_Sales
FROM dbo.theTable
You can use ROW_NUMBER as a window function to number the rows within each (Customer_Name_Count, Total_Sales) pair, then keep only rn = 1:
SELECT * FROM (
    SELECT *, ROW_NUMBER() OVER(PARTITION BY Customer_Name_Count, Total_Sales ORDER BY Company_Name) rn
    FROM (
        SELECT
            COUNT(DISTINCT A12.Company_name) AS Customer_Name_Count,
            Company_Name,
            SUM(Total_Sales) AS Total_Sales
        FROM
            some_table AS A12
        GROUP BY
            Company_Name
    ) t1
) t2
WHERE rn = 1
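To see what the ROW_NUMBER approach does to the rows in the question, here is a sketch that loads the already aggregated results into a hypothetical table named aggregated_results (the name and the column types are assumptions) and applies the same filter:

```sql
-- Hypothetical table holding the aggregated rows shown in the question.
CREATE TABLE aggregated_results (
    Customer_Name_Count INT,
    Company_Name        VARCHAR(50),
    Total_Sales         INT
);

INSERT INTO aggregated_results (Customer_Name_Count, Company_Name, Total_Sales) VALUES
    (3,  'Blockbuster',        1000),
    (6,  'Jimmy''s Bar',        1500),
    (6,  'Jimmy''s Restaurant', 1500),
    (9,  'Impala Hotel',       2000),
    (12, 'Sports Drink',       2500);

-- Number the rows inside each (count, sales) pair and keep the first name alphabetically.
SELECT Customer_Name_Count, Company_Name, Total_Sales
FROM (
    SELECT r.*,
           ROW_NUMBER() OVER (PARTITION BY Customer_Name_Count, Total_Sales
                              ORDER BY Company_Name) AS rn
    FROM aggregated_results r
) ranked
WHERE rn = 1
ORDER BY Total_Sales;
-- Keeps Blockbuster, Jimmy's Bar, Impala Hotel and Sports Drink.
```

Note that this never looks at the names themselves: two unrelated companies that happen to share the same count and the same sales total would also be collapsed into one row.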

SQL: Is it possible to merge these two SQL statements into a single query?

I am using MS SQL Server 2014 on Windows 7.
In the database I have a table named Orders, which looks like this:
OrderID | CustomerID | OrderDate | ...
----------------------------------------
1028 90 2015-10-10
...
2416 68 2016-02-12
I needed two things:
the total number of customers
the number of customers this year.
I am a beginner in SQL, but I managed to write 2 SQL statements in my app that seem to do the job:
For requirement #1:
SELECT COUNT(DISTINCT CustomerID) FROM Orders; -- result = 74
For requirement #2:
SELECT COUNT(DISTINCT CustomerID) FROM Orders WHERE OrderDate >= '2016-01-01'; -- result = 34
I would like to know if it's possible to combine the two SQL statements above into a single query. Of course, I need both results: the total customers (74 in the above case) and also the customers of this year (i.e. 34).
The database is a remote database, so any idea to speed-up the query performance is highly welcome :)
Use conditional aggregation:
SELECT COUNT(DISTINCT CustomerID) as Total,
COUNT(DISTINCT CASE WHEN OrderDate >= '2016-01-01' THEN CustomerID END) as Total_2016
FROM Orders;
You can combine the data horizontally as in the previous answer, or you can combine it vertically using UNION ALL, like this, if you need a tabular form:
SELECT 'Total Customers' AS Description, COUNT(DISTINCT CustomerID) AS Number FROM Orders -- total = 74
UNION ALL
SELECT '2016 Customers' AS Description, COUNT(DISTINCT CustomerID) AS Number FROM Orders WHERE OrderDate >= '2016-01-01'; -- this_year = 34
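The conditional aggregation works because a CASE with no ELSE yields NULL for rows that fail the condition, and COUNT(DISTINCT ...) ignores NULLs. A small sketch for an empty scratch database; only the first and last rows come from the question, the others are invented and the column types are assumed:

```sql
-- Partly hypothetical data; column types are assumed.
CREATE TABLE Orders (OrderID INT, CustomerID INT, OrderDate DATE);

INSERT INTO Orders (OrderID, CustomerID, OrderDate) VALUES
    (1028, 90, '2015-10-10'),
    (1100, 68, '2015-11-02'),  -- invented row
    (2300, 12, '2016-01-15'),  -- invented row
    (2301, 12, '2016-01-20'),  -- invented row
    (2416, 68, '2016-02-12');

-- One pass over the table produces both counts.
SELECT COUNT(DISTINCT CustomerID) AS Total,
       COUNT(DISTINCT CASE WHEN OrderDate >= '2016-01-01' THEN CustomerID END) AS Total_2016
FROM Orders;
-- Returns Total = 3 (customers 90, 68, 12) and Total_2016 = 2 (customers 12, 68).
```

Because the table is read only once, this also means a single round trip to the remote database instead of two.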

Using SQL to find the total number of customers with over X orders

I've been roasting my brain with my limited SQL knowledge while attempting to come up with a query to run a statistic on my orders database.
Table ORDERS is laid out like this:
CustomerID ProductID (etc)
1 10
1 10
1 11
2 10
4 9
Each purchase is recorded with the customer id and the product ID - there CAN be multiple records for the same customer, and even multiple records with the same customer and product.
I need to come up with a query that returns the number of customers who bought between X and Y distinct products. For example: 3 customers bought fewer than 5 different products, 10 bought 5-10 different products, 1 bought over 10 different products.
I'm pretty sure this has something to do with derived tables, but advanced SQL is a fairly new craft to me. Any help would be appreciated!
Try this:
SELECT T1.products_bought, COUNT(T2.cnt) AS total
FROM (
    -- one row per bucket: label, lower bound, upper bound
    SELECT '<5' AS products_bought, 0 AS a, 4 AS b
    UNION ALL
    SELECT '5-10', 5, 10
    UNION ALL
    SELECT '>10', 11, 999999
) T1
LEFT JOIN
(
    -- number of distinct products per customer
    SELECT COUNT(DISTINCT ProductID) AS cnt
    FROM ORDERS
    GROUP BY CustomerID
) T2
ON T2.cnt BETWEEN T1.a AND T1.b
GROUP BY T1.products_bought, T1.a, T1.b
ORDER BY T1.a
Result:
products_bought total
<5 3
5-10 0
>10 0
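A different way to write this, not taken from the answer above, is to label each customer with a CASE expression and then count the labels. A sketch for an empty scratch database, using the sample rows from the question with assumed column types:

```sql
-- Sample data from the question; column types are assumed.
CREATE TABLE ORDERS (CustomerID INT, ProductID INT);

INSERT INTO ORDERS (CustomerID, ProductID) VALUES
    (1, 10), (1, 10), (1, 11), (2, 10), (4, 9);

-- Inner query: one row per customer with its bucket label.
-- Outer query: number of customers per bucket.
SELECT bucket AS products_bought, COUNT(*) AS total
FROM (
    SELECT CustomerID,
           CASE
               WHEN COUNT(DISTINCT ProductID) < 5   THEN '<5'
               WHEN COUNT(DISTINCT ProductID) <= 10 THEN '5-10'
               ELSE '>10'
           END AS bucket
    FROM ORDERS
    GROUP BY CustomerID
) per_customer
GROUP BY bucket;
-- Returns a single row: ('<5', 3).
```

The trade-off is that buckets with no customers do not appear at all, whereas the LEFT JOIN version reports them with a total of 0.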