Count Distinct Over Multiple Columns - sql

I have two CTEs. The following is the output of my first CTE.
| ORDER_NUMBER | ORDER_FLAG | EMPLOYEE | PRODUCT_CATEGORY | SALES |
|--------------|------------|----------|------------------|--------|
| 3158132 | 1 | Don | Newspaper Ad | 16.00 |
| 3158132 | 1 | Don | Magazine Ad | 15.00 |
| 3158132 | 0 | Don | TV Ad | 0.00 |
| 3158132 | 1 | Don | Billboard Ad | 56.00 |
| 3006152 | 1 | Roger | TV Ad | 20.00 |
| 3006152 | 0 | Roger | Magazine Ad | 0.00 |
| 3006152 | 1 | Roger | Newspaper Ad | 214.00 |
| 3012681 | 1 | Ken | TV Ad | 130.00 |
| 3012681 | 0 | Ken | Magazine Ad | 0.00 |
| 9818123 | 1 | Pete | Billboard Ad | 200.00 |
I'm attempting to count the distinct order numbers and the sales amount by employee. The order flag will be either 1 or 0. If sales are greater than 0.00, the order flag is set to 1.
My desired output.
| Employee | Sales | Orders |
|----------|--------|--------|
| Don | 87.00 | 1 |
| Ken | 130.00 | 1 |
| Pete | 200.00 | 1 |
| Roger | 234.00 | 1 |
I was attempting to do a combination of distinct, case, and concat statements without any luck. Any thoughts?

You can use this:
with cteTotalSales (...) as (...)
select employee,
case when sum(sales) > 0 then 1 else 0 end as Orders,
sum(sales) as Sales
from cteTotalSales
group by employee

This should be as simple as:
with cte as (...)
select
employee,
sum(sales) as Sales,
count(distinct order_number) as Orders
from cte
group by employee
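As a self-contained way to check this, the sketch below stands in for the (unshown) first CTE with a VALUES list copied from the sample rows; the column names are the ones from the question.
-- stand-in for the first CTE, built from the sample rows in the question
with cte as (
    select * from (values
        (3158132, 1, 'Don',   'Newspaper Ad',  16.00),
        (3158132, 1, 'Don',   'Magazine Ad',   15.00),
        (3158132, 0, 'Don',   'TV Ad',          0.00),
        (3158132, 1, 'Don',   'Billboard Ad',  56.00),
        (3006152, 1, 'Roger', 'TV Ad',         20.00),
        (3006152, 0, 'Roger', 'Magazine Ad',    0.00),
        (3006152, 1, 'Roger', 'Newspaper Ad', 214.00),
        (3012681, 1, 'Ken',   'TV Ad',        130.00),
        (3012681, 0, 'Ken',   'Magazine Ad',    0.00),
        (9818123, 1, 'Pete',  'Billboard Ad', 200.00)
    ) as v (order_number, order_flag, employee, product_category, sales)
)
select
    employee,
    sum(sales) as Sales,                       -- total sales per employee
    count(distinct order_number) as Orders     -- distinct orders per employee
from cte
group by employee
order by employee;
This produces the desired output: Don 87.00/1, Ken 130.00/1, Pete 200.00/1, Roger 234.00/1.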

This query should work for you:
SELECT
EMPLOYEE,
SUM(SALES) SALES,
1 AS ORDERS
FROM
YOUR_TABLE
GROUP BY
EMPLOYEE
You can put your subquery in place of YOUR_TABLE; note that the derived table needs an alias:
SELECT
EMPLOYEE,
SUM(SALES) SALES,
1 AS ORDERS
FROM
(
SELECT * FROM ...
) AS T
GROUP BY
EMPLOYEE
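Note that the hard-coded 1 AS ORDERS only happens to match the desired output because every employee in the sample has a single order number. If an employee can have several orders, a sketch like this (still using the YOUR_TABLE placeholder) counts them instead:
SELECT
EMPLOYEE,
SUM(SALES) AS SALES,
COUNT(DISTINCT ORDER_NUMBER) AS ORDERS
FROM
YOUR_TABLE
GROUP BY
EMPLOYEE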

Related

Duplicate value postgresql

I have an entry in the database
| group | account | description | balance | balance1 |
+----------+-------------+-----------------+-------------+--------------+
| 123123 | 0 | Name 1 | 1000.00 | 0 |
| 123123 | 777 | Name 2 | 250.00 | 0 |
| 123123 | 999 | Name 3 | 0 | 350.00 |
| 123000 | 0 | Name 4 | 500.00 | 0 |
| 123000 | 567 | Name 5 | 0 | 500.00 |
select * from table;
Gives exactly the same result as the example above.
I would like to get the result without duplicates in the "group" column, like this:
| group | account | description | balance | balance1 |
+----------+-------------+-----------------+-------------+--------------+
| 123123 | 0 | Name 1 | 1000.00 | 0 |
| | 777 | Name 2 | 250.00 | 0 |
| | 999 | Name 3 | 0 | 350.00 |
| 123000 | 0 | Name 4 | 500.00 | 0 |
| | 567 | Name 5 | 0 | 500.00 |
That is, as you can see from the example, I want to blank out only the duplicate values in the first column, without affecting the rest.
Also, I can't use "group by" or "order by", as that would break the order in which the information is output.
Something like this might work for you:
with cte as
(
SELECT "group", account, description, balance, balance1,
row_number() OVER (ORDER BY (SELECT NULL)) as rn
FROM yourtable
)
SELECT case when LAG("group") OVER (ORDER BY rn) = "group" THEN NULL ELSE "group" END AS "group",
account, description, balance, balance1
FROM cte;
ORDER BY (SELECT NULL) is a fairly horrible hack. It is there because row_number() requires an ORDER BY but you specifically stated that you can't use an order by. The row_number() is however needed in order to use LAG, which itself requires an OVER (ORDER BY..).
Very much a case of caveat emptor, but it might give you what you are looking for.
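To try that out without a real table, here is a sketch that feeds the sample rows from the question in through a VALUES CTE standing in for yourtable; everything else follows the answer above.
with yourtable ("group", account, description, balance, balance1) as (
    values
    (123123,   0, 'Name 1', 1000.00,   0.00),
    (123123, 777, 'Name 2',  250.00,   0.00),
    (123123, 999, 'Name 3',    0.00, 350.00),
    (123000,   0, 'Name 4',  500.00,   0.00),
    (123000, 567, 'Name 5',    0.00, 500.00)
),
cte as (
    SELECT "group", account, description, balance, balance1,
           row_number() OVER (ORDER BY (SELECT NULL)) as rn
    FROM yourtable
)
-- blank out "group" whenever it repeats the value from the previous row
SELECT case when LAG("group") OVER (ORDER BY rn) = "group" THEN NULL ELSE "group" END AS "group",
       account, description, balance, balance1
FROM cte;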

[SQL] How to sum the values in some parts of a column

I am practicing SQL and here is my exercise (Table 1).
Table 1: The original table
My goal is to get the sum of income when Month is between 1 and 3. If Month = 4, the income should not be added in.
|M_Id | Year | Month | CompanyID | CustomerID | MonthIncome |
|1 | 110 | 1 | T012 | C001 | 30000 |
|2 | 110 | 2 | T012 | C001 | 60000 |
|3 | 110 | 3 | T012 | C001 | 60000 |
|4 | 110 | 4 | T012 | C001 | 100000 |
|5 | 110 | 1 | A012 | A001 | 10000 |
|6 | 110 | 1 | A012 | A001 | 50000 |
I tried some SQL:
select companyID, customerID, Year, Sum(MonthIncome) as Total
from [dbo].[Money]
group by year, companyID, customerID
and the result looks like this:
Table 2: after using sum and group by, the table becomes
| Year | CompanyID | CustomerID | MonthIncome|
| 110 | A012 | A001 | 60000|
| 110 | T012 | C001 | 250000|
The table layout is what I want, but the Sum(MonthIncome) is not right because it includes month = 4.
I tried to change my sql to
select companyID, customerID, Year, Sum(MonthIncome) as Total
from [dbo].[Money]
group by year, companyID, customerID
having month between 1 and 3
but the system asks me to put month into the group by, and then the table layout is not what I want.
Could anybody help me?
You can exclude the months that are not in the range 1 to 3 using WHERE, then do the grouping:
SELECT year, companyID, customerID, Sum(MonthIncome) AS Total
FROM [dbo].[Money]
WHERE month BETWEEN 1 AND 3
GROUP BY year, companyID, customerID;
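If you ever need the month = 4 rows to stay in the result for other aggregates, a conditional sum is a common alternative; this is just a sketch against the same [dbo].[Money] table:
SELECT year, companyID, customerID,
       -- only months 1 to 3 contribute to the total; other months add 0
       SUM(CASE WHEN month BETWEEN 1 AND 3 THEN MonthIncome ELSE 0 END) AS Total
FROM [dbo].[Money]
GROUP BY year, companyID, customerID;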

Calculating the cumulative sum with a specific 'date' merged to single column in PostgreSQL

I have a database which contains the amounts and dates of payments per user. Some users make payments on the same day, and I want to show the cumulative sum of these payments only once per day in a pivot table, which I am creating using Amazon QuickSight.
I have gone through the following, but they show the cumulative value on every row, and I don't have a way to partition on just the date and nothing else, with the sum taken over the payment amounts.
Calculating Cumulative Sum in PostgreSQL
Calculating cumulative sum with date filtering in PostgreSQL
Calculating Cumulative daily sum in PostgreSQL
PostgreSQL, renumber and cumulative sum at once
How to conditional sum two columns in PostgreSQL 9.3
My query looks like this:
SELECT
s.id,
s.first_name,
s.last_name,
s.birth_date,
s.card,
p.datetime,
p.amount,
Sum(p.amount) OVER (PARTITION BY p.datetime ORDER BY p.datetime) AS "Daily Amount"
FROM payments AS p
LEFT JOIN users AS s
ON p.s_h_uuid = s.h_uuid
ORDER BY p.datetime DESC
Where I am doing a Sum() Over() at this row:
Sum(p.amount) OVER (PARTITION BY p.datetime ORDER BY p.datetime) AS "Daily Amount"
My Table has data as:
Users:
| id | first_name | last_name | birth_date | card |
| 2 | first_nam2 | last_nam2 | 1990-02-01 | M |
| 3 | first_nam3 | last_nam3 | 1987-07-23 | M |
| 1 | first_nam1 | last_nam1 | 1954-11-15 | A |
| 4 | first_nam4 | last_nam4 | 1968-05-07 | V |
Payments:
| p_uuid | datetime | amount |
| 2 | 2021-05-01 | 100.00 |
| 3 | 2021-05-01 | 100.00 |
| 2 | 2021-05-02 | 100.00 |
| 1 | 2021-05-03 | 100.00 |
| 3 | 2021-05-03 | 100.00 |
| 4 | 2021-05-03 | 100.00 |
| 2 | 2021-05-05 | 100.00 |
| 1 | 2021-05-05 | 100.00 |
| 4 | 2021-05-06 | 100.00 |
The output I want is that the "Daily Amount" is shown only once for a specific date, if there are multiple rows with the same date, then for the other rows, it should be blank or display something like "NA":
| p.datetime | id | first_name | last_name | birth_date | card | pa.amount | "Daily Amount" |
| 2021-05-01 | 2 | first_nam2 | last_nam2 | 1990-02-01 | M | 100.00 | 200.00 |
| 2021-05-01 | 3 | first_nam3 | last_nam3 | 1987-07-23 | M | 100.00 | |
| 2021-05-02 | 2 | first_nam2 | last_nam2 | 1990-02-01 | M | 100.00 | 100.00 |
| 2021-05-03 | 1 | first_nam1 | last_nam1 | 1954-11-15 | A | 100.00 | 300.00 |
| 2021-05-03 | 3 | first_nam3 | last_nam3 | 1987-07-23 | M | 100.00 | |
| 2021-05-03 | 4 | first_nam4 | last_nam4 | 1968-05-07 | V | 100.00 | |
| 2021-05-05 | 2 | first_nam2 | last_nam2 | 1990-02-01 | M | 100.00 | 200.00 |
| 2021-05-05 | 1 | first_nam1 | last_nam1 | 1954-11-15 | A | 100.00 | |
| 2021-05-06 | 4 | first_nam4 | last_nam4 | 1968-05-07 | V | 100.00 | 100.00 |
Is there some way that it is possible to get this output from SQL (PostgreSQL specific query)?
Looks like your sum() over() computes the wrong amount; try:
Sum(p.amount) OVER(partition BY s.id, p.datetime) AS "Daily Amount",
EDIT
If you want to format the output (cumulative amount only once per date), use row_number() to detect the first row in a group. Make sure the over() clause is in sync with the ORDER BY of the query.
SELECT
id,
first_name,
last_name,
birth_date,
card,
datetime,
amount,
case when rn=1 then "Daily Amount" end "Daily Amount"
FROM (
SELECT
s.id,
s.first_name,
s.last_name,
s.birth_date,
s.card,
p.datetime,
p.amount,
Sum(p.amount) OVER(partition BY s.id, p.datetime) AS "Daily Amount",
row_number() OVER(partition BY s.id, p.datetime ORDER BY p.amount) AS rn
FROM payments AS p
LEFT JOIN users AS s ON p.s_h_uuid = s.h_uuid
) t
ORDER BY datetime DESC, id, amount
If you want the value only once per date, then use row_number():
select (case when 1 = row_number() over (partition by p.datetime order by p.p_uuid)
then sum(p.amount) over (partition by p.datetime)
end) as day_payments
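For completeness, here is a sketch of how that expression could slot into the original query; table and column names are taken from the question, and it assumes datetime holds only the date (no time part), otherwise each timestamp would form its own partition:
SELECT
    s.id,
    s.first_name,
    s.last_name,
    s.birth_date,
    s.card,
    p.datetime,
    p.amount,
    -- show the per-date total only on the first row of each date
    (CASE WHEN 1 = ROW_NUMBER() OVER (PARTITION BY p.datetime ORDER BY p.p_uuid)
          THEN SUM(p.amount) OVER (PARTITION BY p.datetime)
     END) AS "Daily Amount"
FROM payments AS p
LEFT JOIN users AS s ON p.s_h_uuid = s.h_uuid
ORDER BY p.datetime DESC, p.p_uuid;   -- keep the query ORDER BY in sync with the window ORDER BY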

SQL calculating sum and number of distinct values within group

I want to calculate
(1) total sales amount
(2) number of distinct stores per product
in one query, if possible. Suppose we have data:
+-----------+---------+-------+--------+
| store | product | month | amount |
+-----------+---------+-------+--------+
| Anthill | A | 1 | 1 |
| Anthill | A | 2 | 1 |
| Anthill | A | 3 | 1 |
| Beetle | A | 1 | 1 |
| Beetle | A | 3 | 1 |
| Cockroach | A | 1 | 1 |
| Cockroach | A | 2 | 1 |
| Cockroach | A | 3 | 1 |
| Anthill | B | 1 | 1 |
| Beetle | B | 2 | 1 |
| Cockroach | B | 3 | 1 |
+-----------+---------+-------+--------+
I have tried this with no luck:
select
[product]
,[month]
,[amount]
,cnt_distinct_stores = count(distinct(stores))
from dbo.temp
group by
[product]
,[month]
order by 1,2
Would it be possible to use some combination of a GROUP BY clause with window functions like SUM(amount) OVER (PARTITION BY [product], [month] ORDER BY [month] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)?
Try
SELECT product,
SUM(amount),
COUNT(DISTINCT store)
FROM dbo.temp
GROUP BY product
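If you also want to keep the per-month totals while showing the distinct-store count per product, one way to combine GROUP BY with window functions is the DENSE_RANK() trick below, since COUNT(DISTINCT ...) is not available as a window function in SQL Server; this is only a sketch against the dbo.temp table from the question:
WITH ranked AS (
    SELECT product, [month], amount,
           DENSE_RANK() OVER (PARTITION BY product ORDER BY store) AS store_rank
    FROM dbo.temp
),
counted AS (
    SELECT product, [month], amount,
           -- the max dense rank per product equals the number of distinct stores
           MAX(store_rank) OVER (PARTITION BY product) AS stores_per_product
    FROM ranked
)
SELECT product, [month],
       SUM(amount) AS amount,
       MAX(stores_per_product) AS cnt_distinct_stores   -- constant per product
FROM counted
GROUP BY product, [month]
ORDER BY product, [month];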

Having Groups based on distinct count of another column

I have a table as follows:
+-------------+-----------+------+
| GroupNumber | TeamName | Goal |
+-------------+-----------+------+
| 1 | Sales | ABC |
| 1 | Sales | ABC |
| 1 | Sales | ABC |
| 1 | Design | XYZ |
| 2 | Design | XYZ |
| 2 | Sales | XYZ |
| 2 | technical | XYZ |
| 2 | Support | XYZ |
| 3 | Sales | XYZ |
| 3 | Sales | XYZ |
| 3 | Sales | XYZ |
+-------------+-----------+------+
I want to output only the groups that have more than 3 unique teams.
Only group 2 meets this condition, so the output is:
Expected Output:
+-------------+-----------+------+
| GroupNumber | TeamName | Goal |
+-------------+-----------+------+
| 2 | Design | XYZ |
| 2 | Sales | XYZ |
| 2 | technical | XYZ |
| 2 | Support | XYZ |
+-------------+-----------+------+
I'm not sure how to use this in a subquery:
SELECT count(Distinct(TeamName))
FROM mytable
group by [GroupNumber]
HAVING COUNT(Distinct[TeamName])>3
Simply put it in a subquery:
select *
from mytable
where [GroupNumber] in
(
SELECT [GroupNumber]
FROM mytable
group by [GroupNumber]
HAVING COUNT(Distinct[TeamName])>3
)
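The same filter can also be written as a join instead of IN, which some people find easier to extend; just a sketch using the names from the question:
SELECT t.GroupNumber, t.TeamName, t.Goal
FROM mytable AS t
INNER JOIN (
    SELECT GroupNumber
    FROM mytable
    GROUP BY GroupNumber
    HAVING COUNT(DISTINCT TeamName) > 3
) AS g ON g.GroupNumber = t.GroupNumber;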
Please try
SELECT *
FROM mytable where GroupNumber in (select GroupNumber
FROM mytable group by GroupNumber
HAVING COUNT(Distinct TeamName)>3)