SQL Multiple count columns with multiple conditions

I am trying to gather some basic statistics from a table "Data_Table" that gets updated on a daily basis. Each row represents a case, which can be opened/closed/cancelled by an operator with a unique ID. I want to show the count of the actions each operator performed the previous day - that is, getting from Data_Table to the Ideal Table below.
Data_Table
| LOCATION | DATE | REFERENCE | OPENED_ID | CLOSED_ID | CANCELLED_ID |
| -------- | -------- | --------- | --------- | --------- | ------------ |
| NYC | 20180102 | 123451 | 123 | 234 | 0 |
| TEX | 20180102 | 123452 | 345 | 123 | 0 |
| NYC | 20180102 | 123453 | 345 | 0 | 123 |
| TEX | 20180102 | 123453 | 234 | 0 | 123 |
Ideal Table
| LOCATION | DATE | USER_ID | OPEN | CLOSED | CANCELLED |
| -------- | -------- | ------- | ---- | ------ | --------- |
| NYC | 20180102 | 123 | 1 | 0 | 1 |
| NYC | 20180102 | 234 | 0 | 1 | 0 |
| NYC | 20180102 | 345 | 1 | 0 | 0 |
| TEX | 20180102 | 123 | 0 | 1 | 1 |
| TEX | 20180102 | 234 | 1 | 0 | 0 |
| TEX | 20180102 | 345 | 1 | 0 | 0 |
User 123 opened 1 case and cancelled 1 case in location NYC on date 20180102...etc.
I have made a few small queries, one for each action in each site, that look like this:
SELECT LOCATION, DATE, OPENED_ID, COUNT(DISTINCT [DATA_TABLE].REFERENCE)
FROM [DATA_TABLE]
WHERE DATE = CONVERT(DATE, GETDATE() - 1)
  AND LOCATION = 'NYC'
  AND OPENED_ID IN (SELECT NYC FROM [OP_ID_TABLE] WHERE [DATE FINISH] > GETDATE())
GROUP BY OPENED_ID, LOCATION, DATE
ORDER BY LOCATION
And then I repeat this query for each location and each operator action. After that I do some messy VLOOKUPs in Excel to organise it into the Ideal Table format, which on a daily basis is... not ideal.
I've tried to make some sum functions but haven't had any luck.
Any help would be much appreciated.

You need to unpivot and re-aggregate. One method uses union all and group by:
select location, date, user_id,
       sum(opened) as opens, sum(closed) as closes, sum(cancelled) as cancels
from ((select location, date, opened_id as user_id, 1 as opened, 0 as closed, 0 as cancelled
       from t
      ) union all
      (select location, date, closed_id as user_id, 0 as opened, 1 as closed, 0 as cancelled
       from t
      ) union all
      (select location, date, cancelled_id as user_id, 0 as opened, 0 as closed, 1 as cancelled
       from t
      )
     ) t
group by location, date, user_id;
There are other methods for doing these operations, depending on the database. However, this is ANSI-standard syntax.
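Since the question's own snippets use SQL Server syntax (CONVERT, GETDATE), a SQL Server-specific sketch of the same unpivot is worth showing: CROSS APPLY (VALUES ...) scans the table once instead of three times. It assumes, as the sample data suggests, that an ID of 0 means "no operator":
select d.location, d.date, v.user_id,
       sum(v.opened) as opened, sum(v.closed) as closed, sum(v.cancelled) as cancelled
from Data_Table d
cross apply (values (d.opened_id,    1, 0, 0),
                    (d.closed_id,    0, 1, 0),
                    (d.cancelled_id, 0, 0, 1)
            ) v(user_id, opened, closed, cancelled)
where v.user_id <> 0    -- skip the 0 placeholders visible in the sample rows
group by d.location, d.date, v.user_id;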

Related

Duplicate value PostgreSQL

I have these entries in the database:
| group | account | description | balance | balance1 |
+----------+-------------+-----------------+-------------+--------------+
| 123123 | 0 | Name 1 | 1000.00 | 0 |
| 123123 | 777 | Name 2 | 250.00 | 0 |
| 123123 | 999 | Name 3 | 0 | 350.00 |
| 123000 | 0 | Name 4 | 500.00 | 0 |
| 123000 | 567 | Name 5 | 0 | 500.00 |
select * from table;
Gives exactly the same result as the example above.
I would like to get the result without duplicates in the "group" column, like this:
| group | account | description | balance | balance1 |
+----------+-------------+-----------------+-------------+--------------+
| 123123 | 0 | Name 1 | 1000.00 | 0 |
| | 777 | Name 2 | 250.00 | 0 |
| | 999 | Name 3 | 0 | 350.00 |
| 123000 | 0 | Name 4 | 500.00 | 0 |
| | 567 | Name 5 | 0 | 500.00 |
That is, as you can see from the example, I want to remove only duplicate values from the first column, without affecting the rest.
Also "group by", "order by" I can't use, as it will break the sequence of information output.
Something like this might work for you:
with cte as
(
  SELECT "group", account, description, balance, balance1,
         row_number() OVER (ORDER BY (SELECT NULL)) as rn
  FROM yourtable
)
SELECT case when LAG("group") OVER (ORDER BY rn) = "group" THEN NULL ELSE "group" END AS "group",
       account, description, balance, balance1
FROM cte
ORDER BY rn;
ORDER BY (SELECT NULL) is a fairly horrible hack. It is there because row_number() requires an ORDER BY but you specifically stated that you can't use an order by. The row_number() is however needed in order to use LAG, which itself requires an OVER (ORDER BY..).
Very much a case of caveat emptor, but it might give you what you are looking for.
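If the table does have a stable key (an assumption - none is shown - but a serial id column would do), the hack can be dropped and the output order guaranteed at the same time:
-- Assumption: yourtable has a serial column "id" reflecting insertion order.
SELECT CASE WHEN LAG("group") OVER (ORDER BY id) = "group"
            THEN NULL ELSE "group" END AS "group",
       account, description, balance, balance1
FROM yourtable
ORDER BY id;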

Count repeat records per day (without window functions)

I'm trying to get a count of repeat customer records per day, and I'm having a bit of trouble, as I'm on MariaDB 10.1 and window functions weren't introduced until 10.2 (therefore no partitioning, rank, etc.).
I have an example set of data that looks like this:
| Date | Country | Type | Email | Response_Time |
| ---------- | ------- | --------- | ------------- | ------------- |
| 2021-05-21 | AU | Enquiry | bill@fake.com | 910 |
| 2021-05-21 | AU | Enquiry | bill@fake.com | 1050 |
| 2021-05-21 | NZ | Complaint | jim@fake.com | 56 |
| 2021-05-22 | NZ | Enquiry | jim@fake.com | 1000 |
| 2021-05-22 | NZ | Enquiry | jim@fake.com | 845 |
| 2021-05-22 | NZ | Enquiry | jim@fake.com | 700 |
| 2021-05-22 | NZ | Complaint | jim@fake.com | 217 |
| 2021-05-23 | UK | Enquiry | jane@fake.com | 843 |
| 2021-05-23 | NZ | Enquiry | jim@fake.com | 1795 |
| 2021-05-23 | NZ | Enquiry | jim@fake.com | 521 |
| 2021-05-23 | AU | Complaint | bill@fake.com | 150 |
The above can be produced with the following query:
SELECT
DATE(Start_Time) AS "Date",
Country,
Type,
Email,
Response_Time
FROM EMAIL_DETAIL
WHERE DATE(Start_Time) BETWEEN '2021-05-21' AND '2021-05-23'
AND COUNTRY IN ('AU','NZ','UK')
;
I'd like to get a count of email addresses that appear more than once in the group of day, country and type, and display it as a summary like this:
| Country | Type | Volume | Avg_Response_Time | Repeat_Daily |
| ------- | --------- | ------ | ----------------- | ------------ |
| AU | Enquiry | 2 | 980 | 1 |
| AU | Complaint | 1 | 150 | 0 |
| NZ | Enquiry | 5 | 972 | 3 |
| NZ | Complaint | 1 | 137 | 0 |
| UK | Enquiry | 1 | 843 | 0 |
The repeat daily count is a count of records where the email address appeared more than once in the group of date, country and type. Volume is the total count of records per country and type.
I'm having a hard time with the lack of window functions in this version of MariaDB and any help would really be appreciated.
(Apologies for the tables formatted as code, I was getting a formatting error when trying to post otherwise)
Hmmm . . . I think this is two levels of aggregation:
SELECT country, type, SUM(cnt) as volume,
       SUM(Total_Response_Time) / SUM(cnt) as avg_response_time,
       SUM(CASE WHEN cnt > 1 THEN cnt - 1 ELSE 0 END) as repeat_daily
FROM (SELECT DATE(Start_Time) AS "Date", Country, Type, Email,
             SUM(Response_Time) as Total_Response_Time, COUNT(*) as cnt
      FROM EMAIL_DETAIL
      WHERE DATE(Start_Time) BETWEEN '2021-05-21' AND '2021-05-23' AND
            COUNTRY IN ('AU','NZ','UK')
      GROUP BY date, country, type, email
     ) ed
GROUP BY country, type;
select `Date`, Country, Type,
       sum(total_rt) / sum(cc) as Avg_Response_Time,
       sum(cc) as Volume,
       sum(case when cc > 1 then cc - 1 else 0 end) as Repeat_Daily
from (
    SELECT
        DATE(Start_Time) AS `Date`,
        Country,
        Type,
        count(email) cc,
        SUM(Response_Time) total_rt
    FROM EMAIL_DETAIL
    WHERE DATE(Start_Time) BETWEEN '2021-05-21' AND '2021-05-23'
        AND COUNTRY IN ('AU','NZ','UK')
    group by
        DATE(Start_Time),
        Country, Type, email
) t
group by `Date`, Country, Type;

Map replenishment to requirement based on field value - SQL Server

I am attempting to find which "replenishment" (a positive transaction quantity) can be matched to a "requirement" (a negative transaction quantity).
The basic logic would be: For a given requirement, find the first available replenishment (whether that replenishment be from existing inventory, or from an upcoming change).
I am working with a table dbo_purchases_new that looks like this:
| Element_ID | Element | Transaction_Date | Transaction_Quantity | Total_Inventory |
|:----------:|:----------:|:----------------:|:--------------------:|:---------------:|
| | STOCK | | 5 | 5 |
| MO302 | Make_Order | 1/3/2019 | 1 | 6 |
| SO105 | Sale | 2/1/2019 | -1 | 5 |
| SO106 | Sale | 2/1/2019 | -1 | 4 |
| MO323 | Make_Order | 2/2/2019 | 1 | 5 |
| SO107 | Sale | 2/4/2019 | -1 | 4 |
| SO191 | Sale | 2/5/2019 | -1 | 3 |
| SO123 | Sale | 2/6/2019 | -1 | 2 |
| SO166 | Sale | 3/1/2019 | -1 | 1 |
| SO819 | Sale | 3/5/2019 | -1 | 0 |
| SO603 | Sale | 3/10/2019 | -4 | -3 |
| MO400 | Make_Order | 3/15/2019 | 1 | -2 |
| MO459 | Make_Order | 3/15/2019 | 1 | -1 |
| MO460 | Make_Order | 3/18/2019 | 1 | 0 |
| MO491 | Make_Order | 3/19/2019 | 1 | 1 |
| MO715 | Make_Order | 4/1/2019 | 3 | 4 |
| SO100 | Sale | 4/2/2019 | -1 | 3 |
| SO322 | Sale | 4/3/2019 | -1 | 2 |
| SO874 | Sale | 4/4/2019 | -1 | 1 |
| SO222 | Sale | 4/5/2019 | -1 | 0 |
| MO999 | Make_Order | 4/5/2019 | 1 | 1 |
| SO999 | Sale | 4/6/2019 | -1 | 0 |
that is being created as a result of this question.
I am now attempting to track which Make_Order will fulfill which Sale by tracking the Transaction_Quantity.
Ideally, the resulting dataset would look like this, where Replenishment and Replenishment_Date are newly added columns:
| Element_ID | Element | Transaction_Date | Transaction_Quantity | Total_Inventory | Replenishment | Replenishment_Date |
|:----------:|:----------:|:----------------:|:--------------------:|:---------------:|:-------------:|:------------------:|
| | STOCK | | 5 | 5 | NULL | NULL |
| MO302 | Make_Order | 1/3/2019 | 1 | 6 | NULL | NULL |
| SO105 | Sale | 2/1/2019 | -1 | 5 | STOCK | NULL |
| SO106 | Sale | 2/1/2019 | -1 | 4 | STOCK | NULL |
| MO323 | Make_Order | 2/2/2019 | 1 | 5 | NULL | NULL |
| SO107 | Sale | 2/4/2019 | -1 | 4 | STOCK | NULL |
| SO191 | Sale | 2/5/2019 | -1 | 3 | STOCK | NULL |
| SO123 | Sale | 2/6/2019 | -1 | 2 | STOCK | NULL |
| SO166 | Sale | 3/1/2019 | -1 | 1 | MO302 | 1/3/2019 |
| SO819 | Sale | 3/5/2019 | -1 | 0 | MO323 | 2/2/2019 |
| SO603 | Sale | 3/10/2019 | -4 | -3 | MO460 | 3/18/2019 |
| MO400 | Make_Order | 3/15/2019 | 1 | -2 | NULL | NULL |
| MO459 | Make_Order | 3/15/2019 | 1 | -1 | NULL | NULL |
| MO460 | Make_Order | 3/18/2019 | 1 | 0 | NULL | NULL |
| MO491 | Make_Order | 3/19/2019 | 1 | 1 | NULL | NULL |
| MO715 | Make_Order | 4/1/2019 | 3 | 4 | NULL | NULL |
| SO100 | Sale | 4/2/2019 | -1 | 3 | MO491 | 3/19/2019 |
| SO322 | Sale | 4/3/2019 | -1 | 2 | MO715 | 4/1/2019 |
| SO874 | Sale | 4/4/2019 | -1 | 1 | MO715 | 4/1/2019 |
| SO222 | Sale | 4/5/2019 | -1 | 0 | MO715 | 4/1/2019 |
| MO999 | Make_Order | 4/5/2019 | 1 | 1 | NULL | NULL |
| SO999 | Sale | 4/6/2019 | -1 | 0 | MO999 | 4/5/2019 |
The ruleset would essentially be:
1. For a given requirement (a negative transaction quantity of arbitrary value), find which replenishment (a positive transaction quantity of arbitrary value) satisfies it.
2. Stock is assigned to the first requirements until it runs out. NOTE: it could be the case that stock does not exist, so this only applies IF stock does exist.
3. Then, map replenishments to requirements based on the Transaction_Date in ASC order.
I am very confused on how to accomplish this. I imagine some pseudocode would look something like:
for curr in transaction_quantity:
    if curr < 0:
        if stock.exists() and stock.notempty():
            fill in data from that
        else:
            find next replenishment
            fill in data from that
    else:
        next
Right now, I have this so far, but I know that it will not run. I am very confused on where to go from here. I have tried looking at posts like this, but that did not have an answer. I then tried looking up CURSOR, but that was very confusing to me and I am unsure how I can apply that to this problem.
/****** WiP Script ******/
SELECT
[jerry].[dbo].[purchases_new].*,
CASE WHEN Transaction_Quantity < 0 THEN -- (SELECT Element_ID FROM the_current_row WHERE transaction_quantity > 0)
ELSE NULL AS "Replenishment",
-- (SELECT Transaction_Date FROM [jerry].[dbo].[purchases_new] WHERE Element_ID
-- Not sure how to grab the correct date of the element id from the column before
FROM
[jerry].[dbo].[purchases_new]
Any assistance is appreciated. I have been pulling my hair out on this problem. The comments contain additional information.
NOTE - I have continually tried to update this question as users have requested more information.
Here is one attempt. You will need to modify it with another layer of abstraction for offsets if you need to support transaction increments/decrements > 1. It basically aligns the order of sales with the order of debits and then uses that as a join back to the main dataset.
Sql Fiddle
The idea is to put additions and subtractions into two sets, ordered chronologically within each set, while also remembering each item's position in the main list. This way, you can align each subtraction with the nearest addition. This is pretty straightforward when dealing with 1's.
Edit --> Dealing with values > 1.
Handling Transaction_Quantity values greater than (+/-)1 adds a little complexity, but it is still solvable. Now we need to stretch each addition and subtraction transaction out by its Transaction_Quantity, so the dataset is lengthened; the original algorithm is then applied to the now-longer dataset. This also allows "partial fulfillments" to be recorded. So (12 A 5) would equate to (12 A 1), (12 A 1), (12 A 1), (12 A 1), (12 A 1). When the subtractions are lengthened in similar fashion (with all rows kept in the same order as the first of the sequence), the alignment still works, and additions and subtractions can be matched with their nearest neighbor(s).
DECLARE @T TABLE(Element_ID NVARCHAR(50), Element NVARCHAR(50), Transaction_Date DATETIME, Transaction_Quantity INT, Total_Inventory INT)
INSERT @T VALUES
('MO301','Make_Order','1/1/2019',5,1),
('MO302','Make_Order','1/3/2019',1,2),
('SO105','Sale','2/1/2019',-2,1),
('SO106','Sale','2/1/2019',-1,0),
('MO323','Make_Order','2/2/2019',1,1),
('SO107','Sale','2/4/2019',-1,0),
('SO191','Sale','2/5/2019',-1,-1),
('SO123','Sale','2/6/2019',-1,-2),
('SO166','Sale','3/1/2019',-1,-3),
('SO603','Sale','3/2/2019',-1,-4),
('MO400','Make_Order','3/15/2019',1,-3),
('MO459','Make_Order','3/15/2019',1,-2),
('MO460','Make_Order','3/18/2019',1,-1),
('MO491','Make_Order','3/19/2019',1,0)
;WITH Normalized AS
(
SELECT *, RowNumber = ROW_NUMBER() OVER (ORDER BY (SELECT 0)), IsAdd = CASE WHEN Transaction_Quantity > 0 THEN 1 ELSE 0 END FROM @T
)
,ReplicateAmount AS
(
SELECT Element_ID, Element, Transaction_Date, Transaction_Quantity=ABS(Transaction_Quantity) ,Total_Inventory, RowNumber, IsAdd
FROM Normalized
UNION ALL
SELECT R.Element_ID, R.Element, R.Transaction_Date, Transaction_Quantity=(R.Transaction_Quantity - 1), R.Total_Inventory, R.RowNumber, R.IsAdd
FROM ReplicateAmount R INNER JOIN Normalized N ON R.RowNumber = N.RowNumber
WHERE ABS(R.Transaction_Quantity) > 1
)
,NormalizedAgain AS
(
SELECT Element_ID, Element, Transaction_Date, Transaction_Quantity=1, Total_Inventory, RowNumber = ROW_NUMBER() OVER (ORDER BY RowNumber), IsAdd FROM ReplicateAmount
)
,Additives AS
(
SELECT *, AddedOrder = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM NormalizedAgain WHERE IsAdd=1
)
,Subtractions AS
(
SELECT Element_ID, Element, Transaction_Date, Transaction_Quantity=-1, Total_Inventory, RowNumber, SubtractedOrder = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM NormalizedAgain WHERE IsAdd=0
)
,WithTies AS
(
SELECT
S.RowNumber,
S.Element_ID,
BoughtFromRowNumber = A.RowNumber,
SoldToID =S.Element_ID,
BoughFromID=A.Element_ID,
S.Element,
S.Transaction_Date,
S.Transaction_Quantity,
S.Total_Inventory
FROM
Additives A
LEFT OUTER JOIN Subtractions S ON A.AddedOrder=S.SubtractedOrder
UNION
SELECT
A.RowNumber,
A.Element_ID,
BoughtFromRowNumber = S.RowNumber,
SoldToID = NULL,
BoughFromID=NULL,
A.Element,
A.Transaction_Date,
A.Transaction_Quantity,
A.Total_Inventory
FROM
Additives A
LEFT OUTER JOIN Subtractions S ON A.AddedOrder=S.SubtractedOrder
)
SELECT
T.RowNumber,
T.Element_ID,
T.Element,
T.Transaction_Date,
T.Transaction_Quantity,
T.Total_Inventory,
T2.SoldToID,
T.BoughFromID
FROM
WithTies T
LEFT OUTER JOIN WithTies T2 ON T2.BoughtFromRowNumber= T.RowNumber
WHERE
NOT T.RowNumber IS NULL
ORDER BY
T.RowNumber
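One caveat: ReplicateAmount is a recursive CTE, and SQL Server caps recursion at 100 levels by default, so a Transaction_Quantity much larger than 100 would make the query fail. If quantities that large are possible, lift the cap by appending a hint to the outer query:
-- appended after the final ORDER BY T.RowNumber above
OPTION (MAXRECURSION 0)   -- 0 removes the default 100-level recursion cap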

How to select multiple count(*) values then group by a specific column

I've used SQL for a while but wouldn't say I'm at an advanced level. I've tinkered with trying to figure this out myself to no avail.
I have two tables - Transaction and TransactionType:
Transaction:

| **TransactionID** | **Name** | **TransactionTypeID** |
| ----------------- | -------- | --------------------- |
| 1 | Tom | 1 |
| 2 | Jim | 1 |
| 3 | Mo | 2 |
| 4 | Tom | 3 |
| 5 | Sarah | 4 |
| 6 | Tom | 1 |
| 7 | Sarah | 1 |

TransactionType:

| **TransactionTypeID** | **TransactionType** |
| --------------------- | ------------------- |
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | D |
Transaction.TransactionTypeID is a foreign key linked to the TransactionType.TransactionTypeID field.
Here's what I'd like to achieve:
I'd like a query (this will be a stored procedure) that returns three columns:
Name - the value of the Transaction.Name column.
NumberOfTypeATransactions - the count of the number of all transactions of type 'A' for that person.
NumberOfNonTypeATransactions - the count of the number of all transactions that are NOT of type A for that person, i.e. all other transaction types.
So, using the above data as an example, the result set would be:
| **Name** | **NumberOfTypeATransactions** | **NumberOfNonTypeATransactions** |
| -------- | ----------------------------- | -------------------------------- |
| Tom | 2 | 1 |
| Jim | 1 | 0 |
| Mo | 0 | 1 |
| Sarah | 1 | 1 |
I might also need to return the results based on a date period (which will be based on a 'transaction date' column in the Transaction table), but I haven't finalized this requirement yet.
Any help in how I can achieve this would be much appreciated. Apologies if the layout of the tables is a bit odd - I haven't worked out how to format them properly yet.
This is just conditional aggregation with a join:
select t.name,
sum(case when tt.TransactionType = 'A' then 1 else 0 end) as num_As,
sum(case when tt.TransactionType <> 'A' then 1 else 0 end) as num_notAs
from transaction t join
transactiontype tt
on tt.TransactionTypeID = t.TransactionTypeID
group by t.name;
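For the date-period requirement mentioned in the question, the same aggregation only needs a WHERE clause. A sketch, assuming a TransactionDate column and stored-procedure parameters @StartDate/@EndDate (none of these names are in the original schema):
select t.name,
       sum(case when tt.TransactionType = 'A' then 1 else 0 end) as NumberOfTypeATransactions,
       sum(case when tt.TransactionType <> 'A' then 1 else 0 end) as NumberOfNonTypeATransactions
from transaction t join
     transactiontype tt
     on tt.TransactionTypeID = t.TransactionTypeID
where t.TransactionDate >= @StartDate
  and t.TransactionDate < @EndDate    -- half-open range is safe if the column has a time part
group by t.name;
On SQL Server the reserved table name would need quoting ([transaction]).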

Create a group indicator (SQL)

I am looking to create a group indicator for a query using SQL (Oracle specifically). Basically, I am looking for duplicate entries on certain columns, and while I can find those, what I also want is some kind of indicator to say which rows the duplicates are from.
Below is an example of what I am looking to do (looking for duplicates on Name, Zip, Phone). The rows with Name = aaa are all in one group, the bb rows are not (their Zip and Phone differ), and the c rows are.
Is there even a way to do this? I was thinking something with OVER (PARTITION BY ...), but I can't think of a way to only increment for each group.
+----------+---------+-----------+------------+-----------+-----------+
| Name | Zip | Phone | Amount | Duplicate | Group |
+----------+---------+-----------+------------+-----------+-----------+
| aaa | 1234 | 5555555 | 500 | X | 1 |
| aaa | 1234 | 5555555 | 285 | X | 1 |
| bb | 545 | 6666666 | 358 | | 2 |
| bb | 686 | 7777777 | 898 | | 3 |
| aaa | 1234 | 5555555 | 550 | X | 1 |
| c | 5555 | 8888888 | 234 | X | 4 |
| c | 5555 | 8888888 | 999 | X | 4 |
| c | 5555 | 8888888 | 230 | X | 4 |
+----------+---------+-----------+------------+-----------+-----------+
It looks like you can just use
(CASE WHEN COUNT(*) OVER (partition by name, zip, phone) > 1
THEN 'X'
ELSE NULL
END) duplicate,
DENSE_RANK() OVER (ORDER BY name, zip, phone) group_rank
Rows that have the same name, zip, and phone will have the same group_rank. Here is a SQL Fiddle example.
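One caveat: DENSE_RANK() OVER (ORDER BY name, zip, phone) numbers the groups in sort order, which happens to match the example. If the numbers must instead follow first appearance, they can be anchored to an ordering column; a sketch, using a hypothetical insert_seq column, since Oracle tables have no inherent row order:
SELECT name, zip, phone, amount, duplicate,
       DENSE_RANK() OVER (ORDER BY first_seq) AS group_rank
FROM (SELECT t.*,
             CASE WHEN COUNT(*) OVER (PARTITION BY name, zip, phone) > 1
                  THEN 'X' END AS duplicate,
             MIN(insert_seq) OVER (PARTITION BY name, zip, phone) AS first_seq
      FROM t);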