I'm trying to perform a query between two different tables and come up with a case by case scenario, coming up with a list of records of calls for a specific month.
Here are my tables:
Customer table:
+----+----------------+------------+
| id | name | number |
+----+----------------+------------+
| 1 | John Doe | 8973221232 |
| 2 | American Dad | 7165531212 |
| 3 | Michael Clean | 8884731234 |
| 4 | Samuel Gatsby | 9197543321 |
| 5 | Mike Chat | 8794029819 |
+----+----------------+------------+
Transaction data:
+----------+------------+------------+----------+---------------------+
| trans_id | incoming | outgoing | duration | date_time |
+----------+------------+------------+----------+---------------------+
| 1 | 8973221232 | 9197543321 | 64 | 2018-03-09 01:08:09 |
| 2 | 3729920490 | 7651113929 | 276 | 2018-07-20 05:53:10 |
| 3 | 8884731234 | 8973221232 | 382 | 2018-05-02 13:12:13 |
| 4 | 8973221232 | 9234759208 | 127 | 2018-07-07 15:32:30 |
| 5 | 7165531212 | 9197543321 | 852 | 2018-08-02 07:40:23 |
| 6 | 8884731234 | 9833823023 | 774 | 2018-07-03 14:27:52 |
| 7 | 8273820928 | 2374987349 | 120 | 2018-07-06 05:27:44 |
| 8 | 8973221232 | 9197543321 | 79 | 2018-07-30 12:51:55 |
| 9 | 7165531212 | 7651113929 | 392 | 2018-05-22 02:27:38 |
| 10 | 5423541524 | 7165531212 | 100 | 2018-07-21 22:12:20 |
| 11 | 9197543321 | 2983479820 | 377 | 2018-07-20 17:46:36 |
| 12 | 8973221232 | 7651113929 | 234 | 2018-07-09 03:32:53 |
| 13 | 7165531212 | 2309483932 | 88 | 2018-07-16 16:22:21 |
| 14 | 8973221232 | 8884731234 | 90 | 2018-09-03 13:10:00 |
| 15 | 3820838290 | 2093482348 | 238 | 2018-04-12 21:59:01 |
+----------+------------+------------+----------+---------------------+
What am I trying to accomplish?
I'm trying to compile a list of "costs" for each of the customers that made calls on July 2018. The costs are based on:
1) If the customer received a call (incoming), the cost of the call is equal to the duration;
2) if the customer made a call (outgoing), the cost of the call is 100 if the call is 30 or less in duration. If it exceeds 30 duration, then the cost is 100 plus 5 * duration of the exceeded period.
If the customer didn't make any calls during that month he shouldn't be on the list.
Examples:
1) Customer American Dad has 3 incoming calls and 1 outgoing call, however only trans_id 10 and 13 are for the month of July. He should be paying a total of 538:
for trans_id 10 = 450 (100 for the first 30s + 5 * 70 for the remaining)
for trans_id 13 = 88
2) Customer Samuel Gatsby has 1 incoming call and 3 outgoing calls, however only trans_id 8 and 11 are for the month of July. He should be paying a total of 722:
for trans_id 8 = 345 (100 for the first 30s + 5 * 49 for the remaining)
for trans_id 11 = 377
Considering only these two examples, the output would be:
+----+----------------+------------+------------+
| id | name | number | billable |
+----+----------------+------------+------------+
| 2 | American Dad | 7165531212 | 538 |
| 4 | Samuel Gatsby | 9197543321 | 722 |
+----+----------------+------------+------------+
Note: Mike Chat shouldn't be on the list as he didn't make or receive any calls for that specific month.
What have I tried so far?
I've been playing cat and mouse with this one, I'm using the number as uniqueID, already attempted both a full outer join and combining where incoming or outgoing is not null then applying rules by case, tried doing a left join and applying cases, but I'm circling around and I can't get to a final list. Whenever I get incoming or outgoing, I'm either not able to apply the case or not able to come with both together. Really appreciate the help!
select customer_name.name, customer_name.number, bill = (CASE
WHEN customer_name.number = transaction_data.incoming then 'sum bill'
else 'multiply and add'
end)
from customer_name
left join transaction_data on customer_name.number = transaction_data.incoming or customer_name.name = transaction_data.outgoing
where strftime('%Y-%m', transaction_data.date_time) = '2018-07'
Note: I'm using sqlite to try it out online but the database is on SQL Server 2012, so I know that I can use a date format much easier, that way, but I'd like to keep as close to T-SQL as possible.
Also tried creating a case to determine whether it's incoming call or outgoing, but I'm only getting incoming as a result, even though trans_id 10 is outgoing:
select name, number, duration, case
when customer_name.number = transaction_data.incoming then 'incoming'
when customer_name.number = transaction_data.outgoing then 'outgoing'
END direction
from customer_name
left join transaction_data on customer_name.number = transaction_data.incoming or customer_name.name = transaction_data.outgoing
where strftime('%Y-%m', transaction_data.date_time) = '2018-07'
Try this:
SELECT
c."name", c.number,
SUM(CASE c.number
WHEN t.incoming THEN t.duration
ELSE IIF(t.duration - 30 < 0, 0, t.duration - 30) * 5 + 100
END) AS billable
FROM Customer AS c INNER JOIN [Transaction] AS t
ON c.number IN(t.incoming, t.outgoing)
WHERE t.date_time >= '20180701' AND t.date_time < '20180801'
GROUP BY c."name", c.number
Output:
| name | number | billable |
+---------------+------------+----------+
| John Doe | 8973221232 | 440 |
| American Dad | 7165531212 | 538 |
| Michael Clean | 8884731234 | 774 |
| Samuel Gatsby | 9197543321 | 722 |
Test it online with SQL Fiddle.
I have a 1 table in a db that stored Incoming, Outgoing and Net values for various Account Codes over time. Although there is a date field the sequence of events per Account Code is based on the "Version" number where 0 = original record for each Account Code and it increments by 1 after each change to that Account Code.
The Outgoing and Incoming values are stored in the db as cumulative values rather than the individual transaction value but I am looking for a way to Select * From this table and return the individual amounts as opposed to the cumulative.
Below are test scripts of table and data, and also 2 examples.
If i Select where code = '123' in the test table I currently get this (values are cumulative);
+------+------------+---------+---------+---------+-----+
| Code | Date | Version | Incoming| Outgoing| Net |
+------+------------+---------+---------+---------+-----+
| 123 | 01/01/2018 | 0 | 100 | 0 | 100 |
| 123 | 07/01/2018 | 1 | 150 | 0 | 150 |
| 123 | 09/01/2018 | 2 | 150 | 100 | 50 |
| 123 | 14/01/2018 | 3 | 200 | 100 | 100 |
| 123 | 18/01/2018 | 4 | 200 | 175 | 25 |
| 123 | 23/01/2018 | 5 | 225 | 175 | 50 |
| 123 | 30/01/2018 | 6 | 225 | 225 | 0 |
+------+------------+---------+---------+---------+-----+
This is what I would like to see (each individual transaction);
+------+------------+---------+----------+----------+------+
| Code | Date | Version | Incoming | Outgoing | Net |
+------+------------+---------+----------+----------+------+
| 123 | 01/01/2018 | 0 | 100 | 0 | 100 |
| 123 | 07/01/2018 | 1 | 50 | 0 | 50 |
| 123 | 09/01/2018 | 2 | 0 | 100 | -100 |
| 123 | 14/01/2018 | 3 | 50 | 0 | 50 |
| 123 | 18/01/2018 | 4 | 0 | 75 | -75 |
| 123 | 23/01/2018 | 5 | 25 | 0 | 25 |
| 123 | 30/01/2018 | 6 | 0 | 50 | -50 |
+------+------------+---------+----------+----------+------+
If I had the individual transaction values and wanted to report on the cumulative, I would use an OVER PARTITION BY, but is there an opposite to that?
I am not looking to redesign the create table or the process in which it is stored, I am just looking for a way to report on this from our MI environment.
Note: I've added other random Account Codes into this to emphasis how the data is not ordered by Code or Version, but by Date.
thanks in advance for any help.
USE [tempdb];
IF EXISTS ( SELECT *
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = 'Table1'
AND TABLE_SCHEMA = 'dbo')
DROP TABLE [dbo].[Table1];
GO
CREATE TABLE [dbo].[Table1]
(
[Code] CHAR(3)
,[Date] DATE
,[Version] CHAR(3)
,[Incoming] DECIMAL(20,2)
,[Outgoing] DECIMAL(20,2)
,[Net] DECIMAL(20,2)
);
GO
INSERT INTO [dbo].[Table1] VALUES
('123','2018-01-01','0','100','0','100'),
('456','2018-01-02','0','50','0','50'),
('789','2018-01-03','0','0','0','0'),
('456','2018-01-04','1','100','0','100'),
('456','2018-01-05','2','150','0','150'),
('789','2018-01-06','1','50','50','0'),
('123','2018-01-07','1','150','0','150'),
('456','2018-01-08','3','200','0','200'),
('123','2018-01-09','2','150','100','50'),
('789','2018-01-10','2','0','0','0'),
('456','2018-01-11','4','225','0','225'),
('789','2018-01-12','3','75','25','50'),
('987','2018-01-13','0','0','50','-50'),
('123','2018-01-14','3','200','100','100'),
('654','2018-01-15','0','100','0','100'),
('456','2018-01-16','5','250','0','250'),
('987','2018-01-17','1','50','50','0'),
('123','2018-01-18','4','200','175','25'),
('789','2018-01-19','4','100','25','75'),
('987','2018-01-20','2','150','125','25'),
('321','2018-01-21','0','100','0','100'),
('654','2018-01-22','1','0','0','0'),
('123','2018-01-23','5','225','175','50'),
('321','2018-01-24','1','100','50','50'),
('789','2018-01-25','5','100','50','50'),
('987','2018-01-26','3','150','150','0'),
('456','2018-01-27','6','250','250','0'),
('456','2018-01-28','7','270','250','20'),
('321','2018-01-29','2','100','100','0'),
('123','2018-01-30','6','225','225','0'),
('987','2018-01-31','4','175','150','25')
;
GO
SELECT *
FROM [dbo].[Table1]
WHERE [Code] = '123'
GO;
USE [tempdb];
IF EXISTS ( SELECT *
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = 'Table1'
AND TABLE_SCHEMA = 'dbo')
DROP TABLE [dbo].[Table1];
GO;
}
Just use lag():
select Evt, Date, Version,
(Loss - lag(Loss, 1, 0) over (partition by evt order by date)) as incoming,
(Rec - lag(Rec, 1, 0) over (partition by evt order by date)) as outgoing,
(Net - lag(Net, 1, 0) over (partition by evt order by date)) as net
from [dbo].[Table1];
Table name: Copies
+------------------------------------------------------------------------------------+
| group_id | my_id | previous | in_this | higher_value | most_recent |
+----------------------------------------------------------------------------------------------------------------
| 900 | 1 | null | Y | 7 | May16 |
| 900 | 2 | null | Y | 3 | Oct 16 |
| 900 | 3 | null | N | 9 | Oct 16 |
| 901 | 4 | 378 | Y | 3 | Oct 16 |
| 901 | 5 | null | N | 2 | Oct 16 |
| 902 | 6 | null | N | 5 | May16 |
| 902 | 7 | null | N | 9 | Oct 16 |
| 903 | 8 | null | Y | 3 | Oct 16 |
| 903 | 9 | null | Y | 3 | May16 |
| 904 | 10 | null | N | 0 | May 16 |
| 904 | 11 | null | N | 0 | May16
--------------------------------------------------------------------------------------
Output table
+---------------------------------------------------------------------------------------------------+
| group_id | my_id | previous | in_this | higher_value |most_recent|
+----------------------------------------------------------------------------------------------------
| 900 | 1 | null | Y | 7 | May16 |
| 902 | 7 | null | N | 9 | Oct 16 |
| 903 | 8 | null | Y | 3 | Oct 16 |
---------------------------------------------------------------------------------------------------------
Hi all, I need help with a query that returns one record within a group based on the importance of the field. The importance is ranked as follows:
previous- if one record within the group_id is not null, then neither record within a group_id is returned (because according to our rules, all records within a group should have the same previous value)
in_this- If one record is Y, and the other is N within a group_id, then we keep the Y; If all records are Y or all are N, then we move to the next attribute
Higher_value- If all records in the ‘in_this’ field are equal, then we need to select the record with the greater value from this field. If both records have an equal value, we move to the next attribute
Most_recent- If all records were of equal value in the ‘higher_value’ field, then we consider the newest record. If these are equal, then nothing is returned.
This is a simplified version of the table I am looking at, but I just would like to get the gist of how something like this would work. Basically, my table has multiple copies of records that have been grouped through some algorithm. I have been tasked with selecting which of these records within a group is the ‘good’ one, and we are basing this on these fields.
I’d like the output to actually show all fields, because I will likely attempt to refine the query to include other fields (there are over 40 to consider), but the most important is the group_id and my_id fields. It would be neat if we could also somehow flag why each record got picked, but that isn’t necessary.
It seems like something like this should be easy, but I have a hard time wrapping my head around how to pick from within a group_id. Thanks for your help.
You can use analytic functions for this. The trick is establishing the right variables for each condition:
select t.*
from (select t.*,
max(in_this) over (partition by group_id) as max_in_this,
min(higher_value) over (partition by group_id) as min_higher_value,
max(higher_value) over (partition by group_id) as max_higher_value,
row_number() over (partition by group_id, higher_value order by my_id) as seqnum_ghv,
min(most_recent) over (partition by group_id) as min_most_recent,
max(most_recent) over (partition by group_id) as max_most_recent,
row_number() over (partition by group_id order by most_recent) as seqnum_mr
from t
) t
where max_in_this is not null and
( (min_higher_value <> max_higher_value and seqnum_ghv = 1) or
(min_higher_value = max_higher_value and min_most_recent <> max_most_recent and seqnum_mr = 1
)
);
The third condition as stated makes no sense, but you should get the idea for how to implement this.