SQL query that throws away rows that are older & satisfy a condition? - sql

Issuing the following query:
SELECT t.seq,
t.buddyId,
t.mode,
t.type,
t.dtCreated
FROM MIM t
WHERE t.userId = 'ali'
ORDER BY t.dtCreated DESC;
...returns me 6 rows.
+-------------+------------------------+------+------+---------------------+
| seq | buddyId | mode | type | dtCreated |
+-------------+------------------------+------+------+---------------------+
| 12 | abcdefghij25#gmail.com | 2 | 1 | 2009-09-14 12:39:05 |
| 11 | abcdefghij25#gmail.com | 4 | 1 | 2009-09-14 12:39:02 |
| 10 | op_eee_81#hotmail.com | 1 | -1 | 2009-09-14 12:39:00 |
| 9 | abcdefghij25#gmail.com | 1 | -1 | 2009-09-14 12:38:59 |
| 8 | op_eee_81#hotmail.com | 2 | 1 | 2009-09-14 12:37:53 |
| 7 | abcdefghij25#gmail.com | 2 | 1 | 2009-09-14 12:37:46 |
+-------------+------------------------+------+------+---------------------+
I want to return rows based on this condition:
If there are duplicate rows with the same buddyId, only return me the latest (as specified by dtCreated).
So, the query should return me:
+-------------+------------------------+------+------+---------------------+
| seq | buddyId | mode | type | dtCreated |
+-------------+------------------------+------+------+---------------------+
| 12 | abcdefghij25#gmail.com | 2 | 1 | 2009-09-14 12:39:05 |
| 10 | op_eee_81#hotmail.com | 1 | -1 | 2009-09-14 12:39:00 |
+-------------+------------------------+------+------+---------------------+
I've tried with no success to use a UNIQUE function but it's not working.

This should return only the most recent entry for each buddyId.
SELECT a.seq
, a.buddyId
, a.mode
, a.type
, a.dtCreated
FROM MIM AS a
JOIN (SELECT buddyId, MAX(dtCreated) AS dtCreated
FROM MIM
WHERE userId = 'ali'
GROUP BY buddyId) AS b
ON a.dtCreated = b.dtCreated
AND a.buddyId = b.buddyId
WHERE a.userId = 'ali'
ORDER BY a.dtCreated DESC;
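To check the "latest row per buddyId" pattern against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for whatever engine the question uses (an assumption), and the timestamps are stored as ISO-formatted text so that MAX() orders them correctly.

```python
import sqlite3

# In-memory SQLite database; a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MIM (seq INTEGER, userId TEXT, buddyId TEXT, "
    "mode INTEGER, type INTEGER, dtCreated TEXT)"
)

# The six sample rows shown in the question, all for userId 'ali'.
sample = [
    (12, "ali", "abcdefghij25#gmail.com", 2, 1, "2009-09-14 12:39:05"),
    (11, "ali", "abcdefghij25#gmail.com", 4, 1, "2009-09-14 12:39:02"),
    (10, "ali", "op_eee_81#hotmail.com", 1, -1, "2009-09-14 12:39:00"),
    (9, "ali", "abcdefghij25#gmail.com", 1, -1, "2009-09-14 12:38:59"),
    (8, "ali", "op_eee_81#hotmail.com", 2, 1, "2009-09-14 12:37:53"),
    (7, "ali", "abcdefghij25#gmail.com", 2, 1, "2009-09-14 12:37:46"),
]
conn.executemany("INSERT INTO MIM VALUES (?, ?, ?, ?, ?, ?)", sample)

# Join each row to the newest dtCreated of its buddyId and keep only matches.
query = """
SELECT a.seq, a.buddyId, a.mode, a.type, a.dtCreated
FROM MIM AS a
JOIN (SELECT buddyId, MAX(dtCreated) AS dtCreated
      FROM MIM
      WHERE userId = 'ali'
      GROUP BY buddyId) AS b
  ON a.buddyId = b.buddyId
 AND a.dtCreated = b.dtCreated
WHERE a.userId = 'ali'
ORDER BY a.dtCreated DESC
"""
latest_rows = conn.execute(query).fetchall()
for row in latest_rows:
    print(row)
```

With the sample data this prints the rows with seq 12 and seq 10, which matches the result the question asks for. If two rows for the same buddyId share the same dtCreated, both would be returned.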

Related

SQL Count depending on certain conditions

I have two tables.
One has Userid and Name (users table). The other has payment information (payments table) for each Userid in users.
users
+--------+------------+
| Userid | Name |
+--------+------------+
| 1 | Alex T |
| 2 | Jeremy T |
| 3 | Frederic A |
+--------+------------+
payments
+--------+-----------+------------+----------+
| Userid | ValuePaid | PaidMonths | Refunded |
+--------+-----------+------------+----------+
| 1 | 1 | 12 | null |
| 1 | 20 | 12 | null |
| 1 | 20 | 12 | null |
| 1 | 20 | 1 | null |
| 2 | 1 | 1 | null |
| 2 | 20 | 12 | 1 |
| 2 | 20 | 12 | null |
| 2 | 20 | 1 | null |
| 3 | 1 | 12 | null |
| 3 | 20 | 1 | 1 |
| 3 | 20 | 1 | null |
+--------+-----------+------------+----------+
I want to sum the PaidMonths, taking the following rules into consideration:
If ValuePaid < 10, PaidMonths should count as 0.23 (whatever number is stored in the column).
If Refunded = 1, PaidMonths should count as 0.
Based on this, when I join both tables by Userid and sum the PaidMonths according to the rules above, I expect to see this result:
+--------+------------+------------+
| userid | Name | paidMonths |
+--------+------------+------------+
| 1 | Alex T | 25.23 |
| 2 | Jeremy T | 13.23 |
| 3 | Frederic A | 1.23 |
+--------+------------+------------+
Can you help me to achieve this in the most elegant way? Should a temporary table be used?
The following gives your desired results, using apply with case expression to map your values:
select u.UserID, u.Name, Sum(pm) PaidMonths
from users u join payments p on p.userid=u.userid
cross apply (values(
case
when valuepaid <10 then 0.23
when Refunded=1 then 0
else PaidMonths end
))x(pm)
group by u.UserID, u.Name
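As a quick check of the CASE mapping against the sample data, here is a minimal sketch using Python's sqlite3 module. SQLite has no CROSS APPLY, so the CASE expression is placed directly inside SUM() instead; that is a swapped-in but equivalent formulation, not the exact query above. The WHEN branches are kept in the same order as in the answer.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (Userid INTEGER, Name TEXT);
CREATE TABLE payments (Userid INTEGER, ValuePaid INTEGER,
                       PaidMonths INTEGER, Refunded INTEGER);
INSERT INTO users VALUES (1, 'Alex T'), (2, 'Jeremy T'), (3, 'Frederic A');
INSERT INTO payments VALUES
  (1, 1, 12, NULL), (1, 20, 12, NULL), (1, 20, 12, NULL), (1, 20, 1, NULL),
  (2, 1, 1, NULL), (2, 20, 12, 1), (2, 20, 12, NULL), (2, 20, 1, NULL),
  (3, 1, 12, NULL), (3, 20, 1, 1), (3, 20, 1, NULL);
""")

# Map each payment to its effective months, then sum per user.
query = """
SELECT u.Userid, u.Name,
       SUM(CASE WHEN p.ValuePaid < 10 THEN 0.23
                WHEN p.Refunded = 1 THEN 0
                ELSE p.PaidMonths END) AS PaidMonths
FROM users u
JOIN payments p ON p.Userid = u.Userid
GROUP BY u.Userid, u.Name
ORDER BY u.Userid
"""
# Round in Python to hide binary floating-point noise in the sums.
totals = [(uid, name, round(months, 2))
          for uid, name, months in conn.execute(query)]
for row in totals:
    print(row)
```

No temporary table is needed for this. Note that a row with both ValuePaid < 10 and Refunded = 1 would count as 0.23 with this branch order; the sample data contains no such row, so swap the two WHEN branches if refunds should win.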

Finding MAX date aggregated by order - Oracle SQL

I have order data that looks like this:
| Order | Step | Step Complete Date |
|:-----:|:----:|:------------------:|
| A | 1 | 11/1/2019 |
| | 2 | 11/1/2019 |
| | 3 | 11/1/2019 |
| | 4 | 11/3/2019 |
| | 5 | 11/3/2019 |
| | 6 | 11/5/2019 |
| | 7 | 11/5/2019 |
| B | 1 | 12/1/2019 |
| | 2 | 12/2/2019 |
| | 3 | |
| C | 1 | 10/21/2019 |
| | 2 | 10/23/2019 |
| | 3 | 10/25/2019 |
| | 4 | 10/25/2019 |
| | 5 | 10/25/2019 |
| | 6 | |
| | 7 | 10/27/2019 |
| | 8 | 10/28/2019 |
| | 9 | 10/29/2019 |
| | 10 | 10/30/2019 |
| D | 1 | 10/30/2019 |
| | 2 | 11/1/2019 |
| | 3 | 11/1/2019 |
| | 4 | 11/2/2019 |
| | 5 | 11/2/2019 |
What I need to accomplish is the following:
For each order, assign the 'Order_Completion_Date' field as the most recent 'Step_Complete_Date'. If ANY 'Step_Complete_Date' is NULL, then the value for 'Order_Completion_Date' should be NULL.
I set up a SQL FIDDLE with this data and my attempt, below:
SELECT
OrderNum,
MAX(Step_Complete_Date)
FROM
OrderNums
WHERE
Step_Complete_Date IS NOT NULL
GROUP BY
OrderNum
This is yielding:
ORDERNUM MAX(STEP_COMPLETE_DATE)
D 11/2/2019
A 11/5/2019
B 12/2/2019
C 10/30/2019
How can I achieve:
| OrderNum | Order_Completed_Date |
|:--------:|:--------------------:|
| A | 11/5/2019 |
| B | NULL |
| C | NULL |
| D | 11/2/2019 |
An aggregate function with KEEP can handle this:
select ordernum,
max(step_complete_date)
keep (DENSE_RANK FIRST ORDER BY step_complete_date desc nulls first) res
FROM
OrderNums
GROUP BY
OrderNum
You can use a CASE expression to first count if there are any NULL values and if not then find the maximum value:
Query 1:
SELECT OrderNum,
CASE
WHEN COUNT( CASE WHEN Step_Complete_Date IS NULL THEN 1 END ) > 0
THEN NULL
ELSE MAX(Step_Complete_Date)
END AS Order_Completion_Date
FROM OrderNums
GROUP BY OrderNum
Results:
| ORDERNUM | ORDER_COMPLETION_DATE |
|----------|-----------------------|
| D | 11/2/2019 |
| A | 11/5/2019 |
| B | (null) |
| C | (null) |
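The same CASE/COUNT idea can be tried outside Oracle. The sketch below uses Python's sqlite3 module as a stand-in (an assumption) and stores the dates as ISO text (yyyy-mm-dd) so that MAX() compares them chronologically.

```python
import sqlite3

# Step completion dates per order, copied from the question; None marks a missing date.
steps = {
    "A": ["2019-11-01", "2019-11-01", "2019-11-01", "2019-11-03",
          "2019-11-03", "2019-11-05", "2019-11-05"],
    "B": ["2019-12-01", "2019-12-02", None],
    "C": ["2019-10-21", "2019-10-23", "2019-10-25", "2019-10-25", "2019-10-25",
          None, "2019-10-27", "2019-10-28", "2019-10-29", "2019-10-30"],
    "D": ["2019-10-30", "2019-11-01", "2019-11-01", "2019-11-02", "2019-11-02"],
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderNums (OrderNum TEXT, Step INTEGER, Step_Complete_Date TEXT)"
)
for order, dates in steps.items():
    for step, completed in enumerate(dates, start=1):
        conn.execute("INSERT INTO OrderNums VALUES (?, ?, ?)",
                     (order, step, completed))

# If any step in the group has a NULL date, the whole order is reported as NULL.
query = """
SELECT OrderNum,
       CASE
         WHEN COUNT(CASE WHEN Step_Complete_Date IS NULL THEN 1 END) > 0
         THEN NULL
         ELSE MAX(Step_Complete_Date)
       END AS Order_Completion_Date
FROM OrderNums
GROUP BY OrderNum
ORDER BY OrderNum
"""
completion = conn.execute(query).fetchall()
for row in completion:
    print(row)
```

Orders B and C come back as None (NULL) because each has one step without a completion date, while A and D return their latest step date.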
First, you are storing dates as varchars in mm/dd/yyyy format (at least in the fiddle). MAX compares those values as text, so it can produce incorrect results; try, for example, an order with the dates '11/10/2019' and '11/2/2019'.
Second, the simplest solution is IMHO to substitute a fallback value for nulls and return null when the fallback value wins:
SELECT
OrderNum,
NULLIF(MAX(NVL(Step_Complete_Date,'~')),'~')
FROM
OrderNums
GROUP BY
OrderNum
(Example is still for varchars since tilde is greater than any digit. For dates, you could use 9999-12-31, for instance.)
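The varchar pitfall mentioned above is easy to reproduce in any language that compares strings character by character. A small Python sketch with the two example dates:

```python
from datetime import datetime

dates = ["11/10/2019", "11/2/2019"]

# Text comparison: '2' sorts after '1' at the fourth character,
# so the earlier date wins.
max_as_text = max(dates)

# Parsing to real dates first gives the chronological maximum.
max_as_date = max(dates, key=lambda s: datetime.strptime(s, "%m/%d/%Y"))

print(max_as_text)  # 11/2/2019
print(max_as_date)  # 11/10/2019
```

This is why storing the column as a real DATE (or at least as yyyy-mm-dd text) matters before applying MAX.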

Returning MIN Row_Number() SQL

This is probably the clunkiest query I have ever written. I only have a read-only account, so I can't use temp tables or anything else to make this easier. The goal is to return MIN(RowNum) for the rows where sumPiecesScrapped = maxSum. I tried wrapping the entire query in another subquery to return MIN(RowNum), but the data is one-to-many on the primary key JobNo, and when I join on both JobNo and StepNo I get the same result as the query below.
SELECT
JobNo,
StepNo,
sumPiecesScrapped,
maxSum,
CASE
WHEN sumPiecesScrapped = maxSum THEN ROW_NUMBER() OVER(PARTITION BY JobNo ORDER BY JobNo, StepNo)
ELSE 0
END AS RowNum
FROM
(
SELECT
JobNo,
StepNo,
sumPiecesScrapped
FROM
(
SELECT
JobNo,
StepNo,
SUM(PiecesScrapped) as sumPiecesScrapped
FROM
(
SELECT
JobNo,
StepNo,
PiecesFinished,
PiecesScrapped
FROM TimeTicketDet
) tt2
GROUP BY JobNo, StepNo
) tt3
GROUP BY JobNo, StepNo, sumPiecesScrapped
) tt4
LEFT JOIN
(
SELECT
JobNo as tt5JobNo,
MAX(PiecesScrapped) as maxSum
FROM
(
SELECT
JobNo,
PiecesScrapped
FROM TimeTicketDet
) tt5
GROUP BY JobNo
) tt5
ON tt5.tt5JobNo = tt4.JobNo
WHERE tt4.JobNo = '12345'
Result:
+-------+--------+-------------------+--------+--------+
| JobNo | StepNo | sumPiecesScrapped | maxSum | RowNum |
+-------+--------+-------------------+--------+--------+
| 12345 | 10 | 0 | 5 | 0 |
| 12345 | 20 | 1 | 5 | 0 |
| 12345 | 30 | 5 | 5 | 3 |
| 12345 | 40 | 5 | 5 | 4 |
| 12345 | 60 | 5 | 5 | 5 |
| 12345 | 70 | 5 | 5 | 6 |
+-------+--------+-------------------+--------+--------+
Desired Result:
+-------+--------+-------------------+--------+--------+
| JobNo | StepNo | sumPiecesScrapped | maxSum | RowNum |
+-------+--------+-------------------+--------+--------+
| 12345 | 10 | 0 | 5 | 0 |
| 12345 | 20 | 1 | 5 | 0 |
| 12345 | 30 | 5 | 5 | 3 |
| 12345 | 40 | 5 | 5 | 3 |
| 12345 | 60 | 5 | 5 | 3 |
| 12345 | 70 | 5 | 5 | 3 |
+-------+--------+-------------------+--------+--------+
Other Possible Result:
+-------+--------+-------------------+--------+-----------+
| JobNo | StepNo | sumPiecesScrapped | maxSum | RowNum |
+-------+--------+-------------------+--------+-----------+
| 12345 | 10 | 0 | 5 | 0 |
| 12345 | 20 | 1 | 5 | 0 |
| 12345 | 30 | 5 | 5 | Something |
| 12345 | 40 | 5 | 5 | 0 |
| 12345 | 60 | 5 | 5 | 0 |
| 12345 | 70 | 5 | 5 | 0 |
+-------+--------+-------------------+--------+-----------+
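One possible way to get the first desired result without temp tables is to take MIN() over a window of the row numbers that qualify. The sketch below is not from the original thread. It runs on SQLite (3.25 or later, for window functions) through Python as a stand-in for the real server, uses one hypothetical ticket row per step, and takes maxSum as the largest per-step sum, which is an assumption (the query above uses MAX(PiecesScrapped) on the raw rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TimeTicketDet (JobNo TEXT, StepNo INTEGER, PiecesScrapped INTEGER)"
)
# Hypothetical rows chosen to reproduce the sums shown in the question.
conn.executemany(
    "INSERT INTO TimeTicketDet VALUES (?, ?, ?)",
    [("12345", 10, 0), ("12345", 20, 1), ("12345", 30, 5),
     ("12345", 40, 5), ("12345", 60, 5), ("12345", 70, 5)],
)

query = """
WITH step_sums AS (
    SELECT JobNo, StepNo, SUM(PiecesScrapped) AS sumPiecesScrapped
    FROM TimeTicketDet
    GROUP BY JobNo, StepNo
),
numbered AS (
    SELECT JobNo, StepNo, sumPiecesScrapped,
           MAX(sumPiecesScrapped) OVER (PARTITION BY JobNo) AS maxSum,
           ROW_NUMBER() OVER (PARTITION BY JobNo ORDER BY StepNo) AS rn
    FROM step_sums
)
SELECT JobNo, StepNo, sumPiecesScrapped, maxSum,
       CASE WHEN sumPiecesScrapped = maxSum
            -- smallest row number among the rows that hit the maximum
            THEN MIN(CASE WHEN sumPiecesScrapped = maxSum THEN rn END)
                 OVER (PARTITION BY JobNo)
            ELSE 0
       END AS RowNum
FROM numbered
WHERE JobNo = '12345'
ORDER BY StepNo
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Steps 30 through 70 all get RowNum 3, and steps 10 and 20 get 0. Only SELECT statements and CTEs are used, so this should also work from a read-only account on engines that support window functions.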

Update using Self Join Sql Server

I have a huge table; a sample of it looks like this:
+-----------+------------+-----------+-----------+
| Unique_ID | Date | RowNumber | Flag_Date |
+-----------+------------+-----------+-----------+
| 1 | 6/3/2014 | 1 | 6/3/2014 |
| 1 | 5/22/2015 | 2 | NULL |
| 1 | 6/3/2015 | 3 | NULL |
| 1 | 11/20/2015 | 4 | NULL |
| 2 | 2/25/2014 | 1 | 2/25/2014 |
| 2 | 7/31/2014 | 2 | NULL |
| 2 | 8/26/2014 | 3 | NULL |
+-----------+------------+-----------+-----------+
Now I need to check the difference between the Date in the 2nd row and the Flag_Date in the 1st row. If the difference is more than 180 days, the 2nd row's Flag_Date should be set to the Date in the 2nd row; otherwise it should be set to the Flag_Date of the 1st row. The same rule applies to all subsequent rows with the same Unique_ID.
update a
set a.Flag_Date=case when DATEDIFF(dd,b.Flag_Date,a.[Date])>180 then a.[Date] else b.Flag_Date end
from Table1 a
inner join Table1 b
on a.RowNumber=b.RowNumber+1 and a.Unique_ID=b.Unique_ID
When the above update query is executed once, only the second row under each Unique_ID gets updated, and the result looks like this:
+-----------+------------+-----------+------------+
| Unique_ID | Date | RowNumber | Flag_Date |
+-----------+------------+-----------+------------+
| 1 | 2014-06-03 | 1 | 2014-06-03 |
| 1 | 2015-05-22 | 2 | 2015-05-22 |
| 1 | 2015-06-03 | 3 | NULL |
| 1 | 2015-11-20 | 4 | NULL |
| 2 | 2014-02-25 | 1 | 2014-02-25 |
| 2 | 2014-07-31 | 2 | 2014-02-25 |
| 2 | 2014-08-26 | 3 | NULL |
+-----------+------------+-----------+------------+
I need to run it four times to achieve my desired result:
+-----------+------------+-----------+------------+
| Unique_ID | Date | RowNumber | Flag_Date |
+-----------+------------+-----------+------------+
| 1 | 2014-06-03 | 1 | 2014-06-03 |
| 1 | 2015-05-22 | 2 | 2015-05-22 |
| 1 | 2015-06-03 | 3 | 2015-05-22 |
| 1 | 2015-11-20 | 4 | 2015-11-20 |
| 2 | 2014-02-25 | 1 | 2014-02-25 |
| 2 | 2014-07-31 | 2 | 2014-02-25 |
| 2 | 2014-08-26 | 3 | 2014-08-26 |
+-----------+------------+-----------+------------+
Is there a way to run the update only once and have all the rows updated?
Thank you!
If you are using SQL Server 2012+, then you can use lag():
with toupdate as (
select t1.*,
lag(flag_date) over (partition by unique_id order by rownumber) as prev_flag_date
from table1 t1
)
update toupdate
set Flag_Date = (case when DATEDIFF(day, prev_Flag_Date, toupdate.[Date]) > 180
then toupdate.[Date] else prev_Flag_Date
end);
Both this version and your version can take advantage of an index on table1(unique_id, rownumber) or, better yet, table1(unique_id, rownumber, flag_date).
EDIT:
In earlier versions, this might have better performance:
with toupdate as (
select t1.*, t2.flag_date as prev_flag_date
from table1 t1 outer apply
(select top 1 t2.flag_date
from table1 t2
where t2.unique_id = t1.unique_id and
t2.rownumber < t1.rownumber
order by t2.rownumber desc
) t2
)
update toupdate
set Flag_Date = (case when DATEDIFF(day, prev_Flag_Date, toupdate.[Date]) > 180
then toupdate.[Date] else prev_Flag_Date
end);
The CTE can make use of the same index, and it is important to have the index. The reason for the better performance is that your join on row_number() cannot use an index on that field.
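Each row's Flag_Date depends on the flag computed for the previous row, so the rule in the question is a running computation. A single-pass window function such as lag() only sees the Flag_Date values stored before the update, which means the dependency on the previously computed flag has to be carried along explicitly. Here is a minimal Python sketch of that carry-forward logic on the sample data. The function name fill_flag_dates is hypothetical, and the first row of each Unique_ID is assumed to keep its own Date as the flag, as in the sample.

```python
from datetime import date

# (Unique_ID, Date, RowNumber) taken from the sample table in the question.
rows = [
    (1, date(2014, 6, 3), 1),
    (1, date(2015, 5, 22), 2),
    (1, date(2015, 6, 3), 3),
    (1, date(2015, 11, 20), 4),
    (2, date(2014, 2, 25), 1),
    (2, date(2014, 7, 31), 2),
    (2, date(2014, 8, 26), 3),
]

def fill_flag_dates(rows, threshold_days=180):
    """Return (Unique_ID, Date, RowNumber, Flag_Date) tuples in one pass."""
    result = []
    last_flag = {}  # Unique_ID -> most recently computed Flag_Date
    for uid, current, rownum in sorted(rows, key=lambda r: (r[0], r[2])):
        previous = last_flag.get(uid)
        if previous is None:
            # Assumption: the first row of each ID flags its own date.
            flag = current
        elif (current - previous).days > threshold_days:
            flag = current   # gap exceeds the threshold: start a new flag
        else:
            flag = previous  # otherwise carry the previous flag forward
        last_flag[uid] = flag
        result.append((uid, current, rownum, flag))
    return result

flagged = fill_flag_dates(rows)
for row in flagged:
    print(row)
```

The computed flags match the desired result table in the question. In SQL the same dependency is usually expressed with a recursive CTE or a loop rather than a single window function.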

Getting 0 rows returned on query

Let me start with the basic layout of all the tables involved:
#zip_code_time_zone
+----+----------+-----------+
| id | zip_code | time_zone |
+----+----------+-----------+
| 1 | 00544 | -1 |
| 2 | 00601 | -3 |
| 3 | 00602 | 0 |
| 4 | 00603 | -3 |
| 5 | 00604 | 0 |
+----+----------+-----------+
#pricing_record
+------+---------------+--------------------+
| id | location_code | service_center_zip |
+------+---------------+--------------------+
| 7119 | TX725 | 79714 |
| 7121 | TX734 | 75409 |
| 7122 | TX737 | 78019 |
| 7124 | TX742 | 75241 |
| 7126 | TX751 | 77494 |
+------+---------------+--------------------+
#transaction_record
+----+-----------------+------------------+--------------+--------------+
| id | truck_stop_code | create_date | gps_verified | central_time |
+----+-----------------+------------------+--------------+--------------+
| 1 | CA428 | 05/01/2015 14:52 | 0 | NULL |
| 2 | CA343 | 05/01/2015 19:10 | 0 | NULL |
| 3 | CA223 | 05/01/2015 09:28 | 0 | NULL |
| 4 | CA721 | 05/01/2015 07:55 | 0 | NULL |
| 5 | MN336 | 05/01/2015 06:46 | 0 | NULL |
+----+-----------------+------------------+--------------+--------------+
When I was working on this project an issue was noticed with the create_date column in transaction_record. It needs to be converted to central time, so I wrote an update query, but I have been unable to successfully set the central_time column. My query is below:
query
UPDATE t
SET t.central_time = DATEADD(hour, z.time_zone,CONVERT(DATETIME, t.create_date, 120))
FROM eagle_devel.dbo.zip_code_time_zone z
INNER JOIN eagle_devel.dbo.pricing_record p ON z.zip_code = p.service_center_zip
INNER JOIN eagle_devel.dbo.transaction_record t ON t.truck_stop_code = p.location_code
This is what I get when I run the query:
(0 row(s) affected)
NOTES
The time_zone column in #zip_code_time_zone is not a standard UTC offset; it is the number of hours to add to convert to Central time.
I am still working on this, and am just looking for some extra assistance in case someone else can fix it faster than I can.
Try this instead, with a small change: put the table you are updating directly in the FROM clause, then adjust the JOINs accordingly.
UPDATE t
SET t.central_time = DATEADD(hour, z.time_zone,CONVERT(DATETIME, t.create_date, 120))
FROM eagle_devel.dbo.transaction_record t
INNER JOIN eagle_devel.dbo.pricing_record p ON t.truck_stop_code = p.location_code
INNER JOIN eagle_devel.dbo.zip_code_time_zone z ON z.zip_code = p.service_center_zip
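When an UPDATE built on inner joins reports 0 rows affected, it can help to count how many rows survive each join on its own. The sketch below does that with Python's sqlite3 module as a stand-in for SQL Server (an assumption), loaded with only the sample rows shown in the question. The real tables hold more data, so the actual counts may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE zip_code_time_zone (id INTEGER, zip_code TEXT, time_zone INTEGER);
CREATE TABLE pricing_record (id INTEGER, location_code TEXT, service_center_zip TEXT);
CREATE TABLE transaction_record (id INTEGER, truck_stop_code TEXT, create_date TEXT);
INSERT INTO zip_code_time_zone VALUES
  (1, '00544', -1), (2, '00601', -3), (3, '00602', 0), (4, '00603', -3), (5, '00604', 0);
INSERT INTO pricing_record VALUES
  (7119, 'TX725', '79714'), (7121, 'TX734', '75409'), (7122, 'TX737', '78019'),
  (7124, 'TX742', '75241'), (7126, 'TX751', '77494');
INSERT INTO transaction_record VALUES
  (1, 'CA428', '05/01/2015 14:52'), (2, 'CA343', '05/01/2015 19:10'),
  (3, 'CA223', '05/01/2015 09:28'), (4, 'CA721', '05/01/2015 07:55'),
  (5, 'MN336', '05/01/2015 06:46');
""")

# How many transactions find a pricing record for their truck stop?
txn_to_pricing = conn.execute("""
    SELECT COUNT(*)
    FROM transaction_record t
    JOIN pricing_record p ON t.truck_stop_code = p.location_code
""").fetchone()[0]

# How many pricing records find a time zone for their service center zip?
pricing_to_zip = conn.execute("""
    SELECT COUNT(*)
    FROM pricing_record p
    JOIN zip_code_time_zone z ON z.zip_code = p.service_center_zip
""").fetchone()[0]

print("transaction -> pricing matches:", txn_to_pricing)
print("pricing -> zip matches:", pricing_to_zip)
```

With just the sample rows, both counts are 0, so neither join finds a match. Running the two equivalent SELECT COUNT(*) queries on the real tables should show which join condition is dropping the rows.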