Select from duplicated data into a reduced table with unique lines - sql

I have created a joined table in MS SQL Server, and it contains several duplicate rows. I now need to select from this table so that there is exactly one row for each (Item, Recall_Date) pair, based on specific selection criteria:
Here is the visual clarification of what I need as pick criteria:
Basically, my selection criteria are as follows:
If there are rows with PingPong_FE = 1 and PingPong_Replen = 1, then pick that row;
else if there are rows with PingPong_FE = 0 and PingPong_Replen = 1, then pick that row;
else if there are rows with PingPong_FE = 1 and PingPong_Replen = 0, then pick that row;
else if there are rows with PingPong_FE = 0 and PingPong_Replen = 0, then pick that row,
and write it into the output table.
What should my SQL query look like?

You can combine the PingPong_Replen and PingPong_FE columns into a single score and keep the row that has the maximum value. A plain sum would tie (FE = 0, Replen = 1) with (FE = 1, Replen = 0), so weight PingPong_Replen by 2 to preserve the priority order in the question.
Try this:
select * from
(
    SELECT t.item, t.recall_date, t.fe_date, max(2 * t.PingPong_Replen + t.PingPong_FE) AS maxPPval
    FROM tableInput t
    GROUP BY t.item, t.recall_date, t.fe_date
) t1,
tableInput t2
where t2.item = t1.item
and t2.recall_date = t1.recall_date
and t1.fe_date = t2.fe_date
and t1.maxPPval = (2 * t2.PingPong_Replen + t2.PingPong_FE)

Maybe like this, ordering each group so that the preferred row comes first:
WITH CTE AS (
    SELECT Item, Recall_Date, PingPong_FE, PingPong_Replen,
           ROW_NUMBER() OVER (PARTITION BY Item, Recall_Date
                              ORDER BY PingPong_Replen DESC, PingPong_FE DESC) AS rn
    FROM Yourtable
)
SELECT * FROM CTE WHERE rn = 1;
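A minimal, runnable sketch of the ROW_NUMBER approach, assuming the priority order from the question (PingPong_Replen first, then PingPong_FE). It uses Python's built-in sqlite3 module purely as a harness so the query can be run end to end; the sample rows are made up, and SQLite 3.25 or later is assumed for window functions. The query shape should carry over to SQL Server, but that is not verified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tableInput (Item TEXT, Recall_Date TEXT, "
    "PingPong_FE INTEGER, PingPong_Replen INTEGER)"
)
# Hypothetical sample rows: item A has three candidates, item B has two.
conn.executemany(
    "INSERT INTO tableInput VALUES (?, ?, ?, ?)",
    [
        ("A", "2018-01-01", 1, 0),
        ("A", "2018-01-01", 0, 1),
        ("A", "2018-01-01", 0, 0),
        ("B", "2018-02-01", 0, 0),
        ("B", "2018-02-01", 1, 0),
    ],
)

# Rank the rows of each (Item, Recall_Date) group by priority and keep rank 1.
picked = conn.execute(
    """
    WITH ranked AS (
        SELECT Item, Recall_Date, PingPong_FE, PingPong_Replen,
               ROW_NUMBER() OVER (
                   PARTITION BY Item, Recall_Date
                   ORDER BY PingPong_Replen DESC, PingPong_FE DESC
               ) AS rn
        FROM tableInput
    )
    SELECT Item, Recall_Date, PingPong_FE, PingPong_Replen
    FROM ranked
    WHERE rn = 1
    ORDER BY Item
    """
).fetchall()

print(picked)
```

For item A the (FE = 0, Replen = 1) row wins over (FE = 1, Replen = 0), which matches the second and third rules in the question.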

Related

How To Keep Records in Order from Derived Table

I am trying to update a SQL table from a remote DB2 table. There may be multiple updates for the same record, but I need the updates to be applied in the order in which they appear in the DB2 table. You cannot use ORDER BY in a derived table. I have tried several different options to get this to work, but the updates still do not happen in order.
For example:
Change 1 - CUSTOMER NAME = ABCX
Change 2 - CUSTOMER NAME = ABC
After I run the query, the customer name is ABCX when it should be ABC.
I truly do not know what else to try. I've tried temp tables (still derived table), creating a concatenated field with date and time fields, sub-select, row_number() over(order by date, time) and many other things. I'd like to keep it in order by the date and time fields in the remote table.
Any insight would be appreciated.
Thank you.
Here is the basic code I have:
UPDATE A
SET
A.CUSRID = B.RECORD_ID,
A.CUSSTS = B.ACTIVE_CODE,
A.CUSCOM = B.COMPANY_NUMBER,
A.CUSMNM = B.CUSTOMER_NAME,
A.CUSAD1 = B.CUSTOMER_ADDRESS_1,
A.CUSAD2 = B.CUSTOMER_ADDRESS_2,
A.CUSAD3 = B.CUSTOMER_ADDRESS_3,
A.CUSZIP = B.CUSTOMER_ZIP_CODE,
A.CUSZPE = B.CUSZPE_NOT_USED,
A.CUSSTC = B.CUSTOMER_STATE,
A.CUSARA = B.CUSTOMER_AREA_CODE,
A.CUSPHN = B.CUSTOMER_PHONE,
A.CUSB17 = B.CUSB17_NOT_USED,
A.CUSTMT = B.STATEMENT_PRINT_CODE,
A.CUSCRL = B.CREDIT_LIMIT,
A.CUSCRC = B.CREDIT_CODE,
A.CUSMCD = B.CUSMCD_NOT_USED,
A.CUSTX1 = B.TAX_RATE_1,
A.CUSTX2 = B.TAX_RATE_2,
A.CUSTXC = B.TAX_RATE,
A.CUSTXE = B.TAX_EXEMPT_ID,
A.CUSB48 = B.CUSB48_NOT_USED,
A.CUSMDT = B.MAINTENANCE_DATE,
A.CUSB20 = B.CUSB20_NOT_USED,
A.CSSRCH = B.SEARCH_FIELD,
A.CUSBRN = B.BRANCH_ID,
A.CUSDST = B.DISTRIBUTOR_NUMBER,
A.CUSB28 = B.CUSB28_NOT_USED
FROM
dbo.mcusmas A
INNER JOIN (
SELECT
RECORD_ID,
ACTIVE_CODE,
COMPANY_NUMBER,
CUSTOMER_NUMBER,
CUSTOMER_NAME,
CUSTOMER_ADDRESS_1,
CUSTOMER_ADDRESS_2,
CUSTOMER_ADDRESS_3,
CUSTOMER_ZIP_CODE,
CUSZPE_NOT_USED,
CUSTOMER_STATE,
CUSTOMER_AREA_CODE,
CUSTOMER_PHONE,
CUSB17_NOT_USED,
STATEMENT_PRINT_CODE,
CREDIT_LIMIT,
CREDIT_CODE,
CUSMCD_NOT_USED,
TAX_RATE_1,
TAX_RATE_2,
TAX_RATE,
TAX_EXEMPT_ID,
CUSB48_NOT_USED,
MAINTENANCE_DATE,
CUSB20_NOT_USED,
SEARCH_FIELD,
BRANCH_ID,
DISTRIBUTOR_NUMBER,
CUSB28_NOT_USED
FROM remoteserver.MCUSMASPLG
WHERE Event_State_ID = '*New' AND SENT_TO_DATA_WAREHOUSE = 'N'
) B
ON A.CUSMNB = B.CUSTOMER_NUMBER
There is no "change 1" or "change 2". There are only rows and SQL Server arbitrarily ends up using one of them. If you want to control the rows, you should select the one you want in advance:
FROM dbo.mcusmas A JOIN
     (SELECT R.*,
             ROW_NUMBER() OVER (PARTITION BY CUSTOMER_NUMBER ORDER BY <ordering col> DESC) as seqnum
      FROM remoteserver.MCUSMASPLG R
      WHERE Event_State_ID = '*New' AND
            SENT_TO_DATA_WAREHOUSE = 'N'
     ) B
     ON A.CUSMNB = B.CUSTOMER_NUMBER AND B.seqnum = 1
I don't know how you are determining which row is the right one. Presumably, some column has this information and you can use it in the ORDER BY.
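As a rough illustration of "select the row you want in advance", here is a small sketch run through Python's built-in sqlite3 module. The staging table stands in for the remote DB2 table, and the CHANGE_DATE and CHANGE_TIME columns are hypothetical names for the date and time fields mentioned in the question. SQLite 3.25 or later is assumed for ROW_NUMBER.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mcusmas (CUSMNB INTEGER PRIMARY KEY, CUSMNM TEXT)")
# "staging" is a stand-in for the remote table; the date/time columns are assumed.
conn.execute(
    "CREATE TABLE staging (CUSTOMER_NUMBER INTEGER, CUSTOMER_NAME TEXT, "
    "CHANGE_DATE TEXT, CHANGE_TIME TEXT)"
)
conn.execute("INSERT INTO mcusmas VALUES (1, 'OLD')")
conn.executemany(
    "INSERT INTO staging VALUES (?, ?, ?, ?)",
    [
        (1, "ABCX", "2018-01-01", "09:00:00"),  # change 1
        (1, "ABC", "2018-01-01", "10:30:00"),   # change 2 (latest)
    ],
)

# Keep only the most recent change per customer (seqnum = 1).
latest = conn.execute(
    """
    SELECT CUSTOMER_NUMBER, CUSTOMER_NAME
    FROM (
        SELECT s.*,
               ROW_NUMBER() OVER (
                   PARTITION BY CUSTOMER_NUMBER
                   ORDER BY CHANGE_DATE DESC, CHANGE_TIME DESC
               ) AS seqnum
        FROM staging s
    )
    WHERE seqnum = 1
    """
).fetchall()

# Apply exactly one update per customer, so row order no longer matters.
conn.executemany(
    "UPDATE mcusmas SET CUSMNM = ? WHERE CUSMNB = ?",
    [(name, number) for number, name in latest],
)

final_name = conn.execute(
    "SELECT CUSMNM FROM mcusmas WHERE CUSMNB = 1"
).fetchone()[0]
print(final_name)
```

Because only the latest row per customer survives the filter, the final name is the one from the last change rather than an arbitrary one.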

select subquery using data from the select statement?

I have two tables, headers and lines. I need to grab the batch_submission_date from the header table. Sometimes a query for a batch_id returns null for batch_submission_date but also returns a parent_batch_id; if we then query that parent_batch_id as a batch_id, it returns the correct batch_submission_date.
e.g.
SELECT t1.batch_id,
t1.parent_batch_id,
t2.batch_submission_date
FROM db.headers t1, db.lines t2
WHERE t1.batch_id = '12345';
output = 12345, 99999, null
Then we use that parent batch_id as a batch_id :
SELECT t1.batch_id,
t1.parent_batch_id,
t2.batch_submission_date
FROM db.headers t1, db.lines t2
WHERE t1.batch_id = '99999';
and we get output = 99999,99999,'2018-01-01'
So I'm trying to write a query that will do this for me - anytime a batch_id's batch_submission_date is null, we find that batch_id's parent batch_id and query that instead.
This was my idea - but I just get back null both for bp_batch_submission_date and for new_submission_date.
SELECT
t1.parent_id as parent_id,
t1.BATCH_ID as bp_batch_id,
t2.BATCH_LINE_NUMBER as bp_batch_li,
t1.BATCH_SUBMISSION_DATE as bp_batch_submission_date,
CASE
WHEN t1.BATCH_SUBMISSION_DATE is null
THEN
(SELECT a.BATCH_SUBMISSION_DATE
FROM
db.headers a,
db.lines b
WHERE
a.SD_BATCH_HEADERS_SKEY = b.SD_BATCH_HEADERS_SKEY
and a.parent_batch_id = bp_batch_id
and b.batch_line_number = bp_batch_li
) END as new_submission_date
FROM
db.headers t1,
db.lines t2
WHERE
t1.SD_BATCH_HEADERS_SKEY = t2.SD_BATCH_HEADERS_SKEY
and (t1.BATCH_ID = '12345' or t1.PARENT_BATCH_ID = '12345')
and t2.BATCH_LINE_NUMBER = '1'
GROUP BY
t2.BATCH_CLAIM_LINE_STATUS_DESC,
t1.PARENT_BATCH_ID,
t1.BATCH_ID,
t2.BATCH_LINE_NUMBER,
t1.BATCH_SUBMISSION_DATE;
Is what I'm trying to do possible, using the bp_batch_id and bp_batch_li variables?
Use a CTE (common table expression) to avoid redundant code, then use coalesce() to fall back to the parent's date when the batch's own date is null. Your first queries have no join condition between the two tables; I assumed the join is on sd_batch_headers_skey, as in your last query.
with t as (
select h.batch_id, h.parent_batch_id, l.batch_submission_date bs_date
from headers h
join lines l on l.sd_batch_headers_skey = h.sd_batch_headers_skey
and l.batch_line_number = '1' )
select batch_id, parent_batch_id,
coalesce(bs_date, (select bs_date from t x where x.batch_id = t.parent_batch_id)) bs_date
from t
where batch_id = 12345;
You could use simpler syntax with connect by and level <= 2, but if your data really contains rows where batch_id equals parent_batch_id (99999, 99999), you will get a cycle error.
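The coalesce() fallback can be tried out with a small sketch using Python's built-in sqlite3 module. This is a simplified assumption of the schema: the date lives directly on headers and the join to lines is left out, so only the parent lookup itself is shown. The two rows mirror the 12345 / 99999 example from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema: the submission date is stored on headers.
conn.execute(
    "CREATE TABLE headers (batch_id TEXT, parent_batch_id TEXT, "
    "batch_submission_date TEXT)"
)
conn.executemany(
    "INSERT INTO headers VALUES (?, ?, ?)",
    [
        ("12345", "99999", None),          # child batch without a date
        ("99999", "99999", "2018-01-01"),  # parent batch (its own parent)
    ],
)

# Use the batch's own date when present, otherwise look up the parent's date.
rows = conn.execute(
    """
    SELECT h.batch_id,
           h.parent_batch_id,
           COALESCE(
               h.batch_submission_date,
               (SELECT p.batch_submission_date
                FROM headers p
                WHERE p.batch_id = h.parent_batch_id)
           ) AS bs_date
    FROM headers h
    ORDER BY h.batch_id
    """
).fetchall()

print(rows)
```

The scalar subquery assumes batch_id is unique in headers; if several rows could match, it would need an aggregate or a limit to stay deterministic.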

troubles with next and previous query

I have a list and the returned table looks like this. I took the preview of only one car but there are many more.
What I need to do now is check that the current KM value is larger than the previous one and smaller than the next one. Based on that check, I need to add a field called Trustworthy and fill it with either 1 or 0 (true/false).
The result that I have so far is this:
validKMstand and validkmstand2 are how I calculate it. It did not work in one list so that is why I separated it.
In both of my tries my code does not work.
Here is the code that I have so far.
WITH FullList as (
SELECT
*
FROM
eMK_Mileage as Mileage
)
, ValidChecked1 as (
SELECT
UL1.*,
CASE WHEN EXISTS(
SELECT TOP(1)UL2.*
FROM FullList AS UL2
WHERE
UL2.FK_CarID = UL1.FK_CarID AND
UL1.KM_Date > UL2.KM_Date AND
UL1.KM > UL2.KM
ORDER BY UL2.KM_Date DESC
)
THEN 1
ELSE 0
END AS validkmstand
FROM FullList as UL1
)
, ValidChecked2 as (
SELECT
List1.*,
(CASE WHEN List1.KM > ulprev.KM
THEN 1
ELSE 0
END
) AS validkmstand2
FROM ValidChecked1 as List1 outer apply
(SELECT TOP(1)UL3.*
FROM ValidChecked1 AS UL3
WHERE
UL3.FK_CarID = List1.FK_CarID AND
UL3.KM_Date <= List1.KM_Date AND
List1.KM > UL3.KM
ORDER BY UL3.KM_Date DESC) ulprev
)
SELECT * FROM ValidChecked2 order by FK_CarID, KM_Date
Maybe something like this is what you are looking for?
;with data as
(
select *, rn = row_number() over (partition by fk_carid order by km_date)
from eMK_Mileage
)
select
d.FK_CarID, d.KM, d.KM_Date,
valid =
case
when (d.KM > d_prev.KM /* or d_prev.KM is null */)
and (d.KM < d_next.KM /* or d_next.KM is null */)
then 1 else 0
end
from data d
left join data d_prev on d.FK_CarID = d_prev.FK_CarID and d_prev.rn = d.rn - 1
left join data d_next on d.FK_CarID = d_next.FK_CarID and d_next.rn = d.rn + 1
order by d.FK_CarID, d.KM_Date
With SQL Server 2012 and later you can use the lag() and lead() analytic functions to access the previous/next rows; in earlier versions you can accomplish the same thing by numbering rows within partitions of the set, as above. There are other ways too, such as correlated subqueries.
I left a couple of conditions commented out that deal with the first and last rows for every car. Maybe those rows should be considered valid if they fulfill only one part of the comparison (since the previous/next rows are null)?
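For comparison, a sketch of the lag()/lead() variant, run through Python's built-in sqlite3 module (SQLite 3.25 or later assumed for window functions). The mileage rows are invented, and the first and last readings of a car are treated as valid when their one available neighbour agrees, which is one possible reading of the commented-out conditions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eMK_Mileage (FK_CarID INTEGER, KM INTEGER, KM_Date TEXT)"
)
# Hypothetical readings for one car; the 5000 entry is out of line.
conn.executemany(
    "INSERT INTO eMK_Mileage VALUES (?, ?, ?)",
    [
        (1, 1000, "2018-01-01"),
        (1, 5000, "2018-02-01"),
        (1, 3000, "2018-03-01"),
        (1, 4000, "2018-04-01"),
    ],
)

result = conn.execute(
    """
    SELECT FK_CarID, KM, KM_Date,
           CASE
               WHEN (prev_km IS NULL OR KM > prev_km)
                AND (next_km IS NULL OR KM < next_km)
               THEN 1 ELSE 0
           END AS Trustworthy
    FROM (
        SELECT FK_CarID, KM, KM_Date,
               LAG(KM)  OVER (PARTITION BY FK_CarID ORDER BY KM_Date) AS prev_km,
               LEAD(KM) OVER (PARTITION BY FK_CarID ORDER BY KM_Date) AS next_km
        FROM eMK_Mileage
    )
    ORDER BY FK_CarID, KM_Date
    """
).fetchall()

flags = [row[3] for row in result]
print(flags)
```

Note that this check flags both the 5000 reading and the 3000 reading after it, because each row is only compared with its direct neighbours.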

SELECTing only one copy of a row with a specific key that is coming from multiple tables

I am new to SQL, so bear with me. I am returning data from multiple tables. The following is my SQL (let me know if there is a better approach):
SELECT [NonScrumStory].[IncidentNumber], [NonScrumStory].[Description], [DailyTaskHours].[ActivityDate], [Application].[AppName], [SupportCatagory].[Catagory], [DailyTaskHours].[PK_DailyTaskHours], [NonScrumStory].[PK_NonScrumStory]
FROM [NonScrumStory], [DailyTaskHours], [Application], [SupportCatagory]
WHERE ([NonScrumStory].[UserId] = 26)
AND ([NonScrumStory].[PK_NonScrumStory] = [DailyTaskHours].[NonScrumStoryId])
AND ([NonScrumStory].[CatagoryId] = [SupportCatagory].[PK_SupportCatagory])
AND ([NonScrumStory].[ApplicationId] = [Application].[PK_Application])
AND ([NonScrumStory].[Deleted] != 1)
AND [DailyTaskHours].[ActivityDate] >= '1/1/1990'
ORDER BY [DailyTaskHours].[ActivityDate] DESC
This is what is being returned:
This is nearly correct. I only want it to return one copy of PK_NonScrumStory though and I can't figure out how. Essentially, I only want it to return one copy so one of the top two rows would not be returned.
You could group by the NonScrumStory columns, and then aggregate the other columns like this:
SELECT [NonScrumStory].[IncidentNumber],
[NonScrumStory].[Description],
MAX( [DailyTaskHours].[ActivityDate]),
MAX( [Application].[AppName]),
MAX([SupportCatagory].[Catagory]),
MAX([DailyTaskHours].[PK_DailyTaskHours]),
[NonScrumStory].[PK_NonScrumStory]
FROM [NonScrumStory],
[DailyTaskHours],
[Application],
[SupportCatagory]
WHERE ([NonScrumStory].[UserId] = 26)
AND ([NonScrumStory].[PK_NonScrumStory] = [DailyTaskHours].[NonScrumStoryId])
AND ([NonScrumStory].[CatagoryId] = [SupportCatagory].[PK_SupportCatagory])
AND ([NonScrumStory].[ApplicationId] = [Application].[PK_Application])
AND ([NonScrumStory].[Deleted] != 1)
AND [DailyTaskHours].[ActivityDate] >= '1/1/1990'
group by [NonScrumStory].[IncidentNumber], [NonScrumStory].[Description],[NonScrumStory].[PK_NonScrumStory]
ORDER BY 3 DESC
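A cut-down sketch of the GROUP BY approach, using Python's built-in sqlite3 module as a harness. Only two of the four tables are kept, explicit JOIN syntax replaces the comma join, and the rows are invented; it is meant to show what the aggregates return, not to reproduce the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE NonScrumStory (PK_NonScrumStory INTEGER, IncidentNumber TEXT)"
)
conn.execute(
    "CREATE TABLE DailyTaskHours (PK_DailyTaskHours INTEGER, "
    "NonScrumStoryId INTEGER, ActivityDate TEXT)"
)
conn.executemany(
    "INSERT INTO NonScrumStory VALUES (?, ?)",
    [(1, "INC-1"), (2, "INC-2")],
)
# Story 1 has two task rows, which is what duplicates it in the plain join.
conn.executemany(
    "INSERT INTO DailyTaskHours VALUES (?, ?, ?)",
    [(10, 1, "2018-01-05"), (11, 1, "2018-01-03"), (12, 2, "2018-01-04")],
)

grouped = conn.execute(
    """
    SELECT s.PK_NonScrumStory,
           s.IncidentNumber,
           MAX(h.ActivityDate)      AS LastActivity,
           MAX(h.PK_DailyTaskHours) AS MaxTaskPk
    FROM NonScrumStory s
    JOIN DailyTaskHours h ON h.NonScrumStoryId = s.PK_NonScrumStory
    GROUP BY s.PK_NonScrumStory, s.IncidentNumber
    ORDER BY LastActivity DESC
    """
).fetchall()

print(grouped)
```

Each story now appears once, but for story 1 the MaxTaskPk (11) comes from a different DailyTaskHours row than LastActivity (which belongs to row 10). Aggregating each column separately can mix values from several rows, so this fits best when the columns do not need to come from the same row.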
From the screenshot it seems DISTINCT should have solved your issue but if not you could use the ROW_NUMBER function.
;WITH CTE AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY [NonScrumStory].[PK_NonScrumStory] ORDER BY [DailyTaskHours].[ActivityDate] DESC) AS RowNum,
[NonScrumStory].[IncidentNumber], [NonScrumStory].[Description], [DailyTaskHours].[ActivityDate], [Application].[AppName], [SupportCatagory].[Catagory], [DailyTaskHours].[PK_DailyTaskHours], [NonScrumStory].[PK_NonScrumStory]
FROM [NonScrumStory], [DailyTaskHours], [Application], [SupportCatagory]
WHERE ([NonScrumStory].[UserId] = 26)
AND ([NonScrumStory].[PK_NonScrumStory] = [DailyTaskHours].[NonScrumStoryId])
AND ([NonScrumStory].[CatagoryId] = [SupportCatagory].[PK_SupportCatagory])
AND ([NonScrumStory].[ApplicationId] = [Application].[PK_Application])
AND ([NonScrumStory].[Deleted] != 1)
AND [DailyTaskHours].[ActivityDate] >= '1/1/1990'
)
SELECT * FROM CTE WHERE RowNum = 1 ORDER BY [ActivityDate] DESC
I believe if you add DISTINCT to your query that should solve your problem. Like so
SELECT DISTINCT [NonScrumStory].[IncidentNumber], [NonScrumStory].[Description],...

Fetch unique combinations of two field values

Probably it has been asked before but I cannot find an answer.
Table Data has two columns:
Source Dest
1 2
1 2
2 1
3 1
I am trying to come up with an MS Access 2003 SQL query that will return:
1 2
3 1
But all to no avail. Please help!
UPDATE: Exactly, I'm trying to exclude 2,1 because 1,2 is already included. I need only unique combinations where the order of the two values doesn't matter.
For Ms Access you can try
SELECT DISTINCT
*
FROM Table1 tM
WHERE NOT EXISTS(SELECT 1 FROM Table1 t WHERE tM.Source = t.Dest AND tM.Dest = t.Source AND tm.Source > t.Source)
EDIT:
Example with table Data, which is the same...
SELECT DISTINCT
*
FROM Data tM
WHERE NOT EXISTS(SELECT 1 FROM Data t WHERE tM.Source = t.Dest AND tM.Dest = t.Source AND tm.Source > t.Source)
or (Nice and Access Formatted...)
SELECT DISTINCT *
FROM Data AS tM
WHERE (((Exists (SELECT 1 FROM Data t WHERE tM.Source = t.Dest AND tM.Dest = t.Source AND tm.Source > t.Source))=False));
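To check the NOT EXISTS filter against the sample data from the question, here is a short sketch using Python's built-in sqlite3 module. SQLite stands in for Access here, so it only confirms the logic of the query, not Access-specific syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Data (Source INTEGER, Dest INTEGER)")
# The four rows from the question.
conn.executemany(
    "INSERT INTO Data VALUES (?, ?)",
    [(1, 2), (1, 2), (2, 1), (3, 1)],
)

# Drop a pair when its mirror image exists and the mirror has the smaller Source.
pairs = conn.execute(
    """
    SELECT DISTINCT tM.Source, tM.Dest
    FROM Data tM
    WHERE NOT EXISTS (
        SELECT 1
        FROM Data t
        WHERE tM.Source = t.Dest
          AND tM.Dest = t.Source
          AND tM.Source > t.Source
    )
    ORDER BY tM.Source, tM.Dest
    """
).fetchall()

print(pairs)
```

The pair (2, 1) is removed because (1, 2) exists, while (3, 1) stays because there is no (1, 3) row.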
Your question is phrased incorrectly: "unique combinations" are all of your records. But I think you mean one line per Source, so it is:
SELECT *
FROM tab t1
WHERE t1.Dest IN
(
SELECT DISTINCT TOP 1 t2.Dest
FROM tab t2
WHERE t1.Source = t2.Source
)
SELECT t1.* FROM
(SELECT
LEAST(Source, Dest) AS min_val,
GREATEST(Source, Dest) AS max_val
FROM table_name) AS t1
GROUP BY t1.min_val, t1.max_val
Will return
1, 2
1, 3
in MySQL.
To eliminate duplicates, "select distinct" is easier than "group by":
select distinct source,dest from data;
EDIT: I see now that you're trying to get unique combinations (don't include both 1,2 and 2,1). You can do that like this (MINUS is Oracle syntax; the standard SQL equivalent is EXCEPT):
select distinct source,dest from data
minus
select dest,source from data where source < dest
The "minus" flips the order around and eliminates cases where you already have a match; the "where source < dest" keeps you from removing both (1,2) and (2,1)
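The set-difference idea can be sketched with Python's built-in sqlite3 module. SQLite has no MINUS, so the standard EXCEPT operator is swapped in here; the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (source INTEGER, dest INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(1, 2), (1, 2), (2, 1), (3, 1)],
)

# The second SELECT produces the flipped version of every pair with
# source < dest; EXCEPT then removes those flipped pairs from the result.
remaining = conn.execute(
    """
    SELECT DISTINCT source, dest FROM data
    EXCEPT
    SELECT dest, source FROM data WHERE source < dest
    ORDER BY 1, 2
    """
).fetchall()

print(remaining)
```

Only (2, 1) is subtracted, since it is the flipped form of (1, 2); a pair whose mirror image is absent from the table is left untouched.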
Use this query:
SELECT distinct * from tabval ;