Duplicates to be removed in SQL

I have records in a database.
My SQL:
SELECT
DISTINCT name, date(mod_wr)
FROM
test.object_stg
WHERE
ir = '4552724'
GROUP BY
name, date(mod_wr)
ORDER BY name
The last record is the same as the second-to-last one; only the date differs.
Is it possible to write the query so that it returns only the records where the "name" column changed?
Records 4 and 5 have the same name and differ only in the date. I would like only one of records 4 and 5 to be returned, because the name did not change.

If you don't want to remove rows where a value is reused later (e.g. your line #2), you can use LAG() and include only the rows whose value differs from the previous row. E.g.
select name, date(mod_wr) from
(
SELECT
name, mod_wr, lag(name) over(order by mod_wr) as prev_name
FROM
test.object_stg
WHERE
ir = '4552724'
) AS t
WHERE prev_name IS NULL OR name <> prev_name
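The LAG() approach above can be tried end to end with a small sketch. This one uses Python's built-in sqlite3 module (assuming SQLite 3.25 or newer, which added window functions). The sample rows, the schema-less table name object_stg and the function name rows_where_name_changed are made up for illustration; they are not the asker's real data.

```python
import sqlite3

def rows_where_name_changed(rows):
    """Return (name, date) for rows whose name differs from the previous row."""
    conn = sqlite3.connect(":memory:")
    # Assumed minimal schema: only the columns the query touches.
    conn.execute("CREATE TABLE object_stg (ir TEXT, name TEXT, mod_wr TEXT)")
    conn.executemany("INSERT INTO object_stg VALUES (?, ?, ?)", rows)
    query = """
        SELECT name, date(mod_wr)
        FROM (
            SELECT name, mod_wr,
                   LAG(name) OVER (ORDER BY mod_wr) AS prev_name
            FROM object_stg
            WHERE ir = '4552724'
        ) AS t
        WHERE prev_name IS NULL OR name <> prev_name
        ORDER BY mod_wr
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Hypothetical sample: BMW is reused later, 1.0 GLS repeats back to back.
sample = [
    ("4552724", "BMW",     "2019-01-01 10:00:00"),
    ("4552724", "1.0 GL",  "2019-02-01 10:00:00"),
    ("4552724", "BMW",     "2019-03-01 10:00:00"),
    ("4552724", "1.0 GLS", "2019-04-01 10:00:00"),
    ("4552724", "1.0 GLS", "2019-05-01 10:00:00"),
]

for row in rows_where_name_changed(sample):
    print(row)
```

With this sample, the second BMW row is kept because the name changed in between, while the repeated 1.0 GLS row is dropped.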

From your sample data, you have 3 distinct names. However, you cannot use distinct in your select statement, because it applies to every field listed and the dates differ from row to row, so no two rows would match exactly.
Instead, you can use a group by statement to collapse the rows for each name into one.
-- MySQL 5.6 statement
select name, date(mod_wr) from object_stg group by name;
-- MSSQL 2017 statement
select name, max(mod_wr) from object_stg group by name;
Both statements return 3 rows (BMW, 1.0 GL and 1.0 GLS), each with a single date.
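As a rough check of the GROUP BY variant with an aggregate on the date, here is a sketch using Python's sqlite3 module. The rows and the function name latest_date_per_name are assumptions for illustration only.

```python
import sqlite3

def latest_date_per_name(rows):
    """Return one (name, latest date) pair per distinct name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE object_stg (name TEXT, mod_wr TEXT)")
    conn.executemany("INSERT INTO object_stg VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT name, MAX(date(mod_wr)) FROM object_stg "
        "GROUP BY name ORDER BY name"
    ).fetchall()
    conn.close()
    return result

# Hypothetical rows; BMW appears twice with other names in between.
grouped_sample = [
    ("BMW",     "2019-01-01 10:00:00"),
    ("1.0 GL",  "2019-02-01 10:00:00"),
    ("BMW",     "2019-03-01 10:00:00"),
    ("1.0 GLS", "2019-04-01 10:00:00"),
    ("1.0 GLS", "2019-05-01 10:00:00"),
]

for row in latest_date_per_name(grouped_sample):
    print(row)
```

Note that grouping merges both BMW rows into one, so this only fits if a name that comes back later should not count as a new change.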

Related

field subtraction sql server

I would like to subtract the rows of one table from those of another.
For example, A has 11 rows described as 'Faktura zakupu' and B has 5 rows described as 'Faktura zakupu'. I would like to get back 6 'Faktura zakupu' rows (11 - 5 = 6).
I tried the EXCEPT operation, but it does not return the desired result.
What operation do I need to perform?
You can add a row number to each row in both tables. SQL Server can then determine that the first (Faktura zakupu, Original) in table A is a duplicate of the first (Faktura zakupu, Original) in table B and remove it during the EXCEPT operation:
SELECT Name, StatusReq, ROW_NUMBER() OVER (PARTITION BY Name, StatusReq ORDER BY (SELECT NULL))
FROM a
EXCEPT
SELECT Name, StatusReq, ROW_NUMBER() OVER (PARTITION BY Name, StatusReq ORDER BY (SELECT NULL))
FROM b
It'll return 6 rows from table A... numbered 6 through 11.
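To see the 11 - 5 = 6 arithmetic play out, here is a small sketch in Python with sqlite3 (assuming SQLite 3.25+ for window functions). The function name multiset_except is hypothetical. SQLite lets ROW_NUMBER() omit the ORDER BY inside OVER, so the (SELECT NULL) placeholder that SQL Server needs is left out here.

```python
import sqlite3

def multiset_except(a_rows, b_rows):
    """Rows of a minus rows of b, counting duplicates instead of collapsing them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (Name TEXT, StatusReq TEXT)")
    conn.execute("CREATE TABLE b (Name TEXT, StatusReq TEXT)")
    conn.executemany("INSERT INTO a VALUES (?, ?)", a_rows)
    conn.executemany("INSERT INTO b VALUES (?, ?)", b_rows)
    query = """
        SELECT Name, StatusReq,
               ROW_NUMBER() OVER (PARTITION BY Name, StatusReq) AS rn
        FROM a
        EXCEPT
        SELECT Name, StatusReq,
               ROW_NUMBER() OVER (PARTITION BY Name, StatusReq)
        FROM b
        ORDER BY 3  -- sort by the row number column
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

row = ("Faktura zakupu", "Original")
remaining = multiset_except([row] * 11, [row] * 5)
print(len(remaining))
print([rn for _, _, rn in remaining])
```

The numbering makes each duplicate distinct, so EXCEPT only cancels the copies that exist in both tables.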

How to remove duplicate data from a Microsoft SQL database (in the result only)

The code column contains duplicate values, and I want to remove the rows that duplicate a code.
It doesn't matter if other columns contain duplicates; I want to deduplicate based on the code column only. What SQL query can I use? Thank you.
This is the table I am working with.
As you can see, some rows have an isdeleted value of 1. I only want the records where isdeleted is 0.
Here is a sample record: row 1 has an isdeleted value of 1, which means that this record is deleted, so I only need row 2 for this code.
You could use the window function ROW_NUMBER() to single out the latest entry per code, as in:
SELECT code, shortdesc, longdesc, isobsolete, effectivefromdate
FROM (
SELECT ROW_NUMBER() OVER(PARTITION BY code ORDER BY effectivefromdate DESC) AS rn, *
FROM CodingSuite_STG
WHERE isobsolete=1 AND isdeleted=0
) AS cs
WHERE rn=1
ORDER BY effectivefromdate
Explanation:
The core of the operation is a "sub-query". That is a "table-like" expression generated by surrounding a SELECT statement with parentheses and following it with a table alias, like:
( SELECT * FROM CodingSuite_STG WHERE isobsolete=1 ) AS cs
For the outer SELECT it will appear like a table with the name "cs".
Within this sub-query I placed a special function (a "window function") consisting of two parts:
ROW_NUMBER() OVER ( PARTITION BY code ORDER BY effectivefromdate DESC) AS rn
The ROW_NUMBER() function returns a sequential number for a certain "window" of records defined by the immediately following OVER ( ... ) clause. The PARTITION BY inside it defines a group division scheme (similar to GROUP BY), so the row numbers start from 1 for each partitioned group. ORDER BY determines the numbering order within each group. So, with entries having the same code value ROW_NUMBER() will supply the number sequence 1, 2, 3... for each record, with 1 being assigned to the record with the highest value of effectivefromdate because of ORDER BY effectivefromdate DESC.
All we need to do in the outer SELECT clause is to pick up those records from the sub-query cs that have an rn-value of 1 and we're done!
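The explanation above can be checked with a small sketch, again using Python's sqlite3 module (SQLite 3.25+ assumed). The table here has only a subset of the columns, and the rows and the function name latest_per_code are invented for illustration.

```python
import sqlite3

def latest_per_code(rows):
    """Return the newest non-deleted row for each code."""
    conn = sqlite3.connect(":memory:")
    # Assumed, reduced schema; the real table has more columns.
    conn.execute(
        "CREATE TABLE CodingSuite_STG ("
        "code TEXT, shortdesc TEXT, isobsolete INTEGER, "
        "isdeleted INTEGER, effectivefromdate TEXT)"
    )
    conn.executemany("INSERT INTO CodingSuite_STG VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT code, shortdesc, effectivefromdate
        FROM (
            SELECT ROW_NUMBER() OVER (
                       PARTITION BY code
                       ORDER BY effectivefromdate DESC
                   ) AS rn, *
            FROM CodingSuite_STG
            WHERE isobsolete = 1 AND isdeleted = 0
        ) AS cs
        WHERE rn = 1
        ORDER BY effectivefromdate
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Hypothetical rows: A01 has a deleted row and two live ones, C03 is only deleted.
code_rows = [
    ("A01", "old desc",  1, 1, "2018-01-01"),
    ("A01", "mid desc",  1, 0, "2019-01-01"),
    ("A01", "new desc",  1, 0, "2020-01-01"),
    ("B02", "only desc", 1, 0, "2017-06-01"),
    ("C03", "deleted",   1, 1, "2021-01-01"),
]

for row in latest_per_code(code_rows):
    print(row)
```

Because the isdeleted filter runs inside the sub-query, deleted rows never receive a row number, so a code whose only rows are deleted drops out entirely.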

Design select SQL query

I have a table case in which each case_id can have one of three expected values: Serious, Non-Serious or Unknown.
select case_id, case_seriousness
from case;
I have to build a SQL query that shows one row per case_id.
If a case_id has rows with multiple values, only one row should appear, chosen by priority: Serious, then Non-Serious, then Unknown.
E.g. if Serious is in one row and the other four rows have Non-Serious or Unknown, then Serious is the value to show in the single record.
If there are records with Non-Serious and Unknown, then Non-Serious should appear.
So the priority order is S, NS, then UK.
You can use an analytic function as follows:
select case_id, case_seriousness
from
(select case_id, case_seriousness,
row_number() over (partition by case_id
order by case case_seriousness
when 'Serious' then 1
when 'Non-Serious' then 2
else 3
end ) as rn
from case)
where rn = 1;
Alternatively, you can use DECODE instead of CASE..WHEN.
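A quick way to try the priority ordering is the following sketch with Python's sqlite3 module (SQLite 3.25+ assumed). Since CASE is a keyword, the table is renamed to cases here; that name, the sample rows and the function name top_seriousness are assumptions for illustration.

```python
import sqlite3

def top_seriousness(rows):
    """Return one (case_id, case_seriousness) row per case_id, by priority."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cases (case_id INTEGER, case_seriousness TEXT)")
    conn.executemany("INSERT INTO cases VALUES (?, ?)", rows)
    query = """
        SELECT case_id, case_seriousness
        FROM (
            SELECT case_id, case_seriousness,
                   ROW_NUMBER() OVER (
                       PARTITION BY case_id
                       ORDER BY CASE case_seriousness
                                    WHEN 'Serious' THEN 1
                                    WHEN 'Non-Serious' THEN 2
                                    ELSE 3
                                END
                   ) AS rn
            FROM cases
        ) AS ranked
        WHERE rn = 1
        ORDER BY case_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Hypothetical rows covering each combination from the question.
case_rows = [
    (1, "Unknown"), (1, "Serious"), (1, "Non-Serious"),
    (2, "Unknown"), (2, "Non-Serious"),
    (3, "Unknown"),
]

for row in top_seriousness(case_rows):
    print(row)
```

The CASE expression only maps each label to a sort rank, so adding another level later means adding one more WHEN branch.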

Group by question in SQL Server, migration from MySQL

I failed to find a solution to my problem and would love your help.
~~ Post has been edited to have only one question ~~
Group by one column while selecting multiple columns.
In MySQL you can group by whatever you want and still select all columns. For example, say I want the newest 100 transactions grouped by Email (only the last transaction for each email).
In MySQL I would do this:
SELECT * FROM db.transactionlog
group by Email
order by TransactionLogId desc
LIMIT 100;
In SQL Server it's not possible. Googling a bit suggested wrapping each column I want in an aggregate as a hack, but couldn't that cause a mix of values (mixing columns between the grouped rows)?
For example:
SELECT TOP(100)
Email,
MAX(ResultCode) as 'ResultCode',
MAX(Amount) as 'Amount',
MAX(TransactionLogId) as 'TransactionLogId'
FROM [db].[dbo].[transactionlog]
group by Email
order by TransactionLogId desc
TransactionLogId is the primary key and an identity column; I order by it to get the last inserted row.
I just want to know whether the ResultCode and Amount I get from such a query will be those of the last inserted row, and not the highest values among the grouped rows or something else.
~Edit~
Sample data -
row1:
Email: test@email.com
ResultCode: 100
Amount: 27
TransactionLogId: 1
row2:
Email: test@email.com
ResultCode: 50
Amount: 10
TransactionLogId: 2
Using the sample data above, my goal is to get the row details of
TransactionLogId = 2.
But what actually happens is that I get mixed values from the two rows: I do get TransactionLogId = 2, but with the ResultCode and Amount of the first row.
How do I avoid that?
Thanks.
You should first find the latest transaction log id for each email, then join back against the same table to retrieve the full record:
;WITH MaxTransactionByEmail AS
(
SELECT
Email,
MAX(TransactionLogId) as LatestTransactionLogId
FROM
[db].[dbo].[transactionlog]
group by
Email
)
SELECT
T.*
FROM
[db].[dbo].[transactionlog] AS T
INNER JOIN MaxTransactionByEmail AS M ON T.TransactionLogId = M.LatestTransactionLogId
You are currently getting mixed results because aggregate functions like MAX() consider all rows that correspond to a particular value of Email, and each column is aggregated independently. So the MAX() value for the Amount column between values 10 and 27 is 27, even though it comes from the row with the lower transaction log id.
Another solution is using a ROW_NUMBER() window function to get a row-ranking by each Email, then just picking the first row:
;WITH TransactionsRanking AS
(
SELECT
T.*,
MostRecentTransactionLogRanking = ROW_NUMBER() OVER (
PARTITION BY
T.Email -- Start a different ranking for each different value of Email
ORDER BY
T.TransactionLogId DESC) -- Order the rows by the TransactionLogID descending
FROM
[db].[dbo].[transactionlog] AS T
)
SELECT
T.*
FROM
TransactionsRanking AS T
WHERE
T.MostRecentTransactionLogRanking = 1
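The difference between per-column MAX() and picking a whole row can be reproduced with the two sample rows from the question. This is a sketch in Python with sqlite3 (SQLite 3.25+ assumed); the table is simplified to four columns and the function names per_column_max and latest_per_email are hypothetical.

```python
import sqlite3

# The two sample rows from the question: Email, ResultCode, Amount, TransactionLogId.
ROWS = [
    ("test@email.com", 100, 27, 1),
    ("test@email.com", 50, 10, 2),
]

def _load(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactionlog ("
        "Email TEXT, ResultCode INTEGER, Amount INTEGER, "
        "TransactionLogId INTEGER PRIMARY KEY)"
    )
    conn.executemany("INSERT INTO transactionlog VALUES (?, ?, ?, ?)", rows)
    return conn

def per_column_max(rows):
    """The 'hack': every column aggregated on its own, so values can mix."""
    conn = _load(rows)
    result = conn.execute(
        "SELECT Email, MAX(ResultCode), MAX(Amount), MAX(TransactionLogId) "
        "FROM transactionlog GROUP BY Email"
    ).fetchall()
    conn.close()
    return result

def latest_per_email(rows):
    """Whole row with the highest TransactionLogId for each Email."""
    conn = _load(rows)
    result = conn.execute(
        """
        SELECT Email, ResultCode, Amount, TransactionLogId
        FROM (
            SELECT t.*,
                   ROW_NUMBER() OVER (
                       PARTITION BY Email
                       ORDER BY TransactionLogId DESC
                   ) AS rn
            FROM transactionlog AS t
        ) AS ranked
        WHERE rn = 1
        ORDER BY TransactionLogId DESC
        LIMIT 100
        """
    ).fetchall()
    conn.close()
    return result

print(per_column_max(ROWS))    # mixes values from both rows
print(latest_per_email(ROWS))  # values all come from row 2
```

LIMIT 100 stands in for SQL Server's TOP(100) here, since SQLite does not support TOP.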

Determine the number of times a null value occurs in column B for a distinct value in column A, SQL table

I have a SQL table with "name" as one column, date as another, and location as a third. The location column supports null values.
I am trying to write a query to determine the number of times a null value occurs in the location column for each distinct value in the name column.
Can someone please assist?
One method uses conditional aggregation:
select name, sum(case when location is null then 1 else 0 end)
from t
group by name;
Another method that involves slightly less typing is:
select name, count(*) - count(location)
from t
group by name;
Use count along with a filter, since you only need the NULL occurrences:
select name, count(*) occurances
from mytable
where location is null
group by name
From your question, you'll want to get a distinct list of all different 'name' rows, and then you would like a count of how many NULLs there are per each name.
The following will achieve this:
SELECT name, count(*) as null_counts
FROM mytable
WHERE location IS NULL
GROUP BY name
The WHERE clause retrieves only the records whose location is NULL.
The GROUP BY groups those records by name.
The SELECT returns each name together with COUNT(*), the number of NULL-location records for that name. Names that have no NULL locations do not appear in the result.
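The three variants in this thread can be compared side by side with a short sketch using Python's sqlite3 module. The table name t, the sample rows and the function name null_counts are assumptions for illustration.

```python
import sqlite3

QUERIES = {
    # Conditional aggregation: every name appears, even with zero NULLs.
    "conditional": (
        "SELECT name, SUM(CASE WHEN location IS NULL THEN 1 ELSE 0 END) "
        "FROM t GROUP BY name ORDER BY name"
    ),
    # COUNT(location) skips NULLs, so the difference is the NULL count.
    "count_diff": (
        "SELECT name, COUNT(*) - COUNT(location) "
        "FROM t GROUP BY name ORDER BY name"
    ),
    # Filtering first: names without any NULL location are not returned.
    "where_filter": (
        "SELECT name, COUNT(*) FROM t "
        "WHERE location IS NULL GROUP BY name ORDER BY name"
    ),
}

def null_counts(rows, method):
    """Run one of the three counting queries against the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (name TEXT, date TEXT, location TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERIES[method]).fetchall()
    conn.close()
    return result

# Hypothetical rows: alice has two NULL locations, bob has none.
people = [
    ("alice", "2020-01-01", None),
    ("alice", "2020-01-02", "Paris"),
    ("alice", "2020-01-03", None),
    ("bob",   "2020-01-01", "Rome"),
]

for method in QUERIES:
    print(method, null_counts(people, method))
```

Which variant fits depends on whether names with zero NULLs should show up with a count of 0 or be left out.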