Duplicate checking with like operator

I want to check for duplicate values in my table with the LIKE operator:
select F_Barcode, COUNT(*) as cnt
from T_Assets
group by F_Barcode
having COUNT(*) > 1
This works fine, but some of my duplicated barcodes differ only in their leading zeros, like this:
4456
00004456
and
45552
00045552
These are actually the same barcode, entered twice.
How can I find all of the barcodes that are duplicated in this way?
The F_Barcode data type is nvarchar(50).
Some values, such as 0000frdz, can't be converted to data type int.

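You can cast the barcode to an integer and then group on the result. Note that CAST(... AS UNSIGNED) is MySQL syntax, and the result is only meaningful when every value is numeric.
Just like this:
select CAST(F_Barcode AS UNSIGNED) as Barcode, COUNT(*) as cnt
from T_Assets
group by CAST(F_Barcode AS UNSIGNED)
having COUNT(*) > 1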

Try one of these.
Using SUBSTRING to strip the leading zeros:
select substring(F_Barcode, patindex('%[^0]%',F_Barcode), 100),
COUNT(*) as cnt
From T_Assets
group by substring(F_Barcode, patindex('%[^0]%',F_Barcode), 100)
having COUNT(*) > 1
-- Replace 100 with the declared length of the column
Using CAST (only works if every value is numeric):
select CAST(F_Barcode as int),COUNT(*) as cnt
From T_Assets
group by CAST(F_Barcode as int)
having COUNT(*) > 1
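The leading-zero idea can be tried end to end without a SQL Server instance. The following is only a sketch under stated assumptions: it uses Python's sqlite3 module with an in-memory table standing in for T_Assets, and SQLite's ltrim(X, '0') in place of SUBSTRING/PATINDEX. The sample rows come from the question, plus 1234 as a hypothetical non-duplicate.

```python
import sqlite3

# In-memory SQLite database standing in for the real T_Assets table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T_Assets (F_Barcode TEXT)")
conn.executemany(
    "INSERT INTO T_Assets VALUES (?)",
    [("4456",), ("00004456",), ("45552",), ("00045552",), ("0000frdz",), ("1234",)],
)

# ltrim(X, '0') strips leading zeros only, so non-numeric codes such as
# 0000frdz are compared as text instead of failing a numeric cast.
duplicates = conn.execute(
    """
    SELECT ltrim(F_Barcode, '0') AS code, COUNT(*) AS cnt
    FROM T_Assets
    GROUP BY ltrim(F_Barcode, '0')
    HAVING COUNT(*) > 1
    ORDER BY code
    """
).fetchall()

print(duplicates)
```

One thing to watch with this approach: a barcode made up entirely of zeros is trimmed to an empty string, so all such codes fall into one group.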

Fix a length for the string,
pad it with 0's at the beginning to reach that length,
then count per group.
select RIGHT('000000000'+F_Barcode,9) as F_Barcode, COUNT(*) as cnt
from T_Assets
group by RIGHT('000000000'+F_Barcode,9)
having COUNT(*) > 1
Here I have assumed a string length of 9. You can adjust it to your requirement.
Please note that the number of 0's you prepend to the column must equal the string length you chose.

A possible solution for SQL Server 2012 or above is to use TRY_CONVERT, which returns NULL instead of failing when a value cannot be converted. Barcodes should contain numbers only, but since you are storing them in an NVARCHAR field, non-numeric values can occur.
Setup
create table BarcodeData
(
F_Barcode NVARCHAR(50)
)
insert into BarcodeData VALUES ('4456'), ('00004456'), ('45552'), ('00045552'), ('a45552'), ('1234'), ('0'), ('00001'), ('a45552')
GO
Query
;WITH CleanedBarcode AS (
-- TRY_CONVERT returns NULL for non-numeric codes. The CAST back to NVARCHAR is required because
-- ISNULL would otherwise take the INT type of its first argument and fail on a non-numeric F_Barcode
SELECT ISNULL(CAST(TRY_CONVERT(INT, F_Barcode) AS NVARCHAR(50)), F_Barcode) AS CleanedCode
FROM BarcodeData
)
SELECT CB.CleanedCode, COUNT(*) AS cnt
FROM CleanedBarcode CB
GROUP BY CB.CleanedCode
HAVING COUNT(*) > 1

Related

Sum rows with same identification-number and sort by custom order

I have the following table structure for the table "products":
id amount number
1 10 M6545
2 32 M6424
3 32 M6545
4 49 M6412
... ... ...
I want to select the sum of the amounts of all rows that share the same number, one sum per number. That means:
M6545 -> Sum 42
M6424 -> Sum 32
M6412 -> Sum 49
My query looks like the following and does not work yet:
SELECT SUM(amount) as SumAm FROM products WHERE number IN ('M6412', 'M6545')
I want the sums to come back in the order of the numbers in the IN list. That means the result table should look like:
SumAm
49
42
The sums should not be sorted by value. Their order should match the order of the numbers in the IN clause.

Use GROUP BY number:
SELECT number, SUM(amount) as SumAm FROM products
--WHERE number IN ('M6412', 'M6545') -- I think you don't need the WHERE clause
group by number
But if you want the sums only for 'M6412' and 'M6545', as in your second sample output, then you do need the WHERE clause.

Use group by and aggregation
SELECT SUM(amount) as SumAm FROM products
WHERE number IN ('M6412', 'M6545')
group by number

You can't order the results directly by the order of the values in the IN clause.
What you can do is something like this:
SELECT SUM(amount) as SumAm
FROM products
WHERE number IN ('M6412', 'M6545')
GROUP BY number -- You must group by to get a row for each number
ORDER BY CASE number
WHEN 'M6412' THEN 1
WHEN 'M6545' THEN 2
END
Of course, the more items you have in your IN clause, the more cumbersome this query will get. Therefore another solution might be more practical: joining to a table variable instead of using IN:
DECLARE @Numbers AS TABLE
(
sort int identity(1,1), -- this will hold the order of the inserted values
number varchar(10) PRIMARY KEY -- enforce unique values
);
INSERT INTO @Numbers (number) VALUES
('M6412'),
('M6545');
SELECT SUM(amount) as SumAm
FROM products As p
JOIN @Numbers As n ON p.Number = n.Number
-- number and sort have a 1 - 1 relationship,
-- so it's safe to group by it instead of by number
GROUP BY n.sort
ORDER BY n.sort
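When the list of numbers comes from application code, the lookup table can be filled from the list itself so the sort positions never have to be typed by hand. This is a minimal sketch, assuming Python's sqlite3 module and a temporary table named numbers (a hypothetical stand-in for the table variable); the product rows are the ones shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, amount INTEGER, number TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, 10, "M6545"), (2, 32, "M6424"), (3, 32, "M6545"), (4, 49, "M6412")],
)

wanted = ["M6412", "M6545"]  # the desired output order

# Lookup table: sort records each number's position in the wanted list.
conn.execute("CREATE TEMP TABLE numbers (sort INTEGER, number TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO numbers VALUES (?, ?)", enumerate(wanted, start=1))

sums = [
    row[0]
    for row in conn.execute(
        """
        SELECT SUM(p.amount) AS SumAm
        FROM products AS p
        JOIN numbers AS n ON p.number = n.number
        GROUP BY n.sort
        ORDER BY n.sort
        """
    )
]

print(sums)
```

Reordering the wanted list is then enough to change the order of the sums.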

Your requirement is nonsense... this is not how IN is designed to work. Having said that, the following will give you the result in the desired order:
SELECT SUM(p.amount) AS SumAm
FROM (VALUES
('M6412', 1),
('M6545', 2)
) AS va(number, sortorder)
INNER JOIN products AS p ON va.number = p.number
GROUP BY va.number, va.sortorder
ORDER BY va.sortorder
It is somewhat better than writing a CASE expression, where you would need to add a WHEN condition for each number.

substring query

I want to extract a substring from a cell value, as in the following example:
Input: "J.H.Ambani.School" (column School)
Output: "H.Ambani" (column MidName)
That is, all the text that comes between the first and the last dots. The string can be of any length and contain any number of dots. I am trying to write a query that takes the input column "School" and produces the output column "MidName". What would the SQL query be?

For Oracle Database:
SELECT
REGEXP_REPLACE(yourColumn, '^[^.]*\.|\.[^.]*$', '') AS yourAlias
FROM yourTable

If I have understood your problem correctly from your statement
"That is all the text that comes between the first and the last dots", then a solution is given below. It works in SQL Server; I could not check other databases for lack of time.
@SourceString : this is your input
@DestinationString : this is your output
declare @SourceString varchar(100)='J.H.Ambani.School'
declare @DestinationString varchar(100)
;with result as
(
select ROW_NUMBER()over (order by (select 100))SNO,d from(
select t.c.value('.','varchar(100)')as d from
(select cast('<a>'+replace(@SourceString,'.','</a><a>')+'</a>' as xml)data)as A cross apply data.nodes('/a') as t(c))B
)
select @DestinationString=COALESCE(@DestinationString+'.','')+ISNULL(d,'') from result where SNO>(select top 1 SNO from result order by SNO)
and SNO<(select top 1 SNO from result order by SNO desc)
select @DestinationString
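The rule itself (keep everything between the first and the last dot) is easy to check outside the database. The helper below is a hypothetical sketch in Python, not a translation of either query; returning an empty string when there are fewer than two dots is my own assumption, since the question does not cover that case.

```python
def mid_name(value):
    """Return the text between the first and the last dot of value."""
    first = value.find(".")
    last = value.rfind(".")
    # With fewer than two dots there is nothing between them (assumed behaviour).
    if first == -1 or first == last:
        return ""
    return value[first + 1:last]


print(mid_name("J.H.Ambani.School"))
```

Running a few sample values through a helper like this gives reference results to compare against whatever the SQL version returns.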

Why COUNT(*) is equal to 1 without FROM clause? [duplicate]

For a quick check I ran the query
select COUNT(*) LargeTable
and was surprised to see
LargeTable
-----------
1
Seconds later I realized my mistake and changed it to
select COUNT(*) from LargeTable
and got the expected result
(No column name)
-----------
1.000.000+
But now I don't understand why COUNT(*) returned 1.
It also happens if I run select COUNT(*) or declare @x int = COUNT(*); select @x
Another case:
declare @EmptyTable table ( Value int )
select COUNT(*) from @EmptyTable
returns
(No column name)
-----------
0
I didn't find an explanation in the SQL standard (http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt, online source is given here https://stackoverflow.com/a/8949764/1506454).
Why does COUNT(*) return 1?

In SQL Server a SELECT without a FROM clause works as though it operates against a single row table.
This is not standard SQL. Other RDBMSs, such as Oracle, provide a utility DUAL table with a single row for this purpose.
So this would be treated effectively the same as
SELECT COUNT(*) AS LargeTable
FROM DUAL
A related Connect Item discussing
SELECT 'test'
WHERE EXISTS (SELECT *)
is https://connect.microsoft.com/SQLServer/feedback/details/671475/select-test-where-exists-select
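The single-row behaviour is not unique to SQL Server. As a rough illustration, assuming SQLite (through Python's sqlite3 module) rather than SQL Server, the sketch below shows the same two effects: LargeTable is taken as a column alias, and the count is 1 because a SELECT without FROM sees one implicit row. EmptyTable is a hypothetical ordinary table standing in for the table variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# No FROM clause: LargeTable is parsed as a column alias, not a table name.
cur = conn.execute("SELECT COUNT(*) LargeTable")
column_name = cur.description[0][0]
count_without_from = cur.fetchone()[0]

# An empty table, by contrast, really has no rows to count.
conn.execute("CREATE TABLE EmptyTable (Value INTEGER)")
count_empty = conn.execute("SELECT COUNT(*) FROM EmptyTable").fetchone()[0]

print(column_name, count_without_from, count_empty)
```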

Because without the FROM clause the DBMS cannot know that [LargeTable] is a table. You tricked it into treating it as a column alias.
You can try it and see:
select count(*) 'eklmnjdklfgm'
select count(*) eklmnjdklfgm
select count(*) [eklmnjdklfgm]
select count(*)
The first three examples return eklmnjdklfgm as the column name.

COUNT(*) returned 1 because your statement does not name a table.
1) In the first statement you do not say which table to read, so the count runs against an implicit table that holds a single row.
2) In the second case:
declare @EmptyTable table ( Value int )
select COUNT(*) from @EmptyTable
returns
(No column name)
-----------
0
Here you declare the table variable but never put any rows into it, so the count comes out as 0.

Oracle SQL Developer - Count function

This is the output of select * from table1. I have a question about the count function. I want to count that NULL, and the proper way to do that is:
select count(*) from table1 where fecha_devolucion is null --> This gives me the correct answer, 1. However, if I do:
select count(fecha_devolucion)
from table1
where fecha_devolucion is null --> this returns 0. Why? Isn't it the same syntax?
What's the difference between counting a specific field and counting * from a table?

From the documentation (http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions032.htm):
If you specify expr, then COUNT returns the number of rows where expr
is not null. ...
If you specify the asterisk (*), then this function returns all rows...
In other words, COUNT(fecha_devolucion) counts non-NULL values of that column. COUNT(*) counts the total number of rows, regardless of the values.

This is another way to get the count:
SELECT SUM(NVL(fecha_devolucion,1)) FROM table1 WHERE fecha_devolucion IS NULL;

Let's compare the two queries:
select count(*)
from table1
where fecha_devolucion is null;
select count(fecha_devolucion)
from table1
where fecha_devolucion is null;
I think you misunderstand the count() function. This function counts the number of non-NULL values in its argument list. With a constant or *, it counts all rows.
So, the first counts all the matching rows. The second counts all the non-NULL values of fecha_devolucion. But there are no such values because of the where clause.
By the way, you can also do:
select sum(case when fecha_devolucion is null then 1 else 0 end) as Nullfecha_devolucion
from table1;
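The three counting styles can be compared side by side. This is a small sketch, assuming SQLite via Python's sqlite3 module instead of Oracle; the three rows (one of them with a NULL fecha_devolucion) are hypothetical, since the original table contents are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, fecha_devolucion TEXT)")
# Hypothetical sample rows: exactly one row has a NULL fecha_devolucion.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, "2015-01-10"), (2, None), (3, "2015-02-03")],
)


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# COUNT(*) counts rows; COUNT(column) skips NULLs in that column.
count_star = scalar("SELECT COUNT(*) FROM table1 WHERE fecha_devolucion IS NULL")
count_col = scalar("SELECT COUNT(fecha_devolucion) FROM table1 WHERE fecha_devolucion IS NULL")
count_case = scalar(
    "SELECT SUM(CASE WHEN fecha_devolucion IS NULL THEN 1 ELSE 0 END) FROM table1"
)

print(count_star, count_col, count_case)
```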

Sort string as number in sql server

I have a column that contains data like this. The dashes indicate multiple copies of the same invoice, and the values have to be sorted in ascending order:
790711
790109-1
790109-11
790109-2
I have to sort in increasing order by this number, but since this is a varchar field it sorts in alphabetical order, like this:
790109-1
790109-11
790109-2
790711
To fix this I tried replacing the - (dash) with an empty string, casting the result to a number, and sorting on that:
select cast(replace(invoiceid,'-','') as decimal) as invoiceSort...............order by invoiceSort asc
While this is better, it sorts like this:
invoiceSort
790711 (790711) <-----this is wrong now as it should come later than 790109
790109-1 (7901091)
790109-2 (7901092)
790109-11 (79010911)
Someone suggested that I split the invoice id on the - (dash) and order by the two split parts,
like=====> order by split1 asc,split2 asc (790109,1)
which I think would work, but how would I split the column?
The various split functions on the internet return a table, while in this case I would need a scalar function.
Are there any other approaches that can be used? The data is shown in a GridView, and GridView doesn't support sorting on 2 columns by default (I can implement it though :) ), so if there are any simpler approaches that would be very nice.
EDIT: Thanks for all the answers. While every answer is correct, I have chosen the one that allowed me to incorporate these columns in the GridView sorting with minimal refactoring of the SQL queries.

Judicious use of REVERSE, CHARINDEX, and SUBSTRING can get us what we want. I have used hopefully self-explanatory column names in my code below to illustrate what's going on.
Set up sample data:
DECLARE @Invoice TABLE (
InvoiceNumber nvarchar(10)
);
INSERT @Invoice VALUES
('790711')
,('790709-1')
,('790709-11')
,('790709-21')
,('790709-212')
,('790709-2')
SELECT * FROM @Invoice
Sample data:
InvoiceNumber
-------------
790711
790709-1
790709-11
790709-21
790709-212
790709-2
And here's the code. I have a nagging feeling the final expressions could be simplified.
SELECT
InvoiceNumber
,REVERSE(InvoiceNumber)
AS Reversed
,CHARINDEX('-',REVERSE(InvoiceNumber))
AS HyphenIndexWithinReversed
,SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))
AS ReversedWithoutAffix
,SUBSTRING(InvoiceNumber,1+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber))
AS AffixIncludingHyphen
,SUBSTRING(InvoiceNumber,2+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber))
AS AffixExcludingHyphen
,CAST(
SUBSTRING(InvoiceNumber,2+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber))
AS int)
AS AffixAsInt
,REVERSE(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber)))
AS WithoutAffix
FROM @Invoice
ORDER BY
-- WithoutAffix
REVERSE(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber)))
-- AffixAsInt
,CAST(
SUBSTRING(InvoiceNumber,2+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber))
AS int)
Output:
InvoiceNumber Reversed HyphenIndexWithinReversed ReversedWithoutAffix AffixIncludingHyphen AffixExcludingHyphen AffixAsInt WithoutAffix
------------- ---------- ------------------------- -------------------- -------------------- -------------------- ----------- ------------
790709-1 1-907097 2 907097 -1 1 1 790709
790709-2 2-907097 2 907097 -2 2 2 790709
790709-11 11-907097 3 907097 -11 11 11 790709
790709-21 12-907097 3 907097 -21 21 21 790709
790709-212 212-907097 4 907097 -212 212 212 790709
790711 117097 0 117097 0 790711
Note that all you actually need is the ORDER BY clause; the rest is just to show my working, which goes like this:
Reverse the string, find the hyphen, take the substring after the hyphen, and reverse that part: this is the number without any affix.
The length of (the number without any affix) tells us how many characters to drop from the start in order to get the affix including the hyphen. Drop one more character to get just the numeric part, and convert this to int. Fortunately SQL Server gives us a break here: converting an empty string to int yields zero.
Finally, having got these two pieces, we simply ORDER BY (the number without any affix) and then by (the numeric value of the affix). This is the final order we seek.
The code would be more concise if SQL Server allowed us to say SUBSTRING(value, start) to get the string starting at that point, but it doesn't, so we have to say SUBSTRING(value, start, LEN(value)) a lot.
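The same two-part ordering (number without affix first, then the affix as an integer, with a missing affix counted as zero) can be sketched outside SQL to sanity-check the expected order. This is a hypothetical Python helper using the four values from the question, not a replacement for the ORDER BY above.

```python
def invoice_sort_key(invoice_id):
    """Split '790109-11' into (790109, 11); a missing affix counts as 0."""
    base, _, suffix = invoice_id.partition("-")
    return int(base), int(suffix) if suffix else 0


invoices = ["790711", "790109-1", "790109-11", "790109-2"]
ordered = sorted(invoices, key=invoice_sort_key)

print(ordered)
```

If the data is sorted on the client side anyway (for example before binding it to a grid), a key function like this avoids touching the query at all.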

Try this one -
Query:
DECLARE @Invoice TABLE (InvoiceNumber VARCHAR(10))
INSERT @Invoice
VALUES
('790711')
, ('790709-1')
, ('790709-21')
, ('790709-11')
, ('790709-211')
, ('790709-2')
;WITH cte AS
(
SELECT
InvoiceNumber
, lenght = LEN(InvoiceNumber)
, delimeter = CHARINDEX('-', InvoiceNumber)
FROM @Invoice
)
SELECT InvoiceNumber
FROM cte
CROSS JOIN (
SELECT repl = MAX(lenght - delimeter)
FROM cte
WHERE delimeter != 0
) mx
ORDER BY
SUBSTRING(InvoiceNumber, 1, ISNULL(NULLIF(delimeter - 1, -1), lenght))
, RIGHT(REPLICATE('0', repl) + SUBSTRING(InvoiceNumber, delimeter + 1, lenght), repl)
Output:
InvoiceNumber
-------------
790709-1
790709-2
790709-11
790709-21
790709-211
790711

Try this
SELECT invoiceid FROM Invoice
ORDER BY
CASE WHEN PatIndex('%[-]%',invoiceid) > 0
THEN LEFT(invoiceid,PatIndex('%[-]%',invoiceid)-1)
ELSE invoiceid END * 1
,CASE WHEN PatIndex('%[-]%',REVERSE(invoiceid)) > 0
THEN RIGHT(invoiceid,PatIndex('%[-]%',REVERSE(invoiceid))-1)
ELSE NULL END * 1
The query above uses two CASE expressions:
The first sorts by the first part of invoiceid (e.g. 790109 for 790109-1).
The second sorts by the part of invoiceid after the '-' (e.g. 1 for 790109-1).
Or use CHARINDEX:
SELECT invoiceid FROM Invoice
ORDER BY
CASE WHEN CHARINDEX('-', invoiceid) > 0
THEN LEFT(invoiceid, CHARINDEX('-', invoiceid)-1)
ELSE invoiceid END * 1
,CASE WHEN CHARINDEX('-', REVERSE(invoiceid)) > 0
THEN RIGHT(invoiceid, CHARINDEX('-', REVERSE(invoiceid))-1)
ELSE NULL END * 1

Ordering by each part separately is the simplest and most reliable way to go, so why look for other approaches? Take a look at this simple query.
select *
from Invoice
order by Convert(int, SUBSTRING(invoiceid, 0, CHARINDEX('-',invoiceid+'-'))) asc,
Convert(int, SUBSTRING(invoiceid, CHARINDEX('-',invoiceid)+1, LEN(invoiceid)-CHARINDEX('-',invoiceid))) asc

Plenty of good answers here, but I think this one might be the most compact order by clause that is effective:
SELECT *
FROM Invoice
ORDER BY LEFT(InvoiceId,CHARINDEX('-',InvoiceId+'-'))
,CAST(RIGHT(InvoiceId,CHARINDEX('-',REVERSE(InvoiceId)+'-'))AS INT)DESC
Note, I added the '790709' version to my test, since some of the methods listed here aren't treating the no-suffix version as lesser than the with-suffix versions.
If the part of your InvoiceId before the '-' varies in length, then you'd need:
SELECT *
FROM Invoice
ORDER BY CAST(LEFT(InvoiceId,CHARINDEX('-',InvoiceId+'-')-1)AS INT)
,CAST(RIGHT(InvoiceId,CHARINDEX('-',REVERSE(InvoiceId)+'-'))AS INT)DESC

My version:
declare @Len int
-- @Len is the length of the longest suffix, i.e. how many digits the first part must be shifted by
select @Len = (select max (len (invoiceid) - charindex ( '-', invoiceid)) from MyTable)
select
invoiceid ,
cast (SUBSTRING (invoiceid ,1,charindex ( '-', invoiceid )-1) as int) * POWER (10,@Len) +
cast (right(invoiceid, len (invoiceid) - charindex ( '-', invoiceid) ) as int )
from MyTable
You can implement this as a new column in your table:
ALTER TABLE MyTable ADD invoice_numeric_id int null
GO
declare @Len int
select @Len = (select max (len (invoiceid) - charindex ( '-', invoiceid)) from MyTable)
UPDATE MyTable
SET invoice_numeric_id = cast (SUBSTRING (invoiceid ,1,charindex ( '-', invoiceid )-1) as int) * POWER (10,@Len) +
cast (right(invoiceid, len (invoiceid) - charindex ( '-', invoiceid) ) as int )

One way is to split InvoiceId into its parts, and then sort on the parts. Here I use a derived table, but it could be done with a CTE or a temporary table as well.
select InvoiceId, InvoiceId1, InvoiceId2
from
(
select
InvoiceId,
substring(InvoiceId, 0, charindex('-', InvoiceId, 0)) as InvoiceId1,
substring(InvoiceId, charindex('-', InvoiceId, 0)+1, len(InvoiceId)) as InvoiceId2
FROM Invoice
) tmp
order by
cast((case when len(InvoiceId1) > 0 then InvoiceId1 else InvoiceId2 end) as int),
cast((case when len(InvoiceId1) > 0 then InvoiceId2 else '0' end) as int)
In the above, InvoiceId1 and InvoiceId2 are the component parts of InvoiceId. The outer select includes the parts, but only for demonstration purposes - you do not need to do this in your select.
The derived table (the inner select) grabs the InvoiceId as well as the component parts. The way it works is this:
When there is a dash in InvoiceId, InvoiceId1 will contain the first part of the number and InvoiceId2 will contain the second.
When there is not a dash, InvoiceId1 will be empty and InvoiceId2 will contain the entire number.
The second case above (no dash) is not optimal because ideally InvoiceId1 would contain the number and InvoiceId2 would be empty. To make the inner select work optimally would decrease the readability of the select. I chose the non-optimal, more readable, approach since it is good enough to allow for sorting.
This is why the ORDER BY clause tests for the length - it needs to handle the two cases above.

Break the sort into two sections:
MS SQL Server 2008 Schema Setup:
CREATE TABLE TestData
(
data varchar(20)
)
INSERT TestData
SELECT '790711' as data
UNION
SELECT '790109-1'
UNION
SELECT '790109-11'
UNION
SELECT '790109-2'
Query 1:
SELECT *
FROM TestData
ORDER BY
FLOOR(CAST(REPLACE(data, '-', '.') AS FLOAT)),
CASE WHEN CHARINDEX('-', data) > 0
THEN CAST(RIGHT(data, len(data) - CHARINDEX('-', data)) AS INT)
ELSE 0
END
Results:
| DATA |
-------------
| 790109-1 |
| 790109-2 |
| 790109-11 |
| 790711 |

Try:
select invoiceid ... order by Convert(decimal(18, 2), REPLACE(invoiceid, '-', '.'))
