Finding Duplicate occurrences in column


I have two tables, Customer and Meter.
A customer might have multiple meternbr values in the Meter table. Customer has a customernbr column.
I want to return only the customers who have more than one meternbr. In the table below, I want to return only customers a and c, along with their meternbr values.
Customer   Meter
--------   -----
a          a-100
b          a-101
c          b-103
d          c-104
           c-105

If that is a single string (which I don't think is a good idea to begin with), and your DBMS supports LEFT/SUBSTRING and INSTR, you can combine LEFT or SUBSTRING with INSTR to find the index of the first '-' and extract the customer prefix. Then use GROUP BY and HAVING COUNT(*) > 1 to keep only the customers that occur more than once.
SELECT LEFT(meterColumn,INSTR(meterColumn,'-')-1)
FROM meter
WHERE LEFT(meterColumn,INSTR(meterColumn,'-')-1) IN (
SELECT LEFT(meterColumn,INSTR(meterColumn,'-')-1)
FROM meter
GROUP BY LEFT(meterColumn,INSTR(meterColumn,'-')-1)
HAVING COUNT(*) > 1
)
GROUP BY 1;
If those are two columns in the meter table (customerNbr and meterNbr), you could simply do:
SELECT customerNbr
FROM meter
GROUP BY 1
HAVING COUNT(*) > 1;

Use GROUP BY and a HAVING clause with COUNT(*) > 1.
Here's a working sample: http://sqlfiddle.com/#!3/d1b91/17
The code and results are pasted below as well:
CREATE TABLES (Note: no FK constraint has been added, for demo purposes)
CREATE TABLE Customer
(
customernbr NVARCHAR(20) NOT NULL
)
CREATE TABLE Meter
(
meternbr NVARCHAR(20) NOT NULL,
customernbr NVARCHAR(20) NOT NULL
)
INSERT DATA. (Uncomment the last 2 SELECT statements if you want to see the data)
INSERT INTO Customer VALUES
('a'),
('b'),
('c'),
('d');
INSERT INTO Meter VALUES
('a-100','a'),
('a-101','a'),
('b-103','b'),
('c-104','c'),
('c-105','c'),
('d-106','d');
--SELECT * FROM Customer;
--SELECT * FROM Meter;
RUN SELECT STATEMENT
SELECT
customernbr AS 'Customer',
meternbr AS 'Meter'
FROM Meter WHERE customernbr IN
(
SELECT customernbr
FROM Meter
GROUP BY customernbr
HAVING COUNT(*) > 1
)
SEE RESULTS :)
CUSTOMER METER
a a-100
a a-101
c c-104
c c-105
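
For anyone who wants to try the GROUP BY / HAVING approach without a SQL Server instance, here is a minimal sketch that runs the same query shape through Python's built-in sqlite3 module. SQLite is only a stand-in here, and the helper name customers_with_multiple_meters is made up for illustration; the table and column names follow the sample above.

```python
import sqlite3


def customers_with_multiple_meters(rows):
    """rows: iterable of (meternbr, customernbr) pairs (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Meter (meternbr TEXT NOT NULL, customernbr TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO Meter VALUES (?, ?)", rows)
    # The inner query finds customers that appear more than once;
    # the outer query returns every meter row for those customers.
    result = conn.execute(
        """
        SELECT customernbr, meternbr
        FROM Meter
        WHERE customernbr IN (
            SELECT customernbr
            FROM Meter
            GROUP BY customernbr
            HAVING COUNT(*) > 1
        )
        ORDER BY customernbr, meternbr
        """
    ).fetchall()
    conn.close()
    return result


sample = [
    ("a-100", "a"), ("a-101", "a"), ("b-103", "b"),
    ("c-104", "c"), ("c-105", "c"), ("d-106", "d"),
]
for customer, meter in customers_with_multiple_meters(sample):
    print(customer, meter)
```

The ORDER BY is only there to make the output deterministic; it is not required by the technique.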

Related

2 sql queries 2 different results using UNION

I have 2 queries that start with the same tables but filter on different columns. Both queries are unioned together so I can get a single count of people without duplication.
If I run the queries with the union commented out, each one returns the same number of rows: 1,953. When I run them with the union, I get 1,816 in one and 1,922 in the other.
My data is just an account # like 123456 in the first column and a 1/0 in the second column. Help me understand how this can happen if I am starting with the same number of rows.
Here is one of the queries
select distinct acct#,
case
when (lastFilledDate is not null and lastFilledDate<>'00/00/00') or
([Last Filled DC] is not null and [Last Filled DC]<>'00/00/00') or
(vivitrol is not null and vivitrol <>'00/00/00') or
(sublocade is not null and sublocade <>'00/00/00') or
(naltrexone is not null and naltrexone <>'00/00/00') then 1
else 0 end as result
from
(
select Acct#, DOB, [COE Contact Note], [COE-INTAKA Doc], [COE-MOM Doc],
lastFilledDate, [Last Filled DC], vivitrol, sublocade, naltrexone,
ROW_NUMBER() over (partition by Acct# order by [COE-INTAKA Doc] desc) as apptRows
from tblAppBSCImportDashCOE2279 as main
where (([COE-MOM Doc]='Yes' and [COE Contact Note] is not null) or
[COE-MOM Doc]='No') and Appt is not null
) as sub
where apptRows=1
union
select distinct acctNo,
case
when providerMAT='The Wright Center' and [COE-MOM Doc] is not null then 1
else 0
end as result
from
(
select acctNo, [COE-MOM Doc], MAT, providerMAT,
ROW_NUMBER() over (partition by acctNo order by COEBNMOM, [COE-MOM Doc] desc) as apptRows
from tblAppBSCImportDashCOEHM2544 as main
where [COE-MOM Doc] is not null or COEBNMOM is not null
) as sub
where apptRows=1
results look like
acct# result
123456 1
234567 0

There is one possibility: the columns you select may produce duplicate rows WITHIN each select statement. Let me illustrate with an example (you can run the following in your session to follow along).
IF OBJECT_ID('TEMPDB..#TEMP1') IS NOT NULL
DROP TABLE #TEMP1
IF OBJECT_ID('TEMPDB..#TEMP2') IS NOT NULL
DROP TABLE #TEMP2
CREATE TABLE #TEMP1(
id INT
,account INT
,amount INT
,yes_no INT
)
INSERT INTO #TEMP1 (id,account,amount,yes_no)
VALUES(1,123456,5,0)
,(2,123456,10,0)
,(3,123456,20,0)
CREATE TABLE #TEMP2(
id INT
,account INT
,amount INT
,yes_no INT
)
INSERT INTO #TEMP2 (id,account,amount,yes_no)
VALUES(4,123456,5,0)
,(5,123456,10,0)
,(6,123456,20,0)
SELECT *
FROM #TEMP1
SELECT *
FROM #TEMP2
The output is 2 tables with distinct records.
Now suppose I write queries that select only the account and 'yes_no' columns:
SELECT account,yes_no
FROM #TEMP1
SELECT account,yes_no
FROM #TEMP2
You can see that every record now has the same values within each select statement. So what do you think happens when I union these queries together?
SELECT account,yes_no
FROM #TEMP1
UNION
SELECT account,yes_no
FROM #TEMP2
UNION returns the distinct rows of the ENTIRE OUTPUT, so duplicates are removed within each query as well as across them. This is an extreme example of what I think you are experiencing. You need to include some sort of ID in each query so that each record can be distinguished from the others within the query, like:
SELECT id,account,yes_no
FROM #TEMP1
UNION
SELECT id,account,yes_no
FROM #TEMP2
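
To see the row counts side by side, here is a small sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The table names temp1 and temp2 mirror #TEMP1 and #TEMP2 above; the UNION ALL comparison is an extra I added for contrast and is not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE temp1 (id INTEGER, account INTEGER, amount INTEGER, yes_no INTEGER);
    CREATE TABLE temp2 (id INTEGER, account INTEGER, amount INTEGER, yes_no INTEGER);
    INSERT INTO temp1 VALUES (1,123456,5,0),(2,123456,10,0),(3,123456,20,0);
    INSERT INTO temp2 VALUES (4,123456,5,0),(5,123456,10,0),(6,123456,20,0);
    """
)


def count_rows(sql):
    """Run a query and return how many rows it produces."""
    return len(conn.execute(sql).fetchall())


# UNION de-duplicates the whole result, including duplicates inside one branch.
union_rows = count_rows(
    "SELECT account, yes_no FROM temp1 UNION SELECT account, yes_no FROM temp2"
)
# UNION ALL keeps every row from both branches.
union_all_rows = count_rows(
    "SELECT account, yes_no FROM temp1 UNION ALL SELECT account, yes_no FROM temp2"
)
# Adding a distinguishing id makes every row unique, so UNION keeps them all.
union_with_id_rows = count_rows(
    "SELECT id, account, yes_no FROM temp1 UNION SELECT id, account, yes_no FROM temp2"
)

print(union_rows, union_all_rows, union_with_id_rows)
```

Six input rows collapse to a single row under plain UNION because all six share the same (account, yes_no) pair.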

Generating Lines based on a value from a column in another table

I have the following table:
EventID=00002,DocumentID=0005,EventDesc=ItemsReceived
I have the quantity in another table
DocumentID=0005,Qty=20
I want to generate a result of 20 lines (depending on the quantity) with an auto generated column which will have a sequence of:
ITEM_TAG_001,
ITEM_TAG_002,
ITEM_TAG_003,
ITEM_TAG_004,
..
ITEM_TAG_020

Here's your sql query.
with cte as (
select 1 as ctr, t2.Qty, t1.EventID, t1.DocumentId, t1.EventDesc from tableA t1
inner join tableB t2 on t2.DocumentId = t1.DocumentId
union all
select ctr + 1, Qty, EventID, DocumentId, EventDesc from cte
where ctr < Qty
)select *, concat('ITEM_TAG_', right('000'+ cast(ctr AS varchar(3)),3)) from cte
option (maxrecursion 0);

The best approach is to introduce a numbers table, which is very handy in many places...
Something along these lines:
Create some test data:
DECLARE @MockNumbers TABLE(Number BIGINT);
DECLARE @YourTable1 TABLE(DocumentID INT,ItemTag VARCHAR(100),SomeText VARCHAR(100));
DECLARE @YourTable2 TABLE(DocumentID INT, Qty INT);
INSERT INTO @MockNumbers SELECT TOP 100 ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values;
INSERT INTO @YourTable1 VALUES(1,'FirstItem','qty 5'),(2,'SecondItem','qty 7');
INSERT INTO @YourTable2 VALUES(1,5), (2,7);
--The query
SELECT CONCAT(t1.ItemTag,'_',REPLACE(STR(A.Number,3),' ','0'))
FROM @YourTable1 t1
INNER JOIN @YourTable2 t2 ON t1.DocumentID=t2.DocumentID
CROSS APPLY(SELECT Number FROM @MockNumbers WHERE Number BETWEEN 1 AND t2.Qty) A;
The result
FirstItem_001
FirstItem_002
[...]
FirstItem_005
SecondItem_001
SecondItem_002
[...]
SecondItem_007
The idea in short:
We use an INNER JOIN to get the quantity joined to the item.
Then we use APPLY, which is a row-wise action, to bind as many rows to the set as we need.
The first item will return 5 lines, the second 7. The trick with STR() and REPLACE() is one way to create a zero-padded number. You might use FORMAT() (v2012+), but it is rather slow...
The table @MockNumbers is a declared table variable containing a list of numbers from 1 to 100. It stands in for a physical numbers table, which any database should have...
If you don't want to create a numbers table, you can search for "tally table" or "tally on the fly". There are many answers showing how to create a list of running numbers...
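
As a rough illustration of the "tally on the fly" idea applied to the original question, here is a sketch that generates the ITEM_TAG_nnn lines with a recursive CTE. It uses Python's built-in sqlite3 module rather than SQL Server, so printf() replaces the STR()/REPLACE() padding trick. The table names events and quantities, the helper name item_tags, and the upper bound of 1000 are assumptions made for the example.

```python
import sqlite3


def item_tags(document_id, qty):
    """Return one ITEM_TAG_nnn label per unit of qty (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE events (EventID TEXT, DocumentID TEXT, EventDesc TEXT);
        CREATE TABLE quantities (DocumentID TEXT, Qty INTEGER);
        """
    )
    conn.execute(
        "INSERT INTO events VALUES ('00002', ?, 'ItemsReceived')", (document_id,)
    )
    conn.execute("INSERT INTO quantities VALUES (?, ?)", (document_id, qty))
    rows = conn.execute(
        """
        WITH RECURSIVE numbers(n) AS (
            -- tally on the fly: 1..1000 (the upper bound is an arbitrary assumption)
            SELECT 1
            UNION ALL
            SELECT n + 1 FROM numbers WHERE n < 1000
        )
        SELECT printf('ITEM_TAG_%03d', numbers.n)
        FROM events AS e
        JOIN quantities AS q ON q.DocumentID = e.DocumentID
        JOIN numbers ON numbers.n <= q.Qty  -- one output row per unit of Qty
        ORDER BY numbers.n
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


tags = item_tags("0005", 20)
print(tags[0], "...", tags[-1], f"({len(tags)} lines)")
```

Joining on numbers.n <= q.Qty plays the same role as the CROSS APPLY with BETWEEN 1 AND t2.Qty in the answer above.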

T-SQL - Pivot by week

I'm currently trying to create a T-SQL, which runs through a list of deliveries in a table, and groups them by the Customer and the Depot - so each row will be
Customer, Depot, Total Value (sum of a column called Rate)
However, the customer would like the 'total value' split into the last 9 weeks - so rather than total value, we'll have columns like this:
22/01/2012 29/01/2012 05/02/2012 12/02/2012 19/02/2012 26/02/2012 04/03/2012 11/03/2012 18/03/2012
The dates would of course change for when they run the query - it'll just be the last 9 weeks. They also want a column for the Average of all these.
I understand pivot may help me but I'm a bit stumped on how to do this. Here's my current query:
SELECT d.Name AS 'Depot', s.Name AS 'Customer', SUM(c.Rates) AS 'Total Value'
FROM Deliveries AS c INNER JOIN Account AS s ON c.Customer = s.ID
INNER JOIN Depots AS d ON c.CollectionDepot = d.Letter
GROUP BY d.Name, s.Name
Many thanks!
EDIT: Here's a screenshot of the data currently - we won't need the 'total' column on the end, just there to show you. The 'Date' column is present in the Deliveries table and is called TripDate

Without knowing your exact data, it is hard to predict what you are getting. But I can suggest a solution.
Table structure
CREATE TABLE Deliveries
(
Customer INT,
CollectionDepot INT,
Rates FLOAT,
TripDate DATETIME
)
CREATE TABLE Account
(
Name VARCHAR(100),
ID INT
)
CREATE TABLE Depots
(
Name VARCHAR(100),
Letter INT
)
Test data
INSERT INTO Deliveries
VALUES
(1,1,452,GETDATE()-10),
(1,1,800,GETDATE()-30),
(1,1,7895,GETDATE()-2),
(1,1,451,GETDATE()-2),
(1,1,478,GETDATE()-89),
(1,1,4512,GETDATE()-31),
(1,1,782,GETDATE()-20),
(1,1,652,GETDATE()-5),
(1,1,752,GETDATE()-452)
INSERT INTO Account
VALUES
('Customer 1',1)
INSERT INTO Depots
VALUES
('Depot 1',1)
Table that contains the ranges and the formatted date
CREATE TABLE #tmp
(
StartDate DATETIME,
EndDate DATETIME,
FomatedDate VARCHAR(20)
)
Calculate the date ranges
;WITH Nbrs ( n ) AS (
SELECT 0 UNION ALL
SELECT 1+n FROM Nbrs WHERE n < 8 )
INSERT INTO #tmp
SELECT
DATEADD(WEEK,-n-1,GETDATE()),
DATEADD(WEEK,-n,GETDATE()),
convert(varchar, DATEADD(WEEK,-n,GETDATE()), 112)
FROM
Nbrs
ORDER BY
-n
The date columns for the pivot
DECLARE @cols VARCHAR(MAX)
SELECT @cols = COALESCE(@cols + ','+QUOTENAME(FomatedDate),
QUOTENAME(FomatedDate))
FROM
#tmp
Declaring some dynamic SQL and executing it
DECLARE @query NVARCHAR(4000)=
N'SELECT
*
FROM
(
SELECT
Depots.Name AS Depot,
Account.Name AS Customer,
Deliveries.Rates,
tmp.FomatedDate,
AVG(Deliveries.Rates) OVER(PARTITION BY 1) AS Average,
SUM(Deliveries.Rates) OVER(PARTITION BY 1) AS Total
FROM
Deliveries
JOIN Account
ON Deliveries.Customer = Account.ID
JOIN Depots
ON Deliveries.CollectionDepot = Depots.Letter
JOIN #tmp AS tmp
ON Deliveries.TripDate BETWEEN tmp.StartDate AND tmp.EndDate
) AS p
PIVOT
(
AVG(rates)
FOR FomatedDate IN ('+@cols+')
) AS pvt'
EXECUTE(@query)
And then cleaning up after myself.
DROP TABLE Deliveries
DROP TABLE Account
DROP TABLE Depots
DROP TABLE #tmp
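
If PIVOT is not available, the same weekly layout can be approximated with conditional aggregation, which is a different technique from the dynamic PIVOT above. The sketch below builds one SUM(CASE ...) column per week and runs it through Python's built-in sqlite3 module. The flattened Deliveries table, the helper name weekly_pivot, and the choice to label each week by its end date are assumptions for the example; the Average column is left out.

```python
import sqlite3
from datetime import date, timedelta


def weekly_pivot(deliveries, end_date, weeks=9):
    """deliveries: (customer, depot, rate, trip_date as ISO string) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Deliveries (Customer TEXT, Depot TEXT, Rates REAL, TripDate TEXT)"
    )
    conn.executemany("INSERT INTO Deliveries VALUES (?, ?, ?, ?)", deliveries)

    columns, params = [], []
    for k in range(weeks - 1, -1, -1):  # oldest week first
        week_end = end_date - timedelta(weeks=k)
        week_start = week_end - timedelta(weeks=1)
        label = week_end.strftime("%d/%m/%Y")  # digits and slashes only
        # Each week covers (week_start, week_end]; ISO strings compare correctly.
        columns.append(
            "SUM(CASE WHEN TripDate > ? AND TripDate <= ? "
            f'THEN Rates ELSE 0 END) AS "{label}"'
        )
        params += [week_start.isoformat(), week_end.isoformat()]

    sql = (
        f"SELECT Depot, Customer, {', '.join(columns)} "
        "FROM Deliveries GROUP BY Depot, Customer ORDER BY Depot, Customer"
    )
    cursor = conn.execute(sql, params)
    headers = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    conn.close()
    return headers, rows


headers, rows = weekly_pivot(
    [
        ("Customer 1", "Depot 1", 100.0, "2012-03-15"),
        ("Customer 1", "Depot 1", 50.0, "2012-03-18"),
        ("Customer 1", "Depot 1", 25.0, "2012-03-11"),
    ],
    end_date=date(2012, 3, 18),
)
print(headers)
print(rows)
```

Because the column list is generated from the end date, rerunning it on another day shifts the nine week columns automatically, which is what the dynamic SQL in the answer achieves with @cols.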

You would have to use the PIVOT keyword, which is available in your version of SQL Server. I have outlined how your query should look; some tweaking will be required, since it is difficult to test without a copy of your data.
SELECT Depot, Customer, [22/01/2012], [29/01/2012], [05/02/2012], [12/02/2012]
FROM
(SELECT Depots.Name AS Depot, Account.Name AS Customer, Deliveries.Rates,
-- [Date] must hold the week label (e.g. '22/01/2012') that each delivery falls into
Deliveries.[Date]
FROM Deliveries
INNER JOIN Account ON Deliveries.Customer = Account.ID
INNER JOIN Depots ON Deliveries.CollectionDepot = Depots.Letter) AS Source
PIVOT
(
SUM(Rates)
FOR [Date] IN ([22/01/2012], [29/01/2012], [05/02/2012], [12/02/2012])
) AS PivotTable
For reference you could use this as a guide:
http://msdn.microsoft.com/en-us/library/ms177410.aspx

How to select info from row above?

I want to add a column to my table, like the following.
This is just an example of how the table is structured; the real table has more than 10.000 rows.
No_   Name        Account_Type   Subgroup   (New_Column)
100   Sales       3
200   Underwear   0              250        *100
300   Bikes       0              250        *100
400   Profit      3
500   Cash        0              450        *400
So every time there is a value in 'Subgroup', I want (New_Column) to get the value of [No_] from the row above.
No_   Name         Account_Type   Subgroup   (New_Column)
100   Sales        3
150   TotalSales   3
200   Underwear    0              250        *150
300   Bikes        0              250        *150
400   Profit       3
500   Cash         0              450        *400
There are cases where the table looks like the above, with two "headers" in a row. In that case I also want the nearest row above (150 here).
Is this a case for a cursor or what do you recommend?
The data is ordered by No_
--EDIT--
Starting from the first line and then running through the whole table:
Is there a way I can store the value for [No_] where [Subgroup] is ''?
And following that insert this [No_] value in the (New_Column) in each row below having value in the [Subgroup] row.
And when the [Subgroup] row is empty the process will keep going, inserting the next [No_] value in (New_Column), that is if the next line has a value in [Subgroup]

SQL Server 2012 introduced window offset functions.
In this case: LAG
Something like this:
SELECT [No_]
,[Name]
,[Account_Type]
,[Subgroup]
,LAG([No_]) OVER(PARTITION BY [Subgroup]
ORDER BY [No_]) as [PrevValue]
FROM table
Here is an example from MS:
http://technet.microsoft.com/en-us/library/hh231256.aspx

The ROW_NUMBER function tells you the position of each row, but because it is a window function, you have to wrap it in a common table expression (CTE) and join the CTE with itself.
WITH cte AS
(
SELECT [No_], Name, Account_Type, Subgroup, [Row] = ROW_NUMBER() OVER (ORDER BY [No_])
FROM table
)
SELECT t1.*, t2.[No_] AS [PrevNo]
FROM cte t1
-- t2 is the row directly above t1
LEFT JOIN cte t2 ON t2.[Row] = t1.[Row] - 1
Hope this helps.

The following query returns the Name of the parent row instead of the row itself, i.e. Sales for Sales, Underwear and Bikes, and Profit for Profit and Cash:
select ISNULL(t2.Name, t1.Name)
from table t1
left join table t2 on t1.NewColumn = t2.No_

In SQL Server 2008 I created a test table with 3 values in it:
create table #ttable
(
id int primary key identity,
number int,
number_prev int
)
Go
Insert Into #ttable (number)
Output inserted.id
Values (10), (20), (30);
The insert that does what you need (at least if I understood correctly) looks like this:
declare @new_value int;
set @new_value = 13; -- NEW value
Insert Into #ttable (number, number_prev)
Values (@new_value,
(Select Max(number) From #ttable t Where t.number < @new_value))
[This part added] And to work with subgroups, just modify the inner select to filter on it:
Select Max(number) From #ttable t
Where t.number < @new_value And Subgroup != @Subgroup

SELECT
No_
, Name
, Account_Type
, Subgroup
, ( SELECT MAX(above.No_)
FROM TableX AS above
WHERE above.No_ < a.No_
AND above.Account_Type = 3
AND a.Account_Type <> 3
) AS NewColumn
FROM
TableX AS a
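
To check this correlated subquery against the sample rows from the question, here is a sketch that runs it through Python's built-in sqlite3 module (used here as a stand-in for SQL Server). The helper name with_header_above is made up, and empty Subgroup values are stored as NULL, which is an assumption since the question mentions '' as the empty value.

```python
import sqlite3

# Sample rows from the question: (No_, Name, Account_Type, Subgroup)
ACCOUNTS = [
    (100, "Sales", 3, None),
    (150, "TotalSales", 3, None),
    (200, "Underwear", 0, 250),
    (300, "Bikes", 0, 250),
    (400, "Profit", 3, None),
    (500, "Cash", 0, 450),
]


def with_header_above(rows):
    """Return (No_, Name, NewColumn) with the nearest header row above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TableX (No_ INTEGER PRIMARY KEY, Name TEXT, "
        "Account_Type INTEGER, Subgroup INTEGER)"
    )
    conn.executemany("INSERT INTO TableX VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT a.No_, a.Name,
               (SELECT MAX(above.No_)
                FROM TableX AS above
                WHERE above.No_ < a.No_          -- only rows above
                  AND above.Account_Type = 3     -- that are header rows
                  AND a.Account_Type <> 3        -- header rows themselves get NULL
               ) AS NewColumn
        FROM TableX AS a
        ORDER BY a.No_
        """
    ).fetchall()
    conn.close()
    return result


for row in with_header_above(ACCOUNTS):
    print(row)
```

MAX picks the closest header because the data is ordered by No_, so the largest No_ below the current row is the nearest one above it.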

Make SQL Select same row multiple times

I need to test my mail server. How can I make a Select statement
that selects say ID=5469 a thousand times.

If I get your meaning, then a very simple way is to cross join against a derived query on a table with more than 1000 rows in it and put a TOP 1000 on that. This would duplicate your results 1000 times.
EDIT: As an example (This is MSSQL, I don't know if Access is much different)
SELECT
MyTable.*
FROM
MyTable
CROSS JOIN
(
SELECT TOP 1000
*
FROM
sysobjects
) [BigTable]
WHERE
MyTable.ID = 1234

You can use the UNION ALL statement.
Try something like:
SELECT * FROM tablename WHERE ID = 5469
UNION ALL
SELECT * FROM tablename WHERE ID = 5469
You'd have to repeat the SELECT statement a bunch of times but you could write a bit of VB code in Access to create a dynamic SQL statement and then execute it. Not pretty but it should work.

Create a helper table for this purpose:
JUST_NUMBER(NUM INT primary key)
Insert (with the help of some (VB) script) numbers from 1 to N. Then execute this unjoined query:
SELECT MYTABLE.*
FROM MYTABLE,
JUST_NUMBER
WHERE MYTABLE.ID = 5469
AND JUST_NUMBER.NUM <= 1000

Here's a way of using a recursive common table expression to generate some empty rows, then to cross join them back onto your desired row:
declare @myData table (val int);
insert @myData values (666),(888),(777) --some dummy data
;with cte as
(
select 100 as a
union all
select a-1 from cte where a>0
--generates 101 rows (100 down to 0), using the default max recursion depth of 100
)
,someRows as
(
select top 1000 0 a from cte,cte x1,cte x2
--cross join the rows a few times
--to generate 1030301 rows, then select top n rows
)
select m.* from @myData m,someRows where m.val=666
Substitute your real table for @myData, and alter the final predicate to suit.

An easy way...
Only one row like this exists in the DB:
sku = 52 , description = Skullcandy Inkd Green ,price = 50,00
Try joining another table that has no key constraint to the main table.
Original query
SELECT Prod_SKU , Prod_Descr , Prod_Price FROM dbo.TB_Prod WHERE Prod_SKU = N'52'
The functional query, adding an unrelated table called 'dbo.TB_Labels'
SELECT TOP ('times') Prod_SKU , Prod_Descr , Prod_Price FROM dbo.TB_Prod,dbo.TB_Labels WHERE Prod_SKU = N'52'

In Postgres there is a nice function called generate_series. So in PostgreSQL it is as simple as:
select information from test_table, generate_series(1, 1000) where id = 5469
This way, the matching row is returned 1000 times.
Example for postgreSQL:
CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; --To be able to use function uuid_generate_v4()
--Create a test table
create table test_table (
id serial not null,
uid UUID NOT NULL,
CONSTRAINT uid_pk PRIMARY KEY(id));
-- Insert 10000 rows
insert into test_table (uid)
select uuid_generate_v4() from generate_series(1, 10000);
-- Read the data from id=5469 one thousand times
select id, uid, uuid_generate_v4() from test_table, generate_series(1, 1000) where id = 5469;
As you can see in the result below, the row with that uid is returned 1000 times, as confirmed by a new uuid being generated for every row.
id |uid |uuid_generate_v4
----------------------------------------------------------------------------------------
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"5630cd0d-ee47-4d92-9ee3-b373ec04756f"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"ed44b9cb-c57f-4a5b-ac9a-55bd57459c02"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"3428b3e3-3bb2-4e41-b2ca-baa3243024d9"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"7c8faf33-b30c-4bfa-96c8-1313a4f6ce7c"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"b589fd8a-fec2-4971-95e1-283a31443d73"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"8b9ab121-caa4-4015-83f5-0c2911a58640"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"7ef63128-b17c-4188-8056-c99035e16c11"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"5bdc7425-e14c-4c85-a25e-d99b27ae8b9f"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"9bbd260b-8b83-4fa5-9104-6fc3495f68f3"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"c1f759e1-c673-41ef-b009-51fed587353c"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"4a70bf2b-ddf5-4c42-9789-5e48e2aec441"
Of course, other DBs won't necessarily have the same function, but an equivalent can be built.

If you are doing this in SQL Server:
declare @cnt int
set @cnt = 0
while @cnt < 1000
begin
select '12345'
set @cnt = @cnt + 1
end
select '12345' can be any expression

This may be another solution: repeat rows based on a column value of TestTable. First run the CREATE TABLE and INSERT statements, then run the query that follows for the desired result.
CREATE TABLE TestTable
(
ID INT IDENTITY(1,1),
Col1 varchar(10),
Repeats INT
)
INSERT INTO TESTTABLE
VALUES ('A',2), ('B',4),('C',1),('D',0)
;WITH x AS
(
SELECT TOP (SELECT MAX(Repeats)+1 FROM TestTable) rn = ROW_NUMBER()
OVER (ORDER BY [object_id])
FROM sys.all_columns
ORDER BY [object_id]
)
SELECT * FROM x
CROSS JOIN TestTable AS d
WHERE x.rn <= d.Repeats
ORDER BY Col1;

This trick helped me with my requirement.
Here, PRODUCTDETAILS is my table
and orderid is my column.
declare @Req_Rows int = 12
;WITH cte AS
(
SELECT 1 AS Number
UNION ALL
SELECT Number + 1 FROM cte WHERE Number < @Req_Rows
)
SELECT PRODUCTDETAILS.*
FROM cte, PRODUCTDETAILS
WHERE PRODUCTDETAILS.orderid = 3

create table #tmp1 (id int, fld varchar(max))
insert into #tmp1 (id, fld)
values (1,'hello!'),(2,'world'),(3,'nice day!')
select * from #tmp1
go
select * from #tmp1 where id=3
go 1000
drop table #tmp1

in sql server try:
print 'wow'
go 5
output:
Beginning execution loop
wow
wow
wow
wow
wow
Batch execution completed 5 times.

The easy way is to create a table with 1000 rows. Let's call it BigTable. Then you would query for the data you want and join it with the big table, like this:
SELECT MyTable.*
FROM MyTable, BigTable
WHERE MyTable.ID = 5469
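
Here is a small runnable sketch of that cross-join idea, using Python's built-in sqlite3 module as a stand-in database. Instead of a physical BigTable it generates the rows with a recursive CTE; the Email column, the sample data, and the helper name repeat_row are assumptions for illustration, and times is assumed to be at least 1.

```python
import sqlite3


def repeat_row(row_id, times):
    """Return the row with the given ID repeated `times` times (times >= 1)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (ID INTEGER PRIMARY KEY, Email TEXT)")
    conn.executemany(
        "INSERT INTO MyTable VALUES (?, ?)",
        [(5469, "test@example.com"), (1234, "other@example.com")],
    )
    rows = conn.execute(
        """
        WITH RECURSIVE BigTable(n) AS (
            -- stands in for a physical table holding the numbers 1..times
            SELECT 1
            UNION ALL
            SELECT n + 1 FROM BigTable WHERE n < ?
        )
        SELECT MyTable.*
        FROM MyTable
        CROSS JOIN BigTable   -- every matching row is paired with each number
        WHERE MyTable.ID = ?
        """,
        (times, row_id),
    ).fetchall()
    conn.close()
    return rows


copies = repeat_row(5469, 1000)
print(len(copies), copies[0])
```

A physical numbers table would avoid regenerating the rows on every call, at the cost of one more table to maintain.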
