SQL - Select records with duplication in the same column

Is there a way to select records with duplication within 1 column?
So, for instance you have Address_Table:
Address_Line_1 | Address_Line_2
123 street | Town
321 street 321 street 321 street | Town
456 street | Town
789 street 789 street | Town
Is there a way to select all the records, like 321 & 789 street, whose Address_Line_1 value contains duplicates of itself?
Thanks

Just a thought, and not fully tested.
Select A.*
From YourTable A
Cross Apply (
select Dupes = Avg(Hits) -- perhaps Max(Hits) instead
From (
Select Value
,Hits = sum(1) over (partition by Value)
From string_split([Address_Line_1],' ')
) B1
) B
Where Dupes>1
Results
Address_Line_1 | Address_Line_2
321 street 321 street 321 street | Town
789 street 789 street | Town
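The split-and-count idea in the query above can be sketched outside the database as well. This is a minimal Python illustration, not the answerer's code: the function name `has_repeated_word` and the sample rows are assumptions made for the example, and the check corresponds to the `Max(Hits)` variant mentioned in the comment (any word occurring more than once).

```python
from collections import Counter

def has_repeated_word(text: str) -> bool:
    """Return True if any space-separated word occurs more than once."""
    counts = Counter(text.split())
    return any(n > 1 for n in counts.values())

# Sample rows copied from the question: (Address_Line_1, Address_Line_2)
rows = [
    ("123 street", "Town"),
    ("321 street 321 street 321 street", "Town"),
    ("456 street", "Town"),
    ("789 street 789 street", "Town"),
]

# Keep only the rows whose first column repeats a word
dupes = [row for row in rows if has_repeated_word(row[0])]
for row in dupes:
    print(row)
```

Note that this flags any repeated word, so an address such as "12 Main St Main" would also match; whether that counts as a duplicate depends on the data.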

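STRING_SPLIT requires a database COMPATIBILITY_LEVEL of 130 or higher. If yours is lower, you can raise it:
ALTER DATABASE [DatabaseName] SET COMPATIBILITY_LEVEL = 130
Then you can try this, which rebuilds each address from its distinct words: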
SELECT ads = STUFF((
SELECT ' ' + value
FROM STRING_SPLIT(Address_Line_1, ' ')
GROUP BY value
FOR XML PATH('')
), 1, 1, '') , Address_Line_1, Address_Line_2 FROM Adress

Related

Concatenate column values for specific fields while displaying other column values in Oracle 11.2

I am a SQL noob.
How can I concatenate column values for specific fields while displaying other column values?
I will try my best to show a simplified example.
Say I have the following table:
Table A
Name | Address | Email | Value1 | Value2 | Value3
Sam | 123 Main Street | sam@coporate.com | 34 | 51 | 39
Peter | 789 High Street | peter@coporate.com | 73 | 05 | 59
Sam | 123 Main Street | sam@coporate.com | 43 | 12 | 84
Sally | 456 State Street | sally@coporate.com | 35 | 76 | 23
Sally | 456 State Street | sally@coporate.com | 77 | 34 | 18
Peter | 789 High Street | peter@coporate.com | 32 | 14 | 54
Sally | 456 State Street | sally@coporate.com | 64 | 49 | 23
Expected output
Name | Address | Email | Value1 | Value2 | Value3
Sam | 123 Main Street | sam@coporate.com | 34,43 | 51,12 | 39,84
Sally | 456 State Street | sally@coporate.com | 35,64,77 | 76,49,34 | 23,23,18
Peter | 789 High Street | peter@coporate.com | 32,73 | 14,05 | 54,59
I tried using LISTAGG but the issue I had was that I was then not able to display the Name, Address and Email fields. Please help and thank you in advance!
Try:
SELECT NAME, ADDRESS, EMAIL,
LISTAGG(value1, ',') WITHIN GROUP (ORDER BY value1) new_value_1,
LISTAGG(value2, ',') WITHIN GROUP (ORDER BY value2) new_value_2,
LISTAGG(value3, ',') WITHIN GROUP (ORDER BY value3) new_value_3
FROM TABLE_A
GROUP BY NAME, ADDRESS, EMAIL;
I only have a compiler on my work computer, so I have not tested this. If it is wrong, tell me and I will delete the answer.
This should do what you want:
select name, address, email,
listagg (value1, ',') within group (order by name) as values1,
listagg (value2, ',') within group (order by name) as values2,
listagg (value3, ',') within group (order by name) as values3
from a
group by name, address, email;
Here is a db<>fiddle.
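To try the grouping idea without an Oracle instance, here is a rough sketch using Python's built-in sqlite3 module. It is an analogue under stated assumptions, not the Oracle query itself: SQLite's `group_concat` stands in for `LISTAGG`, the table name `a` follows the second answer, and the value columns are stored as text so that `05` keeps its leading zero. SQLite does not guarantee the order of concatenated values, so the example only checks which values end up together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE a (name TEXT, address TEXT, email TEXT, "
    "value1 TEXT, value2 TEXT, value3 TEXT)"
)
# Sample data copied from the question
conn.executemany(
    "INSERT INTO a VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Sam", "123 Main Street", "sam@coporate.com", "34", "51", "39"),
        ("Peter", "789 High Street", "peter@coporate.com", "73", "05", "59"),
        ("Sam", "123 Main Street", "sam@coporate.com", "43", "12", "84"),
        ("Sally", "456 State Street", "sally@coporate.com", "35", "76", "23"),
        ("Sally", "456 State Street", "sally@coporate.com", "77", "34", "18"),
        ("Peter", "789 High Street", "peter@coporate.com", "32", "14", "54"),
        ("Sally", "456 State Street", "sally@coporate.com", "64", "49", "23"),
    ],
)

# Group by the columns that should be displayed once; aggregate the rest.
rows = conn.execute(
    """
    SELECT name, address, email,
           group_concat(value1, ',') AS values1,
           group_concat(value2, ',') AS values2,
           group_concat(value3, ',') AS values3
    FROM a
    GROUP BY name, address, email
    """
).fetchall()

# Map each name to its three concatenated value strings
result = {r[0]: r[3:] for r in rows}
for r in rows:
    print(r)
```

The key point carries over to Oracle: every non-aggregated column in the SELECT list (Name, Address, Email) has to appear in GROUP BY, which is why those columns could not be displayed alongside LISTAGG without it.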

how to SQL query with conditioned distinct

Simple Database:
street | age
1st st | 2
2nd st | 3
3rd st | 4
3rd st | 2
I'd like to build a query that'll return the DISTINCT street names, but only for those households where no one is over 3.
so that result would be:
street | age
1st st | 2
2nd st | 3
How do I do that? I know of DISTINCT, but not how to apply a condition across all the records that match each DISTINCT value.
Suppose the name of the table is 'tab'. You can then try:
select distinct street from tab where street not in (select street from tab where age>3);
I have created a sql fiddle where you can view the result:
http://sqlfiddle.com/#!9/2c513d/2
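Distinct street names for households where no one is over 3:
SELECT street
FROM tab
GROUP BY street
HAVING MAX(age) <= 3 -- the oldest resident must be 3 or under; COUNT(1) would count rows instead of comparing ages
-- Filtering individual rows is not enough: the query below still returns 3rd st because of its age-2 row
SELECT DISTINCT street
FROM tab
WHERE NOT(age>3)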
Use GROUP BY:
Select Street
from yourtable
group by street
Having max(age)<=3 -- max, not sum: sum(age) adds every resident's age together
Another way this could be achieved is with NOT EXISTS:
SELECT *
FROM yourtable a
WHERE NOT EXISTS
(SELECT street
FROM yourtable b
WHERE age > 3
AND a.street = b.street)
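The difference between filtering rows and filtering whole streets is easy to check on the question's four rows. The following is a small sketch using Python's built-in sqlite3 module; the table name `tab` is taken from the first answer, and the variable names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (street TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?)",
    [("1st st", 2), ("2nd st", 3), ("3rd st", 4), ("3rd st", 2)],
)

def streets(sql: str) -> list:
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]

# Street-level condition: the oldest resident must be 3 or under.
by_max = streets(
    "SELECT street FROM tab GROUP BY street HAVING MAX(age) <= 3 ORDER BY street"
)

# Same result with an anti-join style subquery.
by_not_exists = streets(
    "SELECT DISTINCT street FROM tab a "
    "WHERE NOT EXISTS (SELECT 1 FROM tab b "
    "                  WHERE b.street = a.street AND b.age > 3) "
    "ORDER BY street"
)

# Row-level condition only: 3rd st survives through its age-2 row.
row_filter = streets(
    "SELECT DISTINCT street FROM tab WHERE NOT (age > 3) ORDER BY street"
)

print(by_max, by_not_exists, row_filter)
```

Only the first two queries exclude 3rd st, because they test the condition against every row of a street rather than against each row in isolation.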

Return only unrepeated first word of companyname

This is my table
EmpID | EmpName | CompanyName | CompanyID
123 | Josep | Kramer Levin Naftalis & Frankel LLP | 468
123 | Josep | Thompson Hine LLP | 567
801 | Simon | Ogletree Deakins International LLP | 222
801 | Simon | Ogletree, Deakins, Nash PC | 916
602 | alen | Baker Co Ltd | 732
602 | alen | Baker Mcken Ltd | 242
The condition: a row should be returned only if the first word of the company name is not repeated. For example, Baker and Ogletree each appear more than once, so those rows are not included in the result.
My expected output looks like this:
EmpID | EmpName | CompanyName | CompanyID
123 | Josep | Kramer Levin Naftalis & Frankel LLP | 468
123 | Josep | Thompson Hine LLP | 567
This one is written for Oracle:
SELECT *
FROM (SELECT table.*,
LEAD(company, 1) OVER (PARTITION BY empid ORDER BY companyid) AS next_company
FROM table)
WHERE SUBSTR(company, 0, INSTR(company, ' ') - 1) != SUBSTR(next_company, 0, INSTR(next_company, ' ') - 1) AND next_company IS NOT NULL;
Edit: I see now from your comment that you're using SQL Server. It's probably not very different, but I don't have the skills to help with that.
You can try with something like this
create table #EmployeeTemp (EmpID int, Name nvarchar(100), Company nvarchar(100), CompanyID int)
insert into #EmployeeTemp (EmpID, Name, Company, CompanyID)
values
(11, 'Alberto', 'Baker Mec Ltd', 25),
(11, 'Alberto', 'Baker rel LLP', 26),
(12, 'Ameez', 'Baker Mec Ltd', 25),
(12, 'Ameez', 'Wrong Part LLP', 27)
Select EmpID, FirstWord, Count(FirstWord) as 'Counted', Count(FirstWord) over (partition by EmpID,FirstWord) 'CountByWord'
into #EmployeeFinal
from (
SELECT EmpID, CASE CHARINDEX(' ', Company, 1)
WHEN 0 THEN Company
ELSE SUBSTRING(Company, 1, CHARINDEX(' ', Company, 1) - 1)
END as 'FirstWord'
from #EmployeeTemp) src
group by EmpID, FirstWord
Select * from #EmployeeFinal where Counted = 1 and CountByWord = 1
So, you will select only first word of company, count how many times that word repeats by employee and then select only those that you want
This is the result:
EmpID | FirstWord | Counted | CountByWord
12 | Baker | 1 | 1
12 | Wrong | 1 | 1
I am interpreting this to mean that you want all employees that have different first words for the company name, where the first word is separated by a space.
You can use window functions in combination with string functions to get this information:
select t.*
from (select t.*,
min(left(company, charindex(' ', company + ' '))) over (partition by empid) as minfirstword,
max(left(company, charindex(' ', company + ' '))) over (partition by empid) as maxfirstword
from t
) t
where minfirstword <> maxfirstword;
Here is a SQL Fiddle.
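The same interpretation (keep an employee's rows only when their company names start with more than one distinct first word) can be sketched in Python. The helper names are assumptions made for this example. One further assumption is labeled in the code: trailing punctuation is stripped from the first word, because the sample data contains both "Ogletree" and "Ogletree," and the expected output treats them as the same word.

```python
from collections import defaultdict

def first_word(company: str) -> str:
    """First space-separated word, with trailing punctuation removed.

    Stripping ',', '.' and ';' is an assumption so that 'Ogletree,'
    and 'Ogletree' compare equal, as the expected output implies.
    """
    return company.split()[0].rstrip(",.;")

# Sample rows from the question: (EmpID, EmpName, CompanyName, CompanyID)
rows = [
    (123, "Josep", "Kramer Levin Naftalis & Frankel LLP", 468),
    (123, "Josep", "Thompson Hine LLP", 567),
    (801, "Simon", "Ogletree Deakins International LLP", 222),
    (801, "Simon", "Ogletree, Deakins, Nash PC", 916),
    (602, "alen", "Baker Co Ltd", 732),
    (602, "alen", "Baker Mcken Ltd", 242),
]

# Collect the distinct first words seen for each employee
words_by_emp = defaultdict(set)
for emp_id, _name, company, _company_id in rows:
    words_by_emp[emp_id].add(first_word(company))

# Keep rows only for employees with more than one distinct first word
kept = [row for row in rows if len(words_by_emp[row[0]]) > 1]
for row in kept:
    print(row)
```

Without the punctuation step, a plain comparison of the text up to the first space would treat "Ogletree" and "Ogletree," as different words.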

splitting a string by multiple delimiters

I have a set of addresses:
34 Main St Suite 23
435 Center Road Ste 3
34 Jack Corner Bldg 4
2 Some Street Building 345
the delimiters would be:
Suite, Ste, Bldg, Building
I would like to separate these addresses into address1 and address2 like this:
+---------------------+--------------+
| Address1 | Address2 |
+---------------------+--------------+
| 34 Main St | Suite 23 |
| 435 Center Road | Ste 3 |
| 34 Jack Corner | Bldg 4 |
| 2 Some Street | Building 345 |
+---------------------+--------------+
How can I define a set of delimiters and delimit in this fashion?
SELECT
T.Address,
Left(T.Address, IsNull(X.Pos - 1, 2147483647)) Address1,
Substring(T.Address, X.Pos + 1, 2147483647) Address2 -- Null if no second
FROM
(
VALUES
('34 Main St Suite 23'),
('435 Center Road Ste 3'),
('34 Jack Corner Bldg 4'),
('2 Some Street Building 345'),
('123 Sterling Rd'),
('405 29th St Bldg 4 Ste 217')
) T (Address)
OUTER APPLY (
SELECT TOP 1 NullIf(PatIndex(Delimiter, T.Address), 0) Pos
FROM (
VALUES ('% Suite %'), ('% Ste %'), ('% Bldg %'), ('% Building %')
) X (Delimiter)
WHERE T.Address LIKE X.Delimiter
ORDER BY Pos
) X
I used PatIndex() so an address like "Sterling Rd" won't give you a false match on "Ste"
Result set:
Address1 | Address2
34 Main St | Suite 23
435 Center Road | Ste 3
34 Jack Corner | Bldg 4
2 Some Street | Building 345
123 Sterling Rd | NULL
405 29th St | Bldg 4 Ste 217
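For comparison, the "first delimiter surrounded by spaces" rule can be sketched with a regular expression in Python. This is an illustration only, not a translation of the T-SQL above: `DELIMITERS`, `PATTERN` and `split_address` are names invented for the example, and unlike PatIndex under a case-insensitive collation, the match here is case-sensitive.

```python
import re

# The delimiter words listed in the question
DELIMITERS = ("Suite", "Ste", "Bldg", "Building")

# A delimiter only counts when it has whitespace on both sides,
# so "Sterling" does not match "Ste".
PATTERN = re.compile(r"\s(?:%s)\s" % "|".join(map(re.escape, DELIMITERS)))

def split_address(address: str):
    """Split at the first delimiter word; return (address1, address2)."""
    match = PATTERN.search(address)
    if match is None:
        return address, None
    # match.start() is the space before the delimiter word
    return address[: match.start()], address[match.start() + 1 :]

for addr in (
    "34 Main St Suite 23",
    "435 Center Road Ste 3",
    "123 Sterling Rd",
    "405 29th St Bldg 4 Ste 217",
):
    print(split_address(addr))
```

Because the search returns the leftmost match, an address with two delimiters is split at the first one, which matches the "Bldg 4 Ste 217" row in the result set.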
You can use a table of delimiters on which to perform your split. In this example I am using XML to do the parsing, but after you've swapped in a reliable delimiter in place of your set (Ste, Suite, etc.) then you can perform the splitting using any of many t-sql based methods.
declare @tab table (s varchar(100))
insert into @tab
select '34 Main St Suite 23' union all
select '435 Center Road Ste 3' union all
select '34 Jack Corner Bldg 4' union all
select '2 Some Street Building 345' union all
select '20950 N. Tatum Blvd., Ste 300' union all
select '1524 McHenry Ave Ste 470';
declare @delimiters table (d varchar(100));
insert into @delimiters
select 'Suite' union all
select 'Ste' union all
select 'Bldg' union all
select 'Building';
select s,
cast('<r>'+ replace(s, d, '</r><r>'+d) + '</r>' as xml),
[Street1] = cast('<r>'+ replace(s, d, '</r><r>'+d) + '</r>' as xml).value('r[1]', 'varchar(100)'),
[Street2] = cast('<r>'+ replace(s, d, '</r><r>'+d) + '</r>' as xml).value('r[2]', 'varchar(100)')
from @tab t
cross apply @delimiters d
where charindex(' '+d+' ', s) > 0;
select Addr,CASE WHEN CHARINDEX('suite',addr,1)>0 then LEFT(addr,CHARINDEX('suite',addr,1)-1)
WHEN CHARINDEX('Ste',addr,1)>0 then LEFT(addr,CHARINDEX('Ste',addr,1)-1)
WHEN CHARINDEX('Bldg',addr,1)>0 then LEFT(addr,CHARINDEX('Bldg',addr,1)-1)
WHEN CHARINDEX('Building',addr,1)>0 then LEFT(addr,CHARINDEX('Building',addr,1)-1)
END as [Address],
CASE WHEN CHARINDEX('suite',addr,1)>0 then RIGHT(addr,len(addr)-(CHARINDEX('suite',addr,1)-1))
WHEN CHARINDEX('Ste',addr,1)>0 then RIGHT(addr,len(addr)-(CHARINDEX('Ste',addr,1)-1))
WHEN CHARINDEX('Bldg',addr,1)>0 then RIGHT(addr,len(addr)-(CHARINDEX('Bldg',addr,1)-1))
WHEN CHARINDEX('Building',addr,1)>0 then RIGHT(addr,len(addr)-(CHARINDEX('Building',addr,1)-1))
END as [Address1]
from Addr
If you're going to try to parse this data, and it's NOT going to be delimited by something (e.g. a comma), it's going to be much harder and you will have to make some assumptions. Having a larger data set can help you make stronger assumptions, but it will still be very brittle.
Looking at your data, I think you can make the following assumptions:
1) Address 2 is always the last 2 words (when split with spaces), so you could split the address based on spaces, and use the last 2 as Address 2, and the rest as Address 1.
2) You can assume Address 1 is the first 3 words, and the rest is Address 2.
To split up this data, I would either use a T-SQL equivalent of split(' ', $data) to get an array of the words, or use a T-SQL equivalent of strpos and strrpos to find the second-to-last space (or the position of the third space) and substr everything before and after it into the appropriate variables.
It's up to you to make the decision based on the data available to pick the more robust assumptions and work with them.

SQL problem - one name 2 address in the same table

CName | AddressLine
-------------------------------
John Smith | 123 Nowheresville
Jane Doe | 456 Evergreen Terrace
John Smith | 999 Somewhereelse
Joe Bloggs | 1 Second Ave
If I have this table, is it possible to write a SELECT that returns the data like this?
CNAME | Address1 | Address2
John Smith | 123 Nowheresville | 999 Somewhereelse
I'm using Oracle.
It is considered a bad design (inefficient memory usage) to add a new column for appearance of duplications in just some rows . Maybe you should consider using inner-join and a separate table for the address column!
As your table stands, you cannot use a simple self-join to reduce this to a single line. You can bring back rows that have all of the addresses (so long as you hard-code for a particular maximum number of addresses), but you will always have the same number of rows as there are addresses for a given user (unless you have a way of identifying a single address as "primary").
In order to reduce your result set to a single line, you'll have to provide some way of marking a "first" address. With SQL Server (or a similar professional-grade RDBMS), you could use a common table expression with ranking/row numbering functions to do this:
with Addresses as
(select
CName,
AddressLine,
row_number() over (partition by CName order by AddressLine) as RowNum
from YourTable)
select
a1.CName,
a1.AddressLine as Address1,
a2.AddressLine as Address2,
a3.AddressLine as Address3
from Addresses a1
left join Addresses a2 on a2.CName = a1.CName and a2.RowNum = 2
left join Addresses a3 on a3.CName = a1.CName and a3.RowNum = 3
where a1.RowNum = 1
temp = your table name
select distinct cname, addressline as [address1],
(
ISNULL((select addressline from temp where cname = t.cname and addressline != t.addressline), '')
) as address2
from
temp t
The problem is resolved: Frank Kulash on the Oracle forum solved it.
Here is the solution:
WITH got_r_num AS
(
SELECT cname, addressline
, ROW_NUMBER () OVER ( PARTITION BY cname
ORDER BY addressline
) AS r_num
FROM table_x
-- WHERE ... -- If you need any filtering, put it here
)
SELECT cname
, MIN (CASE WHEN r_num = 1 THEN addressline END) AS addressline1
, MIN (CASE WHEN r_num = 2 THEN addressline END) AS addressline2
FROM got_r_num
GROUP BY cname
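The logic of that query (number each customer's addresses, then spread the numbered rows into columns) can be sketched in plain Python. The function name `pivot_addresses` and the `max_addresses` parameter are assumptions made for this illustration; as in the SQL, the number of address columns has to be fixed in advance and the addresses are ordered alphabetically.

```python
from collections import defaultdict

# Sample rows from the question: (CName, AddressLine)
rows = [
    ("John Smith", "123 Nowheresville"),
    ("Jane Doe", "456 Evergreen Terrace"),
    ("John Smith", "999 Somewhereelse"),
    ("Joe Bloggs", "1 Second Ave"),
]

def pivot_addresses(rows, max_addresses=2):
    """Return one tuple per name: (cname, address1, ..., addressN)."""
    grouped = defaultdict(list)
    for cname, addressline in rows:
        grouped[cname].append(addressline)

    result = []
    for cname in sorted(grouped):
        # Sorting plays the role of ROW_NUMBER() ... ORDER BY addressline
        ordered = sorted(grouped[cname])[:max_addresses]
        # Pad with None, like the NULLs the CASE expressions produce
        ordered += [None] * (max_addresses - len(ordered))
        result.append((cname, *ordered))
    return result

for line in pivot_addresses(rows):
    print(line)
```

Addresses beyond `max_addresses` are silently dropped here, just as the SQL version ignores rows with r_num greater than 2.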
Thanks to all for the help.