SQL query get distinct records - sql

I need help with a query to get distinct records from the table.
SELECT distinct cID, firstname, lastname,
typeId, email from tableA
typeId and email have different values in the table. I know this is why the query returns 2 records: DISTINCT applies to all selected columns, and these two values differ.
Is there any way I can get 1 record for each cID irrespective of typeId and email?

If you don't care about what typeId and email get selected with each cID, following is one way to do it.
SELECT DISTINCT a.cID
, a.firstname
, a.lastname
, b.typeId
, b.email
FROM TableA a
INNER JOIN (
SELECT cID, MIN(typeId) AS typeId, MIN(email) AS email
FROM TableA
GROUP BY
cID
) b ON b.cID = a.cID

If any one value for typeId and email are acceptable, then
SELECT cID, firstname, lastname,
max(typeId), max(email)
from tableA
group by cID, firstname, lastname
should do it.
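A minimal sketch of the difference, run against an in-memory SQLite database from Python. The sample rows are made up for illustration; only the table and column names come from the question. Note that MAX(typeId) and MAX(email) are computed independently, so the two values may come from different source rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (cID INTEGER, firstname TEXT, lastname TEXT,
                         typeId INTEGER, email TEXT);
    -- Hypothetical sample data: cID 1 appears twice with different typeId/email.
    INSERT INTO tableA VALUES
        (1, 'Ann', 'Lee', 10, 'ann@work.example'),
        (1, 'Ann', 'Lee', 20, 'ann@home.example'),
        (2, 'Bob', 'Ray', 10, 'bob@home.example');
""")

# DISTINCT over all five columns keeps both of Ann's rows.
distinct_rows = conn.execute(
    "SELECT DISTINCT cID, firstname, lastname, typeId, email FROM tableA"
).fetchall()

# Aggregating the varying columns collapses them to one row per cID.
grouped_rows = conn.execute("""
    SELECT cID, firstname, lastname, MAX(typeId), MAX(email)
    FROM tableA
    GROUP BY cID, firstname, lastname
    ORDER BY cID
""").fetchall()

print(len(distinct_rows))
print(grouped_rows)
```

In this data, Ann's result row pairs typeId 20 with the work email, a combination that exists in no single source row. If the pair must come from one row, a different technique (for example a window function) would be needed.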

Is this what you are after?
SELECT distinct a.cID, a.firstname,
a.lastname, (SELECT MIN(typeId) from tableA
WHERE cID = a.cID) AS typeId, (Select MIN(email) from
tableA WHERE cID = a.cID) AS email from tableA
AS a

Related

Finding everything not in SQL query result

I have the following query:
SELECT first_name, age
FROM taba
WHERE NOT EXISTS
(
SELECT p.first_name, MAX(p.age) as age FROM taba p
GROUP BY p.first_name
);
The inner sub-query finds the largest age for a given name. I want to basically find every row that isn't in the inner subquery result. What's the best way to achieve that? This query gives me the empty set and I'm not sure why.
Max Age By First Name
All Data
I want everything in all data that isn't in the max age by first name query.
Using a correlated sub-query...
SELECT
*
FROM
taba T
WHERE
age < (
SELECT MAX(age)
FROM taba P
WHERE T.first_name = P.first_name
)
Using a sub-query and a join...
SELECT
t.*
FROM
taba t
INNER JOIN
(
SELECT first_name, MAX(age) AS max_age
FROM taba
GROUP BY first_name
)
AS age
ON age.first_name = t.first_name
AND age.max_age > t.age
Using EXCEPT...
SELECT first_name, age FROM taba
EXCEPT
SELECT first_name, MAX(age) FROM taba GROUP BY first_name
Using EXISTS() (with a correlated sub-query)...
SELECT
*
FROM
taba T
WHERE
EXISTS (
SELECT *
FROM taba P
WHERE P.first_name = T.first_name
AND P.age > T.age
)
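As a quick check, the four variants above can be run side by side. This is a sketch using Python's sqlite3 module with made-up sample rows; the derived-table alias `m` is my own naming. One difference worth knowing: EXCEPT removes duplicate rows from its result, while the other three keep them, so the outputs only match when (first_name, age) pairs are unique, as they are here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taba (first_name TEXT, age INTEGER);
    -- Hypothetical sample data.
    INSERT INTO taba VALUES
        ('amy', 30), ('amy', 25),
        ('bob', 40), ('bob', 35), ('bob', 20),
        ('cat', 50);
""")

queries = {
    "correlated": """
        SELECT first_name, age FROM taba T
        WHERE age < (SELECT MAX(age) FROM taba P
                     WHERE T.first_name = P.first_name)""",
    "join": """
        SELECT t.first_name, t.age FROM taba t
        INNER JOIN (SELECT first_name, MAX(age) AS max_age
                    FROM taba GROUP BY first_name) AS m
          ON m.first_name = t.first_name AND m.max_age > t.age""",
    "except": """
        SELECT first_name, age FROM taba
        EXCEPT
        SELECT first_name, MAX(age) FROM taba GROUP BY first_name""",
    "exists": """
        SELECT first_name, age FROM taba T
        WHERE EXISTS (SELECT 1 FROM taba P
                      WHERE P.first_name = T.first_name
                        AND P.age > T.age)""",
}

# Sort each result so the comparison does not depend on row order.
results = {name: sorted(conn.execute(sql).fetchall())
           for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```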
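You need to refer to the outer table in the inner query to make this query work. Without that reference the subquery returns the same rows for every outer row, so NOT EXISTS is always false.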
SELECT first_name, age
FROM taba T
WHERE NOT EXISTS(SELECT NULL
FROM taba P
WHERE T.first_name = P.first_name
AND T.age = (SELECT MAX(age)
FROM taba P2
WHERE P.first_name = P2.first_name)
GROUP BY p.first_name
);
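To see why the original query returns the empty set, here is a small sketch using Python's sqlite3 module with made-up rows. The uncorrelated subquery produces rows whenever the table is non-empty, so NOT EXISTS is false for every outer row; the correlated form is evaluated once per outer row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taba (first_name TEXT, age INTEGER);
    -- Hypothetical sample data.
    INSERT INTO taba VALUES ('amy', 30), ('amy', 25), ('bob', 40);
""")

# The subquery never mentions the outer row, so it yields the same
# (non-empty) result every time and NOT EXISTS filters out everything.
uncorrelated = conn.execute("""
    SELECT first_name, age FROM taba
    WHERE NOT EXISTS (SELECT p.first_name, MAX(p.age)
                      FROM taba p
                      GROUP BY p.first_name)
""").fetchall()

# Referencing T inside the subquery makes the test depend on each outer row.
correlated = conn.execute("""
    SELECT first_name, age FROM taba T
    WHERE NOT EXISTS (SELECT NULL FROM taba P
                      WHERE T.first_name = P.first_name
                        AND T.age = (SELECT MAX(age) FROM taba P2
                                     WHERE P.first_name = P2.first_name))
""").fetchall()

print(uncorrelated)
print(correlated)
```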
You may also try the shorter version below:
SELECT first_name, age
FROM (SELECT first_name, age, RANK() OVER(PARTITION BY first_name ORDER BY age DESC) RNK
FROM taba) ranked
WHERE RNK <> 1;
There is one other solution to at least be aware of.
This avoids the correlated subquery and does not require a join.
A common solution:
SELECT *
FROM taba
WHERE (first_name, age) NOT IN (
SELECT first_name, MAX(age)
FROM taba
GROUP BY first_name
)
;
One version can be:
SELECT b.first_name, b.age
FROM taba b
WHERE b.age < (SELECT Max(p.age)
FROM taba p
WHERE p.first_name = b.first_name)

With a CTE, how do I select the rows from tableB even when tableA returns no results?

I have the following sqlfiddle: http://sqlfiddle.com/#!15/e971e/10/0
So, I am running the query against two tables simultaneously. I'd like to get all info for "fran" even if she isn't in 1 table or the other. I have:
WITH tableA AS(
SELECT id, name, age
FROM a WHERE name = 'fran'
),
tableB AS(
SELECT id, name, address
FROM b WHERE name = 'fran'
)
SELECT tableA.id, tableA.name, tableA.age, tableB.address
FROM tableA, tableB
/*
WITH tableB AS(
SELECT id, name, address
FROM b WHERE name = 'fran'
)
SELECT tableB.address FROM tableB
*/
This returns no rows even though she is in tableB (run the commented portion and it returns her address).
In the end I'd like something like:
id name age address
-- -- -- 2 Main
I see your problem Marissa. The thing is that you are not joining the tables.
WITH tableA AS(
SELECT id, name, age
FROM a WHERE name = 'fran'
),
tableB AS(
SELECT id, name, address
FROM b WHERE name = 'fran'
)
SELECT a.id, a.name, a.age, b.address
FROM tableA a
FULL OUTER JOIN tableB b
ON a.id = b.id
This returns a single row containing only Fran's address. The other columns are NULL because table A has no row for her.
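The underlying reason the comma join returns nothing is that it is a cross join, and a cross join with an empty input is empty. Here is a sketch using Python's sqlite3 module. The row for fred is made up, and because older SQLite versions lack FULL OUTER JOIN, the sketch substitutes a LEFT JOIN starting from tableB, which covers this particular case (data in b, none in a) but not the reverse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, name TEXT, age INTEGER);
    CREATE TABLE b (id INTEGER, name TEXT, address TEXT);
    INSERT INTO a VALUES (2, 'fred', 30);        -- hypothetical row; no 'fran' in a
    INSERT INTO b VALUES (2, 'fran', '2 Main');
""")

ctes = """
    WITH tableA AS (SELECT id, name, age FROM a WHERE name = 'fran'),
         tableB AS (SELECT id, name, address FROM b WHERE name = 'fran')
"""

# Comma join = cross join: 0 rows from tableA times 1 row from tableB is 0 rows.
comma_rows = conn.execute(ctes + """
    SELECT tableA.id, tableA.name, tableA.age, tableB.address
    FROM tableA, tableB
""").fetchall()

# An outer join keeps the tableB row and fills the tableA columns with NULL.
outer_rows = conn.execute(ctes + """
    SELECT ta.id, ta.name, ta.age, tb.address
    FROM tableB tb
    LEFT JOIN tableA ta ON ta.id = tb.id
""").fetchall()

print(comma_rows)
print(outer_rows)
```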
Use union all!
WITH tableA AS (
SELECT id, name, age
FROM a
WHERE name = 'fran'
),
tableB AS (
SELECT id, name, address
FROM b
WHERE name = 'fran'
)
select 'a' as which, a.id, a.name, a.age, null as address
from tableA a
union all
select 'b' as which, b.id, b.name, null as age, b.address
from tableB b;
The join solutions will multiply the number of rows in the table. So, if a has 3 rows and b has 4 rows, then the result set will have 12 rows. With this method, the result set will have 7 rows.
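The row arithmetic can be checked with a small sketch using Python's sqlite3 module. The seven rows are made-up placeholders, all named 'fran', so that a join on name pairs every row of a with every row of b.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, name TEXT, age INTEGER);
    CREATE TABLE b (id INTEGER, name TEXT, address TEXT);
    -- Hypothetical data: 3 matching rows in a, 4 in b.
    INSERT INTO a VALUES (1, 'fran', 30), (2, 'fran', 31), (3, 'fran', 32);
    INSERT INTO b VALUES (1, 'fran', '1 Main'), (2, 'fran', '2 Main'),
                         (3, 'fran', '3 Main'), (4, 'fran', '4 Main');
""")

# Every a row matches every b row, so the join yields 3 * 4 rows.
join_count = conn.execute("""
    SELECT COUNT(*) FROM a JOIN b ON a.name = b.name
    WHERE a.name = 'fran'
""").fetchone()[0]

# UNION ALL stacks the two inputs, so it yields 3 + 4 rows.
union_count = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT 'a' AS which, id, name FROM a WHERE name = 'fran'
        UNION ALL
        SELECT 'b' AS which, id, name FROM b WHERE name = 'fran'
    ) AS stacked
""").fetchone()[0]

print(join_count, union_count)
```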
You need a full outer join.
Based on your sqlfiddle the id is not suitable for joining (id 2 is fred in table A and fran in table B), you need to use the name column instead. You should ensure that name is indexed for proper performance.
SELECT
a.id, a.name, a.age,
b.id, b.name, b.address
FROM a
FULL OUTER JOIN b ON b.name = a.name
WHERE a.name = 'fran' or b.name = 'fran'
You need a Full Join plus Coalesce:
WITH tableA AS(
SELECT id, name, age
FROM a WHERE name = 'fran'
),
tableB AS(
SELECT id, name, address
FROM b WHERE name = 'fran'
)
SELECT COALESCE(tableA.id, tableB.id) AS id
,COALESCE(tableA.name, tableB.name) as name
,tableA.age, tableB.address
FROM tableA FULL JOIN tableB
ON tableA.id = tableB.id

Inserting data from two similar tables into one master table in Sql Server

Table1 -> Id, CountryFk, CompanyName
Table2 -> Id, CountryFk, CompanyName, Website
I need to merge Table1 and Table2 into 1 master table. I know this can be done by something like the below query -
INSERT INTO masterTable(Id, CountryFk, CompanyName)
SELECT * FROM Table1
UNION
SELECT * FROM Table2;
But, I have an extra column, website in table2 which isn't there in table1. I need this column in masterTable.
More importantly, Table1 and Table2 contain repeated companies with the same CountryFk. For example, IBM at CountryFk = 123 could be present twice in Table1, and Table1 could have a CompanyName that is also present in Table2.
For example, IBM at CountryFk = 123 could be present in both Table1 and Table2. I need to make sure that masterTable does not have any duplicate companies.
Please note that the companyname by itself need not be unique. masterTable can have IBM with countryFK = 123 and IBM with countryFk = 321.
masterTable cannot have IBM with countryFk=123 twice.
IMHO, if you need to ensure that the combination of CompanyName and CountryFk is never duplicated in MasterTable, you should add a unique index on those two columns.
The query below selects all distinct (CountryFk, CompanyName) pairs from Table1 and Table2 and inserts only those that do not already exist in MasterTable.
-- Id is identity, no need to insert value
INSERT MasterTable (CountryFk, CompanyName, WebSite)
SELECT
CountryFk,
CompanyName,
(
SELECT TOP(1) WebSite FROM Table2
WHERE CompanyName = data.CompanyName
AND CountryFk = data.CountryFk
AND WebSite IS NOT NULL
) AS WebSite
FROM
(
SELECT CountryFk, CompanyName FROM Table1
UNION
SELECT CountryFk, CompanyName FROM Table2
) data
WHERE
NOT EXISTS
(
SELECT * FROM MasterTable
WHERE CompanyName = data.CompanyName AND CountryFk = data.CountryFk
)
GROUP BY
CountryFk,
CompanyName
This may work
INSERT INTO masterTable(Id, CountryFk, CompanyName,Website)
SELECT Id, CountryFk, CompanyName, NULL as Website FROM Table1
UNION
SELECT Id, CountryFk, CompanyName,Website FROM Table2;
Try this for proper UNION.
INSERT INTO masterTable(Id, CountryFk, CompanyName,Website)
SELECT Id, CountryFk, CompanyName, NULL AS Website FROM Table1
UNION
SELECT Id, CountryFk, CompanyName, Website FROM Table2 WHERE CompanyName NOT IN(SELECT CompanyName FROM Table1);
Or:
Once you have merged the data from both tables into the master table, you can look for duplicate rows and remove them.
;WITH cte
AS (SELECT ROW_NUMBER() OVER (PARTITION BY CountryFk, CompanyName
ORDER BY ( SELECT 0)) RN
FROM masterTable)
DELETE FROM cte
WHERE RN > 1
INSERT INTO masterTable(CountryFk, CompanyName, WebSite)
SELECT CountryFk, CompanyName, min(WebSite)
FROM Table2
group by CountryFk, CompanyName;
INSERT INTO masterTable(CountryFk, CompanyName)
SELECT distinct Table1.CountryFk, Table1.CompanyName
FROM Table1
LEFT JOIN masterTable
on masterTable.CountryFk = Table1.CountryFk
and masterTable.CompanyName = Table1.CompanyName
where masterTable.CountryFk is null;
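A minimal sketch of the two-step load combined with a unique index, run in SQLite from Python rather than SQL Server, so the DDL details are assumptions. The index name `ux_master`, the sample companies, and the websites are hypothetical. Table2 is loaded first because it carries the Website column; Table1 then contributes only pairs that are still missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, CountryFk INTEGER, CompanyName TEXT);
    CREATE TABLE Table2 (Id INTEGER PRIMARY KEY, CountryFk INTEGER, CompanyName TEXT,
                         Website TEXT);
    CREATE TABLE masterTable (Id INTEGER PRIMARY KEY, CountryFk INTEGER,
                              CompanyName TEXT, Website TEXT);
    -- Guard against duplicates at the schema level (index name is hypothetical).
    CREATE UNIQUE INDEX ux_master ON masterTable (CountryFk, CompanyName);

    -- Hypothetical sample data, including a repeat inside Table1.
    INSERT INTO Table1 (CountryFk, CompanyName) VALUES
        (123, 'IBM'), (123, 'IBM'), (321, 'IBM'), (123, 'Acme');
    INSERT INTO Table2 (CountryFk, CompanyName, Website) VALUES
        (123, 'IBM', 'ibm.example'), (456, 'Initech', 'initech.example');

    -- Step 1: one row per pair from Table2, keeping a website.
    INSERT INTO masterTable (CountryFk, CompanyName, Website)
    SELECT CountryFk, CompanyName, MIN(Website)
    FROM Table2
    GROUP BY CountryFk, CompanyName;

    -- Step 2: pairs from Table1 that are not in masterTable yet.
    INSERT INTO masterTable (CountryFk, CompanyName)
    SELECT DISTINCT t.CountryFk, t.CompanyName
    FROM Table1 t
    WHERE NOT EXISTS (SELECT 1 FROM masterTable m
                      WHERE m.CountryFk = t.CountryFk
                        AND m.CompanyName = t.CompanyName);
""")

master_rows = conn.execute("""
    SELECT CountryFk, CompanyName, Website FROM masterTable
    ORDER BY CountryFk, CompanyName
""").fetchall()

# The unique index rejects a later attempt to insert an existing pair.
try:
    conn.execute("INSERT INTO masterTable (CountryFk, CompanyName) VALUES (123, 'IBM')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(master_rows, duplicate_rejected)
```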

merge statement - upsert - performing unique test in source table as well

I need some help with SQL Server merge statement. I am using version 2008.
I have two tables table1 and table2 with 3 column in each table: name, age, lastname.
I want to do a small variant of an upsert from table2 to table1: if a record already exists in table1, ignore it; if it doesn't exist, insert it.
I know following would work -
merge into [test].[dbo].[table1] a
using [test].[dbo].[table2] b
on a.name = b.name and a.lastname = b.lastname
when not matched then
insert (name, age, lastname) values (b.name, b.age, b.lastname)
I would like to know if I could do something like this? Currently following doesn't work:
merge into [test].[dbo].[table1] a
using [test].[dbo].[table2] b
on a.name = b.name and a.lastname = b.lastname
when not matched then
insert (select name, max(age), lastname from b group by name, lastname)
Basically I want to insert only 'unique records' from table2 into table1. Here 'unique' means one row per combination of name and lastname.
Thanks.
It's not really an UPSERT operation; it's a simple insert, and I would do something like this:
insert into [test].[dbo].[table1](name, age, lastname)
SELECT b.name, MAX(b.age) Age, b.lastname
FROM [test].[dbo].[table2] b
WHERE NOT EXISTS (SELECT 1
FROM [test].[dbo].[table1]
WHERE name = b.name
and lastname = b.lastname)
GROUP BY b.name, b.lastname
UPSERT would be if you updated records if they already existed.
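Here is a sketch of the insert-if-missing pattern above, run in SQLite from Python because it needs no server. SQLite has no MERGE statement, so only the INSERT ... WHERE NOT EXISTS form is shown. The sample people are made up, and the alias `a` for table1 is my own addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (name TEXT, age INTEGER, lastname TEXT);
    CREATE TABLE table2 (name TEXT, age INTEGER, lastname TEXT);
    -- Hypothetical sample data: Ann already exists, Bob appears twice in table2.
    INSERT INTO table1 VALUES ('Ann', 30, 'Lee');
    INSERT INTO table2 VALUES
        ('Ann', 31, 'Lee'), ('Bob', 40, 'Ray'),
        ('Bob', 45, 'Ray'), ('Cid', 20, 'Fox');

    -- Insert one row per (name, lastname) that table1 does not have yet.
    INSERT INTO table1 (name, age, lastname)
    SELECT b.name, MAX(b.age), b.lastname
    FROM table2 b
    WHERE NOT EXISTS (SELECT 1 FROM table1 a
                      WHERE a.name = b.name
                        AND a.lastname = b.lastname)
    GROUP BY b.name, b.lastname;
""")

rows = conn.execute(
    "SELECT name, age, lastname FROM table1 ORDER BY name"
).fetchall()
print(rows)
```

Ann keeps her original age of 30 because existing rows are skipped rather than updated, and Bob is inserted once with the larger of his two ages.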
For a plain insert you don't really need MERGE; an INSERT alone should be enough. But here's a way to do it:
merge into [test].[dbo].[table1] a
using (
select
name,
lastname,
max(age) age
from [test].[dbo].[table2]
group by
name,
lastname
) b on
a.name = b.name and
a.lastname = b.lastname
when not matched
then
insert (
name,
lastname,
age
)
VALUES (
b.name,
b.lastname,
b.age
);

SQL Query with Join, Count and Where

I have 2 tables and am trying to do one query to save myself some work.
Table 1: id, category id, colour
Table 2: category id, category name
I want to join them so that I get id, category id, category name, colour
Then I want to limit it so that no "red" items are selected (WHERE colour != "red")
Then I want to count the number of records in each category (COUNT(id) ... GROUP BY category id).
I have been trying:
SELECT COUNT(table1.id), table1.category_id, table2.category_name
FROM table1
INNER JOIN table2 ON table1.category_id=table2.category_id
WHERE table1.colour != "red"
But it just doesn't work. I've tried lots of variations and just get no results when I try the above query.
You have to use GROUP BY so that one row is returned per category:
SELECT COUNT(*) TotalCount,
b.category_id,
b.category_name
FROM table1 a
INNER JOIN table2 b
ON a.category_id = b.category_id
WHERE a.colour <> 'red'
GROUP BY b.category_id, b.category_name
SELECT COUNT(*), table1.category_id, table2.category_name
FROM table1
INNER JOIN table2 ON table1.category_id=table2.category_id
WHERE table1.colour <> 'red'
GROUP BY table1.category_id, table2.category_name
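A small sketch of the grouped count, run in SQLite from Python with made-up rows. It also shows a side effect of combining INNER JOIN with the WHERE filter: a category whose items are all red does not appear in the result at all, rather than appearing with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, category_id INTEGER, colour TEXT);
    CREATE TABLE table2 (category_id INTEGER, category_name TEXT);
    -- Hypothetical sample data; category 3 contains only red items.
    INSERT INTO table2 VALUES (1, 'shirts'), (2, 'hats'), (3, 'socks');
    INSERT INTO table1 VALUES
        (1, 1, 'red'), (2, 1, 'blue'), (3, 1, 'green'),
        (4, 2, 'red'), (5, 2, 'blue'),
        (6, 3, 'red');
""")

counts = conn.execute("""
    SELECT COUNT(*) AS total, b.category_id, b.category_name
    FROM table1 a
    INNER JOIN table2 b ON a.category_id = b.category_id
    WHERE a.colour <> 'red'
    GROUP BY b.category_id, b.category_name
    ORDER BY b.category_id
""").fetchall()

print(counts)
```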
I have used sub-query and it worked great!
SELECT *,(SELECT count(*) FROM $this->tbl_news WHERE
$this->tbl_news.cat_id=$this->tbl_categories.cat_id) as total_news FROM
$this->tbl_categories