Remove duplicated IDs and derive a new column - SQL

I joined 6 tables together to gather all the information I need.
I want all IDs, names, birthdays, and ethnicities.
Some IDs have 2 or more ethnicities, which causes the ID to be duplicated.
I am thinking of writing a subquery, or can I just use a CASE statement? I have used a CASE statement before and it worked for another problem, but I cannot apply it in this case.
What I have is:
ID NAME Birthdays Ethnicity
4000 Pedram 11/11/1999 Middle East
4001 Carlos 11/11/1920 Spanish
4001 Carlos 11/11/1920 Native American
4002 Asia 11/22/1986 Polish
4002 Asia 11/22/1986 Native American
4002 Asia 11/22/1986 White/Caucasian
What I want is: if an ID is duplicated and its ethnicities differ, just give me this:
ID NAME Birthdays Ethnicity
4000 Pedram 11/11/1999 Middle East
4001 Carlos 11/11/1920 Multiracial
4002 Asia 11/22/1986 Multiracial
PS: Ethnicity is in a different table, which I joined to Person_table.
PS: To join the ethnicity table to Person_table, I needed to join 3 more tables whose primary keys relate to each other.
PS: I tried CASE WHEN COUNT(Id) > 1 THEN 'Multiracial' ELSE Ethnicity END AS Ethnicity_2
and it identified every ethnicity as Multiracial.
Any help or thoughts would be appreciated.

You can use this:
WITH CTE AS
(
SELECT *,
N = COUNT(*) OVER(PARTITION BY ID),
RN = ROW_NUMBER() OVER(PARTITION BY ID ORDER BY Ethnicity)
FROM dbo.YourTable
)
SELECT ID,
NAME,
Birthdays,
CASE WHEN N > 1 THEN 'Multiracial' ELSE Ethnicity END Ethnicity
FROM CTE
WHERE RN = 1;

This one might not be the most efficient but it works. Just substitute your derived table for t below:
SELECT DISTINCT t.id, t.name,
CASE WHEN cnt = 1 THEN ethnicity
ELSE 'Multiracial' END AS ethnicity
FROM t
INNER JOIN
(SELECT id, COUNT(DISTINCT ethnicity) AS cnt
FROM t
GROUP BY id) sub
ON t.id = sub.id
Tested here: http://sqlfiddle.com/#!9/7473f/6

If IIF is available (SQL Server 2012 and later), aggregate per person:
SELECT
id, name, Birthdays,
IIF(COUNT(DISTINCT Ethnicity) > 1, 'Multiracial', MIN(Ethnicity)) as Ethnicity
FROM
Table
GROUP BY
id, name, Birthdays
The same query with a standard CASE expression:
SELECT
id, name, Birthdays,
CASE WHEN COUNT(DISTINCT Ethnicity) > 1 THEN 'Multiracial' ELSE MIN(Ethnicity) END as Ethnicity
FROM
Table
GROUP BY
id, name, Birthdays
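The grouped CASE query above can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module as a stand-in engine (the question does not name one), and the table name people and its column names are assumptions that stand for the 6-table join result.

```python
import sqlite3

# Hypothetical stand-in for the joined result described in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER, name TEXT, birthday TEXT, ethnicity TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        (4000, "Pedram", "11/11/1999", "Middle East"),
        (4001, "Carlos", "11/11/1920", "Spanish"),
        (4001, "Carlos", "11/11/1920", "Native American"),
        (4002, "Asia", "11/22/1986", "Polish"),
        (4002, "Asia", "11/22/1986", "Native American"),
        (4002, "Asia", "11/22/1986", "White/Caucasian"),
    ],
)

# One row per person; more than one distinct ethnicity becomes 'Multiracial'.
rows = conn.execute(
    """
    SELECT id, name, birthday,
           CASE WHEN COUNT(DISTINCT ethnicity) > 1 THEN 'Multiracial'
                ELSE MIN(ethnicity) END AS ethnicity
    FROM people
    GROUP BY id, name, birthday
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```

Because the count is taken per group (GROUP BY id, name, birthday), each person is judged only on their own rows, and a person listed twice with the same ethnicity is not relabelled.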

Related

How can I order by a CASE statement that's in a subquery

Hi all, first-time poster learning (MS)SQL :) - I hope you can help. I have the query below, but I would like to order it with the highest-paying category first.
If I try an ORDER BY salaries within the subquery, I'm told that's not allowed:
Msg 1033, Level 15, State 1, Line 60 The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified.
If I try it outside of the subquery, it tells me that's also not allowed, because salaries is not in the GROUP BY.
Select test.rating, COUNT(rating) as total
FROM (
select ID, name, salaries,
CASE
WHEN salaries > 12345 THEN 'paid well'
WHEN salaries < 12345 THEN 'underpaid'
WHEN salaries = 12345 THEN 'average'
ELSE 'null'
END AS rating
from dupes
) test
GROUP by test.rating
This is my current output from the query above. The counts are exactly what I want, but I would like the paid well category first, followed by average, then underpaid. Can anyone please help me?
rating total
average 2
null 5
underpaid 4
paid well 1
Just add an ORDER BY after the GROUP BY (i.e. it will be the last action performed):
ORDER BY
CASE WHEN test.rating='null' then 99
WHEN test.rating='paid well' then 1
WHEN test.rating='average' then 2
WHEN test.rating='underpaid' then 3
ELSE 4
END
Try this:
SELECT test.rating, COUNT(rating) AS total
FROM (
select ID, name, salaries,
CASE
WHEN salaries > 12345 THEN 'paid well'
WHEN salaries < 12345 THEN 'underpaid'
WHEN salaries = 12345 THEN 'average'
ELSE 'null'
END AS rating
from dupes
) test
GROUP by test.rating
ORDER BY
CASE
WHEN test.rating = 'paid well' THEN 1
WHEN test.rating = 'average' THEN 2
WHEN test.rating = 'underpaid' THEN 3
WHEN test.rating = 'null' THEN 99
ELSE 4
END
WITH detaildata AS (
SELECT test.rating, COUNT(rating) AS total
FROM (
select ID, name, salaries,
CASE
WHEN salaries > 12345 THEN 'paid well'
WHEN salaries < 12345 THEN 'underpaid'
WHEN salaries = 12345 THEN 'average'
ELSE 'null'
END AS rating
from dupes
) test
GROUP by test.rating
)
SELECT rating, total
FROM detaildata
ORDER BY rating
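To see the custom ordering in action, here is a minimal sketch that runs the grouped query with an ORDER BY CASE through Python's sqlite3 module. SQLite stands in for SQL Server here, and the salary values are made up to reproduce the counts in the question. The labels in the ORDER BY are written exactly as the subquery produces them, since string comparison can be case-sensitive depending on the engine and collation (it is in SQLite by default).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dupes (id INTEGER, name TEXT, salaries INTEGER)")

# Made-up data: 1 above 12345, 2 equal to it, 4 below it, 5 with no salary.
salaries = [20000, 12345, 12345, 100, 200, 300, 400, None, None, None, None, None]
conn.executemany(
    "INSERT INTO dupes VALUES (?, ?, ?)",
    [(i, f"person{i}", s) for i, s in enumerate(salaries, start=1)],
)

rows = conn.execute(
    """
    SELECT test.rating, COUNT(rating) AS total
    FROM (
        SELECT CASE
                 WHEN salaries > 12345 THEN 'paid well'
                 WHEN salaries < 12345 THEN 'underpaid'
                 WHEN salaries = 12345 THEN 'average'
                 ELSE 'null'
               END AS rating
        FROM dupes
    ) AS test
    GROUP BY test.rating
    -- Map each label to a sort key; anything unexpected sorts last.
    ORDER BY CASE test.rating
               WHEN 'paid well' THEN 1
               WHEN 'average'   THEN 2
               WHEN 'underpaid' THEN 3
               ELSE 99
             END
    """
).fetchall()

for rating, total in rows:
    print(rating, total)
```

The ORDER BY is legal here because it only refers to rating, which is a grouped column; ordering by salaries fails because that column no longer exists per row after the GROUP BY.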

Calculate Count as Percentage

I have looked around but I just can't seem to understand the logic. I think a good response is here, but like I said, it doesn't make sense, so a more specific explanation would be greatly appreciated.
I want to show how often customers of each ethnicity use a credit card. There are different types of credit cards, but if the CardID = 1, they paid with cash (hence the not-equal-to-1 condition).
I want to group by ethnicity and show the count of transactions, but as a percentage.
SELECT Ethnicity, COUNT(distinctCard.TransactionID) AS CardUseCount
FROM (SELECT DISTINCT TransactionID, CustomerID FROM TransactionT WHERE CardID <> 1)
AS distinctCard INNER JOIN CustomerT ON distinctCard.CustomerID = CustomerT.CustomerID
GROUP BY Ethnicity
ORDER BY COUNT(distinctCard.TransactionID) ASC
So for example, this is what it comes up with:
Ethnicity | CardUseCount
0 | 100
1 | 200
2 | 300
3 | 400
But I would like this:
Ethnicity | CardUsePer
0 | 0.1
1 | 0.2
2 | 0.3
3 | 0.4
If you need the percentage of card transactions per ethnicity, you have to divide the card transactions per ethnicity by the total transactions of the same ethnicity. You don't need a subquery for that:
SELECT Ethnicity, sum(IIF(CardID=1,0,1))/count(1) AS CardUsePercentage
FROM TransactionT
INNER JOIN CustomerT
ON TransactionT.CustomerID = CustomerT.CustomerID
GROUP BY Ethnicity
From your posted sample result, it looks to me like you just want to divide the count by 1000, like this:
SELECT Ethnicity,
COUNT(distinctCard.TransactionID) / 1000 AS CardUseCount
FROM <rest part of query>
SELECT Ethnicity, COUNT(distinctCard.TransactionID) / (SELECT COUNT(1) FROM TransactionT WHERE CardID <> 1) AS CardUsePer
FROM (SELECT DISTINCT TransactionID, CustomerID FROM TransactionT WHERE CardID <> 1)
AS distinctCard INNER JOIN CustomerT ON distinctCard.CustomerID = CustomerT.CustomerID
GROUP BY Ethnicity
ORDER BY COUNT(distinctCard.TransactionID) ASC
I think the answer you linked is your answer. As noted in the comments, you are only counting the transactions; you need to divide that count by the total number of transactions. This would be done as follows:
SELECT Ethnicity, COUNT(distinctCard.TransactionID)/(SELECT COUNT(TransactionT.TransactionID)
FROM TransactionT WHERE CardID <> 1)
AS CardUsePercent
FROM (SELECT DISTINCT TransactionID, CustomerID FROM TransactionT WHERE CardID <> 1)
AS distinctCard INNER JOIN CustomerT ON distinctCard.CustomerID = CustomerT.CustomerID
GROUP BY Ethnicity
ORDER BY COUNT(distinctCard.TransactionID) ASC
This will give the result you want.
EDIT: This may be wrong, as I don't know the exact format of your tables, but I was assuming that the TransactionID field is unique in the table. Otherwise use the DISTINCT keyword, or the PK of your table, depending on your actual implementation.
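One detail worth checking when dividing two counts is integer division: in some engines (SQLite among them) COUNT(...) / COUNT(...) truncates to 0. The sketch below, run through Python's sqlite3 module as a stand-in engine, multiplies by 1.0 to force a fractional result. It assumes TransactionID is unique, and CardID 2 is a made-up value standing for any non-cash card.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CustomerT (CustomerID INTEGER, Ethnicity INTEGER)")
conn.execute(
    "CREATE TABLE TransactionT (TransactionID INTEGER, CustomerID INTEGER, CardID INTEGER)"
)
conn.executemany(
    "INSERT INTO CustomerT VALUES (?, ?)", [(1, 0), (2, 1), (3, 2), (4, 3)]
)

# Customer n gets n card transactions (CardID 2) plus one cash one (CardID 1).
transactions = []
tid = 0
for customer_id in (1, 2, 3, 4):
    for _ in range(customer_id):
        tid += 1
        transactions.append((tid, customer_id, 2))
    tid += 1
    transactions.append((tid, customer_id, 1))
conn.executemany("INSERT INTO TransactionT VALUES (?, ?, ?)", transactions)

rows = conn.execute(
    """
    SELECT c.Ethnicity,
           COUNT(t.TransactionID) * 1.0
             / (SELECT COUNT(*) FROM TransactionT WHERE CardID <> 1) AS CardUsePer
    FROM TransactionT AS t
    JOIN CustomerT AS c ON c.CustomerID = t.CustomerID
    WHERE t.CardID <> 1
    GROUP BY c.Ethnicity
    ORDER BY c.Ethnicity
    """
).fetchall()

for ethnicity, share in rows:
    print(ethnicity, round(share, 2))
```

Here each ethnicity's share is taken out of all card transactions, which matches the 0.1 / 0.2 / 0.3 / 0.4 shape in the question. Dividing by each ethnicity's own total instead gives the card usage rate within that ethnicity.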

How to subtract Total from conditioned sum in SQL

I want to do the following:
1) Find the total rows in a table
2) Find the total rows that meet a certain criteria.
3) Subtract (1) from (2).
Sample table Employees:
EmployeeID Nationality
1 Brazil
2 Korea
3 Germany
4 Brazil
5 Brazil
What I've tried:
SELECT count(EmployeeID) as Total from Employees
UNION
SELECT count(EmployeeID) as Brazilians from Employees
WHERE Nationality = 'Brazil'
Result:
Total
5
3
Row 1 will give me the total Employees. Row 2 will give me the Brazilian Employees.
I used UNION to see if I could subtract row 2 from row 1.
I could do this using CASE and SUM(), but that would require the row_number() function, which I can't use given that I'm using WebSQL. Is there another way to index these rows to be able to subtract?
Is there another approach I could use to solve this seemingly simple problem?
How about counting the rows that don't meet that criteria?
SELECT COUNT(EmployeeID) as non_brazilians
FROM Employees
WHERE Nationality <> 'Brazil';
You can use conditional aggregation:
select count(*) as TotalRows,
sum(case when Nationality = 'Brazil' then 1 else 0 end) as Brazilians,
sum(case when Nationality <> 'Brazil' then 1 else 0 end) as nonBrazilians
from Employees;
This assumes that Nationality is never NULL. If that is possible, the last condition should be:
sum(case when Nationality = 'Brazil' then 0 else 1 end) as nonBrazilians
Try this:
SELECT count(*) AS TotalRows
, (SELECT count(EmployeeID) FROM Employees WHERE Nationality = 'Brazil') as Brazilians
, (count(*) - (SELECT count(EmployeeID) FROM Employees WHERE Nationality = 'Brazil')) AS Subtract1From2
FROM Employees
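As a quick check of the conditional-aggregation idea, the sketch below computes the total, the matching rows, and their difference in a single pass, with no window functions. It runs against SQLite through Python's sqlite3 module, which is an assumption standing in for the asker's WebSQL environment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (EmployeeID INTEGER, Nationality TEXT)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [(1, "Brazil"), (2, "Korea"), (3, "Germany"), (4, "Brazil"), (5, "Brazil")],
)

# SUM over a 0/1 CASE counts the matching rows; the subtraction happens
# inside the same SELECT, so no row indexing is needed.
total, brazilians, difference = conn.execute(
    """
    SELECT COUNT(*) AS total,
           SUM(CASE WHEN Nationality = 'Brazil' THEN 1 ELSE 0 END) AS brazilians,
           COUNT(*) - SUM(CASE WHEN Nationality = 'Brazil' THEN 1 ELSE 0 END) AS difference
    FROM Employees
    """
).fetchone()

print(total, brazilians, difference)
```

Because the difference is total minus matches, rows with a NULL Nationality end up counted on the non-matching side, which is the same behaviour as the NULL-safe variant shown earlier.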

Oracle SQL - Convert N rows' column values to N columns in 1 row

The trick with this compared to the other questions (e.g. "Oracle convert rows to columns") is that my column values are arbitrary strings, rather than something I can use with decode.
The descriptions table here maps people's names to descriptions, but each person can have multiple descriptions, e.g. "wears a hat" or "is tall". Take this query:
Select firstName, lastName,
(Select description from descriptions --This can return any number of rows (0 or more)
where descriptions.firstName = people.firstName
and descriptions.lastName = people.lastName
and rownum <= 3)
from people
where age >= 25;
I would want an output like this:
FIRSTNAME LASTNAME DESCRIPTION1 DESCRIPTION2 DESCRIPTION3
Jeremy Smith Tall Confused (null)
Anne Smith (Null) (Null) (Null)
Mark Davis Short Smart Strong
In the case of less than 3 descriptions, I want nulls there. In the case of more than 3 descriptions, I want to just leave them out.
I am using Oracle 11.1. Can this be done efficiently?
Assuming that you don't care what order the descriptions are returned in (i.e. Jeremy Smith could just as correctly have a Description1 of "Confused" and a Description2 of "Tall"), you just need to pivot on the row number. If you care about the order the descriptions are returned in, you can add an ORDER BY clause to the window specification of the ROW_NUMBER analytic function:
SELECT firstName,
lastName,
MAX( CASE WHEN rn = 1 THEN description ELSE NULL END ) description1,
MAX( CASE WHEN rn = 2 THEN description ELSE NULL END ) description2,
MAX( CASE WHEN rn = 3 THEN description ELSE NULL END ) description3
FROM (SELECT firstName,
lastName,
description,
row_number() over (partition by lastName, firstName) rn
FROM descriptions
JOIN people USING (firstName, lastName)
WHERE age >= 25)
GROUP BY firstname, lastname
As an aside, I'm hoping that you're actually storing a birth date and computing the person's age rather than storing the age and assuming that people are updating their age every year.
I tried this option, but it reports that an ORDER BY clause is required inside the ROW_NUMBER analytic function, as shown below:
row_number() over (partition by lastName, firstName order by lastName, firstName) rn
It works fine for my scenario once I add the ORDER BY clause.
My scenario: user details are in table A, user groups are in table C, and the association between users and user groups is in table B. One user can have multiple user groups. I need each username with its user groups in a single row.
Query:
SELECT username,
MAX( CASE WHEN rn = 1 THEN ugroup ELSE NULL END ) usergroup1,
MAX( CASE WHEN rn = 2 THEN ugroup ELSE NULL END ) usergroup2,
MAX( CASE WHEN rn = 3 THEN ugroup ELSE NULL END ) usergroup3,
MAX( CASE WHEN rn = 4 THEN ugroup ELSE NULL END ) usergroup4,
MAX( CASE WHEN rn = 5 THEN ugroup ELSE NULL END ) usergroup5
from (
select
a.user_name username,
c.name ugroup,
row_number() over (partition by a.user_name order by a.user_name) rn
from users a,
usergroupmembership b,
usergroups c
where a.USER_NAME in ('aegreen',
'esportspau'
)
and a.user_id= b.user_id
and b.group_id=c.group_id
) group by username;
Query result:
USERNAME USERGROUP1 USERGROUP2 USERGROUP3 USERGROUP4 USERGROUP5
aegreen US_GOLF (null) (null) (null) (null)
esportspau EMEA - FSERVICE USER_ES_ES EMEA-CR-ONLY (null) (null)
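The row-number pivot discussed in this thread can be sketched in a runnable form. This version uses SQLite (3.25 or later, for window functions) through Python's sqlite3 module rather than Oracle, and the sample rows are made up. Two choices here are assumptions beyond the answers above: a LEFT JOIN, so that people without any description still appear with NULLs, and ORDER BY description inside ROW_NUMBER, so the column assignment is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (firstName TEXT, lastName TEXT, age INTEGER)")
conn.execute(
    "CREATE TABLE descriptions (firstName TEXT, lastName TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Jeremy", "Smith", 30), ("Anne", "Smith", 41),
     ("Mark", "Davis", 27), ("Tim", "Young", 20)],
)
conn.executemany(
    "INSERT INTO descriptions VALUES (?, ?, ?)",
    [("Jeremy", "Smith", "Tall"), ("Jeremy", "Smith", "Confused"),
     ("Mark", "Davis", "Short"), ("Mark", "Davis", "Smart"),
     ("Mark", "Davis", "Strong"), ("Mark", "Davis", "Zealous"),
     ("Tim", "Young", "Quiet")],
)

rows = conn.execute(
    """
    SELECT firstName, lastName,
           MAX(CASE WHEN rn = 1 THEN description END) AS description1,
           MAX(CASE WHEN rn = 2 THEN description END) AS description2,
           MAX(CASE WHEN rn = 3 THEN description END) AS description3
    FROM (
        -- Number each person's descriptions 1, 2, 3, ... in alphabetical order.
        SELECT p.firstName, p.lastName, d.description,
               ROW_NUMBER() OVER (
                   PARTITION BY p.lastName, p.firstName
                   ORDER BY d.description
               ) AS rn
        FROM people AS p
        LEFT JOIN descriptions AS d
               ON d.firstName = p.firstName AND d.lastName = p.lastName
        WHERE p.age >= 25
    )
    GROUP BY lastName, firstName
    ORDER BY lastName, firstName
    """
).fetchall()

for row in rows:
    print(row)
```

A fourth description (rn = 4) simply matches none of the CASE branches and is left out, and a person with no descriptions gets NULL in all three columns.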

Collapse Multiple Records Into a Single Record With Multiple Columns

In a program I'm maintaining, we were given a massive (~500 lines) SQL statement by the customer. It is used for generating flat files with fixed-length records for transmitting data to another big business. Since it's a massive flat file, it's not relational and the standard normal forms of data are collapsed. So, if you have a record that can have multiple codes associated (in this case up to 19), they all have to be written into a single line, but in separate fields, in the flat file.
Note: this example is simplified.
The data might look like this, with three tables:
RECORDS
record_id firstname lastname
--------------------------------
123 Bob Schmidt
324 George Washington
325 Ronald Reagan
290 George Clooney
CODE_TABLE
code_id code_cd code_txt
--------------------------------
5 3 President
2 4 Actor
3 7 Plumber
CODES_FOR_RECORDS
record_id code_cd
-------------------
123 7
325 3
290 4
324 3
325 4
123 4
This needs to produce records like:
firstname lastname code1 code2 code3
Bob Schmidt Actor Plumber NULL
George Washington President NULL NULL
Ronald Reagan Actor President NULL
George Clooney Actor NULL NULL
The portion of the current query we were given looks like this, but with 19 code columns instead of the 5:
select
x.record_id,
max(case when x.rankk = 1 then x.ctag end) as CodeColumn1,
max(case when x.rankk = 2 then x.ctag end) as CodeColumn2,
max(case when x.rankk = 3 then x.ctag end) as CodeColumn3,
max(case when x.rankk = 4 then x.ctag end) as CodeColumn4,
max(case when x.rankk = 5 then x.ctag end) as CodeColumn5
from
(
select
r.record_id,
ct.code_txt as ctag ,
dense_rank() over (partition by r.record_id order by cfr.code_id) as rankk
from
records as r,
codes_for_records as cfr,
code_table as ct
where
r.record_id = cfr.record_id
and ct.code_cd = cfr.code_cd
and cfr.code_cd is not null
and ct.code_txt not like '%V%'
) as x
where
x.record_id is not null
group by
x.record_id
I trimmed things down for simplicity's sake; the actual statement includes an inner query, a join, and more WHERE conditions, but that should get the idea across. My brain is telling me there has to be a better way, but I'm not an SQL expert. We are using DB2 v8, if that helps. And the codes have to be in separate columns, so no coalescing things into a single string. Is there a cleaner solution than this?
Update:
I ended up just refactoring the original query. It still uses the ugly MAX() business, but overall the query is much more readable due to reworking other parts.
It sounds like what you are looking for is pivoting.
WITH joined_table(firstname, lastname, code_txt, rankk) AS
(
SELECT
r.firstname,
r.lastname,
ct.code_txt,
dense_rank() over (partition by r.record_id order by cfr.code_id) as rankk
FROM
records r
INNER JOIN
codes_for_records cfr
ON r.record_id = cfr.record_id
INNER JOIN
code_table ct
ON ct.code_cd = cfr.code_cd
),
decoded_table(firstname, lastname,
CodeColumn1, CodeColumn2, CodeColumn3, CodeColumn4, CodeColumn5) AS
(
SELECT
firstname,
lastname,
DECODE(rankk, 1, code_txt),
DECODE(rankk, 2, code_txt),
DECODE(rankk, 3, code_txt),
DECODE(rankk, 4, code_txt),
DECODE(rankk, 5, code_txt)
FROM
joined_table jt
)
SELECT
firstname,
lastname,
MAX(CodeColumn1),
MAX(CodeColumn2),
MAX(CodeColumn3),
MAX(CodeColumn4),
MAX(CodeColumn5)
FROM
decoded_table dt
GROUP BY
firstname,
lastname;
Note that I've never actually done this myself before. I'm relying on the linked document as a reference.
You might need to include the record_id to account for duplicate names.
Edit: Added the GROUP BY.
One possible solution is to use a recursive query:
with recursive_view (record_id, rankk, final) as
(
select
record_id,
rankk,
cast (ctag as varchar (100))
from inner_query t1
union all
select
t1.record_id,
t1.rankk,
/* all formatting here */
cast (t2.final || ',' || t1.ctag as varchar (100))
from
inner_query t1,
recursive_view t2
where
t2.rankk < t1.rankk
and t1.record_id = t2.record_id
and locate(t1.ctag, t2.final) = 0
)
select record_id, final from recursive_view;
I can't guarantee that it works, but I hope it will be helpful. Another way is to use a custom aggregate function.