I've been trying to use GROUP BY and also PIVOT, but I cannot wrap my head around how to merge these tables and combine duplicate rows. Currently my SELECT statement returns results with duplicate UserID rows, but I want to consolidate them into columns.
How would I join TABLE1 and TABLE2 into a new table which would look something like this:
NEW TABLE:
UserID Username ParentID 1 ParentID 2
--------- -------- -------- ----------
1 Dave 1 2
2 Sally 3 4
TABLE1:
UserID Username ParentID
--------- -------- --------
1 Dave 1
1 Dave 2
2 Sally 3
2 Sally 4
Table 2:
ParentID Username
--------- --------
1 Sarah
2 Joe
3 Tom
4 Mark
Oracle
The with clause is here just to generate some sample data and, as such, it is not a part of the answer.
After joining the tables you can use the LAST_VALUE analytic function with a windowing clause to get the next PARENT_ID of the user. That column (PARENT_ID_2) contains a value only in the first row of a particular USER_ID (identified with the ROW_NUMBER analytic function). Afterwards, just filter out the rows where PARENT_ID_2 is empty...
Sample data:
WITH
tbl_1 AS
(
Select 1 "USER_ID", 'Dave' "USER_NAME", 1 "PARENT_ID" From Dual Union All
Select 1 "USER_ID", 'Dave' "USER_NAME", 2 "PARENT_ID" From Dual Union All
Select 2 "USER_ID", 'Sally' "USER_NAME", 3 "PARENT_ID" From Dual Union All
Select 2 "USER_ID", 'Sally' "USER_NAME", 4 "PARENT_ID" From Dual
),
tbl_2 AS
(
Select 1 "PARENT_ID", 'Sarah' "USER_NAME" From Dual Union All
Select 2 "PARENT_ID", 'Joe' "USER_NAME" From Dual Union All
Select 3 "PARENT_ID", 'Tom' "USER_NAME" From Dual Union All
Select 4 "PARENT_ID", 'Mark' "USER_NAME" From Dual
)
Main SQL:
SELECT
*
FROM (
SELECT
t1.USER_ID "USER_ID",
t1.USER_NAME "USER_NAME",
t1.PARENT_ID "PARENT_ID_1",
CASE
WHEN ROW_NUMBER() OVER(PARTITION BY t1.USER_ID ORDER BY t1.PARENT_ID) = 1
THEN LAST_VALUE(t1.PARENT_ID) OVER(PARTITION BY t1.USER_ID ORDER BY t1.PARENT_ID ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING)
END "PARENT_ID_2"
FROM
tbl_1 t1
INNER JOIN
tbl_2 t2 ON(t1.PARENT_ID = t2.PARENT_ID)
)
WHERE PARENT_ID_2 Is Not Null
... and the Result ...
-- USER_ID USER_NAME PARENT_ID_1 PARENT_ID_2
-- ---------- --------- ----------- -----------
-- 1 Dave 1 2
-- 2 Sally 3 4
The windowing clause in this answer
ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING
takes the current and the next row and returns the value defined by the analytic function (LAST_VALUE), taking care of grouping (PARTITION BY) and ordering of the rows. Regards...
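For anyone without an Oracle instance at hand, here is a minimal sketch of the same pivot using Python's sqlite3 module. Note that it swaps LAST_VALUE for conditional aggregation over ROW_NUMBER, so it is a related technique and not the query above. The lowercase table and column names are assumptions, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_1 (user_id INTEGER, user_name TEXT, parent_id INTEGER);
    INSERT INTO tbl_1 VALUES (1, 'Dave', 1), (1, 'Dave', 2),
                             (2, 'Sally', 3), (2, 'Sally', 4);
""")

# Number each user's rows by parent_id, then fold rows 1 and 2 into columns.
rows = conn.execute("""
    SELECT user_id,
           user_name,
           MAX(CASE WHEN rn = 1 THEN parent_id END) AS parent_id_1,
           MAX(CASE WHEN rn = 2 THEN parent_id END) AS parent_id_2
    FROM (
        SELECT user_id, user_name, parent_id,
               ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY parent_id) AS rn
        FROM tbl_1
    )
    GROUP BY user_id, user_name
    ORDER BY user_id
""").fetchall()

print(rows)
```

A user with a single parent row would simply get NULL in parent_id_2 with this approach.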
This is for MySQL 5.6. Create a concatenated ParentID using GROUP_CONCAT, then separate the concatenated ParentID values (1,2) and (3,4) into ParentID 1 and ParentID 2.
SELECT t1.UserID,
t1.Username,
SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(t1.ParentID ORDER BY t1.ParentID), ',', 1), ',', -1) AS `ParentID 1`,
SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(t1.ParentID ORDER BY t1.ParentID), ',', 2), ',', -1) as `ParentID 2`
FROM TABLE1 t1
INNER JOIN TABLE2 t2 on t1.ParentID = t2.ParentID
GROUP BY t1.UserID
ORDER BY t1.UserID;
Result:
UserID Username ParentID 1 ParentID 2
1 Dave 1 2
2 Sally 3 4
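The concatenate-then-split idea can also be tried outside MySQL. The following is only a sketch with Python's sqlite3 module: SQLite has GROUP_CONCAT but no SUBSTRING_INDEX, so the split happens in Python here, which is an assumption of this example and not part of the answer. The sample rows are inserted out of order on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (UserID INTEGER, Username TEXT, ParentID INTEGER);
    INSERT INTO table1 VALUES (1, 'Dave', 2), (1, 'Dave', 1),
                              (2, 'Sally', 3), (2, 'Sally', 4);
""")

pivoted = []
for user_id, username, concatenated in conn.execute("""
    SELECT UserID, Username, GROUP_CONCAT(ParentID)
    FROM table1
    GROUP BY UserID, Username
    ORDER BY UserID
"""):
    # The concatenation order is not guaranteed, so sort before picking slots.
    parent_ids = sorted(int(p) for p in concatenated.split(","))
    # Pad with None so a user with a single parent gets an empty second column.
    first, second = (parent_ids + [None, None])[:2]
    pivoted.append((user_id, username, first, second))

print(pivoted)
```

The explicit sort is the point of the sketch: without a defined order inside the concatenation, "ParentID 1" and "ParentID 2" could swap between runs.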
I have a column children_ids which contains PKs from a STRING_AGG function. I am trying to use this column within a WHERE clause with the IN operator to return the total_pets, but it doesn't work. If I copy and paste the values directly into the IN operator the query returns the correct info, otherwise no results are found.
Here are my data sets:
Parents
=======
id parent_name
----------------
1 Bob and Mary
2 Mick and Jo
Children
========
id child_name parent_id
-------------------------
1 Eddie 1
2 Frankie 1
3 Robbie 1
4 Duncan 2
5 Rick 2
6 Jen 2
Childrens Pets
===============
id pet_name child_id
-------------------------
1 Puppy 1
2 Piggy 2
3 Monkey 3
4 Lamb 4
5 Tiger 5
6 Bear 6
7 Zebra 6
Expected Output
===============
parent_id children_ids total_pets
-----------------------------------
1 1,2,3 3
2 4,5,6 4
Current [undesired] Output
==========================
parent_id children_ids total_pets
-----------------------------------
1 1,2,3 0
2 4,5,6 0
Here is the standard SQL to test for yourself:
# setup data with standardSQL
WITH `parents` AS (
SELECT 1 id, 'Bob and Mary' parent_names UNION ALL
SELECT 2, 'Mick and Jo'
),
`children` AS (
SELECT 1 id, 'Eddie' child_name, 1 parent_id UNION ALL
SELECT 2, 'Frankie', 1 UNION ALL
SELECT 3, 'Robbie', 1 UNION ALL
SELECT 4, 'Duncan', 2 UNION ALL
SELECT 5, 'Rick', 2 UNION ALL
SELECT 6, 'Jen', 2
),
`childrens_pets` AS (
SELECT 1 id, 'Puppy' pet_name, 1 child_id UNION ALL
SELECT 2, 'Piggy', 2 UNION ALL
SELECT 3, 'Monkey', 3 UNION ALL
SELECT 4, 'Lamb', 4 UNION ALL
SELECT 5, 'Tiger', 5 UNION ALL
SELECT 6, 'Bear', 6 UNION ALL
SELECT 7, 'Zebra', 6
)
And the query:
#standardSQL
select
parent_id
, children_ids
-- !!! This keeps returning 0 instead of the total pets for each parent based on their children
, (
select count(p1.id)
from childrens_pets p1
where cast(p1.child_id as string) in (children_ids)
) as total_pets
from
(
SELECT
p.id as parent_id
, (
select string_agg(cast(c1.id as string))
from children as c1
where c1.parent_id = p.id
) as children_ids
FROM parents as p
join children as c
on p.id = c.parent_id
join childrens_pets as cp
on cp.child_id = c.id
)
GROUP BY
parent_id
, children_ids
... but is there a way to do it using the IN operator as my query ...
Just fix one line and it will work for you!
Replace
WHERE CAST(p1.child_id AS STRING) IN (children_ids)
with
WHERE CAST(p1.child_id AS STRING) IN (SELECT * FROM UNNEST(SPLIT(children_ids)))
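Why the original line returns 0 can be seen in isolation. This is a small sketch using Python's sqlite3 module instead of BigQuery (an assumption made so it runs anywhere): the aggregated value is one string, so IN compares against a single element until that string is split.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

children_ids = "1,2,3"  # what STRING_AGG produces: one string, not three values

# IN against the aggregated string compares '2' with the single value '1,2,3'.
as_one_string = conn.execute("SELECT '2' IN (?)", (children_ids,)).fetchone()[0]

# Splitting first gives IN a real list of candidate values to match against.
parts = children_ids.split(",")
placeholders = ",".join("?" for _ in parts)
as_list = conn.execute(f"SELECT '2' IN ({placeholders})", parts).fetchone()[0]

print(as_one_string, as_list)
```

The first check is false and the second is true, which mirrors what UNNEST(SPLIT(children_ids)) does for the BigQuery query.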
Huh? This would seem to do what you want:
SELECT p.id as parent_id,
string_agg(distinct cast(c.id as string)) as children_ids,
count(distinct cp.id) as total_pets
FROM parents p JOIN
children c
ON p.id = c.parent_id JOIN
childrens_pets cp
ON cp.child_id = c.id
GROUP BY p.id;
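The join-and-aggregate shape can be checked against the sample data with a quick sketch in Python's sqlite3 module. GROUP_CONCAT stands in for STRING_AGG here, which is an assumption of this example, and the order of the ids inside the concatenated string is not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parents (id INTEGER, parent_name TEXT);
    CREATE TABLE children (id INTEGER, child_name TEXT, parent_id INTEGER);
    CREATE TABLE childrens_pets (id INTEGER, pet_name TEXT, child_id INTEGER);
    INSERT INTO parents VALUES (1, 'Bob and Mary'), (2, 'Mick and Jo');
    INSERT INTO children VALUES (1, 'Eddie', 1), (2, 'Frankie', 1), (3, 'Robbie', 1),
                                (4, 'Duncan', 2), (5, 'Rick', 2), (6, 'Jen', 2);
    INSERT INTO childrens_pets VALUES (1, 'Puppy', 1), (2, 'Piggy', 2), (3, 'Monkey', 3),
                                      (4, 'Lamb', 4), (5, 'Tiger', 5), (6, 'Bear', 6),
                                      (7, 'Zebra', 6);
""")

# DISTINCT matters: a child with two pets appears on two joined rows.
rows = conn.execute("""
    SELECT p.id                       AS parent_id,
           GROUP_CONCAT(DISTINCT c.id) AS children_ids,
           COUNT(DISTINCT cp.id)       AS total_pets
    FROM parents p
    JOIN children c        ON c.parent_id = p.id
    JOIN childrens_pets cp ON cp.child_id = c.id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

Because the pets are counted on the joined rows, no IN lookup against the concatenated string is needed at all.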
I am trying to figure out the best way to remove rows from a result set where either the value in one column or the value in a different column has a duplicate in the result set.
Imagine the results of a query are as follows:
a_value | b_value
-----------------
1 | 1
2 | 1
2 | 2
3 | 1
4 | 3
5 | 2
6 | 4
6 | 5
What I want to do is:
Eliminate all rows that have duplicate values in a_value
Pick only 1 row for a given b_value
So I'd want the filtered results to end up like this after eliminating a_value duplicates:
a_value | b_value
-----------------
1 | 1
3 | 1
4 | 3
5 | 2
And then like this after picking only a single b_value:
a_value | b_value
-----------------
1 | 1
4 | 3
5 | 2
I'd appreciate suggestions on how to accomplish this task in an efficient way via SQL.
with
q_res ( a_value, b_value ) as (
select 1, 1 from dual union all
select 2, 1 from dual union all
select 2, 2 from dual union all
select 3, 1 from dual union all
select 4, 3 from dual union all
select 5, 2 from dual union all
select 6, 4 from dual union all
select 6, 5 from dual
)
-- end test data; solution begins below
select min(a_value) as a_value, b_value
from (
select a_value, min(b_value) as b_value
from q_res
group by a_value
having count(*) = 1
)
group by b_value
order by a_value -- ORDER BY is optional
;
A_VALUE B_VALUE
------- -------
1 1
4 3
5 2
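The same two-level aggregation runs unchanged in other engines. As a sketch, here it is through Python's sqlite3 module, with a real table in place of the WITH clause (the table name q_res is taken from the test data above, the rest is an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE q_res (a_value INTEGER, b_value INTEGER)")
conn.executemany(
    "INSERT INTO q_res VALUES (?, ?)",
    [(1, 1), (2, 1), (2, 2), (3, 1), (4, 3), (5, 2), (6, 4), (6, 5)],
)

rows = conn.execute("""
    SELECT MIN(a_value) AS a_value, b_value
    FROM (
        -- inner level: keep only a_values that occur exactly once
        SELECT a_value, MIN(b_value) AS b_value
        FROM q_res
        GROUP BY a_value
        HAVING COUNT(*) = 1
    )
    -- outer level: one row per b_value, picking the smallest a_value
    GROUP BY b_value
    ORDER BY a_value
""").fetchall()

print(rows)
```

MIN(b_value) in the inner query is only there to satisfy GROUP BY; each surviving group has exactly one row, so it returns that row's b_value.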
1) The inner query keeps only the a_value values that occur exactly once and exposes them as t2. Joining t1 to t2 then returns the full rows without any a_value duplicates, as per requirement #1.
SELECT t1.*
FROM Table t1,
(
SELECT a_value
FROM Table
GROUP BY a_value
HAVING COUNT(*) = 1
) t2
WHERE t1.a_value = t2.a_value;
2) Once the filtered data is obtained, I assign a row number to each row of the step-1 result, partitioned by b_value, and select only the rows with rn = 1.
SELECT X.a_value,
X.b_value
FROM
(
SELECT t1.*,
ROW_NUMBER() OVER ( PARTITION BY t1.b_value ORDER BY t1.a_value,t1.b_value ) AS rn
FROM Table t1,
(
SELECT a_value
FROM Table
GROUP BY a_value
HAVING COUNT(*) = 1
) t2
WHERE t1.a_value = t2.a_value
) X
WHERE X.rn = 1;
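To make the two steps easier to follow, here is the same logic written as a plain Python sketch with no database involved. The variable names are made up for the illustration; picking the smallest a_value per b_value mirrors the ORDER BY inside ROW_NUMBER.

```python
from collections import Counter

rows = [(1, 1), (2, 1), (2, 2), (3, 1), (4, 3), (5, 2), (6, 4), (6, 5)]

# Step 1: drop every row whose a_value appears more than once.
a_counts = Counter(a for a, _ in rows)
unique_a = [(a, b) for a, b in rows if a_counts[a] == 1]

# Step 2: for each b_value keep only the first row (smallest a_value).
first_per_b = {}
for a, b in sorted(unique_a):
    first_per_b.setdefault(b, a)

result = sorted((a, b) for b, a in first_per_b.items())
print(result)
```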
I have two tables:
METHOD_TYPES
---- ----------------
ID Methods_Type
---- ----------------
1 public
2 ALL_Methods
3 private1235678
4 social
METHOD_TABLE
-------- ----------------- ----------
Ser_ID Ser_Method_Type Emp_Name
-------- ----------------- ----------
1 (null) AAAA
2 (null) BBBB
3 All_Methods Rama
4 social Raja
5 private12345678 Rakesh
I used the below query for the ORDER BY:
SELECT SUBSTR(Methods_Type, 1, 10) AS disMisType
FROM METHOD_TABLE MET
LEFT JOIN METHOD_TYPES TRMT
ON MET.Ser_Method_Type = TRMT.Methods_Type
ORDER BY (NLSSORT(MET.Ser_Method_Type, 'NLS_SORT=binary_ai')) DESC NULLS FIRST;
OUTPUT:
(null)
All_Methods
(null)
social
private12345678
But I need to order all the nulls first.
Kindly provide the exact query.
Using the data you provided - and adding in the extra columns, I get:
with method_types as (select 1 id, 'public' methods_type from dual union all
select 2 id, 'ALL_Methods' methods_type from dual union all
select 3 id, 'private1235678' methods_type from dual union all
select 4 id, 'social' methods_type from dual),
method_table as (select 1 ser_id, null ser_method_type, 'AAAA' emp_name from dual union all
select 2 ser_id, null ser_method_type, 'BBBB' emp_name from dual union all
select 3 ser_id, 'All_Methods' ser_method_type, 'Rama' emp_name from dual union all
select 4 ser_id, 'social' ser_method_type, 'Raja' emp_name from dual union all
select 5 ser_id, 'private12345678' ser_method_type, 'Rakesh' emp_name from dual)
select substr(trmt.methods_type,1,10) as dismistype,
met.*,
trmt.*
from method_table met
left join method_types trmt on (met.ser_method_type = trmt.methods_type)
order by (nlssort(met.ser_method_type, 'NLS_SORT=binary_ai')) desc nulls first;
DISMISTYPE SER_ID SER_METHOD_TYPE EMP_NAME ID METHODS_TYPE
------------------------------ ---------- --------------- -------- ---------- --------------
1 AAAA
2 BBBB
social 4 social Raja 4 social
5 private12345678 Rakesh
3 All_Methods Rama
which is not what your expected output shows, but it does maybe explain why you see nulls apparently out of order in your results - you're selecting the trmt.methods_type column, but ordering by the met.ser_method_type column. If there aren't any rows in the method_types table matching those in the method_table, then of course you will see nulls, but because there IS a value in the method_table, they may well be displayed after rows that do have a value.
Perhaps all you need to do is to change the column being selected
from substr(trmt.methods_type,1,10)
to substr(met.ser_method_type,1,10)
or change the order clause
from nlssort(met.ser_method_type, 'NLS_SORT=binary_ai')
to nlssort(trmt.methods_type, 'NLS_SORT=binary_ai')
I'm not sure why your query is not working, but you can have a more explicit order by:
ORDER BY (CASE WHEN MET.Ser_Method_Type IS NULL THEN 1 ELSE 2 END),
NLSSORT(MET.Ser_Method_Type, 'NLS_SORT=binary_ai') DESC
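The CASE-first ordering is portable, so it can be tried without Oracle. A rough sketch with Python's sqlite3 module follows; COLLATE NOCASE is only a stand-in for NLSSORT with binary_ai (it ignores ASCII case but not accents), and the extra ser_id sort key is an assumption added to make the order of the NULL rows deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE method_table (ser_id INTEGER, ser_method_type TEXT, emp_name TEXT);
    INSERT INTO method_table VALUES
        (1, NULL, 'AAAA'), (2, NULL, 'BBBB'), (3, 'All_Methods', 'Rama'),
        (4, 'social', 'Raja'), (5, 'private12345678', 'Rakesh');
""")

rows = conn.execute("""
    SELECT ser_id, ser_method_type
    FROM method_table
    ORDER BY CASE WHEN ser_method_type IS NULL THEN 1 ELSE 2 END,  -- NULLs first
             ser_method_type COLLATE NOCASE DESC,                 -- then Z..A
             ser_id
""").fetchall()

for row in rows:
    print(row)
```

Sorting on the CASE expression first means the NULL placement no longer depends on how the engine treats NULLs in a descending sort.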
You can create a CASE Column only for order:
select SUBSTR(Methods_Type,1,10)AS disMisType,
SUBSTR(CASE WHEN Methods_Type IS NULL THEN '0' ELSE Methods_Type END ,1,10) AS disMisTypeORDER
FROM METHOD_TABLE MET
LEFT JOIN METHOD_TYPES TRMT
ON MET.Ser_Method_Type = TRMT.Methods_Type
ORDER BY disMisTypeORDER
I am trying to pull rows from a table where an ID has more than one version, at least one version has a non-null person, and the versions that come after it have a null person.
So, if I had a statement like:
select ID, version, person from table1
the output would be:
ID Version Person
-- ------- ------
1 1 Tom
1 2 null
1 3 null
2 1 null
2 2 null
2 3 null
3 1 Mary
3 2 Mary
4 1 Joseph
4 2 null
4 3 Samantha
The version number is unbounded; an ID can have any number of versions.
I want to pull ID 1 version 2/3, and ID 4 Version 2.
So in the case of ID 2 where the person is null for all three rows I don't need these rows. And in the case of ID 3 version 1 and 2 I don't need these rows because there is never a null value.
This is a very simple version of the table I am working with but the "real" table is a lot more complicated with a bunch of joins already in it.
The desired output would be:
ID Version Person
-- ------- ------
1 2 null
1 3 null
4 2 null
The result set that I am looking for is where in a previous version for the same ID there was a person listed but is now null.
You are seeking all rows where the person is null and the same id has an earlier version (a lower version number) where the person is not null:
Edited predicate based on comment
with sample_data as
(select 1 id, 1 version, 'Tom' person from dual union all
select 1, 2, null from dual union all
select 1, 3, null from dual union all
select 2, 1, null from dual union all
select 2, 2, null from dual union all
select 2, 3, null from dual union all
select 3, 1, 'Mary' from dual union all
select 3, 2, 'Mary' from dual union all
select 4, 1, 'Joseph' from dual union all
select 4, 2, null from dual union all
select 4, 3, 'Samantha' from dual)
select *
from sample_data sd
where person is null
and exists
(select 1 from sample_data
where id = sd.id
and person is not null
and version < sd.version);
/* Old predicate
and id in
(select id from sample_data where person is not null);
*/
I think this query translates pretty nicely into what you asked for?
List all the rows (R) where the person is null, but only if a previous row (P) with a non-null name exists.
select *
from table1 r
where r.person is null
and exists(
select 'x'
from table1 p
where p.id = r.id
and p.version < r.version
and p.person is not null
);
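The EXISTS form is plain SQL, so it can be verified against the sample data with a short sketch in Python's sqlite3 module. The table name table1 comes from the question; the ORDER BY is added here only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, version INTEGER, person TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, 1, 'Tom'), (1, 2, None), (1, 3, None),
     (2, 1, None), (2, 2, None), (2, 3, None),
     (3, 1, 'Mary'), (3, 2, 'Mary'),
     (4, 1, 'Joseph'), (4, 2, None), (4, 3, 'Samantha')],
)

rows = conn.execute("""
    SELECT r.id, r.version, r.person
    FROM table1 r
    WHERE r.person IS NULL
      AND EXISTS (
            -- an earlier version of the same id that did have a person
            SELECT 1
            FROM table1 p
            WHERE p.id = r.id
              AND p.version < r.version
              AND p.person IS NOT NULL
          )
    ORDER BY r.id, r.version
""").fetchall()

print(rows)
```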
I believe the below should work.
select ID, listagg(version, ', ') within group (order by version) as versions
from table1 t1
where 0 < (select count(*) from table1 t1A where t1A.ID = t1.ID and t1A.person is not null and t1A.version < t1.version)
and person is null
group by ID
This should do what you want:
select id, version, person
from
(
select id, version, person,
lag(person, 1) ignore nulls
over (partition by id
order by version) as x
from table1
) dt
where person is null
and x is not null
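Not every engine supports IGNORE NULLS on LAG. As a hedged alternative (a different technique from the query above), counting the non-null persons in all earlier rows of the same id gives the same filter. The sketch below uses Python's sqlite3 module and needs SQLite 3.25 or later for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, version INTEGER, person TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, 1, 'Tom'), (1, 2, None), (1, 3, None),
     (2, 1, None), (2, 2, None), (2, 3, None),
     (3, 1, 'Mary'), (3, 2, 'Mary'),
     (4, 1, 'Joseph'), (4, 2, None), (4, 3, 'Samantha')],
)

rows = conn.execute("""
    SELECT id, version, person
    FROM (
        SELECT id, version, person,
               -- COUNT(person) skips NULLs; the frame covers earlier rows only
               COUNT(person) OVER (
                   PARTITION BY id
                   ORDER BY version
                   ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
               ) AS earlier_persons
        FROM table1
    ) dt
    WHERE person IS NULL
      AND earlier_persons > 0
    ORDER BY id, version
""").fetchall()

print(rows)
```

For the first version of each id the frame is empty, so the count is 0 and the row is filtered out, just as a NULL result from LAG would be.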
Needing some help with SQL Server select query here.
I have the following tables defined:
UserSource
UserSourceID ID Name Dept SourceID
1 1 John AAAA 1
2 1 John AAAA 2
3 2 Nena BBBB 1
4 2 Nena BBBB 2
5 3 Gord AAAA 2
6 3 Gord AAAA 1
7 4 Stan CCCC 3
Source
SourceID Description RankOrder
1 FromHR 1
2 FromTemp 2
3 Others 3
I need to join both tables and select, for each ID, only the row with the smallest RankOrder, so that the result would be:
UserSourceID ID Name Dept SourceID Description RankOrder
1 1 John AAAA 1 FromHR 1
3 2 Nena BBBB 1 FromHR 1
6 3 Gord AAAA 1 FromHR 1
7 4 Stan CCCC 3 Others 3
TIA.
Edit:
Here's what I have come up with so far, but I seem to be missing something:
WITH
TableA AS(
SELECT 1 AS UserSourceID, 1 AS ID, 'John' AS [Name], 'AAAA' as [Dept], 1 as SourceID
UNION SELECT 2, 1, 'John', 'AAAA', 2
UNION SELECT 3, 2, 'Nena', 'BBBB', 1
UNION SELECT 4, 2, 'Nena', 'BBBB', 2
UNION SELECT 5, 3, 'Gord', 'AAAA', 2
UNION SELECT 6, 3, 'Gord', 'AAAA', 1
UNION SELECT 7, 4, 'Stan', 'DDDD', 3)
,
TableB AS(
SELECT 1 as SourceID, 'FromHR' as [Description], 1 as RankOrder
UNION SELECT 2, 'FromTemp', 2
UNION SELECT 3, 'Others', 3
)
SELECT DISTINCT tblA.*, tblB.SourceID, tblB.Description
FROM TableB tblB
JOIN TableA tblA ON tblA.SourceID = tblB.SourceID
LEFT JOIN TableB b2 ON b2.SourceID = tblB.SourceID
AND B2.RankOrder < tblB.RankOrder
WHERE B2.SourceID IS NULL
UPDATE:
I scanned the tables and there might be some variations of data. I have updated the data for the question as above.
Practically, I need to join these two tables, and be able to only select the row which would have the least RankOrder. In case of record UserSourceID = 7, that particular record would be selected because there's only one row that exists after the tables have been joined.
I use windowed aggregates for this type of solution pretty regularly. ROW_NUMBER will order and number the rows based on the PARTITION and ORDER you specify in the OVER clause.
select UserSourceID
, ID
, Name
, Dept
, SourceID
, Description
, RankOrder
FROM (SELECT UserSourceID
, ID
, Name
, Dept
, u.SourceID
, Description
, RankOrder
, ROW_NUMBER() over(PARTITION BY ID ORDER BY RankOrder) ranknum
FROM UserSource u
INNER JOIN
Source s
on s.SourceID = u.SourceID ) a
WHERE ranknum = 1
So in this case, for every ID, the rows are numbered by RankOrder, and the outer filter keeps only the first row.
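ROW_NUMBER behaves the same way outside SQL Server, so the approach can be checked with a small sketch in Python's sqlite3 module (SQLite 3.25 or later). The snake_case table and column names are assumptions of this example, and the data follows the tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_source (user_source_id INTEGER, id INTEGER, name TEXT,
                              dept TEXT, source_id INTEGER);
    CREATE TABLE source (source_id INTEGER, description TEXT, rank_order INTEGER);
    INSERT INTO user_source VALUES
        (1, 1, 'John', 'AAAA', 1), (2, 1, 'John', 'AAAA', 2),
        (3, 2, 'Nena', 'BBBB', 1), (4, 2, 'Nena', 'BBBB', 2),
        (5, 3, 'Gord', 'AAAA', 2), (6, 3, 'Gord', 'AAAA', 1),
        (7, 4, 'Stan', 'CCCC', 3);
    INSERT INTO source VALUES (1, 'FromHR', 1), (2, 'FromTemp', 2), (3, 'Others', 3);
""")

rows = conn.execute("""
    SELECT user_source_id, name, description, rank_order
    FROM (
        SELECT u.user_source_id, u.name, s.description, s.rank_order,
               -- restart the numbering for every user id, lowest rank first
               ROW_NUMBER() OVER (PARTITION BY u.id ORDER BY s.rank_order) AS ranknum
        FROM user_source u
        JOIN source s ON s.source_id = u.source_id
    ) a
    WHERE ranknum = 1
    ORDER BY user_source_id
""").fetchall()

for row in rows:
    print(row)
```

Stan's only row has rank 3, yet it still gets row number 1 inside its own partition, which is why UserSourceID 7 survives the filter.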
Here's a helpful link to that function from Microsoft. ROW_NUMBER
----UPDATE----
Here's with Rank and Row Number as options.
select UserSourceID
, ID
, Name
, Dept
, SourceID
, Description
, RankOrder
FROM (SELECT UserSourceID
, ID
, Name
, Dept
, u.SourceID
, Description
, RankOrder
, ROW_NUMBER() over(PARTITION BY ID ORDER BY RankOrder) row_num
, RANK() over(PARTITION BY ID ORDER BY RankOrder) rank_num --use this if you want to see the duplicate records
FROM UserSource u
INNER JOIN
Source s
on s.SourceID = u.SourceID ) a
WHERE row_num = 1 --rank_num = 1
Replace row_num with rank_num to view any items with duplicate RankOrder entries
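The difference between the two filters only shows up when an ID has tied RankOrder values. Here is a tiny sketch in Python's sqlite3 module with made-up data (the table ranked and its labels are hypothetical) where two rows share the lowest rank:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ranked (id INTEGER, label TEXT, rank_order INTEGER);
    INSERT INTO ranked VALUES (1, 'a', 1), (1, 'b', 1), (1, 'c', 2);
""")

query = """
    SELECT label FROM (
        SELECT label,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY rank_order) AS row_num,
               RANK()       OVER (PARTITION BY id ORDER BY rank_order) AS rank_num
        FROM ranked
    )
    WHERE {column} = 1
    ORDER BY label
"""

# ROW_NUMBER keeps exactly one of the tied rows (which one is arbitrary).
by_row_number = conn.execute(query.format(column="row_num")).fetchall()
# RANK gives both tied rows the value 1, so both come back.
by_rank = conn.execute(query.format(column="rank_num")).fetchall()

print(len(by_row_number), by_rank)
```

If the single row kept by ROW_NUMBER needs to be predictable, adding a tie-breaker column to the ORDER BY inside OVER is one way to get that.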