Selecting one row from each set of duplicates where only one column varies (SQL)

Current State
id | val | varchar_id | uid
---------------------------------
1 | 1 | A4D | NEWID()
1 | 2 | A3G | NEWID()
2 | 1 | 7S3 | NEWID()
2 | 1 | 43E | NEWID()
2 | 2 | 7S3 | NEWID()
2 | 2 | 431 | NEWID()
3 | 1 | 432 | NEWID()
3 | 2 | 43P | NEWID()
Ideal state
id | val | varchar_id | uid
---------------------------------
1 | 1 | A4D | NEWID()
1 | 2 | A3G | NEWID()
2 | 1 | 7S3 | NEWID()
2 | 2 | 7S3 | NEWID()
3 | 1 | 432 | NEWID()
3 | 2 | 43P | NEWID()
That is, duplicate occurrences of id + val are removed, keeping one row per combination.
I have tried the following (pseudo code):
SELECT *
from table
WHERE uid = MAX
GROUP BY id, val
Does anyone know of a solution to this, or am I missing something here? I do not mind which of the duplicates is returned.
Also, the version of Sybase I am using does not support the OVER (PARTITION BY x, y) functionality.

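Using SQL (T-SQL syntax here) you can do it this way. Also, the WHERE clause in your pseudo code is not valid SQL; aggregate the remaining columns instead. Note that varchar_id and uid are aggregated independently, so the two values may come from different rows of the same group.
DECLARE @T TABLE (ID INT, Val INT, V_ID VARCHAR(50), uidd UNIQUEIDENTIFIER)
INSERT INTO @T VALUES
(1,1,'A4D',NEWID()),
(1,2,'A3G',NEWID()),
(2,1,'7S3',NEWID()),
(2,1,'43E',NEWID()),
(2,2,'7S3',NEWID()),
(2,2,'431',NEWID()),
(3,1,'432',NEWID()),
(3,2,'43P',NEWID())
SELECT t.id, t.Val, MAX(V_ID) AS varchar_id, MAX(uidd) AS uid
FROM @T AS t
GROUP BY id, val
ORDER BY id, val
This will give you a result like the following (the uid values are random):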
+---+----+-----------+-------------------------------------+
|id |Val |varchar_id |uid |
+---+----+-----------+-------------------------------------+
|1 |1 |A4D |5296ACE4-573A-4A7E-882F-516EA8E9DBDD |
|1 |2 |A3G |3EE82BEE-8C18-4415-BB3D-110F443409B5 |
|2 |1 |7S3 |68DBF7B3-316D-4A8B-B8AD-8825EC83585D |
|2 |2 |7S3 |01C54277-7156-47E1-9205-DD577A726196 |
|3 |1 |432 |6F53F332-FC9C-4EE1-A3D2-1D0FD002DDAF |
|3 |2 |43P |7B532EBD-E6C9-4BE4-B0F7-FCBCB9CE1D61 |
+---+----+-----------+-------------------------------------+
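If the returned varchar_id and uid must belong to the same source row, one possible alternative is a correlated subquery that keeps only the row whose uid is the largest within its (id, val) group. This avoids window functions, which the question says are unavailable. The sketch below is only an illustration: it uses SQLite through Python as a stand-in for Sybase, the table name `t` is hypothetical, and the `u1`..`u8` strings are placeholders for the NEWID() values.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real Sybase table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, val INTEGER, varchar_id TEXT, uid TEXT)")
rows = [
    (1, 1, "A4D", "u1"),
    (1, 2, "A3G", "u2"),
    (2, 1, "7S3", "u3"),
    (2, 1, "43E", "u4"),
    (2, 2, "7S3", "u5"),
    (2, 2, "431", "u6"),
    (3, 1, "432", "u7"),
    (3, 2, "43P", "u8"),
]
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# Keep the row whose uid is the maximum within its (id, val) group.
query = """
SELECT t.id, t.val, t.varchar_id, t.uid
FROM t
WHERE t.uid = (SELECT MAX(t2.uid)
               FROM t AS t2
               WHERE t2.id = t.id AND t2.val = t.val)
ORDER BY t.id, t.val
"""
deduped = conn.execute(query).fetchall()
for row in deduped:
    print(row)
```

Because uid is unique, exactly one row per (id, val) survives, and every returned row is an intact row of the original table.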

Related

Return top 10 values from each combination of codes from two columns in SQL

For my analysis, I need 10 records from each combination of two columns that hold channel and category codes. For example:
|COUNT| Channel_Code | Category_Code |
|-----| ------------ | ------------- |
|9526 | ABC | DEF |
|4527 | ABC | JFK |
|10 | ABC | 123 |
|912 | WED | MLK |
|75 | KJJ | ONL |
|1000 | WED | DEF |
I have only tried filtering on
WHERE channel_code = 'ABC'
AND Category_Code = 'DEF'
Sample 10;
I also tried using rownum, but no luck.
What I'm expecting the output to look like:
|RECORD NUM| Channel_Code | Category_Code |
|----------| ------------ | ------------- |
|1 | ABC | DEF |
|2 | ABC | DEF |
|3 | ABC | DEF |
|4 | ABC | DEF |
|5 | ABC | DEF |
|6 | ABC | DEF |
Etc… up until the 10th record. Then the next combination will start with 10 records of ABC and JFK
Is there a way to partition this in Teradata SQL? Or another possible solution. Thanks for your help!
You can use ROW_NUMBER(), as you mentioned:
SELECT
record_num, channel_code, category_code
FROM (SELECT record_num, channel_code, category_code,
ROW_NUMBER() OVER (PARTITION BY channel_code, category_code ORDER BY record_num ASC) AS rn
FROM table_name
) AS t
WHERE rn <= 10
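To check the top-N-per-group pattern on a small data set, here is a minimal sketch that runs the same kind of query against SQLite from Python. SQLite is only a stand-in for Teradata here, it needs version 3.25 or later for window functions, and the table name `records`, the column `record_num`, and the group sizes are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the Teradata table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (record_num INTEGER, channel_code TEXT, category_code TEXT)"
)

# Made-up group sizes: two groups above the limit of 10, one below it.
data = []
n = 0
for channel, category, count in [("ABC", "DEF", 25), ("ABC", "JFK", 12), ("ABC", "123", 3)]:
    for _ in range(count):
        n += 1
        data.append((n, channel, category))
conn.executemany("INSERT INTO records VALUES (?, ?, ?)", data)

# Number the rows inside each (channel_code, category_code) group, keep the first 10.
top_n = conn.execute("""
SELECT channel_code, category_code, rn
FROM (
    SELECT channel_code, category_code,
           ROW_NUMBER() OVER (PARTITION BY channel_code, category_code
                              ORDER BY record_num) AS rn
    FROM records
) AS ranked
WHERE rn <= 10
ORDER BY channel_code, category_code, rn
""").fetchall()

print(len(top_n), "rows returned")
```

Groups with fewer than 10 rows are returned in full; larger groups are cut off at 10.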
If you are basically trying to create these rows, you can use a cross join to a simple numbers table.
create volatile table vt_nums
(num integer)
on commit preserve rows;
insert into vt_nums values(1);
insert into vt_nums values(2);
insert into vt_nums values(3);
insert into vt_nums values(4);
insert into vt_nums values(5);
And here's some made up data to join with:
create volatile table vt_foo
(col1 varchar(10))
on commit preserve rows;
insert into vt_foo values ('a');
insert into vt_foo values ('b');
Finally:
select
vt_nums.num,
vt_foo.col1
from
vt_foo
cross join vt_nums
order by
2,1
Which will return:
num col1
1 a
2 a
3 a
4 a
5 a
1 b
2 b
3 b
4 b
5 b

SQL Pagination including duplicate rows

I'm having some trouble solving an issue with pagination in SQL.
I'm stuck trying to fill a @PageSize variable in my stored procedure. The value comes from OData, but it doesn't necessarily get me what I'm after. My query returns results like this:
+----+----------+
| ID | PersonID |
+----+----------+
| 1 | 1 |
+----+----------+
| 1 | 2 |
+----+----------+
| 2 | 1 |
+----+----------+
| 2 | 2 |
+----+----------+
| 2 | 3 |
+----+----------+
| 3 | 4 |
+----+----------+
| 3 | 4 |
+----+----------+
Obviously, if I got @PageResult = 5 from OData, it would just return 5 rows, but I want it to return x occurrences of ID.
To demonstrate what I want: if my @PageSize is 1, my sproc should return this.
+----+----------+
| ID | PersonID |
+----+----------+
| 1 | 1 |
+----+----------+
| 1 | 2 |
+----+----------+
If it is 2, I return this.
+----+----------+
| ID | PersonID |
+----+----------+
| 1 | 1 |
+----+----------+
| 1 | 2 |
+----+----------+
| 2 | 1 |
+----+----------+
| 2 | 2 |
+----+----------+
| 2 | 3 |
+----+----------+
And so on. I'm having no end of trouble trying to get it to return data this way. I've tried things like distinct top(@PageSize) ID, but it always seems to get the order wrong, so it misses IDs, and dense_rank doesn't appear to do the job either. I imagine this is causing me so much hassle because there is no default order in SQL, so the solution is not so obvious. Can any of you suggest how I might achieve this?
The closest I've gotten is with this:
SET @PageSize = (select COUNT(personId) from #temptable WHERE ID IN (SELECT DISTINCT TOP(@PageSize) ID From #temptable))
Try something like this:
declare @t table(ID int, PersonID int)
insert into @t(ID,PersonID) values
(1,1),(1,2),(2,1),(2,2),(2,3),(3,3),(3,4);
with q as
(
select id, row_number() over (order by ID) rn
from @t
group by id
)
select *
from @t
where id in
(
select id
from q
where rn between 1 and 2
)
order by ID, PersonID
which outputs
ID PersonID
----------- -----------
1 1
1 2
2 1
2 2
2 3
(5 rows affected)
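The same idea can be parameterized by page size and page number. The following is a rough sketch, not the answerer's code: it uses DENSE_RANK() with an explicit ORDER BY so that every row of the same ID shares one rank, and it runs on SQLite through Python as a stand-in for SQL Server. The function name `fetch_page` and the table name `t` are hypothetical.

```python
import sqlite3


def fetch_page(conn, page_size, page_number=1):
    """Return all rows belonging to the page_number-th block of page_size distinct IDs."""
    first = (page_number - 1) * page_size + 1
    last = page_number * page_size
    return conn.execute(
        """
        WITH ranked AS (
            SELECT ID, PersonID,
                   DENSE_RANK() OVER (ORDER BY ID) AS id_rank
            FROM t
        )
        SELECT ID, PersonID
        FROM ranked
        WHERE id_rank BETWEEN ? AND ?
        ORDER BY ID, PersonID
        """,
        (first, last),
    ).fetchall()


# Same sample rows as the answer above, loaded into an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER, PersonID INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 3), (3, 4)],
)

print(fetch_page(conn, 1))      # rows for the first ID only
print(fetch_page(conn, 2))      # rows for the first two IDs
print(fetch_page(conn, 2, 2))   # second page: rows for the third ID
```

The explicit ORDER BY inside the window is what makes the paging deterministic; without it the database is free to rank the IDs in any order.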

How to get a hierarchy table in Sql Server

I would like to create a table that shows the hierarchy of another SQL Server table.
I have a table with the following structure
+-----------+----------+
| AccountID | ParentID |
+-----------+----------+
| 1 | |
+-----------+----------+
| 2 | 1 |
+-----------+----------+
| 3 | 1 |
+-----------+----------+
| 4 | 2 |
+-----------+----------+
| 5 | 3 |
+-----------+----------+
| 6 | 5 |
+-----------+----------+
and would like to get another table with the following structure
+-----------+------+
| AccountID | Path |
+-----------+------+
| 1 | 1 |
+-----------+------+
| 2 | 1 |
+-----------+------+
| 2 | 2 |
+-----------+------+
| 3 | 1 |
+-----------+------+
| 3 | 3 |
+-----------+------+
| 4 | 1 |
+-----------+------+
| 4 | 2 |
+-----------+------+
| 4 | 4 |
+-----------+------+
| 5 | 1 |
+-----------+------+
| 5 | 3 |
+-----------+------+
| 5 | 5 |
+-----------+------+
| 6 | 1 |
+-----------+------+
| 6 | 3 |
+-----------+------+
| 6 | 5 |
+-----------+------+
| 6 | 6 |
+-----------+------+
Note: the Path column must always include the account's own ID, i.e., 1-1, 2-2, etc.
If you see in the first table, for AccountID 1, there is no ParentID, because it is the highest hierarchical level. But in the table I need to extract, you see that for AccountID 1 the value 1 appears in the Path column. The same happens for the rest of the values, that is, for AccountID 2, in the result table AccountID 1 appears (its superior hierarchical value), but it is also necessary that it includes the value 2. And so for the rest of the values in the AccountID column.
Setup sample data:
create table Account
(
AccountID INT,
ParentID INT NULL
)
INSERT INTO Account(AccountID, ParentID)
VALUES
(1, NULL),
(2,1),
(3,1),
(4,2),
(5,3),
(6,5)
I'm not able to get these results. Could you help me?
Thanks in advance
As mentioned, the easiest way to achieve this is with an rCTE (recursive CTE): start from every account and recurse up through each level of the hierarchy until you reach the top:
--Sample Data
WITH YourTable AS(
SELECT V.AccountID,
V.ParentID
FROM (VALUES(1,NULL),
(2,1),
(3,1),
(4,2),
(5,3),
(6,5))V(AccountID,ParentID)),
--Solution
rCTe AS(
SELECT YT.AccountID AS RootID,
YT.AccountID,
YT.ParentID
FROM YourTable YT
UNION ALL
SELECT r.RootID,
YT.AccountID,
YT.ParentID
FROM rCTe r
JOIN YourTable YT ON r.ParentID = YT.AccountID)
SELECT r.RootID AS AccountID,
r.AccountID AS [Path]
FROM rCTe r
ORDER BY AccountID,
[Path];
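For a quick check of the recursion against the sample data from the question, here is a small sketch that runs an equivalent recursive CTE on SQLite from Python. SQLite is only a stand-in for SQL Server (it requires the RECURSIVE keyword, and the CTE name `walk` is made up); the anchor member emits each account paired with itself, and the recursive member climbs to the parent.

```python
import sqlite3

# In-memory copy of the Account table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (AccountID INTEGER, ParentID INTEGER)")
conn.executemany(
    "INSERT INTO Account VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, 2), (5, 3), (6, 5)],
)

# Anchor: every account as its own root. Recursive step: move to the parent row.
paths = conn.execute("""
WITH RECURSIVE walk(RootID, AccountID, ParentID) AS (
    SELECT AccountID, AccountID, ParentID
    FROM Account
    UNION ALL
    SELECT w.RootID, a.AccountID, a.ParentID
    FROM walk AS w
    JOIN Account AS a ON a.AccountID = w.ParentID
)
SELECT RootID AS AccountID, AccountID AS Path
FROM walk
ORDER BY 1, 2
""").fetchall()

for account_id, path in paths:
    print(account_id, path)
```

The recursion stops on its own at account 1, because its ParentID is NULL and a NULL never satisfies the join condition.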
I tried this statement, based on yours,
WITH rCTe AS (
SELECT YT.Accountid AS RootID,
YT.Accountid,
YT.Parentaccountid
FROM PBI_OrganizacionJerarquica YT
UNION ALL
SELECT r.RootID,
YT.Accountid,
YT.Parentaccountid
FROM rCTe r
JOIN PBI_OrganizacionJerarquica YT ON r.Parentaccountid = YT.Accountid)
SELECT r.RootID AS AccountID,
r.Accountid AS [Path]
FROM rCTe r
ORDER BY AccountId,
[Path];
and I get this error
Msg 319, Level 15, State 1, Line 3
Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.

SQL: How to select distinct on some columns

I have a table looking something like this:
+---+------------+----------+
|ID | SomeNumber | SomeText |
+---+------------+----------+
|1 | 100 | 'hey' |
|2 | 100 | 'yo' |
|3 | 100 | 'yo' | <- Second occurrence
|4 | 200 | 'ey' |
|5 | 200 | 'hello' |
|6 | 200 | 'hello' | <- Second occurrence
|7 | 300 | 'hey' | <- Single
+---+------------+----------+
I would like to extract the rows where SomeNumber appears more than once, and where the combination of SomeNumber and SomeText is distinct. That means I would like the following:
+---+------------+----------+
|ID | SomeNumber | SomeText |
+---+------------+----------+
|1 | 100 | 'hey' |
|2 | 100 | 'yo' |
|4 | 200 | 'ey' |
|5 | 200 | 'hello' |
+---+------------+----------+
I don't know what to do here.
I need something along the lines:
SELECT t.ID, DISTINCT(t.SomeNumber, t.SomeText) --this is not possible
FROM (
SELECT mt.ID, mt.SomeNumber, mt.SomeText
FROM MyTable mt
GROUP BY mt.SomeNumber, mt.SomeText --can't without mt.ID
HAVING COUNT(*) > 1
)
Any suggestions?
Using a CTE with ROW_NUMBER() and a windowed COUNT might get you what you need:
Create and populate sample table (Please save us this step in your future questions):
CREATE TABLE MyTable(id int, somenumber int, sometext varchar(10));
INSERT INTO MyTable VALUES
(1,100,'hey'),
(2,100,'yo'),
(3,100,'yo'),
(4,200,'ey'),
(5,200,'hello'),
(6,200,'hello'),
(7,300,'hey');
The query:
;WITH cte as
(
SELECT id,
someNumber,
someText,
ROW_NUMBER() OVER (PARTITION BY someNumber, someText ORDER BY ID) rn,
COUNT(id) OVER (PARTITION BY someNumber) rc
FROM MyTable
)
SELECT id, someNumber, someText
FROM cte
WHERE rn = 1
AND rc > 1
Results:
id someNumber someText
1 100 hey
2 100 yo
4 200 ey
5 200 hello
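On a database without window functions, a plain GROUP BY can produce the same rows: MIN(id) picks the first occurrence of each (someNumber, someText) pair, and a subquery with HAVING restricts the result to numbers that appear more than once. This is only a sketch of that alternative, run on SQLite through Python as a stand-in for the asker's database.

```python
import sqlite3

# Same sample table as above, loaded into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (id INTEGER, somenumber INTEGER, sometext TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1, 100, "hey"),
        (2, 100, "yo"),
        (3, 100, "yo"),
        (4, 200, "ey"),
        (5, 200, "hello"),
        (6, 200, "hello"),
        (7, 300, "hey"),
    ],
)

# MIN(id) keeps the first occurrence of each pair; the subquery drops
# numbers that occur only once in the table.
result = conn.execute("""
SELECT MIN(id) AS id, somenumber, sometext
FROM MyTable
WHERE somenumber IN (
    SELECT somenumber
    FROM MyTable
    GROUP BY somenumber
    HAVING COUNT(*) > 1
)
GROUP BY somenumber, sometext
ORDER BY 1
""").fetchall()

for row in result:
    print(row)
```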

Multiple selects really needed?

I have the following table.
____________________________________
| carid | changeid | data1 | data2 |
|_______|__________|_______|_______|
| 1 | 1 |a |b |
| 1 | 2 |c |d |
| 1 | 3 |e |f |
| 2 | 3 |g |h |
| 2 | 2 |i |j |
| 2 | 4 |k |l |
| 3 | 5 |m |n |
| 3 | 1 |o |p |
| 4 | 6 |q |r |
| 4 | 2 |s |t |
|_______|__________|_______|_______|
I want to select the following result:
| carid | changeid | data1 | data2 |
|_______|__________|_______|_______|
| 1 | 1 |a |b |
| 1 | 2 |c |d |
| 1 | 3 |e |f |
| 3 | 5 |m |n |
| 3 | 1 |o |p |
|_______|__________|_______|_______|
In words:
If a row has changeid=1 I want to select all the rows with the same carid as the row with changeid=1.
This problem is quite easy to solve with a query using multiple selects. First select all rows with changeid=1 and take those carids and select all rows with those carids. Simple enough.
I was more wondering if it is possible to solve this problem without using multiple selects? Preferably I'm looking for a faster solution but I can try that out myself.
You can join the table back to itself
SELECT DISTINCT a.*
FROM YourTable a
INNER JOIN YourTable b ON b.carid = a.carid and b.changeid = 1
Table a is all the rows you want to output, filtered by table b, which limits the set to carids that have a row with changeid = 1.
This should perform well, as everything is done in a set-oriented manner.
DISTINCT is not necessary if changeid 1 can occur only once per carid; in that case drop it, as it may introduce a significant performance hit for a large result set.
By multiple selects, do you mean using IN?
SELECT carid, changeid, data1, data2
FROM YourTable
WHERE carid IN (SELECT carid FROM YourTable WHERE changeid = 1)
Most databases support window functions. You can do this as:
select carid, changeid, data1, data2
from (select t.*,
max(case when changeid = 1 then 1 else 0 end) over
(partition by carid) as HasChangeId1
from YourTable t
) t
where HasChangeId1 = 1;
If the "1" is the minimum value for the change id, this can be simplified to:
select carid, changeid, data1, data2
from (select t.*,
min(changeid) over (partition by carid) as MinChangeId
from YourTable t
) t
where MinChangeId = 1;
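To compare the approaches, the sketch below runs the self-join and the IN subquery from the answers above, plus an EXISTS variant that is not in the original answers, against the sample data and collects the rows each one returns. It uses SQLite through Python purely as a test bed; it says nothing about relative performance on a real database.

```python
import sqlite3

# Sample data from the question, in an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (carid INTEGER, changeid INTEGER, data1 TEXT, data2 TEXT)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?)",
    [
        (1, 1, "a", "b"), (1, 2, "c", "d"), (1, 3, "e", "f"),
        (2, 3, "g", "h"), (2, 2, "i", "j"), (2, 4, "k", "l"),
        (3, 5, "m", "n"), (3, 1, "o", "p"),
        (4, 6, "q", "r"), (4, 2, "s", "t"),
    ],
)

queries = {
    # Self-join, as in the first answer.
    "join": """
        SELECT DISTINCT a.*
        FROM YourTable a
        INNER JOIN YourTable b ON b.carid = a.carid AND b.changeid = 1
        ORDER BY a.carid, a.changeid""",
    # IN subquery, as in the second answer.
    "in": """
        SELECT carid, changeid, data1, data2
        FROM YourTable
        WHERE carid IN (SELECT carid FROM YourTable WHERE changeid = 1)
        ORDER BY carid, changeid""",
    # EXISTS variant (an addition for comparison, not from the answers).
    "exists": """
        SELECT a.carid, a.changeid, a.data1, a.data2
        FROM YourTable a
        WHERE EXISTS (SELECT 1 FROM YourTable b
                      WHERE b.carid = a.carid AND b.changeid = 1)
        ORDER BY a.carid, a.changeid""",
}

results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```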
It sounds like you're after only the combinations of carid and changeid present in the table, in which case the DISTINCT will return only the unique combinations for you. Not sure if that is what you're after but give it a go and check it for your expected behaviour...
SELECT DISTINCT CARID, CHANGEID FROM UnknownTable