Select all related records - sql

I have a table (in SQL Server) that stores records as shown below. The purpose for Old_Id is for change tracking.
Meaning that when I want to update a record, the original record has to be unchanged, but a new record has to be inserted with a new Id and with updated values, and with the modified record's Id in Old_Id column
Id Name Old_Id
---------------------
1 Paul null
2 Paul 1
3 Jim null
4 Paul 2
5 Tim null
My question is:
When I search for id = 1 or 2 or 4, I want to select all related records.
In this case I want see records the following ids: 1, 2, 4
How can it be written in a stored procedure?
Even if it's bad practice to go with this, I can't change this logic because its legacy database and it's quite a large database.
Can anyone help with this?

you can do that with Recursive Common Table Expressions (CTE)
WITH cte_history AS (
SELECT
h.id,
h.name,
h.old_id
FROM
history h
WHERE old_id IS NULL
and id in (1,2,4)
UNION ALL
SELECT
e.id,
e.name,
e.old_id
FROM
history e
INNER JOIN cte_history o
ON o.id = e.old_id
)
SELECT * FROM cte_history;

Related

How to remove Repeated field in BigQuery schema?

I have a schema that has a repeated field nested into another repeated field like so: person.children.toys. I want to make this inner field not repeated (so child can have only single nullable toy). I know that for such change I need to make a new table with new schema and run SQL query that inserts modified results into it, but I don't know how to make the query. I need it to select first toy (or null) for each child and insert resulting objects into new table. There is a guarantee that in source table all children have no more than 1 toy.
Below is for BigQuery Standard SQL
I know - it might look over-complicated - but it totally preserves original schema while eliminating all but first (or null) toys. This can be handy if your real schema has more than just few fields so you don't need to worry about them
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 id, STRUCT([STRUCT('mike' AS name, ['woody'] AS toys)] AS children) AS person UNION ALL
SELECT 2 id, STRUCT([STRUCT('nik', ['buzz', 'bobeep']), ('john', ['car', 'buzz', 'bobeep'])] AS children) AS person UNION ALL
SELECT 3 id, STRUCT([STRUCT('vincent', IF(TRUE,[],['']))] AS children) AS person
)
SELECT *
REPLACE(
(SELECT AS STRUCT *
REPLACE (
(SELECT ARRAY_AGG(t) FROM
(SELECT * REPLACE((SELECT toy FROM UNNEST(toys) toy WITH OFFSET ORDER BY OFFSET LIMIT 1) AS toys) FROM UNNEST(children)) t)
AS children)
FROM UNNEST([person]))
AS person)
FROM `project.dataset.table`
If to apply to below data
Row id person.children.name person.children.toys
1 1 mike toy1
2 2 nik toy2
toy3
john toy4
toy5
toy6
3 3 vincent
result will be
Row id person.children.name person.children.toys
1 1 mike toy1
2 2 nik toy2
john toy4
3 3 vincent null
Note: toys field originally REPEATED STRING becomes just STRING
I could give you a better answer if you had a better described schema, but with the data provided:
CREATE OR REPLACE TABLE `temp.flat` AS
WITH data AS (
SELECT 1 id, STRUCT([STRUCT(['woody']AS toy)] AS children) AS person
UNION ALL
SELECT 2 id, STRUCT([STRUCT(['buzz', 'bobeep'])] AS children) AS person
UNION ALL
SELECT 3 id, STRUCT([STRUCT(IF(true,[],['']))] AS children) AS person
)
SELECT id, person.children[SAFE_OFFSET(0)].toy[SAFE_OFFSET(0)] first_toy
FROM `data`
Goes from:
To:

Eliminate NULL records in distinct select statement

In SQL SERVER 2008
Relation : Employee
empid clock-in clock-out date Cmpid
1 10 11 17-06-2015 001
1 11 12 17-06-2015 NULL
1 12 1 NULL 001
2 10 11 NULL 002
2 11 12 NULL 002
I need to populate table temp :
insert into temp
select distinct empid,date from employee
This gives all
3 records since they are distinct but what
I need is
empid date CMPID
1 17-06-2015 001
2 NULL 002
Depending on the size and scope of your table, it might just be more prudent to add
WHERE columnName is not null AND columnName2 is not null to the end of your query.
Null is different from other date value. If you wont exclude null record you have to add a and condition like table.filed is not null.
It sounds like what you want is a result table containing a row or tuple (relational databases don't have records) for every employee with a date column showing the date on which the worked or null if they didn't work. Right?
Something like this should do you:
select e.employee_id
from ( select distinct
empid
from employee
) master
left join employee detail on detail.empid = master.empid
and detail.date is not null
The master virtual table gives you the set of destinct employees; the detail gives you employees with non-null dates on which they worked. The left join gives you everything from master with any matches from detail blended in.
Rows in master with no matching rows in details, are returned once with the contributing columns from detail set to null. Rows in master with matching rows in detailare repeated once for each such match, with the detail columns reflecting the matching row's values.
This will give you the lowest date or null for each empid
SELECT empid,
MIN(date) date,
MIN(cmpid) cmpid
FROM employee
GROUP BY empid
try this
select distinct empid,date from employee where date is not null

sql select query self join or loop through to fetch records [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Closed 8 years ago.
Improve this question
I have this kind of scenario in sql server I have table named Room and here is the data of it and I want output something like this as shown in this picture I have tried to show my table named room and then on top of it I have placed tag input which have RoomId,ConnectingRoomID and many more other columns now what I want is a sql select query that can return me the scenario I have placed with tag name output..
These values are self created I have thousand of rooms and in room table and thousand of connecting room with it hope my question is clear enough thanks.
I think you can use this:
with x as (
select *, sum(case connectingroomid when 0 then 1 else 0 end) over(order by roomid) as grp
from rooms
)
select x.roomid, (select min(x2.roomid) as min_roomid from x x2 where x2.grp = x.grp) as connectingroomid
from x
This is a recursive query: For all rooms go to the connecting room till you find the one that has no more connecting room (i.e. connecting room id is 0).
with rooms (roomid, connectingroomid) as
(
select
roomid,
case when connectingroomid = 0 then
roomid
else
connectingroomid
end as connectingroomid
from room
where connectingroomid = 0
union all
select room.roomid, rooms.connectingroomid
from room
inner join rooms on room.connectingroomid = rooms.roomid
)
select * from rooms
order by connectingroomid, roomid;
Here is the SQL fiddle: http://www.sqlfiddle.com/#!3/46ed0/1.
EDIT: Here is the explanation. Rather than doing this in the comments I am doing it here for better readability.
The WITH clause is used to create a recursion here. You see I named it rooms and inside rooms I select from rooms itself. Here is how to read it: Start with the part before UNION ALL. Then recursively do the part after UNION ALL. So, before UNION ALL I only select the records where connectingroomid is zero. In your example you show every room with its connectingroomid except for those with connectingroomid for which you show the room with itself. I use CASE here to do the same. But now that I am explaining this, I notice that connectingroomid is always zero because of the WHERE clause. So the statement can be simplified thus:
with rooms (roomid, connectingroomid) as
(
select
roomid,
roomid as connectingroomid
from room where connectingroomid = 0
union all
select room.roomid, rooms.connectingroomid
from room
inner join rooms on room.connectingroomid = rooms.roomid
)
select * from rooms
order by connectingroomid, roomid;
The SQL fiddle: http://www.sqlfiddle.com/#!3/46ed0/2.
With the part before the UNION ALL I found the two rooms without connecting room. Now the part after UNION ALL is executed for the two rooms found. It selects the rooms which connecting room was just found. And then it selects the rooms which connecting room was just found. And so on till the join returns no more rooms.
Hope this helps understanding the query. You can look for "recursive cte" on the Internet to find more examples and explanations on the topic.
select RoomID,
(Case when RoomID<=157 then 154
else 158 end) ConnectingRoomID
from Input
First of all, your output is not correct: Room 154 should also connect to room 0 :-)
What you are after is the transitive closure of the relation defined by the table Room. It is impossible to get this with "vanilla" SQL. There are however, a few extensions to SQL to make recursive queries possible.
For example, If I have a relation "graph":
src | target
-----+--------
1 | 2
2 | 3
3 | 4
5 | 6
6 | 7
I can define a new table "closure" with the same fields:
WITH RECURSIVE closure (src, target) AS
(SELECT src, target FROM
graph
UNION
SELECT graph.src, closure.target FROM graph, closure WHERE
graph.target = closure.src)
SELECT * FROM closure
Note that "closure" occurs in its own definition (that is why this is a recursive query) It uses the original graph as a "seed" and grows by adding tuples with increasing distance (inspecting itself to do so).
The result (it clearly shows how the relation has grown):
src | target
-----+--------
1 | 2
2 | 3
3 | 4
5 | 6
6 | 7
1 | 3
2 | 4
5 | 7
1 | 4
If you are only interested in pairs that cannot be extended further, as in your original example, you could add an extra field "distance" to the closure table and use a GROUP BY clause to keep only the maximal pairs.
Disclaimer: I'm not on Windows, and used postgres for this. MS SQL should work very much the same way.
try below sql:
Assumming #input is your input table
Note: I added an ID column in the #input table
declare #input table
(
id int identity,
RoomId int,
ConnectingRoomId int
)
insert into #input
select 154,0 union all
select 155,154 union all
select 156,155 union all
select 157,156 union all
select 158, 0 union all
select 159, 158 union all
select 160, 159
**UPDATED: remove the union **
SQL:
select
d.id,
d.roomId
,max(d.connectingRoomId) as ConnectingRoomId
from
(
select
bb.id,
bb.RoomId
,b.RoomId as connectingRoomId
from #input b
right join
(
select
a.id,
a.RoomId,a.ConnectingRoomId
from #input a
) bb on (b.id < bb.Id) or b.Id = bb.Id
where b.ConnectingRoomId = 0
) d
group by d.id, d.RoomId
/*
Result (OUTPUT TABLE)
id roomId ConnectingRoomId
----------- ----------- ----------------
1 154 154
2 155 154
3 156 154
4 157 154
5 158 158
6 159 158
7 160 158
*/

write a query to identify discrepancy

I have a table with Student ID's and Student Names. There has been issues with assigning unique Student Id's to students and Hence I want to find the duplicates
Here is the sample Table:
Student ID Student Name
1 Jack
1 John
1 Bill
2 Amanda
2 Molly
3 Ron
4 Matt
5 James
6 Kathy
6 Will
Here I want a third column "Duplicate_Count" to display count of duplicate records.
For e.g. "Duplicate_Count" would display "3" for Student ID = 1 and so on. How can I do this?
Thanks in advance
Select StudentId, Count(*) DupCount
From Table
Group By StudentId
Having Count(*) > 1
Order By Count(*) desc,
Select
aa.StudentId, aa.StudentName, bb.DupCount
from
Table as aa
join
(
Select StudentId, Count(*) as DupCount from Table group by StudentId
) as bb
on aa.StudentId = bb.StudentId
The virtual table gives the count for each StudentId, this is joined back to the original table to add the count to each student record.
If you want to add a column to the table to hold dupcount, this query can be used in an update statement to update that column in the table
This should work:
update mytable
set duplicate_count = (select count(*) from mytable t where t.id = mytable.id)
UPDATE:
As mentioned by #HansUp, adding a new column with the duplicate count probably doesn't make sense, but that really depends on what the OP originally thought of using it for. I'm leaving the answer in case it is of help for someone else.

Sql COALESCE entire rows?

I just learned about COALESCE and I'm wondering if it's possible to COALESCE an entire row of data between two tables? If not, what's the best approach to the following ramblings?
For instance, I have these two tables and assuming that all columns match:
tbl_Employees
Id Name Email Etc
-----------------------------------
1 Sue ... ...
2 Rick ... ...
tbl_Customers
Id Name Email Etc
-----------------------------------
1 Bob ... ...
2 Dan ... ...
3 Mary ... ...
And a table with id's:
tbl_PeopleInCompany
Id CompanyId
-----------------
1 1
2 1
3 1
And I want to query the data in a way that gets rows from the first table with matching id's, but gets from second table if no id is found.
So the resulting query would look like:
Id Name Email Etc
-----------------------------------
1 Sue ... ...
2 Rick ... ...
3 Mary ... ...
Where Sue and Rick was taken from the first table, and Mary from the second.
SELECT Id, Name, Email, Etc FROM tbl_Employees
WHERE Id IN (SELECT ID From tbl_PeopleInID)
UNION ALL
SELECT Id, Name, Email, Etc FROM tbl_Customers
WHERE Id IN (SELECT ID From tbl_PeopleInID) AND
Id NOT IN (SELECT Id FROM tbl_Employees)
Depending on the number of rows, there are several different ways to write these queries (with JOIN and EXISTS), but try this first.
This query first selects all the people from tbl_Employees that have an Id value in your target list (the table tbl_PeopleInID). It then adds to the "bottom" of this bunch of rows the results of the second query. The second query gets all tbl_Customer rows with Ids in your target list but excluding any with Ids that appear in tbl_Employees.
The total list contains the people you want — all Ids from tbl_PeopleInID with preference given to Employees but missing records pulled from Customers.
You can also do this:
1) Outer Join the two tables on tbl_Employees.Id = tbl_Customers.Id. This will give you all the rows from tbl_Employees and leave the tbl_Customers columns null if there is no matching row.
2) Use CASE WHEN to select either the tbl_Employees column or tbl_Customers column, based on whether tbl_Customers.Id IS NULL, like this:
CASE WHEN tbl_Customers.Id IS NULL THEN tbl_Employees.Name ELSE tbl_Customers.Name END AS Name
(My syntax might not be perfect there, but the technique is sound).
This should be pretty performant. It uses a CTE to basically build a small table of Customers that have no matching Employee records, and then it simply UNIONs that result with the Employee records
;WITH FilteredCustomers (Id, Name, Email, Etc)
AS
(
SELECT Id, Name, Email, Etc
FROM tbl_Customers C
INNER JOIN tbl_PeopleInCompany PIC
ON C.Id = PIC.Id
LEFT JOIN tbl_Employees E
ON C.Id = E.Id
WHERE E.Id IS NULL
)
SELECT Id, Name, Email, Etc
FROM tbl_Employees E
INNER JOIN tbl_PeopleInCompany PIC
ON C.Id = PIC.Id
UNION
SELECT Id, Name, Email, Etc
FROM FilteredCustomers
Using the IN Operator can be rather taxing on large queries as it might have to evaluate the subquery for each record being processed.
I don't think the COALESCE function can be used for what you're thinking. COALESCE is similar to ISNULL, except it allows you to pass in multiple columns, and will return the first non-null value:
SELECT Name, Class, Color, ProductNumber,
COALESCE(Class, Color, ProductNumber) AS FirstNotNull
FROM Production.Product
This article should explain it's application:
http://msdn.microsoft.com/en-us/library/ms190349.aspx
It sounds like Larry Lustig's answer is more along the lines of what you need though.