SQL - Append counter to recurring value in query output

I am in the process of creating an organizational chart for my company. To create the chart, the data must have a unique role identifier and a unique 'reports to role' identifier for each line. Unfortunately my data is not playing ball, and it is out of my scope to change the source.
I have two source tables, simplified in the image below. It is important to note a couple of things in the data.
An employee's manager in the query needs to come from the [EmpData] table. The 'ReportsTo' field is only in the [Role] table to be used when a role is vacant.
Any number of employees can hold the same role, but for simplicity let's assume that there will only ever be one person in the 'Reports to' role.
Using this sample data, my query is as follows:
-- Join Role table with employee data table.
-- Right join so roles with more than one employee will generate a row each.
SELECT [Role].RoleId As PositionId
,[EmpData].ReportsToRole As ReportsToPosition
,[Role].RoleTitle
,[Empdata].EmployeeName
FROM [Role]
RIGHT JOIN [EmpData] ON [Role].RoleId=[EmpData].[Role]
UNION
-- Output all roles that do not have a holder, with 'VACANT' as the employee name.
SELECT [Role].RoleId
,[Role].ReportsToRole
,[Role].RoleTitle
,'VACANT'
FROM [Role]
WHERE [Role].RoleID NOT IN (SELECT RoleID from [empdata])
This almost creates the intended output, but each operator role has 'OPER' in the PositionId column.
For the charting software to work, each position must have a unique identifier.
Any thoughts on how to achieve this outcome? I'm specifically chasing the appended -01, -02, -03 etc. highlighted yellow in the Desired Query Output.

If you are using T-SQL, you should look into using the ROW_NUMBER function with a PARTITION BY clause and combining the result with your existing column.
Specifically, you would add a column to your select of ROW_NUMBER () OVER (PARTITION BY PositionID ORDER BY ReportsToPosition,EmployeeName) AS SeqNum
I would add that to your first query, and then, in your second, I would do something like SELECT PositionID + CASE SeqNum WHEN 1 THEN '' ELSE '-' + CAST(SeqNum AS varchar(100)) END, ...
There are multiple ways to do this, but this will leave out the individual ones that don't need a "-1" and only add it to the rest. The major difference between this and your scheme is it doesn't contain the "0" pad on the left, which is easy to do, nor would the first "OPER" be "OPER-1", they would simply be "OPER", but this can also be worked around.
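As a rough illustration of the workaround mentioned above, here is a minimal sketch that numbers every row of a repeated PositionId and pads the counter to two digits. It runs against an in-memory SQLite database from Python purely so it can be executed anywhere. The `positions` table and its sample rows are assumptions standing in for the output of the original UNION query, and SQLite's `printf` and `||` replace the T-SQL string functions. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for the rows produced by the original UNION query
CREATE TABLE positions (PositionId TEXT, ReportsToPosition TEXT, EmployeeName TEXT);
INSERT INTO positions VALUES
    ('MGR',  NULL,  'Ann'),
    ('OPER', 'MGR', 'Bob'),
    ('OPER', 'MGR', 'Cy'),
    ('OPER', 'MGR', 'Di');
""")

rows = conn.execute("""
    SELECT CASE
               -- A PositionId that occurs only once is left untouched
               WHEN COUNT(*) OVER (PARTITION BY PositionId) = 1 THEN PositionId
               -- Otherwise append a zero-padded counter: -01, -02, ...
               ELSE PositionId || '-' || printf('%02d', ROW_NUMBER() OVER (
                        PARTITION BY PositionId
                        ORDER BY ReportsToPosition, EmployeeName))
           END AS UniquePositionId,
           EmployeeName
    FROM positions
    ORDER BY UniquePositionId
""").fetchall()

for row in rows:
    print(row)
```

In T-SQL the padding step could be written as RIGHT('00' + CAST(SeqNum AS varchar(10)), 2), which works for counters below 100.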
Hopefully this gets you what you need!

Related

Trimming amount of many-to-many row combinations in bridge table

In SQL Server I have a bridge table to handle the Many-to-Many relationship that a Person can have to the Areas that they belong. Initially the Person table is loaded with the reference to his/her belonging Area based on their specific address. However as many Addresses basically point to the same single area or same set of multiple areas, I want to Trim/Consolidate the possible combinations into fewer rows in the bridge table (10million+ rows into 10k).
So first I want to generate the new bridge_area table and then populate a new person table with the new Key_Bridge value as per below.
Before doing this, a few things to consider:
Have you changed your processes so this doesn't happen again? When adding new records, you can detect that you already have a mapping to the relevant areas and not create a new duplicate one.
Do you have a plan for how you are going to drop the existing tables and use the new ones? If there are existing foreign keys and indexes, you'll need to account for all of that.
I'm sure there are many different ways of doing this, but one method I think will work:
SELECT key_bridge, cast(key_area.value as int) key_area
INTO bridge_area2
FROM
(
    SELECT row_number() OVER (ORDER BY areas) key_bridge, areas
    FROM
    (
        SELECT
            DISTINCT STRING_AGG(key_area, ',') WITHIN GROUP (ORDER BY key_area) areas
        FROM bridge_area
        GROUP BY key_address
    ) a
) b CROSS APPLY STRING_SPLIT(areas, ',') key_area
The innermost query uses STRING_AGG to group the key_areas for each key_address together. For example, key_address=13 has areas="100,105". Since they are sorted in order, any key_bridge that has the exact same set of key_areas will have an exact match of areas so the distinct will limit this to the minimum number we need in the new bridge_area table. (It's important to use a delimiter here that can't exist in your data, but since your key_areas are numeric, a simple comma will do.)
row_number() is used to generate a new key_bridge value for each set of areas we care about, and CROSS APPLY with STRING_SPLIT is used to separate the areas (e.g. "100,105") back into separate rows. Note the cast to convert back to a numeric format.
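The steps described above (build an ordered signature per address, keep the distinct signatures, number them, split them back into rows) can also be followed outside the database. The sketch below does this in plain Python. The sample `bridge_area` rows are invented for illustration, and a sorted tuple plays the role of the ordered STRING_AGG string.

```python
from collections import defaultdict

# Hypothetical rows of the old bridge table: (key_address, key_area)
bridge_area = [(11, 100), (12, 100), (13, 100), (13, 105), (14, 105), (14, 100)]

# Collect the set of areas behind each address (GROUP BY key_address)
areas_by_address = defaultdict(set)
for key_address, key_area in bridge_area:
    areas_by_address[key_address].add(key_area)

# A sorted tuple is the canonical signature, like the ordered STRING_AGG
signatures = {addr: tuple(sorted(areas)) for addr, areas in areas_by_address.items()}

# One new key per distinct signature (DISTINCT followed by ROW_NUMBER)
distinct_signatures = sorted(set(signatures.values()))
key_for_signature = {sig: n for n, sig in enumerate(distinct_signatures, start=1)}

# Expand each signature back into rows (the CROSS APPLY STRING_SPLIT step)
bridge_area2 = [(key, area) for sig, key in key_for_signature.items() for area in sig]

print(bridge_area2)
```

Addresses 11 and 12 share one area set and addresses 13 and 14 share another, so six original rows collapse to three.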
Then you can create your new person table from that:
SELECT key_person,
    (SELECT key_bridge
     FROM bridge_area2 b2
     GROUP BY key_bridge
     HAVING STRING_AGG(b2.key_area, ',') WITHIN GROUP (ORDER BY b2.key_area)
          = STRING_AGG(b1.key_area, ',') WITHIN GROUP (ORDER BY b1.key_area)) key_bridge
INTO person2
FROM person INNER JOIN bridge_area b1
    ON person.key_address = b1.key_address
GROUP BY key_person
This is similar to the previous query. For each key_person, it figures out the ordered STRING_AGG of all the relevant key_areas in the current bridge_area table and then finds the same ordered grouping of key_areas in the new bridge_area2 table.
You can see this working in this Fiddle.
You now have the new tables you want and can rename them after dropping the old ones (with the caveats listed above about indexes, foreign keys, etc.)

Avoiding Duplicates when appending records in Access

I am aware this has been asked multiple times, but for one reason or another the solutions are not working for me.
Database Layout:
I have Table1 (Scanner_Location), which gets its data from another table/subform on a form (Scanner IBOB). It holds the columns: FP#, Count, Location, Model_ID, PK-SL_ID
Table2 (Scanner Detail) holds two of the three data columns: (FP#, Location PK-SN)
Table3 (Scanner_Model) holds the last data column, displayed in a subform. (PK-Model_ID)
The user inputs FP# and location in one section of the form, then navigates to the subform, selects multiple models, and enters the count (textbox). Once selected, they click an 'update' button that executes my queries (I have both an update and an append query).
The problem is that using only an update query doesn't add the records, and using an append query creates duplicates of the existing data.
Here's how the flow carries out:
User selects Model 1 and Model 2 with a count of 4 and an FP# of 100. Clicks update.
The queries update, and the information enters correctly.
User selects the same models again (Model_Select) with the same FP# and count, and Table1 gets the same information entered again with a different primary key.
The goal:
The append query creates duplicates of existing data. I only want my update and/or append queries to:
Update the existing data - Looking for anything with the same FP#
Add any records that do not exist already (Looking at Model_ID and FP#)
INSERT INTO Scanner_Location ( Model_ID, FootPrints_Num, Location_ID, Scanner_Loc_Cnt )
SELECT Scanner_Model.Model_ID, [Forms]![Scanner_IBOB]![fpNum_txt] AS [FP#],
[Forms]![Scanner_IBOB]![Location_Cbo_main] AS Location,
[Forms]![Scanner_IBOB]![Scanner_Loc_CntTxt] AS [Count]
FROM Scanner_Detail
RIGHT JOIN Scanner_Model ON Scanner_Detail.Model_ID = Scanner_Model.Model_ID
WHERE (((Scanner_Model.SM_Acc_Select)=True)
AND ((NOT Exists (SELECT * FROM Scanner_location
WHERE (((Forms!Scanner_IBOB!fpNum_txt)=Forms!Scanner_IBOB!fpNum_txt)
And ((Scanner_Model.SM_Acc_Select)=True)); ))=False));
No query named 'Update_SLoc_Acc53' - there are 'Update_SLoc_Acc3' and 'Update_SLoc_Acc54'. I modified 'Update_SLoc_Acc54' because it is the one called by the code.
The query was not pulling the Location_ID from the combobox. I found the Bound Column was set to 1 and should be 0 to reference the Location_ID column because column index begins with 0. Can hide this column from user by setting width to 0.
This query seems to work:
INSERT INTO Scanner_Location ( Model_ID, FootPrints_Num, Location_ID, Scanner_Loc_Cnt )
SELECT Scanner_Model.Model_ID, [Forms]![Scanner_IBOB]![fpNum_txt] AS FPNum,
[Forms]![Scanner_IBOB]![Location_Cbo_main] AS Location,
[Forms]![Scanner_IBOB]![Scanner_Loc_CntTxt] AS CountMod
FROM Scanner_Model
WHERE (((Scanner_Model.SM_Acc_Select)<>False)
AND (([Model_ID] & [Forms]![Scanner_IBOB]![fpNum_txt] &
[Forms]![Scanner_IBOB]![Location_Cbo_main])
NOT IN (SELECT Model_ID & Footprints_Num & Location_ID FROM Scanner_Location)));
Note I did not use # in field name. Advise not to use punctuation/special characters in names with only exception of underscore. Also used CountMod instead of Count as field name.
Why the requirement to select two models? What if one is added and the other isn't?
I have concerns about the db structure.
Don't think App_Location and App_Detail should both be linking to other tables. Why is Location_ID the primary key in App_Location as well as primary key in Location_Data? This is a 1-to-1 relationship.
Is Serial_Number the serial number for scanner? Why is it a primary key in Telnet? This also results in a 1-to-1 relationship in which case might as well combine them.
If an app is associated with a scanner and scanner is associated with a location then don't need location associated with app. Same goes for scanner and telnet.
Scanner_Location table is not linked to anything. If purpose of this table is to track a count of models/footprints/locations -- as already advised this is usually not a good idea. Ideally, count data should be calculated by aggregate query of raw data records when the information is needed.
Maybe use NOT IN, something like:
[some identifier field] NOT IN (SELECT [some identifier field] FROM
Review EXISTS vs IN
Consider the following adjusted append query, which checks whether a matching Model_ID and FP_Num already exist in Scanner_Location. If no match exists, the query imports the selected records, since they are new records and not duplicates. Table aliases are used for readability and for subquery correlation.
INSERT INTO Scanner_Location ( Model_ID, FootPrints_Num, Location_ID, Scanner_Loc_Cnt )
SELECT m.Model_ID, [Forms]![Scanner_IBOB]![fpNum_txt] AS [FP#],
[Forms]![Scanner_IBOB]![Location_Cbo_main] AS Location,
[Forms]![Scanner_IBOB]![Scanner_Loc_CntTxt] AS [Count]
FROM Scanner_Detail d
RIGHT JOIN Scanner_Model m ON d.Model_ID = m.Model_ID
WHERE ((m.SM_Acc_Select = True)
AND (NOT EXISTS (SELECT 1 FROM Scanner_Location loc
WHERE ((loc.FootPrints_Num = Forms!Scanner_IBOB!fpNum_txt)
AND (loc.Model_ID = m.Model_ID)) ) ));
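To show the insert-only-if-absent pattern behaving as intended, here is a small sketch that runs the same kind of correlated NOT EXISTS append twice against an in-memory SQLite database. It is not Access: the form references are replaced by named parameters (`:fp`, `:loc_id`, `:cnt`), the table definitions are trimmed-down guesses at the real schema, and the join to Scanner_Detail is left out because it does not affect the duplicate check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, assumed versions of the tables from the question
CREATE TABLE Scanner_Model (Model_ID INTEGER PRIMARY KEY, SM_Acc_Select INTEGER);
CREATE TABLE Scanner_Location (
    SL_ID INTEGER PRIMARY KEY AUTOINCREMENT,
    Model_ID INTEGER, FootPrints_Num INTEGER,
    Location_ID INTEGER, Scanner_Loc_Cnt INTEGER);
INSERT INTO Scanner_Model VALUES (1, 1), (2, 1), (3, 0);  -- models 1 and 2 selected
""")

APPEND_SQL = """
INSERT INTO Scanner_Location (Model_ID, FootPrints_Num, Location_ID, Scanner_Loc_Cnt)
SELECT m.Model_ID, :fp, :loc_id, :cnt
FROM Scanner_Model m
WHERE m.SM_Acc_Select = 1
  AND NOT EXISTS (SELECT 1 FROM Scanner_Location sl
                  WHERE sl.FootPrints_Num = :fp
                    AND sl.Model_ID = m.Model_ID)
"""

form_values = {"fp": 100, "loc_id": 7, "cnt": 4}  # stand-ins for the form controls
conn.execute(APPEND_SQL, form_values)  # first click: adds models 1 and 2
conn.execute(APPEND_SQL, form_values)  # second click: nothing new to add

row_count = conn.execute("SELECT COUNT(*) FROM Scanner_Location").fetchone()[0]
print(row_count)
```

The key point is that the subquery is correlated on both the footprint number and the model, so each selected model is checked individually.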

Retrieving duplicate and original rows from a table using sql query

Say I have a student table with the following fields: student id, student name, age, gender, marks, class. Assume that due to some error, there are multiple entries corresponding to each student. My requirement is to identify the duplicate rows in the table, and the filter criterion is the student name and the class. But in the query result, in addition to identifying the duplicate records, I also need to find the original student detail which got duplicated. Is there any method to do this? I went through this answer: SQL: How to find duplicates based on two fields?. But it only specifies how to find the duplicate rows, not a means to identify the actual row that was duplicated. Kindly throw some light on a possible solution. Thanks.
First of all: if the columns you've listed are all in the same table, it looks like your database structure could use some normalization.
In terms of your question: I'm assuming your StudentID field is a database generated, primary key and so has not been duplicated. (If this is not the case, I think you have bigger problems than just duplicates).
I'm also assuming the duplicate row has a higher value for StudentID than the original row.
I think the following should work (Note: I haven't created a table to verify this so it might not be perfect straight away. If it doesn't it should be fairly close)
select dup.StudentID as DuplicateStudentID,
    dup.StudentName, dup.Age, dup.Gender, dup.Marks, dup.Class,
    orig.StudentID as OriginalStudentId
from StudentTable dup
inner join (
    -- Find first student record for each unique combination
    select Min(StudentId) as StudentID, StudentName, Age, Gender, Marks, Class
    from StudentTable t
    group by StudentName, Age, Gender, Marks, Class
) orig on dup.StudentName = orig.StudentName
    and dup.Age = orig.Age
    and dup.Gender = orig.Gender
    and dup.Marks = orig.Marks
    and dup.Class = orig.Class
    and dup.StudentID > orig.StudentID -- Don't identify the original record as a duplicate
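Since the query above was posted untested, here is a cut-down version that can actually be run, using an in-memory SQLite database from Python. It matches on student name and class only, as the question asks, and it relies on the same assumption that the original row has the lowest StudentID. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StudentTable (StudentID INTEGER PRIMARY KEY, StudentName TEXT, Class TEXT);
INSERT INTO StudentTable VALUES
    (1, 'Asha', '10A'),
    (2, 'Ben',  '10A'),
    (3, 'Asha', '10A'),   -- duplicate of 1
    (4, 'Asha', '10B'),   -- same name, different class: not a duplicate
    (5, 'Asha', '10A');   -- duplicate of 1
""")

pairs = conn.execute("""
    SELECT dup.StudentID AS DuplicateStudentID,
           orig.StudentID AS OriginalStudentID
    FROM StudentTable dup
    JOIN (
        -- Lowest StudentID per (name, class) is treated as the original
        SELECT MIN(StudentID) AS StudentID, StudentName, Class
        FROM StudentTable
        GROUP BY StudentName, Class
    ) orig
      ON dup.StudentName = orig.StudentName
     AND dup.Class = orig.Class
     AND dup.StudentID > orig.StudentID   -- skip the original itself
    ORDER BY dup.StudentID
""").fetchall()

print(pairs)
```

Each result row pairs a duplicate's id with the id of the row it duplicates.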

Is there a way to select automatically the row pointed by an FK on a given table?

Today, while writing one of the many queries that every developer in my company writes every day, I stumbled upon a question.
The DBMS we are using is SQL Server 2008.
Say for example I write a query like this in the usual PERSON - DEPARTMENT db example
select * from person where id = '01'
And this query returns one row:
id name fk_department
01 Joe dp_01
The question is: is there a way (maybe using an addon) to make sql server write and execute a select like this
select * from department where id = 'dp_01'
only by for example clicking with the mouse on the cell containing the fk value (dp_01 in the example query)? Or by right click and selecting something like ("Go to pointed value")?
I hope I didn't write something stupid or impossible by definition.
Not really, but that seems like a silly thing to do. Why would you want to confuse an id with a department name?
Instead, you could arrange things so you could do:
select p.*
from person p
where department = 'dp_01';
You would do this by adding a computed column department that references a scalar function that looks up the value in the department table. You can read about computed columns here.
However, a computed column would have bad performance characteristics. In particular, it would basically require a full table scan on the person table, even if that is not appropriate.
Another solution is to create a view, v_person that has the additional columns you want. Then you would do:
select p.*
from v_person p
where department = 'dp_01';
Why can't you write it yourself, like this:
select * from department where id =
(select fk_department from person where id = '01')

How to use the result from a second select in my first select

I am trying to use a second SELECT to get some ID, then use that ID in the first SELECT, and I have no idea how.
SELECT Employee.Name
FROM Emplyee, Employment
WHERE x = Employment.DistributionID
(SELECT Distribution.DistributionID FROM Distribution
WHERE Distribution.Location = 'California') AS x
This post got long, but here is a short "tip"
While the syntax of my select is bad, the logic is not. I need that "x" somehow. Thus the second select is the most important. Then I have to use that "x" within the first select. I just don't know how
/Tip
This is the only thing I could imagine, I'm very new at Sql, I think I need a book before practicing, but now that I've started I'd like to finish my small program.
EDIT:
Ok I looked up joins, still don't get it
SELECT Employee.Name
FROM Emplyee, Employment
WHERE x = Employment.DistributionID
LEFT JOIN Distribution ON
(SELECT Distribution.DistributionID FROM Distribution
WHERE Distribution.Location = 'California') AS x
Get error msg at AS and Left
I use name to find ID from upper red, I use the ID I find FROM upper red in lower table. Then I match the ID I find with Green. I use Green ID to find corresponding Name
I have California as output data from C#. I want to use California to find the DistributionID. I use the DistributionID to find the EmployeeID. I use EmployeeID to find Name
My logic:
Parameter: Distribution.Name (from C#)
Find DistributionID that has Distribution.Name
Look in Employment WHERE given DistributionID
reveals Employees that I am looking for (BY ID)
Use that ID to find Name
return Name
Tables:
NOTE: In this example picture the Employee repeats because of the select, they are in fact singular
In "Locatie" (middle table) is Location, I get location (again) from C#, I use California as an example. I need to find the ID first and foremost!
Sorry, they are not in English, but here are the create tables:
Try this:
SELECT angajati.Nume
FROM angajati
JOIN angajari ON angajati.AngajatID = angajari.AngajatID
JOIN distribuire ON angajari.distribuireid = distribuire.distribuireid
WHERE distribuire.locatie = 'california'
As you have a table mapping employees to their distribution locations, you just need to join that one in the middle to create the mapping. You can use variables if you like for the WHERE clause so that you can call this as a stored procedure or whatever you need from the output of your C# code.
Try this solution:
DECLARE @pLocatie VARCHAR(40) = 'Alba'; -- p = parameter
SELECT a.AngajatID, a.Nume
FROM Angajati a
JOIN Angajari j ON a.AngajatID = j.AngajatID
JOIN Distribuire d ON j.DistribuireID = d.DistribuireID
WHERE d.Locatie = @pLocatie
You should add a unique key on the Angajari table (Employment), like this:
ALTER TABLE Angajari
ADD CONSTRAINT IUN_Angajari_AngajatID_DistribuireID UNIQUE (AngajatID, DistribuireID);
This will prevent duplicate (AngajatID, DistribuireID) pairs.
I don't know how you are connecting Emplyee(sic?) and Employment, but you want to use a join to connect two tables and in the join specify how the tables are related. Joins usually look best when they have aliases so you don't have to repeat the entire table name. The following query will get you all the information from both Employment and Distribution tables where the distribution location is equal to california. You can join employee to employment to get name as well.
SELECT *
FROM Employment e
JOIN Distribution d on d.DistributionID = e.DistributionID
WHERE d.Location = 'California'
This will return the contents of both tables. To select particular records use the alias.[Col_Name] separated by a comma in the select statement, like d.DistributionID to return the DistributionID from the Distribution Table
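To tie the answers together, here is a minimal runnable sketch of the three-table join with the location passed in as a parameter, the way calling code might supply it. It uses Python's sqlite3 module rather than C# and SQL Server. The English table and column names (Employee, Employment, Distribution, EmployeeID) are assumptions based on the question's description, and `names_at` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed English-named versions of the three tables
CREATE TABLE Employee (EmployeeID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Distribution (DistributionID INTEGER PRIMARY KEY, Location TEXT);
CREATE TABLE Employment (
    EmployeeID INTEGER, DistributionID INTEGER,
    UNIQUE (EmployeeID, DistributionID));  -- one row per employee/location pair
INSERT INTO Employee VALUES (1, 'Ion'), (2, 'Maria'), (3, 'Dan');
INSERT INTO Distribution VALUES (10, 'California'), (20, 'Alba');
INSERT INTO Employment VALUES (1, 10), (2, 20), (3, 10);
""")

def names_at(location):
    """Return the names of employees assigned to the given location."""
    rows = conn.execute("""
        SELECT e.Name
        FROM Employee e
        JOIN Employment j   ON e.EmployeeID = j.EmployeeID
        JOIN Distribution d ON j.DistributionID = d.DistributionID
        WHERE d.Location = ?      -- bound parameter, never string concatenation
        ORDER BY e.Name
    """, (location,))
    return [name for (name,) in rows]

print(names_at("California"))
```

The mapping table in the middle does the work the question tried to do with a nested SELECT: the joins walk from name to employment to distribution in a single statement.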