SQL query join table clarification - sql

I'm trying to join columns from two different tables.
1st table is TBWORKFLOWPROCESS, column name is processname
2nd table is TBLSTAPP, column names are isforreconsideration, isforamendment
My query is:
select
pro.processname, app.isforreconsideration, app.isforamendment
from
TBWORKFLOWPROCESS pro
left join
TBLSTAPP app on app.applicationtype = pro.workflowid
where
sequenceno >= 10 and pro.workflowid = 1
group by
pro.processname, app.isforreconsideration, app.isforamendment
Output:
processname isforreconsideration isforamendment
------------------------------------------------------------------------
booked
booked,doc pending
cancelled
rejected
The output is correct but what I want is this.
processname
---------------
booked
booked,doc pending
cancelled
rejected
isforreconsideration
isforamendment
Can anyone help me with this? Thanks guys.

Your question leaves a lot unexplained, so this answer rests on several assumptions. As far as I can tell, you want the values of the two attributes (isforreconsideration and isforamendment), filtered by the same condition, to appear under the processname column.
In other words, I am assuming that you need all the data in one column.
The code below adds the extra rows with the UNION keyword, aliasing each of the other two columns as processname so that they stack under the same heading.
select
pro.processname
from
TBWORKFLOWPROCESS pro
LEFT JOIN TBLSTAPP app on app.applicationtype = pro.workflowid
where
pro.sequenceno >=10
and pro.workflowid = 1
union
select
app.isforreconsideration as 'processname'
from
TBWORKFLOWPROCESS pro
LEFT JOIN TBLSTAPP app on app.applicationtype = pro.workflowid
where
sequenceno >=10
and pro.workflowid = 1
union
select app.isforamendment as 'processname'
from
TBWORKFLOWPROCESS pro
LEFT JOIN TBLSTAPP app on app.applicationtype = pro.workflowid
where
sequenceno >=10
and pro.workflowid = 1
P.S.: I am fairly sure this code is not entirely correct. If you could describe the type of data each column holds and give an example of the output you want, others would be able to help more.
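The UNION approach above can be tried out with a small sketch. It uses Python's built-in sqlite3 module and made-up sample rows; the values in both tables are assumptions, since the question does not show the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TBWORKFLOWPROCESS (workflowid INTEGER, sequenceno INTEGER, processname TEXT);
    CREATE TABLE TBLSTAPP (applicationtype INTEGER, isforreconsideration TEXT, isforamendment TEXT);
    -- Hypothetical sample data, not taken from the question.
    INSERT INTO TBWORKFLOWPROCESS VALUES
        (1, 10, 'booked'), (1, 20, 'cancelled'), (1, 5, 'draft');
    INSERT INTO TBLSTAPP VALUES (1, 'reconsideration', 'amendment');
""")

# Shared FROM/WHERE part, reused by the three SELECTs that UNION stacks together.
source = """
    FROM TBWORKFLOWPROCESS pro
    LEFT JOIN TBLSTAPP app ON app.applicationtype = pro.workflowid
    WHERE pro.sequenceno >= 10 AND pro.workflowid = 1
"""
query = (
    "SELECT pro.processname AS processname" + source
    + " UNION SELECT app.isforreconsideration" + source
    + " UNION SELECT app.isforamendment" + source
)

# UNION also removes duplicate values, so each value appears once.
rows = sorted(r[0] for r in conn.execute(query))
print(rows)
```

The result column takes its name from the first SELECT, so the aliases in the later SELECTs are optional.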

how to join multiple tables without showing repeated data?

I ran into a problem recently, and I'm sure it's because of how I join the tables.
This is my code:
select LP_Pending_Info.Service_Order,
LP_Pending_Info.Pending_Days,
LP_Pending_Info.Service_Type,
LP_Pending_Info.ASC_Code,
LP_Pending_Info.Model,
LP_Pending_Info.IN_OUT_WTY,
LP_Part_Codes.PartCode,
LP_PS_Codes.PS,
LP_Confirmation_Codes.SO_NO,
LP_Pending_Info.Engineer_Code
from LP_Pending_Info
join LP_Part_Codes
on LP_Pending_Info.Service_order = LP_Part_Codes.Service_order
join LP_PS_Codes
on LP_Pending_Info.Service_Order = LP_PS_Codes.Service_Order
join LP_Confirmation_Codes
on LP_Pending_Info.Service_Order = LP_Confirmation_Codes.Service_Order
order by LP_Pending_Info.Service_order, LP_Part_Codes.PartCode;
For every service order I have at most 5 part codes.
If the service order has only one part code, the result is shown correctly, but when it has more than one part code the problem begins.
For example, service order "4182134076" has only 2 part codes, 'GH81-13601A' and 'GH96-09938A', so it should show the data 2 times, but it repeats it 8 times. What seems to be the problem?
If your records were exactly the same, the DISTINCT keyword would have solved it.
However, rows 2 and 3 have the same Service_Order and PartCode but a different SO_NO, so DISTINCT won't work here: the rows are not identical.
I suspect one of your join conditions is incomplete. The differing data is in the SO_NO column, so check the raw data in the LP_Confirmation_Codes table for that Service_Order:
select * from LP_Confirmation_Codes where Service_Order = 4182134076
I assume a join condition is missing an AND on a value from LP_Part_Codes or LP_PS_Codes (but I can't be sure without seeing those tables and their data).
You say the result is correct when the service order has only one part code and the problem begins when it has more, so the missing AND most likely involves the LP_Part_Codes table.
Based on your output, the following data caused the multiplied rows.
Service order 4182134076 has:
2 PartCode values: GH81-13601A and GH96-09938A
2 PS values: U and P
2 SO_NO values: 1.00024e+09 and 1.00022e+09
The join returns every combination, so 2 * 2 * 2 = 2^3 = 8 rows. I believe you need to check the conditions on which you join your tables.
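The multiplication described above can be reproduced with a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the question; the SO_NO strings are placeholders, because the real values are only shown in rounded form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE LP_Pending_Info (Service_Order TEXT);
    CREATE TABLE LP_Part_Codes (Service_Order TEXT, PartCode TEXT);
    CREATE TABLE LP_PS_Codes (Service_Order TEXT, PS TEXT);
    CREATE TABLE LP_Confirmation_Codes (Service_Order TEXT, SO_NO TEXT);

    INSERT INTO LP_Pending_Info VALUES ('4182134076');
    INSERT INTO LP_Part_Codes VALUES
        ('4182134076', 'GH81-13601A'), ('4182134076', 'GH96-09938A');
    INSERT INTO LP_PS_Codes VALUES ('4182134076', 'U'), ('4182134076', 'P');
    -- Placeholder SO_NO values (assumed for illustration).
    INSERT INTO LP_Confirmation_Codes VALUES
        ('4182134076', 'SO-A'), ('4182134076', 'SO-B');
""")

# Joining only on Service_Order pairs every part code with every PS
# and every SO_NO that shares the same service order.
rows = conn.execute("""
    SELECT pc.PartCode, ps.PS, cc.SO_NO
    FROM LP_Pending_Info pi
    JOIN LP_Part_Codes pc ON pi.Service_Order = pc.Service_Order
    JOIN LP_PS_Codes ps ON pi.Service_Order = ps.Service_Order
    JOIN LP_Confirmation_Codes cc ON pi.Service_Order = cc.Service_Order
""").fetchall()
print(len(rows))
```

All eight rows differ in at least one column, which is also why DISTINCT cannot reduce them.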
Use DISTINCT
select distinct LP_Pending_Info.Service_Order,LP_Pending_Info.Pending_Days,
LP_Pending_Info.Service_Type,LP_Pending_Info.ASC_Code,LP_Pending_Info.Model,
LP_Pending_Info.IN_OUT_WTY, LP_Part_Codes.PartCode,LP_PS_Codes.PS,
LP_Confirmation_Codes.SO_NO,LP_Pending_Info.Engineer_Code
from LP_Pending_Info
join LP_Part_Codes on LP_Pending_Info.Service_order = LP_Part_Codes.Service_order
join LP_PS_Codes on LP_Part_Codes.Service_Order = LP_PS_Codes.Service_Order
join LP_Confirmation_Codes on LP_PS_Codes.Service_Order = LP_Confirmation_Codes.Service_Order
order by LP_Pending_Info.Service_order, LP_Part_Codes.PartCode;
DISTINCT removes duplicates based on the columns in your SELECT list, so if an entire row is the same as another, it is returned only once.

Count of how many times id occurs in table SQL regexp

Hi I have a redshift table of articles that has a field on it that can contain many accounts. So there is a one to many relationship between articles to accounts.
However, I want to create a new view that lists the partner ids in one column and, in another column, a count of how many times each partner id appears in the articles table.
I've attempted to do this using regex and created a new redshift view, but am getting weird results where it doesn't always build properly. So one day it will say a partner appears 15 times, then the next 17, then the next 15, when the partner id count hasn't actually changed.
Any help would be greatly appreciated.
SELECT partner_id,
COUNT(DISTINCT id)
FROM (SELECT id,
partner_ids,
SPLIT_PART(partner_ids,',',i) partner_id
FROM positron_articles a
LEFT JOIN util.seq_0_to_500 s
ON s.i < regexp_count (partner_ids,',') + 2
OR s.i = 1
WHERE i > 0
AND regexp_count (partner_ids,',') = 0
ORDER BY id)
GROUP BY 1;
Let's start with some of the more obvious things and see if we can glean other information.
First, GROUP BY 1 on your outer query would be clearer as GROUP BY partner_id.
Next, you don't need an ORDER BY in your inner query, and the database engine will probably optimize better without it, so remove ORDER BY id.
If you want your final results to be ordered, add an ORDER BY partner_id or similar clause after the GROUP BY of your outer query.
There also appear to be problems with how you split a partner_id out of partner_ids, but I am not positive about that, because I would need to understand your view and the data it provides to know how that affects your record count per partner_id.
Next, in your LEFT JOIN on util.seq_0_to_500, I am pretty sure you can drop the s.i = 1 condition: the first condition already covers it, because regexp_count (partner_ids,',') + 2 is always at least 2, which is greater than 1. Also, your LEFT JOIN really acts like an inner join, because the WHERE clause then excludes any positron_articles rows without a match where s.i > 0.
Oddly, most of the join and inner query then gets discarded, because you keep only articles that have no commas in their partner_ids: regexp_count (partner_ids,',') = 0
I would suggest posting the code for your util.seq_0_to_500, and if you have a partner table let us know about that as well, because you can probably get your answer much more easily with that additional table, depending on how regexp_count works. I suspect regexp_count(partner_ids, partner_id), for example regexp_count('12345,678', '1234'), will return greater than 0, at which point you have no choice but to split the delimited strings into another table before counting, or build a new matching function.
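To make the "split before counting" idea concrete, here is a minimal sketch in plain Python rather than Redshift SQL. The sample rows and the helper name count_articles_per_partner are hypothetical; it only illustrates why splitting on the delimiter avoids counting 1234 inside 12345.

```python
from collections import defaultdict

# Hypothetical rows: (article id, comma-separated partner ids).
articles = [
    (1, "12345,678"),
    (2, "678"),
    (3, "1234,678,12345"),
    (4, ""),
]

def count_articles_per_partner(rows):
    """Return {partner_id: number of distinct articles that list it}."""
    seen = defaultdict(set)
    for article_id, partner_ids in rows:
        for partner_id in partner_ids.split(","):
            partner_id = partner_id.strip()
            if partner_id:  # skip empty fields
                seen[partner_id].add(article_id)
    return {pid: len(ids) for pid, ids in seen.items()}

counts = count_articles_per_partner(articles)
print(counts)
```

Because each id is compared as a whole token, partner 1234 is counted once here even though the text "1234" also occurs inside "12345".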
If regexp_count only matches exactly between commas and you have a partner table, your query could be as easy as this:
SELECT
p.partner_id
,COUNT(a.id) AS ArticlesAppearedIn
FROM
positron_articles a
LEFT JOIN PARTNERTABLE p
ON regexp_count(a.partner_ids, p.partner_id) > 0
GROUP BY
p.partner_id
Correcting myself: I just thought of a way to join a partner table without regexp_count. So if you have a partner table, this might work for you; if not, you will need to split the strings. It tests whether partner_id is the entire partner_ids value, or appears at its beginning, in the middle, or at the end. If any one of those holds, the record is returned.
SELECT
p.partner_id
,COUNT(a.id) AS ArticlesAppearedIn
FROM
PARTNERTABLE p
INNER JOIN positron_articles a
ON
(
CASE
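WHEN a.partner_ids = CAST(p.partner_id AS VARCHAR(100)) THEN 1
-- Redshift concatenates strings with ||, not +
WHEN a.partner_ids LIKE CAST(p.partner_id AS VARCHAR(100)) || ',%' THEN 1
WHEN a.partner_ids LIKE '%,' || CAST(p.partner_id AS VARCHAR(100)) || ',%' THEN 1
WHEN a.partner_ids LIKE '%,' || CAST(p.partner_id AS VARCHAR(100)) THEN 1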
ELSE 0
END
) = 1
GROUP BY
p.partner_id
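A compact variant of the four-case test above wraps both sides in commas so that a single LIKE covers the whole, start, middle and end positions. The sketch below runs it in SQLite through Python's sqlite3 module, not in Redshift; the partners table and all sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical partner table and sample articles.
    CREATE TABLE partners (partner_id INTEGER);
    CREATE TABLE positron_articles (id INTEGER, partner_ids TEXT);
    INSERT INTO partners VALUES (1234), (12345), (678);
    INSERT INTO positron_articles VALUES
        (1, '12345,678'), (2, '678'), (3, '1234,678,12345');
""")

# ',12345,678,' LIKE '%,1234,%' is false, so 1234 does not match 12345.
query = """
    SELECT p.partner_id, COUNT(a.id) AS articles_appeared_in
    FROM partners p
    JOIN positron_articles a
      ON (',' || a.partner_ids || ',') LIKE ('%,' || p.partner_id || ',%')
    GROUP BY p.partner_id
    ORDER BY p.partner_id
"""
counts = dict(conn.execute(query).fetchall())
print(counts)
```

A join on a LIKE with a leading wildcard generally cannot use an index, so on large tables splitting the ids into their own table is likely the better long-term fix.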

Why is this SQL query returning repeated records, when they're not repeated in the database?

SELECT *
FROM support_systems,tickets
INNER JOIN user_access ON tickets.support_system_id = user_access.support_system_id
WHERE support_systems.account_id = #session.account_id#
AND user_access.user_access_level >= 1
AND user_access.user_id = #session.user_id#
Any clue why this query would return a record set with repeated records? The results are looking like this:
Priority ID Subject Status
high 1 First Subject open
high 1 First Subject open
low 3 Weeee open
low 3 Weeee open
medium 4 hhhhh closed
medium 4 hhhhh closed
medium 5 neat open
medium 5 neat open
Let me know if you guys need more information, thanks a lot.
You are selecting records from the table support_systems but have not specified a join condition for it. What is the relationship between this table and the others you are querying?
You may want something like this
SELECT *
FROM support_systems
INNER JOIN tickets ON
support_systems.support_system_id = tickets.support_system_id
INNER JOIN user_access ON
tickets.support_system_id = user_access.support_system_id
WHERE support_systems.account_id = #session.account_id#
AND user_access.user_access_level >= 1
AND user_access.user_id = #session.user_id#
The problem is this line:
FROM support_systems,tickets
I would remove tickets from the FROM clause and make it an inner join clause. Right now you have what's called a cross join, which produces a Cartesian product: http://en.wikipedia.org/wiki/Cartesian_product
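The effect of the comma in FROM support_systems,tickets can be seen in a small sketch with Python's sqlite3 module. The rows, the account id 7 and the user id 42 are made up; the point is only that two matching support_systems rows pair with every ticket when no join condition links them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE support_systems (support_system_id INTEGER, account_id INTEGER);
    CREATE TABLE tickets (id INTEGER, support_system_id INTEGER, subject TEXT);
    CREATE TABLE user_access (user_id INTEGER, support_system_id INTEGER,
                              user_access_level INTEGER);
    -- Hypothetical data: one account owning two support systems.
    INSERT INTO support_systems VALUES (1, 7), (2, 7);
    INSERT INTO tickets VALUES (1, 1, 'First Subject'), (3, 1, 'Weeee');
    INSERT INTO user_access VALUES (42, 1, 1);
""")

where = """
    WHERE support_systems.account_id = 7
      AND user_access.user_access_level >= 1
      AND user_access.user_id = 42
    ORDER BY tickets.id
"""

# Comma join: nothing ties support_systems to tickets, so each ticket repeats.
implicit = [r[0] for r in conn.execute("""
    SELECT tickets.id
    FROM support_systems, tickets
    INNER JOIN user_access
        ON tickets.support_system_id = user_access.support_system_id
""" + where)]

# Explicit joins: every table is linked through support_system_id.
explicit = [r[0] for r in conn.execute("""
    SELECT tickets.id
    FROM support_systems
    INNER JOIN tickets
        ON support_systems.support_system_id = tickets.support_system_id
    INNER JOIN user_access
        ON tickets.support_system_id = user_access.support_system_id
""" + where)]

print(implicit, explicit)
```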
I would say it's probably because you mix an explicit join with an implicit (comma) join that has no matching condition in the WHERE clause, which produces a Cartesian product.
You have three tables, but only two of them are used in the join. You need a second join that includes support_systems.
Probably something like:
from support_systems a left join user_access b on a.support_system_id = b.support_system_id
left join tickets c on c.support_system_id = b.support_system_id
Then your WHERE clause would stay the same, and the query would return rows based on correctly joined tables.

sql join -vs- where clause not producing the same result?

I am a bit boggled as to why these two SQL constructs do not yield the same result.
SQL#1 returns 2 identical records (dups) when only one exists in the Defects table (see the next SQL):
SELECT *
FROM Defects d
JOIN StatusCode C ON C.CodeName = d.Status AND c.scid = 10
WHERE d.AssignedTo='me'
SQL#2 returns 1 record. This is correct, because looking at the raw data there is one defect not closed for 'me':
SELECT *
FROM Defects d
WHERE d.AssignedTo='me' AND Status <> 'closed'
All I am doing is, instead of using a negative WHERE (status not in something), using a positive one by way of the join, matching records that have any defect status other than closed.
Why does this happen, and how can I alter my select with the join to correct its result? I tried using DISTINCT but it fails with:
The ntext data type cannot be selected as DISTINCT because it is not comparable.
there are no status codes that are 'closed', not a single one:
select * from StatusCode where scid = 10
results in these values:
Fixed
New
Ready for Retest
Failed Retest
Quality Follow Up
Reopen
Rejected
Consumer
In Coding
Open
Fixed
New
Ready for Retest
Failed Retest
Quality Follow Up
Reopen
Rejected
Consumer
In Coding
Open
The inner join will return all matching combinations of rows, so there must be two rows in the StatusCode table that match the "Status" value of your Defect (and have scid = 10).
Fixed, New, Ready for Retest, Failed Retest, Quality Follow Up, Reopen, Rejected, Consumer, In Coding, Open, Fixed, New, Ready for Retest, Failed Retest, Quality Follow Up, Reopen, Rejected, Consumer, In Coding, Open
Not sure if I parsed your list exactly right, but there do appear to be duplicates. The answer, then, is to either eliminate the duplicates in the StatusCode table, or apply an additional filter to distinguish between them if the duplicates are valid.
How many rows are returned by this?
SELECT * FROM StatusCode C WHERE c.scid = 10
You may therefore want to do this:
SELECT *
FROM Defects d
WHERE d.AssignedTo='me' AND d.Status IN (
SELECT C.CodeName FROM StatusCode C WHERE C.scid = 10
)
Edit to address your edit: since you have multiple states with scid=10, each of those will be joined to your rows, which is why you get the duplicates. My code suggestion is still valid though.
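A minimal sketch of this behaviour, using Python's sqlite3 module with a cut-down StatusCode table (the rows are assumed; only the duplicated CodeName values matter):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Defects (id INTEGER, AssignedTo TEXT, Status TEXT);
    CREATE TABLE StatusCode (scid INTEGER, CodeName TEXT);
    INSERT INTO Defects VALUES (1, 'me', 'Open'), (2, 'me', 'closed');
    -- Each code name is stored twice, as in the question's listing.
    INSERT INTO StatusCode VALUES
        (10, 'Open'), (10, 'Fixed'), (10, 'Open'), (10, 'Fixed');
""")

# JOIN: the single open defect matches both 'Open' rows, so it comes back twice.
joined = [r[0] for r in conn.execute("""
    SELECT d.id FROM Defects d
    JOIN StatusCode c ON c.CodeName = d.Status AND c.scid = 10
    WHERE d.AssignedTo = 'me'
""")]

# Same inner join with the scid filter moved to WHERE: identical result.
joined_where = [r[0] for r in conn.execute("""
    SELECT d.id FROM Defects d
    JOIN StatusCode c ON c.CodeName = d.Status
    WHERE d.AssignedTo = 'me' AND c.scid = 10
""")]

# IN (subquery): a row either qualifies or not, so it is returned once.
filtered = [r[0] for r in conn.execute("""
    SELECT d.id FROM Defects d
    WHERE d.AssignedTo = 'me'
      AND d.Status IN (SELECT c.CodeName FROM StatusCode c WHERE c.scid = 10)
""")]

print(joined, joined_where, filtered)
```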
I would think the problem is here:
JOIN StatusCode C ON C.CodeName = d.Status AND c.scid = 10
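I would put c.scid = 10 in the WHERE clause for readability, although for an inner join that does not change the result; the duplicates come from the repeated CodeName values in StatusCode.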

SQL: Need to remove duplicate rows in query containing multiple joins

Note that I'm a complete SQL noob and in the process of learning. Based on Google searches (including searching here) I've tried using SELECT DISTINCT and GROUP BY but neither works, likely due to all of my joins (if anyone knows why they won't work exactly, that would be helpful to learn).
I need data from a variety of tables and below is the only way I know to do it (I just know the basics). The query below works fine but shows duplicates. I need to know how to remove those. The only hint I have right now is perhaps a nested SELECT query but based on research I'm not sure how to implement them. Any help at all would be great, thanks!
USE SQL_Contest
go
SELECT
CLT.Description AS ClockType,
CLK.SerialNumber AS JobClockSerial,
SIT.SiteNumber AS JobID,
SIT.[Name] AS JobsiteName,
SIT.Status AS SiteActivityStatus,
DHA.IssuedDate AS DHAIssuedDate, -- Date the clock was assigned to THAT jobsite
CLK.CreatedDate AS CLKCreatedDate, -- Date clock first was assigned to ANY jobsite
SES.ClockVoltage
FROM tb_Clock CLK
INNER JOIN tb_ClockType CLT
ON CLK.TypeID = CLT.ClockTypeID
INNER JOIN tb_DeviceHolderActivity DHA
ON CLK.ClockGUID = DHA.DeviceGUID
INNER JOIN tb_Site SIT
ON SIT.SiteGUID = DHA.HolderGUID
LEFT JOIN tb_Session SES
ON SES.ClockSerialNumber = CLK.SerialNumber
WHERE DHA.ReturnedDate IS NULL
ORDER BY SIT.[Name] ASC
EDIT: I will be reviewing these answers shortly, thank you very much. I'm posting the additional duplicate info per Rob's request:
Everything displays fine until I add:
LEFT JOIN tb_Session SES
ON SES.ClockSerialNumber = CLK.SerialNumber
Which I need. That's when a duplicate appears:
JobClock 2,500248E4,08-107,Brentwood Job,1,2007-05-04 13:36:54.000,2007-05-04 13:47:55.407,3049
JobClock 2,500248E4,08-107,Brentwood Job,1,2007-05-04 13:36:54.000,2007-05-04 13:47:55.407,3049
I want that info to only display once. Essentially this query is to determine all active jobsites that have a clock assigned to them, and that job only has one clock assigned to it, and it's only one jobsite, but it's appearing twice.
EDIT 2: Based on the help you guys provided I was able to determine they actually are NOT duplicates, and each session is independent, that is the only one that happened to have two sessions. So now I'm going to try to figure out how to only pull in information from the latest session.
If everything "works fine" until you add:
LEFT JOIN tb_Session SES
ON SES.ClockSerialNumber = CLK.SerialNumber
Then there must be more than one record in tb_Session for each CLK.SerialNumber.
Run the following query:
SELECT *
FROM tb_Session SES
WHERE ClockSerialNumber = '500248E4' -- the clock serial from the duplicated rows ('08-107' is the JobID)
There should be two records returned. You need to decide how to handle this (i.e. Which record do you want to use?), unless both rows from tb_Session contain identical data, in which case, should they?
You could always change your query to:
SELECT
CLT.Description AS ClockType,
CLK.SerialNumber AS JobClockSerial,
SIT.SiteNumber AS JobID,
SIT.[Name] AS JobsiteName,
SIT.Status AS SiteActivityStatus,
DHA.IssuedDate AS DHAIssuedDate, -- Date the clock was assigned to THAT jobsite
CLK.CreatedDate AS CLKCreatedDate, -- Date clock first was assigned to ANY jobsite
SES.ClockVoltage
FROM tb_Clock CLK
INNER JOIN tb_ClockType CLT
ON CLK.TypeID = CLT.ClockTypeID
INNER JOIN tb_DeviceHolderActivity DHA
ON CLK.ClockGUID = DHA.DeviceGUID
INNER JOIN tb_Site SIT
ON SIT.SiteGUID = DHA.HolderGUID
LEFT JOIN
(
SELECT DISTINCT ClockSerialNumber, ClockVoltage
FROM tb_Session
) SES
ON SES.ClockSerialNumber = CLK.SerialNumber
WHERE DHA.ReturnedDate IS NULL
ORDER BY SIT.[Name] ASC
That should ensure that SES contains only one record for each unique combination of ClockSerialNumber and ClockVoltage.
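For the follow-up in EDIT 2 (keeping only the latest session per clock), one possible sketch is a join condition with a correlated MAX subquery. It is shown here in SQLite via Python's sqlite3 module. The SessionDate column is hypothetical, since the question never shows which column in tb_Session records when a session happened, and the second voltage reading is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb_Clock (SerialNumber TEXT);
    -- SessionDate is an assumed column name, not taken from the question.
    CREATE TABLE tb_Session (ClockSerialNumber TEXT, ClockVoltage INTEGER,
                             SessionDate TEXT);
    INSERT INTO tb_Clock VALUES ('500248E4'), ('500999A1');
    INSERT INTO tb_Session VALUES
        ('500248E4', 3049, '2007-05-04'),
        ('500248E4', 3012, '2007-05-09');
""")

# The LEFT JOIN keeps clocks without sessions; the subquery keeps only
# the session whose SessionDate is the latest for that clock.
rows = conn.execute("""
    SELECT CLK.SerialNumber, SES.ClockVoltage
    FROM tb_Clock CLK
    LEFT JOIN tb_Session SES
      ON SES.ClockSerialNumber = CLK.SerialNumber
     AND SES.SessionDate = (
            SELECT MAX(S2.SessionDate)
            FROM tb_Session S2
            WHERE S2.ClockSerialNumber = CLK.SerialNumber
         )
    ORDER BY CLK.SerialNumber
""").fetchall()
print(rows)
```

If two sessions can share the same latest timestamp this still returns both, so a unique session key would be a safer tie-breaker where one exists.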
Take this example dataset:
Ingredient
IngredientId  IngredientName
============  ==============
1             Apple
2             Orange
3             Pear
4             Tomato
Recipe
RecipeId  RecipeName
========  ==============
1         Apple Turnover
2         Apple Pie
3         Poached Pears
Recipe_Ingredient
RecipeId  IngredientId  Quantity
========  ============  ========
1         1             0.25
1         1             1.00
2         1             2.00
3         3             1.00
Note: Why the Apple Turnover has two lots of apple as ingredients is neither here nor there; it just does.
The following query will return two rows for the "Apple Turnover" recipe, one row for the "Apple Pie" recipe and one row for the "Poached Pears" recipe, because there are two entries in the Recipe_Ingredient table for RecipeId 1 with IngredientId 1. That's just what happens with a join.
SELECT I.IngredientName,
R.RecipeName
FROM Ingredient I
JOIN Recipe_Ingredient RI
ON I.IngredientId = RI.IngredientId
JOIN Recipe R
ON RI.recipeId = R.RecipeId
You could get this to return only one row per recipe by changing it to:
SELECT I.IngredientName,
R.RecipeName
FROM Ingredient I
JOIN Recipe_Ingredient RI
ON I.IngredientId = RI.IngredientId
JOIN Recipe R
ON RI.recipeId = R.RecipeId
GROUP BY I.IngredientName, R.RecipeName
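The two queries above can be checked against the example dataset with a short sketch using Python's sqlite3 module; the tables and rows are exactly those listed in the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Ingredient (IngredientId INTEGER, IngredientName TEXT);
    CREATE TABLE Recipe (RecipeId INTEGER, RecipeName TEXT);
    CREATE TABLE Recipe_Ingredient (RecipeId INTEGER, IngredientId INTEGER,
                                    Quantity REAL);
    INSERT INTO Ingredient VALUES (1, 'Apple'), (2, 'Orange'), (3, 'Pear'), (4, 'Tomato');
    INSERT INTO Recipe VALUES (1, 'Apple Turnover'), (2, 'Apple Pie'), (3, 'Poached Pears');
    INSERT INTO Recipe_Ingredient VALUES
        (1, 1, 0.25), (1, 1, 1.00), (2, 1, 2.00), (3, 3, 1.00);
""")

base = """
    SELECT I.IngredientName, R.RecipeName
    FROM Ingredient I
    JOIN Recipe_Ingredient RI ON I.IngredientId = RI.IngredientId
    JOIN Recipe R ON RI.RecipeId = R.RecipeId
"""

# Plain join: one output row per Recipe_Ingredient row.
plain = conn.execute(base).fetchall()

# GROUP BY collapses rows that agree on both selected columns.
grouped = sorted(conn.execute(
    base + " GROUP BY I.IngredientName, R.RecipeName").fetchall())

print(len(plain), grouped)
```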
Without more specifics regarding your data, it's hard to apply this to your specific scenario, but the walkthrough may help you, as someone unfamiliar with SQL, understand where the "duplicates" are coming from.
The joins are not your problem. From your comments I will infer that what you are calling "duplicates" are not actual duplicates. If all column values for 2 "duplicates" returned from the query matched, then either SELECT DISTINCT or GROUP BY would definitely eliminate them. So you should be able to find a solution by looking at your column definitions.
My best guess is that you're getting rows for the same date that aren't really duplicates, because the time component of the date doesn't match. To eliminate this problem, you can truncate the date fields to the date only using this technique:
DATEADD(DAY, DATEDIFF(DAY, 0, DHA.IssuedDate), 0) AS DHAIssuedDate,
DATEADD(DAY, DATEDIFF(DAY, 0, CLK.CreatedDate), 0) AS CLKCreatedDate,
If that doesn't work you might want to take a look at JobClockSerial: does this column belong in the query results?