Comparing a list of values - sql

For example, I have a head table with one column, id, and a position table with id, head-id (a reference to the head table, 1 to N), and a value. Now I select one row in the head table, say id 1. I look into the position table and find 2 rows which reference that head row and have the values 1337 and 1338. Now I want to select all heads which also have 2 positions with these values, 1337 and 1338. The position ids are not the same, only the values, because it is not an M to N relation. Can anyone tell me an SQL statement for this? I have no idea how to get it done :/

Assuming that the value is not repeated for a given headid in the position table, and that it is never NULL, then you can do this using the following logic. Do a full outer join on the position table to the specific head positions you care about. Then check whether there is a full match.
The following query does this:
select *
from (select p.headid,
             sum(case when p.value is not null then 1 else 0 end) as pmatches,
             sum(case when ref.value is not null then 1 else 0 end) as refmatches
      from (select r.headid, r.value
            from position r
            where r.headid = <whatever>
           ) ref full outer join
           position p
           on p.value = ref.value and
              p.headid <> ref.headid
      group by p.headid
     ) t
where t.pmatches = t.refmatches
  -- every reference value must be matched, not just a subset of them
  and t.refmatches = (select count(*) from position where headid = <whatever>)
If you do have NULLs in the values, you can accommodate these using coalesce. If you have duplicates, you need to specify more clearly what to do in this case.

Assuming you have:
Create table head
(
id int
)
Create table pos
(
id int,
head_id int,
value int
)
and you need to find duplicates by value, then I'd use:
Select distinct p.head_id, p1.head_id
from pos p
join pos p1 on p.value = p1.value and p.head_id<>p1.head_id
where p.head_id = 1
This is for a specific head_id; drop the final where clause to run it for every head_id.
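As a rough sketch of the "same set of values" comparison, the following runs a GROUP BY / HAVING variant against an in-memory SQLite database through Python's sqlite3 module. It uses the head and pos tables from the answer above; the sample rows and the function name heads_with_same_values are made up for illustration, and it assumes a value is never repeated or NULL for a given head_id.

```python
import sqlite3

def heads_with_same_values(conn, ref_head_id):
    """Return head_ids whose set of values equals that of ref_head_id.

    Assumes a value is never repeated (or NULL) within one head_id.
    """
    sql = """
        SELECT p.head_id
        FROM pos AS p
        LEFT JOIN (SELECT value FROM pos WHERE head_id = :ref) AS r
               ON r.value = p.value
        WHERE p.head_id <> :ref
        GROUP BY p.head_id
        -- first test: the head has no value outside the reference set
        -- second test: no reference value is missing from the head
        HAVING COUNT(*) = COUNT(r.value)
           AND COUNT(r.value) = (SELECT COUNT(*) FROM pos WHERE head_id = :ref)
        ORDER BY p.head_id
    """
    return [row[0] for row in conn.execute(sql, {"ref": ref_head_id})]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE head (id INTEGER PRIMARY KEY);
    CREATE TABLE pos (id INTEGER PRIMARY KEY, head_id INTEGER, value INTEGER);
    INSERT INTO head (id) VALUES (1), (2), (3), (4);
    -- illustrative data: head 2 matches head 1, head 3 is a subset, head 4 a superset
    INSERT INTO pos (head_id, value) VALUES
        (1, 1337), (1, 1338),
        (2, 1337), (2, 1338),
        (3, 1337),
        (4, 1337), (4, 1338), (4, 9);
""")
print(heads_with_same_values(conn, 1))
```

With this sample data only head 2 is returned, since heads 3 and 4 share some values with head 1 but not exactly the same set.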

Related

Structuring SQL

I have the below requirement to generate a report.
SELECT TASKTYPE.TaskTypeName,TASKWIP.DMTaskState_key FROM MercuryProd.TEAMSPACE.F_DMCaseWIP WIP,
MercuryProd.TEAMSPACE.F_DMTaskWIP TASKWIP,
MercuryProd.TEAMSPACE.D_DMDataField_BM_ExternalCaseIdentifier EXTID,
MercuryProd.TEAMSPACE.D_DMTaskType TASKTYPE
WHERE WIP.DMCase_key=TASKWIP.DMCase_key
AND EXTID.BM_ExternalCaseIdentifier_key=WIP.VMAE_BM_ExternalCaseIdentifier_key
AND TASKTYPE.DMTaskType_key=TASKWIP.DMTaskType_key
AND EXTID.BM_ExternalCaseIdentifier='BMAX5C62970'
--AND TASKTYPE.DMTaskType_key=9 AND TASKWIP.DMTaskState_key=2
--AND TASKTYPE.DMTaskType_key=10 AND TASKWIP.DMTaskState_key=0
If you look at the last two lines of the SQL, that's the critical part: I need all records satisfying both conditions. A case can have multiple corresponding child records in the taskwip table. I need to keep only those cases where, among the child records, both criteria are met: task 9 with state 2 and task 10 with state 0. What I have given here is example data for one record. There will be multiple records like this; for instance, another case key may have multiple child records where task 9 has state 3 rather than 2, and task 10 has state 2 rather than 0. The report should not show that record.
I am happy if you can develop a query in any DB language, whether it's SQL Server, Oracle, or MySQL. I am more interested in the logic than the language format.
Because as seen in the result set, for this case key, there is a task type 10 with state 0 and a task type 9 with state 2.
The specification isn't clear; I'm guessing, and this is just a guess, that we want to return rows ONLY if BOTH of a couple specific rows exist.
One option is to use correlated subqueries in an EXISTS predicate.
For example, something like this:
SELECT TASKTYPE.TaskTypeName
, TASKWIP.DMTaskState_key
FROM MercuryProd.TEAMSPACE.F_DMCaseWIP WIP
JOIN MercuryProd.TEAMSPACE.F_DMTaskWIP TASKWIP
ON TASKWIP.DMCase_key = WIP.DMCase_key
JOIN MercuryProd.TEAMSPACE.D_DMDataField_BM_ExternalCaseIdentifier EXTID
ON EXTID.BM_ExternalCaseIdentifier_key = WIP.VMAE_BM_ExternalCaseIdentifier_key
JOIN MercuryProd.TEAMSPACE.D_DMTaskType TASKTYPE
ON TASKTYPE.DMTaskType_key = TASKWIP.DMTaskType_key
WHERE EXTID.BM_ExternalCaseIdentifier = 'BMAX5C62970'
AND EXISTS ( SELECT 1
FROM MercuryProd.TEAMSPACE.D_DMTaskType tt92
WHERE tt92.DMTaskType_key = 9
AND TASKWIP.DMTaskState_key = 2
)
AND EXISTS ( SELECT 1
FROM MercuryProd.TEAMSPACE.D_DMTaskType tt10
WHERE tt10.DMTaskType_key = 10
AND TASKWIP.DMTaskState_key = 0
)
Note that it doesn't matter what value the subqueries return; the EXISTS is just checking whether at least one row is returned.
Note that this doesn't restrict which rows from TASKTYPE are returned. If we want to limit the return to just specific matching rows, we can add to the ON clause of the TASKTYPE join, or to the WHERE clause ...
AND ( ( TASKTYPE.DMTaskType_key = 9 AND TASKWIP.DMTaskState_key = 2 )
OR ( TASKTYPE.DMTaskType_key = 10 AND TASKWIP.DMTaskState_key = 0 )
)
There are other query patterns we could use; we could do a single EXISTS like this:
AND EXISTS ( SELECT 1
FROM MercuryProd.TEAMSPACE.D_DMTaskType ttx
WHERE ( ttx.DMTaskType_key = 9 AND TASKWIP.DMTaskState_key = 2 )
OR ( ttx.DMTaskType_key = 10 AND TASKWIP.DMTaskState_key = 0 )
HAVING COUNT(DISTINCT ttx.DMTaskType_key) = 2
)
EDIT
The first pattern demonstrated isn't sufficient. That requires both TASKTYPE rows to be related to the same TASKWIP row, and that can't happen because each TASKTYPE row requires a different value from the TASKWIP row.
We would need to do the join in the correlated subqueries.
Something along these lines:
AND EXISTS ( SELECT 1
FROM MercuryProd.TEAMSPACE.F_DMTaskWIP tw92
JOIN MercuryProd.TEAMSPACE.D_DMTaskType tt92
ON tt92.DMTaskType_key = tw92.DMTaskType_key
AND tt92.DMTaskType_key = 9
WHERE tw92.DMTaskState_key = 2
AND tw92.DMCase_key = WIP.DMCase_key
)
AND EXISTS ( SELECT 1
FROM MercuryProd.TEAMSPACE.F_DMTaskWIP tw10
JOIN MercuryProd.TEAMSPACE.D_DMTaskType tt10
ON tt10.DMTaskType_key = tw10.DMTaskType_key
AND tt10.DMTaskType_key = 10
WHERE tw10.DMTaskState_key = 0
AND tw10.DMCase_key = WIP.DMCase_key
)
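To see the "both child rows must exist" logic in isolation, here is a small sketch run against SQLite through Python's sqlite3 module. The single task_wip table, its column names, and the sample rows are simplified stand-ins for the real MercuryProd tables, not the actual schema. It shows the two correlated EXISTS subqueries, plus a GROUP BY / HAVING variant of the same idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical, simplified child table: one row per task of a case
    CREATE TABLE task_wip (case_key INTEGER, task_type_key INTEGER, task_state_key INTEGER);
    INSERT INTO task_wip VALUES
        (100, 9, 2), (100, 10, 0), (100, 11, 1),  -- meets both criteria
        (200, 9, 3), (200, 10, 2),                -- wrong states
        (300, 9, 2);                              -- only one of the two tasks
""")

def cases_via_exists(conn):
    """Cases that have a (9, 2) child row AND a (10, 0) child row."""
    sql = """
        SELECT DISTINCT c.case_key
        FROM task_wip AS c
        WHERE EXISTS (SELECT 1 FROM task_wip AS t
                      WHERE t.case_key = c.case_key
                        AND t.task_type_key = 9 AND t.task_state_key = 2)
          AND EXISTS (SELECT 1 FROM task_wip AS t
                      WHERE t.case_key = c.case_key
                        AND t.task_type_key = 10 AND t.task_state_key = 0)
        ORDER BY c.case_key
    """
    return [row[0] for row in conn.execute(sql)]

def cases_via_having(conn):
    """Same result: keep only qualifying child rows, then require both task types."""
    sql = """
        SELECT case_key
        FROM task_wip
        WHERE (task_type_key = 9 AND task_state_key = 2)
           OR (task_type_key = 10 AND task_state_key = 0)
        GROUP BY case_key
        HAVING COUNT(DISTINCT task_type_key) = 2
        ORDER BY case_key
    """
    return [row[0] for row in conn.execute(sql)]

print(cases_via_exists(conn), cases_via_having(conn))
```

Both functions correlate on the case key rather than on a single child row, which is the point made in the EDIT above.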
For Oracle, you can use listagg like below:
SELECT DM_Case_Key,listagg(TaskTypeName,',') within group (order by DMTaskType_key)
over (partition by DM_Case_Key) as Tasks
FROM your_data

How can I find out the relationship between two columns in database?

I have a view defined in SQL Server database and it has two columns A and B, both of which have the type of INT. I want to find out the relationship between these two, 1 to 1 or 1 to many or many to many. Is there a SQL statement I can use to find out?
For the relationship, it means for a given value of A, how many values of B maps to this value. If there is only one value, then it is 1 to 1 mapping.
You could use CTEs to generate COUNTs of how many distinct A values were associated with each B value and vice versa, then take the MAX of those values to determine if the relationship is 1 or many on each side. For example:
WITH CTEA AS (
SELECT COUNT(DISTINCT B) ac
FROM t
GROUP BY A
),
CTEB AS (
SELECT COUNT(DISTINCT A) bc
FROM t
GROUP BY B
)
SELECT CONCAT(
CASE WHEN MAX(bc) = 1 THEN '1' ELSE 'many' END,
' to ',
CASE WHEN MAX(ac) = 1 THEN '1' ELSE 'many' END
) AS [A to B]
FROM CTEA
CROSS JOIN CTEB
Note that any time a relationship is listed as 1, it may actually be many but just not showing that because of limited data in the table.
Assuming you have no NULL values:
select (case when count(*) = count(distinct a) and
count(*) = count(distinct b)
then '1-1'
when count(*) = count(distinct a) or
count(*) = count(distinct b)
then '1-many'
else 'many-many'
end)
from t;
Note: This does not distinguish between 1-many for a-->b or b-->a.
You would use count and group by to get this information.
-- Count the distinct values of b that map to each value of a. If any row has a count greater than 1, the mapping from a to b is one-to-many.
select a, count(distinct b)
from t
group by a
If every row has a count of 1, then each value of a maps to exactly one value of b. Run the same query with a and b swapped to check the other direction.
A caveat: NULLs in b are ignored by count(distinct b).
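The counting approach from the answers above can be tried out with a short sketch like the one below, which uses SQLite through Python's sqlite3 module. The table t with columns a and b follows the answers; the function name classify_relationship and the sample rows are made up for illustration. As noted earlier, a result of "1" only reflects the data currently in the table.

```python
import sqlite3

def classify_relationship(rows):
    """Classify the a/b relationship in `rows` (a list of (a, b) pairs)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
    conn.executemany("INSERT INTO t (a, b) VALUES (?, ?)", rows)
    sql = """
        WITH per_a AS (SELECT COUNT(DISTINCT b) AS n FROM t GROUP BY a),
             per_b AS (SELECT COUNT(DISTINCT a) AS n FROM t GROUP BY b)
        SELECT (SELECT MAX(n) FROM per_b),   -- most a values seen for one b
               (SELECT MAX(n) FROM per_a)    -- most b values seen for one a
    """
    max_a_per_b, max_b_per_a = conn.execute(sql).fetchone()
    conn.close()
    left = "1" if max_a_per_b == 1 else "many"
    right = "1" if max_b_per_a == 1 else "many"
    return f"{left} to {right}"

print(classify_relationship([(1, 10), (2, 20)]))
print(classify_relationship([(1, 10), (1, 11), (2, 12)]))
print(classify_relationship([(1, 10), (1, 11), (2, 10)]))
```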

How to Update a group of rows

My sqlfiddle: http://sqlfiddle.com/#!15/4f9da/1
I'm not good at explaining this and I'm new to complex queries (I only know the basics), so bear with me.
Situation: The column revision groups related rows of the same object. For example, ids 1, 2 and 3 are the same object, and each row's ground_id always refers to the id of the previous revision.
Problem: I need the ord column to hold the same value for every row in the same group of objects. For example, ids 1, 2 and 3 need ord set to 1, because revision 0 of that object is id 1. Likewise, id 4 must have ord 4, and so must id 5.
Basically must be like this:
You need a recursive query to do this. First you select the rows where ground_id IS NULL, set ord to the value of id. In the following iterations you add more rows based on the value of ground_id, setting the ord value to that of the row it is being matched to. You can then use that set of rows (id, ord) as a row source for the UPDATE:
WITH RECURSIVE set_ord (id, ord) AS (
SELECT id, id
FROM ground
WHERE ground_id IS NULL
UNION
SELECT g.id, o.ord
FROM ground g
JOIN set_ord o ON o.id = g.ground_id
)
UPDATE ground g
SET ord = s.ord
FROM set_ord s
WHERE g.id = s.id;
(SQLFiddle is currently not-responsive so I can't post my code there)
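Since the fiddle was unavailable, here is a small self-contained sketch of the same recursive idea, run against SQLite through Python's sqlite3 module. The ground table is reduced to the three columns that matter, and the five sample rows are guessed from the question's description (ids 1, 2, 3 form one chain, ids 4 and 5 another). The UPDATE uses a correlated subquery instead of UPDATE ... FROM so that it also runs on older SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- assumed sample data: ground_id points at the previous revision of the same object
    CREATE TABLE ground (id INTEGER PRIMARY KEY, ground_id INTEGER, ord INTEGER);
    INSERT INTO ground (id, ground_id) VALUES
        (1, NULL), (2, 1), (3, 2),
        (4, NULL), (5, 4);
""")

conn.execute("""
    WITH RECURSIVE set_ord (id, ord) AS (
        -- anchor: first revisions use their own id as ord
        SELECT id, id
        FROM ground
        WHERE ground_id IS NULL
        UNION
        -- step: later revisions inherit ord from the row they point to
        SELECT g.id, o.ord
        FROM ground AS g
        JOIN set_ord AS o ON o.id = g.ground_id
    )
    UPDATE ground
    SET ord = (SELECT s.ord FROM set_ord AS s WHERE s.id = ground.id)
""")

rows = conn.execute("SELECT id, ord FROM ground ORDER BY id").fetchall()
print(rows)
```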

How return a count(*) of 0 instead of NULL

I have this bit of code:
SELECT Project, Financial_Year, COUNT(*) AS HighRiskCount
INTO #HighRisk
FROM #TempRisk1
WHERE Risk_1 = 3
GROUP BY Project, Financial_Year
where it's not returning any rows when the count is zero. How do I make these rows appear with the HighRiskCount set as 0?
You can't select the values from the table when the row count is 0. Where would it get the values for the nonexistent rows?
To do this, you'll have to have another table that defines your list of valid Project and Financial_Year values. You'll then select from this table, perform a left join on your existing table, then do the grouping.
Something like this:
SELECT l.Project, l.Financial_Year, COUNT(t.Project) AS HighRiskCount
INTO #HighRisk
FROM MasterRiskList l
left join #TempRisk1 t on t.Project = l.Project and t.Financial_Year = l.Financial_Year
and t.Risk_1 = 3
GROUP BY l.Project, l.Financial_Year
Wrap your SELECT Query in an ISNULL:
SELECT ISNULL((SELECT Project, Financial_Year, COUNT(*) AS hrc
INTO #HighRisk
FROM #TempRisk1
WHERE Risk_1 = 3
GROUP BY Project, Financial_Year),0) AS HighRiskCount
If your SELECT returns a number, it will pass through. If it returns NULL, the 0 will pass through.
This assumes that every 'Project' and 'Financial_Year' you intend to include appears in the table at least once, even if only in rows where Risk_1 is not 3.
SELECT Project, Financial_Year, SUM(CASE WHEN RISK_1 = 3 THEN 1 ELSE 0 END) AS HighRiskCount
INTO #HighRisk
FROM #TempRisk1
GROUP BY Project, Financial_Year
Notice I removed the WHERE clause.
By the way, your current query is not returning null, it is returning no rows.
Use:
SELECT x.Project, x.Financial_Year,
COUNT(y.Project) AS HighRiskCount
INTO #HighRisk
FROM (SELECT DISTINCT t.Project, t.Financial_Year
FROM #TempRisk1 t) x
LEFT JOIN #TempRisk1 y ON y.Project = x.Project
AND y.Financial_Year = x.Financial_Year
AND y.Risk_1 = 3
GROUP BY x.Project, x.Financial_Year
The only way to get zero counts is to use an OUTER join against a list of the distinct values you want to see zero counts for.
SQL generally has a problem returning the values that aren't in a table. To accomplish this (without a stored procedure, in any event), you'll need another table that contains the missing values.
Assuming you want one row per project / financial year combination, you'll need a table that contains each valid Project, Financial_Year combination:
SELECT PY.Project, PY.Financial_Year, COUNT(HR.Risk_1) AS HighRiskCount
INTO #HighRisk
FROM #TempRisk1 HR RIGHT OUTER JOIN ProjectYears PY
ON HR.Project = PY.Project AND HR.Financial_Year = PY.Financial_Year
AND HR.Risk_1 = 3
GROUP BY PY.Project, PY.Financial_Year
Note that we're taking advantage of the fact that COUNT() will only count non-NULL values to get a 0 COUNT result for those result set records that are made up only of data from the new ProjectYears table.
Alternatively, you might want only one 0 count record to be returned per project (or maybe one per financial_year). You would modify the above solution so that the JOINed table has only that one column.
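The outer-join idea can be checked with a short sketch like this one, which runs against SQLite through Python's sqlite3 module. The table names project_years and temp_risk stand in for ProjectYears and #TempRisk1, and the sample rows are invented for illustration. The Risk_1 = 3 test sits in the ON clause; putting it in WHERE would discard the unmatched rows and the zero counts with them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- stand-ins for ProjectYears and #TempRisk1, with made-up rows
    CREATE TABLE project_years (project TEXT, financial_year INTEGER);
    CREATE TABLE temp_risk (project TEXT, financial_year INTEGER, risk_1 INTEGER);
    INSERT INTO project_years VALUES ('A', 2010), ('B', 2010);
    INSERT INTO temp_risk VALUES
        ('A', 2010, 3), ('A', 2010, 3), ('A', 2010, 1),
        ('B', 2010, 2);
""")

sql = """
    SELECT py.project, py.financial_year, COUNT(tr.risk_1) AS high_risk_count
    FROM project_years AS py
    LEFT JOIN temp_risk AS tr
           ON tr.project = py.project
          AND tr.financial_year = py.financial_year
          AND tr.risk_1 = 3     -- filter here, not in WHERE, to keep unmatched rows
    GROUP BY py.project, py.financial_year
    ORDER BY py.project, py.financial_year
"""
high_risk = conn.execute(sql).fetchall()
print(high_risk)
```

COUNT(tr.risk_1) counts only non-NULL values, so project B, which has no matching high-risk row, comes back with a count of 0 instead of disappearing.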
Little longer, but what about this as a solution?
IF EXISTS (
SELECT *
FROM #TempRisk1
WHERE Risk_1 = 3
)
BEGIN
SELECT Project, Financial_Year, COUNT(*) AS HighRiskCount
INTO #HighRisk
FROM #TempRisk1
WHERE Risk_1 = 3
GROUP BY Project, Financial_Year
END
ELSE
BEGIN
INSERT INTO #HighRisk
SELECT 'Project', 'Financial_Year', 0
END
MSDN - ISNULL function
SELECT Project, Financial_Year, ISNULL(COUNT(*), 0) AS HighRiskCount
INTO #HighRisk
FROM #TempRisk1
WHERE Risk_1 = 3
GROUP BY Project, Financial_Year

How to select a row for certain (or give preference in the selection) in mysql?

Need your help guys in forming a query.
Example.
Company - Car Rental
Table - Cars
ID NAME STATUS
1 Mercedes Showroom
2 Mercedes On-Road
Now, how do I select only one entry from this table which satisfies the below conditions?
If Mercedes is available in Showroom, then fetch only that row. (i.e. row 1 in above example)
But If none of the Mercedes are available in the showroom, then fetch any one of the rows. (i.e. row 1 or row 2) - (This is just to say that all the mercedes are on-road)
Using distinct ain't helping here as the ID's are also fetched in the select statement
Thanks!
Here's a common way of solving that problem:
SELECT *,
CASE STATUS
WHEN 'Showroom' THEN 0
ELSE 1
END AS InShowRoom
FROM Cars
WHERE NAME = 'Mercedes'
ORDER BY InShowRoom
LIMIT 1
Here's how to get all the cars, which also shows another way to solve the problem:
SELECT c1.ID, c1.NAME, IFNULL(c2.STATUS, c1.STATUS) AS STATUS
FROM Cars c1
LEFT OUTER JOIN Cars c2
ON c2.NAME = c1.NAME AND c2.STATUS = 'Showroom'
GROUP BY c1.NAME
ORDER BY c1.NAME
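The "prefer Showroom, otherwise take any row" ordering can be tried out with a sketch like the following, which runs against SQLite through Python's sqlite3 module rather than MySQL. The Mercedes rows come from the question; the Audi rows, the pick_car function name, and the id tie-breaker are additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cars (id INTEGER PRIMARY KEY, name TEXT, status TEXT);
    INSERT INTO cars (id, name, status) VALUES
        (1, 'Mercedes', 'Showroom'),
        (2, 'Mercedes', 'On-Road'),
        (3, 'Audi', 'On-Road'),      -- assumed extra rows: no Audi in the showroom
        (4, 'Audi', 'On-Road');
""")

def pick_car(conn, name):
    """Return one row for `name`, preferring a 'Showroom' row when one exists."""
    sql = """
        SELECT id, name, status
        FROM cars
        WHERE name = ?
        ORDER BY CASE status WHEN 'Showroom' THEN 0 ELSE 1 END,  -- Showroom sorts first
                 id                                               -- deterministic tie-break
        LIMIT 1
    """
    return conn.execute(sql, (name,)).fetchone()

print(pick_car(conn, "Mercedes"))
print(pick_car(conn, "Audi"))
```

The extra sort on id is not part of the original answer; it only makes the fallback choice repeatable when no showroom row exists.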
You would want to use the FIND_IN_SET() function to do that.
SELECT *
FROM Cars
WHERE NAME = 'Mercedes'
ORDER BY FIND_IN_SET(`STATUS`,'Showroom') DESC
LIMIT 1
If you have a preferred order of other statuses, just add them to the second parameter.
ORDER BY FIND_IN_SET(`STATUS`,'On-Road,Showroom' ) DESC
To fetch 'best' status for all cars you can simply do this:
SELECT *
FROM Cars
GROUP BY NAME
ORDER BY FIND_IN_SET(`STATUS`,'Showroom') DESC
SELECT * FROM cars
WHERE name = 'Mercedes'
AND status = 'Showroom'
UNION SELECT * FROM cars
WHERE name = 'Mercedes'
LIMIT 1;
EDIT Removed the ALL on the UNION since we only want distinct rows anyway.
MySQL versions before 8.0 don't have ranking/analytic/windowing functions, but you can use a variable to simulate ROW_NUMBER functionality (when you see "--", it's a comment):
SELECT x.id, x.name, x.status
FROM (SELECT t.id,
t.name,
t.status,
CASE
WHEN @car_name != t.name THEN @rownum := 1 -- reset on diff name
ELSE @rownum := @rownum + 1
END AS rank,
@car_name := t.name -- necessary to set @car_name for the comparison
FROM CARS t
JOIN (SELECT @rownum := NULL, @car_name := '') r
ORDER BY t.name, t.status DESC) x -- ORDER BY is necessary for rank value
WHERE x.rank = 1
Ordering by status DESC means that "Showroom" will be at the top of the list, so it'll be ranked as 1. If the car name doesn't have a "Showroom" status, the row ranked as 1 will be whatever status comes after "Showroom". The WHERE clause will only return the first row for each car in the table.
The status being a text based data type tells me your data is not normalized - I could add records with "Showroom", "SHOWroom", and "showROOM". They'd be valid, but you're looking at using functions like LOWER & UPPER when you are grouping things for counting, sum, etc. The use of functions would also render an index on the column useless... You'll want to consider making a CAR_STATUS_TYPE_CODE table, and use a foreign key relationship to make sure bad data doesn't get into your table:
DROP TABLE IF EXISTS `example`.`car_status_type_code`;
CREATE TABLE `example`.`car_status_type_code` (
`car_status_type_code_id` int(10) unsigned NOT NULL auto_increment,
`description` varchar(45) NOT NULL default '',
PRIMARY KEY (`car_status_type_code_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;