Selecting with multiple conditions from one dataset - sql

I have a Table that has 5 columns
First you can see that im german. But second you see that much of the data only differs in the category and value.
I now want to find all the Datasets that have category 1 and value 1
It should give me this table
I now whant to find in the initial TableA all the entrys that match Name, Date and City BUT only if all 3 of them for every dataset match AND the category is now 2 instead of 1 AND the Value is 0.
So for the Table A it should come out as:
I hope i didnt do any mistakes. In the example and it is clear what i try.
I know for the WHERE Statement there is an IN clause that basically checks if the value is inside a list of values. But i dont know how to use this to check for 3 Values. Because when i just do 3 Lists checks it would also give me every entry that is a combination of my 3 lists regardles of which row the actual value comes from.
So instead of checking if value Name[0] And City[0] And Date[0] can be found i need to avoid that a value is found that is like Name[0] City[4] and Date[12] (Number in brackets stands for the row number).
The code i would have thought of:
Select*
FROM tablea
WHERE
(SELECT name, date, city
FROM tablea
WHERE tablea.Category=1 AND tablea.Value=0) as tableafiltered
WHERE tablea in tableafiltered
Thats what i thought would maybe work. But im pretty sure it wouldnt work. Because im trying to match 3 Columns. And the in in the where statement is only valid for one column right?

The first dataset that you describe can be a subquery and you can join it to the table:
select t.*
from tablea t inner join (
select distinct name, date, city
from tablea
where category = 1 and value = 1
) d on d.name = t.name and d.date = t.date and d.city = t.city
where t.category = 2 and t.value = 0
Another way of doing it is with EXISTS:
select t.*
from tablea t
where t.category = 2 and t.value = 0
and exists (
select 1
from tablea
where name = t.name and date = t.date and city = t.city and category = 1 and value = 1
)
See the demo.
Results:
> name | date | city | category | value
> :----- | :--------- | :------ | -------: | ----:
> Albert | 01.01.2000 | Berlin | 2 | 0
> Albert | 01.01.2000 | Hamburg | 2 | 0

One way to do this would be to create two selects, one for category=1, value=1, and one for the 2,0 combination. Then you can inner join the two tables in one row, then ensure the other two columns matches by where table1.column1=table2.column1 and table2.column2=table1.column2. You can choose the columns any way you like, that's why I give this generic form.

Related

how to get a record based on max date for each user in different multiple tables

I'm trying to make a query that can select the names, DateOfSpecimenResult,SwabResult,DateOfReleaseOfResult. Heres the Visual Representation:
Table: ContactTracingHeader
| AutoID | Name |
1 Jason
2 Chris
Table: Swab
| PatientNo | DateOfSpecimenCollection | SwabResultID | DateOfReleaseOfResult |
1 05/02/2020 1 05/10/2020
1 06/08/2020 1 06/11/2020
1 07/16/2020 2 07/20/2020
Note: ContactTracingHeader.AutoID = Swab.PatientNo
Table: SwabResult
| AutoID | SwabResult |
1 POSITIVE
2 NEGATIVE
Query Output that I'm trying to make
| AutoID | Name | DateOfSpecimenCollection | SwabResult | DateOfReleaseOfResult |
1 Jason 07/16/2020 NEGATIVE 07/20/2020
2 Chris (BLANK) (BLANK) (BLANK)
Note: Swab.ResultID = SwabResult.AutoID
Here, I'm only trying to show a Name with the latest DateOfSpecimenCollection and then use it as a reference for the 2 other column which is SwabResult and DateOfReleaseOfResult, And Since "Chris" doesn't have a input on the Swab table, his records are blank but his name and AutoID Still appears on the table. The SwabResult shows the NEGATIVE, or POSITIVE depending on the ID from table Swab.
What i have done so far is this:
SELECT
CTH.AutoID,
CTH.Firstname,
CTH.Lastname,
(SELECT MAX(DateOfSpecimenCollection)
FROM Swab
WHERE Swab.PatientNo = CTH.AutoID
) AS DateOfSpecimenCollection,
Swab.SwabResultID,
Swab.DateOfReleaseOfResult
FROM ContactTracingHeader AS CTH LEFT JOIN Swab ON Swab.PatientNo = CTH.AutoID;
This Query gets the latest DateOfSpecimenCollection and Output those who aren't in the Swab table which is correct, however it duplicates the record depending how many SwabResult or DateOfReleaseOfResult it has. Also I tried INNER JOIN the table SwabResult so i can output the SwabResult instead of the ID, However it gave me JOIN Expression not supported Error.
I apologize for the long and confusing explanation and duplicate question as I'm trying to strugge to find some answers on the internet. Thank you!
Need a unique identifier in Swab table. If not already there, autonumber should serve
Consider:
Query1:
SELECT Swab.* FROM Swab WHERE ID IN
(SELECT TOP 1 ID FROM Swab AS Dupe WHERE Dupe.PatientNo=Swab.PatientNo
ORDER BY Dupe.DateOfSpecimenCollection DESC);
Query2
SELECT ContactTracingHeader.AutoID, ContactTracingHeader.Name, Query1.DateOfSpecimenCollection,
SwabResult.SwabResult, Query1.DateOfReleaseOfResult
FROM ContactTracingHeader LEFT JOIN (SwabResult RIGHT JOIN Query1
ON SwabResult.AutoID = Query1.SwabResultID)
ON ContactTracingHeader.AutoID = Query1.PatientNo;
Or this sequence:
Query1:
SELECT Swab.PatientNo, Max(Swab.DateOfSpecimenCollection) AS MaxOfDateOfSpecimenCollection
FROM Swab
GROUP BY Swab.PatientNo;
Query2:
SELECT Swab.PatientNo, Swab.DateOfSpecimenCollection, Swab.SwabResultID,
Swab.DateOfReleaseOfResult
FROM Query1 INNER JOIN Swab
ON (Query1.MaxOfDateOfSpecimenCollection = Swab.DateOfSpecimenCollection)
AND (Query1.PatientNo = Swab.PatientNo);
Query3:
SELECT ContactTracingHeader.AutoID, ContactTracingHeader.Name,
Query2.DateOfSpecimenCollection, SwabResult.SwabResult, Query2.DateOfReleaseOfResult
FROM SwabResult RIGHT JOIN (ContactTracingHeader LEFT JOIN Query2
ON ContactTracingHeader.AutoID = Query2.PatientNo)
ON SwabResult.AutoID = Query2.SwabResultID;

Greatest N Per Group with JOIN and multiple order columns

I have two tables:
Table0:
| ID | TYPE | TIME | SITE |
|----|------|-------|------|
| aa | 1 | 12-18 | 100 |
| aa | 1 | 12-10 | 101 |
| bb | 2 | 12-10 | 102 |
| cc | 1 | 12-09 | 100 |
| cc | 2 | 12-12 | 103 |
| cc | 2 | 12-01 | 109 |
| cc | 1 | 12-07 | 101 |
| dd | 1 | 12-08 | 100 |
and
Table1:
| ID |
|----|
| aa |
| cc |
| cc |
| dd |
| dd |
I'm trying to output results where:
ID must exist in both tables.
TYPE must be the maximum for each ID.
TIME must be the minimum value for the maximum TYPE for each ID.
SITE should be the value from the same row as the minimum TIME value.
Given my sample data, my results should look like this:
| ID | TYPE | TIME | SITE |
|----|------|-------|------|
| aa | 1 | 12-10 | 101 |
| cc | 2 | 12-01 | 109 |
| dd | 1 | 12-08 | 100 |
I've tried these statements:
INSERT INTO "NuTable"
SELECT DISTINCT(QTS."ID"), "SITE",
CASE WHEN MAS.MAB=1 THEN 'B'
WHEN MAS.MAB=2 THEN 'F'
ELSE NULL END,
"TIME"
FROM (SELECT DISTINCT("ID") FROM TABLE1) AS QTS,
TABLE0 AS MA,
(SELECT "ID", MAX("TYPE") AS MASTY, MIN("TIME") AS MASTM
FROM TABLE0
GROUP BY "ID") AS MAS,
WHERE QTS."ID" = MA."ID"
AND QTS."ID" = MAS."ID"
AND MSD.MASTY =MA."TYPE"
...which generates a syntax error
INSERT INTO "NuTable"
SELECT DISTINCT(QTS."ID"), "SITE",
CASE WHEN MAS.MAB=1 THEN 'B'
WHEN MAS.MAB=2 THEN 'F'
ELSE NULL END,
"TIME"
FROM (SELECT DISTINCT("ID") FROM TABLE1) AS QTS,
TABLE0 AS MA,
(SELECT "ID", MAX("TYPE") AS MAB
FROM TABLE0
GROUP BY "ID") AS MAS,
((SELECT "ID", MIN("TIME") AS MACTM, MIN("TYPE") AS MACTY
FROM TABLE0
WHERE "TYPE" = 1
GROUP BY "ID")
UNION
(SELECT "ID", MIN("TIME"), MAX("TYPE")
FROM TABLE0
WHERE "TYPE" = 2
GROUP BY "ID")) AS MACU
WHERE QTS."ID" = MA."ID"
AND QTS."ID" = MAS."ID"
AND MACU."ID" = QTS."ID"
AND MA."TIME" = MACU.MACTM
AND MA."TYPE" = MACU.MACTB
... which is getting the wrong results.
Answering your direct question "how to avoid...":
You get this error when you specify a column in a SELECT area of a statement that isn't present in the GROUP BY section and isn't part of an aggregating function like MAX, MIN, AVG
in your data, I cannot say
SELECT
ID, site, min(time)
FROM
table
GROUP BY
id
I didn't say what to do with SITE; it's either a key of the group (in which case I'll get every unique combination of ID,site and the min time in each) or it should be aggregated (eg max site per ID)
These are ok:
SELECT
ID, max(site), min(time)
FROM
table
GROUP BY
id
SELECT
ID, site, min(time)
FROM
table
GROUP BY
id,site
I cannot simply not specify what to do with it- what should the database return in such a case? (If you're still struggling, tell me in the comments what you think the db should do, and I'll better understand your thinking so I can tell you why it can't do that ). The programmer of the database cannot make this decision for you; you must make it
Usually people ask this when they want to identify:
The min time per ID, and get all the other row data as well. eg "What is the full earliest record data for each id?"
In this case you have to write a query that identifies the min time per id and then join that subquery back to the main data table on id=id and time=mintime. The db runs the subquery, builds a list of min time per id, then that effectively becomes a filter of the main data table
SELECT * FROM
(
SELECT
ID, min(time) as mintime
FROM
table
GROUP BY
id
) findmin
INNER JOIN table t ON t.id = findmin.id and t.time = findmin.mintime
What you cannot do is start putting the other data you want into the query that does the grouping, because you either have to group by the columns you add in (makes the group more fine grained, not what you want) or you have to aggregate them (and then it doesn't necessarily come from the same row as other aggregated columns - min time is from row 1, min site is from row 3 - not what you want)
Looking at your actual problem:
The ID value must exist in two tables.
The Type value must be largest group by id.
The Time value must be smallest in the largest type group.
Leaving out a solution that involves having or analytics for now, so you can get to grips with the theory here:
You need to find the max type group by id, and then join it back to the table to get the other relevant data also (time is needed) for that id/maxtype and then on this new filtered data set you need the id and min time
SELECT t.id,min(t.time) FROM
(
SELECT
ID, max(type) as maxtype
FROM
table
GROUP BY
id
) findmax
INNER JOIN table t ON t.id = findmax.id and t.type = findmax.maxtype
GROUP BY t.id
If you can't see why, let me know
demo:db<>fiddle
SELECT DISTINCT ON (t0.id)
t0.id,
type,
time,
first_value(site) OVER (PARTITION BY t0.id ORDER BY time) as site
FROM table0 t0
JOIN table1 t1 ON t0.id = t1.id
ORDER BY t0.id, type DESC, time
ID must exist in both tables
This can be achieved by joining both tables against their ids. The result of inner joins are rows that exist in both tables.
SITE should be the value from the same row as the minimum TIME value.
This is the same as "Give me the first value of each group ofids ordered bytime". This can be done by using the first_value() window function. Window functions can group your data set (PARTITION BY). So you are getting groups of ids which can be ordered separately. first_value() gives the first value of these ordered groups.
TYPE must be the maximum for each ID.
To get the maximum type per id you'll first have to ORDER BY id, type DESC. You are getting the maximum type as first row per id...
TIME must be the minimum value for the maximum TYPE for each ID.
... Then you can order this result by time additionally to assure this condition.
Now you have an ordered data set: For each id, the row with the maximum type and its minimum time is the first one.
DISTINCT ON gives you exactly the first row of each group. In this case the group you defined is (id). The result is your expected one.
I would write this using distinct on and in/exists:
select distinct on (t0.id) t0.*
from table0 t0
where exists (select 1 from table1 t1 where t1.id = t0.id)
order by t0.id, type desc, time asc;

CASE...WHEN in WHERE clause in Postgresql

My query looks like:
SELECT *
FROM table
WHERE t1.id_status_notatka_1 = ANY (selected_type)
AND t1.id_status_notatka_2 = ANY (selected_place)
here I would like to add CASE WHEN
so my query is:
SELECT *
FROM table
WHERE t1.id_status_notatka_1 = ANY (selected_type)
AND t1.id_status_notatka_2 = ANY (selected_place)
AND CASE
WHEN t2.id_bank = 12 THEN t1.id_status_notatka_4 = ANY (selected_effect)
END
but it doesn't work. The syntax is good but it fails in searching for anything. So my question is - how use CASE WHEN in WHERE clause. Short example: if a=0 then add some condition to WHERE (AND condition), if it's not then don't add (AND condition)
No need for CASE EXPRESSION , simply use OR with parenthesis :
AND (t2.id_bank <> 12 OR t1.id_status_notatka_4 = ANY (selected_effect))
For those looking to use a CASE in the WHERE clause, in the above adding an else true condition in the case block should allow the query to work as expected. In the OP, the case will resolve as NULL, which will result in the WHERE clause effectively selecting WHERE ... AND NULL, which will always fail.
SELECT *
FROM table
WHERE t1.id_status_notatka_1 = ANY (selected_type)
AND t1.id_status_notatka_2 = ANY (selected_place)
AND CASE
WHEN t2.id_bank = 12 THEN t1.id_status_notatka_4 = ANY (selected_effect)
ELSE true
END
The accepted answer works, but I'd like to share input for those who are looking for a different answer. Thanks to sagi, I've come up with the following query, but I'd like to give a test case as well.
Let us assume this is the structure of our table
tbl
id | type | status
-----------------------
1 | Student | t
2 | Employee | f
3 | Employee | t
4 | Student | f
and we want to select all Student rows, that have Status = 't', however, We also like to retrieve all Employee rows regardless of its Status.
if we perform SELECT * FROM tbl WHERE type = 'Student' AND status = 't' we would only get the following result, we won't be able to fetch Employees
tbl
id | type | status
-----------------------
1 | Student | t
and performing SELECT * FROM tbl WHERE Status = 't' we would only get the following result, we got an Employee Row on the result but there are Employee Rows that were not included on the result set, one could argue that performing IN might work, however, it will give the same result set. SELECT * FROM tbl WHERE type IN('Student', 'Employee') AND status = 't'
tbl
id | type | status
-----------------------
1 | Student | t
3 | Employee | t
remember, we want to retrieve all Employee rows regardless of its Status, to do that we perform the query
SELECT * FROM tbl WHERE (type = 'Student' AND status = 't') OR (type = 'Employee')
result will be
table
id | type | status
-----------------------
1 | Student | t
2 | Employee | f
3 | Employee | t

COUNT() return 0 on certain column

Requirements:I wanted to know how many images that each person have posted.
Therefore I create a table schema as follow.
Table=Person
==========
Id (PK) , Column1, Column2, LId (FK)
Table=ListMaster
============
Id (PK) , LId (Unique)
Table = ListDetail
===========
Id (PK), LId(FK), DataId(FK)
Table = Image
=========
Id (PK), Column1, Column2
and use a query SQL
SELECT Person.Id AS PersonId,
Person.Column1 AS PersonC1,
Person.Column2 AS PersonC2,
COUNT(Image.Id) AS ImageCount
FROM Person
LEFT OUTER JOIN ListMaster ON Person.LId = ListMaster.LId
LEFT OUTER JOIN ListDetail ON ListDetail.LId = ListMaster.LId
LEFT OUTER JOIN Data AS Image ON ListDetail.DataId = Image.Id
GROUP BY Person.Id,
Person.Column1,
Person.Column2
I notice some person's "ImageCount" have "0" although there are images that he have posted before.
Could you please advise me how to fix my query or tell me it is even logically possible to do what I wanted? I suspect I make a mistake regarding my table design.
Sample Data
=======
Table = Person
Id (PK)(bigint identity) | Column1 (nvarchar(max)) | Column2 (nvarchar(max)) | LId (FK) (bigint [null])
1 | Test 1 C1 | Test1 C2 | 1
2 | Test 2 C1 | Test 2 C2 | 2
3 | Test 3 C1 | Test 3 C2 | NULL
4 | Test 4 C1 | Test 4 C4 | 37
Table = ListMaster
Id (PK)(bigint)(identity) | LId (Unique)(bigint)
1 | 1
2 | 2
3 | 37
Table = ListDetail
Id (PK)(bigint identity)| LId(FK)(bigint not null)| DataId(FK)(bigint not null)
1 | 1 | 1
2 | 1 | 2
3 | 2 | 3
4 | 37 | 4
Table = Image
Id (PK)(big int not null)(identity) | Column1 (nvarchar(max)) | Column2 (nvarchar(max))
1 | Location 1 | Dummy Data 1
2 | Location 2 | Dummy Data 2
3 | Location 3 | Dummy Data 3
4 | Location 4 | Dummy Data 4
I expect the COUNT(Image.Id) AS ImageCount should return
2
1
0
1
but it return
2
1
0
0
EDIT 1 : Change the table design
EDIT 2 : Add a sample data
If, as you say, your Image.Id column is declared NOT NULL, the only reason you should get a COUNT(Image.Id) of 0 for a Person would be if your LEFT JOINs do not find any Images for a given Person. In that case, Image.Id in your underlying results will be NULL, and therefore COUNT(Image.id) will be zero. This means that either:
There's a Person who doesn't have any ListMaster entries.
There's a ListMaster which doesn't have any ListDetail entries.
There's a ListDetail entry which doesn't have any Data entries.
...or some combination of the above.
You should be able to quickly check which links are missing by adding COUNTs for the appropriate tables to your existing query:
SELECT Person.Id AS PersonId,
Person.Column1 AS PersonC1,
Person.Column2 AS PersonC2,
-- NEXT TWO COUNTS ADDED FOR DEBUGGING
COUNT(ListMaster.LId) AS ListMasterCount,
COUNT(ListDetail.LId) AS ListDetailCount,
COUNT(Image.Id) AS ImageCount
FROM Person
LEFT OUTER JOIN ListMaster ON Person.LId = ListMaster.LId
LEFT OUTER JOIN ListDetail ON ListDetail.LId = ListMaster.LId
LEFT OUTER JOIN Data AS Image ON ListDetail.DataId = Image.Id
GROUP BY Person.Id,
Person.Column1,
Person.Column2
It looks like your query and data as given should return the values you expect. Is this the actual schema, data, and query, or are you simplifying to make this post? I presume your real data does not include values like "test 1 C1", etc. Did you create a DB with these dummy field names and values to do this test, or are you saying that this is the equivalent of what you really have? If this isn't the actual stuff, it could well be that in simplifying you have left out the thing that's really causing the problem.
When I have a query that does not give the expected results, I try to drop out parts of the query to see where the problem is. Like try with only the first join and see if you get the expected results. If that works, add in the second join, etc. Leave out the GROUP BY and just dump all the records, so you can see the actual records and not just the count.
There are many possible sources of trouble. Maybe the data isn't what you think it is. Maybe one of the joins is using the wrong field. Maybe you're getting in trouble because you have different data types and a conversion is not giving the results you expect. Etc.
Try this to check if all your Id, Column1, Column2 have image
SELECT Person.Id AS PersonId,
Person.Column1 AS PersonC1,
Person.Column2 AS PersonC2,
ListMaster.*,
ListDetail.*,
Data.*
FROM Person
LEFT OUTER JOIN ListMaster ON Person.LId = ListMaster.LId
LEFT OUTER JOIN ListDetail ON ListDetail.LId = ListMaster.LId
LEFT OUTER JOIN Data AS Image ON ListDetail.DataId = Image.Id
ORDER BY Person.Id,
Person.Column1,
Person.Column2

How do I select rows where only return keys that don't have '1' in column c

Title is confusing I know, I'm just not sure how to word this. Anyway let me describe with a table:
| key | column b | column c |
|-----|----------|----------|
| a | 13 | 2 |
| a | 14 | 2 |
| a | 15 | 1 |
| b | 16 | 2 |
| b | 17 | 2 |
I'd like to select all keys where column c doesn't equal 1, so the select will result in returning only key 'b'
To clarify, my result set should not contain keys that have a row where column c is set to 1. Therefore I'd like a sql query that would return the keys that satisfy the previous statement.
To make my question as clear as possible. From the table above, what I want returned by some sql statement is a result set containing [{b}] based on the fact that key 'a' has at least one row where column c is equal to 1 whereas key 'b' does not have any rows that contain 1 in column c.
SELECT t.[Key]
FROM TableName t
WHERE NOT EXISTS (SELECT 1
FROM TableName
WHERE t.[key] = [key]
AND ColumnC = 1)
GROUP BY t.[Key]
SELECT KEY
FROM WhateverYourTableNameIs
WHERE c <> '1'
I would do this using group by and aggregation:
select [key]
from table t
group by [key]
having sum(case when c = 1 then 1 else 0 end) = 0;
The having clause counts the number of rows that have c = 1. The = 0 says that there are no such rows for a given key.
Elaboration based on other comments:
You asked for ALL keys where column c doesn't equal 1. That is exactly what the query I suggested will give you. The other part of your question so the SELECT will result in returning only key 'b', is ambiguous. The question as asked will give you results from columns A and B. There is nothing in your question to limit the result set. You either need an additional condition to your WHERE clause, or your question is inherently unanswerable.