I'm working with a dataset structured like this:
I want to exclude all records where ReviewRound is "a" if the same ID has gone through review round "b". If a set of unique ID's has an associated round "b" review, the round "a" review should not be included.
Some records have not gone to round "b". The difficulty is that there are multiple records for each unique ID.
Ideally this could be done in Google BigQuery; if not, filtering through Google Apps Script may also be an option!
Any suggestions would be appreciated!
If a set of unique ID's has an associated round "b" review, the round "a" review should not be included.
If I followed you correctly, you could express this as a not condition with a correlated subquery that ensures that, if the current record has ReviewRound = 'a', there is no other record that has the same id and ReviewRound = 'b'.
select t.*
from mytable t
where not (
t.ReviewRound = 'a'
and exists (
select 1
from mytable t1
where t1.id = t.id and t1.ReviewRound = 'b'
)
)
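To check the correlated-subquery filter against a small sample, here is a minimal sketch that runs it in an in-memory SQLite database from Python. The table name mytable comes from the answer above; the Score column and the sample rows are made up for illustration, and SQLite stands in for BigQuery only because the query is plain standard SQL.

```python
import sqlite3

# Hypothetical sample: ID 1 has rounds a and b, ID 2 only has round a.
SAMPLE_ROWS = [
    (1, "a", 10),
    (1, "a", 12),
    (1, "b", 15),
    (2, "a", 7),
    (2, "a", 9),
]

QUERY = """
select t.id, t.ReviewRound, t.Score
from mytable t
where not (
    t.ReviewRound = 'a'
    and exists (
        select 1
        from mytable t1
        where t1.id = t.id and t1.ReviewRound = 'b'
    )
)
order by t.id, t.Score
"""


def filter_superseded_rounds(rows):
    """Drop round 'a' rows for every id that also has a round 'b' row."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table mytable (id integer, ReviewRound text, Score integer)"
        )
        conn.executemany("insert into mytable values (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


kept = filter_superseded_rounds(SAMPLE_ROWS)
print(kept)
```

ID 1 keeps only its round "b" row, while ID 2 keeps both round "a" rows because it never reached round "b".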
You can do this with window functions as well:
select t.* except (num_bs)
from (select t.*,
countif(reviewround = 'b') over (partition by id) as num_bs
from t
) t
where num_bs = 0 or reviewround = 'b';
By using window functions, you can solve it with this query
SELECT ID, Score
FROM (
SELECT *,
MAX(CASE WHEN ReviewRound = 'b' THEN 1 ELSE 0 END) OVER (partition by ID) as has_b
FROM mytable
) t
WHERE has_b = 0 OR ReviewRound = 'b'
Re-conceptualizing as keeping only the latest review round, I would try:
select * from mytable join
(select ID, max(ReviewRound) as ReviewRound from mytable group by ID)
using (ID, ReviewRound)
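The "keep only the latest round" idea depends on round labels sorting in review order ('a' before 'b'), which is an assumption about the data. A small sketch of that join, run through Python's sqlite3 module with made-up rows and a hypothetical Score column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (ID integer, ReviewRound text, Score integer)")
conn.executemany(
    "insert into mytable values (?, ?, ?)",
    # Hypothetical data: ID 1 reached round b (twice), ID 2 stayed in round a.
    [(1, "a", 10), (1, "b", 15), (1, "b", 16), (2, "a", 7), (2, "a", 9)],
)

# Join every row to the highest round label seen for its ID.
latest_round_rows = conn.execute(
    """
    select ID, ReviewRound, Score
    from mytable
    join (
        select ID, max(ReviewRound) as ReviewRound
        from mytable
        group by ID
    ) latest using (ID, ReviewRound)
    order by ID, Score
    """
).fetchall()
conn.close()

print(latest_round_rows)
```

Unlike the not exists version, this would also drop round "b" rows if a later round "c" were ever introduced, which may or may not be what you want.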
I'm trying to optimize the filtering of data in one report/table and I've encountered a challenge.
The table is in MS Access, so either Access VBA code or an SQL query would work here.
So far I've tried a few options, but could not achieve the expected result:
select prev_type, type, next_type
from (
select *,
lag(type) over (order by id) as prev_type,
type,
lead(type) over (order by id) as next_type
from table
) as t
where type = "type";
Basically, I want to display three rows from the table below:
row with Type = 'D'
previous row to the one with Type 'D'
next row to the one with Type 'D'
Try with a subquery:
Select * From YourTable
Where Abs([ID] - (Select ID From YourTable Where [Type] = 'D')) <= 1
For multiple Ds, join the subquery:
Select
*
From
YourTable ,
(Select ID From YourTable Where [Type] = 'D') As T
Where
Abs(YourTable.[ID] - T.ID) <= 1
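Both variants rely on the IDs being consecutive integers, so that "previous" and "next" mean ID - 1 and ID + 1. Under that assumption, here is a rough sketch of the multiple-D join using sqlite3 with invented sample rows. The distinct is my addition: when two D rows are close together, the row between them would otherwise come back twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table YourTable (ID integer primary key, Type text)")
conn.executemany(
    "insert into YourTable values (?, ?)",
    # Hypothetical data with two D rows (IDs 3 and 5) sharing neighbour 4.
    [(1, "A"), (2, "B"), (3, "D"), (4, "C"), (5, "D"), (6, "E"), (7, "F")],
)

# Cross join each row with every D row and keep those at most 1 ID away.
neighbours = conn.execute(
    """
    select distinct YourTable.ID, YourTable.Type
    from YourTable,
         (select ID from YourTable where Type = 'D') as T
    where abs(YourTable.ID - T.ID) <= 1
    order by YourTable.ID
    """
).fetchall()
conn.close()

print(neighbours)
```

If the IDs have gaps, the difference check no longer identifies the adjacent rows and a different approach would be needed.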
In SQL, I need to filter unneeded records out of a table:
Case 1: if two rows share the same ID and DOD is not null, the records are needed
Case 2: if an ID appears only once and DOD is not null, the record is needed
Case 3: if two rows share the same ID and DOD is null for either of them, the records are not needed
Your help is much appreciated.
Thanks
You can use analytic functions for this:
select t.*
from (
select
t.*,
sum(case when dod is null then 1 else 0 end) over(partition by id) no_nulls
from mytable t
) t
where no_nulls = 0
Note that this also excludes records that have no duplicate id but whose dod is null (you did not describe how to handle those).
You could also use not exists (which can conveniently be turned into a delete statement if needed):
select t.*
from mytable t
where not exists(select 1 from mytable t1 where t1.id = t.id and t1.dod is null)
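As a quick check of the three cases, this sketch runs the not exists filter on made-up rows with sqlite3. The table and column names follow the answer; the dates are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (id integer, dod text)")
conn.executemany(
    "insert into mytable values (?, ?)",
    [
        (1, "2020-01-01"), (1, "2020-02-01"),  # case 1: same id, both dod set
        (2, "2020-03-01"),                      # case 2: single id, dod set
        (3, "2020-04-01"), (3, None),           # case 3: same id, one dod null
    ],
)

# Keep a row only if no row with the same id has a null dod.
kept = conn.execute(
    """
    select t.id, t.dod
    from mytable t
    where not exists (
        select 1 from mytable t1 where t1.id = t.id and t1.dod is null
    )
    order by t.id, t.dod
    """
).fetchall()
conn.close()

print(kept)
```

Both rows for id 3 are dropped, including the one whose dod is set, because one null is enough to disqualify the whole id.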
I've got a table
CREATE TABLE Table1(
Id INT NOT NULL IDENTITY(1,1),
EvType INT NOT NULL,
CreatedByUserId INT NOT NULL
)
Initial data:
I want to get only the rows that meet the following condition: for each CreatedByUserId, select rows up to the first row with EvType == 200. So first we need to select the first row with EvType == 200 for each user, which I've done this way:
WITH EVS1 AS (
SELECT evs.Id, evs.EvType, evs.CreatedByUserId,
ROW_NUMBER() OVER (PARTITION BY evs.CreatedByUserId ORDER BY evs.CreatedDate DESC) as rk
FROM [dbo].Table1 evs)
select *
From EVS1
WHERE EVS1.rk=1
Which produces the following result:
And then I somehow need to select rows for each user until the Id is greater than that of the row from the CTE for that user. Is that possible?
So we need to retrieve rows from that table up to and including the 4th. Skip the 5th row because it comes after the user's first row with EvType 200.
Expected Result:
Find min(id) first and then the rows having a lower or equal id:
SELECT *
FROM EVS1
WHERE id <= (SELECT MIN(id) FROM EVS1 WHERE evType = 200)
I assume that you define the ordering according to the id attribute.
If you need to do this for each CreatedByUserId, use a correlated subquery to compute the minimal id:
SELECT *
FROM EVS1 e1
WHERE id <= (
SELECT MIN(id)
FROM EVS1 e2
WHERE e2.evType = 200
and e1.CreatedByUserId = e2.CreatedByUserId
)
I believe this solution will be faster than a window function on large data if you have an index:
CREATE INDEX ix_evs1_evType_CreatedByUserId ON evs1(evType, CreatedByUserId) INCLUDE(id)
You can do a window max:
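select Id, EvType, CreatedByUserId
from (
select
t.*,
-- look only at earlier rows, so the first EvType 200 row itself is kept
max(case when EvType = 200 then 1 else 0 end)
over(partition by CreatedByUserId order by Id
rows between unbounded preceding and 1 preceding) flagEvType
from [dbo].Table1 t
) t
where coalesce(flagEvType, 0) = 0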
You want to select all rows created by a user except for those where an event type 200 occurred before:
select *
from mytable t1
where not exists
(
select null
from mytable t2
where t2.evtype = 200
and t2.createdbyuserid = t1.createdbyuserid
and t2.id < t1.id
);
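Here is a minimal sketch of the not exists version run with sqlite3, to show which rows survive. The table and column names come from the question; the sample rows are invented, and user 3 deliberately has no EvType 200 row at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Table1 (Id integer primary key, EvType integer, "
    "CreatedByUserId integer)"
)
conn.executemany(
    "insert into Table1 values (?, ?, ?)",
    [
        (1, 100, 1), (2, 100, 2), (3, 200, 1), (4, 200, 2),
        (5, 100, 1), (6, 200, 2), (7, 100, 3),
    ],
)

# Keep a row unless the same user has an EvType 200 row with a smaller Id.
rows = conn.execute(
    """
    select t1.Id, t1.EvType, t1.CreatedByUserId
    from Table1 t1
    where not exists (
        select null
        from Table1 t2
        where t2.EvType = 200
          and t2.CreatedByUserId = t1.CreatedByUserId
          and t2.Id < t1.Id
    )
    order by t1.Id
    """
).fetchall()
conn.close()

print(rows)
```

Each user's first EvType 200 row is included, everything after it is dropped, and a user without any EvType 200 row keeps all rows. The `id <= (select min(id) ...)` variant behaves differently in that last case, since comparing against a NULL minimum filters those rows out.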
I have a problem where I need to get the last item across various tables in PostgreSQL.
The following code works and returns me the type of the latest update and when it was last updated.
The problem is, this query needs to be used as a subquery, so I want to select both the type and the last updated value from this query and PostgreSQL does not seem to like this... (Subquery must return only one column)
Any suggestions?
SELECT last.type, last.max FROM (
SELECT MAX(a.updated_at), 'a' AS type FROM table_a a WHERE a.ref = 5 UNION
SELECT MAX(b.updated_at), 'b' AS type FROM table_b b WHERE b.ref = 5
) AS last ORDER BY max LIMIT 1
The query is used like this inside a CTE:
WITH sql_query as (
SELECT id, name, address, (...other columns),
last.type, last.max FROM (
SELECT MAX(a.updated_at), 'a' AS type FROM table_a a WHERE a.ref = 5 UNION
SELECT MAX(b.updated_at), 'b' AS type FROM table_b b WHERE b.ref = 5
) AS last ORDER BY max LIMIT 1
FROM table_c
WHERE table_c.fk_id = 1
)
The inherent problem is that SQL (all SQL, not just Postgres) requires that a subquery used within a select clause return only a single value. If you think about that restriction for a while, it does make sense. The select clause returns rows with a certain number of columns, and each row/column location is a single position within a grid. You can bend that rule a bit by putting concatenations into a single position (or a single "complex type" like a JSON value), but it remains a single position in that grid regardless.
Here, however, you want 2 separate columns AND you need both to come from the same row, so instead of LIMIT 1 I suggest using ROW_NUMBER() to facilitate this:
WITH LastVals as (
SELECT type
, max_date
, row_number() over(order by max_date DESC NULLS LAST) as rn -- MAX() is NULL when a table has no matching rows
FROM (
SELECT MAX(a.updated_at) AS max_date, 'a' AS type FROM table_a a WHERE a.ref = 5
UNION ALL
SELECT MAX(b.updated_at) AS max_date, 'b' AS type FROM table_b b WHERE b.ref = 5
) u
)
, sql_query as (
SELECT id
, name, address, (...other columns)
, (select type from lastVals where rn = 1) as last_type
, (select max_date from lastVals where rn = 1) as last_date
FROM table_c
WHERE table_c.fk_id = 1
)
----
By the way, your subquery should use UNION ALL. Because type is a constant like 'a' or 'b', the rows would still be unique even if MAX(updated_at) were identical for 2 or more tables. UNION will attempt to remove duplicate rows, but here that cannot help, so avoid the wasted effort by using UNION ALL.
----
For another way to skin this cat, consider using a LEFT JOIN instead:
SELECT id
, name, address, (...other columns)
, LastVals.type
, LastVals.last_date
FROM table_c
LEFT JOIN (
SELECT type
, last_date
, row_number() over(order by last_date DESC NULLS LAST) as rn
FROM (
SELECT MAX(a.updated_at) AS last_date, 'a' AS type FROM table_a a WHERE a.ref = 5
UNION ALL
SELECT MAX(b.updated_at) AS last_date, 'b' AS type FROM table_b b WHERE b.ref = 5
) u
) LastVals ON LastVals.rn = 1
WHERE table_c.fk_id = 1
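To see the join-based shape return both columns from the same "latest" row, here is a rough sketch using sqlite3 rather than PostgreSQL. It assumes SQLite 3.25 or newer for row_number(), stores timestamps as ISO-8601 text so that string order matches date order, and uses invented rows; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table table_a (ref integer, updated_at text);
    create table table_b (ref integer, updated_at text);
    create table table_c (id integer, name text, fk_id integer);
    insert into table_a values (5, '2021-01-10'), (5, '2021-03-01');
    insert into table_b values (5, '2021-02-15'), (6, '2021-12-31');
    insert into table_c values (1, 'first', 1), (2, 'second', 1), (3, 'other', 2);
    """
)

# One row per source table, ranked newest first; rn = 1 is the latest update.
rows = conn.execute(
    """
    select c.id, c.name, last_vals.type, last_vals.last_date
    from table_c c
    left join (
        select type, last_date,
               row_number() over (order by last_date desc) as rn
        from (
            select max(updated_at) as last_date, 'a' as type
            from table_a where ref = 5
            union all
            select max(updated_at) as last_date, 'b' as type
            from table_b where ref = 5
        ) u
    ) last_vals on last_vals.rn = 1
    where c.fk_id = 1
    order by c.id
    """
).fetchall()
conn.close()

print(rows)
```

Every matching table_c row gets the same type and date attached, because the joined subquery contributes exactly one row.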
Since I am not that good with more complex SQL SELECT statements, I thought I'd just ask here, since it's hard to find something right on topic.
I have two tables that have exactly the same structure, like:
TABLE A (id (INT(11)), time (VARCHAR(10));)
TABLE B (id (INT(11)), time (VARCHAR(10));)
Now I want a single SELECT to count the entries of a specific id in both tables.
SELECT COUNT(*) FROM TABLE A WHERE id = '1';
SELECT COUNT(*) FROM TABLE B WHERE id = '1';
So I thought it would be much better for database performance if I used one SELECT instead of two.
Thanks for helping out
SELECT COUNT(*) as count, 'tableA' as table_name FROM TABLEA WHERE id = '1'
union all
SELECT COUNT(*), 'tableB' FROM TABLEB WHERE id = '1'
If you want the separate counts in a single row, you can use subqueries
SELECT
(SELECT COUNT(*) FROM TABLEA WHERE id = '1') a_count,
(SELECT COUNT(*) FROM TABLEB WHERE id = '1') b_count;
You could do it like:
select count(*)
from (
select id from t1 where id = 1
union all
select id from t2 where id = 1
) as t
Another alternative is:
select sum(cnt)
from (
select count(*) as cnt from t1 where id = 1
union all
select count(*) as cnt from t2 where id = 1
) as t
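For comparison, this sketch runs the per-table counts and the combined total side by side with sqlite3. The table names t1 and t2 follow the last two answers; the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table t1 (id integer, time text);
    create table t2 (id integer, time text);
    insert into t1 values (1, '10:00'), (1, '11:00'), (2, '12:00');
    insert into t2 values (1, '09:30'), (3, '13:00');
    """
)

# Separate counts in a single row, using scalar subqueries.
a_count, b_count = conn.execute(
    """
    select
        (select count(*) from t1 where id = 1) as a_count,
        (select count(*) from t2 where id = 1) as b_count
    """
).fetchone()

# One combined total, summing the per-table counts.
(total,) = conn.execute(
    """
    select sum(cnt)
    from (
        select count(*) as cnt from t1 where id = 1
        union all
        select count(*) as cnt from t2 where id = 1
    ) as t
    """
).fetchone()
conn.close()

print(a_count, b_count, total)
```

Which form to use depends on whether you need the counts separately or only their sum. Either way it is a single round trip to the database.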