SQL query that joins two tables and sum from second - sql

I have these tables with example data:
Calendar
ID | Date
---+-----------
1 | 2020-01-01
2 | 2020-01-02
3 | 2020-01-03
EmployeeTimeWorked
ID | Date | HoursWorked | UserID
---+------------+--------------+-------
1 | 2020-01-01 | 2 | 2
2 | 2020-01-01 | 4 | 2
I want to write an MS SQL query that shows the days the user has not worked, and how many hours they have left to work (they should work 8 hours per day), all within a time period, say a week.
The result should look like this:
EmployeeHaveNotWorked
Date | HoursLeftToWork
-----------+----------------
2020-01-01 | 2
Any idea how to make such a MS-SQL Query?

First get all users with all dates. This is done with a cross join. Seeing that you are using a UserID I suppose there is a users table. Otherwise get the users from the EmployeeTimeWorked table.
Then outer join the working times per user and date. This is a simple aggregation query.
Then subtract the worked hours from the required 8 hours.
select
u.userid,
c.date,
8 - coalesce(w.hours_worked, 0) as hours_left_to_work
from users u
cross join calendar c
left outer join
(
select userid, date, sum(hoursworked) as hours_worked
from employeetimeworked
group by userid, date
) w on w.userid = u.userid and w.date = c.date
order by u.userid, c.date;
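The three steps can be tried out on the question's sample data. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for SQL Server. The users are derived from EmployeeTimeWorked because no users table is shown, and the one-week date range is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Calendar (ID INTEGER, Date TEXT);
    CREATE TABLE EmployeeTimeWorked (ID INTEGER, Date TEXT, HoursWorked INTEGER, UserID INTEGER);
    INSERT INTO Calendar VALUES (1, '2020-01-01'), (2, '2020-01-02'), (3, '2020-01-03');
    INSERT INTO EmployeeTimeWorked VALUES (1, '2020-01-01', 2, 2), (2, '2020-01-01', 4, 2);
""")

# Step 1: users x dates. Step 2: outer join the per-day totals. Step 3: 8 - worked.
hours_left = conn.execute("""
    SELECT u.UserID, c.Date, 8 - COALESCE(w.hours_worked, 0) AS hours_left_to_work
    FROM (SELECT DISTINCT UserID FROM EmployeeTimeWorked) u
    CROSS JOIN Calendar c
    LEFT JOIN (
        SELECT UserID, Date, SUM(HoursWorked) AS hours_worked
        FROM EmployeeTimeWorked
        GROUP BY UserID, Date
    ) w ON w.UserID = u.UserID AND w.Date = c.Date
    WHERE c.Date BETWEEN '2020-01-01' AND '2020-01-07'  -- assumed period
    ORDER BY u.UserID, c.Date
""").fetchall()

for row in hours_left:
    print(row)
```

On the sample data this lists 2 hours left for 2020-01-01 and 8 hours left for the two days without any entries.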

Use a cross join to generate all possible user/date rows, left join the worked hours, and keep the rows where fewer than 8 hours were worked:
select u.userid, c.date,
8 - coalesce(sum(etw.HoursWorked), 0) as remaining_time
from calendar c cross join
(select distinct userid from EmployeeTimeWorked) u left join
EmployeeTimeWorked etw
on etw.userid = u.userid and etw.date = c.date
group by u.userid, c.date
having coalesce(sum(etw.HoursWorked), 0) < 8
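Note that a WHERE etw.userid IS NULL filter keeps only days with no rows at all, whereas a partially worked day such as 2020-01-01 in the question needs a HAVING condition on the summed hours. The sketch below contrasts the two on the sample data, again assuming SQLite via Python's sqlite3 as a stand-in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Calendar (ID INTEGER, Date TEXT);
    CREATE TABLE EmployeeTimeWorked (ID INTEGER, Date TEXT, HoursWorked INTEGER, UserID INTEGER);
    INSERT INTO Calendar VALUES (1, '2020-01-01'), (2, '2020-01-02'), (3, '2020-01-03');
    INSERT INTO EmployeeTimeWorked VALUES (1, '2020-01-01', 2, 2), (2, '2020-01-01', 4, 2);
""")

base = """
    FROM Calendar c
    CROSS JOIN (SELECT DISTINCT UserID FROM EmployeeTimeWorked) u
    LEFT JOIN EmployeeTimeWorked etw
        ON etw.UserID = u.UserID AND etw.Date = c.Date
"""

# Anti-join: only days on which the user logged nothing at all.
no_work_days = conn.execute(
    "SELECT u.UserID, c.Date" + base +
    "WHERE etw.UserID IS NULL ORDER BY u.UserID, c.Date"
).fetchall()

# Aggregate + HAVING: every day with fewer than 8 hours, including partial days.
short_days = conn.execute(
    "SELECT u.UserID, c.Date, 8 - COALESCE(SUM(etw.HoursWorked), 0)" + base +
    "GROUP BY u.UserID, c.Date "
    "HAVING COALESCE(SUM(etw.HoursWorked), 0) < 8 "
    "ORDER BY u.UserID, c.Date"
).fetchall()

print(no_work_days)
print(short_days)
```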

This query seems to have done it for me:
select * from (select c.Date, 8 - coalesce(sum(t.durationHours),0) hours_left_to_work
from Calendar c
left join TimeLog t on t.Date = c.Date
where c.date >= '2020-08-01' and c.date <= '2020-08-31'
group by c.Date) as q1
where q1.hours_left_to_work IS NOT NULL
AND q1.hours_left_to_work > 0;
(Here TimeLog is my EmployeeTimeWorked table.)

Related

SQL query to the required output

I need a desired output as below out of two input tables
order table
id order_date
1 2019-01-01
2 2019-01-01
3 2019-01-02
4 2019-01-03
5 2019-01-03
users table
id username
1 A
2 B
3 B
4 A
5 B
Desired Output
order_date username orders
2019-01-01 A 1
2019-01-02 A 0
2019-01-03 A 1
I tried with this query,
SELECT o.order_date as order_date, u.username as username,
ISNULL(COUNT(username), 0) AS orders
FROM Order O LEFT JOIN users U ON o.id = u.id
WHERE u.username = 'A'
GROUP BY o.order_date, u.username
ORDER BY o.order_date, u.username
Which give me this result
order_date username orders
2019-01-01 A 1
2019-01-03 A 1
I don't know how to get the row "2019-01-02 A 0" into the result.
Could anyone please help me with the query? Thanks in advance.
You can do:
select d.order_date, 'A' as username, coalesce(cnt, 0) as orders
from (select distinct order_date as order_date from orders) d
left join (
select o.order_date, count(*) as cnt
from orders o
join users u on u.id = o.id
where u.username = 'A'
group by o.order_date
) t on t.order_date = d.order_date
order by d.order_date
Result:
order_date username orders
----------- --------- ------
2019-01-01 A 1
2019-01-02 A 0
2019-01-03 A 1
See running example at db<>fiddle.
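For readers without access to the fiddle, the same idea can be reproduced locally. This is a minimal sketch, assuming SQLite via Python's sqlite3 (the question does not name its database) and a table called orders rather than order; the username is passed as a parameter instead of being hard-coded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, order_date TEXT);
    CREATE TABLE users (id INTEGER, username TEXT);
    INSERT INTO orders VALUES (1, '2019-01-01'), (2, '2019-01-01'), (3, '2019-01-02'),
                              (4, '2019-01-03'), (5, '2019-01-03');
    INSERT INTO users VALUES (1, 'A'), (2, 'B'), (3, 'B'), (4, 'A'), (5, 'B');
""")

username = "A"
# All distinct dates form the "spine"; the per-date counts for one user are
# outer-joined onto it, so dates without orders for that user show 0.
orders_per_day = conn.execute("""
    SELECT d.order_date, ? AS username, COALESCE(t.cnt, 0) AS orders
    FROM (SELECT DISTINCT order_date FROM orders) d
    LEFT JOIN (
        SELECT o.order_date, COUNT(*) AS cnt
        FROM orders o
        JOIN users u ON u.id = o.id
        WHERE u.username = ?
        GROUP BY o.order_date
    ) t ON t.order_date = d.order_date
    ORDER BY d.order_date
""", (username, username)).fetchall()

for row in orders_per_day:
    print(row)
```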
You can use the query below, in which includeAllUsers (built with a CROSS JOIN) lets you include 'A' without hard-coding it in the SELECT clause, and StrictMatching gives you the real dataset by matching IDs between the two tables Order and Users. (By the way, I really recommend renaming your order table to Orders or something similar, because ORDER is a reserved word.)
select includeAllUsers.Order_Date,
coalesce(StrictMatching.username,includeAllUsers.username) User_Name,
count(distinct StrictMatching.username) Total_Orders
from (select o.order_date, u.username, u.id from orders o cross join users u) includeAllUsers
left join (select o.order_date, u.username,o.id from orders o join users u on o.id=u.id) StrictMatching
on includeAllUsers.order_date = StrictMatching.order_date and StrictMatching.username='A'
where includeAllUsers.username='A'
group by
includeAllUsers.order_date, StrictMatching.username, includeAllUsers.username;
By combining includeAllUsers and StrictMatching, and filtering on StrictMatching.username='A' (in the JOIN condition) and again on includeAllUsers.username='A' (in the WHERE clause), you get the correct result.
Here is a link to verify

Postgres: Many to many joins creates double output

I've recently added a many to many JOIN to one of my queries to add a "tag" functionality. The many to many works great, however, it's now causing a previously working part of the query to output records twice.
SELECT v.*
FROM "Server" AS s
JOIN "Vote" AS v ON (s.id = v."serverId")
JOIN "_ServerToTag" st ON (s.id = st."A")
OFFSET 0 LIMIT 25;
id | createdAt | authorId | serverId
-----+-------------------------+----------+----------
190 | 2020-12-23 15:47:25.476 | 6667 | 3
190 | 2020-12-23 15:47:25.476 | 6667 | 3
194 | 2020-12-21 15:47:25.476 | 6667 | 3
194 | 2020-12-21 15:47:25.476 | 6667 | 3
In the example above:
Server is my main table which contains a bunch of entries. Think of it as Reddit Posts, they have a title, content and use the Vote table to count "upvotes".
id | title
----+-------------------------------
3 | test server 3
Vote is a really simple table: it contains a timestamp of the "upvote", who created it, and the Server.id it is assigned to.
_ServerToTag is a table that contains two columns A and B. It connects Server to another table which contains Tags.
A | B
---+---
3 | 1
3 | 2
The above is a much-simplified query; in reality, I am summing the outcome of the query to get the total number of votes.
The desired outcome would be that the results are not duplicated:
id | createdAt | authorId | serverId
-----+-------------------------+----------+----------
190 | 2020-12-23 15:47:25.476 | 6667 | 3
194 | 2020-12-21 15:47:25.476 | 6667 | 3
I'm really unsure why this is even happening so I have absolutely no idea how to fix it.
Any help would be greatly appreciated.
Edit:
DISTINCT works if I want to query the Vote table. But not in more complex queries. In my case it would look something more like this:
SELECT s.id, s.title, sum(case WHEN v."createdAt" >= '2020-12-01' AND v."createdAt" < '2021-01-01'
THEN 1 ELSE 0 END ) AS "voteCount"
FROM "Server" AS s
LEFT JOIN "Vote" AS v ON (s.id = v."serverId")
LEFT JOIN "_ServerToTag" st ON (s.id = st."A")
GROUP BY s.id, s.title;
id | title | voteCount
----+-------------------------------+-----------
3 | test server 3 | 4
In the above, I only need the voteCount column to be DISTINCT.
SELECT s.id, s.title, sum(DISTINCT case WHEN v."createdAt" >= '2020-12-01' AND v."createdAt" < '2021-01-01'
THEN 1 ELSE 0 END ) AS "voteCount"
FROM "Server" AS s
LEFT JOIN "Vote" AS v ON (s.id = v."serverId")
LEFT JOIN "_ServerToTag" st ON (s.id = st."A")
GROUP BY s.id, s.title;
id | title | voteCount
----+-------------------------------+-----------
3 | test server 3 | 1
The above kind of works, but it seems to only count one vote even if there are multiple.
It appears that the problem is that you added the join to _ServerToTag. Because there are multiple rows in _ServerToTag for each row in Server the query returns multiple rows for each server, one for each matching row in _ServerToTag.
It appears that _ServerToTag was added to the query so it will only include servers which have tags. If that's your intent you can use:
SELECT v.id, v."authorId", v."serverId", COUNT(DISTINCT v."createdAt") AS TOTAL_VOTES
FROM "Server" AS s
INNER JOIN "Vote" AS v
ON s.id = v."serverId"
INNER JOIN (SELECT DISTINCT "A" FROM "_ServerToTag") st
ON s.id = st."A"
WHERE v."createdAt" >= '2020-12-01' AND
v."createdAt" < '2021-01-01'
GROUP BY v.id, v."authorId", v."serverId"
OFFSET 0 LIMIT 25
or
SELECT v.id, v."authorId", v."serverId", COUNT(DISTINCT v."createdAt") AS TOTAL_VOTES
FROM "Server" AS s
INNER JOIN "Vote" AS v
ON s.id = v."serverId"
WHERE v."createdAt" >= '2020-12-01' AND
v."createdAt" < '2021-01-01' AND
s.id IN (SELECT "A" FROM "_ServerToTag")
GROUP BY v.id, v."authorId", v."serverId"
OFFSET 0 LIMIT 25
which may communicate the intent of the query a bit better.
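The row multiplication and the effect of filtering with IN instead of joining can be reproduced on the question's sample rows. This is a minimal sketch, assuming SQLite via Python's sqlite3 as a stand-in for Postgres; the column types are guesses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Server" (id INTEGER, title TEXT);
    CREATE TABLE "Vote" (id INTEGER, "createdAt" TEXT, "authorId" INTEGER, "serverId" INTEGER);
    CREATE TABLE "_ServerToTag" ("A" INTEGER, "B" INTEGER);
    INSERT INTO "Server" VALUES (3, 'test server 3');
    INSERT INTO "Vote" VALUES (190, '2020-12-23 15:47:25.476', 6667, 3),
                              (194, '2020-12-21 15:47:25.476', 6667, 3);
    INSERT INTO "_ServerToTag" VALUES (3, 1), (3, 2);
""")

# Joining the tag link table repeats each vote once per tag of the server.
joined = conn.execute("""
    SELECT v.id
    FROM "Server" AS s
    JOIN "Vote" AS v ON s.id = v."serverId"
    JOIN "_ServerToTag" AS st ON s.id = st."A"
    ORDER BY v.id
""").fetchall()

# Using the link table only as a filter leaves one row per vote.
filtered = conn.execute("""
    SELECT v.id
    FROM "Server" AS s
    JOIN "Vote" AS v ON s.id = v."serverId"
    WHERE s.id IN (SELECT "A" FROM "_ServerToTag")
    ORDER BY v.id
""").fetchall()

print(joined)    # each vote appears twice (two tags)
print(filtered)  # each vote appears once
```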
EDIT
If you want to be able to count entries which have no votes, you'll need to use an outer join to pull in the (potentially non-existent) votes, keep the date filter in the ON clause so that servers without matching votes are not discarded, and then use a CASE expression to only count votes if they exist:
SELECT s.id, v.id, v."authorId", v."serverId",
CASE
WHEN v.id IS NULL THEN 0
ELSE COUNT(DISTINCT v."createdAt")
END AS TOTAL_VOTES
FROM "Server" AS s
LEFT OUTER JOIN "Vote" AS v
ON s.id = v."serverId" AND
v."createdAt" >= '2020-12-01' AND
v."createdAt" < '2021-01-01'
WHERE s.id IN (SELECT "A" FROM "_ServerToTag")
GROUP BY s.id, v.id, v."authorId", v."serverId"
OFFSET 0 LIMIT 25
You may not actually need that though - you may be able to get away with
SELECT s.id, v.id, v."authorId", v."serverId",
COUNT(DISTINCT v."createdAt") AS TOTAL_VOTES
FROM "Server" AS s
LEFT OUTER JOIN "Vote" AS v
ON s.id = v."serverId" AND
v."createdAt" >= '2020-12-01' AND
v."createdAt" < '2021-01-01'
WHERE s.id IN (SELECT "A" FROM "_ServerToTag")
GROUP BY s.id, v.id, v."authorId", v."serverId"
OFFSET 0 LIMIT 25
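One point worth checking with these outer-join queries is where the date filter sits: a condition on the outer-joined table in WHERE removes the NULL-extended rows again, while the same condition in ON keeps servers without votes. A minimal sketch, assuming SQLite via Python's sqlite3 as a stand-in for Postgres; server 4 is a hypothetical row added here to have a server with no votes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Server" (id INTEGER, title TEXT);
    CREATE TABLE "Vote" (id INTEGER, "createdAt" TEXT, "authorId" INTEGER, "serverId" INTEGER);
    INSERT INTO "Server" VALUES (3, 'test server 3'), (4, 'test server 4');  -- 4 is hypothetical
    INSERT INTO "Vote" VALUES (190, '2020-12-23 15:47:25.476', 6667, 3),
                              (194, '2020-12-21 15:47:25.476', 6667, 3);
""")

# Date filter in WHERE: the server without votes is filtered out.
where_rows = conn.execute("""
    SELECT s.id, COUNT(v.id)
    FROM "Server" AS s
    LEFT JOIN "Vote" AS v ON s.id = v."serverId"
    WHERE v."createdAt" >= '2020-12-01' AND v."createdAt" < '2021-01-01'
    GROUP BY s.id
    ORDER BY s.id
""").fetchall()

# Date filter in ON: the server without votes stays, with a count of 0.
on_rows = conn.execute("""
    SELECT s.id, COUNT(v.id)
    FROM "Server" AS s
    LEFT JOIN "Vote" AS v
        ON s.id = v."serverId"
       AND v."createdAt" >= '2020-12-01' AND v."createdAt" < '2021-01-01'
    GROUP BY s.id
    ORDER BY s.id
""").fetchall()

print(where_rows)
print(on_rows)
```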
Okay so I went and asked a friend for help after not really being able to fix my problem with the answers I received.
I think my query was just too complex and confusing, and my friend suggested using subqueries to make it less complicated and easier to manage.
My query now looks like this:
SELECT
s.id
, s.title
, COALESCE(v."VOTES", 0) AS "voteCount"
FROM "Server" AS s
-- Join tags
INNER JOIN
(
SELECT
st."A"
, json_agg(
json_build_object(
'id',
t.id,
'tagName',
t."tagName"
)
) as "tagsArray"
FROM
"_ServerToTag" AS st
INNER JOIN
"Tag" AS t
ON
t.id = st."B"
GROUP BY
st."A"
) AS tag
ON
tag."A" = s.id
-- Count votes
LEFT JOIN
(
SELECT
"serverId"
, COUNT(*) AS "VOTES"
FROM
"Vote" as v
WHERE
v."createdAt" >= '2020-12-01' AND
v."createdAt" < '2021-01-01'
GROUP BY "serverId"
) as v
ON
s.id = v."serverId"
OFFSET 0 LIMIT 25;
This works exactly the same way but by selecting what I need directly in the joins it's more readable and I have more control over the data I get back.
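The pre-aggregation pattern can be checked in isolation: each subquery collapses to one row per server before it is joined, so the tags cannot multiply the votes. The sketch below assumes SQLite via Python's sqlite3 as a stand-in for Postgres, leaves out the json_agg part, and adds hypothetical servers 4 and 5 plus a November vote to exercise the edge cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Server" (id INTEGER, title TEXT);
    CREATE TABLE "Vote" (id INTEGER, "createdAt" TEXT, "authorId" INTEGER, "serverId" INTEGER);
    CREATE TABLE "_ServerToTag" ("A" INTEGER, "B" INTEGER);
    -- Servers 4 and 5 and vote 200 are hypothetical additions.
    INSERT INTO "Server" VALUES (3, 'test server 3'), (4, 'test server 4'), (5, 'test server 5');
    INSERT INTO "_ServerToTag" VALUES (3, 1), (3, 2), (4, 1);
    INSERT INTO "Vote" VALUES (190, '2020-12-23 15:47:25.476', 6667, 3),
                              (194, '2020-12-21 15:47:25.476', 6667, 3),
                              (200, '2020-11-30 10:00:00.000', 6667, 3);
""")

vote_counts = conn.execute("""
    SELECT s.id, s.title, COALESCE(v.votes, 0) AS "voteCount"
    FROM "Server" AS s
    -- One row per tagged server, however many tags it has.
    INNER JOIN (
        SELECT "A" FROM "_ServerToTag" GROUP BY "A"
    ) AS tag ON tag."A" = s.id
    -- One row per server with its December vote count.
    LEFT JOIN (
        SELECT "serverId", COUNT(*) AS votes
        FROM "Vote"
        WHERE "createdAt" >= '2020-12-01' AND "createdAt" < '2021-01-01'
        GROUP BY "serverId"
    ) AS v ON v."serverId" = s.id
    ORDER BY s.id
""").fetchall()

for row in vote_counts:
    print(row)
```

Server 3 is counted with 2 votes despite having two tags, server 4 appears with 0, and the untagged server 5 is excluded by the inner join.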

Find the rows for customers that joined during a specific time period in another table

I have the 2 following tables:
customer_transaction
customer_id| event_name | event_date
1 | joined_rewards|2019-07-10
12 | joined_rewards|2018-07-10
17 | joined_rewards|2009-07-10
visit
customer_id| visit_start| visit_end|visit_type
1 | 2019-07-09|2019-07-11| IP
12 | 2018-06-11|2018-07-12| IP
17 | 2009-07-08|2009-07-10| EP
I want to know all the customers in the customer_transaction table that joined the rewards program between their visits of visit_type = IP. So for all the visit_types = IP, I want to know the customers who joined the rewards program during the frame of their visit period.
In this example, my new table would have customer ids 1 and 12.
I tried
SELECT DISTINCT customer_id, event_date
INTO visit_rewards
FROM customer_transaction
WHERE event_date BETWEEN (Select customer_id, visit_start, visit_end from visit)
Click: demo:db<>fiddle
SELECT
v.customer_id
FROM
visit v
JOIN customer_transaction ct
ON ct.customer_id = v.customer_id
AND ct.event_date BETWEEN v.visit_start AND v.visit_end
AND v.visit_type = 'IP'
If there can be multiple rows per customer in the visit table (as suggested by the name), I would suggest exists:
select ct.*
from customer_transaction ct
where ct.event_name = 'joined_rewards' and
exists (select 1
from visit v
where v.customer_id = ct.customer_id and
v.visit_start <= ct.event_date and
v.visit_end >= ct.event_date and
v.visit_type = 'IP'
);
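The EXISTS form can be run against the question's sample rows. This is a minimal sketch, assuming SQLite via Python's sqlite3 (the question does not name its database), ISO-formatted date strings, and the table name visit from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_transaction (customer_id INTEGER, event_name TEXT, event_date TEXT);
    CREATE TABLE visit (customer_id INTEGER, visit_start TEXT, visit_end TEXT, visit_type TEXT);
    INSERT INTO customer_transaction VALUES
        (1, 'joined_rewards', '2019-07-10'),
        (12, 'joined_rewards', '2018-07-10'),
        (17, 'joined_rewards', '2009-07-10');
    INSERT INTO visit VALUES
        (1, '2019-07-09', '2019-07-11', 'IP'),
        (12, '2018-06-11', '2018-07-12', 'IP'),
        (17, '2009-07-08', '2009-07-10', 'EP');
""")

# EXISTS returns each transaction at most once, even if several visits match.
joined_during_ip_visit = [row[0] for row in conn.execute("""
    SELECT ct.customer_id
    FROM customer_transaction ct
    WHERE ct.event_name = 'joined_rewards'
      AND EXISTS (
          SELECT 1
          FROM visit v
          WHERE v.customer_id = ct.customer_id
            AND v.visit_type = 'IP'
            AND ct.event_date BETWEEN v.visit_start AND v.visit_end
      )
    ORDER BY ct.customer_id
""")]

print(joined_during_ip_visit)
```

Customer 17 is left out because the matching visit has type EP rather than IP.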

Count how many users have both month A and B

I want to count how many users have both january and february months. I have a users table with this structure and data:
id | 1
user | u1
month | january
id | 2
user | u1
month | february
id | 3
user | u2
month | january
In my example the response would be 1.
I've tried doing SELECT COUNT(*) FROM (SELECT * FROM users WHERE users.month = 'january') s1 LEFT JOIN users s2 ON s1.user = s2.user AND s2.month = 'february';
In my actual data set this SELECT COUNT(*) FROM users WHERE users.month = 'january' returns about 100 so the overall selection can not possibly be larger than this result, yet the result is way higher.
I'm sure the answer is very simple, but I'm not very proficient in SQL, so I just don't know which part of the documentation I should be reading.
You can use aggregation with a HAVING clause:
select count(*)
from (select t.user
from t
where t.month in ('january', 'february')
group by t.user
having count(distinct t.month) = 2
) t;
If there is at most one row per user per month, then a join might have better performance:
select count(*)
from t tj join
t tf
on tj.user = tf.user and
tj.month = 'january' and
tf.month = 'february';
If you can have duplicates, then count(distinct user) is needed.
With EXISTS:
select count(distinct user)
from tablename t
where t.month in ('january', 'february')
and exists (
select 1 from tablename where user = t.user and month > t.month
)
See the demo.
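To see both the aggregation approach and why the LEFT JOIN from the question overcounts, the sample rows can be loaded into a scratch database. This is a minimal sketch, assuming SQLite via Python's sqlite3; the user column is quoted because the name is reserved in several databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER, "user" TEXT, month TEXT);
    INSERT INTO users VALUES (1, 'u1', 'january'), (2, 'u1', 'february'), (3, 'u2', 'january');
""")

# Users having rows for both months: group per user, require two distinct months.
both_months = conn.execute("""
    SELECT COUNT(*)
    FROM (
        SELECT "user"
        FROM users
        WHERE month IN ('january', 'february')
        GROUP BY "user"
        HAVING COUNT(DISTINCT month) = 2
    ) AS matched
""").fetchone()[0]

# The question's LEFT JOIN keeps every January row, matched or not
# (and repeats it once per matching February row).
left_join_count = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT * FROM users WHERE month = 'january') s1
    LEFT JOIN users s2 ON s1."user" = s2."user" AND s2.month = 'february'
""").fetchone()[0]

print(both_months, left_join_count)
```

The LEFT JOIN count can therefore never be lower than the number of January rows, and duplicate February rows per user push it higher.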

Joining and Grouping data from 3 tables

I have three tables
Category
CategorySerno | CategoryName
1 One
2 Two
3 Three
Status
StatusSerno | Status
1 Active
2 Pending
Data
CatId |Status | Date
1 1 2014-07-26 11:30:09.693
2 2 2014-07-25 17:30:09.693
1 1 2014-07-25 17:30:09.693
1 2 2014-07-25 17:30:09.693
When I join them, I need only the row with the latest Date for each category, like:
One Active 2014-07-26 11:30:09.693
Two Inactive 2014-07-25 17:30:09.693
Three Null Null
When I join and group them, I get:
One Active 2014-07-26 11:30:09.693
One Active 2014-07-26 11:30:09.693
One Active 2014-07-26 11:30:09.693
Two Inactive 2014-07-25 17:30:09.693
Three Null Null
You could use ROW_NUMBER in a CTE:
WITH CTE AS
(
SELECT c.CategoryName,
s.Status,
d.Date,
dateNum = ROW_NUMBER() OVER (PARTITION BY c.CategorySerno
ORDER BY d.Date DESC)
FROM Category c
LEFT OUTER JOIN Data d
ON c.CategorySerno = d.CatId
LEFT OUTER JOIN Status s
ON d.Status = s.StatusSerno
)
SELECT CategoryName, Status, Date
FROM CTE
WHERE dateNum = 1
Demo-Fiddle
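The ROW_NUMBER approach can be tried on the question's rows. The sketch below assumes SQLite 3.25 or newer (for window functions) via Python's sqlite3 as a stand-in for SQL Server, and it partitions by category only, on the assumption that the desired output has one row per category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Category (CategorySerno INTEGER, CategoryName TEXT);
    CREATE TABLE Status (StatusSerno INTEGER, Status TEXT);
    CREATE TABLE Data (CatId INTEGER, Status INTEGER, Date TEXT);
    INSERT INTO Category VALUES (1, 'One'), (2, 'Two'), (3, 'Three');
    INSERT INTO Status VALUES (1, 'Active'), (2, 'Pending');
    INSERT INTO Data VALUES (1, 1, '2014-07-26 11:30:09.693'),
                            (2, 2, '2014-07-25 17:30:09.693'),
                            (1, 1, '2014-07-25 17:30:09.693'),
                            (1, 2, '2014-07-25 17:30:09.693');
""")

# Number the rows of each category from newest to oldest and keep number 1.
latest_per_category = conn.execute("""
    WITH ranked AS (
        SELECT c.CategorySerno, c.CategoryName, s.Status, d.Date,
               ROW_NUMBER() OVER (PARTITION BY c.CategorySerno
                                  ORDER BY d.Date DESC) AS rn
        FROM Category c
        LEFT JOIN Data d ON c.CategorySerno = d.CatId
        LEFT JOIN Status s ON d.Status = s.StatusSerno
    )
    SELECT CategoryName, Status, Date
    FROM ranked
    WHERE rn = 1
    ORDER BY CategorySerno
""").fetchall()

for row in latest_per_category:
    print(row)
```

Category Three has no Data rows, so the outer joins leave its status and date as NULL (None in Python).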
SELECT CategoryName, Status.Status, MAX(Data.Date) FROM Category
LEFT OUTER JOIN Data ON CategorySerno = CatId
LEFT OUTER JOIN Status ON Data.Status = Status.StatusSerno
GROUP BY CategoryName, Status.Status
You probably have a mismatch between the SELECT and GROUP BY columns, which causes the duplicates.
Try this:
SELECT Category.CategoryName, Status.Status, MAX(Data.Date) Data
FROM Data
LEFT JOIN Category ON Category.CategorySerno = Data.CatId
LEFT JOIN Status ON Status.StatusSerno = Data.Status
GROUP BY Category.CategoryName, Status.Status
You don't mention the RDBMS you're working in but you might try starting with something like:
SELECT
c.CategoryName
, s.Status
, d.Date
FROM
Data d
LEFT OUTER JOIN Category c ON d.CatId = c.CategorySerno
LEFT OUTER JOIN Status s ON d.Status = s.StatusSerno
WHERE
d.date=(
SELECT
max(dd.date)
FROM
Data dd
WHERE
d.CatId = dd.CatId
AND
d.Status = dd.Status
)
To make this more maintainable in the long run, consider using a convention to identify the primary keys in any table, e.g., table_name_id, and use this same convention for foreign keys. Employing this convention addresses questions like: is a "CategorySerno" a "CatId"?