SQL Server: select from multiple tables

Table Accounts:
+----+------+----------+
| ID | Nick | Dono_CID |
+----+------+----------+
| 2 | Bart | 3 |
+----+------+----------+
Table Logins:
+------------+------------+
| Jogador_ID | TS_Logou |
+------------+------------+
| 2 | 1590116475 |
| 2 | 1590118258 |
+------------+------------+
In short, I want to find out whether there is a row with TS_Logou smaller than the timestamp of one month ago, and where Dono_CID != -1.
Note: Accounts.ID = Logins.Jogador_ID
Note 2: There are multiple records in the Logins table. I want to select the last one, in DESC order.
My attempt:
SELECT
ct.Nick,
ct.Dono_CID
FROM
Contas AS ct
INNER JOIN
Logins AS lg ON lg.Jogador_ID = ct.ID
WHERE
ct.Dono_CID != -1
AND lg.TS_Logou < 1587524400
GROUP BY
lg.Jogador_ID
ORDER BY
lg.TS_Logou DESC
LIMIT 1

From your attempt, I understand that TS_Logou < 1587524400 means older than one month.
This selects the login with the maximum TS_Logou that satisfies the filter condition:
SELECT TOP 1 a.ID, a.Nick, a.Dono_CID
FROM Logins AS l
INNER JOIN Accounts AS a
ON a.ID = l.Jogador_ID
WHERE a.Dono_CID <> -1
AND l.TS_Logou < 1587524400
ORDER BY l.TS_Logou DESC

Here I select the maximum TS_Logou from Logins for each user and join that result with the Accounts table. This works for me:
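SELECT
ac.Nick,
ac.Dono_CID
FROM
Accounts AS ac
INNER JOIN
(SELECT l.Jogador_ID, MAX(l.TS_Logou) AS TS_Logou FROM Logins AS l
-- TS_Logou is a Unix timestamp, so compare it with the epoch seconds of one month ago
WHERE l.TS_Logou < DATEDIFF(SECOND, '1970-01-01', DATEADD(MONTH, -1, GETUTCDATE()))
GROUP BY l.Jogador_ID) AS lg
ON lg.Jogador_ID = ac.ID
WHERE
ac.Dono_CID <> -1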
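Both answers above filter the login rows first and then pick the latest remaining one. If the intent is instead that an account's most recent login is itself older than the cutoff, the filter belongs after the aggregation. Here is a minimal sketch of that reading, using Python's sqlite3 module as a stand-in engine (SQLite, not SQL Server, so only portable syntax is used). The function name inactive_accounts and the second cutoff value are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Accounts (ID INTEGER, Nick TEXT, Dono_CID INTEGER);
    CREATE TABLE Logins (Jogador_ID INTEGER, TS_Logou INTEGER);
    INSERT INTO Accounts VALUES (2, 'Bart', 3);
    INSERT INTO Logins VALUES (2, 1590116475), (2, 1590118258);
""")


def inactive_accounts(cutoff):
    """Owned accounts whose most recent login is older than `cutoff`."""
    return conn.execute("""
        SELECT a.Nick, a.Dono_CID, MAX(l.TS_Logou) AS last_login
        FROM Accounts AS a
        JOIN Logins AS l ON l.Jogador_ID = a.ID
        WHERE a.Dono_CID <> -1
        GROUP BY a.ID, a.Nick, a.Dono_CID
        -- HAVING filters on the aggregated (latest) login, not on each row
        HAVING MAX(l.TS_Logou) < ?
        ORDER BY last_login DESC
    """, (cutoff,)).fetchall()


print(inactive_accounts(1587524400))  # cutoff taken from the question
print(inactive_accounts(1600000000))  # hypothetical later cutoff
```

With the sample rows, Bart's latest login is newer than the question's cutoff, so the first call returns nothing; only the hypothetical later cutoff reports him.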

Related

SUM CASE when DISTINCT?

Joining two tables and grouping, we're trying to get the sum of a user's value but only include a user's value once if that user is represented in a grouping multiple times.
Some sample tables:
user table:
| id | net_worth |
------------------
| 1 | 100 |
| 2 | 1000 |
visit table:
| id | location | user_id |
-----------------------------
| 1 | mcdonalds | 1 |
| 2 | mcdonalds | 1 |
| 3 | mcdonalds | 2 |
| 4 | subway | 1 |
We want to find the total net worth of users visiting each location. User 1 visited McDonalds twice, but we don't want to double count their net worth. Ideally we can use a SUM but only add in the net worth value if that user hasn't already been counted for at that location. Something like this:
-- NOTE: Hypothetical query
SELECT
location,
SUM(CASE WHEN DISTINCT user.id then user.net_worth ELSE 0 END) as total_net_worth
FROM visit
JOIN user on user.id = visit.user_id
GROUP BY 1;
The ideal output being:
| location | total_net_worth |
-------------------------------
| mcdonalds | 1100 |
| subway | 100 |
This particular database is Redshift/PostgreSQL, but it would be interesting if there is a generic SQL solution. Is something like the above possible?
You don't want to consider duplicate entries in the visits table. So, select distinct rows from the table instead.
SELECT
v.location,
SUM(u.net_worth) as total_net_worth
FROM (SELECT DISTINCT location, user_id FROM visit) v
JOIN "user" u ON u.id = v.user_id
GROUP BY v.location
ORDER BY v.location;
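The effect of de-duplicating the visits before joining can be checked on the sample rows from the question. This is a small sketch using Python's sqlite3 module as a stand-in for Redshift/PostgreSQL; the variable names naive and deduped are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user" (id INTEGER, net_worth INTEGER);
    CREATE TABLE visit (id INTEGER, location TEXT, user_id INTEGER);
    INSERT INTO "user" VALUES (1, 100), (2, 1000);
    INSERT INTO visit VALUES
        (1, 'mcdonalds', 1), (2, 'mcdonalds', 1),
        (3, 'mcdonalds', 2), (4, 'subway', 1);
""")

# Plain join: user 1 is counted once per visit row.
naive = conn.execute("""
    SELECT v.location, SUM(u.net_worth)
    FROM visit v
    JOIN "user" u ON u.id = v.user_id
    GROUP BY v.location
    ORDER BY v.location
""").fetchall()

# De-duplicated join: each (location, user_id) pair contributes once.
deduped = conn.execute("""
    SELECT v.location, SUM(u.net_worth)
    FROM (SELECT DISTINCT location, user_id FROM visit) v
    JOIN "user" u ON u.id = v.user_id
    GROUP BY v.location
    ORDER BY v.location
""").fetchall()

print(naive)
print(deduped)
```

The plain join reports 1200 for mcdonalds because user 1 appears twice there, while the de-duplicated version gives the 1100 the question asks for.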
You can use a window function to get the unique users, then join that to the user table:
select v.location, sum(u.net_worth)
from "user" u
join (
select location, user_id,
row_number() over (partition by location, user_id order by id) as rn
from visit
) v on v.user_id = u.id and v.rn = 1
group by v.location;
The above is standard ANSI SQL; in Postgres this can also be expressed using distinct on ():
select v.location, sum(u.net_worth)
from "user" u
join (
select distinct on (user_id, location) *
from visit
order by user_id, location, id
) v on v.user_id = u.id
group by v.location;
You can join the user table with the distinct combinations of location and user_id, as in the generic SQL below.
SELECT v.location, SUM(u.net_worth)
FROM (SELECT location, user_id FROM visit GROUP BY location, user_id) v
JOIN "user" u ON u.id = v.user_id
GROUP BY v.location;

SQL Performance Inner Join

Let me ask you something I've been thinking about for a while. Imagine that you have two tables with data:
MAIN TABLE (A)
| ID | Date |
|:-----------|------------:|
| 1 | 01-01-1990|
| 2 | 01-01-1991|
| 3 | 01-01-1992|
| 4 | 01-01-2000|
| 5 | 01-01-2001|
| 6 | 01-01-2003|
SECONDARY TABLE (B)
| ID | Date | TOTAL |
|:-----------|------------:|--------:|
| 1 | 01-01-1990| 1 |
| 2 | 01-01-1991| 2 |
| 3 | 01-01-1992| 1 |
| 4 | 01-01-2000| 5 |
| 5 | 01-01-2001| 8 |
| 6 | 01-01-2003| 7 |
and you want to select only the IDs with a date later than 31-12-1999 and return the columns ID, Date and Total. There are many ways to do that, but my question is which of the following performs better:
OPTION 1
With main as(
select id,
date
from A
where date > '31-12-1999'
)
select main.id,
main.date,
B.total
from main inner join B on main.id = b.id
OPTION 2
With main as(
select id,
date
from A
where date > '31-12-1999'
),
secondary as (
select id,
total
from B
where date > '31-12-1999'
)
select main.id,
main.date,
secondary.total
from main inner join secondary on main.id = secondary.id
Which of the two queries performs better? Thanks in advance!
Note: Date means the same thing in both tables.
You don't need a CTE; you can join the two tables directly:
select A.id,
A.date,
B.total
from A inner join B on A.id = b.id
where A.date > '31-12-1999'
You would need to test on your data. But there is really no need for CTEs:
select a.id, a.date, b.total
from a inner join
b
on a.id = b.id
where a.date > '1999-12-31' and b.date > '1999-12-31';
As for your specific question, the two queries are not the same, because the first is filtering on only one date and the second is filtering on two dates. You should run the query that implements the logic that you intend.
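To see why the two options are not interchangeable in general, here is a small sketch using Python's sqlite3 module as a stand-in engine. The data is hypothetical: row 5 in B deliberately carries an older date than row 5 in A, which the asker says does not happen in their tables. ISO-formatted date strings are assumed so that string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER, date TEXT);
    CREATE TABLE B (id INTEGER, date TEXT, total INTEGER);
    INSERT INTO A VALUES (4, '2000-01-01'), (5, '2001-01-01');
    -- Hypothetical mismatch: B's row 5 is dated before the cutoff.
    INSERT INTO B VALUES (4, '2000-01-01', 5), (5, '1999-06-01', 8);
""")

# Option 1 logic: filter on A.date only.
filter_a_only = conn.execute("""
    SELECT A.id, B.total
    FROM A JOIN B ON A.id = B.id
    WHERE A.date > '1999-12-31'
    ORDER BY A.id
""").fetchall()

# Option 2 logic: filter on both A.date and B.date.
filter_both = conn.execute("""
    SELECT A.id, B.total
    FROM A JOIN B ON A.id = B.id
    WHERE A.date > '1999-12-31' AND B.date > '1999-12-31'
    ORDER BY A.id
""").fetchall()

print(filter_a_only)
print(filter_both)
```

When the dates always agree, as in the question, both queries return the same rows, and the choice comes down to measuring each one on the real data.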

GROUP BY with SUM without removing empty (null) values

TABLES:
Players
player_no | transaction_id
----------------------------
1 | 11
2 | 22
3 | (null)
1 | 33
Transactions
id | value |
-----------------------
11 | 5
22 | 10
33 | 2
My goal is to fetch all the data and keep every player, even those with null values, using the following query:
SELECT p.player_no, COUNT(p.player_no), SUM(t.value) FROM Players p
INNER JOIN Transactions t ON p.transaction_id = t.id
GROUP BY p.player_no
Nevertheless, the results omit the player with the null value, for example:
player_no | count | sum
------------------------
1 | 2 | 7
2 | 1 | 10
What I would like is for the empty value to be included:
player_no | count | sum
------------------------
1 | 2 | 7
2 | 1 | 10
3 | 0 | 0
What do I miss here?
Actually I use QueryDSL for that, but translated example into pure SQL since it behaves in the same manner.
Use LEFT JOIN and the COALESCE function:
SELECT p.player_no, COUNT(p.player_no), coalesce(SUM(t.value),0)
FROM Players p
LEFT JOIN Transactions t ON p.transaction_id = t.id
GROUP BY p.player_no
Change your JOIN to a LEFT JOIN, then add IFNULL(value, 0) in your SUM().
A LEFT JOIN keeps all the rows in the left table:
SELECT p.player_no
, COUNT(*) as count
, SUM(isnull(t.value,0))
FROM Players p
LEFT JOIN Transactions t
ON p.transaction_id = t.id
GROUP BY p.player_no
You might be looking for count(t.value) rather than count(*)
I'm just offering this so you have a correct answer:
SELECT p.player_no, COUNT(t.id) as [count], COALESCE(SUM(t.value), 0) as [sum]
FROM Players p LEFT JOIN
Transactions t
ON p.transaction_id = t.id
GROUP BY p.player_no;
You need to pay attention to the aggregation functions as well as the JOIN.
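The difference between COUNT(*) and COUNT(t.id) after a LEFT JOIN can be checked on the sample rows from the question. This is a minimal sketch using Python's sqlite3 module as a stand-in engine; the variable names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Players (player_no INTEGER, transaction_id INTEGER);
    CREATE TABLE Transactions (id INTEGER, value INTEGER);
    INSERT INTO Players VALUES (1, 11), (2, 22), (3, NULL), (1, 33);
    INSERT INTO Transactions VALUES (11, 5), (22, 10), (33, 2);
""")

# COUNT(*) counts joined rows, including the NULL-extended row of player 3.
count_star = conn.execute("""
    SELECT p.player_no, COUNT(*), COALESCE(SUM(t.value), 0)
    FROM Players p
    LEFT JOIN Transactions t ON p.transaction_id = t.id
    GROUP BY p.player_no
    ORDER BY p.player_no
""").fetchall()

# COUNT(t.id) skips NULLs, so a player without transactions gets 0.
count_matches = conn.execute("""
    SELECT p.player_no, COUNT(t.id), COALESCE(SUM(t.value), 0)
    FROM Players p
    LEFT JOIN Transactions t ON p.transaction_id = t.id
    GROUP BY p.player_no
    ORDER BY p.player_no
""").fetchall()

print(count_star)
print(count_matches)
```

Only the COUNT(t.id) variant reproduces the desired row 3 | 0 | 0; COUNT(*) reports a count of 1 for player 3.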
Please try this:
SELECT P.player_no,
COUNT(*) as count,
SUM(isnull(T.value,0))
FROM Players P
LEFT JOIN Transactions T
ON P.transaction_id = T.id
GROUP BY P.player_no
Hope this helps.

Select distinct where date is max

This feels really stupid to ask, but I can't work out this selection in SQL Server Compact (CE).
If I have two tables like this:
Statuses
id | status | thedate
---------------------------
0 | Single | 2014-01-01
0 | Engaged | 2014-01-02
1 | Single | 2014-01-03
0 | Divorced | 2014-01-04
Users
id | name
-----------
0 | Lisa
1 | John
How can I now select the latest status for each person in Statuses?
The result should be:
Id | Name | Date | Status
--------------------------------
0 | Lisa | 2014-01-04 | Divorced
1 | John | 2014-01-03 | Single
That is, select the distinct ids where the date is the highest, and join the name. As a bonus, sort the list so the latest record is on top.
In SQL Server CE, you can do this using a join:
select u.id, u.name, s.thedate, s.status
from users u join
statuses s
on u.id = s.id join
(select id, max(thedate) as mtd
from statuses
group by id
) as maxs
on s.id = maxs.id and s.thedate = maxs.mtd;
The subquery calculates the maximum date and uses that as a filter for the statuses table.
Use the following query:
SELECT U.Id AS Id, U.Name AS Name, S.thedate AS Date, S.status AS Status
FROM Statuses S
INNER JOIN Users U on S.id = U.id
WHERE S.thedate IN (
SELECT MAX(thedate)
FROM statuses
GROUP BY id);
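One caveat about the IN version: its subquery is not tied to the outer row's id, so a status whose date happens to equal another person's latest date also passes the filter. Here is a sketch of that edge case using Python's sqlite3 module as a stand-in for SQL Server CE. The 'Separated' row is hypothetical and was added only to trigger the collision.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (id INTEGER, name TEXT);
    CREATE TABLE Statuses (id INTEGER, status TEXT, thedate TEXT);
    INSERT INTO Users VALUES (0, 'Lisa'), (1, 'John');
    INSERT INTO Statuses VALUES
        (0, 'Single',    '2014-01-01'),
        (0, 'Engaged',   '2014-01-02'),
        (1, 'Single',    '2014-01-03'),
        (0, 'Divorced',  '2014-01-04'),
        -- Hypothetical extra row: same date as John's latest status.
        (0, 'Separated', '2014-01-03');
""")

# Uncorrelated IN: matches any row dated like *some* person's maximum.
loose = conn.execute("""
    SELECT u.id, u.name, s.thedate, s.status
    FROM Statuses s
    JOIN Users u ON s.id = u.id
    WHERE s.thedate IN (SELECT MAX(thedate) FROM Statuses GROUP BY id)
    ORDER BY s.thedate DESC, u.id
""").fetchall()

# Join on (id, max date): matches only each person's own maximum.
strict = conn.execute("""
    SELECT u.id, u.name, s.thedate, s.status
    FROM Users u
    JOIN Statuses s ON u.id = s.id
    JOIN (SELECT id, MAX(thedate) AS mtd
          FROM Statuses GROUP BY id) m
      ON s.id = m.id AND s.thedate = m.mtd
    ORDER BY s.thedate DESC
""").fetchall()

print(loose)
print(strict)
```

On the original four rows both queries agree; with the extra row, the IN version returns Lisa twice while the join on id and maximum date still returns one row per person.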

Multiple Left Joins - how to?

I have a Rails app running on Heroku, where I'm trying to calculate the rank (position) of a user on a highscore list.
The app is a place for users to bet against each other: they can start a wager (by creating a CHOICE) or bet against an already created Choice (by making a BET).
I have the following SQL, which should give me an array of users based on their total winnings on both Choices and Bets. But it gives me wrong totals, and I think the problem is in the LEFT JOINs, because if I rewrite the SQL to contain only the Choice table or only the Bet table, it works just fine.
Does anyone have pointers on how to rewrite the SQL so that it works correctly?
SELECT users.id, sum(COALESCE(bets.profitloss, 0) + COALESCE(choices.profitloss, 0)) as total_pl
FROM users
LEFT JOIN bets ON bets.user_id = users.id
LEFT JOIN choices ON choices.user_id = users.id
GROUP BY users.id
ORDER BY total_pl DESC
Result:
+---------------+
| id | total_pl |
+---------------+
| 1 | 830 |
| 4 | 200 |
| 3 | 130 |
| 7 | -220 |
| 5 | -1360 |
| 6 | -4950 |
+---------------+
Below are the two SQL statements where I join to only one table, followed by their two results. Note that the sums below do not add up to the result above. The sums below are the correct ones.
SELECT users.id, sum(COALESCE(bets.profitloss, 0)) as total_pl
FROM users
LEFT JOIN bets ON bets.user_id = users.id
GROUP BY users.id
ORDER BY total_pl DESC
SELECT users.id, sum(COALESCE(choices.profitloss, 0)) as total_pl
FROM users
LEFT JOIN choices ON choices.user_id = users.id
GROUP BY users.id
ORDER BY total_pl DESC
+---------------+
| id | total_pl |
+---------------+
| 3 | 170 |
| 1 | 150 |
| 4 | 100 |
| 5 | 80 |
| 7 | 20 |
| 6 | -30 |
+---------------+
+---------------+
| id | total_pl |
+---------------+
| 1 | 20 |
| 4 | 0 |
| 3 | -10 |
| 7 | -30 |
| 5 | -110 |
| 6 | -360 |
+---------------+
This is happening because of the relationship between the two LEFT JOINed tables: if there are multiple rows in both bets and choices, the total number of rows seen is the product of the individual row counts, not their sum.
If you have
choices
id profitloss
================
1 20
1 30
bets
id profitloss
================
1 25
1 35
The result of the join is actually:
choices/bets
id choices.profitloss bets.profitloss
1 20 25
1 20 35
1 30 25
1 30 35
(see where this is going?)
Fixing this is actually fairly simple. You haven't specified an RDBMS, but this should work on any of them (or with minor tweaks).
SELECT users.id, COALESCE(bets.profitloss, 0)
+ COALESCE(choices.profitloss, 0) as total_pl
FROM users
LEFT JOIN (SELECT user_id, SUM(profitloss) as profitloss
FROM bets
GROUP BY user_id) bets
ON bets.user_id = users.id
LEFT JOIN (SELECT user_id, SUM(profitloss) as profitloss
FROM choices
GROUP BY user_id) choices
ON choices.user_id = users.id
ORDER BY total_pl DESC
(Also, I believe the convention is to name tables singular, not plural.)
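The row multiplication and the effect of pre-aggregating can be checked on the four sample rows above. This is a minimal sketch using Python's sqlite3 module as a stand-in engine; the users row with id 1 and the aliases b and c are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER);
    CREATE TABLE choices (user_id INTEGER, profitloss INTEGER);
    CREATE TABLE bets (user_id INTEGER, profitloss INTEGER);
    INSERT INTO users VALUES (1);
    INSERT INTO choices VALUES (1, 20), (1, 30);
    INSERT INTO bets VALUES (1, 25), (1, 35);
""")

# Two bets x two choices produce four joined rows for the same user.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM users
    LEFT JOIN bets ON bets.user_id = users.id
    LEFT JOIN choices ON choices.user_id = users.id
""").fetchone()[0]

# Summing over the fanned-out rows counts every value twice.
naive_total = conn.execute("""
    SELECT SUM(COALESCE(bets.profitloss, 0) + COALESCE(choices.profitloss, 0))
    FROM users
    LEFT JOIN bets ON bets.user_id = users.id
    LEFT JOIN choices ON choices.user_id = users.id
    GROUP BY users.id
""").fetchone()[0]

# Aggregating each table first leaves at most one row per user to join.
fixed_total = conn.execute("""
    SELECT COALESCE(b.profitloss, 0) + COALESCE(c.profitloss, 0)
    FROM users
    LEFT JOIN (SELECT user_id, SUM(profitloss) AS profitloss
               FROM bets GROUP BY user_id) b ON b.user_id = users.id
    LEFT JOIN (SELECT user_id, SUM(profitloss) AS profitloss
               FROM choices GROUP BY user_id) c ON c.user_id = users.id
""").fetchone()[0]

print(joined_rows, naive_total, fixed_total)
```

The naive total comes out at twice the true sum (20 + 30 + 25 + 35 = 110) because each value is paired with both rows of the other table.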
Your problem is that you are blowing out your data set. If you did a SELECT * you would be able to see it. Try this. I was not able to test it because I don't have your tables, but it should work
SELECT
totals.id
,SUM(totals.total_pl) total_pl
FROM
(
SELECT users.id, sum(COALESCE(bets.profitloss, 0)) as total_pl
FROM users
LEFT JOIN bets ON bets.user_id = users.id
GROUP BY users.id
UNION ALL SELECT users.id, sum(COALESCE(choices.profitloss, 0)) as total_pl
FROM users
LEFT JOIN choices ON choices.user_id = users.id
GROUP BY users.id
) totals
GROUP BY totals.id
ORDER BY total_pl DESC
In a solution similar to Clockwork's, since the columns are the same in both tables, I would pre-union them and just sum them. So, at most, the inner query will have two records per user, one for the bets and one for the choices, each pre-summed before the UNION ALL. Then a simple join and sum gets the results:
select
U.ID,
sum( coalesce( PreSum.profit, 0) ) as TotalPL
from
Users U
LEFT JOIN
( select user_id, sum( profitloss ) as Profit
from bets
group by user_id
UNION ALL
select user_id, sum( profitloss ) as Profit
from choices
group by user_id ) PreSum
on U.ID = PreSum.User_ID
group by
U.ID