Lookup value in second table - SQL

I have two tables.
Table Data
ID  Item     Kvartal
1   Payment  1
2   Salary   2

Table Kvartal
ID  Kvartal_text  Kvartal_nummer
1   Q1            1
2   Q2            2
I would like to map Kvartal in table Data to Kvartal_text in table Kvartal by matching Kvartal in table Data with ID in table Kvartal, to get a result like Payment Q1; Salary Q2.
I have tried
SELECT * FROM Data
WHERE Data.Kvartal IN (SELECT Kvartal.Kvartal_text
FROM Kvartal
WHERE Kvartal.Kvartal_nummer = Data.Kvartal);

Simply JOIN the two tables and select the fields you want:
SELECT d.Item, k.Kvartal_text
FROM Data d
JOIN Kvartal k
ON k.ID = d.Kvartal
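With the sample data in the question, this join should return something like:
Item     Kvartal_text
Payment  Q1
Salary   Q2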

You can use MySQL Join operations for such tasks.
SELECT d.ID, d.Item, d.Kvartal, k.Kvartal_text
FROM `Data` d
LEFT JOIN (
    SELECT Kvartal_text, Kvartal_nummer
    FROM `Kvartal`
) AS k ON k.Kvartal_nummer = d.Kvartal
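The derived table here isn't strictly required; joining Kvartal directly should give the same result. A minimal sketch, keeping the same column list and join condition:
SELECT d.ID, d.Item, d.Kvartal, k.Kvartal_text
FROM `Data` d
LEFT JOIN `Kvartal` k ON k.Kvartal_nummer = d.Kvartal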

Optimize a complex PostgreSQL Query

I am attempting to make a complex SQL join on several tables, as shown below. I have also included an image of the DB schema.
Consider table_1 -
e_id  name
1     a
2     b
3     c
4     d
and table_2 -
e_id  date
1     1/1/2019
1     1/1/2020
2     2/1/2019
4     2/1/2019
The issue here is performance. From tables 2-4 we only want the most recent entry for a given e_id, but because these tables contain historical data (~ >3.5M rows) it's quite slow. I've attached an example of how we're currently trying to achieve this, but it only includes one join of 'table_1' with 'table_x'. We group by e_id and get the max date for it. The other way we've thought about doing this is creating a Materialized View, pulling data from that, and refreshing it after some period of time. Any improvements welcome.
from fds.region as rg
inner join (
    select e_id, name, p_id
    from fds.table_1
    where sec_type = 'S' AND active_flag = 1
) as table_1 on table_1.e_id = rg.e_id
inner join fds.table_2 table_2 on table_2.e_id = rg.e_id
inner join fds.sec sec on sec.p_id = table_1.p_id
inner join fds.entity ent on ent.int_entity_id = sec.int_entity_id
inner join (
    SELECT int_1.e_id, int_1.date, int_1.int_price
    FROM fds.table_4 int_1
    INNER JOIN (
        SELECT e_id, MAX(date) date
        FROM fds.table_2
        GROUP BY e_id
    ) int_2 ON int_1.e_id = int_2.e_id AND int_1.date = int_2.date
) as table_4 on table_4.e_id = rg.e_id
where rg.region_str like '%US' and ent.sec_type = 'P'
order by table_2.int_price
limit 500;
You can simplify this logic:
(
    SELECT int_1.e_id, int_1.date, int_1.int_price
    FROM fds.table_4 int_1
    INNER JOIN (
        SELECT e_id, MAX(date) date
        FROM fds.table_2
        GROUP BY e_id
    ) int_2 ON int_1.e_id = int_2.e_id AND int_1.date = int_2.date
) as table_4
To:
(SELECT DISTINCT ON (int_1.e_id) int_1.*
FROM fds.table_4 int_1
ORDER BY int_1.e_id, int_1.date DESC
) table_4
This can take advantage of an index on fds.table_4(e_id, date desc) -- and might be wicked fast with such an index.
You also want appropriate indexes for the joins and filtering. However, it is hard to be more specific without an execution plan.
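A sketch of that suggested index, assuming the table and column names above (the index name is just a placeholder):
CREATE INDEX idx_table_4_e_id_date ON fds.table_4 (e_id, date DESC);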

SELECT JOIN Table Operations

In my query below, I joined tables b, c, and d to a by the base_id column. However, I need to do some operations in my SELECT statement. Since b, c, and d are not joined to each other, is my (a.qty - (b.qty - (c.qty + d.qty))) formula computed only over rows that share the same base_id?
SELECT (a.qty - (b.qty - (c.qty + d.qty))) AS qc_in
FROM receiving a
LEFT JOIN (
    SELECT SUM(qty) AS qty, base_id
    FROM quality_control bb
    WHERE location_to = 6
      AND is_canceled = 0
    GROUP BY base_id
) b ON b.base_id = a.base_id
LEFT JOIN (
    SELECT SUM(qty) AS qty, base_id
    FROM quality_control ba
    WHERE location_from = 6
      AND is_canceled = 0
    GROUP BY base_id
) c ON c.base_id = a.base_id
LEFT JOIN (
    SELECT SUM(qty) AS qty, base_id
    FROM issuance
    WHERE location_from = 6
      AND is_canceled = 0
    GROUP BY base_id
) d ON d.base_id = a.base_id
WHERE a.is_canceled = 0
I think you're confused by how joining works (if I'm reading the question correctly). If you have:
select *
from table1 a
join table2 b
on a.Id = b.Id
join table3 c
on a.Id = c.Id
Then yes, a is joined to b and a is joined to c, so all three are matched on the same Id. One way to think of it is as one giant in-memory table that has all of the a columns, then all of the b columns, and then all of the c columns in the one result.
If a.Id is 1, and you select the row from b where Id is the same (1), and then you join to c where the Id is the same as a (1), then a, b and c all have the same Id.
So yes, (a.qty - (b.qty - (c.qty + d.qty))) will only be doing that calculation for rows where a, b, c and d all have the same base_id.
Surely. In the nested statements you join all the tables on base_id. That means that, as the result of those joins, you get one huge table containing columns from all of the joined tables, matched on the common column you join on (base_id in your case).
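One caveat on the formula itself: because b, c and d come from LEFT JOINs, their qty values are NULL for any base_id with no matching rows, which makes the whole expression NULL. A minimal sketch of the same query with COALESCE, assuming missing sums should count as zero:
SELECT (a.qty - (COALESCE(b.qty, 0) - (COALESCE(c.qty, 0) + COALESCE(d.qty, 0)))) AS qc_in
FROM receiving a
LEFT JOIN (SELECT SUM(qty) AS qty, base_id FROM quality_control
           WHERE location_to = 6 AND is_canceled = 0 GROUP BY base_id) b ON b.base_id = a.base_id
LEFT JOIN (SELECT SUM(qty) AS qty, base_id FROM quality_control
           WHERE location_from = 6 AND is_canceled = 0 GROUP BY base_id) c ON c.base_id = a.base_id
LEFT JOIN (SELECT SUM(qty) AS qty, base_id FROM issuance
           WHERE location_from = 6 AND is_canceled = 0 GROUP BY base_id) d ON d.base_id = a.base_id
WHERE a.is_canceled = 0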

How to group results by count of relationships

Given tables Profiles and Memberships, where a profile has many memberships, how do I query profiles based on the number of memberships?
For example, I want to get the number of profiles with 2 memberships. I can get the number of memberships for each profile with:
SELECT "memberships"."profile_id", COUNT("profiles"."id") AS "membership_count"
FROM "profiles"
INNER JOIN "memberships" on "profiles"."id" = "memberships"."profile_id"
GROUP BY "memberships"."profile_id"
That returns results like
profile_id | membership_count
_____________________________
1          | 2
2          | 5
3          | 2
...
But how do I group and sum the counts to get the query to return results like:
n | profiles_with_n_memberships
_______________________________
1 | 36
2 | 28
3 | 29
...
Or even just a query for a single value of n that would return
profiles_with_2_memberships
___________________________
28
I don't have your sample data, but I just recreated the scenario here with a single table: Demo
You could LEFT JOIN the counts to generate_series() and get zeroes where no profile has exactly n memberships. If you don't want the zero rows, just use the second query.
Query1
WITH c AS (
    SELECT profile_id, count(*) AS ct
    FROM Table1
    GROUP BY profile_id
), m AS (
    SELECT MAX(ct) AS max_ct
    FROM c
)
SELECT n, COUNT(c.profile_id)
FROM m
CROSS JOIN generate_series(1, m.max_ct) AS i(n)
LEFT JOIN c ON c.ct = i.n
GROUP BY n
ORDER BY n;
Query2
WITH c AS (
    SELECT profile_id, count(*) AS ct
    FROM Table1
    GROUP BY profile_id
)
SELECT ct, COUNT(*)
FROM c
GROUP BY ct
ORDER BY ct;
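For the single-value case asked about at the end of the question (profiles with exactly 2 memberships), the same per-profile count can simply be filtered. A sketch written against the memberships table named in the question rather than the single-table demo:
WITH c AS (
    SELECT profile_id, count(*) AS ct
    FROM memberships
    GROUP BY profile_id
)
-- count the profiles whose membership count is exactly 2
SELECT COUNT(*) AS profiles_with_2_memberships
FROM c
WHERE ct = 2;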

SQL deleting records with group by multiple tables

I am trying to delete duplicate records in a table, but only if they are duplicates per a record from another table.
The following query gets me the number of duplicate records per bodyshop.
I'm trying to delete the duplicated invoices for each bodyshop.
SELECT
inv.InvoiceNo, job.BodyshopId, COUNT(*)
FROM
[Test].[dbo].[Invoices] as inv
join [Test].[dbo].Repairs as rep on rep.Id = inv.RepairId
join [Test].[dbo].Jobs as job on job.Id = rep.JobsId
GROUP BY
inv.InvoiceNo, job.BodyshopId
HAVING
COUNT(*) > 1
I want the duplicate invoice numbers per bodyshop to be deleted, but I do want the original one to remain.
InvoiceNo  BodyshopId  (No column name)
29737      16          2
29987      16          3
30059      16          2
23491      139         2
23608      139         3
23867      139         4
23952      139         3
I only want invoice number 29737 to appear once against BodyshopId 16, and so on.
Hope that makes sense.
Thanks
Perhaps this:
with cte as (
    SELECT inv.ID, inv.InvoiceNo, job.BodyshopId,
           rn = row_number() over (partition by inv.InvoiceNo, job.BodyshopId
                                   order by inv.InvoiceNo, job.BodyshopId)
    FROM [Test].[dbo].[Invoices] as inv
    join [Test].[dbo].Repairs as rep on rep.Id = inv.RepairId
    join [Test].[dbo].Jobs as job on job.Id = rep.JobsId
)
delete t1
from [Test].[dbo].[Invoices] t1
inner join cte t2 on t1.ID = t2.ID
where t2.rn > 1
Edit 1 - Your comments are true. So a solution is to add an identity column to the invoice table; I've adapted my query accordingly.
To add / remove the identity column:
alter table [Test].[dbo].[Invoices] add id int identity(1,1)
alter table [Test].[dbo].[Invoices] drop column id
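To preview which rows the delete above would remove, the same CTE can be run with a SELECT instead (a sketch, assuming the ID identity column from the edit):
with cte as (
    SELECT inv.ID, inv.InvoiceNo, job.BodyshopId,
           rn = row_number() over (partition by inv.InvoiceNo, job.BodyshopId
                                   order by inv.InvoiceNo, job.BodyshopId)
    FROM [Test].[dbo].[Invoices] as inv
    join [Test].[dbo].Repairs as rep on rep.Id = inv.RepairId
    join [Test].[dbo].Jobs as job on job.Id = rep.JobsId
)
-- rows with rn > 1 are the duplicates the delete would remove
select *
from cte
where rn > 1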
You may run the following. Since the duplicate records are otherwise identical, grouping by InvoiceNo and BodyshopId returns a single row per invoice and bodyshop, and the delete keeps only the row with the highest id:
DELETE FROM [Test].[dbo].[Invoices]
WHERE id NOT IN (
    SELECT MAX(inv.id)
    FROM [Test].[dbo].[Invoices] AS inv
    JOIN [Test].[dbo].Repairs AS rep ON rep.Id = inv.RepairId
    JOIN [Test].[dbo].Jobs AS job ON job.Id = rep.JobsId
    GROUP BY inv.InvoiceNo, job.BodyshopId
)
Here id is the primary key.
This is general SQL; modify it if needed for SQL Server.

SQL joining two tables with common row

I have 2 tables in Sybase:
Account_table
Id  account_code
1   A
2   B
3   C
Associate_table
id  account_code
1   A
1   B
1   C
2   A
2   B
3   A
3   C
I have this SQL query:
SELECT * FROM account_table account, associate_table assoc
WHERE account.account_code = assoc.account_code
This query will return 7 rows. What I want is to return only the rows from associate_table that are common to all 3 accounts, like this:
account id  account_code  Assoc Id
1           A             1
2           B             1
3           C             1
Can anyone help what kind of join should I do?
SELECT b.Id account_id, a.account_code account_code, a.id assoc_id
FROM associate_table a,
     account_table b
WHERE a.account_code = b.account_code
AND a.id IN (SELECT a.id
             FROM associate_table a,
                  account_table b
             WHERE a.account_code = b.account_code
             GROUP BY a.id
             HAVING Count(*) = (SELECT Count(*)
                                FROM account_table));
NOTE: this query works only if you have unique values in the Id and account_code columns of the account table. Also, your associate_table should contain unique combinations of (id, account_code), i.e., the associate table should not contain (1, A) or any other pair twice.
Try this
SELECT AC.ID, AC.account_code, ASS.ID
FROM account_table AC
INNER JOIN associate_table AS ASS ON AC.account_code = ASS.account_code
OK, an answer has already been accepted, but I'll post a simpler one:
SELECT *
FROM account_table AS account,
associate_table AS assoc
WHERE account.account_code = assoc.account_code
HAVING (
SELECT
COUNT(*)
FROM associate_table assoc_2
WHERE assoc_2.id = assoc.id
) = 3
Here 3 is the number of codes the account table has. If that is going to be dynamic (changing over time), you can use (SELECT COUNT(*) FROM account_table) instead of the exact number. I'm also fairly sure it will be cached by the database engine, so it requires little extra work.
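Building on that suggestion to make the count dynamic, the same idea can also be written with the condition in the WHERE clause instead of HAVING, in case your Sybase version is strict about HAVING without GROUP BY. A sketch, under the same uniqueness assumptions as the first answer:
SELECT account.Id account_id, assoc.account_code, assoc.id assoc_id
FROM account_table account,
     associate_table assoc
WHERE account.account_code = assoc.account_code
  -- keep only associate ids that appear once for every account code
  AND (SELECT COUNT(*)
       FROM associate_table assoc_2
       WHERE assoc_2.id = assoc.id) = (SELECT COUNT(*) FROM account_table)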