SQL: Merge two queries and insert a new column for a calculation

I have 2 tables, Transactions (attributes of Interest: disponent_id, transaction_id) and Attachments (attributes of Interest: disponent_id, filename).
The main goal is the following:
I want to count the transactions per disponent in the table "Transact" (transactions per disponent).
I want to do the same with the table "Attach" (attachments per disponent).
Afterwards, I want to merge both results and add a new column that shows the ratio of attachments per transaction (Attachments/Transactions).
(1)
Disponent | Transactions
213456 | 35
...
(2)
Disponent | Attachments
213456 | 70
(3)
Disponent | Transactions | Attachments | Ratio
213456 | 35 | 70 | 2
...
I've tried
SELECT Transact.disponent_id, COUNT(Transact.transaction_id) AS Transactionnumber
FROM Transact
GROUP BY Transact.disponent_id
UNION ALL
SELECT Attach.disponent_id, COUNT(Attach.filename) AS Filenumber
FROM Attach
GROUP BY Attach.disponent_id
But the result is only:
disponent_id | transactionnumber
234576 | 65
...
How can I insert the calculation and the attachment column?

I put your two queries into a WITH clause, then wrote a new SELECT statement that inner joins them.
Check it out:
With wth0 as
(
SELECT
Transact.disponent_id,
COUNT(Transact.transaction_id) AS Transactionnumber
FROM Transact
GROUP BY Transact.disponent_id
),
wth1 as
(
SELECT Attach.disponent_id, COUNT(Attach.filename) AS Filenumber
FROM Attach
GROUP BY Attach.disponent_id
)
SELECT
wth0.disponent_id,
wth0.Transactionnumber,
wth1.Filenumber,
1.0 * wth1.Filenumber / wth0.Transactionnumber as Ratio -- 1.0 * avoids truncation on engines that use integer division
from wth0
inner join wth1
on wth0.disponent_id = wth1.disponent_id;
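To try the pattern end to end, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite is only a stand-in (the question does not name its database engine), and the sample rows are made up. Two details differ from the query above and are my own assumptions about what is wanted: it uses a LEFT JOIN so that disponents without attachments still appear, and it multiplies by 1.0 because SQLite, like several other engines, truncates integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Transact (disponent_id INTEGER, transaction_id INTEGER);
    CREATE TABLE Attach   (disponent_id INTEGER, filename TEXT);
""")

# Hypothetical sample data: disponent 1 has 2 transactions and 3 attachments,
# disponent 2 has 1 transaction and no attachments.
conn.executemany("INSERT INTO Transact VALUES (?, ?)", [(1, 10), (1, 11), (2, 20)])
conn.executemany("INSERT INTO Attach VALUES (?, ?)",
                 [(1, "a.pdf"), (1, "b.pdf"), (1, "c.pdf")])

query = """
WITH t AS (
    -- transactions per disponent
    SELECT disponent_id, COUNT(transaction_id) AS transactions
    FROM Transact
    GROUP BY disponent_id
),
a AS (
    -- attachments per disponent
    SELECT disponent_id, COUNT(filename) AS attachments
    FROM Attach
    GROUP BY disponent_id
)
SELECT t.disponent_id,
       t.transactions,
       COALESCE(a.attachments, 0) AS attachments,
       1.0 * COALESCE(a.attachments, 0) / t.transactions AS ratio
FROM t
LEFT JOIN a ON a.disponent_id = t.disponent_id
ORDER BY t.disponent_id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With an INNER JOIN instead, disponent 2 would drop out of the result entirely, which may or may not be what you want.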

Related

Using Sum() with JOIN doesn't provide correct output

I'm trying to create a query that displays a user's Id, the sum of total steps, and sum of total calories burnt.
The data for steps and calories are in two datasets, so I used JOIN. However, when I run the query, the joined data does not look correct, whereas running the two aggregations separately shows the correct data.
Below are my queries. I am fairly new to SQL, so I am somewhat confused about what I did wrong. How do I correct this? Thank you in advance for the help!
For the Steps table, "Id" and "StepTotal" are Integers. For the Calories table, "Id" and "Calories" are also Integers.
SELECT steps.Id,Sum(StepTotal) AS Total_steps,Sum(cal.Calories) as Total_calories
FROM fitbit.Daily_steps AS steps
JOIN fitbit.Daily_calories AS cal ON steps.Id=cal.Id
GROUP BY Id
Given Output(Picture)
Expected Output(Picture)
For Steps
SELECT Id,Sum(StepTotal) AS Total_steps
FROM fitbit.Daily_steps
group by Id
Id | Total_steps
1503960366 | 375619
1624580081 | 178061
1644430081 | 218489
For Calories
SELECT Id,Sum(Calories) AS Total_calories
FROM fitbit.Daily_calories
group by Id
Id | Total_calories
1503960366 | 56309
1624580081 | 45984
1644430081 | 84339
I believe your current solution is returning additional rows as the result of the JOIN.
Let's look at an example data set
Steps
id | total
a | 5
a | 7
b | 3
Calories
id | total
a | 100
a | 300
b | 400
Now, if we SELECT * FROM Calories, we'd get 3 rows. If we SELECT * FROM Calories GROUP BY id, we'd get two rows.
But if we use a JOIN:
SELECT Steps.id, Steps.total AS steps, Calories.total AS cals FROM Steps
JOIN Calories
ON Steps.id = Calories.id
WHERE Steps.id = 'a'
This would return the following:
Steps_Calories
id | steps | cals
a | 5 | 100
a | 5 | 300
a | 7 | 100
a | 7 | 300
So now if we GROUP BY & SUM(steps), we get 24, instead of the expected 12, because the JOIN returns each pairing of steps & calories.
To avoid this, we can group and sum inside sub-queries and join the already aggregated results:
SELECT step_totals.id, step_totals.total AS steps, cal_totals.total AS cals
FROM (SELECT id, SUM(total) AS total FROM Steps GROUP BY id) AS step_totals
JOIN (SELECT id, SUM(total) AS total FROM Calories GROUP BY id) AS cal_totals
ON cal_totals.id = step_totals.id
Now each subquery only returns a single row for each id, so the join only returns a single row as well.
Of course, you'll have to adapt this for your schema.
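The effect is easy to reproduce. The sketch below loads the example Steps and Calories rows from above into an in-memory SQLite database (via Python's sqlite3 module, used here only as a convenient stand-in for whatever engine you run) and compares the join-then-aggregate query with the aggregate-then-join query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Steps    (id TEXT, total INTEGER);
    CREATE TABLE Calories (id TEXT, total INTEGER);
    INSERT INTO Steps    VALUES ('a', 5), ('a', 7), ('b', 3);
    INSERT INTO Calories VALUES ('a', 100), ('a', 300), ('b', 400);
""")

# Join first, then aggregate: every Steps row for an id is paired with
# every Calories row for that id, so the sums are inflated.
naive = conn.execute("""
    SELECT s.id, SUM(s.total) AS steps, SUM(c.total) AS cals
    FROM Steps AS s
    JOIN Calories AS c ON s.id = c.id
    GROUP BY s.id
    ORDER BY s.id
""").fetchall()

# Aggregate first, then join: each sub-query yields one row per id.
fixed = conn.execute("""
    SELECT st.id, st.total AS steps, ct.total AS cals
    FROM (SELECT id, SUM(total) AS total FROM Steps GROUP BY id) AS st
    JOIN (SELECT id, SUM(total) AS total FROM Calories GROUP BY id) AS ct
      ON ct.id = st.id
    ORDER BY st.id
""").fetchall()

print("join then aggregate:", naive)
print("aggregate then join:", fixed)
```

Id 'b' comes out the same in both queries because it has only one row in each table; only ids with several rows on both sides are inflated.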

SQL: Selecting record where values in one field are unique based off of most recent date

I'm attempting to write an SQL statement to select records such that each record has a unique PartNo, and I want that record to be based off of the most recent ReceiveDate. I got an answer when I asked this question:
SELECT t.*
FROM Table as t
WHERE t.ReceiveDate = (SELECT MAX(t2.ReceiveDate)
FROM Table as t2
WHERE t2.PartNo = t.PartNo
);
However, this answer assumes that for each ReceiveDate, you would not have the same PartNo twice. In situations where there are multiple records with the same PartNo and ReceiveDate, it does not matter which is selected, but I only want one to be selected (PartNo must be unique).
Example:
PartNo | Vendor | Qty | ReceiveDate
100 | Bob | 2 | 2020/07/30
100 | Bob | 3 | 2020/07/30
Should only return one of these records.
I'm using Microsoft Access which uses Jet SQL which is very similar to T-SQL.
Use NOT EXISTS:
select distinct t.*
from tablename as t
where not exists (
select 1 from tablename
where partno = t.partno
and (
receivedate > t.receivedate
or (receivedate = t.receivedate and qty > t.qty)
or (receivedate = t.receivedate and qty = t.qty and vendor > t.vendor)
)
)
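As a quick check of the tie-breaking logic, here is a small sketch that runs the same NOT EXISTS pattern against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for Access here, the table name parts is hypothetical, and the rows other than the two from the question are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parts (PartNo INTEGER, Vendor TEXT, Qty INTEGER, ReceiveDate TEXT);
    INSERT INTO parts VALUES
        (100, 'Bob', 2, '2020/07/30'),   -- tie on PartNo and ReceiveDate
        (100, 'Bob', 3, '2020/07/30'),   -- wins the tie on the higher Qty
        (100, 'Al',  9, '2020/07/01'),   -- older date, excluded
        (200, 'Eve', 1, '2020/06/15');   -- only row for this PartNo
""")

# Keep a row only if no other row for the same PartNo ranks higher,
# comparing ReceiveDate first, then Qty, then Vendor.
rows = conn.execute("""
    SELECT DISTINCT t.*
    FROM parts AS t
    WHERE NOT EXISTS (
        SELECT 1
        FROM parts AS p
        WHERE p.PartNo = t.PartNo
          AND (
                p.ReceiveDate > t.ReceiveDate
             OR (p.ReceiveDate = t.ReceiveDate AND p.Qty > t.Qty)
             OR (p.ReceiveDate = t.ReceiveDate AND p.Qty = t.Qty AND p.Vendor > t.Vendor)
          )
    )
    ORDER BY t.PartNo
""").fetchall()

for row in rows:
    print(row)
```

Rows that are identical in every column survive the NOT EXISTS test together, which is why the DISTINCT is still needed to collapse them into one.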
Manually set up a standard aggregate query (sigma icon in the ribbon), grouped on PartNo and with the date field set to Max.
Run the query to check that it returns the values you are after. Then, while in Design view, switch to SQL view, which will give you the SQL statement.

How to aggregate json fields when using GROUP BY clause in postgres?

I have the following table structure in my Postgres DB (v12.0)
id | pieces | item_id | material_detail
---|--------|---------|-----------------
1 | 10 | 2 | [{"material_id":1,"pieces":10},{"material_id":2,"pieces":20},{"material_id":3,"pieces":30}]
2 | 20 | 2 | [{"material_id":1,"pieces":40}]
3 | 30 | 3 | [{"material_id":1,"pieces":20},{"material_id":3,"pieces":30}]
I am using GROUP BY query for this records, like below
SELECT SUM(PIECES) FROM detail_table GROUP BY item_id HAVING item_id =2
Using which I will get the total pieces as 30. But how could I get the count of total pieces from material_detail group by material_id.
I want result something like this
pieces | material_detail
-------| ------------------
30 | [{"material_id":1,"pieces":50},{"material_id":2,"pieces":20},{"material_id":3,"pieces":30}]
As I come from a MySQL background, I don't know how to achieve this with JSON fields in Postgres.
Note: material_detail column is of JSONB type.
You are aggregating on two different levels, and I can't think of a solution that wouldn't need two separate aggregation steps. In addition, to aggregate the material information, all arrays for the item_id have to be unnested first, before the actual pieces value can be aggregated for each material_id. The result then has to be aggregated back into a JSON array.
with pieces as (
-- the basic aggregation for the "detail pieces"
select dt.item_id, sum(dt.pieces) as pieces
from detail_table dt
where dt.item_id = 2
group by dt.item_id
), details as (
-- normalize the material information and aggregate the pieces per material_id
select dt.item_id, (m.detail -> 'material_id')::int as material_id, sum((m.detail -> 'pieces')::int) as pieces
from detail_table dt
cross join lateral jsonb_array_elements(dt.material_detail) as m(detail)
where dt.item_id in (select item_id from pieces) --<< don't aggregate too much
group by dt.item_id, material_id
), material as (
-- now de-normalize the material aggregation back into a single JSON array
-- for each item_id
select item_id, jsonb_agg(to_jsonb(d) - 'item_id') as material_detail
from details d
group by item_id
)
-- join both results together
select p.item_id, p.pieces, m.material_detail
from pieces p
join material m on m.item_id = p.item_id
;
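To make the two aggregation levels easier to see, here is the same logic written out procedurally in plain Python. This is only an illustration of what the query computes, not a replacement for doing it in Postgres. The rows are the ones from the question, assuming the arrays in rows 2 and 3 are meant to be closed, and the function name aggregate is made up for this sketch.

```python
import json
from collections import defaultdict

# (id, pieces, item_id, material_detail) as in the question's table
rows = [
    (1, 10, 2, '[{"material_id":1,"pieces":10},{"material_id":2,"pieces":20},{"material_id":3,"pieces":30}]'),
    (2, 20, 2, '[{"material_id":1,"pieces":40}]'),
    (3, 30, 3, '[{"material_id":1,"pieces":20},{"material_id":3,"pieces":30}]'),
]

def aggregate(rows, item_id):
    """Return (total pieces, per-material totals) for one item_id."""
    total_pieces = 0                 # level 1: sum of the pieces column
    per_material = defaultdict(int)  # level 2: sum per material_id
    for _row_id, pieces, row_item_id, material_detail in rows:
        if row_item_id != item_id:
            continue
        total_pieces += pieces
        # "unnest" the JSON array, then accumulate per material_id
        for entry in json.loads(material_detail):
            per_material[entry["material_id"]] += entry["pieces"]
    # aggregate back into a single JSON-style array
    detail = [{"material_id": m, "pieces": p} for m, p in sorted(per_material.items())]
    return total_pieces, detail

pieces, detail = aggregate(rows, 2)
print(pieces, json.dumps(detail))
```

The inner loop corresponds to jsonb_array_elements plus the GROUP BY in the details CTE, and building the final list corresponds to jsonb_agg in the material CTE.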

`INTERSECT` does not return anything from two tables, separately values are returned fine

I'm not sure what I am doing wrong here: I haven't touched SQL queries for several years, and the MSSQL query language is a bit strange to me. After 30 minutes of googling I still cannot find the answer.
Problem
I have two queries that work perfectly fine:
SELECT COUNT(*) AS 'NumberOfAccounts' FROM Accounts
SELECT COUNT(*) AS 'NumberOfUsers' FROM Users
I need to get this information in one go in my API response since I don't want to execute two statements. How can I combine them into one query so that it returns a table as follows:
+------------------+---------------+
| NumberOfAccounts | NumberOfUsers |
+------------------+---------------+
| 10 | 16 |
+------------------+---------------+
What I have tried
UNION
SELECT COUNT(*) AS 'NumberOfAccounts' FROM Accounts UNION SELECT COUNT(*) AS 'NumberOfUsers' FROM Users
This gives me the results of both tables, but it puts both of them into the NumberOfAccounts column, so I cannot parse the result reliably.
+------------------+
| NumberOfAccounts |
+------------------+
| 10 |
| 16 |
+------------------+
INTERSECT
SELECT COUNT(*) AS 'NumberOfAccounts' FROM Accounts INTERSECT SELECT COUNT(*) AS 'NumberOfUsers' FROM Users
This just gives me an empty result with only the NumberOfAccounts column in it.
You can just put these as subqueries in a select:
SELECT (SELECT COUNT(*) FROM Accounts) as NumberOfAccounts,
(SELECT COUNT(*) FROM Users) as NumberOfUsers
In SQL Server, no FROM clause is needed.
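To see why the two attempts in the question behave as they do, here is a small sketch using Python's sqlite3 module. SQLite is a stand-in for SQL Server (it also allows a SELECT without a FROM clause), and the row counts are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Accounts (id INTEGER);
    CREATE TABLE Users    (id INTEGER);
    INSERT INTO Accounts VALUES (1), (2), (3);
    INSERT INTO Users    VALUES (1), (2), (3), (4), (5);
""")

# UNION stacks rows: both counts land in one column, named after the first SELECT.
union_rows = conn.execute("""
    SELECT COUNT(*) AS NumberOfAccounts FROM Accounts
    UNION
    SELECT COUNT(*) AS NumberOfUsers FROM Users
""").fetchall()

# INTERSECT keeps only rows present in both results; the counts differ, so nothing is left.
intersect_rows = conn.execute("""
    SELECT COUNT(*) AS NumberOfAccounts FROM Accounts
    INTERSECT
    SELECT COUNT(*) AS NumberOfUsers FROM Users
""").fetchall()

# Scalar subqueries put each count in its own column of a single row.
cursor = conn.execute("""
    SELECT (SELECT COUNT(*) FROM Accounts) AS NumberOfAccounts,
           (SELECT COUNT(*) FROM Users)    AS NumberOfUsers
""")
column_names = [col[0] for col in cursor.description]
combined = cursor.fetchall()

print(sorted(union_rows), intersect_rows, column_names, combined)
```

Note that INTERSECT would return one row if both tables happened to hold the same number of rows, so an empty result says nothing about the tables being empty.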
UNION is the wrong tool here. UNION "merges" the rows of identical tables (or identical selects); it does not combine columns.
One solution might be:
SELECT AccountCount, UserCount FROM
(SELECT COUNT(*) AS AccountCount, 1 AS Id FROM Accounts) AS a
JOIN
(SELECT COUNT(*) AS UserCount, 1 as Id FROM Users) AS u ON (a.Id = u.Id)
Be aware of the artificial surrogate key 1 you need to insert to join both sub-selects together.
For completeness' sake, with UNION ALL you'd do:
SELECT 'NumberOfAccounts' AS what, COUNT(*) AS howmany FROM accounts
UNION ALL
SELECT 'NumberOfUsers' AS what, COUNT(*) AS howmany FROM users;
which results in
+------------------+---------+
| what | howmany |
+------------------+---------+
| NumberOfAccounts | 10 |
| NumberOfUsers | 16 |
+------------------+---------+
And another variation:
WITH cte AS
(
SELECT COUNT(*) AS cntAccounts, 0 AS cntUsers FROM accounts
UNION ALL
SELECT 0 AS cntAccounts, COUNT(*) AS cntUsers FROM users
)
SELECT
SUM(cntAccounts) AS NumberOfAccounts
,SUM(cntUsers ) AS NumberOfUsers
FROM cte
If you want (or need) better performance, you can read the row counts from sys.dm_db_partition_stats instead:
SELECT (
SELECT SUM (row_count)
FROM sys.dm_db_partition_stats
WHERE object_id=OBJECT_ID('Accounts')
AND (index_id=0 or index_id=1)) NumberOfAccounts,
(
SELECT SUM (row_count)
FROM sys.dm_db_partition_stats
WHERE object_id=OBJECT_ID('Users')
AND (index_id=0 or index_id=1)) NumberOfUsers

Self-referencing Query and Not Equals

I'm trying to pull data from a single table called tblTooling where two TlPartNo numbers are equal to different values and the TlToolNo values are not equal for these TlPartNo numbers. This is an Access DB, and the following statement gets me close but still returns too much data.
SELECT DISTINCT
tblTooling.TlToolNo,
tblTooling.TlPartNo,
tblTooling.TlOP,
tblTooling.TlQuantity
FROM tblTooling, tblTooling AS tblTooling_1
WHERE (((tblTooling.TlToolNo)<>tblTooling_1.TlToolNo)
AND ((tblTooling.TlPartNo)="10290722")
AND ((tblTooling_1.TlPartNo)="10295379"));
The included image has the tblTooling structure and Data. Plus the expected results from the query.
You seem to want to exclude a ToolNo value when it occurs with both PartNo values. In that case you could group intermediate results by ToolNo and check (with HAVING) whether only one PartNo is present in each group. If so, keep that record, and in the outer query add the two other columns to it:
SELECT DISTINCT
tblTooling.TlToolNo,
tblTooling.TlPartNo,
tblTooling.TlOP,
tblTooling.TlQuantity
FROM tblTooling
INNER JOIN (
SELECT TlToolNo,
Min(TlPartNo) AS MinTlPartNo,
Max(TlPartNo) AS MaxTlPartNo
FROM tblTooling
WHERE TlPartNo IN ("10290722", "10295379")
GROUP BY TlToolNo
HAVING Min(TlPartNo) = Max(TlPartNo)
) AS grp
ON grp.TlToolNo = tblTooling.TlToolNo
AND grp.MinTlPartNo = tblTooling.TlPartNo
Note that for your sample data this will return 4 rows:
TlToolNo | TlPartNo | TlOP | TlQuantity
----------+----------+------+-----------
T00012362 | 10290722 | OP10 | 2
T00012456 | 10290722 | OP10 | 1
T00013456 | 10290722 | OP20 | 1
T00014348 | 10295379 | OP20 | 1
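The grouping trick can be tried out with the sketch below, which uses Python's sqlite3 module as a stand-in for Access. The question's real data is only available as an image, so the rows here are hypothetical: the four result rows above, plus an invented tool T00099999 that is used by both part numbers and an invented row for an unrelated part. String literals use single quotes, since SQLite treats double quotes as identifier quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblTooling (TlToolNo TEXT, TlPartNo TEXT, TlOP TEXT, TlQuantity INTEGER);
    INSERT INTO tblTooling VALUES
        ('T00012362', '10290722', 'OP10', 2),
        ('T00012456', '10290722', 'OP10', 1),
        ('T00013456', '10290722', 'OP20', 1),
        ('T00014348', '10295379', 'OP20', 1),
        ('T00099999', '10290722', 'OP10', 1),  -- hypothetical tool shared by both parts
        ('T00099999', '10295379', 'OP10', 1),
        ('T00088888', '99999999', 'OP10', 1);  -- hypothetical unrelated part
""")

rows = conn.execute("""
    SELECT DISTINCT t.TlToolNo, t.TlPartNo, t.TlOP, t.TlQuantity
    FROM tblTooling AS t
    INNER JOIN (
        -- one group per tool; MIN = MAX means the tool occurs with only one of the two parts
        SELECT TlToolNo,
               MIN(TlPartNo) AS MinTlPartNo,
               MAX(TlPartNo) AS MaxTlPartNo
        FROM tblTooling
        WHERE TlPartNo IN ('10290722', '10295379')
        GROUP BY TlToolNo
        HAVING MIN(TlPartNo) = MAX(TlPartNo)
    ) AS grp
       ON grp.TlToolNo = t.TlToolNo
      AND grp.MinTlPartNo = t.TlPartNo
    ORDER BY t.TlToolNo
""").fetchall()

for row in rows:
    print(row)
```

The shared tool fails the HAVING test because its minimum and maximum part numbers differ, and the unrelated part never enters a group because of the WHERE filter.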
I think you can do this with not exists:
select t.*
from tblTooling as t
where not exists (select 1
from tblTooling as t2
where t2.TlPartNo in ("10290722", "10295379") and
t2.TlToolNo = t.TlToolNo and
t2.tiid <> t.tiid
) and
t.TlPartNo in ("10290722", "10295379");
This saves on the select distinct, which should be a performance boost.