Display related rows in same row in MSaccess - sql

I have a set of related rows which I need to display in a single line. For example, the data I have is in different rows:
ID    RecordDate  ExpType   OrigBudget  ActualCost
1001  1-5-2017    Hardware  $ 5000
1001  2-6-2017    Hardware              $ 5200
The Original budget is approved at an earlier time for the same record but the Actual cost often differs and is recorded at a later date. I want the output as
ProjectID  YearofEntry  ExpenseType  OrgBudget  ActualCost
1001       2017         Hardware     $ 5000     $ 5200
I have tried a GROUP BY query that aggregates on ExpenseType and ProjectID, but so far I have not managed to get the result into a single row.

If you always have exactly two rows for each ExpType, one with the original budget and one with the actual cost, you can simply use a GROUP BY:
SELECT ID AS ProjectID
,YEAR(RecordDate) AS YearofEntry
,ExpType AS ExpenseType
,MAX(OrigBudget) AS OrgBudget
,MAX(ActualCost) AS ActualCost
FROM yourtable
GROUP BY ID
,YEAR(RecordDate)
,ExpType
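
To see why the aggregate collapses the two rows, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Access. The table name `expenses`, the ISO-formatted dates and the use of `strftime('%Y', ...)` in place of `YEAR()` are assumptions made for the demo, not part of the original question. `MAX` ignores NULLs, so each group keeps the one non-empty value per column.

```python
import sqlite3

# In-memory SQLite database standing in for the Access table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE expenses (ID INTEGER, RecordDate TEXT, ExpType TEXT, "
    "OrigBudget REAL, ActualCost REAL)"
)
conn.executemany(
    "INSERT INTO expenses VALUES (?, ?, ?, ?, ?)",
    [
        # Budget row: ActualCost is not known yet.
        (1001, "2017-05-01", "Hardware", 5000, None),
        # Later row: only the actual cost is recorded.
        (1001, "2017-06-02", "Hardware", None, 5200),
    ],
)

# MAX() skips NULLs, so both values end up in the single grouped row.
rows = conn.execute(
    """
    SELECT ID AS ProjectID,
           CAST(strftime('%Y', RecordDate) AS INTEGER) AS YearofEntry,
           ExpType AS ExpenseType,
           MAX(OrigBudget) AS OrgBudget,
           MAX(ActualCost) AS ActualCost
    FROM expenses
    GROUP BY ID, strftime('%Y', RecordDate), ExpType
    """
).fetchall()
print(rows)
```

Note that this only works when the budget and the actual cost fall in the same year, because the year is part of the grouping key.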

Try This:
SELECT ID,
Year([RecordDate]) AS YEARofEntry,
ExpType,
Sum(OrigBudget) AS SumOfOrigBudget,
Sum(ActualCost) AS SumOfActualCost
FROM yourtable
GROUP BY ID,
Year([RecordDate]),
ExpType;

Related

How to get the set size, first and last record in a db2 ordered set with one call

I have a very big transaction table on DB2 v11, and I need to query a subset of it as efficiently as possible. All I need is the total count of the set (not known in advance; it's based on criteria, let's say 1 day), the ID of the first record, and the ID of the last record.
The old code was fetching the entire table, then just using the 1st record ID, and the last record ID, and size, and not making use of the rest. Now this code is timing out. It's a complex query of several joins.
Is there a way to fetch just the size of the set, the 1st record, and the last record in one select query?
I've read that reordering the list in order to fetch the 1st record (so fetch with DESC, then change to ASC) is not efficient.
sample table 1 TRANSACTION_RECORDS:
tdID TIMESTAMP name
-------------------------------
123 2020-03-31 john
234 2020-03-31 dan
456 2020-03-01 Eve
675 2020-04-01 joy
sample table 2 TRANSACTION_TYPE:
invoiceId tdID account
------------------------------
897 123 abc
898 123 def
877 234 mnc
899 456 opp
Sample query
select Min(tr.transaction_id), Max(tr.transaction_id)
from TRANSACTION_RECORDS TR
join TRANSACTION_TYPE TT
on TR.tdID=tt.tdID
WHERE Date(TR.TIMESTAMP) = '2020-03-31'
group by tr.tdID
order by TR.tdID ASC
This results in multiple rows (but it requires the GROUP BY):
123,123
234,234
456,456
What I want is:
123,456
As I mentioned in the comments, this query needs neither GROUP BY nor ORDER BY. Just do:
select Min(tr.transaction_id), Max(tr.transaction_id)
from TRANSACTION_RECORDS TR
join TRANSACTION_TYPE TT
on TR.tdID=tt.tdID
WHERE Date(TR.TIMESTAMP) = '2020-03-31'
It should work as expected
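
Since the question also asks for the size of the set, the same single statement can carry a COUNT as well. The following is only a sketch run against SQLite through Python's sqlite3 module, not DB2. It uses the `tdID` column from the sample tables (the posted query refers to `transaction_id`, which the samples do not show), and the column name `ts` is a placeholder for the TIMESTAMP column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE TRANSACTION_RECORDS (tdID INTEGER, ts TEXT, name TEXT);
    CREATE TABLE TRANSACTION_TYPE (invoiceId INTEGER, tdID INTEGER, account TEXT);
    INSERT INTO TRANSACTION_RECORDS VALUES
        (123, '2020-03-31', 'john'),
        (234, '2020-03-31', 'dan'),
        (456, '2020-03-01', 'Eve'),
        (675, '2020-04-01', 'joy');
    INSERT INTO TRANSACTION_TYPE VALUES
        (897, 123, 'abc'),
        (898, 123, 'def'),
        (877, 234, 'mnc'),
        (899, 456, 'opp');
    """
)

# No GROUP BY: the aggregates run over the whole filtered join,
# so exactly one row comes back.
row = conn.execute(
    """
    SELECT COUNT(*)                AS joined_rows,
           COUNT(DISTINCT tr.tdID) AS transactions,
           MIN(tr.tdID)            AS first_id,
           MAX(tr.tdID)            AS last_id
    FROM TRANSACTION_RECORDS tr
    JOIN TRANSACTION_TYPE tt ON tr.tdID = tt.tdID
    WHERE tr.ts = '2020-03-31'
    """
).fetchone()
print(row)
```

COUNT(*) counts joined rows, so a transaction with two invoices is counted twice; COUNT(DISTINCT tr.tdID) counts transactions. Which one is "the size of the set" depends on what the original code measured.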

Oracle SQL self join performance

Let's say I have a table called order with the following data. I need to get the customer_name along with the number of orders each customer has placed.
Table name: order
id | customer_name | item
1 | Siddhant | TV
2 | Siddhant | Mobile
3 | Sankalp | Football
Desired output:
customer_name | no_of_orders
Siddhant | 2
Sankalp | 1
I tried below 2 queries to get the result:
select customer_name, count(customer_name) as no_of_orders
from order
group by customer_name;
This gives me the correct result but takes about 10.5 secs to run.
select ord.customer_name, count(ord1.customer_name) as no_of_orders
from order ord
inner join order ord1 on ord1.customer_name = ord.customer_name
group by ord.customer_name;
This gives me the square of the correct count in the result but runs in about 2 secs. I can take the square root to get the actual count.
I understand why the second query gives the square of the actual count in the output but can someone explain why it runs so fast compared to the first query?
PS: I am running these in Oracle SQL Developer.
The first version is the one you should be using here:
SELECT customer_name, COUNT(customer_name) AS no_of_orders
FROM "order"
GROUP BY customer_name;
Absent a WHERE or HAVING clause, adding an index might not be too helpful here, because Oracle has to basically touch every record in the table in order to do the aggregation. As to why the second version appears to be faster, I speculate that the benchmarks you are using are not representative, because they are based on a fairly small table size. If you scale your table data to be in the tens of thousands of rows, I predict that the first version will be substantially faster.
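
For what it's worth, the row counts alone show that the self join does strictly more work: a customer with n orders produces n * n joined rows before the grouping even starts. The sketch below reproduces that on the sample data using SQLite through Python's sqlite3 module (an assumption for the demo; it says nothing about Oracle's timings or caching).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "order" is a reserved word, so the table name has to be quoted.
conn.execute('CREATE TABLE "order" (id INTEGER, customer_name TEXT, item TEXT)')
conn.executemany(
    'INSERT INTO "order" VALUES (?, ?, ?)',
    [(1, "Siddhant", "TV"), (2, "Siddhant", "Mobile"), (3, "Sankalp", "Football")],
)

# Plain aggregation: one pass over the table.
grouped = conn.execute(
    """
    SELECT customer_name, COUNT(customer_name) AS no_of_orders
    FROM "order"
    GROUP BY customer_name
    ORDER BY customer_name
    """
).fetchall()

# Self join: every order is paired with every order of the same customer,
# so the count per customer is the square of the real count.
self_joined = conn.execute(
    """
    SELECT ord.customer_name, COUNT(ord1.customer_name) AS no_of_orders
    FROM "order" ord
    INNER JOIN "order" ord1 ON ord1.customer_name = ord.customer_name
    GROUP BY ord.customer_name
    ORDER BY ord.customer_name
    """
).fetchall()

print(grouped)
print(self_joined)
```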

SQL Query Totals

I'm trying to write a query that calculates the average profit per employee for several projects.
I have a table that has employee names, what project they are working on, and how much profit they bring to their specific project each day.
My first query gives 3 fields - The project name, the sum of all the profits the employees bring to the project, and the number of employees in the project.
In my second query I am trying to display 2 fields: the project name and the average profit per employee that each project makes.
SELECT SAYSquery.ProjectName, SUM(SAYSquery.Profit) AS SumOfProfit, Count(SAYSquery.[EmpFirstName]) AS NumberOfEmps
FROM SAYSquery
WHERE profit > 0
GROUP BY SAYSquery.ProjectName;
SELECT SAYSqueryNIPE.[ProjectName], SAYSqueryNIPE.[SumOfProfit]/[NumberOfEmps] AS total
FROM SAYSqueryNIPE
GROUP BY SAYSqueryNIPE.[ProjectName], SAYSqueryNIPE.[SumOfProfit]/[NumberOfEmps];
Unfortunately, my second query is giving me the same average profit for every project and I'm not sure why. Any help would be much appreciated.
EDIT:
Query 1 reads:
**Employee Name | Sell Rate | Remuneration | Profit (Sell-Remuneration) | Project Name**
Query 2 reads:
**PROJECT NAME | SumofProfit | NumberofEmployees**
Project X | $1500 | 3 employees
Query 3 reads:
**PROJECT NAME | TOTAL**
Project X | $500 (Average profit per employee)
The problem is in your first query where you count the number of employees. As written in the question, you are returning a count of rows, not employees who worked on the project. You need to use count distinct. I'd also recommend not counting on EmpFirstName. If you have more than one employee with the same first name, the query won't give you correct results. It would be better to use a unique employee identifier instead of their first name.
SELECT SAYSquery.ProjectName, SUM(SAYSquery.Profit) AS SumOfProfit,
Count(distinct SAYSquery.[EmpFirstName]) AS NumberOfEmps
FROM SAYSquery
WHERE profit > 0
GROUP BY SAYSquery.ProjectName
You could wrap the whole thing up into a single query instead of two or three, as described in your question.
SELECT SAYSquery.ProjectName, SUM(SAYSquery.Profit)/Count(distinct SAYSquery.[EmpFirstName]) AS AvgProfitPerEmployee
FROM SAYSquery
WHERE profit > 0
GROUP BY SAYSquery.ProjectName
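
Here is a small sketch of the difference between counting rows and counting distinct employees. It runs on SQLite through Python's sqlite3 module, and the sample rows are invented so that the totals match the $1500 / 3 employees example from the question. Whether your own database accepts COUNT(DISTINCT ...) is something to check separately; not every engine does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SAYSquery (EmpFirstName TEXT, ProjectName TEXT, Profit REAL)"
)
conn.executemany(
    "INSERT INTO SAYSquery VALUES (?, ?, ?)",
    [
        # Ann has two daily rows, so she is counted twice by COUNT(col).
        ("Ann", "Project X", 500),
        ("Ann", "Project X", 250),
        ("Bob", "Project X", 250),
        ("Cid", "Project X", 500),
    ],
)

result = conn.execute(
    """
    SELECT ProjectName,
           SUM(Profit) / COUNT(EmpFirstName)          AS AvgPerRow,
           SUM(Profit) / COUNT(DISTINCT EmpFirstName) AS AvgPerEmployee
    FROM SAYSquery
    WHERE Profit > 0
    GROUP BY ProjectName
    """
).fetchall()
print(result)
```

With one row per employee per day, the first figure is really an average per daily entry; only the second one is an average per employee.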

Retrieving the latest transaction for an item

I have a table that lists lots of item transactions. I need to get the last record for each item (the one dated the latest).
For example my table looks like this:
Item Date TrxLocation
XXXXX 1/1/13 WAREHOUSE
XXXXX 1/2/13 WAREHOUSE
XXXXX 1/3/13 WAREHOUSE
aaaa 1/1/13 warehouse
aaaa 2/1/13 WAREHOUSE
I want the data to come back as follows:
XXXXX 1/3/13 WAREHOUSE
AAAA 2/1/13 WAREHOUSE
I tried doing something like this but it is bringing back the wrong date
select Distinct ITEMNMBR,
TRXLOCTN,
DATERECD
from TEST
where DateRecd = (select max(DATERECD)
from TEST)
Any help is appreciated.
You're on the right track. You just need to change your subquery to a correlated subquery, which means that you give it some context to the outer query. If you just run your subquery (select max(DATERECD) from TEST) by itself, what do you get? You get a single date that is the latest date in the whole table, regardless of item. You need to tie the subquery to the outer query by linking on the ITEMNMBR column, like this:
SELECT ITEMNMBR, TRXLOCTN, DATERECD
FROM TEST t
WHERE DateRecd = (
SELECT MAX (DATERECD)
FROM TEST tMax
WHERE tMax.ITEMNMBR = t.ITEMNMBR)
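
As a quick check of the difference, here is a sketch that runs both forms on the sample data using SQLite through Python's sqlite3 module. The dates are rewritten as ISO strings on the assumption that the originals are month/day/year, and the item names are normalised to upper case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEST (ITEMNMBR TEXT, TRXLOCTN TEXT, DATERECD TEXT)")
conn.executemany(
    "INSERT INTO TEST VALUES (?, ?, ?)",
    [
        ("XXXXX", "WAREHOUSE", "2013-01-01"),
        ("XXXXX", "WAREHOUSE", "2013-01-02"),
        ("XXXXX", "WAREHOUSE", "2013-01-03"),
        ("AAAA", "WAREHOUSE", "2013-01-01"),
        ("AAAA", "WAREHOUSE", "2013-02-01"),
    ],
)

# Uncorrelated: compares every row against the single latest date in the table.
uncorrelated = conn.execute(
    """
    SELECT ITEMNMBR, TRXLOCTN, DATERECD
    FROM TEST
    WHERE DATERECD = (SELECT MAX(DATERECD) FROM TEST)
    ORDER BY ITEMNMBR
    """
).fetchall()

# Correlated: the inner MAX is evaluated per item of the outer row.
correlated = conn.execute(
    """
    SELECT ITEMNMBR, TRXLOCTN, DATERECD
    FROM TEST t
    WHERE DATERECD = (SELECT MAX(DATERECD)
                      FROM TEST tMax
                      WHERE tMax.ITEMNMBR = t.ITEMNMBR)
    ORDER BY ITEMNMBR
    """
).fetchall()

print(uncorrelated)
print(correlated)
```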
No need for subquery. You are querying a single table and need to select MAX(date) and GROUP BY item and TrxLocation.
SELECT Item, max(DATERECD) AS max_dt_recd, TrxLocation
FROM test
GROUP BY Item, TrxLocation
/
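
One caveat with the GROUP BY form: because TrxLocation is part of the grouping key, you get one row per item and location, not one row per item. In the sample data every row says WAREHOUSE, so it makes no difference there. The sketch below (SQLite through Python's sqlite3 module, with an invented item that moved from WAREHOUSE to STORE) shows what happens when the location varies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEST (ITEMNMBR TEXT, TRXLOCTN TEXT, DATERECD TEXT)")
conn.executemany(
    "INSERT INTO TEST VALUES (?, ?, ?)",
    [
        # Hypothetical item whose location changes between transactions.
        ("A", "WAREHOUSE", "2013-01-01"),
        ("A", "STORE", "2013-01-02"),
    ],
)

# Grouping by item AND location yields the latest date per pair.
per_location = conn.execute(
    """
    SELECT ITEMNMBR, MAX(DATERECD) AS max_dt_recd, TRXLOCTN
    FROM TEST
    GROUP BY ITEMNMBR, TRXLOCTN
    ORDER BY TRXLOCTN
    """
).fetchall()
print(per_location)  # two rows for item A, one per location
```

If you need exactly one row per item together with the location of that latest transaction, a correlated subquery on the item avoids this.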

JavaDB: get ordered records in the subquery

I have the following "COMPANIES_BY_NEWS_REPUTATION" in my JavaDB database (this is some random data just to represent the structure)
COMPANY | NEWS_HASH | REPUTATION | DATE
-------------------------------------------------------------------
Company A | 14676757 | 0.12345 | 2011-05-19 15:43:28.0
Company B | 454564556 | 0.78956 | 2011-05-24 18:44:28.0
Company C | 454564556 | 0.78956 | 2011-05-24 18:44:28.0
Company A | -7874564 | 0.12345 | 2011-05-19 15:43:28.0
One news_hash may relate to several companies while a company can relate to several news_hashes as well. Reputation and date are bound to the news_hash.
What I need to do is calculate the average reputation of the last 5 news items for every company. To do that, I feel that I somehow need to use 'order by' and 'offset' in a subquery, as shown in the code below.
select COMPANY, avg(REPUTATION) from
(select * from COMPANY_BY_NEWS_REPUTATION order by "DATE" desc
offset 0 rows fetch next 5 row only) as TR group by COMPANY;
However, JavaDB allows neither ORDER BY, nor OFFSET in a subquery. Could anyone suggest a working solution for my problem please?
Which version of JavaDB are you using? According to the chapter TableSubquery in the JavaDB documentation, table subqueries do support order by and fetch next, at least in version 10.6.2.1.
Given that subqueries can be ordered and the size of the result set can be limited, the following (untested) query might do what you want:
select COMPANY, (select avg(REPUTATION)
from (select REPUTATION
from COMPANY_BY_NEWS_REPUTATION
where COMPANY = TR.COMPANY
order by DATE desc
fetch first 5 rows only))
from (select distinct COMPANY
from COMPANY_BY_NEWS_REPUTATION) as TR
This query retrieves all distinct company names from COMPANY_BY_NEWS_REPUTATION, then retrieves the average of the last five reputation rows for each company. I have no idea whether it will perform well enough; that will likely depend on the size of your data set and what indexes you have in place.
If you have a list of unique company names in another table, you can use that instead of the select distinct ... subquery to retrieve the companies for which to calculate averages.
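
To check the shape of the query, here is the same idea run against SQLite through Python's sqlite3 module. This is only a sketch: SQLite is not JavaDB/Derby, `LIMIT 5` stands in for `fetch first 5 rows only`, and the sample rows are invented so that the expected averages are easy to verify by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE COMPANY_BY_NEWS_REPUTATION '
    '(COMPANY TEXT, NEWS_HASH INTEGER, REPUTATION REAL, "DATE" TEXT)'
)
conn.executemany(
    "INSERT INTO COMPANY_BY_NEWS_REPUTATION VALUES (?, ?, ?, ?)",
    [
        # Company A: six news items; the oldest (100.0) must be ignored.
        ("Company A", 1, 100.0, "2011-05-01 10:00:00"),
        ("Company A", 2, 1.0, "2011-05-02 10:00:00"),
        ("Company A", 3, 2.0, "2011-05-03 10:00:00"),
        ("Company A", 4, 3.0, "2011-05-04 10:00:00"),
        ("Company A", 5, 4.0, "2011-05-05 10:00:00"),
        ("Company A", 6, 5.0, "2011-05-06 10:00:00"),
        # Company B: fewer than five items, so all of them are averaged.
        ("Company B", 7, 0.5, "2011-05-01 10:00:00"),
        ("Company B", 8, 1.5, "2011-05-02 10:00:00"),
    ],
)

averages = conn.execute(
    """
    SELECT TR.COMPANY,
           (SELECT AVG(REPUTATION)
            FROM (SELECT REPUTATION
                  FROM COMPANY_BY_NEWS_REPUTATION
                  WHERE COMPANY = TR.COMPANY
                  ORDER BY "DATE" DESC
                  LIMIT 5)) AS AVG_REPUTATION
    FROM (SELECT DISTINCT COMPANY
          FROM COMPANY_BY_NEWS_REPUTATION) AS TR
    ORDER BY TR.COMPANY
    """
).fetchall()
print(averages)
```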