Can someone help me with this join - sql

I need it to give me me a total of 0 for week 33 - 39, but I'm really bad with joining 3 tables and I cant figure it out
Right now it only gives me an answer for dates that there are actual records in the tracker_weld_table.
SELECT SUM(tracker_parts_archive.weight),
WEEK(mycal.dt) as week
FROM
tracker_parts_archive, tracker_weld_archive
RIGHT JOIN
(SELECT dt FROM calendar_table WHERE dt >= '2018-7-1' AND dt <= '2018-10-1') as mycal
ON
weld_worker = '133'AND date(weld_dateandtime) = mycal.dt
WHERE
tracker_weld_archive.tracker_partsID = tracker_parts_archive.id
GROUP BY week

I think you are trying for something like this:
SELECT WEEK(c.dt) as week, COALESCE(SUM(tpa.weight), 0)
FROM calendar_table c left join
tracker_weld_archive tw
on date(tw.weld_dateandtime) = c.dt left join
tracker_parts_archive tp
on tw.tracker_partsID = tp.id and tp.weld_worker = 133
WHERE c.dt >= '2018-07-01' AND c.dt <= '2018-10-01'
GROUP BY week
ORDER BY week;
Notes:
You want to keep all (matching) rows in the calendar table, so it should be first.
All subsequent joins should be LEFT JOINs.
Never use commas in the FROM clause. Always use proper, explicit, standard JOIN syntax.
Write out the full proper date constant -- YYYY-MM-DD. This is an ISO-standard format.
I am guessing that weld_worker is a number, so single quotes are not needed for the comparison.

First, lets start with understanding what you want.. You want totals per week. This means there will be a "GROUP BY" clause (also for any MIN(), MAX(), AVG(), SUM(), COUNT(), etc. aggregates). What is the group BY basis. In this scenario, you want per week. Leading to the next part that you want for a specific date range qualified per your calendar table.
I would start in order what WHAT filtering criteria first. Also, ALWAYS TRY to identify all table( or alias).column in your queries so anyone after you knows where the columns are coming from, especially when multiple tables. In this case "ct" is the ALIAS for "Calendar_Table"
SELECT
ct.dt
from
calendar_table ct
where
ct.dt >= '2018-07-01'
AND ct.dt <= '2018-10-01'
Now, the above date looks to be INCLUSIVE of October 1 and looks like you are trying to generate a quarterly sum from July, Aug, Sept. I would change to LESS than Oct 1.
Now, your calendar has many days and you want it grouped by week, so the WEEK() function gets you that distinct reference without explicitly checking every date. Also, try NOT to use reserved keywords as final column names... makes for confusion later on sometimes.
I have aliased the column name as "WeekBasis". Here, I did a COUNT(*) just to show the total days and the group by showing it in context.
SELECT
WEEK( ct.dt ) WeekBasis,
MIN( ct.dt ) as FirstDayOfThisWeek,
MAX( ct.dt ) as LastDayOfThisWeek,
COUNT(*) as DaysInThisWeek
from
calendar_table ct
where
ct.dt >= '2018-07-01'
AND ct.dt <= '2018-10-01'
group by
WEEK( ct.dt )
So, at this point, we have 1 record per week within the date period you are concerned,
but I also grabbed the earliest and latest dates just to show other components too.
Now, lets get back to your extra tables. We know the dates in question, now need to
get the details from the other tables (which is lacking in the post. You should post
critical components such as how tables are related via common / joined column basis.
How is tracker_part_archive related to tracker_weld_archive??
To simplify your query, you dont even NEED your calendar table as the welding
table HAS a date field and you know your range. Just query against that directly.
IF your worker's ID is numeric, don't add quotes around it, just leave as a number.
SELECT
WEEK( twa.Weld_DateAndTime ) WeekBasis,
COUNT(*) WeldingEntriesDone,
SUM(tpa.weight) TotalWeight
from
tracker_weld_archive twa
JOIN tracker_parts_archive tpa
-- GUESSING on therelationship here.
-- may also be on a given date too???
-- all pieces welded by a person on a given date
ON twa.weld_worker = tpa.weld_worker
AND twa.Weld_DateAndTime = tpa.Weld_DateAndTime
where
twa.Weld_Worker = 133
AND twa.Weld_DateAndTime >= '2018-07-01'
AND twa.Weld_DateAndTime <= '2018-10-01'
group by
WEEK( twa.Weld_DateAndTime )
IF you provide the table structures AND sample data, this can be refined a bit more for you.

Related

How to calculated Working Hours over 17 week period

I am trying to work out How many field engineers work over 48 hours over a 17 week period. (by law you cannot work over 48 hours over a 17 week period)
I managed to run the query for 1 Engineer but when I run it without an Engineer filter my query is very slow.
I need to get the count of Engineers working over 48 hours and count under 48 hours then get an Average time worked per week.
Note: I am doing a Union on SPICEMEISTER & SMARTMEISTER because they are our old and new databases.
• How many field engineers go over the 48 hours
• How many field engineers are under the 48 hours
• What is the average time worked per week for engineers
SELECT DS_Date
,TechPersNo
FROM
(
SELECT DISTINCT
SMDS.EPL_DAT as DS_Date
,EN.pers_no as TechPersNo
FROM
[SpiceMeister].[FS_OTBE].[EngPayrollNumbers] EN
INNER JOIN
[SmartMeister].[Main].[PlusDailyKopf] SMDS
ON RIGHT(CAST(EN.[TechnicianID] AS CHAR(10)),5) = SMDS.PRPO_TECHNIKERNR
WHERE
SMDS.EPL_DAT >= '2017-01-01'
and
SMDS.EPL_DAT < '2018-03-01'
UNION ALL
SELECT DISTINCT
SPDS.DailySummaryDate as DS_Date
,EN.pers_no as TechPersNo
FROM
[SpiceMeister].[FS_OTBE].[EngPayrollNumbers] EN
INNER JOIN
[SpiceMeister].[FS_DS_BO].[DailySummaryHeader] SPDS
ON EN.TechnicianID = SPDS.TechnicianID
WHERE
SPDS.DailySummaryDate >= '2018-03-01'
) as Techa
where TechPersNo = 850009
) Tech
cross APPLY
Fast results
The slowness is definitely due to the use of cross apply with a correlated subquery. This will force the computation on a per-row basis and prevents SQL Server from optimizing anything.
This seems more like it should be a 'group by' query, but I can see why you had trouble making it up on account of the complex cumulative calculation in which you need output by person and by date, but the average involves not the date in question but a date range ending on the date in question.
What I would do first is make a common query to capture your base data between the two datasets. That's what I do in the 'dailySummaries' common table expression below. Then I would join dailySummaries onto itself matching by the employee and selecting the date range required. From that, I would group by employee and date, aggregating by the date range.
with
dailySummaries as (
select techPersNo = en.pers_no,
ds_date = smds.epl_dat,
dtDif = datediff(minute, smds.abfahrt_zeit, smds.rueck_zeit)
from smartMeister.main.plusDailyKopf
join smartMeister.main.plusDailyKopf smds
on right(cast(en.technicianid as char(10)),5) = smds.prpo_technikernr
where smds.epl_dat < '2018-03-01'
union all
select techPersNo = en.pers_no,
dailySummaryDate,
datediff(minute,
iif(spds.leaveHome < spds.workStart, spds.leaveHome, spds.workStart),
iif(spds.arrivehome > spds.workEnd, spds.arrivehome, spds.workEnd)
)
from spiceMeister.fs_ds_bo.dailySummaryHeader spds
join spiceMeister.fs_ds_bo.dailySummaryHeader spds
on en.TechnicianID = spds.TechnicianID
where spds.DailySummaryDate >= '2018-03-01'
)
select ds.ds_date,
ds.techPersNo,
AvgHr = convert(real, sum(dsPrev.dtDif)) / (60*17)
from dailySummaries ds
left join dailySummaries dsPrev
on ds.techPersNo = dsPrev.techPersNo
and dsPrev.ds_date between dateadd(day,-118, ds.ds_date) and ds.ds_date
where ds.ds_date >= '2017-01-01'
group by ds_date,
techPersNo
order by ds_date
I may have gotten a thing or two wrong in translation but you get the idea.
In the future, post a more minimal example. The union of the two datasets from the separate databases is not central to the problem you were trying to ask about. A lot of the date filterings aren't core to the question. The special casting in your join match logic is not important. These things cloud the real issue.

ORA-00907: missing right parenthesis using count with expression inside it

SELECT Hotel_Name, COUNT(H_CHECK.Hotel_checkIn >= 'JUL-1-2016' AND H_CHECK.Hotel_checkIn <= 'JUL-31-2016') FROM HOTEL, H_CHECK
GROUP BY Hotel_Name
ORA-00907: missing right parenthesis
I have tried putting Parenthesis in many ways, but I couldn't find the solution. I'm using Oracle Application Express 11G.
This is the query:
Display the hotel name that has more than 2 customers checked in on July 2016.
Once you fix your immediate syntax problem, you need proper JOIN syntax.
One way to fix the problem is simply to move the conditions to a WHERE clause, resulting in a query like this:
SELECT Hotel_Name, COUNT(hc.hotel_id)
FROM HOTEL h LEFT JOIN
H_CHECK hc
ON h.hotel_id = hc.hotel_id -- I don't know what the right join condition is
WHERE hc.Hotel_checkIn >= DATE '2016-07-01' AND
hc.Hotel_checkIn <= DATE '2016-07-31'
GROUP BY Hotel_Name;
You cannot count based on condition in Select statement of your sql query.
COUNT (
H_CHECK.Hotel_checkIn >= 'JUL-1-2016'
AND H_CHECK.Hotel_checkIn <= 'JUL-31-2016')
This is wrong. You can do it like>
SELECT Hotel_Name,
COUNT (1)
FROM HOTEL
join H_CHECK
ON H_CHECK.Hotel_checkIn >= 'JUL-1-2016'
AND H_CHECK.Hotel_checkIn <= 'JUL-31-2016'
GROUP BY Hotel_Name
having count(1) > 2;
You're missing the CASE ... END from your conditions inside your count. You're after something like:
SELECT Hotel_Name,
COUNT(case when H_CHECK.Hotel_checkIn >= 'JUL-1-2016'
AND H_CHECK.Hotel_checkIn <= 'JUL-31-2016'
then 1
end)
FROM HOTEL, H_CHECK
GROUP BY Hotel_Name;
However, there are a number of concerns I have regarding your query:
if your hotel_checkin column is of DATE datatype, then you should be comparing it to DATEs not strings. I.e. H_CHECK.Hotel_checkIn >= to_date('07-01-2016', 'mm-dd-yyyy') - this way, you avoid relying on implicit conversion of the string into a date, which relies on your NLS_DATE_FORMAT parameter setting. This could be changed and may cause your query to fail.
FROM HOTEL, H_CHECK Don't use the old-style method of joining; instead, use the ANSI style method: FROM hotel cross join h_check
Did you really mean the join to be a cross join or did you forget to add the join conditions?
You should alias the hotel_name column to aid maintainability.
You should also give your count column a name.
Since you're only counting rows with a checkin between 1st and 31st July 2016, you should move this condition into the where clause (as XING has shown in their answer) *unless* you need to also show hotels that don't have any checkins within that time period.
Your condition assumes that there are no time elements to the hotel_checkin column - ie. everything is set to midnight. If, however, you could have a date with a time, bear in mind that your count will ignore all rows with a checkin date of 31st July 2016 that are after midnight. In which case, your upper bound needs to change to: H_CHECK.Hotel_checkIn < to_date('08-01-2016', 'mm-dd-yyyy')

How to calculate difference between two rows in a date interval?

I'm trying to compare data from an Access 2010 database based on a date interval. Example I have items from various purchase orders and I want to maintain the history of these item's delivery to a warehouse. So my purchase order has a request for a quantity of 10 of a material, for example, and it can be partially delivered in many deliveries and I want to know how this delivery varied in a date interval. To fill the date field the criteria used is the following: if the item had an update in the QtyPending field, I copy the current row deactivating it with a booelan field, create a new entry with the current update date updating the QtyPending field, so the active record is the actual state of the item. So I have a table that holds informations about these items like that
PO POItem QtyPending Date Active
4500000123 10 10 01/09/2014 FALSE
4500000123 10 8 05/09/2014 TRUE
4500000122 30 5 03/09/2014 FALSE
4500000122 30 1 04/09/2014 TRUE
With this example, for the first item, it means that from date 01/09 to 04/09 the QtyPending field didn't suffer a variation, meaning that the supplier didn't make any delivery to me, but from 01/09 to 05/08 he delivered me a qty of 2 of a material. For the second one, from date 03/09 to 04/09 the supplier delivered me a qty of 4 of a material. So, if I were to be making a report query from 02/09/2014 to 04/09/2014, the expected output is like this:
PO POItem QtyDelivered
4500000123 10 0
4500000122 30 4
And a report from 31/08/2014 to 10/09/2014, would have this output
PO POItem QtyDelivered
4500000123 10 2
4500000122 30 4
I'm not coming up with a query to make this report. Can anyone help me?
There are many ways of solving this. The easiest one would be to simply make a query of all the necessary records between two dates, loop over them and insert into a temporary table the result. This temporary table can then be the source of your report. A lot of people will scream at you for not using a big query instead but getting the result that you want in the fastest and simplest way should be your priority.
Your problem with your schema is that you don't have the QtyDelivered stored for each record. If you would have it, it would be an easy thing to sum over it in order to get needed result. By not storing this value, you have transformed a simple and fast query into a much harder and slower one because you need to recalculate this value in some way or other and you must do this without forgetting the fact that it's possible to have more than two records.
For calculating this value, you can either use a sub-query to retrieve the value from the previous row or a Left join do to the same. Once you have this value, you can subtract these two to get the needed difference; allowing for the possibility of Null value if there is no previous row. Once you have these values, you can now sum over them to get the final result with a Group By. Notice that in order to perform these calculations, you need to have one or two more levels of subquery. The first query should be something like:
Select PO, POItem, QtyPending, (Select Top 1 QtyPending from MyTable T2 where T1.PO = T2.PO and T2.Date < T1.Date And (T2.Date between #Date1 and #Date2) Order by T2.Date Desc) as QtyPending2 from MyTable T1 Where T1.Date between #Date1 and #Date2) ...
With this as either another subquery or as a View, you can then compute the desired difference by comparing the values of QtyPending and QtyPending2; without forgetting that QtyPendin2 may be Null. The remaining steps are easy to do.
Notice that the above example is for SQL-Server, you might have to change it a little for Access. In any case, you can find here many examples on how to compare two rows under Access. As noted earlier, you can also use a Left Join instead of a subquery to compare your rows.
I came up with this query that solved the problem, it wasn't that simple
SELECT
ItmDtIni.PO
,ItmDtIni.POItem AS [PO Item]
,ROUND(ItmDtIni.QtyPending - ItmDtEnd.QtyPending, 3) AS [Qty Delivered]
,ROUND((ItmDtIni.QtyPending - ItmDtEnd.QtyPending) * ItmDtEnd.Price, 2) AS [Value delivered(US$)]
//Filtering subqueries to bring only the items in the date interval to make a self join
FROM (((SELECT
PO
,POItem
,QtyPending
,MIN(Date) AS MinDate
FROM Item
WHERE Date BETWEEN FORMAT(begin_date, 'dd/mm/yyyy') AND FORMAT(end_date, 'dd/mm/yyyy')
GROUP BY
PO
,POItem
,QtyPending) AS ItmDtIni
//Self join filtering to bring only items in the date interval with the previously filtered table
INNER JOIN (SELECT
PO
,POItem
,QtyPending
,Price
,MAX(Date) AS MaxDate
FROM Item
WHERE Date BETWEEN FORMAT(begin_date, 'dd/mm/yyyy') AND FORMAT(end_date, 'dd/mm/yyyy')
GROUP BY
PO
,POItem
,QtyPending
,Price) AS ItmDtEnd
ON ItmDtIni.PO = ItmDtEnd.PO
AND ItmDtIni.POItem = ItmDtEnd.POItem)
INNER JOIN PO
ON ItmDtEnd.PO = PO.Numero)
WHERE
//Showing only items that had a variation in the date interval
ROUND(ItmDtIni.QtyPending - ItmDtEnd.QtyPending, 3) <> 0
//Anchoring min date in the interval for each item found by the first subquery
AND ItmDtIni.MinDate = (SELECT MIN(Item.Date)
FROM Item
WHERE
ItmDtIni.PO = Item.PO
AND ItmDtIni.POItem = Item.POItem
AND Date BETWEEN FORMAT(begin_date, 'dd/mm/yyyy') AND FORMAT(end_date, 'dd/mm/yyyy'))
//Anchoring max date in the interval for each item found by the second subquery
AND ItmDtEnd.MaxDate = (SELECT MAX(Item.Date)
FROM Item
WHERE
ItmDtEnd.PO = Item.PO
AND ItmDtEnd.POItem = Item.POItem
AND Date BETWEEN FORMAT(begin_date, 'dd/mm/yyyy') AND FORMAT(end_date, 'dd/mm/yyyy'))

SQL: Average value per day

I have a database called ‘tweets’. The database 'tweets' includes (amongst others) the rows 'tweet_id', 'created at' (dd/mm/yyyy hh/mm/ss), ‘classified’ and 'processed text'. Within the ‘processed text’ row there are certain strings such as {TICKER|IBM}', to which I will refer as ticker-strings.
My target is to get the average value of ‘classified’ per ticker-string per day. The row ‘classified’ includes the numerical values -1, 0 and 1.
At this moment, I have a working SQL query for the average value of ‘classified’ for one ticker-string per day. See the script below.
SELECT Date( `created_at` ) , AVG( `classified` ) AS Classified
FROM `tweets`
WHERE `processed_text` LIKE '%{TICKER|IBM}%'
GROUP BY Date( `created_at` )
There are however two problems with this script:
It does not include days on which there were zero ‘processed_text’s like {TICKER|IBM}. I would however like it to spit out the value zero in this case.
I have 100+ different ticker-strings and would thus like to have a script which can process multiple strings at the same time. I can also do them manually, one by one, but this would cost me a terrible lot of time.
When I had a similar question for counting the ‘tweet_id’s per ticker-string, somebody else suggested using the following:
SELECT d.date, coalesce(IBM, 0) as IBM, coalesce(GOOG, 0) as GOOG,
coalesce(BAC, 0) AS BAC
FROM dates d LEFT JOIN
(SELECT DATE(created_at) AS date,
COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|IBM}%' then tweet_id
END) as IBM,
COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|GOOG}%' then tweet_id
END) as GOOG,
COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|BAC}%' then tweet_id
END) as BAC
FROM tweets
GROUP BY date
) t
ON d.date = t.date;
This script worked perfectly for counting the tweet_ids per ticker-string. As I however stated, I am not looking to find the average classified scores per ticker-string. My question is therefore: Could someone show me how to adjust this script in such a way that I can calculate the average classified scores per ticker-string per day?
SELECT d.date, t.ticker, COALESCE(COUNT(DISTINCT tweet_id), 0) AS tweets
FROM dates d
LEFT JOIN
(SELECT DATE(created_at) AS date,
SUBSTR(processed_text,
LOCATE('{TICKER|', processed_text) + 8,
LOCATE('}', processed_text, LOCATE('{TICKER|', processed_text))
- LOCATE('{TICKER|', processed_text) - 8)) t
ON d.date = t.date
GROUP BY d.date, t.ticker
This will put each ticker on its own row, not a column. If you want them moved to columns, you have to pivot the result. How you do this depends on the DBMS. Some have built-in features for creating pivot tables. Others (e.g. MySQL) do not and you have to write tricky code to do it; if you know all the possible values ahead of time, it's not too hard, but if they can change you have to write dynamic SQL in a stored procedure.
See MySQL pivot table for how to do it in MySQL.

SQL to calculate value of Shares at a particular time

I'm looking for a way that I can calculate what the value of shares are at a given time.
In the example I need to calculate and report on the redemptions of shares in a given month.
There are 3 tables that I need to look at:
Redemptions table that has the Date of the redemption, the number of shares that were redeemed and the type of share.
The share type table which has the share type and links the 1st and 3rd tables.
The Share price table which has the share type, valuation date, value.
So what I need to do is report on and have calculated based on the number of share redemptions the value of those shares broken down by month.
Does that make sense?
Thanks in advance for your help!
Apologies, I think I should elaborate a little further as there might have been some misunderstandings. This isn't to calculate daily changing stocks and shares, it's more for fund management. What this means is that the share price only changes on a monthly basis and it's also normally a month behind.
The effect of this is that the what the query needs to do, is look at the date of the redemption, work out the date ie month and year. Then look at the share price table and if there's a share price for the given date (this will need to be calculated as it will be a single day ie the price was x on day y) then multiple they number of units by this value. However, if there isn't a share price for the given date then use the last price for that particular share type.
Hopefully this might be a little more clear but if there's any other information I can provide to make this easier then please let me know and I'll supply you with the information.
Regards,
Phil
This should do the trick (note: updated to group by ShareType):
SELECT
ST.ShareType,
RedemptionMonth = DateAdd(month, DateDiff(month, 0, R.RedemptionDate), 0),
TotalShareValueRedeemed = Sum(P.SharePrice * R.SharesRedeemed)
FROM
dbo.Redemption R
INNER JOIN dbo.ShareType ST
ON R.ShareTypeID = ST.ShareTypeID
CROSS APPLY (
SELECT TOP 1 P.*
FROM dbo.SharePrice P
WHERE
R.ShareTypeID = P.ShareTypeID
AND R.RedemptionDate >= P.SharePriceDate
ORDER BY P.SharePriceDate DESC
) P
GROUP BY
ShareType,
DateAdd(month, DateDiff(month, 0, R.RedemptionDate), 0)
ORDER BY
ShareType,
RedemptionMonth
;
See it working in a Sql Fiddle.
This can easily be parameterized by simply adding a WHERE clause with conditions on the Redemption table. If you need to show a 0 for share types in months where they had no Redemptions, please let me know and I'll improve my answer--it would help if you would fill out your use case scenario a little bit, and describe exactly what you want to input and what you want to see as output.
Also please note: I'm assuming here that there will always be a price for a share redemption--if a redemption exists that is before any share price for it, that redemption will be excluded.
If you have the valuations for every day, then the calculation is a simple join followed by an aggregation. The resulting query is something like:
select year(redemptiondate), month(redemptiondate),
sum(r.NumShares*sp.Price) as TotalPrice
from Redemptions r left outer join
ShareType st
on r.sharetype = st.sharetype left outer join
SharePrice sp
on st.sharename = sp.sharename and r.redemptiondate = sp.pricedate
group by year(redemptiondate), month(redemptiondate)
order by 1, 2;
If I understand your question, you need a query like
select shares.id, shares.name, sum (redemption.quant * shareprices.price)
from shares
inner join redemption on shares.id = redemption.share
inner join shareprices on shares.id = shareprices.share
where redemption.curdate between :p1 and :p2
order by shares.id
group by shares.id, shares.name
:p1 and :p2 are date parameters
If you just need it for one date range:
SELECT s.ShareType, SUM(ISNULL(sp.SharePrice, 0) * ISNULL(r.NumRedemptions, 0)) [RedemptionPrice]
FROM dbo.Shares s
LEFT JOIN dbo.Redemptions r
ON r.ShareType = s.ShareType
OUTER APPLY (
SELECT TOP 1 SharePrice
FROM dbo.SharePrice p
WHERE p.ShareType = s.ShareType
AND p.ValuationDate <= r.RedemptionDate
ORDER BY p.ValuationDate DESC) sp
WHERE r.RedemptionDate BETWEEN #Date1 AND #Date2
GROUP BY s.ShareType
Where #Date1 and #Date2 are your dates
The ISNULL checks are just there so it actually gives you a value if something is null (it'll be 0). It's completely optional in this case, just a personal preference.
The OUTER APPLY acts like a LEFT JOIN that will filter down the results from SharePrice to make sure you get the most recent ValuationDate from table based on the RedemptionDate, even if it wasn't from the same date range as that date. It could probably be achieved another way, but I feel like this is easily readable.
If you don't feel comfortable with the OUTER APPLY, you could use a subquery in the SELECT part (i.e., ISNULL(r.NumRedemptions, 0) * (/* subquery from dbo.SharePrice here */)