I have 2 tables, UserSession and Sale.
Table UserSession has 4 columns, namely UserID, UserSessionID, SessionOpenDate and SessionCloseDate.
Table Sale has 4 columns, which are price, cost, userID and CompletedDate.
When a user logs in, a new row is created in the UserSession table: the login timestamp is saved in SessionOpenDate and a new UserSessionID is assigned to the session. When the user logs off, the log-off timestamp is saved in SessionCloseDate.
While the user is still logged in, the user can make sales, and the sale information is saved in the Sale table. The timestamp when the sale is finalized is saved in the CompletedDate column.
I need to get all the sales made in a certain UserSessionID, where the CompletedDate must be between the SessionOpenDate and SessionCloseDate. However, if the user has not logged off yet, which means that the value in SessionCloseDate is null, the CompletedDate should be between SessionOpenDate and now.
Here's my query:
SELECT SUM(s.cost) AS Cost, SUM(s.price) AS Price
FROM Sale AS s
INNER JOIN UserSession AS u
ON s.userID = u.userID
WHERE
(s.CompletedDate >=
( SELECT SessionOpenDate
FROM UserSession
WHERE (UserSessionID = u.UserSessionID)
)
)
AND
(s.CompletedDate <
(
IF EXISTS
(
SELECT SessionCloseDate AS closeTime
FROM UserSession AS UserSessionTemp
WHERE (UserSessionID = u.UserSessionID)
)
BEGIN
SET closeTime = SELECT CURRENT_TIMESTAMP
END
)
)
AND u.UserSessionID IN (1)
However, SQL Server reports "Incorrect syntax near the keyword 'IF'." and "Incorrect syntax near ')'."
Can anyone tell me what is wrong with my IF block?
You can't use an IF block inside a SELECT statement. Also, I don't know what you're really trying to accomplish with SET, since closeTime is not a variable/parameter.
You can use IIF in SQL Server 2012 (syntactical sugar for CASE WHEN <condition> THEN <true_value> ELSE <false_value> END - use this syntax for earlier versions):
IIF(EXISTS
(
SELECT SessionCloseDate AS closeTime
FROM UserSession AS UserSessionTemp
WHERE (UserSessionID = u.UserSessionID)
), (
SELECT SessionCloseDate AS closeTime
FROM UserSession AS UserSessionTemp
WHERE (UserSessionID = u.UserSessionID)
),
CURRENT_TIMESTAMP
)
Honestly, without getting too complicated, here's what I would do instead:
SELECT SUM(s.cost) AS Cost, SUM(s.price) AS Price
FROM userSession AS u
INNER JOIN Sale AS s ON u.userID = s.userID
WHERE u.UserSessionID = @UserSessionId
AND s.CompletedDate >= u.SessionOpenDate
AND (u.SessionCloseDate IS NULL OR s.CompletedDate < u.SessionCloseDate)
Or,
SELECT SUM(s.cost) AS Cost, SUM(s.price) AS Price
FROM userSession AS u
INNER JOIN Sale AS s ON u.userID = s.userID
WHERE u.UserSessionID = @UserSessionId
AND s.CompletedDate BETWEEN u.SessionOpenDate
AND COALESCE(u.SessionCloseDate, '12/31/9999 23:59:59.9999')
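If you want to try the NULL handling without a SQL Server instance, here is a rough sketch using Python's built-in sqlite3 module. The sample rows, the session_totals helper and the ? placeholder (standing in for the session id parameter) are my own assumptions for illustration; the WHERE clause is the first query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserSession (UserSessionID INTEGER, UserID INTEGER,
                          SessionOpenDate TEXT, SessionCloseDate TEXT);
CREATE TABLE Sale (price REAL, cost REAL, userID INTEGER, CompletedDate TEXT);

-- Session 1 is closed, session 2 is still open (NULL close date).
INSERT INTO UserSession VALUES (1, 10, '2020-01-01 09:00:00', '2020-01-01 17:00:00');
INSERT INTO UserSession VALUES (2, 10, '2020-01-02 09:00:00', NULL);

INSERT INTO Sale VALUES (100, 60, 10, '2020-01-01 10:00:00');  -- inside session 1
INSERT INTO Sale VALUES (50, 30, 10, '2020-01-01 18:00:00');   -- between sessions
INSERT INTO Sale VALUES (20, 5, 10, '2020-01-02 10:00:00');    -- inside session 2
""")

QUERY = """
SELECT SUM(s.cost) AS Cost, SUM(s.price) AS Price
FROM UserSession AS u
INNER JOIN Sale AS s ON u.UserID = s.userID
WHERE u.UserSessionID = ?
  AND s.CompletedDate >= u.SessionOpenDate
  AND (u.SessionCloseDate IS NULL OR s.CompletedDate < u.SessionCloseDate)
"""

def session_totals(session_id):
    """Return (cost, price) summed over the sales that fall inside the session."""
    return conn.execute(QUERY, (session_id,)).fetchone()

closed_totals = session_totals(1)  # only the 10:00 sale on Jan 1 qualifies
open_totals = session_totals(2)    # open session: no upper bound is applied
```

The timestamps are stored as ISO-8601 text here, which compares correctly as strings in SQLite; in SQL Server you would use real datetime columns.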
I'd go with something like this, depending on what you're looking for. I couldn't tell if you wanted the info for a particular user or session, but that's simple enough to add.
SELECT UserSession.UserSessionID, SUM(sale.cost) AS TotalCost, SUM(sale.price) AS TotalPrice
FROM UserSession LEFT OUTER JOIN sale
ON UserSession.userid = sale.userid AND
((UserSession.SessionCloseDate IS NULL AND sale.CompletedDate BETWEEN UserSession.SessionOpenDate AND GetDate())
OR (UserSession.SessionCloseDate IS NOT NULL AND sale.CompletedDate BETWEEN UserSession.SessionOpenDate AND UserSession.SessionCloseDate))
GROUP BY UserSession.UserSessionID
HAVING SUM(sale.cost) > 0
(You can add WHERE UserSession.UserSessionID = 'mysessionid' or WHERE UserSession.UserID = 'myuserid' above the GROUP BY if you don't want the full list. Note that a filter on an aggregate such as SUM(cost) has to go in HAVING, not WHERE.)
Others have explained the specific problem and "given you a fish". I would like to "teach you how to fish", though.
Please see mixed-up statement types for a full discussion (disclosure: my own blog post). Here are some snippets.
An expression consists of one or more literal values or functions tied together with operators, which when evaluated in the correct order result in a value or collection of values.
Snip...
Procedural statements are called that because there is some procedure that must be followed. It isn't a simple case of order-of-operations resulting in a single value. There is in fact no value expressed at all.
Snip...
Now that you know the three main kinds of statements (and I won't rule out the possibility of there being more or of there being subclassifications of these) the key concept you must know to get along well with SQL Server is that when a certain kind of statement is expected, you can't use a different one in its place.
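To make the distinction concrete, here is a small sketch of my own (run through Python's sqlite3 module, so it is SQLite rather than SQL Server, and the table contents are made up). CASE is an expression, so it is accepted wherever a value is expected, which is exactly where the IF block in the question was rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserSession (UserSessionID INTEGER, SessionCloseDate TEXT)")
conn.executemany(
    "INSERT INTO UserSession VALUES (?, ?)",
    [(1, "2020-01-01 17:00:00"), (2, None)],
)

# CASE evaluates to a value, so it can sit in the select list
# (or in WHERE / ON) in place of a column or literal.
rows = conn.execute("""
    SELECT UserSessionID,
           CASE WHEN SessionCloseDate IS NULL
                THEN 'still open'
                ELSE SessionCloseDate
           END AS effective_close
    FROM UserSession
    ORDER BY UserSessionID
""").fetchall()
```

In the original query you would return CURRENT_TIMESTAMP instead of the 'still open' label; I used a fixed string only to keep the result repeatable.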
Related
I am using a NOT EXISTS clause in my query and want to make sure it is working correctly, since I am getting fewer rows than expected.
SELECT DISTINCT offer.courier_uuid,
offer.region_uuid,
offer.offer_time_local,
Cast(scores.acceptance_rate AS DECIMAL(5, 3)) AS acceptance_rate
FROM integrated_delivery.trip_offer_fact offer
JOIN integrated_product.driver_score_v2 scores ON offer.courier_uuid = scores.courier_id
AND offer.region_uuid = scores.region_id
AND offer.business_day BETWEEN date '2019-04-04' AND date '2019-04-07'
AND scores.extract_dt = 20190331
AND NOT EXISTS
(SELECT NULL
FROM source_cassandra_courier_scheduling.assigned_block_by_id_v2 sched
JOIN source_cassandra_delivery.region r ON sched.region_id = r.id
WHERE offer.courier_uuid = sched.courier_id
AND offer.offer_time_local >= date_parse(date_format(AT_TIMEZONE("start",r.time_zone),'%Y-%m-%d %H:%i:%s'),'%Y-%m-%d %H:%i:%s')
AND offer.offer_time_local <= date_parse(date_format(AT_TIMEZONE("end",r.time_zone),'%Y-%m-%d %H:%i:%s'),'%Y-%m-%d %H:%i:%s')
AND element_at(sched.state,-1) = 'ASSIGNED')
ORDER BY 3
Is there anything wrong with my NOT EXISTS clause? I am only asking because I am getting back fewer rows than expected. The NOT EXISTS clause contains a time conversion, but I don't think that would affect anything.
I am trying to get all possible ids and their offer times that do NOT exist in the scheduled shifts table. I want to confirm whether my NOT EXISTS clause is correct, or whether I need something else to correctly pull all records that do or do not exist in that sched table.
I'm trying to include a column calculated as a % of OTYPE.
IE
Order type | Status | volume of orders at each status | % of all orders at this status
SELECT
T.OTYPE,
STATUS_CD,
COUNT(STATUS_CD) AS STATVOL,
(STATVOL / COUNT(ROW_ID)) * 100
FROM Database.S_ORDER O
LEFT JOIN /* Finding definitions for status codes & attaching */
(
SELECT
ROW_ID AS TYPEJOIN,
"NAME" AS OTYPE
FROM database.S_ORDER_TYPE
) T
ON T.TYPEJOIN = ORDER_TYPE_ID
GROUP BY (T.OTYPE, STATUS_CD)
/*Excludes pending and pending online orders */
WHERE CAST(CREATED AS DATE) = '2018/09/21' AND STATUS_CD <> 'Pending'
AND STATUS_CD <> 'Pending-Online'
ORDER BY T.OTYPE, STATUS_CD DESC
OTYPE            STATUS_CD             STATVOL  TOTALPERC
Add New Service  Provisioning            2,740        100
Add New Service  In-transit                 13        100
Add New Service  Error - Provisioning      568        100
Add New Service  Error - Integration         1        100
Add New Service  Complete               14,387        100
Current output just puts 100 at every line, need it to be a % of total orders
Could anyone help out a Teradata & SQL student?
The complication making this difficult is my understanding of the group by and count syntax is tenuous. It took some fiddling to get it displayed as I have it, I'm not sure how to introduce a calculated column within this combo.
Thanks in advance
There are a couple of places the total could be computed, but this is the way I would do it. I also removed your other subquery, which was not required, and changed the date to a non-ambiguous format (change it back if it causes an issue in Teradata).
SELECT
T."NAME" as OTYPE,
STATUS_CD,
COUNT(STATUS_CD) AS STATVOL,
COUNT(STATUS_CD)*100/TotalVol as Pct
FROM database.S_ORDER O
LEFT JOIN EDWPRDR_VW40_SBLCPY.S_ORDER_TYPE T on T.ROW_ID = ORDER_TYPE_ID
cross join (select count(*) as TotalVol from database.S_ORDER) Tot
WHERE CAST(CREATED AS DATE) = '2018-09-21' AND STATUS_CD <> 'Pending' AND STATUS_CD <> 'Pending-Online'
GROUP BY T."NAME", STATUS_CD, TotalVol
ORDER BY T."NAME", STATUS_CD DESC
A where clause comes before a group by clause, so the query shown in the question isn't valid.
Always prefix every column reference with the relevant table alias, below I have assumed that where you did not use the alias that it belongs to the orders table.
You probably do not need a subquery for this left join. While there are times when a subquery is needed or good for performance, this does not appear to be the case here.
Most modern SQL compliant databases provide "window functions", and Teradata does do this. They are extremely useful, and here when you combine count() with an over clause you can get the total of all rows without needing another subquery or join.
Because there is neither sample data nor expected result provided with the question I do not actually know which numbers you really need for your percentage calculation. Instead I have opted to show you different ways to count so that you can choose the right ones. I suspect you are getting 100 for each row because the count(status_cd) is equal to the count(row_id). You need to count status_cd differently to how you count row_id. nb: The count() function increases by 1 for every non-null value
I changed the way your date filter is applied. It is not efficient to change data on every row to suit constants in a where clause. Leave the data untouched and alter the way you apply the filter to suit the data, this is almost always more efficient (search sargable)
SELECT
t."NAME" AS OTYPE
, o.STATUS_CD
, COUNT(o.STATUS_CD) count_status
, COUNT(t.ROW_ID) count_row_id
, SUM(COUNT(t.ROW_ID)) OVER () count_row_id_over -- total across all groups
FROM dbo.S_ORDER o
LEFT JOIN dbo.S_ORDER_TYPE t ON t.ROW_ID = o.ORDER_TYPE_ID
/*Excludes pending and pending online orders */
WHERE o.CREATED >= '2018-09-21' AND o.CREATED < '2018-09-22'
AND o.STATUS_CD <> 'Pending'
AND o.STATUS_CD <> 'Pending-Online'
GROUP BY
t."NAME"
, o.STATUS_CD
ORDER BY
t."NAME"
, o.STATUS_CD DESC
As @TomC already noted, there's no need for the join to a Derived Table. The simplest way to get the percentage is based on a Group Sum. I also changed the date to a Standard SQL Date Literal and moved the WHERE before the GROUP BY.
SELECT
t."NAME",
o.STATUS_CD,
Count(o.STATUS_CD) AS STATVOL,
-- rule of thumb: multiply first then divide, otherwise you will get unexpected results
-- (Teradata rounds after each calculation)
100.00 * STATVOL / Sum(STATVOL) Over ()
FROM database.S_ORDER AS O
/* Finding definitions for status codes & attaching */
LEFT JOIN database.S_ORDER_TYPE AS t
ON t.ROW_ID = o.ORDER_TYPE_ID
/*Excludes pending and pending online orders */
-- if o.CREATED is a Timestamp there's no need to apply the CAST
WHERE Cast(o.CREATED AS DATE) = DATE '2018-09-21'
AND o.STATUS_CD NOT IN ('Pending', 'Pending-Online')
GROUP BY t."NAME", o.STATUS_CD
ORDER BY t."NAME", o.STATUS_CD DESC
Btw, you probably don't need an Outer Join, Inner should return the same result.
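For anyone who wants to see the Group Sum idea produce numbers, here is a minimal sketch using Python's sqlite3 module instead of Teradata. It assumes SQLite 3.25 or later for window functions, the five sample rows are invented, and I used a CTE because SQLite, unlike Teradata, does not let you reuse the STATVOL alias in the same select list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE S_ORDER (ROW_ID INTEGER, STATUS_CD TEXT)")
conn.executemany(
    "INSERT INTO S_ORDER VALUES (?, ?)",
    [(1, "Complete"), (2, "Complete"), (3, "Complete"),
     (4, "Provisioning"), (5, "Pending")],
)

rows = conn.execute("""
    WITH per_status AS (
        -- one row per status, pending orders filtered out before grouping
        SELECT STATUS_CD, COUNT(*) AS STATVOL
        FROM S_ORDER
        WHERE STATUS_CD NOT IN ('Pending', 'Pending-Online')
        GROUP BY STATUS_CD
    )
    SELECT STATUS_CD,
           STATVOL,
           -- multiply first, then divide by the total over all groups
           100.0 * STATVOL / SUM(STATVOL) OVER () AS PCT
    FROM per_status
    ORDER BY STATUS_CD
""").fetchall()
```

With three Complete orders and one Provisioning order left after the filter, the two groups come out at 75 and 25 percent.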
I need to select the last advertised date in advert and compare it against any null values in lease, to find each un-leased property and when it was last advertised. This is the code I have so far:
SELECT YR_LEASE.PROPERTYNUM,
MAX(YR_ADVERT.DATETO),
count(YR_LEASE.RENTERNUM)
FROM YR_LEASE
JOIN YR_ADVERT
ON YR_LEASE.PROPERTYNUM=YR_ADVERT.PROPERTYNUM
GROUP BY YR_LEASE.PROPERTYNUM
This returns a count that is far too high, and I'm not sure what I'm doing wrong. Here's my ERD to give this question some context:
http://www.gliffy.com/pubdoc/4239520/L.png
I think you need to first identify unleased properties. From there you can find the latest advert date. Assuming some properties have never been advertised you'll need to go via YR_PROPERTY and do a left join to include unadvertised properties.
SELECT NVL(TO_CHAR(MAX(YR_ADVERT.DATETO),'DD/MM/YYYY'),'NO LAST ADVERT DATE') LAST_ADVERT_DATE
,YR_PROPERTY.PROPERTYNUM
FROM YR_PROPERTY LEFT JOIN YR_ADVERT ON YR_PROPERTY.PROPERTYNUM = YR_ADVERT.PROPERTYNUM
WHERE NOT EXISTS (SELECT 1
FROM YR_LEASE
WHERE YR_LEASE.PROPERTYNUM = YR_PROPERTY.PROPERTYNUM
AND YR_LEASE.RENT_FINISH > SYSDATE)
GROUP BY YR_PROPERTY.PROPERTYNUM;
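Here is a small sketch of the same NOT EXISTS plus LEFT JOIN pattern, run through Python's sqlite3 module rather than Oracle. The rows are made up, the column list is trimmed to what the query needs, and a :today parameter stands in for SYSDATE so the result is repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE YR_PROPERTY (PROPERTYNUM INTEGER);
CREATE TABLE YR_ADVERT (PROPERTYNUM INTEGER, DATETO TEXT);
CREATE TABLE YR_LEASE (PROPERTYNUM INTEGER, RENTERNUM INTEGER, RENT_FINISH TEXT);

INSERT INTO YR_PROPERTY VALUES (1), (2), (3);
INSERT INTO YR_ADVERT VALUES (1, '2020-01-10'), (1, '2020-03-05'), (2, '2020-02-01');
INSERT INTO YR_LEASE VALUES (2, 77, '2999-12-31');  -- property 2 has a current lease
INSERT INTO YR_LEASE VALUES (1, 55, '2019-06-30');  -- property 1's lease has ended
""")

rows = conn.execute("""
    SELECT p.PROPERTYNUM, MAX(a.DATETO) AS LAST_ADVERT_DATE
    FROM YR_PROPERTY p
    LEFT JOIN YR_ADVERT a ON a.PROPERTYNUM = p.PROPERTYNUM
    WHERE NOT EXISTS (SELECT 1
                      FROM YR_LEASE l
                      WHERE l.PROPERTYNUM = p.PROPERTYNUM
                        AND l.RENT_FINISH > :today)
    GROUP BY p.PROPERTYNUM
    ORDER BY p.PROPERTYNUM
""", {"today": "2020-06-01"}).fetchall()
```

Property 2 drops out because it has a lease that is still running, and property 3 is kept with a NULL advert date because the LEFT JOIN preserves properties that were never advertised.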
I had received some really good help earlier, and I appreciate it.
I have another record selection snafu.
I have a parameter that I need to set as the end date.
I need to pull the most recent state before the end date from a table titled state_change.
I need to exclude any records from the report who are not in the required states at that period in time.
{@state} is currently set to state_change.new_state
( {@grouping} = "Orders" and rec_date < {?endDate} and {@state} in [0,2,5] )
OR
( {@grouping} = "Stock" and rec_date < {?endDate} and {@state} in [1,2,3,5,7] )
If I could run a SQL query to pull this information, it would probably work, but I cannot figure out how to do it.
Essentially, I need @state to be:
Select max(new_state)
From state_change
where change_time < {?endDate}
but on a per item level.
Any help would be appreciated.
You'll probably need to use a command object with a parameter for your end date, or create a parameterized stored procedure. The command object will allow you to enter all the sql you need, like joining your results with the max newState value before the end date:
select t.itemID, t.new_state, t.rec_date, mx.max_newState from
(select itemID, new_state, rec_date from table) t inner join
(Select itemID, max(new_state) as max_newState
From state_change
where change_time < {?endDate}
group by itemID) mx on t.itemid = mx.itemID and t.new_state = mx.max_newState
I can't tell if your orders and stock groupings are in the same table, so I'm not sure how you need to limit your sets by the correct state values.
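One caveat: max(new_state) gives the largest state value, not the most recent one. If "most recent" means the row with the latest change_time before the end date, a sketch along these lines may be closer to what you want. This is Python's sqlite3 with invented rows, the itemID column is an assumption carried over from the query above, and :end_date plays the role of {?endDate}.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE state_change (itemID INTEGER, new_state INTEGER, change_time TEXT)"
)
conn.executemany(
    "INSERT INTO state_change VALUES (?, ?, ?)",
    [
        (1, 5, "2020-01-01"),
        (1, 2, "2020-02-01"),  # latest change before the end date, not the max state
        (1, 7, "2020-04-01"),  # after the end date, must be ignored
        (2, 1, "2020-01-15"),
    ],
)

LATEST_STATE = """
SELECT sc.itemID, sc.new_state
FROM state_change sc
JOIN (SELECT itemID, MAX(change_time) AS last_change
      FROM state_change
      WHERE change_time < :end_date
      GROUP BY itemID) mx
  ON mx.itemID = sc.itemID AND mx.last_change = sc.change_time
ORDER BY sc.itemID
"""

rows = conn.execute(LATEST_STATE, {"end_date": "2020-03-01"}).fetchall()
```

Item 1 comes back with state 2 (its last change before the end date) even though it previously held the higher state 5. The same SQL should fit in a Crystal command object with the parameter swapped back in.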
I have a table which records users' scores at a game (a user may submit 5, 10, 20, or as many scores as he wants).
I need to show the top 20 scores of a game, but only one per user (since a user may have submitted, e.g., 4 scores that all rank at the top compared with other users' scores).
The query I have written is:
SELECT DISTINCT
`table_highscores`.`userkey`,
max(`table_highscores`.`score`),
`table_users`.`username`,
`table_highscores`.`dateachieved`
FROM
`table_highscores`, `table_users`
WHERE
`table_highscores`.`userkey` = `table_users`.`userkey`
AND
`table_highscores`.`gamekey` = $gamekey
GROUP BY
`userkey`
ORDER BY
max(`table_highscores`.`score`) DESC
LIMIT 0, 20;
The output is OK, but there is a problem. When I calculate the difference in days (today minus dateachieved), the result is wrong (e.g. instead of saying "the score was submitted 22 days ago", it says 43 days ago). So I have to run a second query for each score to find the true date (meaning 20 extra queries). Is there a shorter way to find the correct date?
Thanks.
there is a problem. When I calculate the difference in days (today minus dateachieved) the result is wrong.
There are two issues:
The dateachieved value returned isn't likely to be the one associated with the high score, because the column is neither aggregated nor listed in the GROUP BY.
You can use MySQL's DATEDIFF to return the number of days between the current date and the dateachieved value.
Use:
SELECT u.username,
hs.userkey,
hs.score,
DATEDIFF(NOW(), hs.dateachieved)
FROM TABLE_HIGHSCORES hs
JOIN TABLE_USERS u ON u.userkey = hs.userkey
JOIN (SELECT ths.userkey,
ths.gamekey,
ms.max_score,
MAX(ths.dateachieved) AS max_date
FROM TABLE_HIGHSCORES ths
JOIN (SELECT t.userkey,
t.gamekey,
MAX(t.score) AS max_score
FROM TABLE_HIGHSCORES t
GROUP BY t.userkey, t.gamekey) ms ON ms.userkey = ths.userkey
AND ms.gamekey = ths.gamekey
AND ms.max_score = ths.score
GROUP BY ths.userkey, ths.gamekey, ms.max_score
) x ON x.userkey = hs.userkey
AND x.gamekey = hs.gamekey
AND x.max_score = hs.score
AND x.max_date = hs.dateachieved
WHERE hs.gamekey = $gamekey
ORDER BY hs.score DESC
LIMIT 20
I also changed your query to use ANSI-92 JOIN syntax, from ANSI-89 syntax. It's equivalent performance, but it's easier to read, syntax is supported on Oracle/SQL Server/Postgres/etc, and provides consistent LEFT JOIN support.
Another thing - you only need to use backticks when tables and/or column names are MySQL keywords.
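A quick way to sanity-check the join-back idea is a sketch like the one below. It uses Python's sqlite3 module rather than MySQL, the rows are made up, and it binds the game key with a ? placeholder instead of interpolating $gamekey. It is a simplified version that skips the tie-breaking on date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_users (userkey INTEGER, username TEXT);
CREATE TABLE table_highscores (userkey INTEGER, gamekey INTEGER,
                               score INTEGER, dateachieved TEXT);

INSERT INTO table_users VALUES (1, 'ann'), (2, 'bob');
INSERT INTO table_highscores VALUES
    (1, 9, 500, '2020-01-01'),
    (1, 9, 900, '2020-01-20'),
    (1, 9, 300, '2020-02-10'),
    (2, 9, 700, '2020-01-05');
""")

TOP_SCORES = """
SELECT u.username, hs.score, hs.dateachieved
FROM table_highscores hs
JOIN table_users u ON u.userkey = hs.userkey
JOIN (SELECT userkey, gamekey, MAX(score) AS max_score
      FROM table_highscores
      GROUP BY userkey, gamekey) ms
  ON ms.userkey = hs.userkey
 AND ms.gamekey = hs.gamekey
 AND ms.max_score = hs.score
WHERE hs.gamekey = ?
ORDER BY hs.score DESC
LIMIT 20
"""

rows = conn.execute(TOP_SCORES, (9,)).fetchall()
```

Each user appears once, and the date is the one stored on the row that holds the maximum score. If a user can reach the same top score on several dates, this simplified form returns one row per date, which is what the extra MAX on the date in the full query guards against.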
In your query you should use an explicit JOIN and you don't need the DISTINCT keyword.
This query should solve your problem. I am assuming here that it is possible for a user to submit the same highscore more than once on different dates, and if that happens then you want the oldest date:
SELECT T1.userkey, T1.score, username, dateachieved FROM (
(SELECT userkey, max(score) AS score
FROM table_highscores
WHERE gamekey = $gamekey
GROUP BY userkey) AS T1
JOIN
(SELECT userkey, score, min(dateachieved) as dateachieved
FROM table_highscores
WHERE gamekey = $gamekey
GROUP BY userkey, score) AS T2
ON T1.userkey = T2.userkey AND T1.score = T2.score
) JOIN table_users ON T1.userkey = table_users.userkey
ORDER BY T1.score DESC
LIMIT 20
You didn't say what language you are using to calculate the difference but I'm guessing it's PHP because of the $gamekey you used there (which should be escaped properly, btw).
If your dateachieved field is in the DATETIME format, you can calculate the difference like this:
$diff = round((time() - strtotime($row['dateachieved'])) / 86400);
I think you need to clarify your question a little better. Can you provide some data and expected outputs and then I should be able to help you further?