Selecting top n Oracle records with ROWNUM still valid in subquery? - sql

I have the following FireBird query:
update hrs h
set h.plan_week_id=
(select first 1 c.plan_week_id from calendar c
where c.calendar_id=h.calendar_id)
where coalesce(h.calendar_id,0) <> 0
(Intention: For records in hrs with a (non-zero) calendar_id
take calendar.plan_week_id and put it in hrs.plan_week_id)
The trick to select the first record in Oracle is to use WHERE ROWNUM=1, and if understand correctly I do not have to use ROWNUM in a separate outer query because I 'only' match ROWNUM=1 - thanks SO for suggesting Questions that may already have your answer ;-)
This would make it
update hrs h
set h.plan_week_id=
(select c.plan_week_id from calendar c
where (c.calendar_id=h.calendar_id) and (rownum=1))
where coalesce(h.calendar_id,0) <> 0
I'm actually using the 'first record' together with the selection of only one field to guarantee that I get one value back which can be put into h.plan_week_id.
Question: Will the above query work under Oracle as intended?
Right now, I do not have a filled Oracle DB at hand to run the query on.

Like Nicholas Krasnov said, you can test it in SQL Fiddle.
But if you ever find yourself about to use where rownum = 1 in a subquery, alarm bells should go off, because in 90% of the cases you are doing something wrong. Very rarely will you need a random value. Only when all selected values are the same, a rownum = 1 is valid.
In this case I expect calendar_id to be a primary key in calendar. Therefor each record in hrs can only have 1 plan_week_id selected per record. So the where rownum = 1 is not required.
And to answer your question: Yes, it will run just fine. Though the brackets around each where clause are also not required and in fact only confusing (me).

Related

Failed UPDATE with CASE

I'm trying to write a query which will update reorder_level based on how much of an item was sold within a particular time period.
with a as (select invoice_itemized.itemnum, inventory.itemname,
sum(invoice_itemized.quantity) as sold
from invoice_itemized
join inventory on invoice_itemized.itemnum=inventory.itemnum and
inventory.vendor_number='COR' and inventory.dept_id='cigs'
join invoice_totals on
invoice_itemized.invoice_number=invoice_totals.invoice_number and
invoice_totals.datetime>=dateadd(month,-1,getdate())
group by invoice_itemized.itemnum, inventory.itemname)
update inventory
set reorder_level = case when a.sold/numpervencase>=5 then 30
when a.sold/numpervencase>=2 then 20
when a.sold/numpervencase>=1 then 5
else 1 end,
reorder_quantity = 1
from a
join inventory_vendors on a.itemnum=inventory_vendors.itemnum
Replacing the update with a select performs entirely as expected, returning proper results from the case and selecting 94 rows.
with the update in place, all of the areas affected by the update (6758) got set to 1.
Run this, and eyeball the results:
with a as (select invoice_itemized.itemnum, inventory.itemname,
sum(invoice_itemized.quantity) as sold
from invoice_itemized
join inventory on invoice_itemized.itemnum=inventory.itemnum and
inventory.vendor_number='COR' and inventory.dept_id='cigs'
join invoice_totals on
invoice_itemized.invoice_number=invoice_totals.invoice_number and
invoice_totals.datetime>=dateadd(month,-1,getdate())
group by invoice_itemized.itemnum, inventory.itemname)
select a.sold, numpervencase, a.sold/numpervencase,
case
when a.sold/numpervencase>=5 then 30
when a.sold/numpervencase>=2 then 20
when a.sold/numpervencase>=1 then 5
else 1
end,
*
from a
join inventory_vendors on a.itemnum=inventory_vendors.itemnum
Always a good idea to select before update to check that data ends up as you expect
all of the areas affected by the update got set to 1
I put the raw ingredients into the query above; see if the sums worked out as expected. You might need to cast one of the operands to something with decimal places:
1/2 = 0
1.0/2 = 0.5
And it updated far more rows than i was expecting
Every row that comes out of that select will be updated. Identify the rows you don't want to update and put a where clause in to remove them
Am i overthinking this?
Undertesting, probably
Do I even need the cte?
Makes it easier to represent, but no- you could get the same result by pasting the contents of the cte in as a subquery.. it's what the db does (effectively) anyway
Do i have my from statement in the wrong place?
We don't know what result you're after so that one is impossible to answe beyond "doing so would probably generate a syntax error, so.. no"
The actual problem seems to be
your case when is always going to ELSE, find out why
your cte selects too many rows (I couldn't tell if the number you posted was the number you got or the number you were expecting but it's pretty moot without example data), find out why
Solved. When I added another join to the update it worked correctly. i had to add join inventory on inventory_vendors.itemnum=inventory.itemnum

What is "Select -1", and how is it different from "Select 1"?

I have the following query that is part of a common table expression. I don't understand the function of the "Select -1" statement. It is obviously different than the "Select 1" that is used in "EXISTS" statements. Any ideas?
select days_old,
count(express_cd),
count(*),
case
when round(count(express_cd)*100.0/count(*),2) < 1 then '0'
else ''
end ||
cast(decimal(round(count(express_cd)*100.0/count(*),2),5,2) as varchar(7)) ||
'%'
from foo.bar
group by days_old
union all
select -1, -- Selecting the -1 here
count(express_cd),
count(*),
case
when round(count(express_cd)*100.0/count(*),2) < 1 then '0'
else ''
end ||
cast(decimal(round(count(express_cd)*100.0/count(*),2),5,2) as varchar(7)) ||
'%'
from foo.bar
where days_old between 1 and 7
It's just selecting the number "minus one" for each row returned, just like "select 1" will select the number "one" for each row returned.
There is nothing special about the "select 1" syntax uses in EXISTS statements by the way; it's just selecting some random value because EXISTS requires a record to be returned and a record needs data; the number 1 is sufficient.
Why you would do this, I have no idea.
When you have a union statement, each part of the union must contain the same columns. From what I read when I look at this, the first statement is giving you one line for each days old value and then some stats for each day old. The second part of the union is giving you a summary of all the records that are only a week or so less. Since days old column is not relevant here, they put in a fake value as a placeholder in order to do the union. OF course this is just a guess based on reading thousands of queries through the years. To be sure, I would need to actually run teh code.
Since you say this is a CTE, to really understand why this is is happening, you may need to look at the data it generates and how that data is used in the next query that uses the CTE. That might answer your question.
What you have asked is basically about a business rule unique to your company. The true answer should lie in any requirements documents for the original creation of the code. You should go look for them and read them. We can make guesses based on our own experience but only people in your company can answer the why question here.
If you can't find the documentation, then you need to talk (Yes directly talk, preferably in person) to the Stakeholders who use the data and find out what their needs were. Only do this after running the code and analyzing the results to better understand the meaning of the data returned.
Based on your query, all the records with days_old between 1 and 7 will be output as '-1', that is what select -1 does, nothing special here and there is no difference between select -1 and select 1 in exists, both will output the records as either 1 or -1, they are doing the same thing to check whether if there has any data.
Back to your query, I noticed that you have a union all and compare each four columns you select connected by union all, I am guessing your task is to get a final result with days_old not between 1 and 7 and combine the result with day_old, which is one because you take all between 1 and 7.
It is just a grouping logic there.
Your query returns aggregated
data (counts and rounds) grouped by days_old column plus one more group for data where days_old between 1 and 7.
So, -1 is just another additional group there, it cannot be 1 because days_old=1 is an another valid group.
result will be like this:
row1: days_old=1 count(*)=2 ...
row2: days_old=3 count(*)=5 ...
row3: days_old=9 count(*)=6 ...
row4: days_old=-1 count(*)=7

Order by in subquery behaving differently than native sql query?

So I am honestly a little puzzled by this!
I have a query that returns a set of transactions that contain both repair costs and an odometer reading at the time of repair on the master level. To get an accurate Cost per mile reading I need to do a subquery to get both the first meter reading between a starting date and an end date, and an ending meter.
(select top 1 wf2.ro_num
from wotrans wotr2
left join wofile wf2
on wotr2.rop_ro_num = wf2.ro_num
and wotr2.rop_fac = wf2.ro_fac
where wotr.rop_veh_num = wotr2.rop_veh_num
and wotr.rop_veh_facility = wotr2.rop_veh_facility
AND ((#sdate = '01/01/1900 00:00:00' and wotr2.rop_tran_date = 0)
OR ([dbo].[udf_RTA_ConvertDateInt](#sdate) <= wotr2.rop_tran_date
AND [dbo].[udf_RTA_ConvertDateInt](#edate) >= wotr2.rop_tran_date))
order by wotr2.rop_tran_date asc) as highMeter
The reason I have the tables aliased as xx2 is because those tables are also used in the main query, and I don't want these to interact with each other except to pull the correct vehicle number and facility.
Basically when I run the main query it returns a value that is not correct; it returns the one that is second(keep in mind that the first and second have the same date.) But when I take the subquery and just copy and paste it into it's own query and run it, it returns the correct value.
I do have a work around for this, but I am just curious as to why this happening. I have searched quite a bit and found not much(other than the fact that people don't like order bys in subqueries). Talking to one of my friends that also does quite a bit of SQL scripting, it looks to us as if the subquery is ordering differently than the subquery by itsself when you have multiple values that are the same for the order by(i.e. 10 dates of 08/05/2016).
Any ideas would be helpful!
Like I said I have a work around that works in this one case, but don't know yet if it will work on a larger dataset.
Let me know if you want more code.

Why my sql query is so slow in one database?

I am using sql server 2008 r2 and I have two database, which is one have 11.000 record and another is just 3000 record, when i do run this query
SELECT Right(rtrim(tbltransac.No_Faktur),6) as NoUrut,
tbltransac.No_Faktur,
tbltransac.No_FakturP,
tbltransac.Kd_Plg,
Tblcust.Nm_Plg,
GRANDTOTAL AS Total_Faktur,
tbltransac.Nm_Pajak,
tbltransac.Tgl_Faktur,
tbltransac.Tgl_FakturP,
tbltransac.Total_Distribusi
FROM Tblcust
INNER JOIN ViewGrandtotal AS tbltransac ON Tblcust.Kd_Plg = tbltransac.Kd_Plg
WHERE tbltransac.Kd_Trn = 'J'
and year(tbltransac.tgl_faktur)=2015
And ISNULL(tbltransac.No_OPJ,'') <> 'SHOP'
Order by Right(rtrim(tbltransac.No_Faktur),6) Desc
It takes me 1 minute 30 sec in my server (I query it using sql management tool) that have 3000 record but it only took 3 sec to do a query in my another server which is have 11000 record, whats wring with my database?
I've already tried to backup and restore my 3000 record database and restore it in my 11000 record server, it's faster.. took 30 sec to do a query, but it's still annoying if i compare to my 11000 record server. They are in the same spec
How this happend? what i should check? i check on event viewer, resource monitor or sql management log, i couldn't find any error or blocked connection. There is no wrong routing too..
Please help... It just happen a week ago, before this it was fine, and I haven't touch the server more than a month...
as already mentioned before, you have three issues in your query.
Just as an example, change the query to this one:
SELECT Right(rtrim(tbltransac.No_Faktur),6) as NoUrut,
tbltransac.No_Faktur,
tbltransac.No_FakturP,
tbltransac.Kd_Plg,
Tblcust.Nm_Plg,
GRANDTOTAL AS Total_Faktur,
tbltransac.Nm_Pajak,
tbltransac.Tgl_Faktur,
tbltransac.Tgl_FakturP,
tbltransac.Total_Distribusi
FROM Tblcust
INNER JOIN ViewGrandtotal AS tbltransac ON Tblcust.Kd_Plg = tbltransac.Kd_Plg
WHERE tbltransac.Kd_Trn = 'J'
and tbltransac.tgl_faktur BETWEEN '20150101' AND '20151231'
And tbltransac.No_OPJ <> 'SHOP'
Order by NoUrut Desc --Only if you need a sorted output in the datalayer
Another idea, if your viewGrandTotal is quite large, could be an pre-filtering of this table before you join it. Sometimes SQL Server doesn't get a good plan which needs some lovely touch to get him in the right direction.
Maybe this one:
SELECT Right(rtrim(vgt.No_Faktur),6) as NoUrut,
vgt.No_Faktur,
vgt.No_FakturP,
vgt.Kd_Plg,
tc.Nm_Plg,
vgt.Total_Faktur,
vgt.Nm_Pajak,
vgt.Tgl_Faktur,
vgt.Tgl_FakturP,
vgt.Total_Distribusi
FROM (SELECT Kd_Plg, Nm_Plg FROM Tblcust GROUP BY Kd_Plg, Nm_Plg) as tc -- Pre-Filter on just the needed columns and distinctive.
INNER JOIN (
-- Pre filter viewGrandTotal
SELECT DISTINCT vgt.No_Faktur, vgt.No_Faktur, vgt.No_FakturP, vgt.Kd_Plg, vgt.GRANDTOTAL AS Total_Faktur, vgt.Nm_Pajak,
vgt.Tgl_Faktur, vgt.Tgl_FakturP, vgt.Total_Distribusi
FROM ViewGrandtotal AS vgt
WHERE tbltransac.Kd_Trn = 'J'
and tbltransac.tgl_faktur BETWEEN '20150101' AND '20151231'
And tbltransac.No_OPJ <> 'SHOP'
) as vgt
ON tc.Kd_Plg = vgt.Kd_Plg
Order by NoUrut Desc --Only if you need a sorted output in the datalayer
The pre filtering could increase the generation of a better plan.
Another issue could be just the multi-threading. Maybe your query get a parallel plan as it reaches the cost threshold because of the 11.000 rows. The other query just hits a normal plan due to his lower rows. You can take a look at the generated plans by including the actual execution plan inside your SSMS Query.
Maybe you can compare those plans to get a clue. If this doesn't help, you can post them here to get some feedback from me.
I hope this helps. Not quite easy to give you good hints without knowing table structures, table sizes, performance counters, etc. :-)
Best regards,
Ionic
Note: first of all you should avoid any function in Where clause like this one
year(tbltransac.tgl_faktur)=2015
Here Aaron Bertrand how to work with date in Where clause
"In order to make best possible use of indexes, and to avoid capturing too few or too many rows, the best possible way to achieve the above query is ":
SELECT COUNT(*)
FROM dbo.SomeLogTable
WHERE DateColumn >= '20091011'
AND DateColumn < '20091012';
And i cant understand your logic in this piece of code but this is bad part of your query too
ISNULL(tbltransac.No_OPJ,'') <> 'SHOP'
Actually Null <> "Shop" in this case, so Why are you replace it to ""?
Thanks and good luck
Here is some recommendations:
year(tbltransac.tgl_faktur)=2015 replace this with tbltransac.tgl_faktur >= '20150101' and tbltransac.tgl_faktur < '20160101'
ISNULL(tbltransac.No_OPJ,'') <> 'SHOP' replace this with tbltransac.No_OPJ <> 'SHOP' because NULL <> 'SHOP'.
Order by Right(rtrim(tbltransac.No_Faktur),6) Desc remove this, because ordering should be done in presentation layer rather then in data layer.
Read about SARG arguments and predicates:
What makes a SQL statement sargable?
To write an appropriate SARG, you must ensure that a column that has
an index on it appears in the predicate alone, not as a function
parameter. SARGs must take the form of column inclusive_operator
or inclusive_operator column. The column name is alone
on one side of the expression, and the constant or calculated value
appears on the other side. Inclusive operators include the operators
=, >, <, =>, <=, BETWEEN, and LIKE. However, the LIKE operator is inclusive only if you do not use a wildcard % or _ at the beginning of
the string you are comparing the column to

MS SQL 2000 - How to efficiently walk through a set of previous records and process them in groups. Large table

I'd like to consult one thing. I have table in DB. It has 2 columns and looks like this:
Name...bilance
Jane...+3
Jane...-5
Jane...0
Jane...-8
Jane...-2
Paul...-1
Paul...2
Paul....9
Paul...1
...
I have to walk through this table and if I find record with different "name" (than was on previous row) I process all rows with the previous "name". (If I step on the first Paul row I process all Jane rows)
The processing goes like this:
Now I work only with Jane records and walk through them one by one. On each record I stop and compare it with all previous Jane rows one by one.
The task is to sumarize "bilance" column (in the scope of actual person) if they have different signs
Summary:
I loop through this table in 3 levels paralelly (nested loops)
1st level = search for changes of "name" column
2nd level = if change was found, get all rows with previous "name" and walk through them
3rd level = on each row stop and walk through all previous rows with current "name"
Can this be solved only using CURSOR and FETCHING, or is there some smoother solution?
My real table has 30 000 rows and 1500 people and If I do the logic in PHP, it takes long minutes and than timeouts. So I would like to rewrite it to MS SQL 2000 (no other DB is allowed). Are cursors fast solution or is it better to use something else?
Thank you for your opinions.
UPDATE:
There are lots of questions about my "summarization". Problem is a little bit more difficult than I explained. I simplified it just to describe my algorithm.
Each row of my table contains much more columns. The most important is month. That's why there are more rows for each person. Each is for different month.
"Bilances" are "working overtimes" and "arrear hours" of workers. And I need to sumarize + and - bilances to neutralize them using values from previous months. I want to have as many zeroes as possible. All the table must stay as it is, just bilances must be changed to zeroes.
Example:
Row (Jane -5) will be summarized with row (Jane +3). Instead of 3 I will get 0 and instead of -5 I will get -2. Because I used this -5 to reduce +3.
Next row (Jane 0) won't be affected
Next row (Jane -8) can not be used, because all previous bilances are negative
etc.
You can sum all the values per name using a single SQL statement:
select
name,
sum(bilance) as bilance_sum
from
my_table
group by
name
order by
name
On the face of it, it sounds like this should do what you want:
select Name, sum(bilance)
from table
group by Name
order by Name
If not, you might need to elaborate on how the Names are sorted and what you mean by "summarize".
I'm not sure what you mean by this line... "The task is to sumarize "bilance" column (in the scope of actual person) if they have different signs".
But, it may be possible to use a group by query to get a lot of what you need.
select name, case when bilance < 0 then 'negative' when bilance >= 0 then 'positive', count(*)
from table
group by name, bilance
That might not be perfect syntax for the case statement, but it should get you really close.