What is "Select -1", and how is it different from "Select 1"? - sql

I have the following query that is part of a common table expression. I don't understand the function of the "Select -1" statement. It is obviously different than the "Select 1" that is used in "EXISTS" statements. Any ideas?
select days_old,
count(express_cd),
count(*),
case
when round(count(express_cd)*100.0/count(*),2) < 1 then '0'
else ''
end ||
cast(decimal(round(count(express_cd)*100.0/count(*),2),5,2) as varchar(7)) ||
'%'
from foo.bar
group by days_old
union all
select -1, -- Selecting the -1 here
count(express_cd),
count(*),
case
when round(count(express_cd)*100.0/count(*),2) < 1 then '0'
else ''
end ||
cast(decimal(round(count(express_cd)*100.0/count(*),2),5,2) as varchar(7)) ||
'%'
from foo.bar
where days_old between 1 and 7

It's just selecting the number "minus one" for each row returned, just like "select 1" will select the number "one" for each row returned.
There is nothing special about the "select 1" syntax used in EXISTS statements, by the way; it just selects an arbitrary value, because the subquery has to select something and EXISTS only checks whether a row is returned; the number 1 is sufficient.
Why you would do this, I have no idea.
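To make the point above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table bar and its rows are made up for illustration; the sketch only shows that a constant in the select list is returned once per row, and that EXISTS gives the same answer whichever constant the subquery selects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bar (days_old INTEGER)")  # hypothetical table
conn.executemany("INSERT INTO bar VALUES (?)", [(1,), (3,), (9,)])

# A constant in the select list is returned once for every row.
minus_ones = conn.execute("SELECT -1 FROM bar").fetchall()
ones = conn.execute("SELECT 1 FROM bar").fetchall()


def exists_with(expr):
    """Run EXISTS over a subquery that selects the given constant."""
    sql = f"SELECT EXISTS (SELECT {expr} FROM bar WHERE days_old > 5)"
    return conn.execute(sql).fetchone()[0]


# EXISTS only tests whether a row comes back; the selected value is ignored.
exists_results = [exists_with(e) for e in ("1", "-1", "NULL")]

print(minus_ones, ones, exists_results)
```

All three EXISTS variants agree, because one row has days_old > 5.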

When you have a union statement, each part of the union must return the same number of columns, with compatible types. From what I read when I look at this, the first statement gives you one line for each days_old value, with some stats for that value. The second part of the union gives you a summary of all the records that are a week old or less. Since the days_old column is not relevant there, they put in a fake value as a placeholder in order to do the union. Of course, this is just a guess based on reading thousands of queries through the years. To be sure, I would need to actually run the code.
Since you say this is a CTE, to really understand why this is happening, you may need to look at the data it generates and how that data is used in the next query that uses the CTE. That might answer your question.
What you have asked is basically about a business rule unique to your company. The true answer should lie in any requirements documents for the original creation of the code. You should go look for them and read them. We can make guesses based on our own experience but only people in your company can answer the why question here.
If you can't find the documentation, then you need to talk (Yes directly talk, preferably in person) to the Stakeholders who use the data and find out what their needs were. Only do this after running the code and analyzing the results to better understand the meaning of the data returned.

Based on your query, all the records with days_old between 1 and 7 are aggregated into a single row whose first column is -1; that is all select -1 does, nothing special. It works the same way as select 1 in EXISTS: the constant is simply output for each row returned, and EXISTS only checks whether any row comes back at all.
Back to your query: I noticed that the two selects joined by union all return the same four columns. I am guessing your task is to get one result row per days_old value and then combine those rows with a single summary row for everything with days_old between 1 and 7.

It is just grouping logic.
Your query returns aggregated data (counts and rounded percentages) grouped by the days_old column, plus one more group for the rows where days_old is between 1 and 7.
So -1 is just a label for that additional group; it cannot be 1 because days_old = 1 is already a valid group.
The result will look like this:
row1: days_old=1 count(*)=2 ...
row2: days_old=3 count(*)=5 ...
row3: days_old=9 count(*)=6 ...
row4: days_old=-1 count(*)=7
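
The result shape sketched above can be reproduced with a small, hedged example. It uses Python's sqlite3 module rather than the asker's database, an invented table bar with made-up rows chosen to match the counts listed above, and it leaves out the percentage column for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bar (days_old INTEGER, express_cd TEXT)")

# Made-up data: 2 rows with days_old=1, 5 with days_old=3, 6 with days_old=9.
rows = [(1, "A")] * 2 + [(3, "B")] * 5 + [(9, None)] * 6
conn.executemany("INSERT INTO bar VALUES (?, ?)", rows)

result = conn.execute(
    """
    SELECT days_old, COUNT(express_cd), COUNT(*)
    FROM bar
    GROUP BY days_old
    UNION ALL
    SELECT -1, COUNT(express_cd), COUNT(*)   -- -1 labels the summary row
    FROM bar
    WHERE days_old BETWEEN 1 AND 7
    ORDER BY 1
    """
).fetchall()

for row in result:
    print(row)
```

The row labelled -1 carries the combined counts for days_old 1 and 3 (2 + 5 = 7), and it cannot be mistaken for a real days_old group because no real group has a negative value.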

Related

I need to query a database based on keywords

I need to query a PostgreSQL database where keywords are stored in the same row as the data I am trying to query. If a query uses a keyword, an object with that keyword is more likely, but not guaranteed, to be the object returned. I want to fetch about 10 items at a time, but I'm pretty sure I know how to do that (select top 10). So basically, if the keyword is present, the object is more likely but not guaranteed to be returned. How do I do this?
I have a year of experience as a database developer but I don't know how to solve this problem. I would also be open to switching software if there are better suggestions. Thanks!!
So for example if the user searches on Apples then Data2 is more likely, but not guaranteed to be queried.
You want to select 10 rows, preferring those that match the keyword. So, order by match, then restrict to ten rows:
select *
from mytable
order by
case when keyword1 = 'Apples' then 0 else 1 end +
case when keyword2 = 'Apples' then 0 else 1 end +
case when keyword3 = 'Apples' then 0 else 1 end
fetch first 10 rows only;
Demo: https://dbfiddle.uk/?rdbms=postgres_8.4&fiddle=34758b94fe725f7f51a476e80c97187c
A row with a matching keyword is more likely, but not guaranteed, to be selected, because the query picks ten rows and makes arbitrary choices in case of ties. The linked demo shows one situation with fewer than ten matches and one with more than ten.
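As a rough local check of the ordering idea, the sketch below runs the same ORDER BY expression through Python's sqlite3 module. Two things are assumptions for the sake of the example: the table contents are invented, and LIMIT 10 stands in for fetch first 10 rows only, which SQLite does not accept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    "id INTEGER PRIMARY KEY, keyword1 TEXT, keyword2 TEXT, keyword3 TEXT)"
)

# Twelve made-up rows; only ids 4, 8 and 11 carry the keyword 'Apples'.
rows = [(i, "Pears", "Plums", None) for i in range(1, 13)]
rows[3] = (4, "Apples", "Plums", None)
rows[7] = (8, "Pears", "Apples", None)
rows[10] = (11, "Pears", "Plums", "Apples")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)

query = """
    SELECT id
    FROM mytable
    ORDER BY
        CASE WHEN keyword1 = :kw THEN 0 ELSE 1 END +
        CASE WHEN keyword2 = :kw THEN 0 ELSE 1 END +
        CASE WHEN keyword3 = :kw THEN 0 ELSE 1 END
    LIMIT 10
"""
top_ids = [row[0] for row in conn.execute(query, {"kw": "Apples"})]
print(top_ids)
```

The three matching rows sort to the front; the remaining seven slots are filled from the non-matching rows in no guaranteed order, which is the "more likely, but not guaranteed" behaviour described above.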

Sql column value as formula in select

Can I select a column based on another column's value being listed as a formula? So I have a table, something like:
column_name  formula    val
one          NULL       1
two          NULL       2
three        one + two  NULL
And I want to do
SELECT
column_name,
CASE WHEN formula IS NULL THEN
val
ELSE
(Here's where I'm confused - How do I evaluate the formula?)
END as result
FROM
table
And end up with a result set like
column_name  result
one          1
two          2
three        3
You keep saying column, and column name, but you're actually talking about rows, not columns.
The problem is that you (potentially) want different formulas for each row. For example, row 4 might be (two - one) = 1 or even (three + one) = 4, where you'd have to calculate row three before you could do row 4. This means that a simple select query that parses the formulas is going to be very hard to do, and it would have to be able to handle each type of formula, and even then if the formulas reference other formulas that only makes it harder.
If you have to be able to handle expressions like (two + one) * five = 15 and two + one * five = 7, then you'd basically be re-implementing a full-blown eval function. You might be better off returning the SQL table to another language that has an eval function built in, or you could use something like SQL Eval.net if it has to be in SQL.
Either way, though, you've still got to change "two + one" to "2 + 1" before you can do the eval with it. Because these values are in other rows, you can't see those values in the row you're looking at. To get the value for "one" you have to do something like
Select val from table where column_name = 'one'
And even then if the val is null, that means it hasn't been calculated yet, and you have to come back and try again later.
If I had to do something like this, I would create a temporary table and load the basic table into it. Then I'd iterate over the rows with null values, trying to replace column names with the literal values. I'd run the eval over any formulas that had no symbols left, setting the val for those rows. If there were still rows with no val (i.e. they were waiting for another row to be done first), I'd go back and iterate again. At the end, you should have a val for every row, at which point it is a simple query to get your results.
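The iterate-until-resolved approach described above could look roughly like this when done outside the database. This is only a sketch under assumptions: the rows are held in a plain Python dict instead of a temporary table, formulas use only +, -, *, / and parentheses, and the helper names eval_arithmetic and resolve are invented for the example. The extra row four (three + one) is the hypothetical dependent formula mentioned earlier in this answer.

```python
import ast
import operator
import re

OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
NAME = re.compile(r"[A-Za-z_]\w*")


def eval_arithmetic(expr):
    """Evaluate +, -, *, / and parentheses over numeric literals only."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError(f"unsupported expression: {expr!r}")
    return walk(ast.parse(expr, mode="eval"))


def resolve(rows):
    """rows maps name -> (formula or None, val or None); returns name -> val."""
    vals = {n: v for n, (f, v) in rows.items() if f is None}
    pending = {n: f for n, (f, _) in rows.items() if f is not None}
    while pending:
        progressed = False
        for name, formula in list(pending.items()):
            # Skip formulas that still wait for another row to be computed.
            if not all(ref in vals for ref in NAME.findall(formula)):
                continue
            # Replace each name with its literal value, then evaluate.
            literal = NAME.sub(lambda m: f"({vals[m.group()]!r})", formula)
            vals[name] = eval_arithmetic(literal)
            del pending[name]
            progressed = True
        if not progressed:
            raise ValueError(f"unresolvable formulas: {sorted(pending)}")
    return vals


result = resolve({
    "one": (None, 1),
    "two": (None, 2),
    "four": ("three + one", None),   # depends on a row that is not ready yet
    "three": ("one + two", None),
})
print(result)
```

Row four is skipped on the first pass because three has no value yet and is picked up on the second pass. A formula that refers to a missing name, or a cycle between formulas, ends the loop with an error instead of spinning forever.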
A possible solution would be something like this. Since you gave very few details, it works for the condition above, but I'm not sure about anything else.
GO
SELECT
t1.column_name,
CASE WHEN t1.formula IS NULL THEN
t1.val
ELSE
-- sum the vals of the rows that have no formula (here: one and two)
(select sum(t2.val) from table as t2 where t2.formula is null)
END as result
FROM
table as t1
GO
If this is not working feel free to discuss it further.

SQL to find the matching row between two tables of same schema

I have two tables, say X and X_STAGING.
They are exactly identical in columns, i.e. the schema is the same. However, the number of rows is different. I know that the first row of X is there in X_STAGING - the data was partially copied over from X_STAGING to X. However, I need to know exactly which row of X_STAGING contains the data that went into the first row of X.
At the moment I am using this
SELECT
SUM(MATCH)
FROM
(
SELECT
CASE WHEN X_STAGING.KEY_ID='KEY_FROM_THE_FIRST_ROW_OF_X' THEN 1 ELSE 0 END AS MATCH
FROM
X_STAGING
WHERE ROWNUM<2550000
)
By changing the ROWNUM limit I can find out at which ROWNUM the count gets to 1. And then, by adjusting ROWNUM, I can eventually get to the particular row.
This will work, but I am sure there has to be a quicker and more clever way of doing this.
Please help.
Note: I am working on Linux, DB2 environment.
I don't understand what you are trying to accomplish, but the following does what you are asking for:
SELECT
MAX(MATCH)
FROM
(
SELECT
CASE WHEN X_STAGING.KEY_ID='KEY_FROM_THE_FIRST_ROW_OF_X' THEN ROWNUM ELSE 0 END AS MATCH
FROM
X_STAGING
)
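For readers who want to try the idea without a DB2 instance, here is a hedged sketch of the same MAX over CASE technique in Python's sqlite3 module. SQLite has no ROWNUM, so ROW_NUMBER() over an invented ordering column load_seq stands in for it (this needs SQLite 3.25 or later), and the key values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# load_seq is a hypothetical column that fixes the row order.
conn.execute("CREATE TABLE x_staging (load_seq INTEGER, key_id TEXT)")
conn.executemany(
    "INSERT INTO x_staging VALUES (?, ?)",
    [(10, "K-A"), (20, "K-B"), (30, "K-C"), (40, "K-D")],
)

sql = """
    SELECT MAX(match_pos)
    FROM (
        SELECT CASE WHEN key_id = ? THEN rn ELSE 0 END AS match_pos
        FROM (
            SELECT key_id,
                   ROW_NUMBER() OVER (ORDER BY load_seq) AS rn
            FROM x_staging
        )
    )
"""
position = conn.execute(sql, ("K-C",)).fetchone()[0]
missing = conn.execute(sql, ("K-Z",)).fetchone()[0]
print(position, missing)
```

Every non-matching row contributes 0, so the maximum is the position of the matching row, or 0 when the key is not present. The position is only meaningful relative to the ordering used to number the rows.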

Selecting top n Oracle records with ROWNUM still valid in subquery?

I have the following FireBird query:
update hrs h
set h.plan_week_id=
(select first 1 c.plan_week_id from calendar c
where c.calendar_id=h.calendar_id)
where coalesce(h.calendar_id,0) <> 0
(Intention: For records in hrs with a (non-zero) calendar_id
take calendar.plan_week_id and put it in hrs.plan_week_id)
The trick to select the first record in Oracle is to use WHERE ROWNUM=1, and if I understand correctly I do not have to use ROWNUM in a separate outer query because I 'only' match ROWNUM=1 - thanks SO for suggesting Questions that may already have your answer ;-)
This would make it
update hrs h
set h.plan_week_id=
(select c.plan_week_id from calendar c
where (c.calendar_id=h.calendar_id) and (rownum=1))
where coalesce(h.calendar_id,0) <> 0
I'm actually using the 'first record' together with the selection of only one field to guarantee that I get one value back which can be put into h.plan_week_id.
Question: Will the above query work under Oracle as intended?
Right now, I do not have a filled Oracle DB at hand to run the query on.
Like Nicholas Krasnov said, you can test it in SQL Fiddle.
But if you ever find yourself about to use where rownum = 1 in a subquery, alarm bells should go off, because in 90% of cases you are doing something wrong. Very rarely will you need an arbitrary value. Using rownum = 1 is only valid when all the selected values are the same.
In this case I expect calendar_id to be the primary key of calendar. Therefore, each record in hrs can match at most one calendar row and so get only one plan_week_id, and the where rownum = 1 is not required.
And to answer your question: yes, it will run just fine. The brackets around each where condition are not required either, though, and in fact I only find them confusing.
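Since the asker had no filled Oracle database at hand, here is a small stand-in using Python's sqlite3 module. It is not Oracle and does not exercise ROWNUM; it only illustrates the point that, under the assumption that calendar_id is the primary key of calendar, the correlated subquery returns at most one row by itself. The table contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE calendar (calendar_id INTEGER PRIMARY KEY, plan_week_id INTEGER);
    CREATE TABLE hrs (hrs_id INTEGER PRIMARY KEY,
                      calendar_id INTEGER,
                      plan_week_id INTEGER);
    INSERT INTO calendar VALUES (1, 101), (2, 102);
    INSERT INTO hrs VALUES (1, 1, NULL), (2, 2, NULL), (3, 0, NULL), (4, NULL, NULL);
    """
)

# No "first 1" / rownum = 1: the primary key guarantees a single match.
conn.execute(
    """
    UPDATE hrs
    SET plan_week_id = (SELECT c.plan_week_id
                        FROM calendar c
                        WHERE c.calendar_id = hrs.calendar_id)
    WHERE COALESCE(calendar_id, 0) <> 0
    """
)
updated = conn.execute(
    "SELECT hrs_id, plan_week_id FROM hrs ORDER BY hrs_id"
).fetchall()
print(updated)
```

Rows with a zero or NULL calendar_id are left untouched by the outer where clause, as intended in the original query.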

MS SQL 2000 - How to efficiently walk through a set of previous records and process them in groups. Large table

I'd like to ask about one thing. I have a table in the DB. It has 2 columns and looks like this:
Name...bilance
Jane...+3
Jane...-5
Jane...0
Jane...-8
Jane...-2
Paul...-1
Paul...2
Paul...9
Paul...1
...
I have to walk through this table, and whenever I find a record with a different "name" than the previous row, I process all rows with the previous "name". (When I reach the first Paul row, I process all Jane rows.)
The processing goes like this:
Now I work only with the Jane records and walk through them one by one. On each record I stop and compare it with all previous Jane rows one by one.
The task is to summarize the "bilance" column (within the scope of the current person) where the values have different signs.
Summary:
I loop through this table on 3 nested levels:
1st level = search for changes of "name" column
2nd level = if change was found, get all rows with previous "name" and walk through them
3rd level = on each row stop and walk through all previous rows with current "name"
Can this be solved only using CURSOR and FETCHING, or is there some smoother solution?
My real table has 30 000 rows and 1500 people, and if I do the logic in PHP it takes long minutes and then times out. So I would like to rewrite it in MS SQL 2000 (no other DB is allowed). Are cursors a fast solution, or is it better to use something else?
Thank you for your opinions.
UPDATE:
There are lots of questions about my "summarization". Problem is a little bit more difficult than I explained. I simplified it just to describe my algorithm.
Each row of my table contains much more columns. The most important is month. That's why there are more rows for each person. Each is for different month.
"Bilances" are the overtime hours and arrear hours of workers. I need to sum up + and - bilances so that they neutralize each other, using values from previous months. I want to have as many zeroes as possible. The whole table must stay as it is; only the bilances should be changed to zeroes where possible.
Example:
Row (Jane -5) will be summarized with row (Jane +3). Instead of 3 I will get 0 and instead of -5 I will get -2. Because I used this -5 to reduce +3.
Next row (Jane 0) won't be affected
Next row (Jane -8) can not be used, because all previous bilances are negative
etc.
You can sum all the values per name using a single SQL statement:
select
name,
sum(bilance) as bilance_sum
from
my_table
group by
name
order by
name
On the face of it, it sounds like this should do what you want:
select Name, sum(bilance)
from table
group by Name
order by Name
If not, you might need to elaborate on how the Names are sorted and what you mean by "summarize".
I'm not sure what you mean by this line... "The task is to sumarize "bilance" column (in the scope of actual person) if they have different signs".
But, it may be possible to use a group by query to get a lot of what you need.
select name, case when bilance < 0 then 'negative' when bilance >= 0 then 'positive' end as sign, count(*)
from table
group by name, case when bilance < 0 then 'negative' when bilance >= 0 then 'positive' end
That might not be perfect syntax for the case statement, but it should get you really close.
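
The netting described in the question's update goes beyond a plain sum, so here is a hedged sketch of one reading of it, written in Python rather than T-SQL purely to pin down the logic. It assumes the rows of each person are already in month order, and the function name neutralize is invented for the example. Each row is offset against earlier rows of the same person that have the opposite sign.

```python
from collections import defaultdict


def neutralize(balances):
    """Offset each value against earlier values of the opposite sign."""
    result = list(balances)
    for i in range(len(result)):
        for j in range(i):
            # Only pairs with opposite signs can neutralize each other.
            if result[i] * result[j] < 0:
                offset = min(abs(result[i]), abs(result[j]))
                # Move both values towards zero by the same amount.
                result[i] += -offset if result[i] > 0 else offset
                result[j] += -offset if result[j] > 0 else offset
    return result


# Sample rows from the question, assumed to be in month order per person.
rows = [
    ("Jane", 3), ("Jane", -5), ("Jane", 0), ("Jane", -8), ("Jane", -2),
    ("Paul", -1), ("Paul", 2), ("Paul", 9), ("Paul", 1),
]

by_name = defaultdict(list)
for name, bilance in rows:
    by_name[name].append(bilance)

netted = {name: neutralize(values) for name, values in by_name.items()}
print(netted)
```

For Jane this reproduces the example from the question: +3 becomes 0 and -5 becomes -2, while 0, -8 and -2 stay as they are. The per-person sum is unchanged by the netting, so it still agrees with the group by queries above. With 30 000 rows spread over 1500 people, each person has about 20 rows on average, so the nested loop per person stays small.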