How do I query a column where a specific number does not exist in any of the rows of that column - sql

I have ID | Name | Salary with types as Integer | String | Integer respectively.
I need to query the avg of all the rows of the Salary column, and then query the avg of all the rows of the Salary column again, but if any of those rows contain 0, remove 0 from those numbers, and calculate the avg.
So like if Salary returns 1420, 2006, 500, the next query should return 142, 26, 5. Then I calculate the avg of the subsequent numbers not containing 0.
I tried googling my specific problem but am not finding anything close to a solution. I'm not looking for an answer too much more than a shove in the right direction.
My Thoughts
Maybe I need to convert the integer data type to a varchar or string then remove the '0' digit from there, then convert back?
Maybe I need to create a temporary table from the first tables results, and insert them, just without 0?
Any ideas? Hope I was clear. Thanks!
Sample table data:
ID | Name | Salary
---+----------+-------
1 | Kathleen | 1420
2 | Bobby | 690
3 | Cat | 500
Now I need to query the above table but with the 0's removed from the salary rows
ID | Name | Salary
---+----------+-------
1 | Kathleen | 142
2 | Bobby | 69
3 | Cat | 5

You want to remove all 0s from your numbers, then take a numeric average of the result. As you are foreseeing, this requires mixing string and numeric operations.
The actual syntax will vary across databases. In MySQL, SQL Server and Oracle, you should be able to do:
select avg(replace(salary, '0', '') + 0) as myavg
from mytable
This involves two implicit conversions: replace() forces string context, and + 0 turns the result back into a number. In SQL Server, avg() over integers returns an integer; if you want a decimal average, add a decimal value instead: + 0.0 rather than + 0.
In Postgres, where implicit conversion is stricter, you would use explicit casts:
select avg(replace(salary::text, '0', '')::int) as myavg
from mytable
This returns a decimal value.
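To try the idea without a database server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is not one of the databases named above, so the explicit CAST calls are an assumption about how the same approach translates; the helper name zero_stripped_average is made up for illustration.

```python
import sqlite3

def zero_stripped_average(salaries):
    """Average the salaries after deleting every '0' digit from each value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, salary INTEGER)")
    conn.executemany("INSERT INTO mytable (salary) VALUES (?)",
                     [(s,) for s in salaries])
    # number -> text, drop the '0' characters, text -> number, then average
    row = conn.execute(
        "SELECT avg(CAST(replace(CAST(salary AS TEXT), '0', '') AS INTEGER)) "
        "FROM mytable"
    ).fetchone()
    conn.close()
    return row[0]

# Sample data from the question: 1420, 690, 500 become 142, 69, 5
print(zero_stripped_average([1420, 690, 500]))  # 72.0
```

Note that a salary of exactly 0 becomes an empty string, which SQLite casts back to 0; other databases may return NULL or raise an error there.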

Do you just want conditional aggregation?
select avg(salary), avg(case when salary <> 0 then salary end)
from t;
or do you want division?
select id, name, floor(salary / 10)
from t;
This matches your sample for 1420 and 690 (142 and 69), but not for 500, which becomes 50 rather than 5. It also has nothing to do with averages.
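For the conditional aggregation reading, a small sketch with Python's sqlite3 module shows why the CASE expression works: rows where salary is 0 yield NULL, and avg() skips NULLs. The table name t comes from the query above; the helper name and the sample values are assumptions for illustration.

```python
import sqlite3

def averages(salaries):
    """Return (plain average, average ignoring rows whose salary is 0)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (salary INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(s,) for s in salaries])
    # CASE without ELSE yields NULL for salary = 0, and avg() ignores NULLs
    row = conn.execute(
        "SELECT avg(salary), avg(CASE WHEN salary <> 0 THEN salary END) FROM t"
    ).fetchone()
    conn.close()
    return row

print(averages([1420, 0, 500]))  # (640.0, 960.0)
```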

Related

Aggregating / Concatenation of very long Varchar2 strings and find key words in the text || Oracle

I have been given a task to develop a script/ function/ query to aggregate groups of rows in a table and then search for specific keywords in it. The column to be aggregated is a varchar2 column with size 3200 and some of the aggregated rows have lengths way beyond 5000.
(I understand that the size of varchar2 is 4000)
When I try to aggregate the data into a single column, it gives a "result of string concatenation is too long" error (ORA-01489)
I have tried inbuilt aggregators like LISTAGG, XMLAGG, and also some custom functions but I have been asked to prefer a SQL query over a function or procedure.
Once I can get the data to be aggregated, I have to then search through the rows for matching keywords.
(I can't just search the rows without aggregating, as some of the words are split across rows, e.g. row1 ends with "KEYW" and row2 starts with "ORD" if I need to look for "KEYWORD" in the table.)
my table kind of looks like this (can't post the real table data, sorry),
id_1 | id_2 | name | row_num | description
-----+------+------+---------+------------
1    | 5    | A    | 0       | this has so
1    | 5    | A    | 1       | me keyword
1    | 5    | B    | 0       | this is
1    | 3    | E    | 0       | new some
2    | 12   | A    | 0       | diff str
here the unique rows are identified using the first 3 columns and the 4th column lists the order in which these "description" strings need to be concatenated.
I would like to get the output as:
id_1 | id_2 | name | description (concatenated)
-----+------+------+---------------------------
1    | 5    | A    | this has **some** keyword
1    | 3    | E    | new **some**
when looking for the keyword "some"
Please help as I am fairly new to DBs and any help will be highly appreciated.
Thanks & Regards
Kunal

Query returns rows outside of `between` range?

I am querying a SQL Server database to get results from a table between two number values. Here is that statement:
select *
FROM [DATA].[dbo].[TableName] with (nolock)
where number between '1400' and '1500'
order by CAST(number as float);
For the most part, the results are within the range as expected. However, I do see some anomalies where a number that has the first four digits within the range is returned as a result. For example:
14550
In the result above, the first four digits are 1455 which would be within the range of 1400 to 1500. My guess is that this has to do with the CAST(number as float) part of the statement. Any suggestions on how I can update this statement to only return numbers between the stated values?
Here is the number info I get when running sp_help:
| Column_name | Type | Computed | Length | Prec | Scale | Nullable | TrimTrailingBlanks | FixedLenNullInSource | Collation |
=============================================================================================================================================================
| NUMBER | varchar | no | 4000 | | | yes | no | yes | SQL_Latin1_General_CP1_CI_AS |
Your comparison is being done as a string, because a column named number is stored as a string and the comparison values are strings. You could easily fix this just by changing the comparison values to numbers:
select *
FROM [DATA].[dbo].[TableName]
where number between 1400 and 1500
order by CAST(number as float);
But this is a hacky solution -- and it will return an error if any of the number values are not numbers. The real solution is to fix the data model, so it is not storing numbers as strings:
alter table tablename alter column number int;
This uses int because all the referenced values in the question are ints.
If you cannot do this because the column is erroneously called number and contains non-numbers, then use a safe conversion function:
select *
FROM [DATA].[dbo].[TableName]
where try_cast(number as float) between 1400 and 1500
order by try_cast(number as float);
Note: I'm also not sure if this is the logic you really want, because it includes 1500. You might really want:
select *
FROM [DATA].[dbo].[TableName]
where try_cast(number as float) >= 1400 and
try_cast(number as float) < 1500
order by try_cast(number as float);
You have to cast the number as an int...
select *
FROM [DATA].[dbo].[TableName]
where CAST(number as int) between 1400 and 1500
order by CAST(number as int);
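The difference between the string and numeric comparisons can be reproduced with a short sketch using Python's sqlite3 module. SQLite stands in for SQL Server here, which is an assumption: it has no try_cast, and its CAST turns non-numeric text into 0 instead of raising an error. The helper name and sample values are made up for illustration.

```python
import sqlite3

def compare_filters(values):
    """Return (rows matched by string BETWEEN, rows matched by numeric BETWEEN)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablename (number TEXT)")
    conn.executemany("INSERT INTO tablename VALUES (?)", [(v,) for v in values])
    # Text column compared with text literals: character-by-character ordering
    as_text = [r[0] for r in conn.execute(
        "SELECT number FROM tablename "
        "WHERE number BETWEEN '1400' AND '1500' ORDER BY number")]
    # Explicit cast on the column: numeric ordering
    as_number = [r[0] for r in conn.execute(
        "SELECT number FROM tablename "
        "WHERE CAST(number AS INTEGER) BETWEEN 1400 AND 1500 "
        "ORDER BY CAST(number AS INTEGER)")]
    conn.close()
    return as_text, as_number

text_hits, numeric_hits = compare_filters(
    ["1399", "1455", "14550", "1500", "1501", "15"])
print(text_hits)     # ['1455', '14550', '15', '1500']
print(numeric_hits)  # ['1455', '1500']
```

As strings, '14550' and even '15' sort between '1400' and '1500', which is exactly the anomaly described in the question.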

How to get the biggest column value between duplicated rows id?

I am working on an Oracle 11g database query that needs to retrieve a list of the highest NUM value between duplicated rows in a table.
Here is an example of my context:
ID | NUM
------------
1 | 1111
1 | 2222
2 | 3333
2 | 4444
3 | 5555
3 | 6666
And here is the result I am expecting after the query is executed:
NUM
----
2222
4444
6666
I know how to get the GREATEST value in a list of numbers, but I have no idea how to group two rows and fetch the biggest column value between them IF they have the same ID.
Programmatically it is quite easy to achieve, but in SQL it is a little less intuitive for me. Any suggestion or advice is welcome, as I don't even know which function could help me do this in Oracle.
Thank you !
This is the typical use case for a GROUP BY. Assuming your Num field can be compared:
SELECT ID, MAX(NUM) as Max
FROM myTable
GROUP BY ID
If you don't want to select the ID (as in the output you provided), you can run
SELECT Max
FROM (
SELECT ID, MAX(NUM) as Max
FROM myTable
GROUP BY ID
) results
And here is the SQL fiddle
Edit : if NUM is, as you mentioned later, VARCHAR2, then you have to cast it to an Int. See this question.
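The GROUP BY approach can be checked against the sample data with a minimal sketch using Python's sqlite3 module. SQLite stands in for Oracle 11g here, which is an assumption; the aggregate itself is standard SQL. The helper name max_num_per_id is hypothetical.

```python
import sqlite3

def max_num_per_id(rows):
    """Return the highest NUM for each ID, ordered by ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE myTable (id INTEGER, num INTEGER)")
    conn.executemany("INSERT INTO myTable VALUES (?, ?)", rows)
    # One output row per id, carrying that group's largest num
    result = [r[0] for r in conn.execute(
        "SELECT MAX(num) FROM myTable GROUP BY id ORDER BY id")]
    conn.close()
    return result

sample = [(1, 1111), (1, 2222), (2, 3333), (2, 4444), (3, 5555), (3, 6666)]
print(max_num_per_id(sample))  # [2222, 4444, 6666]
```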
The most efficient way I would suggest is
SELECT ids,
value
FROM (SELECT ids,
value,
max(value)
over (
PARTITION BY ids) max_value
FROM test)
WHERE value = max_value;
This requires that the query maintain a single value per id of the maximum value encountered so far. If a new maximum is found then the existing value is modified, otherwise the new value is discarded. The total number of elements that have to be held in memory is related to the number of ids, not the number of rows scanned.
See this SQLFIDDLE
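The "one running maximum per id" idea described above can be illustrated in plain Python. This is only a sketch of the bookkeeping, not a claim about how Oracle actually executes the window function; the function name is made up.

```python
def max_per_id(rows):
    """Single pass over (id, value) pairs, keeping one running maximum per id."""
    best = {}
    for id_, value in rows:
        # Replace the stored value only when a new maximum is found
        if id_ not in best or value > best[id_]:
            best[id_] = value
    return best

sample = [(1, 1111), (1, 2222), (2, 3333), (2, 4444), (3, 5555), (3, 6666)]
print(max_per_id(sample))  # {1: 2222, 2: 4444, 3: 6666}
```

The dictionary holds one entry per id, however many rows are scanned.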

how to select one tuple in rows based on variable field value

I'm quite new to SQL and I'd like to write a SELECT statement that retrieves only the first row of each set, based on a column value. I'll try to make it clearer with a table example.
Here is my table data :
chip_id | sample_id
-------------------
1 | 45
1 | 55
1 | 5986
2 | 453
2 | 12
3 | 4567
3 | 9
I'd like to have a SELECT statement that fetches the first row for each of chip_id=1,2,3
Like this :
chip_id | sample_id
-------------------
1 | 45 or 55 or whatever
2 | 12 or 453 ...
3 | 9 or ...
How can I do this?
Thanks
I'd probably:
set a variable = 0
order your table by chip_id
read the table in row by row
if table[row] > variable, store table[row] in a result array and increment variable
loop till done
return your result array
Though depending on your DB, query and versions, you'll probably get unpredictable/unreliable results.
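The steps above could be sketched in Python roughly as follows. This is an interpretation, not the exact pseudocode: it remembers which chip_id values have been seen instead of incrementing a counter, so it does not assume the ids are consecutive. "First" here just means first in the order the rows arrive.

```python
def first_per_chip(rows):
    """Keep the first (chip_id, sample_id) pair encountered for each chip_id."""
    result = {}
    for chip_id, sample_id in rows:
        # setdefault stores a value only if chip_id has not been seen yet
        result.setdefault(chip_id, sample_id)
    return sorted(result.items())

sample = [(1, 45), (1, 55), (1, 5986), (2, 453), (2, 12), (3, 4567), (3, 9)]
print(first_per_chip(sample))  # [(1, 45), (2, 453), (3, 4567)]
```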
You can get one value using row_number():
select chip_id, sample_id
from (select chip_id, sample_id,
row_number() over (partition by chip_id order by rand()) as seqnum
from your_table -- your_table is a placeholder for your table name
) t
where seqnum = 1
This returns a random value. In SQL, tables are inherently unordered, so there is no concept of "first". You need an auto incrementing id or creation date or some way of defining "first" to get the "first".
If you have such a column, then replace rand() with the column.
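Here is a minimal sketch of the row_number() approach using Python's sqlite3 module. It assumes SQLite 3.25 or later (needed for window functions), uses the placeholder table name your_table, and orders by sample_id as a stand-in definition of "first", so it returns the smallest sample_id per chip.

```python
import sqlite3

def one_row_per_chip(rows):
    """Pick one row per chip_id: the one ranked first by sample_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (chip_id INTEGER, sample_id INTEGER)")
    conn.executemany("INSERT INTO your_table VALUES (?, ?)", rows)
    # Number the rows inside each chip_id group, then keep row 1 of each group
    result = conn.execute(
        "SELECT chip_id, sample_id FROM ("
        "  SELECT chip_id, sample_id,"
        "         row_number() OVER (PARTITION BY chip_id ORDER BY sample_id) AS seqnum"
        "  FROM your_table"
        ") t WHERE seqnum = 1 ORDER BY chip_id"
    ).fetchall()
    conn.close()
    return result

sample = [(1, 45), (1, 55), (1, 5986), (2, 453), (2, 12), (3, 4567), (3, 9)]
print(one_row_per_chip(sample))  # [(1, 45), (2, 12), (3, 9)]
```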
Provided I understood your output, if you are using PostgreSQL 9, you can use this:
SELECT chip_id ,
string_agg(sample_id, ' or ')
FROM your_table
GROUP BY chip_id
You need to group your data with a GROUP BY query.
When you group, generally you want the max, the min, or some other values to represent your group. You can do sums, count, all kind of group operations.
For your example, you don't seem to want a specific group operation, so the query could be as simple as this one :
SELECT chip_id, MAX(sample_id)
FROM table
GROUP BY chip_id
This way you are retrieving the maximum sample_id for each of the chip_id.

How to execute a LIKE query against a DECIMAL (or INTEGER) field?

Is it possible to execute a LIKE statement against a table column that contains DECIMAL types? Or else, what would be the best way to select matching rows given a number in a decimal (or integer) field?
E.g.:
Name Age
... ...
John 25
Mary 76
Jim 45
Erica 34
Anna 56
Bob 55
Executing something like SELECT * FROM table WHERE age LIKE 5 would return:
Name Age
John 25
Jim 45
Anna 56
Bob 55
It is not clear from your question what exactly you are trying to achieve, but based on the example query, the filtering you need to do should be achievable using normal arithmetic operators.
SELECT * FROM table WHERE MOD(age, 10) = 5 -- All records where the age ends in 5
Or:
SELECT * FROM table WHERE MOD(age, 5) = 0 -- All records where age is divisible by 5
Now that you have clarified that, although the field is DECIMAL, you are not actually using it as a numeric value (if you were, the requirement would not exist), the answers given by others are reasonable: convert the field to a text value and use LIKE on it.
Alternatively, change the type of the field to something that is more suitable to the way you are using it.
You can convert your decimal field to varchar and then apply like.
If you create a query
select name from table where age like '%5%'
you could achieve this (at least in MySQL and DB2).
But if you prefer to match a number you should use something like:
select name from table where age > minimum and age < maximum
Or try to compare against a modulo if you are really interested in querying on the last number.
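Both suggestions can be compared side by side with a short sketch using Python's sqlite3 module. SQLite is an assumption here (the answers mention MySQL and DB2), the cast to text is written out explicitly, and the table name people and the helper name are made up for illustration.

```python
import sqlite3

def match_ages(people):
    """Return (names whose age contains a 5, names whose age ends in 5)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO people (name, age) VALUES (?, ?)", people)
    # Text match: convert the number to a string, then apply LIKE
    contains_5 = [r[0] for r in conn.execute(
        "SELECT name FROM people WHERE CAST(age AS TEXT) LIKE '%5%' ORDER BY id")]
    # Arithmetic match: last digit is 5
    ends_in_5 = [r[0] for r in conn.execute(
        "SELECT name FROM people WHERE age % 10 = 5 ORDER BY id")]
    conn.close()
    return contains_5, ends_in_5

sample = [("John", 25), ("Mary", 76), ("Jim", 45),
          ("Erica", 34), ("Anna", 56), ("Bob", 55)]
print(match_ages(sample))
# (['John', 'Jim', 'Anna', 'Bob'], ['John', 'Jim', 'Bob'])
```

Only the LIKE version reproduces the result in the question, because Anna's 56 contains a 5 but does not end in one.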