Use subquery with multiple rows - sql

I have been trying to work out SQL code to clean up a data sheet (more than 200 rows and 50 columns) by adding a leading zero before the decimal point for values less than 1.
I tried to apply to_char to convert the string data into a zero-padded figure, for all values less than 1:
select to_char((select "1980" from imf_population where "1980" <1), '0.999')
from imf_population
However, because of the subquery, the to_char cannot perform a conversion on the multiple rows returned from the 1980 column, as there is more than one record whose value is less than 1.
Any tips on how to get around this?

Your to_char must go inside the query, not around a subquery. Once it is inside, the outer select is no longer needed:
select to_char("1980",'0.999') from imf_population where "1980"<1;
"1980" is a column name, right? (well, sqlite accepted create table imf_population ("1980" number); select "1980" from imf_population;, but it does not have to_char, I guess you're using oracle)

Note: use only lowercase letters, numbers, and underscores when naming columns, and use simple, descriptive column names.
CREATE TABLE imf_population (col varchar2(20) )
INSERT INTO imf_population (col) VALUES ('0.5289')
select to_char(col,'0.999') from imf_population where col<1;
| TO_CHAR(COL,'0.999') |
| :------------------- |
| 0.529 |
db<>fiddle here

Related

Find all the rows where a column is mixed letter case - postgresql

I have a table in a postgres database where I need to find all the rows that are -
Between two dates, where fromTo is the date column.
And also only those rows where the column data contains a mix of lower and upper case letters, e.g. eCTiWkAohbQAlmHHAemK.
I can do the between-two-dates part as shown below, but I'm confused about how to do the second point.
SELECT * FROM test where fromTo BETWEEN '2022-09-08' AND '2022-09-23';
Data type for fromTo column is shown below -
fromTo | timestamp without time zone | | not null | CURRENT_TIMESTAMP
You can use regular expressions to check that the value contains only alphabetical characters and at least one uppercase character.
select *
from test
where data ~ '[[:upper:]]'
  and data ~ '^[[:alpha:]]+$'
  and fromTo BETWEEN '2022-09-08' AND '2022-09-23';
The character classes will match all alphabetical characters, including those with accents.
Demonstration.
Note that this may not be able to make use of an index. If your table is large, you may need to reconsider how you're storing the data.
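One hedged option along those lines, assuming PostgreSQL 12 or later and inventing the column and index names for illustration: precompute the check in a generated column and build a partial index on it.
-- Hypothetical precomputed flag; the expressions mirror the WHERE clause above.
ALTER TABLE test
  ADD COLUMN is_mixed_alpha boolean
  GENERATED ALWAYS AS (data ~ '[[:upper:]]' AND data ~ '^[[:alpha:]]+$') STORED;

-- Partial index so the date filter only has to scan the qualifying rows.
CREATE INDEX test_fromto_mixed_idx ON test (fromTo) WHERE is_mixed_alpha;
The query can then filter on is_mixed_alpha instead of re-running both regexes.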

How do I query a column where a specific number does not exist in any of the rows of that column

I have ID | Name | Salary with types as Integer | String | Integer respectively.
I need to query the avg of all the rows of the Salary column, and then query the avg of the Salary column again, but with the digit 0 removed from any salary that contains it, and calculate the avg of those numbers.
So, for example, if Salary returns 1420, 2006, 500, the next query should return 142, 26, 5. Then I calculate the avg of the resulting numbers, which no longer contain 0.
I tried googling my specific problem but am not finding anything close to a solution. I'm not looking for a full answer so much as a shove in the right direction.
My Thoughts
Maybe I need to convert the integer data type to a varchar or string then remove the '0' digit from there, then convert back?
Maybe I need to create a temporary table from the first table's results, and insert them, just without 0?
Any ideas? Hope I was clear. Thanks!
Sample table data:
ID | Name | Salary
---+----------+-------
1 | Kathleen | 1420
2 | Bobby | 690
3 | Cat | 500
Now I need to query the above table but with the 0's removed from the salary rows
ID | Name | Salary
---+----------+-------
1 | Kathleen | 142
2 | Bobby | 69
3 | Cat | 5
You want to remove all 0s from your numbers, then take a numeric average of the result. As you are foreseeing, this requires mixing string and numeric operations.
The actual syntax will vary across databases. In MySQL, SQL Server and Oracle, you should be able to do:
select avg(replace(salary, '0', '') + 0) as myavg
from mytable
This involves two steps of implicit conversion: replace() forces string context, and + 0 turns the result back into a number. In SQL Server, you will get an integer result; if you want a decimal average, add a decimal value instead, i.e. + 0.0 rather than + 0.
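For instance, a minimal sketch of that SQL Server variant with the string step spelled out explicitly (reusing the mytable/salary names from above):
-- Explicit cast makes the string step visible; + 0.0 forces a decimal average.
-- Assumes no salary consists only of 0 digits (that would leave an empty string).
select avg(replace(cast(salary as varchar(20)), '0', '') + 0.0) as myavg
from mytable;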
In Postgres, where implicit conversion is not happening as easily, you would use explicit casts:
select avg(replace(salary::text, '0', '')::int) as myavg
from mytable
This returns a decimal value.
Do you just want conditional aggregation?
select avg(salary), avg(case when salary <> 0 then salary end)
from t;
or do you want division?
select id, name, floor(salary / 10)
from t;
This produces the results you specify, but it has nothing to do with averages.

Extracting number of specific length from a string in Postgres

I am trying to extract a set of numbers from comments like
"on april-17 transactions numbers are 12345 / 56789"
"on april-18 transactions numbers are 56789"
"on may-19 no transactions"
Which are stored in a column called "com" in table comments
My requirement is to get the numbers of a specific length, in this case length 5, so 12345 and 56789 from the above strings separately. A comment may contain zero five-digit numbers, or more than two.
I tried using regexp_replace, with the following result. I am trying to find an efficient regex or another method to achieve this.
select regexp_replace(com, '[^0-9]',' ', 'g') from comments;
regexp_replace
----------------------------------------------------
17 12345 56789
I expect the result to contain only:
column1 | column2
12345   | 56789
There is no easy way to create a query which returns an arbitrary number of columns: a query cannot return one column for one comment and two columns for the next.
For a fixed two columns:
demo:db<>fiddle
SELECT
    matches[1] AS col1,
    matches[2] AS col2
FROM (
    SELECT
        array_agg(regexp_matches[1]) AS matches
    FROM
        regexp_matches(
            'on april-17 transactions numbers are 12345 / 56789',
            '\d{5}',
            'g'
        )
) s
regexp_matches() gives all matches, one row per match
array_agg() puts all the elements into one array
The array elements can then be given out as separate columns.
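If one row per number is acceptable instead of a fixed set of columns, a simpler sketch (using the comments/com names from the question) would be:
-- One output row per five-digit number found in each comment;
-- comments with no five-digit number produce no rows.
SELECT com,
       (regexp_matches(com, '\d{5}', 'g'))[1] AS number
FROM comments;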

How can I "dynamically" split a varchar column by specific characters?

I have a column that stores 2 values. Example below:
| Column 1 |
|some title1 =ExtractThis ; Source Title12 = ExtractThis2|
I want to extract 'ExtractThis' into one column and 'ExtractThis2' into another column. I've tried using substring but it doesn't work, as the data in Column 1 is variable and therefore it doesn't always carve out my intended values. SQL below:
SELECT substring(d.Column1,13,24) FROM dbo.Table d
This returns 'ExtractThis', but for other rows it either takes too much or too little. Is there a function or combination of functions that will allow me to split consistently on the '=' and ';' characters? Those are consistent in my column, unlike my length count.
select substring(col1, charindex('=', col1) + 1, charindex(';', col1) - charindex('=', col1) - 1) Val1,
       substring(col1, charindex('=', col1, charindex(';', col1)) + 1, len(col1)) Val2
from #data
There is duplicate calculation here that could be reduced from 5 CHARINDEX calls to 3 per row, but I want to believe SQL Server does this simple optimization itself.
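For illustration, a sketch of that reduction using CROSS APPLY to compute the two positions once per row (untested against the original #data, so treat it as a sketch):
-- eq = position of the first '=', sc = position of the ';', computed once per row.
select substring(col1, p.eq + 1, p.sc - p.eq - 1) as Val1,
       substring(col1, charindex('=', col1, p.sc) + 1, len(col1)) as Val2
from #data
cross apply (select charindex('=', col1) as eq,
                    charindex(';', col1) as sc) as p;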

Explode range of integers out for joining in SQL

I have one table that stores a range of integers in a field, sort of like a print range, (e.g. "1-2,4-7,9-11"). This field could also contain a single number.
My goal is to join this table to a second one that has discrete values instead of ranges.
So if table one contains
1-2,5
9-15
7
And table two contains
1
2
3
4
5
6
7
8
9
10
The result of the join would be
1-2,5 1
1-2,5 2
1-2,5 5
7 7
9-15 9
9-15 10
Working in SQL Server 2008 R2.
Use a string split function of your choice to split on comma. Figure out the min/max values and join using between.
SQL Fiddle
MS SQL Server 2012 Schema Setup:
create table T1(Col1 varchar(10))
create table T2(Col2 int)
insert into T1 values
('1-2,5'),
('9-15'),
('7')
insert into T2 values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)
Query 1:
select T1.Col1,
       T2.Col2
from T2
  inner join (
      select T1.Col1,
             cast(left(S.Item, charindex('-', S.Item + '-') - 1) as int) MinValue,
             cast(stuff(S.Item, 1, charindex('-', S.Item), '') as int) MaxValue
      from T1
        cross apply dbo.Split(T1.Col1, ',') as S
  ) as T1
    on T2.Col2 between T1.MinValue and T1.MaxValue
Results:
| COL1 | COL2 |
----------------
| 1-2,5 | 1 |
| 1-2,5 | 2 |
| 1-2,5 | 5 |
| 9-15 | 9 |
| 9-15 | 10 |
| 7 | 7 |
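The query above assumes a dbo.Split table-valued function already exists. If you don't have one, a minimal recursive-CTE sketch (written against SQL Server 2008 R2, with the item column named Item to match the query above) could look like this:
-- Minimal splitter: returns one row per delimited item in @List.
-- Note: recursive CTEs default to MAXRECURSION 100, so very long lists
-- would need a numbers-table splitter instead.
create function dbo.Split(@List varchar(8000), @Delim char(1))
returns table
as
return
    with Pieces(StartPos, EndPos) as
    (
        select 1,
               charindex(@Delim, @List + @Delim)
        union all
        select EndPos + 1,
               charindex(@Delim, @List + @Delim, EndPos + 1)
        from Pieces
        where charindex(@Delim, @List + @Delim, EndPos + 1) > 0
    )
    select substring(@List, StartPos, EndPos - StartPos) as Item
    from Pieces;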
Like everybody has said, this is a pain to do natively in SQL Server. If you must then I think this is the proper approach.
First determine your rules for parsing the string, then break down the process into well-defined and understood problems.
Based on your example, I think this is the process:
Separate comma separated values in the string into rows
If the data does not contain a dash, then it's finished (it's a standalone value)
If it does contain a dash, parse the left and right sides of the dash
Given the left and right sides (the range) determine all the values between them into rows
I would create a temp table to hold the parsing results, which needs two columns:
SourceRowID INT, ContainedValue INT
and another to use for intermediate processing:
SourceRowID INT, ContainedValues VARCHAR
Parse your comma-separated values into their own rows using a CTE like this one. Step 1 is now a well-defined and understood problem to solve:
Turning a Comma Separated string into individual rows
So your result from the source
'1-2,5'
will be:
'1-2'
'5'
From there, SELECT from that processing table where the field does not contain a dash. Step 2 is now a well-defined and understood problem to solve. These are standalone numbers and can go straight into the results temp table. The results table should also get the ID reference to the original row.
Next would be to parse the values to the left and right of the dash using CHARINDEX to locate it, then the appropriate LEFT and RIGHT functions as needed. This will give you the starting and ending value.
Here is a relevant question for accomplishing this. Step 3 is now a well-defined and understood problem to solve:
T-SQL substring - separating first and last name
Now you have separated the starting and ending values. Use another function which can explode this range. Step 4 is now a well-defined and understood problem to solve:
SQL: create sequential list of numbers from various starting points
SELECT all N between #min and #max
What is the best way to create and populate a numbers table?
and also insert the exploded values into the results temp table.
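As a sketch of that insert, assuming the step-3 output has been captured in a hypothetical #parsed (SourceRowID, StartValue, EndValue) table, with #results standing in for the results temp table described above, and using master..spt_values only as a convenient ready-made source of integers 0-2047:
-- Explode each parsed start/end pair into one row per contained integer.
insert into #results (SourceRowID, ContainedValue)
select p.SourceRowID, n.number
from #parsed as p
join master..spt_values as n
  on n.type = 'P'
 and n.number between p.StartValue and p.EndValue;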
Now what you should have is a temp table with every value in the exploded range.
Simply JOIN that to the other table on the values now, then to your source table on the ID reference and you're there.
My suggestion is to add one more field and many more records to your ranges table. Specifically, the primary key would be the integer and the other field would be the range. Records would look like this:
number range
1 1-2,5
2 1-2,5
3 na
4 na
5 1-2,5
etc
Having said that, this is still rather limiting because a number can only have one range. If you want to be thorough, set up a many to many relationship between numbers and ranges.
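A sketch of what that many-to-many setup could look like (table and column names are hypothetical):
-- Mapping between discrete numbers and the range strings that cover them.
create table NumberRange (
    Number    int         not null,
    RangeText varchar(20) not null,   -- e.g. '1-2,5'
    primary key (Number, RangeText)
);
Joining through this table replaces the range parsing entirely, at the cost of keeping it populated.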
As far as I can tell, your best option is something like the approach below:
Create a table-valued function that accepts your ranges and converts them to a collection of ints. So 1-3,5 would return:
1
2
3
5
Then use these results to join to other tables. I don't have an exact function to do this at hand, but this one seems like an excellent start.
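A hedged sketch of such a function, reusing the min/max parsing expressions from the accepted query above, the dbo.Split sketch shown earlier, and master..spt_values as a numbers source (so it only covers values 0-2047; substitute a real numbers table for larger ranges):
-- Split on comma, then expand each piece ("n" or "a-b") into its integers.
create function dbo.RangeToInts(@Ranges varchar(8000))
returns table
as
return
(
    select n.number as Value
    from dbo.Split(@Ranges, ',') as s
    cross apply (select
            cast(left(s.Item, charindex('-', s.Item + '-') - 1) as int) as MinValue,
            cast(stuff(s.Item, 1, charindex('-', s.Item), '') as int) as MaxValue
        ) as b
    join master..spt_values as n
      on n.type = 'P'
     and n.number between b.MinValue and b.MaxValue
);
Usage would then be along the lines of select T1.Col1, r.Value from T1 cross apply dbo.RangeToInts(T1.Col1) as r; which can be joined to the second table on Value.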