SQL: selecting distinct substring from a field

I'm blacking out on my basic SQL and would appreciate a quick hand:
I have a SQLite table, with 2 columns: Datetime, and a string saying something like "call from 555-555-3344".
I need a simple query that will give me a count of all distinct phone numbers that called on a certain day.
If the field contained just the number, I could use SELECT DISTINCT on it. How do I do it if the value (the phone number) is a substring of that field (though always the last 10 digits)?
Assistance, as always, much appreciated.
Guy

You can use the following (I used 12 instead of 10 in order to include the two - separators):
SELECT COUNT(DISTINCT SUBSTR(phone_nbr, -12))
FROM table
WHERE call_dt = :call_dt;
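A minimal sketch of the query above, run against an in-memory SQLite database from Python. The table name calls, the column names call_dt and phone_nbr, and the sample rows are assumptions made for illustration. Because the question's first column holds a full datetime, this sketch filters on date(call_dt) instead of comparing the raw value, which is one possible way to select "a certain day".

```python
import sqlite3


def count_distinct_callers(conn, day):
    """Count distinct trailing 12-character phone numbers logged on one day."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT SUBSTR(phone_nbr, -12)) "
        "FROM calls WHERE date(call_dt) = ?",
        (day,),
    ).fetchone()
    return row[0]


# Hypothetical table and rows, shaped like the data described in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (call_dt TEXT, phone_nbr TEXT)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [
        ("2015-03-01 09:15:00", "call from 555-555-3344"),
        ("2015-03-01 10:30:00", "call from 555-555-3344"),
        ("2015-03-01 11:45:00", "call from 555-555-9876"),
        ("2015-03-02 08:00:00", "call from 555-555-1111"),
    ],
)

print(count_distinct_callers(conn, "2015-03-01"))
```

A negative start position makes SUBSTR count from the right end of the string, so the repeated caller on the first day is counted only once.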

Related

Define a temporary table within an SQL "select" query

I'm referring to MS Access SQL. Suppose I have a column with dates of birth of a population and the decades which these dates fall into.
Year Decade
1971 8
1953 6
1958 6
1929 3
1930 4
I want to create a query which will show how many people were born in each decade of a century.
I know it's going to be something like
SELECT (Year mod 100) \ 10 + 1 as [Decade], Count(*) as [How many people]
FROM People
GROUP BY (Year mod 100) \ 10 + 1
My problem is that there might be some decades in which no one was born from my population and I still want these to show up in my query, with a zero.
My ideal solution would be to define a table on the fly, consisting of rows {1,2,3,4…}, much as you would in any programming language (say, decades = range(1, 11) in Python), then create the table with the counted people, and then join the two with a left join.
It seems not possible, but I'm a newbie to SQL and databases. Is that possible? What are other approaches?

MsAccess does not have a function like Range() that you can use. What I have done in my databases is create a table of numbers to use for cases like this.
The simplest way to create this table is to build the column of numbers in an Excel spreadsheet (for instance, from 1 to 1,000) and then import the spreadsheet as a new table. Then make whatever adjustments are appropriate: for example, the new table should have a primary key on the numbers column, and the numbers column should probably be of a long integer data type. You could call the table [Numbers] and name the column [NumberValue]; these names are up to you (you could just as easily call your column [Nums] or even just [N]). But I would caution against using the name [Number] for a table or column, because Number is a datatype name and MSAccess does not always play nicely with names that are SQL or VBA keywords.
Now you can use your new table with regular sql: Select * from [Numbers] where NumberValue >= 1 and NumberValue <= 10
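To show how a numbers table solves the empty-decade problem, here is a small sketch that uses SQLite from Python as a stand-in for Access (an assumption; the SQL dialects differ, for example % and / here replace Access's Mod and \). The People rows are the five years from the question, and the HowMany alias is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Numbers table holding 1..10, one row per decade of a century.
conn.execute("CREATE TABLE Numbers (NumberValue INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO Numbers VALUES (?)", [(n,) for n in range(1, 11)])

# Sample years taken from the question.
conn.execute("CREATE TABLE People (Year INTEGER)")
conn.executemany(
    "INSERT INTO People VALUES (?)",
    [(1971,), (1953,), (1958,), (1929,), (1930,)],
)


def people_per_decade(conn):
    """Left-join the numbers table to People so empty decades show a zero."""
    return conn.execute(
        """
        SELECT n.NumberValue AS Decade, COUNT(p.Year) AS HowMany
        FROM Numbers AS n
        LEFT JOIN People AS p
          ON (p.Year % 100) / 10 + 1 = n.NumberValue
        WHERE n.NumberValue BETWEEN 1 AND 10
        GROUP BY n.NumberValue
        ORDER BY n.NumberValue
        """
    ).fetchall()


for decade, how_many in people_per_decade(conn):
    print(decade, how_many)
```

COUNT(p.Year) counts only the joined rows that actually matched, which is why decades with no births come back as 0 rather than 1.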

First, create a small query to generate numbers:
Select Distinct Abs([id] Mod 10) As N
From MSysObjects;
Save it as Ten.
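Then create a query that left-joins the counts to it, so that decades without any births still appear. Ten.N runs from 0 to 9, so the decade number used in the question is Ten.N + 1:
Select
Ten.N + 1 As Decade,
Val(Nz(T.C)) As [How many people]
From
Ten
Left Join
(Select Count(*) As C, ([Year] Mod 100) \ 10 As D
From People
Group By ([Year] Mod 100) \ 10) As T
On Ten.N = T.D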
Extended example that returns the dates of this year:
SELECT
DateSerial(Year(Date()),1,1+[Ten_0].[N]+[Ten_1].[N]*10+[Ten_2].[N]*100) AS [Date]
FROM
Ten AS Ten_0,
Ten AS Ten_1,
Ten AS Ten_2
WHERE
(((DateSerial(Year(Date()),1,1+[Ten_0].[N]+[Ten_1].[N]*10+[Ten_2].[N]*100))<=DateSerial(Year(Date()),12,31))
AND
((Ten_2.N)<4))
ORDER BY
DateSerial(Year(Date()),1,1+[Ten_0].[N]+[Ten_1].[N]*10+[Ten_2].[N]*100);

SQL Server : aggregate function doesn't work?

I have imported a price list from a csv to my SQL Server database. That has worked fine. But now some weird stuff. Table is named PRICE which includes a column (and some more) Endprice and a total of 761 rows. All datatypes are varchar(50).
SELECT MAX(Endprice)
FROM PRICE
When I want this simple SQL statement to show the highest price in the column, I get a wrong result. I don't know why.
I get 98,39 as a result, but that's definitely wrong; it must be 100,73.
Here you can see a part of the data:
And now the wrong MAX() result:
BUT when I'm using the MIN function I get the highest one!? The min is somewhere at ~50 (not shown in the screenshot part).
The resultset of SELECT Endprice FROM PRICE is correct. I am at my wit's end.

This is because your column is a varchar, so MIN and MAX compare the values character by character. The column should be a decimal or money type, so that it sorts by numeric value (instead of the alphabetic sort you are getting now).
Alphabetic sort: the character 9 sorts after 1, so 98,39 is the max. For the same reason, MIN returns 100,73.
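The effect can be reproduced with a small sketch, here using SQLite from Python as a stand-in for SQL Server (an assumption; only the string-versus-number comparison is the point). The values 98,39 and 100,73 come from the question, while 50,10 is a made-up row standing in for the "somewhere at ~50" minimum. The comma is replaced with a dot before casting, since SQLite's CAST expects a dot as the decimal separator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRICE (Endprice TEXT)")
conn.executemany(
    "INSERT INTO PRICE VALUES (?)",
    [("98,39",), ("100,73",), ("50,10",)],  # "50,10" is an assumed sample value
)

# Text comparison: '1' < '5' < '9', so the strings sort alphabetically.
text_max = conn.execute("SELECT MAX(Endprice) FROM PRICE").fetchone()[0]
text_min = conn.execute("SELECT MIN(Endprice) FROM PRICE").fetchone()[0]

# Numeric comparison: convert the decimal comma, then cast to a number.
numeric_max = conn.execute(
    "SELECT MAX(CAST(REPLACE(Endprice, ',', '.') AS REAL)) FROM PRICE"
).fetchone()[0]

print(text_max, text_min, numeric_max)
```

The text-based MAX and MIN match what the question reports, and the cast version returns the value the asker expected.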

The reason is that Endprice is a varchar().
Here are two solutions:
order by len(Endprice), Endprice
This works assuming that all the price values have the same structure.
Or:
order by cast(Endprice as float)
If you could have non-numeric values (always a danger when storing numbers in the wrong data type):
order by (case when isnumeric(Endprice) = 1 then cast(Endprice as float) end)
Or better yet:
alter table PRICE alter column Endprice money
Then you don't have to worry about having the wrong type for the column.

Your problem is that the Endprice column is varchar(50), so it compares strings, not numbers, which means that 9 > 1 no matter what comes after the first digit. You have to convert the values to numbers before taking the max.
You should also seriously consider doing what @a_horse_with_no_name suggested: change your column to a numeric column type.
This is an example of how to solve your actual problem:
select max(cast(endprice as money)) from sample
See it here: http://sqlfiddle.com/#!3/767f6/1
Note that I used . as the decimal separator; this will depend on your database language setup.

How to combine the LIKE function with a DATE_PART function in PostgreSQL?

Using Postgres, but if someone knows how to do this in standard SQL that would be a great start. I am joining to a table via a character varying column. This column contains values such as:
PC11941.2004
PC14151.2004
PC21213.2003
SPC21434.2003
PC17715.04V1
PC18733.2002
0MRACCT_ALL.GLFUNCT
A lot of the numbers after the periods correspond to years. I want to join the table via the current year. So, for example, I could JOIN on the condition LIKE '%2015'.
But I want to create this view and never return to it, so I would need to join against something like get_fy_part('YEAR', clock_timestamp()).
Not sure how I go about writing that. I haven't had success, yet.

You can get the current year with date_part('year', CURRENT_DATE)
Something like this should work:
SELECT * FROM mytable WHERE mycolumn LIKE ('%' || date_part('year', CURRENT_DATE))
The || operator concatenates the percent-sign with the year.
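As a rough illustration of the pattern-building idea, the sketch below uses SQLite from Python instead of Postgres (an assumption, so it builds the pattern in Python where the answer uses date_part inside the query). The table and column names follow the answer, the rows are the sample values from the question, and the function name rows_for_year is hypothetical. The pattern includes the dot so that only a trailing ".YYYY" matches.

```python
import sqlite3
from datetime import date


def rows_for_year(conn, year=None):
    """Return mycolumn values ending in '.<year>' (default: the current year)."""
    if year is None:
        year = date.today().year
    pattern = "%." + str(year)  # e.g. '%.2015'
    rows = conn.execute(
        "SELECT mycolumn FROM mytable WHERE mycolumn LIKE ? ORDER BY mycolumn",
        (pattern,),
    ).fetchall()
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (mycolumn TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [
        ("PC11941.2004",), ("PC14151.2004",), ("PC21213.2003",),
        ("SPC21434.2003",), ("PC17715.04V1",), ("PC18733.2002",),
        ("0MRACCT_ALL.GLFUNCT",),
    ],
)

print(rows_for_year(conn, 2004))
```

Passing the year explicitly keeps the example reproducible; calling rows_for_year(conn) with no year uses today's date, which is closer to what a view would need.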
I hope that helps!

Use the function RIGHT().
SELECT originalColumn, RIGHT(originalColumn,4)
FROM table;
This will get you the years you are interested in.
If you want everything after the dot, then something like:
SELECT originalColumn, RIGHT(originalColumn, length(originalColumn) - position('.' in originalColumn))
FROM table

Depends on the exact rules - and actually implemented CHECK constraints for the column.
If there is always a single dot in your column col and all your years have 4 digits:
Basic solution
SELECT * FROM tbl
WHERE col LIKE to_char(now(), '"%."YYYY');
Why?
It's most efficient to compare to the same data type. Since the column is a character type (varchar), rather use to_char() (returns text, which is effectively the same as varchar) than EXTRACT or date_part() (return double precision).
More importantly, this expression is sargable. That's generally cheapest and allows (optional) index support. In your case, a trigram index would work:
See: "PostgreSQL LIKE query performance variations"
Optimize
If you want to be as fast (read performance) and accurate as possible, and your table has more than a trivial number of rows, go with a specialized partial expression index:
CREATE INDEX tbl_year_idx ON tbl (cast(right(col, 4) AS int) DESC)
WHERE col ~ '\.\d{4}$'; -- ends with a dot and 4 digits
Matching query:
SELECT * FROM tbl
WHERE col ~ '\.\d{4}$' -- repeat index condition
AND right(col, 4)::int = EXTRACT(year FROM now())::int;
Test performance with EXPLAIN ANALYZE.
You could even go one step further and tailor the index for the current year:
CREATE INDEX tbl_year2015_idx ON tbl (tbl_id) -- any (useful?) column
WHERE col LIKE '%.2015';
Works with the first "basic" query.
You would have to (re-)create the index for each year. A simple solution would be to create indexes for a couple of years ahead and append another one each year automatically ...
This is also the point where you should consider the alternative: store the year as a redundant integer column in your table and simplify the rest.
That's what I would do.

SQL - Select and return functions

I am using SQL*Plus. This is what I am to do: Using the BOOK_ORDER table, create a query using the correct function to return the order number, the date ordered, the date shipped, and a column representing the number of months between the two dates for all columns where a date shipped exists. Format the number returned from the function to display only two decimals, and give the column an alias of "Months Between".
NOTE: Be sure that all of the numbers in the fourth column are positive numbers
I've started it this way; however, I am a bit lost and confused in what I am doing.
SELECT BOOK_ORDER.ORDERID, BOOK_ORDER.ORDERDATE, BOOK_ORDER.SHIPDATE ||', ' ||
Can someone help?

The assignment clearly points to the Oracle built-in function MONTHS_BETWEEN, which is described in the Oracle documentation. MONTHS_BETWEEN returns a negative number when its first date is earlier than its second, so ABS keeps the fourth column positive, as the note requires.
select b.orderid
, b.orderdate
, b.shipdate
, round(abs(months_between(b.shipdate, b.orderdate)), 2) as "Months Between"
from book_order b
where b.shipdate is not null
/
Please be sure to give SO credit when you hand in your homework.

subtracting in SQL Server

I have a table in SQL Server that holds scores for some competencies: one score for the standard and one for the actual score. For instance, S25 is the actual score and C25 is the standard for that score. I need to find the difference between the two so I can see who was above and below the standard, and I cannot figure out how to get the subtraction to work. The way I tried was:
Select (S25) - (C25) AS 25_Score
which did not work.

If the alias starts with a number, bracket it, and that might work. What error do you get?
select (S25)-(C25) AS [25_Score]
from table_name

Your query should work if your columns are a numeric datatype.
The only issue I see is that your alias starts with a number. You will need to wrap it in square brackets:
Select (S25) - (C25) AS [25_Score]
from yt;

It may be that the columns are varchar, in which case you have to convert them:
select convert(int,[S25])-convert(int,[C25]) AS [25_Score]
from table_name
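The two fixes from the answers above (a bracketed alias and an explicit conversion) can be tried together in a short sketch. It uses SQLite from Python as a stand-in for SQL Server (an assumption; SQLite accepts square-bracket identifiers, and CAST ... AS INTEGER plays the role of convert(int, ...)). The table name scores, the person column, and the rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Scores stored as text, as in the varchar case described above.
conn.execute("CREATE TABLE scores (person TEXT, S25 TEXT, C25 TEXT)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [("Ann", "4", "3"), ("Bob", "2", "3"), ("Cy", "3", "3")],
)

cur = conn.execute(
    "SELECT person, CAST(S25 AS INTEGER) - CAST(C25 AS INTEGER) AS [25_Score] "
    "FROM scores ORDER BY person"
)
alias = cur.description[1][0]  # name of the second result column
diffs = cur.fetchall()

print(alias, diffs)
```

A positive difference means the actual score is above the standard, a negative one below it, and zero means the standard was met exactly.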
