Query to retrieve only columns which have last name starting with 'K' - sql

]
The name column has both first and last name in one column.
Look at the 'rep' column. How do I filter only those rep names where the last name is starting with 'K'?

The way that table is defined won't allow you to do that query reliably--particularly if you have international names in the table. Not all names come with the given name first and the family name second. In Asia, for example, it is quite common to write names in the opposite order.
That said, assuming you have all western names, it is possible to get the information you need--but your indexes won't be able to help you. It will be slower than if your data had been broken out properly.
SELECT rep,
RTRIM(LEFT(LTRIM(RIGHT(rep, LEN(rep) - CHARINDEX(' ', rep))), CHARINDEX(' ', LTRIM(RIGHT(rep, LEN(rep) - CHARINDEX(' ', rep)))) - 1)) as family_name
WHERE family_name LIKE 'K%'
So what's going on in that query is some string manipulation. The dialect up there is SQL Server, so you'll have to refer to your vendor's string manipulation function. This picks the second word, and assumes the family name is the second word.
LEFT(str, num) takes the number of characters calculated from the left of the string
RIGHT(str, num) takes the number of characters calculated from the right of the string
CHARINDEX(char, str) finds the first index of a character
So you are getting the RIGHT side of the string where the count is the length of the string minus the first instance of a space character. Then we are getting the LEFT side of the remaining string the same way. Essentially if you had a name with 3 parts, this will always pick the second one.
You could probably do this with SUBSTRING(str, start, end), but you do need to calculate where that is precisely, using only the string itself.
Hopefully you can see where there are all kinds of edge cases where this could fail:
There are a couple records with a middle name
The family name is recorded first
Some records have a title (Mr., Lord, Dr.)
It would be better if you could separate the name into different columns and then the query would be trivial--and you have the benefit of your indexes as well.
Your other option is to create a stored procedure, and do the calculations a bit more precisely and in a way that is easier to read.

Assuming that the name is <firstname> <lastname> you can use:
where rep like '% K%'

Related

SQL Query returning incorrectly formatted data even though the function is returning the correct values

The data in my table represents physical locations with the following data: A municipality name, a state(province/region), and a unique ID which is comprised of a prefix and postfix separated by a dash (all NVARCHAR).
Name
State
UniqueID
Atlanta
Georgia
A12-1383
The dash in the UniqueID is not always in the same position (can be A1-XYZ, A1111-XYZ, etc.). The postfix will always be numerical.
My approach is using a combination of RIGHT and CHARINDEX to first find the index of the dash and then isolate the postfix to the right of the dash, and then applying a MAX to the result. My issue so far has been that this combination is sometimes returning things like -1234 or 12-1234, i.e, including the dash and occasionally some of the prefix. Because of this, the max is obviously applied incorrectly.
Here is my query thus far:
select name, max(right(uniqueid,(Charindex('-',uniqueid)))) as 'Max'
from locations
where state = 'GA' and uniqueid is not NULL
group by name
order by name ASC
This is what the results look like for the badly formatted rows:
Name
Max
Atlanta
11-2442
Savannah
-22
This is returning incorrectly formatted data for 'Max', so I isolated the functions.
CHARINDEX is correctly returning the position of the dash, including in cases where the function is returning badly formatted data. Since the dash is never in the same place, I cannot isolate the RIGHT function to see if that is the problem.
Am I using these functions incorrectly? Or is my approach altogether incorrect?
Thanks!
CHARINDEX is counting how many chars before the - i.e. how many chars on the left hand side, so using RIGHT with the number of chars on the left isn't going to work. You need to find out how many chars are on the right which can be done with LEN() to get the total length less the number of chars on the left.
SELECT RIGHT(MyColumn,LEN(MyColumn)-CHARINDEX('-',MyColumn))
FROM (
VALUES
('A12ssss-1383'),
('A12-13834'),
('A12ss-138')
) X (MyColumn);

Extracting number from a description filed for Hive

I am trying to extract numbers from a column which contains number and characters. They are however, structured hence I would like to know if we can just extract the number. I wonder if explode will work.
The current description column:
I need a help in setting up a campaign soon. Revenue: 1000
What I tried to do is to create a new column for that number called revenue.
My current command is:
SELECT description, X.value
FROM task
lateral view
explode(description) X as value
You could try using the Split function like this
SELECT
description,
split (description, ':\\s')[1] as Revenue
FROM task
Where :\\s is the regex pattern to match a colon followed by a space.
-------- EDIT: --------
If there are multiple : in the data then you could try (not sure if it will work though) the following (assuming that the last split will always contain the digits)
SELECT
description,
split (description, ':\\s')[size(split (description, ':\\s')) - 1] as Revenue
FROM task
Also your try of using Revenue\\s:\\s as the pattern may not be working due to the extra space matching try `Revenue:\s'
---------------------------
Or alternatively if the description doesn't always have the colon you could use the method regexp_extract(string subject, string pattern, int index)
Something like:
SELECT
description,
regexp_extract(description, '.*?(\d+)$', 1) as Revenue
FROM task
Where the regex pattern .*?(\\d+)$ will match multiple digits at the end of the description (but only if they are at the end)
With the latter option you should be able to find a suitable pattern if the description is not always consistent.
You can also use the following to remove any non-numeric characters:
select regexp_replace(description, '[^0-9]', '') as Revenue from task
This only works, though, if there is only a single number in the [description] field. If it's reliably formatted, using a more specific RegEx would likely be preferable.

how do I retrieve data from a sql table with huge number of inputs for a single column

I have a Company table in SQL Server and I would like to retrieve list of data related to particular companies and list of companies is very huge of around 200 company names and I am trying to use IN clause of T-SQL which is complicating the retrieval as few the companies have special characters in their name like O'Brien and so its throwing up an error as it is obvious.
SELECT *
FROM COMPANY
WHERE COMPANYNAME IN
('Archer Daniels Midland'
'Shell Trading (US) Company - Financial'
'Redwood Fund, LLC'
'Bunge Global Agribusiness - Matt Thibodeaux'
'PTG, LLC'
'Morgan Stanley Capital Group'
'Vitol Inc.'..
.....
....
.....)
Above is the script that is not working for obvious reasons, is there any way I can input those company names from an excel file and retrieve the data?
The easiest way would be to make a table and join it:
CREATE TABLE dbo.IncludedCompanies (CompanyName varchar(1000)
INSERT INTO dbo.IncludedCompanies
VALUES
('Archer Daniels Midland'),
('PTG, LLC')
...
SELECT *
FROM Company C
JOIN IncludedCompanies IC
ON C.CompanyName = IC.CompanyName
I do not think that mysql knows how to handle excel format, but you can fix your query.
Check how complicated names are stored in database (check if they have escape characters in them or anything else".
Replace all ' with \' in your query and it will take care of the ' characters
mysql> select now() as 'O\'Brian'; returns
O'Brian
2014-03-17 15:06:39
So i'm guessing you have a excel sheet with a column containing these names, and you want to use this in your where clause. In addition, some of the values have special characters in them, which needs to be escaped.
First thing you do is to escape the '-characters. You do this in excel, with a search replace for all occurences of ' with '' (the escaped version in sqlserver (\' in MySQL.)) Then, create a new column on each side side of your companies column, and in the first row input a ' on the left hand side, and ', on the right. Then use the copy cell functionality (the little square in the bottom right of the cell when you select it) to copy the cells to the left and right to all the rows, as far as the company list goes (just grab the square and pull it downwards..)
Then, take your list, now containing three columns and x rows and paste it into your favorite text editor. It should look something like this:
' Company#1 ',
' Company with special '' char ',
[...]
' Last company ',
Now, you will have some whitespace to get rid of. Use search replace and replace two space characters with nothing, and repeat (or take the space from the first ' to the start of the text and replace this with nothing.
Now, you should have a list of:
'Company#1',
'Company with special '' char',
[...]
'Last company',
Remove the last comma, and you'll have a valid list of parameters to your in-clause (or a (temporary) table if you want to keep your query a bit cleaner.)

Matching exactly 2 characters in string - SQL

How can i query a column with Names of people to get only the names those contain exactly 2 “a” ?
I am familiar with % symbol that's used with LIKE but that finds all names even with 1 a , when i write %a , but i need to find only those have exactly 2 characters.
Please explain - Thanks in advance
Table Name: "People"
Column Names: "Names, Age, Gender"
Assuming you're asking for two a characters search for a string with two a's but not with three.
select *
from people
where names like '%a%a%'
and name not like '%a%a%a%'
Use '_a'. '_' is a single character wildcard where '%' matches 0 or more characters.
If you need more advanced matches, use regular expressions, using REGEXP_LIKE. See Using Regular Expressions With Oracle Database.
And of course you can use other tricks as well. For instance, you can compare the length of the string with the length of the same string but with 'a's removed from it. If the difference is 2 then the string contained two 'a's. But as you can see things get ugly real soon, since length returns 'null' when a string is empty, so you have to make an exception for that, if you want to check for names that are exactly 'aa'.
select * from People
where
length(Names) - 2 = nvl(length(replace(Names, 'a', '')), 0)
Another solution is to replace everything that is not an a with nothing and check if the resulting String is exactly two characters long:
select names
from people
where length(regexp_replace(names, '[^a]', '')) = 2;
This can also be extended to deal with uppercase As:
select names
from people
where length(regexp_replace(names, '[^aA]', '')) = 2;
SQLFiddle example: http://sqlfiddle.com/#!4/09bc6
select * from People where names like '__'; also ll work

Count with muliple where conditions in ms access

I have the query below;
Select count(*) as poor
from records where deviceId='00019' and type='Poor' and timestamp between #14-Sep-2012 01:01:01# and #24-Sep-2012 01:01:01#
table is like;
id. deviceId, type, timestamp
data is like;
data is like;
1, '00019', 'Poor', '19-Sep-2012 01:01:01'
2, '00019', 'Poor', '19-Sep-2012 01:01:01'
3, '00019', 'Poor', '19-Sep-2012 01:01:01'
4, '00019', 'Poor', '19-Sep-2012 01:01:01'
i am trying to count the devices with a specific specific type.
Please help.. access always returns wrong data. it is returning 1 while 00019 has 4 entries for poor
Type and timestamp are both reserved words, so enclose them in square brackets in your query like this: [type] and [timestamp]. I doubt those reserved words are the cause of your problem, but it's hard to predict exactly when reserved words will cause query problems, so just rule out this possibility by using the square brackets.
Beyond that, stored text values sometimes contained extra non-visible characters. Check the lengths of the stored text values to see whether any are longer than expected.
SELECT
Len(deviceId) AS LenOfDeviceId,
Len([type]) AS LenOfType,
Len([timestamp]) AS LenOfTimestamp
FROM records;
In comments you mentioned spaces (ASCII value 32) in your stored values. I had been thinking we were dealing with other non-printable/invisible characters. If you have one or more actual space characters at the beginning and/or end of a stored deviceId value, the Trim() function will discard them. So this query will give you different length numbers in the two columns:
SELECT
Len(deviceId) AS LenOfDeviceId,
Len(Trim(deviceId)) AS LenOfDeviceId_NoSpaces
FROM records;
If the stored values can also include spaces within the string (not just at the beginning and/or end), Trim() will not remove those. In that case, you could use the Replace() function to discard all the spaces. Note however a query which uses Replace() must be run from inside an Access application session --- you can't use it from Java code.
SELECT
Len(deviceId) AS LenOfDeviceId,
Len(Replace(deviceId, ' ', '')) AS LenOfDeviceId_NoSpaces
FROM records;
If that query returns the same length numbers in both columns, then we are not dealing with actual space characters (ASCII value 32) ... but some other type of character(s) which look "space-like".
If you want to count devices with specific type irrespective of deviceids then use this:
Select count(*) as excellent
from records where type='Poor'
If you want to count devices with specific deviceid irrespective of types then use this:
Select count(*) as excellent
from records where deviceId='00019'