SQL Select Convert State Name To Abbreviation - sql

In a SQL SELECT statement, how can I convert a full state name to its abbreviation (e.g. New York to NY)? I'd like to do this without joins if possible. What would the regexp_replace look like?
select regexp_replace(table.state, 'New York', 'NY', 'g') as state
Can this approach be applied en masse to all states?
For reference, a list of state names and abbreviations: https://gist.github.com/esfand/9443427.

Here it is in raw WHEN/THEN form if needed, along with Canadian provinces:
CASE "YOUR COLUMN CONTAINING FULL STATE NAMES"
WHEN 'Alabama' THEN 'AL'
WHEN 'Alaska' THEN 'AK'
WHEN 'Arizona' THEN 'AZ'
WHEN 'Arkansas' THEN 'AR'
WHEN 'California' THEN 'CA'
WHEN 'Colorado' THEN 'CO'
WHEN 'Connecticut' THEN 'CT'
WHEN 'Delaware' THEN 'DE'
WHEN 'District of Columbia' THEN 'DC'
WHEN 'Florida' THEN 'FL'
WHEN 'Georgia' THEN 'GA'
WHEN 'Hawaii' THEN 'HI'
WHEN 'Idaho' THEN 'ID'
WHEN 'Illinois' THEN 'IL'
WHEN 'Indiana' THEN 'IN'
WHEN 'Iowa' THEN 'IA'
WHEN 'Kansas' THEN 'KS'
WHEN 'Kentucky' THEN 'KY'
WHEN 'Louisiana' THEN 'LA'
WHEN 'Maine' THEN 'ME'
WHEN 'Maryland' THEN 'MD'
WHEN 'Massachusetts' THEN 'MA'
WHEN 'Michigan' THEN 'MI'
WHEN 'Minnesota' THEN 'MN'
WHEN 'Mississippi' THEN 'MS'
WHEN 'Missouri' THEN 'MO'
WHEN 'Montana' THEN 'MT'
WHEN 'Nebraska' THEN 'NE'
WHEN 'Nevada' THEN 'NV'
WHEN 'New Hampshire' THEN 'NH'
WHEN 'New Jersey' THEN 'NJ'
WHEN 'New Mexico' THEN 'NM'
WHEN 'New York' THEN 'NY'
WHEN 'North Carolina' THEN 'NC'
WHEN 'North Dakota' THEN 'ND'
WHEN 'Ohio' THEN 'OH'
WHEN 'Oklahoma' THEN 'OK'
WHEN 'Oregon' THEN 'OR'
WHEN 'Pennsylvania' THEN 'PA'
WHEN 'Rhode Island' THEN 'RI'
WHEN 'South Carolina' THEN 'SC'
WHEN 'South Dakota' THEN 'SD'
WHEN 'Tennessee' THEN 'TN'
WHEN 'Texas' THEN 'TX'
WHEN 'Utah' THEN 'UT'
WHEN 'Vermont' THEN 'VT'
WHEN 'Virginia' THEN 'VA'
WHEN 'Washington' THEN 'WA'
WHEN 'West Virginia' THEN 'WV'
WHEN 'Wisconsin' THEN 'WI'
WHEN 'Wyoming' THEN 'WY'
WHEN 'Alberta' THEN 'AB'
WHEN 'British Columbia' THEN 'BC'
WHEN 'Manitoba' THEN 'MB'
WHEN 'New Brunswick' THEN 'NB'
WHEN 'Newfoundland and Labrador' THEN 'NL'
WHEN 'Northwest Territories' THEN 'NT'
WHEN 'Nova Scotia' THEN 'NS'
WHEN 'Nunavut' THEN 'NU'
WHEN 'Ontario' THEN 'ON'
WHEN 'Prince Edward Island' THEN 'PE'
WHEN 'Quebec' THEN 'QC'
WHEN 'Saskatchewan' THEN 'SK'
WHEN 'Yukon Territory' THEN 'YT'
ELSE NULL
END

With PostgreSQL you can use JSON
select '{"Alabama": "AL", "Alaska": "AK"}'::json->'Alabama'
You can also use a column reference instead of a string literal
select
'{"Alabama": "AL", "Alaska": "AK"}'::json->example.state
from
(values ('Alabama')) example(state)

As the comments suggested, a join (or an equivalent lookup) is needed. Below is what I ended up doing. Let me know if there is a better way.
with states(name, abbr) as (
select
*
from
(values ('Alabama', 'AL'),
('Alaska', 'AK'),
('Arizona', 'AZ'),
('Arkansas', 'AR'),
('California', 'CA'),
('Colorado', 'CO'),
('Connecticut', 'CT'),
('Delaware', 'DE'),
('District of Columbia', 'DC'),
('Florida', 'FL'),
('Georgia', 'GA'),
('Hawaii', 'HI'),
('Idaho', 'ID'),
('Illinois', 'IL'),
('Indiana', 'IN'),
('Iowa', 'IA'),
('Kansas', 'KS'),
('Kentucky', 'KY'),
('Louisiana', 'LA'),
('Maine', 'ME'),
('Maryland', 'MD'),
('Massachusetts', 'MA'),
('Michigan', 'MI'),
('Minnesota', 'MN'),
('Mississippi', 'MS'),
('Missouri', 'MO'),
('Montana', 'MT'),
('Nebraska', 'NE'),
('Nevada', 'NV'),
('New Hampshire', 'NH'),
('New Jersey', 'NJ'),
('New Mexico', 'NM'),
('New York', 'NY'),
('North Carolina', 'NC'),
('North Dakota', 'ND'),
('Ohio', 'OH'),
('Oklahoma', 'OK'),
('Oregon', 'OR'),
('Pennsylvania', 'PA'),
('Rhode Island', 'RI'),
('South Carolina', 'SC'),
('South Dakota', 'SD'),
('Tennessee', 'TN'),
('Texas', 'TX'),
('Utah', 'UT'),
('Vermont', 'VT'),
('Virginia', 'VA'),
('Washington', 'WA'),
('West Virginia', 'WV'),
('Wisconsin', 'WI'),
('Wyoming', 'WY')) as state
)
Then, in the select list of the main query: (select states.abbr from states where states.name = state_name)
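A minimal sketch of the CTE lookup above, run through Python's sqlite3 module so it can be tried without a PostgreSQL server. The customers table, its rows, and the three-state list are made up for illustration; the full VALUES list from the answer would go in the same place.

```python
import sqlite3

# Hypothetical table and rows, used only to exercise the lookup.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, state_name TEXT)")
conn.executemany(
    "INSERT INTO customers (id, state_name) VALUES (?, ?)",
    [(1, "New York"), (2, "Alaska"), (3, "Atlantis")],
)

# The CTE plays the role of a lookup table; the scalar subquery
# returns NULL (None in Python) when a name has no match.
query = """
WITH states(name, abbr) AS (
    VALUES ('Alabama', 'AL'), ('Alaska', 'AK'), ('New York', 'NY')
)
SELECT c.id,
       c.state_name,
       (SELECT s.abbr FROM states s WHERE s.name = c.state_name) AS abbr
FROM customers c
ORDER BY c.id
"""
abbreviations = conn.execute(query).fetchall()
print(abbreviations)
```

Unmatched names come back as NULL rather than dropping the row, which is the same behaviour as the ELSE NULL branch of the CASE version.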

Related

Extract words from a column and count frequency

Does anyone know if there's an efficient way to extract all the words from a single column and count the frequency of each word in SQL Server? I only have read-only access to my database so I can't create a self-defined function to do this.
Here's a reproducible example:
CREATE TABLE words
(
id INT PRIMARY KEY,
text_column VARCHAR(1000)
);
INSERT INTO words (id, text_column)
VALUES
(1, 'SQL Server is a popular database management system'),
(2, 'It is widely used for data storage and retrieval'),
(3, 'SQL Server is a powerful tool for data analysis');
I have found this code but it's not working correctly, and I think it's too complicated to understand:
WITH E1(N) AS
(
SELECT 1
FROM (VALUES
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
) t(N)
),
E2(N) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b),
E4(N) AS (SELECT 1 FROM E2 a CROSS JOIN E2 b)
SELECT
LOWER(x.Item) AS [Word],
COUNT(*) AS [Counts]
FROM
(SELECT * FROM words) a
CROSS APPLY
(SELECT
ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = LTRIM(RTRIM(SUBSTRING(a.text_column, l.N1, l.L1)))
FROM
(SELECT
s.N1,
L1 = ISNULL(NULLIF(CHARINDEX(' ',a.text_column,s.N1),0)-s.N1,4000)
FROM
(SELECT 1
UNION ALL
SELECT t.N+1
FROM
(SELECT TOP (ISNULL(DATALENGTH(a.text_column)/2,0))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM E4) t(N)
WHERE SUBSTRING(a.text_column ,t.N,1) = ' '
) s(N1)
) l(N1, L1)
) x
WHERE
x.item <> ''
AND x.Item NOT IN ('0o', '0s', '3a', '3b', '3d', '6b', '6o', 'a', 'a1', 'a2', 'a3', 'a4', 'ab', 'able', 'about', 'above', 'abst', 'ac', 'accordance', 'according', 'accordingly', 'across', 'act', 'actually', 'ad', 'added', 'adj', 'ae', 'af', 'affected', 'affecting', 'affects', 'after', 'afterwards', 'ag', 'again', 'against', 'ah', 'ain', 'ain''t', 'aj', 'al', 'all', 'allow', 'allows', 'almost', 'alone', 'along', 'already', 'also', 'although', 'always', 'am', 'among', 'amongst', 'amoungst', 'amount', 'an', 'and', 'announce', 'another', 'any', 'anybody', 'anyhow', 'anymore', 'anyone', 'anything', 'anyway', 'anyways', 'anywhere', 'ao', 'ap', 'apart', 'apparently', 'appear', 'appreciate', 'appropriate', 'approximately', 'ar', 'are', 'aren', 'arent', 'aren''t', 'arise', 'around', 'as', 'a''s', 'aside', 'ask', 'asking', 'associated', 'at', 'au', 'auth', 'av', 'available', 'aw', 'away', 'awfully', 'ax', 'ay', 'az', 'b', 'b1', 'b2', 'b3', 'ba', 'back', 'bc', 'bd', 'be', 'became', 'because', 'become', 'becomes', 'becoming', 'been', 'before', 'beforehand', 'begin', 'beginning', 'beginnings', 'begins', 'behind', 'being', 'believe', 'below', 'beside', 'besides', 'best', 'better', 'between', 'beyond', 'bi', 'bill', 'biol', 'bj', 'bk', 'bl', 'bn', 'both', 'bottom', 'bp', 'br', 'brief', 'briefly', 'bs', 'bt', 'bu', 'but', 'bx', 'by', 'c', 'c1', 'c2', 'c3', 'ca', 'call', 'came', 'can', 'cannot', 'cant', 'can''t', 'cause', 'causes', 'cc', 'cd', 'ce', 'certain', 'certainly', 'cf', 'cg', 'ch', 'changes', 'ci', 'cit', 'cj', 'cl', 'clearly', 'cm', 'c''mon', 'cn', 'co', 'com', 'come', 'comes', 'con', 'concerning', 'consequently', 'consider', 'considering', 'contain', 'containing', 'contains', 'corresponding', 'could', 'couldn', 'couldnt', 'couldn''t', 'course', 'cp', 'cq', 'cr', 'cry', 'cs', 'c''s', 'ct', 'cu', 'currently', 'cv', 'cx', 'cy', 'cz', 'd', 'd2', 'da', 'date', 'dc', 'dd', 'de', 'definitely', 'describe', 'described', 'despite', 'detail', 'df', 'di', 'did', 'didn', 'didn''t', 
'different', 'dj', 'dk', 'dl', 'do', 'does', 'doesn', 'doesn''t', 'doing', 'don', 'done', 'don''t', 'down', 'downwards', 'dp', 'dr', 'ds', 'dt', 'du', 'due', 'during', 'dx', 'dy', 'e', 'e2', 'e3', 'ea', 'each', 'ec', 'ed', 'edu', 'ee', 'ef', 'effect', 'eg', 'ei', 'eight', 'eighty', 'either', 'ej', 'el', 'eleven', 'else', 'elsewhere', 'em', 'empty', 'en', 'end', 'ending', 'enough', 'entirely', 'eo', 'ep', 'eq', 'er', 'es', 'especially', 'est', 'et', 'et-al', 'etc', 'eu', 'ev', 'even', 'ever', 'every', 'everybody', 'everyone', 'everything', 'everywhere', 'ex', 'exactly', 'example', 'except', 'ey', 'f', 'f2', 'fa', 'far', 'fc', 'few', 'ff', 'fi', 'fifteen', 'fifth', 'fify', 'fill', 'find', 'fire', 'first', 'five', 'fix', 'fj', 'fl', 'fn', 'fo', 'followed', 'following', 'follows', 'for', 'former', 'formerly', 'forth', 'forty', 'found', 'four', 'fr', 'from', 'front', 'fs', 'ft', 'fu', 'full', 'further', 'furthermore', 'fy', 'g', 'ga', 'gave', 'ge', 'get', 'gets', 'getting', 'gi', 'give', 'given', 'gives', 'giving', 'gj', 'gl', 'go', 'goes', 'going', 'gone', 'got', 'gotten', 'gr', 'greetings', 'gs', 'gy', 'h', 'h2', 'h3', 'had', 'hadn', 'hadn''t', 'happens', 'hardly', 'has', 'hasn', 'hasnt', 'hasn''t', 'have', 'haven', 'haven''t', 'having', 'he', 'hed', 'he''d', 'he''ll', 'hello', 'help', 'hence', 'her', 'here', 'hereafter', 'hereby', 'herein', 'heres', 'here''s', 'hereupon', 'hers', 'herself', 'hes', 'he''s', 'hh', 'hi', 'hid', 'him', 'himself', 'his', 'hither', 'hj', 'ho', 'home', 'hopefully', 'how', 'howbeit', 'however', 'how''s', 'hr', 'hs', 'http', 'hu', 'hundred', 'hy', 'i', 'i2', 'i3', 'i4', 'i6', 'i7', 'i8', 'ia', 'ib', 'ibid', 'ic', 'id', 'i''d', 'ie', 'if', 'ig', 'ignored', 'ih', 'ii', 'ij', 'il', 'i''ll', 'im', 'i''m', 'immediate', 'immediately', 'importance', 'important', 'in', 'inasmuch', 'inc', 'indeed', 'index', 'indicate', 'indicated', 'indicates', 'information', 'inner', 'insofar', 'instead', 'interest', 'into', 'invention', 'inward', 'io', 'ip', 'iq', 
'ir', 'is', 'isn', 'isn''t', 'it', 'itd', 'it''d', 'it''ll', 'its', 'it''s', 'itself', 'iv', 'i''ve', 'ix', 'iy', 'iz', 'j', 'jj', 'jr', 'js', 'jt', 'ju', 'just', 'k', 'ke', 'keep', 'keeps', 'kept', 'kg', 'kj', 'km', 'know', 'known', 'knows', 'ko', 'l', 'l2', 'la', 'largely', 'last', 'lately', 'later', 'latter', 'latterly', 'lb', 'lc', 'le', 'least', 'les', 'less', 'lest', 'let', 'lets', 'let''s', 'lf', 'like', 'liked', 'likely', 'line', 'little', 'lj', 'll', 'll', 'ln', 'lo', 'look', 'looking', 'looks', 'los', 'lr', 'ls', 'lt', 'ltd', 'm', 'm2', 'ma', 'made', 'mainly', 'make', 'makes', 'many', 'may', 'maybe', 'me', 'mean', 'means', 'meantime', 'meanwhile', 'merely', 'mg', 'might', 'mightn', 'mightn''t', 'mill', 'million', 'mine', 'miss', 'ml', 'mn', 'mo', 'more', 'moreover', 'most', 'mostly', 'move', 'mr', 'mrs', 'ms', 'mt', 'mu', 'much', 'mug', 'must', 'mustn', 'mustn''t', 'my', 'myself', 'n', 'n2', 'na', 'name', 'namely', 'nay', 'nc', 'nd', 'ne', 'near', 'nearly', 'necessarily', 'necessary', 'need', 'needn', 'needn''t', 'needs', 'neither', 'never', 'nevertheless', 'new', 'next', 'ng', 'ni', 'nine', 'ninety', 'nj', 'nl', 'nn', 'no', 'nobody', 'non', 'none', 'nonetheless', 'noone', 'nor', 'normally', 'nos', 'not', 'noted', 'nothing', 'novel', 'now', 'nowhere', 'nr', 'ns', 'nt', 'ny', 'o', 'oa', 'ob', 'obtain', 'obtained', 'obviously', 'oc', 'od', 'of', 'off', 'often', 'og', 'oh', 'oi', 'oj', 'ok', 'okay', 'ol', 'old', 'om', 'omitted', 'on', 'once', 'one', 'ones', 'only', 'onto', 'oo', 'op', 'oq', 'or', 'ord', 'os', 'ot', 'other', 'others', 'otherwise', 'ou', 'ought', 'our', 'ours', 'ourselves', 'out', 'outside', 'over', 'overall', 'ow', 'owing', 'own', 'ox', 'oz', 'p', 'p1', 'p2', 'p3', 'page', 'pagecount', 'pages', 'par', 'part', 'particular', 'particularly', 'pas', 'past', 'pc', 'pd', 'pe', 'per', 'perhaps', 'pf', 'ph', 'pi', 'pj', 'pk', 'pl', 'placed', 'please', 'plus', 'pm', 'pn', 'po', 'poorly', 'possible', 'possibly', 'potentially', 'pp', 'pq', 'pr', 
'predominantly', 'present', 'presumably', 'previously', 'primarily', 'probably', 'promptly', 'proud', 'provides', 'ps', 'pt', 'pu', 'put', 'py', 'q', 'qj', 'qu', 'que', 'quickly', 'quite', 'qv', 'r', 'r2', 'ra', 'ran', 'rather', 'rc', 'rd', 're', 'readily', 'really', 'reasonably', 'recent', 'recently', 'ref', 'refs', 'regarding', 'regardless', 'regards', 'related', 'relatively', 'research', 'research-articl', 'respectively', 'resulted', 'resulting', 'results', 'rf', 'rh', 'ri', 'right', 'rj', 'rl', 'rm', 'rn', 'ro', 'rq', 'rr', 'rs', 'rt', 'ru', 'run', 'rv', 'ry', 's', 's2', 'sa', 'said', 'same', 'saw', 'say', 'saying', 'says', 'sc', 'sd', 'se', 'sec', 'second', 'secondly', 'section', 'see', 'seeing', 'seem', 'seemed', 'seeming', 'seems', 'seen', 'self', 'selves', 'sensible', 'sent', 'serious', 'seriously', 'seven', 'several', 'sf', 'shall', 'shan', 'shan''t', 'she', 'shed', 'she''d', 'she''ll', 'shes', 'she''s', 'should', 'shouldn', 'shouldn''t', 'should''ve', 'show', 'showed', 'shown', 'showns', 'shows', 'si', 'side', 'significant', 'significantly', 'similar', 'similarly', 'since', 'sincere', 'six', 'sixty', 'sj', 'sl', 'slightly', 'sm', 'sn', 'so', 'some', 'somebody', 'somehow', 'someone', 'somethan', 'something', 'sometime', 'sometimes', 'somewhat', 'somewhere', 'soon', 'sorry', 'sp', 'specifically', 'specified', 'specify', 'specifying', 'sq', 'sr', 'ss', 'st', 'still', 'stop', 'strongly', 'sub', 'substantially', 'successfully', 'such', 'sufficiently', 'suggest', 'sup', 'sure', 'sy', 'system', 'sz', 't', 't1', 't2', 't3', 'take', 'taken', 'taking', 'tb', 'tc', 'td', 'te', 'tell', 'ten', 'tends', 'tf', 'th', 'than', 'thank', 'thanks', 'thanx', 'that', 'that''ll', 'thats', 'that''s', 'that''ve', 'the', 'their', 'theirs', 'them', 'themselves', 'then', 'thence', 'there', 'thereafter', 'thereby', 'thered', 'therefore', 'therein', 'there''ll', 'thereof', 'therere', 'theres', 'there''s', 'thereto', 'thereupon', 'there''ve', 'these', 'they', 'theyd', 'they''d', 
'they''ll', 'theyre', 'they''re', 'they''ve', 'thickv', 'thin', 'think', 'third', 'this', 'thorough', 'thoroughly', 'those', 'thou', 'though', 'thoughh', 'thousand', 'three', 'throug', 'through', 'throughout', 'thru', 'thus', 'ti', 'til', 'tip', 'tj', 'tl', 'tm', 'tn', 'to', 'together', 'too', 'took', 'top', 'toward', 'towards', 'tp', 'tq', 'tr', 'tried', 'tries', 'truly', 'try', 'trying', 'ts', 't''s', 'tt', 'tv', 'twelve', 'twenty', 'twice', 'two', 'tx', 'u', 'u201d', 'ue', 'ui', 'uj', 'uk', 'um', 'un', 'under', 'unfortunately', 'unless', 'unlike', 'unlikely', 'until', 'unto', 'uo', 'up', 'upon', 'ups', 'ur', 'us', 'use', 'used', 'useful', 'usefully', 'usefulness', 'uses', 'using', 'usually', 'ut', 'v', 'va', 'value', 'various', 'vd', 've', 've', 'very', 'via', 'viz', 'vj', 'vo', 'vol', 'vols', 'volumtype', 'vq', 'vs', 'vt', 'vu', 'w', 'wa', 'want', 'wants', 'was', 'wasn', 'wasnt', 'wasn''t', 'way', 'we', 'wed', 'we''d', 'welcome', 'well', 'we''ll', 'well-b', 'went', 'were', 'we''re', 'weren', 'werent', 'weren''t', 'we''ve', 'what', 'whatever', 'what''ll', 'whats', 'what''s', 'when', 'whence', 'whenever', 'when''s', 'where', 'whereafter', 'whereas', 'whereby', 'wherein', 'wheres', 'where''s', 'whereupon', 'wherever', 'whether', 'which', 'while', 'whim', 'whither', 'who', 'whod', 'whoever', 'whole', 'who''ll', 'whom', 'whomever', 'whos', 'who''s', 'whose', 'why', 'why''s', 'wi', 'widely', 'will', 'willing', 'wish', 'with', 'within', 'without', 'wo', 'won', 'wonder', 'wont', 'won''t', 'words', 'world', 'would', 'wouldn', 'wouldnt', 'wouldn''t', 'www', 'x', 'x1', 'x2', 'x3', 'xf', 'xi', 'xj', 'xk', 'xl', 'xn', 'xo', 'xs', 'xt', 'xv', 'xx', 'y', 'y2', 'yes', 'yet', 'yj', 'yl', 'you', 'youd', 'you''d', 'you''ll', 'your', 'youre', 'you''re', 'yours', 'yourself', 'yourselves', 'you''ve', 'yr', 'ys', 'yt', 'z', 'zero', 'zi', 'zz')
GROUP BY x.Item
ORDER BY COUNT(*) DESC
Here's the result of the above code, as you can see it's not counting correctly:
Word Counts
server 2
sql 2
data 1
database 1
popular 1
powerful 1
Can anyone help on this? Would be really appreciated!
You can use STRING_SPLIT here, for example:
select value Word, Count(*) Counts
from words
cross apply String_Split(text_column, ' ')
where value not in(exclude list)
group by value
order by counts desc;
You should use the STRING_SPLIT function, like this:
SELECT id, value as aword
FROM words
CROSS APPLY STRING_SPLIT(text_column, ' ');
This will create a table with all the words by id. To get the count, do this:
SELECT aword, count(*) as counts
FROM (
SELECT id, value as aword
FROM words
CROSS APPLY STRING_SPLIT(text_column, ' ')
) x
GROUP BY aword
You may need to lowercase the text with LOWER(text_column) if case should not matter.
If you don't have access to the STRING_SPLIT function, you can use a weird XML trick: turn each space into a word-node boundary and then shred the result with the nodes function:
select word, COUNT(*)
from (
select n.value('.', 'nvarchar(50)') AS word
from (
VALUES
(1, 'SQL Server is a popular database management system'),
(2, 'It is widely used for data storage and retrieval'),
(3, 'SQL Server is a powerful tool for data analysis')
) AS t (id, txt)
CROSS APPLY (
SELECT CAST('<x>' + REPLACE(txt, ' ', '</x><x>') + '</x>' AS XML) x
) x
CROSS APPLY x.nodes('x') z(n)
) w
GROUP BY word
Of course, this will fail on "bad" words and invalid XML characters, but that can be worked around. Text processing has never been SQL Server's strong point, though, so it is probably better to use an NLP library for this kind of thing.
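As a rough sketch of doing the counting outside the database, as the last answer suggests, the snippet below uses only Python's standard library rather than an NLP library. The rows are the sample data from the question; the stop list is a tiny illustrative subset, not the full list from the question.

```python
from collections import Counter

# Sample rows from the question, as (id, text_column) pairs.
rows = [
    (1, "SQL Server is a popular database management system"),
    (2, "It is widely used for data storage and retrieval"),
    (3, "SQL Server is a powerful tool for data analysis"),
]

# Illustrative stop list only; substitute the real exclusion list.
stop_words = {"is", "a", "it", "for", "and"}


def word_counts(rows, stop_words):
    """Count lowercased, whitespace-separated words, skipping stop words."""
    counts = Counter()
    for _id, text in rows:
        for word in text.lower().split():
            if word not in stop_words:
                counts[word] += 1
    return counts


counts = word_counts(rows, stop_words)
for word, n in counts.most_common(3):
    print(word, n)
```

With this sample data, "data" is counted twice, which is the count the query in the question got wrong.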

Population by United States Region

Using SQL, let's say we have a dataset of population by state (e.g., Vermont, 623,251, and so on), but we want to know the population by United States region (e.g., Midwest, 68,985,454). Could you describe how you would go about doing that?
Dataset from census.gov
Where I'm stuck:
--First I created a table with a state and population column.
CREATE TABLE states (
state VARCHAR(20),
population INT
);
--Then I uploaded a CSV file from census.gov that I cleaned up.
SELECT * FROM states;
--Created a temporary table to add in the region column.
DROP TABLE IF EXISTS temp_Regions;
CREATE TEMP TABLE temp_Regions (
state VARCHAR(20),
state_pop INT,
region VARCHAR(20),
region_pop INT
);
INSERT INTO temp_Regions
SELECT state, population
FROM states;
--Used CASE WHEN expressions to put states into their respective regions.
SELECT state,
CASE WHEN state IN ('Connecticut', 'Maine', 'Massachusetts', 'New Hampshire', 'Rhode Island', 'Vermont', 'New Jersey', 'New York', 'Pennsylvania') THEN 'Northeast'
WHEN state IN ('Illinois', 'Indiana', 'Michigan', 'Ohio', 'Wisconsin', 'Iowa', 'Kansas', 'Minnesota', 'Missouri', 'Nebraska', 'North Dakota', 'South Dakota') THEN 'Midwest'
WHEN state IN ('Delaware', 'District of Columbia', 'Florida', 'Georgia', 'Maryland', 'North Carolina', 'South Carolina', 'Virginia', 'West Virginia', 'Alabama', 'Kentucky', 'Mississippi', 'Tennessee', 'Arkansas', 'Louisiana', 'Oklahoma', 'Texas') THEN 'South'
WHEN state IN ('Arizona', 'Colorado', 'Idaho', 'Montana', 'Nevada', 'New Mexico', 'Utah', 'Wyoming', 'Alaska', 'California', 'Hawaii', 'Oregon', 'Washington') THEN 'West'
END AS region, state_pop, region_pop
FROM temp_Regions;
--Now I'm stuck at this point. I'm unable to get data into the region_pop column. How do I get the sum of the populations by U.S. Region?
Let me know if you need further clarification on things. Thanks for your help y'all!
You can use the analytic function sum() over (partition by ...) to achieve this:
with data
as (
SELECT state
,CASE WHEN state IN ('Connecticut', 'Maine', 'Massachusetts', 'New Hampshire', 'Rhode Island', 'Vermont', 'New Jersey', 'New York', 'Pennsylvania') THEN 'Northeast'
WHEN state IN ('Illinois', 'Indiana', 'Michigan', 'Ohio', 'Wisconsin', 'Iowa', 'Kansas', 'Minnesota', 'Missouri', 'Nebraska', 'North Dakota', 'South Dakota') THEN 'Midwest'
WHEN state IN ('Delaware', 'District of Columbia', 'Florida', 'Georgia', 'Maryland', 'North Carolina', 'South Carolina', 'Virginia', 'West Virginia', 'Alabama', 'Kentucky', 'Mississippi', 'Tennessee', 'Arkansas', 'Louisiana', 'Oklahoma', 'Texas') THEN 'South'
WHEN state IN ('Arizona', 'Colorado', 'Idaho', 'Montana', 'Nevada', 'New Mexico', 'Utah', 'Wyoming', 'Alaska', 'California', 'Hawaii', 'Oregon', 'Washington') THEN 'West'
END AS region
, state_pop
FROM temp_Regions
)
select state
,region
,state_pop
,sum(state_pop) over(partition by region) as region_population
from data
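If one row per region is enough, rather than the regional total repeated on every state row, a plain GROUP BY on the same CASE expression is an alternative to the window function. The sketch below assumes SQLite through Python's sqlite3 module, a shortened region mapping, and made-up population numbers (not census figures).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE states (state TEXT, population INTEGER)")
# Made-up round numbers for illustration, not census data.
conn.executemany(
    "INSERT INTO states (state, population) VALUES (?, ?)",
    [("Vermont", 100), ("Maine", 200), ("Ohio", 300), ("Texas", 400), ("Utah", 500)],
)

# Shortened mapping: the full IN lists from the question go in the same places.
query = """
SELECT CASE
           WHEN state IN ('Vermont', 'Maine') THEN 'Northeast'
           WHEN state IN ('Ohio') THEN 'Midwest'
           WHEN state IN ('Texas') THEN 'South'
           WHEN state IN ('Utah') THEN 'West'
       END AS region,
       SUM(population) AS region_pop
FROM states
GROUP BY region
ORDER BY region
"""
region_totals = conn.execute(query).fetchall()
print(region_totals)
```

This reads straight from the states table, so the temporary table with the empty region_pop column is not needed for this result.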

How to count states using case statement

I am converting a column of state abbreviations to state names but I'd also like to get a count of each state name. When I try adding SELECT state,count(*) as count to the beginning of my query, I end up getting errors. Where can I add a count function to get an output with state names and count of each?
Code I'm using
query = """
SELECT
CASE
WHEN state = 'AL' THEN 'Alabama'
WHEN state = 'AK' THEN 'Alaska'
WHEN state = 'AZ' THEN 'Arizona'
WHEN state = 'AR' THEN 'Arkansas'
WHEN state = 'CA' THEN 'California'
WHEN state = 'CO' THEN 'Colorado'
WHEN state = 'CT' THEN 'Connecticut'
WHEN state = 'DE' THEN 'Delaware'
WHEN state = 'DC' THEN 'District of Columbia'
WHEN state = 'FL' THEN 'Florida'
WHEN state = 'GA' THEN 'Georgia'
WHEN state = 'HI' THEN 'Hawaii'
WHEN state = 'ID' THEN 'Idaho'
WHEN state = 'IL' THEN 'Illinois'
WHEN state = 'IN' THEN 'Indiana'
WHEN state = 'IA' THEN 'Iowa'
WHEN state = 'KS' THEN 'Kansas'
WHEN state = 'KY' THEN 'Kentucky'
WHEN state = 'LA' THEN 'Louisiana'
WHEN state = 'ME' THEN 'Maine'
WHEN state = 'MD' THEN 'Maryland'
WHEN state = 'MA' THEN 'Massachusetts'
WHEN state = 'MI' THEN 'Michigan'
WHEN state = 'MN' THEN 'Minnesota'
WHEN state = 'MS' THEN 'Mississippi'
WHEN state = 'MO' THEN 'Missouri'
WHEN state = 'MT' THEN 'Montana'
WHEN state = 'NE' THEN 'Nebraska'
WHEN state = 'NV' THEN 'Nevada'
WHEN state = 'NH' THEN 'New Hampshire'
WHEN state = 'NJ' THEN 'New Jersey'
WHEN state = 'NM' THEN 'New Mexico'
WHEN state = 'NY' THEN 'New York'
WHEN state = 'NC' THEN 'North Carolina'
WHEN state = 'ND' THEN 'North Dakota'
WHEN state = 'OH' THEN 'Ohio'
WHEN state = 'OK' THEN 'Oklahoma'
WHEN state = 'OR' THEN 'Oregon'
WHEN state = 'PA' THEN 'Pennsylvania'
WHEN state = 'RI' THEN 'Rhode Island'
WHEN state = 'SC' THEN 'South Carolina'
WHEN state = 'SD' THEN 'South Dakota'
WHEN state = 'TN' THEN 'Tennessee'
WHEN state = 'TX' THEN 'Texas'
WHEN state = 'UT' THEN 'Utah'
WHEN state = 'VT' THEN 'Vermont'
WHEN state = 'VA' THEN 'Virginia'
WHEN state = 'WA' THEN 'Washington'
WHEN state = 'WV' THEN 'West Virginia'
WHEN state = 'WI' THEN 'Wisconsin'
WHEN state = 'WY' THEN 'Wyoming'
WHEN state = 'AB' THEN 'Alberta'
WHEN state = 'BC' THEN 'British Columbia'
WHEN state = 'MB' THEN 'Manitoba'
WHEN state = 'NB' THEN 'New Brunswick'
WHEN state = 'NL' THEN 'Newfoundland and Labrador'
WHEN state = 'NT' THEN 'Northwest Territories'
WHEN state = 'NS' THEN 'Nova Scotia'
WHEN state = 'NU' THEN 'Nunavut'
WHEN state = 'ON' THEN 'Ontario'
WHEN state = 'PE' THEN 'Prince Edward Island'
WHEN state = 'QC' THEN 'Quebec'
WHEN state = 'SK' THEN 'Saskatchewan'
WHEN state = 'YT' THEN 'Yukon Territory'
END AS state
FROM business
"""
result = spark.sql(query)
result.show()
This is what I end up with:
But I'd like it to look like this, just with full state names instead of abbreviations:
Here you go:
SELECT CASE WHEN state = 'AL' THEN 'Alabama' WHEN state = 'AK' THEN 'Alaska' WHEN state = 'AZ' THEN 'Arizona' WHEN state = 'AR' THEN 'Arkansas' WHEN state = 'CA' THEN 'California' WHEN state = 'CO' THEN 'Colorado' WHEN state = 'CT' THEN 'Connecticut' WHEN state = 'DE' THEN 'Delaware' WHEN state = 'DC' THEN 'District of Columbia' WHEN state = 'FL' THEN 'Florida' WHEN state = 'GA' THEN 'Georgia' WHEN state = 'HI' THEN 'Hawaii' WHEN state = 'ID' THEN 'Idaho' WHEN state = 'IL' THEN 'Illinois' WHEN state = 'IN' THEN 'Indiana' WHEN state = 'IA' THEN 'Iowa' WHEN state = 'KS' THEN 'Kansas' WHEN state = 'KY' THEN 'Kentucky' WHEN state = 'LA' THEN 'Louisiana' WHEN state = 'ME' THEN 'Maine' WHEN state = 'MD' THEN 'Maryland' WHEN state = 'MA' THEN 'Massachusetts' WHEN state = 'MI' THEN 'Michigan' WHEN state = 'MN' THEN 'Minnesota' WHEN state = 'MS' THEN 'Mississippi' WHEN state = 'MO' THEN 'Missouri' WHEN state = 'MT' THEN 'Montana' WHEN state = 'NE' THEN 'Nebraska' WHEN state = 'NV' THEN 'Nevada' WHEN state = 'NH' THEN 'New Hampshire' WHEN state = 'NJ' THEN 'New Jersey' WHEN state = 'NM' THEN 'New Mexico' WHEN state = 'NY' THEN 'New York' WHEN state = 'NC' THEN 'North Carolina' WHEN state = 'ND' THEN 'North Dakota' WHEN state = 'OH' THEN 'Ohio' WHEN state = 'OK' THEN 'Oklahoma' WHEN state = 'OR' THEN 'Oregon' WHEN state = 'PA' THEN 'Pennsylvania' WHEN state = 'RI' THEN 'Rhode Island' WHEN state = 'SC' THEN 'South Carolina' WHEN state = 'SD' THEN 'South Dakota' WHEN state = 'TN' THEN 'Tennessee' WHEN state = 'TX' THEN 'Texas' WHEN state = 'UT' THEN 'Utah' WHEN state = 'VT' THEN 'Vermont' WHEN state = 'VA' THEN 'Virginia' WHEN state = 'WA' THEN 'Washington' WHEN state = 'WV' THEN 'West Virginia' WHEN state = 'WI' THEN 'Wisconsin' WHEN state = 'WY' THEN 'Wyoming' WHEN state = 'AB' THEN 'Alberta' WHEN state = 'BC' THEN 'British Columbia' WHEN state = 'MB' THEN 'Manitoba' WHEN state = 'NB' THEN 'New Brunswick' WHEN state = 'NL' THEN 'Newfoundland and Labrador' WHEN state = 'NT' THEN 'Northwest Territories' WHEN state = 'NS' THEN 'Nova Scotia' WHEN state = 'NU' THEN 'Nunavut' WHEN state = 'ON' THEN 'Ontario' WHEN state = 'PE' THEN 'Prince Edward Island' WHEN state = 'QC' THEN 'Quebec' WHEN state = 'SK' THEN 'Saskatchewan' WHEN state = 'YT' THEN 'Yukon Territory' END AS state,
COUNT(*)
FROM business
GROUP BY state
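The same pattern can be tried on a small scale. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for Spark SQL, so it runs without a Spark session; the business rows are made up, the mapping is cut down to three codes, and the output column is named state_name (an assumption, chosen so the alias cannot be confused with the input column).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE business (state TEXT)")
# Made-up rows: three in Alabama, one each in Alaska and New Brunswick.
conn.executemany(
    "INSERT INTO business (state) VALUES (?)",
    [("AL",), ("AK",), ("AL",), ("NB",), ("AL",)],
)

# Shortened mapping; the remaining WHEN branches follow the same shape.
query = """
SELECT CASE state
           WHEN 'AL' THEN 'Alabama'
           WHEN 'AK' THEN 'Alaska'
           WHEN 'NB' THEN 'New Brunswick'
       END AS state_name,
       COUNT(*) AS n
FROM business
GROUP BY state_name
ORDER BY n DESC, state_name
"""
state_counts = conn.execute(query).fetchall()
print(state_counts)
```

Grouping by the derived name and grouping by the original abbreviation give the same counts here because each abbreviation maps to exactly one name.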

How do you only return one field when there are multiple entries for each field?

I am trying to return only one email address for each employee. A person can be both an employee and a student. If someone has both an employee and a student email address, I want to return only the employee address; if they have only a student address, I want to return that.
Here is the entire query:
select --spriden_pidm as pidm,
spriden_id as ban_id,
spriden_last_name as lastname,
spriden_first_name as firstname,
gmal.email,
phone_number.area || phone_number.phone as phone_number,
addr.permanent_address AS street,
addr.permanent_city AS city,
addr.permanent_state AS state,
addr.permanent_zip AS zip,
case
when nbrjobs_ecls_code in ('E1', 'E2', 'EN', 'F1', 'F2') and nbrjobs_ann_salary between 0 and 49999.99 then 'EHRA1'
when nbrjobs_ecls_code in ('E1', 'E2', 'EN', 'F1', 'F2') and nbrjobs_ann_salary between 50000 and 99999.99 then 'EHRA2'
when nbrjobs_ecls_code in ('E1', 'E2', 'EN', 'F1', 'F2') and nbrjobs_ann_salary between 100000 and 149999.99 then 'EHRA3'
when nbrjobs_ecls_code in ('E1', 'E2', 'EN', 'F1', 'F2') and nbrjobs_ann_salary >= 150000 then 'EHRA4'
when nbrjobs_ecls_code in ('SE', 'SN', 'LE') and nbrjobs_ann_salary between 0 and 49999.99 then 'SHRA1'
when nbrjobs_ecls_code in ('SE', 'SN', 'LE') and nbrjobs_ann_salary between 50000 and 99999.99 then 'SHRA2'
when nbrjobs_ecls_code in ('SE', 'SN', 'LE') and nbrjobs_ann_salary between 100000 and 149999.99 then 'SHRA3'
when nbrjobs_ecls_code in ('SE', 'SN', 'LE') and nbrjobs_ann_salary >= 150000 then 'SHRA4'
when nbrjobs_ecls_code in ('FA') then 'AF'
when nbrjobs_ecls_code in ('SH', 'SS', 'TS', 'WS') then 'M1'
else
null
end as empl_cat
from nbrjobs a,
spriden,
(select goremal_pidm as pidm,
goremal_email_address as email
from goremal
where goremal_emal_code in ('EMPL', 'STDN')
and goremal_status_ind = 'A') gmal,
(SELECT sprtele_pidm AS pidm,
sprtele_phone_area AS area,
sprtele_phone_number AS phone
FROM sprtele c
WHERE sprtele_tele_code = 'CA'
AND sprtele_primary_ind = 'Y'
AND sprtele_status_ind IS NULL
AND sprtele_seqno =
(SELECT MAX (sprtele_seqno)
FROM sprtele
WHERE sprtele_tele_code = 'CA'
AND sprtele_primary_ind = 'Y'
AND sprtele_status_ind IS NULL
AND sprtele_pidm = c.sprtele_pidm)) phone_number,
--spraddr
(SELECT spraddr_pidm AS pidm,
spraddr_street_line1 AS permanent_address,
spraddr_city AS permanent_city,
spraddr_stat_code AS permanent_state,
spraddr_zip AS permanent_zip
FROM spraddr b
WHERE spraddr_atyp_code = 'CA'
AND spraddr_status_ind IS NULL
AND spraddr_seqno =
(SELECT MAX (spraddr_seqno)
FROM spraddr
WHERE spraddr_atyp_code = 'CA'
AND spraddr_status_ind IS NULL
AND spraddr_pidm = b.spraddr_pidm)) addr
where a.nbrjobs_pidm = spriden_pidm
and a.nbrjobs_pidm = gmal.pidm(+)
and a.nbrjobs_pidm = phone_number.pidm(+)
and a.nbrjobs_pidm = addr.pidm(+)
and spriden_change_ind is null
and a.nbrjobs_sgrp_code = to_char(sysdate, 'YYYY')
and a.nbrjobs_effective_date = (select max(b.nbrjobs_effective_date)
from nbrjobs b
where b.nbrjobs_pidm = a.nbrjobs_pidm
and b.nbrjobs_posn = a.nbrjobs_posn
and b.nbrjobs_effective_date <= sysdate
--and b.nbrjobs_ecls_code in ('E1','E2','EN','F1','F2','SE','SN','LE')
and b.nbrjobs_ecls_code in ('E1','E2','EN','F1','F2','SE','SN','LE', 'RF', 'AF', 'FA', 'SH', 'SS', 'TS', 'WS')
and b.nbrjobs_sgrp_code = to_char(sysdate, 'YYYY'))
and a.nbrjobs_status <> 'T';
And this is the part of the query I am trying to change to return the desired email address:
(select goremal_pidm as pidm,
goremal_email_address as email
from goremal
where goremal_emal_code in ('EMPL', 'STDN')
and goremal_status_ind = 'A') gmal,
So the issue is that the query will return two email addresses if the employee is also a student? What you can do in this case is PIVOT the data, then use COALESCE() to get the student email where the employee email is NULL. The below query would replace the problematic subquery:
SELECT pidm, COALESCE(empl_email, stdn_email) AS email
FROM (
SELECT goremal_pidm AS pidm, goremal_email_address AS email, goremal_emal_code
FROM goremal
WHERE goremal_emal_code in ('EMPL', 'STDN')
AND goremal_status_ind = 'A'
) PIVOT (
MAX(email) FOR goremal_emal_code IN ('EMPL' AS empl_email, 'STDN' AS stdn_email)
)
EDIT: As an aside, you can use conditional aggregation instead of an explicit PIVOT (helpful on Oracle versions older than 11g, which do not support PIVOT):
SELECT pidm, COALESCE(empl_email, stdn_email) AS email FROM (
SELECT goremal_pidm AS pidm
, MAX(CASE WHEN goremal_emal_code = 'EMPL' THEN goremal_email_address END) AS empl_email
, MAX(CASE WHEN goremal_emal_code = 'STDN' THEN goremal_email_address END) AS stdn_email
FROM goremal
WHERE goremal_emal_code in ('EMPL', 'STDN')
AND goremal_status_ind = 'A'
GROUP BY goremal_pidm
)
Hope this helps.
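A small runnable sketch of the conditional-aggregation idea, assuming SQLite through Python's sqlite3 module in place of Oracle. The goremal column names come from the question; the rows and email addresses are invented, and the COALESCE is folded into the aggregate query instead of an outer SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE goremal (
           goremal_pidm INTEGER,
           goremal_emal_code TEXT,
           goremal_email_address TEXT,
           goremal_status_ind TEXT
       )"""
)
# Invented rows: person 1 has both addresses, person 2 only a student one,
# person 3 has an inactive employee address and an active student one.
conn.executemany(
    "INSERT INTO goremal VALUES (?, ?, ?, ?)",
    [
        (1, "EMPL", "ann@work.example", "A"),
        (1, "STDN", "ann@school.example", "A"),
        (2, "STDN", "bob@school.example", "A"),
        (3, "EMPL", "cy@work.example", "I"),
        (3, "STDN", "cy@school.example", "A"),
    ],
)

# One row per pidm: prefer the employee address, fall back to the student one.
query = """
SELECT goremal_pidm AS pidm,
       COALESCE(
           MAX(CASE WHEN goremal_emal_code = 'EMPL' THEN goremal_email_address END),
           MAX(CASE WHEN goremal_emal_code = 'STDN' THEN goremal_email_address END)
       ) AS email
FROM goremal
WHERE goremal_emal_code IN ('EMPL', 'STDN')
  AND goremal_status_ind = 'A'
GROUP BY goremal_pidm
ORDER BY goremal_pidm
"""
emails = conn.execute(query).fetchall()
print(emails)
```

Person 3 falls back to the student address because the inactive employee row is filtered out before aggregation.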
Try using NVL2. For example, in your case:
NVL2(EMP_EMAIL_ADR, EMP_EMAIL_ADR, STDN_EMAIL_ADR)
This returns the employee email address if it is not null; otherwise it returns the student email address.
Hope this helps.

Else do nothing SQL query

I have a field, froiexported, in DB table claim3 that is set to either one or zero. I want to run an update where, if the criteria in the CASE expression are met, froiexported is set to 1; otherwise nothing should happen. The statement below resets every non-matching row to 0, which makes my results incorrect every day.
update claim3
set froiexpoted =
CASE
WHEN froimaintdate >= dateadd(day,datediff(day,1,GETDATE()),0)
AND froimaintdate < dateadd(day,datediff(day,0,GETDATE()),0)
AND c1.jurst in ('AK', 'AL', 'CA', 'CO', 'FL', 'GA', 'IA', 'IN', 'KS', 'KY', 'LA', 'MA', 'ME', 'MN', 'MO', 'MS', 'NC', 'NE', 'NJ', 'PA', 'RI', 'SC', 'TN', 'TX', 'UT', 'VA', 'VT', 'WV')
THEN '1'
ELSE '0'
END
You can use a where clause instead:
update claim3
set froiexpoted = 1
where froiexpoted <> 1
and froimaintdate >= dateadd(day,datediff(day,1,getdate()),0)
and froimaintdate < dateadd(day,datediff(day,0,getdate()),0)
and c1.jurst in ('AK', 'AL', 'CA', 'CO', 'FL', 'GA', 'IA', 'IN'
, 'KS','KY', 'LA', 'MA', 'ME', 'MN', 'MO', 'MS', 'NC', 'NE'
, 'NJ', 'PA', 'RI', 'SC', 'TN', 'TX', 'UT', 'VA', 'VT', 'WV'
)
If you need to set 0s for the previous day as well:
update claim3
set froiexpoted = case
when c1.jurst in ('AK', 'AL', 'CA', 'CO', 'FL', 'GA', 'IA', 'IN'
, 'KS','KY', 'LA', 'MA', 'ME', 'MN', 'MO', 'MS', 'NC', 'NE'
, 'NJ', 'PA', 'RI', 'SC', 'TN', 'TX', 'UT', 'VA', 'VT', 'WV'
)
then 1
else 0
end
where froimaintdate >= dateadd(day,datediff(day,1,getdate()),0)
and froimaintdate < dateadd(day,datediff(day,0,getdate()),0)
How about setting it to 1 if criteria are met, else set to the current value?
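That last suggestion can be sketched as a CASE whose ELSE branch returns the column itself, so non-matching rows keep whatever value they had. The example assumes SQLite through Python's sqlite3 module instead of SQL Server, a shortened jurisdiction list, made-up rows, and a jurst column placed directly on claim3 (the source refers to it through an alias c1 whose join is not shown). The date bounds are passed as parameters in place of the dateadd/datediff expressions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE claim3 (
           id INTEGER PRIMARY KEY,
           jurst TEXT,
           froimaintdate TEXT,
           froiexported INTEGER
       )"""
)
# Made-up rows; jurst on claim3 is an assumption for this sketch.
conn.executemany(
    "INSERT INTO claim3 VALUES (?, ?, ?, ?)",
    [
        (1, "AK", "2024-01-02", 0),  # in the date window, listed state
        (2, "NY", "2024-01-02", 0),  # in the date window, state not listed
        (3, "AK", "2023-12-25", 1),  # older row that was already exported
        (4, "AL", "2023-12-25", 0),  # older row, never exported
    ],
)

# Hypothetical bounds standing in for "start of yesterday" and "start of today".
window_start, window_end = "2024-01-02", "2024-01-03"

conn.execute(
    """
    UPDATE claim3
    SET froiexported = CASE
            WHEN froimaintdate >= ? AND froimaintdate < ?
                 AND jurst IN ('AK', 'AL')
            THEN 1
            ELSE froiexported  -- keep the current value
        END
    """,
    (window_start, window_end),
)
flags = conn.execute("SELECT id, froiexported FROM claim3 ORDER BY id").fetchall()
print(flags)
```

The WHERE version from the earlier answer touches fewer rows; the ELSE form rewrites every row with its own value, which matters on large tables or where triggers fire on update.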