Population by United States Region

Using SQL, let's say we have a dataset of population by state (e.g., Vermont, 623,251, and so on), but we want to know the population by United States region (e.g., Midwest, 68,985,454). Could you describe how you would go about doing that?
Dataset from census.gov
Where I'm stuck:
--First I created a table with a state and population column.
CREATE TABLE states (
state VARCHAR(20),
population INT
);
--Then I uploaded a CSV file from census.gov that I cleaned up.
SELECT * FROM states;
--Created a temporary table to add in the region column.
DROP TABLE IF EXISTS temp_Regions;
CREATE TEMP TABLE temp_Regions (
state VARCHAR(20),
state_pop INT,
region VARCHAR(20),
region_pop INT
);
INSERT INTO temp_Regions
SELECT state, population
FROM states;
--Used CASE WHEN statements to put states into their respective regions.
SELECT state,
CASE WHEN state IN ('Connecticut', 'Maine', 'Massachusetts', 'New Hampshire', 'Rhode Island', 'Vermont', 'New Jersey', 'New York', 'Pennsylvania') THEN 'Northeast'
WHEN state IN ('Illinois', 'Indiana', 'Michigan', 'Ohio', 'Wisconsin', 'Iowa', 'Kansas', 'Minnesota', 'Missouri', 'Nebraska', 'North Dakota', 'South Dakota') THEN 'Midwest'
WHEN state IN ('Delaware', 'District of Columbia', 'Florida', 'Georgia', 'Maryland', 'North Carolina', 'South Carolina', 'Virginia', 'West Virginia', 'Alabama', 'Kentucky', 'Mississippi', 'Tennessee', 'Arkansas', 'Louisiana', 'Oklahoma', 'Texas') THEN 'South'
WHEN state IN ('Arizona', 'Colorado', 'Idaho', 'Montana', 'Nevada', 'New Mexico', 'Utah', 'Wyoming', 'Alaska', 'California', 'Hawaii', 'Oregon', 'Washington') THEN 'West'
END AS region, state_pop, region_pop
FROM temp_Regions;
--Now I'm stuck at this point. I'm unable to get data into the region_pop column. How do I get the sum of the populations by U.S. Region?
Let me know if you need further clarification on things. Thanks for your help y'all!

You can use the analytic (window) function SUM() OVER (PARTITION BY ...) to achieve this:
with data
as (
SELECT state
,CASE WHEN state IN ('Connecticut', 'Maine', 'Massachusetts', 'New Hampshire', 'Rhode Island', 'Vermont', 'New Jersey', 'New York', 'Pennsylvania') THEN 'Northeast'
WHEN state IN ('Illinois', 'Indiana', 'Michigan', 'Ohio', 'Wisconsin', 'Iowa', 'Kansas', 'Minnesota', 'Missouri', 'Nebraska', 'North Dakota', 'South Dakota') THEN 'Midwest'
WHEN state IN ('Delaware', 'District of Columbia', 'Florida', 'Georgia', 'Maryland', 'North Carolina', 'South Carolina', 'Virginia', 'West Virginia', 'Alabama', 'Kentucky', 'Mississippi', 'Tennessee', 'Arkansas', 'Louisiana', 'Oklahoma', 'Texas') THEN 'South'
WHEN state IN ('Arizona', 'Colorado', 'Idaho', 'Montana', 'Nevada', 'New Mexico', 'Utah', 'Wyoming', 'Alaska', 'California', 'Hawaii', 'Oregon', 'Washington') THEN 'West'
END AS region
, state_pop
FROM temp_Regions
)
select state
,region
,state_pop
,sum(state_pop) over(partition by region) as region_population
from data
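
If only one row per region is needed, rather than the regional total repeated beside every state, a plain GROUP BY over the same kind of CASE expression is an alternative. The sketch below is a minimal illustration in Python with the built-in sqlite3 module, not the asker's actual database setup. The populations are placeholder figures (not census data), and the CASE lists are shortened to the handful of states in the sample.

```python
import sqlite3

# Placeholder rows for illustration only; these are not census figures.
rows = [
    ("Vermont", 10), ("Maine", 20),      # Northeast
    ("Ohio", 30), ("Iowa", 40),          # Midwest
    ("Texas", 50),                       # South
    ("Oregon", 60), ("Utah", 70),        # West
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE states (state TEXT, population INTEGER)")
conn.executemany("INSERT INTO states VALUES (?, ?)", rows)

# Label each state with a region, then aggregate once per region.
query = """
SELECT region, SUM(population) AS region_pop
FROM (
    SELECT population,
           CASE WHEN state IN ('Vermont', 'Maine') THEN 'Northeast'
                WHEN state IN ('Ohio', 'Iowa') THEN 'Midwest'
                WHEN state IN ('Texas') THEN 'South'
                WHEN state IN ('Oregon', 'Utah') THEN 'West'
           END AS region
    FROM states
) AS labelled
GROUP BY region
ORDER BY region
"""
region_totals = dict(conn.execute(query).fetchall())
print(region_totals)
```

The window-function version keeps one row per state; this version collapses to one row per region, which is closer to the "Midwest, 68,985,454" shape described in the question.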

Related

Extract words from a column and count frequency

Does anyone know if there's an efficient way to extract all the words from a single column and count the frequency of each word in SQL Server? I only have read-only access to my database so I can't create a self-defined function to do this.
Here's a reproducible example:
CREATE TABLE words
(
id INT PRIMARY KEY,
text_column VARCHAR(1000)
);
INSERT INTO words (id, text_column)
VALUES
(1, 'SQL Server is a popular database management system'),
(2, 'It is widely used for data storage and retrieval'),
(3, 'SQL Server is a powerful tool for data analysis');
I have found this code but it's not working correctly, and I think it's too complicated to understand:
WITH E1(N) AS
(
SELECT 1
FROM (VALUES
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
) t(N)
),
E2(N) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b),
E4(N) AS (SELECT 1 FROM E2 a CROSS JOIN E2 b)
SELECT
LOWER(x.Item) AS [Word],
COUNT(*) AS [Counts]
FROM
(SELECT * FROM words) a
CROSS APPLY
(SELECT
ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = LTRIM(RTRIM(SUBSTRING(a.text_column, l.N1, l.L1)))
FROM
(SELECT
s.N1,
L1 = ISNULL(NULLIF(CHARINDEX(' ',a.text_column,s.N1),0)-s.N1,4000)
FROM
(SELECT 1
UNION ALL
SELECT t.N+1
FROM
(SELECT TOP (ISNULL(DATALENGTH(a.text_column)/2,0))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM E4) t(N)
WHERE SUBSTRING(a.text_column ,t.N,1) = ' '
) s(N1)
) l(N1, L1)
) x
WHERE
x.item <> ''
AND x.Item NOT IN ('0o', '0s', '3a', '3b', '3d', '6b', '6o', 'a', 'a1', 'a2', 'a3', 'a4', 'ab', 'able', 'about', 'above', 'abst', 'ac', 'accordance', 'according', 'accordingly', 'across', 'act', 'actually', 'ad', 'added', 'adj', 'ae', 'af', 'affected', 'affecting', 'affects', 'after', 'afterwards', 'ag', 'again', 'against', 'ah', 'ain', 'ain''t', 'aj', 'al', 'all', 'allow', 'allows', 'almost', 'alone', 'along', 'already', 'also', 'although', 'always', 'am', 'among', 'amongst', 'amoungst', 'amount', 'an', 'and', 'announce', 'another', 'any', 'anybody', 'anyhow', 'anymore', 'anyone', 'anything', 'anyway', 'anyways', 'anywhere', 'ao', 'ap', 'apart', 'apparently', 'appear', 'appreciate', 'appropriate', 'approximately', 'ar', 'are', 'aren', 'arent', 'aren''t', 'arise', 'around', 'as', 'a''s', 'aside', 'ask', 'asking', 'associated', 'at', 'au', 'auth', 'av', 'available', 'aw', 'away', 'awfully', 'ax', 'ay', 'az', 'b', 'b1', 'b2', 'b3', 'ba', 'back', 'bc', 'bd', 'be', 'became', 'because', 'become', 'becomes', 'becoming', 'been', 'before', 'beforehand', 'begin', 'beginning', 'beginnings', 'begins', 'behind', 'being', 'believe', 'below', 'beside', 'besides', 'best', 'better', 'between', 'beyond', 'bi', 'bill', 'biol', 'bj', 'bk', 'bl', 'bn', 'both', 'bottom', 'bp', 'br', 'brief', 'briefly', 'bs', 'bt', 'bu', 'but', 'bx', 'by', 'c', 'c1', 'c2', 'c3', 'ca', 'call', 'came', 'can', 'cannot', 'cant', 'can''t', 'cause', 'causes', 'cc', 'cd', 'ce', 'certain', 'certainly', 'cf', 'cg', 'ch', 'changes', 'ci', 'cit', 'cj', 'cl', 'clearly', 'cm', 'c''mon', 'cn', 'co', 'com', 'come', 'comes', 'con', 'concerning', 'consequently', 'consider', 'considering', 'contain', 'containing', 'contains', 'corresponding', 'could', 'couldn', 'couldnt', 'couldn''t', 'course', 'cp', 'cq', 'cr', 'cry', 'cs', 'c''s', 'ct', 'cu', 'currently', 'cv', 'cx', 'cy', 'cz', 'd', 'd2', 'da', 'date', 'dc', 'dd', 'de', 'definitely', 'describe', 'described', 'despite', 'detail', 'df', 'di', 'did', 'didn', 'didn''t', 
'different', 'dj', 'dk', 'dl', 'do', 'does', 'doesn', 'doesn''t', 'doing', 'don', 'done', 'don''t', 'down', 'downwards', 'dp', 'dr', 'ds', 'dt', 'du', 'due', 'during', 'dx', 'dy', 'e', 'e2', 'e3', 'ea', 'each', 'ec', 'ed', 'edu', 'ee', 'ef', 'effect', 'eg', 'ei', 'eight', 'eighty', 'either', 'ej', 'el', 'eleven', 'else', 'elsewhere', 'em', 'empty', 'en', 'end', 'ending', 'enough', 'entirely', 'eo', 'ep', 'eq', 'er', 'es', 'especially', 'est', 'et', 'et-al', 'etc', 'eu', 'ev', 'even', 'ever', 'every', 'everybody', 'everyone', 'everything', 'everywhere', 'ex', 'exactly', 'example', 'except', 'ey', 'f', 'f2', 'fa', 'far', 'fc', 'few', 'ff', 'fi', 'fifteen', 'fifth', 'fify', 'fill', 'find', 'fire', 'first', 'five', 'fix', 'fj', 'fl', 'fn', 'fo', 'followed', 'following', 'follows', 'for', 'former', 'formerly', 'forth', 'forty', 'found', 'four', 'fr', 'from', 'front', 'fs', 'ft', 'fu', 'full', 'further', 'furthermore', 'fy', 'g', 'ga', 'gave', 'ge', 'get', 'gets', 'getting', 'gi', 'give', 'given', 'gives', 'giving', 'gj', 'gl', 'go', 'goes', 'going', 'gone', 'got', 'gotten', 'gr', 'greetings', 'gs', 'gy', 'h', 'h2', 'h3', 'had', 'hadn', 'hadn''t', 'happens', 'hardly', 'has', 'hasn', 'hasnt', 'hasn''t', 'have', 'haven', 'haven''t', 'having', 'he', 'hed', 'he''d', 'he''ll', 'hello', 'help', 'hence', 'her', 'here', 'hereafter', 'hereby', 'herein', 'heres', 'here''s', 'hereupon', 'hers', 'herself', 'hes', 'he''s', 'hh', 'hi', 'hid', 'him', 'himself', 'his', 'hither', 'hj', 'ho', 'home', 'hopefully', 'how', 'howbeit', 'however', 'how''s', 'hr', 'hs', 'http', 'hu', 'hundred', 'hy', 'i', 'i2', 'i3', 'i4', 'i6', 'i7', 'i8', 'ia', 'ib', 'ibid', 'ic', 'id', 'i''d', 'ie', 'if', 'ig', 'ignored', 'ih', 'ii', 'ij', 'il', 'i''ll', 'im', 'i''m', 'immediate', 'immediately', 'importance', 'important', 'in', 'inasmuch', 'inc', 'indeed', 'index', 'indicate', 'indicated', 'indicates', 'information', 'inner', 'insofar', 'instead', 'interest', 'into', 'invention', 'inward', 'io', 'ip', 'iq', 
'ir', 'is', 'isn', 'isn''t', 'it', 'itd', 'it''d', 'it''ll', 'its', 'it''s', 'itself', 'iv', 'i''ve', 'ix', 'iy', 'iz', 'j', 'jj', 'jr', 'js', 'jt', 'ju', 'just', 'k', 'ke', 'keep', 'keeps', 'kept', 'kg', 'kj', 'km', 'know', 'known', 'knows', 'ko', 'l', 'l2', 'la', 'largely', 'last', 'lately', 'later', 'latter', 'latterly', 'lb', 'lc', 'le', 'least', 'les', 'less', 'lest', 'let', 'lets', 'let''s', 'lf', 'like', 'liked', 'likely', 'line', 'little', 'lj', 'll', 'll', 'ln', 'lo', 'look', 'looking', 'looks', 'los', 'lr', 'ls', 'lt', 'ltd', 'm', 'm2', 'ma', 'made', 'mainly', 'make', 'makes', 'many', 'may', 'maybe', 'me', 'mean', 'means', 'meantime', 'meanwhile', 'merely', 'mg', 'might', 'mightn', 'mightn''t', 'mill', 'million', 'mine', 'miss', 'ml', 'mn', 'mo', 'more', 'moreover', 'most', 'mostly', 'move', 'mr', 'mrs', 'ms', 'mt', 'mu', 'much', 'mug', 'must', 'mustn', 'mustn''t', 'my', 'myself', 'n', 'n2', 'na', 'name', 'namely', 'nay', 'nc', 'nd', 'ne', 'near', 'nearly', 'necessarily', 'necessary', 'need', 'needn', 'needn''t', 'needs', 'neither', 'never', 'nevertheless', 'new', 'next', 'ng', 'ni', 'nine', 'ninety', 'nj', 'nl', 'nn', 'no', 'nobody', 'non', 'none', 'nonetheless', 'noone', 'nor', 'normally', 'nos', 'not', 'noted', 'nothing', 'novel', 'now', 'nowhere', 'nr', 'ns', 'nt', 'ny', 'o', 'oa', 'ob', 'obtain', 'obtained', 'obviously', 'oc', 'od', 'of', 'off', 'often', 'og', 'oh', 'oi', 'oj', 'ok', 'okay', 'ol', 'old', 'om', 'omitted', 'on', 'once', 'one', 'ones', 'only', 'onto', 'oo', 'op', 'oq', 'or', 'ord', 'os', 'ot', 'other', 'others', 'otherwise', 'ou', 'ought', 'our', 'ours', 'ourselves', 'out', 'outside', 'over', 'overall', 'ow', 'owing', 'own', 'ox', 'oz', 'p', 'p1', 'p2', 'p3', 'page', 'pagecount', 'pages', 'par', 'part', 'particular', 'particularly', 'pas', 'past', 'pc', 'pd', 'pe', 'per', 'perhaps', 'pf', 'ph', 'pi', 'pj', 'pk', 'pl', 'placed', 'please', 'plus', 'pm', 'pn', 'po', 'poorly', 'possible', 'possibly', 'potentially', 'pp', 'pq', 'pr', 
'predominantly', 'present', 'presumably', 'previously', 'primarily', 'probably', 'promptly', 'proud', 'provides', 'ps', 'pt', 'pu', 'put', 'py', 'q', 'qj', 'qu', 'que', 'quickly', 'quite', 'qv', 'r', 'r2', 'ra', 'ran', 'rather', 'rc', 'rd', 're', 'readily', 'really', 'reasonably', 'recent', 'recently', 'ref', 'refs', 'regarding', 'regardless', 'regards', 'related', 'relatively', 'research', 'research-articl', 'respectively', 'resulted', 'resulting', 'results', 'rf', 'rh', 'ri', 'right', 'rj', 'rl', 'rm', 'rn', 'ro', 'rq', 'rr', 'rs', 'rt', 'ru', 'run', 'rv', 'ry', 's', 's2', 'sa', 'said', 'same', 'saw', 'say', 'saying', 'says', 'sc', 'sd', 'se', 'sec', 'second', 'secondly', 'section', 'see', 'seeing', 'seem', 'seemed', 'seeming', 'seems', 'seen', 'self', 'selves', 'sensible', 'sent', 'serious', 'seriously', 'seven', 'several', 'sf', 'shall', 'shan', 'shan''t', 'she', 'shed', 'she''d', 'she''ll', 'shes', 'she''s', 'should', 'shouldn', 'shouldn''t', 'should''ve', 'show', 'showed', 'shown', 'showns', 'shows', 'si', 'side', 'significant', 'significantly', 'similar', 'similarly', 'since', 'sincere', 'six', 'sixty', 'sj', 'sl', 'slightly', 'sm', 'sn', 'so', 'some', 'somebody', 'somehow', 'someone', 'somethan', 'something', 'sometime', 'sometimes', 'somewhat', 'somewhere', 'soon', 'sorry', 'sp', 'specifically', 'specified', 'specify', 'specifying', 'sq', 'sr', 'ss', 'st', 'still', 'stop', 'strongly', 'sub', 'substantially', 'successfully', 'such', 'sufficiently', 'suggest', 'sup', 'sure', 'sy', 'system', 'sz', 't', 't1', 't2', 't3', 'take', 'taken', 'taking', 'tb', 'tc', 'td', 'te', 'tell', 'ten', 'tends', 'tf', 'th', 'than', 'thank', 'thanks', 'thanx', 'that', 'that''ll', 'thats', 'that''s', 'that''ve', 'the', 'their', 'theirs', 'them', 'themselves', 'then', 'thence', 'there', 'thereafter', 'thereby', 'thered', 'therefore', 'therein', 'there''ll', 'thereof', 'therere', 'theres', 'there''s', 'thereto', 'thereupon', 'there''ve', 'these', 'they', 'theyd', 'they''d', 
'they''ll', 'theyre', 'they''re', 'they''ve', 'thickv', 'thin', 'think', 'third', 'this', 'thorough', 'thoroughly', 'those', 'thou', 'though', 'thoughh', 'thousand', 'three', 'throug', 'through', 'throughout', 'thru', 'thus', 'ti', 'til', 'tip', 'tj', 'tl', 'tm', 'tn', 'to', 'together', 'too', 'took', 'top', 'toward', 'towards', 'tp', 'tq', 'tr', 'tried', 'tries', 'truly', 'try', 'trying', 'ts', 't''s', 'tt', 'tv', 'twelve', 'twenty', 'twice', 'two', 'tx', 'u', 'u201d', 'ue', 'ui', 'uj', 'uk', 'um', 'un', 'under', 'unfortunately', 'unless', 'unlike', 'unlikely', 'until', 'unto', 'uo', 'up', 'upon', 'ups', 'ur', 'us', 'use', 'used', 'useful', 'usefully', 'usefulness', 'uses', 'using', 'usually', 'ut', 'v', 'va', 'value', 'various', 'vd', 've', 've', 'very', 'via', 'viz', 'vj', 'vo', 'vol', 'vols', 'volumtype', 'vq', 'vs', 'vt', 'vu', 'w', 'wa', 'want', 'wants', 'was', 'wasn', 'wasnt', 'wasn''t', 'way', 'we', 'wed', 'we''d', 'welcome', 'well', 'we''ll', 'well-b', 'went', 'were', 'we''re', 'weren', 'werent', 'weren''t', 'we''ve', 'what', 'whatever', 'what''ll', 'whats', 'what''s', 'when', 'whence', 'whenever', 'when''s', 'where', 'whereafter', 'whereas', 'whereby', 'wherein', 'wheres', 'where''s', 'whereupon', 'wherever', 'whether', 'which', 'while', 'whim', 'whither', 'who', 'whod', 'whoever', 'whole', 'who''ll', 'whom', 'whomever', 'whos', 'who''s', 'whose', 'why', 'why''s', 'wi', 'widely', 'will', 'willing', 'wish', 'with', 'within', 'without', 'wo', 'won', 'wonder', 'wont', 'won''t', 'words', 'world', 'would', 'wouldn', 'wouldnt', 'wouldn''t', 'www', 'x', 'x1', 'x2', 'x3', 'xf', 'xi', 'xj', 'xk', 'xl', 'xn', 'xo', 'xs', 'xt', 'xv', 'xx', 'y', 'y2', 'yes', 'yet', 'yj', 'yl', 'you', 'youd', 'you''d', 'you''ll', 'your', 'youre', 'you''re', 'yours', 'yourself', 'yourselves', 'you''ve', 'yr', 'ys', 'yt', 'z', 'zero', 'zi', 'zz')
GROUP BY x.Item
ORDER BY COUNT(*) DESC
Here's the result of the above code. As you can see, it's not counting correctly:
Word Counts
server 2
sql 2
data 1
database 1
popular 1
powerful 1
Can anyone help on this? Would be really appreciated!

You can use STRING_SPLIT here, for example:
select value Word, Count(*) Counts
from words
cross apply String_Split(text_column, ' ')
where value not in(exclude list)
group by value
order by counts desc;

You should use the STRING_SPLIT function, like this:
SELECT id, value as aword
FROM words
CROSS APPLY STRING_SPLIT(text_column, ' ');
This will create a table with all the words by id. To get the counts, do this:
SELECT aword, count(*) as counts
FROM (
SELECT id, value as aword
FROM words
CROSS APPLY STRING_SPLIT(text_column, ' ')
) x
GROUP BY aword
You may need to lower-case the words, for example with LOWER(value), if you want the count to ignore case.

If you don't have access to the STRING_SPLIT function, you can use an XML trick: turn each space into an element boundary and then shred the result with the nodes() function:
select word, COUNT(*)
from (
select n.value('.', 'nvarchar(50)') AS word
from (
VALUES
(1, 'SQL Server is a popular database management system'),
(2, 'It is widely used for data storage and retrieval'),
(3, 'SQL Server is a powerful tool for data analysis')
) AS t (id, txt)
CROSS APPLY (
SELECT CAST('<x>' + REPLACE(txt, ' ', '</x><x>') + '</x>' AS XML) x
) x
CROSS APPLY x.nodes('x') z(n)
) w
GROUP BY word
Of course, this will fail on "bad" words containing characters that are invalid in XML, but that can be worked around. Text processing has never been SQL Server's strong point, though, so it is probably better to use an NLP library for this kind of work.
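
Following the suggestion to do the text processing outside the database, here is a minimal Python sketch using only the standard library. It assumes the rows have already been fetched from the words table, and the `stop_words` set is a tiny hypothetical stand-in for the long exclusion list in the question.

```python
from collections import Counter

# The three sample rows from the question, as (id, text_column) pairs.
rows = [
    (1, "SQL Server is a popular database management system"),
    (2, "It is widely used for data storage and retrieval"),
    (3, "SQL Server is a powerful tool for data analysis"),
]

# Hypothetical, deliberately short stop-word list for illustration.
stop_words = {"is", "a", "it", "for", "and"}

counts = Counter(
    word
    for _, text in rows
    for word in text.lower().split()  # split on whitespace, ignore case
    if word not in stop_words
)

# Show the most frequent remaining words.
for word, n in counts.most_common(3):
    print(word, n)
```

Lower-casing before splitting means "SQL" and "sql" are counted together, which mirrors the LOWER() call in the original query.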

How to correctly use map and use np.where to replace values in a column

This is the city column of my dataframe, df.city:
array(['la', 'hollywood', 'pasadena', 'los angeles', 'new york',
'studio city', 'venice', 'santa monica', 'mar vista',
'beverly hills', 'w. hollywood', 'encino', 'st. boyle hts .',
'westlake village', 'westwood', 'west la', 'chinatown',
'monterey park', 'rancho park', 'redondo beach', 'long beach',
'marina del rey', 'culver city', 'burbank', 'century city',
'malibu', 'seal beach', 'northridge', 'st. hermosa beach'],
dtype=object)
I want the strings containing ['la','hollywood'] to be converted to 'los angeles'. How can I do this? I was using np.where(condition, x, y), but I couldn't work out what to pass as its third argument (y).
To replace the rest of the cities, I made this dictionary:
cities={'studio city':'los angeles', 'santa monica':'los angeles', 'mar vista':'los angeles', 'beverly hills':'los angeles', 'encino':'los angeles', 'st. boyle hts .':'los angeles', 'westwood':'los angeles', 'chinatown':'los angeles', 'monterey park':'los angeles', 'rancho park':'los angeles', 'redondo beach':'los angeles', 'century city':'los angeles', 'marina del rey':'los angeles', 'malibu':'los angeles', 'seal beach':'los angeles', 'northridge':'los angeles','st. hermosa beach':'los angeles'}
When I use df.city.map(cities), it maps the values present in the dictionary and replaces the others, such as 'los angeles', with NaN.
How can I go about cleaning this column of my dataframe?

You could use np.where like this:
df['city'] = np.where((df['city'].str.contains('la'))| (df['city'].str.contains('hollywood')), 'los angeles', df['city'])
The third argument is just the original column.
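
The NaN problem with map described in the question can be handled in the same spirit: fall back to the original value wherever the dictionary has no entry. This is a minimal sketch, assuming pandas is installed. The column and the dictionary are shortened, hypothetical versions of the ones in the question (the 'la' key is added here for illustration).

```python
import pandas as pd

# Shortened sample of the column and mapping from the question.
df = pd.DataFrame({"city": ["la", "studio city", "new york", "los angeles"]})
cities = {"la": "los angeles", "studio city": "los angeles"}

# map() yields NaN for values missing from the dict;
# fillna() then restores the original value in those positions.
df["city"] = df["city"].map(cities).fillna(df["city"])

print(df["city"].tolist())
```

Note that str.contains('la') is a substring test, so it also matches values such as 'westlake village'. An exact dictionary lookup like the one above avoids that side effect.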

Controlling decimal precision after resetting index of unstacked Pandas data frame

My data is as follows:
test_df = pd.DataFrame({'Manufacturer':['Ford', 'Ford', 'Mercedes', 'BMW', 'Ford', 'Mercedes', 'BMW', 'Ford', 'Mercedes', 'BMW', 'Ford', 'Mercedes', 'BMW', 'Ford', 'Mercedes', 'BMW', 'Ford', 'Mercedes', 'BMW'],
'Metric':['Orders', 'Orders', 'Orders', 'Orders', 'Orders', 'Orders', 'Orders', 'Sales', 'Sales', 'Sales', 'Sales', 'Sales', 'Sales', 'Warranty', 'Warranty', 'Warranty', 'Warranty', 'Warranty', 'Warranty'],
'Sector':['Germany', 'Germany', 'Germany', 'Germany', 'USA', 'USA', 'USA', 'Germany', 'Germany', 'Germany', 'USA', 'USA', 'USA', 'Germany', 'Germany', 'Germany', 'USA', 'USA', 'USA'],
'Value':[45000, 70000, 90000, 65000, 40000, 65000, 63000, 2700, 4400, 3400, 3000, 4700, 5700, 1500, 2000, 2500, 1300, 2000, 2450],
'City': ['Frankfurt', 'Bremen', 'Berlin', 'Hamburg', 'New York', 'Chicago', 'Los Angeles', 'Dresden', 'Munich', 'Cologne', 'Miami', 'Atlanta', 'Phoenix', 'Nuremberg', 'Dusseldorf', 'Leipzig', 'Houston', 'San Diego', 'San Francisco']
})
I reset the index and create a pivot table, as follows:
temp_table = test_df.reset_index().pivot_table(values = 'Value', index = ['Manufacturer', 'Metric', 'Sector'], aggfunc='sum')
Then, I create two new data frames:
s1 = temp_table.reset_index().set_index(['Manufacturer','Sector']).query("Metric=='Orders'").Value
s2 = temp_table.reset_index().set_index(['Manufacturer','Sector']).query("Metric=='Sales'").Value
Then, I unstack these data frames:
s1.div(s2).unstack()
Which gives me:
Sector        Germany        USA
Manufacturer
BMW         19.117647  11.052632
Ford        42.592593  13.333333
Mercedes    20.454545  13.829787
Then, I reset the index:
df_out = s1.div(s2).reset_index()
Which gives me:
Manufacturer Sector Value
0 BMW Germany 19.117647
1 BMW USA 11.052632
2 Ford Germany 42.592593
3 Ford USA 13.333333
4 Mercedes Germany 20.454545
5 Mercedes USA 13.829787
I would like to be able to round the Value column to 2 decimal places.
I tried to use the round() function, as follows:
df_out['Value'].round(2)
But, this doesn't seem to affect the values when I call df_out again.
What is the best way to control the decimal precision in this case?
Thanks!
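
One likely explanation, sketched here under the assumption that df_out looks like the frame shown above: Series.round returns a new Series and leaves the original untouched, so the result has to be assigned back to the column. The frame below is a small stand-in built from three of the rows in the output.

```python
import pandas as pd

# Small stand-in for df_out, using three rows from the output above.
df_out = pd.DataFrame({
    "Manufacturer": ["BMW", "BMW", "Ford"],
    "Sector": ["Germany", "USA", "Germany"],
    "Value": [19.117647, 11.052632, 42.592593],
})

# round() does not modify the column in place; assign the result back.
df_out["Value"] = df_out["Value"].round(2)

print(df_out)
```

If only the printed representation should change while the stored values keep full precision, setting the pandas option display.precision is an alternative worth considering.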

SQL Select Convert State Name To Abbreviation

In a SQL SELECT statement, how do I convert a full state name to its abbreviation (e.g. New York to NY)? I'd like to do this without joins if possible. What would the regexp_replace look like?
select regexp_replace(table.state, 'New York', 'NY', 'g') as state
Can this approach be done en masse for all states?
For reference list of states names and abbreviations: https://gist.github.com/esfand/9443427.

Here it is in raw WHEN/THEN form if needed, along with Canadian provinces:
CASE "YOUR COLUMN CONTAINING FULL STATE NAMES"
WHEN 'Alabama' THEN 'AL'
WHEN 'Alaska' THEN 'AK'
WHEN 'Arizona' THEN 'AZ'
WHEN 'Arkansas' THEN 'AR'
WHEN 'California' THEN 'CA'
WHEN 'Colorado' THEN 'CO'
WHEN 'Connecticut' THEN 'CT'
WHEN 'Delaware' THEN 'DE'
WHEN 'District of Columbia' THEN 'DC'
WHEN 'Florida' THEN 'FL'
WHEN 'Georgia' THEN 'GA'
WHEN 'Hawaii' THEN 'HI'
WHEN 'Idaho' THEN 'ID'
WHEN 'Illinois' THEN 'IL'
WHEN 'Indiana' THEN 'IN'
WHEN 'Iowa' THEN 'IA'
WHEN 'Kansas' THEN 'KS'
WHEN 'Kentucky' THEN 'KY'
WHEN 'Louisiana' THEN 'LA'
WHEN 'Maine' THEN 'ME'
WHEN 'Maryland' THEN 'MD'
WHEN 'Massachusetts' THEN 'MA'
WHEN 'Michigan' THEN 'MI'
WHEN 'Minnesota' THEN 'MN'
WHEN 'Mississippi' THEN 'MS'
WHEN 'Missouri' THEN 'MO'
WHEN 'Montana' THEN 'MT'
WHEN 'Nebraska' THEN 'NE'
WHEN 'Nevada' THEN 'NV'
WHEN 'New Hampshire' THEN 'NH'
WHEN 'New Jersey' THEN 'NJ'
WHEN 'New Mexico' THEN 'NM'
WHEN 'New York' THEN 'NY'
WHEN 'North Carolina' THEN 'NC'
WHEN 'North Dakota' THEN 'ND'
WHEN 'Ohio' THEN 'OH'
WHEN 'Oklahoma' THEN 'OK'
WHEN 'Oregon' THEN 'OR'
WHEN 'Pennsylvania' THEN 'PA'
WHEN 'Rhode Island' THEN 'RI'
WHEN 'South Carolina' THEN 'SC'
WHEN 'South Dakota' THEN 'SD'
WHEN 'Tennessee' THEN 'TN'
WHEN 'Texas' THEN 'TX'
WHEN 'Utah' THEN 'UT'
WHEN 'Vermont' THEN 'VT'
WHEN 'Virginia' THEN 'VA'
WHEN 'Washington' THEN 'WA'
WHEN 'West Virginia' THEN 'WV'
WHEN 'Wisconsin' THEN 'WI'
WHEN 'Wyoming' THEN 'WY'
WHEN 'Alberta' THEN 'AB'
WHEN 'British Columbia' THEN 'BC'
WHEN 'Manitoba' THEN 'MB'
WHEN 'New Brunswick' THEN 'NB'
WHEN 'Newfoundland and Labrador' THEN 'NL'
WHEN 'Northwest Territories' THEN 'NT'
WHEN 'Nova Scotia' THEN 'NS'
WHEN 'Nunavut' THEN 'NU'
WHEN 'Ontario' THEN 'ON'
WHEN 'Prince Edward Island' THEN 'PE'
WHEN 'Quebec' THEN 'QC'
WHEN 'Saskatchewan' THEN 'SK'
WHEN 'Yukon Territory' THEN 'YT'
ELSE NULL
END

With PostgreSQL you can use JSON. The ->> operator returns the value as text (-> would return it as json, i.e. with the quotes):
select '{"Alabama": "AL", "Alaska": "AK"}'::json->>'Alabama'
You can also use a column reference instead of a string literal:
select
'{"Alabama": "AL", "Alaska": "AK"}'::json->>example.state
from
(values ('Alabama')) example(state)

As the comments suggested, a join is needed. Below is what I ended up doing. Let me know if there is a better way.
with states(name, abbr) as (
select
*
from
(values ('Alabama', 'AL'),
('Alaska', 'AK'),
('Arizona', 'AZ'),
('Arkansas', 'AR'),
('California', 'CA'),
('Colorado', 'CO'),
('Connecticut', 'CT'),
('Delaware', 'DE'),
('District of Columbia', 'DC'),
('Florida', 'FL'),
('Georgia', 'GA'),
('Hawaii', 'HI'),
('Idaho', 'ID'),
('Illinois', 'IL'),
('Indiana', 'IN'),
('Iowa', 'IA'),
('Kansas', 'KS'),
('Kentucky', 'KY'),
('Louisiana', 'LA'),
('Maine', 'ME'),
('Maryland', 'MD'),
('Massachusetts', 'MA'),
('Michigan', 'MI'),
('Minnesota', 'MN'),
('Mississippi', 'MS'),
('Missouri', 'MO'),
('Montana', 'MT'),
('Nebraska', 'NE'),
('Nevada', 'NV'),
('New Hampshire', 'NH'),
('New Jersey', 'NJ'),
('New Mexico', 'NM'),
('New York', 'NY'),
('North Carolina', 'NC'),
('North Dakota', 'ND'),
('Ohio', 'OH'),
('Oklahoma', 'OK'),
('Oregon', 'OR'),
('Pennsylvania', 'PA'),
('Rhode Island', 'RI'),
('South Carolina', 'SC'),
('South Dakota', 'SD'),
('Tennessee', 'TN'),
('Texas', 'TX'),
('Utah', 'UT'),
('Vermont', 'VT'),
('Virginia', 'VA'),
('Washington', 'WA'),
('West Virginia', 'WV'),
('Wisconsin', 'WI'),
('Wyoming', 'WY')) as state
)
Then, in the main query, select (select states.abbr from states where name = state_name)
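
The lookup-table idea can also be written as a join. Below is a minimal sketch in Python with the built-in sqlite3 module, not PostgreSQL. The `customers` table and its columns are hypothetical, and the state list is shortened to three entries. The LEFT JOIN keeps rows whose name has no match and returns NULL for them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table holding full state names.
conn.execute("CREATE TABLE customers (id INTEGER, state_name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "New York"), (2, "Alaska"), (3, "Atlantis")],
)

# Inline lookup table (shortened) joined to the source rows.
query = """
WITH states(name, abbr) AS (
    VALUES ('Alabama', 'AL'), ('Alaska', 'AK'), ('New York', 'NY')
)
SELECT c.id, s.abbr
FROM customers AS c
LEFT JOIN states AS s ON s.name = c.state_name
ORDER BY c.id
"""
result = conn.execute(query).fetchall()
print(result)
```

Keeping the name-to-abbreviation pairs in one place (a CTE here, or a permanent table) avoids repeating a long CASE expression in every query.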

Else do nothing SQL query

I have a field, froiexported, in DB table claim3 that is set to either one or zero. I want to run an update where, if the criteria in the CASE statement are met, the value in froiexported is set to 1; otherwise nothing should change. The statement below resets every non-matching row to 0, which makes my results incorrect every day.
update claim3
set froiexported =
CASE
WHEN froimaintdate >= dateadd(day,datediff(day,1,GETDATE()),0)
AND froimaintdate < dateadd(day,datediff(day,0,GETDATE()),0)
AND c1.jurst in ('AK', 'AL', 'CA', 'CO', 'FL', 'GA', 'IA', 'IN', 'KS', 'KY', 'LA', 'MA', 'ME', 'MN', 'MO', 'MS', 'NC', 'NE', 'NJ', 'PA', 'RI', 'SC', 'TN', 'TX', 'UT', 'VA', 'VT', 'WV')
THEN '1'
ELSE '0'
END

You can use a where clause instead:
update claim3
set froiexported = 1
where froiexported <> 1
and froimaintdate >= dateadd(day,datediff(day,1,getdate()),0)
and froimaintdate < dateadd(day,datediff(day,0,getdate()),0)
and c1.jurst in ('AK', 'AL', 'CA', 'CO', 'FL', 'GA', 'IA', 'IN'
, 'KS','KY', 'LA', 'MA', 'ME', 'MN', 'MO', 'MS', 'NC', 'NE'
, 'NJ', 'PA', 'RI', 'SC', 'TN', 'TX', 'UT', 'VA', 'VT', 'WV'
)
If you need to set 0s for the previous day as well:
update claim3
set froiexported = case
when c1.jurst in ('AK', 'AL', 'CA', 'CO', 'FL', 'GA', 'IA', 'IN'
, 'KS','KY', 'LA', 'MA', 'ME', 'MN', 'MO', 'MS', 'NC', 'NE'
, 'NJ', 'PA', 'RI', 'SC', 'TN', 'TX', 'UT', 'VA', 'VT', 'WV'
)
then 1
else 0
end
where froimaintdate >= dateadd(day,datediff(day,1,getdate()),0)
and froimaintdate < dateadd(day,datediff(day,0,getdate()),0)

How about setting it to 1 if criteria are met, else set to the current value?
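
That idea can be sketched as a CASE whose ELSE branch writes the column's current value back. The example below is a minimal illustration in Python with the built-in sqlite3 module, not SQL Server. It assumes, for simplicity, that jurst lives on claim3 itself, stores dates as ISO text, and shortens the jurisdiction list to three codes.

```python
import sqlite3
from datetime import date, timedelta

today = date.today()
yesterday = today - timedelta(days=1)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claim3 (id INTEGER, jurst TEXT, froimaintdate TEXT, froiexported INTEGER)"
)
conn.executemany(
    "INSERT INTO claim3 VALUES (?, ?, ?, ?)",
    [
        (1, "AK", yesterday.isoformat(), 0),  # meets the criteria
        (2, "NY", yesterday.isoformat(), 0),  # jurisdiction not in the list
        (3, "AK", "2000-01-01", 1),           # old row, already exported
    ],
)

# ELSE froiexported writes the current value back, so non-matching rows keep it.
conn.execute(
    """
    UPDATE claim3
    SET froiexported = CASE
        WHEN froimaintdate >= ? AND froimaintdate < ?
             AND jurst IN ('AK', 'AL', 'CA')
        THEN 1
        ELSE froiexported
    END
    """,
    (yesterday.isoformat(), today.isoformat()),
)
flags = conn.execute("SELECT id, froiexported FROM claim3 ORDER BY id").fetchall()
print(flags)
```

This form touches every row in the table, whereas the WHERE-clause version only updates the rows that match, so the WHERE version is usually the cheaper of the two.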
