Individuals in multiple departments affecting grand total count - sql

I have a report I am trying to simplify but I am running into an issue.
(Undesired) The rows/columns of the report currently look like the following.
Department            Total  Probation (%)  Suspended (%)
All Employees         32     16.3           1.4
All Teams             30     23.5           2.2
Total Men's Teams     10     14.8           2.8
Total Women's Teams   10     34.3           1.4
Men's Wear            10     5.9            0.0
Women's Wear          10     21.4           0.0
UniSex Wear           10     15.0           6.3
This is happening because two people each work on two teams: one person works in Men's Wear and UniSex Wear, and one person works in Women's Wear and UniSex Wear. The underlying table has records like this.
Col1  Col2
1234  Men's Wear
1234  UniSex Wear
9876  Women's Wear
9876  UniSex Wear
(Desired) I'm looking for something like this.
Department            Total  Probation (%)  Suspended (%)
All Employees         30     16.3           1.4
All Teams             30     23.5           2.2
Total Men's Teams     10     14.8           2.8
Total Women's Teams   10     34.3           1.4
Men's Wear            10     5.9            0.0
Women's Wear          10     21.4           0.0
UniSex Wear           10     15.0           6.3
I have thought about using LISTAGG() on Col2 to get this effect.
Col1  Col2
1234  Men's Wear,UniSex Wear
9876  Women's Wear,UniSex Wear
Using LISTAGG() gives me the correct count for "All Employees" but then I get groupings of "Men's Wear,UniSex Wear" instead of a separate one for "Men's Wear" and one for "UniSex Wear". Is it possible to group by the individual comma separated values in Col2 after they have been LISTAGG()'ed, or is there a better way of achieving my end results?
Any assistance on achieving this would be greatly appreciated.

I would advise correcting only the "All Employees" figure instead of using LISTAGG.
OR
Use a separate table for the LISTAGG and un-LISTAGG step, distinct from the original table used to calculate the Total, Probation and Suspended figures.
To un-LISTAGG, you can use the example below, where table_two is your source table.
with d2 as (
    select distinct
           id,
           regexp_substr(products, '[^,]+', 1, column_value) as products
    from table_two
    cross join TABLE(
        Cast(
            MULTISET(
                SELECT LEVEL
                FROM dual
                CONNECT BY level <= REGEXP_COUNT(products, '[^,]+')
            ) AS sys.ODCINUMBERLIST
        )
    )
)
SELECT ID,
       PRODUCTS
FROM d2;
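The first suggestion above (fix only the "All Employees" figure) could be sketched roughly as follows. This is an illustration only, run through Python's built-in sqlite3 rather than Oracle. The table name emp_dept, its columns emp_id and dept (standing in for Col1 and Col2), and the extra employee 5555 are all hypothetical. The idea is that each department row keeps a plain COUNT(*), while the grand total uses COUNT(DISTINCT emp_id) so that a person on two teams is counted once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: emp_id plays the role of Col1, dept the role of Col2.
conn.execute("CREATE TABLE emp_dept (emp_id INTEGER, dept TEXT)")
conn.executemany(
    "INSERT INTO emp_dept VALUES (?, ?)",
    [
        (1234, "Men's Wear"), (1234, "UniSex Wear"),
        (9876, "Women's Wear"), (9876, "UniSex Wear"),
        (5555, "Men's Wear"),  # made-up employee who is on one team only
    ],
)

rows = conn.execute(
    """
    SELECT 'All Employees' AS department, COUNT(DISTINCT emp_id) AS total
    FROM emp_dept
    UNION ALL
    SELECT dept, COUNT(*)
    FROM emp_dept
    GROUP BY dept
    """
).fetchall()

# Map each report label to its head count.
totals = dict(rows)
for department, total in rows:
    print(department, total)
```

With this shape there is no need to LISTAGG and then split again: the department rows still see every membership, and only the grand total is de-duplicated.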


mean, std and counts in one record

I have data that looks like this:
id res res_q
1 12.9 normal
2 11.5 low
3 13.2 normal
4 9.7 low
5 12.0 low
6 15.5 normal
7 13.5 normal
8 13.3 normal
9 13.5 normal
10 13.1 normal
11 13.4 normal
12 12.9 normal
13 11.8 low
14 11.9 low
15 12.8 normal
16 13.1 normal
17 12.2 normal
18 11.9 low
19 12.5 normal
20 16.5 normal
res_q can take the values 'low', 'normal' and 'high'.
I want to aggregate it so that a single record holds the mean and std of res together with the counts of low, normal and high, like this:
mean sd low normal high
12.9 1.41 6 14 0
Of course, I can do it by first aggregating the mean and std using AVG and STDEV, and then using COUNT to get the low/normal/high counts, like this:
SELECT AVG(res) AS mean,
STD(res) AS sd,
(SELECT COUNT(1) FROM temp1 WHERE res_q='low') AS low,
(SELECT COUNT(1) FROM temp1 WHERE res_q='normal') AS normal,
(SELECT COUNT(1) FROM temp1 WHERE res_q='high') AS high
FROM temp1
But, is there a more efficient way to do it?
One possibility I can think of is first to get the mean and the sd using AVG and STDEV, then get the counts using GROUP BY and then add the counts using UPDATE. Is this really more efficient? Anything else?
Thank you for your help.
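Use conditional aggregation. A CASE expression with no ELSE branch yields NULL for non-matching rows, and COUNT ignores NULLs, so all three counts come from a single pass over the table:
SELECT AVG(res) AS mean,
       STD(res) AS sd,
       count(case when res_q='low' then 1 end) AS low,
       count(case when res_q='normal' then 1 end) AS normal,
       count(case when res_q='high' then 1 end) AS high
FROM temp1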
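As a quick check, the conditional-aggregation query can be run against the sample rows from the question. This is a minimal sketch using Python's built-in sqlite3, which is an assumption made here for convenience (the STD call in the question suggests MySQL). SQLite has no STD(), so the sketch derives a population standard deviation from AVG(res * res) and AVG(res) instead.

```python
import math
import sqlite3

# (res, res_q) pairs copied from the question, in id order 1..20.
data = [
    (12.9, "normal"), (11.5, "low"), (13.2, "normal"), (9.7, "low"),
    (12.0, "low"), (15.5, "normal"), (13.5, "normal"), (13.3, "normal"),
    (13.5, "normal"), (13.1, "normal"), (13.4, "normal"), (12.9, "normal"),
    (11.8, "low"), (11.9, "low"), (12.8, "normal"), (13.1, "normal"),
    (12.2, "normal"), (11.9, "low"), (12.5, "normal"), (16.5, "normal"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp1 (id INTEGER PRIMARY KEY, res REAL, res_q TEXT)")
conn.executemany("INSERT INTO temp1 (res, res_q) VALUES (?, ?)", data)

mean, mean_sq, low, normal, high = conn.execute(
    """
    SELECT AVG(res),
           AVG(res * res),
           COUNT(CASE WHEN res_q = 'low' THEN 1 END),
           COUNT(CASE WHEN res_q = 'normal' THEN 1 END),
           COUNT(CASE WHEN res_q = 'high' THEN 1 END)
    FROM temp1
    """
).fetchone()

# Population standard deviation: sqrt(E[x^2] - E[x]^2).
sd = math.sqrt(mean_sq - mean * mean)
print(round(mean, 2), round(sd, 2), low, normal, high)
```

Everything is computed in one scan of temp1, whereas the original version runs three extra scalar subqueries.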

Aggregating against rest of the values

Hi, I need help analyzing the data below. The logic I need is that the sum for each provider should be divided by the sum for the rest of the providers. For example, based on the data below:
sum(east Risk) / (sum(west Risk) + sum(south Risk))
sum(west Risk) / (sum(east Risk) + sum(south Risk))
sum(south Risk) / (sum(east Risk) + sum(west Risk))
and so on...
Mbr  Provider  Group      Risk
1    east      Group      2.44
2    east      Group      0.05
3    east      Group      1.01
4    east      Group      0.14
5    west      Comp MRKT  0.32
6    west      Comp MRKT  2.12
7    south     Comp MRKT  5.78
8    south     Comp MRKT  1.11
I think you can use ANSI standard window functions for this purpose:
select provider,
       sum(risk) / (sum(sum(risk)) over () - sum(risk))
from t
group by provider;
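To see what the window expression does, here is a small sketch that runs the same query on the sample rows through Python's built-in sqlite3. It assumes an SQLite build with window function support (3.25 or later). The column name grp is a stand-in for Group, which is a reserved word, and the alias ratio is added only for readability. The inner sum(risk) is the per-provider total, and sum(sum(risk)) over () adds those totals across all groups, so subtracting the provider's own total leaves the sum for the rest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# grp stands in for the question's "Group" column (a reserved word in SQL).
conn.execute("CREATE TABLE t (mbr INTEGER, provider TEXT, grp TEXT, risk REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "east", "Group", 2.44), (2, "east", "Group", 0.05),
        (3, "east", "Group", 1.01), (4, "east", "Group", 0.14),
        (5, "west", "Comp MRKT", 0.32), (6, "west", "Comp MRKT", 2.12),
        (7, "south", "Comp MRKT", 5.78), (8, "south", "Comp MRKT", 1.11),
    ],
)

rows = conn.execute(
    """
    SELECT provider,
           SUM(risk) / (SUM(SUM(risk)) OVER () - SUM(risk)) AS ratio
    FROM t
    GROUP BY provider
    """
).fetchall()

# Map each provider to its share relative to all other providers.
ratios = dict(rows)
for provider, ratio in sorted(ratios.items()):
    print(provider, round(ratio, 4))
```

For east, for example, this is 3.64 / (2.44 + 6.89), with the denominator being the west and south totals.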

How to query DBpedia online using SQL?

DBpedia just released their data as tables, suitable to import into a relational database. How can I query this data online using SQL?
Dataset:
http://wiki.dbpedia.org/DBpediaAsTables
I took the raw data, uploaded it to BigQuery, and made it public. So far I've done it with the 'person' and the 'place' table. Check them at https://bigquery.cloud.google.com/table/fh-bigquery:dbpedia.person.
Now it's easy to find the most popular alma maters, for example:
SELECT COUNT(*), almaMater_label
FROM [fh-bigquery:dbpedia.person]
WHERE almaMater_label != 'NULL'
GROUP BY 2
ORDER BY 1 DESC
It's a little more complicated than that, because some people have more than one alma mater and DBpedia encodes that in a particular way. I left the complete query at http://www.reddit.com/r/bigquery/comments/1rjee7/query_wikipedia_in_bigquery_the_dbpedia_dataset/.
Btw, the top alma maters are:
494 Harvard University
320 University of Cambridge
314 University of Michigan
267 Yale University
216 Trinity College Cambridge
You can also do joins between tables.
For example, for each building (from the place table) that has an architect: What year was that architect born? How many buildings with an architect born that year are listed in DBpedia?
SELECT COUNT(*), LEFT(b.birthDate, 4) birthYear
FROM [fh-bigquery:dbpedia.place] a
JOIN EACH [fh-bigquery:dbpedia.person] b
ON a.architect = b.URI
WHERE a.architect != 'NULL'
AND birthDate != 'NULL'
GROUP BY 2
ORDER BY 2
Results:
...
8 1934
13 1935
9 1937
7 1938
17 1939
7 1941
1 1943
15 1944
10 1945
12 1946
7 1947
9 1950
20 1951
1 1952
...
(Google BigQuery has a free monthly query quota of up to 100GB each month)
(DBpedia data from version 3.4 on is licensed under the terms of the Creative Commons Attribution-ShareAlike 3.0 license and the GNU Free Documentation License. http://dbpedia.org/Datasets#h338-24)

MS Access, Excel, SQL, and New Tables

I'm just starting out with MS Access 2010 and have the following setup: three Excel files, masterlist.x (which contains every product that I sell), vender1.x (which contains all products from vender1; I sell only some of them), and vender2.x (likewise, all products from vender2, of which I sell only some). Here's some example data:
masterlist.x
ID NAME PRICE
23 bananas .50
33 apples .75
35 nuts .87
38 raisins .25
vender1.x
ID NAME PRICE
23 bananas .50
25 pears .88
vender2.x
ID NAME PRICE
33 apples .75
35 nuts .87
38 raisins .25
49 kiwis .88
The vendor lists are periodically updated with new items for sale and new prices. For example, if vender1 raises the price of bananas to $.75, my masterlist.x needs to be updated to reflect this.
Where I'm at now: I know how to import the three Excel files into Access. From there, I've been researching whether I need to set up relationships, create a macro, or write a SQL query to accomplish my goals. I'm not necessarily looking for a complete solution, but being pointed in the right direction would be great!
Also, once the masterlist.x table is updated, what feature would I use to see which line items were affected?
Update: I discovered SQL JOIN and have the following:
SELECT * FROM master
LEFT JOIN vender1
ON master.ID = vender1.ID
where master.PRICE <> vender1.PRICE;
This gives me the output (for the above scenario)
ID NAME PRICE ID NAME PRICE
23 bananas .50 23 bananas .75
What feature would instead give me:
masterlist.x
ID NAME PRICE
23 bananas .75
33 apples .75
35 nuts .87
38 raisins .25
Here are some pointers, since you were asking for design ideas. I don't really like your current table schema. The following queries were built in SQL Server 2008, the nearest syntax to MS Access SQL that I could get in SQL Fiddle.
Proposed table design:
vendor table:
VID VNAME
1 smp farms
2 coles
3 cold str
4 Anvil NSW
product table:
PID VID PNAME PPRICE
203 2 bananas 0.5
205 2 pears 0.88
301 3 bananas 0.78
303 3 apples 0.75
305 3 nuts 0.87
308 3 raisins 0.25
409 4 kiwis 0.88
masterlist:
ID PID MPRICE
1 203 0.5
2 303 0.75
3 305 0.87
4 308 0.25
Join queries can now easily update your masterlist, for example when a vendor changes the prices of the fruits they supply, or when they stop supplying a product. You can add WHERE clauses to apply whatever conditions you need.
Query:
SELECT m.id, p.vid, p.pname, p.pprice
FROM masterlist m
LEFT JOIN product p ON p.pid = m.pid
;
Results:
ID VID PNAME PPRICE
1 2 bananas 0.5
2 3 apples 0.75
3 3 nuts 0.87
4 3 raisins 0.25
Please comment. Happy to help if you have any doubts.
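Building on the proposed product and masterlist tables, the price refresh asked about in the question could look roughly like this. It is a sketch under several assumptions: it runs in SQLite through Python's built-in sqlite3, so the UPDATE uses a correlated subquery (Access SQL would more typically use its own UPDATE ... INNER JOIN ... SET form), it loads only the four products that appear in masterlist, and the bananas price in product has already been changed to 0.75 to simulate a vendor update. The first query lists the affected line items before they are overwritten.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (pid INTEGER PRIMARY KEY, vid INTEGER,
                          pname TEXT, pprice REAL);
    CREATE TABLE masterlist (id INTEGER PRIMARY KEY, pid INTEGER, mprice REAL);
    -- bananas (pid 203) carries the simulated new vendor price of 0.75
    INSERT INTO product VALUES (203, 2, 'bananas', 0.75), (303, 3, 'apples', 0.75),
                               (305, 3, 'nuts', 0.87), (308, 3, 'raisins', 0.25);
    INSERT INTO masterlist VALUES (1, 203, 0.5), (2, 303, 0.75),
                                  (3, 305, 0.87), (4, 308, 0.25);
    """
)

# Line items whose master price no longer matches the vendor price.
changed = conn.execute(
    """
    SELECT m.id, p.pname, m.mprice, p.pprice
    FROM masterlist m
    JOIN product p ON p.pid = m.pid
    WHERE m.mprice <> p.pprice
    """
).fetchall()

# Bring only the stale master prices in line with the vendor prices.
conn.execute(
    """
    UPDATE masterlist
    SET mprice = (SELECT pprice FROM product WHERE product.pid = masterlist.pid)
    WHERE mprice <> (SELECT pprice FROM product WHERE product.pid = masterlist.pid)
    """
)

prices = dict(conn.execute("SELECT pid, mprice FROM masterlist").fetchall())
print(changed)
print(prices)
```

Running the SELECT before the UPDATE answers the "which line items were affected" part of the question, since the old and new prices are both still available at that point.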

Transpose groups/subgroups in sql oracle

I have a date column that I need to divide into six quarters, and for each quarter calculate a count, a ratio (A/Count) and Avg(colC) by state. I have converted the date column to YYYY Q format. I was wondering whether I can get the results shown below. I am using Oracle 11g. I am just trying to write SQL that gives me results in that format. I am able to group results by quarter, but I am unable to subgroup further to show Count, Ratio and Avg under each quarter.
I have two tables that I need to use to get the data below.
Table 1
Customer_id  St_Nme      St_Cd
1            Alabama     AL
2            California  CA

Table 2
Customer_id  No_of_sales  Time_spent  Date
1            4            4.5         01122012
2            7.5          9.33        03062012
Desired Output
Count: count of sales
Ratio: Time_spent / count of sales
Avg: average of time spent
Q42012 Q32012 Q22012 Q12012 Q42011 Q32012
count Ratio Avg count Ratio Avg count Ratio Avg
State
Alabama 3 4.5 1.2 8 7.4 3.2 65 21.1 34.4
A.. 4 7.5 3.2 5 9.4 5.2 61 25.1 39.4
A.. 9 6.5 5.2 4 3.4 3.7 54 41.1 44.4
Boston
Cali..
Den..