I have two tables that I am joining, and that works, but I think I designed the second table wrong: it has one column for each option of what is really a multiple-choice question. The query is this:
select Count(n.ID) as MemCount, u.Pay1Click, u.PayMailCC, u.PayMailCheck, u.PayPhoneACH, u.PayPhoneCC, u.PayWuFoo
from name as n inner join
UD_Demo_ORG as u on n.ID = u.ID
where n.MEMBER_TYPE like 'ORG_%' and n.CATEGORY not like '%_2' and
(u.Pay1Click = '1' or u.PayMailCC = '1' or u.PayMailCheck = '1' or u.PayPhoneACH = '1' or u.PayPhoneCC = '1' or u.PayWuFoo = '1')
group by u.Pay1Click, u.PayMailCC, u.PayMailCheck, u.PayPhoneACH, u.PayPhoneCC, u.PayWuFoo
The results come up like this:
Count Pay1Click PayMailCC PayMailCheck PayPhoneACH PayPhoneCC PayWuFoo
8 0 0 0 0 0 1
25 0 0 0 0 1 0
8 0 0 0 1 0 0
99 0 0 1 0 0 0
11 0 1 0 0 0 0
So the question is: how can I reduce this to two columns, Count and the name of whichever of the six payment columns is set, so the results look like this:
Count PaymentType
8 PayWuFoo
25 PayPhoneCC
8 PayPhoneACH
99 PayMailCheck
11 PayMailCC
Thanks.
Try this one:
Select Count,
CASE WHEN Pay1Click=1 THEN 'Pay1Click'
WHEN PayMailCC=1 THEN 'PayMailCC'
WHEN PayMailCheck=1 THEN 'PayMailCheck'
WHEN PayPhoneACH=1 THEN 'PayPhoneACH'
WHEN PayPhoneCC=1 THEN 'PayPhoneCC'
WHEN PayWuFoo=1 THEN 'PayWuFoo'
END as PaymentType
FROM ......
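For reference, here is one way the CASE expression could be combined with the tables and filters from the question. It is only a sketch: it assumes the flags are stored as the string '1' (as in the original WHERE clause) and that each row has at most one flag set. If a row has several flags set, the first matching WHEN wins.

```sql
-- Sketch: label each joined row with its payment type, then count per label.
SELECT COUNT(*) AS MemCount, t.PaymentType
FROM (
    SELECT CASE WHEN u.Pay1Click    = '1' THEN 'Pay1Click'
                WHEN u.PayMailCC    = '1' THEN 'PayMailCC'
                WHEN u.PayMailCheck = '1' THEN 'PayMailCheck'
                WHEN u.PayPhoneACH  = '1' THEN 'PayPhoneACH'
                WHEN u.PayPhoneCC   = '1' THEN 'PayPhoneCC'
                WHEN u.PayWuFoo     = '1' THEN 'PayWuFoo'
           END AS PaymentType          -- NULL when no flag is set
    FROM name AS n
    INNER JOIN UD_Demo_ORG AS u ON n.ID = u.ID
    WHERE n.MEMBER_TYPE LIKE 'ORG_%'
      AND n.CATEGORY NOT LIKE '%_2'
) AS t
WHERE t.PaymentType IS NOT NULL        -- replaces the chain of OR conditions
GROUP BY t.PaymentType;
```

Grouping on the derived PaymentType column is what collapses the six flag columns into one.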
I think you did indeed make a mistake in the structure of the second table. Instead of creating a column for each option of the multiple-choice question, I would suggest collapsing all those columns into a single 'answer' column, so that each record holds the name of the option that was chosen.
For this, you have to change the structure of your tables and the way the table is populated: take the name of the option that was checked and store it in the table.
You should also watch out for repetitive data: writing the same string over and over makes the table larger than it needs to be.
If there is other information tied to the answer, or other data in the UD_Demo_ORG table, you can normalize the design by creating a payment_dimension table (or something like it) and giving each option an ID, such as:
ID PaymentType OtherInfo(description, etc)...
1 PayWuFoo ...
2 PayPhoneCC ...
3 PayPhoneACH ...
4 PayMailCheck ...
5 PayMailCC ...
This is called a dimension table. Your records then hold only the ID of the payment type, not information you don't need.
So instead of a wide result set, you could simplify your query considerably and get just
Count PaymentId
8 1
25 2
8 3
99 4
11 5
as a result set. It would likely make the query faster too, and if you need other information, you can join the dimension table to get it.
BUT if the only field you would have is the name, perhaps you could use the PaymentType itself as the "id" in this case; just consider it. Separating it into a dimension table is the more scalable choice.
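A minimal sketch of that layout follows. The payment_dimension name comes from the suggestion above; the member_payment table and all column names are hypothetical and only there to illustrate the idea.

```sql
-- Dimension table: one row per payment option.
CREATE TABLE payment_dimension (
    payment_id   INT PRIMARY KEY,
    payment_type VARCHAR(30) NOT NULL UNIQUE
);

INSERT INTO payment_dimension (payment_id, payment_type) VALUES
    (1, 'PayWuFoo'),
    (2, 'PayPhoneCC'),
    (3, 'PayPhoneACH'),
    (4, 'PayMailCheck'),
    (5, 'PayMailCC');

-- Hypothetical table that stores only the ID of the chosen option.
CREATE TABLE member_payment (
    member_id  VARCHAR(10) NOT NULL,
    payment_id INT NOT NULL REFERENCES payment_dimension (payment_id)
);

-- Count per payment type; the join is needed only when the name is wanted.
SELECT COUNT(*) AS MemCount, d.payment_type AS PaymentType
FROM member_payment AS mp
INNER JOIN payment_dimension AS d ON d.payment_id = mp.payment_id
GROUP BY d.payment_type;
```

With this shape, adding a new payment option means inserting one row into payment_dimension rather than adding a column.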
Some references for further reading:
http://beginnersbook.com/2015/05/normalization-in-dbms/ "Normalization in DBMS"
http://searchdatamanagement.techtarget.com/answer/What-are-the-differences-between-fact-tables-and-dimension-tables-in-star-schemas "Differences between fact tables and dimensions tables"
I've run into a subtlety around count(*) and joins, and am hoping to get some confirmation that I've correctly figured out what's going on. For background, we commonly convert continuous timeline data into discrete bins, such as hours. Since we don't want gaps for bins with no content, we use generate_series to synthesize the buckets we want values for. If there's no entry for, say, 10AM, fine, we still get a result. However, I noticed that I'm sometimes getting 1 instead of 0. Here's what I'm trying to confirm:
The count is 1 if you count the "grid" series, and 0 if you count the data table.
This only has to do with count, and no other aggregate.
The code below sets up some sample data to show what I'm talking about:
DROP TABLE IF EXISTS analytics.measurement_table CASCADE;
CREATE TABLE IF NOT EXISTS analytics.measurement_table (
hour smallint NOT NULL DEFAULT NULL,
measurement smallint NOT NULL DEFAULT NULL
);
INSERT INTO measurement_table (hour, measurement)
VALUES ( 0, 1),
( 1, 1), ( 1, 1),
(10, 2), (10, 3), (10, 5);
Here are the goal results for the query. I'm using 12 hours to keep the example results shorter.
Hour Count sum
0 1 1
1 2 2
2 0 0
3 0 0
4 0 0
5 0 0
6 0 0
7 0 0
8 0 0
9 0 0
10 3 10
11 0 0
12 0 0
This works correctly:
WITH hour_series AS (
select * from generate_series (0,12) AS hour
)
SELECT hour_series.hour,
count(measurement_table.hour) AS frequency,
COALESCE(sum(measurement_table.measurement), 0) AS total
FROM hour_series
LEFT JOIN measurement_table ON (measurement_table.hour = hour_series.hour)
GROUP BY 1
ORDER BY 1
This returns misleading 1's for the hours with no match:
WITH hour_series AS (
select * from generate_series (0,12) AS hour
)
SELECT hour_series.hour,
count(*) AS frequency,
COALESCE(sum(measurement_table.measurement), 0) AS total
FROM hour_series
LEFT JOIN measurement_table ON (hour_series.hour = measurement_table.hour)
GROUP BY 1
ORDER BY 1
0 1 1
1 2 2
2 1 0
3 1 0
4 1 0
5 1 0
6 1 0
7 1 0
8 1 0
9 1 0
10 3 10
11 1 0
12 1 0
The only difference between these two examples is the count term:
count(*) -- A result of 1 on no match, and a correct count otherwise.
count(joined to table field) -- 0 on no match, correct count otherwise.
That seems to be it: you've got to make it explicit that you're counting the data table. Otherwise, you get a count of 1, since the series row is matching once. Is this a nuance of joining, or a nuance of count in Postgres?
Does this impact any other aggregate? It seems like it shouldn't.
P.S. generate_series is just about the best thing ever.
You figured out the problem correctly: count() behaves differently depending on the argument it is given.
count(*) counts how many rows belong to the group. This just cannot be 0 since there is always at least one row in a group (otherwise, there would be no group).
On the other hand, when given a column name or expression as argument, count() takes into account only non-null values and ignores nulls. For your query, this lets you distinguish groups that have no match in the left joined table from groups where there are matches.
Note that this behavior is not Postgres specific, but belongs to the standard ANSI SQL specification (all databases that I know conform to it).
Bottom line:
in general cases, use count(*); this is more efficient, since the database does not need to check for nulls (and it makes it clear to the reader of the query that you just want to know how many rows belong to the group)
in specific cases such as yours, put the relevant expression in the count()
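To see the two forms side by side, and to check the other aggregates, here is a small self-contained sketch for Postgres. The inline sample data is a made-up subset of the rows from the question, and the column aliases are only illustrative.

```sql
WITH hour_series AS (
    SELECT generate_series(0, 2) AS hour
),
sample (hour, measurement) AS (
    VALUES (0, 1), (1, 1), (1, 1)       -- no rows for hour 2
)
SELECT hs.hour,
       count(*)           AS rows_in_group,  -- counts joined rows, never 0
       count(s.hour)      AS matched_rows,   -- counts non-null s.hour only
       sum(s.measurement) AS total,          -- skips nulls
       max(s.measurement) AS largest         -- skips nulls
FROM hour_series AS hs
LEFT JOIN sample AS s ON s.hour = hs.hour
GROUP BY hs.hour
ORDER BY hs.hour;
-- hour | rows_in_group | matched_rows | total | largest
--    0 |             1 |            1 |     1 |       1
--    1 |             2 |            2 |     2 |       1
--    2 |             1 |            0 | NULL  | NULL
```

sum, avg, min and max all skip nulls, so the padding row produced by the left join does not inflate them. For a group with no match they return NULL rather than 0, which is why the COALESCE in the question's query is still needed.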
I have been struggling with this for hours. I am trying to update all rows that share the same 'SHORT#': if a 'SHORT#' appears in 017_PolWpart2, I want its Sum_MTD_AGG to become the WithdrawalsYTD of the corresponding 'SHORT#' in 017_WithdrawalsYTD_changelater. The update query below just leaves zeroes, even though the source values are in fact non-zero.
So say 017_WithdrawalsYTD_changelater looks like this:
SHORT# WithdrawalsYTD
1 0
2 0
3 0
4 0
5 0
and 017_PolWpart2 looks like this:
SHORT# Sum_MTD_AGG
3 50
5 12
I want this:
SHORT# WithdrawalsYTD
1 0
2 0
3 50
4 0
5 12
But I get this:
SHORT# WithdrawalsYTD
1 0
2 0
3 0
4 0
5 0
I have attached the SQL for the Query below.
Thanks!
UPDATE 017_WithdrawalsYTD_changelater
INNER JOIN 017b_PolWpart2 ON [017_WithdrawalsYTD_changelater].[SHORT#] =
[017b_PolWpart2].[SHORT#]
SET [017_WithdrawalsYTD_changelater].WithdrawalsYTD = [017b_PolWpart2].[Sum_MTD_AGG];
EDIT:
As I must aggregate on the fly, I have tried to do so. I am still getting all kinds of errors. Note that the table 017a_PolicyWithdrawalMatch is of the form:
SHORT# MTG_AGG WithdrawalPeriod PolDurY
1 3 1 1
1 5 1 0
2 2 1 1
2 22 1 1
So I aggregate:
SHORT# MTG_AGG
1 3
2 24
And put these aggregated values in 017_WithdrawalsYTD_changelater.
I tried to do this like so:
SELECT [017a_PolicyWithdrawalMatch].[SHORT#], Sum([017a_PolicyWithdrawalMatch].MTD_AGG) AS Sum_MTD_AGG
WHERE ((([017a_PolicyWithdrawalMatch].WithdrawalPeriod)=[017a_PolicyWithdrawalMatch].[PolDurY]))
GROUP BY [017a_PolicyWithdrawalMatch].[SHORT#]
UPDATE 017_WithdrawalsYTD_changelater INNER JOIN 017a_PolicyWithdrawalMatch ON [017_WithdrawalsYTD_changelater].[SHORT#] = [017a_PolicyWithdrawalMatch].[SHORT#] SET 017_WithdrawalsYTD_changelater.WithdrawalsYTD =Sum_MTD_AGG;
I am getting no luck... I get told SELECT statement is using a reserved word... :(
Consider heeding @June7's comments and avoid saving aggregate data in a table: it uses storage redundantly, since such data can easily be queried in real time. Plus, such aggregate values become historical figures the moment they are saved in a static table.
In MS Access, update queries must be sourced from updateable objects, and aggregate queries, being read-only, are not. Hence, they cannot be used in UPDATE statements.
However, if you really, really, really need to store aggregate data, consider using domain functions such as DSUM inside the UPDATE. The statement below assumes SHORT# is a string column; Nz() turns the Null that DSUM returns for a SHORT# with no matching rows into 0.
UPDATE [017_WithdrawalsYTD_changelater] c
SET c.WithdrawalsYTD = Nz(DSUM("MTD_AGG", "[017a_PolicyWithdrawalMatch]",
"[SHORT#] = '" & c.[SHORT#] & "' AND WithdrawalPeriod = [PolDurY]"), 0)
Nonetheless, the aggregate value can be queried and refreshed to current values as needed. Also, notice the use of table aliases to reduce length of long table names:
SELECT m.[SHORT#], SUM(m.MTD_AGG) AS Sum_MTD_AGG
FROM [017a_PolicyWithdrawalMatch] m
WHERE m.WithdrawalPeriod = m.[PolDurY]
GROUP BY m.[SHORT#]
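If the goal is only to see the current total next to every SHORT#, the aggregate can be left-joined as a derived table in a plain SELECT, which is allowed because nothing is being updated. This is a sketch under the assumption that every SHORT# of interest has a row in 017_WithdrawalsYTD_changelater; the aliases c and a are arbitrary. Access SQL has no comment syntax, so the explanation stays outside the statement: the inner query is the aggregate shown above, and Nz() supplies 0 where a SHORT# has no matching withdrawals.

```sql
SELECT c.[SHORT#], Nz(a.Sum_MTD_AGG, 0) AS WithdrawalsYTD
FROM [017_WithdrawalsYTD_changelater] AS c
LEFT JOIN (
    SELECT m.[SHORT#], SUM(m.MTD_AGG) AS Sum_MTD_AGG
    FROM [017a_PolicyWithdrawalMatch] AS m
    WHERE m.WithdrawalPeriod = m.[PolDurY]
    GROUP BY m.[SHORT#]
) AS a ON c.[SHORT#] = a.[SHORT#];
```

Saved as a query, this always reflects the current data and avoids the stale-copy problem described above.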
I've been tasked with coming up with a solution for a problem that was found this morning. I have a query that I need to do some math with. I have three pertinent columns.
SELECT lQ.[QUANTITY], lQ.[FORM_FACTOR_ID], oQ.[INDIVIDUAL_PACKAGING]
FROM [dbo].[AOF_ORDER_LINE_QUEUE] as lQ
LEFT JOIN [dbo].[AOF_ORDER_QUEUE] AS oQ
ON lQ.[SALES_ORDER_NUMBER] = oQ.[SALES_ORDER_NUMBER]
I can see myself doing this in a loop easily in languages I know best. It doesn't seem that looping is a good thing to do in SQL based on some preliminary research so I am reaching out for suggestions.
I need to output a total value which is a conditional sum of lQ.[QUANTITY]. If oQ.[FORM_FACTOR_ID] is equal to 1, the output for that row is the value of lQ.[QUANTITY]. If oQ.[FORM_FACTOR_ID] is equal to 2 and oQ.[INDIVIDUAL_PACKAGING] is true, the output for that row is again lQ.[QUANTITY]; if it is false, the output for that row is lQ.[QUANTITY] divided by 2. The final output needs to be a single integer.
QUANTITY FORM_FACTOR_ID INDIVIDUAL_PACKAGING
4 2 1
5 1 1
I would need a query that outputs the value 7 for the above table.
QUANTITY FORM_FACTOR_ID INDIVIDUAL_PACKAGING
4 2 0
5 2 0
That same query needs to output 5 for the above table.
What would be the best way to go about doing this?
If I understand the question correctly, you just want conditional aggregation -- a CASE as an argument to SUM().
If I follow the logic, it would look like:
SELECT SUM(CASE WHEN oQ.FORM_FACTOR_ID = 1 THEN lQ.QUANTITY
WHEN oQ.FORM_FACTOR_ID = 2 AND oQ.INDIVIDUAL_PACKAGING = 1 THEN lQ.QUANTITY
WHEN oQ.FORM_FACTOR_ID = 2 AND oQ.INDIVIDUAL_PACKAGING = 0 THEN lQ.QUANTITY / 2
END)
FROM [dbo].[AOF_ORDER_LINE_QUEUE] lQ LEFT JOIN
[dbo].[AOF_ORDER_QUEUE] oQ
ON lQ.[SALES_ORDER_NUMBER] = oQ.[SALES_ORDER_NUMBER];
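One thing to watch for: if QUANTITY is an integer column, SQL Server performs integer division, so 5 / 2 yields 2 and the second example in the question would total 4 rather than 5. The sketch below shows one possible fix, assuming the halves should be summed exactly and the total rounded up. That rounding rule is a guess based on the expected output of 5; it is not stated in the question. The inline VALUES rows stand in for the joined tables.

```sql
-- Divide by 2.0 to force decimal arithmetic, then round the total up.
SELECT CAST(CEILING(SUM(
           CASE WHEN t.FORM_FACTOR_ID = 1 THEN t.QUANTITY
                WHEN t.FORM_FACTOR_ID = 2 AND t.INDIVIDUAL_PACKAGING = 1 THEN t.QUANTITY
                WHEN t.FORM_FACTOR_ID = 2 AND t.INDIVIDUAL_PACKAGING = 0 THEN t.QUANTITY / 2.0
           END)) AS INT) AS Total
FROM (VALUES (4, 2, 0),
             (5, 2, 0)) AS t (QUANTITY, FORM_FACTOR_ID, INDIVIDUAL_PACKAGING);
-- 4 / 2.0 + 5 / 2.0 = 4.5, CEILING gives 5
```

If each row should be rounded individually instead, move the CEILING inside the SUM.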
How do I output the data stored in a table so that, within each account number, rows with the same CPT are grouped together and the ones that do not match fall to the bottom of the list?
I have one table: select * from CPTCounts, and this is what it displays.
Format (relevant fields only):
account  OriginalCPT  Count  ModifiedCPT  Count
11                    0      71010        1
11       71010        1                   0
2                     0      71010        1
2                     0      71020        9
2                     0      73130        1
2                     0      77800        1
2        71010        1                   0
2        71020        8                   0
2        73130        1                   0
2        73610        1                   0
2        G0202        4                   0
31       99010        1                   0
31                    0      99010        4
31                    0      99700        2
What I want the results to be grouped like is below... and display like this or similar.
Account  OriginalCPT  Count  ModifiedCPT  Count
11       71010        1      71010        1
2        71010        1      71010        1
2        71020        8                   0
2        73130        1                   0
2        73610        1                   0
2        G0202        4                   0
31       99010        1      99010        4
31                    0      99700        2
I have one table with the values above;
Select * from #CPTCounts
The grouping I am looking for is Original CPT = Modified CPT. Sometimes there will be no value on one side or the other, but most of the time there will be a match. I would like to place all of the unmatched rows at the bottom of each account.
Any suggestions?
I was thinking of creating a second table and joining the two on the account, but how do I return each value?
select cpt1.account, cpt1.originalCPT, cpt1.count, cpt2.modifiedcpt, cpt2.count
from #cptcounts cpt1
join #cptcounts cpt2 on cpt1.account = cpt2.account
but am having trouble with that solution.
I'm not sure I have an exact solution, but perhaps some food for thought at least. The fact that you need either the "original" or the "modified" set of columns makes me think that you need a full outer join rather than an inner or left join. You don't mention which database you are using. In MySQL, for example, full joins can be emulated by means of a union of a left and a right join, as in the following:
select cpt1.account, cpt1.originalCPT, cpt1.countO, cpt2.modifiedcpt, cpt2.countM
from CPTCounts cpt1
left outer join CPTCounts cpt2
on cpt1.account = cpt2.account
and cpt1.originalCPT=cpt2.modifiedCPT
where cpt1.account is not null
and (cpt1.originalCPT is not null or cpt2.modifiedCPT is not null)
union
select cpt1.account, cpt1.originalCPT, cpt1.countO, cpt2.modifiedcpt, cpt2.countM
from CPTCounts cpt1
right outer join CPTCounts cpt2
on cpt1.account = cpt2.account
and cpt1.originalCPT=cpt2.modifiedCPT
where cpt2.account is not null
and (cpt1.originalCPT is not null or cpt2.modifiedCPT is not null)
order by originalCPT, modifiedCPT, account
The ordering brings the non-matching rows to the top, but that seemed a lesser problem than getting the matching to work.
(Your output data is a bit confusing, because the CPT 71020, for example, occurs in both original and modified columns, but you haven't shown it as one of the matching ones in your result set. I'm presuming this is because it is just an example... but if I'm wrong, then I am missing some part of your intention.)
You can play around in this SQL Fiddle.
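On a database that supports FULL OUTER JOIN directly (the #CPTCounts temp-table name in the question suggests SQL Server), the same idea can be sketched as below, with an ORDER BY that pushes unmatched rows to the bottom of each account. It assumes the blank cells are stored as NULL and reuses the countO and countM column names from the query above; adjust both to the real schema.

```sql
SELECT COALESCE(o.account, m.account) AS account,
       o.originalCPT, o.countO,
       m.modifiedCPT, m.countM
FROM (SELECT account, originalCPT, countO
      FROM #CPTCounts
      WHERE originalCPT IS NOT NULL) AS o          -- "original" rows only
FULL OUTER JOIN
     (SELECT account, modifiedCPT, countM
      FROM #CPTCounts
      WHERE modifiedCPT IS NOT NULL) AS m          -- "modified" rows only
  ON  o.account     = m.account
  AND o.originalCPT = m.modifiedCPT
ORDER BY COALESCE(o.account, m.account),
         -- matched pairs sort first (0), unmatched rows last (1)
         CASE WHEN o.originalCPT IS NOT NULL
               AND m.modifiedCPT IS NOT NULL THEN 0 ELSE 1 END,
         COALESCE(o.originalCPT, m.modifiedCPT);
```

Splitting the table into its "original" and "modified" halves before joining keeps each source row from appearing on both sides of the result.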
I wrote the following SQL to create a column that I can use to populate check boxes in a Grid to manage user permissions.
SELECT access_b2b.access_id,
access_b2b.description,
'active'= CASE
WHEN access_group.group_id IS NOT NULL THEN 1
ELSE 0
END
FROM access_b2b
LEFT JOIN access_group
ON access_group.access_id = access_b2b.access_id
WHERE ( access_group.group_id = 10
OR access_group.group_id IS NULL )
However, it does not select all of the entries from access_b2b. The issue is with the last line:
where (access_group.group_id=10 or access_group.group_id is null)
Without it, I get duplicate entries returned with different active values. Also, I realized that this is not the proper condition: an entry in access_group might exist for a different group_id, in which case the access_group.group_id IS NULL test does not bring in the remaining entries.
I am trying to write my condition so that it does something along the lines of the following:
Where For Each unique access_id in access_group
select the one where group_id=10
if no group_id=10
select any other one
end
end
Ultimately, the goal is to have a column returned with 1 or 0 denoting if the access_id exists for a predetermined group id.
Please note that throughout this explanation I used group_id=10 for simplification, it will be later replaced with a SqlParameter.
Any help is appreciated, thank you so much!
SAMPLE DATA (only useful columns shown to simplify data)
access_group
access_id group_id
27 1
27 11
28 1
28 11
33 1
33 3
33 11
43 11
44 1
44 10
44 11
...
access_b2b
access_id description
1 Add
2 Edit
3 Delete
4 List
5 Payments
6 Open Files
7 Order
8 Mod
...
Change the query to the following and it should work:
SELECT access_b2b.access_id,
access_b2b.description,
'active'= CASE
WHEN access_group.group_id IS NOT NULL THEN 1
ELSE 0
END
FROM access_b2b
LEFT JOIN access_group
ON access_group.access_id = access_b2b.access_id
AND access_group.group_id = 10
If you don't want the records to be filtered out by the WHERE clause, move the condition into the JOIN's ON clause.
The LEFT JOIN keeps every access_b2b row and fills the access_group columns with NULL when the condition is not met, while the WHERE clause filters the joined result set.
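An alternative sketch that sidesteps the join altogether is an EXISTS test per access_b2b row. It returns exactly one row per access_id even if access_group ever contains duplicate (access_id, group_id) pairs. The @group_id parameter name is hypothetical and stands in for the SqlParameter mentioned in the question.

```sql
SELECT b.access_id,
       b.description,
       CASE WHEN EXISTS (SELECT 1
                         FROM access_group AS g
                         WHERE g.access_id = b.access_id
                           AND g.group_id  = @group_id)  -- e.g. 10
            THEN 1
            ELSE 0
       END AS active
FROM access_b2b AS b;
```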