How do you replace nulls in a crosstab query with zeroes? - sql

Based on the following SQL in Access...
TRANSFORM Sum([Shape_Length]/5280) AS MILES
SELECT "ONSHORE" AS Type, Sum(qry_CurYrTrans.Miles) AS [Total Of Miles]
FROM qry_CurYrTrans
GROUP BY "ONSHORE"
PIVOT qry_CurYrTrans.QComb IN ('1_HCA_PT','2_HCA_PT','3_HCA_PT','4_HCA_PT');
... my results returned the following datasheet:
| Type | Total Of Miles | 1_HCA_PT | 2_HCA_PT | 3_HCA_PT | 4_HCA_PT |
| ONSHORE | 31.38 | | 0.30 | 7.80 | |
This result is exactly what I want except I want to see zeroes in the cells that are null.
What are some options for doing this? If possible, I'd like to avoid using a subquery. I'd also prefer the query to remain editable in Access' Design View.

I think you have to use the Nz function, which converts Nulls to another value. Its optional second argument is the value to return when the first argument is Null, so here it says, "If Sum([Shape_Length]/5280) is Null, return 0." The 0 should not need quotes; as far as I know, quoting it would return the text "0" rather than a number.
TRANSFORM Nz(Sum([Shape_Length]/5280), 0) AS MILES
SELECT "ONSHORE" AS Type, Sum(qry_CurYrTrans.Miles) AS [Total Of Miles]
FROM qry_CurYrTrans
GROUP BY "ONSHORE"
PIVOT qry_CurYrTrans.QComb IN ('1_HCA_PT','2_HCA_PT','3_HCA_PT','4_HCA_PT');
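To make the effect of Nz concrete, here is a minimal Python sketch (not Access code) of what the crosstab does: it sums miles per fixed pivot column and substitutes 0 where a column has no rows. The sample rows and the `nz` and `crosstab_row` helpers are invented for illustration only.

```python
def nz(value, value_if_null=0):
    """Mimic Access's Nz: return value_if_null when value is None (Null)."""
    return value_if_null if value is None else value


def crosstab_row(rows, columns):
    """Sum miles per pivot column; columns with no matching rows stay None."""
    sums = {col: None for col in columns}
    for qcomb, miles in rows:
        if qcomb in sums:
            sums[qcomb] = nz(sums[qcomb]) + miles
    return sums


columns = ["1_HCA_PT", "2_HCA_PT", "3_HCA_PT", "4_HCA_PT"]
# Hypothetical (QComb, miles) rows; two of the four columns have no data
rows = [("2_HCA_PT", 0.25), ("3_HCA_PT", 7.5), ("3_HCA_PT", 0.25)]

raw = crosstab_row(rows, columns)                            # empty cells are None
filled = {col: nz(total, 0) for col, total in raw.items()}  # empty cells become 0
print(raw)
print(filled)
```

The first dictionary corresponds to the datasheet with blank cells; the second corresponds to the result once Nz wraps the Sum.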

Related

SQL WHERE Clause not filtering to criteria MS-Access

I have written a query to gather the balances of two different days, find the percent difference and then display them. I added a Percent Filter section to my form to show only values that are >= the desired percentage.
When running the query, I get the results that are >= percent given. However, after the criteria is met, the results expand past and continue until 0, as if ignoring my WHERE clause. Is there something I'm not catching within my query?
Query being used:
SELECT [x].[ID], [x].[Name], [x].[Day1Date], [x].[Day1Bal], [x].[Day2Date], [x].[Day2Bal], [x].[Difference], IIf(([Day2Bal]>[Day1Bal]),((([Day2Bal]-[Day1Bal])/[Day1Bal])*100),(((([Day2Bal]-[Day1Bal])/[Day1Bal])*-1)*100)) AS PerDiff
FROM qryUnion AS x
WHERE IIf(([Day2Bal]>[Day1Bal]),((([Day2Bal]-[Day1Bal])/[Day1Bal])*100),(((([Day2Bal]-[Day1Bal])/[Day1Bal])*-1)*100)) > [Forms]![Compare]![txtPercent]
ORDER BY IIf(([Day2Bal]>[Day1Bal]),((([Day2Bal]-[Day1Bal])/[Day1Bal])*100),(((([Day2Bal]-[Day1Bal])/[Day1Bal])*-1)*100)) DESC
I have edited and re-written my IIf statement countless times but it still doesn't filter to criteria properly.
Results (Filtered for >= 10%) :
+----------+
| PerDiff |
+----------+
| 985.256 |
| 457.25 |
| 369.54 |
| 245.21 |
| 141.14 |
| 68.23 |
| 28.54 |
| 10.21454 |
| 10.1212 | <------- Criteria met
| 9.555 |
| 8.42 |
| 2.12 |
| 0.42 | <------- Ends at 0
+----------+
Obviously I'm wanting it to end at where the criteria is met, and I believe I've written my where clause to do so. I'm uncertain where else might be messing up.
qryUnion is a subquery that I wrote just to get the dates and the balances for those dates.
Any help is greatly appreciated! I'm still a bit new to SQL (and VBA for that matter). Thanks in advance!
EDIT1:
I have also tried
WHERE IIf(([Day2Bal]>[Day1Bal]),((([Day2Bal]-[Day1Bal])/[Day1Bal])*100),(((([Day2Bal]-[Day1Bal])/[Day1Bal])*-1)*100)) >= [Forms]![Compare]![txtPercent] _
AND NOT IIf(([Day2Bal]>[Day1Bal]),((([Day2Bal]-[Day1Bal])/[Day1Bal])*100),(((([Day2Bal]-[Day1Bal])/[Day1Bal])*-1)*100)) < [Forms]![Compare]![txtPercent]
The intent was to exclude any data below the given percentage, but this didn't work either. Is it possible that my WHERE clause isn't the issue? I'm uncertain where else the issue may lie.
A better answer may exist, but this will also accomplish your goal:
You can compute the PerDiff field in a subquery (or saved query) before writing the final query:
SELECT [x].[ID], [x].[Name], [x].[Day1Date], [x].[Day1Bal], [x].[Day2Date], [x].[Day2Bal], [x].[Difference], IIf(([Day2Bal]>[Day1Bal]),((([Day2Bal]-[Day1Bal])/[Day1Bal])*100),(((([Day2Bal]-[Day1Bal])/[Day1Bal])*-1)*100)) AS PerDiff
FROM qryUnion AS x
This subquery exposes the result of the IIf expression as a named column, PerDiff, which the next query can reference directly. Your final query, selecting from that subquery, could then use a WHERE clause like this:
WHERE PerDiff > [Forms]![Compare]![txtPercent]
ORDER BY PerDiff DESC
After a ton of troubleshooting, it seems my issue was the * 100 within my IIf statement.
SQL that worked (I also added /100 to the form value at the end of the WHERE clause, so that entering 10 filters at 10%):
SELECT [x].[DDANbr], [x].[Name], [x].[Day1Date], [x].[Day1Bal], [x].[Day2Date], [x].[Day2Bal], [x].[Difference], IIf(([Day2Bal]>[Day1Bal]),((([Day2Bal]-[Day1Bal])/[Day1Bal])),(((([Day2Bal]-[Day1Bal])/[Day1Bal])*-1))) AS PerDiff
FROM qry250CapAllCompare_Union AS x
WHERE IIf(([Day2Bal]>[Day1Bal]),((([Day2Bal]-[Day1Bal])/[Day1Bal])),(((([Day2Bal]-[Day1Bal])/[Day1Bal])*-1)))>=Forms!frmCompare!txtPercent/100
ORDER BY IIf(([Day2Bal]>[Day1Bal]),((([Day2Bal]-[Day1Bal])/[Day1Bal])),(((([Day2Bal]-[Day1Bal])/[Day1Bal])*-1))) DESC
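For readers who want to check the arithmetic outside Access, here is a rough Python sketch of the same filter. It assumes Day1Bal is positive (so the IIf is equivalent to an absolute value) and that the textbox value arrives as text and should be converted to a number before comparing; the function names and sample rows are made up.

```python
def pct_diff(day1_bal, day2_bal):
    """Absolute relative change as a fraction (0.10 means 10%)."""
    return abs(day2_bal - day1_bal) / day1_bal


def filter_by_percent(rows, percent_text):
    """Keep rows whose change is >= the entered percentage, largest first."""
    threshold = float(percent_text) / 100  # form text "10" becomes 0.10
    scored = [(name, pct_diff(d1, d2)) for name, d1, d2 in rows]
    kept = [(name, p) for name, p in scored if p >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical (Name, Day1Bal, Day2Bal) rows
rows = [("A", 100.0, 150.0), ("B", 200.0, 190.0), ("C", 100.0, 80.0)]
result = filter_by_percent(rows, "10")
print(result)
```

Row B changes by only 5%, so it is dropped; A (50%) and C (20%) remain, sorted in descending order.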

SQL Statement rows to Columns

I have a Table in MS Access like this:
The Columns are:
--------------------------------------
| *Date* | *Article* | *Distance* | Value |
---------------------------------------
Date, Article and Distance together form the primary key, so their combination is always unique.
The column Distance has discrete values from 0 to 27.
I need to transform this table into a table like this:
----------
| *Date* | *Article* | Value from Distance 0| Value Dis. 1|...|Value Dis. 27|
----------
I really don't know a SQL Statement for this task. I needed a really fast solution which is why I wrote an Excel macro which worked fine but was very inefficient and needed several hours to complete. Now that the amount of data is 10 times higher, I can't use this macro anymore.
You can try the following pivot query (Date and Value are reserved words in Access, so they are bracketed here):
SELECT
    [Date],
    Article,
    MAX(IIF(Distance = 0, [Value], NULL)) AS val_0,
    MAX(IIF(Distance = 1, [Value], NULL)) AS val_1,
    ...
    MAX(IIF(Distance = 27, [Value], NULL)) AS val_27
FROM yourTable
GROUP BY
    [Date],
    Article
Note that Access does not support CASE expressions, but it does offer a function called IIF() which takes the form of:
IIF(condition, value if true, value if false)
which essentially behaves the same way as CASE in other RDBMS.
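Typing 28 nearly identical MAX(IIf(...)) lines by hand is error-prone, so one option is to generate the statement. The following Python sketch only builds the SQL text; `build_pivot_sql` and its parameters are hypothetical names, `yourTable` is the placeholder from the answer, and the square brackets around Date and Value are a precaution because both are reserved words in Access.

```python
def build_pivot_sql(table="yourTable", max_distance=27):
    """Build the conditional-aggregation pivot for Distance 0..max_distance."""
    value_columns = [
        f"    MAX(IIf([Distance] = {d}, [Value], NULL)) AS val_{d}"
        for d in range(max_distance + 1)
    ]
    select_list = ",\n".join(["    [Date]", "    [Article]"] + value_columns)
    return (
        f"SELECT\n{select_list}\n"
        f"FROM {table}\n"
        "GROUP BY\n    [Date],\n    [Article];"
    )


sql = build_pivot_sql()
print(sql)
```

The printed text can be pasted into the SQL view of an Access query.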

Access query, if two values exist in one column, omit one

I have a series of queries that generate reports that contain chemical data. There are two compounds A and B where A is the total amount and B is a speciated amount (like total iron and ferrous iron, for example).
There are about one hundred compounds in total in the query result, and I need a criterion to filter the results such that if both Compounds A and B are present, only Compound B is displayed. So far I've tried adding a few IIf expressions to the criteria section in the query builder with no luck.
Here is what I have so far:
SELECT Table1.KEY_ANLT
FROM Table1
WHERE (((Table1.KEY_ANLT)=IIf([Table1].[KEY_ANLT]=1223 And [Table1].[KEY_ANLT]=70,70,1223)));
This filters out Compound A but does not include the rest of the compounds. How can I modify the query to also include the other compounds?
So, to clarify some of the comments above, the problem here is you don't have (or haven't specified above) a way to identify values that go together. You gave 70 and 1223 as an example, but if you gave us a list of all the numbers, how would we be able to identify which ones go together? You might say "chemistry expertise", but that's based on another column with the compounds' names, right? So really, your query should use that column. But then there's still the problem of how to connect associated names (e.g., "total iron" and "ferrous iron" might be connected because they both have the word "iron", but what about "permanganate" and "manganese"?). In short, you need another column to specify the thing in common between these separate rows, whether it's element, ion, charge, etc. You would also need a column identifying which row in each "group" you would want to include in your query (or, which ones to exclude). For example:
+----------+-----------------+---------+---------+
| KEY_ANLT | Compound | Element | Primary |
+----------+-----------------+---------+---------+
| 70 | total iron | Fe | Y |
| 1223 | ferrous iron | Fe | |
| 1224 | ferric iron | Fe | |
| 900 | total manganese | Mn | Y |
| 901 | permanganate | Mn | |
+----------+-----------------+---------+---------+
Then, to get a query that shows just the "primary" rows, it's pretty trivial:
SELECT * FROM Table1 WHERE [Primary]='Y';
Without that [Primary] column, you'd have to decide how to choose each row. Perhaps you'd want the one with the smallest KEY_ANLT?
SELECT Table1.*
FROM
(SELECT Element, min(KEY_ANLT) AS MinKey FROM Table1 GROUP BY Element) AS Subquery
INNER JOIN Table1 ON
Subquery.Element=Table1.Element AND
Subquery.MinKey=Table1.KEY_ANLT
The reason your query doesn't work is that the WHERE clause operates row-by-row, and doesn't compare different rows to one another. So in your SQL:
IIf([Table1].[KEY_ANLT]=1223 And [Table1].[KEY_ANLT]=70,70,1223)
NONE of the rows will evaluate this as 70, because no single row has KEY_ANLT=1223 AND KEY_ANLT=70. Each row only has one value for KEY_ANLT. So then that IIF expression evaluates as 1223 for every row, and your condition will only return rows where KEY_ANLT=1223 (compound B).
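The row-by-row point can be checked with a small Python sketch. The first function has the same shape as the IIf in the question; the second looks at the whole set of keys before deciding, which is what a per-row WHERE condition cannot do. The key values 70 and 1223 come from the question; the other keys and the function names are invented, and treating the whole list as one "result set" is an assumption.

```python
TOTAL, SPECIATED = 70, 1223  # Compound A and Compound B from the question


def row_condition(key):
    """Per-row test: a single key can never equal both 1223 and 70."""
    target = 70 if (key == 1223 and key == 70) else 1223
    return key == target


def drop_total_when_speciated_present(keys):
    """Set-level test: remove A only when both A and B occur in the result."""
    present = set(keys)
    if TOTAL in present and SPECIATED in present:
        return [k for k in keys if k != TOTAL]
    return list(keys)


keys = [70, 1223, 900, 55]  # hypothetical KEY_ANLT values
per_row = [k for k in keys if row_condition(k)]
set_level = drop_total_when_speciated_present(keys)
print(per_row)
print(set_level)
```

The per-row version keeps only 1223, matching the behaviour described in the question, while the set-level version keeps everything except 70.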

How do I format numbers in an Access crosstab query to show two decimal places?

I have an Access crosstab query that displays the following results:
| SHORE_TYPE | Total Miles | Class 1 | Class 2 | Class 4 |
| ONSHORE | 31.37 | 0.337121212121212 | 12.4617424242424 | 0 |
I'd like it to display the following results instead. Note the 'Class' columns here show two decimal places:
| SHORE_TYPE | Total Miles | Class 1 | Class 2 | Class 4 |
| ONSHORE | 31.37 | 0.34 | 12.46 | 0.00 |
I've been able to configure the 'Total Miles' column by changing the Format and Decimal Places properties (in the Design View) to "Fixed" and "2," respectively. However, the query column (in Design View) that determines the value in the Class column has only a Format property, which I set to "Fixed"; there is not a Decimal Places property for me to adjust.
I have some similar crosstab queries that are showing the results in the way I desire, but I can't determine any differences between this one and those. Also, I've sometimes seen some of my queries display it the wrong way one time, then the desired way the next time.
This makes me wonder if the problem is a bug in Access, or if there is something implicitly defined in my code that I should define explicitly.
Here is my SQL:
TRANSFORM IIf(IsNull(Sum([qryPartL].[MILES_OF_PHYS_LENGTH])),0,
Sum([qryPartL].[MILES_OF_PHYS_LENGTH])) AS SumOfMILES_OF_PHYS_LENGTH
SELECT qryPartL.SHORE_TYPE, Sum(qryPartL.MILES_OF_PHYS_LENGTH) AS [Total Miles]
FROM qryPartL
GROUP BY qryPartL.SHORE_TYPE
PIVOT qryPartL.CLASS_LOC_text In ("Class 1","Class 2","Class 4");
EDIT:
After closing and re-opening this query, the Total Miles column is now displaying 31.3714015..., and the properties I had previously set for this column in the Design View are now blank. So, it looks like Access does not consistently save these property settings. At least not in the context in which I was using them.
The trick is to use a series of nested functions.
CDbl: Converts the data to a Double number data type
FormatNumber: Returns an expression formatted as a number with a specified precision (2)
Nz: Returns the specified value (0) when a field is null
The CDbl function won't work if a value is Null.
I also removed the IIf function from the TRANSFORM clause since Nz works better in this case.
Here is the new SQL that returns the desired results. (I've added new lines and indents to make it easier to read. This is not a necessary step, and the layout may in fact not be retained by Access.)
TRANSFORM
    CDbl(
        FormatNumber(
            Nz(
                Sum([qryPartL].[MILES_OF_PHYS_LENGTH])
            ,0)
        ,2)
    ) AS SumOfMILES_OF_PHYS_LENGTH
SELECT qryPartL.SHORE_TYPE,
    CDbl(
        FormatNumber(
            Nz(
                Sum(qryPartL.MILES_OF_PHYS_LENGTH)
            ,0)
        ,2)
    ) AS [Total Miles]
FROM qryPartL
GROUP BY qryPartL.SHORE_TYPE
PIVOT qryPartL.CLASS_LOC_text In ("Class 1","Class 2","Class 4");
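The order of the nesting matters: Null must be replaced first, then the value is formatted to two places as text, then the text is converted back to a number. A loose Python analogy follows, using the values from the question; `to_two_decimals` is a made-up helper, and Python's string formatting only approximates what FormatNumber does in Access.

```python
def to_two_decimals(total):
    """Null -> 0, then format to 2 places as text, then back to a number."""
    not_null = 0 if total is None else total  # Nz(total, 0)
    as_text = f"{not_null:.2f}"               # roughly FormatNumber(..., 2)
    return float(as_text)                     # roughly CDbl(...)


class_1 = to_two_decimals(0.337121212121212)
class_2 = to_two_decimals(12.4617424242424)
class_4 = to_two_decimals(None)  # a Null sum
print(class_1, class_2, class_4)
```

Skipping the first step would fail on the Null input, which is the Python counterpart of CDbl not accepting Null.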
Thanks to Allen Browne and a tip on his awesome Access website for leading me to this answer.

PostgreSQL calculate the top places per group and other statistics

I have a table with the following structure
|user_id | place | type_of_place | money_earned| time |
|--------+-------+---------------+-------------+------|
| | | | | |
The table is very large, several millions of rows. The data is in a PostgreSQL 9.1 database.
I want to calculate, per user_id and type_of_place: the mean, the standard deviation, and the top 5 of places (ordered by counts), and the most used hour of time (mode).
The resulting data must be in this form:
| user_id | type_of_place | avg | stddev | top5_places | mode |
+---------+---------------+-----+--------+------------------+------+
| 1 | tp1 | 10 | 1 | {p1,p2,p3,p4,p5} | 8 |
| 2 | tp1 | 3 | 2 | {p3,p4} | 23 |
| 1 | tp3 | 1 | 1 | {p1} | 4 |
etc.
Is there a way of doing this efficiently with window functions?
What if I want to group by week? (i.e. another column that represents the week number)
Thank you!
A standard GROUP BY query will get you most of the way:
SELECT
user_id,
type_of_place,
avg(money_earned) AS avg,
stddev(money_earned) AS stddev
FROM
earnings -- I'm not sure what your data table is called...
GROUP BY
user_id,
type_of_place
This leaves the top5_places and mode columns. These are both also aggregates, but not ones which are defined in the standard PostgreSQL installation. Luckily, you can add them.
Here's a page discussing how to define a mode aggregate function: http://wiki.postgresql.org/wiki/Aggregate_Mode
Once you have a mode aggregate function, assuming time is a timestamp of some kind, the expression you will add to the select list will be:
SELECT
...
mode(extract(hour FROM time)) AS mode -- Add this expression
FROM
...
Assuming order by money
For top5_places, there are several approaches, but the quickest is probably to use PostgreSQL's builtin array_agg function, and take the first 5 elements:
SELECT
...
(array_agg(place ORDER BY money_earned DESC))[1:5] AS top5_places -- Add this expression
FROM
...
One alternative is to define another aggregate called (for instance) top5, which performs the same function. This could be more efficient if there are many distinct places for each user/type of place combination, since it can stop accumulating after the first 5, whereas the above expression will generally build a complete array of all places, and then truncate to the first 5.
This assumes that a place has a unique earnings entry for each user/type combination. If a place can occur more than once, and you want to sort by sum(money_earned) for each place, then you need to use a subquery like in the examples below...
Order by counts
Ok, so the places should be ordered by how often they occur. Here's a quick way, which uses a couple of subqueries -- add this as an expression to the select-clause of the above query:
(SELECT
(array_agg(place ORDER BY cnt DESC))[1:5]
FROM
(SELECT place, count(*) FROM earnings AS t2
WHERE t2.user_id = earnings.user_id AND t2.type_of_place = earnings.type_of_place
GROUP BY place) AS s (place, cnt)
) AS top5_places
The inner subquery called s evaluates to a table of each place for that user/type combination, and the number of times it occurs (which I've called cnt). These are then fed to array_agg in descending order of that count.
I suspect there could be much neater (and probably more efficient) ways of writing it. If not, then I would recommend trying to move this complicated expression into a function or aggregate, if you can...
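Histogram of places in each hour
We'll use a similar expression, which will return the array of counts, ordered by hour (descending as written; drop DESC for ascending order). Note that hours with no rows do not appear in the array: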
(SELECT
array_agg(cnt ORDER BY hour DESC)
FROM
(SELECT extract(hour FROM time), count(*) FROM earnings AS t2
WHERE t2.user_id = earnings.user_id AND t2.type_of_place = earnings.type_of_place
GROUP BY 1) AS s (hour, cnt)
) AS hourly_histogram
(Add that to the select-clause of the original query.)
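As a way to sanity-check the SQL on a small sample, the same per-group statistics can be computed in plain Python. This is only a sketch under assumptions: rows are tuples with the hour already extracted from the time column, `summarize` is a hypothetical helper, and the sample data is invented. Like PostgreSQL's stddev, `statistics.stdev` is the sample standard deviation, so a single-row group yields no value.

```python
from collections import Counter, defaultdict
from statistics import mean, stdev


def summarize(rows):
    """rows: (user_id, place, type_of_place, money_earned, hour) tuples."""
    groups = defaultdict(list)
    for user_id, place, type_of_place, money, hour in rows:
        groups[(user_id, type_of_place)].append((place, money, hour))

    result = {}
    for key, items in groups.items():
        money = [m for _, m, _ in items]
        place_counts = Counter(p for p, _, _ in items)
        hour_counts = Counter(h for _, _, h in items)
        result[key] = {
            "avg": mean(money),
            "stddev": stdev(money) if len(money) > 1 else None,
            # Places ordered by how often they occur, at most five
            "top5_places": [p for p, _ in place_counts.most_common(5)],
            # Most frequent hour (ties resolved by first occurrence)
            "mode": hour_counts.most_common(1)[0][0],
        }
    return result


rows = [  # hypothetical sample data
    (1, "p1", "tp1", 10, 8),
    (1, "p1", "tp1", 12, 8),
    (1, "p2", "tp1", 14, 9),
    (2, "p3", "tp1", 3, 23),
]
stats = summarize(rows)
print(stats)
```

Comparing this output against the query result for a handful of users is a quick way to confirm that the subqueries are correlated on the right columns.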