Query function Google Sheet - Aggregation + (unwanted) Sorting - sql

I am trying to do what started as a simple task but turned out to be more complicated.
I need to sum one column for each distinct value of another column, using the QUERY function.
The issue is that QUERY sorts the grouped column (working weeks, stored as strings) in a way I do not want, and I cannot get it to keep or restore the original order.
Initial query is:
=query(A1:B350,"select A, sum(B) group by A")
Subsequently I tried with:
=query(A1:B350,"select A, sum(B) where A matches '"&join("|", query(G2:G, "select G where G is not null"))& "' group by A")
but the unwanted sorting remains.
Any idea how to force the initial order, or prevent it from changing?
Thank you in advance

To sort correctly, you need to zero-pad the single-digit numbers so that text order matches numeric order. You can do this either in the source data or using a formula:
=QUERY({INDEX(REGEXREPLACE(A:A,"-(\d)$","-0$1")),B:B},"SELECT Col1, SUM(Col2) GROUP BY Col1")
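To see why the padding matters, here is a minimal Python sketch of the same regular-expression replacement. The week labels such as "2021-9" are an assumed format (the question only says the weeks are strings), and `pad_week` is a hypothetical helper name.

```python
import re


def pad_week(label: str) -> str:
    """Zero-pad a trailing single-digit number: '2021-9' -> '2021-09'."""
    # Same pattern as the sheet formula: a dash followed by exactly one
    # digit at the end of the string.
    return re.sub(r"-(\d)$", r"-0\1", label)


# Assumed label format; the real sheet may use a different one.
weeks = ["2021-8", "2021-9", "2021-10", "2021-11"]

# Text sorting compares character by character, so "10" lands before "8".
unpadded_order = sorted(weeks)

# After padding, text order and numeric order agree.
padded_order = sorted(pad_week(w) for w in weeks)

print(unpadded_order)
print(padded_order)
```

Labels that already have two digits are left untouched, because the pattern only matches a single digit right before the end of the string.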

try:
=INDEX(IFNA(VLOOKUP(G2:G,
QUERY(A1:B350, "select A,sum(B) group by A label sum(B)''"), {1, 2}, 0)))
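The idea behind this formula is to aggregate first and then look the totals up in the order given by a reference list (column G in the question). A rough Python sketch of that idea follows; the sample keys and the function name `sum_in_given_order` are made up for illustration.

```python
from collections import defaultdict


def sum_in_given_order(rows, order):
    """Sum values per key, then return (key, total) pairs in `order`."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    # Look each key up in the aggregated totals, following the reference
    # order. Keys without data are skipped here; the sheet formula leaves
    # a blank row for them instead.
    return [(key, totals[key]) for key in order if key in totals]


# Hypothetical data: (week label, amount) pairs in arbitrary order.
rows = [("W9", 2), ("W10", 3), ("W9", 1), ("W10", 4)]
order = ["W9", "W10", "W11"]

result = sum_in_given_order(rows, order)
print(result)
```

Because the output order comes from the reference list rather than from the grouping step, whatever order the aggregation produces no longer matters.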

Related

How to create a pivot query together with group by statement with multiple columns

I want to build the structure shown in the second image below, using a pivot together with GROUP BY on multiple columns.
What would the SQL implementation be?
The source query is:
SELECT
t1.date,
t1.area,
t1.canal,
SUM(t1.peso) AS peso
FROM table1 t1
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3
The source query generates an initial structure like this:
The goal is to reach a final structure grouped by the columns "area" and "canal", pivoting the column "date" and aggregating only the column "peso".
It should also include a partial total for each area, named "total",
as illustrated in the image below.
After 24 long hours and a quick nap, I finally got an answer:
a simple and straightforward line, using PySpark.
df = dfa.groupBy("area", "canal").pivot("date").sum("peso")
Thanks to #andrew-svds and the Warmup folder of his GitHub repository:
https://github.com/andrew-svds/spark-pivot-examples/tree/master/0-Warmup
The complete chunk is given below, for reference:
qry = """
SELECT
t1.date,
t1.area,
t1.canal,
t1.peso
FROM table1 t1
"""
dfa = spark.sql(qry)
dfy = dfa.groupBy("area", "canal").pivot("date").sum("peso")
dfy = dfy.orderBy("area", "canal")
display(dfy)
I believe there are many ways to get the same result. This one was the simplest and most intuitive that I was able to write.
Perhaps tomorrow, after a good night's sleep, I'll get an even simpler line of code! :)
Best wishes,
I
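For readers who want the pivot in plain SQL rather than PySpark, one common approach is conditional aggregation: one SUM(CASE ...) column per date. The sketch below is only an illustration run against SQLite from Python, with invented sample rows and column aliases (d0, d1, ...) that are not part of the original question. It does not add the per-area "total" row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (date TEXT, area TEXT, canal TEXT, peso REAL)")
# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        ("2020-01", "north", "web", 10.0),
        ("2020-02", "north", "web", 5.0),
        ("2020-01", "north", "shop", 2.0),
        ("2020-01", "south", "web", 7.0),
        ("2020-01", "south", "web", 1.0),
    ],
)

# The pivoted columns must be known before the query is built.
dates = [row[0] for row in conn.execute("SELECT DISTINCT date FROM table1 ORDER BY date")]

# One conditional sum per date; the date values are bound as parameters.
columns = ", ".join(
    f"SUM(CASE WHEN date = ? THEN peso END) AS d{i}" for i in range(len(dates))
)
sql = (
    f"SELECT area, canal, {columns} FROM table1 "
    "GROUP BY area, canal ORDER BY area, canal"
)

header = ["area", "canal"] + dates
pivot = conn.execute(sql, dates).fetchall()

print(header)
for row in pivot:
    print(row)
```

Combinations with no rows for a given date come back as NULL (None in Python), which matches how the PySpark pivot leaves missing cells empty.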

How to select all data from table but only display date-specific rows within DATE-data type column, in Oracle SQL?

I'm having trouble writing a query that returns all columns of a table, limited to the rows whose DATE-type "enroll_date" column contains '30-Jan-07'. The closest I have come is the query below, but it returns no data and shows only the one column rather than the entire table, which leads me to believe that this is not just an issue with my approach but perhaps a formatting issue as well.
SELECT TO_DATE(enroll_date, 'DD-MM-YY')
FROM student.enrollment
WHERE enroll_date= '30-Jan-07';
Again, I need to display all columns, but only the rows for the date '30-Jan-07'. I'm sure a nested solution is ideal and somehow the right one, but unfortunately my chops aren't there yet. I'm working on it! :D
UPDATE
Please see the attached screenshot of the output. The query should retrieve all columns and rows enclosed within the red rectangle. Thank you!
One possible problem is that the date column has a time component (the default display format usually hides it). One method is to use TRUNC():
SELECT e.*
FROM student.enrollment e
WHERE TRUNC(e.enroll_date) = DATE '2007-01-30';
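The effect of a hidden time component is easy to reproduce outside the database. This small Python sketch uses a made-up stored value of 14:25 on the target day; it shows why a plain equality test finds nothing, while truncating to the day (or testing a half-open range) matches.

```python
from datetime import date, datetime, timedelta

# Hypothetical stored value: the right day, but with a time of day attached.
enrolled = datetime(2007, 1, 30, 14, 25)
target = date(2007, 1, 30)

# A bare date in a comparison behaves like midnight of that day.
midnight = datetime.combine(target, datetime.min.time())

# Equality against midnight fails because of the 14:25 time part.
exact_match = enrolled == midnight

# Dropping the time part first (what TRUNC does) makes the days compare equal.
trunc_match = enrolled.date() == target

# Alternative: a half-open range covering the whole day.
range_match = midnight <= enrolled < midnight + timedelta(days=1)

print(exact_match, trunc_match, range_match)
```

The range form is the usual alternative when you would rather not wrap the column in a function.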
You can specify whichever columns you want in the following query:
SELECT col1, col2, col3, ...
FROM student.enrollment
WHERE TO_CHAR(enroll_date, 'DD-MON-YY') = '30-JAN-07';

Google Query Set Field As

What I'm trying to do is create a compound table using Google Spreadsheets.
Works...
=query('Sheet1'!A:F, "select C")
Fails...
=query('Sheet1'!A:F, "select C,(C/D) as PerItemCost")
I'm trying to use the Query function because I'm being lazy. I'd rather not add a G field to Sheet1 that's C/D. If I forget to update it, my report sheet won't show the correct values.
Is there a way to use the traditional (Select column & ' text ' & column) AS NewColumn in Google Spreadsheets?
I finally figured this one out after some time.
You'll use the following format:
Label Field 'Title',Field 'Title'
So, normally it's
=query(A:F, "Select A,B,C Label A 'Pinky', B 'Brain', C 'Narf'")
The output will be results with Column A labeled as Pinky, Column B labeled as Brain, and so forth.
For my use, I needed something far more complex as I'm calculating fields. So, I needed to reference not just the field, but the calculation I was applying to it. See example below.
Example:
=query(A:F,"Select Year(A),Month(A)+1,Day(A),C/D Label C/D 'Cost per Unit', Year(A) 'Year', Month(A)+1 'Month',Day(A) 'Day'")
I was hoping that Google would've made it more intuitive, but c'est la vie.

Why is this query asking for a parameter value?

I'm working in Access 2010. I have the following query (which is named bird_year_species):
SELECT sub.Species, Min(sub.obs_year) AS First_sighting_year
FROM (SELECT DISTINCT [Genus_BiLE] & " " & [Species_BiLE] AS Species, [Year_BiLE] AS obs_year
FROM BiLE_Bound
UNION ALL
SELECT DISTINCT [Genus_BiMN] & " " & [Species_BiMN] AS Species, [Year_BiMN] AS obs_year
FROM BiMN_Bound
UNION ALL
SELECT DISTINCT [Genus_BiPC] & " " & [Species_BiPC] AS Species, [Year_BiPC] AS obs_year
FROM BiPC_Bound
UNION ALL
SELECT Distinct [Genus_BiOP] & " " & [Species_BiOP] AS Species, [Year_BiOP] AS obs_year
FROM BiOP_Rec
) AS sub
GROUP BY sub.Species;
When I open it I get a popup asking for a parameter value for Query1.obs_year. If I just fill in anything and hit okay the table pops up and it works. I've no idea why this is happening, and the query is not named Query1.
I tried copying the code into a new query. I tried compacting and repairing. I tried saving the database under another name. None of which worked.
Eventually I opened the query, switched out of SQL view and into design view, back to SQL view and voila, that appeared to do it. So strange.
When this happens, check your properties, specifically look for references to Queryx in the filter and order by fields.
I had the same problem, and the OP's solution didn't work for me because it was caused by something else.
In case someone else is in my situation and stumbles upon this question, I'm going to post the cause and the solution.
It may be something really basic, as I'm a beginner at using Access, but it may help someone out there.
I had a query with both a GROUP BY and some calculated columns,
something like:
Select A, B, (A*B) AS C, (A+B) AS D
FROM table
GROUP BY A, B
I then added, from the design view, another calculated field based on C and D, and I started getting asked for C and D when running the query.
Looking at the SQL, the design view had produced a wrong query:
Select A, B, (A*B) AS C, (A+B) AS D, (D*C) as E
FROM table
GROUP BY A, B, [D]*[C]
Notice how the expression was added to the group by?
That's why it was asking for the value when running the query.
SOLUTION
Remove the unnecessary grouping from the sql
or
From the design view change the Total value of the column from Group By to Expression

Parameters in Microsoft Access

I'm really confused with how parameters work in Microsoft Access. I know that parameters are supposed to be used to allow a user to type in values when the query is run - instead of having to modify the query for each instance.
So, let's use the following example.
SELECT countyTable.countyName, Sqr((69.1*(46.47-avgLatitude))^2+(69.1*(-90.17-avgLongitude)*Cos(avgLatitude/57.3))^2) as Distance
FROM countyTable
WHERE ((([avgLatitude]-5)<46.47) AND (([avgLatitude]+5)>46.47) AND (([avgLongitude]-5)<-90.17) AND (([avgLongitude]+5)>-90.17))
ORDER BY Sqr((69.1*(46.47-avgLatitude))^2+(69.1*(-90.17-avgLongitude)*Cos(avgLatitude/57.3))^2), countyTable.countyName
1) I am SELECTing a column that contains the SQR function. I also have that column named as 'Distance'. However, when I try to ORDER BY on said column - and refer to it as 'Distance' - it asks for a value instead of sorting on that column. The only way I can get the query to ORDER BY is to duplicate the expression from the SELECT line. This seems unnecessary.
2) Right now, I have some values hard-coded in. I couldn't care less about the values '57.3' and '69.1'. However, I would like to replace '46.47' with 'x2' and '-90.17' with 'y2'. How I've been trying to write this with parameters, Access asks for values for each instance of 'x2' and 'y2'. This doesn't help me at all, so I have them hard-coded in.
Any help at all? Thanks!
1) I am SELECTing a column that contains the SQR function. I also have that column named as 'Distance'. However, when I try to ORDER BY on said column - and refer to it as 'Distance' - it asks for a value instead of sorting on that column. The only way I can get the query to ORDER BY is to duplicate the expression from the SELECT line. This seems unnecessary.
Yes Access does a poor job. Every real DBMS now supports ordering by the column alias created in the SELECT clause. To do this in Access, you can either do what you are doing (repeat the expression) or subquery it, e.g.
select a,b,c
from (
select a, b, a+b as C
from sometable
) AS SUBQUERIED
order by c
2) How I've been trying to write this with parameters, Access asks for values for each instance of 'x2' and 'y2'.
You're doing it wrong. Access should prompt only once. If you have a query like this
select a, b, a+b as C
from sometable
where a > [x] and b > [x]
It will see both [x]'s as being the same - and only one prompt for both. Just make sure they are spelt exactly the same.
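The same "one name, one value" behaviour can be tried outside Access. The sketch below uses SQLite named placeholders from Python, not Access itself; the table contents are invented, and `:x` plays the role of the Access `[x]` prompt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (a INTEGER, b INTEGER)")
# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO sometable VALUES (?, ?)",
    [(1, 5), (4, 6), (7, 2), (8, 9)],
)

# :x appears twice in the query but is supplied only once.
rows = conn.execute(
    "SELECT a, b, a + b AS c FROM sometable "
    "WHERE a > :x AND b > :x ORDER BY a",
    {"x": 3},
).fetchall()

print(rows)
```

As in Access, the two occurrences are treated as the same parameter only because the names are spelled identically.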
If you wanted something like this simplified example:
SELECT
countyTable.countyName,
Sqr((69.1*(46.47-avgLatitude))^2+(69.1*(-90.17-avgLongitude)*Cos(avgLatitude/57.3))^2) as Distance
FROM countyTable
ORDER BY Distance;
For the ORDER BY you can reference that complex Distance expression by its ordinal position in the field list.
SELECT
countyTable.countyName,
Sqr((69.1*(46.47-avgLatitude))^2+(69.1*(-90.17-avgLongitude)*Cos(avgLatitude/57.3))^2) as Distance
FROM countyTable
ORDER BY 2;
That method is supported at least since Jet 4 (Access 2000), and also by the newer ACE database engine.
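Both workarounds from this thread (ordering by ordinal position and wrapping the query in a subquery) are plain SQL, so they can be tried in any engine. Here is a short sketch against SQLite from Python, with invented rows and `a + b` standing in for the long Distance expression. SQLite also accepts the alias directly in ORDER BY, so this only demonstrates the syntax, not the Access limitation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (a INTEGER, b INTEGER)")
# Invented sample rows; a + b stands in for the Distance expression.
conn.executemany("INSERT INTO sometable VALUES (?, ?)", [(5, 0), (1, 9), (3, 4)])

# Workaround 1: sort by the second column of the select list.
by_ordinal = conn.execute(
    "SELECT a, a + b AS c FROM sometable ORDER BY 2"
).fetchall()

# Workaround 2: compute the alias in a subquery, then sort on it outside.
by_subquery = conn.execute(
    "SELECT a, c FROM (SELECT a, a + b AS c FROM sometable) AS sub ORDER BY c"
).fetchall()

print(by_ordinal)
print(by_subquery)
```

Either form avoids repeating the expression in the ORDER BY clause.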