Retrieving Columns with count greater than 1 - Google Sheet Query - sql

I'm using Google Sheets, and I want to copy data from one sheet to another, keeping only the columns with count > 1.
Let's say we have 3 columns A, B, and C, and the first sheet is named "Form Responses 1".
I tried the following query in the second sheet: =query('Form Responses 1'!A1:Z, "Select A having count (A) >1 union select B having count (B) >1 union select C having count (C) > 1"). But I got a parse error; it seems that union and having are not supported in Google Sheets query.
How can I achieve this (whether it's using query or any other Google sheets function that can work)?
More details:
The first sheet contains info about exercises conducted during a lecture and it gets its data from a Google Form (so the responses are fed in this sheet). Here is a screenshot of it:
Please note that the form is divided into sections. The user first selects the course, the attendance, and the participation, and adds a comment. The next section depends on the selected course: it contains the exercise name and rating questions (the exercise name is a dropdown list whose items are prefilled and specific to that course). That is why the "exercise name" and "rate the exercise" columns are repeated: the form has 2 course-specific sections.
The second sheet should contain the data of a selected course only (either mobile dev or web dev) which can be achieved easily through a query with a where clause. But, in addition to that, it shouldn't contain the empty columns of "exercise name" and "rate the exercise" as they correspond to another section. So, it should have only one exercise name column and one rating column that correspond to the selected course. Here is a screenshot if we only use a query with where clause without removing the extra name and rating columns:
Here is a screenshot with the desired result:
Thanks.

Why not just use:
=QUERY('Form Responses 1'!A1:Z, "select A,B,C,D,E,F,G where F is not null", 1)

Use an "or" condition, e.g.:
=QUERY(Data!A:R, "select A, N, P where N>0 or P>0")
where column A holds the country and columns N and P hold population values.
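For comparison outside Sheets, the goal described in the question (filter rows by course, then drop the columns that are empty for that course) can be sketched in Python. This is only an illustration of the logic, not a Sheets formula; the function name, the column layout, and the sample rows are assumptions made up for the example.

```python
def select_course(rows, course_col, course):
    """Keep the rows for one course, then drop columns empty in every kept row."""
    header, *data = rows
    selected = [row for row in data if row[course_col] == course]
    # A column survives only if at least one selected row has a value in it.
    keep = [i for i in range(len(header))
            if any(row[i] != "" for row in selected)]
    return [[header[i] for i in keep]] + [[row[i] for i in keep] for row in selected]


# Hypothetical form responses: two course-specific sections repeat the
# "Exercise name" / "Rate the exercise" columns.
responses = [
    ["Course", "Comment", "Exercise name", "Rate the exercise",
     "Exercise name", "Rate the exercise"],
    ["Mobile dev", "ok", "Layouts", "4", "", ""],
    ["Web dev", "good", "", "", "Flexbox", "5"],
    ["Mobile dev", "fine", "Intents", "3", "", ""],
]

web_rows = select_course(responses, 0, "Web dev")
```

Here `web_rows` keeps a single "Exercise name" and a single "Rate the exercise" column, because the other pair is empty for every "Web dev" row.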

Related

How to combine rows in BigQuery that share a similar name

I'm having trouble creating a query that groups together responses from multiple rows that share a similar name and counts a specific response value in them.
The table I currently have looks like this:
test_control    values
test            selected
control         selected
test us         not selected
control us      selected
test mom        not selected
control mom     selected
What I'd like is an output like the one below, which counts only the "selected" responses and groups together the rows that have either "control" or "test" in the name:
test_control    values
test            3
control         1
The query I have below is wrong, as it returns nothing. The GROUP BY part is where I'm lost, as I'm not sure how to do this. I tried to google it but couldn't find anything. I appreciate any help in advance!
SELECT distinct(test_control), values FROM `total_union`
where test_control="%test%" and values="selected"
group by test_control, values
Use the query below:
SELECT
  REGEXP_EXTRACT(test_control, r'^(test|control)') AS test_control,
  COUNTIF(values = 'selected') AS values
FROM `total_union`
GROUP BY 1
As mentioned by @Mikhail Berlyant, you can use REGEXP_EXTRACT to extract the matching prefix and COUNTIF to count the rows that satisfy the given condition. Try the code below to get the expected output:
Code
SELECT
REGEXP_EXTRACT(test_control, r'^(test|control)') AS test_control,
COUNTIF(values = "selected") AS values
FROM `project.dataset.testvalues`
group by 1
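To check the grouping logic without a BigQuery project, the same idea (extract the leading "test" or "control", then count only the "selected" rows per group) can be sketched in plain Python. The function name is hypothetical, and Python's `re.match` stands in for REGEXP_EXTRACT; rows that match neither prefix are simply skipped here, whereas the SQL would group them under NULL.

```python
import re
from collections import Counter

# Sample rows copied from the question: (test_control, values).
rows = [
    ("test", "selected"),
    ("control", "selected"),
    ("test us", "not selected"),
    ("control us", "selected"),
    ("test mom", "not selected"),
    ("control mom", "selected"),
]


def count_selected(rows):
    """Count 'selected' rows per leading 'test'/'control' prefix."""
    counts = Counter()
    for name, value in rows:
        match = re.match(r"(test|control)", name)  # anchored at the start
        if match is None:
            continue  # assumption: ignore names with neither prefix
        # Adding a bool adds 1 or 0, so groups with no hits still appear.
        counts[match.group(1)] += (value == "selected")
    return dict(counts)


result = count_selected(rows)
```

On the six sample rows from the question this counts one "selected" row for test and three for control.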

Google Sheets Query WHERE clause just returns first row of data

I have a sheet which I am trying to use to show an overview of students results in a number of subjects, with each student and their results in separate rows.
OVERVIEW
studentID, English, Maths, etc
1, result1, result2, etc
2, result3, result4, etc
The result data comes from another system and is in a separate sheet. Each result for each subject is a separate row where the first column is the student ID, the second is the subject and the third is the result.
RESULTSET
1, ENGLISH, result 1
1, MATHS, result 2
etc.
I've been trying various forms of a query like this
=query(RESULTSET!A1:C,"SELECT C WHERE A = '1' AND B = 'ENGLISH'",1) but the query only ever returns the first result from the first row of data in RESULTSET.
Here is a link to a test spreadsheet containing data and queries that reproduces the issue: https://docs.google.com/spreadsheets/d/15xLAyHumL2pC8mRfA4Qs9xMyrWZvK86kmoi2kBWnB34/edit?usp=sharing
I am expecting to see results from the result set that matches each student ID and subject, but I am only ever seeing the first result irrespective of ID or subject.
Remove the single quotes around the value compared against column A (the student IDs appear to be stored as numbers, not text) and set the headers argument of query() to zero. In B2, try:
=iferror(query('result set 1'!$A$1:$C58,"SELECT C WHERE A = "&$A2&" AND B = '"&UPPER(B$1)&"'",0))
Fill to the right and down as far as needed and see if that works.
Another option would be to use in B2
=ArrayFormula(iferror(vlookup($A$2:$A$8&B$1, {'result set 1'!$A:$A&'result set 1'!$B:$B, 'result set 1'!$C:$C}, 2, 0)))
and fill to the right (make sure there is no data below row 2).
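The lookup that both formulas perform (find the result for a given student ID and subject, one cell per combination) can be sketched in Python as a dictionary keyed on the ID and subject pair, much like the concatenated key in the vlookup version. The function name and the sample rows are assumptions for illustration; the subject comparison is made case-insensitive to mirror the UPPER() call above.

```python
def pivot_results(result_rows, subjects):
    """Turn (student_id, subject, result) rows into one row per student."""
    # Key on (id, upper-cased subject), similar to the A&B key in the vlookup.
    lookup = {(sid, subject.upper()): result
              for sid, subject, result in result_rows}
    student_ids = sorted({sid for sid, _, _ in result_rows})
    return [
        [sid] + [lookup.get((sid, subject.upper()), "") for subject in subjects]
        for sid in student_ids
    ]


# Hypothetical result set in the long format described in the question.
result_set = [
    (1, "ENGLISH", "result 1"),
    (1, "MATHS", "result 2"),
    (2, "ENGLISH", "result 3"),
]

overview = pivot_results(result_set, ["English", "Maths"])
```

A missing combination (student 2 has no Maths row in this sample) yields an empty string, comparable to what the iferror wrapper produces in the sheet.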

SQL counting number of rows

I am looking for a way to search for a certain number of rows as a quality check. For example, we have tables that have a certain set of results that are needed.
Here is a quick table for an example:
ID          Name  Result  Reportable
ONE         A     10      X
TWO         B     12      X
THREE       C     1
FOUR        D     18      X
FOUR(redo)  D     11      X
So we are looking to double check results as there are people who accidentally report results multiple times (as in the case with ID FOUR). We have used having counts but we need the numbers to be specific and need a query to verify that number is satisfied.
In the table above we only want IDs ONE, TWO, and FOUR, however we have 4 results (one extra). Currently we have our check showing the count needed (ie 3) and the current result count (4) to show the mismatch but want a query to easily only show the result needed. We would need the redo result most of the time so we have set it so we take the latest date, but it doesn't help filter how many rows or results. I apologize if anything is confusing and I am not able to share the SQL query that we have currently. It's my first time posting so if I need to clarify anything please let me know as this seems to be very complicated. Thank you for your time.
EDIT: The details
We have one table (Table A) letting us know which results are reportable. The ones that are reportable go into another table (Table B). We have had issues in which people have made too many results reportable which overpopulates the Table B. Our old query had a count in Table B, but due to mistakes in people placing multiple reportables, samples which had many redos seem to be finished as they were all placed and met the count in Table B.
So now by using the Table A that helps tell us how many are Reportable, we want this to double check that the samples are indeed ready.
As I understand the question, you want ids that have multiple reportables. Assuming you really mean name, then:
select name
from t
where reportable = 'X'
group by name
having count(*) >= 2;
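A quick way to try this query is an in-memory SQLite database loaded with the example table from the question. This is a minimal sketch: the table name `t` comes from the answer, while the column types and the helper function name are assumptions.

```python
import sqlite3

# Example rows from the question; THREE has no reportable flag.
sample = [
    ("ONE", "A", 10, "X"),
    ("TWO", "B", 12, "X"),
    ("THREE", "C", 1, None),
    ("FOUR", "D", 18, "X"),
    ("FOUR(redo)", "D", 11, "X"),
]


def names_with_multiple_reportables(rows):
    """Return names that have two or more reportable results."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id TEXT, name TEXT, result INTEGER, reportable TEXT)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    cursor = conn.execute(
        """
        SELECT name
        FROM t
        WHERE reportable = 'X'
        GROUP BY name
        HAVING COUNT(*) >= 2
        ORDER BY name
        """
    )
    names = [row[0] for row in cursor]
    conn.close()
    return names


duplicates = names_with_multiple_reportables(sample)
```

With the sample data only name D is returned, because it is the only name with more than one reportable row.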

Counting text-items per page in MS access

I'd like to count occurrences of a text string on every page of a report and print the count in the page footer.
Searching for a string in a text field is straightforward, and so is counting the matches within the text field. But how can I sum the matches into an integer variable per report page when the page has several entries?
For example, I've got a report page like this, where each new line is a new record.
Here the first report-page:
aaaaaaF
aaaaaFF
ffaaaaaaaaa
FaaaaaFF
Now the page-footer:
There are 4 records. The letter "F" has been found 6 times on this report page.
Now the second report-page:
aaaFaaF
aaaaaF
fFaaaaaaaaa
FaaaFaFF
FFaaaaFa
page-footer:
There are 5 records. The letter "F" has been found 10 times on this report page.
I'd be happy if somebody has some advice for me.
Thanks!
The first step is to figure out how many occurrences of "f" there are in each record, which you can do using
= Len([myField]) - Len(Replace([myField],"f",""))
For the total number of occurrences across the report, use the Sum function in a text box in the report footer section.
= Sum( ... )
= Sum(Len([myField]) - Len(Replace([myField],"f",""))) ' if report based on a table
= Sum([myCalculatedField]) ' if you use the occurrence count formula in the query instead
If you need a total per page, this link details how to go about it (you'll have to scroll down a bit):
http://office.microsoft.com/en-us/access-help/summing-in-reports-HA001122444.aspx
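The length-difference trick (length of the text minus its length after removing the character) is easy to verify outside Access. Below is a small Python sketch applied to the first report page from the question; the function names are hypothetical. Note that Python's `str.replace` is case-sensitive, whereas the behavior of the Access Replace call depends on its compare argument, so the sketch counts the uppercase "F" explicitly.

```python
def count_char(text, ch):
    """Occurrences of ch in text, via the length-difference trick."""
    return len(text) - len(text.replace(ch, ""))


def page_footer(records, ch):
    """Build the footer line for one page of records."""
    total = sum(count_char(record, ch) for record in records)
    return (f"There are {len(records)} records. "
            f'The letter "{ch}" has been found {total} times on this report page.')


# First report page from the question.
page_one = ["aaaaaaF", "aaaaaFF", "ffaaaaaaaaa", "FaaaaaFF"]
footer = page_footer(page_one, "F")
```

For the first page this reproduces the footer shown in the question: 4 records and 6 occurrences of "F".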
You haven't shown any of the expressions that you are using but, essentially, in the Report Footer you would include a textbox which uses your aggregate function:
=COUNT([SomeField]) 'or
=SUM(iif(some condition, 1, 0))
where SomeField is a field in the detail section, or some condition refers to this field.
That is, you need to SUM (or COUNT) across the whole report by referring to field(s) in the details section. You do not do this by attempting to refer to the subtotals that you have in the page-footers - this won't work.

How to create calculated column with data from another list

I have the following situation: List A has two columns (Name, Amount) and in List B (Name) I want to add a calculated column which should be the sum of all entries in List A that have the same name as in List B. Example:
List A:
NAME Amount
L0011 100
L0011 50
L0020 234
So in List B I want the calculated column to show:
NAME Amount
L0011 150
L0020 234
How can this be done? Workflow (as soon as I add/mod an entry in List A, update List B) or something else? Thanks
lem.mallari's answer is a huge pain unless you can assume that the Amounts in List A never change, since it's not tracking whether an item has already been added to the sum. There is no way for a Workflow to iterate through a SharePoint list, which means there is no easy way to calculate the sum or average of multiple list items.
The correct way to implement this will require some development. The SharePoint Developer Training (2010, 2013) will actually get you most of the way there: an event receiver should trigger when items are added or changed in Lists A and B, use SharePoint's API to go through List A and aggregate the values by Name, and then update all (or just the affected) items in List B. Alternatively, you can use JavaScript to display the sum of all entries in List A that have the same name as the item in List B, as long as all the data is displayed on your page. If you're handy with XPath and InfoPath, you could add List A as a secondary data source to List B's form and select only the applicable items in List A to sum.
But if we're talking Workflows, here's the "workflow only" method. This was tested and successful in 2010. Create custom List C with the following columns:
Title (string, mandatory, enforce unique values)
TotalItems (integer, mandatory, default 0)
Sum (number, decimal places however you want, mandatory, default 0)
Average (calculated, =IF(TotalItems=0,0,Sum/TotalItems)) (optional)
Replace the Name columns in Lists A and B with lookup columns pointing at List C. Delete the Amount column in List B, instead including the Sum column as an additional column. Add the following columns to List A, and ensure that users cannot change them directly. This can be restricted by making InfoPath forms or by making alternative view and edit forms.
AmountArchive (number, identical to Amount, default 0)
AmountHasBeenSubmitted (yes/no, default no)
Create a Workflow to run each time an item is created or modified in List A. Use these commands (I'm using a list for readability; it was getting ugly when formatted as code):
If Current Item:Amount not equals Current Item:AmountArchive
    Set Variable:ItemCount to (Data source: List C; Field from source: TotalItems; Find the List Item: Field Title; Value: Current Item:Name (Return field as: Lookup Value (as Text)))
    Calculate Variable:ItemCount plus 1 (Output to Variable:ItemCount)
    Calculate List C:Sum (similar settings as above; be sure to use Lookup Value (as Text) and not String!) minus Current Item:AmountArchive (Output to Variable:SumWithoutValue)
    Calculate Variable:SumWithoutValue plus Current Item:Amount (Output to Variable:NewSum)
    If Current Item:AmountHasBeenSubmitted equals No
        Set AmountHasBeenSubmitted to Yes
        Update item in List C (Set TotalItems to Variable:ItemCount; Set Sum to Variable:NewSum; Find the List Item in the same way: Field Title; Value: Current Item:Name (Return field as: Lookup Value (as Text)))
    Else
        Update item in List C (don't do anything to TotalItems; use the same logic to set Sum to Variable:NewSum)
    Set AmountArchive to Current Item:Amount
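The bookkeeping behind these workflow steps can be sketched in Python to make the intent easier to follow. This is an illustration under assumptions, not SharePoint code: the class and function names are hypothetical, the List C entry is created on demand (in SharePoint it would already exist because Name is a lookup column), and the last step is assumed to store the current Amount in AmountArchive so that later edits can be detected.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """One List C item: running totals for a single name."""
    total_items: int = 0
    sum: float = 0.0


@dataclass
class Entry:
    """One List A item, including the two tracking columns."""
    name: str
    amount: float
    amount_archive: float = 0.0
    has_been_submitted: bool = False


def on_item_saved(entry, summaries):
    """Mimic the workflow that runs when a List A item is created or modified."""
    if entry.amount == entry.amount_archive:
        return  # nothing changed, so the totals stay as they are
    # Assumption: create the List C item if it is missing.
    summary = summaries.setdefault(entry.name, Summary())
    # Remove the previously counted amount, then add the new one.
    summary.sum = summary.sum - entry.amount_archive + entry.amount
    if not entry.has_been_submitted:
        entry.has_been_submitted = True
        summary.total_items += 1  # count each List A item only once
    # Assumption: remember the amount that is now reflected in the sum.
    entry.amount_archive = entry.amount


summaries = {}
first = Entry("L0011", 100)
second = Entry("L0011", 50)
on_item_saved(first, summaries)
on_item_saved(second, summaries)
first.amount = 120  # a later edit to an existing item
on_item_saved(first, summaries)
```

After the edit, the L0011 summary holds a sum of 170 over 2 items, which shows why the archived amount is needed: without it, the edited item would be added a second time instead of being corrected.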
This can't be done using calculated columns because calculated columns can only be used for columns on the same list.
Using SharePoint Designer Workflows you can just use Create List Item and Update List Item actions so that whenever a user adds a value for L0011 the amount will be added in another list's column which contains the previous amounts already.
Let me know if you need a more detailed answer for the SharePoint approach and I'll provide you a step by step instruction on what to do.
What about using the DSum function? https://support.office.com/en-us/article/DSum-Function-08F8450E-3BF6-45E2-936F-386056E61A32
List B
NAME Amount
L0011 =DSum("[Amount]","List A","[NAME]='" & [NAME] & "'")
L0020 =DSum("[Amount]","List A","[NAME]='" & [NAME] & "'")
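Whatever mechanism ends up doing the work, the target values in List B are a per-name sum over List A. A minimal Python sketch of that aggregation, using the example rows from the question (the function name is hypothetical):

```python
from collections import defaultdict

# List A rows from the question: (NAME, Amount).
list_a = [("L0011", 100), ("L0011", 50), ("L0020", 234)]


def sum_by_name(rows):
    """Total the amounts for each name, like one DSum call per List B row."""
    totals = defaultdict(int)
    for name, amount in rows:
        totals[name] += amount
    return dict(totals)


list_b = sum_by_name(list_a)
```

For the sample rows, `list_b` maps L0011 to 150 and L0020 to 234, matching the desired List B in the question.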