SQL complex grouping "in column"

SQL complex grouping "in column" - sql

I have a table with 3 columns (sorted by the first two):
letter
number (sorted for each letter)
difference between current number and previous number of the same letter
I'd like to calculate (with vanlla SQL) a fourth new column RESULT to group these data when the third column (difference of number between contiguos record; i.e #2 --> 4 = 5-1) is greater than 30 marking all the records of this interval with letter-number of the first record (i.e A1 for #1,#2,#3).
Since the difference between contiguos numbers makes sense just for records with the same letter, for the first record of a new letter, the value of differnce is 31 (meaning that it's a new group; i.e. #6).
Here is what I'd like to get as result:
# Letter Number Difference RESULT (new column)
1 A 1 1 A1
2 A 5 4 A1
3 A 7 2 A1
4 A 40 33 A40 (*)
5 A 43 3 A40
6 B 1 31 B1 (*)
7 B 25 24 B1
8 B 27 2 B1
9 B 70 43 B70 (*)
10 B 75 5 B70
Now I can only find the "breaking values" (*) with this query where they get a value of 1:
select letter
,number
,cast(difference/30 as int) break
from table
where cast(difference/30 as int) = 1
Even though I'm able to find these breaking values I can't finish my task.
Can anyone help me finding a way to obtain the column RESULT?
Thanks in advance
FF

As I understand you need to construct the last result column. You can use concat to do that:
SELECT letter
,number
,concat(letter, cast(difference/30 as int)) result
FROM table
HAVING result = 'A1'

after some exercise and a little help from a friend of mine, I've found a possible solution to my sql prolblem.
The only requirment for the solution is that my first record must have a value of 31 in Difference field (since I need "breaks" when Difference > 30 than the previous record).
Here is the query to get the column RESULT I needed:
select alls.letter
,alls.number
,ints.letter||ints.number as result
from competition.lag alls
,(select letter
,number
,difference
,result
from (select letter
,number
,difference
,case when difference>30 then 1 else 2 end as result
from competition.lag
) temp
where result = 1
) ints
where ints.letter=alls.letter
and alls.number>=ints.number
and alls.number-30<=ints.number

Related

Extract only variables which is greater than other table in influxDB

I am using influxDB and I would like to extract some values which is greater than certain threshold in other table.
For example, I have two tables as shown in below.
Table A
Time value
1 15
2 25
3 9
4 22
Table B
Time threshold
1 16
2 12
3 13
4 15
Give above two tables, I would like to extract three values which is greater than first row in Table B. Therefore what I want to have is as below.
Time value
2 25
4 22
I tried it using below sql query, but it didn't give any correct result.
select * from data1 where value > (select spec from spec1 limit1);
Look forward to your feedback.
Thanks.

Integrate the condition in an inner join:
select * from tableA as a
inner join tableB as b on a.id=b.id and a.value > b.threshold
When your time column doesn't only include integer values, you have to format the time and join on a time range. Here is an example:
SQL join on time range

Return 0 in Sheets Query if there is no data

I need some advice in google query language.
I want to count rows depending on date and a condition. But if the condition is not met, it should return 0.
What I'm trying to achieve:
Date Starts
05.09.2018 0
06.09.2018 3
07.09.2018 0
What I get:
Date Starts
06.09.2018 3
The query looks like =Query(Test!$A2:P; "select P, count(B) where (B contains 'starts') group by P label count(B) 'Starts'")
P contains ascending datevalues and B an event (like start in this case).
How can I force output a 0 for the dates with no entry containing "start"?
The main point is to get all needed data in one table in ascending order. But this is only working, if every day has an entry. If there is no entry for a day, the results for "start" do not match the datevalue in column A. 3 in column D would be in the first row of the table then.
I need it like this:
A B C D
Date Logins Sessions Starts
05.09.2018 1 2 0
06.09.2018 3 4 3
07.09.2018 4 5 0
Maybe this is easy to fix, but I don't see it.
Thanks in advance!

You can do some pre-processing before the query. Ex: check if column B contains 'start' with regexmatch and use a double unary (--) to force the boolean values into 1's and 0's. The use query to sum.
=Query(Arrayformula({--regexmatch(Test!$B2:B; "start")\ Test!$A2:P}); "select Col17, sum(Col1) where Col17 is not null group by Col17 label sum(Col1) 'Starts'")
Change ranges to suit.

Searching for non unique values between two columns

I have a single table with three columns, 'id', 'number' and 'transaction.' Each id should only be tied to one number however may exist many times in the table under different values of transaction. I've been unable to develop a query that will return cases of a single id sharing multiple numbers (and show the id and number in the report). I don't wish to delete these values via the query, I just need to see the values. See example below:
Here's a screenshot example: http://i591.photobucket.com/albums/ss355/riggins_83/table2_zps5509f3cf.jpg I appreciate the assistance, I've tried all the code posted here and it hasn't given me the output I'm looking for. As seen in the screenshot it's possible for the same ID number and Number to appear in the table multiple times with a different transaction number, what shouldn't occur is what's on rows 1 and 2 (two different numbers with same ID number). The ID number is a value that should always be tied to the same Number which the transaction is only linked to that line. I'm trying to generate output of each number that's sharing an ID number (and the shared ID Number if possible).
Test IDNumber Number Transaction
1 31 1551 5
2 31 1553 7
3 32 1701 8
4 33 1701 9
5 33 1701 10
6 33 1701 11
7 39 1885 12
The result of output I would need:
IDNumber Number
31 1551
31 1553
This output is showing me the Number (and ID number) in cases where an ID number is being shared between two (or possibly more) numbers. I know there are cases in the table where an ID number is being shared among many numbers.
Any assistance is greatly appreciated!

SELECT *
FROM thetable t0
WHERE EXISTS (
SELECT *
FROM thetable t1
WHERE t1.id = t0.id
-- Not clear from the question if the OP wants the records
-- to differ by number
-- AND t1.number <> t1.number
-- ... or by "transaction"
AND t1."transaction" <> t0."transaction"
-- ... or by either ???
-- AND (t1.number <> t1.number OR t1."transaction" <> t0."transaction")
);

SELECT IDNumber, Number
FROM YourTable
WHERE IDNumber IN (
SELECT IDNumber
FROM YourTable
GROUP BY IDNumber
HAVING COUNT(DISTINCT number) > 1
)
The subquery returns all the IDNumbers with more than 1 Number. Then the main query returns all the numbers for each of those IDNumbers.
DEMO

kdb: Add a column showing sum of rows in a table with dynamic headers while ignoring nulls

I have a table whose columns are dynamic, except one column:A. The table also has some null values (0n) in it. How do I add another column that shows total of each row and either ignores the column that has "0n" in that particular row or takes 0 in its place.
Here is my code, it fails on sum and also does not ignore nulls.
addTotalCol:{[]
table:flip`A`B`C`D!4 4#til 9;
colsToSum: string (cols table) except `A; / don't sum A
table: update Total: sum (colsToSum) from table; / type error here. Also check for nulls
:table;
}

I think it is better to use functional update in your case:
addTotalCol:{[]
table:flip`A`B`C`D!4 4#til 9;
colsToSum:cols[table] except `A; / don't sum A
table:![table;();0b;enlist[`Total]!enlist(sum;enlist,colsToSum)];
:table;
}
Reason why it is not working is because your fourth line is parsed as:
table: update Total: sum (enlist"B";enlist"C";enlist"D") from table;
Since sum only works with numbers, it returns 'type error since your inputs are string.
Another solution to use colsToSum as string input:
addTotalCol:{[]
table:flip`A`B`C`D!4 4#til 9;
colsToSum:string cols[table] except `A; / don't sum A
table:get"update Total:sum(",sv[";";colsToSum],") from table"
:table;
}
Basically this will build the query in string before it is executed in q.
Still, functional update is preferred though.
EDIT: Full answer to sum 0n:
addTotalCol:{[]
table:flip`A`B`C`D!4 4#0n,til 9;
colsToSum:cols[table] except `A; / don't sum A
table:![table;();0b;enlist[`Total]!enlist(sum;(^;0;enlist,colsToSum))];
:table;
}

I think there is a cleaner version here without a functional form.
q)//let us build a table where our first col is symbols and the rest are numerics,
/// we will exclude first from row sums
q)t:flip `c0`c1`c2`c3!(`a`b`c`d;1 2 3 0N;0n 4 5 6f;1 2 3 0Nh)
q)//columns for sum
q)sc:cols[t] except `c0
q)///now let us make sure we fill in each column with zero,
/// add across rows and append as a new column
q)show t1:t,'flip enlist[`sumRows]!enlist sum each flip 0^t sc
c0 c1 c2 c3 sumRows
-------------------
a 1 1 2
b 2 4 2 8
c 3 5 3 11
d 6 6
q)meta t1
c | t f a
-------| -----
c0 | s
c1 | i
c2 | f
c3 | h
sumRows| f

Retrieve Duplicate Row value and Count

I have table named "Table X" which contains some names and their corresponding age, table may contains same names (i.e) names will occur repeatedly.
Table x:
name age
a 21
b 37
c 23
a 34
a 21
b 19
b 37
a 21
...output like the One Below:
name total repeat
a 4 2
b 3 1
c 1 0
Now I want to write a query which will return total # of repeated names, and how many times the age repetitions occur in the resulting table. Just like the output table given above.I want to do it with MS Access.

try this
SELECT x.pname, SUM(x.CountOfpage) as Total, SUM(x.CountOfpage-1) as Repeats
FROM (
SELECT TableX.pname, TableX.page, Count(TableX.page) AS CountOfpage
FROM TableX
GROUP BY TableX.pname, TableX.page
) as x
GROUP BY x.pname

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas