I have a table in Firebird 2.5 like this:
Point X Y Z
1 100 100 50
2 110 120 50.34
3 145 155 56
How can I write a select query that returns point 1 and point 3 in a single row, with this result:
point1 P1X P1Y P1Z point2 P2X P2Y P2Z
1 100 100 50 3 145 155 56
What you really want to do is a bit unclear. The following returns the desired results:
select min(point) as point1, min(x) as p1x, min(y) as p1y, min(z) as p1z,
max(point) as point2, max(x) as p2x, max(y) as p2y, max(z) as p2z
from t;
Alternatively, since the query above only works because points 1 and 3 happen to be the minimum and maximum, you might want:
select p1.*, p2.*
from t p1 join
t p2
on p1.point = 1 and p2.point = 3;
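If the exact column names from the desired output matter, the same idea can be written with explicit aliases. A minimal sketch, assuming the table is named t as above (Firebird also accepts an explicit cross join here):
select p1.point as point1, p1.x as p1x, p1.y as p1y, p1.z as p1z,
       p2.point as point2, p2.x as p2x, p2.y as p2y, p2.z as p2z
from t p1
cross join t p2
where p1.point = 1 and p2.point = 3;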
Initial data: suppose there are 100 people who eat nuts, sweets, and apples; in total there is a maximum of 1000 pieces of apples, sweets, and nuts for everyone. These people eat two meals a day, and each person eats a different amount of sweets, apples, and nuts. How many nuts, sweets, and apples each person ate at the first meal is known.
We need to calculate how much each person can eat at the second meal, given that each person can eat up to 10 sweets, apples, and nuts per day.
They take turns eating, and I need to calculate how much more of these sweets, apples, and nuts each of them can still eat today. So, after each person, I need to calculate how much is left of the original 1000 pieces, and how much more each person can eat given their previous meal and the 10-pieces-per-day limit.
I think this should be solved by recursion, using the previous records and the current record at each iteration. Or can you suggest another way to solve it?
My incorrect attempt:
There is a table:
select 1 num, 1 s, 50 ss, 11 b, 20 bs
union
select 2 num, 1 s, 101 ss, 11 b, 50 bs
union
select 3 num, 2 s, 103 ss, 12 b, 50 bs
I need to recursively join each of these rows on num, iteration by iteration, with itself and with all the previous rows.
Iteration 1:
n num s ss b bs num_ s_ ss_ b_ bs_
1 1 1 50 11 20 1 1 50 11 20
Iteration 2:
n num s ss b bs num_ s_ ss_ b_ bs_
2 2 1 101 11 50 2 1 50 11 20
2 2 1 101 11 50 2 1 101 11 50
Iteration 3:
n num s ss b bs num_ s_ ss_ b_ bs_
3 3 2 103 12 50 3 1 50 11 20
3 3 2 103 12 50 3 1 101 11 50
3 3 2 103 12 50 3 2 103 12 50
Total: the union of all three iterations (six rows in all; the original post showed this as an image).
At each iteration I will do some additional calculations, so I am looking for a way to produce exactly this kind of recursion with a single computational part.
Below is my attempt, but it fails to include the current row (the anchor part) in each iteration:
with a as
(
select 1 num, 1 s, 50 ss, 11 b, 20 bs
union
select 2 num, 1 s, 101 ss, 11 b, 50 bs
union
select 3 num, 2 s, 103 ss, 12 b, 50 bs
)
--select * from res
,
rec as (
select 1 n, a.num, a.s, a.ss, a.b, a.bs, a.num num_, a.s s_, a.ss ss_, a.b b_, a.bs bs_ from a where num =1
union all
select n+1, a.num, a.s, a.ss, a.b, a.bs, rec.num num_ , rec.s s_, rec.ss ss_, rec.b b_, rec.bs bs_ from rec,a
where a.num = rec.num + 1 -- and a.num <> rec.num_
)
select top 20 * from rec order by n,num, num_
I find it hard to understand what you need. You mention recursion, iterations, and calculations, but I see none of those things here. The result set you show can be produced with this simple self-join:
with a as
(
select 1 num, 1 s, 50 ss, 11 b, 20 bs
union
select 2 num, 1 s, 101 ss, 11 b, 50 bs
union
select 3 num, 2 s, 103 ss, 12 b, 50 bs
)
select *
from
a
inner join a a_prev on a_prev.num<=a.num
order by a.num,a_prev.num
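If the per-iteration calculations are aggregates over everything up to the current row, they can hang off the same self-join. A minimal sketch, assuming the calculation wanted is something like a running sum of ss (the choice of column and aggregate here is hypothetical):
with a as
(
select 1 num, 1 s, 50 ss, 11 b, 20 bs
union
select 2 num, 1 s, 101 ss, 11 b, 50 bs
union
select 3 num, 2 s, 103 ss, 12 b, 50 bs
)
select a.num,
       sum(a_prev.ss) as running_ss -- aggregate over rows 1..num, i.e. one value per "iteration"
from a
inner join a a_prev on a_prev.num <= a.num
group by a.num
order by a.num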
I have a Postgres query that is supposed to calculate an average value over a set of values, where that set should be based on DISTINCT IDs.
The query is the following:
#{context.answers_base}
SELECT
stores.name as store_name,
answers_base.question_name as question_name,
answers_base.question_id as question_id,
(sum(answers_base.answer_value) / NULLIF(count(answers_base.answer_id),0)) as score, # <--- this line is calculating wrong
sum(answers_base.answer_value) as score_sum,
count(answers_base.answer_id) as question_answer_count,
count(DISTINCT answers_base.answer_id) as answer_count
FROM answers_base
INNER JOIN stores ON stores.id = answers_base.store_id
WHERE answers_base.answer_value IS NOT NULL AND answers_base.question_type_id = :question_type_id
AND answers_base.scale = TRUE
#{context.filter_answers}
GROUP BY stores.name, answers_base.question_name, answers_base.question_id, answers_base.sort_order
ORDER BY stores.name, answers_base.sort_order
The thing is that on the indicated line (sum(answers_base.answer_value) / NULLIF(count(answers_base.answer_id),0)) some values are counted more than once.
Part of the solution is making it DISTINCT based on ID, like so:
(sum(answers_base.answer_value) / NULLIF(count(DISTINCT answers_base.answer_id),0))
This results in an average that is divided by the right number, but the sum being divided is still wrong.
Making sum() DISTINCT, as follows, does not work, because the values are not unique: the values are either 0 / 25 / 50 / 75 / 100, so different IDs may hold the 'same' value.
(sum(DISTINCT answers_base.answer_value) / NULLIF(count(DISTINCT answers_base.answer_id),0))
How would I go about making this work?
Here are simplified versions of the table structures.
Table Answer
ID  answer_date
1   Feb 01, 2022
2   Mar 02, 2022
3   Mar 13, 2022
4   Mar 21, 2022
Table AnswerRow
ID  answer_id  answer_value
1   1          25
2   1          50
3   1          50
4   2          75
5   2          100
6   2          0
7   3          25
8   4          25
9   4          100
10  4          50
Answer 1's answer_rows:
25 + 50 + 50 -> average = 125 / 3
Answer 2's answer_rows:
75 + 100 + 0 -> average = 175 / 3
Answer 3's answer_rows:
25 -> average = 25 / 1
Answer 4's answer_rows:
25 + 100 + 50 -> average = 175 / 3
For some reason, we get duplicate answer_rows in the calculation.
Example of the problem: for answer_id=1 we have the following answer_rows in the calculation, giving us a different average:
ID  answer_id  answer_value
1   1          25
2   1          50
3   1          50
3   1          50
3   1          50
3   1          50
Result: 25 + 50 + 50 + 50 + 50 + 50 -> 275 / 6
Desired result: 25 + 50 + 50 -> 125 / 3
Making answer_row_id distinct (see beginning of post) makes it possible for me to get:
25 + 50 + 50 + **50 + 50 + 50** -> 275 / **3**
But not the desired
25 + 50 + 50 -> 125 / 3
What I would like to achieve is a calculation that selects answer_rows distinctly by ID, and uses those same rows for both the numerator x and the denominator y of the average x / y.
answers_base is the following (simplified):
WITH answers_base as (
SELECT
answers.id as answer_id,
answers.store_id as store_id,
answer_rows.id as answer_row_id,
question_options.answer_value as answer_value
FROM answers
INNER JOIN answer_rows ON answers.id = answer_rows.answer_id
INNER JOIN stores ON stores.id = answers.store_id
WHERE answers.status = 0
)
I think this would be best solved with a window function. Something along the lines of
SELECT *
FROM (
  SELECT answer_rows.*,
         ROW_NUMBER() OVER (PARTITION BY answer_rows.id ORDER BY answer_rows.created_at DESC) AS duplicate_answers
  FROM answer_rows
  ...
) answer_rows
WHERE answer_rows.duplicate_answers = 1
This filters out multiple rows with the same id and keeps only one entry; the subquery is needed because a window function cannot be referenced in the WHERE clause of the same SELECT. (I chose "first by created_at", but you could change this to whatever logic suits you best.)
A benefit to this approach is that it makes the rationale behind the logic clear, contained and re-usable.
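A sketch of how that could slot into the posted answers_base CTE, so that the aggregates see each answer_row exactly once; created_at on answer_rows is an assumption (any deterministic ordering column works), and the grouping and joins from the full query are omitted for brevity:
WITH answers_base as (
SELECT
answers.id as answer_id,
answers.store_id as store_id,
answer_rows.id as answer_row_id,
question_options.answer_value as answer_value,
ROW_NUMBER() OVER (PARTITION BY answer_rows.id ORDER BY answer_rows.created_at DESC) as duplicate_answers
FROM answers
INNER JOIN answer_rows ON answers.id = answer_rows.answer_id
INNER JOIN stores ON stores.id = answers.store_id
WHERE answers.status = 0
)
SELECT sum(answer_value) / NULLIF(count(*), 0) as score
FROM answers_base
WHERE duplicate_answers = 1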
I need an SQL solution for this problem I am dealing with:
I have the following rows in a table
cod coda pricea priceb pricec
x1  y    20     50
x2  y    20     50
x3  y                  60
x4  z    80
x5  z           85
x6  z                  85
I need to get this result, with only one row per coda, given that prices are always the same within a coda:
coda pricea priceb pricec
y 20 50 60
z 80 85 85
How can I get this result with sql?
I tried to do it by sum and group by coda but it returns the sum of prices.
If prices are always the same for a coda value, you could group by coda and use min/max to get them in the same row; aggregate functions ignore NULLs, so the lone non-NULL price in each column survives...
select coda
, min(pricea) pricea
, min(priceb) priceb
, min(pricec) pricec
FROM table
group by coda
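A self-contained check against the sample data; the inline CTE is just a hypothetical stand-in for the real table (pricec is cast so the union has a concrete type, and the NULL placement follows the desired result):
with t as (
select 'x1' cod, 'y' coda, 20 pricea, 50 priceb, cast(null as integer) pricec union all
select 'x2', 'y', 20, 50, null union all
select 'x3', 'y', null, null, 60 union all
select 'x4', 'z', 80, null, null union all
select 'x5', 'z', null, 85, null union all
select 'x6', 'z', null, null, 85
)
select coda
, min(pricea) pricea
, min(priceb) priceb
, min(pricec) pricec
from t
group by coda
-- expected: y | 20 | 50 | 60  and  z | 80 | 85 | 85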
I want to group my data using SQL or R so that I can get top or bottom 10 Subarea_codes for each Company and Area_code. In essence: the Subarea_codes within the Area_codes where each Company has its largest or smallest result.
data.csv
Area_code Subarea_code Company Result
10 101 A 15
10 101 P 10
10 101 C 4
10 102 A 10
10 102 P 8
10 102 C 5
11 111 A 15
11 111 P 20
11 111 C 5
11 112 A 10
11 112 P 5
11 112 C 10
result.csv should be like this
Company Area_code Largest_subarea_code Largest_result Smallest_subarea_code Smallest_result
A 10 101 15 102 10
P 10 101 10 102 8
C 10 102 5 101 4
A 11 111 15 112 10
P 11 111 20 112 5
C 11 112 10 111 5
Within each Area_code there can be hundreds of Subarea_codes but I only want the top and bottom 10 for each Company.
Also this doesn't have to be resolved in one query, but can be divided into two queries, meaning smallest is presented in results_10_smallest and largest in result_10_largest. But I'm hoping I can accomplish this with one query for each result.
What I've tried:
SELECT Company, Area_code, Subarea_code, MAX(Result)
AS Max_result
FROM data
GROUP BY Subarea_code
ORDER BY Company
;
This gives me all the Companies with the highest results within each Subarea_code, which would mean A, A, P, and A-C for the data above.
Using the sqldf package:
df <- read.table(text="Area_code Subarea_code Company Result
10 101 A 15
10 101 P 10
10 101 C 4
10 102 A 10
10 102 P 8
10 102 C 5
11 111 A 15
11 111 P 20
11 111 C 5
11 112 A 10
11 112 P 5
11 112 C 10", header=TRUE)
library(sqldf)
mymax <- sqldf("select Company,
Area_code,
max(Subarea_code) Largest_subarea_code
from df
group by Company,Area_code")
mymaxres <- sqldf("select d.Company,
d.Area_code,
m.Largest_subarea_code,
d.Result
from df d, mymax m
where d.Company=m.Company and
d.Subarea_code=m.Largest_subarea_code")
mymin <- sqldf("select Company,
Area_code,
min(Subarea_code) Smallest_subarea_code
from df
group by Company,Area_code")
myminres <- sqldf("select d.Company,
d.Area_code,
m.Smallest_subarea_code,
d.Result
from df d, mymin m
where d.Company=m.Company and
d.Subarea_code=m.Smallest_subarea_code")
result <- sqldf("select a.*, b.Smallest_subarea_code,b.Result
from mymaxres a, myminres b
where a.Company=b.Company and
a.Area_code=b.Area_code")
If you are already doing it in R, why not use the much more efficient data.table instead of sqldf with SQL syntax? Assuming data is your data set, simply:
library(data.table)
setDT(data)[, list(Largest_subarea_code = Subarea_code[which.max(Result)],
Resultmax = max(Result),
Smallest_subarea_code = Subarea_code[which.min(Result)],
Resultmin = min(Result)), by = list(Company, Area_code)]
# Company Area_code Largest_subarea_code Resultmax Smallest_subarea_code Resultmin
# 1: A 10 101 15 102 10
# 2: P 10 101 10 102 8
# 3: C 10 102 5 101 4
# 4: A 11 111 15 112 10
# 5: P 11 111 20 112 5
# 6: C 11 112 10 111 5
There seems to be a discrepancy between the output shown and the description. The description asks for the top 10 and bottom 10 results for each Area code/Company, but the sample output shows only the top 1 and the bottom 1. For example, for area code 10 and company A, subarea 101 is top with a result of 15 and subarea 102 is second largest with a result of 10, so according to the description there should be two rows for that company/area code combination. (If there were more data there would be up to 10 rows for that company/area code combination.)
We give two answers. The first assumes the top 10 and bottom 10 are wanted for each company and area code as in the question's description and the second assumes only the top and bottom for each company and area code as in the question's sample output.
1) Top/Bottom 10
Here we assume that the top 10 and bottom 10 results for each Company/Area code are wanted. If it's just the top and bottom one, see (2) later on (or replace 10 with 1 in the code here). Bottom10 is all rows for which there are 10 or fewer subareas for the same area code and company with equal or smaller results. Top10 is similar.
library(sqldf)
Bottom10 <- sqldf("select a.Company,
a.Area_code,
a.Subarea_code Bottom_Subarea,
a.Result Bottom_Result,
count(*) Bottom_Rank
from df a join df b
on a.Company = b.Company and
a.Area_code = B.Area_code and
b.Result <= a.Result
group by a.Company, a.Area_code, a.Subarea_code
having count(*) <= 10")
Top10 <- sqldf("select a.Company,
a.Area_code,
a.Subarea_code Top_Subarea,
a.Result Top_Result,
count(*) Top_Rank
from df a join df b
on a.Company = b.Company and
a.Area_code = B.Area_code and
b.Result >= a.Result
group by a.Company, a.Area_code, a.Subarea_code
having count(*) <= 10")
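The count(*) self-join is emulating a rank. On a DBMS with window functions (an alternative sketch; not every SQLite build that sqldf ships with supports them), Top10 could be written as:
select Company,
       Area_code,
       Subarea_code Top_Subarea,
       Result Top_Result,
       rnk Top_Rank
from (select d.*,
             rank() over (partition by Company, Area_code
                          order by Result desc) rnk
      from df d) t
where rnk <= 10
Note that ties are numbered from the low side by rank(), while the count(*) join numbers them from the high side, so tied rows can receive different Rank values between the two forms.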
The description indicated you wanted the top 10 OR the bottom 10 for each company/area code in which case just use one of the results above. If you want to combine them we show a merge below. We have added a Rank column to indicate the smallest/largest (Rank is 1), second smallest/largest (Rank is 2), etc.
sqldf("select t.Area_code,
t.Company,
t.Top_Rank Rank,
t.Top_Subarea,
t.Top_Result,
b.Bottom_Subarea,
b.Bottom_Result
from Bottom10 b join Top10 t
on t.Area_code = b.Area_code and
t.Company = b.Company and
t.Top_Rank = b.Bottom_Rank
order by t.Area_code, t.Company, t.Top_Rank")
giving:
Area_code Company Rank Top_Subarea Top_Result Bottom_Subarea Bottom_Result
1 10 A 1 101 15 102 10
2 10 A 2 102 10 101 15
3 10 C 1 102 5 101 4
4 10 C 2 101 4 102 5
5 10 P 1 101 10 102 8
6 10 P 2 102 8 101 10
7 11 A 1 111 15 112 10
8 11 A 2 112 10 111 15
9 11 C 1 112 10 111 5
10 11 C 2 111 5 112 10
11 11 P 1 111 20 112 5
12 11 P 2 112 5 111 20
Note that this format makes less sense if there are ties and, in fact, could generate more than 10 rows for a Company/Area code, so you might just want to use the individual Top10 and Bottom10 in that case. You could also consider jittering df$Result if this is a problem:
df$Result <- jitter(df$Result)
# now perform SQL statements
2) Top/Bottom Only
Here we give only the top and bottom results and the corresponding subareas for each company/area code. Note that this uses an extension to SQL supported by sqlite and the SQL code is substantially simpler:
Bottom1 <- sqldf("select Company,
Area_code,
Subarea_code Bottom_Subarea,
min(Result) Bottom_Result
from df
group by Company, Area_code")
Top1 <- sqldf("select Company,
Area_code,
Subarea_code Top_Subarea,
max(Result) Top_Result
from df
group by Company, Area_code")
sqldf("select a.Company,
a.Area_code,
Top_Subarea,
Top_Result,
Bottom_Subarea
Bottom_Result
from Top1 a join Bottom1 b
on a.Company = b.Company and
a.Area_code = b.Area_code
order by a.Area_code, a.Company")
This gives:
  Company Area_code Top_Subarea Top_Result Bottom_Subarea Bottom_Result
1       A        10         101         15            102            10
2       C        10         102          5            101             4
3       P        10         101         10            102             8
4       A        11         111         15            112            10
5       C        11         112         10            111             5
6       P        11         111         20            112             5
Update: correction, and added (2).
The answers above are fine for fetching the max result.
This solves the top10 issue:
data.top <- data[ave(-data$Result, data$Company, data$Area_code, FUN = rank) <= 10, ]
In this script the user declares the company; the script then returns the top 10 max results (likewise for min values).
Result <- NULL
A <- read.table("your-file.txt", header=TRUE, sep="\t", na.strings="NA")
Company <- A$Company=="A"  # logical index for the declared company; can be A, C, P or another value
Subarea <- unique(A$Subarea_code)
for (i in 1:length(Subarea))
  {Result[i] <- max(A$Result[Company & A$Subarea_code==Subarea[i]])}
Res1 <- t(rbind(Subarea, Result))  # two columns: Subarea_code and its max Result
Res2 <- Res1[order(-Res1[,2]),]    # sort by Result, descending
Res2[1:10,]                        # top 10
I have example values in column like this:
values
-------
89
65
56
78
74
73
45
23
5
654
643
543
345
255
233
109
43
23
2
The values rise, then drop back toward 0, and rise again.
I need to compute the differences between consecutive rows in a new column, and also the total of these differences (a cumulative sum) over all values. The values 56 and 5 are where the differences start again from zero.
The sum is 819.
Example, from the bottom: (23-2)+(43-23)+..+(654-643)+(5)+(23-5)+..
Okay, here is my try. However, you need to add an Identity field (which I called "AddSequence") that starts with 1 for the first value ("2") and increments by one for every other value.
SELECT SUM(C.Diff) FROM
(
SELECT CASE WHEN B.[Value] IS NULL THEN 0 -- first row: nothing to compare against
            WHEN A.[Value] - B.[Value] > 0 THEN A.[Value] - B.[Value]
            ELSE A.[Value] -- the sequence restarted, so the whole value counts
       END AS Diff
FROM [TestValue] AS A
LEFT JOIN [TestValue] AS B ON B.[AddSequence] = A.[AddSequence] - 1
) AS C
The first solution I had neglected the fact that we had to start over whenever the difference was negative; the CASE above handles that restart.
I think you are looking for something like:
SELECT SUM(a - b) AS sum_of_differences
FROM ...
I think you want this for the differences (when a difference would be negative the sequence has restarted, so the value itself counts); I've tested it in sqlite:
SELECT CASE WHEN (v.value - val) < 0 THEN v.value ELSE (v.value - val) END AS differences
FROM v,
(SELECT rowid, value AS val FROM v WHERE rowid > 1) as next_val
WHERE v.rowid = next_val.rowid - 1
and for the sum:
SELECT SUM(differences) FROM
(
SELECT CASE WHEN (v.value - val) < 0 THEN v.value ELSE (v.value - val) END AS differences
FROM v,
(SELECT rowid, value AS val FROM v WHERE rowid > 1) AS next_val
WHERE v.rowid = next_val.rowid - 1
)
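On a DBMS with window functions (SQLite 3.25+, PostgreSQL, SQL Server, and others), LEAD can replace the rowid self-join. A sketch under the same assumptions (a table v ordered by rowid):
SELECT SUM(CASE WHEN value - next_value < 0 THEN value -- restart: count the full value
                ELSE value - next_value
           END) AS total
FROM (SELECT value, LEAD(value) OVER (ORDER BY rowid) AS next_value
      FROM v) d
WHERE next_value IS NOT NULL -- the last row has nothing below it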
EDITED, BASED ON YOUR QUESTION EDIT (T-SQL)
I don't know how you can do this without adding an Id.
If you add an Id, this gives the exact output you posted before your edit. There's probably a better way, but this is quick and dirty, for a one-time shot, using a SELF JOIN. differences was the name of your new column originally.
UPDATE A
SET differences = CASE WHEN A.[values] > B.[Values] THEN A.[values] - B.[Values]
ELSE A.[values] END
FROM SO_TTABLE A
JOIN SO_TTABLE B ON A.ID = (B.ID - 1)
OUTPUT
Select [Values], differences FROM SO_TTABLE
[values] differences
------------------------
89 24
65 9
56 56
78 4
74 1
73 28
45 22
23 18
5 5
654 11
643 100
543 198
345 90
255 22
233 124
109 66
43 20
23 21
2 0
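As a sanity check, summing the new column should reproduce the total stated in the question:
SELECT SUM(differences) AS total FROM SO_TTABLE -- expected: 819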